Podcasts about Lambdas

  • 74 podcasts
  • 149 episodes
  • 51m average duration
  • 1 new episode monthly
  • Latest: May 14, 2025

POPULARITY

[Popularity trend chart, 2017–2024]


Best podcasts about Lambdas

Latest podcast episodes about Lambdas

Syntax - Tasty Web Development Treats
902: Fullstack Cloudflare with React and Vite (Redwood SDK)

Syntax - Tasty Web Development Treats

May 14, 2025 · 46:54


Wes talks with Peter Pistorius about RedwoodSDK, a new React framework built natively for Cloudflare. They dive into real-time React, server components, zero-cost infrastructure, and why RedwoodSDK empowers developers to ship faster with fewer tradeoffs and more control.

Show Notes

00:00 Welcome to Syntax!
00:52 What is RedwoodSDK?
04:49 Choosing openness over abstraction
08:46 More setup, more control
12:20 Why RedwoodSDK only runs on Cloudflare
14:25 What the database setup looks like
16:15 Durable Objects explained – Ep 879: Fullstack Cloudflare
18:14 Middleware and request flow
23:14 No built-in client-side router?
24:07 Integrating routers with defineApp
26:04 React Server Components and real-time updates
29:53 What happened to RedwoodJS?
31:14 Why do opinionated frameworks struggle to catch on?
34:35 The problem with Lambdas
36:16 Cloudflare's JavaScript runtime compatibility
40:04 Brought to you by Sentry.io
41:44 The vision behind RedwoodSDK

Hit us up on Socials!
Syntax: X Instagram Tiktok LinkedIn Threads
Wes: X Instagram Tiktok LinkedIn Threads
Scott: X Instagram Tiktok LinkedIn Threads
Randy: X Instagram YouTube Threads

Unpivot
Time Quotes, 100k Subscribers and Black Box Lambdas

Unpivot

Mar 29, 2025 · 63:27


#Excel #PowerBI

QotW: How to quote for work and estimate time.

Other chat:
What's the secret to 100K subscribers?
Are Lambdas a black box?
Giles toys with occupational ruin.
Sue is not happy with Mark.
Mark drops shade on Wyn's REGEX idea.

Please leave a 5 Star Review. Links to hosts and other content: Unpivot Show Links Page. Hosts: Wyn Hopkins, Mark Proctor, Sue Bayes, and Giles Male.

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

Today's episode is with Paul Klein, founder of Browserbase. We talked about building browser infrastructure for AI agents, the future of agent authentication, and their open source framework Stagehand.

* [00:00:00] Introductions
* [00:04:46] AI-specific challenges in browser infrastructure
* [00:07:05] Multimodality in AI-Powered Browsing
* [00:12:26] Running headless browsers at scale
* [00:18:46] Geolocation when proxying
* [00:21:25] CAPTCHAs and Agent Auth
* [00:28:21] Building "User take over" functionality
* [00:33:43] Stagehand: AI web browsing framework
* [00:38:58] OpenAI's Operator and computer use agents
* [00:44:44] Surprising use cases of Browserbase
* [00:47:18] Future of browser automation and market competition
* [00:53:11] Being a solo founder

Transcript

Alessio [00:00:04]: Hey everyone, welcome to the Latent Space podcast. This is Alessio, partner and CTO at Decibel Partners, and I'm joined by my co-host Swyx, founder of Smol.ai.

swyx [00:00:12]: Hey, and today we are very blessed to have our friends, Paul Klein, for the fourth, the fourth, CEO of Browserbase. Welcome.

Paul [00:00:21]: Thanks guys. Yeah, I'm happy to be here. I've been lucky to know both of you for like a couple of years now, I think. So it's just like we're hanging out, you know, with three ginormous microphones in front of our face. It's totally normal hangout.

swyx [00:00:34]: Yeah. We've actually mentioned you on the podcast, I think, more often than any other Solaris tenant. Just because like you're one of the, you know, best performing, I think, LLM tool companies that have started up in the last couple of years.

Paul [00:00:50]: Yeah, I mean, it's been a whirlwind of a year, like Browserbase is actually pretty close to our first birthday. So we are one year old. And going from, you know, starting a company as a solo founder to...
To, you know, having a team of 20 people, you know, a series A, but also being able to support hundreds of AI companies that are building AI applications that go out and automate the web. It's just been like, really cool. It's been happening a little too fast. I think like collectively as an AI industry, let's just take a week off together. I took my first vacation actually two weeks ago, and Operator came out on the first day, and then a week later, DeepSeek came out. And I'm like on vacation trying to chill. I'm like, we got to build with this stuff, right? So it's been a breakneck year. But I'm super happy to be here and like talk more about all the stuff we're seeing. And I'd love to hear kind of what you guys are excited about too, and share with it, you know?

swyx [00:01:39]: Where to start? So people, you've done a bunch of podcasts. I think I strongly recommend Jack Bridger's Scaling DevTools, as well as Turner Novak's The Peel. And, you know, I'm sure there's others. So you covered your Twilio story in the past, talked about StreamClub, you got acquired to Mux, and then you left to start Browserbase. So maybe we just start with what is Browserbase? Yeah.

Paul [00:02:02]: Browserbase is the web browser for your AI. We're building headless browser infrastructure, which are browsers that run in a server environment that's accessible to developers via APIs and SDKs. It's really hard to run a web browser in the cloud. You guys are probably running Chrome on your computers, and that's using a lot of resources, right? So if you want to run a web browser or thousands of web browsers, you can't just spin up a bunch of Lambdas. You actually need to use a secure containerized environment. You have to scale it up and down. It's a stateful system. And that infrastructure is, like, super painful. And I know that firsthand, because at my last company, StreamClub, I was CTO, and I was building our own internal headless browser infrastructure.
That's actually why we sold the company, is because Mux really wanted to buy our headless browser infrastructure that we'd built. And it's just a super hard problem. And I actually told my co-founders, I would never start another company unless it was a browser infrastructure company. And it turns out that's really necessary in the age of AI, when AI can actually go out and interact with websites, click on buttons, fill in forms. You need AI to do all of that work in an actual browser running somewhere on a server. And Browserbase powers that.

swyx [00:03:08]: While you're talking about it, it occurred to me, not that you're going to be acquired or anything, but it occurred to me that it would be really funny if you became the Nikita Bier of headless browser companies. You just have one trick, and you make browser companies that get acquired.

Paul [00:03:23]: I truly do only have one trick. I'm screwed if it's not for headless browsers. I'm not a Go programmer. You know, I'm in AI Grant. You know, browsers is an AI grant. But we were the only company in that AI Grant batch that used zero dollars on AI spend. You know, we're purely an infrastructure company. So as much as people want to ask me about reinforcement learning, I might not be the best guy to talk about that. But if you want to ask about headless browser infrastructure at scale, I can talk your ear off. So that's really my area of expertise. And it's a pretty niche thing. Like, nobody has done what we're doing at scale before. So we're happy to be the experts.

swyx [00:03:59]: You do have an AI thing, Stagehand. We can talk about the sort of core of Browserbase first, and then maybe Stagehand. Yeah, Stagehand is kind of the web browsing framework. Yeah.

What is Browserbase? Headless Browser Infrastructure Explained

Alessio [00:04:10]: Yeah. Yeah. And maybe how you got to Browserbase and what problems you saw. So one of the first things I worked on as a software engineer was integration testing.
Sauce Labs was kind of like the main thing at the time. And then we had Selenium, we had Playwright, we had all these different browser things. But it's always been super hard to do. So obviously you've worked on this before. When you started Browserbase, what were the challenges? What were the AI-specific challenges that you saw versus, there's kind of like all the usual running browsers at scale in the cloud, which has been a problem for years. What are like the AI unique things that you saw that like traditional approaches just didn't cover? Yeah.

AI-specific challenges in browser infrastructure

Paul [00:04:46]: First and foremost, I think back to like the first thing I did as a developer, like as a kid when I was writing code, I wanted to write code that did stuff for me. You know, I wanted to write code to automate my life. And I'd do that probably by using curl or Beautiful Soup to fetch data from a web browser. And I think I still do that now that I'm in the cloud. And the other thing is that you can now just scrape a website and parse that data. And we all know that now, like, you know, taking HTML and plugging that into an LLM, you can extract insights, you can summarize. So it was very clear that now, like, dynamic web scraping became very possible with the rise of large language models, or a lot easier. And that was like a clear reason why there's been more usage of headless browsers, which are necessary because a lot of modern websites don't expose all of their page content via a simple HTTP request. You know, they actually require you to run the JavaScript on the page to hydrate this. Airbnb is a great example. You go to airbnb.com. A lot of that content on the page isn't there until after they run the initial hydration. So you can't just scrape it with a curl. You need to have some JavaScript run.
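The hydration gap Paul describes can be sketched in plain JavaScript. Here a toy "server response" contains only an empty root node; the listings exist only after a simulated client-side render runs. A curl-style fetch sees the former, a headless browser (which executes the page's JavaScript) sees the latter. The markup and the `hydrate` function are hypothetical, purely to illustrate the gap.

```javascript
// What a curl-style HTTP fetch would see: the server-rendered shell only.
const serverHtml = '<html><body><div id="root"></div></body></html>';

// Simulated client-side hydration: the JavaScript a real browser would
// execute after load, injecting the actual content into the page.
function hydrate(html) {
  const listings = ['Cozy loft in Paris', 'Beach house in Lisbon']; // hypothetical data
  const rendered = `<ul>${listings.map(l => `<li>${l}</li>`).join('')}</ul>`;
  return html.replace('<div id="root"></div>', `<div id="root">${rendered}</div>`);
}

const hydratedHtml = hydrate(serverHtml);

// A static scraper finds nothing; the hydrated DOM has the real content.
console.log(serverHtml.includes('Cozy loft'));   // false
console.log(hydratedHtml.includes('Cozy loft')); // true
```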
And a browser is that JavaScript engine that's going to actually run all those requests on the page. So web data retrieval was definitely one driver of starting Browserbase, and the rise of being able to summarize that with an LLM. Also, I was familiar with: if I wanted to automate a website, I could write one script and that would work for one website. It was very static and deterministic. But the web is non-deterministic. The web is always changing. And until we had LLMs, there was no way to write scripts that you could write once that would run on any website, that would change with the structure of the website. "Click the login button" could mean something different on many different websites. And LLMs allow us to generate code on the fly to actually control that. So I think that rise of writing the generic automation scripts that can work on many different websites, to me, made it clear that browsers are going to be a lot more useful, because now you can automate a lot more things without writing. If you wanted to write a script to book a demo call on 100 websites, previously, you had to write 100 scripts. Now you write one script that uses LLMs to generate that script. That's why we built our web browsing framework, Stagehand, which does a lot of that work for you. But those two things, web data collection and then enhanced automation of many different websites, it just felt like big drivers for more browser infrastructure that would be required to power these kinds of features.

Alessio [00:07:05]: And was multimodality also a big thing?

Paul [00:07:08]: Now you can use the LLMs to look, even though the text in the DOM might not be as friendly. Maybe my hot take is I was always kind of like, I didn't think vision would be as big of a driver. For UI automation, I felt like, you know, HTML is structured text and large language models are good with structured text.
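The "one script for 100 websites" idea can be sketched as an instruction-to-selector layer: instead of hard-coding a selector per site, a model maps a natural-language instruction plus the page's DOM to a site-specific selector at runtime. The `callLlm` stub below stands in for a real model call (Stagehand's actual API differs); everything here is a hypothetical illustration.

```javascript
// Stub for an LLM call: given an instruction and a DOM snapshot, return the
// selector to act on. A real system would prompt a model; here we fake it
// with a keyword match so the sketch is runnable.
function callLlm(instruction, domSnapshot) {
  const tags = domSnapshot.match(/<button id="([^"]+)">([^<]+)<\/button>/g) || [];
  for (const tag of tags) {
    const [, id, label] = tag.match(/<button id="([^"]+)">([^<]+)<\/button>/);
    if (instruction.toLowerCase().includes(label.toLowerCase())) {
      return `#${id}`;
    }
  }
  return null;
}

// One generic script: works on any site because the selector is derived
// from the live DOM at runtime, not hard-coded per website.
function findActionTarget(instruction, domSnapshot) {
  return callLlm(instruction, domSnapshot);
}

// Two "websites" with different markup for the same concept.
const siteA = '<button id="signin-btn">Log in</button>';
const siteB = '<button id="auth">Login</button>';

console.log(findActionTarget('click the log in button', siteA)); // "#signin-btn"
console.log(findActionTarget('click the login button', siteB));  // "#auth"
```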
But it's clear that these computer use models are often vision driven, and they've been really pushing things forward. So definitely being multimodal, like rendering the page is required to take a screenshot to give that to a computer use model to take actions on a website. And it's just another win for browsers. But I'll be honest, that wasn't what I was thinking early on. I didn't even think that we'd get here so fast with multimodality. I think we're going to have to get back to multimodal and vision models.

swyx [00:07:50]: This is one of those things where I forgot to mention in my intro that I'm an investor in Browserbase. And I remember that when you pitched to me, like a lot of the stuff that we have today wasn't on the original conversation. But my original thesis was something that we've talked about on the podcast before, which is: take the GPT store, the custom GPT store; every single checkbox and plugin is effectively a startup. And this was the browser one. I think the main hesitation, I think I actually took a while to get back to you. The main hesitation was that there were others. Like you're not the first headless browser startup. It's not even your first headless browser startup. There's always a question of like, will you be the category winner in a place where there's a bunch of incumbents, to be honest, that are bigger than you? They're just not targeted at the AI space. They don't have the backing of Nat Friedman. And there's a bunch of like, you're here in Silicon Valley. They're not. I don't know.

Paul [00:08:47]: I don't know if that's, that was it, but like, there was a, yeah, I mean, like, I think I tried all the other ones and I was like, really disappointed. Like my background is from working at great developer tools companies, and nothing had like the Vercel-like experience. Um, like our biggest competitor actually is partly owned by private equity and they just jacked up their prices quite a bit.
And the dashboard hasn't changed in five years. And I actually used them at my last company and tried them and I was like, oh man, like there really just needs to be something that's like the experience of these great infrastructure companies, like Stripe, like Clerk, like Vercel, that I use and love, but oriented towards this kind of like more specific category, which is browser infrastructure, which is really technically complex. Like a lot of stuff can go wrong on the internet when you're running a browser. The internet is very vast. There's a lot of different configurations. Like there's still websites that only work with Internet Explorer out there. How do you handle that when you're running your own browser infrastructure? These are the problems that we have to think about and solve at Browserbase. And it's, it's certainly a labor of love, but I built this for me, first and foremost. I know it's super cheesy and everyone says that for like their startups, but it really, truly was for me. If you look at like the talks I've done even before Browserbase, I'm just like really excited to try and build a category-defining infrastructure company. And it's rare to have a new category of infrastructure exist. We're here in the Chroma offices and like, you know, vector databases is a new category of infrastructure. Is it, is it, I mean, we're in their office, so, you know, we can, we can debate that one later. That is one.

Multimodality in AI-Powered Browsing

swyx [00:10:16]: That's one of the industry debates.

Paul [00:10:17]: I guess we go back to the LLM OS talk that Karpathy gave way long ago. And like the browser box was very clearly there, and it seemed like the people who were building in this space also agreed that browsers are a core primitive of infrastructure for the LLM OS that's going to exist in the future. And nobody was building something there that I wanted to use. So I had to go build it myself.

swyx [00:10:38]: Yeah.
I mean, exactly that talk, that honestly, that diagram, every box is a startup, and there's the code box and then there's the browser box. I think at some point they will start clashing there. There's always the question of: are you a point solution or are you the sort of all-in-one? And I think the point solutions tend to win quickly, but then the all-in-ones have a very tight cohesive experience. Yeah. Let's talk about just the hard problems of Browserbase. You have on your website, which is beautiful. Thank you. Was there an agency that you used for that? Yeah. Herve.paris.

Paul [00:11:11]: They're amazing. Herve.paris. Yeah. It's H-E-R-V-E. I highly recommend for developers, developer tools founders, to work with consumer agencies, because they end up building beautiful things, and the Parisians know how to build beautiful interfaces. So I got to give props.

swyx [00:11:24]: And chat apps, apparently, they are very fast. Oh yeah. The Mistral chat. Yeah. Mistral. Yeah.

Paul [00:11:31]: Le Chat.

swyx [00:11:31]: Le Chat. And then your videos as well, it was professionally shot, right? The series A video. Yeah.

Alessio [00:11:36]: Nico did the videos. He's amazing. Not the initial video that you shot, but the new one. First one was Austin.

Paul [00:11:41]: Another, another video pretty surprised. But yeah, I mean, like, I think when you think about how you talk about your company, you have to think about the way you present yourself. It's, you know, as a developer, you think you evaluate a company based on like the API reliability and the P95, but a lot of developers say, is the website good? Is the message clear? Do I trust this founder I'm building my whole feature on? So I've tried to nail that as well as like the reliability of the infrastructure. You're right. It's very hard. And there's a lot of kind of footguns that you run into when running headless browsers at scale.
Right.

Competing with Existing Headless Browser Solutions

swyx [00:12:10]: So let's pick one. You have eight features here. Seamless integration. Scalability. Fast, or speed. Secure. Observable. Stealth. That's interesting. Extensible and developer first. What comes to your mind as like the top two, three hardest ones? Yeah.

Running headless browsers at scale

Paul [00:12:26]: I think just running headless browsers at scale is like the hardest one. And maybe can I nerd out for a second? Is that okay? I heard this is a technical audience, so I'll talk to the other nerds. Whoa. They were listening. Yeah. They're upset. They're ready. The AGI is angry. Okay. So how do you run a browser in the cloud? Let's start with that, right? So let's say you're using a popular browser automation framework like Puppeteer, Playwright, or Selenium. Maybe you've written some code locally on your computer that opens up Google, finds the search bar, and then types in, you know, a search for Latent Space and hits the search button. That script works great locally. You can see the little browser open up. You want to take that to production. You want to run the script in a cloud environment, so when your laptop is closed, your browser is doing something. Well, we use Amazon. You know, the first thing I'd reach for is probably like some sort of serverless infrastructure. I would probably try and deploy on a Lambda. But Chrome itself is too big to run on a Lambda. It's over 250 megabytes. So you can't easily start it on a Lambda. So you maybe have to use something like Lambda layers to squeeze it in there. Maybe use a different Chromium build that's lighter. And you get it on the Lambda. Great. It works. But it runs super slowly. It's because Lambdas are very, like, resource limited. They only run with like one vCPU. You can run one process at a time. Remember, Chromium is super beefy.
It's barely running on my MacBook Air. I'm still downloading it from a pre-run. Yeah, from the test earlier, right? I'm joking. But it's big, you know? So like Lambda, it just won't work really well. Maybe it'll work, but you need something faster. Your users want something faster. Okay. Well, let's put it on a beefier instance. Let's get an EC2 server running. Let's throw Chromium on there. Great. Okay. That works well with one user. But what if I want to run like 10 Chromium instances, one for each of my users? Okay. Well, I might need two EC2 instances. Maybe 10. All of a sudden, you have multiple EC2 instances. This sounds like a problem for Kubernetes and Docker, right? Now, all of a sudden, you're using ECS or EKS, the container and Kubernetes solutions by Amazon. You're spinning up and down containers, and you're spending a whole engineer's time on kind of maintaining this stateful distributed system. Those are some of the worst systems to run, because when it's a stateful distributed system, it means that you are bound by the connections to that thing. You have to keep the browser open while someone is working with it, right? That's just a painful architecture to run. And there's all these other little gotchas with Chromium. Like Chromium, which is the open source version of Chrome, by the way: you have to install all these fonts. You want emojis working in your browsers, because your vision model is looking for the emoji. You need to make sure you have the emoji fonts. You need to make sure you have all the right extensions configured, like, oh, do you want ad blocking? How do you configure that? How do you actually record all these browser sessions? Like it's a headless browser. You can't look at it. So you need to have some sort of observability. Maybe you're recording videos and storing those somewhere.
It all kind of adds up to be this just giant monster piece of your project, when all you wanted to do was run a lot of browsers in production for this little script to go to google.com and search. And when I see a complex distributed system, I see an opportunity to build a great infrastructure company. And we really abstract that away with Browserbase, where our customers can use these existing frameworks, Playwright, Puppeteer, Selenium, or our own Stagehand, and connect to our browsers in a serverless-like way, control them, and then just disconnect when they're done. And they don't have to think about the complex distributed system behind all of that. They just get a browser running anywhere, anytime. Really easy to connect to.

swyx [00:15:55]: I'm sure you have questions. My standard question with anything, so essentially you're a serverless browser company, and there's been other serverless things that I'm familiar with in the past, serverless GPUs, serverless website hosting. That's where I come from, with Netlify. One question is just like, you promise to spin up thousands of browsers in milliseconds. I feel like there's no real solution that does that yet. And I'm just kind of curious how. The only solution I know is to kind of keep a warm pool of servers around, which is expensive, but maybe not so expensive because it's just CPUs. So I'm just like, you know. Yeah.

Browsers as a Core Primitive in AI Infrastructure

Paul [00:16:36]: You nailed it, right? I mean, how do you offer a serverless-like experience with something that is clearly not serverless, right? And the answer is, you need to be able to run... We run many browsers on single nodes. We use Kubernetes at Browserbase. So we have many pods that are being scheduled. We have to predictably schedule them up or down. Yes, thousands of browsers in milliseconds is the best case scenario.
If you hit us with 10,000 requests, you may hit a slower cold start, right? So we've done a lot of work on predictive scaling and being able to kind of route stuff to different regions, where we have multiple regions of Browserbase with different pools available. You can also pick the region you want to go to based on lower latency; round-trip time latency is very important with these types of things. There's a lot of requests going over the wire. So for us, like having a VM like Firecracker powering everything under the hood allows us to be super nimble and spin things up or down really quickly with strong multi-tenancy. But in the end, these are like the complex infrastructural challenges that we have to kind of deal with at Browserbase. And we have a lot more stuff on our roadmap to allow customers to have more levers to pull to exchange: do you want really fast browser startup times, or do you want really low costs? And if you're willing to be more flexible on that, we may be able to kind of like work better for your use cases.

swyx [00:17:44]: Since you used Firecracker, shouldn't Fargate do that for you, or did you have to go lower level than that?

Paul [00:17:51]: We had to go lower level than that. I find this a lot with Fargate customers, which is alarming for Fargate. We used to be a giant Fargate customer. Actually, the first version of Browserbase was ECS and Fargate. I think we were actually the largest Fargate customer in our region for a little while. No, what? Yeah, seriously. And unfortunately, it's a great product, but I think if you're an infrastructure company, you actually have to have a deeper level of control over these primitives. I think the same thing is true with databases. We've used other database providers and I think-

swyx [00:18:21]: Yeah, serverless Postgres.

Paul [00:18:23]: Shocker. When you're an infrastructure company, you're on the hook if any provider has an outage.
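The warm-pool approach swyx alludes to, and the fast-startup-versus-cost tradeoff Paul describes, can be sketched as a tiny pool manager: browsers are pre-launched up to a warm target, an acquire is near-instant while the pool has stock, and overflow requests pay a cold start. All numbers and names here are hypothetical, not Browserbase's actual scheduler.

```javascript
// Minimal warm-pool sketch: trade idle cost (warmTarget) for startup latency.
class BrowserPool {
  constructor({ warmTarget, coldStartMs }) {
    this.warmTarget = warmTarget;   // how many browsers to keep pre-launched
    this.coldStartMs = coldStartMs; // simulated cost of launching from scratch
    this.warm = warmTarget;         // pre-launched browsers ready to hand out
  }

  // Returns the simulated latency (ms) to hand a browser to a caller.
  acquire() {
    if (this.warm > 0) {
      this.warm -= 1;
      return 5; // warm hit: effectively instant
    }
    return this.coldStartMs; // pool exhausted: caller pays the cold start
  }

  // Called in the background to refill toward the warm target.
  refill() {
    this.warm = this.warmTarget;
  }
}

const pool = new BrowserPool({ warmTarget: 2, coldStartMs: 3000 });
console.log(pool.acquire()); // 5    (warm)
console.log(pool.acquire()); // 5    (warm)
console.log(pool.acquire()); // 3000 (cold start: the burst exceeded the pool)
pool.refill();
console.log(pool.acquire()); // 5    (warm again)
```

A real scheduler adds predictive scaling (sizing `warmTarget` from traffic forecasts) and per-region pools, which is the routing behavior described above.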
And I can't tell my customers, like, hey, we went down because so-and-so went down. That's not acceptable. So for us, we've really moved to bringing things internally. It's kind of the opposite of what we preach. We tell our customers, don't build this in-house, but then we're like, we build a lot of stuff in-house. But I think it just really depends on what is in the critical path. We try and have deep ownership of that.

Alessio [00:18:46]: On the distributed location side, how does that work for the web, where you might get sort of different content in different locations, but the customer is expecting, you know, if you're in the US, I'm expecting the US version. But if you're spinning up my browser in France, I might get the French version. Yeah.

Paul [00:19:02]: Yeah. That's a good question. Well, generally, like on the localization, there is a thing called locale in the browser. You can set what your locale is, like whether you're an en-US browser or not. But some things do IP-based routing. And in that case, you may want to have a proxy. Like let's say you're running something in Europe, but you want to make sure you're showing up from the US. You may want to use one of our proxy features, so you can turn on proxies to say, like, make sure these connections always come from the United States. Which is necessary too, because when you're browsing the web, you're coming from like a, you know, data center IP, and that can make things a lot harder to browse the web. So we do have kind of like this proxy super network. We have a proxy for you based on where you're going, so you can reliably automate the web. But if you get scheduled in Europe, that doesn't happen as much. We try and schedule you as close to, you know, the origin that you're trying to go to. But generally you have control over the regions you can put your browsers in. So you can specify West 1 or East 1 or Europe. We only have one region of Europe right now, actually.
Yeah.

Alessio [00:19:55]: What's harder, the browser or the proxy? I feel like to me, actually proxying reliably at scale is much harder than spinning up browsers at scale. I'm curious.

Paul [00:20:06]: It's all hard. It's layers of hard, right? Yeah. I think it's different levels of hard. I think the thing with the proxy infrastructure is that we work with many different web proxy providers and some are better than others. Some have good days, some have bad days. And our customers who've built browser infrastructure on their own, they have to go and deal with sketchy actors. Like first they figure out their own browser infrastructure and then they got to go buy a proxy. And it's like, you can pay in Bitcoin and it just kind of feels a little sus, right? It's like you're buying drugs when you're trying to get a proxy online. We have like deep relationships with these counterparties. We're able to audit them and say, is this proxy being sourced ethically? Like it's not running on someone's TV somewhere. Is it free range? Yeah. Free-range organic proxies, right? Right. We do a level of diligence. We're SOC 2, so we have to understand what is going on here. But then we're able to make sure that we route around proxy providers not working. There's proxy providers where the proxy will just stop working all of a sudden. And if you don't have redundant proxying on your own browsers, that's hard down for you, or you may get some serious impacts there. With us, we intelligently know, hey, this proxy is not working. Let's go to this one. And you can kind of build a network of multiple providers to really guarantee the best uptime for our customers. Yeah. So you don't own any proxies? We don't own any proxies. You're right. The team has been saying who wants to take home a little proxy server, but not yet. We're not there yet. You know?

swyx [00:21:25]: It's a very mature market. I don't think you should build that yourself.
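The route-around behavior Paul describes can be sketched as a simple failover across redundant providers: try each in health order, mark failures, and fall through to the next. The provider names and health model below are made up for illustration; a production router would also do active health checks and weighting.

```javascript
// Failover across redundant proxy providers: first healthy one wins.
class ProxyRouter {
  constructor(providers) {
    // providers: array of { name, healthy }
    this.providers = providers;
  }

  // Pick the first provider currently marked healthy.
  pick() {
    const p = this.providers.find(p => p.healthy);
    if (!p) throw new Error('all proxy providers are down');
    return p.name;
  }

  // Failed requests (or health checks) flip a provider to unhealthy,
  // so the next pick() routes around it automatically.
  markDown(name) {
    const p = this.providers.find(p => p.name === name);
    if (p) p.healthy = false;
  }
}

const router = new ProxyRouter([
  { name: 'provider-a', healthy: true }, // hypothetical vendors
  { name: 'provider-b', healthy: true },
  { name: 'provider-c', healthy: true },
]);

console.log(router.pick()); // "provider-a"
router.markDown('provider-a'); // provider-a's proxies stopped working
console.log(router.pick()); // "provider-b"
```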
Like you should just be a super customer of them. Yeah. Scraping, I think, is the main use case for that. I guess. Well, that leads us into CAPTCHAs and also auth, but let's talk about CAPTCHAs. You had a little spiel that you wanted to give about CAPTCHA stuff.

Challenges of Scaling Browser Infrastructure

Paul [00:21:43]: Oh, yeah. I was just, I think a lot of people ask, if you're thinking about proxies, you're thinking about CAPTCHAs too. I think it's the same thing. You can go buy CAPTCHA solvers online, but it's the same buying experience. It's some sketchy website, you have to integrate it, it's not fun to buy these things, you can't really trust it, and the docs are bad. What Browserbase does is we integrate a bunch of different CAPTCHA solvers. We do some stuff in-house, but generally we just integrate with a bunch of known vendors and continually monitor and maintain these things and say, is this working or not? Can we route around it or not? These are CAPTCHA solvers. CAPTCHA solvers, yeah. Not CAPTCHA providers, CAPTCHA solvers. Yeah, sorry. We really try and make sure all of that works for you. I think as a dev, if I'm buying infrastructure, I want it all to work all the time, and it's important for us to provide that experience by making sure everything does work and monitoring it on our own. Yeah. Right now, the world of CAPTCHAs is tricky. I think AI agents in particular are very much ahead of the internet infrastructure. CAPTCHAs are designed to block all types of bots, but there are now good bots and bad bots. I think in the future, CAPTCHAs will be able to identify who a good bot is, hopefully via some sort of KYC. For us, we've been very lucky. We have very little to no known abuse of Browserbase, because we really look into who we work with. And for certain types of CAPTCHA solving, we only allow them on certain types of plans, because we want to make sure that we can know what people are doing, what their use cases are.
And that's really allowed us to try and be an arbiter of good bots, which is our long-term goal. I want to build great relationships with people like Cloudflare so we can agree, hey, here are these acceptable bots. We'll identify them for you and make sure we flag when they come to your website. This is a good bot, you know?

Alessio [00:23:23]: I see. And Cloudflare said they want to do more of this. So they're going to set by default, if they think you're an AI bot, they're going to reject. I'm curious if you think this is something that is going to be at the browser level, or, I mean, the DNS level with Cloudflare seems more where it should belong. But I'm curious how you think about it.

Paul [00:23:40]: I think the web's going to change. You know, I think that the Internet as we have it right now is going to change. And we all need to just accept that the cat is out of the bag. And instead of kind of like wishing the Internet was like it was in the 2000s, where we could have free content online that wouldn't be scraped, it's just, it's not going to happen. And instead, we should think about, one, how can we change the models of, you know, information being published online so people can adequately commercialize it? But two, how do we rebuild applications that expect that AI agents are going to log in on their behalf? Those are the things that are going to allow us to kind of like identify good and bad bots. And I think the team at Clerk has been doing a really good job with this on the authentication side. I actually think that auth is the biggest thing that will prevent agents from accessing stuff, not CAPTCHAs. And I think there will be agent auth in the future.
I don't know if it's going to happen from an individual company, but actually authentication providers that have a, you know, hidden login as agent feature, which will then you put in your email, you'll get a push notification, say like, hey, your browser-based agent wants to log into your Airbnb. You can approve that and then the agent can proceed. That really circumvents the need for captchas or logging in as you and sharing your password. I think agent auth is going to be one way we identify good bots going forward. And I think a lot of this captcha solving stuff is really short-term problems as the internet kind of reorients itself around how it's going to work with agents browsing the web, just like people do. Yeah.

Managing Distributed Browser Locations and Proxies

swyx [00:24:59]: Stitch recently was on Hacker News for talking about agent experience, AX, which is a thing that Netlify is also trying to clone and coin and talk about. And we've talked about this on our previous episodes before in a sense that I actually think that's like maybe the only part of the tech stack that needs to be kind of reinvented for agents. Everything else can stay the same, CLIs, APIs, whatever. But auth, yeah, we need agent auth. And it's mostly like short-lived, like it should not, it should be a distinct, identity from the human, but paired. I almost think like in the same way that every social network should have your main profile and then your alt accounts or your Finsta, it's almost like, you know, every, every human token should be paired with the agent token and the agent token can go and do stuff on behalf of the human token, but not be presumed to be the human. Yeah.

Paul [00:25:48]: It's like, it's, it's actually very similar to OAuth is what I'm thinking. And, you know, Thread from Stitch is an investor, Colin from Clerk, Octaventures, all investors in browser-based because like, I hope they solve this because they'll make browser-based submission more possible.
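The paired human-plus-agent-token flow described here can be sketched in a few lines. This is purely illustrative: the names (AgentToken, request_agent_login, the airbnb:book scope) are hypothetical stand-ins, not any real auth provider's API, and the push-notification approval is simulated inline.

```python
from dataclasses import dataclass, field

@dataclass
class AgentToken:
    """A short-lived agent credential, paired with (but distinct from) a human account."""
    human_user: str
    scopes: set = field(default_factory=set)
    approved: bool = False

def request_agent_login(human_user: str, requested_scopes: set) -> AgentToken:
    # In a real flow this would fire a push notification and block until
    # the human taps "Approve"; here approval is simulated inline.
    token = AgentToken(human_user=human_user, scopes=requested_scopes)
    token.approved = True  # stand-in for the human approving the request
    return token

def can(token: AgentToken, action: str) -> bool:
    # The agent only gets the scopes the human explicitly granted.
    return token.approved and action in token.scopes

token = request_agent_login("alice@example.com", {"airbnb:book"})
print(can(token, "airbnb:book"))     # booking was granted
print(can(token, "airbnb:message"))  # messaging was never approved
```

The point of the shape is the one Paul makes: the agent never holds the human's password, and every action is checked against an explicitly approved scope.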
So we don't have to overcome all these hurdles, but I think it will be an OAuth-like flow where an agent will ask to log in as you, you'll approve the scopes. Like it can book an apartment on Airbnb, but it can't like message anybody. And then, you know, the agent will have some sort of like role-based access control within an application. Yeah. I'm excited for that.swyx [00:26:16]: The tricky part is just, there's one, one layer of delegation here, which is like, you're authoring my user's user or something like that. I don't know if that's tricky or not. Does that make sense? Yeah.Paul [00:26:25]: You know, actually at Twilio, I worked on the login identity and access. Management teams, right? So like I built Twilio's login page.swyx [00:26:31]: You were an intern on that team and then you became the lead in two years? Yeah.Paul [00:26:34]: Yeah. I started as an intern in 2016 and then I was the tech lead of that team. How? That's not normal. I didn't have a life. He's not normal. Look at this guy. I didn't have a girlfriend. I just loved my job. I don't know. I applied to 500 internships for my first job and I got rejected from every single one of them except for Twilio and then eventually Amazon. And they took a shot on me and like, I was getting paid money to write code, which was my dream. Yeah. Yeah. I'm very lucky that like this coding thing worked out because I was going to be doing it regardless. And yeah, I was able to kind of spend a lot of time on a team that was growing at a company that was growing. So it informed a lot of this stuff here. I think these are problems that have been solved with like the SAML protocol with SSO. I think it's a really interesting stuff with like WebAuthn, like these different types of authentication, like schemes that you can use to authenticate people. The tooling is all there. It just needs to be tweaked a little bit to work for agents. And I think the fact that there are companies that are already. 
Providing authentication as a service really sets it up. Well, the thing that's hard is like reinventing the internet for agents. We don't want to rebuild the internet. That's an impossible task. And I think people often say like, well, we'll have this second layer of APIs built for agents. I'm like, we will for the top use cases, but instead we can just tweak the internet as is, which is on the authentication side. I think we're going to be the dumb ones going forward. Unfortunately, I think AI is going to be able to do a lot of the tasks that we do online, which means that it will be able to go to websites, click buttons on our behalf and log in on our behalf too. So with this kind of like web agent future happening, I think with some small structural changes, like you said, it feels like it could all slot in really nicely with the existing internet.

Handling CAPTCHAs and Agent Authentication

swyx [00:28:08]: There's one more thing, which is the, your live view iframe, which lets you take, take control. Yeah. Obviously very key for Operator now, but like, was, is there anything interesting technically there or that the people like, well, people always want this.

Paul [00:28:21]: It was really hard to build, you know, like, so, okay. Headless browsers, you don't see them, right. They're running. They're running in a cloud somewhere. You can't like look at them. And I just want to really make, it's a weird name. I wish we came up with a better name for this thing, but you can't see them. Right. But customers don't trust AI agents, right. At least the first pass. So what we do with our live view is that, you know, when you use Browserbase, you can actually embed a live view of the browser running in the cloud for your customer to see it working. And that's the first reason: to build trust. Like, okay, so I have this script that's going to go automate a website. I can embed it into my web application via an iframe and my customer can watch. I think.
And then we added two-way communication. So now not only can you watch the browser kind of being operated by AI, if you want to pause and actually click around and type within this iframe that's controlling a browser, that's also possible. And this is all thanks to some of the lower level protocol, which is called the Chrome DevTools Protocol. It has an API called startScreencast, and you can also send mouse clicks and button clicks to a remote browser. And this is all embeddable within iframes. You have a browser within a browser, yo. And then you simulate the screen, the click on the other side. Exactly. And this is really nice often for, like, let's say, a CAPTCHA that can't be solved. You saw this with Operator, you know, Operator actually uses a different approach. They use VNC. So, you know, you're able to see, like, you're seeing the whole window here. What we're doing is something a little lower level with the Chrome DevTools Protocol. It's just PNGs being streamed over the wire. But the same thing is true, right? Like, hey, I'm running a window. Pause. Can you do something in this window? Human. Okay, great. Resume. Like sometimes 2FA tokens. Like if you get that text message, you might need a person to type that in. Web agents need human-in-the-loop type workflows still. You still need a person to interact with the browser. And building a UI to proxy that is kind of hard. You may as well just show them the whole browser and say, hey, can you finish this up for me? And then let the AI proceed on afterwards. Is there a future where I stream my current desktop to Browserbase? I don't think so. I think we're very much cloud infrastructure. Yeah. You know, but I think a lot of the stuff we're doing, we do want to, like, build tools. Like, you know, we'll talk about StageHand and, you know, the web agent framework in a second. But, like, there's a case where a lot of people are going desktop first for, you know, consumer use.
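The protocol layer behind this live view is concrete: the Chrome DevTools Protocol is JSON-RPC over a WebSocket. A minimal sketch of the two calls involved follows; the method names and parameters (Page.startScreencast, Input.dispatchMouseEvent) are real CDP, but the cdp() helper is illustrative, and actually using these messages requires a WebSocket connection to a running Chrome.

```python
import json
from itertools import count

# CDP is JSON-RPC over a WebSocket; each command carries an incrementing id.
_ids = count(1)

def cdp(method: str, **params) -> str:
    """Build one CDP command as a JSON string (illustrative helper)."""
    return json.dumps({"id": next(_ids), "method": method, "params": params})

# Ask the browser to stream the viewport as base64-encoded PNG frames.
# Frames arrive as Page.screencastFrame events and must be acknowledged
# with Page.screencastFrameAck, or the stream stalls.
start = cdp("Page.startScreencast", format="png",
            maxWidth=1280, maxHeight=720, everyNthFrame=1)

# A click is a mousePressed/mouseReleased pair at viewport coordinates —
# this is how "click around in the iframe" is relayed to the remote browser.
press = cdp("Input.dispatchMouseEvent", type="mousePressed",
            x=200, y=300, button="left", clickCount=1)
release = cdp("Input.dispatchMouseEvent", type="mouseReleased",
              x=200, y=300, button="left", clickCount=1)
```

Streaming the PNG frames into an iframe on one side and relaying mouse/keyboard events back on the other is, per the transcript, essentially what the two-way live view does.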
And I think Claude is doing a lot of this, where I expect to see, you know, MCPs really oriented around the Claude desktop app for a reason, right? Like, I think a lot of these tools are going to run on your computer because it makes... I think it's breaking out. People are putting it on a server. Oh, really? Okay. Well, sweet. We'll see. We'll see that. I was surprised, though, wasn't I? I think that the browser company, too, with Dia Browser, it runs on your machine. You know, it's going to be...

swyx [00:30:50]: What is it?

Paul [00:30:51]: So, Dia Browser, as far as I understand... I used to use Arc. Yeah. I haven't used Arc. But I'm a big fan of the browser company. I think they're doing a lot of cool stuff in consumer. As far as I understand, it's a browser where you have a sidebar where you can, like, chat with it and it can control the local browser on your machine. So, if you imagine, like, what a consumer web agent is, which it lives alongside your browser, I think Google Chrome has Project Marina, I think. I almost call it Project Marinara for some reason. I don't know why. It's...

swyx [00:31:17]: No, I think it's someone really likes the Waterworld. Oh, I see. The classic Kevin Costner. Yeah.

Paul [00:31:22]: Okay. Project Marinara is a similar thing to the Dia Browser, in my mind, as far as I understand it. You have a browser that has an AI interface that will take over your mouse and keyboard and control the browser for you. Great for consumer use cases. But if you're building applications that rely on a browser and it's more part of a greater, like, AI app experience, you probably need something that's more like infrastructure, not a consumer app.

swyx [00:31:44]: Just because I have explored a little bit in this area, do people want branching? So, I have the state of whatever my browser's in. And then I want, like, 100 clones of this state. Do people do that? Or...

Paul [00:31:56]: People don't do it currently. Yeah.
But it's definitely something we're thinking about. I think the idea of forking a browser is really cool. Technically, kind of hard. We're starting to see this in code execution, where people are, like, forking some, like, code execution, like, processes or forking some tool calls or branching tool calls. Haven't seen it at the browser level yet. But it makes sense. Like, if an AI agent is, like, using a website and it's not sure what path it wants to take to crawl this website. To find the information it's looking for. It would make sense for it to explore both paths in parallel. And that'd be a very, like... A road not taken. Yeah. And hopefully find the right answer. And then say, okay, this was actually the right one. And memorize that. And go there in the future. On the roadmap. For sure. Don't make my roadmap, please. You know?Alessio [00:32:37]: How do you actually do that? Yeah. How do you fork? I feel like the browser is so stateful for so many things.swyx [00:32:42]: Serialize the state. Restore the state. I don't know.Paul [00:32:44]: So, it's one of the reasons why we haven't done it yet. It's hard. You know? Like, to truly fork, it's actually quite difficult. The naive way is to open the same page in a new tab and then, like, hope that it's at the same thing. But if you have a form halfway filled, you may have to, like, take the whole, you know, container. Pause it. All the memory. Duplicate it. Restart it from there. It could be very slow. So, we haven't found a thing. Like, the easy thing to fork is just, like, copy the page object. You know? But I think there needs to be something a little bit more robust there. Yeah.swyx [00:33:12]: So, MorphLabs has this infinite branch thing. Like, wrote a custom fork of Linux or something that let them save the system state and clone it. MorphLabs, hit me up. I'll be a customer. Yeah. That's the only. I think that's the only way to do it. Yeah. Like, unless Chrome has some special API for you. 
Yeah.

Paul [00:33:29]: There's probably something we'll reverse engineer one day. I don't know. Yeah.

Alessio [00:33:32]: Let's talk about StageHand, the AI web browsing framework. You have three core components, Observe, Extract, and Act. Pretty clean landing page. What was the idea behind making a framework? Yeah.

Stagehand: AI web browsing framework

Paul [00:33:43]: So, there's three frameworks that are very popular or already exist, right? Puppeteer, Playwright, Selenium. Those are for building hard-coded scripts to control websites. And as soon as I started to play with LLMs plus browsing, I caught myself, you know, code-genning Playwright code to control a website. I would, like, take the DOM. I'd pass it to an LLM. I'd say, can you generate the Playwright code to click the appropriate button here? And it would do that. And I was like, this really should be part of the frameworks themselves. And I became really obsessed with SDKs that take natural language as part of, like, the API input. And that's what StageHand is. StageHand exposes three APIs, and it's a superset of Playwright. So, if you go to a page, you may want to take an action, click on the button, fill in the form, etc. That's what the act command is for. You may want to extract some data. This one takes a natural language, like, extract the winner of the Super Bowl from this page. You can give it a Zod schema, so it returns a structured output. And then maybe you're building an API. You can do an agent loop, and you want to kind of see what actions are possible on this page before taking one. You can do observe. So, you can observe the actions on the page, and it will generate a list of actions. You can guide it, like, give me actions on this page related to buying an item. And you can, like, buy it now, add to cart, view shipping options, and pass that to an LLM, an agent loop, to say, what's the appropriate action given this high-level goal? So, StageHand isn't a web agent.
It's a framework for building web agents. And we think that agent loops are actually pretty close to the application layer because every application probably has different goals or different ways it wants to take steps. I don't think I've seen a generic. Maybe you guys are the experts here. I haven't seen, like, a really good AI agent framework here. Everyone kind of has their own special sauce, right? I see a lot of developers building their own agent loops, and they're using tools. And I view StageHand as the browser tool. So, we expose act, extract, observe. Your agent can call these tools. And from that, you don't have to worry about it. You don't have to worry about generating Playwright code performantly. You don't have to worry about running it. You can kind of just integrate these three tool calls into your agent loop and reliably automate the web.

swyx [00:35:48]: A special shout-out to Anirudh, who I met at your dinner, who I think listens to the pod. Yeah. Hey, Anirudh.

Paul [00:35:54]: Anirudh's the man. He's a StageHand guy.

swyx [00:35:56]: I mean, the interesting thing about each of these APIs is they're each kind of a startup. Like, specifically extract, you know, Firecrawl is extract. There's, like, Expand AI. There's a whole bunch of, like, extract companies. They just focus on extract. I'm curious. Like, I feel like you guys are going to collide at some point. Like, right now, it's friendly. Everyone's in a blue ocean. At some point, it's going to be valuable enough that there's some turf battle here. I don't think you have a dog in a fight. I think you can mock extract to use an external service if they're better at it than you. But it's just an observation that, like, in the same way that I see each option, each checkbox in the side of custom GPTs becoming a startup or each box in the Karpathy chart being a startup. Like, this is also becoming a thing.
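The act/extract/observe split maps naturally onto a tool-calling loop. Here is a rough sketch of that shape with a stubbed page object; this is not the actual Stagehand SDK (in the real framework these three calls are backed by a live browser and an LLM), and the StubPage matching logic is purely illustrative.

```python
class StubPage:
    """Stand-in for a browser page exposing act / extract / observe as tools."""
    def __init__(self, actions):
        self._actions = actions  # actions "visible" on the fake page
        self.log = []            # what the agent actually clicked

    def observe(self, instruction: str):
        # Real observe() asks an LLM for candidate actions on the page;
        # here we just keyword-match on the last word of the instruction.
        keyword = instruction.split()[-1]
        return [a for a in self._actions if keyword in a]

    def act(self, action: str):
        # Real act() generates and runs Playwright code for the action.
        self.log.append(action)

    def extract(self, instruction: str):
        # Real extract() takes a schema and returns structured data.
        return {"result": f"extracted: {instruction}"}

def agent_loop(page, goal: str):
    # Observe what's possible, pick an action, act, then extract the answer.
    candidates = page.observe(f"actions related to {goal}")
    if candidates:
        page.act(candidates[0])  # a real loop would ask an LLM to choose
    return page.extract(goal)

page = StubPage(["add to cart", "buy it now", "view shipping options"])
out = agent_loop(page, "buy")
```

The division of labor matches the transcript: the framework owns the three browser tools, while the agent loop itself, which decides what to do next, stays in the application.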
Yeah.

Paul [00:36:41]: I mean, like, so the way StageHand works is that it's MIT-licensed, completely open source. You bring your own API key to your LLM of choice. You could choose your LLM. We don't make any money off of the extraction, really. We only really make money if you choose to run it with our browser. You don't have to. You can actually use your own browser, a local browser. You know, StageHand is completely open source for that reason. And, yeah, like, I think if you're building really complex web scraping workflows, I don't know if StageHand is the tool for you. I think it's really more if you're building an AI agent that needs a few general tools or if it's doing a lot of, like, web automation-intensive work. But if you're building a scraping company, StageHand is not your thing. You probably want something that's going to, like, get HTML content, you know, convert that to Markdown, query it. That's not what StageHand does. StageHand is more about reliability. I think we focus a lot on reliability and less so on cost optimization and speed at this point.

swyx [00:37:33]: I actually feel like StageHand, so the way that StageHand works, it's like, you know, page.act, click on the quick start. Yeah. It's kind of the integration test for the code that you would have to write anyway, like the Puppeteer code that you have to write anyway. And when the page structure changes, because it always does, then this is still the test. This is still the test that I would have to write. Yeah. So it's kind of like a testing framework that doesn't need implementation detail.

Paul [00:37:56]: Well, yeah. I mean, Puppeteer, Playwright, and Selenium were all designed as testing frameworks, right? Yeah. And now people are, like, hacking them together to automate the web. I would say, and, like, maybe this is, like, me being too specific. But, like, when I write tests, if the page structure changes without me knowing, I want that test to fail.
So I don't know if, like, AI, like, regenerating that. Like, people are using StageHand for testing. But it's more for, like, usability testing, not, like, testing of, like, does the front end, like, has it changed or not. Okay. But generally where we've seen people, like, really, like, take off is, like, if they're using, you know, something. If they want to build a feature in their application that's kind of like Operator or Deep Research, they're using StageHand to kind of power that tool calling in their own agent loop. Okay. Cool.

swyx [00:38:37]: So let's go into Operator, the first big agent launch of the year from OpenAI. Seems like they have a whole bunch scheduled. You were on break and your phone blew up. What's your just general view of computer use agents is what they're calling it. The overall category before we go into Open Operator, just the overall promise of Operator. I will observe that I tried it once. It was okay. And I never tried it again.

OpenAI's Operator and computer use agents

Paul [00:38:58]: That tracks with my experience, too. Like, I'm a huge fan of the OpenAI team. Like, I do not view Operator as a company killer for Browserbase at all. I think it actually shows people what's possible. I think, like, computer use models make a lot of sense. And I'm actually most excited about computer use models is, like, their ability to, like, really take screenshots and reasoning and output steps. I think that using mouse click or mouse coordinates, I've seen that proved to be less reliable than I would like. And I just wonder if that's the right form factor. What we've done with our framework is anchor it to the DOM itself, anchor it to the actual item. So, like, if it's clicking on something, it's clicking on that thing, you know? Like, it's more accurate. No matter where it is. Yeah, exactly. Because it really ties in nicely.
And it can handle, like, the whole viewport in one go, whereas, like, Operator can only handle what it sees. Can you hover? Is hovering a thing that you can do? I don't know if we expose it as a tool directly, but I'm sure there's, like, an API for hovering. Like, move mouse to this position. Yeah, yeah, yeah. I think you can trigger hover, like, via, like, the JavaScript on the DOM itself. But, no, I think, like, when we saw computer use, everyone's eyes lit up because they realized, like, wow, like, AI is going to actually automate work for people. And I think seeing that kind of happen from both of the labs, and I'm sure we're going to see more labs launch computer use models, I'm excited to see all the stuff that people build with it. I think that I'd love to see computer use power, like, controlling a browser on browser base. And I think, like, Open Operator, which was, like, our open source version of OpenAI's Operator, was our first take on, like, how can we integrate these models into browser base? And we handle the infrastructure and let the labs do the models. I don't have a sense that Operator will be released as an API. I don't know. Maybe it will. I'm curious to see how well that works because I think it's going to be really hard for a company like OpenAI to do things like support CAPTCHA solving or, like, have proxies. Like, I think it's hard for them structurally. Imagine this New York Times headline, OpenAI CAPTCHA solving. Like, that would be a pretty bad headline, this New York Times headline. Browser base solves CAPTCHAs. No one cares. No one cares. And, like, our investors are bored. Like, we're all okay with this, you know? We're building this company knowing that the CAPTCHA solving is short-lived until we figure out how to authenticate good bots. I think it's really hard for a company like OpenAI, who has this brand that's so, so good, to balance with, like, the icky parts of web automation, which it can be kind of complex to solve. 
I'm sure OpenAI knows who to call whenever they need you. Yeah, right. I'm sure they'll have a great partnership.Alessio [00:41:23]: And is Open Operator just, like, a marketing thing for you? Like, how do you think about resource allocation? So, you can spin this up very quickly. And now there's all this, like, open deep research, just open all these things that people are building. We started it, you know. You're the original Open. We're the original Open operator, you know? Is it just, hey, look, this is a demo, but, like, we'll help you build out an actual product for yourself? Like, are you interested in going more of a product route? That's kind of the OpenAI way, right? They started as a model provider and then…Paul [00:41:53]: Yeah, we're not interested in going the product route yet. I view Open Operator as a model provider. It's a reference project, you know? Let's show people how to build these things using the infrastructure and models that are out there. And that's what it is. It's, like, Open Operator is very simple. It's an agent loop. It says, like, take a high-level goal, break it down into steps, use tool calling to accomplish those steps. It takes screenshots and feeds those screenshots into an LLM with the step to generate the right action. It uses stagehand under the hood to actually execute this action. It doesn't use a computer use model. And it, like, has a nice interface using the live view that we talked about, the iframe, to embed that into an application. So I felt like people on launch day wanted to figure out how to build their own version of this. And we turned that around really quickly to show them. And I hope we do that with other things like deep research. We don't have a deep research launch yet. I think David from AOMNI actually has an amazing open deep research that he launched. It has, like, 10K GitHub stars now. So he's crushing that. 
But I think if people want to build these features natively into their application, they need good reference projects. And I think Open Operator is a good example of that.

swyx [00:42:52]: I don't know. Actually, I'm actually pretty bullish on API-driven Operator. Because that's the only way that you can sort of, like, once it's reliable enough, obviously. And now we're nowhere near. But, like, give it five years. It'll happen, you know. And then you can sort of spin this up and browsers are working in the background and you don't necessarily have to know. And it just is booking restaurants for you, whatever. I can definitely see that future happening. I had this on the landing page here. This might be a slightly out of order. But, you know, you have, like, sort of three use cases for browser base. Open Operator. Or this is the operator sort of use case. It's kind of like the workflow automation use case. And it completes with UiPath in the sort of RPA category. Would you agree with that? Yeah, I would agree with that. And then there's Agents we talked about already. And web scraping, which I imagine would be the bulk of your workload right now, right?

Paul [00:43:40]: No, not at all. I'd say actually, like, the majority is browser automation. We're kind of expensive for web scraping. Like, I think that if you're building a web scraping product, if you need to do occasional web scraping or you have to do web scraping that works every single time, you want to use browser automation. Yeah. You want to use browser-based. But if you're building web scraping workflows, what you should do is have a waterfall. You should have the first request is a curl to the website. See if you can get it without even using a browser. And then the second request may be, like, a scraping-specific API. There's, like, a thousand scraping APIs out there that you can use to try and get data. ScrapingBee. ScrapingBee is a great example, right? Yeah.
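The waterfall Paul describes — plain request first, scraping API second, real browser last — is just a fallback chain. A minimal sketch follows; the three fetchers here are hypothetical stubs standing in for a curl-style HTTP GET, a hosted scraping API, and a headless-browser session.

```python
# The scraping "waterfall": try the cheapest option first, escalate only on failure.
def fetch_waterfall(url: str, fetchers):
    for name, fetch in fetchers:
        try:
            body = fetch(url)
            if body:  # got usable content, stop escalating
                return name, body
        except Exception:
            pass  # fall through to the next, heavier option
    raise RuntimeError(f"all fetchers failed for {url}")

# Stubs standing in for the three tiers described in the transcript:
def plain_http(url):    # tier 1: curl-style GET; returns nothing on JS-heavy pages
    return None

def scraping_api(url):  # tier 2: a hosted scraping API that happens to get blocked
    raise Exception("blocked")

def real_browser(url):  # tier 3: loads and hydrates the page in a real browser
    return "<html>hydrated content</html>"

tier, html = fetch_waterfall("https://example.com", [
    ("http", plain_http), ("api", scraping_api), ("browser", real_browser),
])
```

The economics follow directly: the browser tier is the most reliable and the most expensive, so it should only run when the cheaper tiers come back empty.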
And then, like, if those two don't work, bring out the heavy hitter. Like, browser-based will 100% work, right? It will load the page in a real browser, hydrate it. I see.

swyx [00:44:21]: Because a lot of people don't render to JS.

swyx [00:44:25]: Yeah, exactly.

Paul [00:44:26]: So, I mean, the three big use cases, right? Like, you know, automation, web data collection, and then, you know, if you're building anything agentic that needs, like, a browser tool, you want to use browser-based.

Alessio [00:44:35]: Is there any use case that, like, you were super surprised by that people might not even think about? Oh, yeah. Or is it, yeah, anything that you can share? The long tail is crazy. Yeah.

Surprising use cases of Browserbase

Paul [00:44:44]: One of the case studies on our website that I think is the most interesting is this company called Benny. So, the way that it works is if you're on food stamps in the United States, you can actually get rebates if you buy certain things. Yeah. You buy some vegetables. You submit your receipt to the government. They'll give you a little rebate back. Say, hey, thanks for buying vegetables. It's good for you. That process of submitting that receipt is very painful. And the way Benny works is you use their app to take a photo of your receipt, and then Benny will go submit that receipt for you and then deposit the money into your account. That's actually using no AI at all. It's all, like, hard-coded scripts. They maintain the scripts. They've been doing a great job. And they build this amazing consumer app. But it's an example of, like, all these, like, tedious workflows that people have to do to kind of go about their business. And they're doing it for the sake of their day-to-day lives. And I had never known about, like, food stamp rebates or the complex forms you have to do to fill them. But the world is powered by millions and millions of tedious forms, visas. You know, Emirate Lighthouse is a customer, right?
You know, they do the O1 visa. Millions and millions of forms are taking away humans' time. And I hope that Browserbase can help power software that automates away the web forms that we don't need anymore. Yeah.

swyx [00:45:49]: I mean, I'm very supportive of that. I mean, forms. I do think, like, government itself is a big part of it. I think the government itself should embrace AI more to do more sort of human-friendly form filling. Mm-hmm. But I'm not optimistic. I'm not holding my breath. Yeah. We'll see. Okay. I think I'm about to zoom out. I have a little brief thing on computer use, and then we can talk about founder stuff, which is, I tend to think of developer tooling markets in impossible triangles, where everyone starts in a niche, and then they start to branch out. So I already hinted at a little bit of this, right? We mentioned Morph. We mentioned E2B. We mentioned Firecrawl. And then there's Browserbase. So there's, like, all this stuff of, like, have serverless virtual computer that you give to an agent and let them do stuff with it. And there's various ways of connecting it to the internet. You can just connect to a search API, like SERP API, whatever other, like, EXA is another one. That's what you're searching. You can also have a JSON markdown extractor, which is Firecrawl. Or you can have a virtual browser like Browserbase, or you can have a virtual machine like Morph. And then there's also maybe, like, a virtual sort of code environment, like Code Interpreter. So, like, there's just, like, a bunch of different ways to tackle the problem of give a computer to an agent. And I'm just kind of wondering if you see, like, everyone's just, like, happily coexisting in their respective niches. And as a developer, I just go and pick, like, a shopping basket of one of each. Or do you think that you eventually, people will collide?

Future of browser automation and market competition

Paul [00:47:18]: I think that currently it's not a zero-sum market.
Like, I think we're talking about... I think we're talking about all of knowledge work that people do that can be automated online. All of these, like, trillions of hours that happen online where people are working. And I think that there's so much software to be built that, like, I tend not to think about how these companies will collide. I just try to solve the problem as best as I can and make this specific piece of infrastructure, which I think is an important primitive, the best I possibly can. And yeah. I think there's players that are actually going to like it. I think there's players that are going to launch, like, over-the-top, you know, platforms, like agent platforms that have all these tools built in, right? Like, who's building the rippling for agent tools that has the search tool, the browser tool, the operating system tool, right? There are some. There are some. There are some, right? And I think in the end, what I have seen as my time as a developer, and I look at all the favorite tools that I have, is that, like, for tools and primitives with sufficient levels of complexity, you need to have a solution that's really bespoke to that primitive, you know? And I am sufficiently convinced that the browser is complex enough to deserve a primitive. Obviously, I have to. I'm the founder of BrowserBase, right? I'm talking my book. But, like, I think maybe I can give you one spicy take against, like, maybe just whole OS running. I think that when I look at computer use when it first came out, I saw that the majority of use cases for computer use were controlling a browser. And do we really need to run an entire operating system just to control a browser? I don't think so. I don't think that's necessary. You know, BrowserBase can run browsers for way cheaper than you can if you're running a full-fledged OS with a GUI, you know, operating system. And I think that's just an advantage of the browser. 
It is, like, browsers are little OSs, and you can run them very efficiently if you orchestrate it well. And I think that allows us to offer 90% of the, you know, functionality in the platform needed at 10% of the cost of running a full OS. Yeah.

Open Operator: Browserbase's Open-Source Alternative

swyx [00:49:16]: I definitely see the logic in that. There's a Marc Andreessen quote. I don't know if you know this one. Where he basically observed that the browser is turning the operating system into a poorly debugged set of device drivers, because most of the apps are moved from the OS to the browser. So you can just run browsers.

Paul [00:49:31]: There's a place for OSs, too. Like, I think that there are some applications that only run on Windows operating systems. And Eric from pig.dev in this upcoming YC batch, or last YC batch, like, he's building a way to run tons of Windows operating systems for you to control with your agent. And like, there's some legacy EHR systems that only run on Internet Explorer. Yeah.

Paul [00:49:54]: I think that's it. I think, like, there are use cases for specific operating systems for specific legacy software. And like, I'm excited to see what he does with that. I just wanted to give a shout out to the pig.dev website.

swyx [00:50:06]: The pigs jump when you click on them. Yeah. That's great.

Paul [00:50:08]: Eric, he's the former co-founder of banana.dev, too.

swyx [00:50:11]: Oh, that Eric. Yeah. That Eric. Okay. Well, he abandoned bananas for pigs. I hope he doesn't start going around with pigs now.

Alessio [00:50:18]: Like he was going around with bananas. A little toy pig. Yeah. Yeah. I love that. What else are we missing? I think we covered a lot of, like, the browser-based product history, but. What do you wish people asked you? Yeah.

Paul [00:50:29]: I wish people asked me more about, like, what will the future of software look like? Because I think that's really where I've spent a lot of time about why I do Browserbase.
Like, for me, starting a company is like a means of last resort. Like, you shouldn't start a company unless you absolutely have to. And I remain convinced that the future of software is software that you're going to click a button and it's going to do stuff on your behalf. Right now, software. You click a button and it maybe, like, calls it back an API and, like, computes some numbers. It, like, modifies some text, whatever. But the future of software is software using software. So, I may log into my accounting website for my business, click a button, and it's going to go load up my Gmail, search my emails, find the thing, upload the receipt, and then comment it for me. Right? And it may use it using APIs, maybe a browser. I don't know. I think it's a little bit of both. But that's completely different from how we've built software so far. And that's. I think that future of software has different infrastructure requirements. It's going to require different UIs. It's going to require different pieces of infrastructure. I think the browser infrastructure is one piece that fits into that, along with all the other categories you mentioned. So, I think that it's going to require developers to think differently about how they've built software for, you know

Ancient Warfare Podcast
AWA338 - Lambdas and ancient Greek shield devices

Ancient Warfare Podcast

Play Episode Listen Later Jan 3, 2025 12:57


For the first episode of 2025, we have this from @mrookeward, who asks Murray to explore some of the tropes (or not tropes) for 'uniforms', e.g. the Spartan lambda shield or ancient Egyptian headwear. Join us on Patreon: patreon.com/ancientwarfarepodcast

airhacks.fm podcast with adam bien
From XML-Driven Enterprise Java to Serverless AWS Lambdas

airhacks.fm podcast with adam bien

Play Episode Listen Later Nov 10, 2024 56:07


An airhacks.fm conversation with Vadym Kazulkin (@VKazulkin) about: journey as a Java developer from the late 1990s to present, early experiences with Java and J2EE development, transition to cloud and serverless technologies, particularly AWS Lambda, discussion of Java performance on Lambda compared to Node.js, detailed explanation of AWS SnapStart technology for improving Java cold starts, pros and cons of "fat" Lambda functions versus microservices, challenges of using GraalVM with Lambda, importance of optimizing Lambda package size and dependencies, comparison of Quarkus and Spring Boot on Lambda, benefits of serverless architecture for business logic focus, involvement with Java User Group Bonn and AWS Community Builder program, brief mention of asynchronous patterns in serverless architectures, importance of staying technically hands-on as a manager in the rapidly evolving cloud world. Vadym Kazulkin on Twitter: @VKazulkin

Cloud Security Podcast
Cloud Native Strategies from a FinTech CISO

Cloud Security Podcast

Play Episode Listen Later Jul 30, 2024 21:56


What are you doing differently today to stop tomorrow's legacy? In this episode Ashish spoke to Adrian Asher, CISO and Cloud Architect at Checkout.com, to explore the journey from monolithic architecture to cloud-native solutions in a regulated fintech environment. Adrian shared his perspective on why there "aren't enough lambdas" and how embracing cloud-native technologies like AWS Lambda and Fargate can enhance security, scalability, and efficiency.

Guest Socials: Adrian's LinkedIn

Podcast Twitter - @CloudSecPod

If you want to watch videos of this LIVE STREAMED episode and past episodes, check out our other Cloud Security social channels:
- Cloud Security Podcast - Youtube
- Cloud Security Newsletter
- Cloud Security BootCamp

Questions asked:
(00:00) Introduction
(01:59) A bit about Adrian
(02:47) Cloud Naive vs Cloud Native
(03:54) Checkout's Cloud Native Journey
(05:44) What is AWS Fargate?
(06:52) There are not enough Lambdas
(09:52) The evolution of the Security Function
(12:15) Culture change for being more cloud native
(15:23) Getting security teams ready for Gen AI
(18:16) Where to start with Cloud Native?
(19:14) Where you can connect with Adrian?
(19:39) The Fun Section

AWS Developers Podcast
Stream responses from your GraphQL API with asynchronous Lambda functions

AWS Developers Podcast

Play Episode Listen Later Jul 12, 2024 26:37


Dive into the world of GraphQL APIs on AWS this week! We'll explore the recently launched feature in AppSync: asynchronous Lambda functions for GraphQL resolvers. But first, we'll break down the advantages of GraphQL over REST APIs and the limitations of synchronous calls in GraphQL. Then, we'll uncover the power of async Lambdas: stream data directly to your client for a more responsive experience and unlock innovative use cases, like generative AI-powered chatbots built with Lambdas. Curious how this can transform your applications? Tune in to learn more! With Derek Bingham, Developer Advocate, AWS: https://www.linkedin.com/in/derekwbingham/

- Derek's blog about AppSync async Lambda resolvers: https://community.aws/content/2hlqAp86YvckSS2DrVvZ1qdArqF/async-lambda-and-appsync?lang=en
- AWS AppSync: https://docs.aws.amazon.com/appsync/latest/devguide/what-is-appsync.html
- AWS Lambda: https://docs.aws.amazon.com/lambda/latest/dg/welcome.html
- Streaming a response from a Lambda function: https://docs.aws.amazon.com/lambda/latest/dg/configuration-response-streaming.html
- AWS AppSync sample code: https://github.com/aws-samples/aws-appsync-resolver-samples
- Michael (AppSync Developer Advocate) YouTube channel: https://www.youtube.com/@focusotter/videos
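The streaming benefit described in this episode is easy to see in miniature: a resolver that pushes chunks as they are produced lets the client render partial output instead of waiting for the full payload. Below is a minimal, framework-free Python sketch of that idea; the function names are illustrative assumptions, not the actual AppSync or Lambda streaming API:

```python
from typing import Iterator


def generate_answer(prompt: str) -> Iterator[str]:
    # Stand-in for a slow, token-by-token producer (e.g. an LLM behind
    # the resolver); each chunk is available before the whole response.
    for token in ["Hello", ", ", prompt, "!"]:
        yield token


def buffered_handler(prompt: str) -> str:
    # Synchronous-resolver style: the client sees nothing until the
    # entire payload has been assembled.
    return "".join(generate_answer(prompt))


def streaming_handler(prompt: str, send) -> None:
    # Async/streaming-resolver style: each chunk is pushed to the client
    # as soon as it exists, so perceived latency drops.
    for chunk in generate_answer(prompt):
        send(chunk)


received = []
streaming_handler("world", received.append)
assert "".join(received) == buffered_handler("world") == "Hello, world!"
```

Both handlers deliver the same bytes; the streaming variant just delivers them incrementally, which is the property the chatbot use case in the episode relies on.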

IFTTD - If This Then Dev
#219.exe - Tezos: La blockchain comme des lambdas par Sammy Teillet

IFTTD - If This Then Dev

Play Episode Listen Later Apr 5, 2024 11:38


For episode #219 I welcomed Xavier van de Woestyne. We debrief it with Sammy.

**Discover Shopify: Your E-commerce Ally** "Are you a developer or entrepreneur looking to create or optimize your online store? Look no further than Shopify. This all-in-one commerce platform gives you the tools you need to launch, manage, and grow your business with ease and efficiency. Whether you sell in person or online, Shopify adapts to your needs and lets you customize your e-commerce experience. With an intuitive interface and a wide range of powerful management tools, Shopify turns the selling process into a smooth and enjoyable experience. Take advantage of a one-euro-per-month trial period by signing up on Shopify. Take control of your business adventure and take your brand to the next level with Shopify."

Archives | Site | Boutique | TikTok | Discord | Twitter | LinkedIn | Instagram | Youtube | Twitch | Job Board |

Charlas técnicas de AWS (AWS en Español)
#5.01 - La Vida de un SRE, con Pelado Nerd

Charlas técnicas de AWS (AWS en Español)

Play Episode Listen Later Feb 12, 2024 69:19


In this first episode of Season 5, we talk with Pelado Nerd, a well-known SRE and content creator on YouTube. We explore his journey from his beginnings to his success on YouTube, as well as his experience in the world of Site Reliability Engineering (SRE). We discuss the day-to-day of an SRE and essential tools for the role, among other topics.

Table of Contents
01:34 Intro to the guest, the origins of Pelado...
03:50 Your side as a content creator.
11:00 Applying what was learned on YouTube and vice versa.
14:00 Balancing exercise with work / improving productivity
18:30 The day-to-day of an SRE.
25:47 The 3 essential SRE tools
27:04 The big advantage of Kubernetes
31:21 Kubernetes is NOT the golden option for everything
33:45 Launching 300 nodes... in 30 minutes!
35:12 Discovering warm-ups
40:13 Horror stories: goodbye to the certificates
44:28 Advice for future SREs
48:30 What do you want Jenkins for? Use Dagger.
52:20 Cluster scaling with Karpenter
55:20 Lambdas in containers
58:12 The future of K8s and the 3rd wave of containers: WASM
1:01:45 The impact of AI on infrastructure
1:04:50 Final recommendations

Guest's Social Media
Twitter: https://twitter.com/peladonerd
YouTube: https://www.youtube.com/@PeladoNerd
LinkedIn: https://www.linkedin.com/in/pablofredrikson/

Videos Mentioned
Docker de Novato a Pro: https://www.youtube.com/watch?v=CV_Uf3Dq-EU&t=115s
Introducción a Dagger: https://www.youtube.com/watch?v=lGl1UlcODLQ
WASM, la 3ra ola de contenedores: https://www.youtube.com/watch?v=bgWTf3m6HG0
LENS, la mejor interfaz para K8s: https://www.youtube.com/watch?v=DFMKcR4BqwM
Crossplane, mejor que Terraform?: https://www.youtube.com/watch?v=dWbEvHOtljg&t=129s

Recommendations
Book: Time Management for System Administrators: https://amzn.eu/d/fL7FiUl
Book: Site Reliability Engineering (free): https://sre.google/books/
Pelado Entrena channel, the challenge of running a marathon: https://www.youtube.com/@PeladoEntrena

✉️ If you want to write to us, you can do so at this email: podcast-aws-espanol@amazon.com

You can find the podcast at this link: https://aws-espanol.buzzsprout.com/
Or on your favorite podcast platform.
More information and tutorials on the Charlas Técnicas YouTube channel.

#foobar #AWSenEspañol

Friday Afternoon Deploy:  A Developer Podcast
DDD (Disappointment Driven Development)

Friday Afternoon Deploy: A Developer Podcast

Play Episode Listen Later Dec 22, 2023 55:08


Disappointment drives development far beyond tech, it's just one [area]... where we can see it, because we have the ability to build almost anything. We see the full cycle of it quickly...

Show Notes:
Joseph gives Waffle House tips for cooking (0:44)
PKD & Entomology in Patreon (2:25)
Tyrel & Joseph shared a roommate on weekends (7:55)
SPOILER ALERT we spoil every literary work and movie… (11:05)
Custom software in a decision & delivery world (16:50)
Modularization, composition, scalability of Lambdas (22:12)
Rage tools and Disappointment Driven Development (31:34)
Tyrel would buy a typewriter for software development (37:40)
Joseph, Tyrel, & Alan talk good ol' days of social networks (40:16)
Alan says paradoxically original (46:05)
Uncle Bob is still our hero (51:52)

We're Thinking About:
Kafka and Spark (and Hadoop)
Segment Anything Model
Meta's AI image creator
Vision Pro goggles

Show Links:
Misty Mountain Hop - Led Zeppelin
Robert Jordan - WoT
High Art
Patrick Rothfuss
Kingkiller Chronicle
Unreleased Wu-Tang album sold
Once Upon a Time in Shaolin
Best FB fake story out there
Kafka for event-driven Arch
Kafka
The original Kafka
Xanga still exists? - we don't even know if this is the same xanga…
SLC Punk
An Inconvenient Truth
Uncle Bob & Professionalism

Support Friday Afternoon Deploy Online:
Facebook | Twitter | Patreon | Teespring

Screaming in the Cloud
Learnings From A Lifelong Career in Open-Source with Amir Szekely

Screaming in the Cloud

Play Episode Listen Later Nov 7, 2023 38:47


Amir Szekely, Owner at CloudSnorkel, joins Corey on Screaming in the Cloud to discuss how he got his start in the early days of cloud and his solo project, CloudSnorkel. Throughout this conversation, Corey and Amir discuss the importance of being pragmatic when moving to the cloud, and the different approaches they see in developers from the early days of cloud to now. Amir shares what motivates him to develop open-source projects, and why he finds fulfillment in fixing bugs and operating CloudSnorkel as a one-man show.

About Amir
Amir Szekely is a cloud consultant specializing in deployment automation, AWS CDK, CloudFormation, and CI/CD. His background includes security, virtualization, and Windows development. Amir enjoys creating open-source projects like cdk-github-runners, cdk-turbo-layers, and NSIS.

Links Referenced:
CloudSnorkel: https://cloudsnorkel.com/
lasttootinaws.com: https://lasttootinaws.com
camelcamelcamel.com: https://camelcamelcamel.com
github.com/cloudsnorkel: https://github.com/cloudsnorkel
Personal website: https://kichik.com

Transcript

Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.

Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn, and this is an episode that I have been angling for for longer than you might imagine. My guest today is Amir Szekely, who's the owner at CloudSnorkel. Amir, thank you for joining me.

Amir: Thanks for having me, Corey.
I love being here.

Corey: So, I've been using one of your open-source projects for an embarrassingly long amount of time, and for the longest time, I make the critical mistake of referring to the project itself as CloudSnorkel because that's the word that shows up in the GitHub project that I can actually see that jumps out at me. The actual name of the project within your org is cdk-github-runners if I'm not mistaken.

Amir: That's real original, right?

Corey: Exactly. It's like, “Oh, good, I'll just mention that, and suddenly everyone will know what I'm talking about.” But ignoring the problems of naming things well, which is a pain that everyone at AWS or who uses it knows far too well, the product is basically magic. Before I wind up basically embarrassing myself by doing a poor job of explaining what it is, how do you think about it?

Amir: Well, I mean, it's a pretty simple project, which I think is what makes it great as well. It creates GitHub runners with CDK. That's about it. It's in the name, and it just does that. And I really tried to make it as simple as possible and kind of learn from other projects that I've seen that are similar, and basically learn from my pain points in them.

I think the reason I started is because I actually deployed CDK runners—sorry, GitHub runners—for one company, and I ended up using the Kubernetes one, right? So, GitHub in themselves, they have two projects they recommend—and not to nudge GitHub, please recommend my project one day as well—they have the Kubernetes controller and they have the Terraform deployer. And the specific client that I worked for, they wanted to use Kubernetes. And I tried to deploy it, and, Corey, I swear, I worked three days; three days to deploy the thing, which was crazy to me.
And every single step of the way, I had to go and read some documentation, figure out what I did wrong, and apparently the order the documentation was in was incorrect.

And I had to—I even opened tickets, and they—you know, they were rightfully like, “It's an open-source project. Please contribute and fix the documentation for us.” At that point, I said, “Nah.” [laugh]. Let me create something better with CDK and I decided just to have the simplest setup possible.

So usually, right, what you end up doing in these projects, you have to set up either secrets or SSM parameters, and you have to prepare the ground and you have to get your GitHub token and all those things. And that's just annoying. So, I decided to create a—

Corey: So much busy work.

Amir: Yes, yeah, so much busy work and so much boilerplate and so much figuring out the right way and the right order, and just annoying. So, I decided to create a setup page. I thought, “What if you can actually install it just like you install any app on GitHub,” which is the way it's supposed to be, right? So, when you install cdk-github-runners—CloudSnorkel—you get an HTML page and you just click a few buttons and you tell it where to install it and it just installs it for you. And it sets the secrets and everything. And if you want to change the secret, you don't have to redeploy. You can just change the secret, right? You have to roll the token over or whatever. So, it's much, much easier to install.

Corey: And I feel like I discovered this project through one of the more surreal approaches—and I had cause to revisit it a few weeks ago when I was redoing my talk for the CDK Community Day, which has since happened and people liked the talk—and I mentioned what CloudSnorkel had been doing and how I was using the runners accordingly. So, that was what accidentally caused me to pop back up with, “Hey, I've got some issues here.” But we'll get to that.
Because once upon a time, I built a Twitter client for creating threads because shitposting is my love language, I would sit and create Twitter threads in the middle of live keynote talks. Threading in the native client was always terrible, and I wanted to build something that would help me do that. So, I did.

And it was up for a while. It's not anymore because I'm not paying $42,000 a month in API costs to some jackass, but it still exists in the form of lasttootinaws.com if you want to create threads on Mastodon. But after I put this out, some people complained that it was slow.

To which my response was, “What do you mean? It's super fast for me in San Francisco talking to it hosted in Oregon.” But on every round trip from halfway around the world, it became a problem. So, I got it into my head that since this thing was fully stateless, other than a Lambda function being fronted via an API Gateway, that I should deploy it to every region. It didn't quite fit into a Cloudflare Worker or into one of the Edge Lambda functions that AWS has given up on, but okay, how do I deploy something to every region?

And the answer is, with great difficulty because it's clear that no one was ever imagining with all those regions that anyone would use all of them. It's imagined that most customers use two or three, but customers are different, so which two or three is going to be widely varied. So, anything halfway sensible about doing deployments like this didn't work out. Again, because this thing was also a Lambda function and an API Gateway, it was dirt cheap, so I didn't really want to start spending stupid amounts of money doing deployment infrastructure and the rest.

So okay, how do I do this? Well, GitHub Actions is awesome. It is basically what all of AWS's code offerings wish that they were. CodeBuild is sad and this was kind of great.
The problem is, once you're out of the free tier, and if you're a bad developer where you do a deploy on every iteration, suddenly it starts costing, for what I was doing in every region, something like a quarter per deploy, which adds up when you're really, really bad at programming.

Amir: [laugh].

Corey: So, their matrix jobs are awesome, but I wanted to do some self-hosted runners. How do I do that? And I want to keep it cheap, so how do I do a self-hosted runner inside of a Lambda function? Which led me directly to you. And it was nothing short of astonishing. This was a few years ago. I seem to recall that it used to be a bit less well-architected in terms of its elegance. Did it always use Step Functions, for example, to wind up orchestrating these things?

Amir: Yeah, so I do remember that day. We met pretty much… basically as a joke because the Lambda Runner was a joke that I did, and I posted on Twitter, and I was half-proud of my joke that starts in ten seconds, right? But yeah, no, the—I think it always used Step Functions. I've been kind of in love with Step Functions for the past two years. They just—they're nice.

Corey: Oh, they're magic, and AWS is so bad at telling their story. Both of those things are true.

Amir: Yeah. And the API is not amazing. But like, when you get it working—and you know, you have to spend some time to get it working—it's really nice because then you have nothing to manage, ever. And they can call APIs directly now, so you don't have to even create Lambdas. It's pretty cool.

Corey: And what I loved is you wind up deploying this thing to whatever account you want it to live within. What is it, the OIDC? I always get those letters in the wrong direction. OIDC, I think, is correct.

Amir: I think it's OIDC, yeah.

Corey: Yeah, and it winds up doing this through a secure method as opposed to just okay, now anyone with access to the project can deploy into your account, which is not ideal. And it just works.
It spins up a whole bunch of these Lambda functions that are using a Docker image as the deployment environment. And yeah, all right, if effectively my CDK deploy—which is what it's doing inside of this thing—doesn't complete within 15 minutes, then it's not going to and the thing is going to break out. We've solved the halting problem. After 15 minutes, the loop will terminate. The end.

But that's never been a problem, even with getting ACM certificates spun up. It completes well within that time limit. And its cost to me is effectively nothing. With one key exception: that you made the choice to use Secrets Manager to wind up storing a lot of the things it cares about instead of Parameter Store, so I think you wind up costing me—I think there's two of those different secrets, so that's 80 cents a month. Which I will be demanding in blood one of these days if I ever catch you at re:Invent.

Amir: I'll buy you beer [laugh].

Corey: There we go. That'll count. That'll buy, like, several months of that. That works—at re:Invent, no. The beers there are, like, $18, so that'll cover me for years. We're set.

Amir: We'll split it [laugh].

Corey: Exactly. Problem solved. But I like the elegance of it, I like how clever it is, and I want to be very clear, though, it's not just for shitposting. Because it's very configurable where, yes, you can use Lambda functions, you can use Spot Instances, you can use CodeBuild containers, you can use Fargate containers, you can use EC2 instances, and it just automatically orchestrates and adds these self-hosted runners to your account, and every build gets a pristine environment as a result. That is no small thing.

Amir: Oh, and I love making things configurable. People really appreciate it I feel, you know, and gives people kind of a sense of power.
But as long as you make that configuration simple enough, right, or at least the defaults good defaults, right, then, even with that power, people still don't shoot themselves in the foot and it still works really well. By the way, we just added ECS recently, which people really were asking for because it gives you the, kind of, easy option to have the runner—well, not the runner but at least the runner infrastructure staying up, right? So, you can have an auto-scaling group backing ECS and then the runner can start up a lot faster. It was actually very important to other people because Lambda, as fast as it is, it's limited, and Fargate, for whatever reason, still to this day, takes a minute to start up.

Corey: Yeah. What's wild to me about this is, start to finish, I hit a deploy to the main branch and it sparks the thing up, runs the deploy. Deploy itself takes a little over two minutes. And every time I do this, within three minutes of me pushing to commit, the deploy is done globally. It is lightning fast.

And I know it's easy to lose yourself in the idea of this being a giant shitpost, where, oh, who's going to do deployment jobs in Lambda functions? Well, kind of a lot of us for a variety of reasons, some of which might be better than others. In my case, it was just because I was cheap, but the massive parallelization ability to do 20 simultaneous deploys in a matrix configuration that doesn't wind up smacking into rate limits everywhere, that was kind of great.

Amir: Yeah, we have seen people use Lambda a lot. It's mostly for, yeah, like you said, small jobs. And the environment that they give you, it's kind of limited, so you can't actually install packages, right? There is no sudo, and you can't actually install anything unless it's in your temp directory. But still, like, just being able to run a lot of little jobs, it's really great.
Yeah.

Corey: And you can also make sure that there's a Docker image ready to go with the stuff that you need, just by configuring how the build works in the CDK. I will admit, I did have a couple of bug reports for you. One was kind of useful, where it was not at all clear how to do this on top of a Graviton-based Lambda function—because yeah, that was back when not everything really supported ARM architectures super well—and a couple of other times when the documentation was fairly ambiguous from my perspective, where it wasn't at all clear, what was I doing? I spent four hours trying to beat my way through it, I give up, filed an issue, went to get a cup of coffee, came back, and the answer was sitting there waiting for me because I'm not convinced you sleep.

Amir: Well, I am a vampire. My last name is from the Transylvania area [laugh]. So—

Corey: Excellent. Excellent.

Amir: By the way, not the first time people tell me that. But anyway [laugh].

Corey: There's something to be said for getting immediate responsiveness because one of the reasons I'm always so loath to go and do a support ticket anywhere is this is going to take weeks. And then someone's going to come back with a, “I don't get it.” And try and, like, read the support portfolio to you. No, you went right into yeah, it's this. Fix it and your problem goes away. And sure enough, it did.

Amir: The escalation process that some companies put you through is very frustrating. I mean, lucky for you, CloudSnorkel is a one-man show and this man loves solving bugs. So [laugh].

Corey: Yeah. Do you know of anyone using it for anything that isn't ridiculous and trivial like what I'm using it for?

Amir: Yeah, I have to think whether or not I can… I mean, so—okay. We have a bunch of dedicated users, right, the GitHub repo, that keep posting bugs and keep posting even patches, right, so you can tell that they're using it.
I even have one sponsor, one recurring sponsor on GitHub that uses it.

Corey: It's always nice when people thank you via money.

Amir: Yeah. Yeah, it is very validating. I think [BLEEP] is using it, but I also don't think I can actually say it because I got it from the GitHub.

Corey: It's always fun. That's the beautiful part about open-source. You don't know who's using this. You see what other things people are working on, and you never know, is one of their—is this someone's side project, is it a skunkworks thing, or God forbid, is this inside of every car going forward and no one bothered to tell me about that. That is the magic and mystery of open-source. And you've been doing open-source for longer than I have and I thought I was old. You were originally named in some of the WinAMP credits, for God's sake, that media player that really whipped the llama's ass.

Amir: Oh, yeah, I started real early. I started about when I was 15, I think. I started off with Pascal or something or even Perl, and then I decided I have to learn C and I have to learn Windows API. I don't know what possessed me to do that. Win32 API is… unique [laugh].

But once I created those applications for myself, right, I think there was—oh my God, do you know the—what is it called, Sherlock in macOS, right? And these days, for PowerToys, there is the equivalent of it called, I don't know, whatever that—PowerBar? That's exactly—that was that. That's a project I created as a kid. I wanted something where I can go to the Run menu of Windows when you hit Winkey R, and you can just type something and it will start it up, right?

I didn't want to go to the Start menu and browse and click things. I wanted to do everything with the keyboard. So, I created something called Blazerun [laugh], which [laugh] helped you really easily create shortcuts that went into your path, right, the Windows path, so you can really easily start them from Winkey R.
I don't think that anyone besides me used it, but anyway, that thing needed an installer, right? Because Windows, you got to install things. So, I ended up—

Corey: Yeah, these days on Mac OS, I use Alfred for that which is kind of long in the tooth, but there's a launch bar and a bunch of other stuff for it. What I love is that if I—I can double-tap the command key and that just pops up whatever I need it to and tell the computer what to do. It feels like there's an AI play in there somewhere if people can figure out how to spend ten minutes on building AI that does something other than lets them fire their customer service staff.

Amir: Oh, my God. Please don't fire customer service staff. AI is so bad.

Corey: Yeah, when I reach out to talk to a human, I really needed a human.

Amir: Yes. Like, I'm not calling you because I want to talk to a robot. I know there's a website. Leave me alone, just give me a person.

Corey: Yeah. Like, you already failed to solve my problem on your website. It's person time.

Amir: Exactly. Oh, my God. Anyway [laugh]. So, I had to create an installer, right, and I found it was called NSIS. So, it was a Nullsoft “SuperPiMP” installation system. Or in the future, when Justin, the guy who created Winamp and NSIS, tried to tone down a little bit, Nullsoft Scriptable Installation System. And SuperPiMP is—this is such useless history for you, right, but SuperPiMP is the next generation of PiMP which is Plug-in Mini Packager [laugh].

Corey: I remember so many of the—like, these days, no one would ever name any project like that, just because it's so off-putting to people with sensibilities, but back then that was half the stuff that came out. “Oh, you don't like how this thing I built for free in the wee hours when I wasn't working at my fast food job wound up—you know, like, how I chose to name it, well, that's okay. Don't use it. Go build your own. Oh, what you're using it anyway. That's what I thought.”
The source code was filled with profanity, too. And like, I didn't care, I really did not care, but some people would complain and open bug reports and patches. And my policy was kind of like, okay if you're complaining, I'm just going to ignore you. If you're opening a patch, fine, I'm going to accept that you're—you guys want to create something that's sensible for everybody, sure.I mean, it's just source code, you know? Whatever. So yeah, I started working on that NSIS. I used it for myself and I joined the forums—and this kind of answers to your question of why I respond to things so fast, just because of the fun—I did the same when I was 15, right? I started going on the forums, you remember forums? You remember that [laugh]?Corey: Oh, yeah, back before they all became terrible and monetized.Amir: Oh, yeah. So, you know, people were using NSIS, too, and they had requests, right? They wanted. Back in the day—what was it—there was only support for 16-bit colors for the icon, so they want 32-bit colors and big colors—32—big icon, sorry, 32 pixels by 32 pixels. Remember, 32 pixels?Corey: Oh, yes. Not well, and not happily, but I remember it.Amir: Yeah. So, I started just, you know, giving people—working on that open-source and creating up a fork. It wasn't even called ‘fork' back then, but yeah, I created, like, a little fork of myself and I started adding all these features. And people were really happy, and kind of created, like, this happy cycle for myself: when people were happy, I was happy coding. And then people were happy by what I was coding. And then they were asking for more and they were getting happier, the more I responded.So, it was kind of like a serotonin cycle that made me happy and made everybody happy. So, it's like a win, win, win, win, win. And that's how I started with open-source. 
And eventually… NSIS—again, that installation system—got so big, like, my fork got so big, and Justin, the guy who works on WinAMP and NSIS, he had other things to deal with. You know, there's a whole history there with AOL. I'm sure you've heard all the funny stories.

Corey: Oh, yes. In fact, one thing that—you want to talk about weird collisions of things crossing, one of the things I picked up from your bio when you finally got tired of telling me no and agreed to be on the show was that you're also one of the team who works on camelcamelcamel.com. And I keep forgetting that's one of those things that most people have no idea exists. But it's very simple: all it does is it tracks Amazon products that you tell it to and alerts you when there's a price drop on the thing that you're looking at.

It's something that is useful. I try and use it for things of substance or hobbies because I feel really pathetic when I'm like, get excited emails about a price drop in toilet paper. But you know, it's very handy just to keep an idea for price history, where okay, am I actually being ripped off? Oh, they claim it's their big Amazon Deals day and this is 40% off. Let's see what camelcamelcamel has to say.

Oh, surprise. They just jacked the price right beforehand and now knocked 40% off. Genius. I love that. It always felt like something that was going to be blown off the radar by Amazon being displeased, but I discovered you folks in 2010 and here you are now, 13 years later, still here. I will say the website looks a lot better now.

Amir: [laugh]. That's a recent change. I actually joined camel, maybe two or three years ago. I wasn't there from the beginning. But I knew the guy who created it—again, as you were saying—from the Winamp days, right? So, we were both working in the free—well, it wasn't freenode. It was not freenode. It was a separate IRC server that, again, Justin created for himself. It was called landoleet.

Corey: Mmm.
I never encountered that one. Amir: Yeah, no, it was pretty private. The only people that cared about Winamp and NSIS ended up joining there. But it was a lot of fun. I met a lot of friends there. And yeah, I met Daniel Green there as well, and he's the guy who created it, along with some other people there who I think want to remain anonymous, so I'm not going to mention them, but they were also on the camel project. And yeah, I was kind of doing my poor version of shitposting on Twitter about AWS, kind of starting to get some traction and maybe some clients, and talking about AWS so people could approach me, and Daniel approached me out of the blue and he was like, “Do you just post about AWS on Twitter or do you also do some AWS work?” I was like, “I do some AWS work.” Corey: Yes, as do all of us. It's one of those, well crap, we're getting called out now. “Do you actually know how any of this stuff works?” Like, “Much to my everlasting shame, yes. Why are you asking?” Amir: Oh, my God, no, I cannot fix your printer. Leave me alone. Corey: Mm-hm. Amir: I don't want to fix your Lambdas. No, but I do actually want to fix your Lambdas. And so, [laugh] he approached me and he asked if I could help them move camelcamelcamel from their data center to AWS. So, that was a nice big project. So, we moved, actually, all of camelcamelcamel into AWS. And this is how I found myself not only in the Winamp credits, but also on the camelcamelcamel credits page, which has a great picture of me riding a camel. Corey: Excellent. But one of the things I've always found has been that when you take an application that has been pre-existing for a while in a data center and then move it into the cloud, you suddenly have to care about things that no one sensible pays any attention to in the land of the data center. Because it's like, “What do I care about how much data passes between my application server and the database?
Wait, what do you mean that in this configuration, that's a chargeable data transfer? Oh, dear Lord.” And things that you've never had to think about optimizing are suddenly things you're very much optimizing. Because let's face it, when it comes to putting things in racks and then running servers, you aren't auto-scaling those things, so everything tends to be running over-provisioned, for very good reasons. It's an interesting education. Anything you picked up from that process that you think would be useful for folks to bear in mind if they're staring down the barrel of the same thing? Amir: Yeah, for sure. I think… in general, right, not just here. But in general, you always want to be pragmatic, right? You don't want to take steps that are huge, right? So, the thing we did was not necessarily rewrite everything and change everything to AWS and move everything to Lambda and move everything to Docker. Basically, we did a mini lift-and-shift, but not exactly lift-and-shift, right? We didn't take it as is. We moved to RDS, we moved to ElastiCache, right, we obviously made use of security groups and session connect, and we dropped SSH usage, and we improved the security a lot and we locked everything down, all the permissions and all that kind of stuff, right? But like you said, there's stuff that you start having to pay attention to. In our case, it was less the data transfer because we have a pretty good CDN. It was more the IOPS. So—and IOPS, specifically for a database. We had a huge database with about one terabyte of data, and a lot of it is that price history that you see, right? So, all those nice little graphs that we create in—what do you call them, charts—that we create in camelcamelcamel off the price history. There's a lot of data behind that.
And what we always wanted to do was actually move that out of MySQL, which had been kind of struggling with it even before the move to AWS, but after the move to AWS, where everything was no longer over-provisioned and we couldn't just buy a few more NVMes on Amazon for 100 bucks when they were on sale—back when we had to pay Amazon— Corey: And you know, when they're on sale. That's the best part. Amir: And we know [laugh]. We get good prices on NVMe. But yeah, on Amazon—on AWS, sorry—you have to pay for io1 or something, and that adds up real quick, as you were saying. So, part of that move was also to move to something that was a little better for that data structure. And we actually moved just that data, the price history, the price points, from MySQL to DynamoDB, which was a pretty nice little project. Actually, I wrote about it on my blog. There are, kind of, lessons learned from moving one terabyte from MySQL to DynamoDB, and I think the biggest lesson was about the hidden price of storage in DynamoDB. But before that, I want to talk about what you asked, which was the way that other people should make that move, right? So again, be pragmatic, right? If you Google, “How do I move stuff from MySQL to DynamoDB,” everybody's always talking about their cool project using Lambda and how you throttle Lambda and how you get throttled by DynamoDB and how you set it up with SQS, and this and that. You don't need all that. Just fire up an EC2 instance, write some quick code to do it. I used, I think it was Go with some limiter code from Uber, and that was it. And you don't need all those Lambdas and SQS and the complication. That thing was a one-time thing anyway, so it doesn't need to be super… super-duper serverless, you know? Corey: That is almost always the way that it tends to play out. You encounter these weird little things along the way. And you see so many things that are tied to this is how architecture absolutely must be done.
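The pragmatic pattern Amir describes—skip the Lambda/SQS scaffolding, run a one-shot loop on a single box, and rate-limit the batch writes so DynamoDB doesn't throttle you—can be sketched roughly as below. The original was Go with Uber's rate limiter; this is a hedged Python illustration with a stubbed `write_batch` standing in for DynamoDB's `BatchWriteItem` (the function names and the batches-per-second figure are assumptions, not the actual migration code).

```python
import time
from itertools import islice

def batches(rows, size=25):
    """Yield lists of up to `size` rows; 25 matches DynamoDB's BatchWriteItem cap."""
    it = iter(rows)
    while chunk := list(islice(it, size)):
        yield chunk

class RateLimiter:
    """Minimal token-bucket stand-in: at most `rate` calls per second."""
    def __init__(self, rate):
        self.min_interval = 1.0 / rate
        self.last = 0.0

    def wait(self):
        now = time.monotonic()
        sleep_for = self.last + self.min_interval - now
        if sleep_for > 0:
            time.sleep(sleep_for)
        self.last = time.monotonic()

def migrate(rows, write_batch, batches_per_second=10):
    """One-shot migration loop: read rows, batch them, write at a bounded rate."""
    limiter = RateLimiter(batches_per_second)
    written = 0
    for chunk in batches(rows):
        limiter.wait()
        write_batch(chunk)  # in real life: boto3 client.batch_write_item(...)
        written += len(chunk)
    return written
```

The point is the shape, not the library: a plain loop with a limiter is the whole "architecture," and when the one-time job finishes, you terminate the instance.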
And oh, you're not a real serverless person if you don't have everything running in Lambda and the rest. There are times where yeah, spin up an EC2 box, write some relatively inefficient code in ten minutes, just do the thing, and then turn it off when you're done. Problem solved. But there's such an aversion to that. It's nice to encounter people who are pragmatists more than they are zealots. Amir: I mostly learned that lesson. And both Daniel Green and I learned that lesson from the Winamp days. Because we both had written plugins for Winamp and we'd been around that area, and you can… if you took one of those non-pragmatist people, right, and you had them review the Winamp code right now—or even before—they would have a million things to say. That code—and NSIS, too, by the way—was so optimized. It was not necessarily readable, right? But it worked and it worked amazingly. And Justin would—if you think I respond quickly, right—Justin Frankel, the guy who wrote Winamp, he would release versions of NSIS and of Winamp, like, four versions a day, right? That was before [laugh] you had CI/CD systems and GitHub and stuff. That was just CVS. You remember CVS [laugh]? Corey: Oh, I've done multiple CVS migrations. One to Git and a couple to Subversion. Amir: Oh yeah, Subversion. Yep. Done ‘em all. CVS to Subversion to Git. Yep. Yep. That was fun. Corey: And these days, everyone's using Git because it—we're beginning to have a monoculture. Amir: Yeah, yeah. I mean, but Git is nicer than Subversion, for me, at least. I've had more fun with it. Corey: Talk about damning with faint praise. Amir: Faint? Corey: Yeah, anything's better than Subversion, let's be honest here. Amir: Oh [laugh]. Corey: I mean, realistically, copying a bunch of files and directories to a .bak folder is better than Subversion. Amir: Well— Corey: At least these days.
But back then it was great. Amir: Yeah, I mean, the only thing you had, right [laugh]? Corey: [laugh]. Amir: Anyway, achieving great things with not necessarily the right tools, but just sheer power of will—that's what I took from the Winamp days. Just the entire world used Winamp. And by the way, the NSIS project that I was working on, right, I always used to joke that every computer in the world ran my code, every Windows computer in the world ran my code, just because— Corey: Yes. Amir: So many different companies use NSIS. And none of them cared that the code was not very readable, to put it mildly. Corey: So many companies founder on those shores where they lose sight of the fact that I can point to basically no companies that died because their code was terrible, yet an awful lot that died with great-looking code, because they didn't nail the business problem. Amir: Yeah. I would be lying if I said that I nailed exactly the business problem at NSIS, because most of the time I spent there was actually on shrinking the stub, right, that was appended to your installer data. So, there's a little stub—the executable, basically—that came before your data that was extracted. I spent, I want to say, years of my life [laugh] just shrinking it down by bytes—by literal bytes—just so it stays under 34, 35 kilobytes. It was kind of a—it was a challenge and something that people appreciated, but not necessarily the thing that people appreciate the most. I think the features— Corey: Well, no, I have to do the same thing to make sure something fits into a Lambda deployment package. The scale changes, the problem changes, but somehow everything sort of rhymes with history. Amir: Oh, yeah. I hope you don't have to disassemble code to do that, though, because that's uh… I mean, it was fun. It was just a lot.
Because I look at that and it feels like there's—like, the early versions, yeah, there wasn't a whole bunch of code tied to it, but geez, the iterative, “How exactly does this ridiculous Step Functions API work or whatnot,” feels like I'm looking at weeks of frustration. At least it would have been for me. Amir: Yeah, yeah. I mean, it wasn't, like, a day or two. It was definitely not—but it was not years, either. I've been working on it, I think, about a year now. Don't quote me on that. But I've put a lot of time into it. So, you know, like you said, the skeleton code is pretty simple: it's a step function, which as we said, takes a long time to get right. Step Functions are really nice, but their definition language is not very straightforward. But beyond that, right, once that part worked, it worked. Then came all the bug reports and all the little corner cases, right? We— Corey: Hell is other people's use cases. Always is. But that's honestly better than what a lot of folks wind up experiencing, where they'll put an open-source project up and no one ever knows. So, getting users is often one of the biggest barriers to a lot of this stuff. I've found countless hidden gems lurking around on GitHub with a very particular search for something that no one had ever looked at before, as best I can tell. Amir: Yeah. Corey: Open-source is a tricky thing. There needs to be marketing brought into it, there needs to be storytelling around it, and it has to actually—dare I say—solve a problem someone has. Amir: I mean, I have many open-source projects like that, that I find super useful, that I created for myself, but no one knows. I think cdk-github-runners, I'm pretty sure people know about it only because you talked about it on Screaming in the Cloud or your newsletter. And by the way, thank you for telling me that you talked about it last week at the conference, because now we know why there was a spike [laugh] all of a sudden. People Googled it. Corey: Yeah.
I put links to it as well, but it's the—yeah, I use this a lot and it's great. I gave a crappy explanation of how it works, but that's the trick I've found between conference talks and, dare I say, podcast episodes: you give people a glimpse and a hook and tell them where to go to learn more. Otherwise, you're trying to explain every nuance and every intricacy in 45 minutes. And you can't do that effectively in almost every case. All you're going to do is drive people away. Make it sound exciting, get them to see the value in it, and then let them go. Amir: You have to explain the market for it, right? That's it. Corey: Precisely. Amir: And I got to say, I somewhat disagree with your—or I have a different view when you say that, you know, open-source projects need marketing and all those things. It depends on what open-source is for you, right? I don't create open-source projects so they are successful, right? It's obviously always nicer when they're successful, but—and I do get that cycle of happiness that, like I was saying, people file bugs and I have to fix them and stuff, right? But not every open-source project needs to be a success. Sometimes it's just fun. Corey: No. When I talk about marketing, I'm talking about exactly what we're doing here. I'm not talking about taking out an AdWords campaign or something horrifying like that. It's: you build something that solves a problem for someone. The big problem that worries me about these things is how do you not lose sleep at night about the fact that you've solved someone's problem and they don't know that it exists? Because that drives me nuts. I've lost count of the number of times I've been beating my head against a wall and asked someone like, “How would you handle this?” Like, “Oh, well, what's wrong with this project?” “What do you mean?” “Well, this project seems to do exactly what you want it to do.” And no one has it all stuffed in their head.
But yeah, then it seems like open-source becomes a little more corporatized and it becomes a lead gen tool for people to wind up selling their SaaS services or managed offerings or the rest. Amir: Yeah. Corey: And that feels like the increasing corporatization of open-source that I'm not a huge fan of. Amir: Yeah. I mean, I'm not going to lie, right? Like, part of why I created this—or I don't know if it was part of it, but like, I had a dream that, you know, I'm going to get, oh, tons of GitHub sponsors, and everybody's going to use it and I can retire on an island and just make money out of this, right? Like, that's always a dream, right? But it's a dream, you know? And I think, bottom line, open-source is… just a tool, and some people use it for, like you were saying, driving sales into their SaaS, some people, like, may use it just for fun, and some people use it for other things. Or some people use it for politics, even, right? There's a lot of politics around open-source. I got to tell you a story. Back in the NSIS days, right—talking about politics—so this is not even about the politics of open-source. People made NSIS a battleground for their politics. We would have translations, right? People could upload their translations. And I, you know, or other people that worked on NSIS, right, we don't speak every language of the world, so there's only so much we can do about figuring out if it's a real translation, if it's good or not. Back in the day, Google Translate didn't exist. Like, these days, we check Google Translate, we kind of ask a few questions to make sure they make sense. But back in the day, we did the best that we could. At some point, we got a patch for the Catalan language—I'm probably mispronouncing it—from the separatist people in Spain, I think, and I didn't know anything about that. I was a young kid and… I just didn't know. And I just included it, you know? Someone submitted a patch, they worked hard, they wanted to be part of the open-source project.
Why not? Sure, I included it. And then a few weeks later, someone from Spain wanted to change Catalan into Spanish to make sure that doesn't exist, for whatever reason. And then they just started fighting with each other and started making demands of me. Like, you have to do this, you have to do that, you have to delete that, you have to change the name. And I was just so baffled by why someone would fight so much over a translation in an open-source project. Like, these days, I kind of get what they were getting at, right? Corey: But they were so bad at telling that story that it was just like, so basically, “screw you for helping,” is how it comes across. Amir: Yeah, screw you for helping. You're a pawn now. Just—you're a pawn unwittingly. Just do what I say and help me in my political cause. I ended up just telling both of them that if they couldn't agree on anything, I was just going to remove both translations. And that's what I ended up doing. I just removed both translations. And then a few months later—because we had a release every month, basically—I just added both of them back, and I've never heard from them again. So, sort of problem solved. Peace in the Middle East? I don't know. Corey: It's kind of wild just to see how often that sort of thing tends to happen. It's a—I don't necessarily understand why folks are so opposed to other people trying to help. I think they feel like there's this loss of control as things are slipping through their fingers, but it's a really unwelcoming approach. One of the things that got me deep into the open-source ecosystem surprisingly late in my development was when I started pitching in on the SaltStack project right after it was founded, where suddenly everything I threw their way was merged, and then Tom Hatch, the guy who founded the project, would immediately fix all the bugs and stuff I put in and then push something else immediately thereafter.
But it was such a welcoming thing. Instead of nitpicking me to death in the pull request, it just got merged in and then silently fixed. And I thought that was a classy way to do it. Of course, it doesn't scale and of course, it causes other problems, but I envy the simplicity of those days and just the ethos behind that. Amir: That's something I've learned in the last few years, I would say. Back in the NSIS days, I was not like that. I nitpicked. I nitpicked a lot. And I can guess why, but it just—you create a patch—in my mind, right, like, you create a patch, you fix it, right? But these days I get it; I've been on the other side as well, right? Like, I created patches for open-source projects and I've seen them just wither away and die, and then five years later, someone's like, “Oh, can you fix this line to have one instead of two, and then I'll merge it.” I'm like, “I don't care anymore. It was five years ago. I don't work there anymore. I don't need it. If you want it, do it.” So, I get it these days. And these days, if someone creates a patch—just yesterday, someone created a patch to format cdk-github-runners in VS Code. And they did it just, like, a little bit wrong. So, I just fixed it for them and I approved it and pushed it. You know, it's much better. You don't need to bug people for most of it. Corey: You didn't yell at them for having the temerity to contribute? Amir: My voice is so raw because I've been yelling at them for five days, yeah. Corey: Exactly, exactly. I really want to thank you for taking the time to chat with me about how all this stuff came to be and your own path. If people want to learn more, where's the best place for them to find you? Amir: So, I really appreciate you having me and driving all this traffic to my projects. If people want to learn more, they can always go to cloudsnorkel.com; it has all the projects. github.com/cloudsnorkel has a few more. And then my private blog is kichik.com. So, K-I-C-H-I-K dot com.
I don't post there as much as I should, but it has some interesting AWS projects from the past few years that I've done. Corey: And we will, of course, put links to all of that in the show notes. Thank you so much for taking the time. I really appreciate it. Amir: Thank you, Corey. It was really nice meeting you. Corey: Amir Szekely, owner of CloudSnorkel. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice, along with an insulting comment. Heck, put it on all of the podcast platforms with a step function state machine that you somehow can't quite figure out how the API works. Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.

Web and Mobile App Development (Language Agnostic, and Based on Real-life experience!)

If you work on serverless architectures and are building Lambdas on AWS, it's highly likely you are already using DynamoDB, and if you aren't, it's only a matter of time before you realize you really ought to :) While there's no dearth of NoSQL databases, and despite the fact that AWS has plentiful support (to varying degrees) for a number of them, DynamoDB is a unique database with a specific purpose, and it's worth understanding where it fits in and what it does well. Given that, it's certainly useful to understand it a bit (better). Purchase the course in one of 2 ways: 1. Go to https://getsnowpal.com, and purchase it on the Web 2. On your phone:     (i) If you are an iPhone user, go to http://ios.snowpal.com, and watch the course on the go.     (ii) If you are an Android user, go to http://android.snowpal.com.
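Where DynamoDB "fits" usually comes down to key design: items are addressed by a partition key plus an optional sort key, which turns time-range lookups (like the price-history queries discussed above) into a single Query instead of a scan. The sketch below shows that pattern in plain Python; the table name, key names, and item shapes are hypothetical illustrations, and the dicts are in the low-level wire format that boto3's `put_item`/`query` calls accept (boto3 itself is not imported here).

```python
def price_point_item(product_id: str, observed_at: str, price_cents: int) -> dict:
    """Build a DynamoDB item: partition key = product, sort key = timestamp.
    This layout makes 'all prices for product X in a date range' one Query."""
    return {
        "pk": {"S": f"PRODUCT#{product_id}"},
        "sk": {"S": f"PRICE#{observed_at}"},
        "price_cents": {"N": str(price_cents)},
    }

def range_query_params(table: str, product_id: str, start: str, end: str) -> dict:
    """Parameters for a key-condition Query over a time range (no table scan)."""
    return {
        "TableName": table,
        "KeyConditionExpression": "pk = :pk AND sk BETWEEN :lo AND :hi",
        "ExpressionAttributeValues": {
            ":pk": {"S": f"PRODUCT#{product_id}"},
            ":lo": {"S": f"PRICE#{start}"},
            ":hi": {"S": f"PRICE#{end}"},
        },
    }
```

With boto3, these dicts would be passed to `client.put_item(TableName=..., Item=...)` and `client.query(**params)` respectively; the design choice worth noticing is that the access pattern is baked into the key schema up front, which is exactly where DynamoDB differs from a relational store.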

Screaming in the Cloud
Making an Affordable Event Data Solution with Seif Lotfy


Oct 19, 2023 · 27:49


Seif Lotfy, Co-Founder and CTO at Axiom, joins Corey on Screaming in the Cloud to discuss how and why Axiom has taken a low-cost approach to event data. Seif describes the events that led to him helping co-found a company, and explains why the team wrote all their code from scratch. Corey and Seif discuss their views on AWS pricing, and Seif shares his views on why AWS doesn't have to compete on price. Seif also reveals some of the exciting new products and features that Axiom is currently working on. About Seif: Seif is the bubbly co-founder and CTO of Axiom, where he has helped build the next generation of logging, tracing, and metrics. His background is at Xamarin and Deutsche Telekom, and he is the kind of deep technical nerd that geeks out on white papers about emerging technology and then goes to see what he can build. Links Referenced: Axiom: https://axiom.co/ Twitter: https://twitter.com/seiflotfy Transcript: Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud. Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. This promoted guest episode is brought to us by my friends, and soon to be yours, over at Axiom. Today I'm talking with Seif Lotfy, who's the co-founder and CTO of Axiom. Seif, how are you? Seif: Hey, Corey, I am very good, thank you. It's pretty late here, but it's worth it. I'm excited to be on this interview. How are you today? Corey: I'm not dead yet. It's weird, I see you at a bunch of different conferences, and I keep forgetting that you do in fact live half a world away. Is the entire company based in Europe? And where are you folks? Where do you start and where do you stop geographically? Let's start there.
We over—everyone dives right into product. No, no, no. I want to know where in the world people sit, because apparently, that's the most important thing about a company in 2023. Seif: Unless you ask Zoom, because they're undoing whatever they did. We're from New Zealand all the way to San Francisco, and everything in between. So, we have people in Egypt and Nigeria, all around Europe, all around the US… and the UK, if you don't consider it Europe anymore. Corey: Yeah, it really depends. There's a lot of unfortunate naming that needs to get changed in the wake of that. Seif: [laugh]. Corey: But enough about geopolitics. Let's talk about industry politics. I've been a fan of Axiom for a while, and I was somewhat surprised to realize how long it had been around because I only heard about you folks a couple of years back. What is it you folks do? Because I know how I think about what you're up to, but you've also gone through some messaging iteration, and it is a near certainty that I am behind the times. Seif: Well, at this point, we just define ourselves as the best home for event data. So, Axiom is the best home for event data. We try to deal with everything that is event-based, so time-series. So, we can talk metrics, logs, traces, et cetera. And right now we're predominantly serving engineering and security. And we're trying to be—or we are—the first cloud-native time-series platform to provide streaming search, reporting, and monitoring capabilities. And we're built from the ground up, by the way. Like, we didn't actually—we're not using the Parquet [unintelligible 00:02:36] thing. We built everything completely from the ground up. Corey: When I first started talking to you folks a few years back, there were two points to me that really stood out, and I know at least one of them still holds true. The first is that at the time, you were primarily talking about log data. Just send all your logs over to Axiom. The end.
And that was a message simple enough that I could understand it, frankly. Because back when I was slinging servers around and, you know, breaking half of them, logs were effectively how we kept track of what was going on, where. These days, it feels like everything has been repainted with a very broad brush called observability, and the takeaway from most company pitches has been: you must be smarter than you are to understand what it is that we're up to. And in some cases, you scratch below the surface and realize that no, they have no idea what they're talking about either, and they're really hoping you don't call them on that. Seif: It's packaging. Corey: Yeah. It is packaging, and that's important. Seif: It's literally packaging. If you look at it, traces and logs, these are events. There's a timestamp and just data with it. It's a timestamp and data with it, right? Even metrics go all the way to that point. And a good example: now everybody's jumping on [OTel 00:03:46]. For me, OTel is nothing else but a different structure for time series, for different types of time series, and that can be used differently, right? Or at least not used differently, but you can leverage it differently. Corey: And the other thing that you did that was interesting, and is a lot, I think, more sustainable as far as [moats 00:04:04] go, rather than things that can be changed on a billboard or whatnot, is your economic position. And your pricing has changed around somewhat, but I ran a number of analyses on the costs that you were passing on to customers, and my takeaway was that it was a little bit more expensive to store log data in Axiom than it was to store it in S3, but not by much. And it just blew away the price point of everything else focused around logs, including AWS; you're paying 50 cents a gigabyte to ingest CloudWatch logs data over there.
Other companies are charging multiples of that, and Cisco recently bought Splunk for $28 billion because it was cheaper than paying their annual Splunk bill. How did you get to that price point? Is it just a matter of everyone else being greedy, or have you done something different? Seif: We looked at it from the perspective of… so there's the three L's of logging. I forgot the name of the person at Netflix who talked about that, but basically, it's low cost, low latency, large scale, right? And you will never be able to fulfill all three of them. And we decided to work on low cost and large scale. And in terms of low latency, we won't be as low as others, like ClickHouse, but we are low enough. Like, we're fast enough. The idea is to be fast enough, because in most cases I don't want to compete on milliseconds. I think if the user can see his data in two seconds, he's happy. Or three seconds, he's happy. I'm not going to be, like, one to two seconds and make the cost exponentially higher because I'm one second faster than the other. And that's, I think, the way we approached this from day one. And from day one, we also started utilizing the existence of object storage; we have our own compressions, our own encodings, et cetera, from day one, too, and we still stick to that. That's why we never converted to other existing things like Parquet. Also because we are schema-on-read, which Parquet doesn't really allow you to do. But other than that, it's… from day one, we wanted to save costs by also making it coordination-free.
So, ingest has to be coordination-free, right, because then we don't run a shitty Kafka—like, honestly, a lot of the [logs 00:06:19] companies who run a Kafka in front of it, the Kafka tax reflects in the bill that you're paying them. Corey: What I found fun about your pricing model is it gets to a point that for any reasonable workload, how much to log, or what to log, or whether to sample or keep everything, is no longer an investment decision; it's just go ahead and handle it. And that was originally what you wound up building out. Increasingly, it seems like you're not just the place to send all the logs to, which to be honest, I was excited enough about. That was replacing one of the projects I did a couple of times myself, which is building highly available, fault-tolerant rsyslog clusters in data centers. Okay, great, you've gotten that unlocked, the economics are great, I don't have to worry about that anymore. And then you started adding interesting things on top of it—analyzing things, replaying events to other players, et cetera, et cetera—and it almost feels like you're not just a storage depot, but you also can forward certain things on under a variety of different rules or guises and format them as whatever the other side is expecting them to be. So, there's a story about integrating with other observability vendors, for example, and only sending the stuff that's germane and relevant to them, since everyone loves to charge by ingest. Seif: Yeah. So, we did this one thing called endpoints, number one. Endpoints was a beginning where we said, “Let's let people send us data using whatever API they like using—say, Elasticsearch, Datadog, Honeycomb, Loki, whatever—and we will just take that data and multiplex it back to them.” So, that's how part of it started.
This allows us to see, like, how—allows customers to see how we compare to others, but then we took it a bit further and now—it's still in closed, invite-only—we have Pipelines—codenamed Pipelines—which allows you to send data to us and we will keep it as a source of truth, and then, given specific rules, we can ship it anywhere to a different destination, right? And this allows you, on the fly, to send specific filtered things out to, I don't know, a different vendor, or even to S3, or you could send it to Splunk. But at the same time, because we have all your data, you can go back in the past, if an incident happens, and replay that completely into a different product. Corey: I would say that there's a definite approach to observability, from the perspective that every company tends to visualize stuff a little bit differently. And one of the promises of OTel that I'm seeing as it grows is the idea of, oh, I can send different parts of what I'm seeing off to different providers. But the instrumentation story for OTel is still very much emerging. Logs are kind of eternal, and the only real change we've seen to logs over the past decade or so has been that instead of just being plain text where positional parameters would define what was what—if it's in this column, it's an IP address, and if it's in this column, it's a return code, and that just wound up being ridiculous—now you see them having schemas; they are structured in a variety of different ways. Which, okay, it's a little harder to wind up just cat'ing a file together and piping it to grep, but there are trade-offs that make it worth it, in my experience. This is one of those transitional products that not only is great once you get to where you're going, from my playing with it, but also meets you where you already are to get started, because everything you've got is emitting logs somewhere, whether you know it or not. Seif: Yes. And that's why we picked up on OTel, right?
Like, one of the first things: we now have an OTel endpoint natively, as a first-class citizen, because we wanted to build this experience around OTel in general. Whether we like it or not—and there are more reasons to like it—OTel is a standard that's going to stay, and it's going to move us forward. I think OTel will have the same effect, if not bigger, as [unintelligible 00:10:11] back in the day, but now it's gone beyond just metrics, to metrics, logs, and traces.

Traces is, for me, very interesting, because I think OTel is the first one to push it in a standard way. There were several attempts to make standardized [logs 00:10:25], but I think traces was something that OTel really pushed into a proper standard that we can follow. It annoys me that everybody uses different bits and pieces of it and adds something to it, but I think that's also because it's not that mature yet, so people are trying to figure out how to deliver the best experience and package it in a way that's actually interesting for a user.

Corey: What I have found is that there's a lot in this space that is simply noise. Whenever I spend a protracted time period working on basically anything and I'm still confused by the way people talk about that thing months or years later, I start to get the realization that maybe I'm not the problem here. And I don't mean this to be insulting, but one of the things I've loved about you folks is that I've always understood what you're saying. Now, you can hear that as, "Oh, you mean we talk like simpletons?" No, it means what you're talking about resonates with at least a subset of the people who have the problem you solve. That's not nothing.

Seif: Yes. We've tried really hard, because one of the things we've tried to do is actually bring observability to people who are not always busy with it or it's not part of their day to day.
So, we try to bring in Vercel developers, right, by doing a Vercel integration. And all of a sudden, now they have their logs, and they have metrics, and they have some traces. So, all of a sudden, they're doing the observability work. Or they have actual observability for their Vercel-based, [unintelligible 00:11:54]-based product.

And we try to meet people where they are. So, instead of telling people, "You should send us data"—I mean, that's what they do now—we try to find: okay, what product are you using, and how can we grab data from there and send it to us to make your life easier? You see that we did that with Vercel, we did that with Cloudflare. AWS, we have extensions—Lambda extensions, et cetera—but we're doing it for more things. For Netlify, it's a one-click integration, too, and that's what we're trying to do to actually make the experience and the journey easier.

Corey: I want to change gears a little bit, because something that we've spent a fair bit of time talking about—it's why we became friends, I would think, anyway—is that we have a shared appreciation for several things. One of which, most notable to anyone around us, is that whenever we hang out, we greet each other effusively and then immediately begin complaining about costs of cloud services. What is your take on the way that clouds charge for things? And I know it's a bit of a leading question, but it's core and foundational to how you think about Axiom, as well as how you serve customers.

Seif: They're ripping us off. I'm sorry [laugh]. They just—the amount of money they make, like, it's crazy. I would love to know what margins they have. That's a big question I've always had: what are the margins they have at AWS right now?

Corey: Across the board, it's something around 30 to 40%, last time I looked at it.

Seif: That's a lot, too.

Corey: Well, that's also across the board of everything, to be clear.
It is very clear that some services are subsidized by other services. As it should be. If you start charging me per IAM call, we're done.

Seif: And also, I mean, the machine learning stuff. Like, they won't be making that much on top of it right now, right, [else nobody 00:13:32] will be using it.

Corey: But data transfer? Yeah, there's a significant upcharge on that. But I hear you. I would moderate it a bit. I don't think that I would say it's necessarily an intentional ripoff. My problem with most cloud services they offer is not usually that they're too expensive—though there are exceptions to that—but rather that the dimensions are unpredictable in advance. So, you run something for a while and see what it costs. From where I sit, if a customer uses your service and then at the end of usage is surprised by how much it cost them, you've kind of screwed up.

Seif: Look, if they can make egress free—like, you saw how Cloudflare just made R2 egress free? Because I am still stuck with AWS, because, let's face it, for me it is still my favorite cloud, right? Cloudflare is my next favorite because of all the features they're trying to develop and the pace they're picking up, the pace at which they're trying to catch up. But again, one of the biggest things I liked is R2, and R2 egress is free. Now, that's interesting, right?

But I never saw anything like that coming back from AWS on S3, you know. I think Amazon is so comfortable because, from a product perspective, they're simple, they have the tools, et cetera. And the UI is not the flashiest one, but you know what you're doing, right? The CLI is not the flashiest one, but you know what you're doing. It is so cool that they don't really need to compete with others yet.

And I think they're still dominantly the biggest cloud out there. I think you know more than me about that, but [unintelligible 00:14:57], like, I think they are the biggest one right now in terms of data volume.
Like, how many customers are using them, and even in terms of the profiles of people using them, it's very broad. I know, like, a lot of the Microsoft Azure people who are using it are using it because they come from enterprises that have always been very Microsoft-friendly. And eventually, Microsoft also came into Europe in all these different weird ways. But I feel sometimes ripped off by AWS, because I see Cloudflare trying to reduce prices and AWS just looking on, like, "Yeah, you're not a threat to us, so we'll keep our prices as they are."

Corey: I have it on good authority from folks who know that there are reasons behind the economic structures of both of those companies, in terms of the primary direction the traffic flows and the rest. But across the board, they've done such a poor job of articulating this that, frankly, I think the confusion is on them to clear up, not us.

Seif: True. True. And the reason I picked R2 and S3 to compare there, and not Workers and Lambdas, is because I look at it as: R2 is S3-compatible from an API perspective, right? So, they're giving me something that I already use. Everything else I'm using, I'm using inside Amazon, so it's in a VPC, but just the idea. Let me dream. Let me dream that S3 egress will be free at some point.

Corey: I can dream.

Seif: That's like Christmas. It's better than Christmas.

Corey: What I'm surprised about is how reasonable your pricing is in turn. You wind up charging on the basis of ingest, which is basically the only thing that really makes sense for how your company is structured. But it's predictable in advance; the free tier is, what, 500 gigs a month of ingestion? And before people think, "Oh, that doesn't sound like a lot," I encourage you to just go back and think how much data that really is in the context of logs for any toy project.
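To put that 500 GB a month in numbers, here is a quick back-of-the-envelope sketch. The 200-bytes-per-line figure is an assumption for illustration, not a measured average:

```python
# Rough sizing sketch: how many log lines fit in a 500 GB/month ingest allowance?
# The bytes-per-line figure is an assumption for illustration only.
BYTES_PER_LINE = 200
ALLOWANCE_BYTES = 500 * 10**9  # 500 GB, decimal

lines_per_month = ALLOWANCE_BYTES // BYTES_PER_LINE
lines_per_second = lines_per_month / (30 * 24 * 3600)

print(f"{lines_per_month:,} lines per month")        # 2,500,000,000 lines per month
print(f"~{lines_per_second:,.0f} lines per second")  # ~965 lines per second
```

That's two and a half billion lines a month, or roughly a thousand lines a second around the clock, which is well beyond any toy project.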
Like, "Well, our production environment spits out way more than that." Yes, and by the word production that you just used, you probably shouldn't be using a free trial of anything as your critical-path observability tooling. Become a customer, not a user. I'm a big believer in that philosophy, personally. For all of my toy projects that are ridiculous, this is ample.

Seif: People always tend to overestimate how much log data they're going to be sending. So, there's one thing you said right: people who already have something going on, they already know how much log data they'll be sending around. But then eventually they're sending too much, and that's why we're back here and they're talking to us. Like, "We want to try your tool, but, you know, we'll be sending more than that." So, if you don't like our pricing, go find something else, because I think we are the cheapest out there right now. We're competitively the cheapest out there right now.

Corey: If there is one that is less expensive, I'm unaware of it.

Seif: [laugh].

Corey: And I've been looking, let's be clear. That's not just me saying, "Well, nothing has skittered across my desk." No, no, no, I pay attention to this space.

Seif: Hey, where's—Corey, we're friends. Loyalty.

Corey: Exactly.

Seif: If you find something, you tell me.

Corey: Oh, if I find something, I'll tell everyone.

Seif: No, no, no, you tell me first, and you tell me in a nice way so I can reduce the prices on my site [laugh].

Corey: This is how we start a price war, industry-wide, and I would love to see it.

Seif: [laugh]. But there are enough channels that we share at this point across different Slacks and messaging apps that you should be able to ping me if you find one. Also, get me the name of the CEO and the CTO while you're at it.

Corey: And where they live. Yes, yes, of course. The dire implications will be awesome.

Seif: That was you, not me.
That was your suggestion.

Corey: Exactly.

Seif: I will not—[laugh].

Corey: Before we turn into a bit of an old thud and blunder, let's talk about something else that I'm curious about here. You've been working on Axiom for something like seven years now. You come from a world of databases and events and the like. Why start a company in the model of Axiom? Even back then, when I looked around, my big problem with the entire observability space could never have been described as, "You know what we need? More companies that do exactly this." What was it that you saw that made you say, "Yeah, we're going to start a company, because that sounds easy"?

Seif: So, I'll be very clear. Like, I'm not going to sugarcoat this. We kind of got into a position where it [forced counterweighted 00:19:10]. And [laugh] by that I mean, we came from a company where we were dealing with logs. Like, we actually wrote an event crash analytics tool for a company, but then we ended up wanting to use stuff like Datadog, and we didn't have the budget for that, because Datadog was killing us.

So, we ended up hosting our own Elasticsearch. And it cost us more to maintain our Elasticsearch cluster for the logs than to maintain our own little infrastructure for the crash events, when we were getting, like, 1 billion crashes a month at that point. So eventually, we just—that was the first burn. And then you had alert fatigue, and then you had consolidating events and timestamps and whatnot.
The whole thing just seemed very messy.

So, after the company got sold, we started off by saying, "Okay, let's go work on a new self-hosted version of the [unintelligible 00:20:05] where we do metrics and logs." And then that didn't go as well as we thought it would, but because from day one we were self-hosted and wanted to keep costs low, we were working on making it stateless and work against object store. And this is kind of how we started. We realized, oh, we can host this and make it scale, and it won't cost us that much.

So, we did that. And that started gaining more attention. But the reason we started this was that we wanted to build a self-hosted version of Datadog that is not costly, and we ended up doing Software as a Service. I mean, you can still come and self-host, but you'll have to pay money for it—like, proper money for that. But we do a SaaS version of this, and instead of trying to be a self-hosted Datadog, we are now competing with Datadog.

Corey: Is the technology that you've built this on top of actually that different from everything else out there, or is this effectively what you see in a lot of places: "Oh, yeah, we're just going to manage Elasticsearch for you because that's annoying"? Do you have anything that distinguishes you from, I guess, the rest of the field?

Seif: Yeah. So, very bluntly: I think Scuba was the first thing that started standing out, and then Honeycomb came onto the scene and started building something based on the [unintelligible 00:21:23] principles of Scuba. Then one of the authors of the actual Scuba reached out to me when I told him I was trying to build something, and he gave me some ideas, and I started building that. And from day one, I said, "Okay, everything in S3. All queries have to be serverless."

So, all the queries run on functions. There are no real disks.
It's just all on S3 right now. And the biggest achievement we made to lower our cost was to get rid of Kafka—behind the scenes, we have our own coordination-free mechanism—the idea being not to have to use Kafka at all, and thus reduce the costs incredibly. In terms of technology, no, we don't use Elasticsearch.

We wrote everything from the ground up, from scratch, even the query language. Like, we have our own query language that's modeled after Kusto—KQL by Microsoft—so everything we have is built absolutely from the ground up. And no Elastic. I'm not using Elastic anymore. Elastic is a horror for me. An absolute horror.

Corey: People love the API, but no, I've never met anyone who likes managing Elasticsearch or OpenSearch, or whatever we're calling your particular flavor of it. It is a colossal pain, it is subject to significant trade-offs regardless of how you work with it, and Amazon's managed offering doesn't make it better; it makes it worse in a bunch of ways.

Seif: And the green status of Elasticsearch is a myth. You'll only see it once: the first time you start that cluster, that's when the Elasticsearch cluster is green. After that, it's just orange or red. And you know what? I'm happy when it's orange. Elasticsearch kept me up for so long. And we actually had a very interesting situation where we had Elasticsearch running on Azure, on Windows machines, and I would have server [unintelligible 00:23:10]. And I'd have to log in every day—you remember, what's it called—RP… RP something. What was it called?

Corey: RDP? Remote Desktop Protocol, or something else?

Seif: Yeah, yeah. Where you have to log in—like, you actually have a visual thing, and you have to go in and—

Corey: Yep.

Seif: And visually go in and say, "Please don't restart." Every day, I'd have to do that. Please don't restart, please don't restart.
And also, a lot of weird issues. And at that point, Azure would decide to disconnect the pod and try to bring in a new pod, and all these weird things were happening back then. So, eventually, we ended up with a [unintelligible 00:23:39] decision. I'm talking 2013, '14, so it was back in the day when Elasticsearch was very young. And so, that was just a bad start for me.

Corey: I will say that Azure is the most cost-effective cloud, because their security is so clown shoes, you can just run whatever you want in someone else's account and it's free to you. Problem solved.

Seif: Don't tell people how we save costs, okay?

Corey: [laugh]. I love that.

Seif: [laugh]. Don't tell people how we do this. Like, Corey, come on [laugh], you're exposing me here. Let me tell you one thing, though. Elasticsearch is the reason I literally used a shock collar—a shock bracelet—on myself every time it went down, which was almost every day, instead of having PagerDuty, like, ring my phone.

And, you know, I'd wake up, and my partner back then would wake up. I bought a Bluetooth collar off of Alibaba that would tase me every time I got a notification, regardless of the notification. So, some things were false alarms, but I got tased for at least two, three weeks before I gave up. Every night I'd wake up, like, to a full discharge.

Corey: I would never hook myself up to a shocker tied to outages, even if I owned a company. There are pleasant ways to wake up, unpleasant ways to wake up, and even worse. So, you're getting shocked so someone else can wind up effectively driving the future of the business. You're, more or less, the monkey that gets shocked awake to go ahead and fix the thing that just broke.

Seif: [laugh]. Well, the fix to that was moving from Azure to AWS without telling anybody. That got us in a lot of trouble.
Again, that wasn't my company.

Corey: They didn't notice that you did this, or it caused a lot of trouble because suddenly nothing worked where they thought it would work?

Seif: They—no, no, everything worked fine on AWS. That's how my love story began. But they didn't notice for, like, six months.

Corey: That's kind of amazing.

Seif: [laugh]. That was specta—we rewrote everything from C# to Node.js and moved everything away from Elasticsearch, started using Redshift, Redis, and—you name it. We went AWS all the way, and they didn't even notice. We took the budget from another department to start filling that in.

But we cut the costs from $100,000 down to, like, 40, and then eventually down to $30,000 a month.

Corey: More than a little wild.

Seif: Oh, God, yeah. Good times, good times. Next time, just ask me to tell you the full story about this. I can't go into details on this podcast. I'll get in a lot—I think I'll get in trouble. I didn't sign anything, though.

Corey: Those are the best stories. But no, I hear you. I absolutely hear you. Seif, I really want to thank you for taking the time to speak with me. If people want to learn more, where should they go?

Seif: So, axiom.co—not dot com. Dot C-O. That's where they learn more about Axiom. And other than that, I think I have a Twitter somewhere. And if you know how to write my name—it's just one word—you'll find me on Twitter.

Corey: We will put that all in the [show notes 00:26:33]. Thank you so much for taking the time to speak with me. I really appreciate it.

Seif: Dude, that was awesome. Thank you, man.

Corey: Seif Lotfy, co-founder and CTO of Axiom, who has brought this promoted guest episode our way. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud.
If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice, along with an angry comment that one of these days, I will get around to aggregating in some horrifying custom homebrew logging system, probably built on top of rsyslog.

Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.

Le coureur lambda
#23 The lambdas' first apéro! The Diagonale des Fous.

Le coureur lambda

Play Episode Listen Later Oct 14, 2023 78:44


A new format on Le coureur lambda! We get together over drinks to debate a topic. Today we're talking about the Diagonale des Fous with Fred, a finisher in 2014; Fabien, who will be at the start this year; and Damien, who dreams of running this race one day. Thanks for your support and your messages, and share the podcast on your networks!

Screaming in the Cloud
Making a Difference Through Technology in the Public Sector with Dmitry Kagansky

Screaming in the Cloud

Play Episode Listen Later Oct 5, 2023 33:04


Dmitry Kagansky, State CTO and Deputy Executive Director for the Georgia Technology Authority, joins Corey on Screaming in the Cloud to discuss how he became the CTO for his home state and the nuances of working in the public sector. Dmitry describes his focus on security and reliability, and why they are both equally important when working with state government agencies. Corey and Dmitry discuss AWS's infamous GovCloud, and Dmitry explains why he's employing a multi-cloud strategy but why it doesn't work for all government agencies. Dmitry also talks about how he's focusing on hiring and training for skills, and the collaborative approach he's taking to working with various state agencies.

About Dmitry

Mr. Kagansky joined GTA in 2021 from Amazon Web Services, where he worked for over four years helping state agencies across the country in their cloud implementations and migrations.

Prior to his time with AWS, he served as Executive Vice President of Development for Star2Star Communications, a cloud-based unified communications company. Previously, Mr. Kagansky held many technical and leadership roles at different software vendors. Most notably, he was Federal Chief Technology Officer for Quest Software, spending several years in Europe working with commercial and government customers.

Mr. Kagansky holds a BBA in finance from Hofstra University and an MBA in management of information systems and operations management from the University of Georgia.

Links Referenced:
Twitter: https://twitter.com/dimikagi
LinkedIn: https://www.linkedin.com/in/dimikagi/
GTA Website: https://gta.ga.gov

Transcript

Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize.
This is Screaming in the Cloud.

Corey: In the cloud, ideas turn into innovation at virtually limitless speed and scale. To secure innovation in the cloud, you need Runtime Insights to prioritize critical risks and stay ahead of unknown threats. What's Runtime Insights, you ask? Visit sysdig.com/screaming to learn more. That's S-Y-S-D-I-G.com/screaming.

My thanks as well to Sysdig for sponsoring this ridiculous podcast.

Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. Technical debt is one of those fun things that everyone gets to deal with on some level. Today's guest apparently gets to deal with 235 years of technical debt. Dmitry Kagansky is the CTO of the state of Georgia. Dmitry, thank you for joining me.

Dmitry: Corey, thank you very much for having me.

Corey: So, I want to just begin here, because this has caused confusion in my life; I can only imagine how much it's caused for you folks. We're talking Georgia the US state, not Georgia the sovereign country?

Dmitry: Yep. Exactly.

Corey: Excellent. It's always good to triple-check those things, because otherwise, I feel like the shipping costs are going to skyrocket one way or the other. So, you have been doing a lot of very interesting things in the course of your career. You're former AWS, for example; you come from commercial life working in industry, and now it's, yeah, I'm going to go work in state government. How did this happen?

Dmitry: Yeah, I've actually been working with governments for quite a long time, both here and abroad. So, way back when, I was federal CTO for software companies, I've done other work. And then even with AWS, I was working with state and local governments for about four, four-and-a-half years. But I came to Georgia when the opportunity presented itself, really to try and make a difference in my own home state.
You mentioned technical debt at the beginning, and it's one of the things I'm hoping to help the state pay down and get rid of some of.

Corey: It's fun, because governments obviously are not thought of historically as being the early adopters, the bleeding edge, when it comes to technical innovation. And from where I sit, for good reason. You don't want code that got written late last night and shoved into production to control things like municipal infrastructure, for example. That stuff matters. Unlike a lot of other walks of life, you don't usually get to choose your government, and say, "Oh, I don't like this one, so I'm going to go for option B."

I mean, you get to at the ballot box, but that takes significant amounts of time. So, people want, above all else—I suspect—their state services, from an IT perspective, to be stable, first and foremost. Does that align with how you think about these things? I mean, security obviously is a factor in that as well, but how do you see, I guess, the primary mandate of what you do?

Dmitry: Yeah. I mean, security is obviously up there, but just as important is reliability, right? People take time off of work to get driver's licenses, they go to different government agencies to get work done in the middle of their workday, and we've got to have systems available to them. We can't have them show up and say, "Yeah, come back in an hour because some system is rebooting." And that's one of the things we're trying to fix and trying to have fewer of, right?

There's always going to be things that happen, but we're trying to really cut down the impact. One of the biggest things we're doing is obviously a move to the cloud, but also segmenting out all of our agency applications so that agencies manage them separately.
Today, my organization, the Georgia Technology Authority—you'll hear me say GTA—runs what we call NADC, the North Atlanta Data Center: a pretty large-scale data center with lots of different agencies' app servers all sitting there running. And a lot of times, an impact to one could have an impact to many. And so, with the cloud, we get some partitioning and some segmentation, where even if there is an outage—a term you'll often hear used is that we can cut down on the blast radius—we can limit the impact so that we affect the fewest number of constituents.

Corey: So, I have to ask this question, and I understand it's loaded and people are going to have opinions with a capital O on it, but since you work for the state of Georgia, are you using GovCloud over in AWS-land?

Dmitry: So… [sigh] we do have some footprint in GovCloud, but I actually spent time, even before coming to GTA, trying to talk agencies out of using it. I think there's a big misconception, right? People say, "I'm government. They called it GovCloud. Surely I need to be there."

But back when I was with AWS, I would point-blank tell people that, yes, I know it's called GovCloud, but it's really just a poorly named region. There are some federal requirements that it meets—it was built around ITAR, the International Traffic in Arms Regulations—but states aren't in that business, right? They are dealing with HIPAA data, with various criminal justice data, and other things, but all of those things can run just fine on the commercial side. And truthfully, it's cheaper and easier to run on the commercial side. And that's one of the concerns I have: if the commercial regions meet those requirements, is there a reason to go into GovCloud just because you get some extra certifications? So, I still spend time trying to talk agencies out of going to GovCloud.
Ultimately, the agencies, with their apps, make the choice of where they go, but we have been pretty good about reducing the footprint in GovCloud unless it's absolutely necessary.

Corey: Has this always been the case? Because my distant recollection around all of this has been that originally, when GovCloud first came out, it was a lot harder to run a whole bunch of workloads in commercial regions. And it feels like the commercial regions have really stepped up as far as what compliance boxes they check. So, is this one of those stories where, five or ten years ago, when GovCloud first came out, there were a bunch of reasons to use it that no longer apply?

Dmitry: I actually can't go back past, I'll say, seven or eight years, but certainly within the last eight years, there's not been a reason for state and local governments to use it. At the federal level, that's a different discussion, but for most governments that I worked with and work with now, the commercial regions have been just fine. They've met the compliance requirements, controls, and everything that's in place without having to go to the GovCloud region.

Corey: Something I noticed that was strange to me about the whole GovCloud approach, when I was at the most recent public sector summit that AWS threw, was that whenever I was talking to folks from AWS about GovCloud—adopting it, launching new workloads, and the rest—unlike in almost any other scenario, their first response, almost a knee-jerk reflex, was to pass that work off to one of their partners. Now, on the commercial side, AWS will do that when it makes sense, and each one becomes a bit of a judgment call, but it just seemed like every time someone's doing something with GovCloud, it's, "Talk to Company X or Company Y." And it wasn't just one or two companies; there were a bunch of them. Why is that?

Dmitry: I think a lot of that is because of the limitations within GovCloud, right?
So, when you look at anything that AWS rolls out, it almost always rolls out into either us-east-1 or us-west-2, right—one of those two regions—and then it goes out worldwide. And then it comes out in GovCloud months, sometimes even years, later. And in fact, sometimes there are features that never show up in GovCloud. So, there's not parity there, and I think what happens is, it's these partners that know what limitations GovCloud has, and what things are missing in GovCloud that they still have to work around.

Like, I remember when I started with AWS back in 2016, there had been a new console, you know, the new skin that everyone's now familiar with. But that old console, if you remember it, was in GovCloud for years afterwards. I mean, it took at least two more years to get GovCloud to even look like the current commercial console that you see. So, it's things like that where I think AWS themselves want to keep moving forward, and having to do anything with that legacy platform that doesn't have all the bells and whistles is why they say, "Go get a partner [unintelligible 00:08:06] those things that aren't there yet."

Corey: That makes a fair bit of sense. What I was always wondering is how much of this was tied to technical challenges working within those regions, and building solutions that don't depend upon things—"Oh, wait, that one's not available in GovCloud"—versus a lack of ability to navigate the acquisition process for a lot of governments natively in the same way that a lot of their customers can.

Dmitry: Yeah, I don't think that's the case, because even to get a GovCloud account, you have to start off with a commercial account, right? So, you actually have to go through the same purchasing steps and then essentially click an extra button or two.

Corey: Oh, I've done that myself already. I have a shitposting account and—not kidding—a Ministry of Shitposting GovCloud account. But that's also me just kicking the tires on it.
As I went through the process, it really felt like everything was built around a bunch of unstated assumptions—because of course you've worked within GovCloud before and you know where these things are. And I kept tripping into a variety of different aspects of that. I'm wondering how much of that is just due to the fact that partners are almost always the ones guiding customers through it.

Dmitry: Yeah. It is almost always that. There are very few people, even in the AWS world—if you look at all the employees they have there—it's a small subset that work with that environment, and probably an even smaller subset of those that understand what it's really needed for. So, this is where, if there's not good understanding, you're better off handing it off to a partner. But I don't think it is the purchasing side of things. It really is the regulatory things, and just having someone else sign off on a piece of paper, above and beyond just AWS themselves.

Corey: I am curious, since it seems that people love to talk about multi-cloud in a variety of different ways, but I find there's a reality that, ehh, basically, on a long enough timeline, everyone uses everything, versus the idea of, "Oh, we're going to build everything so we can seamlessly flow from one provider to another." Are you folks all in on AWS? Are you using a bunch of different cloud providers for different workloads? How are you approaching a cloud strategy?

Dmitry: So, when you say 'you guys,' I'll say—as AWS will always say—"It depends." So, GTA is multi-cloud. We support AWS, we support OCI, we support Azure, and we are working towards getting Google in as well, GCP. However, on the agency side, I am encouraging agencies to pick a cloud.
And part of that is because you do have limited staff, and they are all different, right?

They'll do similar things, but if it's done in a different way and you don't have people that know those little tips and tricks, kind of how to navigate certain cloud vendors, it just makes things more difficult. So, I always look at it as kind of the car analogy, right? Most people are not multi-car, right? You go buy a car—Toyota, Ford, whatever it is—and you're committed to that thing for the next 4, 5, 10 years, however long you own it, right? You may not like where the cupholder is or you need to get used to something, you know, being somewhere else, but you do commit to it.

And I think it's the same thing with cloud that, you know, do you have to be in one cloud for the rest of your life? No, but know that you're not going to hop from cloud to cloud. No one really does. No one says, “Every six months, I'm going to go move my application from one cloud to another.” It's a pretty big lift and no one really needs to do that. Just find the one that's most comfortable for you.

Corey: I assume that you have certain preferences as far as different cloud providers go. But I've found even in corporate life that, “Well, I like this company better than the other,” is generally not the best basis for making sweeping decisions around this. What frameworks do you give various departments to consider where a given workload should live? Like, how do you advise them to think about this?

Dmitry: You know, it's funny, we actually had a call with an agency recently that said, “You know, we don't know cloud. What do you guys think we should do?” And it was for a very small, I don't want to call it workload; it was really for some DNS work that they wanted to do. And it really came down to, for that size and scale, right, we're looking at a few dollars a month, maybe, they picked it based on the console, right?
They liked one console over another.

Not going to get into which cloud they picked, but we wound up giving them a demo of, here's what this looks like in these various cloud providers. And they picked that just because they liked the buttons and the layout of one console over another. Now, having said that, for obviously larger workloads, things that are more important, there is criteria. And in many cases, it's also the vendors. Probably about 60 to 70% of the applications we run are all vendor-provided in some way, and the vendors will often dictate platforms that they'll support over others, right?

So, that supportability is important to us. Just like you were saying, no one wants code rolled out overnight to surprise all the constituents one day. We take our vendor relations pretty seriously and we take our cue from them. If we're buying software from someone and they say, “Look, this is better in AWS,” or, “This is better in OCI,” for whatever reasons they have, we'll go in that direction more often than not.

Corey: I made a crack at the beginning of the episode where the state was founded 235 years ago, as of this recording. So, how accurate is that? I have to imagine that back in those days, they didn't really have a whole lot of computers, except probably something from IBM. How much technical debt are you folks actually wrestling with?

Dmitry: It's pretty heavy. One of the biggest things we have is, we ourselves, in our data center, still have a mainframe. That mainframe is used for a lot of important work. Most notably, a lot of healthcare benefits are really distributed through that system. So, you're talking about federal partnerships, you're talking about, you know, insurance companies, health care providers, all somehow having—

Corey: You're talking about things that absolutely, positively cannot break.

Dmitry: Yep, exactly. We can't have outages, we can't have blips, and they've got to be accurate.
So, even that sort of migration, right, that's not something that we can do overnight. It's something we've been working on for well over a year, and right now we're targeting probably roughly another year or so to get that fully migrated out. And even there, we're doing what would be considered a traditional lift-and-shift. We're going to mainframe emulation, we're not going cloud-native, we're not going to do a whole bunch of refactoring out of the gate. It's just picking up what's working and running and just moving it to a new venue.

Corey: Did they finally build an AWS/400 that you can run that on? I didn't realize they had a mainframe emulation offering these days.

Dmitry: They do. There's actually several providers that do it. And there's other agencies in the state that have made this sort of move as well, so we're also not even looking to be innovators in that respect, right? We're not going to be first movers to try that out. We'll have another agency make that move first, and now we're doing this with our Department of Human Services.

But yeah, a lot of technical debt around that platform. When you look at just the cost of operating these platforms, that mainframe costs the state roughly $15 million a year. We think in the cloud, it's going to wind up costing us somewhere between 3 to 4 million. Even if it's 5 million, that's still considerable savings over what we're paying for today. So, it's worth making that move, but it's still very deliberate, very slow, with a lot of testing along the way. But yeah, you're talking about a workload that has been in the state, I want to say, for over 20, 25 years.

Corey: So, what's the reason to move it? Because not for nothing, but there's the old saw, “Don't fix it if it ain't broke.” Well, what's broke about it?

Dmitry: Well, there's a couple of things. First off, the real estate that it takes up is an issue. It is a large machine sitting on the floor of a data center that we've got to consolidate.
We actually have some real estate constraints and we've got to cut down our footprint by next year, contractually, right? We've agreed we're going to move into a smaller space.

The other part is the technical talent. While yes, it's not broke, things are working on it, there are fewer and fewer people that can manage it. What we've found was doing a complete refactor while doing a move anywhere is really too risky, right? Rewriting everything with a bunch of Lambdas is kind of scary, as well as moving it into another venue. So, there are mainframe emulators out there that will run in the cloud. We've gotten one and we're making this move now. So, we're going to do that lift-and-shift in and then look to refactor it piecemeal.

Corey: Specifics are always going to determine, but as a general point, I felt like I am the only voice in the room sometimes advocating in favor of lift-and-shift. Because people say, “Oh, it's terrible for reasons X, Y, and Z.” It's, “Yes, all of your options are terrible, and for the common case, this is the one that I have the sneaking suspicion, based upon my lived experience, is going to be the least bad of all of those various options.” Was there a thought given to doing a refactor in flight?

Dmitry: So… from the time I got here, no. But I can tell you, just having worked with the state even before coming in as CTO, there were constant conversations about a refactor. And the problem is, no one actually has an appetite for it. Everyone talks about it, but then when you say, “Look, there's a risk to doing this,”—right, governments are about minimizing risk—when you say, “Look, there's a risk to rewriting and moving code at the same time and it's going to take years longer,” right, that stops the refactoring every time. I've seen estimates that it would be as small as three years, as large as seven or eight years, depending on who was doing the estimate.
Whereas the lift-and-shift, we're hoping we can get it done in two years, but even if it's two-and-a-half, it's still less than any of the estimates we've seen for a refactor, and less risky. So, we're going with that model and we'll tinker and optimize later. But we just need to get out of that mainframe so that we can have more modern technology and more modern support.

Corey: It seems like the right approach. I'm sorry, I didn't mean to frame that quite as insultingly as it might have come across. Like, “Did anyone consider other options, just out of curi—” of course. Whenever you're making big changes, we're not just going to throw a dart at a whiteboard. It's not what appears to be Twitter's current product strategy we're talking about here. This is stuff that's very much measure twice, cut once.

Dmitry: Yeah. Very much so. And you see that with just about everything we do here. I know, when the state, what now, three years ago, moved their tax system over to AWS, not only did they do two or three trial runs of just the data migration, we actually wound up doing six, right? You're talking about adding two months of testing just to make sure every time we did the data move, it was done correctly and all the data got moved over. I mean, government is very, very much about measure three, four times, cut once.

Corey: Which is kind of the way you'd want it. One thing that I found curious whenever I've been talking to folks in the public sector space around things that they care about—and in years past, I periodically tried to, “Oh, should we look at doing some cost consulting for folks in this market?” And by and large, there have been a couple of exceptions, but—generally, in our experience with sovereign governments, more so than municipal or state ones—saving money is not usually one of the top three things that governments care about when it comes to their AWS estate. Is cost something that's on your radar? And how do you conceptualize around this?
And I should also disclose, this is not in any way, shape, or form intended to be a sales pitch.

Dmitry: Yeah, no, cost actually, for GTA, is a concern. But I think it's more around the way we're structured. I have worked with other governments where they say, “Look, we've already gotten an allotment of money. It costs whatever it costs and we're good with it.”

With the way my organization is set up, though, we're not appropriated funds, meaning we're not given any tax dollars. We actually have to provide services to the agencies and they pay us for it. And so, my salary and everyone else's here, all the work that we do, is basically paid for by agencies, and they do have a choice to leave. They could go find other providers. It doesn't have to be GTA always.

So, cost is a consideration. But we're also finding that we can get those cost savings pretty easily with this move to the cloud because of the number of tools that we now have available. We have—that data center I talked about, right? That data center is obviously locked down, secured, very limited access, you can't walk in, but that also prevents agencies from doing a lot of day-to-day work that now, in the cloud, they can do on their own. And so, the savings are coming just from this move: not having as many locks away from the agency, but having more locks from the outside world as well, right? There's definitely scaling up in the number of tools that they have available to them to work around their applications that they didn't have before.

Corey: It's, on some level, a capability story, I think, when it comes to cloud. But something I have heard from a number of folks is that even more so than in enterprises, budgets tend to be much more fixed things in the context of cloud in government. Often in enterprises, what you'll see is sprawl: someone leaves something running and oops, the bill wound up going up higher than we projected for this given period of time.
When we start getting into the realm of government, that stops being a you-broke-budgeting-policy problem and starts to resemble things that are called crimes. How do you wind up providing governance as a government around cloud usage to avoid, you know, someone going to prison over a Managed NAT Gateway?

Dmitry: Yeah. So, we do have some pretty stringent monitoring. I know, even before the show, we talked about the fact that we do have a separate security group. So, on that side of it, they are keeping an eye on what people are doing in the cloud. So, even though agencies now have more access to more tooling, they can do more, right, GTA hasn't stepped back from it, and so we're able to centrally manage things.

We've put in a lot of controls. In fact, we're using Control Tower. We've got a lot of guardrails put in, even basic things like you can't run things outside of the US, right? We don't want you running things in the India region or anywhere in South America. Like, that's not even allowed, so we're able to block that off.

And then we've got some pretty tight financial controls where we're watching the spend on a regular basis, agency by agency. Not enforcing any of it, obviously, agencies know what they're doing and it's their apps, but we do warn them of, “Hey, we're seeing this trend or that trend.” We've been at this now for about a year-and-a-half, and so agencies are starting to see that we provide more oversight and a lot less pressure, but at the same time, there's definitely a lot more collaboration and assistance with one another.

Corey: It really feels like the entire procurement model has shifted massively. As opposed to going out for a bunch of bids and doing all these other things, it's consumption-based. And that has been—I know for enterprises—a difficult pill for a lot of their procurement teams to wind up wrapping their heads around.
I can only imagine what that must be like for things that are enshrined in law.

Dmitry: Yeah, there's definitely been a shift, although it's not as big as you would think on that side, because you do have cloud but then you also have managed services around cloud, right? So, you look at AWS, OCI, Azure: no one's out there putting a credit card down to open an environment anymore, you know, a tenant or an account. It is done through procurement rules. Like, we don't actually buy AWS directly from AWS; we go through a reseller, right, so there's some controls there as well from the procurement side. So, there's still a lot of oversight.

But it is scary to some of our procurement people. Like, AWS Marketplace is a very, very scary place for them, right? The fact that you can go and—you can hire people at Marketplace, you could buy things with a single button-click. So, we've gone out of our way, in my agency, to go through and lock that down to make sure that before anyone clicks one of those purchase buttons, that we at least know about it, they've made the request, and we have to go in and unlock that button for that purchase. So, we've got to put in more controls in some cases. But in other cases, it has made things easier.
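The US-only rule Dmitry describes is the kind of guardrail typically expressed as an AWS Organizations service control policy behind Control Tower. As a rough sketch of what such a policy might look like (this is an assumption about the shape of the policy, not Georgia's actual configuration):

```python
import json

# Illustrative service control policy in the spirit of the region guardrail
# described above: deny any action requested outside US regions. Real
# policies usually also exempt global services (IAM, Organizations, Support),
# which is why NotAction is used here rather than Action.
us_only_scp = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyNonUSRegions",
            "Effect": "Deny",
            "NotAction": ["iam:*", "organizations:*", "support:*"],
            "Resource": "*",
            "Condition": {
                "StringNotLike": {"aws:RequestedRegion": "us-*"}
            },
        }
    ],
}

print(json.dumps(us_only_scp, indent=2))
```

Attached at the organization root, a policy like this blocks workloads in, say, ap-south-1 or sa-east-1 while leaving us-east-1 and us-west-2 untouched.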
There are some agencies that are raring to go, they want to make some changes, do a lot of good, so to speak, by upgrading their infrastructure. There are others that will sit and say, “Hey, I've been doing this for 20, 30 years. It's been fine.” That whole, “If it ain't broke, don't fix it,” mindset.

So, for them, there's definitely been, you know, a lot more friction to get them going in that direction. But what I'm also finding is the people with their hands on the keyboards, right, the ones that are doing the work, are excited by this. This is something new for them. In addition to actually going to cloud, the other thing we've been doing is providing a lot of different training options. And so, that's something that's perked people up and definitely made them much more excited to come into work.

I know, down at the, you know, the operator level, the administrators, the managers, all of those folks are pretty pleased with the moves we're making. You do get some of the folks in upper management in the agencies that do say, “Look, this is a risk.” We're saying, “Look, it's a risk not to do this.” Right? You've also got to think about staffing and what people are willing to work on. Things like the mainframe, you know, you're not going to be able to hire those people much longer. They're going to be few and far between. So, you have to retool. I do tell people that, you know, if you don't like change, IT is probably not the industry to be in, even in government. You probably want to go somewhere else, then.

Corey: That is sort of the next topic I want to get into, where companies across the board are finding it challenging to locate and source talent to work in their environments. How has the process of recruiting cloud talent gone for you?

Dmitry: It's difficult. Not going to sugarcoat that. It's, it's—

Corey: [laugh]. I'm not sure anyone would say otherwise, no matter where you are.
You can pay absolutely insane, top-of-market money and still have that exact same response. No one says, “Oh, it's super easy.” Everyone finds it hard. But please continue [laugh].

Dmitry: Yeah, but it's also not a problem that we can even afford to throw money at, right? So, that's not something that we'd ever do. But what I have found is that there's actually a lot of people, really, that I'll say are tech-adjacent, that are interested in making that move. And so, for us, having a mentoring and training program that brings people in and gets them comfortable with it is probably more important than finding the talent exactly as it is, right? If you look at the job descriptions that we put out there, we do want things like cloud certs and certain experience, but we'll drop off things like certain college requirements. Say, “Look, do you really need a college degree if you know what you're doing in the cloud, or if you know what you're doing with a database and you can prove that?”

So, it's re-evaluating who we're bringing in. And in some cases, can we also train someone, right, bring someone in at a lower rate, but willing to learn, and then give them the experience, knowing that they may not be here for 15, 20 years, and that's okay. But we've got to retool that model to say, we expect some attrition, but they walk away with some valuable skills, and while they're here, they learn those skills, right? So, that's the payoff for them.

Corey: I think that there's a lot of folks exploring that, where there are people who have the interest and the aptitude that are looking to transition in. So much of the discussion points around filling the talent pipeline have come from a place of, oh, we're just going to talk to all the schools and make sure that they're teaching people the right way. And well, colleges aren't really aimed at being vocational institutions most of the time.
And maybe you want people who can bring an understanding of various aspects of business, of workplace dynamics, et cetera, and even the organization itself; you can transition them in. I've always been a big fan of helping people lateral from one part of an organization to another. It's nice to see that there are actual formal processes around that for you folks.

Dmitry: Yeah, we're trying to do that, and we're also working across agencies, right, where we might pull someone in from another agency that's got that aptitude and willingness, especially if it's someone that already has government experience, right; they know how to work within the system that we have here, and it certainly makes things easier. It's less of a learning curve for them on that side. We think, you know, in some cases, the technical skills, we can teach you those, but just operating in this environment is just as important, to understand the soft side of it.

Corey: No, I hear you. One thing that I've picked up from doing this show and talking to people in the different places that you all tend to come from, has been that everyone's working with really hard problems and there's a whole universe of various constraints that everyone's wrestling with. The biggest lie in our industry across the board, I'm coming to realize, is any whiteboard architecture diagram. Full stop. The real world is messy.

Nothing is ever quite like it looks in that sterile environment where you're just designing and throwing things up there. The world is built on constraints and trade-offs. I'm glad to see that you're able to bring people into your organization. I think it gives an awful lot of folks hope when they despair about seeing what some of the job prospects are for folks in the tech industry, depending on what direction they want to go in.

Dmitry: Yeah. I mean, I think we've got the same challenge as everyone else does, right? It is messy.
The one thing that I think is also interesting is that we also have to have transparency, but to some degree—and I'll shift; I know this wasn't meant to kind of go off into the security side of things, but I think one of the things that's most interesting is trying to balance a security mindset with that transparency, right?

You have private corporations, other organizations, that do whatever they do; they're not going to talk about it, you don't need to know about it. In our case, I think we've got even more of a challenge because on the one hand, we do want to lock things down, make sure they're secure, and we protect not just the data, but how we do things, right, some of our mechanisms and methods. But at the same time, we've got a responsibility to be transparent to our constituents. They've got to be able to see what we're doing and what we're spending money on. And so, to me, that's also one of the biggest challenges we have: how do we make sure we balance that out, that we can provide for people and even our vendors—right, a lot of times our vendors [will 00:30:40] say, “How are you doing something? We want to know so that we can help you better in some areas.” And it's really become a real challenge for us.

Corey: I really want to thank you for taking the time to speak with me about what you're doing. If people want to learn more, where's the best place for them to find you?

Dmitry: I guess now it's no longer called Twitter, but really just about anywhere. Twitter, Instagram—I'm not a big Instagram user—LinkedIn, Dmitry Kagansky; there's not a whole lot of us out there, pretty easy to do a search. But also you'll see there's my contact info, I believe, on the GTA website, just gta.ga.gov.

Corey: Excellent. We will, of course, put links to that in the [show notes 00:31:20]. Thank you so much for being so generous with your time. I really appreciate it.

Dmitry: Thank you, Corey. I really appreciate it as well.

Corey: Dmitry Kagansky, CTO for the state of Georgia.
I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice along with an angry, insulting comment telling me that I've got it all wrong and mainframes will, in fact, rise again.

Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.

Hindsight is Horrifying
Are Nerds Too Sexy for 2023? It's "Revenge of the Nerds" on Hindsight is Horrifying!


Sep 26, 2023 · 95:13


It's every nerdy boy's dream to matriculate into college and hook up with the head cheerleader, no matter the cost. This 1980s “classic” turns that dream into a wildly illegal reality! Academically oriented freshmen Lewis & Gilbert (we THINK those are their names) discover just how shallow the dating pool is when they face off with the vicious Greek Council and feral cheerleaders of Adams College. Sexual assault abounds as the geeky tri-Lambdas seek retribution against the bullying and harassment of the Alpha-Betas and their bouncy girlfriends. Can our antagonist nerds win tug-of-war? Can they belch loudly enough? Did you know that John Goodman is in this movie? Does this movie hold up? Listen and find out as Darth, Adam, and Jason discuss Revenge of the Nerds! Hosted on Acast. See acast.com/privacy for more information.

Financial Modeler's Corner
Are LAMBDAs & Dynamic Arrays the future of Financial Modeling? A lively discussion with 4 modeling experts.


Sep 14, 2023 · 53:38


Welcome to Financial Modeler's Corner (FMC), where we discuss the art and science of financial modeling with your host Paul Barnhurst. Financial Modeler's Corner is sponsored by the Financial Modeling Institute (FMI), the most respected accreditation in financial modeling globally. In this episode, your host, Paul Barnhurst, is joined by 4 fabulous guests: Jeff Robson, Danielle Stein Fairhurst, Craig Hatmaker & Ian Schnoor. They are all highly accomplished and respected modelers in the financial modeling profession, and in this episode they share thoughts on Dynamic Arrays and LAMBDAs. Jeff is a Financial Modeler, Business Analyst, International Trainer & Presenter. Danielle is a Financial Modeler, Author, and Corporate Trainer based in Australia. Craig is a retired Financial Modeler and a 5G (fifth generation) modeling enthusiast, and Ian is the Executive Director at the Financial Modeling Institute (FMI), which is also the sponsor of this podcast. Listen to this episode as the guests talk about:

- Why it is important to keep current on the latest developments in modeling
- LAMBDAs: what they are and how they could revolutionize financial modeling
- Dynamic Arrays: what they are, how they work, and what it is like to build a model using Dynamic Arrays
- How Dynamic Arrays will make auditing models easier
- The guests' favorite Dynamic Array formulas

Sign up for the Advanced Financial Modeler Accreditation or FMI Fundamentals today and receive 15% off by using the special show code ‘Podcast'. Visit www.fminstitute.com/podcast and use code Podcast to save 15% when you register. Go to https://earmarkcpe.com, download the app, take the quiz, and you can receive CPE credit with this episode.
Follow Jeff Robson:
LinkedIn: https://www.linkedin.com/in/jeffrobson/
Website: www.accessanalytic.com.au
YouTube: https://www.youtube.com/@AccessAnalytic

Follow Danielle Stein Fairhurst:
LinkedIn: https://www.linkedin.com/in/daniellesteinfairhurst/
Website: https://plumsolutions.com.au/
Upcoming events: Danielle is speaking at the Global Excel Summit 2024, and has an upcoming webinar on scalable modelling - https://www.linkedin.com/events/scalablemodelling-usingmsfabric7106464501708840960/

Follow Craig Hatmaker:
LinkedIn: https://www.linkedin.com/in/craig-hatmaker-4449879/
Website: https://sites.google.com/site/beyondexcel/home
YouTube: https://www.youtube.com/@CraigHatmakerBXL
Building with Fast + 5G - https://www.youtube.com/watch?v=8Zl3yURsvdE&t=1s
Demo using 5G components - Intro 5G - Watch an Excel Model do something incredible using 5G methods - YouTube
Gist site to download Craig Hatmaker's LAMBDAs for 5G modeling - https://gist.github.com/CHatmaker

Follow Ian Schnoor:
LinkedIn: https://www.linkedin.com/in/ianschnoor/
Website: https://fminstitute.com/

Follow Paul:
Website - https://www.thefpandaguy.com/
LinkedIn - https://www.linkedin.com/in/thefpandaguy/
Instagram - https://www.instagram.com/thefpandaguy/
TikTok - https://www.tiktok.com/@thefpandaguy
Twitter - https://twitter.com/TheFPandAGuy
YouTube - https://www.youtube.com/@thefpaguy8376

Follow Financial Modeler's Corner:
LinkedIn Page - https://www.linkedin.com/company/financial-modeler-s-corner/?viewAsMember=true
Subscribe to our Newsletter - https://www.linkedin.com/build-relation/newsletter-follow?entityUrn=7079020077076905984

Quotes:
“Python brings a suite of libraries that can do some amazing things.”
“You don't have to learn Python, Dynamic Arrays, or LAMBDAs for Financial Modeling, but you need to know it exists to stay relevant.”
“Dynamic arrays force you to have consistent formulas.”
“LAMBDAs are functions that write functions using native Excel functions.”
“Learn the basics and start experimenting.”

In today's episode: (00:22) Intro; (00:39) Introduction of guests; (03:30) Take on Python in Excel; (06:49) Addressing security issues; (07:32) What are Dynamic Arrays; (09:58) Use case of Dynamic Arrays; (11:45) What is LAMBDA; (18:20) FMI exams and Dynamic Arrays; (21:57) LAMBDAs for 5G modeling; (24:13) How to use 5G modeling; (26:29 - 27:15) Validate your Financial Modeling skills with FMI's accreditation program (ad); (28:59) Auditing Dynamic Array models; (33:30) Favorite spillable functions; (36:58) The potential of a standard to use LAMBDAs and Dynamic Arrays; (39:45) AI building models; (44:20) Advice for the audience; (46:59) Advice on LAMBDAs; (49:15) Advice for modelers; (51:42) Connect with Jeff, Danielle, Craig & Ian; (52:57) Outro
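For readers who haven't met them, Excel's LAMBDA lets you give a name to a reusable function built out of native Excel functions, which is the idea behind the "define once, reuse everywhere" consistency the panel discusses. A rough Python analogy (the Excel formula in the comment is illustrative, not from the episode):

```python
# Rough Python analogy for a named Excel LAMBDA. In Excel you might define
# a name GROWTH as =LAMBDA(principal, rate, periods, principal*(1+rate)^periods)
# and reuse it throughout a model; in Python that is simply a function.

def growth(principal: float, rate: float, periods: int) -> float:
    """Compound a starting value at a fixed rate over a number of periods."""
    return principal * (1 + rate) ** periods

# Defining the calculation once and reusing it keeps every cell/call
# consistent -- the auditing benefit the guests attribute to LAMBDAs.
print(round(growth(100, 0.05, 2), 2))  # 110.25
```

The same habit carries back to spreadsheets: a single named LAMBDA replaces dozens of copy-pasted formulas that can silently drift apart.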

Datacenter Technical Deep Dives
How to Kick Ass with Python, Lambdas, and Terraform with Sean Tibor


Aug 23, 2023 · 61:58


Sean Tibor is a Sr. Cloud Engineer at Mondelez International and host of the Teaching Python Podcast! In this episode he discusses tools, lessons learned, and best practices for using Terraform in a serverless environment!

Resources:
https://www.linkedin.com/in/seantibor/
https://twitter.com/smtibor
https://serverless.tf/
https://github.com/antonbabenko/serverless.tf
https://docs.powertools.aws.dev/lambda/python/latest/
https://docs.python.org/3/
https://www.oreilly.com/library/view/fluent-python-2nd/9781492056348/
https://www.youtube.com/watch?v=dH2GP6Lydj8
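As a taste of the Python-on-Lambda pattern the episode covers, here is a minimal handler sketch. The event shape follows the API Gateway proxy format; the handler name and response fields are illustrative, and in practice the function would be packaged and wired up by Terraform rather than invoked by hand:

```python
import json

def handler(event, context):
    """Minimal AWS Lambda handler sketch for an API Gateway proxy event.
    The specific query parameter used here is illustrative."""
    name = (event.get("queryStringParameters") or {}).get("name", "world")
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"message": f"hello, {name}"}),
    }

# Because the handler is a plain function, it can be exercised locally
# with no AWS account at all:
resp = handler({"queryStringParameters": {"name": "terraform"}}, None)
print(resp["statusCode"])  # 200
```

Keeping the handler a pure function of its event is also what makes it easy to unit-test before Terraform ever deploys it.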

IFTTD - If This Then Dev
#219 - La blockchain comme des lambdas - Xavier van de Woestyne


Jun 21, 2023 · 65:31


"In the end, the blockchain is just a distributed database." This week's D.E.V. is Xavier van de Woestyne. Xavier discusses OCaml, Haskell, and blockchain. He covers topics such as Tezos "smart contracts," the features the blockchain offers, and DAOs. He also explores how social data can fit on the blockchain, the possibility of running programs on-chain, and the fantasy of blockchains without crypto. Finally, he gives a definition and examples of a decentralized application (dApp). Links mentioned during the show: Derrière nos écrans de fumée; Madame Bovary; Xavier's personal site. **Continue the discussion**: @ifthisthendev, @bibear, @vdwxv, LinkedIn, Xavier van de Woestyne's LinkedIn, Discord. **More dev content**: Find all our episodes on our site. We are also on Instagram, TikTok, YouTube, Twitch. **If This Then Dev job board**: If you feel like changing jobs, visit the If This Then Dev job board! And if you want to recruit people who listen to IFTTD, it goes without saying that the IFTTD job board is the place to be! This job board is made with My Little Team! **The IFTTD shop!!!**: Show your appreciation for this podcast with goodies made with love in the shop, or clearly display your allegiance to tabs or spaces. **Join the next recording!**: Find us every Monday at 19:00 (but not only then) to watch the episode being recorded live and ask your questions during the episode :) We are live on YouTube, Twitch, LinkedIn, and Twitter.

Le Podcast AWS en Français
Cross The Ages

Le Podcast AWS en Français

Play Episode Listen Later Apr 28, 2023 49:23


Discover Cross The Ages. More than a game: a dystopian, dynamic universe where you can play, collect, and trade virtual and physical cards. But what's under the hood? What cloud infrastructure does it take to deliver this experience to hundreds of thousands of players? We talk about Lambda functions, containers, databases, and multi-region replication. We also learn that serverless is not always cheaper. Discover what's behind a modern cloud game architecture.

airhacks.fm podcast with adam bien
Pommes, PaaS and Java on AWS

airhacks.fm podcast with adam bien

Play Episode Listen Later Apr 1, 2023 65:58


An airhacks.fm conversation with Sascha Moellering (@sascha242) about: Schneider CPC, starting programming with C-16, enjoying Finger's Malone, upgrade to C-128, playing Turrican, Manfred Trenz created Turrican and R-Type, publishing a Pommes Game, programming on Amiga 1200, math in game development, implementing a painting application, walking through C pointer and reference hell, from C to Java 1.0 on a Mac 6500 with 200MHz, using Metrowerks JVM, using CodeWarrior, CodeWarrior vs. stormc, Java is a clean language, working on SpiritLink, using Caucho Resin, starting at Accenture, from Accenture to Softlab, building a PaaS solution with JBoss for Allianz, managing hundreds of JVMs with a pizza team, implementing a low latency marketing solution with Vert.x, starting at Zanox, an episode with Arjan Tijms "#184 Piranha: Headless Applets Loaded with Maven", starting at AWS as Account Solution Architect, using quarkus on lambda as a microservice, using POJO asynchronous lambdas, EJB programming restrictions and Lambdas, airhacks discord server, Optimize your Spring Boot application for AWS Fargate, Reactive Microservices Architecture on AWS, Field Notes: Optimize your Java application for Amazon ECS with Quarkus, Field Notes: Optimize your Java application for AWS Lambda with Quarkus, How to deploy your Quarkus application to Amazon EKS, Using GraalVM to Build Minimal Docker Images for Java Applications Sascha Moellering on twitter: @sascha242

Screaming in the Cloud
Exciting Times in Cloud Security with Chris Farris

Screaming in the Cloud

Play Episode Listen Later Mar 21, 2023 32:46


Episode Summary
Chris Farris, Cloud Security Nerd at Turbot, joins Corey on Screaming in the Cloud to discuss the latest events in cloud security, which leads to an interesting analysis from Chris on how legal departments obscure valuable information that could lead to fewer security failures in the name of protecting company liability, and what the future of accountability for security failures looks like. Chris and Corey also discuss the newest dangers in cloud security and billing practices, and Chris describes his upcoming cloud security conference, fwd:cloudsec.

About Chris
Chris Farris has been in the IT field since 1994, primarily focused on Linux, networking, and security. For the last 8 years, he has focused on public cloud and public-cloud security. He has built and evolved multiple cloud security programs for major media companies, focusing on enabling the broader security team's objectives of secure design, incident response, and vulnerability management. He has developed cloud security standards and baselines to provide risk-based guidance to development and operations teams. As a practitioner, he's architected and implemented multiple serverless and traditional cloud applications focused on deployment, security, operations, and financial modeling.

Chris now does cloud security research for Turbot and evangelizes for the open source tool Steampipe. He is one of the organizers of the fwd:cloudsec conference (https://fwdcloudsec.org) and has given multiple presentations at AWS conferences and BSides events.

When not building things with AWS's building blocks, he enjoys building Legos with his kid and figuring out what interesting part of the globe to travel to next.
He opines on security and technology on Mastodon, Twitter and his website https://www.chrisfarris.com

Links Referenced:
Turbot: https://turbot.com/
fwd:cloudsec: https://fwdcloudsec.org/
Mastodon: https://infosec.exchange/@jcfarris
Personal website: https://chrisfarris.com

Transcript
Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.

Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn and we are here today to learn exciting things, steal exciting secrets, and make big trouble for Moose and Squirrel. Maybe that's the podcast; maybe that's the KGB, we're not entirely sure. But I am joined once again by Chris Farris, cloud security nerd at Turbot, which I will insist on pronouncing as ‘Turbo.' Chris, thanks for coming back.

Chris: Thanks for having me.

Corey: So, it's been a little while and it's been an uneventful time in cloud security with nothing particularly noteworthy happening, not a whole lot of things to point out, and honestly, we're just sort of scraping the bottom of the barrel for news… is what I wish I could say, but it isn't true. Instead, it's, “Oh, let's see what disastrous tire fire we have encountered this week.” What's top of mind for you as we record this?

Chris: I think the most interesting one I thought was, you know, going back and seeing the guilty plea from Nickolas Sharp, who formerly was an employee at Ubiquiti and apparently had, like, complete access to everything there and then ran amok with it.

Corey: Mm-hm.

Chris: The details that were buried at the time in the indictment, but came out in the press releases were he was leveraging root keys, he was leveraging lifecycle policies to suppress the CloudTrail logs.
And then of course, you know, just doing dumb things like exfiltrating all of this data from his home IP address, or exfiltrating it from his home through a VPN, which have accidentally dropped and then exposed his home IP address. Oops.Corey: There's so much to dive into there because I am not in any way shape or form, saying that what he did was good, or I endorse any of those things. And yeah, I think he belongs in prison for what he did; let's be very clear on this. But I personally did not have a business relationship with him. I am, however, Ubiquiti's customer. And after—whether it was an insider threat or whether it was someone external breaching them, Krebs On Security wound up doing a whole write-up on this and was single-sourcing some stuff from the person who it turned out, did this.And they made a lot of hay about this. They sued him at one point via some terrible law firm that's entire brand is suing media companies. And yeah, just wonderful, wonderful optics there and brilliant plan. But I don't care about the sourcing. I don't care about the exact accuracy of the reporting because what I'm seeing here is that what is not disputed is this person, who whether they were an employee or not was beside the point, deleted all of the audit logs and then as a customer of Ubiquiti, I received an email saying, “We have no indication or evidence that any customer data was misappropriated.” Yeah, you just turn off your logs and yeah, you could say that always and forever and save money on logging costs. [unintelligible 00:03:28] best practice just dropped, I guess. Clowns.Chris: So, yeah. And there's definitely, like, compliance and standards and everything else that say you turn on your logs and you protect your logs, and service control policies should have been able to detect that. If they had a security operations center, you know, the fact that somebody was using root keys should have been setting off red flags and causing escalations to occur. 
And that wasn't happening.Corey: My business partner and I have access to our AWS org, and when I was setting this stuff up for what we do here, at a very small company, neither of us can log in with root credentials without alarms going off that alert the other. Not that I don't trust the man; let's be very clear here. We both own the company.Chris: In business together. Yes.Corey: Ri—exactly. It is, in many ways, like a marriage in that one of us can absolutely ruin the other without a whole lot of effort. But there's still the idea of separation of duties, visibility into what's going on, and we don't use root API keys. Let me further point out that we are not pushing anything that requires you to send data to us. We're not providing a service that is software powered to people, much less one that is built around security. So, how is it that I have a better security posture than Ubiquiti?Chris: You understand AWS and in-depth cloud better. You know, it really comes down to how do you, as an AWS customer, understand all of the moving parts, all of the security tooling, all of the different ways that something can happen. And Amazon will say, “Well, it's in the documentation,” but you know, they have, what, 357 services? Are you reading the security pages of all of those? So, user education, I agree, you should have, and I have on all of my accounts, if anything pops up, if any IAM change happens, I'm getting text messages. Which is great if my account got compromised, but is really annoying when I'm actually making a change and my phone is blowing up.Corey: Yeah. It's worth pointing out as well that yes, Ubiquiti is publicly traded—that is understood and accepted—however, 93% of it is owned by their CEO-founder god-king. So, it is effectively one person's personal fiefdom. And I tend to take a very dim view as a direct result. When you're in cloud and you have suffered a breach, you have severely screwed something up somewhere. 
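The root-login tripwire Corey describes (either owner using root credentials pages the other) can be sketched as a small event filter. This is an illustrative sketch, not Duckbill's actual setup: it assumes CloudTrail events delivered in the EventBridge envelope, and the `publish` callback stands in for whatever paging mechanism (SNS, PagerDuty, a text message) you wire up. The same pattern would catch the lifecycle-policy tampering mentioned above, by matching on the relevant event names instead.

```python
# Sketch of a root-credential tripwire: inspect a CloudTrail event (as
# delivered through EventBridge) and page someone whenever the call was
# made with root credentials. The event shape follows CloudTrail's
# userIdentity schema; `publish` is a stand-in for SNS/PagerDuty/etc.

def is_root_activity(event: dict) -> bool:
    """True when the CloudTrail record was produced under root credentials."""
    identity = event.get("detail", {}).get("userIdentity", {})
    return identity.get("type") == "Root"

def alert_on_root(event: dict, publish) -> bool:
    """Invoke `publish` with a short message if root activity is detected."""
    if not is_root_activity(event):
        return False
    detail = event.get("detail", {})
    publish(f"Root credentials used for {detail.get('eventName', 'unknown')} "
            f"from {detail.get('sourceIPAddress', 'unknown')}")
    return True
```

In practice this would run as a Lambda target of an EventBridge rule matching CloudTrail management events, so the page fires no matter which owner triggered it.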
These breaches are never, “Someone stole a whole bunch of drives out of an AWS data center.” You have misconfigured something somewhere. And lashing out at people who reported on it is just a bad look.Chris: Definitely. Only error—now, of course, part of the problem here is that our legal system encourages people to not come forward and say, “I screwed up. Here's how I screwed up. Everybody come learn from my mistakes.” The legal professions are also there to manage risk for the company and they're like, “Don't say anything. Don't say anything. Don't even tell the government. Don't say anything.”Whereas we all need to learn from these errors. Which is why I think every time I do see a breach or I do see an indictment, I start diving into it to learn more. I did a blog post on some of the things that happened with Drizly and GitHub, and you know, I think the most interesting thing that came out of Drizly case was the ex-CEO of Drizly, who was CEO at the time of the breach, now has following him, for the rest of his life, an FTC order that says he must implement a security program wherever he goes and works. You know, I don't know what happens when he becomes a Starbucks barista or whatever, but that is on him. That is not on the company; that is on him.And I do think that, you know, we will start seeing more and more chief executive officers, chief security or information security officers becoming accountable to—or for the breaches and being personally accountable or professionally accountable for it. I think we kind of need it, even though, you know, there's only so much a CISO can do.Corey: One of the things that I did when I started consulting independently on AWS bills back in 2016 was, while I was looking at customer environments, I also would do a quick check for a few security baseline things. And I stopped doing it because I kept encountering a bunch of things that needed attention and it completely derailed the entire stated purpose of the engagement. 
And, frankly, I don't want to be running a security consultancy. There's a reason I focus on AWS bills. And people think I'm kidding, but I swear to you I'm not, when I say that the reason is in part because no one has a middle-of-the-night billing emergency. It is strictly a business-hours problem. Whereas with security, wake up.In fact, the one time I have been woken up in the middle of the night by a customer phone call, they were freaking out because it was a security incident and their bill had just pegged through the stratosphere. It's, “Cool. Fix the security problem first, then we'll worry about the bill during business hours. Bye.” And then I stopped leaving my phone off of Do Not Disturb at night.Chris: Your AWS bill is one of your indicators of compromise. Keep an eye on it.Corey: Oh, absolutely. We've had multiple engagements discover security issues on that. “So, what are these instances in Australia doing?” “We don't have anything there.” “I believe you're being sincere when you say this.”Chris: Yes.Corey: However.Chris: “Last month, you're at $1,000 and this month, you're at $50,000. And oh, by the way, it's the ninth, so you might want to go look at that.”Corey: Here's the problem that you start seeing in large-scale companies though. You or I wind up posting our IAM credentials on GitHub somewhere in public—and I do this from time to time, intentionally with absolutely no permissions attached to a thing—and I started look at the timeline of, “Okay 3, 2, 1, go,” with the push and now I start counting. What happens? At what time does the quarantine policy apply? When do I get an email alert? When do people start trying to exploit it? From where are they trying to exploit it?It's a really interesting thing to look into, just from the position of how this stuff all fits together and works. 
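Chris's line that the AWS bill is one of your indicators of compromise can be made mechanical. A minimal sketch follows, with an invented threshold (flag any day whose spend exceeds three times the median of the previous week); a real setup would lean on AWS Cost Anomaly Detection or per-account Cost Explorer queries rather than hand-rolled math.

```python
from statistics import median

def flag_spend_anomalies(daily_costs, factor=3.0, window=7):
    """Return indices of days whose spend is more than `factor` times the
    median of the preceding `window` days: the "$1,000 last month, $50,000
    this month" jump described above. The threshold is illustrative."""
    anomalies = []
    for i in range(window, len(daily_costs)):
        baseline = median(daily_costs[i - window:i])
        if baseline > 0 and daily_costs[i] > factor * baseline:
            anomalies.append(i)
    return anomalies
```

Running this per account, rather than over the consolidated bill, is what makes the mismanaged-account case Chris mentions visible at all.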
And that's great, but there's a whole ‘nother piece to it where if you or I were to do such a thing and actually give it admin credentials, okay, my, I don't know, what, $50, $100 a month account that I use for a lot of my test stuff now starts getting charged enormous piles of money that winds up looking like a mortgage in San Francisco, I'm going to notice that. But if you have a company that spending, I don't know, between ten and $20 million a month, do you have any idea how much Bitcoin you've got to be mining in that account to even make a slight dent in the overall trajectory of those accounts?Chris: In the overall bill, a lot. And in a particularly mismanaged account, my experience is you will notice it if you're monitoring billing anomalies on a per-account basis. I think it's important to note, you talked about that quarantine policy. If you look at what actually Amazon drops a deny on, it's effectively start EC2 instances and change IAM policies. It doesn't prevent anybody from listing all your buckets and exfiltrating all your data. It doesn't prevent anybody from firing up Lambdas and other less commonly used resources. Don't assume oh, Amazon dropped the quarantine policy. I'm safe.Corey: I was talking to somebody who spends $4 a month on S3 and they wound up suddenly getting $60 grand a day and Lambda charges, because max out the Lambda concurrency in every region and set it to mine crypto for 15 minutes apiece, yeah, you'll spend $60,000 a day to get, what $500 in crypto. But it's super economical as long as it's in someone else's account. And then Amazon hits them with a straight face on these things, where, “Please pay the bill.” Which is horrifying when there's several orders of magnitude difference between your normal bill and what happens post-breach. 
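Chris's caveat about the quarantine policy (denies on launching instances and IAM changes, but nothing stopping bucket listing or less common compute paths) can be illustrated with a toy action matcher. The deny list below is a fabricated subset for illustration only, not the contents of AWS's actual quarantine policy.

```python
from fnmatch import fnmatch

# Fabricated subset of deny patterns, shaped like the quarantine behavior
# Chris describes: instance launches and IAM mutations are denied, while
# S3 reads and Lambda creation sail straight through.
QUARANTINE_DENIES = ["ec2:RunInstances", "iam:*"]

def is_denied(action: str, denies=QUARANTINE_DENIES) -> bool:
    """True if `action` matches any deny pattern, using IAM-style wildcards."""
    return any(fnmatch(action.lower(), pattern.lower()) for pattern in denies)
```

Which is exactly why "Amazon dropped the quarantine policy, I'm safe" is false comfort: data exfiltration and unusual compute paths are not on the deny list in this sketch, and per the conversation, not fully covered in reality either.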
But what I did my whole post on “17 Ways to Run Containers on AWS,” followed by “17 More Ways to Run Containers on AWS,” and [unintelligible 00:12:00] about three services away from having a third one ready to go on that, the point is not, “Too many ways to run containers,” because yes, that is true and it's also amusing to me—less so to the containers team at AWS which does not have a sense of humor or sense of self-awareness of which they have been alerted—and fine, but every time you're running a container, it is a way to turn it into a crypto mining operation, in some way shape or form, which means there are almost 40-some-odd services now that can reasonably be used to spin up cryptocurrency mining. And that is the best-case breach scenario in a bunch of ways. It costs a bunch of money and things to clean up, but ‘we lost customer data.' That can destroy companies.Chris: Here's the worst part. Crypto mining is no longer profitable even when I've got stolen API keys because bitcoin's in the toilet. So, now they are going after different things. Actually, the most recent one is they look to see if your account is out of the SCS sandbox and if so, they go back to the tried-and-true way of doing internet scams, which is email spam.Corey: For me, having worked in operations for a very long time, I've been in situations where I worked at Expensify and had access to customer data there. I have worked in other finance companies—I worked at Blackrock. Where I work now, I have access to customer billing data. And let me be serious here for a second, I take all of these things seriously, but I also in all of those roles slept pretty well at night. The one that kept me up was a brief stint I did as the Director of Tech Ops at Grindr over ten years ago because unlike the stuff where I'm spending the rest of my career and my time now, it's not just money anymore.Whereas today, if I get popped, someone can get access to what a bunch of companies are paying AWS. 
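The "$60,000 a day for $500 in crypto" math above follows from Lambda's duration billing. A rough back-of-the-envelope sketch: the per-GB-second rate below is the published x86 Lambda duration price at the time of writing, and every other number (regions, concurrency, memory) is an assumption you would swap for your own.

```python
def lambda_burn_rate(regions, concurrency_per_region,
                     gb_seconds_price=0.0000166667,
                     memory_gb=10.0, hours=24.0):
    """Rough cost of running `concurrency_per_region` Lambdas flat-out in
    `regions` regions for `hours` hours. The GB-second price is Lambda's
    published x86 duration rate; the rest are illustrative inputs."""
    seconds = hours * 3600
    gb_seconds = regions * concurrency_per_region * memory_gb * seconds
    return gb_seconds * gb_seconds_price
```

Even a single region at the default 1,000 concurrent executions with 1 GB functions burns roughly $60 an hour, which is how an attacker maxing out concurrency everywhere turns a $4-a-month account into a five-figure daily bill.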
It's scandalous, and I will be sued into oblivion and my company will not exist anymore and I will have a cloud hanging over my head forever. So, I have to be serious about it—Chris: But nobody will die.Corey: Nobody dies. Whereas, “Oh, this person is on Grindr and they're not out publicly,” or they live in a jurisdiction where that is punishable by imprisonment or death, you have blood on your hands, on some level, and I have never wanted that kind of responsibility.Chris: Yeah. It's reasonably scary. I've always been happy to say that, you know, the worst thing that I had to do was keep the Russians off CNN and my friends from downloading Rick and Morty.Corey: Exactly. It's, “Oh, heavens, you're winding up costing some giant conglomerate somewhere theoretical money on streaming subscriptions.” It's not material to the state of the world. And part of it, too, is—what's always informed my approach to things is, I'm not a data hoarder in the way that it seems our entire industry is. For the Last Week in AWS newsletter, the data that I collect and track is pretty freaking small.It's, “You want to sign up for the lastweekinaws.com newsletter. Great, I need your email address.” I don't need your name, I don't need the company you work at. You want to give me a tagged email address? Fine. You want to give me some special address that goes through some anonymizing thing? Terrific. I need to know where I'm sending the newsletter. And then I run a query on that for metrics sometimes, which is this really sophisticated database query called a count. How many subscribers do I have at any given point because that matters to our sponsors. But can we get—you give us any demographic? No, I cannot. I can't. I have people who [unintelligible 00:15:43] follow up surveys sometimes and that's it.Chris: And you're able to make money doing that. 
You don't have to collect, okay, you know, Chris's zip code is this and Bob's zip code is that and Frank's zip code is the other thing.Corey: Exactly.Chris: Or job titles, or you know, our mother's maiden name or anything else like that.Corey: I talk about what's going on in the world of AWS, so it sort of seems to me that if you're reading this stuff every week, either because of the humor or in spite of the humor, you probably are in a position where services and goods tied to that ecosystem would be well-received by you or one of the other 32,000 people who happen to be reading the newsletter or listening to the podcast or et cetera, et cetera, et cetera. It's an old-timey business model. It's okay, I want to wind up selling, I don't know, expensive wristwatches. Well, maybe I'll advertise in a magazine that caters to people who have an interest in wristwatches, or caters to a demographic that traditionally buys those wristwatches. And okay, we'll run an ad campaign and see if it works.Chris: It's been traditional advertising, not the micro-targeting stuff. And you know, television was the same way back in the broadcast era, you know? You watched a particular show, people of that demographic who watched that particular show had certain advertisers they wanted.Corey: That part of the challenge I've seen too, from sponsors of this show, for example, is they know it works, but they're trying to figure out how to do any form of attribution on this. And my answer—which sounds self-serving, but it's true—is, there's no effective way to do it because every time you try, like, “Enter this coupon code,” yeah, I assure you, some of these things wind up costing millions of dollars to deploy at large companies at scale and they provide value for doing it. No one's going to punch in a coupon code to get 10% off or something like that. Procurement is going to negotiate custom contracts and it's going to be brought up maybe by someone who heard the podcast ad. 
Maybe it just sits in the back of their mind until they hear something and it just winds up contributing to a growing awareness of these things.You're never going to do attribution that works on things like that. People try sometimes to, “Oh, you'll get $25 in credit,” or, “We'll give you a free t-shirt if you fill out the form.” Yeah, but now you're biasing for people who find that a material motivator. When I'm debating what security suite I'm going to roll out at my enterprise, I don't want a free t-shirt for that. In fact, if I get a free t-shirt and I wear that shirt from the vendor around the office while I'm trying to champion bringing that thing in, I look a little compromised.Chris: Yeah. Yeah, I am—[laugh] I got no response to that [laugh].Corey: No, no. I hear you. One thing I do want to talk about is the last time we spoke, you mentioned you were involved in getting fwd:cloudsec—a conference—off the ground. Like all good cloud security conferences, it's named after an email subject line.It is co-located with re:Inforce this year in Anaheim, California. Somewhat ominously enough, I used to live a block-and-a-half away from the venue. But I don't anymore and in fact, because nobody checks the global event list when they schedule these things, I will be on the other side of the world officiating a wedding the same day. So, yet again, I will not be at re:Inforce.Chris: That is a shame because I think you would have made an excellent person to contribute to our call for papers and attend. So yes, fwd:cloudsec is deliberately actually named after a subject line because all of the other Amazon conferences seem to be that way. And we didn't want to be going backwards and thinking, you know, past tense. We were looking forward to our conference. Yeah, so we're effectively a vendor-neutral cloud security conference.
We liked the idea of being able to take the talks that Amazon PR would never allow on stage at re:Inforce and run with it.Corey: I would question that. I do want to call that out because I gave a talk at re:Invent one year about a vulnerability I found and reported, with the help of two other people, Scott Piper and Brandon Sherman, to the AWS security team. And we were able to talk about that on stage with Zack Glick, who at the time, was one of basically God's own prototypes, working over in the AWS environment next to Dan [Erson 00:19:56]. Now, Dan remains the salt of the earth, and if he ever leaves basically just short the entire US economy. It's easier. He is amazing. I digress. The point being is that they were very open about talking about an awful lot of stuff that I would never have expected that they would be okay with.Chris: And last year at re:Inforce, they had an excellent, excellent chalk talk—but it was a chalk talk, not recorded—on how ransomware attacks operate. And they actually, like, revealed some internal, very anonymized patterns of how attacks are working. So, they're starting to realize what we've been saying in the cloud security community for a while, which is, we need more legitimate threat intelligence. On the other hand, they don't want to call it threat intelligence because the word threat is threatening, and therefore, you know, we're going to just call it, you know, patterns or whatever. And our conference is, again, also multi-cloud, a concept that until recently, AWS, you know, didn't really want to acknowledge that there were other clouds and that people would use both of them [crosstalk 00:21:01]—Corey: Multi-cloud security is a nightmare. It's just awful.Chris: Yeah, I don't like multi-cloud, but I've come to realize that it is a thing. 
That you will either start at a company that says, “We're AWS and we're uni-cloud,” and then next thing, you know, either some rogue developer out there has gone and spun up an Azure subscription or you acquire somebody who's in GCP, or heaven forbid, you have to go into some, you know, tinhorn dictator's jurisdiction and they require you to be on-prem or leverage Oracle Cloud or something. And suddenly, congratulations, you're now multi-cloud. So yes, our goal is really to be the things that aren't necessarily onstage or aren't all just, “It's great.” Even your talk was how great the incident response and vulnerability remediation process was.
More of this please.” Because until they start talking about security issues and culture and the remediation thereof, I don't give a shit what they have to say about almost anything else because it all comes back to security. The only things I use Azure for, which admittedly has some great stuff; their computer vision API? Brilliant—but the things I use them for are things that I start from a premise of security is not important to that service.The thing I use it for on the soon-to-be-pivoted to Mastodon Twitter thread client that I built, it writes alt-text for images that are about to be put out publicly. Yeah, there's no security issue from that perspective. I am very hard-pressed to imagine a scenario in which that were not true.Chris: I can come up with a couple, but you know—Corey: It feels really contrived. And honestly, that's the thing that concerns me, too: the fact that I finally read, somewhat recently, an AWS white paper talking about—was it a white paper or was it blog post? I forget the exact media that it took. But it was about how they are seeing ransomware attacks on S3, which was huge because before that, I assumed it was something that was being made up by vendors to sell me something.Chris: So, that was the chalk talk.Corey: Yes.Chris: They finally got the chalk talk from re:Inforce, they gave it again at re:Invent because it was so well received and now they have it as a blog post out there, so that, you know, it's not just for people who show up in the room, they can hear it; it's actually now documented out there. And so, kudos to the Amazon security team for really getting that sort of threat intelligence out there to the community.Corey: Now, it's in writing, and that's something that I can cite as opposed to, “Well, I was at re:Invent and I heard—” Yeah, we saw the drink tab. We know what you might have thought you heard or saw at re:Invent. 
Give us something we can take to the board.Chris: There were a lot of us on that bar tab, so it's not all you.Corey: Exactly. And it was my pleasure to do it, to be clear. But getting back to fwd:cloudsec, I'm going to do you a favor. Whether it's an actual favor or the word favor belongs in quotes, the way that I submit CFPs, or conference talks, is optimized because I don't want to build a talk that is never going to get picked up. Why bother to go through all the work until I have to give it somewhere?So, I start with a catchy title and then three to five sentences. And if people accept it, great, then I get to build the talk. This is a forcing function in some ways because if you get a little delayed, they will not move the conference for you. I've checked. But the title of a talk that I think someone should submit for fwd:cloudsec is, “I Am Smarter Than You, so Cloud Security is Easy.”And the format and the conceit of the talk is present it with sort of a stand-it-up-to-take-it-down level of approach where you are over-confident in the fact that you are smarter than everyone else and best practices don't apply to you and so much of this stuff is just security theater designed as a revenue extraction mechanism as opposed to something you should actually be doing. And talk about why none of these things matter because you use good security and you know, it's good because you came up with it and there's no way that you could come up with something that you couldn't break because you're smart. It says so right in the title and you're on stage and you have a microphone. They don't. Turn that into something. I feel like there's a great way to turn that in a bunch of different directions. I'd love to see someone give that talk.Chris: I think Nickolas Sharp thought that too.Corey: [laugh]. Exactly. In fact, that will be a great way to bring it back around at the end. And it's like, “And that's why I'm better at security than you are. 
If you have any questions beyond this, you can reach me at whatever correctional institute I go in on Thursday.” Exactly. There's ways to make it fun and engaging. Because from my perspective, talks have to be entertaining or people don't pay attention.Chris: They're either entertaining, or they're so new and advanced. We're definitely an advanced cloud security practice thing. They were 500 levels. Not to brag or anything, but you know, you want the two to 300-level stuff, you can go CCJ up the street. We're hitting and going above and beyond what a lot of the [unintelligible 00:27:18]—Corey: I am not as advanced on that path as you are; I want to be very clear on this. You speak, I listen. You're one of those people when it comes to security. Because again, no one's life is hanging in the balance with respect to what I do. I am confident in our security posture here, but nothing's perfect. Everything is exploitable, on some level.It's also not my core area of focus. It is yours. And if you are not better than I am at this, then I have done something sort of strange, or so of you, in the same way that it is a near certainty—but not absolute—that I am better at optimizing AWS bills than you are. Specialists exist for a reason and to discount that expertise is the peak of hubris. Put that in your talk.Chris: Yeah. So, one talk I really want to see, and I've been threatening to give it for a while, is okay, if there's seventeen ways—or sorry, seventeen times two, soon to be seventeen times three ways to run containers in AWS, there's that many ways to exfiltrate credentials from those containers. What are all of those things? Do we have a holistic way of understanding, this is how credentials can be exfiltrated so that we then as defenders can go figure out, okay, how do we build detections and mitigations for this?Corey: Yeah. I'm a huge fan of Canarytokens myself, for that exact purpose. 
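The Canarytokens pattern Corey mentions is easy to sketch: plant credentials that have no permissions, record their key IDs, and scream the moment CloudTrail shows them being used. The key IDs below are fabricated placeholders, and a real deployment would use Canarytokens itself or CloudTrail-based alerting rather than this toy scan.

```python
# Toy canary-credential scan: given CloudTrail records and the set of planted
# (zero-permission) access key IDs, surface any use of a canary along with
# where the call came from. Key IDs are fabricated placeholders.
CANARY_KEY_IDS = {"AKIAEXAMPLECANARY001", "AKIAEXAMPLECANARY002"}

def tripped_canaries(records, canary_ids=CANARY_KEY_IDS):
    """Yield (accessKeyId, eventName, sourceIPAddress) for each CloudTrail
    record that used a planted canary credential."""
    for record in records:
        key = record.get("userIdentity", {}).get("accessKeyId")
        if key in canary_ids:
            yield key, record.get("eventName"), record.get("sourceIPAddress")
```

The value is in the signal quality: the canary keys grant nothing, so any hit at all means someone has read credentials they should never have seen.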
There are many devices I have where the only credentials in plain text on disk are things that, as soon as they get used, I wind up with a bunch of things screaming at me that there's been a problem and telling me where it is. I'm not saying that my posture is impenetrable. Far from it. But you're going to have to work for it a little bit harder than running some random off-the-shelf security scanner against my AWS account and finding, oops, I forgot to turn on a bucket protection.

Chris: And the other area that I think is getting really interesting is all of the things that have credentials into your cloud account, whether it's something like CircleCI or GitHub. I was having a conversation with somebody just this morning and we were talking about Roles Anywhere, and I was like, "Roles Anywhere is great if you've got a good, strong PKI solution and can keep that private certificate, that certificate you need, safe." If you just put it on a disk, like you would have put your AKIA and secret on a disk, congratulations, you haven't really improved security. You've just gotten rid of the IAM users that are being flagged in your CSPM tool, and congratulations, you have, in fact, achieved security theater.

Corey: It's obnoxious, on some level. And part of the problem is that cost and security are aligned in that people care about them right after they really should have cared about them. The difference is you can beg, cry, whine, et cetera to AWS for concessions, you can raise another round of funding; there are solutions with money. But security? That ship has already sailed.

Chris: Yeah. Once the data is out, the data is out. Now, I will say on the bill, you get reminded of it every month, about three or four days after. It's like, "Oh, crap, yeah, I should have turned off that EC2 instance. I just burned $100." Or, "Oh hey, we didn't turn off that application. I just burned $100,000." That doesn't happen with security.
Security events tend to be few and far between; they're just much bigger when they happen.

Corey: I really want to thank you for taking the time to chat with me. I'm sure I'll have you back on between now and re:Inforce slash fwd:cloudsec or anything else we come up with that resembles an email subject line. If people want to learn more and follow along with your adventures—as they should—where's the best place for them to find you these days?

Chris: So, I am now pretty much living on Mastodon on the InfoSec Exchange. And my website, chrisfarris.com, is where you can find the link to that, because it's not just at, you know, whatever. You have to give the whole big long URL in Mastodon. It's no longer—

Corey: Yeah. It's like a full-on email address with weird domains.

Chris: Exactly, yeah. So, find me at http colon slash slash infosec dot exchange slash at jcfarris. Or just search for Chris Farris and follow the links. For fwd:cloudsec, we are conveniently located at fwdcloudsec.org, which is F-W-D cloud sec dot org. No colons, because I don't think those are valid in whois.

Corey: Excellent choice. And of course, links to that go in the [show notes 00:31:32], so click the button. It's easier. Thanks again for your time. I really appreciate it.

Chris: Thank you.

Corey: Chris Farris, Cloud Security Nerd at Turbot slash Turbo. I'm Cloud Economist Corey Quinn and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice, along with an angry comment that resembles a lawsuit being filed, and then have it process-served to me because, presumably, you work at Ubiquiti.

Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS.
We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.

Lambda3 Podcast
Lambda3 Podcast 343 – Nostalgia: Jogos marcantes

Lambda3 Podcast

Play Episode Listen Later Mar 17, 2023 92:22


In this episode, Lambdas look back at video games that defined different generations and that still delight those who live in the gamer universe today.

Lambda3 Podcast
Lambda3 Podcast 325 – Começo de Carreira

Lambda3 Podcast

Play Episode Listen Later Nov 11, 2022 76:22


In this episode of the podcast, Lambdas from different fields talk about their experiences starting out in the careers they work in today: fears, lessons learned, tips, and plenty of shared stories. Join our Telegram group and share your comments with us: https://lb3.io/telegram

Podcast feed: www.lambda3.com.br/feed/podcast
Feed with technical episodes only: www.lambda3.com.br/feed/podcast-tecnico
Feed with non-technical episodes only: www.lambda3.com.br/feed/podcast-nao-tecnico
Feed with business episodes only: www.lambda3.com.br/feed/podcast-negocios

Lambda3 · #325 - Começo de Carreira

Agenda:
What we do today
The path to where we are now
What it was like starting a new career
Starting a career vs. switching careers
The importance of diversity on teams
Anxiety at the start of a new profession
Fear of not knowing something
Knowing how to ask for help
Overload and burnout
Communities as a source of support
The culture that treats asking questions as a sign of incompetence
The importance of the work environment
Seeing people beyond their résumé and formal education
The Lambda3 experience, and tips for those starting out

Participants:
Dickson Melo - @dicksonmelo
Izabela Oliveira - @izabelaoliveira
Jonatan Crespo - @jonatan-crespo
João Moraes - @joaopedromoraez
Juan Barata - @juancarlosbarata
Rogério Anselmo - @rogerio-anselmo

Editing: Compasso Coolab

Credits for the music used in this episode: Music by Kevin MacLeod (incompetech.com) licensed under Creative Commons: By Attribution 3.0 - creativecommons.org/licenses/by/3.0

Screaming in the Cloud
A Cloud Economist is Born - The AlterNAT Origin Story

Screaming in the Cloud

Play Episode Listen Later Nov 9, 2022 34:45


About Ben

Ben Whaley is a staff software engineer at Chime. Ben is co-author of the UNIX and Linux System Administration Handbook, the de facto standard text on Linux administration, and is the author of two educational videos: Linux Web Operations and Linux System Administration. He has been an AWS Community Hero since 2014. Ben has held Red Hat Certified Engineer (RHCE) and Certified Information Systems Security Professional (CISSP) certifications. He earned a B.S. in Computer Science from the University of Colorado, Boulder.

Links Referenced:
Chime Financial: https://www.chime.com/
alternat.cloud: https://alternat.cloud
Twitter: https://twitter.com/iamthewhaley
LinkedIn: https://www.linkedin.com/in/benwhaley/

Transcript

Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.

Corey: Forget everything you know about SSH and try Tailscale. Imagine if you didn't need to manage PKI or rotate SSH keys every time someone leaves. That'd be pretty sweet, wouldn't it? With Tailscale SSH, you can do exactly that. Tailscale gives each server and user device a node key to connect to its VPN, and it uses the same node key to authorize and authenticate SSH.

Basically, you're SSHing the same way you manage access to your app. What's the benefit here? Built-in key rotation, permissions as code, connectivity between any two devices, reduced latency, and there's a lot more, but there's a time limit here. You can also ask users to reauthenticate for that extra bit of security. Sounds expensive?

Nope, I wish it were. Tailscale is completely free for personal use on up to 20 devices. To learn more, visit snark.cloud/tailscale.
Again, that's snark.cloud/tailscale.

Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn and this is an episode unlike any other that has yet been released on this august podcast. Let's begin by introducing my first-time guest somehow, because apparently an invitation got lost in the mail somewhere. Ben Whaley is a staff software engineer at Chime Financial and has been an AWS Community Hero since Andy Jassy was basically in diapers, to my level of understanding. Ben, welcome to the show.

Ben: Corey, so good to be here. Thanks for having me on.

Corey: I'm embarrassed that you haven't been on the show before. You're one of those people that slipped through the cracks, and somehow I was very bad at following up slash hounding you into finally agreeing to be here. But you certainly waited until you had something auspicious to talk about.

Ben: Well, you know, I'm the one that really should be embarrassed here. You did extend the invitation, and I guess I just didn't feel like I had something to drop. But I think today we have something that will interest most of the listeners without a doubt.

Corey: So, folks who have listened to this podcast before, or read my newsletter, or follow me on Twitter, or have shared an elevator with me, or at any point have passed me on the street, have heard me complain about the Managed NAT Gateway and its egregious data processing fee of four-and-a-half cents per gigabyte. And I have complained about this for small customers because they're in the free tier; why is this thing charging them 32 bucks a month? And I have complained about this on behalf of large customers who are paying the GDP of the nation of Belize in data processing fees as they wind up shoving very large workloads to and fro, which is, I think, part of the prerequisite requirements for having a data warehouse.
And you are no different than the rest of these people who have those challenges, with the singular exception that you have done something about it, and what you have done is so, in retrospect, blindingly obvious that I am embarrassed the rest of us never thought of it.

Ben: It's interesting, because when you are doing engineering, it's often the simplest solution that is the best. I've seen this repeatedly. And it's a little surprising that it didn't come up before, but I think it's, in some way, just a matter of timing. But what we came up with—and is this the right time to get into it, do you want to just kind of name the solution here?

Corey: Oh, by all means. I'm not going to steal your thunder. Please, tell us what you have wrought.

Ben: We're calling it AlterNAT, and it's an alternative approach to a high-availability NAT solution. As everybody knows, NAT Gateway is sort of the default choice; it certainly is what AWS pushes everybody towards. But there is, in fact, a legacy solution: NAT instances. These were around long before NAT Gateway made an appearance. And like I said, they're considered legacy, but with the help of lots of modern AWS innovations and technologies like Lambdas, auto-scaling groups with max instance lifetimes, and the latest generation of network-enhanced instances, it turns out that we can maybe not get quite as effective as a NAT Gateway, but we can save a lot of money and skip those data processing charges entirely by having a NAT instance solution with a failover NAT Gateway, which I think is kind of the key point behind the solution. So, are you interested in diving into the technical details?
And they had an interface in the public subnet where they lived and an interface hanging out in the private subnet, and they had to be configured to wind up passing traffic to and fro.

Well, okay, that's great and all, but isn't that kind of brittle and dangerous? I basically have a single instance as a single point of failure, and those were the early days, when individual instances did not have the level of availability and durability they do now. Yeah, it's kind of awful, but here you go. I mean, the most galling part of the Managed NAT Gateway service is not that it's expensive; it's that it's expensive, but also incredibly good at what it does. You don't have to think about this whole problem anymore, and as of recently, it also supports IPv6-to-IPv4 translation as well.

It's not that the service is bad. It's that the service is stonkingly expensive, particularly at scale. And everything that we've seen before is either "oh, run your own NAT instances" or bend your knee and pay your money. And a number of folks have come up with different options where this is ridiculous. Just go ahead and run your own NAT instances.

Yeah, but what happens when I have to take it down for maintenance or replace it? It's like, well, I guess you're not going to the internet today. This has the, in hindsight, obvious solution: well, we just run the Managed NAT Gateway, because the 32 bucks a month in instance-hour charges don't actually matter at any point of scale when you're doing this, but you wind up using the NAT instance for day-in, day-out traffic, and the failover mode is simply that you use the expensive Managed NAT Gateway until the instance is healthy again, and then automatically change the route table back and forth.

Ben: Yep. That's exactly it. So, the auto-scaling NAT instance solution has been around for a long time, well before even NAT Gateway was released.
You could have NAT instances in an auto-scaling group where the size of the group was one, and if the NAT instance failed, it would just replace itself. But this left a period in which you'd have no internet connectivity, you know, while the NAT instance was swapped out.

So, the solution here is that when auto-scaling terminates an instance, it fails over the route table to a standby NAT Gateway, rerouting the traffic. So, there's never a point at which there's no internet connectivity, right? The NAT instance is running, processing traffic, and gets terminated after a certain configurable period of time: 14 days, 30 days, whatever makes sense for your security strategy. It could be never, right? You could choose to have your own maintenance window in which to do it.

Corey: And let's face it, this thing is more or less sitting there as a network traffic router, for lack of a better term. There is no need to ever log into the thing and make changes to it until and unless there's a vulnerability that you can exploit via somehow just talking to the TCP stack when nothing's actually listening on the host.

Ben: You know, you can run your own AMI that has been pared down to almost nothing, and that instance doesn't do much. It's using just a Linux kernel to sit on two networks and pass traffic back and forth. It has a translation table that keeps track of the state of connections, so you don't need to have any service running. To manage the system, we have SSM, so you can use Session Manager to log in, but frankly, you can just disable that. You almost never even need to get a shell. And that is, in fact, an option we have in the solution: to disable SSM entirely.

Corey: One of the things I love about this approach is that it is turnkey. You throw this thing in there and it's good to go.
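The route-table swap Ben describes can be sketched in a few lines of boto3. To be clear, this is an illustrative sketch, not AlterNAT's actual code; the function names and the injected `ec2` client are assumptions for the example. In practice, something like this would run in a Lambda triggered by the auto-scaling lifecycle hook.

```python
def fail_over_to_nat_gateway(ec2, route_table_ids, nat_gateway_id):
    """Point each private route table's default route at the standby
    NAT Gateway, just before the NAT instance is terminated."""
    for rtb_id in route_table_ids:
        ec2.replace_route(
            RouteTableId=rtb_id,
            DestinationCidrBlock="0.0.0.0/0",
            NatGatewayId=nat_gateway_id,
        )


def fail_back_to_nat_instance(ec2, route_table_ids, instance_id):
    """Return the default route to the replacement NAT instance once it
    passes health checks, so the NAT Gateway stops processing traffic."""
    for rtb_id in route_table_ids:
        ec2.replace_route(
            RouteTableId=rtb_id,
            DestinationCidrBlock="0.0.0.0/0",
            InstanceId=instance_id,
        )
```

In production, `ec2` would be `boto3.client("ec2")`; passing the client in makes the logic trivial to exercise against a stub.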
And in the event that the instance becomes unhealthy, great, it fails traffic over to the Managed NAT Gateway while it terminates the old node and replaces it with a healthy one, and then fails traffic back. Now, I do need to ask: what is the story of network connections during that failover and failback scenario?

Ben: Right. That's the primary drawback of the solution, I would say: any established TCP connections that are on the NAT instance at the time of a route change will be lost. So, say you have—

Corey: TCP now terminates on the floor.

Ben: Pretty much. The connections are dropped. If you have an open SSH connection from a host in the private network to a host on the internet and the instance fails over to the NAT Gateway, the NAT Gateway doesn't have the translation table that the NAT instance had. And not to mention, the public IP address also changes, because you have one Elastic IP assigned to the NAT instance and a different Elastic IP assigned to the NAT Gateway, and because that upstream IP is different, the remote host is, like, tracking the wrong IP. So, those connections are going to be lost.

So, there are some use cases where this may not be suitable. We do have some ideas on how you might mitigate that, for example, with the use of a maintenance window to schedule the replacement, or replacing less often so it doesn't affect your workflow as much, but frankly, for many use cases, my belief is that it's actually fine. In our use case at Chime, we found that it's completely fine and we didn't actually experience any errors or failures. But there might be some use cases that are more sensitive or less resilient to failure in the first place.

Corey: I would also point out that a lot of how software is going to behave is going to be a reflection of the era in which it was moved to cloud. Back in the early days of EC2, you had no real sense of reliability around any individual instance, so everything was written in a very defensive manner.
These days, with instances automatically being able to flow among different hardware, so we don't get instance interrupt notifications the way we once did on a semi-constant basis, it more or less has become what presents as bulletproof, so a lot of people are writing software that's a bit more brittle. But it's always been a best practice that when a connection fails, okay, what happens at failure? Do you just give up and throw your hands in the air and shriek for help, or do you attempt to retry a few times, ideally backing off exponentially?

In this scenario, those retries will work. So, it's a question of how well you have built your software. Okay, let's say that you made the worst decisions imaginable, and okay, if that connection dies, the entire workload dies. Okay, you have the option to refactor it to be a little bit better behaved, or alternately, you can keep paying the Managed NAT Gateway tax of four-and-a-half cents per gigabyte in perpetuity forever. I'm not going to tell you what decision to make, but I know which one I'm making.

Ben: Yeah, exactly. The cost savings potential of it far outweighs the potential maintenance troubles, I guess, that you could encounter. But the fact is, if you're relying on Managed NAT Gateway and paying the price for doing so, it's not as if there's no chance for connection failure. NAT Gateway could also fail. I will admit that I think it's an extremely robust and resilient solution. I've been really impressed with it, especially so after having worked on this project, but it doesn't mean it can't fail.

And beyond that, upstream of the NAT Gateway, something could in fact go wrong. Like, internet connections are unreliable, kind of by design. So, if your system is not resilient to connection failures, like, there's a problem to solve there anyway; you're kind of relying on hope.
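The retry-with-exponential-backoff pattern Corey describes is a handful of lines in any language. A minimal Python sketch (the names and defaults are illustrative): a connection dropped by a route change fails once, then the retry succeeds over the new path.

```python
import random
import time


def retry_with_backoff(operation, max_attempts=5, base_delay=0.1, sleep=time.sleep):
    """Call operation(); on ConnectionError, retry with exponentially
    growing delays plus jitter instead of giving up on the first failure."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except ConnectionError:
            if attempt == max_attempts - 1:
                raise  # out of attempts; surface the failure
            # delays: 0.1s, 0.2s, 0.4s, ... plus jitter to avoid thundering herds
            sleep(base_delay * (2 ** attempt) + random.uniform(0, base_delay))
```

The injectable `sleep` is just there so the behavior can be tested without actually waiting.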
So, it's kind of a forcing function in some ways to build architectural best practices, in my view.

Corey: I can't stress enough that I have zero problem with the capabilities and the stability of the Managed NAT Gateway solution. My complaints about it start and stop entirely with the price. Back when you first showed me the blog post that is releasing at the same time as this podcast—and you can visit that at alternat.cloud—you sent me an early draft of this, and what I loved the most was that your math was off because of a not-complete understanding of the gloriousness that is just how egregious the NAT Gateway charges are.

Your initial analysis said, "All right, if you're throwing half a petabyte out to the internet, this has the potential of cutting the bill by"—I think it was $10,000 or something like that. It's, "Oh no, no. It has the potential to cut the bill by an entire twenty-two-and-a-half thousand dollars." Because this processing fee does not replace any egress fees whatsoever. It's purely additive. If you forget to have a free S3 Gateway endpoint in a private subnet, every time you put something into or take something out of S3, you're paying four-and-a-half cents per gigabyte on that, despite the fact that there's no internet transitory work and it's not crossing availability zones. It is simply a four-and-a-half cent fee to retrieve something that has only cost you—at most—2.3 cents per month to store in the first place. Flip that switch, and that becomes completely free.

Ben: Yeah. I'm not embarrassed at all to talk about the lack of education I had around this topic. The fact is, I'm an engineer primarily, and I came across the cost stuff because it seemed like a problem that needed to be solved within my organization. And if you don't mind, I might just linger on this point and think back a few months. I looked at the AWS bill and I saw this egregious 'EC2 Other' category. It was taking up the majority of our bill.
Like, the single biggest line item was EC2 Other. And I was like, "What could this be?"

Corey: I want to flag that, just because it bears repeating: I often get people pushing back with, "Well, how bad—it's one Managed NAT Gateway. How much could it possibly cost? $10?" No, it is the majority of your monthly bill. I cannot stress that enough.

And that's not because the people who work there are doing anything that they should not be doing or didn't understand all the nuances of this. It's because, for the security posture that is required for what you do—you are at Chime Financial, let's be clear here—putting everything in public subnets was not really a possibility for you folks.

Ben: Yeah. And not only that, but there are plenty of services that have to be on private subnets. For example, AWS Glue services must run in private VPC subnets if you want them to be able to talk to other systems in your VPC; like, they cannot live in a public subnet. So essentially, if you want to talk to the internet from those jobs, you're forced into some kind of NAT solution. So, I dug into the EC2 Other category and started trying to figure out what was going on there.

There's no way—natively—to look at what traffic is transiting the NAT Gateway. There's no interface that shows you what's going on or what the biggest talkers are over that network. Instead, you have to have flow logs enabled and have to parse those flow logs. So, I dug into that.

Corey: Well, you're missing a step first, because in a lot of environments, people have more than one of these things, so you first get to do the scavenger hunt of: okay, I have a whole bunch of Managed NAT Gateways, and first I need to go diving into CloudWatch metrics and figure out which are the heavy talkers. It's usually one or two followed by a whole bunch of small stuff, but not always, so figuring out which VPC you're even talking about is a necessary prerequisite.

Ben: Yeah, exactly.
The data around it is almost missing entirely. Once you come to the conclusion that it is a particular NAT Gateway—like, that's a set of problems to solve on its own—but first, you have to go to the flow logs and figure out what are the biggest upstream IPs that it's talking to. Once you have the IP, it still isn't apparent what that host is. In our case, we had all sorts of outside parties that we were talking to a lot, and it's a matter of sorting by volume and figuring out: well, this IP, what is the reverse IP? Who is potentially the host there?

I actually had some wrong answers at first. I set up VPC endpoints to S3 and DynamoDB and SQS because those were some top talkers, and that was a nice way to gain some security and some resilience and save some money. And then I found, well, Datadog; that's another top talker for us, so I ended up creating a nice private link to Datadog, which they offer for free, by the way, which is more than I can say for some other vendors. But then I found some outside parties where there wasn't a nice private link solution available to us, and yet it was by far the largest volume. So, that's what kind of started me down this track: analyzing the NAT Gateway myself by looking at VPC flow logs. Like, it's shocking that there isn't a better way to find that traffic.

Corey: It's worse than that, because VPC flow logs tell you where the traffic is going and in what volumes, sure, on an IP address and port basis, but okay, now you have a Kubernetes cluster that spans two availability zones. Okay, great. What is actually passing through that? So, you have one big application that just seems awfully chatty, you have multiple workloads running on the thing. What's the expensive thing talking back and forth? The only way that you can reliably get the answer to that, I found, is to talk to people about what those workloads are actually doing, and failing that, you're going code spelunking.

Ben: Yep. You're exactly right about that.
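The sorting-by-volume step Ben describes amounts to grouping flow log records by destination address and summing the bytes field. A rough sketch of that aggregation, assuming the default version-2 flow log format; in a real analysis the records would be read from S3 or CloudWatch Logs rather than a list of strings.

```python
from collections import defaultdict


def top_talkers(flow_log_lines, top_n=10):
    """Sum bytes per destination address across VPC flow log records.

    Default v2 format fields: version account-id interface-id srcaddr
    dstaddr srcport dstport protocol packets bytes start end action log-status
    """
    bytes_by_dst = defaultdict(int)
    for line in flow_log_lines:
        fields = line.split()
        if len(fields) < 14 or fields[0] == "version":
            continue  # skip headers and malformed records
        dstaddr, nbytes = fields[4], fields[9]
        if nbytes != "-":  # NODATA/SKIPDATA records carry no byte count
            bytes_by_dst[dstaddr] += int(nbytes)
    return sorted(bytes_by_dst.items(), key=lambda kv: kv[1], reverse=True)[:top_n]
```

The next step, as Ben says, is the tedious part: reverse DNS and ownership lookups on the top addresses to figure out who they actually are.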
In our case, it ended up being apparent because we have a set of subnets where only one particular project runs. And when I saw the source IP, I could immediately figure that part out. But if it's a K8s cluster in the private subnets, yeah, how are you going to find it out? You're going to have to ask everybody that has workloads running there.

Corey: And we're talking about, in some cases, millions of dollars a month. Yeah, it starts to feel a little bit predatory as far as how it's priced and the amount of work you have to put in to track this stuff down. I've done this a handful of times myself, and it's always painful unless you discover something pretty early on, like, oh, it's talking to S3, because that's pretty obvious when you see that. It's, yeah, flip a switch and this entire engagement just paid for itself a hundred times over. Now, let's see what else we can discover.

That is always one of those fun moments because, first, customers are super grateful to learn that, oh my God, I flipped that switch and I'm saving a whole bunch of money. Because it starts with gratitude. "Thank you so much. This is great." And it doesn't take a whole lot of time for that to alchemize into anger of, "Wait. You mean I've been being ridden like a pony for this long and no one bothered to mention that if I click a button, this whole thing just goes away?"

And when you mention this to your AWS account team, like, they're solicitous, but they either have to present as, "I didn't know that existed either," which is not a good look, or, "Yeah, you caught us," which is worse. There's no positive story on this. It just feels like a tax on not knowing trivia about AWS. I think that's what really winds me up about it so much.

Ben: Yeah, I think you're right on about that as well. My misunderstanding about the NAT pricing was that data processing is additive to data transfer.
I expected, when I replaced NAT Gateway with a NAT instance, that I would be substituting data transfer costs for NAT Gateway data processing costs. But in fact, NAT Gateway incurs both data processing and data transfer. NAT instances only incur data transfer costs. And so, this is a big difference between the two solutions.

Not only that, but if you're in the same region, if you're egressing out of your, say, us-east-1 region and talking to another hosted service also within us-east-1—never leaving the AWS network—you don't actually even incur data transfer costs. So, if you're using a NAT Gateway, you're paying data processing.

Corey: To be clear, you do, but it is cross-AZ in most cases, billed at one penny egressing, and on the other side, that hosted service generally pays one penny ingressing as well. Don't feel bad about that one. That was extraordinarily unclear, and the only reason I know the answer is that I got tired of getting stonewalled by people that later turned out not to know the answer, so I ran a series of experiments designed explicitly to find this out.

Ben: Right. As opposed to the five cents to nine cents that is data transfer to the internet. Add that to data processing on a NAT Gateway and you're paying between nine-and-a-half and thirteen-and-a-half cents for every gigabyte egressed. And this is a phenomenal cost. At any kind of volume, if you're doing terabytes to petabytes, this becomes a significant portion of your bill. And this is why people hate the NAT Gateway so much.

Corey: I am going to short-circuit an angry comment I can already see coming on this, where people are going to say, "Well, yes. But at multi-petabyte scale, nobody's paying on-demand retail price." And they're right. Most people who are transmitting that kind of data have a specific discount rate applied to what they're doing that varies depending upon usage and use case.

Sure, great.
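Ben's per-gigabyte arithmetic checks out directly. A small sketch using the on-demand list prices quoted in the conversation; actual rates vary by region, usage tier, and negotiated discounts:

```python
NAT_GW_PROCESSING_PER_GB = 0.045  # Managed NAT Gateway data processing fee


def egress_cost(gb, transfer_rate_per_gb, via_nat_gateway):
    """Cost of sending `gb` gigabytes to the internet through NAT.

    A NAT Gateway adds its processing fee on top of data transfer;
    a NAT instance pays data transfer only.
    """
    rate = transfer_rate_per_gb
    if via_nat_gateway:
        rate += NAT_GW_PROCESSING_PER_GB
    return gb * rate
```

At the quoted five-to-nine-cent transfer range, that is the 9.5-to-13.5 cents per gigabyte Ben mentions through a NAT Gateway, versus transfer alone through a NAT instance.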
But I'm more concerned with the people who are sitting around dreaming up ideas for a company where they want to wind up doing some sort of streaming service. I talked to one of those companies very early on in my tenure as a consultant around the billing piece, and they wanted me to check their napkin math because they thought that, at their numbers, when they wound up scaling up, if their projections were right, they were going to be spending $65,000 a minute, and what did they not understand? And the answer was, well, you didn't understand this other thing, so it's going to be more than that, but no, you're directionally correct. So, that idea that started off on a napkin, of course they didn't build it on top of AWS; they went elsewhere.

And last time I checked, they'd raised well over a quarter-billion dollars in funding. So, that's a business that AWS would love to have on a variety of different levels, but they're never going to even be considered, because by the time someone is at scale, they either have built this somewhere else or they went broke trying.

Ben: Yep, absolutely. And we might just make the point there that while you can get discounts on data transfer, you really can't—or it's very rare—to get discounts on data processing for the NAT Gateway. So, any kind of savings you can get on data transfer would apply to a NAT instance solution, saving you four-and-a-half cents per gigabyte inbound and outbound over the NAT Gateway equivalent solution. So, you're paying a lot for the benefit of a fully-managed service there. A very robust, nicely engineered fully-managed service, as we've already acknowledged, but an extremely expensive solution for what it is, which is really just a proxy in the end. It doesn't add any value to you.

Corey: The only way to make that more expensive would be to route it through something like Splunk or whatnot.
And Splunk does an awful lot for what they charge per gigabyte, but it just feels like rent-seeking in some of the worst ways possible. And what I love about this is that you've solved the problem in a way that is open-source, and you have already released it as Terraform code. I think one of the first to-dos on this for someone is going to be: okay, now also make it CloudFormation and also make it CDK so you can drop it in however you want.

And anyone can use this. I think the biggest mistake people might make in glancing at this is, well, I'm looking at the hourly charge for the NAT Gateways, and that's 32-and-a-half bucks a month, and the instances that you recommend are hundreds of dollars a month for the big network-optimized stuff. Yeah, if you care about the hourly rate of either of those two things, this is not for you. That is not the problem it solves. If you're an independent learner annoyed about the $30 charge you got for a Managed NAT Gateway, don't do this. This will only add to your billing concerns.

Where it really shines is once you're at, I would say, probably about ten terabytes a month, give or take, in Managed NAT Gateway data processing; that is where you start to consider this. The breakeven is around six or so, but there is value to not having to think about things. Once you get to that level of spend, though, it's worth devoting a little bit of infrastructure time to something like this.

Ben: Yeah, that's effectively correct. The total cost of running the solution, like, all-in: there are eight Elastic IPs, four NAT Gateways if you're—say you're in four zones; could be less if you're in fewer zones—like, n NAT Gateways, n NAT instances, depending on how many zones you're in, and I think that's about it. And I said right in the documentation, if any of those baseline fees are a material number for your use case, then this is probably not the right solution. Because we're talking about saving thousands of dollars.
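The breakeven Corey and Ben are gesturing at falls out of one division: the solution's fixed monthly overhead against the avoided four-and-a-half-cent processing fee. The $270-a-month overhead figure below is an assumption for illustration only (standby NAT Gateways, NAT instances, and Elastic IPs combined), not AlterNAT's published number:

```python
NAT_GW_PROCESSING_PER_GB = 0.045  # the fee AlterNAT avoids on steady-state traffic


def breakeven_gb_per_month(fixed_monthly_overhead):
    """Monthly NAT-processed gigabytes above which the solution's extra
    standing costs are repaid by avoided data processing fees."""
    return fixed_monthly_overhead / NAT_GW_PROCESSING_PER_GB
```

With that assumed overhead, the breakeven lands near six terabytes a month, which is consistent with the ballpark given in the conversation.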
Any of these small numbers for NAT Gateway hourly costs, NAT instance hourly costs, that shouldn't be a factor, basically.

Corey: Yeah, it's like when I used to worry about costing my customers a few tens of dollars in Cost Explorer or CloudWatch or request fees against S3 for their Cost and Usage Reports. It's yeah, that does actually have a cost, there's no real way around it, but look at the savings they're realizing by going through that. Yeah, they're not going to come back and complain about their five-figure consulting engagement costing an additional $25 in AWS charges and then lowering it by a third. So, there's definitely a difference as far as how those things tend to be perceived. But it's easy to miss the big stuff when chasing after the little stuff like that.

This is part of the problem I have with an awful lot of cost tooling out there. They completely ignore cost components like this and focus only on the things that are easy to query via API, of, oh, we're going to cost-optimize your Kubernetes cluster, when they think about compute and RAM. And, okay, that's great, but you're completely ignoring all the data transfer because there's still no great way to get at that programmatically. And it really is missing the forest for the trees.

Ben: I think this is key to any cost reduction project or program that you're undertaking. When you look at a bill, look for the biggest spend items first and work your way down from there, just because of the impact you can have. And that's exactly what I did in this project. I saw that ‘EC2 Other' slash NAT Gateway was the big item and I started brainstorming ways that we could go about addressing that. And now that we've reduced this cost to effectively… nothing, extremely low compared to what it was, I have my next targets in mind: we have other new line items on our bill that we can start optimizing.
But in any cost project, start with the big things.

Corey: You have come a long way around to answer a question I get asked a lot, which is, “How do I become a cloud economist?” And my answer is, you don't. It's something that happens to you. And it appears to be happening to you, too. My favorite part about the solution that you built, incidentally, is that it is being released under the auspices of your employer, Chime Financial, which is immune to being acquired by Amazon just to kill this thing and shut it up. Because Amazon already has something shitty called Chime. They don't need to wind up launching something else or acquiring something else and ruining it because they have a Slack competitor of sorts called Amazon Chime. There's no way they could acquire you; [unintelligible 00:27:45] going to get lost in the hallways.

Ben: Well, I have confidence that Chime will be a good steward of the project. Chime's goal and mission as a company is to help everyone achieve financial peace of mind and we take that really seriously. We even apply it to ourselves, and that was kind of the impetus behind developing this in the first place. You mentioned earlier we have Terraform support already, and you're exactly right. I'd love to have CDK, CloudFormation, and Pulumi support, and other kinds of contributions are more than welcome from the community. So, if anybody feels like participating, if they see a feature that's missing, let's make this project the best that it can be. I suspect we can save many companies hundreds of thousands or millions of dollars. And this really feels like the right direction to go in.

Corey: This is easily a multi-billion dollar savings opportunity, globally.

Ben: That's huge. I would be flabbergasted if that was the outcome of this.

Corey: The hardest part is reaching these people and getting them on board with the idea of handling this.
And again, I think there's a lot of opportunity for the project to evolve in the sense of different settings depending upon risk tolerance. I can easily see a scenario where, in the event of a disruption to the NAT instance, it fails over to the Managed NAT Gateway, but fail-back becomes manual so you don't have a flapping route table back and forth or a [hold 00:29:05] downtime or something like that. Because again, in that scenario, the failure mode is just, well, you're paying four-and-a-half cents per gigabyte for a while until you wind up figuring out what's going on, as opposed to the failure mode of you wind up disrupting connections on an ongoing basis, and for some workloads, that's not tenable. This is absolutely, for the common case, the right path forward.

Ben: Absolutely. I think it's an enterprise-grade solution, and the more knobs and dials that we add to tweak to make it more robust or adaptable to different kinds of use cases, the better. The best outcome here would actually be that the entire solution becomes irrelevant because AWS fixes the NAT Gateway pricing. If that happens, I will consider the project a great success.

Corey: I will be doing backflips like you wouldn't believe. I would sing their praises day in, day out. I'm not saying reduce it to nothing, even. I'm not saying it adds no value. I would change the way that it's priced because honestly, the fact that I can run an EC2 instance and be charged $0 on a per-gigabyte basis, yeah, I would pay a premium on an hourly charge based upon traffic volumes, but don't meter per gigabyte. That's where it breaks down.

Ben: Absolutely. And why is it additive to data transfer, also? Like, I remember first starting to use VPC when it was launched and reading about the NAT instance requirement and thinking, “Wait a minute. I have to pay this extra management and hourly fee just so my private hosts could reach the internet?
That seems kind of janky.” And Amazon established a norm here, because Azure and GCP both have their own equivalent of this now. This is a business choice. This is not a technical choice. They could just run this under the hood and not charge anybody for it, or build in the cost, and it wouldn't be this thing we have to think about.

Corey: I almost hate to say it, but Oracle Cloud does, for free.

Ben: Do they?

Corey: It can be done. This is a business decision. It is not a technical capability issue where, well, it does incur cost to run these things. I understand that and I'm not asking for things for free. I very rarely say that this is overpriced when I'm talking about AWS billing issues. I'm talking about it being unpredictable, I'm talking about it being impossible to see in advance, but the fact that it costs too much money is rarely my complaint. In this case, it costs too much money. Make it cost less.

Ben: If I'm not mistaken, GCP's equivalent solution is the exact same price. It's also four-and-a-half cents per gigabyte. So, that shows you that there are business games being played here. Like, Amazon could get ahead and do right by the customer by dropping this to a much more reasonable price.

Corey: I really want to thank you both for taking the time to speak with me and building this glorious, glorious thing. Where can we find it? And where can we find you?

Ben: alternat.cloud is going to be the place to visit. It's on Chime's GitHub, which will be released by the time this podcast comes out. As for me, if you want to connect, I'm on Twitter. @iamthewhaley is my handle. And of course, I'm on LinkedIn.

Corey: Links to all of that will be in the podcast notes. Ben, thank you so much for your time and your hard work.

Ben: This was fun. Thanks, Corey.

Corey: Ben Whaley, staff software engineer at Chime Financial, and AWS Community Hero. I'm Cloud Economist Corey Quinn and this is Screaming in the Cloud.
If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice along with an angry rant of a comment that I will charge you not only four-and-a-half cents per word to read, but four-and-a-half cents to reply, because I am experimenting myself with being a rent-seeking schmuck.

Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.

Announcer: This has been a HumblePod production. Stay humble.

Screaming in the Cloud
Computing on the Edge with Macrometa's Chetan Venkatesh


Nov 1, 2022 · 40:29


About Chetan

Chetan Venkatesh is a technology startup veteran focused on distributed data, edge computing, and software products for enterprises and developers. He has 20 years of experience in building primary data storage, databases, and data replication products. Chetan holds a dozen patents in the area of distributed computing and data storage. Chetan is the CEO and Co-Founder of Macrometa – a Global Data Network featuring a Global Data Mesh, Edge Compute, and In-Region Data Protection. Macrometa helps enterprise developers build real-time apps and APIs in minutes – not months.

Links Referenced:
Macrometa: https://www.macrometa.com
Macrometa Developer Week: https://www.macrometa.com/developer-week

Transcript

Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.

Corey: Forget everything you know about SSH and try Tailscale. Imagine if you didn't need to manage PKI or rotate SSH keys every time someone leaves. That'd be pretty sweet, wouldn't it? With Tailscale SSH, you can do exactly that. Tailscale gives each server and user device a node key to connect to its VPN, and it uses the same node key to authorize and authenticate SSH. Basically, you're SSHing the same way you manage access to your app. What's the benefit here? Built-in key rotation, permissions as code, connectivity between any two devices, reduced latency, and there's a lot more, but there's a time limit here. You can also ask users to reauthenticate for that extra bit of security. Sounds expensive? Nope, I wish it were. Tailscale is completely free for personal use on up to 20 devices. To learn more, visit snark.cloud/tailscale. Again, that's snark.cloud/tailscale.

Corey: Managing shards.
Maintenance windows. Overprovisioning. ElastiCache bills. I know, I know. It's a spooky season and you're already shaking. It's time for caching to be simpler. Momento Serverless Cache lets you forget the backend to focus on good code and great user experiences. With true autoscaling and a pay-per-use pricing model, it makes caching easy. No matter your cloud provider, get going for free at gomomento.co/screaming. That's GO M-O-M-E-N-T-O dot co slash screaming.

Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. Today, this promoted guest episode is brought to us basically so I can ask a question that has been eating at me for a little while. That question is, what is the edge? Because I have a lot of cynical, sarcastic answers to it, but that doesn't really help understanding. My guest today is Chetan Venkatesh, CEO and co-founder at Macrometa. Chetan, thank you for joining me.

Chetan: It's my pleasure, Corey. You're one of my heroes. I think I've told you this before, so I am absolutely delighted to be here.

Corey: Well, thank you. We all need people to sit on the curb and clap as we go by and feel like giant frauds in the process. So let's start with the easy question that sets up the rest of it. Namely, what is Macrometa, and what puts you in a position to be able to speak at all, let alone authoritatively, on what the edge might be?

Chetan: I'll answer the second part of your question first, which is, you know, what gives me the authority to even talk about this? Well, for one, I've been trying to solve the same problem for 20 years now, which is build distributed systems that work really fast and can answer questions about data in milliseconds. And my journey's sort of been like the spiral staircase journey, you know, I keep going around in circles, but the view just keeps getting better every time I do one of these things.
So I'm on my fourth startup doing distributed data infrastructure, and this time really focused on trying to provide a platform that's the antithesis of the cloud. It's kind of like taking the cloud and flipping it on its head, because instead of having a single-region application where all your stuff runs in one place, on us-west-1 or us-east-1, what if your apps could run everywhere, like, they could run in hundreds and hundreds of cities around the world, much closer to where your users and devices and, most importantly, where interesting things in the real world are happening? And so we started Macrometa about five years back to build a new kind of distributed cloud—let's call it the edge—that kind of looks like a CDN, a Content Delivery Network, but really brings very sophisticated platform-level primitives for developers to build applications in a distributed way around primitives for compute, primitives for data, but also some very interesting things that you just can't do in the cloud anymore. So that's Macrometa. And we're doing something with edge computing, which is a big buzzword these days, but I'm sure you'll ask me about that.

Corey: It seems to be. Generally speaking, when I look around and companies are talking about edge, it feels almost like it is a redefining of what they already do to use a term that is currently trending and deep in the hype world.

Chetan: Yeah. You know, I think humans just being biologically social beings just tend to be herd-like, and so when we see a new trend, we like to slap it on everything we have. We did that 15 years back with cloud, if you remember, you know? Everybody was very busy trying to stick the cloud label on everything that was on-prem. Edge is sort of having that edge-washing moment right now. But I define edge very specifically; it's very different from the cloud.
You know, where the cloud is defined by centralization, i.e., you've got a giant hyperscale data center somewhere far, far away, where typically electricity, real estate, and those things are reasonably cheap, i.e., not in urban centers, where those things tend to be expensive. You know, you have platforms where you run things at scale, it's sort of a your-mess-for-less business in the cloud and somebody else manages that for you. The edge is actually defined by location. And there are three types of edges.

The first edge is the CDN edge, which is historically where we've been trying to make things faster with the internet and make the internet scale. So Akamai came about, about 20 years back, and created this thing called the CDN that allowed the web to scale. And that was the first killer app for edge, actually. So that's the first location that defines the edge, where a lot of the peering happens between different network providers and the on-ramp around the cloud happens.

The second edge is the telecom edge. That's actually right next to you in terms of, you know, the logical network topology, because every time you do something on your computer, it goes through that telecom layer. And now we have the ability to actually run web services, applications, data, directly from that telecom layer.

And then the third edge is—sort of, people have been familiar with this for 30 years. The third edge is your device, just your mobile phone. It's your internet gateway and, you know, things that you carry around in your pocket or sit on your desk, where you have some compute power, but it's very restricted and it only deals with things that are interesting or important to you as a person, not in a broad range. So those are sort of the three things. And it's not the cloud.
And these three things are now becoming important as a place for you to build and run enterprise apps.

Corey: Something that I think is often overlooked here—and this is sort of a natural consequence of the cloud's own success and the joy that we live in a system that we do where companies are required to always grow and expand and find new markets—historically, for example, when I went to AWS re:Invent, which is a cloud service carnival in the desert that no one in the right mind should ever want to attend but somehow we keep doing, it used to be that, oh, these announcements are generally all aligned with people like me, where I have specific problems and they look a lot like what they're talking about on stage. And now they're talking about things that, from that perspective, seem like Looney Tunes. Like, I'm trying to build Twitter for Pets or something close to it, and I don't understand why there's so much talk about things like industrial IoT and, “Machine learning,” quote-unquote, and other things that just do not seem to align with that. I'm trying to build a web service, like it says in the name of the company; what gives?

And part of that, I think, is that it's difficult to remember, for most of us—especially me—that what they're coming out with is not your shopping list. Every service is for someone, not every service is for everyone, so figuring out what it is that they're talking about and what those workloads look like is something that I think is getting lost in translation. And in our defense—collective defense—Amazon is not the best at telling stories to realize that, oh, this is not me they're talking to; I'm going to opt out of this particular thing. You figure it out by getting it wrong first. Does that align with how you see the market going?

Chetan: I think so. You know, I think of Amazon Web Services, or even Google, or Azure as sort of Costco and, you know, Sam's Wholesale Club or whatever, right?
They cater to a very broad audience and they sell a lot of stuff in bulk and cheap. And you know, so it's sort of a lowest-common-denominator type of a model. And so emerging applications, and especially emerging needs that enterprises have, don't necessarily get solved in the cloud. You've got to go and build it up yourself on sort of the crude primitives that they provide. So okay, go use your bare basic EC2, your S3, and build your own edgy, or whatever, you know, cutting-edge thing you want to build over there. And if enough people are doing it, I'm sure Amazon and Google start to pay interest and, you know, develop something that makes it easier. So you know, I agree with you, they're not the best at this sort of a thing.

The edge is a phenomenon that's orthogonal, and diametrically opposite, to the architecture of the cloud and the economics of the cloud. And we do centralization in the cloud in a big way. Everything is in one place; we make giant piles of data in one database or data warehouse, slice and dice it, and almost all our computer science is great at doing things in a centralized way. But when you take data and chop it into 50 copies and keep it in 50 different places on Earth, and you have this thing called the internet or the wide area network in the middle, trying to keep all those copies in sync is a nightmare. So you start to deal with some very basic computer science problems like distributed state and how do you build applications that have a consistent view of that distributed state?
So you know, there have been attempts to solve these problems for 15, 18 years, but none of those attempts have really cracked the intersection of three things: a way for programmers to do this in a way that doesn't blow their heads with complexity, a way to do this cheaply and effectively enough where you can build real-world applications that serve billions of users concurrently at a cost point that actually is economical and makes sense, and third, a way to do this with adequate levels of performance where you don't die waiting for the spinning wheel on your screen to go away.

So these are the three problems with edge. And as I said, you know, me and my team, we've been focused on this for a very long while. And me and my co-founder have come from this world and we created a platform very uniquely designed to solve these three problems: the problems of complexity for programmers to build in a distributed environment like this where data sits in hundreds of places around the world and you need a consistent view of that data, being able to operate and modify and replicate that data with consistency guarantees, and then a third one, being able to do that at high levels of performance, which translates to what we call ultra-low latency, which is human perception. The threshold of human perception, visually, is about 70 milliseconds. Our finest athletes, the best esports players, are about 70 to 80 milliseconds in their twitch, in their ability to twitch when something happens on the screen. The average human is about 100 to 110 milliseconds. So in a second, we can maybe do seven things at rapid rates. You know, that's how fast our brain can process it. Anything that falls below 100 milliseconds—especially if it falls into 50 to 70 milliseconds—appears instantaneous to the human mind and we experience it as magic.
And so where edge computing and where my platform comes in is that it literally puts data and applications within 50 milliseconds of 90% of humans and devices on Earth, and allows now a whole new set of applications where latency and location and the ability to control those things with really fine-grained capability matters. And we can talk a little more about what those apps are in a bit.

Corey: And I think that's probably an interesting place to dive into at the moment, because whenever we talk about the idea of new ways of building things that are aimed at decentralization, first, people at this point automatically have a bit of an aversion to, “Wait, are you talking about some of the Web3 nonsense?” It's one of those look around the poker table and see if you can spot the sucker, and if you can't, it's you. Because there are interesting aspects to that entire market, let's be clear, but it also seems to be occluded by so much of the grift and nonsense and spam and the rest that, again, sort of characterized the early internet as well. The idea, though, of decentralizing out of the cloud is deeply compelling just to anyone who's really ever had to deal with the egress charges, or even the data transfer charges inside of one of the cloud providers. The counterpoint is it feels that historically, you either get to pay the tax and go all-in on a cloud provider and get all the higher-level niceties, or otherwise, you wind up deciding you're going to have to more or less go back to physical data centers, give or take, and other than the very baseline primitives that you get to work with of VMs and block storage and maybe a load balancer, you're building it all yourself from scratch. It seems like you're positioning this as setting up for a third option. I'd be very interested to hear it.

Chetan: Yeah. And a quick comment on decentralization: good; not so sure about the Web3 pieces around it. We tend to talk about computer science and not the ideology of distributing data.
There are political reasons, there are ideological reasons around data and sovereignty and individual human rights, and things like that. There are people far smarter than me who should explain that. I fall personally into the Nicholas Weaver school of skepticism about Web3 and blockchain and those types of things. And for readers who are not familiar with Nicholas Weaver, please go online. He teaches at UC Berkeley and is just one of the finest minds of our time. And I think he's broken down some very good reasons why we should be skeptical about, sort of, Web3 and, you know, things like that. Anyway, that's a digression.

Coming back to what we're talking about, yes, it is a new paradigm, but that's the challenge, which is I don't want to introduce a new paradigm. I want to provide a continuum. So what we've built is a platform that looks and feels very much like Lambdas, and a poly-model database. I hate the word multi. It's a pretty dumb word, so I've started to substitute ‘multi' with ‘poly' everywhere, wherever I can find it. So it's not multi-cloud; it's poly-cloud. And it's not multi-model; it's poly-model. Because what we want is a world where developers have the ability to use the best paradigm for solving problems. And it turns out when we build applications that deal with data, data doesn't just come in one form, it comes in many different forms, it's polymorphic, and so you need a data platform that's also, you know, polyglot and poly-model to be able to handle that. So that's one part of the problem, which is, you know, we're trying to provide a platform that provides continuity by looking like a key-value store like Redis. It looks like a document database—

Corey: Or the best database in the world, Route 53 TXT records. But please, keep going.

Chetan: Well, we've got that too, so [laugh] you know? And then we've got a streaming graph engine built into it that kind of looks and behaves like a graph database, like Neo4j, for example.
And, you know, it's got columnar capabilities as well. So it's sort of a really interesting data platform that is not open-source; it's proprietary because it's designed to solve these problems of being able to distribute data, put it in hundreds of locations, keep it all in sync, but it looks like a conventional NoSQL database. And it speaks PostgreSQL, so if you know PostgreSQL, you can program it, you know, pretty easily.

What it's also doing is taking away the responsibility for engineers and developers to understand how to deal with very arcane problems like conflict resolution in data. I made a change in Mumbai; you made a change in Tokyo; who wins? Our systems in the cloud—you know, DynamoDB, and things like that—have very crude answers for this, something called last-writer-wins. We've done a lot of work to build a protocol that brings you ACID-like consistency in these types of problems and makes it easy to reason about state change when you've got an application that's potentially running in 100 locations and each of those places is modifying the same record, for example.

And then the second part of it is it's a converged platform. So it doesn't just provide data; it provides a compute layer that's deeply integrated directly with the data layer itself. So think of it as Lambdas running, like, stored procedures inside the database. That's really what it is. We've built a very, very specialized compute engine that exposes containers and functions as stored procedures directly on the database. And so they run inside the context of the database, and so you can build apps in Python, Go, your favorite language; it compiles down into a [unintelligible 00:15:02] kernel that actually runs inside the database among all these different polyglot interfaces that we have.

And the third thing that we do is we provide an ability for you to have very fine-grained control on your data.
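To make the "Mumbai vs. Tokyo" conflict problem above concrete, here is a minimal sketch of last-writer-wins merging, the crude strategy being criticized. The record shape and timestamps are invented for illustration; the point is that one replica's concurrent update is silently discarded rather than reconciled.

```python
# Last-writer-wins (LWW): each replica stamps its write with a timestamp;
# on merge, the highest timestamp wins and every other concurrent write
# is simply dropped.

def lww_merge(*versions):
    # versions: (timestamp, value) pairs from different replicas
    return max(versions, key=lambda v: v[0])

# Two replicas update the same record concurrently (hypothetical data):
mumbai = (1700000000.001, {"balance": 120})
tokyo  = (1700000000.002, {"balance": 95})

winner = lww_merge(mumbai, tokyo)
print(winner[1])  # Tokyo's write wins purely on clock order;
                  # Mumbai's update is lost, not reconciled
```

That silent data loss under concurrency is exactly why stronger approaches, such as CRDTs or consensus-backed replication protocols, exist for geo-distributed state.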
Because today, data's become a political tool; it's become something that nation-states care a lot about.

Corey: Oh, do they ever.

Chetan: Exactly. And [unintelligible 00:15:24] regulated. So here's the problem. You're an enterprise architect and your application is going to be consumed in 15 countries; there are 13 different frameworks to deal with. What do you do? Well, you spin up 13 different versions, one for each country, and, you know, build 13 different teams, and have 13 zero-day attacks and all that kind of craziness, right? Well, data protection is actually one of the most important parts of the edge because, with something like Macrometa, you can build an app once, and we'll provide all the necessary localization for any region: processing, data protection with things like tokenization of data so you can exfiltrate data securely without violating potentially PII-sensitive data exfiltration laws within countries, things like that. It's solving some really hard problems by providing an opinionated platform that does these three things.

And I'll summarize it thus, Corey, and we can kind of dig into each piece: our platform is called the Global Data Network. It's not a global database; it's a global data network. It looks like a frickin' database, but it's actually a global network available in 175 cities around the world.

Corey: The challenge, of course, is where does the data actually live at rest, and—this is why people care about—well, there are two reasons people care about that; one is the data residency locality stuff, which has always, honestly for me, felt a little bit like a bit of a cloud provider shakedown. Yeah, build a data center here or you don't get any of the business of anything that falls under our regulation. The other is, what does the egress cost of that look like?
Because yeah, I can build a whole multicenter data store on top of AWS, for example, but minimum, we're talking two cents a gigabyte of transfer, even within a region in some cases, and many times that externally.

Chetan: Yeah, that's the real shakedown: the egress costs [laugh] more than the other example that you talked about over there. But it's a reality of how cloud pricing works and things like that. What we have built is a network that is completely independent of the cloud providers. We're built on top of five different service providers. Some of them are cloud providers, some of them are telecom providers, some of them are CDNs. And so we're building our global data network on top of routes and capacity provided by transit providers who have different economics than the cloud providers do.

So our cost for egress falls somewhere between two and five cents, for example, depending on which edge locations, which countries, and things that you're going to use over there. We've got a pretty generous egress fee where, you know, for certain thresholds, there's no egress charge at all, but over certain thresholds, we start to charge between two to five cents. But even if you were to take it at the higher end of that spectrum, five cents per gigabyte for transfer, the amount of value our platform brings in architecture and reduction in complexity and the ability to build apps that are, frankly, mind-boggling—one of my customers is a SaaS company in marketing that uses us to inject offers while people are on their website, you know, browsing. Literally, you hit their website, you do a few things, and then boom, there's a customized offer for them. In banking, that's used, for example, when you're making your minimum payments on your credit card, but you have a good payment history and you've got a decent credit score, well, let's give you an offer for a short-term loan, for example.
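The pricing structure being described is easy to model. The sketch below uses the rates mentioned in the conversation (a roughly two-cent flat cloud transfer rate versus a tiered two-to-five-cent edge rate with a free threshold), but the exact tier boundary and free allowance are invented for illustration, not Macrometa's actual price card.

```python
# Illustrative egress cost comparison; tier boundaries are assumptions.

def cloud_egress(gb, per_gb=0.02):
    # Flat-rate model, e.g. ~2 cents/GB for intra-provider transfer
    return gb * per_gb

def edge_egress(gb, free_gb=100, per_gb=0.05):
    # Free up to an assumed threshold, then a per-GB rate at the high
    # end of the quoted 2-5 cent range
    return max(0.0, gb - free_gb) * per_gb

for gb in (50, 500, 5000):
    print(f"{gb:>5} GB  cloud=${cloud_egress(gb):9.2f}  edge=${edge_egress(gb):9.2f}")
```

Even at the five-cent worst case, the argument in the episode is that this line item stays small relative to the value of running the workload in-region, not that the raw rate beats the cloud's.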
So those types of new applications, you know, are really at this intersection where you need low latency, you need in-region processing, and you also need to comply with data regulation. So when you're building a high-value, revenue-generating app like that, egress cost, even at five cents, tends to be very, very cheap and the smallest part of, you know, the complexity of building them.

Corey: One of the things that I think we see a lot of is that the tone of this industry is set by the big players, and they have done a reasonable job, by and large, of making anything that isn't running in their blessed environments, let me be direct, sound kind of shitty, where it's like, “Oh, do you want to be smart and run things in AWS?”—or GCP? Or Azure, I guess—“Or do you want to be foolish and try and build it yourself out of popsicle sticks and twine?” And, yeah, on some level, if I'm trying to treat everything like it's AWS and run a crappy analog version of DynamoDB, for example, I'm not going to have a great experience, but if I also start from a perspective of not using things that are higher up the stack offerings, that experience starts to look a lot more reasonable as we start expanding out. But it still does present to a lot of us as, well, we're just going to run things in VMs somewhere and treat them just like we did back in 2005. What's changed in that perspective?

Chetan: Yeah, you know, I can't talk for others, but for us, we provide a high-level Platform-as-a-Service, and that platform, the global data network, has three pieces to it. The first piece is—and none of this will translate into anything that AWS or GCP has, because this is the edge, Corey; it's completely different, right? So the global data network that we have is composed of three technology components. The first one is something that we call the global data mesh. And this is Pub/Sub and event processing on steroids.
We have the ability to connect data sources across all kinds of boundaries; you've got some data in Germany and you've got some data in New York. How do you put these things together and get them streaming so that you can start to do interesting things with correlating this data, for example? And you might have to get across not just physical boundaries, like, they're sitting in different systems in different data centers; they might be logical boundaries, like, hey, I need to collaborate with data from my supply chain partner and we need to be able to do something that's dynamic in real-time, you know, to solve a business problem. So the global data mesh is a way to very quickly connect data wherever it might be—in legacy systems, in flat files, in streaming databases, in data warehouses, what have you—you know, we have 500-plus types of connectors—but most importantly, it's not just getting the data streaming, it's then turning it into an API and making that data fungible. Because the minute you put an API on it and it becomes fungible, that data has actually got a lot of value. And so the data mesh is a way to very quickly connect things up and put an API on it. And that API can now be consumed by front-ends, it can be consumed by other microservices, things like that. Which brings me to the second piece, which is edge compute. So we've built a compute runtime that is Docker-compatible, so it runs containers; it's also Lambda-compatible, so it runs functions. Let me rephrase that; it's not Lambda-compatible, it's Lambda-like. So no, you can't just take your Lambda, dump it on us, and expect it to work. You have to do some things to make it work on us. Corey: But so many of those things are so deeply integrated to the ecosystem that they're operating within, and— Chetan: Yeah. Corey: That, on the one hand, is presented by cloud providers as, “Oh, yes. This shows how wonderful these things are.” In practice, talk to customers.
“Yeah, we're using it as spackle between the different cloud services that don't talk to one another despite being made by the same company.” Chetan: [laugh] right. Corey: It's fun. Chetan: Yeah. So the second piece is edge compute, which allows you to build microservices that are stateful (i.e., they have data that they interact with locally) and schedule them, along with the data, on our network of 175 regions around the world. So you can build distributed applications now. Now, your microservice back-end for your banking application or for your HR SaaS application or e-commerce application is not running in us-east-1 in Virginia; it's running literally in 15, 18, 25 cities where your end-users are, potentially. And to take an industrial IoT case, for example, you might be ingesting data from the electricity grid in 15, 18 different cities around the world; you can do all of that locally now. So that's what edge compute does: it flips the cloud model around, because instead of sending data to where the compute is in the cloud, you're actually bringing compute to where the data is originating, or where the data is being consumed, such as through a mobile app. So that's the second piece. And the third piece is global data protection, which is: hey, now I've got a distributed infrastructure; how do I comply with all the different privacy and regulatory frameworks that are out there? How do I keep data secure in each region? How do I potentially share data between regions in such a way that, you know, I don't break the model of compliance globally and create a billion-dollar headache for my CIO and CEO and CFO, you know? So that's the third piece of capabilities that this provides. All of this is presented as a set of serverless APIs. So you simply plug these APIs into your existing applications. Some of your applications work great in the cloud. Maybe there are just parts of that app that should be on our edge.
And that's usually where most customers start; they take a single web service or two that's not doing so great in the cloud because it's too far away, it has data sensitivity, location sensitivity, time sensitivity, and so they use us as a way to just deal with that on the edge. And there are other applications where it's completely what I call edge-native, i.e., with no dependency on the cloud. It comes and runs completely distributed across our network, consumes primarily the edge's infrastructure, and maybe just sends some data back to the cloud for long-term storage or long-term analytics. Corey: And ingest does remain free. The long-term analytics, of course, means that once that data is there, good luck convincing a customer to move it because that gets really expensive. Chetan: Exactly, exactly. It's a speciation—as I like to say—of the cloud into a fast tier where interactions happen, i.e., the edge. So systems of record are still in the cloud; we still have our transactional systems over there, our databases, data warehouses. And those are great for historical types of data, as you just mentioned, but for things that are operational in nature, that are interactive in nature, where you really need to deal with them because they're time-sensitive, they're depleting in value in seconds or milliseconds, they're location-sensitive, there's a lot of noise in the data and you need to get to just those bits of data that actually matter and throw the rest away, for example—which is what you do with a lot of telemetry in cybersecurity, for example, right—those are all the things that require a new kind of platform: not a system of record, but a system of interaction, and that's what the global data network is, the GDN. And these three primitives, the data mesh, edge compute, and data protection, are the way that our APIs are shaped to help our enterprise customers solve these problems.
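The data-mesh primitive described here (connect a source, get it streaming, put an API on it) can be caricatured in a few lines. This is a toy in-memory sketch of the idea, not Macrometa's actual API; the class and method names are invented for illustration:

```python
# Toy sketch of the data-mesh idea: connect sources, stream their
# records through a pub/sub core, and expose the merged stream behind
# a simple query API. In-memory only; not Macrometa's actual API.
from collections import defaultdict

class DataMesh:
    def __init__(self):
        self.records = defaultdict(list)      # topic -> ingested records
        self.subscribers = defaultdict(list)  # topic -> callbacks

    def connect_source(self, topic, records):
        """Ingest records from any source (flat file, DB, stream) into a topic."""
        for record in records:
            self.records[topic].append(record)
            for callback in self.subscribers[topic]:
                callback(record)  # push to live subscribers as data arrives

    def subscribe(self, topic, callback):
        self.subscribers[topic].append(callback)

    def query(self, topic, predicate):
        """The 'API on top of the stream': filter a topic's records."""
        return [r for r in self.records[topic] if predicate(r)]

mesh = DataMesh()
seen = []
mesh.subscribe("orders", seen.append)
mesh.connect_source("orders", [{"city": "Berlin", "total": 42},
                               {"city": "New York", "total": 7}])
big = mesh.query("orders", lambda r: r["total"] > 10)
print(big)  # -> [{'city': 'Berlin', 'total': 42}]
```

The real system adds connectors, durability, and geo-distribution, but the shape is the same: streams in, a fungible API out.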
So, to put it another way: imagine ten years from now what DynamoDB and global tables, with a really fast Lambda, and Kinesis with event processing built directly in, might be like. That's Macrometa today, available in 175 cities. Corey: This episode is brought to us in part by our friends at Datadog. Datadog is a SaaS monitoring and security platform that enables full-stack observability for modern infrastructure and applications at every scale. Datadog enables teams to see everything: dashboarding, alerting, application performance monitoring, infrastructure monitoring, UX monitoring, security monitoring, dog logos, and log management, in one tightly integrated platform. With 600-plus out-of-the-box integrations with technologies including all major cloud providers, databases, and web servers, Datadog allows you to aggregate all your data into one platform for seamless correlation, allowing teams to troubleshoot and collaborate together in one place, preventing downtime and enhancing performance and reliability. Get started with a free 14-day trial by visiting datadoghq.com/screaminginthecloud, and get a free t-shirt after installing the agent. Corey: I think it's also worth pointing out that it's easy for me to fall into a trap that I wonder if some of our listeners do as well, which is, I live in, basically, downtown San Francisco. I have gigabit internet connectivity here, to the point where when it goes out, it is suspicious and more than a little bit frightening because my ISP—Sonic.net—is amazing and deserves every bit of praise that you never hear any ISP ever get. But when I travel, it's a very different experience.
When I go to, oh, I don't know, the conference center at re:Invent last year and find that the internet is patchy at best, or downtown San Francisco on Verizon today, I discover that the internet is almost non-existent, and suddenly applications that I had grown accustomed to just working suddenly didn't. And there are a lot more people who live far away from these data center regions and tier-one backbones than who live close to them. So I think that there's a lot of mistaken ideas around exactly what the lower-bandwidth experience of the internet is today. And that is something that feels inadvertently classist, if that makes sense. Are these architectures geographically bigoted? Chetan: Yeah. No, I think those two points are very well articulated. I wish I could articulate it that well. But yes, if you can afford 5G, some of those things get better. But again, 5G is not everywhere yet. It will be, but 5G can in many ways democratize at least one part of it, which is to provide an overlay network at the edge where, if you left home and you switched networks onto a wireless one, you can still get the same quality of service that you're used to getting from Sonic, for example. So I think it can solve some of those things in the future. But the second part of it—what did you call it? What bigoted? Corey: Geographically bigoted. And again, that's maybe a bit of a strong term, but it's easy to forget that you can't get around the speed of light. I would say that the most poignant example of that I had was when I was—in the before times—giving a keynote in Australia. So ah, I know what I'll do, I'll spin up an EC2 instance for development purposes—because that's how I do my development—in Australia. And then I would just pay my provider for cellular access for my iPad and that was great. And I found the internet was slow as molasses for everything I did. Like, how do people even live here? Well, turns out that my provider would backhaul traffic to the United States.
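The penalty of that backhaul is easy to estimate from first principles. A rough sketch: the 12,000 km Sydney-to-US-West distance is an assumed great-circle figure, not a measurement, and the usual rule of thumb is that light in fiber travels at about two-thirds of c:

```python
# Rough propagation-delay estimate for the backhaul path described:
# the request crosses the Pacific to the US, then back out to a server
# in Australia, and the response retraces the loop. Distances are
# assumptions; real paths add routing and queuing delay on top.
SPEED_OF_LIGHT_KM_S = 299_792
FIBER_FACTOR = 0.66  # light in glass travels at roughly two-thirds of c

def one_way_ms(distance_km: float) -> float:
    """Propagation delay in milliseconds over a fiber path of distance_km."""
    return distance_km / (SPEED_OF_LIGHT_KM_S * FIBER_FACTOR) * 1000

SYDNEY_TO_US_WEST_KM = 12_000  # assumed great-circle distance

# Direct round trip: two Pacific crossings.
direct_rtt = 2 * one_way_ms(SYDNEY_TO_US_WEST_KM)

# Backhauled round trip: four crossings (out to the US and back,
# for both the request and the response).
backhaul_rtt = 4 * one_way_ms(SYDNEY_TO_US_WEST_KM)

print(f"direct round trip:     ~{direct_rtt:.0f} ms")   # ~121 ms
print(f"backhauled round trip: ~{backhaul_rtt:.0f} ms")  # ~243 ms
```

Even before any real-world routing overhead, the backhaul doubles the round trip, which is why "laps around the world" make an interactive session feel broken.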
So to log into my session, I would wind up having to connect with a local provider, backhaul to the US, then connect back out from there to Australia across the entire Pacific Ocean, talk to the server, get the response, and it would follow that return path. Yeah, it turns out that doing laps around the world is not the most efficient way of transferring any data whatsoever, let alone in sizable amounts. Chetan: And that's why we decided to call our platform the global data network, Corey. In fact, it's really built for a very simple reason: we have our own network underneath all of this, and we stop this whole ping-pong effect of data going around and help create deterministic guarantees around latency, around location, around performance. We're trying to democratize latency and these types of problems in a way that programmers shouldn't have to worry about all this stuff. You write your code, you push publish, it runs on a network, and it all gets there with a guarantee that 95% of all your requests will happen within 50 milliseconds round-trip time, from any device, you know, in these population centers around the world. So yeah, it's a big deal. It's sort of one of our je ne sais quoi pieces in our mission and charter, which is to just democratize latency and access, and sort of get away from this geographical nonsense of, you know, how networks work, where the network will dynamically switch topology and just make everything slow in a very non-deterministic way. Corey: One last topic that I want to ask you about—because I'm near certain, given your position, you will have an opinion on this—what's your take on, I guess, the carbon footprint of clouds these days? Because a lot of people have been talking about it; there has been a lot of noise made about it, justifiably so. I'm curious to get your take. Chetan: Yeah, you know, it feels like we're in the '30s and the '40s of the carbon movement when it comes to clouds today, right?
Maybe there's some early awareness of the problem, but you know, frankly, there's very little we can do other than just sort of put a wet finger in the air, compute some carbon offset, and plant some trees. I think these are good building blocks; they're not necessarily the best ways to solve this problem, ultimately. But one of the things I care deeply about—and you know, my company cares a lot about—is helping make developers more aware of what kind of carbon footprint their code tangibly has on the environment. And so we've started two things inside the company. We've started a foundation that we call the Carbon Conscious Computing Consortium—the four C's. We're going to announce that publicly next year; we're going to invite folks to come and join us and be a part of it. The second thing that we're doing is we're building a completely open-source, carbon-conscious computing platform that is built on real data that we're collecting about, to start with, how Macrometa's platform emits carbon in response to different types of things you build on it. So for example, you wrote a query that hits our database and queries, you know, I don't know, 20 billion objects inside of our database. It'll tell you exactly how many micrograms or how many milligrams of carbon—it's an estimate; not exactly. I've got to learn to throttle myself down. It's an estimate, you know; you can't really measure these things exactly because the cost of carbon is different in different places, you know, there are different technologies, et cetera. It gives you a good, decent estimate, something that reliably tells you, “Hey, you know that query that you have over there, that piece of SQL? That's probably going to do this many micrograms of carbon at this scale.” You know, if this query was called a million times every hour, this is how much it costs. A million times a day, this is how much it costs, and things like that.
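The shape of an estimate like that is energy per query multiplied by the local grid's carbon intensity. A hedged sketch in which every number is an illustrative assumption, not Macrometa's actual model:

```python
# Hypothetical per-query carbon estimate: energy consumed by the query
# times the carbon intensity of the grid it runs on. All numbers below
# are illustrative assumptions, not real measurements.
def query_carbon_mg(energy_wh_per_query: float, grid_g_co2_per_kwh: float) -> float:
    """Estimated milligrams of CO2 emitted by a single query."""
    kwh = energy_wh_per_query / 1000   # Wh -> kWh
    grams = kwh * grid_g_co2_per_kwh   # kWh * gCO2/kWh -> grams
    return grams * 1000                # g -> mg

# Assumed: 0.05 Wh per query on a grid emitting 400 gCO2/kWh.
per_query_mg = query_carbon_mg(0.05, 400)

# Scale up as described: one million calls, converted from mg to kg.
per_million_calls_kg = per_query_mg * 1_000_000 / 1_000_000

print(f"per query:         ~{per_query_mg:.0f} mg CO2")   # ~20 mg
print(f"per million calls: ~{per_million_calls_kg:.0f} kg CO2")  # ~20 kg
```

Because carbon intensity differs by region, the same query run in different cities would yield different estimates, which is why the transcript stresses that these are estimates rather than measurements.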
But the most important thing that I feel passionate about is that when we give developers visibility, they do good things. I mean, when we give them good debugging tools, the code gets better, the code gets faster, the code gets more efficient. And Corey, you're in the business of helping people save money: when we give them good visibility into how much their code costs to run, they make the code more efficient. So we're doing the same thing with carbon. We know there's a cost to run your code—whether it's a function, a container, a query, what have you, every operation has a carbon cost. And we're on a mission to measure that and provide accurate tooling directly in our platform so that, along with your debug lines, right, where you've got all these print statements that are spitting up stuff about what's happening there, we can also print out, you know, what it cost in carbon. And you can set budgets. You can basically say, “Hey, I want my application to consume this much carbon.” And down the road, we'll have AI and ML models that will help us optimize your code to be able to fit within those carbon budgets, for example. I'm not a big fan of planting trees as the answer; don't get me wrong, I love planting trees, but we live in California and those trees get burned down. And I was reading this heartbreaking story about how we returned back into the atmosphere a giant amount of carbon because the forest reserve that had been planted, you know, that was capturing carbon, essentially got burned down in a forest fire.
So, you know, we're trying to just basically say, let's try and reduce the amount of carbon that we can potentially create by having better tooling. Corey: That would be amazing, and I think it also requires something that, I guess, acts almost as an exchange, where there's a centralized voice that can make sure that, well, one, the provider is being honest, and two, that you're doing an apples-to-apples comparison and not just discounting a whole lot of negative externalities. Because, yes, we're talking about carbon released into the environment. Okay, great. What about water effects from where your data centers are located? That can have significant climate impact as well. It's about trying to avoid the picking and choosing. It's a hard, hard problem, but I'm unconvinced that there's anything more critical in the entire ecosystem right now to worry about. Chetan: So as a startup, we care very deeply about starting with the carbon part. And I agree, Corey, it's a multi-dimensional problem; there's lots of tentacles. The hydrocarbon industry goes very deeply into all parts of our lives. I'm a startup; what do I know? I can't solve all of those things, but I wanted to start with the philosophy that if we provide developers with the right tooling, they'll have the right incentives then to write better code. And as we open-source more of what we learn and, you know, our tooling, others will do the same. And I think in ten years, we might have better answers. But someone's got to start somewhere, and this is where we'd like to start. Corey: I really want to thank you for taking as much time as you have for going through what you're up to and how you view the world. If people want to learn more, where's the best place to find you? Chetan: Yes, so two things on that front. Go to www.macrometa.com—M-A-C-R-O-M-E-T-A dot com—and that's our website. And you can come and experience the full power of the platform.
We've got a playground where you can come, open an account, and build anything you want for free, and you can try and learn. You just can't run it in production because we've got a giant network, as I said, of 175 cities around the world. But there are tiers available for you to purchase and build and run apps. Like, I think about 80 different customers, some of the biggest ones in the world, some of the biggest telecom customers, retail, e-tail customers, [unintelligible 00:34:28] tiny startups, are building some interesting things on it. And the second thing I want to talk about is developer week at Macrometa, November 7th through 11th of 2022—just a couple of weeks out, or maybe, by the time this recording comes out, a week from now. And we're going to be announcing some really interesting new capabilities, some new features like real-time complex event processing with ultra-low latency, data connectors, and a search feature that allows you to build search directly on top of your applications without needing to spin up a giant Elastic Cloud Search cluster, providing search locally and regionally so that, you know, you can have search running in 25 cities that are instant to search rather than sending all your search requests back to one location. There's all kinds of very cool things happening over there. And we're also announcing a partnership with the original, the OG of the edge, one of the largest, most impressive, interesting CDN players, that has become a partner for us as well. And then we're also announcing some very interesting experimental work where you, as a developer, can build apps directly on the 5G telecom cloud as well. And then you'll hear from some interesting companies that are building apps that are edge-native, that are impossible to build in the cloud, because they take advantage of these three things that we talked about: geography, latency, and data protection, in some very, very powerful ways.
So you'll hear actual customer case studies from real customers in the flesh, not anonymous BS, no marchitecture. It's a week of technical talks by developers, for developers. And so, you know, come and join the fun and let's learn all about the edge together, and let's go build something together that's impossible to do today. Corey: And we will, of course, put links to that in the [show notes 00:36:06]. Thank you so much for being so generous with your time. I appreciate it. Chetan: My pleasure, Corey. Like I said, you're one of my heroes. I've always loved your work. The Snark-as-a-Service is a trillion-dollar market cap company. If you're ever interested in taking that public, I know some investors that I'd happily put you in touch with. But— Corey: Sadly, so many of those investors lack senses of humor. Chetan: [laugh]. That is true. That is true [laugh]. Corey: [laugh]. [sigh]. Chetan: Well, thank you. Thanks again for having me. Corey: Thank you. Chetan Venkatesh, CEO and co-founder at Macrometa. I'm Cloud Economist Corey Quinn and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice, along with an angry and insulting comment about why we should build everything on the cloud provider that you work for, and then attempt to challenge Chetan for the title of Edgelord. Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started. Announcer: This has been a HumblePod production. Stay humble.

Lexman Artificial
Dioxanes, Fiches, Lambdas and Lobscouses

Lexman Artificial

Play Episode Listen Later Aug 11, 2022 4:33


Lexman and Sergey Levine discuss the role that dioxanes play in our everyday lives. They also explore the use of lambda symbols in programming and discuss the difference between lambdas and Penrith lobscouses.

AWS Morning Brief
The Mental Breakdown of Auto-Remediation

AWS Morning Brief

Play Episode Listen Later Jul 27, 2022 5:14


Links: The Nigerian government scores this week's S3 Bucket Negligence Award New Air-Gap Attack Uses SATA Cable as an Antenna to Transfer Radio Signals Automatically block suspicious DNS activity with Amazon GuardDuty and Route 53 Resolver DNS Firewall Use Security Hub custom actions to remediate S3 resources based on Macie discovery results  There has been significant improvement to the AWS IAM documentation around IAM best practices. Artillery lets you use Lambdas for open source load testing. 

Screaming in the Cloud
Kubernetes and OpenGitOps with Chris Short

Screaming in the Cloud

Play Episode Listen Later Jul 14, 2022 39:01


About Chris
Chris Short has been a proponent of open source solutions throughout his over two decades in various IT disciplines, including systems, security, networks, DevOps management, and cloud native advocacy across the public and private sectors. He currently works on the Kubernetes team at Amazon Web Services and is an active Kubernetes contributor and Co-chair of OpenGitOps. Chris is a disabled US Air Force veteran living with his wife and son in Greater Metro Detroit. Chris writes about Cloud Native, DevOps, and other topics at ChrisShort.net. He also runs the Cloud Native, DevOps, GitOps, Open Source, industry news, and culture-focused newsletter DevOps'ish.
Links Referenced: DevOps'ish: https://devopsish.com/ EKS News: https://eks.news/ Containers from the Couch: https://containersfromthecouch.com opengitops.dev: https://opengitops.dev ChrisShort.net: https://chrisshort.net Twitter: https://twitter.com/ChrisShort
Transcript
Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud. Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. Coming back to us since episode two—it's always nice to go back and see the where-are-they-now type of approach—I am joined by Senior Developer Advocate at AWS Chris Short. Chris, been a few years. How has it been? Chris: Ha. Corey, we have talked outside of the podcast. But it's been good. For those that have been listening, I think when we recorded I wasn't even—like, when was season two, what year was that? [laugh]. Corey: Episode two was first, pre-pandemic and the rest. I believe— Chris: Oh. So, yeah. I was at Red Hat, maybe, when I—yeah. Corey: Yeah.
You were doing Red Hat stuff, back when you got to work on open-source stuff, as opposed to now, where you're not within 1000 miles of that stuff, right? Chris: Actually, well, no. So, to be clear, I'm on the EKS team, the Kubernetes team here at AWS. So, when I joined AWS in October, they were like, “Hey, you do open-source stuff. We like that. Do more.” And I was like, “Oh, wait, do more?” And they were like, “Yes, do more.” “Okay.” So, since joining AWS, I've probably done more open-source work than the three years at Red Hat that I did. So, that's kind of—you know, like, it's an interesting point when I talk to people about it because the first couple months are, like—you know, my friends are like, “So, are you liking it? Are you enjoying it? What's going on?” And— Corey: Do they beat you with reeds? Like, all the questions people have about companies? Because— Chris: Right. Like, I get a lot of random questions about Amazon and AWS that I don't know the answer to. Corey: Oh, when I started telling people I fixed Amazon bills, I had to quickly pivot that to AWS bills because people started asking me, “Well, can you save me money on underpants?” It's I— Chris: Yeah. Corey: How do you—fine. Get the Prime credit card. It docks 5% off the bill, so there you go. But other than that, no, I can't. Chris: No. Corey: It's— Chris: Like, I had to call my bank this morning about a transaction that I didn't recognize, and it was from Amazon. And I was like, that's weird. Why would that— Corey: Money just flows one direction, and that's the wrong direction from my employer. Chris: Yeah. Like, what is going on here? It shouldn't have been on that card kind of thing. And I had to explain to the person on the phone that I do work at Amazon but under the Web Services team. And he was like, “Oh, so you're in IT?” And I'm like, “No.” [laugh]. “It's actually this big company. That—it's a cloud company.” And they're like, “Oh, okay, okay. Yeah. The cloud. Got it.” [laugh].
So, it's interesting talking to people about, “I work at Amazon.” “Oh, my son works at an Amazon distribution center,” blah, blah, blah. It's like, cool. “I know about that, but very little. I do this.” Corey: Your son works in an Amazon distribution center. “Is he a robot?” is normally my next question on that. Yeah. That's neither here nor there. So, you and I started talking a while back. We both write newsletters that go to a somewhat similar audience. You write DevOps'ish. I write Last Week in AWS. And recently, you also have started EKS News because, yeah, the one thing I look at when I'm doing these newsletters every week is, you know what I want to do? That's right. Write more newsletters. Chris: [laugh]. Corey: So, you are just a glutton for punishment? And, yeah, welcome to the addiction, I suppose. How's it been going for you? Chris: It's actually been pretty interesting, right? Like, we haven't pushed it very hard. We're now starting to include it in things. Like, we did Container Day; we made sure that EKS News was on the landing page for Container Day at KubeCon EU. And you know, it's kind of just grown organically since then. But it was one of those things where it's like, internally—this happened at Red Hat, right—when I started live streaming at Red Hat, the ultimate goal was to do our product management—like, here's what's new in the next version thing—do those live so anybody can see that at any point in time anywhere on Earth, the second it's available. Similar situation here. This newsletter actually is generated as part of a report my boss puts together to brief our other DAs—or developer advocates—you know, our solutions architects, the whole nine yards, about new EKS features. So, I was like, why can't we just flip that into a weekly newsletter, you know? Like, I can pull from the same sources you can. And what's interesting is, he only does the meeting bi-weekly.
So, there's some weeks where it's just all me doing it and he ends up just kind of copying and pasting the newsletter into his document [laugh] and then adds on for the week. But that report meeting for that team is now getting disseminated to essentially anyone that subscribes to eks.news. Just go to the site; there's a subscribe thing right there. And we've gotten 20 issues in and it's gotten rave reviews, right? Corey: I have been a subscriber for a while. I will say that it has less Chris Short personality— Chris: Mm-hm. Corey: —to it than DevOps'ish does, which I have to assume is by design. A lot of The Duckbill Group's marketing these days is no longer in my voice, rather intentionally, because it turns out that being a sarcastic jackass and doing half-billion-dollar AWS contracts turns out not to be the most congruent thing in the world. So okay, we're slowly ameliorating that. It's professional voice versus snarky voice. Chris: Well, and here's the thing, right? Like, I realized this year with DevOps'ish that, like, if I want to take a week off, I have to do, like, what you did when your child was born. You hired folks to, like, do the newsletter for you, or I actually don't do the newsletter, right? It's binary: hire someone else to do it, or don't do it. So, the way I structured this newsletter was that any developer advocate on my team could jump in and take over the newsletter so that, you know, if I'm off that week, or whatever may be happening, I, Chris Short, am not the voice. It is now the entire developer advocate team. Corey: I will challenge you on that a bit. Because it's not Chris Short voice, that's for sure, but it's also not official AWS brand voice either. Chris: No. Corey: It is clearly written by a human being who is used to communicating with the audience for whom it is written. And that is no small thing. Normally, when—oh, there's a corporate newsletter—that's just a lot of words to say it's bad. This one is good.
I want to be very clear on that.Chris: Yeah, I mean, we have just, like, DevOps'ish, we have sections, just like your newsletter, there's certain sections, so any new, what's new announcements, those go in automatically. So, like, that can get delivered to your inbox every Friday. Same thing with new blog posts about anything containers related to EKS, those will be in there, then Containers from the Couch, our streaming platform, essentially, for all things Kubernetes. Those videos go in.And then there's some ecosystem news as well that I collect and put in the newsletter to give people a broader sense of what's going on out there in Kubernetes-land because let's face it, there's upstream and then there's downstream, and sometimes those aren't in sync, and that's normal. That's how Kubernetes kind of works sometimes. If you're running upstream Kubernetes, you are awesome. I appreciate you, but I feel like that would cause more problems and it's worse sometimes.Corey: Thank you for being the trailblazers. The rest of us can learn from your misfortune.Chris: [laugh]. Yeah, exactly. Right? Like, please file your bugs accordingly. [laugh].Corey: EKS is interesting to me because I don't see a lot of it, which is, probably, going to get a whole lot of, “Wait, what?” Moments because wait, don't you deal with very large AWS bills? And I do. But what I mean by that is that EKS, until you're using its Fargate expression, charges for the control plane, which rounds to no money, and the rest is running on EC2 instances running in a company's account. From the billing perspective, there is no difference between, “We're running massive fleets of EKS nodes.” And, “We're managing a whole bunch of EC2 instances by hand.”And that feels like an interesting allegory for how Kubernetes winds up expressing itself to cloud providers. Because from a billing perspective, it just looks like one big single-tenant application that has some really strange behaviors internally. 
It gets very chatty across AZs when there's no reason to, and whatnot. And it becomes a very interesting study in how to expose aspects of what's going on inside of those containers and inside of the Kubernetes environment to the cloud provider in a way that becomes actionable. There are no good answers for this yet, but it's something I've been seeing a lot of. Like, “Oh, I thought you'd be running Kubernetes. Oh, wait, you are and I just keep forgetting what I'm looking at sometimes.” Chris: So, that's an interesting point. The billing is kind of like, yeah, it's just compute, right? So— Corey: And my insight into AWS and the way I start thinking about it is always from a billing perspective. That's great. It's because that means the more expensive the services, the more I know about it. It's like, “IAM. What is that?” Like, “Oh, I have no idea. It's free. How important could it be?” Professional advice: do not take that philosophy, ever. Chris: [laugh]. No. Ever. No. Corey: Security: it matters. Oh, my God. It's like, you're all stars; your IAM policy should not be. I digress. Chris: Right. Yeah. Anyways, two points I want to make real quick on that. One, we've recently released an open-source project called Karpenter, which is really cool in my purview because it looks at your Kubernetes manifest and says, “Oh, you want this to run on an ARM instance.” And you can even go so far as to say, right, here's my limits, and it'll find an instance that fits those limits and add that to your cluster automatically. Run your pod on that compute as long as it needs to run and then, if it's done, it'll downsize—eventually, kind of thing—your cluster. So, you can basically just throw a bunch of workloads at it, and it'll auto-detect what kind of compute you will need and then provision it for you, run it, and then be done.
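The instance-selection idea behind Karpenter, as described here, is: given a pod's resource requests and architecture, pick the cheapest instance type that satisfies them. A minimal sketch of that logic; the catalog names and prices below are made-up placeholders, not real AWS instance data, and real Karpenter considers far more dimensions:

```python
# Minimal sketch of Karpenter-style instance selection: filter a
# catalog by architecture and resource limits, then take the cheapest
# fit. The catalog below is hypothetical, not real AWS pricing.
from dataclasses import dataclass

@dataclass
class InstanceType:
    name: str
    arch: str          # "arm64" or "amd64"
    vcpus: int
    memory_gib: int
    hourly_usd: float

CATALOG = [
    InstanceType("large-arm",  "arm64", 2,  8, 0.05),
    InstanceType("xlarge-arm", "arm64", 4, 16, 0.10),
    InstanceType("large-x86",  "amd64", 2,  8, 0.07),
]

def pick_instance(arch: str, vcpus: int, memory_gib: int):
    """Return the cheapest instance satisfying the pod's requirements, or None."""
    candidates = [
        i for i in CATALOG
        if i.arch == arch and i.vcpus >= vcpus and i.memory_gib >= memory_gib
    ]
    return min(candidates, key=lambda i: i.hourly_usd, default=None)

choice = pick_instance("arm64", 2, 6)
print(choice.name)  # -> large-arm
```

When no catalog entry fits, `pick_instance` returns `None`, which in the real system would correspond to a pod that cannot be scheduled with the current constraints.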
So, that is one way folks are probably starting to save money running EKS is to adopt Karpenter as your autoscaler as opposed to the inbuilt Kubernetes autoscaler. Because this is instance-aware, essentially, so it can say, like, “Oh, your massive ARM application can run here,” because you know, thank you, Graviton. We have those processors in-house. And you know, you can run your ARM64 instances, you can run all the Intel workloads you want, and it'll right size the compute for your workloads.And it'll look at one container or all your containers, however you want to configure it. Secondly, the good folks over at Kubecost have OpenCost, which is the open-source version of Kubecost, basically. So, they have a service that you can run in your clusters that will help you say, “Hey, maybe this one node's too heavy; maybe this one node's too light,” and you know, give you some insights into Kubernetes spend that are a little bit more granular as far as usage and things like that go. So, those two projects right there, I feel like, will give folks an optimal savings experience when it comes to Kubernetes. But to your point, it's just compute, right? And that's really how we treat it, kind of, here internally is that it's a way to run… compute, Kubernetes, or ECS, or any of those tools.Corey: A fairly expensive one because ignoring entirely for a second the actual raw cost of compute, you also have the other side of it, which is in every environment, unless you are doing something very strange or pre-funding as a one-person startup in your spare time, your payroll costs will—and should—exceed your AWS bill by a fairly healthy amount. And engineering time is always more expensive than services time. So, for example, looking at EKS, I would absolutely recommend people use that rather than rolling their own because—Chris: Rolling their own? Yeah.Corey: —get out of that engineering space where your time is free. I assure you from a business context, it is not.
So, there's always that question of what you can do to make things easier for people and do more of the heavy lifting.Chris: Yeah, and to your rather cheeky point that there's 17 ways to run a container on AWS, it is answering that question, right? Like those 17 ways, like, how much of this do you want to run yourself, you could run EKS Distro on EC2 instances if you want full control over your environment.Corey: And then run IoT Greengrass core on top within that cluster—Chris: Right.Corey: So, I can run my own Lambda function runtime, so I'm not locked in. Also, DynamoDB local so I'm not locked into AWS. At which point I have gone so far around the bend, no one can help me.Chris: Well—Corey: Pro tip, don't do that. Just don't do that.Chris: But to your point, we have all these options for compute, and specifically containers because there's a lot of people that want to granularly say, “This is where my engineering team gets involved. Everything else you handle.” If I want EKS on Spot Instances only, you can do that. If you want EKS to use Karpenter and say only run ARM workloads, you can do that. If you want to say Fargate and not have anything to manage other than the container file, you can do that.It's how much does your team want to manage? That's the customer obsession part of AWS coming through when it comes to containers is because there's so many different ways to run those workloads, but there's so many different ways to make sure that your team is right-sized, based off the services you're using.Corey: I do want to change gears a bit here because you are mostly known for a couple of things: the DevOps'ish newsletter because that is the oldest and longest thing you've been doing in the time that I've known you; EKS, obviously.
But when prepping for this show, I discovered you are now co-chair of the OpenGitOps project.Chris: Yes.Corey: So, I have heard of GitOps in the context of, “Oh, it's just basically your CI/CD stuff is triggered by Git events and whatnot.” And I'm sitting here going, “Okay, so from where you're sitting, the two best user interfaces in the world that you have discovered are YAML and Git.” And I just have to start with the question, “Who hurt you?”Chris: [laugh]. Yeah, I share your sentiment when it comes to Git. Not so much with YAML, but I think it's because I'm so used to it. Maybe it's Stockholm Syndrome, maybe the whole YAML thing. I don't know.Corey: Well, it's no XML. We'll put it that way.Chris: Thankfully, yes because if it was, I would have way more, like, just template files laying around to build things. But the—Corey: And rage. Don't forget rage.Chris: And rage, yeah. So, GitOps is a little bit more than just Git and IaC—Infrastructure as Code. It's more like Justin Garrison, who's also on my team, he calls it infrastructure software because there's four main principles to GitOps, and if you go to opengitops.dev, you can see them. It's version one.So, we put them on the website, right there on the page. You have to have a declared state and that state has to live somewhere. Now, it's called GitOps because Git is probably the most full-featured thing to put your state in, but you could use an S3 bucket and just version it, for example. And make it private so no one else can get to it.Corey: Or you could use local files: copy-of-copy-of-this-thing-restored-parentheses-use-this-one-dot-final-dot-doc-dot-zip. You know, my preferred naming convention.Chris: Ah, yeah. Wow. Okay. [laugh]. Yeah.Corey: Everything I touch is terrifying.Chris: Yes. Geez, I'm sorry. So first, it's declarative. You declare your state. You store it somewhere. It's versioned and immutable, like I said.
And then pulled automatically—don't focus so much on pull—but basically, software agents are applying the desired state from source. So, what does that mean? When it's—you know, the fourth principle is implemented, continuously reconciled. That means those software agents that are checking your desired state are actually putting it back into the desired state if it's out of whack, right? So—Corey: You're talking about agents running persistently on instances, validating—Chris: Yes.Corey: —a checkpoint on a cron. How is this meaningfully different than a Puppet agent running in years past? I learned to speak publicly by being a traveling trainer for Puppet; same type of model, and in fact, when I was at Pinterest, we wound up having a fair bit—like, that was their entire model, where they would have—the Puppet code would live in an S3 bucket that was then copied down, I believe, via Git, and then applied to the instance on a schedule. Like, that sounds like this was sort of an early-days GitOps.Chris: Yeah, exactly. Right? Like so it's, I like to think of that as a component of GitOps, right? DevOps, when you talk about DevOps in general, there's a lot of stuff out there. There's a lot of things labeled DevOps that maybe are, or maybe aren't sticking to some of those DevOps core things that make you great.Like the stuff that Nicole Forsgren writes about in books, you know? Accelerate is on my desk for a reason because there's things that good, well-managed DevOps practices do. I see GitOps as an actual implementation of DevOps in an open-source manner because all the tooling for GitOps these days is open-source and it all started as open-source. Now, you can get, like, Flux or Argo—Argo, specifically—there's managed services out there for it, you can have Flux and not maintain it, through an add-on, on EKS for example, and it will reconcile that state for you automatically.
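As a concrete sketch of those four principles, an Argo CD Application manifest ties a versioned Git source to a destination cluster and switches on continuous reconciliation (the repository URL, path, and names below are placeholders, not a real setup):

```yaml
# Argo CD Application: the declared state lives in Git (versioned and
# immutable), and the controller continuously reconciles the cluster
# back to it.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: my-service
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/example-org/my-service.git  # placeholder repo
    targetRevision: main        # the declared, versioned desired state
    path: deploy/production
  destination:
    server: https://kubernetes.default.svc
    namespace: my-service
  syncPolicy:
    automated:
      prune: true      # delete resources that were removed from Git
      selfHeal: true   # revert manual drift back to the declared state
```

With selfHeal enabled, a break-glass kubectl change in one cluster gets reverted automatically: the "putting it back into the desired state if it's out of whack" behavior Chris describes.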
And the other thing I like to say about GitOps, specifically, is that it moves at the speed of the Kubernetes Audit Log.If you've ever looked at a Kubernetes audit log, you know it's rather noisy with all these groups and versions and kinds getting thrown out there. So, GitOps will say, “Oh, there's an event for said thing that I'm supposed to be watching. Do I need to change anything? Yes or no? Yes? Okay, go.”And the change gets applied, or, “Hey, there's a new Git thing. Pull it in. A change has happened in Git; I need to update it.” You can set it to reconcile on events or on time. It's like a cron or it's like an event-driven architecture, but it's combined.Corey: How does it survive the stake through the heart of configuration management? Because before I was doing all this, I wasn't even a T-shaped engineer: you're broad across a bunch of things, but deep in one or two areas, and one of mine was configuration management. I wrote part of SaltStack, once upon a time—Chris: Oh.Corey: —due to a bunch of very strange coincidences all hitting at once, like, I taught people how to use Puppet. But containers ultimately arose and the idea of immutable infrastructure became a thing. And these days when we were doing full-on serverless, well, great, I just wind up deploying a new code bundle to the Lambda function that I wind up caring about, and that is an immutable version replacement. There is no drift because there is no way to log in and change those things other than through a clear deployment of this as the new version that goes out there. Where does GitOps fit into that imagined pattern?Chris: So, configuration management becomes part of your approval process, right? So, you now are generating an audit log, essentially, of all changes to your system through the approval process that you set up as part of your, how you get things into source and then promote that out to production. That's kind of the beauty of it, right?
Like, that's why we suggest using Git because it has functions, like, pull requests and issues and things like that you can say, “Hey, yes, I approve this,” or, “Hey, no, I don't approve that. We need changes.” So, that's kind of natively happening with Git and, you know, GitLab, GitHub, whatever implementation of Git. There's always, kind of—Corey: Uh, JIF-ub is, I believe, the pronunciation.Chris: JIF-ub? Oh.Corey: Yeah. That's what I'm—Chris: Today, I learned. Okay.Corey: Exactly. And that's one of the things that I do for my lasttweetinaws.com Twitter client that I build—because I needed it, and if other people want to use it, that's great—that is now deployed to 20 different AWS commercial regions, simultaneously. And that is done via—because it turns out that that's a very long-to-execute for loop if you start down that path—Chris: Well, yeah.Corey: I wound up building out a GitHub Actions matrix—sorry a JIF-ub—actions matrix job that winds up instantiating 20 parallel builds of the CDK deploy that goes out to each region as expected. And because that gets really expensive with native GitHub Actions runners for, like, 36 cents per deploy, and I don't know how to test my own code, so every time I have a typo, that's another quarter in the jar. Cool, but that was annoying for me so I built my own custom runner system that uses Lambda functions as runners running containers pulled from ECR that, oh, it just runs in parallel, less than three minutes. Every time I commit something between when I press the push button and it is out and running in the wild across all regions. Which is awesome and also terrifying because, as previously mentioned, I don't know how to test my code.Chris: Yeah.
So, you don't know what you're deploying to 20 regions sometimes, right?Corey: But it also means I have a pristine, re-composable build environment because I can—Chris: Right.Corey: Just automatically have that go out and the fact that I am making a—either merging a pull request or doing a direct push because I consider main to be my feature branch, so whenever something hits that, all the automation kicks off. That was something that I found to be transformative as far as a way of thinking about this because I was very tired of having to tweak my local laptop environment to, “Oh, you didn't assume the proper role and everything failed again and you broke it. Good job.” It wound up being something where I could start developing on more and more disparate platforms. And it finally is what got me away from my old development model of everything I build is on an EC2 instance, and that means that my editor of choice was Vim. I use VS Code now for these things, and I'm pretty happy with it.Chris: Yeah. So, you know, I'm glad you brought up CDK. CDK gives you a lot of the capabilities to implement GitOps in a way that you could say, like, “Hey, use CDK to declare I need four Amazon EKS clusters with this size, shape, and configuration. Go.” Or even further, connect these EKS clusters to RDS instances and load balancers and everything else.But you put that state into Git and then you have something that deploys that automatically upon changes. That is infrastructure as code. Now, when you say, “Okay, main is your feature branch,” you know, things happen on main, if this were running in Kubernetes across a fleet of clusters or globe-wide in 20 regions, something like Flux or Argo would kick in and say, “There's been a change to source, main, and we need to roll this out.” And it'll start applying those changes.
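The 20-region fan-out Corey describes can be sketched as a GitHub Actions matrix workflow. This is a simplified guess at the shape of it, not his actual pipeline: the region list is truncated to three, the deploy steps are reduced to the basics, and his real setup swaps the hosted runners for custom Lambda-backed ones:

```yaml
# Hypothetical simplification of the multi-region deploy described above:
# on every push to main, fan out one CDK deploy job per AWS region.
name: deploy-all-regions
on:
  push:
    branches: [main]   # main is the feature branch; a push triggers everything
jobs:
  deploy:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false        # one region failing shouldn't stop the rest
      matrix:
        region: [us-east-1, us-west-2, eu-west-1]   # ...expands to all 20
    env:
      AWS_REGION: ${{ matrix.region }}
    steps:
      - uses: actions/checkout@v3
      - run: npm ci
      - run: npx cdk deploy --all --require-approval never
```

The strategy.matrix block expands the single job definition into one job per region, all running in parallel, which is what turns a serial for loop over regions into a roughly constant-time deploy.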
Now, what do you get with GitOps that you don't get with your configuration? I mean, can you roll back if you ever have, like, a bad commit that's just awful? I mean, that's really part of the process with GitOps is to make sure that you can, A, roll back to the previous good state, B, roll forward to a known good state, or C, promote that state up through various environments. And then having that all done declaratively, automatically, and immutably, and versioned with an audit log, that I think is the real power of GitOps in the sense that, like, oh, so-and-so approved this change to security policy XYZ on this date at this time. And that to an auditor, you just hand them a log file, like, “Here's everything we've ever done to our system. Done.” Right?Like, you could get to that state, if you want to, which I think is kind of the idea of DevOps, which says, “Take all these disparate tools and processes and procedures and culture changes”—culture being the hardest part to adopt in DevOps; GitOps kind of forces a culture change where, like, you can't do a CAB with GitOps. Like, those two things don't fly. You don't have a configuration management database unless you absolutely—Corey: Oh, you CAB now, but they're all the comments of the pull request.Chris: Right. Exactly. Like, don't push this change out until Thursday after this other thing has happened, kind of thing. Yeah, like, that all happens in GitHub. But it's very democratizing in the sense that people don't have to waste time in an hour-long meeting to get their five minutes in, right?
The key? Chronosphere is an open-source compatible, scalable, and reliable observability solution that gives the observability lead at DoorDash business, confidence, and peace of mind. Read the full success story at snark.cloud/chronosphere. That's snark.cloud slash C-H-R-O-N-O-S-P-H-E-R-E.Corey: So, would it be overwhelmingly cynical to suggest that GitOps is the means to implement what we've all been pretending to have implemented for the last decade when giving talks at conferences?Chris: Ehh, I wouldn't go that far. I would say that GitOps is an excellent way to implement the things you've been talking about at all these conferences for all these years. But keep in mind, the technology has changed a lot in the, what 11, 12 years of the existence of DevOps, now. I mean, we've gone from, let's try to manage whole servers immutably to, “Oh, now we just need to maintain an orchestration platform and run containers.” That whole compute interface, you go from SSH to a Docker file, that's a big leap, right?Like, you don't have bespoke sysadmins; you have, like, a platform team. You don't have DevOps engineers; they're part of that platform team, or DevOps teams, right? Like, which was kind of antithetical to the whole idea of DevOps to have a DevOps team. You know, everybody's kind of in the same boat now, where we see skill sets kind of changing. And GitOps and Kubernetes-land is, like, a platform team that manages the cluster, and its state, and health and, you know, production essentially.And then you have your developers deploying what they want to deploy in when whatever namespace they've been given access to and whatever rights they have. So, now you have the potential for one set of people—the platform team—to use one set of GitOps tooling, and your applications teams might not like that, and that's fine. They can have their own namespaces with their own tooling in it. 
Like, Argo, for example, is preferred by a lot of developers because it has a nice UI with green and red dots and they can show people and it looks nice, Flux, it's command line based. And there are some projects out there that kind of take the UI of Argo and try to run Flux underneath that, and those are cool kind of projects, I think, in my mind, but in general, right, I think GitOps gives you the choice that we missed somewhat in DevOps implementations of the past because it was, “Oh, we need to go get cloud.” “Well, you can only use this cloud.” “Oh, we need to go get this thing.” “Well, you can only use this thing in-house.”And you know, there's a lot of restrictions sometimes placed on what you can use in your environment. Well, if your environment is Kubernetes, how do you restrict what you can run, right? Like you can't have an easily configured say, no open-source policy if you're running Kubernetes. [laugh] so it becomes, you know—Corey: Well, that doesn't stop some companies from trying.Chris: Yeah, that's true. But the idea of, like, enabling your developers to deploy at will and then promote their changes as they see fit is really the dream of DevOps, right? Like, same with production and platform teams, right? I want to push my changes out to a larger system that is across the globe. How do I do that? How do I manage that? How do I make sure everything's consistent?GitOps gives you those ways, with Kubernetes native things like customizations, to make consistent environments that are robust and actually going to be reconciled automatically if someone breaks the glass and says, “Oh, I need to run this container immediately.” Well, that's going to create problems because it's deviated from state and it's just that one region, so we'll put it back into state.Corey: It'll be dueling banjos, at some point. You'll try and doing something manually, it gets reverted automatically. I love that pattern. 
You'll get bored before the computer does, always.Chris: Yeah. And GitOps is very new, right? When you think about the lifetime of GitOps, I think it was coined in, like, 2018. So, it's only four years old, right? When—Corey: I prefer it to ChatOps, at least, as far as—Chris: Well, I mean—Corey: —implementation and expression of the thing.Chris: —ChatOps was a way to do DevOps. I think GitOps—Corey: Well, ChatOps is also a way to wind up giving whoever gets access to your Slack workspace root in production.Chris: Mmm.Corey: But that's neither here nor there.Chris: Mm-hm.Corey: It's yeah, we all like to pretend that's not a giant security issue in our industry, but that's a topic for another time.Chris: Yeah. And that's why, like, GitOps also depends upon you having good security, you know, and good authorization and approval processes. It enforces that upon—Corey: Yeah, who doesn't have one of those?Chris: Yeah. If it's a sole operation kind of deal, like in your setup, your case, I think you kind of got it doing right, right? Like, as far as GitOps goes—Corey: Oh, to be clear, we are 11 people and we do have dueling pull requests and all the rest.Chris: Right, right, right.Corey: But most of the stuff I talk about publicly is not our production stuff, so it really is just me. Just as a point of clarity there. I've n—the 11 people here do not all—the rest of you don't just sit there and clap as I do all the work.Chris: Right.Corey: Most days.Chris: No, I'm sure they don't. I'm almost certain they don't clap… for you. I mean, they would—Corey: No. No, they try and talk me out of it in almost every case.Chris: Yeah, exactly. So, the setup that you, Corey Quinn, have implemented to deploy these 20 regions is kind of very GitOps-y, in the sense that when main changes, it gets updated. Where it's not GitOps-y is what if the endpoint changes? Does it get reconciled? 
That's the piece you're probably missing is that continuous reconciliation component, where it's constantly checking and saying, “This thing out there is deployed in the way I want it. You know, the way I declared it to be in my source of truth.”Corey: Yeah, when you start having other people getting involved, there can—yeah, that's where regressions enter. And it's like, “Well, I know where things are so why would I change the endpoint?” Yeah, it turns out, not everyone has the state of the entire application in their head. Ideally it should live in—Chris: Yeah. Right. And, you know—Corey: —you know, Git or S3.Chris: —when I—yeah, exactly. When I think about interactions of the past coming out as a new DevOps engineer to work with developers, it's always been, will developers have access to prod or they don't? And if you're in that environment with—you're trying to run a multi-billion dollar operation, and your devs have direct—or one Dev has direct access to prod because prod is in his brain, that's where it's like, well, now wait a minute. Prod doesn't have to be only in your brain. You can put that in the codebase and now we know what is in your brain, right?Like, you can almost do—if you document your code, well, you can have your full lifecycle right there in one place, including documentation, which I think is the best part, too. So, you know, it encourages approval processes and automation over this one person has an entire state of the system in their head; they have to go in and fix it. And what if they're not on call, or in Jamaica, or on a cruise ship somewhere kind of thing? Things get difficult. Like, for example, I just got back from vacation. We were so far off the grid, we had satellite internet. 
And let me tell you, it was hard to write an email newsletter where I usually open 50 to 100 tabs.Corey: There's a little bit of internet out Californ-ie way.Chris: [laugh].Corey: Yeah it's… it's always weird going from, like, especially after pandemic; I have gigabit symmetric here and going even to re:Invent where I'm trying to upload a bunch of video and whatnot.Chris: Yeah. Oh wow.Corey: And the conference WiFi was doing its thing, and well, Verizon 5G was there but spotty. And well, yeah. Usual stuff.Chris: Yeah. It's amazing to me how connectivity has become so ubiquitous.Corey: To the point where when it's not there anymore, it's what do I do with myself? Same story about people pushing back against remote development of, “Oh, I'm just going to do it all on my laptop because what happens if I'm on a plane?” It's, yeah, the year before the pandemic, I flew 140,000 miles domestically and I was almost never hamstrung by my ability to do work. And my only local computer is an iPad for those things. So, it turns out that is less of a real world concern for most folks.Chris: Yeah I actually ordered the components to upgrade an old NUC that I have here and turn it into my, like, this is my remote code server, that's going to be all attached to GitHub and everything else. That's where I want to be: have Tailscale and just VPN into this box.Corey: Tailscale is transformative.Chris: Yes. Tailscale will change your life. That's just my personal opinion.Corey: Yep.Chris: That's not an AWS opinion or anything. But yeah, when you start thinking about your network as it could be anywhere, that's where Tailscale, like, really shines. So—Corey: Tailscale makes the internet work like we all wanted to believe that it worked.Chris: Yeah. And Wireguard is an excellent open-source project.
And Tailscale consumes that and puts an amazingly easy-to-use UI, and troubleshooting tools, and routing, and all kinds of forwarding capabilities, and makes it kind of easy, which is really, really, really kind of awesome. And Tailscale and Kubernetes—Corey: Yeah, ‘network' and ‘easy' don't belong in the same sentence, but in this case, they do.Chris: Yeah. And trust me, the Kubernetes story in Tailscale, there is a lot of there. I understand you might want to not open ports in your VPC, maybe, but if you use Tailscale, that node is just another thing on your network. You can connect to that and see what's going on. Your management cluster is just another thing on the network where you can watch the state.But it's all—you're connected to it continuously through Tailscale. Or, you know, it's a much lighter weight, kind of meshy VPN, I would say, if I had to sum it up in one sentence. That was not on our agenda to talk about at all. Anyways. [laugh]Corey: No, no. I love how many different topics we talk about on these things. We'll have to have you back soon to talk again. I really want to thank you for being so generous with your time. If people want to learn more about what you're up to and how you view these things, where can they find you?Chris: Go to ChrisShort.net. So, Chris Short—I'm six-four so remember, it's Short—dot net, and you will find all the places that I write, you can go to devopsish.com to subscribe to my newsletter, which goes out every week. This year. Next year, there'll be breaks. And then finally, if you want to follow me on Twitter, Chris Short: at @ChrisShort on Twitter. All one word so you see two s's. Like, it's okay, there's two s's there.Corey: Links to all of that will of course be in the show notes. It's easier for people to do the clicky-clicky thing as a general rule.Chris: Clicky things are easier than the wordy things, yes.Corey: Says the Kubernetes guy.Chris: Yeah. Says the Kubernetes guy. Yeah, you like that, huh? 
Like I said, Argo gives you a UI. [laugh].Corey: Thank you [laugh] so much for your time. I really do appreciate it.Chris: Thank you. This has been fun. If folks have questions, feel free to reach out. Like, I am not one of those people that hides behind a screen all day and doesn't respond. I will respond to you eventually.Corey: I'm right here, Chris. Come on, come on. You're calling me out in front of myself. My God.Chris: Egh. It might take a day or two, but I will respond. I promise.Corey: Thanks again for your time. This has been Chris Short, senior developer advocate at AWS. I'm Cloud Economist Corey Quinn and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice and if it's YouTube, click the thumbs-up button. Whereas if you've hated this podcast, same thing, smash the buttons five-star review and leave an insulting comment that is written in syntactically correct YAML because it's just so easy to do.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.Announcer: This has been a HumblePod production. Stay humble.

PHPUgly
290: PHP Lambos and Lambdas

PHPUgly

Play Episode Listen Later Jun 10, 2022 64:38


Links from the show:
Message Repository for Illuminate - EventSauce
EU enforces USB Type-C charging on most electronic devices | PC Gamer
Xdebug Update: May 2022 — Derick Rethans

This episode of PHPUgly was sponsored by:
Honeybadger.io - https://www.honeybadger.io/

PHPUgly streams the recording of this podcast live. Typically every Thursday night around 9 PM PT. Come and join us, and subscribe to our Youtube Channel, Twitch, or Periscope. Also, be sure to check out our Patreon Page.

Twitter Account: https://twitter.com/phpugly

Hosts:
Eric Van Johnson
John Congdon
Tom Rideout

Streams:
Youtube Channel
Twitch
Periscope
Powered by Restream
Patreon Page

PHPUgly Anthem by Harry Mack / Harry Mack Youtube Channel

Thanks to all of our Patreon Sponsors:
Honeybadger ** This week's Sponsor **
ButteryCrumpet, Frank W, David Q, Shawn, Ken F, Boštjan, Marcus, Shelby C, S Ferguson, Rodrigo C, Billy, Darryl H, Knut Erik B, Dmitri G, Elgimbo, MikePageDev, Kenrick B, Kalen J, R. C. S., Peter A, Clayton S, Ronny M, Ben R, Alex B, Kevin Y, Enno R, Wayne, Jeroen F, Andy H, Sevi, Chris C, Steve M, Robert S, Thorsten, Emily J, Joe F, Andrew W, ulrik, John C, James H, Eric M, Laravel Magazine, Ed G, Ririe, lilHermit

Screaming in the Cloud
Conveying Authenticity in Marketing with Sharone Zitzman

Screaming in the Cloud

Play Episode Listen Later Jun 2, 2022 32:16


About Sharone
I'm Sharone Zitzman, a marketing technologist and open source community builder, who likes to work with engineering teams that are building products that developers love. Having built both the DevOps Israel and Cloud Native Israel communities from the ground up, today I spend my time finding the places where technology and people intersect and ensuring that this is an excellent experience. You can find my talks, articles, and employment experience at rtfmplease.dev. Find me on Twitter or Github as @shar1z.

Links Referenced:
Personal Twitter: https://twitter.com/shar1z
Website: https://rtfmplease.dev
LinkedIn: https://www.linkedin.com/in/sharonez/
@TLVCommunity: https://twitter.com/TLVcommunity
@DevOpsDaysTLV: https://twitter.com/devopsdaystlv

Transcript
Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: DoorDash had a problem. As their cloud-native environment scaled and developers delivered new features, their monitoring system kept breaking down. In an organization where data is used to make better decisions about technology and about the business, losing observability means the entire company loses their competitive edge. With Chronosphere, DoorDash is no longer losing visibility into their applications suite. The key? Chronosphere is an open-source compatible, scalable, and reliable observability solution that gives the observability lead at DoorDash business, confidence, and peace of mind. Read the full success story at snark.cloud/chronosphere. That's snark.cloud/C-H-R-O-N-O-S-P-H-E-R-E.Corey: The company 0x4447 builds products to increase standardization and security in AWS organizations.
They do this with automated pipelines that use well-structured projects to create secure, easy-to-maintain and fail-tolerant solutions, one of which is their VPN product built on top of the popular OpenVPN project which has no license restrictions; you are only limited by the network card in the instance. To learn more visit: snark.cloud/deployandgoCorey: Welcome to Screaming in the Cloud. I'm Corey Quinn and I have been remiss by not having today's guest on years ago because back before I started this ridiculous nonsense that, well, whatever it is you'd call what I do for a living, I did other things instead. I did the DevOps, which means I was sad all the time. And the thing that I enjoyed was the chance to go and speak on conference stages. One of those stages, early on in my speaking career, was at DevOpsDays Tel Aviv.My guest today is Sharone Zitzman, who was an organizer of DevOpsDays Tel Aviv, who started convincing me to come back. And today is in fact, in the strong tradition here of making up your own job titles in ways that make people smile, she is the Chief Manual Reader at RTFM Please Ltd. Sharone, thank you for joining me.Sharone: Thank you for having me, Corey. Israelis love the name of my company, but Americans think it has a lot of moxie and chutzpah. [laugh].Corey: It seems a little direct and aggressive. It's like, oh, good, you are familiar with how this is going to go. There's something to be said for telling people what you do on the tin upfront. I've never been a big fan of trying to hide that. I mean, the first iteration of my company was the Quinn Advisory Group because I thought, you know, let's make it look boring and sedate and like I can talk to finance people. And yeah, that didn't last more than ten seconds of people talking to me.Also, in hindsight, the logo of a big stylized Q. 
Yeah, I would have had to change that anyway, for the whole QAnon nonsense because I don't want to be mistaken for that particular brand of nuts.

Sharone: Yeah, I decided to do away with the whole formalities and upfront, just go straight [laugh]. For the core of who we are, Corey; you are very similar in that. So, yes. Being a dev-first company, I thought the developers would appreciate such a title and name for my company. And I have to give a shout out here to Avishai Ish-Shalom, who's my friend from the community who you also know from the DevOpsDays community.

Corey: Oh, yeah @nukemberg on Twitter—

Sharone: Yes exactly.

Corey: For those who are not familiar.

Sharone: [laugh]. Yep. He coined the name.

Corey: The problem that I found is that people when they start companies or they manage their careers, they don't bias for the things that they're really good at. And it took me a long time to realize this, I finally discovered, “Ah, what am I the best at? That's right, getting myself fired for my personality, so why don't I build a business where that stops being a liability?” So, I started my own company. And I can tell this heroic retcon of what happened, but no, it's because I had nowhere else to go at that point.

And would you hire me? Think about this for a minute. You, on the other hand, had options. You are someone with a storied history in community building, in marketing to developers without that either coming across as insincere or that marked condescending accent that so many companies love to have of, “Oh, you're a developer. Let me look at you and get down on my hands and knees like we're going camping and tell a story in ways that actively and passively insult you.”

No, you have always gotten that pitch-perfect. The world was your oyster. And for some godforsaken reason, you looked around and decided, “Ah, I'm going to go out independently because you know what I love?
Worrying.” Because let's face it, running your own company is an exercise in finding new and exciting things to worry about that 20 minutes ago, you didn't know existed. I say this from my own personal experience. Why would you ever do such a thing?

Sharone: [laugh]. That's a great question. It was a long one, but a good one. And I do a thing where I hit the mic a lot because I also have—I can't control my hand motions.

Corey: I too speak with my hands. It's fine.

Sharone: [laugh]. Yeah, so it's interesting because I wanted to be independent for a really long time. And I wasn't sure, you know, if it was something that I could do, if I was a responsible enough adult to even run my own company, if I could make it work, if I could find the business, et cetera. And I left the job in December 2020, and it was the first time that I hadn't figured out what I was doing next yet. And I wanted to take some time off.

And then immediately, like, maybe a week after, I started to get a lot of, like, kind of people reaching out. And I started to interview places and I started to look into possibly being a co-founder at places and I started to look at all these different options. And then just, I was like, “Well…. This is an opportunity, right? Maybe I should finally—that thing that's gnawing at the back of my head to see if, like, you know, if I should go for this dream that I've always wanted, maybe now I can just POC it and see if, you know, it'll work.”

And it just, like, kind of exploded on me. It was like there was so much demand, like, I just put a little, like, signal out to the world that this is something that I'm interested in doing, and everyone was like, “Ahh, I need that.” [laugh]. I wanted to take a quarter off and I signed my first clients already on February 1st, which was, like, a month after I left in December and that—it was crazy. And since then, I've been in business. So, yeah.
So, and since then, it's also been a really crazy ride; I got to discover some really exciting companies. So.

Corey: How did you get into this? I found myself doing marketing-adjacent work almost entirely by accident. I started the newsletter and this podcast, and I was talking to sponsors periodically and they'd come back with, “Here's the thing we want you to talk about in the sponsor read.” And it's, “Okay, you want to give people a URL to go to that has four sub-directories and an entire UTM code… okay, have you considered, I don't know, not?” And because so much of what they were talking about did not resonate.

Because I have the engineering background, and it was, I don't understand what your company does and you're spending all your time talking about you instead of my painful problem. Because as your target market, I don't give the slightest of shits about you, I care about my problem, so tell me how you're going to solve my problem and suddenly I'm all ears. Spend the whole time talking about you, and I could not possibly care less and I'll fast-forward through the nonsense. That was my path to it. How did you get into it?

Sharone: How did I get into it? It's interesting. So, I started my journey in typical marketing, enterprise B2B marketing. And then at GigaSpaces, we kickstarted the open-source project Cloudify, and that's when I found myself leading this project as the open-source community team leader, building, kind of, the community from the ground floor. And I discovered a whole new world of, like, how to build experience into your marketing, kind of making it really experiential and making sure that everyone has a really, really easy and frictionless way of using your product, and that the product—putting the product at the center and letting it speak for itself.
And then you discover this whole new world of marketing where it's—and today, you know, it has more of a name and a title, PLG, and people—it has a whole methodology and practice, but then it was like we were—

Corey: PLG? I'm unfamiliar with the acronym. I thought tech was bad for acronyms.

Sharone: Right? [laugh]. So, product-led growth. But then, you know, like, kind of wasn't solidified yet. And so, a lot of what we were doing was making sure that developers had a really great experience with the product; then it kind of sold itself and marketed itself.

And then you understood what they wanted to hear and how they wanted to consume the product and how they wanted it to be and to learn about it and to kind of educate themselves and get into it. And so, a lot of the things that I learned in the context of marketing was very guerilla, right, from the ground up and kind of getting in front of people and in the way they wanted to consume it. And that taught me a lot about how developers consume technology, the different channels that they're involved in, and the different tools that they need in order to succeed, and the different, you know, all the peripheral experience, that makes marketing really, really great. And it's not about what you're selling to somebody; it's making your product shine and making the experience shine, making them ensure that it's a really, really easy and frictionless experience. You know, I like how [Donald Bacon 00:08:00] says it; he calls it, like, mean time to hello world, and that to me is the best kind of marketing, right? When you enable people to succeed very, very quickly.

Corey: Yeah, there's something to be said for the ring of authenticity and the rest. Periodically I'll promote guest episodes on this, where it's a sponsored episode where people get up and they talk about what they're working on. And they're like, “Great. So, here's the sales pitch I want to give,” and it's no you won't because first, it won't work.
And secondly, I'm sorry, whether it's a promoted episode or not, I will not publish something that isn't good because I have a reputation to uphold here.

And people run into challenges an awful lot when they're trying to effectively tell their story. If you have a startup that was founded by an engineer, for example, as so many of these technical startups were, the engineer is often so deeply and profoundly in love with this problem space and the solution and the rest, but if they talk about that, no one cares about the how. I mean, I fix AWS bills, and people don't care—as a general rule—how I do that at all if they're in my target market. They don't care if it's through clever optimization, amazing tooling, doing it on-site, or taking hostages in Seattle. They care about their outcome much more than they ever do about the how.

The only people who care about the how are engineers who very often are going to want to build it themselves, or work for you, or start a competitor. And it doesn't resonate in quite the same way. It's weird because all these companies are in slightly different spaces; all of them tend to do slightly different things—or very different things—but so many of the challenges that I see in the way that they're articulating what they do to customers rhyme with one another.

Sharone: Yeah. So, I agree completely that developers will talk often about how it works. How it works. How does it work under the hood? What are the bits and bytes, you know?

Like, nobody cares about how it works. People care about how will this make my life better, right? How will this improve my life? How will this change my life? [laugh]. As an operations engineer, if I'm, you know, crunching through logs, how will this tool change that? What will my days look like? What will my on-call rotation look like? What will—you know, how are you changing my life for the better?

So, I think that that's the question.
When you learn how to crystallize the answer to that question and you hit it right on the mark—you know, and it takes a long time to understand the market, and to understand the buying persona, and t—and there's so much that you have to do in the background, and so much research you have to do to understand who is that person that needs to have that question answered? But once you do and you crystallize that answer, it lands. And that's the fun part about marketing, really trying to understand the person who's going to consume your product and how you can help them understand that you will make their life better.

Corey: Back when I was starting out as a consultant myself, I would tell stories that I had seen in the AWS billing environment, and I occasionally had clients reach out to me, “Hey, why don't you tell our story in public?” It's, “Because that wasn't your story. That was something I saw on six different accounts in the same month. It is something that everyone is feeling.” It's, people think that you're talking about them.

So, with that particular mindset on this, without naming specific companies, what themes are you seeing emerging? What are companies getting wrong when they are attempting and failing to market effectively to developers?

Sharone: So, exactly what we're talking about in terms of the product pitch, in that they're talking at developers from this kind of marketing speak and this business language that, you know, developers often—you know, unless a company does a really, really good job of translating, kind of, the business value—which they should do, by the way—to engineers, but oftentimes, it's a little bit far from them in the chain, and so it's very hard for them to understand the business fluff.
If you talk to them in bits and bytes of this is what my day-to-day developer workflow looks like and if we do these things, it'll cut down the time that I'm working on these things, it'll make these things easier, it'll help streamline whatever processes that are difficult, remove these bottlenecks, and help them understand, like I said, how it improves their life.

But the thing that I've seen break down is also the authenticity, right? So obviously, the world is built on a lot of the same gimmicks and it's just a matter of whether you're doing it right or not, right? So, there's so much content out there and webcasts and webinars, and I don't know what and podcasts and whatever it is, but a lot of the time, people, their most valuable asset is their time. And if you end up wasting their time, without it being, like, really deeply valuable—if you're going to write content, make sure that there is a valuable takeaway; if you're going to create a webinar, make sure that somebody learned something. That if they're investing their time to join your marketing activities, make sure that they come away with something meaningful and then they'll really appreciate you.

And it's the same idea behind the whole DevOpsDays movement with the law of mobility and open spaces: that people, if they find value, they'll join this open space and they'll participate meaningfully and they'll be a part of your event, and they'll come back to your event from year to year. But if you're not going to provide that tangible value that somebody takes away, and it's like, okay, well, I can practically apply this in my specific tech stack without using your tool, without having to have this very deterministic or specific kind of tech stack that they're talking about. You want to give people something—or even if it is, but even how to do it with or without, or giving them, like, kind of practical tools to try it.
Or if there's an open-source project that they can check out first, or some kind of lean utility that gives them a good indication of the value that this will give them, that's a lot more valuable, I think. And practically understandable to somebody who wants to eventually consume your product or use your products.

Corey: The way that I see things, at least in the past couple of years, the pandemic has sharpened an awful lot of the messaging that needs to happen. Because in most environments, you're sitting at a DevOpsDays in the front row or whatnot, and it's time for the sponsor talks and someone gets up and starts babbling and wasting your time, most people are not going to get up and leave. Okay, they will in Israel, but in most places, they're not going to get up and leave, whereas in pandemic land, it's you are one tab away from something I actually want—

Sharone: Exactly.

Corey: To be doing, so if you become even slightly boring, it's not going to go well. So, you have to be on message, you have to be on point or no one cares. People are like, “Oh, well what if we say the wrong thing and people wind up yelling about us on Twitter?” It's like unless it is for something horrifying, you should be so lucky because people are then talking about you. The failure mode isn't that people don't like your product, it's that no one talks about it.

Sharone: Yeah. No such thing as bad publicity [crosstalk 00:14:32] [laugh]—

Corey: Oh, there very much is such a thing as bad publicity. Like, “I could be tweeting about your product most days,” is apparently a version of that, according to some folks. But it's a hard problem to solve for. And one of the things that continually surprises me is the things I'm still learning about this entire industry. The reason that people sponsor this show—and the rates they pay, to be direct—has little bearing on the actual size of the audience—as best we can tell; lies, damn lies, and podcast statistics; if you're listening to this, let me know.
I'd love to know if anyone listens to this nonsense—but when you see all of that coming out, why are we able to charge the rates that we do?

It's because of the long-term value: someone who is going to buy a long-term subscription or wind up rolling out something like ChaosSearch or whatnot that is going to be a fundamental tenet of their product, one prospect becoming a customer pays for anything. I can sell a company, it will sponsor—they can pay me to sponsor for the next ten years, as opposed to the typical mass-market audience where well, I'm here to sling Casper mattresses today or something. It's a different audience and there's a different perception there. People are starting to figure out the value of—in an age where tracking is getting harder and harder to do and attribution will drive you nuts—instead, go where your audience is. Go where the people who care about the problem that you have and will experience that problem are going to hang out. And it always is wild to me to see companies missing out on that.

It's, “Okay, so you're going to do a $25 million billboard ad spotted in airports around the world talking about your company… but looking at your billboard, it makes no sense. I don't understand what it's there for.” Even as a brand awareness play, it fails because your logo is tiny in the corner or something. It's you spent that much money on ads, and maybe a buck on messaging because it seems like with all that attention you just bought, you had nothing worthwhile to say. That's the cardinal sin to me at least.

Sharone: Yeah. One thing that I found—and back to our community circuit and things that we've done historically—but that's one thing that, you know, as a person who comes from community, I've seen so much value, even from the smaller events. I mean, today, like with Covid and the pandemic and everything, has changed all the equilibrium and the way things are happening.
But some meetups are getting smaller, face-to-face events are getting smaller, but I've had people telling me that even from small, 30 to 40 people events, they'll go up and they'll do a talk and great, okay, a talk; everybody does talks, but it's like, kind of, the hallway track or the networking that you do after the talk and you actually talk to real users and hear their real problems and you tap into the real community. And some people will tell me like, I had four concrete leads from a 30-person meetup just because they didn't even know that this was a real challenge, or they didn't know that there was a tool that solves this problem, or they didn't understand that this can actually be achieved today.

Or there's so many interesting technologies and emerging technologies. I'm privileged to be able to be at the forefront of that and discover it all, and if I could, I would drop names of all of the awesome companies that work for me, that I work with, and just give them a shout out. But really, there's so many amazing companies doing, like, developer metrics, and all kinds of troubleshooting and failure analysis that's, like, deeply intelligent—and you're going to love this one: I have a Git replacement client apropos to your closing keynote of DevOpsDays 2015—and tapping into the communities and tapping into the real users.

And sometimes, you know, it's just a matter of really understanding how developers are working, what processes look like, what workflows look like, what teams look like, and being able to architect your products and things around real use cases.
And that you can only discover by really getting in front of actual users, or potential users, and learning from them and feedback loops, and that's the little core behind DevRel and developer advocacy is really understanding your actual users and your consumers, and encouraging them to, you know, give you feedback and try things, and beta programs and a million things that are a lot more experiential today that help you understand what your users need, eventually, and how to actually architect that into your products. And that's the important part in terms of marketing. And it's a whole different marketing set. It's a whole different skill set. It's not talking at people, it's actually… ingesting and understanding and hearing and implementing and bringing it into your products.

Corey: And it takes time. And you have to make yourself synonymous with a painful problem. And those problems are invariably very point-in-time specific. I don't give a crap about log aggregation today, but two weeks from now, when I'm trying to chase down 18 different Lambda functions trying to figure out what the hell's broken this week, I suddenly will care very much about log aggregation. Who was that company that's in that space that's doing interesting things? And maybe it's Cribl, for example; they do a lot of stuff in that space and they've been a good sponsor. Great.

I start thinking about those things in that light because it is—when I started having these problems, it sticks in your head and it resonates. And there's value and validity to that, but you're never going to be able to attribute that either, which is where people often lose their minds. Because for anything even slightly complicated—you're going to be selling things to a big bank—great, good on you. Most of those customers are not going to go and spin up a trial in the dead of night.
They're going to hear about you somewhere and think, “Ohh, this is interesting.”

They're going to talk about it in a meeting, they're going to get approval, and at that point, you have long since lost any tracking opportunity there. So, the problem is that by saying it like this, as someone who is a publisher, let's be very clear here, it sounds like you're trying to justify your entire business model. I feel like that half the time, but I've been reassured by people who are experts in doing these things, like, oh, yeah, we have data on this; it's working. So, the alternative is either I accept that they're right or I sit here and arrogantly presume I know more about marketing than people who've devoted their entire careers to it. I'm not that bold. I am a white guy in tech, but not that much.

Sharone: Yeah, I mean, the DevRel measurement problem is a known problem. We have people like [unintelligible 00:20:21] who have written about it. We have [Sarah Drasner 00:20:23], we have a million people that have written really, really great content about how do you really measure DevRel and the quality. And one of the things that I liked, Philipp Krenn, the dev advocate at Elastic, once said in one of his talks that, you know, “If you're measuring your developer advocates on leads, you're a marketing organization. If you're measuring them on revenue, you're a sales organization. It's about reach, engagement, and awareness, and a lot of things that it's much, much harder to measure.”

And I can say that, like, once upon a time, I used to try and attribute it at Cloudify.
Like, I remember thinking, like, “Okay, maybe I could really track this back to, you know, the first touch that I actually had with this user.” It's really, really difficult, but I do remember, like, when we used to go out into the events and we were really active in the OpenStack community, in the DevOps community, and many other things, and I remember, like, even after events, like, you get all those lead gen emails. All I would say now is, like, “Hey, if you missed us at the booth, you know, and you still want a t-shirt, you know, reach out and I'll ship it to you.” And some of those eventually, after we continued the relationship, and we, you know, when we were friends and community friends, six months later, when they moved to their next role at their next job, they were like, “Oh, now I have an opportunity to use Cloudify and I'm going to check it out.”

And it's a very long relationship that you have to cultivate. It has to be, you know, mutual. You have to be giving something and eventually it's going to come back to you. Good deeds come back to you. So, I—that's my credo, by the way, good deeds come back to you. I believe in that and I try to live by that.

Corey: This episode is sponsored in part by our friends at EnterpriseDB. EnterpriseDB has been powering enterprise applications with PostgreSQL for 15 years. And now EnterpriseDB has you covered wherever you deploy PostgreSQL: on-premises, private cloud, and they just announced a fully-managed service on AWS and Azure called BigAnimal, all one word. Don't leave managing your database to your cloud vendor because they're too busy launching another half-dozen managed databases to focus on any one of them that they didn't build themselves. Instead, work with the experts over at EnterpriseDB. They can save you time and money, they can even help you migrate legacy applications—including Oracle—to the cloud. To learn more, try BigAnimal for free.
Go to biganimal.com/snark, and tell them Corey sent you.

Corey: So, I have one last question for you and it is pointed and the reason I buried it this deep in the episode is so that if I open with it, I will get letters and I'm hoping to get fewer of them. But I met you, again, at DevOpsDays Tel Aviv, and it was glorious. And then you said, “This is fun. Come help me organize it next year.”

And I, like an idiot, said, “Sure, that sounds awesome because I love going to conferences and it's great. So, what's involved?” “Oh, a whole bunch of meetings.” “Okay, great.” “And planning”—things I'm terrible at—“Okay.” And then the big day finally arrives where, “Great, when do we get to get on stage and tell a story?” Like, “That's the neat part. We don't.” So, I have to ask, given that it is all behind-the-scenes work that is fairly thankless unless you really screw it up because then it's very visible, what is the point of being so involved in the community?

Sharone: Wow, that's a big question, Corey.

Corey: It really is.

Sharone: [laugh].

Corey: Because you've been involved in community for a long time and you're very good at it.

Sharone: It's true. It's true. Appreciate it, thank you. So, for me, first of all, I enjoy, kind of, the people aspect of it, absolutely. And that people aspect of it actually has played out in so many different ways.

Corey: Oh, you mean great people, and also me.

Sharone: [laugh]. Particularly you, Corey, and we will bring you back. [laugh]. And we will make sure you chop wood and carry water because eventually it'll fill your soul, you'll see. [laugh]. One of the things that really I have had the privilege and honor of, having come out of, like, kind of all my community work, is really the network I've built and the people that I've met.

And I've learned so much and I've grown so much, but I've also had the opportunity to connect people, connect things that you wouldn't imagine, un—seemingly-related things.
So, there are so many friends of mine that have grown up with me in this community, it's been already ten years now, and a lot of folks have now been going on to new adventures and are looking to kickstart their new startup and I can connect them to this investor, I can connect them to this other person who is maybe a good, you know, partner for their startup, and hiring opportunities, and something—I've had this, like, privilege of kind of being able to connect Israel to the outer world and other things and the global kind of community, and also bring really intelligent folks into the community. And this has just created this amazing flywheel of opportunity that I'm really happy to be at the center of. And I think I've grown as a person, I think our community has grown, has learned, and there's a lot of value in that, I think, yeah. We got to meet wonderful folks like you, Corey. [laugh].

Corey: It has its moments. Again, you're one of those rarities in that it's almost become a trope in VC land where VCs always like, “How may I be useful?” And it's this self-serving transparent thing. Every single time you have deigned to introduce me to someone, it's been a productive conversation and I'm always glad I took the meeting. That is no small thing.

A lot of people say, “I'm good at community,” which is sort of cover for, “I'm not good at anything,” but in your case, it—

Sharone: [laugh]. [I'm an entrepreneur 00:24:48].—

Corey: Is very much not true. Oh, yeah. I'm a big believer that ‘entrepreneur' and ‘hero' and other terms like that are things people call you; you don't call yourself that. It always feels weird for, “Oh, he's an entrepreneur.” It's like, that's a pretty lofty word for shitposting, but okay, we'll roll with it.

It doesn't work that way.
You've clearly invested long-term in building a reputation for yourself by building a name for yourself in the space, and I know that whenever you reach out to me as a result, you are not there to waste my time or shill some bullshit. It is always something that is going to, even if I don't love every aspect of it or agree with the core of the message you're sending, great, it is never not going to be worth my time, which is why I'm so glad I got the chance to talk to you on this show.

Sharone: I appreciate that. It's something that I really believe in, I don't want to waste people's time and I really only will connect folks or only really will reach out to someone if I do think that there's something meaningful for both sides. It's never only what's in it for me, also. I also want to make sure that there's something in it for the other person and it's something that makes sense and it's meaningful for both sides. I've had the opportunity of meeting such interesting folks, and sometimes it's just like, “You must meet. [laugh]. You will love each other.” You will have so much to do together or it's so much collaboration opportunity.

And so yeah, I really am that type of person. And I'll even say from a personal perspective, you know, I know a lot of people, and I've even been asked from the flip side, “Okay, is this a toxic manager? Or is this a, you know, a good hire? Is this”—and I try to provide really authentic input so people make the right decisions, or make, you know, the right contacts, or make—and that's something I really value. And I managed to build trust with a lot of really great folks—

Corey: And also me—

Sharone: —and it's come back to me, also. And—[laugh] and particularly you, again. [laugh].

Corey: If people want to learn more about how you see the world and the space and otherwise bask in your wisdom, where's the best place to find you?

Sharone: So, I'm on Twitter as @shar1z, which is SharoneZ.
Basically, everyone thinks it's such a smart, or I don't know what, like, or an esoteric screen name. And I'm like, no, it's just my name, I just—the O-N-E is… the one. [laugh].

So yes, shar1z on Twitter, but also my website, rtfmplease.dev, you can reach out, there's a contact form there. You can find me on the web anywhere—LinkedIn. Reach out, I answer almost all my DMs when I can. It's very rare that I don't answer DMs. Maybe there'll be a slight lag, but I do. And I really do like when folks reach out to me. I do like it when people try and make contact.

Corey: And you can also be found, of course, wherever fine DevOps products are sold, on stage apparently.

Sharone: [laugh]. The DevOps community, that's right. @TLVCommunity, @DevOpsDaysTLV—don't out me. All those are—yes, those are also handles that I run on Twitter, it's true.

Corey: Excellent.

Sharone: So, when you see them all retweeting the same tweet, yes, it's happening within the same five minutes, it's me.

Corey: Oh, that would have made it way easier to go viral. My God, I should have just thought of that earlier.

Sharone: [laugh].

Corey: Thank you so much for your time. I appreciate it.

Sharone: Thank you, Corey, for having me. It's been a privilege and honor being on your show and I really do think that you are doing wonderful things in the cloud space. You're teaching us, and we're all learning, and you—keep up the good work.

Corey: Well, thank you. I appreciate that.

Sharone: I also want to add that, apropos marketing and whatever, I do actually listen to all of your openings of all of your shows because they're not fluffy and I like that you do, like, kind of a deep explanation, a deep technical explanation of what your sponsoring product does, and it gives a lot more insight into why is this important. So, I think you're doing that right. So, anybody who's sponsoring this show, listen. Corey knows what he's doing.

Corey: Well, thank you. I appreciate that.
Yay, “I know what I'm doing.” That one's going in the testimonial kit. My God.

Sharone: [laugh]. That's the name of this episode, “Corey knows what he's doing.”

Corey: We're going to roll with it, you know. No take-backsies. Sharone Zitzman, Chief Manual Reader at RTFM Please. I'm Cloud Economist Corey Quinn and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, or if it's on the YouTubes, smash the like and subscribe buttons, whereas if you've hated this show, exact same thing—five-star review wherever you happen to find it, smash both the buttons—but also leave an insulting comment telling me that I'm completely wrong, which then devolves into an 18-page diatribe about exactly how your nonsense, bullshit product is built and works.

Sharone: [laugh].

Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.

Announcer: This has been a HumblePod production. Stay humble.

Lambda3 Podcast
Lambda3 Podcast 301 – Tretas da organização de casamento


Play Episode Listen Later May 27, 2022 56:46


In this episode of the podcast, some Lambda3 folks who have already been through the experience of organizing a wedding party talk about the joys and sorrows of this process - and the dramas that crop up along the way. To add to the topic, we invited Silvia Pereira, one of the partners at Inesquecível Assessoria, who joined us for this (to say the least) fun chat. Join our Telegram group and share your comments with us: https://lb3.io/telegram Podcast feed: www.lambda3.com.br/feed/podcast Podcast feed with technical episodes only: www.lambda3.com.br/feed/podcast-tecnico Podcast feed with non-technical episodes only: www.lambda3.com.br/feed/podcast-nao-tecnico Lambda3 · #301 - Tretas da organização de casamento Agenda: Whether or not to hire a wedding planner; Traditional vs. original; Finding a venue for the party; Decoration; Guest list; Running order of the party (ceremony, photos, the tie auction, the bouquet toss, ...); Ceremony - church / hall / outdoors. Participants: Camila Alves - @camilaalvescp Fernando Okuma - @feokuma Izabela Oliveira - @izabelaoliveira Juliana Ruiz - @juliana-guerra-ruiz Silvia Pereira - Inesquecível Assessoria - @inesquecivelassessoria Vanda Ferreira dos Santos - @vandafdsantos Links: Inesquecível Assessoria - Website Inesquecível Assessoria - YouTube Editing: Compasso Coolab Music credits for this program: Music by Kevin MacLeod (incompetech.com) licensed under Creative Commons: By Attribution 3.0 - creativecommons.org/licenses/by/3.0

airhacks.fm podcast with adam bien
Real World Enterprise Serverless Java on AWS Cloud


Play Episode Listen Later May 15, 2022 66:41


An airhacks.fm conversation with Goran Opacic (@goranopacic) about: sales force automation at ehsteh, Palm Pilot syncing, starting a SaaS company, hetzner, Azure, then AWS, running EC2 machines, going serverless, kubernetes and the clouds, running MicroProfile applications on Quarkus and AWS Lambda, one code base - multiple lambdas, Lambda runs on Firecracker VM, OkHTTP on Lambdas, tree shaking with GraalVM, AWS CodeArtifact to cache Maven repositories, Amazon ECR, AWS CodeCommit, databases are hard to split, AWS CodeDeploy with scheduler, code hot swap, managed services is serverless, running AWS Fargate on spot instances, using Eclipse BIRT on AWS Lambda, Goran is an AWS Data Hero, Goran Opacic on twitter: @goranopacic, Goran's blog: madabout.cloud

Screaming in the Cloud
Reliability Starts in Cultural Change with Amy Tobey


Play Episode Listen Later May 11, 2022 46:37


About Amy
Amy Tobey has worked in tech for more than 20 years at companies of every size, working with everything from kernel code to user interfaces. These days she spends her time building an innovative Site Reliability Engineering program at Equinix, where she is a principal engineer. When she's not working, she can be found with her nose in a book, watching anime with her son, making noise with electronics, or doing yoga poses in the sun.
Links Referenced: Equinix Metal: https://metal.equinix.com Personal Twitter: https://twitter.com/MissAmyTobey Personal Blog: https://tobert.github.io/
Transcript
Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: This episode is sponsored in part by our friends at Vultr. Optimized cloud compute plans have landed at Vultr to deliver lightning-fast processing power, courtesy of third-gen AMD EPYC processors without the IO or hardware limitations of a traditional multi-tenant cloud server. Starting at just 28 bucks a month, users can deploy general-purpose, CPU, memory, or storage optimized cloud instances in more than 20 locations across five continents. Without looking, I know that once again, Antarctica has gotten the short end of the stick. Launch your Vultr optimized compute instance in 60 seconds or less on your choice of included operating systems, or bring your own. It's time to ditch convoluted and unpredictable giant tech company billing practices and say goodbye to noisy neighbors and egregious egress forever. Vultr delivers the power of the cloud with none of the bloat. “Screaming in the Cloud” listeners can try Vultr for free today with $150 in credit when they visit getvultr.com/screaming.
That's G-E-T-V-U-L-T-R dot com slash screaming. My thanks to them for sponsoring this ridiculous podcast.Corey: Finding skilled DevOps engineers is a pain in the neck! And if you need to deploy a secure and compliant application to AWS, forgettaboutit! But that's where DuploCloud can help. Their comprehensive no-code/low-code software platform guarantees a secure and compliant infrastructure in as little as two weeks, while automating the full DevSecOps lifecycle. Get started with DevOps-as-a-Service from DuploCloud so that your cloud configurations are done right the first time. Tell them I sent you and your first two months are free. To learn more visit: snark.cloud/duplo. That's snark.cloud/D-U-P-L-O-C-L-O-U-D.Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. Every once in a while I catch up with someone that it feels like I've known for ages, and I realize somehow I have never been able to line up getting them on this show as a guest. Today is just one of those days. And my guest is Amy Tobey who has been someone I've been talking to for ages, even in the before-times, if you can remember such a thing. Today, she's a Senior Principal Engineer at Equinix. Amy, thank you for finally giving in to my endless wheedling.Amy: Thanks for having me. You mentioned the before-times. Like, I remember it was, like, right before the pandemic we had beers in San Francisco wasn't it? There was Ian there—Corey: Yeah, I—Amy: —and a couple other people. It was a really great time. And then—Corey: I vaguely remember beer. Yeah. And then—Amy: And then the world ended.Corey: Oh, my God. Yes. It's still March of 2020, right?Amy: As far as I know. Like, I haven't checked in a couple years.Corey: So, you do an awful lot. And it's always a difficult question to ask someone, so can you encapsulate your entire existence in a paragraph? It's—Amy: [sigh].Corey: —awful, so I'd like to give a bit more structure to it.
Let's start with the introduction: You are a Senior Principal Engineer. We know it's high level because of all the adjectives that get put in there, and none of those adjectives are ‘associate' or ‘beginner' or ‘junior,' or all the other diminutives that companies like to play games with to justify paying people less. And you're at Equinix, which is a company that is a bit unlike most of the, shall we say, traditional cloud providers. What do you do over there and both as a company, as a person?Amy: So, as a company Equinix, what most people know about is that we have a whole bunch of data centers all over the world. I think we have the most of any company. And what we do is we lease out space in that data center, and then we have a number of other products that people don't know as well, which one is Equinix Metal, which is what I specifically work on, where we rent you bare-metal servers. None of that fancy stuff that you get any other clouds on top of it, there's things you can get that are… partner things that you can add-on, like, you know, storage and other things like that, but we just deliver you bare-metal servers with really great networking. So, what I work on is the reliability of that whole system. All of the things that go into provisioning the servers, making them come up, making sure that they get delivered to the server, make sure the API works right, all of that stuff.Corey: So, you're on the Equinix cloud side of the world more so than you are on the building data centers by the sweat of your brow, as they say?Amy: Correct. Yeah, yeah. Software side.Corey: Excellent. I spent some time in data centers in the early part of my career before cloud ate that. That was sort of cotemporaneous with the discovery that I'm the hardware destruction bunny, and I should go to great pains to keep my aura from anything expensive and important, like, you know, the SAN. 
So—Amy: Right, yeah.Corey: Companies moving out of data centers, and me getting out was a great thing.Amy: But the thing about SANs though, is, like, it might not be you. They're just kind of cursed from the start, right? They just always were kind of fussy and easy to break.Corey: Oh, yeah. I used to think—and I kid you not—that I had a limited upside to my career in tech because I sometimes got sloppy and I was fairly slow at crimping ethernet cables.Amy: [laugh].Corey: That is very similar to growing up in third grade when it became apparent that I was going to have problems in my career because my handwriting was sloppy. Yeah, it turns out the future doesn't look like we predicted it would.Amy: Oh, gosh. Are we going to talk about, like, neurological development now or… [laugh] okay, that's a thing I struggle with, too right, is I started typing as soon as they would let—in fact, before they would let me. I remember in high school, I had teachers who would grade me down for typing a paper out. They want me to handwrite it and I would go, “Cool. Go ahead and take a grade off because if I handwrite it, you're going to take two grades off my handwriting, so I'm cool with this deal.”Corey: Yeah, it was pretty easy early on. I don't know when the actual shift was, but it became more and more apparent that more and more things are moving towards a world where you could type. And I was almost five when I started working on that stuff, and that really wound up changing a lot of aspects of how I started seeing things. One thing I think you're probably fairly well known for is incidents. I want to be clear when I say that you are not the root cause as—“So, why are things broken?” “It's Amy again. What's she gotten into this time?” Great.Amy: [laugh]. 
But it does happen, but not all the time.Corey: Exa—it's a learning experience.Amy: Right.Corey: You've also been deeply involved with SREcon and a number of—a lot of aspects of what I will term—and please don't yell at me for this—SRE culture—Amy: Yeah.Corey: Which is sometimes a challenging thing to wind up describing or putting a definition around. The one that I've always been somewhat partial to is, “SRE is DevOps, except you worked at Google for a while.” I don't know how necessarily accurate that is, but it does rile people up.Amy: Yeah, it does. Dave Stanke actually did a really great talk at SREcon San Francisco just a couple weeks ago, about the DORA report. And the new DORA report, they split SRE out into its own function and kind of is pushing against that old model, which actually comes from Liz Fong-Jones—I think it's from her, or older—about, like, class SRE implements DevOps, which is kind of this idea that, like, SREs make DevOps happen. Things have evolved, right, since then. Things have evolved since Google released those books, and we're all just figured out what works and what doesn't a little bit.And so, it's not that we're implementing DevOps so much. In fact, it's that ops stuff that kind of holds us back from the really high impact work that SREs, I think, should be doing, that aren't just, like, fixing the problems, the symptoms down at the bottom layer, right? Like what we did as sysadmins 20 years ago. You know, we'd go and a lot of people are SREs that came out of the sysadmin world and still think in that mode, where it's like, “Well, I set up the systems, and when things break, I go and I fix them.” And, “Why did the developers keep writing crappy code? Why do I have to always getting up in the middle of the night because this thing crashed?”And it turns out that the work we need to do to make things more reliable, there's a ceiling to how far away the platform can take us, right? 
Like, we can have the best platform in the world with redundancy, and, you know, nine-way replicated data storage and all this crazy stuff, and still if we put crappy software on top, it's going to be unreliable. So, how do we make less crappy software? And for most of my career, people would be, like, “Well, you should test it.” And so, we started doing that, and we still have crappy software, so what's going on here? We still have incidents.So, we write more tests, and we still have incidents. We had a QA group, we still have incidents. We send the developers to training, and we still have incidents. So like, what is the thing we need to do to make things more reliable? And it turns out, most of it is culture work.Corey: My perspective on this stems from being a grumpy old sysadmin. And at some point, I started calling myself a systems engineer or DevOps or production engineer, or SRE. It was all from my point of view, the same job, but you know, if you call yourself a sysadmin, you're just asking for a 40% pay cut off the top.Amy: [laugh].Corey: But I still tended to view the world through that lens. I tended to be very good at Linux systems internals, for example, understanding system calls and the rest, but increasingly, as the DevOps wave or SRE wave, or Google-isation of the internet wound up being more and more of a thing, I found myself increasingly in job interviews, where, “Great, now, can you go wind up implementing a sorting algorithm on the whiteboard?” “What on earth? No.” Like, my lingua franca is shitty Bash, and no one tends to write that without a bunch of tab completions and quick checking with manpages—die.net or whatnot—on the fly as you go down that path.And it was awful, and I felt… like my skill set was increasingly eroding. And it wasn't honestly until I started this place where I really got into writing a fair bit of code to do different things because it felt like an orthogonal skill set, but the fullness of time, it seems like it's not. 
And it's a reskilling. And it made me wonder, does this mean that the areas of technology that I focused on early in my career, was that all a waste? And the answer is not really. Sometimes, sure, in that I don't spend nearly as much time worrying about inodes—for example—as I once did. But every once in a while, I'll run into something and I looked like a wizard from the future, but instead, I'm a wizard from the past.Amy: Yeah, I find that a lot in my work, now. Sometimes things I did 20 years ago, come back, and it's like, oh, yeah, I remember I did all that threading work in 2002 in Perl, and I learned everything the very, very, very hard way. And then, you know, this January, did some threading work to fix some stability issues, and all of it came flooding back, right? Just that the experiences really, more than the code or the learning or the text and stuff; more just the, like, this feels like threads [BLEEP]-ery. Is a diagnostic thing that sometimes we have to say.And then people are like, “Can you prove it?” And I'm like, “Not really,” because it's literally thread [BLEEP]-ery. Like, the definition of it is that there's weird stuff happening that we can't figure out why it's happening. There's something acting in the system that isn't synchronized, that isn't connected to other things, that's happening out of order from what we expect, and if we had a clear signal, we would just fix it, but we don't. We just have, like, weird stuff happening over here and then over there and over there and over there.And, like, that tells me there's just something happening at that layer and then have to go and dig into that right, and like, just basically charge through. My colleagues are like, “Well, maybe you should look at this, and go look at the database,” the things that they're used to looking at and that their experiences inform, whereas then I bring that ancient toiling through the threading mines experiences back and go, “Oh, yeah. 
So, let's go find where this is happening, where people are doing dangerous things with threads, and see if we can spot something.” But that came from that experience.Corey: And there's so much that just repeats itself. And history rhymes. The challenge is that, do you have 20 years of experience, or do you have one year of experience repeated 20 times? And as the tide rises, doing the same task by hand, it really is just a matter of time before your full-time job winds up being something a piece of software does. An easy example is, “Oh, what's your job?” “I manually place containers onto specific hosts.” “Well, I've got news for you, and you're not going to like it at all.”Amy: Yeah, yeah. I think that we share a little bit. I'm allergic to repeated work. I don't know if allergic is the right word, but you know, if I sit and I do something once, fine. Like, I'll just crank it out, you know, it's this form, or it's a datafile I got to write and I'll—fine I'll type it in and do the manual labor.The second time, the difficulty goes up by ten, right? Like, just mentally, just to do it, be like, I've already done this once. Doing it again is anathema to everything that I am. And then sometimes I'll get through it, but after that, like, writing a program is so much easier because it's like exponential, almost, growth in difficulty. You know, the third time I have to do the same thing that's like just typing the same stuff—like, look over here, read this thing and type it over here—I'm out; I can't do it. You know, I got to find a way to automate. And I don't know, maybe normal people aren't driven to live this way, but it's kept me from getting stuck in those spots, too.Corey: It was weird because I spent a lot of time as a consultant going from place to place and it led to some weird changes. For example, “Oh, thank God, I don't have to think about that whole messaging queue thing.” Sure enough, next engagement, it's message queue time. Fantastic. 
I found that repeating myself drove me nuts, but you also have to be very sensitive not to wind up, you know, stealing IP from the people that you're working with.Amy: Right.Corey: But what I loved about the sysadmin side of the world is that the vast majority of stuff that I've taken with me, lives in my shell config. And what I mean by that is I'm not—there's nothing in there is proprietary, but when you have a weird problem with trying to figure out the best way to figure out which Ruby process is stealing all the CPU, great, turns out that you can chain seven or eight different shell commands together through a bunch of pipes. I don't want to remember that forever. So, that's the sort of thing I would wind up committing as I learned it. I don't remember what company I picked that up at, but it was one of those things that was super helpful.I have a sarcastic—it's a one-liner, except no sane editor setting is going to show it in any less than three—of a whole bunch of Perl, piped into du, piped into the rest, that tells you one of the largest consumers of files in a given part of the system. And it rates them with stars and it winds up doing some neat stuff. I would never sit down and reinvent something like that today, but the fact that it's there means that I can do all kinds of neat tricks when I need to. It's making sure that as you move through your career, on some level, you're picking up skills that are repeatable and applicable beyond one company.Amy: Skills and tooling—Corey: Yeah.Amy: —right? Like, you just described the tool. Another SREcon talk was John Allspaw and Dr. Richard Cook talking about above the line; below the line. And they started with these metaphors about tools, right, showing all the different kinds of hammers.And if you're a blacksmith, a lot of times you craft specialized hammers for very specific jobs. 
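(As an aside: Corey's actual Perl-and-du one-liner isn't shown in the episode, but the shape of that kind of "largest consumers of files" trick is just du piped into sort. A minimal sketch, with a throwaway sample directory invented here for illustration:)

```shell
# Build a small throwaway tree so the pipeline has something to measure.
dir=$(mktemp -d)
mkdir -p "$dir/big" "$dir/small"
head -c 100000 /dev/zero > "$dir/big/blob"    # ~100 KB file
head -c 1000   /dev/zero > "$dir/small/blob"  # ~1 KB file

# Per-directory usage in kilobytes, largest first. The heart of the trick is
# du | sort -rn; everything else (stars, formatting) is presentation.
du -k "$dir"/big "$dir"/small | sort -rn | head -5
```

The same two-stage pipeline works on any directory you point it at; the star ratings in the original would just be an awk or Perl formatting pass bolted on the end.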
And that's one of the properties of a tool that they were trying to get people to think about, right, is that tools get crafted to the job. And what you just described as a bespoke tool that you had created on the fly, that kind of floated under the radar of intellectual property. [laugh].So, let's not tell the security or IP people right? Like, because there's probably billions and billions of dollars of technically, like, made-up IP value—I'm doing air quotes with my fingers—you know, that's just basically people's shell profiles. And my God, the Emacs automation that people have done. If you've ever really seen somebody who's amazing at Emacs and is 10, 20, 30, maybe 40 years of experience encoded in their emacs settings, it's a wonder to behold. Like, I look at it and I go, “Man, I wish I could do that.”It's like listening to a really great guitar player and be like, “Wow, I wish I could play like them.” You see them just flying through stuff. But all that IP in there is both that person's collection of wisdom and experience and working with that code, but also encodes that stuff like you described, right? It's just all these little systems tricks and little fiddly commands and things we don't want to remember and so we encode them into our toolset.Corey: Oh, yeah. Anything I wound up taking, I always would share it with people internally, too. I'd mention, “Yeah, I'm keeping this in my shell files.” Because I disclosed it, which solves a lot of the problem. And also, none of it was even close to proprietary or anything like that. I'm sorry, but the way that you wind up figuring out how much of a disk is being eaten up and where in a more pleasing way, is not a competitive advantage. It just isn't.Amy: It isn't to you or me, but, you know, back in the beginning of our careers, people thought it was worth money and should be proprietary. 
You know, like, oh, that disk-checking script as a competitive advantage for our company because there are only a few of us doing this work. Like, it was actually being able to, like, manage your—[laugh] actually manage your servers was a competitive advantage. Now, it's kind of commodity.Corey: Let's also be clear that the world has moved on. I wound up buying a DaisyDisk a while back for Mac, which I love. It is a fantastic, pretty effective, “Where's all the stuff on your disk going?” And it does a scan and you can drive and collect things and delete them when trying to clean things out. I was using it the other day, so it's top of mind at the moment.But it's way more polished than that crappy Perl three-liner. And I see both sides, truly I do. The trick also, for those wondering [unintelligible 00:15:45], like, “Where is the line?” It's super easy. Disclose it, what you're doing, in those scenarios in the event someone is no because they believe that finding the right man page section for something is somehow proprietary.Great. When you go home that evening in a completely separate environment, build it yourself from scratch to solve the problem, reimplement it and save that. And you're done. There are lots of ways to do this. Don't steal from your employer, but your employer employs you; they don't own you and the way that you think about these problems.Every person I've met who has had a career that's longer than 20 minutes has a giant doc somewhere on some system of all of the scripts that they wound up putting together, all of the one-liners, the notes on, “Next time you see this, this is the thing to check.”Amy: Yeah, the cheat sheet or the notebook with all the little commands, or again the Emacs config, sometimes for some people, or shell profiles. 
Yeah.Corey: Here's the awk one-liner that I put that automatically spits out from an Apache log file what—the httpd log file that just tells me what are the most frequent talkers, and what are the—Amy: You should probably let go of that one. You know, like, I think that one's lifetime is kind of past, Corey. Maybe you—Corey: I just have to get it working with Nginx, and we're good to go.Amy: Oh, yeah, there you go. [laugh].Corey: Or S3 access logs. Perish the thought. But yeah, like, what are the five most high-volume talkers, and what are those relative to each other? Huh, that one thing seems super crappy and it's coming from Russia. But that's—hmm, one starts to wonder; maybe it's time to dig back in.So, one of the things that I have found is that a lot of the people talking about SRE seem to have descended from an ivory tower somewhere. And they're talking about how some of the best-in-class companies out there, renowned for their technical cultures—at least externally—are doing these things. But there's a lot more folks who are not there. And honestly, I consider myself one of those people who is not there. I was a competent engineer, but never a terrific one.And looking at the way this was described, I often came away thinking, “Okay, it was the purpose of this conference talk just to reinforce how smart people are, and how I'm not,” and/or, “There are the 18 cultural changes you need to make to your company, and then you can do something kind of like we were just talking about on stage.” It feels like there's a combination of problems here. One is making this stuff more accessible to folks who are not themselves in those environments, and two, how to drive cultural change as an individual contributor if that's even possible. And I'm going to go out on a limb and guess you have thoughts on both aspects of that, and probably some more hit me, please.Amy: So, the ivory tower, right. Let's just be straight up, like, the ivory tower is Google. 
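(Backing up to Corey's Apache-log "top talkers" one-liner for a moment: the episode doesn't show the actual command, but a hypothetical reconstruction is mostly just awk and sort, since the common/combined log format, Nginx's default included, puts the client address in the first field. The sample log lines below are invented:)

```shell
# Fake a few common-log-format lines; field 1 is the client address.
log=$(mktemp)
cat > "$log" <<'EOF'
203.0.113.9 - - [10/May/2022:13:55:36 +0000] "GET / HTTP/1.1" 200 512
203.0.113.9 - - [10/May/2022:13:55:37 +0000] "GET /a HTTP/1.1" 200 128
198.51.100.4 - - [10/May/2022:13:55:38 +0000] "GET / HTTP/1.1" 200 512
EOF

# Count hits per client and print the five most frequent talkers.
awk '{ hits[$1]++ } END { for (ip in hits) print hits[ip], ip }' "$log" \
  | sort -rn | head -5
```

Swapping it over to Nginx or S3 access logs is mostly a matter of which field holds the client address.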
I mean, that's where it started. And we get it from the other large companies that, you know, want to do conference talks about what this stuff means and what it does. What I've kind of come around to in the last couple of years is that those talks don't really reach the vast majority of engineers, they don't really apply to a large swath of the enterprise especially, which is, like, where a lot of the—the bulk of our industry sits, right? We spend a lot of time talking about the darlings out here on the West Coast in high tech culture and startups and so on.But, like, we were talking about before we started the show, right, like, the interior of even just America, is filled with all these, like, insurance and banks and all of these companies that are cranking out tons of code and servers and stuff, and they're trying to figure out the same problems. But they're structured in companies where their tech arm is still, in most cases, considered a cost center, often is bundled under finance, for—that's a whole show of itself about that historical blunder. And so, the tech culture is tend to be very, very different from what we experience in—what do we call it anymore? Like, I don't even want to say West Coast anymore because we've gone remote, but, like, high tech culture we'll say. And so, like, thinking about how to make SRE and all this stuff more accessible comes down to, like, thinking about who those engineers are that are sitting at the computers, writing all the code that runs our banks, all the code that makes sure that—I'm trying to think of examples that are more enterprise-y right?Or shoot buying clothes online. You go to Macy's for example. They have a whole bunch of servers that run their online store and stuff. They have internal IT-ish people who keep all this stuff running and write that code and probably integrating open-source stuff much like we all do. 
But when you go to try to put in a reliability program that's based on the current SRE models, like SLOs; you put in SLOs and you start doing, like, this incident management program that's, like, you know, you have a form you fill out after every incident, and then you [unintelligible 00:20:25] retros.And it turns out that those things are very high-level skills, skills and capabilities in an organization. And so, when you have this kind of IT mindset or the enterprise mindset, bringing the culture together to make those things work often doesn't happen. Because, you know, they'll go with the prescriptive model and say, like, okay, we're going to implement SLOs, we're going to start measuring SLIs on all of the services, and we're going to hold you accountable for meeting those targets. If you just do that, right, you're just doing more gatekeeping and policing of your tech environment. My bet is, reliability almost never improves in those cases.And that's been my experience, too, and why I get charged up about this is, if you just go slam in these practices, people end up miserable, the practices then become tarnished because people experienced the worst version of them. And then—Corey: And with the remote explosion as well, it turns out that changing jobs basically means their company sends you a different Mac, and the next Monday, you wind up signing into a different Slack team.Amy: Yeah, so the culture really matters, right? You can't cover it over with foosball tables and great lunch. You actually have to deliver tools that developers want to use and you have to deliver a software engineering culture that brings out the best in developers instead of demanding the best from developers. I think that's a fundamental business shift that's kind of happening. 
If I'm putting on my wizard hat and looking into the future and dreaming about what might change in the world, right, it's that there's kind of a change in how we do leadership and how we do business that's shifting more towards that model where we look at what people are capable of and we trust in our people, and we get more out of them: the knowledge work model.

If we want more knowledge work, we need people to be happy and to feel engaged in their community. And suddenly we start to see these kind of generational, bigger-pie kind of things start to happen. But how do we get there? It's not SLOs. Maybe it's a little bit starting with incidents. That's where I've had the most success, and you asked me about that. So, getting practical, incident management is probably—

Corey: Right. Well, as I see it, the problem with SLOs across the board is it feels like it's a very insular community so far, and communicating it to engineers seems to be the focus of where the community has been, but from my understanding of it, you absolutely need buy-in at significantly high executive levels, to at the very least buy you air cover while you're doing these things and making these changes, but also to help drive that cultural shift. None of this is something I have the slightest clue how to do, let's be very clear. If I knew how to change a company's culture, I'd have a different job.

Amy: Yeah. [laugh]. The biggest omission in the Google SRE books was [Ers 00:22:58]. There was a guy at Google named Ers who owns availability for Google, and when anything is, like, in dispute and bubbles up the management team, it goes to Ers, and he says, “Thou shalt…” right? Makes the call. And that's why it works, right?

Like, it's not just that one person, but that system of management where the whole leadership team—there's a large, very well-funded team with a lot of power in the organization that can drive availability, and they can say, this is how you're going to do metrics for your service, and this is the system that you're in. And it's kind of, yeah, sure it works for them because they have all the organizational support in place. What I was saying to my team just the other day—because we're in the middle of our SLO rollout—is that really, I think an SLO program isn't [clears throat] about the engineers at all until late in the game. At the beginning of the game, it's really about getting the leadership team on board to say, “Hey, we want to put in SLIs and SLOs to start to understand the functioning of our software system.” But if they don't have that curiosity in the first place, that desire to understand how well their teams are doing, how healthy their teams are, don't do it. It's not going to work. It's just going to make everyone miserable.

Corey: It feels like it's one of those difficult-to-sell problems as well, in that it requires some tooling changes, absolutely. It requires cultural change and buy-in and whatnot, but in order for that to happen, there has to be a painful problem that a company recognizes and is willing to pay to make go away. The problem with stuff like this is that once you pay, there's a lot of extra work that goes on top of it as well that does not have a perception—rightly or wrongly—of contributing to feature velocity, of hitting the next milestone. It's, “Really? So, we're going to be spending how much money to make engineers happier? They should get paid an awful lot and they're still complaining and never seem happy. Why do I care if they're happy other than the pure mercenary perspective of otherwise they'll quit?” I'm not saying that it's not worth pursuing; it is a worthy goal. I am saying that it becomes a very difficult thing to wind up selling as a product.

Amy: Well, as a product for sure, right? Because—[sigh] gosh, I have friends in the space who work on these tools. And I want to be careful.

Corey: Of course. Nothing but love for all of those people, let's be very clear.

Amy: But a lot of them, you know, they're pulling metrics from existing monitoring systems, they are doing some interesting math on them, but what you get at the end is a nice service catalog and dashboard, which are things we've been trying to land as products in this industry for as long as I can remember, and—

Corey: “We've got it this time, though. This time we'll crack the nut.” Yeah. Get off the island, Gilligan.

Amy: And then the other, like, risky thing, right, is the other part that makes me uncomfortable about SLOs, and why I will often tell folks that I talk to out in the industry that are asking me about this, like, one-on-one, “Should I do it here?” And it's like, you can bring the tool in, and if you have a management team that's just looking to have metrics to drive productivity, instead of, you know, trying to drive better knowledge work, what you get is just a fancier version of more Taylorism, right, which is basically scientific management: this idea that we can, like, drive workers to maximum efficiency by measuring random things about them and driving those numbers. It turns out, that doesn't really work very well, even at industrial scale; it just happened to work because, you know, we have a bloody enough society that we pushed people into it. But the reality is, if you implement SLOs badly, you get more really bad Taylorism that's bad for your developers. And my suspicion is that you will get worse availability out of it than you would if you just didn't do it at all.

Corey: This episode is sponsored by our friends at Revelo. Revelo is the Spanish word of the day, and it's spelled R-E-V-E-L-O. It means “I reveal.” Now, have you tried to hire an engineer lately? I assure you it is significantly harder than it sounds. One of the things that Revelo has recognized is something I've been talking about for a while, specifically that while talent is evenly distributed, opportunity is absolutely not. They're exposing a new talent pool to, basically, those of us without a presence in Latin America via their platform. It's the largest tech talent marketplace in Latin America with over a million engineers in their network, which includes—but isn't limited to—talent in Mexico, Costa Rica, Brazil, and Argentina. Now, not only do they wind up screening all of their talent on English ability, as well as, you know, their engineering skills, but they go significantly beyond that. Some of the folks on their platform are hands down the most talented engineers that I've ever spoken to. Let's also not forget that Latin America has high time zone overlap with what we have here in the United States, so you can hire full-time remote engineers who share most of the workday as your team. It's an end-to-end talent service, so you can find and hire engineers in Central and South America without having to worry about, frankly, the colossal pain of cross-border payroll and benefits and compliance because Revelo handles all of it. If you're hiring engineers, check out revelo.io/screaming to get 20% off your first three months. That's R-E-V-E-L-O dot I-O slash screaming.

Corey: That is part of the problem: in some cases, to drive some of these improvements, you have to go backwards to move forwards. And it's one of those, “Great, so we spent all this effort and money and the result is now things are worse?” No, not necessarily, but suddenly you're aware of things that were slipping through the cracks previously.

Amy: Yeah.
Yeah.

Corey: Like, the most realistic thing about first The Phoenix Project and then The Unicorn Project, both by Gene Kim, has been the fact that companies have these problems and actively cared enough to change it. In my experience, that feels a little on the rare side.

Amy: Yeah, and I think that's actually the key, right? For the culture change, if you're really looking to be, like, do I want to work at this company? Am I investing myself in here? It's: look at the leadership team and be, like, do these people actually give a crap? Are they looking just to punt another number down the road?

That's the real question, right? Like, the technology and stuff, at the point where I'm at in my career, I just don't care that much anymore. [laugh]. Just… fine, use Kubernetes, use Postgres, [unintelligible 00:27:30], I don't care. I just don't. Like, Oracle, I might have to ask, you know, go to finance and be like, “Hey, can we spend 20 million for a database?” But like, nobody really asks for that anymore, so. [laugh].

Corey: As one does. I will say that I mostly agree with you, but a technology that I found myself getting excited about, given the timing of this recording, is… fun: I spent a bit of time yesterday—from when we're recording this—teaching myself just enough Go to wind up banging together a binary that I needed to do something actively ridiculous for my camera here. And I found myself coming away deeply impressed by a lot of things about it: how prescriptive it was, for one; how self-contained, for another. And after spending far too many years of my life writing shitty Perl, and shitty Bash, and worse Python, et cetera, et cetera, the prescriptiveness was great. The fact that it wound up giving me something I could just run, I could cross-compile for anything I need to run it on, and it just worked. It's been a while since I found a technology that got me this interested in exploring further.

Amy: Go is great for that. You mentioned one of my two favorite features of Go. One is, usually when a program compiles—at least the way I code in Go—it works. I've been working with Go since about 0.9, like, just a little bit before it was released as 1.0, and what I've noticed over the years of working with it is that most of the time, if you have a pretty good data structure design and you get the code to compile, usually it's going to work, unless you're doing weird stuff.

The other thing I really love about Go, and that maybe you'll discover over time, is the malleability of it. And the reason why I think about that more than probably most folks is that I work on other people's code most of the time. And maybe this is something that you probably run into with your business, too, right, where you're working on other people's infrastructure. And the way that we encode business rules in our programming language or our config syntax and stuff has a huge impact on folks like us and how quickly we can come into a situation, assess, figure out what's going on, figure out where things are laid out, and start making changes with confidence.

Corey: Forget other people for a minute: looking at what I built out three or four years ago here myself, like, I look at past me, and it's like, “What was that rat bastard thinking? This is awful.” And it's—forget other people's code; hell is your own code, on some level, too, once it's slipped out of the mental stack and you have to re-explore it and, “Oh, well thank God I defensively wound up not including any comments whatsoever explaining what the living hell this thing was.” It's terrible. But you're right, other people's shell scripts are finicky and odd.

I started poking around for help when I got stuck on something, by looking at GitHub and a bit of searching here and there. Even these large, complex, well-used projects started making sense to me in a way that I very rarely find.
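Nobody reads code aloud on a podcast, but the Go traits being praised here (compiles-then-works, prescriptive formatting, cross-compiling to a single dependency-free binary) show up even in a throwaway sketch like this one; the topN helper and its numbers are invented purely for illustration:

```go
// A tiny, self-contained program of the kind Corey describes building:
// once it compiles it tends to just run, and the same source cross-compiles
// for another platform with nothing more than environment variables.
package main

import (
	"fmt"
	"runtime"
	"sort"
)

// topN returns the n largest values in xs without mutating the caller's slice.
func topN(xs []int, n int) []int {
	out := append([]int(nil), xs...) // copy first, so xs is untouched
	sort.Sort(sort.Reverse(sort.IntSlice(out)))
	if n > len(out) {
		n = len(out)
	}
	return out[:n]
}

func main() {
	fmt.Println(topN([]int{3, 9, 1, 7}, 2)) // [9 7]
	// The binary reports the platform it was compiled for.
	fmt.Println(runtime.GOOS + "/" + runtime.GOARCH)
}
```

Setting GOOS and GOARCH at build time, e.g. `GOOS=linux GOARCH=arm64 go build`, is essentially the whole cross-compilation story for pure-Go programs like this.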
It's, “What the hell is that thing?” is my most common refrain when I'm looking at other people's code, and Go, for whatever reason, avoids that, I think because it is so prescriptive about formatting, about how things should be done, about the vision that it has. Maybe I'm romanticizing it and I'll hate it a week from now and want to go back and remove this recording, but.

Amy: The size of the language helps a lot.

Corey: Yeah.

Amy: But probably my favorite—it's more of a convention, which is actually funny, the way I'm going to talk about this, because the two languages I work on the most right now are Ruby and Go. And I don't feel like two languages could really be more different.

Syntax-wise, they share some things, but really, like, the mental models are so very, very different. Ruby is all the way in on object-oriented programming—and, like, the actual real kind of object-oriented, with messaging and stuff—and, like, the whole language kind of springs from that. And it kind of requires you to understand all of these concepts very deeply to be effective in large programs. So, what I find is, when I approach a Ruby codebase, I have to load all this crap into my head and remember, “Okay, so yeah, there's this convention when you do this kind of thing in Ruby”—or especially Ruby on Rails is even worse, because they go deep into convention over configuration. But what that's code for is: this code is accessible to people who have a lot of free cognitive capacity to load all this convention into their heads and keep it there so that the code looks pretty, right?

And so, that's the trade-off, as you said: okay, my developers have to be these people with all these spare brain cycles to understand, like, why I would put the code here in this place versus this place, and all these, like, very compact, dense concepts that are in the code. And then you go to something like Go, which is, like, “Nah, we're not going to do Lambdas. Nah”—[laugh]—“We're not doing all this fancy stuff.” So, everything is there on the page.

This drives some people crazy, right, in that there's all this boilerplate, boilerplate, boilerplate. But the reality is, I can read most Go files from the top to the bottom and understand what the hell they're doing, whereas I can go sometimes look at, like, a Ruby thing, or sometimes Python—and, eh, Perl is just [unintelligible 00:32:19] all the time, right—and there's so much indirection. And it just be, like, “What the [BLEEP] is going on? This is so dense. I'm going to have to sit down and write it out in longhand so I can understand what the developer was even doing here.” And—

Corey: Well, that's why I got the Mac Studio; for when I'm not doing A/V stuff with it, that means that I'll have one core that I can use for, you know, front-end processing and the rest, and the other 19 cores can be put to work failing to build Nokogiri in Ruby yet again.

Amy: [laugh].

Corey: I remember the travails of working with Ruby, and the problem—I have similar problems with Python, specifically in that—I don't know if I'm special like this—it feels like it's an SRE DevOps style of working, but I am grabbing random crap off of GitHub constantly and running it, like, small scripts other people have built. And let's be clear, I run them on my test AWS account that has nothing important, because I'm not a fool, and I read most of it before I run it, but I also—it wants a different version of Python every single time. It wants a whole bunch of other things, too. And okay, so I use ASDF as my version manager for these things, which for whatever reason does not work for the way that I think about this ergonomically. Okay, great.

And I wind up with detritus scattered throughout my system. It's, “Hey, can you make this reproducible on my machine?” “Almost certainly not, but thank you for asking.” It's like ‘Step 17: Master the Wolf' level of instructions.

Amy: And I think Docker generally… papers over the worst of it, right? When we built all this stuff in the aughts, you know, [CPAN 00:33:45]—

Corey: Dev containers and VS Code are very nice.

Amy: Yeah, yeah. You know, like, we had CPAN back in the day. I was doing chroots, I think in, like, '04 or '05, you know, to solve this problem, right, which is basically: I just—screw it; I will compile an entire distro into a directory with a Perl and all of its dependencies so that I can isolate it from the other things I want to run on this machine and not screw up and not have these interactions. And I think that's kind of what you're talking about: like, the old model, when we deployed servers, there was one of us sitting there, and then we'd log into the server and be like, I'm going to install the Perl. You know, I'll compile it into, like, [/app/perl 558 00:34:21] whatever, and then I'll CPAN all this stuff in, and I'll give it over to the developer, tell them to set their shebang to that, and everything just works. And now we're in a mode where it's like, okay, you've got to set up a thousand of those. “Okay, well, I'll make a tarball.” [laugh]. But it's still like we had to just—

Corey: DevOps, but [unintelligible 00:34:37] dev closer to ops. You're interrelating all the time. Yeah, then Docker comes along, and dev is, like, “Well, here's the container. Good luck, asshole.” And it feels like it's been cast into your yard to worry about.

Amy: Yeah, well, I mean, that's just kind of business, or just—

Corey: Yeah. Yeah.

Amy: I'm not sure if it's business or capitalism or something like that, but just the idea that, you know, if I can hand off the shitty work to some other poor schlub, why wouldn't I? I mean, that's most folks, right? Like, just be like, “Well”—

Corey: Which is fair.

Amy: —“I got it working.
Like, my part is done, I did what I was supposed to do.” And now there's a lot of folks out there, that's how they work, right? “I hit done. I'm done. I shipped it. Sure, it's an old [unintelligible 00:35:16] Ubuntu. Sure, there's a bunch of shell scripts that rip through things. Sure”—you know, like, I've worked on repos where there's hundreds of things that need to be addressed.

Corey: And passing it to someone else is fine. I'm thrilled to do it. Where I run into problems with it is where people assume that, well, my part was the hard part and anything you schlubs do is easy. I don't—

Amy: Well, that's the underclass. Yeah. That's—

Corey: Forget engineering for a second; I throw things to the people over in the finance group here at The Duckbill Group because those people are wizards at solving for this thing. And it's—

Amy: Well, that's how we want to do things.

Corey: Yeah, specialization works.

Amy: But we have this—it's probably more cultural. I don't want to pick, like, capitalism to beat on, because this is really, like, a human cultural thing, and it's not even really particularly Western. It's the idea that, like, “If I have an underclass, why would I give a shit what their experience is?” And this is why I say, like, ops teams—and most extant ops teams are still called ops, and a lot of them have been renamed SRE, but they still do the same job—are an underclass. And I don't mean that those people are below us. People are treated as an underclass, and they shouldn't be. Absolutely not.

Corey: Yes.

Amy: Because the idea is that, like, well, I'm a fancy person who writes code in my ivory tower, and then it all flows down, and those people, just faceless people, do the deployment stuff that's beneath me. That attitude is the most toxic thing, I think, in tech orgs to address. Like, if you're trying to be like, “Well, our reliability is bad, we have security problems, people won't fix their code,” go look around and you will find people that are treated as an underclass, that are given code thrown over the wall at them, and then they just have to toil through and make it work. I've worked on that a number of times in my career.

And I think just, like, saying “underclass,” right, or “caste system,” is what I've found is the most effective way to get people actually thinking about what the hell is going on here. Because most people are just, like, “Well, that's just the way things are. It's just how we've always done it. The developers write the code, then give it to the sysadmins. The sysadmins deploy the code. Isn't that how it always works?”

Corey: You'd really like to hope, wouldn't you?

Amy: [laugh]. Not me. [laugh].

Corey: Again, the way I see it is, in theory—in theory—sysadmins, ops, all of that should not exist. People should theoretically be able to write code as developers that just works, the end. And write it correctly the first time and never have to change it again. Yeah. There's a reason that I always like to call staging environments in places I work ‘theory,' because it works in theory, but not in production, and that is fundamentally the—like, that entire job role is the difference between theory and practice.

Amy: Yeah, yeah. Well, I think that's the problem with it. We're already so disconnected from the physical world, right? Like, you and I right now are talking over multiple strands of glass and digital transcodings and things, right? Like, we are detached from the physical reality.

You mentioned earlier working in data centers, right? The thing I miss about it is, like, the physicality of it. Like, actually, like, I held a server in my arms and put it in the rack and slid it into the rails. I plugged it into power myself; I pushed the power button myself. There's a server there. I physically touched it.

Developers who don't work in production—we talked about empathy and stuff, but really, I think the big problem is, when they work out in their idea space and just write code, they write the unit tests, if we're very lucky they'll write a functional test, and then they hand that wad off to some poor ops group. They're detached from the reality of operations. It's not even about accountability; it's about experience. The ability to see all of the weird crap we deal with, right? You know, like, “Well, we pushed the code to that server, but there were three bit flips, so we had to do it again. And then on the other server, the disk failed. And on the other server…” You know? [laugh].

It's just, there's all this weird crap that happens; these systems are so complex that they're always doing something weird. And if you're a developer that just spends all day in your IDE, you don't get to see that. And I can't really be mad at those folks, as individuals, for not understanding our world. I figure out how to help them, and the best thing we've come up with so far is, like, well, we start giving them some responsibility in a production environment so that they can learn that. People do that—again, it's another one that can be done wrong—where it turns into kind of a forced empathy.

I actually really hate that mode, where it's like, “We're forcing all the developers online whether they like it or not—on-call whether they like it or not—because they have to learn this.” And it's like, you know, maybe slow your roll a little, buddy, because this stuff is actually hard to learn. Again, minimizing how hard ops work is: “Oh, we'll just put the developers on it. They'll figure it out, right? They're software engineers. They're probably smarter than you sysadmins,” is the unstated thing when we do that, right? When we throw them in the pit and are like, “Yeah, they'll get it.” [laugh].

Corey: And that was my problem [unintelligible 00:39:49]—it was in the write code on a whiteboard. It's, “Look, I understood how the system fundamentally worked under the hood.” Being able to power my way through to get to an outcome, even in a language I don't know, was sort of part and parcel of the job. But this idea of doing it in an artificially constrained environment, in a language I'm not super familiar with, off the top of my head—it took me years to get to a point of being able to do it with a Bash script, because who ever starts with an empty editor and starts getting to work in a lot of these scenarios? Especially in an ops role, where we're not building something from scratch.

Amy: That's the interesting thing, right? In the majority of tech work today—maybe 20 years ago we did it more, because we were literally building the internet we have today—but today, most of the engineers out there working—most of us working stiffs—are working on stuff that already exists. We're making small incremental changes, which is great; that's what we're doing. And we're dealing with old code.

Corey: We're gluing APIs together, and that's fine. Ugh. I really want to thank you for taking so much time to talk to me about how you see all these things. If people want to learn more about what you're up to, where's the best place to find you?

Amy: I'm on Twitter every once in a while as @MissAmyTobey, M-I-S-S-A-M-Y-T-O-B-E-Y. I have a blog I don't write on enough. And there's a couple things on the Equinix Metal blog that I've written, so if you're looking for that. Otherwise, mainly Twitter.

Corey: And those links will of course be in the [show notes 00:41:08]. Thank you so much for your time. I appreciate it.

Amy: I had fun. Thank you.

Corey: As did I. Amy Tobey, Senior Principal Engineer at Equinix. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, or on the YouTubes, smash the like and subscribe buttons, as the kids say. Whereas if you've hated this episode, same thing: five-star review, all the platforms, smash the buttons, but also include an angry comment telling me that you're about to wind up subpoenaing a copy of my shell script because you're convinced that your intellectual property and secrets are buried within.

Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.

Announcer: This has been a HumblePod production. Stay humble.

Screaming in the Cloud
Data Analytics in Real Time with Venkat Venkataramani
Apr 27, 2022 · 38:41
About Venkat

Venkat Venkataramani is CEO and co-founder of Rockset. In his role, Venkat helps organizations build, grow and compete with data by making real-time analytics accessible to developers and data teams everywhere. Prior to founding Rockset in 2016, he was an Engineering Director for the Facebook infrastructure team that managed online data services for 1.5 billion users. These systems scaled 1000x during Venkat's eight years at Facebook, serving five billion queries per second at single-digit millisecond latency and five 9's of reliability. Venkat and his team also created and contributed to many noted data technologies and open-source projects, including Facebook's TAO distributed data store, RocksDB, Memcached, MySQL, MongoRocks, and others. Prior to Facebook, Venkat worked on tools to make the Oracle database easier to manage. He has a master's in computer science from the University of Wisconsin-Madison, and a bachelor's in computer science from the National Institute of Technology, Tiruchirappalli.

Links Referenced:
Company website: https://rockset.com
Company blog: https://rockset.com/blog

Transcript

Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.

Corey: This episode is sponsored by our friends at Revelo. Revelo is the Spanish word of the day, and it's spelled R-E-V-E-L-O. It means “I reveal.” Now, have you tried to hire an engineer lately? I assure you it is significantly harder than it sounds. One of the things that Revelo has recognized is something I've been talking about for a while, specifically that while talent is evenly distributed, opportunity is absolutely not. They're exposing a new talent pool to, basically, those of us without a presence in Latin America via their platform. It's the largest tech talent marketplace in Latin America with over a million engineers in their network, which includes—but isn't limited to—talent in Mexico, Costa Rica, Brazil, and Argentina. Now, not only do they wind up screening all of their talent on English ability, as well as, you know, their engineering skills, but they go significantly beyond that. Some of the folks on their platform are hands down the most talented engineers that I've ever spoken to. Let's also not forget that Latin America has high time zone overlap with what we have here in the United States, so you can hire full-time remote engineers who share most of the workday as your team. It's an end-to-end talent service, so you can find and hire engineers in Central and South America without having to worry about, frankly, the colossal pain of cross-border payroll and benefits and compliance because Revelo handles all of it. If you're hiring engineers, check out revelo.io/screaming to get 20% off your first three months. That's R-E-V-E-L-O dot I-O slash screaming.

Corey: This episode is sponsored in part by LaunchDarkly. Take a look at what it takes to get your code into production. I'm going to just guess that it's awful because it's always awful. No one loves their deployment process. What if launching new features didn't require you to do a full-on code and possibly infrastructure deploy? What if you could test on a small subset of users and then roll it back immediately if results aren't what you expect? LaunchDarkly does exactly this. To learn more, visit launchdarkly.com and tell them Corey sent you, and watch for the wince.

Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. Today's promoted guest episode opens with one of those questions I really like to ask because it can often come across as incredibly, well, direct, which is one of the things I love doing.
In this case, the question that I am asking is: when you look around at the list of colossal blunders that people make in the course of careers in technology and the rest, one of the most common is, “Oh, yeah. I don't like the way that this thing works, so I'm going to build my own database.” That is the siren call to engineers, and it is often the prelude to horrifying disasters. Today, my guest is Venkat Venkataramani, co-founder and CEO at Rockset. Venkat, thank you for joining me.

Venkat: Thanks for having me, Corey. It's a pleasure to be here.

Corey: So, it is easy for me to sit here in my beautiful ivory tower that is crumbling down around me and use my favorite slash the best database imaginable, which is TXT records shoved into Route 53. Now, there are certainly better databases than that for most use cases—almost anything, really, to be honest with you, because that is a terrifying pattern; good joke, terrible practice. What is Rockset, as we look at the broad landscape of things that store data?

Venkat: Rockset is a real-time analytics platform built for the cloud. Let me break that down a little bit, right? I think it's a very good question when you say, does the world really need another database? Don't we have enough already: SQL databases, NoSQL databases, warehouses, and lakehouses now?

So, if you really break it down, the first digital transformation that happened in the '80s was when people actually retired pen-and-paper records and started using a relational database to actually manage their business records and what have you, instead of ledgers and books and what have you. And that was the first digital transformation. And Oracle called the rows in a table ‘records' for a reason. They're called records to this date.

And then, you know, 20 years later, when all businesses were doing system-of-record transactions and transactional databases, then analytics was born, right? This was, like, the whole reason: we wanted to make better data-driven business decisions, and BI was born; warehouses and data lakes started becoming more and more mainstream. And there was really a second category of database management systems, because the first category was very good at being a system of record, but not really good at the complex analytics that businesses were asking for to guide their decisions. Fast-forward 20 years from then, the nature of applications is changing. The world is going from batch to real-time: your data never stops coming—the advent of Apache Kafka and technologies like that, 5G, IoT—data is coming from all sorts of nooks and corners within an enterprise, and now customers and enterprises are acquiring the data in real-time at a scale that the world has never seen before.

Now, how do you get analytics out of that? And then if you look at the database market—the entire market—there are still only two large categories of databases: OLTP databases for transaction processing, and warehouses and data lakes for batch analytics. Now suddenly, you need the speed of OLTP at the scale of batch, right—in terms of, like, complexity of compute, complexity of storage. So, that is really why we thought the data management space needs that third leg, and we call it a real-time analytics platform, or real-time analytics processing. And this is where the data never stops coming; the queries never stop coming.

You need the speed and the scale, and it's about time we innovate and solve the problem well, because in 2015, 2016, when I was researching this, every company that was looking to build real-time applications was building a custom Rube Goldberg machine of sorts. And it was insanely complex, it was insanely expensive. Fast-forward now: you can build a real-time application in a matter of hours with the simplicity of the cloud using Rockset.

Corey: There's a lot to be said about the way we used to do things after the first transformation, when we got into the world of batch processing—in the days of punch cards, which was a bit before my time and I believe yours as well—where they would drop them off and then, a day or two later, they would come back after the run to get the results, only to figure out there was a syntax error because you put the wrong card first or something like that. And it was maddening. In time, that got better, but still, nightly runs have become a thing to the point where even now, by default, if you wind up looking at the typical timing of a default Linux install, for example, you see that the middle of the night is when a bunch of things will rotate, when various cleanup jobs get done, et cetera, et cetera. And that seemed like a weird direction to go in. One of the most famous Google April Fools' Day jokes was when they put out their white paper on MapReduce.

And then Yahoo fell for it hook, line, and sinker, built out Hadoop, and we've been stuck with this idea of performing these big query jobs on top of existing giant piles of data, where ideally you can measure it with a wall clock; in practice, you often measure it with a calendar in some cases. And as the world continues to evolve, being able to do streaming processing and understand in real-time what is going on is unlocking different approaches, at least by all accounts. Do you have an example you can give me of a problem that real-time analytics solves for a customer? Because I can sit here and talk all day about how things might theoretically work, but I have to get out of my Route 53-based ivory tower over here: what are customers seeing?

Venkat: That's a great question. And I one hundred percent agree. I think Google did build MapReduce, and I think it's a very nice continuation of what happened there and what is happening in the world now. They built MapReduce, and they quickly realized that re-indexing the whole world [laugh] every night, as the size of the internet is exploding, is a bad idea. And you know how Google indexes now? They do real-time indexing.

That is how they index the wor—you know, web. They look for the changes that are happening on the internet, and they only index the changes. And that is exactly the same principle—one of the core principles—behind Rockset's real-time analytics platform. So, what is the customer story? Let me give you one of my favorite ones.

The world's number one or number two buy now, pay later company: they have hundreds of millions of users, they have 300,000-plus merchants, they operate in, like, maybe 100-plus countries, with so many different payment methods—you can imagine the complexity. At any given point in time, some part of the product is broken: well, Apple Pay stopped working in Switzerland for this e-commerce merchant. Oh God, like, we've got to first detect that. Forget even debugging and figuring out what happened and having an incident response team. So, what did they do as they scaled the number of payments processed in the system across the world—it's, like, in the millions; first it was millions in a day, and then it was millions in an hour—so, like everybody else, they built a batch-based system.

So, they would accumulate all these payment records, and every six hours—initially it was a day, and then afterwards, you know, you try to see how far you can push it, and they couldn't push it beyond every six hours—some batch job would come and process through all the payments that happened, with some statistical models to detect: hey, here are some of the things that you might want to double-click on and follow up on. And as they were scaling, the batch job that they would kick off every six hours was starting to take more than six hours. So, you can see how the story goes. Now, fast-forward, they came to us and said—it's almost like Rockset has, like, a big red button that says, “Real-time this.”
And as they were scaling, the batch job that they would kick off every six hours was starting to take more than six hours. So, you can see how the story goes. Now, fast-forward, they came to us—it's almost like Rockset has, like, a big red button that says, “Real-time this.”

And they were kind of like, “Can you make this real-time? Because not only are we losing millions of potential revenue dollars in a year because something stops working and we're not processing payments, and we don't find out about that until three hours later, five hours later, six hours later, but our merchants are also very unhappy. We are also not able to protect our customers' business, because that is all we are about.” And so, fast-forward, they use Rockset, and simply using SQL, all the metrics and statistical computation that they want to do now happen in real-time and are accurate up to the second. All of their anomaly detectors run every minute, and the anomaly detectors take hundreds of milliseconds to run.

And so, now they have what I would call business observability. It's not metrics and machine observability—they now have business observability in real-time. And that not only saves them a lot of potential revenue loss from downtimes, it's also allowing them to build a better product and give their customers a better experience, because they are now telling their merchants and their customers that something is not working in some part of their e-commerce footprint before even the customers notice that something is wrong. And that allows them to build a better product and a better customer experience than their competitors.
So, this is a very real-world example of why companies and enterprises are moving from batch to real-time.

Corey: The stories that you—and, frankly, a lot of other data analytics companies—tend to fall back on all the time are the ones you're telling, where you're talking about the largest buy now, pay later lender, for example. These are companies operating at massive scale who have tremendous existing transaction volume, and they're built out already. That's great, but then I wanted to try to cut to the truth of some of these things. And when I visit your pricing page at Rockset, it doesn't have what I would expect if that were the only use case. And what that would be is, “Great. Call here to conta—open up a sales quote, and we'll talk to you, et cetera, et cetera, et cetera.”

And the answer then is, “Okay, I know it's going to have at least two commas in it, ideally not three, but okay, great.” Instead, you have a free tier where it's, “Hey, we'll give you a pile of credits, here are some limits on our free account, et cetera, et cetera.” Great. That is awesome. So, it tells me that there is a use case here for folks who have not already, on some level, made a good show of starting the process of conquering the world.

Rather, someone with an idea some evening at two in the morning can wind up diving in and getting started. What is the Twitter for Pets, in-my-garage, spare-time side project story for using something like Rockset? What problem will I have as I wind up building those things out, when I don't have any user traffic or data yet, but I want to, for once in my life, do the smart thing in advance rather than building an impressive tower of technical debt?

Venkat: That is the first thing we built, by the way. When we finished our product, the first thing we built was self-service. The first thing we built was a free forever tier, which has certain limits because somebody has to pay the bill, right?
And then we also have compute instances that are very, very affordable, that cost you approximately $1 a day. And so, we built all of that because real-time analytics is not a need that only large-scale companies have. And I'll give you a very, very simple example.

Let's say you're building a game—a mobile game. You can use Amazon DynamoDB and AWS Lambdas and have a serverless stack, and you're really only paying… you're keeping your footprint very, very small, and you're able to build a very lively game and see if it gets [wider 00:12:16], and it's growing. And once it grows, you can have all the big-company scaling problems. But in the early days, you're just getting started. Now, if you think about DynamoDB and Lambdas and whatnot, you can build almost every part of the game except probably the leaderboard.

So, how do I build a leaderboard when thousands of people are playing and all of their individual gameplays and scores and everything are just more simple records in DynamoDB? It's all serverless. But DynamoDB doesn't give me a SQL SELECT *, order by score, limit 100, distinct by the same player. No, this is an analytical question, and it has to be updated in real-time; otherwise, you really don't have this thing where I just finished playing, I go to the leaderboard, and within a second or two, if it doesn't update, you kind of lose people along the way.
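The leaderboard query Venkat gestures at can be sketched in plain SQL. This is an illustrative toy only, using SQLite as a stand-in for the analytics side; the table and column names are made up, and in the scenario above each row would be a gameplay record flowing in from DynamoDB:

```python
# Toy sketch of the leaderboard query: "order by score, limit N, distinct
# by the same player." SQLite stands in for the analytics store here; the
# schema is illustrative, not anything DynamoDB or Rockset prescribes.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE gameplays (player TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO gameplays VALUES (?, ?)",
    [("ana", 120), ("ana", 95), ("bo", 110), ("cy", 80), ("bo", 130)],
)

# Keep each player's best score, then rank the players by it.
leaderboard = conn.execute(
    """
    SELECT player, MAX(score) AS best
    FROM gameplays
    GROUP BY player
    ORDER BY best DESC
    LIMIT 100
    """
).fetchall()

print(leaderboard)  # [('bo', 130), ('ana', 120), ('cy', 80)]
```

The point of the example is that this is an aggregation over all of a player's records, which is exactly the shape of query a key-value store is not built to answer cheaply.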
So, this is actually one of the very popular use cases when the scale is much smaller: Rockset augments a NoSQL database like a Dynamo or a Mongo—or even a Postgres or MySQL, for that matter—where you can continue to use that as your system of record and keep it small, but cover all of the compute-heavy and analytical parts of your application with Rockset.

So, it's almost like a CQRS pattern, where you use your OLTP database as your system of record and you connect Rockset to it—and Rockset comes with built-in connectors, by the way, so you don't have to write a single line of code for the inserts and updates and deletes in your transactional database to get reflected in Rockset within one to two seconds. And so now, all of a sudden, you have a fully indexed, fast SQL replica of your transactional database, on which you can do all sorts of analytical queries, and that's fully isolated from your transactional database. So, this is the pattern that I'm talking about. The mobile leaderboard is an example of that pattern where it comes in very handy. But you can imagine almost everybody building some kind of an application has certain parts of it that are very analytical in nature. And by augmenting your transactional database with Rockset, you can have your cake and eat it too.

Corey: One of the challenges, I think, that at least I've run into when it comes to working with data—and let's be clear, I tend to deal with data in relatively small volumes, mostly. The stuff that's significantly large, like, oh, I don't know, AWS bills from large organizations—the format of those is mostly predefined.
When I'm building something out, using, I don't know, DynamoDB or being dangerous with SQLite or whatnot, invariably I find that even at small scale, I paint myself into a corner through data model design or how I wind up structuring access or the rest, and the thing that I'm doing that makes perfect sense today winds up being incredibly challenging to change later. And I still have, in production, a DynamoDB table that has the word ‘test' in its name, because of course I do.

It's not a great place to find yourself in some cases. And I'm curious as to what you've seen, as you've been building this out and watching customers, especially ones who already had significant datasets as they moved to you. Do you have any guidance around how to avoid falling down that particular well?

Venkat: I will say a lot of the complexity in this world comes from solving the right problems using the wrong tool, or from solving the right problem in the wrong part of the stack. I'll unpack this a little bit, right? So, when your patterns change, your application is getting more complex, and it is demanding more things, that doesn't necessarily mean the first choice you made—let's say DynamoDB was your solution—was the wrong choice. That was the right choice, but now you've expanded the scope of your application and the demands that you have on your backend transactional database. And now you have to ask the question: in the expanded scope, which parts are still more of the same category of things for which I chose Dynamo, and which are actually not at all?

And so, instead of going and abusing the GSIs and the other really complex and expensive indexing options and whatnot that Dynamo, you know, has built—which have all sorts of limitations—ask instead: what do I really need, and what is the best tool for the job, right? What is the best system for that? And how do I augment? And how do I manage these things?
And this goes to the first thing I said, which is this tremendous complexity when you start to build a Rube Goldberg machine of sorts.

Okay, now I'm going to start making changes to Dynamo. Oh God, how do I pick up all of those changes and not miss a single record? Now, replicate that to a second system that is going to be search-centric or reporting-centric, and do I have to rethink this once in a while? Do I have to build and manage these pipelines? And suddenly, instead of going from one system to two systems, you actually end up going from one system to, like, four different things, with all the pipes and tubes going into the middle.

And so, this is what we really observed. And so, when you come into Rockset and you point us at your DynamoDB table, you don't write a single line of code, and Rockset will automatically scan your Dynamo tables, move that into Rockset, and in real-time, your changes—inserts, updates, deletes to Dynamo—will be reflected in Rockset. And this is all using the Dynamo Streams API, Dynamo Scan API, and whatnot, behind the scenes. And this just gives you an example: if you use the right tool for the job here, when suddenly your application is demanding analytical queries on Dynamo, and you do the right research and find the right tool, your complexity doesn't explode at all. You can still continue to use Dynamo for what it is very, very good at, while augmenting it with a system built for analytics—with full-featured SQL and other capabilities that I can talk about—for the parts of your application for which Dynamo is not a good fit. And so, if you use the right tool for the job, you should be in a very good place.

The other thing is the part about the wrong part of the stack. I'll give a very naive example, and then maybe you can extrapolate that to, like, other patterns on how people could—you know, accidental complexity is the worst.
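The connector flow described a moment ago—an initial table scan, then a tail of inserts, updates, and deletes applied to a replica—is change data capture. A toy sketch of the apply step follows; the event records are shaped only loosely like DynamoDB Streams events (the field names here are simplified, not the real wire format), and a real connector would additionally handle shards, checkpoints, and retries:

```python
# Toy change-data-capture applier: replay a stream of change events onto a
# local replica. Event shape is an illustrative simplification of a change
# record (eventName plus key and new image), not DynamoDB Streams' actual
# wire format.
def apply_event(replica, event):
    key = event["key"]
    if event["eventName"] in ("INSERT", "MODIFY"):
        replica[key] = event["newImage"]   # upsert the latest image
    elif event["eventName"] == "REMOVE":
        replica.pop(key, None)             # delete propagates too
    return replica

replica = {}
events = [
    {"eventName": "INSERT", "key": "user#1", "newImage": {"score": 10}},
    {"eventName": "MODIFY", "key": "user#1", "newImage": {"score": 25}},
    {"eventName": "INSERT", "key": "user#2", "newImage": {"score": 7}},
    {"eventName": "REMOVE", "key": "user#2"},
]
for e in events:
    apply_event(replica, e)

print(replica)  # {'user#1': {'score': 25}}
```

Applying events in order keeps the replica converged with the source table, which is why the analytics side can lag the system of record by only a second or two.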
So, let's just say you need to implement access control on your data. Let's say the best place to implement access control is at the database level—it just happens that that is the right thing. But this database that I picked doesn't really have role-based access control or what have you; it doesn't really give me all the security features to be able to protect the data the way I want.

So, then what I'm going to do is go look at all the places that have business logic and query the database, and I'm going to put a whole bunch of permission management and roles and privileges there. And you can just see how that will be so error-prone, so hard to maintain, and impossible to scale. And this is the worst form of accidental complexity, because if you had just looked at it for that one week or two weeks—how do I get something out, the database I picked doesn't have it—then for two weeks you feel like you made some progress by putting some duct-tape if conditions on all the access paths. But now, [laugh] you've just painted yourself into a really, really bad corner.

And so, this is another variation of the same problem, where you end up solving the right problems in the wrong part of the stack, and that just introduces a tremendous amount of accidental complexity. And so, yeah, both of these are common pitfalls that I think people hit. I think it's easy to avoid them. There's so much research, there's so much content, and if you know how to search for these things, they're available on the internet. It's a beautiful place. [laugh]. But I guess you have to know how to search for these things. In my experience, these are the two common pitfalls a lot of people fall into and paint themselves into a corner.

Corey: Couchbase Capella Database-as-a-Service is flexible, full-featured and fully managed with built in access via key-value, SQL, and full-text search.
Flexible JSON documents aligned to your applications and workloads. Build faster with blazing fast in-memory performance and automated replication and scaling while reducing cost. Capella has the best price performance of any fully managed document database. Visit couchbase.com/screaminginthecloud to try Capella today for free and be up and running in three minutes with no credit card required. Couchbase Capella: make your data sing.

Corey: A question I have, though, that is an extension of this—and I want to give some flavor to it—is: why is there a market for real-time analytics? And what I mean by that is, early on in my tenure of fixing horrifying AWS bills, I saw a giant pile of money being hurled over at, effectively, a MapReduce cluster for Elastic MapReduce. Great. Okay, well, stream processing is kind of a thing; what about migrating to that? Well, that was a complete non-starter, because it wasn't just the job running on those things; there were downstream jobs, with their own downstream jobs. There were thousands of business processes tied to that thing.

And similarly, with the idea of real-time analytics: we don't have any use for that because, oh I don't know, I only wind up pulling these reports on a once-a-week basis, so what do I need them updated for in real-time if I'm looking at them once a week? In practice, the answer is often something aligned with the, “Well, yeah, but if you had a real-time updating dashboard, you would find that more useful than those reports.” But people's expectations and business processes have shaped themselves around constraints that now can be removed, but how do you get them to see that? How do you get them to buy in on that? And then how do you untangle that enormous pile of previous constraint into something that leverages the technology that's now available for a brighter future?

Venkat: I think [unintelligible 00:21:40] a really good question: who are the people moving to real-time analytics?
What do they see? And why can't they do it with other tech? Like, you know, as you say… EMR, it's just MapReduce; can't I just run it every twenty-four hours, every six hours, every hour? How about every five minutes? It doesn't work that way.

Corey: How about I spin up a whole bunch of parallel clusters on different timescales so I constantly—

Venkat: [laugh].

Corey: Have a new report coming in. It's real-time, except—

Venkat: Exactly.

Corey: You're constantly putting out new ones, but they're just six hours delayed every time.

Venkat: Exactly. So, you don't really want to do this. And so, let me unpack it one at a time, right? I mean, we talked about a very good example of a business team which is building business observability at the buy now, pay later company. That's a very clear value prop on why they want to go from batch to real-time, because it saves their company tremendous potential losses and also allows them to build a better product.

So, it could be a marketing operations team looking to get more real-time observability to see what campaigns are working well today, and how do I double down and make sure my ad budget for the day is put to good use? I don't have to mention that security operations, you know, needs real-time. Don't tell me I got owned three days ago. Tell me—[laugh]—somebody is, you know, breaking glass and might be, you know, entering into your house right now. Tell me then, and not three days later, you know—

Corey: “Yeah, what alert system do you have for security intrusion?” “So, I read the front page of The New York Times every morning and wait to see my company's name.” Yeah, there probably are better ways to reduce that cycle time.

Venkat: Exactly, right. And so, that is really the need, right?
Like, I think more and more business teams are saying, “I need operational intelligence and not business intelligence.” Don't make me play Monday-morning quarterback.

My favorite analogy is: it's the middle of the third quarter, and I'm six points down. A couple of star players on my team and on my opponent's team are injured—some on offense, some on defense. What plays do I call, and how do I play the game slightly differently to change the outcome and win this game, as opposed to losing by six points? So, that I think is really what is driving businesses.

You know, I want to be more agile, I want to be more nimble, and I want to take being data-driven in decision-making to another level. So that, I think, is the real force in play. So, now the real question is, why can't they do it already? Because if you go ask a hundred people, “Do you want fast analytics on real-time data or slow analytics on stale data?” how many people are going to say give me slow and stale? Zero, right? Exactly zero people.

So, then why hasn't it happened yet? I think it goes back to the fact that the world has only seen two kinds of databases: transaction processing systems—built for systems of record, don't-lose-my-data kinds of systems—and then batch analytics, you know, all these warehouses and data lakes. And so, in real-time analytics use cases, the data never stops coming, so you actually need a system that is running 24/7. And then what happens is, as soon as you build a real-time dashboard, like this example that you gave, which is, I just want all of these dashboards to automatically update all the time, people immediately respond and say, “But I'm not going to be like A Clockwork Orange, you know, toothpicks in my eyelids, staring at this 24/7.
Can you do something to alert on or detect some anomalies and tap me on the shoulder when something off is going on?”

And so, now what happens is somebody—a program more than a person—is actively monitoring all of these metrics and graphs, doing some analysis, and only bringing it to your attention when you really need it because something is off, right? So, then what happens is, you went from accumulate-all-the-data-and-run-a-batch-report to [unintelligible 00:25:16]—the data never stops coming, the queries never stop coming, I never stop asking questions; it's just a programmatic way of asking those things. And at that point, you have a data app. This is not an analytics dashboard or report anymore. You have a full-fledged application.

In fact, that application is harder to build and scale than any application you've ever built before, [laugh] because in those other applications you don't have this torrent of data coming in all the time and complex analytical questions being asked on the data 24/7, you know?

And so, that I think is really why a real-time analytics platform has to be built as almost a third leg. This is what we call data apps: when your data never stops coming and your queries never stop coming. So, this, I think, is really what is pushing all the expensive EMR clusters, or the misuse of your warehouse and your data lakes. At the end of the day, that is what I think is blowing up your Snowflake bills, what is blowing up your warehouse bills: you somehow accidentally used the wrong tool for the job, [laugh] going back to the pitfall we just talked about.

You accidentally say, “Oh God, I just need some real-time.” With enough thrust, pigs can fly. Is that a good idea? Probably not, right? And so, I don't want to be building a data app on my warehouse just because I can.
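The "tap me on the shoulder" detector Venkat describes can be as simple as flagging a metric reading that sits far from its trailing baseline. A minimal sketch, where the window size and threshold are illustrative assumptions and this is not Rockset's actual detector:

```python
# Toy anomaly detector: flag a reading that deviates too many standard
# deviations from the trailing window's mean. Threshold and minimum
# window size are illustrative, not production-tuned values.
from statistics import mean, pstdev

def is_anomaly(history, value, z_threshold=3.0):
    """True if `value` is more than z_threshold sigmas from the
    trailing window's mean."""
    if len(history) < 5:          # too little data to judge
        return False
    mu, sigma = mean(history), pstdev(history)
    if sigma == 0:                # flat baseline: any change is anomalous
        return value != mu
    return abs(value - mu) / sigma > z_threshold

# e.g. payments processed per minute for one merchant/payment method
payments_per_minute = [100, 102, 98, 101, 99, 100, 97, 103]
print(is_anomaly(payments_per_minute, 101))  # normal reading
print(is_anomaly(payments_per_minute, 3))    # Apple Pay broke somewhere
```

Running a check like this every minute over fresh aggregates is what turns a dashboard nobody stares at into a program that pages you.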
You should probably use the best tool for the job, and really use something that was built from the ground up for it. And I'll give you one technical insight about how real-time analytics platforms are different from warehouses.

Corey: Please. I'm here for this.

Venkat: Yes. So really, if you think about warehouses and data lakes, I call them storage-optimized systems. I've been building databases all my life, so if I really have to build a database for batch analytics, I just break down all of my expenses in terms of, let's say, compute and storage. What I'm burning 24/7 is storage. Compute comes and goes—when I'm doing a batch data load, or when an analyst logs in and tries to run some queries.

But what I'm actually burning 24/7 is storage, so I want to compress the heck out of the data, and I want to store it in very cheap media. I want to make the storage as cheap as possible, so I optimize the heck out of the storage use. And I want to make computation on that data possible, but not necessarily efficient. I can shuffle things around and make the analysis possible, but I'm not trying to be compute-efficient. And we just talked about how, as soon as you get into real-time analytics, you very quickly get into the data app business. You're not building a real-time dashboard anymore; you're actually building an application.

So, as soon as you get into that, you start burning both storage and compute 24/7. And we all know that, relatively, [laugh] compute and RAM are about a hundred to a thousand times more expensive than storage in the grand scheme of things. And so, if you actually go look at your Snowflake bill, your warehouse bill—BigQuery, no matter what—I bet the computational part of it is about 90 to 95% of the bill, and not the storage.
And then, if you again break down who's spending all the compute, you'll very quickly narrow it down to all these real-time-y and data-app-y use cases where you can never turn off the compute on your warehouse or your BigQuery, and those are the ones that are blowing up your costs and complexity. On the Rockset side, we are actually not storage-optimized; we're compute-optimized.

So, we index all the data as it comes in. And so, the storage actually goes slightly higher, because, you know, we store the data and also the indexes of that data automatically, but we usually cut the computational cost to a quarter of what a typical warehouse needs. So, the TCO for our customers goes down by two- to four-fold, you know? It goes down to half or even a quarter of what they used to spend. Even though their storage cost goes up, in net that is a very, very small fraction of their spend.

And so really, I think good real-time analytics platforms are all compute-optimized and not storage-optimized, and that is what allows them to be a lot more efficient as the backend for these data applications.

Corey: As someone who spends a lot of time staring into the depths of AWS bills, I think that people also lose sight of the reality that it doesn't matter what you're spending on AWS; it invariably pales in comparison to what you're spending on people to work with these things. The reason to go to cloud is not because it is the cheapest possible way to get computers to do things; it's because it's a capability story. It's about unlocking capacity and capabilities you do not have otherwise. And that dramatically increases your feature velocity and lets you achieve things faster, sooner, with better results. And unlocking a capability is always going to be more interesting to a company than saving money on it. When a company cares first, last, and always about just save money, make the bill lower, the end—it's usually a company in decline.
Or alternately, something very strange is going on over there.

Venkat: I agree with that. One of our favorite customers told us that Rockset took their six-month roadmap and shrunk it to a single afternoon. They're a supply-chain SaaS backend for heavy construction—80% of the concrete being delivered and tracked in North America flows through their platform—and Rockset powers all of their real-time analytics and reporting. And before Rockset, what did they have? They had built a beautiful serverless stack using DynamoDB, with AWS Lambdas and what have you.

And why did they do it all serverless? Because the entire team was two people. [laugh]. And maybe a third person they'd get once in a while, so 2.5. Brilliant people—really pioneers of building an entire data stack on AWS in a serverless fashion; no pipes, no ETL.

And then they were like, oh God, finally, I have to do something, because my business demands—and my customers are demanding—real-time reporting on all of these concrete trucks and aggregate trucks delivering stuff. And real-time reporting is the name of the game for them, so how do I power this? I'd have to build a whole bunch of pipes and deliver everything to some Elasticsearch or some kind of cluster that I'd have to keep up in real-time. This will take me a couple of months, that will take me a couple of months. They came to Rockset on a Thursday, built their MVP over the weekend, and had the first working version of their product the following Tuesday.

And then there was no turning back at that point; not a single line of code was written. You just go and create an account with Rockset, point us at your Dynamo, and then off you go—you can start using SQL and go build your real-time application. So again, I think there's tremendous value; a lot of customers like us, and a lot of customers love us.
And if you really ask them what the one thing about Rockset is that they really like, I think it comes back to the same thing, which is: you gave me a lot of time back.

What I thought would take six months is now a week. What I thought would be three weeks, we got in a day. And that allows me to focus on my business. I want to spend more time with my stakeholders—you know, my CPO, my sales teams—and see what they need to grow our business and succeed, and not build yet another data pipeline and have data pipelines and other things coming out of my nose, you know? So, at the end of the day, the simplicity aspect of it is very, very important for real-time analytics, because we can't really realize our vision of real-time being the new default in every enterprise, wherever analytics is concerned, without making it very, very simple and accessible to everybody.

And so, that continues to be one of our core things. And I think you're absolutely right when you say the biggest expense is actually the people and the time and energy they have to spend. Not having to stand up a huge data-ops team that is building and managing all of these things is probably the number one reason why customers really, really like working with our product.

Corey: I want to thank you for taking so much time to talk me through what you're working on these days. If people want to learn more, where's the best place to find you?

Venkat: We are Rockset—I'll spell it out for your listeners, R-O-C-K-S-E-T, rock set—rockset.com. You can go there, you can start a free trial. There is a blog at rockset.com/blog, a prolific blog that is very active. We have all sorts of stories there, from engineers talking about how they implemented certain things to customer case studies.

So, if you're really interested in this space, that's one space to follow and watch. If you're interested in giving this a spin, you can go to rockset.com and start a free trial.
If you want to talk to someone, there is a ‘Request Demo' button there; you click it, and one of our solutions people or somebody who is more familiar with Rockset will get in touch with you, and you can have a conversation with them.

Corey: Excellent. And links to that will of course go in the [show notes 00:34:20]. Thank you so much for your time today. I appreciate it.

Venkat: Thanks, Corey. It was great.

Corey: Venkat Venkataramani, co-founder and CEO at Rockset. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice along with an insulting, crappy comment that I will immediately see show up on my real-time dashboard.

Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.

Announcer: This has been a HumblePod production. Stay humble.

Screaming in the Cloud
Creating “Quinntainers” with Casey Lee

Screaming in the Cloud

Play Episode Listen Later Apr 20, 2022 46:16


About Casey
Casey spends his days leveraging AWS to help organizations improve the speed at which they deliver software. With a background in software development, he has spent the past 20 years architecting, building, and supporting software systems for organizations ranging from startups to Fortune 500 enterprises.

Links Referenced:
“17 Ways to Run Containers in AWS”: https://www.lastweekinaws.com/blog/the-17-ways-to-run-containers-on-aws/
“17 More Ways to Run Containers on AWS”: https://www.lastweekinaws.com/blog/17-more-ways-to-run-containers-on-aws/
kubernetestheeasyway.com: https://kubernetestheeasyway.com
snark.cloud/quinntainers: https://snark.cloud/quinntainers
ECS Chargeback: https://github.com/gaggle-net/ecs-chargeback
twitter.com/nektos: https://twitter.com/nektos

Transcript
Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.

Corey: This episode is sponsored by our friends at Revelo. Revelo is the Spanish word of the day, and it's spelled R-E-V-E-L-O. It means “I reveal.” Now, have you tried to hire an engineer lately? I assure you it is significantly harder than it sounds. One of the things that Revelo has recognized is something I've been talking about for a while, specifically that while talent is evenly distributed, opportunity is absolutely not. They're exposing a new talent pool to, basically, those of us without a presence in Latin America via their platform. It's the largest tech talent marketplace in Latin America, with over a million engineers in their network, which includes—but isn't limited to—talent in Mexico, Costa Rica, Brazil, and Argentina.
Now, not only do they wind up screening all of their talent on English ability as well as, you know, their engineering skills, but they go significantly beyond that. Some of the folks on their platform are hands down the most talented engineers that I've ever spoken to. Let's also not forget that Latin America has high time-zone overlap with what we have here in the United States, so you can hire full-time remote engineers who share most of the workday with your team. It's an end-to-end talent service, so you can find and hire engineers in Central and South America without having to worry about, frankly, the colossal pain of cross-border payroll and benefits and compliance, because Revelo handles all of it. If you're hiring engineers, check out revelo.io/screaming to get 20% off your first three months. That's R-E-V-E-L-O dot I-O slash screaming.

Corey: Couchbase Capella Database-as-a-Service is flexible, full-featured and fully managed with built in access via key-value, SQL, and full-text search. Flexible JSON documents aligned to your applications and workloads. Build faster with blazing fast in-memory performance and automated replication and scaling while reducing cost. Capella has the best price performance of any fully managed document database. Visit couchbase.com/screaminginthecloud to try Capella today for free and be up and running in three minutes with no credit card required. Couchbase Capella: make your data sing.

Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. My guest today is someone I had the pleasure of meeting at re:Invent last year, but we'll get to that story in a minute. Casey Lee is the CTO of a company called Gaggle, which is—as they frame it—saving lives. Now, that seems to be a relatively common position that an awful lot of different tech companies take. “We're saving lives here.” It's, “You show banner ads, and some of them are attack platforms for JavaScript malware.
Let's be serious here.” Casey, thank you for joining me, and what makes the statement that Gaggle saves lives not patently ridiculous?

Casey: Sure. Thanks, Corey. Thanks for having me on the show. So Gaggle, we're an ed-tech company. We sell software to school districts, and school districts use our software to help protect their students while the students use the school-issued Google or Microsoft accounts.

So, we're looking for signs of bullying, harassment, self-harm, and potentially suicide from K-12 students while they're using these platforms. They will take the thoughts, concerns, emotions they're struggling with and write them in their school-issued accounts. We detect that and then we notify the school districts, and they get the students the help they need before they can do any permanent damage to themselves. We protect about 6 million students throughout the US. We ingest a lot of content.

Last school year, over 6 billion files, and about an equal number of emails, were ingested. We're looking for concerning content, and then we have humans review the stuff that our machine learning algorithms detect and flag. About 40 million items had to go in front of humans last year, which resulted in about 20,000 of what we call PSSes. These are Possible Student Situations where students are talking about harming themselves or harming others. And that resulted in what we like to track as lives saved: 1,400 incidents last school year where a student was dealing with suicidal ideation and planning to take their own life. We detect that and get them help within minutes before they can act on it. That's what Gaggle has been doing. We're using tech, solving tech problems, and also saving lives as we do it.

Corey: It's easy to lob a criticism at some of the things you're alluding to, the idea of oh, you're using machine learning on student data for young kids, yadda, yadda, yadda.
Look at the outcome, look at the privacy controls you have in place, and look at the outcomes you're driving to. Now, I don't necessarily trust a number of school administrations not to become heavy-handed and overbearing with it, but let's be clear, that's not the intent. That is not what the success stories you have alluded to show. I've got to say I'm a fan, so thanks for doing what you're doing. I don't say that very often to people who work in tech companies.

Casey: Cool. Thanks, Corey.

Corey: But let's rewind a bit because you and I had passed like ships in the night on Twitter for a while, but last year at re:Invent something odd happened. First, my business partner procrastinated at getting his ticket—that's not the odd part; he does that a lot—but then suddenly ticket sales slammed shut and none were to be had anywhere. You reached out with a, “Hey, I have a spare ticket because someone can't go. Let me get it to you.” And I said, “Terrific. Let me pay you for the ticket and take you to dinner.”

You said, “Yes on the dinner, but I'd rather you just look at my AWS bill and don't worry about the cost of the ticket.” “All right,” said I. I know a deal when I see one. We grabbed dinner at the Venetian. I said, “Bust out your laptop.” And you said, “Oh, I was kidding.” And I said, “Great. I wasn't. Bust it out.”

And you went from laughing to taking notes in about the usual time that happens when I start looking at these things. But how was your recollection of that? I always tend to romanticize some of these things. Like, “And then everyone in the restaurant just turned, stopped, and clapped the entire time.” Maybe that part didn't happen.

Casey: Everything was right up until the clapping part. That was a really cool experience. I appreciate you walking through that with me.
Yeah, we've got lots of opportunity to save on our AWS bill here at Gaggle, and in that little bit of time that we had together, I think I walked away with no more than a dozen ideas for where to shave some costs. The most obvious one, the first thing that you keyed in on, is we had RIs coming due that weren't really well-optimized, and you steered me towards savings plans. We put that in place and we're able to apply those savings plans not just to our EC2 instances but also to our serverless spend as well.

So, that was a very worthwhile and cost-effective dinner for us. The thing that was most surprising though, Corey, was your approach. Your approach to how to review our bill was not what I thought at all.

Corey: Well, what did you expect my approach was going to be? Because this always is of interest to me. Like, do you expect me to, like, whip a portable machine learning rig out of my backpack full of GPUs or something?

Casey: I didn't know if you had, like, some secret tool you were going to hit, or if nothing else, I thought you were going to go for the Cost Explorer. I spend a lot of time in Cost Explorer; that's my go-to tool, and you wanted nothing to do with Cost Exp—I think I was actually pulling up Cost Explorer for you and you said, “I'm not interested. Take me to the bills.” So, we went right to the billing dashboard, you started opening up the invoices, and I thought to myself, “I don't remember the last time I looked at an AWS invoice.” I just, it's noise; it's not something that I pay attention to.

And I learned something: that you get a real quick view of both the cost and the usage. And that's what you were keyed in on, right? And you were looking at things relative to each other. “Okay, I have no idea about Gaggle or what they do, but normally, for a company that's spending x amount of dollars in EC2, why is your data transfer cost the way it is?
Is that high or low?” So, you're looking for kind of relative numbers, but it was really cool watching you slice and dice that bill through the dashboard there.

Corey: There are a few things I tie together there. Part of it is this sort of surprising thing that people don't think about: start with big numbers first, rather than going alphabetically, because I don't really care about your $6 Alexa for Business spend. I care a bit more about the $6 million, or whatever it happens to be, at EC2—I'm pulling numbers completely out of the ether, let's be clear; I don't recall what the exact magnitude of your bill is and it's not relevant to the conversation.

And then you see that and it's like, “Huh. Okay, you're spending $6 million on EC2. Why are you spending 400 bucks on S3? Seems to me that those two should be a little closer aligned. What's the deal here? Oh, God, you're using eight petabytes of EBS volumes. Oh, dear.”

And just, it tends to lead to interesting stuff. Breaking it down by region, service, and usage type is what shows up on those exploded bills, and that's where I tend to start. It also is one of the easiest things to wind up having someone throw into a PDF and email my way if I'm not doing it in a restaurant with, you know, people standing around clapping.

Casey: [laugh]. Right.

Corey: I also want to highlight that you've been using AWS for a long time. You're a Container Hero; you are not bad at understanding the nuances and depths of AWS, so I take praise from you around this stuff as valuing it very highly. This stuff is not intuitive, it is deeply nuanced, and you have a business outcome you are working towards that invariably is not oriented day in, day out around, “How do I get these services for less money than I'm currently paying?” But that is how I see the world, and I tend to live in a very different space just based on the nature of what I do. It's sort of a case study in the advantage of specialization.
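That triage order (biggest line items first, then services compared relative to each other) can be sketched in a few lines. The line items below are invented for illustration, standing in for what a real invoice or Cost and Usage Report would contain; this is a toy, not a tool from the show.

```python
# Toy sketch of the bill-reading approach described above: sort the biggest
# line items first instead of going alphabetically, then eyeball services
# relative to each other. The numbers here are made up for illustration.
line_items = {
    ("EC2", "us-east-1"): 6_000_000.00,
    ("S3", "us-east-1"): 400.00,
    ("DataTransfer", "us-east-1"): 250_000.00,
    ("Alexa for Business", "us-east-1"): 6.00,
}

def triage(items):
    """Return (service, region, cost) tuples sorted by cost, largest first."""
    return sorted(
        ((svc, region, cost) for (svc, region), cost in items.items()),
        key=lambda row: row[2],
        reverse=True,
    )

for svc, region, cost in triage(line_items):
    print(f"{svc:20s} {region:12s} ${cost:,.2f}")
```

A real pass would read the invoice export rather than a hard-coded dict, but the ordering is the point: the $6 Alexa for Business line sinks to the bottom, and the odd EC2-to-S3 ratio is the first thing you see.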
But I know remarkably little about containers, which is how we wound up reconnecting about a week or so before we did this recording.

Casey: Yeah. I saw your tweet; you were trying to run some workload—container workload—and I could hear the frustration on the other end of Twitter when you were shaking your fist at—

Corey: I should not tweet angrily, and I did in this case. And, eh, every time I do I regret it. But it played well with the people, so that does help. I believe my exact comment was, “‘me: I've got this container. Run it, please.' ‘Google Cloud: Run. You got it, boss.' AWS has 17 ways to run containers and they all suck.”

And that's painting with an overly broad brush, let's be clear, but that was at the tail end of two or three days of work trying to solve a very specific, very common business problem, and I was just beating my head off of a wall again and again and again. And it took less than half an hour from start to finish with Google Cloud Run and I didn't have to think about it anymore. And it's one of those moments where you look at this and realize that the future is here, we just don't see it in certain ways. And you took exception to this. So please, let's dive in because 280 characters of text after half a bottle of wine is not the best context to have a nuanced discussion that leaves friendships intact the following morning.

Casey: Nice. Well, I just want to make sure I understand the use case first because I was trying to read between the lines on what you needed, but let me take a guess. My guess is you've got your source code in GitHub, you have a Dockerfile, and you want to be able to take that repo from GitHub and just have it continuously deployed somewhere in Run. And you don't want to have headaches with it; you just want to push more changes up to GitHub, Docker Build runs, and it updates some service somewhere. Am I right so far?

Corey: Ish, but think a little further up the stack. It was in service of this show.
So, this show, as people who are listening to this are probably aware by this point, periodically has sponsors, which we love: We thank them for participating in the ongoing support of this show, which empowers conversations like this. Sometimes a sponsor will come to us with, “Oh, and here's the URL we want to give people.” And it's, “First, you misspelled your company name from the common English word; there are three sublevels within the domain, and then you have a complex UTM tagging tracking co—yeah, you realize people are driving to work when they're listening to this?”

So, a while back I built a link shortener, snark.cloud, because is it the shortest thing in the world? Not really, but it's easily understandable when I say that, and people hear it for what it is. And that's been running for a long time as an S3 bucket full of redirects, behind CloudFront. So, I wind up adding a zero-byte object with a redirect parameter on it, and it just works.

Now, the challenge that I have here as a business is that I am increasingly prolific these days. So, anything that I am not directly required to be doing, I probably shouldn't necessarily be the one to do. And care and feeding of those redirect links is a prime example of this. So, I went hunting, and the thing that I was looking for was, obviously, do the redirect. Now, if you pull up GitHub, there are hundreds of solutions here.

There are AWS blog posts. One that I really liked and almost got working was Eric Johnson's three-part blog post on how to do it serverlessly, with API Gateway and DynamoDB, no Lambdas required. I really liked aspects of what that was, but it was complex, I kept smacking into weird challenges as I went, and front end is just baffling to me.
Because I needed a front end app for people to be able to use here; I need to be able to secure that because it turns out that if anyone who stumbles across the URL can redirect things to other places, well, you've just empowered a whole bunch of spam email, and you're going to find that service abused, and everyone starts blocking it, and then you have trouble. Nothing survives the first encounter with jerks.

And I was getting more and more frustrated, and then, with a few creative search terms, I found something on GitHub by a Twitter engineer who used to work at Google Cloud. And what it uses as a client is this: it doesn't build any kind of custom web app. Instead, as a database, it uses not S3 objects, not Route 53—the ideal database—but a Google sheet, which sounds ridiculous, but every business user here knows how to use that.

Casey: Sure.

Corey: And it looks for the two columns. The first one is the slug after the snark.cloud, and the second is the long URL. And it has a TTL of five seconds on cache, so make a change to that spreadsheet and five seconds later, it's live. Everyone gets it, I don't have to build anything new, I just put it somewhere the relevant people can access it, I gave them a tutorial and a giant warning on it, and everyone gets that. And it just works well. It was, “Click here to deploy. Follow the steps.”

And the documentation was a little, eh, okay; I had to undo it once and redo it again. Getting the domain registered and ported over took a bit of time, and there were some weird SSL errors as the certificates were set up, but once all of that was done, it just worked. And I tested the heck out of it, and cold starts are relatively low, and the entire thing fits within the free tier. And it is reminiscent of the magic that I first saw when I started working with some of the cloud providers' services, years ago.
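The redirector Corey describes (a two-column sheet of slug and long URL, a five-second cache TTL, case-insensitive slugs, and a temporary redirect) can be sketched roughly like this. The `fetch_rows` function and the example slug are stand-ins for the real Google Sheets read, not the actual project's code.

```python
import time

# Sketch of the sheet-backed redirector described above: two columns
# (slug -> long URL), a five-second cache TTL, and a temporary 302 so a
# slug's destination can change later without fighting stale caches.
# fetch_rows() stands in for the real Google Sheets read; the slug and
# URL below are invented for illustration.

def fetch_rows():
    return {"example": "https://example.com/some/long/landing/page"}

_cache = {"rows": None, "fetched_at": 0.0}
CACHE_TTL_SECONDS = 5

def lookup(slug, now=None):
    """Return (status, location) for a slug, refreshing the cache every 5s."""
    now = time.time() if now is None else now
    if _cache["rows"] is None or now - _cache["fetched_at"] > CACHE_TTL_SECONDS:
        _cache["rows"] = fetch_rows()
        _cache["fetched_at"] = now
    target = _cache["rows"].get(slug.lower())  # case-insensitive, as above
    if target is None:
        return 404, None
    return 302, target
```

The 302 instead of a 301 is the detail Corey mentions changing later in the conversation: a permanent redirect gets cached by browsers, so a sponsor's destination could never be swapped out.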
It's been a long time since I had that level of delight with something, especially after three days of frustration. It's one of the, “This is a great service. Why are people not shouting about this from the rooftops?” That was my perspective. And I put it out on Twitter and oh, Lord, did I get comments. What was your take on it?

Casey: Well, so my take was, when you're evaluating a platform to use for running your applications, how fast it can get you to Hello World is not necessarily the best way to go. I just assumed you're wrong. I assumed of the 17 ways AWS has to run containers, Corey just doesn't understand. And so I went after it. And I said, “Okay, let me see if I can find a way that solves his use case, as I understand it, through a quick tweet.”

And so I tried App Runner; I saw that App Runner does not meet your needs because you have to somehow get your Docker image pushed up to a repo. App Runner can take an image that's already been pushed up and deploy it for you, or it can build from source, but neither of those fit the way I understood your use case.

Corey: Having used App Runner before via the Copilot CLI, it is the closest, as best I can tell, to achieving what I want. But also let's be clear that I don't believe there's a free tier; there needs to be a load balancer in front of it, so you're starting with 15 bucks a month for this thing. Which is not the end of the world. Had I known at the beginning that all of this was going to be there, I would have just signed up for a bit.ly account and called it good. But here we are.

Casey: Yeah. I tried Copilot. Copilot is a great developer experience, but it also is just pulling together tons of—I mean, just trying to do a Copilot service deploy, VPCs are being created and tons of IAM roles are being created, code pipelines, there's just so much going on.
I was like 20 minutes into it, and I said, “Yeah, this is not fitting the bill for what Corey was looking for.” Plus, it doesn't match the way I understood your use case, which is you don't want to worry about builds; you just want to push code and have new Docker images get built for you.

Corey: Well, honestly, let's be clear here: once it's up and running, I don't want to ever have to touch the silly thing again.

Casey: Right.

Corey: And that so far has been the case, after I forked the repo and made a couple of changes to it that I wanted to see. One of them was to render the entire thing case-insensitive because I get that one wrong a lot, and the other is I wanted to change the permanent 301 redirect to a temporary 302 redirect because occasionally, sponsors will want to change where it goes in the fullness of time. And that is just fine, but I want to be able to support that and not have to deal with old cached data. So, getting that up and running was a bit of a challenge. But the way that it worked was following the instructions in the GitHub repo.

The developer environment that spun up in Google's Cloud Shell was just spectacular. It prompted me for a few things and it told me step by step what to do. This is the sort of thing I could have given a basically non-technical user, and they would have had success with it.

Casey: So, I tried it as well. I said, “Well, okay, if I'm going to respond to Corey here and challenge him on this, I need to try Cloud Run.” I had no experience with Cloud Run. I had a small example repo that loosely mapped what I understood you were trying to do. Within five minutes, I had Cloud Run working.

And I was surprised: anytime I pushed a new change, within 45 seconds the change was built and deployed. So, here's my conclusion, Corey. Google Cloud Run is great for your use case, and AWS doesn't have the perfect answer. But here's my challenge to you.
I think that you just proved why there are 17 different ways to run containers on AWS: because there are that many different types of users with different needs, and you just happen to be number 18 that hasn't gotten the right attention yet from AWS.

Corey: Well, let's be clear, like, my gag about 17 ways to run containers on AWS was largely a joke, and it went around the internet three times. So, I wrote a list of them in the blog post “17 Ways to Run Containers in AWS” and people liked it. And then a few months later, I wrote “17 More Ways to Run Containers on AWS” listing 17 additional services that all run containers.

And my favorite email that I think I've ever received in feedback was from a salty AWS employee, saying that one of them didn't really count because of some esoteric reason. And it turns out that when I'm trying to make a point of you have a sarcastic number of ways to run containers, pointing out that, well, one of them isn't quite valid doesn't really shatter the argument, let's be very clear here. So, I appreciate the feedback, I always do. And it's partially snark, but there is an element of truth to it in that customers don't want to run containers, by and large. That is what they do in service of a business goal.

And they want their application to run, which in turn serves the business goal that continues to abstract out into, “Remain a going concern via the current position the company stakes out.” In your case, it is saving lives; in my case, it is fixing horrifying AWS bills and making fun of Amazon at the same time, and in most other places, there are somewhat more prosaic answers to that. But containers are simply an implementation detail, to some extent—to my way of thinking—of getting to that point. An important one [unintelligible 00:18:20], let's be clear; I was very anti-container for a long time. I wrote a talk, “Heresy in the Church of Docker” that then was accepted at ContainerCon.
It's like, “Oh, boy, I'm not going to leave here alive.”

And the honest answer is, many years later, that Kubernetes solves almost all the criticisms that I had, with the downside of, well, first you have to learn Kubernetes, and that continues to be mind-bogglingly complex from where I sit. There's a reason that I've registered kubernetestheeasyway.com and repointed it to ECS, Amazon's container service that does not require you to cosplay as a cloud provider yourself. But even ECS has a number of challenges to it, I want to be very clear here. There are no silver bullets in this.

And you're completely correct in that I have a large, complex environment, and the application is nuanced, and I'm willing to invest a few weeks in setting up the baseline underlying infrastructure on AWS with some of these services, ideally not all of them at once because that's something a lunatic would do, but getting them up and running. The other side of it, though, is that if I am trying to evaluate a cloud provider's handling of containers and how this stuff works, the reason that everyone starts with a Hello World-style example is that it delivers, ideally, the shortest mean time to dopamine. There's a reason that Hello World doesn't have 18 different dependencies across a bunch of different databases and message queues and all the other complicated parts of running a modern application: because you just want to see how it works out of the gate. And if getting that baseline empty container that just returns the string ‘Hello World' is that complicated and requires that much work, my takeaway is not that this user experience is going to get better once I make the application itself more complicated.

So, I find that off-putting. My approach has always been to find something that I can get the easy, minimum viable thing up and running on, and then, as I expand, know that you'll be there to catch me as my needs intensify and become ever more complex.
But if I can't get the baseline thing up and running, I'm unlikely to be super enthused about continuing to beat my head against the wall like, “Well, I'll just make it more complex. That'll solve the problem.” Because it often does not. That's my position.

Casey: Yeah, I agree that dopamine hit is valuable in getting people attached to, and wanting to invest in, whatever tech stack they're using. The challenge is your second part of that: will it grow with me and scale with me and support the complex edge cases that I have? And the problem I've seen is a lot of organizations will start with something that's very easy to get started with and then quickly outgrow it, and then come up with all sorts of weird Rube Goldberg-type solutions. Because they jumped all in before seeing—I've got kind of an example of that.

I'm happy to announce that there's now 18 ways to run containers on AWS. Because in your use case, in the spirit of AWS customer obsession, I hear your use case, and I've created an open-source project that I want to share called Quinntainers—

Corey: Oh, no.

Casey: —and it solves—yes. Quinntainers is live and is ready for the world. So, now we've got 18 ways to run containers. And if you have Corey's use case of, “Hey, here's my container. Run it for me,” now we've got one command that you can run to get things going for you. I can share a link for you and you can check it out. This is a [unintelligible 00:21:38]—

Corey: Oh, we're putting that in the [show notes 00:21:37], for sure. In fact, if you go to snark.cloud/quinntainers, you'll find it.

Casey: You'll find it. There you go. The idea here was this: there is a real use case that you had, and I looked at it; AWS does not have an out-of-the-box simple solution for you. I agree with that. And Google Cloud Run does.

Well, the answer would have been, from AWS, “Well, then here, we need to make that solution.” And so that's what this was: a way to demonstrate that it is a solvable problem.
AWS has all the right primitives; it's just that this use case hadn't been covered. So, how does Quinntainers work? Real straightforward: it's a command-line—it's an npm tool.

You just run npx quinntainer; it sets up a GitHub Actions role in your AWS account, it then creates a GitHub Actions workflow in your repo, and then uses the Quinntainer GitHub action—a reusable action—that creates the image for you; every time you push to the branch, it pushes it up to ECR, and then automatically pushes up that new version of the image to App Runner for you. So, now it's using App Runner under the covers, but it's providing that nice developer experience that you are getting out of Cloud Run. Look, is Quinntainers really the right way to go for running containers? No, I'm not making that point at all. But the point is it is a—

Corey: It might very well be.

Casey: Well, if you want to show a good Hello World experience, Quinntainers is the best because within 30 seconds, your app is now set up to continuously deliver containers into AWS for your very specific use case. The problem is, it's not going to grow for you. I mean, it was something I did over the weekend just for fun; it's not something that would ever be worthy of hitching up a real production workload to. So, the point there is, you can build frameworks and tools that are very good at getting that initial dopamine hit, but then are not necessarily going to be there for you as you mature and get more complex.

Corey: And yet, I've tilted a couple of times at the windmill of integrating GitHub Actions in anything remotely resembling a programmatic way with AWS services, as far as instance roles go. Are you using permanent credentials for this as stored secrets or are you doing the OIDC handoff?

Casey: OIDC.
So, what happens is the tool creates the IAM role for you with the trust policy on GitHub's OIDC provider, sets all that up for you in your account, and locks it down so that just your repo and your main branch is able to assume the role; the role is set up just to allow deployments to App Runner and the ECR repository. And then that's it. At that point, it's out of your way. And you just git push, and a couple minutes later, your updates are now running in App Runner for you.

Corey: This episode is sponsored in part by our friends at Vultr. Optimized cloud compute plans have landed at Vultr to deliver lightning-fast processing power, courtesy of third-gen AMD EPYC processors, without the IO or hardware limitations of a traditional multi-tenant cloud server. Starting at just 28 bucks a month, users can deploy general-purpose, CPU-, memory-, or storage-optimized cloud instances in more than 20 locations across five continents. Without looking, I know that once again, Antarctica has gotten the short end of the stick. Launch your Vultr optimized compute instance in 60 seconds or less on your choice of included operating systems, or bring your own. It's time to ditch convoluted and unpredictable giant tech company billing practices and say goodbye to noisy neighbors and egregious egress forever.

Vultr delivers the power of the cloud with none of the bloat. “Screaming in the Cloud” listeners can try Vultr for free today with $150 in credit when they visit getvultr.com/screaming. That's G-E-T-V-U-L-T-R dot com slash screaming. My thanks to them for sponsoring this ridiculous podcast.

Corey: Don't undersell what you've just built. Is this what I would use for a large-scale production deployment? Obviously not, but it has streamlined and made incredibly accessible things that previously have been very complex for folks to get up and running.
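The trust arrangement Casey describes (GitHub's OIDC provider as the federated principal, with a condition pinning the role to one repo's main branch) looks roughly like the policy below. The account ID and repo name are placeholders, and this is a hand-written approximation of the standard GitHub-to-AWS OIDC pattern, not the tool's actual output.

```python
import json

# Sketch of an IAM role trust policy for GitHub Actions OIDC federation:
# GitHub's token issuer is the federated principal, and the StringEquals
# condition on the "sub" claim restricts assumption to a single repo's
# main branch. ACCOUNT_ID and REPO are placeholders for illustration.
ACCOUNT_ID = "123456789012"
REPO = "example-org/example-repo"

trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Federated": f"arn:aws:iam::{ACCOUNT_ID}:oidc-provider/"
                             "token.actions.githubusercontent.com"
            },
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {
                "StringEquals": {
                    "token.actions.githubusercontent.com:aud": "sts.amazonaws.com",
                    "token.actions.githubusercontent.com:sub": f"repo:{REPO}:ref:refs/heads/main",
                }
            },
        }
    ],
}

print(json.dumps(trust_policy, indent=2))
```

The role's permissions policy (not shown) would then grant only the ECR push and App Runner deployment actions, matching the "locked down to just deployments" behavior described above.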
One of the most disturbing themes behind some of the feedback I got was, at one point I said, “Well, have you tried running a Docker container on Lambda?” Because now it supports containers as a packaging format. And I said no because I spent a few weeks getting Lambda up and running back when it first came out and I've basically been copying and pasting what I got working ever since, the way most of us do.

And the response is, “Oh, that explains a lot.” With the implication being that I'm just a fool. Maybe, but let's be clear, I am never the only person in the room who doesn't know how to do something; I'm just loud about what I don't know. And the failure mode of a bad user experience is that a customer feels dumb. And that's not okay because this stuff is complicated, and when a user has a bad time, it's a bug.

I learned that in 2012 from Jordan Sissel, the creator of Logstash. He has been an inspiration to me for the last ten years. And that's something I try to live by: if a user has a bad time, something needs to get fixed. Maybe it's the tool itself, maybe it's the documentation, maybe it's the way that the GitHub repo's readme is structured.

Because I am not a trailblazer in most things, nor do I intend to be. I'm not the world's best engineer by a landslide. Just look at my code and you'd argue the fact that I'm an engineer at all. But if it's bad and it works, how bad is it? is sort of the other side of it.

So, my problem is that there needs to be a couple of things. Ignore for a second the aspect of making it the right answer to get something out of the door. The fact is that I want to take this container and just run it, and you and I both reached for App Runner as the default AWS service that does this, because I've been swimming in the AWS waters a while and you're a frickin' AWS Container Hero, where it is expected that you know what most of these things do.
For someone who shows up on the containers webpage—which by the way lists, I believe, 15 ways to run containers on mobile and 19 ways to run containers on non-mobile, which is just fascinating in its own right—it's overwhelming, it's confusing, and it's not something that makes it abundantly clear what the golden path is. First, get it up and working, get it running, then you can add nuance and flavor and the rest, and I think that's something that's gotten overlooked in our mad rush to pretend that we're all Google engineers, circa 2012.

Casey: Mmm. I think people get stressed out when they try to run containers in AWS because they think, “What is that golden path?” You said golden path. And my advice to people is there is no golden path. And the great thing about AWS is they do continue to invest in the solutions they come up with. I'm still bitter about Google Reader.

Corey: As am I.

Casey: Yeah. I put so much time into getting my perfect set of RSS feeds and then I had to find somewhere else to—with AWS, the different offerings that are available for running containers, those are there intentionally; it's not by accident. They're there to solve specific problems, so the trick is finding what works best for you, and don't feel like one is better than the other or is going to get more attention than others. And they each have different use cases.

And I approach it this way. I've seen a couple of different people do some great flowcharts—I think Forrest did one, Vlad did one—on ways to make the decision on how to run your containers. And I break it down to three questions. I ask people, first of all, where are you going to run these workloads?
If someone says, “It has to be in the data center,” okay, cool, then ECS Anywhere or EKS Anywhere, and we'll figure out if Kubernetes is needed.

If they have specific requirements—so if they say, “No, we can run in the cloud, but we need privileged mode for containers,” or, “We need EBS volumes,” or, “We want really small container sizes,” like, less than a quarter-vCPU or less than half a gig of RAM—or if you have custom log requirements, Fargate is not going to work for you, so you're going to run on EC2. Otherwise, run it on Fargate. But that's the first question: figure out where you are going to run your containers. That leads to the second question: What's your control plane?

But those are different, sort of related but different questions. And I only see six options there. That's App Runner for your control plane, Lightsail for your control plane, ROSA if you're invested in OpenShift already, EKS either if you have momentum in Kubernetes or you have a bunch of engineers that have a bunch of experience with Kubernetes—if you don't have either, don't choose it—or ECS. The last option is Elastic Beanstalk, but let's leave that as a—if you're not currently invested in Elastic Beanstalk, don't start today. But I look at those as, okay: first question, where am I going to run my containers? Second question, what do I want to use for my control plane? And there are different pros and cons of each of those.

And then the third question: How do I want to manage them? What tools do I want to use for managing deployment? All those other tools like Copilot or App2Container or Proton, those aren't my control plane; those aren't where I run my containers; that's how I manage, deploy, and orchestrate all the different containers. So, I look at it as those three questions. But I don't know, what do you think of that, Corey?

Corey: I think you're onto something. I think that is a terrific way of exploring that question.
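Casey's first two questions can be sketched as a toy decision helper. The branching below is a paraphrase of his heuristics from this conversation, not an official AWS decision tree, and the flag names are invented for illustration.

```python
# Toy encoding of the decision flow described above. Question 1: where do
# the workloads run? Question 2: which control plane? (Question 3, how to
# manage deployments, is about tooling like Copilot/Proton, not a runtime
# choice, so it is left out.)

def where_to_run(on_prem_only=False, needs_privileged=False,
                 needs_ebs_volumes=False, tiny_tasks=False,
                 custom_log_requirements=False):
    """Question 1: data center, EC2-backed, or Fargate."""
    if on_prem_only:
        return "ECS Anywhere / EKS Anywhere"
    if (needs_privileged or needs_ebs_volumes or tiny_tasks
            or custom_log_requirements):
        return "EC2-backed"  # Fargate can't satisfy these requirements
    return "Fargate"

def control_plane(invested_in_openshift=False, kubernetes_momentum=False,
                  wants_scale_to_zero=False, console_simplicity=False):
    """Question 2: pick among the six control-plane options mentioned."""
    if invested_in_openshift:
        return "ROSA"
    if kubernetes_momentum:
        return "EKS"  # only with existing momentum or Kubernetes experience
    if wants_scale_to_zero:
        return "App Runner"
    if console_simplicity:
        return "Lightsail"
    return "ECS"  # Elastic Beanstalk omitted: don't start there today
```

For instance, a cloud-friendly workload with no special requirements and lots of idle time would land on Fargate with App Runner, which matches where both hosts ended up for the link-shortener use case.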
I would argue that setting up a framework like that—one or very similar—is what the AWS containers page should be, just coming from the perspective of what is the neophyte customer experience. On some level, you almost need a slider of, “choose your level of experience,” ranging from, “What's a container?” To, “I named my kid Kubernetes because I make terrible life decisions,” and anywhere in between.Casey: Sure. Yeah, well, and I think that really dictates the control plane level. So, for example, Lightsail, where does Lightsail fit? To me, the value of Lightsail is the simplicity. I'm looking at monthly pricing: seven bucks a month for a container.I don't know how [unintelligible 00:30:23] works, but I can think in terms of monthly pricing. And it's tailored towards a console user, someone who just wants to click in, point to an image. That's a very specific user, there's thousands of customers that are very happy with that experience, and they use it. App Runner presents that scale to zero. That's one of the big selling points I see with App Runner. Likewise, with Google Cloud Run. I've got that scale to zero. I can't do that with ECS, or EKS, or any of the other platforms. So, if you've got something that has a ton of idle time, I'd really be looking at those. I would argue, and I think I did the math, that Google Cloud Run is about 30% more expensive than App Runner.Corey: Yeah, if you disregard the free tier, I think if you have it running persistently at all times throughout the month, without dropping out to cold starts, it would cost something like 40-some-odd bucks a month or something like that. Don't quote me on it. Again, and to be clear, I wound up doing this very congratulatory and complimentary tweet about them on I think it was Thursday, and then they immediately apparently took one look at this and said, “Holy shit. Corey's saying nice things about us. What do we do? What do we do?” Panic.And the next morning, they raised prices on a bunch of cloud offerings.
Whew, that'll fix it. Like—Casey: [laugh].Corey: Di-, did you miss the direction you're going in here? No, that's the exact opposite of what you should be doing. But here we are. Interestingly enough, to tie our two conversation threads together, when I look at an AWS bill, unless you're using Fargate, I can't tell whether you're using Kubernetes or not because EKS is a small charge, in almost every case just for the control plane, or Fargate under it.Everything else just manifests as EC2 spend, from the perspective of the cloud provider. If you're running a Kubernetes cluster, it is a single-tenant application that can have some very funky behaviors like cross-AZ chatter back and forth because there's no internal mechanism to say talk to the free thing, rather than the two cents a gigabyte thing. It winds up spinning up and down in a bunch of different ways, and the behavior patterns, because of how placement works, are not necessarily deterministic, depending upon workload. And that becomes something that people find odd when it's, “Okay, you've looked at our bill for a week; what can you say?”“Well, first question. Are you running Kubernetes at all?” And they're like, “Who invited these clowns?” Understand, we're not prying into your workloads for a variety of excellent legal and contractual reasons, here. We are looking at how they behave, and for specific workloads, once we have a conversation with the engineering team, yeah, we're going to dive in, but it is not at all intuitive from the outside to make any determination whether you're running containers, or whether you're running VMs that you just haven't done anything with in 20 years, or what exactly is going on. And that's just an artifact of the billing system.Casey: We ran into this challenge at Gaggle. We don't use EKS, we use ECS, but we have some shared clusters, lots of EC2 spend, hard to figure out which team is creating the services that run that up.
We actually ended up creating a tool—we open-sourced it—called ECS Chargeback, and what it does is it looks at the CPU and memory reservations for each task definition, and then prorates the overall charge of the ECS cluster, and then creates metrics in Datadog to give us a breakdown of cost per ECS service. And it also measures what we like to refer to as waste, right? Because if you're reserving four gigs of memory, but your utilization never goes over two gigs, we're paying for that reservation, but you're underutilizing.So, we're able to also show which services have the highest degree of waste, not just utilization, so it helps us go after it. But this is a hard problem. I'd be curious, how do you approach these shared ECS resources and slicing and dicing those bills?Corey: Everyone has a different approach, too. There is no unifiable, correct answer. A previous show guest, Peter Hamilton, over at Remind, had done something very similar and open-sourced a bunch of these things. Understanding where your spend is going is important here, and it comes down to getting at the actual business concern because in some cases, effectively dead reckoning is enough. You take a look at the cluster that is really hard to attribute because it's a shared service. Great. It is 5% of your bill.First pass, why don't we just agree that it is a third for Service A, two-thirds for Service B, and we'll call it mostly good at that point? That can be enough in a lot of cases. With scale [laugh] you're just sort of hand-waving over many millions of dollars a year there. How about we get into some more depth? And then you start instrumenting and reporting to something, be it CloudWatch, be it Datadog, be it something else, and understanding what the use case is.In some cases, customers have broken apart shared clusters for that specific reason. I don't think that's necessarily the best approach from an engineering perspective, but again, this is not purely an engineering decision.
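The reservation-based proration Casey describes could be sketched like this; the service names and dollar figures are hypothetical, and it simplifies to memory reservations only (the real ECS Chargeback tool also weighs CPU and reports the metrics to Datadog):

```python
# Hypothetical sketch of reservation-based chargeback: prorate a shared ECS
# cluster's cost by each service's reservations, and flag "waste" — reserved
# capacity that peak utilization never touched.

def chargeback(cluster_cost: float, services: dict) -> dict:
    """services maps name -> {"reserved_gb": x, "used_gb": y}."""
    total_reserved = sum(s["reserved_gb"] for s in services.values())
    report = {}
    for name, s in services.items():
        share = s["reserved_gb"] / total_reserved
        report[name] = {
            "cost": round(cluster_cost * share, 2),
            # paying for reservation the service never uses
            "waste_gb": round(max(0.0, s["reserved_gb"] - s["used_gb"]), 2),
        }
    return report

report = chargeback(1000.0, {
    "api":    {"reserved_gb": 4.0, "used_gb": 2.0},  # reserves 4 GB, peaks at 2 GB
    "worker": {"reserved_gb": 1.0, "used_gb": 0.9},
})
print(report["api"])     # cost: 800.0, waste_gb: 2.0
print(report["worker"])  # cost: 200.0, waste_gb: 0.1
```

This is also exactly Corey's "dead reckoning" point in miniature: the split is driven entirely by reservations, not by what each service actually does on the cluster.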
It comes down to serving the business need. And if you're taking partial credits on that cluster, for an R&D tax credit, for example, you want that position to be extraordinarily defensible, and spending a few extra dollars to ensure that it is can be the right business decision. I mean, again, we're pure advisory; we advise customers on what we would do in their position, but people often mistake that to mean we're going to go for the lowest possible price—bad idea—or that we're going to wind up doing this from a purely engineering-centric point of view.It's, be aware that in almost every case, with some very notable weird exceptions, the AWS bill costs significantly less than the payroll expense that you have of people working on the AWS environment in various ways. People are more expensive, so the idea of, well, you can save a whole bunch of engineering effort by spending a bit more on your cloud, yeah, let's go ahead and do that.Casey: Yeah, good point.Corey: The real mark of someone who's senior enough is their answer to almost any question is, “It depends.” And I feel I've fallen into that trap as well. Much as I'd love to sit here and say, “Oh, it's really simple. You do X, Y, and Z.” Yeah… honestly, my answer, the simple answer, is I think that we orchestrate a cyber-bullying campaign against AWS through the AWS wishlist hashtag, we get people to harass their account managers with repeated requests for, “Hey, could you go ahead and [dip 00:36:19] that thing in—they give that a plus-one for me, whatever internal system you're using?”Just because this is a problem we're seeing more and more. Given that it's an unbounded growth problem, we're going to see it more and more for the foreseeable future. So, I wish I had a better answer for you, but yeah, that stuff's super hard is honest, but it's also not the most useful answer for most of us.Casey: I'd love feedback from anyone from you or your team on that tool that we created.
I can share a link after the fact. ECS Chargeback is what we call it.Corey: Excellent. I will follow up with you separately on that. That is always worth diving into. I'm curious to see new and exciting approaches to this. Just be aware that we have an obnoxious talent sometimes for seeing these things and, “Well, what about”—and asking about some weird corner edge case that either invalidates the entire thing, or you're like, “Who on earth would ever have a problem like that?” And the answer is always, “The next customer.”Casey: Yeah.Corey: For a bounded problem space like the AWS bill, every time I think I've seen it all, I just have to talk to one more customer.Casey: Mmm. Cool.Corey: In fact, the way that we approached your teardown in the restaurant is how we launched our first-pass approach. Because the value in something like that is different than the value of a six-to-eight-week-long, deep-dive engagement into every nook and cranny. And—Casey: Yeah, for sure. It was valuable to us.Corey: Yeah, having someone come in to just spend a day with your team, diving into it up one side and down the other, it seems like a weird thing, like, “How much good could you possibly do in a day?” And the answer in some cases is—with Honeycomb, in a couple of days of something like this, we wound up blowing 10% off their entire operating budget for the company; it led to an increased valuation, and Liz Fong-Jones has said—on multiple occasions—that the company would not be what it is without our efforts on their bill, which is just incredibly gratifying to hear. It's easy to get lost in the idea of well, it's the AWS bill. It's just making big companies spend a little bit less to another big company. And that's not exactly, you know, saving the lives of K through 12 students here.Casey: It's opening up opportunities.Corey: Yeah. It's about optimizing for the win for everyone.
Because now AWS gets a lot more money from Honeycomb than they would if Honeycomb had not continued on their trajectory. It's, you can charge customers a lot right now, or you can charge them a little bit over time and grow with them in a partnership context. I've always opted for the second model rather than the first.Casey: Right on.Corey: But here we are. I want to thank you for taking so much time out of, well, several days now to argue with me on Twitter, which is always appreciated, particularly when it's, you know, constructive—thanks for that—Casey: Yeah.Corey: For helping me get my business partner to re:Invent, although then he got me that horrible 1,000-piece puzzle of the Cloud Native Computing Foundation landscape and now I don't ever want to see him again—so you know, that happens—and of course, spending the time to write Quinntainers, which is going to be at snark.cloud/quinntainers as soon as we're done with this recording. Then I'm going to kick the tires and send some pull requests.Casey: Right on. Yeah, thanks for having me. I appreciate you starting the conversation. I would just conclude with: I think that yes, there are a lot of ways to run containers in AWS; don't let it stress you out. They're there with intention; they're there by design. Understand them.I would also encourage people to go a little deeper, especially if you've got a significantly large workload. You've got to get your hands dirty. As a matter of fact, there's a hands-on lab that a company called Liatrio does. They call it their Night Lab; it's a free, one-day, hands-on lab where you run legacy monolithic Java applications on Kubernetes. It gives you first-hand experience and gets all the way up into observability and doing things like canary deployments. It's a great, great lab.But you've got to do something like that to really get your hands dirty and understand how these things work. So, don't sweat it; there's not one right way.
There's a way that will probably work best for each user; just take the time to understand the options and make sure you're applying the one that's going to give you the most runway for your workload.Corey: I will definitely dig into that myself. But I think you're right, I think you have nailed a point that is, again, a nuanced one and challenging to put in a rage tweet. But the services don't exist in a vacuum. They're not there because, despite the joke, someone wants to get promoted. It's because there are customer needs out there, and this is another way of meeting those needs.I think there could be better guidance, but I also understand that there are a lot of nuanced perspectives here and that… hell is someone else's workflow—Casey: [laugh].Corey: —and there's always value in broadening your perspective a bit on those things. If people want to learn more about you and how you see the world, where's the best place to find you?Casey: Probably on Twitter: twitter.com/nektos, N-E-K-T-O-S.Corey: That might be the first time Twitter has been described as the best place for anything. But—Casey: [laugh].Corey: Thank you once again, for your time. It is always appreciated.Casey: Thanks, Corey.Corey: Casey Lee, CTO at Gaggle and AWS Container Hero. And apparently writing code in anger to invalidate my points, which is always appreciated. Please do more of that, folks. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud.
If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, or in the YouTube comments, which are always a great place to go reading, whereas if you've hated this podcast, please leave a five-star review in the usual places and an angry comment telling me that I'm completely wrong, and then launch your own open-source tool to point out exactly what I've gotten wrong this time.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.Announcer: This has been a HumblePod production. Stay humble.

airhacks.fm podcast with adam bien
A Cloud Migration Story: From J2EE to Serverless Java


Play Episode Listen Later Apr 8, 2022 70:37


An airhacks.fm conversation with Goran Opacic (@goranopacic) about: ZX Spectrum at age 9, Fortran listings as a present, Basic programming on Atari, Manic Miner and Jet Set Willy on Amstrad CPC 64, Defender of the Crown, printing with the C64, desktop publishing with Atari 520 ST and Calamus, testing the first website in 1993, using UUCP to split files into emails, drawing maps with Java applets in the browser, 17-year-old code as a Java AWS Lambda, Cloud Development Kit - applying the Java knowledge to the clouds, Jakarta EE and MicroProfile in the clouds, in the clouds there are different possibilities, mobile sales application with esteh, the serverless Tomcat, Hetzner provides hosting services, no vacuuming on databases, how to become an AWS Data Hero, attending airhacks.com at MUC airport, serverless Quarkus in the clouds, OpenLiberty for Java EE, building AWS Lambdas with Quarkus, Infrastructure as Code and CDK with Java, the cloud has limits, self-mutating CodePipelines, every AWS service has well-documented limits, EC2 spot instances for GraalVM compilations, plain Java SE for asynchronous Lambdas, Goran Opacic on twitter: @goranopacic, Goran's blog: madabout.cloud

Lambda3 Podcast
Lambda3 Podcast 292 – Validando o produto com MVP


Play Episode Listen Later Mar 25, 2022 92:40


In this podcast episode, Lambdas from the Agility, Data, and Development teams talk about validating a product with an MVP, explaining the process and answering questions about it. Join our Telegram group and share your comments with us: https://lb3.io/telegram Podcast feed: www.lambda3.com.br/feed/podcast Podcast feed with technical episodes only: www.lambda3.com.br/feed/podcast-tecnico Podcast feed with non-technical episodes only: www.lambda3.com.br/feed/podcast-nao-tecnico Lambda3 · #292 - Validando o produto com MVP Agenda: What is an MVP? Is it the same thing as a prototype? When is the right moment to build an MVP? How do you discover which part of the product is the minimum needed to validate it in the market? After all, what does validating the MVP actually mean? What kinds of metrics do we need to watch after the MVP goes into production? What happens after the MVP? Links: Business model generator MVP Canvas Vanity metrics Book: The Lean Startup (Eric Ries) Book: Direto ao Ponto: Criando produtos de forma enxuta (Paulo Caroli) Book: Inspired: How to Create Tech Products Customers Love (Marty Cagan) Book: Crossing the Chasm (Geoffrey A. Moore) Book: Unicórnio verde e amarelo (Paulo Veras, Tania Menai) Book: Computer Engineering for Babies (Chase Roberts) Lambda3 Podcast - UX/UI or Product Designer? Lambda3 Podcast - What is Project Initiation? Lambda3 - You never forget your first Inception Participants: Ahirton Lopes - @ahirton Camila Alves - @camilaalvescp Fernando Okuma - @feokuma Lucas Mendes - @lucas-meendes Editing: Compasso Coolab Credits for the music used in this program: Music by Kevin MacLeod (incompetech.com) licensed under Creative Commons: By Attribution 3.0 - creativecommons.org/licenses/by/3.0

AWS Morning Brief
The Surprise Mandoogle


Play Episode Listen Later Mar 17, 2022 5:55


Links Referenced: Couchbase Capella: https://couchbase.com/screaminginthecloud couchbase.com/screaminginthecloud: https://couchbase.com/screaminginthecloud blog post: https://awsteele.com/blog/2022/02/03/aws-vpc-data-exfiltration-using-codebuild.html AutoWarp: https://orca.security/resources/blog/autowarp-microsoft-azure-automation-service-vulnerability/ “Google Announces Intent to Acquire Mandiant”: https://www.googlecloudpresscorner.com/2022-03-08-mgc password table: https://www.hivesystems.io/blog/are-your-passwords-in-the-green New Relic: http://newrelic.com newrelic.com/morningbrief: http://newrelic.com/morningbrief DirtyPipe: https://www.theregister.com/2022/03/08/in_brief_security/ “Manage AWS resources in your Slack channels with AWS Chatbot”: https://aws.amazon.com/blogs/mt/manage-aws-resources-in-your-slack-channels-with-aws-chatbot/ “How to set up federated single-sign-on to AWS using Google Workspace”: https://aws.amazon.com/blogs/security/how-to-set-up-federated-single-sign-on-to-aws-using-google-workspace/ Cloudsaga: https://github.com/awslabs/aws-cloudsaga lastweekinaws.com: https://lastweekinaws.com TranscriptCorey: This is the AWS Morning Brief: Security Edition. AWS is fond of saying security is job zero. That means it's nobody in particular's job, which means it falls to the rest of us. Just the news you need to know, none of the fluff.Corey: Couchbase Capella Database-as-a-Service is flexible, full-featured, and fully managed with built-in access via key-value, SQL, and full-text search. Flexible JSON documents aligned to your applications and workloads. Build faster with blazing fast in-memory performance and automated replication and scaling while reducing cost. Capella has the best price performance of any fully managed document database. Visit couchbase.com/screaminginthecloud to try Capella today for free and be up and running in three minutes with no credit card required.
Couchbase Capella: Make your data sing.Hello and welcome to Last Week in AWS Security. A lot has happened; let's tear into it.So, there was a “Sort of yes, sort of no” security issue with CodeBuild that I've talked about previously. The blog post I referenced has, in fact, been updated. AWS has stated that, “We have updated the CodeBuild service to block all outbound network access for newly created CodeBuild projects which contain a customer-defined VPC configuration,” which indeed closes the gap. I love happy endings.On the other side, oof. Orca Security found a particularly nasty Azure breach called AutoWarp. You effectively could get credentials for other tenants by simply asking a high port on localhost for them via curl or netcat. This is bad enough; I'm dreading the AWS equivalent breach in another four months of them stonewalling a security researcher if the previous round of their nonsense silence about security patterns is any indicator.“Google Announces Intent to Acquire Mandiant”. This is a big deal. Mandiant has been a notable center of excellent cybersecurity talent for a long time. Congratulations or condolences to any Mandoogles in the audience. Please let me know how the transition goes for you.Hive Systems has updated its password table for 2022, which is just a graphic that shows how long passwords of various levels of length and complexity would take to break on modern systems. The takeaway here is to use long passwords and use a password manager.Corey: You know the drill: You're just barely falling asleep and you're jolted awake by an emergency page. That's right, it's your night on call, and this is the bad kind of Call of Duty. The good news is, is that you've got New Relic, so you can quickly run down the incident checklist and find the problem. You have an errors inbox that tells you that Lambdas are good, RUM is good, but something's up in APM. So, you click the error and find the deployment marker where it all began. 
Dig deeper, there's another set of errors. What is it? Of course, it's Kubernetes, starting after an update. You ask that team to roll back and bam, problem solved. That's the value of combining 16 different monitoring products into a single platform: You can pinpoint issues down to the line of code quickly. That's why the Dev and Ops teams at DoorDash, GitHub, Epic Games, and more than 14,000 other companies use New Relic. The next late-night call is just waiting to happen, so get New Relic before it starts. And you can get access to the whole New Relic platform at 100 gigabytes of data free, forever, with no credit card. Visit newrelic.com/morningbrief, that's newrelic.com/morningbrief.And of course, another week, another terrifying security concern. This one is called DirtyPipe. It's in the Linux kernel, and the name is evocative of something you'd expect to see demoed onstage at re:Invent.Now, what did AWS have to say? Two things. The first is “Manage AWS resources in your Slack channels with AWS Chatbot”. A helpful reminder that it's important to restrict access to your AWS production environment down to just the folks at your company who need access to it. Oh, and to whoever can access your Slack workspace who works over at Slack, apparently. We don't talk about that one very much, now do we?And the second was, “How to set up federated single-sign-on to AWS using Google Workspace”. This is super-aligned with what I want to do, but something about the way that it's described makes it sound mind-numbingly complicated. This isn't a problem that's specific to this post or even to AWS; it's industry-wide when it comes to SSO. I'm starting to think that maybe I'm the problem here.And lastly, AWS has open-sourced a tool called Cloudsaga, designed to simulate security events in AWS.
This may be better known as, “Testing out your security software,” and with sufficiently poor communication, “Giving your CISO a heart attack.”And that's what happened last week in AWS security. If you've enjoyed it, please tell your friends about this place. I'll talk to you next week.Corey: Thank you for listening to the AWS Morning Brief: Security Edition with the latest in AWS security that actually matters. Please follow AWS Morning Brief on Apple Podcast, Spotify, Overcast—or wherever the hell it is you find the dulcet tones of my voice—and be sure to sign up for the Last Week in AWS newsletter at lastweekinaws.com.Announcer: This has been a HumblePod production. Stay humble.
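The arithmetic behind password tables like the Hive Systems one mentioned above is straightforward exhaustive-search math: keyspace equals charset size raised to the password length, divided by a guess rate. The rate below is an assumption for illustration, not their actual hardware model:

```python
# Back-of-the-envelope brute-force time. The guess rate is hypothetical;
# real tables depend on the hash algorithm and attacker hardware.

GUESSES_PER_SECOND = 1e10  # assumed offline cracking rate

def crack_time_years(length: int, charset_size: int) -> float:
    keyspace = charset_size ** length          # every possible password
    seconds = keyspace / GUESSES_PER_SECOND
    return seconds / (60 * 60 * 24 * 365)

# 8 lowercase letters vs. 16 characters from a ~95-symbol charset
print(f"{crack_time_years(8, 26):.2e} years")
print(f"{crack_time_years(16, 95):.2e} years")
```

The exponent on length is why the takeaway is always "use long passwords": each extra character multiplies the keyspace by the full charset size.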

Serverless Craic from The Serverless Edge
Serverless Craic Ep16 Modern Cloud CEO


Play Episode Listen Later Mar 11, 2022 16:05


We're continuing our series on Modern Cloud looking at the Modern CEO. What does a CEO expect from Modern Cloud? They want capability from your engineering or IT organisation. They also want speed. And flexibility. I don't think they care about Lambdas or Kubernetes! It's a good way to really frustrate your CEO if you tell them all about Kubernetes and Lambdas. As a CEO, you're going to have the needs of your customers in your mind. You're going to have user groups' needs for your company to meet. You can look at Modern Cloud and think, how can I rapidly meet the needs of those customers or users? What capabilities does Modern Cloud provide me with? They are looking to Modern Cloud to quickly stand up capabilities to meet those needs. A big part of Modern Cloud is aimed at integrating with things that exist. Let's not build it. Let's not start way back with nothing and work our way up. It's the Buy, Rent, Build question! In my experience, CEOs are obsessed with the customer in the business domain that they're in. They're very focused on that industry. The healthy question to ask is: 'why are we building that?' Why are we building that thing if it doesn't have anything to do with our core business? There's a value chain that you can draw. We've seen lots of so-called SaaS offerings that don't pay as you go, don't scale up or down, and come with a lot of operational overhead and burden. As we evolve towards modern cloud and a serverless-first mindset, the SaaS offerings we're looking to rent need to be assessed to see if they are built on modern cloud principles and practices. Otherwise, you will tie yourself up into knots with things that will not scale. There's an appreciation that modern cloud can actually drive your business. It's not just a cost centre. If you have a modern cloud attitude in your company, engineering is actually part of the business. Engineers are not stuck in the IT department.
If everything is stuck in the IT department and it is treated like a black box, you might be doing modern cloud, but you're not getting the commercial benefits from bringing the tech potential to your business leaders. The next one is Speed. We've talked previously about 'Time To Value'. It's not how fast the developer can type in the code. It's the value stream from an idea to how quickly that makes it into the hands of the customers. That's not just IT. It's the whole org from front to back. And obviously, in the modern cloud, you can speed that up. You should be able to go from ideation, discovery and framing through to production and into the hands of a real customer and delivering value in days if not hours. There's a nirvana point where you're having discovery and framing sessions with the business and your end users, and you're actually showing them real prototypes in real production environments that have been toggled appropriately so that they're not exposed to the existing customer base. There is a flywheel effect here! But if the flywheel gets stuck and you're spending ages iterating, there's inertia and stoppage. When you start executing quickly, Product realises they can ask for things quickly. The flywheel starts to turn. The third item is flexibility. There's a couple of different ways to think about this: the ability to pivot a line of business or the ability to scale in different ways or in different global locations. If you leverage modern capabilities, you're not worrying about a lot of upfront investment. You're not outlaying capital expenditure. Your software features and capabilities are operational expenses. It's OpEx, not CapEx. You're not worried about cost making it hard for you to pivot and change. You're not betting your credibility on a $50 million data centre that you've just purchased, and you have to make it work. You're able to do 'safe to fail' experiments in a rapid fashion, like we talked about with Speed.
Your feedback loop is a lot tighter. You can pivot more efficiently and effectively to find that product/market fit. From a CEO's point of view, they want to have lots of options and they don't want to go through one-way doors; they want two-way doors. So if it doesn't work out, they can come back out and try something else. There are data implications as well. Organisations that embrace modern cloud are able to leverage data capabilities and expand into new products or ventures or experimentation. They're not fixated on yesterday's success. They've got their heads on a swivel, looking for that next opportunity. It's a radical target for orgs that embrace successful modern cloud. Serverless Craic from The Serverless Edge theserverlessedge.com @ServerlessEdge

AWS Morning Brief
Corporate Solidarity


Play Episode Listen Later Mar 3, 2022 5:20


Links: Charlie Bell in the Wall Street Journal The Register's Roundup Melijoe.com's award AWS Announcement Granted TranscriptCorey: This is the AWS Morning Brief: Security Edition. AWS is fond of saying security is job zero. That means it's nobody in particular's job, which means it falls to the rest of us. Just the news you need to know, none of the fluff.Corey: Couchbase Capella Database-as-a-Service is flexible, full-featured, and fully managed with built-in access via key-value, SQL, and full-text search. Flexible JSON documents aligned to your applications and workloads. Build faster with blazing fast in-memory performance and automated replication and scaling while reducing cost. Capella has the best price performance of any fully managed document database. Visit couchbase.com/screaminginthecloud to try Capella today for free and be up and running in three minutes with no credit card required. Couchbase Capella: Make your data sing.Corey: We begin with a yikes because suddenly the world is aflame and of course there are cybersecurity considerations to that. I'm going to have more on that to come in future weeks because my goal with this podcast is to have considered takes, not the rapid-response, alarmist, the-world-is-ending ones. There are lots of other places to find those. So, more to come on that.In happier news, your favorite Cloud Economist was quoted in the Wall Street Journal last week, talking about how staggering Microsoft's security surface really is. And credit where due, it's hard to imagine a better person for the role than Charlie Bell. He's going to either fix a number of systemic problems at Azure or else carve his resignation letter into Satya Nadella's door with an axe. I really have a hard time envisioning a third outcome.A relatively light week aside from that. The Register has a decent roundup of how various companies are responding to Russia's invasion of a sovereign country. 
Honestly, the solidarity among those companies is kind of breathtaking. I didn't have that on my bingo card for the year.Corey: You know the drill: You're just barely falling asleep and you're jolted awake by an emergency page. That's right, it's your night on call, and this is the bad kind of Call of Duty. The good news is, is that you've got New Relic, so you can quickly run down the incident checklist and find the problem. You have an errors inbox that tells you that Lambdas are good, RUM is good, but something's up in APM. So, you click the error and find the deployment marker where it all began. Dig deeper, there's another set of errors. What is it? Of course, it's Kubernetes, starting after an update. You ask that team to roll back and bam, problem solved. That's the value of combining 16 different monitoring products into a single platform: You can pinpoint issues down to the line of code quickly. That's why the Dev and Ops teams at DoorDash, GitHub, Epic Games, and more than 14,000 other companies use New Relic. The next late-night call is just waiting to happen, so get New Relic before it starts. And you can get access to the whole New Relic platform at 100 gigabytes of data free, forever, with no credit card. Visit newrelic.com/morningbrief, that's newrelic.com/morningbrief.Corey: If you expose 200GB of data, it's bad. If that data belongs to customers, it's worse. If a lot of those customers are themselves children, it's awful. But if you ignore reports about the issue, leave the bucket open, and only secure it after your government investigates you for ignoring it under the GDPR, you are this week's S3 Bucket Negligence Award winner and should probably be fired immediately.AWS had a single announcement of note last week. “Fine-tune and optimize AWS WAF Bot Control mitigation capability”, and it's super important because, with WAF and Bot Control, the failure mode in one direction of a service like this is that bots overwhelm your site.
The failure mode in the other direction is that you start blocking legitimate traffic. And the worst failure mode is that both of these happen at the same time.And a new tool I'm kicking the tires on, Granted. It's apparently another way of logging into a bunch of different AWS accounts, so it's time for me to kick the tires on that because I consistently have problems with that exact thing. And that's what happened last week in AWS security which, let's be clear, is not the most important area of the world to be focusing on right now. Thanks for listening; I'll talk to you next week.Corey: Thank you for listening to the AWS Morning Brief: Security Edition with the latest in AWS security that actually matters. Please follow AWS Morning Brief on Apple Podcast, Spotify, Overcast—or wherever the hell it is you find the dulcet tones of my voice—and be sure to sign up for the Last Week in AWS newsletter at lastweekinaws.com.Announcer: This has been a HumblePod production. Stay humble.

Lambda3 Podcast
Lambda3 Podcast 287 – Quem é o dono da Agilidade?

Lambda3 Podcast

Play Episode Listen Later Feb 18, 2022 70:52


In this podcast, Lambdas and invited guests debate this week's controversy in the Agile world. After all, who owns Agility? Join our Telegram group and share your comments with us: https://lb3.io/telegram Podcast feed: www.lambda3.com.br/feed/podcast Podcast feed with technical episodes only: www.lambda3.com.br/feed/podcast-tecnico Podcast feed with non-technical episodes only: www.lambda3.com.br/feed/podcast-nao-tecnico Lambda3 · #287 - Quem é o dono da Agilidade? (Who Owns Agility?) Agenda: What is this week's Agile controversy? Who has the right to trademark generic terms? What is community? Competitiveness and the incentives of an expanding market; Who owns Agility? Does going mainstream create ethical problems? Is anything fair game when competing to sell courses? Is today's Agility worthless? And yesterday's? Is Agility's past necessarily better?! Is competition beneficial? How much community gets built on individual networks; How to respond to the market? Who is in control? Links: Agile Coaching Ethics Initiative; Agile Alliance Brazil; Agile Alliance's position on certification; The History of Agility, from the Manifesto to the Hype; Instituto Nacional da Propriedade Industrial (Brazil's national industrial property office); VH's tweet; Agile Brazil; Agile Manifesto. Participants: Lara Rejane - @malugreen; Olívia Janequine - @oliviagj; Raphael Donaire Albino - @rapha_albino; Victor Hugo Germano - @victorhg. Editing: Compasso Coolab. Credits for the music used in this program: Music by Kevin MacLeod (incompetech.com) licensed under Creative Commons: By Attribution 3.0 - creativecommons.org/licenses/by/3.0

Giant Robots Smashing Into Other Giant Robots
405: RackN Digital Rebar with Rob Hirschfeld

Giant Robots Smashing Into Other Giant Robots

Play Episode Listen Later Dec 23, 2021 47:24


Chad talks to Rob Hirschfeld, the Founder and CEO of RackN, which develops software, called Digital Rebar, to help automate data centers. RackN is focused on helping customers automate infrastructure. They focus on customer autonomy and self-management, and that's why they're a software company, not a services or as-a-service platform company. Digital Rebar is a platform that helps connect all of the different pieces and tools that people use to manage infrastructure into infrastructure pipelines through the seamless multi-component automation across all of the different pieces and parts that have to be run to bring up infrastructure. RackN's Website (https://rackn.com/); Digital Rebar (https://rackn.com/rebar/) Follow Rob on Twitter (https://twitter.com/zehicle) or LinkedIn (https://www.linkedin.com/in/rhirschfeld/). Visit his website at robhirschfeld.com (https://robhirschfeld.com/). Follow RackN on Twitter (https://twitter.com/rackngo), LinkedIn (https://www.linkedin.com/company/rackn/), or YouTube (https://www.youtube.com/channel/UCr3bBtP-pMsDQ5c0IDjt_LQ). Follow thoughtbot on Twitter (https://twitter.com/thoughtbot), or LinkedIn (https://www.linkedin.com/company/150727/). Become a Sponsor (https://thoughtbot.com/sponsorship) of Giant Robots! Transcript: CHAD: This is the Giant Robots Smashing Into Other Giant Robots Podcast where we explore the design, development, and business of great products. I'm your host, Chad Pytel. And with me today is Rob Hirschfeld, Founder and CEO of RackN, which develops software, called Digital Rebar, to help automate data centers. Rob, welcome to the show. ROB: Chad, it is a pleasure to be here. Looking forward to the conversation. CHAD: Why don't we start with a little bit more information about what RackN and the Digital Rebar platform actually are? ROB: I would be happy to. RackN is focused on helping customers automate infrastructure. And for us, it's really important that the customers are doing the automation. 
We're very focused on customer autonomy and self-management. It's why we're a software company, not a services or as-a-service platform company. But fundamentally, what Digital Rebar does is it is the platform that helps connect all of the different pieces and tools that people use to manage infrastructure into infrastructure pipelines through the seamless multi-component automation across all of the different pieces and parts that have to be run to bring up infrastructure. And we were talking data centers do a lot of on-premises all the way from the bare metal up. But multi-cloud, you name it, we're doing infrastructure at that level. CHAD: So, how agnostic to the actual bare metal are you? ROB: We're very agnostic to the bare metal. The way we look at it is data centers are heterogeneous, diverse places. And that the thing that sometimes blocks companies from being innovative is when they decide, oh, we're going to use this one vendor for this one platform. And that keeps them actually from moving forward. So when we look at data centers, the heterogeneity and sometimes the complexity of that environment is a feature. It's not a bug from that perspective. And so it's always been important to us to be multi-vendor, to do things in a vendor-neutral way to accommodate the quirks and the differences between...and it's not just vendors; it's actually user choice. A lot of companies have a multi-vendor problem (I'm air quoting) that is actually a multi-team problem where teams have chosen to make different choices. Terraform has no conformance standard built into it. [laughs] And so you might have everybody in your company using Terraform and Ansible happily but all differently. And that's the problem that we walk into when we walk into a data center challenge. And you can't sweep that under the rug. So we embraced it. CHAD: What kind of companies are your primary customers? 
ROB: We're very wide-ranging, from the top banks use us and deploy us, telcos, service providers, very large scale service providers use us under the covers, media companies. It really runs the gamut because it's fundamentally for us just about infrastructure. And our largest customers are racing to be the first to deploy. And it's multi-site, but 20,000 machines that they're managing under our Digital Rebar management system. CHAD: It's easy, I think, depending on where you sit and your experiences. The cloud providers today can overshadow the idea that there are even people who still have their own data centers or rent a portion of a data center. In today's ecosystem, what are some of the factors that cause someone to do that who isn't an infrastructure provider themselves? ROB: You know the funny thing about these cloud stories (And we're talking just the day after Amazon had a day-long outage.) is that even the cloud providers don't have you give up operation. You're still responsible for the ops. And for our customers, it's not like they can all just use Lambdas and API gateways. At the end of the day, they're actually doing multi-site distributed operations. And they have these estates that are actually it's more about how do I control distributed infrastructure as much as it is about repatriating. Now, we do a lot to help people repatriate. And they do that because they want more control. Cost savings is a significant component with this. You get into the 1000s of machines, and that's a big deal. Even at hundreds of machines, you can save a lot of money compared to what you get in cloud. And I think people get confused with it being an or choice. It really is an and choice. Our best customers are incredibly savvy cloud users. They want that dynamic, resilient very API-driven environment. And they're looking to bring that throughout the organization. 
And so those are the ones that get excited when they see what we've done because we spend a lot of time doing infrastructure as code and API-driven infrastructure. That's really what they want. CHAD: Cool. So, how long have you been working on RackN? When did you found it? ROB: [laughs] Oh my goodness. So RackN is seven years old. Digital Rebar, we consider it to be at its fourth generation, but those numbers actually count back before that. They go back to 2009. The founding team was actually at Dell together in the OpenStack heyday and even before the OpenStack heyday. And we were trying to ship clouds from the Dell Factory. And what we found was that every customer had this bespoke data center we've already talked about. And we couldn't write automation that would work customer to customer to customer. And it was driving us nuts. We're a software team, not a hardware team inside of Dell. And the idea that if I fixed something in the delivery or in their data center, and couldn't go back to their data center because it was different than what the next customer needed and the next customer needed, we knew that we would never have a community. It's very much about this community and reuse pattern. There's an interesting story that I picked up from SREcon actually where they were talking about the early days of boilers. This is going back a few centuries ago. But when they first started putting boilers into homes and buildings, there was no pattern, there was no standard. And everybody would basically hire a plumber or a heating architect. Heating architect was a thing. But you'd build a boiler and every one was custom, and every one was different. And no surprise, they blew up a lot, and they caused fires. And buildings were incredibly unsafe because they were working on high-pressure systems with no pattern. And it took regulation and laws and standards. And now nobody even thinks about it. 
You just take standard parts, and you connect them together in standard ways. And that creates actually a much more innovative system. You wouldn't want every house to be wired uniquely either. And so when we look at the state of automation today, we see it as this pre-industrial pre-standardization process and that companies are actually harmed and harming themselves because they don't have standards, and patterns, and practices that they can just roll and know they work. And so that philosophy started way back in 2009 with the first generation which was called Crowbar. Some of your audience might even remember this from the OpenStack days. It was the first OpenStack installer built around Chef. And it had all sorts of challenges in it, but it showed us the way. And then we iterated up to where Digital Rebar is today. Really fully infrastructure as code, building infrastructure pipelines, and a lot of philosophical pieces we've learned along the way. CHAD: So you were at Dell working on this thing. How did you decide to leave Dell and start something new? ROB: Dell helped me with that decision. [laughs] So the challenge of being a software person inside of Dell especially at the time, Crowbar was open-source which did make it easier for us to say, "Hey, we want to part ways but keep the IP." And the funny thing is there's not a scrap of Crowbar in Digital Rebar except one or two naming conventions that we pulled forward and the nod of the name, that Rebar is a nod to Crowbar. But what happened was Dell when it went private, really did actually double down on the hardware and the more enterprise packaged things. They didn't want to invest in DevOps and that conversation that you need to have with your customers on how they operate, the infrastructure you sold them. And that made Dell not a very good place for me and the team. And so we left Dell, looked at the opportunity to take what we'd been building with Crowbar and then make it into a product. 
That's been a long journey. CHAD: Now, did you bootstrap, or did you take investment? ROB: We took [laughs] a little bit of investment. We raised some seed funding. Certainly not what was in hindsight was going to be sufficient for us to do it. But we thought at the time that we had something that was much more product-ready for customers than it was. CHAD: And what was the challenge that you found? What was the surprise there that it wasn't as ready as you thought? ROB: So what we've learned in our space specifically...and there are some things that I think apply to everybody, and there are some things that you might be able to throw on the floor and ignore. I was a big fan of Minimum Viable Product. And it turned out that the MVP strategy was not at all workable with customers in data centers. Our product is for people running production data centers. And nobody's going to put in software to run a data center that is MVP. It has to be resilient. It has to be robust. It has to be simple enough that they can understand it and solve some core problems, which is still sort of an MVP idea. But it can't be oops. [laughs] You can't have a lot of oops moments when you're dealing with enterprise infrastructure automation software. It has to work. And importantly, and as a design note, this has been a lesson for us. If it does break, it has to break in very transparent, obvious ways. And I can't emphasize that enough. There's so much that when we build it, we come back and like, was it obvious that it broke? Is it obvious that it broke in a way that you can fix? CHAD: And it's part of the culture too to do detailed post mortems with explanations and be as transparent as possible or at least find the root cause so that you can address it. That's part of the culture of the space too, right? ROB: You'd like to hope so. [laughs] CHAD: Okay. [laughs] In my experience, that's the culture of the space. ROB: You're looking more at a developer experience. 
But even with a developer, you've got to be in a post mortem or something. And it's like everybody's pointing to the person to the left and the right sort of by human nature. You don't walk into that room assuming that it was your fault, and you should, but that's not how it usually is approached. And what we find in the ops space, and I would tell people to work around this pattern if they can, is that if you're the thing doing the automation, you're always the first cause of the problem. So we run into situations where we're doing a configuration, and we find a vendor bug or a glitch or there's something, and we found it. It's our problem whether we were the cause or not. And that's super hard. I think that people on every side of any type of issue need to look through and figure out what the...the blameless post mortem is a really important piece in all this. At the end of the day, it's always a human system that made a mistake or has something like that. But it's not as simple as the thing that told you the bad news that the messenger was at fault. And there's a system design element to that. That's what we're talking about here is that when you're exposing errors or when something's not behaving the way you expect, our philosophy is to stop. And we've had some very contentious arguments with customers who were like, "Just retry until it fixes itself," or vendors who were like, "Yeah, if you retry that thing three times, [laughs] then it'll magically go away." And we're like, that's not good behavior. Fix the problem. It actually took us years to put a retry element into the tasks so that you can say, yeah, this does need three retries. Just go do it. We've resisted for a long time for this reason. CHAD: So you head out into the market. And did you get initial customers, or was there so much resistance to the product that you had that you struggled to get even first customers? ROB: We had first customers. We had a nice body of code. 
The problem is actually pretty well understood even by our customers. And so it wasn't hard for them to get a trial going. So we actually had a very profitable customer doing...it was in object storage, public object storage space. And they were installing us. They wanted to move us into all their data centers. But for it to work, we ended up having an engineer who basically did consulting and worked with them for the better part of six months and built a whole bunch of stuff, got it working. They could plug in servers, and everything would set itself up. And they could hit a button and reset all the servers, and they would talk to the switches. It was an amazing amount of automation. But, and this happens a lot, the person we'd been working with was an SRE. And when they went to turn it over to the admins in the ops team, they said, [laughs] "We can't operate. There's too much going on, too complex." And we'd actually recognized...and this is a really serious challenge. It's a challenge now that we're almost five years into the generation that came after that experience. And we recognized there was a problem. And that this wasn't going to create that repeatable experience that we were looking for if it took that much. At the same time, we had been building what is now Digital Rebar in this generation that was a single Golang binary. All the services were bundled into the system. So it listened on different ports but provide all the services, very easy to install, really, really simple. We literally stripped everything back to the basics and restarted. And we had this experience where we sat down with a customer who had...I'm going to take a second and tell the story because this is such a compelling story from a product experience. So we took our first product. We were in a bake-off with another bare metal focus provisioning at the time. And they were in a lab, and they set our stuff up. And they turned it on, and they provisioned. 
And they set up the competitor, and they turned it on and provisioned. And both products worked. Our product took 20 minutes to go through the cycle and the competitor took 3. And the customer came back and said, "I can't use this. I like your product better. It has more controls with all this stuff." But it took 20 minutes instead of 3. We actually logged into the system, looked at it and we were like, "Well, that's because it recognized that your BIOS was out of date, patched your BIOS, updated the system, checked that it was right, and then rebooted the systems and then continued on its way because it recognized your systems were outdated automatically. And he said, "I didn't want it to do that. I needed it to boot as quickly as possible." And literally, [laughs] we were in the middle of a team retreat. So it's like, the CTO is literally excusing himself on the table to talk to the guy to make this stuff, try and make it right. And he's like, "Well, we've got this new thing. Why don't you install this, what's now Digital Rebar, on the system and repeat the experiment?" And he did and Digital Rebar was even faster than the competitor. And it did exactly just install, booted, and was done. And he came back to the table, and it took 15 minutes to have the whole conversation to make everything work. It was that much of a simpler design. And he sat down and told the story. And I was in the middle of it. I'm just like, "We're going to have to pivot and put everything into the new version," which is what we did. And we just ripped out the complexity. And then over the last couple of years now, we've built the complexity back into the system to do all those additional but much more customer-driven from that perspective. CHAD: How did you make sure that as you were changing your focus, putting all of your energy into the new version that you [laughs] didn't introduce too much risk in that process or didn't take too long? 
ROB: [laughs] We did take too long and introduced too much risk, and we did run out of money. [laughs] All those things happened. This was a very difficult decision. We thought we could get it done much faster. The challenge of the simpler product was that it was too simple to be enough in customers' data centers. And so yeah, we almost went out of business in the middle of all this cycle. We had a time where RackN went back down to just the two founders. And at this point, we'd gotten far enough with the product that we knew it was the right thing. And we'd also embedded a degree...with the way we do the UX, we have this split. The UX runs on a hosted system. It doesn't have to but by default, it does. And then we have the back end. So we were very careful about how we collected metrics because you really need to know who's downloading and using your products. And we had enough data from that to realize that we had some very committed early users and early customers, just huge brand names that were playing around. So we knew that we'd gotten this mix right, that we were solving a problem in a unique way. But it was going to take time because big companies don't make decisions quickly. We have a joke. We call it the reorg half-life problem. So the half-life of a reorg in any of our customers is about nine months. And either you're successful inside of that reorg half-life, or you have to be resilient across this reorg half. And so initially, it was taking more than nine months. We had to be able to get the product in play. And once we did, we had some customers who came in with very big checks and let us come back and basically build back up. And we've been adding some really nice names into our customer roster. Unfortunately, it's all private. I can tell you their industries and their scale, but I can't name them. But that engagement helped drive us towards the feature set and the capabilities and building things up along that process. But it was frustrating. 
And some of them, especially at the time we were open-source, were very happy to say, "No, we are a super big brand name. We don't pay for software." I'm like, "Most profitable, highest valued companies in the world you don't want to pay for this operational software?" And they're like, "No, we don't have to." And that didn't sit very well with us. Very hard, as a starting startup, it was hard. CHAD: At the time, everything you were doing was open source. ROB: So in the Digital Rebar era, we were trying to do Open Core. Digital Rebar itself was open. And then we were trying to hold back the BIOS patches, integrate enterprise single sign-on. So there was a degree of integration pieces that we held back as RackN and then left the core open. So you could use Digital Rebar and run it, which we had actually had a lot of success with people downloading, installing, and running Digital Rebar, not as much success in getting them to pay us for that privilege. CHAD: So, how did you adjust to that reality? ROB: We inverted the license. After we landed a couple of big banks and we had several others and some hyperscalers too who were like, "This is really good software. We love it. We're embedding it in our service, but we're not going to pay you." And then they would show up with bugs and complaints and issues and all sorts of stuff still. And what happened is we started seeing them replicating the closed pieces. The APIs were open. We actually looked at it and listening to our communities, they wanted to see what was in the closed pieces. That was actually operationally important for them to understand how that stuff worked. They never contributed or asked to see anything in the core. And, there's an important and here, and they needed performance improvements in the core that were radically different. So the original open-source stuff went to maybe 500 machines, and then it started to cap out. 
And we were like, all right, we're going to have to really rewrite the data store mechanisms that go with this. And the team looked at each other and were like, "We're not going to open source that. That's really complex and challenging IP." And so we said the right model for us is going to be to make the core closed and then allow our community and users to see all the things that they are actually using to interact with their environment. And it ends up being a little bit of a filter. There are people who only use open-source software. But those companies also don't necessarily want to pay. When I was an open-source evangelist, this was always a problem. You're pounding on the table saying, "If you're using open-source software, you need to understand who to pay for that service, that software that you're getting. If you're not paying for it, that software is going to go away." In a lot of cases, we're a walking example of that. And it's funny, more of the codebase is open today than it was then. [chuckles] But the challenge is that it's really an open ecosystem now because none of that software is particularly useful without the core to run it and glue everything together. CHAD: Was that a difficult decision to make? Was it controversial? ROB: Incredibly difficult. It was something I spent a lot of time agonizing about. My CTO is much clear-eyed on this. From his perspective, he and the other engineers are blood, sweat, and tears putting this in. And it was very frustrating for them to see it running people's production data centers who told us, and this is I think the key, who just said to us, "You know, we're not going to pay money for that." And so for them, it was very clear-eyed it's their work, their sweat equity, very gut feeling for that. For me, I watched communities with open-source roots, you know, the Kubernetes community. I was in OpenStack. I was on the board for that. 
And there is definitely a lift that you get from having free software and not having the strings. And I also like the idea that from a support perspective, if you're using open-source software, you could conceivably not care for the vendor that went away. You could find another life for it. But years have gone by and that's not actually a truism that when you are using open-source software if you're getting it from a vendor, you're not necessarily protected from that vendor making decisions for you. CentOS is a great...the whole we're about to hit the CentOS deadlines, which is the Streams, and you can't get other versions. And we now have three versions of CentOS, at least three versions of CentOS with Rocky, and Alma, and CentOS Stream. Those are very challenging decisions for people running enterprise data centers, not that simple. And nobody in our communities is running charity data centers. There's no goodwill charity. I'm running a data center out of the goodness of my heart. [laughs] They are all production systems, enterprise. They're doing real production work. And that's a commercial engagement. It's not a feel-good thing. CHAD: So what did you do in your decision-making process? What pushed you, or what did you come to terms with in order to make that change? ROB: I had to admit I was wrong. [laughter] I had to think back on statements I'd made and the enthusiasm that I'd had and give up some really hard beliefs. Being a CEO or a founder is the same process. So I wish I could say this was the only time [laughs] I had to question, you know, hard-made assumptions, or some core beliefs in what I thought. I've had to get really good at questioning when am I projecting this is the way I want the world to be therefore it will be? That's a CEO skill set and a founder skill set...and when that projection is having you on thin ice. And so you constantly have to make that balance. And this was one of those ones where I'm like, all right, let's do it. 
And I still wake up some mornings and look at people who are open source only and see how much press they get or how easy it is for them to get mentions and things like that. And I'm like, ah, God, that'd be great. It feels like it's much harder for us because we're commercial to get the amplification. There are conferences that will amplify open-source Terraform, great example. It gets tons of amplification for being a single vendor project that's really tightly controlled by HashiCorp. But nobody is afraid to go talk about Terraform and mention Terraform and do all this stuff, the amazing use of open source by that company. But they could turn it and twist it, and they could change it. It's not a guarantee by any stretch of the imagination. CHAD: Well, one of the things that I've come to terms with, and maybe this is a very positive way of looking at it, instead of that you were wrong, [laughter] is to realize that well, you weren't necessarily wrong. It got you to where you were at that point. But maybe in order to go to the next level, you need to do something different. And that's how I come to terms with some things where I need to change my thinking. ROB: [laughs] I like that. It's good. Sometimes you can look back and be like, yeah, that wasn't the right thing and just own it. But yeah, it does help you to know the path. Part of the reason why I love talking about it with you like this is it's not just Rob was wrong; we're actually walking the path through that decision. And it's easy to imagine us sitting in...we're in a tiny, little shared office listening to calls where...I'll tell you this as a story to make it incredibly concrete because it's exactly how this happened. We were on a call. Everybody was in the room. And we were talking to a major bank saying, "We love your software." We're like, "Great, we're looking forward to working with you," all this stuff. 
And they're like, "Yeah, we need you to show us how you built this plugin because we want to write our own version of it." CHAD: [chuckles] ROB: We're like, "If you did that, you wouldn't need to buy our software." And they're like, "That's right. We're not going to buy your software." CHAD: Exactly. [laughs] ROB: And we're like, "Well, we won't show you how to use it. Then we won't show you how to do that." And they're like, "Well, okay. We'll figure it out ourselves." And so I'm the cheerful, sunny, positive, sort of managing the call, and I'm not just yelling at them. My CTO is sitting next to me literally tearing his hair. This was literally a tearing his hair out moment. And we hung up the call, and we went on a walk around the neighborhood. And he was just like, "What more do you need to hear for you to understand?" And so it's moments like that. But instead of being like, no, you're wrong, we got to do it this way, I was ready to say, "Okay, what do you think we can do? How do we think we can do it?" And then he left me with a big pile of PR messaging to explain what we're doing, conversations like this. Two years ago when we made this change, almost three, I felt like I was being handed a really hard challenge. As it turns out, it hasn't been as big a deal. The market has changed about how they perceive open source. And for enterprise customers, they're like, "All right, how do we deal with the licensing for this stuff?" And we're like, "You just buy it from us." And they're like, "That's it?" And I'm like, "Yes." And you guarantee every..." "Yes." They're like, "Oh. Well, that's pretty straightforward. I don't have to worry about..." We could go way down an open-source rabbit hole and the consulting pieces and who owns the IP, and I used to deal with all that stuff. Now it's very straightforward. [laughs] Like, "You want to buy and use the software to run your data center?" "Yes, I do." "Great." 
CHAD: Well, I think this is generally applicable even beyond your specific product but to products in general. It's like, when you're not talking to people who are good customers or who are even going to be your customers who are going to pay for what you want, you can spend a lot of time and energy trying to please them. But you're not going to be successful because they're not going to be your customers no matter what you do. ROB: And that ends up being a bit of a filter with the open-source pieces is that there are customers who were dyed in the wool open source. And this used to be more true actually as the markets moved a lot. We ended up just not talking to many. But they do, they want a lot. They definitely would ask for features or things and additions and help, things like that. And it's hard to say no. Especially as a startup founder, you want to say yes a lot. We try to not say yes to things that we don't...and this puts us at a disadvantage I feel like from a marketing perspective. If we don't do something, we tend to say we don't do it, or we could do it, but it would take whatever. I wish more people in the tech space were as disciplined about this does work, this doesn't work, this is a feature. This is something we're working on. It's not how tech marketing typically works sadly. That's why we focus on self-trials so people can use the product. Mid-roll Ad I wanted to tell you all about something I've been working on quietly for the past year or so, and that's AgencyU. AgencyU is a membership-based program where I work one-on-one with a small group of agency founders and leaders toward their business goals. We do one-on-one coaching sessions and also monthly group meetings. We start with goal setting, advice, and problem-solving based on my experiences over the last 18 years of running thoughtbot. As we progress as a group, we all get to know each other more. 
And many of the AgencyU members are now working on client projects together and even referring work to each other. Whether you're struggling to grow an agency, taking it to the next level and having growing pains, or a solo founder who just needs someone to talk to, in my 18 years of leading and growing thoughtbot, I've seen and learned from a lot of different situations, and I'd be happy to work with you. Learn more and sign up today at thoughtbot.com/agencyu. That's A-G-E-N-C-Y, the letter U. CHAD: So you have the core and then you have the ecosystem. And you also mentioned earlier that it is an actual software package that people are buying and installing in their data center. But then you have the UI which is in the cloud and what's in the data center is reporting up to that. ROB: Well, this is where I'm going to get very technical [laughs] so hang on for a second. We actually use a cross-domain approach. So the way this works...and our UX is written in React. And everything's...boy, there's like three or four things I have to say all at once. So forgive me as I circle. Everything we do at Digital Rebar is API-first, really API only, so the Golang service with an API, which is amazing. It's the right way to do software. So for our UX, it is a React application that can talk to that what we call an endpoint, that Digital Rebar endpoint. And so the UX is designed to talk directly to the Digital Rebar endpoint, and all of the information that it gets comes from that Digital Rebar endpoint. We do not have to relay it. Like, you have to be inside that network to get access to that endpoint. And the UX just talks to it. CHAD: Okay. And so the UX is just being served from your centralized servers, but you're just delivering the React for the JavaScript app. And that is talking to the local APIs. ROB: Right. And so we do use that browser as a bridge. And so when you want to download new content packs...so Digital Rebar is a platform. 
So you have to download content and automation and pieces into it. The browser is actually your bridge to do that. So the browser can connect to our catalog, pull down our catalog, and then send things into that browser. So it's super handy for that. But yeah, it's fundamentally...it's all behind your firewall software except...and this is where people get confused because you're downloading it from rackn.io. That download or the URL on the browser looks like it's a RackN URL even though all the traffic is network local. CHAD: Do your customers tend to stay up to date? Are they updating to the latest version right away all the time? ROB: [laughs] No, of course not. CHAD: I figured that was the answer. ROB: And we maintain patches on old versions and things like that. I wish they were a little faster. I'm not always sad that they're...I'm actually very glad when we do a release like we did yesterday...And in that release, I don't expect any of our production customers to go patch everything. So in a SaaS, you might actually have to deal with the fact that you've got...and we're back to our heterogeneity story. And this is why it's important that we don't do this. If we were to push that, if we didn't handle every situation for every customer exactly right, there would be chaos. And it would all come back to our team. The way we do it means that we don't have to deal with that. Customers are in control of when they upgrade and when they migrate, except in the UX case. CHAD: So how do you manage that if someone goes to the UI and their local thing is an old version? Are you detecting that and doing things differently? ROB: Yes, one of the decisions we made that I'm really happy with is we embedded feature flags into the API. When you log in, it will pull back. We know what the versions are. But versions are really problematic as a way to determine what's in software, not what's not in software. 
So instead, we get an array back that has feature flags as we add features into the core. And we've been doing this for years. And it's an amazingly productive process. And so what the UX does is as we add new things into the UX, it will look for those feature flags. And if the feature flag isn't there, it will show you a message that says, "This feature is not available for your endpoint," or show you the thing appropriate without that. And so the UX has gone through years of this process. And so there are literally just places where the UX changes behavior based on what you've installed on your system. And remember, our customers it's multi-site. So our customers do have multiple versions of Digital Rebar installed across there. So this behavior is really important also for them to be able to do it. And it goes back to LaunchDarkly. I was talking to Edith back in the early days of LaunchDarkly and feature flags, and I got really excited about that. And that's why we embedded it into the product. Everybody should do it. It's amazing. CHAD: One of the previous episodes a few ago was with actually the thoughtbot CTO, Joe Ferris. And we're on a project together where it's a different way of working but especially when you need it... so much of what I had done previously was versioned APIs. Maybe that works at a certain scale. But you get to a certain scale of software and way of working and wanting to do continuous deployment and continually update features and all that stuff. And it's a really good way of working when instead you are communicating on the level of feature availability. ROB: And from an ops person's perspective, and this was true with OpenStack, they were adding feature flags down at the metadata for the...it was incredible. They went deep into the versioned API hellscape. It's the only way I can describe it [laughs] because we don't do that. 
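The flag-gating pattern Rob describes can be sketched in a few lines. This is a hypothetical illustration, not RackN's actual API: the `features` array, the flag names, and the messages are all invented.

```python
# The endpoint returns an array of feature flags at login time; the client
# gates behavior on flags instead of comparing version numbers.

def supported_features(endpoint_info: dict) -> set:
    """Collect the feature-flag array the endpoint reported at login."""
    return set(endpoint_info.get("features", []))

def render_panel(endpoint_info: dict) -> str:
    """Show a panel only when the endpoint advertises the needed flag."""
    if "pooling" in supported_features(endpoint_info):
        return "showing pool management panel"
    # Graceful degradation for older endpoints without the flag.
    return "This feature is not available for your endpoint."

older = {"version": "4.2.0", "features": ["multi-site"]}
newer = {"version": "4.8.1", "features": ["multi-site", "pooling"]}
```

Because a missing flag simply means "not supported," the same client can talk to endpoints of many ages, which matches the multi-site, mixed-version deployments described above.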
But the thing that that does not help you with is a lot of times the changes that you're looking at from an API perspective are behavior changes, not API changes. Our API over years now has been additive. And as long as you're okay with new objects showing up, new fields showing up in an object, you could go back to four-year-old software, talk to our API, and it would still work just fine. So all your integrations are going to be good, but the behavior might change. And that's what people don't...they're like, oh, I can make my API version, and everything's good. But the behavior that you're putting behind the scenes might be different. You need a way to express that even more than the APIs in my opinion. CHAD: I do think you really see that when you...if you're just building a monolithic web app, it's harder to see. But once you separate your UI from your back end...and where I first hit this was with mobile applications. The problem becomes more obvious to you as a developer I think. ROB: Yes. CHAD: Because you have some people out there who are actually running different versions of your UI too. So your back end is the same for everybody but your UI is different. ROB: [laughs] CHAD: And so you need a back end that can respond to different clients. And a better way to do that rather than versioning your API is to have the clients tell you what they're capable of while they're making the requests and to respond differently. It's much more of a flexible way. ROB: We do track what UX. We have customers who don't want to use that. They don't even want us changing the UX...or actually normal enterprise. And so they will run...the nice thing about a React app is you can just run it. The Digital Rebar can host its UX, and that's perfectly reasonable. We have customers who do that. But every core adds more operational complexity. And then if they don't patch the UX, they can fall behind or not get features. 
So we see that it's...you're describing a real, you know, the more information you're exchanging between the clients and the servers, the better for you to track what's really going on. CHAD: And I think overall once you can get a little...in my experience, especially people who haven't worked that way, joining the team, it can take a little bit for them to get comfortable with that approach and the flexibility you need to be building into your system. But once people are comfortable with it and the team is comfortable, it really starts to hum. In my experience, a lot of what we've advocated for in terms of the way software should be built and deployed and that kind of thing is it actually makes it so that you can leave that even easier. And you can really be agile because you can roll things out in a very agile way. ROB: So are you thinking like an actual rolling deployment where the deployed software has multiple versions coming through? CHAD: Yep. And you can also have different users seeing different things at different times as well. You can say, "We're going to be doing continual deployment and have code continually deployed." But that doesn't mean that it's part of the release yet, that it's available to users to use. ROB: Yeah, that ability to split and feature flag is a huge deal. CHAD: Yeah. What I'm trying to figure out is does this apply to every project even the small like, this just changes the way you should build software? Or is there a time in a product to start introducing that thing? ROB: I am a big fan of doing it first and fast. There are decisions that we made early that have proven out really well. Feature flags is one of them. We started right away knowing that this would be an important thing for us to do. And same thing with tracking dependencies and being able to say, "I need..." actually, it's helpful because you can write automation that says, "I need this feature in the product." This flag and the product it's not just a version thing. 
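Chad's point about clients reporting their own capabilities can be made concrete with a small sketch. The capability strings here are assumptions for illustration, not any real protocol:

```python
# One backend serving several client generations: the client says what it
# can handle, and the server shapes the response accordingly.

def handle_request(payload: dict, client_caps: set) -> dict:
    resp = {"name": payload["name"]}
    if "rich-errors" in client_caps:
        # Newer clients can render structured error objects.
        resp["errors"] = []
    else:
        # Older clients only know how to display a plain message string.
        resp["error_message"] = ""
    return resp

legacy = handle_request({"name": "node1"}, set())           # old mobile build
modern = handle_request({"name": "node1"}, {"rich-errors"})  # current build
```

The server never needs a versioned URL; it just branches on what each caller declared.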
That makes the automation a little bit more portable, easier to maintain. The other thing we did that I really like is all of our objects have documentation embedded in them. So as I write a parameter or an ask or really anything in the system, everything has a documentation field. And so I can write the documentation for that component right there. And then we modified our build scripts so that they will pull in all of that documentation and create an aggregated view. And so the ability to do just-in-time documentation is very, very high. And so I'm a huge fan of that. Because then you have the burden of like, oh, I need to go back and write up a whole bunch of documentation really lessened when you can be like, okay, for this parameter, I can explain its behavior, or I can tell you what it does and know that it's going to show up as part of a documentation set that explains it. That's been something I've been a big fan of in what we build. And not everybody [laughs] is as much a fan. And you can see people writing stuff without particularly crisp documentation behind it. But at least we can go back and add that documentation or lessons learned or things like that. And it's been hugely helpful to have a place to do that. From a design perspective, one other thing I would say that we did that...and you can imagine the conversation. I have a UX usability focus. I'm out selling the product. So for me, it's how does it demo? How does it show? What's that first experience like? And so for me having icons and colors in the UX, in the experience is really important. Because there's a lot of semantic meaning that people get just looking down a list of icons and seeing that they are different colors and different shapes. But from the CTO's perspective, that's window dressing. Who cares? It doesn't have functional purpose. And we're both right. There's a lot of times when to me, both people can be right. So we added that as a metafield into all of our objects. 
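The embedded-documentation and metafield ideas just described might look something like this sketch. The object shapes are invented for illustration and are not Digital Rebar's real schema:

```python
# Every object carries its own documentation and presentation metadata;
# a build step aggregates the docs, and the UX branches on the meta hints.

OBJECTS = [
    {"name": "access-keys", "kind": "param",
     "documentation": "SSH keys injected during provisioning.",
     "meta": {"icon": "key", "color": "blue", "secure": False}},
    {"name": "admin-password", "kind": "param",
     "documentation": "Initial administrator password for the node.",
     "meta": {"icon": "lock", "color": "red", "secure": True}},
]

def aggregate_docs(objects) -> str:
    """Build one combined documentation page from per-object doc fields."""
    lines = []
    for obj in sorted(objects, key=lambda o: o["name"]):
        lines.append(f"## {obj['kind']}: {obj['name']}")
        lines.append(obj["documentation"])
    return "\n".join(lines)

def widget_for(obj: dict) -> str:
    """Pick a UX widget from presentation hints, not from the API schema."""
    if obj.get("meta", {}).get("secure"):
        return "password-field"   # mask sensitive values in the UX
    return "text-field"
```

The functional API definition and the rendering hints live side by side, so the UX can change its presentation without any API change.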
And so we have the functional part of the definition of the API. And then we have these metaobjects that you can add in or meta definitions that you can add in behind the scenes to drive icons and colors. But sometimes UX rendering hints and things like that that from an API perspective, you're like, I don't care, not really an API thing. But from a do I show...this is sensitive information. Do I turn it into a password field? Or should this have a clipboard so I can clipboard icon it, or should I render it in this type of viewer or a plain text viewer? And all that stuff we have a place for. CHAD: And so it's actually being delivered by the API that's saying that. ROB: Correct. CHAD: That's cool. ROB: It's been very helpful. You can imagine the type of stuff we have, and it's easy to influence UX behaviors without asking for UI change. CHAD: Now, are these GraphQL APIs? ROB: No. We looked at doing that. That's probably a whole nother...I might get our CTO on the line for that. CHAD: [laughs] It's a whole nother episode for that. ROB: But we could do that. But we made some decisions that it wasn't going to provide a lot of lift for us in navigation at the moment. It's funny, there's stuff that we think is a really cool idea, but we've learned not to jump on them without having really specific customer use cases or validations. CHAD: Well, like you said, you've got to say no. You've got to make decisions about what is important, and what isn't important now, and what you'll get to later, and that requires discipline. ROB: This may be a way to bring it full circle. If you go back to the stories of every customer having a unique data center, there's this heterogeneity and multi-vendor pieces that are really important. The unicycle we have to ride for this is we want our customers to have standard operating processes, standard infrastructure pipelines for this and use those and follow that process. 
Because we know if they do, then they'll keep improving as we improve the pipelines. And they're all unique. So there has to be a way in those infrastructure pipelines to do extensions that allow somebody to say, "I need to make this call here in the middle of this pipeline." And we have ways to do that address those needs. The challenge becomes providing enough opinionated like, this is how you should do things. And it's okay if you have to extend it or change it a little bit or tweak it without it just becoming an open-ended tool where people show up and they're like, "Oh, yeah, I get how to build something." And we have people do this, but they run out of gas in the long journey. They end up writing bespoke workflows. They write their own pipelines; they do their own integrations. And for them, it's very hard to support them. It's very hard to upgrade them. It's very hard for them to survive the reorg, your nine-month reorg windows. And so yeah, there's a balance between go do whatever you want, which you have to enable and do it our way because these processes are going to let your teams collaborate, let you reuse software. And we've actually over time been erring more and more on the side of you really need to do it the way we want you to do; reinforce the infrastructure as code processes. And this is the key, right? I mean, you're coming from a development mindset. You want your tooling to reinforce good behavior, CICD, infrastructure as code, all these things. You need those to be easier to do [laughs] than writing it yourself. And over time, we've been progressing more and more towards the let's make it easier to do it within the opinionated way that we have and less easy to do it within the Wild West pattern. CHAD: Cool. Well, I think with that, we'll start to wrap up. So if people want to find out more, where are some places that they could do that or get in touch with you? ROB: The simplest thing is of course rackn.com is the website. 
We encourage people to just, if this is interesting, download and try the software. If they have a cloud account, it's super easy to play with it, all things RackN through that. I am very active on Twitter under the handle @zehicle Z-E-H-I-C-L-E. And I'm happy to have conversations around these topics and data center and operations and even the future of cloud and edge computing. So please look me up. I'm excited to have conversations like that. CHAD: Awesome. And you can subscribe to the show and find notes and transcripts for this episode at giantrobots.fm. If you have questions or comments, email us at hosts@giantrobots.fm. And you can find me on Twitter @cpytel. This podcast is brought to you by thoughtbot and produced and edited by Mandy Moore. Thanks for listening and see you next time. Announcer: This podcast was brought to you by thoughtbot. thoughtbot is your expert design and development partner. Let's make your product and team a success.

How to Program with Java Podcast
EP51 - Let's Talk Lambdas in Java

How to Program with Java Podcast

Play Episode Listen Later Nov 12, 2021 57:32


In this episode we'll talk about a super useful feature that was introduced back in Java version 8, known as Lambdas. The Lambda feature is something you didn't know you desperately wanted or needed until you understood it. The Lambda syntax allows you to write much cleaner and more readable code, while also empowering you to get more done with less code. In this lecture, I'll be referring to some code that you can download via this github repository. Interested in starting your coding career? I'm now accepting students into an immersive programming Bootcamp where I guarantee you a job offer upon graduation. It is a 6 month, part-time, online Bootcamp that teaches you everything you need to know to get a job as a Java developer in the real-world. You can learn more via https://www.coderscampus.com/bootcamp

newline
Serverless on AWS Lambda with Stephanie Prime

newline

Play Episode Listen Later Jun 17, 2020 60:46


newline Podcast
Sudo Steph

Nate: [00:00:00] Steph, just tell us a little bit about your work and kind of your background with, like, AWS and like what you're doing now.

Steph: [00:00:06] Yes, so I work as an engineer for a managed services provider called Second Watch. We basically partner with other big companies that use AWS or some other clouds, sometimes Azure, for managing their cloud infrastructure, which basically just means that we help big companies whose focus may not be technology, may not be cloud stuff in general, and we're able to just basically optimize the cost of everything, make sure that things are running reliably and smoothly, and we're able to work with AWS directly to kind of keep people ahead of the curve when new stuff is coming out, and it just changes so much, you know, it's important to be able to adapt. So like personally, my role is I develop automation for our internal operations teams. So we have a bunch of, you know, just really smart people who are always working on customer-specific AWS issues. And we see some of the same issues pop up over and over. Of course, you know, security, auditing, cost optimization. And so my team makes optimizations that we can distribute to all of these clients who have to maintain their own, you know, they have their own AWS account. It's theirs. And we make it so that we're actually able to distribute these automations the same way in all of our customers' accounts. So the idea is that, and it really wouldn't be doable without serverless, because the idea is that everyone has to own their own infrastructure, right? Your AWS account is yours, as are your resources; you don't, for security reasons, want to put all of your stuff on somebody else's account. But actually managing them all the same way can be really difficult, even with scripts, because permissions in different places have to be granted through AWS identity and access management, IAM, right?
So serverless gave us the real tool that we needed to be able to, at scale, say, "Hey, we came up with a little script that will run on an hourly basis to check to see how much usage these servers are getting, and if they're not production servers, you know, spin them down if they're not in use to save money." Little things like that when it comes to operations, and AWS Lambda is just so good for it because it's all about, you know, like I said, doing things reliably, doing things in ways that can be audited and logged, and doing things for like a decent price. So like, background wise, I used to work at AWS in AWS support actually, and I kind of supported some of their dev ops products like OpsWorks, which is based on Chef for configuration management, Elastic Beanstalk, and AWS CloudFormation, specifically. After working there for a bit, I really got to see, you know, how it's made and what the underlying system is like. And it was just crazy to see how much work goes into all this, just so you can have a supposedly easier-to-use front end. But serverless just kinda changed all that for the better, luckily.

Amelia: [00:02:57] So it sounds like AWS has a ton of different services. What are the main ones and how many are there?

Steph: [00:03:04] So I don't think I can even count anymore because they just, they do release new ones all the time. So hundreds at this point, but really main ones, and maybe not hundreds, maybe a little over a hundred would be a better estimate. I mean, EC2, which is Elastic Compute, is the bread and butter. Historically, AWS is just, they're virtualized servers basically. So EC2, the thing that made AWS really special from the beginning and that made cloud start to take over the world, was the concept of auto scaling groups, which are basically definitions you attach to EC2, and it basically allows you to say, hey, if I start getting a lot of traffic on this one type of server, right?
You know, create a second server that looks exactly the same and load balance the traffic through it. So when they say scaling, that's basically how you scale EC2: you use auto scaling groups and elastic load balancers and kind of distribute the traffic out. The other big thing besides the scalability of EC2 with auto scaling groups is redundancy. So there's this idea of regions within AWS, and within each region there's availability zones. So regions are the general, like you can think of it as the place where the data center is kind of like located, within like a small degree. So it's usually like, Virginia is one, right? That's us-east-1. It's the oldest one. Another one is in California, but they're all over the world now. So the idea is you pick a region to minimize latency, so you pick the region that's closest to you. And then within the region, there's the idea of availability zones, which are basically just discrete, like, physical locations of the servers that you administer the same way, but they're protected. So like if a tornado runs through and hits one of your data centers, if you happen to have them distributed between two different availability zones, then you'll still be able to, you know, serve traffic. The other one will go down, but then the elastic load balancer will just notice that it's not responding and send the traffic to the other availability zone. So those are the main concepts that make up EC2; those are what you need to use it effectively.

Nate: [00:05:12] So with an EC2 instance, that would be like a virtual server. I mean, it's not quite a Docker container, I guess we're getting to nuance there, but it's basically like a server that you would have like command line access to. You could log in and you can do more or less whatever you want on an EC2 instance.

Steph: [00:05:29] Right, exactly. And so, it used to be AWS used what was called Xen virtualization to do it.
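The multi-AZ failover behavior Steph describes can be illustrated with a toy router. This is not AWS code, just a sketch of the idea that a load balancer only sends traffic to zones that pass health checks:

```python
# Round-robin routing across healthy availability zones, with failover
# when a zone stops responding.

def route(healthy_by_az: dict, request_id: int) -> str:
    healthy = sorted(az for az, ok in healthy_by_az.items() if ok)
    if not healthy:
        raise RuntimeError("no healthy availability zones")
    return healthy[request_id % len(healthy)]

zones = {"us-east-1a": True, "us-east-1b": True}
# Traffic spreads across both zones while both are healthy.
assert {route(zones, i) for i in range(4)} == {"us-east-1a", "us-east-1b"}

zones["us-east-1a"] = False  # the tornado scenario: one AZ goes down
assert route(zones, 7) == "us-east-1b"  # all traffic fails over
```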
And that's just like you can run Xen on your own computer; you can get a computer and set up a virtual machine, almost just like they used to do it. So they are constantly putting out like new ways of virtualizing more efficiently. So they do have new technology now, but it's not something that was really, I mean, it was well known, but they really took it to a new kind of scale, which made it really impressive.

Nate: [00:05:56] Okay, so EC2 lets you have full access to the box that's running, and you might like load balance requests against that. How does that contrast with what you do with AWS Lambda and serverless?

Steph: [00:06:09] So with EC2, you still have to, you know, either secure shell in or, you know, for Windows use RDP or something to actually get in there. You care about what ports are open; you have security groups for that. You care about all the stuff you would care about normally with a server. You care about, is it patched and up to date? You care about, you know, what's the current memory and CPU usage? All those things don't go away on EC2 just because it's cloud, right? When we start bringing serverless into the mix, suddenly they do go away. I mean, there's still a few limitations. Like for instance, a Lambda has a limit on how much memory it can process with, just because they have to have a way to kind of keep costs down and define the units of them and define where to put them, right? But at its core, what a Lambda is, it actually runs on a Docker container. You can think of it like a pre-configured Docker container with some pre-installed dependencies. So for Python, it would have the latest version of Python that it says it has, it would have boto, it would have the stuff that it needs to execute that, and it would also have some basic, it's structured like it was, you know, basic Linux. So there's like a /tmp, slash temp, you can write files there, right. But really it's like a Docker container.
That runs underneath it on a fleet of EC2. As far as availability zone distribution goes, that's already built into Lambda, so you don't have to think about it. With EC2 you do have to think about it, because if you only run one EC2 server and it's in one availability zone, it's not really different from just having a physical server somewhere with a traditional provider.

Nate: [00:07:38] So there are these two terms, there's like serverless and Lambda. Can you talk a little bit about like the difference between those two terms and when to use each appropriately?

Steph: [00:07:48] So they are in a way sort of interchangeable, right? Because serverless technology just means the general idea of: I have an application, I have it defined in an artifact, we'll say a zip from our git repo, right? So that application is my main artifact, and then I pass it to a service somewhere. It could be, I don't know, Google App Engine; that's a type of serverless technology, and AWS Lambda is just the specific AWS serverless technology. But the reason AWS Lambda is, in my opinion, so powerful is because it integrates really well with the other features of AWS. So permissions management works with AWS Lambda, API Gateway; there's a lot of really tight integrations you can make with Lambda so that it's not like you have to keep half of your stuff in one place and half of your stuff somewhere else. I remember when, like, Heroku was really big. A lot of people, you know, maybe they were maintaining an AWS account and they were also maintaining a bunch of stuff in Heroku, and they're just trying to make it work together. And even though Heroku does use, you know, AWS on the backend, or at least it did back then, it can just make things more complicated. But the whole serverless idea of the artifact is you make your code general; it's like a little microservice in a way.
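Putting the earlier pieces together, the hourly "spin down idle non-production servers" job Steph mentioned might be sketched like this. The tag names and CPU threshold are assumptions for illustration, and the boto3 calls only run when wired up to real AWS credentials:

```python
# Selection logic is kept pure so it can be tested without AWS access;
# the Lambda handler applies it with boto3 (preinstalled in the Python runtime).

def select_stoppable(instances, avg_cpu, threshold=5.0):
    """Pick running, non-production instances with low average CPU."""
    stoppable = []
    for inst in instances:
        tags = {t["Key"]: t["Value"] for t in inst.get("Tags", [])}
        if inst["State"]["Name"] != "running":
            continue
        if tags.get("Environment") == "production":
            continue
        # Instances with no metric data default to 100% and are left alone.
        if avg_cpu.get(inst["InstanceId"], 100.0) < threshold:
            stoppable.append(inst["InstanceId"])
    return stoppable

def lambda_handler(event, context):
    import boto3  # available by default in the Lambda Python runtime
    ec2 = boto3.client("ec2")
    reservations = ec2.describe_instances()["Reservations"]
    instances = [i for r in reservations for i in r["Instances"]]
    # avg_cpu would come from CloudWatch metrics; omitted in this sketch.
    ids = select_stoppable(instances, avg_cpu={})
    if ids:
        ec2.stop_instances(InstanceIds=ids)
    return {"stopped": ids}
```

Scheduling this hourly is then just an EventBridge (CloudWatch Events) rule pointed at the function, which is what makes it cheap: you pay only for the seconds it actually runs.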
So I can take my serverless application and ideally, you know, it's just Python. If I write it the right way, getting it to work on a different serverless back end, like, I think Azure has one, and Google App Engine, isn't really too much of a change. There's some changes to permissions and the way that you invoke it, but at the core of it, the real resource is just the application itself. It's not, you know, how many units of compute does it have, how much memory, what are the IP address rules, and all that, you know.

Nate: [00:09:35] So what are some good apps to build on serverless?

Steph: [00:09:39] Yes. So you can build almost anything today on serverless; there's actually so much support, especially with AWS Lambda, for integrations with all these other kinds of services that the stuff you can't do is getting more limited. But there is a trade-off with cost, right? Because, to me, the situation where it shines, where I would for no reason ever choose anything but serverless, is if you have something that's kind of bursty. So let's say you're making like a report generation tool that needs to run, but really you only run it three times a week or something. Like, things that need to live somewhere, they need to be consistent, they need to be stable, they need to be available, but you don't know how often they're going to be called. And even if it ends up being a small number of times it's being called, because the cool thing about serverless is you're charged per every 100 milliseconds of time that it's being processed. When it comes to EC2, you're charged in units that are, it used to be by the hour; I think they finally fixed it, and it's down to smaller increments. But if you can write it efficiently, you can save a ton of money just by doing it this way, depending on what the use case is. So some stuff, like if you're using API gateway with Lambda, that actually can.
So some stuff, like if you're using API gateway with Lambda, that actually can.Be a lot more expensive than Lambda will be. But you don't have to worry about, especially if you need redundancy. Cause otherwise you have to run a minimum of two  two servers just to keep them both up for a AZ kind of outages situation. You don't have to worry about that with Lambda. So anything that with lower usage 100%.If it's bursty 100% use Lambda, if it's one of those things where you just don't have many dependencies on it, then Lambda is a really good choice as well. So there's especially infrastructure management, which is, if you look, I think Warner Vogels, he wrote something recently about how serverless driven infrastructure automation is kind of going to be the really key point to making places that are using cloud use cloud more effectively.And so that's one group of people. That's a big group of people. If you're a big company and you already use the AWS and you're not getting everything out of it that you thought you would get. Sometimes there's serverless use cases that already exist out there and like there's a serverless application repo that AWS provides and AWS config integrations, so that if you can trigger a serverless action based off of some other resource actions. So like, let's say that your auto scaling group  scaled up and you wanted to like notify somebody, there's so many things you could do with it. It's super useful for that. But even if you're just, I'm co you're coming at it from like a blank slate and you want to create something .There are a lot of really good use cases for serverless. If you are, like I said, you're not really sure about how it's going to scale. You don't want to deal with redundancy and it fits into like a fairly well-defined, you know, this is pretty much all Python and it works with minimal dependencies. 
Then it's a really good starting place for that.

Nate: [00:12:29] You know, you mentioned earlier that serverless is very good for when you have bursty services, in that if you were to do it based on EC2 and also get that redundancy, you're going to have to run at least two EC2 instances, 24 hours a day, and pay for those. Plus you're also going to pay for API Gateway. Do you pay hourly for API Gateway?

Steph: [00:12:53] API Gateway? It would work the same either way, but in that case you would pay for something like a load balancer.

Nate: [00:12:59] What is API Gateway? Do you use that for serverless?

Steph: [00:13:02] All the time.

Nate: [00:13:04] Yeah. Tell us the elements of a typical serverless stack. So I understand there's Lambda, for example, and maybe you use CloudFront in front of your Lambda functions, which maybe store images in S3. Is that a typical stack? And can you explain what each of those services are and how you would use them?

Steph: [00:13:22] Yeah, so you're on the right track here. So, okay, a good way to think about it is: AWS has published something, with a lot of documentation on it, called the Serverless Application Management standard, SAM. Basically, if you look at that, it actually defines the core units of serverless applications, which are the function, an API if you want one, and basically any other permission-type resources. So in your case, let's say it was something simple. A really basic tutorial that AWS provides is: someone wants to upload an image for their profile, and you want to scale it down to a smaller image before you store it in your S3, just so they're all the same size and it saves you a ton, all that. If you're creating something like that, the AWS resources that you would need are basically an API Gateway.
It acts as basically the definition of your API schema. So, like, if you've ever used Swagger or OpenAPI, these standards where you basically just define, in JSON, you know, it's a REST API, you do GET here, POST here, this resource name. That's a standard that you might see outside of AWS a lot. And so API Gateway is just basically a way to implement that same standard so it works with AWS. That's how you can think of API Gateway. It also manages stuff like authentication integration. So if you want to enable OAuth or something, you can set that up at the API Gateway level.

Nate: [00:14:55] So if you had API Gateway set up, is that basically a web server hosted by Amazon?

Steph: [00:15:03] Yeah, that's basically it.

Nate: [00:15:05] And so then your API Gateway is just assigned an essentially random DNS name by Amazon. If you wanted to have a custom DNS name for your API Gateway, how do you do that?

Steph: [00:15:21] Oh, it's just a setting. It's pretty simple. So what you could do, if you already have a domain name, right? Route 53, which is AWS's domain name management service, you can use that to basically point that domain to the API Gateway.

Nate: [00:15:35] So you'd use Route 53: you configure your DNS to have Route 53 point a specific DNS name to your API Gateway, and your API Gateway would be like a web server that also handles things like authentication and AWS integration. Okay, got it.

Steph: [00:15:51] Yeah, that's a good breakdown of how that works. So that's your first kind of half of how people commonly trigger Lambdas. That's not the only way to trigger them, but it's a very common way to do it. What happens is, when the API Gateway is configured, part of it is you set what happens when the method is invoked. So there's the REST API, which is a type of API Gateway that people use a lot. There are a few others, like a WebSocket one, which is pretty cool for streaming uses, and they're always adding new options to it.
So it's a really neat service. You would have that kind of input go into your API Gateway, which would decide where to route it. In a lot of cases here, you might say that the Lambda function is where it gets routed to; that's considered the integration. And so basically API Gateway passes it all of the stuff from the request that you want to pass it. So, you know, I want to give it the content that was uploaded, I want to give it the IP address it originally came from, whatever you want to give it.

Nate: [00:16:47] What backend would people use for API Gateway other than Lambda? Like, would you use an API Gateway in front of an EC2 instance?

Steph: [00:16:56] You could, but there I would probably use a load balancer or application load balancer, that kind of thing. There are a lot of things you can integrate it with. Another cool one is AWS API calls. It can proxy, so it can just directly take input from an API and send it to a specific API call if you want to do that. That's kind of advanced usage, but Lambdas are kind of what I see as the go-to right now.

Nate: [00:17:20] So the basic stack that we're looking at is: you use API Gateway to be in front of your Lambda function, and then your Lambda function just basically does the work, which is either writing to some sort of storage or calculating some sort of response. You mentioned earlier that the Lambda function can be fronted by an API if you want one, and that there are other ways you can trigger these Lambda functions. Can you talk a little bit about what some of those other ways are?

Steph: [00:17:48] Yeah, so actually those are really cool. You can trigger it based off of, basically, any type of CloudWatch event is a big one. CloudWatch is basically a monitoring-slash-auditing kind of service that AWS provides, so you can set triggers that go off when alarms are set.
It could be usage, it could be, hey, somebody logged in from an IP address that we don't recognize. You can do some really cool stuff with CloudWatch events specifically, and those are ones that I think are really powerful to leverage for management purposes. But you can also do it off of S3 events, which are really cool. Let's say it was a CI build, right? You're doing CI builds and you're putting your artifacts into an S3 bucket, so you know this is release version 1.1 or whatever, right? You can hook it up so that whenever something gets put into that S3 bucket, another action takes place. So you can make it so that whenever we upload a release, it notifies these people, sends an email, or, as complicated as you want, you can make it trigger a different part of your build stage. If you have things that are outside of AWS, you can have it trigger from there. There are a lot of really cool, just direct kinds of things that you don't need an API for. S3 is a good one. The notification service, SNS, which is used within AWS a lot, can be used. The queuing service AWS provides, SQS, works with it too. And also just scheduled events, which I really like, because it replaces the need for a cron server. So if you have things that run, you know, every Tuesday or whatever, you can just trigger your Lambda to do that from one configuration point. You don't have to deal with anything more complicated than that.

Nate: [00:19:38] I feel like that gives me a pretty good grounding in the ecosystem. Maybe we could talk a little bit more about tools and tooling.
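Before moving on to tooling, the S3-upload trigger Steph just described can be sketched as a small handler. The record shape follows the documented S3 notification event format; the bucket and key names are hypothetical examples, and the actual "notify these people" step is left as a comment since that choice (SNS, email, build step) is up to you.

```python
import urllib.parse

def handler(event, context):
    """Sketch of an S3-triggered Lambda: figure out which object landed
    in which bucket. Records follow the S3 notification event format."""
    processed = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded (spaces become '+', etc.)
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
        # Here you might publish to SNS, send an email, or kick off
        # the next build stage.
        processed.append(f"s3://{bucket}/{key}")
    return {"processed": processed}

# A trimmed-down sample event, like what S3 would deliver on upload.
sample_event = {
    "Records": [
        {"s3": {"bucket": {"name": "releases"},
                "object": {"key": "builds/release-1.1.zip"}}}
    ]
}
print(handler(sample_event, None))
```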
Yeah, so I know that in the JavaScript world, the Node world, they have the Serverless framework, which is basically this abstraction over, I think, Lambda and, you know, Azure Functions, and Google Cloud probably too. Do they have a serverless framework for Python, or is there a framework that you end up using? Or do you generally just write straight on top of Lambda?

Steph: [00:20:06] So that's a great question. Even though there is a front end you could use to just start typing code in and making the Lambda work, it's definitely better to have some sort of framework that integrates with wherever you store your code and test it, that kind of thing. So Serverless is a really big one, and it's kind of annoying, because serverless, you know, also refers to the greater ecosystem of code that runs without managing underlying servers. But in this particular case, Serverless is also a third-party company and tooling, and it does work for Python; it works for a whole bunch of languages. That's kind of the serverless equivalent, in my head, of Terraform: something that is meant to be generic, but it offers a lot of value to people just getting started. If you just want to put something in your README that says, here's how to deploy this from GitHub, Serverless is a cool tool for that. I don't personally like building with it myself, just because I find that SAM, which is the Serverless Application Model (I think I said "management" earlier, but it's actually "model," I just looked that up), has everything I really want for AWS, and I get more fine-grained control. I don't like having too much abstraction, and I also don't like
when you have to install something and something changes between versions, and that changes the way your infrastructure gets deployed. That's a pet peeve of mine, and it's why I don't really use Terraform much, for the same reason. When you're operating really in one world, which in my case is AWS, I just don't get the value out of that. But with the Serverless Application Model, they have a whole SAM CLI, they have a bunch of tools coming out, and so many examples on their GitHub repos as well. I find that it's got really everything I want to use, plus CloudFormation plugs right into it, so if I need to do anything outside of the normal serverless kind of world, I can do that. Still, it's better to use Serverless than to not use anything at all; I think it's a good tool and a really good way to get used to things and get started. But in my case, where it really matters to have super-consistent deployments, where I'm sharing between people and accounts and all of that, I find that SAM really gives me the best of both worlds.

Amelia: [00:22:17] So, as far as I understand it, serverless is a fairly new concept.

Steph: [00:22:22] You know, it's one of those things that's catching on recently. I feel like Google App Engine came out a long time ago, and it was kind of a niche thing for a while, but recently we're starting to see bigger enterprises, people who might not necessarily want bleeding-edge stuff, start to accept that serverless is going to be the future. And that's why we're seeing all this stuff come up, and it's actually really exciting. The good thing is it's been around long enough that a lot of the actual tooling and the architecture patterns that you will use are mature; they've been used for years. There are sites you've been using for a long time that you don't know are serverless on the back end, but they are, because it's one of those things that doesn't really affect you unless you're kind of working on it.
Right. But it's new to a lot of people, and I think it's in a good spot where it's more approachable than it used to be.

Nate: [00:23:10] You say there are a lot of standard patterns; maybe we could talk about some of those. So you write a Lambda function in code, either with Python or JavaScript or whatever, but let's say Python, because you use Python primarily, right? Maybe we could talk a little bit about that. Why do you prefer Python?

Steph: [00:23:26] Yeah, so it just comes from my background, which, like I said, is that I did some support and some straight DevOps, kind of a more sysadmin-y background from before the world became a more interesting place. Python is just one of those tools that is installed on every Linux server in the world and works kind of predictably. Enough people know it that it's not too hard to share between people who may not be super advanced developers, right? A lot of people I work with have varying levels of skills, and Python's one of those languages you can start to pick up pretty quickly. And it's not too foreign to people coming from other languages as well. So it's a practicality thing for a lot of it. But also, a lot of the tooling around DevOps-type stuff is in Python, like Ansible for configuration management, a super useful tool; you know, it's all Python. So really there are a lot of good reasons to use Python. In my world, you don't have to use one specific language, but Python just has what I need, and I can work with it pretty quickly. The ecosystem is developed, a lot of people use it, and it's a good tool for what I have to do.

Nate: [00:24:35] Yeah, there's tons, and I was looking at the metrics; I think Python was one of the fastest growing programming languages last year, too. There's a lot of new people coming into Python,

Steph: [00:24:44] and a lot of it is data science people too, right? People who may not necessarily have a strong programming background, but the tooling they need is already there in Python. There's community, and it helps that it's not as scary-looking as some other languages, frankly.

Nate: [00:24:58] And what are some of the other cloud libraries that Python has? I've seen one that's called Boto.

Steph: [00:25:03] Boto is the one that Amazon provides as their SDK, basically, and every Lambda comes bundled with Boto 3 by default. There was an older version of Boto for Python, too, but Boto 3 is the main one everyone uses now. So yeah, Boto is great. I use it extensively. It's pretty easy to use: a lot of documentation, a lot of examples, a lot of people answering questions about it on StackOverflow. But really, every language has an SDK for AWS these days, and they all pretty much work the same way, because they're all just based off of the AWS APIs, and all the APIs are well-defined and pretty stable. So it's not too much of a stretch to use any other language, but Boto's the big one. The requests library in Python is super useful too, just because it makes it easier to deal with interacting with APIs, or making requests to APIs; it's all about HTTP requests and all that. Some of the new Python 3 libraries are really good as well, because they improve on what was there. It used to be, with Python 2, there was urllib for processing requests, and it was just not as easy to use, so people would always bundle a third-party tool like requests, but it's getting better. Also, Python has some different options for testing, pyunit and unittest, and really there's just a bunch of libraries that are well maintained by the community. There's a kazillion on PyPI, but I try to keep outside dependencies in Python to a total minimum, because, again, I just don't like when things change how they function from underneath me. It's one of those situations where I can do a lot without installing third-party libraries, so wherever I can avoid it, I do.

Nate: [00:26:47] So let's talk a little bit about these patterns that you have. Lambda functions generally have a pretty well-defined structure, and that convention makes it somewhat straightforward to write each function. Can you talk a little bit about, I don't know, the anatomy of a Lambda function?

Steph: [00:27:05] Yeah, so at its basic core, the most important thing that every Lambda function in the world is going to have is something called a handler. The handler is basically the function that is called to begin execution. Any time you call a Lambda function, it's called invoking it, and the invocation sends it a parameter called an event. An event is basically just the data that defines, hey, this is stuff you need to act on. It also sends it something called context, and a lot of the time you never touch the context object. But it's useful too, because AWS provides it with every Lambda, and it's basically like, hey, this is the ID of the currently running Lambda function, this is where you're running, this is the Lambda's name. So for logging stuff, context can be really useful, or for cases where your function code may need to know something about where it is. You can save yourself from having to use an environment variable sometimes, if you can look in the context object. So at the core, you have at least one file; you can name it whatever you want, and a lot of people call it index. Then, within that file, you define a function called handler. Again, it doesn't have to be called handler, but
that makes it easy to know which one it is, and it takes that event and context. And so really, if that's all you have, you can literally just have your Lambda be one Python file that says: def handler, which takes, you know, an event, and then returns something. And that can be it, as long as you define index.handler as your handler resource; that's a lot of words, but it's basically how AWS finds your Lambda. The required parameters are basically the handler URI, which is the name of the file, then a dot, then the name of the handler function. So that's it at its most basic; every Lambda has that. But then you start scoping it out so you can actually organize your code decently. And then it's just a matter of: is there a README there? Just like any other Python application, really. Do you have a README? Do you want to use a requirements.txt file to define, these are the exact third-party libraries that I'm going to be using? That's really useful. And if you're defining it with SAM, which I really recommend, then there's a file called template.yaml, and that contains the actual AWS resource definition, as well as any CloudFormation-defined resources that you're using. So you can make the template.yaml the infrastructure as code, and then everything else is just the code as code.

Nate: [00:29:36] Okay, so unpacking that a little bit: you'll invoke this function, and there will basically be two arguments. One is the event that's coming in, and then it'll also have the context, which is sort of metadata about the context in which this request is running. You mentioned some of the things that come in the context, like what region you're in or what the name of the function is. What are some of the parameters in the event object?

Steph: [00:30:02] So the interesting thing about the event object is, it can be anything. It just has to be basically a Python dictionary; you could think of it like a JSON, right? So it's not predefined, and Lambda itself doesn't care what the event is. It's all up to your code to decide what it is, what a valid event is, and how to act on it. So, API Gateway, if you're using that: there are a lot of example events API Gateway will send, and if you ever look at the test events for Lambda, you'll see a lot of templates, which are just JSON files with sample payloads. But really, it can be anything.

Nate: [00:30:41] So the way that Lambda's structured is that API Gateway will typically pass in an event that says, maybe, the request was a POST request, and it has these query parameters or headers attached to it, and all of that would be within the request object. But the event could also be, like you mentioned, a CloudWatch event that could come in. You basically just have to configure your handler to handle any of the events you expect that handler to receive.

Steph: [00:31:07] Yeah, exactly.

Nate: [00:31:09] So let's talk a little bit more about the development tooling. How in the world do you test these sorts of things? Do you have to deploy every single time? Tell us about the development tooling that you use to test along the way.

Steph: [00:31:22] Yeah. So one of the great things about SAM, and there are some other tools for this as well, is that it lets you test your Lambdas locally before you deploy, if you want. The way it does that is, I mentioned earlier that Lambda is really, at its core, a container, like a Docker container running on a server somewhere. It just creates a Docker container that behaves exactly like a Lambda would, and it sends it your events.
So you would just define a JSON with the expected data from API Gateway or whatever, right? You make a test event, and then it will send that in. It builds the container on demand for you, and you test it all locally with Docker. When you like it, you use the same tool, and it'll package it up and deploy it for you. So yeah, it's actually not too bad to test locally at all.

Nate: [00:32:05] So you create JSON files of the events that you want it to handle, and then you just invoke it with those particular events.

Steph: [00:32:12] Yeah. So basically, if I created a test event, I would save it to my repo as something like tests/api-gateway-event.json and put in the data I expect, and then I would run the command, which is sam local invoke, and give it the file path to the JSON, and it would process it. I'd see the exact same output that I would expect to see from Lambda. So it'll say, hey, this took this many milliseconds to invoke, the response code was this, this is what was printed. It's almost one-to-one with what you would get from Amazon Lambda output, so it's really useful.

Amelia: [00:32:50] And then, to update your Lambda functions, do you have to go inside the AWS GUI, or can you do that from the command line?

Steph: [00:32:57] Yeah, you can do that from the command line with SAM as well. There are sam package and sam deploy commands. That's useful if you need to use basically any type of CI testing service to manage your deployments or anything like that. You can package it and then send the package to whatever you're using, GitLab or something, for further validation, and then have GitLab deploy it. If you don't want people to have deploy credentials on their local machines, that's the reason it's broken up into two steps there. But basically you just do a command, sam deploy, and what it does is it goes out to Amazon.
It says, hey, update the Lambda to point to this as the new resource artifact to be invoked. And if you're using the versioning feature, which I think is enabled by default now, it actually just adds another version of the Lambda, so that if you need to roll back, you can just go to the previous one, which is really useful sometimes.

Nate: [00:33:54] So let's talk a little bit about deployment. One of the things that I think is stressful when you are deploying Lambda functions is, I have no idea how much it's going to cost. How much is it going to cost to launch something, and how much am I going to pay? I guess maybe you can calculate it if you estimate the number of requests you think you're going to get, but how do you approach that when you're writing a new function?

Steph: [00:34:18] Yeah, so the first thing I look at is: what's the minimum timeout, and what's the minimum memory usage? The number of invocations is a big factor, right? If you have free tier, I think it's like a million invocations you get, but that's assuming they're under a hundred milliseconds each. When you just deploy it, there's no cost for just deploying it; you don't get charged until it's invoked. If you're storing an artifact in S3, there's a little cost for keeping it in S3, but it's usually really, really minimal. So the big thing is really: how many times are you invoking it? Is it over a million times, and are you on free tier or not? The cost, like I said, gets batched together, and it's actually really pretty cheap just in terms of number of invocations. The bigger place where you can normally save costs is whether it's over-provisioned for how much memory you give it. I think the smallest unit you can give it is 128, and that can go up to like two gigabytes, maybe more now. So if you have it set where, oh, I want it to use this much memory, and it really never is going to use that much memory, that's kind of wasteful; or, if it actually does use that much, that's a sign something's wrong.

Nate: [00:35:25] Because you configure beforehand, like, we're going to use a max of 128 megabytes of memory, and then it's allocated on invocation or something like that. And if you set it too high, you're going to pay more than you need to. Is that right?

Steph: [00:35:40] Yeah. Well, it's more like, I'll have to double check, but it actually just shows you how much memory you used each time a Lambda is invoked, so you can sort of measure whether it's getting near the limit. And if you need more, it might give an error if it isn't able to complete. But in general, I haven't had many cases where the memory has been the limiting factor. I will say that the timeout can sometimes get you, because if a Lambda is processing forever, well, a lot of times API Gateway has its own sort of timeout, which I think is 30 seconds to respond. And if your Lambda is set so that you give it five minutes to process, it can always spend five minutes processing. Let's say that you programmed something wrong and there's a loop somewhere that's going on forever: it'll waste five minutes computing, API Gateway will give up after 30 seconds, but you'll still be charged for the five minutes that Lambda was doing its thing.

Nate: [00:36:29] So it's like, I know that AWS's services and Lambda are created by world-class engineers; it's probably the highest performing infrastructure in the world. But as a user, sometimes it feels like a giant Rube Goldberg machine, and I have no idea about all of the different aspects that are involved. Like, how do you manage that complexity? When you're trying to learn AWS, let's say someone who's listening to this wants to try to understand this, how do you go about making sense of all of that difficulty?

Steph: [00:37:02] You really have to go through a lot of the docs. Videos of people showing you how they did something aren't always the best, just because they kind of skirt around all the things that went wrong in the process, right? So it's really important just to look at the documentation and understand what all these features are before you use them. The marketing people make it sound like it's super easy, and to a degree it really is, like, it's easier than the alternative, right? It's where you put your complexity that's the problem.

Nate: [00:37:29] Yeah, and I think that part of the problem that I have with their docs is that they are trying to give you every possible path, because they're an infrastructure provider, and so they support these very complex use cases. So it's like the world's most detailed choose-your-own-adventure. It's like, oh, you decided that you need to take this path? Go here; or there's path B, or path C. There are so many different paths you can go down. It's just a lot when you're first learning.

Steph: [00:37:58] It is, and sometimes the blog posts have better actual tutorial kinds of things for real use cases. So if you have a use case that is general enough, a lot of times you can just Google for it, and there'll be something that one of their solution architects wrote up about how to actually do it from a user-friendly perspective. The thing with the options is that you need to be aware of them, too, just because the way that they interact can be really important if you ever do something that hasn't been done before. The reason why it's so powerful, and why it takes all these super smart people to set up, is actually that there are just so many variables that go into it, and you can do so much with it, that it's so easy to shoot yourself in the foot. It always has been, in a way, right? But it's just learning how to not shoot yourself in the foot and use it in the right way. Once you get that down, it's really great.

Amelia: [00:38:46] So there are over a hundred AWS services. How do you personally find new services that you want to try out, or how does anyone discover any of these different services?

Steph: [00:38:57] What I do is, you know, I get the emails from AWS whenever they release new ones, and I try to keep up to date with that. Sometimes I'll read blog posts that I see people writing about how they're using some of them. But honestly, a lot of it is just, when I'm doing something, I keep an eye out if there's something I wish it did. For instance, I use AWS Systems Manager a lot, which you can think of sort of like a config management and orchestration tool. Basically, it's a little agent.
You could usually create a mostly equivalent CloudFormation template, but there's more to it. There's a lot of reasons why you would want to use SAM for serverless specifically, just because they add so many niceties around, you know, permissions management that you don't have to think about as much, and shortcuts, and it's just a lot easier to deal with, which is a nice change. But the power of CloudFormation is that if you wanted to do something that maybe SAM didn't support, that is outside the normal scope, you could just stick a CloudFormation resource definition in it and it would work the same way, because it's running against it. It's one of those services that sometimes gets a bad rap because it's so complicated, but it's also super stable. It behaves in a super predictable way, and I think learning how to use that when I worked at AWS was really valuable.

Nate: [00:42:08] What tools do you use to manage the security when you're configuring these things? So earlier you mentioned IAM, which is a, I don't know what it stands for.

Steph: [00:42:19] Identity and access management.

Nate: [00:42:20] Right. Which is, like, configuration where we can configure which accounts have access to certain resources. Let me give you an example. One question I have is how do you make sure each system has the minimum level of permissions, and what tools do you use? So for example, I wrote this Lambda function a couple of weeks ago. I was just following some tutorial, and they said, yeah, make sure that you create this IAM role as one of the resources for launching this Lambda function, which is great. But then, like, how do I pin down the permissions when I'm granting that function itself permissions to grant new IAM roles? It was like I basically just had to give it root permission, according to my skill level, because otherwise I wasn't able to
create IAM roles without the authority to create new roles, which just seems like root permissions.

Steph: [00:43:13] Yes. So there are some ways. That's super risky, honestly, like super risky.

Nate: [00:43:17] Yeah. I'm going to need your help.

Steph: [00:43:19] But there are ways you can limit it down with the right kind of definition. So IAM, it's really powerful, right? The original use case behind IAM roles was for servers, so that if you had an application server and a database server separately, you could give them separate IAM roles, so that there were different things they could do. Like, you never really want your database server to maybe interface directly with, you know, an S3 resource, but maybe you want your application server to do that or something. So it was nice because it really let you limit down the scope for servers, and you don't have to keep keys anywhere on the server if you're using IAM roles to access that stuff, because otherwise you'd have to leave keys around. So anytime you're storing an AWS secret key on a server, or in a Lambda, you kind of did something wrong, because AWS doesn't really care where those keys are used from. It just looks: is it a key? Do it. But when you actually use IAM policies, you can say it has to be from this role, it has to be executed from this service. So it can make sure that it's Lambda doing this, and not somebody trying to assume Lambda credentials, right? There's so much you can do to kind of limit it with IAM. So it was really good to learn about that. And all of the AWS certifications do focus on IAM specifically.
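Steph's point that a policy can require "it has to be executed from this service" is what an IAM trust policy expresses. A minimal sketch as a Python dict — the principal is the standard Lambda service principal, but this is an illustration, not a complete role definition:

```python
import json

# Trust (assume-role) policy: only the Lambda service itself may assume
# this role, so a person holding leaked keys cannot "be" the function.
# Illustrative sketch, not a complete role definition.
lambda_trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "lambda.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}

print(json.dumps(lambda_trust_policy, indent=2))
```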
So if anyone's thinking about taking an AWS certification course, a lot of them will introduce you to that and help a lot with understanding how to use those correctly. But for what you talked about, like how do you deal with a function that creates a role for another function, right? What you would do in that kind of case is, there's an idea of IAM paths. Basically you can use them as namespacing for IAM permissions, right? So you can make an IAM role that can only create roles underneath its own namespace, within its own path.

Nate: [00:45:20] When you say namespaces, I mean, does it inherit the permissions that the parent has?

Steph: [00:45:28] Depends. So it doesn't inherit itself. But let's say that I was making a build server, and my build server had to use a couple of different roles for different pieces of it, for different steps, because they used different services or something. So we would give it the top-level one of build, and then in my S3 bucket I might say allow upload for anyone whose path had build in it. So that's the idea, that you can limit on the other side what is allowed. And of course, it's one of those things where you want to, by default, blacklist as much as possible and then whitelist what you can. But in reality it can be very hard to go through some of that stuff, so you just have to try, wherever you can, to minimize the risk potential and understand what's the worst case that could happen if someone came in and was able to use these credentials for something.

Amelia: [00:46:16] What are some of the other common things that people do wrong when they're new to AWS or DevOps?

Steph: [00:46:22] One thing I see a lot is people treating the environment variables for Lambdas as if they were private, like secrets.
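The path-based namespacing Steph describes — a role that can only create and pass roles under its own path — can be sketched as a policy document. The account ID and the /build/ path here are placeholders for illustration, not a drop-in policy:

```python
import json

# Hypothetical policy for a build server's role: it may create and pass
# roles only under the /build/ path, so anything it spawns stays inside
# its own namespace. 123456789012 is the usual placeholder account ID.
build_scoped_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["iam:CreateRole", "iam:PassRole"],
            # The path segment in the ARN restricts which role names
            # this policy allows.
            "Resource": "arn:aws:iam::123456789012:role/build/*",
        }
    ],
}

print(json.dumps(build_scoped_policy, indent=2))
```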
So they think that if you put an API key in through the environment variable, that that's secure. But really, like, I worked in AWS support; anyone would be able to see that if they were helping you out in your account. So it's not really a secure way to do that. You would need to use a service like Secrets Manager, or you'd have some way to encrypt it before you put it in, and then the Lambda would decrypt it, right? So there's ways to get around that, but using environment variables as if they were secure, or storing secure things within your Git repositories that get pushed to AWS, is a really big thing that should be avoided.

Nate: [00:47:08] I'm pretty sure that I put an API key in mine before.

Steph: [00:47:11] So yeah, no, it's one of the things people do, and it's one of those things where, you know, maybe nothing will go wrong and it's fine, but if you can just reduce the scope, then you don't have to worry about it. And it just makes things easier in the future.

Amelia: [00:47:27] What are the new hot things that are up and coming?

Steph: [00:47:30] So I'd say that there's more and more uses for Lambda at the edge, for IoT integration, which is pretty cool. Basically, it's where you can run your Lambdas on small computers; just think of it as, like, a Raspberry Pi type thing, right? So you could take a small computer and you could put it, you know, maybe where it doesn't have a completely consistent internet connection. So maybe if you're doing a smart vending machine or something, think of it like that.
Then you could actually execute the Lambda logic directly there, deploy it to there, and manage it from AWS whenever it does have a network connection. It just reduces latency a lot, and it lets you test your code locally and then deploy it out. So it's really cool for IoT stuff. There's been tons of stuff happening in the machine learning space on AWS, too much for me to even keep on top of. But a lot of the stuff around Alexa voices is super cool, like Polly. If you've played with your Alexa type thing before, it's cool, but you could just write a little Lambda program to actually generate, you know, whatever you want it to say in different accents, different voices, on demand, and integrate it with your own thing, which is pretty cool. Like, I mean, I haven't had a super great use case for that yet, but it's fun to play with.

Amelia: [00:48:48] I feel like a lot of the internet of things are like that.

Steph: [00:48:52] Oh, they totally are. They really are. But yeah, it's just one of those things you have to keep an eye out for.
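Going back to Steph's earlier point about environment variables: fetching a secret from Secrets Manager at runtime might look like the sketch below. The client is passed in as a parameter — in a real Lambda it would be `boto3.client("secretsmanager")` — so the function stays testable without AWS credentials, and the secret name is hypothetical:

```python
import json

def get_secret(secrets_client, secret_id):
    """Fetch and parse a JSON-formatted secret from AWS Secrets Manager.

    In a real Lambda, `secrets_client` would be
    boto3.client("secretsmanager"); taking it as a parameter means this
    function can be exercised with a stub client.
    """
    response = secrets_client.get_secret_value(SecretId=secret_id)
    return json.loads(response["SecretString"])

# Inside a handler you would call something like (name is illustrative):
#   creds = get_secret(boto3.client("secretsmanager"), "prod/db-creds")
# rather than reading the credential out of an environment variable.
```

Because the secret is fetched at invocation time, rotating it in Secrets Manager takes effect without redeploying the function or touching its configuration.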
Sometimes the things that excite me, because I'm dealing so much with enterprisey kind of stuff, are not really exciting to other people, because it's like, yay, patching has a way to lock it down to a specific version at a specific time. You know, it's not really exciting, but yeah.

Nate: [00:49:14] And I think that's one of the things that's interesting talking to you. I write web apps, so I think of serverless from a web app perspective: oh, I'm going to write an API that will, you know, fix my images on the way up or something. But a lot of the uses that you alluded to are using serverless for managing other parts of your infrastructure. Like, you've got a monitor on some EC2 instance that sends out a CloudWatch alert, and that then responds in some other way within your infrastructure. So that's really interesting.

Steph: [00:49:47] Yeah, no, it's just been really valuable for us. And like I said, I mentioned the IAM stuff; that's what makes it all possible, really.

Amelia: [00:49:52] So this is totally unrelated, but I'm always curious how people got into DevOps. Because I do a lot of front-end development, and I feel like it's pretty easy to get into front-end web development: a lot of people need websites, and it's fairly easy to create a small website, so that's a really good gateway. But I've never, like, on the weekend wanted to spin up a server.

Steph: [00:50:19] Honestly, for me, a lot of it was my first job in college. I was basically part-time tech support / sysadmin. And I always loved Linux because, well, the reason I got into Linux in the first place is I realized in high school that I could get around a lot of the school's, you know, spy software that wouldn't let you do fun stuff on the internet or with the software, if I just used a live-boot Linux USB. So part of it was just, I was using it.
You know, to get around stuff; just curiosity about that kind of stuff. But when I got my first job, that kind of sysadmin type thing, it became a necessity. Because, you know, when you have limited resources — it was me, another part-time person, and one full-time person, and hundreds of people whose email and everything we had to keep working for them. It becomes a necessity, because you realize that with all the stuff you have to do by hand, you can't keep track of it all, and you can't keep it all secured with a few people. It's extremely hard. And so one way people dealt with that was, you know, offshoring, or hiring other people to maintain it. But it was kind of cool at the time to realize that the same stuff I was learning in my CS program about programming, there was no reason I couldn't use that for my job, which was support and admin stuff. So I think I got introduced to Chef; that was the first tool where I was like, wow, this changes everything. You know, because you would write little Ruby files to do configuration management, and then you'd run the Chef agent on your servers and, you know, they'd all be configured exactly the same way. And it was testable, and there was all this really cool stuff you could do with Chef that I had been trying to do with, like, Bash scripts or just normal Python scripts. But Chef kind of gave me that framework for it. And I got a job at AWS where one of the main components was supporting their AWS OpsWorks tool, which was basically managed Chef deployments. And that was cool, because then I learned about how that works at super high scale, and what other things people use. And right before I actually got my first job as a full-time DevOps person was when they were releasing the beta for Lambda.
So I was in the little private beta for AWS employees, and we were all kind of just like, wow, this changes a lot. It'll make our jobs a lot easier; you know, in a way it will reduce the need for some of it. But we were so overloaded all the time. And I feel like a lot of people from an ops perspective know what that feels like: there's so much going on and you can't keep track of it all, you're overloaded all the time, and you just want it to be clean and to not have to touch it and to do less work. DevOps was kind of the way forward. So that's really how I got into it.

Amelia: [00:52:54] That's awesome. Another thing I keep hearing is that a lot of DevOps tasks are slowly being automated. So how does the future of DevOps look if a lot of the things that we're doing by hand now will be automated in the future?

Steph: [00:53:09] Well, see, the thing about DevOps is really, it's more of a goal. It's an ideal. A lot of people, if they're DevOps purists, will tell you that it means having a culture where there are not silos between developers and operations, and everyone knows how to deploy and everyone knows how to do everything. But really, in reality, not everyone's a generalist. And being a generalist in some ways is kind of its own specialty, which is how I feel about the DevOps role that you see. So I think we'll see that with the DevOps role, people might go by different names for the same idea, which is basically reliability engineering. Like, Google has a whole book about site reliability engineering; it's the same kind of philosophy, right? You want to keep things running, you want to know where things are, you want to make things efficient from an infrastructure level. But the way that you do it is you use a lot of the same tools that developers use.
So I think that we'll see tiles shift to like serverless architect is a big one that's coming up because that reliability engineering is big.And we may not see people say dev ops is their role as much, but I don't think the need for people who kind of specialize in like infrastructure and deployment and that kind of thing is going to go away. You might have to do more with less, right? Or there might be certain companies that just hire. A bunch of them, like Google and Amazon, right?They're pro still going to be a lot of people, but maybe they're not going to be working at your local place because if they're going to be working for the big people who actually develop the tools that are used for that resource. So I still think it's a great field and it might be getting a little harder to figure out where to enter in this because there's so much competition and attention around the tools and resources that people use, but it's still a really great field overall. And if you just learn, you know, serverless or Kubernetes or something that's big right now, you can start to branch out and it's still a really good place to kind of make a career.Nate: [00:54:59] Yeah. Kubernetes. Oh man, that's a whole nother podcast. We'll have to come back for that.Steph: [00:55:02] Oh, it is. It is.Nate: [00:55:04] So, Steph, tell us more about where we can learn more about you.Steph: [00:55:07] Yeah. So I have a book coming out.Nate: [00:55:10] Yes. Let's talk about the book.Steph: [00:55:12] Yeah. So I'm releasing a book called, Fullstack Serverless. See, I'm terrible.I should know exactly what the title, I don'tNate: [00:55:18] know exactly the title. . Yeah. Full stack. Python with serverless or full-stack serverless with Python,Steph: [00:55:27] full stack Python on Lambda.Nate: [00:55:29] Oh yeah. Lambda. Not serverless.Steph: [00:55:31] Yeah, that's correct. Python on Lambda. Right. 
And that book really could take you from start to finish, to really understand. I think if people had read this kind of book before learning it, like I wish I had, it wouldn't feel so confusing, or like a black box where you don't know what's happening. Because really, at its core, with Lambda you can understand exactly everything that happens; it has a reason. You know it's running on infrastructure that's not too different from people who run infrastructure on Docker or something, right? And the code that you write can be the same code that you might run on a server or on some other cloud provider. So the real things that I think the book has that may be kind of hard to find elsewhere: there's a lot of information about how you do proper testing and deployment, how you manage your secrets correctly so you aren't storing those in the environment variables, stuff about logging and monitoring, and all the different ways that you can trigger Lambda. So API Gateway, you know, that's a big one, but then I mentioned S3 and all those other ones; there's going to be examples of pretty much every way you can do that in the book. Stuff about optimizing cost and performance, and stuff about using the SAM Serverless Application Repository, so you can actually publish Lambdas and share them and even sell them if you want to. So it's really a start-to-finish everything you need if you want to have something that you create from scratch in production. I don't think there's anything left out that you would need to know. I feel pretty confident about that.

Nate: [00:57:04] It's great. I think one of the things I love about it is it's almost like the anti-version of the docs. Like we talked about earlier, the docs cover every possible use case; this talks about very specific, but production, use cases in a very approachable, linear way.
You know, even though you can find some tutorials online, maybe, like you mentioned, they're not always accurate in terms of how you actually do or should do it. So yeah, I think your book so far has been really great in covering these production concerns in a linear way. All right. Well, Steph, it was great to have you.

Steph: [00:57:37] Thank you for having me. It was great talking to you both.

The REPL
14: ClojureScript, Lumo, and Lambdas with Antonio Monteiro

The REPL

Play Episode Listen Later Dec 5, 2018 59:57


Antonio Monteiro talks about building Lumo, improving the ClojureScript beginner experience, typed GraphQL in OCaml, and creating a custom AWS Lambda runtime. Sponsor: Deps - Private, Hosted, Maven Repositories Lumo CLJS GWT Pilloxa V8 custom startup snapshots Glitch with Lumo clj-commons Om Relay Falcor Ladder The REPL episode with Martin Klepsch OCaml Reason ML Lambda support for Powershell Rust runtime for AWS Lambda and GitHub project Antonio’s OCaml Lambda runtime AWS Lambda Runtime API Howard Lewis Ship on The REPL talking about GraphQL Small FP - Antonio Monteiro Developing ReasonML frontend with GraphQL Zeit

Java Off-Heap
Episode 28. Back from JavaOne! With the dropped bomb on Java EE (or EE4J?), FN tech and more!

Java Off-Heap

Play Episode Listen Later Nov 28, 2017


So it's our annual JavaOne Debrief. After landing at the conference we got to take a look at what's brewing behind Oracle. With our special guest we dive into the Big Red plans for EE, the answer from Oracle on Lambdas (hey everyone is...