Redis is consistently one of the most beloved pieces of infrastructure for developers. And in the last few years, we've seen a number of new Redis-compatible projects that aim to improve on the core of Redis in some way. One of those projects is DragonflyDB, a multi-threaded version of Redis that allows for significantly higher throughput on a single instance. Roman Gershman is the co-founder and CTO at Dragonfly, and he has a fascinating background. Roman initially worked at Google and then was a frustrated user of Redis while working as an engineer at a fast-growing startup. He did a stint on the ElastiCache team at AWS but struck out on his own to make a new, faster version of Redis. In this episode, we talk through the improvements that Dragonfly makes to Redis and why they matter to high-scale users. We go through the different needs and requirements of high-scale cache applications and what Roman learned at AWS. We also go through the Redis licensing drama and how to attract developer attention in 2025.
This episode discusses solutions for securely accessing private VPC resources for debugging and troubleshooting. We cover traditional approaches like bastion hosts and VPNs, as well as newer solutions using containers and AWS services like Fargate, ECS, and SSM. We explain how to set up a Fargate task with a container image containing the necessary tools, enable ECS integration with SSM, and use SSM to start remote shells and port-forwarding tunnels into the container. This provides on-demand access without exposing resources on the public internet. We share a Python script to simplify the process and suggest ideas for improvements, like auto-scaling the container down when idle. Overall, this lightweight containerized approach provides easier access for debugging than managing EC2 instances.
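The workflow described above can be scripted along these lines. This is a minimal sketch, not the script from the episode: the cluster, task, and container names are placeholders, and it assumes the AWS CLI v2 with the Session Manager plugin is installed and that the Fargate task was launched with ECS Exec enabled.

```python
# Sketch of a helper for debugging inside a private VPC via a Fargate task.
# All resource names below are placeholder values, not real infrastructure.

def build_exec_shell_cmd(cluster: str, task: str, container: str) -> list:
    """AWS CLI invocation that opens an interactive shell in a running
    ECS/Fargate container over SSM (ECS Exec)."""
    return [
        "aws", "ecs", "execute-command",
        "--cluster", cluster,
        "--task", task,
        "--container", container,
        "--interactive",
        "--command", "/bin/bash",
    ]

def build_port_forward_cmd(cluster: str, task_id: str, runtime_id: str,
                           remote_host: str, remote_port: int,
                           local_port: int) -> list:
    """SSM session that forwards a local port to a host reachable only
    from inside the VPC (for example, a private RDS endpoint)."""
    target = "ecs:%s_%s_%s" % (cluster, task_id, runtime_id)
    params = ('{"host":["%s"],"portNumber":["%d"],"localPortNumber":["%d"]}'
              % (remote_host, remote_port, local_port))
    return [
        "aws", "ssm", "start-session",
        "--target", target,
        "--document-name", "AWS-StartPortForwardingSessionToRemoteHost",
        "--parameters", params,
    ]

if __name__ == "__main__":
    # Print the commands rather than executing them, so the sketch is
    # safe to run without AWS credentials.
    print(" ".join(build_exec_shell_cmd("debug-cluster", "abc123", "debug")))
    print(" ".join(build_port_forward_cmd("debug-cluster", "abc123", "rt-1",
                                          "db.internal", 5432, 15432)))
```

In practice, a script like this would hand each argv list to `subprocess.run` to open the shell or the tunnel on demand.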
This episode features Madelyn Olson, maintainer of the open-source project Valkey, discussing the growth and impact of open-source projects in the tech industry. Corey and Madelyn explore the transformations within these projects, particularly the challenges and shifts in governance and licensing practices that affect how companies like AWS contribute to and utilize open-source software. Madelyn also shares insights into the motivations behind Valkey, its differentiation from Redis, and the broader implications for open-source sustainability and corporate involvement.

Show Highlights:
(00:00) Introduction and discussion on AWS's approach to open-source
(01:41) Recap of the Redis controversy and licensing changes
(02:35) Madelyn's role at AWS and her work on ElastiCache and MemoryDB
(04:11) The enduring relevance and importance of open source in solving global technology problems
(06:15) The freedoms of open source and the broad implications for software development
(08:19) The evolution of governance and project management in the Valkey project
(09:53) The full transition of Madelyn's efforts from Redis to Valkey
(17:27) Why Valkey was created and its future direction
(24:57) The separation of duties between Madelyn's roles at AWS and the Valkey project
(32:34) Closing thoughts and where to find more information on Valkey

About Madelyn:
Madelyn Olson is a co-creator and maintainer of Valkey, a high-performance key-value data store, and a Principal Engineer at Amazon Web Services (AWS). She focuses on building secure and highly reliable features, with a passion for working with open-source communities.

Links Referenced:
Website: https://valkey.io/
LinkedIn: https://www.linkedin.com/in/madelyn-olson-valkey/
GitHub: https://github.com/madolson
Twitter: https://x.com/reconditerose

Sponsor:
Panoptica: https://www.panoptica.app/
Jeff Morris, VP of Product & Solutions Marketing at Couchbase, joins Corey on Screaming in the Cloud to discuss Couchbase's new columnar data store functionality, specific use cases for columnar data stores, and why AI gets better when it communicates with a cleaner pool of data. Jeff shares how businesses like Domino's and United Airlines could use more responsive databases to create hyper-personalized experiences for their customers. Jeff dives into the linked future of AI and data, and Corey learns about Couchbase's plans for the re:Invent conference. If you're attending re:Invent, you can visit Couchbase at booth 1095.

About Jeff:
Jeff Morris is VP of Product & Solutions Marketing at Couchbase (NASDAQ: BASE), a cloud database platform company that 30% of the Fortune 100 depend on.

Links Referenced:
Couchbase: https://www.couchbase.com/

Transcript

Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.

Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. This promoted guest episode of Screaming in the Cloud is brought to us by our friends at Couchbase. Also brought to us by Couchbase is today's victim, for lack of a better term. Jeff Morris is their VP of Product and Solutions Marketing. Jeff, thank you for joining me.

Jeff: Thanks for having me, Corey, even though I guess I paid for it.

Corey: Exactly. It's always great to say thank you when people give you things.
I learned this from a very early age, and the only people who didn't were rude children and turned into worse adults.

Jeff: Exactly.

Corey: So, you are effectively announcing something new today, and I always get worried when a database company says that because sometimes it's a license that is going to upset people, sometimes it's dyed so deep in the wool of generative AI that, “Oh, we're now supporting vectors or whatnot.” Well, most of us don't know what that means.

Jeff: Right.

Corey: Fortunately, I don't believe that's what you're doing today. What have you got for us?

Jeff: So, you're right. It's—well, what I'm doing is, we're announcing new stuff inside of Couchbase and helping Couchbase expand its market footprint, but we're not really moving away from our sweet spot, either, right? We like building—or being the database platform underneath applications. So, push us on the operational side of the operational-versus-analytic kind of database divide. But we are announcing a columnar data store inside of the Couchbase platform so that we can build bigger, better, stronger analytic functionality to feed the applications that we're supporting with our customers.

Corey: Now, I feel like I should ask a question around what a columnar data store is because my first encounter with the term was when I had a very early client for AWS bill optimization when I was doing this independently, and I was asking them the… polite question of, “Why do you have 283 billion objects in a single S3 bucket? That is atypical and kind of terrifying.” And their answer was, “Oh, we built our own columnar data store on top of S3. This might not have been the best approach.” It's like, “I'm going to stop you there.
With no further information, I can almost guarantee you that it was not.” But what is a columnar data store?

Jeff: Well, let's start with the, everybody loves more data and everybody loves to count more things, right, but a columnar data store allows you to expedite the kind of question that you ask of the data itself by not having to look at every single row of the data while you go through it. You can say, if you know you're only looking for data that's inside of California, you just look at the column value of find me everything in California, and then I'll pick all of those records to analyze. So, it gives you a faster way to go through the data while you're trying to gather it up and perform aggregations against it.

Corey: It seems like it's one of those, “Well, that doesn't sound hard,” type of things, when you're thinking about it the way that I do, in terms of a database being more or less a medium-to-large-size Excel spreadsheet. But I have it on good faith from all the customer environments I've worked with that no, no, there are data stores that span even larger than that, which is, you know, one of those sad realities of the world. And everything at scale begins to be a heck of a lot harder. I've seen some of the value that this stuff offers and I can definitely understand a few different workloads in which case that's going to be super handy. What are you targeting specifically? Or is this one of those areas where you're going to learn from your customers?

Jeff: Well, we've had analytic functionality inside the platform. It's just that, at the size and scale customers actually wanted to roam through the data, we weren't supporting that that much. So, we'll expand that particular footprint, it'll give us better integration capabilities with external systems, or better access to things in your bucket. But the use case problem is, I think, going to be driven by what new modern application requirements are going to be.
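Jeff's California example can be sketched in a few lines. A columnar layout stores each field contiguously, so a filter only needs to touch the one column it cares about; this is a simplified illustration of the idea, not Couchbase's actual storage format:

```python
# Row-oriented layout: each record is stored together, so a filter
# must visit every row (and every field of it).
rows = [
    {"name": "Ann", "state": "CA", "total": 120},
    {"name": "Bob", "state": "NY", "total": 75},
    {"name": "Cam", "state": "CA", "total": 200},
]

def row_scan_total(rows, state):
    """Aggregate by scanning whole rows."""
    return sum(r["total"] for r in rows if r["state"] == state)

# Column-oriented layout: the same data, one contiguous list per field.
columns = {
    "name":  ["Ann", "Bob", "Cam"],
    "state": ["CA", "NY", "CA"],
    "total": [120, 75, 200],
}

def column_scan_total(columns, state):
    """Only the 'state' column is scanned to find matching positions;
    the 'total' column is then read just at those positions."""
    matches = [i for i, s in enumerate(columns["state"]) if s == state]
    return sum(columns["total"][i] for i in matches)

print(row_scan_total(rows, "CA"))        # 320
print(column_scan_total(columns, "CA"))  # 320
```

Both give the same answer; the columnar version simply never reads the fields the query does not mention, which is what makes large aggregations cheaper.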
You're going to need—we call it hyper-personalization because we tend to cater to B2C-style applications, things with a lot of account profiles built into them. So, you look at an account profile, and you're like, “Oh, well Jeff likes blue, so sell him blue stuff.” And that's a great current level of personalization, but with a new analytic engine against this, you can maybe start aggregating all the inventory information that you might have of all the blue stuff that you want to sell me and do that in real-time, so I'm getting better recommendations, better offers as I'm shopping on your site or looking at my phone and, you know, looking for the next thing I want to buy.

Corey: I'm sure there's massive amounts of work that goes into these hyper-personalization stories. The problem is that the only time they really rise to our notice is when they fail hilariously. Like, you just bought a TV, would you like to buy another? Now, statistically, you are likelier to buy a second TV right after you buy one, but for someone who just, “Well, I'm replacing my living room TV after ten years,” it feels ridiculous. Or when you buy a whole bunch of nails and they don't suggest, “Would you like to also perhaps buy a hammer?”

It's one of those areas where it just seems like a human putting thought into this could make some sense. But I've seen some of the stuff that can come out of systems like this and it can be incredible. I also personally tend to bias towards use cases that are less “here's how to convince you to buy more things” and start aiming in a bunch of other different directions, where it starts meeting emerging use cases or changing situations more rapidly than a human can in some cases. The world has, for better or worse, gotten an awful lot faster over the last few decades.

Jeff: Yeah. And think of it in terms of how responsive can I be at any given moment. And so, let's pick on one of the more recent interesting failures that has popped up.
I'm a Giants fan—San Francisco Giants fan—so I'll pick on the Dodgers. The Dodgers during the baseball playoffs, Clayton Kershaw—three-time Cy Young Award winner, MVP, great, great pitcher—had a first-inning meltdown of colossal magnitude: gave up 11 runs in the first inning to the Diamondbacks.

Well, my customer Domino's Pizza could end up—well, let's shift the focus of our marketing. We—you know, the Dodgers are the best team in baseball this year in the National League—let's focus our attention there, but with that meltdown, let's pivot to Arizona and focus on our market in Phoenix. And they could do that within minutes, or seconds even, with the kinds of capabilities that we're coming up with here, so that they can make better offers to that new environment and also do the decision intelligence behind it. Like, do I have enough dough to make a bigger offer in that big market? Do I have enough drivers, or do I have to go and spin out and get one of the other food delivery folks—Uber Eats, or something like that—to jump on board with me and partner up on this kind of system?

It's that responsiveness in real, real-time, right, that's always been kind of the conundrum between applications and analytics. You get an analytic insight, but it takes you an hour or a day to incorporate that into what the application is doing. This is intended to make all of that stuff go faster. And of course, when we start to talk about things in AI, right, AI is going to expect real-time responsiveness as best you can make it.

Corey: I figure we have to talk about AI. That is a technology that has absolutely sprung to the absolute peak of the hype curve over the past year. OpenAI released Chat-Gippity, either late last year or early this year, and suddenly every company seems to be falling all over itself to rebrand itself as an AI company, where, “We've been working on this for decades,” they say, right before they announce something that very clearly was crash-developed in six months.
And every company is trying to drape themselves in the mantle of AI. And I don't want to sound like I'm a doubter here. I'm like most fans; I see an awful lot of value here. But I am curious to get your take on what do you think is real and what do you think is not in the current hype environment.

Jeff: So yeah, I love that. I think there's a number of things here. What's real is, it's not going away. It is going to continue to evolve and get better and better and better. One of my analyst friends came up with the notion that the exercise of generative AI, it's imprecise, so it gives you similarity things, and that's actually an improvement, in many cases, over the precision of a database. Databases, a transaction either works or it doesn't. It has failover or it doesn't, when—

Corey: It's ideally deterministic when you ask it a question—

Jeff: Yes.

Corey: —the same question a second time, assuming it's not time-bound—

Jeff: Gives you the right answer.

Corey: Yeah, the sa—or at least the same answer.

Jeff: The same answer. And your gen AI may not. So, that's a part of the oddity of the hype. But then it also helps me kind of feed our storyline of, if you're going to try and make gen AI closer and more accurate, you need a clean pool of data that you're dealing with, even though your previous design was probably such that you would use a relational database for transactions, a document database for your user profiles, and you'd probably attach your website to a caching database because you needed speed and a lot of concurrency. Well, now you've got three different databases there that you're operating.

And if you're feeding data from each of those databases back to AI, one of them might be wrong or one of them might confuse the AI, yet how are you going to know? The complexity level is going to become, like, exponential.
So, our premise is, because we're a multi-modal database that incorporates in-memory speed and documents and search and transactions and the like, if you start with a cleaner pool of data, you'll have less complexity that you're offering to your AI system, and therefore you can steer it into becoming more accurate in its response. And then, of course, all the data that we're dealing with is on mobile, right? Data is created there for, let's say, your account profile, and then it's also consumed there because that's what people are using as their application interface of choice.

So, you also want to have mobile interactivity and synchronization and local storage kinds of capabilities built in there. So, those are a couple of the principles that we're looking at: JSON is going to be a great format for it regardless of what happens; complexity is kind of the enemy of AI, so you don't want to go there; and mobility is going to be an absolute requirement. And then, related to this particular announcement, large-scale aggregation is going to be a requirement to help feed the application. There's always going to be some other bigger calculation that you're going to want to do relatively in real time and feed it back to your users or the AI system that's helping them out.

Corey: I think that that is a much more nuanced use case than a lot of the stuff that's grabbing customer attention, where you effectively have the Chat-Gippity story of it being an incredible parrot. Where I have run into trouble with the generative story has been people taking the thing that the robot that's magic and from the future has come up with off the cuff and just hurling that out into the universe under their own name without any human review. And that's fine sometimes, sure, but it does get it hilariously wrong at some points. And the idea of sending something out under my name that has not been at least reviewed by me, if not actually authored by me, is abhorrent.
I mean, I review even the transactional “Yes, you have successfully subscribed,” or, “Sorry to see you go,” email confirmations on stuff because there's an implicit, “Hugs and puppies, love Corey,” at the end of everything that goes out under my name.

Jeff: Right.

Corey: But I've gotten a barrage of terrible sales emails and companies that are trying to put the cart before the horse, where either the “support rep,” quote-unquote, that I'm speaking to in the chat is an AI system or else needs immediate medical attention because there's something going on that needs assistance.

Jeff: Yeah, they just don't understand.

Corey: Right. And most big enterprise stories that I've heard so far that have come to light have been around the form of, “We get to fire most of our customer service staff,” an outcome that basically no one sensible wants. That is less compelling than a lot of the individualized consumer use cases. I love asking it, “Here's a blog post I wrote. Give me ten title options.” And I'll usually take one of them—one of them is usually not half bad and then I can modify it slightly.

Jeff: And you'll change four words in it. Yeah.

Corey: Yeah, exactly. That's a bit of a different use case.

Jeff: It's been an interesting—even as we've all become familiar—or at least junior prompt engineers, right—is, your information is only going to be as good as you feed the AI system—the return is only going to be as good—so you're going to want to refine that kind of conversation.
Now, we're not trying to end up replacing the content that gets produced or the writing of all kinds of prose, other than we do have a code generator that works inside of our environment called Capella iQ that talks to ChatGPT, but we try and put guardrails on that too, right? We always make sure that it's talking in terms of the context of Couchbase rather than, “Where's Taylor Swift this week,” which I don't want it to answer because I don't want to spend GPT money to answer that question for you.

Corey: And it might not know the right answer, but it might very well spit out something that sounds plausible.

Jeff: Exactly. But I think the kinds of applications that we're steering ourselves toward can be helped along by the gen AI systems, but I don't expect all my customers are going to be writing automatic blog post generation kinds of applications. I think what we're ultimately trying to do is facilitate interactions in a way that we haven't dreamt of yet, right? One of them might be, if I've opted into loyalty programs, like my United account and my American Express account—

Corey: That feels very targeted at my lifestyle as well, so please, continue.

Jeff: Exactly, right? And so, what I really want the system to do is for Amex to reward me when I hit 1K status on United while I'm on the flight, and, you know, have the flight attendant come up and be like, “Hey, you did it. Here's a free upgrade from American Express.” That would be hyper-personalization, because you booked your plane ticket with it, but they also happen to know, or they cross-consumed information that I've opted into.

Corey: I've seen them congratulate people for hitting a million miles flown mid-flight, but that's clearly something that they've been tracking and happens a heck of a lot less frequently. This is how you start scaling that experience.

Jeff: Yes.
But that happened because American Airlines was always watching—because that was an American Airlines ad ages ago, right—but the same principle holds true. But I think there's going to be a lot more of these: how much information am I actually allowing to be shared amongst the—call it loyalty programs, but the data sources that I've opted into? And my God, there's hundreds of them that I've personally opted into, whether I like it or not, because everybody needs my email address, kind of like what you were describing earlier.

Corey: A point that I have that I think agrees largely with your point is that few things to me are more frustrating than, when I'm signing up for, for example—oh, I don't know, an AWS event; gee, I can't imagine there's anything like that going on this week—I have to fill out an entire form that always asks me the same questions: how big my company is, whether we have multiple workloads on, what industry we're in. And no matter what I put into that, first, it never remembers me for the next time, which is frustrating in its own right, but two, no matter what I put in to fill that thing out, the email I get does not change as a result. At one point, I said, all right—I'm picking randomly—“I am a venture capitalist based in Sweden,” and I got nothing that is differentiated from the other normal stuff I get tied to my account, because I use a special email address for those things, sometimes just to see what happens. And no, if you're going to make me jump through the hoops to give you the data, at least use it to make my experience better. It feels like I'm asking for the moon here, but I shouldn't be.

Jeff: Yes. [We need 00:16:19] to make your experience better and say, you know, “Here's four companies in Malmo that you ought to be talking to.
And they happen to be here at the AWS event, and you can go find them because their booth is here, here, and here.” That kind of immediate responsiveness could be facilitated, and to our point, ought to be facilitated. It's exactly that kind of thing: use the data in real-time.

I was talking to somebody else today who was discussing that most data, right, becomes stale and valueless—like, 50% of the data, its value goes to zero after about a day, and some of it is stale after about an hour. So, if you can end up closing that responsiveness gap that we were describing—and this is kind of what this columnar service inside of Capella is going to be like—is react in real-time with real-time calculation and real-time look-up and real-time—find out how you might apply that new piece of information right now, and then give it back to the consumer or the user right now.

Corey: So, Couchbase takes a few different forms. I should probably, at least for those who are not steeped in the world of exotic forms of database, I always like making these conversations more accessible to folks who are not necessarily up to speed. Personally, I tend to misuse anything as a database if I can hold it just the wrong way.

Jeff: The wrong way. I've caught that about you.

Corey: Yeah, it's—everything is a database if you hold it wrong. But you folks have a few different options: you have a self-managed commercial offering; you're an open-source project, so I can go ahead and run it on my own infrastructure however I want; and you have Capella, which is Couchbase as a service. And all of those are useful and have their points, and I'm sure I'm missing at least one or two along the way.
But do you find that the columnar use case is going to disproportionately benefit folks using Capella in ways that the self-hosted version would not be as useful for, or is this functionality already available in other expressions of Couchbase?

Jeff: It's not already available in other expressions, although there is analytic functionality in the self-managed version of Couchbase. But as I mentioned earlier, it's just not as scalable or as real-time as far as we're thinking. So, yes, it's going to benefit the database-as-a-service deployments of Couchbase available on your favorite three clouds, and still be interoperable with environments that you might self-manage and self-host. So, there could even be use cases where our development team or your development team builds in AWS using the cloud-oriented features, but is still ultimately deploying and hosting and managing a self-managed environment. You could still do all of that. So, there's still a great interplay and interoperability amongst our different deployment options.

But the fun part, I think, about this is not only is it going to help the Capella user, there's a lot of other things inside Couchbase that help address developers' penchant for trading zero cost for the degrees of complexity that you're willing to accept, because you want everything to be free and open-source. And Couchbase is my fifth open-source company in my background, so I'm well, well versed in the nuances of what open-source developers are seeking. But what makes Couchbase—you know, its origin story—really cool too, though, is it's the peanut butter and chocolate marriage of memcached and the people behind that, and Membase, and CouchDB from [Couch One 00:19:54]. So, I can't think of that many—maybe Red Hat—projects and companies that formed up by merging two complementary open-source projects.
So, we took the scale and—

Corey: You have OpenTelemetry, I think, that did that once, but that—you see occasional mergers, but it's very far from common.

Jeff: But it's very, very infrequent. But what that made the Couchbase people end up doing is make a platform that will scale, make a data design that you can auto-partition anywhere, anytime, and then build independently scalable services on top of that, one for SQL++, the query language. Anyone who knows SQL will be able to write something in Couchbase immediately. And I've got this AI automator, iQ, that makes it even easier; you just say, “Write me a SQL++ query that does this,” and it'll do that. But then we added full-text search, we added eventing so you can stream data, we added the analytics capability originally and now we're enhancing it, and we use JSON as our kind of universal data format so that we can trade data with applications really easily.

So, it's a cool design to start with, and then in the cloud, we're steering towards things like making your entry point in using our database as a service—Capella—really, really, really inexpensive, so that you get that same robustness of functionality as well as the easy cost of entry that today's developers want. And it's my analyst friends that keep telling me the cloud is where the market's going to go, so we're steering ourselves towards that hockey puck location.

Corey: I frequently remark that the role of the DBA might not be vanishing, but it's definitely changing, especially since, the last time I counted, if you hold them and use them as directed, AWS has something on the order of 14 distinct managed database offerings. Some are general purpose, some are purpose-built, and if this trend keeps up, in a decade, the DBA role is going to be determining which of its 40 databases is going to be the right fit for a given workload. That seems to be the counter-approach to a general-purpose database that works across the board. Clearly you folks have opinions on this.
Where do you land?

Jeff: Oh, so absolutely. There's the product that is a suite of capabilities—or that are individual capabilities—and then there's ones that are, in my case, kind of multi-model and do lots of things at once. I think historically, you'll recognize—let's pick on your phone—the same holds true: your phone used to be a watch, used to be a Palm Pilot, used to be a StarTAC telephone, and your calendar application and your day planner, all at the same time. Well, it's not anymore. Technology converges upon itself; it's kind of a historical truism.

And the database technologies are going to end up doing that—or continue to do that—even right now. So, that notion of use a purpose-built database for that particular workload is a ten-year-old notion. Maybe sometimes in extreme cases that is the appropriate thing, but in more cases than not right now, if you need transactions when you need them, that's fine, I can do that. You don't necessarily need Aurora or RDS or Postgres to do that. But when you need search and geolocation, I support that too, so you don't need Elastic. And then when you need caching and everything, you don't need ElastiCache; it's all built in.

So, that multi-model notion of operating on the same pool of data is a lot less complex for your developers; they can code faster and better and more cleanly, and debugging is significantly easier. As I mentioned, SQL++ is our language. It's basically SQL syntax for JSON. We're a reference implementation of this language, along with [AsteriskDB 00:23:42] being one of them, and actually, the original author of that language also wrote DynamoDB's PartiQL.

So, it's a common language that you wouldn't necessarily imagine, but the ease of entry in all of this, I think, is still going to be a driving goal for people.
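To make “SQL syntax for JSON” a little more concrete, here is a rough sketch of the idea. The documents and the query string are invented for illustration (this is not an actual Couchbase session), and the query's WHERE clause is emulated in plain Python rather than sent to a server:

```python
# A handful of JSON-style documents, as a multi-model store might hold them.
# The document shape is hypothetical, purely for illustration.
profiles = [
    {"name": "jeff",  "prefs": {"color": "blue"},  "state": "CA"},
    {"name": "corey", "prefs": {"color": "green"}, "state": "OR"},
    {"name": "ann",   "prefs": {"color": "blue"},  "state": "CA"},
]

# The flavor of query a SQL-for-JSON language expresses: ordinary SQL
# syntax, but with dotted paths into nested document fields.
QUERY = 'SELECT p.name FROM profiles p WHERE p.prefs.color = "blue"'

def names_with_color(docs, color):
    """Plain-Python equivalent of the WHERE clause above:
    follow the nested path prefs.color in each document."""
    return [d["name"] for d in docs if d["prefs"]["color"] == color]

print(names_with_color(profiles, "blue"))  # ['jeff', 'ann']
```

The point Jeff is making is that anyone who can read the SQL string can predict what the filter does, even though the underlying records are nested JSON rather than flat rows.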
The old people like me and you are running around worrying about, am I going to get a particular, really specific feature out of the full-text search environment, or—the other one that I pick on now is—“Am I going to need a vector database, too?” And the answer to me is no, right? The database vendors like ourselves—and like Mongo has announced, and a whole bunch of other NoSQL vendors—we're going to support that. It's going to be just another mode, and you get better bang for your buck when you've got more modes than a single one at a time.

Corey: The consensus opinion that's emerging is very much across the board that vector is a feature, not a database type.

Jeff: Not a category, yeah. Me too. And yeah, we're well on board with that notion, as well. And then, like I said earlier, the JSON as a vehicle to give you all of that versatility is great, right? You can have vector information inside a JSON document, you can have time series information in the document, you could have graph node locations and ID numbers in a JSON array, so you don't need index-free adjacency or some of the other cleverness that some of my former employers have done. It really is all converging upon itself, and hopefully everybody starts to realize that you can clean up and simplify your architectures as you look ahead, so that you do—if you're going to build AI-powered applications—feed it clean data, right? You're going to be better off.

Corey: So, this episode is being recorded in advance, thankfully, but it's going to release the first day of re:Invent. What are you folks doing at the show, for those who are either there and, for some reason, listening to a podcast rather than getting marketed to by a variety of different pitches that all mention AI, or might even be watching from home and trying to figure out what to make of it?

Jeff: Right.
So, of course we have a booth, and my notes don't have in front of me what our booth number is, but you'll see it on the signs in the airport. So, we'll have a presence there, and we'll have an executive briefing room available, so we can schedule time with anyone who wants to come talk to us. We'll be showing not only the capabilities that we're offering here, we'll show off Capella iQ, our coding assistant—okay, so yeah, we're on the AI hype band—but we'll also be showing things like our mobile sync capability, where my phone and your phone can synchronize data amongst themselves without having to actually have a live connection to the internet. So long as we're on the same network locally within the Venetian's network, we have an app that we have people download from the App Store, and then it's a color synchronization app or picture synchronization app.

So, you tap it, and it changes on my screen, and I tap it and it changes on your screen, and we'll have, I don't know, as many people who are around standing there synchronizing—what, maybe 50 phones at a time. It's actually a pretty slick demonstration of why you might want a database that's not only in the cloud but operates around the cloud, operates mobile-ly, operates—you know, can connect and disconnect to your networks. It's a pretty neat scenario. So, we'll be showing a bunch of cool technical stuff as well as talking about the things that we're discussing right now.

Corey: I will say you're putting an awful lot of faith in connectivity working at re:Invent, be it WiFi or the cellular network. I know that both of those have bitten me in various ways over the years. But I wish you the best on it. I think it's going to be an interesting show based upon everything I've heard in the run-up to it. I'm just glad it's here.

Jeff: Now, this is the cool part about what I'm talking about, though.
The cool part about what I'm talking about is we can set up our own wireless network in our booth, and we still—you'd have to go to the app store to get this application, but once there, I can have you switch over to my local network and play around on it and I can sync the stuff right there and have confidence that in my local network that's in my booth, the system's working. I think that's going to be ultimately our design there because oh my gosh, yes, I have a hundred stories about connectivity and someone blowing a demo because they're yanking on a cable behind the pulpit, right?Corey: I always build in a—and assuming there's no connectivity, how can I fake my demos, just because it's—I've only had to do it once, but you wind up planning in advance when you start doing a talk to a large enough or influential enough audience where you want things to go right.Jeff: There's a delightful acceptance right now of recorded videos and demonstrations that people sort of accept that way because of exactly all this. And I'm sure we'll be showing that in our booth there too.Corey: Given the non-deterministic nature of generative AI, I'm sort of surprised whenever someone hasn't mocked the demo in advance, just because yeah, gives the right answer in the rehearsal, but every once in a while, it gets completely unglued.Jeff: Yes, and we see it pretty regularly. So, the emergence of clever and good prompt engineering is going to be a big skill for people. And hopefully, you know, everybody's going to figure out how to pass it along to their peers.Corey: Excellent. We'll put links to all this in the show notes, and I look forward to seeing how well this works out for you. Best of luck at the show and thanks for speaking with me. I appreciate it.Jeff: Yeah, Corey. We appreciate the support, and I think the show is going to be very strong for us as well. And thanks for having me here.Corey: Always a pleasure. Jeff Morris, VP of Product and Solutions Marketing at Couchbase. 
This episode has been brought to us by our friends at Couchbase. And I'm Cloud Economist Corey Quinn. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice along with an angry comment, but if you want to remain happy, I wouldn't ask that podcast platform what database they're using. No one likes the answer to those things.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.
Amir Szekely, Owner at CloudSnorkel, joins Corey on Screaming in the Cloud to discuss how he got his start in the early days of cloud and his solo project, CloudSnorkel. Throughout this conversation, Corey and Amir discuss the importance of being pragmatic when moving to the cloud, and the different approaches they see in developers from the early days of cloud to now. Amir shares what motivates him to develop open-source projects, and why he finds fulfillment in fixing bugs and operating CloudSnorkel as a one-man show. About AmirAmir Szekely is a cloud consultant specializing in deployment automation, AWS CDK, CloudFormation, and CI/CD. His background includes security, virtualization, and Windows development. Amir enjoys creating open-source projects like cdk-github-runners, cdk-turbo-layers, and NSIS.Links Referenced: CloudSnorkel: https://cloudsnorkel.com/ lasttootinaws.com: https://lasttootinaws.com camelcamelcamel.com: https://camelcamelcamel.com github.com/cloudsnorkel: https://github.com/cloudsnorkel Personal website: https://kichik.com TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn, and this is an episode that I have been angling for for longer than you might imagine. My guest today is Amir Szekely, who's the owner at CloudSnorkel. Amir, thank you for joining me.Amir: Thanks for having me, Corey. 
I love being here.Corey: So, I've been using one of your open-source projects for an embarrassingly long amount of time, and for the longest time, I make the critical mistake of referring to the project itself as CloudSnorkel because that's the word that shows up in the GitHub project that I can actually see that jumps out at me. The actual name of the project within your org is cdk-github-runners if I'm not mistaken.Amir: That's real original, right?Corey: Exactly. It's like, “Oh, good, I'll just mention that, and suddenly everyone will know what I'm talking about.” But ignoring the problems of naming things well, which is a pain that everyone at AWS or who uses it knows far too well, the product is basically magic. Before I wind up basically embarrassing myself by doing a poor job of explaining what it is, how do you think about it?Amir: Well, I mean, it's a pretty simple project, which I think what makes it great as well. It creates GitHub runners with CDK. That's about it. It's in the name, and it just does that. And I really tried to make it as simple as possible and kind of learn from other projects that I've seen that are similar, and basically learn from my pain points in them.I think the reason I started is because I actually deployed CDK runners—sorry, GitHub runners—for one company, and I ended up using the Kubernetes one, right? So, GitHub in themselves, they have two projects they recommend—and not to nudge GitHub, please recommend my project one day as well—they have the Kubernetes controller and they have the Terraform deployer. And the specific client that I worked for, they wanted to use Kubernetes. And I tried to deploy it, and, Corey, I swear, I worked three days; three days to deploy the thing, which was crazy to me. 
And every single step of the way, I had to go and read some documentation, figure out what I did wrong, and apparently the order of the documentation was incorrect.And I had to—I even opened tickets, and they—you know, they were rightfully like, “It's an open-source project. Please contribute and fix the documentation for us.” At that point, I said, “Nah.” [laugh]. Let me create something better with CDK and I decided just to have the simplest setup possible.So usually, right, what you end up doing in these projects, you have to set up either secrets or SSM parameters, and you have to prepare the ground and you have to get your GitHub token and all those things. And that's just annoying. So, I decided to create a—Corey: So much busy work.Amir: Yes, yeah, so much busy work and so much boilerplate and so much figuring out the right way and the right order, and just annoying. So, I decided to create a setup page. I thought, “What if you can actually install it just like you install any app on GitHub,” which is the way it's supposed to be, right? So, when you install cdk-github-runners—CloudSnorkel—you get an HTML page and you just click a few buttons and you tell it where to install it and it just installs it for you. And it sets the secrets and everything. And if you want to change the secret, you don't have to redeploy. You can just change the secret, right? You have to roll the token over or whatever. So, it's much, much easier to install.Corey: And I feel like I discovered this project through one of the more surreal approaches—and I had cause to revisit it a few weeks ago when I was redoing my talk for the CDK Community Day, which has since happened and people liked the talk—and I mentioned what CloudSnorkel had been doing and how I was using the runners accordingly. So, that was what accidentally caused me to pop back up with, “Hey, I've got some issues here.” But we'll get to that. 
Because once upon a time, I built a Twitter client for creating threads because shitposting is my love language, I would sit and create Twitter threads in the middle of live keynote talks. Threading in the native client was always terrible, and I wanted to build something that would help me do that. So, I did.And it was up for a while. It's not anymore because I'm not paying $42,000 a month in API costs to some jackass, but it still exists in the form of lasttootinaws.com if you want to create threads on Mastodon. But after I put this out, some people complained that it was slow.To which my response was, “What do you mean? It's super fast for me in San Francisco talking to it hosted in Oregon.” But on every round trip from halfway around the world, it became a problem. So, I got it into my head that since this thing was fully stateless, other than a Lambda function being fronted via an API Gateway, that I should deploy it to every region. It didn't quite fit into a Cloudflare Worker or into one of the Edge Lambda functions that AWS has given up on, but okay, how do I deploy something to every region?And the answer is, with great difficulty because it's clear that no one was ever imagining with all those regions that anyone would use all of them. It's imagined that most customers use two or three, but customers are different, so which two or three is going to be widely varied. So, anything halfway sensible about doing deployments like this didn't work out. Again, because this thing was also a Lambda function and an API Gateway, it was dirt cheap, so I didn't really want to start spending stupid amounts of money doing deployment infrastructure and the rest.So okay, how do I do this? Well, GitHub Actions is awesome. It is basically what all of AWS's code offerings wish that they were. CodeBuild is sad and this was kind of great. 
The problem is, once you're out of the free tier, and if you're a bad developer where you do a deploy on every iteration, suddenly it starts costing, for what I was doing in every region, something like a quarter per deploy, which adds up when you're really, really bad at programming.Amir: [laugh].Corey: So, their matrix jobs are awesome, but I wanted to do some self-hosted runners. How do I do that? And I want to keep it cheap, so how do I do a self-hosted runner inside of a Lambda function? Which led me directly to you. And it was nothing short of astonishing. This was a few years ago. I seem to recall that it used to be a bit less well-architected in terms of its elegance. Did it always use step functions, for example, to wind up orchestrating these things?Amir: Yeah, so I do remember that day. We met pretty much… basically as a joke because the Lambda Runner was a joke that I did, and I posted on Twitter, and I was half-proud of my joke that starts in ten seconds, right? But yeah, no, the—I think it always used step functions. I've been kind of in love with step functions for the past two years. They just—they're nice.Corey: Oh, they're magic, and AWS is so bad at telling their story. Both of those things are true.Amir: Yeah. And the API is not amazing. But like, when you get it working—and you know, you have to spend some time to get it working—it's really nice because then you have nothing to manage, ever. And they can call APIs directly now, so you don't have to even create Lambdas. It's pretty cool.Corey: And what I loved is you wind up deploying this thing to whatever account you want it to live within. What is it, the OIDC? I always get those letters in the wrong direction. OIDC, I think, is correct.Amir: I think it's OIDC, yeah.Corey: Yeah, and it winds up doing this through a secure method as opposed to just okay, now anyone with access to the project can deploy into your account, which is not ideal. And it just works. 
It spins up a whole bunch of these Lambda functions that are using a Docker image as the deployment environment. And yeah, all right, if effectively my CDK deploy—which is what it's doing inside of this thing—doesn't complete within 15 minutes, then it's not going to and the thing is going to break out. We've solved the halting problem. After 15 minutes, the loop will terminate. The end.But that's never been a problem, even with getting ACM certificates spun up. It completes well within that time limit. And its cost to me is effectively nothing. With one key exception: that you made the choice to use Secrets Manager to wind up storing a lot of the things it cares about instead of Parameter Store, so I think you wind up costing me—I think there's two of those different secrets, so that's 80 cents a month. Which I will be demanding in blood one of these days if I ever catch you at re:Invent.Amir: I'll buy you beer [laugh].Corey: There we go. That'll count. That'll buy, like, several months of that. That works—at re:Invent, no. The beers there are, like, $18, so that'll cover me for years. We're set.Amir: We'll split it [laugh].Corey: Exactly. Problem solved. But I like the elegance of it, I like how clever it is, and I want to be very clear, though, it's not just for shitposting. Because it's very configurable where, yes, you can use Lambda functions, you can use Spot Instances, you can use CodeBuild containers, you can use Fargate containers, you can use EC2 instances, and it just automatically orchestrates and adds these self-hosted runners to your account, and every build gets a pristine environment as a result. That is no small thing.Amir: Oh, and I love making things configurable. People really appreciate it I feel, you know, and gives people kind of a sense of power. 
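Corey's 80 cents is simple arithmetic: Secrets Manager bills per secret per month, while standard-tier SSM parameters cost nothing at rest. A quick sketch of the comparison (the $0.40-per-secret figure is an assumption based on typical list pricing and varies by region; this isn't anyone's official calculator):

```python
# Back-of-the-envelope: two Secrets Manager secrets vs. two standard SSM parameters.
# The per-secret rate below is an assumed list price, not an authoritative figure.
SECRETS_MANAGER_PER_SECRET_MONTH = 0.40  # USD per secret per month (assumed)
SSM_STANDARD_PARAMETER_MONTH = 0.00      # standard-tier parameters have no storage charge

def monthly_cost(num_items: int, per_item: float) -> float:
    """Monthly storage cost for `num_items` secrets or parameters."""
    return num_items * per_item

secrets_cost = monthly_cost(2, SECRETS_MANAGER_PER_SECRET_MONTH)
ssm_cost = monthly_cost(2, SSM_STANDARD_PARAMETER_MONTH)
print(f"Secrets Manager: ${secrets_cost:.2f}/month, SSM standard tier: ${ssm_cost:.2f}/month")
```

Trivial money either way, which is rather the point of the joke; the trade-off Secrets Manager buys for those 80 cents is built-in rotation and resource policies.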
But as long as you make that configuration simple enough, right, or at least make the defaults good defaults, right, then, even with that power, people still don't shoot themselves in the foot and it still works really well. By the way, we just added ECS recently, which people really were asking for because it gives you the, kind of, easy option to have the runner—well, not the runner but at least the runner infrastructure staying up, right? So, you can have an auto-scaling group backing ECS and then the runner can start up a lot faster. It was actually very important to other people because Lambda, as fast as it is, it's limited, and Fargate, for whatever reason, still to this day, takes a minute to start up.Corey: Yeah. What's wild to me about this is, start to finish, I hit a deploy to the main branch and it sparks the thing up, runs the deploy. Deploy itself takes a little over two minutes. And every time I do this, within three minutes of me pushing a commit, the deploy is done globally. It is lightning fast.And I know it's easy to lose yourself in the idea of this being a giant shitpost, where, oh, who's going to do deployment jobs in Lambda functions? Well, kind of a lot of us for a variety of reasons, some of which might be better than others. In my case, it was just because I was cheap, but the massive parallelization ability to do 20 simultaneous deploys in a matrix configuration that doesn't wind up smacking into rate limits everywhere, that was kind of great.Amir: Yeah, we have seen people use Lambda a lot. It's mostly for, yeah, like you said, small jobs. And the environment that they give you, it's kind of limited, so you can't actually install packages, right? There is no sudo, and you can't actually install anything unless it's in your temp directory. But still, like, just being able to run a lot of little jobs, it's really great. 
Yeah.Corey: And you can also make sure that there's a Docker image ready to go with the stuff that you need, just by configuring how the build works in the CDK. I will admit, I did have a couple of bug reports for you. One was kind of useful, where it was not at all clear how to do this on top of a Graviton-based Lambda function—because yeah, that was back when not everything really supported ARM architectures super well—and a couple of other times when the documentation was fairly ambiguous from my perspective, where it wasn't at all clear, what was I doing? I spent four hours trying to beat my way through it, I give up, filed an issue, went to get a cup of coffee, came back, and the answer was sitting there waiting for me because I'm not convinced you sleep.Amir: Well, I am a vampire. My last name is from the Transylvania area [laugh]. So—Corey: Excellent. Excellent.Amir: By the way, not the first time people tell me that. But anyway [laugh].Corey: There's something to be said for getting immediate responsiveness because one of the reasons I'm always so loath to go and do a support ticket anywhere is this is going to take weeks. And then someone's going to come back with a, “I don't get it.” And try and, like, read the support portfolio to you. No, you went right into yeah, it's this. Fix it and your problem goes away. And sure enough, it did.Amir: The escalation process that some companies put you through is very frustrating. I mean, lucky for you, CloudSnorkel is a one-man show and this man loves solving bugs. So [laugh].Corey: Yeah. Do you know of anyone using it for anything that isn't ridiculous and trivial like what I'm using it for?Amir: Yeah, I have to think whether or not I can… I mean, so—okay. We have a bunch of dedicated users, right, the GitHub repo, that keep posting bugs and keep posting even patches, right, so you can tell that they're using it. 
I even have one sponsor, one recurring sponsor on GitHub that uses it.Corey: It's always nice when people thank you via money.Amir: Yeah. Yeah, it is very validating. I think [BLEEP] is using it, but I also don't think I can actually say it because I got it from the GitHub.Corey: It's always fun. That's the beautiful part about open-source. You don't know who's using this. You see what other things people are working on, and you never know, is one of their—is this someone's side project, is it a skunkworks thing, or God forbid, is this inside of every car going forward and no one bothered to tell me about that. That is the magic and mystery of open-source. And you've been doing open-source for longer than I have and I thought I was old. You were originally named in some of the WinAMP credits, for God's sake, that media player that really whipped the llama's ass.Amir: Oh, yeah, I started real early. I started about when I was 15, I think. I started off with Pascal or something or even Perl, and then I decided I have to learn C and I have to learn Windows API. I don't know what possessed me to do that. Win32 API is… unique [laugh].But once I created those applications for myself, right, I think there was—oh my God, do you know the—what is it called, Sherlock in macOS, right? And these days, for PowerToys, there is the equivalent of it called, I don't know, whatever that—PowerBar? That's exactly—that was that. That's a project I created as a kid. I wanted something where I can go to the Run menu of Windows when you hit Winkey R, and you can just type something and it will start it up, right?I didn't want to go to the Start menu and browse and click things. I wanted to do everything with the keyboard. So, I created something called Blazerun [laugh], which [laugh] helped you really easily create shortcuts that went into your path, right, the Windows path, so you can really easily start them from Winkey R. 
I don't think that anyone besides me used it, but anyway, that thing needed an installer, right? Because Windows, you got to install things. So, I ended up—Corey: Yeah, these days on Mac OS, I use Alfred for that which is kind of long in the tooth, but there's a launch bar and a bunch of other stuff for it. What I love is that if I—I can double-tap the command key and that just pops up whatever I need it to and tell the computer what to do. It feels like there's an AI play in there somewhere if people can figure out how to spend ten minutes on building AI that does something other than lets them fire their customer service staff.Amir: Oh, my God. Please don't fire customer service staff. AI is so bad.Corey: Yeah, when I reach out to talk to a human, I really needed a human.Amir: Yes. Like, I'm not calling you because I want to talk to a robot. I know there's a website. Leave me alone, just give me a person.Corey: Yeah. Like, you already failed to solve my problem on your website. It's person time.Amir: Exactly. Oh, my God. Anyway [laugh]. So, I had to create an installer, right, and I found it was called NSIS. So, it was a Nullsoft “SuperPiMP” installation system. Or in the future, when Justin, the guy who created Winamp and NSIS, tried to tone down a little bit, Nullsoft Scriptable Installation System. And SuperPiMP is—this is such useless history for you, right, but SuperPiMP is the next generation of PiMP which is Plug-in Mini Packager [laugh].Corey: I remember so many of the—like, these days, no one would ever name any project like that, just because it's so off-putting to people with sensibilities, but back then that was half the stuff that came out. “Oh, you don't like how this thing I built for free in the wee hours when I wasn't working at my fast food job wound up—you know, like, how I chose to name it, well, that's okay. Don't use it. Go build your own. Oh, what you're using it anyway. That's what I thought.”Amir: Yeah. 
The source code was filled with profanity, too. And like, I didn't care, I really did not care, but some people would complain and open bug reports and patches. And my policy was kind of like, okay, if you're complaining, I'm just going to ignore you. If you're opening a patch, fine, I'm going to accept that you're—you guys want to create something that's sensible for everybody, sure.I mean, it's just source code, you know? Whatever. So yeah, I started working on that NSIS. I used it for myself and I joined the forums—and this kind of answers your question of why I respond to things so fast, just because of the fun—I did the same when I was 15, right? I started going on the forums, you remember forums? You remember that [laugh]?Corey: Oh, yeah, back before they all became terrible and monetized.Amir: Oh, yeah. So, you know, people were using NSIS, too, and they had requests, right? They wanted—back in the day, what was it—there was only support for 16-bit colors for the icon, so they wanted 32-bit colors and big colors—32—big icon, sorry, 32 pixels by 32 pixels. Remember, 32 pixels?Corey: Oh, yes. Not well, and not happily, but I remember it.Amir: Yeah. So, I started just, you know, giving people—working on that open-source and creating a fork. It wasn't even called ‘fork' back then, but yeah, I created, like, a little fork of myself and I started adding all these features. And people were really happy, and kind of created, like, this happy cycle for myself: when people were happy, I was happy coding. And then people were happy with what I was coding. And then they were asking for more and they were getting happier, the more I responded.So, it was kind of like a serotonin cycle that made me happy and made everybody happy. So, it's like a win, win, win, win, win. And that's how I started with open-source. 
And eventually… NSIS—again, that installation system—got so big, like, my fork got so big, and Justin, the guy who works on WinAMP and NSIS, he had other things to deal with. You know, there's a whole history there with AOL. I'm sure you've heard all the funny stories.Corey: Oh, yes. In fact, one thing that—you want to talk about weird collisions of things crossing, one of the things I picked up from your bio when you finally got tired of telling me no and agreed to be on the show was that you're also one of the team who works on camelcamelcamel.com. And I keep forgetting that's one of those things that most people have no idea exists. But it's very simple: all it does is it tracks Amazon products that you tell it to and alerts you when there's a price drop on the thing that you're looking at.It's something that is useful. I try and use it for things of substance or hobbies because I feel really pathetic when I'm like, get excited emails about a price drop in toilet paper. But you know, it's very handy just to keep an idea for price history, where okay, am I actually being ripped off? Oh, they claim it's their big Amazon Deals day and this is 40% off. Let's see what camelcamelcamel has to say.Oh, surprise. They just jacked the price right beforehand and now knocked 40% off. Genius. I love that. It always felt like something that was going to be blown off the radar by Amazon being displeased, but I discovered you folks in 2010 and here you are now, 13 years later, still here. I will say the website looks a lot better now.Amir: [laugh]. That's a recent change. I actually joined camel, maybe two or three years ago. I wasn't there from the beginning. But I knew the guy who created it—again, as you were saying—from the Winamp days, right? So, we were both working in the free—well, it wasn't freenode. It was not freenode. It was a separate IRC server that, again, Justin created for himself. It was called landoleet.Corey: Mmm. 
I never encountered that one.Amir: Yeah, no, it was pretty private. The only people that cared about WinAMP and NSIS ended up joining there. But it was a lot of fun. I met a lot of friends there. And yeah, I met Daniel Green there as well, and he's the guy that created, along with some other people in there that I think want to remain anonymous so I'm not going to mention, but they also were on the camel project.And yeah, I was kind of doing my poor version of shitposting on Twitter about AWS, kind of starting to get some traction and maybe some clients and talk about AWS so people can approach me, and Daniel approached me out of the blue and he was like, “Do you just post about AWS on Twitter or do you also do some AWS work?” I was like, “I do some AWS work.”Corey: Yes, as do all of us. It's one of those, well crap, we're getting called out now. “Do you actually know how any of this stuff works?” Like, “Much to my everlasting shame, yes. Why are you asking?”Amir: Oh, my God, no, I cannot fix your printer. Leave me alone.Corey: Mm-hm.Amir: I don't want to fix your Lambdas. No, but I do actually want to fix your Lambdas. And so, [laugh] he approached me and he asked if I can help them move camelcamelcamel from their data center to AWS. So, that was a nice big project. So, we moved, actually, all of camelcamelcamel into AWS. And this is how I found myself not only in the Winamp credits, but also in the camelcamelcamel credits page, which has a great picture of me riding a camel.Corey: Excellent. But one of the things I've always found has been that when you take an application that has been pre-existing for a while in a data center and then move it into the cloud, you suddenly have to care about things that no one sensible pays any attention to in the land of the data center. Because it's like, “What do I care about how much data passes between my application server and the database? 
Wait, what do you mean that in this configuration, that's a chargeable data transfer? Oh, dear Lord.” And things that you've never had to think about optimizing are suddenly things you're very much optimizing.Because let's face it, when it comes to putting things in racks and then running servers, you aren't auto-scaling those things, so everything tends to be running over-provisioned, for very good reasons. It's an interesting education. Anything you picked out from that process that you think would be useful for folks to bear in mind if they're staring down the barrel of the same thing?Amir: Yeah, for sure. I think… in general, right, not just here. But in general, you always want to be pragmatic, right? You don't want to take steps that are huge, right? So, the thing we did was not necessarily rewrite everything and change everything to AWS and move everything to Lambda and move everything to Docker.Basically, we did a mini lift-and-shift, but not exactly lift-and-shift, right? We didn't take it as is. We moved to RDS, we moved to ElastiCache, right, we obviously made use of security groups and Session Manager, and we dropped SSH usage and we improved the security a lot and we locked everything down, all the permissions and all that kind of stuff, right? But like you said, there's stuff that you start having to pay attention to. In our case, it was less the data transfer because we have a pretty good CDN. It was more the IOPS. So—IOPS, specifically, for the database.We had a huge database with about one terabyte of data and a lot of it is that price history that you see, right? So, all those nice little graphs that we create in—what do you call them, charts—that we create in camelcamelcamel off the price history. There's a lot of data behind that. 
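The IOPS pain Amir describes is easy to put numbers on. A rough sketch, using assumed provisioned-IOPS (io1-style) list prices of roughly $0.125 per GB-month for storage and $0.065 per provisioned IOPS-month (both figures are assumptions for illustration; check current regional pricing):

```python
# Rough monthly cost of a provisioned-IOPS EBS volume backing a ~1 TB database.
# Both rates below are assumed approximations of io1 list prices, not quotes.
STORAGE_PER_GB_MONTH = 0.125   # USD per GB-month (assumed)
PER_IOPS_MONTH = 0.065         # USD per provisioned IOPS-month (assumed)

def io1_monthly_cost(size_gb: int, provisioned_iops: int) -> float:
    """Storage charge plus the per-IOPS charge for one month."""
    return size_gb * STORAGE_PER_GB_MONTH + provisioned_iops * PER_IOPS_MONTH

# A 1 TB volume with 5,000 provisioned IOPS:
print(f"${io1_monthly_cost(1024, 5000):,.2f}/month")
```

Hundreds of dollars a month, every month, for what a rack-mounted NVMe drive bought on sale handled for a one-time hundred bucks; that gap is exactly what pushed the hot data toward a different store.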
And what we always wanted to do is actually remove that from MySQL, which has been kind of struggling with it even before the move to AWS, but after the move to AWS, where everything was no longer over-provisioned and we couldn't just buy a few more NVMes on Amazon for 100 bucks when they were on sale—back when we had to pay Amazon—Corey: And you know, when they're on sale. That's the best part.Amir: And we know [laugh]. We get good prices on NVMe. But yeah, on Amazon—on AWS, sorry—you have to pay for io1 or something, and that adds up real quick, as you were saying. So, part of that move was also to move to something that was a little better for that data structure. And we actually removed just that data, the price history, the price points from MySQL to DynamoDB, which was a pretty nice little project.Actually, I wrote about it in my blog. There are, kind of, lessons learned from moving one terabyte from MySQL to DynamoDB, and I think the biggest lesson was about the hidden price of storage in DynamoDB. But before that, I want to talk about what you asked, which was the way that other people should make that move, right? So again, be pragmatic, right? If you Google, “How do I move stuff from MySQL to DynamoDB,” everybody's always talking about their cool project using Lambda and how you throttle Lambda and how you get throttled from DynamoDB and how you set it up with SQS, and this and that. You don't need all that.Just fire up an EC2 instance, write some quick code to do it. I used, I think it was Go with some limiter code from Uber, and that was it. And you don't need all those Lambdas and SQS and the complication. That thing was a one-time thing anyway, so it doesn't need to be super… super-duper serverless, you know?Corey: That is almost always the way that it tends to play out. You encounter these weird little things along the way. And you see so many things that are tied to “this is how architecture absolutely must be done.” 
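The pragmatic shape Amir describes—one EC2 box, a plain loop, and a rate limiter so the writes never exceed the DynamoDB table's provisioned throughput—can be sketched in a few lines. This is an illustrative, hypothetical token-bucket-style limiter in Python, not the actual Go/Uber-limiter code he used:

```python
import time

class RateLimiter:
    """Blocking limiter: allows at most `rate` operations per second."""
    def __init__(self, rate: float):
        self.interval = 1.0 / rate
        self.next_allowed = time.monotonic()

    def wait(self) -> None:
        # Sleep until the next permit is available, then reserve the one after it.
        now = time.monotonic()
        if now < self.next_allowed:
            time.sleep(self.next_allowed - now)
        self.next_allowed = max(now, self.next_allowed) + self.interval

def migrate(rows, write_batch, rate_per_sec: float = 1000.0, batch_size: int = 25):
    """Stream rows out of the source and write them in throttled batches.

    `write_batch` would be your DynamoDB BatchWriteItem call in a real run;
    here it is just any callable taking a list of rows.
    """
    limiter = RateLimiter(rate_per_sec / batch_size)  # one permit per batch
    batch = []
    for row in rows:
        batch.append(row)
        if len(batch) == batch_size:
            limiter.wait()
            write_batch(batch)
            batch = []
    if batch:  # flush the final partial batch
        limiter.wait()
        write_batch(batch)
```

The same shape works for any one-off bulk copy: no queues, no Lambdas, just a loop you can watch, tune, and kill when it's done.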
And oh, you're not a real serverless person if you don't have everything running in Lambda and the rest. There are times where yeah, spin up an EC2 box, write some relatively inefficient code in ten minutes and just do the thing, and then turn it off when you're done. Problem solved. But there's such an aversion to that. It's nice to encounter people who are pragmatists more than they are zealots.Amir: I mostly learned that lesson. And both Daniel Green and me learned that lesson from the Winamp days. Because we both have written plugins for Winamp and we've been around that area and you can… if you took one of those non-pragmatist people, right, and you had them review the Winamp code right now—or even before—they would have a million things to say. That code was—and NSIS, too, by the way—and it was so optimized. It was so not necessarily readable, right? But it worked and it worked amazing. And Justin would—if you think I respond quickly, right, Justin Frankel, the guy who wrote Winamp, he would release versions of NSIS and of Winamp, like, four versions a day, right? That was before [laugh] you had CI/CD systems and GitHub and stuff. That was just CVS. You remember CVS [laugh]?Corey: Oh, I've done multiple CVS migrations. One to Git and a couple to Subversion.Amir: Oh yeah, Subversion. Yep. Done ‘em all. CVS to Subversion to Git. Yep. Yep. That was fun.Corey: And these days, everyone's using Git because it—we're beginning to have a monoculture.Amir: Yeah, yeah. I mean, but Git is nicer than Subversion, for me, at least. I've had more fun with it.Corey: Talk about damning with faint praise.Amir: Faint?Corey: Yeah, anything's better than Subversion, let's be honest here.Amir: Oh [laugh].Corey: I mean, realistically, copying a bunch of files and directories to a .bak folder is better than Subversion.Amir: Well—Corey: At least these days. 
But back then it was great. Amir: Yeah, I mean, the only thing you had, right [laugh]? Corey: [laugh]. Amir: Anyway, achieving great things with not necessarily the right tools, but just sheer power of will, that's what I took from the Winamp days. Just the entire world used Winamp. And by the way, the NSIS project that I was working on, right, I always used to joke that every computer in the world ran my code—every Windows computer in the world ran my code—just because—Corey: Yes. Amir: So many different companies use NSIS. And none of them cared that the code was not very readable, to put it mildly. Corey: So many companies founder on those shores where they lose sight of that fact. I can point to basically no companies that died because their code was terrible, yet an awful lot that died with great-looking code because they didn't nail the business problem. Amir: Yeah. I would be lying if I said that I nailed exactly the business problem at NSIS, because most of the time I would spend there was actually shrinking the stub, right, that was appended to your installer data, right? So, there's a little stub that came—the executable, basically, that came before your data that was extracted. I spent, I want to say, years of my life [laugh] just shrinking it down by bytes—by literal bytes—just so it stays under 34, 35 kilobytes. It was kind of a—it was a challenge and something that people appreciated, but not necessarily the thing that people appreciate the most. I think the features—Corey: Well, no, I have to do the same thing to make sure something fits into a Lambda deployment package. The scale changes, the problem changes, but somehow everything sort of rhymes with history. Amir: Oh, yeah. I hope you don't have to disassemble code to do that, though, because that's, uh… I mean, it was fun. It was just a lot. Corey: I have to ask, how much work went into building your cdk-github-runners as far as getting it to a point of just working out the door? 
Because I look at that and it feels like there's—like, the early versions, yeah, there wasn't a whole bunch of code tied to it, but geez, the iterative, “How exactly does this ridiculous Step Functions API work or whatnot,” feels like I'm looking at weeks of frustration. At least it would have been for me. Amir: Yeah, yeah. I mean, it wasn't, like, a day or two. It was definitely not—but it was not years, either. I've been working on it I think about a year now. Don't quote me on that. But I've put a lot of time into it. So, you know, like you said, the skeleton code is pretty simple: it's a step function, which as we said, takes a long time to get right. Step Functions, they are really nice, but their definition language is not very straightforward. But beyond that, right, once that part worked, it worked. Then came all the bug reports and all the little corner cases, right? We—Corey: Hell is other people's use cases. Always is. But that's honestly better than a lot of folks wind up experiencing where they'll put an open-source project up and no one ever knows. So, getting users is often one of the biggest barriers to a lot of this stuff. I've found countless hidden gems lurking around on GitHub with a very particular search for something that no one had ever looked at before, as best I can tell. Amir: Yeah. Corey: Open-source is a tricky thing. There needs to be marketing brought into it, there needs to be storytelling around it, and it has to actually—dare I say—solve a problem someone has. Amir: I mean, I have many open-source projects like that, that I find super useful, that I created for myself, but no one knows. I think cdk-github-runners, I'm pretty sure people know about it only because you talked about it on Screaming in the Cloud or your newsletter. And by the way, thank you for telling me that you talked about it last week at the conference because now we know why there was a spike [laugh] all of a sudden. People Googled it. Corey: Yeah. 
I put links to it as well, but it's the, yeah, I use this a lot and it's great. I gave a crappy explanation on how it works, but that's the trick I've found between conference talks and, dare I say, podcast episodes: you give people a glimpse and a hook and tell them where to go to learn more. Otherwise, you're trying to explain every nuance and every intricacy in 45 minutes. And you can't do that effectively in almost any case. All you're going to do is drive people away. Make it sound exciting, get them to see the value in it, and then let them go. Amir: You have to explain the market for it, right? That's it. Corey: Precisely. Amir: And I got to say, I somewhat disagree with your—or I have a different view when you say that, you know, open-source projects need marketing and all those things. It depends on what open-source is for you, right? I don't create open-source projects so they are successful, right? It's obviously always nicer when they're successful, but—and I do get that cycle of happiness that, like I was saying, people create bugs and I have to fix them and stuff, right? But not every open-source project needs to be a success. Sometimes it's just fun. Corey: No. When I talk about marketing, I'm talking about exactly what we're doing here. I'm not talking take out an AdWords campaign or something horrifying like that. It's: you build something that solves a problem for someone. The big problem that worries me about these things is how do you not lose sleep at night about the fact that you solved someone's problem and they don't know that it exists? Because that drives me nuts. I've lost count of the number of times I've been beating my head against a wall and asked someone like, “How would you handle this?” Like, “Oh, well, what's wrong with this project?” “What do you mean?” “Well, this project seems to do exactly what you want it to do.” And no one has it all stuffed in their head. 
But yeah, then it seems like open-source becomes a little more corporatized and it becomes a lead gen tool for people to wind up selling their SaaS services or managed offerings or the rest.Amir: Yeah.Corey: And that feels like the increasing corporatization of open-source that I'm not a huge fan of.Amir: Yeah. I mean, I'm not going to lie, right? Like, part of why I created this—or I don't know if it was part of it, but like, I had a dream that, you know, I'm going to get, oh, tons of GitHub sponsors, and everybody's going to use it and I can retire on an island and just make money out of this, right? Like, that's always a dream, right? But it's a dream, you know?And I think bottom line open-source is… just a tool, and some people use it for, like you were saying, driving sales into their SaaS, some people, like, may use it just for fun, and some people use it for other things. Or some people use it for politics, even, right? There's a lot of politics around open-source.I got to tell you a story. Back in the NSIS days, right—talking about politics—so this is not even about politics of open-source. People made NSIS a battleground for their politics. We would have translations, right? People could upload their translations. And I, you know, or other people that worked on NSIS, right, we don't speak every language of the world, so there's only so much we can do about figuring out if it's a real translation, if it's good or not.Back in the day, Google Translate didn't exist. Like, these days, we check Google Translate, we kind of ask a few questions to make sure they make sense. But back in the day, we did the best that we could. At some point, we got a patch for Catalan language, I'm probably mispronouncing it—but the separatist people in Spain, I think, and I didn't know anything about that. I was a young kid and… I just didn't know.And I just included it, you know? Someone submitted a patch, they worked hard, they wanted to be part of the open-source project. 
Why not? Sure, I included it. And then a few weeks later, someone from Spain wanted to change Catalan into Spanish to make sure that doesn't exist for whatever reason. And then they just started fighting with each other and started making demands of me. Like, you have to do this, you have to do that, you have to delete that, you have to change the name. And I was just so baffled by why would someone fight so much over a translation of an open-source project. Like, these days, I kind of get what they were getting at, right? Corey: But they were so bad at telling that story that it was just, like, so basically, “Screw you for helping,” is how it comes across. Amir: Yeah, screw you for helping. You're a pawn now. Just—you're a pawn unwittingly. Just do what I say and help me in my political cause. I ended up just telling both of them if you guys can't agree on anything, I'm just going to remove both translations. And that's what I ended up doing. I just removed both translations. And then a few months later—because we had a release every month, basically—I just added both of them back and I've never heard from them again. So, sort of problem solved. Peace in the Middle East? I don't know. Corey: It's kind of wild just to see how often that sort of thing tends to happen. It's a—I don't necessarily understand why folks are so opposed to other people trying to help. I think they feel like there's this loss of control as things are slipping through their fingers, but it's a really unwelcoming approach. One of the things that got me deep into the open-source ecosystem surprisingly late in my development was when I started pitching in on the SaltStack project right after it was founded, where suddenly everything I threw their way was merged, and then Tom Hatch, the guy who founded the project, would immediately fix all the bugs and stuff I put in and then push something else immediately thereafter. 
But it was such a welcoming thing. Instead of nitpicking me to death in the pull request, it just got merged in and then silently fixed. And I thought that was a classy way to do it. Of course, it doesn't scale and of course, it causes other problems, but I envy the simplicity of those days and just the ethos behind that. Amir: That's something I've learned the last few years, I would say. Back in the NSIS days, I was not like that. I nitpicked. I nitpicked a lot. And I can guess why, but it just—you create a patch—in my mind, right, like you create a patch, you fix it, right? But these days I get it; I've been on the other side as well, right? Like, I created patches for open-source projects and I've seen them just wither away and die, and then five years later, someone's like, “Oh, can you fix this line to have one instead of two, and then I'll merge it.” I'm like, “I don't care anymore. It was five years ago. I don't work there anymore. I don't need it. If you want it, do it.” So, I get it these days. And these days, if someone creates a patch—just yesterday, someone created a patch to format cdk-github-runners in VS Code. And they did it just, like, a little bit wrong. So, I just fixed it for them and I approved it and pushed it. You know, it's much better. You don't need to bug people for most of it. Corey: You didn't yell at them for having the temerity to contribute? Amir: My voice is so raw because I've been yelling for five days at them, yeah. Corey: Exactly, exactly. I really want to thank you for taking the time to chat with me about how all this stuff came to be and your own path. If people want to learn more, where's the best place for them to find you? Amir: So, I really appreciate you having me and driving all this traffic to my projects. If people want to learn more, they can always go to cloudsnorkel.com; it has all the projects. github.com/cloudsnorkel has a few more. And then my private blog is kichik.com. So, K-I-C-H-I-K dot com. 
I don't post there as much as I should, but it has some interesting AWS projects from the past few years that I've done.Corey: And we will, of course, put links to all of that in the show notes. Thank you so much for taking the time. I really appreciate it.Amir: Thank you, Corey. It was really nice meeting you.Corey: Amir Szekely, owner of CloudSnorkel. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice, along with an insulting comment. Heck, put it on all of the podcast platforms with a step function state machine that you somehow can't quite figure out how the API works.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.
Who is the king of all databases when it comes to performance? Yes, Redis! Of course! In this episode of AWS Bites, we talk about Redis on ElastiCache, one of the most essential instruments in the cloud architect's toolbox. We explore the joys and woes of Redis on AWS and share some exciting alternatives regarding in-memory databases and caching systems. We discuss the use cases of Redis, including session storage, web page caching, database cache, cost optimization, queues and pub/sub messaging, and distributed application state. We extensively talk about ElastiCache, the managed cache solution on AWS based on either Redis or Memcached, and its features such as replication groups, auto-scaling, and monitoring. Finally, we discuss potential alternatives, such as DynamoDB (with DAX), Upstash, or Momento, a serverless cache built on Pelikan.
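The "database cache" use case mentioned here is usually implemented as the cache-aside pattern. Below is a minimal, self-contained Python sketch of it: an in-memory `TTLCache` stands in for Redis (its `get`/`setex` methods mirror the familiar Redis commands), and `db_fetch` is a hypothetical slow database call, not any particular library's API.

```python
import time

class TTLCache:
    """In-memory stand-in for Redis, supporting per-key TTLs."""
    def __init__(self, clock=time.monotonic):
        self.clock = clock
        self.store = {}  # key -> (value, expires_at)

    def get(self, key):
        item = self.store.get(key)
        if item is None:
            return None
        value, expires_at = item
        if self.clock() >= expires_at:
            del self.store[key]  # lazily evict expired entries
            return None
        return value

    def setex(self, key, ttl, value):
        # Mirrors Redis SETEX: set a value with an expiry in seconds.
        self.store[key] = (value, self.clock() + ttl)

def get_user(cache, db_fetch, user_id, ttl=300):
    """Cache-aside: try the cache, fall back to the database, then populate."""
    key = f"user:{user_id}"
    value = cache.get(key)
    if value is None:
        value = db_fetch(user_id)    # slow path: hit the real database
        cache.setex(key, ttl, value)
    return value
```

With a real Redis client the structure is identical; only the `TTLCache` is swapped for the client, which is what makes the pattern easy to test locally.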
Welcome to the newest episode of The Cloud Pod podcast! Justin, Ryan and Matthew are your hosts this week as we discuss all the latest news and announcements in the world of the cloud and AI. Do people really love Matt's Azure know-how? Can Google make Bard fit into literally everything they make? What's the latest with Azure AI and their space collaborations? Let's find out! Titles we almost went with this week: Clouds in Space, Fictional Realms of Oracles, Oh My. The cloudpod streams lambda to the cloud A big thanks to this week's sponsor: Foghorn Consulting, provides top-notch cloud and DevOps engineers to the world's most innovative companies. Initiatives stalled because you have trouble hiring? Foghorn can be burning down your DevOps and Cloud backlogs as soon as next week.
Two brothers discussing all things AWS every week. Hosted by Andreas and Michael Wittig presented by cloudonaut.
About Andy

Andy is on a lifelong journey to understand, invent, apply, and leverage technology in our world. Both personally and professionally, technology is at the root of his interests and passions. Andy has always had an interest in understanding how things work at their fundamental level. In addition to figuring out how something works, the recursive journey of learning about enabling technologies and underlying principles is a fascinating experience which he greatly enjoys. The early Internet afforded tremendous opportunities for learning and discovery. Andy's early work focused on network engineering and architecture for regional Internet service providers in the late 1990s – a time of fantastic expansion on the Internet. Since he joined Akamai in 2000, the company has afforded countless opportunities for learning and curiosity through its practically limitless globally distributed compute platform. Throughout his time at Akamai, Andy has held a variety of engineering and product leadership roles, resulting in the creation of many external and internal products, features, and intellectual property. Andy's role today at Akamai – Senior Vice President within the CTO Team – offers broad access and input to the full spectrum of Akamai's applied operations – from detailed patent filings to strategic company direction. Working to grow and scale Akamai's technology and business from a few hundred people to roughly 10,000 with a world-class team is an amazing environment for learning and creating connections. Personally, Andy is an avid adventurer, observer, and photographer of nature, marine, and astronomical subjects. Hiking, typically in the varied terrain of New England, with his family is a common endeavor. 
He enjoys compact/embedded systems development and networking with a view towards their applications in drone technology. Links Referenced: Macrometa: https://www.macrometa.com/ Akamai: https://www.akamai.com/ LinkedIn: https://www.linkedin.com/in/andychampagne/

Transcript

Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud. Corey: Forget everything you know about SSH and try Tailscale. Imagine if you didn't need to manage PKI or rotate SSH keys every time someone leaves. That'd be pretty sweet, wouldn't it? With Tailscale SSH, you can do exactly that. Tailscale gives each server and user device a node key to connect to its VPN, and it uses the same node key to authorize and authenticate SSH. Basically, you're SSHing the same way you manage access to your app. What's the benefit here? Built-in key rotation, permissions as code, connectivity between any two devices, reduced latency, and there's a lot more, but there's a time limit here. You can also ask users to reauthenticate for that extra bit of security. Sounds expensive? Nope, I wish it were. Tailscale is completely free for personal use on up to 20 devices. To learn more, visit snark.cloud/tailscale. Again, that's snark.cloud/tailscale. Corey: Managing shards. Maintenance windows. Overprovisioning. ElastiCache bills. I know, I know. It's spooky season and you're already shaking. It's time for caching to be simpler. Momento Serverless Cache lets you forget the backend to focus on good code and great user experiences. With true autoscaling and a pay-per-use pricing model, it makes caching easy. 
No matter your cloud provider, get going for free at gomomento.co/screaming. That's GO M-O-M-E-N-T-O dot co slash screaming. Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. I like doing promoted guest episodes like this one. Not that I don't enjoy all of my promoted guest episodes. But every once in a while, I generally have the ability to wind up winning an argument with one of my customers. Namely, it's great to talk to you folks, but why don't you send me someone who doesn't work at your company? Maybe a partner, maybe an investor, maybe a customer. And Macrometa, who's sponsoring this episode, said okay. My guest today is Andy Champagne, SVP at the CTO office at Akamai. Andy, thanks for joining me. Andy: Thanks, Corey. Appreciate you having me. And appreciate Macrometa letting me come. Corey: Let's start with talking about you, and then we'll get around to the Macrometa discussion in the fullness of time. You've been at Akamai for 22 years, which in tech company terms is like staying at a normal job for 75 years. What's it been like being in the same place for over two decades? Andy: Yeah, I've got several gold watches. I've been retired twice. Nobody—you know, Akamai—so in the late '90s, I was in the ISP universe, right? So, I was in network engineering at regional ISPs, you know, kind of cutting teeth on, you know, trying to scale networks and deal with the flux of user traffic coming in from the growth of the web. And, you know, frankly, it wasn't working, right? Companies were trying to scale up at the time by adding bigger and bigger servers, and buying, literally, you know, servers the size of refrigerators. And all of a sudden, there was this company that was coming together out in Cambridge—I'm from Massachusetts, and Akamai started in Cambridge, Massachusetts, still headquartered there—and Akamai was forming up and they had a totally different solution to how to solve this, which was amazing. 
And it was compelling and it drew me there, and I am still there, 22-odd years in, trying to solve challenging problems.Corey: Akamai is one of those companies that I often will describe to people who aren't quite as inclined in the network direction as I've been previously, as one of the biggest companies of the internet that you've never heard of. You are—the way that I think of you historically, I know this is not how you folks frame yourself these days, but I always thought of you as the CDN that you use when it really mattered, especially in the earlier days of the internet where there were not a whole lot of good options to choose from, and the failure mode that Akamai had when I was looking at it many years ago, is that, well, it feels enterprise-y. Well, what does that mean exactly because that's usually used as a disparaging term by any developer in San Francisco. What does that actually unpack to? And to my mind, it was, well, it was one of the more expensive options, which yes, that's generally not a terrible thing, and also that it felt relatively stodgy, for lack of a better term, where it felt like updating things through an API was more of a JSON API—namely a guy named Jason—who would take a ticket, possibly from Jira if they were that modern or not, and then implement it by hand. I don't believe that it is quite that bad these days because, again, this was circa 2012 that we're talking here. But how do you view what Akamai is and does in 2022?Andy: Yeah. Awesome question. There's a lot to unpack in there, including a few clever jabs you threw in. But all good.Corey: [laugh].Andy: [laugh]. I think Akamai has been through a tremendous, tremendous series of evolutions on the internet. And really the one that, you know, we're most excited about today is, you know, earlier this year, we kind of concluded our acquisition of Linode. 
And if we think about Linode, which brings compute into our platform, you know, ultimately Akamai today is a compute company that has a security offering and has a delivery offering as well. We do more security than delivery, so you know, delivery is kind of something that was really important during our first ten or twelve years, and security during the last ten, and we think compute during the next ten. The great news there is that if you look at Linode, you can't really find a more developer-focused company than Linode. You essentially fall into a virtual machine—you may set up a virtual machine accidentally, it's so easy. And that is how we see the interface evolving. We see a compute-centric interface becoming standard for people as time moves on. Corey: I'm reminded of one of those ancient advertisements—I forget, I think it would have been Sun that put it out—where the network is the computer, or the computer is the network. The idea that a computer sitting by itself, unplugged, was basically just this side of useless, whereas a bunch of interconnected computers was incredibly powerful. That, today in 2022, sounds like an extraordinarily obvious statement, but it feels like this is sort of a natural outgrowth of that, where, okay, you've wound up solving the CDN piece of it pretty effectively. Now, you're expanding out into, as you say, compute through the Linode acquisition and others, and the question I have is, is that because there's a larger picture that's currently unfolding, or is this a scenario of, well, we nailed the CDN side of the world; on that side of the universe, there are no new worlds left to conquer. Let's see what else we can do. Next, maybe we'll start making toasters. Andy: Bunch of bored guys in Cambridge, and we're just like, “Hey, let's go after compute. We don't know what we're doing.” No. There's a little bit more—Corey: Exactly. “We have money and time. 
Let's combine the two and see what we can come up with.”Andy: [laugh]. Hey, folks, compute: it's the new thing. No, it's more than that. And you know, Akamai has a very long history with the edge, right? And Akamai started—and again, arrogantly saying, we invented the concept of the edge, right, out there in '99, 2000, deploying hundreds and then to thousands of different locations, which is what our CDN ran on top of.And that was a really new, novel concept at the time. We extended that. We've always been flirting with what is called edge computing, which is how do we take pieces of application logic and move them from a centralized point and move them out to the edge. And I mean, cripes, if you go back and Google, like, ‘Akamai edge computing,' we were working on that in 2003, which is a bit like ancient history, right? And we are still on a quest.And literally, we think about it in the company this way: we are on a quest to make edge computing a reality, which is how do you take applications that have centralized chokepoints? And how do you move as much of those applications as possible out to the edge of the network to unblock user performance and experience, and then see what folks developers can enable with that kind of platform?Corey: For me, it seems that the rise of AWS—which is, by extension, the rise of cloud—has been, okay, you wind up building whatever you want for the internet and you stuff it into an AWS region, and oh, that's far away from your customers and/or your entire architecture is terrible so it has to make 20 different calls to the data center in series rather than in parallel. Great, how do we reduce the latency as much as possible? And their answer has largely seemed to be, ah, we'll build more regions, ever closer to you. One of these days, I expect to wake up and find that there's an announcement that they're launching a new region in my spare room here. It just seems to get closer and closer and closer. 
You look around, and there's a cloud construction crew stalking you to the mall and whatnot. I don't believe that is the direction that the future necessarily wants to be going in. Andy: Yeah, I think there's a lot there. And I would say it this way, which is, you know, having two-ish dozen uber-large data centers is probably not the peak technology of the internet, right? There's more we need to do to be able to get applications truly distributed. And, you know, just to be clear, I mean, Amazon AWS has done amazing stuff, they've achieved phenomenal scale and they continue to do so. You know, but at Akamai, the problem we're trying to solve is really different than how do we put a bunch of stuff in a small number of data centers? It's, you know, obviously, there's going to be a centralized aspect, but there also needs to be an incredibly integrated and seamless move through a gradient of compute, where hey, maybe you're in a very large data center for your AI/ML, kind of, you know, offline data lake type stuff. And then maybe you're in hundreds of locations for mid-tier application processing, and, you know, reconciliation of databases, et cetera. And then all the way out at the edge, you know, in thousands of locations, you should be there for user interactivity. And when I say user interactivity, I don't just mean, you know, read-only, but you've got to be able to do a read-write operation in synchronous fashion with the edge. And that's what we're after: ultimately building a platform for that and looking at tools, technology, and people along the way to help us with it. Corey: I've built something out, my lasttweetinaws.com threading Twitter client, and that's… it's fine. It's stateless, but it's a little too intricate to effectively run in the Lambda@Edge approach, so using their CloudFront offering is simply a non-starter. 
So, in order to get low latency for people using it around the world, I now have to deploy it simultaneously to 20 different AWS regions. And that is, to be direct, a colossal pain in the ass. No one is really doing stuff like that, that I can see. I had to build a whole lot of custom tooling just to get a CI/CD system up and working. Their strong regional isolation is great for containing blast radii, but obnoxious when you're trying to get something deployed globally. It's not the only way. Combine that with the reality that ingress data transfer to any of their regions is free—generally—but sending data to the internet is a jewel beyond price because all my stars, that is egress bandwidth; there is nothing more valuable on this planet or any other. And that doesn't quite seem right. Because if that were actually true, a whole swath of industries and apps would not be able to exist. Andy: Yeah, you know, Akamai, a huge part of our business is effectively distributing egress bandwidth to the world, right? And that is a big focus of ours. So, when we look at customers that are well positioned to do compute with Akamai, candidly, the filtering question that I typically ask customers is, “Hey, do you have a highly distributed audience that you want to engage with, you know, a lot of interactivity or you're pushing a lot of content, video, updates, whatever it is, to them?” And that notion of highly distributed applications that have high egress requirements is exactly the sweet spot that we think Akamai has, you know, just a great advantage with, between our edge platform that we've been working on for the last 20-odd years and obviously, the platform that Linode brings into the conversation. Corey: Let's talk a little bit about Macrometa. Andy: Sure. Corey: What is the nature of your involvement with those folks? 
Because it seems like you sort of crossed into a whole bunch of different areas simultaneously, which is fascinating and great to see, but to my understanding, you do not own them.Andy: No, we don't. No, they're an independent company doing their thing. So, one of the fun hats that I get to wear at Akamai is, I'm responsible for our Akamai Ventures Program. So, we do our corporate investing and all this kind of thing. And we work with a wide array of companies that we think are contributing to the progression of the internet.So, there's a bunch of other folks out there that we work with as well. And Macrometa is on that list, which is we've done an investment in Macrometa, we're board observers there, so we get to sit in and give them input on, kind of, how they're doing things, but they don't have to listen to us since we're only observers. And we've also struck a preferred partnership with them. And what that means is that as our customers are building solutions, or as we're building solutions for our customers, utilizing the edge, you know, we're really excited and we've got Macrometa at the table to help with that. And Macrometa is—you know, just kind of as a refresher—is trying to solve the problem of distributed data access at the edge in a high-performance and almost non-blocking, developer-friendly way. And that is very, very exciting to us, so that's the context in which they're interesting to our continuing evolution of how the edge works.Corey: One of the questions I always like to ask, and it's usually not considered a personal attack when I asked the question—Andy: Oh, good.Corey: But it's, “Describe what the company does.” Now, at some places like the latter days of Yahoo, for example, it's very much a personal attack. But what is it that Macrometa does?Andy: So, Macrometa provides a worldwide, high-speed distributed database that is resident on what today, you could call the edge of the network. 
And the advantage here is, instead of having one SQL server sitting somewhere, or what you would call a distributed SQL Server, which is two SQL Servers sitting next to one another, Macrometa has a high-speed data store that allows you to, instead of having that centralized SQL Server, have it run natively at the edge of the network. And when you're building applications that run on the edge or anywhere, you need to try to think about how do you have the data as close to the user or to the access point as possible. And that's the problem Macrometa is after and that's what their products today solve. It's an incredibly bright team over there, a fantastic founder-CEO team, and we're really excited to be working with them. Corey: It wasn't intentionally designed this way as a setup when I mentioned a few minutes ago, but yeah, my Twitter client works across the 20-some-odd AWS regions, specifically because it's stateless. All of the state, other than a couple of API keys at provision time, winds up living in the user's browser. If this was something that needed to retain state in any way, like, you know, basically every real application under the sun, this strategy would absolutely not work unless I wound up with some heinous form of circular replication, and then you wind up with a single region going down and everything explodes. Having a cohesive, coherent data layer that spans all of that is key. Andy: Yeah, and you're on to the classical, you know, CompSci issue here around edge, which is if you have 100 edge regions, how do you have consistent state storage between applications running on N of those? And that is the problem Macrometa is after, and, you know, Akamai has been working on this and other variants of the edge problem for some time. We're very excited to be working with the folks at Macrometa. It's a cool group of folks. And it's an interesting approach to the technology. 
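The "consistent state storage across N edge regions" problem Andy names has a whole family of textbook answers. One of the simplest is a last-writer-wins register, sketched below purely to illustrate the shape of the problem; this is a toy, not how Macrometa (or Akamai) actually implements anything, and the node IDs and timestamps are invented for the example.

```python
class LWWRegister:
    """Toy last-writer-wins register: each replica keeps the value with the
    highest (timestamp, node_id) stamp, so any two replicas that exchange
    state converge to the same value."""
    def __init__(self, node_id):
        self.node_id = node_id
        self.value = None
        self.stamp = (0, node_id)  # (logical timestamp, node id as tiebreak)

    def write(self, value, timestamp):
        self.value, self.stamp = value, (timestamp, self.node_id)

    def merge(self, other):
        # Adopt the other replica's state only if its write is "newer".
        if other.stamp > self.stamp:
            self.value, self.stamp = other.value, other.stamp
```

Merging in either order converges both replicas; the price is that concurrent writes are resolved arbitrarily (by node ID), which is exactly the kind of trade-off a real edge data platform has to engineer around.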
And from what we've seen so far, it's been working great. Corey: The idea of how do I wind up having persistent, scalable state across a bunch of different edge locations is not just a hard computer science problem; it's also a hard cloud economics problem, given the cost of data transit in a bunch of different directions between different providers. It turns “How much does it cost?”, in most cases, into a question that can only be answered by, well, let's run it for a few days and find out. Which is not usually the best way to answer some questions. Like, “Is that power socket live?” “Let's touch it and find out.” Yeah, there are ways you learn that are extraordinarily painful. Andy: Yeah no, nobody should be doing that with power sockets. I think this is one of these interesting areas, which is this is really right in Akamai's backyard but it's not realized by a lot of folks. So, you know, Akamai has, for the last 20-odd years, been all about how we egress as much as possible to the entire internet. The weird areas, the big areas, the small areas, the up-and-coming areas, we serve them all. And in doing that, we've built a very large global fabric network, which allows us to get between those locations at a very low cost because we have to move our own content around. And hooking those together, having an essentially private network fabric that hooks the vast majority of our big locations together and then having very high-speed egress out of all of the locations to the internet, you know, that's been how we operate our business at scale effectively and economically for years, and utilizing that for compute, data replication, and data synchronization tasks is what we're doing. Corey: There are a lot of different solutions that could be used to solve a lot of the persistent data layer question. For example, when you had to solve a similar problem with compute, you had a few options in front of you. 
Well, we could buy a whole bunch of computers and stuff them in a rack somewhere because, eh, cloud; how hard could it be? Saner heads prevailed, and no, no, no, we're going to buy Linode, which was honestly a genius approach on about three different levels, and I'm still unconvinced the industry sees that for the savvy move that it was. I'm confident that'll change in time. Why not build it yourself? Or alternately, acquire another company that was working on something similar? Instead, you're an investor in a company that's doing this effectively, but not buying them outright? Andy: Yeah, you know, and I think that's—Akamai is beyond, at this point, thinking that it's just about ownership, right? I think that this—we don't have to own everything in order to have a successful ecosystem. You know, certainly, we're going to want to own key parts of it and that's where you saw the Linode acquisition, where we felt that was kind of core. But ultimately, we believe in promoting customer choice here. And there's a pretty big role we think we can play helping companies such as folks like Macrometa, where they have, you know, really interesting technology, but they can use leverage, they can use some of our go-to-market, they can use, you know, some of our, you know, kind of guidance and expertise on running a startup—which, by the way, is not an easy job for these folks—and that's what we're there to do. So, with things like Linode, you know, we want to bring it in, and we want to own it because we think it's just so compelling, and it fits so well with where we want to go. With folks like Macrometa, you know, that's still a really young area. I mean, you know, Linode was in business for many, many, many years and was a good-sized business, you know, before we bought them. Corey: Yeah, there's something to be said for letting the market shake something out rather than having to do it all yourself as trailblazers. 
I'm a big believer in letting other companies do things. I mean, one of the more annoying things, from my position, is this idea where AWS takes a product strategy of, “Yes.” That becomes a bit of a challenge when they're trying to wind up building compete decks, and how do we defeat the competition? And it's like, “Wh—oh, you're talking about the other hyperscalers?” “No, we're talking with the service team one floor away.” That just seems a little on the strange side to me—some companies get too big and too expensive on some level. I think that there's a very real risk of Akamai trying to do everything on the internet if you continue to expand and start listing out things that are not currently in your portfolio. And, oh, we should do that, too, and we should do that, too, and we should do that, too. And suddenly, it feels pretty closely aligned with you're trying to do everything. Andy: Yeah. I think we've been a company that has been really disciplined about not doing everything. You know, we started with CDN. And you know, we're talking '98 to 2010, you know, CDN was really our thing, and we feel we executed really well on that. We probably went about it quite quietly, but we feel we executed pretty well. Really from 2010, 2012 to 2020, it was all about security, right? And, you know, we built, you know, a pretty amazing security business, a hundred percent SaaS business, on top of our CDN platform with security. And now we're thinking about—we did that route relatively quietly, as well, and now we're thinking about the next ten years and how do we have that same kind of impact on cloud. And that is exciting because it's not just centralized cloud; it's about a distributed cloud vision. 
And that is really compelling and that's why, you know, we've got great folks that are still here and working on it. Corey: I'm a big believer in the idea that you can start getting distilled truth out of folks, particularly companies, the more you compress the space they have in which to say it. That's part of why Twitter very often lets people tip their hands. But a common place that I look is the title field on a company's website. So, when I go over to akamai.com, you position yourself as something that fits in a small portion of a tweet, which is good. Whenever there's a Tolstoy-length paragraph in the tooltip title for the browser tab, that's a problem. But you say simply, “Security, cloud delivery, performance. Akamai.” Which is beautifully well done, but security comes first. I have a mental model of Akamai as being a CDN and some other stuff that I don't fully understand. But again, I first encountered you folks in the early 2000s. It turns out that it's hard to change existing opinions. Are you a CDN company or are you a security company? Andy: Oh, super— Corey: In other words, if someone winds up mis-alphabetizing that, are they about to get censured after this show because, “No, we're a CDN first; why did you put security first?” Andy: You know, so all those things feed off each other, right? And this has been a question where it's like, you know, our security layer and our distributed WAF and other security offerings run on top of the CDN layer. So, it's all about building a common compute edge and then leveraging that for new applications. CDN was the first application. The second application was security. And we think the third application, but probably not the final one, is compute. So, I don't think anyone in marketing will be fired over the ordering that they did on that. I think that ultimately now, you know, for—just if we look at it from a monetary perspective, right, we do more security than we do CDN. 
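Corey's title-field heuristic is easy to turn into a script. A minimal sketch using only the standard library; the HTML string here is a stand-in rather than a live fetch of akamai.com, and the 70-character "tweet budget" threshold is an arbitrary choice for illustration:

```python
# Check whether a page's <title> fits in "a small portion of a tweet."
from html.parser import HTMLParser

class TitleParser(HTMLParser):
    """Collect the text inside the first <title> element."""
    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data

# Stand-in HTML; for a real page you could feed in urllib.request output.
html = "<html><head><title>Security, Cloud Delivery, Performance | Akamai</title></head></html>"
parser = TitleParser()
parser.feed(html)

TWEET_BUDGET = 70  # arbitrary "small portion of a tweet" threshold
verdict = "concise" if len(parser.title) <= TWEET_BUDGET else "Tolstoy-length"
print(f"{parser.title!r}: {verdict}")
```

The point of the heuristic survives automation: a positioning statement that overflows the budget is a signal the company cannot compress what it does.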
So, there's a lot that we have in the security business. And you know, compute's got a long way to go, especially because it's not just one big data center of compute; it is a different flavor than I think folks have seen before. Corey: When I was at RSA, you folks were one of the exhibitors there. And I like to make the common observation that there are basically six companies that exhibit at RSA. Yeah, there are hundreds of booths, but it's the same six products, all marketed under different logos with different words. And they all seem to approach it from a few relatively predictable personas and positions. I've always found myself agreeing with the things that you folks say, and maybe it's because of my own network-centric background, but it doesn't seem like you take the same approach that a number of other companies do, where it's, “Oh, it has to start with the way that developers write their first line of code.” Instead, it seems to take a holistic view that comes from the starting position of everything talking to each other on a network basis, and from here, let's move forward. Is that accurate to how you view the security space? Andy: Yeah, you know, our view of the security space is—again, it's a network-centric one, right? And our work in the security space initially came from really big DDoS attacks, right? And how do we stop Distributed Denial of Service attacks from impacting folks? And that was the initial benefit that we brought. And from there, we evolved our story around, you know, how do we have a more sophisticated WAF? How do we have predictive capabilities at the edge? So ultimately, we're not about ingraining ourselves into the process of how your thing was written or telling you how to write it. 
We're about, you know, essentially being that perimeter edge that is watching and monitoring everything that comes into you to make sure that, you know, hey, we're not seeing Log4j-type exploits coming at you, and we'll let you know if we do, or to block malicious activity. So, we fit on anything, which is why our security business has been so successful. If you have an application on the edge, you can put Akamai Security in front of it and it's going to make your application better. That's been super compelling for the last, you know, again, last decade or so that we've really been focused on security. Corey: I think that it is a mistake to take a security model that starts with a view of what people have in front of them day-to-day—like, I look at my laptop and say, “Oh, this is what I spend my time on. This is where all security must start and stop.” Because yeah, okay, great. If you get physical access to my laptop, it's pretty much game over on some level. But yeah, if you're at a point where you're going to bust into my house and threaten me in order to get access to my laptop, here you go. There are no secrets that I am in possession of that are worth dying for. It's just money and that's okay. But look at it through the lens of the internet having gone from a science experiment, to a thing the nerds love to use, to a cornerstone of the fabric of modern society. And that's not because of the magic supercomputer that we all have in our pockets, but rather because those magic supercomputers can talk to the sum total of human knowledge and any other human anywhere on the planet, basically, ever. And I don't know that that evolution has been really appreciated by society at large as far as just how empowering that can be. But it completely changes the entire security paradigm from back in the '80s when I got started: don't put untrusted floppy disks into your computer or it might literally explode on your desk. Andy: [laugh]. So, we're talking about floppy disks now? 
Yes. So, first of all, the scope of impact of the internet has increased, meaning what you can do with it has increased. And directly proportional to that increase, the threat vectors have increased, right? And the more systems are connected, the more vulnerabilities there are. So listen, it's easy to scare anybody about security on the internet. It is a topic that is an infinite well of scariness. At the same time, you know—and not just Akamai, but there's a lot of companies out there that can help, whether it's making your development more secure, making your pipeline, your digital supply chain more secure, or then, you know, where Akamai is, we're at the end, which is, you know, helping to wrap around your entire web presence to make it more secure—there's a variety of companies that are out there really making the internet work from a security perspective. And honestly, there's also been tremendous progress on the operating system front in the last several years, which previously was not as good—probably is the way to characterize it—as it is today. So, you know, at the end of the day, the nerds are still out there working, right? We are out here still working on making the internet, you know, scale better, making it more secure, making it more robust because we're probably not done, right? You know, phones are awesome, and tablet devices, et cetera, are awesome, but we've probably got more coming. We don't quite know what that is yet, but we want to have the capacity, safety, and compute to power it. Corey: How does Macrometa as a persistent data layer tie into your future vision of security first as what Akamai does? I can see a few directions, but I'm going to go out on a limb and guess that before you folks decided to make an investment in such a thing, you probably gave it more than the 30 seconds or so of thought that I've had to wind up putting these pieces together. Andy: So, a few things there. 
First of all, Macrometa, ultimately, we see them coming in the front door with our compute solution, right? Because as folks are building capabilities on the edge, “Hey, I want to run compute on the edge. How do I interoperate with data?” The worst answer possible is, “Well, call back to the centralized data store.” So, we want to ensure that customers have choice and performance options for distributed data access. Macrometa fits great there. However, now pause that; let's transition back to the security point you raised, which is, you know, coordinating an edge data security platform is a really complicated thing. Because you want to make sure that threats that are coming in on one side of the network, or you know, in one given country, you know, are also understood throughout the network. And there's a definite role for a data platform in doing that. We obviously, you know, for the last ten years have built several that help accomplish that at scale for our network, but we also recognize that, you know, innovation in data platforms is probably not done. And you know, Macrometa's got some pretty interesting approaches. So, we're very interested in working with them and talking jointly with customers, which we've done a bunch of, to see how that progresses. But there's tie-ins, I would say, mostly on compute, but secondarily, there's a lot of interesting areas with real-time security intel that can be very useful as well. Corey: Since I have you here, I would love to ask you something that's a little orthogonal to the rest of this conversation, but I don't even care about that because that's why it's my show; I can ask what I want. Andy: Oh, no. Corey: Talk to me a little bit about the Linode acquisition. Because when it first came out, I thought, “Oh, Linode must not be doing well, so it's an acqui-hire scenario.” Followed by, “Wait a minute, that doesn't seem quite right.” And I dug deeper, and suddenly, I started to see a bunch of things that made sense. 
But that's just my outside perspective. I prefer to see you justify what it is that you've done. Andy: Justify what we've done. Well, with that positive framing— Corey: Exactly. “Explain yourself. How dare you, sir?” Andy: [laugh]. “What are you doing?” So, to take that: first of all, Linode was doing great when we bought them and they're continuing to do great now. You know, the backstory here is actually a fun one. So, I personally have been a customer of Linode for about 13 years, and you know, super familiar with their offerings, as were a bunch of other folks at Akamai. And what ultimately attracted us to Linode was, first of all, from a strategic perspective, is we talked about how Akamai thinks about compute being a gradient of compute: you've got the edge, you've got kind of a middle tier, and you've got more centralized locations. Akamai has the edge, we've got the middle, we didn't have the central. Linode has got the central. And obviously, you know, we're going to see some significant expansion of capacity and scale there, but they've got the central location. And, you know, ultimately, we feel that there's a lot of passion in Linode. You know, they're a Linux open-source-centric company, and believe it or not, Akamai is, too. I mean, you know, that's kind of how it works. And there was a great connection between the sorts of folks that they had and how they think about customers. Linode was a really customer-driven company. I mean, they were fanatical. I mean, I as a, you know, customer of $30 a month personally, could open a ticket and I'd get an answer in five minutes. And that's very similar to kind of how Akamai is driven, which is we're very customer-centric, and when a customer has a problem or needs something different, you know, we're on it. So, there's literally nothing bad there and it's a super exciting beginning of a new chapter for Akamai, which is really how do we tackle compute? We're super excited to have the Linode team. 
You know, they're still mostly down in Philadelphia doing their thing. And, you know, we've hired substantially and we're continuing to do so, so if you want to work there, drop a note over. And it's been fantastic. And it's one of our, you know, really large acquisitions that we've done, and I think we were really lucky to find a great company in such a good position and be able to make it work. Corey: From my perspective, one of the areas that has me excited about the acquisition stems from what I would consider to be something of a customer-base culture misalignment between the two companies. One of the things that I have always enjoyed about Linode—and in the interest of full transparency, they have been a periodic sponsor over the last five or six years of my ridiculous nonsense. I believe that they are not at the moment, which I expect you to immediately rectify after this conversation, of course. Andy: I'll give you my credit card. Yeah. Corey: Excellent. Excellent. We do not get in the way of people trying to give you money. But it was great because that's exactly it. I could take a credit card in the middle of the night and spin up things on Linode. And it was one of those companies that aligned very closely to how I tended to view cloud infrastructure from the perspective of, I need a Linux box, or I need a bunch of Linux boxes right there, right now, and I don't have 12 weeks to go to cloud school to learn the intricacies of a given provider. It more or less just worked in a whole bunch of easy ways. Whereas if I wanted to roll something out at Akamai, it was always: I would pull up the website, and it's, “Click here to talk to our enterprise sales team.” And that tells me two things. 
One, it is probably going to be outside of my signing authority because no one trusted me with money, for obvious reasons, when I was an employee, and two, you will not be going to space today because those conversations always take time. And it's going to be—if I'm in a hurry and trying to get something out the door, that is going to act as a significant drag on capability. Now, most of your customers do not launch things by the seat of their pants, three hours after the idea first occurs to them, but on Linode, that often seems to be the case. Addressing developers early on, in the ‘it's just an idea' phase, matters. I can't shake the feeling that there's a definite future in which Linode winds up being able to speak much more effectively to enterprise, while Akamai also learns to speak to, honestly, half-awake shitposters at 2 a.m. when we're building something heinous. Andy: I feel like you've been sitting in on our strategy presentations. Maybe not the shitposters, but the rest of it. And I think the way that I would couch it, my corporate-speak of that, would be that there's a distinct yin and yang, a complementary nature, between the customer bases of Akamai, which has, you know, an incredible list of enterprise customers—I mean, the who's-who of enterprise customers, Akamai works with them—but then, you know, Linode, who has really tremendous representation of developers—that's the term we'll use for the shitposters—like, folks like myself included, right, who want to throw something together, want to spin up a VM, and then maybe tear it down and never do it again, or maybe set up 100 of them. And, to your point, the crossover opportunities there, which is, you know, Linode has done a really good job of having small customers that grow over time. 
And by having Akamai, you know, you can now grow, and never have to leave, because we're going to be able to bring enough scale and throughput and, you know, professional services as you need them to help you stay in the ecosystem. And similarly, Akamai has a tremendous—you know, the benefit of a tremendous set of enterprise customers who are out there, you know, frankly, looking to solve their compute challenges, saying, “Hey, I have a highly distributed application. Akamai, how can you help me with this?” Or, “Hey, I need presence in x or y.” And now we have, you know, with Linode, the right tools to support that. And yes, we can make all kinds of jokes about, you know, Akamai and Linode and the different, you know, people and archetypes we appeal to, but ultimately, there's an alignment between Akamai and Linode on how we approach things, which is about Linux and open source, about technical honesty and simplicity. So, great group of folks. And secondly, like, I think the customer crossover, you're right on it. And we're very excited for how that goes. Corey: I also want to call out that Macrometa seems to have split this difference perfectly. One of the first things I visit on any given company's page when I'm trying to understand them is the pricing page. It's one of those areas where people spend the least time early on, but it's also where they tend to be the most honest. Maybe that's why. And I look for two things, and Macrometa has both of them. The first is a ‘try it for free, right now, get started' approach—a free tier. Because even if you charge $10 or whatnot, there are many developers working on things in odd hours where they don't necessarily either have the ability to make that purchase decision, know that they have the ability to make that purchase decision, or are willing to do that by the seat of their pants. So, ‘get started for free' is important; it means you can develop right now. 
Conversely, there are a bunch of enterprise procurement departments out there who will want a whole bunch of custom things. Custom SLAs, custom support responses, custom everything, and they also don't know how to sign a check that doesn't have two commas in it. So, you probably don't want to avoid those customers, but what they're looking for is an enterprise offering that has no listed price. There should not be a price tag on that because you will never get it right for everyone, but what they want to see is ‘click here to contact sales.' That is coded language for, “We are serious professionals and know who you are and how you like to operate.” They've got both, and I think that is absolutely the right decision. Andy: It do— Corey: And whatever you have in between those two is almost irrelevant. Andy: No, I think you're on it. And Macrometa, their pricing philosophy allows you to get in and try it with zero friction, which is super important. Like, I don't even have to use a credit card. I can experiment for free, I can try it for free, but then as I grow, their pricing tier kind of scales along with that. And it's a—you know, that is the way that folks try applications. I always try to think about, hey, you know, if I'm on a team and we're tasked with putting together a proof of concept for something in two days, and I've got, you know, a couple folks working with me, how do I do that? And you don't have time for procurement; you might need to use the free thing to experiment. So, there is a lot that they can do. And you know, their pricing—this transparency of pricing that they have—is fantastic. Now, Linode is also very transparent; we don't have a free tier, but you know, you can get in with very low friction and try it as well. Corey: Yeah, companies tend to go through a maturity-curve evolution on these things. I've talked to companies that purely view it as: how much money a given customer is spending determines how much attention they get. 
And it's like, “Yeah, maybe take a look through some of your smaller users or new signups there.” Yeah, they're spending $10 a month or whatnot, but their email address is @cocacola.com. Just spitballing here; maybe you might want to white-glove a few of those folks, just because not everyone comes in the door via an RFP. Andy: Yep. We look at customers for what their potential is, right? Like, you know, how much could you end up spending with us, right? You know, so if you're building your application on Linode, and you're going to spend $20 for the first couple months, that's totally fine. Get in there, experiment, and then, you know, in the next several years, let's see where it goes. So, you're exactly right, which is, you know, that username@enterprisedomain.com is often much more indicative than what the actual bill is on a monthly basis. Corey: I always find it a little strange when I have a vendor that I'm doing business with, and then suddenly, an account person reaches out, like, hey, let's just have a call for half an hour to talk about what you're doing and how you're doing it. My immediate response to that these days, just from too many years of doing this, is, “I really need to look at that bill. How much are we spending, again?” And honestly, it's usually not that much, because believe it or not, when you focus on cloud economics for a living, you pay attention to your credit card bills. But it is always interesting to see who reaches out and who doesn't. That's been a strange approach, and there is no one right answer for all of this. If every free-tier account user of any given cloud provider wound up getting constant emails from their account managers, it's how desperate are you to grow revenue, and what are you about to do to pricing? At some level, it becomes… unhelpful. Andy: I can see that. 
I've had, personally, situations where I'm a trial user of something, and all of a sudden I get emails—you know, using personal email addresses, no Akamai involvement—all of a sudden, I'm getting emails. And I'm like, “Really? Did I make the priority list for you to call me and leave me a voicemail, and then email me?” I don't know how that's possible. So, from a personal perspective, totally see that. You know, from an account development perspective, you know, kind of with the Akamai hat on, it's challenging, right? You know, folks are out there trying to figure out where business is going to come from. And I think if you're able to get an indicator that somebody, you know, maybe you're going to call that person at enterprisedomain.com to try to figure out, you know, hey, is this real and is this you with a side project or is this you with a proof of concept for something that could be more fruitful? And, you know, Corey, they're probably just calling you because you're you. Corey: One of the things that I was surprised by where I saw the exact same thing. I started getting a series of emails from my account manager for Google Workspaces. Okay, and then I really did a spit-take when I realized this was on my personal address. Okay… so I read this carefully because what the hell is happening? Oh, they're raising prices and it's a campaign. Great. Now, my one-user vanity domain is going to go from $6 a month to $8 a month or whatever. Cool, I don't care. This is not someone actively trying to reach out as a human being. It's an outreach campaign. Cool, fair. But that's the problem, on some level, for super-tiny customers. It's a, what is it, is it a shakedown? What are they about to yell at me for? Andy: No, I got the same thing. My Google Workspace personal account, which is, like, two people, right? Like, and I got an email and then I think, like, a voicemail. 
And I'm like, I read the email and I'm like—you know, it's going—again, it's like, it was like six something and now it's, like, eight something a month. So, it's like, “Okay. You're all right.” Corey: Just go—that's what you have a credit card for. Go ahead and charge it. It's fine. Now, yeah, counterpoint: if you're a large company, and yeah, we're just going to be raising prices by 20% across the board for everyone, and you look at this and like, that's a phone number. Yeah, I kind of want some special outreach and conversations there. But it's odd. Andy: It's interesting. Yeah. They're great. Corey: Last question before we call this an episode. In 22 years, how have you seen the market change from your perspective? Most people do not work in the industry from one company's perspective for as long as you have. That gives you a somewhat privileged position to see, from a point of relative stability, what the industry has done. Andy: So— Corey: What have you noticed? Andy: —and I'm going to give you an answer, which is about, like, the sales cycle, which is it used to be about meetings and about everybody coming together, and you used to have to occasionally wear a suit. And there would be, you know, meetings where you would need to get a CEO or CFO to personally see a presentation and decide something and say, “Okay, we're going with X or Y. We're going to make a decision.” And today, those decisions are, pretty far and wide, made much, much further down in the organization. They're made by developers, team leads, project managers, program managers. So, the way people engage with customers today is so different. First of all, like, most meetings are still virtual. I mean, like, yeah, we have physical meetings and we get together for things, but like, so much more is done virtually, which is cool because we built the internet so we wouldn't have to go anywhere, so it's nice that we got that landed. 
It's unfortunate that we had to go through Covid to get there, but ultimately, I think that purchasing decisions and technology decisions are distributed so much more deeply into the organization than they were. It used to be a, like, C-level thing. We're now seeing that stuff happen much further down in the organization. We see that inside Akamai and we see it with our customers as well. It's been, honestly, refreshing because you tend to be able to engage with technical folks when you're talking about technical products. And you know, the business folks are still there and they're helping to guide the discussions and all that, but it's a much better time, I think, to be a technical person now than it probably was 20 years ago. Corey: I would say that being a technical person has gotten easier in a bunch of ways; it's gotten harder in a bunch of ways. I would say that it has transformed. I was very opposed to the idea that, oh, as a sysadmin, why should I learn to write code? And in retrospect, it was because I wasn't sure I could do it and it felt like the rising tide was going to drown me. And in hindsight, yeah, it was the right direction for the industry to go in. But I'm also sensitive to folks who don't want to, midway through their career, pick up an entirely new skill set in order to remain relevant. I think that it is a lot easier to do some things. Back when Akamai started, it took an intimate knowledge of GCC compiler flags, in most cases, to host a website. Now, it is checking a box on a web page and you're done. Things have gotten easier. The abstractions continue to slip below the waterline, so the things we have to care about are getting more and more meaningful to the business. We're nowhere near our final form yet, but I'm very excited about how accessible this industry is to folks that previously would not have been able to enter it, while also disheartened by just how much there is to know. 
Otherwise, “Oh yeah, that entire aspect of the way that this core thing that runs my business, yeah, that's basically magic and we just hope the magic doesn't stop working, or we make a sacrifice to the proper God, which is usually a giant trillion-dollar company.” And the sacrifice is, of course, engineering time combined with money.Andy: You know, technology is all about abstraction layers, right? And I think—that's my view, right—and we've been spending the last several decades, not, ‘we' Akamai; ‘we' the technology industry—on, you know, coming up with some pretty solid abstraction layers. And you're right, like, the, you know, GCC—you know, -j6—you know, kind of compiler flags are not that important anymore; we could go back in time and talk about inetd, the first serverless. But other than that, you know, as we get to the present day, I think what's really interesting is you can contribute technically without being a super coding nerd. There's all kinds of different technical approaches today and technical disciplines that aren't just about development.Development is super important, but you know, frankly, the sysadmin skill set is more valuable today if you look at what SREs have become and how important they are to the industry. I mean, you know, those are some of the most critical folks in the entire piping here. So, don't feel bad for starting out as a sysadmin. I think that's my closing comment back to you.Corey: I think that's probably a good place to leave it. I really want to thank you for being so generous with your time.Andy: Anytime.Corey: If people want to learn more about how you see the world, where can they find you?Andy: Yeah, I mean, I guess you could check me out on LinkedIn. Happy to shoot me something there and happy to catch up. 
I'm pretty much read-only on social, so I don't pontificate a lot on Twitter, but—Corey: Such a good decision.Andy: Feel free to shoot me something on LinkedIn if you want to get in touch or chat about Akamai.Corey: Excellent. And of course, our thanks go as well to the fine folks at Macrometa who have promoted this episode. It is always appreciated when people wind up supporting this ridiculous nonsense that I do. My guest has been Andy Champagne, SVP at the CTO office over at Akamai. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice along with an insulting comment that will not post successfully because your podcast provider of choice wound up skimping out on a provider who did not care enough about a persistent global data layer.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.Announcer: This has been a HumblePod production. Stay humble.
About MikeBesides his duties as The Duckbill Group's CEO, Mike is the author of O'Reilly's Practical Monitoring, and previously wrote the Monitoring Weekly newsletter and hosted the Real World DevOps podcast. He was previously a DevOps Engineer for companies such as Taos Consulting, Peak Hosting, Oak Ridge National Laboratory, and many more. Mike is originally from Knoxville, TN (Go Vols!) and currently resides in Portland, OR.Links Referenced: Twitter: https://twitter.com/Mike_Julian mikejulian.com: https://mikejulian.com TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: This episode is brought to us in part by our friends at Datadog. Datadog is a SaaS monitoring and security platform that enables full-stack observability for modern infrastructure and applications at every scale. Datadog enables teams to see everything: dashboarding, alerting, application performance monitoring, infrastructure monitoring, UX monitoring, security monitoring, dog logos, and log management, in one tightly integrated platform. With 600-plus out-of-the-box integrations with technologies including all major cloud providers, databases, and web servers, Datadog allows you to aggregate all your data into one platform for seamless correlation, allowing teams to troubleshoot and collaborate together in one place, preventing downtime and enhancing performance and reliability. Get started with a free 14-day trial by visiting datadoghq.com/screaminginthecloud, and get a free t-shirt after installing the agent.Corey: Forget everything you know about SSH and try Tailscale. Imagine if you didn't need to manage PKI or rotate SSH keys every time someone leaves. 
That'd be pretty sweet, wouldn't it? With Tailscale SSH, you can do exactly that. Tailscale gives each server and user device a node key to connect to its VPN, and it uses the same node key to authorize and authenticate SSH.Basically, you're SSHing the same way you manage access to your app. What's the benefit here? Built-in key rotation, permissions as code, connectivity between any two devices, reduced latency, and there's a lot more, but there's a time limit here. You can also ask users to reauthenticate for that extra bit of security. Sounds expensive? Nope, I wish it were. Tailscale is completely free for personal use on up to 20 devices. To learn more, visit snark.cloud/tailscale. Again, that's snark.cloud/tailscale.Corey: Welcome to Screaming in the Cloud, I'm Corey Quinn and I'm having something of a crisis of faith based upon a recent conversation I've had with my returning yet again guest, Mike Julian, my business partner and CEO of The Duckbill Group. Welcome back, Mike.Mike: Hi, everyone.Corey: So, the revelation that had surfaced unexpectedly was, based upon a repeated talking point where I am a terrible employee slash expensive to manage, et cetera, et cetera, and you pointed out that you've been managing me for four years or so now, at which point I did a spit take, made all the more impressive by the fact that I wasn't drinking anything at the time, and realized, “Oh, my God, you're right, but I haven't had any of the usual problems slash friction with you that I have with basically every boss I've ever had in my entire career.” So, I'm spiraling. Let's talk about that.Mike: My recollection of that conversation is slightly different than yours. Mine is that you called me and said, “Mike, I just realized that you're my boss.” And I'm like, “How do you feel about that?” He's like, “I'm not really sure.”Corey: And I'm still not entirely sure how I feel if I'm being fully honest with you. Just because it's such a weird thing to have to deal with. 
Because historically, I always view a managerial relationship as starting from a place of a power imbalance. And that is the one element that is missing from our relationship. We each own half the company, we can fire each other, but it takes the form of tearing the company apart, and that isn't something that we're really set up to entertain.Mike: And you know, I actually think it's deeper than that because you owning the other half of the company is not really… it's not really power in itself. Like, yeah, it is, but you could easily own half the company and have no power. Because, like, really when we talk about power, we're talking about political power, influence, and I think the reason that there is no power imbalance is because each of us does something in the company that is just as important as the other. And they're both equally valuable to the company and we both recognize the other's contributions, as that, as being equally valuable to the company. It's less to do about how much we own and more about the work that we do.Corey: Oh, of course. The ownership starts and stops entirely with the fact that neither one of us can force the other out. So it's, as opposed to well, I own 51% of the company, so when I'm tired of your bullshit, you're leaving. And that is a dynamic that's never entered into it. I'm also going to add one more thing onto what you just said, which is, both of us would sooner tear off our own skin than do the other's job.Mike: Yeah. 
God, I would hate to do your job, but I know you'd hate to do mine.Corey: You look at my calendar on a busy meeting day and you have a minor panic attack just looking at it where, “Oh, my God, talking to that many people.” And you are going away for a while and you come back with a whole analytical model where your first love language feels like it's spreadsheets on some days, and I look at this and it's like, “Yeah, I know what some of those numbers mean.” And it just drives me up a wall, the idea of building out a plan and an execution thing and then delegating a lot of it to other people, it does not work for my worldview in so many different ways. It's the reason I think that you and I get along. That and our shared values.Mike: I remember the first time that you and I did a consulting engagement together. We went on a multi-day trip. And at the end of, like, three days of nonstop conversations, you made a comment, it was like, “Cool. So, when are we going to do that again?” Like, you were excited by it. I could tell you were energized. And I was just thinking, “Please, for the love of God, I want to die right now.”Corey: One of the weirdest parts about all of it, though, is neither one of us is in a scenario where what we do for a living and how we go about it works without the other.Mike: Right. Yeah, like, this is one of the interesting things about the company we have built is that it would not work with just you or just me; it's us being co-founders is what makes it successful.Corey: The thing that I do not understand and I don't think I ever will is the idea of co-founder speed dating, where you basically go to some big networking mixer event, pick some rando off the street, and congratulations, that's your business partner. Have fun. It is not that much of an exaggeration to say that co-founding a company with someone else is like a marriage. 
You are creating a legal entity where, without very specific controls and guidelines, you are opening yourself up to massive liability issues if the other person decides to screw you over. That is part of the reason that the values match was so important for us.Mike: Yeah, it is surprising to me how similar being co-founders and business partners is to being married. I did not expect how close those two things were. You and I spend an incredible amount of time just on the relationship for each of us, which I never expected, but makes sense in hindsight.Corey: That's, I think, part of what makes the whole you-managing-me type of relationship work: not only can you not, “Fire me,” quote-unquote, but I can't quit without—Mike: [laugh].Corey: Leaving behind a giant pile of effort with nothing to show for it over the last four years. So, it's one of those conversation styles where we go into the conversation knowing, regardless of how heated it gets or how annoyed we are with each other, that we are not going to blow the company up because one of us is salty that week.Mike: Right. Yeah, I remember from the legal perspective, when we put together a partnership agreement, our attorneys were telling us that we really needed to have someone as the 51% owner, and we were both adamant that no, that doesn't work for us. And finally, the way that we handled it is if you and I could not handle a dispute, then the only remedy left was to shut the entire thing down. And that would be an automatic trigger. We've never ever, ever even got close to that point.But, like, I like that that's the structure because it really means that if you and I can't agree on something and it's a substantial thing, then there's no business, which really kind of sets the stage for how important the conversations that we have are. And of course, you and I, we're close, we have a great relationship, so that's never come up. 
But I do like that it's still there.Corey: I like the fact that there's always going to be an option to get out. It's not a suicide pact, for lack of a better term. But it's also something that neither one of us would ever entertain lightly. And credit where due, there have been countless conversations where you and I were diametrically opposed; we each talk through it, and one or the other of us will just do a complete one-eighty on our position where, “Okay, you convinced me,” and that's it. What's so odd about that is because we don't have too many examples of that in public society, it just seems like there's now this entire focus on, “Oh, if you make an observation or a point that's wrong, you've got to double down on it.” Why would you do that? That makes zero sense. When you've considered something from a different angle and changed your mind, why waste more time on it?Mike: I think there's other interesting ones, too, where you and I have come at something from a different angle and one of us will realize that we just actually don't care as much as we thought we did. And we'll just back down because it's not the hill we want to die on.Corey: Which brings us to a good point. What hill do we want to die on?Mike: Hmm. I think we've only got a handful. I mean, as it should; like, there should not be many of them.Corey: No, no, because most things can change in the fullness of time. Just because it's not something we believe is right for the business right now does not mean it never will be.Mike: Yeah. I think all of them really come down to questions of values, which is why you and I worked so well together, in that we don't have a lot of common interests, we're at completely different stages in our lives, but we have very tightly aligned values. Which means that when we go into a discussion about something, we know where the other stands right away, like, we could generally make a pretty good guess about it. 
And there's often very little question about how some values discussion is going to go. Like, do we take on a certain client that, I don't know, builds landmines? Is that a thing that we're going to do? Of course not. Like—Corey: I should clarify, we're talking here about physical landmines; not whatever disastrous failure mode your SaaS application has.Mike: [laugh]. Yeah.Corey: We know what those are.Mike: Yeah, and like, that sort of thing, you and I would never even pose the question to each other. We would just make the decision. And maybe we tell each other later, like, “Hey, haha, look what happened,” but there will never be a discussion around it because it just—our values are so tightly aligned that it wouldn't be necessary.Corey: Whenever we're talking to someone that's in a new sector or a company that has a different expression, we always like to throw it past each other just to double-check, you don't have a problem with—insert any random thing here; the breadth of our customer base just astounds me—and very rarely has either one of us thrown a flag on something just because we do have this affinity for saying yes and making money.Mike: Yeah. But you actually wanted to talk about the terribleness of managing you.Corey: Yeah. I am very curious as to what your experience has been.Mike: [laugh].Corey: And before we dive into it, I want to call out a couple of things that make me a little atypical for your typical problem employee. I am ADHD personified. My particular expression of that means that my energy level is very different at different times of day, there are times where I will get nothing done for a day or two, and then in four hours, get three weeks of work done. It is hard to predict and it's hard to schedule around and it's never clear exactly what that energy level is going to be at any given point in time. That's the starting point of this nonsense. Now, take it away.Mike: Yeah. 
What most people know about Corey is what everyone sees on Twitter, which is what I would call the high highs. Everyone sees you as your most energetic, or at least perceived as the most energetic. If they see you in person at a conference, it's the same sort of thing. What people don't see are your lows, which are really, really low lows.And it's not a matter of, like, you don't get anything done. Like, you know, we can handle that; it's that you disappear. And it may be for a couple hours, it may be for a couple of days, and we just don't really know what's going on. That's really hard. But then, in your high highs, they're really high, but they're also really unpredictable.So, what that means is that because you have ADHD, like, the way that your brain thinks, the way your brain works, is that you don't control what you're going to focus on, and you never know what you're going to focus on. It may be exactly what you should be focusing on, which is a huge win for everyone involved, but sometimes you focus on stuff that doesn't matter to anyone except you. Sometimes really interesting stuff comes out of that, but oftentimes it doesn't. So, helping build a structure to work around those sorts of things and to also support those sorts of things, has been one of the biggest challenges that I've had. And most of my job is really about building a support structure for you and enabling you to do your best work.So, that's been really interesting and really challenging because I do not think that way. Like, if I need to focus on something, I just say, “Great. I'm just going to focus on this thing,” and I'll focus on it until I'm done. But you don't work that way, and you couldn't conceivably work that way, ever. So, it's always been hard because I say things like, “Hey, Corey, I need you to go write this series of emails.” And you'll write them when your brain decides that wants to write them, which might be never.Corey: That's part of the problem. 
I've also found that if I have an idea floating around too long, it'll linger for years and I'll never write anything about it, whereas there are times when I have—the inspiration strikes, I write a 1,000- to 2,000-word blog post every week that goes out, and there are times it takes me hours and there are times I bust out the entire thing in first draft form in 20 minutes or less. Like, if it's Domino's, like, there's not going to be a refund on it. So, it's kind of wild and I wish I could harness that somehow. I don't know how, but… that's one of the biggest challenges.Mike: I wish I could too, but it's one of the things that you learn to get used to. And with that, because we've worked together for so long, I've gotten to be able to tell what state of mind you're in. Like, are you in a state where if I put something in front of you, you're going to go after it hard, and like, great things are going to happen, or are you more likely to ignore that I said anything? And I can generally tell within the first sentence or so of bringing something up. But that also means that I have other—I have to be careful with how I structure requests that I have for you.In some cases, I come with a punch list of, like, here's six things I need to get through and I'm going to sit on this call while we go through them. In other cases, I have to drip them out one at a time over the span of a week just because that's how your mind is those days. That makes it really difficult because that's not how most people are managed and it's not how most people expect to manage. So, coming up with different ways to do that has been one of the trickiest things I've done.Corey: Let's move on a little bit to something other than managing my energy levels because that does not sound like a particularly difficult employee to manage. “Okay, great. We've got to build some buffer room into the schedule in case he winds up not delivering for a few days. 
Okay, we can live with that.” But oh, working with me gets so much worse.Mike: [laugh]. It absolutely does.Corey: This is my performance review. Please hit me with it.Mike: Yeah. The other major concern that has been challenging to work through, and that makes you really frustrating to work with, is that you hate conflict. Actually, I don't actually—let me clarify that further. You avoid conflict, except your definition of conflict is more broad than most. Because when most people think of conflicts, it's like, “Oh, I have to go have this really hard conversation, it's going to be uncomfortable, and, like—”Corey: “Time to go fire Steven.”Mike: Right, or things like, “I have to have a performance conversation with someone.” Like, everyone hates those, but, like, there's good ways and bad ways to do them, like, it's uncomfortable even at the best of times. But with you, it's more than that, it's much more broad. You avoid giving direction because you perceive giving direction as potential for conflict, and because you're so conflict-avoidant, you don't give direction to people.Which means that if someone does something you don't like, you don't say anything and then it leaves everyone on the team to say, like, “I really wish Corey would be more explicit about what he wants. I wish he was more vocal about the direction he wanted to go.” Like, “Please tell us something more.” But you're so conflict-avoidant that you don't, and no amount of begging or asking for it has really changed that, so we end up with these two things where you're doing most of the work yourself because you don't want to direct other people to do it.Corey: I will push back slightly on one element of that, which is when I have a strong opinion about something, I am not at all hesitant about articulating that. I mean, this is not—like, my Twitter is not performance art; it's very much what I believe. 
The challenge is that for so much of what we talk about internally on a day-to-day basis, I don't really have a strong opinion. And what I've always shied away from is the idea of telling people how to do their jobs. So, I want to be very clear that I'm not doing that, except when it's important.Because we've all been in environments in the corporate world where the president of the company wanders past or your grand-boss walks into the room and asks an idle question, or, “Maybe we should do this,” and it never feels like it's really just idle pondering. It's, “Welp, new strategic priority just dropped from on high.”Mike: Right.Corey: And every senior manager has a story about screwing that one up. And I have led us down that path once or twice previously. So—Mike: That's true.Corey: When I don't have a strong opinion, I think what I need to get better at is saying, “I don't give a shit,” but when I frame it like that, it causes different problems.Mike: Yeah. Yeah, that's very true. I still don't completely agree with your disagreement there, but I understand your perspective. [laugh].Corey: Oh, it's not like you can fire me, so it doesn't really matter. I kid. I kid.Mike: Right. Yeah. So, I think those are the two major areas that make you a real challenge to manage and a challenge to direct. But one of the reasons why I think we've been successful at it, or at least I'll say I've been successful at managing you, is I do so with such a gentle touch that you don't realize that I'm doing anything, and I have all these different—Corey: Well, it did take me four years to realize what was going on.Mike: Yeah, like, I have all these different ways of getting you to do things, and you don't realize I'm doing them. And, like, I've shared many of them here for you for the first time. And that's really is what has worked out well. Like, a lot of the ways that I manage you, you don't realize are management.Corey: Managing shards. Maintenance windows. Overprovisioning. 
ElastiCache bills. I know, I know. It's a spooky season and you're already shaking. It's time for caching to be simpler. Momento Serverless Cache lets you forget the backend to focus on good code and great user experiences. With true autoscaling and a pay-per-use pricing model, it makes caching easy. No matter your cloud provider, get going for free at gomomento.co/screaming That's GO M-O-M-E-N-T-O dot co slash screamingCorey: What advice would you have for someone for whom a lot of these stories are resonating? Because, “Hey, I have a direct report is driving me to distraction and a lot sounds like what you're describing.” What do you wish you'd known sooner about how to coax performance out of me, for lack of a better phrasing?Mike: When we first started really working together, I knew what ADHD was, but I knew it from a high school paper that I did on ADHD, and it's um—oh, what was it—“The Overdiagnosis of ADHD,” which was a thing when you and I were at high school. That's all I knew is just that ADHD was suspected to be grossly overdiagnosed and that most people didn't have it. What I have learned is that yeah, that might have been true—maybe; I don't know—but for people that do have any ADHD, it's a real thing. Like, it does have some pretty substantial impact.And I wish I had known more about how that manifests, and particularly how it manifests in different people. And I wish I'd known more earlier on about the coping mechanisms that different people create for themselves and how they manage and how they—[sigh], I'm struggling to come up with the right word here, but many people who are neurodivergent in some way create coping mechanisms and ways to shift themselves to appear more neurotypical. And I wish I had understood that better. Particularly, I wish I had understood that better for you when we first started because I've kind of learned about it over time. 
And I spent so much time trying to get you to work the way that I work rather than understand that you work different. Had I spent more time to understand how you work and what your coping mechanisms were, the earlier years of Duckbill would have been so much smoother.Corey: And again, having this conversation has been extraordinarily helpful. On my side of it, one of the things that was absolutely transformative and caused a massive reduction in our interpersonal conflict was the very simple tool of, it's not necessarily a problem when I drop something on the floor and don't get to it, as long as I throw a hand up and say, “I'm dropping this thing,” and so someone else can catch it as we go. I don't know how much of this is ADHD speaking versus how much of it is just my own brokenness in some ways, but I feel like everyone has this neverending list of backlog tasks that they'll get to someday that generally doesn't ever seem to happen. More often than not, I wind up every few months, just looking at my ever-growing list, reset to zero and we'll start over. And every once in a while, I'll be really industrious and knock a thing or two off the list. But so many that don't necessarily matter or need to be me doing them, but it drives people to distraction when something hits my email inbox, it just dies there, for example.Mike: Yeah. One of the systems that we set up here is that if there's something that Corey does not immediately want to do, I have you send it to someone else. And generally it's to me and then I become a router for you. But making that more explicit and making that easier for you—I'm just like, “If this is not something that you're going to immediately take care of yourself, forward it to me.” And that was huge. But then other things, like when you take time off, no one knows you're taking time off. 
And it's an—the easiest thing is no one cares that you're taking time off; just, you know, tell us you're doing it.Corey: Yeah, there's a difference between, “I'm taking three days off,” and your case, the answer is generally, “Oh, thank God. He's finally using some of that vacation.”Mike: [laugh].Corey: The problem is there's a world of difference between, “Oh, I'm going to take these three days off,” and just not showing up that day. That tends to cause problems for people.Mike: Yeah. They're just waving a hand in the air and saying, “Hey, this is happening,” that's great. But not waving it, not saying anything at all, that's where the pain really comes from.Corey: When you take a look across your experience managing people, which to my understanding your first outing with it was at this company—Mike: Yeah.Corey: What about managing me is the least surprising and the most surprising that you've picked up during that pattern? Because again, the story has always been, “Oh, yeah, you're a terrible manager because you've never done it before,” but I look back and you're clearly the best manager I've ever had, if for no other reason than neither one of us can rage-quit. But there's a lot of artistry to how you've handled a lot of challenges that I present to you.Mike: I'm the best manager you've had because I haven't fired you. [laugh].Corey: And also, some of the best ones I have had fired me. That doesn't necessarily disqualify someone.Mike: Yeah. I want to say, I am by no means experienced as a manager. As you mentioned, this is my first outing into doing management. As my coach tells me, I'm getting better every day. I am not terrible [laugh].The—let's see—most surprising; least surprising. I don't think I have anything for least surprising. I think most surprising is how easy it is for you to accept feedback and how quickly you do something about it, how quickly you take action on that feedback. 
I did not expect that, given all your other proclivities for not liking managers, not liking to be managed, for someone to give feedback to you and you say, “Yep, that sounds good,” and then do it, like, that was incredibly surprising.Corey: It's one of those areas where if you're not embracing or at least paying significant attention to how you are being perceived, maybe that's a problem, maybe it's not, let's be very clear. However, there's also a lot of propensity there to just assume, “Oh, I'm right and screw everyone else.” You can do an awful lot of harm that way. And that is something I've had to become incredibly aware of, especially during the pandemic, as the size of my audience at this point more than quadrupled from the start of the pandemic. These are a bunch of people now who have never met me in person, they have no context on what I do.And I tend to view the world the way you might expect a dog to behave, who caught a car that he has absolutely no idea how to drive, and he's sort of winging it as he goes. Like, step one, let's not kill people. Step two, eh, we'll figure that out later. Like, step one is the most important.Mike: Mm-hm. Yeah.Corey: And feedback is hard to get, past a certain point. I often lament from time to time that it's become more challenging for me to weed out who the jerks are because when you're perceived to have a large platform and more or less have no problem calling large companies and powerful folk to account, everyone's nice to you. And, well, “Really? He's terrible and shitty to women. That's odd. He's always been super nice to me,” is not the glowing defense that so many people seem to think that it is. It's that I have learned to listen a lot more clearly the more I speak.Mike: That's a challenge for me as well because, as we've mentioned, this is my first foray into management. 
As we've had more people in the company, that has gotten to be more of a challenge: I have to watch what I say because my word carries weight on its own, by virtue of my position. And you have the same problem, except yours is much more about your weight in public, rather than your weight internally.Corey: I see it as different sides of the same coin. I take it as a personal bit of a badge of honor that almost every person I meet, including the people who've worked here, have come away very surprised by just how true to life my personality on Twitter is to how I actually am when I interact with humans. You're right, they don't see the low sides, but I also try not to take that out on the staff either.Mike: [laugh]. Right.Corey: We do the best with what we have, I think, and it's gratifying to know that I can still learn new tricks.Mike: Yeah. And I'm not firing you anytime soon.Corey: That's right. Thank you again for giving me the shotgun performance review. It's always appreciated. If people want to learn more, where can they find you, to get their own performance preview, perhaps?Mike: Yeah, you can find me on Twitter at @Mike_Julian. Or you can sign up for our newsletter, where I'm talking about my upcoming book on consulting at mikejulian.com.Corey: And we will put links to that into the show notes. Thanks again, sir.Mike: Thank you.Corey: Mike Julian, CEO of The Duckbill Group, my business partner, and apparently my boss. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice along with an angry comment that demonstrates the absolute worst way to respond to a negative performance evaluation.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. 
We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.Announcer: This has been a HumblePod production. Stay humble.
About ChetanChetan Venkatesh is a technology startup veteran focused on distributed data, edge computing, and software products for enterprises and developers. He has 20 years of experience in building primary data storage, databases, and data replication products. Chetan holds a dozen patents in the area of distributed computing and data storage.Chetan is the CEO and Co-Founder of Macrometa – a Global Data Network featuring a Global Data Mesh, Edge Compute, and In-Region Data Protection. Macrometa helps enterprise developers build real-time apps and APIs in minutes – not months.Links Referenced: Macrometa: https://www.macrometa.com Macrometa Developer Week: https://www.macrometa.com/developer-week TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: Forget everything you know about SSH and try Tailscale. Imagine if you didn't need to manage PKI or rotate SSH keys every time someone leaves. That'd be pretty sweet, wouldn't it? With Tailscale SSH, you can do exactly that. Tailscale gives each server and user device a node key to connect to its VPN, and it uses the same node key to authorize and authenticate SSH.Basically, you're SSHing the same way you manage access to your app. What's the benefit here? Built-in key rotation, permissions as code, connectivity between any two devices, reduced latency, and there's a lot more, but there's a time limit here. You can also ask users to reauthenticate for that extra bit of security. Sounds expensive? Nope, I wish it were. Tailscale is completely free for personal use on up to 20 devices. To learn more, visit snark.cloud/tailscale. Again, that's snark.cloud/tailscale.Corey: Managing shards. 
Maintenance windows. Overprovisioning. ElastiCache bills. I know, I know. It's spooky season and you're already shaking. It's time for caching to be simpler. Momento Serverless Cache lets you forget the backend to focus on good code and great user experiences. With true autoscaling and a pay-per-use pricing model, it makes caching easy. No matter your cloud provider, get going for free at gomomento.co/screaming. That's GO M-O-M-E-N-T-O dot co slash screaming.Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. Today, this promoted guest episode is brought to us basically so I can ask a question that has been eating at me for a little while. That question is, what is the edge? Because I have a lot of cynical, sarcastic answers to it, but that doesn't really help understanding. My guest today is Chetan Venkatesh, CEO and co-founder at Macrometa. Chetan, thank you for joining me.Chetan: It's my pleasure, Corey. You're one of my heroes. I think I've told you this before, so I am absolutely delighted to be here.Corey: Well, thank you. We all need people to sit on the curb and clap as we go by and feel like giant frauds in the process. So let's start with the easy question that sets up the rest of it. Namely, what is Macrometa, and what puts you in a position to be able to speak at all, let alone authoritatively, on what the edge might be?Chetan: I'll answer the second part of your question first, which is, you know, what gives me the authority to even talk about this? Well, for one, I've been trying to solve the same problem for 20 years now, which is build distributed systems that work really fast and can answer questions about data in milliseconds. And my journey's sort of been like the spiral staircase journey, you know, I keep going around in circles, but the view just keeps getting better every time I do one of these things. 
So I'm on my fourth startup doing distributed data infrastructure, and this time really focused on trying to provide a platform that's the antithesis of the cloud. It's kind of like taking the cloud and flipping it on its head because instead of having a single region application where all your stuff runs in one place, on us-west-1 or us-east-1, what if your apps could run everywhere, like, they could run in hundreds and hundreds of cities around the world, much closer to where your users and devices and most importantly, where interesting things in the real world are happening?And so we started Macrometa about five years back to build a new kind of distributed cloud—let's call it the edge—that kind of looks like a CDN, a Content Delivery Network, but really brings very sophisticated platform-level primitives for developers to build applications in a distributed way around primitives for compute, primitives for data, but also some very interesting things that you just can't do in the cloud anymore. So that's Macrometa. And we're doing something with edge computing, which is a big buzzword these days, but I'm sure you'll ask me about that.Corey: It seems to be. Generally speaking, when I look around and companies are talking about edge, it feels almost like it is a redefining of what they already do to use a term that is currently trending and deep in the hype world.Chetan: Yeah. You know, I think humans just being biologically social beings just tend to be herd-like, and so when we see a new trend, we like to slap it on everything we have. We did that 15 years back with cloud, if you remember, you know? Everybody was very busy trying to stick the cloud label on everything that was on-prem. Edge is sort of having that edge-washing moment right now.But I define the edge very specifically, as very different from the cloud. 
You know, where the cloud is defined by centralization, i.e., you've got a giant hyperscale data center somewhere far, far away, where typically electricity, real estate, and those things are reasonably cheap, i.e., not in urban centers, where those things tend to be expensive.You know, you have platforms where you run things at scale, it's sort of a ‘your mess for less' business in the cloud and somebody else manages that for you. The edge is actually defined by location. And there are three types of edges. The first edge is the CDN edge, which is historically where we've been trying to make things faster with the internet and make the internet scale. So Akamai came about, about 20 years back and created this thing called the CDN that allowed the web to scale. And that was the first killer app for edge, actually. So that's the first location that defines the edge where a lot of the peering happens between different network providers and the on-ramp around the cloud happens.The second edge is the telecom edge. That's actually right next to you in terms of, you know, the logical network topology because every time you do something on your computer, it goes through that telecom layer. And now we have the ability to actually run web services, applications, data, directly from that telecom layer.And then the third edge is—sort of, people have been familiar with this for 30 years. The third edge is your device, just your mobile phone. It's your internet gateway and, you know, things that you carry around in your pocket or that sit on your desk, where you have some compute power, but it's very restricted and it only deals with things that are interesting or important to you as a person, not in a broad range. So those are sort of the three things. And it's not the cloud. 
And these three things are now becoming important as a place for you to build and run enterprise apps.Corey: Something that I think is often overlooked here—and this is sort of a natural consequence of the cloud's own success and the joy that we live in a system that we do where companies are required to always grow and expand and find new markets—historically, for example, when I went to AWS re:Invent, which is a cloud service carnival in the desert that no one in their right mind should ever want to attend but somehow we keep doing, it used to be that, oh, these announcements are generally all aligned with people like me, where I have specific problems and they look a lot like what they're talking about on stage. And now they're talking about things that, from that perspective, seem like Looney Tunes. Like, I'm trying to build Twitter for Pets or something close to it, and I don't understand why there's so much talk about things like industrial IoT and, “Machine learning,” quote-unquote, and other things that just do not seem to align with what I'm doing. I'm trying to build a web service, like it says in the name of the company; what gives?And part of that, I think, is that it's difficult to remember, for most of us—especially me—that what they're coming out with is not your shopping list. Every service is for someone, not every service is for everyone, so figuring out what it is that they're talking about and what those workloads look like is something that I think is getting lost in translation. And in our defense—collective defense—Amazon is not the best at telling stories to realize that, oh, this is not me they're talking to; I'm going to opt out of this particular thing. You figure it out by getting it wrong first. Does that align with how you see the market going?Chetan: I think so. You know, I think of Amazon Web Services, or even Google, or Azure as sort of Costco and, you know, Sam's Wholesale Club or whatever, right? 
They cater to a very broad audience and they sell a lot of stuff in bulk and cheap. And you know, so it's sort of a lowest common denominator type of a model. And so emerging applications, and especially emerging needs that enterprises have, don't necessarily get solved in the cloud. You've got to go and build it up yourself on sort of the crude primitives that they provide.So okay, go use your bare basic EC2, your S3, and build your own edgy, or whatever, you know, cutting edge thing you want to build over there. And if enough people are doing it, I'm sure Amazon and Google start to take interest and, you know, develop something that makes it easier. So you know, I agree with you, they're not the best at this sort of a thing. The edge is also a phenomenon that's orthogonal, and diametrically opposite, to the architecture of the cloud and the economics of the cloud.And we do centralization in the cloud in a big way. Everything is in one place; we make giant piles of data in one database or data warehouse and slice and dice it, and almost all our computer science is great at doing things in a centralized way. But when you take data and chop it into 50 copies and keep it in 50 different places on Earth, and you have this thing called the internet or the wide area network in the middle, trying to keep all those copies in sync is a nightmare. So you start to deal with some very basic computer science problems like distributed state and how do you build applications that have a consistent view of that distributed state? 
So you know, there have been attempts to solve these problems for 15, 18 years, but none of those attempts have really cracked the intersection of three things: a way for programmers to do this in a way that doesn't blow their heads with complexity, a way to do this cheaply and effectively enough where you can build real-world applications that serve billions of users concurrently at a cost point that actually is economical and makes sense, and third, a way to do this with adequate levels of performance where you don't die waiting for the spinning wheel on your screen to go away.So these are the three problems with edge. And as I said, you know, me and my team, we've been focused on this for a very long while. And me and my co-founder have come from this world and we created a platform very uniquely designed to solve these three problems, the problems of complexity for programmers to build in a distributed environment like this where data sits in hundreds of places around the world and you need a consistent view of that data, being able to operate and modify and replicate that data with consistency guarantees, and then a third one, being able to do that at high levels of performance, which translates to what we call ultra-low latency, which is human perception. The threshold of human perception, visually, is about 70 milliseconds. Our finest athletes, the best Esports players, are about 70 to 80 milliseconds in their twitch, in their ability to twitch when something happens on the screen. The average human is about 100 to 110 milliseconds.So in a second, we can maybe do seven things at rapid rates. You know, that's how fast our brain can process it. Anything that falls below 100 milliseconds—especially if it falls into 50 to 70 milliseconds—appears instantaneous to the human mind and we experience it as magic. 
And so this is where edge computing, and where my platform, comes in: it literally puts data and applications within 50 milliseconds of 90% of humans and devices on Earth and now allows a whole new set of applications where latency and location and the ability to control those things with really fine-grained capability matters. And we can talk a little more about what those apps are in a bit.Corey: And I think that's probably an interesting place to dive into at the moment because whenever we talk about the idea of new ways of building things that are aimed at decentralization, first, people at this point automatically have a bit of an aversion to, “Wait, are you talking about some of the Web3 nonsense?” It's one of those look around the poker table and see if you can spot the sucker, and if you can't, it's you. Because there are interesting aspects to that entire market, let's be clear, but it also seems to be occluded by so much of the grift and nonsense and spam and the rest that, again, sort of characterize the early internet as well. The idea though, of decentralizing out of the cloud is deeply compelling just to anyone who's really ever had to deal with the egress charges, or even the data transfer charges inside of one of the cloud providers. The counterpoint is it feels that historically, you either get to pay the tax and go all-in on a cloud provider and get all the higher-level niceties, or otherwise, you wind up deciding you're going to have to more or less go back to physical data centers, give or take, and other than the very baseline primitives that you get to work with of VMs and block storage and maybe a load balancer, you're building it all yourself from scratch. It seems like you're positioning this as setting up for a third option. I'd be very interested to hear it.Chetan: Yeah. And a quick comment on decentralization: good; not so sure about the Web3 pieces around it. We tend to talk about computer science and not the ideology of distributing data. 
There are political reasons, there are ideological reasons around data and sovereignty and individual human rights, and things like that. There are people far smarter than me who should explain that.I fall personally into the Nicholas Weaver school of skepticism about Web3 and blockchain and those types of things. And for listeners who are not familiar with Nicholas Weaver, please go online. He teaches at UC Berkeley and is just one of the finest minds of our time. And I think he's broken down some very good reasons why we should be skeptical about, sort of, Web3 and, you know, things like that. Anyway, that's a digression.Coming back to what we're talking about, yes, it is a new paradigm, but that's the challenge, which is I don't want to introduce a new paradigm. I want to provide a continuum. So what we've built is a platform that looks and feels very much like Lambdas, and a poly-model database. I hate the word multi. It's a pretty dumb word, so I've started to substitute ‘multi' with ‘poly' everywhere, wherever I can find it.So it's not multi-cloud; it's poly-cloud. And it's not multi-model; it's poly-model. Because what we want is a world where developers have the ability to use the best paradigm for solving problems. And it turns out when we build applications that deal with data, data doesn't just come in one form, it comes in many different forms, it's polymorphic, and so you need a data platform that's also, you know, polyglot and poly-model to be able to handle that. So that's one part of the problem, which is, you know, we're trying to provide a platform that provides continuity by looking like a key-value store like Redis. It looks like a document database—Corey: Or the best database in the world, Route 53 TXT records. But please, keep going.Chetan: Well, we've got that too, so [laugh] you know? And then we've got a streaming graph engine built into it that kind of looks and behaves like a graph database, like Neo4j, for example. 
And, you know, it's got columnar capabilities as well. So it's sort of a really interesting data platform that is not open-source; it's proprietary because it's designed to solve these problems of being able to distribute data, put it in hundreds of locations, keep it all in sync, but it looks like a conventional NoSQL database. And it speaks PostgreSQL, so if you know PostgreSQL, you can program it, you know, pretty easily.What it's also doing is taking away the responsibility for engineers and developers to understand how to deal with very arcane problems like conflict resolution in data. I made a change in Mumbai; you made a change in Tokyo; who wins? Our systems in the cloud—you know, DynamoDB, and things like that—they have very crude answers for this, something called last writer wins. We've done a lot of work to build a protocol that brings you ACID-like consistency in these types of problems and makes it easy to reason with state change when you've got an application that's potentially running in 100 locations and each of those places is modifying the same record, for example.And then the second part of it is it's a converged platform. So it doesn't just provide data; it provides a compute layer that's deeply integrated directly with the data layer itself. So think of it as Lambdas running, like, stored procedures inside the database. That's really what it is. We've built a very, very specialized compute engine that exposes containers and functions as stored procedures directly on the database.And so they run inside the context of the database and so you can build apps in Python, Go, your favorite language; it compiles down into a [unintelligible 00:15:02] kernel that actually runs inside the database among all these different polyglot interfaces that we have. And the third thing that we do is we provide an ability for you to have very fine-grained control on your data. 
Because today, data's become a political tool; it's become something that nation-states care a lot about.Corey: Oh, do they ever.Chetan: Exactly. And [unintelligible 00:15:24] regulated. So here's the problem. You're an enterprise architect and your application is going to be consumed in 15 countries, there are 13 different frameworks to deal with. What do you do? Well, you spin up 13 different versions, one for each country, and you know, build 13 different teams, and have 13 zero-day attacks and all that kind of craziness, right?Well, data protection is actually one of the most important parts of the edge because, with something like Macrometa, you can build an app once, and we'll provide all the necessary localization for any region processing, data protection with things like tokenization of data so you can exfiltrate data securely without violating potentially PII-sensitive data exfiltration laws within countries, things like that; i.e., it's solving some really hard problems by providing an opinionated platform that does these three things. And I'll summarize it thus, Corey, and then we can kind of dig into each piece. Our platform is called the Global Data Network. It's not a global database; it's a global data network. It looks like a frickin' database, but it's actually a global network available in 175 cities around the world.Corey: The challenge, of course, is where does the data actually live at rest, and—this is why people care about—well, there are two reasons people care about that; one is the data residency locality stuff, which has always, honestly for me, felt a little bit like a bit of a cloud provider shakedown. Yeah, build a data center here or you don't get any of the business of anything that falls under our regulation. The other is, what does the egress cost of that look like? 
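Chetan's “I made a change in Mumbai; you made a change in Tokyo; who wins?” example from a moment ago is easy to sketch. The snippet below is an illustrative toy of the crude last-writer-wins rule he attributes to systems like DynamoDB, not Macrometa's actual protocol; the `Write` shape and the replica-name tie-break are assumptions made for the demo:

```python
from dataclasses import dataclass

@dataclass
class Write:
    value: str
    timestamp: float  # wall-clock time at the writing replica
    replica: str

def lww_merge(a: Write, b: Write) -> Write:
    """Last-writer-wins: keep the write with the later timestamp.
    Ties break on replica name so every replica converges on the
    same winner. Note the crudeness: the 'losing' update is
    silently discarded, which is exactly the problem stronger
    consistency protocols aim to avoid."""
    return max(a, b, key=lambda w: (w.timestamp, w.replica))

mumbai = Write("balance=90", timestamp=1700000000.12, replica="mumbai")
tokyo = Write("balance=80", timestamp=1700000000.34, replica="tokyo")

# Tokyo's write carries the later timestamp, so Mumbai's change is lost.
winner = lww_merge(mumbai, tokyo)
```

Both call orders return the same winner, which gives convergence; what LWW cannot do is merge the two intents, which is why “who wins?” is the hard question.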
Because yeah, I can build a whole multicenter data store on top of AWS, for example, but at minimum, we're talking two cents a gigabyte of transfer, even within a region in some cases, and many times that externally.Chetan: Yeah, that's the real shakedown: the egress costs [laugh] more than the other example that you talked about over there. But it's a reality of how cloud pricing works and things like that. What we have built is a network that is completely independent of the cloud providers. We're built on top of five different service providers. Some of them are cloud providers, some of them are telecom providers, some of them are CDNs.And so we're building our global data network on top of routes and capacity provided by transfer providers who have different economics than the cloud providers do. So our cost for egress falls somewhere between two and five cents, for example, depending on which edge locations, which countries, and things that you're going to use over there. We've got a pretty generous egress free tier where, you know, for certain thresholds, there's no egress charge at all, but over certain thresholds, we start to charge between two to five cents. But even if you were to take it at the higher end of that spectrum, five cents per gigabyte for transfer, the amount of value our platform brings in architecture and reduction in complexity and the ability to build apps that are, frankly, mind-boggling—one of my customers is a SaaS company in marketing that uses us to inject offers while people are on their website, you know, browsing. Literally, you hit their website, you do a few things, and then boom, there's a customized offer for them.In banking, that's used, for example, where you're making your minimum payments on your credit card, but you have a good payment history and you've got a decent credit score, well, let's give you an offer to give you a short-term loan, for example. 
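The egress arithmetic being quoted here is simple enough to sanity-check. A minimal sketch in Python; the free-tier threshold and the 10 TB traffic volume are made-up illustration numbers, not Macrometa's actual pricing:

```python
def egress_cost_dollars(gigabytes: float, cents_per_gb: float,
                        free_tier_gb: float = 0.0) -> float:
    """Flat-rate egress bill: traffic above a free threshold is
    charged per gigabyte, quoted in cents, returned in dollars."""
    billable_gb = max(0.0, gigabytes - free_tier_gb)
    return billable_gb * cents_per_gb / 100

# 10 TB of monthly egress at the two ends of the quoted 2-5 cent range:
low = egress_cost_dollars(10_000, 2)   # $200
high = egress_cost_dollars(10_000, 5)  # $500
```

At that scale the spread between the two rates is a few hundred dollars a month, which is the point being made: for a revenue-generating app, the egress line item is small next to the complexity it buys off.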
So those types of new applications, you know, are really at this intersection where you need low latency, you need in-region processing, and you also need to comply with data regulation. So when you're building a high-value, revenue-generating app like that, egress cost, even at five cents, right, tends to be very, very cheap, and the smallest part of, you know, the complexity of building them.Corey: One of the things that I think we see a lot of is that the tone of this industry is set by the big players, and they have done a reasonable job, by and large, of making anything that isn't running in their blessed environments, let me be direct, sound kind of shitty, where it's like, “Oh, do you want to be smart and run things in AWS?”—or GCP? Or Azure, I guess—“Or do you want to be foolish and try and build it yourself out of popsicle sticks and twine?” And, yeah, on some level, if I'm trying to treat everything like it's AWS and run a crappy analog version of DynamoDB, for example, I'm not going to have a great experience, but if I also start from a perspective of not using things that are higher up the stack offerings, that experience starts to look a lot more reasonable as we start expanding out. But it still does present to a lot of us as well, we're just going to run things in a VM somewhere and treat them just like we did back in 2005. What's changed in that perspective?Chetan: Yeah, you know, I can't talk for others but for us, we provide a high-level Platform-as-a-Service, and that platform, the global data network, has three pieces to it. First piece is—and none of this will translate into anything that AWS or GCP has because the edge, Corey, is completely different, right? So the global data network that we have is composed of three technology components. The first one is something that we call the global data mesh. And this is Pub/Sub and event processing on steroids. 
We have the ability to connect data sources across all kinds of boundaries; you've got some data in Germany and you've got some data in New York. How do you put these things together and get them streaming so that you can start to do interesting things with correlating this data, for example?And you might have to get across not just physical boundaries, like, they're sitting in different systems in different data centers; they might be logical boundaries, like, hey, I need to collaborate with data from my supply chain partner and we need to be able to do something that's dynamic in real-time, you know, to solve a business problem. So the global data mesh is a way to very quickly connect data wherever it might be in legacy systems, in flat files, in streaming databases, in data warehouses, what have you—you know, we have 500-plus types of connectors—but most importantly, it's not just getting the data streaming, it's then turning it into an API and making that data fungible. Because the minute you put an API on it and it's become fungible, now that data has actually got a lot of value. And so the data mesh is a way to very quickly connect things up and put an API on it. And that API can now be consumed by front-ends, it can be consumed by other microservices, things like that.Which brings me to the second piece, which is edge compute. So we've built a compute runtime that is Docker compatible, so it runs containers, it's also Lambda compatible, so it runs functions. Let me rephrase that; it's not Lambda-compatible, it's Lambda-like. So no, you can't take your Lambda and dump it on us; it won't just work. You have to do some things to make it work on us.
“Yeah, we're using it as spackle between the different cloud services that don't talk to one another despite being made by the same company.”Chetan: [laugh] right.Corey: It's fun.Chetan: Yeah. So the second piece is edge compute, which allows you now to build microservices that are stateful, i.e., they have data that they interact with locally, and schedule them along with the data on our network of 175 regions around the world. So you can build distributed applications now.Now, your microservice back-end for your banking application or for your HR SaaS application or e-commerce application is not running in us-east-1 in Virginia; it's running literally in 15, 18, 25 cities where your end-users are, potentially. And to take an industrial IoT case, for example, you might be ingesting data from the electricity grid in 15, 18 different cities around the world; you can do all of that locally now. So that's what the edge functions do: it flips the cloud model around because instead of sending data to where the compute is in the cloud, you're actually bringing compute to where the data is originating, or the data is being consumed, such as through a mobile app. So that's the second piece.And the third piece is global data protection, which is hey, now I've got a distributed infrastructure; how do I comply with all the different privacy and regulatory frameworks that are out there? How do I keep data secure in each region? How do I potentially share data between regions in such a way that, you know, I don't break the model of compliance globally and create a billion-dollar headache for my CIO and CEO and CFO, you know? So that's the third piece of capabilities that this provides.All of this is presented as a set of serverless APIs. So you simply plug these APIs into your existing applications. Some of your applications work great in the cloud. Maybe there are just parts of that app that should be on our edge. 
And that's usually where most customers start; they take a single web service or two that's not doing so great in the cloud because it's too far away; it has data sensitivity, location sensitivity, time sensitivity, and so they use us as a way to just deal with that on the edge.And there are other applications that are completely what I call edge native, i.e., with no dependency on the cloud; the app runs completely distributed across our network, consumes primarily the edge's infrastructure, and just maybe sends some data back to the cloud for long-term storage or long-term analytics.Corey: And ingest does remain free. The long-term analytics, of course, means that once that data is there, good luck convincing a customer to move it because that gets really expensive.Chetan: Exactly, exactly. It's a speciation—as I like to say—of the cloud, into a fast tier where interactions happen, i.e., the edge. So systems of record are still in the cloud; we still have our transactional systems over there, our databases, data warehouses.And those are great for historical types of data, as you just mentioned, but for things that are operational in nature, that are interactive in nature, where you really need to deal with them because they're time-sensitive, they're depleting value in seconds or milliseconds, they're location sensitive, there's a lot of noise in the data and you need to get to just those bits of data that actually matter, throw the rest away, for example—which is what you do with a lot of telemetry in cybersecurity, for example, right—those are all the things that require a new kind of a platform, not a system of record, a system of interaction, and that's what the global data network is, the GDN. And these three primitives, the data mesh, edge compute, and data protection, are the way that our APIs are shaped to help our enterprise customers solve these problems. 
So, to put it another way, imagine ten years from now what DynamoDB and global tables, with a really fast Lambda, and Kinesis with event processing actually built directly into it might be like. That's Macrometa today, available in 175 cities.Corey: This episode is brought to us in part by our friends at Datadog. Datadog is a SaaS monitoring and security platform that enables full-stack observability for modern infrastructure and applications at every scale. Datadog enables teams to see everything: dashboarding, alerting, application performance monitoring, infrastructure monitoring, UX monitoring, security monitoring, dog logos, and log management, in one tightly integrated platform. With 600-plus out-of-the-box integrations with technologies including all major cloud providers, databases, and web servers, Datadog allows you to aggregate all your data into one platform for seamless correlation, allowing teams to troubleshoot and collaborate together in one place, preventing downtime and enhancing performance and reliability. Get started with a free 14-day trial by visiting datadoghq.com/screaminginthecloud, and get a free t-shirt after installing the agent.Corey: I think it's also worth pointing out that it's easy for me to fall into a trap that I wonder if some of our listeners do as well, which is, I live in, basically, downtown San Francisco. I have gigabit internet connectivity here, to the point where when it goes out, it is suspicious and more than a little bit frightening because my ISP—Sonic.net—is amazing and deserves every bit of praise that you never hear any ISP ever get. But when I travel, it's a very different experience. 
When I go to, oh, I don't know, the conference center at re:Invent last year and find that the internet is patchy at best, or downtown San Francisco on Verizon today, I discover that the internet is almost non-existent, and applications that I had grown accustomed to just working suddenly didn't. And there are a lot more people who live far away from these data center regions and tier-one backbones than live right next to them. So I think that there's a lot of mistaken ideas around exactly what the lower-bandwidth experience of the internet is today. And that is something that feels inadvertently classist, if that makes sense. Geographically bigoted, even?

Chetan: Yeah. No, I think those two points are very well articulated. I wish I could articulate it that well. But yes, if you can afford 5G, some of those things get better. But again, 5G is not everywhere yet. It will be, but 5G can in many ways democratize at least one part of it, which is to provide an overlay network at the edge, where if you left home and switched networks onto a wireless one, you can still get the same quality of service that you're used to getting from Sonic, for example. So I think it can solve some of those things in the future. But the second part of it—what did you call it? Bigoted?

Corey: Geographically bigoted. And again, that's maybe a bit of a strong term, but it's easy to forget that you can't get around the speed of light. I would say that the most poignant example of that I had was when I was—in the before times—giving a keynote in Australia. So ah, I know what I'll do, I'll spin up an EC2 instance for development purposes—because that's how I do my development—in Australia. And then I would just pay my provider for cellular access for my iPad and that was great.

And I found the internet was slow as molasses for everything I did. Like, how do people even live here? Well, turns out that my provider would backhaul traffic to the United States.
So to log into my session, I would wind up having to connect with a local provider, backhaul to the US, then connect back out from there to Australia across the entire Pacific Ocean, talk to the server, get the response, and follow that return path. Yeah, it turns out that doing laps around the world is not the most efficient way of transferring any data whatsoever, let alone in sizable amounts.

Chetan: And that's why we decided to call our platform the global data network, Corey. In fact, the reason is really very simple: we have our own network underneath all of this, and we stop this whole ping-pong effect of data going around and help create deterministic guarantees around latency, around location, around performance. We're trying to democratize latency and these types of problems in a way that programmers shouldn't have to worry about all this stuff. You write your code, you push publish, it runs on a network, and it all gets there with a guarantee that 95% of all your requests will happen within 50 milliseconds round-trip time, from any device, you know, in these population centers around the world.

So yeah, it's a big deal. It's sort of one of our je ne sais quoi pieces in our mission and charter, which is to just democratize latency and access, and get away from this geographical nonsense of, you know, how networks work, where they dynamically switch topology and just make everything slow in a very non-deterministic way.

Corey: One last topic that I want to ask you about—because I'm near certain that, given your position, you will have an opinion on this—what's your take on, I guess, the carbon footprint of clouds these days? Because a lot of people have been talking about it; there has been a lot of noise made about it, justifiably so. I'm curious to get your take.

Chetan: Yeah, you know, it feels like we're in the '30s and the '40s of the carbon movement when it comes to clouds today, right?
Maybe there's some early awareness of the problem, but you know, frankly, there's very little we can do other than just sort of put a wet finger in the air, compute some carbon offset, and plant some trees. I think these are good building blocks; they're not necessarily the best ways to solve this problem, ultimately. But one of the things I care deeply about, and you know, my company cares a lot about, is helping make developers more aware of what kind of carbon footprint their code tangibly has on the environment. And so we've started two things inside the company. We've started a foundation that we call the Carbon Conscious Computing Consortium—the four C's. We're going to announce that publicly next year; we're going to invite folks to come and join us and be a part of it.

The second thing that we're doing is we're building a completely open-source, carbon-conscious computing platform that is built on real data that we're collecting about, to start with, how Macrometa's platform emits carbon in response to different types of things you build on it. So for example, you wrote a query that hits our database and queries, you know, I don't know, 20 billion objects inside of our database. It'll tell you exactly how many micrograms or how many milligrams of carbon—it's an estimate; not exact. I've got to learn to throttle myself down. It's an estimate, you know; you can't really measure these things exactly because the cost of carbon is different in different places, you know, there are different technologies, et cetera.

It gives you a good, decent estimate, something that reliably tells you, “Hey, you know that query that you have over there, that piece of SQL? That's probably going to produce this many micrograms of carbon at this scale.” You know, if this query was called a million times every hour, this is how much it costs. A million times a day, this is how much it costs, and things like that.
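As a rough illustration of the kind of per-query estimate Chetan describes, here is a hypothetical sketch in Python. Every name and emission factor below is an invented assumption for illustration; Macrometa's actual tooling and numbers are not described in this conversation.

```python
# Hypothetical per-query carbon estimator, in the spirit of the tooling
# described above. The emission factors are illustrative assumptions only:
# real values vary by region, hardware, and grid mix.
UG_CO2_PER_MILLION_OBJECTS_SCANNED = 120.0  # assumed factor
UG_CO2_PER_KB_TRANSFERRED = 0.4             # assumed factor

def estimate_query_carbon_ug(objects_scanned: int, kb_transferred: float) -> float:
    """Rough carbon estimate for one query, in micrograms of CO2."""
    scan = objects_scanned / 1_000_000 * UG_CO2_PER_MILLION_OBJECTS_SCANNED
    transfer = kb_transferred * UG_CO2_PER_KB_TRANSFERRED
    return scan + transfer

def estimate_at_scale(per_query_ug: float, calls_per_hour: int) -> dict:
    """Scale a per-query estimate to hourly and daily totals, in milligrams."""
    per_hour_mg = per_query_ug * calls_per_hour / 1000
    return {"per_hour_mg": per_hour_mg, "per_day_mg": per_hour_mg * 24}

# A single query scanning 20 billion objects, then that query a million
# times per hour, as in the example above:
one_query = estimate_query_carbon_ug(objects_scanned=20_000_000_000,
                                     kb_transferred=64)
print(one_query)
print(estimate_at_scale(one_query, calls_per_hour=1_000_000))
```

The point is less the numbers than the shape: once each operation has a cost function, budgets and per-call reporting fall out of simple arithmetic.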
But the most important thing that I feel passionate about is that when we give developers visibility, they do good things. I mean, when we give them good debugging tools, the code gets better, the code gets faster, the code gets more efficient. And Corey, you're in the business of helping people save money: when we give them good visibility into how much their code costs to run, they make the code more efficient. So we're doing the same thing with carbon. We know there's a cost to run your code—whether it's a function, a container, a query, what have you, every operation has a carbon cost. And we're on a mission to measure that and provide accurate tooling directly in our platform so that along with your debug lines, right, where you've got all these print statements that are spitting out stuff about what's happening there, we can also print out, you know, what it cost in carbon.

And you can set budgets. You can basically say, “Hey, I want my application to consume this much carbon.” And down the road, we'll have AI and ML models that will help us optimize your code to be able to fit within those carbon budgets, for example. I'm not a big fan of planting trees—you know, don't get me wrong, I love planting trees, but we live in California and those trees get burned down. And I was reading this heartbreaking story about how we returned a giant amount of carbon back into the atmosphere because a forest reserve that had been planted, you know, that was capturing carbon, essentially got burned down in a forest fire.
So, you know, we're trying to just basically say, let's try and reduce the amount of carbon, you know, that we potentially create, by having better tooling.

Corey: That would be amazing, and I think it also requires something that I guess acts almost as an exchange, where there's a centralized voice that can make sure that, well, one, the provider is being honest, and two, that you're doing an apples-to-apples comparison and not just discounting a whole lot of negative externalities. Because, yes, we're talking about carbon released into the environment. Okay, great. What about water effects where your data centers are located? That can have significant climate impact as well. It's about trying to avoid the picking and choosing. It's a hard, hard problem, but I'm unconvinced that there's anything more critical in the entire ecosystem right now to worry about.

Chetan: So as a startup, we care very deeply about starting with the carbon part. And I agree, Corey, it's a multi-dimensional problem; there are lots of tentacles. The hydrocarbon industry goes very deeply into all parts of our lives. I'm a startup; what do I know? I can't solve all of those things, but I wanted to start with the philosophy that if we provide developers with the right tooling, they'll have the right incentives to write better code. And as we open-source more of what we learn and, you know, our tooling, others will do the same. And I think in ten years, we might have better answers. But someone's got to start somewhere, and this is where we'd like to start.

Corey: I really want to thank you for taking as much time as you have for going through what you're up to and how you view the world. If people want to learn more, where's the best place to find you?

Chetan: Yes, so two things on that front. Go to www.macrometa.com—M-A-C-R-O-M-E-T-A dot com—and that's our website. And you can come and experience the full power of the platform.
We've got a playground where you can come, open an account, and build anything you want for free, and you can try and learn. You just can't run it in production because we've got a giant network, as I said, of 175 cities around the world. But there are tiers available for you to purchase and build and run apps. Something like 80 different customers, from some of the biggest telecom, retail, and e-tail companies in the world to [unintelligible 00:34:28] tiny startups, are building some interesting things on it.

And the second thing I want to talk about is November 7th through 11th of 2022, just a couple of weeks—or maybe by the time this recording comes out, a week—from now, which is developer week at Macrometa. And we're going to be announcing some really interesting new capabilities: new features like real-time complex event processing with ultra-low latency, data connectors, and a search feature that allows you to build search directly on top of your applications without needing to spin up a giant Elastic Cloud Search cluster, or to provide search locally and regionally so that, you know, you can have search running in 25 cities that are instant to search rather than sending all your search requests back to one location. There are all kinds of very cool things happening over there.

And we're also announcing a partnership with the original, the OG of the edge, one of the largest, most impressive, interesting CDN players, that has become a partner for us as well. And then we're also announcing some very interesting experimental work where you as a developer can build apps directly on the 5G telecom cloud as well. And then you'll hear from some interesting companies that are building apps that are edge-native, that are impossible to build in the cloud, because they take advantage of these three things that we talked about: geography, latency, and data protection in some very, very powerful ways.
So you'll hear actual customer case studies from real customers in the flesh, not anonymous BS, no marchitecture. It's a week of technical talks by developers, for developers. And so, you know, come and join the fun and let's learn all about the edge together, and let's go build something together that's impossible to do today.

Corey: And we will, of course, put links to that in the [show notes 00:36:06]. Thank you so much for being so generous with your time. I appreciate it.

Chetan: My pleasure, Corey. Like I said, you're one of my heroes. I've always loved your work. The Snark-as-a-Service is a trillion-dollar-market-cap company. If you're ever interested in taking that public, I know some investors that I'd happily put you in touch with. But—

Corey: Sadly, so many of those investors lack senses of humor.

Chetan: [laugh]. That is true. That is true [laugh].

Corey: [laugh]. [sigh].

Chetan: Well, thank you. Thanks again for having me.

Corey: Thank you. Chetan Venkatesh, CEO and co-founder at Macrometa. I'm Cloud Economist Corey Quinn and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice, along with an angry and insulting comment about why we should build everything on the cloud provider that you work for, and then attempt to challenge Chetan for the title of Edgelord.

Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.

Announcer: This has been a HumblePod production. Stay humble.
About Kevin
Kevin Miller is currently the global General Manager for Amazon Simple Storage Service (S3), an object storage service that offers industry-leading scalability, data availability, security, and performance. Prior to this role, Kevin has had multiple leadership roles within AWS, including as the General Manager for Amazon S3 Glacier, Director of Engineering for AWS Virtual Private Cloud, and engineering leader for AWS Virtual Private Network and AWS Direct Connect. Kevin was also Technical Advisor to the Senior Vice President for AWS Utility Computing. Kevin is a graduate of Carnegie Mellon University with a Bachelor of Science in Computer Science.

Links Referenced:
snark.cloud/shirt: https://snark.cloud/shirt
aws.amazon.com/s3: https://aws.amazon.com/s3

Transcript
Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.

Corey: This episode is brought to us in part by our friends at Datadog. Datadog is a SaaS monitoring and security platform that enables full-stack observability for modern infrastructure and applications at every scale. Datadog enables teams to see everything: dashboarding, alerting, application performance monitoring, infrastructure monitoring, UX monitoring, security monitoring, dog logos, and log management, in one tightly integrated platform. With 600-plus out-of-the-box integrations with technologies including all major cloud providers, databases, and web servers, Datadog allows you to aggregate all your data into one platform for seamless correlation, allowing teams to troubleshoot and collaborate together in one place, preventing downtime and enhancing performance and reliability.
Get started with a free 14-day trial by visiting datadoghq.com/screaminginthecloud, and get a free t-shirt after installing the agent.

Corey: Managing shards. Maintenance windows. Overprovisioning. ElastiCache bills. I know, I know. It's a spooky season and you're already shaking. It's time for caching to be simpler. Momento Serverless Cache lets you forget the backend to focus on good code and great user experiences. With true autoscaling and a pay-per-use pricing model, it makes caching easy. No matter your cloud provider, get going for free at gomomento.co/screaming. That's GO M-O-M-E-N-T-O dot co slash screaming.

Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. Right now, as I record this, we have just kicked off our annual charity t-shirt fundraiser. This year's shirt showcases S3 as the eighth wonder of the world. And here to either defend or argue the point—we're not quite sure yet—is Kevin Miller, AWS's vice president and general manager for Amazon S3. Kevin, thank you for agreeing to suffer the slings and arrows that are no doubt going to be interpreted, misinterpreted, et cetera, for the next half hour or so.

Kevin: Oh, Corey, thanks for having me. And happy to do that, and really flattered for you to be thinking about S3 in this way. So more than happy to chat with you.

Corey: It's absolutely one of those services that is foundational to the cloud. It was the first AWS service that was put into general availability, although the beta folks are going to argue back and forth about no, no, that was SQS instead. I feel like now that Mai-Lan handles both SQS and S3 as part of her portfolio, she is now the final arbiter of that. I'm sure that's an argument for a future day. But it's impossible to imagine cloud without S3.

Kevin: I definitely think that's true. It's hard to imagine cloud, actually, without many of our foundational services, including SQS, of course, but we are—yes, we were the first generally available service with S3.
And we're pretty happy with our anniversary being Pi Day, 3/14.

Corey: I'm also curious: your own personal trajectory has been not necessarily what folks would expect. You were the general manager of Amazon Glacier, and now you're the general manager and vice president of S3. So, I've got to ask, because there are conflicting reports on this depending upon what angle you look at: are Glacier and S3 the same thing?

Kevin: Yes, I was the general manager for S3 Glacier prior to coming over to S3 proper, and the answer is no, they are not the same thing. We certainly have a number of technologies that we're able to use both on S3 and Glacier, but there are certainly a number of things that are very distinct about Glacier and give us that ability to hit the ultra-low price points that we do, with Glacier Deep Archive being as low as $1 per terabyte-month. And so, there's a lot of actual ingenuity up and down the stack, from hardware to software, everywhere in between, to really achieve that with Glacier. But then there's other spots where S3 and Glacier have very similar needs, and then, of course, today many customers use Glacier through S3 as a storage class in S3, and so that's a great way to do that. So, there's definitely a lot of shared code, but certainly, when you get into it, there's [unintelligible 00:04:59] to both of them.

Corey: I ran a number of obnoxiously detailed financial analyses, and they all came away with: unless you have a very specific, very nuanced understanding of your data lifecycle and/or it is less than 30 or 60 days depending upon a variety of different things, the default S3 storage class you should be using for virtually anything is Intelligent Tiering. That is my purely economic analysis of it. Do you agree with that? Disagree with that?
And again, I understand that all of these storage classes are like your children, and while I am in effect inviting you to tell me which one of them is your favorite, I'm absolutely prepared for you to dodge that.

Kevin: Well, we love Intelligent Tiering because it is very simple; customers are able to automatically save money using Intelligent Tiering for data that's not being frequently accessed. And actually, since we launched it a few years ago, we've already saved customers more than $250 million using Intelligent Tiering. So, I would say today, it is our default recommendation in almost every case. I think that the cases where we would recommend another storage class as the primary storage class tend to be specific to the use case, and particularly to use cases where customers really have a good understanding of the access patterns. And we see some customers who, for a certain dataset, know that it's going to be heavily accessed for a fixed period of time, or that the data is actually for archival and will never be accessed, or very rarely if ever accessed, just maybe in an emergency.

In those kinds of use cases, I think customers are probably best off choosing one of the specific storage classes where they're paying the lower cost from day one. But again, I would say for the vast majority of cases that we see, the data access patterns are unpredictable and customers like the flexibility of being able to very quickly retrieve the data if they decide they need to use it. But in many cases, they'll save a lot of money as the data is not being accessed, and so Intelligent Tiering is a great choice for those cases.

Corey: I would take it a step further and say that even when customers believe that they are going to be doing a deeper analysis and have a better understanding of their data flow patterns than Intelligent Tiering would, in practice, I see that they rarely do anything about it.
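The Intelligent-Tiering economics discussed above can be sketched with back-of-the-envelope arithmetic. The prices below are approximate US East list prices at roughly the time of this episode and are assumptions here, not quoted from the conversation; check current AWS pricing before relying on them.

```python
# Rough monthly-cost comparison: S3 Standard vs. Intelligent-Tiering for
# data that mostly goes cold. All prices are assumed, approximate us-east-1
# figures (USD), for illustration only.
STANDARD_PER_GB_MONTH = 0.023
IT_FREQUENT_PER_GB_MONTH = 0.023         # first 30 days, before auto-tiering
IT_INFREQUENT_PER_GB_MONTH = 0.0125      # after 30 days without access
IT_MONITORING_PER_1000_OBJECTS = 0.0025  # per-object monitoring charge

def monthly_cost_standard(gb: float) -> float:
    return gb * STANDARD_PER_GB_MONTH

def monthly_cost_intelligent_tiering(gb: float, objects: int,
                                     frac_infrequent: float) -> float:
    """frac_infrequent: share of data that has gone >30 days without access."""
    storage = (gb * (1 - frac_infrequent) * IT_FREQUENT_PER_GB_MONTH
               + gb * frac_infrequent * IT_INFREQUENT_PER_GB_MONTH)
    monitoring = objects / 1000 * IT_MONITORING_PER_1000_OBJECTS
    return storage + monitoring

# 100 TB across 10 million objects, with 80% of the data cold:
gb = 100 * 1024
print(round(monthly_cost_standard(gb), 2))
print(round(monthly_cost_intelligent_tiering(gb, 10_000_000, 0.8), 2))
```

Even with the per-object monitoring fee included, the cold fraction dominates; that is the arithmetic behind "default to Intelligent Tiering unless you truly know your access pattern."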
It's one of those things where they're like, “Oh, yeah, we're going to set up our own lifecycle policies real soon now,” whereas if they just switch it over to Intelligent Tiering, they never have to think about it again. People's time is worth so much more than the infrastructure they're working on in almost every case. It doesn't seem to make a whole lot of sense unless you have a very intentional, very urgent reason to go and do that stuff by hand in most cases.

Kevin: Yeah, that's right. I think I agree with you, Corey. And certainly, that is the recommendation we lead with for customers.

Corey: In previous years, our charity t-shirt has focused on other areas of AWS, and one of them was based upon a joke that I've been telling for a while now, which is that the best database in the world is Route 53 and storing TXT records inside of it. I don't know if I ever mentioned this to you or not, but the first iteration of that joke centered around S3. The challenge that I had with it is that S3 Select is absolutely a thing, where you can query S3 with SQL, which I don't see people doing anymore because Athena is the easier, more, shall we say, well-articulated version of all of that. And no, no, that joke doesn't work because it's actually true. You can use S3 as a database. Does that statement fill you with dread? Regret? Am I misunderstanding something? Or are you effectively running a giant subversive database?

Kevin: Well, I think that certainly when most customers think about a database, they think about a collection of technology that's applied to given problems, and so I wouldn't count S3 as providing the whole range of functionality that would really make up a database. But I think that certainly a lot of the primitives, and S3 Select is a great example of a primitive, are available in S3. And we're looking at adding, you know, additional primitives going forward to make it possible to, you know, build a database around S3.
And as you see, other AWS services have done that in many ways. For example, obviously with Amazon Redshift having a lot of capability now to just directly access and use data in S3, and make that super seamless so that you can then run data-warehousing-type queries on top of S3 and on top of your other datasets.

So, I certainly think it's a great building block. And one other thing I would actually just say that you may not know, Corey, is that one of the things we've been doing a lot more with S3 over the last couple of years is working to directly contribute improvements to open-source connector software that uses S3, to make available automatically some of the performance improvements that can be achieved using both the AWS SDK and things like S3 Select. So, we started with a few of those things with Select; you're going to see more of that coming, most likely. And some of that, again, the idea there is you may not even necessarily know you're using Select, but when we can identify that it will improve performance, we're looking to be able to contribute those kinds of improvements directly—or we are contributing those directly—to those open-source packages. So, one thing I would definitely recommend customers and developers do is have a way of keeping that software up-to-date, because although it might seem like those are one-and-done kinds of software integrations, there's actually almost continuous improvement going on around things like that capability, and then others we come out with.

Corey: What surprised me is just how broadly S3 has been adopted by a wide variety of different client software packages out there. Back when I was running production environments in anger, I distinctly remember in one Ubuntu environment, we wound up installing a specific package that was designed to teach apt how to retrieve packages and updates from S3, which was awesome.
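For readers who haven't used the S3 Select capability discussed above, a minimal sketch with boto3 might look like the following. The bucket, key, and query are placeholders; the helper only assembles the request parameters, so it can be inspected without AWS credentials.

```python
# Minimal sketch of an S3 Select request over a CSV object. The helper builds
# the keyword arguments for boto3's select_object_content call; bucket, key,
# and SQL text are illustrative placeholders.

def build_select_request(bucket: str, key: str, sql: str) -> dict:
    """Assemble kwargs for s3.select_object_content against a CSV object."""
    return {
        "Bucket": bucket,
        "Key": key,
        "ExpressionType": "SQL",
        "Expression": sql,
        # Treat the first CSV row as a header so columns are addressable by name.
        "InputSerialization": {"CSV": {"FileHeaderInfo": "USE"}},
        "OutputSerialization": {"JSON": {}},
    }

request = build_select_request(
    "example-bucket",
    "logs/2022-10.csv",
    "SELECT COUNT(*) FROM S3Object s WHERE s.status = '500'",
)
print(sorted(request.keys()))

# With credentials configured, the call itself would look roughly like:
#   import boto3
#   s3 = boto3.client("s3")
#   for event in s3.select_object_content(**request)["Payload"]:
#       if "Records" in event:
#           print(event["Records"]["Payload"].decode())
```

Note that S3 Select supports only a subset of SQL (filters and aggregates over a single object), which is part of why Athena is the better fit for anything multi-object.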
I don't see that anymore, just because it seems that it is so easy to do it now, just with the native features that S3 offers, as well as an awful lot of software under the hood having learned to directly recognize S3 as its own thing and react accordingly.

Kevin: And just do the right thing. Exactly. No, we certainly see a lot of that. So that's, you know—I mean, obviously making that simple for end customers to use and achieve what they're trying to do, that's the whole goal.

Corey: It's always odd to me when I'm talking to one of my clients who is looking to understand and optimize their AWS bill to see outliers in either direction when it comes to S3 itself. When they're driving large S3 bills, as in a majority of their spend, it's, okay, that is very interesting. Let's dive into that. But almost more interesting to me is when it is effectively not being used at all. When, oh, we're doing everything with EBS volumes or EFS.

And again, those are fine services. I don't have any particular problem with them anymore, but the problem I have is that the cloud long ago took what amounts to an economic vote. There's a tax savings for storing data in an object store the way that you—and by extension, most of your competitors—wind up pricing this, versus the idea of a volume basis where you have to pre-provision things and you don't get any form of durability that extends beyond the availability zone boundary. It just becomes an awful lot of, “Well, you could do it this way. But it gets really expensive really quickly.”

It just feels wild to me that there is that level of variance between S3 on just a raw storage basis, economically, as well as then just the, frankly, ridiculous levels of durability and availability that you offer on top of that. How did you get there? Was the service just mispriced at the beginning? Like, oh, we dropped a zero and probably should have put that in there somewhere.

Kevin: Well, no, I wouldn't call it mispriced.
I think that S3 came about when we took a—we spent a lot of time looking at the architecture for storage systems, knowing that we wanted a system that would provide the durability that comes with having three completely independent data centers, and the elasticity and capability where, you know, customers don't have to provision the amount of storage they want; they can simply put data in and the system keeps growing. And they can also delete data and stop paying for that storage when they're not using it. And so, just all of that investment and looking at that architecture holistically led us down the path to where we are with S3.

And we've definitely talked about this. In fact, in Peter's keynote at re:Invent last year, we talked a little bit about how the system is designed under the hood, and one of the things you realize is that S3 gets a lot of the benefits that we do just from the overall scale. I think the stat is that at this point more than 10,000 customers have data that's stored on more than a million hard drives in S3. And the way you get that scale and capability is through massive parallelization. Customers that are, I would say, building more traditional architectures typically end up with inherently much more siloed architectures at a relatively small scale overall: a lot of resources provisioned in sort of small chunks, so that you never get to the scale where you can start to take advantage of the whole being more than the sum of the parts.

And so, I think that's what the recognition was when we started out building S3. And then, of course, we offer that as an API on top of that, where customers can consume whatever they want.
That is, I think, where S3, at the scale it operates, is able to do certain things, including on the economics, that are very difficult or even impossible to do at a much smaller scale.

Corey: One of the more egregious clown-shoe statements that I hear from time to time has been when people will come to me and say, “We've built a competitor to S3.” And my response is always one of those, “Oh, this should be good.” Because when people say that, they generally tend to be focusing on one or maybe two dimensions where something doesn't work for a particular use case as well as it could. “Okay, what was your story around why this should be compared to S3?” “Well, it's an object store. It has full S3 API compatibility.” “Does it really? Because I have to say, there are times where I'm not entirely convinced that S3 itself has full compatibility with the way that its API has been documented.”

And there's an awful lot of magic that goes into this too. “Okay, great. You're running an S3 competitor. Great. How many buildings does it live in?” Like, “Well, we have a problem with the s at the end of that word.” It's, “Okay, great. If it fits on my desk, it is not a viable S3 competitor. If it fits in a single zip code, it is probably not a viable S3 competitor.” Now, can it be an object store? Absolutely. Does it provide a new interface to some existing data someone might have? Sure, why not. But I think that “oh, it's S3 compatible” is something that gets tossed around far too lightly by folks who don't really understand what it is that drives S3 and makes it special.

Kevin: Yeah, I mean, I would say certainly there are a number of other implementations of the S3 API, and frankly, we're flattered that customers, competitors, and others recognize the simplicity of the API and go about implementing it.
But to your point, I think that there's a lot more; it's not just about the API, it's really about everything surrounding S3: from, as you mentioned, the fact that the data in S3 is stored in three independent availability zones, all of which are separated by kilometers from each other, to the resilience, the automatic failover, and the ability to withstand an unlikely impact to one of those facilities, as well as the scalability and, you know, the fact that we put a lot of time and effort into making sure that the service continues scaling with our customers' needs. And so, I think there's a lot more that goes into what is S3. And oftentimes a straight-up comparison is purely based on just the APIs, and generally a small set of APIs, without those intangibles around—or not intangibles, but all of the ‘-ilities,' right, the elasticity and the durability and so forth that I just talked about. In addition to all that, certainly what we're seeing from customers is that as they get into the petabyte, tens-of-petabytes, hundreds-of-petabytes scale, their need for the services that we provide to manage that storage, whether it's lifecycle and replication or things like our batch operations to help update and maintain all the storage, those become really essential to customers wrapping their arms around it, as well as visibility: things like Storage Lens to understand, what storage do I have? Who's using it? How is it being used?

And those are all things that we provide to help customers manage at scale. And certainly, you know, oftentimes when I see claims around S3 compatibility, a lot of those advanced features are nowhere to be seen.

Corey: I also want to call out that a few years ago, Mai-Lan got on stage and talked about how, to my recollection, you folks have effectively rebuilt S3 under the hood into, I think it was, 235 distinct microservices at the time. There will not be a quiz on numbers later, I'm assuming.
But what was wild to me about that is having done that for services that are orders of magnitude less complex, it absolutely is like changing the engine on a car without ever slowing down on the highway. Customers didn't know that any of this was happening until she got on stage and announced it. That is wild to me. I would have said before this happened that there was no way that would have been possible except it clearly was. I have to ask, how did you do that in the broad sense?Kevin: Well, it's true. A lot of the underlying infrastructure that's been part of S3, both hardware and software is, you know, you wouldn't—if someone from S3 in 2006 came and looked at the system today, they would probably be very disoriented in terms of understanding what was there because so much of it has changed. To answer your question, the long and short of it is a lot of testing. In fact, a lot of novel testing most recently, particularly with the use of formal logic and what we call automated reasoning. It's also something we've talked a fair bit about in re:Invent.And that is essentially where you prove the correctness of certain algorithms. And we've used that to spot some very interesting, the one-in-a-trillion type cases that S3 scale happens regularly, that you have to be ready for and you have to know how the system reacts, even in all those cases. I mean, I think one of our engineers did some calculations that, you know, the number of potential states for S3, sort of, exceeds the number of atoms in the universe or something so crazy. But yet, using methods like automated reasoning, we can test that state space, we can understand what the system will do, and have a lot of confidence as we begin to swap, you know, pieces of the system.And of course, nothing in S3 scale happens instantly. 
It's all, you know, I would say that for a typical engineering effort within S3, there's a certain amount of effort, obviously, in making the change or in preparing the new software, writing the new software and testing it, but there's almost an equal amount of time that goes into, okay, and what is the process for migrating from System A to System B, and that happens over a timescale of months, if not years, in some cases. And so, there's just a lot of diligence that goes into not just the new systems, but also the process of, you know, literally, how do I swap that engine on the system. So, you know, it's a lot of really hard working engineers that spent a lot of time working through these details every day.Corey: I still view S3 through the lens of it is one of the easiest ways in the world to wind up building a static web server because you basically stuff the website files into a bucket and then you check a box. So, it feels on some level though, that it is about as accurate as saying that S3 is a database. It can be used or misused or pressed into service in a whole bunch of different use cases. What have you seen from customers that has, I guess, taught you something you didn't expect to learn about your own service?Kevin: Oh, I'd say we have those [laugh] meetings pretty regularly when customers build their workloads and have unique patterns to it, whether it's the type of data they're retrieving and the access pattern on the data. You know, for example, some customers will make heavy use of our ability to do [ranged gets 00:22:47] on files and [unintelligible 00:22:48] objects. And that's pretty good capability, but that can be one where that's very much dependent on the type of file, right, certain files have structure, as far as you know, a header or footer, and that data is being accessed in a certain order. Oftentimes, those may also be multi-part objects, and so making use of the multi-part features to upload different chunks of a file in parallel. 
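The ranged-GET and multipart patterns Kevin describes can be sketched with boto3. The bucket and key names below are hypothetical, and `fetch_header()` assumes AWS credentials are configured (it is defined but not called); the pure helpers just compute the Range headers involved:

```python
# Sketch of the ranged-GET and multipart-read patterns described above.
# Bucket/key names are invented for illustration.

def byte_range(start: int, length: int) -> str:
    """HTTP Range header value for a ranged S3 GET, e.g. 'bytes=0-1023'."""
    return f"bytes={start}-{start + length - 1}"

def part_ranges(total_size: int, part_size: int) -> list[str]:
    """Split an object into Range strings, one per part, to fetch in parallel."""
    return [byte_range(offset, min(part_size, total_size - offset))
            for offset in range(0, total_size, part_size)]

def fetch_header(bucket: str, key: str, header_bytes: int = 1024) -> bytes:
    """Ranged GET: read only an object's leading bytes (say, a file header)."""
    import boto3  # deferred so the pure helpers above have no dependency
    s3 = boto3.client("s3")
    resp = s3.get_object(Bucket=bucket, Key=key,
                         Range=byte_range(0, header_bytes))
    return resp["Body"].read()
```

On the upload side, boto3's `upload_file` with a `TransferConfig` handles the parallel multipart upload automatically once an object crosses the configured size threshold, which matches the "upload different chunks of a file in parallel" pattern mentioned here.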
And you know, also certainly when customers get into things like our batch operations capability where they can literally write a Lambda function and do what they want, you know, we've seen some pretty interesting use cases where customers are running large-scale operations across, you know, billions, sometimes tens of billions of objects, and this can be pretty interesting as far as what they're able to do with them. So, for something that is, in some sense, as simple and basic as a GET and PUT API, all the capability around it ends up being pretty interesting as far as how customers apply it and the different workloads they run on it. Corey: So, if you squint hard enough, what I'm hearing you tell me is that I can view all of this as, “Oh, yeah. S3 is also compute.” And it feels like that's a fast track to getting a question wrong on one of the certification exams. But I have to ask, from your point of view, is S3 storage? And whether it's yes or no, what gets you excited about the space that it's in? Kevin: Yeah well, I would say S3 is not compute, but we have some great compute services that are very well integrated with S3, which excites me, as well as things like S3 Object Lambda, where we actually handle that integration with Lambda. So, you're writing Lambda functions, we're executing them on the GET path. And so, that's a pretty exciting feature for me. But you know, to sort of take a step back, what excites me is I think that customers around the world, in every industry, are really starting to recognize the value of data and data at large scale. 
You know, I think that actually many customers in the world have terabytes or more of data that sort of flows through their fingers every day without their even realizing it. And so, as customers realize what data they have, they can capture it, start to analyze it, and ultimately make better business decisions that really help drive their top line or help them reduce costs, whether it's in manufacturing or, you know, other things that they're doing. What really excites me is seeing those customers take the raw capability and apply it to transform not just how their business works, but even how they think about the business. Because in many cases, transformation is not just a technical transformation, it's people and cultural transformation inside these organizations. And that's pretty cool to see as it unfolds. Corey: One of the more interesting things that I've seen customers misunderstand, on some level, has been a number of S3 releases that focus around, “Oh, this is for your data lake.” And I've asked customers about that. “So, what's your data lake strategy?” “Well, we don't have one of those.” “You have, like, eight petabytes and climbing in S3? What do you call that?” It's like, “Oh, yeah, that's just a bunch of buckets we dump things into. Some are logs of our assets and the rest.” It's— Kevin: Right. Corey: Yeah, it feels like no one thinks of themselves as having anything remotely resembling a structured place for all of the data that accumulates at a company. Kevin: Mm-hm. Corey: There is an evolution of people learning that oh, yeah, this is in fact, what it is that we're doing, and this thing that they're talking about does apply to us. But it almost feels like a customer communication challenge, just because, I don't know about you, but with my legacy AWS account, I have dozens of buckets in there that I don't remember what the heck they're for. 
Fortunately, you folks don't charge by the bucket, so I can smile, nod, remain blissfully ignorant, but it does make me wonder from time to time.Kevin: Yeah, no, I think that what you hear there is actually pretty consistent with what the reality is for a lot of customers, which is in distributed organizations, I think that's bound to happen, you have different teams that are working to solve problems, and they are collecting data to analyze, they're creating result datasets and they're storing those datasets. And then, of course, priorities can shift, and you know, and there's not necessarily the day-to-day management around data that we might think would be expected. I feel [we 00:26:56] sort of drew an architecture on a whiteboard. And so, I think that's the reality we are in. And we will be in, largely forever.I mean, I think that at a smaller-scale, that's been happening for years. So, I think that, one, I think that there's a lot of capability just being in the cloud. At the very least, you can now start to wrap your arms around it, right, where used to be that it wasn't even possible to understand what all that data was because there's no way to centrally inventory it well. In AWS with S3, with inventory reports, you can get a list of all your storage and we are going to continue to add capability to help customers get their arms around what they have, first off; understand how it's being used—that's where things like Storage Lens really play a big role in understanding exactly what data is being accessed and not. 
We're definitely listening to customers carefully around this, and I think when you think about broader data management story, I think that's a place that we're spending a lot of time thinking right now about how do we help customers get their arms around it, make sure that they know what's the categorization of certain data, do I have some PII lurking here that I need to be very mindful of?And then how do I get to a world where I'm—you know, I won't say that it's ever going to look like the perfect whiteboard picture you might draw on the wall. I don't think that's really ever achievable, but I think certainly getting to a point where customers have a real solid understanding of what data they have and that the right controls are in place around all that data, yeah, I think that's directionally where I see us heading.Corey: As you look around how far the service has come, it feels like, on some level, that there were some, I guess, I don't want to say missteps, but things that you learned as you went along. Like, back when the service was in beta, for example, there was no per-request charge. To my understanding that was changed, in part because people were trying to use it as a file system, and wow, that suddenly caused a tremendous amount of load on some of the underlying systems. You originally launched with a BitTorrent endpoint as an option so that people could download through peer-to-peer approaches for large datasets and turned out that wasn't really the way the internet evolved, either. And I'm curious, if you were to have to somehow build this off from scratch, are there any other significant changes you would make in how the service was presented to customers in how people talked about it in the early days? 
Effectively given a mulligan, what would you do differently?Kevin: Well, I don't know, Corey, I mean, just given where it's grown to in macro terms, you know, I definitely would be worried taking a mulligan, you know, that I [laugh] would change the sort of the overarching trajectory. Certainly, I think there's a few features here and there where, for whatever reason, it was exciting at the time and really spoke to what customers at the time were thinking, but over time, you know, sort of quickly those needs move to something a little bit different. And, you know, like you said things like the BitTorrent support is one where, at some level, it seems like a great technical architecture for the internet, but certainly not something that we've seen dominate in the way things are done. Instead, you know, we've largely kind of have a world where there's a lot of caching layers, but it still ends up being largely client-server kind of connections. So, I don't think I would do a—I certainly wouldn't do a mulligan on any of the major functionality, and I think, you know, there's a few things in the details where obviously, we've learned what really works in the end. I think we learned that we wanted bucket names to really strictly conform to rules for DNS encoding. So, that was the change that was made at some point. And we would tweak that, but no major changes, certainly.Corey: One subject of some debate while we were designing this year's charity t-shirt—which, incidentally, if you're listening to this, you can pick up for yourself at snark.cloud/shirt—was the is S3 itself dependent upon S3? Because we know that every other service out there is as well, but it is interesting to come up with an idea of, “Oh, yeah. 
We're going to launch a whole new isolated region of S3 without S3 to lean on.” That feels like it's an almost impossible bootstrapping problem. Kevin: Well, S3 is not dependent on S3 to come up, and there's certainly a critical dependency tree that we look at and track, making sure we have an acyclic graph as we look at dependencies. Corey: That is such a sophisticated way to say what I learned the hard way when I was significantly younger and working in production environments: don't put the DNS servers needed to boot the hypervisor into VMs that require a working hypervisor. It's one of those oh, yeah, in hindsight, that makes perfect sense, but you learn it right after that knowledge really would have been useful. Kevin: Yeah, absolutely. And one of the terms we use for that is static stability; it's one of the techniques that can really help with isolating a dependency. We actually have an article about that in the Amazon Builder Library, where there's actually a bunch of really good articles from very experienced operations-focused engineers in AWS. So, static stability is one of those key techniques, but other techniques—I mean, just pure minimization of dependencies is one. And so, we were very, very thoughtful about that, particularly for that core layer. I mean, you know, when you talk about S3 with 200-plus microservices, or 235-plus microservices, I would say not all of those services are critical for every single request. Certainly, a small subset of those are required for every request, and then other services actually help manage and scale that inner core of services. And so, we look at dependencies on a service-by-service basis to really make sure that inner core is as minimized as possible. 
And then the outer layers can start to take some dependencies once you have that basic functionality up.Corey: I really want to thank you for being as generous with your time as you have been. If people want to learn more about you and about S3 itself, where should they go—after buying a t-shirt, of course.Kevin: Well, certainly buy the t-shirt. First, I love the t-shirts and the charity that you work with to do that. Obviously, for S3, it's aws.amazon.com/s3. And you can actually learn more about me. I have some YouTube videos, so you can search for me on YouTube and kind of get a sense of myself.Corey: We will put links to that into the show notes, of course. Thank you so much for being so generous with your time. I appreciate it.Kevin: Absolutely. Yeah. Glad to spend some time. Thanks for the questions, Corey.Corey: Kevin Miller, vice president and general manager for Amazon S3. I'm Cloud Economist Corey Quinn and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice along with an angry, ignorant comment talking about how your S3 compatible service is going to blow everyone's socks off when it fails.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.Announcer: This has been a HumblePod production. Stay humble.
About Victor
Victor is an Independent Senior Cloud Infrastructure Architect working mainly on Amazon Web Services (AWS), designing secure, scalable, reliable, and cost-effective cloud architectures and dealing with large-scale, mission-critical distributed systems. He also has long experience in Cloud Operations, Security Advisory, Security Hardening (DevSecOps), Modern Application Design, Microservices and Serverless, Infrastructure Refactoring, and Cost Saving (FinOps). Links Referenced: Zoph: https://zoph.io/ unusd.cloud: https://unusd.cloud Twitter: https://twitter.com/zoph LinkedIn: https://www.linkedin.com/in/grenuv/ Transcript Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud. Corey: This episode is brought to us in part by our friends at Datadog. Datadog is a SaaS monitoring and security platform that enables full-stack observability for developers, IT operations, security, and business teams in the cloud age. Datadog's platform, along with 500-plus vendor integrations, allows you to correlate metrics, traces, logs, and security signals across your applications, infrastructure, and third-party services in a single pane of glass. Combine these with drag-and-drop dashboards and machine-learning-based alerts to help teams troubleshoot and collaborate more effectively, prevent downtime, and enhance performance and reliability. Try Datadog in your environment today with a free 14-day trial and get a complimentary T-shirt when you install the agent. To learn more, visit datadoghq.com/screaminginthecloud. That's www.datadoghq.com/screaminginthecloud. Corey: Managing shards. Maintenance windows. Overprovisioning. ElastiCache bills. 
I know, I know. It's spooky season and you're already shaking. It's time for caching to be simpler. Momento Serverless Cache lets you forget the backend to focus on good code and great user experiences. With true autoscaling and a pay-per-use pricing model, it makes caching easy. No matter your cloud provider, get going for free at gomomento.co/screaming. That's GO M-O-M-E-N-T-O dot co slash screaming. Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. One of the best parts about running a podcast like this and trolling the internet of AWS things is every once in a while, I get to learn something radically different than what I expected. For a long time, there's been this sort of persona or brand in the AWS space, specifically the security side of it, going by Zoph—that's Z-O-P-H—and I just assumed it was a collective or a whole bunch of people working on things, and it turns out that nope, it is just one person. And that one person is my guest today. Victor Grenu is an independent AWS architect. Victor, thank you for joining me. Victor: Hey, Corey, thank you for having me. It's a pleasure to be here. Corey: So, I want to start by diving into the thing that first really put you on my radar, though I didn't realize it was you at the time. You have what can only be described as an army of Twitter bots around the AWS ecosystem. And I don't even know that I'm necessarily following all of them, but what are these bots and what do they do? Victor: Yeah. I have a few bots on Twitter that push some notifications, some tweets, when things happen in the AWS security space, especially when the AWS managed policies are updated by AWS. And it comes from an initial project from Scott Piper. He was running a Git command on his own laptop to track the history of AWS managed policies. And I realized that I could automate this thing using a deployment pipeline and so on, and tweet every time a new change is detected from AWS. 
So, the idea is to monitor every change on these policies. Corey: It's kind of wild because I built a number of somewhat similar Twitter bots, only instead of trying to make them into something useful, I'd make them into something more than a little bit horrifying and extraordinarily obnoxious. Like, there's a Cloud Boomer Twitter account that winds up tweeting every time Azure tweets something, only it quote-tweets them in all caps and says something insulting. I have an AWS releases bot called AWS Cwoud—so that's C-W-O-U-D—and that winds up converting it to OwO speak. It's like, “Yay a new auto-scawowing growp.” That sort of thing is obnoxious and offensive, but it makes me laugh. Yours, on the other hand, are things that I have notifications turned on for just because when they announce something, it's generally fairly important. The first one that I discovered was your IAM changes bot. And I found some terrifying things coming out of that from time to time. What's the data source for that? Because I'm just grabbing other people's Twitter feeds or RSS feeds; you're clearly going deeper than that. Victor: Yeah, the data source is the official AWS managed policies. In fact, I run the AWS CLI in the background, doing just the list-policies command, and with this list I'm doing a get of each policy that is returned, so I can commit it to a git repository and keep the full history over time. I also craft a list of deprecated policies, and I also run, like, a dog-fooding exercise: the policy validation analysis from AWS's own tools, to validate the consistency and the accuracy of their own policies. So, there is a policy validation with their own tool. [laugh]. Corey: You would think that wouldn't turn up anything because their policy validator effectively acts as a linter, so if it throws an error, of course, you wouldn't wind up pushing that. 
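The loop Victor describes can be sketched roughly as follows. The file layout and commit message are invented, and `sync_managed_policies()` needs AWS credentials plus an initialized git repository, so it is defined but not called here; the boto3 calls mirror the `list-policies` and per-policy get he mentions:

```python
# Rough sketch: list the AWS-managed IAM policies, fetch each default
# version, write it to disk deterministically, and commit any diff so the
# git history records every change over time.
import json

def policy_filename(arn: str) -> str:
    """Stable on-disk name derived from the policy ARN."""
    # e.g. arn:aws:iam::aws:policy/ReadOnlyAccess -> ReadOnlyAccess.json
    return arn.rsplit("/", 1)[-1] + ".json"

def canonical_json(document: dict) -> str:
    """Deterministic serialization, so git diffs reflect real changes only."""
    return json.dumps(document, indent=2, sort_keys=True) + "\n"

def sync_managed_policies() -> None:
    import subprocess
    import boto3

    iam = boto3.client("iam")
    for page in iam.get_paginator("list_policies").paginate(Scope="AWS"):
        for policy in page["Policies"]:
            version = iam.get_policy_version(
                PolicyArn=policy["Arn"],
                VersionId=policy["DefaultVersionId"])
            document = version["PolicyVersion"]["Document"]
            with open(policy_filename(policy["Arn"]), "w") as f:
                f.write(canonical_json(document))
    subprocess.run(["git", "add", "-A"], check=True)
    # git exits nonzero when there is nothing to commit; that's fine here.
    subprocess.run(["git", "commit", "-m", "AWS managed policy update"])
```

Run on a schedule in a deployment pipeline, `git log` then becomes the change feed the bot tweets from.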
And yet, somehow the fact that you have bothered to hook that up and have findings from it indicates that that's not how the real world works. Victor: Yeah, there are some, let's say, some false positives, because we are running the policy validation with their own linter on their own policies, but this is something that is documented by AWS. So, there is an official page where you can find which policies the linter does not work on and why. There is an explanation for each finding. I'm thinking of the [unintelligible 00:05:05] managed policy, which is too long, and the policy analyzer crashes because the policy is too long. Corey: Excellent. It's odd to me that you have gone down this path because it's easy enough to look at this and assume that, oh, this must just be something you do for fun or as an aspect of your day job. So, I did a little digging into what your day job is, and this rings very familiar to me: you are an independent AWS consultant, only you're based out of Paris, whereas I was doing this from San Francisco, due to an escalatingly poor series of life choices on my part. What do you focus on in the AWS consulting world? Victor: Yeah. I'm running an AWS consulting boutique in Paris and I'm working for large customers in France. And I'm doing mostly infrastructure stuff, infrastructure design for cloud-native applications, and I'm also doing some security audits and [unintelligible 00:06:07] mediation for my customers. Corey: It seems to me that there's a definite divide as far as how people find the AWS consulting experience to be. And I'm not trying to cast judgment here, but the stories that I hear tend to fall into one of two categories. One of them is the story that you have, where you're doing this independently, you've been on your own for a while working specifically on this, and then there's the stories of, “Oh, yeah, I work for a 500 person consultancy and we do everything as long as they'll pay us money. If they've got money, we'll do it. 
Why not?” And it always seems to me—not to be overly judgy—but the independent consultants just seem happier about it because for better or worse, we get to choose what we focus on in a way that I don't think you do at a larger company. Victor: Yeah. It's the same in France or in Europe; there are a lot of consulting firms. But with the pandemic and with the market where we are working, in the cloud, in cloud-native solutions and so on, there is a lot of demand. And the natural path is to start by working for a consulting firm and then, when you are ready, when you have many AWS certifications, when you have experience with customers, when you have a network of well-known customers, and you gain trust from your customers, I think it's natural to go out on your own, to be independent, and to choose your own projects and your own customers. Corey: I'm curious to get your take on what your perception of being an AWS consultant is when you're based in Paris versus, in my case, being based in the West Coast of the United States. And I know that's a bit of a strange question, but even when I travel, for example, over to the East Coast, suddenly, my own newsletter sends out three hours later in the day than I expect it to and that throws me for a loop. The AWS announcements don't come out at two or three in the afternoon; they come out at dinnertime. And for you, it must be in the middle of the night when a lot of those things wind up dropping. The AWS stuff, not my newsletter. I imagine you're not excitedly waiting on tenterhooks to see what this week's issue of Last Week in AWS talks about like I am. But I'm curious: even beyond that, how do you experience the market? What do you perceive people in the United States talking about as AWS consultants, versus what you see in Paris? Victor: It's difficult, but in fact, I don't have so much information about the independents in the US. I know that there are a lot, but I think it's more common in Europe. 
And yeah, it's an advantage to have a ten-hour time [unintelligible 00:08:56] from the US, because a lot of stuff happens on Pacific time, on the Seattle timezone, on the San Francisco timezone. So, for example, for this podcast, my Monday is over right now, so yeah, I have some advantage in time. Corey: This is potentially an odd question for you. But I find an awful lot of the AWS documentation to be challenging, we'll call it. I don't always understand exactly what it's trying to tell me, and it's not at all clear that the person writing the documentation about a service in some cases has ever used the service. And in everything I just said, there is no language barrier. This documentation was written—theoretically—in English and I, most days, can stumble through a sentence in English and almost no other language. You obviously speak French as a first language. Given that you live in Paris, it seems to be a relatively common affliction. How does interacting with AWS in French go? Or is it just a complete nonstarter, and it all has to happen in English for you? Victor: No, in fact, the consultants in Europe, I think—in fact, for my part, I'm using my laptop in English, I'm using my phone in English, I'm using the AWS console in English, and so on. So, for documentation, I switch to English first, because for the other languages there is sometimes automated translation that can be very dangerous, so we keep all the documentation and the materials in English. Corey: It's wild to me just looking at how challenging so much of the stuff is. Having to then work in a second language on top of that, it just seems almost insurmountable to me. It's good they have automated translation for a lot of this stuff, but that falls down in often hilariously disastrous ways, sometimes. 
It's wild to me that even taking most programming languages that folks have ever heard of, even if you program and speak no English, which happens in a large part of the world, you're still using if statements even if the term ‘if' doesn't mean anything to you localized in your language. It really is, in many respects, an English-centric industry. Victor: Yeah. Completely. Even in French, for our large French customers, I'm writing PowerPoint presentations in English; some emails are in English, even if all the folks in the thread are French. So yeah. Corey: One other area that I wanted to explore with you a bit is that you are very clearly focused on security as a primary area of interest. Does that manifest in the work that you do as well? Do you find that your consulting engagements tend to have a high degree of focus on security? Victor: Yeah. In my designs, when I'm doing some AWS architecture, my main objective is to design security architectures and security patterns that apply best practices and least privilege. But often, I'm working on engagements for security audits, for startups, for internal customers, for diverse companies, and then doing some remediation afterward. And to run my audits, I'm using some open-source tooling, some custom scripts, and so on. I have a methodology that I run for each customer. And the goal is sometimes to prepare for a certification, PCI DSS or so on, or maybe to ensure that best practices are correctly applied on a workload before go-live. Corey: One of the weird things about this to me is that I've said for a long time that cost and security tend to be inextricably linked, as far as being a sort of trailing reactive afterthought for an awful lot of companies. They care about both of those things right after they've failed to adequately care about those things. 
At least in the cloud economic space, it's only money as opposed to, “Oops, we accidentally lost our customers' data.” So, I always find myself drifting in a security direction if I don't stop myself, just based upon a lot of the cost work I do. Conversely, it seems that you have come from the security side and you find yourself drifting in a costing direction. Your side project is a SaaS offering called unusd.cloud, that's U-N-U-S-D dot cloud. And when you first mentioned this to me, my immediate reaction was, “Oh, great. Another SaaS platform for costing. Let's tear this one apart, too.” Except I actually like what you're building. Tell me about it. Victor: Yeah, unusd.cloud is a side project for me, and I've been working on it for, let's say, one year. It was a project that I'd deployed for some of my customers on their own accounts, and it was very useful. And so, I was thinking that it could be a SaaS project. So, I've worked [unintelligible 00:14:21], so yeah, a few months on shifting the product to a SaaS [unintelligible 00:14:27]. The product aims to detect waste on AWS accounts across all AWS regions: it scans all your AWS accounts and all your regions, and it tries to detect unused EC2, RDS, Glue [unintelligible 00:14:45], SageMaker, and so on, and unattached EBS volumes and so on. I don't craft a new dashboard, a new Cost Explorer, and so on. It's just cost awareness: a notification on email or Slack or Microsoft Teams. And you just add your AWS account to the project, you schedule it, let's say, once a day, it scans, and it sends you a cost awareness, a [unintelligible 00:15:17] detection, and you can act by turning off what is not used. 
When you need to spin something up and it's not there, you're very highly incentivized to spin that thing up. When you're not using it, you have to remember that thing exists, otherwise it just sort of sits there forever and doesn't do anything. It just costs money and doesn't generate any value in return for that. What you got right is you've also eviscerated my most common complaint about tools that claim to do this, which is you build in either an explicit rule of ignore this resource, or ignore resources with the following tags. The benefit there is that you're not constantly giving me useless advice, like, “Oh, yeah, turn off this idle thing.” It's, yeah, that's there for a reason, maybe it's my dev box, maybe it's my backup site, maybe it's the entire DR environment that I'm going to need on little notice. It solves for that problem beautifully. And though a lot of tools out there claim to do stuff like this, most of them really fail to deliver on that promise. Victor: Yeah, I just want to keep it simple. I don't want to add an additional console and so on. And you are correct. You can apply a simple tag on your asset, let's say an EC2 instance; you apply the tag in use with whatever value, and then the alerting is disabled for this asset. And the detection is based on the CPU [unintelligible 00:17:01] and the network metrics, so when the instance has not been used in the last seven days, with a low CPU every [unintelligible 00:17:10] and low network out, it comes up as a suspect. [laugh]. [midroll 00:17:17] Corey: One thing that I like about what you've done, but also have some reservations about, is that you have not done what so many of these tools do, which is, “Oh, just give us all the access in your account. It'll be fine. You can trust us. Don't you want to save money?” And yeah, but I also still want to have a company left when all is said and done. You are very specific on what it is that you're allowed to access, and it's great. 
I would argue, on some level, it's almost too restrictive. For example, you have the ability to look at EC2, Glue, IAM—just to look at account aliases, great—RDS, Redshift, and SageMaker. And all of these are simply list and describe. There are no gets in there, other than in Cost Explorer, which makes sense. You're not able to go rummaging through my data and see what's there. But that also bounds you, on some level, to being able to look only at particular types of resources. Is that accurate, or are you using a lot of the CloudWatch stuff and Cost Explorer stuff to see other areas?

Victor: In fact, it's least-privilege, read-only permissions, because I don't want too many questions from the security team. So, it's full read-only permission. And I've only added the detections that I currently support. Then if in some weeks, in some months, I'm adding a new detection, let's say for snapshots, for example, I will need to update it, so I will ask my customers to update their template. There is a mechanism inside the project to tell them that the template is obsolete, but it's not a breaking change.

So, the detection will continue, but without the new detection, the new snapshot detection, let's say. So yeah, it's least privilege, and all I need is get-metric-statistics from CloudWatch to detect unused assets. And also checking [unintelligible 00:19:16] Elastic IPs or [unintelligible 00:19:19] EBS volumes. So, there is no CloudWatch in those detections.

Corey: Also, to be clear, I am not suggesting that what you have done is at all a mistake, even if you bound it to those resources right now.
But just because everyone loves to talk about these exciting, amazing, high-level services that AWS has put up there, for example, oh, what about DocumentDB or all these other—you know, Amazon Basics MongoDB; same thing—or all of these other things that they wind up offering, when you take a look at where customers are spending money and where they're surprised to be spending money, it's EC2, it's a bit of RDS, occasionally it's S3. And with S3, it's a lot harder to detect automatically whether that data is unused. It's, “You haven't been using this data very much.” It's, “Well, you see how the bucket is labeled ‘Archive Backups' or ‘Regulatory Logs?'” Imagine that. What a ridiculous concept.

Yeah. Whereas an idle EC2 instance sort of can wind up being usefully flagged by this. I am curious whether you encounter in the wild, in your customer base, folks who have idle-looking EC2 instances but are, in fact, for example, using a whole bunch of RAM, which you can't tell from the outside without custom CloudWatch agents.

Victor: Yeah, I'm not detecting this behavior, for large usage of RAM, for example, or maybe there is some custom application that is low on CPU and doesn't talk to any other services using the network. But with the current state of the detection, I'm covering a large majority of waste, because what I see from my customers is that there are some teams, some data scientists or data teams, who are experimenting a lot with SageMaker, with Glue, with endpoints, and so on. And this is very expensive at the end of the day, because they don't turn off the light at the end of the day, on Friday evening.
So, what I'm trying to solve here is to notify the team—so, on Slack—when they forget to turn off the most common waste on AWS, so EC2, RDS, Redshift.

Corey: I just now wound up installing it while we've been talking, on my dedicated shitposting account, and sure enough, it already spat out a single instance it found, which, yeah, was an EC2 instance running on the East Coast from when I was just there, so that I had a DNS server that was a little bit more local. Okay, great. And it's a t4g.micro, so it's not exactly a whole lot of money, but it does exactly what it says on the tin. It didn't wind up nailing the other instances I have in that account that I'm using for a variety of different things, which is good.

And it further didn't wind up falling into the trap that so many things do, which is the, “Oh, it's costing you zero, and your spend this month is zero, because this account is where I dump all of my AWS credit codes.” So many things say, “Oh, well, it's not costing you anything, so what's the problem?” And that's how you accidentally lose $100,000 in Activate credits, because someone left something running way too long. It does a lot of the right things that I would hope and expect it to do, and the fact that you don't fall into that trap is kind of amazing.

Victor: Yeah. It was a need from my customers and an opportunity. It's a small bet for me because I'm trying to do some small bets, you know, the small-bets approach, so the idea is to try a new thing. It's also an excuse for me to learn something new, because building a SaaS is challenging.

Corey: One thing that I am curious about: in this account, I'm also running the controller for my home WiFi environment. And that's not huge; it's a t3.small. But it is still something out there that sits there because I need it to exist. But it's relatively bored. If I go back and look over the last week of CloudWatch metrics, for example, it doesn't look like it's usually busy.
I'm sure there's some network traffic in and out as it updates itself and whatnot, but the CPU peaks out at a little under 2% used. It didn't warn on this, and it got it right. I'm just curious as to how you did that. What is it looking for to determine whether this instance is unused or not?

Victor: It's the magic. [laugh]. There is some intelligence artif—no, I'm just kidding. It's just statistics. I'm getting two metrics, the CPU average from the last seven days, and the network out. And I'm getting the average on those metrics, and I'm making the assumption that this EC2, this specific EC2, is not used because of these metrics, this seven-day average.

Corey: Yeah, it is wild to me just that this is working as well as it is. It's just… like, it does exactly what I would expect it to do. It's clear—and this is going to sound weird, but I'm going to say it anyway—that this was built by someone who was looking to answer the question themselves, and not from the perspective of, “Well, we need to build a product, and we have access to all of this data from the API. How can we slice and dice it and add some value as we go?” I really like the approach that you've taken on this. I don't say that often or lightly, particularly when it comes to cloud costing stuff, but this is something I'll be using in some of my own nonsense.

Victor: Thanks. I appreciate it.

Corey: So, I really want to thank you for taking as much time as you have to talk about who you are and what you're up to. If people want to learn more, where can they find you?

Victor: Mainly on Twitter; my handle is @zoph. [laugh]. And, you know, on LinkedIn, or on my company website at zoph.io.

Corey: And we will, of course, put links to that in the [show notes 00:25:23]. Thank you so much for your time today. I really appreciate it.

Victor: Thank you, Corey, for having me. It was a pleasure to chat with you.

Corey: Victor Grenu, independent AWS architect.
I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice, along with an insulting comment that is going to cost you an absolute arm and a leg because, invariably, you're going to forget to turn it off when you're done.

Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.

Announcer: This has been a HumblePod production. Stay humble.
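The idle-instance heuristic Victor describes in this episode—seven-day averages of CPU and network-out pulled from CloudWatch's get-metric-statistics, plus an “in use” tag that suppresses alerting—can be sketched roughly as follows. This is a minimal sketch based only on what is said above: the thresholds, function names, and the exact way the two signals are combined are illustrative assumptions, not unusd.cloud's actual implementation.

```python
import datetime

# Sketch of the detection Victor describes: average CPUUtilization and
# NetworkOut over the last seven days, and flag the instance as a
# "suspect" only when both sit below a threshold. The threshold values
# here are illustrative guesses, not unusd.cloud's real numbers.

def is_suspect(cpu_avg_pct, net_out_avg_bytes,
               cpu_threshold_pct=2.0, net_out_threshold_bytes=5_000_000):
    """True when both seven-day averages look idle."""
    return (cpu_avg_pct < cpu_threshold_pct
            and net_out_avg_bytes < net_out_threshold_bytes)

def is_excluded(tags):
    """Victor's escape hatch: an 'in use' tag disables alerting."""
    return any(t.get("Key", "").strip().lower() == "in use"
               for t in (tags or []))

def weekly_average(cloudwatch, instance_id, metric_name):
    """Seven-day average of an EC2 metric via GetMetricStatistics.

    `cloudwatch` is a boto3 CloudWatch client; `metric_name` is
    "CPUUtilization" or "NetworkOut".
    """
    now = datetime.datetime.now(datetime.timezone.utc)
    resp = cloudwatch.get_metric_statistics(
        Namespace="AWS/EC2",
        MetricName=metric_name,
        Dimensions=[{"Name": "InstanceId", "Value": instance_id}],
        StartTime=now - datetime.timedelta(days=7),
        EndTime=now,
        Period=86400,          # one datapoint per day
        Statistics=["Average"],
    )
    points = resp.get("Datapoints", [])
    if not points:
        return None            # no data: treat as unknown, not idle
    return sum(p["Average"] for p in points) / len(points)
```

Note that requiring both signals to be low is consistent with Corey's WiFi-controller anecdote: a box averaging under 2% CPU can still escape flagging if its network-out average stays above the line.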
About Mike

Besides his duties as The Duckbill Group's CEO, Mike is the author of O'Reilly's Practical Monitoring, and previously wrote the Monitoring Weekly newsletter and hosted the Real World DevOps podcast. He was previously a DevOps Engineer for companies such as Taos Consulting, Peak Hosting, Oak Ridge National Laboratory, and many more. Mike is originally from Knoxville, TN (Go Vols!) and currently resides in Portland, OR.

Links Referenced:
@Mike_Julian: https://twitter.com/Mike_Julian
mikejulian.com: https://mikejulian.com
duckbillgroup.com: https://duckbillgroup.com

Transcript

Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.

Corey: This episode is sponsored in part by our friends at AWS AppConfig. Engineers love to solve, and occasionally create, problems. But not when it's an on-call fire drill at 4 in the morning. Software problems should drive innovation and collaboration, NOT stress, and sleeplessness, and threats of violence. That's why so many developers are realizing the value of AWS AppConfig Feature Flags. Feature Flags let developers push code to production, but hide that feature from customers so that the developers can release their feature when it's ready. This practice allows for safe, fast, and convenient software development. You can seamlessly incorporate AppConfig Feature Flags into your AWS or cloud environment and ship your features with excitement, not trepidation and fear. To get started, go to snark.cloud/appconfig. That's snark.cloud/appconfig.

Corey: Forget everything you know about SSH and try Tailscale. Imagine if you didn't need to manage PKI or rotate SSH keys every time someone leaves.
That'd be pretty sweet, wouldn't it? With Tailscale SSH, you can do exactly that. Tailscale gives each server and user device a node key to connect to its VPN, and it uses the same node key to authorize and authenticate SSH. Basically, you're SSHing the same way you manage access to your app. What's the benefit here? Built-in key rotation, permissions as code, connectivity between any two devices, reduced latency, and there's a lot more, but there's a time limit here. You can also ask users to reauthenticate for that extra bit of security. Sounds expensive? Nope, I wish it were. Tailscale is completely free for personal use on up to 20 devices. To learn more, visit snark.cloud/tailscale. Again, that's snark.cloud/tailscale.

Corey: Welcome to Screaming in the Cloud. I'm Cloud Economist Corey Quinn, and my guest is a returning guest on this show, my business partner and CEO of The Duckbill Group, Mike Julian. Mike, thanks for making the time.

Mike: Lucky number three, I believe?

Corey: Something like that, but numbers are hard. I have databases for that, of varying quality and appropriateness for the task, but it works out. Anything's a database if you're brave enough.

Mike: With you inviting me this many times, I'm starting to think you'd like me or something.

Corey: I know, I know. So, let's talk about something that is going to put that rumor to rest.

Mike: [laugh].

Corey: Clearly, you have made some poor choices in the course of your career, with being my business partner the obvious one. But in a dead heat for worst decision is that you've written a book previously. And now you are starting the process of writing another book because, I don't know, we don't keep you busy enough or something. What are you doing?

Mike: Making very bad decisions. When I finished writing Practical Monitoring—O'Reilly, and by the way, you should go buy a copy if you're interested in monitoring—I finished the book and said, “Wow, that was awful.
I'm never doing it again.” And about a month later, I started thinking of new books to write. So, that was 2017, and then Corey and I started Duckbill, and I kind of stopped thinking about writing books, because small companies are basically small children. But now I'm going to write a book about consulting.

Corey: Oh, thank God. I thought you were going to go down the observability path a second time.

Mike: You know, I'm actually dreading the day that O'Reilly asks me to do a second edition, because I don't really want to.

Corey: Yeah. Effectively turn it into an entire story where the only monitoring tool you really need is the AWS bill. That'll go well.

Mike: [laugh]. Yeah. So yeah, basically, I've been doing consulting for such a long time, and most of my career has been consulting in some form or fashion, and I head up all the consulting at Duckbill. I've learned a lot about consulting. And I've found that people have a lot of questions about consulting, particularly at the higher-end levels. Once you start getting into advisory sort of stuff, there's not a lot of great information out there aimed at engineers.

Corey: There's a bunch of different views on what consulting is. You have independent contractors billing by the hour as staff replacement who call what they do consulting; you have the big consultancies, like Bain or BCG; you've got what we do in an advisory sense; and of course, you have a bunch of MBA new grads going to a lot of the big consultancies who are going to see a book on consulting and think that it's potentially for them. I don't know that you necessarily have a lot of advice for the new-grad type, so who is this for? What is your target customer for this book?

Mike: If you're interested in joining McKinsey out of college, I don't have a lot to add; I don't have a lot to tell you. The reason for that is kind of twofold.
One is that shops like McKinsey and Deloitte and Accenture and BCG and Bain, all those, are playing very different games than what most of us think about when we think consulting. Their entire model revolves around running a process. And it's the same process for every client they work with. But, like, you're buying them because of their process. And that process is nothing new or novel. You don't go to those firms because you want the best advice possible. You go to those firms because it's the most defensible advice. It's sort of like, “No one gets fired for buying Cisco,” no one got fired for buying IBM, that sort of thing; it's a very defensible choice. But you're not going to get great results from it.

But because of that, their entire model revolves around throwing dozens, in some cases hundreds, of new grads at a problem and saying, “Run this process. Have fun. Let us know if you need help.” That's not consulting I have any experience with. It's honestly not consulting that most of us want to do. Most of that is staffed by MBAs and accountants.

When I think consulting, I think about specialized advice and providing that specialized advice to people. And I wager that most of us think about it in the same way, too. In some cases, it might just be, “I'm going to write code for you as a freelancer,” or, “I'm just going to tell you, hey, put the nail in here instead of over here, because it's going to be better for you.” Like, paying for advice is good.

But with that, one of the first things I say in the beginning of the book, which [laugh] I've already started writing because I'm a glutton for punishment, is that I don't think junior people should be consultants. I actually think it's a really bad idea, because to be a consultant, you have to have expertise in some area, and junior staff don't. They haven't been in their careers long enough to develop that yet. So, they're just going to flounder.
So, my advice is generally aimed at people that have been in their careers for quite some time, generally, people that are 10, 15, 20 years into their career, looking to do something.

Corey: One of the problems that we see whenever we talk about these things on Twitter is that we get an awful lot of people telling us that we're wrong, that it can't be made to work, et cetera, et cetera. But following this model, I've been independent for—well, I was independent, and then we became The Duckbill Group; add them together, because figuring out exactly where that divide happened is always a mental leap for me—but it's been six years at this point. We've definitely proven our ability to not go out of business every month. It's kind of amazing. Without even an exception case of, “That one time.”

Mike: [laugh]. Yeah, we are living proof that it does work, but you don't really have to take just our word for it, because there are a lot of other firms that exist entirely on an advisory-only, high-expertise model. And it works out really well. We've worked with several of them, so it does work; it just isn't very common inside of tech, and particularly inside of engineering.

Corey: So, one of the things that I find differentiates an expert from an enthusiastic amateur is, among other things, the number of mistakes that they've made. So, I guess a different way of asking this is, what qualifies you to write this book? But instead, I'm going to frame it in a very negative way: what have you screwed up on that puts you in a position of, “Ah, I'm going to write a book so that someone else can make better choices?”

Mike: One of my favorite stories to tell—and Corey, I actually think you might not have heard this story before—

Corey: That seems unlikely, but give it a shot.

Mike: Yeah. So, early in my career, I was working for a consulting firm that did ERP implementations. We worked with mainly large, old-school manufacturing firms.
So, my job there was to do the engineering side of the implementation: a lot of rack-and-stack, a lot of Windows Server configuration, a lot of pulling cables, that sort of thing. I thought I was pretty good at this. I quickly learned that I was actually not nearly as good as I thought I was.

Corey: A common affliction among many different people.

Mike: A common affliction. But I did not realize that until this one particular incident. So, my boss and I are both on site at this large manufacturing facility, and the CFO pulls my boss aside, and I can hear them talking, and, like, she's pretty upset. She points at me and says, “I never want this asshole in my office ever again.” So, he and I have a long drive back to our office, like an hour and a half, and we had a long chat about what that meant for me. I was not there for very long after that, as you might imagine.

But the thing is, I still have no idea to this day what I did to upset her. I know that she was pissed, and he knows that she was pissed. And he never told me exactly what it was, only that you take care of your client. And the client believed that I screwed up so massively that she wanted me fired. He didn't want to argue—he just kind of went with it—and put me on other clients.

But as a result of that, it really got me thinking: I had screwed something up so badly as to make this person hate me so much, and I still have no idea what it was that I did. Which tells me that, even at the time, I did not understand what was going on around me. I did not understand how to manage clients well, and to really take care of them.
That was probably the first really massive mistake that I made in my career—or, like, the first time I came to the realization that there's a whole lot I don't know, and it's really costing me.

Corey: From where I sit, there have been a number of things that we have done as we've built our consultancy, and I'm curious—you know, let's get this even more personal—in the past, well, we'll call it four years that we have been The Duckbill Group—which I think is right—what have we gotten right, and what have we gotten wrong? You are the expert; you're writing a book on this, for God's sake.

Mike: So, what I think we've gotten right: one of my core beliefs is never bill hourly. Shout out to Jonathan Stark; he wrote a really good book that is a much better explanation of that than I've ever been able to come up with. But I've always had the belief that billing hourly is just a bad idea, so we've never done that, and that's worked out really well for us. We've turned down work because that's the model they wanted, and it's like, “Sorry, that's not what we do. You're going to have to go work for someone else—or hire someone else.”

Other things that I think we've gotten right include a focus on staying on the advisory side and not doing any implementation. That's allowed us to get really good at what we do very quickly, because we don't get mired in long-term, implementation-detail-level projects. So, that's been great.

Where we went a little wrong, I think—or what we have gotten wrong, lessons that we've learned: I had this idea that we could build out a junior and mid-level staff and have them overseen by very senior people. And, as it turns out, that didn't work for us, entirely because it didn't work for me. That was really my failure. I went from being an IC to being the leader of a company in one single step. I had never been a manager before Duckbill.
So, that particular mistake was really about my lack of ability in being a good manager and being a good leader. Building that out did not work for us because it didn't work for me, and I didn't know how to do it. So, I made way too many mistakes that were kind of amateur-level stuff in terms of management. So, that didn't work.

And the other major mistake that I think we've made is not putting enough effort into marketing. We get most of our leads by inbound or referral, as is common with boutique consulting firms, but a lot of the income that we get comes through Last Week in AWS, which is really awesome. But we don't put a whole lot of effort into content or any marketing stuff related to the thing that we do, like cost management. I think a lot of that is just that we don't really know how, aside from just creating content and publishing it. We don't really understand how to market ourselves very well on that side of things. I think that's a mistake we've made.

Corey: It's an effective strategy against what's a very complicated problem, because unlike most things, if—let's go back to your old life—if we have an observability problem, we will talk about that very publicly on Twitter, and people will come over and go, “Hey, hey, have you tried to buy my company's product?” Or they'll offer consulting services, or they'll point us in the right direction, all of which is sometimes appreciated. Whereas when you have a big AWS bill, you generally don't talk about it in public, especially if you're a serious company, because that's going to, uh, I think the phrase is, “shake investor confidence,” when you're actually live-tweeting slash shitposting about your own AWS bill.
And our initial thesis was, therefore, since we can't wind up reaching out to these people when they're having the pain, because there's no external indication of it, instead what we have to do is be loud enough and notable in this space that they find us, where it shouldn't take more than them asking one or two of their friends before they get pointed to us. What's always fun is the stories we hear: “Okay, so I asked some other people because I wanted a second opinion, and they told us to go to you, too.” Word of mouth is where our customers come from. But how do you bootstrap that? I don't know. I'm lucky that I got it right the first time.

Mike: Yeah, and as I mentioned a minute ago, a lot of that really comes through your content, which is not really cost-management-related. It's much more AWS-broad. We don't put out a lot of cost-management-specific content. And honestly, I think that's to our detriment. We should, and we absolutely can. We just haven't. I think that's one of the really big things that we've missed on.

Corey: There's an argument that the people who come to us do not spend their entire day thinking about AWS bills. I mean, I can't imagine what that would be like, but they don't; for whatever reason, they're trying to do something ridiculous, like, you know, run a profitable company. So, getting in front of them when they're not thinking about the bills means, on some level, that they're going to reach out to us when the bill strikes. At least that's been my operating theory.

Mike: Yeah, I mean, this really just comes down to content strategy and broader marketing strategy. Because one of the things you have to think about with marketing is, how do you meet a customer at the time that they have the problem that you solve? And what most marketing people talk about here is what's called the triggering event. Something causes someone to take an action. What is that something?
Who is that someone, and what is that action? And for us, one of the things that we thought early on is that, well, the bill comes out the first week of the month, every month, so people are going to open the bill, freak out, and a big influx of leads is going to come our way, and that's going to happen every single month. The reality is that never happened. That, it turns out, was not a triggering event for anyone.

Corey: And early on, when we didn't have that many leads coming in, it was a statistical aberration that I thought I saw. Like, “Oh, out of the three leads this month, two of them showed up in the same day. Clearly, it's an AWS billing-day thing.” No. It turns out that every company's internal cadence is radically different.

Mike: Right. And I wish I could say that we have found what our triggering events are, but I actually don't think we have. We know who the people are and we know what they reach out for, but we haven't really uncovered that triggering event. And it could also be that there isn't one. Or at least, if there is one, it's not one that we could see externally, which is kind of fine.

Corey: Well, for the half of our consulting that does contract negotiation for large-scale commitments with AWS, when it comes up for renewal or the initial discount contract gets offered, those are very clear triggering events, but the challenge is that we don't—

Mike: You can't see them externally.

Corey: —really see that from the outside. Yeah.

Mike: Right. And this is one of those things where there are triggering events for basically everything, and it's probably going to be pretty consistent once you get down to specific services. Like, we provide cost optimization services and contract negotiation services. I'm willing to bet that I can predict pretty well exactly what the triggering events for both of those will be.
The problem is, you can never see those externally. Ideally, you would be able to, but you can't, so we roll with it, which means our entire strategy has revolved around always being top-of-mind, so that at the time it happens, we're already there. And that's a much more difficult strategy to employ, but it does work.

Corey: All it takes is time and being really lucky and being really prolific, and, and, and. It's one of those things where, if I were to set out to replicate it, I don't even know how I'd go about doing it.

Mike: People have been asking me. They say, “I want to create The Duckbill Group for X. What do I do?” And I say, “First step, get yourself a Corey Quinn.” And they're like, “Well, I can't do that. There's only one.” I'm like, “Yep. Sucks to be you.” [laugh].

Corey: Yeah, we called the Jerk Store. They're running out of him. Yeah, it's a problem. And I don't think the world needs a whole lot more of my type of humor, to be honest, because the failure mode that I have experienced, brutally and firsthand, is not that people don't find me funny; it's that it really hurts people's feelings. I have put significant effort into correcting those mistakes and not repeating them, but it sucks every time I get it wrong.

Mike: Yeah.

Corey: Another question I have for you around the book targeting: are you aiming this at individual independent consultants, or are you looking to advise people who are building agencies?

Mike: Explicitly not the latter. My framing around this is that there are a number of people who are doing consulting right now, and they've kind of fallen into it. Often, they'll leave one job and do a little consulting while they're waiting on their next thing. And in some cases, that might be a month or two. In some cases, it might go on for years, but that whole time, they're just like, “Oh, yeah, I'm doing consulting in between things.”

But at some point, some of them think, “You know what?
I want this to be my thing. I don't want there to be a next thing. This is my thing. So therefore, how do I get serious about doing consulting? How do I get serious about being a consultant?” And that's where I think I can add a lot of value, because casual consulting, like taking whatever work just kind of falls your way, is interesting for a while, but once you get serious about it, you have to start thinking: well, how do I actually deliver engagements? How do I do that consistently? How do I do it repeatedly? How do I do it profitably? How do I price my stuff? How do I package it? How do I attract the leads that I want? How do I work with the customers I want?

And turning that whole thing from a casual, “Yeah, whatever,” into, “This is my business,” is a very different way of thinking. And most people don't think that way, because they didn't really set out to build a business. They set out to just pass time and earn a little bit of money before they went off to the next job. So, the framing that I have here is that I'm aiming to help people who want to get serious about doing consulting, but who generally have experience doing it already.

Corey: Managing shards. Maintenance windows. Overprovisioning. ElastiCache bills. I know, I know. It's a spooky season and you're already shaking. It's time for caching to be simpler. Momento Serverless Cache lets you forget the backend to focus on good code and great user experiences. With true autoscaling and a pay-per-use pricing model, it makes caching easy. No matter your cloud provider, get going for free at gomomento.co/screaming. That's GO M-O-M-E-N-T-O dot co slash screaming.

Corey: We went from effectively being the two of us on the consulting delivery side to scaling up to, I believe, at one point, six of us, and now we have scaled back down to largely the two of us, aided by very specific external folk when it makes sense.

Mike: And don't forget April.

Corey: And of course.
I'm talking delivery.

Mike: [laugh].

Corey: There's a reason I—

Mike: Delivery. Yes.

Corey: —prefaced it that way. There's a lot of support structure here, let's not kid ourselves, and they make this entire place work. But why did we scale up? And then why did we scale down? Because I don't believe we've ever really talked about that publicly.

Mike: No, not publicly. In fact, most people probably didn't even notice that it happened. We got pretty big for—I mean, not big. So, we hit, I think, six full-time people at one point. And that was quite a bit.

Corey: On the delivery side. Let's be clear.

Mike: Yeah. No, I think actually with support structure, too. Like, if you add in everyone that we had with sales and marketing as well, we were like 11 people. And that was a pretty sizable company. But then in July this year, it kind of hit a point where I found that I just wasn't enjoying my job anymore. And I looked around and noticed that a lot of other people were kind of feeling the same way; it's just that things had gotten harder. And the business wasn't suffering at all, it was just that everything felt more difficult. And I finally realized that, for me personally at least, I started Duckbill because I love working with clients, I love doing consulting. And what I found is that as the company grew larger and larger, I spent most of my time keeping the trains running and taking care of the staff. Which is exactly what I should be doing when we're that size, like, that is my job at that size, but I didn't actually enjoy it.

I went into management, going from having never done it before, so I didn't have anything to compare it to. I didn't know if I would like it or not. And once I got here, I realized I actually don't. And I spent a lot of effort getting better at it, and I think I did; I've been working with a leadership coach for years now. But it finally came to a point where I just realized that I wasn't actually enjoying it anymore.
I wasn't enjoying the job that I had created. And I think that really panned out to you as well. So, we decided, we had kind of an opportune time where one of our team decided that they were also wanting to go back to do independent consulting. I'm like, “Well, this is actually pretty good time. Why don't we just start scaling things back?” And like, maybe we'll scale it up again in the future; maybe we won't. But like, let's just buy ourselves some breathing room.Corey: One of the things that I think we didn't spend quite enough time really asking ourselves was what kind of place do we want to work at. Because we've explicitly stated that you and I both view this as the last job either of us is ever going to have, which means that we're not trying to do the get big quickly to get acquired, or we want to raise a whole bunch of other people's money to scale massively. Those aren't things either of us enjoy. And it turns out that handling the challenges of a business with as many people working here as we had wasn't what either one of us really wanted to do.Mike: Yeah. You know what—[laugh] it's funny because a lot of our advisors kept asking the same thing. Like, “So, what kind of company do you want?” And like, we had some pretty good answers for that, in that we didn't want to build a VC-backed company, we didn't ever want to be hyperscale. But there's a wide gulf of things between two-person company and hyperscale and we didn't really think too much about that.In fact, being a ten-person company is very different than being a three-person company, and we didn't really think about that either. We should have really put a lot more thought into that of what does it mean to be a ten-person company, and is that what we want? Or is three, four, or five-person more our style? 
But then again, I don't know that we could have predicted that as a concern had we not tried it first.Corey: Yeah, that was very much something that, for better or worse, we pay advisors for their advice—that's kind of definitionally how it works—and then we ignored it, on some level, though we thought we were doing something different at the time because there's some lessons you've just got to learn by making the mistake yourself.Mike: Yeah, we definitely made a few of those. [laugh].Corey: And it's been an interesting ride and I've got zero problem with how things have shaken out. I like what we do quite a bit. And honestly, the biggest fear I've got going forward is that my jackass business partner is about to distract the hell out of himself by writing a book, which is never as easy as even the most pessimistic estimates would be. So, that's going to be awesome and fun.Mike: Yeah, just wait until you see the dedication page.Corey: Yeah, I wasn't mentioned at all in the last book that you wrote, which I found personally offensive. So, if I'm not mentioned this time, you're fired.Mike: Oh, no, you are. It's just I'm also adding an anti-dedication page, which just has a photo of you.Corey: Oh, wonderful, wonderful. This is going to be one of those stories of the good consultant and the bad consultant, and I'm going to be the Goofus to your Gallant, aren't I?Mike: [laugh]. Yes, yes. You are.Corey: “Goofus wants to bill by the hour.”Mike: It's going to have a page of, like, “Here's this [unintelligible 00:25:05] book is dedicated to. Here's my acknowledgments. And [BLEEP] this guy.”Corey: I love it. I absolutely love it. I think that there is definitely a bright future for telling other people how to consult properly. May just suggest as a subtitle for the book is Consulting—subtitle—You Have Problems and Money. We'll Take Both.Mike: [laugh]. Yeah. My working title for this is Practical Consulting, but only because my previous book was Practical Monitoring. 
Pretty sure O'Reilly would have a fit if I did that. I actually have no idea what I'm going to call the book, still.Corey: Naming things is super hard. I would suggest asking people at AWS who name services and then doing the exact opposite of whatever they suggest. Like, take their list of recommendations and sort by reverse order and that'll get you started.Mike: Yeah. [laugh].Corey: I want to thank you for giving us an update on what you're working on and why you have less hair every time I see you because you're mostly ripping it out due to self-inflicted pain. If people want to follow your adventures, where's the best place to keep updated on this ridiculous, ridiculous nonsense that I cannot talk you out of?Mike: Two places. You can follow me on Twitter, @Mike_Julian, or you can sign up for the newsletter on my site at mikejulian.com where I'll be posting all the updates.Corey: Excellent. And I look forward to skewering the living hell out of them.Mike: I look forward to ignoring them.Corey: Thank you, Mike. It is always a pleasure.Mike: Thank you, Corey.Corey: Mike Julian, CEO at The Duckbill Group, and my unwilling best friend. I'm Cloud Economist Corey Quinn and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice along with an angry, annoying comment in which you tell us exactly what our problem is, and then charge us a fixed fee to fix that problem.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.Announcer: This has been a HumblePod production. Stay humble.
About Richard
Richard "RichiH" Hartmann is the Director of Community at Grafana Labs, Prometheus team member, OpenMetrics founder, OpenTelemetry member, CNCF Technical Advisory Group Observability chair, CNCF Technical Oversight Committee member, CNCF Governing Board member, and more. He also leads, organizes, or helps run various conferences from hundreds to 18,000 attendees, including KubeCon, PromCon, FOSDEM, DENOG, DebConf, and Chaos Communication Congress. In the past, he made mainframe databases work, ISP backbones run, kept the largest IRC network on Earth running, and designed and built a datacenter from scratch. Go through his talks, podcasts, interviews, and articles at https://github.com/RichiH/talks or follow him on Twitter at https://twitter.com/TwitchiH for musings on the intersection of technology and society.Links Referenced: Grafana Labs: https://grafana.com/ Twitter: https://twitter.com/TwitchiH Richard Hartmann list of talks: https://github.com/richih/talks TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: This episode is sponsored in part by our friends at AWS AppConfig. Engineers love to solve, and occasionally create, problems. But not when it's an on-call fire-drill at 4 in the morning. Software problems should drive innovation and collaboration, NOT stress, and sleeplessness, and threats of violence. That's why so many developers are realizing the value of AWS AppConfig Feature Flags. Feature Flags let developers push code to production, but hide that feature from customers so that the developers can release their feature when it's ready. 
This practice allows for safe, fast, and convenient software development. You can seamlessly incorporate AppConfig Feature Flags into your AWS or cloud environment and ship your features with excitement, not trepidation and fear. To get started, go to snark.cloud/appconfig. That's snark.cloud/appconfig.Corey: This episode is brought to us in part by our friends at Datadog. Datadog is a SaaS monitoring and security platform that enables full stack observability for developers, IT operations, security, and business teams in the cloud age. Datadog's platform, along with 500 plus vendor integrations, allows you to correlate metrics, traces, logs, and security signals across your applications, infrastructure, and third party services in a single pane of glass.Combine these with drag and drop dashboards and machine learning based alerts to help teams troubleshoot and collaborate more effectively, prevent downtime, and enhance performance and reliability. Try Datadog in your environment today with a free 14 day trial and get a complimentary T-shirt when you install the agent.To learn more, visit datadoghq.com/screaminginthecloud to get started. That's www.datadoghq.com/screaminginthecloudCorey: Welcome to Screaming in the Cloud, I'm Corey Quinn. There are an awful lot of people who are incredibly good at understanding the ins and outs and the intricacies of the observability world. But they didn't have time to come on the show today. Instead, I am talking to my dear friend of two decades now, Richard Hartmann, better known on the internet as RichiH, who is the Director of Community at Grafana Labs, here to suffer—in a somewhat atypical departure for the theme of this show—personal attacks for once. Richie, thank you for joining me.Richard: And thank you for agreeing on personal attacks.Corey: Exactly. It was one of your riders. Like, there have to be the personal attacks back and forth or you refuse to appear on the show. You've been on before. 
In fact, the last time we did a recording, I believe you were here in person, which was a long time ago. What have you been up to?You're still at Grafana Labs. And in many cases, I would point out that, wow, you've been there for many years; that seems to be an atypical thing, which is an American tech industry perspective because every time you and I talk about this, you look at folks who—wow, you were only at that company for five years. What's wrong with you—you tend to take the longer view and I tend to have the fast twitch, time to go ahead and leave jobs because it's been more than 20 minutes approach. I see that you're continuing to live what you preach, though. How's it been?Richard: Yeah, so there's a little bit of Covid brains, I think. When we talked in 2018, I was still working at SpaceNet, building a data center. But the last two-and-a-half years didn't really happen for many people, myself included. So, I guess [laugh] that includes you.Corey: No, no you're right. You've only been at Grafana Labs a couple of years. One would think I would check the notes before shooting my mouth off. But then, one wouldn't know me.Richard: What notes? Anyway, I've been around Prometheus and Grafana since 2015. But it's like, real, full-time everything is 2020. There was something in between. Since 2018, I contracted to do vulnerability handling and everything for Grafana Labs because they had something and they didn't know how to deal with it.But no, full time is 2020. But as to the space in the [unintelligible 00:02:45] of itself, it's maybe a little bit German of me, but trying to understand the real world and trying to get an overview of systems and how they actually work, and if they are working correctly and as intended, and if not, how they're not working as intended, and how to fix this is something which has always been super important to me, in part because I just want to understand the world. 
And this is a really, really good way to automate understanding of the world. So, it's basically a work-saving mechanism. And that's why I've been sticking to it for so long, I guess.Corey: Back in the early days of monitoring systems—so we called it monitoring back then because, you know, using simple words that lack nuance was sort of de rigueur back then—we wound up effectively having tools. Nagios is the one that springs to mind, and it was terrible in all the ways you would expect a tool written in janky Perl in the early-2000s to be. But it told you what was going on. It tried to do a thing, generally reach a server or query it about things, and when things fell out of certain specs, it screamed its head off, which meant that when you had things like the core switch melting down—thinking of one very particular incident—you didn't get a Nagios alert; you got 4000 Nagios alerts. But start to finish, you could wrap your head rather fully around what Nagios did and why it did the sometimes strange things that it did.These days, when you take a look at Prometheus, which we hear a lot about, particularly in the Kubernetes space, and Grafana, which is often mentioned in the same breath, it's never been quite clear to me exactly where those start and stop. It always feels like it's a component in a larger system to tell you what's going on rather than a one-stop shop that's going to, you know, shriek its head off when something breaks in the middle of the night. Is that the right way to think about it? The wrong way to think about it?Richard: It's a way to think about it. So personally, I use the terms monitoring and observability pretty much interchangeably. Observability is a relatively well-defined term, even though most people won't agree. But if you look back into the '70s into control theory where the term is coming from, it is the measure of how much you're able to determine the internal state of a system by looking at its inputs and its outputs. 
Depending on the definition, some people don't include the inputs, but that is the OG definition as far as I'm aware.And from this, there flow a lot of things. This question of—or this interpretation of the difference between telling that, yes, something's broken versus why something's broken. Or if you can't ask new questions on the fly, it's not observability. Like all of those things are fundamentally mapped to this definition of, I need enough data to determine the internal state of whatever system I have just by looking at what is coming in, what is going out. And that is at the core the thing. Now, obviously, it's become a buzzword, which is oftentimes the fate of successful things. So, it's become a buzzword, and you end up with cargo culting.Corey: I would argue periodically, that observability is hipster monitoring. If you call it monitoring, you get yelled at by Charity Majors. Which is tongue-in-cheek, but she has opinions, made, nonetheless, shall I say, frustrating by the fact that she is invariably correct in those opinions, which just somehow makes it so much worse. It would be easy to dismiss things she says if she weren't always right. And the world is changing, especially as we get into the world of distributed systems.“Is the server that runs the app working or not working” loses meaning when we're talking about distributed systems, when we're talking about containers running on top of Kubernetes, which turns every outage into a murder mystery. We start having distributed applications composed of microservices, so you have no idea necessarily where an issue is. Okay, is this one microservice having an issue related to the request coming into a completely separate microservice? 
And it seems that for those types of applications, the answer has been tracing for a long time now, where originally that was something that felt like it was sprung, fully-formed from the forehead of some God known as one of the hyperscalers, but now is available to basically everyone, in theory.In practice, it seems that instrumenting applications is still one of the hardest parts of all of this. I tried hooking up one of my own applications to be observed via OTEL, the OpenTelemetry project, and it turns out that right now, OTEL and AWS Lambda have an intersection point that makes everything extremely difficult to work with. It's not there yet; it's not baked yet. And someday, I hope that changes because I would love to interchangeably just throw metrics and traces and logs to all the different observability tools and see which ones work, which ones don't, but that still feels very far away from the current state of the art.Richard: Before we go there, maybe one thing which I don't fully agree with. You said that previously, you were told if a service is up or down, that's the thing which you cared about, and I don't think that's what people actually cared about. At that time, also, what they fundamentally cared about: is the user-facing service up, or down, or impacted? Is it slow? Does it return errors for X percent of requests, something like this?Corey: Is the site up? And—you're right, I was hand-waving over a whole bunch of things. It was, “Okay. First, the web server is returning a page, yes or no? Great. Can I ping the server?” Okay, well, there are ways a server can crash and still leave enough of the TCP/IP stack up or it can respond to pings and do little else.And then you start adding things to it. But the Nagios thing that I always wanted to add—and had to—was, is the disk full? And that was annoying. And, on some level, like, why should I care in the modern era how much stuff is on the disk because storage is cheap and free and plentiful? 
The problem is, after the third outage in a month because the disk filled up, you start to not have a good answer for well, why aren't you monitoring whether the disk is full?And that was one of the contributors to taking down the server. When the website broke, there were what felt like a relatively small number of reasonably well-understood contributors to that at small to midsize applications, which is what I'm talking about, the only things that people would let me touch. I wasn't running hyperscale stuff where you have a fleet of 10,000 web servers and, “Is the server up?” Yeah, in that scenario, no one cares. But when we're talking about the database server and the two application servers and the four web servers talking to them, you think about it more in terms of pets than you do cattle.Richard: Yes, absolutely. Yet, I think that was a mistake back then, and I tried to do it differently, as a specific example with the disk. And I'm absolutely agreeing that previous generation tools limit you in how you can actually work with your data. In particular, once you're with metrics where you can do actual math on the data, it doesn't matter if the disk is almost full. It matters if that disk is going to be full within X amount of time.If that disk is 98% full and it sits there at 98% for ten years and provides the service, no one cares. The thing is, will it actually run out in the next two hours, in the next five hours, what have you. Depending on this, if it is currently or imminently customer-impacting or user-impacting, then yes, alert on it, raise hell, wake people, make them fix it, as opposed to this thing can be dealt with during business hours on the next workday. And you don't have to wake anyone up.Corey: Yeah. The big filer with massive amounts of storage has crossed the 70% line. Okay, now it's time to start thinking about that, what do you want to do? Maybe it's time to order another shelf of disks for it, which is going to take some time. 
That's a radically different scenario than the 20 gigabyte root volume on your server just started filling up dramatically; the rate of change is such that it'll be full in 20 minutes.Yeah, one of those is something you want to wake people up for. Generally speaking, you don't want to wake people up for what is fundamentally a longer-term strategic business problem. That can be sorted out in the light of day versus, “[laugh] we're not going to be making money in two hours, so if I don't wake up and fix this now.” That's the kind of thing you generally want to be woken up for. Well, let's be honest, you don't want that to happen at all, but if it does happen, you kind of want to know in advance rather than after the fact.Richard: You're literally describing linear predict from Prometheus, which is precisely for this, where I can look back over X amount of time and make a linear prediction because everything else breaks down at scale, blah, blah, blah, to detail. But the thing is, I can draw a line with my pencil by hand on my data and I can predict when this thing is going to hit it. Which is obviously precisely correct if I have a TLS certificate. It's a little bit more hand-wavy when it's a disk. But still, you can look into the future and you say, “What will be happening if current trends for the last X amount of time continue in Y amount of time.” And that's precisely a thing where you get this more powerful ability of doing math with your data.Corey: See, when you say it like that, it sounds like it actually is a whole term of art, where you're focusing on an in-depth field, where salaries are astronomical. Whereas the tools that I had to talk about this stuff back in the day made me sound like, effectively, the sysadmin that I was grunting and pointing: “This is gonna fill up.” And that is how I thought about it. 
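The “linear predict” Richard mentions is Prometheus's `predict_linear()` function: fit a straight line to recent usage samples and extrapolate when the disk hits capacity. Here is a minimal sketch of that idea in plain Python; the function name and sample format are invented for illustration and are not a Prometheus API.

```python
def seconds_until_full(samples, capacity_bytes):
    """samples: list of (timestamp_seconds, bytes_used) tuples.
    Returns the estimated seconds from the last sample until usage
    reaches capacity, or None if usage is flat or shrinking."""
    n = len(samples)
    mean_t = sum(t for t, _ in samples) / n
    mean_u = sum(u for _, u in samples) / n
    # Ordinary least-squares slope: growth rate in bytes per second.
    num = sum((t - mean_t) * (u - mean_u) for t, u in samples)
    den = sum((t - mean_t) ** 2 for t, _ in samples)
    slope = num / den
    if slope <= 0:
        # A disk sitting at 98% for ten years never triggers this alert.
        return None
    intercept = mean_u - slope * mean_t
    last_t = samples[-1][0]
    # Solve capacity = slope * t + intercept for t, relative to the last sample.
    return (capacity_bytes - intercept) / slope - last_t

# Usage: a 1 GB volume gaining 1 MB/s, sampled once a minute for five minutes.
samples = [(t, 500_000_000 + t * 1_000_000) for t in range(0, 300, 60)]
eta = seconds_until_full(samples, 1_000_000_000)
```

The alerting decision the two of them describe then becomes a simple comparison: page someone only if `eta` is under, say, two hours; otherwise file it for business hours.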
And this is the challenge where it's easy to think about these things in narrow, defined contexts like that, but at scale, things break.Like the idea of anomaly detection. Well, okay, great: if normally the CPU and these things are super bored and suddenly it gets really busy, that's atypical. Maybe we should look into it, assuming that it is a challenge. The problem is, that is a lot harder than it sounds because there are so many factors that factor into it. And as soon as you have something, quote-unquote, “Intelligent,” making decisions on this, it doesn't take too many false positives before you start ignoring everything it has to say, and missing legitimate things. It's this weird and obnoxious conflation of both hard technical problems and human psychology.Richard: And the breaking up of old service boundaries. Of course, when you say microservices, and such, fundamentally, functionally a microservice or nanoservice, picoservice—but the pendulum is already swinging back to larger units of complexity—but it fundamentally does not make any difference if I have a monolith on some mainframe or if I have a bunch of microservices. Yes, I can scale differently, I can scale horizontally a lot more easily, vertically, it's a little bit harder, blah, blah, blah, but fundamentally, the logic and the complexity, which is being packaged is fundamentally the same. More users, everything, but it is fundamentally the same. What's happening again, and again, is I'm breaking up those old boundaries, which means the old tools which have assumptions built in about certain aspects of how I can actually get an overview of a system just start breaking down. When my complexity unit, or my service, or what have you, is congruent with a physical piece of hardware, or several services are congruent with that piece of hardware, it absolutely makes sense to think about things in terms of this one physical server. 
The fact that you have different considerations in cloud, and microservices, and blah, blah, blah, is not inherently that it is more complex.On the contrary, it is fundamentally the same thing. It scales with users, everything, but it is fundamentally the same thing, but I have different boundaries of where I put interfaces onto my complexity, which basically allow me to hide all of this complexity from the downstream users.Corey: That's part of the challenge that I think we're grappling with across this entire industry from start to finish. Where we originally looked at these things and could reason about it because it's the computer and I know how those things work. Well, kind of, but okay, sure. But then we start layering levels of complexity on top of layers of complexity on top of layers of complexity, and suddenly, when things stop working the way that we expect, it can be very challenging to unpack and understand why. One of the ways I got into this whole space was understanding, to some degree, of how system calls work, of how the kernel wound up interacting with userspace, about how Linux systems worked from start to finish. And these days, that isn't particularly necessary most of the time for the care and feeding of applications.The challenge is when things start breaking, suddenly having that in my back pocket to pull out could be extremely handy. But I don't think it's nearly as central as it once was and I don't know that I would necessarily advise someone new to this space to spend a few years as a systems person, digging into a lot of those aspects. And this is why you need to know what inodes are and how they work. Not really, not anymore. It's not front and center the way that it once was, in most environments, at least in the world that I live in. Agree? Disagree?Richard: Agreed. But it's very much unsurprising. 
You probably can't tell me how to precisely grow sugar cane or corn, you can't tell me how to refine the sugar out of it, but you can absolutely bake a cake. But you will not be able to tell me even a third of—and I'm—for the record, I'm also not able to tell you even a third about the supply chain which just goes from I have a field and some seeds and I need to have a package of refined sugar—you're absolutely enabled to do any of this. The thing is, you've been part of the previous generation of infrastructure where you know how this underlying infrastructure works, so you have more ability to reason about this, but it's not needed for cloud services nearly as much.You need different types of skill sets, but that doesn't mean the old skill set is completely useless, at least not as of right now. It's much more a case of you need fewer of those people and you need them in different places because those things have become infrastructure. Which is basically the cloud play, where a lot of this is just becoming infrastructure more and more.Corey: Oh, yeah. Back then I distinctly remember my elders looking down their noses at me because I didn't know assembly, and how could I possibly consider myself a competent systems admin if I didn't at least have a working knowledge of assembly? Or at least C, which I, over time, learned enough about to know that I didn't want to be a C programmer. And you're right, this is the value of cloud and going back to those days getting a web server up and running just to compile Apache's httpd took a week and an in-depth knowledge of GCC flags.And then in time, oh, great. We're going to have rpm or debs. Great, okay, then in time, you have apt, if you're in the deb land because I know you are a Debian developer, but over in Red Hat land, we had yum and other tools. And then in time, it became oh, we can just use something like Puppet or Chef to wind up ensuring that thing is installed. And then oh, just docker run. 
And now it's a checkbox in a web console for S3.These things get easier with time and step by step by step we're standing on the shoulders of giants. Even in the last ten years of my career, I used to have a great challenge question that I would interview people with of, “Do you know what TinyURL is? It takes a short URL and then expands it to a longer one. Great, on the whiteboard, tell me how you would implement that.” And you could go up one side and down the other, and then you could add constraints, multiple data centers, now one goes offline, how do you not lose data? Et cetera, et cetera.But these days, there are so many ways to do that using cloud services that it almost becomes trivial. It's okay, multiple data centers, API Gateway, a Lambda, and a global DynamoDB table. Now, what? “Well, now it gets slow. Why is it getting slow?”“Well, in that scenario, probably because of something underlying the cloud provider.” “And so now, you lose an entire AWS region. How do you handle that?” “Seems to me when that happens, the entire internet's kind of broken. Do people really need longer URLs?”And that is a valid answer, in many cases. The question doesn't really work without a whole bunch of additional constraints that make it sound fake. And that's not a weakness. That is the fact that computers and cloud services have never been as accessible as they are now. And that's a win for everyone.Richard: There's one aspect of accessibility which is actually decreasing—or two. A, you need to pay for them on an ongoing basis. And B, you need an internet connection which is suitably fast, low latency, what have you. And those are things which actually do make things harder for a variety of reasons. If I look at our back-end systems—as in Grafana—all of them have single binary modes where you literally compile everything into a single binary and you can run it on your laptop because if you're stuck on a plane, you can't do any work on it. 
That kind of is not the best of situations.And if you have a huge CI/CD pipeline, everything in the cloud is fine and dandy, but your internet breaks. Yeah, so I do agree that it is becoming generally more accessible. I disagree that it is becoming more accessible along all possible axes.Corey: I would agree. There is a silver lining to that as well, where yes, they are fraught and dangerous and I would preface this with a whole bunch of warnings, but from a cost perspective, all of the cloud providers do have a free tier offering where you can kick the tires on a lot of these things in return for no money. Surprisingly, the best one of those is Oracle Cloud where they have an unlimited free tier, use whatever you want in this subset of services, and you will never be charged a dime. As opposed to the AWS model of free tier where well, okay, it suddenly got very popular or you misconfigured something, and surprise, you now owe us enough money to buy Belize. That doesn't usually lead to a great customer experience.But you're right, you can't get away from needing an internet connection of at least some level of stability and throughput in order for a lot of these things to work. The stuff you would do locally on a Raspberry Pi, for example, if you're budget-constrained and want to get something out there, or your laptop. Great, that's not going to work in the same way as a full-on cloud service will.Richard: It's not free unless you have hard guarantees that you're not going to ever pay anything. It's fine to send warnings, it's fine to switch the thing off, it's fine to have you hit random hard and soft quotas. It is not a free service if you can't guarantee that it is free.Corey: I agree with you. I think that there needs to be a free offering where, “Well, okay, you want us to suddenly stop serving traffic to the world?” “Yes. 
When the alternative is you have to start charging me through the nose, yes, I want you to stop serving traffic.” That is definitionally what it says on the tin.And as an independent learner, that is what I want. Conversely, if I'm an enterprise, yeah, I don't care about money; we're running our Super Bowl ad right now, so whatever you do, don't stop serving traffic. Charge us all the money. And there's been a lot of hand-wringing about, well, how do we figure out which direction to go in? And it's, have you considered asking the customer?So, on a scale of one to bank, how serious is this account going to be [laugh]? Like, what are your big concerns: never charge me or never go down? Because we can build for either of those. Just let's make sure that all of those expectations are aligned. Because if you guess, you're going to get it wrong and then no one's going to like you.Mike: [laugh].Richard: I would argue this. All those services from all cloud providers are actually built to address both of those. It's a deliberate choice not to offer certain aspects.Corey: Absolutely. When I talk to AWS, like, “Yeah, but there is an eventual consistency challenge in the billing system where it takes”—as anyone who's looked at the billing system can see—“Multiple days, sometimes for usage data to show up. So, how would we be able to stop things if the usage starts climbing?” To which my relatively direct response is, that sounds like a huge problem. I don't know how you'd fix that, but I do know that if suddenly you decide, as a matter of policy, to okay, if you're in the free tier, we will not charge you, or even we will not charge you more than $20 a month.So, you build yourself some headroom, great. And anything that people are able to spin up, well, you're just going to have to eat the cost as a provider. I somehow suspect that would get fixed super quickly if that were the constraint. 
The fact that it isn't is a conscious choice.

Richard: Absolutely.

Corey: And the reason I'm so passionate about this, about the free space, is not because I want to get a bunch of things for free. I assure you I do not. I mean, I spend my life fixing AWS bills and looking at AWS pricing, and my argument is very rarely, “It's too expensive.” It's that the billing dimension is hard to predict, or doesn't align with a customer's experience, or prices a service out of a bunch of use cases where it'll be great. But very rarely do I just sit here shaking my fist and saying, “It costs too much.”

The problem is when you scare the living crap out of a student with a surprise bill that's more than their entire college tuition. Even if you waive it a week or so later, do you think they're ever going to be as excited as they once were to go and use cloud services and build things for themselves and see what's possible? I mean, you and I met on IRC 20 years ago because back in those days, the failure mode and the financial risk were extremely low. Yeah, the biggest concern that I had back then when I was doing some of my Linux experimentation was that if I typed the wrong thing, I was going to break my laptop. And yeah, that happened once or twice, and I learned not to make those same kinds of mistakes, or to put guardrails in so the blast radius was smaller, or to use a remote system instead. Yeah, someone else's computer that I can destroy. Wonderful. But that was all “we live and we learn” as we were coming up. There was never an opportunity for us, to my understanding, to wind up accidentally running up an $8 million charge.

Richard: Absolutely. And psychological safety is one of the most important things in what most people do. We are social animals. Without this psychological safety, you're not going to have long-term, self-sustaining groups. You will not make someone really excited about it. There are two basic ways to sell: trust or force. Those are the only two. 
There's none else.

Corey: Managing shards. Maintenance windows. Overprovisioning. ElastiCache bills. I know, I know. It's a spooky season and you're already shaking. It's time for caching to be simpler. Momento Serverless Cache lets you forget the backend to focus on good code and great user experiences. With true autoscaling and a pay-per-use pricing model, it makes caching easy. No matter your cloud provider, get going for free at gomomento.co/screaming. That's G-O M-O-M-E-N-T-O dot co slash screaming.

Corey: Yeah. And it also looks ridiculous. I was talking to someone somewhat recently who's used to spending four bucks a month on their AWS bill for some S3 stuff. Great. Good for them. That's awesome. Their credentials got compromised. Yes, that is on them to some extent. Okay, great.

But now, after six days, they were told that they owed $360,000 to AWS. And I don't know how, as a cloud company, you can sit there and ask a student to pay that. That is not a realistic thing. They are what is known, in the United States at least, in the world of civil litigation, as quote-unquote, “judgment proof,” which means, great, you could wind up finding that someone owes you $20 billion. Most of the time, they don't have that, so you're not able to recoup it. Yeah, the judgment feels good, but you're never going to see it.

That's the problem with something like that. It's, yeah, I would declare bankruptcy long before, as a student, I wound up paying that kind of money. And I don't hear any stories about them releasing the collection agency hounds against people in that scenario. But I couldn't guarantee that. I would never urge someone to ignore that bill and see what happens.

And it's such an off-putting thing that, from my perspective, is beneath the company. And let's be clear, I see this behavior at times on Google Cloud, and I see it on Azure as well. This is not something that is unique to AWS, but they are the 800-pound gorilla in the space, and that's important. 
Or, as I'll just mention right now, because I was about to give you crap for this, too: if I go to grafana.com, it says, and I quote, “Play around with the Grafana Stack. Experience Grafana for yourself, no registration or installation needed.”

Good. I was about to yell at you if it was, “Oh, just give us your credit card and go ahead and start spinning things up and we won't charge you. Honest.” Even your free account does not require a credit card; you're doing it right. That tells me that I'm not going to get a giant surprise bill.

Richard: You have no idea how much thought and work went into our free offering. There was a lot of math involved.

Corey: None of this is easy, I want to be very clear on that. Pricing is one of the hardest things to get right, especially in cloud. And when you get it right, it doesn't look like it was that hard for you to do. But I fix [sigh] people's AWS bills for a living, and still, five or six years in, one of the hardest things I still wrestle with is pricing engagements. It's incredibly nuanced, incredibly challenging, and at least for services in the cloud space where you're doing usage-based billing, that becomes a problem.

But glancing at your pricing page, you do hit the two things that are incredibly important to me. The first one is: use something for free. As an added bonus, you can use it forever. And I can get started with it right now. Great. When I go and look at your pricing page, or I want to use your product, and it tells me to ‘click here to contact us,' that tells me it's an enterprise sales cycle, it's got to be really expensive, and I'm not solving my problem tonight.

Whereas on the other side of it, the enterprise offering needs to be ‘contact us,' and you do that. That speaks to the enterprise procurement people who don't know how to sign a check that doesn't have two commas in it, and they want to have custom terms and all the rest, and they're prepared to pay for that. 
If you don't have that, you look too small-time. Then it doesn't matter what price you put on it; you wind up offering your enterprise tier at some large number, and yeah, for some companies, that's a small number. You don't necessarily want to box yourself in, depending upon what the specific needs are. You've gotten that right.

Every common criticism that I have about pricing, you folks have gotten right. And I definitely can pick up on your fingerprints on a lot of this. Because it sounds like a weird thing to say of, “Well, he's the Director of Community, why would he weigh in on pricing?” It's, “I don't think you understand what community is when you ask that question.”

Richard: Yes, I fully agree. It's super important to get pricing right, or to get many things right. And usually the things which just feel naturally correct are the ones which took the most effort and the most time and everything. And yes, at least from the—like, I was in those conversations or part of them, and the one thing which was always clear is: when we say it's free, it must be free. When we say it is forever free, it must be forever free. No games, no lies; do what you say and say what you do, basically.

We have things where initially you get certain pro features, and you can keep paying and keep using them, or after X amount of time they go away. Things like these are built in because that's what people want. They want to play around with the whole thing and see, hey, is this actually providing me value? Do I want to pay for this feature which is nice, or this and that plugin, or what have you? And yeah, you're also absolutely right that once you leave these constraints of basically self-serve cloud, you are talking about bespoke deals, but you're also talking about, okay, let's sit down, let's actually understand what your business is: what are your business problems? What are you going to solve today? 
What are you trying to solve tomorrow?

Let us find a way of actually supporting you and investing in a mutual partnership, and not just grab the money and run. We have extremely low churn for, I would say, pretty good reasons. Because this thing about our users, our customers, being successful, we do take it extremely seriously.

Corey: It's one of those areas that I just can't shake the feeling is underappreciated industry-wide. And the reason I say that these are your fingerprints on it is because if this had been wrong… you have a lot of, we'll call them idiosyncrasies, where there are certain things you absolutely will not stand for, and misleading people and tricking them into paying money is high on that list. One of the reasons we're friends. So yeah, when I say I see your fingerprints on this, it's: if this hadn't been worked out the way that it is, you would not still be there. One other thing that I wanted to call out, well, I guess it's a confluence of pricing and logging and the rest: I look at your free tier, and it offers up to 50 gigabytes of ingest a month.

And it's easy for me to sit here and compare that to other services, other tools, and other logging stories, and then I have to stop and think for a minute that, yeah, disks have gotten way bigger, and internet connections have gotten way faster, and even the logs have gotten way wordier. I still am not sure that most people can really contextualize just how much logging fits into 50 gigs of data. Do you have any, I guess, ballpark examples of what that looks like? Because it's been long enough since I've been playing in these waters that I can't really contextualize it anymore.

Richard: The Lord of the Rings is roughly five megabytes. It's actually less. So, we're talking literally 10,000 Lord of the Rings which you can just shove into us, and we're just storing this for you. Which also tells you that you're not going to be reading any of this. Or some of it, yes, but not all of it. 
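Richard's ballpark is easy to check. A quick sketch of the arithmetic; the ~5 MB size for the Lord of the Rings text is his rough figure from the conversation, not an exact measurement:

```python
# Grafana Cloud's free tier, per the episode: 50 GB of log ingest a month.
free_tier_ingest = 50 * 1024**3   # 50 GiB, in bytes
lotr_size = 5 * 1024**2           # ~5 MiB: Richard's rough size for the text

copies = free_tier_ingest // lotr_size
print(copies)  # → 10240, i.e. roughly 10,000 copies of Lord of the Rings
```

Even with decimal units (50 * 10^9 / 5 * 10^6) the answer is the same order of magnitude, which is the point: 50 gigs of ingest is far more text than anyone will ever read.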
You need better tooling, and you need proper tooling.

And some of this is more modern. Some of this is where we actually pushed the state of the art. But I'm also biased. But I, for myself, do claim that we did push the state of the art here. But at the same time, you come back to those absolute fundamentals of how humans deal with data.

If you look back basically as far as we have writing—literally 6,000 years ago is the oldest writing—humans have always dealt with information, with the state of the world, in very specific ways. A: is it important enough to even write it down, to even persist it in whatever persistence mechanisms I have at my disposal? If yes, write a detailed account, or record a detailed account, of whatever the thing is. But it turns out this is expensive, and it's not what you need. So, over time, you optimize towards only noting down key events, maybe with their interconnections, but fundamentally the key events.

As your data grows, as you have more stuff, as this still is important to your business and keeps being more important to—or it doesn't even need to be a business; it can be social, can be whatever—whatever thing it is, it becomes expensive, again, to retain all of those key events. So, you turn them into numbers, and you can do actual math on them. And that's this path which you've seen again, and again, and again, and again, throughout humanity's history. Literally, as long as we have written records, this has played out again, and again, and again, and again, for every single field which humans actually cared about. At different times; like, power networks are way ahead of this. Fundamentally, power networks work on metrics, but for transient load spikes and everything, they have logs built into their power measurement devices, though those are few and far between. Of course, the main thing is just metrics, time-series. 
And you see this again, and again.

You also were a sysadmin in internet-related work; all switches have been metrics-based or metrics-first for basically forever, for 20, 30 years. But that stands to reason. Of course, the internet is running roughly 20 years ahead of the cloud, scale-wise, because obviously you need the internet first, as otherwise you wouldn't be having a cloud. So, all of those growing pains, the reason why metrics are all of a sudden the thing (or have been for a few years now), is basically, of course, that people writing software and providing their own software services hit the scaling limitations which internet service providers hit two decades, three decades ago. But fundamentally, you have this complete system: basically profiles, or distributed tracing, depending on how you view distributed tracing.

You can also argue that distributed tracing is key events which are linked to each other. Logs sit firmly in the key event thing, and then you turn this into numbers, and that is metrics. And that's basically it. You have extremes at the end where you can have valid, depending on your circumstances, engineering trade-offs of where you invest the most, but fundamentally, that is why those always appear again in humanity's dealing with data, and observability is no different.

Corey: I take a look at last month's AWS bill. Mine is pretty well optimized. It's a bit over 500 bucks. And right around 150 of that is various forms of logging and detecting change in the environment. And on the one hand, I sit here, and I think, “Oh, I should optimize that,” because the value of those logs to me is zero.

Except that whenever I have to go in and diagnose something or respond to an incident or have some forensic exploration, they then are worth an awful lot. And I am prepared to pay 150 bucks a month for that because the potential value of having that when the time comes is going to be extraordinarily useful. 
And it basically just feels like a tax on top of what it is that I'm doing. The same thing happens with application observability, where, yeah, you just want the big substantial stuff, until you're trying to diagnose something. But in some cases, yeah, okay, then crank up the verbosity and then look for it.

But if you're trying to figure it out after an event that isn't likely to recur, or hopefully won't recur, you're going to wish that you spent a little bit more on collecting data out of it. You're always going to be wrong, you're always going to be unhappy, on some level.

Richard: Ish. You could absolutely be optimizing this. I mean, for $500, it's probably not worth your time unless you take it as an exercise, but outside of due diligence where you need specific logs tied to—or specific events tied to specific times, I would argue that a lot of the problems with logs is just dealing with them wrong. You have this one extreme of full-text indexing everything, and you have this other extreme of a data lake—which is just a euphemism for never looking at the data again—to keep storage vendors happy. There is an in-between.

Again, I'm biased, but, for example, with Loki, you have those same label sets as you have on your metrics with Prometheus—you have literally the same—which means you only index that part, and you only extract at ingestion time. If you don't have structured logs yet, extract only the metadata about whatever you care about, put it into your label set, and store this; that's the only thing you index. But it goes further than just this. You can also turn those logs into metrics.

And to me, this is a path of optimization. Where previously I logged this and that error: okay, fine, but it's just a log line telling me it's an HTTP 500. No one cares that this is at this precise time. 
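The indexing model Richard describes for Loki, where only the small label set is indexed and line content is merely scanned at query time, can be illustrated with a toy store. This is a deliberately simplified sketch, not Loki's actual implementation, and the class and method names are made up for illustration:

```python
from collections import defaultdict

class ToyLogStore:
    """Index only the small, bounded label set; keep log lines unindexed."""

    def __init__(self):
        # frozenset of label pairs -> list of raw log lines (the "stream")
        self.streams = defaultdict(list)

    def ingest(self, labels, line):
        # Only the label set acts as an index key; nothing in the line
        # content is indexed at ingestion time.
        self.streams[frozenset(labels.items())].append(line)

    def query(self, selector, contains=""):
        wanted = set(selector.items())
        for key, lines in self.streams.items():
            if wanted <= key:              # label match: cheap, via the index
                for line in lines:
                    if contains in line:   # content match: scan at query time
                        yield line

store = ToyLogStore()
store.ingest({"app": "api", "env": "prod"}, 'GET /checkout HTTP/1.1 500')
store.ingest({"app": "api", "env": "prod"}, 'GET /health HTTP/1.1 200')
store.ingest({"app": "web", "env": "prod"}, 'GET / HTTP/1.1 200')

print(list(store.query({"app": "api"}, contains=" 500")))
# → ['GET /checkout HTTP/1.1 500']
```

The trade-off this sketches is the one Richard is pointing at: the index stays tiny because label sets are bounded, and the expensive part, scanning line content, only happens over the streams the labels already narrowed down.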
Log levels are also basically an anti-pattern, because they're just trying to deal with the amount of data which I have and get a handle on it at that level, whereas it would be much easier if I just counted: every time I have an HTTP 500, I just up my counter by one. And again, and again, and again.

And all of a sudden, I have literally—and I did the math on this—over 99.8% of the data which I have to store just go away. It's just magic. And we're only talking about the first time I'm hitting this log line; the second time I'm hitting this log line is functionally free if I turn this into metrics. It becomes cheap enough that it's one of the mantras which I have if you need to onboard your developers on modern observability, blah, blah, blah, the whole bells and whistles. Usually people have logs; like, that's what they have, unless they were from ISPs or power companies or so, where they usually start with metrics.

But most users which I see, both with my Grafana and with my Prometheus [unintelligible 00:38:46] tend to start with logs. They have issues with those logs because they're basically unstructured and useless, and you need to first make them useful to some extent. But then you can leverage this, and instead of having a debug statement, just put a counter. Every single time you think, “Hey, maybe I should put a debug statement,” just put a counter instead. In two months' time, see if it was worth it, or if you delete that line and just remove that counter.

It's so much cheaper. You can just throw this on and have it run for a week or a month or whatever timeframe, and done. But it goes beyond this, because all of a sudden, if I can turn my logs into metrics properly, I can start rewriting my alerts on those metrics. I can actually persist those metrics and can more aggressively throw my logs away. 
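Richard's "over 99.8%" figure is easy to sanity-check with rough numbers. The byte counts below are illustrative assumptions, not Grafana's measurements: ~150 bytes per structured log line, versus one counter series costing some label overhead once plus a couple of bytes per compressed, scraped sample:

```python
EVENTS = 10_000                 # HTTP 500s over one hour (assumed)
BYTES_PER_LOG_LINE = 150        # timestamp + level + message + metadata (rough)

log_bytes = EVENTS * BYTES_PER_LOG_LINE        # log every occurrence

# One counter series instead: ~100 bytes of series/label overhead once,
# plus ~2 bytes per compressed sample, scraped every 15s for an hour.
SCRAPES = 60 * 60 // 15
metric_bytes = 100 + 2 * SCRAPES

reduction = 1 - metric_bytes / log_bytes
print(f"{reduction:.2%}")  # → 99.96%
```

Note the second property Richard calls out: `metric_bytes` depends on the scrape schedule, not on `EVENTS`, so the ten-thousandth HTTP 500 costs the same as the first; with logging, every additional event adds another full line.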
But also, this transition is made a lot easier, because I don't have this huge lift where, on this day in three months, we're going to cut over and release the new version of this and that software, and it's not going to have that; it's going to have 80% fewer logs and everything will be great. And then you miss the first maintenance window, or someone is ill, or what have you, and then the next Big Friday is coming so you can't actually deploy there. I mean Black Friday. But we can also talk about deploying on Fridays.

But the thing is, you have this huge thing, whereas if you have this as a continuous improvement process, I can just look at: this is the log which is coming out. I turn this into a number, I start emitting metrics directly, and I see that those numbers match. And so, I can just start—I build new stuff, I put it into a new data format, I actually emit the new data format directly from my code instrumentation, and only then do I start removing the instrumentation for the logs. And that allows me to, with full confidence, with psychological safety, just move a lot more quickly, deliver much more quickly, and also cut down on my costs more quickly, because I'm just using more efficient data types.

Corey: I really want to thank you for spending as much time as you have. If people want to learn more about how you view the world and figure out what other personal attacks they can throw your way, where's the best place for them to find you?

Richard: Personal attacks, probably Twitter. It's, like, the go-to place for this kind of thing. For actually tracking me, I stopped maintaining my own website. Maybe I'll do that again, but if you go to github.com/ritchieh/talks, you'll find a reasonably up-to-date list of all the talks, interviews, presentations, panels, what have you, which I did over the last whatever amount of time. [laugh].

Corey: And we will, of course, put links to that in the [show notes 00:41:23]. Thanks again for your time. 
It's always appreciated.

Richard: And thank you.

Corey: Richard Hartmann, Director of Community at Grafana Labs. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice, along with an insulting comment. And then, when someone else comes along with an insulting comment they want to add, we'll just increment the counter by one.

Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.

Announcer: This has been a HumblePod production. Stay humble.
In this episode, we talk with Giselle Goicochea and Roberto Luna about in-memory databases. What are they? Why use them? Then we get into the details of two AWS managed services that can help with that: MemoryDB and ElastiCache. This is episode 17 of the third season of the AWS Charlas Técnicas podcast.
In this episode we look at some of the fundamental architectures and patterns that can be leveraged with AWS services, such as n-tier architecture, multi-tenant architecture, and service architectures. Concepts are explained based on AWS services but can apply to the cloud in general as well. We briefly touch on a few services like AWS storage, Amazon CloudFront, EC2, ElastiCache, Elastic Load Balancer, the Pricing Calculator, and so on. --- Send in a voice message: https://anchor.fm/vishnu-vg/message
About Peter

Peter's spent more than a decade building scalable and robust systems at startups across adtech and edtech. At Remind, where he's VP of Technology, Peter pushes for building a sustainable tech company with mature software engineering. He lives in Southern California and enjoys spending time at the beach with his family.

Links: Redis: https://redis.com/ Remind: https://www.remind.com/ Remind Engineering Blog: https://engineering.remind.com LinkedIn: https://www.linkedin.com/in/hamiltop Email: peterh@remind101.com

Transcript

Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.

Corey: Today's episode is brought to you in part by our friends at MinIO, the high-performance Kubernetes-native object store that's built for the multi-cloud, creating a consistent data storage layer for your public cloud instances, your private cloud instances, and even your edge instances, depending upon what the heck you're defining those as, which depends probably on where you work. Getting that unified is one of the greatest challenges facing developers and architects today. It requires S3 compatibility, enterprise-grade security and resiliency, the speed to run any workload, and the footprint to run anywhere, and that's exactly what MinIO offers. With superb read speeds in excess of 360 gigs and a 100 megabyte binary that doesn't eat all the data you've got on the system, it's exactly what you've been looking for. Check it out today at min.io/download, and see for yourself. That's min.io/download, and be sure to tell them that I sent you.

Corey: This episode is sponsored in part by our friends at Vultr. 
Spelled V-U-L-T-R because they're all about helping save money, including on things like, you know, vowels. So, what they do is they are a cloud provider that provides surprisingly high-performance cloud compute at a price that—while sure, they claim it's better than AWS pricing—and when they say that, they mean it is less money. Sure, I don't dispute that, but what I find interesting is that it's predictable. They tell you in advance on a monthly basis what it's going to cost. They have a bunch of advanced networking features. They have nineteen global locations and scale things elastically. Not to be confused with openly, because apparently elastic and open can mean the same thing sometimes. They have had over a million users. Deployments take less than sixty seconds across twelve pre-selected operating systems. Or, if you're one of those nutters like me, you can bring your own ISO and install basically any operating system you want. Starting with pricing as low as $2.50 a month for Vultr cloud compute, they have plans for developers and businesses of all sizes, except maybe Amazon, who stubbornly insists on having something to scale all on their own. Try Vultr today for free by visiting vultr.com/screaming, and you'll receive $100 in credit. That's V-U-L-T-R dot com slash screaming.

Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn, and this is a fun episode. It is a promoted episode, which means that our friends at Redis have gone ahead and sponsored this entire episode. I asked them, “Great, who are you going to send me from, generally, your executive suite?” And they said, “Nah. You already know what we're going to say. We want you to talk to one of our customers.” And so here we are. My guest today is Peter Hamilton, VP of Technology at Remind. Peter, thank you for joining me.

Peter: Thanks, Corey. 
Excited to be here.

Corey: It's always interesting when I get to talk to people on promoted guest episodes when they're a customer of the sponsor because, to be clear, you do not work for Redis. This is one of those stories you enjoy telling, but you don't personally have a stake in whether people love Redis, hate Redis, adopt it or not, which is exactly what I try and do on these shows. There's an authenticity to people who have in-the-trenches experience who aren't themselves trying to sell the thing, because that is their entire job in this world.

Peter: Yeah. You just presented three or four different opinions, and I guarantee we've felt all of them at different times.

Corey: [laugh]. So, let's start at the very beginning. What does Remind do?

Peter: So, Remind is a messaging tool for education, largely K through 12. We support about 30 million active users across the country, over 2 million teachers, making sure that every student has, you know, equal opportunities to succeed and that we can facilitate as much learning as possible.

Corey: When you say messaging, that could mean a bunch of different things to a bunch of different people. Once, on a lark, I wound up sitting down—this was years ago, so I'm sure the number is a woeful underestimate now—and counting how many AWS services I could use to send a message from me to you. And this is without going into the lunacy territory of, “Well, I can tag a thing and then mail it to you like a Snowball Edge or something.” No, this is using them as intended. I think I got 15 or 16 of them. When you say messaging, what does that mean to you?

Peter: So, for us, it's about communication to the end user. We will do everything we can to deliver whatever message a teacher or district administrator has to the user. 
We go through SMS text messaging, we go through Apple and Google's push services, we go through email, we go through voice calls, really pulling out all the stops we can to make sure that these important messages get out.

Corey: And I can only imagine some of the regulatory pressure you almost certainly experience. It feels like it's not quite to HIPAA levels, where ohh, there's a private cause of action if any of this stuff gets out, but people are inherently sensitive about communications involving their children. I always sort of knew this in a general sense, and then I had kids myself, and oh, yeah, suddenly I really care about those sorts of things.

Peter: Yeah. One of the big challenges is that you can build great systems that do the correct thing, but at the end of the day, we're relying on a teacher choosing the right recipient when they send a message. And so we've had to build a lot of processes and controls in place so that we can satisfy two conflicting needs: one is to provide a clear audit log, because that's an important thing for districts if something does happen, so that we have clear communication; and the other is to also be able to jump in and intervene when something inappropriate or mistaken is sent out to the wrong people.

Corey: Remind has always been one of those companies that has a somewhat exalted reputation in the AWS space. You folks have been early adopters of a bunch of different services—which, let's be clear, in the responsible way, not the, “Well, they said it on stage; time to go ahead and put everything they just listed into production because we, for some Godforsaken reason, view it as a to-do list”—but you've been thoughtful about how you approach things, and you have been around as a company for a while. But you've also been making a significant push toward being cloud-native by certain definitions of that term. 
So, I know this sounds like a college entrance essay, but what does cloud-native mean to you?

Peter: So, one of the big gaps—if you take an application that was written to be deployed in a traditional data center environment and just drop it in the cloud, what you're going to get is a flaky data center.

Corey: Well, that's unfair. It's also going to be extremely expensive.

Peter: [laugh]. Sorry, an expensive, flaky data center.

Corey: There we go. There we go.

Peter: What we've really looked at—and a lot of this goes back to our history in the earlier days; we ran on top of Heroku, and it was kind of the early days of what they call the twelve-factor application—is making aggressive decisions about how you structure your architecture and application so that you fit in with some of the cloud tools that are available and with the operating models that are out there.

Corey: When you say an aggressive decision, what sort of thing are you talking about? Because when I think of being aggressive with an approach to things like AWS, it usually involves Twitter, and I'm guessing that is not the direction you intend that to go.

Peter: No, I think if you look at Twitter or Netflix or some of these players that, quite frankly, have defined what AWS is to us today through their usage patterns, not quite that.

Corey: Oh, I mean using Twitter to yell at them explicitly about things—

Peter: Oh.

Corey: —because I don't do passive-aggressive; I just do aggressive.

Peter: Got it. No, I think in our case, it's been plotting a very narrow path that allows us to avoid some of the bigger pitfalls. We have our sponsor here, Redis; I'll talk a little bit about our usage of Redis and how that's helped us in some of these cases. 
One of the pitfalls you'll find with pulling a non-cloud-native application into the cloud is that state is hard to manage.

If you put state on all your machines, and machines go down, networks fail, all those things, you now no longer have access to that state, and you start to see a lot of problems. One of the decisions we've made is to try to put as much data as we can into data stores like Redis or Postgres in order to decouple our hardware from the state we're trying to manage and provide for users, so that we're more resilient to those sorts of failures.

Corey: I get the sense, from the way that we're having this conversation, that when you talk about Redis, you mean actual Redis itself, not ElastiCache for Redis, or, as I'm tending to increasingly think about AWS's services, Amazon Basics for Redis.

Peter: Yeah. I mean, Amazon has launched a number of products. They have their ElastiCache, they have their new MemoryDB; there are a lot of different ways to use this. We've relied pretty heavily on Redis, previously known as Redis Labs, and their enterprise product in their cloud, in order to take care of our most important data, which we just don't want to manage ourselves. Trying to manage that on our own using something like ElastiCache, there are so many pitfalls, so many ways that we can lose that data. This data is important to us. By having it in a trusted place, managed by a great ops team like they have at Redis, we're able to lean in on the other aspects of the cloud to really get as much value as we can out of AWS.

Corey: I am curious. As I said, you've had a reputation as a company for a while in the AWS space of doing an awful lot of really interesting things. I mean, you have a robust GitHub presence, and you have a whole bunch of tools that have come out of Remind that are great; I've linked to a number of them over the years in the newsletter. 
You are clearly not afraid, culturally, to get your hands dirty and build things yourself, but you are using Redis Enterprise as opposed to open-source Redis. What drove that decision? I have to assume it's not, "Wait. You mean I can get it for free as an open-source project? Why didn't someone tell me?" What brought you to that decision? Peter: Yeah, a big part of this is what we could call operating leverage. Building a great set of tools that allow you to get more value out of AWS is a little different story than babysitting servers all day and making sure they stay up. So, if you look through, most of our contributions in the open-source space have really been around, here's how to expand upon these foundational pieces from AWS; here's how to more efficiently launch a suite of servers into an auto-scaling group; here's, you know, our troposphere and other pieces there. This was all before Amazon's CDK product, but really, it was, here's how we can more effectively use CloudFormation to capture our Infrastructure as Code. And so we are not afraid in any way to invest in our tooling and invest in some of those things, but when we look at the trade-off of directly managing stateful services and dealing with all the uncertainty that comes with it, we feel our time is better spent working on our product and delivering value to our users, and relying on partners like Redis to provide that stability we need. Corey: You raise a good point. An awful lot of the tools that you've put out there are the best, from my perspective, approach to working with AWS services. And that is a relatively thin layer built on top of them with an eye toward making the user experience more polished, but not being so heavily opinionated that as soon as the service goes in a different direction, the tool becomes completely useless.
You just decide to make it a bit easier to wind up working with specific environment variables or profiles, rather than what appears to be the AWS UX approach of, "Oh, now type in your access key, your secret key, and your session token, and we've disabled copy and paste. Go, have fun." You've really done a lot of quality-of-life improvements, more so than a this-is-the-entire-system-of-how-we-do-deploys, start to finish. It's opinionated, and sort of, like, a take on what Netflix did once upon a time with Asgard. It really feels like it's just the right level of abstraction. Peter: We did a pretty good job. I will say, you know, years later, we felt that we got it wrong a couple of times. It's been really interesting to see that there are times when we say, "Oh, we could take these three or four services and wrap it up into this new concept of an application." And over time, we just have to start poking holes in that new layer, and we start to see we would have been better served by sticking with as thin a layer as possible that enables us, rather than trying to get these higher-level pieces. Corey: It's remarkably refreshing to hear you say that, just because so many people love to tell the story on podcasts, or on conference stages, or whatever format they have, of, "This is what we built." And it is an aspirationally superficial story. They don't talk about the, "Well, we went down these three wrong paths first." It's always a, "Oh, yes, obviously, we are smart people and we only make the correct decision." And I remember in the before times sitting in conference talks, watching people talk about great things they'd done, and I'd turn to the person next to me and say, "Wow, I wish I could be involved in a project like that." And they'd say, "Yeah, so do I." And it turns out they work at the company the speaker is from. Because all of these things tend to be the most positive story.
Do you have an example of something that you have done in your production environment that, going back, "Yeah, in hindsight, I would have done that completely differently?" Peter: Yeah. So, coming from Heroku moving into AWS, we had a great open-source project called Empire, which kind of bridged that gap between them but used Amazon's ECS to launch applications. It was actually command-line compatible with the Heroku command line when it first launched. So, a very big commitment there. And at the time—I mean, this comes back to the point I think you and I were talking about earlier, where architecture, costs, infrastructure, they're all interlinked. And I'm a big fan of Conway's Law, which says that an organization's structure needs to match its architecture. And so six, seven years ago, we were a heavy growth-based company, and we had interns running around, doing all the things, and we wanted to have really strict guardrails and a narrow set of things that our development team could do. And so we built a pretty constrained platform: you will launch, you will have one Docker image per ECS service, it can only do these specific things. And this allowed our development team to focus on pretty buttons on the screen and user engagement and experiments and whatnot, but as we've evolved as a company, as we've built out a more robust business and started to track revenue and cost of goods sold more aggressively, we've seen there's a lot of inefficient things that come out of it. One particular example was we used PgBouncer for our connection pooling to our Postgres application. In the traditional model, we had an auto-scaling group for PgBouncer, and then our auto-scaling groups for the other applications would connect to it. And we saw additional latency, we saw additional cost, and we eventually wound that down and packaged PgBouncer alongside the applications that needed it.
And this was a configuration that wasn't available on our first pass; it was something we intentionally did not provide to our development team, and we had to unwind that. And when we did, we saw better performance, we saw better cost efficiency, all sorts of benefits that we care a lot about now that we didn't care about as much many years ago. Corey: It sounds like you're describing some semblance of an internal platform, where instead of letting all your engineers effectively, "Well, here's the console. Ideally, you use some form of Infrastructure as Code. Good luck. Have fun." You effectively gate access to that. Is that something that you're still doing, or have you taken a different approach? Peter: So, our primary gate is our Infrastructure as Code repository. If you want to make a meaningful change, you open up a PR; you've got to go through code review; you need people to sign off on it. Anything that's not there may not exist tomorrow. There are no guarantees. And we've gone around occasionally just shutting random servers down that people spun up in our account. And sometimes people will be grumpy about it, but you really need to enforce that culture that we have to go through the correct channels and we have to have this cohesive platform, as you said, to support our development efforts. Corey: So, you're a messaging service in education. So, whenever I do a little bit of digging into backstories of companies and what has made, I guess, an impression, you look for certain things, and explicit dates are one of them—where on March 13th of 2020, your business changed just a smidgen. What happened, other than the obvious, we never went outside for two years? Peter: [laugh]. So, if we roll back a week—you know, that's March 13th, so if we roll back a week, we're looking at March 6th. On that day, we sent out about 60 million messages over all of our different mediums: text, email, push notifications.
On March 13th, that was 100 million, and then, a few weeks later on March 30th, that was 177 million. And so our traffic effectively tripled over the course of those three weeks. And yeah, that's quite a ride, let me tell you. Corey: The opinion that a lot of folks have who have not gotten to play in sophisticated distributed systems is, "Well, what's the hard part there? You have an auto-scaling group. Just spin up three times the number of servers in that fleet and problem solved. What's challenging?" A lot, but what did you find that the pressure points were? Peter: So, I love that example, that your auto-scaling group will just work. By default, Amazon's auto-scaling groups only support 1,000 backends. So, when your auto-scaling group goes from 400 backends to 1,200, things break, [laugh] and not in ways that you would have expected. You start to learn things about how database systems provided by Amazon have limits other than CPU and memory. And they're not clearly laid out—there are network bandwidth limits and things you have to worry about. We had a pretty small team at that time, and we'd gotten into this cadence where every Monday morning, we would wake up at 4 a.m. Pacific, because as part of the pandemic, our traffic shifted, so our East Coast users would be most active in the morning rather than the afternoon. And so about 7 a.m. on the East Coast is when everyone came online. And we had our Monday morning crew there, just looking to see where the next pain point was going to be. We'd walk through it all; Monday afternoon, we'd meet together and come up with our three or four hypotheses on what would break if our traffic doubled again, and we'd spend the rest of that week addressing those the best we could, and repeat for the next Monday. And we did this for three, four, five weeks in a row, and finally, it stabilized.
But yeah, it's all the small little things, the things you don't know about, the limits in places you don't recognize, that just catch up to you. And you need to have a team that can move fast and adapt quickly. Corey: You've been using Redis for six, seven years, something along those lines, as an enterprise offering. You've been working with the same vendor who provides this managed service for a while now. What are the fruits of that relationship? What is the value that you see in continuing to have a long-term relationship with vendors? Because, let's be serious, most of us don't stay in jobs that long, let alone work with the same vendor. Peter: Yeah. So, coming back to the March 2020 story, many of our vendors started to see some issues here, where various services weren't scaled properly. We made a lot of phone calls to a lot of vendors, working with them, and I was very impressed with how Redis Labs, at the time, was able to respond. We hopped on a call, and they said, "Here's what we think we need to do; we'll go ahead and do this. We'll sort this out in a few weeks and figure out what this means for your contract. We're here to help and support in this pandemic because we recognize how this is affecting everyone around the world." And so I think when you get into those deeper relationships, those long-term relationships, it is so helpful to have that trust, to have a little bit of that give when you need it in times of crisis, and to know that they're there and willing to jump in right away. Corey: There's a lot to be said for having those working relationships before you need them. So often, I think that a lot of engineering teams just don't talk to their vendors, to a point where they may as well be strangers. But you'll see this most notably—at least I feel it most acutely—with AWS service teams.
They'll do a whole kickoff when the enterprise support deal is signed, three years go by, and both the AWS team and the customer's team have completely rotated since then, and they may as well be strangers. Being able to have that relationship to fall back on in those really weird, really, honestly, high-stress moments has been one of those things where I didn't see the value myself until the first time I went through a hairy situation and found that it was useful. And now I bias instead for, "Oh, I can fit into the free tier of this service? No, no, I'm going to pay and become a paying customer." I'd rather be a customer who can have that relationship and pick up the phone than someone whining at people in a forum somewhere of, "Hey, I'm a free user, and I'm having some problems with production." That just never felt right to me. Peter: Yeah, there's nothing worse than calling your account rep and being told, "Oh, I'm not your account rep anymore." Somehow you missed the email, you missed who it was. Prior to Covid, you know—and we saw this many, many years ago—one of the things about Remind is that every back-to-school season, our traffic 10Xes in about three weeks. And so we're used to emergencies happening and unforeseen things happening. And we plan through our year and try to do capacity planning and everything, but we've been around the block a couple of times. And so we have a pretty strong culture now of leaning in hard with our support reps. We have them in our Slack channels. Our AWS team, we meet with often. Redis Labs, we have them on Slack as well. We're constantly talking about databases that may or may not be performing as we expect them to. They're an extension of our team. We have an incident; we get paged. If it's related to one of the services, we hit them in Slack immediately and have them start checking on the back end while we're checking on our side.
So. Corey: One of the biggest takeaways I wish more companies would have is that when you are dependent upon another company to effectively run your production infrastructure, they are no longer your vendor; they're your partner, whether you want them to be or not. And approaching it with that perspective really pays dividends down the road. Peter: Yeah. One of the things you get when you've been at a company for a long time, and in a relationship for a long time, is growing together, which is always an interesting process. And sometimes there are some painful points; sometimes you're on an old legacy version of their product that you were literally the last customer on, and you've got to work with them to move off of it. But you were there six years ago when they were just starting out, and they've seen how you've grown, and you've seen how they've grown, and you've kind of been able to marry that experience together in a meaningful way. Corey: This episode is sponsored by our friends at Oracle Cloud. Counting the pennies, but still dreaming of deploying apps instead of "Hello, World" demos? Allow me to introduce you to Oracle's Always Free tier. It provides over 20 free services: infrastructure, networking, databases, observability, management, and security. And—let me be clear here—it's actually free. There's no surprise billing until you intentionally and proactively upgrade your account. This means you can provision a virtual machine instance or spin up an autonomous database that manages itself, all while gaining the networking, load balancing, and storage resources that somehow never quite make it into most free tiers, needed to support the application that you want to build. With Always Free, you can do things like run small-scale applications or do proof-of-concept testing without spending a dime. You know how I always like to put asterisks next to the word free? This is actually free, no asterisk. Start now.
Visit snark.cloud/oci-free. That's snark.cloud/oci-free. Corey: Redis is, these days, a data platform. Back once upon a time, I viewed it as more of a caching layer. And I admit that the capabilities of the platform have significantly advanced since those days when I viewed it purely through the lens of cache. But one of the interesting parts is that neither one of those use cases, in my mind, blends particularly well with heavy use of Spot Fleets, but you're doing exactly that. What are you folks doing over there? Peter: [laugh]. Yeah, so as I mentioned earlier, coming back to some of the Twelve-Factor App design, we heavily rely on Redis as sort of a distributed heap. One of our challenges in delivering all these messages is that every single message has its in-flight state: here's the content, here's who we sent it to, we wait for them to respond. In a traditional application, you might have one big server that stores it all in-memory, and you get the incoming requests and you match things up. By moving all that state to Redis, all of our workers, all of our application servers, we know they can disappear at any point in time. We use Amazon's Spot Instances and their Spot Fleet for all of our production traffic. Every single web service, every single worker that we have runs on this infrastructure, and we would not be able to do that if we didn't have a reliable and robust place to store this data that is in-flight and currently being accessed. So, we'll have a couple hundred gigs of data at any point in time in a Redis database, just representing in-flight work that's happening on various machines. Corey: It's really neat seeing Spot Fleets being used as something more than a theoretical possibility. It's something I've always been very interested in, obviously, given the potential cost savings; they approach "cheap as free" in some cases.
But it turns out—we talked earlier about the idea of being cloud-native versus the rickety, expensive data center in the cloud—an awful lot of applications are simply not built in a way that tolerates, yeah, we're just going to randomly turn off a subset of your systems, ideally with two minutes of notice, but all right, have fun with that. And a lot of times, it just becomes a complete non-starter, even for stateless workloads, just based upon how all of these things are configured. It is really interesting to watch a company that has been entrusted with an awful lot of responsibility embrace that mindset. It's a lot more rare than you'd think. Peter: Yeah. And again, you know, sometimes we overbuild things, and sometimes we go down paths that may have been a little excessive, but it really comes down to your architecture. You know, it's not just having everything running on Spot. It's making effective use of SQS and other queueing products at Amazon to provide checkpointing abilities, so you know that should you lose an instance, you're only going to lose a few seconds of productive work on that particular workload and be able to pick up where you left off. It's properly using auto-scaling groups. From the financial side, there are all sorts of weird quirks you'll see. You know, the Spot market has a wonderful set of dynamics where the big instances are much, much cheaper per CPU than the small ones. And so structuring things in a way that you can colocate different workloads onto the same hosts, and hedging against a host going down by spreading across multiple availability zones.
I think there's definitely a point where having enough workload, having enough scale, allows you to take advantage of these things, but it all comes down to the architecture and design that really enables it. Corey: So, you've been using Redis for longer than I think many of our listeners have been in tech. Peter: [laugh]. Corey: And the key distinguishing point for me between someone who is an advocate for a technology and someone who's a zealot—or a pure critic—is that they can identify use cases for which it is great and use cases for which it is not likely to be a great experience. In your time with Redis, what have you found that it's been great at, and what are some areas that you would encourage people to consider more carefully before diving into it? Peter: So, we like to joke that five, six years ago, most of our development process was, "I've hit a problem. Can I use Redis to solve that problem?" And so we've tried every solution possible with Redis. We've done all the things. We have a number of very complicated Lua scripts that are managing different keys in an atomic way. Some of these have been more successful than others, for sure. Right now, our biggest philosophy is: if it is data we need quickly, and it is data that is important to us, we put it in Enterprise Redis, the cloud product from Redis. For other use cases, there are a dozen things that you can use for a cache; Redis is great for cache, and Memcached does a decent job as well; you're not going to see a meaningful difference between those sorts of products. Where we've struggled a little bit has been when we have essentially relational data that we need fast access to.
And we're still trying to find a clear path forward here, because you can do it, and you can have atomic updates, and you can kind of simulate some of the ACID characteristics you would have in a relational database, but it adds a lot of complexity. And that's a lot of overhead for our team as we continue to develop these products, to extend them, to fix any bugs we might have in there. And so we're recalibrating a bit, and some of those workloads are moving to other data stores where they're more appropriate. But at the end of the day, if it's data that we need fast, and it's data that's important, we're sticking with what we've got here because it's been working pretty well. Corey: It sounds almost like you started off with the mindset of one database for a bunch of different use cases and you're starting to differentiate into purpose-built databases for certain things. Or is that not entirely accurate? Peter: There's a little bit of that. And I think, coming back to some of our tooling, as we kind of jumped on a bit of the microservice bandwagon, we would see, here's a small service that only has a small amount of data that needs to be stored. It wouldn't make sense to bring up an RDS instance, or an Aurora instance, for that, you know, in Postgres. Let's just store it in an easy store like Redis. And some of those cases have been great; some of them have been a little problematic. And so as we've invested in our tooling to make all our databases accessible, and to make it less of a weird trade-off between what the product needs, what we can do right now, and what we want to do long-term, and to reduce that friction, we've been able to be much more deliberate about the data store that we choose in each case. Corey: It's very clear that you're speaking with a voice of experience on this; this is not something that you just woke up and figured out.
One last area I want to go into with you: when I asked you what it is you care about primarily as an engineering leader, and as you look at serving your customers well, you effectively had a dual answer, almost off the cuff, of stability and security. I find the two of those things are deeply intertwined in most of the conversations I have, but they're rarely called out explicitly in quite the way that you do. Talk to me about that. Peter: Yeah, so in our wild journey, stability has always been a challenge. And we've always, you know, been in early startup mode, where you're constantly pushing: what can we ship? How quickly can we ship it? And in our particular space, we feel that this communication that we foster between teachers and students and their parents is incredibly important, and is a thing that we take very, very seriously. And so, a couple of years ago, we were trying to create this balance and create not just a language that we could talk about on a podcast like this, but really framing these concepts to our company internally: to our engineers, to help them think, as they're building a feature, about what are the things they should think about, what are the concerns beyond the product spec; to work with our marketing and sales team, to help them understand why we're making these investments that may not get a particular feature out by X date, but are still worthwhile investments. So, from the security side, we've really focused on building out robust practices and robust controls that don't necessarily lock us into a particular standard, like PCI compliance or things like that, but really focusing on the maturity of our company and, you know, our culture as we go forward. And so we're in a place now where we are ISO 27001; we're heading into our third year.
We leaned in hard on our disaster recovery processes, we've leaned in hard on our bug bounties and pen tests, and we kind of found this incremental approach. You know, day one, I remember we turned on our bug bounty, and it was a scary day as the reports kept coming in. But we take on one thing at a time and continue to build on it and make it an essential part of how we build systems. Corey: It really has to be built in. It feels like security is not something that can be slapped on as an afterthought, however much companies try to do that. Especially, again, as we started this episode with: you're dealing with communication with people's kids. That is something that people have remarkably little sense of humor around. And rightfully so. Seeing that there is as much, if not more, care taken around security than there is stability is generally the sign of a well-run organization. If there's a security lapse, I expect certain vendors to rip the power out of their data centers rather than run in an insecure fashion. And your job done correctly—which clearly you have gotten to—means that you never have to make that decision because you've approached this the right way from the beginning. Nothing's perfect, but actually caring about it is the first step. Peter: Yeah. And the other side of that was talking about stability, and again, it's avoiding the either/or situation. Alongside those two—stability and security—we work in our cost of goods sold and our operating leverage in other aspects of our business. And in every single one of them, our co-number-one priorities are stability and security. And if it costs us a bit more money, if it takes our dev team a little longer, there's not a choice at that point. We're doing the correct thing. Corey: Saving money is almost never the primary objective of any company that you really want to be dealing with, unless something bizarre is going on. Peter: Yeah.
Our philosophy on, you know, any cost reduction has been that it should have zero negative impact on our stability. If we do not feel we can safely do something, we won't. And coming back to the Spot Instance piece, that was a journey for us. And you know, we tested the waters a bit, we worked very closely with Amazon's team, and we came to the conclusion that we could safely do this. And we've been doing it for over a year and seen no adverse effects. Corey: Yeah. And at a lot of shops I've talked to folks about—well, when we go and do a consulting project, it's, "Okay. There are a lot of things that could have been done before we got here. Why hasn't any of that been addressed?" And the answer is, "Well. We tried to save money once and it caused an outage, and then we weren't allowed to save money anymore. And here we are." And I absolutely get that perspective. It's a hard balance to strike. It always is. Peter: Yeah. The other aspect where stability and security intertwine is that you can think about security as InfoSec, as locking our systems down, but at the end of the day, why are we doing all that? It's for the benefit of our users. And for Remind, as a communication platform, the safety and security of our users depends on us being up and available so that teachers can reach out to parents with important communication: things like attendance, natural disasters, lockdowns, or any number of the difficult situations schools find themselves in. This is part of why we take the stewardship that we have so seriously: being up and protecting a user's data has such a huge impact on education in this country. Corey: It's always interesting to talk to folks who insist they're making the world a better place.
And it's, "What do you do?" "We're improving ad relevance." I mean, okay, great, good for you. You're serving a real need here; I would not shy away from classifying what you do, fundamentally, as critical infrastructure, and that is always a good conversation to have. It's nice being able to talk to folks who are doing things that you can unequivocally look at and say, "This is a good thing." Peter: Yeah. And around 80% of public schools in the US are using Remind in some capacity. And so we're not a product that's used in just a few specific regions; it's all across the board. One of my favorite things about working at Remind is meeting people and telling them where I work, and they recognize it. They say, "Oh, I have that app, I use that app. I love it." And I spent years in ads before this, and you know, I've been there, and no one ever told me they were glad to see an ad. That's never the case. And it's been quite a rewarding experience coming in every day and, as you said, being part of this critical infrastructure. That's a special thing. Corey: I look forward to installing the app myself as my eldest prepares to enter public school in the fall. So, now at least I'll have a hotline of exactly where to complain when I don't get the attendance message because, you know, there's no customer quite like a whiny customer. Peter: They're still customers. [laugh]. Happy to have them. Corey: True. We tend to be. I want to thank you for taking so much time out of your day to speak with me. If people want to learn more about what you're up to, where's the best place to find you? Peter: So, from an engineering perspective at Remind, we have our blog, engineering.remind.com. If you want to reach out to me directly, I'm on LinkedIn; that's a good place to find me, or you can just reach out over email directly, peterh@remind101.com. Corey: And we will put all of that into the show notes. Thank you so much for your time.
I appreciate it. Peter: Thanks, Corey. Corey: Peter Hamilton, VP of Technology at Remind. This has been a promoted episode brought to us by our friends at Redis, and I'm Cloud Economist Corey Quinn. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice, along with an angry and insulting comment that you will then hope that Remind sends out to 20 million students all at once. Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started. Announcer: This has been a HumblePod production. Stay humble.
About Jason: Jason Frazier is a Software Engineering Manager at Ekata, a Mastercard company. Jason's team is responsible for developing and maintaining Ekata's product APIs. Previously, as a developer, Jason led the investigation and migration of Ekata's Identity Graph from AWS ElastiCache to Redis Enterprise's Redis on Flash, which brought an average savings of $300,000/yr. Links: Ekata: https://ekata.com/ Email: jason.frazier@ekata.com LinkedIn: https://www.linkedin.com/in/jasonfrazier56 Transcript: Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud. Corey: This episode is sponsored in part by our friends at Sysdig. Sysdig is the solution for securing DevOps. They have a blog post that went up recently about how an insecure AWS Lambda function could be used as a pivot point to get access into your environment. They've also gone in-depth with a bunch of other approaches to how DevOps and security are inextricably linked. To learn more, visit sysdig.com and tell them I sent you. That's S-Y-S-D-I-G dot com. My thanks to them for their continued support of this ridiculous nonsense. Corey: Today's episode is brought to you in part by our friends at MinIO, the high-performance Kubernetes-native object store that's built for the multi-cloud, creating a consistent data storage layer for your public cloud instances, your private cloud instances, and even your edge instances, depending upon what the heck you're defining those as, which probably depends on where you work. Getting that unified is one of the greatest challenges facing developers and architects today.
It requires S3 compatibility, enterprise-grade security and resiliency, the speed to run any workload, and the footprint to run anywhere, and that's exactly what MinIO offers. With superb read speeds in excess of 360 gigs and a 100 megabyte binary that doesn't eat all the data you've gotten on the system, it's exactly what you've been looking for. Check it out today at min.io/download, and see for yourself. That's min.io/download, and be sure to tell them that I sent you.Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. This one is a bit fun because it's a promoted episode sponsored by our friends at Redis, but my guest does not work at Redis, nor has he ever. Jason Frazier is a Software Engineering Manager at Ekata, a Mastercard company, which I feel, like, should have some sort of, like, music backstopping into it just because, you know, large companies always have that magic sheen on it. Jason, thank you for taking the time to speak with me today.Jason: Yeah. Thanks for inviting me. Happy to be here.Corey: So, other than the obvious assumption, based upon the fact that Redis is kind enough to be sponsoring this episode, I'm going to assume that you're a Redis customer at this point. But I'm sure we'll get there. Before we do, what is Ekata? What do you folks do?Jason: So, the whole idea behind Ekata is—I mean, if you go to our website, our mission statement is, “We want to be the global leader in online identity verification.” What that really means is, in an increasingly digital world, when anyone can put anything they want into any text field they want, especially when purchasing anything online—Corey: You really think people do that? Just go on the internet and tell lies?Jason: I know. It's shocking to think that someone could lie about who they are online. But that's sort of what we're trying to solve specifically in the payment space. Like, I want to buy a new pair of shoes online, and I enter in some information. 
Am I really the person that I say I am when I'm trying to buy those shoes? To prevent fraudulent transactions. That's really one of the bases that our company goes on is trying to reduce fraud globally.Corey: That's fascinating just from the perspective of you take a look at cloud vendors at the space that I tend to hang out with, and a lot of their identity verification of, is this person who they claim to be, in fact, is put back onto the payment providers. Take Oracle Cloud, which I periodically beat up but also really enjoy aspects of their platform on, where you get to their always free tier, you have to provide a credit card. Now, they'll never charge you anything until you affirmatively upgrade the account, but—“So, what do you need my card for?” “Ah, identity and fraud verification.” So, it feels like the way that everyone else handles this is, “Ah, we'll make it the payment networks' problem.” Well, you're now owned by Mastercard, so I sort of assume you are what the payment networks, in turn, use to solve that problem.Jason: Yeah, so basically, one of our flagship products and things that we return is sort of like a score, from 0 to 400, on how confident we are that this person is who they are. And it's really about helping merchants determine whether they should either approve, or deny, or forward on a transaction to, like, a manual review agent. As well as there's also another use case that's even more popular, which is just, like, account creation. As you can imagine, there's lots of bots on everyone's [laugh] favorite app or website and things like that, or customers offer a promotion, like, “Sign up and get $10.”Well, I could probably get $10,000 if I make a thousand random accounts, and then I'll sign up with them. But, like, make sure that those accounts are legitimate accounts, that'll prevent, like, that sort of promo abuse and things like that. So, it's also not just transactions. 
It's also, like, account openings and stuff—making sure that you actually have real people on your platform.Corey: The thing that always annoyed me was the way that companies decide, oh, we're going to go ahead and solve that problem with a CAPTCHA on it. It's, “No, no, I don't want to solve machine learning puzzles for Google for free in order to sign up for something. I am the customer here; you're getting it wrong somewhere.” So, I assume, given the fact that I buy an awful lot of stuff online, but I don't recall ever seeing anything branded with Ekata that you do this behind the scenes; it is not something that requires human interaction, by which I mean, friction.Jason: Yeah, for sure. Yeah, yeah. It's behind the scenes. That's exactly what I was about to segue to is friction, is trying to provide a frictionless experience for users. In the US, it's not as common, but when you go into Europe or anything like that, it's fairly common to get confirmations on transactions and things like that.You may have to, I don't know, get a code texted to you and enter that online to basically say, like, “Yes, I actually received this.” But, like, helping—and the reason companies do that is for that, like, extra bit of security and assurance that that's actually legitimate. And obviously, companies would prefer not to have to do that because, I don't know, if I'm trying to buy something and this website makes me do something extra, and that site doesn't make me do anything extra, I'm probably going to go with that one because it's just more convenient for me because there's less friction there.
I mean, it's fairly common across most payment forms. So, things like you enter in your first name, your last name, your address, your phone number, your email address. Those are all identity elements that we look at. We have two data stores: We have our Identity Graph and our Identity Network.The Identity Graph is what you would probably think of it, if you think of a web of a person and their identity, like, you have a name that's linked to a telephone, and that name is also linked to an address. But that address used to have previous people living there, so on and so forth. So, the various what we just call identity elements are the various things we look at. It's fairly common on any payment form, I'm sure, like, if you buy something on Amazon versus eBay or whatever, you're probably going to be asked, what's your name? What's your address? What's your email address? What's your telephone?Corey: It's one of the most obnoxious parts of buying things online from websites I haven't been to before. It's one of the genius ideas behind Apple Pay and the other centralized payment systems. Oh, yeah. They already know who you are. Just click the button, it's done.Jason: Yeah, even something as small as that. I mean, it gets a little bit easier with, like, form autocompletes and stuff like, oh, just type J and it'll just autocomplete everything for me. That's not the worst of the world, but it is still some amount of annoyance and friction. [laugh].Corey: So, as I look through all this, it seems like one of the key things you're trying to do since it's in line with someone waiting while something is spinning in their browser, that this needs to be quick. It also strikes me that this is likely not something that you're going to hit the same people trying to identify all the time—if so, that is its own sign of fraud—so it doesn't really seem like something can be heavily cached. 
Yet you're using Redis, which tells me that your conception of how you're using it might be different than the mental space that I put Redis into when I'm thinking about where, in this ridiculous architecture diagram, the Redis part is going to go?Jason: Yeah, I mean, like, whenever anyone says Redis, thinks of Redis, I mean, even before we went down this path, you always think of, oh, I need a cache, I'll just stuff it in Redis. Just use Redis as a cache here and there. I don't know, some small—I don't know, a few tens, hundreds gigabytes, maybe—cache, spin that up, and you're good. But we actually use Redis as our primary data store for our Identity Graph, specifically for the speed that we can get. Because if you're trying to look for a person, like, let's say you're buying something for your brother, how do we know if that's true or not? Because you have this name, you're trying to send it to a different address, like, how does that make sense? But how do we get from Corey to an address? Like, oh, maybe you used to live with your brother?
You're not just sending it to some random address out in the middle of nowhere, for no reason.”Corey: Or the drop-shipping scams, or brushing scams, or one of—that's the thing is every time you think you've seen it all, all you have to do is look at fraud. That's where the real innovation seems to be happening, [laugh] no matter how you slice it.Jason: Yeah, it's quite an interesting space. I always like to say it's one of those things where if you had the human element in it, it's not super easy, but it's like, generally easy to tell, like, okay, that makes sense, or, oh, no, that's just complete garbage. But trying to do it at scale very fast in, like, a general case becomes an actual substantially harder problem. [laugh]. It's one of those things that people can probably do fairly well—I mean, that's why we still have manual reviews and things like that—but trying to do it automatically or just with computers is much more difficult. [laugh].Corey: Yeah, “Hee hee, I scammed a company out of 20 bucks,” is not the problem you're trying to guard against. It's the, “Okay, I just did that ten million times and now we have a different problem.”Jason: Yeah, exactly. I mean, one of the biggest losses for a lot of companies is, like, fraudulent transactions and chargebacks. Usually, in the case of, like, e-commerce companies—or even especially like nowadays where, as you can imagine, more people are moving to a more online world and doing shopping online and things like that, so as more people move to online shopping, some companies are always going to get some amount of chargebacks on fraudulent transactions. But when it happens at scale, that's when you start seeing many losses because not only are you issuing a chargeback, you probably sent out some product, so you're now out some physical product as well. So, it's almost kind of like a double-whammy. 
[laugh].Corey: So, as I look through all this, I tended to always view Redis in terms of, more or less, a key-value store. Is that still accurate? Is that how you wind up working with it? Or has it evolved significantly past them to the point where you can now do relational queries against it?Jason: Yeah, so we do use Redis as a key-value store because, like, Redis is just a traditional key-value store, very fast lookups. When we first started building out Identity Graph, as you can imagine, you're trying to model people to telephones to addresses; your first thought is, “Hey, this sounds a whole lot like a graph.” That's sort of what we did quite a few years ago is, let's just put it in some graph database. But as time went on and as it became much more important to have lower and lower latency, we really started thinking about, like, we don't really need all the nice and shiny things that, like, a graph database or some sort of graph technology really offers you. All we really need to do is I need to get from point A to point B, and that's it.Corey: Yeah, [unintelligible 00:10:35] graph database, what's the first thing I need to do? Well, spend six weeks in school trying to figure out exactly what the hell of graph database is because they're challenging to wrap your head around at the best of times. Then it just always seemed overpowered for a lot of—I don't want to say simple use cases; what you're doing is not simple, but it doesn't seem to be leveraging the higher-order advantages that graph database tends to offer.Jason: Yeah, it added a lot of complexity in the system, and [laugh] me and one of our senior principal engineers who's been here for a long time, we always have a joke: If you search our GitHub repository for… we'll say kindly-worded commit messages, you can see a very large correlation of those types of commit messages to all the commits to try and use a graph database from multiple years ago. 
It was not fun to work with, just added too much complexity, and we just didn't need all that shiny stuff. So, that's how we really just took a step back. Like, do we really need to do it this way? We ended up effectively flattening the entire graph into an adjacency list.So, a key is basically some UUID to an entity. So, Corey, you'd have some UUID associated with you and the value would be whatever your information would be, as well as other UUIDs that link to the other entities. So, from that first retrieval, I can now unpack it, and, “Oh, now I have a whole bunch of other UUIDs I can then query on to get that information, which will then have more IDs associated with it,” is more or less sort of how we do our graph traversal and query this in our graph queries.Corey: One of the fun things about doing this sort of interview dance on the podcast as long as I have is you start to pick up what people are saying by virtue of what they don't say. Earlier, you wound up mentioning that we often use Redis for things like tens, or hundreds of gigabytes, which sort of leaves in my mind the strong implication that you're talking about something significantly larger than that. Can you disclose the scale of data we're talking about here?Jason: Yeah. So, we use Redis as our primary data store for our Identity Graph, and also for—soon to be for our Identity Network, which is our other database. 
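The flattened adjacency-list layout Jason describes can be sketched as follows. This is an illustrative reconstruction, not Ekata's actual schema: the field names, the dict standing in for Redis, and the traversal helper are all assumptions.

```python
import json
import uuid

# Illustrative sketch of the flattened graph: every entity lives under a UUID
# key, and its JSON value holds the entity's attributes plus the UUIDs of
# linked entities (an adjacency list). A dict stands in for Redis here.
store = {}

corey, brother, address = (str(uuid.uuid4()) for _ in range(3))
store[corey] = json.dumps({"type": "person", "name": "Corey", "links": [brother]})
store[brother] = json.dumps({"type": "person", "name": "Brother", "links": [address]})
store[address] = json.dumps({"type": "address", "city": "Dublin", "links": []})

def traverse(start, want_type, max_hops=3):
    """Breadth-first walk over the adjacency list: fetch a value, unpack its
    linked UUIDs, and query those in turn until an entity of want_type appears."""
    frontier, seen = [start], set()
    for _ in range(max_hops):
        next_frontier = []
        for key in frontier:
            if key in seen:
                continue
            seen.add(key)
            entity = json.loads(store[key])  # in production: a Redis GET
            if entity["type"] == want_type:
                return entity
            next_frontier.extend(entity["links"])
        frontier = next_frontier
    return None

print(traverse(corey, "address")["city"])  # Dublin
```

Each hop is just another key lookup, which is why plain key-value speed suffices once the graph is flattened.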
But specifically for our Identity Graph, the scale we're talking about—we do have some compression added on there, but uncompressed, it's about 12 terabytes of data; compressed, with replication, it's about four.Corey: That's a relatively decent compression factor, given that I imagine we're not talking about huge datasets.Jason: Yeah, so this is actually basically driven directly by cost: If you need to store less data, then you need less memory, therefore, you need to pay for less.Corey: So, our users once again have shored up my longtime argument that when it comes to cloud, cost and architecture are in fact the same thing. Please, continue by all means.Jason: I would be lying if I said that we didn't do weekly slash monthly reviews of costs. Where are we spending costs in AWS? How can we improve costs? How can we cut down on costs? How can you store less—Corey: You are singing my song.Jason: It is a [laugh] it is a constant discussion. But yeah, so we use Zstandard compression, which was developed at Facebook, and it's a dictionary-based compression. And the reason we went for this is—I mean like if I say I want to compress, like, a Word document down, like, you can get a very, very, very high level of compression. It exists. It's not that interesting, everyone does it all the time.But with this we're talking about—so in that, basically, four or so terabytes of compressed data that we have, it's something around four to four-and-a-half billion keys and values, and so in that we're talking about each key-value only really having anywhere between 50 and 100 bytes. So, we're not compressing very large pieces of information. We're compressing very small 50 to 100 byte JSON values—we have UUID keys and JSON strings stored as values. So, we're compressing these 50 to 100 byte JSON strings with around 70, 80% compression. 
I mean, that's using Zstandard with a custom dictionary, which probably gave us the biggest cost savings of all, if you can [unintelligible 00:14:32] your dataset size by 60, 70%, that's huge. [laugh].Corey: Did you start off doing this on top of Redis, or was this an evolution that eventually got you there?Jason: It was an evolution over time. We were formerly Whitepages. I mean, Whitepages started back in the late-90s. It really just started off as a—we just—Corey: You were a very early adopter of Redis [laugh]. Yeah, at that point, like, “We got a time machine and started using it before it existed.” Always a fun story. Recruiters seem to want that all the time.Jason: Yeah. So, when we first started, I mean, we didn't have that much data. It was basically just one provider that gave us some amount of data, so it was kind of just a—we just need to start something quick, get something going. And so, I mean, we just did what most people do—just do the simplest thing: Stuff it all in a Postgres database and call it good. Yeah, it was slow, but hey, it was back a long time ago, people were kind of okay with a little bit—Corey: The world moved a bit slower back then.Jason: Everything was a bit slower, no one really minded too much, the scale wasn't that large. But business requirements always change over time and they evolve, and so to meet those ever-evolving business requirements, we moved from Postgres, and where a lot of the fun commit messages that I mentioned earlier can be found is when we started working with Cassandra and Titan. That was before my time, before I had started, but from what I understand, that was a very fun time. But then from there, that's when we really kind of just took a step back and just said, like, “There's so much stuff that we just don't need here. Let's really think about this, and let's try to optimize a bit more.”Like, we know our use case, why not optimize for our use case? 
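The dictionary-based compression Jason describes is easiest to see in miniature. Ekata uses Zstandard with a trained custom dictionary; as a standard-library stand-in, the sketch below uses zlib's preset-dictionary support (`zdict`) to show the same idea — seeding the compressor with bytes that recur across many small values so each tiny JSON record compresses well. The shared-dictionary bytes and record shape here are invented for illustration.

```python
import json
import uuid
import zlib

# Bytes common to most records serve as the shared dictionary. A real
# Zstandard deployment would train this from sample values; this literal
# is purely illustrative.
SHARED_DICT = b'{"type": "person", "name": "", "links": ["'

def compress(value: dict) -> bytes:
    # Seed the compressor with the shared dictionary so the common JSON
    # structure costs almost nothing per record.
    c = zlib.compressobj(zdict=SHARED_DICT)
    return c.compress(json.dumps(value).encode()) + c.flush()

def decompress(blob: bytes) -> dict:
    # The decompressor must be seeded with the exact same dictionary.
    d = zlib.decompressobj(zdict=SHARED_DICT)
    return json.loads(d.decompress(blob) + d.flush())

record = {"type": "person", "name": "Corey", "links": [str(uuid.uuid4())]}
blob = compress(record)
assert decompress(blob) == record  # round-trips losslessly
```

Without a shared dictionary, a 50-to-100-byte value gives a general-purpose compressor almost nothing to work with, which is why the trained dictionary is where the bulk of the savings came from.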
And that's how we ended up with the flattened graph storage stuffing into Redis. Because everyone thought of Redis as a cache, but everyone also knows that—why is it a cache? Because it's fast. [laugh]. We need something that's very fast.Corey: I still conceptualize it as an in-memory data store, just because when I turned on the disk persistence model back in 2011, give or take, it suddenly started slamming the entire data store to a halt for about three seconds every time it did it. It was, “What's this piece of crap here?” And it was, “Oh, yeah. Turns out there was a regression in Xen, which is what AWS used as a hypervisor back then.” And, “Oh, yeah.”So, fork became an expensive call, it took forever to wind up running. So oh, the obvious lesson we take from this is, oh, yeah, Redis is not designed to be used with disk persistence. Wrong lesson to take from the behavior, but did cement, in my mind at least, the idea that this is something that we tend to use only as an in-memory store. It's clear that the technology has evolved, and in fact, I'm super glad that Redis threw you my direction to talk to you about this stuff because until talking to you, I was still—I got to admit—sort of in the position of thinking of it still as an in-memory data store because the fact that Redis says otherwise because they're envisioning it being something else, well okay, marketers gonna market. You're a customer; it's a lot harder for me to talk smack about your approach to this thing, when I see you doing it for, let's be serious here, what is a very important use case. If identity verification starts failing open and everyone claims to be who they say they are, that's something that is visible from orbit when it comes to the macroeconomic effect.Jason: Yeah, exactly. It's actually funny because before we moved to primarily just using Redis, before going to fully Redis, we did still use Redis. 
But we used ElastiCache, we had it loaded into ElastiCache, but we also had it loaded into DynamoDB as sort of a, I don't want this to fail because we weren't comfortable with actually using Redis as a primary database. So, we used to use ElastiCache with a fallback to DynamoDB, just in that off chance, which, you know, sometimes it happens, sometimes it didn't. But that's when we basically just went searching for new technologies, and that's actually how we landed on Redis on Flash, which kind of breaks the whole idea of Redis as an in-memory database to where it's Redis, but it's not just an in-memory database, you also have flash-backed storage.Corey: So, you'll forgive me if I combine my day job with this side project of mine, where I fix the horrifying AWS bills for large companies. My bias, as a result, is to look at infrastructure environments primarily through the lens of the AWS bill. And oh, great, go ahead and use an enterprise offering that someone else runs because, sure, it might cost more money, but it's not showing up on the AWS bill, therefore, my job is done. Yeah, it turns out that doesn't actually work, or the answer to every AWS billing problem is to migrate to Azure or to GCP. Turns out that doesn't actually solve the problem that you would expect.But you're obviously an enterprise customer of Redis. Does that data live in your AWS account? Is it something you're using as their managed service and throwing over the wall so it shows up as data transfer on your side? How is that implemented? I know they've got a few different models.Jason: There's a couple of aspects to how we're actually billed. I mean, so like, when you have ElastiCache, you're just billed for your, I don't know, whatever nodes you're using, cache-dot, like, r5 or whatever they are… [unintelligible 00:19:12]Corey: I wish most people were using things that modern. 
But please, continue.Jason: But yeah, so you basically just get billed for whatever ElastiCache nodes you have, you have your hourly rate, I don't know, maybe you might reserve them. But with Redis Enterprise, the way that we're billed is there's two aspects. One is, well, the contract that we signed that basically allows us to use their technology [unintelligible 00:19:31] with a managed service, a managed solution. So, there's some amount that we pay them directly within some contract, as well as the actual nodes themselves that exist in the cluster. And so basically the way that this is set up, is we effectively have a sub-account within our AWS account that Redis Labs has—or not Redis Labs; Redis Enterprise—has access to, which they deploy directly into, and effectively using VPC peering; that's how we allow our applications to talk directly to it.So, we're billed directly—or so the actual nodes of the cluster, which are i3.8x, I believe—they basically just run on EC2 instances. All of those instances, those exist on our bill. Like, we get billed for them; we pay for them. It's just basically some sub-account that they have access to that they can deploy into. So, we get billed for the instances of the cluster as well as whatever we pay for our enterprise contract. So, there's sort of two aspects to the actual billing of it.Corey: This episode is sponsored in part by our friends at Vultr. Spelled V-U-L-T-R because they're all about helping save money, including on things like, you know, vowels. So, what they do is they are a cloud provider that provides surprisingly high performance cloud compute at a price that—while sure they claim it's better than AWS pricing—and when they say that they mean it is less money. Sure, I don't dispute that but what I find interesting is that it's predictable. They tell you in advance on a monthly basis what it's going to cost. They have a bunch of advanced networking features. 
They have nineteen global locations and scale things elastically. Not to be confused with openly, because apparently elastic and open can mean the same thing sometimes. They have had over a million users. Deployments take less than sixty seconds across twelve pre-selected operating systems. Or, if you're one of those nutters like me, you can bring your own ISO and install basically any operating system you want. Starting with pricing as low as $2.50 a month for Vultr cloud compute, they have plans for developers and businesses of all sizes, except maybe Amazon, who stubbornly insists on having something to scale all on their own. Try Vultr today for free by visiting: vultr.com/screaming, and you'll receive $100 in credit. That's V-U-L-T-R dot com slash screaming.Corey: So, it's easy to sit here as an engineer—and believe me, having been one for most of my career, I fall subject to this bias all the time—where it's, “Oh, you're going to charge me a management fee to run this thing? Oh, that's ridiculous. I can do it myself instead,” because, at least when I was learning in my dorm room, it was always a “Well, my time is free, but money is hard to come by.” And shaking off that perspective as my career continued to evolve was always a bit of a challenge for me. Do you ever find yourself or your team drifting toward the direction of, “Well, what are we paying Redis Enterprise for? We could just run it ourselves with the open-source version and save whatever it is that they're charging on top of that?”Jason: Before we landed on Redis on Flash, we had that same thought, like, “Why don't we just run our own Redis?” And the answer to that is, well, managing such a large cluster that's so important to the function of our business, like, you effectively would have needed to hire someone full time to just sit there and stare at the cluster the whole time just to operate it, maintain it, make sure things are running smoothly. 
And it's something that we made a decision that, no, we're going to go with a managed solution. It's not easy to manage and maintain clusters of that size, especially when they're so important to business continuity. [laugh]. From our eyes, it was just not worth the investment for us to try and manage it ourselves and go with the fully managed solution.Corey: But even when we talk about it, it's one of those well—it's—everyone talks about, like, the wrong side of it first, the oh, it's easier if things are down if we wind up being able to say, “Oh, we have a ticket open,” rather than, “I'm on the support forum and waiting for people to get back to me.” Like, there's a defensibility perspective. We all just sort of, like sidestep past the real truth of it of, yeah, the people who are best in the world running and building these things are right now working on the problem when there is one.Jason: Yeah, they're the best in the world at trying to solve what's going on. [laugh].Corey: Yeah, because that is what we're paying them to do. Oh, right. People don't always volunteer for for-profit entities. I keep forgetting that part of it.Jason: Yeah, I mean, we've had some very, very fun production outages that just randomly happened because to our knowledge, we would just like—I would, like… “I have no idea what's going on.” And, you know, working with their support team, their DevOps team, honestly, it was a good, like, one-week troubleshooting. When we were validating the technology, we accidentally halted the database for seemingly no reason, and we couldn't possibly figure out what's going on. We kept talking to—we were talking to their DevOps team. They're saying, “Oh, we see all these writes going on for some reason.” We're like, “We're not sending any writes. 
Why is there writes?”And that was the whole back and forth for almost a week, trying to figure out what the heck was going on, and it happened to be, like, a very subtle case, in terms of, like, how the keys and values are actually stored between RAM and flash and how it might swap in and out of flash. And like, all the way down to that level where I want to say we probably talked to their DevOps team at least two to three times, like, “Could you just explain this to me?” Like, “Sure,” like, “Why does this happen? I didn't know this was a thing.” So, on and so forth. Like, there's definitely some things that are fairly difficult to try and debug, which definitely helps having that enterprise-level solution.Corey: Well, that's the most valuable thing in any sort of operational experience where, okay, I can read the documentation and all the other things, and it tells me how it works. Great. The real value of whether I trust something in production is whether or not I know how it breaks where it's—Jason: Yeah.Corey: —okay—because the one thing you want to hear when you're calling someone up is, “Oh, yeah. We've seen this before. This is what you do to fix it.” The worst thing in the world is, “Oh, that's interesting. We've never seen that before.” Because then oh, dear Lord, we're off in the mists of trying to figure out what's going on here, while production is down.Jason: Yeah, kind of like, “What does this database do, like, in terms of what do we do?” Like, I mean, this is what we store our Identity Graph in. This has the graph of people's information. If we're trying to do identity verification for transactions or anything, for any of our products, I mean, we need to be able to query this database. It needs to be up.We have a certain requirement in terms of uptime, where we want at least, like, four nines of uptime. So, we also want a solution that, hey, even if it wants to break, don't break that bad. [laugh]. 
There's a difference between, “Oh, a node failed and okay, like, we're good in 10, 20 seconds,” versus, “Oh, node failed. You lost data. You need to start reloading your dataset, or you can't query this anymore.” [laugh]. There's a very large difference between those two.Corey: A little bit, yeah. That's also a great story to drive things across. Like, “Really? What is this going to cost us if we pay for the enterprise version? Great. Is it going to be more than some extortionately large number, because if we're down for three hours in the course of a year, that's what we owe our customers back for not being able to deliver, so it seems to me this is kind of a no-brainer for things like that.”Jason: Yeah, exactly. And, like, that's part of the reason—I mean, a lot of the things we do at Ekata, we usually go with enterprise-level for a lot of things we do. And it's really for that support factor in helping reduce any potential downtime for what we have because, well, if we don't consider ourselves comfortable or expert-level in that subject, I mean, then yeah, if it goes down, that's terrible for our customers. I mean, it's needed for literally every single query that comes through us.Corey: I did want to ask you: you keep talking about, “The database” and, “The cluster.” That seems like you have a single database or a single cluster that winds up being responsible for all of this. That feels like the blast radius of that thing going down must be enormous. Have you done any research into breaking that out into smaller databases? What is it that's driven you toward this architectural pattern?Jason: Yeah, so for right now, so we have actually three regions we're deployed into. We have a copy of it in us-west in AWS, we have one in eu-central-1, and we also have one in ap-southeast-1. So, we have a complete copy of this database in three separate regions, as well as we're spread across all the available availability zones for that region. 
So, we try and be as multi-AZ as we can within a specific region. So, we have thought about breaking it down, but having high availability, having multiple replication factors, and having it stored in multiple data centers provides us at least a good level of comfort.Specifically, in our US cluster, we actually have two. With a lot of the cost savings that we got, we literally have two: one that sits idle 24/7 that we just call our backup and our standby, where it's ready to go at a moment's notice. Thankfully, we haven't had to use it since, I want to say, its creation about a year-and-a-half ago, but it sits there for that doomsday scenario: “Oh, my gosh, this cluster literally cannot function anymore. Something crazy catastrophic happened,” and we can basically hot-swap to another production-ready cluster as needed, if needed.Because the really important thing is that if we broke it up into two separate databases and one of them goes down, that could still fail your entire query. Because what if that's the database that held your address? We can still query you, but we're going to try and get your address, and well, there, your traversal just died because you can no longer get that. So, even trying to break it up doesn't really help us too much. We can still fail the entire traversal query.Corey: Yeah, which makes an awful lot of sense. Again, to be clear, you've obviously put thought into this; it goes way beyond me hearing something in passing and saying, “Hey, have you considered this thing?” Let's be very clear here. That is the sign of a terrible junior consultant. “Well, it sounds like what you built sucked. Did you consider building something that didn't suck?” “Oh, thanks, Professor. Really appreciate your pointing that out.” It's one of those useful things.Jason: It's like, “Oh, wow, we've been doing this for, I don't know, many, many years.” It's like, “Oh, wow, yeah. 
I haven't thought about that one yet.” [laugh].Corey: So, it sounds like you're relatively happy with how Redis has worked out for you as the primary data store. If you were doing it all again from scratch, would you make the same technology selection there, or would you go in a different direction?Jason: Yeah, I think I'd make the same decision. I mean, we've been using Redis on Flash for, at this point, three, maybe coming up on four years. There's a reason we keep renewing our contract and just keep continuing with them: to us, it just fits our use case so well, and we very much choose to continue going in this direction with this technology.Corey: What would you have them change as far as feature enhancements and new options being enabled there? Because remember, asking them right now in front of an audience like this puts them in a situation where they cannot possibly refuse. Please, how would you improve Redis from where it is now?Jason: I like how you think. That's [laugh] a [fair way 00:28:42] to describe it. There's a couple of optimizations that can always be done. And, like, specifically with, like, Redis on Flash, there's an issue we had with storing binary keys that, to my knowledge, hasn't been fixed yet and that basically prevents us from storing keys as binary, which has some amount of benefit because, well, binary keys require less memory to store. When you're talking about 4 billion keys, even if you're just saving 20 bytes per key, you're talking about potentially hundreds of gigabytes of savings once you—Corey: It adds up with the [crosstalk 00:29:13].Jason: Yeah, it adds up pretty quick. [laugh]. So, that's probably one of the big things that we've been in contact with them about fixing that hasn't gotten there yet. The other thing is, like, there's a couple of, like, random… gotchas that we had to learn along the way. 
It does add a little bit of complexity in our loading process.Effectively, when you first write a value into the database, it'll write to RAM, but then once it gets flushed to flash, the database effectively asks itself, “Does this value already exist in flash?” Because once it's first written, it's just written to RAM; it isn't written to backing flash. And if the answer is, “No, it's not,” the database then does a write to put it into flash and then evicts it out of RAM. That sounds pretty innocent, and if it already exists in flash when it's evicted, it says, “Hey, I need to evict this. Does it already exist in flash?” “Yep.” “Okay, just chuck it away. It already exists; we're good.”It sounds pretty nice, but this is how we accidentally halted our database: once we started putting a huge amount of load on the cluster (our general throughput on a peak day is somewhere in the order of 160 to 200,000 Redis operations per second), you start to think, hey, you might be evicting 100,000 values per second to flash; that's an added 100,000 write operations per second on your cluster, and that accidentally halted our database. So, the way we actually get around this is, once we write our data store, we basically read the whole thing once, because if you read every single key, you're pretty much guaranteed to cycle everything into flash, so it doesn't have to do any of those writes. For right now, there is no option to say that. For our use case, we do very few writes except up front, so it'd be super nice if we could say, “Hey, for our write operations, I want you to actually do a full write-through to flash.” Because, you know, that would effectively cut our entire database prep in half. We'd no longer have to do that read to cycle everything through. 
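The warm-up pass Jason describes, reading every key once so all values get cycled into flash before production traffic arrives, could be sketched roughly like this. The function name and batch size are illustrative, and the client is anything exposing redis-py's `scan`/`mget` interface; this is a sketch, not Ekata's actual tooling:

```python
def warm_keys(client, batch=1000):
    """Read every key once; returns the number of keys touched.

    Touching each value up front cycles it into the flash tier, so the
    store does not have to do lazy write-to-flash evictions under load.
    """
    cursor, warmed = 0, 0
    while True:
        # SCAN walks the keyspace incrementally without blocking the server.
        cursor, keys = client.scan(cursor=cursor, count=batch)
        if keys:
            # MGET reads each value, forcing it through the cache tier.
            client.mget(keys)
            warmed += len(keys)
        if cursor == 0:
            break
    return warmed
```

With redis-py this might be invoked as `warm_keys(redis.Redis(host=...))` after the bulk load finishes; at a 4-billion-key scale you would presumably shard the keyspace scan across many workers.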
Those are probably the two big things, and one of the biggest gotchas that we ran into [laugh] that maybe isn't so well known.Corey: I really want to thank you for taking the time to speak with me today. If people want to learn more, where can they find you? And I will also theorize wildly that, if you're like basically every other company out there right now, you're probably hiring on your team, too.Jason: Yeah, I very much am hiring; I'm actually hiring quite a lot right now. [laugh]. So, they can reach me; my email is simply jason.frazier@ekata.com. I, unfortunately, don't have a Twitter handle. Or you can find me on LinkedIn. I'm pretty sure most people have LinkedIn nowadays.But yeah, also feel free to reach out if you're interested in learning more or in opportunities; like I said, I'm hiring quite extensively. I'm specifically on the team that builds the actual product APIs that we offer to customers, so a lot of the latency optimizations that we do usually come through my team, in coordination with all the other teams, since we need to build a new API with this requirement. How do we get that requirement? [laugh]. Like, let's go start exploring.Corey: Excellent. I will, of course, throw a link to that in the [show notes 00:32:10] as well. I want to thank you for spending the time to speak with me today. I really do appreciate it.Jason: Yeah. I appreciate you having me on. It's been a good chat.Corey: Likewise. I'm sure we will cross paths in the future, especially as we stumble through the wide world of, you know, data stores in AWS, and this ecosystem keeps getting bigger but somehow feels smaller all the time.Jason: Yeah, exactly. You know, we'll still be where we are, hopefully approving all of your transactions as they go through, making sure that you don't run into any friction.Corey: Thank you once again for speaking with me. I really appreciate it.Jason: No problem. 
Thanks again for having me.Corey: Jason Frazier, Software Engineering Manager at Ekata. This has been a promoted episode brought to us by our friends at Redis. I'm Cloud Economist Corey Quinn and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice along with an angry, insulting comment telling me that Enterprise Redis is ridiculous because you could build it yourself on a Raspberry Pi in only eight short months.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.Announcer: This has been a HumblePod production. Stay humble.
Cloud Posse holds public "Office Hours" every Wednesday at 11:30am PST to answer questions on all things related to DevOps, Terraform, Kubernetes, and CI/CD. Basically, it's like an interactive "Lunch & Learn" session where we get together for about an hour and talk shop. These are totally free and just an opportunity to ask us (or our community of experts) any questions you may have.
You can register here: https://cloudposse.com/office-hours
Join the conversation: https://slack.cloudposse.com/
Find out how we can help your company:
https://cloudposse.com/quiz
https://cloudposse.com/accelerate/
Learn more about Cloud Posse:
https://cloudposse.com
https://github.com/cloudposse
https://sweetops.com/
https://newsletter.cloudposse.com
https://podcast.cloudposse.com/
[00:00:00] Intro
[00:01:17] Terraform AWS EC2 Client VPN Module released
https://github.com/cloudposse/terraform-aws-ec2-client-vpn
[00:01:54] OMIGOD! Azure RCE: "Secret" Agent Exposes Azure Customers to Unauthorized Code Execution
https://www.wiz.io/blog/secret-agent-exposes-azure-customers-to-unauthorized-code-execution
[00:04:04] New OWASP Top 10 for 2021 (Open Web Application Security Project)
https://owasp.org/Top10/
[00:04:50] GitHub CLI now supports extensions!
https://github.blog/2021-08-24-github-cli-2-0-includes-extensions/
[00:07:20] Custom widgets for CloudWatch dashboards
https://aws.amazon.com/about-aws/whats-new/2021/08/custom-widgets-amazon-cloudwatch-dashboards/
[00:07:46] ElastiCache for Redis now supports auto scaling
https://aws.amazon.com/about-aws/whats-new/2021/08/amazon-elasticache-redis/
[00:08:09] AWS CloudFormation Can Retry Stack Operations from the Point of Failure
https://aws.amazon.com/blogs/aws/new-for-aws-cloudformation-quickly-retry-stack-operations-from-the-point-of-failure/
[00:08:51] Amazon Elasticsearch Service Is Now Amazon OpenSearch Service
https://aws.amazon.com/blogs/aws/amazon-elasticsearch-service-is-now-amazon-opensearch-service-and-supports-opensearch-10/
[00:24:55] Anyone using Stack Exchange for teams?
[00:28:35] Terraform Cloud alternatives?
[00:36:15] How to implement maintenance pages and activate them?
[00:43:10] Does anyone use a span trace viewer as a primary view into a local development environment? (e.g., Honeycomb UI, Perfetto)
[00:49:15] Any best practices for organizing your TF configs for different environments, but keeping common variable settings in just one place?
[00:52:55] Nomad for application CD
[00:55:27] Outro
#officehours, #cloudposse, #sweetops, #devops, #sre, #terraform, #kubernetes, #aws
Support the show (https://cloudposse.com/office-hours/)
Show Notes:
Links:
Micromort
Noblesse oblige
Josh's dotfiles
GitHub Code Spaces
Full Transcript:
Ben:Yeah. I've been holding out for the new MacBook Pros. The M1 is pretty tempting, but I want whatever comes next. I want the 16-inch new hotness that's apparently supposed to be launching in November, but I've been waiting for it so patiently for so long now.Josh:Will they have the M2?Ben:Yeah, either that or the M1X. People are kind of unsure what the odds are.Starr:Why do they do that? Why did they make an M1 if they can't make an M2? Why do they have to keep... You just started, people. You can just have a normal naming scheme that just increments. Why not?Josh:M1.1?Ben:That would be awesome.Starr:Oh, Lord.Josh:Yeah, it would.Ben:M1A, Beachfront Avenue.Starr:So last week we did an Ask Me Anything on Indie Hackers, and that was a lot of fun.Josh:It was a lot of fun.Starr:I don't know. One of the most interesting questions on there was some guy who was just like, "Are you rich?" I started thinking about it. I was like, "I literally have no idea." It reminded me of when I used to live in New York briefly in the '90s or, no, the early '00s. There was a Village Voice article in which they found... They started out with somebody not making very much money, and they're like, "Hey, what is rich to you?" Then that person described that. Then they went and found a person who had that level of income and stuff and they asked them, and it just kept going up long past the point where... Basically, nobody ever was like, "Yeah, I'm rich."Josh:Yeah. At the end, they're like, "Jeff Bezos, what is rich? What is rich to you?"Starr:Yeah.Josh:He's like, "Own your own star system."Starr:So, yeah, I don't know. I feel like I'm doing pretty good for myself because I went to fill up my car with gas the other day and I just didn't even look at the price. The other day, I wanted a snack, so I just got a whole bag of cashews, and I was just chowing down on those. I didn't need to save that. 
I could always get another bag of cashews.Ben:Cashews are my arch nemesis, man. I can't pass up the cashews. As far as the nut kingdom, man, they are my weakness.Starr:I know. It's the subtle sweetness.Ben:It's so good. The buttery goodness.Starr:Yeah, the smoothness of the texture, the subtle sweetness, it's all there.Ben:That and pistachios. I could die eating cashews and pistachios.Josh:There you go. I like pistachios.Ben:Speaking of being rich, did you see Patrick McKenzie's tweet about noblesse oblige?Josh:No. Tell me.Ben:Yeah, we'll have to link it up in the show notes. But, basically, the idea is when you reach a certain level of richness, I guess, when you feel kind of rich, you should be super generous, right? So noblesse oblige is the notion that nobility should act nobly. If you have been entrusted with this respect of the community and you're a noble, then you ought to act a certain way. You got to act like a noble, right? You should be respectful and et cetera. So Patio was applying this to modern day, and he's like, "Well, we should bring this back," like if you're a well-paid software developer living in the United States of America, you go and you purchase something, let's say a coffee, that has basically zero impact on your budget, right? You don't notice that $10 or whatever that you're spending. Then just normalize giving a 100% tip because you will hardly feel it, but the person you're giving it to, that'll just make their day, right? So doing things like that. I was like, "Oh, that's"-Josh:Being generous.Ben:Yeah, it's being generous. Yeah. So I like that idea.Josh:That's cool.Ben:So-Starr:So it's okay to be rich as long as you're not a rich asshole.Ben:Exactly. Exactly. That's a good way to bring it forward there, Starr.Starr:There you go. I don't know. Yeah. I think there's some historical... I don't know. 
The phrase noblesse oblige kind of grates at me a little bit in a way that I can't quite articulate in this moment, but I'll think about that, and I will get back with you.Josh:Wait. Are you saying you don't identify as part of the nobility?Starr:No.Ben:I mean, I think there's a lot of things from the regency period that we should bring back, like governesses, because who wants to send your child to school in the middle of a COVID pandemic? So just bring the teacher home, right?Starr:Yeah. That's pretty sexist. Why does it have to be gendered? Anyway.Ben:Okay, it could be a governor, but you might get a little misunderstanding. All of a sudden, you've got Jay Inslee showing up on your doorstep, "I heard you wanted me to come teach your kids."Josh:I don't know. I'll just take an algorithm in the home to teach my kids, just entrust them to it.Starr:Yeah. Oh, speaking of bringing things back, I told y'all, but I'll tell our podcast listeners. On Sunday, I'm driving to Tacoma to go to somebody's basement and look at a 100-year old printing press to possibly transport to Seattle and put in my office for no good reason that I can think of. It just seems to be something that I'm doing.Josh:Do you like that none of us actually asked you what you were intending to do with it? I was like, "Yeah, just let me know when you need to move it. I'm there." I just assumed you were going to do something cool with it, but ... Yeah.Starr:I appreciate that. I appreciate the support. I'm going to make little zines or something. I don't know.Josh:Yeah. If I get a lifetime subscription to your zine-Starr:Okay, awesome.Josh:... that would be payment.Starr:Done. Done.Josh:Cool.Ben:Yeah, sign me up, too. I'll be there.Starr:Well, I appreciate that.Ben:I mean, who could resist that invitation, right, because you get to... If you get to help with moving that thing, you get to see it, you get to touch it and play with it, but you don't have to keep it. 
It's somebody else's problem when you're done with the day, so it sounds great to me.Starr:There you go. Well, I mean, if you read the forums about these things, this is one of the smaller ones, so people are just like, "Ah, no big deal. No big deal. It's okay." But I was happy to hear that there are no stairs involved.Ben:That is the deal-breaker. Yeah.Josh:Yeah. But it-Ben:If a friend ever asks you to help move their piano, you always ask, "Okay, how many flights of steps," right?Starr:Yeah. Oh, I just thought of something I could do with it. I could make us all nice business cards to hand out to nobody.Ben:Because we're not going anywhere.Josh:I just think of my last six attempts at having business cards. They're all still sitting in my closet, all six boxes of-Starr:I know. People look at you like, "What, really, a business card? What?"Josh:Yeah, like all six generations.Starr:Yeah.Ben:I hand out one or two per year. Yeah, just to random people, like, "Hey, here's my phone number." It's an easy way to give it to somebody.Josh:Just people on the street?Ben:Exactly. Like a decent fellow, "Here you go." Thank you.Josh:Yeah.Starr:It's like, "I've got 1000 of these. I got to justify the cost somehow."Josh:We got to move these.Starr:We could start invoicing our customers by snail mail. I could print a really nice letterhead.Ben:I think we have a few customers who would be delighted to receive a paper invoice from us because then they would have an excuse to not pay us for 90 days.Starr:Yeah.Josh:Isn't owning a printing press like owning a truck, though? Once people know you have it, everyone wants to borrow it.Starr:It's going to be pretty hard to borrow a 1000-pound piece of iron.Josh:Well, they're going to want to come over and hang out in your basement and do their printing. This is the Pacific Northwest, like-Starr:It's their manifestos.Josh:Yeah. 
They got to print their manifestos, lists of demands.Starr:They don't want the establishment at Kinko's to be able to see.Josh:Right.Ben:I don't know. It's got to put you on a special kind of watch list, though, if you have a printing press in your home, right? All of a sudden, some people are really interested in what you're up to.Josh:It's like a legacy watch list.Ben:I'm just flashing back to, yeah, in the 1800s when cities, towns would get all-Starr:There you go.Josh:Well, yeah, because they're like-Ben:The mob would come out and burn down the printing press building and stuff.Josh:If you wanted to be a propagandist back then, you had to buy a printing press and then you get put on a watch list. That just never went away. They're still looking for those people. They just don't find as many of them these days.Starr:Yeah. It's so inefficient. It's not the super efficient way of getting the word out, though, I hear, unless you want to be one of those people handing out leaflets on the side of the road.Josh:Well, you could paper windshields in parking lots.Starr:Oh, there you go. Yeah.Josh:Yeah, that's how they used to do it.Starr:No, look at my beautifully hand-crafted leaflet that you're going to throw in the gutter.Josh:Mm-hmm (affirmative).Ben:I think you just settled on what your next adventure's going to be after Honeybadger. You're ready to put this business aside and focus on printing up flyers for your local missing cat.Starr:There you go. There you go. Band flyers, that's big business.Josh:But you could get into fancy paper. That's a whole thing up here. It's pretty cool, actually.Starr:Yeah. I don't know. Really, I was like, "Oh, it'd be cool to have a big thing to tinker with." I'm learning about myself that I like having just a big physical project going on, and I'm pretty... Like, I built this backyard office, and that took up two years of my time. 
Ever since then, I don't have a big physical thing to work on, so I'm thinking this might fill that niche, that niche, sorry. I read a thing that's like don't say niche, Americans. Niche.Ben:I don't know, Starr. Maybe you should think of the children and then think about 50 years from now when you're dead and Ida's cleaning out the house and she's all like, "Why is there this printing press?"Starr:Oh, there you go.Josh:Have to move it.Starr:They'll just sell it with the house.Ben:There you go.Starr:Yeah. I mean, the funny thing is that it is wider than the doorway, so I would either have to disassemble it partially or take out the door. I put the door in, so I know how to take it out, so there is a good chance the door's coming out because I have less chance of messing something up if I do that one. But we'll see.Ben:Echo that.Starr:Well, thank you.Josh:You should've put one of those roll-up doors in there.Starr:I should've, yeah.Josh:Those are cool.Starr:What was I thinking?Josh:You really did not plan ahead for this.Starr:Yeah. I mean, walls are really only a couple of thin pieces of plywood, and you can just saw through them.Josh:Just a small refactor.Starr:Yeah.Josh:Yeah.Starr:And that would-Josh:Did y'all see that someone listened to every episode of this podcast in a row?Starr:I know. I feel so bad. I feel so bad for them.Josh:Speaking of-Starr:We're sorry. We're so sorry.Ben:I was feeling admiration. I'm like, "Wow, that's impressive," like the endurance of it.Starr:I just think we would've made different decisions.Ben:I don't know. But not-Josh:Maybe it's pretty good. I haven't gone back and gone through it all and never will, but-Ben:Well, I mean, not only did they say they listened to every episode, but then they were eager for more. They were like, "When are you getting done with your break?" So I guess-Starr:There you go.Ben:... 
that net it was positive, but-Josh:We must not be too repetitive.Ben:Must not.Starr:Stockholm syndrome.Josh:We're sorry.Ben:Well...Starr:I'm sorry. I don't have anything informative to add, so I'm just going to be shit-posting this whole episode.Ben:Well, I've had an amazing week since we last chatted. I kept reflecting on how I couldn't remember anything that I did over the past whatever months. Well, this past week, I can remember a whole bunch of things that I did. I've been crazy busy and getting a bunch of little things knocked out. But today, today was the capstone of the week because I rolled over our main Redis cluster that we use for all of our jobs, all of the incoming notices and whatevers. Yeah, rolled over to a new Redis cluster with zero downtime, no dropped data, nobody even noticed. It was just smooth as-Starr:Oh my God.Josh:I saw that.Starr:Awesome.Ben:It's going pretty good.Starr:Just like butter?Ben:Just like butter.Starr:They slid right out of that old Redis instance and just into this new... Is it an AWS-managed type thing?Ben:Yeah, both of them were. They all went on the new one, but... Yeah.Josh:It's, what, ElastiCache?Ben:Yep. Smooth like a new jar of Skippy.Josh:I saw that you put that in our ops channel or something.Ben:Yeah. Yeah, that's the topic in our ops channel.Josh:So it's the subject or the topic, yeah. We're making ops run, yeah, like a jar of Skippy.Starr:Why isn't that our tagline for our whole business?Ben:I mean, we can change it.Starr:I don't know why that's making me crack up so much, but it is.Josh:Skippy's good stuff.Starr:Oh my gosh.Josh:Although we-Ben:Actually-Josh:... usually go for the Costco natural brand these days.Ben:Well, we go for the Trader Joe's all-natural brand that you have to actually mix every time you use it. I prefer crunchy over creamy, so, actually, my peanut butter's not that smooth, but... You know.Josh:Yeah.Ben:It's okay. 
But, yeah, I love our natural peanut butter, except for the whole churning thing, but you can live with that.Starr:We're more of a Nutella family.Ben:Ooh, I do love a Nutella.Josh:Ooh, Nutella.Ben:Mm-hmm (affirmative), that's good stuff.Josh:We made pancakes the other day, and I was putting Nutella on pancakes. I did this thing, like I made this... We have one of those griddles, like an electric griddle, and so I made this super long rectangular pancake, and then I spread Nutella on the entire thing, and then I rolled it so that you have this-Starr:You know what it's called, Josh.Josh:What is it called?Starr:That's called a crepe.Josh:So it's a crepe, but it's made out of a pancake.Starr:It's a Texas crepe.Ben:Texas crepe.Josh:Yeah, a Texas-Starr:A Texas crepe.Ben:Yes.Josh:Is it really a Texas crepe because that's... Yeah, so, I-Starr:Oh, no, I just made that up.Ben:That sounds perfect, yeah.Josh:Well, it is now.Ben:Yeah, it is now.Josh:It is now, and I highly recommend it. It's pretty amazing.Ben:Throw some Skippy on there and, man, now it's a... That's awesome.Josh:Peanut butter's also good on pancakes.Starr:That's why people listen to us, for our insights about business.Ben:Yeah, there was this one time, speaking of pancakes and peanut butter...Josh:How did we get on pancakes? Like, oh, yeah, ops.Ben:This one time, I went over to dinner at some person's house, and I didn't know what dinner was going to be, but we got there and it was breakfast for dinner, which I personally love. That's one of my favorites.Starr:I knew that about you.Ben:So they're like, "Oh, I'm sorry. Hope you don't think it's weird, but we're having breakfast for dinner." I'm like, "No, no, I love it." So eggs and bacon and waffles, and so I'm getting my waffle and I'm like, "Do you have some peanut butter," and they're like, "Oh my goodness, we thought you would think that was way too weird, and so we didn't have the peanut butter." They whipped it out from in the counter. 
It's like, "Oh, shew, now we can have our peanut butter, too." I'm like, "Oh, yeah, peanut butter on waffles, yeah."Josh:Everyone had their hidden peanut butter.Ben:Mm-hmm (affirmative).Josh:Yeah.Starr:And that's how you level up a friendship.Ben:There you go. So, yeah, the week was good. The week was good. Bugs got fixed, things got deployed, and, yeah, just a whole-Josh:Yeah, you had a bunch of PRs and stuff for little things, too, which-Ben:Yeah. And got some practice with the whole delegating thing, got Shava doing some stuff, too. So, yeah, just an all-around super productive week.Josh:Nice. I got Java to run in a Docker container, so my week's going pretty good.Ben:And that took you all week?Starr:What do those words mean? I don't...Josh:Yeah.Starr:Was your audio cutting out? I don't know. I just heard a bunch of things I don't understand.Josh:Well, for your own sake, don't ask me to explain it.Starr:Yeah, it's like better not looked at.Josh:Yeah.Ben:Why would you subject yourself to that sort of torture, Josh?Josh:Oh, well, because running Java on an M1 Mac is even worse.Starr:Oh my Lord.Josh:Well, actually, running it, period. But, yeah, like just our Java package. I mean, I've spent half this podcast ranting about our packaging, so I don't need to get too deep into it. But every time I release this thing, it's like it just doesn't work because I've forgotten my... I've changed my system, and my Java and Maven package repositories are just like that. So I figure if I can make some sort of reproducible development environment using Docker, then in two years everything will just be smooth as a jar of Skippy.Ben:Skippy. Yeah, yeah.Starr:Well, I had a chance to-Josh:I reckon.Starr:I had a chance to dig into some numbers, which is one of my favorite things to do, and so... I don't know. There was this question that was just bothering me, which was... Well, let me just back up. So we've had some success, as you guys know, in the past year. 
We've almost doubled our rate of new user sign-ups, not new user sign-ups, like conversion to paid users. We've doubled our paid user conversion numbers, rate, whatever you call it. And so, obviously, revenue from users has gone up as well, but since we are a... Our plans are basically broken down by error rates, right? So what happens when people upgrade is they get too many errors for their plan. It says, "Hey, you should upgrade if you want to keep sending us errors," and they do.Starr:I had this weird situation where it's like I wasn't sure... In our system, revenue from users was coming just from whatever plan they picked when they signed up, and so I was wondering, "Well, what if they sign up, and then a week later they upgrade? That's going to be counted under upgrade revenue instead of new user revenue," which, really, it really kind of should be. So I got to digging, and I found that it doesn't really make that big of a difference. Some people do upgrade pretty quickly after converting, but they don't... It's not really enough to really change things.Josh:Yeah.Starr:Then, also, just sort of offhand, I took a little sneak peek. I've been running this experiment to see if lowering our error quota for our basic, our free plan, it would increase conversions. So I took a little sneak peek at the data. It's too soon to know for sure, but so far the conversion rate, I think, is going to end up being higher, which is what I would expect, so that's good, and-Josh:Nice.Starr:Yeah. And when we're done, I'm going to look at sign-ups just to make sure that they are still in line.Ben:Yeah. Anecdotally, I've seen a smaller window from trial to paid conversion. Well, not trial, but freemium to paid conversion. I've seen people who are signing up, getting on the basic plan, and then within some short time period they're actually going to a team plan.Josh:Oh, that's good to know.Ben:That's happening more often than it was, so... Yeah. 
So that's-Josh:Cool.Ben:I'm just saying the same thing Starr said but without real data.Josh:Yeah.Starr:Yeah, it's awesome. Yeah, we need a little bit more time to see how things pan out, too, because it's... One thing I figured out that I will share with our readers, our readers, I'm used to doing the blog posts, I'll share with our listeners that I figured out that you really have to pay attention to, on free plans especially, is comparing conversion rates between time periods. So if you make a change and then you wait for a month of data to come in and you're like, "Okay, let's look at the conversion rate for the past month after the change with the conversion rate for the time period before the change," that is really an apples to oranges comparison because on the one hand you've had people who have maybe had a year to upgrade versus people who've had a month to upgrade. So you have to be really careful to make it apples to apples, right, where you only compare... If you have a month worth of users on one side, you compare it to a month worth of users on the other side, and you only count the conversions that happened in that time period.Josh:Makes sense.Starr:Yeah. So, anyway, that's just my little freebie data analysis thing for our listeners.Josh:We should have Starr's weekly data science tip.Starr:Starr's data corner.Josh:Yeah.Ben:Love it.Josh:Yeah. We could move the podcast to segments. We've never done segments. We could introduce segments if we need to spice things up on FounderQuest.Ben:Yeah. Totally. Well, speaking of spicing things up, I had a brilliant idea this morning.Starr:Oh, I want to hear it.Ben:Yeah. So one of the things that I keep an eye on is how much we spend on hosting because that's a good chunk of our expenses. We always want to make more money, and one way to make more money is to have fewer expenses. So I had this brilliant idea on how to cut expenses. 
Ben: We can chop our AWS bill in half by just not running everything redundant.

Starr: There you go.

Josh: Brilliant.

Starr: Would you say that AWS is the sixth Honeybadger employee?

Ben: Yeah, pretty much.

Josh: Yes. That's a good way to put it, actually.

Starr: Yeah.

Josh: Yeah.

Ben: Well, in the early days, before we were paying ourselves a full salary, I remember we budgeted 25% for Starr, 25% for Ben, 25% for Josh, and 25% for hosting.

Josh: Yeah.

Ben: Yeah, I don't think we ever exceeded the 25%, which is good. That would be a bit high. So, yeah, AWS is like our sixth employee.

Starr: Yeah, it's funny, because do we even have other expenses?

Josh: No.

Ben: I mean, salaries are definitely the biggest one, and our health insurance is not cheap either.

Starr: Yeah.

Ben: Advertising.

Starr: I was thinking marketing, advertising. Yeah.

Ben: Yeah. Advertising and marketing, that's the next one.

Starr: That's the next 25%.

Josh: Can we make AWS our seventh and eighth employees, too?

Ben: Eventually, maybe. Yeah, I did some... Oh. So I told you about my great success this morning. Well, your comment just now about AWS made me think about the one failure, the amazingly huge failure, that I also had this week: migrating a bunch of data from Redis to DynamoDB. So we have this situation, one of those seemed-like-a-good-idea-at-the-time kind of things, where we're doing a bunch of counting of the individual people who hit errors, and we're counting that in Redis. I'm like, "Okay, great," because Redis has this INCRBY and it's easy and it's atomic and, boom, you're done. And I just never paid much attention to it until a few weeks ago, when I was like, "Yo, you know what? That's actually a lot of data in there, and we're keeping it forever, so it's probably better to put it someplace that's not Redis." So I'm like, "Ah, I know. I'll do DynamoDB, because it has an increment thing and..."

Josh: Yeah.

Ben: So I put a table together, and I wrote a migration script, and I migrated a bunch of data.
Ben: It took two days. It's great. Everything is beautiful. I had buckets of data inside DynamoDB, and then I went to go query it, and I'm like, "Oh, I can't query it that way, because I don't have the right index." Well, that sucks. All right. And you can't create a local secondary index on DynamoDB without recreating the table. I'm like, "Okay, well, that sucks. I just lost two days' worth of data migration, but oh well." So I dumped the table, recreated it with the index, and started redoing the data migration, figuring, "Yeah, it might take two days, no problem." So I check on it every half-day or so, and it's not going to be done after two days. Three days go by, and I'm checking the work backlog, and it's just flat.

Ben: It turns out that because of that local index, Dynamo now can't really write fast enough, because of the way they do partition throttling. We have some customers with huge chunks of data, so their partitions are too big for Dynamo to write very quickly. Hot partition keys is the problem. So I just gave up. I'm like, "All right, fine." Dropped the table again, recreated it, and now we're just double writing, so that eventually, six months from now or so, the data will all be there and I can replace that thing in Redis.

Josh: Nice.

Ben: So this is my life, the ups and the downs. So, yeah.

Josh: And just waiting six months.

Ben: And just waiting six months.

Josh: Yeah. That's funny, but that is kind of a pattern in the business. In some cases, we need to just wait for the data to populate itself, and we basically have to wait out our retention period, because data tends to turn over, and then we can drop the old database or whatever.

Ben: Yeah. Yep. But, luckily, nobody noticed my big fail, so it's all good. It didn't impact the customers.

Josh: I didn't notice.

Ben: So, yeah, busy weekend.

Starr: I noticed, but I didn't say anything because I wanted to be nice.

Ben: Thank you, Starr.
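The pattern Ben describes, an atomic counter in each store, double-written until DynamoDB catches up, can be sketched roughly. The table name, key schema, and attribute names below are invented for illustration, not Honeybadger's actual layout:

```python
def build_increment_update(table: str, project_id: str, person_id: str) -> dict:
    """Build the kwargs for an atomic DynamoDB counter increment.

    DynamoDB's ADD action creates the attribute if it doesn't exist yet,
    so no separate "initialize to zero" step is needed; like Redis INCRBY,
    the update is atomic on the server side, with no read-modify-write race.
    """
    return {
        "TableName": table,
        "Key": {
            "project_id": {"S": project_id},  # partition key
            "person_id": {"S": person_id},    # sort key
        },
        "UpdateExpression": "ADD hit_count :one",
        "ExpressionAttributeValues": {":one": {"N": "1"}},
    }

def record_error_hit(redis_client, dynamodb_client, project_id, person_id):
    """Double-write: Redis stays the source of truth while DynamoDB backfills.

    `redis_client` and `dynamodb_client` stand in for redis.Redis() and
    boto3.client("dynamodb"); both calls below are atomic increments.
    """
    redis_client.incrby(f"errors:{project_id}:{person_id}", 1)
    dynamodb_client.update_item(
        **build_increment_update("error_counts", project_id, person_id)
    )
```

Once the double-write has run for a full retention period, reads can be cut over to DynamoDB and the Redis keys dropped, which is exactly the "just wait six months" plan.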
Appreciate that.

Starr: Yeah, I [inaudible].

Josh: Starr was over there just quietly shaking her head.

Ben: Just judging. Just judge-

Starr: No, sorry.

Ben: So, Josh, I'm going to get back to this Java thing, because I'm curious. I remember, I don't know, a year ago or something, we were kind of like, "Maybe we should just not, when it comes to Java, anymore." So I'm curious what prompted this renewed activity to do a new release.

Josh: Well, I don't know. I figured... Didn't we say we were just not going to do any releases?

Ben: Yeah, it just-

Josh: It's not high on my list of development. We're not spending a bunch of time adding stuff to it, but there are dependency updates that have been getting merged in. I merged the Dependabot PRs and stuff. There's something else, too. There might be some small PR or something that someone submitted that was sitting there unreleased, and I just can't handle unreleased code sitting there. So it's just one of those things that's been sitting on my backlog, halfway down the list, gnawing at me every week. So I figured I'd dive in and at least get some sort of relatively quick release process down, so we can continue to release dependency updates and stuff, like if there's a security update or something.

Ben: Yeah.

Josh: Some people still do use it, so I want to make sure they're secure.

Ben: Make sure they're happy. Yeah.

Josh: Yeah. But, yeah, that's a good point. We are not treating all platforms as equal, because we just don't have the resources, so we need to focus on the stuff that is actually making us money.

Ben: Yeah. It's tough when very few of our customers are actually using it for it to get a whole lot of priority.

Josh: That said, we have already put a lot into it, so as far as I know, it works well for the people who have used it.

Starr: So are y'all encouraging our customers to do more Java?

Josh: Yes, switch to Java. Then switch to Sentry...

Ben: Ride a wave.

Josh: ...
...or something.

Ben: So I've been contemplating this new laptop showing up, right, whenever Apple finally releases it and I get my hot little hands on it. I've been thinking, well, the one big downside to getting a new laptop is getting back to a place where you can actually work again, getting all your things set up. Some people are smart, like Josh, who have their dotfiles in a repo on GitHub, and they can just clone that and they're off to the races. I'm not that smart. I always have to hand-craft my config every time I get a new machine. But I'm thinking-

Josh: Oh. Take the time.

Ben: So, yeah, I'm not looking forward to that part. But GitHub has released Codespaces, and so now I'm thinking, "Ooh, I wonder if I could get all our repos updated so that I could just work totally in the cloud and not even have a development setup on my machine." Probably not, but it's a fun little fantasy.

Josh: Well, then you could work on your iPad.

Ben: Yeah.

Josh: Yeah.

Ben: Yeah, I wouldn't even need a laptop. Then I could save the company money. That's brilliant, Josh.

Josh: Yeah. You could work at the library.

Starr: Yeah. It's like, "So your main ops guy, I see he's primarily working from a five-year-old iPad."

Ben: At a library.

Starr: In a library.

Josh: An iMac.

Starr: When he gets paged, he has to run to the nearest Starbucks and get that wifi.

Josh: Yeah. I've got to say, having your dotfiles all ready to go and all that is pretty good. Also, I've got my Brewfile, too, so all of my Homebrew stuff is automated in that.

Ben: Well, that's clever. I never even thought of that.

Josh: It does make it very quick to bootstrap a new machine.

Ben: Yeah. Maybe I should take this as initiative to actually put my stuff into a dotfiles repo and get to that point.

Josh: Careful, though, because... I've had four computers between your current one and now, so you might end up switching more often because it's easier to do it.

Ben: Appreciate that warning.
That's good.

Josh: Yeah. Speaking of the M1s, I love the M1 MacBook Air that I have, but the battery has been... I don't know what happened. The battery was fantastic for, I don't know, the first few months. Ever since then, it hasn't been lasting. I've been surprised at how fast it's draining, and I go and look at the battery health stuff, and it says that health is down to 86% and the condition is "fair," which does not make me feel warm and fuzzy.

Josh: It has 50 cycles, so I think it might be defective, and that sucks, because otherwise this machine is maybe one of the best Macs I've had. I've had a few compatibility issues with the architecture, but it's not too bad. I mean, I'm not a Java developer, at least, so...

Ben: Yeah, I think you need to take that in for service, because that is way too soon for that kind of degradation.

Josh: Yeah. I might need to do something.

Ben: That's a bummer.

Josh: Yeah. I don't know. I might have to ship it in, because I think our local Portland Apple Store is shuttered currently.

Ben: All those protests?

Josh: Yeah. It's got eight fences around it and stuff. Downtown Portland's a little rough these days.

Starr: Yeah.

Ben: Well, you can always take the trip out to Seattle.

Josh: Yeah. Or there's... I forget. There's an Apple Store that's not too far outside of Portland. It's where I bought this, so I could take it down there.

Starr: Yeah. I'm sad now, because I bought my second MacBook from that store in Portland.

Josh: Yeah? It's a good store.

Ben: Speaking of you coming out to Seattle, I was thinking the other day that maybe we should do a company-wide get-together sometime soon. It'd be fun to see everybody again in person.

Josh: It would be. Now that we're all vaxxed, we're all super vaxxed. I don't know that Starr is even down for that, though. I'm just looking at Starr.

Starr: I don't know. Like, I-

Josh: You don't look like you're too stoked on that idea.

Starr: I don't know.
I'm just-

Josh: What with Delta lurking.

Starr: The problem, Josh, is that you have not been reading nursing Twitter.

Josh: Uh-huh (affirmative).

Starr: So I don't know. Yeah, it's doable. I think the CDC just released a thing about vaccine efficacy at preventing COVID infections... The vaccine is still very good at preventing bad disease outcomes, at keeping people out of the hospital. It's very good at that. With Delta, it's about 65% effective at preventing infections, and so if you get infected, you can transmit it to other people.

Josh: Right.

Starr: Yeah. So it's not impossible. It's just that we're back to this fricking calculus where every possible social interaction has to be run through your spreadsheet and your risk analysis and... Ugh.

Josh: Yeah.

Ben: It's like, "Are you worthy of the hassle? No. Sorry, can't make it."

Starr: Yeah. Yeah. It's like, "Okay, so what's the probability that meeting with you is going to send my child to the hospital? Okay, that's low enough. Sure."

Josh: Yeah.

Starr: It's just such a weird world.

Josh: Wouldn't it be funny if, when you got into your car in the morning, it read out the probability of you dying in a car accident?

Starr: Oh, yeah. Do you know about millimorts?

Josh: No.

Starr: Oh, you should go Google millimorts. A millimort is a one-in-a-million chance that you will die, and there are tables you can find online that list different activities and their number of millimorts. So you can compare, and you can be like, "Okay, going skydiving has as many millimorts as riding a motorcycle for so many miles."

Josh: That's awesome. Okay, we have to link this in the show notes, because I want to remember to look this up-

Starr: Okay. I'll go find it.

Josh: ...
...so that I can depress people.

Starr: I think there was a New York Times article, too.

Ben: Yeah, I totally have to see this, because I just signed up for a motorcycle training course and I'm going to get my endorsement, so that I know exactly what kind of risks... Though that's probably part of the course, where they try to scare you out of actually getting your endorsement.

Josh: By the way, I'm really glad my morbid humor, my morbid joke, landed, because for a minute there-

Starr: Oh, I'm sorry, it's a micromort.

Josh: Oh, a micromort. Okay.

Starr: I was like, "Isn't milli 1,000?"

Josh: Minimort, like-

Starr: Milli is 1,000.

Josh: Yeah.

Starr: Yeah, that grated at me. I know. My old chemistry teachers are just giving me an F right now.

Ben: Yeah, I've got to see that.

Josh: Well, I'm sure you'll be all right, Ben. I mean, the risk of a motorcycle is much higher than a car, but you just can't think about that all the time, because the fun... I'm sure the fun is much...

Ben: [inaudible].

Josh: It's worth it.

Ben: It's worth every hazard. Yeah.

Josh: Yeah. The risk is worth the reward.

Ben: Yesterday, I hit 250 miles on the odometer on my scooter, so I'm loving that. It's a lot of fun.

Josh: That's cool.

Starr: That's a lot of miles for a scooter.

Josh: Mm-hmm (affirmative).

Starr: I guess you just love to scoot.

Ben: I love to scoot. Well, there you go, Starr. There's our happy ending after that slight dip there.

Starr: That slight descent into reality.

Josh: I like the dark humor. It's always a gamble, though. But I think, Starr, you're always down to get dark.

Starr: Oh, yeah. I'm down with the darkness. All right. Well, should we wrap it up?

Ben: Let's wrap it.

Starr: Okay. This has been a very witchy episode of FounderQuest, so if you liked it, go give us a review, and... Yeah, if not, just keep listening to us. Make it a hate-listen. You've got to have a couple of those in your line-up.
Links: AWS's Egregious Egress: https://blog.cloudflare.com/aws-egregious-egress/

Transcript

Corey: This episode is sponsored in part by our friends at ChaosSearch. You could run Elasticsearch or Elastic Cloud—or OpenSearch as they're calling it now—or a self-hosted ELK stack. But why? ChaosSearch gives you the same API you've come to know and tolerate, along with unlimited data retention and no data movement. Just throw your data into S3 and proceed from there as you would expect. This is great for IT operations folks, for app performance monitoring, cybersecurity. If you're using Elasticsearch, consider not running Elasticsearch. They're also available now in the AWS Marketplace, if you'd prefer not to go direct and have half of whatever you pay them count towards your EDP commitment. Discover what companies like Klarna, Equifax, Armor Security, and Blackboard already have. To learn more, visit chaossearch.io and tell them I sent you, just so you can see them facepalm yet again.

Corey: Hi there. Chief Cloud Economist Corey Quinn from the Duckbill Group here to more or less rant for a minute about something that's been annoying the heck out of me for a while, as anyone who follows me on Twitter, subscribes to the lastweekinaws.com newsletter, or passes me in a crowded elevator will attest, and that is AWS's data transfer story.

Back on July 23rd—of 2021, for those listening to this in future years—Cloudflare published a blog post titled AWS's Egregious Egress, co-authored by Matthew Prince, Cloudflare's CEO, and Nitin Rao, who is one of their employees. Presumably; that was somewhat unclear. It effectively tears down the obnoxious—and I mean deeply obnoxious—level of AWS data transfer pricing for egress to the outside world.

And there's a bunch of things to unpack in this blog post, where they compare AWS pricing to the wholesale bandwidth market. And they go into real depth for those who aren't aware of how bandwidth is generally charged for.
And the markups they come up with for AWS are, in many cases, almost 8,000%, which is just ludicrous in some respects, because—spoiler—every year, give or take, the wholesale cost of network bandwidth drops by about 10%. And the math that they've done, which I'm too lazy to check, says that in effect, given that AWS doesn't tend to reduce egress bandwidth pricing, basically ever, while the wholesale market has dropped 93%, what we pay AWS hasn't. And that's obnoxious.

They also talk, rather extensively, about how ingress is generally free. Now, there's a whole list of reasons this could be true, but let's face it: when you're viewing bandwidth into AWS as being free, you start to think of it that way. "Oh, it's bandwidth, how expensive could it possibly be?" But when you see data coming out and it charges you through the nose, you start to think that it's purely predatory. So it already starts off with customers not feeling super great about this. Then diving into it, of course, they're pushing for the whole Bandwidth Alliance that Cloudflare spun up, and good for them; that's great.

They have a bunch of other providers willing to play games with them and partner. Cool, I get it. It's a sales pitch. They're trying to more or less bully Amazon into doing the right thing here, in some ways. Great, not my actual point.

My problem is that it's not just that data transfer is expensive in AWS land; it's also inscrutable, because, ignoring for a second what it costs to send things to the outside world, it's even more obnoxious trying to figure out what it costs to send things inside of AWS. It ranges anywhere from free to very much not free. If you have a private subnet that's talking to something in a public subnet through a managed NAT gateway, whatever your transfer price is going to be has four and a half cents per gigabyte added on to it, with no price breaks for volume.
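The two figures in that stretch are easy to sanity-check. Here is a quick sketch of the "too lazy to check" math (a steady 10% annual wholesale decline compounding toward a 93% total drop) plus the NAT gateway arithmetic; the 10 TB/month traffic figure is invented, and the per-hour charge for the gateway itself is ignored:

```python
# 1) If wholesale bandwidth drops ~10% per year while AWS egress pricing
#    stays flat, how many years until wholesale has fallen 93%?
def years_to_decline(target_drop: float, annual_decline: float = 0.10) -> int:
    remaining, years = 1.0, 0
    while 1.0 - remaining < target_drop:
        remaining *= 1.0 - annual_decline
        years += 1
    return years

years = years_to_decline(0.93)  # roughly a quarter century of flat pricing

# 2) The managed NAT gateway's 4.5 cents per gigabyte of data processing,
#    with no volume discounts, applied to a hypothetical 10 TB/month:
NAT_PROCESSING_PER_GB = 0.045                      # USD per GB processed
monthly_cost = 10 * 1024 * NAT_PROCESSING_PER_GB   # about $460/month
```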
So, it's very easy to accidentally wind up with some horrifyingly expensive bills for these things without being super clear as to why. It's very challenging to look at this and not come away with the conclusion that someone at the table is the sucker. And, as anyone who plays poker can tell you, if you can't spot the sucker, it's you.

Further—and this is the part that I wish more people paid attention to—if I'm running an AWS managed service—maybe RDS, maybe DynamoDB, maybe ElastiCache, maybe Elasticsearch—none of these things are necessarily going to be best-of-breed for the solution I'm looking at, but their replication traffic between AZs in the same region is baked into the price, and you don't pay a per-gigabyte fee for it. If you want to run something else, you either run it yourself on top of EC2 instances or grab something from the AWS Marketplace that a partner has provided. There is no pattern in which that cross-AZ replication traffic is free; you pay for every gigabyte, generally two cents a gigabyte, but that can increase significantly in some places.

Corey: I really love installing, upgrading, and fixing security agents in my cloud estate. Why do I say that? Because I sell things for a company that deploys an agent. There's no other reason. Because let's face it: agents can be a real headache. Well, Orca Security now gives you a single tool to detect basically every risk in your cloud environment that's as easy to install and maintain as a smartphone app. It is agentless—or my intro would have gotten me in trouble here—but it can still see deep into your AWS workloads while guaranteeing 100% coverage. With Orca Security there are no overlooked assets, no DevOps headaches—and believe me, you will hear from those people if you cause them headaches—and no performance hits on live environments. Connect your first cloud account in minutes and see for yourself at orca dot security.
That's orca—as in whale—dot security, as in that thing your company claims to care about but doesn't until right after it really should have.

Corey: It feels predatory, it feels anti-competitive, and you look at this and you can't shake the feeling that somehow their network group is being evaluated on how much profit it can turn, as opposed to being the connective tissue that makes all the rest of their services work. Whenever I find someone with an outsized data transfer bill while I'm doing the deep-dive analysis on what they have in their accounts, and I talk to them about it, they come away feeling, on some level, ripped off, and they're not wrong. Now, if you look at other providers—Oracle Cloud is a great example of this—their retail rate is about 10% of AWS's for the same level of traffic. In other words, you get a 90% discount without signing any contract: just sign up and go with Oracle Cloud. Look, if what you're doing is bandwidth-centric, it's hard to turn your nose up at that, especially if you start kicking the tires and like what you see over there.

This is the Achilles heel of what happens in the world of AWS. Now, I know I'm going to get letters about this, because I always do whenever I rant about it, saying that no one at any significant scale is paying the retail rate for AWS bandwidth. Right. But that's sort of the point, because when I'm sitting here doing back-of-the-envelope calculations on starting something new, and that thing tends to be fairly heavy on data transfer—like video streaming—I look at the retail published rates. It doesn't matter what the discount is going to be, because I'm still trying to figure out whether this thing has any baseline level of viability, and I run the numbers and realize, wow, 95% of my AWS bill is going to be data transfer. Well, I guess my answer is not AWS.
That's not a pure hypothetical. I was speaking to someone years ago, and they have raised many tens of millions of dollars for their company since, and it's not on AWS, because it can't be, given their public pricing. Look, this is not me trying to beat up on AWS unnecessarily. I'm beating them up on something that, frankly, has been where it is for far too long and needs to be addressed. This is not customer obsession; this is not earning trust; this is not in any meaningful way aligned with where customers are and the problems customers are trying to solve. In many cases, customers are better served by keeping two copies of the data, one in each availability zone, rather than trying to replicate back and forth between them, because that's what the economics dictate.

That's ludicrous. It should never be that way. But here we are. And here I am. I'm Chief Cloud Economist Corey Quinn at the Duckbill Group. Thank you for listening to my rant about AWS data transfer pricing.

Announcer: This has been a HumblePod production. Stay humble.
Hello and welcome to episode 412 of Reversim. Today's date is June 13th, 2021 - I'd say it's a somewhat historic date: apparently a government was formed today. We don't know whether it will last, but the vote was literally today and we're in suspense about what comes next [worried the number of governments will overtake the number of Reversim episodes?].

Today we're recording with Yinon from Via - you'll introduce yourself in a moment - and we also have a special guest: standing in for Ori, we have Yonatan from Outbrain - hi, Yonatan! Hello and welcome - Yonatan has been at Outbrain for quite a few years now and has appeared on past episodes, though that's really ancient history by this point [say, 328 The tension between Agility and Ownership, or Final Class 23: IDEs, or 131 uijet, or 088 Final Class 2... there are more].

So first of all - since 412 is Precondition Failed - everyone knows that, right? I definitely didn't have to look it up before the show, not at all, I knew it by heart... - so Yonatan, tell us about your Precondition. Or who you are, in other words...

(Yonatan) So first of all - I'm a long-time Reversim listener; I think that when I joined Outbrain ten years ago, the podcast was one of the reasons. I joined as a backend engineer, and for the last five years I've been leading engineering at Outbrain.

(Ran) "Leading engineering at Outbrain" is your modest way of saying you're the head of engineering?

(Yonatan) Head of engineering...

(Ran) Nice. Engineering at Outbrain is a large group, you have plenty of work, and thank you for coming.

(Yonatan) Gladly.

(Ran) So, Yinon from Via - and today we're going to talk about Serverless, but from other angles, angles we haven't covered yet. Before we dive into the topic, tell us a bit about yourself.

(Yinon) Hi, I'm Yinon, nice to meet you. I've been at Via for about three years and a bit. Just for background - we mentioned Preconditions - I came to Via from Ravello. For those who don't know, Ravello was a pretty big supporter of Reversim, which is also how I got to know the podcast, and how I got to you, Ran... I'll tell you a bit about Via; I think they've already been guests on the podcast [indeed - 360 Via], but maybe from my personal angle - I always like to tell everyone a tall tale about Via...

(Ran) By the way, that guest was a product person, so it was less technical - today we're going to be much more technical.

(Yinon) Great - that's exactly my introduction. Via actually started... I always tell this tall tale, which no one in Via's hallways has ever confirmed or denied. As the story goes, our CTO and founder, Oren, was walking down Allenby Street one day trying to catch a shared taxi toward Tel Aviv, toward the university - and discovered that, back then, it was pretty complicated. Seven or eight years ago, catching a shared taxi wasn't that easy: you had to find one, know where it goes, figure out how to board it and what to do with it, and once aboard you had to pay, and it was all terribly complicated...

And what crossed his mind was: "Wow, this thing is a cool idea, a sort of charming Middle-Eastern chaos" for anyone already used to that public transit - and on the other hand, how great would it be if there were an app that let you order a shared ride, take it from wherever you are to wherever you want, that told you how to get there and what to do, and paid for you... Sounds like a dream. So he went and made it real, and that's how Via began its life - in New York...

From there we rolled out to many other places - we started as a sort of consumer service, and from there it evolved into different services like pre-booking and paratransit. Today we also handle vehicles like school buses - meaning the entire school-bus operation of the City of New York - so we really can do a great many things. And all of this is built on a single technology stack which, as Ran hinted earlier, runs almost entirely on Serverless.

(Ran) Yes, we'll get into that story in a moment... By the way, I assume many of our listeners know Via, and your employees have given conference talks in the past - there's an interesting mix there of technology, algorithms, Data Science and other things - and of course a product story as well. Someone hearing your pitch might say, "Ah! That's Uber!" or the like - but it's not... We won't get into that this time, because we're not doing a product episode.
There are differences, but we won't go into them [see also 360 Via]. Now let's get into the technology. One of the interesting things about Via is that the entire technology stack, or most of it, runs on a Serverless platform. So tell us - how is your stack built?

(Yinon) Let's start with some history. Historically, Via, like any good Israeli high-tech company, any good Israeli startup, began with a monolith, which to our joy or our sorrow exists to this day - that famous monolith called "Via Server," a very original name... Via Server originally ran on many EC2 machines on Amazon - pretty standard, as you'd expect, long before Kubernetes. What we discovered is that as you grow - and I mentioned earlier the places Via expanded to - the stack itself started getting very, very expensive... On the one hand, we ran into many scale-up problems - we had to keep scaling that monolith up to support ever-growing traffic - and on the other hand, if we left it at a very high scale, AWS enjoyed it very much and Via rather less...

What's more, Via pays attention to usage patterns, especially originally but also today - there are peak periods. Think of Tel Aviv: we also operate the Bubble service there, and in the morning hours, roughly 07:00 to 09:30, anyone trying to catch a Bubble knows it's not easy - the vans are very full, and you're all hammering our servers... The same happens in the evening. On the other hand, anyone trying to catch a Bubble around 12:00 has an easy time and finds one very quickly - simply because there's less demand; fewer people are trying to get to and from work.

(Ran) But it sounds like you know in advance what the peak hours will be... Why not simply scale up and scale down, when you even have five minutes to warm up servers, if you know ahead of time...?

(Yinon) Great - that's exactly what we said too... "Wonderful! Life is easy - we know how to warm up and scale down." The problem is that for that, you need to predict what the traffic will actually be - and each time, as in engineering, you have to add buffers... "How many will there be tomorrow, exactly? A thousand riders? So let's bring up 10 servers, or 15..." And then you wake up the next morning - and it's raining, and rain is a killer... Suddenly those 1,000 riders become 2,000... And if you look at a city at the scale of New York, obviously a much bigger city, there it jumps from 10,000 riders to 100,000 - all at once, suddenly everyone is hammering you.

So if there's no rain - great; otherwise it means someone has to notice at 06:00 in the morning and say, "Wow, we'd better add even more servers" - and you can imagine how many times someone missed it, or was a few hours late, or misjudged the estimate and added only 50 servers instead of 100 - and we lost out.

(Ran) I get the pain point - having to get up at 06:00 in the morning... I mean, if this were a company that's active at noon, maybe nobody would care and you'd never have gotten to Serverless, but getting up at 06:00 is another story...

(Yinon) That's a really excellent reason to move to something else... The second problem is that even after we brought up those 50 servers, someone also had to remember to take them down afterwards... It's not always as simple as "bring them up now and remember later," because people forget, and the next day... and suddenly the AWS bill doesn't look great.

So we looked for a solution that would also give us automatic scaling. On the one hand you say, "Fine, there are more modern solutions"... Kubernetes arrived at a later stage, but even before that there were AWS Auto Scaling Groups; you could scale with those as well. The problem is that when you look closely, a monolith like this, though written in Python, which starts relatively fast, still has to come up and do its init... Think about what Via does: it has to download the maps, download configurations, prepare all sorts of things... The warm-up and container build time is not short - it can take minutes, depending of course on the traffic and on how much needs loading - and that is quite painful. Scaling up by relying only on that auto-scaling isn't fast enough, and there's a not-so-short window in which you lose. You lose money, you lose traffic - and the user trying to get a ride gets really poor service. That's why we started looking at other things.

(Ran) But a load time like that... Yonatan will surely jump in here and argue: "Wait! You're not a monolith! There are microservices!"... So why did the monolith need to load all the maps of the entire world and everything else? True, it's heavy - but there are other solutions for that too, not just Serverless...
(Yinon) Great - that's exactly the point where we looked - and thanks, Yonatan, for the question... We looked and said, "Okay, one solution is indeed to say: fine, let's build this out of various microservices." In fact, you could say that is the solution we chose - the only question is what the transport is, what the pipeline is through which we bring it up. One option was to say, "Okay, we'll write the code as a collection of small servers, in Python..." - and by the way, for some things that's what we did. There are places where... Via isn't dogmatic, declaring "Serverless is the only way"; that's not the right way for us to look at it. We say that in places where we can do it easily - not by bringing up a whole heavy service on Kubernetes, but by exploiting the advantages of other approaches - that's where we looked at Serverless.

If you ask why we chose to go with Serverless in specific parts, we set ourselves a few interesting goals. We said we want as little DevOps involvement as possible, because DevOps is expensive and complicated - not just the people, but the time invested in DevOps, in setting up and arranging environments - it's very expensive. Even in a very successful environment like Kubernetes, which genuinely has many advantages, there is still a great deal of configuration to do. You need to understand the parameters that determine scale-up, scale-down and scale-in - to configure the cluster, to fine-tune constantly, in order to actually reach the results you want. So even in a classic microservices environment like that, with containers and Pods, most of that configuration work is on us, our responsibility...

(Ran) I think there's a law - the law of conservation of energy in technology: work doesn't disappear, it changes form... If you used to wire cables, today you configure VPCs; and if you used to configure some machine, today you write some script or some configuration in Terraform or any other tool... The specialization changes, but the work doesn't disappear.

(Yonatan) You said you're not dogmatic - you're not saying this is the only way. Are there things where you still use... does the monolith still play a role? Are the microservices still around? Or is it...

(Yinon) First of all, the monolith still exists - not in every deployment and not everywhere, but it still sits at the heart of some of our deployments. And we still have some other services that are standard microservices, with containers - some written in Java, some in Python - that still exist as classic microservices, Docker containers inside Kubernetes. That's mainly in places with very high memory consumption - for instance, holding the map: a map, naturally, is an object that takes a lot of memory, and there it's actually more convenient for us to hold it inside a container. So we have entire Kubernetes stacks as well.

However, where it's pure logic, or "container glue" - and that, by the way, is a lot of what Via does... Think, for example, of drivers moving around the city reporting their location to us - they constantly need to report where they are and receive instructions - that's something that doesn't require much memory. What it really requires is the ability to scale up and scale in according to how many drivers are currently on the road. So instead of bringing up containers to handle that, we found it much easier to bring up "micro-micro-micro-containers," or "nano-containers" - which, in practice, is Lambda... That's exactly what it does.

(Yonatan) So that's a use case of, say, receiving the drivers' locations and writing them somewhere, I assume? Are there other use cases - say, if I want to order a ride, is that also...?

(Yinon) That's on Serverless too, absolutely. That will also run Serverless. A request arrives at some Lambda sitting, say, behind either an ALB or some API Gateway. It's connected directly to the Lambda - and from there, effectively, a whole "chain of Lambdas" can run... A request comes in - we identify the rider - from there it goes to our booking hub, which is the piece that "talks" to the vehicles - a booking is made - it moves on to a Lambda that knows how to handle payments, since obviously we need to verify you're allowed to board the ride... From there it reaches wherever the driver receives his instructions, and then it calls the mapping service - which, remember, is a container - and tells it, "Please generate a new route for the driver" that leads to picking up that rider. Finally it's translated into instructions back to the driver - and the driver receives the instruction and keeps sending us those reports, which we call Heartbeats, in the original naming...
And it comes back into our systems and continues the same flow I described earlier.
(Ran) Let's talk about money - economics... Earlier you mentioned there used to be EC2 instances, and you had to scale up and then maybe forgot to scale down, and that costs money, and so on... I've worked with Serverless, at a startup much smaller than Via, a few years back - five years ago or so - and back then it was clear that Serverless is expensive... That is, there are advantages on the operations side; there's an operational model, a programming model, that's healthy - I really liked all of that - but one thing was clear: it was going to be very expensive once we scaled up. If nothing is in the air, then true, it costs nothing - if nobody calls your function, then like a tree falling in the forest with no one around, it didn't really fall... so it's obviously easier than maintaining a container on EC2. But once there's significant traffic and the function is called constantly, it was clear - at least back then - that it would also be much, much more expensive than maintaining your own microservice. What does the economics of this look like today?
(Yinon) Let me tell it this way... We'll start with a story and slowly work our way there. At the start of the COVID crisis [the first wave...], you can imagine what it did to ride services... In one blow, one morning, within roughly a week, we went from hundreds of thousands of riders in New York to about ten. Nobody rode, nobody moved - and that was worldwide, not just New York. Now look at the economics, at what cost money: in one blow all our Lambdas dropped to zero and we stopped paying for them entirely - whereas all those containers that remained in the monolith and the like kept quite a few dollars flowing straight into Mr. Bezos's pockets... [have some empathy - the man has a spaceship to build]. So at least at the scale-up/scale-in level, the economics are extremely good - at no point do we need to keep "spares" around to handle load "just in case". You can talk about warm-ups, but you don't keep spares. And on the other hand, even while a Lambda is up and doing its work, we see that the actual processing time, during which the Lambda is really working, is very low - mainly because we use asynchronous calls. If you work in a more asynchronous fashion - that is, incoming calls pass...
...spend most of their time moving between Lambdas inside queues, with no synchronous calls going out - then suddenly the time the Lambda actually runs is very, very short. Specifically, by the way - a few months ago AWS changed their billing from a minimum of 100 milliseconds to a minimum of 1 millisecond per Lambda, which significantly improved the cost, especially of short Lambdas - which is a lot of what we run.
(Ran) I see - so if, for example, I describe some flow of a Lambda calling a Lambda and so on - if each one waits for the next synchronously, you pay the bill for all of them along the way; but if it happens asynchronously, passing through SQS or any other mechanism, you pay only for the minimal processing time. OK, I get it...
(Yonatan) What's also interesting here, Ran, is that there's a direct link between the business - the traffic you receive - and the cost; with regular services it's much harder to make that connection. A service is alive the whole time, even if it gets no traffic, even if you're not "getting paid".
(Yinon) Exactly - among Via's cities there are entire cities with no service at certain hours of the day. The Bubble service, for instance, runs only during the day; it ends around 22:00-23:00 at night [even a bit later]. Think of a server being up all night, still needing to answer questions: if some rider opens the Bubble app at 02:00, we'll tell them there's currently no service - but for that, some server has to be up... So the more of these things we move to Serverless, if someone makes a request they get an answer - and if not, there's no need to stand the service up at all.
(Ran) So you're saying that because you're characterized by very high elasticity - maybe COVID is an extreme example, but even day to day there's elasticity: weekends, different hours of the day, holidays... at any rate, significant elasticity - that makes the Serverless model more cost-effective for you. I wonder - I don't even know if there's an answer - but I wonder: for someone with a relatively balanced workload around the clock, would it pay off as well?
(Yinon) It's always a question of what your workload really is - and how much of what you do is truly broken down into microservices all the way. What do I mean?
For example, look at that same Via service: on one hand, what really accounts for a huge share of the calls and traffic is those driver Heartbeats - something we know is very heavy both in the number of calls made and in the total processing time running in there, even if each individual bit of processing is very short. So if you have one service holding those Heartbeats together with some other service, you're creating very tight coupling of one server across two layers. In the Serverless world it's really easy to do that breakdown all the way to "nano-services" - a service that perhaps has no right to exist on its own, but scaling up a tiny piece of a service is trivially easy.
(Ran) OK, fine - that was the economic consideration. Now let's look at the methodological one. I'll put it this way: did your developers become better because they work stateless? Did they become better because they're forced to run under constraints like Lambda functions? In other words - how do you see such a methodology affecting the way you develop - development quality, code quality, and so on?
(Yinon) There are several answers to that... On the one hand, yes - developers, having no other option, need to think about a world with no central memory, no sharing between... the only sharing between containers is something external, so it makes people think hard about how state is held and what to do with it. We really did come up with many directions and solutions for that - some of them, by the way, driven by economic considerations as well...
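As a back-of-the-envelope illustration of the billing change mentioned above (the switch from a 100 ms minimum to 1 ms granularity): for short handlers like these Heartbeats, the minimum dominates the bill. The per-GB-second price below is illustrative only - check the current AWS price list.

```python
import math

def billed_cost(duration_ms, memory_gb, price_per_gb_s, minimum_ms, granularity_ms):
    """Cost of one invocation: the duration is raised to the billing
    minimum, then rounded up to the billing granularity."""
    ms = max(duration_ms, minimum_ms)
    ms = math.ceil(ms / granularity_ms) * granularity_ms
    return (ms / 1000.0) * memory_gb * price_per_gb_s

PRICE = 0.0000166667  # USD per GB-second - example figure only

# A 5 ms handler with 128 MB of memory:
old = billed_cost(5, 0.125, PRICE, minimum_ms=100, granularity_ms=100)
new = billed_cost(5, 0.125, PRICE, minimum_ms=1, granularity_ms=1)
# Under 1 ms granularity the same invocation bills 20x less compute time.
```

At millions of short invocations a day, that factor is exactly the improvement described in the episode.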
For example, we found that working with databases that are more serverless in nature, like DynamoDB, comes out much cheaper for us - and also more convenient for traffic bursts - than using a "MySQL-style" database that has a harder time scaling up. And Dynamo is also "if you didn't use it, you didn't pay" - if you didn't read, nothing happened - whereas with MySQL, even in excellent flavors like Aurora, you pay as long as the instance is up, no matter what. On top of that, we found there are many ways to improve reading from the database - we also work, of course, with some ElastiCache or some S3 as a local cache, which helps us cope with bursts of "suddenly thousands of Lambdas trying to attack the database" [the next Netflix movie?]. Dynamo handles that quite gracefully - that's what it's built for - while Aurora, which is a pretty good database, enjoys such a traffic burst rather less. There are a few solutions on AWS that work and solve this problem - some of them we built ourselves: we stood up a Redis cache on top of the whole thing, using ElastiCache. Another option is to put a proxy in front of the database - essentially doing connection pooling ahead of the database itself. That really lets us run, again, a lot of load with a lot of spikiness...
(Ran) Let's dwell on this connection-pooling situation for a moment - I have to say I got burned by this too... exactly the situation you describe: Lambda functions, with Aurora and MySQL behind them - and thousands of Lambda functions trying to connect to the database... Now, if all those thousands were merely threads inside the same process, you'd have connection pooling and... say you decide the database allows 300 connections, then 300 threads talk to the database and the rest wait in line. But here - a Lambda doesn't know how to "wait in line"... so they start failing...
(Yinon) Lambdas really are a somewhat selfish creature in that respect - they don't "look around" much. And there really are two solutions we discovered and used - one of them is a built-in AWS solution: they have a proxy designed to solve exactly these problems. You basically put...
Think of it as a kind of machine you put in front of the... in effect an EC2 instance in front of the database. The Lambda connects to that machine, the machine itself holds a connection pool, and it tells the Lambda "OK, hold on a second, I'll catch you on the next free connection". Conceptually it behaves very much like what you mentioned earlier inside a monolith... and it's what makes using Aurora feasible [obligatory reference to The Robots of Dawn...].
(Yonatan) By the way, just to complete the picture and the motivation - it's not only that they bombard the database and some of them fail. Why do we create a connection pool in the first place? To save the connection-setup time, which over TCP is expensive - but a Lambda can't do that... a Lambda has to establish the connection from scratch every time, and then you pay again in latency - and ultimately in dollars... And that's just one example of a connection pool - I think any local cache... anywhere in microservices where you could use a local cache, here you're in trouble; you need other solutions.
(Yinon) Right... so there are a few ways... again, when we ran into a similar problem - in a place, by the way, where the database was read-mostly - the solution was to use ElastiCache as a sort of cache over the database. That let us declare a much smaller database, and all that's needed is a Lambda that, at the relevant refresh interval, would simply refresh the cache. Pretty simple - read from the database, push into the cache... and effectively work directly against ElastiCache. We came across a few more solutions along the way - there's one called EFS, which lets you hold a file system the Lambdas share, with much easier access: no need to hold a connection, you just access the data directly. And with that same proxy, it's genuinely useful for keeping a TCP connection open to the database, so you only need to create a small connection to that little gadget.
(Ran) Sometimes you do want control over a server... I mean: we're talking about Serverless, and you run inside some container. But honestly - sometimes you want to set the amount of memory, play with the TCP stack, do all kinds of optimizations on file descriptors and so on... What do you do when you find yourself in that situation?
What do you do when you feel you're already scratching the glass ceiling inside your Lambda?
(Yinon) First of all, I'll ask you - why? What's the motivation? Because with us, usually, at the end of the day the business is not that... we're not in the business of poking around the guts of some file... If there's no choice, then there's no choice - but so far we haven't found a single place where it was needed. In other words, we preferred the ease of No-Ops, where all you need to set on a Lambda is its memory - and it runs. I'm exaggerating, of course; you can set a few more things - whether it runs inside a VPC or outside, there are Security Groups and so on - but broadly, once you've set those once, you're done, and there's almost nothing left to fiddle with. Through the memory setting you actually determine... not just the memory but the overall performance of the Lambda. Roughly, the way I think about it: when you set the memory, you determine how many Lambdas run on a single AWS container - and the fewer Lambdas run, i.e., each taking more memory so fewer fit on the container, the more resources you have within that container: CPU, network card - and that's essentially the control you get. All in all, we found that playing a little with the memory gives us well beyond what we need in terms of control inside the server. For things that are truly fine-grained - I agree, Lambda isn't the right fit. That's exactly why we go elsewhere, to containers and Pods.
(Yonatan) If you look back a bit, there used to be these "monsters" - WebSphere and JBoss and even Tomcat... you'd write your code, do some... they called it a WAR and an EAR and all kinds of curses... and deploy it into some container. When microservices arrived, things went in a different direction... instead of being a "guest" inside some runtime that someone else maintains - very big and complex, with a small span of control for you - you became the owner: if you run Python, you own the Python process, the JVM - and in a way Serverless takes you back a bit, at least "back" in terms of fashion... you're once again a guest inside a runtime that someone else holds and configures. How are you doing with this retro?
(Yinon) I love this retro, because the one holding and configuring that giant server is not me...
It's the big guns at AWS, who know exactly what they want - and they're pretty good in this world... that's why I live with it in peace. If I had to maintain that JBoss or that WebSphere myself, we probably wouldn't be talking today... Since AWS does it - and at the end of the day they really know exactly what they're doing, and they're good at it - I live with it quite peacefully. I am, admittedly, at their mercy, which sounds a bit fatalistic; but at the end of the day, if there's anyone whose hands are good to be in, it's probably the folks at AWS, who do a pretty good job. And what I really gain is that I no longer have to deal with complicated configurations - essentially, I deploy my code, it works - and that's pretty much the story. Especially since AWS is very open about its implementation and what it's willing to share - to the degree that they let you run any runtime you want, bring your own runtime... if you really want to run Fortran code on Lambda, no problem, you can do it [seriously...]. So it's closed on one side - but there's a lot of openness on the other.
(Yonatan) And I assume that if you really do want to own the process, you'll build a microservice that solves that problem - if you need to configure however many file descriptors you need...
(Yinon) Exactly - and by the way, even there it's a bit different, but in a certain sense you're hosted inside Kubernetes... That is, you have more control over the process, you really do control the runtime, you have the Pod... On the other hand, you still have some "landlord" telling you "listen, you're not exactly doing what you... I'm still the landlord here".
(Yonatan) True.
(Ran) Are you still a "Python shop"? Or...
(Yinon) Still a Python shop...
(Ran) Meaning - in principle, Lambda would let you, even more easily, diversify across target languages - but that's an opportunity you haven't taken yet.
(Yinon) True - except that what really matters here, and we can talk about this a bit, is cold start...
Look at what happens when a Lambda comes up: when a Lambda spins itself up, it has to do a lot of configuration and setup. And bringing up a Python runtime is - except perhaps for Node - the fastest there is. Significantly faster, for instance, than bringing up a Java Lambda - and we have a few of those too, for historical reasons - and you really see that the Java runtime takes its time coming up... it's heavy. On the other hand, one of Lambda's drawbacks - which is also an advantage, in a sense - is that the Lambda runtime is single-threaded, or at least single-core: there's no real, full support for multi-threading running in parallel. You can look at multiple processes inside a Lambda, but not multi-threading - which, for that matter, removes the need for asyncio or all those complicated threads in Java. We also looked a bit at Golang in the past - a modern, cool language like that [wait for the next Bumpers episode...] - writing containers in it is pretty cool, but writing it inside a Lambda is fairly pointless... I mean, you can't benefit there at all from all the async machinery built into Go, all the async functions.
(Ran) Right... technically it's possible; you just don't gain anything.
(Yinon) Exactly - it's still single-threaded, so it's entirely synchronous.
(Ran) What does the developer experience look like? I mean - what happens if production is suddenly slow, or things suddenly get lost, or suddenly... I don't know, a request hits a timeout, things like that? How do you debug a flow? How do you do tracing?... How do you debug Lambda functions scattered across - I don't know how many... how many do you have?
(Yinon) Last time I counted, we have a good few thousand functions.
(Yonatan) ...You must be starting to miss the monolith, where you could set a breakpoint and see exactly what's going on...
(Yinon) ...Exactly - it is indeed an experience that's... at first daunting... I remember looking at it the first time and saying "OK, what do I do now?"... I open CloudWatch and try to dig through the logs... Not a very interesting experience, not much fun. And indeed, one of the things Lambda demands is clear observability - so there are AWS-native tools - X-Ray, for instance, which AWS pushes hard. A likable little tool that helps you do distributed tracing. The main problem with X-Ray is that you have to work to make it work...
That is, part of the work is actually adding the tracing capability inside...
(Ran) ...Instrumentation.
(Yinon) ...Exactly, that kind of instrumentation... And we preferred a tool that does the instrumentation for us. We dug around a bit and eventually settled on Epsagon. For those who don't know Epsagon - a very interesting tool that, with very little work, lets you trace all of our Lambdas and stitch them together. It uses a library called Jaeger for the distributed tracing - a fairly well-known open-source library; their implementation is simply quite good. It lets us see calls that start in one Lambda and end in another at the far end of the stack, passing through all the other Lambdas - both the traces themselves and the payloads that moved through the Lambdas, from the outside in, through the various SQS queues, calls to the database and so on. It also lets us see the logs - and performance: how long each call took, on the inside.
(Ran) But from my acquaintance with Jaeger - it's excellent for gRPC or HTTP, all the synchronous stuff; but for asynchronous things, say passing through SQS or through Kafka, you normally have to invent a solution yourself... so they wrapped that for you?
(Yinon) They wrapped the whole thing, and handled it very nicely - you can literally see the calls into SQS and the hop out the other side. They put quite a lot into the tracing... they extended Jaeger into their own tracing - that might be worth a separate episode... But we do use Epsagon and get full observability, end to end. The only places it breaks are the ones where we didn't add the few lines to the Serverless Framework - which is how we deploy the Lambdas - that enable their automatic wrapping; there you really see where it breaks and how hard it gets. And wherever we spot such a place, it's very simple to add that automatic tracing - literally a few lines, adding a small Node module, and presto! It traces automatically and gives us full observability of practically everything.
(Ran) OK, so that's tracing in production - but what does the development experience look like? Say I now need to write some new service, or a new function - do I just deploy it to the cloud and see what happens, or is there something local?
(Yinon)...
That's pretty much it. There are the basics, of course - if you write Python, unit testing is pretty standard stuff that we obviously have to write; it's even a sort of compiler substitute, for lack of an alternative. There's a bit of linting and such - but anyone who knows Python (and I assume few here don't) knows that without a unit test or two to confirm the code even runs, you can very easily deploy some nonsense... You can run the unit tests for simple local debugging. Usually, though, what we do is simply deploy straight to the cloud from the dev environment and run a bunch of HTTP calls against the Lambdas to see that it works - Postman working overtime... There are a few options for running locally - AWS has an option to run a sort of local Lambda server, to "bring Lambda up locally". Is it very convenient? It's not... not a hit, and we found it far easier and more comfortable to deploy directly to the dev environment and run everything from there.
(Ran) So you have some copy of the production environment... I mean, running your own function is one thing - the question is whether it depends on other functions, other queues, other databases - that's where things start getting complicated. So you do all of that straight in the cloud? Not on a local machine?
(Yinon) Correct. You can think of it as a kind of sandbox that contains the Lambdas we have in the world. We deploy the code straight there and test against one another. Naturally, every Lambda is owned by some team - every collection of Lambdas, really every service; it's not a single Lambda, we look at a set of however many Lambdas as a particular service with some purpose. We deploy the various services into the cloud and use convention to get from service to service and do all the wiring between the different Lambdas.
(Yonatan) In a sense, with microservices too, past a certain scale, you're in a fairly similar spot... I mean, once you have a few hundred you no longer bring them all up on your laptop - and there too it really depends on your cloud.
(Ran) Agree 100%...
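A sketch of the "compiler-substitute" unit tests described here, against a hypothetical handler - the handler, its event shape, and the service hours are all made up for illustration:

```python
import json

def handler(event, context=None):
    # Hypothetical handler: reports whether the service is open at a
    # given hour (illustrative hours - not a real city's schedule).
    hour = json.loads(event["body"])["hour"]
    open_now = 6 <= hour < 23
    return {"statusCode": 200, "body": json.dumps({"open": open_now})}

def test_service_closed_at_night():
    resp = handler({"body": json.dumps({"hour": 2})})
    assert json.loads(resp["body"])["open"] is False

def test_service_open_midday():
    resp = handler({"body": json.dumps({"hour": 12})})
    assert json.loads(resp["body"])["open"] is True
```

Running `pytest` over a file like this catches the "deploying some nonsense" class of mistakes before the first cloud deploy; everything beyond that is exercised against the shared dev environment, as described.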
I think the real challenge with data-driven services is reproducing some production environment - that is, you want a copy of the production environment, with production data, but without harming production and without paying production's cost. Sometimes the databases are huge and you don't really want a full copy - so take a subset of the data that's exactly what you need and no more, and make sure you don't harm production. That's a nontrivial challenge for anyone working with large amounts of data, Lambda or no Lambda. How long have you been at Via?
(Yinon) Three years...
(Ran) OK... when you arrived, were you already in the Serverless world?
(Yinon) That was exactly the beginning - the stage where we looked at it for the first time.
(Ran) I'll tell you why I'm asking - I'm trying to imagine an experienced, veteran developer who joins Via now. When you look at developers hired recently - and I don't mean juniors who've never written code, but the... you know, "battle-scarred" kind...
(Yonatan) Who've taken production down a few times already...
(Ran) Yes... do you see them look at this whole Serverless world, now needing to write some new function - do you see them fighting their natural instincts, or does it come naturally, and they "shed" some heavy weight they'd been carrying on their shoulders and blossom in the Serverless environment?
(Yinon) I'd say they pretty much blossom... I mean, there's always that first head-spinning transition that goes, like you mentioned earlier: "wait, I have no connection pool", "I don't have... hang on, I need to understand how this arrives - am I deploying this to the cloud already? What happened to me?
Isn't it a bit early?". So there really are those few days of "wait, wait - how do I do things here?" But it honestly takes very little time. At Via you usually start writing code within less than two weeks - that is, you need to set up a local environment, make sure everything works, everything's installed and in order - and then learn the landscape a bit, both Via's world and the Serverless world. But within genuinely less than two weeks you get a task, and OK - you deploy for the first time, then the second, and from there it reaches production pretty fast.
(Yonatan) There's also, I assume, the advantage that the scope of code you need to know in order to make a change is probably smaller - I mean, it probably still depends on many other things, but it's been narrowed down for you in advance to a certain set of functions or services...
(Yinon) Yes - it's probably easier to break things into small microservices, because the cost of standing up a microservice is almost nothing. No need to deploy a new Pod or create anything new - it's "OK, copy the serverless.yml", a simple yml that just defines the service itself, with almost no settings in it. From there you deploy a new service very, very easily, which lets us run a great many microservices - in my group alone there are between 80 and 100 microservices, and growing...
(Yonatan) Is there some optimization AWS offers where, say, they detect Lambdas calling each other frequently and effectively drop the network between them so they run in-process?
(Yinon) Cool idea... but not that I know of... What we do a lot, really, is that when we have many places calling the same code over and over, we simply package it as packages. That is, instead of packaging it as a separate Lambda, we package it as such a package and then reuse it in different places.
(Ran) Which, in "Lambda-speak", is a library, right? I mean, there's a limit on function size, so for that AWS offers packages, which are like a library...
(Yinon) No... Lambda itself has, built in, what's called Layers... With Layers, AWS lets you define, in the AWS settings, actual "layers" of a Lambda, deployed as part of the Lambda when the Lambda itself is deployed into the container. A pretty cool idea - we don't use it much...
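For illustration, the kind of minimal serverless.yml mentioned above might look like this - a sketch only: the service and handler names are made up, and a real file would add the VPC, IAM and tracing settings discussed elsewhere in the episode.

```yaml
service: driver-heartbeat    # hypothetical service name

provider:
  name: aws
  runtime: python3.9
  memorySize: 256            # the main performance knob discussed earlier

functions:
  heartbeat:
    handler: handler.handler
    events:
      - httpApi: POST /heartbeat
```

Copying a file like this and changing a handful of names is what makes "the cost of standing up a microservice is almost nothing" literal.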
We actually use Node packages to repackage our shared code, for several reasons. Layers had all kinds of technical limitations - only 5 layers were allowed, and if you need a sixth you're stuck. There's also the issue that a cold start doesn't always reinitialize the layer... so your control there isn't strong enough. We felt more comfortable working with Node packages, with orderly versioning, where each Lambda knows when it moves up to the next version...
(Ran) Wait, did you say Node packages? Aren't we in Python?...
(Yinon) Sorry... Python... you're right, 100%...
(Ran) We almost caught you...
(Yinon) You almost caught me... By the way, we do have Lambdas in Node as well - we wrote a few Lambdas in JavaScript. Some teams preferred working in JavaScript - not many... It works just as well as Python, by the way, though I'm more of a Python fan than a Node fan, so my team works mostly in Python.
(Ran) Right... by the way, Yonatan, let's come back to your question - you asked whether, when functions call each other frequently, such an optimization is possible. So (a) it's a good idea, but in Yinon's case it probably wouldn't help, because they make everything asynchronous and put everything on a queue, so you have to send to the queue anyway...
(Yonatan) Wix actually gave a talk not long ago on their vision for the Serverless world, and they gave exactly that example. [A month ago - Beyond Serverless and DevOps, Aviran Mordo] I mean - exactly this optimization opportunity. So it's worth...
(Ran) So what - does the queue also live inside the host?
(Yonatan) I don't know, I think it was... you'd have to ask Aviran; it was more "future plans", as I understood it.
(Ran) I see. Well, Yinon - fascinating... we're really approaching the end here - give us a few "closing words"; I'm sure you're hiring...
(Yinon) Of course...
We certainly are - like probably every other company in the country - but yes, we're definitely hiring. We're hiring, by the way, in several places around the country - both in Tel Aviv and in Jerusalem - and we largely work from anywhere, so we'd be very glad to hear from you if you're interested. And yes - the Serverless world is fascinating in my eyes. It really changed my thinking from the moment I arrived at Via, and it genuinely lets me do quite a few things very fast and very easily. And again, to sum up - we're not dogmatic about it. We very much believe in Serverless as one of the technologies that help us move the product forward - but, you know: what works, works... if it works, don't break it...
(Ran) Well, great - thank you very much, this was interesting, see you around.
Thanks Ran, thanks Yonatan.
Reversim Summit 2021: the call for papers is now open!
The audio file is available here - happy listening, and many thanks to Ofer Purer for the transcription.
Tune in to learn:
- How Vue.js was chosen for Vue Storefront 1 and Nuxt.js for Vue Storefront 2
- Vue Storefront's journey with the ElastiCache layer and why it was transitioned out in VSF2
- Why Rakowski sees Magento as the most vibrant headless ecosystem
- What hosting solutions are most popular for Vue Storefront
- What VSF provides its customers to help optimize for Core Web Vitals
In this episode, Nick and Rob share the top launches from AWS, including interviews about AWS DeepComposer and Amazon AppFlow.
In this episode, Nick and Rob share the top launches from AWS, including interviews about ElastiCache for Redis Global Datastore, AWS Amplify, and Bottlerocket.
Since re:Invent 2018, the Amazon ElastiCache team has been hard at work innovating on behalf of customers. In this session, we review the work done in 2019 to make sure that ElastiCache is the most cost-effective and best-performing Redis- and Memcached-compatible cloud service.
Managing Redis clusters on your own can be hard. You have to provision hardware, patch software, back up data, and monitor workloads constantly. With the newly released Online Migration feature for Amazon ElastiCache, you can now easily move your data from self-hosted Redis on Amazon EC2 to fully managed Amazon ElastiCache, with cluster mode disabled. In this session, you learn about the new Online Migration tool, see a demo, and, more importantly, learn hands-on best practices for a smooth migration to Amazon ElastiCache.
The news we analyzed this week:
1:22 TTFB support in CloudFront logging fields - https://docs.aws.amazon.com/en_us/AmazonCloudFront/latest/DeveloperGuide/AccessLogs.html#LogFileFormat
3:34 Savings Plans - https://aws.amazon.com/blogs/aws/new-savings-plans-for-aws-compute-services/
6:10 AWS supports automated draining for Spot Instance nodes on Kubernetes - https://aws.amazon.com/it/about-aws/whats-new/2019/11/aws-supports-automated-draining-for-spot-instance-nodes-on-kubernetes/
9:50 CloudFormation import - https://aws.amazon.com/blogs/aws/new-import-existing-resources-into-a-cloudformation-stack
13:37 CodePipeline supports passing variables between actions at execution time - https://aws.amazon.com/about-aws/whats-new/2019/11/aws-codepipeline-enables-passing-variables-between-actions-at-execution-time
15:05 ElastiCache supports T3 instances - https://aws.amazon.com/about-aws/whats-new/2019/11/amazon-elasticache-now-supports-t3-standard-cache-nodes/
15:45 FireLens - https://aws.amazon.com/blogs/aws/announcing-firelens-a-new-way-to-manage-container-logs/
19:30 Mirantis acquires Docker Enterprise - https://thenewstack.io/mirantis-acquires-docker-enterprise
21:23 Kubernetes Cluster API - https://cluster-api.sigs.k8s.io
Subscribe to the newsletter to receive the AWS & Cloud News insights by email every week: https://www.kloudops.io/aws-news/
Sponsor: CircleCI (see our episode on CI/CD with CircleCI)

Show Details

In this episode, we cover the following topics:

Performance Efficiency
- "Ability to use resources efficiently to meet system requirements and maintain that efficiency as demand changes and technology evolves"
- Design principles:
  - Easy to try new advanced technologies (by letting AWS manage them, instead of standing them up yourself)
  - Go global in minutes
  - Use serverless architectures
  - Experiment more often
  - Mechanical sympathy (use the technology approach that aligns best to what you are trying to achieve)
- Key service: CloudWatch
- Focus areas:
  - Selection - services: EC2, EBS, RDS, DynamoDB, Auto Scaling, S3, VPC, Route 53, Direct Connect
  - Review - services: AWS Blog, AWS What's New
  - Monitoring - services: CloudWatch, Lambda, Kinesis, SQS
  - Tradeoffs - services: CloudFront, ElastiCache, Snowball, RDS (read replicas)
- Best practices:
  - Selection: choose appropriate resource types (compute, storage, database, networking)
  - Tradeoffs: proximity and caching

Cost Optimization
- "Ability to run systems to deliver business value at the lowest price point"
- Design principles:
  - Adopt a consumption model (only pay for what you use)
  - Measure overall efficiency
  - Stop spending money on data center operations
  - Analyze and attribute expenditures
  - Use managed services to reduce TCO
- Key service: AWS Cost Explorer (with cost allocation tags)
- Focus areas:
  - Expenditure awareness - services: Cost Explorer, AWS Budgets, CloudWatch, SNS
  - Cost-effective resources - services: Reserved Instances, Spot Instances, Cost Explorer
  - Matching supply and demand - services: Auto Scaling
  - Optimizing over time - services: AWS Blog, AWS What's New, Trusted Advisor
- Key point: use Trusted Advisor to find ways to save $$$

The Well-Architected Review
- Centered around the question "Are you well architected?"
- The Well-Architected review provides a consistent approach to review a workload against current AWS best practices, and gives advice on how to architect for the cloud
- Benefits of the review:
  - Build and deploy faster
  - Lower or mitigate risks
  - Make informed decisions
  - Learn AWS best practices

The AWS Well-Architected Tool
- Cloud-based service available from the AWS console
- Provides a consistent process for you to review and measure your architecture using the AWS Well-Architected Framework
- Helps you: learn, measure, improve
- Improvement plan:
  - Based on identified high- and medium-risk topics
  - Canned list of suggested action items to address each risk topic
- Milestones:
  - Make a read-only snapshot of completed questions and answers
  - Best practices: save a milestone after initially completing the workload review; then, whenever you make large changes to your workload architecture, perform a subsequent review and save it as a new milestone

Links
- AWS Well-Architected
- AWS Well-Architected Framework - online/HTML version (includes drill-down pages for each review question, with recommended action items to address that issue)
- AWS Well-Architected Tool
- Enhanced Networking
- Amazon EBS-optimized instances
- VPC Endpoints
- Amazon S3 Transfer Acceleration
- AWS Billing and Cost Management

Whitepapers
- AWS Well-Architected Framework
- Operational Excellence Pillar
- Security Pillar
- Reliability Pillar
- Performance Efficiency Pillar
- Cost Optimization Pillar

End song: "The Shadow Gallery" by Roy England

For a full transcription of this episode, please visit the episode webpage.

We'd love to hear from you! You can reach us at:
- Web: https://mobycast.fm
- Voicemail: 844-818-0993
- Email: ask@mobycast.fm
- Twitter: https://twitter.com/hashtag/mobycast
- Reddit: https://reddit.com/r/mobycast
At Airbnb, we use Redis extensively as an in-memory data store to reduce latency and provide sub-millisecond response for our website, search, images, payments, and more. We migrated our self-managed Redis environment from EC2 classic to fully-managed Amazon ElastiCache for Redis to reduce operational overhead and improve availability. Now, all our Redis is in an AWS managed service that provides multi-Availability Zone support, automatic failover, and maintenance. Attend this session to learn how we migrated our Redis environment while ensuring data integrity and zero downtime.
Lifion is ADP's next-generation platform, born in the cloud and built on an ecosystem of containerized microservices. Initially developed entirely on EC2, Lifion is undergoing a cloud-native transformation and embracing AWS managed services. We will discuss our strategic architectural objectives and transformational journey, as well as what we learned in adopting Kinesis, Aurora, DynamoDB, and ElastiCache at scale.
In this session, we provide a behind-the-scenes peek at the design and architecture of Amazon ElastiCache. See common design patterns with our Redis and Memcached offerings and how customers use them for in-memory data processing to reduce latency and improve application throughput. We review ElastiCache best practices, design patterns, and anti-patterns. Complete Title: AWS re:Invent 2018: [REPEAT 1] ElastiCache Deep Dive: Design Patterns for In-Memory Data Stores (DAT302-R1)
The AWS database group of services is positioned to cover any need you have for storing and accessing data. We have looked at files and general storage; these services are more specific and are true database solutions. Aurora Amazon recognized the popularity of MySQL and PostgreSQL, and saw that a high-speed solution compatible with those SQL platforms would be worthwhile. They have made it easy and cost-effective to spin up an Aurora instance. I highly recommend you take a look at it for your needs. The setup alone will reduce your headaches, while still providing enterprise-class availability and backups. The Classics You may want to stick to a database engine you know. That is ok with Amazon. They offer solutions (with or without a provided license) for every well-known vendor. This includes Oracle, SQL Server, and more. You can take advantage of a cloud solution without re-engineering your applications. It is as easy as pointing to a new server. Superfast Access The AWS database group includes ElastiCache for the equivalent of pinned or in-memory data. This gives you the fastest access to your data, although at a higher cost; you will know when you need that sort of speed. You can also go the other direction and look at their offerings for big data or NoSQL. The maintenance headaches around administering a database are taken on by Amazon. I recommend you check these out so you can get your solutions out faster.
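The speed-versus-cost tradeoff of an in-memory tier usually shows up as the cache-aside pattern: check the fast store first, fall back to the database, then populate the cache. A minimal sketch in Python, where a plain dict stands in for a store like ElastiCache and the names are illustrative, not any AWS API:

```python
# Cache-aside: consult the fast in-memory tier first, fall back to the
# slower source of truth, and populate the cache on a miss.
cache = {}                               # stands in for Redis/ElastiCache
database = {"user:1": {"name": "Ada"}}   # stands in for the primary database

calls = {"db_reads": 0}

def fetch_from_db(key):
    calls["db_reads"] += 1               # count the expensive reads
    return database.get(key)

def get(key):
    if key in cache:                     # hit: sub-millisecond in a real in-memory store
        return cache[key]
    value = fetch_from_db(key)           # miss: pay the full database latency once
    if value is not None:
        cache[key] = value               # populate so the next read is a hit
    return value

get("user:1")   # miss: reads the database
get("user:1")   # hit: served from the cache, no database read
```

The same shape applies whether the backing store is Aurora, RDS, or DynamoDB; only the `fetch_from_db` step changes.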
Simon takes you through another BIG set of updates - what will catch your imagination? Shownotes: AWS re:Invent 2018: https://reinvent.awsevents.com/ AWS Public Sector Summit Canberra: https://aws.amazon.com/summits/canberra-public-sector/ Amazon QuickSight announces Pay-per-Session pricing, Private VPC Connectivity and more! | https://aws.amazon.com/about-aws/whats-new/2018/05/Amazon-QuickSight-announces-Pay-per-Session-pricing-Private-VPC-Connectivity-and-more/ Introducing Amazon EC2 M5d Instances | https://aws.amazon.com/about-aws/whats-new/2018/06/introducing-amazon-ec2-m5d-instances/ Amazon Polly Introduces a New French Female Voice, Léa | https://aws.amazon.com/about-aws/whats-new/2018/06/amazon-polly-introduces-a-new-french-female-voice-lea/ Amazon Neptune is now generally available to build fast, reliable graph applications | https://aws.amazon.com/about-aws/whats-new/2018/05/amazon-neptune-is-now-generally-available/ Amazon Athena releases support for Views | https://aws.amazon.com/about-aws/whats-new/2018/06/athena-support-for-views/ Amazon Redshift Can Now COPY from Parquet and ORC File Formats | https://aws.amazon.com/about-aws/whats-new/2018/06/amazon-redshift-can-now-copy-from-parquet-and-orc-file-formats/ Amazon DynamoDB Announces 99.999% Service Level Agreement for Global Tables | https://aws.amazon.com/about-aws/whats-new/2018/06/amazon-dynamodb-announces-a-monthly-service-level-agreement/ Amazon DynamoDB Backup and Restore Regional Expansion | https://aws.amazon.com/about-aws/whats-new/2018/06/amazon-dynamodb-backup-and-restore-regional-expansion/ Amazon DynamoDB Accelerator (DAX) SDK for Go Now Available | https://aws.amazon.com/about-aws/whats-new/2018/06/amazon-dynamodb-accelerator--dax--sdk-for-go-now-available/ Introducing Optimize CPUs for Amazon RDS for Oracle | https://aws.amazon.com/about-aws/whats-new/2018/06/introducing-optimize-cpus-for-amazon-rds-for-oracle/ Announcing General Availability of Performance Insights | 
https://aws.amazon.com/about-aws/whats-new/2018/06/announcing-general-availability-of-performance-insights/ Amazon RDS for PostgreSQL Read Replicas now support Multi-AZ Deployments | https://aws.amazon.com/about-aws/whats-new/2018/06/rds-postgres-supports-readreplicas-multiaz/ AWS Database Migration Service Can Start Replication Anywhere in a Transaction Log | https://aws.amazon.com/about-aws/whats-new/2018/06/aws-dms-can-start-replication-anywhere-in-a-transaction-log/ AWS Storage Gateway Adds SMB Support to Store and Access Objects in Amazon S3 Buckets | https://aws.amazon.com/about-aws/whats-new/2018/06/aws-storage-gateway-adds-smb-support-to-store-objects-in-amazon-s3/ Amazon EBS Extends Elastic Volumes to Support EBS Magnetic (Standard) Volume Type | https://aws.amazon.com/about-aws/whats-new/2018/06/amazon-ebs-extends-elastic-volumes-to-support-ebs-magnetic--standard--volume-type/ AWS CloudFormation StackSets Supports Multiple Execution Roles and Selective Update Operation on Stack Instances | https://aws.amazon.com/about-aws/whats-new/2018/05/aws-cloudformation-stacksets-supports-multiple-execution-roles-a/ Introducing CloudFormation Support for AWS PrivateLink Resources | https://aws.amazon.com/about-aws/whats-new/2018/06/cloudformation-support-for-aws-privatelink-resources/ Application Load Balancer Simplifies User Authentication for Your Applications | https://aws.amazon.com/about-aws/whats-new/2018/05/application-load-balancer-simplifies-user-authentication-for-your-applications/ Amazon MQ Now Supports AWS CloudFormation | https://aws.amazon.com/about-aws/whats-new/2018/06/amazon-mq-now-supports-aws-cloudformation/ Application Load Balancer Adds New Security Policies Including Policy for Forward Secrecy | https://aws.amazon.com/about-aws/whats-new/2018/06/application-load-balancer-adds-new-security-policies-including-policy-for-forward-secrecy/ Amazon Cognito Now Supports Custom Domains for a Unified Login Experience | 
https://aws.amazon.com/about-aws/whats-new/2018/06/amazon-cognito-now-supports-custom-domains-for-a-unified-login-experience/ Amazon Cognito Protection for Unusual Sign-in Activity and Compromised Credentials Is Now Generally Available | https://aws.amazon.com/about-aws/whats-new/2018/06/amazon-cognito-advanced-security-features/ AWS Shield Advanced Announces New Onboarding Wizard | https://aws.amazon.com/about-aws/whats-new/2018/06/shield-advanced-onboarding-sign-up-wizard-drt-permission/ AWS WAF Announces Two New Features | https://aws.amazon.com/about-aws/whats-new/2018/06/waf-new-features-queryargs-cidr/ Amazon EC2 Auto Recovery is now available for Dedicated Instances | https://aws.amazon.com/about-aws/whats-new/2018/05/amazon-ec2-auto-recovery-is-now-available-for-dedicated-instances/ Amazon SageMaker Now Supports PyTorch and TensorFlow 1.8 | https://aws.amazon.com/about-aws/whats-new/2018/06/amazon-sagemaker-now-supports-pytorch-and-tensorflow-1-8/ Amazon SageMaker now Provides Chainer integration, Support for AWS CloudFormation, and Availability in the Asia Pacific (Tokyo) AWS Region | https://aws.amazon.com/about-aws/whats-new/2018/05/amazon-sagemaker-chainer-nrt-cloud-formation-support/ Amazon SageMaker Inference Calls are now supported on AWS PrivateLink | https://aws.amazon.com/about-aws/whats-new/2018/06/amazon-sagemaker-inference-calls-are-supported-on-aws-privatelink/ Now Clone a Model Training Job on the Amazon SageMaker Console | https://aws.amazon.com/about-aws/whats-new/2018/05/now-clone-a-model-training-job-on-the-amazon-sagemaker-console/ Automatic Model Tuning is now Generally Available | https://aws.amazon.com/about-aws/whats-new/2018/05/automatic-model-tuning-is-now-generally-available/ Announcing AWS DeepLens support for TensorFlow and Caffe, expanded MXNet layer support, integration with Kinesis Video Streams, new sample project, and availability to buy on Amazon.com | 
https://aws.amazon.com/about-aws/whats-new/2018/06/aws-deeplens-tensorflow-caffe-mxnet-kinesis-video-streams-buy-now/ Amazon Elastic Container Service for Kubernetes Now Generally Available | https://aws.amazon.com/about-aws/whats-new/2018/06/amazon-elastic-container-service-for-kubernetes-eks-now-ga/ Amazon Sumerian Regional Expansions | https://aws.amazon.com/about-aws/whats-new/2018/06/Amazon-Sumerian-Regional-Expansions/ Amazon Sumerian Regional and Feature Expansion | https://aws.amazon.com/about-aws/whats-new/2018/06/amazon-sumerian-regional-and-feature-expansion/ Amazon API Gateway Supports Private APIs | https://aws.amazon.com/about-aws/whats-new/2018/06/api-gateway-supports-private-apis/ Amazon CloudWatch Adds VPC Endpoint Support to AWS PrivateLink | https://aws.amazon.com/about-aws/whats-new/2018/06/amazon-cloudwatch-adds-vpc-endpoint-support-to-aws-privatelink/ Announcing Amazon Linux 2 with Long Term Support (LTS) | https://aws.amazon.com/about-aws/whats-new/2018/06/announcing-amazon-linux-2-with-long-term-support/ AWS Introduces Amazon Linux WorkSpaces | https://aws.amazon.com/about-aws/whats-new/2018/06/aws-introduces-amazon-linux-workspaces/ Amazon CloudWatch Metric Math Supports Bulk Transformations | https://aws.amazon.com/about-aws/whats-new/2018/06/Amazon-CloudWatch-Metric-Math-Supports-Bulk-Transformations/ AWS CloudTrail Event History Now Includes All Management Events | https://aws.amazon.com/about-aws/whats-new/2018/06/aws-cloud-trail-event-history-now-includes-all-management-events/ Amazon ElastiCache for Redis announces support for Redis 4.0 with caching improvements and better memory management for high-performance in-memory data processing | https://aws.amazon.com/about-aws/whats-new/2018/06/amazon-elastiCache-for-redis-announces-support-for-redis-40/ AWS Config Introduces New Lower Pricing for AWS Config Rules | https://aws.amazon.com/about-aws/whats-new/2018/06/aws-config-introduces-new-lower-pricing-for-aws-config-rules/ AWS 
Marketplace Launches New Website Workflow | https://aws.amazon.com/about-aws/whats-new/2018/06/aws_marketplace_launches_new_website_workflow/
Coffee Meets Bagel is a top-tier dating app that focuses on delivering high-quality matches via our recommendation systems. We use Amazon ElastiCache as part of our recommendation pipeline to identify nearby users with geohashing, store feature vectors for on-demand user similarity calculations, and perform set intersections to find mutual friends between candidate matches. Coffee Meets Bagel also employs Redis for other novel use cases, such as a fault-tolerant priority queue mechanism for its asynchronous worker processes, and storing per-user recommendations in sorted sets. Join our top data scientist and CTO as we walk you through our use cases and architecture and highlight ways to take advantage of ElastiCache and Redis.
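The mutual-friends step maps directly onto Redis set intersection (SINTER), and per-user recommendations onto sorted sets (ZADD/ZREVRANGE). A rough sketch of both ideas using plain Python sets and dicts as stand-ins; the data and names are illustrative, not Coffee Meets Bagel's actual code:

```python
# Mutual friends between two candidate matches: the intersection of their
# friend sets (what SINTER computes server-side in Redis).
friends = {
    "alice": {"carol", "dave", "erin"},
    "bob":   {"dave", "erin", "frank"},
}

def mutual_friends(a, b):
    return friends[a] & friends[b]

# Per-user recommendations kept as a sorted set: members ordered by score,
# which is what ZADD stores and ZREVRANGE reads back highest-first.
recommendations = {"alice": {"bob": 0.92, "gary": 0.75, "hank": 0.61}}

def top_matches(user, n):
    ranked = sorted(recommendations[user].items(),
                    key=lambda kv: kv[1], reverse=True)
    return [member for member, _score in ranked[:n]]
```

In Redis the intersection and the ranking both happen inside the store, so the application never ships whole friend lists over the network.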
The AWS architecture for Careem, a fast-growing car-booking service in the broader Middle East, has quickly evolved to support over six million users in eleven countries. Careem also operates in areas with weak GPS signals and unique traffic patterns, which led to a poor user experience: long driver match times and rider wait times. Careem was storing driver location data in MySQL, but their high volume of concurrent calls and the lack of geospatial support in MySQL 5.6 resulted in continuous deadlocks and performance issues. Amazon ElastiCache for Redis met their need for an in-memory storage service with advanced data structures. ElastiCache for Redis accelerated their car-booking application and reduced ride matching times from several minutes to milliseconds. Learn how their big bottleneck of insert and update operations in MySQL became a quick lookup in ElastiCache for Redis by using Redis sorted sets, geohashes, and timestamps.
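The pattern described, turning a contended MySQL write path into a keyed in-memory lookup, can be sketched as bucketing drivers by geohash cell and keeping a last-seen timestamp per driver, roughly what Redis sorted sets keyed by geohash give you. Illustrative Python only, with hypothetical names, not Careem's actual schema:

```python
from collections import defaultdict

# Geohash cell -> {driver_id: last_seen_timestamp}. In Redis each cell would
# be a sorted set (ZADD driver with the timestamp as its score).
cells = defaultdict(dict)

def report_location(driver_id, geohash, timestamp):
    # A location update is a single O(1) write: no row locks, no deadlocks.
    cells[geohash][driver_id] = timestamp

def nearby_drivers(geohash, now, max_age):
    # A rider's lookup becomes one keyed read, filtering out stale reports
    # by timestamp, instead of a locking scan over a MySQL table.
    return sorted(d for d, t in cells[geohash].items() if now - t <= max_age)

report_location("d1", "thrn8", 100)
report_location("d2", "thrn8", 190)
report_location("d3", "thrnb", 195)   # different cell, not a neighbor here
```

A real deployment would also query the adjacent geohash cells to avoid edge effects at cell boundaries; the sketch keeps a single cell for clarity.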
In this session, we provide a peek behind the scenes to learn about Amazon ElastiCache's design and architecture. See common design patterns with our Redis and Memcached offerings and how customers have used them for in-memory operations to reduce latency and improve application throughput. During this session, we review ElastiCache best practices, design patterns, and anti-patterns.
From CloudFront to ElastiCache to DynamoDB Accelerator (DAX), this is your one-stop shop for learning how to apply caching methods to your AdTech workload: What data to cache and why? What are common side effects and pitfalls when caching? What is negative caching and how can it help you maximize your cache hit rate? How to use DynamoDB Accelerator in practice? How can you ensure that data always stays current in your cache? These and many more topics will be discussed in depth during this talk and we'll share lessons learned from Team Internet, the leading provider in domain monetization.
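Negative caching, one of the topics above, means remembering that a lookup found *nothing*, so repeated requests for missing data don't keep hitting the backend and dragging down the hit rate. A small sketch in Python with a sentinel marking "known absent"; the names and data are illustrative:

```python
# Negative caching: store a sentinel for keys known to be absent so repeated
# lookups for missing data don't each cost a backend round trip.
MISSING = object()          # sentinel: "we looked, and it wasn't there"
cache = {}
backend = {"domain:a.com": "parked"}
calls = {"backend_reads": 0}

def lookup(key):
    if key in cache:
        hit = cache[key]
        return None if hit is MISSING else hit
    calls["backend_reads"] += 1
    value = backend.get(key)
    # Cache the miss too. In Redis you'd give negative entries a short TTL
    # so data that appears later becomes visible quickly.
    cache[key] = MISSING if value is None else value
    return value

lookup("domain:b.com")   # miss: one backend read, absence is now cached
lookup("domain:b.com")   # served from the negative cache, no backend read
```

The usual tuning knob is the negative entry's TTL: long enough to absorb repeated misses, short enough that newly created data shows up promptly.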
The BBC's website and apps are used around the world by an audience of millions who read, watch, and interact with a range of content. The BBC handles this scale with an innovative website platform, built on Amazon ElastiCache and Amazon EC2 and based on nanoservices. The BBC has over a thousand nanoservices, powering many of its biggest webpages. Explore its nanoservices platform and use of ElastiCache. Learn how Redis's ultra-fast queues and pub/sub allow thousands of nanoservices to interact efficiently with low latency. Discover intelligent caching strategies to optimize rendering costs and ensure lightning fast performance. Together, ElastiCache and nanoservices can make real-time systems that can handle thousands of requests per second.
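The two Redis primitives mentioned, queues and pub/sub, behave differently: a list used as a queue delivers each job to exactly one worker (LPUSH/BRPOP), while pub/sub fans every message out to all subscribers (PUBLISH/SUBSCRIBE). Plain-Python stand-ins to show the contrast; these are illustrative, not the BBC's platform code:

```python
from collections import deque

# A list used as a work queue between services: LPUSH on one side, BRPOP on
# the other, so each job is consumed by exactly one worker.
queue = deque()

def lpush(job):
    queue.appendleft(job)

def rpop():
    return queue.pop() if queue else None

# Pub/sub: every subscriber of a channel receives every published message
# (fan-out), unlike the queue where each job goes to a single consumer.
subscribers = {}

def subscribe(channel, inbox):
    subscribers.setdefault(channel, []).append(inbox)

def publish(channel, message):
    for inbox in subscribers.get(channel, []):
        inbox.append(message)

lpush("render:home")
lpush("render:news")
a, b = [], []
subscribe("page-updated", a)
subscribe("page-updated", b)
publish("page-updated", "home")   # both inboxes receive it
```

In Redis, BRPOP additionally blocks until a job arrives, which is what keeps thousands of idle workers cheap.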
In this episode, the crew talks about enterprise applications, scalability, and productivity. Transcription provided by https://twitter.com/wtoalabi Episode 53: Bigger & Better Music.... Intro: Alright, welcome back to another episode of the Laravel Podcast. I am one of your hosts, Matt Stauffer, and I have got two guys joining me...Can you introduce yourselves? JEFFREY WAY: I am Jeffrey Way! TAYLOR OTWELL: And I am Taylor Otwell. MATT STAUFFER: It's been a little while, but we are back with a little bit more to share, and if you haven't gotten a chance to check out the Laravel New...News Podcast...all *laugh*...Check out the Laravel News Podcast where... Interjections MATT STAUFFER: Check out the Laravel New..News Podcast...oh my gosh! Every time now! News Podcast, where Jacob Bennett and Michael Dyrynda, basically being Australian and 'Illinoisan', tell you all the greatest and latest news that is going on with Laravel. So, because they are covering that so well, we are going off the beaten track a little bit, talking about a few kind of broader topics. So, what we did was, we put out some requests on the Twitter account and said "Hey folks, what do you want us to talk about?" And we picked a couple of interesting ones, and we just want to...just like the reader grab bag or...whatever you call it on your podcast, Jeffrey. So, the first one at the top of the queue is...something we hear about all the time, not just in this particular request, which is "Can Laravel be used for big apps?" And sometimes this comes up in the same conversation of, well, you know, if you want to do enterprise you should use this framework, or if you just want to do a cute little thing, then use Laravel.
You know, there are all these statements and perceptions that people have and make about this, so before we go anywhere else, I would ask, like, what is, and do we know, the definition of an enterprise app? Like, and again we are trying to give as much grace as possible to the person who actually thinks there is a distinction...what makes an enterprise app? Is it about lines of code? Is it about patents? Is it about security? Is it about traffic? Like, what makes something a big app? And or an enterprise app? Do you guys have a sense for that? JEFFREY WAY: I really don't. So I basically have the same question. From afar, I will just say an enterprise app is something I imagine that is really, really big...I don't know, it is an interesting distinction that people always make. I mean, for as long as I can remember, even back in the CodeIgniter days, you had this idea that CodeIgniter is for these sorts of hobby projects, but then if you are on the enterprise level, you are gonna reach for Zend or you are gonna reach for Symfony. And I feel like even after all those years, I can't quite figure out what specific features or functionality they have that make them suitable for enterprise, or what would CodeIgniter not have, or what does Laravel not have...hmm...is it related to the fact that Zend has a big company behind it? Whereas with Laravel, you know, like, everyone is just gonna keep creating threads about...what happens when Taylor dies? Is that the kind of idea? Like, this is open source...it's kind of rickety...you are not sure what the state of it is, you are not sure if it's going to be abandoned? And with Zend, maybe if you have a big company behind it...maybe you can depend upon it more? Maybe? I don't know, I have the same question as everyone else. TAYLOR OTWELL: Yeah, I think most people mean lots of classes, I guess.
You know, lots of code, lots of lines of code, and I think the answer is, you know, obviously I am going to say yes, it can be used for big apps; one, because it has been used for big apps in the past, so we already know it's true, basically. But then also, I think that, you know, Laravel is good for any app that PHP is good for. So, Laravel gives you a good routing system and a way to route requests to classes, and sort of beyond that it is really up to you. You know, once you are past the controller, you basically have total freedom to do whatever you want to do, so it's up to you in terms of whether your app is going to be scalable in terms of complexity. And also I think Laravel is kind of uniquely qualified and better at making big apps than other PHP offerings right now, for a few reasons. One, because when people start talking about big apps, a lot of the time there is dependency complexity, and Laravel's Dependency Injection Container is really good and it's really thoroughly baked in throughout the entire framework. When you talk about complicated apps, a lot of the time you are also talking about needs like background job processing, and Laravel has basically the only baked-in queue system out of any major framework in PHP...hmmm...and then of course there is event broadcasting and other features that I would say are more kind of on the big app side of things. So, not only can it be used for big apps, I think it's uniquely better for big apps than other alternatives out there in PHP right now, for those reasons. And I think it's just a little misleading because it is easy to get started with, and has a very simple starting point. And since it has a single route file you can kind of jump into it and start hacking around on, but it also scales up, you know, with your needs and with your team's needs in terms of complexity...so yeah, that's kind of my take on it.
Everyone kind of thinks that their app is a special snowflake, you know, that has these very unique requirements that have never been required in the history of web apps, but the vast majority of applications don't have unique requirements and they don't really have unique needs, and you know, Laravel and many other frameworks really are going to be a good fit for them. But I think Laravel is the best option in PHP right now for a big, sophisticated application. JEFFREY WAY: And it is funny because, for whatever reason, everyone thinks their project is going to be the one that really puts Laravel to the test in terms of how many page views it can render in a single second...all that stuff. Like...if you need to worry about that, you are at such a high level, and you will know if you need to worry about that or not, but 90, I would say 99% of projects will never even get close to that point. So, to be frank, it's almost like a sense of vanity that you think the project you are working on right now is something that really needs to worry about that, because you probably are not even close. TAYLOR OTWELL: Yeah, and we are assuming developers approach projects in a rational way, even though they don't. Like, people don't choose frameworks in a rational way; they don't choose anything, really *laughs*, related to tech in a rational way a lot of the time, as surprising as that is. There are a lot of things that go into it, and some of them are sort of personality things; maybe they don't like the way that a certain framework is marketed or not marketed. You know, some people are very turned off by active marketing around open source, so maybe they don't like the style of Laravel's sort of friendly, "hey look, this is easy" kind of marketing, and they are turned off by that, and so they choose something that is more toned down, more sort of suit and tie, like Zend, because that fits their personality better.
It's not really a technical decision; it's more of just a personality or subjective decision. And that happens a lot with tech in general. You know, some people don't use anything that is popular in general, just the kind of classic hipster type thing. I think a lot goes into it, and rarely is it purely technical. Sometimes it is...they don't like me! You know, they don't like me personally. And so they don't like Laravel or use Laravel. JEFFREY WAY: I like you, Taylor. *Everyone laughs* JEFFREY WAY: Right before we started recording, I guess RailsConf is going on, and I was watching DHH give his presentation live...and he was kind of talking about this to some extent...the idea that it is important even for a tool like Rails or Laravel to have, like, their own culture and their own sense of values. And he was talking about how, like, a lot of people take this idea that you just learn all the different languages and then...you do...you are a programmer. So, if you need to work in this local language, you do it and you just apply everything over. And he was talking about how, while that is true, what is wrong with being part of a community that has a very specific culture, very specific views...he talked about how, like, the people that are still using Rails are doing it maybe not just because it's better, but because they agree with the values that Rails represents. That is, like, the huge reason why people still use it to this day. And I think that is very much true for Laravel as well. It is a kind of interesting way to think about things. It's all personality; it's about what your values are. What you connect with and what you don't connect with. TAYLOR OTWELL: Yeah...when I first started Laravel, that was a big part of how I wanted, how I thought Laravel could be successful, because I knew that in my own life, like, there is sort of this ongoing desire to connect with a group of people. Some sort of community or whatever, around shared values.
And you know, that can be found around many different things, like music, or sport, or religion, or whatever. And I knew with programming, like, I wanted to connect with this group of people that has similar values about writing really clean code and having a good time doing it and making it enjoyable and sort of interesting, new, and fresh. And that's kind of how I presented Laravel, and I think it resonated with some people that were also looking for a group with those kinds of values. And those are still the kind of values that we obviously try to share today, but yeah, it wasn't necessarily a purely technical thing; it was building this group of people that sort of resonates around similar ideas and working on it together. MATT STAUFFER: It's interesting 'cos I think that even in my question, I conflated big and enterprise, and I think that you guys kind of really drew out the difference between the two in some of your answers. I mean, if we think about it, like, Jeffrey's first answer was that enterprise might be really interested in having a company back it versus a person...like Taylor said, we get the question of what if Taylor gets hit by a bus all the time.
And it makes sense, right? Like, we have clients all the time coming to us, like, say, you know, well, the CEO or the board or the CFO of our multi-million dollar or multi-billion dollar company is very worried that we are gonna invest a whole bunch of money and time in something and X...and it's not always...that Taylor might get run over by a bus, but a lot of developers are getting non-developer input on decisions they make here, and there are certain times where some IT person has set up some rules that say, like, "You can only use projects like this and not projects like that," and I do wonder whether there are some constraints there, like one of them being that it must be owned by a company. I know that when we worked with CraftCMS, a lot of people said, well, why would you...there's actually a business value to using CraftCMS over something like WordPress, because Craft is making money, and therefore it's a sustainable business model, and therefore the business people are actually less worried about this thing disappearing. Right? So, like, maybe a more direct chain of profit to the people who are running the thing might actually make it clearer. I don't know if that exists; maybe ZendCon would be something like that, but I know it's Laracon too...I don't really know! But it's interesting that the requirements of...like the true enterprise requirements...like, because I work for a company, my company has these requirements...but I think people, including me when I ask these questions, conflate that with big.
And so I think a good place to take this next is, let's step away from enterprise a little bit...enterprise culture is a thing...you know, whatever...let's talk about big. So, the thing mentioned, Taylor and Jeffrey, both of you said a lot of people come along and say, oh well, mine is going to be the one that finally pushes those bounds, right? I am gonna run into traffic issues and stuff like that. So, first of all, like, I know that we can't say a lot of the names of big sites that are running on it, but I feel like, is there anything we can do to kind of, like, just...I mean, I know several of them 'cos I am under NDA with several of them, you know, who have talked to us about doing some work with us, but there are, like, sites with millions and hundreds of millions of page views running on Laravel...there are, like, Alexa top 500 sites running on Laravel, there's...hummmm...what's the big group of all the businesses in the US? I can't remember the name of it...Fortune 500 companies running on Laravel...like, multiple Fortune 500 companies whose websites are running on Laravel. Is there anything that you guys can share, like, to say, hey look, this is the proof, like, we've got big stuff running through here?
TAYLOR OTWELL: Trying to think of some of them...I mean, like, Vice Video, Log Swan, you know, various video game sites; like, Fallout 4 had their landing page on Laravel...other stuff like that, but you know, it sort of never seems to be enough, and it sort of becomes this treadmill of, you know, I have to give one more proof that it can work...and I just wonder, like, what's really underlying the question. Like, do they want to know that if I build my big app on Laravel, will it be infinitely maintainable and clean...and no, Laravel won't automatically make your app amazing to maintain for 10 years, you know. I don't know if it's, like, trying to sort of shift responsibility for you also having to put in a lot of effort to, like, make your app enjoyable to maintain, or what...but... MATT STAUFFER: A bad programmer can write a bad app with any framework, right? Like, nothing is going to rescue you from that...not saying that the person asking is necessarily bad...but I think that's a great point you made earlier, Taylor, and I wish we could go further into it, is that with Laravel, like, yeah, ok, Laravel has its own conveniences, but at some point every single app is basically just you writing PHP... TAYLOR OTWELL: Yeah MATT STAUFFER: And especially at this level, when you are talking about hundreds of thousands of lines of code, like, the vast majority of the dependencies there are going to be just PHP code, right? TAYLOR OTWELL: Yeah. Once you get...let's just take, like, a Laravel app...'laravel new'...whatever...once you are at the controller method in your controller class, everything else is up to you. So whether you use the validator, or whether you even use Eloquent at all, or whether you use anything in Laravel, is entirely up to you, so it was your choice to do whatever you did past that point. So, it's not Laravel making you do any one particular thing.
So, that's sort of the point where you are gonna have to, you know, turn your thinking cap on and really plan how to do a big project, because as far as the framework is concerned, the framework is gonna be a much smaller concern than your actual code. You know, the framework is gonna be routing, sessions, some caching, some database calls, but you are the one that is gonna have to, like, figure out the domain problems of your app, which is gonna be way more complicated, I think, than any framework problems you are gonna have. Like, how is this app gonna work? How is it gonna provide value for our customers, or whatever? Those are all, like, much bigger questions, I think...than worrying about can Laravel be used for "big" apps. MATT STAUFFER: One of the questions we got on Twitter was how to build big sites with Laravel: scaling, deployment, database structure, load balancing. So, let's say someone is on board, right...yes, Laravel can be used for big apps, period...it's good...so, what are some considerations that you would have? So if you were taking, you know, a default app out of the box, and you "laravel new" it and you build some basic stuff, and someone says, alright, this app that you just built needs to be able to handle, you know, a million hits a week next month...what are the first things you would look to, to start kind of hardening it against that kind of traffic? TAYLOR OTWELL: Hmm, really simple things you could do: make sure you are using a good cache or session driver, so probably you wanna use something like Memcached or Redis, or something that you can centralize on one server, or ElastiCache if you are on AWS, whatever. You know, you are also probably gonna use a load balancer...PHP is really easy to deploy this way, you know, to put a load balancer up and to make a few PHP servers and to alternate traffic between them.
PHP makes it really simple to do that kind of scaling, and then with Laravel, make sure you use the config cache, make sure you are using the route cache, make sure you are doing composer dump-autoload optimized, you know, really simple things you can do to sort of boost your application a little bit. MATT STAUFFER: Jeffrey, I know Laracasts is pretty huge; you are kinda in there day in, day out, so I know you are super focused on making sure that it's performing, especially related to, maybe, let's say, databases and deployment. Can you give me any kind of tips that you have there for people who are building new kind of high-traffic apps, that you have learned from developing Laracasts? JEFFREY WAY: Yeah, Laracasts is surprisingly high traffic, if you look at the numbers. And I can tell you, I'm not doing that much...just to be perfectly frank, beyond what Taylor said, a lot of that stuff is kind of the fundamentals...of using the config cache...a lot of people will just deploy and stick with the file-based cache driver...*laughs*...you will obviously have some issues with that...but I am not doing anything that fancy. A lot of it comes down to basic stuff, like people completely ignoring the size of their images...like, that is always the very first one I bring up, and it's such a 101 tip, but if you go from site to site, you can see it being abused immensely. There are so many ways to work it into your build process...or if not, just dragging a bunch of images into...like, a Mac app...I am trying to think of the one I use... TAYLOR OTWELL: Is it ImageOptim? JEFFREY WAY: ImageOptim, yeah. Just, like, when you deploy, you can drag a bunch of images up there and it will automatically optimize them as best as it can.
And you would be shocked how much benefit you can get from that, versus people who just take a 100kb image and throw it into their project. You know, it's funny that people will debate single quotes versus double quotes all day and then throw a 200kb image into their banner. It makes no sense; people are silly that way. TAYLOR OTWELL: I think another great thing to do is to separate your database from your web server. If you are building anything that you care about in a real way, it can be good to do that. And if you don't do it from the start, it can be kind of scary to make the transition, because now you've got to move your live database to another server. But there are tools out there to make it pretty easy, there are even free packages to make it pretty easy to back up your database, so that has always been really nice for me, to have that on a separate server. So definitely, if you can, do that from the start, because it just makes it easier to do that scaling. If you want to add a second server, you don't have this funky situation where you have one web server talking to another web server because it has your database, and all that other stuff where, if you want to upgrade PHP, you've got to upgrade PHP on the same server that your live database is running on. Just scary situations like that, which this would help you avoid. MATT STAUFFER: Are you guys using a lot of caching on your common Eloquent queries? JEFFREY WAY: Yeah, I do quite a bit. TAYLOR OTWELL: I really don't on Forge. MATT STAUFFER: I wondered about Forge, because with Forge, each query is gonna be unique per user, right? Versus with Jeffrey, where there might be a page that lists out all of the episodes, and you might have 10,000 people hit that same page. With Forge, it's more like 10,000 people each seeing a totally different list, right? TAYLOR OTWELL: Yeah, it is very dynamic.
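Moving the database to its own machine, as Taylor suggests, then just means pointing the app's connection at it; a sketch of the .env change, with a made-up private IP and credentials:

```ini
# .env — web servers connect to a dedicated database server
# over the private network (values below are illustrative)
DB_HOST=10.0.0.20
DB_DATABASE=myapp
DB_USERNAME=myapp
DB_PASSWORD=secret
```

Every additional web server gets the same values, so scaling out never touches the database box.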
The one thing I do cache is the list of invoices from Stripe, because there is a Stripe API call we have to make, so we do cache that. JEFFREY WAY: Yeah, me too. TAYLOR OTWELL: But other than that, I don't think I really do any caching. So Jeffrey probably has more insight on that...? JEFFREY WAY: Well, I cache a lot of the stuff on the forum, because the forum just gets hammered. You would be surprised how popular that forum is. MATT STAUFFER: I won't be surprised, because it shows up in the top results of everything. JEFFREY WAY: I know, and I do love finding my own forum when I'm googling for my own ignorance. And I go to my own website to figure out how to do something, which is a great feeling! But I do have some queries related to the forum that are pretty intense, a lot of multiple joins, pulling in stuff, so I do cache that. Even the summary, I cache that ten minutes at a time, just to reduce the weight a little bit. I get a lot of use out of that stuff. And then, yeah, of course, the type of stuff that doesn't really change, like categories or channels, or, like Taylor was bringing up, there is no reason not to cache those things. And yes, especially the invoices, it's a great example: if you are making a network query every single time a page is hit, there is really no need to do that if it's going to be the exact same result every single time, give or take a change or two. So those are obvious cases where you want to cache it as long as you can. TAYLOR OTWELL: How do you bust your cache on Laracasts? JEFFREY WAY: Whenever something cache-bustable takes place...I guess... TAYLOR OTWELL: Ok, so I guess, like, whenever a new category is out and stuff, you just...
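The pattern both of them describe, caching an expensive lookup (a Stripe API call, a heavy forum query) for a fixed window, is what Laravel's Cache::remember is for. A sketch with invented key names, not the actual Forge or Laracasts code; note that in Laravel 5.x the TTL argument is in minutes:

```php
use Illuminate\Support\Facades\Cache;

// Cache the Stripe invoice list per user for 10 minutes; the closure
// (and therefore the Stripe API call) only runs on a cache miss.
$invoices = Cache::remember('invoices.'.$user->id, 10, function () use ($user) {
    return $user->invoices(); // hypothetical billing-layer call
});

// For data that rarely changes, cache forever and bust manually on update.
$categories = Cache::rememberForever('forum.categories', function () {
    return Category::orderBy('name')->get();
});
Cache::forget('forum.categories'); // e.g. after adding a category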
JEFFREY WAY: When a new category is out, yeah, as part of that I will just manually bust the cache...or no, I will automatically bust the cache. In other areas, it happens so rarely that I just boot up 'php artisan tinker' and do it myself...*laughs*...which is crappy. But anything more common than that, I will just automate as part of whatever updates the database. MATT STAUFFER: We are working on an app right now that has Varnish sitting in front of it. And so literally the code that is behind our Skype window right now is me writing a job that wipes the Varnish cache, either for the whole thing or for specific routes, in response to us being notified that a change happened. And that's an interesting thing, because the cache is outside of the Laravel app, but it's cached based on its routes. And so I have the ability to say, well, these particular changes are gonna modify these routes, and I built an intelligent job that gets sent out anytime we need those things. So even when it is not within the app, even when it is not your Laravel cache, there is still a lot of ability to put some heavy caches on. And speaking of that kind of cache busting: use the versioning in Mix all the time. Because then you can throw Varnish or whatever else in front and just do infinite caching on your assets. And if you all don't know what that is, it's essentially that every single asset that gets built by Mix has, like, a random string appended to the end of the file name. And every time it changes, it gets a new random string. And so you can set a forever expires header on your JavaScript files, your CSS files, or whatever else, because anytime one needs to change, it will actually be a different file name, so your browser will get to request it, and then Varnish will re-request it, or whatever your cache is.
JEFFREY WAY: But on that note, actually, I have been thinking about that. Can you guys tell me any real reason why, when we are using versioning, the file name itself needs to change? Because you are using that mix helper function already to dynamically figure out what the versioned file is. So is there any reason why we can't just use a unique query string there? Or rather than changing the file name to include the version, we just include the version as part of the query string, and then the file name always stays the same? MATT STAUFFER: I know that HTML5 Boilerplate used to do just query strings, and I hadn't even thought about that, but that might be possible, where the files always stay the same, but your...what's that JSON file that has the... JEFFREY WAY: The JSON manifest... MATT STAUFFER: Maybe that just swaps in the new id to pass along? And it's just like an authoring comment or something like that? JEFFREY WAY: Yeah, when you version the file, it basically gets a hash of the file that you just bundled up, and then that gets included in the new file name. But every time you bundle, if that changes, you will never know what that file name is called in your HTML. So basically you can use this mix helper function that Laravel provides, which will dynamically read that JSON file, and it will figure out: oh, you want this file? Well, here is the current hashed version, and we return that. But yeah, I have just been thinking lately, is it kind of dumb that we keep creating a new file, when instead the Mix manifest file could just have the relevant query string updated?
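For reference, the manifest the mix() helper reads is just a JSON map from the stable asset path to its current versioned path, so either scheme fits the same lookup. A hypothetical mix-manifest.json under the filename-hash scheme being discussed (hashes invented):

```json
{
    "/js/app.js": "/js/app.3f2a9c.js",
    "/css/app.css": "/css/app.8b1d44.css"
}
```

Jeffrey's query-string idea would instead map "/js/app.js" to something like "/js/app.js?id=3f2a9c"; the Blade call mix('js/app.js') would be unchanged either way.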
MATT STAUFFER: So, I googled really quick, and there is a thing from Steve Souders, who is the guy who originated the 13 rules of "make your website faster," or whatever they were, the whole "fewer HTTP requests" thing, and it's called, something like, "in your file names, don't use query strings." I haven't read it yet...oh, High Performance Web Sites...and it is 9 years old. My God! Now that I am seeing him talking about Squid, I have worked with Squid before, which is a pretty old cache, but a lot of stuff that works for Squid also works for Cloudflare, so I am guessing Cloudflare is either using Squid or adopted Squid's terminology. I also did a whole bunch of work with one of our clients who is writing custom Varnish rules right now. And I do remember that stripping query strings is a thing that happens sometimes, especially when it doesn't matter, for example in the case of assets; I think it may be a thing proxies do by default. So, he is digging through Squid and proxies and stuff like that, and I think basically what he is saying is, your proxy administrator could go and teach the proxy to care about query strings, but a lot of them ignore them by default... JEFFREY WAY: Ok. MATT STAUFFER: So by choosing to use query strings, you are opening up a lot of opportunities for it not to work the way you are expecting. TAYLOR OTWELL: I have been using Cloudflare quite a bit recently. The whole Laravel website is behind Cloudflare, heavy Cloudflare caching; very few requests actually hit the real server, mainly because it's all static, you know, documentation. But I am a big fan of that, especially when you are scaling out web servers, if you are using, you know, some kind of Cloudflare SSL.
I think Amazon has a similar SSL service now. It makes it so much easier to add a web server, because you don't really have to think about your certificates as much, you know, putting your certificate on every server, especially since you can just use a self-signed certificate if you are using the Cloudflare edge certificate. So that's something to look into; it's free to get started with, and it has some nice features for scaling. MATT STAUFFER: I helped some folks at this thing called the Resistance Manual, which is a wiki about, basically...sorry to be mildly political for a second...the negative impacts of the Trump presidency and how to resist those things. And so they wanted me to help them gather their information together, and I said, well, I can help out, I am a tech guy. And they were like, do you know MediaWiki, the open-source platform behind things like Wikipedia? And I said no, but I can learn it. Turns out it's really old-school, janky, procedural PHP, and so I said, yeah, I can handle this. But it is also just extremely dumb in terms of how it interacts with the database, and so when you are getting millions of hits like they were on day one, we had, like, an 8-core, hundreds-of-dollars-a-month DigitalOcean box, and it was still just tanking. Like, a couple of times a day the caches were getting overflowed and all that kind of stuff. So, I threw Cloudflare on it, hoping it would be magical, and the problem with that, and it's not Cloudflare's fault, is that Cloudflare or Squid or Varnish needs to have some kind of reasonable rules for knowing when things have changed. And for anyone who has never dealt with them before, there is a sort of complicated, but hopefully not too complicated, dance between your proxy and your application, reading things like expires headers and ETags and all that kind of stuff from your website.
And so if you throw something like Cloudflare on it and it is not working the way you expect, the first thing to look at is the expires headers and the cache-control headers that are coming off of your server pre-Cloudflare, and also what that same response looks like when it's coming back after going through Cloudflare. Cloudflare or whoever else will add a couple of other ones, like whether it hit or missed the cache, and what the expires headers are, so there are lots of headers that give you visibility. When it just seems like it is not working the way you want, and there are only, like, three configuration options in Cloudflare, then what do you do? Go look at your headers. I bet that 15 minutes of googling how Cloudflare headers and Squid headers work, and then inspecting all your headers before and after they hit Cloudflare, and you will be able to sort out the problem. Alright, so, we talked databases, we talked load balancing a little bit. Deployment: if anybody is not familiar with zero-downtime deployment, just a quick introduction to how it works. If you use deployments on something like Forge, the default response when you push something new to your GitHub branch is that it runs 'git pull', 'composer install', 'php artisan migrate', or whatever, so your site could erratically be down for seconds while the whole process runs. And so, if you are worried about that, you can run 'php artisan down' beforehand and 'php artisan up' afterwards, so that while it's down, instead of throwing an error, you just see, like, a "hey, this site is temporarily down" kind of thing.
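Put together, the naive Forge-style deploy Matt describes, wrapped in maintenance mode so visitors see a friendly page instead of errors mid-deploy, looks roughly like this (a sketch, not Forge's actual script):

```shell
php artisan down             # show the maintenance page instead of errors
git pull origin master       # fetch the new release
composer install --no-dev    # install dependencies
php artisan migrate --force  # run pending migrations non-interactively
php artisan up               # back in business
```

The whole site is unavailable between down and up, which is exactly the window the zero-downtime approach below eliminates.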
But if you are in circumstances where that is a problem, you might want to consider something like a Capistrano-style or Envoyer-style zero-downtime deploy. Look somewhere else for a much longer explanation, but essentially, every time a new release comes out, it's cloned into a new directory, the whole installation process is run there, and only once that is done is the directory that is getting served symlinked to that new directory instead of the old one. So you end up with, you know, the last 10 releases each in its own directory, and you can roll back to a previous directory, and Taylor's service Envoyer is basically a really nice user interface in front of that. For me, that has always been the easiest way to handle deploys in high-pressure, high-traffic, high-load situations: just use Envoyer or Capistrano. Are there any other experiences you all have had, or tips, about how to handle deploys in high-traffic settings when you are really worried about, you know, those 15 seconds or whatever? Are there any other considerations we should be thinking about? TAYLOR OTWELL: That's the extent of my experience. I haven't had anything that is more demanding than using Envoyer. I'm sure there is, you know...if I were deploying to thousands of servers...but for me, when I am just deploying to 4 or 5 servers, Envoyer has been a pretty good bet. MATT STAUFFER: And hopefully, if you are deploying to a thousand servers, then you've got a server person who is doing that.
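The Capistrano/Envoyer-style flow Matt outlines can be sketched in a few lines of shell: build each release in its own timestamped directory, then flip a single symlink once it's ready. Paths and steps here are illustrative, not Envoyer's internals; the clone/build commands are stubbed out with a marker file so the shape of the layout is visible:

```shell
#!/bin/sh
APP=/tmp/myapp-demo
RELEASE="$APP/releases/$(date +%Y%m%d%H%M%S)"

mkdir -p "$RELEASE"
# A real deploy would clone and build the app here, e.g.
#   git clone ... "$RELEASE" && (cd "$RELEASE" && composer install --no-dev)
echo "built" > "$RELEASE/READY"

# Repoint the served directory in one switch; the web server's docroot
# stays at $APP/current/public, and old releases remain on disk,
# so rolling back is just pointing the link at a previous directory.
ln -sfn "$RELEASE" "$APP/current"
```

Traffic keeps hitting the old release until the final ln, which is why the site never appears down during the build.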
You know, like, we are talking DevOps for developers here, right? Like, when you are running a minor server, not when you are running a multi-billion-dollar product. And the client I have been talking about, who we're doing all this Varnish stuff with, I didn't set up Varnish, you know; my client set up Varnish and took care of all that stuff, and he just kind of asked me for input on it. So I definitely would say there is a limit. You know, people often lament how many responsibilities are being put on developers these days. I don't think we all have to be IT people capable of running, you know, a one-thousand-server setup for some massive startup or something like that. But I think this whole "how do I handle a thing big enough that 15 seconds of downtime while a migration and composer run matters" is often within our purview, and I think something like Capistrano or Envoyer, for me at least, has been a good fix. The one situation I have not run into, which I have heard people ask about online, and I wanna see if you all have any experience there, is: what if you do a rollout and it has a migration in it, and then you need to roll back? Is there an easy way to do the 'migrate:rollback' in an Envoyer rollback command, or should you just run the Envoyer rollback, then SSH in and do 'php artisan migrate:rollback'? TAYLOR OTWELL: My view on that, recently, like over the past year, has been that you will just never roll back, ever. You will always go forward. Because I don't know how you roll back without losing customer data. So a lot of the time it's not really feasible to roll back. But let's pretend you could: then yes, there is no real easy way to do it in Envoyer; you will just kind of have to SSH in and do 'php artisan migrate:rollback' like you said.
But I think a lot of times, at least for my own projects like Forge and Envoyer, I can never really guarantee that I wasn't losing data, so I think, if at all possible, what I would try to do is write an entirely new migration that fixes whatever problem there is, and deploy that, and it will just migrate forward, you know. I will never really try to go backwards. MATT STAUFFER: If you find yourself in that accidental situation where you deployed something that never should have been, then you go 'php artisan down' real quick, write the fix, push it up, let it go through the deploy process, and then 'php artisan up' after that one deploys. TAYLOR OTWELL: Yeah, that's what I would do. I mean, sometimes, if it's low traffic and you feel pretty certain no one's messed with the new database schema, then you can probably just roll back. But I was just worried, in Forge's case, that people are in there all day, and I would lose data. So that's why I would, every time possible, try to go forward. MATT STAUFFER: Yeah, that makes sense. TAYLOR OTWELL: I have actually stopped writing down methods in my migrations entirely recently...now that it's optional. JEFFREY WAY: I feel evil doing that! Like, I very much get the argument, but when I create a migration and I just ignore the down method, I feel like I am doing something wrong. I am still doing it right now. TAYLOR OTWELL: It's really mainly viable in Laravel 5.5, 'cos you've got the new migrate:fresh command, which just totally drops all the tables without running any down methods. MATT STAUFFER: I end up doing that manually all the time anyway, because, at least in development, most often when I want to do a refresh, it's in a context where I still feel comfortable modifying old migrations. Like, basically, the moment I have run a migration in prod, I would never modify an old migration.
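Taylor's "always migrate forward" practice looks like shipping a brand-new migration that corrects the mistake, rather than reverting the bad one. A hypothetical Laravel 5.x-style sketch, with table and column names invented for illustration:

```php
use Illuminate\Database\Migrations\Migration;
use Illuminate\Database\Schema\Blueprint;
use Illuminate\Support\Facades\Schema;

class FixPlanColumnOnSubscriptions extends Migration
{
    // Forward-only fix for a column that shipped with the wrong type.
    public function up()
    {
        Schema::table('subscriptions', function (Blueprint $table) {
            $table->string('plan', 100)->change();
        });
    }

    // Left empty on purpose: per the discussion, production never
    // rolls back; the next fix is simply another forward migration.
    public function down()
    {
        //
    }
}
```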
The moment there is somebody else working on the project with me, I will never modify an old one unless I have to, and it's so important that I have to say, hey, you know, let's go refresh. But often, when I am just starting something out and I have got my first six migrations, I will go back and hack those things over and over again. I don't need to add a migration that has a single alter in it when I can just go back and edit the thing. And in that context, often I change the migration and then I try to roll back, and sometimes I have changed it in such a way that the rollback doesn't work anymore, I renamed the table or something like that... JEFFREY WAY: Right... MATT STAUFFER: So fresh is definitely going to be a breath of fresh air. JEFFREY WAY: I do wish there was maybe a way to consolidate things. Like, when you have a project that has been going on for a few years, you can end up in a situation where your migrations folder is huge; you just have so many. And every time you need to boot it up, you are running through all of those, and, like you said, sometimes the things you've done just don't quite work anymore and you can't roll back. It would be nice sometimes if you could just have, like, a reboot, just consolidate all of these down to something very, very simple. MATT STAUFFER: We did that with Karani. There is a tool that we used that helps you generate Laravel migrations from a schema, and we did it soon after we had migrated from CodeIgniter to Laravel for our database access layer. Karani is a CodeIgniter app where I eventually started bringing in Laravel components, and now the actual core of the app is in Laravel, and there is just, like, a third of the routes that are still on CodeIgniter that haven't been moved over. And once we got to the point where half of our migrations were CodeIgniter and half of them were Laravel, it was just such a mess, so we found this tool, whatever it was.
We exported the whole thing down to a single migration, archived all the old ones, I mean, we have them in git if we ever need them, and now there is just one, you know, one date from which you just get this massive thing, and then all of our migrations happen from that date. And for me, I actually feel more free to do that when it's in production, because the moment it's in production, I have less concern about being able to step back through this specific history. Like, if something is from two months ago, I am sure it has already been run in production, and so I feel less worried about making sure the history of it still sticks around... JEFFREY WAY: Alright...right... MATT STAUFFER: Alright, so the next question we have coming up is, "I would like to hear about how you all stay productive." And we've talked on and off at various times about what we use. I know we've got some Todoist love, and I know we've got some Wunderlist love. I have some thoughts about calendars versus todo lists, and I also saw something about Microsoft buying and potentially ruining Wunderlist. So what do y'all use, and what happened with Wunderlist? TAYLOR OTWELL: Well, todo lists are dead now that Wunderlist is dead. MATT STAUFFER: Yeah... So what happened? TAYLOR OTWELL: Wunderlist was my preferred todo list; I just thought it looked pretty good. And Microsoft bought them, I think. That was actually a little while back, but now they have finally announced what they are actually doing with it. They are basically shutting down Wunderlist and turning it into Microsoft To-Do, which doesn't look a lot like the old Wunderlist and doesn't have some of the features of the old Wunderlist. But it looked ok, you know, it seems fine. So what I have done is migrate to Todoist, rather reluctantly, but it's working out ok.
JEFFREY WAY: Please correct me...isn't it funny, like, Wunderlist is gonna be around for a very long time, but just the idea that they are shutting it down, it's almost like you feel compelled...we've talked about this with other things too...where it's like you suddenly feel, oh, I need to migrate. We talked about it with Sublime: if we found out tomorrow that Sublime was dead in the water, you could still use it as long as you want, and it would still work great, but you would have this feeling like, well, I gotta get over to Atom, I gotta start moving on, 'cos this place is dead, even though Wunderlist is gonna work for a long time. TAYLOR OTWELL: Yes...*laughs*...as soon as it was announced, I basically deleted Wunderlist off my computer... *All laugh* TAYLOR OTWELL: Which makes no sense, but it's so true... MATT STAUFFER: I needed a new router, and everyone told me, you use the Apple routers, 'cos they are the best. But I had heard they were end-of-life'd, and I was like, no way, no way I am gonna throw all my money there. And someone said, well, why does it matter? You know, you are gonna buy a router and you are gonna use it till it dies. And I said, I don't care, I am gonna buy something else, 'cos it's just...I don't know...it's like you are putting your energy and your effort into something that can only be around for so long, and you just want to be working with something that's gonna last, I guess. JEFFREY WAY: Yeah, I am still on Wunderlist right now. I am hearing...if you guys are familiar with "Things," that was, like, the big todo app years ago, and they have been working on Things 3, the third version, for years. It's been so long that people joke about it. You know, it's almost like that new version of...what was it...there was some Duke Nukem game that... TAYLOR OTWELL: Is it Duke Nukem Forever? JEFFREY WAY: Yes! It took, like, 10 or 15 years, and it finally came out!
It's looking like next month "Things 3" will be out, and I am hearing it's, like, the prettiest todo app ever made; I am hearing really good things. So, I was hoping to get in on the beta, but they skipped over me. So I will experience it in May, but I am excited about it. So, that's the next one. But you know what, I am never happy with todo apps, I don't know why. It's kind of a weird addiction. Take the most basic need, even with, like, Microsoft To-Do. Ok, your most basic need would be to say, "Go to the market on Thursday." You can't do that in Microsoft To-Do. You have to manually set the due date to Thursday, rather than just using human speech. TAYLOR OTWELL: Have you tried Todoist? JEFFREY WAY: Todoist works that way. I think Wunderlist works that way, but now Microsoft To-Do doesn't. MATT STAUFFER: Oh, ok, got it. You lost that ability, right? JEFFREY WAY: Yeah, it's so weird. Like, every task app will have something that's really great and then other basic things that are completely missing, and it's been that way for years. MATT STAUFFER: I always feel bad, I mean, I bought Things...thankfully I managed to skip...what was that thing...OmniFocus. I skipped OmniFocus, which is good, 'cos that is hundreds of dollars saved for me.
And I tried all these different things, and I finally figured out that there is a reason why I keep jumping from one to the other. It's because, for me, and this is not true for everybody, and I think it might have to do with personality a little bit, and the industry a little bit, and what your roles are, whatever, todo lists are fundamentally flawed, because they are not the way I approach the day, and they are not the place my brain is. So I can force my brain into a new paradigm for even a week at a time, but I have never been able to stick with it, and it's not the app. I thought it was the app; I thought once I got the right app, I would become a todo-list person. And I realized I am not a todo-list person, so I can try every app, and it can be perfect, and I will still just stop using it, 'cos it's not how I think. And when I discovered that, and later found some articles talking about how I am not the only person who has come to this, that validated me. So I put everything on my calendar, and if I need to do something, I put it on my calendar, and then it gets done. And if I don't put it on my calendar, it doesn't get done. End of story. It's so effective for me that my wife knows, at this point, that if she asks me to do something and I don't immediately pick up my phone and put it on my calendar, it's not gonna get done.
Because that's how things happen, and so it's amazing to me that...*laughs*...literally, when she first started discovering this, she, and she's not super technical, like, she's smart, she just doesn't like computers all that much, but she knows how to use Google, so when she first discovered this, she sent me a calendar invite that is "Matt Clean Toilet," and it's for 8 hours every Sunday. 'Cos she's like, that's the only way I am ever going to clean the toilet, right? So I will be on a screenshare with a client, and I will pull up my calendar to say, hey, when is a good time for us to have this meeting? And I will be like, oh, "Matt Clean Toilet" takes up those 8 hours...*laughs*. But for me, my todo list is my calendar. And everyone in the company kind of knows what my calendar is for, and Dan actually has asked me to start marking those things as not busy, so Calendly, our appointment app, will still allow people, like clients, to book times with me during that time. But essentially, if I need to get something done, like, I need to review a whole bunch of pull requests, Daniel, who works with me, literally just put a meeting invite on my calendar for tomorrow at 10:30, and it says "Code Review @ Daniel." And literally, after this podcast, there is an hour that says "Code Review with James," because they know that that's how they get it. And there are 500 emails in my inbox and all these other things I have to do, but if it goes on the calendar, it gets done. So, have you guys ever tried that? Does it sound like something that would click with you, or no? JEFFREY WAY: I think it makes good sense for you, because it sounds like your days are scheduled, like your day is full. My day isn't quite as much "do this with so-and-so"; I don't have as many meetings. So most of my day is, like: these are the things I wanna get done.
And it doesn't matter whether I do it at 9 AM or 9 PM, so a todo list works well for me. But yeah, I can see how, if my day was very segmented and scheduled, that would make far more sense than reaching for some todo app. TAYLOR OTWELL: Yeah, my days are usually pretty free-form, outside of the kind of standard schedule where I always do emails and pull requests first thing in the morning. But after that, lately it's been, you know, work on Horizon, and now it's work on the thing that comes after Horizon, and that's pretty much the rest of the day, besides whatever Laracon stuff I have to do recently, which is more of a seasonal thing, you know. But I got lunch all booked, that's done...but whatever we need, you know, furniture, catering, or whatever. But yeah, then I pretty much just work on one thing throughout the day, so I don't really switch contexts like that a lot. But I was so despondent at the Wunderlist announcement that all Friday afternoon I wrote a Chrome extension, so that when you open a new tab, it opens this disgusting todo list that I wrote in Vue.js and, you know, HTML, and it uses Chrome sync to sync it across my Chrome account to all my laptops, whatever. So every new tab has a todo list. But even that, I was still not happy with, and I deleted it, and the whole afternoon went to the todo list. Anyway, I had forgotten about the Chrome extension thing. I need to open source it. MATT STAUFFER: Every developer has to make their own todo list at some point in their lives. TAYLOR OTWELL: Yeah. That's interesting about the calendar, though. I want to get Calendly, because it looks like a really cool app, and try some more calendar stuff, 'cos I haven't really dug into that as much as I could.
MATT STAUFFER: Yeah, I use BusyCal for my desktop app, I know that, and I think I use Fantastical on the phone or something; a lot of people love that. The thing that we like about Calendly is that it gives me a public link that syncs up to my Google Calendar, and so when we need to schedule things, like, we are in the middle of hiring right now, or client meetings, I just send them my Calendly link, and I say, go here and schedule time with me. It syncs up with my Google Calendar and shows them all the times, and I can say, go schedule a 60-minute meeting, and I give them the 60-minute link, or 5 minutes or whatever, and you can put different rules around each. So I teach Calendly when I drop my son off at school, and when I drive from my home office to my work office, all that kind of stuff, so that it knows when I am available. Because we just wasted so much time, between Dan and me, trying to get our calendars in sync. So that's what I love about Calendly. TAYLOR OTWELL: What really sold you on BusyCal over, you know, just Apple Calendar or whatever? MATT STAUFFER: I wish I could tell you. I know that it handles multiple calendars better, but it's been so long since I made that choice that I couldn't even tell you. I know that Dan, my business partner, hates calendars more than any person I have ever met, and almost every time he complains about something, I am like, oh yeah, you can do that with BusyCal, and he is like, "I still use Apple Calendar." I know those things exist, but I can't tell you what they are, so. Alright, so one last question before we go for the day. Saeb asked, "It would be nice to hear why you guys are programmers. Is it just something you love and enjoy, or is it just a way to put bread on the table? Is it passion? What is it that makes you wanna be a programmer?" JEFFREY WAY: I will go first. I fell into it.
I think we are being disingenuous if we don't say that to some extent. But I know, even from when I was a kid, I loved the act of solving puzzles. I remember I had this Sherlock Holmes book, and it's one of those things where every single page is some little such-and-such happens, somebody was murdered, and then Sherlock comes, points to so-and-so, and says, you are the person who did it. And the last sentence is always, "How did he know?!" And that was, like, my favorite book. I would go through it every day and try to figure out how he figured out that this was the guy who, you know, robbed the bank or whatever it happened to be. So, between that, and I played guitar for over a decade, and I went to school for that. It's all still the same thing of trying to solve puzzles, trying to solve riddles, trying to figure out how to connect these things. You may not know it with guitar, but the same thing is true, like puzzles: you start learning about shapes on the guitar, and how to transpose this to this, and how to play this scale in eight different ways. It's still the same thing to me; it's figuring out how to solve these little puzzles. And so, for programming, I feel like it's the perfect mix of all of that. There needs to be some level of creativity involved for me to be interested in it. I always worried I would end up in a job where I just did the same and only thing every single day, and I would finish the day and come back tomorrow and do the exact same thing all over again. So there needs to be some level of creativity there, which programming offers amazingly well. Although my mum would never know. I think she thinks I gave up on music and went to this, like, boring computer job, and even though I explain to her, like, no, there are huge amounts of creativity in this, I don't think she quite makes the connection.
So, yeah, between the creativity, and solving puzzles, and making things, it's a perfect mix for my personality. TAYLOR OTWELL: I was always really into computers and games and stuff growing up, so it was pretty natural for me to major in IT in college, but I didn't really get exposed to the fun side of programming and open source stuff until after college, when I started poking around on side projects and stuff like that. So I did kind of fall into this side of programming, you know, where you are programming for fun as a hobby and working on open source, after I graduated. But looking back, I was always kind of interested in things that are similar to programming, like games like SimCity and stuff like that, where you are planning out your city, which is one of the similar things you do when you are building a big project, you know, a big enterprise project: planning and trying to get just the right structure or whatever. So I was kind of always into that kind of thing, and just sort of naturally fell into that path later in life. MATT STAUFFER: My brother and I started a bulletin board service out of our spare bedroom. I mean, we were in elementary or middle school or something like that. He is three and a half years older than me, and he is a little bit more, kind of, intellectual than I am, so he learned how to code the things, and he said, why don't you be the designer. And that trend just kept up. When he learned how to make websites, he'd be like, well, I am gonna make the websites and you be the designer. And so I kind of had this internalized idea that, A, I was interested in tech, but B, I was the design mind. And the thing is, I am not a very good designer... the only reason I kept getting into design is because I was creative, I was a musician and stuff, but also because my brother already had the programming skills down, and so he needed a designer, right?
And so, I went off to college; by that point I already had a job as a programmer, I already had my own clients, doing, you know, frontend web development and basic PHP, WordPress, that kind of stuff, but I was like, well, I need to become a better designer, so I went off to college for design, and I just realized I am not a designer, so I left. And I went and did English, and I worked with people, and I worked for a non-profit, having thought, you know, oh, that is not my thing. And then I kind of did an about-face when I left the non-profit: my wife went back to school and I needed to pay the bills, so, there is an element of paying the bills... I said, well, I know that web development pays well, so I will go back to that. And I just discovered that I love web development. It is fulfilling, and it is satisfying, and it is creative... it's using your brain in all these really interesting ways. Each project is a little bit the same, a little bit unique; there are always really great things about it. I mean, I remember one of the things that drove me nuts about my previous work, both in design and in working at the non-profit, is that there is no sense of whether you did a good job or not. There is no sense of when something is done. It's all very kind of vague and vacuous, and with this, there is a defined challenge, and you know when it's done, and you know whether you did a good job or not. And that was huge... that was so foundationally helpful for me.
And so I think just being able to approach it and realize that it's creative... it's creative, and it is well defined, and it's concrete, and it's a challenge, all those things together, I think, for me. And it turns out that it wasn't just a way to make money. I have also since discovered, now that I run a company, that I also have all the people aspects here: it's about relationships, it's about communities. I mean, we have talked about that a lot in this episode, and running a company is about hiring and company culture and all that kind of stuff. So, especially at the level of tech that I get to do day to day, whether it's open source or running the company, I feel like it's all of the best things together in one. JEFFREY WAY: So Matt, how did you go from taking on smaller projects, when you went back to web development, to suddenly running Tighten? Like, how did you get there? What happened? Were you getting more projects than you could handle? MATT STAUFFER: The opposite. I had no work. I worked out of a co-working space in Chicago, and I only had about 10, 15 hours a week filled, because I didn't know anybody, and I had not been doing anything in the industry for 6 years. So I said, you know what? When I worked for the non-profit there was this need I had, and I still worked for that non-profit part time at that point, so I just started building an app. I had built an app by hand in PHP while I worked for the non-profit, and it was terrible. And I was like, oh, I have heard about this framework thing, so I tried building it in CakePHP, and it was terrible, and those experiences matured me a little bit. So by the time I was going solo as a developer, every free moment I had outside of the contract work, I would go learn CodeIgniter. My buddy Matt had learned ExpressionEngine and said, hey, check out CodeIgniter, I think you might like it.
So I learned CodeIgniter, and I did all this work in CodeIgniter, and I built this whole app, which is Karani, the thing we were talking about today. I built Karani, and I made it for myself, and then my friends wanted it, so then I made it for my friends, and then it was costing me money to keep up, so I learned how to charge them money. Stripe was brand new at that point, so I almost went with Stripe, but I ended up going with Braintree... I got into, like, big, software-as-a-service app development through there. And right at that same time, I was teaching my buddy all about modern web development, HTML5 Boilerplate, all that kind of stuff, after work one day, and this guy walked over, the one guy in my co-working space that I had never met, who was always in his closed office, and he was like, are you a developer? Are you looking for work? I was like, yeah, and he was like, I need you. Would you consider working for me? I played it all cool, but I was like, YES, PLEASE, I NEED WORK!!! I only have 10 hours of work a week right now. And it was Dan. And so Dan and I worked together on this massive project for a year, and the client took 6 months to actually get the work ready for us. And he already had me booked, and he already had me billed, and he was like, why don't you just go learn, become the best possible developer you can. I will throw you, you know, 30 hours a week of jobs off my various projects, but in all your free time, and even in those projects, just learn to become the absolute best. Because we were working for this massive, billion-dollar international company at that point. And responsive was just a thought in people's minds then. So I wrote articles and I created responsive libraries back in the early days of responsive design, all that kind of stuff, and I was really up in the middle of it. And then we built this app.
So I had a lot of things that took me very quickly from, hey, I haven't written any professional code in 6 years, to the point where I was ready to build an app for this billion-dollar company. JEFFREY WAY: That is amazing. That is how I learn best too. MATT STAUFFER: It really is. And Dan and I loved working together so much that within 6 months we decided to go into business together, and 6 months or a year later we named it Tighten, and the rest is history. And so, we are super late, and Jeffrey, you are the one who has to edit this all later, so I apologize for that. Okay, future Jeffrey editing this, I am going to do you a favor and call it a day for now. So, guys, it's been a ton of fun. Everyone who submitted questions to us on Twitter: the ones we didn't get to today are still on our Trello board, and we will get to some of them next time. But keep sending us stuff to talk about, and like I said, the Laravel News Podcast is doing a fantastic job of keeping you up to date on a regular basis with news, so definitely tune in there for that. But we are gonna be talking about more long-form stuff here, so when you've got questions for us, send them either to our personal accounts or the Twitter account for the podcast, and we will try to get to them whenever we can. So, until next time, it's the Laravel Podcast. Thanks for listening. MUSIC fades out...
IFTTT is a free service that empowers people to do more with the services they love, from automating simple tasks to transforming how someone interacts with and controls their home. IFTTT uses ElastiCache for Redis to store transaction run history and schedule predictions, as well as indexes for log documents on S3. Join this session to learn how the scripting power of Lua and the data types of Redis allowed them to accomplish something they could not have done elsewhere.
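The Lua-plus-data-types pattern the session alludes to can be sketched with redis-py's `register_script`, which ships a Lua script to the server so a multi-step update runs atomically. Everything here is illustrative rather than IFTTT's actual code: the `runs:<id>` key scheme, the capped-list approach to run history, and the `record_run` helper are all assumptions.

```python
# A minimal sketch of atomic run-history bookkeeping via a server-side
# Lua script. The script pushes the newest run record onto a Redis list
# and trims the list to a fixed length in one atomic step, so concurrent
# writers cannot interleave between the push and the trim.
APPEND_AND_TRIM = """
redis.call('LPUSH', KEYS[1], ARGV[1])
redis.call('LTRIM', KEYS[1], 0, tonumber(ARGV[2]) - 1)
return redis.call('LLEN', KEYS[1])
"""

def record_run(client, applet_id, payload, max_history=100):
    """Store one run record and cap the history at max_history entries.

    `client` is expected to be a redis.Redis instance (or anything with a
    compatible register_script method); the key name is hypothetical.
    """
    script = client.register_script(APPEND_AND_TRIM)
    return script(keys=[f"runs:{applet_id}"], args=[payload, max_history])
```

Doing the same thing client-side would take a `WATCH`/`MULTI` transaction or risk a race; pushing the logic into Lua keeps it to a single round trip, which is one reason the pattern is attractive for high-volume run logging.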
In this session, we provide a peek behind the scenes to learn about Amazon ElastiCache's design and architecture. See common design patterns with our Redis and Memcached offerings and how customers have used them for in-memory operations to reduce latency and improve application throughput. During this session, we review ElastiCache best practices, design patterns, and anti-patterns.
Nike+ is at the core of the Nike digital product ecosystem, providing services to enhance your athletic experience through quantified activity tracking and gamification. As one of the first movers at Nike to migrate out of the data center to AWS, they share the evolution of building a reactive platform on AWS to handle large, complex data sets. They provide a deep technical view of how they process billions of metrics a day in their quantified-self platform, supporting millions of customers worldwide. You'll leave with ideas and tools to help your organization scale in the cloud. Come learn from experts who have built an elastic platform using Java, Scala, and Akka, leveraging the power of many AWS technologies like Amazon EC2, ElastiCache, Amazon SQS, Amazon SNS, DynamoDB, Amazon ES, Lambda, Amazon S3, and a few others that helped them (and can help you) get there quickly.