Podcasts about eventbridge

Coding talks with Vishnu VG

Play Episode Listen Later Nov 29, 2024 17:39

In this pre-re:Invent 2024 episode, Luciano and Eoin discuss some of their favorite recent AWS announcements, including improvements to AWS Step Functions, Lambda runtime updates, DynamoDB price reductions, ALB header injection, Cognito enhancements, VPC public access blocking, and more. They share their thoughts on the implications of these new capabilities and look forward to seeing what else is announced at the conference. Overall, it's an exciting time for AWS developers with many new features to explore. Very important: no focus on GenAI in this episode :) AWS Bites is brought to you, as always, by fourTheorem! Sometimes, AWS is overwhelming and you might need someone to provide clear guidance in the fog of cloud offerings. That someone is fourTheorem. Check them out at ⁠fourtheorem.com⁠ In this episode, we mentioned the following resources: The repo containing the code of the AWS Bites website: https://github.com/awsbites/aws-bites-site Orama Search: https://orama.com/ JSONata in AWS Step Functions: https://aws.amazon.com/blogs/compute/simplifying-developer-experience-with-variables-and-jsonata-in-aws-step-functions/ EC2 Auto Scaling improvements: https://aws.amazon.com/about-aws/whats-new/2024/11/amazon-ec2-auto-scaling-highly-responsive-scaling-policies/ Node.js 22 is available for Lambda: https://aws.amazon.com/blogs/compute/node-js-22-runtime-now-available-in-aws-lambda/ Python 3.13 runtime: https://aws.amazon.com/blogs/compute/python-3-13-runtime-now-available-in-aws-lambda/ Aurora Serverless V2 now scales to 0: https://aws.amazon.com/about-aws/whats-new/2024/11/amazon-aurora-serverless-v2-scaling-zero-capacity/ Episode 95 covering Mountpoint for S3: https://awsbites.com/95-mounting-s3-as-a-filesystem/ One Zone caching for Mountpoint for S3: https://aws.amazon.com/about-aws/whats-new/2024/11/mountpoint-amazon-s3-high-performance-shared-cache/ Appending to S3 objects: https://docs.aws.amazon.com/AmazonS3/latest/userguide/directory-buckets-objects-append.html 1 million S3 Buckets per account: https://aws.amazon.com/about-aws/whats-new/2024/11/amazon-s3-up-1-million-buckets-per-aws-account/ DynamoDB cost reduction: https://aws.amazon.com/blogs/database/new-amazon-dynamodb-lowers-pricing-for-on-demand-throughput-and-global-tables/ ALB Headers: https://aws.amazon.com/about-aws/whats-new/2024/11/aws-application-load-balancer-header-modification-enhanced-traffic-control-security/ Cognito Managed Login: https://aws.amazon.com/blogs/aws/improve-your-app-authentication-workflow-with-new-amazon-cognito-features/ Cognito Passwordless Authentication: https://aws.amazon.com/blogs/aws/improve-your-app-authentication-workflow-with-new-amazon-cognito-features/ VPC Block Public Access: https://aws.amazon.com/blogs/networking-and-content-delivery/vpc-block-public-access/ Episode 88 where we talk about VPC Lattice: https://awsbites.com/88-what-is-vpc-lattice/ Direct integration between Lattice and ECS: https://aws.amazon.com/blogs/aws/streamline-container-application-networking-with-native-amazon-ecs-support-in-amazon-vpc-lattice/ Resource Control Policies: https://aws.amazon.com/blogs/aws/introducing-resource-control-policies-rcps-a-new-authorization-policy/ Episode 23 about EventBridge: https://awsbites.com/23-what-s-the-big-deal-with-eventbridge/ EventBridge latency improvements: https://aws.amazon.com/about-aws/whats-new/2024/11/amazon-eventbridge-improvement-latency-event-buses/ AppSync web sockets: https://aws.amazon.com/blogs/mobile/announcing-aws-appsync-events-serverless-websocket-apis/ Do you have any AWS questions you would like us to address? Leave a comment here or connect with us on X/Twitter: - ⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠https://twitter.com/eoins⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠ - ⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠https://twitter.com/loige⁠⁠⁠⁠

python aws invent genai s3 node alb lambda eoin lattice ecs amazon s3 vpc dynamodb cognito aws step functions appsync eventbridge

Connecting Events with AWS EventBridge

Play Episode Listen Later Nov 18, 2024 30:53

Join us as we explore the power of Amazon EventBridge, a versatile tool that simplifies building scalable and responsive event driven applications.

events connecting eventbridge

CoinAct

dans napoleon aws aws lambda eventbridge

Play Episode Listen Later Aug 30, 2024 44:38

Le trading de cryptomonnaies, c'est un univers complexe et volatile où chaque seconde compte. Pour réussir, les traders ont besoin d'outils performants et réactifs. C'est là qu'intervient CoinAct, une startup française qui a développé une plateforme d'alertes en temps réel capable de traiter 5000 événements par minute ! ⚡ Dans cet épisode, nous recevons Vincent Bertin et Sébastien Napoleon, les co-fondateurs de CoinAct, qui vont nous dévoiler les coulisses de leur projet. De l'architecture serverless basée sur AWS Lambda et EventBridge à la gestion de millions d'alertes par jour, ils nous expliqueront tout ! Vous découvrirez également les perspectives d'avenir de la plateforme. Si vous êtes passionné(e) par les cryptomonnaies, l'AWS, et l'entrepreneuriat, cet épisode est fait pour vous ! Il y a eu du changement depuis l'enregistrement au mois d'Arvil, CoinAct fait maintenant partie de Laevitas.

120. Lambda Best Practices

Real World Serverless with theburningmonk

Play Episode Listen Later Apr 4, 2024 26:22

In this episode, we discuss best practices for working with AWS Lambda. We cover how Lambda functions work under the hood, including cold starts and warm starts. We then explore different invocation types - synchronous, asynchronous, and event-based. For each, we share tips on performance, cost optimization, and monitoring. Other topics include function structure, logging, instrumentation, and security. Throughout the episode, we aim to provide a solid mental model for serverless development and share our experiences to help you build efficient and robust Lambda applications.

blog best practices machine learning rust aws sns lambda aws lambda decomposing fargate sqs cloudwatch middy yan cui eventbridge aws partner

#100: LocalStack v3 is here and it's kinda amazing!

Play Episode Listen Later Apr 2, 2024 70:00

In this episode, I spoke with Waldemar Hummer, founder and CTO of LocalStack. We discussed what's new in the latest version of LocalStack and highlighted some of the most interesting additions.One particular highlight for me is the ability to identify IAM permission errors between direct service integrations. For example, when an EventBridge pipe cannot deliver a message to a SQS target. And the ability to use test runs to generate the necessary IAM permissions so they can be added to your Lambda functions.LocalStack v3 also allows running chaos experiments locally by adding random latency spikes, making an entire AWS region unavailable, or simulating DynamoDB throughput-exceeded errors.Lots of exciting new features in LocalStack v3! Waldemar gave us a live demo of some of these features. You can watch the episode on YouTube and watch the demos here.Links from the episode:My webinar with Waldemar after LocalStack v2LocalStack v3 announcementMy blog post on when to use Step Functions vs running everything in LambdaLocalStack is hiring!Opening theme song:Cheery Monday by Kevin MacLeodLink: https://incompetech.filmmusic.io/song/3495-cheery-mondayLicense: http://creativecommons.org/licenses/by/4.0

i am cto kevin macleod aws lambda waldemar dynamodb cheery monday sqs eventbridge

Revolutionizing Healthcare: AWS Serverless for Easier EHR Processes | Danielle Heberling Explains

Datacenter Technical Deep Dives

Play Episode Listen Later Mar 29, 2024 58:09

Dive into the future of healthcare with AWS Hero Danielle Heberling as she unveils the transformative power of AWS Serverless services in streamlining electronic health records. In this insightful presentation, Danielle sheds light on the previously cumbersome medical record updating process and demonstrates how AWS Serverless tools like Lambda and EventBridge are revolutionizing the way medical assistants manage records. Get ready for a deep dive into the intersection of technology and healthcare efficiency.

Quoi de neuf ?

Play Episode Listen Later Jan 26, 2024 15:10

J'ai compté 54 nouveautés depuis le 12 janvier. Cette semaine, nous parlons de ECS et EBS, de DNS avec Route53, de notification mobiles avec Firebase, de EventBridge et GraphQL, d'un service qui est maintenant disponible sur la région de Paris. On terminera avec un tour des activités de la communauté des clients et partenaires AWS, avec deux projets personnels en open-source, un marathon de workshop organisés par le groupe d'utilisateurs AWS de Yaounde au Cameroun et d'un nouveau livre sur les architecture serverless.

aws dns neuf cameroun graphql ecs firebase ebs yaounde eventbridge

Quoi de neuf ?

The Six Five with Patrick Moorhead and Daniel Newman

Play Episode Listen Later Jan 26, 2024 15:10

aws dns neuf cameroun graphql ecs firebase ebs yaounde eventbridge

The Six Five On the Road at AWS re:Invent 2023: Event-Driven App Architectures with Emily Shea and Nick Smit

Play Episode Listen Later Dec 1, 2023 23:04

On this episode of The Six Five – On The Road, hosts Daniel Newman and Patrick Moorhead welcome Emily Shea, Worldwide Lead, Integration Services and Nick Smit, Principal Product Manager, EventBridge at AWS for a conversation on Event-Driven App Architecture and the role of application integration in modernization. Their discussion covers: A look at Event-Driven App Architecture and why is it important How application architects manage messaging and APIs in the Cloud to create event-driven communications across applications Why customers need to first think about integration when they want to migrate and modernize applications New launches AWS announced at re:Invent, including Generative AI workflows, HTTPS Endpoints, and Adobe/Stripe Integrations with EventBridge

ai cloud integration architecture aws reinvent apis invent modernization smit principal product manager event driven daniel newman aws reinvent integration services eventbridge patrick moorhead

105. Integration Testing on AWS

Play Episode Listen Later Nov 24, 2023 28:28

In this episode, we discuss integration testing event-driven systems and explore AWS's new Integration Application Test Kit (IATK). We cover the challenges of testing events and common approaches like logging, end-to-end testing, and using temporary queues. We then introduce IATK, walk through how to use it for EventBridge testing, and share our experience trying out the X-Ray trace validation. We found IATK promising but still rough around the edges, though overall a useful addition to help test complex event flows.

spotify integration aws github x ray sarah hamilton integration testing eventbridge

103. Building GetAI Features with Bedrock

Real World Serverless with theburningmonk

Play Episode Listen Later Nov 10, 2023 20:54

In this episode, we discuss how we automated generating YouTube descriptions, chapters and tags for our podcast using Amazon's new GenAI tool: Bedrock. We provide an overview of Bedrock's features and how we built an integration to summarize podcast transcripts and extract relevant metadata using the Anthropic Claude model. We share the prompt engineering required to instruct the AI, and details on our serverless architecture using Step Functions, Lambda, and EventBridge. We also discussed Bedrock pricing models and how we built a real-time cost-monitoring dashboard. Overall, this automation saves us substantial manual effort while keeping costs low. We hope this episode inspires others to explore building their AI workflows with Bedrock.

amazon spotify ai aws github genai bedrock deploying lambda eventbridge

#78: Out with legacy lock-in with Eduard Bargues

Play Episode Listen Later May 16, 2023 57:38

In this episode, I spoke with Eduard Bargues, who is a Principal Engineer at Ohpen, a cloud-native open banking platform.We talked about Ohpen's migration from EC2 to a mix of Fargate and Lambda, and along the way, we touched on many topics:migration patternsthe "serverless-first" mindsethow to choose when to use Lambda vs Fargatewhy monolith functions are painful and should be avoidedrecommendations and pitfallswhy you should build a DX teamWe also talked about how being cloud-agnostic makes you use the cloud in a way that is inefficient and creates many layers of unnecessary abstraction layers in your architecture, which Eduard calls "legacy lock-in"!We talked about their event-driven architecture and how they are using an in-house Event Broker to add FIFO support (which EventBridge doesn't support) for some subscribers. It's similar to what Luc van Donkersgoed talked about in episode 68. We talked about their EventBridge topology and how they manage over 200 AWS accounts, and finally, what are the biggest shortcomings with EventBridge right now.Links from the episode:Job openings at OhpenEpisode 68 with PostNLEpisode 73 with NNLuc van Donkersgoed's Twitter and LinkedInThe decoupled invocation patternFor more stories about real-world use of serverless technologies, please follow me on Twitter as @theburningmonk and subscribe to this podcast.Want to step up your AWS game and learn how to build production-ready serverless applications? Check out my upcoming workshops and I will teach you everything I know.Opening theme song:Cheery Monday by Kevin MacLeodLink: https://incompetech.filmmusic.io/song/3495-cheery-mondayLicense: http://creativecommons.org/licenses/by/4.0

lock kevin macleod luc aws lambda eduard fifo principal engineer ec2 cheery monday fargate eventbridge

Amazon EventBridge Pipes: A New Approach to Event-Driven Architecture

DevOps With Zack

Play Episode Listen Later May 14, 2023 21:56

In this podcast, Pubudu will take a deep dive into the benefits of using Amazon EventBridge Pipes, how it works, and why it is a game changer for developers. About Pubudu: Pubudu is a senior backend developer at Starred (starred.com) and an AWS Community Builder in Serverless category. In free time, he likes to play around with Serverless services and share his experience about them in social media, tech talks and in blogs.LinkedIn: https://www.linkedin.com/in/pubudusj/Twitter: https://twitter.com/pubudusjMedium: https://medium.com/@pubudusjDev.to: https://dev.to/pubudusjEventBridge Pipes Articles:https://medium.com/@pubudusj/split-messages-from-single-sqs-queue-into-multiple-sqs-queues-using-eventbridge-728809342352https://medium.com/@pubudusj/self-healing-serverless-app-with-lambda-destinations-and-eventbridge-b548d4554df0

amazon architecture new approach pipes serverless starred event driven aws community builder eventbridge

Amazon EventBridge Pipes: A New Approach to Event-Driven Architecture

DevOps With Zack

Play Episode Listen Later May 14, 2023 21:56

amazon architecture new approach pipes serverless starred event driven aws community builder eventbridge

Advanced Monitoring with EventBridge + Amazon Linux 2 Container

cloudonaut

Play Episode Listen Later Mar 9, 2023 26:28

Two brothers discussing all things AWS every week. Hosted by Andreas and Michael Wittig presented by cloudonaut.

andreas monitoring aws container amazon linux eventbridge

Quoi de neuf ?

Play Episode Listen Later Feb 24, 2023 17:11

AWS a annoncé 83 nouveautés ces derniers 15 jours. J'en ai retenu 7 pour partager avec vous. Dans ma revue biaisée de ces news, j'ai retenu un nouveau service pour les telco, des nouvelles instances EC2 basées sur Graviton3 — le silicon AWS Arm —, une nouvelle option pour gérer vos instances avec Systems Manager. Il y a aussi une nouvelle possibilité pour déployer vos pod kubernetes. Et puis nous parlerons d'améliorations apportées aux APIs de DynamoDB et de Eventbridge qui vous donnent un bénéfice immédiat, sans que vous n'ayez rien à changer de votre côté.

dans aws apis neuf ec2 dynamodb systems manager eventbridge

Quoi de neuf ?

amazon tiktok cloud investigation consumers aws nik pipes principal engineer eventbridge

Play Episode Listen Later Feb 24, 2023 17:11

dans aws apis neuf ec2 dynamodb systems manager eventbridge

Episode 071 - Amazon EventBridge Pipes with Nik Pinski

AWS Developers Podcast

Play Episode Listen Later Feb 10, 2023 26:18

In this episode, Dave is joined by co-host Brooke Jaimeson, and Nik Pinski, Principal Engineer, AWS on the Amazon EventBridge team. Amazon EventBridge Pipes helps you create point-to-point integrations between event producers and consumers with optional transform, filter and enrich steps, while helping to reduces the amount of integration code you need to write and maintain. Nik walks us through this new Pipes feature for Amazon EventBridge, what it means for developers, and how you can get started building with it today. Brooke's Twitter: https://twitter.com/brooke_jamieson Brooke on LinkedIn: https://www.linkedin.com/in/brookejamieson/ Brooke's TikTok: https://www.tiktok.com/@brookebytes Brooke's Instagram: https://www.instagram.com/brooke.bytes/ [BLOG] Amazon EventBridge Pipes is now generally available - https://aws.amazon.com/about-aws/whats-new/2022/12/amazon-eventbridge-pipes-generally-available/ [BLOG] New — Create Point-to-Point Integrations Between Event Producers and Consumers with Amazon EventBridge Pipes - https://aws.amazon.com/blogs/aws/new-create-point-to-point-integrations-between-event-producers-and-consumers-with-amazon-eventbridge-pipes [CLI] Amazon EventBridge CLI - https://www.npmjs.com/package/@mhlabs/evb-cli [PATTERNS] ServerlessLand EventBridge Pipes Patterns Collection - https://serverlessland.com/patterns?services=eventbridge-pipes [PORTAL] Amazon EventBridge - https://aws.amazon.com/eventbridge [PORTAL] Amazon EventBridge Pipes - https://aws.amazon.com/eventbridge/pipes Good Tech Things - Crimes against the Cloud: an Investigation - https://newsletter.goodtechthings.com/p/crimes-against-cloud-an-investigation Tungsten Metal - https://en.wikipedia.org/wiki/Tungsten Subscribe: Amazon Music: https://music.amazon.com/podcasts/f8bf7630-2521-4b40-be90-c46a9222c159/aws-developers-podcast Apple Podcasts: https://podcasts.apple.com/us/podcast/aws-developers-podcast/id1574162669 Google Podcasts: https://podcasts.google.com/feed/aHR0cHM6Ly9mZWVkcy5zb3VuZGNsb3VkLmNvbS91c2Vycy9zb3VuZGNsb3VkOnVzZXJzOjk5NDM2MzU0OS9zb3VuZHMucnNz Spotify: https://open.spotify.com/show/7rQjgnBvuyr18K03tnEHBI TuneIn: https://tunein.com/podcasts/Technology-Podcasts/AWS-Developers-Podcast-p1461814/ RSS Feed: https://feeds.soundcloud

Building a Better Developer Experience with Lars Jacobsson

Ready, Set, Cloud Podcast!

Play Episode Listen Later Feb 10, 2023 27:05

Episode Summary Developer experience is often an afterthought when building APIs. But when you build for developers, DevEx is critical to your success. How fast can you make that time to first call? If you have a high barrier of entry, nobody is going to use it. The easier it is to move quickly, the easier it is to spread the word and drive usage. Lars and Allen talk about what makes for good and bad developer experiences. They cover real-world examples and what makes them particularly easy (or bad!) to use. Lars dives into the AWS Step Functions DevEx and what he did to finally get rid of that "swivel chair experience." About Lars Lars is the Head of Infrastructure at Mathem.se and AWS Community Builder. He regularly takes it upon himself to make developer experience better for everyone with his open source tools like the Step Functions SDK Autocomplete VSCode extension and EventBridge pattern generator CLI. He's also the creator of the Step Functions Workflow Studio sync, which aims to address the big gap in usability with the popular AWS service. Links Twitter - https://twitter.com/lajacobsson LinkedIn - https://www.linkedin.com/in/lars-jacobsson-a9518b23/ GitHub - https://github.com/ljacobsson Step Functions Workflow Studio Sync -https://github.com/ljacobsson/sfn-workflow-studio-sync Mathem's GitHub - https://github.com/mhlabs --- Send in a voice message: https://podcasters.spotify.com/pod/show/readysetcloud/message Support this podcast: https://podcasters.spotify.com/pod/show/readysetcloud/support

head infrastructure developers aws github apis cli developer experience devex jacobsson mathem aws community builder eventbridge

#67 EventBridge Scheduler + Packer AMI + AWS Debug Games

cloudonaut

Play Episode Listen Later Jan 18, 2023 19:28

Two brothers discussing all things AWS every week. Hosted by Andreas and Michael Wittig presented by cloudonaut.

games andreas aws packer debug scheduler eventbridge

Serverless Craic Ep41 Serverless Espresso

event qr aws espresso invent baristas s3 lambda serverless craic dynamodb cognito api gateway expo hall aws summit aws step functions eventbridge

Play Episode Listen Later Jan 6, 2023 15:38 Transcription Available

We've been talking about AWS re:Invent over the last few episodes. But one thing that we haven't talked about is Serverless Espresso. Serverless Espresso is a pop up coffee shop that allows you to order coffee from your phone. In the Expo Hall at the AWS Summit, they have a Barista setup. And you walk up to a QR code with a screen in the background. So you scan the QR code and enter in your email address. And up pops a menu. If you select an americano, espresso or other type of coffee, it kicks off an event driven flow. It uses an event driven service under the hood and pops up in the screen as a number. And then the Barista takes the number makes your coffee and gives it to you. But what's happening in the background is a whole load of orchestration and choreography run events. And as they've been using it as a way to explain serverless event driven architecture. Event driven architecture can be hard for people to conceptually wrap their heads around. So making it real through ordering coffee. And showing how to tie a coffee process and an event driven coffee ordering system makes it real. It demystifies it a little bit and removes some of the thinking that event driven architecture sounds really hard. This makes it more approachable. It stitches together a lot of stuff that we've been advocating for, in a way that makes sense. By using AWS Step Functions, EventBridge, Lambda, API Gateway, S3, DynamoDB and Cognito. It brings to bare a lot of great technology that we are advocating for in a way that's practical and easily consumable. And the Serverless Espresso workshop is very good for walking you through the steps about why you're doing what you're doing. And how do you do that, set up rules and set up everything. It's a great way to get hands on with event driven architectures and serverless in a practical way. There are two myths in this space that AWS are trying to dispel. We first started talking about event driven architecture 13 years ago. We had the idea of doing something but back then it was really difficult, because we didn't have a lot of support. So we had hard problems to solve technically, because the foundations weren't there. That is the first myth of being a hard thing to do. The second myth is that people think of serverless as just writing code and functions. It's actually more like an event driven architecture. It's events firing off activity and not a call stack. So it's a lot easier to build full event of architecture than it would have been years ago. The technical challenge is not as bad as you think it is. What I like about Serverless Espresso is the simple interactions. You order, it goes to the barista who makes the coffee, and he gives it back to you. There's one interaction. Normally when ordering a coffee, you talk to a waiter, the waiter talks to the barista, and the barista talks to someone about milk etc. There can be six or seven people in that flow. It causes confusion. In a company, if a business process is owned by six or seven teams, even across two or three departments, it gets messy. If a single team builds for the customer directly and there's no one else, it's usually pretty clean. Because you can see everything needs to happen. It gets complicated if the business processes is split over several teams and departments. Serverless Espresso Lab is good, because the opinions are out there and you add on your bit, which is your business process flow. It goes back to our book The Value Flywheel Effect. And the first phase of the value flywheel which is clarity of purpose. Who is the customer and what is the business flow that we're trying to build? And let's have a good debate on how we are going to do that. When you get to the technical side, that opinion is already there. And you can focus on getting the orchestration correct. Because you know that a lot of that underlying stuff is pretty much solved apart from making it behave. Look at the Serverless Espresso Lab on workshop.serverlesscoffee.com. Or search for Serverless Espresso. And big kudos to the AWS Serverless Developer Advocate team. They're mostly on serverlessland.com. Thanks for listening. Serverless Craic from The Serverless Edge Check out our book The Value Flywheel Effect Follow us on Twitter @ServerlessEdge

#59 [Hot off the Cloud] EventBridge Scheduler + Resource Explorer + ECS scale-in protection

cloudonaut

Play Episode Listen Later Nov 15, 2022 30:02

Two brothers discussing all things AWS every week. Hosted by Andreas and Michael Wittig presented by cloudonaut.

scale cloud andreas resource explorers aws scheduler hot off eventbridge

Serverless Craic Ep33 How to apply the well architected tool

Play Episode Listen Later Sep 28, 2022 14:31 Transcription Available

How to apply the well architected tool Grady Booch is a fantastic software architect who has done a lot of pioneering software engineering. One of his latest messages says that it's a good thing that AWS, Azure and Google are offering guidance on building with well architected. But he also warns that when cloud providers talk to you about well architected, it comes with a bias towards their products. There is an inherent bias among providers who are investing in these frameworks. They lean towards their own ecosystem and services. But the interesting thing about the AWS framework is if you squint at it sideways, you can extract the implementation details. And focus on the principles and values behind delivering a well architected solution. You can abstract a huge amount of value from them, even if you don't embrace the services aligned to that cloud provider. It's the collaboration, questions, mindset and thinking that are good. And they are fairly consistent across whatever tech stack or platform you're leveraging. The questions are very similar. And the answers may be a little bit different. But even just asking those questions is hugely valuable. And as long as you apply it to your context, you can get a lot of value. I remember having discussions with people who were asking what architecture is? They weren't quite sure what architecture was and were trying to redefine architecture. With less technical colleagues, you need to give them a definition of what architecture is. When you compare all three Cloud Providers, they roughly have the same stuff. But AWS is driven by principles or tenets. A lot of the principles in the framework are applicable elsewhere. Azure is almost like a wizard as it takes you on a path, which is good 90% of the time. But I worry about what happens if that's not right for you. And Google sets the bar was quite high with hardcore SRE. The culture of each of the providers comes through. But they are all really good frameworks. With AWS, if you leverage a lot of AWS services, you're going to be EDA. So that is going to bias you towards operating in certain ways and leveraging certain ways of communication. Whether it's Kinesis, EventBridge or SQS SNS to connect things together. With Google, I am interested to see why they're focusing so much on SRE. Maybe there's a requirement on squads to look after resources when they are up, because it's more container oriented. The folks at Google perhaps believe that customer success is driven through proper SRE and operational excellence. A lot of the SRE principles came from Google. A cloud provider writes a framework in their own language. And it will have a bias towards their own products. So you need to see that and see the difference. And if you ask a cloud provider to do a well architected review on your product, a solution architect from that cloud provider to will come in and do it. And all they know is their products. So if you get an AWS solution architect to review your product, they'll start recommending AWS products. The lesson to learn is that it's not the framework that's important. It's the process that you put around it. And the process does not need to involve the cloud provider. We can run that ourselves. And we can apply the framework ourselves. It is a big shift, to use well architected as a benchmark to self assess to learn, measure and improve. The great advantage of doing this yourself is that your feedback loop is established. So you're actually seeing what works, what doesn't work and where your gaps are across all the different pillars. And where you should be investing your time and your team's time across the org. If you bring in somebody external, they won't have the context. And they're not going to be part of that feedback loop. When you use Clarity of Purpose, which is Phase 1 of the Value Flywheel, with the well architected framework, you can combat that bias. Your focus will be on solving the problem with the tools you've got at your disposal. And you're able to leverage tools faster and more efficiently. It should not be something that your architect does to you. This is something that the team should be embracing themselves through an iterative and incremental process. We've written about this a lot in our SCORPS process, which is in the book as well. It doesn't need to be a big, scary thing that an architect from on high comes down and imposes on you. It can be a very useful learning and enablement tool. Grady Booch Search for 'Grady' 'Well' 'Architected' to find that tweet and discussion SCORPS process on TheServerlessEdge.com. The Value Flywheel Effect book Twitter @ServerlessEdge Transcribed by https://otter.ai

google clarity tool phase aws azure sre serverless craic kinesis architected eventbridge

Serverless Craic Ep31 Event Driven Architecture Examples at EDA Day

airhacks.fm podcast with adam bien

Play Episode Listen Later Sep 16, 2022 18:27 Transcription Available

We are talking about Event Driven Architecture examples today. There was an event in London a few weeks ago, called EDA Day. It was organised by GOTO with a lot of AWS contributors. It was neat because it was one day focused on event driven architectures. It showed the coming together of a 15 to 20 year old pattern of EDA, plus serverless. And all the bigger services on top of that, like Eventbridge and Step Functions. Gregor Hohpe's Keynote Gregor Hohpe did the keynote talk: 'I made everything loosely coupled. Does my app fall apart?' Gregor is an AWS enterprise strategist. And he talked about the event landscape and the complexities behind event driven and architecture. He had a diagram called: 'A calls B'. It looked pretty simple until you get to the million things you need to think about when A calls B! He said that there were three languages in a cloud native serverless domain. The business domain and how you talk about the business domain as a business person. The eventing architecture and how you talk about it as an architect. And the cloud native area, and how you talk about it as a cloud engineer. So DDD, event framework and CDK for automation. It's about having those three separate languages and how you talk. And bringing them together at the end. Serverless Espresso And one neat thing to mention is a developer advocate called Julian Wood. He's worth looking up on Twitter. He, Ben Smith, and a few others from Serverless Land, put together a demo called Serverless Espresso. You scan a barcode and through an event driven step function, event bridge sequence, you can order a coffee from your phone. It looks and sounds really simple. But you watch the whole thing happen. That's a great lab. So look up AWS labs, to see Serverless Espresso. It's well put together to show how you build an event driven architecture from the ground up. Ben Ellerby - Minimal Viable Migrations Another good speaker was Ben Ellerby. He worked in Theodo and is an AWS Serverless Hero. He has a thing about Minimal Viable Migrations. A lot of people think event driven is a greenfield or brand new thing. But he had a great talk about existing architecture and going event driven. He talked about doing a small part of your architecture and going bit by bit. By using an incremental model. David Boyne - Awesome EventBridge David Boyne joined AWS and does ‘Awesome EventBridge'. He has open source projects. And he does a great talk on 'Thinking Event First'. How to approach events and get your schemes right. And really think about your domain model and lock it in from day one. So he's got a bunch of tools as well. So it's worth looking up his resources on ‘awesome event bridge'. Marcia Villalba - FooBar Serverless Another great speaker was Marcia Villalba. She's one of the developer advocates at AWS. She's got great content on good practices and getting started. She has a really nice way of explaining these concepts. There is one thing I get nervous about around event driven and domain driven. People who are good at it tend to get very complicated very quickly and lose everyone. But Marcia's super at bringing these concepts across and helping normal teams, which is every team. Check out her FooBar Serverless YouTube channel. There is tons of developer friendly content from beginner to more advanced. It's one of my YouTube subscriptions that I watch quite regularly. Lego Talk - Sarah Hamilton and Sheen Brisals The last one to talk about is Lego. They sponsored the event. And they had two talks. Sarah Hamilton is one of the software engineers and she gave a really good talk about the advanced techniques they're using in their event driven architecture. My friend, Sheen Brisals was speaking as well. They have a fantastic story, which is well worth listening to. It's about how they moved to an event driven serverless architecture. There's a socio-technical element to this. How you organise your teams and the attitude is what I would call a core engineering competency and mindset. As opposed to an architectural pattern. Lego tells their story brilliantly. Product Leader panel The event ended with a panel of product leaders from Eventbridge, Step Functions and MongoDB. It was a really relaxed panel. Emily Shea, who we know well, was there. She works in go to market for serverless. It was a relaxed chat. No one was pushing any tools. They were shooting the breeze on good practice and what's coming down the track. The evolution of event driven architecture and the tie in with serverless. There's something in it! I don't want to say Serverless is becoming EDA or EDA is becoming serverless. But serverless enables EDA for sure. https://gotoldn.com/ https://theserverlessedge.com/ https://twitter.com/ServerlessEdge

lego architecture aws gregor ben smith mongodb serverless craic cdk event driven sarah hamilton julian wood eventbridge aws serverless hero gregor hohpe

#0020 - Anuncios para Developers

Innovando con AWS

Play Episode Listen Later Sep 6, 2022 61:16

Andres Lindo, Solutions Architect en AWS nos entretiene durante 1 hora contándonos y refrescándonos las últimas novedades en el ámbito de herramientas para desarrolladores. Nos cuenta sobre tecnologías serverless, containers, herramientas de monitoreo y service bus. Su pasión es contagiosa y si trabajás en desarrollo de software no podés perderte este episodio. 00:00 Intro01:13 El background de Andres03:51 Serverless13:15 SQS16:35 EventBridge22:30 Herramientas de Desarrollo42:12 Containers50:40 CDK Infraestructura como código53:02 Chatops56:12 Cierre Andres Lindo - https://www.linkedin.com/in/andreslindo/ Andrés Lindo es Entreprise Solutions Architect basado en Madrid (España), cuenta con 18 años de experiencia en IT. Inició su carrera en AWS en LATAM, en 2019. Andrés ayuda a clientes de múltiples industrias en la construcción de experiencias digitales con AWS. Apasionado por los viajes, la música, la radio y DJing.Rodrigo Asensio - https://twitter.com/rasensio Basado en North Carolina, USA, Rodrigo es responsable de un equipo de cuentas estratégicas para el segmento de ISV de Educación. Rodrigo busca poder descomplejizar y desmitificar conceptos, herramientas y procesos relacionados al cloud para poder hacer que esta tecnología esté al alcance a más gente. LinksAWS Lambda Event Filtering: https://aws.amazon.com/blogs/compute/filtering-event-sources-for-aws-lambda-functions/ AWS Lambda Partial Batch Response: https://aws.amazon.com/es/about-aws/whats-new/2021/11/aws-lambda-partial-batch-response-sqs-event-source/ SQL DLQ-Redrive: https://aws.amazon.com/blogs/compute/introducing-amazon-simple-queue-service-dead-letter-queue-redrive-to-source-queues/ SQS Managed SSE: https://aws.amazon.com/es/about-aws/whats-new/2021/11/amazon-sqs-server-side-encryption-keys-sse/ EventBridge S3 Data Plane Notifications: https://aws.amazon.com/blogs/aws/new-use-amazon-s3-event-notifications-with-amazon-eventbridge/ EventBridge cross-region support: https://aws.amazon.com/es/about-aws/whats-new/2021/11/amazon-eventbridge-cross-region-expands/ CodeGuru Secrets Detector: https://aws.amazon.com/blogs/aws/codeguru-reviewer-secrets-detector-identify-hardcoded-secrets/ Amplify Studio: https://aws.amazon.com/blogs/mobile/aws-amplify-studio-figma-to-fullstack-react-app-with-minimal-programming/ CloudWatch Real User Monitoring (RUM): https://aws.amazon.com/blogs/aws/cloudwatch-rum/ CloudWatch Evidently: https://aws.amazon.com/blogs/aws/cloudwatch-evidently/ New AWS SDKs (Swift, Kotlin, Rust): https://aws.amazon.com/es/about-aws/whats-new/2021/12/aws-sdk-swift-developer-preview/ https://aws.amazon.com/about-aws/whats-new/2021/12/aws-sdk-kotlin-developer-preview/ https://aws.amazon.com/about-aws/whats-new/2021/12/aws-sdk-rust-developer-preview/ Karpenter: https://aws.amazon.com/blogs/aws/introducing-karpenter-an-open-source-high-performance-kubernetes-cluster-autoscaler/ Elastic Container Registry (ECR) - Pull through cache repositories: https://aws.amazon.com/blogs/aws/announcing-pull-through-cache-repositories-for-amazon-elastic-container-registry/ AWS Marketplace for Containers Anywhere: https://aws.amazon.com/blogs/aws/new-aws-marketplace-for-containers-anywhere-to-deploy-your-kubernetes-cluster-in-any-environment/ AWS CDK v2: https://aws.amazon.com/es/about-aws/whats-new/2021/12/aws-cloud-development-kit-cdk-generally-available/ AWS CDK Constructs Library: https://constructs.dev AWS Chatbot - Manage resources from Slack: https://aws.amazon.com/es/about-aws/whats-new/2021/11/aws-chatbot-management-resources-slack/ Conectate con Rodrigo Asensio en Twitter https://twitter.com/rasensio y Linkedin en https://www.linkedin.com/in/rasensio/ Conectate con Andres Lindo en Linkedin https://www.linkedin.com/in/andreslindo/

AWS Lambda, Events, Quarkus and Java

Play Episode Listen Later Aug 6, 2022 61:10

An airhacks.fm conversation with Goran Opacic (@goranopacic) about: transactions and clouds, checkout last episode with Goran: "#190 Real World Enterprise Serverless Java on AWS Cloud", transition from Java EE to the cloud, Long Running Actions in MicroProfile and the saga pattern, the problem of transaction coordination, in the clouds there should be no coordinating servers, DynamoDB is transactional and supports conditional writes, AWS Lambda Powertools for Java, event driven thinking on AWS, Java idioms and conventions on AWS, Amazon DynamoDB JPA-like persistence - DynamoDBMapper, dependency injection in AWS Lambdas, AWS Lambda PowerTools features should become a part of Lambda, the Z Garbage Collector, a missile with memory leaks, running BIRT reports in a AWS Lambda, synchronous Step Functions, EventBridge is the service connectors, AWS AppSync can push events to the client, Goran Opacic on twitter: @goranopacic, Goran's blog: madabout.cloud

events aws java lambda goran aws lambda birt dynamodb aws cloud java ee aws appsync eventbridge

Idempotency, Secrets, Dependency Injection and AWS Lambda

airhacks.fm podcast with adam bien

Play Episode Listen Later Jul 8, 2022 54:27

An airhacks.fm conversation with Mark Sailes (@MarkSailes3) about: AWS Lambda Powertools for Java,, fetching and caching secrets, default caching retention period for short lived secrets, the limits of the clouds, the consistency of DynamoDB, DynamoDB connectivity with AWS Lambda, RDS Proxy, "Proprietary Cloud Native Managed Service", SQS, SNS, Kinesis, the fan-out pattern is implemented with SQS and SNS; EventBridge could replace SQS and SNS for the implementation of fan out patterns, AWS Lambda Powertools for Java Idempontency, using request as idempotency key, transactions, network errors and idempotency, idempotency per design, partial batch failures handling, bisect in SQS, SQS Large Message Handling, message offloading to S3, S3 transactional writes, dependency injection of Lambda resources, secret injection with AWS Lambda, JSON-logging and AWS Lambda Powertools for Java Logging, converting json to metrics in CloudWatch, Mark Sailes on twitter: @MarkSailes3, Mark's blog: mark-sailes.medium.com

secrets java sns s3 lambda json aws lambda kinesis dynamodb dependency injection sqs cloudwatch eventbridge

39. How do you build a cross-account event backbone with EventBridge?

Play Episode Listen Later Jun 2, 2022 22:38

When it comes to building and deploying microservice applications on AWS, there are 2 emerging best practices: use a separate AWS account per application (and environment) and decouple communication between separate systems using events (instead of point-to-point communication). Can we use these two best practices together? Yes, but we will need to find a way to pass messages between AWS accounts! In this episode we discuss how to do that using EventBridge as a cross-account event backbone! We discuss why these 2 suggestions are well established best practices, what are the pros and cons that they bring to the table, what an event backbone is and why EventBridge is a great service to implement one. Finally, we will discuss a case study and an example implementation of this pattern in the context of an e-commerce application built with a microservices architecture. In this episode, we mentioned the following resources: - Article “How to use EventBridge as a Cross-Account Event Backbone” https://dev.to/eoinsha/how-to-use-eventbridge-as-a-cross-account-event-backbone-5fik - Repository with example code: https://github.com/fourTheorem/cross-account-eventbridge/ - Article “What can you do with EventBridge?” (fourTheorem blog): https://www.fourtheorem.com/blog/what-can-you-do-with-eventbridge - For great ideas on structuring event payloads, take a read of Sheen Brisals' post on the Lego Engineering blog: https://medium.com/lego-engineering/the-power-of-amazon-eventbridge-is-in-its-detail-92c07ddcaa40 - Article “What do you need to know about SNS?” (fourTheorem blog) which includes a comparison of SNS and EventBridge: https://www.fourtheorem.com/blog/what-do-you-need-to-know-about-sns - AWS Bites Episode 23: “What's the big deal with EventBridge?” : https://youtu.be/UjIE5qp-v8w - AWS Community Day talk by Luc van Donkersgoed “Event-Driven Architecture at PostNL Scale” https://www.youtube.com/watch?v=nyoMF1AEI7g This episode is also available on YouTube: https://www.youtube.com/AWSBites You can listen to AWS Bites wherever you get your podcasts: - Apple Podcasts: https://podcasts.apple.com/us/podcast/aws-bites/id1585489017 - Spotify: https://open.spotify.com/show/3Lh7PzqBFV6yt5WsTAmO5q - Google: https://podcasts.google.com/feed/aHR0cHM6Ly9hbmNob3IuZm0vcy82YTMzMTJhMC9wb2RjYXN0L3Jzcw== - Breaker: https://www.breaker.audio/aws-bites - RSS: https://anchor.fm/s/6a3312a0/podcast/rss Do you have any AWS questions you would like us to address? Leave a comment here or connect with us on Twitter: - https://twitter.com/eoins - https://twitter.com/loige #aws #microservice #eventbridge

spotify cross event account luc aws backbone sns repository article how eventbridge

Episode 86 - Amazon EventBridge

AWS TechChat

Play Episode Listen Later May 10, 2022 55:44

In this episode of AWS Techchat, we started by talking about foundations, where we spoke about an overview of EventBridge and how it is different from CloudWatch Events. Then we talked about some of the features such as Archive and replay, Schema Registry, Global Endpoints and API Destinations And finally we dove into architecture patterns. Here we touched on the need to spend time modeling your logical architecture to get a good foundation for your event-driven architecture and explored event bus topologies and best practices. Speakers Shai Perednik - Global Tech Lead - Blockchain Cheryl Joseph - Solutions Architect, AWS Stephen Liedig - Principal SA - Serverless, AWS Resources *Amazon EventBridge resource policy samples* https://github.com/aws-samples/amazon-eventbridge-resource-policy-samples *AWS re:Invent 2020 session* Building event-driven applications with Amazon EventBridge (https://youtu.be/Wk0FoXTUEjo) *Introducing global endpoints for Amazon EventBridge* https://aws.amazon.com/blogs/compute/introducing-global-endpoints-for-amazon-eventbridge/ *ANZ Summit: Design event-driven integrations using Amazon EventBridge (Day 2)* * AWS Summit regisration (https://aws.amazon.com/events/summits/anz/) * Agenda at a glance (https://pages.awscloud.com/rs/112-TZM-766/images/AWS-Summit-ANZ-2022-Agenda.pdf) Blog Post * Building an event-driven application with Amazon EventBridge (https://aws.amazon.com/blogs/compute/building-an-event-driven-application-with-amazon-eventbridge/)

amazon building archive aws invent aws summit eventbridge tzm

33. What can you do with CloudWatch metrics?

spotify metrics aws honeycomb observability datadog new relic cloudwatch eventbridge

Play Episode Listen Later Apr 21, 2022 33:01

CloudWatch is the main Observability tool in AWS and it offers a wide range of features: logs, metrics, dashboards, alarms and even events (recently moved into EventBridge). In this episode we are going to focus on CloudWatch metrics. We are going to discuss the characteristics of metrics in CloudWatch: namespaces, dimensions, units and more. What metrics you get out of the box and how to create your own. How to access and explore metrics. Finally we will compare CloudWatch to other providers like DataDog, New Relic, Honeycomb and Grafana + Prometheus and try to assess whether CloudWatch is enough or if you need to use other third-party services. In this episode, we mentioned the following resources: - How to send Gzipped requests with boto3 (which uses the PutMetricsData API as an example): https://loige.co/how-to-send-gzipped-requests-with-boto3 - CloudWatch service quota: https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/cloudwatch_limits.html - CloudWatch metrics stream for DataDog: https://www.datadoghq.com/blog/amazon-cloudwatch-metric-streams-datadog/ This episode is also available on YouTube: https://www.youtube.com/AWSBites You can listen to AWS Bites wherever you get your podcasts: - Apple Podcasts: https://podcasts.apple.com/us/podcast/aws-bites/id1585489017 - Spotify: https://open.spotify.com/show/3Lh7PzqBFV6yt5WsTAmO5q - Google: https://podcasts.google.com/feed/aHR0cHM6Ly9hbmNob3IuZm0vcy82YTMzMTJhMC9wb2RjYXN0L3Jzcw== - Breaker: https://www.breaker.audio/aws-bites - RSS: https://anchor.fm/s/6a3312a0/podcast/rss Do you have any AWS questions you would like us to address? Leave a comment here or connect with us on Twitter: - https://twitter.com/eoins - https://twitter.com/loige

The Multi-Cloud Counterculture with Tim Bray

Screaming in the Cloud

Play Episode Listen Later Apr 5, 2022 41:50

About TimTimothy William Bray is a Canadian software developer, environmentalist, political activist and one of the co-authors of the original XML specification. He worked for Amazon Web Services from December 2014 until May 2020 when he quit due to concerns over the terminating of whistleblowers. Previously he has been employed by Google, Sun Microsystemsand Digital Equipment Corporation (DEC). Bray has also founded or co-founded several start-ups such as Antarctica Systems.Links Referenced: Textuality Services: https://www.textuality.com/ laugh]. So, the impetus for having this conversation is, you had a [blog post: https://www.tbray.org/ongoing/When/202x/2022/01/30/Cloud-Lock-In @timbray: https://twitter.com/timbray tbray.org: https://tbray.org duckbillgroup.com: https://duckbillgroup.com TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: This episode is sponsored in part by our friends at Vultr. Spelled V-U-L-T-R because they're all about helping save money, including on things like, you know, vowels. So, what they do is they are a cloud provider that provides surprisingly high performance cloud compute at a price that—while sure they claim its better than AWS pricing—and when they say that they mean it is less money. Sure, I don't dispute that but what I find interesting is that it's predictable. They tell you in advance on a monthly basis what it's going to going to cost. They have a bunch of advanced networking features. They have nineteen global locations and scale things elastically. Not to be confused with openly, because apparently elastic and open can mean the same thing sometimes. They have had over a million users. Deployments take less that sixty seconds across twelve pre-selected operating systems. Or, if you're one of those nutters like me, you can bring your own ISO and install basically any operating system you want. Starting with pricing as low as $2.50 a month for Vultr cloud compute they have plans for developers and businesses of all sizes, except maybe Amazon, who stubbornly insists on having something to scale all on their own. Try Vultr today for free by visiting: vultr.com/screaming, and you'll receive a $100 in credit. Thats V-U-L-T-R.com slash screaming.Corey: Couchbase Capella Database-as-a-Service is flexible, full-featured and fully managed with built in access via key-value, SQL, and full-text search. Flexible JSON documents aligned to your applications and workloads. Build faster with blazing fast in-memory performance and automated replication and scaling while reducing cost. Capella has the best price performance of any fully managed document database. Visit couchbase.com/screaminginthecloud to try Capella today for free and be up and running in three minutes with no credit card required. Couchbase Capella: make your data sing.Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. My guest today has been on a year or two ago, but today, we're going in a bit of a different direction. Tim Bray is a principal at Textuality Services.Once upon a time, he was a Distinguished Engineer slash VP at AWS, but let's be clear, he isn't solely focused on one company; he also used to work at Google. Also, there is scuttlebutt that he might have had something to do, at one point, with the creation of God's true language, XML. Tim, thank you for coming back on the show and suffering my slings and arrows.Tim: Oh, you're just fine. Glad to be here.Corey: [laugh]. So, the impetus for having this conversation is, you had a blog post somewhat recently—by which I mean, January of 2022—where you talked about lock-in and multi-cloud, two subjects near and dear to my heart, mostly because I have what I thought was a fairly countercultural opinion. You seem to have a very closely aligned perspective on this. But let's not get too far ahead of ourselves. Where did this blog posts come from?Tim: Well, I advised a couple of companies and one of them happens to be using GCP and the other happens to be using AWS and I get involved in a lot of industry conversations, and I noticed that multi-cloud is a buzzword. If you go and type multi-cloud into Google, you get, like, a page of people saying, “We will solve your multi-cloud problems. Come to us and you will be multi-cloud.” And I was not sure what to think, so I started writing to find out what I would think. And I think it's not complicated anymore. I think the multi-cloud is a reality in most companies. I think that many mainstream, non-startup companies are really worried about cloud lock-in, and that's not entirely unreasonable. So, it's a reasonable thing to think about and it's a reasonable thing to try and find the right balance between avoiding lock-in and not slowing yourself down. And the issues were interesting. What was surprising is that I published that blog piece saying what I thought were some kind of controversial things, and I got no pushback. Which was, you know, why I started talking to you and saying, “Corey, you know, does nobody disagree with this? Do you disagree with this? Maybe we should have a talk and see if this is just the new conventional wisdom.”Corey: There's nothing worse than almost trying to pick a fight, but no one actually winds up taking you up on the opportunity. That always feels a little off. Let's break it down into two issues because I would argue that they are intertwined, but not necessarily the same thing. Let's start with multi-cloud because it turns out that there's just enough nuance to—at least where I sit on this position—that whenever I tweet about it, I wind up getting wildly misinterpreted. Do you find that as well?Tim: Not so much. It's not a subject I have really had too much to say about, but it does mean lots of different things. And so it's not totally surprising that that happens. I mean, some people think when you say multi-cloud, you mean, “Well, I'm going to take my strategic application, and I'm going to run it in parallel on AWS and GCP because that way, I'll be more resilient and other good things will happen.” And then there's another thing, which is that, “Well, you know, as my company grows, I'm naturally going to be using lots of different technologies and that might include more than one cloud.” So, there's a whole spectrum of things that multi-cloud could mean. So, I guess when we talk about it, we probably owe it to our audiences to be clear what we're talking about.Corey: Let's be clear, from my perspective, the common definition of multi-cloud is whatever the person talking is trying to sell you at that point in time is, of course, what multi-cloud is. If it's a third-party dashboard, for example, “Oh, yeah, you want to be able to look at all of your cloud usage on a single pane of glass.” If it's a certain—well, I guess, certain not a given cloud provider, well, they understand if you go all-in on a cloud provider, it's probably not going to be them so they're, of course, going to talk about multi-cloud. And if it's AWS, where they are the 8000-pound gorilla in the space, “Oh, yeah, multi-clouds, terrible. Put everything on AWS. The end.” It seems that most people who talk about this have a very self-serving motivation that they can't entirely escape. That bias does reflect itself.Tim: That's true. When I joined AWS, which was around 2014, the PR line was a very hard line. “Well, multi-cloud that's not something you should invest in.” And I've noticed that the conversation online has become much softer. And I think one reason for that is that going all-in on a single cloud is at least possible when you're a startup, but if you're a big company, you know, a insurance company, a tire manufacturer, that kind of thing, you're going to be multi-cloud, for the same reason that they already have COBOL on the mainframe and Java on the old Sun boxes, and Mongo running somewhere else, and five different programming languages.And that's just the way big companies are, it's a consequence of M&A, it's a consequence of research projects that succeeded, one kind or another. I mean, lots of big companies have been trying to get rid of COBOL for decades, literally, [laugh] and not succeeding and doing that. So—Corey: It's ‘legacy' which is, of course, the condescending engineering term for, “It makes money.”Tim: And works. And so I don't think it's realistic to, as a matter of principle, not be multi-cloud.Corey: Let's define our terms a little more closely because very often, people like to pull strange gotchas out of the air. Because when I talk about this, I'm talking about—like, when I speak about it off the cuff, I'm thinking in terms of where do I run my containers? Where do I run my virtual machines? Where does my database live? But you can also move in a bunch of different directions. Where do my Git repositories live? What Office suite am I using? What am I using for my CRM? Et cetera, et cetera? Where do you draw the boundary lines because it's very easy to talk past each other if we're not careful here?Tim: Right. And, you know, let's grant that if you're a mainstream enterprise, you're running your Office automation on Microsoft, and they're twisting your arm to use the cloud version, so you probably are. And if you have any sense at all, you're not running your own Exchange Server, so let's assume that you're using Microsoft Azure for that. And you're running Salesforce, and that means you're on Salesforce's cloud. And a lot of other Software-as-a-Service offerings might be on AWS or Azure or GCP; they don't even tell you.So, I think probably the crucial issue that we should focus our conversation on is my own apps, my own software that is my core competence that I actually use to run the core of my business. And typically, that's the only place where a company would and should invest serious engineering resources to build software. And that's where the question comes, where should that software that I'm going to build run? And should it run on just one cloud, or—Corey: I found that when I gave a conference talk on this, in the before times, I had to have a ever lengthier section about, “I'm speaking in the general sense; there are specific cases where it does make sense for you to go in a multi-cloud direction.” And when I'm talking about multi-cloud, I'm not necessarily talking about Workload A lives on Azure and Workload B lives on AWS, through mergers, or weird corporate approaches, or shadow IT that—surprise—that's not revenue-bearing. Well, I guess we have to live with it. There are a lot of different divisions doing different things and you're going to see that a fair bit. And I'm not convinced that's a terrible idea as such. I'm talking about the single workload that we're going to spread across two or more clouds, intentionally.Tim: That's probably not a good idea. I just can't see that being a good idea, simply because you get into a problem of just terminology and semantics. You know, the different providers mean different things by the word ‘region' and the word ‘instance,' and things like that. And then there's the people problem. I mean, I don't think I personally know anybody who would claim to be able to build and deploy an application on AWS and also on GCP. I'm sure some people exist, but I don't know any of them.Corey: Well, Forrest Brazeal was deep in the AWS weeds and now he's the head of content at Google Cloud. I will credit him that he probably has learned to smack an API around over there.Tim: But you know, you're going to have a hard time hiring a person like that.Corey: Yeah. You can count these people almost as individuals.Tim: And that's a big problem. And you know, in a lot of cases, it's clearly the case that our profession is talent-starved—I mean, the whole world is talent-starved at the moment, but our profession in particular—and a lot of the decisions about what you can build and what you can do are highly contingent on who you can hire. And you can't hire a multi-cloud expert, well, you should not deploy, [laugh] you know, a multi-cloud application.Now, having said that, I just want to dot this i here and say that it can be made to kind of work. I've got this one company I advise—I wrote about it in the blog piece—that used to be on AWS and switched over to GCP. I don't even know why; this happened before I joined them. And they have a lot of applications and then they have some integrations with third-party partners which they implemented with AWS Lambda functions. So, when they moved over to GCP, they didn't stop doing that.So, this mission-critical latency-sensitive application of theirs runs on GCP that calls out to AWS to make calls into their partners' APIs and so on. And works fine. Solid as a rock, reliable, low latency. And so I talked to a person I know who knows over on the AWS side, and they said, “Oh, yeah sure, you know, we talked to those guys. Lots of people do that. We make sure, you know, the connections are low latency and solid.” So, technically speaking, it can be done. But for a variety of business reasons—maybe the most important one being expertise and who you can hire—it's probably just not a good idea.Corey: One of the areas where I think is an exception case is if you are a SaaS provider. Let's pick a big easy example: Snowflake, where they are a data warehouse. They've got to run their data warehousing application in all of the major clouds because that is where their customers are. And it turns out that if you're going to send a few petabytes into a data warehouse, you really don't want to be paying cloud egress rates to do it because it turns out, you can just bootstrap a second company for that much money.Tim: Well, Zoom would be another example, obviously.Corey: Oh, yeah. Anything that's heavy on data transfer is going to be a strange one. And there's being close to customers; gaming companies are another good example on this where a lot of the game servers themselves will be spread across a bunch of different providers, just purely based on latency metrics around what is close to certain customer clusters.Tim: I can't disagree with that. You know, I wonder how large a segment that is, of people who are, I think you're talking about core technology companies. Now, of the potential customers of the cloud providers, how many of them are core technology companies, like the kind we're talking about, who have such a need, and how many people who just are people who just want to run their manufacturing and product design and stuff. And for those, buying into a particular cloud is probably a perfectly sensible choice.Corey: I've also seen regulatory stories about this. I haven't been able to track them down specifically, but there is a pervasive belief that one interpretation of UK banking regulations stipulates that you have to be able to get back up and running within 30 days on a different cloud provider entirely. And also, they have the regulatory requirement that I believe the data remain in-country. So, that's a little odd. And honestly, when it comes to best practices and how you should architect things, I'm going to take a distinct backseat to legal requirements imposed upon you by your regulator. But let's be clear here, I'm not advising people to go and tell their auditors that they're wrong on these things.Tim: I had not heard that story, but you know, it sounds plausible. So, I wonder if that is actually in effect, which is to say, could a huge British banking company, in fact do that? Could they in fact, decamp from Azure and move over to GCP or AWS in 30 days? Boy.Corey: That is what one bank I spoke to over there was insistent on. A second bank I spoke to in that same jurisdiction had never heard of such a thing, so I feel like a lot of this is subject to auditor interpretation. Again, I am not an expert in this space. I do not pretend to be—I know I'm that rarest of all breeds: A white guy with a microphone in tech who admits he doesn't know something. But here we are.Tim: Yeah, I mean, I imagine it could be plausible if you didn't use any higher-level services, and you just, you know, rented instances and were careful about which version of Linux you ran and we're just running a bunch of Java code, which actually, you know, describes the workload of a lot of financial institutions. So, it should be a matter of getting… all the right instances configured and the JVM configured and launched. I mean, there are no… architecturally terrifying barriers to doing that. Of course, to do that, it would mean you would have to avoid using any of the higher-level services that are particular to any cloud provider and basically just treat them as people you rent boxes from, which is probably not a good choice for other business reasons.Corey: Which can also include things as seemingly low-level is load balancers, just based upon different provisioning modes, failure modes, and the rest. You're probably going to have a more consistent experience running HAProxy or nginx yourself to do it. But Tim, I have it on good authority that this is the old way of thinking, and that Kubernetes solves all of it. And through the power of containers and powers combining and whatnot, that frees us from being beholden to any given provider and our workloads are now all free as birds.Tim: Well, I will go as far as saying that if you are in the position of trying to be portable, probably using containers is a smart thing to do because that's a more tractable level of abstraction that does give you some insulation from, you know, which version of Linux you're running and things like that. The proposition that configuring and running Kubernetes is easier than configuring and running [laugh] JVM on Linux [laugh] is unsupported by any evidence I've seen. So, I'm dubious of the proposition that operating at the Kubernetes-level at the [unintelligible 00:14:42] level, you know, there's good reasons why some people want to do that, but I'm dubious of the proposition that really makes you more portable in an essential way.Corey: Well, you're also not the target market for Kubernetes. You have worked at multiple cloud providers and I feel like the real advantage of Kubernetes is people who happen to want to protect that they do so they can act as a sort of a cosplay of being their own cloud provider by running all the intricacies of Kubernetes. I'm halfway kidding, but there is an uncomfortable element of truth to that to some of the conversations I've had with some of its more, shall we say, fanatical adherents.Tim: Well, I think you and I are neither of us huge fans of Kubernetes, but my reasons are maybe a little different. Kubernetes does some really useful things. It really, really does. It allows you to take n VMs, and pack m different applications onto them in a way that takes reasonably good advantage of the processing power they have. And it allows you to have different things running in one place with different IP addresses.It sounds straightforward, but that turns out to be really helpful in a lot of ways. So, I'm actually kind of sympathetic with what Kubernetes is trying to be. My big gripe with it is that I think that good technology should make easy things easy and difficult things possible, and I think Kubernetes fails the first test there. I think the complexity that it involves is out of balance with the benefits you get. There's a lot of really, really smart people who disagree with me, so this is not a hill I'm going to die on.Corey: This is very much one of those areas where reasonable people can disagree. I find the complexity to be overwhelming; it has to collapse. At this point, it's finding someone who can competently run Kubernetes in production is a bit hard to do and they tend to be extremely expensive. You aren't going to find a team of those people at every company that wants to do things like this, and they're certainly not going to be able to find it in their budget in many cases. So, it's a challenging thing to do.Tim: Well, that's true. And another thing is that once you step onto the Kubernetes slope, you start looking about Istio and Envoy and [fabric 00:16:48] technology. And we're talking about extreme complexity squared at that point. But you know, here's the thing is, back in 2018 I think it was, in his keynote, Werner said that the big goal is that all the code you ever write should be application logic that delivers business value, which you know rep—Corey: Didn't CGI say the same thing? Didn't—like, isn't there, like, a long history dating back longer than I believe either of us have been alive have, “With this, all you're going to write is business logic.” That was the Java promise. That was the Google App Engine promise. Again, and again, we've had that carrot dangled in front of us, and it feels like the reality with Lambda is, the only code you will write is not necessarily business logic, it's getting the thing to speak to the other service you're trying to get it to talk to because a lot of these integrations are super finicky. At least back when I started learning how this stuff worked, they were.Tim: People understand where the pain points are and are indeed working on them. But I think we can agree that if you believe in that as a goal—which I still do; I mean, we may not have got there, but it's still a worthwhile goal to work on. We can agree that wrangling Istio configurations is not such a thing; it's not [laugh] directly value-adding business logic. To the extent that you can do that, I think serverless provides a plausible way forward. Now, you can be all cynical about, “Well, I still have trouble making my Lambda to talk to my other thing.” But you know, I've done that, and I've also deployed JVM on bare metal kind of thing.You know what? I'd rather do things at the Lambda level. I really rather would. Because capacity forecasting is a horribly difficult thing, we're all terrible at it, and the penalties for being wrong are really bad. If you under-specify your capacity, your customers have a lousy experience, and if you over-specify it, and you have an architecture that makes you configure for peak load, you're going to spend bucket-loads of money that you don't need to.Corey: “But you're then putting your availability in the cloud providers' hands.” “Yeah, you already were. Now, we're just being explicit about acknowledging that.”Tim: Yeah. Yeah, absolutely. And that's highly relevant to the current discussion because if you use the higher-level serverless function if you decide, okay, I'm going to go with Lambda and Dynamo and EventBridge and that kind of thing, well, that's not portable at all. I mean, APIs are totally idiosyncratic for AWS and GCP's equivalent, and Azure's—what do they call it? Permanent functions or something-a-rather functions. So yeah, that's part of the trade-off you have to think about. If you're going to do that, you're definitely not going to be multi-cloud in that application.Corey: And in many cases, one of the stated goals for going multi-cloud is that you can avoid the downtime of a single provider. People love to point at the big AWS outages or, “See? They were down for half a day.” And there is a societal question of what happens when everyone is down for half a day at the same time, but in most cases, what I'm seeing, your instead of getting rid of a single point of failure, introducing a second one. If either one of them is down your applications down, so you've doubled your outage surface area.On the rare occasions where you're able to map your dependencies appropriately, great. Are your third-party critical providers all doing the same? If you're an e-commerce site and Stripe processes your payments, well, they're public about being all-in on AWS. So, if you can't process payments, does it really matter that your website stays up? It becomes an interesting question. And those are the ones that you know about, let alone the third, fourth-order dependencies that are almost impossible to map unless everyone is as diligent as you are. It's a heavy, heavy lift.Tim: I'm going to push back a little bit. Now, for example, this company I'm advising that running GCP and calling out to Lambda is in that position; either GCP or Lambda goes off the air. On the other hand, if you've got somebody like Zoom, they're probably running parallel full stacks on the different cloud providers. And if you're doing that, then you can at least plausibly claim that you're in a good place because if Dynamo has an outage—and everything relies on Dynamo—then you shift your load over to GCP or Oracle [laugh] and you're still on the air.Corey: Yeah, but what is up as well because Zoom loves to sign me out on my desktop whenever I log into it on my laptop, and vice versa, and I wonder if that authentication and login system is also replicated full-stack to everywhere it goes, and what the fencing on that looks like, and how the communication between all those things works? I wouldn't doubt that it's possible that they've solved for this, but I also wonder how thoroughly they've really tested all of the, too. Not because I question them any; just because this stuff is super intricate as you start tracing it down into the nitty-gritty levels of the madness that consumes all these abstractions.Tim: Well, right, that's a conventional wisdom that is really wise and true, which is that if you have software that is alleged to do something like allow you to get going on another cloud, unless you've tested it within the last three weeks, it's not going to work when you need it.Corey: Oh, it's like a DR exercise: The next commit you make breaks it. Once you have the thing working again, it sits around as a binder, and it's a best guess. And let's be serious, a lot of these DR exercises presume that you're able to, for example, change DNS records on the fly, or be able to get a virtual machine provisioned in less than 45 minutes—because when there's an actual outage, surprise, everyone's trying to do the same things—there's a lot of stuff in there that gets really wonky at weird levels.Tim: A related similar exercise, which is people who want to be on AWS but want to be multi-region. It's actually, you know, a fairly similar kind of problem. If I need to be able to fail out of us-east-1—well, God help you, because if you need to everybody else needs to as well—but you know, would that work?Corey: Before you go multi-cloud go multi-region first. Tell me how easy it is because then you have full-feature parity—presumably—between everything; it should just be a walk in the park. Send me a postcard once you get that set up and I'll eat a bunch of words. And it turns out, basically, no one does.Tim: Mm-hm.Corey: Another area of lock-in around a lot of this stuff, and I think that makes it very hard to go multi-cloud is the security model of how does that interface with various aspects. In many cases, I'm seeing people doing full-on network overlays. They don't have to worry about the different security group models and VPCs and all the rest. They can just treat everything as a node sitting on the internet, and the only thing it talks to is an overlay network. Which is terrible, but that seems to be one of the only ways people are able to build things that span multiple providers with any degree of success.Tim: Well, that is painful because, much as we all like to scoff and so on, in the degree of complexity you get into there, it is the case that your typical public cloud provider can do security better than you can. They just can. It's a fact of life. And if you're using a public cloud provider and not taking advantage of their security offerings, infrastructure, that's probably dumb. But if you really want to be multi-cloud, you kind of have to, as you said.In particular, this gets back to the problem of expertise because it's hard enough to hire somebody who really understands IAM deeply and how to get that working properly, try and find somebody who can understand that level of thing on two different cloud providers at once. Oh, gosh.Corey: This episode is sponsored in part by LaunchDarkly. Take a look at what it takes to get your code into production. I'm going to just guess that it's awful because it's always awful. No one loves their deployment process. What if launching new features didn't require you to do a full-on code and possibly infrastructure deploy? What if you could test on a small subset of users and then roll it back immediately if results aren't what you expect? LaunchDarkly does exactly this. To learn more, visit launchdarkly.com and tell them Corey sent you, and watch for the wince.Corey: Another point you made in your blog post was the idea of lock-in, of people being worried that going all-in on a provider was setting them up to be, I think Oracle is the term that was tossed around where once you're dependent on a provider, what's to stop them from cranking the pricing knobs until you squeal?Tim: Nothing. And I think that is a perfectly sane thing to worry about. Now, in the short term, based on my personal experience working with, you know, AWS leadership, I think that it's probably not a big short-term risk. AWS is clearly aware that most of the growth is still in front of them. You know, the amount of all of it that's on the cloud is still pretty small and so the thing to worry about right now is growth.And they are really, really genuinely, sincerely focused on customer success and will bend over backwards to deal with the customers problems as they are. And I've seen places where people have negotiated a huge multi-year enterprise agreement based on Reserved Instances or something like that, and then realize, oh, wait, we need to switch our whole technology stack, but you've got us by the RIs and AWS will say, “No, no, it's okay. We'll tear that up and rewrite it and get you where you need to go.” So, in the short term, between now and 2025, would I worry about my cloud provider doing that? Probably not so much.But let's go a little further out. Let's say it's, you know, 2030 or something like that, and at that point, you know, Andy Jassy decided to be a full-time sports mogul, and Satya Narayana has gone off to be a recreational sailboat owner or something like that, and private equity operators come in and take very significant stakes in the public cloud providers, and get a lot of their guys on the board, and you have a very different dynamic. And you have something that starts to feel like Oracle where their priority isn't, you know, optimizing for growth and customer success; their priority is optimizing for a quarterly bottom line, and—Corey: Revenue extraction becomes the goal.Tim: That's absolutely right. And this is not a hypothetical scenario; it's happened. Most large companies do not control the amount of money they spend per year to have desktop software that works. They pay whatever Microsoft's going to say they pay because they don't have a choice. And a lot of companies are in the same situation with their database.They don't get to budget, their database budget. Oracle comes in and says, “Here's what you're going to pay,” and that's what you pay. You really don't want to be in a situation with your cloud, and that's why I think it's perfectly reasonable for somebody who is doing cloud transition at a major financial or manufacturing or service provider company to have an eye to this. You know, let's not completely ignore the lock-in issue.Corey: There is a significant scale with enterprise deals and contracts. There is almost always a contractual provision that says if you're going to raise a price with any cloud provider, there's a fixed period of time of notice you must give before it happens. I feel like the first mover there winds up getting soaked because everyone is going to panic and migrate in other directions. I mean, Google tried it with Google Maps for their API, and not quite Google Cloud, but also scared the bejesus out of a whole bunch of people who were, “Wait. Is this a harbinger of things to come?”Tim: Well, not in the short term, I don't think. And I think you know, Google Maps [is absurdly 00:26:36] underpriced. That's hellishly expensive service. And it's supposed to pay for itself by, you know, advertising on maps. I don't know about that.I would see that as the exception rather than the rule. I think that it's reasonable to expect cloud prices, nominally at least, to go on decreasing for at least the short term, maybe even the medium term. But that's—can't go on forever.Corey: It also feels to me, like having looked at an awful lot of AWS environments that if there were to be some sort of regulatory action or some really weird outage for a year that meant that AWS could not onboard a single new customer, their revenue year-over-year would continue to increase purely by organic growth because there is no forcing function that turns the thing off when you're done using it. In fact, they can migrate things around to hardware that works, they can continue building you for the things sitting there idle. And there is no governance path on that. So, on some level, winding up doing a price increase is going to cause a massive company focus on fixing a lot of that. It feels on some level like it is drawing attention to a thing that they don't really want to draw attention to from a purely revenue extraction story.When CentOS back-walked their ten-year support line two years, suddenly—and with an idea that it would drive [unintelligible 00:27:56] adoption. Well, suddenly, a lot of people looked at their environment, saw they had old [unintelligible 00:28:00] they weren't using. And massively short-sighted, massively irritated a whole bunch of people who needed that in the short term, but by the renewal, we're going to be on to Ubuntu or something else. It feels like it's going to backfire massively, and I'd like to imagine the strategist of whoever takes the reins of these companies is going to be smarter than that. But here we are.Tim: Here we are. And you know it's interesting you should mention regulatory action. At the moment, there are only three credible public cloud providers. It's not obvious the Google's really in it for the long haul, as last time I checked, they were claiming to maybe be breaking even on it. That's not a good number, you know? You'd like there to be more than that.And if it goes on like that, eventually, some politician is going to say, “Oh, maybe they should be regulated like public utilities,” because they kind of are right? And I would think that anybody who did get into Oracle-izing would be—you know, accelerate that happening. Having said that, we do live in the atmosphere of 21st-century capitalism, and growth is the God that must be worshiped at all costs. Who knows. It's a cloudy future. Hard to see.Corey: It really is. I also want to be clear, on some level, that with Google's current position, if they weren't taking a small loss at least, on these things, I would worry. Like, wait, you're trying to catch AWS and you don't have anything better to invest that money into than just well time to start taking profits from it. So, I can see both sides of that one.Tim: Right. And as I keep saying, I've already said once during this slot, you know, the total cloud spend in the world is probably on the order of one or two-hundred billion per annum, and global IT is in multiple trillions. So, [laugh] there's a lot more space for growth. Years and years worth of it.Corey: Yeah. The challenge, too, is that people are worried about this long-term strategic point of view. So, one thing you talked about in your blog post is the idea of using hosted open-source solutions. Like, instead of using Kinesis, you'd wind up using Kafka or instead of using DynamoDB you use their managed Cassandra service—or as I think of it Amazon Basics Cassandra—and effectively going down the path of letting them manage this thing, but you then have a theoretical Exodus path. Where do you land on that?Tim: I think that speaks to a lot of people's concerns, and I've had conversations with really smart people about that who like that idea. Now, to be realistic, it doesn't make migration easy because you've still got all the CI and CD and monitoring and management and scaling and alarms and alerts and paging and et cetera, et cetera, et cetera, wrapped around it. So, it's not as though you could just pick up your managed Kafka off AWS and drop a huge installation onto GCP easily. But at least, you know, your data plan APIs are the same, so a lot of your code would probably still run okay. So, it's a plausible path forward. And when people say, “I want to do that,” well, it does mean that you can't go all serverless. But it's not a totally insane path forward.Corey: So, one last point in your blog post that I think a lot of people think about only after they get bitten by it is the idea of data gravity. I alluded earlier in our conversation to data egress charges, but my experience has been that where your data lives is effectively where the rest of your cloud usage tends to aggregate. How do you see it?Tim: Well, it's a real issue, but I think it might perhaps be a little overblown. People throw the term petabytes around, and people don't realize how big a petabyte is. A petabyte is just an insanely huge amount of data, and the notion of transmitting one over the internet is terrifying. And there are lots of enterprises that have multiple petabytes around, and so they think, “Well, you know, it would take me 26 years to transmit that, so I can't.”And they might be wrong. The internet's getting faster all time. Did you notice? I've been able to move some—for purely personal projects—insane amounts of data, and it gets there a lot faster than you did. Secondly, in the case of AWS Snowmobile, we have an existence proof that you can do exabyte-ish scale data transfers in the time it takes to drive a truck across the country.Corey: Inbound only. Snowmobiles are not—at least according to public examples—are valid for Exodus.Tim: But you know, this is kind of place where regulatory action might come into play if what the people were doing was seen to be abusive. I mean, there's an existence proof you can do this thing. But here's another point. So, I suppose you have, like, 15 petabytes—that's an insane amount of data—displayed in your corporate application. So, are you actually using that to run the application, or is a huge proportion of that stuff just logs and data gathered of various kinds that's being used in analytics applications and AI models and so on?Do you actually need all that data to actually run your app? And could you in fact, just pick up the stuff you need for your app, move it to a different cloud provider from there and leave your analytics on the first one? Not a totally insane idea.Corey: It's not a terrible idea at all. It comes down to the idea as well of when you're trying to run a query against a bunch of that data, do you need all the data to transit or just the results of that query, as well? It's a question of, can you move the compute closer to the data as opposed to the data to where the compute lives?Tim: Well, you know and a lot of those people who have those huge data pools have it sitting on S3, and a lot of it migrated off into Glacier, so it's not as if you could get at it in milliseconds anyhow. I just ask myself, “How much data can anybody actually use in a day? In the course of satisfying some transaction requests from a customer?” And I think it's not petabyte. It just isn't.Now, there are—okay, there are exceptions. There's the intelligence community, there's the oil drilling community, there are some communities who genuinely will use insanely huge seas of data on a routine basis, but you know, I think that's kind of a corner case, so before you shake your head and say, “Ah, they'll never move because the data gravity,” you know… you need to prove that to me and I might be a little bit skeptical.Corey: And I think that is probably a very fair request. Just tell me what it is you're going to be doing here to validate the idea that is in your head because the most interesting lies I've found customers tell isn't intentionally to me or anyone else; it's to themselves. The narrative of what they think they're doing from the early days takes root, and never mind the fact that, yeah, it turns out that now that you've scaled out, maybe development isn't 80% of your cloud bill anymore. You learn things and your understanding of what you're doing has to evolve with the evolution of the applications.Tim: Yep. It's a fun time to be around. I mean, it's so great; right at the moment lock-in just isn't that big an issue. And let's be clear—I'm sure you'll agree with me on this, Corey—is if you're a startup and you're trying to grow and scale and prove you've got a viable business, and show that you have exponential growth and so on, don't think about lock-in; just don't go near it. Pick a cloud provider, pick whichever cloud provider your CTO already knows how to use, and just go all-in on them, and use all their most advanced features and be serverless if you can. It's the only sane way forward. You're short of time, you're short of money, you need growth.Corey: “Well, what if you need to move strategically in five years?” You should be so lucky. Great. Deal with it then. Or, “Well, what if we want to sell to retail as our primary market and they hate AWS?”Well, go all-in on a provider; probably not that one. Pick a different provider and go all in. I do not care which cloud any given company picks. Go with what's right for you, but then go all in because until you have a compelling reason to do otherwise, you're going to spend more time solving global problems locally.Tim: That's right. And we've never actually said this probably because it's something that both you and I know at the core of our being, but it probably needs to be said that being multi-cloud is expensive, right? Because the nouns and verbs that describe what clouds do are different in Google-land and AWS-land; they're just different. And it's hard to think about those things. And you lose the capability of using the advanced serverless stuff. There are a whole bunch of costs to being multi-cloud.Now, maybe if you're existentially afraid of lock-in, you don't care. But for I think most normal people, ugh, it's expensive.Corey: Pay now or pay later, you will pay. Wouldn't you ideally like to see that dollar go as far as possible? I'm right there with you because it's not just the actual infrastructure costs that's expensive, it costs something far more dear and expensive, and that is the cognitive expense of having to think about both of these things, not just how each cloud provider works, but how each one breaks. You've done this stuff longer than I have; I don't think that either of us trust a system that we don't understand the failure cases for and how it's going to degrade. It's, “Oh, right. You built something new and awesome. Awesome. How does it fall over? What direction is it going to hit, so what side should I not stand on?” It's based on an understanding of what you're about to blow holes in.Tim: That's right. And you know, I think particularly if you're using AWS heavily, you know that there are some things that you might as well bet your business on because, you know, if they're down, so is the rest of the world, and who cares? And, other things, eh, maybe a little chance here. So, understanding failure modes, understanding your stuff, you know, the cost of sharp edges, understanding manageability issues. It's not obvious.Corey: It's really not. Tim, I want to thank you for taking the time to go through this, frankly, excellent post with me. If people want to learn more about how you see things, and I guess how you view the world, where's the best place to find you?Tim: I'm on Twitter, just @timbray T-I-M-B-R-A-Y. And my blog is at tbray.org, and that's where that piece you were just talking about is, and that's kind of my online presence.Corey: And we will, of course, put links to it in the [show notes 00:37:42]. Thanks so much for being so generous with your time. It's always a pleasure to talk to you.Tim: Well, it's always fun to talk to somebody who has shared passions, and we clearly do.Corey: Indeed. Tim Bray principal at Textuality Services. I'm Cloud Economist Corey Quinn and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice along with an angry comment that you then need to take to all of the other podcast platforms out there purely for redundancy, so you don't get locked into one of them.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.Announcer: This has been a HumblePod production. Stay humble.

god amazon ai google starting uk pr service british canadian zoom office microsoft exodus sun software cloud cd boy ip oracle i am saas solid cto crm salesforce cgi permanent api google maps screaming aws werner linux java snowflakes iso apis stripe devops azure kafka bray amazon web services ubuntu counterculture google cloud dns s3 glacier sql dynamo git kubernetes envoy lambda microsoft azure capella xml mongo deployments gcp vms cobol multicloud ris snowmobiles andy jassy jvm aws lambda distinguished engineer kinesis istio dynamodb launchdarkly corey quinn exchange server tim oh tim yeah haproxy vpcs google app engine duckbill group eventbridge tim well forrest brazeal reserved instances tim here chief cloud economist tim right last week in aws tim people humblepod

009. Все о Serverless в AWS

AWS на русском

Play Episode Listen Later Mar 29, 2022 34:59

В этом эпизоде мы поговорили про Serverless сервисы в AWS. Мы рассказали об основных сервисах таких как AWS Lambda, API Gateway, EventBridge, SNS, SQS, StepFunctions и как они помогут вам при создании новых микросервисных, event-driven приложений. Рассмотрели типичные сценарии использования этих сервисов и рассказали о лучших практиках их использования, как построить CI/CD для ваших лямбд. Также упомянули некоторые случаи, когда Serverless будет не лучшим вариантом. Таймкоды: 00:00:24 - Рома Бойко в гостях 00:02:28 - Почему serverless 00:07:20 - 4 характеристики serverless 00:09:05 - Cписок serverless сервисов в AWS 00:12:00 - Основные use-cases 00:16:23 - Как построить CI/CD для serverless 00:22:20 - Ограничения Lambda 00:24:30 - AWS Lambda power tuning 00:28:43 - Разогрев Lambda функций Ссылки Best practices of advanced serverless developers AWS Lambda power tuning Если у вас есть вопросы, предложения темы, пишите мне в Linkedin - https://www.linkedin.com/in/vedmich/

sns ci cd serverless aws lambda api gateway sqs eventbridge

20. Luca Mezzalira (Principal Solutions Architect di Amazon Web Services)

Cloud Champions

Play Episode Listen Later Mar 29, 2022 52:19

Dal gaming di classe AAA allo streaming per le masse in DAZN, fino ad arrivare al "lato Vendor della nuvola": parliamo di cloud, solution architecture ed altro ancora con Luca Mezzalira, Principal Solutions Architect in Amazon Web Services.Libri citati nell'episodio:* "Domain Driven Design", Eric Evans* "Fundamentals of Software Architecture: An Engineering Approach", Neal Ford & Mark Richards* "Software Architecture: The Hard Parts", Neal Ford* Gregor Hophe su leanpub: https://leanpub.com/bookstore?search=Gregor%20hopher&type=allTecnologie raccomandate da Luca:* Amazon EventBridge* AWS Serverless Application Model* AWS Step FunctionsI nostri contatti:* Canale Telegram: https://t.me/CloudChampions* Facebook: https://www.facebook.com/TheCloudChamps* LinkedIn: https://www.linkedin.com/showcase/cloudchampions * Sito web: https://www.cloudchampions.tech/* Twitter: https://twitter.com/TheCloudChamps

#31 Cliché Quips and Useful Advice - nib Group's Data Mesh Journey So Far - Interview w/ Kurt Gardiner

Data Mesh Radio

Play Episode Listen Later Feb 22, 2022 66:45

Provided as a free resource by DataStax https://www.datastax.com/products/datastax-astra?utm_source=DataMeshRadio (AstraDB) In this episode, Scott interviews Kurt Gardiner, Engineering Manager of Data Engineering at Australian Insurance company nib Group. Kurt shared some insights into nib's journey so far, including the search for something like data mesh before Zhamak published, tool choices (Snowflake, dbt, Fivetran, EventBridge, Kinesis), the slow-role approach to replacing legacy implementation (the "application strangler" pattern mentioned), how they got started, and much more. Much of nib's approach is the small-scale tactical while building incrementally for the bigger strategic focus. E.g. helping teams to design their data products somewhat manually while building the reusable tooling to be far less manual going forward. Along their journey, there was some internal pushback from data consumers, especially those used to consuming from the data warehouse. To do data mesh right, Kurt and Scott both emphasized the need to set things up so they can evolve. That will frustrate or scare some people and it's important to work with them to see why that matters. There also needs to be a high tolerance for failure - you will NOT get everything right on your first go. Kurt also waxed poetic (said nice things) about event streaming patterns, especially CQRS - see link below for more info -, for a useful and scalable pattern that is good for both application development and creating a scalable and useful domain data model. But it requires a complete redesign so it is probably something to slowly introduce where it makes sense, if at all. Some pithy nuggets of wisdom from Kurt that are highly applicable to data mesh: "The single biggest problem in communication is the illusion that it has taken place" "Nobody cares what you know until they know that you care" Application Strangler pattern (recently renamed Strangler Fig Application pattern): https://martinfowler.com/bliki/StranglerFigApplication.html (https://martinfowler.com/bliki/StranglerFigApplication.html) CQRS: https://www.martinfowler.com/bliki/CQRS.html (https://www.martinfowler.com/bliki/CQRS.html) Kurt's LinkedIn: https://www.linkedin.com/in/kugardiner/ (https://www.linkedin.com/in/kugardiner/) nib Group careers page: https://nib.wd3.myworkdayjobs.com/careers (https://nib.wd3.myworkdayjobs.com/careers) Data Mesh Radio is hosted by Scott Hirleman. If you want to connect with Scott, reach out to him at community at datameshlearning.com or on LinkedIn: https://www.linkedin.com/in/scotthirleman/ (https://www.linkedin.com/in/scotthirleman/) If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/ (https://datameshlearning.com/community/) If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see https://docs.google.com/document/d/1WkXLhSH7mnbjfTChD0uuYeIF5Tj0UBLUP4Jvl20Ym10/edit?usp=sharing (here) All music used this episode created by Lesfm (intro includes slight edits by Scott Hirleman): https://pixabay.com/users/lesfm-22579021/ (https://pixabay.com/users/lesfm-22579021/) Data Mesh Radio is brought to you as a community resource by DataStax. Check out their high-scale, multi-region database offering (w/ lots of great APIs) and use code DAAP500 for a free $500 credit (apply under "add payment"): https://www.datastax.com/products/datastax-astra?utm_source=DataMeshRadio (AstraDB)

advice data snowflakes apis clich mesh gardiner engineering manager journey so far data engineering lesfm quips datastax kinesis data mesh cqrs eventbridge zhamak

24. What's SNS and how do you use it?

Play Episode Listen Later Feb 18, 2022 15:34

Luciano and Eoin deep dive into SNS discussing what it does, how it differs from EventBridge and SQS and how you can use it to send messages to customers but also for microservices communication. In this new episode dedicated to AWS events and messaging services, we learn everything there is to know about SNS including advantages, limitations and cost. This episode complements the episode about EventBridge, giving another perspective on when to use SNS and when to pick EventBridge instead. In this episode, we mentioned the following resources: - Our previous episode about EventBridge: https://www.youtube.com/watch?v=UjIE5qp-v8w - Our previous episode about all things SQS: https://www.youtube.com/watch?v=svoA-ds8-8c - Our introductory episode about what services you should use for events: https://www.youtube.com/watch?v=CG7uhkKftoY - A comparison between EventBridge and SNS by Cloudonaut: https://cloudonaut.io/eventbridge-vs-sns/ This episode is also available on YouTube: https://www.youtube.com/AWSBites You can listen to AWS Bites wherever you get your podcasts: - Apple Podcasts: https://podcasts.apple.com/us/podcast/aws-bites/id1585489017 - Spotify: https://open.spotify.com/show/3Lh7PzqBFV6yt5WsTAmO5q - Google: https://podcasts.google.com/feed/aHR0cHM6Ly9hbmNob3IuZm0vcy82YTMzMTJhMC9wb2RjYXN0L3Jzcw== - Breaker: https://www.breaker.audio/aws-bites - RSS: https://anchor.fm/s/6a3312a0/podcast/rss Do you have any AWS questions you would like us to address? Leave a comment here or connect with us on Twitter: - https://twitter.com/eoins - https://twitter.com/loige

spotify aws sns eoin sqs eventbridge

23. What's the big deal with EventBridge

Play Episode Listen Later Feb 11, 2022 33:59

Eoin and Luciano continue their series about event services. In this episode, they chat about EventBridge and explore why this AWS service has such a great potential for event-based serverless applications. This episode presents some interesting examples of when and how to use EventBridge. It also covers all the different classes of events that you can manage with EventBridge: AWS events, third-party events and custom events. We discuss limits and pricing and, finally, we show how things can go wrong and how much you can end up paying for it. We conclude the episode with some tips and resources to avoid shooting yourself in the foot and get good observability when using EventBridge.In this episode, we mentioned the following resources: - Our previous episode about all things SQS: https://www.youtube.com/watch?v=svoA-ds8-8c - Our introductory episode about what services you should use for events: https://www.youtube.com/watch?v=CG7uhkKftoY - List of AWS services that can trigger EventBridge events: https://docs.aws.amazon.com/eventbridge/latest/userguide/eb-service-event.html - An example of how to make HTTP calls directly from EventBridge (Sheen Brisals): https://medium.com/lego-engineering/amazon-eventbridge-api-destinations-demystified-part-i-23fa70d9a04d - How to test when using EventBridge (by Paul Swail): https://serverlessfirst.com/eventbridge-testing-guide/ - Eventbridge CLI tool: https://github.com/spezam/eventbridge-cli - Lumigo CLI: https://github.com/lumigo-io/lumigo-CLI#lumigo-cli-tail-eventbridge-bus - EventBridge Atlas: https://eventbridge-atlas.netlify.app/ - EventBridge Canon: https://eventbridge-canon.netlify.app/ - Accelerate Serverless Adoption with EventBridge (talk by Sheen Brisals): https://www.youtube.com/watch?v=sTZpoSGOSOI - Series of Articles by Sheen Brisals on EventBridge: https://sbrisals.medium.com/table-of-contents-set-pieces-16c1ca1ecb33 This episode is also available on YouTube: https://www.youtube.com/AWSBites You can listen to AWS Bites wherever you get your podcasts: - Apple Podcasts: https://podcasts.apple.com/us/podcast/aws-bites/id1585489017 - Google: https://podcasts.google.com/feed/aHR0cHM6Ly9hbmNob3IuZm0vcy82YTMzMTJhMC9wb2RjYXN0L3Jzcw== - Breaker: https://www.breaker.audio/aws-bites - RSS: https://anchor.fm/s/6a3312a0/podcast/rss Do you have any AWS questions you would like us to address? Leave a comment here or connect with us on Twitter: - https://twitter.com/eoins - https://twitter.com/loige

google big deals aws eoin cli sqs eventbridge

Serverless Craic Ep10 Well Architected Reliability Pillar

Play Episode Listen Later Jan 26, 2022 10:35

In this episode the team continue their conversation on the well architected framework with the Reliability Pillar. It has 4 sections: foundations, workload architecture, change management, and failure management. If you're building in a traditional way then reliability can be a huge amount of work. But we're serverless heads. AWS make it a lot easier to do some of these things. A lot of service quotas and constraints are baked into the foundations. From a change management perspective, you want to get into the continuous delivery kind of mindset, so there's a lot of monitoring if you use the modern tools. From a failure management perspective, serverless is ephemeral, so it's built for retries. You can design so that those areas are slightly easier to work with. When you look at the foundation section of the reliability pillar, you're probably looking at how to plan and not over provision or overspend and how to scale up effectively. You put a lot of time into the foundation section, where as if you are in Serverless you still need to look at foundations, but I think it's less intensive. So you're looking at things like DNS, or how you to write effectively, and things like that. You still need to worry about quotas and some of the constraints and resource constraints of your account. But you're starting from a more mature reliability standpoint, when you adopt the serverless and a serviceful approach. So from a change management point of view, adapting to change is built into serverless capabilities and how managed services operate. From a failure management point of view, a lot of that is baked in especially if you've built an event driven asynchronous workload, using SNS, SQS or Eventbridge. A lot of circuit breaker type mentality, retry and dead letter queues are coming out of the box now. Increasingly they are maturing those capabilities to make it easier for teams to have a default, resilient driver and reliable capability. With serverless architecture we tend to take a micro path: micro service or micro front end. It's opinionated so there's only so many ways to connect these things. It's aimed at speed, cost effectiveness and reliability. If you are using lambda, it's 6 AZs wide, in terms of the HA side of things. It's the same with DynamoDB or Aurora. There's lot of stuff that AWS has thought about for you in relation to the foundational side and you can benefit from that. And that influences how you actually assemble your workload in terms of the workload architecture as you've got to work within those constraints. What's interesting about the reliability pillar and well architected is that they have been there for eight years, but they've just recently added this new section, which is workload architecture. And one of the questions is how do you design interactions in a distributed system to prevent failures? So there's a specific section on how to design distributed workloads. That's more than a nod. It's proof that a lot of customers are moving towards distributed micro service and modern application stacks. There's a lot of depth in that. To do that well, you need like upfront design. You need to think about your system before you design it. You can't code you way out of it. Serverless makes some of these things easier, but the time that you don't spend on retries is put into domain design. It's an equal amount of effort and it's still challenging but you're putting precious time into system design, as opposed to tuning a non functional thing. It's still hard to build the systems, but I firmly believe you get a better system at the end of the day. Some of the questions look at: if we auto scale this serverless component, what sort of load or pressures are put on something that doesn't auto scale? It forces you to think about protection and where are the choke points? Should you be throttling your workload? Should you be setting constraints on the scalability of your serverless workload? Those type of questions get to something that we're very passionate about, which is testing for resilience and continuous resiliency, and having test days or game days that tease out where the choke points are. Where are your failure cases? Where are the downstream systems that can't respond or can't take the load as you pass to it? All of us agree that it is easier to do it with serverless and it's important that the setup is designed properly. But you can get good feedback by testing for this stuff. And you've seen the maturity of the fault injection service that has come out. It's easy to use and I'm hoping to see that evolve and mature to be much more serverless focused as well. But it's a lot easier to test for resiliency. So you're not guessing anymore. You have real automation and you've baked that into your CI CD pipeline as well. Serverless Craic from The Serverless Edge theserverlessedge.com @ServerlessEdge

aws pillar reliability dns sns ci cd serverless craic dynamodb architected sqs azs eventbridge

AWS re:Invent 2021 - All the Cloud Security Updates so far

Cloud Security News

Play Episode Listen Later Dec 2, 2021 7:16

Cloud Security News this week 2 December 2021 AWS has launched some improvements to a few of their existing services and no new Security service has been announced yet. With Google Cloud announcing their CyberSecurity Action team earlier this year, we were hoping for a similar response or better from AWS but nothing so far. Updates to AWS Shield, Amazon Cloud Guru and Amazon Inspector. For those storing CloudTrail logs or other important logs to help with incident response in S3 buckets, you can now use EventBridge to build applications that react quickly and efficiently to changes in your S3 objects. This will deliver responses to potential Events/incidents of interest in a faster, more reliable, and in a more developer-friendly way than ever. More on this here If you use AWS Control Tower and care about Data Residency, now you will be able to apply Preventive and detective controls that prevent provisioning resources in unwanted AWS Regions by restricting access to AWS APIs through service control policies (SCPs) built and managed by AWS Control Tower. This means that content cannot be created or transferred outside of your selected Regions at the infrastructure level. More on this here They have announced Amazon VPC IP Address Manager (IPAM), a new feature that provides network administrators with an automated IP management workflow.making it easier to organize, assign, monitor, and audit IP addresses in at-scale networks. More on this here new feature.” Amazon VPC Network Access Analyzer. In contrast to manual checking of network configurations, which is error-prone and hard to scale, this tool lets you analyze your AWS networks of any size and complexity. You can get started with a set of Amazon-created scopes, and then either copy & customize them, or create your own from scratch. More on this here A new Amazon S3 Object Ownership setting and the Amazon S3 console policy editor. More on the Episode Show Notes on Cloud Security Podcast Website. Podcast Twitter - Cloud Security Podcast (@CloudSecPod) Instagram - Cloud Security News If you want to watch videos of this LIVE STREAMED episode and past episodes, check out: - Cloud Security Podcast: - Cloud Security Academy:

amazon events security ip aws reinvent regions s3 preventive cloud security live streamed amazon s3 scps eventbridge aws control tower amazon inspector aws regions

AWS re:Invent 2021 - All the Cloud Security Updates so far

Cloud Security Podcast

Play Episode Listen Later Dec 2, 2021 7:16

amazon events security ip aws reinvent regions s3 preventive cloud security live streamed amazon s3 scps eventbridge aws control tower amazon inspector aws regions

AWS re:Invent Day One Special

Play Episode Listen Later Nov 30, 2021 30:16

In this special episode, Eoin and Luciano talk about their impression on the announcements from the first day of AWS re:invent 2021. AWS Lambda now supports event filtering for Amazon SQS, Amazon DynamoDB, and Amazon Kinesis as event sources: https://aws.amazon.com/about-aws/whats-new/2021/11/aws-lambda-event-filtering-amazon-sqs-dynamodb-kinesis-sources/ Amazon CodeGuru Reviewer now detects hardcoded secrets in Java and Python repositories: https://aws.amazon.com/about-aws/whats-new/2021/11/amazon-codeguru-reviewer-hardcoded-secrets-java-python/ Amazon ECR announces pull through cache repositories: https://aws.amazon.com/about-aws/whats-new/2021/11/amazon-ecr-cache-repositories/ Introducing recommenders optimized to deliver personalized experiences for Media & Entertainment and Retail with Amazon Personalize: https://aws.amazon.com/about-aws/whats-new/2021/11/recommenders-optimized-personalized-media-entertainment-retail-amazon-personalize/AWS Chatbot now supports management of AWS resources in Slack (Preview): https://aws.amazon.com/about-aws/whats-new/2021/11/aws-chatbot-management-resources-slack/ Amazon CloudWatch Evidently: https://aws.amazon.com/about-aws/whats-new/2021/11/amazon-cloudwatch-evidently-feature-experimentation-safer-launches/ AWS Migration Hub Refactor Spaces - Preview: https://aws.amazon.com/about-aws/whats-new/2021/11/aws-migration-hub-refactor-spaces/ CloudWatch Real User Monitoring: https://aws.amazon.com/about-aws/whats-new/2021/11/amazon-cloudwatch-rum-applications-client-side-performance/ CloudWatch Metrics Insights: https://aws.amazon.com/about-aws/whats-new/2021/11/amazon-cloudwatch-metrics-insights-preview/ AWS Karpenter: https://github.com/aws/karpenter S3 Event Notifications with EventBridge: https://aws.amazon.com/blogs/aws/new-use-amazon-s3-event-notifications-with-amazon-eventbridge/ S3 Event Notifications for S3 Lifecycle, S3 Intelligent-Tiering, object tags, and object access control lists: https://aws.amazon.com/about-aws/whats-new/2021/11/amazon-s3-event-notifications-s3-lifecycle-intelligent-tiering-object-tags-object-access-control-lists/ Amazon Athena ACID Transactions (Preview): https://aws.amazon.com/about-aws/whats-new/2021/11/amazon-athena-acid-apache-iceberg/ AWS Control Tower introduces Terraform account provisioning and customization: https://aws.amazon.com/about-aws/whats-new/2021/11/aws-control-tower-terraform/ Leave a comment here or connect with us on Twitter: - https://twitter.com/eoins - https://twitter.com/loige

media retail day one python aws java reinvent eoin terraform aws lambda amazon dynamodb eventbridge amazon sqs aws control tower amazon kinesis amazon ecr

05. What are our favourite AWS services and why?

Play Episode Listen Later Oct 8, 2021 9:47

In this episode Eoin and Luciano talk about their favourite AWS services and why they like them. Spoiler: we talk about EventBridge, Lambda, CloudFormation, CDK, S3, ECS/Fargate plus some honorable mentions like CloudWatch and IAM. In this episode we mentioned the following resources: SLIC Watch: https://github.com/fourTheorem/slic-watch/ This episode is also available on YouTube: https://www.youtube.com/channel/UC3WPpke5mzsbiij1zGFbeEA You can listen to AWS Bites wherever you get your podcasts: - Apple Podcasts: https://podcasts.apple.com/us/podcast/aws-bites/id1585489017 - Spotify: https://open.spotify.com/show/3Lh7PzqBFV6yt5WsTAmO5q - Google: https://podcasts.google.com/feed/aHR0cHM6Ly9hbmNob3IuZm0vcy82YTMzMTJhMC9wb2RjYXN0L3Jzcw== - Breaker: https://www.breaker.audio/aws-bites - RSS: https://anchor.fm/s/6a3312a0/podcast/rss Do you have any AWS questions you would like us to address? Leave a comment here or connect with us on Twitter: - https://twitter.com/eoins - https://twitter.com/loige

spotify spoilers services i am favourite aws s3 lambda eoin cdk cloudformation cloudwatch eventbridge

Episode #107: Serverless Infrastructure as Code with Ben Kehoe

Play Episode Listen Later Jun 28, 2021 79:04

About Ben KehoeBen Kehoe is a Cloud Robotics Research Scientist at iRobot and an AWS Serverless Hero. As a serverless practitioner, Ben focuses on enabling rapid, secure-by-design development of business value by using managed services and ephemeral compute (like FaaS). Ben also seeks to amplify voices from dev, ops, and security to help the community shape the evolution of serverless and event-driven designs.Twitter: @ben11kehoeMedium: ben11kehoeGitHub: benkehoeLinkedIn: ben11kehoeiRobot: www.irobot.comWatch this episode on YouTube: https://youtu.be/B0QChfAGvB0 This episode is sponsored by CBT Nuggets and Lumigo.TranscriptJeremy: Hi, everyone. I'm Jeremy Daly.Rebecca: And I'm Rebecca Marshburn.Jeremy: And this is Serverless Chats. And this is a momentous occasion on Serverless Chats because we are welcoming in Rebecca Marshburn as an official co-host of Serverless Chats.Rebecca: I'm pretty excited to be here. Thanks so much, Jeremy.Jeremy: So for those of you that have been listening for hopefully a long time, and we've done over 100 episodes. And I don't know, Rebecca, do I look tired? I feel tired.Rebecca: I've never seen you look tired.Jeremy: Okay. Well, I feel tired because we've done a lot of these episodes and we've published a new episode every single week for the last 107 weeks, I think at this point. And so what we're going to do is with you coming on as a new co-host, we're going to take a break over the summer. We're going to revamp. We're going to do some work. We're going to put together some great content. And then we're going to come back on, I think it's August 30th with a new episode and a whole new show. Again, it's going to be about serverless, but what we're thinking is ... And, Rebecca, I would love to hear your thoughts on this as I come at things from a very technical angle, because I'm an overly technical person, but there's so much more to serverless. There's so many other sides to it that I think that bringing in more perspectives and really being able to interview these guests and have a different perspective I think is going to be really helpful. I don't know what your thoughts are on that.Rebecca: Yeah. I love the tech side of things. I am not as deep in the technicalities of tech and I come at it I think from a way of loving the stories behind how people got there and perhaps who they worked with to get there, the ideas of collaboration and community because nothing happens in a vacuum and there's so much stuff happening and sharing knowledge and education and uplifting each other. And so I'm super excited to be here and super excited that one of the first episodes I get to work on with you is with Ben Kehoe because he's all about both the technicalities of tech, and also it's actually on his Twitter, a new compassionate tech values around humility, and inclusion, and cooperation, and learning, and being a mentor. So couldn't have a better guest to join you in the Serverless Chats community and being here for this.Jeremy: I totally agree. And I am looking forward to this. I'm excited. I do want the listeners to know we are testing in production, right? So we haven't run any unit tests, no integration tests. I mean, this is straight test in production.Rebecca: That's the best practice, right? Total best practice to test in production.Jeremy: Best practice. Right. Exactly.Rebecca: Straight to production, always test in production.Jeremy: Push code to the cloud. Here we go.Rebecca: Right away.Jeremy: Right. So if it's a little bit choppy, we'd love your feedback though. The listeners can be our observability tool and give us some feedback and we can ... And hopefully continue to make the show better. So speaking of Ben Kehoe, for those of you who don't know Ben Kehoe, I'm going to let him introduce himself, but I have always been a big fan of his. He was very, very early in the serverless space. I read all his blogs very early on. He was an early AWS Serverless Hero. So joining us today is Ben Kehoe. He is a cloud robotics research scientist at iRobot, as I said, an AWS Serverless Hero. Ben, welcome to the show.Ben: Thanks for having me. And I'm excited to be a guinea pig for this new exciting format.Rebecca: So many observability tools watching you be a guinea pig too. There's lots of layers to this.Jeremy: Amazing. All right. So Ben, why don't you tell the listeners for those that don't know you a little bit about yourself and what you do with serverless?Ben: Yeah. So I mean, as with all software, software is people, right? It's like Soylent Green. And so I'm really excited for this format being about the greater things that technology really involves in how we create it and set it up. And serverless is about removing the things that don't matter so that you can focus on the things that do matter.Jeremy: Right.Ben: So I've been interested in that since I learned about it. And at the time saw that I could build things without running servers, without needing to deal with the scaling of stuff. I've been working on that at iRobot for over five years now. As you said early on in serverless at the first serverless con organized by A Cloud Guru, now plural sites.Jeremy: Right.Ben: And yeah. And it's been really exciting to see it grow into the large-scale community that it is today and all of the ways in which community are built like this podcast.Jeremy: Right. Yeah. I love everything that you've done. I love the analogies you've used. I mean, you've always gone down this road of how do you explain serverless in a way to show really the adoption of it and how people can take that on. Serverless is a ladder. Some of these other things that you would ... I guess the analogies you use were always great and always helped me. And of course, I don't think we've ever really come to a good definition of serverless, but we're not talking about that today. But ...Ben: There isn't one.Jeremy: There isn't one, which is also a really good point. So yeah. So welcome to the show. And again, like I said, testing in production here. So, Rebecca, jump in when you have questions and we'll beat up Ben from both sides on this, but, really ...Rebecca: We're going to have Ben from both sides.Jeremy: There you go. We'll embrace him from both sides. There you go.Rebecca: Yeah. Yeah.Jeremy: So one of the things though that, Ben, you have also been very outspoken on which I absolutely love, because I'm in very much closely aligned on this topic here. But is about infrastructure as code. And so let's start just quickly. I mean, I think a lot of people know or I think people working in the cloud know what infrastructure as code is, but I also think there's a lot of people who don't. So let's just take a quick second, explain what infrastructure as code is and what we mean by that.Ben: Sure. To my mind, infrastructure as code is about having a definition of the state of your infrastructure that you want to see in the cloud. So rather than using operations directly to modify that state, you have a unified definition of some kind. I actually think infrastructure is now the wrong word with serverless. It used to be with servers, you could manage your fleet of servers separate from the software that you were deploying onto the servers. And so infrastructure being the structure below made sense. But now as your code is intimately entwined in the rest of your resources, I tend to think of resource graph definitions rather than infrastructure as code. It's a less convenient term, but I think it's worth understanding the distinction or the difference in perspective.Jeremy: Yeah. No, and I totally get that. I mean, I remember even early days of cloud when we were using the Chefs and the Puppets and things like that, that we were just deploying the actual infrastructure itself. And sometimes you deploy software as part of that, but it was supporting software. It was the stuff that ran in the runtime and some of those and some configurations, but yeah, but the application code that was a whole separate process, and now with serverless, it seems like you're deploying all those things at the same time.Ben: Yeah. There's no way to pick it apart.Jeremy: Right. Right.Rebecca: Ben, there's something that I've always really admired about you and that is how strongly you hold your opinions. You're fervent about them, but it's also because they're based on this thorough nature of investigation and debate and challenging different people and yourself to think about things in different ways. And I know that the rest of this episode is going to be full with a lot of opinions. And so before we even get there, I'm curious if you can share a little bit about how you end up arriving at these, right? And holding them so steady.Ben: It's a good question. Well, I hope that I'm not inflexible in these strong opinions that I hold. I mean, it's one of those strong opinions loosely held kind of things that new information can change how you think about things. But I do try and do as much thinking as possible so that there's less new information that I have to encounter to change an opinion.Rebecca: Yeah. Yeah.Ben: Yeah. I think I tend to try and think about how people ... But again, because it's always people. How people interact with the technology, how people behave, how organizations behave, and then how technology fits into that. Because sometimes we talk about technology in a vacuum and it's really not. Technology that works for one context doesn't work for another. I mean, a lot of my strong opinions are that there is no one right answer kind of a thing, or here's a framework for understanding how to think about this stuff. And then how that fits into a given person is just finding where they are in that more general space. Does that make sense? So it's less about finding out here's the one way to do things and more about finding what are the different options, how do you think about the different options that are out there.Rebecca: Yeah, totally makes sense. And I do want to compliment you. I do feel like you are very good at inviting new information in if people have it and then you're like, "Aha, I've already thought of that."Ben: I hope so. Yeah. I was going to say, there's always a balance between trying to think ahead so that when you discover something you're like, "Oh, that fits into what I thought." And the danger of that being that you're twisting the information to fit into your preexisting structures. I hope that I find a good balance there, but I don't have a principle way of determining that balance or knowing where you are in that it's good versus it's dangerous kind of spectrum.Jeremy: Right. So one of the opinions that you hold that I tend to agree with, I have some thoughts about some of the benefits, but I also really agree with the other piece of it. And this really has to do with the CDK and this idea of using CloudFormation or any sort of DSL, maybe Terraform, things like that, something that is more domain-specific, right? Or I guess declarative, right? As opposed to something that is imperative like the CDK. So just to get everybody on the same page here, what is the top reasons why you believe, or you think that DSL approach is better than that iterative approach or interpretive approach, I guess?Ben: Yeah. So I think we get caught up in the imperative versus declarative part of it. I do think that declarative has benefits that can be there, but the way that I think about it is with the CDK and infrastructure as code in general, I'm like mildly against imperative definitions of resources. And we can get into that part, but that's not my smallest objection to the CDK. I'm moderately against not being able to enforce deterministic builds. And the CDK program can do anything. Can use a random number generator and go out to the internet to go ask a question, right? It can do anything in that program and that means that you have no guarantees that what's coming out of it you're going to be able to repeat.So even if you check the source code in, you may not be able to go back to the same infrastructure that you had before. And you can if you're disciplined about it, but I like tools that help give you guardrails so that you don't have to be as disciplined. So that's my moderately against. My strongly against piece is I'm strongly against developer intent remaining client side. And this is not an inherent flaw in the CDK, is a choice that the CDK team has made to turn organizational dysfunction in AWS into ownership for their customers. And I don't think that's a good approach to take, but that's also fixable.So I think if we want to start with the imperative versus declarative thing, right? When I think about the developers expressing an intent, and I want that intent to flow entirely into the cloud so that developers can understand what's deployed in the cloud in terms of the things that they've written. The CDK takes this approach of flattening it down, flattening the richness of the program the developer has written into ... They think of it as assembly language. I think that is a misinterpretation of what's happening. The assembly language in the process is the imperative plan generated inside the CloudFormation engine that says, "Here's how I'm going to take this definition and turn it into an actual change in the cloud.Jeremy: Right.Ben: They're just translating between two definition formats in CDK scene. But it's a flattening process, it's a lossy process. So then when the developer goes to the Console or the API has to go say, "What's deployed here? What's going wrong? What do I need to fix?" None of it is framed in terms of the things that they wrote in their original language.Jeremy: Right.Ben: And I think that's the biggest problem, right? So drift detection is an important thing, right? What happened when someone went in through the Console? Went and tweaked some stuff to fix something, and now it's different from the definition that's in your source repository. And in CloudFormation, it can tell you that. But what I would want if I was running CDK is that it should produce another CDK program that represents the current state of the cloud with a meaningful file-level diff with my original program.Jeremy: Right. I'm just thinking this through, if I deploy something to CDK and I've got all these loops and they're generating functions and they're using some naming and all this kind of stuff, whatever, now it produces this output. And again, my naming of my functions might be some function that gets called to generate the names of the function. And so now I've got all of these functions named and I have to go in. There's no one-to-one map like you said, and I can imagine somebody who's not familiar with CloudFormation which is ultimately what CDK synthesizes and produces, if you're not familiar with what that output is and how that maps back to the constructs that you created, I can see that as being really difficult, especially for younger developers or developers who are just getting started in that.Ben: And the CDK really takes the attitude that it's going to hide those things from those developers rather than help them learn it. And so when they do have to dive into that, the CDK refers to it as an escape hatch.Jeremy: Yeah.Ben: And I think of escape hatches on submarines, where you go from being warm and dry and having air to breathe to being hundreds of feet below the sea, right? It's not the sort of thing you want to go through. Whereas some tools like Amplify talk about graduation. In Amplify they aim to help you understand the things that Amplify is doing for you, such that when you grow beyond what Amplify can provide you, you have the tools to do that, to take the thing that you built and then say, "Okay, I know enough now that I understand this and can add onto it in ways that Amplify can't help with."Jeremy: Right.Ben: Now, how successful they are in doing that is a separate question I think, but the attitude is there to say, "We're looking to help developers understand these things." Now the CDK could also if the CDK was a managed service, right? Would not need developers to understand those things. If you could take your program directly to the cloud and say, "Here's my program, go make this real." And when it made it real, you could interact with the cloud in an understanding where you could list your deployed constructs, right? That you can understand the program that you wrote when you're looking at the resources that are deployed all together in the cloud everywhere. That would be a thing where you don't need to learn CloudFormation.Jeremy: Right.Ben: Right? That's where you then end up in the imperative versus declarative part where, okay, there's some reasons that I think declarative is better. But the major thing is that disconnect that's currently built into the way that CDK works. And the reason that they're doing that is because CloudFormation is not moving fast enough, which is not always on the CloudFormation team. It's often on the service teams that aren't building the resources fast enough. And that's AWS's problem, AWS as an entire company, as an organization. And this one team is saying, "Well, we can fix that by doing all this client side."What that means is that the customers are then responsible for all the things that are happening on the client side. The reason that they can go fast is because the CDK team doesn't have ownership of it, which just means the ownership is being pushed on customers, right? The CDK deploys Lambda functions into your account that they don't tell you about that you're now responsible for. Right? Both the security and operations of. If there are security updates that the CDK team has to push out, you have to take action to update those things, right? That's ownership that's being pushed onto the customer to fix a lack of ACM certificate management, right?Jeremy: Right. Right.Ben: That is ACM not building the thing that's needed. And so AWS says, "Okay, great. We'll just make that the customer's problem."Jeremy: Right.Ben: And I don't agree with that approach.Rebecca: So I'm sure as an AWS Hero you certainly have pretty good, strong, open communication channels with a lot of different team members across teams. And I certainly know that they're listening to you and are at least hearing you, I should say, and watching you and they know how you feel about this. And so I'm curious how some of those conversations have gone. And some teams as compared to others at AWS are really, really good about opening their roadmap or at least saying, "Hey, we hear this, and here's our path to a solution or a success." And I'm curious if there's any light you can shed on whether or not those conversations have been fruitful in terms of actually being able to get somewhere in terms of customer and AWS terms, right? Customer obsession first.Ben: Yeah. Well, customer obsession can mean two things, right? Customer obsession can mean giving the customer what they want or it can mean giving the customer what they need and different AWS teams' approach fall differently on that scale. The reason that many of those things are not available in CloudFormation is that those teams are ... It could be under-resourced. They could have a larger majority of customer that want new features rather than infrastructure as code support. Because as much as we all like infrastructure as code, there are many, many organizations out there that are not there yet. And with the CDK in particular, I'm a relatively lone voice out there saying, "I don't think this ownership that's being pushed onto the customer is a good thing." And there are lots of developers who are eating up CDK saying, "I don't care."That's not something that's in their worry. And because the CDK has been enormously successful, right? It's fixing these problems that exists. And I don't begrudge them trying to fix those problems. I think it's a question of do those developers who are grabbing onto those things and taking them understand the full total cost of ownership that the CDK is bringing with it. And if they don't understand it, I think AWS has a responsibility to understand it and work with it to help those customers either understand it and deal with it, right? Which is where the CDK takes this approach, "Well, if you do get Ops, it's all fine." And that's somewhat true, but also many developers who can use the CDK do not control their CI/CD process. So there's all sorts of ways in which ... Yeah, so I think every team is trying to do the best that they can, right?They're all working hard and they all have ... Are pulled in many different directions by customers. And most of them are making, I think, the right choices given their incentives, right? Given what their customers are asking for. I think not all of them balance where customers ... meeting customers where they are versus leading them where they should, like where they need to go as well as I would like. But I think ... I had a conclusion to that. Oh, but I think that's always a debate as to where that balance is. And then the other thing when I talk about the CDK, that my ideal audience there is less AWS itself and more AWS customers ...Rebecca: Sure.Ben: ... to understand what they're getting into and therefore to demand better of AWS. Which is in general, I think, the approach that I take with AWS, is complaining about AWS in public, because I do have the ability to go to teams and say, "Hey, I want this thing," right? There are plenty of teams where I could just email them and say, "Hey, this feature could be nice", but I put it on Twitter because other people can see that and say, "Oh, that's something that I want or I don't think that's helpful," right? "I don't care about that," or, "I think it's the wrong thing to ask for," right? All of those things are better when it's not just me saying I think this is a good thing for AWS, but it being a conversation among the community differently.Rebecca: Yeah. I think in the spirit too of trying to publicize types of what might be best next for customers, you said total cost of ownership. Even though it might seem silly to ask this, I think oftentimes we say the words total cost of ownership, but there's actually many dimensions to total cost of ownership or TCO, right? And so I think it would be great if you could enumerate what you think of as total cost of ownership, because there might be dimensions along that matrices, matrix, that people haven't considered when they're actually thinking about total cost of ownership. They're like, "Yeah, yeah, I got it. Some Ops and some security stuff I have to do and some patches," but they might only be thinking of five dimensions when you're like, "Actually the framework is probably 10 to 12 to 14." And so if you could outline that a bit, what you mean when you think of a holistic total cost of ownership, I think that could be super helpful.Ben: I'm bad at enumeration. So I would miss out on dimensions that are obvious if I was attempting to do that. But I think a way that I can, I think effectively answer that question is to talk about some of the ways in which we misunderstand TCO. So I think it's important when working in an organization to think about the organization as a whole, not just your perspective and that your team's perspective in it. And so when you're working for the lowest TCO it's not what's the lowest cost of ownership for my team if that's pushing a larger burden onto another team. Now if it's reducing the burden on your team and only increasing the burden on another team a little bit, that can be a lower total cost of ownership overall. But it's also something that then feeds into things like political capital, right?Is that increased ownership that you're handing to that team something that they're going to be happy with, something that's not going to cause other problems down the line, right? Those are the sorts of things that fit into that calculus because it's not just about what ... Moving away from that topic for a second. I think about when we talk about how does this increase our velocity, right? There's the piece of, "Okay, well, if I can deploy to production faster, right? My feedback loop is faster and I can move faster." Right? But the other part of that equation is how many different threads can you be operating on and how long are those threads in time? So when you're trying to ship a feature, if you can ship it and then never look at it again, that means you have increased bandwidth in the future to take on other features to develop other new features.And so even if you think about, "It's going to take me longer to finish this particular feature," but then there's no maintenance for that feature, that can be a lower cost of ownership in time than, "I can ship it 50% faster, but then I'm going to periodically have to revisit it and that's going to disrupt my ability to ship other things," right? So this is where I had conversations recently about increasing use of Step Functions, right? And being able to replace Lambda functions with Step Functions express workflows because you never have to go back to those Lambdas and update dependencies in them because dependent bot has told you that you need to or a version of Python is getting deprecated, right? All of those things, just if you have your Amazon States Language however it's been defined, right?Once it's in there, you never have to touch it again if nothing else changes and that means, okay, great, that piece is now out of your work stream forever unless it needs to change. And that means that you have more bandwidth for future things, which serverless is about in general, right? Of say, "Okay, I don't have to deal with this scaling problems here. So those scaling things. Once I have an auto-scaling group, I don't have to go back and tweak it later." And so the same thing happens at the feature level if you build it in ways that allow you to do that. And so I think that's one of the places where when we focus on, okay, how fast is this getting me into production, it's okay, but how often do you have to revisit it ...Jeremy: Right. And so ... So you mentioned a couple of things in there, and not only in that question, but in the previous questions as you were talking about the CDK in general, and I am 100% behind you on this idea of deterministic builds because I want to know exactly what's being deployed. I want to be able to audit that and map that back. And you can audit, I mean, you could run CDK synth and then audit the CloudFormation and test against certain things. But if you are changing stuff, right? Then you have to understand not only the CDK but also the CloudFormation that it actually generates. But in terms of solving problems, some of the things that the CDK does really, really well, and this is something where I've always had this issue with just trying to use raw CloudFormation or Serverless Framework or SAM or any of these things is the fact that there's a lot of boilerplate that you often have to do.There's ways that companies want to do something specifically. I basically probably always need 1,400 lines of CloudFormation. And for every project I do, it's probably close to the same, and then add a little bit more to actually make it adaptive for my product. And so one thing that I love about the CDK is constructs. And I love this idea of being able to package these best practices for your company or these compliance requirements, excuse me, compliance requirements for your company, whatever it is, be able to package these and just hand them to developers. And so I'm just curious on your thoughts on that because that seems like a really good move in the right direction, but without the deterministic builds, without some of these other problems that you talked about, is there another solution to that that would be more declarative?Ben: Yeah. In theory, if the CDK was able to produce an artifact that represented all of the non-deterministic dependencies that it had, right? That allowed you to then store that artifacts as you'd come back and put that into the program and say, "I'm going to get out the same thing," but because the CDK doesn't control upstream of it, the code that the developers are writing, there isn't a way to do that. Right? So on the abstraction front, the constructs are super useful, right? CloudFormation now has modules which allow you to say, "Here's a template and I'm going to represent this as a CloudFormation type itself," right? So instead of saying that I need X different things, I'm going to say, "I packaged that all up here. It is as a type."Now, currently, modules can only be playing CloudFormation templates and there's a lot of constraints in what you can express inside a CloudFormation template. And I think the answer for me is ... What I want to see is more richness in the CloudFormation language, right? One of the things that people do in the CDK that's really helpful is say, "I need a copy of this in every AZ."Jeremy: Right.Ben: Right? There's so much boilerplate in server-based things. And CloudFormation can't do that, right? But if you imagine that it had a map function that allowed you to say, "For every AZ, stamp me out a copy of this little bit." And then that the CDK constructs allowed to translate. Instead of it doing all this generation only down to the L one piece, instead being able to say, "I'm going to translate this into more rich CloudFormation templates so that the CloudFormation template was as advanced as possible."Right? Then it could do things like say, "Oh, I know we need to do this in every AZ, I'm going to use this map function in the CloudFormation template rather than just stamping it out." Right? And so I think that's possible. Now, modules should also be able to be defined as CDK programs. Right? You should be able to register a construct as a CloudFormation tag.Jeremy: It would be pretty cool.Ben: There's no reason you shouldn't be able to. Yeah. Because I think the declarative versus imperative thing is, again, not the most important piece, it's how do we move ... It's shifting right in this case, right? That how do you shift what's happening with the developer further into the process of deployment so that more of their context is present? And so one of the things that the CDK does that's hard to replicate is have non-local effects. And this is both convenient and I think of code smell often.So you can pass a bucket resource from another stack into a piece of code in your CDK program that's creating a different stack and you say, "Oh great, I've got this Lambda function, it needs permissions to that bucket. So add permissions." And it's possible for the CDK programs to either be adding the permissions onto the IAM role of that function, or non-locally adding to that bucket's resource policy, which is weird, right? That you can be creating a stack and the thing that you do to that stack or resource or whatever is not happening there, it's happening elsewhere. I don't think that's a great approach, but it's certainly convenient to be able to do it in a lot of situations.Now, that's not representable within a module. A module is a contained piece of functionality that can't touch anything else. So things like SAM where you can add events onto a function that can go and create ... You create the API events on different functions and then SAM aggregates them and creates an API gateway for you. Right? If AWS serverless function was a module, it couldn't do that because you'd have these in different places and you couldn't aggregate something between all of them and put them in the top-level thing, right?This is what CloudFormation macros enable, but they don't have a... There's no proper interface to them, right? They don't define, "This is what I'm doing. This is the kind of resources I can create." There's none of that that would help you understand them. So they're infinitely flexible, but then also maybe less principled for that reason. So I think there are ways to evolve, but it's investment in the CloudFormation language that allows us to shift that burden from being a flattening inside client-side code from the developer and shifting it to be able to be represented in the cloud.Jeremy: Right. Yeah. And I think from that standpoint too if we go back to the solving people's problems standpoint, that everything you explained there, they're loaded with nuances, it's loaded with gotchas, right? Like, "Oh, you can't do this, you can't do that." So that's just why I think the CDK is so popular because it's like you can do so much with it so quickly and it's very, very fast. And I think that trade-off, people are just willing to make it.Ben: Yes. And that's where they're willing to make it, do they fully understand the consequences of it? Then does AWS communicate those consequences well? Before I get into that question of, okay, you're a developer that's brand new to AWS and you've been tasked with standing up some Kubernetes cluster and you're like, "Great. I can use a CDK to do this." Something is malfunctioning. You're also tasked with the operations and something is malfunctioning. You go in through the Console and maybe figure out all the things that are out there are new to you because they're hidden inside L3 constructs, right?You're two levels down from where you were defining what you want, and then you find out what's wrong and you have no idea how to turn that into a change in your CDK program. So instead of going back and doing the thing that infrastructure as code is for, which is tweaking your program to go fix the problem, you go and you tweak it in the Console ...Jeremy: Right. Which you should never do.Ben: ... and you fix it that way. Right. Well, and that's the thing that I struggle with, with the CDK is how does the CDK help the developer who's in that situation? And I don't think they have a good story around that. Now, I don't know. I haven't talked with enough junior developers who are using the CDK about how often they get into that situation. Right? But I always say client-side code is not a replacement for a managed service because when it's client-side code, you still own the result.Jeremy: Right.Ben: If a particular CDK construct was a managed service in AWS, then all of the resources that would be created underneath AWS's problem to make work. And the interface that the developer has is the only level of ownership that they have. Fargate is this. Because you could do all the things that Fargate does with a CDK construct, right? Set up EC2, do all the things, and represent it as something that looks like Fargate in your CDK program. But every time your EC2 fleet is unhealthy that's your problem. With Fargate, that's AWS's problem. If we didn't have Fargate, that's essentially what CDK would be trying to do for ECS.And I think we all recognize that Fargate is very necessary and helpful in that case, right? And I just want that for all the things, right? Whenever I have an abstraction, if it's an abstraction that I understand, then I should have a way of zooming into it while not having to switch languages, right? So that's where you shouldn't dump me out the CloudFormation to understand what you're doing. You should help me understand the low-level things in the same language. And if it's not something that I need to understand, it should be a managed service. It shouldn't be a bunch of stuff that I still own that I haven't looked at.Jeremy: Makes sense. Got a question, Rebecca? Because I was waiting for you to jump in.Rebecca: No, but I was going to make a joke, but then the joke passed, and then I was like, "But should I still make it?" I was going to be like, "Yeah, but does the CDK let you test in production?" But that was a 32nd ago joke and then I was really wrestling with whether or not I should tell it, but I told it anyway, hopefully, someone gets a laugh.Ben: Yeah. I mean, there's the thing that Charity Majors says, right? Which is that everybody tests in production. Some people are lucky enough to have a development environment in production. No, sorry. I said that the wrong way. It's everybody has a test environment. Some people are lucky enough that it's not in production.Rebecca: Yeah. Swap that. Reverse it. Yeah.Ben: Yeah.Jeremy: All right. So speaking of talking to developers and getting feedback from them, so I actually put a question out on Twitter a couple of weeks ago and got a lot of really interesting reactions. And essentially I asked, "What do you love or hate about infrastructure as code?" And there were a lot of really interesting things here. I don't know, maybe it might be fun to go through a couple of these and get your thoughts on them. So this is probably not a great one to start with, but I thought it was interesting because this I think represents the frustration that a lot of us feel. And it was basically that they love that automation minimizes future work, right? But they hate that it makes life harder over time. And that pretty much every approach to infrastructure in, sorry, yeah, infrastructure in code at the present is flawed, right? So really there are no good solutions right now.Ben: Yeah. CloudFormation is still a pain to learn and deal with. If you're operating in certain IDEs, you can get tab completion.Jeremy: Right.Ben: If you go to CDK you get tab completion, which is, I think probably most of the value that developers want out of it and then the abstraction, and then all the other fancy things it does like pipelines, which again, should be a managed service. I do think that person is absolutely right to complain about how difficult it is. That there are many ways that it could be better. One of the things that I think about when I'm using tools is it's not inherently bad for a tool to have some friction to use it.Jeremy: Right.Ben: And this goes to another infrastructure as code tool that goes even further than the CDK and says, "You can define your Lambda code in line with your infrastructure definition." So this is fine with me. And there's some other ... I think Punchcard also lets you do some of this. Basically extracts out the bits of your code that you say, "This is a custom thing that glues together two things I'm defining in here and I'll make that a Lambda function for you." And for me, that is too little friction to defining a Lambda function.Because when I define a Lambda function, just going back to that bringing in ownership, every time I add a Lambda function, that's something that I own, that's something that I have to maintain, that I'm responsible for, that can go wrong. So if I'm thinking about, "Well, I could have API Gateway direct into DynamoDB, but it'd be nice if I could change some of these fields. And so I'm just going to drop in a little sprinkle of code, three lines of code in between here to do some transformation that I want." That is all of sudden an entire Lambda function you've brought into your infrastructure.Jeremy: Right. That's a good point.Ben: And so I want a little bit of friction to do that, to make me think about it, to make me say, "Oh, yeah, downstream of this decision that I am making, there are consequences that I would not otherwise think about if I'm just trying to accomplish the problem," right? Because I think developers, humans, in general, tend to be a bit shortsighted when you have a goal especially, and you're being pressured to complete that goal and you're like, "Okay, well I can complete it." The consequences for later are always a secondary concern.And so you can change your incentives in that moment to say, "Okay, well, this is going to guide me to say, "Ah, I don't really need this Lambda function in here. Then I'm better off in the long term while accomplishing that goal in the short term." So I do think that there is a place for tools making things difficult. That's not to say that the amount of difficult that infrastructure as code is today is at all reasonable, but I do think it's worth thinking about, right?I'd rather take on the pain of creating an ASL definition by hand for express workflow than the easier thing of writing Lambda code. Because I know the long-term consequences of that. Now, if that could be flipped where it was harder to write something that took more ownership, it'd be just easy to do, right? You'd always do the right thing. But I think it's always worth saying, "Can I do the harder thing now to pay off to pay off later?"Jeremy: And I always call those shortcuts "tomorrow-Jeremy's" problem. That's how I like to look at those.Ben: Yeah. Yes.Jeremy: And the funny thing about that too is I remember right when EventBridge came out and there was no CloudFormation support for a long time, which was super frustrating. But Serverless Framework, for example, implemented a custom resource in order to do that. And I remember looking at a clean stack and being like, "Why are there two Lambda functions there that I have no idea?" I'm like, "I didn't publish ..." I honestly thought my account was compromised that somebody had published a Lambda function in there because I'm like, "I didn't do that." And then it took me a while to realize, I'm like, "Oh, this is what this is." But if it is that easy to just create little transform functions here and there, I can imagine there being thousands of those in your account without anybody knowing that they even exist.Ben: Now, don't get me wrong. I would love to have the ability to drop in little transforms that did not involve Lambda functions. So in other words, I mean, the thing that VTL does for API Gateway, REST APIs but without it being VTL and being ... Because that's hard and then also restricted in what you can do, right? It's not, "Oh, I can drop in arbitrary code in here." But enough to say, "Oh, I want to flip ... These fields should go from a key-value mapping to a list of key-value, right? In the way that it addresses inconsistent with how tags are defined across services, those kinds of things. Right? And you could drop that in any service, but once you've defined it, there's no maintenance for you, right?You're writing JavaScript. It's not actually a JavaScript engine underneath or something. It's just getting translated into some big multi-tenant fancy thing. And I have a hypothesis that that should be possible. You should be able to do it where you could even do it in the parsing of JSON, being able to do transforms without ever having to have the whole object in memory. And if we could get that then, "Oh, sure. Now I have sprinkled all over the place all of these little transforms." Now there's a little bit of overhead if the transform is defined correctly or not, right? But once it is, then it just works. And having all those little transforms everywhere is then fine, right? And that incentive to make it harder it doesn't need to be there because it's not bringing ownership with it.Rebecca: Yeah. It's almost like taking the idea of tomorrow-Jeremy's problem and actually switching it to say tomorrow-Jeremy's celebration where tomorrow-Jeremy gets to look back at past-Jeremy and be like, "Nice. Thank you for making that decision past-Jeremy." Because I think we often do look at it in terms of tomorrow-Jeremy will think of this, we'll solve this problem rather than how do we approach it by saying, how do I make tomorrow-Jeremy thankful for it today-Jeremy? And that's a simple language, linguistic switch, but a hard switch to actually make decisions based on.Ben: Yeah. I don't think tomorrow-Ben is ever thankful for today-Ben. I think it's tomorrow-Ben is thankful for yesterday-Ben setting up the incentives correctly so that today-Ben will do the right thing for tomorrow-Ben. Right? When I think about people, I think it's easier to convince people to accept a change in their incentives than to convince them to fight against their incentives sustainably.Jeremy: Right. And I think developers and I'm guilty of this too, I mean, we make decisions based off of expediency. We want to get things done fast. And when you get stuck on that problem you're like, "You know what? I'm not going to figure it out. I'm just going to write a loop or I'm going to do whatever I can do just to make it work." Another if statement here, "Isn't going to hurt anybody." All right. So let's move to ... Sorry, go ahead.Ben: We shouldn't feel bad about that.Jeremy: You're right.Ben: I was going to say, we shouldn't feel bad about that. That's where I don't want tomorrow-Ben to have to be thankful for today-Ben, because that's the implication there is that today-Ben is fighting against his incentives to do good things for tomorrow-Ben. And if I don't need to have to get to that point where just the right path is the easiest path, right? Which means putting friction in the right places than today-Ben ... It's never a question of whether today-Ben is doing something that's worth being thankful for. It's just doing the job, right?Jeremy: Right. No, that makes sense. All right. I got another question here, I think falls under the category of service discovery, which I know is another topic that you love. So this person said, "I love IaC, but hate the fuzzy boundaries where certain software awkwardly fall. So like Istio and Prometheus and cert-manager. That they can be considered part of the infrastructure, but then it's awkward to deploy them when something like Terraform due to circular dependencies relating to K8s and things like that."So, I mean, I know that we don't have to get into the actual details of that, but I think that is an important aspect of infrastructure as code where best practices sometimes are deploy a stack that has your permanent resources and then deploy a stack that maybe has your more femoral or the ones that are going to be changing, the more mutable ones, maybe your Lambda functions and some of those sort of things. If you're using Terraform or you're using some of these other services as well, you do have that really awkward mix where you're trying to use outputs from one stack into another stack and trying to do all that. And really, I mean, there are some good tools that help with it, but I mean just overall thoughts on that.Ben: Well, we certainly need to demand better of AWS services when they design new things that they need to be designed so that infrastructure as code will work. So this is the S3 bucket notification problem. A very long time ago, S3 decided that they were going to put bucket notifications as part of the S3 bucket. Well, CloudFormation at that point decided that they were going to put bucket notifications as part of the bucket resource. And S3 decided that they were going to check permissions when the notification configuration is defined so that you have to have the permissions before you create the configuration.This creates a circular dependency when you're hooking it up to anything in CloudFormation because the dependency depends on the resource policy on an SNS topic, and SQS queue or a Lambda function depends on the bucket name if you're letting CloudFormation name the bucket, which is the best practice. Then bucket name has to exist, which means the resource has to have been created. But the notification depends on the thing that's notifying, which doesn't have the names and the resource policy doesn't exist so it all fails. And this is solved in a couple of different ways. One of which is name your bucket explicitly, again, not a good practice. Another is what SAM does, which says, "The Lambda function will say I will allow all S3 buckets to invoke me."So it has a star permission in it's resource policy. So then the notification will work. None of which is good or there's custom resources that get created, right? Now, if those resources have been designed with infrastructure as code as part of the process, then it would have been obvious, "Oh, you end up with a circular pendency. We need to split out bucket notifications as a separate resource." And not enough teams are doing this. Often they're constrained by the API that they develop first ...Jeremy: That's a good point.Ben: ... they come up with the API, which often makes sense for a Console experience that they desire. So this is where API Gateway has this whole thing where you create all the routes and the resources and the methods and everything, right? And then you say, "Great, deploy." And in the Console you only need one mutable working copy of that at a time, but it means that you can't create two deployments or update two stages in parallel through infrastructure as code and API Gateway because they both talk to this mutable working copy state and would overwrite each other.And if infrastructure as code had been on their list would have been, "Oh, if you have a definition of your API, you should be able to go straight to the deployment," right? And so trying to push that upstream, which to me is more important than infrastructure as code support at launch, but people are often like, "Oh, I want CloudFormation support at launch." But that often means that they get no feedback from customers on the design and therefore make it bad. KMS asymmetric keys should have been a different resource type so that you can easily tell which key types are in your template.Jeremy: Good point. Yeah.Ben: Right? So that you can use things like CloudFormation Guard more easily on those. Sure, you can control the properties or whatever, but you should be able to think in terms of, "I have a symmetric key or an asymmetric key in here." And they're treated completely separately because you use them completely differently, right? They don't get used to the same place.Jeremy: Yeah. And it's funny that you mentioned the lacking support at launch because that was another complaint. That was quite prevalent in this thread here, was people complaining that they don't get that CloudFormation support right away. But I think you made a very good point where they do build the APIs first. And that's another thing. I don't know which question asked me or which one of these mentioned it, but there was a lot of anger over the fact that you go to the API docs or you go to the docs for AWS and it focuses on the Console and it focuses on the CLI and then it gives you the API stuff and very little mention of CloudFormation at all. And usually, you have to go to a whole separate set of docs to find the CloudFormation. And it really doesn't tie all the concepts together, right? So you get just a block of JSON or of YAML and you're like, "Am I supposed to know what everything does here?"Ben: Yeah. I assume that's data-driven. Right? And we exist in this bubble where everybody loves infrastructure as code.Jeremy: True.Ben: And that AWS has many more customers who set things up using Console, people who learn by doing it first through the Console. I assume that's true, if it's not, then the AWS has somehow gotten on the extremely wrong track. But I imagine that's how they find that they get the right engagement. Now maybe the CDK will change some of this, right? Maybe the amount of interest that is generating, we'll get it to the point where blogs get written with CDK programs being written there. I think that presents different problems about what that CDK program might hide from when you're learning about a service. But yeah, it's definitely not ... I wrote a blog for AWS and my first draft had it as CloudFormation and then we changed it to the Console. Right? And ...Jeremy: That must have hurt. Did you die a little inside when that happened?Ben: I mean, no, because they're definitely our users, right? That's the way in which they interact with data, with us and they should be able to learn from that, their company, right? Because again, developers are often not fully in control of this process.Jeremy: Right. That's a good point.Ben: And so they may not be able to say, "I want to update this through CloudFormation," right? Either because their organization says it or just because their team doesn't work that way. And I think AWS gets requests to prevent people from using the Console, but also to force people to use the Console. I know that at least one of them is possible in IAM. I don't remember which, because I've never encountered it, but I think it's possible to make people use the Console. I'm not sure, but I know that there are companies who want both, right? There are companies who say, "We don't want to let people use the API. We want to force them to use the Console." There are companies who say, "We don't want people using the Console at all. We want to force them to use the APIs."Jeremy: Interesting.Ben: Yeah. There's a lot of AWS customers, right? And there's every possible variety of organization and AWS should be serving all of them, right? They're all customers. And certainly, I want AWS to be leading the ones that are earlier in their cloud journey and on the serverless ladder to getting further but you can't leave them behind, I think it's important.Jeremy: So that people argument and those different levels and coming in at a different, I guess, level or comfortability with APIs versus infrastructure as code and so forth. There was another question or another comment on this that said, "I love the idea of committing everything that makes my solution to text and resurrect an entire solution out of nothing other than an account key. Loved the ability to compare versions and unit tests, every bit of my solution, and not having to remember that one weird setting if you're using the Console. But hate that it makes some people believe that any coder is now an infrastructure wizard."And I think this is a good point, right? And I don't 100% agree with it, but I think it's a good point that it basically ... Back to your point about creating these little transformations in Pulumi, you could do a lot of damage, I mean, good or bad, right? When you are using these tools. What are your thoughts on that? I mean, is this something where ... And again, the CDK makes it so easy for people to write these constructs pretty quickly and spin up tons of infrastructure without a lot of guard rails to protect them.Ben: So I think if we tweak the statement slightly, I think there's truth there, which isn't about the self-perception but about what they need to be. Right? That I think this is more about serverless than about infrastructure as code. Infrastructure as code is just saying that you can define it. Right? I think it's more about the resources that are in a particular definition that require that. My former colleague, Aaron Camera says, "Serverless means every developer is an architect" because you're not in that situation where the code you write goes onto something, you write the whole thing. Right?And so you do need to have those ... You do need to be an infrastructure wizard whether you're given the tools to do that and the education to do that, right? Not always, like if you're lucky. And the self-perception is again an even different thing, right? Especially if coders think that there's nothing to be learned ... If programmers, software developers, think that there's nothing to be learned from the folks who traditionally define the infrastructure, which is Ops, right? They think, "Those people have nothing to teach me because now I can do all the things that they did." Well, you can create the things that they created and it does not mean that you're as good at it ...Jeremy: Or responsible for monitoring it too. Right.Ben: ... and have the ... Right. The monitoring, the experience of saying these are the things that will come back to bite you that are obvious, right? This is how much ownership you're getting into. There's very much a long-standing problem there of devaluing Ops as a function and as a career. And for my money when I look at serverless, I think serverless is also making the software development easier because there's so much less software you need to write. You need to write less software that deals with the hard parts of these architectures, the scaling, the distributed computing problems.You still have this, your big computing problems, but you're considering them functionally rather than coding things that address them, right? And so I see a lot of operations folks who come into serverless learn or learn a new programming language or just upscale, right? They're writing Python scripts to control stuff and then they learn more about Python to be able to do software development in it. And then they bring all of that Ops experience and expertise into it and look at something and say, "Oh, I'd much rather have step functions here than something where I'm running code for it because I know how much my script break and those kinds of things when an API changes or ... I have to update it or whatever it is."And I think that's something that Tom McLaughlin talks about having come from an outside ground into serverless. And so I think there's definitely a challenge there in both directions, right? That Ops needs to learn more about software development to be more engaged in that process. Software development does need to learn much more about infrastructure and is also at this risk of approaching it from, "I know the syntax, but not the semantics, sort of thing." Right? We can create ...Jeremy: Just because I can doesn't mean I should.Ben: ... an infrastructure. Yeah.Rebecca: So Ben, as we're looping around this conversation and coming back to this idea that software is people and that really software should enable you to focus on the things that do matter. I'm wondering if you can perhaps think of, as pristine as possible, an example of when you saw this working, maybe it was while you've been at iRobot or a project that you worked on your own outside of that, but this moment where you saw software really working as it should, and that how it enabled you or your team to focus on the things that matter. If there's a concrete example that you can give when you see it working really well and what that looks like.Ben: Yeah. I mean, iRobot is a great example of this having been the company without need for software that scaled to consumer electronics volumes, right? Roomba volumes. And needing to build a IOT cloud application to run connected Roombas and being able to do that without having to gain that expertise. So without having to build a team that could deal with auto-scaling fleets of servers, all of those things was able to build up completely serverlessly. And so skip an entire level of organizational expertise, because that's just not necessary to accomplish those tasks anymore.Rebecca: It sounds quite nice.Ben: It's really great.Jeremy: Well, I have one more question here that I think could probably end up ... We could talk about for another hour. So I will only throw it out there and maybe you can give me a quick answer on this, but I actually had another Twitter thread on this not too long ago that addressed this very, very problem. And this is the idea of the feedback cycle on these infrastructure as code tools where oftentimes to deploy infrastructure changes, I mean, it just takes time. In many cases things can run in parallel, but as you said, there's race conditions and things like that, that sometimes things have to be ... They just have to be synchronous. So is this something where there are ways where you see in the future these mutations to your infrastructure or things like that potentially happening faster to get a better feedback cycle, or do you think that's just something that we're going to have to deal with for a while?Ben: Yeah, I think it's definitely a very extensive topic. I think there's a few things. One is that the deployment cycle needs to get shortened. And part of that I think is splitting dev deployments from prod deployments. In prod it's okay for it to take 30 seconds, right? Or a minute or however long because that's at the end of a CI/CD pipeline, right? There's other things that are happening as part of that. Now, you don't want that to be hours or whatever it is. Right? But it's okay for that to be proper and to fully manage exactly what's going on in a principled manner.When you're doing for development, it would be okay to, for example, change the Lambda code without going through CloudFormation to change the Lambda code, right? And this is what an architect does, is there's a notion of a dirty deploy which just packages up. Now, if your resource graph has changed, you do need to deploy again. Right? But if the only thing that's changing is your code, sure, you can go and say, "Update function code," on that Lambda directly and that's faster.But calling it a dirty deploy is I think important because that is not something that you want to do in prod, right? You don't want there to be drift between what the infrastructure as code service understands, but then you go further than that and imagine there's no reason that you actually have to do this whole zip file process. You could be R sinking the code directly, or you could be operating over SSH on the code remotely, right? There's many different ways in which the loop from I have a change in my Lambda code to that Lambda having that change could be even shorter than that, right?And for me, that's what it's really about. I don't think that local mocking is the answer. You and Brian Rue were talking about this recently. I mean, I agree with both of you. So I think about it as I want unit tests of my business logic, but my business logic doesn't deal with AWS services. So I want to unit test something that says, "Okay, I'm performing this change in something and that's entirely within my custom code." Right? It's not touching other services. It doesn't mean that I actually need adapters, right? I could be dealing with the native formats that I'm getting back from a given service, but I'm not actually making calls out of the code. I'm mocking out, "Well, here's what the response would look like."And so I think that's definitely necessary in the unit testing sense of saying, "Is my business logic correct? I can do that locally. But then is the wiring all correct?" Is something that should only happen in the cloud. There's no reason to mock API gateway into Lambda locally in my mind. You should just be dealing with the Lambda side of it in your local unit tests rather than trying to set up this multiple thing. Another part of the story is, okay, so these deploys have to happen faster, right? And then how do we help set up those end-to-end test and give you observability into it? Right? X-Ray helps, but until X-Ray can sort through all the services that you might use in the serverless architecture, can deal with how does it work in my Lambda function when it's batching from Kinesis or SQS into my function?So multiple traces are now being handled by one invocation, right? These are problems that aren't solved yet. Until we get that kind of inspection, it's going to be hard for us to feel as good about cloud development. And again, this is where I feel sometimes there's more friction there, but there's bigger payoff. Is one of those things where again, fighting against your incentives which is not the place that you want to be.Jeremy: I'm going to stop you before you disagree with me anymore. No, just kidding! So, Rebecca, you have any final thoughts or questions for Ben?Rebecca: No. I just want to say to both of you and to everyone listening that I hope your today self is celebrating your yesterday-self right now.Jeremy: Perfect. Well, Ben, thank you so much for joining us and being a guinea pig as we said on this new format that we are trying. Excellent guinea pig. Excellent.Rebecca: An excellent human too but also great guinea pig.Jeremy: Right. Right. Pretty much so. So if people want to find out more about you, read some of the stuff you're doing and working on, how do they do that?Ben: I'm on Twitter. That's the primary place. I'm on LinkedIn, I don't post much there. And then I write articles that show up on Medium.Rebecca: And just so everyone knows your Twitter handle I'll say it out loud too. It's @ben11kehoe, K-E-H-O-E, ben11kehoe.Jeremy: Right. Perfect. All right. Well, we will put all that in the show notes and hopefully people will like this new format. And again, we'd love your feedback on this, things that you'd like us to do in the future, any ideas you have. And of course, make sure you reach out to Ben. He's an amazing resource for serverless. So again, thank you for everything you do, and thank you for being on the show.Ben: Yeah. Thanks so much for having me. This was great.Rebecca: Good to see you. Thank you.

technology moving chefs code loved software medium cloud reverse infrastructure i am iot api puppets swap aha python aws amplify console faa prometheus apis javascript ops x ray roomba sns asl s3 ides kubernetes irobot lambda acm baas ci cd json serverless iac soylent green terraform cli dsl ssh ecs l3 tco rest apis kms cdk ec2 yaml k8s kinesis istio dynamodb charity majors api gateway lambdas fargate cloud guru cloudformation sqs ben it vtl tom mclaughlin serverless framework aws hero eventbridge ben there jeremy you jeremy it aws serverless hero ben right ben well cbt nuggets ben thanks ben kehoe rebecca so jeremy so

Episode #105: Building a Serverless Banking Platform with Patrick Strzelec

Play Episode Listen Later Jun 14, 2021 66:01

About Patrick StrzelecPatrick Strzelec is a fullstack developer with a focus on building GraphQL gateways and serverless microservices. He is currently working as a technical lead at NorthOne making banking effortless for small businesses.LinkedIn: Patrick StrzelecNorthOne Careers: www.northone.com/about/careersWatch this episode on YouTube: https://youtu.be/8W6lRc03QNU This episode sponsored by CBT Nuggets and Lumigo. TranscriptJeremy: Hi everyone. I'm Jeremy Daly, and this is Serverless Chats. Today, I'm joined by Patrick Strzelec. Hey, Patrick, thanks for joining me.Patrick: Hey, thanks for having me.Jeremy: You are a lead developer at NorthOne. I'd love it if you could tell the listeners a little bit about yourself, your background, and what NorthOne does.Patrick: Yeah, totally. I'm a lead developer here at NorthOne, I've been focusing on building out our GraphQL gateway here, as well as some of our serverless microservices. What NorthOne does, we are a banking experience for small businesses. Effectively, we are a deposit account, with many integrations that act almost like an operating system for small businesses. Basically, we choose the best partners we can to do things like check deposits, just your regular transactions you would do, as well as any insights, and the use cases will grow. I'd like to call us a very tailored banking experience for small businesses.Jeremy: Very nice. The thing that is fascinating, I think about this, is that you have just completely embraced serverless, right?Patrick: Yeah, totally. We started off early on with this vision of being fully event driven, and we started off with a monolith, like a Python Django big monolith, and we've been experimenting with serverless all the way through, and somewhere along the journey, we decided this is the tool for us, and it just totally made sense on the business side, on the tech side. It's been absolutely great.Jeremy: Let's talk about that because this is one of those things where I think you get a business and a business that's a banking platform. You're handling some serious transactions here. You've got a lot of transactions that are going through, and you've totally embraced this. I'd love to have you take the listeners through why you thought it was a good idea, what were the business cases for it? Then we can talk a little bit about the adoption process, and then I know there's a whole bunch of stuff that you did with event driven stuff, which is absolutely fascinating.Then we could probably follow up with maybe a couple of challenges, and some of the issues you face. Why don't we start there. Let's start, like who in your organization, because I am always fascinated to know if somebody in your organization says, “Hey we absolutely need to do serverless," and just starts beating that drum. What was that business and technical case that made your organization swallow that pill?Patrick: Yeah, totally. I think just at a high level we're a user experience company, we want to make sure we offer small businesses the best banking experience possible. We don't want to spend a lot of time on operations, and trying to, and also reliability is incredibly important. If we can offload that burden and move faster, that's what we need to do. When we're talking about who's beating that drum, I would say our VP, Blake, really early on, seemed to see serverless as this amazing fit. I joined about three years ago today, so I guess this is my anniversary at the company. We were just deciding what to build. At the time there was a lot of architecture diagrams, and Blake hypothesized that serverless was a great fit.We had a lot of versions of the world, some with Apache Kafka, and a bunch of microservices going through there. There's other versions with serverless in the mix, and some of the tooling around that, and this other hypothesis that maybe we want GraphQL gateway in the middle of there. It was one of those things that we wanted to test our hypothesis as we go. That ties into this innovation velocity that serverless allows for. It's very cheap to put a new piece of infrastructure up in serverless. Just the other day we wanted to test Kinesis for an event streaming use case, and that was just a half an hour to set up that config, and you could put it live in production and test it out, which is completely awesome.I think that innovation velocity was the hypothesis. We could just try things out really quickly. They don't cost much at all. You only pay for what you use for the most part. We were able to try that out, and as well as reliability. AWS really does a good job of making sure everything's available all the time. Something that maybe a young startup isn't ready to take on. When I joined the company, Blake proposed, “Okay, let's try out GraphQL as a gateway, as a concept. Build me a prototype." In that prototype, there was a really good opportunity to try serverless. They just ... Apollo server launched the serverless package, that was just super easy to deploy.It was a complete no-brainer. We tried it out, we built the case. We just started with this GraphQL gateway running on serverless. AWS Lambda. It's funny because at first, it's like, we're just trying to sell them development. Nobody's going to be hitting our services. It was still a year out from when we were going into production. Once we went into prod, this Lambda's hot all the time, which is interesting. I think the cost case breaks down there because if you're running this thing, think forever, but it was this GraphQL server in front of our Python Django monolift, with this vision of event driven microservices, which has fit well for banking. If you just think about the banking world, everything is pretty much eventually consistent.Just, that's the way the systems are designed. You send out a transaction, it doesn't settle for a while. We were always going to do event driven, but when you're starting out with a team of three developers, you're not going to build this whole microservices environment and everything. We started with that monolith with the GraphQL gateway in front, which scaled pretty nicely, because we were able to sort of, even today we have the same GraphQL gateway. We just changed the services backing it, which was really sweet. The adoption process was like, let's try it out. We tried it out with GraphQL first, and then as we were heading into launch, we had this monolith that we needed to manage. I mean, manually managing AWS resources, it's easier than back in the day when you're managing your own virtual machines and stuff, but it's still not great.We didn't have a lot of time, and there was a lot of last-minute changes we needed to make. A big refactor to our scheduling transactions functions happened right before launch. That was an amazing serverless use case. And there's our second one, where we're like, “Okay, we need to get this live really quickly." We created this work performance pattern really quickly as a test with serverless, and it worked beautifully. We also had another use case come up, which was just a simple phone scheduling service. We just wrapped an API, and just exposed some endpoints, but it was just a lot easier to do with serverless. Just threw it off to two developers, figure out how you do it, and it was ready to be live. And then ...Jeremy: I'm sorry to interrupt you, but I want to get to this point, because you're talking about standing up infrastructure, using infrastructure as code, or the tools you're using. How many developers were working on this thing?Patrick: How many, I think at the time, maybe four developers on backend functionality before launch, when we were just starting out.Jeremy: But you're building a banking platform here, so this is pretty sophisticated. I can imagine another business case for serverless is just the sense that we don't have to hire an operations team.Patrick: Yeah, exactly. We were well through launching it. I think it would have been a couple of months where we were live, or where we hired our first dev ops engineer. Which is incredible. Our VP took a lot of that too, I'm sure he had his hands a little more dirty than he did like early on. But it was just amazing. We were able to manage all that infrastructure, and scale was never a concern. In the early stages, maybe it shouldn't be just yet, but it was just really, really easy.Jeremy: Now you started with four, and I think, what are you now? Somewhere around 25 developers? Somewhere in that space now?Patrick: About 25 developers now, we're growing really fast. We doubled this year during COVID, which is just crazy to think about, and somehow have been scaling somewhat smoothly at least, in terms of just being able to output as a dev team promote. We'll probably double again this year. This is maybe where I shamelessly plug that we're hiring, and we always are, and you could visit northone.com and just check out the careers page, or just hit me up for a warm intro. It's been crazy, and that's one of the things that serverless has helped with us too. We haven't had this scaling bottleneck, which is an operations team. We don't need to hire X operations people for a certain number of developers.Onboarding has been easier. There was one example of during a major project, we hired a developer. He was new to serverless, but just very experienced developer, and he had a production-ready serverless service ready in a month, which was just an insane ramp-up time. I haven't seen that very often. He didn't have to talk to any of our operation staff, and we'd already used serverless long enough that we had all of our presets and boilerplates ready, and permissions locked down, so it was just super easy. It's super empowering just for him to be able to just play around with the different services. Because we hit that point where we've invested enough that every developer when they opened a branch, that branch deploys its own stage, which has all of the services, AWS infrastructure deployed.You might have a PR open that launches an instance of Kinesis, and five SQS queues, and 10 Lambdas, and a bunch of other things, and then tear down almost immediately, and the cost isn't something we really worry about. The innovation velocity there has been really, really good. Just being able to try things out. If you're thinking about something like Kinesis, where it's like a Kafka, that's my understanding, and if you think about the organizational buy-in you need for something like Kafka, because you need to support it, come up with opinions, and all this other stuff, you'll spend weeks trying it out, but for one of our developers, it's like this seems great.We're streaming events, we want this to be real-time. Let's just try it out. This was for our analytics use case, and it's live in production now. It seems to be doing the thing, and we're testing out that use case, and there isn't that roadblock. We could always switch off to a different design if you want. The experimentation piece there has been awesome. We've changed, during major projects we've changed the way we've thought about our resources a few times, and in the end it works out, and often it is about resiliency. It's just jamming queues into places we didn't think about in the first place, but that's been awesome.Jeremy: I'm curious with that, though, with 25 developers ... Kinesis for the most part works pretty well, but you do have to watch those iterator ages, and make sure that they're not backing up, or that you're losing events. If they get flooded or whatever, and also sticking queues everywhere, sounds like a really good idea, and I'm a big fan of that, but it also, that means there's a lot of queues you have to manage, and watch, and set alarms and all that kind of stuff. Then you also talked about a pretty, what sounds like a pretty great CI/CD process to spin up new branches and things like that. There's a lot of dev ops-y ops work that is still there. How are you handling that now? Do you have dedicated ops people, or do you just have your developers looking after that piece of it?Patrick: I would say we have a very spirited group of developers who are inspired. We do a lot of our code-sharing via internal packages. A few of our developers just figured out some of our patterns that we need, whether it's like CI, or how we structure our events stores, or how we do our Q subscriptions. We manage these internal packages. This won't scale well, by the way. This is just us being inspired and trying to reduce some of this burden. It is interesting, I've listened to this podcast and a few others, and this idea of infrastructure as code being part of every developer's toolbox, it's starting to really resonate with our team.In our migration, or our swift shift to full, I'd say doing serverless properly, we've learned to really think in it. Think in terms of infrastructure in our creating solutions. Not saying we're doing serverless the right way now, but we certainly did it the wrong way in the past, where we would spin up a bunch of API gateways that would talk to each other. A lot of REST calls going around the spider web of communication. Also, I'll call these monster Lambdas, that have a whole procedure list that they need to get through, and a lot of points of failure. When we were thinking about the way we're going to do Lambda now, we try to keep one Lambda doing one thing, and then there's pieces of infrastructure stitching that together. EventBridge between domain boundaries, SQS for commands where we can, instead of using API gateway. I think that transitions pretty well into our big break. I'm talking about this as our migration to serverless. I want to talk more about that.Jeremy: Before we jump into that, I just want to ask this question about, because again, I call those fat, some people call them fat Lambdas, I call them Lambda lifts. I think there's Lambda lifts, then fat Lambdas, then your single-purpose functions. It's interesting, again, moving towards that direction, and I think it's super important that just admitting that you're like, we were definitely doing this wrong. Because I think so many companies find that adopting serverless is very much so an evolution, and it's a learning thing where the teams have to figure out what works for them, and in some cases discovering best practices on your own. I think that you've gone through that process, I think is great, so definitely kudos to you for that.Before we get into that adoption and the migration or the evolution process that you went through to get to where you are now, one other business or technical case for serverless, especially with something as complex as banking, I think I still don't understand why I can't transfer personal money or money from my personal TD Bank account to my wife's local checking account, why that's so hard to do. But, it seems like there's a lot of steps. Steps that have to work. You can't get halfway through five steps in some transaction, and then be like, oops we can't go any further. You get to roll that back and things like that. I would imagine orchestration is a huge piece of this as well.Patrick: Yeah, 100%. The banking lends itself really well to these workflows, I'll call them. If you're thinking about even just the start of any banking process, there's this whole application process where you put in all your personal information, you send off a request to your bank, and then now there's this whole waterfall of things that needs to happen. All kinds of checks and making sure people aren't on any fraud lists, or money laundering lists, or even just getting a second dive from our compliance department. There's a lot of steps there, and even just keeping our own systems in sync, with our off-provider and other places. We definitely lean on using step functions a lot. I think they work really, really well for our use case. Just the visual, being able to see this is where a customer is in their onboarding journey, is very, very powerful.Being able to restart at any point of their, or even just giving our compliance team a view into that process, or even adding a pause portion. I think that's one of the biggest wins there, is that we could process somebody through any one of our pipelines, and we may need a human eye there at least for this point in time. That's one of the interesting things about the banking industry is. There are still manual processes behind the scenes, and there are, I find this term funny, but there are wire rooms in banks where there are people reviewing things and all that. There are a lot of workflows that just lend themselves well to step functions. That pausing capability and being able to return later with a response, so that allows you to build other internal applications for your compliance teams and other teams, or just behind the scenes calls back, and says, "Okay, resume this waterfall."I think that was the visualization, especially in an events world when you're talking about like sagas, I guess, we're talking about distributed transactions here in a way, where there's a lot of things happening, and a common pattern now is the saga pattern. You probably don't want to be doing two-phase commits and all this other stuff, but when we're looking at sagas, it's the orchestration you could do or the choreography. Choreography gets very messy because there's a lot of simplistic behavior. I'm a service and I know what I need to do when these events come through, and I know which compensating events I need to dump, and all this other stuff. But now there's a very limited view.If a developer is trying to gain context in a certain domain, and understand the chain of events, although you are decoupled, there's still this extra coupling now, having to understand what's going on in your system, and being able to share it with external stakeholders. Using step functions, that's the I guess the serverless way of doing orchestration. Just being able to share that view. We had this process where we needed to move a lot of accounts to, or a lot of user data to a different system. We were able to just use an orchestrator there as well, just to keep an eye on everything that's going on.We might be paused in migrating, but let's say we're moving over contacts, a transaction list, and one other thing, you could visualize which one of those are in the red, and which one we need to come in and fix, and also share that progress with external stakeholders. Also, it makes for fun launch parties I'd say. It's kind of funny because when developers do their job, you press a button, and everything launches, and there's not really anything to share or show.Jeremy: There's no balloons or anything like that.Patrick: Yeah. But it was kind of cool to look at these like, the customer is going through this branch of the logic. I know it's all green. Then I think one of the coolest things was just the retry ability as well. When somebody does fail, or when one of these workflows fails, you could see exactly which step, you can see the logs, and all that. I think one of the challenges we ran into there though, was because we are working in the banking space, we're dealing with sensitive data. Something I almost wish AWS solved out of the box, would be being able to obfuscate some of that data. Maybe you can't, I'm not sure, but we had to think of patterns for tokenization for instance.Stripe does this a lot where certain parts of their platform, you just get it, you put in personal information, you get back a token, and you use that reference everywhere. We do tokenization, as well as we limit the amount of details flowing through steps in our orchestrators. We'll use an event store with identifiers flowing through, and we'll be doing reads back to that event store in between steps, to do what we need to do. You lose some of that debug-ability, you can't see exactly what information is flowing through, but we need to keep user data safe.Jeremy: Because it's the use case for it. I think that you mentioned a good point about orchestration versus choreography, and I'm a big fan of choreography when it makes sense. But I think one of the hardest lessons you learn when you start building distributed systems is knowing when to use choreography, and knowing when to use orchestration. Certainly in banking, orchestration is super important. Again, with those saga patterns built-in, that's the kind of thing where you can get to a point in the process and you don't even need to do automated rollbacks. You can get to a failure state, and then from there, that can be a pause, and then you can essentially kick off the unwinding of those things and do some of that.I love that idea that the token pattern and using just rehydrating certain steps where you need to. I think that makes a ton of sense. All right. Let's move on to the adoption and the migration process, because I know this is something that really excites you and it should because it is cool. I always know, as you're building out applications and you start to add more capabilities and more functionality and start really embracing serverless as a methodology, then it can get really exciting. Let's take a step back. You had a champion in your organization that was beating the drum like, "Let's try this. This is going to make a lot of sense." You build an Apollo Lambda or a Lambda running Apollo server on it, and you are using that as a strangler pattern, routing all your stuff through now to your backend. What happens next?Patrick: I would say when we needed to build new features, developers just gravitated towards using serverless, it was just easier. We were using TypeScript instead of Python, which we just tend to like as an organization, so it's just easier to hop into TypeScript land, but I think it was just easier to get something live. Now we had all these Lambdas popping up, and doing their job, but I think the problem that happened was we weren't using them properly. Also, there was a lot of difference between each of our serverless setups. We would learn each time and we'd be like, okay, we'll use this parser function here to simplify some of it, because it is very bare-bones if you're just pulling the Serverless Framework, and it took a little ...Every service looked very different, I would say. Also, we never really took the time to sit back and say, “Okay, how do we think about this? How do we use what serverless gives us to enable us, instead of it just being an easy thing to spin up?" I think that's where it started. It was just easy to start. But we didn't embrace it fully. I remember having a conversation at some point with our VP being like, “Hey, how about we just put Express into one of our Lambdas, and we create this," now I know it's a Lambda lift. I was like, it was just easier. Everybody knows how to use Express, why don't we just do this? Why are we writing our own parsers for all these things? We have 10 versions of a make response helper function that was copy-pasted between repos, and we didn't really have a good pattern for sharing that code yet in private packages.We realized that we liked serverless, but we realized we needed to do it better. We started with having a serverless chapter reading between some of our team members, and we made some moves there. We created a shared boilerplate at some point, so it reduced some of the differences you'd see between some of the repositories, but we needed a step-change difference in our thinking, when I look back, and we got lucky that opportunity came up. At this point, we probably had another six Lambda services, maybe more actually. I want to say around, we'd probably have around 15 services at this point, without a governing body around patterns.At this time, we had this interesting opportunity where we found out we're going to be re-platforming. A big announcement we just made last month was that we moved on to a new bank partner called Bancorp. The bank partner that supports Chime, and they're like, I'll call them an engine boost. We put in a much larger, more efficient engine for our small businesses. If you just look at the capabilities they provide, they're just absolutely amazing. It's what we need to build forward. Their events API is amazing as well as just their base banking capabilities, the unit economics they can offer, the times on there, things were just better. We found out we're doing an engine swap. The people on the business side on our company trusted our technical team to do what we needed to do.Obviously, we need to put together a case, but they trusted us to choose our technology, which was awesome. I think we just had a really good track record of delivering, so we had free reign to decide what do we do. But the timeline was tight, so what we decided to do, and this was COVID times too, was a few of our developers got COVID tested, and we rented a house and we did a bubble situation. How in the NHL or MBA you have a bubble. We had a dev bubble.Jeremy: The all-star team.Patrick: The all-star team, yeah. We decided let's sit down, let's figure out what patterns are going to take us forward. How do we make the step-change at the same time as step-change in our technology stack, at the same time as we're swapping out this bank, this engine essentially for the business. In this house, we watched almost every YouTube video you can imagine on event driven and serverless, and I think leading up. I think just knowing that we were going to be doing this, I think all of us independently started prototyping, and watching videos, and reading a lot of your content, and Alex DeBrie and Yan Cui. We all had a lot of ideas already going in.When we all got to this house, we started off with this exercise, an event storming exercise, just popular in the domain-driven design community, where we just threw down our entire business on a wall with sticky notes, and it would have been better to have every business stakeholder there, but luckily we had two people from our product team there as representatives. That's how invested we were in building this outright, that we have products sitting in the room with us to figure it out.We slapped down our entire business on a wall, this took days, and then drew circles around it and iterated on that for a while. Then started looking at what the technology looks like. What are our domain boundaries, and what prototypes do we need to make? For a few weeks there, we were just prototyping. We built out what I'd called baby's first balance. That was the running joke where, how do we get an account opened with a balance, with the transactions minimally, with some new patterns. We really embraced some of this domain-driven-design thinking, as well as just event driven thinking. When we were rethinking architecture, three concepts became very important for us, not entirely new, but important. Item potency was a big one, dealing with distributed transactions was another one of those, as well as the eventual consistency. The eventual consistency portion is kind of funny because we were already doing it a lot.Our transactions wouldn't always settle very quickly. We didn't know about it, but now our whole system becomes eventually consistent typically if you now divide all of your architecture across domains, and decouple everything. We created some early prototypes, we created our own version of an event store, which is, I would just say an opinionated scheme around DynamoDB, where we keep track of revisions, payload, timestamp, all the things you'd want to be able to do event sourcing. That's another thing we decided on. Event sourcing seemed like the right approach for state, for a lot of our use cases. Banking, if you just think about a banking ledger, it is events or an accounting ledger. You're just adding up rows, add, subtract, add, subtract.We created a lot of prototypes for these things. Our events store pattern became basically just a DynamoDB with opinions around the schema, as well as a package of a shared code package with a simple dispatch function. One dispatch function that really looks at enforcing optimistic concurrency, and one that's a little bit more relaxed. Then we also had some reducer functions built into there. That was one of the packages that we created, as well as another prototype around that was how do we create the actual subscriptions to this event store? We landed on SNS to SQS fan-out, and it seems like fan-out first is the serverless way of doing a lot of things. We learned that along the way, and it makes sense. It was one of those things we read from a lot of these blogs and YouTube videos, and it really made sense in production, when all the data is streaming from one place, and then now you just add subscribers all over the place. Just new queues. Fan-out first, highly recommend. We just landed on there by following best practices.Jeremy: Great. You mentioned a bunch of different things in there, which is awesome, but so you get together in this house, you come up with all the events, you do this event storming session, which is always a great exercise. You get a pretty good visualization of how the business is going to run from an event standpoint. Then you start building out this event driven architecture, and you mentioned some packages that you built, we talked about step functions and the orchestration piece of this. Just give me a quick overview of the actual system itself. You said it's backed by DynamoDB, but then you have a bunch of packages that run in between there, and then there's a whole bunch of queues, and then you're using some custom packages. I think I already said that but you're using ... are you using EventBridge in there? What's some of the architecture behind all that?Patrick: Really, really good question. Once we created these domain boundaries, we needed to figure out how do we communicate between domains and within domains. We landed on really differentiating milestone events and domain events. I guess milestone events in other terms might be called integration events, but this idea that these are key business milestones. An account was open, an application was approved or rejected, things that every domain may need to know about. Then within our domains, or domain boundaries, we had these domain events, which might reduce to a milestone event, and we can maintain those contracts in the future and change those up. We needed to think about how do we message all these things across? How do we communicate? We landed on EventBridge for our milestone events. We have one event bus that we talked to all of our, between domain boundaries basically.EventBridge there, and then each of our services now subscribed to that EventBridge, and maintain their own events store. That's backed by DynamoDB. Each of our services have their own data store. It's usually an event stream or a projection database, but it's almost all Dynamo, which is interesting because our old platform used Postgres, and we did have relational data. It was interesting. I was really scared at first, how are we going to maintain relations and things? It became a non-issue. I don't even know why now that I think about it. Just like every service maintains its nice projection through events, and builds its own view of the world, which brings its own problems. We have DynamoDB in there, and then SNS to SQS fan-out. Then when we're talking about packages ...Jeremy: That's Office Streams?Patrick: Exactly, yeah. We're Dynamo streams to SNS, to SQS. Then we use shared code packages to make those subscriptions very easy. If you're looking at doing that SNS to SQS fan-out, or just creating SQS queues, there is a lot of cloud formation boilerplate that we were creating, and we needed to move really quick on this project. We got pretty opinionated quick, and we created our own subscription function that just generates all this cloud formation with naming conventions, which was nice. I think the opinions were good because early on we weren't opinionated enough, I would say. When you look in your AWS dashboard, the read for these aren't prefixed correctly, and there's all this garbage. You're able to have consistent naming throughout, make it really easy to subscribe to an event.We would publish packages to help with certain things. Our events store package was one of those. We also created a Lambda handlers package, which leverages, there's like a Lambda middlewares compose package out there, which is quite nice, and we basically, all the common functionality we're doing a lot of, like parsing a body from S3, or SQS or API gateway. That's just the middleware that we now publish. Validation in and out. We highly recommend the library Zod, we really embrace the TypeScript first object validation. Really, really cool package. We created all these middlewares now. Then subscription packages. We have a lot of shared code in this internal NPM repository that we install across.I think one challenge we had there was, eventually you extracted away too much from the cloud formation, and it's hard for new developers to ... It's easy for them to create events subscriptions, it's hard for them to evolve our serverless thinking because they're so far removed from it. I still think it was the right call in the end. I think this is the next step of the journey, is figuring out how do we share code effectively while not hiding away too much of serverless, especially because it's changing so fast.Jeremy: It's also interesting though that you take that approach to hide some of that complexity, and bake in some of that boilerplate that, someone's mostly didn't have to write themselves anyways. Like you said, they're copying and pasting between services, is not the best way to do it. I tried the whole shared packages thing one time, and it kind of worked. It's just like when you make a small change to that package and you have 14 services, that then you have to update to get the newest version. Sometimes that's a little frustrating. Lambda layers haven't been a huge help with some of that stuff either. But anyways, it's interesting, because again you've mentioned this a number of times about using queues.You did mention resiliency in there, but I want to touch on that point a little bit because that's one of those things too, where I would assume in a banking platform, you do not want to lose events. You don't want to lose things. and so if something breaks, or something gets throttled or whatever, having to go and retry those events, having the alerts in place to know that a queue is backed up or whatever. Then just, I'm thinking ordering issues and things like that. What kinds of issues did you face, and tell me a little bit more about what you've done for reliability?Patrick: Totally. Queues are definitely ... like SQS is a workhorse for our company right now. We use a lot of it. Dropping messages is one of the scariest things, so you're dead-on there. When we were moving to event driven, that was what scared me the most. What if we drop an event? A good example of that is if you're using EventBridge and you're subscribing Lambdas to it, I was under the impression early on that EventBridge retries forever. But I'm pretty sure it'll retry until it invokes twice. I think that's what we landed on.Jeremy: Interesting.Patrick: I think so, and don't quote me on this. That was an example of where drop message could be a problem. We put a queue in front of there, an SQS queue as the subscription there. That way, if there's any failure to deliver there, it's just going to retry all the time for a number of days. At that point we got to think about DLQs, and that's something we're still thinking about. But yeah, I think the reason we've been using queues everywhere is that now queues are in charge of all your retry abilities. Now that we've decomposed these Lambdas into one Lambda lift, into five Lambdas with queues in between, if anything fails in there, it just pops back into the queue, and it'll retry indefinitely. You can drop messages after a few days, and that's something we learned luckily in the prototyping stage, where there are a few places where we use dead letter queues. But one of the issues there as well was ordering. Ordering didn't play too well with ...Jeremy: Not with DLQs. No, it does not, no.Patrick: I think that's one lesson I'd want to share, is that only use ordering when you absolutely need it. We found ways to design some of our architecture where we didn't need ordering. There's places we were using FIFO SQS, which was something that just launched when we were building this thing. When we were thinking about messaging, we're like, "Oh, well we can't use SQS because they don't respect ordering, or it doesn't respect ordering." Then bam, the next day we see this blog article. We got really hyped on that and used FIFO everywhere, and then realized it's unnecessary in most use cases. So when we were going live, we actually changed those FIFO queues into just regular SQS queues in as many places as we can. Then so, in that use case, you could really easily attach a dead letter queue and you don't have to worry about anything, but with FIFO things get really, really gnarly.Ordering is an interesting one. Another place we got burned I think on dead-letter queues, or a tough thing to do with dead letter queues is when you're using our state machines, we needed to limit the concurrency of our state machines is another wishlist item in AWS. I wish there was just at the top of the file, a limit concurrent executions of your state machine. Maybe it exists. Maybe we just didn't learn to use it properly, but we needed to. There's a few patterns out there. I've seen the [INAUDIBLE] pattern where you can use the actual state machine flow to look back at how many concurrent executions you have, and pause. We landed on setting reserved concurrency in a number of Lambdas, and throwing errors. If we've hit the max concurrency and it'll pause that Lambda, but the problem with DLQs there was, these are all errors. They're coming back as errors.We're like, we're fine with them. This is a throttle error. That's fine. But it's hard to distinguish that from a poison message in your queue, so when do you dump those into DLQ? If it's just a throttling thing, I don't think it matters to us. That was another challenge we had. We're still figuring out dead letter queues and alerting. I think for now we just relied on CloudWatch alarms a lot for our alerting, and there's a lot you could do. Even just in the state machines, you can get pretty granular there. I know once certain things fail, and announced to your Slack channel. We use that Slack integration, it's pretty easy. You just go on a Slack channel, there's an email in there, you plop it into the console in AWS, and you have your very early alerting mechanism there.Jeremy: The thing with Elasticsearch ... not Elasticsearch, I'm sorry. I'm totally off-topic here. The thing with EventBridge and Lambda, these are one of those things that, again, they're nuances, but event bridge, as long as it can deliver to the Lambda service, then the Lambda service kicks off and queues it automatically. Then that will retry at a certain number of times. I think you can control that now. But then eventually if that retries multiple times and eventually fails, then that kicks it over to the DLQ or whatever. There's all different ways that it works like that, but that's why I always liked the idea of putting a queue in between there as well, because I felt you just had a little bit more control over exactly what happens.As long as it gets to the queue, then you know you haven't lost the message, or you hope you haven't lost a message. That's super interesting. Let's move on a little bit about the adoption issues. You mentioned a few of these things, obviously issues with concurrency and ordering, and some of that other stuff. What about some of the other challenges you had? You mentioned this idea of writing all these packages, and it pulls devs away from the CloudFormation a little bit. I do like that in that it, I think, accelerates a lot of things, but what are some of the other maybe challenges that you've been having just getting this thing up and running?Patrick: I would say IAM is an interesting one. Because we are in the banking space, we want to be very careful about what access do you give to what machines or developers, I think machines are important too. There've been cases where ... so we do have a separate developer set up with their own permissions, in development's really easy to spin up all your services within reason. But now when we're going into production, there's times where our CI doesn't have the permissions to delete a queue or create a queue, or certain things, and there's a lot of tweaking you have to do there, and you got to do a lot of thinking about your IAM policies as an organization, especially because now every developer's touching infrastructure.That becomes this shared operational overhead that serverless did introduce. We're still figuring that out. Right now we're functioning on least privilege, so it's better to just not be able to deploy than deploy something you shouldn't or read the logs that you shouldn't, and that's where we're starting. But that's something that, it will be a challenge for a little while I think. There's all kinds of interesting things out there. I think temporary IAM permissions is a really cool one. There are times we're in production and we need to view certain logs, or be able to access a certain queue, and there's tooling out there where you can, or at least so I've heard, you can give temporary permissions. You have this queue permission for 30 minutes, and it expires and it's audited, and I think there's some CloudTrail tie-in you could do there. I'm speaking about my wishlist for our next evolution here. I hope my team is listening ...Jeremy: Your team's listening to you.Patrick: ... will be inspired as well.Jeremy: What about ... because this is something too that I always found to be a challenge, especially when you start having multiple services, and you've talked about these domain events, but then milestone events. You've got different services that need to communicate across services, or across domains, and realize certain things like that. Service discovery in and of itself, and which queue are we mapping to, or which service am I talking to, and which version of the service am I talking to? Things like that. How have you been dealing with that stuff?Patrick: Not well, I would say. Very, very ad hoc. I think like right now, at least we have tight communication between the teams, so we roughly know which service we need to talk to, and we output our URLs in the cloud formation output, so at least you could reference the URLs across services, a little easier. Really, a GraphQL is one of the only service that really talks to a lot of our API gateways. At least there's less of that, knowing which endpoint to hit. Most of our services will read into EventBridge, and then within services, a lot of that's abstracted away, like the queue subscription's a little easier. Service discovery is a bit of a nightmare.Once our services grow, it'll be, I don't know. It'll be a huge challenge to understand. Even which services are using older versions of Node, for instance. I saw that AWS is now deprecating version 10 and we'll have to take a look internally, are we using version 10 anywhere, and how do we make sure that's fine, or even things like just knowing which services now have vulnerabilities in their NPM packages because we're using Node. That's another thing. I don't even know if that falls in service discovery, but it's an overhead of ...Jeremy: It's a service management too. It's a lot there. That actually made me, it brings me to this idea of observability too. You mentioned doing some CloudWatch alerts and some of that stuff, but what about using some observability tool or tracing like x-ray, and things like that? Have you been implementing any of that, and if you have, have you had any success and or problems with it?Patrick: I wish we had a better view of some of the observability tools. I think we were just building so quickly that we never really invested the time into trying them out. We did use X-Ray, so we rolled our own tooling internally to at least do what we know. X-Ray was one of those, but the problem with X-Ray is, we do subscribe all of our services, but X-Ray isn't implemented everywhere internally in AWS, so we lose our trail somewhere in that Dynamo stream to SNS, or SQS. It's not a full trace. Also, just digesting that huge graph of information is just very difficult. I don't use it often, I think it's a really cool graphic to show, “Hey, look, how many services are running, and it's going so fast."It's a really cool thing to look at, but it hasn't been very useful. I think our most useful tool for debugging and observability has been just our logging. We created a JSON logger package, so we get up JSON logs and we can actually filter off of different properties, and we ship those to Elasticsearch. Now you can have a view of all of the functions within a given domain at any point in time. You could really see the story. Because I think early on when we were opening up CloudWatch and you'd have like 10 tabs, and you're trying to understand this flow of information, it was very difficult.We also implemented our own trace ID pattern, and I think we just followed a Lumigo article where we introduced some properties, and in each of our Lambdas at a higher level, and one of our middlewares, and we were able to trace through. It's not ideal. Observability is something that we'll probably have to work on next. It's been tolerable for now, but I can't see the scaling that long.Jeremy: That's the other thing too, is even the shared package issue. It's like when you have an observability tool, they'll just install a layer or something, where you don't necessarily have to worry about updating your own tool. I always find if you are embracing serverless and you want to get rid of all that undifferentiated heavy lifting, observability tools, there's a lot of really good ones out there that are doing some great stuff, and they're specializing in it. It might be worth letting someone else handle that for you than trying to do it yourself internally.Patrick: Yeah, 100%. Do you have any that you've used that are particularly good? I know you work with serverless so-Jeremy: I played around with all of them, because I love this stuff, so it's always fun, but I mean, obviously Lumigo and Epsagon, and Thundra, and New Relic. They're all great. They all do things slightly differently, but they all follow a similar implementation pattern so that it's very easy to install them. We can talk more about some recommendations. I think it's just one of those things where in a modern application not having that insight is really hard. It can be really hard to debug stuff. If you look at some of the tools that AWS offers, I think they're there, it's just, they are maybe a little harder to implement, and not quite as refined and targeted as some of the observability tools. But still, you got to get there. Again, that's why I keep saying it's an evolution, it's a process. Maybe one time you get burned, and you're like, we really needed to have observability, then that's when it becomes more of a priority when you're moving fast like you are.Patrick: Yeah, 100%. I think there's got to be a priority earlier than later. I think I'll do some reading now that you've dropped some of these options. I have seen them floating around, but it's one of those things that when it's too late, it's too late.Jeremy: It's never too late to add observability though, so it should. Actually, a lot of them now, again, it makes it really, really easy. So I'm not trying to pitch any particular company, but take a look at some of them, because they are really great. Just one other challenge that I also find a lot of people run into, especially with serverless because there's all these artificial account limits in place. Even the number of queues you can create, and the number of concurrent Lambda functions in a particular region, and stuff like that. Have you run into any of those account limit issues?Patrick: Yeah. I could give you the easiest way to run into an account on that issue, and that is replay your entire EventBridge archive to every subscriber, and you will find a bottleneck somewhere. That's something ...Jeremy: Somewhere it'll fall over? Nice.Patrick: 100%. It's a good way to do some quick check and development to see where you might need to buffer something, but we have run into that. I think the solution there, and a lot of places was just really playing with concurrency where we needed to, and being thoughtful about where is their main concurrency in places that we absolutely needed to stay functioning. I think the challenge there is that eats into your total account concurrency, which was an interesting learning there. Definitely playing around there, and just being thoughtful about where you are replaying. A couple of things. We use replays a lot. Because we are using these milestone events between service boundaries, now when you launch a new service, you want to replay that whole history all the way through.We've done a lot of replaying, and that was one of the really cool things about EventBridge. It just was so easy. You just set up an archive, and it'll record everything coming through, and then you just press a button in the console, and it'll replay all of them. That was really awesome. But just being very mindful of where you're replaying to. If you replay to all of your subscriptions, you'll hit Lambda concurrency limits real quick. Even just like another case, early on we needed to replace ... we have our own domain events store. We want to replace some of those events, and those are coming off the Dynamo stream, so we were using dynamo to kick those to a stream, to SNS, and fan-out to all of our SQS queues. But there would only be one or two queues you actually needed to subtract to those events, so we created an internal utility just to dump those events directly into the SQS queue we needed. I think it's just about not being wasteful with your resources, because they are cheap. Sure.Jeremy: But if you use them, they start to cost money.Patrick: Yeah. They start to cost some money as well as they could lock down, they can lock you out of other functionality. If you hit your Lambda limits, now our API gateway is tapped.Jeremy: That's a good point.Patrick: You could take down your whole system if you just aren't mindful about those limits, and now you could call up AWS in a panic and be like, “Hey, can you update our limits?" Luckily we haven't had to do that yet, but it's definitely something in your back pocket if you need it, if you can make the case to AWS, that maybe you do need bigger limits than the default. I think just not being wasteful, being mindful of where you're replaying. I think another interesting thing there is dealing with partners too. It's really easy to scale in the Lambda world, but not every partner could handle that volume really quickly. If you're not buffering any event coming through EventBridge to your new service that hits a partner every time, you're going to hit their API rate limit really quickly, because they're just going to just go right through it.You might be doing thousands of API calls when you're instantiating a new service. That's one of those interesting things that we have to deal with, and particularly in our orchestrators, because they are talking to different partners, that's why we need to really make sure we could limit the concurrent executions of the state machines themselves. In a way, some of our architecture is too fast to scale.Jeremy: It's too good.Patrick: You still have to consider downstream. That, and even just, if you are using relational databases or anything else in your system, now you have to worry about connection limits and ...Jeremy: I have a whole talk I gave on that.Patrick: ... spikes in traffic.Jeremy: Yes, absolutely.Patrick: Really cool.Jeremy: I know all about it. Any final advice for companies like you that are trying to bite off a piece of the serverless apple, I guess, That's really bad. Anyways, any advice for people looking to get into this?Patrick: Yeah, totally. I would say start small. I think we were wise to just try it out. It might not land with your development team. If you don't really buy in, it's one of those things that could just end up unnecessarily messy, so start small, see if you like it in-shop, and then reevaluate, once you hit a certain point. That, and I would say shared boilerplate packages sooner than later. I know shared code is a problem, but it is nice to have an un-opinionated starter pack, that you're at least not doing anything really crazy. Even just things like having opinions around logging. In our industry, it's really important that you're not logging sensitive details.For us doing things like wrapping our HTTP clients to make sure we're not logging sensitive details, or having short Lambda packages that make sure out-of-the-box you're opinionated about not doing something terribly awful. I would say those two things. Start small and a boiler package, and maybe the third thing is just pay attention to the code smell of a growing Lambda. If you are doing three API calls in one Lambda, chances are you could probably break that up, and think about it in a more resilient way. If any one of those pieces fail, now you could have retry ability in each one of those. Those are the three things I would say. I could probably talk forever about the rest of our journey.Jeremy: I think that was great advice, and I love hearing about how companies are going through this process, what that process looks like, and I hope, I hope, I hope that companies listen to this and can skip a lot of these mistakes. I don't want to call them all mistakes, and I think it's just evolution. The stuff that you've done, we've all made them, we've all gone through that process, and the more we can solidify these practices and stuff like that, I think that more companies will benefit from hearing stories like these. Thank you very much for sharing that. Again, thank you so much for spending the time to do this and sharing all of this knowledge, and this journey that you've been on, and are continuing to be on. It would great to continue to get updates from you. If people want to contact you, I know you're not on Twitter, but what's the best way to reach out to you?Patrick: I almost wish I had a Twitter. It's the developer thing to have, so maybe in the future. Just on LinkedIn would be great. LinkedIn would be great, as well as if anybody's interested in working with our team, and just figuring out how to take serverless to the next level, just hit me up on LinkedIn or look at our careers page at northone.com, and I could give you a warm intro.Jeremy: That's great. Just your last name is spelled S-T-R-Z-E-L-E-C. How do you say that again? Say it in Polish, because I know I said it wrong in the beginning.Patrick: I guess for most people it would just be Strzelec, but if there are any Slavs in the audience, it's "Strzelec." Very intense four consonants last name.Jeremy: That is a lot of consonants. Anyways again, Patrick, thanks again. This was great.Patrick: Yeah, thank you so much, Jeremy. This has been awesome.

covid-19 pr service event mba cloud id nhl platform express apollo i am banking dropping polish slack validation api python aws onboarding faa stripe item kafka ordering chime x ray sns s3 dynamo node urls choreography lambda baas zod ci cd json serverless td bank typescript queues graphql fifo observability npm elasticsearch new relic postgres aws lambda inaudible slavs apache kafka kinesis dynamodb lambdas cloudformation bancorp our vp northone sqs cloudwatch serverless framework yan cui strzelec eventbridge thundra jeremy you jeremy it patrick you cbt nuggets patrick yeah jeremy great

Episode #101: How Serverless is Becoming More Extensible with Julian Wood

Play Episode Listen Later May 17, 2021 64:40

About Julian WoodJulian Wood is a Senior Developer Advocate for the AWS Serverless Team. He loves helping developers and builders learn about, and love, how serverless technologies can transform the way they build and run applications at any scale. Julian was an infrastructure architect and manager in global enterprises and start-ups for more than 25 years before going all-in on serverless at AWS.Twitter: @julian_woodAll things Serverless @ AWS: ServerlessLandServerless Patterns CollectionServerless Office Hours – every Tuesday 10am PTLambda ExtensionsLambda Container ImagesWatch this episode on YouTube: https://youtu.be/jtNLt3Y51-gThis episode sponsored by CBT Nuggets and Lumigo.TranscriptJeremy: Hi everyone, I'm Jeremy Daly and this is Serverless Chats. Today I'm joined by Julian Wood. Hey Julian, thanks for joining me.Julian: Hey Jeremy, thank you so much for inviting me.Jeremy: Well, I am super excited to have you here. I have been following your work for a very long time and of course, big fan of AWS. So you are a Serverless Developer Advocate at AWS, and I'd love it if you could just tell the listeners a little bit about your background, so they get to know you a bit. And then also, sort of what your role is at AWS.Julian: Yeah, certainly. Well, I'm Julian Wood. I am based in London, but yeah, please don't let my accent fool you. I'm actually originally from South Africa, so the language purists aren't scratching their heads anymore. But yeah, I work within the Serverless Team at AWS, and hopefully do a number of things. First of all, explain what we're up to and how our sort of serverless things work and sort of, I like to sometimes say a bit cheekily, basically help the world fall in love with serverless as I have. And then also from the other side is to be a proxy and sort of be the voice of builders, and developers and whoever's building service applications, and be their voices internally. So you can also keep us on our toes to help build the things that will brighten your days.And just before, I've worked for too many years probably, as an infrastructure racker, stacker, architect, and manager. I've worked in global enterprises babysitting their Windows and Linux servers, and running virtualization, and doing all the operations kind of stuff to support that. But, I was always thinking there's a better way to do this and we weren't doing the best for the developers and internal customers. And so when this, you know in inverted commas, "serverless way" of things started to appear, I just knew that this was going to be the future. And I could happily leave the server side to much better and cleverer people than me. So by some weird, auspicious alignment of the stars, a while later, I managed to get my current dream job talking about serverless and talking to you.Jeremy: Yeah. Well, I tell you, I think a lot of serverless people or people who love serverless are recovering ops and infrastructure people that were doing racking and stacking. Because I too am also recovering from that and I still have nightmares.I thought that it was interesting too, how you mentioned though, developer advocacy. It's funny, you work for a specific company, AWS obviously, but even developer advocacy in general, who is that for? Who are you advocating for? Are you advocating for the developers to use the service from the company? Are you advocating for the developers so that the company can provide the services that they actually need? Interesting balance there.Julian: Yeah, it's true. I mean, the honest answer is we don't have great terms for this kind of role, but yeah, I think primarily we are advocating for the people who are developing the applications and on the outside. And to advocate for them means we've got to build the right stuff for them and get their voices internally. And there are many ways of doing that. Some people raise support requests and other kind of things, but I mean, sometimes some of our great ideas come from trolling Twitter, or yes, I know even Hacker News or that kind of thing. But also, we may get responses from 10 different people about something and that will formulate something in our brain and we'll chat with other kind of people. And that sort of starts a thing. It's not just necessarily each time, some good idea in Twitter comes in, it gets mashed into some big surface database that we all pick off.But part of our job is to be out there and try and think and be developers in whatever backgrounds we come from. And I mean, I'm not a pure software developer where I've come from, and I come, I suppose, from infrastructure, but maybe you'd call that a bit of systems engineering. So yeah, I try and bring that background to try and give input on whatever we do, hopefully, the right stuff.Jeremy: Right. Yeah. And then I think part of the job too, is just getting the information out there and getting the examples out there. And trying to create those best practices or at least surface those best practices, and encourage the community to do a lot of that work and to follow that. And you've done a lot of work with that, obviously, writing for the AWS blog. I know you have a series on the Serverless Lens and the Well-Architected Framework, and we can talk about that in a little while. But I really want to talk to you about, I guess, just the expansion of serverless over the last couple of years.I mean, it was very narrowly focused, probably, when it first came out. Lambda was ... FaaS as a whole new concept for a lot of people. And then as this progressed and we've gotten more APIs, and more services and things that it can integrate with, it just becomes complex and complicated. And that's a good thing, but also maybe a bad thing. But one of the things that AWS has done, and I think this is clearly in reaction to the developers needing it, is the ability to extend what you can do with a Lambda function, right? I mean, the idea of just putting your code in there and then, boom, that's it, that's all you have to do. That's great. But what if you do need access to lifecycle hooks? Or what if you do want to manipulate the underlying runtime or something like that? And AWS, I think has done a great job with that.So maybe we can start there. So just about the extensibility of Lambda in general. And one of the new things that was launched recently was, and recently, I don't know what was it? Seven months ago at this point? I'm not even sure. But was launched fairly recently, let's say that, is Lambda Extensions, and a couple of different flavors of that as well. Could you kind of just give the users an over, the users, wow, the listeners an overview of what Lambda Extensions are?Julian: I could hear the ops background coming in, talking about our users. Yeah. But I mean, from the get-go, serverless was always a terrible term because, why on earth would you name something for what it isn't? I mean, you know? I remember talking to DBAs, talking about noSQL, and they go, "Well, if it's not SQL, then what is it?" So we're terrible at that, serverless as well. And yeah, Lambda was very constrained when it came out. Lambda was never built being a serverless thing, that's what was the outcome. Sometimes we focus too much on the tools rather than the outcome. And the story is S3, just turning 15. And the genesis of Lambda was being an event trigger for S3, and people thought you'd upload something to S3, fire off a Lambda function, how cool is that? And then obviously the clever clubs at the time were like, "Well, hang on, let's not just do this for S3, let's do this for a whole bunch of kind of things."So Lambda was born out of that, as that got that great history, which is created an arc sort of into the present and into the future, which I know we're also going to get on about, the power of event driven applications. But the power of Lambda has always been its simplicity, and removing that operational burden, and that heavy lifting. But, sometimes that line is a bit of a gray area and there're people who can be purists about serverless and can be purists about FaaS and say, "Everything needs to be ephemeral. Lambda functions can't extend to anything else. There shouldn't be any state, shouldn't be any storage, shouldn't be any ..." All this kind of thing.And I think both of us can agree, but I don't want to speak for you, but I think both of us would agree that in some sense, yeah, that's fine. But we live in the real world and there's other stuff that needs to connect to and we're not here about building purist kind of stuff. So Lambda Extensions is a new way basically to integrate Lambda with your favorite tools. And that's the sort of headline thing we like to talk about. And the big idea is to open up Lambda to more effectively work mainly with partners, but also your own tools if you want to write them. And to sort of have deeper hooks into the Lambda lifecycle.And other partners are awesome and they do a whole bunch of stuff for serverless, plus customers also have connections to on-prem staff, or EC2 staff, or containers, or all kind of things. How can we make the tools more seamless in a way? How can we have a common set of tools maybe that you even use on-prem or in the cloud or containers or whatever? Why does Lambda have to be unique or different or that kind of thing? And Extensions is sort of one of the starts of that, is to be able to use these kind of tools and get more out of Lambda. So I mean, just the kind of tools that we've already got on board, there's things like Splunk and AppDynamics. And Lumigo, Epsagon, HashiCorp, Honeycomb, CoreLogic, Dynatrace, I can't think. Thundra and Sumo Logic, Check Point. Yeah, I'm sorry. Sorry for any partners who I've forgotten a few.Jeremy: No, right, no. That's very good. Shout them out, shout them out. No, I mean just, and not to interrupt you here, but ...Julian: No, please.Jeremy: ... I think that's great. I mean, I think that's one of the things that I like about the way that AWS deals with partners, is that ... I mean, I think AWS knows they can't solve all these problems on their own. I mean, maybe they could, right? But they would be their own way of solving the problems and there's other people who are solving these problems differently and giving you the ability to extend your Lambda functions into those partners is, there's a huge win for not only the partners because it creates that ecosystem for them, but also for AWS because it makes the product itself more valuable.Julian: Well, never mind the big win for customers because ultimately they're the one who then gets a common deployment tool, or a common observability tool, or a HashiCorp Vault that you can manage secrets and a Lambda function from HashiCorp Vault. I mean, that's super cool. I mean, also AWS services are picking this up because that's easy for them to do stuff. So if anybody's used Lambda Insights or even seen Lambda Insights in the console, it's somewhere in the monitoring thing, and you just click something over and you get this tool which can pull stuff that you can't normally get from a Lambda function. So things like CPU time and network throughput, which you couldn't normally get. But actually, under the hoods, Lambda Insights is using Lambda extensions. And you can see that if you look. It automatically adds the Lambda layer and job done.So anyway, this is how a lot of the tools work, that a layer is just added to a Lambda function and off you go, the tool can do its work. So also there's a very much a simplicity angle on this, that in a lot of cases you don't have to do anything. You configure some of the extensions via environment variables, if that's cooled you may just have an API key or a log retention value or something like that, I don't know, any kind of example of that. But you just configure that as a normal Lambda environment variable at this partner extension, which is just a Lambda layer, and off you go. Super simple.Jeremy: Right. So explain Extensions exactly, because I think that's one of those things because now we have Lambda layers and we have Lambda Extensions. And there's also like the runtime API and then something else. I mean, even I'm not 100% sure what all of the naming conventions. I'm pretty sure I know what they do ...Julian: Yeah, fair enough.Jeremy: ... but maybe we could say the names and say exactly what they do as well.Julian: Yeah, cool. You get an API, I get an API, everybody gets an API. So Lambda layers, let's just start, because that's, although it's not related to Extensions, it's how Extensions are delivered to the power core functions. And Lambda layers is just another way to add code to a Lambda function or not even code, it can be a dependency. It's just a way that you could, and it's cool because they are shareable. So you have some dependencies, or you have a library, or an SDK, or some training data for something, a Lambda layer just allows you to add some bits and bobs to your Lambda function. That's a horrible explanation. There's another word I was thinking of, because I don't want to use the word code, because it's not necessarily code, but it's dependency, whatever. It's just another way of adding something. I'll wake up in a cold sweat tonight thinking of the word I was thinking of, but anyway.But Lambda Extensions introduces a whole new companion API. So the runtime API is the little bit of code that allows your function to talk to the Lambda service. So when an event comes in, this is from the outside. This could be via API gateway or via the Lambda API, or where else, EventBridge or Step Functions or wherever. When you then transports that data rise in the Lambda services and HTTP call, and Lambda transposes that into an event and sends that onto the Lambda function. And it's that API that manages that. And just as a sidebar, what I find it cool on a sort of geeky, technical thing is, that actually API sits within the execution environment. People are like, "Oh, that's weird. Why would your Lambda API sit within the execution environment basically within the bubble that contains your function rather than it on the Lambda service?"And the cool answer for that is it's actually for a security mechanism. Like your function can then only ever talk to the Lambda runtime API, which is in that secure execution environment. And so our security can be a lot stronger because we know that no function code can ever talk directly out of your function into the Lambda service, it's all got to talk locally. And then the Lambda service gets that response from the runtime API and sends it back to the caller or whatever. Anyway, sidebar, thought that was nerdy and interesting. So what we've now done is we've released a new Extensions API. So the Extensions API is another API that an extension can use to get information from Lambda. And they're two different types of extensions, just briefly, internal and external extensions.Now, internal extensions run within the runtime process so that it's just basically another thread. So you can use this for Python or Java or something and say, when the Python runtime starts, let's start it with another parameter and also run this Java file that may do some observability, or logging, or tracing, or finding out how long the modules take to launch, for example. I know there's an example for Python. So that's one way of doing extensions. So it's internal extensions, they're two different flavors, but I'll send you a link. I'll provide a link to the blog posts before we go too far down the rabbit hole on that.And then the other part of extensions are external extensions. And this is a cool part because they actually run as completely separate processes, but still within that secure bubble, that secure execution environment that Lambda runs it. And this gives you some superpowers if you want. Because first of all, an extension can run in any language because it's a separate process. So if you've got a Node function, you could run an extension in other kind of languages. Now, what do we do recommend is you do run your extension in a compiled binary, just because you've got to provide the runtime that the extensions got to run in any way, so as a compiled binary, it's super easy and super useful. So is something like Go, a lot of people are doing because you write a single extension and Go, and then you can use it on your Node functions, your Java functions, your PowerShell functions, whatever. So that's a really good, simple way that you can have the portability.But now, what can these extensions do? Well, the extensions basically register with extensions API, and then they say to Lambda, "Lambda, I want to know about what happens when my functions invoke?" So the extension can start up, maybe it's got some initialization code, maybe it needs to connect to a database, or log into an observability platform, or pull down a secret order. That it can do, it's got its own init that can happen. And then it's basically ready to go before the function invokes. And then when the extension then registers and says, "I want to know when the function invokes and when it shuts down. Cool." And that's just something that registers with the API. Then what happens is, when a functioning invoke comes in, it tells the runtime API, "Hello, you now have an event," sends it off to the Lambda function, which the runtime manages, but also extension or extensions, multiple ones, hears information about that event. And so it can tell you the time it's going to run and has some metadata about that event. So it doesn't have the actual event data itself, but it's like the sort of Lambda context, a version of that that it's going to send to the extension.So the extension can use that to do various things. It can start collecting telemetry data. It can alter instrument some of your code. It could be managing a secret as a separate process that it is going to cache in the background. For example, we've got one with AppConfig, which is really cool. AppConfig is a service where you manage parameters external to your Lambda function. Well, each time your Lambda function warm invokes if you've got to do an external API call to retrieve that, well, it's going to be a little bit efficient. First of all, you're going to pay for it and it's going to take some time.So how about when the Lambda function runs and the extension could run before the Lambda function, why don't we just cache that locally? And then when your Lambda function runs, it just makes a local HTTP call to the extension to retrieve that value, which is going to be super quick. And some extensions are super clever because they're their own process. They will go, "Well, my value is for 30 minutes and every 30 minutes if I haven't been run, I will then update the value from that." So that's useful. Extensions can then also, when the runtime ... Sorry, let me back up.When the runtime is finished, it sends its response back to the runtime API, and extensions when they're done doing, so the runtime can send it back and the extension can carry on processing saying, "Oh, I've got the information about this. I know that this Lambda function has done X, Y, Z, so let me do, do some telemetry. Let me maybe, if I'm writing logs, I could write a log to S3 or to Kinesis or whatever. Do some kind of thing after the actual function invocation has happened." And then when it says it's ready, it says, "Hello, extensions API, I'm telling you I'm done." And then it's gone. And then Lambda freezes the execution environment, including the runtime and the extensions until another invocation happens. And the cycle then will happen.And then the last little bit that happens is, instead of an invoke coming in, we've extended the Lambda life cycles, so when the environment is going to be shut down, the extension can receive the shutdown and actually do some stuff and say, "Okay, well, I was connected to my observer HTTP platform, so let me close that connection. I've got some extra logs to flush out. I've got whatever else I need to do," and just be able to cleanly shut down that extra process that is running in parallel to the Lambda function.Jeremy: All right.Julian: So that was a lot of words.Jeremy: That was a lot and I bet you that would be great conversation for a dinner party. Really kicks things up. Now, the good news is that, first of all, thank you for that though. I mean, that's super technical and super in-depth. And for anyone listening who ...Julian: You did ask, I did warn you.Jeremy ... kind of lost their way ... Yes, but something that is really important to remember is that you likely don't have to write these yourself, right? There is all those companies you mentioned earlier, all those partners, they've already done this work. They've already figured this out and they're providing you access to their tools via this, that allows you to build things.Julian: Exactly.Jeremy: So if you want to build an extension and you want to integrate your product with Lambda or so forth, then maybe go back and listen to this at half speed. But for those of you who just want to take advantage of it because of the great functionality, a lot of these companies have already done that for you.Julian: Correct. And that's the sort of easiness thing, of just adding the Lambda layer or including in a container image. And yeah, you don't have to worry any about that, but behind the scenes, there's some really cool functionality that we're literally opening up our Lambda operates and allowing you to impact when a function responds.Jeremy: All right. All right. So let me ask another, maybe an overly technical question. I have heard, and I haven't experienced this, but that when it runs the life cycle that ends the Lambda function, I've heard something like it doesn't send the information right away, right? You have to wait for that Lambda to expire or something like that?Julian: Well, yes, for now, about to change. So currently Extensions is actually in preview. And that's not because it's in Beta or anything like that, but it's because we spoke to the partners and we didn't want to dump Extensions on the world. And all the partners had to come out with their extensions on day one and then try and figure out how customers are going to use them and everything. So what we really did, which I think in this case works out really well, is we worked with the partners and said, "Well, let's release this in preview mode and then give everybody a whole bunch of months to work out what's the best use cases, how can we best use this?" And some partners have said, "Oh, amazing. We're ready to go." And some partners have said, "Ah, it wasn't quite what we thought. Maybe we're going to wait a bit, or we're going to do something differently, or we've got some cool ideas, just give us time." And so that's what this time has been.The one other thing that has happened is we've actually added some performance enhancements during it. So yes, currently during the preview, the runtime and all extensions need to finish before we give you your response back to your Lambda function. So if you're in an asynchronous mode, you don't really care, but obviously if you're in a synchronous mode behind an API, yeah, you don't really want that. But when Extensions goes GA, which isn't going to be long, then that is no longer the case. So basically what'll happen is the runtime will respond and the result goes directly back to whoever's calling that, maybe API gateway, and the extensions can carry on, partly asynchronously in the background.Jeremy: Yep. Awesome. All right. And I know that the plan is to go GA soon. I'm not sure when around when this episode comes out, that that will be, but soon, so that's good to know that that is ...Julian: And in fact, when we go GA that performance enhancement is part of the GA. So when it goes GA, then you know, it's not something else you need to wait for.Jeremy: Perfect. Okay. All right. So let's move on to another bit of, I don't know if this is extensibility of the actual product itself or more so I think extensibility of maybe the workflow that you use to deploy to Lambda and deploy your serverless applications, and that's container image support. I mean, we've discussed it a lot. I think people kind of have an idea, but just give me your quick overview of what that is to set some context here.Julian: Yeah, sure. Well, container image support in a simple sort of headline thing is to be able to build and package your functions as a container image. So you basically build a function using a Docker file. So before if you use a zip function, but a lot of people use Serverless Framework or SAM, or whatever, that's all abstracted away from you, but it's actually creating a zip file and uploading it to Lambda or S3. So with container image support, you use a Docker file to build your Lambda function. That's the headline of what's happening.Jeremy: Right. And so the idea of creating, and this is also, and again, you mentioned packaging, right? I mean, that is the big thing here. This is a packaging format. You're not actually running the container in a Lambda function.Julian: Correct. Yeah, let's maybe think, because I mean, "containers," in inverted commas again for people who are on the audio, is ...Jeremy: What does it even mean?Julian: Yeah, exactly. And can be quite an overload of terms and definitely causes some confusion. And I sort of think maybe there's sort of four things that are in the container world. One, containers is an isolation mechanism. So on Linux, this is UNC Group, seccomp, other bits and pieces that can be used to isolate processes or maybe groups of processes. And then a second one, containers as the packaging mechanism. This is what Docker really popularized and this is about taking some code and the dependencies needed to run the code, and then packaging them all out together, maybe with some metadata to describe it.And then, three is containers as also a design philosophy. This is the idea, if we can package and isolate software, it's easier to run. Maybe smaller pieces of software is easy to reason about and manage independently. So I don't want to necessarily use microservices, but there's some component of that with it. And the emphasis here is on software rather than services, and standardized tooling to simplify your ops. And then the fourth thing is containers as an ecosystem. This is where all the products, tools, know how, all the actual things to how to do containers. And I mean, these are certain useful, but I wouldn't say there're anything about the other kind of things.What is cool and worth appreciating is how maybe independent these things are. So when I spoke about containers as isolation, well, we could actually replace containers as isolation with micro VMs such as we do with Firecracker, and there's no real change in the operational properties. So one, if we think, what are we doing with containers and why? One of those is in a way ticked off with Lambda. Lambda does have secure isolation. And containers as a packaging format. I mean, you could replace it with static linking, then maybe won't really be a change, but there's less convenience. And the design philosophy, that could really be applicable if we're talking microservices, you can have instances and certainly functions, but containers are all the same kind of thing.So if we talk about the packaging of Lambda functions, it's really for people who are more familiar with containers, why does Lambda have to be different? You've got, why does Lambda to have to be a snowflake in a way that you have to manage differently? And if you are packaging dependencies, and you're doing npm or pip install, and you're used to building Docker files, well, why can't we just do that for Lambda at the same things? And we've got some other things that come with that, larger function sizes, up to 10 gig, which is enabled with some of this technology. So it's a packaging format, but on the backend, there's a whole bunch of different stuff, which has to be done to to allow this. Benefits are, use your tooling. You've got your CI/CD pipelines already for containers, well, you can use that.Jeremy: Yeah, yeah. And I actually like that idea too. And when I first heard of it, I was like, I have nothing against containers, the containers are great. But when I was thinking about it, I'm like, "Wait container? No, what's happening here? We're losing something." But I will say, like when Lambda layers came out, which was I think maybe 2019 or something like that, maybe 2018, the idea of it made a lot of sense, being able to kind of supplement, add additional dependencies or code or whatever. But it always just seemed awkward. And some of the publishing for it was a little bit awkward. The versioning used like a numbered versioning instead of like semantic versioning and things like that. And then you had to share it to multiple places and if you published it as a SAR app, then you got global distri ... Anyways, it was a little bit hard to use.And so when you're trying to package large dependencies and put those in a layer and then combine them with a Lambda function, the other problem you had was you still had a maximum size that you could use for those, when those were combined. So I like this idea of saying like, "Look, I'd like to just kind of create this little isolate," like you said, "put my dependencies in there." Whether that's PyCharm or some other thing that is a big dependency that maybe I don't want to install, directly in a Lambda layer, or I don't want to do directly in my Lambda function. But you do that together and then that whole process just is a lot easier. And then you can actually run those containers, you could run those locally and test those if you wanted to.Julian: Correct. So that's also one of the sort of superpowers of this. And that's when I was talking about, just being able to package them up. Well, that now enables a whole bunch of extra kind of stuff. So yes, first of all is you can then use those container images that you've created as your local testing. And I know, it's silly for anyone to poo poo local testing. And we do like to say, "Well, bring your testing to the cloud rather than bringing the cloud to your testing." But testing locally for unit tests is super great. It's going to be super fast. You can iterate, have your Lambda functions, but we don't want to be mocking all of DynamoDB, all of building harebrained S3 options locally.But the cool thing is you've got the same Docker file that you're going to run in Lambda can be the same Docker file to build your function that you run locally. And it is literally exactly the same Lambda function that's going to run. And yes, that may be locally, but, with a bit of a stretch of kind of stuff, you could also run those Lambda functions elsewhere. So even if you need to run it on EC2 instances or ECS or Fargate or some kind of thing, this gives you a lot more opportunities to be able to use the same Lambda function, maybe in different way, shapes or forms, even if is on-prem. Now, obviously you can't recreate all of Lambda because that's connected to IM and it's got huge availability, and scalability, and latency and all that kind of things, but you can actually run a Lambda function in a lot more places.Jeremy: Yeah. Which is interesting. And then the other thing I had mentioned earlier was the size. So now the size of these container or these packages can be much, much bigger.Julian: Yeah, up to 10 gig. So the serverless purists in the back are shouting, "What about cold starts? What about cold starts?"Jeremy: That was my next question, yes.Julian: Yeah. I mean, back on zip functional archives are also all available, nothing changes with that Lambda layers, many people use and love, that's all available. This isn't a replacement it's just a new way of doing it. So now we've got Lambda functions that can be up to 10 gig in size and surely, surely that's got to be insane for cold starts. But actually, part of what I was talking about earlier of some of the work we've done on the backend to support this is to be able to support these super large package sizes. And the high level thing is that we actually cache those things really close to where the Lambda layer is going to be run.Now, if you run the Docker ecosystem, you build your Docker files based on base images, and so this needs to be Linux. One of the super things with the container image support is you don't have to use Amazon Linux or Amazon Linux 2 for Lambda functions, you can actually now build your Lambda functions also on Ubuntu, DBN or Alpine or whatever else. And so that also gives you a lot more functionality and flexibility. You can use the same Linux distribution, maybe across your entire suite, be it on-prem or anywhere else.Jeremy: Right. Right.Julian: And the two little components, there's an interface client, what you install, it's just another Docker layer. And that's that runtime API shim that talks to the runtime API. And then there's a runtime interface emulator and that's the thing that pretends to be Lambda, so you can shunt those events between HTTP and JSON. And that's the thing you would use to run locally. So runtime interface client means you can use any Linux distribution at the runtime interface client and you're compatible with Lambda, and then the interface emulators, what you would use for local testing, or if you want to spread your wings and run your Lambda functions elsewhere.Jeremy: Right. Awesome. Okay. So the other thing I think that container support does, I think it opens up a broader set of, or I guess a larger audience of people who are familiar with containerization and how that works, bringing those two Lambda functions. And one of the things that you really don't get when you run a container, I guess, on EC2, or, not EC2, I'm sorry, ECS, or Fargate or something like that, without kind of adding another layer on top of it, is the eventing aspect of it. I mean, Lambda just is naturally an event driven, a compute layer, right? And so, eventing and this idea of event driven applications and so forth has just become much more popular and I think much more mainstream. So what are your thoughts? What are you seeing in terms of, especially working with so many customers and businesses that are using this now, how are you seeing this sort of evolution or adoption of event driven applications?Julian: Yeah. I mean, it's quite funny to think that actually the event of an application was the genesis of Lambda rather than it being Serverless. I mentioned earlier about starting with S3. Yeah, the whole crux of Lambda has been, I respond to an event of an API gateway, or something on SQS, or via the API or anything. And so the whole point in a way of Lambda has been this event driven computing, which I think people are starting to sort of understand in a bigger thing than, "Oh, this is just the way you have to do Lambda." Because, I do think that serverless has a unique challenge where there is a new conceptual learning maybe that you have to go through. And one other thing that holds back service development is, people are used to a client's server and maybe ports and sockets. And even if you're doing containers or on-prem, or EC2, you're talking IP addresses and load balances, and sockets and firewalls, and all this kind of thing.But ultimately, when we're building these applications that are going to be composed of multiple services talking together through using APIs and events, the events is actually going to be a super part of it. And I know he is, not for so much longer, but my ultimate boss, but I can blame Jeff Bezos just a little bit, because he did say that, "If you want to talk via anything, talk via an API." And he was 100% right and that was great. But now we're sort of evolving that it doesn't just have to be an API and it doesn't have to be something behind API gateway or some API that you can run. And you can use the sort of power of events, particularly in an asynchronous model to not just be "forced" again in inverted commas to use APIs, but have far more flexibility of how data and information is going to flow through, maybe not just your application, but your suite of applications, or to and from your partners, or where that is.And ultimately authentications are going to be distributed, and maybe that is connecting to partners, that could be SaaS partners, or it's going to be an on-prem component, or maybe things in other kind of places. And those things need to communicate. And so the way of thinking about events is a super powerful way of thinking about that.Jeremy: Right. And it's not necessarily new. I mean, we've been doing web hooks for quite some time. And that idea of, something is going to happen somewhere and I want to be notified of it, is again, not a new concept. But I think certainly the way that it's evolved with Lambda and the way that other FaaS products had done eventing and things like that, is just those tight integrations and just all of the, I guess, the connective tissue that runs between those things to make sure that the events get delivered, and that you can DLQ them, and you can do all these other things with retries and stuff like that, is pretty powerful.I know you have, I actually just mentioned this on the last episode, about one of my favorite books, I think that changed my thinking and really got me thinking about how microservices communicate with one another. And that was Building Microservices by Sam Newman, which I actually said was sort of like my Bible for a couple of years, yes, I use that. So what are some of the other, like I know you have a favorite book on this.Julian: Well, that Building Microservices, Sam Newman, and I think there's a part two. I think it's part two, or there's another one ...Jeremy: Hopefully.Julian: ... in the works. I think even on O'Riley's website, you can go and see some preview copies of it. I actually haven't seen that. But yeah, I mean that is a great kind of Bible talking. And sometimes we do conflate this microservices things with a whole bunch of stuff, but if you are talking events, you're talking about separating things. But yeah, the book recommendation I have is one called Flow Architectures by James Urquhart. And James Urquhart actually works with VMware, but he's written this book which is looking sort of at the current state and also looking into the future about how does information flow through our applications and between companies and all this kind of thing.And he goes into some of the technology. When we talk about flow, we are talking about streams and we're talking about events. So streams would be, let's maybe put some AWS words around it, so streams would be something like Kinesis and events would be something like EventBridge, and topics would be SNS, and SQS would be queues. And I know we've got all these things and I wish some clever person would create the one flow service to rule them all, but we're not there. And they've got also different properties, which are helpful for different things and I know confusingly some of them merge. But James' sort of big idea is, in the future we are going to be able to moving data around between businesses, between applications. So how can we think of that as a flow? And what does that mean for designing applications and how we handle that?And Lambda is part of it, but even more nicely, I think is even some of the native integrations where you don't have to have a Lambda function. So if you've got API gateway talking to Step Functions directly, for example, well, that's even better. I mean, you don't have any code to manage and if it's certainly any code that I've written, you probably don't want to manage it. So yeah. I mean this idea of flow, Lambda's great for doing some of this moving around. But we are even evolving to be able to flow data around our applications without having to do anything and just wire up some things in a console or in a terminal.Jeremy: Right. Well, so you mentioned, someone could build the ultimate sort of flow control system or whatever. I mean, I honestly think EventBridge is very close to that. And I actually had Mike Deck on the show. I think it was like episode five. So two years ago, whenever it was when the show came out. I mean, when EventBridge came out. And we were talking and I sort of made the joke, I'm like, so this is like serverless web hooks, essentially being able, because there was the partner integrations where partners could push events onto an event bus, which they still can do. But this has evolved, right? Because the issue was always sort of like, I would have to subscribe to web books, I'd have to build a web hook to get events from a particular company. Which was great, always worked fine, but you're still maintaining that infrastructure.So EventBridge comes along, it creates these partner integrations and now you can just push an event on that now your applications, whether it's a Lambda function or other services, you can push them to an SQS queue, you can push them into a Kinesis stream, all these different destinations. You can go ahead and pull that data in and that's just there. So you don't have to worry about maintaining that infrastructure. And then, the EventBridge team went ahead and released the destination API, I think it's called.Julian: Yeah, API destinations.Jeremy: Event API destinations, right, where now you can set up these integrations with other companies, so you don't even have to make the API call yourself anymore, but instead you get all of the retries, you get the throttling, you get all that stuff kind of built in. So I mean, it's just really, really interesting where this is going. And actually, I mean, if you want to take a second to tell people about EventBridge API destinations, what that can do, because I think that now sort of creates both sides of that equation for you.Julian: It does. And I was just thinking over there, you've done a 10 times better job at explaining API destinations than I have, so you've nailed it on the head. And packet is that kind of simple. And it is just, events land up in your EventBridge and you can just pump events to any arbitrary endpoint. So it doesn't have to be in AWS, it can be on-prem. It can be to your Raspberry PI, it can literally be anywhere. But it's not just about pumping the events over there because, okay, how do we handle failover? And how do we handle over throttling? And so this is part of the extra cool goodies that came with API destinations, is that you can, for instance, if you are sending events to some external API and you only licensed for 1,000 invocations, not invocations, that could be too Lambda-ish, but 1,000 hits on the API every minute.Jeremy: Quotas. I think we call them quotas.Julian: Quotas, something like that. That's a much better term. Thank you, Jeremy. And some sort of quota, well, you can just apply that in API destinations and it'll basically store the data in the meantime in EventBridge and fire that off to the API destination. If the API destination is in that sort of throttle and if the API destination is down, well, it's going to be able to do some exponential back-off or calm down a little bit, don't over-flood this external API. And then eventually when the API does come back, it will be able to send those events. So that does just really give you excellent power rather than maintaining all these individual API endpoints yourself, and you're not handling the availability of the endpoint API, but of whatever your code is that needs to talk to that destination.Jeremy: Right. And I don't want to oversell this to anybody, but that also ...Julian: No, keep going. Keep going.Jeremy: ... adds the capability of enhanced security, because you're not exposing those API keys to your developers or anybody else, they're all baked in and stored within, the API destinations or within an EventBridge. You have the ability, you mentioned this idea of not needing Lambda to maybe talk directly, API gateway to DynamoDB or to step function or something like that. I mean, the cool thing about this is you do have translation capabilities, or transformation capabilities in EventBridge where you can transform the event. I haven't tried this, but I'm assuming it's possible to say, get an event from Salesforce and then pipe it into Stripe or some other API that you might want to pipe it into.So I mean, just that idea of having that centralized bus that can communicate with all these different things. I mean, we're talking about distributed systems here, right? So why is it different sending an event from my microservice A to my microservice B? Why can't I send it from my microservice A to company-wise, microservice B or whatever? And being able to do that in a secure, reliable, just with all of that stuff kind of built in for you, I think it's amazing. So I love EventBridge. To me EventBridge is one of those services that rivals Lambda. It's as, I guess as important as Lambda is, in this whole serverless equation.Julian: Absolutely, Jeremy. I mean, I'm just sitting here. I don't actually have to say anything. This is a brilliant interview and Jeremy, you're the expert. And you're just like laying down all of the excellent use cases. And exactly it. I mean, I like to think we've got sort of three interlinked services which do three different things, but are awesome. Lambda, we love if you need to do some processing or you need to do something that's literally your business logic. You've got EventBridge that can route data from in and out of SaaS partners to any other kind of API. And then you've got Step Functions that can do some coordination. And they all work together, but you've got three different things that really have sort of superpowers in terms of the amount of stuff you can do with it. And yes, start with them. If you land up bumping up against any kind of things that it doesn't work, well, first of all, get in touch with me, I'll work on that.But then you can maybe start thinking about, is it containers or EC2, or that kind of thing? But using literally just Lambda, Step Functions and EventBridge, okay. Yes, maybe you're going to need some queues, topics and APIs, and that kind of thing. But ...Jeremy: I was just going to say, add DynamoDB in there for some permanent state or for some data persistence. Right? Yeah. But other than that, no, I think you nailed it. Honestly, sometimes you're starting to build applications and yeah, you're right. You maybe need a queue here and there and things like that. But for the most part, no, I mean, you could build a lot with those three or four services.Julian: Yeah. Well, I mean, even think of it what you used to do before with API destinations. Maybe you drop something on a queue, you'd have Lambda pull that from a queue. You have Lambda concurrency, which would be set to five per second to then send that to an external API. If it failed going to that API, well, you've got to then dump it to Lambda destinations or to another SQS queue. You then got something ... You know, I'm going down the rabbit hole, or just put it on EventBridge ...Jeremy: You just have it magically happen.Julian: ... or we talk about removing serverless infrastructure, not normal infrastructure, and just removing even the serverless bits, which is great.Jeremy: Yeah, no. I think that's amazing. So we talked about a couple of these different services, and we talked about packaging formats and we talked about event driven applications, and all these other things. And a lot of this stuff, even though some of it may be familiar and you could probably equate it or relate it to things that developers might already know, there is still a lot of new stuff here. And I think, my biggest complaint about serverless was not about the capabilities of it, it was basically the education and the ability to get people to adopt it and understand the power behind it. So let's talk about that a little bit because ... What's that?Julian: It sounds like my job description, perfectly.Jeremy: Right. So there we go. Right, that's what you're supposed to be doing, Julian. Why aren't you doing it? No, but you are doing it. You are doing it. No, and that's why I want to talk to you about it. So you have that series on the Well-Architected Framework and we can talk about that. There's a whole bunch of really good resources on this. Obviously, you're doing videos and conferences, well, you used to be doing conferences. I think you probably still do some of those virtual ones, right? Which are not the same thing.Julian: Not quite, no.Jeremy: I mean, it was fun seeing you in Cardiff and where else were you?Julian: Yeah, Belfast.Jeremy: Cardiff and Northern Ireland.Julian: Yeah, exactly.Jeremy: Yeah, we were all over the place together.Julian: With the Guinness and all of us. It was brilliant.Jeremy: Right. So tell me a little bit about, sort of, the education process that you're trying to do. Or maybe even where you sort of see the state of Serverless education now, and just sort of where it's evolved, where we're getting best practices from, what's out there for people. And that's a really long question, but I don't know, maybe you can distill that down to something usable.Julian: No, that's quite right. I'm thinking back to my extensions explanation, which is a really long answer. So we're doing really long stuff, but that's fine. But I like to also bring this back to also thinking about the people aspect of IT. And we talk a lot about the technology and Lambda is amazing and S3 is amazing and all those kinds of things. But ultimately it is still sort of people lashing together these services and building the serverless applications, and deciding what you even need to do. And so the education is very much tied with, of course, having the products and features that do lots of kinds of things. And Serverless, there's always this lever, I suppose, between simplicity and functionality. And we are adding lots of knobs and levers and everything to Lambda to make it more feature-rich, but we've got to try and keep it simple at the same time.So there is sort of that trade-off, and of course with that, that obviously means not just the education side, but education about Lambda and serverless, but generally, how do I build applications? What do I do? And so you did mention the Well-Architected Framework. And so for people who don't know, this came out in 2015, and in 2017, there was a Serverless Lens which was added to it; what is basically serverless specific information for Well-Architected. And Well-Architected means bringing best practices to serverless applications. If you're building prod applications in the cloud, you're normally looking to build and operate them following best practices. And this is useful stuff throughout the software life cycle, it's not just at the end to tick a few boxes and go, "Yes, we've done that." So start early with the well-architected journey, it'll help you.And just sort of answer the question, am I well architected? And I mean, that is a bit of a fuzzy, what is that question? But the idea is to give you more confidence in the architecture and operations of your workloads, and that's not a goal it's in, but it's to reduce and minimize the impact of any issues that can happen. So what we do is we try and distill some of our questions and thoughts on how you could do things, and we built that into the Well-Architected Framework. And so the ServiceLens has a few questions on its operational excellence, security, reliability, performance, efficiency, and cost optimization. Excellent. I knew I was going to forget one of them and I didn't. So yeah, these are things like, how do you control access to an API? How do you do lifecycle management? How do you build resiliency into your application? All these kinds of things.And so the Well-Architected Framework with Serverless Lens there's a whole bunch of guidance to help you do that. And I have been slowly writing a blog series to literally cover all of the questions, they're nine questions in the Well-Architected Serverless Lens. And I'm about halfway through, and I had to pause because we have this little conference called re:Invent, which requires one or two slides to be created. But yeah, I'm desperately keen to pick that up again. And yeah, that's just providing some really and sort of more opinionated stuff, because the documentation is awesome and it's very in-depth and it's great when you need all that kind of stuff. But sometimes you want to know, well, okay, just tell me what to do or what do you think is best rather than these are the seven different options.Jeremy: Just tell me what to do.Julian: Yeah.Jeremy: I think that's a common question.Julian: Exactly. And I'll launch off from that to mention my colleague, James Beswick, he writes one or two things on serverless ...Jeremy: Yeah, I mean, every once in a while you see something from it. Yeah.Julian: ... every day. The Besbot machine of serverless. He's amazing. James, he's so knowledgeable and writes like a machine. He's brilliant. Yeah, I'm lucky to be on his team. So when you talk about education, I learn from him. But anyway, in a roundabout way, he's created this blog series and other series called the Lambda Operations Guide. And this is literally a whole in-depth study on how to operate Lambda. And it goes into a whole bunch of things, it's sort of linked to the Serverless Lens because there are a lot of common kind of stuff, but it's also a great read if you are more nerdily interested in Lambda than just firing off a function, just to read through it. It's written in an accessible way. And it has got a whole bunch of information on how to operate Lambda and some of the stuff under the scenes, how to work, just so you can understand it better.Jeremy: Right. Right. Yeah. And I think you mentioned this idea of confidence too. And I can tell you right now I've been writing serverless applications, well, let's see, what year is it? 2021. So I started in 2015, writing or building applications with Lambda. So I've been doing this for a while and I still get to a point every once in a while, where I'm trying to put something in cloud formation or I'm using the Serverless Framework or whatever, and you're trying to configure something and you think about, well, wait, how do I want to do this? Or is this the right way to do it? And you just have that moment where you're like, well, let me just search and see what other people are doing. And there are a lot of myths about serverless.There's as much good information is out there, there's a lot of bad information out there too. And that's something that is kind of hard to combat, but I think that maybe we could end it there. What are some of the things, the questions people are having, maybe some of the myths, maybe some of the concerns, what are those top ones that you think you could sort of ...Julian: Dispel.Jeremy: ... to tell people, dispel, yeah. That you could say, "Look, these are these aren't things to worry about. And again, go and read your blog post series, go and read James' blog post series, and you're going to get the right answers to these things."Julian: Yeah. I mean, there are misconceptions and some of them are just historical where people think the Lambda functions can only run for five minutes, they can run for 15 minutes. Lambda functions can also now run up to 10 gig of RAM. At re:Invent it was only 3 gig of RAM. That's a three times increase in Lambda functions within a three times proportional increase in CPU. So I like to say, if you had a CPU-intensive job that took 40 minutes and you couldn't run it on Lambda, you've now got three times the CPU. Maybe you can run it on Lambda and now because that would work. So yeah, some of those historical things that have just changed. We've got EFS for Lambda, that's some kind of thing you can't do state with Lambda. EFS and NFS isn't everybody's cup of tea, but that's certainly going to help some people out.And then the other big one is also cold starts. And this is an interesting one because, obviously we've sort of solved the cold start issue with connecting Lambda functions to VPC, so that's no longer an issue. And that's been a barrier for lots of people, for good reason, and that's now no longer the case. But the other thing for cold starts is interesting because, people do still get caught up at cold starts, but particularly for development because they create a Lambda function, they run it, that's a cold start and then update it and they run it and then go, oh, that's a cold start. And they don't sort of grok that the more you run your Lambda function the less cold starts you have, just because they're warm starts. And it's literally the number of Lambda functions that are running at exactly the same time will have a cold start, but then every subsequent Lambda function invocation for quite a while will be using a warm function.And so as it ramps up, we see, in the small percentages of cold starts that are actually going to happen. And when we're talking again about the container image support, that's got a whole bunch of complexity, which people are trying to understand. Hopefully, people are learning from this podcast about that as well. But also with the cold starts with that, those are huge and they're particular ways that you can construct your Lambda functions to really reduce those cold starts, and it's best practices anyway. But yeah, cold starts is also definitely one of those myths. And the other one ...Jeremy: Well, one note on cold starts too, just as something that I find to be interesting. I know that we, I even had to spend time battling with that earlier on, especially with VPC cold starts, that's all sort of gone away now, so much more efficient. The other thing is like provision concurrency. If you're using provision concurrency to get your cold starts down, I'm not even sure that's the right use for provision concurrency. I think provision concurrency is more just to make sure you have enough capacity because of the ramp-up time for Lambda. You certainly can use it for cold starts, but I don't think you need to, that's just my two cents on that.Julian: Yeah. No, that is true. And they're two different use cases for the same kind of thing. Yeah. As you say, Lambda is pretty scalable, but there is a bit of a ramp-up to get up to many, many, many, many thousands or tens of thousands of concurrent executions. And so yeah, using provision currency, you can get that up in advance. And yeah, some people do also use it for provision concurrency for getting those cold starts done. And yet that is another very valid use case, but it's only an issue for synchronous workloads as well. Anything that is synchronous you really shouldn't be carrying too much. Other than for cost perspective because it's going to take longer to run.Jeremy: Sure. Sure. I have a feeling that the last one you were going to mention, because this one bugs me quite a bit, is this idea of no ops or some people call it ops-less, which I think is kind of funny. But that's one of those things where, oh, it drives me nuts when I hear this.Julian: Yeah, exactly. And it's a frustrating thing. And I think often, sometimes when people are talking about no ops, they either have something to sell you. And sometimes what they're selling you is getting rid of something, which never is the case. It's not as though we develop serverless applications and we can then get rid of half of our development team, it just doesn't work like that. And it's crazy, in fact. And when I was talking about the people aspect of IT, this is a super important thing. And me coming from an infrastructure background, everybody is dying in their jobs to do more meaningful work and to do more interesting things and have the agility to try those experiments or try something else. Or do something that's better or even improve the way your build or improve the way your CI/CD pipeline runs or anything, rather than just having to do a lot of work in the lower levels.And this is what serverless really helps you do, is to be able to, we'll take over a whole lot of the ops for you, but it's not all of the ops, because in a way there's never an end to ops. Because you can always do stuff better. And it's not just the operations of deploying Lambda functions and limits and all that kind of thing. But I mean, think of observability and not knowing just about your application, but knowing about your business. Think of if you had the time that you weren't just monitoring function invocations and monitoring how long things were happening, but imagine if you were able to pull together dashboards of exactly what each transaction costs as it flows through your whole entire application. Think of the benefit of that to your business, or think of the benefit that in real-time, even if it's on Lambda function usage or something, you can say, "Well, oh, there's an immediate drop-off or pick-up in one region in the world or one particular application." You can spot that immediately. That kind of stuff, you just haven't had time to play with to actually build.But if we can take over some of the operational stuff with you and run one or two or trillions of Lambda functions in the background, just to keep this all ticking along nicely, you're always going to have an opportunity to do more ops. But I think the exciting bit is that ops is not just IT infrastructure, plumbing ops, but you can start even doing even better business ops where you can have more business visibility and more cool stuff for your business because we're not writing apps just for funsies.Jeremy: Right. Right. And I think that's probably maybe a good way to describe serverless, is it allows you to focus on more meaningful work and more meaningful tasks maybe. Or maybe not more meaningful, but more impactful on the business. Anyways, Julian, listen, this was a great conversation. I appreciate it. I appreciate the work that you're doing over at AWS ...Julian: Thank you.Jeremy: ... and the stuff that you're doing. And I hope that there will be a conference soon that we will be able to attend together ...Julian: I hope so too.Jeremy: ... maybe grab a drink. So if people want to get a hold of you or find out more about serverless and what AWS is doing with that, how do they do that?Julian: Yeah, absolutely. Well, please get hold of me anytime on Twitter, is the easiest way probably, julian_wood. Happy to answer your question about anything Serverless or Lambda. And if I don't know the answer, I'll always ask Jeremy, so you're covered twice over there. And then, three different things. James is, if you're talking specifically Lambda, James Beswick's operations guide, have a look at that. Just so much nuggets of super information. We've got another thing we did just sort of jump around, you were talking about cloud formation and the spark was going off in my head. We have something which we're calling the Serverless Patterns Collection, and this is really super cool. We didn't quite get to talk about it, but if you're building applications using SAM or serverless application model, or using the CDK, so either way, we've got a whole bunch of patterns which you can grab.So if you're pulling something from S3 to Lambda, or from Lambda to EventBridge, or SNS to SQS with a filter, all these kind of things, they're literally copy and paste patterns that you can put immediately into your cloud formation or your CDK templates. So when you are down the rabbit hole of Hacker News or Reddit or Stack Overflow, this is another resource that you can use to copy and paste. So go for that. And that's all hosted on our cool site called serverlessland.com. So that's serverlessland.com and that's an aggregation site that we run because we've got video talks, and we've got blog posts, and we've got learning path series, and we've got a whole bunch of stuff. Personally, I've got a learning path series coming out shortly on Lambda extensions and also one on Lambda observability. There's one coming out shortly on container image supports. And our team is talking all over as many things as we can virtually. I'm actually speaking about container images of DockerCon, which is coming up, which is exciting.And yeah, so serverlessland.com, that's got a whole bunch of information. That's just an easy one-stop-shop where you can get as much information about AWS services as you can. And if not yet, get in touch, I'm happy to help. I'm happy to also carry your feedback. And yeah, at the moment, just inside, we're sort of doing our planning for the next cycle of what Lambda and what all the service stuff we're going to do. So if you've got an awesome idea, please send it on. And I'm sure you'll be super excited when something pops out in the near issue, maybe just in future for a cool new functionality you could have been involved in.Jeremy: Well, I know that serverlessland.com is an excellent resource, and it's not that the AWS Compute blog is hard to parse through or anything, but serverlessland.com is certainly a much easier resource to get there. S

bible benefits south africa fall in love ga reddit cloud honestly jeff bezos windows ip saas beta ram personally salesforce northern ireland api belfast guinness python aws linux java cardiff faa apis stripe alpine invent sar ubuntu cpu extensions checkpoint vmware sns raspberry pi s3 sql docker node sdks lambda baas stack overflow splunk ci cd firecrackers json serverless honeycomb vms ecs nosql powershell woodall corelogic hashicorp hacker news nfs cdk ec2 efs dynatrace dbn appdynamics dbas kinesis sam newman vpc dynamodb sumo logic senior developer advocate extensible fargate pycharm sqs dockercon building microservices serverless framework julian wood amazon linux james urquhart eventbridge jeremy you thundra cbt nuggets jeremy so julian it

Episode #97: How Serverless Fits in to the Cyclical Nature of the Industry with Gojko Adzic