Podcasts about feature flags

  • 87 PODCASTS
  • 129 EPISODES
  • 39m AVG DURATION
  • 1 EPISODE EVERY OTHER WEEK
  • May 13, 2025 LATEST

Best podcasts about feature flags

Latest podcast episodes about feature flags

COMPRESSEDfm
203 | Feature Flags, Framework Wars, and Landing Your Next Dev Job

COMPRESSEDfm

Play Episode Listen Later May 13, 2025 46:34


In this hosts-only episode, Amy and Brad get real about the developer experience - from the stress of job interviews to the complexities of choosing the right framework. They discuss why companies are comparing candidates more than ever, share strategies for answering behavioral interview questions, and debate the merits of Remix versus Next.js (spoiler: Brad's all-in on Remix). The conversation shifts to feature flags and progressive rollouts, with insights from Brad's work at Stripe.

Sponsor: WorkOS helps you launch enterprise features like SSO and user management with ease. Thanks to the AuthKit SDK for JavaScript, your team can integrate in minutes and focus on what truly matters—building your app.

Chapter Marks
00:00 - Intro
00:41 - Sponsor: WorkOS
01:47 - Brad's Keyboard and Mouse Shopping Spree
04:30 - Keyboard Layout Discussion
07:23 - Apple Ecosystem: Reminders and Notes
09:23 - Family Sharing and Raycast Integration
09:43 - Notion vs Apple Notes for Project Management
11:31 - File Storage and Backup Strategies
14:00 - Machine Backup Philosophy
16:46 - Job Interview Preparation Tips
19:40 - Answering the "Weakness" Question
21:53 - Addressing Weaknesses: Delegation Examples
24:29 - Conflict Resolution Interview Questions
25:46 - Company Research Before Interviews
27:00 - Tech Stack Considerations: Remix vs Next.js
28:30 - Framework Migration Decisions
29:30 - Astro for Content Sites
31:02 - Backend Languages: Go vs TypeScript
32:30 - React Server Components Future
34:23 - Feature Flags and Boolean as a Service
35:30 - Feature Flag Segmentation and A/B Testing
36:54 - PostHog and Analytics Tools
38:30 - Progressive Rollouts and Error Monitoring
40:20 - Amy's Picks and Plugs
43:35 - Brad's Picks and Plugs

The Mob Mentality Show
No Branches?! Ron Cohen Breaks Down Trunk Based Development and Feature Flags (For Real)

The Mob Mentality Show

Play Episode Listen Later Apr 14, 2025 43:48


Voice of the DBA
Using Feature Flags

Voice of the DBA

Play Episode Listen Later Apr 6, 2025 3:03


The use of feature flags in software development has become more and more prevalent over time, especially as teams move to DevOps-style development with frequent releases. I've often thought that using feature flags allows technical people to separate the deployment of a feature or change from its release to users. There are a number of articles on this style of work (feature flag driven development, Why Use Feature Flags?) as well as a discussion at Reddit. I am a big believer that feature flags can improve your software in many ways. These articles (and others) highlight the advantages a software organization gains by using feature flags. Failed releases become less of an issue, as the specific change that doesn't work can be turned off. This can even work with databases. I can deploy a database change and at a later time have the code (or new table/column) start being used when a feature flag is set. If there is an issue, I can turn off the feature flag and stop using the code (or populating the schema). I can then clean things up, even saving data before I make a change. Read the rest of Using Feature Flags
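As a rough illustration of that deploy-now, release-later idea for a database change, here is a minimal sketch; the flag, type, and column names are invented for illustration and are not from the article:

```typescript
// Minimal sketch of deploy-now, release-later for a database change.
// The flag name, Order type, and currencyCode column are illustrative only.
type Order = {
  id: string;
  total: number;
  currencyCode?: string; // new nullable column, already deployed to the schema
};

// Flip to true to "release" the change; flip back to false to roll back instantly.
const flags = { useCurrencyColumn: false };

function displayTotal(order: Order): string {
  if (flags.useCurrencyColumn && order.currencyCode) {
    // New path: only read the new column once the flag is on.
    return `${order.total.toFixed(2)} ${order.currencyCode}`;
  }
  // Old path: behavior is unchanged while the flag stays off.
  return `$${order.total.toFixed(2)}`;
}

console.log(displayTotal({ id: "o-1", total: 12.5, currencyCode: "EUR" })); // "$12.50" until the flag flips
```

Turning the flag off restores the old behavior without touching the deployed schema, which is the rollback story the post describes.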

php[podcast] episodes from php[architect]
The PHP Podcast: 2025.03.27

php[podcast] episodes from php[architect]

Play Episode Listen Later Mar 28, 2025 63:01


This week on the PHP Podcast, Eric and John talk about Laravel Cloud, Feature Flags, PHP Tek 2025, PHPxSan, and more…

Links from the show:
Laravel Cloud achieves SOC 2 Type 1 Compliance – The Laravel Blog
Flipt
Laravel Pennant – Laravel 12.x – The PHP Framework For Web Artisans
PHP[TEK] 2025 – May 20th […]

The post The PHP Podcast: 2025.03.27 appeared first on PHP Architect.

The Mob Mentality Show
When TDD Meets R&D: How to Keep Small Steps & Fast Feedback Loops in High Uncertainty

The Mob Mentality Show

Play Episode Listen Later Feb 17, 2025 26:10


How do you balance small, iterative progress with the vast unknowns of research and development (R&D)? Can test-driven development (TDD) literally or "in spirit" still provide value when you're navigating uncharted territory? In this episode of the Mob Mentality Show, we dive deep into the intersection of R&D Mobbing and software development, exploring real-world scenarios, strategies, and challenges teams face when innovating under uncertainty. What You'll Learn in This Episode:

No Compromises
Feature flags: Temporary tool or permanent solution?

No Compromises

Play Episode Listen Later Dec 21, 2024 10:13 Transcription Available


Joel and Aaron dive into a friendly debate about the true nature of feature flags in software development. Drawing from their varied experiences across different programming languages and environments, they explore whether feature flags should always be temporary or can serve permanent purposes. The discussion evolves from a simple disagreement into deeper insights about different architectural approaches.

(00:00) - Newsletter tips undergo careful peer review process
(02:15) - Debating if feature flags should be temporary
(05:25) - Different uses of feature flags across languages
(07:20) - Feature flags in modern Laravel applications
(08:35) - Silly Bit

Sign up for free to get those amazing Laravel tips delivered each day
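A small sketch of the two lifecycles being debated follows; all names are invented, and this is not code from the episode:

```typescript
// A temporary release flag, deleted once rollout finishes, versus a
// permanent operational flag that stays around as a kill switch.
const releaseFlags = {
  newCheckoutFlow: true, // temporary: remove the flag and the legacy branch after full rollout
};

const operationalFlags = {
  recommendationsEnabled: true, // permanent: ops can turn this off during an incident
};

function checkoutPage(): string {
  return releaseFlags.newCheckoutFlow ? "new checkout" : "legacy checkout";
}

function homePage(): string {
  // Degrade gracefully instead of failing when the recommendations service is unhealthy.
  return operationalFlags.recommendationsEnabled ? "home + recommendations" : "home";
}

console.log(checkoutPage(), homePage());
```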

The Bootstrapped Founder
363: Ben Rometsch — From Side Projects to Industry Giants

The Bootstrapped Founder

Play Episode Listen Later Dec 18, 2024 53:28 Transcription Available


Ben Rometsch (@dabeeeenster.bsky.social), the founder of Flagsmith, created a bootstrapped SaaS success story. Feature flags are transforming software deployment by decoupling releases and enhancing control. And Ben bootstrapped this deceptively simple-looking part of engineering into a significant software business. And then there's the open-source part of all that. The Open Feature project is setting new standards in software development, akin to OpenTelemetry. Ben shares insights into this collaborative open-source initiative and takes you on a decade-long journey running a software agency in London, where creativity thrived, leading to the creation of a cost-effective, open-source feature flag tool now used by major companies. We even get to the parallels between Brexit and business growth as Ben discusses breaking growth ceilings and the challenges of venture capital. You'll hear about a pivotal deal during the pandemic and how it set off a massive growth spurt that was previously impossible.

Ben and I both value slow, sustainable growth without VC pressures. But it comes with its own challenges, like balancing monetization strategies while maintaining a sustainable open-source project. Join us for a conversation about building a business with purpose.

This episode is sponsored by Paddle.com — if you're looking for a payment platform that works for you so you can focus on what matters, check them out.

The blog post: https://thebootstrappedfounder.com/ben-rometsch-from-side-projects-to-industry-giants
The podcast episode: https://tbf.fm/episodes/363-ben-rometsch-from-side-projects-to-industry-giants
Check out Podscan to get alerts when you're mentioned on podcasts: https://podscan.fm
Send me a voicemail on Podline: https://podline.fm/arvid
You'll find my weekly article on my blog: https://thebootstrappedfounder.com
Podcast: https://thebootstrappedfounder.com/podcast
Newsletter: https://thebootstrappedfounder.com/newsletter
My book Zero to Sold: https://zerotosold.com/
My book The Embedded Entrepreneur: https://embeddedentrepreneur.com/
My course Find Your Following: https://findyourfollowing.com

Here are a few tools I use. Using my affiliate links will support my work at no additional cost to you.
- Notion (which I use to organize, write, coordinate, and archive my podcast + newsletter): https://affiliate.notion.so/465mv1536drx
- Riverside.fm (that's what I recorded this episode with): https://riverside.fm/?via=arvid
- TweetHunter (for speedy scheduling and writing Tweets): http://tweethunter.io/?via=arvid
- HypeFury (for massive Twitter analytics and scheduling): https://hypefury.com/?via=arvid60
- AudioPen (for taking voice notes and getting amazing summaries): https://audiopen.ai/?aff=PXErZ
- Descript (for word-based video editing, subtitles, and clips): https://www.descript.com/?lmref=3cf39Q
- ConvertKit (for email lists, newsletters, even finding sponsors): https://convertkit.com?lmref=bN9CZw

Arguing Agile Podcast
AA190 - Navigating Product-Engineering Conflicts: A Coaching Session

Arguing Agile Podcast

Play Episode Listen Later Nov 13, 2024 42:18 Transcription Available


Have you been in a situation where engineering leadership and product management do not see eye to eye? In this episode, Enterprise Business Agility Coach Om Patel interviews and coaches Product Manager Brian Orlando on challenges product managers face when working with engineering teams, leads, and managers. Listen/watch to learn tactics for defusing a potentially difficult situation, including:

Strategies for effective spike work and time-boxing
The importance of frequent check-ins and demos
Spotting when tech leads aren't aligned with modern dev practices
Key takeaways from "Accelerate" and its relevance
The value of being willing to abandon unsuccessful features

Whether you're a product manager struggling with team dynamics or an engineering leader looking to improve collaboration, this episode is packed to the tippy-top with valuable and practical advice you can start using - right meow!

#ProductManagement #Agile #EngineeringLeadership #ContinuousImprovement #DevOps

= = = = = = = = = = = =
Watch it on YouTube
= = = = = = = = = = = =
Subscribe to our YouTube Channel: https://www.youtube.com/channel/UC8XUSoJPxGPI8EtuUAHOb6g?sub_confirmation=1
Apple Podcasts: https://podcasts.apple.com/us/podcast/agile-podcast/id1568557596
Spotify: https://open.spotify.com/show/362QvYORmtZRKAeTAE57v3
Amazon Music: https://music.amazon.com/podcasts/ee3506fc-38f2-46d1-a301-79681c55ed82/Agile-Podcast
= = = = = = = = = = = =
Toronto Is My Beat (Music Sample)
By Whitewolf (Source: https://ccmixter.org/files/whitewolf225/60181)
CC BY 4.0 DEED (https://creativecommons.org/licenses/by/4.0/deed.en)

COMPRESSEDfm
186 | Breaking into Tech through Open Source

COMPRESSEDfm

Play Episode Listen Later Nov 8, 2024 52:39


In this episode, Chris Nowicki shares his path from aerospace to web development and the unique challenges of transitioning into tech. Chris and James discuss how Chris got involved in the open-source project "Deals for Devs," including the tech stack, managing contributions, and handling obstacles. This episode offers a first-hand look at the value of community in development and tips for new devs on getting started in open source.

Sponsor: Postman is an API platform for building and using APIs. Postman simplifies each step of the API lifecycle and streamlines collaboration so you can create better APIs—faster.

Show Notes
00:00 - Intro
01:08 - Chris Nowicki's Journey into Tech
02:12 - Bootcamp Experience and Structure
05:07 - Breaking into Tech Through Community Involvement
08:38 - Deals for Devs: The Project Origin
11:10 - Sponsor Message: Postman
12:06 - Tech Stack Overview for Deals for Devs
13:22 - Tech Stack: Resend, React Email, Tailwind, and Xata
17:00 - Prisma Integration with Xata
20:00 - Challenges in Managing Community Projects
23:54 - Planning and Issue Management for Deals for Devs
28:00 - Feature Flags and Release Management
37:15 - Subscription System Workflow
45:45 - Creating a Dynamic Email Subscription System
51:58 - Managing Admin and Approval for Deals
52:26 - Closing

Links
OpenSauced
RedwoodJS
Deals for Devs Project
Postman
React Email
Vercel
Xata
Resend
Frontend Mentor
LaunchDarkly
Grid Iron Survivor
Dev.to article on CRON jobs

Engineering Kiosk
#143 Ship It! Deployment-Strategien und Anti-Patterns auf der letzten Meile

Engineering Kiosk

Play Episode Listen Later Oct 1, 2024 76:45


Your code is worth nothing until it's in production! Many software developers have found themselves iterating over their own source code again and again just to make it a little prettier. As much fun as that is, the best feeling is still when someone actually uses your code. And that only happens once you deploy it.

Or, put more bluntly: your source code is worth nothing until it is in production and can be used by customers. Sounds harsh, but it's a fact. That's why this podcast episode is all about deployment.

We talk about anti-patterns such as manual deployments, big-bang deployments, and deployment monoliths. We look at the deployment challenges we've already run into over our careers, such as caching, CDNs, deploying under heavy load, or rolling out database changes, and take a tour through different deployment styles, including canary deployments, the blue-green strategy, feature flags, and shadow deployments (also known as dark launches).

Finally, we put the question on the table: when did you last test your rollback?

Bonus: how do you make a podcast episode about deployment without mentioning Continuous Delivery and Continuous Deployment (CD)?

Quick feedback on the episode:

PodRocket - A web development podcast from LogRocket
Custom DevTools for your React App with Cory House

PodRocket - A web development podcast from LogRocket

Play Episode Listen Later Sep 18, 2024 32:32


React and JavaScript expert Cory House discusses the creation of custom development tools for React applications, sharing insights from his recent talk at React Rally and exploring how the right tools can shape development workflows and enhance automated testing strategies.

Links
https://www.bitnative.com
https://github.com/coryhouse/ama
https://x.com/housecor
https://github.com/coryhouse
https://stackoverflow.com/users/26180/cory-house
https://www.linkedin.com/in/coryhouse
https://www.pluralsight.com/authors/cory-house
https://www.reactjsconsulting.com

We want to hear from you! How did you find us? Did you see us on Twitter? In a newsletter? Or maybe we were recommended by a friend? Let us know by sending an email to our producer, Emily, at emily.kochanekketner@logrocket.com, or tweet at us at PodRocketPod (https://twitter.com/PodRocketpod).

Follow us. Get free stickers. Follow us on Apple Podcasts, fill out this form (https://podrocket.logrocket.com/get-podrocket-stickers), and we'll send you free PodRocket stickers!

What does LogRocket do? LogRocket provides AI-first session replay and analytics that surfaces the UX and technical issues impacting user experiences. Start understanding where your users are struggling by trying it for free at LogRocket.com (https://logrocket.com/signup/?pdr).

Special Guest: Cory House.

Maintainable
Ryosuke Iwanaga: The Benefits of Cell-Based Architecture

Maintainable

Play Episode Listen Later Aug 8, 2024 42:26


Ryosuke shares his insights on:

Ownership in Software Maintenance: The role of single-threaded ownership and dedicated teams in maintaining software and shared libraries.
Technical Debt: How his definition of technical debt has evolved over the years and strategies to manage it effectively.
Monitoring and Alarming: The importance of comprehensive monitoring and alarming systems in handling legacy software and ensuring reliability.
Change Management: Best practices for change management, including preparing for worst-case scenarios and automating processes to reduce risks.
Phased Rollouts and Feature Flags: Implementing phased rollouts and using feature flags to manage changes safely and gradually.
Cell-Based Architecture: How cell-based architecture enhances scalability and reliability, and the challenges of maintaining multi-cell systems.
Operational Excellence: Continuous deployment, regular dashboard reviews, and technologies used in orchestration to achieve operational excellence.

Ryosuke also discusses his current role and responsibilities as a software engineer and his consulting work with OpsBR, where he helps organizations raise their operational standards.

Resources Mentioned
Ryosuke Iwanaga on LinkedIn
OpsBR Software Technology Inc.
Cell-Based Architecture

Tune in to this insightful episode to learn more about maintaining healthy and scalable software systems.

About the Guest:
Ryosuke Iwanaga is the President of OpsBR Software Technology Inc. He has extensive experience in software engineering, including roles in sales engineering, support engineering, and data center operations. Ryosuke is passionate about operational excellence and helping organizations improve their software systems.

Follow Ryosuke on Social Media:
LinkedIn

Subscribe to Maintainable on:
Apple Podcasts
Spotify
Or search "Maintainable" wherever you stream your podcasts.

Keep up to date with the Maintainable Podcast by joining the newsletter.

Front-End Fire
News: Google Backs Off Blocking Cookies, New CSS Features, and Vercel's Feature Flags SDK

Front-End Fire

Play Episode Listen Later Aug 5, 2024 42:15


Google is making headline news once again as it reverses course on a decision to block third-party cookies in its Chrome browser. After years of testing, planning, and delays, Google scrapped a plan to turn off third-party cookie tracking by default like Safari and Firefox already do.

In other news, the annual CSS Working Group meeting wrapped up recently, and some of the exciting features the group will be focusing on this year include: the if() statement for conditional styling, cross-document view transitions without the need for a JavaScript library, and (perhaps the most anticipated feature) cleaner, easier CSS anchor positioning.

Vercel introduces feature flags in Next.js and SvelteKit with Vercel's Flags SDK. The Flags SDK works with any feature flag provider, and sits between the application and the source of the flags to help devs follow best practices for using feature flags, while keeping websites fast.

And finally, Reddit has doubled down on blocking search engine crawlers from surfacing new posts and comments in recent weeks, and as of now, Google is the only mainstream search engine that's made a deal that will allow it to index new search results when users search for posts on Reddit.

News:
Paige - Exciting new CSS features coming out of this year's CSSWG meeting
Jack - Feature Flag Support from Vercel
TJ - Chrome is no longer removing third-party cookies

Bonus News:
Reddit is now blocking all non-Google search engines and AI bots
All the video talks from React Conf 2024 are available

What Makes Us Happy this Week:
Paige - Apple Watch SE
Jack - 3D printing (Autodesk Fusion 360 program)
TJ - 2024 Paris Olympics

Thanks as always to our sponsor, the Blue Collar Coder channel on YouTube. You can join us in our Discord channel, explore our website and reach us via email, or Tweet us on X @front_end_fire.
Front-end Fire website
Blue Collar Coder on YouTube
Blue Collar Coder on Discord
Reach out via email
Tweet at us on X @front_end_fire
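The "sits between the application and the source of the flags" idea can be sketched as a thin adapter layer. This is illustrative only and is not the actual Vercel Flags SDK API; all names are invented:

```typescript
// Rough sketch of an SDK sitting between the app and the flag source,
// so the provider can be swapped without touching call sites.
interface FlagProvider {
  isEnabled(key: string, context: { userId?: string }): Promise<boolean>;
}

function defineFlag(key: string, provider: FlagProvider) {
  // The app calls this function; it never talks to the provider directly.
  return (context: { userId?: string } = {}) => provider.isEnabled(key, context);
}

// A trivial provider backed by environment variables; a real adapter would
// call LaunchDarkly, PostHog, Flagsmith, and so on behind the same interface.
const envProvider: FlagProvider = {
  async isEnabled(key) {
    return process.env[`FLAG_${key.toUpperCase().replace(/-/g, "_")}`] === "1";
  },
};

export const showNewNavbar = defineFlag("new-navbar", envProvider);
// Usage: if (await showNewNavbar({ userId: "u_123" })) { /* render the new navbar */ }
```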

My life as a programmer
What about feature flags?

My life as a programmer

Play Episode Listen Later Aug 4, 2024 11:22


What about feature flags?

The Mob Mentality Show
Crafting Lean Software: Dave Adsit on Small Batches and Short Lead Times

The Mob Mentality Show

Play Episode Listen Later Jul 30, 2024 45:23


Join us in this thoughtful episode of the Mob Mentality Show as we explore the world of Lean Software Development with Dave Adsit. Titled "Crafting Lean Software: Dave Adsit on Small Batches and Short Lead Times," this episode provides valuable insights for those looking to enhance their software development values and practices. Dave Adsit shares his experiences on how to effectively implement lean principles to achieve small batches, short lead times, and frequent releases.

Key Discussion Points:

Lean Software Development
- Craft vs. Engineering
- Principles of Flow
- Waterfall vs. "Agile" vs. Lean
- Timeboxes vs. Scope-Boxes
- Resource vs. Flow Efficiency
- Prioritization, Prototyping, and Lean Investment Bets
- Single Piece Flow, Feature Flags, Continuous Delivery
- Maximal Learning through Experimentation and a 50% Product Bet Success Rate

Collaboration
- Integration with Lean
- "All Hands on Deck" Mindset
- Relation to WIP Limits
- Pair and Mob Programming
- Failures and Lessons
- Rules, Why, and Learning Paths
- Utilization and Person vs. Team vs. System Value

Continuous Improvement
- Core Value
- Innovative vs. Inert Practices
- Deep vs. Shallow Learning
- Leading Learning Opportunities
- Knowing Enough to Make Informed Decisions
- What If Some Do Not Want to Learn?
- Rock Star vs. Super-Star

Video and Show Notes: https://youtu.be/LgAMUGtdXGA

My life as a programmer
Feature flags vs feature branches?

My life as a programmer

Play Episode Listen Later Jul 27, 2024 10:45


Feature flags vs feature branches?

What the Dev?
270: Solving the issue of stale feature flags (with Lekko's Konrad Niemic)

What the Dev?

Play Episode Listen Later Jul 23, 2024 11:59


In this episode, SD Times editor-in-chief David Rubinstein speaks to Konrad Niemic, founder and CEO of Lekko, about:

What feature flags are
"Stale flags" and why they're an issue
Why dynamic feature flags help cut down on stale flags
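One hedged way to keep stale flags from piling up is to record an owner and an intended removal date for every flag and report the ones past their date. This sketch is not Lekko's approach; the names are invented for illustration:

```typescript
// A flag that is permanently on past its removal date is dead weight:
// the branch it guards never changes again, but still has to be maintained.
interface FlagDefinition {
  key: string;
  owner: string;
  removeBy: Date; // intended cleanup date
  enabled: boolean;
}

const registry: FlagDefinition[] = [
  { key: "new-billing-page", owner: "payments-team", removeBy: new Date("2024-06-01"), enabled: true },
];

function staleFlags(now: Date = new Date()): FlagDefinition[] {
  return registry.filter((flag) => flag.enabled && flag.removeBy.getTime() < now.getTime());
}

console.log(staleFlags().map((flag) => `${flag.key} (owner: ${flag.owner})`));
```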

North Meets South Web Podcast
The one with feature flags

North Meets South Web Podcast

Play Episode Listen Later Jul 11, 2024 36:44 Transcription Available


In this episode, Jake and Michael discuss feature flags, particularly the freshly-released before hook, and the perils of incorrect eager loading as your application scales.

Show links
Fool's mate
Tim MacDonald
Introduce 'before' hook

Laravel News Podcast
Third-party relations, managing features, and asserting JSON

Laravel News Podcast

Play Episode Listen Later Jun 20, 2024 34:16


Jake and Michael discuss all the latest Laravel releases, tutorials, and happenings in the community.

This episode is sponsored by Mailtrap, an Email Delivery Platform that developers love. An email-sending solution with industry-best analytics, SMTP, and email API, SDKs for major programming languages, and 24/7 human support. Try for Free at MAILTRAP.IO

Show links
View Third-party Relations in model:show - Now Available in Laravel 11.11
Sentry and Laravel announce a new partnership
Laravel Herd v1.7 is out with updates to the dump UI
Create a DateTime from a Timestamp With this New Method Coming to PHP 8.4
Manage Events, Feature Flags, and More with PostHog for Laravel
Randomize Command Execution Time with the Chaotic Schedule Package for Laravel
Share Error Package for Laravel's New Exception Page
Neovim Plugin for Navigating Laravel and Livewire Components
Asserting a JSON Response Structure in Laravel

North Meets South Web Podcast
Music, feature flags, and making the new one do what the old one did

North Meets South Web Podcast

Play Episode Listen Later May 29, 2024 43:43 Transcription Available


In this episode, Jake and Michael discuss music we're into at the moment, using Pennant for feature flags in Laravel, and the age-old set of requirements: "it needs to do everything the old one did".

Show links
Audio Reign
Louis Cole
Vulfpeck
Burn the Jukebox
Laracon AU

Troubleshooting Agile
One- and Two-Way Doors

Troubleshooting Agile

Play Episode Listen Later Jan 24, 2024 14:20


How can you do risky experiments even in the most risk-averse organisations? Find the answers on this week's episode as Squirrel and Jeffrey discuss the value of two-way doors and reversible decisions for your tech team and your product. Links: - Jeff Bezos on Doors: https://www.inc.com/jeff-haden/amazon-founder-jeff-bezos-this-is-how-successful-people-make-such-smart-decisions.html - Feature Flags: https://martinfowler.com/articles/feature-toggles.html -------------------------------------------------- Order your copy of our book, Agile Conversations at agileconversations.com Plus, get access to a free mini training video about the technique of Coherence Building when you join our mailing list. We'd love to hear any thoughts, ideas, or feedback you have about the show. Email us at info@agileconversations.com -------------------------------------------------- About Your Hosts Douglas Squirrel and Jeffrey Fredrick first met while working together at TIM group in 2013. A decade later, they remain united in their passion for growing organisations through better conversations. Squirrel is an advisor, author, keynote speaker, coach, and consultant, helping companies of all sizes make huge, profitable improvements in their culture, skills, and processes. You can find out more about his work here: https://douglassquirrel.com/index.html Jeffrey is Vice President of Engineering at ION Analytics, Organiser at CITCON, the Continuous Integration and Testing Conference, author and speaker. You can connect with him here: https://www.linkedin.com/in/jfredrick/

Modernize or Die ® Podcast - CFML News Edition
Modernize or Die® - CFML News Podcast for January 23rd, 2024 - Episode 210

Modernize or Die ® Podcast - CFML News Edition

Play Episode Listen Later Jan 23, 2024 65:04


2024-01-23 Weekly News — Episode 210Watch the video version on YouTube at https://www.youtube.com/watch?v=K2-hjkIsSvg Hosts: Gavin Pickin - Senior Developer at Ortus SolutionsEric Peterson - Senior Developer at Ortus SolutionsThanks to our Sponsor - Ortus SolutionsThe makers of ColdBox, CommandBox, ForgeBox, TestBox and all your favorite box-es out there. A few ways to say thanks back to Ortus Solutions:Buy workshop tickets to CF Summit EastBuy Tickets to Into the Box 2024 in Washington DC https://www.intothebox.org/Like and subscribe to our videos on YouTube. Help ORTUS reach for the Stars - Star and Fork our ReposStar all of your Github Box Dependencies from CommandBox with https://www.forgebox.io/view/commandbox-github Subscribe to our Podcast on your Podcast Apps and leave us a review AND WE WILL READ IT ON THE SHOWSign up for a free or paid account on CFCasts, which is releasing new content regularlyBOXLife store: https://www.ortussolutions.com/about-us/shopBuy Ortus's Books102 ColdBox HMVC Quick Tips and Tricks on GumRoad (http://gum.co/coldbox-tips)Now on Amazon!https://www.amazon.com/dp/B0CJHB712MLearn Modern ColdFusion (CFML) in 100+ Minutes - Free online https://modern-cfml.ortusbooks.com/ or buy an EBook or Paper copy https://www.ortussolutions.com/learn/books/coldfusion-in-100-minutes Patreon Support (staunch)We have 38 patreons: https://www.patreon.com/ortussolutions. News and AnnouncementsColdBox 7 Workshop at Adobe CF Summit East 2024A Deep Dive into ColdBox 7.2 - Date: April 25th - 26th, 2024 | After Adobe CFSummit EastSpeakers: Luis Majano, creator of ColdBoxElevate Your CFML Development Skills!Master ColdBox 7.2 from the Ground Up in Our Workshop Following CFSummit East 2024Calling all CFML developers and enthusiasts! We are thrilled to announce an upcoming event that promises to elevate your skills and empower you with ColdBox's latest updates and features. This two-day workshop is led by the creator of ColdBox, Luis Majano. You'll dive into ColdBox 7.2, exploring new features, updates, and fixes to build modern, high-quality projects.Whether you're a beginner looking to jumpstart your journey into the MVC ecosystem or an experienced developer seeking to refine your ColdBox skills, this workshop is designed to meet your needs. Get ready for an immersive experience that keeps you at the forefront of ColdBox development!Tickets are limited, get yours now and save with early bird pricinghttps://www.ortussolutions.com/blog/a-deep-dive-into-coldbox-72 ITB Workshops and Speakers announced - more to come!!!https://www.intothebox.org/CFCamp Call for Speakers is Open - CFP closes at March 17, 2024 23:30 UTCLast year's CFCamp 2023 was our first event after a forced-upon-us pandemic break and we were really happy how the conference was re-adopted by the community and that we were able to run in a reasonable and yet safe environment. So….CFCamp is back for a 2024 edition.Would you like to meet the German and European CFML web developer communities, listen to expert speakers and find out all about the latest trends around CFML and associated technologies? 
Then join us at CFCamp 2024, Europe's largest conference on CFML, Lucee, Adobe ColdFusion and associated technologies.Look at recommended topics - big variety https://www.papercall.io/cfcamp2024 Ben Nadel Released his Book - Feature Flags Book - Transforming Your Product Development WorkflowIn my tenure as co-founder and principal engineer at InVision, I went from never having heard of "Feature Flags" (aka "feature toggles" aka "feature switches"); to seeing them become widely adopted by our engineering team; to witnessing a complete transformation with regard to how our company approached product development. For me, feature flags are as transformational as databases—they are as important as both logs and metrics. I cannot imagine creating another product without them.I believe that I have a perspective worth sharing. I want to help people see the magic that I see. I want to help teams deliver value to their customers with love and empathy and without fear.https://featureflagsbook.com/ New Releases and UpdatesColdBox Debugger v4.2 - Unleashing a Wave of Debugging Power!In the ever-evolving landscape of web development, staying ahead requires cutting-edge tools. Enter ColdBox Debugger v4.2.0, the latest release that promises an action-packed experience with a plethora of features, improvements, and bug fixes. This update introduces the Hyper Collector, allowing you to track Hyper HTTP/S requests effortlessly with aggregated data on total time, slowest requests, grouping, and timelines. Lucee SQL Collector now enables profiling of SQL queries, providing valuable insights into your Lucee-powered applications. The addition of Heap Dump Support empowers users to generate Java heap dumps for offline analysis, ideal for debugging memory leaks and ensuring system stability. A revamped Request Dock and enhanced SQL/JSON formatting contribute to an improved user interface. Moreover, the ability to add timers manually and download heap dump snapshots adds versatility to your debugging toolkit.ColdBox Debugger v4.2.0 is not just an upgrade; it's a leap forward in simplifying the debugging process and enhancing overall development efficiency. Explore the new features and take your debugging game to new heights!https://www.ortussolutions.com/blog/coldbox-debugger-v42-unleashing-a-wave-of-debugging-powerCBWIRE 3.2 ReleasedHey there CBWIRE enthusiasts!

Modernize or Die ® Podcast - CFML News Edition
Modernize or Die® - CFML News Podcast for December 19th, 2023 - Episode 209

Modernize or Die ® Podcast - CFML News Edition

Play Episode Listen Later Dec 19, 2023 28:54


2023-12-19 Weekly News — Episode 209Watch the video version on YouTube at https://youtube.com/live/BbBInJ9LgDo?feature=shareHosts:  Eric Peterson - Senior Developer at Ortus Solutions Daniel Garcia - Senior Developer at Ortus Solutions Thanks to our Sponsor - Ortus SolutionsThe makers of ColdBox, CommandBox, ForgeBox, TestBox and all your favorite box-es out there. A few ways to say thanks back to Ortus Solutions: Buy Tickets to Into the Box 2024 in Washington DC https://www.intothebox.org/ Like and subscribe to our videos on YouTube.  Help ORTUS reach for the Stars - Star and Fork our ReposStar all of your Github Box Dependencies from CommandBox with https://www.forgebox.io/view/commandbox-github  Subscribe to our Podcast on your Podcast Apps and leave us a review AND WE WILL READ IT ON THE SHOW Sign up for a free or paid account on CFCasts, which is releasing new content regularly BOXLife store: https://www.ortussolutions.com/about-us/shop Buy Ortus's Books 102 ColdBox HMVC Quick Tips and Tricks on GumRoad (http://gum.co/coldbox-tips) Now on Amazon! https://www.amazon.com/dp/B0CJHB712M Learn Modern ColdFusion (CFML) in 100+ Minutes - Free online https://modern-cfml.ortusbooks.com/ or buy an EBook or Paper copy https://www.ortussolutions.com/learn/books/coldfusion-in-100-minutes  Patreon Support (Festive)We have 42 patreons: https://www.patreon.com/ortussolutions. News and AnnouncementsNo new newsNew Releases and UpdatesContentBox 6 ReleasedLots of great updates including improvements to the ContentBox CLI, upgraded to use ColdBox 7, now using cbSecurity 3 with more security features, content templates, domain aliases, migrations, and more!https://www.ortussolutions.com/blog/contentbox-v60-releasedWebinar / Meetups and WorkshopsICYMI - Hawaii ColdFusion Meetup Group - InertiaJS and ColdFusion with Eric PetersonInertiaJS is a new JavaScript framework made for people who don't really need an API but want to use a modern JavaScript framework like React or Vue as their view layer. Inspired by libraries like Turbolinks, InteriaJS makes your app behave like a SPA while still being a fully server-rendered app.https://www.meetup.com/hawaii-coldfusion-meetup-group/events/297584413/ Recording: https://hawaiicoldfusionusergroup.adobeconnect.com/pkc1egu6z131/Online CFMeetup - Installing CF2023: choices, challenges, and solutions with Charlie ArehartDecember 21st, 2023 at 12pm US Eastern TimeIf you'll be installing CF2023, there are some things to consider before or as you do. First, be aware that besides the traditional full installer there's the new "zip" install option (added in CF2021). What's that about, why might you want to use it--or not?Then there are some options and choices during installation--some new also with CF2021. Perhaps it's been a while since you've installed even previous CF versions. 
We'll cover some of the key options to consider (including license activation, package/module management, and more) as well as post-install steps including updating CF and the JVM, and migrating in CF Admin settings (including using the new CLI/json admin config tool, cfsetup).https://www.meetup.com/coldfusionmeetup/events/298025246/CFCasts Content Updateshttps://www.cfcasts.comRecent ReleasesInto the Box 2023 Videos are now available for all Paid Subscriptions https://cfcasts.com/series/itb-2023 Coming SoonMastering CBWIRE v3 from GrantConferences and TrainingITB 2024 Location: Optica in Washington, DC Announcement Blog Post: https://www.ortussolutions.com/blog/our-into-the-box-2024-venue-and-dates-are-set Dates: May 15-17, 2024 Get Blind Tickets Now (through the end of the year): https://www.eventbrite.com/e/into-the-box-2024-the-new-era-of-modernization-tickets-663126347757 Call for Speakers: CLOSED  First batch of sessions and workshops being announced this week. Save the Date: CFCamp 2024 Location: Munich, Freising, Germany Dates: June 13-14, 2024 Call for Speakers: around mid-January (https://twitter.com/cf_camp/status/1736851753260498946) Twitter Link: https://twitter.com/cf_camp/status/1736705195927646236 Facebook Link: https://t.co/YKU4dhuHEO More conferencesNeed more conferences, this site has a huge list of conferences for almost any language/community.https://confs.tech/Blogs, Tweets, and Videos of the Week12/06/23 - Blog - Ben Nadel - Generating Pandoc Heading Identifiers In ColdFusionOver on my Feature Flags book website, I'm using my book's Markdown content to generate the HTML for the page. I then use jSoup to inject a table of contents (TOC); which requires that I insert an identifier into each header element. And, now that I'm trying to use Pandoc to generate an EPUB (digital book) version, I need to make sure that my ColdFusion-based header identifiers match the ones that Pandoc will generate in the final EPUB.https://www.bennadel.com/blog/4537-generating-pandoc-heading-identifiers-in-coldfusion.htm 12/11/23 - Blog - Robert Zehnder - Bringing back commandbox-ssgOver the past few years, my focus has been largely on blog-related projects. My initial foray into the world of static site generators began with commandbox-jasper. This project laid the foundation for my current static site generator, aptly named commandbox-ssg. commandbox-ssg not only inherits a substantial portion of its codebase from Jasper, but it also boasts several refinements and a more descriptive name that better captures its functionality. The name Jasper, while a sentimental nod to my dog, didn't quite convey the tool's purpose.The transition of my development environment from MacOS to Windows, however, presented some unexpected challenges. It became apparent that my assumptions regarding file paths, which worked seamlessly on MacOS, were not compatible with Windows. This realization led to a few hiccups, but I've been making steady progress in addressing these issues.I'm enthusiastic about resolving any lingering issues and diving into further development of the tool.https://kisdigital.com/posts/2023/bringing-back-commandbox-ssg12/14/23 - Blog - Robert Zehnder - An introduction to commandbox-ssgThis module, a static site generator for CommandBox, is a personal favorite among the modules I've had the pleasure of working on. 
This guide aims to provide an overview of installing, using, and configuring CommandBox-SSG for your web projects.https://kisdigital.com/posts/2023/an-introduction-to-commandbox-ssg12/19/23 - Blog - Ben Nadel - Using Google reCAPTCHA v3 In ColdFusionOver on my Dig Deep Fitness weight lifting application, I use magic links for passwordless logins. This type of authentication workflow takes an email address and sends a one-time-use link that will automatically log the given user into my ColdFusion application, no password required. A few weeks ago, I started seeing SPAM bots submit this form (for reasons that I can't understand). To combat this malicious attack, I added Google's reCAPTCHA v3 to my login form. This was the first time that I've used reCAPTCHA in a ColdFusion application; so, I thought it might be worth a closer look.https://www.bennadel.com/blog/4538-using-google-recaptcha-v3-in-coldfusion.htmCFML JobsSeveral positions available on https://www.getcfmljobs.com/Listing over 113 ColdFusion positions from 68 companies across 48 locations in 5 Countries.2 new jobs listed in the last few weeksFull-Time - ColdFusion 2016 & 2023 Expert at HotelPlanner - United States Posted Dec 12, 2023https://twitter.com/hotelplanner/status/1734614012845871359Full-Time - ColdFusion Developer at Washington, DCPosted Dec 13, 2023https://www.getcfmljobs.com/jobs/index.cfm/united-states/CFDeveloper-at-Washington-DC/11625Other Job LinksThere is a jobs channel in the CFML slack team, and in the Box team slack now tooForgeBox Module of the WeekRoute Auditor by Dan CardThis module is a simple interceptor which captures the event being run based on the route that was hit in your API and persists it to a database with the date, time and endpoint hit.https://forgebox.io/view/route_auditorVS Code Hint Tips and Tricks of the WeekNovember 2023 Visual Studio Code Release Tidbits Floating Editor Windows Terminal Sticky Scroll GitHub Copilot Potential vulnerability detection in code blocks https://code.visualstudio.com/updates/v1_85#_sticky-scrollThank you to all of our Patreon SupportersThese individuals are personally supporting our open source initiatives to ensure the great toolings like CommandBox, ForgeBox, ColdBox, ContentBox, TestBox and all the other boxes keep getting the continuous development they need, and funds the cloud infrastructure at our community relies on like ForgeBox for our Package Management with CommandBox. You can support us on Patreon here https://www.patreon.com/ortussolutionsDon't forget, we have Annual Memberships, pay for the year and save 10% - great for businesses everyone. Bronze Packages and up, now get a ForgeBox Pro and CFCasts subscriptions as a perk for their Patreon Subscription. All Patreon supporters have a Profile badge on the Community Website All Patreon supporters have their own Private Forum access on the Community Website All Patreon supporters have their own Private Channel access BoxTeam Slack https://community.ortussolutions.com/Top Patreons (Festive) John Wilson - Synaptrix Tomorrows Guides Jordan Clark Gary Knight Giancarlo Gomez  David Belanger Dan Card James Moberg & Jeffry McGee - Sunstar Media  Dean Maunder Kevin Wright Doug Cain  Nolan Erck  Abdul Raheen And many more PatreonsYou can see an up to date list of all sponsors on Ortus Solutions' Websitehttps://ortussolutions.com/about-us/sponsors Thanks and Happy Holidays everyone!!! ★ Support this podcast on Patreon ★

Modernize or Die ® Podcast - CFML News Edition
Modernize or Die® - CFML News Podcast for December 5th, 2023 - Episode 208

Modernize or Die ® Podcast - CFML News Edition

Play Episode Listen Later Dec 5, 2023 50:34


2023-12-05 Weekly News — Episode 208Watch the video version on YouTube at https://youtube.com/live/WHVwcHtf_gA?feature=share Hosts:  Gavin Pickin - Senior Developer at Ortus Solutions Grant Copley - Senior Developer at Ortus Solutions Thanks to our Sponsor - Ortus SolutionsThe makers of ColdBox, CommandBox, ForgeBox, TestBox and all your favorite box-es out there. A few ways  to say thanks back to Ortus Solutions: Buy Tickets to Into the Box 2024 in Washington DC https://www.intothebox.org/ Like and subscribe to our videos on YouTube.  Help ORTUS reach for the Stars - Star and Fork our Repos Star all of your Github Box Dependencies from CommandBox with https://www.forgebox.io/view/commandbox-github  Subscribe to our Podcast on your Podcast Apps and leave us a review AND WE WILL READ IT ON THE SHOW Sign up for a free or paid account on CFCasts, which is releasing new content regularly BOXLife store: https://www.ortussolutions.com/about-us/shop Buy Ortus's Books 102 ColdBox HMVC Quick Tips and Tricks on GumRoad (http://gum.co/coldbox-tips) Now on Amazon! https://www.amazon.com/dp/B0CJHB712M Learn Modern ColdFusion (CFML) in 100+ Minutes - Free online https://modern-cfml.ortusbooks.com/ or buy an EBook or Paper copy https://www.ortussolutions.com/learn/books/coldfusion-in-100-minutes  Patreon Support ()We have 42 patreons: https://www.patreon.com/ortussolutions. News and AnnouncementsAdobe ColdFusion flaw exploited in US government agency attacksAdobe released a security update for the vulnerability (CVE-2023-26360) that the attackers exploited in March this year. At that time, the vulnerability was already used in zero-day attacks.Following the FCEB agency's investigation, analysis of network logs confirmed the compromise of at least two public-facing servers within the environment between June and July 2023.https://stackdiary.com/adobe-coldfusion-flaw-exploited-in-us-government-agency-attacks/ https://www.cisa.gov/news-events/alerts/2023/12/05/cisa-releases-advisory-threat-actors-exploiting-cve-2023-26360-vulnerability-adobe-coldfusion CISA has issued an alert regarding multiple vulnerabilities impacting Adobe ColdFusion.CISA has issued an alert regarding multiple vulnerabilities impacting Adobe ColdFusion. The alert underscores that the exploitation of the vulnerabilities could grant threat actors control over affected systems, prompting organizations to take measures to protect their systems.Adobe ColdFusion serves as a rapid scripting environment for developing dynamic internet applications on both web and mobile platforms, utilizing ColdFusion Markup Language (CFML).The security update addresses a range of vulnerabilities, including critical, high, and medium severity issues. These vulnerabilities have the potential to enable threat actors to access specific endpoints or execute arbitrary code, without requiring user interaction.https://socradar.io/cisa-alert-serious-vulnerabilities-in-adobe-coldfusion-cve-2023-44350-cve-2023-44351-cve-2023-44353-and-more/ Ben Nadel wrote a Book - Early Access: Feature Flags - From Concept To Cultural RevolutionAlmost 3-months ago, I announced that I was writing a book on Feature Flags. This morning, I'm thrilled to announce that I have an early access version available for purchase. This is a PDF version; and, the formatting is a bit rough around the edges. But, the content is all there. 
And, if you pick-up the book now (at a deep discount), you'll automatically get access to future versions.https://www.bennadel.com/blog/4531-early-access-feature-flags-from-concept-to-cultural-revolution.htm New Releases and UpdatesUpdate your servers with the below updatesICYMI - Adobe November Updates - Security FixesAdobe for ColdFusion 2023 (update 6) and 2021 (update 12)Previous versions no longer receive security updates!!!CommandBox has already been updatedSecurity updates available for Adobe ColdFusion | APSB23-52 - https://helpx.adobe.com/security/products/coldfusion/apsb23-52.html https://community.adobe.com/t5/coldfusion-discussions/now-live-adobe-coldfusion-2023-and-2021-november-security-updates/m-p/14233917#M196421 Note: Reported WDDX related issues by some customersMore details from Charlie Arehart: https://www.carehart.org/blog/2023/11/14/cf_security_updates_nov_2023#more ICYMI - ColdBox 7.2.0 ReleasedWelcome to ColdBox 7.2.0, which packs a big punch on stability and tons of new features.Includes lots of updates for all the core products: ColdBox, WireBox, CacheBox, and LogBox.ColdBox, 10 new features, 6 improvements and 4 bug fixesLogBox has 3 new features, 4 improvements, 2 bug fixes and a taskWith WireBox including a new feature and CacheBox has an Improvement.https://coldbox.ortusbooks.com/readme/release-history/whats-new-with-7.2.0 Webinar / Meetups and WorkshopsColdFusion Security TrainingWriting Secure CFML with Pete FreitagA hands-on CFML / ColdFusion Security Training class for developers. Learn how to identify and fix security vulnerabilities in your ColdFusion / CFML applications.Where: OnlineWhen: Tuesday December 12, 2023 @ 11am-2pm EST & Wednesday December 13 @ 11am-2pmPrice: $899 per studenthttps://foundeo.com/consulting/coldfusion/security-training/ The class will be recorded, so if you cannot attend it fully online you will have access to a recording.Hawaii ColdFusion Meetup Group - InertiaJS and ColdFusion with Eric PetersonDecember 15thInertiaJS is a new JavaScript framework made for people who don't really need an API but want to use a modern JavaScript framework like React or Vue as their view layer. 
Inspired by libraries like Turbolinks, InteriaJS makes your app behave like a SPA while still being a fully sever-rendered app.https://www.meetup.com/hawaii-coldfusion-meetup-group/events/297584413/ CFCasts Content Updateshttps://www.cfcasts.comRecent ReleasesInto the Box 2023 Videos are now available for all Paid Subscriptions https://cfcasts.com/series/itb-2023  Coming SoonMastering CBWIRE v3 from GrantConferences and TrainingICYMI - Into the Box LATAM - Recap from GrantNovember 30thUniversity of Business in El Salvador.https://latam.intothebox.org/ICYMI - Adobe ColdFusion India Summit 2023December 2nd, 2023Register for FreeLocation: Bengaluru, Indiahttps://cf-indiasummit-2023.attendease.com/ https://twitter.com/mishrabagish/status/1730801813547339927/photo/1 ITB 2024 Location: Optica in Washington, DC Announcement Blog Post: https://www.ortussolutions.com/blog/our-into-the-box-2024-venue-and-dates-are-set Dates: May 15-17, 2024 Get Blind Tickets Now: https://www.eventbrite.com/e/into-the-box-2024-the-new-era-of-modernization-tickets-663126347757 Call for Speakers: CLOSED  More conferencesNeed more conferences, this site has a huge list of conferences for almost any language/community.https://confs.tech/Blogs, Tweets, and Videos of the Week12/05/23 - Blog - Stackdiary - Adobe ColdFusion flaw exploited in US government agency attacksAdobe released a security update for the vulnerability (CVE-2023-26360) that the attackers exploited in March this year. At that time, the vulnerability was already used in zero-day attacks.Following the FCEB agency's investigation, analysis of network logs confirmed the compromise of at least two public-facing servers within the environment between June and July 2023.https://stackdiary.com/adobe-coldfusion-flaw-exploited-in-us-government-agency-attacks/ 11/30/23 - Blog - Ben Nadel - Multi-Var Assignments In A Single Line In ColdFusionThe other day, when I was looking up some operators for my post on natural language operators in ColdFusion, I saw something in the documentation that surprised me: ColdFusion has the ability to assign multiple Function-local variables in a single line. It's a very strange notation, so I'll probably never use it. But, since it surprised me, I figured there's other people out there who have never seen it.https://www.bennadel.com/blog/4535-multi-var-assignments-in-a-single-line-in-coldfusion.htm 11/29/23 - Blog - Ben Nadel - Reflecting On Natural Language Operators In ColdFusionThe other day, on the Lucee Dev Forum, I suggested that ColdFusion might benefit from having starts with and ends with operators. These would fall under the "natural language" operators, in that they read like normal human language, not computer jargon. But, my suggestion is somewhat fraudulent considering the fact that I never use the natural language operators in ColdFusion. This conversation, however, gave me pause to reflect on this choice more deeply.https://www.bennadel.com/blog/4534-reflecting-on-natural-language-operators-in-coldfusion.htm 11/28/23 - Tweet - Cameron Childress - This is a pretty solid writeup about refactoring a legacy stateful app into a stateless one. I'm looking at you #coldfusion developers!https://aws.amazon.com/blogs/architecture/converting-stateful-application-to-stateless-using-aws-services/ https://x.com/cameronc/status/1729577651772289395?s=20 11/28/23 - Blog - Ben Nadel - The RegEx Of Everyday Things - Great cheat sheetI'm a massive fan of Regular Expressions. 
I started learning about them 20-years ago for the purposes of data cleaning at Nylon Technology; and, since then, not a day goes by where I don't use them in some form. A lot of engineers view pattern matching as a dark art; and, there's no question that RegEx patterns can be very complicated. But, they don't have to be. Simple patterns can still add a lot value in your every day engineering life. And, there's no place where this rings more true than in your "Code Search".https://www.bennadel.com/blog/4532-the-regex-of-everyday-things.htm 11/27/23 - Blog - Ben Nadel - Early Access: Feature Flags - From Concept To Cultural RevolutionAlmost 3-months ago, I announced that I was writing a book on Feature Flags. This morning, I'm thrilled to announce that I have an early access version available for purchase. This is a PDF version; and, the formatting is a bit rough around the edges. But, the content is all there. And, if you pick-up the book now (at a deep discount), you'll automatically get access to future versions.https://www.bennadel.com/blog/4531-early-access-feature-flags-from-concept-to-cultural-revolution.htm 11/23/23 - Blog - SOCRadar - CISA Alert: Serious Vulnerabilities in Adobe ColdFusion (CVE-2023-44350, CVE-2023-44351, CVE-2023-44353 and More)CISA has issued an alert regarding multiple vulnerabilities impacting Adobe ColdFusion. The alert underscores that the exploitation of the vulnerabilities could grant threat actors control over affected systems, prompting organizations to take measures to protect their systems.Adobe ColdFusion serves as a rapid scripting environment for developing dynamic internet applications on both web and mobile platforms, utilizing ColdFusion Markup Language (CFML).The security update addresses a range of vulnerabilities, including critical, high, and medium severity issues. These vulnerabilities have the potential to enable threat actors to access specific endpoints or execute arbitrary code, without requiring user interaction.https://socradar.io/cisa-alert-serious-vulnerabilities-in-adobe-coldfusion-cve-2023-44350-cve-2023-44351-cve-2023-44353-and-more/ 11/23/23 - Tweet - Ortus Solutions - Unleash the power of a Headless CMS with Luis Majano at #WeyWeyWeb23!

Working Code
152: Cron Heatmaps, Harvard AI, and Ben's Book - What's On Your Workbench

Working Code

Play Episode Listen Later Nov 8, 2023 60:16 Transcription Available


This week on the show, the hosts talk about what they have going on. Adam is trying to better understand the cadence with which his scheduled tasks are executing; and, has built a visualization tool using Svelte and D3. Tim has signed up for CS50 at Harvard - an online course introducing Artificial Intelligence (AI) with Python. And, Ben has a working draft for the first half of his Feature Flags book; and, is now considering some sort of pre-sale (if he can figure out how to turn his Markdown files into something consumable).Follow the show and be sure to join the discussion on Discord! Our website is workingcode.dev and we're @WorkingCodePod on Twitter and Instagram. New episodes drop weekly on Wednesday.And, if you're feeling the love, support us on Patreon.With audio editing and engineering by ZCross Media.Full show notes and transcript here.

Working Code
150: What's on Your Workbench #3

Working Code

Play Episode Listen Later Oct 25, 2023 56:00 Transcription Available


This week we go around the table and see what the hosts have going on. Carol got a promotion in her first week back at work, despite the fact that she's had to emotionally suppress everything she once knew about dotnet. Adam is now - finally - at 100% SOC compliance (and is awaiting a 3-month review period). Tim has been wrestling with APIs and bending them to his will (to receive JSON payloads). And, Ben is considering different ways in which to package his Feature Flags book.Follow the show and be sure to join the discussion on Discord! Our website is workingcode.dev and we're @WorkingCodePod on Twitter and Instagram. New episodes drop weekly on Wednesday.And, if you're feeling the love, support us on Patreon.With audio editing and engineering by ZCross Media.Full show notes and transcript here.

Ready, Set, Cloud Podcast!
The Secret Power of Feature Flags With Steve Rice

Ready, Set, Cloud Podcast!

Play Episode Listen Later Oct 6, 2023 26:54


Feature flags are much more than on/off switches to hide in-progress features. They separate releases from your deployments. They allow you to slowly roll out features to your user base. They give you access to easy A/B testing. Join Steve and Allen as they talk about the impressive capabilities of AWS AppConfig, a managed service that controls your feature flags and powers many of the AWS services. The two go over the types of feature flags, commonly seen anti-patterns, and how to implement them in your code.

About Steve
Steve is Principal Product Manager for AWS AppConfig, which is a feature flagging service that helps engineers move faster and more safely. His career has covered engineering and product management leadership roles at AWS, Coca-Cola, LivingSocial, and AOL, and he has been using dynamic configuration to make things move faster for over a decade. He lives in the Northern Virginia area with his wife, kids, and two dogs.

Links
LinkedIn - https://www.linkedin.com/in/stevejrice
AWS AppConfig - https://go.aws/awsappconfig

---
Send in a voice message: https://podcasters.spotify.com/pod/show/readysetcloud/message
Support this podcast: https://podcasters.spotify.com/pod/show/readysetcloud/support
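The "dynamic configuration" pattern behind services like this can be sketched generically: poll a config source, cache the result, and keep the last good value on failure. This is not the AWS AppConfig SDK; the endpoint and payload shape below are made up:

```typescript
// Generic dynamic-configuration sketch: poll, cache, fall back to last known good.
type Flags = Record<string, boolean>;

// Safe defaults shipped with the application.
let cached: Flags = { "checkout-v2": false };

async function refreshFlags(endpoint = "https://config.example.com/flags.json"): Promise<void> {
  try {
    const res = await fetch(endpoint);
    if (res.ok) cached = (await res.json()) as Flags;
  } catch {
    // Network or parse failure: keep serving the last known good configuration.
  }
}

export function isEnabled(key: string): boolean {
  return cached[key] ?? false;
}

// Re-poll periodically so a flag flip takes effect without a deployment.
setInterval(refreshFlags, 30_000);
```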

More Than Just Code podcast - iOS and Swift development, news and advice

This week we review the Apple Event, Wonderlust, from Sept 12, 2023. We discuss the Watch Series 9, Watch Ultra 2, iPhone 15, iPhone 15 Pro, and the addition of USB-C. Picks: Vision Pro Hands-On, Advanced macOS Commands, Swift 5.8 Feature Flags, and Git History.

Apple's Unusual Headset Design Has Led to Unprecedented Production Challenges - MacRumors
Vision Pro leak reveals how Apple plans to launch its futuristic headset | iMore
System in a package - Wikipedia
iPhone 15 lineup gets a price hike in Canada
Apple releases detailed PDFs of iOS 17 and macOS Sonoma features
iPhone 15 Pro fixes the worst thing about Apple's Vision Pro
Contrary to rumors, the iPhone 15 has a standard, by-the-book USB-C port | Ars Technica
Rumors of Lightning's death are just slightly exaggerated - The Verge
iPhone 15 fulfills a vision for photography shared with Steve Jobs over a decade ago - 9to5Mac
Vision Pro Developer Hands-on
Advanced macOS Commands - saurabhs.org
Using Upcoming Feature Flags
Git History

Become a member at https://plus.acast.com/s/mtjc. Hosted on Acast. See acast.com/privacy for more information.

Dev Interrupted
Reimagining DORA Metrics & Leveraging Feature Flags w/ Split's VP of Engineering, Ariel Perez

Dev Interrupted

Play Episode Listen Later Jul 11, 2023 46:31


Does the emergence of feature flags affect the interpretation and utility of DORA metrics? On this week's episode of Dev Interrupted, host Dan Lines and Ariel Perez, VP of Engineering at Split.io, discuss the state of DORA metrics and whether they need reimagining in a world of feature flags. Listen as Ariel explains why he believes feature flags are more than a tool, and have begun to reshape our understanding of software development and the metrics we use to measure it. Dan and Ariel also touch on how feature flags can drastically reduce lead time and mean time to recover, and conclude their chat with an intriguing look at the granular nature of control in the modern software engineering landscape, where the unit of control has shifted from the application as a whole to individual features.
Show Notes:
The Split Blog
Join the Split community on Slack
Register for our summer series!
Accelerate State of DevOps Survey
Support the show:
Subscribe to our Substack
Follow us on YouTube
Leave us a review
Follow us on Twitter or LinkedIn
Offers:
Learn about Continuous Merge with gitStream
Want to try LinearB? Book a Demo & use discount code "Dev Interrupted Podcast"
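Ariel's point about mean time to recover rests on the fact that flipping a flag off is much faster than rolling back a deployment. Below is a minimal, hypothetical sketch in Python of a flag-guarded code path with a fallback; the flag store and function names are invented for illustration and are not tied to Split's SDK.

# Hypothetical in-memory flag store; in practice this would be backed by a flag
# service or dynamic configuration so it can change without a deployment.
FLAGS = {"new-recommendations": True}

def new_recommendation_engine(user_id: str) -> list:
    return ["placeholder result from the new code path"]

def legacy_recommendations(user_id: str) -> list:
    return ["placeholder result from the old code path"]

def recommendations(user_id: str) -> list:
    if FLAGS.get("new-recommendations", False):
        try:
            return new_recommendation_engine(user_id)
        except Exception:
            # If the new path misbehaves, serve the old one instead of failing;
            # operators can also flip the flag off to recover without a rollback.
            return legacy_recommendations(user_id)
    return legacy_recommendations(user_id)

print(recommendations("user-42"))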

Fragmented - Android Developer Podcast
248 - Feature Flags & A/B Testing: A Deep Dive with Ishan Khanna

Fragmented - Android Developer Podcast

Play Episode Listen Later Jun 26, 2023 65:44


In this edition of Fragmented, we're thrilled to host Ishan Khanna, a software engineer at Tinder who possesses great enthusiasm for feature flags and A/B testing. Donn discusses why he invited Ishan on the show, highlighting Ishan's passion for feature flagging and A/B testing. The conversation kicks off with an insightful story from Ishan about feature flagging at Booking.com, leading to a discussion on the difference between A/B Testing and Feature Flags, when and why to introduce feature flagging, and how to measure its effectiveness. The show also focuses on the benefits and risks of feature flagging, along with ways to manage potential complexities in the codebase. We then delve deeper into the topic of feature flagging, covering how to get started, what to look for in a tool, and the role of testing. Discussion points include the best practices for rollout percentages, considerations for multi-platform implementation, and the specifics of targeting in feature flagging. The conversation wraps up with an exploration of available tools for those looking to introduce feature flagging or A/B testing frameworks into their operations, examining when it might be necessary to build a bespoke solution. The episode offers a wealth of resources for listeners, including links to an array of feature flagging and A/B testing tools, such as Firebase Remote Config, Optimizely, and LaunchDarkly. For more insight into the topics discussed, Ishan recommends his Droidcon Berlin talk on 'Customer Driven Development' and Stuart Frisby's talk on A/B Testing. To reach out to Ishan, listeners can contact him via Twitter, LinkedIn, or his website.
Links
Firebase Remote Config
Optimizely
LaunchDarkly
AWS AppConfig for Feature Flags
VWO
Unleash - Open Source Feature Flags
Posthog Feature Flags and A/B Testing
Ishan's Droidcon Berlin Talk
Stuart Frisby's Talk on A/B Testing
Erindoesthings
Contact Ishan
Ishan on Twitter - @droidchef
Ishan on LinkedIn
Ishan's Website
Donn's Git Course
Need to learn Git? Donn has the course for you. In this FREE course you'll learn everything you need to know in order to start working with Git every day. Watch it here.
AndroidJobs.IO
Job postings are FREE on AndroidJobs.IO
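To make the flag-versus-experiment distinction from the episode concrete, here is a generic Python sketch of deterministic variant assignment plus an exposure event; it is not the API of any of the tools listed above, and the experiment and variant names are made up.

import hashlib

def assign_variant(user_id: str, experiment: str, variants: list) -> str:
    """Deterministically assign a user to one variant of an experiment."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    return variants[int(digest, 16) % len(variants)]

def log_exposure(user_id: str, experiment: str, variant: str) -> None:
    # In a real system this event would be sent to an analytics pipeline so
    # conversion metrics can be compared per variant afterwards.
    print(f"exposure user={user_id} experiment={experiment} variant={variant}")

variant = assign_variant("user-42", "checkout-copy", ["control", "buy-now", "add-to-cart"])
log_exposure("user-42", "checkout-copy", variant)

Logging the exposure at the moment of assignment is what makes the later metric comparison between variants possible.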

Code Time
A/B testing y feature flags | Code Time (233) -Versión Compacta

Code Time

Play Episode Listen Later Jun 17, 2023 80:30


- Disclaimers: 01:08 - How do we usually develop software?: 04:37 - Some problems that come up during development: 23:00 Feature Flags - What are feature flags?: 30:27 - How do feature flags work?: 36:02 Questions - Benefits: 50:29 - What can you control?: 1:03:13 - How long should a flag live?: 1:06:26 - Should we always use them?: 1:08:48 - Wrapping up: 1:15:30 - Closing: 1:17:50 –––––––––––––––––––––––––––––– To contribute PAYPAL: https://www.paypal.me/codetime Mercado Pago $100: https://mpago.la/1Zqo3G9 Mercado Pago $500: https://mpago.la/2MZ3oz3 Mercado Pago $1000: https://mpago.la/333qhPp –––––––––––––––––––––––––––––– Complete Swift 4 development course from scratch https://www.udemy.com/curso-completo-de-swift-4-desde-cero/?couponCode=YOUTUBE_1 iOS 11 app development course from scratch https://www.udemy.com/desarrollo-de-aplicaciones-para-ios-11-desde-cero/?couponCode=YOUTUBE_1 –––––––––––––––––––––––––––––– Contact: Twitter / Telegram: @DavidGiordana Email: davidgiordana0@gmail.com Telegram group: https://t.me/joinchat/C-YEzBGu5Jh-mu8ejM2toA –––––––––––––––––––––––––––––– Songs used OP: Adventures by A Himitsu https://soundcloud.com/a-himitsu Creative Commons — Attribution 3.0 Unported — CC BY 3.0 Free Download / Stream: http://bit.ly/2Pj0MtT Music released by Argofox https://youtu.be/8BXNwnxaVQE Music promoted by Audio Library https://youtu.be/MkNeIUgNPQ8 ED: See You Tomorrow by GoSoundtrack http://www.gosoundtrack.com Creative Commons — Attribution 4.0 International — CC BY 4.0 Free Download / Stream: http://bit.ly/see-you-tomorrow Music promoted by Audio Library https://youtu.be/idlqqMHd0W4
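One of the questions in the episode, how long a flag should live, is often answered operationally by giving every flag an owner and an expiry date so stale flags surface on their own. A small, generic sketch of that idea in Python, with the field and flag names invented for the example:

from dataclasses import dataclass
from datetime import date

@dataclass
class FlagDefinition:
    name: str
    owner: str      # team responsible for removing the flag once it is done
    expires: date   # date after which the flag is considered stale
    enabled: bool = False

FLAGS = [
    FlagDefinition("new-checkout", owner="payments-team", expires=date(2023, 9, 1), enabled=True),
    FlagDefinition("dark-mode", owner="web-team", expires=date(2023, 7, 1)),
]

def stale_flags(today: date) -> list:
    """Flags past their expiry; a CI job could open a ticket or fail the build for each."""
    return [flag for flag in FLAGS if today > flag.expires]

for flag in stale_flags(date.today()):
    print(f"flag '{flag.name}' owned by {flag.owner} expired on {flag.expires}")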

Code Time
A/B testing y feature flags | Code Time (233) - Versión Completa

Code Time

Play Episode Listen Later Jun 17, 2023 107:25


- Start of the episode: 27:08 - Disclaimers: 28:02 - How do we usually develop software?: 31:31 - Some problems that come up during development: 49:54 Feature Flags - What are feature flags?: 57:21 - How do feature flags work?: 1:02:56 Questions - Benefits: 1:17:23 - What can you control?: 1:30:07 - How long should a flag live?: 1:33:20 - Should we always use them?: 1:35:42 - Wrapping up: 1:42:24 - Closing: 1:44:44 –––––––––––––––––––––––––––––– To contribute PAYPAL: https://www.paypal.me/codetime Mercado Pago $100: https://mpago.la/1Zqo3G9 Mercado Pago $500: https://mpago.la/2MZ3oz3 Mercado Pago $1000: https://mpago.la/333qhPp –––––––––––––––––––––––––––––– Complete Swift 4 development course from scratch https://www.udemy.com/curso-completo-de-swift-4-desde-cero/?couponCode=YOUTUBE_1 iOS 11 app development course from scratch https://www.udemy.com/desarrollo-de-aplicaciones-para-ios-11-desde-cero/?couponCode=YOUTUBE_1 –––––––––––––––––––––––––––––– Contact: Twitter / Telegram: @DavidGiordana Email: davidgiordana0@gmail.com Telegram group: https://t.me/joinchat/C-YEzBGu5Jh-mu8ejM2toA –––––––––––––––––––––––––––––– Songs used OP: Adventures by A Himitsu https://soundcloud.com/a-himitsu Creative Commons — Attribution 3.0 Unported — CC BY 3.0 Free Download / Stream: http://bit.ly/2Pj0MtT Music released by Argofox https://youtu.be/8BXNwnxaVQE Music promoted by Audio Library https://youtu.be/MkNeIUgNPQ8 ED: See You Tomorrow by GoSoundtrack http://www.gosoundtrack.com Creative Commons — Attribution 4.0 International — CC BY 4.0 Free Download / Stream: http://bit.ly/see-you-tomorrow Music promoted by Audio Library https://youtu.be/idlqqMHd0W4

North Meets South Web Podcast
World champions, deploying from GitHub Actions, and feature flags

North Meets South Web Podcast

Play Episode Listen Later Jun 13, 2023 39:47


Jake and Michael discuss the world champion Denver Nuggets, building assets and deploying apps in GitHub Actions, and feature flags with Laravel Pennant. This episode is brought to you by our friends at Workvivo - The leading employee communication app.
Show links
Cache dependencies in GitHub Actions
Laravel Pennant

More Than Just Code podcast - iOS and Swift development, news and advice

This week Mark, Jaime and Tim review the WWDC 2023 highlights. Mark describes attending WWDC in person at Apple Park. We cover the new MacBook Air 15-inch, updates to the Mac Studio, and the introduction of the Mac Pro with M2 Apple Silicon. We review the new features of iOS 17, iPadOS 17, tvOS and watchOS. Next up we talk about the Vision Pro and visionOS. Picks: 1Password's Passkeys, Tim's short story iViz 1.0, upcoming Feature Flags, and applying to work with Vision Pro and visionOS.
SwiftUI updates | Apple Developer Documentation
See Apple Park's massive lunchroom doors open in epic fashion - CNET
Now in beta: Save and sign in with passkeys using 1Password in the browser
iViz 1.0
Using Upcoming Feature Flags
Work with Apple - visionOS
Apple Vision Pro Impressions!
Become a member at https://plus.acast.com/s/mtjc. Hosted on Acast. See acast.com/privacy for more information.

Software Engineering Radio - The Podcast for Professional Software Developers
SE Radio 564: Paul Hammant on Trunk-Based Development

Software Engineering Radio - The Podcast for Professional Software Developers

Play Episode Listen Later May 17, 2023 60:23


Paul Hammant, independent consultant, joins host Giovanni Asproni to speak about trunk-based development—a version control management practice in which developers merge small, frequent updates to a core “trunk” or main branch. The episode explores the technique in some detail, including its pros and cons and some examples from real projects, and offers suggestions on how to get started. The conversation touches on a set of related topics, including code reviews, feature flags, continuous integration, and testing.
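Feature flags are what make it safe to merge small, incomplete slices of work into the trunk, since the unfinished path ships dark until the flag is turned on. A minimal sketch in Python, with the flag source (an environment variable) and the function names invented for illustration:

import os

def new_search_enabled() -> bool:
    # The half-built feature ships to production with every merge to trunk but
    # stays dark until this setting is flipped; an environment variable stands
    # in here for a real flag service.
    return os.environ.get("ENABLE_NEW_SEARCH", "false").lower() == "true"

def new_search(query: str) -> list:
    return [f"result from the new engine for {query}"]

def current_search(query: str) -> list:
    return [f"result from the current engine for {query}"]

def search(query: str) -> list:
    if new_search_enabled():
        return new_search(query)   # incomplete path, merged to trunk early and often
    return current_search(query)   # the behavior users see today

print(search("feature flags"))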

The Bike Shed
379: Feature Flags

The Bike Shed

Play Episode Listen Later Apr 11, 2023 41:56


Joël made a last-minute submission to RailsConf about discrete math, which got picked up!

Syntax - Tasty Web Development Treats
Potluck × Testing Animations × Tools for Learning × Coding Related Injuries

Syntax - Tasty Web Development Treats

Play Episode Listen Later Mar 29, 2023 57:52


In this potluck episode of Syntax, Wes and Scott answer your questions about what to do with client projects, testing animations, evaluating front-end frameworks, tools to use when learning, and coding related injuries.
Sentry - Sponsor
If you want to know what's happening with your code, track errors and monitor performance with Sentry. Sentry's Application Monitoring platform helps developers see performance issues, fix errors faster, and optimize their code health. Cut your time on error resolution from hours to minutes. It works with any language and integrates with dozens of other services. Syntax listeners new to Sentry can get two months for free by visiting Sentry.io and using the coupon code TASTYTREAT during sign up.
Show Notes
00:10 Welcome
00:25 Sponsor: Sentry
01:22 Landscaping update
02:27 What do you do when you are done with a client project?
10:09 Should I keep animations in our tests so our tests match prod behavior?
14:05 How does ChatGPT fill the responses to the prompt?
17:14 What is the best way to evaluate and choose a front-end framework for a project?
21:10 Should functions only be used strictly for code that is going to be re-used?
26:03 What kind of tools and processes do you use when learning?
Obsidian
Roam Research – A note taking tool for networked thought.
30:19 What are your opinions on using “display: grid” simply to be able to use the gap property on the elements inside?
33:57 What do you guys think of being a 1-language dev?
36:38 What are some tips you have to push back on requirements from clients?
41:11 Have you guys ever had any coding related stress injuries, like back issues or carpal tunnel?
MoErgo Glove80 Wireless Split Ergonomic Keyboard
GitHub Next | Hey, GitHub!
48:41 What do you think of using “Feature Flags” in the codebase to enable / disable features at runtime?
51:19 SIIIIICK ××× PIIIICKS ×××
××× SIIIIICK ××× PIIIICKS ×××
Scott: History for Granite
Wes: GreatScott!, bigclivedotcom
Shameless Plugs
Scott: LevelUp Discord
Wes: Wes Bos Tutorials
Tweet us your tasty treats
Scott's Instagram
LevelUpTutorials Instagram
Wes' Instagram
Wes' Twitter
Wes' Facebook
Scott's Twitter
Make sure to include @SyntaxFM in your tweets

Startups For the Rest of Us
Episode 643 | Feature Flags, Impostor Syndrome, and More Listener Questions with Derrick Reimer

Startups For the Rest of Us

Play Episode Listen Later Jan 10, 2023 41:25


In episode 643, Rob Walling chats with fan favorite Derrick Reimer, the founder of SavvyCal, as they answer listener questions. They cover topics ranging from SaaS feature flags to communicating product needs to a technical founder and combating imposter syndrome.
Episode Sponsor: Find your perfect developer or a team at Lemon.io/startups
The competition for incredible engineers and developers has never been more fierce. Lemon.io helps you cut through the noise and find great talent through its network of engineers in Europe and Latin America. They take care of the vetting, interviewing, and testing of candidates to make sure that you are working with someone who can hit the ground running. When it comes to hiring, the time it takes to write your job description, list the position, review resumes, schedule interviews, and make an offer can take weeks, if not months. With Lemon.io, you can cut down on a lot of that time by tapping into their wide network of developers...

Python Bytes
#315 Some Stickers!

Python Bytes

Play Episode Listen Later Dec 20, 2022 29:56


Watch on YouTube
About the show
Sponsored by Microsoft for Startups Founders Hub.
Connect with the hosts
Michael: @mkennedy@fosstodon.org
Brian: @brianokken@fosstodon.org
Michael #1: Jupyter Server 2.0 is released!
Jupyter Server provides the core web server that powers JupyterLab and Jupyter Notebook.
New Identity API: As Jupyter continues to innovate its real-time collaboration experience, identity is an important component.
New Authorization API: Enabling collaboration on a notebook shouldn't mean “allow everyone with access to my Jupyter Server to edit my notebooks”. What if I want to share my notebook with e.g. a subset of my teammates?
New Event System API: jupyter_events—a package that provides a JSON-schema-based event-driven system to Jupyter Server and server extensions.
Terminals Service is now a Server Extension: Jupyter Server now ships the “Terminals Service” as an extension (installed and enabled by default) rather than a core Jupyter Service.
pytest-jupyter: A pytest plugin for Jupyter
Brian #2: Converting to pyproject.toml
Last week, episode 314, we talked about “Tools for rewriting Python code” and I mentioned a desire to convert setup.py/setup.cfg to pyproject.toml. Several of you, including Christian Clauss and Brian Skinn, let me know about a few tools to help in that area. Thank you.
ini2toml - Automatically translates .ini/.cfg files into TOML
“… can also be used to convert any compatible .ini/.cfg file to TOML.”
“ini2toml comes in two flavours: “lite” and “full”. The “lite” flavour will create a TOML document that does not contain any of the comments from the original .ini/.cfg file. On the other hand, the “full” flavour will make an extra effort to translate these comments into a TOML-equivalent (please notice sometimes this translation is not perfect, so it is always good to check the TOML document afterwards).”
pyproject-fmt - Apply a consistent format to pyproject.toml files
Having a consistent ordering and such is actually quite nice. I agreed with most changes when I tried it, except one change. The faulty change: it modified the name of my project. Not cool. pytest plugins are traditionally named pytest-something. The tool replaced the - with _. Wrong. So, be careful with adding this to your tool chain if you have intentional dashes in the name. Otherwise, and still, cool project.
validate-pyproject - Automated checks on pyproject.toml powered by JSON Schema definitions
It's a bit terse with errors, but still useful.
$ validate-pyproject pyproject.toml
Invalid file: pyproject.toml
[ERROR] `project.authors[{data__authors_x}]` must be object
$ validate-pyproject pyproject.toml
Invalid file: pyproject.toml
[ERROR] Invalid value (at line 3, column 12)
I'd probably add tox. Don't forget to build and test your project after making changes to pyproject.toml. You'll catch things like missing dependencies that the other tools will miss.
Michael #3: aws-lambda-powertools-python
Via Mark Pender
A suite of utilities for AWS Lambda Functions that makes distributed tracing, structured logging, custom metrics, idempotency, and many leading practices easier. Looks kinda cool if you prefer to work almost entirely in Python and avoid using any 3rd party tools like Terraform etc to manage the support functions of deploying, monitoring, and debugging lambda functions.
Tracing - Decorators and utilities to trace Lambda function handlers, and both synchronous and asynchronous functions
Logging - Structured logging made easier, and decorator to enrich structured logging with key Lambda context details
Metrics - Custom Metrics created asynchronously via CloudWatch Embedded Metric Format (EMF)
Event handler: AppSync - AWS AppSync event handler for Lambda Direct Resolver and Amplify GraphQL Transformer function
Event handler: API Gateway and ALB - Amazon API Gateway REST/HTTP API and ALB event handler for Lambda functions invoked using Proxy integration
Bring your own middleware - Decorator factory to create your own middleware to run logic before, and after each Lambda invocation
Parameters utility - Retrieve and cache parameter values from Parameter Store, Secrets Manager, or DynamoDB
Batch processing - Handle partial failures for AWS SQS batch processing
Typing - Static typing classes to speed up development in your IDE
Validation - JSON Schema validator for inbound events and responses
Event source data classes - Data classes describing the schema of common Lambda event triggers
Parser - Data parsing and deep validation using Pydantic
Idempotency - Convert your Lambda functions into idempotent operations which are safe to retry
Feature Flags - A simple rule engine to evaluate when one or multiple features should be enabled depending on the input
Streaming - Streams datasets larger than the available memory as streaming data.
Brian #4: How to create a self updating GitHub Readme
Bob Belderbos
Bob's GitHub profile is nice. It includes latest Pybites articles, latest Python tips, and even latest Fosstodon toots. And he includes a link to an article on how he did this. A Python script that pulls together all of the content, build-readme.py, fills in a TEMPLATE.md markdown file. It gets called through a GitHub action workflow, update.yml, and automatically commits changes, currently daily at 8:45. This happens every day, and it looks like there are only commits if something changed.
Note: We covered Simon Willison's notes on self updating README on episode 192 in 2020.
Extras
Brian:
GitHub can check your repos for leaked secrets.
Julia Evans has released a new zine, The Pocket Guide to Debugging
Python Easter Eggs
Includes this fun one from 2009 from Barry Warsaw and Brett Cannon
>>> from __future__ import barry_as_FLUFL
>>> 1 <> 2
True
>>> 1 != 2
  File "<stdin>", line 1
    1 != 2
         ^
SyntaxError: invalid syntax
Crontab Guru
Michael:
Canary Email AI
3.11 delivers
First chance to try “iPad as the sole travel device.” Here's a report. Follow up from 306 and 309 discussions.
Maps be free
New laptop design
Joke: What are clouds made of?
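The Feature Flags utility above is described as a simple rule engine that decides whether a feature is enabled based on the input. The sketch below shows that idea generically in Python; it is not the aws-lambda-powertools API, and the flag structure and field names are invented for the example.

def evaluate(flag: dict, context: dict) -> bool:
    """Return the flag decision for a request context.

    A flag has a default value plus optional rules; a rule matches when every
    one of its conditions equals the corresponding value in the context.
    """
    for rule in flag.get("rules", []):
        if all(context.get(key) == value for key, value in rule["when"].items()):
            return rule["value"]
    return flag["default"]

premium_discount = {
    "default": False,
    "rules": [
        {"when": {"tier": "premium", "region": "eu-west-1"}, "value": True},
    ],
}

print(evaluate(premium_discount, {"tier": "premium", "region": "eu-west-1"}))  # True
print(evaluate(premium_discount, {"tier": "free", "region": "eu-west-1"}))     # False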

Screaming in the Cloud
Consulting the Aspiring Consultant with Mike Julian

Screaming in the Cloud

Play Episode Listen Later Oct 20, 2022 30:33


About MikeBeside his duties as The Duckbill Group's CEO, Mike is the author of O'Reilly's Practical Monitoring, and previously wrote the Monitoring Weekly newsletter and hosted the Real World DevOps podcast. He was previously a DevOps Engineer for companies such as Taos Consulting, Peak Hosting, Oak Ridge National Laboratory, and many more. Mike is originally from Knoxville, TN (Go Vols!) and currently resides in Portland, OR.Links Referenced: @Mike_Julian: https://twitter.com/Mike_Julian mikejulian.com: https://mikejulian.com duckbillgroup.com: https://duckbillgroup.com TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: This episode is sponsored in part by our friends at AWS AppConfig. Engineers love to solve, and occasionally create, problems. But not when it's an on-call fire-drill at 4 in the morning. Software problems should drive innovation and collaboration, NOT stress, and sleeplessness, and threats of violence. That's why so many developers are realizing the value of AWS AppConfig Feature Flags. Feature Flags let developers push code to production, but hide that that feature from customers so that the developers can release their feature when it's ready. This practice allows for safe, fast, and convenient software development. You can seamlessly incorporate AppConfig Feature Flags into your AWS or cloud environment and ship your Features with excitement, not trepidation and fear. To get started, go to snark.cloud/appconfig. That's snark.cloud/appconfig.Corey: Forget everything you know about SSH and try Tailscale. Imagine if you didn't need to manage PKI or rotate SSH keys every time someone leaves. That'd be pretty sweet, wouldn't it? With Tailscale SSH, you can do exactly that. Tailscale gives each server and user device a node key to connect to its VPN, and it uses the same node key to authorize and authenticate SSH.Basically you're SSHing the same way you manage access to your app. What's the benefit here? Built in key rotation, permissions is code, connectivity between any two devices, reduce latency and there's a lot more, but there's a time limit here. You can also ask users to reauthenticate for that extra bit of security. Sounds expensive?Nope, I wish it were. Tailscale is completely free for personal use on up to 20 devices. To learn more, visit snark.cloud/tailscale. Again, that's snark.cloud/tailscaleCorey: Welcome to Screaming in the Cloud. I'm Cloud Economist Corey Quinn, and my guest is a returning guest on this show, my business partner and CEO of The Duckbill Group, Mike Julian. Mike, thanks for making the time.Mike: Lucky number three, I believe?Corey: Something like that, but numbers are hard. I have databases for that of varying quality and appropriateness for the task, but it works out. Anything's a database. If you're brave enough.Mike: With you inviting me this many times, I'm starting to think you'd like me or something.Corey: I know, I know. So, let's talk about something that is going to put that rumor to rest.Mike: [laugh].Corey: Clearly, you have made some poor choices in the course of your career, like being my business partner being the obvious one. 
But what's really in a dead heat for which is the worst decision is you've written a book previously. And now you are starting the process of writing another book because, I don't know, we don't keep you busy enough or something. What are you doing?Mike: Making very bad decisions. When I finished writing Practical Monitoring—O'Reilly, and by the way, you should go buy a copy if interested in monitoring—I finished the book and said, “Wow, that was awful. I'm never doing it again.” And about a month later, I started thinking of new books to write. So, that was 2017, and Corey and I started Duckbill and kind of stopped thinking about writing books because small companies are basically small children. But now I'm going to write a book about consulting.Corey: Oh, thank God. I thought you're going to go down the observability path a second time.Mike: You know, I'm actually dreading the day that O'Reilly asks me to do a second edition because I don't really want to.Corey: Yeah. Effectively turn it into an entire story where the only monitoring tool you really need is the AWS bill. That'll go well.Mike: [laugh]. Yeah. So yeah, like, basically, I've been doing consulting for such a long time, and most of my career is consulting in some form or fashion, and I head up all the consulting at Duckbill. I've learned a lot about consulting. And I've found that people have a lot of questions about consulting, particularly at the higher-end levels. Once you start getting into advisory sort of stuff, there's not a lot of great information out there aimed at engineering.Corey: There's a bunch of different views on what consulting is. You have independent contractors billing by the hour as staff replacement who call what they do consulting; you have the big consultancies, like Bain or BCG; you've got what we do in an advisory sense, and of course, you have a bunch of MBA new grads going to a lot of the big consultancies who are going to see a book on consulting and think that it's potentially for them. I don't know that you necessarily have a lot of advice for the new grad type, so who is this for? What is your target customer for this book?Mike: If you're interested in joining McKinsey out of college, I don't have a lot to add; I don't have a lot to tell you. The reason for that is kind of twofold. One is that shops like McKinsey and Deloitte and Accenture and BCG and Bain, all those, are playing very different games than what most of us think about when we think consulting. Their entire model revolves around running a process. And it's the same process for every client they work with. But, like, you're buying them because of their process.And that process is nothing new or novel. You don't go to those firms because you want the best advice possible. You go to those firms because it's the most defensible advice. It's sort of those things like, “No one gets fired for buying Cisco,” no one got fired for buying IBM, like, that sort of thing, it's a very defensible choice. But you're not going to get great results from it.But because of that, their entire model revolves around throwing dozens, in some cases, hundreds of new grads at a problem and saying, “Run this process. Have fun. Let us know if you need help.” That's not consulting I have any experience with. It's honestly not consulting that most of us want to do.Most of that is staffed by MBAs and accountants. When I think consulting, I think about specialized advice and providing that specialized advice to people. 
And I wager that most of us think about that in the same way, too. In some cases, it might just be, “I'm going to write code for you as a freelancer,” or I'm just going to tell you like, “Hey, put the nail in here instead of over here because it's going to be better for you.” Like, paying for advice is good.But with that, I also have a… one of the first things I say in the beginning of the book, which [laugh] I've already started writing because I'm a glutton for punishment, is I don't think junior people should be consultants. I actually think it's really bad idea because to be a consultant, you have to have expertise in some area, and junior staff don't. They haven't been in their careers long enough to develop that yet. So, they're just going to flounder. So, my advice is generally aimed at people that have been in their careers for quite some time, generally, people that are 10, 15, 20 years into their career, looking to do something.Corey: One of the problems that we see when whenever we talk about these things on Twitter is that we get an awful lot of people telling us that we're wrong, that it can't be made to work, et cetera, et cetera. But following this model, I've been independent for—well, I was independent and then we became The Duckbill Group; add them together because figuring out exactly where that divide happened is always a mental leap for me, but it's been six years at this point. We've definitely proven our ability to not go out of business every month. It's kind of amazing. Without even an exception case of, “That one time.”Mike: [laugh]. Yeah, we are living proof that it does work, but you don't really have to take just our word for it because there are a lot of other firms that exist entirely on an advisory-only, high-expertise model. And it works out really well. We've worked with several of them, so it does work; it just isn't very common inside of tech and particularly inside of engineering.Corey: So, one of the things that I find is what differentiates an expert from an enthusiastic amateur is, among other things, the number of mistakes that they've made. So, I guess a different way of asking this is what qualifies you to write this book, but instead, I'm going to frame it in a very negative way. What have you screwed up on that puts you in a position of, “Ah, I'm going to write a book so that someone else can make better choices.”Mike: One of my favorite stories to tell—and Corey, I actually think you might not have heard this story before—Corey: That seems unlikely, but give it a shot.Mike: Yeah. So, early in my career, I was working for a consulting firm that did ERP implementations. We worked with mainly large, old-school manufacturing firms. So, my job there was to do the engineering side of the implementation. So, a lot of rack-and-stack, a lot of Windows Server configuration, a lot of pulling cables, that sort of thing. So, I thought I was pretty good at this. I quickly learned that I was actually not nearly as good as I thought I was.Corey: A common affliction among many different people.Mike: A common affliction. But I did not realize that until this one particular incident. So, me and my boss are both on site at this large manufacturing facility, and the CFO pulls my boss aside and I can hear them talking and, like, she's pretty upset. She points at me and says, “I never want this asshole in my office ever again.” So, he and I have a long drive back to our office, like an hour and a half.And we had a long chat about what that meant for me. 
I was not there for very long after that, as you might imagine, but the thing is, I still have no idea to this day what I did to upset her. I know that she was pissed and he knows that she was pissed. And he never told me exactly what it was, only that's you take care of your client. And the client believes that I screwed up so massively that she wanted me fired.Him not wanting to argue—he didn't; he just kind of went with it—and put me on other clients. But as a result of that, it really got me thinking that I screwed something up so badly to make this person hate me so much and I still have no idea what it was that I did. Which tells me that even at the time, I did not understand what was going on around me. I did not understand how to manage clients well, and to really take care of them. That was probably the first really massive mistake that I've made my career—or, like, the first time I came to the realization that there's a whole lot I don't know and it's really costing me.Corey: From where I sit, there have been a number of things that we have done as we've built our consultancy, and I'm curious—you know, let's get this even more personal—in the past, well, we'll call it four years that we have been The Duckbill Group—which I think is right—what have we gotten right and what have we gotten wrong? You are the expert; you're writing a book on this for God's sake.Mike: So, what I think we've gotten right is one of my core beliefs is never bill hourly. Shout out to Jonathan Stark. He wrote I really good book that is a much better explanation of that than I've ever been able to come up with. But I've always had the belief that billing hourly is just a bad idea, so we've never done that and that's worked out really well for us. We've turned down work because that's the model they wanted and it's like, “Sorry, that's not what we do. You're going to have to go work for someone else—or hire someone else.”Other things that I think we've gotten right is a focus on staying on the advisory side and not doing any implementation. That's allowed us to get really good at what we do very quickly because we don't get mired in long-term implementation detail-level projects. So, that's been great. Where we went a little wrong, I think—or what we have gotten wrong, lessons that we've learned. I had this idea that we could build out a junior and mid-level staff and have them overseen by very senior people.And, as it turns out, that didn't work for us, entirely because it didn't work for me. That was really my failure. I went from being an IC to being the leader of a company in one single step. I've never been a manager before Duckbill. So, that particular mistake was really about my lack of abilities in being a good manager and being a good leader.So, building that out, that did not work for us because it didn't work for me and I didn't know how to do it. So, I made way too many mistakes that were kind of amateur-level stuff in terms of management. So, that didn't work. And the other major mistake that I think we've made is not putting enough effort into marketing. So, we get most of our leads by inbound or referral, as is common with boutique consulting firms, but a lot of the income that we get comes through Last Week in AWS, which is really awesome.But we don't put a whole lot of effort into content or any marketing stuff related to the thing that we do, like cost management. I think a lot of that is just that we don't really know how, aside from just creating content and publishing it. 
We don't really understand how to market ourselves very well on that side of things. I think that's a mistake we've made.Corey: It's an effective strategy against what's a very complicated problem because unlike most things, if—let's go back to your old life—if we have an observability problem, we will talk about that very publicly on Twitter and people will come over and get—“Hey, hey, have you tried to buy my company's product?” Or they'll offer consulting services, or they'll point us in the right direction, all of which is sometimes appreciated. Whereas when you have a big AWS bill, you generally don't talk about it in public, especially if you're a serious company because that's going to, uh, I think the phrase is, “Shake investor confidence,” when you're actually live tweeting slash shitposting about your own AWS bill. And our initial thesis was therefore, since we can't wind up reaching out to these people when they're having the pain because there's no external indication of it, instead what we have to do is be loud enough and notable in this space, where they find us where it shouldn't take more than them asking one or two of their friends before they get pointed to us. What's always fun as the stories we hear is, “Okay, so I asked some other people because I wanted a second opinion, and they told us to go to you, too.” Word of mouth is where our customers come from. But how do you bootstrap that? I don't know. I'm lucky that I got it right the first time.Mike: Yeah, and as I mentioned a minute ago, that a lot of that really comes through your content, which is not really cost management-related. It's much more AWS broad. We don't put out a lot of cost management specific content. And honestly, I think that's to our detriment. We should and we absolutely can. We just haven't. I think that's one of the really big things that we've missed on doing.Corey: There's an argument that the people who come to us do not spend their entire day thinking about AWS bills. I mean, I can't imagine what that would be like, but they don't for whatever reason; they're trying to do something ridiculous, like you know, run a profitable company. So, getting in front of them when they're not thinking about the bills means, on some level, that they're going to reach out to us when the bill strikes. At least that's been my operating theory.Mike: Yeah, I mean, this really just comes down to content strategy and broader marketing strategy. Because one of the things you have to think about with marketing is how do you meet a customer at the time that they have the problem that you solve? And what most marketing people talk about here is what's called the triggering event. Something causes someone to take an action. What is that something? Who is that someone, and what is that action?And for us, one of the things that we thought early on is that well, the bill comes out the first week of the month, every month, so people are going to opened the bill freak out, and a big influx of leads are going to come our way and that's going to happen every single month. The reality is that never happened. That turns out was not a triggering event for anyone.Corey: And early on, when we didn't have that many leads coming in, it was a statistical aberration that I thought I saw, like, “Oh, out of the three leads this month, two of them showed up in the same day. Clearly, it's an AWS billing day thing.” No. It turns out that every company's internal cadence is radically different.Mike: Right. 
And I wish I could say that we have found what our triggering events are, but I actually don't think we have. We know who the people are and we know what they reach out for, but we haven't really uncovered that triggering event. And it could also be there, there isn't a one. Or at least, if there is one, it's not one that we could see externally, which is kind of fine.Corey: Well, for the half of our consulting that does contract negotiation for large-scale commitments with AWS, it comes up for renewal or the initial discount contract gets offered, those are very clear triggering events but the challenge is that we don't—Mike: You can't see them externally.Corey: —really see that from the outside. Yeah.Mike: Right. And this is one of those things where there are triggering events for basically everything and it's probably going to be pretty consistent once you get down to specific services. Like we provide cost optimization services and contract negotiation services. I'm willing to bet that I can predict exactly what the trigger events for both of those will be pretty well. The problem is, you can never see those externally, which is kind of fine.Ideally, you would be able to see it externally, but you can't, so we roll with it, which means our entire strategy has revolved around always being top-of-mind because at the time where it happens, we're already there. And that's a much more difficult strategy to employ, but it does work.Corey: All it takes is time and being really lucky and being really prolific, and, and, and. It's one of those things where if I were to set out to replicate it, I don't even know how I'd go about doing it.Mike: People have been asking me. They say, “I want to create The Duckbill Group for X. What do I do?” And I say, “First step, get yourself a Corey Quinn.” And they're like, “Well, I can't do that. There's only one.” I'm like, “Yep. Sucks to be you.” [laugh].Corey: Yeah, we called the Jerk Store. They're running out of him. Yeah, it's a problem. And I don't think the world needs a whole lot more of my type of humor, to be honest, because the failure mode that I have experienced brutally and firsthand is not that people don't find me funny; it's that it really hurts people's feelings. I have put significant effort into correcting those mistakes and not repeating them, but it sucks every time I get it wrong.Mike: Yeah.Corey: Another question I have for you around the book targeting, are you aiming this at individual independent consultants or are you looking to advise people who are building agencies?Mike: Explicitly not the latter. My framing around this is that there are a number of people who are doing consulting right now and they've kind of fell into it. Often, they'll leave one job and do a little consulting while they're waiting on their next thing. And in some cases, that might be a month or two. In some cases, it might go on years, but that whole time, they're just like, “Oh, yeah, I'm doing consulting in between things.”But at some point, some of those think, “You know what? I want this to be my thing. I don't want there to be a next thing. This is my thing. So therefore, how do I get serious about doing consulting? How do I get serious about being a consultant?”And that's where I think I can add a lot of value because casually consulting of, like, taking whatever work just kind of falls your way is interesting for a while, but once you get serious about it, and you have to start thinking, well, how do I actually deliver engagements? 
How do I do that consistently? How do I do it repeatedly? How to do it profitably? How do I price my stuff? How do I package it? How do I attract the leads that I want? How do I work with the customers I want?And turning that whole thing from a casual, “Yeah, whatever,” into, “This is my business,” is a very different way of thinking. And most people don't think that way because they didn't really set out to build a business. They set out to just pass time and earn a little bit of money before they went off to the next job. So, the framing that I have here is that I'm aiming to help people that are wanting to get serious about doing consulting. But they generally have experience doing it already.Corey: Managing shards. Maintenance windows. Overprovisioning. ElastiCache bills. I know, I know. It's a spooky season and you're already shaking. It's time for caching to be simpler. Momento Serverless Cache lets you forget the backend to focus on good code and great user experiences. With true autoscaling and a pay-per-use pricing model, it makes caching easy. No matter your cloud provider, get going for free at gomemento.co/screaming That's GO M-O-M-E-N-T-O dot co slash screamingCorey: We went from effectively being the two of us on the consulting delivery side, two scaling up to, I believe, at one point we were six of us, and now we have scaled back down to largely the two of us, aided by very specific external folk, when it makes sense.Mike: And don't forget April.Corey: And of course. I'm talking delivery.Mike: [laugh].Corey: There's a reason I—Mike: Delivery. Yes.Corey: —prefaced it that way. There's a lot of support structure here, let's not get ourselves, and they make this entire place work. But why did we scale up? And then why did we scale down? Because I don't believe we've ever really talked about that publicly.Mike: No, not publicly. In fact, most people probably don't even notice that it happened. We got pretty big for—I mean, not big. So, we hit, I think, six full-time people at one point. And that was quite a bit.Corey: On the delivery side. Let's be clear.Mike: Yeah. No, I think actually with support structure, too. Like, if you add in everyone that we had with the sales and marketing as well, we were like 11 people. And that was a pretty sizable company. But then in July this year, it kind of hit a point where I found that I just wasn't enjoying my job anymore.And I looked around and noticed that a lot of other people was kind of feeling the same way, is just things had gotten harder. And the business wasn't suffering at all, it was just everything felt more difficult. And I finally realized that, for me personally at least, I started Duckbill because I love working with clients, I love doing consulting. And what I have found is that as the company grew larger and larger, I spent most of my time keeping the trains running and taking care of the staff. Which is exactly what I should be doing when we're that size, like, that is my job at that size, but I didn't actually enjoy it.I went into management as, like, this job going from having never done it before. So, I didn't have anything to compare it to. I didn't know if I would like it or not. And once I got here, I realized I actually don't. And I spent a lot of efforts to get better at it and I think I did. I've been working with a leadership coach for years now.But it finally came to a point where I just realized that I wasn't actually enjoying it anymore. I wasn't enjoying the job that I had created. 
And I think that really panned out to you as well. So, we decided, we had kind of an opportune time where one of our team decided that they were also wanting to go back to do independent consulting. I'm like, “Well, this is actually pretty good time. Why don't we just start scaling things back?” And like, maybe we'll scale it up again in the future; maybe we won't. But like, let's just buy ourselves some breathing room.Corey: One of the things that I think we didn't spend quite enough time really asking ourselves was what kind of place do we want to work at. Because we've explicitly stated that you and I both view this as the last job either of us is ever going to have, which means that we're not trying to do the get big quickly to get acquired, or we want to raise a whole bunch of other people's money to scale massively. Those aren't things either of us enjoy. And it turns out that handling the challenges of a business with as many people working here as we had wasn't what either one of us really wanted to do.Mike: Yeah. You know what—[laugh] it's funny because a lot of our advisors kept asking the same thing. Like, “So, what kind of company do you want?” And like, we had some pretty good answers for that, in that we didn't want to build a VC-backed company, we didn't ever want to be hyperscale. But there's a wide gulf of things between two-person company and hyperscale and we didn't really think too much about that.In fact, being a ten-person company is very different than being a three-person company, and we didn't really think about that either. We should have really put a lot more thought into that of what does it mean to be a ten-person company, and is that what we want? Or is three, four, or five-person more our style? But then again, I don't know that we could have predicted that as a concern had we not tried it first.Corey: Yeah, that was very much something that, for better or worse, we pay advisors for their advice—that's kind of definitionally how it works—and then we ignored it, on some level, though we thought we were doing something different at the time because there's some lessons you've just got to learn by making the mistake yourself.Mike: Yeah, we definitely made a few of those. [laugh].Corey: And it's been an interesting ride and I've got zero problem with how things have shaken out. I like what we do quite a bit. And honestly, the biggest fear I've got going forward is that my jackass business partner is about to distract the hell out of himself by writing a book, which is never as easy as even the most pessimistic estimates would be. So, that's going to be awesome and fun.Mike: Yeah, just wait until you see the dedication page.Corey: Yeah, I wasn't mentioned at all in the last book that you wrote, which I found personally offensive. So, if I'm not mentioned this time, you're fired.Mike: Oh, no, you are. It's just I'm also adding an anti-dedication page, which just has a photo of you.Corey: Oh, wonderful, wonderful. This is going to be one of those stories of the good consultant and the bad consultant, and I'm going to be the Goofus to your Gallant, aren't I?Mike: [laugh]. Yes, yes. You are.Corey: “Goofus wants to bill by the hour.”Mike: It's going to have a page of, like, “Here's this [unintelligible 00:25:05] book is dedicated to. Here's my acknowledgments. And [BLEEP] this guy.”Corey: I love it. I absolutely love it. I think that there is definitely a bright future for telling other people how to consult properly. 
May just suggest as a subtitle for the book is Consulting—subtitle—You Have Problems and Money. We'll Take Both.Mike: [laugh]. Yeah. My working title for this is Practical Consulting, but only because my previous book was Practical Monitoring. Pretty sure O'Reilly would have a fit if I did that. I actually have no idea what I'm going to call the book, still.Corey: Naming things is super hard. I would suggest asking people at AWS who name services and then doing the exact opposite of whatever they suggest. Like, take their list of recommendations and sort by reverse order and that'll get you started.Mike: Yeah. [laugh].Corey: I want to thank you for giving us an update on what you're working on and why you have less hair every time I see you because you're mostly ripping it out due to self-inflicted pain. If people want to follow your adventures, where's the best place to keep updated on this ridiculous, ridiculous nonsense that I cannot talk you out of?Mike: Two places. You can follow me on Twitter, @Mike_Julian, or you can sign up for the newsletter on my site at mikejulian.com where I'll be posting all the updates.Corey: Excellent. And I look forward to skewering the living hell out of them.Mike: I look forward to ignoring them.Corey: Thank you, Mike. It is always a pleasure.Mike: Thank you, Corey.Corey: Mike Julian, CEO at The Duckbill Group, and my unwilling best friend. I'm Cloud Economist Corey Quinn and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice along with an angry, annoying comment in which you tell us exactly what our problem is, and then charge us a fixed fee to fix that problem.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.Announcer: This has been a HumblePod production. Stay humble.

Screaming in the Cloud
The Evolution of Cloud Services with Richard Hartmann

Screaming in the Cloud

Play Episode Listen Later Oct 18, 2022 45:26


About RichardRichard "RichiH" Hartmann is the Director of Community at Grafana Labs, Prometheus team member, OpenMetrics founder, OpenTelemetry member, CNCF Technical Advisory Group Observability chair, CNCF Technical Oversight Committee member, CNCF Governing Board member, and more. He also leads, organizes, or helps run various conferences from hundreds to 18,000 attendess, including KubeCon, PromCon, FOSDEM, DENOG, DebConf, and Chaos Communication Congress. In the past, he made mainframe databases work, ISP backbones run, kept the largest IRC network on Earth running, and designed and built a datacenter from scratch. Go through his talks, podcasts, interviews, and articles at https://github.com/RichiH/talks or follow him on Twitter at https://twitter.com/TwitchiH for musings on the intersection of technology and society.Links Referenced: Grafana Labs: https://grafana.com/ Twitter: https://twitter.com/TwitchiH Richard Hartmann list of talks: https://github.com/richih/talks TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: This episode is sponsored in part by our friends at AWS AppConfig. Engineers love to solve, and occasionally create, problems. But not when it's an on-call fire-drill at 4 in the morning. Software problems should drive innovation and collaboration, NOT stress, and sleeplessness, and threats of violence. That's why so many developers are realizing the value of AWS AppConfig Feature Flags. Feature Flags let developers push code to production, but hide that that feature from customers so that the developers can release their feature when it's ready. This practice allows for safe, fast, and convenient software development. You can seamlessly incorporate AppConfig Feature Flags into your AWS or cloud environment and ship your Features with excitement, not trepidation and fear. To get started, go to snark.cloud/appconfig. That's snark.cloud/appconfig.Corey: This episode is brought to us in part by our friends at Datadog. Datadog's SaaS monitoring and security platform that enables full stack observability for developers, IT operations, security, and business teams in the cloud age. Datadog's platform, along with 500 plus vendor integrations, allows you to correlate metrics, traces, logs, and security signals across your applications, infrastructure, and third party services in a single pane of glass.Combine these with drag and drop dashboards and machine learning based alerts to help teams troubleshoot and collaborate more effectively, prevent downtime, and enhance performance and reliability. Try Datadog in your environment today with a free 14 day trial and get a complimentary T-shirt when you install the agent.To learn more, visit datadoghq/screaminginthecloud to get. That's www.datadoghq/screaminginthecloudCorey: Welcome to Screaming in the Cloud, I'm Corey Quinn. There are an awful lot of people who are incredibly good at understanding the ins and outs and the intricacies of the observability world. But they didn't have time to come on the show today. 
Instead, I am talking to my dear friend of two decades now, Richard Hartmann, better known on the internet as RichiH, who is the Director of Community at Grafana Labs, here to suffer—in a somewhat atypical departure for the theme of this show—personal attacks for once. Richie, thank you for joining me.Richard: And thank you for agreeing on personal attacks.Corey: Exactly. It was one of your riders. Like, there have to be the personal attacks back and forth or you refuse to appear on the show. You've been on before. In fact, the last time we did a recording, I believe you were here in person, which was a long time ago. What have you been up to?You're still at Grafana Labs. And in many cases, I would point out that, wow, you've been there for many years; that seems to be an atypical thing, which is an American tech industry perspective because every time you and I talk about this, you look at folks who—wow, you were only at that company for five years. What's wrong with you—you tend to take the longer view and I tend to have the fast twitch, time to go ahead and leave jobs because it's been more than 20 minutes approach. I see that you're continuing to live what you preach, though. How's it been?Richard: Yeah, so there's a little bit of Covid brains, I think. When we talked in 2018, I was still working at SpaceNet, building a data center. But the last two-and-a-half years didn't really happen for many people, myself included. So, I guess [laugh] that includes you.Corey: No, no you're right. You've only been at Grafana Labs a couple of years. One would think I would check the notes for shooting my mouth off. But then, one wouldn't know me.Richard: What notes? Anyway, I've been around Prometheus and Grafana Since 2015. But it's like, real, full-time everything is 2020. There was something in between. Since 2018, I contracted to do vulnerability handling and everything for Grafana Labs because they had something and they didn't know how to deal with it.But no, full time is 2020. But as to the space in the [unintelligible 00:02:45] of itself, it's maybe a little bit German of me, but trying to understand the real world and trying to get an overview of systems and how they actually work, and if they are working correctly and as intended, and if not, how they're not working as intended, and how to fix this is something which has always been super important to me, in part because I just want to understand the world. And this is a really, really good way to automate understanding of the world. So, it's basically a work-saving mechanism. And that's why I've been sticking to it for so long, I guess.Corey: Back in the early days of monitoring systems—so we called it monitoring back then because, you know, are using simple words that lack nuance was sort of de rigueur back then—we wound up effectively having tools. Nagios is the one that springs to mind, and it was terrible in all the ways you would expect a tool written in janky Perl in the early-2000s to be. But it told you what was going on. It tried to do a thing, generally reach a server or query it about things, and when things fell out of certain specs, it screamed its head off, which meant that when you had things like the core switch melting down—thinking of one very particular incident—you didn't get a Nagios alert; you got 4000 Nagios alerts. 
But start to finish, you could wrap your head rather fully around what Nagios did and why it did the sometimes strange things that it did.These days, when you take a look at Prometheus, which we hear a lot about, particularly in the Kubernetes space and Grafana, which is often mentioned in the same breath, it's never been quite clear to me exactly where those start and stop. It always feels like it's a component in a larger system to tell you what's going on rather than a one-stop shop that's going to, you know, shriek its head off when something breaks in the middle of the night. Is that the right way to think about it? The wrong way to think about it?Richard: It's a way to think about it. So personally, I use the terms monitoring and observability pretty much interchangeably. Observability is a relatively well-defined term, even though most people won't agree. But if you look back into the '70s into control theory where the term is coming from, it is the measure of how much you're able to determine the internal state of a system by looking at its inputs and its outputs. Depending on the definition, some people don't include the inputs, but that is the OG definition as far as I'm aware.And from this, there flow a lot of things. This question of—or this interpretation of the difference between telling that, yes, something's broken versus why something's broken. Or if you can't ask new questions on the fly, it's not observability. Like all of those things are fundamentally mapped to this definition of, I need enough data to determine the internal state of whatever system I have just by looking at what is coming in, what is going out. And that is at the core the thing. Now, obviously, it's become a buzzword, which is oftentimes the fate of successful things. So, it's become a buzzword, and you end up with cargo culting.Corey: I would argue periodically, that observability is hipster monitoring. If you call it monitoring, you get yelled at by Charity Majors. Which is tongue and cheek, but she has opinions, made, nonetheless shall I say, frustrating by the fact that she is invariably correct in those opinions, which just somehow makes it so much worse. It would be easy to dismiss things she says if she weren't always right. And the world is changing, especially as we get into the world of distributed systems.Is the server that runs the app working or not working loses meaning when we're talking about distributed systems, when we're talking about containers running on top of Kubernetes, which turns every outage into a murder mystery. We start having distributed applications composed of microservices, so you have no idea necessarily where an issue is. Okay, is this one microservice having an issue related to the request coming into a completely separate microservice? And it seems that for those types of applications, the answer has been tracing for a long time now, where originally that was something that felt like it was sprung, fully-formed from the forehead of some God known as one of the hyperscalers, but now is available to basically everyone, in theory.In practice, it seems that instrumenting applications still one of the hardest parts of all of this. I tried hooking up one of my own applications to be observed via OTEL, the open telemetry project, and it turns out that right now, OTEL and AWS Lambda have an intersection point that makes everything extremely difficult to work with. It's not there yet; it's not baked yet. 
And someday, I hope that changes because I would love to interchangeably just throw metrics and traces and logs to all the different observability tools and see which ones work, which ones don't, but that still feels very far away from current state of the art.Richard: Before we go there, maybe one thing which I don't fully agree with. You said that previously, you were told if a service up or down, that's the thing which you cared about, and I don't think that's what people actually cared about. At that time, also, what they fundamentally cared about: is the user-facing service up, or down, or impacted? Is it slow? Does it return errors every X percent for requests, something like this?Corey: Is the site up? And—you're right, I was hand-waving over a whole bunch of things. It was, “Okay. First, the web server is returning a page, yes or no? Great. Can I ping the server?” Okay, well, there are ways of server can crash and still leave enough of the TCP/IP stack up or it can respond to pings and do little else.And then you start adding things to it. But the Nagios thing that I always wanted to add—and had to—was, is the disk full? And that was annoying. And, on some level, like, why should I care in the modern era how much stuff is on the disk because storage is cheap and free and plentiful? The problem is, after the third outage in a month because the disk filled up, you start to not have a good answer for well, why aren't you monitoring whether the disk is full?And that was the contributors to taking down the server. When the website broke, there were what felt like a relatively small number of reasonably well-understood contributors to that at small to midsize applications, which is what I'm talking about, the only things that people would let me touch. I wasn't running hyperscale stuff where you have a fleet of 10,000 web servers and, “Is the server up?” Yeah, in that scenario, no one cares. But when we're talking about the database server and the two application servers and the four web servers talking to them, you think about it more in terms of pets than you do cattle.Richard: Yes, absolutely. Yet, I think that was a mistake back then, and I tried to do it differently, as a specific example with the disk. And I'm absolutely agreeing that previous generation tools limit you in how you can actually work with your data. In particular, once you're with metrics where you can do actual math on the data, it doesn't matter if the disk is almost full. It matters if that disk is going to be full within X amount of time.If that disk is 98% full and it sits there at 98% for ten years and provides the service, no one cares. The thing is, will it actually run out in the next two hours, in the next five hours, what have you. Depending on this, is this currently or imminently a customer-impacting or user-impacting then yes, alert on it, raise hell, wake people, make them fix it, as opposed to this thing can be dealt with during business hours on the next workday. And you don't have to wake anyone up.Corey: Yeah. The big filer with massive amounts of storage has crossed the 70% line. Okay, now it's time to start thinking about that, what do you want to do? Maybe it's time to order another shelf of discs for it, which is going to take some time. That's a radically different scenario than the 20 gigabyte root volume on your server just started filling up dramatically; the rate of change is such that'll be full in 20 minutes.Yeah, one of those is something you want to wake people up for. 
Generally speaking, you don't want to wake people up for what is fundamentally a longer-term strategic business problem. That can be sorted out in the light of day versus, “[laugh] we're not going to be making money in two hours, so if I don't wake up and fix this now.” That's the kind of thing you generally want to be woken up for. Well, let's be honest, you don't want that to happen at all, but if it does happen, you kind of want to know in advance rather than after the fact.Richard: You're literally describing linear predict from Prometheus, which is precisely for this, where I can look back over X amount of time and make a linear prediction because everything else breaks down at scale, blah, blah, blah, to detail. But the thing is, I can draw a line with my pencil by hand on my data and I can predict when is this thing going to it. Which is obviously precisely correct if I have a TLS certificate. It's a little bit more hand-wavy when it's a disk. But still, you can look into the future and you say, “What will be happening if current trends for the last X amount of time continue in Y amount of time.” And that's precisely a thing where you get this more powerful ability of doing math with your data.Corey: See, when you say it like that, it sounds like it actually is a whole term of art, where you're focusing on an in-depth field, where salaries are astronomical. Whereas the tools that I had to talk about this stuff back in the day made me sound like, effectively, the sysadmin that I was grunting and pointing: “This is gonna fill up.” And that is how I thought about it. And this is the challenge where it's easy to think about these things in narrow, defined contexts like that, but at scale, things break.Like the idea of anomaly detection. Well, okay, great if normally, the CPU and these things are super bored and suddenly it gets really busy, that's atypical. Maybe we should look into it, assuming that it has a challenge. The problem is, that is a lot harder than it sounds because there are so many factors that factor into it. And as soon as you have something, quote-unquote, “Intelligent,” making decisions on this, it doesn't take too many false positives before you start ignoring everything it has to say, and missing legitimate things. It's this weird and obnoxious conflation of both hard technical problems and human psychology.Richard: And the breaking up of old service boundaries. Of course, when you say microservices, and such, fundamentally, functionally a microservice or nanoservice, picoservice—but the pendulum is already swinging back to larger units of complexity—but it fundamentally does not make any difference if I have a monolith on some mainframe or if I have a bunch of microservices. Yes, I can scale differently, I can scale horizontally a lot more easily, vertically, it's a little bit harder, blah, blah, blah, but fundamentally, the logic and the complexity, which is being packaged is fundamentally the same. More users, everything, but it is fundamentally the same. What's happening again, and again, is I'm breaking up those old boundaries, which means the old tools which have assumptions built in about certain aspects of how I can actually get an overview of a system just start breaking down, when my complexity unit or my service or what have I, is usually congruent with a physical piece, of hardware or several services are congruent with that piece of hardware, it absolutely makes sense to think about things in terms of this one physical server. 
The fact that you have different considerations in cloud, and microservices, and blah, blah, blah, is not inherently that it is more complex.On the contrary, it is fundamentally the same thing. It scales with users' everything, but it is fundamentally the same thing, but I have different boundaries of where I put interfaces onto my complexity, which basically allow me to hide all of this complexity from the downstream users.Corey: That's part of the challenge that I think we're grappling with across this entire industry from start to finish. Where we originally looked at these things and could reason about it because it's the computer and I know how those things work. Well, kind of, but okay, sure. But then we start layering levels of complexity on top of layers of complexity on top of layers of complexity, and suddenly, when things stop working the way that we expect, it can be very challenging to unpack and understand why. One of the ways I got into this whole space was understanding, to some degree, of how system calls work, of how the kernel wound up interacting with userspace, about how Linux systems worked from start to finish. And these days, that isn't particularly necessary most of the time for the care and feeding of applications.The challenge is when things start breaking, suddenly having that in my back pocket to pull out could be extremely handy. But I don't think it's nearly as central as it once was and I don't know that I would necessarily advise someone new to this space to spend a few years as a systems person, digging into a lot of those aspects. And this is why you need to know what inodes are and how they work. Not really, not anymore. It's not front and center the way that it once was, in most environments, at least in the world that I live in. Agree? Disagree?Richard: Agreed. But it's very much unsurprising. You probably can't tell me how to precisely grow sugar cane or corn, you can't tell me how to refine the sugar out of it, but you can absolutely bake a cake. But you will not be able to tell me even a third of—and I'm—for the record, I'm also not able to tell you even a third about the supply chain which just goes from I have a field and some seeds and I need to have a package of refined sugar—you're absolutely enabled to do any of this. The thing is, you've been part of the previous generation of infrastructure where you know how this underlying infrastructure works, so you have more ability to reason about this, but it's not needed for cloud services nearly as much.You need different types of skill sets, but that doesn't mean the old skill set is completely useless, at least not as of right now. It's much more a case of you need fewer of those people and you need them in different places because those things have become infrastructure. Which is basically the cloud play, where a lot of this is just becoming infrastructure more and more.Corey: Oh, yeah. Back then I distinctly remember my elders looking down their noses at me because I didn't know assembly, and how could I possibly consider myself a competent systems admin if I didn't at least have a working knowledge of assembly? Or at least C, which I, over time, learned enough about to know that I didn't want to be a C programmer. And you're right, this is the value of cloud and going back to those days getting a web server up and running just to compile Apache's httpd took a week and an in-depth knowledge of GCC flags.And then in time, oh, great. We're going to have rpm or debs. 
Great, okay, then in time, you have apt, if you're in the dev land because I know you are a Debian developer, but over in Red Hat land, we had yum and other tools. And then in time, it became oh, we can just use something like Puppet or Chef to wind up ensuring that thing is installed. And then oh, just docker run. And now it's a checkbox in a web console for S3.These things get easier with time and step by step by step we're standing on the shoulders of giants. Even in the last ten years of my career, I used to have a great challenge question that I would interview people with of, “Do you know what TinyURL is? It takes a short URL and then expands it to a longer one. Great, on the whiteboard, tell me how you would implement that.” And you could go up one side and down the other, and then you could add constraints, multiple data centers, now one goes offline, how do you not lose data? Et cetera, et cetera.But these days, there are so many ways to do that using cloud services that it almost becomes trivial. It's okay, multiple data centers, API Gateway, a Lambda, and a global DynamoDB table. Now, what? “Well, now it gets slow. Why is it getting slow?”“Well, in that scenario, probably because of something underlying the cloud provider.” “And so now, you lose an entire AWS region. How do you handle that?” “Seems to me when that happens, the entire internet's kind of broken. Do people really need longer URLs?”And that is a valid answer, in many cases. The question doesn't really work without a whole bunch of additional constraints that make it sound fake. And that's not a weakness. That is the fact that computers and cloud services have never been as accessible as they are now. And that's a win for everyone.Richard: There's one aspect of accessibility which is actually decreasing—or two. A, you need to pay for them on an ongoing basis. And B, you need an internet connection which is suitably fast, low latency, what have you. And those are things which actually do make things harder for a variety of reasons. If I look at our back-end systems—as in Grafana—all of them have single binary modes where you literally compile everything into a single binary and you can run it on your laptop because if you're stuck on a plane, you can't do any work on it. That kind of is not the best of situations.And if you have a huge CI/CD pipeline, everything in this cloud and fine and dandy, but your internet breaks. Yeah, so I do agree that it is becoming generally more accessible. I disagree that it is becoming more accessible along all possible axes.Corey: I would agree. There is a silver lining to that as well, where yes, they are fraught and dangerous and I would preface this with a whole bunch of warnings, but from a cost perspective, all of the cloud providers do have a free tier offering where you can kick the tires on a lot of these things in return for no money. Surprisingly, the best one of those is Oracle Cloud where they have an unlimited free tier, use whatever you want in this subset of services, and you will never be charged a dime. As opposed to the AWS model of free tier where well, okay, it suddenly got very popular or you misconfigured something, and surprise, you now owe us enough money to buy Belize. That doesn't usually lead to a great customer experience.But you're right, you can't get away from needing an internet connection of at least some level of stability and throughput in order for a lot of these things to work. 
The stuff you would do locally on a Raspberry Pi, for example, if you're budget-constrained and want to get something out there, or your laptop. Great, that's not going to work in the same way as a full-on cloud service will.Richard: It's not free unless you have hard guarantees that you're not going to ever pay anything. It's fine to send warnings, it's fine to switch the thing off, it's fine to have you hit random hard and soft quotas. It is not a free service if you can't guarantee that it is free.Corey: I agree with you. I think that there needs to be a free offering where, "Well, okay, you want us to suddenly stop serving traffic to the world?" "Yes. When the alternative is you have to start charging me through the nose, yes I want you to stop serving traffic." That is definitionally what it says on the tin.And as an independent learner, that is what I want. Conversely, if I'm an enterprise, yeah, I don't care about money; we're running our Super Bowl ad right now, so whatever you do, don't stop serving traffic. Charge us all the money. And there's been a lot of hand-wringing about, well, how do we figure out which direction to go in? And it's, have you considered asking the customer?So, on a scale of one to bank, how serious is this account going to be [laugh]? Like, what are your big concerns: never charge me or never go down? Because we can build for either of those. Just let's make sure that all of those expectations are aligned. Because if you guess, you're going to get it wrong and then no one's going to like you.Richard: I would argue this. All those services from all cloud providers are actually built to address both of those. It's a deliberate choice not to offer certain aspects.Corey: Absolutely. When I talk to AWS, like, "Yeah, but there is an eventual consistency challenge in the billing system where it takes"—as anyone who's looked at the billing system can see—"Multiple days, sometimes for usage data to show up. So, how would we be able to stop things if the usage starts climbing?" To which my relatively direct response is, that sounds like a huge problem. I don't know how you'd fix that, but I do know that if suddenly you decide, as a matter of policy, to okay, if you're in the free tier, we will not charge you, or even we will not charge you more than $20 a month.So, you build yourself some headroom, great. And anything that people are able to spin up, well, you're just going to have to eat the cost as a provider. I somehow suspect that would get fixed super quickly if that were the constraint. The fact that it isn't is a conscious choice.Richard: Absolutely.Corey: And the reason I'm so passionate about this, about the free space, is not because I want to get a bunch of things for free. I assure you I do not. I mean, I spend my life fixing AWS bills and looking at AWS pricing, and my argument is very rarely, "It's too expensive." It's that the billing dimension is hard to predict or doesn't align with a customer's experience or prices a service out of a bunch of use cases where it'll be great. But very rarely do I just sit here shaking my fist and saying, "It costs too much."The problem is when you scare the living crap out of a student with a surprise bill that's more than their entire college tuition, even if you waive it a week or so later, do you think they're ever going to be as excited as they once were to go and use cloud services and build things for themselves and see what's possible? 
I mean, you and I met on IRC 20 years ago because back in those days, the failure mode and the risk financially was extremely low. It's yeah, the biggest concern that I had back then when I was doing some of my Linux experimentation is if I typed the wrong thing, I'm going to break my laptop. And yeah, that happened once or twice, and I've learned not to make those same kinds of mistakes, or put guardrails in so the blast radius was smaller, or use a remote system instead. Yeah, someone else's computer that I can destroy. Wonderful. But that was on us; we live and we learn as we were coming up. There was never an opportunity for us, to my understanding, to wind up accidentally running up an $8 million charge.Richard: Absolutely. And psychological safety is one of the most important things in what most people do. We are social animals. Without this psychological safety, you're not going to have long-term, self-sustaining groups. You will not make someone really excited about it. There's two basic ways to sell: trust or force. Those are the two ones. There's none else.Corey: Managing shards. Maintenance windows. Overprovisioning. ElastiCache bills. I know, I know. It's a spooky season and you're already shaking. It's time for caching to be simpler. Momento Serverless Cache lets you forget the backend to focus on good code and great user experiences. With true autoscaling and a pay-per-use pricing model, it makes caching easy. No matter your cloud provider, get going for free at gomomento.co/screaming. That's GO M-O-M-E-N-T-O dot co slash screaming.Corey: Yeah. And it also looks ridiculous. I was talking to someone somewhat recently who's used to spending four bucks a month on their AWS bill for some S3 stuff. Great. Good for them. That's awesome. Their credentials got compromised. Yes, that is on them to some extent. Okay, great.But now after six days, they were told that they owed $360,000 to AWS. And I don't know how, as a cloud company, you can sit there and ask a student to do that. That is not a realistic thing. They are what is known, in the United States at least, in the world of civil litigation as quote-unquote, "Judgment proof," which means, great, you could wind up finding that someone owes you $20 billion. Most of the time, they don't have that, so you're not able to recoup it. Yeah, the judgment feels good, but you're never going to see it.That's the problem with something like that. It's yeah, I would declare bankruptcy long before, as a student, I wound up paying that kind of money. And I don't hear any stories about them releasing the collection agency hounds against people in that scenario. But I couldn't guarantee that. I would never urge someone to ignore that bill and see what happens.And it's such an off-putting thing that, from my perspective, is beneath the company. And let's be clear, I see this behavior at times on Google Cloud, and I see it on Azure as well. This is not something that is unique to AWS, but they are the 800-pound gorilla in the space, and that's important. Or as I'll just mention right now, like, as I—because I was about to give you crap for this, too, but if I go to grafana.com, it says, and I quote, "Play around with the Grafana Stack. Experience Grafana for yourself, no registration or installation needed."Good. I was about to yell at you if it's, "Oh, just give us your credit card and go ahead and start spinning things up and we won't charge you. Honest." Even your free account does not require a credit card; you're doing it right. 
That tells me that I'm not going to get a giant surprise bill.Richard: You have no idea how much thought and work went into our free offering. There was a lot of math involved.Corey: None of this is easy, I want to be very clear on that. Pricing is one of the hardest things to get right, especially in cloud. And it also, when you get it right, it doesn't look like it was that hard for you to do. But I fix [sigh] people's AWS bills for a living and still, five or six years in, one of the hardest things I still wrestle with is pricing engagements. It's incredibly nuanced, incredibly challenging, and at least for services in the cloud space where you're doing usage-based billing, that becomes a problem.But glancing at your pricing page, you do hit the two things that are incredibly important to me. The first one is use something for free. As an added bonus, you can use it forever. And I can get started with it right now. Great, when I go and look at your pricing page or I want to use your product and it tells me to ‘click here to contact us,' that tells me it's an enterprise sales cycle, it's got to be really expensive, and I'm not solving my problem tonight.Whereas the other side of it, the enterprise offering needs to be ‘contact us' and you do that, that speaks to the enterprise procurement people who don't know how to sign a check that doesn't have two commas in it, and they want to have custom terms and all the rest, and they're prepared to pay for that. If you don't have that, you look too small-time. When it doesn't matter what price you put on it, you wind up offering your enterprise tier at some large number, it's yeah, for some companies, that's a small number. You don't necessarily want to back yourself in, depending upon what the specific needs are. You've gotten that right.Every common criticism that I have about pricing, you folks have gotten right. And I definitely can pick up on your fingerprints on a lot of this. Because it sounds like a weird thing to say of, "Well, he's the Director of Community, why would he weigh in on pricing?" It's, "I don't think you understand what community is when you ask that question."Richard: Yes, I fully agree. It's super important to get pricing right, or to get many things right. And usually the things which just feel naturally correct are the ones which took the most effort and the most time and everything. And yes, at least from the—like, I was in those conversations or part of them, and the one thing which was always clear is when we say it's free, it must be free. When we say it is forever free, it must be forever free. No games, no lies, do what you say and say what you do. Basically.We have things where initially you get certain pro features and you can keep paying and you can keep using them, or after X amount of time they go away. Things like these are built in because that's what people want. They want to play around with the whole thing and see, hey, is this actually providing me value? Do I want to pay for this feature which is nice or this and that plugin or what have you? And yeah, you're also absolutely right that once you leave these constraints of basically self-serve cloud, you are talking about bespoke deals, but you're also talking about okay, let's sit down, let's actually understand what your business is: what are your business problems? What are you going to solve today? 
What are you trying to solve tomorrow?Let us find a way of actually supporting you and invest into a mutual partnership and not just grab the money and run. We have extremely low churn for, I would say, pretty good reasons. Because this thing about our users, our customers being successful, we do take it extremely seriously.Corey: It's one of those areas that I just can't shake the feeling is underappreciated industry-wide. And the reason I say that this is your fingerprints on it is because if this had been wrong, you have a lot of… we'll call them idiosyncrasies, where there are certain things you absolutely will not stand for, and misleading people and tricking them into paying money is high on that list. One of the reasons we're friends. So yeah, but I say I see your fingerprints on this, it's yeah, if this hadn't been worked out the way that it is, you would not still be there. One other thing that I wanted to call out about, well, I guess it's a confluence of pricing and logging in the rest, I look at your free tier, and it offers up to 50 gigabytes of ingest a month.And it's easy for me to sit here and compare that to other services, other tools, and other logging stories, and then I have to stop and think for a minute that yeah, discs have gotten way bigger, and internet connections have gotten way faster, and even the logs have gotten way wordier. I still am not sure that most people can really contextualize just how much logging fits into 50 gigs of data. Do you have any, I guess, ballpark examples of what that looks like? Because it's been long enough since I've been playing in these waters that I can't really contextualize it anymore.Richard: Lord of the Rings is roughly five megabytes. It's actually less. So, we're talking literally 10,000 Lord of the Rings, which you can just shove in us and we're just storing this for you. Which also tells you that you're not going to be reading any of this. Or some of it, yes, but not all of it. You need better tooling and you need proper tooling.And some of this is more modern. Some of this is where we actually pushed the state of the art. But I'm also biased. But I, for myself, do claim that we did push the state of the art here. But at the same time you come back to those absolute fundamentals of how humans deal with data.If you look back basically as far as we have writing—literally 6000 years ago, is the oldest writing—humans have always dealt with information with the state of the world in very specific ways. A, is it important enough to even write it down, to even persist it in whatever persistence mechanisms I have at my disposal? If yes, write a detailed account or record a detailed account of whatever the thing is. But it turns out, this is expensive and it's not what you need. So, over time, you optimize towards only taking down key events and only noting key events. Maybe with their interconnections, but fundamentally, the key events.As your data grows, as you have more stuff, as this still is important to your business and keeps being more important to—or doesn't even need to be a business; can be social, can be whatever—whatever thing it is, it becomes expensive, again, to retain all of those key events. So, you turn them into numbers and you can do actual math on them. And that's this path which you've seen again, and again, and again, and again, throughout humanity's history. 
Literally, as long as we have written records, this has played out again, and again, and again, and again, for every single field which humans actually cared about. At different times, like, power networks are way ahead of this, but fundamentally power networks work on metrics, but for transient load spikes and everything, they have logs built into their power measurement devices, but those are few and far between. Of course, the main thing is just metrics, time-series. And you see this again, and again.You also were a sysadmin in internet-related work; all switches have been metrics-based or metrics-first for basically forever, for 20, 30 years. But that stands to reason. Of course, the internet is running roughly 20 years ahead of the cloud, scale-wise, because obviously you need the internet, as otherwise you wouldn't be having a cloud. So, all of those growing pains why metrics are all of a sudden the thing, "Or have been for a few years now," is basically, of course, people who were writing software, providing their own software services, hit the scaling limitations which you hit for Internet service providers two decades, three decades ago. But fundamentally, you have this complete system. Basically profiles or distributed tracing depending on how you view distributed tracing.You can also argue that distributed tracing is key events which are linked to each other. Logs sit firmly in the key event thing and then you turn this into numbers and that is metrics. And that's basically it. You have extremes at the ends where you can have valid, depending on your circumstances, engineering trade-offs of where you invest the most, but fundamentally, that is why those always appear again in humanity's dealing with data, and observability is no different.
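To make the idea of turning key events into numbers concrete, here is a minimal sketch in Python using the prometheus_client library; the service, the metric name, and the failure rate are invented for illustration rather than taken from the episode.

    import random
    import time

    from prometheus_client import Counter, start_http_server

    # One counter replaces an unbounded stream of near-identical "HTTP 500" log lines.
    HTTP_ERRORS = Counter(
        "app_http_errors_total",
        "HTTP errors served by this (hypothetical) service",
        ["status_code"],
    )

    def handle_request() -> None:
        # Stand-in for real application logic that occasionally fails.
        if random.random() < 0.1:
            # Instead of log.error("returned HTTP 500 at ..."), count the key event.
            HTTP_ERRORS.labels(status_code="500").inc()

    if __name__ == "__main__":
        start_http_server(8000)  # exposes /metrics for Prometheus to scrape
        while True:
            handle_request()
            time.sleep(0.1)

Once the event is a time series, a query such as rate(app_http_errors_total[5m]) gives the error rate that would otherwise have to be grepped out of logs, which is the kind of math-on-your-data Richard is describing.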
There is an in between.Again, I'm biased, but like for example, with Loki, you have those same label sets as you have on your metrics with Prometheus, and you have literally the same, which means you only index that part and you only extract on ingestion time. If you don't have structured logs yet, only put the metadata about whatever you care about extracted and put it into your label set and store this, and that's the only thing you index. But it goes further than just this. You can also turn those logs into metrics.And to me this is a path of optimization. Where previously I logged this and that error. Okay, fine, but it's just a log line telling me it's HTTP 500. No one cares that this is at this precise time. Log levels are also basically an anti-pattern because they're just trying to deal with the amount of data which I have, and try and get a handle on this on that level whereas it would be much easier if I just counted every time I have an HTTP 500, I just up my counter by one. And again, and again, and again.And all of a sudden, I have literally—and I did the math on this—over 99.8% of the data which I have to store just goes away. It's just magic the way—and we're only talking about the first time I'm hitting this logline. The second time I'm hitting this logline is functionally free if I turn this into metrics. It becomes cheap enough that one of the mantras which I have, if you need to onboard your developers on modern observability, blah, blah, blah, blah, blah, the whole bells and whistles, usually people have logs, like that's what they have, unless they were from ISPs or power companies, or so; there they usually start with metrics.But most users, which I see both with my Grafana and with my Prometheus [unintelligible 00:38:46] tend to start with logs. They have issues with those logs because they're basically unstructured and useless and you need to first make them useful to some extent. But then you can leverage on this and instead of having a debug statement, just put a counter. Every single time you think, “Hey, maybe I should put a debug statement,” just put a counter instead. In two months time, see if it was worth it or if you delete that line and just remove that counter.It's so much cheaper, you can just throw this on and just have it run for a week or a month or whatever timeframe and done. But it goes beyond this because all of a sudden, if I can turn my logs into metrics properly, I can start rewriting my alerts on those metrics. I can actually persist those metrics and can more aggressively throw my logs away. But also, I have this transition made a lot easier where I don't have this huge lift, where this day in three months is to be cut over and we're going to release the new version of this and that software and it's not going to have that, it's going to have 80% less logs and everything will be great and then you missed the first maintenance window or someone is ill or what have you, and then the next Big Friday is coming so you can't actually deploy there. I mean Black Friday. But we can also talk about deploying on Fridays.But the thing is, you have this huge thing, whereas if you have this as a continuous improvement process, I can just look at, this is the log which is coming out. I turn this into a number, I start emitting metrics directly, and I see that those numbers match. 
And so, I can just start—I build new stuff, I put it into a new data format, I actually emit the new data format directly from my code instrumentation, and only then do I start removing the instrumentation for the logs. And that allows me to, with full confidence, with psychological safety, just move a lot more quickly, deliver much more quickly, and also cut down on my costs more quickly because I'm just using more efficient data types.Corey: I really want to thank you for spending as much time as you have. If people want to learn more about how you view the world and figure out what other personal attacks they can throw your way, where's the best place for them to find you?Richard: Personal attacks, probably Twitter. It's, like, the go-to place for this kind of thing. For actually tracking, I stopped maintaining my own website. Maybe I'll do again, but if you go on github.com/ritchieh/talks, you'll find a reasonably up-to-date list of all the talks, interviews, presentations, panels, what have you, which I did over the last whatever amount of time. [laugh].Corey: And we will, of course, put links to that in the [show notes 00:41:23]. Thanks again for your time. It's always appreciated.Richard: And thank you.Corey: Richard Hartmann, Director of Community at Grafana Labs. I'm Cloud Economist Corey Quinn and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice, along with an insulting comment. And then when someone else comes along with an insulting comment they want to add, we'll just increment the counter by one.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.Announcer: This has been a HumblePod production. Stay humble.

Screaming in the Cloud
Dynamic Configuration Through AWS AppConfig with Steve Rice

Screaming in the Cloud

Play Episode Listen Later Oct 11, 2022 35:54


About Steve
Steve Rice is Principal Product Manager for AWS AppConfig. He is surprisingly passionate about feature flags and continuous configuration. He lives in the Washington DC area with his wife, 3 kids, and 2 incontinent dogs.
Links Referenced: AWS AppConfig: https://go.aws/awsappconfig
Transcript
Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: This episode is sponsored in part by our friends at AWS AppConfig. Engineers love to solve, and occasionally create, problems. But not when it's an on-call fire-drill at 4 in the morning. Software problems should drive innovation and collaboration, NOT stress, and sleeplessness, and threats of violence. That's why so many developers are realizing the value of AWS AppConfig Feature Flags. Feature Flags let developers push code to production, but hide that feature from customers so that the developers can release their feature when it's ready. This practice allows for safe, fast, and convenient software development. You can seamlessly incorporate AppConfig Feature Flags into your AWS or cloud environment and ship your features with excitement, not trepidation and fear. To get started, go to snark.cloud/appconfig. That's snark.cloud/appconfig.Corey: Forget everything you know about SSH and try Tailscale. Imagine if you didn't need to manage PKI or rotate SSH keys every time someone leaves. That'd be pretty sweet, wouldn't it? With Tailscale SSH, you can do exactly that. Tailscale gives each server and user device a node key to connect to its VPN, and it uses the same node key to authorize and authenticate. Basically, you're SSHing the same way you manage access to your app. What's the benefit here? Built-in key rotation, permissions as code, connectivity between any two devices, reduced latency, and there's a lot more, but there's a time limit here. You can also ask users to reauthenticate for that extra bit of security. Sounds expensive? Nope, I wish it were. Tailscale is completely free for personal use on up to 20 devices. To learn more, visit snark.cloud/tailscale. Again, that's snark.cloud/tailscale.Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. This is a promoted guest episode. What does that mean? Well, it means that some people don't just want me to sit here and throw slings and arrows their way, they would prefer to send me a guest specifically, and they do pay for that privilege, which I appreciate. Paying me is absolutely a behavior I wish to endorse.Today's victim who has decided to contribute to slash sponsor my ongoing ridiculous nonsense is, of all companies, AWS. And today I'm talking to Steve Rice, who's the principal product manager on AWS AppConfig. Steve, thank you for joining me.Steve: Hey, Corey, great to see you. Thanks for having me. Looking forward to a conversation.Corey: As am I. Now, AppConfig does something super interesting, which I'm not aware of any other service or sub-service doing. You are under the umbrella of AWS Systems Manager, but you're not going to market with Systems Manager AppConfig. You're just AWS AppConfig. Why?Steve: So, AppConfig is part of AWS Systems Manager. Systems Manager has, I think, 17 different features associated with it. 
Some of them have an individual name that is associated with Systems Manager, some of them don't. We just happen to be one that doesn't. AppConfig is a service that's been around for a while internally before it was launched externally a couple years ago, so I'd say that's probably the origin of the name and the service. I can tell you more about the origin of the service if you're curious.Corey: Oh, I absolutely am. But I just want to take a bit of a detour here and point out that I make fun of the sub-service names in Systems Manager an awful lot, like Systems Manager Session Manager and Systems Manager Change Manager. And part of the reason I do that is not just because it's funny, but because almost everything I found so far within the Systems Manager umbrella is pretty awesome. It aligns with how I tend to think about the world in a bunch of different ways. I have yet to see anything lurking within the Systems Manager umbrella that has led to a tee-hee-hee bill surprise level that rivals, you know, the GDP of Guam. So, I'm a big fan of the entire suite of services. But yes, how did AppConfig get its name?Steve: [laugh]. So, AppConfig started about six years ago, now, internally. So, we actually were part of the region services department inside of Amazon, which is in charge of launching new services around the world. We found that a centralized tool for configuration associated with each service launching was really helpful. So, a service might be launching in a new region and have to enable and disable things as it moved along.And so, the tool was sort of built for that, turning on and off things as the region developed and was ready to launch publicly; then the regions launch publicly. It turned out that our internal customers, which are a lot of AWS services and then some Amazon services as well, started to use us beyond launching new regions, and started to use us for feature flagging. Again, turning on and off capabilities, launching things safely. And so, it became massively popular; we were actually a top 30 service internally in terms of usage. And two years ago, we thought we really should launch this externally and let our customers benefit from some of the goodness that we put in there, and some of—those all come from the mistakes we've made internally. And so, it became AppConfig. In terms of the name itself, we specialize in application configuration, so that's kind of a mouthful, so we just changed it to AppConfig.Corey: Earlier this year, there was a vulnerability reported around I believe it was AWS Glue, but please don't quote me on that. And as part of its excellent response that AWS put out, they said that from the time that it was disclosed to them, they had patched the service and rolled it out to every AWS region in which Glue existed in a little under 29 hours, which at scale is absolutely magic fast. That is superhero speed and then some because you generally don't just throw something over the wall, regardless of how small it is when we're talking about something at the scale of AWS. I mean, look at who your customers are; mistakes will show. This also got me thinking that when you have Adam, or previously Andy, on stage giving a keynote announcement and then they mention something on stage, like, “Congratulations. It's now a very complicated service with 14 adjectives in his name because someone's paid by the syllable. 
Great.”Suddenly, the marketing pages are up, the APIs are working, it's showing up in the console, and it occurs to me only somewhat recently to think about all of the moving parts that go on behind this. That is far faster than even the improved speed of CloudFront distribution updates. There's very clearly something going on there. So, I've got to ask, is that you?Steve: Yes, a lot of that is us. I can't take credit for a hundred percent of what you're talking about, but that's how we are used. We're essentially used as a feature-flagging service. And I can talk generically about feature flagging. Feature flagging allows you to push code out to production, but it's hidden behind a configuration switch: a feature toggle or a feature flag. And that code can be sitting out there, nobody can access it until somebody flips that toggle. Now, the smart way to do it is to flip that toggle on for a small set of users. Maybe it's just internal users, maybe it's 1% of your users. And so, the features available, you can—Corey: It's your best slash worst customers [laugh] in that 1%, in some cases.Steve: Yeah, you want to stress test the system with them and you want to be able to look and see what's going to break before it breaks for everybody. So, you release us to a small cohort, you measure your operations, you measure your application health, you measure your reputational concerns, and then if everything goes well, then you maybe bump it up to 2%, and then 10%, and then 20%. So, feature flags allow you to slowly release features, and you know what you're releasing by the time it's at a hundred percent. It's tempting for teams to want to, like, have everybody access it at the same time; you've been working hard on this feature for a long time. But again, that's kind of an anti-pattern. You want to make sure that on production, it behaves the way you expect it to behave.Corey: I have to ask what is the fundamental difference between feature flags and/or dynamic configuration. Because to my mind, one of them is a means of achieving the other, but I could also see very easily using the terms interchangeably. Given that in some of our conversations, you have corrected me which, first, how dare you? Secondly, okay, there's probably a reason here. What is that point of distinction?Steve: Yeah. Typically for those that are not eat, sleep, and breathing dynamic configuration—which I do—and most people are not obsessed with this kind of thing, feature flags is kind of a shorthand for dynamic configuration. It allows you to turn on and off things without pushing out any new code. So, your application code's running, it's pulling its configuration data, say every five seconds, every ten seconds, something like that, and when that configuration data changes, then that app changes its behavior, again, without a code push or without restarting the app.So, dynamic configuration is maybe a superset of feature flags. Typically, when people think feature flags, they're thinking of, “Oh, I'm going to release a new feature, so it's almost like an on-off switch.” But we see customers using feature flags—and we use this internally—for things like throttling limits. Let's say you want to be able to throttle TPS transactions per second. Or let's say you want to throttle the number of simultaneous background tasks, and say, you know, I just really don't want this creeping above 50; bad things can start to happen.But in a period of stress, you might want to actually bring that number down. 
Well, you can push out these changes with dynamic configuration—which is, again, any type of configuration, not just an on-off switch—you can push this out and adjust the behavior and see what happens. Again, I'd recommend pushing it out to 1% of your users, and then 10%. But it allows you to have these dials and switches to do that. And, again, generically, that's dynamic configuration. It's not as fun to term as feature flags; feature flags is sort of a good mental picture, so I do use them interchangeably, but if you're really into the whole world of this dynamic configuration, then you probably will care about the difference.Corey: Which makes a fair bit of sense. It's the question of what are you talking about high level versus what are you talking about implementation detail-wise.Steve: Yep. Yep.Corey: And on some level, I used to get… well, we'll call it angsty—because I can't think of a better adjective right now—about how AWS was reluctant to disclose implementation details behind what it did. And in the fullness of time, it's made a lot more sense to me, specifically through a lens of, you want to be able to have the freedom to change how something works under the hood. And if you've made no particular guarantee about the implementation detail, you can do that without potentially worrying about breaking a whole bunch of customer expectations that you've inadvertently set. And that makes an awful lot of sense.The idea of rolling out changes to your infrastructure has evolved over the last decade. Once upon a time you'd have EC2 instances, and great, you want to go ahead and make a change there—or this actually predates EC2 instances. Virtual machines in a data center or heaven forbid, bare metal servers, you're not going to deploy a whole new server because there's a new version of the code out, so you separate out your infrastructure from the code that it runs. And that worked out well. And increasingly, we started to see ways of okay, if we want to change the behavior of the application, we'll just push out new environment variables to that thing and restart the service so it winds up consuming those.And that's great. You've rolled it out throughout your fleet. With containers, which is sort of the next logical step, well, okay, this stuff gets baked in, we'll just restart containers with a new version of code because that takes less than a second each and you're fine. And then Lambda functions, it's okay, we'll just change the deployment option and the next invocation will wind up taking the brand new environment variables passed out to it. How do feature flags feature into those, I guess, three evolving methods of running applications in anger, by which I mean, of course, production?Steve: [laugh]. Good question. And I think you really articulated that well.Corey: Well, thank you. I should hope so. I'm a storyteller. At least I fancy myself one.Steve: [laugh]. Yes, you are. Really what you talked about is the evolution of you know, at the beginning, people were—well, first of all, people probably were embedding their variables deep in their code and then they realized, “Oh, I want to change this,” and now you have to find where in my code that is. And so, it became a pattern. Why don't we separate everything that's a configuration data into its own file? But it'll get compiled at build time and sent out all at once.There was kind of this breakthrough that was, why don't we actually separate out the deployment of this? 
We can separate the deployment from code from the deployment of configuration data, and have the code be reading that configuration data on a regular interval, as I already said. So now, as the environments have changed—like you said, containers and Lambda—that ability to make tweaks at microsecond intervals is more important and more powerful. So, there certainly is still value in having things like environment variables that get read at startup. We call that static configuration as opposed to dynamic configuration.And that's a very important element in the world of containers that you talked about. Containers are a bit ephemeral, and so they kind of come and go, and you can restart things, or you might spin up new containers that are slightly different config and have them operate in a certain way. And again, Lambda takes that to the next level. I'm really excited where people are going to take feature flags to the next level because already today we have people just fine-tuning to very targeted small subsets, different configuration data, different feature flag data, and allows them to do this like at we've never seen before scale of turning this on, seeing how it reacts, seeing how the application behaves, and then being able to roll that out to all of your audience.Now, you got to be careful, you really don't want to have completely different configurations out there and have 10 different, or you know, 100 different configurations out there. That makes it really tough to debug. So, you want to think of this as I want to roll this out gradually over time, but eventually, you want to have this sort of state where everything is somewhat consistent.Corey: That, on some level, speaks to a level of operational maturity that my current deployment adventures generally don't have. A common reference I make is to my lasttweetinaws.com Twitter threading app. And anyone can visit it, use it however they want.And it uses a Route 53 latency record to figure out, ah, which is the closest region to you because I've deployed it to 20 different regions. Now, if this were a paid service, or I had people using this in large volume and I had to worry about that sort of thing, I would probably approach something that is very close to what you describe. In practice, I pick a devoted region that I deploy something to, and cool, that's sort of my canary where I get things working the way I would expect. And when that works the way I want it to I then just push it to everything else automatically. Given that I've put significant effort into getting deployments down to approximately two minutes to deploy to everything, it feels like that's a reasonable amount of time to push something out.Whereas if I were, I don't know, running a bank, for example, I would probably have an incredibly heavy process around things that make changes to things like payment or whatnot. Because despite the lies, we all like to tell both to ourselves and in public, anything that touches payments does go through waterfall, not agile iterative development because that mistake tends to show up on your customer's credit card bills, and then they're also angry. I think that there's a certain point of maturity you need to be at as either an organization or possibly as a software technology stack before something like feature flags even becomes available to you. Would you agree with that, or is this something everyone should use?Steve: I would agree with that. 
Definitely, a small team that has communication flowing between the two probably won't get as much value out of a gradual release process because everybody kind of knows what's going on inside of the team. Once your team scales, or maybe your audience scales, that's when it matters more. You really don't want to have something blow up with your users. You really don't want to have people getting paged in the middle of the night because of a change that was made. And so, feature flags do help with that.So typically, the journey we see is people start off in a maybe very small startup. They're releasing features at a very fast pace. They grow and they start to build their own feature flagging solution—again, at companies I've been at previously have done that—and you start using feature flags and you see the power of it. Oh, my gosh, this is great. I can release something when I want without doing a big code push. I can just do a small little change, and if something goes wrong, I can roll it back instantly. That's really handy.And so, the basics of feature flagging might be a homegrown solution that you all have built. If you really lean into that and start to use it more, then you probably want to look at a third-party solution because there's so many features out there that you might want. A lot of them are around safeguards that makes sure that releasing a new feature is safe. You know, again, pushing out a new feature to everybody could be similar to pushing out untested code to production. You don't want to do that, so you need to have, you know, some checks and balances in your release process of your feature flags, and that's what a lot of third parties do.It really depends—to get back to your question about who needs feature flags—it depends on your audience size. You know, if you have enough audience out there to want to do a small rollout to a small set first and then have everybody hit it, that's great. Also, if you just have, you know, one or two developers, then feature flags are probably something that you're just kind of, you're doing yourself, you're pushing out this thing anyway on your own, but you don't need it coordinated across your team.Corey: I think that there's also a bit of—how to frame this—misunderstanding on someone's part about where AppConfig starts and where it stops. When it was first announced, feature flags were one of the things that it did. And that was talked about on stage, I believe in re:Invent, but please don't quote me on that, when it wound up getting announced. And then in the fullness of time, there was another announcement of AppConfig now supports feature flags, which I'm sitting there and I had to go back to my old notes. Like, did I hallucinate this? Which again, would not be the first time I'd imagine such a thing. But no, it was originally how the service was described, but now it's extra feature flags, almost like someone would, I don't know, flip on a feature-flag toggle for the service and now it does a different thing. What changed? What was it that was misunderstood about the service initially versus what it became?Steve: Yeah, I wouldn't say it was a misunderstanding. I think what happened was we launched it, guessing what our customers were going to use it as. We had done plenty of research on that, and as I mentioned before we had—Corey: Please tell me someone used it as a database. Or am I the only nutter that does stuff like that?Steve: We have seen that before. We have seen something like that before.Corey: Excellent. 
Excellent, excellent. I approve.

Steve: And so, we had done our due diligence ahead of time about how we thought people were going to use it. We were right about a lot of it. I mentioned before that we have a lot of usage internally, so you know, that was kind of maybe cheating even for us to be able to sort of see how this is going to evolve. What we did announce, I guess it was last November, was an opinionated version of feature flags. So, we had people using us for feature flags, but they were building their own structure, their own JSON, and there was not a dedicated console experience for feature flags. What we announced last November was an opinionated version that structured the JSON in a way that we think is the right way, and that afforded us the ability to have a smooth console experience. So, if we know what the structure of the JSON is, we can have things like toggles and validations in there that really specifically look at some of the data points. So, that's really what happened. We're just making it easier for our customers to use us for feature flags. We still have some customers that are kind of building their own solution, but we're seeing a lot of them move over to our opinionated version.

Corey: This episode is brought to us in part by our friends at Datadog. Datadog's SaaS monitoring and security platform that enables full stack observability for developers, IT operations, security, and business teams in the cloud age. Datadog's platform, along with 500 plus vendor integrations, allows you to correlate metrics, traces, logs, and security signals across your applications, infrastructure, and third party services in a single pane of glass. Combine these with drag and drop dashboards and machine learning based alerts to help teams troubleshoot and collaborate more effectively, prevent downtime, and enhance performance and reliability. Try Datadog in your environment today with a free 14 day trial and get a complimentary T-shirt when you install the agent. To learn more, visit datadoghq.com/screaminginthecloud to get. That's www.datadoghq.com/screaminginthecloud

Corey: Part of the problem I have when I look at what it is you folks do, and your use cases, and how you structure it is, it's similar in some respects to how folks perceive things like FIS, the fault injection service, or chaos engineering, as is commonly known, which is, "We can't even get the service to stay up on its own for any [unintelligible 00:18:35] period of time. What do you mean, now let's intentionally degrade it and make it work?" There needs to be a certain level of operational stability or operational maturity. When you're still building a service before it's up and running, feature flags seem awfully premature because there's no one depending on it. You can change configuration however your little heart desires. In most cases. I'm sure at certain points of scale of development teams, you have a communications problem internally, but it's not aimed at me trying to get something working at 2 a.m. in the middle of the night. Whereas by the time folks are ready for what you're doing, they clearly have that level of operational maturity established. So, I have to guess on some level, that your typical adopter of AppConfig feature flags isn't in fact, someone who is, "Well, we're ready for feature flags; let's go," but rather someone who's come up with something else as a stopgap as they've been iterating forward. Usually something homebuilt.
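As a rough illustration of what an opinionated flag document buys you over ad hoc homegrown JSON, the sketch below keeps flag definitions and constraints next to their values, so a console or validator knows what each flag may contain. The shape is loosely modeled on AppConfig's flag profiles from memory; treat the exact keys as assumptions and check the current documentation for the real schema.

```python
FLAG_DOCUMENT = {
    "version": "1",
    "flags": {
        "new-checkout-flow": {
            "name": "new-checkout-flow",
            "attributes": {
                # Constraints let the console or a validator reject bad values.
                "max_tps": {"constraints": {"type": "number", "minimum": 800, "maximum": 1200}},
            },
        },
    },
    "values": {
        "new-checkout-flow": {"enabled": True, "max_tps": 1000},
    },
}

def flag_enabled(document: dict, name: str) -> bool:
    """Read a boolean toggle out of the structured document."""
    return bool(document.get("values", {}).get(name, {}).get("enabled", False))

print(flag_enabled(FLAG_DOCUMENT, "new-checkout-flow"))   # True
```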
And it might very well be you have the exact same biggest competitor that I do in my consulting work, which is of course, Microsoft Excel as people try to build their own thing that works in their own way.Steve: Yeah, so definitely a very common customer of ours is somebody that is using a homegrown solution for turning on and off things. And they really feel like I'm using the heck out of these feature flags. I'm using them on a daily or weekly basis. I would like to have some enhancements to how my feature flags work, but I have limited resources and I'm not sure that my resources should be building enhancements to a feature-flagging service, but instead, I'd rather have them focusing on something, you know, directly for our customers, some of the core features of whatever your company does. And so, that's when people sort of look around externally and say, “Oh, let me see if there's some other third-party service or something built into AWS like AWS AppConfig that can meet those needs.”And so absolutely, the workflows get more sophisticated, the ability to move forward faster becomes more important, and do so in a safe way. I used to work at a cybersecurity company and we would kind of joke that the security budget of the company is relatively low until something bad happens, and then it's, you know, whatever you need to spend on it. It's not quite the same with feature flags, but you do see when somebody has a problem on production, and they want to be able to turn something off right away or make an adjustment right away, then the ability to do that in a measured way becomes incredibly important. And so, that's when, again, you'll see customers starting to feel like they're outgrowing their homegrown solution and moving to something that's a third-party solution.Corey: Honestly, I feel like so many tools exist in this space, where, “Oh, yeah, you should definitely use this tool.” And most people will use that tool. The second time. Because the first time, it's one of those, “How hard could that be out? I can build something like that in a weekend.” Which is sort of the rallying cry of doomed engineers who are bad at scoping.And by the time that they figure out why, they have to backtrack significantly. There's a whole bunch of stuff that I have built that people look at and say, “Wow, that's a really great design. What inspired you to do that?” And the absolute honest answer to all of it is simply, “Yeah, I worked in roles for the first time I did it the way you would think I would do it and it didn't go well.” Experience is what you get when you didn't get what you wanted, and this is one of those areas where it tends to manifest in reasonable ways.Steve: Absolutely, absolutely.Corey: So, give me an example here, if you don't mind, about how feature flags can improve the day-to-day experience of an engineering team or an engineer themselves. Because we've been down this path enough, in some cases, to know the failure modes, but for folks who haven't been there that's trying to shave a little bit off of their journey of, “I'm going to learn from my own mistakes.” Eh, learn from someone else's. What are the benefits that accrue and are felt immediately?Steve: Yeah. So, we kind of have a policy that the very first commit of any new feature ought to be the feature flag. That's that sort of on-off switch that you want to put there so that you can start to deploy your code and not have a long-lived branch in your source code. 
But you can have your code there, it reads whether that configuration is on or off. You start with it off.And so, it really helps just while developing these things about keeping your branches short. And you can push the mainline, as long as the feature flag is off and the feature is hidden to production, which is great. So, that helps with the mess of doing big code merges. The other part is around the launch of a feature.So, you talked about Andy Jassy being on stage to launch a new feature. Sort of the old way of doing this, Corey, was that you would need to look at your pipelines and see how long it might take for you to push out your code with any sort of code change in it. And let's say that was an hour-and-a-half process and let's say your CEO is on stage at eight o'clock on a Friday. And as much as you like to say it, “Oh, I'm never pushing out code on a Friday,” sometimes you have to. The old way—Corey: Yeah, that week, yes you are, whether you want to or not.Steve: [laugh]. Exactly, exactly. The old way was this idea that I'm going to time my release, and it takes an hour-and-a-half; I'm going to push it out, and I'll do my best, but hopefully, when the CEO raises her arm or his arm up and points to a screen that everything's lit up. Well, let's say you're doing that and something goes wrong and you have to start over again. Well, oh, my goodness, we're 15 minutes behind, can you accelerate things? And then you start to pull away some of these blockers to accelerate your pipeline or you start editing it right in the console of your application, which is generally not a good idea right before a really big launch.So, the new way is, I'm going to have that code already out there on a Wednesday [laugh] before this big thing on a Friday, but it's hidden behind this feature flag, I've already turned it on and off for internals, and it's just waiting there. And so, then when the CEO points to the big screen, you can just flip that one small little configuration change—and that can be almost instantaneous—and people can access it. So, that just reduces the amount of stress, reduces the amount of risk in pushing out your code.Another thing is—we've heard this from customers—customers are increasing the number of deploys that they can do per week by a very large percentage because they're deploying with confidence. They know that I can push out this code and it's off by default, then I can turn it on whenever I feel like it, and then I can turn it off if something goes wrong. So, if you're into CI/CD, you can actually just move a lot faster with a number of pushes to production each week, which again, I think really helps engineers on their day-to-day lives. The final thing I'm going to talk about is that let's say you did push out something, and for whatever reason, that following weekend, something's going wrong. The old way was oop, you're going to get a page, I'm going to have to get on my computer and go and debug things and fix things, and then push out a new code change.And this could be late on a Saturday evening when you're out with friends. If there's a feature flag there that can turn it off and if this feature is not critical to the operation of your product, you can actually just go in and flip that feature flag off until the next morning or maybe even Monday morning. So, in theory, you kind of get your free time back when you are implementing feature flags. 
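A small sketch of the "first commit is the feature flag" pattern described above: the new code path merges and deploys dark, guarded by a flag that defaults to off, so launch day is a configuration flip rather than a code push. The function and flag names here are hypothetical.

```python
FLAGS = {"new-checkout-flow": False}   # stand-in for your flag store; off by default

def legacy_checkout(cart: dict) -> str:
    return f"legacy checkout for {len(cart)} items"

def new_checkout(cart: dict) -> str:
    return f"new checkout for {len(cart)} items"

def checkout(cart: dict) -> str:
    # The guard ships with the very first commit, so the branch can merge early
    # and the new path stays dark in production until the flag is flipped.
    if FLAGS.get("new-checkout-flow", False):
        return new_checkout(cart)
    return legacy_checkout(cart)

print(checkout({"sku-1": 1}))          # legacy behavior until someone flips the flag
```

The same guard doubles as the weekend kill switch: flipping the stored value back to off restores the legacy path without touching code.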
So, I think those are the big benefits for engineers in using feature flags.Corey: And the best way to figure out whether someone is speaking from a position of experience or is simply a raving zealot when they're in a position where they are incentivized to advocate for a particular way of doing things or a particular product, as—let's be clear—you are in that position, is to ask a form of the following question. Let's turn it around for a second. In what scenarios would you absolutely not want to use feature flags? What problems arise? When do you take a look at a situation and say, “Oh, yeah, feature flags will make things worse, instead of better. Don't do it.”Steve: I'm not sure I wouldn't necessarily don't do it—maybe I am that zealot—but you got to do it carefully.Corey: [laugh].Steve: You really got to do things carefully because as I said before, flipping on a feature flag for everybody is similar to pushing out untested code to production. So, you want to do that in a measured way. So, you need to make sure that you do a couple of things. One, there should be some way to measure what the system behavior is for a small set of users with that feature flag flipped to on first. And it could be some canaries that you're using for that.You can also—there's other mechanisms you can do that to: set up cohorts and beta testers and those kinds of things. But I would say the gradual rollout and the targeted rollout of a feature flag is critical. You know, again, it sounds easy, “I'll just turn it on later,” but you ideally don't want to do that. The second thing you want to do is, if you can, is there some sort of validation that the feature flag is what you expect? So, I was talking about on-off feature flags; there are things, as when I was talking about dynamic configuration, that are things like throttling limits, that you actually want to make sure that you put in some other safeguards that say, “I never want my TPS to go above 1200 and never want to set it below 800,” for whatever reason, for example. Well, you want to have some sort of validation of that data before the feature flag gets pushed out. Inside Amazon, we actually have the policy that every single flag needs to have some sort of validation around it so that we don't accidentally fat-finger something out before it goes out there. And we have fat-fingered things.Corey: Typing the wrong thing into a command structure into a tool? “Who would ever do something like that?” He says, remembering times he's taken production down himself, exactly that way.Steve: Exactly, exactly, yeah. And we've done it at Amazon and AWS, for sure. And so yeah, if you have some sort of structure or process to validate that—because oftentimes, what you're doing is you're trying to remediate something in production. Stress levels are high, it is especially easy to fat-finger there. So, that check-and-balance of a validation is important.And then ideally, you have something to automatically roll back whatever change that you made, very quickly. So AppConfig, for example, hooks up to CloudWatch alarms. If an alarm goes off, we're actually going to roll back instantly whatever that feature flag was to its previous state so that you don't even need to really worry about validating against your CloudWatch. It'll just automatically do that against whatever alarms you have.Corey: One of the interesting parts about working at Amazon and seeing things in Amazonian scale is that one in a million events happen thousands of times every second for you folks. 
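The throttle example above translates into a very small syntactic check: validate a proposed value against its allowed range before the configuration is ever published. The 800 to 1200 TPS bounds come from the conversation; the function itself is illustrative and not an AppConfig API.

```python
def validate_throttle_limit(proposed_tps: int, minimum: int = 800, maximum: int = 1200) -> None:
    """Reject out-of-range values before the configuration is published."""
    if not minimum <= proposed_tps <= maximum:
        raise ValueError(
            f"throttle limit {proposed_tps} is outside the allowed range "
            f"[{minimum}, {maximum}]; refusing to publish this configuration"
        )

validate_throttle_limit(1000)         # in range, safe to deploy
try:
    validate_throttle_limit(12000)    # one extra zero, the classic fat-finger
except ValueError as err:
    print(f"rejected: {err}")
```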
What lessons have you learned by deploying feature flags at that kind of scale? Because one of my problems and challenges with deploying feature flags myself is that in some cases, we're talking about three to five users a day for some of these things. That's not really enough usage to get insights into various cohort analyses or A/B tests.Steve: Yeah. As I mentioned before, we build these things as features into our product. So, I just talked about the CloudWatch alarms. That wasn't there originally. Originally, you know, if something went wrong, you would observe a CloudWatch alarm and then you decide what to do, and one of those things might be that I'm going to roll back my configuration.So, a lot of the mistakes that we made that caused alarms to go off necessitated us building some automatic mechanisms. And you know, a human being can only react so fast, but an automated system there is going to be able to roll things back very, very quickly. So, that came from some specific mistakes that we had made inside of AWS. The validation that I was talking about as well. We have a couple of ways of validating things.You might want to do a syntactic validation, which really you're validating—as I was saying—the range between 100 and 1000, but you also might want to have sort of a functional validation, or we call it a semantic validation so that you can make sure that, for example, if you're switching to a new database, that you're going to flip over to your new database, you can have a validation there that says, “This database is ready, I can write to this table, it's truly ready for me to switch.” Instead of just updating some config data, you're actually going to be validating that the new target is ready for you. So, those are a couple of things that we've learned from some of the mistakes we made. And again, not saying we aren't making mistakes still, but we always look at these things inside of AWS and figure out how we can benefit from them and how our customers, more importantly, can benefit from these mistakes.Corey: I would say that I agree. I think that you have threaded the needle of not talking smack about your own product, while also presenting it as not the global panacea that everyone should roll out, willy-nilly. That's a good balance to strike. And frankly, I'd also say it's probably a good point to park the episode. If people want to learn more about AppConfig, how you view these challenges, or even potentially want to get started using it themselves, what should they do?Steve: We have an informational page at go.aws/awsappconfig. That will tell you the high-level overview. You can search for our documentation and we have a lot of blog posts to help you get started there.Corey: And links to that will, of course, go into the [show notes 00:31:21]. Thank you so much for suffering my slings, arrows, and other assorted nonsense on this. I really appreciate your taking the time.Steve: Corey thank you for the time. It's always a pleasure to talk to you. Really appreciate your insights.Corey: You're too kind. Steve Rice, principal product manager for AWS AppConfig. I'm Cloud Economist Corey Quinn and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice along with an angry comment. But before you do, just try clearing your cookies and downloading the episode again. 
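A semantic (functional) validation can be sketched the same way: before a flag flips traffic to a new backend, probe that the target actually accepts a write and returns it. The sketch below uses SQLite purely as a stand-in; the database, table, and readiness criteria are assumptions you would replace with your own.

```python
import sqlite3
import uuid

def new_backend_is_ready(dsn: str = "new_backend.db") -> bool:
    """Write a probe row to the candidate backend and read it back."""
    probe = str(uuid.uuid4())
    try:
        conn = sqlite3.connect(dsn)
        conn.execute("CREATE TABLE IF NOT EXISTS readiness_probe (token TEXT)")
        conn.execute("INSERT INTO readiness_probe VALUES (?)", (probe,))
        conn.commit()
        row = conn.execute(
            "SELECT token FROM readiness_probe WHERE token = ?", (probe,)
        ).fetchone()
        conn.close()
        return row is not None and row[0] == probe
    except sqlite3.Error:
        return False

# Only allow the "switch to the new database" flag to turn on if the probe passes.
if new_backend_is_ready():
    print("semantic check passed; safe to flip the flag")
else:
    print("semantic check failed; leaving the flag off")
```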
You might be in the 3% cohort for an A/B test, and you [want to 00:32:01] listen to the good one instead.

Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.

Announcer: This has been a HumblePod production. Stay humble.

Screaming in the Cloud
HeatWave and the Latest Evolution of MySQL with Nipun Agarwal

Screaming in the Cloud

Play Episode Listen Later Oct 6, 2022 38:43


About NipunNipun Agarwal is a Senior Vice President, MySQL HeatWave Development, Oracle. His interests include distributed data processing, machine learning, cloud technologies and security. Nipun was part of the Oracle Database team where he introduced a number of new features. He has been awarded over 170 patents.Links Referenced: Oracle: https://oracle.com MySQL HeatWave info: https://www.oracle.com/mysql/ MySQL Service on AWS and OCI login (Oracle account required): https://cloud.mysql.com TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: This episode is brought to us in part by our friends at Datadog. Datadog's SaaS monitoring and security platform that enables full stack observability for developers, IT operations, security, and business teams in the cloud age. Datadog's platform, along with 500 plus vendor integrations, allows you to correlate metrics, traces, logs, and security signals across your applications, infrastructure, and third party services in a single pane of glass.Combine these with drag and drop dashboards and machine learning based alerts to help teams troubleshoot and collaborate more effectively, prevent downtime, and enhance performance and reliability. Try Datadog in your environment today with a free 14 day trial and get a complimentary T-shirt when you install the agent.To learn more, visit datadoghq.com/screaminginthecloud to get. That's www.datadoghq.com/screaminginthecloudCorey: This episode is sponsored in part by our friends at Sysdig. Sysdig secures your cloud from source to run. They believe, as do I, that DevOps and security are inextricably linked. If you wanna learn more about how they view this, check out their blog, it's definitely worth the read. To learn more about how they are absolutely getting it right from where I sit, visit Sysdig.com and tell them that I sent you. That's S Y S D I G.com. And my thanks to them for their continued support of this ridiculous nonsense.Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. This promoted episode is sponsored by our friends at Oracle, and back for a borderline historic third round going out and telling stories about these things, we have Nipun Agarwal, who is, as opposed to his first appearance on the show, has been promoted to senior vice president of MySQL HeatWave. Nipun, thank you for coming back. Most people are not enamored enough with me to subject themselves to my slings and arrows a second time, let alone a third. So first, thanks. And are you okay, over there?Nipun: Thank you, Corey. Yeah, very happy to be back.Corey: [laugh]. So, since the last time we've spoken, there have been some interesting developments that have happened. It was pre-announced by Larry Ellison on a keynote stage or an earnings call, I don't recall the exact format, that HeatWave was going to be coming to AWS. Now, you've conducted a formal announcement, this usual media press blitz, et cetera, talking about it with an eye toward general availability later this year, if I'm not mistaken, and things seem to be—if you'll forgive the term—heating up a bit.Nipun: That is correct. So, as you know, we have had MySQL HeatWave on OCI for just about two years now. 
Very good reception, a lot of people who are using MySQL HeatWave, are migrating from other clouds, specifically from AWS, and now we have announced availability of MySQL HeatWave on AWS.Corey: So, for those who have not done the requisite homework of listening to the entire back catalog of nearly 400 episodes of this show, what exactly is MySQL HeatWave, just so we make sure that we set the stage for what we're going to be talking about? Because I sort of get the sense that without a baseline working knowledge of what that is, none of the rest of this is going to make a whole lot of sense.Nipun: MySQL HeatWave is a managed MySQL service provided by Oracle. But it is different from other MySQL-based services in the sense that we have significantly enhanced the service such that it can very efficiently process transactions, analytics, and in-database machine learning. So, what customers get with the service, with MySQL HeatWave, is a single MySQL database which can process OLTP, transaction processing, real-time analytics, and machine learning. And they can do this without having to move the data out of MySQL into some other specialized database services who are running analytics or machine learning. And all existing tools and applications which work with MySQL work as is because this is something that enhances the server. In addition to that, it provides very good performance and very good price performance compared to other similar services out there.Corey: The idea historically that some folks were pushing around the idea of multi-cloud was that you would have workloads that—oh, they live in one cloud, but the database was going to be all the way across the other side of the internet, living in a different provider. And in practice, what we generally tend to see is that where the data lives is where the compute winds up living. By and large, it's easier to bring the compute resources to the data than it is to move the data to the compute, just because data egress in most of the cloud providers—notably exempting yours—is astronomically expensive. You are, if I recall correctly, less than 10% of AWS's data egress charge on just retail pricing alone, which is wild to me. So first, thank you for keeping that up and not raising prices because I would have felt rather annoyed if I'd been saying such good things. And it was, haha, it was a bait and switch. It was not. I'm still a big fan. So, thank you for that, first and foremost.Nipun: Certainly. And what you described is absolutely correct that while we have a lot of customers migrating from AWS to use MySQL HeatWave and OCI, a class of customers are unable to, and the number one reason they're unable to is that AWS charges these customers all very high egress fees to move the data out of AWS into OCI for them to benefit from MySQL HeatWave. And this has definitely been one of the key incentives for us, the key motivation for us, to offer MySQL HeatWave on AWS so that customers don't need to pay this exorbitant data egress fees.Corey: I think it's fair to disclose that I periodically advise a variety of different cloud companies from a perspective of voice-of-the-customer feedback, which essentially distills down to me asking really annoying slash obnoxious questions that I, as a customer, legitimately want to know, but people always frown at me when I asked that in vendor pitches. For some reason, when I'm doing this on an advisory basis, people instead nod thoughtfully and take notes, so that at least feels better from my perspective. 
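To ground the "one database for transactions and analytics" claim, here is a hedged sketch of what that looks like from an application: the same connection performs an OLTP insert and an analytical aggregate, with the table offloaded to the HeatWave engine. The endpoint, credentials, and table are placeholders, and the SECONDARY_ENGINE / SECONDARY_LOAD statements are recalled from the HeatWave documentation rather than taken from this conversation, so confirm them against the current docs.

```python
import mysql.connector  # pip install mysql-connector-python

conn = mysql.connector.connect(
    host="your-heatwave-endpoint.example.com",  # hypothetical endpoint
    user="admin",
    password="your-password",
    database="shop",
)
cur = conn.cursor()

# Offload the table to the HeatWave (RAPID) engine once, up front.
cur.execute("ALTER TABLE orders SECONDARY_ENGINE = RAPID")
cur.execute("ALTER TABLE orders SECONDARY_LOAD")

# A transactional write and a real-time analytical read against the same
# database; there is no ETL step into a separate warehouse, so the aggregate
# already reflects the freshly inserted row.
cur.execute("INSERT INTO orders (customer_id, total) VALUES (%s, %s)", (42, 19.99))
conn.commit()
cur.execute("SELECT customer_id, SUM(total) FROM orders GROUP BY customer_id")
for customer_id, total in cur.fetchall():
    print(customer_id, total)

cur.close()
conn.close()
```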
Oracle Cloud has been one of those, and I've been kicking the tires on the AWS offering that you folks have built out for a bit of time now. I have to say, it is legitimate. I was able to run a significant series of tests on this, and what I found going through that process was interesting on a bunch of different levels.I'm wondering if it's okay with you, if we go through a few of them, just things that jumped out to me as we went through a series of conversations around, “So, we're going to run a service on AWS.” And my initial answer was, “Is this Oracle? Are you sure?” And here we are today; we are talking about it and press releases.Nipun: Yes, certainly fine with me. Please go ahead.Corey: So, I think one of the first questions I had when you said, “We're going to run a database service on AWS itself,” was, if I'm true to type, is going to be fairly sarcastic, which is, “Oh, thank God. Finally, a way to run a MySQL database on AWS. There's never been one of those before.” Unless you count EC2 or Aurora or Redshift depending upon how you squint at it, or a variety of other increasingly strange things. It feels like that has been a largely saturated market in many respects.I generally don't tend to advise on things that I find patently ridiculous, and your answer was great, but I don't want to put words in your mouth. What was it that you saw that made you say, “Ah, we're going to put a different database offering on AWS, and no, it's not a terrible decision.”Nipun: Got it. Okay, so if you look at it, the value proposition which MySQL HeatWave offers is that customers of MySQL or customers have MySQL compatible databases, whether Aurora, or whether it's RDS MySQL, right, or even, like, you know, customers of Redshift, they have been migrating to MySQL HeatWave on OCI. Like, for the reasons I said: it's a single database, customers don't need to have multiple databases for managing different kinds of workloads, it's much faster, it's a lot less expensive, right? So, there are a lot of value propositions. So, what we found is that if you were to offer MySQL HeatWave on AWS, it will significantly ease the migration of other customers who might be otherwise thinking that it will be difficult for them to migrate, perhaps because of the high egress cost of AWS, or because of the high latency some of the applications in the AWS incur when the database is running somewhere else.Or, if they really have an ecosystem of applications already running on AWS and they just want to replace the database, it'll be much easier for them if MySQL HeatWave was offered on AWS. Those are the reasons why we feel it's a compelling proposition, that if existing customers of AWS are willing to migrate the cloud from AWS to OCI and use MySQL HeatWave, there is clearly a value proposition we are offering. And if we can now offer the same service in AWS, it will hopefully increase the number of customers who can benefit from MySQL HeatWave.Corey: One of the next questions I had going in was, “Okay, so what actually is this under the hood?” Is this you effectively just stuffing some software into a machine image or an AMI—or however they want to mispronounce that word over an AWS-land—and then just making it available to your account and running that, where's the magic or mystery behind this? Like, it feels like the next more modern cloud approach is to stuff the whole thing into a Docker container. But that's not what you wound up doing.Nipun: Correct. 
So, HeatWave has been designed and architected for scale-out processing, and it's been optimized for the cloud. So, when we decided to offer MySQL HeatWave on AWS, we have actually gone ahead and optimize our server for the AWS architecture. So, the processor we are running on, right, we have optimized our software for that instance types in AWS, right? So, the data plane has been optimized for AWS architecture.The second thing is we have a brand new control plane layer, right? So, it's not the case that we're just taking what we had in OCI and running it on AWS. We have optimized the data plane for AWS, we have a native control plane, which is running on AWS, which is using the respective services on AWS. And third, we have a brand new console which we are offering, which is a very interactive console where customers can run queries from the console. They can do data management from the console, they're able to use Autopilot from the console, and we have performance monitoring from the console, right? So, data plane, control plane, console. They're all running natively in AWS. And this provides for a very seamless integration or seamless experience for the AWS customers.Corey: I think it's also a reality, however much we may want to pretend otherwise, that if there is an opportunity to run something in a different cloud provider that is better than where you're currently running it now, by and large, customers aren't going to do it because it needs to not just be better, but so astronomically better in ways that are foundational to a company's business model in order to justify the tremendous expense of a cloud migration, not just in real, out of pocket, cost in dollars and cents that are easy to measure, but also in terms of engineering effort, in terms of opportunity cost—because while you're doing that you're not doing other things instead—and, on some level, people tend to only do that when there's an overwhelming strategic reason to do it. When folks already have existing workloads on AWS, as many of them do, it stands to reason that they are not going to want to completely deviate from that strategy just because something else offers a better database experience any number of axes. So, meeting customers where they are is one of the, I guess, foundational shifts that we've really seen from the entire IT industry over the last 40 years, rather than you will buy it from us and you will tolerate it. It's, now customers have choice, and meeting them where they are and being much more, I guess, able to impedance-match with them has been critical. And I'm really optimistic about what the launch of this service portends for Oracle.Nipun: Indeed, but let me give you another data point. We find a very large number of Aurora customers migrating to MySQL HeatWave on OCI, right? And this is the same workload they were running on Aurora, but now they want to run the same workload on MySQL HeatWave on OCI. They are willing to undertake this journey of migration because their applications, they get much faster, and for a lot less price, but they get much faster. 
Then the second aspect is, there's another class of customers who are for instance running, on Aurora or other transactions or workloads, but then they have to keep moving the data, they'll keep performing the ETL process into some other service, whether it's Snowflake, or whether it's Redshift for analytics.Now, with this migration, when they move to MySQL HeatWave, customers don't need to, like, have multiple databases, and they get real-time analytics, meaning that if any data changes inside the server inside the OLTP as a database service, right? If they were to run a query, that query is giving them the latest results, right? It's not stale. Whereas with an ETL process, it gets to be stale. So, given that we already found that there were so many customers migrating to OCI to use MySQL HeatWave, I think there's a clear value proposition of MySQL HeatWave, and there's a lot of demand.But like, as I was mentioning earlier, by having MySQL HeatWave be offered on AWS, it makes the proposition even more compelling because, as you said, yes, there is some engineering work that customers will need to do to migrate between clouds, and if they don't want to, then absolutely now they have MySQL HeatWave which they can now use in AWS itself.Corey: I think that one of the things I continually find myself careening into, perhaps unexpectedly, is a failure to really internalize just how vast this entire industry really is. Every time I think I've seen it all, all I have to do is talk to one more cloud customer and I learn something completely new and different. Sometimes it's an innovative, exciting use of a thing. Other times, it's people holding something fearfully wrong and trying to use it as a hammer instead. And you know, if it's dumb and it works, is it really dumb? There are questions around that.And this in turn gave rise to one of my next obnoxious questions as I was looking at what you were building at the time because a lot of your pricing and discussions and framing of this was targeting very large enterprise-style customers, and the price points reflected that. And then I asked the question that Big E enterprise never quite expects, for whatever reason, it's like, “That looks awesome if I have a budget with many commas in it. What can I get for $4?” And as of this recording, pricing has not been finalized slash published for the service, but everything that you have shown me so far absolutely makes developing on this for a proof of concept or an evening puttering around, completely tenable: it is not bound to a fixed period of licensing; it's, use it when you want to use it, turn it off when you're done; and the hourly pricing is not egregious. I think that is something that historically, Oracle Database offerings have not really aligned with.OCI very much has, particularly with an eye toward its extraordinarily awesome free tier that's always free. But this feels like it's a weird blending of the OCI model versus historical Oracle Database pricing models in a way that, honestly I'm pretty excited about.Nipun: So, we react to what the customer requirements and needs are. So, for this class of customers who are using, say, RDS, MySQL, Aurora, we understand that they are very cost sensitive, right? So, one of the things which we have done in addition to offering MySQL HeatWave on AWS is based on the customer feedback and such. We are now offering a small shape of HeatWave instance in addition to the regular large shape. 
So, if customers want to just, you know, kick the tires, if developers just want to get started, they can get a MySQL node with HeatWave for less than ten cents an hour. So, for less than ten cents an hour, they get the ability to run transaction processing, analytics, and machine learning.And if you were to compare the corresponding cost of Aurora for the same, like, you know, core count, it's, like, you know, 12-and-a-half cents. And that's just Aurora, without Redshift or without SageMaker. So yes, you're right that based on the feedback and we have found that it would be much more attractive to have this low-end shape for the AWS developers. We are offering this smaller shape. And yeah, it's very, very affordable. It's about just shy of ten cents an hour.Corey: This brings up another question that I raised pretty early on in the process because you folks kept talking about shapes, and it turns out that is the Oracle Cloud term that applies to instance size over an AWS-land. And as we dug into this a bit further, it does make sense for how you think about these things and how you build them to customers. Specifically, if I want to run this, I log into cloud.oracle.com and sign up for it there, and pay you over on that side of the world, this does not show up on my AWS bill. What drove that decision?Nipun: Okay, so a couple of things. One clarification is that the site people log in to is cloud.mysql.com. So, that's where they come to: cloud.mysql.com.Corey: Oh, my apologies. I keep forgetting that you folks have multiple cloud offerings and domains. They're kind of a thing. How do they work? Given I have a bad domain by habit myself, I have no room to judge.Nipun: So, they come to cloud.mysql.com. From there, they can provision an instance. And we, as, like, you know, Oracle or MySQL, go ahead and create an instance in AWS, in the Oracle tenancy. From there, customers can then, you know, access their data on AWS and such. Now, what we want to provide the customers is a very seamless experience, that they just come to cloud.mysql.com, and from there, they can do everything: provisioning an instance, running the queries, payment and such. So, this is one of the reasons that we want customers just to be able to come to the site, cloud.mysql.com, and take care of the billing and such.Now, the other thing is that, okay, why not allow customers to pay from AWS, right? Now, one of the things over there is that if you were to do that and there's a customer, they'll be like, “Hey, I got to pay something to AWS, something to Oracle, so we'd prefer, it'd be better to have a one-stop shop.” And since many of these are already Oracle customers, it's helpful to do it this way.Corey: Another approach you could have taken—and I want to be very clear here that I am not suggesting that this would have been a good idea—but an approach that you could have taken would have been to go down the weird AWS partner rabbit hole, and we're going to provide this to customers on the AWS Marketplace. Because according to AWS, that's where all of their customers go to discover new softwares. Yeah, first, that's a lie. They do not. But aside from that, what was it about that Marketplace model that drove you to a decision point where okay, at launch, we are not going to be offering this on the AWS Marketplace? And to be clear, I'm not suggesting that was the wrong decision.Nipun: Right. The main reason is we want to offer the MySQL HeatWave service at the least expensive cost to the user, right, or like, the least cost. 
If you were to, like, have MySQL HeatWave in the Marketplace, AWS charges a premium. This the customers would need to pay. So, we just didn't want the customers to have to pay this additional premium just because they can now source this thing from the Marketplace. So, it's really to, like, save costs for the customer.Corey: The value of the Marketplace, from my perspective, has been effectively not having to deal as much with customer procurement departments because well, AWS is already on the procurement approved list, so we're just going to go ahead and take the hit to wind up making it accessible from that perspective and calling it good. The downside to this is that increasingly, as customers are making larger and longer-term commitments that are tied to certain levels of spend on AWS, they're increasingly trying to drag every vendor with whom they do business into the your AWS bill so they can check those boxes off. And the problem that I keep seeing with that is vendors who historically have been doing just fine, have great working relationships with a customer are reporting that suddenly customers are coming back with, “Yeah, so for our contract renewal, we want to go through the AWS Marketplace.” In return, effectively, these companies are then just getting a haircut off whatever it is they're able to charge their customers but receiving no actual value for any of this. It attenuates the relationship by introducing a third party into the process, and it doesn't make anything better from the vendor's point of view because they already had something functional and working; now they just have to pay a commission on it to AWS, who, it seems, is pathologically averse to any transaction happening where they don't get a cut, on some level. But I digress. I just don't like that model very much at all. It feels coercive.Nipun: That's absolutely right. That's absolutely right. And we thought that, yes, there is some value to be going to Marketplace, but it's not worth the additional premium customers would need to pay. Totally agree.Corey: This episode is sponsored in part by our friends at AWS AppConfig. Engineers love to solve, and occasionally create, problems. But not when it's an on-call fire-drill at 4 in the morning. Software problems should drive innovation and collaboration, NOT stress, and sleeplessness, and threats of violence. That's why so many developers are realizing the value of AWS AppConfig Feature Flags. Feature Flags let developers push code to production, but hide that that feature from customers so that the developers can release their feature when it's ready. This practice allows for safe, fast, and convenient software development. You can seamlessly incorporate AppConfig Feature Flags into your AWS or cloud environment and ship your Features with excitement, not trepidation and fear. To get started, go to snark.cloud/appconfig. That's snark.cloud/appconfig.Corey: It's also worth pointing out that in Oracle's historical customer base, by which I mean the last 40 years that you folks have been in business, you do have significant customers with very sizable estates. A lot of your cloud efforts have focused around, I guess, we'll call it an Oracle-specific currency: Oracle Credits. Which is similar to the AWS style of currency just for a different company in different ways. 
One of the benefits that you articulated to me relatively early on was that by going through cloud.mysql.com, customers with those credits—which can be in sizable amounts based upon various differentiating variables that change from case to case—and apply that to their use of MySQL HeatWave on AWS.Nipun: Right. So, in fact, just for starters, right, what we give to customers is we offer some free credits for customers to try a service on OCI of, you know, $300. And that's the same thing, the same experience you would like customers who are trying HeatWave on AWS to get. Yes, so you're right, this is the kind of consistency we want to have, and yet another reason why cloud.mysql.com makes sense is the entry point for customers to try the service.Corey: There was a time where I would have struggled not to laugh in your face at the idea that we're talking about something in the context of an Oracle database, and well, there's $300 in credit. That's, “What can I get for that? Hung up on?” No. A surprising amount, when it comes to these things.I feel like that opens up an entirely new universe of experimentation. And, “Let's see how this thing actually works with his workload,” and lets people kick the tires on it for themselves in a way that, “Oh, we have this great database. Well, can I try it? Sure, for $8 million, you absolutely can.” “Well, it can stay great and awesome over there because who wants to take that kind of a bet?” It feels like it's a new world and in a bunch of different respects, and I just can't make enough noise about how glad I am to see this transformation happening.Nipun: Yeah. Absolutely, right? So, just think about it. So, you're getting MySQL and HeatWave together for just shy of ten cents an hour, right? So, what you could get for $300 is 3000 hours for MySQL HeatWave instance, which is very good for people to try for free. And then, you know, decide if they want to go ahead with it.Corey: One other, I guess, obnoxious question that I love to ask—it's not really a question so much as a statement; that that's part of the first thing that makes it really obnoxious—but it always distills down to the following that upsets product people left and right, which is, “I don't get it.” And one of the things that I didn't fully understand at the outset of how you were structuring things was the idea of separating out HeatWave from its constituent components. I believe it was Autopilot if I'm not mistaken, and it was effectively different SKUs that you could wind up opting to go for. And okay, if I'm trying to kick the tires on this and contextualize it as someone for whom the world's best database is Route 53, then it really felt like an additional decision point that I wasn't clear on the value of. And I'm still not entirely sure on the differentiation point and the value there, but now you offer it bundled as a default, which I think is so much better, from the user experience perspective.Nipun: Okay, so let me clarify a couple of things.Corey: Please. Databases are not my forte, so expect me to wind up getting most of the details hilariously wrong.Nipun: Sure. So, MySQL Autopilot provides machine-learning-based automation for various aspects of the MySQL service; very popular. There is no charge for it. It is built into MySQL HeatWave; there is no additional charge for it, right, so there never was any SKU for it. 
What you're referring to is, we have had a SKU for the MySQL node or the MySQL instance, and there's a separate SKU for HeatWave.The reason there is a need to have a different SKU for these two is because you always only have one node of MySQL. It could be, like, you know, running on one core, or like, you know, multiple cores, but it's always, like, you know, one node. But with HeatWave, it's a scale-out architecture, so you can have multiple nodes. So, the users need to be able to express how many nodes of HeatWave are they provisioning, right? So, that's why there is a need to have two SKUs, and we continue to have those two SKUs.What we are doing now differently is that when users instantiate a MySQL instance, by default, they always get the HeatWave node associated with it, right? So, they don't need to, like, you know, make the decision to—okay when to add HeatWave; they always get HeatWave along with the MySQL instance, and that's what I was saying a combination of both of these is, you know, like, just about ten cents an hour. If for whatever reason, they decide that they do not want HeatWave, they can turn it off, and then the price drops to half. But what we're providing is the AWS service that HeatWave is turned on by default.Corey: Which makes an awful lot of sense. It's something that lets people opt out if they decide they don't need this as they continue to scale out, but for the newcomer who does not, in many cases—in my particular case—have a nuanced understanding of where this offering starts and stops, it's clearly the right decision of—rather than, “Oh, yeah. The thing you were trying and it didn't work super well? Well, yeah. If you enable this other thing, it would have been awesome.” “Well, great. Please enable it for me by default and let me opt out later in time as my level of understanding deepens.”Nipun: That's right. And that's exactly what we are doing. Now, this was a feedback we got because many, if not most, of our customers would want to have HeatWave, and we just kind of, you know, mitigating them from going through one more step, it's always enabled by default.Corey: As far as I'm aware, you folks are running this effectively as any other AWS customer might, where you establish a private link connection to your customers, in some cases, or give them a public or private endpoint where they can wind up communicating with this service. It doesn't require any favoritism or special permissions from AWS themselves that they wouldn't give to any other random customer out there, correct?Nipun: Yes, that is correct. So, for now, we are exposing this thing as a public endpoint. In the future, we have plans to support the private endpoint as well, but for now, it's public.Corey: Which means that foundationally what you're building out is something that fits into a model that could work extraordinarily well across a variety of different environments. How purpose-tuned is the HeatWave installation you have running on AWS for the AWS environment, versus something that is relatively agnostic, could be dropped into any random cloud provider, up to and including the terrifyingly obsolete rack I have in the spare room?Nipun: So, as I mentioned, when we decided to offer MySQL HeatWave on AWS, the idea was that okay, for the AWS customers, we now want to have an offering which is completely optimized for AWS, provides the best price-performance on AWS. 
So, we have determined which instance types underneath will provide the best price performance, and that's what we have optimized for, right? So, I can tell you, like, in terms of many of—for instance, take the case of the cache size of the underlying processor that we're using on AWS is different than what we're using for OCI. So, we have gone ahead, made these optimizations in our code, and we believe that our code is really optimized now for the AWS infrastructure.Corey: I think that makes a fair deal of sense because, again, one of the big problems AWS has had is the proliferation of EC2 instance types to the point now where the answer is super easy, too, “Are you using the correct instance type for your workload?” Because that answer now is, “Of course not. Who could possibly say that they were with any degree of confidence?” But when you take the time to look at a very specific workload that's going to be scaled out, it's worth the time investment to figure out exactly how to optimize things for price and performance, given the constraints. Let's be very clear here, I would argue that the better price performance for HeatWave is almost certainly not going to be on AWS themselves, if for no other reason than the joy that is their data transfer pricing, even for internal things moving around from time to time.Personally, I love getting charged data transfer for taking data from S3, running it through AWS Glue, putting it into a different S3 bucket, accessing it with Athena, then hooking that up to Tableau as we go down and down and down the spiraling rabbit hole that never ends. It's not exactly what I would call well-optimized economically. Their entire system feels almost like it's a rigged game, on some level. But given those constraints, yeah, dialing in it and making it cost-effective is absolutely something that I've watched you folks put significant time and effort into.Nipun: So, I'll make two points, right, to the questions. First is yes, I just want to, like, be clear about it, that when a user provisions MySQL HeatWave via cloud.mysql.com and we create an instance in AWS, we don't give customers a multitude of things to, like, you know, choose from.We have determined which instance type is going to provide the customer the best price performance, and that's what we provision. So, the customer doesn't even need to know or care, is it going to be, like, you know, AMD? Is it going to be Intel? Is it going to be, like, you know, ARM, right? So, it's something which we have predetermined and we have optimized for it. That's first.The second point is in terms of the price performance. So, you're absolutely right, that for the class of customers who cannot migrate away from AWS because of the egress costs or because of the high latency because of AWS, right, sure, MySQL HeatWave on AWS will provide the best price-performance compared to other services out in AWS like Redshift, or Aurora, or Snowflake. But if customers have the flexibility to choose a cloud of their choice, it is indeed the case that customers are going to find that running MySQL HeatWave on OCI is going to provide them, by far, the best price performance, right? So, the price performance of running MySQL HeatWave on OCI is indeed better than MySQL HeatWave on AWS. And just because of the fact that when we are running the service in AWS, we are paying the list price, right, on AWS; that's how we get the gear. 
Whereas with OCI, like, you know, things are a lot less expensive for us.But even when you're running on AWS, we are very, very price competitive with other services. And you know, as you've probably seen from the performance benchmarks and such, what I'm very intrigued about is that we're able to run a standard workload, like some, like, you know, TPC-H and offer seven times better price-performance while running in AWS compared to Redshift. So, what this goes to show is that we are really passing on the savings to the customers. And clearly, Redshift is not doing a good job of performance or, like, you know, they're charging too much. But the fact that we can offer seven times better price performance than Redshift in AWS speaks volumes, both about architecture and how much of savings we are passing to our customers.Corey: What I love about this story is that it makes testing the waters of what it's like to run MySQL HeatWave a lot easier for customers because the barrier to entry is so much lower. Where everything you just said I agree with it is more cost-effective to run on Oracle Cloud. I think there are a number of workloads that are best placed on Oracle Cloud. But unless you let people kick the tires on those things, where they happen to be already, it's difficult to get them to a point where they're going to be able to experience that themselves. This is a massive step on that path.Nipun: Yep. Right.Corey: I really want to thank you for taking time out of your day to walk us through exactly how this came to be and what the future is going to look like around this. If people want to learn more, where should they go?Nipun: Oh, they can go to oracle.com/mysql, and there they can get a lot more information about the capabilities of MySQL HeatWave, what we are offering in AWS, price-performance. By the way, all the price performance numbers I was talking about, all the scripts are available publicly on GitHub. So, we welcome, we encourage customers to download the scripts from GitHub, try for themselves, and all of this information is available from oracle.com/mysql where they can get this detailed information.Corey: And we will, of course, put links to that in the show notes. Thank you so much for your time. I appreciate it.Nipun: Sure thing, Corey. Thank you for the opportunity.Corey: Nipun Agarwal, Senior Vice President of MySQL HeatWave. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice along with an angry insulting comment. You will then be overcharged for the data transfer to submit that insulting comment, and then AWS will take a percentage of that just because they're obnoxious and can.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.Announcer: This has been a HumblePod production. Stay humble.

Screaming in the Cloud
ChaosSearch and the Evolving World of Data Analytics with Thomas Hazel

Screaming in the Cloud

Play Episode Listen Later Oct 4, 2022 35:21


About ThomasThomas Hazel is Founder, CTO, and Chief Scientist of ChaosSearch. He is a serial entrepreneur at the forefront of communication, virtualization, and database technology and the inventor of ChaosSearch's patented IP. Thomas has also patented several other technologies in the areas of distributed algorithms, virtualization and database science. He holds a Bachelor of Science in Computer Science from University of New Hampshire, Hall of Fame Alumni Inductee, and founded both student & professional chapters of the Association for Computing Machinery (ACM).Links Referenced: ChaosSearch: https://www.chaossearch.io/ Twitter: https://twitter.com/ChaosSearch Facebook: https://www.facebook.com/CHAOSSEARCH/ TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: This episode is sponsored in part by our friends at AWS AppConfig. Engineers love to solve, and occasionally create, problems. But not when it's an on-call fire-drill at 4 in the morning. Software problems should drive innovation and collaboration, NOT stress, and sleeplessness, and threats of violence. That's why so many developers are realizing the value of AWS AppConfig Feature Flags. Feature Flags let developers push code to production, but hide that that feature from customers so that the developers can release their feature when it's ready. This practice allows for safe, fast, and convenient software development. You can seamlessly incorporate AppConfig Feature Flags into your AWS or cloud environment and ship your Features with excitement, not trepidation and fear. To get started, go to snark.cloud/appconfig. That's snark.cloud/appconfig.Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. This promoted episode is brought to us by our returning sponsor and friend, ChaosSearch. And once again, the fine folks at ChaosSearch has seen fit to basically subject their CTO and Founder, Thomas Hazel, to my slings and arrows. Thomas, thank you for joining me. It feels like it's been a hot minute since we last caught up.Thomas: Yeah, Corey. Great to be on the program again, then. I think it's been almost a year. So, I look forward to these. They're fun, they're interesting, and you know, always a good time.Corey: It's always fun to just take a look at companies' web pages in the Wayback Machine, archive.org, where you can see snapshots of them at various points in time. Usually, it feels like this is either used for long-gone things and people want to remember the internet of yesteryear, or alternately to deliver sick burns with retorting a “This you,” when someone winds up making an unpopular statement. One of the approaches I like to use it for, which is significantly less nefarious—usually—is looking back in time at companies' websites, just to see how the positioning of the product evolves over time.And ChaosSearch has had an interesting evolution in that direction. 
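Since this page is about feature flags, it is worth making the AppConfig sponsor read above concrete: code ships to production wrapped in a flag check, and the feature only lights up when the flag is flipped. Here is a minimal sketch using boto3's AppConfig Data API; the application, environment, profile, and flag name are hypothetical placeholders, and a real service would cache the configuration and reuse the poll token rather than fetch it on every request.

```python
import json
import boto3

client = boto3.client("appconfigdata")

# Open a configuration session against a (hypothetical) feature-flags profile.
session = client.start_configuration_session(
    ApplicationIdentifier="my-app",
    EnvironmentIdentifier="production",
    ConfigurationProfileIdentifier="feature-flags",
)

# Fetch the latest flag document; AppConfig returns JSON along the lines of
# {"new-checkout": {"enabled": false}}.
response = client.get_latest_configuration(
    ConfigurationToken=session["InitialConfigurationToken"]
)
flags = json.loads(response["Configuration"].read() or "{}")

if flags.get("new-checkout", {}).get("enabled"):
    print("new checkout path is live")   # run the newly released code path
else:
    print("feature stays dark")          # code is deployed but hidden from users
```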
But before we get into that, assuming that there might actually be people listening who do not know the intimate details of exactly what it is you folks do, what is ChaosSearch, and what might you folks do?Thomas: Yeah, well said, and I look forward to [laugh] doing the Wayback Time because some of our ideas, way back when, seemed crazy, but now they make a lot of sense. So, what ChaosSearch is all about is transforming customers' cloud object stores like Amazon S3 into an analytical database that supports search and SQL-type use cases. Now, where's that apply? In log analytics, observability, security, security data lakes, operational data, particularly at scale, where you just stream your data into your data lake, connect our service, our SaaS service, to that lake and automagically we index it and provide well-known APIs like Elasticsearch and integrate with Kibana or Grafana, and SQL APIs, something like, say, a Superset or Tableau or Looker into your data. So, you stream it in and you get analytics out. And the key thing is the time-cost complexity that we all know that operational data, particularly at scale, like terabytes and a day and up causes challenges, and we all know how much it costs.Corey: They certainly do. One of the things that I found interesting is that, as I've mentioned before, when I do consulting work at The Duckbill Group, we have absolutely no partners in the entire space. That includes AWS, incidentally. But it was easy in the beginning because I was well aware of what you folks were up to, and it was great when there was a use case that matched of you're spending an awful lot of money on Elasticsearch; consider perhaps migrating some of that—if it makes sense—to ChaosSearch. Ironically, when you started sponsoring some of my nonsense, that conversation got slightly trickier where I had to disclose, yeah our media arm is does have sponsorships going on with them, but that has no bearing on what I'm saying.And if they take their sponsorships away—please don't—then we would still be recommending them because it's the right answer, and it's what we would use if we were in your position. We receive no kickbacks or partner deal or any sort of reseller arrangement because it just clouds the whole conflict of interest perception. But you folks have been fantastic for a long time in a bunch of different ways.Thomas: Well, you know, I would say that what you thought made a lot of sense made a lot of sense to us as well. So, the ChaosSearch idea just makes sense. Now, you had to crack some code, solve some problems, invent some technology, and create some new architecture, but the idea that Elasticsearch is a useful solution with all the tooling, the visualization, the wonderful community around that, was a good place to start, but here's the problem: setting it up, scaling it out, keep it up, when things are happening, things go bump in the night. All those are real challenges, and one of them was just the storaging of the data. Well, what if you could make S3 the back-end store? One hundred percent; no SSDs or HDDs. Makes a lot of sense.And then support the APIs that your tooling uses. So, it just made a lot of sense on what we were trying to do, just no one thought of it. Now, if you think about the Northstar you were talking about, you know, five, six years ago, when I said, transforming cloud storage into an analytical database for search and SQL, people thought that was crazy and mad. Well, now everyone's using Cloud Storage, everyone's using S3 as a data lake. 
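To make the "keep the APIs your tooling already uses" point above concrete: if a service indexes what is sitting in S3 and exposes an Elasticsearch-compatible API, the query side can stay the plain `_search` call your existing tools already issue. The sketch below uses standard Elasticsearch REST search syntax; the endpoint, index name, and omitted authentication are placeholders, not a description of ChaosSearch's actual setup.

```python
import requests

# Placeholder endpoint for an Elasticsearch-compatible search API.
ENDPOINT = "https://search.example.com"
INDEX = "app-logs"

# Standard Elasticsearch query DSL: last 24 hours of error-level log lines.
query = {
    "size": 10,
    "query": {
        "bool": {
            "must": [
                {"match": {"level": "error"}},
                {"range": {"@timestamp": {"gte": "now-24h"}}},
            ]
        }
    },
}

resp = requests.post(f"{ENDPOINT}/{INDEX}/_search", json=query, timeout=30)
resp.raise_for_status()
for hit in resp.json()["hits"]["hits"]:
    print(hit["_source"].get("message"))
```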
That's not in question anymore.But it was a question five, six, you know, years ago. So, when we met up, you're like, “Well, that makes sense.” It always made sense, but people either didn't think was possible, or were worried, you know, I'll just try to set up an Elastic cluster and deal with it. Because that's what happens when you particularly deal with large-scale implementations. So, you know, to us, we would love the Elastic API, the tooling around it, but what we all know is the cost, the time the complexity, to manage it, to scale it out, just almost want to pull your hair out. And so, that's where we come in is, don't change what you do, just change how you do it.Corey: Every once in a while, I'll talk to a client who's running an Amazon Elasticsearch cluster, and they have nothing but good things to say about it. Which, awesome. On the one hand, part of me wishes that I had some of their secrets, but often what's happened is that they have this down to a science, they have a data lifecycle that's clearly defined and implemented, the cluster is relatively static, so resizes aren't really a thing, and it just works for their use cases. And in those scenarios, like, “Do you care about the bill?” “Not overly. We don't have to think about it.”Great. Then why change? If there's no pain, you're not going to sell someone something, especially when we're talking, this tends to be relatively smaller-scale as well. It's okay, great, they're spending $5,000 a month on it. It doesn't necessarily justify the engineering effort to move off.Now, when you start looking at this, and, “Huh, that's a quarter million bucks a month we're spending on this nonsense, and it goes down all the time,” yeah, that's when it starts to be one of those logical areas to start picking apart and diving into. What's also muddied the waters since the last time we really went in-depth on any of this was it used to be we would be talking about it exactly like we are right now, about how it's Elasticsearch-compatible. Technically, these days, we probably shouldn't be saying it is OpenSearch compatible because of the trademark issues between Elastic and AWS and the Schism of the OpenSearch fork of the Elasticsearch project. And now it feels like when you start putting random words in front of the word search, ChaosSearch fits right in. It feels like your star is rising.Thomas: Yeah, no, well said. I appreciate that. You know, it's funny when Elastic changed our license, we all didn't know what was going to happen. We knew something was going to happen, but we didn't know what was going to happen. And Amazon, I say ironically, or, more importantly, decided they'll take up the open mantle of keeping an open, free solution.Now, obviously, they recommend running that in their cloud. Fair enough. But I would say we don't hear as much Elastic replacement, as much as OpenSearch replacement with our solution because of all the benefits that we talked about. Because the trigger points for when folks have an issue with the OpenSearch or Elastic stack is got too expensive, or it was changing so much and it was falling over, or the complexity of the schema changing, or all the above. The pipelines were complex, particularly at scale.That's both for Elasticsearch, as well as OpenSearch. And so, to us, we want either to win, but we want to be the replacement because, you know, at scale is where we shine. 
But we have seen a real trend where we see less Elasticsearch and more OpenSearch because the community is worried about the rules that were changed, right? You see it day in, day out, where you have a community that was built around open and fair and free, and because of business models not working or the big bad so-and-so is taking advantage of it better, there's a license change. And that's a trust change.And to us, we're following the OpenSearch path because it's still open. The 600-pound gorilla or 900-pound gorilla of Amazon. But they really held the mantle, saying, “We're going to stay open, we assume for as long as we know, and we'll follow that path. But again, at that scale, the time, the costs, we're here to help solve those problems.” Again, whether it's on Amazon or, you know, Google et cetera.Corey: I want to go back to what I mentioned at the start of this with the Wayback Machine and looking at how things wound up unfolding in the fullness of time. The first time that it snapshotted your site was way back in the year 2018, which—Thomas: Nice. [laugh].Corey: Some of us may remember, and at that point, like, I wasn't doing any work with you, and later in time I would make fun of you folks for this, but back then your brand name was in all caps, so I would periodically say things like this episode is sponsored by our friends at [loudly] CHAOSSEARCH.Thomas: [laugh].Corey: And once you stopped capitalizing it and that had faded from the common awareness, it just started to look like I had the inability to control the volume of my own voice. Which, fair, but generally not mid-sentence. So, I remember those early days, but the positioning of it was, “The future of log management and analytics,” back in 2018. Skipping forward a year later, you changed this because apparently in 2019, the future was already here. And you were talking about, “Log search analytics, purpose-built for Amazon S3. Store everything, ask anything all on your Amazon S3.”Which is awesome. You were still—unfortunately—going by the all caps thing, but by 2020, that wound up changing somewhat significantly. You were at that point, talking for it as, “The data platform for scalable log analytics.” Okay, it's clearly heading in a log direction, and that made a whole bunch of sense. And now today, you are, “The data lake platform for analytics at scale.” So, good for you, first off. You found a voice?Thomas: [laugh]. Well, you know, it's funny, as a product mining person—I'll take my marketing hat off—we've been building the same solution with the same value points and benefits as we mentioned earlier, but the market resonates with different terminology. When we said something like, “Transforming your Cloud Object Storage like S3 into an analytical database,” people were just were like, blown away. Is that even possible? Right? And so, that got some eyes.Corey: Oh, anything is a database if you hold that wrong. Absolutely.Thomas: [laugh]. Yeah, yeah. And then you're saying log analytics really resonated for a few years. Data platform, you know, is more broader because we do more broader things. And now we see over the last few years, observability, right? How do you fit in the observability viewpoint, the stack where log analytics is one aspect to it?Some of our customers use Grafana on us for that lens, and then for the analysis, alerting, dashboarding. You can say that Kibana in the hunting aspect, the log aspects. 
So, you know, to us, we're going to put a message out there that resonates with what we're hearing from our customers. For instance, we hear things like, “I need a security data lake. I need that. I need to stream all my data. I need to have all the data because what happens today that now, I need to know a week, two weeks, 90 days.”We constantly hear, “I need at least 90 days forensics on that data.” And it happens time and time again. We hear in the observability stack where, “Hey, I love Datadog, but I can't afford it more than a week or two.” Well, that's where we come in. And we either replace Datadog for the use cases that we support, or we're auxiliary to it.Sometimes we have an existing Grafana implementation, and then they store data in us for the long tail. That could be the scenario. So, to us, the message is around what resonates with our customers, but in the end, it's operational data, whether you want to call it observability, log analytics, security analytics, like the data lake, to us, it's just access to your data, all your data, all the time, and supporting the APIs and the tooling that you're using. And so, to me, it's the same product, but the market changes with messaging and requirements. And this is why we always felt that having a search and SQL platform is so key because what you'll see in Elastic or OpenSearch is, “Well, I only support the Elastic API. I can't do correlations. I can't do this. I can't do that. I'm going to move it over to say, maybe Athena but not so much. Maybe a Snowflake or something else.”Corey: “Well, Thomas, it's very simple. Once you learn our own purpose-built, domain-specific language, specifically for our product, well, why are you still sitting here, go learn that thing.” People aren't going to do that.Thomas: And that's what we hear. It was funny, I won't say what the company was, a big banking company that we're talking to, and we hear time and time again, “I only want to do it via the Elastic tooling,” or, “I only want to do it via the BI tooling.” I hear it time and time again. Both of these people are in the same company.Corey: And that's legitimate as well because there's a bunch of pre-existing processes pointing at things and we're not going to change 200 different applications in their data model just because you want to replace a back-end system. I also want to correct myself. I was one tab behind. This year's branding is slightly different: “Search and analyze unlimited log data in your cloud object storage.” Which is, I really like the evolution on this.Thomas: Yeah, yeah. And I love it. And what was interesting is the moving, the setting up, the doubling of your costs, let's say you have—I mean, we deal with some big customers that have petabytes of data; doubling your petabytes, that means, if your Elastic environment is costing you tens of millions and then you put into Snowflake, that's also going to be tens of millions. And with a solution like ours, you have really cost-effective storage, right? Your cloud storage, it's secure, it's reliable, it's Elastic, and you attach Chaos to get the well-known APIs that your well-known tooling can analyze.So, to us, our evolution has been really being the end viewpoint where we started early, where the search and SQL isn't here today—and you know, in the future, we'll be coming out with more ML type tooling—but we have two sides: we have the operational, security, observability. And a lot of the business side wants access to that data as well. 
Maybe it's app data that they need to do analysis on their shopping cart website, for instance.Corey: The thing that I find curious is, the entire space has been iterating forward on trying to define observability, generally, as whatever people are already trying to sell in many cases. And that has seemed to be a bit of a stumbling block for a lot of folks. I figured this out somewhat recently because I've built the—free for everyone to use—the lasttweetinaws.com, Twitter threading client.That's deployed to 20 different AWS regions because it's go—the idea is that should be snappy for people, no matter where they happen to be on the planet, and I use it for conferences when I travel, so great, let's get ahead of it. But that also means I've got 20 different sources of logs. And given that it's an omnibus Lambda function, it's very hard to correlate that to users, or user sessions, or even figure out where it's going. The problem I've had is, “Oh, well, this seems like something I could instrument to spray logs somewhere pretty easily, but I don't want to instrument it for 15 different observability vendors. Why don't I just use otel—or Open Telemetry—and then tell that to throw whatever I care about to various vendors and do a bit of a bake-off?” The problem, of course, is that open telemetry and Lambda seem to be in just the absolute wrong directions. A lot.Thomas: So, we see the same trend of otel coming out, and you know, this is another API that I'm sure we're going to go all-in on because it's getting more and more talked about. I won't say it's the standard that I think is trending to all your points about I need to normalize a process. But as you mentioned, we also need to correlate across the data. And this is where, you know, there are times where search and hunting and alerting is awesome and wonderful and solves all your needs, and sometimes correlation. Imagine trying to denormalize all those logs, set up a pipeline, put it into some database, or just do a SELECT *, you know, join this to that to that, and get your answers.And so, I think both OpenTelemetry and SQL and search all need to be played into one solution, or at least one capability because if you're not doing that, you're creating some hodgepodge pipeline to move it around and ultimately get your questions answered. And if it takes weeks—maybe even months, depending on the scale—you may sometimes not choose to do it.Corey: One other aspect that has always annoyed me about more or less every analytics company out there—and you folks are no exception to this—is the idea of charging per gigabyte ingested because that inherently sets up a weird dichotomy of, well, this is costing a lot, so I should strive to log less. And that is sort of the exact opposite, not just of the direction you folks want customers to go in, but also where customers themselves should be going in. Where you diverge from an awful lot of those other companies because of the nature of how you work, is that you don't charge them again for retention. And the idea that, yeah, the fact that anything stored in ChaosSearch lives in your own S3 buckets, you can set your own lifecycle policies and do whatever you want to do with that is a phenomenal benefit, just because I've always had a dim view of short-lived retention periods around logs, especially around things like audit logs. 
And these days, I would consider getting rid of audit logging data and application logging data—especially if there's a correlation story—any sooner than three years feels like borderline malpractice.Thomas: [laugh]. We—how many times—I mean, we've heard it time and time again is, “I don't have access to that data because it was too costly.” No one says they don't want the data. They just can't afford the data. And one of the key premises that if you don't have all the data, you're at risk, particularly in security—I mean, even audits. I mean, so many times our customers ask us, you know, “Hey, what was this going on? What was that go on?” And because we can so cost-effectively monitor our own service, we can provide that information for them. And we hear this time and time again.And retention is not a very sexy aspect, but it's so crucial. Anytime you look in problems with X solution or Y solution, it's the cost of the data. And this is something that we wanted to address, officially. And why do we make it so cost-effective and free after you ingest it was because we were using cloud storage. And it was just a great place to land the data cost-effective, securely.Now, with that said, there are two types of companies I've seen. Everybody needs at least 90 days. I see time and time again. Sure, maybe daily, in a weeks, they do a lot of their operation, but 90 days is where it lands. But there's also a bunch of companies that need it for years, for compliance, for audit reasons.And imagine trying to rehydrate, trying to rebuild—we have one customer—again I won't say who—has two petabytes of data that they rehydrate when they need it. And they say it's a nightmare. And it's growing. What if you just had it always alive, always accessible? Now, as we move from search to SQL, there are use cases where in the log world, they just want to pay upfront, fixed fee, this many dollars per terabyte, but as we get into the more ad hoc side of it, more and more folks are asking for, “Can I pay per query?”And so, you'll see coming out soon, about scenarios where we have a different pricing model. For logs, typically, you want to pay very consistent, you know, predetermined cost structure, but in the case of more security data lakes, where you want to go in the past and not really pay for something until you use it, that's going to be an option as well coming out soon. So, I would say you need both in the pricing models, but you need the data to have either side, right?Corey: This episode is sponsored in part by our friends at ChaosSearch. You could run Elasticsearch or Elastic Cloud—or OpenSearch as they're calling it now—or a self-hosted ELK stack. But why? ChaosSearch gives you the same API you've come to know and tolerate, along with unlimited data retention and no data movement. Just throw your data into S3 and proceed from there as you would expect. This is great for IT operations folks, for app performance monitoring, cybersecurity. If you're using Elasticsearch, consider not running Elasticsearch. They're also available now in the AWS marketplace if you'd prefer not to go direct and have half of whatever you pay them count towards your EDB commitment. Discover what companies like Equifax, Armor Security, and Blackboard already have. To learn more, visit chaossearch.io and tell them I sent you just so you can see them facepalm, yet again.Corey: You'd like to hope. 
I mean, you could always theoretically wind up just pulling what Ubiquiti apparently did—where this came out in an indictment that was unsealed against an insider—but apparently one of their employees wound up attempting to extort them—which again, that's not their fault, to be clear—but what came out was that this person then wound up setting the CloudTrail audit log retention to one day, so there were no logs available. And then as a customer, I got an email from them saying there was no evidence that any customer data had been accessed. I mean, yeah, if you want, like, the world's most horrifyingly devilish best practice, go ahead and set your log retention to nothing, and then you too can confidently state that you have no evidence of anything untoward happening.Contrast this with what AWS did when there was a vulnerability reported in AWS Glue. Their analysis of it stated explicitly, “We have looked at our audit logs going back to the launch of the service and have conclusively proven that the only time this has ever happened was in the security researcher who reported the vulnerability to us, in their own account.” Yeah, one of those statements breeds an awful lot of confidence. The other one makes me think that you're basically being run by clowns.Thomas: You know what? CloudTrail is such a crucial—particularly Amazon, right—crucial service because of that, we see time and time again. And the challenge of CloudTrail is that storing a long period of time is costly and the messiness the JSON complexity, every company struggles with it. And this is how uniquely—how we represent information, we can model it in all its permutations—but the key thing is we can store it forever, or you can store forever. And time and time again, CloudTrail is a key aspect to correlate—to your question—correlate this happened to that. Or do an audit on two years ago, this happened.And I got to tell you, to all our listeners out there, please store your CloudTrail data—ideally in ChaosSearch—because you're going to need it. Everyone always needs that. And I know it's hard. CloudTrail data is messy, nested JSON data that can explode; I get it. You know, there's tricks to do it manually, although quite painful. But CloudTrail, every one of our customers is indexing with us in CloudTrail because of stories like that, as well as the correlation across what maybe their application log data is saying.Corey: I really have never regretted having extra logs lying around, especially with, to be very direct, the almost ridiculously inexpensive storage classes that S3 offers, especially since you can wind up having some of the offline retrieval stuff as part of a lifecycle policy now with intelligent tiering. I'm a big believer in just—again—the Glacier Deep Archive I've at the cost of $1,000 a month per petabyte, with admittedly up to 12 hours of calling that as a latency. But that's still, for audit logs and stuff like that, why would I ever want to delete things ever again?Thomas: You're exactly right. And we have a bunch of customers that do exactly that. And we automate the entire process with you. Obviously, it's your S3 account, but we can manage across those tiers. And it's just to a point where, why wouldn't you? It's so cost-effective.And the moments where you don't have that information, you're at risk, whether it's internal audits, or you're providing a service for somebody, it's critical data. With CloudTrail, it's critical data. 
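Corey's point about lifecycle policies and Glacier Deep Archive translates to a few lines of configuration: transition old log objects to a colder storage class and never expire them. A minimal boto3 sketch follows; the bucket name, prefix, and 90-day window are hypothetical choices for illustration, not recommendations.

```python
import boto3

s3 = boto3.client("s3")

# Keep audit logs forever, but move them to Glacier Deep Archive after 90 days.
# There is deliberately no Expiration action, so this rule never deletes anything.
s3.put_bucket_lifecycle_configuration(
    Bucket="example-audit-logs",                # hypothetical bucket
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "archive-audit-logs",
                "Filter": {"Prefix": "AWSLogs/"},   # hypothetical prefix
                "Status": "Enabled",
                "Transitions": [
                    {"Days": 90, "StorageClass": "DEEP_ARCHIVE"},
                ],
            }
        ]
    },
)
```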
And if you're not storing it and if you're not making it accessible through some tool like an Elastic API or Chaos, it's not worth it. I think, to your point about your story, it's epically not worth it.Corey: It's really not. It's one of those areas where that is not a place to overly cost optimize. This is—I mean we talked earlier about my business and perceptions of conflict of interest. There's a reason that I only ever charge fixed-fee and not percentage of savings or whatnot because, at some point, I'll be placed in a position of having to say nonsense, like, “Do you really need all of these backups?” That doesn't make sense at that point.I do point out things like you have hourly disk snapshots of your entire web fleet, which has no irreplaceable data on them dating back five years. Maybe cleaning some of that up might be the right answer. The happy answer is somewhere in between those two, and it's a business decision around exactly where that line lies. But I'm a believer in never regretting having kept logs almost into perpetuity. Until and unless I start getting more or less pillaged by some particularly rapacious vendor that's oh, yeah, we're going to charge you not just for ingest, but also for retention. And for how long you want to keep it, we're going to treat it like we're carving it into platinum tablets. No. Stop that.Thomas: [laugh]. Well, you know, it's funny, when we first came out, we were hearing stories that vendors were telling customers why they didn't need their data, to your point, like, “Oh, you don't need that,” or, “Don't worry about that.” And time and time again, they said, “Well, turns out we didn't need that.” You know, “Oh, don't index all your data because you just know what you know.” And the problem is that life doesn't work out that way business doesn't work out that way.And now what I see in the market is everyone's got tiering scenarios, but the accessibility of that data takes some time to get access to. And these are all workarounds and bandaids to what fundamentally is if you design an architecture and a solution is such a way, maybe it's just always hot; maybe it's just always available. Now, we talked about tiering off to something very, very cheap, then it's like virtually free. But you know, our solution was, whether it's ultra warm, or this tiering that takes hours to rehydrate—hours—no one wants to live in that world, right? They just want to say, “Hey, on this date on this year, what was happening? And let me go look, and I want to do it now.”And it has to be part of the exact same system that I was using already. I didn't have to call up IT to say, “Hey, can you rehydrate this?” Or, “Can I go back to the archive and look at it?” Although I guess we're talking about archiving with your website, viewing from days of old, I think that's kind of funny. I should do that more often myself.Corey: I really wish that more companies would put themselves in the customers' shoes. And for what it's worth, periodically, I've spoken to a number of very happy ChaosSearch customers. I haven't spoken to any angry ones yet, which tells me you're either terrific at crisis comms, or the product itself functions as intended. So, either way, excellent job. Now, which team of yours is doing that excellent job, of course, is going to depend on which one of those outcomes it is. 
But I'm pretty good at ferreting out stories on those things.Thomas: Well, you know, it's funny, being a company that's driven by customer ask, it's so easy build what the customer wants. And so, we really take every input of what the customer needs and wants—now, there are cases where we relace Splunk. They're the Cadillac, they have all the bells and whistles, and there's times where we'll say, “Listen, that's not what we're going to do. We're going to solve these problems in this vector.” But they always keep on asking, right? You know, “I want this, I want that.”But most of the feedback we get is exactly what we should be building. People need their answers and how they get it. It's really helped us grow as a company, grow as a product. And I will say ever since we went live now many, many years ago, all our roadmap—other than our Northstar of transforming cloud storage into a search SQL big data analytics database has been customer-driven, market customer-driven, like what our customer is asking for, whether it's observability and integrating with Grafana and Kibana or, you know, security data lakes. It's just a huge theme that we're going to make sure that we provide a solution that meets those needs.So, I love when customers ask for stuff because the product just gets better. I mean, yeah, sometimes you have to have a thick skin, like, “Why don't you have this?” Or, “Why don't you have that?” Or we have customers—and not to complain about customers; I love our customers—but they sometimes do crazy things that we have to help them on crazy-ify. [laugh]. I'll leave it at that. But customers do silly things and you have to help them out. I hope they remember that, so when they ask for a feature that maybe takes a month to make available, they're patient with us.Corey: We sure can hope. I really want to thank you for taking so much time to once again suffer all of my criticisms, slings and arrows, blithe market observations, et cetera, et cetera. If people want to learn more, where's the best place to find you?Thomas: Well, of course, chaossearch.io. There's tons of material about what we do, use cases, case studies; we just published a big case study with Equifax recently. We're in Gartner and a whole bunch of Hype Cycles that you can pull down to see how we fit in the market.Reach out to us. You can set up a trial, kick the tires, again, on your cloud storage like S3. And ChaosSearch on Twitter, we have a Facebook, we have all this classic social medias. But our website is really where all the good content and whether you want to learn about the architecture and how we've done it, and use cases; people who want to say, “Hey, I have a problem. How do you solve it? How do I learn more?”Corey: And we will, of course, put links to that in the show notes. For my own purposes, you could also just search for the term ChaosSearch in your email inbox and find one of their sponsored ads in my newsletter and click that link, but that's a little self-serving as we do it. I'm kidding. I'm kidding. There's no need to do that. That is not how we ever evaluate these things. But it is funny to tell that story. Thomas, thank you so much for your time. As always, it's appreciated.Thomas: Corey Quinn, I truly enjoyed this time. And I look forward to upcoming re:Invent. I'm assuming it's going to be live like last year, and this is where we have a lot of fun with the community.Corey: Oh, I have no doubt that we're about to go through that particular path very soon. Thank you. 
It's been an absolute pleasure.Thomas: Thank you.Corey: Thomas Hazel, CTO and Founder of ChaosSearch. I'm Cloud Economist Corey Quinn and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice, along with an angry, insulting comment that I will then set to have a retention period of one day, and then go on to claim that I have received no negative feedback.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.Announcer: This has been a HumblePod production. Stay humble.

Screaming in the Cloud
How Data Discovery is Changing the Game with Shinji Kim

Screaming in the Cloud

Play Episode Listen Later Sep 22, 2022 32:58


About ShinjiShinji Kim is the Founder & CEO of Select Star, an automated data discovery platform that helps you to understand & manage your data. Previously, she was the Founder & CEO of Concord Systems, a NYC-based data infrastructure startup acquired by Akamai Technologies in 2016. She led the strategy and execution of Akamai IoT Edge Connect, an IoT data platform for real-time communication and data processing of connected devices. Shinji studied Software Engineering at University of Waterloo and General Management at Stanford GSB.Links Referenced: Select Star: https://www.selectstar.com/ LinkedIn: https://www.linkedin.com/company/selectstarhq/ Twitter: https://twitter.com/selectstarhq TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: This episode is sponsored in part by our friends at AWS AppConfig. Engineers love to solve, and occasionally create, problems. But not when it's an on-call fire-drill at 4 in the morning. Software problems should drive innovation and collaboration, NOT stress, and sleeplessness, and threats of violence. That's why so many developers are realizing the value of AWS AppConfig Feature Flags. Feature Flags let developers push code to production, but hide that that feature from customers so that the developers can release their feature when it's ready. This practice allows for safe, fast, and convenient software development. You can seamlessly incorporate AppConfig Feature Flags into your AWS or cloud environment and ship your Features with excitement, not trepidation and fear. To get started, go to snark.cloud/appconfig. That's snark.cloud/appconfig.Corey: I come bearing ill tidings. Developers are responsible for more than ever these days. Not just the code that they write, but also the containers and the cloud infrastructure that their apps run on. Because serverless means it's still somebody's problem. And a big part of that responsibility is app security from code to cloud. And that's where our friend Snyk comes in. Snyk is a frictionless security platform that meets developers where they are - Finding and fixing vulnerabilities right from the CLI, IDEs, Repos, and Pipelines. Snyk integrates seamlessly with AWS offerings like code pipeline, EKS, ECR, and more! As well as things you're actually likely to be using. Deploy on AWS, secure with Snyk. Learn more at Snyk.co/scream That's S-N-Y-K.co/screamCorey: Welcome to Screaming in the Cloud. I'm Corey Quinn. Every once in a while, I encounter a company that resonates with something that I've been doing on some level. In this particular case, that is what's happened here, but the story is slightly different. My guest today is Shinji Kim, who's the CEO and founder at Select Star.And the joke that I was making a few months ago was that Select Stars should have been the name of the Oracle ACE program instead. Shinji, thank you for joining me and suffering my ridiculous, basically amateurish and sophomore database-level jokes because I am bad at databases. Thanks for taking the time to chat with me.Shinji: Thanks for having me here, Corey. 
Good to meet you.Corey: So, Select Star despite being the only query pattern that I've ever effectively been able to execute from memory, what you do as a company is described as an automated data discovery platform. So, I'm going to start at the beginning with that baseline definition. I think most folks can wrap their heads around what the idea of automated means, but the rest of the words feel like it might mean different things to different people. What is data discovery from your point of view?Shinji: Sure. The way that we define data discovery is finding and understanding data. In other words, think about how discoverable your data is in your company today. How easy is it for you to find datasets, fields, KPIs of your organization data? And when you are looking at a table, column, dashboard, report, how easy is it for you to understand that data underneath? Encompassing on that is how we define data discovery.Corey: When you talk about data lurking around the company in various places, that can mean a lot of different things to different folks. For the more structured data folks—which I tend to think of as the organized folks who are nothing like me—that tends to mean things that live inside of, for example, traditional relational databases or things that closely resemble that. I come from a grumpy old sysadmin perspective, so I'm thinking, oh, yeah, we have a Jira server in the closet and that thing's logging to its own disk, so that's going to be some information somewhere. Confluence is another source of data in an organization; it's usually where insight and a knowledge of what's going on goes to die. It's one of those write once, read never type of things.And when I start thinking about what data means, it feels like even that is something of a squishy term. From the perspective of where Select Start starts and stops, is it bounded to data that lives within relational databases? Does it go beyond that? Where does it start? Where does it stop?Shinji: So, we started the company with an intention of increasing the discoverability of data and hence providing automated data discovery capability to organizations. And the part where we see this as the most effective is where the data is currently being consumed today. So, this is, like, where the data consumption happens. So, this can be a data warehouse or data lake, but this is where your data analysts, data scientists are querying data, they are building dashboards, reports on top of, and this is where your main data mart lives.So, for us, that is primarily a cloud data warehouse today, usually has a relational data structure. On top of that, we also do a lot of deep integrations with BI tools. So, that includes tools like Tableau, Power BI, Looker, Mode. Wherever these queries from the business stakeholders, BI engineers, data analysts, data scientists run, this is a point of reference where we use to auto-generate documentation, data models, lineage, and usage information, to give it back to the data team and everyone else so that they can learn more about the dataset they're about to use.Corey: So, given that I am seeing an increased number of companies out there talking about data discovery, what is it the Select Star does that differentiates you folks from other folks using similar verbiage in how they describe what they do?Shinji: Yeah, great question. There are many players that popping up, and also, traditional data catalog's definitely starting to offer more features in this area. 
The main differentiator that we have in the market today, we call it fast time-to-value. Any customer that is starting with Select Star, they get to set up their instance within 24 hours, and they'll be able to get all the analytics and data models, including column-level lineage, popularity, ER diagrams, and how other people are—top users and how other people are utilizing that data, like, literally in few hours, max to, like, 24 hours. And I would say that is the main differentiator.And most of the customers I have pointed out that setup and getting started has been super easy, which is primarily backed by a lot of automation that we've created underneath the platform. On top of that, just making it super easy and simple to use. It becomes very clear to the users that it's not just for the technical data engineers and DBAs to use; this is also designed for business stakeholders, product managers, and ops folks to start using as they are learning more about how to use data.Corey: Mapping this a little bit toward the use cases that I'm the most familiar with, this big source of data that I tend to stumble over is customer AWS bills. And that's not exactly a big data problem, given that it can fit in memory if you have a sufficiently exciting computer, but using Tableau don't wind up slicing and dicing that because at some point, Excel falls down. From my perspective, problem with Excel is that it doesn't tend to work on huge datasets very well, and from the position of Salesforce, the problem with Excel is that it doesn't cost a giant pile of money every month. So, those two things combined, Tableau is the answer for what we do. But that's sort of the end-all for us of, that's where it stops.At that point, we have dashboards that we build and queries that we run that spit out the thing we're looking at, and then that goes back to inform our analysis. We don't inherently feed that back into anything else that would then inform the rest of what we do. Now, for our use case, that probably makes an awful lot of sense because we're here to help our customers with their billing challenges, not take advantage of their data to wind up informing some giant model and mispurposing that data for other things. But if we were generating that data ourselves as a part of our operation, I can absolutely see the value of tying that back into something else. You wind up almost forming a reinforcing cycle that improves the quality of data over time and lets you understand what's going on there. What are some of the outcomes that you find that customers get to by going down this particular path?Shinji: Yeah, so just to double-click on what you just talked about, the way that we see this is how we analyze the metadata and the activity logs—system logs, user logs—of how that data has been used. So, part of our auto-generated documentation for each table, each column, each dashboard, you're going to be able to see the full data lineage: where it came from, how it was transformed in the past, and where it's going to. You will also see what we call popularity score: how many unique users are utilizing this data inside the organization today, how often. And utilizing these two core models and analysis that we create, you can start looking at first mapping out the data flow, and then determining whether or not this dataset is something that you would want to continue keeping or running the data pipelines for. 
Because once you start mapping these usage models of tables versus dashboards, you may find that there are recurring jobs that creates all these materialized views and tables that are feeding dashboards that are not being looked at anymore.So, with this mechanism by looking initially data lineage as a concept, a lot of companies use data lineage in order to find dependencies: what is going to break if I make this change in the column or table, as well as just debugging any of issues that is currently happening in their pipeline. So, especially when you will have to debug a SQL query or pipeline that you didn't build yourself but you need to find out how to fix it, this is a really easy way to instantly find out, like, where the data is coming from. But on top of that, if you start adding this usage information, you can trace through where the main compute is happening, which largest route table is still being queried, instead of the more summarized tables that should be used, versus which are the tables and datasets that is continuing to get created, feeding the dashboards and is those dashboards actually being used on the business side. So, with that, we have customers that have saved thousands of dollars every month just by being able to deprecate dashboards and pipelines that they were afraid of deprecating in the past because they weren't sure if anyone's actually using this or not. But adopting Select Star was a great way to kind of do a full spring clean of their data warehouse as well as their BI tool. And this is an additional benefit to just having to declutter so many old, duplicated, and outdated dashboards and datasets in their data warehouse.Corey: That is, I guess, a recurring problem that I see in many different pockets of the industry as a whole. You see it in the user visibility space, you see it in the cost control space—I even made a joke about Confluence that alludes to it—this idea that you build a whole bunch of dashboards and use it to inform all kinds of charts and other systems, but then people are busy. It feels like there's no ‘and then.' Like, one of the most depressing things in the universe that you can see after having spent a fair bit of effort to build up those dashboards is the analytics for who internally has looked at any of those dashboards since the demo you gave showing it off to everyone else. It feels like in many cases, we put all these projects and amount of effort into building these things out that then don't get used.People don't want to be informed by data they want to shoot from their gut. Now, sometimes that's helpful when we're talking about observability tools that you use to trace down outages, and, “Well, our site's really stable. We don't have to look at that.” Very awesome, great, awesome use case. The business insight level of dashboard just feels like that's something you should really be checking a lot more than you are. How do you see that?Shinji: Yeah, for sure. I mean, this is why we also update these usage metrics and lineage every 24 hours for all of our customers automatically, so it's just up-to-date. And the part that more customers are asking for where we are heading to—earlier, I mentioned that our main focus has been on analyzing data consumption and understanding the consumption behavior to drive better usage of your data, or making data usage much easier. The part that we are starting to now see is more customers wanting to extend those feature capabilities to their staff of where the data is being generated. 
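The "popularity score" and the "hasn't been touched in 90 days" filter described in this exchange reduce to a small amount of bookkeeping over query history: count distinct users per table and track the last time each table was read. The sketch below runs on a made-up, in-memory log purely to show the shape of the logic; in practice the same analysis would run against the warehouse's own query or access history rather than a Python list.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Made-up query-log records: which table was read, by whom, and when.
query_log = [
    {"table": "analytics.daily_revenue", "user": "amy",  "queried_at": datetime(2022, 9, 20)},
    {"table": "analytics.daily_revenue", "user": "brad", "queried_at": datetime(2022, 9, 21)},
    {"table": "staging.orders_tmp",      "user": "etl",  "queried_at": datetime(2022, 3, 1)},
]

def table_stats(log, now, stale_after=timedelta(days=90)):
    users, last_seen = defaultdict(set), {}
    for row in log:
        users[row["table"]].add(row["user"])
        last_seen[row["table"]] = max(last_seen.get(row["table"], row["queried_at"]), row["queried_at"])
    popularity = sorted(users, key=lambda t: len(users[t]), reverse=True)
    stale = [t for t, seen in last_seen.items() if now - seen > stale_after]
    return popularity, stale

popular, deprecation_candidates = table_stats(query_log, now=datetime(2022, 9, 22))
print(popular)                   # tables ranked by distinct users
print(deprecation_candidates)    # tables unread for 90+ days, e.g. ['staging.orders_tmp']
```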
So, connecting the similar amount of analysis and metadata collection for production databases, Kafka Queues, and where the data is first being generated is one of our longer-term goals. And then, then you'll really have more of that, up to the source level, of whether the data should be even collected or whether it should even enter the data warehouse phase or not.Corey: One of the challenges I see across the board in the data space is that so many products tend to have a very specific point of the customer lifecycle, where bringing them in makes sense. Too early and it's, “Data? What do you mean data? All I have are these logs, and their purpose is basically to inflate my AWS bill because I'm bad at removing them.” And on the other side, it's, “Great. We pioneered some of these things and have built our own internal enormous system that does exactly what we need to do.” It's like, “Yes, Google, you're very smart. Good job.” And most people are somewhere between those two extremes. Where are customers on that lifecycle or timeline when using Select Star makes sense for them?Shinji: Yeah, I think that's a great question. Also the time, the best place where customers would use Select Star for is that after they have their cloud data warehouse set up. Either they have finished their migration, they're starting to utilize it with their BI tools, and they're starting to notice that it's not just, like, you know, ten to fifty tables that they're starting with; most of them have more than hundreds of tables. And they're feeling that this is starting to go out of control because we have all these data, but we are not a hundred percent sure what exactly is in our database. And this usually just happens more in larger companies, companies at thousand-plus employees, and they usually find a lot of value out of Select Star right away because, like, we will start pointing out many different things.But we also see a lot of, like, forward-thinking, fast-growing startups that are at the size of a few hundred employees, you know, they now have between five to ten-person data team, and they are really creating the right single source of truth of their data knowledge through a Select Star. So, I think you can start anywhere from when your data team size is, like, beyond five and you're continuing to grow because every time you're trying to onboard a data analyst, data scientist, you will have to go through, like, basically the same type of training of your data model, and it might actually look different because the data models and the new features, new apps that you're integrating this changes so quickly. So, I would say it's important to have that base early on and then continue to grow. But we do also see a lot of companies coming to us after having thousands of datasets or tens of thousands of datasets that it's really, like, very hard to operate and onboard anyone. And this is a place where we really shine to help their needs, as well.Corey: Sort of the, “I need a database,” to the, “Help, I have too many databases,” pipeline, where [laugh] at some point people start to—wanting to bring organization to the chaos. One thing I like about your model is that you don't seem to be making the play that every other vendor in the data space tends to, which is, “Oh, we want you to move your data onto our systems. The end.” You operate on data that is in place, which makes an awful lot of sense for the kinds of things that we're talking about. 
Customers are flat out not going to move their data warehouse over to your environment, just because the data gravity is ludicrous. Just the sheer amount of money it would take to egress that data from a cloud provider, for example, is monstrous.Shinji: Exactly. [laugh]. And security concerns. We don't want to be liable for any of the data—and this is, like, a very specific decision we've made very early on the company—to not access data, to not egress any of the real data, and to provide as much value as possible just utilizing the metadata and logs. And depending on the types of data warehouses, it also can be really efficient because the query history or the metadata systems tables are indexed separately. Usually, it's much lighter load on the compute side. And that definitely has, like, worked well for our advantage, especially being a SaaS tool.Corey: This episode is sponsored in part by our friends at Sysdig. Sysdig secures your cloud from source to run. They believe, as do I, that DevOps and security are inextricably linked. If you wanna learn more about how they view this, check out their blog, it's definitely worth the read. To learn more about how they are absolutely getting it right from where I sit, visit Sysdig.com and tell them that I sent you. That's S Y S D I G.com. And my thanks to them for their continued support of this ridiculous nonsense.Corey: What I like is just how straightforward the integrations are. It's clear you're extraordinarily agnostic as far as where the data itself lives. You integrate with Google's BigQuery, with Amazon Redshift, with Snowflake, and then on the other side of the world with Looker, and Tableau, and other things as well. And one of the example use cases you give is find the upstream table in BigQuery that a Looker dashboard depends on. That's one of those areas where I see something like that, and, oh, I can absolutely see the value of that.I have two or three DynamoDB tables that drive my newsletter publication system that I built—because I have deep-seated emotional problems and I take it out and everyone else via code—but as a small, contained system that I can still fit in my head. Mostly. And I still forget which table is which in some cases. Down the road, especially at scale, “Okay, where is the actual data source that's informing this because it doesn't necessarily match what I'm expecting,” is one of those incredibly valuable bits of insight. It seems like that is something that often gets lost; the provenance of data doesn't seem to work.And ideally, you know, you're staffing a company with reasonably intelligent people who are going to look at the results of something and say, “That does not align with my expectations. I'm going to dig.” As opposed to the, “Oh, yeah, that seems plausible. I'll just go with whatever the computer says.” There's an ocean of nuance between those two, but it's nice to be able to establish the validity of the path that you've gone down in order to set some of these things up.Shinji: Yeah, and this is also super helpful if you're tasked to debug a dashboard or pipeline that you did not build yourself. Maybe the person has left the company, or maybe they're out-of-office, but this dashboard has been broken and you're quote-unquote, “On call,” for data. What are you going to do? You're going to—without a tool that can show you a full lineage, you will have to start digging through somebody else's SQL code and try to map out, like, where the data is coming from, if this is calculating correctly. 
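The "find the upstream table that a dashboard depends on" use case rests on table-level lineage, which can be derived from query history: every CREATE TABLE AS or INSERT INTO statement links its target to the tables it reads from. The sketch below is a deliberately crude, regex-based stand-in for that idea; a real lineage tool parses SQL properly and handles CTEs, subqueries, and views.

```python
import re

# Crude lineage extraction: target tables from CREATE TABLE / INSERT INTO,
# source tables from FROM / JOIN clauses. Illustration only.
TARGET = re.compile(r"(?:create\s+table|insert\s+into)\s+([\w.]+)", re.I)
SOURCE = re.compile(r"(?:from|join)\s+([\w.]+)", re.I)

def lineage_edges(sql: str):
    targets = TARGET.findall(sql)
    sources = SOURCE.findall(sql)
    return [(src, tgt) for tgt in targets for src in sources if src != tgt]

query = """
CREATE TABLE analytics.daily_revenue AS
SELECT o.order_date, SUM(o.amount) AS revenue
FROM raw.orders o
JOIN raw.payments p ON p.order_id = o.id
GROUP BY o.order_date
"""
print(lineage_edges(query))
# [('raw.orders', 'analytics.daily_revenue'), ('raw.payments', 'analytics.daily_revenue')]
```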
Usually takes, you know, few hours to just get to the bottom of the issue. And this is one of the main use cases that our customers bring up every single time, as more of, like, this is now the go-to place every time there is any data questions or data issues.Corey: The first and golden rule of cloud economics is step one, turn that shit off.Shinji: [laugh].Corey: When people are using something, you can optimize the hell out of it however you want, but nothing's going to beat turning it off. One challenge is when we're looking at various accounts and we see a Redshift cluster, and it's, “Okay. That thing's costing a few million bucks a year and no one seems to know anything about it.” They keep pointing to other teams, and it turns into this giant, like, finger-pointing exercise where no one seems to have responsibility for it. And very often, our clients will choose not to turn that thing off because on the one hand, if you don't turn it off, you're going to spend a few million bucks a year that you otherwise would not have had to.On the other, if you delete the data warehouse, and it turns out, oh, yeah, that was actually kind of important, now we don't have a company anymore. It's a question of which is the side you want to be wrong on. And in some levels, leaving something as it is and doing something else is always a more defensible answer, just because the first time your cost-saving exercises take out production, you're generally not allowed to save money anymore. This feels like it helps get to that source of truth a heck of a lot more effectively than tracing individual calls and turning into basically data center archaeologists.Shinji: [laugh]. Yeah, for sure. I mean, this is why from the get go, we try to give you all your tables, all of your database, just ordered by popularity. So, you can also see overall, like, from all the tables, whether that's thousands or tens of thousands, you're seeing the most used, has the most number of dependencies on the top, and you can also filter it by all the database tables that hasn't been touched in the last 90 days. And just having this, like, high-level view gives a lot of ideas to the data platform team about how they can optimize usage of their data warehouse.Corey: From where I tend to sit, an awful lot of customers are still relatively early in their data journey. An awful lot of the marketing that I receive from various AWS mailing lists that I found myself on because I've had the temerity to open accounts has been along the lines of oh, data discovery is super important, but first, they presuppose that I've already bought into this idea that oh, every company must be a completely data-driven company. The end. Full stop.And yeah, we're a small bespoke services consultancy. I don't necessarily know that that's the right answer here. But then it takes it one step further and starts to define the idea of data discovery as, ah, you will use it to find a PII or otherwise sensitive or restricted data inside of your datasets so you know exactly where it lives. And sure, okay, that's valuable, but it also feels like a very narrow definition compared to how you view these things.Shinji: Yeah. Basically, the way that we see data discovery is it's starting to become more of an essential capability in order for you to monitor and understand how your data is actually being used internally. 
It basically gives you the insights around sure, like, what are the duplicated datasets, what are the datasets that have that descriptions or not, what are something that may contain sensitive data, so on and so forth, but that's still around the characteristics of the physical datasets. Whereas I think the part that's really important around data discovery that is not being talked about as much is how the data can actually be used better. So, have it as more of a forward-thinking mechanism and in order for you to actually encourage more people to utilize data or use the data correctly, instead of trying to contain this within just one team is really where I feel like data discovery can help.And in regards to this, the other big part around data discovery is really opening up and having that transparency just within the data team. So, just within the data team, they always feel like they do have that access to the SQL queries and you can just go to GitHub and just look at the database itself, but it's so easy to get lost in the sea of metadata that is just laid out as just the list; there isn't much context around the data itself. And that context and with along with the analytics of the metadata is what we're really trying to provide automatically. So eventually, like, this can be also seen as almost like a way to, like, monitor the datasets, like, how you're currently monitoring your applications through Datadog or your website with your Google Analytics, this is something that can be also used as more of a go-to source of truth around what your state of the data is, how that's defined, and how that's being mapped to different business processes, so that there isn't much confusion around data. Everything can be called the same, but underneath it actually can mean very different things. Does that make sense?Corey: No, it absolutely does. I think that this is part of the challenge in trying to articulate value that is, I guess, specific to this niche across an entire industry. The context that drives data is going to be incredibly important, and it feels like so much of the marketing in the space is aimed at one or two pre-imagined customer profiles. And that has the side effect of making customers for whom that model doesn't align, look and feel like either doing something wrong, or makes it look like the vendor who's pitching this is somewhat out of touch. I know that I work in a relatively bounded problem space, but I still learn new things about AWS billing on virtually every engagement that I go on, just because you always get to learn more about how customers view things and how they view not just their industry, but also the specificities of their own business and their own niche.I think that is one of the challenges historically, with the idea of letting software do everything. Do you find the problems that you're solving tend to be global in nature or are you discovering strange depths of nuance on a customer-by-customer basis at this point?Shinji: Overall, a lot of the problems that we solve and the customers that we work with is very industry agnostic. As long as you are having many different datasets that you need to manage, there are common problems that arises, regardless of the industry that you're in. 
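The "popularity from metadata only" idea discussed above is straightforward to sketch against a warehouse that exposes its query history. The snippet below is not Select Star's implementation, just a rough illustration of the approach, assuming BigQuery and the google-cloud-bigquery Java client; it reads only the INFORMATION_SCHEMA.JOBS_BY_PROJECT job log, never table data, and the region qualifier is a placeholder.

import com.google.cloud.bigquery.BigQuery;
import com.google.cloud.bigquery.BigQueryOptions;
import com.google.cloud.bigquery.FieldValueList;
import com.google.cloud.bigquery.QueryJobConfiguration;
import com.google.cloud.bigquery.TableResult;

public class TablePopularity {
    public static void main(String[] args) throws Exception {
        // Rank tables by how often queries referenced them in the last 90 days.
        // Only job metadata is scanned, so no warehouse data leaves the account.
        String sql =
            "SELECT ref.project_id, ref.dataset_id, ref.table_id, COUNT(*) AS query_count "
          + "FROM `region-us`.INFORMATION_SCHEMA.JOBS_BY_PROJECT AS jobs, "
          + "     UNNEST(jobs.referenced_tables) AS ref "
          + "WHERE jobs.creation_time >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 90 DAY) "
          + "  AND jobs.job_type = 'QUERY' "
          + "GROUP BY 1, 2, 3 "
          + "ORDER BY query_count DESC";

        BigQuery bigquery = BigQueryOptions.getDefaultInstance().getService();
        TableResult result = bigquery.query(QueryJobConfiguration.newBuilder(sql).build());
        for (FieldValueList row : result.iterateAll()) {
            System.out.printf("%s.%s.%s -> %d queries%n",
                    row.get("project_id").getStringValue(),
                    row.get("dataset_id").getStringValue(),
                    row.get("table_id").getStringValue(),
                    row.get("query_count").getLongValue());
        }
    }
}

The "not touched in 90 days" filter mentioned earlier is the same idea in reverse: any table in the catalog that never appears in referenced_tables over the window becomes a candidate for archiving or deletion.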
We do observe some industry-specific issues because your data is either, it's an unstructured data, or your data is primarily events, or you know, depending on how the data looks like, but primarily because of most of the BI solutions and data warehouses are operating as a relational databases, this is a part where we really try to build a lot of best practices, and the common analytics that we can apply to every customer that's using Select Star.Corey: I really want to thank you for taking so much time to go through the ins and outs of what it is you're doing these days. If people want to learn more, where's the best place to find you?Shinji: Yeah, I mean, it's been fun [laugh] talking here. So, we are at selectstar.com. That's our website. You can sign up for a free trial. It's completely self-service, so you don't need to get on a demo but, like, we'll also help you onboard and happy to give a free demo to whoever that is interested.We are also on LinkedIn and Twitter under selectstarhq. Yeah, I mean, we're happy to help for any companies that have these issues around wanting to increase their discoverability of data, and want to help their data team and the rest of the company to be able to utilize data better.Corey: And we will, of course, put links to all of that in the [show notes 00:28:58]. Thank you so much for your time today. I really appreciate it.Shinji: Great. Thanks for having me, Corey.Corey: Shinji Kim, CEO and founder at Select Star. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice along with an angry comment that I won't be able to discover because there are far too many podcast platforms out there, and I have no means of discovering where you've said that thing unless you send it to me.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.Announcer: This has been a HumblePod production. Stay humble.

Azul and the Current State of the Java Ecosystem with Scott Sellers

Screaming in the Cloud

Play Episode Listen Later Sep 20, 2022 36:35


About ScottWith more than 28 years of successful leadership in building high technology companies and delivering advanced products to market, Scott provides the overall strategic leadership and visionary direction for Azul Systems.Scott has a consistent proven track record of vision, leadership, and success in enterprise, consumer and scientific markets. Prior to co-founding Azul Systems, Scott founded 3dfx Interactive, a graphics processor company that pioneered the 3D graphics market for personal computers and game consoles. Scott served at 3dfx as Vice President of Engineering, CTO and as a member of the board of directors and delivered 7 award-winning products and developed 14 different graphics processors. After a successful initial public offering, 3dfx was later acquired by NVIDIA Corporation.Prior to 3dfx, Scott was a CPU systems architect at Pellucid, later acquired by MediaVision. Before Pellucid, Scott was a member of the technical staff at Silicon Graphics where he designed high-performance workstations.Scott graduated from Princeton University with a bachelor of science, earning magna cum laude and Phi Beta Kappa honors. Scott has been granted 8 patents in high performance graphics and computing and is a regularly invited keynote speaker at industry conferences.Links Referenced:Azul: https://www.azul.com/ TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: I come bearing ill tidings. Developers are responsible for more than ever these days. Not just the code that they write, but also the containers and the cloud infrastructure that their apps run on. Because serverless means it's still somebody's problem. And a big part of that responsibility is app security from code to cloud. And that's where our friend Snyk comes in. Snyk is a frictionless security platform that meets developers where they are - Finding and fixing vulnerabilities right from the CLI, IDEs, Repos, and Pipelines. Snyk integrates seamlessly with AWS offerings like code pipeline, EKS, ECR, and more! As well as things you're actually likely to be using. Deploy on AWS, secure with Snyk. Learn more at Snyk.co/scream That's S-N-Y-K.co/screamCorey: This episode is sponsored in part by our friends at AWS AppConfig. Engineers love to solve, and occasionally create, problems. But not when it's an on-call fire-drill at 4 in the morning. Software problems should drive innovation and collaboration, NOT stress, and sleeplessness, and threats of violence. That's why so many developers are realizing the value of AWS AppConfig Feature Flags. Feature Flags let developers push code to production, but hide that that feature from customers so that the developers can release their feature when it's ready. This practice allows for safe, fast, and convenient software development. You can seamlessly incorporate AppConfig Feature Flags into your AWS or cloud environment and ship your Features with excitement, not trepidation and fear. To get started, go to snark.cloud/appconfig. That's snark.cloud/appconfig.Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. My guest on this promoted episode today is Scott Sellers, CEO and co-founder of Azul. 
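The AppConfig spot above describes the core feature-flag pattern: the new code path is deployed to production but stays dark until it is deliberately released, and it can be switched back off without a rollback. Below is a minimal, self-contained sketch of that gate; the in-memory flag map stands in for whatever flag service a team actually uses (AWS AppConfig, LaunchDarkly, a config table), so every name here is illustrative rather than a real API.

import java.util.Map;
import java.util.Set;

public class FeatureFlagSketch {

    // Flag name -> customer ids the feature is released to ("*" means everyone).
    // A real system would fetch this from a flag service instead of a constant.
    static final Map<String, Set<String>> FLAGS =
            Map.of("new-checkout-flow", Set.of("customer-42"));

    static boolean isEnabled(String flag, String customerId) {
        Set<String> audience = FLAGS.get(flag);
        return audience != null && (audience.contains("*") || audience.contains(customerId));
    }

    static String processOrder(String customerId) {
        // The new code path is already deployed, but only runs for the released audience.
        if (isEnabled("new-checkout-flow", customerId)) {
            return "processed by NEW checkout for " + customerId;
        }
        return "processed by LEGACY checkout for " + customerId;
    }

    public static void main(String[] args) {
        System.out.println(processOrder("customer-42"));  // new path
        System.out.println(processOrder("customer-7"));   // legacy path
    }
}

Swapping the map for a remote flag service only changes isEnabled, not the call sites, which is what makes progressive rollouts and instant kill switches cheap.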
Scott, thank you for joining me.Scott: Thank you, Corey. I appreciate the opportunity in talking to you today.Corey: So, let's start with what you're doing these days. What is Azul? What do you folks do over there?Scott: Azul is an enterprise software and SaaS company that is focused on delivering more efficient Java solutions for our customers around the globe. We've been around for 20-plus years, and as an entrepreneur, we've really gone through various stages of different growth and different dynamics in the market. But at the end of the day, Azul is all about adding value for Java-based enterprises, Java-based applications, and really endearing ourselves to the Java community.Corey: This feels like the sort of space where there are an awful lot of great business cases to explore. When you look at what's needed in that market, there are a lot of things that pop up. The surprising part to me is that this is the direction that you personally went in. You started your career as a CPU architect, to my understanding. You were then one of the co-founders of 3dfx before it got acquired by Nvidia.You feel like you've spent your career more as a hardware guy than working on the SaaS side of the world. Is that a misunderstanding of your path, or have things changed, or is this just a new direction? Help me understand how you got here from where you were.Scott: I'm not exactly sure what the math would say because I continue to—can't figure out a way to stop time. But you're correct that my academic background, I was an electrical engineer at Princeton and started my career at Silicon Graphics. And that was when I did a lot of fantastic and fascinating work building workstations and high-end graphics systems, you know, back in the day when Silicon Graphics really was the who's who here in Silicon Valley. And so, a lot of my career began in the context of hardware. As you mentioned, I was one of the founders of graphics company called 3dfx that was one of, I think, arguably the pioneer in terms of bringing 3d graphics to the masses, if you will.And we had a great run of that. That was a really fun business to be a part of just because of what was going on in the 3d world. And we took that public and eventually sold that to Nvidia. And at that point, my itch, if you will, was really learning more about the enterprise segment. I'd been involved with professional graphics with SGI, I had been involved with consumer graphics with 3dfx.And I was fascinated just to learn about the enterprise segment. And met a couple people through a mutual friend around the 2001 timeframe, and they started talking about this thing called Java. And you know, I had of course heard about Java, but as a consumer graphics guy, didn't have a lot of knowledge about it or experience with it. And the more I learned about it, recognized that what was going on in the Java world—and credit to Sun for really creating, obviously, not only language, but building a community around Java—and recognized that new evolutions of developer paradigms really only come around once a decade if then, and was convinced and really got excited about the opportunity to ride the wave of Java and build a company around that.Corey: One of the blind spots that I have throughout the entire world of technology—and to be fair, I have many of them, but the one most relevant to this conversation, I suppose, is the Java ecosystem as a whole. 
I come from a background of being a grumpy Unix sysadmin—because I've never met a happy one of those in my entire career—and as a result, scripting languages is where everything that I worked with started off. And on the rare occasions, I worked in Java shops, it was, “Great. We're going to go—here's a WAR file. Go ahead and deploy this with Tomcat,” or whatever else people are going to use. But basically, “Don't worry your pretty little head about that.”At most, I have to worry about how to configure a heap or whatnot. But it's from the outside looking in, not having to deal with that entire ecosystem as a whole. And what I've seen from that particular perspective is that every time I start as a technologist, or even as a consumer trying to install some random software package in the depths of the internet, and I have to start thinking about Java, it always feels like I'm about to wind up in a confusing world. There are a number of software packages that I installed back in, I want to say the early-2010s or whatnot. “Oh, you need to have a Java runtime installed on your Mac,” for example.And okay, going through Oracle site, do I need the JRE? Do I need the JDK? Oh, there's OpenJDK, which kind of works, kind of doesn't. Amazon got into the space with Corretto, which because that sounds nothing whatsoever, like Java, but strange names coming from Amazon is basically par for the course for those folks. What is the current state of the Java ecosystem, for those of us who have—basically the closest we've ever gotten is JavaScript, which is nothing alike except for the name.Scott: And you know, frankly, given the protection around the name Java—and you know, that is a trademark that's owned by Oracle—it's amazing to me that JavaScript has been allowed to continue to be called JavaScript because as you point out, JavaScript has nothing to do with Java per se.Corey: Well, one thing they do have in common I found out somewhat recently is that Oracle also owns the trademark for JavaScript.Scott: Ah, there you go. Maybe that's why it continues.Corey: They're basically a law firm—three law firms in a trench coat, masquerading as a tech company some days.Scott: Right. But anyway, it is a confusing thing because you know, I think, arguably, JavaScript, by the numbers, probably has more programmers than any other language in the world, just given its popularity as a web language. But to your question about Java specifically, it's had an evolving life, and I think the state where it is today, I think it's in the most exciting place it's ever been. And I'll walk you through kind of why I believe that to be the case.But Java has evolved over time from its inception back in the days when it was called, I think it was Oak when it was originally conceived, and Sun had eventually branded it as Java. And at the time, it truly was owned by Sun, meaning it was proprietary code; it had to be licensed. And even though Sun gave it away, in most cases, it still at the end of the day, it was a commercially licensed product, if you will, and platform. And if you think about today's world, it would not be conceivable to create something that became so popular with programmers that was a commercially licensed product today. 
It almost would be mandated that it would be open-source to be able to really gain the type of traction that Java has gained.And so, even though Java was really garnering interest, you know, not only within the developer community, but also amongst commercial entities, right, everyone—and the era now I'm talking about is around the 2000 era—all of the major software vendors, whether it was obviously Sun, but then you had Oracle, you had IBM, companies like BEA, were really starting to blossom at that point. It was a—you know, you could almost not find a commercial software entity that was not backing Java. But it was still all controlled by Sun. And all that success ultimately led to a strong outcry from the community saying this has to be open-source; this is too important to be beholden to a single vendor. And that decision was made by Sun prior to the Oracle acquisition, they actually open-sourced the Java runtime code and they created an open-source project called OpenJDK.And to Oracle's credit, when they bought Sun—which I think at the time when you really look back, Oracle really did not have a lot of track record, if you will, of being involved with an open-source community—and I think when Oracle acquired Sun, there was a lot of skepticism as to what's going to happen to Java. Is Oracle going to make this thing, you know, back to the old days, proprietary Oracle, et cetera? And really—Corey: I was too busy being heartbroken over Solaris at that point to pay much attention to the Java stuff, but it felt like it was this—sort of the same pattern, repeated across multiple ecosystems.Scott: Absolutely. And even though Sun had also open-sourced Solaris, with the OpenSolaris project, that was one of the kinds of things that it was still developed very much in a closed environment, and then they would kind of throw some code out into the open world. And no one really ran OpenSolaris because it wasn't fully compatible with Solaris. And so, that was a faint attempt, if you will.But Java was quite different. It was truly all open-sourced, and the big difference that—and again, I give Oracle a lot of credit for this because this was a very important time in the evolution of Java—that Oracle, maintained Sun's commitment to not only continue to open-source Java but most importantly, develop it in the open community. And so, you know, again, back and this is the 2008, ‘09, ‘10 timeframe, the evolution of Java, the decisions, the standards, you know, what goes in the platform, what doesn't, decisions about updates and those types of things, that truly became a community-led world and all done in the open-source. And credit to Oracle for continuing to do that. And that really began the transition away from proprietary implementations of Java to one that, very similar to Linux, has really thrived because of the true open-source nature of what Java is today.And that's enabled more and more companies to get involved with the evolution of Java. If you go to the OpenJDK page, you'll see all of the not only, you know, incredibly talented individuals that are involved with the evolution of Java, but again, a who's who in pretty much every major commercial entities in the enterprise software world is also somehow involved in the OpenJDK community. And so, it really is a very vibrant, evolving standard. 
And some of the tactical things that have happened along the way in terms of changing how versions of Java are released still also very much in the context of maintaining compatibility and finding that careful balance of evolving the platform, but at the same time, recognizing that there is a lot of Java applications out there, so you can't just take a right-hand turn and forget about the compatibility side of things. But we as a community overall, I think, have addressed that very effectively, and the result has been now I think Java is more popular than ever and continues to—we liken it kind of to the mortar and the brick walls of the enterprise. It's a given that it's going to be used, certainly by most of the enterprises worldwide today.Corey: There's a certain subset of folk who are convinced the Java, “Oh, it's this a legacy programming language, and nothing modern or forward-looking is going to be built in it.” Yeah, those people generally don't know what the internal language stack looks like at places like oh, I don't know, AWS, Google, and a few others, it is very much everywhere. But it also feels, on some level, like, it's a bit below the surface-level of awareness for the modern full-stack developer in some respects, right up until suddenly it's very much not. How is Java evolving in a cloud these days?Scott: Well, what we see happening—you know, this is true for—you know, I'm a techie, so I can talk about other techies. I mean as techies, we all like the new thing, right? I mean, it's not that exciting to talk about a language that's been around for 20-plus years. But that doesn't take away from the fact that we still all use keyboards. I mean, no one really talks about what keyboard they use anymore—unless you're really into keyboards—but at the end of the day, it's still a fundamental tool that you use every single day.And Java is kind of in the same situation. The reason that Java continues to be so fundamental is that it really comes back to kind of reinventing the wheel problem. Are there are other languages that are more efficient to code in? Absolutely. Are there other languages that, you know, have some capabilities that the Java doesn't have? Absolutely.But if you have the ability to reinvent everything from scratch, sure, go for it. And you also don't have to worry about well, can I find enough programmers in this, you know, new hot language, okay, good luck with that. You might be able to find dozens, but when you need to really scale a company into thousands or tens of thousands of developers, good luck finding, you know, everyone that knows, whatever your favorite hot language of the day is.Corey: It requires six years experience in a four-year-old language. Yeah, it's hard to find that, sometimes.Scott: Right. And you know, the reality is, is that really no application ever is developed from scratch, right? Even when an application is, quote, new, immediately, what you're using is frameworks and other things that have written long ago and proven to be very successful.Corey: And disturbing amounts of code copied and pasted from Stack Overflow.Scott: Absolutely.Corey: But that's one of those impolite things we don't say out loud very often.Scott: That's exactly right. So, nothing really is created from scratch anymore. And so, it's all about building blocks. 
And this is really where this snowball of Java is difficult to stop because there is so much third-party code out there—and by that, I mean, you know, open-source, commercial code, et cetera—that is just so leveraged and so useful to very quickly be able to take advantage of and, you know, allow developers to focus on truly new things, not reinventing the wheel for the hundredth time. And that's what's kind of hard about all these other languages is catching up to Java with all of the things that are immediately available for developers to use freely, right, because most of its open-source. That's a pretty fundamental Catch-22 about when you start talking about the evolution of new languages.Corey: I'm with you so far. The counterpoint though is that so much of what we're talking about in the world of Java is open-source; it is freely available. The OpenJDK, for example, says that right on the tin. You have built a company and you've been in business for 20 years. I have to imagine that this is not one of those stories where, “Oh, all the things we do, we give away for free. But that's okay. We make it up in volume.” Even the venture capitalist mindset tends to run out of patience on those kinds of timescales. What is it you actually do as a business that clearly, obviously delivers value for customers but also results in, you know, being able to meet payroll every week?Scott: Right? Absolutely. And I think what time has shown is that, with one very notable exception and very successful example being Red Hat, there are very, very few pure open-source companies whose business is only selling support services for free software. Most successful businesses that are based on open-source are in one-way shape or form adding value-added elements. And that's our strategy as well.The heart of everything we do is based on free code from OpenJDK, and we have a tremendous amount of business that we are following the Red Hat business model where we are selling support and long-term access and a huge variety of different operating system configurations, older Java versions. Still all free software, though, right, but we're selling support services for that. And that is, in essence, the classic Red Hat business model. And that business for us is incredibly high growth, very fast-moving, a lot of that business is because enterprises are tired of paying the very high price to Oracle for Java support and they're looking for an open-source alternative that is exactly the same thing, but comes in pure open-source form and with a vendor that is as reputable as Oracle. So, a lot of our businesses based on that.However, on top of that, we also have value-added elements. And so, our product that is called Azul Platform Prime is rooted in OpenJDK—it is OpenJDK—but then we've added value-added elements to that. And what those value-added elements create is, in essence, a better Java platform. And better in this context means faster, quicker to warm up, elimination of some of the inconsistencies of the Java runtime in terms of this nasty problem called garbage collection which causes applications to kind of bounce around in terms of performance limitations. And so, creating a better Java is another way that we have monetized our company is value-added elements that are built on top of OpenJDK. And I'd say that part of the business is very typical for the majority of enterprise software companies that are rooted in open-source. 
They're typically adding value-added components on top of the open-source technology, and that's our similar strategy as well.And then the third evolution for us, which again is very tried-and-true, is evolving the business also to add SaaS offerings. So today, the majority of our customers, even though they deploy in the cloud, they're stuck customer-managed and so they're responsible for where do I want to put my Java runtime on building out my stack and cetera, et cetera. And of course, that could be on-prem, but like I mentioned, the majority are in the cloud. We're evolving our product offerings also to have truly SaaS-based solutions so that customers don't even need to manage those types of stacks on their own anymore.Corey: On some level, it feels like we're talking about two different things when we talk about cloud and when we talk about programming languages, but increasingly, I'm starting to see across almost the entire ecosystem that different languages and different cloud providers are in many ways converging. How do you see Java changing as cloud-native becomes the default rather than the new thing?Scott: Great question. And I think the thing to recognize about, really, most popular programming languages today—I can think of very few exceptions—these languages were created, envisioned, implemented if you will, in a day when cloud was not top-of-mind, and in many cases, certainly in the case of Java, cloud didn't even exist when Java was originally conceived, nor was that the case when you know, other languages, such as Python, or JavaScript, or on and on. So, rethinking how these languages should evolve in very much the context of a cloud-native mentality is a really important initiative that we certainly are doing and I think the Java community is doing overall. And how you architect not only the application, but even the Java runtime itself can be fundamentally different if you know that the application is going to be deployed in the cloud.And I'll give you an example. Specifically, in the world of any type of runtime-based language—and JavaScript is an example of that; Python is an example of that; Java is an example of that—in all of those runtime-based environments, what that basically means is that when the application is run, there's a piece of software that's called the runtime that actually is running that application code. And so, you can think about it as a middleware piece of software that sits between the operating system and the application itself. And so, that runtime layer is common across those languages and those platforms that I mentioned. That runtime layer is evolving, and it's evolving in a way that is becoming more and more cloud-native in it's thinking.The process itself of actually taking the application, compiling it into whatever underlying architecture it may be running on—it could be an x86 instance running on Amazon; it could be, you know, for example, an ARM64, which Amazon has compute instances now that are based on an ARM64 processor that they call Graviton, which is really also kind of altering the price-performance of the compute instances on the AWS platform—that runtime layer magically takes an application that doesn't have to be aware of the underlying hardware and transforms that into a way that can be run. 
And that's a very expensive process; it's called just-in-time compiling, and that just-in-time compilation, in today's world—which wasn't really based on cloud thinking—every instance, every compute instance that you deploy, that same JIT compilation process is happening over and over again. And even if you deploy 100 instances for scalability, every one of those 100 instances is doing that same work. And so, it's very inefficient and very redundant. Contrast that to a cloud-native thinking: that compilation process should be a service; that service should be done once.The application—you know, one instance of the application is actually run and there are the other ninety-nine should just reuse that compilation process. And that shared compiler service should be scalable and should be able to scale up when applications are launched and you need more compilation resources, and then scaled right back down when you're through the compilation process and the application is more moving into the—you know, to the runtime phase of the application lifecycle. And so, these types of things are areas that we and others are working on in terms of evolving the Java runtime specifically to be more cloud-native.Corey: This episode is sponsored in part by our friends at Sysdig. Sysdig secures your cloud from source to run. They believe, as do I, that DevOps and security are inextricably linked. If you wanna learn more about how they view this, check out their blog, it's definitely worth the read. To learn more about how they are absolutely getting it right from where I sit, visit Sysdig.com and tell them that I sent you. That's S Y S D I G.com. And my thanks to them for their continued support of this ridiculous nonsense.Corey: This feels like it gets even more critical when we're talking about things like serverless functions across basically all the cloud providers these days, where there's the whole setup, everything in the stack, get it running, get it listening, ready to go, to receive a single request and then shut itself down. It feels like there are a lot of operational efficiencies possible once you start optimizing from a starting point of yeah, this is what that environment looks like, rather than us big metal servers sitting in a rack 15 years ago.Scott: Yeah. I think the evolution of serverless appears to be headed more towards serverless containers as opposed to serverless functions. Serverless functions have a bunch of limitations in terms of when you think about it in the context of a complex, you know, microservices-based deployment framework. It's just not very efficient, to spin up and spin down instances of a function if that actually is being—it is any sort of performance or latency-sensitive type of applications. If you're doing something very rarely, sure, it's fine; it's efficient, it's elegant, et cetera.But any sort of thing that has real girth to it—and girth probably means that's what's driving your application infrastructure costs, that's what's driving your Amazon bill every month—those types of things typically are not going to be great for starting and stopping functional instances. And so, serverless is evolving more towards thinking about the container itself not having to worry about the underlying operating system or the instance on Amazon that it's running on. And that's where, you know, we see more and more of the evolution of serverless is thinking about it at a container-level as opposed to a functional level. 
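The per-instance warm-up cost described above is easy to observe on a stock JVM. The toy program below is not Azul's shared compiler service, just plain HotSpot behavior: the first round pays for interpretation and just-in-time compilation, later rounds run already-compiled code. Timings vary by JVM and hardware, and this is an illustration, not a rigorous benchmark.

public class WarmupDemo {

    static long work(int n) {
        long acc = 0;
        for (int i = 0; i < n; i++) {
            acc += (acc ^ i) * 31L + i;   // cheap mixing so the loop isn't optimized away
        }
        return acc;
    }

    public static void main(String[] args) {
        long sink = 0;
        for (int round = 1; round <= 5; round++) {
            long start = System.nanoTime();
            for (int call = 0; call < 200; call++) {
                sink += work(100_000);
            }
            System.out.printf("round %d: %.1f ms%n", round, (System.nanoTime() - start) / 1e6);
        }
        System.out.println("(keep result live) " + sink);
    }
}

Running the same program with java -Xint (interpreter only) shows the ceiling: roughly the speed every fresh instance runs at until the JIT catches up, which is the redundant per-instance work a shared, cloud-side compilation service is trying to amortize.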
And that appears to be a really healthy steady state, so it gets the benefits of not having to worry about all the underlying stuff, but at the same time, doesn't have the downside of trying to start and stop functional influences at a given point in time.Corey: It seems to me that there are really two ways of thinking about cloud. The first is what I think a lot of companies do their first outing when they're going into something like AWS. “Okay, we're going to get a bunch of virtual machines that they call instances in AWS, we're going to run things just like it's our data center except now data transfer to the internet is terrifyingly expensive.” The more quote-unquote, “Cloud-native” way of thinking about this is what you're alluding to where there's, “Here's some code that I wrote. I want to throw it to my cloud provider and just don't tell me about any of the infrastructure parts. Execute this code when these conditions are met and leave me alone.”Containers these days seem to be one of our best ways of getting there with a minimum of fuss and friction. What are you seeing in the enterprise space as far as adoption of those patterns go? Or are we seeing cloud repatriation showing up as a real thing and I'm just not in the right place to see it?Scott: Well, I think as a cloud journey evolves, there's no question that—and in fact it's even silly to say that cloud is here to stay because I think that became a reality many, many years ago. So really, the question is, what are the challenges now with cloud deployments? Cloud is absolutely a given. And I think you stated earlier, it's rare that, whether it's a new company or a new application, at least in most businesses that don't have specific regulatory requirements, that application is highly, highly likely to be envisioned to be initially and only deployed in the cloud. That's a great thing because you have so many advantages of not having to purchase infrastructure in advance, being able to tap into all of the various services that are available through the cloud providers. No one builds databases anymore; you're just tapping into the service that's provided by Azure or AWS, or what have you.And, you know, just that specific example is a huge amount of savings in terms of just overhead, and license costs, and those types of stuff, and there's countless examples of that. And so, the services that are available in the cloud are unquestioned. So, there's countless advantages of why you want to be in the cloud. The downside, however, the cloud that is, if at the end of the day, AWS, Microsoft with Azure, Google with GCP, they are making 30% margin on that cloud infrastructure. And in the days of hardware, when companies would actually buy their servers from Dell, or HP, et cetera, those businesses are 5% margin.And so, where's that 25% going? Well, the 25% is being paid for by the users of cloud, and as a result of that, when you look at it purely from an operational cost perspective, it is more expensive to run in the cloud than it is back in the legacy days, right? And that's not to say that the industry has made the wrong choice because there's so many advantages of being in cloud, there's no doubt about it. And there should be—you know, and the cloud providers deserve to take some amount of margin to provide the services that they provide; there's no doubt about that. 
The question is, how do you do the best of all worlds?And you know, there is a great blog by a couple of the partners in Andreessen Horowitz, they called this the Cloud Paradox. And the Cloud Paradox really talks about the challenges. It's really a Catch-22; how do you get all the benefits of cloud but do that in a way that is not overly taxing from a cost perspective? And a lot of it comes down to good practices and making sure that you have the right monitoring and culture within an enterprise to make sure that cloud cost is a primary thing that is discussed and metric, but then there's also technologies that can help so that you don't have to even think about what you really don't ever want to do: repatriating, which is about the concept of actually moving off the cloud back to the old way of doing things. So certainly, I don't believe repatriation is a practical solution for ongoing and increasing cloud costs. I believe technology is a solution to that.And there are technologies such as our product, Azul Platform Prime, that in essence, allows you to do more with less, right, get all the benefits of cloud, deploy in your Amazon environment, deploy in your Azure environment, et cetera, but imagine if instead of needing a hundred instances to handle your given workload, you could do that with 50 or 60. Tomorrow, that means that you can start savings and being able to do that simply by changing your JVM from a standard OpenJDK or Oracle JVM to something like Platform Prime, you can immediately start to start seeing the benefits from that. And so, a lot of our business now and our growth is coming from companies that are screaming under the ongoing cloud costs and trying to keep them in line, and using technology like Azul Platform Prime to help mitigate those costs.Corey: I think that there is a somewhat foolish approach that I'm seeing taken by a lot of folks where there are some companies that are existentially anti-cloud, if for no other reason than because if the cloud wins, then they don't really have a business anymore. The problem I see with that is that it seems that their solution across the board is to turn back the clock where if I'm going to build a startup, it's time for me to go buy some servers and a rack somewhere and start negotiating with bandwidth providers. I don't see that that is necessarily viable for almost anyone. We aren't living in 1995 anymore, despite how much some people like to pretend we are. It seems like if there are workloads—for which I agree, cloud is not necessarily an economic fit, first, I feel like the market will fix that in the fullness of time, but secondly, on an individual workload belonging in a certain place is radically different than, “Oh, none of our stuff should live on cloud. Everything belongs in a data center.” And I just think that companies lose all credibility when they start pretending that it's any other way.Scott: Right. I'd love to see the reaction of the venture capitalists' face when an entrepreneur walks in and talks about how their strategy for deploying their SaaS service is going to be buying hardware and renting some space in the local data center.Corey: Well, there is a good cost control method, if you think about it. I mean very few engineers are going to accidentally spin up an $8 million cluster in a data center a second time, just because there's no space left for it.Scott: And you're right; it does happen in the cloud as well. 
It's just, I agree with you completely that as part of the evolution of cloud, in general, is an ever-improving aspect of cost and awareness of cost and building in technologies that help mitigate that cost. So, I think that will continue to evolve. I think, you know, if you really think about the cloud journey, cost, I would say, is still in early phases of really technologies and practices and processes of allowing enterprises to really get their head around cost. I'd still say it's a fairly immature industry that is evolving quickly, just given the importance of it.And so, I think in the coming years, you're going to see a radical improvement in terms of cost awareness and technologies to help with costs, that again allows you to the best of all worlds. Because, you know, if you go back to the Dark Ages and you start thinking about buying servers and infrastructure, then you are really getting back to a mentality of, “I've got to deploy everything. I've got to buy software for my database. I've got to deploy it. What am I going to do about my authentication service? So, I got to buy this vendor's, you know, solution, et cetera.” And so, all that stuff just goes away in the world of cloud, so it's just not practical, in this day and age I think, to think about really building a business that's not cloud-native from the beginning.Corey: I really want to thank you for spending so much time talking to me about how you view the industry, the evolution we've seen in the Java ecosystem, and what you've been up to. If people want to learn more, where's the best place for them to find you?Scott: Well, there's a thing called a website that you may not have heard of, it's really cool.Corey: Can I build it in Java?Scott: W-W-dot—[laugh]. Yeah. Azul website obviously has an awful lot of information about that, Azul is spelled A-Z-U-L, and we sometimes get the question, “How in the world did you name a company—why did you name it Azul?”And it's kind of a funny story because back in the days of Azul when we thought about, hey, we want to be big and successful, and at the time, IBM was the gold standard in terms of success in the enterprise world. And you know, they were Big Blue, so we said, “Hey, we're going to be a little blue. Let's be Azul.” So, that's where we began. So obviously, go check out our site.We're very present, also, in the Java community. We're, you know, many developer conferences and talks. We sponsor and run many of what's called the Java User Groups, which are very popular 10-, 20-person meetups that happen around the globe on a regular basis. And so, you know, come check us out. And I appreciate everyone's time in listening to the podcast today.Corey: No, thank you very much for spending as much time with me as you have. It's appreciated.Scott: Thanks, Corey.Corey: Scott Sellers, CEO and co-founder of Azul. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice along with an entire copy of the terms and conditions from Oracle's version of the JDK.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. 
Visit duckbillgroup.com to get started.Announcer: This has been a HumblePod production. Stay humble.

The Future of Serverless with Allen Helton

Screaming in the Cloud

Play Episode Listen Later Sep 15, 2022 39:06


About AllenAllen is a cloud architect at Tyler Technologies. He helps modernize government software by creating secure, highly scalable, and fault-tolerant serverless applications.Allen publishes content regularly about serverless concepts and design on his blog - Ready, Set Cloud!Links Referenced: Ready, Set, Cloud blog: https://readysetcloud.io Tyler Technologies: https://www.tylertech.com/ Twitter: https://twitter.com/allenheltondev Linked: https://www.linkedin.com/in/allenheltondev/ TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: This episode is sponsored in part by our friends at AWS AppConfig. Engineers love to solve, and occasionally create, problems. But not when it's an on-call fire-drill at 4 in the morning. Software problems should drive innovation and collaboration, NOT stress, and sleeplessness, and threats of violence. That's why so many developers are realizing the value of AWS AppConfig Feature Flags. Feature Flags let developers push code to production, but hide that that feature from customers so that the developers can release their feature when it's ready. This practice allows for safe, fast, and convenient software development. You can seamlessly incorporate AppConfig Feature Flags into your AWS or cloud environment and ship your Features with excitement, not trepidation and fear. To get started, go to snark.cloud/appconfig. That's snark.cloud/appconfig.Corey: I come bearing ill tidings. Developers are responsible for more than ever these days. Not just the code that they write, but also the containers and the cloud infrastructure that their apps run on. Because serverless means it's still somebody's problem. And a big part of that responsibility is app security from code to cloud. And that's where our friend Snyk comes in. Snyk is a frictionless security platform that meets developers where they are - Finding and fixing vulnerabilities right from the CLI, IDEs, Repos, and Pipelines. Snyk integrates seamlessly with AWS offerings like code pipeline, EKS, ECR, and more! As well as things you're actually likely to be using. Deploy on AWS, secure with Snyk. Learn more at Snyk.co/scream That's S-N-Y-K.co/screamCorey: Welcome to Screaming in the Cloud. I'm Corey Quinn. Every once in a while I wind up stumbling into corners of the internet that I previously had not traveled. Somewhat recently, I wound up having that delightful experience again by discovering readysetcloud.io, which has a whole series of, I guess some people might call it thought leadership, I'm going to call it instead how I view it, which is just amazing opinion pieces on the context of serverless, mixed with APIs, mixed with some prognostications about the future.Allen Helton by day is a cloud architect at Tyler Technologies, but that's not how I encountered you. First off, Allen, thank you for joining me.Allen: Thank you, Corey. Happy to be here.Corey: I was originally pointed towards your work by folks in the AWS Community Builder program, of which we both participate from time to time, and it's one of those, “Oh, wow, this is amazing. 
I really wish I'd discovered some of this sooner.” And every time I look through your back catalog, and I click on a new post, I see things that are either I've really agree with this or I can't stand this opinion, I want to fight about it, but more often than not, it's one of those recurring moments that I love: “Damn, I wish I had written something like this.” So first, you're absolutely killing it on the content front.Allen: Thank you, Corey, I appreciate that. The content that I make is really about the stuff that I'm doing at work. It's stuff that I'm passionate about, stuff that I'd spend a decent amount of time on, and really the most important thing about it for me, is it's stuff that I'm learning and forming opinions on and wants to share with others.Corey: I have to say, when I saw that you were—oh, your Tyler Technologies, which sounds for all the world like, oh, it's a relatively small consultancy run by some guy presumably named Tyler, and you know, it's a petite team of maybe 20, 30 people on the outside. Yeah, then I realized, wait a minute, that's not entirely true. For example, for starters, you're publicly traded. And okay, that does change things a little bit. First off, who are you people? Secondly, what do you do? And third, why have I never heard of you folks, until now?Allen: Tyler is the largest company that focuses completely on the public sector. We have divisions and products for pretty much everything that you can imagine that's in the public sector. We have software for schools, software for tax and appraisal, we have software for police officers, for courts, everything you can think of that runs the government can and a lot of times is run on Tyler software. We've been around for decades building our expertise in the domain, and the reason you probably haven't heard about us is because you might not have ever been in trouble with the law before. If you [laugh] if you have been—Corey: No, no, I learned very early on in the course of my life—which will come as a surprise to absolutely no one who spent more than 30 seconds with me—that I have remarkably little filter and if ten kids were the ones doing something wrong, I'm the one that gets caught. So, I spent a lot of time in the principal's office, so this taught me to keep my nose clean. I'm one of those squeaky-clean types, just because I was always terrified of getting punished because I knew I would get caught. I'm not saying this is the right way to go through life necessarily, but it did have the side benefit of, no, I don't really engage with law enforcement going throughout the course of my life.Allen: That's good. That's good. But one exposure that a lot of people get to Tyler is if you look at the bottom of your next traffic ticket, it'll probably say Tyler Technologies on the bottom there.Corey: Oh, so you're really popular in certain circles, I'd imagine?Allen: Super popular. Yes, yes. And of course, you get all the benefits of writing that code that says ‘if defendant equals Allen Helton then return.'Corey: I like that. You get to have the exception cases built in that no one's ever going to wind up looking into.Allen: That's right. Yes.Corey: The idea of what you're doing makes an awful lot of sense. There's a tremendous need for a wide variety of technical assistance in the public sector. 
What surprises me, although I guess it probably shouldn't, is how much of your content is aimed at serverless technologies and API design, which to my way of thinking, isn't really something that public sector has done a lot with. Clearly I'm wrong.Allen: Historically, you're not wrong. There's an old saying that government tends to run about ten years behind on technology. Not just technology, but all over the board and runs about ten years behind. And until recently, that's really been true. There was a case last year, a situation last year where one of the state governments—I don't remember which one it was—but they were having a crisis because they couldn't find any COBOL developers to come in and maintain their software that runs the state.And it's COBOL; you're not going to find a whole lot of people that have that skill. A lot of those people are retiring out. And what's happening is that we're getting new people sitting in positions of power and government that want innovation. They know about the cloud and they want to be able to integrate with systems quickly and easily, have little to no onboarding time. You know, there are people in power that have grown up with technology and understand that, well, with everything else, I can be up and running in five or ten minutes. I cannot do this with the software I'm consuming now.Corey: My opinion on it is admittedly conflicted because on the one hand, yeah, I don't think that governments should be running on COBOL software that runs on mainframes that haven't been supported in 25 years. Conversely, I also don't necessarily want them being run like a seed series startup, where, “Well, I wrote this code last night, and it's awesome, so off I go to production with it.” Because I can decide not to do business anymore with Twitter for Pets, and I could go on to something else, like PetFlicks, or whatever it is I choose to use. I can't easily opt out of my government. The decisions that they make stick and that is going to have a meaningful impact on my life and everyone else's life who is subject to their jurisdiction. So, I guess I don't really know where I believe the proper, I guess, pace of technological adoption should be for governments. Curious to get your thoughts on this.Allen: Well, you certainly don't want anything that's bleeding edge. That's one of the things that we kind of draw fine lines around. Because when we're dealing with government software, we're dealing with, usually, critically sensitive information. It's not medical records, but it's your criminal record, and it's things like your social security number, it's things that you can't have leaking out under any circumstances. So, the things that we're building on are things that have proven out to be secure and have best practices around security, uptime, reliability, and in a lot of cases as well, and maintainability. You know, if there are issues, then let's try to get those turned around as quickly as we can because we don't want to have any sort of downtime from the software side versus the software vendor side.Corey: I want to pivot a little bit to some of the content you've put out because an awful lot of it seems to be, I think I'll call it variations on a theme. 
For example, I just read some recent titles, and to illustrate my point, “Going API First: Your First 30 Days,” “Solutions Architect Tips how to Design Applications for Growth,” “3 Things to Know Before Building A Multi-Tenant Serverless App.” And the common thread that I see running through all of these things are these are things that you tend to have extraordinarily strong and vocal opinions about only after dismissing all of them the first time and slapping something together, and then sort of being forced to live with the consequences of the choices that you've made, in some cases you didn't realize you were making at the time. Are you one of those folks that has the wisdom to see what's coming down the road, or did you do what the rest of us do and basically learn all this stuff by getting it hilariously wrong and having to careen into rebound situations as a result?Allen: [laugh]. I love that question. I would like to say now, I feel like I have the vision to see something like that coming. Historically, no, not at all. Let me talk a little bit about how I got to where I am because that will shed a lot of context on that question.A few years ago, I was put into a position at Tyler that said, “Hey, go figure out this cloud thing.” Let's figure out what we need to do to move into the cloud safely, securely, quickly, all that rigmarole. And so, I did. I got to hand-select team of engineers from people that I worked with at Tyler over the past few years, and we were basically given free rein to learn. We were an R&D team, a hundred percent R&D, for about a year's worth of time, where we were learning about cloud concepts and theory and building little proof of concepts.CI/CD, serverless, APIs, multi-tenancy, a whole bunch of different stuff. NoSQL was another one of the things that we had to learn. And after that year of R&D, we were told, “Okay, now go do something with that. Go build this application.” And we did, building on our theory our cursory theory knowledge. And we get pretty close to go live, and then the business says, “What do you do in this scenario? What do you do in that scenario? What do you do here?”Corey: “I update my resume and go work somewhere else. Where's the hard part here?”Allen: [laugh].Corey: Turns out, that's not a convincing answer.Allen: Right. So, we moved quickly. And then I wouldn't say we backpedaled, but we hardened for a long time before the—prior to the go-live, with the lessons that we've learned with the eyes of Tyler, the mature enterprise company, saying, “These are the things that you have to make sure that you take into consideration in an actual production application.” One of the things that I always pushed—I was a manager for a few years of all these cloud teams—I always push do it; do it right; do it better. Right?It's kind of like crawl, walk, run. And if you follow my writing from the beginning, just looking at the titles and reading them, kind of like what you were doing, Corey, you'll see that very much. You'll see how I talk about CI/CD, you'll see me how I talk about authorization, you'll see me how I talk about multi-tenancy. And I kind of go in waves where maybe a year passes and you see my content revisit some of the topics that I've done in the past. And they're like, “No, no, no, don't do what I said before. It's not right.”Corey: The problem when I'm writing all of these things that I do, for example, my entire newsletter publication pipeline is built on a giant morass of Lambda functions and API Gateways. 
It's microservices-driven—kind of—and each microservice is built, almost always, with a different framework. Lately, all the new stuff is CDK. I started off with the serverless framework. There are a few other things here and there.And it's like going architecting, back in time as I have to make updates to these things from time to time. And it's the problem with having done all that myself is that I already know the answer to, “What fool designed this?” It's, well, you're basically watching me learn what I was, doing bit by bit. I'm starting to believe that the right answer on some level, is to build an inherent shelf-life into some of these things. Great, in five years, you're going to come back and re-architect it now that you know how this stuff actually works rather than patching together 15 blog posts by different authors, not all of whom are talking about the same thing and hoping for the best.Allen: Yep. That's one of the things that I really like about serverless, I view that as a giant pro of doing Serverless is that when we revisit with the lessons learned, we don't have to refactor everything at once like if it was just a big, you know, MVC controller out there in the sky. We can refactor one Lambda function at a time if now we're using a new version of the AWS SDK, or we've learned about a new best practice that needs to go in place. It's a, “While you're in there, tidy up, please,” kind of deal.Corey: I know that the DynamoDB fanatics will absolutely murder me over this one, but one of the reasons that I have multiple Dynamo tables that contain, effectively, variations on the exact same data, is because I want to have the dependency between the two different microservices be the API, not, “Oh, and under the hood, it's expecting this exact same data structure all the time.” But it just felt like that was the wrong direction to go in. That is the justification I use for myself why I run multiple DynamoDB tables that [laugh] have the same content. Where do you fall on the idea of data store separation?Allen: I'm a big single table design person myself, I really like the idea of being able to store everything in the same table and being able to create queries that can return me multiple different types of entity with one lookup. Now, that being said, one of the issues that we ran into, or one of the ambiguous areas when we were getting started with serverless was, what does single table design mean when you're talking about microservices? We were wondering does single table mean one DynamoDB table for an entire application that's composed of 15 microservices? Or is it one table per microservice? And that was ultimately what we ended up going with is a table per microservice. Even if multiple microservices are pushed into the same AWS account, we're still building that logical construct of a microservice and one table that houses similar entities in the same domain.Corey: So, something I wish that every service team at AWS would do as a part of their design is draw the architecture of an application that you're planning to build. Great, now assume that every single resource on that architecture diagram lives in its own distinct AWS account because somewhere in some customer, there's going to be an account boundary at every interconnection point along the way. 
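Allen's table-per-microservice take above relies on a key layout where one query returns every entity type in a domain. Here is a minimal sketch with the AWS SDK for JavaScript v3; the table name, key attributes, and entity prefixes are invented for illustration, not anyone's production schema:

```typescript
import { DynamoDBClient } from "@aws-sdk/client-dynamodb";
import { DynamoDBDocumentClient, QueryCommand } from "@aws-sdk/lib-dynamodb";

// One table for an "orders" microservice; an order and its line items share a
// partition key, so a single query returns both entity types.
const ddb = DynamoDBDocumentClient.from(new DynamoDBClient({}));

export async function getOrderWithItems(orderId: string) {
  // Hypothetical key schema: pk = "ORDER#<id>", sk = "ORDER" or "ITEM#<n>"
  const { Items = [] } = await ddb.send(
    new QueryCommand({
      TableName: "orders-service", // assumed table name
      KeyConditionExpression: "pk = :pk",
      ExpressionAttributeValues: { ":pk": `ORDER#${orderId}` },
    })
  );

  return {
    order: Items.find((item) => item.sk === "ORDER"),
    items: Items.filter(
      (item) => typeof item.sk === "string" && item.sk.startsWith("ITEM#")
    ),
  };
}
```

The access pattern is the same whether that table belongs to one microservice or to the whole application, which is exactly the trade-off being debated above.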
And so, many services don't do that where it's, “Oh, that thing and the other thing has to be in the same account.” So, people have to write their own integration shims, and it makes doing the right thing of putting different services into distinct bounded AWS accounts for security or compliance reasons way harder than I feel like it needs to be.Allen: [laugh]. Totally agree with you on that one. That's one of the things that I feel like I'm still learning about is the account-level isolation. I'm still kind of early on, personally, with my opinions in how we're structuring things right now, but I'm very much of a like opinion that deploying multiple things into the same account is going to make it too easy to do something that you shouldn't. And I just try not to inherently trust people, in the sense that, “Oh, this is easy. I'm just going to cross that boundary real quick.”Corey: For me, it's also come down to security risk exposure. Like my lasttweetinaws.com Twitter shitposting thread client lives in a distinct AWS account that is separate from the AWS account that has all of our client billing data that lives within it. The idea being that if you find a way to compromise my public-facing Twitter client, great, the blast radius should be constrained to, “Yay, now you can, I don't know, spin up some cryptocurrency mining in my AWS account and I get to look like a fool when I beg AWS for forgiveness.”But that should be the end of it. It shouldn't be a security incident because I should not have the credit card numbers living right next to the funny internet web thing. That sort of flies in the face of the original guidance that AWS gave at launch. And right around 2008-era, best practices were one customer, one AWS account. And then by 2012, they had changed their perspective, but once you've made a decision to build multiple services in a single account, unwinding and unpacking that becomes an incredibly burdensome thing. It's about the equivalent of doing a cloud migration, in some ways.Allen: We went through that. We started off building one application with the intent that it was going to be a siloed application, a one-off, essentially. And about a year into it, it's one of those moments of, “Oh, no. What we're building is not actually a one-off. It's a piece to a much larger puzzle.”And we had a whole bunch of—unfortunately—tightly coupled things that were in there that we're assuming that resources were going to be in the same AWS account. So, we ended up—how long—I think we took probably two months, which in the grand scheme of things isn't that long, but two months, kind of unwinding the pieces and decoupling what was possible at the time into multiple AWS accounts, kind of, segmented by domain, essentially. But that's hard. AWS puts it, you know, it's those one-way door decisions. I think this one was a two-way door, but it locked and you could kind of jimmy the lock on the way back out.Corey: And you could buzz someone from the lobby to let you back in. Yeah, the biggest problem is not necessarily the one-way door decisions. It's the one-way door decisions that you don't realize you're passing through at the time that you do them. Which, of course, brings us to a topic near and dear to your heart—and I only recently started have opinions on this myself—and that is the proper design of APIs, which I'm sure will incense absolutely no one who's listening to this. Like, my opinions on APIs start with well, probably REST is the right answer in this day and age. 
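The hand-rolled "integration shims" Corey mentions for crossing account boundaries usually reduce to assuming a narrowly scoped role in the other account and using its temporary credentials. A hedged sketch with the SDK v3 STS client, where the role ARN, session name, and bucket are placeholders:

```typescript
import { STSClient, AssumeRoleCommand } from "@aws-sdk/client-sts";
import { S3Client, ListObjectsV2Command } from "@aws-sdk/client-s3";

// Assume a tightly scoped role in the other account, then use its temporary
// credentials instead of anything long-lived from this account.
export async function listAssetsInOtherAccount(bucket: string) {
  const sts = new STSClient({});
  const { Credentials } = await sts.send(
    new AssumeRoleCommand({
      RoleArn: "arn:aws:iam::111111111111:role/asset-reader", // placeholder
      RoleSessionName: "cross-account-asset-read",
      DurationSeconds: 900,
    })
  );
  if (!Credentials) throw new Error("AssumeRole returned no credentials");

  const s3 = new S3Client({
    credentials: {
      accessKeyId: Credentials.AccessKeyId!,
      secretAccessKey: Credentials.SecretAccessKey!,
      sessionToken: Credentials.SessionToken,
    },
  });
  return s3.send(new ListObjectsV2Command({ Bucket: bucket }));
}
```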
I had people, like, “Well, I don't know, GraphQL is pretty awesome.” Like, “Oh, I'm thinking SOAP,” and people look at me like I'm a monster from the Black Lagoon of centuries past in XML-land. So, my particular brand of strangeness side, what do you see that people are doing in the world of API design that is the, I guess, most common or easy to make mistakes that you really wish they would stop doing?Allen: If I could boil it down to one word, fundamentalism. Let me unpack that for you.Corey: Oh, please, absolutely want to get a definition on that one.Allen: [laugh]. I approach API design from a developer experience point of view: how easy is it for both internal and external integrators to consume and satisfy the business processes that they want to accomplish? And a lot of times, REST guidelines, you know, it's all about entity basis, you know, drill into the appropriate entities and name your endpoints with nouns, not verbs. I'm actually very much onto that one.But something that you could easily do, let's say you have a business process that given a fundamentally correct RESTful API design takes ten API calls to satisfy. You could, in theory, boil that down to maybe three well-designed endpoints that aren't, quote-unquote, “RESTful,” that make that developer experience significantly easier. And if you were a fundamentalist, that option is not even on the table, but thinking about it pragmatically from a developer experience point of view, that might be the better call. So, that's one of the things that, I know feels like a hot take. Every time I say it, I get a little bit of flack for it, but don't be a fundamentalist when it comes to your API designs. Do something that makes it easier while staying in the guidelines to do what you want.Corey: For me the problem that I've kept smacking into with API design, and it honestly—let me be very clear on this—my first real exposure to API design rather than API consumer—which of course, I complain about constantly, especially in the context of the AWS inconsistent APIs between services—was when I'm building something out, and I'm reading the documentation for API Gateway, and oh, this is how you wind up having this stage linked to this thing, and here's the endpoint. And okay, great, so I would just populate—build out a structure or a schema that has the positional parameters I want to use as variables in my function. And that's awesome. And then I realized, “Oh, I might want to call this a different way. Aw, crap.” And sometimes it's easy; you just add a different endpoint. Other times, I have to significantly rethink things. And I can't shake the feeling that this is an entire discipline that exists that I just haven't had a whole lot of exposure to previously.Allen: Yeah, I believe that. One of the things that you could tie a metaphor to for what I'm saying and kind of what you're saying, is AWS SAM, the Serverless Application Model, all it does is basically macros CloudFormation resources. It's just a transform from a template into CloudFormation. CDK does same thing. 
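Since Allen notes that CDK does the same transform, here is roughly what the "Lambda function behind an API endpoint" wiring looks like in CDK TypeScript; the construct names and asset path are assumptions, not a specific project's layout:

```typescript
import { Stack, StackProps } from "aws-cdk-lib";
import { Construct } from "constructs";
import * as lambda from "aws-cdk-lib/aws-lambda";
import * as apigateway from "aws-cdk-lib/aws-apigateway";

export class ThreadApiStack extends Stack {
  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);

    // The handler code is assumed to live in ./lambda and export `handler`.
    const createThread = new lambda.Function(this, "CreateThreadFn", {
      runtime: lambda.Runtime.NODEJS_18_X,
      handler: "index.handler",
      code: lambda.Code.fromAsset("lambda"),
    });

    // A REST API with a single POST /threads route wired to the function.
    const api = new apigateway.RestApi(this, "ThreadApi");
    api.root
      .addResource("threads")
      .addMethod("POST", new apigateway.LambdaIntegration(createThread));
  }
}
```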
But what the developers of SAM have done is they've recognized these business processes that people do regularly, and they've made these incredibly easy ways to satisfy those business processes and tie them all together, right?If I want to have a Lambda function that is backed behind a endpoint, an API endpoint, I just have to add four or five lines of YAML or JSON that says, “This is the event trigger, here's the route, here's the API.” And then it goes and does four, five, six different things. Now, there's some engineers that don't like that because sometimes that feels like magic. Sometimes a little bit magic is okay.Corey: This episode is sponsored in part by our friends at Sysdig. Sysdig secures your cloud from source to run. They believe, as do I, that DevOps and security are inextricably linked. If you wanna learn more about how they view this, check out their blog, it's definitely worth the read. To learn more about how they are absolutely getting it right from where I sit, visit Sysdig.com and tell them that I sent you. That's S Y S D I G.com. And my thanks to them for their continued support of this ridiculous nonsense.Corey: I feel like one of the benefits I've had with the vast majority of APIs that I've built is that because this is all relatively small-scale stuff for what amounts to basically shitposting for the sake of entertainment, I'm really the only consumer of an awful lot of these things. So, I get frustrated when I have to backtrack and make changes and teach other microservices to talk to this thing that has now changed. And it's frustrating, but I have the capacity to do that. It's just work for a period of time. I feel like that equation completely shifts when you have published this and it is now out in the world, and it's not just users, but in many cases paying customers where you can't really make those changes without significant notice, and every time you do you're creating work for those customers, so you have to be a lot more judicious about it.Allen: Oh, yeah. There is a whole lot of governance and practice that goes into production-level APIs that people integrate with. You know, they say once you push something out the door into production that you're going to support it forever. I don't disagree with that. That seems like something that a lot of people don't understand.And that's one of the reasons why I push API-first development so hard in all the content that I write is because you need to be intentional about what you're letting out the door. You need to go in and work, not just with the developers, but your product people and your analysts to say, what does this absolutely need to do, and what does it need to do in the future? And you take those things, and you work with analysts who want specifics, you work with the engineers to actually build it out. And you're very intentional about what goes out the door that first time because once it goes out with a mistake, you're either going to version it immediately or you're going to make some people very unhappy when you make a breaking change to something that they immediately started consuming.Corey: It absolutely feels like that's one of those things that AWS gets astonishingly right. I mean, I had the privilege of interviewing, at the time, Jeff Barr and then Ariel Kelman, who was their head of marketing, to basically debunk a bunch of old myths. And one thing that they started talking about extensively was the idea that an API is fundamentally a promise to your customers. 
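One common way to honor that promise without freezing the product is additive versioning: keep the original contract mounted and add a v2 beside it. A rough Express sketch, with hypothetical route shapes:

```typescript
import express from "express";

const app = express();

const v1 = express.Router();
// Original contract: never change the shape existing integrators depend on.
v1.get("/threads/:id", (req, res) =>
  res.json({ id: req.params.id, text: "..." })
);

const v2 = express.Router();
// New contract: additive or breaking changes live here; v1 keeps working unchanged.
v2.get("/threads/:id", (req, res) =>
  res.json({ id: req.params.id, body: { text: "..." }, createdAt: null })
);

app.use("/v1", v1);
app.use("/v2", v2);

app.listen(3000);
```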
And when you make a promise, you'd better damn well intend on keeping it. It's why API deprecations from AWS are effectively unique whenever something happens.It's the, this is a singular moment in time when they turn off a service or degrade old functionality in favor of new. They can add to it, they can launch a V2 of something and then start to wean people off by calling the old one classic or whatnot, but if I built something on AWS in 2008 and I wound up sleeping until today, and go and try and do the exact same thing and deploy it now, it will almost certainly work exactly as it did back then. Sure, reliability is going to be a lot better and there's a crap ton of features and whatnot that I'm not taking advantage of, but that fundamental ability to do that is awesome. Conversely, it feels like Google Cloud likes to change around a lot of their API stories almost constantly. And it's unplanned work that frustrates the heck out of me when I'm trying to build something stable and lasting on top of it.Allen: I think it goes to show the maturity of these companies as API companies versus just vendors. It's one of the things that I think AWS does [laugh]—Corey: You see the similar dichotomy with Microsoft and Apple. Microsoft's new versions of Windows generally still have functionalities in them to support stuff that was written in the '90s for a few use cases, whereas Apple's like, “Oh, your computer's more than 18-months old? Have you tried throwing it away and buying a new one? And oh, it's a new version of Mac OS, so yeah, maybe the last one would get security updates for a year and then get with the times.” And I can't shake the feeling that the correct answer is in some way, both of those, depending upon who your customer is and what it is you're trying to achieve.If Microsoft adopted the Apple approach, their customers would mutiny, and rightfully so; the expectation has been set for decades that isn't what happens. Conversely, if Apple decided now we're going to support this version of Mac OS in perpetuity, I don't think a lot of their application developers wouldn't quite know what to make of that.Allen: Yeah. I think it also comes from a standpoint of you better make it worth their while if you're going to move their cheese. I'm not a Mac user myself, but from what I hear for Mac users—and this could be rose-colored glasses—but is that their stuff works phenomenally well. You know, when a new thing comes out—Corey: Until it doesn't, absolutely. It's—whenever I say things like that on this show, I get letters. And it's, “Oh, yeah, really? They'll come up with something that is a colossal pain in the ass on Mac.” Like, yeah, “Try building a system-wide mute key.”It's yeah, that's just a hotkey away on windows and here in Mac land. It's, “But it makes such beautiful sounds. Why would you want them to be quiet?” And it's, yeah, it becomes this back-and-forth dichotomy there. And you can even explain it to iPhones as well and the Android ecosystem where it's, oh, you're going to support the last couple of versions of iOS.Well, as a developer, I don't want to do that. And Apple's position is, “Okay, great.” Almost half of the mobile users on the planet will be upgrading because they're in the ecosystem. Do you want us to be able to sell things those people are not? And they're at a point of scale where they get to dictate those terms.On some level, there are benefits to it and others, it is intensely frustrating. 
I don't know what the right answer is on the level of permanence on that level of platform. I only have slightly better ideas around the position of APIs. I will say that when AWS deprecates something, they reach out individually to affected customers, on some level, and invariably, when they say, “This is going to be deprecated as of August 31,” or whenever it is, yeah, it is going to slip at least twice in almost every case, just because they're not going to turn off a service that is revenue-bearing or critical-load-bearing for customers without massive amounts of notice and outreach, and in some cases according to rumor, having engineers reach out to help restructure things so it's not as big of a burden on customers. That's a level of customer focus that I don't think most other companies are capable of matching.Allen: I think that comes with the size and the history of Amazon. And one of the things that they're doing right now, we've used Amazon Cloud Cams for years, in my house. We use them as baby monitors. And they—Corey: Yea, I saw this I did something very similar with Nest. They didn't have the Cloud Cam at the right time that I was looking at it. And they just announced that they're going to be deprecating. They're withdrawing them for sale. They're not going to support them anymore. Which, oh at Amazon—we're not offering this anymore. But you tell the story; what are they offering existing customers?Allen: Yeah, so slightly upset about it because I like my Cloud Cams and I don't want to have to take them off the wall or wherever they are to replace them with something else. But what they're doing is, you know, they gave me—or they gave all the customers about eight months head start. I think they're going to be taking them offline around Thanksgiving this year, just mid-November. And what they said is as compensation for you, we're going to send you a Blink Cam—a Blink Mini—for every Cloud Cam that you have in use, and then we are going to gift you a year subscription to the Pro for Blink.Corey: That's very reasonable for things that were bought years ago. Meanwhile, I feel like not to be unkind or uncharitable here, but I use Nest Cams. And that's a Google product. I half expected if they ever get deprecated, I'll find out because Google just turns it off in the middle of the night—Allen: [laugh].Corey: —and I wake up and have to read a blog post somewhere that they put an update on Nest Cams, the same way they killed Google Reader once upon a time. That's slightly unfair, but the fact that joke even lands does say a lot about Google's reputation in this space.Allen: For sure.Corey: One last topic I want to talk with you about before we call it a show is that at the time of this recording, you recently had a blog post titled, “What does the Future Hold for Serverless?” Summarize that for me. Where do you see this serverless movement—if you'll forgive the term—going?Allen: So, I'm going to start at the end. I'm going to work back a little bit on what needs to happen for us to get there. I have a feeling that in the future—I'm going to be vague about how far in the future this is—that we'll finally have a satisfied promise of all you're going to write in the future is business logic. And what does that mean? 
I think what can end up happening, given the right focus, the right companies, the right feedback, at the right time, is we can write code as developers and have that get pushed up into the cloud.And a phrase that I know Jeremy Daly likes to say ‘infrastructure from code,' where it provisions resources in the cloud for you based on your use case. I've developed an application and it gets pushed up in the cloud at the time of deploying it, optimized resource allocation. Over time, what will happen—with my future vision—is when you get production traffic going through, maybe it's spiky, maybe it's consistently at a scale that outperforms the resources that it originally provisioned. We can have monitoring tools that analyze that and pick that out, find the anomalies, find the standard patterns, and adjust that infrastructure that it deployed for you automatically, where it's based on your production traffic for what it created, optimizes it for you. Which is something that you can't do on an initial deployment right now. You can put what looks best on paper, but once you actually get traffic through your application, you realize that, you know, what was on paper might not be correct.Corey: You ever noticed that whiteboard diagrams never show the reality, and they're always aspirational, and they miss certain parts? And I used to think that this was the symptom I had from working at small, scrappy companies because you know what, those big tech companies, everything they build is amazing and awesome. I know it because I've seen their conference talks. But I've been a consultant long enough now, and for a number of those companies, to realize that nope, everyone's infrastructure is basically a trash fire at any given point in time. And it works almost in spite of itself, rather than because of it.There is no golden path where everything is shiny, new and beautiful. And that, honestly, I got to say, it was really [laugh] depressing when I first discovered it. Like, oh, God, even these really smart people who are so intelligent they have to have extra brain packs bolted to their chests don't have the magic answer to all of this. The rest of us are just screwed, then. But we find ways to make it work.Allen: Yep. There's a quote, I wish I remembered who said it, but it was a military quote where, “No battle plan survives impact with the enemy—first contact with the enemy.” It's kind of that way with infrastructure diagrams. We can draw it out however we want and then you turn it on in production. It's like, “Oh, no. That's not right.”Corey: I want to mix the metaphors there and say, yeah, no architecture survives your first fight with a customer. Like, “Great, I don't think that's quite what they're trying to say.” It's like, “What, you don't attack your customers? Pfft, what's your customer service line look like?” Yeah, it's… I think you're onto something.I think that inherently everything beyond the V1 design of almost anything is an emergent property where this is what we learned about it by running it and putting traffic through it and finding these problems, and here's how it wound up evolving to account for that.Allen: I agree. I don't have anything to add on that.Corey: [laugh]. Fair enough. I really want to thank you for taking so much time out of your day to talk about how you view these things. If people want to learn more, where is the best place to find you?Allen: Twitter is probably the best place to find me: @AllenHeltonDev. 
I have that username on all the major social platforms, so if you want to find me on LinkedIn, same thing: AllenHeltonDev. My blog is always open as well, if you have any feedback you'd like to give there: readysetcloud.io.

Corey: And we will, of course, put links to that in the show notes. Thanks again for spending so much time talking to me. I really appreciate it.

Allen: Yeah, this was fun. This was a lot of fun. I love talking shop.

Corey: It shows. And it's nice to talk about things I don't spend enough time thinking about. Allen Helton, cloud architect at Tyler Technologies. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice along with an angry comment that I will reject because it was not written in valid XML.

Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.

Announcer: This has been a HumblePod production. Stay humble.

Screaming in the Cloud
Authentication Matters with Dan Moore of FusionAuth

Screaming in the Cloud

Play Episode Listen Later Sep 8, 2022 37:19


About DanDan Moore is head of developer relations for FusionAuth, where he helps share information about authentication, authorization and security with developers building all kinds of applications.A former CTO, AWS certification instructor, engineering manager and a longtime developer, he's been writing software for (checks watch) over 20 years.Links Referenced: FusionAuth: https://fusionauth.io Twitter: https://twitter.com/mooreds TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: This episode is sponsored in part by our friends at AWS AppConfig. Engineers love to solve, and occasionally create, problems. But not when it's an on-call fire-drill at 4 in the morning. Software problems should drive innovation and collaboration, NOT stress, and sleeplessness, and threats of violence. That's why so many developers are realizing the value of AWS AppConfig Feature Flags. Feature Flags let developers push code to production, but hide that that feature from customers so that the developers can release their feature when it's ready. This practice allows for safe, fast, and convenient software development. You can seamlessly incorporate AppConfig Feature Flags into your AWS or cloud environment and ship your Features with excitement, not trepidation and fear. To get started, go to snark.cloud/appconfig. That's snark.cloud/appconfig.Corey: This episode is sponsored in part by our friends at Sysdig. Sysdig secures your cloud from source to run. They believe, as do I, that DevOps and security are inextricably linked. If you wanna learn more about how they view this, check out their blog, it's definitely worth the read. To learn more about how they are absolutely getting it right from where I sit, visit Sysdig.com and tell them that I sent you. That's S Y S D I G.com. And my thanks to them for their continued support of this ridiculous nonsense.Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. I am joined today on this promoted episode, which is brought to us by our friends at FusionAuth by Dan Moore, who is their head of DevRel at same. Dan, thank you for joining me.Dan: Corey, thank you so much for having me.Corey: So, you and I have been talking for a while. I believe it predates not just you working over at FusionAuth but me even writing the newsletter and the rest. We met on a leadership Slack many years ago. We've kept in touch ever since, and I think, I haven't run the actual numbers on this, but I believe that you are at the top of the leaderboard right now for the number of responses I have gotten to various newsletter issues that I've sent out over the years.And it's always something great. It's “Here's a link I found that I thought that you might appreciate.” And we finally sat down and met each other in person, had a cup of coffee somewhat recently, and the first thing you asked was, “Is it okay that I keep doing this?” And at the bottom of the newsletter is “Hey, if you've seen something interesting, hit reply and let me know.” And you'd be surprised how few people actually take me up on it. So, let me start by thanking you for being as enthusiastic a contributor of the content as you have been.Dan: Well, I appreciate that. 
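The AppConfig sponsor read above describes shipping code dark and releasing it later by flipping a flag. A hedged sketch of reading such a flag with the AWS SDK v3 AppConfigData client; the application, environment, and profile identifiers are placeholders, and the payload shape assumes AppConfig's JSON feature flag profile:

```typescript
import {
  AppConfigDataClient,
  StartConfigurationSessionCommand,
  GetLatestConfigurationCommand,
} from "@aws-sdk/client-appconfigdata";

const client = new AppConfigDataClient({});

// In a long-lived process you would start the session once and keep polling
// with the returned token; this sketch does both steps in one call for clarity.
export async function isFlagEnabled(flagName: string): Promise<boolean> {
  const session = await client.send(
    new StartConfigurationSessionCommand({
      ApplicationIdentifier: "my-app", // placeholder
      EnvironmentIdentifier: "production", // placeholder
      ConfigurationProfileIdentifier: "feature-flags", // placeholder
    })
  );

  const config = await client.send(
    new GetLatestConfigurationCommand({
      ConfigurationToken: session.InitialConfigurationToken,
    })
  );

  // Assumes the JSON feature-flag shape: { "<flag>": { "enabled": boolean } }
  const text = new TextDecoder().decode(config.Configuration ?? new Uint8Array());
  const flags = text ? JSON.parse(text) : {};
  return Boolean(flags[flagName]?.enabled);
}
```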
And I remember the first time I ran across your newsletter and was super impressed by kind of the breadth of it. And I guess my way of thanking you is to just send you interesting tidbits that I run across. And it's always fun when I see one of the links that I sent go into the newsletter because what you provide is just such a service to the community. So, thank you.Corey: The fun part, too, is that about half the time that you send a link in, I already have it in my queue, or I've seen it before, but not always. I talked to Jeff Barr about this a while back, and apparently, a big Amazonian theme that he lives by is two is better than zero. He'd rather two people tell him about a thing than no one tells him about the thing. And I've tried to embody that. It's the right answer, but it's also super tricky to figure out what people have heard or haven't heard. It leads to interesting places. But enough about my nonsense. Let's talk about your nonsense instead. So, FusionAuth; what do you folks do over there?Dan: So, FusionAuth is an auth provider, and we offer a Community Edition, which is downloadable for free; we also offer premium editions, but the space we play in is really CIAM, which is Customer Identity Access Management. Very similar to Auth0 or Cognito that some of your listeners might have heard of.Corey: If people have heard about Cognito, it's usually bracketed by profanity, in one direction or another, but I'm sure we'll get there in a minute. I will say that I never considered authentication to be a differentiator between services that I use. And then one day I was looking for a tool—I'm not going to name what it was just because I don't really want to deal with the angry letters and whatnot—but I signed up for this thing to test it out, and “Oh, great. So, what's my password?” “Oh, we don't use passwords. We just every time you want to log in, we're going to email you a link and then you go ahead and click the link.”And I hadn't seen something like that before. And my immediate response to that was, “Okay, this feels like an area they've decided to innovate in.” Their core business is basically information retention and returning it to you—basically any CRUD app. Yay. I don't think this is where I want them to be innovating.I want them to use the tried and true solutions, not build their own or be creative on this stuff, so it was a contributor to me wanting to go in a different direction. When you start doing things like that, there's no multi-factor authentication available and you start to wonder, how have they implemented this? What corners have they cut? Who's reviewed this? It just gave me a weird feeling.And that was sort of the day I realized that authentication for me is kind of like crypto, by which I mean cryptography, not cryptocurrency, I want to be very clear on, here. You should not roll your own cryptography, you should not roll your own encryption, you should buy off-the-shelf unless you're one of maybe five companies on the planet. Spoiler, if you're listening to this, you are almost certainly not one of them.Dan: [laugh]. Yeah. So, first of all, I've been at FusionAuth for a couple of years. Before I came to FusionAuth, I had rolled my own authentication a couple of times. And what I've realized working there is that it really is—there a couple of things worth unpacking here.One is you can now buy or leverage open-source libraries or other providers a lot more than you could 15 or 20 years ago. 
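One concrete form of leaning on libraries instead of rolling your own is verifying the tokens your auth provider issues with an off-the-shelf JOSE implementation. A sketch using the jose package, where the JWKS URL, issuer, and audience are placeholders:

```typescript
import { createRemoteJWKSet, jwtVerify } from "jose";

// The JWKS URL, issuer, and audience below stand in for whatever your
// auth provider actually publishes.
const jwks = createRemoteJWKSet(
  new URL("https://auth.example.com/.well-known/jwks.json")
);

export async function verifyAccessToken(token: string) {
  const { payload } = await jwtVerify(token, jwks, {
    issuer: "https://auth.example.com",
    audience: "my-application-id",
  });
  return payload; // e.g. payload.sub is the authenticated user's id
}
```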
So, it's become this thing that can be snapped into your architecture. The second is, auth is the front door to application. And while it isn't really that differentiated—I don't think most applications, as you kind of alluded to, should innovate there—it is kind of critical that it runs all the time that it's safe and secure, that it's accessible, that it looks like your application.So, at the same time, it's undifferentiated, right? Like, at the end of the day, people just want to get through authentication and authorization schemes into your application. That is really the critical thing. So, it's undifferentiated, it's critical, it needs to be highly available. Those are all things that make it a good candidate for outsourcing.Corey: There are a few things to unpack there. First is that everything becomes commoditized in the fullness of time. And this is a good thing. Back in the original dotcom bubble, there were entire teams of engineers at all kinds of different e-commerce companies that were basically destroying themselves trying to build an online shopping cart. And today you wind up implementing Shopify or something like it—which is usually Shopify—and that solves the problem for you. This is no longer a point of differentiation.If I want to start selling physical goods on the internet, it feels like it'll take me half an hour or so to wind up with a bare-bones shopping cart thing ready to go, and then I just have to add inventory. Authentication feels like it was kind of the same thing. I mean, back in that song from early on in internet history “Code Monkey” talks about building a login page as part of it, and yeah, that was a colossal pain. These days, there are a bunch of different ways to do that with folks who spend their entire careers working on this exact problem so you can go and work on something that is a lot more core and central to the value that your business ostensibly provides. And that seems like the right path to go down.But this does lead to the obvious counter-question of how is it that you differentiate other than, you know, via marketing, which again, not the worst answer in the world, but it also turns into skeezy marketing. “Yes, you should use this other company's option, or you could use ours and we don't have any intentional backdoors in our version.” “Hmm. That sounds more suspicious and more than a little bit frightening. Tell me more.” “No, legal won't let me.” And it's “Okay.” Aside from the terrible things, how do you differentiate?Dan: I liked that. That was an oddly specific disclaimer, right? Like, whenever a company says, “Oh, yeah, no.” [laugh].Corey: “My breakfast cereal has less arsenic than leading brands.”Dan: Perfect. So yeah, so FusionAuth realizes that, kind of, there are a lot of options out there, and so we've chosen to niche down. And one of the things that we really focus on is the CIAM market. And that stands for Customer Identity Access Management. And we can dive into that a little bit later if you want to know more about that.We have a variety of deployment options, which I think differentiates us from a lot of the SaaS providers out there. You can run us as a self-hosted option with, by the way, professional-grade support, you can use us as a SaaS provider if you don't want to run it yourself. We are experts in operating this piece of software. And then thirdly, you can move between them, right? 
It's your data, so if you start out and you're bare bones and you want to save money, you can start with self-hosted, when you grow, move to the SaaS version.Or we actually have some bigger companies that kickstart on the SaaS version because they want to get going with this integration problem and then later, as they build out their capabilities, they want the option to move it in-house. So, that is a really key differentiator for us. The last one I'd say is we're really dev-focused. Who isn't, right? Everyone says they're dev-focused, but we live that in terms of our APIs, in terms of our documentation, in terms of our open development process. Like, there's actually a GitHub issues list you can go look on the FusionAuth GitHub profile and it shows exactly what we have planned for the next couple of releases.Corey: If you go to one of my test reference applications, lasttweetinaws.com, as of the time of this recording at least, it asks you to authenticate with your Twitter account. And you can do that, and it's free; I don't charge for any of these things. And once you're authenticated, you can use it to author Twitter threads because I needed it to exist, first off, and secondly, it makes a super handy test app to try out a whole bunch of different things.And one of the reasons you can just go and use it without registering an account for this thing or anything else was because I tried to set that up in an early version with Cognito and immediately gave the hell up and figured, all right, if you can find the URL, you can use this thing because the experience was that terrible. If instead, I had gone down the path of using FusionAuth, what would have made that experience different, other than the fact that Cognito was pretty clearly a tech demo at best rather than something that had any care, finish, spit and polish went into it.Dan: So, I've used Cognito. I'm not going to bag on Cognito, I'm going to leave that to—[laugh].Corey: Oh, I will, don't worry. I'll do all the bagging on Cognito you'd like because the problem is, and I want to be clear on this point, is that I didn't understand what it was doing because the interface was arcane, and the failure mode of everything in this entire sector, when the interface is bad, the immediate takeaway is not “This thing's a piece of crap.” It's, “Oh, I'm bad at this. I'm just not smart enough.” And it's insulting, and it sets me off every time I see it. So, if I feel like I'm coming across as relatively annoyed by the product, it's because it made me feel dumb. That is one of those cardinal sins, from my perspective. So, if you work on that team, please reach out. I would love to give you a laundry list of feedback. I'm not here to make you feel bad about your product; I'm here to make you feel bad about making your customers feel bad. Now please, Dan, continue.Dan: Sure. So, I would just say that one of the things that we've strived to do for years and years is translate some of the arcane IAM Identity Access Management jargon into what normal developers expect. And so, we don't have clients in our OAuth implementation—although they really are clients if you're an RFC junkie—we have applications, right? 
We have users, we have groups, we have all these things that are what users would expect, even though underlying them they're based on the same standards that, frankly, Cognito and Auth0 and a lot of other people use as well.But to get back to your question, I would say that, if you had chosen to use FusionAuth, you would have had a couple of advantages. The first is, as I mentioned, kind of the developer friendliness and the extensive documentation, example applications. The second would be a themeability. And this is something that we hear from our clients over and over again, is Cognito is okay if you stay within the lines in terms of your user interface, right? If you just want to login form, if you want to stay between lines and you don't want to customize your application's login page at all.We actually provide you with HTML templates. It's actually using a language called FreeMarker, but they let you do whatever the heck you want. Now, of course, with great power comes great responsibility. Now, you own that piece, right, and we do have some more simple customization you can do if all you want to do is change the color. But most of our clients are the kind of folks who really want their application login screen to look exactly like their application, and so they're willing to take on that slightly heavier burden. Unfortunately, Cognito doesn't give you that option at all, as far as I can tell when I've kicked the tires on it. The theming is—how I put this politely—some of our clients have found the theming to be lacking.Corey: That's part of the issue where when I was looking at all the reference implementations, I could find for Cognito, it went from “Oh, you have your own app, and its branding, and the rest,” and bam, suddenly, you're looking right, like, you're logging into an AWS console sub-console property because of course they have those. And it felt like “Oh, great. If I'm going to rip off some company's design aesthetic wholesale, I'm sorry, Amazon is nowhere near anywhere except the bottom 10% of that list, I've got to say. I'm sorry, but it is not an aesthetically pleasing site, full stop. So, why impose that on customers?”It feels like it's one of those things where—like, so many Amazon service teams say, “We're going to start by building a minimum lovable product.” And it's yeah, it's a product that only a parent could love. And the problem is, so many of them don't seem to iterate beyond that do a full-featured story. And this is again, this is not every AWS service. A lot of them are phenomenal and grow into themselves over time.One of the best rags-to-riches stories that I can recall is EFS, their Elastic File System, for an example. But others, like Cognito just sort of seem to sit and languish for so long that I've basically given up hope. Even if they wind up eventually fixing all of these problems, the reputation has been cemented at this point. They've got to give it a different terrible name.Dan: I mean, here's the thing. Like, EFS, if it looks horrible, right, or if it has, like, a toughest user experience, guess what? Your users are devs. And if they're forced to use it, they will. They can sometimes see the glimmers of the beauty that is kind of embedded, right, the diamond in the rough. If your users come to a login page and see something ugly, you immediately have this really negative association. 
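Getting users to that themed login page is the standard OAuth 2.0 authorization-code redirect, with Dan's "applications" acting as the "clients" of the spec. A generic sketch; the base URL, the /oauth2/authorize path, and the identifiers are conventional placeholders rather than any one vendor's contract:

```typescript
import { randomBytes } from "crypto";

// Build the authorization-code redirect for a registered application.
// All identifiers below are placeholders.
export function buildLoginRedirect(): { url: string; state: string } {
  const state = randomBytes(16).toString("hex"); // CSRF protection; verify it on the callback

  const url = new URL("https://auth.example.com/oauth2/authorize");
  url.searchParams.set("client_id", "my-application-id");
  url.searchParams.set("redirect_uri", "https://app.example.com/oauth/callback");
  url.searchParams.set("response_type", "code");
  url.searchParams.set("scope", "openid offline_access");
  url.searchParams.set("state", state);

  return { url: url.toString(), state };
}
```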
And so again, the login and authentication process is really the front door of your application, and you just need to make sure that it shines.Corey: For me at least, so much of what's what a user experience or user takeaway is going to be about a company's product starts with their process of logging into it, which is one of the reasons that I have challenges with the way that multi-factor auth can be presented, like, “Step one, login to the thing.” Oh, great. Now, you have to fish out your YubiKey, or you have to go check your email for a link or find a code somewhere and punch it in. It adds friction to a process. So, when you have these services or tools that oh, your session will expire every 15 minutes and you have to do that whole thing again to log back in, it's ugh, I'm already annoyed by the time I even look at anything beyond just the login stuff.And heaven forbid, like, there are worse things, let's be very clear here. For example, if I log in to a site, and I'm suddenly looking at someone else's account, yeah, that's known as a disaster and I don't care how beautiful the design aesthetic is or how easy to use it is, we're done here. But that is job zero: the security aspect of these things. Then there's all the polish that makes it go from something that people tolerate because they have to into something that, in the context of a login page I guess, just sort of fades into the background.Dan: That's exactly what you want, right? It's just like the old story about the sysadmin. People only notice when things are going wrong. People only care about authentication when it stops them from getting into what they actually want to do, right? No one ever says, “Oh, my gosh, that login experience was so amazing for that application. I'm going to come back to that application,” right? They notice when it's friction, they noticed when it's sand in the gears.And our goal at FusionAuth, obviously, security is job zero because as you said, last thing you want is for a user to have access to some other user's data or to be able to escalate their privileges, but after that, you want to fade in the background, right? No one comes to FusionAuth and builds a whole application on top of it, right? We are one component that plugs into your application and lets you get on to the fundamentals of building the features that your users really care about, and then wraps your whole application in a blanket of security, essentially.Corey: I'll take even one more example before we just drive this point home in a way that I hope resonates with folks. Everyone has an opinion on logging into AWS properties because “Oh, what about your Amazon account?” At which point it's “Oh, sit down. We're going for a ride here. Are you talking about amazon.com account? Are you talking about the root account for my AWS account? Are you talking about an IAM user? Are you talking about the service formerly known as AWS SSO that's now IAM Identity Center users? Are you talking about their Chime user account? Are you talking about your repost forum account?” And so, on and so on and so on. I'm sure I'm missing half a dozen right now off the top of my head.Yeah, that's awful. I've been also developing lately on top of Google Cloud, and it is so far to the opposite end of that spectrum that it's suspicious and more than a little bit frightening. When I go to console.cloud.google.com, I am boom, there. 
There is no login approach, which on the one hand, I definitely appreciate, just from a pure perspective of you're Google, you track everything I do on the internet. Thank you for not insulting my intelligence by pretending you don't know who I am when I log into your Cloud Console.Counterpoint, when I log into the admin portal for my Google Workspaces account, admin.google.com, it always re-prompts for a password, which is reasonable. You'd think that stuff running production might want to do something like that, in some cases. I would not be annoyed if it asked me to just type in a password again when I get to the expensive things that have lasting repercussions.Although, given my personality, logging into Gmail can have massive career repercussions as soon as I hit send on anything. I digress. It is such a difference from user experience and ease-of-use that it's one of those areas where I feel like you're fighting something of a losing battle, just because when it works well, it's glorious to the point where you don't notice it. When authentication doesn't work well, it's annoying. And there's really no in between.Dan: I don't have anything to say to that. I mean, I a hundred percent agree that it's something that you could have to get right and no one cares, except for when you get it wrong. And if your listeners can take one thing away from this call, right, I know it's we're sponsored by FusionAuth, I want to rep Fusion, I want people to be aware of FusionAuth, but don't roll your own, right? There are a lot of solutions out there. I hope you evaluate FusionAuth, I hope you evaluate some other solutions, but this is such a critical thing and Corey has laid out [laugh] in multiple different ways, the ways it can ruin your user experience and your reputation. So, look at something that you can build or a library that you can build on top of. Don't roll your own. Please, please don't.Corey: This episode is sponsored in part by Honeycomb. When production is running slow, it's hard to know where problems originate. Is it your application code, users, or the underlying systems? I've got five bucks on DNS, personally. Why scroll through endless dashboards while dealing with alert floods, going from tool to tool to tool that you employ, guessing at which puzzle pieces matter? Context switching and tool sprawl are slowly killing both your team and your business. You should care more about one of those than the other; which one is up to you. Drop the separate pillars and enter a world of getting one unified understanding of the one thing driving your business: production. With Honeycomb, you guess less and know more. Try it for free at honeycomb.io/screaminginthecloud. Observability: it's more than just hipster monitoring.Corey: So, tell me a little bit more about how it is that you folks think about yourselves in just in terms of the market space, for example. The idea of CIAM, customer IAM, it does feel viscerally different than traditional IAM in the context of, you know, AWS, which I use all the time, but I don't think I have the vocabulary to describe it without sounding like a buffoon. What is the definition between the two, please? Or the divergence, at least?Dan: Yeah, so I mean, not to go back to AWS services, but I'm sure a lot of your listeners are familiar with them. AWS SSO or the artist formerly known as AWS SSO is IAM, right? 
So, it's Workforce, right, and Workforce—Corey: And it was glorious, to the point where I felt like it was basically NDA'ed from other service teams because they couldn't talk about it. But this was so much nicer than having to juggle IAM keys and sessions that timeout after an hour in the console. “What do you doing in the console?” “I'm doing ClickOps, Jeremy. Leave me alone.”It's just I want to make sure that I'm talking about this the right way. It feels like AWS SSO—creature formerly known as—and traditional IAM feels like they're directionally the same thing as far as what they target, as far as customer bases, and what they empower you to do.Dan: Absolutely, absolutely. There are other players in that same market, right? And that's the market that grew up originally: it's for employees. So, employees have this very fixed lifecycle. They have complicated relationships with other employees and departments in organizations, you can tell them what to do, right, you can say you have to enroll your MFA key or you are no longer employed with us.Customers have a different set of requirements, and yet they're crucial to businesses because customers are, [laugh] who pay you money, right? And so, things that customers do that employees don't: they choose to register; they pick you, you don't pick them; they have a wide variety of devices and expectations; they also have a higher expectation of UX polish. Again, with an IAM solution, you can kind of dictate to your employees because you're paying them money. With a customer identity access management solution, it is part of your product, in the same way, you can't really dictate features unless you have something that the customer absolutely has to have and there are no substitutes for it, you have to adjust to the customer demands. CIAM is more responsive to those demands and is a smoother experience.The other thing I would say is CIAM, also, frankly, has a simpler model. Most customers have access to applications, maybe they have a couple of roles that you know, an admin role, an editor role, a viewer role if you're kind of a media conglomerate, for an example, but they don't have necessarily the thicket of complexity that you might have to have an eye on, so it's just simpler to model.Corey: Here's an area that feels like it's on the boundary between them. I distinctly remember being actively annoyed a while back that I had to roll my marketing person her own entire AWS IAM account solely so that she could upload assets into an S3 bucket that was driving some other stuff. It feels very much like that is a better use case for something that is a customer IAM solution. Because if I screw up those permissions even slightly, well, congratulations, now I've inadvertently given someone access to wind up, you know, taking production down. It feels like it is way too close to things that are going to leave a mark, whereas the idea of a customer authentication story for something like that is awesome.And no please if you're listening to this, don't email me with this thing you built and put on the Marketplace that “Oh, it uses signed URLs and whatnot to wind up automatically federating an identity just for this one per—” Yes. I don't want to build something ridiculous and overwrought so a single person can update assets within S3. I promise I don't want to do that. It just ends badly.Dan: Well, that was the promise of Cognito, right? 
And that is actually one of the reasons you should stick with Cognito if you have super-detailed requirements that are all about AWS and permissions to things inside AWS. Cognito has that tight integration. And I assume—I haven't looked at some of the other big cloud providers, but I assume that some of the other ones have that similar level of integration. So yeah, my answer there would be Cognito is the CIAM solution that AWS has, so that is what I would expect it to be able to handle, relatively smoothly.

Corey: A question I have for you about the product itself is based on a frustration I originally had with Cognito, which is that once you're in there and you are using that for authentication and you have users, there's no way for me to get access to the credentials of my users. I can't really do an export in any traditional sense. Is that possible with FusionAuth?

Dan: Absolutely. So, your data is your data. And because we're a self-hosted or SaaS solution, if you're running it self-hosted, obviously you have access to the password hashes in your database. If you are—

Corey: The hashes, not the plaintext passwords to be explicitly clear on this. [laugh].

Dan: Absolutely the hashes. And we have a number of guides that help you get hashes from other providers into ours. We have a written export guide ourselves, but it's in the database and the schema is public. You can go download our schema right now. And if—

Corey: And I assume you've used an industry standard hashing algorithm for this?

Dan: Yeah, we have a number of different options. You can bring your own actually, if you want, and we've had people bring their own options because they have either special needs or they have an older thing that's not as secure. And so, they still want their users to be able to log in, so they write a plugin and then they import the users' hashes, and then we transparently re-encrypt with a more modern one. The default for us is PBKDF2.

Corey: I assume you do the re-encryption at login time because there's no other way for you to get that.

Dan: Exactly. Yeah yeah yeah—

Corey: Yeah.

Dan: —because that's the only time we see the password, right? Like we don't see it any other time. But we support Bcrypt and other modern algorithms. And it's entirely configurable; if you want to set a factor, which basically is how—

Corey: I want to use MD5 because I'm still living in 2003.

Dan: [laugh]. Please don't use MD5. Second takeaway: don't roll your own and don't use MD5. Yeah, so it's very tweakable, but we shipped with a secure default, basically.

Corey: I just want to clarify as well why this is actively important. I don't think people quite understand that in many cases, picking an authentication provider is one of those lasting decisions where migrations take an awful lot of work. And they probably should. There should be no mechanism by which I can export the clear text passwords. If any authentication provider advertises or offers such a thing, don't use that one. I'm going to be very direct on that point. The downside to this is that if you are going to migrate from any other provider to any other provider, it has to happen either slowly as in, every time people log in, it'll check with the old system and then migrate that user to the new one, or you have to force password resets for your entire customer base. And the problem with that is I don't care what story you tell me.
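Both the slow-migration option Corey describes and the transparent upgrade Dan mentions hinge on the login moment, the only time the plaintext password is available. A minimal Node sketch, assuming PBKDF2 and a storage shape invented here rather than FusionAuth's actual plugin interface:

```typescript
import { pbkdf2Sync, randomBytes, timingSafeEqual } from "crypto";

// Hypothetical stored record: the hashing factor (iterations), salt, and a
// 32-byte hash, both hex-encoded.
interface StoredCredential {
  iterations: number;
  salt: string;
  hash: string;
}

const CURRENT_ITERATIONS = 600_000; // today's target factor (an assumption)

function derive(password: string, salt: Buffer, iterations: number): Buffer {
  return pbkdf2Sync(password, salt, iterations, 32, "sha256");
}

// Verify at login; if the stored hash used a weaker factor, re-hash it now,
// since this is the only moment the plaintext password is in hand.
export function verifyAndUpgrade(
  password: string,
  stored: StoredCredential
): { ok: boolean; upgraded?: StoredCredential } {
  const candidate = derive(password, Buffer.from(stored.salt, "hex"), stored.iterations);
  const ok = timingSafeEqual(candidate, Buffer.from(stored.hash, "hex"));
  if (!ok) return { ok: false };

  if (stored.iterations >= CURRENT_ITERATIONS) return { ok: true };

  const salt = randomBytes(16);
  const hash = derive(password, salt, CURRENT_ITERATIONS);
  return {
    ok: true,
    upgraded: {
      iterations: CURRENT_ITERATIONS,
      salt: salt.toString("hex"),
      hash: hash.toString("hex"),
    },
  };
}
```

The caller persists the `upgraded` record when it is returned, so weaker hashes age out one successful login at a time.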
If I get an email from one of my vendors saying “You now have to reset your password because we're migrating to their auth thing,” or whatnot, there's no way around it, there's no messaging that solves this, people will think that you suffered a data breach that you are not disclosing. And that is a heavy, heavy lift. Another pattern I've seen is it for a period of three months or whatnot, depending on user base, you will wind up having the plug in there, and anyone who logs in after that point will, “Ohh you need to reset your password. And your password is expired. Click here to reset.” That tends to be a little bit better when it's not the proactive outreach announcement, but it's still a difficult lift and it adds—again—friction to the customer experience.Dan: Yep. And the third one—which you imply it—is you have access to your password hashes. They're hashed in a secure manner. And trust me, even though they're hashed securely, like, if you contact FusionAuth and say, “Hey, I want to move off FusionAuth,” we will arrange a way to get you your database in a secure manner, right? It's going to be encrypted, we're going to have a separate password that we communicate with you out-of-band because this is—even if it is hashed and salted and handled correctly, it's still very, very sensitive data because credentials are the keys to the kingdom.So, but those are the three options, right? The slow migration, which is operationally expensive, the requiring the user to reset their password, which is horribly expensive from a user interface perspective, right, and the customer service perspective, or export your password hashes. And we think that the third option is the least of the evils because guess what? It's your data, right? It's your user data. We will help you be careful with it, but you own it.Corey: I think that there's a lot of seriously important nuance to the whole world of authentication. And the fact that this is such a difficult area to even talk about with folks who are not deeply steeped in that ecosystem should be an indication alone that this is the sort of thing that you definitely want to outsource to a company that knows what the hell they're doing. And it's not like other areas of tech where you can basically stumble your way through something. It's like “Well, I'm going to write a Lambda to go ahead and post some nonsense on Twitter.” “Okay, are you good at programming?” “Not even slightly, but I am persistent and brute force is a viable strategy, so we're going to go with that one.” “Great. Okay, that's awesome.”But authentication is one of those areas where mistakes will show. The reputational impact of losing data goes from merely embarrassing to potentially life-ruining for folks. The most stressful job I've ever had from a data security position wasn't when I was dealing with money—because that's only money, which sounds like a weird thing to say—it was when I did a brief stint at Grindr where people weren't out. In some countries, users could have wound up in jail or have been killed if their sexuality became known. And that was the stuff that kept me up at night.Compared to that, “Okay, you got some credit card numbers with that. What the hell do I care about that, relatively speaking?” It's like, “Yeah, it's well, my credit card number was stolen.” “Yeah, but did you die, though?” “Oh, you had to make a phone call and reset some stuff.” And I'm not trivializing the importance of data security. 
Corey: Especially, like, if you're a bank, and you're listening to this, and you're terrified, yeah, that's not what I'm saying at all. I'm just saying there are worse things.

Dan: Sure. Yeah. I mean, I think that, unfortunately, the pandemic showed us that we're living more and more of our lives online. And your identity online, and making sure that it's safe and secure, is just critical. And again, not just for your employees, although that's really important, too, but more of your customer interactions are going to be taking place online because it's scalable, because it makes people money, because it allows for capabilities that weren't previously there, and you have to take that seriously. So, take care of your users' data. Please, please do that.

Corey: And one of the best ways you can do that is by not touching the things that are commoditized in your effort to apply differentiation. That's why I will never again write my own auth system, with a couple of asterisks next to it, because some of what I do is objectively horrifying, intentionally so. But if I care about the authentication piece, I have the good sense to pay someone else to do it for me.

Dan: From personal experience, you mentioned at the beginning that we go back a ways. I remember when I first discovered RDS, and I thought, “Oh, my God. I can outsource all this scut work: all of the database backups, all of the upgrades, all of the availability checking, right? Like, I can outsource this to somebody else who will take this off my plate.” And I was so thankful.

And I don't—outside of, again, with some asterisks, right, there are places where I could consider running a database, but they're very few and far between—I feel like auth has entered that category. There are great providers like FusionAuth out there that are happy to take this off your plate and let you move forward. And in some ways, I'm not really sure which is more dangerous: not running a database properly or not running an auth system properly. They both give me shivers, and I would hate to [laugh] be forced to choose. But they're comparable levels of risk, so I a hundred percent agree, Corey.

Corey: Dan, I really want to thank you for taking so much time to talk to me about your view of the world. If people want to learn more, because you're not in their inboxes responding to newsletters every week, where's the best place to find you?

Dan: Sure, you can find more about me on Twitter. I'm @mooreds, M-O-O-R-E-D-S. And you can learn more about FusionAuth and download it for free at fusionauth.io.

Corey: And we will put links to all of that in the show notes. I really want to thank you again for just being so generous with your time. It's deeply appreciated.

Dan: Corey, thank you so much for having me.

Corey: Dan Moore, Head of DevRel at FusionAuth. I'm Cloud Economist Corey Quinn. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice, along with an angry, insulting comment that will be attributed to someone else because they screwed up by rolling their own authentication.

Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point.
Visit duckbillgroup.com to get started.

Announcer: This has been a HumblePod production. Stay humble.

Screaming in the Cloud
Trivy and Open Source Communities with Anaïs Urlichs

Screaming in the Cloud

Play Episode Listen Later Sep 6, 2022 36:15


About AnaïsAnaïs is a Developer Advocate at Aqua Security, where she contributes to Aqua's cloud native open source projects. When she is not advocating DevOps best practices, she runs her own YouTube Channel centered around cloud native technologies. Before joining Aqua, Anais worked as SRE at Civo, a cloud native service provider, where she helped enhance the infrastructure for hundreds of tenant clusters. As CNCF ambassador of the year 2021, her passion lies in making tools and platforms more accessible to developers and community members.Links Referenced: Aqua Security: https://www.aquasec.com/ Aqua Open Source YouTube channel: https://www.youtube.com/c/AquaSecurityOpenSource Personal blog: https://anaisurl.com TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: This episode is sponsored in part by our friends at AWS AppConfig. Engineers love to solve, and occasionally create, problems. But not when it's an on-call fire-drill at 4 in the morning. Software problems should drive innovation and collaboration, NOT stress, and sleeplessness, and threats of violence. That's why so many developers are realizing the value of AWS AppConfig Feature Flags. Feature Flags let developers push code to production, but hide that that feature from customers so that the developers can release their feature when it's ready. This practice allows for safe, fast, and convenient software development. You can seamlessly incorporate AppConfig Feature Flags into your AWS or cloud environment and ship your Features with excitement, not trepidation and fear. To get started, go to snark.cloud/appconfig That's snark.cloud/appconfig.Corey: This episode is sponsored in part by Honeycomb. When production is running slow, it's hard to know where problems originate. Is it your application code, users, or the underlying systems? I've got five bucks on DNS, personally. Why scroll through endless dashboards while dealing with alert floods, going from tool to tool to tool that you employ, guessing at which puzzle pieces matter? Context switching and tool sprawl are slowly killing both your team and your business. You should care more about one of those than the other; which one is up to you. Drop the separate pillars and enter a world of getting one unified understanding of the one thing driving your business: production. With Honeycomb, you guess less and know more. Try it for free at honeycomb.io/screaminginthecloud. Observability: it's more than just hipster monitoring.Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. Every once in a while, when I start trying to find guests to chat with me and basically suffer my various slings and arrows on this show, I encounter something that I've never really had the opportunity to explore further. And today's guest leads me in just such a direction. Anaïs is an open-source developer advocate at Aqua Security, and when I was asking her whether or not she wanted to talk about various topics, one of the first thing she said was, “Don't ask me much about AWS because I've never used it,” which, oh my God. Anaïs, thank you for joining me. You must be so very happy never to have dealt with the morass of AWS.Anaïs: [laugh]. 
Yes, I'm trying my best to stay away from it. [laugh].Corey: Back when I got into the cloud space, for lack of a better term, AWS was sort of really the only game in town unless you wanted to start really squinting hard at what you define cloud as. I mean yes, I could have gone into Salesforce or something, but I was already sad and angry all the time. These days, you can very much go all in-on cloud. In fact, you were a CNCF ambassador, if I'm not mistaken. So, you absolutely are in the infrastructure cloud space, but you haven't dealt with AWS. That is just an interesting path. Have you found others who have gone down that same road, or are you sort of the first of a new breed?Anaïs: I think to find others who are in a similar position or have a similar experience, as you do, you first have to talk about your experience, and this is the first time, or maybe the second, that I'm openly [laugh] saying it on something that will be posted live, like, to the internet. Before I, like, I tried to stay away from mentioning it at all, do the best that I can because I'm at this point where I'm so far into my cloud-native Kubernetes journey that I feel like I should have had to deal with AWS by now, and I just didn't. And I'm doing my best and I'm very successful in avoiding it. [laugh]. So, that's where I am. Yeah.Corey: We're sort of on opposite sides of a particular fence because I spend entirely too much time being angry at AWS, but I've never really touched Kubernetes and anger. I mean, I see it in a lot of my customer accounts and I get annoyed at its data transfer bills and other things that it causes in an economic sense, but as far as the care and feeding of a production cluster, back in my SRE days, I had very old-school architectures. It's, “Oh, this is an ancient system, just like grandma used to make,” where we had the entire web tier, then a job applic—or application server tier, and then a database at the end, and everyone knew where everything was. And then containers came out of nowhere, and it seemed like okay, this solves a bunch of problems and introduces a whole bunch more. How do I orchestrate them? How do I ensure that they're healthy?And then ah, Kubernetes was the answer. And for a while, it seemed like no matter what the problem was, Kubernetes was going to be the answer because people were evangelizing it pretty hard. And now I see it almost everywhere that I turn. What's your journey been like? How did you get into the weeds of, “You know what I want to do when I grow up? That's right. I want to work on container orchestration systems.” I have a five-year-old. She has never once said that because I don't abuse my children by making them learn how clouds work. How did you wind up doing what you do?Anaïs: It's funny that you mention that. So, I'm actually of the generation of engineers who doesn't know anything else but Kubernetes. So, when you mentioned that you used to use something before, I don't really know what that looks like. I know that you can still deploy systems without Kubernetes, but I have no idea how. My journey into the cloud-native space started out of frustration from the previous industry that I was working at.So, I was working for several years as developer advocate in the open-source blockchain cryptocurrency space and it's highly similar to all of the cliches that you hear online and across the news. And out of this frustration, [laugh] I was looking at alternatives. 
One of them was either going into game development, into the gaming industry, or the cloud-native space and infrastructure development and deployment. And yeah, that's where I ended up. So, at the end of 2020, I joined a startup in the cloud-native space and started my social media journey.Corey: One of the things that I found that Kubernetes solved for—and to be clear, Kubernetes really came into its own after I was doing a lot more advisory work and a lot more consulting style activity rather than running my own environments, but there's an entire universe of problems that the modern day engineer never has to think about due to, partially cloud and also Kubernetes as well, which is the idea of hardware or node failure. I've had middle of the night driving across Los Angeles in a panic getting to the data center because the disk array on the primary database had degraded because the drive failed. That doesn't happen anymore. And clouds have mostly solved that. It's okay, drives fail, but yeah, that's the problem for some people who live in Virginia or Oregon. I don't have to think about it myself.But you do have to worry about instances failing; what if the primary database instance dies? Well, when everything lives in a container then that container gets moved around in the stateless way between things, well great, you really only have to care instead about okay, what if all of my instances die? Or, what if my code is really crappy? To which my question is generally, what do you mean, ‘if?' All of us write crappy code.That's the nature of the universe. We open-source only the small subset that we are not actively humiliated by, which is, in a lot of ways, what you're focusing on now, over at Aqua Sec, you are an advocate for open-source. One of the most notable projects that come out of that is Trivy, if I'm pronouncing that correctly.Anaïs: Yeah, that's correct. Yeah. So, Trivy is our main open-source project. It's an all-in-one cloud-native security scanner. And it's actually—it's focused on misconfiguration issues, so it can help you to build more robust infrastructure definitions and configurations.So ideally, a lot of the things that you just mentioned won't happen, but it obviously, highly depends on so many different factors in the cloud-native space. But definitely misconfigurations of one of those areas that can easily go wrong. And also, not just that you have data might cease to exist, but the worst thing or, like, as bad might be that it's completely exposed online. And they are databases of different exposures where you can see all the kinds of data of information from just health data to dating apps, just being online available because the IP address is not protected, right? Things like that. [laugh].Corey: We all get those emails that start with, “Your security is very important to us,” and I know just based on that opening to an email, that the rest of that email is going to explain how security was not very important to you folks. And it's the apology, “Oops, we have messed up,” email. Now, the whole world of automated security scanners is… well, it's crowded. There are a number of different services out there that the cloud providers themselves offer a bunch of these, a whole bunch of scareware vendors at the security conferences do as well. 
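To make the "all-in-one" description concrete, here is a minimal sketch of a first pass with Trivy from a CI script, covering the two things Anaïs just mentioned: image vulnerabilities and misconfigured infrastructure definitions. It assumes the trivy binary is installed and on the PATH; the image name and directory are placeholders, and the JSON field names shown match recent Trivy reports but can vary between versions.

```python
import json
import subprocess


def run_trivy(subcommand: str, target: str) -> dict:
    """Shell out to the trivy CLI and parse its JSON report."""
    completed = subprocess.run(
        ["trivy", subcommand, "--quiet", "--format", "json", target],
        capture_output=True, text=True, check=True,
    )
    return json.loads(completed.stdout)


# Known vulnerabilities in a container image (placeholder image name).
image_report = run_trivy("image", "registry.example.com/myapp:latest")
for result in image_report.get("Results", []):
    for vuln in result.get("Vulnerabilities") or []:
        print(vuln["VulnerabilityID"], vuln["Severity"], vuln["PkgName"])

# Misconfigurations in Terraform/Kubernetes definitions (placeholder path).
config_report = run_trivy("config", "./infrastructure")
for result in config_report.get("Results", []):
    for misconf in result.get("Misconfigurations") or []:
        print(misconf["ID"], misconf["Severity"], misconf["Title"])
```

Running both from one tool is what lets a team share a single scanner across the development lifecycle instead of wiring up a different one per stage.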
Taking a quick glance at Trivy, one of the problems I see with it, from a cloud provider perspective, is that I see nothing that it does that winds up costing extra money on your cloud bill that you then have to pay to the cloud provider, so maybe they'll put a pull request in for that one of these days. But my sarcasm aside, what is it that differentiates Trivy from a bunch of other offerings in various spaces?Anaïs: So, there are multiple factors. If we're looking from an enterprise perspective, you could be using one of the in-house scanners from any of the cloud providers available, depending which you're using. The thing is, they are not generally going to be the ones who have a dedicated research team that provides the updates based on the vulnerabilities they find across the space. So, with an open-source security scanner or from a dedicated company, you will likely have more up-to-date information in your scans. Also, lots of different companies, they're using Trivy under the hood ultimately, or for their own scans.I can link a few where you can also find them in a Trivy repository. But ultimately, a lot of companies rely on Trivy and other open-source security scanners under the hood because they are from dedicated companies. Now, the other part to Trivy and why you might want to consider using Trivy is that in larger teams, you will have different people dealing with different components of your infrastructure, of your deployments, and you could end up having to use multiple different security scanners for all your different components from your container images that you're using, whether or not they are secure, whether or not they're following best practices that you defined to your infrastructure-as-code configurations, to you're running deployments inside of your cluster, for instance. So, each of those different stages across your lifecycle, from development to runtime, will maybe either need different security scanners, or you could use one security scanner that does it all. So, you could have in a team more knowledge sharing, you could have dedicated people who know how to use the tool and who can help out across a team across the lifecycle, and similar. So, that's one of the components that you might want to consider.Another thing is how mature is a tool, right? A lot of cloud providers, what they end up doing is they provide you with a solution, but it's nice to decoupled from anything else that you're using. And especially in the cloud-native space, you're heavily reliant on open-source tools, such as for your observability stack, right? Coming from Site Reliability Engineering also myself, I love using metrics and Grafana. And for me, if anything open-source from Loki to accessing my logs, to Grafana to dashboards, and all their integrations.I love that and I want to use the same tools that I'm using for everything else, also for my security tools. I don't want to have the metrics for my security tools visualized in a different solution to my reliability metrics for my application, right? Because that ultimately makes it more difficult to correlate metrics. So, those are, like, some of the factors that you might want to consider when you're choosing a security scanner.Corey: When you talk about thinking about this, from the perspective of an SRE is—I mean, this is definitely an artifact of where you come from and how you approach this space. 
Because in my world, when you have ten web servers, five application servers, and two database servers and you wind up with a problem in production, how do you fix this? Oh, it's easy. You log into one of those nodes and poke around and start doing diagnostics in production. In a containerized world, you generally can't do that, or there's a problem on a container, and by the time you're aware of that, that container hasn't existed for 20 minutes.So, how do you wind up figuring out what happens? And instrumenting for telemetry and metrics and observability, particularly at scale becomes way more important than it ever was, for me. I mean, my version of monitoring was always Nagios, which was the original Call of Duty that wakes you up at two in the morning when the hard drive fails. The world has thankfully moved beyond that and a bunch of ways. But it's not first nature for me. It's always, “Oh, yeah, that's right. We have a whole telemetry solution where I can go digging into.” My first attempt is always, oh, how do I get into this thing and poke it with a stick? Sometimes that's helpful, but for modern applications, it really feels like it's not.Anaïs: Totally. When we're moving to an infrastructure to an environment where we can deploy multiple times a day, right, and update our application multiple times a day, multiple times a day, we can introduce new security issues or other things can go wrong, right? So, I want to see—as much as I want to see all of the other failures, I want to see any security-related issues that might be deployed alongside those updates at the same frequency, right?Corey: The problem that I see across all this stuff, though, is there are a bunch of tools out there that people install, but then don't configure because, “Oh, well, I bought the tool. The end.” I mean, I think it was reported almost ten years ago or so on the big Target breach that they wound up installing some tool—I want to say FireEye, but please don't quote me on that—and it wound up firing off a whole bunch of alerts, and they figured was just noise, so they turned it all off. And it turned out no, no, this was an actual breach in progress. But people are so used to all the alarms screaming at them, that they don't dig into this.I mean, one of the original security scanners was Nessus. And I seen a lot of Nessus reports because for a long time, what a lot of crappy consultancies would do is they would white-label the output of whatever it was that Nessus said and deliver that in as the report. So, you'd wind up with 700 pages of quote-unquote, “Security issues.” And you'd have to flip through to figure out that, ah, this supports a somewhat old SSL negotiation protocol, and you're focusing on that instead of the oh, and by the way, the primary database doesn't have a password set. Like, it winds up just obscuring it because there is so much. How does Trivy approach avoiding the information overload problem?Anaïs: That's a great question because everybody's complaining about vulnerability fatigue, of them, for the first time, scanning their container images and workloads and seeing maybe even hundreds of vulnerabilities. And one of the things that can be done to counteract that right from the beginning is investing your time into looking at the different flags and configurations that you can do before actually deploying Trivy to, for example, your cluster. That's one part of it. The other part is I mentioned earlier, you would use a security scan at different parts of your deployment. 
So, it's really about integrating scanning not just once you—like, in your production environment, once you've deployed everything, but using it already before and empowering engineers to actually use it on their machines.Now, they can either decide to do it or not; it's not part of most people's job to do security scanning, but as you move along, the more you do, the more you can reduce the noise and then ultimately, when you deploy Trivy, for example, inside of your cluster, you can do a lot of configuration such as scanning just for critical vulnerabilities, only scanning for vulnerabilities that already have a fix available, and everything else should be ignored. Those are all factors and flags that you can place into Trivy, for instance, and make it easier. Now, with Trivy, you won't have automated PRs and everything out of the box; you would have to set up the actions or, like, the ways to mitigate those vulnerabilities manually by yourself with tools, as well as integrating Trivy with your existing stack, and similar. But then obviously, if you want to have something more automated, if you want to have something that does more for you in the background, that's when you want to use to an enterprise solution and shift to something like Aqua Security Enterprise Platform that actually provides you with the automated way of mitigating vulnerabilities where you don't have to know much about it and it just gives you the solution and provides you with a PR with the updates that you need in your infrastructure-as-code configurations to mitigate the vulnerability [unintelligible 00:15:52]?Corey: I think that's probably a very fair answer because let's be serious when you're running a bank or someone for whom security matters—and yes, yes, I know, security should matter for everyone, but let's be serious, I care a little bit less about the security impact of, for example, I don't know, my Twitter for Pets nonsense, than I do a dating site where people are not out about their orientation or whatnot. Like, there is a world of difference between the security concerns there. “Oh, no, you might be able to shitpost as me if you compromise my lasttweetinaws.com Twitter client that I put out there for folks to use.” Okay, great. That is not the end of the world compared to other stuff.By the time you're talking about things that are critically important, yeah, you want to spend money on this, and you want to have an actual full-on security team. But open-source tools like this are terrific for folks who are just getting started or they're building something for fun themselves and as it turns out, don't have a full security budget for their weird late-night project. I think that there's a beautiful, I guess, spectrum, as far as what level of investment you can make into security. And it's nice to see the innovation continued happening in the space.Anaïs: And you just mentioned that dedicated security companies, they likely have a research team that's deploying honeypots and seeing what happens to them, right? Like, how are attackers using different vulnerabilities and misconfigurations and what can be done to mitigate them. And that ultimately translates into the configurations of the open-source tool as well. So, if you're using, for instance, a security scanner that doesn't have an enterprise company with a research team behind it, then you might have different input into the data of that security scanner than if you do, right? 
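The filters Anaïs describes map onto documented Trivy options such as --severity and --ignore-unfixed, plus a .trivyignore file (one CVE ID per line) for findings a team has consciously accepted. Below is a hedged sketch of a CI gate built on those flags; the image name is a placeholder and the wrapper script is illustrative rather than anything Trivy ships.

```python
import subprocess
import sys

IMAGE = "registry.example.com/myapp:latest"  # placeholder

# Fail the pipeline only on CRITICAL/HIGH findings that already have a fix;
# anything listed in .trivyignore is skipped.
cmd = [
    "trivy", "image",
    "--severity", "CRITICAL,HIGH",
    "--ignore-unfixed",
    "--ignorefile", ".trivyignore",
    "--exit-code", "1",
    IMAGE,
]

completed = subprocess.run(cmd)
if completed.returncode != 0:
    print("Blocking vulnerabilities found; see the report above.", file=sys.stderr)
    sys.exit(completed.returncode)
```

Teams typically start with CRITICAL only and widen the net as the backlog shrinks; the ignore file keeps already-accepted findings from failing every subsequent build.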
So, these are, like, additional considerations that you might want to take when choosing a scanner. And also that obviously depends on what scanning you want to do, on the size of your company, and similar, right?Corey: This episode is sponsored in part by our friend EnterpriseDB. EnterpriseDB has been powering enterprise applications with PostgreSQL for 15 years. And now EnterpriseDB has you covered wherever you deploy PostgreSQL on-premises, private cloud, and they just announced a fully-managed service on AWS and Azure called BigAnimal, all one word. Don't leave managing your database to your cloud vendor because they're too busy launching another half-dozen managed databases to focus on any one of them that they didn't build themselves. Instead, work with the experts over at EnterpriseDB. They can save you time and money, they can even help you migrate legacy applications—including Oracle—to the cloud. To learn more, try BigAnimal for free. Go to biganimal.com/snark, and tell them Corey sent you.Corey: Something that I do find fairly interesting is that you started off, as you say, doing DevRel in the open-source blockchain world, then you went to work as an SRE, and then went back to doing DevRel-style work. What got you into SRE and what got you out of SRE, other than the obvious having worked in SRE myself and being unhappy all the time? I kid, but what was it that got you into that space and then out of it?Anaïs: Yeah. Yeah, but no, it's a great question. And it's, I guess, also was shaped my perspective on different tools and, like, the user experience of different tools. But ultimately, I first worked in the cloud-native space for an enterprise tool as developer advocate. And I did not like the experience of working for a paid solution. Doing developer advocacy for it, it felt wrong in a lot of ways. A lot of times you were required to do marketing work in those situations.And that kind of got me out of developer advocacy into SRE work. And now I was working partially or mainly as SRE, and then on the side, I was doing some presentations in developer advocacy. However, that split didn't quite work, either. And I realized that the value that I add to a project is really the way I convey information, which I can't do if I'm busy fixing the infrastructure, right? I can't convey the information of as much of how the infrastructure has been fixed as I can if I'm working with an engineering team and then doing developer advocacy, solely developer advocacy within the engineering team.So, how I ultimately got back into developer advocacy was just simply by being reached out to by my manager at Aqua Security, and Itay telling me, him telling me that he has a role available and if I want to join his team. And it was open-source-focused. Given that I started my career for several years working in the open-source space and working with engineers, contributing to open-source tools, it was kind of what I wanted to go back to, what I really enjoy doing. And yeah, that's how that came about [laugh].Corey: For me, I found that I enjoy aspects of the technology part, but I find I enjoy talking to people way more. And for me, the gratifying moment that keeps me going, believe it or not, is not necessarily helping giant companies spend slightly less money on another giant company. It's watching people suddenly understand something they didn't before, it's watching the light go on in their eyes. And that's been addictive to me for a long time. 
I've also found that the best way for me to learn something is to teach someone else.I mean, the way I learned Git was that I foolishly wound up proposing a talk, “Terrible Ideas in Git”—we'll teach it by counterexample—four months before the talk. And they accepted it, and crap, I'd better learn enough get to give this talk effectively. I don't recommend this because if you miss the deadline, I checked, they will not move the conference for you. But there really is something to be said for watching someone learn something by way of teaching it to them.Anaïs: It's actually a common strategy for a lot of developer advocates of making up a talk and then waiting whether or not it will get accepted. [laugh] and once it gets accepted, that's when you start learning the tool and trying to figure it out. Now, it's not a good strategy, obviously, to do that because people can easily tell that you just did that for a conference. And—Corey: Sounds to me, like, you need to get better at bluffing. I kid.Anaïs: [laugh].Corey: I kid. Don't bluff your way through conference talks as a general rule. It tends not to go well. [laugh].Anaïs: No. It's a bad idea. It's a really bad idea. And so, I ultimately started learning the technologies or, like, the different tools and projects in the cloud-native space. And there are lots, if you look at the CNCF landscape, right? But just trying to talk myself through them on my YouTube channel. So, my early videos on my channel, it's just very much on the go of me looking for the first time at somebody's documentation and not making any sense out of them.Corey: It's surprising to me how far that gets you. I mean, I guess I'm always reminded of that Tom Hanks movie from my childhood Big where he wakes up—the kid wakes up as an adult one day, goes to work, and bluffs his way into working at a toy company. He's in a management meeting and just they're showing their new toy they're going to put out there and he's, “I don't get it.” Everyone looks at him like how dare you say it? And, “I don't get it. What's fun about this?” Because he's a kid.And he wants to getting promoted to vice president because wow, someone pointed out the obvious thing. And so often, it feels like using a tool or a product, be it open-source or enterprise, it is clearly something different in my experience of it when I try to use this thing than the person who developed it. And very often it's that I don't see the same things or think of the problem space the same way that the developers did, but also very often—and I don't mean to call anyone in particular out here—it's a symptom of a terrible user interface or user experience.Anaïs: What you've just said, a lot of times, it's just about saying the thing that nobody that dares to say or nobody has thought of before, and that gets you obviously, easier, further [laugh] then repeating what other people have already mentioned, right? And a lot of what you see a lot of times in these—also an open-source projects, but I think more even in closed-source enterprise organizations is that people just repeat whatever everybody else is saying in the room, right? You don't have that as much in the open-source world because you have more input or easier input in public than you do otherwise, but it still happens that I mean, people are highly similar to each other. If you're contributing to the same project, you probably have a similar background, similar expertise, similar interests, and that will get you to think in a similar way. 
So, if there's somebody like, like a high school student maybe, somebody just graduated, somebody from a completely different industry who's looking at those tools for the first time, it's like, “Okay, I know what I'm supposed to do, but I don't understand why I should use this tool for that.” And just pointing that out, gets you a response, most of the time. [laugh].Corey: I use Twitter and use YouTube. And obviously, I bias more for short, pithy comments that are dripping in sarcasm, whereas in a long-form video, you can talk a lot more about what you're seeing. But the problem I have with bad user experience, particularly bad developer experience, is that when it happens to me—and I know at a baseline level, that I am reasonably competent in technical spaces, but when I encounter a bad interface, my immediate instinctive reaction is, “Oh, I'm dumb. And this thing is for smart people.” And that is never, ever true, except maybe with quantum computing. Great, awesome. The Hello World tutorial for that stuff is a PhD from Berkeley. Good luck if you can get into that. But here in the real world where the rest of us play, it's just a bad developer experience, but my instinctive reaction is that there's stuff I don't know, and I'm not good enough to use this thing. And I get very upset about that.Anaïs: That's one of the things that you want to do with any technical documentation is that the first experience that anybody has, no matter the background, with your tool should be a success experience, right? Like people should look at it, use maybe one command, do one thing, one simple thing, and be like, “Yeah, this makes sense,” or, like, this was fun to do, right? Like, this first positive interaction. And it doesn't have to be complex. And that's what many people I think get wrong, that they try to show off how powerful a tool is, of like, oh, “My God, you can do all those things. It's so exciting, right?” But [laugh] ultimately, if nobody can use it or if most of the people, 99% of the people who try it for the first time have a bad experience, it makes them feel uncomfortable or any negative emotion, then it's really you're approaching it from the wrong perspective, right?Corey: That's very apt. I think it's so much of whether people stick with something long enough to learn it and find the sharp edges has to do with what their experience looks like. I mean, back when I was more or less useless when it comes to anything that looked like programming—because I was a sysadmin type—I started contributing to SaltStack. And what was amazing about that was Tom Hatch, the creator of the project had this pattern that he kept up for way too long, where whenever anyone submitted an issue, he said, “Great, well, how about you fix it?” And because we had a patch, like, “Well, I'm not good at programming.” He's like, “That's okay. No one is. Try it and we'll see.”And he accepted every patch and then immediately, you'd see another patch come in ten minutes later that fixed the problems in your patch. But it was the most welcoming and encouraging experience, and I'm not saying that's a good workflow for an open-source maintainer, but he still remains one of the best humans I know, just from that perspective alone.Anaïs: That's amazing. I think it's really about pointing out that there are different ways of doing open-source [laugh] and there is no one way to go about it. So, it's really about—I mean, it's about the community, ultimately. 
That's what it boils down to: you are dependent, as an open-source project, on the community, so what is the best experience that you can give them? If that's something that you want to and can invest in, then yeah, [laugh] that's probably the best outcome for everybody.

Corey: I do have one more question, specifically around things that are more timely. Now, taking a quick look at Trivy and recent features, it seems like you've just now—now-ish—started supporting cloud scanning as well. Previously, it was effectively, “Oh, this scans configuration and containers. Okay, great.” Now, you're targeting actually scanning cloud providers themselves. What does this change and what brought you to this place, as someone who very happily does not deal with AWS?

Anaïs: Yeah, totally. So, I just started using AWS, specifically to showcase this feature. So, if you look at the Aqua Open Source YouTube channel, you will find several tutorials that show you how to use that feature, among others.

Now, what I mentioned earlier in the podcast already is that Trivy is really versatile; it allows you to scan different aspects of your stack at different stages of your development lifecycle. And that's made possible because Trivy is ultimately using different open-source projects under the hood. For example, if you want to scan your infrastructure-as-code misconfigurations, it's using a tool called tfsec, specifically for Terraform, and then other tools for other kinds of security scanning. Now, we have—or had; it's probably going to be deprecated—a tool called CloudSploit in the Aqua open-source project suite.

The functionality that CloudSploit was providing is going to get converted to become part of Trivy, so everything scanning-related is going to become part of Trivy. Once you understand how Trivy works (and all of the CLI commands in Trivy have exactly the same structure), it's really easy to scan everything from container images to infrastructure-as-code, to generating SBOMs, to scanning, now, your cloud infrastructure. Trivy can scan any of your AWS services for misconfigurations, and it's basically using the AWS client under the hood to connect with the services you have set up there, and then give you the list of misconfigurations. And once it has done the scan, you can then drill down further into the different aspects of your misconfigurations without performing the entire scan again, since you likely have lots and lots of resources and you wouldn't want to scan them every time. So, once something has been scanned, Trivy will know whether the resource changed or not, and it won't scan it again. That's the same way that in-cluster scanning works right now: once a container image has been scanned for vulnerabilities, it won't scan the same container image again because that would just waste time. [laugh]. So yeah, do check it out. It's our most recent feature, and it's going to come out also for the other cloud providers out there. But we're starting with AWS, and this kind of forced me to finally [laugh] look at it for the sake of it. But I'm not going to be happy. [laugh].
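For the cloud-scanning feature itself, the workflow Anaïs describes looked roughly like the following at the time of this episode. The aws subcommand and its flags have been reworked in later Trivy releases, so treat the exact invocation as version-dependent and check the current documentation; the region, service, and ARN values are placeholders, and credentials are picked up from the usual AWS environment.

```python
import subprocess

# Placeholder region; credentials come from the standard AWS env/config chain.
REGION = "us-east-1"


def trivy_aws(*extra: str) -> None:
    """Invoke the trivy aws subcommand with the given extra arguments."""
    subprocess.run(["trivy", "aws", "--region", REGION, *extra], check=True)


# First run walks the supported services in the account and caches the results.
trivy_aws()

# Follow-up runs reuse the cache, so drilling into one service is quick.
trivy_aws("--service", "s3")

# Narrow further to a single resource's findings (bucket name is a placeholder).
trivy_aws("--service", "s3", "--arn", "arn:aws:s3:::example-bucket")
```

Because results are cached, the narrower runs reuse the first full account scan instead of re-querying every service, which is the behavior Anaïs describes above.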
We have crossed the point long ago, where I can very convincingly talk about AWS services that do not exist to Amazonians and not get called out for it because who in the world knows what they run? And half of their services sound like something I made up to be funny, but they're very real. It's wild to me that it is a sprawling as it is and apparently continues to work as a viable business.But no one knows all of it and everyone feels confused, lost, and overwhelmed every time they look at the AWS console. This has been my entire career in life for the last six years, and I still feel that way. So, I'm sure everyone else does, too.Anaïs: And this is how misconfigurations happen, right? You're confused about what you're actually supposed to do and how you're supposed to do it. And that's, for example, with all the access rights in Google Cloud, something that I'm very familiar with, that completely overwhelms you and you get super frustrated by, and you don't even know what you give access to. It's like, if you've ever had to configure Discord user roles, it's a similar disaster. You will not know which user has access to which. They kind of changed it and try to improve it over the past year, but it's a similar issue that you face in cloud providers, just on a much larger-scale, not just on one chat channel. [laugh]. So.Corey: I think that is probably a fair place to leave it. I really want to thank you for spending as much time with me as you have talking about the trials and travails of, well, this industry, for lack of a better term. If people want to learn more, where's the best place to find you?Anaïs: So, I have a weekly DevOps newsletter on my blog, which is anaisurl—like, how you spell U-R-L—and then dot com. anaisurl.com. That's where I have all the links to my different channels, to all of the resources that are published where you can find out more as well. So, that's probably the best place. Yeah.Corey: And we will, of course, put a link to that in the show notes. I really want to thank you for being as generous with your time as you have been. Thank you.Anaïs: Thank you for having me. It was great.Corey: Anaïs, open-source developer advocate at Aqua Security. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice along with an angry, insulting comment that I will never see because it's buried under a whole bunch of minor or false-positive vulnerability reports.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.Announcer: This has been a HumblePod production. Stay humble.