POPULARITY
Join Jobi George and Tony Le from Weaviate as we discuss how their AI Native Vector Database is powering modern AI applications. We begin with an intro to Vector Databases followed by common use cases, and how they fit into Agentic Workflows. We also talk about the advantages of using an AI Native Database from the Open Source community. To learn more:Website - www.weaviate.ioLinkedin - https://www.linkedin.com/company/weaviate-io/AWS Marketplace - https://aws.amazon.com/marketplace/seller-profile?id=seller-jxgfug62rvpxsDeveloper Community - https://weaviate.io/developers/weaviate/model-providers/awsInstagram - https://www.instagram.com/weaviate.io/Youtube - https://www.youtube.com/@WeaviateAWS Hosts: Nolan Chen & Malini ChatterjeeEmail Your Feedback: rethinkpodcast@amazon.com
Jon Yoo, CEO of Suger, shares how his company automates the complex & challenging workflows of selling software through cloud marketplaces like AWS.Topics Include:Jon Yoo is co-founder/CEO of Suger.Suger automates B2B marketplace workflows.Handles listing, contracts, offers, billing for marketplaces like AWS.Co-founder previously led Confluent's marketplace enablement product.Confluent had 40-50% revenue through cloud marketplaces.Required 10-20 engineers working solely on marketplace integration.Engineers prefer core product work over marketplace integration.Product/engineering leaders struggle with marketplace deployment requirements.Marketplace customers adopt without marketing, creating unexpected management needs.Version control is challenging for marketplace-deployed products.License management through marketplace creates engineering challenges.Suger helps sell, resell, co-sell through AWS Marketplace.Marketplace integration isn't one-time; requires ongoing maintenance.Business users constantly request marketplace automation features.Suger works with Snowflake, Intel, and AI startups.Data security concerns drive self-hosted AI deployments.AI products increasingly deploy via AMI/container solutions.AI products use usage-based pricing, not seat-based.Usage-based pricing creates complex billing challenges.AI products are tested at unprecedented rates.Two deployment options: vendor cloud or customer cloud.SaaS requires reporting usage to marketplace APIs.Customer-hosted deployment simplifies some billing aspects.Marketplaces need integration with ERP systems.Version control particularly challenging for AI products.Companies need automated updates for marketplace-deployed products.License management includes scaling up/down and expiration handling.Suger aims to integrate with GitHub for automatic updates.Participants:· Jon Yoo – CEO and Co-founder, SugerSee how Amazon Web Services gives you the freedom to migrate, innovate, and scale your software company at https://aws.amazon/isv/
AWS partners Braze, Qualtrics, and Tealium share strategies for marketplace success, vertical industry expansion, and generative AI integration that have driven significant business growth. Topics Include:Jason Warren introduces AWS Business Application Partnerships panel.Three key topics: Marketplace Strategy, Vertical Expansion, Gen-AI Integration.Alex Rees of Braze, Matthew Gray of Tealium, and Jason Mann of Qualtrics join discussion.Braze experienced triple-digit percentage growth through AWS Marketplace.Braze dedicating resources specifically to Marketplace procurement.Tealium accelerated deal velocity by listing on Marketplace.Tealium saw broader use case expansion with AWS co-selling.Qualtrics views Marketplace listing as earning a "diploma."Understanding AWS incentives and metrics is crucial.Knowing AWS "love language" helps partnership success.Braze saw transaction volume increase between Q1 and Q4.Aligning with industry verticals unlocked faster growth.Tealium sees bigger deals and faster close times.Tealium moved from transactional to strategic marketplace approach.Private offers work well for complex enterprise agreements.Qualtrics measures AWS partnership through "influence, intel, introductions."AWS relationships help navigate IT and procurement challenges.Propensity-to-buy data guides AWS engagement strategy.Marketplace strategy evolving with new capabilities and international expansion.Brazilian marketplace distribution reduces currency and tax challenges.Partnership evolution: sell first, then market, then co-innovate.Braze penetrated airline market through AWS Travel & Hospitality.RFP introductions show tangible partnership benefits.Tealium partnering with Virgin Australia and United Airlines.MUFG bank case study shows joint AWS-Tealium success.Qualtrics won awards despite not completing formal competencies.Focus on fewer verticals yields better results.Gen AI brings both opportunities and regulatory concerns.First-party data rights critical for AI implementation.AWS Bedrock integration provides security and prescriptive solutions.Participants:Alex Rees – Director Tech Partnerships, BrazeJason Mann – Global AWS Alliance Lead, QualtricsMatthew Gray - SVP, Partnerships & Alliances, TealiumJason Warren - Head of Business Applications ISV Partnerships (Americas), AWSSee how Amazon Web Services gives you the freedom to migrate, innovate, and scale your software company at https://aws.amazon/isv/
Web and Mobile App Development (Language Agnostic, and Based on Real-life experience!)
In this conversation, Krish Palaniappan discusses a critical issue encountered with an API on AWS Marketplace, where they were notified of a lack of usage requests due to an expired SSL certificate. He walks through the steps taken to identify the problem, including checking logs and understanding SSL certificate management. The conversation highlights the importance of proper certificate handling, the challenges faced during the resolution process, and the lessons learned from the experience. Snowpal Products Backends as Services on AWS Marketplace Mobile Apps on App Store and Play Store Web App Education Platform for Learners and Course Creators
Executives from DataRobot, LaunchDarkly and ServiceNow share strategies, actions and recommendations to achieve profitable growth in today's competitive SaaS landscape.Topics Include:Introduction of panelists from DataRobot, LaunchDarkly & ServiceNowServiceNow's journey from service management to workflow orchestration platform.DataRobot's evolution as comprehensive AI platform before AI boom.LaunchDarkly's focus on helping teams decouple release from deploy.Rule of 40: balancing revenue growth and profit margin.ServiceNow exceeding standards with Rule of 50-60 approach.Vertical markets expansion as key strategy for sustainable growth.AWS Marketplace enabling largest-ever deal for ServiceNow.R&D investment effectiveness through experimentation and feature management.Developer efficiency as driver of profitable SaaS growth.Competition through data-driven decisions rather than guesswork.Speed and iteration frequency determining competitive advantage in SaaS.Balancing innovation with early customer adoption for AI products.Product managers should adopt revenue goals and variable compensation.Product-led growth versus sales-led motion: strategies and frictions.Sales-led growth optimized for enterprise; PLG for practitioners.Marketplace-led growth as complementary go-to-market strategy.Customer acquisition cost (CAC) as primary driver of margin erosion.Pricing and packaging philosophy: platform versus consumption models.Value realization must precede pricing and packaging discussions.Good-better-best pricing model used by LaunchDarkly.Security as foundation of trust in software delivery.LaunchDarkly's Guardian Edition for high-risk software release scenarios.Security for regulated industries through public cloud partnerships.GenAI security: benchmarks, tests, and governance to prevent issues.M&A strategy: ServiceNow's 33 acquisitions for features, not revenue.Replatforming acquisitions into core architecture for consistent experience.Balancing technology integration with people aspects during acquisitions.Trends in buying groups: AI budgets and tool consolidation.Implementing revenue goals in product teams for new initiatives.Participants:Prajakta Damle – Head of Product / SVP of Product, DataRobotClaire Vo – Chief Product & Technology Officer, LaunchDarklyAnshuman Didwania – VP/GM, Hyperscalers Business Group, ServiceNowAkshay Patel – Global SaaS Strategist, Amazon Web ServicesSee how Amazon Web Services gives you the freedom to migrate, innovate, and scale your software company at https://aws.amazon/isv/
In this episode of Partnerships Unraveled, we sit down with Timm Hoyt, SVP of Global Partner Sales and Alliances at Sumo Logic, to break down why companies that prioritize a partner-first strategy see massive long-term success. - How to shift from a partner-convenient to a partner-centric model - Why a clear vision is critical for scaling global partnerships - The hard financial metrics that prove partner-led growth is the future - The role of AWS Marketplace in simplifying enterprise deals - How AI and solution providers are reshaping the channel Timm shares insights from his career-leading channel transformations at Sumo Logic, Druva, and PagerDuty. If you're a channel leader looking to scale faster and drive revenue through partnerships, this episode is packed with insights. Connect with Timm: https://www.linkedin.com/in/timm-hoyt-43a853a/ _________________________Learn more about Channext
How is IBM transforming Data and AI in partnership with AWS? I hosted Marcela Vairo, VP of Data & AI, Americas, IBM, on The Ravit Show at AWS re: Invent!We covered some exciting topics around Data Quality, Data Observability, and IBM's latest innovations in Data & AI. Marcela shared insights from her recent roundtable discussion at the event and highlighted how IBM's partnership with AWS is creating new opportunities for businesses leveraging cloud-based data solutions.Key highlights from our conversation:-- Takeaways from Marcela's roundtable discussion on the state of Data and AI-- Updates on the IBM + AWS partnership, including RDS for Db2 and expanded collaboration-- New capabilities in the AWS Marketplace, such as Databand for data observability and watsonx.data to enhance data management and AI readiness-- Marcela's vision for what's next in Data and AI for 2025This conversation puts light on how IBM is driving innovation and empowering organizations with cutting-edge Data and AI capabilities.#data #ai #awsreinvent #awsreinvent2024 #reinvent2024 #IBMPartner #ibm #theravitshow
Our 3 hosts take you through all the latest news PLUS a new "Top Stories" discussion! Chapters: 00:08 Top Stories 09:52 AWS Marketplace 10:17 Analytics 11:10 Application Integration 12:41 Artificial Intelligence 14:24 Compute 15:16 Customer Engagement 17:46 Databases 19:25 Developer Tools 20:38 End User Computing 21:29 Management and Governance 25:22 Media Services 25:42 Security, Identity and Compliance Shownotes: https://d29iemol7wxagg.cloudfront.net/705ExtendedShownotes.html
Join Jon Myer and CloudSmart's GM Tre Vance for an in-depth discussion about marketplace-led growth strategies and maximizing AWS Marketplace success. Learn how CloudSmart Insights is helping partners optimize their marketplace presence, drive customer acquisition, and leverage data-driven insights for better business outcomes. Plus, discover the power of AWS Marketplace integration and how it's revolutionizing software procurement. Key Takeaways:
Join Jon Myer and CloudSmart's GM Tre Vance for an in-depth discussion about marketplace-led growth strategies and maximizing AWS Marketplace success. Learn how CloudSmart Insights is helping partners optimize their marketplace presence, drive customer acquisition, and leverage data-driven insights for better business outcomes. Plus, discover the power of AWS Marketplace integration and how it's revolutionizing software procurement. Key Takeaways:
What does the future have in store for cloud marketplaces and their essential role for hyperscalers? Matt Yanchyshyn, VP of Marketplace & Partner Services at AWS, joined host Dion Hinchcliffe on this episode of Six Five On The Road at AWS re:Invent to discuss the rapidly evolving world of cloud marketplaces. Key topics covered: The unprecedented growth of cloud marketplaces and the forces shaping their trajectory AWS Marketplace's strategic vision and its roadmap for the future The revolutionary "Buy with AWS" experience and its implications for commerce How GenAI is reshaping operations and creating new opportunities
On this week's episode of The Retail Tea Break podcast I'm joined by a product experience specialist. Andy Tyra is the Chief Product Officer at Akeneo. He strives to deliver product information management solutions for brands and retailers contributing to a seamless and positive customer journey.Andy was previously a founding team member on AmazonFresh and AWS Marketplace, building these businesses to materiality from the very beginning. He has a phenomenal amount of experience- so this is an episode not to miss… Grab your cup of tea, sit back and listen in!Topics: The product experience company, what does that mean?What comes first; the data or the customer? Barriers retailers are facing, but they don't have to do it aloneReducing returns and support the customer journey Regulations: preparing for the digital product passportFantastic case study from Swedish clothing brand AsketWhat's coming up next for the businessFor more information visit: https://www.akeneo.com/Transcription and show notes available at: https://theretailadvisor.ie/
ServiceNow (NYSE: NOW), the AI platform for business transformation, and Amazon Web Services (AWS) has announced an expanded strategic collaboration with new capabilities to accelerate AI-driven business transformation across every corner of the enterprise. A new connector enables the seamless use of multimodal models developed and trained on Amazon Bedrock for GenAI-powered workflows in the Now Platform. Additional automation solutions and integrations to seamlessly manage security incidents and procurement are now available on the AWS Marketplace. By deepening its collaboration with AWS and expanding geographically to Canada and Europe expected in 2025, the companies are supercharging value to customers across key industries, including telco, technology, financial services, education, and retail. Connecting Amazon Bedrock models to ServiceNow helps enterprises boost the development and deployment of GenAI solutions. The new connector allows customers to connect seamlessly to their choice of third-party models, based on their specific workflow needs, such as summarisation, advanced analytics, or code generation. Data remains private and secure through ServiceNow and AWS, and customers can set up the integration quickly and easily. "Our partnership with AWS is accelerating business transformation for our joint customers," said Paul Fipps, president of Strategic Accounts at ServiceNow. "More than ever before, organisations demand integrated, end-to-end solutions that enhance user experiences and optimise technology investments. Together, ServiceNow's GenAI workflows and AWS's next-gen cloud capabilities deliver on that promise." "We are committed to empowering our customers with the industry's best tools and resources by leveraging AWS Marketplace to build, deploy and scale GenAI," said Chris Grusz, Managing Director, Technology Partnerships, AWS. "Working with ServiceNow, we're helping our enterprise customers accelerate GenAI deployments and get the most value out of their cloud investments." Expanding integrated solutions, now available in AWS Marketplace ServiceNow is also announcing the availability of new solutions in AWS Marketplace, a digital catalog with thousands of software listings from independent software vendors that make it easy to find, test, buy, and deploy software that runs on AWS. ServiceNow Security Incident Response integration with AWS Security Hub: AWS Marketplace - Uses security findings from AWS Security Hub to automate the creation of security incidents in SecOps on the Now Platform, often resulting in faster, more efficient incident response and remediation. Resolved incidents and findings will then automatically be updated in AWS Security Hub. Integration with Amazon Business Procurement - Integrates Amazon Business procurement with ServiceNow Procurement Operations to enable greater visibility into approved suppliers, purchase requests, changes to prices, order confirmation, and shipping notifications. This streamlines the approval and onboarding of Amazon Business as a supplier for the enterprise and provides built-in governance and procurement policies for Now Platform users. Accelerating business outcomes, maximising cloud investment This announcement builds on ServiceNow and AWS's continued collaboration, bringing the advanced cloud capabilities of AWS to the innovative solutions on the Now Platform, helping customers accelerate business outcomes, realise cloud value, enhance digital experiences, and reimagine GenAI-powered workflows. A range of global enterprise customers, including Bell Canada, Boomi and Pearson are already seeing remarkable value and significant cost savings. Bell Canada "ServiceNow has become a cornerstone of Bell Canada's enterprise services strategy to streamline and enhance end-to-end processes," said John Watson, President, Bell Business Markets, AI and FX Innovation. "By harnessing the Now Platform's advanced automation and AI capabilities powered by AWS, we are dr...
AWS Morning Brief for the week of November 25, with Corey Quinn. Links:Enhanced account linking experience across AWS Marketplace and AWS Partner CentralAmazon API Gateway now supports Custom Domain Name for private REST APIsAmazon Aurora Serverless v2 supports scaling to zero capacityAmazon CloudFront now supports Anycast Static IPsAmazon CloudFront now supports additional log formats and destinations for access logsAmazon CloudFront announces VPC originsAmazon CloudWatch launches full visibility into application transactionsAmazon EC2 now provides lineage information for your AMIsAmazon Q Developer in the AWS Management Console now uses the service you're viewing as context for your chatAmazon WorkSpaces introduces support for Rocky LinuxAWS App Studio is now generally availableAWS CloudTrail Lake launches enhanced analytics and cross-account data accessAWS Compute Optimizer now supports rightsizing recommendations for Amazon AuroraAWS Elastic Beanstalk adds support for Node.js 22AWS Lambda supports Amazon S3 as a failed-event destination for asynchronous and stream event sourcesIntroducing an AWS Management Console Visual Update (Preview)The new AWS Systems Manager experience: Simplifying node managementAWS announces Block Public Access for Amazon Virtual Private CloudLoad Balancer Capacity Unit Reservation for Application and Network Load BalancersAnnouncing Idle Recommendations in AWS Compute OptimizerAnnouncing Savings Plans Purchase AnalyzerAWS Lambda turns ten – looking back and looking aheadBoost Engagement with AWS and Amazon AdsBuild fullstack AI apps in minutes with the new Amplify AI KitImportant changes to CloudTrail events for AWS IAM Identity CenterFollow Corey on BlueSky!Follow Last Week In AWS on BlueSky!
This week on DisrupTV, we interviewed Matt Yanchyshyn, VP, Amazon Web Services (AWS) Marketplace and Partner Services and John Nosta, President at NostaLab. Matt highlighted the growth of AWS Marketplace, which now offers over 42,000 products with ROIs over 200-300%. He emphasized the importance of speed and innovation, noting that AI and large language models (LLMs) are transforming procurement by offering sophisticated recommendations and automated validation. John discussed the cognitive age, where LLMs unlock thoughts, enhancing human cognition and creativity. He stressed the need for businesses to adapt quickly to exponential change, emphasizing continuous learning and ethical stewardship of knowledge. DisrupTV is a weekly podcast with hosts R "Ray" Wang and Vala Afshar. The show airs live at 11:00 a.m. PT/ 2:00 p.m. ET every Friday. Brought to you by Constellation Executive Network: constellationr.com/CEN.
Oracle and Amazon Web Services, Inc. (AWS) have announced the launch of Oracle Database@AWS, a new offering that allows customers to access Oracle Autonomous Database on dedicated infrastructure and Oracle Exadata Database Service within AWS. Oracle Database@AWS will provide customers with a unified experience between Oracle Cloud Infrastructure (OCI) and AWS, offering simplified database administration, billing, and unified customer support. In addition, customers will have the ability to seamlessly connect enterprise data in their Oracle Database to applications running on Amazon Elastic Compute Cloud (Amazon EC2), AWS Analytics services, or AWS's advanced artificial intelligence (AI) and machine learning (ML) services, including Amazon Bedrock. With direct access to Oracle Exadata Database Service on AWS, including Oracle Autonomous Database on dedicated infrastructure and workloads running on Oracle Real Application Clusters (RAC), Oracle Database@AWS allows customers to bring together all of their enterprise data to drive breakthrough innovation. The new offering provides a low latency network connection between Oracle databases and applications on AWS. This allows customers to benefit from Oracle Autonomous Database, a fully automated and managed Oracle Database service, and the performance, availability, security, and cost-effectiveness of Oracle Exadata Database Service, while enjoying the security, agility, flexibility, and sustainability benefits provided by AWS. "We are seeing huge demand from customers that want to use multiple clouds," said Larry Ellison, Oracle Chairman and CTO. "To meet this demand and give customers the choice and flexibility they want, Amazon and Oracle are seamlessly connecting AWS services with the very latest Oracle Database technology, including the Oracle Autonomous Database. With Oracle Cloud Infrastructure deployed inside of AWS datacenters, we can provide customers with the best possible database and network performance." "As far back as 2008, customers could run their Oracle workloads in the cloud, and since then, many of the world's largest and most security sensitive organisations have chosen to deploy their Oracle software on AWS," said Matt Garman, CEO at AWS. "This new, deeper partnership will provide Oracle Database services within AWS to allow customers to take advantage of the flexibility, reliability, and scalability of the world's most widely adopted cloud alongside enterprise software they rely on." Oracle Database@AWS will help customers continue to accelerate their migration to the cloud and modernise their IT environments. Customers can also benefit from: Zero-ETL integration between Oracle Database services and AWS Analytics services. Customers will be able to seamlessly and securely connect and analyse data across Oracle Database services and applications they already have running on AWS to get faster, deeper insights without having to build pipelines. Flexible options to simplify and accelerate migrating their Oracle databases to the cloud, including compatibility with proven migration tools such as Oracle Zero Downtime Migration. A simplified procurement experience via AWS Marketplace that enables customers to purchase Oracle Database services using their existing AWS commitments and use their existing Oracle license benefits, including Bring Your Own License (BYOL) and discount programs such as Oracle Support Rewards (OSR). A fully unified support experience from both AWS and Oracle as well as guidance through reference architectures, landing zones, and other collateral for customers to successfully build and run their most trusted enterprise applications in the cloud. Seamless integration with Amazon Simple Storage Service (Amazon S3) for an easy and secure way to perform database backups and restoration, and to aid with disaster recovery. Customers are able to easily launch their Oracle Database@AWS experience using familiar tools such as the AWS Management Con...
Jillian speaks with Danielle Rios, Acting CEO of Totogi . Hear how Totogi, a vertical SaaS company in the telecom industry, onboarded 23 million new customers in 18 days using AWS. Totogi https://totogi.com/charging-as-a-service/charging-as-a-service-on-aws/ Totogi on the AWS Marketplace: https://aws.amazon.com/marketplace/pp/prodview-hnaq62hzpxane?sr=0-2&ref_=beagle&applicationId=AWSMPContessa AWS for Telco: https://aws.amazon.com/telecom/
Web and Mobile App Development (Language Agnostic, and Based on Real-life experience!)
In this podcast episode, Krish discusses recent changes made to their mobile app built on Flutter. They highlight the challenges faced when upgrading dependencies and dealing with breaking changes. He shares their experience with dependency conflicts and the need to update dependencies incrementally. He also discuss the changes in Facebook login and the introduction of limited login. Krish provides insights into debugging and finding solutions to these challenges. He concludes by mentioning their exploration of AI technologies and the availability of their APIs on AWS Marketplace. Takeaways Regularly upgrading dependencies in a mobile app is important to avoid dependency issues and breaking changes. Handling dependency conflicts can be challenging, especially when using third-party plugins and libraries. Changes in third-party plugins, like Facebook login, can introduce new features and limitations that need to be accounted for. Debugging and finding solutions to issues with upgrades and changes require thorough investigation and sometimes trial and error. Exploring AI technologies and leveraging existing APIs can save time and effort in software development. Chapters 00:00 Introduction and Apologies for the Delay 04:12 Handling Dependency Upgrades and Breaking Changes 08:38 Challenges with Facebook Login and Limited Login 13:04 Debugging and Finding Solutions to Issues 15:28 Importance of Keeping Up to Date with Software Changes 18:16 Exploring AI Technologies and APIs 29:20 Future Topics and Conclusion
SummaryIn this episode of What's New in Cloud FinOps, Frank and Stephen discuss a wide range of cloud-related news and updates. They cover topics such as Azure VM hibernation, Azure Compute Fleet, Google Cloud TPU, Amazon EC2 C7i Flex, DynamoDB, AWS Marketplace, Cloud Run, and more. The conversation also delves into the complexities of cloud pricing, energy progress, and the impact of cloud technology on businesses.
Jillian and Shruti take you though all the latest and greatest AWS news. Chapters: 00:39 AWS Marketplace; 00:58 Analytics; 02:23 Application Integration; 03:07 Compute; 05:43 Databases; 06:35 Developer Tools; 07:06 End User Computing; 07:21 Front End Web & Mobile; 07:39 Internet of Things; 07:54 Machine Learning; 13:00 Management & Governance; 16:56 Media Services; 18:10 Migration and Transfer; 20:00 Networking & Content Delivery; 23:29 Quantum Technologies; 23:44 Security, Identity & Compliance; 26:25 Solutions; 26:39 Storage.
We are excited to launch our first AWS episode in years. Join us as we delve into marketplace development and partnership success at AWS with our special guest, Josh Greene. Josh is responsible for Marketplace Success at AWS and brings over 20 years of experience in startups, channel partners, and ISVs to this role. In this episode, Josh brings valuable insights into driving revenue generation, strategic partnerships, and global expansion, decoding AWS Marketplace Success, and how partners achieved billions last year. Josh shares his expertise as the Sr. Manager of Global Seller Innovation/MP Development at AWS. Discover how AWS and partners leverage cloud commitments for partner success, the strategies behind AWS Marketplace growth, and the pivotal role of partner engineering in streamlining the sales process. Tune in to learn the importance of collaboration, innovation, and customer obsession in achieving success with AWS Marketplace. Don't miss out on actionable insights to scale your business and drive innovation with AWS.
Today we welcome Ruba Borno VP, Worldwide Channels & Alliances for Amazon Web Services (AWS). In this role, Ruba leads the AWS Partner Organization (APO), a global team responsible for recruiting, enabling, buying and selling with over 100,000 companies in more than 150 countries. Ruba also leads AWS Marketplace where AWS partners can sell their services and solutions to customers. Prior to joining AWS, Ruba held several leadership positions over the course of her six-year tenure with Cisco, a leader in IP-based networking technologies: Vice President, Growth Initiatives and Chief of Staff to the CEO; Vice President and General Manager, Cisco Managed Services (2018-2020); and SVP and General Manager of Cisco's Global Customer Experience (CX) Centers (2020-2021) Before Cisco, Ruba was a principal with The Boston Consulting Group (BCG) and a leader in BCG's Technology, Media, & Telecommunications, and People & Organization practices. She is a member of The Forum of Young Global Leaders, an organization that seeks to drive public-private cooperation in the global public interest in alignment with the World Economic Forum. She also previously served on the Board of Directors at Experian. Ruba holds a Ph.D. and a Master of Science in Electrical Engineering from the University of Michigan, and has a Bachelor of Science in Computer Engineering from the University of North Carolina at Charlotte.
Andreas and Michael are sharing their learning while building on AWS. This episode is about automating the release process of bucketAV, a software product sold on the AWS Marketplace. Besides that, Andreas and Michael discuss how to reduce costs for GitHub Actions. Last but not least, Andreas asks Michael about his thoughts on the latest AWS announcements.
An Assessment of Key 5G-IoT Alliance Developments Including Dell Nokia P5G Use Case Focus, Intel Enlisting P5G Partners, and Vonage AWS Network API Push In this episode of The 5G Factor, our series that focuses on all things 5G, the IoT, and the ecosystem as a whole, we look at the major 5G ecosystem moves and what's going on that caught my eye in the lead up to Mobile World Congress 2024. Key alliance developments include Dell and Nokia teaming up to advance private 5G network use cases and network cloud transformation, Intel enlisting an array of major partners, such as Cisco, NTT Data, AWS, Ericsson, and , to accelerated private 5G network adoption by businesses, and Ericsson's Vonage and AWS bringing together Vonage's platform - based on communications Application Programming Interfaces (APIs) and network APIs - Ericsson's 5G network capabilities and AWS services to catalyze the availability of new solutions to millions of AWS developers through AWS Marketplace. My analytical review highlighted: Dell and Nokia Commit to Advancing Network Cloud Transformation and P5G Use Cases. Dell Technologies and Nokia announced the extension of a strategic partnership to use each company's expertise and solutions, including infrastructure solutions from Dell and private wireless connectivity from Nokia, to advance open network architectures in the telecom ecosystem and private 5G use cases among businesses. As part of the agreement, Nokia will adopt Dell as its preferred infrastructure partner for existing Nokia AirFrame customers, offering Dell's technology as the infrastructure of choice for telecom cloud deployments. I delve into why the extend alliance is a sales and marketing boost for both companies as Dell gains valuable mind share and presence during the early stages of private 5G adoption and Nokia taps into Dell's extensive hybrid cloud channels as Nokia Digital Automation Cloud private wireless solution becomes Dell's preferred private wireless platform for enterprise customers' edge use cases. Intel Ready to Power Private 5G with Partners. Intel is teaming with partners to further enable and grow the private network market. Intel-powered private 5G solutions are deployed globally with major partners such as Cisco, NTT Data, AWS, Ericsson, and Nokia. Key partnerships include Cisco and NTT Data collaborating to transform RAI Amsterdam into the first smart venue in Europe, Aramco Digital, part of the world's largest energy company Aramco, in collaboration with Intel, is developing private 5G for the industrial sector, powered by Intel-based open RAN technology, and the addition of Amdocs as a system integrator for the Integrated Private Wireless (IPW) on AWS program. Through the program, AWS customers can access Amdocs Mobile Private Network (MPN) services, and the infrastructure, built on AWS Outposts servers, which is powered by Intel Xeon processors. I examine why Private 5G networks are in high demand in 2024 and how Intel and partners are well-positioned as enterprises look for scalable compute solutions to power the next wave of AI applications running at the edge that drive their digital transformation missions and improvement of business outcomes. Ericsson and AWS Ready to Unleash Communications and Network APIs. The collaboration between Ericsson-owned Vonage and Amazon Web Services (AWS) will bring together Vonage's platform - based on communications APIs and network APIs - Ericsson's 5G network capabilities and AWS services. The collaboration aims to accelerate the availability of new solutions to millions of AWS developers through AWS Marketplace. I assess why network APIs are essential for exposing new capabilities from within the 5G network that have never been available before, allowing existing applications to be enhanced with network information and enabling the development of a new class of applications. The alliance also aligns with the recently released Ericsson Mobility Report Business Review 2024 that cites programmable networks (network APIs) as integral to driving innovation and ecosystem growth by offering access to new value opportunities, allowing application developers to innovate on a large scale.
Jillian takes you through all the latest AWS updates! Chapters: 00:48 AWS Marketplace; 01:25 Analytics; 03:04 Application Integration; 03:41 Compute; 08:19 Customer Engagement; 09:13 Databases; 11:39 Developer Tools; 12:41 Front End Web and Mobile; 13:28 Machine Learning; 13:55 Management & Governance; 15:44 Media Services; 15:52 Migration & Transfer; 17:07 Networking & Content Delivery; 18:48 Security, Identity, & Compliance; 19:34 Storage; 20:46 Supply Chain
Jillian and Shruti take you through all the updates since re:Invent! AWS Marketplace (01:01); Analytics (01:32); Application Integration (08:04); Compute (09:33); Customer Engagement (22:45); Databases (25:37); Developer Tools (31:39); End User Computing (34:28); Front End Web and Mobile (34:51); Internet of Things (35:13); Machine Learning (36:07); Management & Governance (39:35); Media Services (48:06); Migration & Transfer (48:42); Networking & Content Delivery (49:52); Satellite (51:36); Security, Identity, & Compliance (51:52); Storage (54:22)
Maya Kaczorowski, Chief Product Officer at Tailscale, joins Corey on Screaming in the Cloud to discuss what sets the Tailscale product approach apart, for users of their free tier all the way to enterprise. Maya shares insight on how she evaluates feature requests, and how Tailscale's unique architecture sets them apart from competitors. Maya and Corey discuss the importance of transparency when building trust in security, as well as Tailscale's approach to new feature roll-outs and change management.About MayaMaya is the Chief Product Officer at Tailscale, providing secure networking for the long tail. She was mostly recently at GitHub in software supply chain security, and previously at Google working on container security, encryption at rest and encryption key management. Prior to Google, she was an Engagement Manager at McKinsey & Company, working in IT security for large enterprises.Maya completed her Master's in mathematics focusing on cryptography and game theory. She is bilingual in English and French.Outside of work, Maya is passionate about ice cream, puzzling, running, and reading nonfiction.Links Referenced: Tailscale: https://tailscale.com/ Tailscale features: VS Code extension: https://marketplace.visualstudio.com/items?itemName=tailscale.vscode-tailscale Tailscale SSH: https://tailscale.com/kb/1193/tailscale-ssh Tailnet lock: https://tailscale.com/kb/1226/tailnet-lock Auto updates: https://tailscale.com/kb/1067/update#auto-updates ACL tests: https://tailscale.com/kb/1018/acls#tests Kubernetes operator: https://tailscale.com/kb/1236/kubernetes-operator Log streaming: https://tailscale.com/kb/1255/log-streaming Tailscale Security Bulletins: https://tailscale.com/security-bulletins Blog post “How Our Free Plan Stays Free:” https://tailscale.com/blog/free-plan Tailscale on AWS Marketplace: https://aws.amazon.com/marketplace/pp/prodview-nd5zazsgvu6e6 TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn, and I am joined today on this promoted guest episode by my friends over at Tailscale. They have long been one of my favorite products just because it has dramatically changed the way that I interact with computers, which really should be enough to terrify anyone. My guest today is Maya Kaczorowski, Chief Product Officer at Tailscale. Maya, thanks for joining me.Maya: Thank you so much for having me.Corey: I have to say originally, I was a little surprised to—“Really? You're the CPO? I really thought I would have remembered that from the last time we hung out in person.” So, congratulations on the promotion.Maya: Thank you so much. Yeah, it's exciting.Corey: Being a product person is probably a great place to start with this because we've had a number of conversations, here and otherwise, around what Tailscale is and why it's awesome. I don't necessarily know that beating the drum of why it's so awesome is going to be covering new ground, but I'm sure we're going to come up for that during the conversation. Instead, I'd like to start by talking to you about just what a product person does in the context of building something that is incredibly central not just to critical path, but also has massive security ramifications as well, when positioning something that you're building for the enterprise. It's a very hard confluence of problems, and there are days I am astonished that enterprises can get things done based purely upon so much of the mitigation of what has to happen. Tell me about that. How do you even function given the tremendous vulnerability of the attack surface you're protecting?Maya: Yeah, I don't know if you—I feel like you're talking about the product, but also the sales cycle of talking [laugh] and working with enterprise customers.Corey: The product, the sales cycle, the marketing aspects of it, and—Maya: All of it.Corey: —it all ties together. It's different facets of frankly, the same problem.Maya: Yeah. I think that ultimately, this is about really understanding who the customer that is buying the product is. And I really mean that, like, buying the product, right? Because, like, look at something like Tailscale. We're typically used by engineers, or infrastructure teams in an organization, but the buyer might be the VP of Engineering, but it might be the CISO, or the CTO, or whatever, and they're going to have a set of requirements that's going to be very different from what the end-user has as a set of requirements, so even if you have something like bottom-up adoption, in our case, like, understanding and making sure we're checking all the boxes that somebody needs to actually bring us to work.Enterprises are incredibly demanding, and to your point, have long checklists of what they need as part of an RFP or that kind of thing. I find that some of the strictest requirements tend to be in security. So like, how—to your point—if we're such a critical part of your network, how are you sure that we're always available, or how are you sure that if we're compromised, you're not compromised, and providing a lot of, like, assurances and controls around making sure that that's not the case.Corey: I think that there's a challenge in that what enterprise means to different people can be wildly divergent. I originally came from the school of obnoxious engineering where oh, as an engineer, whenever I say something is enterprise grade, that's not a compliment. That means it's going to be slow and moribund. But that is a natural consequence of a company's growth after achieving success, where okay, now we have actual obligations to customers and risk mitigation that needs to be addressed. And how do you wind up doing that without completely hobbling yourself when it comes to accelerating feature velocity? It's a very delicate balancing act.Maya: Yeah, for sure. And I think you need to balance, to your point, kind of creating demand for the product—like, it's actually solving the problem that the customer has—versus checking boxes. Like, I think about them as features, or you know, feature requests versus feature blockers or deal blockers or adoption blockers. So, somebody wants to, say, connect to an AWS VPC, but then the person who has to make sure that that's actually rolled out properly also wants audit logs and SSH session recording and RBAC-based controls and lots of other things before they're comfortable deploying that in their environment. And I'm not even talking about the list of, you know, legal, kind of, TOS requirements that they would have for that kind of situation.I think there's a couple of things that you need to do to even signal that you're in that space. One of the things that I was—I was talking to a friend of mine the other day how it feels like five years ago, like, nobody had SOC 2 reports, or very few startups had SOC 2 reports. And it's probably because of the advent of some of these other companies in this space, but like, now you can kind of throw a dart, and you'll hit five startups that have SOC 2 reports, and the amount that you need to show that you're ready to sell to these companies has changed.Corey: I think that there's a definite broadening of the use case. And I've been trying to avoid it, but let's go diving right into it. I used to view Tailscale as, oh it's a VPN. The end. Then it became something more where it effectively became the mesh overlay where all of the various things that I have that speak Tailscale—which is frankly, a disturbing number of things that I'd previously considered to be appliances—all talk to one another over a dedicated network, and as a result, can do really neat things where I don't have to spend hours on end configuring weird firewall rules.It's more secure, it's a lot simpler, and it seems like every time I get that understanding down, you folks do something that causes me to yet again reevaluate where you stand. Most recently, I was doing something horrifying in front-end work, and in VS Code the Tailscale extension popped up. “Oh, it looks like you're running a local development server. Would you like to use Tailscale Funnel to make it available to the internet?” And my response to that is, “Good lord, no, I'm ashamed of it, but thanks for asking.” Every time I think I get it, I have to reevaluate where it stands in the ecosystem. What is Tailscale now? I feel like I should get the official description of what you are.Maya: Well, I sure hope I'm not the official description. I think the closest is a little bit of what you're saying: a mesh overlay network for your infrastructure, or a programmable network that lets you mesh together your users and services and services and services, no matter where they are, including across different infrastructure providers and, to your point, on a long list of devices you might have running. People are running Tailscale on self-driving cars, on robots, on satellites, on elevators, but they're also running Tailscale on Linux running in AWS or a MacBook they have sitting under their desk or whatever it happens to be. The phrase that I like to use for that is, like, infrastructure agnostic. We're just a building block.Your infrastructure can be whatever infrastructure you want. You can have the cheapest GPUs from this cloud, or you can use the Android phone to train the model that you have sitting on your desk. We just help you connect all that stuff together so you can build your own cloud whatever way you want. To your point, that's not really a VPN [laugh]. The word VPN doesn't quite do it justice. For the remote access to prod use case, so like a user, specifically, like, a developer infra team to a production network, that probably looks the most like a zero-trust solution, but we kind of blur a lot of the lines there for what we can do.Corey: Yeah, just looking at it, at the moment, I have a bunch of Raspberries Pi, perhaps, hanging out on my tailnet. I have currently 14 machines on there, I have my NAS downstairs, I have a couple of EC2 instances, a Google Cloud instance, somewhere, I finally shut down my old Oracle Cloud instance, my pfSense box speaks it natively. I have a Thinkst Canary hanging out on there to detect if anything starts going ridiculously weird, my phone, my iPad, and a few other things here and there. And they all just talk seamlessly over the same network. I can identify them via either IP address, if I'm old, or via DNS if I want to introduce problems that will surprise me at one point or another down the road.I mean, I even have an exit node I share with my brother's Tailscale account for reasons that most people would not expect, namely that he is an American who lives abroad. So, many weird services like banks or whatnot, “Oh, you can't log in to check your bank unless you're coming from US IP space.” He clicks a button, boom, now he doesn't get yelled at to check his own accounts. Which is probably not the primary use case you'd slap on your website, but it's one of those solving everyday things in somewhat weird ways.Maya: Oh, yeah. I worked at a bank maybe ten years ago, and they would block—this little bank on the east coast of the US—they would block connections from Hawaii because why would any of your customers ever be in Hawaii? And it was like, people travel and maybe you're—Corey: How can you be in Hawaii? You don't have a passport.Maya: [laugh]. People travel. They still need to do banking. Like, it doesn't change, yeah. The internet, we've built a lot of weird controls that are IP-based, that don't really make any sense, that aren't reflective. And like, that's true for individuals—like you're describing, people who travel and need to bank or whatever they need to do when they travel—and for corporations, right? Like the old concept—this is all back to the zero trust stuff—but like, the old concept that you were trusted just because you had an IP address that was in the corp IP range is just not true anymore, right? Somebody can walk into your office and connect to the Wi-Fi and a legitimate employee can be doing their job from home or from Starbucks, right? Those are acceptable ways to work nowadays.Corey: One other thing that I wanted to talk about is, I know that in previous discussions with you folks—sometimes on the podcast sometimes when I more or less corner someone a Tailscale at your developer conference—one of the things that you folks talk about is Tailscale SSH, which is effectively a drop-in replacement for the SSH binary on systems. Full disclosure, I don't use it, mostly because I'm grumpy and I'm old. I also like having some form of separation of duties where you're the network that ties it all together, but something else winds up acting as that authentication step. That said, if I were that interesting that someone wanted to come after me, there are easier ways to get in, so I'm mostly just doing this because I'm persnickety. Are you seeing significant adoption of Tailscale SSH?Maya: I think there's a couple of features that are missing in Tailscale SSH for it to be as adopted by people like you. The main one that I would say is—so right now if you use Tailscale SSH, it runs a binary on the host, you can use your Tailscale credentials, and your Tailscale private key, effectively, to SSH something else. So, you don't have to manage a separate set of SSH keys or certs or whatever it is you want to do to manage that in your network. Your identity provider identity is tied to Tailscale, and then when you connect to that device, we still need to have an identity on the host itself, like in Unix. Right now, that's not tied to Tailscale. You can adopt an identity of something else that's already on the host, but it's not, like, corey@machine.And I think that's the number one request that we're getting for Tailscale SSH, to be able to actually generate or tie to the individual users on the host for an identity that comes from, like, Google, or GitHub, or Okta, or something like that. I'm not hearing a lot of feedback on the security concerns that you're expressing. I think part of that is that we've done a lot of work around security in general so that you feel like if Tailscale were to be compromised, your network wouldn't need to be compromised. So, Tailscale itself is end-to-end encrypted using WireGuard. We only see your public keys; the private keys remain on the device.So, in some sense the, like, quote-unquote, “Worst” that we could do would be to add a node to your network and then start to generate traffic from that or, like, mess with the configuration of your network. These are questions that have come up. In terms of adding nodes to your network, we have a feature called tailnet lock that effectively lets you sign and verify that all the nodes on your network are supposed to be there. One of the other concerns that I've heard come up is, like, what if the binary was compromised. We develop in open-source so you can see that that's the case, but like, you know, there's certainly more stuff we could be doing there to prevent, for example, like a software supply chain security attack. Yeah.Corey: Yeah, but you also have taken significant architectural steps to ensure that you are not placed in a position of undue trust around a lot of these things. Most recently, you raised a Series B, that was $100 million, and the fact that you have not gone bankrupt in the year since that happened tells me that you are very clearly not routing all customer traffic through you folks, at least on one of the major cloud providers. And in fact, a little bit of playing a-slap-and-tickle with Wireshark affirm this, that the nodes talk to each other; they do not route their traffic through you folks, by design. So one, great for the budget, I have respect for that data transfer pattern, but also it means that you are in the position of being a global observer in a way that can be, in many cases, exploited.Maya: I think that's absolutely correct. So, it was 18 months ago or so that we raised our Series B. When you use Tailscale, your traffic connects peer-to-peer directly between nodes on your network. And that has a couple of nice properties, some of what you just described, which is that we don't see your traffic. I mean, one, because it's end-to-end encrypted, but even if we could capture it, and then—we're not in the way of capturing it, let alone decrypting it.Another nice property it has is just, like, latency, right? If your user is in the UK, and they're trying to access something in Scotland, it's not, you know, hair-pinning, bouncing all the way to the West Coast or something like that. It doesn't have to go through one of our servers to get there. Another nice property that comes with that is availability. So, if our network goes down, if our control plane goes down, you're temporarily not able to add nodes or change your configuration, but everything in your network can still connect to each other, so you're not dependent on us being online in order for your network to work.And this is actually coming up more and more in customer conversations where that's a differentiator for us versus a competitor. Different competitors, also. There's a customer case study on our website about somebody who was POC'ing us with a different option, and literally during the POC, the competitor had an outage, unfortunately for them, and we didn't, and they sort of looked at our model, our deployment model and went, “Huh, this really matters to us.” And not having an outage on our network with this solution seems like a better option.Corey: Yeah, when the network is down, the computers all turn into basically space heaters.Maya: [laugh]. Yeah, as long as they're not down because, I guess, unplugged or something. But yeah, [laugh] I completely agree. Yeah. But I think there's a couple of these kinds of, like, enterprise things that people are—we're starting to do a better job of explaining and meeting customers where they are, but it's also people are realizing actually does matter when you're deploying something at this scale that's such a key part of your network.So, we talked a bit about availability, we talked a bit about things like latency. On the security side, there's a lot that we've done around, like I said, tailnet lock or that type of thing, but it's like some of the basic security features. Like, when I joined Tailscale, probably the first thing I shipped in some sense as a PM was a change log. Here's the change log of everything that we're shipping as part of these releases so that you can have confidence that we're telling you what's going on in your network, when new features are coming out, and you can trust us to be part of your network, to be part of your infrastructure.Corey: I do want to further call out that you have a—how should I frame this—a typically active security notification page.Maya: [laugh].Corey: And I think it is easy to misconstrue that as look at how terrifyingly insecure this is? Having read through it, I would argue that it is not that you are surprisingly insecure, but rather that you are extraordinarily transparent about things that are relatively minor issues. And yes, they should get fixed, but, “Oh, that could be a problem if six other things happen to fall into place just the right way.” These are not security issues of the type, “Yeah, so it turns out that what we thought was encrypting actually wasn't and we're just expensive telnet.” No, there's none of that going on.It's all been relatively esoteric stuff, but you also address it very quickly. And that is odd, as someone who has watched too many enterprise-facing companies respond to third-party vulnerability reports with rather than fixing the problem, more or less trying to get them not to talk about it, or if they do, to talk about it only using approved language. I don't see any signs of that with what you've done there. Was that a challenging internal struggle for you to pull off?Maya: I think internally, it was recognizing that security was such an important part of our value proposition that we had to be transparent. But once we kind of got past that initial hump, we've been extremely transparent, as you say. We think we can build trust through transparency, and that's the most important thing in how we respond to security incidents. But code is going to have bugs. It's going to have security bugs. There's nothing you can do to prevent that from happening.What matters is how you—and like, you should. Like, you should try to catch them early in the development process and, you know, shift left and all that kind of stuff, but some things are always going to happen [laugh] and what matters in that case is how you respond to them. And having another, you know, an app update that just says “Bug fixes” doesn't help you figure out whether or not you should actually update, it doesn't actually help you trust us. And so, being as public and as transparent as possible about what's actually happening, and when we respond to security issues and how we respond to security issues is really, really important to us. We have a policy that talks about when we will publish a bulletin.You can subscribe to our bulletins. We'll proactively email anyone who has a security contact on file, or alternatively, another contact that we have if you haven't provided us a security contact when you're subject to an issue. I think by far and large, like, Tailscale has more security bulletins just because we're transparent about them. It's like, we probably have as many bugs as anybody else does. We're just lucky that people report them to us because they see us react to them so quickly, and then we're able to fix them, right? It's a net positive for everyone involved.Corey: It's one of those hard problems to solve for across the board, just because I've seen companies in the past get more or less brutalized by the tech press when they have been overly transparent. I remember that there was a Reuters article years ago about Slack, for example, because they would pull up their status history and say, “Oh, look at all of these issues here. You folks can't keep your website up.” But no, a lot of it was like, “Oh, file uploads for a small subset of our users is causing a problem,” and so on and so forth. These relatively minor issues that, in aggregate, are very hard to represent when you're using traffic light signaling.So, then you see people effectively going full-on AWS status page where there's a significant outage lasting over a day, last month, and what you see on this is if you go really looking for it is this yellow thing buried in his absolute sea of green lights, even though that was one of the more disruptive things to have happened this year. So, it's a consistent and constant balance, and I really have a lot of empathy no matter where you wind up landing on that?Maya: Yeah, I think that's—you're saying it's sort of about transparency or being able to find the right information. I completely agree. And it's also about building trust, right? If we set expectations as to how we will respond to these things then we consistently respond to them, people believe that we're going to keep doing that. And that is almost more important than, like, committing to doing that, if that makes any sense.I remember having a conversation many years ago with an eng manager I worked with, and we were debating what the SLO for a particular service should be. And he sort of made an interesting point. He's like, “It doesn't really matter what the SLO is. It matters what you actually do because then people are going to start expecting [laugh] what you actually do.” So, being able to point at this and say, “Yes, here's what we say and here's what we actually do in practice,” I think builds so much more trust in how we respond to these kinds of things and how seriously we take security.I think one of the other things that came out of the security work is we realized—and I think you talked to Avery, the CEO of Tailscale on a prior podcast about some of this stuff—but we realized that platforms are broken, and we don't have a great way of pushing automatic updates on a lot of platforms, right? You know, if you're using the macOS store, or the Android Play Store, or iOS or whatever, you can automatically update your client when there is a security issue. On other platforms, you're kind of stuck. And so, as a result of us wanting to make sure that the fleet is as updated as possible, we've actually built an auto-update feature that's available on all of our major clients now, so people can opt in to getting those updates as quickly as needed when there is a security issue. We want to expose people to as little risk as possible.Corey: I am not a Tailscale customer. And that bugs me because until I cross that chasm into transferring $1 every month from my bank account to yours, I'm just a whiny freeloader in many respects, which is not at all how you folks who never made me feel I want to be very clear on that. But I believe in paying for the services that empower me to do my job more effectively, and Tailscale absolutely qualifies.Maya: Yeah, understood, I think that you still provide value to us in ways that aren't your data, but then in ways that help our business. One of them is that people like you tend to bring Tailscale to work. They tend to have a good experience at home connecting to their Synology, helping their brother connect to his bank account, whatever it happens to be, and they go, “Oh.” Something kind of clicks, and then they see a problem at work that looks very similar, and then they bring it to work. That is our primary path of adoption.We are a bottom-up adoption, you know, product-led growth product [laugh]. So, we have a blog post called “How Our Free Plan Stays Free” that covers some of that. I think the second thing that I don't want to undersell that a user like you also does is, you have a problem, you hit an issue, and you write into support, and you find something that nobody else has found yet [laugh].Corey: I am very good at doing that entirely by accident.Maya: [laugh]. But that helps us because that means that we see a problem that needs to get fixed, and we can catch it way sooner than before it's deployed, you know, at scale, at a large bank, and you know, it's a critical, kind of, somebody's getting paged kind of issue, right? We have a couple of bugs like that where we need, you know, we need a couple of repros from a couple different people in a couple different situations before we can really figure out what's going on. And having a wide user base who is happy to talk to us really helps us.Corey: I would say it goes beyond that, too. I have—I see things in the world of Tailscale that started off as features that I requested. One of the more recent ones is, it is annoying to me to see on the Tailscale machines list everything I have joined to the tailnet with that silly little up arrow next to it of, “Oh, time to go back and update Tailscale to the latest,” because that usually comes with decent benefits. Great, I have to go through iteratively, or use Ansible, or something like that. Well, now there's a Tailscale update option where it will keep itself current on supported operating systems.For some unknown reason, you apparently can't self-update the application on iOS or macOS. Can't imagine why. But those things tend to self-update based upon how the OS works due to all the sandboxing challenges. The only challenge I've got now is a few things that are, more or less, embedded devices that are packaged by the maintainer of that embedded system, where I'm beholden to them. Only until I get annoyed enough to start building a CI/CD system to replace their package.Maya: I can't wait till you build that CI/CD system. That'll be fun.Corey: “We wrote this code last night. Straight to the bank with it.” Yeah, that sounds awesome.Maya: [laugh] You'd get a couple of term sheets for that, I'm sure.Corey: There are. I am curious, looping back to the start of our conversation, we talked about enterprise security requirements, but how do you address enterprise change management? I find that that's something an awful lot of companies get dreadfully wrong. Most recently and most noisily on my part is Slack, a service for which I paid thousands of dollars a year, decided to roll out a UI redesign that, more or less, got in the way of a tremendous number of customers and there was no way to stop it or revert it. And that made me a lot less likely to build critical-flow business processes that depended upon Slack behaving a certain way.Just, “Oh, we decided to change everything in the user interface today just for funsies.” If Microsoft pulled that with Excel, by lunchtime they'd have reverted it because an entire universe of business users would have marched on Redmond to burn them out otherwise. That carries significant cost for businesses. Yet I still see Tailscale shipping features just as fast as you ever have. How do you square that circle?Maya: Yeah. I think there's two different kinds of change management really, which is, like—because if you think about it, it's like, an enterprise needs a way to roll out a product or a feature internally and then separately, we need a way to roll out new things to customers, right? And so, I think on the Tailscale side, we have a change log that tells you about everything that's changing, including new features, and including changes to the client. We update that religiously. Like, it's a big deal, if something doesn't make it the day that it's supposed to make it. We get very kind of concerned internally about that.A couple of things that were—that are in that space, right, we just talked about auto-updates to make it really easy for you to maintain what's actually rolled out in your infrastructure, but more importantly, for us to push changes with a new client release. Like, for example, in the case of a security incident, we want to be able to publish a version and get it rolled out to the fleet as quickly as possible. Some of the things that we don't have here, but although I hear requests for is the ability to, like, gradually roll out features to a customer. So like, “Can we change the configuration for 10% of our network and see if anything breaks before rolling back, right before rolling forward.” That's a very traditional kind of infra change management thing, but not something I've ever seen in, sort of, the networking security space to this degree, and something that I'm hearing a lot of customers ask for.In terms of other, like, internal controls that a customer might have, we have a feature called ACL Tests. So, if you're going to change the configuration of who can access what in your network, you can actually write tests. Like, your permission file is written in HuJSON and you can write a set of things like, Corey should be able to access prod. Corey should not be able to access test, or whatever it happens to be—actually, let's flip those around—and when you have a policy change that doesn't pass those tests, you actually get told right away so you're not rolling that out and accidentally breaking a large part of your network. So, we built several things into the product to do it. In terms of how we notify customers, like I said, that the primary method that we have right now is something like a change log, as well as, like, security bulletins for security updates.Corey: Yeah, it's one of the challenges, on some level, of the problem of oh, I'm going to set up a service, and then I'm going to go sail around the world, and when I come back in a year or two—depending on how long I spent stranded on an island somewhere—now I get to figure out what has changed. And to your credit, you have to affirmatively enable all of the features that you have shipped, but you've gone from, “Oh, it's a mesh network where everything can talk to each other,” to, “I can use an exit node from that thing. Oh, now I can seamlessly transfer files from one node to another with tail drop,” to, “Oh, Tailscale Funnel. Now, I can expose my horrifying developer environment to the internet.” I used that one year to give a talk at a conference, just because why not?Maya: [crosstalk 00:27:35].Corey: Everything evolves to become [unintelligible 00:27:37] email on Microsoft Outlook, or tries to be Microsoft Excel? Oh, no, no. I want you to be building Microsoft PowerPoint for me. And we eventually get there, but that is incredibly powerful functionality, but also terrifying when you think you have a handle on what's going on in a large-scale environment, and suddenly, oh, there's a whole new vector we need to think about. Which is why your—the thought and consideration you put into that is so apparent and so, frankly, welcome.Maya: Yeah, you actually kind of made a statement there that I completely missed, which is correct, which is, we don't turn features on by default. They are opt-in features. We will roll out features by default after they've kind of baked for an incredibly long period of time and with, like, a lot of fanfare and warning. So, the example that I'll give is, we have a DNS feature that was probably available for maybe 18 months before we turned it on by default for new tailnets. So didn't even turn it on for existing folks. It's called Magic DNS.We don't want to touch your configuration or your network. We know people will freak out when that happens. Knowing, to your point, that you can leave something for a year and come back, and it's going to be the same is really important. For everyone, but for an enterprise customer as well. Actually, one other thing to mention there. We have a bunch of really old versions of clients that are running in production, and we want them to keep working, so we try to be as backward compatible as possible.I think the… I think we still have clients from 2019 that are running and connecting to corp that nobody's updated. And like, it'd be great if they would update them, but like, who knows what situation they're in and if they can connect to them, and all that kind of stuff, but they still work. And the point is that you can have set it up four years ago, and it should still work, and you should still be able to connect to it, and leave it alone and come back to it in a year from now, and it should still work and [laugh] still connect without anything changing. That's a very hard guarantee to be able to make.Corey: And yet, somehow you've been able to do that, just from the perspective of not—I've never yet seen you folks make a security-oriented decision that I'm looking at and rolling my eyes and amazed that you didn't make the decision the other way. There are a lot of companies that while intending very well have done, frankly, very dumb things. I've been keeping an eye on you folks for a long time, and I would have caught that in public. I just haven't seen anything like that. It's kind of amazing.Last year, I finally took the extraordinary step of disabling SSH access anywhere except the tailnet to a number of my things. It lets my logs fill up a lot less, and you've built to that level of utility-like reliability over the series of longtime experimentation. I have yet to regret having Tailscale in the mix, which is, frankly, not something I can say about almost any product.Maya: Yeah. I'm very proud to hear that. And like, maintaining that trust—back to a lot of the conversation about security and reliability and stuff—is incredibly important to us, and we put a lot of effort into it.Corey: I really appreciate your taking the time to talk to me about how things continue to evolve over there. Anything that's new and exciting that might have gotten missed? Like, what has come out in, I guess, the last six months or so that are relevant to the business and might be useful for people looking to use it themselves?Maya: I was hoping you're going to ask me what came out in the last, you know, 20 minutes while we were talking, and the answer is probably nothing, but you never know. But [laugh]—Corey: With you folks, I wouldn't doubt it. Like, “Oh, yeah, by the way, we had to do a brand treatment redo refresh,” or something on the website? Why not? It now uses telepathy just because.Maya: It could, that'd be pretty cool. No, I mean, lots has gone on in the last six months. I think some of the things that might be more interesting to your listeners, we're now in the AWS Marketplace, so if you want to purchase Tailscale through AWS Marketplace, you can. We have a Kubernetes operator that we've released, which lets you both ingress and egress from a Kubernetes cluster to things that are elsewhere in the world on other infrastructure, and also access the Kubernetes control plane and the API server via Tailscale. I mentioned auto-updates. You mentioned the VS Code extension. That's amazing, the fact that you can kind of connect directly from within VS Code to things on your tailnet. That's a lot of the exciting stuff that we've been doing. And there's boring stuff, you know, like audit log streaming, and that kind of stuff. But it's good.Corey: Yeah, that stuff is super boring until suddenly, it's very, very exciting. And those are not generally good days.Maya: [laugh]. Yeah, agreed. It's important, but boring. But important.Corey: [laugh]. Well, thank you so much for taking the time to talk through all the stuff that you folks are up to. If people want to learn more, where's the best place for them to go to get started?Maya: tailscale.com is the best place to go. You can download Tailscale from there, get access to our documentation, all that kind of stuff.Corey: Yeah, I also just want to highlight that you can buy my attention but never my opinion on things and my opinion on Tailscale remains stratospherically high, so thank you for not making me look like a fool, by like, “Yes. And now we're pivoting to something horrifying is a business model and your data.” Thank you for not doing exactly that.Maya: Yeah, we'll keep doing that. No, no, blockchains in our future.Corey: [laugh]. Maya Kaczorowski, Chief Product Officer at Tailscale. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. This episode has been brought to us by our friends at Tailscale. If you enjoyed this episode, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice along with an angry, insulting comment that will never actually make it back to us because someone screwed up a firewall rule somewhere on their legacy connection.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.
On this episode of The Six Five – On The Road, hosts Daniel Newman and Patrick Moorhead welcome Matt Yanchyshyn, GM at AWS Marketplace and Partner Engineering for a conversation on what is driving the growth and adoption of AWS Marketplace and what buyers and sellers can expect in the future. Their discussion covers: An introduction from Matt Yanchyshyn as GM at AWS Marketplace What factors drive the growth and adoption of cloud marketplaces How important is AWS Marketplace to the AWS business and their partners What buyers and sellers can look forward to from AWS Marketplace in the near future Learn more about AWS Marketplace on the company's website.
Web and Mobile App Development (Language Agnostic, and Based on Real-life experience!)
AWS Marketplace offers multiple categories for you to list your products under, one of which is SaaS. If you are looking to sell SaaS products, and are an AWS customer, I highly recommend that you list your products in the marketplace as it will provide plenty of visibility for you. In this course, I'll walk you through the process of listing a SaaS product on AWS Marketplace as it can be a tad overwhelming at times. We launched a number of API products recently, and I will be sharing that experience with you in this course. The aim is that by the end of this course, you'll have a good sense of what it takes to list your products on the marketplace. Purchase course in one of 2 ways: 1. Go to https://getsnowpal.com, and purchase it on the Web 2. On your phone: (i) If you are an iPhone user, go to http://ios.snowpal.com, and watch the course on the go. (ii). If you are an Android user, go to http://android.snowpal.com.
Web and Mobile App Development (Language Agnostic, and Based on Real-life experience!)
You will need to provide an API Key for you to authenticate your API Requests. There's many ways to generate this key and share it with your subscribers, and in this course, we'll look at one of those ways. In addition, you will likely also need to publish an API code to your subscribers if you have listed more than 1 SaaS Product (say, on the AWS Marketplace, or one of the many API Hubs). I'll walk you through our high level architecture so you can leave this course with a fair understanding of what it took us to support these 2 attributes (API Key & Product Code) in our implementation. Purchase course in one of 2 ways: 1. Go to https://getsnowpal.com, and purchase it on the Web 2. On your phone: (i) If you are an iPhone user, go to http://ios.snowpal.com, and watch the course on the go. (ii). If you are an Android user, go to http://android.snowpal.com.
Adnan Khan, Lead Security Engineer at Praetorian, joins Corey on Screaming in the Cloud to discuss software bill of materials and supply chain attacks. Adnan describes how simple pull requests can lead to major security breaches, and how to best avoid those vulnerabilities. Adnan and Corey also discuss the rapid innovation at Github Actions, and the pros and cons of having new features added so quickly when it comes to security. Adnan also discusses his view on the state of AI and its impact on cloud security. About AdnanAdnan is a Lead Security Engineer at Praetorian. He is responsible for executing on Red-Team Engagements as well as developing novel attack tooling in order to meet and exceed engagement objectives and provide maximum value for clients.His past experience as a software engineer gives him a deep understanding of where developers are likely to make mistakes, and has applied this knowledge to become an expert in attacks on organization's CI/CD systems.Links Referenced: Praetorian: https://www.praetorian.com/ Twitter: https://twitter.com/adnanthekhan Praetorian blog posts: https://www.praetorian.com/author/adnan-khan/ TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: Are you navigating the complex web of API management, microservices, and Kubernetes in your organization? Solo.io is here to be your guide to connectivity in the cloud-native universe!Solo.io, the powerhouse behind Istio, is revolutionizing cloud-native application networking. They brought you Gloo Gateway, the lightweight and ultra-fast gateway built for modern API management, and Gloo Mesh Core, a necessary step to secure, support, and operate your Istio environment.Why struggle with the nuts and bolts of infrastructure when you can focus on what truly matters - your application. Solo.io's got your back with networking for applications, not infrastructure. Embrace zero trust security, GitOps automation, and seamless multi-cloud networking, all with Solo.io.And here's the real game-changer: a common interface for every connection, in every direction, all with one API. It's the future of connectivity, and it's called Gloo by Solo.io.DevOps and Platform Engineers, your journey to a seamless cloud-native experience starts here. Visit solo.io/screaminginthecloud today and level up your networking game.Corey: As hybrid cloud computing becomes more pervasive, IT organizations need an automation platform that spans networks, clouds, and services—while helping deliver on key business objectives. Red Hat Ansible Automation Platform provides smart, scalable, sharable automation that can take you from zero to automation in minutes. Find it in the AWS Marketplace.Corey: Welcome to Screaming in the Cloud, I'm Corey Quinn. I've been studiously ignoring a number of buzzword, hype-y topics, and it's probably time that I addressed some of them. One that I've been largely ignoring, mostly because of its prevalence at Expo Hall booths at RSA and other places, has been software bill of materials and supply chain attacks. Finally, I figured I would indulge the topic. Today I'm speaking with Adnan Khan, lead security engineer at Praetorian. Adnan, thank you for joining me.Adnan: Thank you so much for having me.Corey: So, I'm trying to understand, on some level, where the idea of these SBOM or bill-of-material attacks have—where they start and where they stop. I've seen it as far as upstream dependencies have a vulnerability. Great. I've seen misconfigurations in how companies wind up configuring their open-source presences. There have been a bunch of different, it feels almost like orthogonal concepts to my mind, lumped together as this is a big scary thing because if we have a big single scary thing we can point at, that unlocks budget. Am I being overly cynical on this or is there more to it?Adnan: I'd say there's a lot more to it. And there's a couple of components here. So first, you have the SBOM-type approach to security where organizations are looking at which packages are incorporated into their builds. And vulnerabilities can come out in a number of ways. So, you could have software actually have bugs or you could have malicious actors actually insert backdoors into software.I want to talk more about that second point. How do malicious actors actually insert backdoors? Sometimes it's compromising a developer. Sometimes it's compromising credentials to push packages to a repository, but other times, it could be as simple as just making a pull request on GitHub. And that's somewhere where I've spent a bit of time doing research, building off of techniques that other people have documented, and also trying out some attacks for myself against two Microsoft repositories and several others that have reported over the last few months that would have been able to allow an attacker to slip a backdoor into code and expand the number of projects that they are able to attack beyond that.Corey: I think one of the areas that we've seen a lot of this coming from has been the GitHub Action space. And I'll confess that I wasn't aware of a few edge-case behaviors around this. Most of my experience with client-side Git configuration in the .git repository—pre-commit hooks being a great example—intentionally and by design from a security perspective, do not convey when you check that code in and push it somewhere, or grab someone else's, which is probably for the best because otherwise, it's, “Oh yeah, just go ahead and copy your password hash file and email that to something else via a series of arcane shell script stuff.” The vector is there. I was unpleasantly surprised somewhat recently to discover that when I cloned a public project and started running it locally and then adding it to my own fork, that it would attempt to invoke a whole bunch of GitHub Actions flows that I'd never, you know, allowed it to do. That was… let's say, eye-opening.Adnan: [laugh]. Yeah. So, on the particular topic of GitHub Actions, the pull request as an attack vector, like, there's a lot of different forms that an attack can take. So, one of the more common ones—and this is something that's been around for just about as long as GitHub Actions has been around—and this is a certain trigger called ‘pull request target.' What this means is that when someone makes a pull request against the base repository, maybe a branch within the base repository such as main, that will be the workflow trigger.And from a security's perspective, when it runs on that trigger, it does not require approval at all. And that's something that a lot of people don't really realize when they're configuring their workflows. Because normally, when you have a pull request trigger, the maintainer can check a box that says, “Oh, require approval for all external pull requests.” And they think, “Great, everything needs to be approved.” If someone tries to add malicious code to run that's on the pull request target trigger, then they can look at the code before it runs and they're fine.But in a pull request target trigger, there is no approval and there's no way to require an approval, except for configuring the workflow securely. So, in this case, what happens is, and in one particular case against the Microsoft repository, this was a Microsoft reusable GitHub Action called GPT Review. It was vulnerable because it checked out code from my branch—so if I made a pull request, it checked out code from my branch, and you could find this by looking at the workflow—and then it ran tests on my branch, so it's running my code. So, by modifying the entry points, I could run code that runs in the context of that base branch and steal secrets from it, and use those to perform malicious Actions.Corey: Got you. It feels like historically, one of the big threat models around things like this is al—[and when 00:06:02] you have any sort of CI/CD exploit—is either falls down one of two branches: it's either the getting secret access so you can leverage those credentials to pivot into other things—I've seen a lot of that in the AWS space—or more boringly, and more commonly in many cases, it seems to be oh, how do I get it to run this crypto miner nonsense thing, with the somewhat large-scale collapse of crypto across the board, it's been convenient to see that be less prevalent, but still there. Just because you're not making as much money means that you'll still just have to do more of it when it's all in someone else's account. So, I guess it's easier to see and detect a lot of the exploits that require a whole bunch of compute power. The, oh by the way, we stole your secrets and now we're going to use that to lateral into an organization seem like it's something far more… I guess, dangerous and also sneaky.Adnan: Yeah, absolutely. And you hit the nail on the head there with sneaky because when I first demonstrated this, I made a test account, I created a PR, I made a couple of Actions such as I modified the name of the release for the repository, I just put a little tag on it, and didn't do any other changes. And then I also created a feature branch in one of Microsoft's repositories. I don't have permission to do that. That just sat there for about almost two weeks and then someone else exploited it and then they responded to it.So, sneaky is exactly the word you could describe something like this. And another reason why it's concerning is, beyond the secret disclosure for—and in this case, the repository only had an OpenAI API key, so… okay, you can talk to ChatGPT for free. But this was itself a Github Action and it was used by another Microsoft machine-learning project that had a lot more users, called SynapseML, I believe was the name of the other project. So, what someone could do is backdoor this Action by creating a commit in a feature branch, which they can do by stealing the built-in GitHub token—and this is something that all Github Action runs have; the permissions for it vary, but in this case, it had the right permissions—attacker could create a new branch, modify code in that branch, and then modify the tag, which in Git, tags are mutable, so you can just change the commit the tag points to, and now, every time that other Microsoft repository runs GPT Review to review a pull request, it's running attacker-controlled code, and then that could potentially backdoor that other repository, steal secrets from that repository.So that's, you know, one of the scary parts of, in particular backdooring a Github Action. And I believe there was a very informative Blackhat talk this year, that someone from—I'm forgetting the name of the author, but it was a very good watch about how Actions vulnerabilities can be vulnerable, and this is kind of an example of—it just happened to be that this was an Action as well.Corey: That feels like this is an area of exploit that is becoming increasingly common. I tie it almost directly to the rise of GitHub Actions as the default CI/CD system that a lot of folks have been using. For the longest time, it seemed like a poorly configured Jenkins box hanging out somewhere in your environment that was the exception to the Infrastructure as Code rule because everyone has access to it, configures it by hand, and invariably it has access to production was the way that people would exploit things. For a while, you had CircleCI and Travis-CI, before Travis imploded and Circle did a bunch of layoffs. Who knows where they're at these days?But it does seem that the common point now has been GitHub Actions, and a .github folder within that Git repo with a workflows YAML file effectively means that a whole bunch of stuff can happen that you might not be fully aware of when you're cloning or following along with someone's tutorial somewhere. That has caught me out in a couple of strange ways, but nothing disastrous because I do believe in realistic security boundaries. I just worry how much of this is the emerging factor of having a de facto standard around this versus something that Microsoft has actively gotten wrong. What's your take on it?Adnan: Yeah. So, my take here is that Github could absolutely be doing a lot more to help prevent users from shooting themselves in the foot. Because their documentation is very clear and quite frankly, very good, but people aren't warned when they make certain configuration settings in their workflows. I mean, GitHub will happily take the settings and, you know, they hit commit, and now the workflow could be vulnerable. There's no automatic linting of workflows, or a little suggestion box popping up like, “Hey, are you sure you want to configure it this way?”The technology to detect that is there. There's a lot of third-party utilities that will lint Actions workflows. Heck, for looking for a lot of these pull request target-type vulnerabilities, I use a Github code search query. It's just a regular expression. So, having something that at least nudges users to not make that mistake would go really far in helping people not make these mista—you know, adding vulnerabilities to their projects.Corey: It seems like there's also been issues around the GitHub Actions integration approach where OICD has not been scoped correctly a bunch of times. I've seen a number of articles come across my desk in that context and fortunately, when I wound up passing out the ability for one of my workflows to deploy to my AWS account, I got it right because I had no idea what I was doing and carefully followed the instructions. But I can totally see overlooking that one additional parameter that leaves things just wide open for disaster.Adnan: Yeah, absolutely. That's one where I haven't spent too much time actually looking for that myself, but I've definitely read those articles that you mentioned, and yeah, it's very easy for someone to make that mistake, just like, it's easy for someone to just misconfigure their Action in general. Because in some of the cases where I found vulnerabilities, there would actually be a commit saying, “Hey, I'm making this change because the Action needs access to these certain secrets. And oh, by the way, I need to update the checkout steps so it actually checks out the PR head so that it's [testing 00:12:14] that PR code.” Like, people are actively making a decision to make it vulnerable because they don't realize the implication of what they've just done.And in the second Microsoft repository that I found the bug in, was called Microsoft Confidential Sidecar Containers. That repository, the developer a week prior to me identifying the bug made a commit saying that we're making a change and it's okay because it requires approval. Well, it doesn't because it's a pull request target.Corey: Part of me wonders how much of this is endemic to open-source as envisioned through enterprises versus my world of open-source, which is just eh, I've got this weird side project in my spare time, and it seemed like it might be useful to someone else, so I'll go ahead and throw it up there. I understand that there's been an awful lot of commercialization of open-source in recent years; I'm not blind to that fact, but it also seems like there's a lot of companies playing very fast and loose with things that they probably shouldn't be since they, you know, have more of a security apparatus than any random contributors standing up a clone of something somewhere will.Adnan: Yeah, we're definitely seeing this a lot in the machine-learning space because of companies that are trying to move so quickly with trying to build things because OpenAI AI has blown up quite a bit recently, everyone's trying to get a piece of that machine learning pie, so to speak. And another thing of what you're seeing is, people are deploying self-hosted runners with Nvidia, what is it, the A100, or—it's some graphics card that's, like, $40,000 apiece attached to runners for running integration tests on machine-learning workflows. And someone could, via a pull request, also just run code on those and mine crypto.Corey: I kind of miss the days when exploiting computers is basically just a way for people to prove how clever they were or once in a blue moon come up with something innovative. Now, it's like, well, we've gone all around the mulberry bush just so we can basically make computers solve a sudoku form, and in return, turn that into money down the road. It's frustrating, to put it gently.Adnan: [laugh].Corey: When you take a look across the board at what companies are doing and how they're embracing the emerging capabilities inherent to these technologies, how do you avoid becoming a cautionary tale in the space?Adnan: So, on the flip side of companies having vulnerable workflows, I've also seen a lot of very elegant ways of writing secure workflows. And some of the repositories are using deployment environments—which is the GitHub Actions feature—to enforce approval checks. So, workflows that do need to run on pull request target because of the need to access secrets for pull requests will have a step that requires a deployment environment to complete, and that deployment environment is just an approval and it doesn't do anything. So essentially, someone who has permissions to the repository will go in, approve that environment check, and only then will the workflow continue. So, that adds mandatory approvals to pull requests where otherwise they would just run without approval.And this is on, particularly, the pull request target trigger. Another approach is making it so the trigger is only running on the label event and then having a maintainer add a label so the tests can run and then remove the label. So, that's another approach where companies are figuring out ways to write secure workflows and not leave their repositories vulnerable.Corey: It feels like every time I turn around, Github Actions has gotten more capable. And I'm not trying to disparage the product; it's kind of the idea of what we want. But it also means that there's certainly not an awareness in the larger community of how these things can go awry that has kept up with the pace of feature innovation. How do you balance this without becoming the Department of No?Adnan: [laugh]. Yeah, so it's a complex issue. I think GitHub has evolved a lot over the years. Actions, it's—despite some of the security issues that happen because people don't configure them properly—is a very powerful product. For a CI/CD system to work at the scale it does and allow so many repositories to work and integrate with everything else, it's really easy to use. So, it's definitely something you don't want to take away or have an organization move away from something like that because they are worried about the security risks.When you have features coming in so quickly, I think it's important to have a base, kind of like, a mandatory reading. Like, if you're a developer that writes and maintains an open-source software, go read through this document so you can understand the do's and don'ts instead of it being a patchwork where some people, they take a good security approach and write secure workflows and some people just kind of stumble through Stack Overflow, find what works, messes around with it until their deployment is working and their CI/CD is working and they get the green checkmark, and then they move on to their never-ending list of tasks that—because they're always working on a deadline.Corey: Reminds me of a project I saw a few years ago when it came out that Volkswagen had been lying to regulators. It was a framework someone built called ‘Volkswagen' that would detect if it was running inside of a CI/CD environment, and if so, it would automatically make all the tests pass. I have a certain affinity for projects like that. Another one was a tool that would intentionally degrade the performance of a network connection so you could simulate having a latent or stuttering connection with packet loss, and they call that ‘Comcast.' Same story. I just thought that it's fun seeing people get clever on things like that.Adnan: Yeah, absolutely.Corey: When you take a look now at the larger stories that are emerging in the space right now, I see an awful lot of discussion coming up that ties to SBOMs and understanding where all of the components of your software come from. But I chased some stuff down for fun once, and I gave up after 12 dependency leaps from just random open-source frameworks. I mean, I see the Dependabot problem that this causes as well, where whenever I put something on GitHub and then don't touch it for a couple of months—because that's how I roll—I come back and there's a whole bunch of terrifyingly critical updates that it's warning me about, but given the nature of how these things get used, it's never going to impact anything that I'm currently running. So, I've learned to tune it out and just ignore it when it comes in, which is probably the worst of all possible approaches. Now, if I worked at a bank, I should probably take a different perspective on this, but I don't.Adnan: Mm-hm. Yeah. And that's kind of a problem you see, not just with SBOMs. It's just security alerting in general, where anytime you have some sort of signal and people who are supposed to respond to it are getting too much of it, you just start to tune all of it out. It's like that human element that applies to so much in cybersecurity.And I think for the particular SBOM problem, where, yeah, you're correct, like, a lot of it… you don't have reachability because you're using a library for one particular function and that's it. And this is somewhere where I'm not that much of an expert in where doing more static source analysis and reachability testing, but I'm certain there are products and tools that offer that feature to actually prioritize SBOM-based alerts based on actual reachability versus just having an as a dependency or not.[midroll 00:20:00]Corey: I feel like, on some level, wanting people to be more cautious about what they're doing is almost shouting into the void because I'm one of the only folks I found that has made the assertion that oh yeah, companies don't actually care about security. Yes, they email you all the time after they failed to protect your security, telling you how much they care about security, but when you look at where they invest, feature velocity always seems to outpace investment in security approaches. And take a look right now at the hype we're seeing across the board when it comes to generative AI. People are excited about the capabilities and security is a distant afterthought around an awful lot of these things. I don't know how you drive a broader awareness of this in a way that sticks, but clearly, we haven't collectively found it yet.Adnan: Yeah, it's definitely a concern. When you see things on—like for example, you can look at Github's roadmap, and there's, like, a feature there that's, oh, automatic AI-based pull request handling. Okay, so does that mean one day, you'll have a GitHub-powered LLM just approve PRs based on whether it determines that it's a good improvement or not? Like, obviously, that's not something that's the case now, but looking forward to maybe five, six years in the future, in the pursuit of that ever-increasing velocity, could you ever have a situation where actual code contributions are reviewed fully by AI and then approved and merged? Like yeah, that's scary because now you have a threat actor that could potentially specifically tailor contributions to trick the AI into thinking they're great, but then it could turn around and be a backdoor that's being added to the code.Obviously, that's very far in the future and I'm sure a lot of things will happen before that, but it starts to make you wonder, like, if things are heading that way. Or will people realize that you need to look at security at every step of the way instead of just thinking that these newer AI systems can just handle everything?Corey: Let's pivot a little bit and talk about your day job. You're a lead security engineer at what I believe to be a security-focused consultancy. Or—Adnan: Yeah.Corey: If you're not a SaaS product. Everything seems to become a SaaS product in the fullness of time. What's your day job look like?Adnan: Yeah, so I'm a security engineer on Praetorian's red team. And my day-to-day, I'll kind of switch between application security and red-teaming. And that kind of gives me the opportunity to, kind of, test out newer things out in the field, but then also go and do more traditional application security assessments and code reviews, and reverse engineering to kind of break up the pace of work. Because red-teaming can be very fast and fast-paced and exciting, but sometimes, you know, that can lead to some pretty late nights. But that's just the nature of being on a red team [laugh].Corey: It feels like as soon as I get into the security space and start talking to cloud companies, they get a lot more defensive than when I'm making fun of, you know, bad service naming or APIs that don't make a whole lot of sense. It feels like companies have a certain sensitivity around the security space that applies to almost nothing else. Do you find, as a result, that a lot of the times when you're having conversations with companies and they figure out that, oh, you're a red team for a security researcher, oh, suddenly, we're not going to talk to you the way we otherwise might. We thought you were a customer, but nope, you can just go away now.Adnan: [laugh]. I personally haven't had that experience with cloud companies. I don't know if I've really tried to buy a lot. You know, I'm… if I ever buy some infrastructure from cloud companies as an individual, I just kind of sign up and put in my credit card. And, you know, they just, like, oh—you know, they just take my money. So, I don't really think I haven't really, personally run into anything like that yet [laugh].Corey: Yeah, I'm curious to know how that winds up playing out in some of these, I guess, more strategic, larger company environments. I don't get to see that because I'm basically a tiny company that dabbles in security whenever I stumble across something, but it's not my primary function. I just worry on some level one of these days, I'm going to wind up accidentally dropping a zero-day on Twitter or something like that, and suddenly, everyone's going to come after me with the knives. I feel like [laugh] at some point, it's just going to be a matter of time.Adnan: Yeah. I think when it comes to disclosing things and talking about techniques, the key thing here is that a lot of the things that I'm talking about, a lot of the things that I'll be talking about in some blog posts that have coming out, this is stuff that these companies are seeing themselves. Like, they recognize that these are security issues that people are introducing into code. They encourage people to not make these mistakes, but when it's buried in four links deep of documentation and developers are tight on time and aren't digging through their security documentation, they're just looking at what works, getting it to work and moving on, that's where the issue is. So, you know, from a perspective of raising awareness, I don't feel bad if I'm talking about something that the company itself agrees is a problem. It's just a lot of the times, their own engineers don't follow their own recommendations.Corey: Yeah, I have opinions on these things and unfortunately, it feels like I tend to learn them in some of the more unfortunate ways of, oh, yeah, I really shouldn't care about this thing, but I only learned what the norm is after I've already done something. This is, I think, the problem inherent to being small and independent the way that I tend to be. We don't have enough people here for there to be a dedicated red team and research environment, for example. Like, I tend to bleed over a little bit into a whole bunch of different things. We'll find out. So far, I've managed to avoid getting it too terribly wrong, but I'm sure it's just a matter of time.So, one area that I think seems to be a way that people try to avoid cloud issues is oh, I read about that in the last in-flight magazine that I had in front of me, and the cloud is super insecure, so we're going to get around all that by running our own infrastructure ourselves, from either a CI/CD perspective or something else. Does that work when it comes to this sort of problem?Adnan: Yeah, glad you asked about that. So, we've also seen open-s—companies that have large open-source presence on GitHub just opt to have self-hosted Github Actions runners, and that opens up a whole different Pandora's box of attacks that an attacker could take advantage of, and it's only there because they're using that kind of runner. So, the default GitHub Actions runner, it's just an agent that runs on a machine, it checks in with GitHub Actions, it pulls down builds, runs them, and then it waits for another build. So, these are—the default state is a non-ephemeral runner with the ability to fork off tasks that can run in the background. So, when you have a public repository that has a self-hosted runner attached to it, it could be at the organization level or it could be at the repository level.What an attacker can just do is create a pull request, modify the pull request to run on a self-hosted runner, write whatever they want in the pull request workflow, create a pull request, and now as long as they were a previous contributor, meaning you fixed a typo, you… that could be a such a, you know, a single character typo change could even cause that, or made a small contribution, now they create the pull request. The arbitrary job that they wrote is now picked up by that self-hosted runner. They can fork off it, process it to run in the background, and then that just continues to run, the job finishes, their pull request, they'll just—they close it. Business as usual, but now they've got an implant on the self-hosted runner. And if the runners are non-ephemeral, it's very hard to completely lock that down.And that's something that I've seen, there's quite a bit of that on GitHub where—and you can identify it just by looking at the run logs. And that's kind of comes from people saying, “Oh, let's just self-host our runners,” but they also don't configure that properly. And that opens them up to not only tampering with their repositories, stealing secrets, but now depending on where your runner is, now you potentially could be giving an attacker a foothold in your cloud environment.Corey: Yeah, that seems like it's generally a bad thing. I found that cloud tends to be more secure than running it yourself in almost every case, with the exception that once someone finds a way to break into it, there's suddenly a lot more eggs in a very large, albeit more secure, basket. So, it feels like it's a consistent trade-off. But as time goes on, it feels like it is less and less defensible, I think, to wind up picking out an on-prem strategy from a pure security point of view. I mean, there are reasons to do it. I'm just not sure.Adnan: Yeah. And I think that distinction to be made there, in particular with CI/CD runners is there's cloud, meaning you let your—there's, like, full cloud meaning you let your CI/CD provider host your infrastructure as well; there's kind of that hybrid approach you mentioned, where you're using a CI/CD provider, but then you're bringing your own cloud infrastructure that you think you could secure better; or you have your runners sitting in vCenter in your own data center. And all of those could end up being—both having a runner in your cloud and in your data center could be equally vulnerable if you're not segmenting builds properly. And that's the core issue that happens when you have a self-hosted runner is if they're not ephemeral, it's very hard to cut off all attack paths. There's always something an attacker can do to tamper with another build that'll have some kind of security impact. You need to just completely isolate your builds and that's essentially what you see in a lot of these newer guidances like the [unintelligible 00:30:04] framework, that's kind of the core recommendation of it is, like, one build, one clean runner.Corey: Yeah, that seems to be the common wisdom. I've been doing a lot of work with my own self-hosted runners that run inside of Lambda. Definitionally those are, of course, ephemeral. And there's a state machine that winds up handling that and screams bloody murder if there's a problem with it. So far, crossing fingers hoping it works out well.And I have a bounded to a very limited series of role permissions, and of course, its own account of constraint blast radius. But there's still—there are no guarantees in this. The reason I build it the way I do is that, all right, worst case someone can get access to this. The only thing they're going to have the ability to do is, frankly, run up my AWS bill, which is an area I have some small amount of experience with.Adnan: [laugh]. Yeah, yeah, that's always kind of the core thing where if you get into someone's cloud, like, well, just sit there and use their compute resources [laugh].Corey: Exactly. I kind of miss when that was the worst failure mode you had for these things.Adnan: [laugh].Corey: I really want to thank you for taking the time to speak with me today. If people want to learn more, where's the best place for them to find you?Adnan: I do have a Twitter account. Well, I guess you can call it Twitter anymore, but, uh—Corey: Watch me. Sure I can.Adnan: [laugh]. Yeah, so I'm on Twitter, and it's @adnanthekhan. So, it's like my first name with ‘the' and then K-H-A-N because, you know, my full name probably got taken up, like, years before I ever made a Twitter account. So, occasionally I tweet about GitHub Actions there.And on Praetorian's website, I've got a couple of blog posts. I have one—the one that really goes in-depth talking about the two Microsoft repository pull request attacks, and a couple other ones that are disclosed, will hopefully drop on the twenty—what is that, Tuesday? That's going to be the… that's the 26th. So, it should be airing on the Praetorian blog then. So, if you—Corey: Excellent. It should be out by the time this is published, so we will, of course, put a link to that in the [show notes 00:32:01]. Thank you so much for taking the time to speak with me today. I appreciate it.Adnan: Likewise. Thank you so much, Corey.Corey: Adnan Khan, lead security engineer at Praetorian. I'm Cloud Economist Corey Quinn and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice, along with an insulting comment that's probably going to be because your podcast platform of choice is somehow GitHub Actions.Adnan: [laugh].Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.
Dmitry Kagansky, State CTO and Deputy Executive Director for the Georgia Technology Authority, joins Corey on Screaming in the Cloud to discuss how he became the CTO for his home state and the nuances of working in the public sector. Dmitry describes his focus on security and reliability, and why they are both equally important when working with state government agencies. Corey and Dmitry describe AWS's infamous GovCloud, and Dmitry explains why he's employing a multi-cloud strategy but that it doesn't work for all government agencies. Dmitry also talks about how he's focusing on hiring and training for skills, and the collaborative approach he's taking to working with various state agencies.About DmitryMr. Kagansky joined GTA in 2021 from Amazon Web Services where he worked for over four years helping state agencies across the country in their cloud implementations and migrations.Prior to his time with AWS, he served as Executive Vice President of Development for Star2Star Communications, a cloud-based unified communications company. Previously, Mr. Kagansky was in many technical and leadership roles for different software vending companies. Most notably, he was Federal Chief Technology Officer for Quest Software, spending several years in Europe working with commercial and government customers.Mr. Kagansky holds a BBA in finance from Hofstra University and an MBA in management of information systems and operations management from the University of Georgia.Links Referenced: Twitter: https://twitter.com/dimikagi LinkedIn: https://www.linkedin.com/in/dimikagi/ GTA Website: https://gta.ga.gov TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: In the cloud, ideas turn into innovation at virtually limitless speed and scale. To secure innovation in the cloud, you need Runtime Insights to prioritize critical risks and stay ahead of unknown threats. What's Runtime Insights, you ask? Visit sysdig.com/screaming to learn more. That's S-Y-S-D-I-G.com/screaming.My thanks as well to Sysdig for sponsoring this ridiculous podcast.Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. Technical debt is one of those fun things that everyone gets to deal with, on some level. Today's guest apparently gets to deal with 235 years of technical debt. Dmitry Kagansky is the CTO of the state of Georgia. Dmitry, thank you for joining me.Dmitry: Corey, thank you very much for having me.Corey: So, I want to just begin here because this has caused confusion in my life; I can only imagine how much it's caused for you folks. We're talking Georgia the US state, not Georgia, the sovereign country?Dmitry: Yep. Exactly.Corey: Excellent. It's always good to triple-check those things because otherwise, I feel like the shipping costs are going to skyrocket in one way or the other. So, you have been doing a lot of very interesting things in the course of your career. You're former AWS, for example, you come from commercial life working in industry, and now it's yeah, I'm going to go work in state government. How did this happen?Dmitry: Yeah, I've actually been working with governments for quite a long time, both here and abroad. So, way back when, I've been federal CTO for software companies, I've done other work. And then even with AWS, I was working with state and local governments for about four, four-and-a-half years. But came to Georgia when the opportunity presented itself, really to try and make a difference in my own home state. You mentioned technical debt at the beginning and it's one of the things I'm hoping that helped the state pay down and get rid of some of it.Corey: It's fun because governments obviously are not thought of historically as being the early adopters, bleeding edge when it comes to technical innovation. And from where I sit, for good reason. You don't want code that got written late last night and shoved into production to control things like municipal infrastructure, for example. That stuff matters. Unlike a lot of other walks of life, you don't usually get to choose your government, and, “Oh, I don't like this one so I'm going to go for option B.”I mean you get to do at the ballot box, but that takes significant amounts of time. So, people want above all else—I suspect—their state services from an IT perspective to be stable, first and foremost. Does that align with how you think about these things? I mean, security, obviously, is a factor in that as well, but how do you see, I guess, the primary mandate of what you do?Dmitry: Yeah. I mean, security is obviously up there, but just as important is that reliance on reliability, right? People take time off of work to get driver's licenses, right, they go to different government agencies to get work done in the middle of their workday, and we've got to have systems available to them. We can't have them show up and say, “Yeah, come back in an hour because some system is rebooting.” And that's one of the things that we're trying to fix and trying to have fewer of, right?There's always going to be things that happen, but we're trying to really cut down the impact. One of the biggest things that we're doing is obviously a move to the cloud, but also segmenting out all of our agency applications so that agencies manage them separately. Today, my organization, Georgia Technology Authority—you'll hear me say GTA—we run what we call NADC, the North Atlanta Data Center, a pretty large-scale data center, lots of different agencies, app servers all sitting there running. And then a lot of times, you know, an impact to one could have an impact to many. And so, with the cloud, we get some partitioning and some segmentation where even if there is an outage—a term you'll often hear used that we can cut down on the blast radius, right, that we can limit the impact so that we affect the fewest number of constituents.Corey: So, I have to ask this question, and I understand it's loaded and people are going to have opinions with a capital O on it, but since you work for the state of Georgia, are you using GovCloud over in AWS-land?Dmitry: So… [sigh] we do have some footprint in GovCloud, but I actually spent time, even before coming to GTA, trying to talk agencies out of using it. I think there's a big misconception, right? People say, “I'm government. They called it GovCloud. Surely I need to be there.”But back when I was with AWS, you know, I would point-blank tell people that really I know it's called GovCloud, but it's just a poorly named region. There are some federal requirements that it meets; it was built around the ITAR, which is International Traffic of Arms Regulations, but states aren't in that business, right? They are dealing with HIPAA data, with various criminal justice data, and other things, but all of those things can run just fine on the commercial side. And truthfully, it's cheaper and easier to run on the commercial side. And that's one of the concerns I have is that if the commercial regions meet those requirements, is there a reason to go into GovCloud, just because you get some extra certifications? So, I still spend time trying to talk agencies out of going to GovCloud. Ultimately, the agencies with their apps make the choice of where they go, but we have been pretty good about reducing the footprint in GovCloud unless it's absolutely necessary.Corey: Has this always been the case? Because my distant recollection around all of this has been that originally when GovCloud first came out, it was a lot harder to run a whole bunch of workloads in commercial regions. And it feels like the commercial regions have really stepped up as far as what compliance boxes they check. So, is one of those stories where five or ten years ago, whenever it GovCloud first came out, there were a bunch of reasons to use it that no longer apply?Dmitry: I actually can't go past I'll say, seven or eight years, but certainly within the last eight years, there's not been a reason for state and local governments to use it. At the federal level, that's a different discussion, but for most governments that I worked with and work with now, the commercial regions have been just fine. They've met the compliance requirements, controls, and everything that's in place without having to go to the GovCloud region.Corey: Something I noticed that was strange to me about the whole GovCloud approach when I was at the most recent public sector summit that AWS threw is whenever I was talking to folks from AWS about GovCloud and adopting it and launching new workloads and the rest, unlike in almost any other scenario, they seemed that their first response—almost a knee jerk reflex—was to pass that work off to one of their partners. Now, on the commercial side, AWS will do that when it makes sense, and each one becomes a bit of a judgment call, but it just seemed like every time someone's doing something with GovCloud, “Oh, talk to Company X or Company Y.” And it wasn't just one or two companies; there were a bunch of them. Why is that?Dmitry: I think a lot of that is because of the limitations within GovCloud, right? So, when you look at anything that AWS rolls out, it almost always rolls out into either us-east-1 or us-west-2, right, one of those two regions, and it goes out worldwide. And then it comes out in GovCloud months, sometimes even years later. And in fact, sometimes there are features that never show up in GovCloud. So, there's not parity there, and I think what happens is, it's these partners that know what limitations GovCloud has and what things are missing and GovCloud they still have to work around.Like, I remember when I started with AWS back in 2016, right, there had been a new console, you know, the new skin that everyone's now familiar with. But that old console, if you remember that, that was in GovCloud for years afterwards. I mean, it took them at least two more years to get GovCloud to even look like the current commercial console that you see. So, it's things like that where I think AWS themselves want to keep moving forward and having to do anything with kind of that legacy platform that doesn't have all the bells and whistles is why they say, “Go get a partner [unintelligible 00:08:06] those things that aren't there yet.”Corey: That's it makes a fair bit of sense. What I was always wondering how much of this was tied to technical challenges working within those, and building solutions that don't depend upon things. “Oh, wait, that one's not available in GovCloud,” versus a lack of ability to navigate the acquisition process for a lot of governments natively in the same way that a lot of their customers can.Dmitry: Yeah, I don't think that's the case because even to get a GovCloud account, you have to start off with a commercial account, right? So, you actually have to go through the same purchasing steps and then essentially, click an extra button or two.Corey: Oh, I've done that myself already. I have a shitposting account and a—not kidding—Ministry of Shitposting GovCloud account. But that's also me just kicking the tires on it. As I went through the process, it really felt like everything was built around a bunch of unstated assumption—because of course you've worked within GovCloud before and you know where these things are. And I kept tripping into a variety of different aspects of that. I'm wondering how much of that is just due to the fact that partners are almost always the ones guiding customers through that.Dmitry: Yeah. It is almost always that. There's very few people, even in the AWS world, right, if you look at all the employees they have there, it's small subset that work with that environment, and probably an even smaller subset of those that understand what it's really needed for. So, this is where if there's not good understanding, you're better off handing it off to a partner. But I don't think it is the purchasing side of things. It really is the regulatory things and just having someone else sign off on a piece of paper, above and beyond just AWS themselves.Corey: I am curious, since it seems that people love to talk about multi-cloud in a variety of different ways, but I find there's a reality that, ehh, basically, on a long enough timeline, everyone uses everything, versus the idea of, “Oh, we're going to build everything so we can seamlessly flow from one provider to another.” Are you folks all in on AWS? Are you using a bunch of different cloud providers for different workloads? How are you approaching a cloud strategy?Dmitry: So, when you say ‘you guys,' I'll say—as AWS will always say—“It depends.” So, GTA is multi-cloud. We support AWS, we support OCI, we support Azure, and we are working towards getting Google in as well, GCP. However, on the agency side, I am encouraging agencies to pick a cloud. And part of that is because you do have limited staff, they are all different, right?They'll do similar things, but if it's done in a different way and you don't have people that know those little tips and tricks, kind of how to navigate certain cloud vendors, it just makes things more difficult. So, I always look at it as kind of the car analogy, right? Most people are not multi-car, right? You go you buy a car—Toyota, Ford, whatever it is—and you're committed to that thing for the next 4 or 5, 10 years, however long you own it, right? You may not like where the cupholder is or you need to get used to something, you know, being somewhere else, but you do commit to it.And I think it's the same thing with cloud that, you know, do you have to be in one cloud for the rest of your life? No, but know that you're not going to hop from cloud to cloud. No one really does. No one says, “Every six months, I'm going to go move my application from one cloud to another.” It's a pretty big lift and no one really needs to do that. Just find the one that's most comfortable for you.Corey: I assume that you have certain preferences as far as different cloud providers go. But I've found even in corporate life that, “Well, I like this company better than the other,” is generally not the best basis for making sweeping decisions around this. What frameworks do you give various departments to consider where a given workload should live? Like, how do you advise them to think about this?Dmitry: You know, it's funny, we actually had a call with an agency recently that said, “You know, we don't know cloud. What do you guys think we should do?” And it was for a very small, I don't want to call it workload; it was really for some DNS work that they wanted to do. And really came down to, for that size and scale, right, we're looking at a few dollars, maybe a month, they picked it based on the console, right? They liked one console over another.Not going to get into which cloud they picked, but we wound up them giving them a demo of here's what this looks like in these various cloud providers. And they picked that just because they liked the buttons and the layout of one console over another. Now, having said that, for obviously larger workloads, things that are more important, there is criteria. And in many cases, it's also the vendors. Probably about 60 to 70% of the applications we run are all vendor-provided in some way, and the vendors will often dictate platforms that they'll support over others, right?So, that supportability is important to us. Just like you were saying, no one wants code rolled out overnight and surprise all the constituents one day. We take our vendor relations pretty seriously and we take our cue from them. If we're buying software from someone and they say, “Look, this is better in AWS,” or, “This is better in OCI,” for whatever reasons they have, will go in that direction more often than not.Corey: I made a crack at the beginning of the episode where the state was founded 235 years ago, as of this recording. So, how accurate is that? I have to imagine that back in those days, they didn't really have a whole lot of computers, except probably something from IBM. How much technical debt are you folks actually wrestling with?Dmitry: It's pretty heavy. One of the biggest things we have is, we ourselves, in our data center, still have a mainframe. That mainframe is used for a lot of important work. Most notably, a lot of healthcare benefits are really distributed through that system. So, you're talking about federal partnerships, you're talking about, you know, insurance companies, health care providers, all somehow having—Corey: You're talking about things that absolutely, positively cannot break.Dmitry: Yep, exactly. We can't have outages, we can't have blips, and they've got to be accurate. So, even that sort of migration, right, that's not something that we can do overnight. It's something we've been working on for well over a year, and right now we're targeting probably roughly another year or so to get that fully migrated out. And even there, we're doing what would be considered a traditional lift-and-shift. We're going to mainframe emulation, we're not going cloud-native, we're not going to do a whole bunch of refactoring out of the gate. It's just picking up what's working and running and just moving it to a new venue.Corey: Did they finally build an AWS/400 that you can run that out? I didn't realize they had a mainframe emulation offering these days.Dmitry: They do. There's actually several providers that do it. And there's other agencies in the state that have made this sort of move as well, so we're also not even looking to be innovators in that respect, right? We're not going to be first movers to try that out. We'll have another agency make that move first and now we're doing this with our Department of Human Services.But yeah, a lot of technical debt around that platform. When you look at just the cost of operating these platforms, that mainframe costs the state roughly $15 million a year. We think in the cloud, it's going to wind up costing us somewhere between 3 to 4 million. Even if it's 5 million, that's still considerable savings over what we're paying for today. So, it's worth making that move, but it's still very deliberate, very slow, with a lot of testing along the way. But yeah, you're talking about that workload has been in the state, I want to say, for over 20, 25 years.Corey: So, what's the reason to move it? Because not for nothing, but there's an old—the old saw, “Well, don't fix it if it ain't broke.” Well, what's broke about it?Dmitry: Well, there's a couple of things. First off, the real estate that it takes up as an issue. It is a large machine sitting on a floor of a data center that we've got to consolidate to. We actually have some real estate constraints and we've got to cut down our footprint by next year, contractually, right? We've agreed, we're going to move into a smaller space.The other part is the technical talent. While yes, it's not broke, things are working on it, there are fewer and fewer people that can manage it. What we've found was doing a complete refactor while doing a move anywhere, is really too risky, right? Rewriting everything with a bunch of Lambdas is kind of scary, as well as moving it into another venue. So, there are mainframe emulators out there that will run in the cloud. We've gotten one and we're making this move now. So, we're going to do that lift-and-shift in and then look to refactor it piecemeal.Corey: Specifics are always going to determine, but as a general point, I felt like I am the only voice in the room sometimes advocating in favor of lift-and-shift. Because people say, “Oh, it's terrible for reasons X, Y, and Z.” It's, “Yes, all of your options are terrible and for the common case, this is the one that I have the sneaking suspicion, based upon my lived experience, is going to be the least bad have all of those various options.” Was there a thought given to doing a refactor in flight?Dmitry: So… from the time I got here, no. But I could tell you just having worked with the state even before coming in as CTO, there were constant conversations about a refactor. And the problem is, no one actually has an appetite for it. Everyone talks about it, but then when you say, “Look, there's a risk to doing this,”—right, governments are about minimizing risk—when you say, “Look, there's a risk to rewriting and moving code at the same time and it's going to take years longer,” right, that refactoring every time, I've seen an estimate, it would be as small as three years, as large as seven or eight years, depending on who was doing the estimate. Whereas the lift-and-shift, we're hoping we can get it done in two years, but even if it's two-and-a-half, it's still less than any of the estimates we've seen for a refactor and less risky. So, we're going with that model and we'll tinker and optimize later. But we just need to get out of that mainframe so that we can have more modern technology and more modern support.Corey: It seems like the right approach. I'm sorry, I didn't mean to frame that is quite as insulting as it might have come across. Like, “Did anyone consider other options just out of curi—” of course. Whenever you're making big changes, we're going to throw a dart at a whiteboard. It's not what appears to be Twitter's current product strategy we're talking about here. This is stuff that's very much measure twice, cut once.Dmitry: Yeah. Very much so. And you see that with just about everything we do here. I know, when the state, what now, three years ago, moved their tax system over to AWS, not only did they do two or three trial runs of just the data migration, we actually wound up doing six, right? You're talking about adding two months of testing just to make sure every time we did the data move, it was done correctly and all the data got moved over. I mean, government is very, very much about measure three, four times, cut once.Corey: Which is kind of the way you'd want it. One thing that I found curious whenever I've been talking to folks in the public sector space around things that they care about—and in years past, I periodically tried to, “Oh, should we look at doing some cost consulting for folks in this market?” And by and large, there have been a couple of exceptions, but—generally, in our experience with sovereign governments, more so than municipal or state ones—but saving money is not usually one of the top three things that governments care about when it comes to their AWS's state. Is cost something that's on your radar? And how do you conceptualize around this? And I should also disclose, this is not in any way, shape, or form intended to be a sales pitch.Dmitry: Yeah, no, cost actually, for GTA. Is a concern. But I think it's more around the way we're structured. I have worked with other governments where they say, “Look, we've already gotten an allotment of money. It costs whatever it costs and we're good with it.”With the way my organization is set up, though, we're not appropriated funds, meaning we're not given any tax dollars. We actually have to provide services to the agencies and they pay us for it. And so, my salary and everyone else's here, all the work that we do, is basically paid for by agencies and they do have a choice to leave. They could go find other providers. It doesn't have to be GTA always.So, cost is a consideration. But we're also finding that we can get those cost savings pretty easily with this move to the cloud because of the number of available tools that we now have available. We have—that data center I talked about, right? That data center is obviously locked down, secured, very limited access, you can't walk in, but that also prevents agencies from doing a lot of day-to-day work that now in the cloud, they can do on their own. And so, the savings are coming just from this move of not having to have as much locks away from the agency, but having more locks from the outside world as well, right? There's definitely scaling up in the number of tools that they have available to them to work around their applications that they didn't have before.Corey: It's, on some level, a capability story, I think, when it comes to cloud. But something I have heard from a number of folks is that even more so than in enterprises, budgets tend to be much more fixed things in the context of cloud in government. Often in enterprises, what you'll see is sprawl: someone leaves something running and oops, the bill wound up going up higher than we projected for this given period of time. When we start getting into the realm of government, that stops being a you broke budgeting policy and starts to resemble things that are called crimes. How do you wind up providing governance as a government around cloud usage to avoid, you know, someone going to prison over a Managed NAT Gateway?Dmitry: Yeah. So, we do have some pretty stringent monitoring. I know, even before the show, we talked about fact that we do have a separate security group. So, on that side of it, they are keeping an eye on what are people doing in the cloud. So, even though agencies now have more access to more tooling, they can do more, right, GTA hasn't stepped back from it and so, we're able to centrally manage things.We've put in a lot of controls. In fact, we're using Control Tower. We've got a lot of guardrails put in, even basic things like you can't run things outside of the US, right? We don't want you running things in the India region or anywhere in South America. Like, that's not even allowed, so we're able to block that off.And then we've got some pretty tight financial controls where we're watching the spend on a regular basis, agency by agency. Not enforcing any of it, obviously, agencies know what they're doing and it's their apps, but we do warn them of, “Hey, we're seeing this trend or that trend.” We've been at this now for about a year-and-a-half, and so agencies are starting to see that we provide more oversight and a lot less pressure, but at the same time, there's definitely a lot more collaboration assistance with one another.Corey: It really feels like the entire procurement model is shifted massively. As opposed to going out for a bunch of bids and doing all these other things, it's consumption-based. And that has been—I know for enterprises—a difficult pill for a lot of their procurement teams to wind up wrapping their heads around. I can only imagine what that must be like for things that are enshrined in law.Dmitry: Yeah, there's definitely been a shift, although it's not as big as you would think on that side because you do have cloud but then you also have managed services around cloud, right? So, you look at AWS, OCI, Azure, no one's out there putting a credit card down to open an environment anymore, you know, a tenant or an account. It is done through procurement rules. Like, we don't actually buy AWS directly from AWS; we go through a reseller, right, so there's some controls there as well from the procurement side. So, there's still a lot of oversight.But it is scary to some of our procurement people. Like, AWS Marketplace is a very, very scary place for them, right? The fact that you can go and—you can hire people at Marketplace, you could buy things with a single button-click. So, we've gone out of our way, in my agency, to go through and lock that down to make sure that before anyone clicks one of those purchase buttons, that we at least know about it, they've made the request, and we have to go in and unlock that button for that purchase. So, we've got to put in more controls in some cases. But in other cases, it has made things easier.Corey: As you look across the landscape of effectively, what you're doing is uprooting an awful lot of technical systems that have been in place for decades at this point. And we look at cloud and I'm not saying it's not stable—far from it—but it also feels a little strange to be, effectively, making a similar timespan of commitment—because functionally a lot of us are—when we look at these platforms. Was that something that had already been a pre-existing appetite for when you started the role or is that something that you've found that you've had to socialize in the last couple years?Dmitry: It's a little bit of both. It's been lumpy, agency by agency, I'll say. There are some agencies that are raring to go, they want to make some changes, do a lot of good, so to speak, by upgrading their infrastructure. There are others that will sit and say, “Hey, I've been doing this for 20, 30 years. It's been fine.” That whole, “If it ain't broke, don't fix it,” mindset.So, for them, there's definitely been, you know, a lot more friction to get them going in that direction. But what I'm also finding is the people with their hands on the keyboards, right, the ones that are doing the work, are excited by this. This is something new for them. In addition to actually going to cloud, the other thing we've been doing is providing a lot of different training options. And so, that's something that's perked people up and definitely made them much more excited to come into work.I know, down at the, you know, the operator level, the administrators, the managers, all of those folks, are pretty pleased with the moves we're making. You do get some of the folks in upper management in the agencies that do say, “Look, this is a risk.” We're saying, “Look, it's a risk not to do this.” Right? You've also got to think about staffing and what people are willing to work on. Things like the mainframe, you know, you're not going to be able to hire those people much longer. They're going to be fewer and far between. So, you have to retool. I do tell people that, you know, if you don't like change, IT is probably not the industry to be in, even in government. You probably want to go somewhere else, then.Corey: That is sort of the next topic I want to get into, where companies across the board are finding it challenging to locate and source talent to work in their environments. How has the process of recruiting cloud talent gone for you?Dmitry: It's difficult. Not going to sugarcoat that. It's, it's—Corey: [laugh]. I'm not sure anyone would say otherwise, no matter where you are. You can pay absolutely insane, top-of-market money and still have that exact same response. No one says, “Oh, it's super easy.” Everyone finds it hard. But please continue [laugh].Dmitry: Yeah, but it's also not a problem that we can even afford to throw money at, right? So, that's not something that we'd ever do. But what I have found is that there's actually a lot of people, really, that I'll say are tech adjacent, that are interested in making that move. And so, for us, having a mentoring and training program that bring people in and get them comfortable with it is probably more important than finding the talent exactly as it is, right? If you look at our job descriptions that we put out there, we do want things like cloud certs and certain experience, but we'll drop off things like certain college requirements. Say, “Look, do you really need a college degree if you know what you're doing in the cloud or if you know what you're doing with a database and you can prove that?”So, it's re-evaluating who we're bringing in. And in some cases, can we also train someone, right, bring someone in for a lower rate, but willing to learn and then give them the experience, knowing that they may not be here for 15, 20 years and that's okay. But we've got to retool that model to say, we expect some attrition, but they walk away with some valuable skills and while they're here, they learn those skills, right? So, that's the payoff for them.Corey: I think that there's a lot of folks exploring that where there are people who have the interest and the aptitude that are looking to transition in. So, much of the discussion points around filling the talent pipeline have come from a place of, oh, we're just going to talk to all the schools and make sure that they're teaching people the right way. And well, colleges aren't really aimed at being vocational institutions most of the time. And maybe you want people who can bring an understanding of various aspects of business, of workplace dynamics, et cetera, and even the organization themselves, you can transition them in. I've always been a big fan of helping people lateral from one part of an organization to another. It's nice to see that there's actual formal processes around that for you, folks.Dmitry: Yeah, we're trying to do that and we're also working across agencies, right, where we might pull someone in from another agency that's got that aptitude and willingness, especially if it's someone that already has government experience, right, they know how to work within the system that we have here, it certainly makes things easier. It's less of a learning curve for them on that side. We think, you know, in some cases, the technical skills, we can teach you those, but just operating in this environment is just as important to understand the soft side of it.Corey: No, I hear you. One thing that I've picked up from doing this show and talking to people in the different places that you all tend to come from, has been that everyone's working with really hard problems and there's a whole universe of various constraints that everyone's wrestling with. The biggest lie in our industry across the board that I'm coming to realize is any whiteboard architecture diagram. Full stop. The real world is messy.Nothing is ever quite like it looks like in that sterile environment where you're just designing and throwing things up there. The world is built on constraints and trade-offs. I'm glad to see that you're able to bring people into your organization. I think it gives an awful lot of folks hope when they despair about seeing what some of the job prospects are for folks in the tech industry, depending on what direction they want to go in.Dmitry: Yeah. I mean, I think we've got the same challenge as everyone else does, right? It is messy. The one thing that I think is also interesting is that we also have to have transparency but to some degree—and I'll shift; I know this wasn't meant to kind of go off into the security side of things, but I think one of the things that's most interesting is trying to balance a security mindset with that transparency, right?You have private corporations, other organizations that they do whatever they do, they're not going to talk about it, you don't need to know about it. In our case, I think we've got even more of a challenge because on the one hand, we do want to lock things down, make sure they're secure and we protect not just the data, but how we do things, right, some are mechanisms and methods. But same time, we've got a responsibility to be transparent to our constituents. They've got to be able to see what we're doing, what are we spending money on? And so, to me, that's also one of the biggest challenges we have is how do we make sure we balance that out, that we can provide people and even our vendors, right, a lot of times our vendors [will 00:30:40] say, “How are you doing something? We want to know so that we can help you better in some areas.” And it's really become a real challenge for us.Corey: I really want to thank you for taking the time to speak with me about what you're doing. If people want to learn more, where's the best place for them to find you?Dmitry: I guess now it's no longer called Twitter, but really just about anywhere. Twitter, Instagram—I'm not a big Instagram user—LinkedIn, Dmitry Kagansky, there's not a whole lot of us out there; pretty easy to do a search. But also you'll see there's my contact info, I believe, on the GTA website, just gta.ga.gov.Corey: Excellent. We will, of course, put links to that in the [show notes 00:31:20]. Thank you so much for being so generous with your time. I really appreciate it.Dmitry: Thank you, Corey. I really appreciate it as well.Corey: Dmitry Kagansky, CTO for the state of Georgia. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice along with an angry, insulting comment telling me that I've got it all wrong and mainframes will in fact rise again.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.
In this special live-recorded episode of Screaming in the Cloud, Corey interviews himself— well, kind of. Corey hosts an AMA session, answering both live and previously submitted questions from his listeners. Throughout this episode, Corey discusses misconceptions about his public persona, the nature of consulting on AWS bills, why he focuses so heavily on AWS offerings, his favorite breakfast foods, and much, much more. Corey shares insights into how he monetizes his public persona without selling out his genuine opinions on the products he advertises, his favorite and least favorite AWS services, and some tips and tricks to get the most out of re:Invent.About CoreyCorey is the Chief Cloud Economist at The Duckbill Group. Corey's unique brand of snark combines with a deep understanding of AWS's offerings, unlocking a level of insight that's both penetrating and hilarious. He lives in San Francisco with his spouse and daughters.Links Referenced: lastweekinaws.com/disclosures: https://lastweekinaws.com/disclosures duckbillgroup.com: https://duckbillgroup.com TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: As businesses consider automation to help build and manage their hybrid cloud infrastructures, deployment speed is important, but so is cost. Red Hat Ansible Automation Platform is available in the AWS Marketplace to help you meet your cloud spend commitments while delivering best-of-both-worlds support.Corey: Well, all right. Thank you all for coming. Let's begin and see how this whole thing shakes out, which is fun and exciting, and for some godforsaken reason the lights like to turn off, so we're going to see if that continues. I've been doing Screaming in the Cloud for about, give or take, 500 episodes now, which is more than a little bit ridiculous. And I figured it would be a nice change of pace if I could, instead of reaching out and talking to folks who are innovative leaders in the space and whatnot, if I could instead interview my own favorite guest: myself.Because the entire point is, I'm usually the one sitting here asking questions, so I'm instead going to now gather questions from you folks—and feel free to drop some of them into the comments—but I've solicited a bunch of them, I'm going to work through them and see what you folks want to know about me. I generally try to be fairly transparent, but let's have fun with it. To be clear, if this is your first exposure to my Screaming in the Cloud podcast show, it's generally an interview show talking with people involved with the business of cloud. It's not intended to be snarky because not everyone enjoys thinking on their feet quite like that, but rather a conversation of people about what they're passionate about. I'm passionate about the sound of my own voice. That's the theme of this entire episode.So, there are a few that have come through that are in no particular order. I'm going to wind up powering through them, and again, throw some into the comments if you want to have other ones added. If you're listening to this in the usual Screaming in the Cloud place, well, send me questions and I am thrilled to wind up passing out more of them. The first one—a great one to start—comes with someone asked me a question about the video feed. “What's with the Minecraft pickaxe on the wall?” It's made out of foam.One of my favorite stories, and despite having a bunch of stuff on my wall that is interesting and is stuff that I've created, years ago, I wrote a blog post talking about how machine learning is effectively selling digital pickaxes into a gold rush. Because the cloud companies pushing it are all selling things such as, you know, they're taking expensive compute, large amounts of storage, and charging by the hour for it. And in response, Amanda, who runs machine learning analyst relations at AWS, sent me that by way of retaliation. And it remains one of my absolute favorite gifts. It's, where's all this creativity in the machine-learning marketing? No, instead it's, “We built a robot that can think. But what are we going to do with it now? Microsoft Excel.” Come up with some of that creativity, that energy, and put it into the marketing side of the world.Okay, someone else asks—Brooke asks, “What do I think is people's biggest misconception about me?” That's a good one. I think part of it has been my misconception for a long time about what the audience is. When I started doing this, the only people who ever wound up asking me anything or talking to me about anything on social media already knew who I was, so I didn't feel the need to explain who I am and what I do. So, people sometimes only see the witty banter on Twitter and whatnot and think that I'm just here to make fun of things.They don't notice, for example, that my jokes are never calling out individual people, unless they're basically a US senator, and they're not there to make individual humans feel bad about collectively poor corporate decision-making. I would say across the board, people think that I'm trying to be meaner than I am. I'm going to be honest and say it's a little bit insulting, just from the perspective of, if I really had an axe to grind against people who work at Amazon, for example, is this the best I'd be able to do? I'd like to think that I could at least smack a little bit harder. Speaking of, we do have a question that people sent in in advance.“When was the last time that Mike Julian gave me that look?” Easy. It would have been two days ago because we were both in the same room up in Seattle. I made a ridiculous pun, and he just stared at me. I don't remember what the pun is, but I am an incorrigible punster and as a result, Mike has learned that whatever he does when I make a pun, he cannot incorrige me. Buh-dum-tss. That's right. They're no longer puns, they're dad jokes. A pun becomes a dad joke once the punch line becomes a parent. Yes.Okay, the next one is what is my favorite AWS joke? The easy answer is something cynical and ridiculous, but that's just punching down at various service teams; it's not my goal. My personal favorite is the genie joke where a guy rubs a lamp, Genie comes out and says, “You can have a billion dollars if you can spend $100 million in a month, and you're not allowed to waste it or give it away.” And the person says, “Okay”—like, “Those are the rules.” Like, “Okay. Can I use AWS?” And the genie says, “Well, okay, there's one more rule.” I think that's kind of fun.Let's see, another one. A hardball question: given the emphasis on right-sizing for meager cost savings and the amount of engineering work required to make real architectural changes to get costs down, how do you approach cost controls in companies largely running other people's software? There are not as many companies as you might think where dialing in the specifics of a given application across the board is going to result in meaningful savings. Yes, yes, you're running something in hyperscale, it makes an awful lot of sense, but most workloads don't do that. The mistakes you most often see are misconfigurations for not knowing this arcane bit of AWS trivia, as a good example. There are often things you can do with relatively small amounts of effort. Beyond a certain point, things are going to cost what they're going to cost without a massive rearchitecture and I don't advise people do that because no one is going to be happy rearchitecting just for cost reasons. Doesn't go well.Someone asks, “I'm quite critical of AWS, which does build trust with the audience. Has AWS tried to get you to market some of their services, and would I be open to do that?” That's a great question. Yes, sometimes they do. You can tell this because they wind up buying ads in the newsletter or the podcast and they're all disclaimed as a sponsored piece of content.I do have an analyst arrangement with a couple of different cloud companies, as mentioned lastweekinaws.com/disclosures, and the reason behind that is because you can buy my attention to look at your product and talk to you in-depth about it, but you cannot buy my opinion on it. And those engagements are always tied to, let's talk about what the public is seeing about this. Now, sometimes I write about the things that I'm talking about because that's where my mind goes, but it's not about okay, now go and talk about this because we're paying you to, and don't disclose that you have a financial relationship.No, that is called fraud. I figure I can sell you as an audience out exactly once, so I better be able to charge enough money to never have to work again. Like, when you see me suddenly talk about multi-cloud being great and I became a VP at IBM, about three to six months after that, no one will ever hear from me again because I love nesting doll yacht money. It'll be great.Let's see. The next one I have on my prepared list here is, “Tell me about a time I got AWS to create a pie chart.” I wish I'd see less of it. Every once in a while I'll talk to a team and they're like, “Well, we've prepared a PowerPoint deck to show you what we're talking about.” No, Amazon is famously not a PowerPoint company and I don't know why people feel the need to repeatedly prove that point to me because slides are not always the best way to convey complex information.I prefer to read documents and then have a conversation about them as Amazon tends to do. The visual approach and the bullet lists and all the rest are just frustrating. If I'm going to do a pie chart, it's going to be in service of a joke. It's not going to be anything that is the best way to convey information in almost any sense.“How many internal documents do I think reference me by name at AWS,” is another one. And I don't know the answer to documents, but someone sent me a screenshot once of searching for my name in their Slack internal nonsense thing, and it was about 10,000 messages referenced me that it found. I don't know what they were saying. I have to assume, on some level, just something that does a belt feed from my Twitter account where it lists my name or something. But I choose to believe that no, they actually are talking about me to that level of… of extreme.Let's see, let's turn back to the chat for a sec because otherwise it just sounds like I'm doing all prepared stuff. And I'm thrilled to do that, but I'm also thrilled to wind up fielding questions from folks who are playing along on these things. “I love your talk, ‘Heresy in the Church of Docker.' Do I have any more speaking gigs planned?” Well, today's Wednesday, and this Friday, I have a talk that's going out at the CDK Community Day.I also have a couple of things coming up that are internal corporate presentations at various places. But at the moment, no. I suspect I'll be giving a talk if they accept it at SCALE in Pasadena in March of next year, but at the moment, I'm mostly focused on re:Invent, just because that is eight short weeks away and I more or less destroy the second half of my year because… well, holidays are for other people. We're going to talk about clouds, as Amazon and the rest of us dance to the tune that they play.“Look in my crystal ball; what will the industry look like in 5, 10, or 20 years?” Which is a fun one. You shouldn't listen to me on this. At all. I was the person telling you that virtualization was a flash in the pan, that cloud was never going to catch on, that Kubernetes and containers had a bunch of problems that were unlikely to be solved, and I'm actually kind of enthused about serverless which probably means it's going to flop.I am bad at predicting overall trends, but I have no problem admitting that wow, I was completely wrong on that point, which apparently is a rarer skill than it should be. I don't know what the future the industry holds. I know that we're seeing some AI value shaping up. I think that there's going to be a bit of a downturn in that sector once people realize that just calling something AI doesn't mean you make wild VC piles of money anymore. But there will be use cases that filter out of it. I don't know what they're going to look like yet, but I'm excited to see it.Okay, “Have any of the AWS services increased costs in the last year? I was having a hard time finding historical pricing charts for services.” There have been repricing stories. There have been SMS charges in India that have—and pinpointed a few other things—that wound up increasing because of a government tariff on them and that cost was passed on. Next February, they're going to be charging for public IPV4 addresses.But those tend to be the exceptions. The way that most costs tend increase have been either, it becomes far cheaper for AWS to provide a service and they don't cut the cost—data transfer being a good example—they'll also often have stories in that they're going to start launching a bunch of new things, and you'll notice that AWS bills tend to grow in time. Part of that growth, part of that is just cruft because people don't go back and clean things up. But by and large, I have not seen, “This thing that used to cost you $1 is now going to cost you $2.” That's not how AWS does pricing. Thankfully. Everyone's always been scared of something like that happening. I think that when we start seeing actual increases like that, that's when it's time to start taking a long, hard look at the way that the industry is shaping up. I don't think we're there yet.Okay. “Any plans for a Last Week in Azure or a Last Week in GCP?” Good question. If so, I won't be the person writing it. I don't think that it's reasonable to expect someone to keep up with multiple large companies and their releases. I'd also say that Azure and GCP don't release updates to services with the relentless cadence that AWS does.The reason I built the thing to start with is simply because it was difficult to gather all the information in one place, at least the stuff that I cared about with an economic impact, and by the time I'd done that, it was, well, this is 80% of the way toward republishing it for other people. I expected someone was going to point me at a thing so I didn't have to do it, and instead, everyone signed up. I don't see the need for it. I hope that in those spaces, they're better at telling their own story to the point where the only reason someone would care about a newsletter would be just my sarcasm tied into whatever was released. But that's not something that I'm paying as much attention to, just because my customers are on AWS, my stuff is largely built on AWS, it's what I have to care about.Let's see here. “What do I look forward to at re:Invent?” Not being at re:Invent anymore. I'm there for eight nights a year. That is shitty cloud Chanukah come to life for me. I'm there to set things up in advance, I'm there to tear things down at the end, and I'm trying to have way too many meetings in the middle of all of that. I am useless for the rest of the year after re:Invent, so I just basically go home and breathe into a bag forever.I had a revelation last year about re:Play, which is that I don't have to go to it if I don't want to go. And I don't like the cold, the repetitive music, the giant crowds. I want to go read a book in a bathtub and call it a night, and that's what I hope to do. In practice, I'll probably go grab dinner with other people who feel the same way. I also love the Drink Up I do there every year over at Atomic Liquors. I believe this year, we're partnering with the folks over at RedMonk because a lot of the people we want to talk to are in the same groups.It's just a fun event: show up, let us buy you drinks. There's no badge scan or any nonsense like that. We just want to talk to people who care to come out and visit. I love doing that. It's probably my favorite part of re:Invent other than not being at re:Invent. It's going to be on November 29th this year. If you're listening to this, please come on by if you're unfortunate enough to be in Las Vegas.Someone else had a good question I want to talk about here. “I'm a TAM for AWS. Cost optimization is one of our functions. What do you wish we would do better after all the easy button things such as picking the right instance and family, savings plans RIs, turning off or delete orphan resources, watching out for inefficient data transfer patterns, et cetera?” I'm going to back up and say that you're begging the question here, in that you aren't doing the easy things, at least not at scale, not globally.I used to think that all of my customer engagements would be, okay after the easy stuff, what's next? I love those projects, but in so many cases, I show up and those easy things have not been done. “Well, that just means that your customers haven't been asking their TAM.” Every customer I've had has asked their TAM first. “Should we ask the free expert or the one that charges us a large but reasonable fixed fee? Let's try the free thing first.”The quality of that advice is uneven. I wish that there were at least a solid baseline. I would love to get to a point where I can assume that I can go ahead and be able to just say, “Okay, you've clearly got your RI stuff, you're right-sizing, you're deleting stuff you're not using, taken care of. Now, let's look at the serious architecture stuff.” It's just rare that I get to see it.“What tool, feature, or widget do I wish AWS would build into the budget console?” I want to be able to set a dollar figure, maybe it's zero, maybe it's $20, maybe it is irrelevant, but above whatever I set, the account will not charge me above that figure, period. If that means they have to turn things off if that means they had to delete portions of data, great. But I want that assurance because even now when I kick the tires in a new service, I get worried that I'm going to wind up with a surprise bill because I didn't understand some very subtle interplay of the dynamics. And if I'm worried about that, everyone else is going to wind up getting caught by that stuff, too.I want the freedom to experiment and if it smacks into a wall, okay, cool. That's $20. That was worth learning that. Whatever. I want the ability to not be charged unreasonable overages. And I'm not worried about it turning from 20 into 40. I'm worried about it turning from 20 into 300,000. Like, there's the, “Oh, that's going to have a dent on the quarterlies,” style of [numb 00:16:01]—All right. Someone also asked, “What is the one thing that AWS could do that I believe would reduce costs for both AWS and their customers. And no, canceling re:Invent doesn't count.” I don't think about it in that way because believe it or not, most of my customers don't come to me asking to reduce their bill. They think they do at the start, but what they're trying to do is understand it. They're trying to predict it.Yes, they want to turn off the waste in the rest, but by and large, there are very few AWS offerings that you take a look at and realize what you're getting for it and say, “Nah, that's too expensive.” It can be expensive for certain use cases, but the dangerous part is when the costs are unpredictable. Like, “What's it going to cost me to run this big application in my data center?” The answer is usually, “Well, run it for a month, and then we'll know.” But that's an expensive and dangerous way to go about finding things out.I think that customers don't care about reducing costs as much as they think; they care about controlling them, predicting them, and understanding them. So, how would they make things less expensive? I don't know. I suspect that data transfer if they were to reduce that at least cross-AZ or eliminate it ideally, you'd start seeing a lot more compute usage in multiple AZs. I've had multiple clients who are not spinning things up in multi-AZ, specifically because they'll take the reliability trade-off over the extreme cost of all the replication flowing back and forth. Aside from that, they mostly get a lot of the value right in how they price things, which I don't think people have heard me say before, but it is true.Someone asked a question here of, “Any major trends that I'm seeing in EDP/PPA negotiations?” Yeah, lately, in particular. Used to be that you would have a Marketplace as the fallback, where it used to be that 50 cents of every dollar you spent on Marketplace would count. Now, it's a hundred percent up to a quarter of your commit. Great.But when you have a long-term commitment deal with Amazon, now they're starting to push for all—put all your other vendors onto the AWS Marketplace so you can have a bigger commit and thus a bigger discount, which incidentally, the discount does not apply to Marketplace spend. A lot of folks are uncomfortable with having Amazon as the middleman between all of their vendor relationships. And a lot of the vendors aren't super thrilled with having to pay percentages of existing customer relationships to Amazon for what they perceive to be remarkably little value. That's the current one.I'm not seeing generative AI play a significant stake in this yet. People are still experimenting with it. I'm not seeing, “Well, we're spending $100 million a year, but make that 150 because of generative AI.” It's expensive to play with gen-AI stuff, but it's not driving the business spend yet. But that's the big trend that I'm seeing over the past, eh, I would say, few months.“Do I use AWS for personal projects?” The first problem there is, well, what's a personal project versus a work thing? My life is starting to flow in a bunch of weird different ways. The answer is yes. Most of the stuff that I build for funsies is on top of AWS, though there are exceptions. “Should I?” Is the follow-up question and the answer to that is, “It depends.”The person is worrying about cost overruns. So, am I. I tend to not be a big fan of uncontrolled downside risk when something winds up getting exposed. I think that there are going to be a lot of caveats there. I know what I'm doing and I also have the backstop, in my case, of, I figure I can have a big billing screw-up or I have to bend the knee and apologize and beg for a concession from AWS, once.It'll probably be on a billboard or something one of these days. Lord knows I have it coming to me. That's something I can use as a get-out-of-jail-free card. Most people can't make that guarantee, and so I would take—if—depending on the environment that you know and what you want to build, there are a lot of other options: buying a fixed-fee VPS somewhere if that's how you tend to think about things might very well be a cost-effective for you, depending on what you're building. There's no straight answer to this.“Do I think Azure will lose any market share with recent cybersecurity kerfuffles specific to Office 365 and nation-state actors?” No, I don't. And the reason behind that is that a lot of Azure spend is not necessarily Azure usage; it's being rolled into enterprise agreements customers negotiate as part of their on-premises stuff, their operating system licenses, their Office licensing, and the rest. The business world is not going to stop using Excel and Word and PowerPoint and Outlook. They're not going to stop putting Windows on desktop stuff. And largely, customers don't care about security.They say they do, they often believe that they do, but I see where the bills are. I see what people spend on feature development, I see what they spend on core infrastructure, and I see what they spend on security services. And I have conversations about budgeting with what are you doing with a lot of these things? The companies generally don't care about this until right after they really should have cared. And maybe that's a rational effect.I mean, take a look at most breaches. And a year later, their stock price is larger than it was when they dispose the breach. Sure, maybe they're burning through their ablated CISO, but the business itself tends to succeed. I wish that there were bigger consequences for this. I have talked to folks who will not put specific workloads on Azure as a result of this. “Will you talk about that publicly?” “No, because who can afford to upset Microsoft?”I used to have guests from Microsoft on my show regularly. They don't talk to me and haven't for a couple of years. Scott Guthrie, the head of Azure, has been on this show. The problem I have is that once you start criticizing their security posture, they go quiet. They clearly don't like me.But their options are basically to either ice me out or play around with my seven seats for Office licensing, which, okay, whatever. They don't have a stick to hit me with, in the way that they do most companies. And whether that's true or not that they're going to lash out like that, companies don't want to take the risk of calling Microsoft out in public. Too big to be criticized as sort of how that works.Let's see, someone else asks, “How can a startup get the most out of its startup status with AWS?” You're not going to get what you think you want from AWS in this context. “Oh, we're going to be a featured partner so they market us.” I've yet to hear a story about how being featured by AWS for something has dramatically changed the fortunes of a startup. Usually, they'll do that when there's either a big social mission and you never hear about the company again, or they're a darling of the industry that's taking the world by fire and they're already [at 00:22:24] upward swing and AWS wants to hang out with those successful people in public and be seen to do so.The actual way that startup stuff is going to manifest itself well for you from AWS is largely in the form of credits as you go through Activate or one of their other programs. But be careful. Treat them like actual money, not this free thing you don't have to worry about. One day they expire or run out and suddenly you're going from having no dollars going to AWS to ten grand a month and people aren't prepared for that. It's, “Wait. So you mean this costs money? Oh, my God.”You have to approach it with a sense of discipline. But yeah, once you—if you can do that, yeah, free money and a free cloud bill for a few years? That's not nothing. I also would question the idea of being able to ask a giant company that's worth a trillion-and-a-half dollars and advice for how to be a startup. I find that one's always a little on the humorous side myself.“What do I think is the most underrated service or feature release from 2023? Full disclosures, this means I'll make some content about it,” says Brooke over at AWS. Oh, that's a good question. I'm trying to remember when various things have come out and it all tends to run together. I think that people are criticizing AWS for charging for IPV4 an awful lot, and I think that that is a terrific change, just because I've seen how wasteful companies are with public IP addresses, which are basically an exhausted or rapidly exhausting resource.And they just—you spend tens or hundreds of thousands of these things and don't use reason to think about that. It'll be one of the best things that we've seen for IPV6 adoption once AWS figures out how to make that work. And I would say that there's a lot to be said for since, you know, IPV4 is exhausted already, now we're talking about can we get them on the secondary markets, you need a reasonable IP plan to get some of those. And… “Well, we just give them the customers and they throw them away.” I want AWS to continue to be able to get those for the stuff that the rest of us are working on, not because one big company uses a million of them, just because, “Oh, what do you mean private IP addresses? What might those be?” That's part of it.I would say that there's also been… thinking back on this, it's unsung, the compute optimizer is doing a lot better at recommending things than it used to be. It was originally just giving crap advice, and over time, it started giving advice that's actually solid and backs up what I've seen. It's not perfect, and I keep forgetting it's there because, for some godforsaken reason, it's its own standalone service, rather than living in the billing console where it belongs. But no one's excited about a service like that to the point where they talk about or create content about it, but it's good, and it's getting better all the time. That's probably a good one. They recently announced the ability for it to do GPU instances which, okay great, for people who care about that, awesome, but it's not exciting. Even I don't think I paid much attention to it in the newsletter.Okay, “Does it make economic sense to bring your own IP addresses to AWS instead of paying their fees?” Bring your own IP, if you bring your own allocation to AWS, costs you nothing in terms of AWS costs. You take a look at the market rate per IP address versus what AWS costs, you'll hit break even within your first year if you do it. So yeah, it makes perfect economic sense to do it if you have the allocation and if you have the resourcing, as well as the ability to throw people at the problem to do the migration. It can be a little hairy if you're not careful. But the economics, the benefit is clear on that once you account for those variables.Let's see here. We've also got tagging. “Everyone nods their heads that they know it's the key to controlling things, but how effective are people at actually tagging, especially when new to cloud?” They're terrible at it. They're never going to tag things appropriately. Automation is the way to do it because otherwise, you're going to spend the rest of your life chasing developers and asking them to tag things appropriately, and then they won't, and then they'll feel bad about it. No one enjoys that conversation.So, having derived tags and the rest, or failing that, having some deployment gate as early in the process as possible of, “Oh, what's the tag for this?” Is the only way you're going to start to see coverage on this. And ideally, someday you'll go back and tag a bunch of pre-existing stuff. But it's honestly the thing that everyone hates the most on this. I have never seen a company that says, “We are thrilled with our with our tag coverage. We're nailing it.” The only time you see that is pure greenfield, everything done without ClickOps, and those environments are vanishingly rare.“Outside a telecom are customers using local zones more, or at all?” Very, very limited as far as what their usage looks like on that. Because that's… it doesn't buy you as much as you'd think for most workloads. The real benefit is a little more expensive, but it's also in specific cities where there are not AWS regions, and at least in the United States where the majority of my clients are, there is not meaningful latency differences, for example, from in Los Angeles versus up to Oregon, since no one should be using the Northern California region because it's really expensive. It's a 20-millisecond round trip, which in most cases, for most workloads, is fine.Gaming companies are big exception to this. Getting anything they can as close to the customer as possible is their entire goal, which very often means they don't even go with some of the cloud providers in some places. That's one of those actual multi-cloud workloads that you want to be able to run anywhere that you can get a baseline computer up to run a container or a golden image or something. That is the usual case. The rest are, for local zones, is largely going to be driven by specific one-off weird things. Good question.Let's see, “Is S3 intelligent tiering good enough or is it worth trying to do it yourself?” Your default choice for almost everything should be intelligent tiering in 2023. It winds up costing you more only in very specific circumstances that are unlikely to be anything other than a corner case for what you're doing. And the exceptions to this are, large workloads that are running a lot of S3 stuff where the lifecycle is very well understood, environments where you're not going to be storing your data for more than 30 days in any case and you can do a lifecycle policy around it. Other than those use cases, yeah, the monitoring fee is not significant in any environment I've ever seen.And people view—touch their data a lot less than they believe. So okay, there's a monitoring fee for object, yes, but it also cuts your raw storage cost in half for things that aren't frequently touched. So, you know, think about it. Run your own numbers and also be aware that first month as it transitions in, you're going to see massive transition charges per object, but wants it's an intelligent tiering, there's no further transition charges, which is nice.Let's see here. “We're all-in on serverless”—oh good, someone drank the Kool-Aid, too—“And for our use cases, it works great. Do I find other customers moving to it and succeeding?” Yeah, I do when they're moving to it because for certain workloads, it makes an awful lot of sense. For others, it requires a complete reimagining of whatever it is that you're doing.The early successes were just doing these periodic jobs. Now, we're seeing full applications built on top of event-driven architectures, which is really neat to see. But trying to retrofit something that was never built with that in mind can be more trouble than it's worth. And there are corner cases where building something on serverless would cost significantly more than building it in a server-ful way. But its time has come for an awful lot of stuff. Now, what I don't subscribe to is this belief that oh, if you're not building something serverless you're doing it totally wrong. No, that is not true. That has never been true.Let's see what else have we got here? Oh, “Following up on local zones, how about Outposts? Do I see much adoption? What's the primary use case or cases?” My customers inherently are coming to me because of a large AWS bill. If they're running Outposts, it is extremely unlikely that they are putting significant portions of their spend through the Outpost. It tends to be something of a rounding error, which means I don't spend a lot of time focusing on it.They obviously have some existing data center workloads and data center facilities where they're going to take an AWS-provided rack and slap it in there, but it's not going to be in the top 10 or even top 20 list of service spend in almost every case as a result, so it doesn't come up. One of the big secrets of how we approach things is we start with a big number first and then work our way down instead of going alphabetically. So yes, I've seen customers using them and the customers I've talked to at re:Invent who are using them are very happy with them for the use cases, but it's not a common approach. I'm not a huge fan of the rest.“Someone said the Basecamp saved a million-and-a-half a year by leaving AWS. I know you say repatriation isn't a thing people are doing, but has my view changed at all since you've published that blog post?” No, because everyone's asking me about Basecamp and it's repatriation, and that's the only use case that they've got for this. Let's further point out that a million-and-a-half a year is not as many engineers as you might think it is when you wind up tying that all together. And now those engineers are spending time running that environment.Does it make sense for them? Probably. I don't know their specific context. I know that a million-and-a-half dollars a year to—even if they had to spend that for the marketing coverage that they're getting as a result of this, makes perfect sense. But cloud has never been about raw cost savings. It's about feature velocity.If you have a data center and you move it to the cloud, you're not going to recoup that investment for at least five years. Migrations are inherently expensive. It does not create the benefits that people often believe that they do. That becomes a painful problem for folks. I would say that there's a lot more noise than there are real-world stories [hanging 00:31:57] out about these things.Now, I do occasionally see a specific workload that is moved back to a data center for a variety of reasons—occasionally cost but not always—and I see proof-of-concept projects that they don't pursue and then turn off. Some people like to call that a repatriation. No, I call it as, “We tried and it didn't do what we wanted it to do so we didn't proceed.” Like, if you try that with any other project, no one says, “Oh, you're migrating off of it.” No, you're not. You tested it, it didn't do what it needed to do. I do see net-new workloads going into data centers, but that's not the same thing.Let's see. “Are the talks at re:Invent worth it anymore? I went to a lot of the early re:Invents and haven't and about five years. I found back then that even the level 400 talks left a lot to be desired.” Okay. I'm not a fan of attending conference talks most of the time, just because there's so many things I need to do at all of these events that I would rather spend the time building relationships and having conversations.The talks are going to be on YouTube a week later, so I would rather get to know the people building the service so I can ask them how to inappropriately use it as a database six months later than asking questions about the talk. Conference-ware is often the thing. Re:Invent always tends to have an AWS employee on stage as well. And I'm not saying that makes these talks less authentic, but they're also not going to get through slide review of, “Well, we tried to build this onto this AWS service and it was a terrible experience. Let's tell you about that as a war story.” Yeah, they're going to shoot that down instantly even though failure stories are so compelling, about here's what didn't work for us and how we got there. It's the lessons learned type of thing.Whenever you have as much control as re:Invent exhibits over its speakers, you know that a lot of those anecdotes are going to be significantly watered down. This is not to impugn any of the speakers themselves; this is the corporate mind continuing to grow to a point where risk mitigation and downside protection becomes the primary driving goal.Let's pull up another one from the prepared list here. “My most annoying, overpriced, or unnecessary charge service in AWS.” AWS Config. It's a tax on using the cloud as the cloud. When you have a high config bill, it's because it charges you every time you change the configuration of something you have out there. It means you're spinning up and spinning down EC2 instances, whereas you're going to have a super low config bill if you, you know, treat it like a big dumb data center.It's a tax on accepting the promises under which cloud has been sold. And it's necessary for a number of other things like Security Hub. Control Towers magic-deploys it everywhere and makes it annoying to turn off. And I think that that is a pure rent-seeking charge because people aren't incurring config charges if they're not already using a lot of AWS things. Not every service needs to make money in a vacuum. It's, “Well, we don't charge anything for this because our users are going to spend an awful lot of money on storing things in S3 to use our service.” Great. That's a good thing. You don't have to pile charge upon charge upon charge upon charge. It drives me a little bit nuts.Let's see what else we have here as far as questions go. “Which AWS service delights me the most?” Eesh, depends on the week. S3 has always been a great service just because it winds up turning big storage that usually—used to require a lot of maintenance and care into something I don't think about very much. It's getting smarter and smarter all the time. The biggest lie is the ‘Simple' in its name: ‘Simple Storage Service.' At this point, if that's simple, I really don't want to know what you think complex would look like.“By following me on Twitter, someone gets a lot of value from things I mention offhandedly as things everybody just knows. For example, which services are quasi-deprecated or outdated, or what common practices are anti-patterns? Is there a way to learn this kind of thing all in one go, as in a website or a book that reduces AWS to these are the handful of services everybody actually uses, and these are the most commonly sensible ways to do it?” I wish. The problem is that a lot of the stuff that everyone knows, no, it's stuff that at most, maybe half of the people who are engaging with it knew.They find out by hearing from other people the way that you do or by trying something and failing and realizing, ohh, this doesn't work the way that I want it to. It's one of the more insidious forms of cloud lock-in. You know how a service works, how a service breaks, what the constraints are around when it starts and it stops. And that becomes something that's a hell of a lot scarier when you have to realize, I'm going to pick a new provider instead and relearn all of those things. The reason I build things on AWS these days is honestly because I know the ways it sucks. I know the painful sharp edges. I don't have to guess where they might be hiding. I'm not saying that these sharp edges aren't painful, but when you know they're there in advance, you can do an awful lot to guard against that.“Do I believe the big two—AWS and Azure—cloud providers have agreed between themselves not to launch any price wars as they already have an effective monopoly between them and [no one 00:36:46] win in a price war?” I don't know if there's ever necessarily an explicit agreement on that, but business people aren't foolish. Okay, if we're going to cut our cost of service, instantly, to undercut a competitor, every serious competitor is going to do the same thing. The only reason to do that is if you believe your margins are so wildly superior to your competitors that you can drive them under by doing that or if you have the ability to subsidize your losses longer than they can remain a going concern. Microsoft and Amazon are—and Google—are not in a position where, all right, we're going to drive them under.They can both subsidize losses basically forever on a lot of these things and they realize it's a game you don't win in, I suspect. The real pricing pressure on that stuff seems to come from customers, when all right, I know it's big and expensive upfront to buy a SAN, but when that starts costing me less than S3 on a per-petabyte basis, that's when you start to see a lot of pricing changing in the market. The one thing I haven't seen that take effect on is data transfer. You could be forgiven for believing that data transfer still cost as much as it did in the 1990s. It does not.“Is AWS as far behind in AI as they appear?” I think a lot of folks are in the big company space. And they're all stammering going, “We've been doing this for 20 years.” Great, then why are all of your generative AI services, A, bad? B, why is Alexa so terrible? C, why is it so clear that everything you have pre-announced and not brought to market was very clearly not envisioned as a product to be going to market this year until 300 days ago, when Chat-Gippity burst onto the scene and OpenAI [stole a march 00:38:25] on everyone?Companies are sprinting to position themselves as leaders in the AI space, despite the fact that they've gotten lapped by basically a small startup that's seven years old. Everyone is trying to work the word AI into things, but it always feels contrived to me. Frankly, it tells me that I need to just start tuning the space out for a year until things settle down and people stop describing metric math or anomaly detection is AI. Stop it. So yeah, I'd say if anything, they're worse than they appear as far as from behind goes.“I mostly focus on AWS. Will I ever cover Azure?” There are certain things that would cause me to do that, but that's because I don't want to be the last Perl consultancy is the entire world has moved off to Python. And effectively, my focus on AWS is because that's where the painful problems I know how to fix live. But that's not a suicide pact. I'm not going to ride that down in flames.But I can retool for a different cloud provider—if that's what the industry starts doing—far faster than AWS can go from its current market-leading status to irrelevance. There are certain triggers that would cause me to do that, but at the time, I don't see them in the near term and I don't have any plans to begin covering other things. As mentioned, people want me to talk about the things I'm good at not the thing that makes me completely nonsensical.“Which AWS services look like a good idea, but pricing-wise, they're going to kill you once you have any scale, especially the ones that look okay pricing-wise but aren't really and it's hard to know going in?” CloudTrail data events, S3 Bucket Access logging any of the logging services really, Managed NAT Gateways in a bunch of cases. There's a lot that starts to get really expensive once you hit certain points of scale with a corollary that everyone thinks that everything they're building is going to scale globally and that's not true. I don't build things as a general rule with the idea that I'm going to get ten million users on it tomorrow because by the time I get from nothing to substantial workloads, I'm going to have multiple refactors of what I've done. I want to get things out the door as fast as possible and if that means that later in time, oh, I accidentally built Pinterest. What am I going to do? Well, okay, yeah, I'm going to need to rebuild a whole bunch of stuff, but I'll have the user traffic and mindshare and market share to finance that growth.Early optimization on stuff like this causes a lot more problems than it solves. “Best practices and anti-patterns in managing AWS costs. For context, you once told me about a role that I had taken that you'd seen lots of companies tried to create that role and then said that the person rarely lasts more than a few months because it just isn't effective. You were right, by the way.” Imagine that I sometimes know what I'm talking about.When it comes to managing costs, understand what your goal is here, what you're actually trying to achieve. Understand it's going to be a cross-functional work between people in finance and people that engineering. It is first and foremost, an engineering problem—you learn that at your peril—and making someone be the human gateway to spin things up means that they're going to quit, basically, instantly. Stop trying to shame different teams without understanding their constraints.Savings Plans are a great example. They apply biggest discount first, which is what you want. Less money going out the door to Amazon, but that makes it look like anything with a low discount percentage, like any workload running on top of Microsoft Windows, is not being responsible because they're always on demand. And you're inappropriately shaming a team for something completely out of their control. There's a point where optimization no longer makes sense. Don't apply it to greenfield projects or skunkworks. Things you want to see if the thing is going to work first. You can optimize it later. Starting out with a, ‘step one: spend as little as possible' is generally not a recipe for success.What else have we got here? I've seen some things fly by in the chat that are probably worth mentioning here. Some of it is just random nonsense, but other things are, I'm sure, tied to various questions here. “With geopolitics shaping up to govern tech data differently in each country, does it make sense to even build a globally distributed B2B SaaS?” Okay, I'm going to tackle this one in a way that people will probably view as a bit of an attack, but it's something I see asked a lot by folks trying to come up with business ideas.At the outset, I'm a big believer in, if you're building something, solve it for a problem and a use case that you intrinsically understand. That is going to mean the customers with whom you speak. Very often, the way business is done in different countries and different cultures means that in some cases, this thing that's a terrific idea in one country is not going to see market adoption somewhere else. There's a better approach to build for the market you have and the one you're addressing rather than aspirational builds. I would also say that it potentially makes sense if there are certain things you know are going to happen, like okay, we validated our marketing and yeah, it turns out that we're building an image resizing site. Great. People in Germany and in the US all both need to resize images.But you know, going in that there's going to be a data residency requirement, so architecting, from day one with an idea that you can have a partition that winds up storing its data separately is always going to be to your benefit. I find aligning whatever you're building with the idea of not being creepy is often a great plan. And there's always the bring your own storage approach to, great, as a customer, you can decide where your data gets stored in your account—charge more for that, sure—but then that na—it becomes their problem. Anything that gets you out of the regulatory critical path is usually a good idea. But with all the problems I would have building a business, that is so far down the list for almost any use case I could ever see pursuing that it's just one of those, you have a half-hour conversation with someone who's been down the path before if you think it might apply to what you're doing, but then get back to the hard stuff. Like, worry on the first two or three steps rather than step 90 just because you'll get there eventually. You don't want to make your future life harder, but you also don't want to spend all your time optimizing early, before you've validated you're actually building something useful.“What unique feature of AWS do I most want to see on other cloud providers and vice versa?” The vice versa is easy. I love that Google Cloud by default has the everything in this project—which is their account equivalent—can talk to everything else, which means that humans aren't just allowing permissions to the universe because it's hard. And I also like that billing is tied to an individual project. ‘Terminate all billable resources in this project' is a button-click away and that's great.Now, what do I wish other cloud providers would take from AWS? Quite honestly, the customer obsession. It's still real. I know it sounds like it's a funny talking point or the people who talk about this the most under the cultists, but they care about customer problems. Back when no one had ever heard of me before and my AWS Bill was seven bucks, whenever I had a problem with a service and I talked about this in passing to folks, Amazonians showed up out of nowhere to help make sure that my problem got answered, that I was taken care of, that I understood what I was misunderstanding, or in some cases, the feedback went to the product team.I see too many companies across the board convinced that they themselves know best about what customers need. That occasionally can be true, but not consistently. When customers are screaming for something, give them what they need, or frankly, get out of the way so someone else can. I mean, I know someone's expecting me to name a service or something, but we've gotten past the point, to my mind, of trying to do an apples-to-oranges comparison in terms of different service offerings. If you want to build a website using any reasonable technology, there's a whole bunch of companies now that have the entire stack for you. Pick one. Have fun.We've got time for a few more here. Also, feel free to drop more questions in. I'm thrilled to wind up answering any of these things. Have I seen any—here's one that about Babelfish, for example, from Justin [Broadly 00:46:07]. “Have I seen anyone using Babelfish in the wild? It seems like it was a great idea that didn't really work or had major trade-offs.”It's a free open-source project that translates from one kind of database SQL to a different kind of database SQL. There have been a whole bunch of attempts at this over the years, and in practice, none of them have really panned out. I have seen no indications that Babelfish is different. If someone at AWS works on this or is a customer using Babelfish and say, “Wait, that's not true,” please tell me because all I'm saying is I have not seen it and I don't expect that I will. But I'm always willing to be wrong. Please, if I say something at some point that someone disagrees with, please reach out to me. I don't intend to perpetuate misinformation.“Purely hypothetically”—yeah, it's always great to ask things hypothetically—“In the companies I work with, which group typically manages purchasing savings plans, the ops team, finance, some mix of both?” It depends. The sad answer is, “What's a savings plan,” asks the company, and then we have an educational path to go down. Often it is individual teams buying them ad hoc, which can work, cannot as long as everyone's on the same page. Central planning, in a bunch of—a company that's past a certain point in sophistication is where everything winds up leading to.And that is usually going to be a series of discussions, ideally run by that group in a cross-functional way. They can be cost engineering, they can be optimization engineering, I've heard it described in a bunch of different ways. But that is—increasingly as the sophistication of your business and the magnitude of your spend increases, the sophistication of how you approach this should change as well. Early on, it's the offense of some VP of engineering at a startup. Like, “Oh, that's a lot of money,” running the analyzer and clicking the button to buy what it says. That's not a bad first-pass attempt. And then I think getting smaller and smaller buys as you continue to proceed means you can start to—it no longer becomes the big giant annual decision and instead becomes part of a frequently used process. That works pretty well, too.Is there anything else that I want to make sure I get to before we wind up running this down? To the folks in the comments, this is your last chance to throw random, awkward questions my way. I'm thrilled to wind up taking any slings, arrows, et cetera, that you care to throw my way a going once, going twice style. Okay, “What is the most esoteric or shocking item on the AWS bill that you ever found with one of your customers?” All right, it's been long enough, and I can say it without naming the customer, so that'll be fun.My personal favorite was a high five-figure bill for Route 53. I joke about using Route 53 as a database. It can be, but there are better options. I would say that there are a whole bunch of use cases for Route 53 and it's a great service, but when it's that much money, it occasions comment. It turned out that—we discovered, in fact, a data exfiltration in progress which made it now a rather clever security incident.And, “This call will now be ending for the day and we're going to go fix that. Thanks.” It's like I want a customer testimonial on that one, but for obvious reasons, we didn't get one. But that was probably the most shocking thing. The depressing thing that I see the most—and this is the core of the cost problem—is not when the numbers are high. It's when I ask about a line item that drives significant spend, and the customer is surprised.I don't like it when customers don't know what they're spending money on. If your service surprises customers when they realize what it costs, you have failed. Because a lot of things are expensive and customers know that and they're willing to take the value in return for the cost. That's fine. But tricking customers does not serve anyone well, even your own long-term interests. I promise.“Have I ever had to reject a potential client because they had a tangled mess that was impossible to tackle, or is there always a way?” It's never the technology that will cause us not to pursue working with a given company. What will is, like, if you go to our website at duckbillgroup.com, you're not going to see a ‘Buy Here' button where you ‘add one consulting, please' to your shopping cart and call it a day.It's a series of conversations. And what we will try to make sure is, what is your goal? Who's aligned with it? What are the problems you're having in getting there? And what does success look like? Who else is involved in this? And it often becomes clear that people don't like the current situation, but there's no outcome with which they would be satisfied.Or they want something that we do not do. For example, “We want you to come in and implement all of your findings.” We are advisory. We do not know the specifics of your environment and—or your deployment processes or the rest. We're not an engineering shop. We charge a fixed fee and part of the way we can do that is by controlling the scope of what we do. “Well, you know, we have some AWS bills, but we really want to—we really care about is our GCP bill or our Datadog bill.” Great. We don't focus on either of those things. I mean, I can just come in and sound competent, but that's not what adding value as a consultant is about. It's about being authoritatively correct. Great question, though.“How often do I receive GovCloud cost optimization requests? Does the compliance and regulation that these customers typically have keep them from making the needed changes?” It doesn't happen often and part of the big reason behind that is that when we're—and if you're in GovCloud, it's probably because you are a significant governmental entity. There's not a lot of private sector in GovCloud for almost every workload there. Yes, there are exceptions; we don't tend to do a whole lot with them.And the government procurement process is a beast. We can sell and service three to five commercial engagements in the time it takes to negotiate a single GovCloud agreement with a customer, so it just isn't something that we focused. We don't have the scale to wind up tackling that down. Let's also be clear that, in many cases, governments don't view money the same way as enterprise, which in part is a good thing, but it also means that, “This cloud thing is too expensive,” is never the stated problem. Good question.“Waffles or pancakes?” Is another one. I… tend to go with eggs, personally. It just feels like empty filler in the morning. I mean, you could put syrup on anything if you're bold enough, so if it's just a syrup delivery vehicle, there are other paths to go.And I believe we might have exhausted the question pool. So, I want to thank you all for taking the time to talk with me. Once again, I am Cloud Economist Corey Quinn. And this is a very special live episode of Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review wherever you can—or a thumbs up, or whatever it is, like and subscribe obviously—whereas if you've hated this podcast, same thing: five-star review, but also go ahead and leave an insulting comment, usually around something I've said about a service that you deeply care about because it's tied to your paycheck.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.
Steve Tuck, Co-Founder & CEO of Oxide Computer Company, joins Corey on Screaming in the Cloud to discuss his work to make modern computers cloud-friendly. Steve describes what it was like going through early investment rounds, and the difficult but important decision he and his co-founder made to build their own switch. Corey and Steve discuss the demand for on-prem computers that are built for cloud capability, and Steve reveals how Oxide approaches their product builds to ensure the masses can adopt their technology wherever they are. About SteveSteve is the Co-founder & CEO of Oxide Computer Company. He previously was President & COO of Joyent, a cloud computing company acquired by Samsung. Before that, he spent 10 years at Dell in a number of different roles. Links Referenced: Oxide Computer Company: https://oxide.computer/ On The Metal Podcast: https://oxide.computer/podcasts/on-the-metal TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: This episode is brought to us in part by our friends at RedHat. As your organization grows, so does the complexity of your IT resources. You need a flexible solution that lets you deploy, manage, and scale workloads throughout your entire ecosystem. The Red Hat Ansible Automation Platform simplifies the management of applications and services across your hybrid infrastructure with one platform. Look for it on the AWS Marketplace.Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. You know, I often say it—but not usually on the show—that Screaming in the Cloud is a podcast about the business of cloud, which is intentionally overbroad so that I can talk about basically whatever the hell I want to with whoever the hell I'd like. Today's guest is, in some ways of thinking, about as far in the opposite direction from Cloud as it's possible to go and still be involved in the digital world. Steve Tuck is the CEO at Oxide Computer Company. You know, computers, the things we all pretend aren't underpinning those clouds out there that we all use and pay by the hour, gigabyte, second-month-pound or whatever it works out to. Steve, thank you for agreeing to come back on the show after a couple years, and once again suffer my slings and arrows.Steve: Much appreciated. Great to be here. It has been a while. I was looking back, I think three years. This was like, pre-pandemic, pre-interest rates, pre… Twitter going totally sideways.Corey: And I have to ask to start with that, it feels, on some level, like toward the start of the pandemic, when everything was flying high and we'd had low interest rates for a decade, that there was a lot of… well, lunacy lurking around in the industry, my own business saw it, too. It turns out that not giving a shit about the AWS bill is in fact a zero interest rate phenomenon. And with all that money or concentrated capital sloshing around, people decided to do ridiculous things with it. I would have thought, on some level, that, “We're going to start a computer company in the Bay Area making computers,” would have been one of those, but given that we are a year into the correction, and things seem to be heading up into the right for you folks, that take was wrong. How'd I get it wrong?Steve: Well, I mean, first of all, you got part of it right, which is there were just a litany of ridiculous companies and projects and money being thrown in all directions at that time.Corey: An NFT of a computer. We're going to have one of those. That's what you're selling, right? Then you had to actually hard pivot to making the real thing.Steve: That's it. So, we might as well cut right to it, you know. This is—we went through the crypto phase. But you know, our—when we started the company, it was yes, a computer company. It's on the tin. It's definitely kind of the foundation of what we're building. But you know, we think about what a modern computer looks like through the lens of cloud.I was at a cloud computing company for ten years prior to us founding Oxide, so was Bryan Cantrill, CTO, co-founder. And, you know, we are huge, huge fans of cloud computing, which was an interesting kind of dichotomy. Instead of conversations when we were raising for Oxide—because of course, Sand Hill is terrified of hardware. And when we think about what modern computers need to look like, they need to be in support of the characteristics of cloud, and cloud computing being not that you're renting someone else's computers, but that you have fully programmable infrastructure that allows you to slice and dice, you know, compute and storage and networking however software needs. And so, what we set out to go build was a way for the companies that are running on-premises infrastructure—which, by the way, is almost everyone and will continue to be so for a very long time—access to the benefits of cloud computing. And to do that, you need to build a different kind of computing infrastructure and architecture, and you need to plumb the whole thing with software.Corey: There are a number of different ways to view cloud computing. And I think that a lot of the, shall we say, incumbent vendors over in the computer manufacturing world tend to sound kind of like dinosaurs, on some level, where they're always talking in terms of, you're a giant company and you already have a whole bunch of data centers out there. But one of the magical pieces of cloud is you can have a ridiculous idea at nine o'clock tonight and by morning, you'll have a prototype, if you're of that bent. And if it turns out it doesn't work, you're out, you know, 27 cents. And if it does work, you can keep going and not have to stop and rebuild on something enterprise-grade.So, for the small-scale stuff and rapid iteration, cloud providers are terrific. Conversely, when you wind up in the giant fleets of millions of computers, in some cases, there begin to be economic factors that weigh in, and for some on workloads—yes, I know it's true—going to a data center is the economical choice. But my question is, is starting a new company in the direction of building these things, is it purely about economics or is there a capability story tied in there somewhere, too?Steve: Yeah, it's actually economics ends up being a distant third, fourth, in the list of needs and priorities from the companies that we're working with. When we talk about—and just to be clear we're—our demographic, that kind of the part of the market that we are focused on are large enterprises, like, folks that are spending, you know, half a billion, billion dollars a year in IT infrastructure, they, over the last five years, have moved a lot of the use cases that are great for public cloud out to the public cloud, and who still have this very, very large need, be it for latency reasons or cost reasons, security reasons, regulatory reasons, where they need on-premises infrastructure in their own data centers and colo facilities, et cetera. And it is for those workloads in that part of their infrastructure that they are forced to live with enterprise technologies that are 10, 20, 30 years old, you know, that haven't evolved much since I left Dell in 2009. And, you know, when you think about, like, what are the capabilities that are so compelling about cloud computing, one of them is yes, what you mentioned, which is you have an idea at nine o'clock at night and swipe a credit card, and you're off and running. And that is not the case for an idea that someone has who is going to use the on-premises infrastructure of their company. And this is where you get shadow IT and 16 digits to freedom and all the like.Corey: Yeah, everyone with a corporate credit card winds up being a shadow IT source in many cases. If your processes as a company don't make it easier to proceed rather than doing it the wrong way, people are going to be fighting against you every step of the way. Sometimes the only stick you've got is that of regulation, which in some industries, great, but in other cases, no, you get to play Whack-a-Mole. I've talked to too many companies that have specific scanners built into their mail system every month looking for things that look like AWS invoices.Steve: [laugh]. Right, exactly. And so, you know, but if you flip it around, and you say, well, what if the experience for all of my infrastructure that I am running, or that I want to provide to my software development teams, be it rented through AWS, GCP, Azure, or owned for economic reasons or latency reasons, I had a similar set of characteristics where my development team could hit an API endpoint and provision instances in a matter of seconds when they had an idea and only pay for what they use, back to kind of corporate IT. And what if they were able to use the same kind of developer tools they've become accustomed to using, be it Terraform scripts and the kinds of access that they are accustomed to using? How do you make those developers just as productive across the business, instead of just through public cloud infrastructure?At that point, then you are in a much stronger position where you can say, you know, for a portion of things that are, as you pointed out, you know, more unpredictable, and where I want to leverage a bunch of additional services that a particular cloud provider has, I can rent that. And where I've got more persistent workloads or where I want a different economic profile or I need to have something in a very low latency manner to another set of services, I can own it. And that's where I think the real chasm is because today, you just don't—we take for granted the basic plumbing of cloud computing, you know? Elastic Compute, Elastic Storage, you know, networking and security services. And us in the cloud industry end up wanting to talk a lot more about exotic services and, sort of, higher-up stack capabilities. None of that basic plumbing is accessible on-prem.Corey: I also am curious as to where exactly Oxide lives in the stack because I used to build computers for myself in 2000, and it seems like having gone down that path a bit recently, yeah, that process hasn't really improved all that much. The same off-the-shelf components still exist and that's great. We always used to disparagingly call spinning hard drives as spinning rust in racks. You named the company Oxide; you're talking an awful lot about the Rust programming language in public a fair bit of the time, and I'm starting to wonder if maybe words don't mean what I thought they meant anymore. Where do you folks start and stop, exactly?Steve: Yeah, that's a good question. And when we started, we sort of thought the scope of what we were going to do and then what we were going to leverage was smaller than it has turned out to be. And by that I mean, man, over the last three years, we have hit a bunch of forks in the road where we had questions about do we take something off the shelf or do we build it ourselves. And we did not try to build everything ourselves. So, to give you a sense of kind of where the dotted line is, around the Oxide product, what we're delivering to customers is a rack-level computer. So, the minimum size comes in rack form. And I think your listeners are probably pretty familiar with this. But, you know, a rack is—Corey: You would be surprised. It's basically, what are they about seven feet tall?Steve: Yeah, about eight feet tall.Corey: Yeah, yeah. Seven, eight feet, weighs a couple 1000 pounds, you know, make an insulting joke about—Steve: Two feet wide.Corey: —NBA players here. Yeah, all kinds of these things.Steve: Yeah. And big hunk of metal. And in the cases of on-premises infrastructure, it's kind of a big hunk of metal hole, and then a bunch of 1U and 2U boxes crammed into it. What the hyperscalers have done is something very different. They started looking at, you know, at the rack level, how can you get much more dense, power-efficient designs, doing things like using a DC bus bar down the back, instead of having 64 power supplies with cables hanging all over the place in a rack, which I'm sure is what you're more familiar with.Corey: Tremendous amount of weight as well because you have the metal chassis for all of those 1U things, which in some cases, you wind up with, what, 46U in a rack, assuming you can even handle the cooling needs of all that.Steve: That's right.Corey: You have so much duplication, and so much of the weight is just metal separating one thing from the next thing down below it. And there are opportunities for massive improvement, but you need to be at a certain point of scale to get there.Steve: You do. You do. And you also have to be taking on the entire problem. You can't pick at parts of these things. And that's really what we found. So, we started at this sort of—the rack level as sort of the design principle for the product itself and found that that gave us the ability to get to the right geometry, to get as much CPU horsepower and storage and throughput and networking into that kind of chassis for the least amount of wattage required, kind of the most power-efficient design possible.So, it ships at the rack level and it ships complete with both our server sled systems in Oxide, a pair of Oxide switches. This is—when I talk about, like, design decisions, you know, do we build our own switch, it was a big, big, big question early on. We were fortunate even though we were leaning towards thinking we needed to go do that, we had this prospective early investor who was early at AWS and he had asked a very tough question that none of our other investors had asked to this point, which is, “What are you going to do about the switch?”And we knew that the right answer to an investor is like, “No. We're already taking on too much.” We're redesigning a server from scratch in, kind of, the mold of what some of the hyperscalers have learned, doing our own Root of Trust, we're doing our own operating system, hypervisor control plane, et cetera. Taking on the switch could be seen as too much, but we told them, you know, we think that to be able to pull through all of the value of the security benefits and the performance and observability benefits, we can't have then this [laugh], like, obscure third-party switch rammed into this rack.Corey: It's one of those things that people don't think about, but it's the magic of cloud with AWS's network, for example, it's magic. You can get line rate—or damn near it—between any two points, sustained.Steve: That's right.Corey: Try that in the data center, you wind into massive congestion with top-of-rack switches, where, okay, we're going to parallelize this stuff out over, you know, two dozen racks and we're all going to have them seamlessly transfer information between each other at line rate. It's like, “[laugh] no, you're not because those top-of-rack switches will melt and become side-of-rack switches, and then bottom-puddle-of-rack switches. It doesn't work that way.”Steve: That's right.Corey: And you have to put a lot of thought and planning into it. That is something that I've not heard a traditional networking vendor addressing because everyone loves to hand-wave over it.Steve: Well so, and this particular prospective investor, we told him, “We think we have to go build our own switch.” And he said, “Great.” And we said, “You know, we think we're going to lose you as an investor as a result, but this is what we're doing.” And he said, “If you're building your own switch, I want to invest.” And his comment really stuck with us, which is AWS did not stand on their own two feet until they threw out their proprietary switch vendor and built their own.And that really unlocked, like you've just mentioned, like, their ability, both in hardware and software to tune and optimize to deliver that kind of line rate capability. And that is one of the big findings for us as we got into it. Yes, it was really, really hard, but based on a couple of design decisions, P4 being the programming language that we are using as the surround for our silicon, tons of opportunities opened up for us to be able to do similar kinds of optimization and observability. And that has been a big, big win.But to your question of, like, where does it stop? So, we are delivering this complete with a baked-in operating system, hypervisor, control plane. And so, the endpoint of the system, where the customer meets is either hitting an API or a CLI or a console that delivers and kind of gives you the ability to spin up projects. And, you know, if one is familiar with EC2 and EBS and VPC, that VM level of abstraction is where we stop.Corey: That, I think, is a fair way of thinking about it. And a lot of cloud folks are going to pooh-pooh it as far as saying, “Oh well, just virtual machines. That's old cloud. That just treats the cloud like a data center.” And in many cases, yes, it does because there are ways to build modern architectures that are event-driven on top of things like Lambda, and API Gateway, and the rest, but you take a look at what my customers are doing and what drives the spend, it is invariably virtual machines that are largely persistent.Sometimes they scale up, sometimes they scale down, but there's always a baseline level of load that people like to hand-wave away the fact that what they're fundamentally doing in a lot of these cases, is paying the cloud provider to handle the care and feeding of those systems, which can be expensive, yes, but also delivers significant innovation beyond what almost any company is going to be able to deliver in-house. There is no way around it. AWS is better than you are—whoever you happen to—be at replacing failed hard drives. That is a simple fact. They have teams of people who are the best in the world of replacing failed hard drives. You generally do not. They are going to be better at that than you. But that's not the only axis. There's not one calculus that leads to, is cloud a scam or is cloud a great value proposition for us? The answer is always a deeply nuanced, “It depends.”Steve: Yeah, I mean, I think cloud is a great value proposition for most and a growing amount of software that's being developed and deployed and operated. And I think, you know, one of the myths that is out there is, hey, turn over your IT to AWS because we have or you know, a cloud provider—because we have such higher caliber personnel that are really good at swapping hard drives and dealing with networks and operationally keeping this thing running in a highly available manner that delivers good performance. That is certainly true, but a lot of the operational value in an AWS is been delivered via software, the automation, the observability, and not actual people putting hands on things. And it's an important point because that's been a big part of what we're building into the product. You know, just because you're running infrastructure in your own data center, it does not mean that you should have to spend, you know, 1000 hours a month across a big team to maintain and operate it. And so, part of that, kind of, cloud, hyperscaler innovation that we're baking into this product is so that it is easier to operate with much, much, much lower overhead in a highly available, resilient manner.Corey: So, I've worked in a number of data center facilities, but the companies I was working with, were always at a scale where these were co-locations, where they would, in some cases, rent out a rack or two, in other cases, they'd rent out a cage and fill it with their own racks. They didn't own the facilities themselves. Those were always handled by other companies. So, my question for you is, if I want to get a pile of Oxide racks into my environment in a data center, what has to change? What are the expectations?I mean, yes, there's obviously going to be power and requirements at the data center colocation is very conversant with, but Open Compute, for example, had very specific requirements—to my understanding—around things like the airflow construction of the environment that they're placed within. How prescriptive is what you've built, in terms of doing a building retrofit to start using you folks?Steve: Yeah, definitely not. And this was one of the tensions that we had to balance as we were designing the product. For all of the benefits of hyperscaler computing, some of the design center for you know, the kinds of racks that run in Google and Amazon and elsewhere are hyperscaler-focused, which is unlimited power, in some cases, data centers designed around the equipment itself. And where we were headed, which was basically making hyperscaler infrastructure available to, kind of, the masses, the rest of the market, these folks don't have unlimited power and they aren't going to go be able to go redesign data centers. And so no, the experience should be—with exceptions for folks maybe that have very, very limited access to power—that you roll this rack into your existing data center. It's on standard floor tile, that you give it power, and give it networking and go.And we've spent a lot of time thinking about how we can operate in the wide-ranging environmental characteristics that are commonplace in data centers that focus on themselves, colo facilities, and the like. So, that's really on us so that the customer is not having to go to much work at all to kind of prepare and be ready for it.Corey: One of the challenges I have is how to think about what you've done because you are rack-sized. But what that means is that my own experimentation at home recently with on-prem stuff for smart home stuff involves a bunch of Raspberries Pi and a [unintelligible 00:19:42], but I tend to more or less categorize you the same way that I do AWS Outposts, as well as mythical creatures, like unicorns or giraffes, where I don't believe that all these things actually exist because I haven't seen them. And in fact, to get them in my house, all four of those things would theoretically require a loading dock if they existed, and that's a hard thing to fake on a demo signup form, as it turns out. How vaporware is what you've built? Is this all on paper and you're telling amazing stories or do they exist in the wild?Steve: So, last time we were on, it was all vaporware. It was a couple of napkin drawings and a seed round of funding.Corey: I do recall you not using that description at the time, for what it's worth. Good job.Steve: [laugh]. Yeah, well, at least we were transparent where we were going through the race. We had some napkin drawings and we had some good ideas—we thought—and—Corey: You formalize those and that's called Microsoft PowerPoint.Steve: That's it. A hundred percent.Corey: The next generative AI play is take the scrunched-up, stained napkin drawing, take a picture of it, and convert it to a slide.Steve: Google Docs, you know, one of those. But no, it's got a lot of scars from the build and it is real. In fact, next week, we are going to be shipping our first commercial systems. So, we have got a line of racks out in our manufacturing facility in lovely Rochester, Minnesota. Fun fact: Rochester, Minnesota, is where the IBM AS/400s were built.Corey: I used to work in that market, of all things.Steve: Really?Corey: Selling tape drives in the AS/400. I mean, I still maintain there's no real mainframe migration to the cloud play because there's no AWS/400. A joke that tends to sail over an awful lot of people's heads because, you know, most people aren't as miserable in their career choices as I am.Steve: Okay, that reminds me. So, when we were originally pitching Oxide and we were fundraising, we [laugh]—in a particular investor meeting, they asked, you know, “What would be a good comp? Like how should we think about what you are doing?” And fortunately, we had about 20 investor meetings to go through, so burning one on this was probably okay, but we may have used the AS/400 as a comp, talking about how [laugh] mainframe systems did such a good job of building hardware and software together. And as you can imagine, there were some blank stares in that room.But you know, there are some good analogs to historically in the computing industry, when you know, the industry, the major players in the industry, were thinking about how to deliver holistic systems to support end customers. And, you know, we see this in the what Apple has done with the iPhone, and you're seeing this as a lot of stuff in the automotive industry is being pulled in-house. I was listening to a good podcast. Jim Farley from Ford was talking about how the automotive industry historically outsourced all of the software that controls cars, right? So, like, Bosch would write the software for the controls for your seats.And they had all these suppliers that were writing the software, and what it meant was that innovation was not possible because you'd have to go out to suppliers to get software changes for any little change you wanted to make. And in the computing industry, in the 80s, you saw this blow apart where, like, firmware got outsourced. In the IBM and the clones, kind of, race, everyone started outsourcing firmware and outsourcing software. Microsoft started taking over operating systems. And then VMware emerged and was doing a virtualization layer.And this, kind of, fragmented ecosystem is the landscape today that every single on-premises infrastructure operator has to struggle with. It's a kit car. And so, pulling it back together, designing things in a vertically integrated manner is what the hyperscalers have done. And so, you mentioned Outposts. And, like, it's a good example of—I mean, the most public cloud of public cloud companies created a way for folks to get their system on-prem.I mean, if you need anything to underscore the draw and the demand for cloud computing-like, infrastructure on-prem, just the fact that that emerged at all tells you that there is this big need. Because you've got, you know, I don't know, a trillion dollars worth of IT infrastructure out there and you have maybe 10% of it in the public cloud. And that's up from 5% when Jassy was on stage in '21, talking about 95% of stuff living outside of AWS, but there's going to be a giant market of customers that need to own and operate infrastructure. And again, things have not improved much in the last 10 or 20 years for them.Corey: They have taken a tone onstage about how, “Oh, those workloads that aren't in the cloud, yet, yeah, those people are legacy idiots.” And I don't buy that for a second because believe it or not—I know that this cuts against what people commonly believe in public—but company execs are generally not morons, and they make decisions with context and constraints that we don't see. Things are the way they are for a reason. And I promise that 90% of corporate IT workloads that still live on-prem are not being managed or run by people who've never heard of the cloud. There was a decision made when some other things were migrating of, do we move this thing to the cloud or don't we? And the answer at the time was no, we're going to keep this thing on-prem where it is now for a variety of reasons of varying validity. But I don't view that as a bug. I also, frankly, don't want to live in a world where all the computers are basically run by three different companies.Steve: You're spot on, which is, like, it does a total disservice to these smart and forward-thinking teams in every one of the Fortune 1000-plus companies who are taking the constraints that they have—and some of those constraints are not monetary or entirely workload-based. If you want to flip it around, we were talking to a large cloud SaaS company and their reason for wanting to extend it beyond the public cloud is because they want to improve latency for their e-commerce platform. And navigating their way through the complex layers of the networking stack at GCP to get to where the customer assets are that are in colo facilities, adds lag time on the platform that can cost them hundreds of millions of dollars. And so, we need to think behind this notion of, like, “Oh, well, the dark ages are for software that can't run in the cloud, and that's on-prem. And it's just a matter of time until everything moves to the cloud.”In the forward-thinking models of public cloud, it should be both. I mean, you should have a consistent experience, from a certain level of the stack down, everywhere. And then it's like, do I want to rent or do I want to own for this particular use case? In my vast set of infrastructure needs, do I want this to run in a data center that Amazon runs or do I want this to run in a facility that is close to this other provider of mine? And I think that's best for all. And then it's not this kind of false dichotomy of quality infrastructure or ownership.Corey: I find that there are also workloads where people will come to me and say, “Well, we don't think this is going to be economical in the cloud”—because again, I focus on AWS bills. That is the lens I view things through, and—“The AWS sales rep says it will be. What do you think?” And I look at what they're doing and especially if involves high volumes of data transfer, I laugh a good hearty laugh and say, “Yeah, keep that thing in the data center where it is right now. You will thank me for it later.”It's, “Well, can we run this in an economical way in AWS?” As long as you're okay with economical meaning six times what you're paying a year right now for the same thing, yeah, you can. I wouldn't recommend it. And the numbers sort of speak for themselves. But it's not just an economic play.There's also the story of, does this increase their capability? Does it let them move faster toward their business goals? And in a lot of cases, the answer is no, it doesn't. It's one of those business process things that has to exist for a variety of reasons. You don't get to reimagine it for funsies and even if you did, it doesn't advance the company in what they're trying to do any, so focus on something that differentiates as opposed to this thing that you're stuck on.Steve: That's right. And what we see today is, it is easy to be in that mindset of running things on-premises is kind of backwards-facing because the experience of it is today still very, very difficult. I mean, talking to folks and they're sharing with us that it takes a hundred days from the time all the different boxes land in their warehouse to actually having usable infrastructure that developers can use. And our goal and what we intend to go hit with Oxide as you can roll in this complete rack-level system, plug it in, within an hour, you have developers that are accessing cloud-like services out of the infrastructure. And that—God, countless stories of firmware bugs that would send all the fans in the data center nonlinear and soak up 100 kW of power.Corey: Oh, God. And the problems that you had with the out-of-band management systems. For a long time, I thought Drax stood for, “Dell, RMA Another Computer.” It was awful having to deal with those things. There was so much room for innovation in that space, which no one really grabbed onto.Steve: There was a really, really interesting talk at DEFCON that we just stumbled upon yesterday. The NVIDIA folks are giving a talk on BMC exploits… and like, a very, very serious BMC exploit. And again, it's what most people don't know is, like, first of all, the BMC, the Baseboard Management Controller, is like the brainstem of the computer. It has access to—it's a backdoor into all of your infrastructure. It's a computer inside a computer and it's got software and hardware that your server OEM didn't build and doesn't understand very well.And firmware is even worse because you know, firmware written by you know, an American Megatrends or other is a big blob of software that gets loaded into these systems that is very hard to audit and very hard to ascertain what's happening. And it's no surprise when, you know, back when we were running all the data centers at a cloud computing company, that you'd run into these issues, and you'd go to the server OEM and they'd kind of throw their hands up. Well, first they'd gaslight you and say, “We've never seen this problem before,” but when you thought you've root-caused something down to firmware, it was anyone's guess. And this is kind of the current condition today. And back to, like, the journey to get here, we kind of realized that you had to blow away that old extant firmware layer, and we rewrote our own firmware in Rust. Yes [laugh], I've done a lot in Rust.Corey: No, it was in Rust, but, on some level, that's what Nitro is, as best I can tell, on the AWS side. But it turns out that you don't tend to have the same resources as a one-and-a-quarter—at the moment—trillion-dollar company. That keeps [valuing 00:30:53]. At one point, they lost a comma and that was sad and broke all my logic for that and I haven't fixed it since. Unfortunate stuff.Steve: Totally. I think that was another, kind of, question early on from certainly a lot of investors was like, “Hey, how are you going to pull this off with a smaller team and there's a lot of surface area here?” Certainly a reasonable question. Definitely was hard. The one advantage—among others—is, when you are designing something kind of in a vertical holistic manner, those design integration points are narrowed down to just your equipment.And when someone's writing firmware, when AMI is writing firmware, they're trying to do it to cover hundreds and hundreds of components across dozens and dozens of vendors. And we have the advantage of having this, like, purpose-built system, kind of, end-to-end from the lowest level from first boot instruction, all the way up through the control plane and from rack to switch to server. That definitely helped narrow the scope.Corey: This episode has been fake sponsored by our friends at AWS with the following message: Graviton Graviton, Graviton, Graviton, Graviton, Graviton, Graviton, Graviton, Graviton. Thank you for your l-, lack of support for this show. Now, AWS has been talking about Graviton an awful lot, which is their custom in-house ARM processor. Apple moved over to ARM and instead of talking about benchmarks they won't publish and marketing campaigns with words that don't mean anything, they've let the results speak for themselves. In time, I found that almost all of my workloads have moved over to ARM architecture for a variety of reason, and my laptop now gets 15 hours of battery life when all is said and done. You're building these things on top of x86. What is the deal there? I do not accept that if that you hadn't heard of ARM until just now because, as mentioned, Graviton, Graviton, Graviton.Steve: That's right. Well, so why x86, to start? And I say to start because we have just launched our first generation products. And our first-generation or second-generation products that we are now underway working on are going to be x86 as well. We've built this system on AMD Milan silicon; we are going to be launching a Genoa sled.But when you're thinking about what silicon to use, obviously, there's a bunch of parts that go into the decision. You're looking at the kind of applicability to workload, performance, power management, for sure, and if you carve up what you are trying to achieve, x86 is still a terrific fit for the broadest set of workloads that our customers are trying to solve for. And choosing which x86 architecture was certainly an easier choice, come 2019. At this point, AMD had made a bunch of improvements in performance and energy efficiency in the chip itself. We've looked at other architectures and I think as we are incorporating those in the future roadmap, it's just going to be a question of what are you trying to solve for.You mentioned power management, and that is kind of commonly been a, you know, low power systems is where folks have gone beyond x86. Is we're looking forward to hardware acceleration products and future products, we'll certainly look beyond x86, but x86 has a long, long road to go. It still is kind of the foundation for what, again, is a general-purpose cloud infrastructure for being able to slice and dice for a variety of workloads.Corey: True. I have to look around my environment and realize that Intel is not going anywhere. And that's not just an insult to their lack of progress on committed roadmaps that they consistently miss. But—Steve: [sigh].Corey: Enough on that particular topic because we want to keep this, you know, polite.Steve: Intel has definitely had some struggles for sure. They're very public ones, I think. We were really excited and continue to be very excited about their Tofino silicon line. And this came by way of the Barefoot networks acquisition. I don't know how much you had paid attention to Tofino, but what was really, really compelling about Tofino is the focus on both hardware and software and programmability.So, great chip. And P4 is the programming language that surrounds that. And we have gotten very, very deep on P4, and that is some of the best tech to come out of Intel lately. But from a core silicon perspective for the rack, we went with AMD. And again, that was a pretty straightforward decision at the time. And we're planning on having this anchored around AMD silicon for a while now.Corey: One last question I have before we wind up calling it an episode, it seems—at least as of this recording, it's still embargoed, but we're not releasing this until that winds up changing—you folks have just raised another round, which means that your napkin doodles have apparently drawn more folks in, and now that you're shipping, you're also not just bringing in customers, but also additional investor money. Tell me about that.Steve: Yes, we just completed our Series A. So, when we last spoke three years ago, we had just raised our seed and had raised $20 million at the time, and we had expected that it was going to take about that to be able to build the team and build the product and be able to get to market, and [unintelligible 00:36:14] tons of technical risk along the way. I mean, there was technical risk up and down the stack around this [De Novo 00:36:21] server design, this the switch design. And software is still the kind of disproportionate majority of what this product is, from hypervisor up through kind of control plane, the cloud services, et cetera. So—Corey: We just view it as software with a really, really confusing hardware dongle.Steve: [laugh]. Yeah. Yes.Corey: Super heavy. We're talking enterprise and government-grade here.Steve: That's right. There's a lot of software to write. And so, we had a bunch of milestones that as we got through them, one of the big ones was getting Milan silicon booting on our firmware. It was funny it was—this was the thing that clearly, like, the industry was most suspicious of, us doing our own firmware, and you could see it when we demonstrated booting this, like, a year-and-a-half ago, and AMD all of a sudden just lit up, from kind of arm's length to, like, “How can we help? This is amazing.” You know? And they could start to see the benefits of when you can tie low-level silicon intelligence up through a hypervisor there's just—Corey: No I love the existing firmware I have. Looks like it was written in 1984 and winds up having terrible user ergonomics that hasn't been updated at all, and every time something comes through, it's a 50/50 shot as whether it fries the box or not. Yeah. No, I want that.Steve: That's right. And you look at these hyperscale data centers, and it's like, no. I mean, you've got intelligence from that first boot instruction through a Root of Trust, up through the software of the hyperscaler, and up to the user level. And so, as we were going through and kind of knocking down each one of these layers of the stack, doing our own firmware, doing our own hardware Root of Trust, getting that all the way plumbed up into the hypervisor and the control plane, number one on the customer side, folks moved from, “This is really interesting. We need to figure out how we can bring cloud capabilities to our data centers. Talk to us when you have something,” to, “Okay. We actually”—back to the earlier question on vaporware, you know, it was great having customers out here to Emeryville where they can put their hands on the rack and they can, you know, put your hands on software, but being able to, like, look at real running software and that end cloud experience.And that led to getting our first couple of commercial contracts. So, we've got some great first customers, including a large department of the government, of the federal government, and a leading firm on Wall Street that we're going to be shipping systems to in a matter of weeks. And as you can imagine, along with that, that drew a bunch of renewed interest from the investor community. Certainly, a different climate today than it was back in 2019, but what was great to see is, you still have great investors that understand the importance of making bets in the hard tech space and in companies that are looking to reinvent certain industries. And so, we added—our existing investors all participated. We added a bunch of terrific new investors, both strategic and institutional.And you know, this capital is going to be super important now that we are headed into market and we are beginning to scale up the business and make sure that we have a long road to go. And of course, maybe as importantly, this was a real confidence boost for our customers. They're excited to see that Oxide is going to be around for a long time and that they can invest in this technology as an important part of their infrastructure strategy.Corey: I really want to thank you for taking the time to speak with me about, well, how far you've come in a few years. If people want to learn more and have the requisite loading dock, where should they go to find you?Steve: So, we try to put everything up on the site. So, oxidecomputer.com or oxide.computer. We also, if you remember, we did [On the Metal 00:40:07]. So, we had a Tales from the Hardware-Software Interface podcast that we did when we started. We have shifted that to Oxide and Friends, which the shift there is we're spending a little bit more time talking about the guts of what we built and why. So, if folks are interested in, like, why the heck did you build a switch and what does it look like to build a switch, we actually go to depth on that. And you know, what does bring-up on a new server motherboard look like? And it's got some episodes out there that might be worth checking out.Corey: We will definitely include a link to that in the [show notes 00:40:36]. Thank you so much for your time. I really appreciate it.Steve: Yeah, Corey. Thanks for having me on.Corey: Steve Tuck, CEO at Oxide Computer Company. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this episode, please leave a five-star review on your podcast platform of choice, along with an angry ranting comment because you are in fact a zoology major, and you're telling me that some animals do in fact exist. But I'm pretty sure of the two of them, it's the unicorn.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.
Tony Baer, Principal at dbInsight, joins Corey on Screaming in the Cloud to discuss his definition of what is and isn't a database, and the trends he's seeing in the industry. Tony explains why it's important to try and have an outsider's perspective when evaluating new ideas, and the growing awareness of the impact data has on our daily lives. Corey and Tony discuss the importance of working towards true operational simplicity in the cloud, and Tony also shares why explainability in generative AI is so crucial as the technology advances. About TonyTony Baer, the founder and CEO of dbInsight, is a recognized industry expert in extending data management practices, governance, and advanced analytics to address the desire of enterprises to generate meaningful value from data-driven transformation. His combined expertise in both legacy database technologies and emerging cloud and analytics technologies shapes how clients go to market in an industry undergoing significant transformation. During his 10 years as a principal analyst at Ovum, he established successful research practices in the firm's fastest growing categories, including big data, cloud data management, and product lifecycle management. He advised Ovum clients regarding product roadmap, positioning, and messaging and helped them understand how to evolve data management and analytic strategies as the cloud, big data, and AI moved the goal posts. Baer was one of Ovum's most heavily-billed analysts and provided strategic counsel to enterprises spanning the Fortune 100 to fast-growing privately held companies.With the cloud transforming the competitive landscape for database and analytics providers, Baer led deep dive research on the data platform portfolios of AWS, Microsoft Azure, and Google Cloud, and on how cloud transformation changed the roadmaps for incumbents such as Oracle, IBM, SAP, and Teradata. While at Ovum, he originated the term “Fast Data” which has since become synonymous with real-time streaming analytics.Baer's thought leadership and broad market influence in big data and analytics has been formally recognized on numerous occasions. Analytics Insight named him one of the 2019 Top 100 Artificial Intelligence and Big Data Influencers. Previous citations include Onalytica, which named Baer as one of the world's Top 20 thought leaders and influencers on Data Science; Analytics Week, which named him as one of 200 top thought leaders in Big Data and Analytics; and by KDnuggets, which listed Baer as one of the Top 12 top data analytics thought leaders on Twitter. While at Ovum, Baer was Ovum's IT's most visible and publicly quoted analyst, and was cited by Ovum's parent company Informa as Brand Ambassador in 2017. In raw numbers, Baer has 14,000 followers on Twitter, and his ZDnet “Big on Data” posts are read 20,000 – 30,000 times monthly. He is also a frequent speaker at industry conferences such as Strata Data and Spark Summit.Links Referenced:dbInsight: https://dbinsight.io/ TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: This episode is brought to us in part by our friends at RedHat.As your organization grows, so does the complexity of your IT resources. You need a flexible solution that lets you deploy, manage, and scale workloads throughout your entire ecosystem. The Red Hat Ansible Automation Platform simplifies the management of applications and services across your hybrid infrastructure with one platform. Look for it on the AWS Marketplace.Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. Back in my early formative years, I was an SRE sysadmin type, and one of the areas I always avoided was databases, or frankly, anything stateful because I am clumsy and unlucky and that's a bad combination to bring within spitting distance of anything that, you know, can't be spun back up intact, like databases. So, as a result, I tend not to spend a lot of time historically living in that world. It's time to expand horizons and think about this a little bit differently. My guest today is Tony Baer, principal at dbInsight. Tony, thank you for joining me.Tony: Oh, Corey, thanks for having me. And by the way, we'll try and basically knock down your primal fear of databases today. That's my mission.Corey: We're going to instill new fears in you. Because I was looking through a lot of your work over the years, and the criticism I have—and always the best place to deliver criticism is massively in public—is that you take a very conservative, stodgy approach to defining a database, whereas I'm on the opposite side of the world. I contain information. You can ask me about it, which we'll call querying. That's right. I'm a database.But I've never yet found myself listed in any of your analyses around various database options. So, what is your definition of databases these days? Where do they start and stop? Tony: Oh, gosh.Corey: Because anything can be a database if you hold it wrong.Tony: [laugh]. I think one of the last things I've ever been called as conservative and stodgy, so this is certainly a way to basically put the thumbtack on my share.Corey: Exactly. I'm trying to normalize my own brand of lunacy, so we'll see how it goes.Tony: Exactly because that's the role I normally play with my clients. So, now the shoe is on the other foot. What I view a database is, is basically a managed collection of data, and it's managed to the point where essentially, a database should be transactional—in other words, when I basically put some data in, I should have some positive information, I should hopefully, depending on the type of database, have some sort of guidelines or schema or model for how I structure the data. So, I mean, database, you know, even though you keep hearing about unstructured data, the fact is—Corey: Schemaless databases and data stores. Yeah, it was all the rage for a few years.Tony: Yeah, except that they all have schemas, just that those schemaless databases just have very variable schema. They're still schema.Corey: A question that I have is you obviously think deeply about these things, which should not come as a surprise to anyone. It's like, “Well, this is where I spend my entire career. Imagine that. I might think about the problem space a little bit.” But you have, to my understanding, never worked with databases in anger yourself. You don't have a history as a DBA or as an engineer—Tony: No.Corey: —but what I find very odd is that unlike a whole bunch of other analysts that I'm not going to name, but people know who I'm talking about regardless, you bring actual insights into this that I find useful and compelling, instead of reverting to the mean of well, I don't actually understand how any of these things work in reality, so I'm just going to believe whoever sounds the most confident when I ask a bunch of people about these things. Are you just asking the right people who also happen to sound confident? But how do you get away from that very common analyst trap?Tony: Well, a couple of things. One is I purposely play the role of outside observer. In other words, like, the idea is that if basically an idea is supposed to stand on its own legs, it has to make sense. If I've been working inside the industry, I might take too many things for granted. And a good example of this goes back, actually, to my early days—actually this goes back to my freshman year in college where I was taking an organic chem course for non-majors, and it was taught as a logic course not as a memorization course.And we were given the option at the end of the term to either, basically, take a final or do a paper. So, of course, me being a writer I thought, I can BS my way through this. But what I found—and this is what fascinated me—is that as long as certain technical terms were defined for me, I found a logic to the way things work. And so, that really informs how I approach databases, how I approach technology today is I look at the logic on how things work. That being said, in order for me to understand that, I need to know twice as much as the next guy in order to be able to speak that because I just don't do this in my sleep.Corey: That goes a big step toward, I guess, addressing a lot of these things, but it also feels like—and maybe this is just me paying closer attention—that the world of databases and data and analytics have really coalesced or emerged in a very different way over the past decade-ish. It used to be, at least from my perspective, that oh, that the actual, all the data we store, that's a storage admin problem. And that was about managing NetApps and SANs and the rest. And then you had the database side of it, which functionally from the storage side of the world was just a big file or series of files that are the backing store for the database. And okay, there's not a lot of cross-communication going on there.Then with the rise of object store, it started being a little bit different. And even the way that everyone is talking about getting meaning from data has really seem to be evolving at an incredibly intense clip lately. Is that an accurate perception, or have I just been asleep at the wheel for a while and finally woke up?Tony: No, I think you're onto something there. And the reason is that, one, data is touching us all around ourselves, and the fact is, I mean, I'm you can see it in the same way that all of a sudden that people know how to spell AI. They may not know what it means, but the thing is, there is an awareness the data that we work with, the data that is about us, it follows us, and with the cloud, this data has—well, I should say not just with the cloud but with smart mobile devices—we'll blame that—we are all each founts of data, and rich founts of data. And people in all walks of life, not just in the industry, are now becoming aware of it and there's a lot of concern about can we have any control, any ownership over the data that should be ours? So, I think that phenomenon has also happened in the enterprise, where essentially where we used to think that the data was the DBAs' issue, it's become the app developers' issue, it's become the business analysts' issue. Because the answers that we get, we're ultimately accountable for. It all comes from the data.Corey: It also feels like there's this idea of databases themselves becoming more contextually aware of the data contained within them. Originally, this used to be in the realm of, “Oh, we know what's been accessed recently and we can tier out where it lives for storage optimization purposes.” Okay, great, but what I'm seeing now almost seems to be a sense of, people like to talk about pouring ML into their database offerings. And I'm not able to tell whether that is something that adds actual value, or if it's marketing-ware.Tony: Okay. First off, let me kind of spill a couple of things. First of all, it's not a question of the database becoming aware. A database is not sentient.Corey: Niether are some engineers, but that's neither here nor there.Tony: That would be true, but then again, I don't want anyone with shotguns lining up at my door after this—Corey: [laugh].Tony: —after this interview is published. But [laugh] more of the point, though, is that I can see a couple roles for machine learning in databases. One is a database itself, the logs, are an incredible font of data, of operational data. And you can look at trends in terms of when this—when the pattern of these logs goes this way, that is likely to happen. So, the thing is that I could very easily say we're already seeing it: machine learning being used to help optimize the operation of databases, if you're Oracle, and say, “Hey, we can have a database that runs itself.”The other side of the coin is being able to run your own machine-learning models in database as opposed to having to go out into a separate cluster and move the data, and that's becoming more and more of a checkbox feature. However, that's going to be for essentially, probably, like, the low-hanging fruit, like the 80/20 rule. It'll be like the 20% of an ana—of relatively rudimentary, you know, let's say, predictive analyses that we can do inside the database. If you're going to be doing something more ambitious, such as a, you know, a large language model, you probably do not want to run that in database itself. So, there's a difference there.Corey: One would hope. I mean, one of the inappropriate uses of technology that I go for all the time is finding ways to—as directed or otherwise—in off-label uses find ways of tricking different services into running containers for me. It's kind of a problem; this is probably why everyone is very grateful I no longer write production code for anyone.But it does seem that there's been an awful lot of noise lately. I'm lazy. I take shortcuts very often, and one of those is that whenever AWS talks about something extensively through multiple marketing cycles, it becomes usually a pretty good indicator that they're on their back foot on that area. And for a long time, they were doing that about data and how it's very important to gather data, it unlocks the key to your business, but it always felt a little hollow-slash-hypocritical to me because you're going to some of the same events that I have that AWS throws on. You notice how you have to fill out the exact same form with a whole bunch of mandatory fields every single time, but there never seems to be anything that gets spat back out to you that demonstrates that any human or system has ever read—Tony: Right.Corey: Any of that? It's basically a, “Do what we say, not what we do,” style of story. And I always found that to be a little bit disingenuous.Tony: I don't want to just harp on AWS here. Of course, we can always talk about the two-pizza box rule and the fact that you have lots of small teams there, but I'd rather generalize this. And I think you really—what you're just describing is been my trip through the healthcare system. I had some sports-related injuries this summer, so I've been through a couple of surgeries to repair sports injuries. And it's amazing that every time you go to the doctor's office, you're filling the same HIPAA information over and over again, even with healthcare systems that use the same electronic health records software. So, it's more a function of that it's not just that the technologies are siloed, it's that the organizations are siloed. That's what you're saying.Corey: That is fair. And I think at some level—I don't know if this is a weird extension of Conway's Law or whatnot—but these things all have different backing stores as far as data goes. And there's a—the hard part, it seems, in a lot of companies once they hit a certain point of maturity is not just getting the data in—because they've already done that to some extent—but it's also then making it actionable and helping various data stores internal to the company reconcile with one another and start surfacing things that are useful. It increasingly feels like it's less of a technology problem and more of a people problem.Tony: It is. I mean, put it this way, I spent a lot of time last year, I burned a lot of brain cells working on data fabrics, which is an idea that's in the idea of the beholder. But the ideal of a data fabric is that it's not the tool that necessarily governs your data or secures your data or moves your data or transforms your data, but it's supposed to be the master orchestrator that brings all that stuff together. And maybe sometime 50 years in the future, we might see that.I think the problem here is both technical and organizational. [unintelligible 00:11:58] a promise, you have all these what we used call island silos. We still call them silos or islands of information. And actually, ironically, even though in the cloud we have technologies where we can integrate this, the cloud has actually exacerbated this issue because there's so many islands of information, you know, coming up, and there's so many different little parts of the organization that have their hands on that. That's also a large part of why there's such a big discussion about, for instance, data mesh last year: everybody is concerned about owning their own little piece of the pie, and there's a lot of question in terms of how do we get some consistency there? How do we all read from the same sheet of music? That's going to be an ongoing problem. You and I are going to get very old before that ever gets solved.Corey: Yeah, there are certain things that I am content to die knowing that they will not get solved. If they ever get solved, I will not live to see it, and there's a certain comfort in that, on some level.Tony: Yeah.Corey: But it feels like this stuff is also getting more and more complicated than it used to be, and terms aren't being used in quite the same way as they once were. Something that a number of companies have been saying for a while now has been that customers overwhelmingly are preferring open-source. Open source is important to them when it comes to their database selection. And I feel like that's a conflation of a couple of things. I've never yet found an ideological, purity-driven customer decision around that sort of thing.What they care about is, are there multiple vendors who can provide this thing so I'm not going to be using a commercially licensed database that can arbitrarily start playing games with seat licenses and wind up distorting my cost structure massively with very little notice. Does that align with your—Tony: Yeah.Corey: Understanding of what people are talking about when they say that, or am I missing something fundamental? Which is again, always possible?Tony: No, I think you're onto something there. Open-source is a whole other can of worms, and I've burned many, many brain cells over this one as well. And today, you're seeing a lot of pieces about the, you know, the—that are basically giving eulogies for open-source. It's—you know, like HashiCorp just finally changed its license and a bunch of others have in the database world. What open-source has meant is been—and I think for practitioners, for DBAs and developers—here's a platform that's been implemented by many different vendors, which means my skills are portable.And so, I think that's really been the key to why, for instance, like, you know, MySQL and especially PostgreSQL have really exploded, you know, in popularity. Especially Postgres, you know, of late. And it's like, you look at Postgres, it's a very unglamorous database. If you're talking about stodgy, it was born to be stodgy because they wanted to be an adult database from the start. They weren't the LAMP stack like MySQL.And the secret of success with Postgres was that it had a very permissive open-source license, which meant that as long as you don't hold University of California at Berkeley, liable, have at it, kids. And so, you see, like, a lot of different flavors of Postgres out there, which means that a lot of customers are attracted to that because if I get up to speed on this Postgres—on one Postgres database, my skills should be transferable, should be portable to another. So, I think that's a lot of what's happening there.Corey: Well, I do want to call that out in particular because when I was coming up in the naughts, the mid-2000s decade, the lingua franca on everything I used was MySQL, or as I insist on mispronouncing it, my-squeal. And lately, on same vein, Postgres-squeal seems to have taken over the entire universe, when it comes to the de facto database of choice. And I'm old and grumpy and learning new things as always challenging, so I don't understand a lot of the ways that thing gets managed from the context coming from where I did before, but what has driven the massive growth of mindshare among the Postgres-squeal set?Tony: Well, I think it's a matter of it's 30 years old and it's—number one, Postgres always positioned itself as an Oracle alternative. And the early years, you know, this is a new database, how are you going to be able to match, at that point, Oracle had about a 15-year headstart on it. And so, it was a gradual climb to respectability. And I have huge respect for Oracle, don't get me wrong on that, but you take a look at Postgres today and they have basically filled in a lot of the blanks.And so, it now is a very cre—in many cases, it's a credible alternative to Oracle. Can it do all the things Oracle can do? No. But for a lot of organizations, it's the 80/20 rule. And so, I think it's more just a matter of, like, Postgres coming of age. And the fact is, as a result of it coming of age, there's a huge marketplace out there and so much choice, and so much opportunity for skills portability. So, it's really one of those things where its time has come.Corey: I think that a lot of my own biases are simply a product of the era in which I learned how a lot of these things work on. I am terrible at Node, for example, but I would be hard-pressed not to suggest JavaScript as the default language that people should pick up if they're just entering tech today. It does front-end, it does back-end—Tony: Sure.Corey: —it even makes fries, apparently. There's a—that is the lingua franca of the modern internet in a bunch of different ways. That doesn't mean I'm any good at it, and it doesn't mean at this stage, I'm likely to improve massively at it, but it is the right move, even if it is inconvenient for me personally.Tony: Right. Right. Put it this way, we've seen—and as I said, I'm not an expert in programming languages, but we've seen a huge profusion of programming languages and frameworks. But the fact is that there's always been a draw towards critical mass. At the turn of the millennium, we thought is between Java and .NET. Little did we know that basically JavaScript—which at that point was just a web scripting language—[laugh] we didn't know that it could work on the server; we thought it was just a client. Who knew?Corey: That's like using something inappropriately as a database. I mean, good heavens.Tony: [laugh]. That would be true. I mean, when I could have, you know, easily just use a spreadsheet or something like that. But so, I mean, who knew? I mean, just like for instance, Java itself was originally conceived for a set-top box. You never know how this stuff is going to turn out. It's the same thing happen with Python. Python was also a web scripting language. Oh, by the way, it happens to be really powerful and flexible for data science. And whoa, you know, now Python is—in terms of data science languages—has become the new SaaS.Corey: It really took over in a bunch of different ways. Before that, Perl was great, and I go, “Why would I use—why write in Python when Perl is available?” It's like, “Okay, you know, how to write Perl, right?” “Yeah.” “Have you ever read anything a month later?” “Oh…” it's very much a write-only language. It is inscrutable after the fact. And Python at least makes that a lot more approachable, which is never a bad thing.Tony: Yeah.Corey: Speaking of what you touched on toward the beginning of this episode, the idea of databases not being sentient, which I equate to being self-aware, you just came out very recently with a report on generative AI and a trip that you wound up taking on this. Which I've read; I love it. In fact, we've both been independently using the phrase [unintelligible 00:19:09] to, “English is the new most common programming language once a lot of this stuff takes off.” But what have you seen? What have you witnessed as far as both the ground truth reality as well as the grandiose statements that companies are making as they trip over themselves trying to position as the forefront leader and all of this thing that didn't really exist five months ago?Tony: Well, what's funny is—and that's a perfect question because if on January 1st you asked “what's going to happen this year?” I don't think any of us would have thought about generative AI or large language models. And I will not identify the vendors, but I did some that had— was on some advanced briefing calls back around the January, February timeframe. They were talking about things like server lists, they were talking about in database machine learning and so on and so forth. They weren't saying anything about generative.And all of a sudden, April, it changed. And it's essentially just another case of the tail wagging the dog. Consumers were flocking to ChatGPT and enterprises had to take notice. And so, what I saw, in the spring was—and I was at a conference from SaaS, I'm [unintelligible 00:20:21] SAP, Oracle, IBM, Mongo, Snowflake, Databricks and others—that they all very quickly changed their tune to talk about generative AI. What we were seeing was for the most part, position statements, but we also saw, I think, the early emphasis was, as you say, it's basically English as the new default programming language or API, so basically, coding assistance, what I'll call conversational query.I don't want to call it natural language query because we had stuff like Tableau Ask Data, which was very robotic. So, we're seeing a lot of that. And we're also seeing a lot of attention towards foundation models because I mean, what organization is going to have the resources of a Google or an open AI to develop their own foundation model? Yes, some of the Wall Street houses might, but I think most of them are just going to say, “Look, let's just use this as a starting point.”I also saw a very big theme for your models with your data. And where I got a hint of that—it was a throwaway LinkedIn post. It was back in, I think like, February, Databricks had announced Dolly, which was kind of an experimental foundation model, just to use with your own data. And I just wrote three lines in a LinkedIn post, it was on Friday afternoon. By Monday, it had 65,000 hits.I've never seen anything—I mean, yes, I had a lot—I used to say ‘data mesh' last year, and it would—but didn't get anywhere near that. So, I mean, that really hit a nerve. And other things that I saw, was the, you know, the starting to look with vector storage and how that was going to be supported was it was going be a new type of database, and hey, let's have AWS come up with, like, an, you know, an [ADF 00:21:41] database here or is this going to be a feature? I think for the most part, it's going to be a feature. And of course, under all this, everybody's just falling in love, falling all over themselves to get in the good graces of Nvidia. In capsule, that's kind of like what I saw.Corey: That feels directionally accurate. And I think databases are a great area to point out one thing that's always been more a little disconcerting for me. The way that I've always viewed databases has been, unless I'm calling a RAND function or something like it and I don't change the underlying data structure, I should be able to run a query twice in a row and receive the same result deterministically both times.Tony: Mm-hm.Corey: Generative AI is effectively non-deterministic for all realistic measures of that term. Yes, I'm sure there's a deterministic reason things are under the hood. I am not smart enough or learned enough to get there. But it just feels like sometimes we're going to give you the answer you think you're going to get, sometimes we're going to give you a different answer. And sometimes, in generative AI space, we're going to be supremely confident and also completely wrong. That feels dangerous to me.Tony: [laugh]. Oh gosh, yes. I mean, I take a look at ChatGPT and to me, the responses are essentially, it's a high school senior coming out with an essay response without any footnotes. It's the exact opposite of an ACID database. The reason why we're very—in the database world, we're very strongly drawn towards ACID is because we want our data to be consistent and to get—if we ask the same query, we're going to get the same answer.And the problem is, is that with generative, you know, based on large language models, computers sounds sentient, but they're not. Large language models are basically just a series of probabilities, and so hopefully those probabilities will line up and you'll get something similar. That to me, kind of scares me quite a bit. And I think as we start to look at implementing this in an enterprise setting, we need to take a look at what kind of guardrails can we put on there. And the thing is, that what this led me to was that missing piece that I saw this spring with generative AI, at least in the data and analytics world, is nobody had a clue in terms of how to extend AI governance to this, how to make these models explainable. And I think that's still—that's a large problem. That's a huge nut that it's going to take the industry a while to crack.Corey: Yeah, but it's incredibly important that it does get cracked.Tony: Oh, gosh, yes.Corey: One last topic that I want to get into. I know you said you don't want to over-index on AWS, which, fair enough. It is where I spend the bulk of my professional time and energy—Tony: [laugh].Corey: Focusing on, but I think this one's fair because it is a microcosm of a broader industry question. And that is, I don't know what the DBA job of the future is going to look like, but increasingly, it feels like it's going to primarily be picking which purpose-built AWS database—or larger [story 00:24:56] purpose database is appropriate for a given workload. Even without my inappropriate misuse of things that are not databases as databases, they are legitimately 15 or 16 different AWS services that they position as database offerings. And it really feels like you're spiraling down a well of analysis paralysis, trying to pick between all these things. Do you think the future looks more like general-purpose databases, or very purpose-built and each one is this beautiful, bespoke unicorn?Tony: [laugh]. Well, this is basically a hit on a theme that I've been—you know, we've been all been thinking about for years. And the thing is, there are arguments to be made for multi-model databases, you know, versus a for-purpose database. That being said, okay, two things. One is that what I've been saying, in general, is that—and I wrote about this way, way back; I actually did a talk at the [unintelligible 00:25:50]; it was a throwaway talk, or [unintelligible 00:25:52] one of those conferences—I threw it together and it's basically looking at the emergence of all these specialized databases.But how I saw, also, there's going to be kind of an overlapping. Not that we're going to come back to Pangea per se, but that, for instance, like, a relational database will be able to support JSON. And Oracle, for instance, does has some fairly brilliant ideas up the sleeve, what they call a JSON duality, which sounds kind of scary, which basically says, “We can store data relationally, but superimpose GraphQL on top of all of this and this is going to look really JSON-y.” So, I think on one hand, you are going to be seeing databases that do overlap. Would I use Oracle for a MongoDB use case? No, but would I use Oracle for a case where I might have some document data? I could certainly see that.The other point, though, and this is really one I want to hammer on here—it's kind of a major concern I've had—is I think the cloud vendors, for all their talk that we give you operational simplicity and agility are making things very complex with its expanding cornucopia of services. And what they need to do—I'm not saying, you know, let's close down the patent office—what I think we do is we need to provide some guided experiences that says, “Tell us the use case. We will now blend these particular services together and this is the package that we would suggest.” I think cloud vendors really need to go back to the drawing board from that standpoint and look at, how do we bring this all together? How would he really simplify the life of the customer?Corey: That is, honestly, I think the biggest challenge that the cloud providers have across the board. There are hundreds of services available at this point from every hyperscaler out there. And some of them are brand new and effectively feel like they're there for three or four different customers and that's about it and others are universal services that most people are probably going to use. And most things fall in between those two extremes, but it becomes such an analysis paralysis moment of trying to figure out what do I do here? What is the golden path?And what that means is that when you start talking to other people and asking their opinion and getting their guidance on how to do something when you get stuck, it's, “Oh, you're using that service? Don't do it. Use this other thing instead.” And if you listen to that, you get midway through every problem for them to start over again because, “Oh, I'm going to pick a different selection of underlying components.” It becomes confusing and complicated, and I think it does customers largely a disservice. What I think we really need, on some level, is a simplified golden path with easy on-ramps and easy off-ramps where, in the absence of a compelling reason, this is what you should be using.Tony: Believe it or not, I think this would be a golden case for machine learning.Corey: [laugh].Tony: No, but submit to us the characteristics of your workload, and here's a recipe that we would propose. Obviously, we can't trust AI to make our decisions for us, but it can provide some guardrails.Corey: “Yeah. Use a graph database. Trust me, it'll be fine.” That's your general purpose—Tony: [laugh].Corey: —approach. Yeah, that'll end well.Tony: [laugh]. I would hope that the AI would basically be trained on a better set of training data to not come out with that conclusion.Corey: One could sure hope.Tony: Yeah, exactly.Corey: I really want to thank you for taking the time to catch up with me around what you're doing. If people want to learn more, where's the best place for them to find you?Tony: My website is dbinsight.io. And on my homepage, I list my latest research. So, you just have to go to the homepage where you can basically click on the links to the latest and greatest. And I will, as I said, after Labor Day, I'll be publishing my take on my generative AI journey from the spring.Corey: And we will, of course, put links to this in the [show notes 00:29:39]. Thank you so much for your time. I appreciate it.Tony: Hey, it's been a pleasure, Corey. Good seeing you again.Corey: Tony Baer, principal at dbInsight. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice, along with an angry, insulting comment that we will eventually stitch together with all those different platforms to create—that's right—a large-scale distributed database.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.
AWS Morning Brief for the week of September 11, 2023, with Corey Quinn. Links: Amazon Aurora and Amazon RDS announces Extended Support for MySQL and PostgreSQL databases Amazon CloudWatch adds Amazon EKS control plane logs as Vended Logs Amazon CloudWatch Logs announces regular expression filter pattern syntax support As SwiftOnSecurity pointed out a week or two ago, a lot of folks can now discover firsthand just how many of their rules allow all 10* traffic Introducing Amazon EC2 R7iz instances AWS Marketplace now supports AWS CloudTrail to improve procurement activity monitoring AWS Step Functions launches enhanced error handling AWS Trusted Advisor adds 1 new fault tolerance check Announcing daily disbursements for AWS Marketplace sellers Embracing FinOps to Maximize Cloud Value and Control Costs with the Deloitte FinOps Framework Transforming Aviation Maintenance with the Infosys Generative AI Solution Built on Amazon Bedrock How Vercel Shipped Cron Jobs in 2 Months Using Amazon EventBridge Scheduler How contact center leaders can prepare for generative AI A Culture of Resilience How generative AI is energizing the beauty industry Migrating AWS Direct Connect to a new location Reduce the security and compliance risks of messaging apps with AWS Wickr AWS Guild Tournament builds cloud skills and innovative customer solutions From chocolate sales to a career in cloud with training from AWS re/Start Amazon to Discontinue Honeycode App-Building Service
On this episode, Christine Li, VP of Partnerships and Enablement at G2, discusses the importance of reviews and social proof in the software industry. G2 is a platform that gathers social proof through customer reviews to help buyers make informed decisions. G2 has a unique role in the marketplace ecosystem. They function as a marketplace themselves, as a vendor in other marketplaces (such as Salesforce AppExchange and AWS Marketplace), and as a syndication partner for marketplaces. Its syndication program helps amplify reviews and social proof for vendors across multiple marketplaces, making it easier for buyers to find valuable insights. Christine shares three key takeaways from their syndication program: Firstly, B2B review collection is challenging, but G2 has invested in AI and human resources to ensure the authenticity and reliability of reviews. Second, negative reviews are essential as they provide valuable insights and build trust with buyers. Responding to negative reviews professionally is crucial for addressing concerns. Finally, offering syndicated reviews for free didn't yield desired results; monetizing the service helped drive partner prioritization. The discussion highlights the significance of customer reviews and social proof in the B2B software industry, and how G2's approach to review collection and syndication benefits vendors and buyers alike. Resources mentioned: Christine Li - https://www.linkedin.com/in/christinerli/ G2 | LinkedIn - https://www.linkedin.com/company/g2dotcom/ G2 | Website - https://www.g2.com/ Thank you to our amazing podcast team at Content Allies. Want to launch your own B2B revenue-generating podcasts? Contact them at https://ContentAllies.com. #saas #software #cloud
Andreas Wittig, Co-Author of Amazon Web Services in Action and Co-Founder of marbot, joins Corey on Screaming in the Cloud to discuss ways to keep a book up to date in an ever-changing world, the advantages of working with a publisher, and how he began the journey of writing a book in the first place. Andreas also recalls how much he learned working on the third edition of Amazon Web Services in Action and how teaching can be an excellent tool for learning. Since writing the first edition, Adreas's business has shifted from a consulting business to a B2B product business, so he and Corey also discuss how that change came about and the pros and cons of each business model. About AndreasAndreas is the Co-Author of Amazon Web Services in Action and Co-Founder of marbot - AWS Monitoring made simple! He is also known on the internet as cloudonaut through the popular blog, podcast, and youtube channel he created with his brother Michael. Links Referenced: Amazon Web Services in Action: https://www.amazon.com/Amazon-Services-Action-Andreas-Wittig/dp/1617295116 Rapid Docker on AWS: https://cloudonaut.io/rapid-docker-on-aws/ bucket/av: https://bucketav.com/ marbot: https://marbot.io/ cloudonaut.io: https://cloudonaut.io TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. It's been a few years since I caught up with Andreas Wittig, who is also known in the internet as cloudonaut, and much has happened since then. Andreas, how are you?Andreas: Hey, absolutely. Thank you very much. I'm happy to be here in the show. I'm doing fine.Corey: So, one thing that I have always held you in some high regard for is that you have done what I have never had the attention span to do: you wrote a book. And you published it a while back through Manning, it was called Amazon Web Services in Action. That is ‘in action' two words, not Amazon Web Services Inaction of doing absolutely nothing about it, which is what a lot of companies in the space seem to do instead.Andreas: [laugh]. Yeah, absolutely. So. And it was not only me. I've written the book together with my brother because back in 2015, Manning, for some reason, wrote in and asked us if we would be interested in writing the book.And we had just founded our own consulting company back then and we had—we didn't have too many clients at the very beginning, so we had a little extra time free. And then we decided, okay, let's do the book. And let's write a book about Amazon Web Services, basically, a deep introduction into all things AWS. So, this was 2015, and it was indeed a lot of work, much more [laugh] than we expected. So, first of all, the hard part is, what do you want to have in the book? So, what's the TOC? What is important and must be in?And then you start writing and have examples and everything. So, it was really an interesting journey. And doing it together with a publisher like Manning was also really interesting because we learned a lot about writing. You have kind of a coach, an editor that helps you through that process. So, this was really a hard and fun experience.Corey: There's a lot of people that have said very good things about writing the book through a traditional publisher. And they also say that one of the challenges is it's a blessing and a curse, where you basically have someone standing over your shoulder saying, “Is it done yet? Is it done yet? Is it done yet?” The consensus that seems to have emerged from people who have written books is, “That was great, please don't ever ask me to do it again.”And my operating theory is that no one wants to write a book. They want to have written a book. Which feels like two very different things most of the time. But the reason you're back on now is that you have gone the way of the terrible college professor, where you're going to update the book, and therefore you get to do a whole new run of textbooks and make everyone buy it and kill the used market, et cetera. And you've done that twice now because you have just recently released the third edition. So, I have to ask, how different is version one from version two and from version three? Although my apologies; we call them ‘editions' in the publishing world.Andreas: [laugh]. Yeah, yeah. So, of course, as you can imagine, things change a lot in AWS world. So, of course, you have to constantly update things. So, I remember from first to second edition, we switched from CloudFormation in JSON to YAML. And now to the third edition, we added two new chapters. This was also important to us, so to keep also the scope of the book in shape.So, we have in the third edition, two new chapters. One is about automating deployments, recovering code deploy, [unintelligible 00:03:59], CloudFormation rolling updates in there. And then there was one important topic missing at all in the book, which was containers. And we finally decided to add that in, and we have now container chapter, starting with App Runner, which I find quite an interesting service to observe right now, and then our bread and butter service: ECS and Fargate. So, that's basically the two new chapters. And of course, then reworking all the other chapters is also a lot of work. And so, many things change over time. Cannot imagine [laugh].Corey: When was the first edition released? Because I believe the second one was released in 2018, which means you've been at this for a while.Andreas: Yeah. So, the first was 2015, the second 2018, three years later, and then we had five years, so now this third edition was released at the beginning of this year, 2023.Corey: Eh, I think you're right on schedule. Just March of 2020 lasted three years. That's fine.Andreas: Yeah [laugh].Corey: So, I have to ask, one thing that I've always appreciated about AWS is, it feels like with remarkably few exceptions, I can take a blog post written on how to do something with AWS from 2008 and now in 2023, I can go through every step along with that blog post. And yeah, I might have trouble getting some of the versions and services and APIs up and running, but the same steps will absolutely work. There are very few times where a previously working API gets deprecated and stops working. Is this the best way to proceed? Absolutely not.But you can still spin up the m1.medium instance sizes, or whatever it was, or [unintelligible 00:05:39] on small or whatever the original only size that you could get was. It's just there are orders of magnitude and efficiency gains you can do by—you can go through by using more modern approaches. So, I have to ask, was there anything in the book as you revised it—two times now—that needed to come out because it was now no longer working?Andreas: So, related to the APIs that's—they are really very stable, you're right about that. So, the problem is, our first few chapters where we have screenshots of how you go through the management—Corey: Oh no.Andreas: —console [laugh]. And you can probably, you can redo them every three months, probably, because the button moves or a step is included or something like that. So, the later chapters in the book, where we focus a lot on the CLI or CloudFormation and stuff like—or SDKs, they are pretty stable. But the first few [ones 00:06:29] are a nightmare to update all those screenshots. And then sometimes, so I was going through the book, and then I noticed, oh, there's a part of this chapter that I can completely remove nowadays.So, I can give you an example. So, I was going through the chapter about Simple Storage Service S3, and I—there was a whole section in the chapter about read-after-write consistency. Because back then, it was important that you knew that after updating an object or reading an object before it was created the first time, you could get outdated versions for a little while, so this was eventually consistent. But nowadays, AWS has changed that and basically now, S3 has this strong read-after-write consistency. So, I basically could remove that whole part in the chapter which was quite complicated to explain to the reader, right, so I [laugh] put a lot of effort into that.Corey: You think that was confusing? I look at the sea of systems I had to oversee at one company, specifically to get around that problem. It's like, well, we can now take this entire application and yeet it into the ocean because it was effectively a borderline service to that just want to ens—making consistency guarantees. It's not a common use case, but it is one that occurs often enough to be a problem. And of course, when you need it, you really need it. That was a nice under-the-hood change that was just one day, surprise, it works that way. But I'm sure it was years of people are working behind the scenes, solving for impossible problems to get there, and cetera, et cetera.Andreas: Yeah, yeah. But that's really cool is to remove parts of the book that are now less complicated. This is really cool. So, a few other examples. So, things change a lot. So, for example, EFS, so we have EFS, Elastic File System, in the book as well. So, now we have new throughput modes, different limits. So, there's really a lot going on and you have to carefully go through all the—Corey: Oh, when EFS launched, it was terrible. Now, it's great just because it's gotten so much more effective and efficient as a service. It's… AWS releases things before they're kind of ready, it feels like sometimes, and then they improve with time. I know there have been feature deprecations. For example, for some reason, they are no longer allowing us to share out a bucket via BitTorrent, which, you know, in 2006 when it came out, seemed like a decent idea to save on bandwidth. But here in 2023, no one cares about it.But I'm also keeping a running list of full-on AWS services that have been deprecated or have the deprecations announced. Are any of those in the book in any of its editions? And if and when there's a fourth edition, will some of those services have to come out?Andreas: [laugh]. Let's see. So, right after the book was published—because the problem with books is they get printed, right; that's the issue—but the target of the book, AWS, changes. So, a few weeks after the printed book was out, we found out that we have an issue in our one of our examples because now S3 buckets, when you create them, they have locked public access enabled by default. And this was not the case before. And one of our example relies on that it can create object access control lists, and this is not working now anymore. [laugh].So yeah, there are things changing. And we have, the cool thing about Manning is they have that what they call a live book, so you can read it online and you can have notes from other readers and us as the authors along the text, and there we can basically point you in the right direction and explain what happened here. So, this is how we try to keep the book updated. Of course, the printed one stays the same, but the ebook can change over time a little bit.Corey: Yes, ebooks are… at least keeping them updated is a lot easier, I would imagine. It feels like that—speaking of continuous builds and automatic CI/CD approaches—yeah, well, we could build a book just by updating some text in a Git repo or its equivalent, and pressing go, but it turns out that doing a whole new print run takes a little bit more work.Andreas: Yeah. Because you mentioned the experience of writing a book with a publisher and doing it on your own with self-publishing, so we did both in the past. We have Amazon Web Services in Action with Manning and we did another book, Rapid Docker on AWS in self-publishing. And what we found out is, there's really a lot of effort that goes into typesetting and layouting a book, making sure it looks consistent.And of course, you can just transform some markdown into a epub and PDF versions, but if a publisher is doing that, the results are definitely different. So, that was, besides the other help that we got from the publisher, very helpful. So, we enjoyed that as well.Corey: What is the current state of the art—since I don't know the answer to this one—around updating ebook versions? If I wind up buying an ebook on Kindle, for example, will they automatically push errata down automatically through their system, or do they reserve that for just, you know, unpublishing books that they realized shouldn't be on the Marketplace after people have purchased them?Andreas: [laugh]. So—Corey: To be fair, that only happened once, but I'm still giving them grief for it a decade and change later. But it was 1984. Of all the books to do that, too. I digress.Andreas: So, I'm not a hundred percent sure how it works with the Kindle. I know that Manning pushes out new versions by just emailing all the customers who bought the book and sending them a new version. Yeah.Corey: Yeah. It does feel, on some level, like there needs to be at least a certain degree of substantive change before they're going to start doing that. It's like well, good news. There was a typo on page 47 that we're going to go ahead and fix now. Two letters were transposed in a word. Now, that might theoretically be incredibly important if it's part of a code example, which yes, send that out, but generally, A, their editing is on point, so I didn't imagine that would sneak through, and 2, no one cares about a typo release and wants to get spammed over it?Andreas: Definitely, yeah. Every time there's a reprint of the book, you have the chance to make small modifications, to add something or remove something. That's also a way to keep it in shape a little bit.Corey: I have to ask, since most people talk about AWS services to a certain point of view, what is your take on databases? Are you sticking to the actual database services or are you engaged in my personal hobby of misusing everything as a database by holding it wrong?Andreas: [laugh]. So, my favorite database for starting out is DynamoDB. So, I really like working with DynamoDB and I like the limitations and the thing that you have to put some thoughts into how to structure your data set in before. But we also use a lot of Aurora, which really find an interesting technology. Unfortunately, Aurora Serverless, it's not becoming a product that I want to use. So, version one is now outdated, version two is much too expensive and restricted. So—Corey: I don't even know that it's outdated because I'm seeing version one still get feature updates to it. It feels like a divergent service. That is not what I would expect a version one versus version two to be. I'm with you on Dynamo, by the way. I started off using that and it is cheap is free for most workloads I throw at it. It's just a great service start to finish. The only downside is that if I need to move it somewhere else, then I have a problem.Andreas: That's true. Yeah, absolutely.Corey: I am curious, as far as you look across the sea of change—because you've been doing this for a while and when you write a book, there's nothing that I can imagine that would be better at teaching you the intricacies of something like AWS than writing a book on it. I got a small taste of this years ago when I shot my mouth off and committed to give a talk about Git. Well, time to learn Git. And teaching it to other people really solidifies a lot of the concepts yourself. Do you think that going through the process of writing this book has shaped how you perform as an engineer?Andreas: Absolutely. So, it's really interesting. So,I added the third edition and I worked on it mostly last year. And I didn't expect to learn a lot during that process actually, because I just—okay, I have to update all the examples, make sure everything work, go through the text, make sure everything is up to date. But I learned things, not only new things, but I relearned a lot of things that I wasn't aware of anymore. Or maybe I've never been; I don't know exactly [laugh].But it's always, if you go into the details and try to explain something to others, you learn a lot about that. So, teaching is a very good way to, first of all gather structure and a deep understanding of a topic and also dive into the details. Because when you write a book, every time you write a sentence, ask the question, is that really correct? Do I really know that or do I just assume that? So, I check the documentation, try to find out, is that really the case or is that something that came up myself?So, you'll learn a lot by doing that. And always come to the limits of the AWS documentation because sometimes stuff is just not documented and you need to figure out, what is really happening here? What's the real deal? And then this is basically the research part. So, I always find that interesting. And I learned a lot in during the third edition, while was only adding two new chapters and rewriting a lot of them. So, I didn't expect that.Corey: Do you find that there has been an interesting downstream effect from having written the book, that for better or worse, I've always no—I always notice myself responding to people who have written a book with more deference, more acknowledgment for the time and effort that it takes. And some books, let's be clear, are terrible, but I still find myself having that instinctive reaction because they saw something through to be published. Have you noticed it changing other aspects of your career over the past, oh, dear Lord, it would have been almost ten years now.Andreas: So, I think it helped us a lot with our consulting business, definitely. Because at the very beginning, so back in 2015, at least here in Europe and Germany, AWS was really new in the game. And being the one that has written a book about AWS was really helping… stuff. So, it really helped us a lot for our consulting work. I think now we are into that game of having to update the book [laugh] every few years, to make sure it stays up to date, but I think it really helped us for starting our consulting business.Corey: And you've had a consulting business for a while. And now you have effectively progressed to the next stage of consulting business lifecycle development, which is, it feels like you're becoming much more of a product company than you were in years past. Is that an accurate perception from the outside or am I misunderstanding something fundamental?Andreas: You know, absolutely, that's the case. So, from the very beginning, basically, when we founded our company, so eight years ago now, so we always had to go to do consulting work, but also do product work. And we had a rule of thumb that 20% of our time goes into product development. And we tried a lot of different things. So, we had just a few examples that failed completely.So, we had a Time [Series 00:17:49] as a Service offering at the very beginning of our journey, which failed completely. And now we have Amazon Timestream, which makes that totally—so now the market is maybe there for that. We tried a lot of things, tried content products, but also as we are coming from the software development world, we always try to build products. And over the years, we took what we learned from consulting, so we learned a lot about, of course, AWS, but also about the market, about the ecosystem. And we always try to bring that into the market and build products out of that.So nowadays, we really transitioned completely from consulting to a product company, as you said. So, we do not do any consulting anymore with one few exception with one of our [laugh] best or most important clients. But we are now a product company. And we only a two-person company. So, the idea was always how to scale a company without growing the team or hiring a lot of people, and a consulting business is definitely not a good way to do that, so yeah, this was why always invested into products.And now we have two products in the AWS Marketplace which works very well for us because it allows us to sell worldwide and really easily get a relationship up and running with our customers, and that pay through their AWS bill. So, that's really helping us a lot. Yeah.Corey: A few questions on that. At first it always seems to me that writing software or building a product is a lot like real estate in that you're doing a real estate development—to my understanding since I live in San Francisco and this is a [two exit 00:19:28] town; I still rent here—I found though, that you have to spend a lot of money and effort upfront and you don't get to start seeing revenue on that for years, which is why the VC model is so popular where you'll take $20 million, but then in return they want to see massive, outsized returns on that, which—it feels—push an awful lot of perfectly sustainable products into things that are just monstrous.Andreas: Hmm, yeah. Definitely.Corey: And to my understanding, you bootstrapped. You didn't take a bunch of outside money to do this, right?Andreas: No, no, we have completely bootstrapping and basically paying the bills with our consulting work. So yeah, I can give you one example. So, bucketAV is our solution to scan S3 buckets for malware, and basically, this started as an open-source project. So, this was just a side project we are working on. And we saw that there is some demand for that.So, people need ways to scan their objects—for example, user uploads—for malware, and we just tried to publish that in the AWS Marketplace to sell it through the Marketplace. And we don't really expect that this is a huge deal, and so we just did, I don't know, Michael spent a few days to make sure it's possible to publish that and get in shape. And over time, this really grew into an important, really substantial part of our business. And this doesn't happen overnight. So, this adds up, month by month. And you get feedback from customers, you improve the product based on that. And now this is one of the two main products that we sell in the Marketplace.Corey: I wanted to ask you about the Marketplace as well. Are you finding that that has been useful for you—obviously, as a procurement vehicle, it means no matter what country a customer is in, they can purchase it, it shows up on the AWS bill, and life goes on—but are you finding that it has been an effective way to find new customers?Andreas: Yes. So, I definitely would think so. It's always funny. So, we have completely inbound sales funnel. So, all customers find us through was searching the Marketplace or Google, probably. And so, what I didn't expect that it's possible to sell a B2B product that way. So, we don't know most of our customers. So, we know their name, we know the company name, but we don't know anyone there. We don't know the person who buys the product.This is, on the one side, a very interesting thing as a two-person company. You cannot build a huge sales process and I cannot invest too much time into the sales process or procurement process, so this really helps us a lot. The downside of it is a little bit that we don't have a close relationship with our customers and sometimes it's a little tricky for us to find important person to talk to, to get feedback and stuff. But on the other hand, yeah, it really helps us to sell to businesses all over the world. And we sell to very small business of course, but also to large enterprise customers. And they are fine with that process as well. And I think, even the large enterprises, they enjoy that it's so easy [laugh] to get a solution up and running and don't have to talk to any salespersons. So, enjoy it and I think our customers do as well.Corey: This is honestly the first time I've ever heard a verifiable account a vendor saying, “Yeah, we put this thing on the Marketplace, and people we've never talked to find us on the Marketplace and go ahead and buy.” That is not the common experience, let's put it that way. Now true, an awful lot of folks are selling enterprise software on this and someone—I forget who—many years ago had a great blog post on why no enterprise software costs $5,000. It either is going to cost $500 or it's going to cost 100 grand and up because the difference is, is at some point, you'd have a full-court press enterprise sales motion to go and sell the thing. And below a certain point, great, people are just going to be able to put it on their credit card and that's fine. But that's why you have this giant valley of there is very little stuff priced in that sweet spot.Andreas: Yeah. So, I think maybe it's important to mention that our products are relatively simple. So, they are just for a very small niche, a solution for a small problem. So, I think that helps a lot. So, we're not selling a full-blown cloud security solution; we only focus on that very small part: scanning S3 objects for malware.For example, on marbot,f the other product that we sell, which is monitoring of AWS accounts. Again, we focus on a very simple way to monitor AWS workloads. And so, I think that is probably why this is a successful way for us to find new customers because it's not a very complicated product where you have to explain a lot. So, that's probably the differentiator here.Corey: Having spent a fair bit of time doing battle with compliance goblins—which is, to be clear, I'm not describing people; I'm describing processes—in many cases, we had to do bucket scanning for antivirus, just to check a compliance box. From our position, there was remarkably little risk of a user-generated picture of a receipt that is input sanitized to make sure it is in fact a picture, landing in an S3 bucket and then somehow infecting one of the Linux servers through which it passed. So, we needed something that just checked the compliance box or we would not be getting the gold seal on our website today. And it was, more or less, a box-check as opposed to something that solved a legitimate problem. This was also a decade and change ago. Has that changed to a point now where there are legitimate threats and concerns around this, or is it still primarily just around make the auditor stop yelling at me, please?Andreas: Mmm. I think it's definitely to tick the checkbox, to be compliant with this, some regulation. On the other side, I think there are definitely use cases where it makes a lot of sense, especially when it comes to user-generated content of all kinds, especially if you're not only consuming it internally, but maybe also others can immediately start downloading that. So, that is where we see many of our customers are coming with that scenario that they want to make sure that the files that people upload and others can download are not infected. So, that is probably the most important use case.Corey: There's also, on some level, an increasing threat of ransomware. And for a long time, I was very down on the ideas of all these products that hit the market to defend S3 buckets against ransomware. Until one day, there was an AWS security blog post talking about how they found it. And yeah, we've we have seen this in the wild; it is causing problems for companies; here's what to do about it. Because it's one of those areas where I can't trust a vendor who's trying to sell me something to tell me that this problem exists.I mean, not to cast aspersions, but they're very interested, they're very incentivized to tell that story, whereas AWS is not necessarily incentivized to tell a story like that. So, that really brought it home for me that no, this is a real thing. So, I just want to be clear that my opinion on these things does in fact, evolve. It's not, “Well, I thought it was dumb back in 2012, so clearly it's still dumb now.” That is not my position, I want to be very clear on that.I do want to revisit for a moment, the idea of going from a consultancy that is a services business over to a product business because we've toyed with aspects of that here at The Duckbill Group a fair bit. We've not really found long-term retainer services engagements that add value that we are comfortable selling. And that means as a result that when you sell fixed duration engagements, it's always a sell, sell, sell, where's the next project coming from? Whereas with product businesses, it's oh, the grass is always greener on the other side. It's recurring revenue. Someone clicks, the revenue sticks around and never really goes away. That's the dream from where I sit on the services side of the fence, wistfully looking across and wondering what if. Now that you've made that transition, what sucks about product businesses that you might not have seen going into it?Andreas: [laugh]. Yeah, that a good question. So, on the one side, it was really also our dream to have a product business because it really changes the way we work. We can block large parts of our calendar to do deep-focus work, focus on things, find new solutions, and really try to make a solution that really fits to problem and uses all the AWS capabilities to do so. And on the other side, a product business involves, of course, selling the product, which is hard.And we are two software engineers, [laugh] and really making sure that we optimize our sales and there's search engine optimization, all that stuff, this is really hard for us because we don't know anything about that and we always have to find an expert, or we need to build a knowledge ourself, try things out, and so on. So, that whole part of selling the product, this is really a challenge for us. And then of course, product business evolves a lot of support work. So, we get support emails multiple times per hour, and we have to answer them and be as fast as possible with that. So, that is, of course, something that you do not have to do with consulting work.And not always that, the questions are many times really simple questions that pointed people in the right direction, find part of the documentation that answers the question. So, that is a constant stream of questions coming in that you have to answer. So, the inbox is always full [laugh]. So, that is maybe a small downside of a product business. But other than that, yeah, compared to a consulting business, it really gives us many flexibilities with planning our work day around the rest of our lives. That's really what we enjoy about a product company.Corey: I was very careful to pick an expensive problem that was only a business-hours problem. So, I don't wind up with a surprise, middle-of-the-night panic phone call. It's yeah, it turns out that AWS billing operate during business hours in the US Pacific Time. The end. And there are no emergencies here; there are simply curiosities that will, in the fullness of time take weeks to get resolved.Andreas: Mmm. Yeah.Corey: I spent too many years on call, in that sense. Everyone who's built a product company the first time always says the second time, the engineering? Meh, there are ways to solve that. Solving the distribution problem. That's the thing I want to focus on next.And I feel like I sort of went into this backwards in that I don't really have a product to sell people but I somehow built an audience. And to be honest, it's partly why. It's because I didn't know what I was going to be doing after 18 months and I knew that whatever it was going to be, I needed an audience to tell about it, so may as well start the work of building the audience now. So, I have to imagine if nothing else, your book has been a tremendous source of building a community. When I mentioned the word cloudonaut to people who have been learning AWS, more often than not, they know who you are.Andreas: Yeah.Corey: Although I admit they sometimes get you confused with your brother.Andreas: [laugh]. Yes, that's not too hard. Yeah, yeah, cloudonaut is definitely—this was always our, also a side project of we was just writing about things that we learned about AWS. Whenever we, I don't know, for example, looked into a new series, we wrote a blog post about that. Later, we did start a podcast and YouTube videos during the pandemic, of course, as everyone did. And so, I think this was always fun stuff to do. And we like sharing what we learn and getting into discussion with the community, so this is what we still do and enjoy as well, of course. Yeah.Corey: I really want to thank you for taking the time to catch up and see what you've been up to these last few years with a labor of love and the pivot to a product company. If people want to learn more, where's the best place for them to find you?Andreas: So definitely, the best place to find me is cloudonaut.io. So, this basically points you to all [laugh] what I do. Yeah, that's basically the one domain and URL that you need to know.Corey: Excellent. And we will put that in the show notes, of course. Thank you so much for taking the time to speak with me today. I really appreciate it.Andreas: Yeah, it was a pleasure to be back here. I'm big fan of podcasts and also of Screaming in the Cloud, of course, so it was a pleasure to be here again.Corey: [laugh]. You are always welcome. Andreas Wittig, co-author of Amazon Web Services in Action, now up to its third edition. And of course, the voice behind cloudonaut. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice along with an angry, insulting comment that I will at one point be able to get to just as soon as I find something to deal with your sarcasm on the AWS Marketplace.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.
In this episode we are joined by Tim Brown, Global Healthcare Customer Lead for AWS Marketplace and Ken Harris, Vertical Head, Academic Medicine for AWS to discuss the role of AMCs and their role in driving technology, innovation, and supporting the next generation of healthcare professionals. Tune in to hear how AWS's partnerships benefit AMCs and help launch them into the future.This episode is sponsored by AWS.
Welcome to Episode 127 of the Eye on AI podcast with host Craig Smith and guest Clemens Mewald. In this episode, we dive into the world of AI and its transformative impact on industries. Join us as we explore Instabase, the cutting-edge company led by our guest, a seasoned engineer with an impressive background at Google Brain, Databricks, and now Instabase. Discover how Instabase is revolutionizing automation and content capture across various organizations using AI-driven methods. Uncover the mission behind Instabase and delve into the intricate details of AI Hub, a groundbreaking marketplace for AI models and products. We explore the limitations of model repositories and marketplaces, particularly in large-scale applications. As we compare AI Hub with the AWS Marketplace, we touch upon the abundance of low-code app development solutions in the market, highlighting Accio's rich SaaS offerings and generative AI apps as an industry benchmark. No discussion about AI would be complete without delving into the potential of GPT-4, a powerful language model capable of accurately predicting task outcomes. Join us on this ride as we uncover the heart of AI, its revolutionary applications, and its transformative power across industries. (00:00) Preview (00:38) Introduction (01:22) Clemens background and Google Brain (02:44) Instabase and solving unstructured data problems (07:40) How Instabase works and different use cases (13:20) The long term vision of the AI Hub (17:12) Blockchain based marketplace for AI models (21:50) AWS Marketplace compared to Instabase (24:05) Generative AI and no code web apps (31:05) Biggest concerns of using Open AI for security (35:40) Considerations of use cases of GPT4 (40:00) LLMs acting as knowledge and reasoning engines (46:40) Using different AI models based on different tasks (51:00) Leveraging other AI models for compatibility (54:14) How to get people to start using Instabase Craig Smith Twitter: https://twitter.com/craigss Eye on A.I. Twitter: https://twitter.com/EyeOn_AI
Join hosts Emily and Dave as they sit down with Lindbergh Matillano, Director of Cloud Cost Optimization at Avalara. In this engaging episode, Lindbergh uncovers the untold secrets behind Avalara's remarkable success in achieving consistent year-over-year cost reductions. Tune in to discover how Avalara strategically embraced AWS Marketplace, resulting in significant time and monetary savings. Lindbergh also shares the inspiring journey of fostering a cost-conscious culture within the organization and how they harnessed the power of AWS services to drive innovation. Don't miss out on this eye-opening conversation that explores the intersection of efficiency, financial prudence, and groundbreaking technology. Avalara on Twitter: https://twitter.com/avalara Avalara on LinkedIn: https://www.linkedin.com/company/avalara Emily on Twitter: https://twitter.com/editingemily [PORTAL] Avalara Developer Portal: https://developer.avalara.com Subscribe: Spotify: https://open.spotify.com/show/7rQjgnBvuyr18K03tnEHBI Apple Podcasts: https://podcasts.apple.com/us/podcast/aws-developers-podcast/id1574162669 Stitcher: https://www.stitcher.com/show/1065378 Pandora: https://www.pandora.com/podcast/aws-developers-podcast/PC:1001065378 TuneIn: https://tunein.com/podcasts/Technology-Podcasts/AWS-Developers-Podcast-p1461814/ Amazon Music: https://music.amazon.com/podcasts/f8bf7630-2521-4b40-be90-c46a9222c159/aws-developers-podcast Google Podcasts: https://podcasts.google.com/feed/aHR0cHM6Ly9mZWVkcy5zb3VuZGNsb3VkLmNvbS91c2Vycy9zb3VuZGNsb3VkOnVzZXJzOjk5NDM2MzU0OS9zb3VuZHMucnNz RSS Feed: https://feeds.soundcloud.com/users/soundcloud:users:994363549/sounds.rss
Ava Labs, the company behind layer 1 blockchain Avalanche, is launching AvaCloud, a Web3 launchpad that helps businesses build no-code, fully managed blockchain ecosystems. They also partnered with Amazon Web Services (AWS) to implement new features intended to make running a node easier. The new features include one-click node deployment through the AWS Marketplace. Could we see Avalanche as the mechanism behind Amazon Prime Gaming for web3? Guest: John Nahas, VP of Business Development, Ava LabsAva Cloud ➜ https://bit.ly/AvaxCloud
In this episode we discuss innovation and digital transformation. Tim Brown, Global Healthcare Customer Lead for AWS Marketplace, shares his experiences on challenges and barriers in technology/service partnerships, tips on streamlining the selection process and much more. This episode is sponsored by AWS Marketplace
In this episode, we are joined by Tim Brown, Global Healthcare Customer Lead for AWS Marketplace and Joerg Schwarz, Senior Director Healthcare Interoperability Solutions and Strategy at Infor, to discuss why achieving interoperability in healthcare is important, its effect on innovation, and how technology can be of assistance.This episode is sponsored by AWS Marketplace.
The history of computing has been a story of moving up levels of abstraction: from hard-coding algorithms and directly manipulating memory addresses with assembly languages to using more natural language constructs in high-level general purpose languages to abstracting the hardware of the computer in cloud compute. Now serverless functions take that abstraction even further. We've made the algorithms that process data simple and natural; MongoDB wants to do the same for how we persist data. On this sponsored episode of the podcast, we chat with Andrew Davidson, SVP Products at MongoDB, about how they're turning a database into a fully-managed service that developers can use in a more natural way. Along the way, we discuss how the cost bottleneck has moved from the storage media to developers' minds, how greater abstractions can enable developers, and how to get insights from production data faster. Episode notesTry MongoDB Atlas on AWS for free.You can get started with MongoDB Atlas directly from the AWS Marketplace. If you're at a startup, you can take advantage of their special offer for startups. The community edition of their classic database is available to download as well. If you're looking to learn a thing or two before diving in, check out MongoDB University. Our thanks to Great Question badge winner Derek 朕會功夫 for asking How can I reverse an array in JavaScript without using libraries? You know the rarest kung fu of all: asking great questions.
On this episode of The Cloud Pod, the team talks about the possible replacement of CEO Sundar Pichai after Alphabet stock went up by just 1.9%, the new support feature of Amazon EKS for Kubernetes, three partner specializations just released by Google, and how clients have responded to the AI Powered Bing and Microsoft Edge. A big thanks to this week's sponsor, Foghorn Consulting, which provides full-stack cloud solutions with a focus on strategy, planning and execution for enterprises seeking to take advantage of the transformative capabilities of AWS, Google Cloud and Azure. This week's highlights
In this episode we are joined by Tim Brown, Global Healthcare Customer Lead for AWS Marketplace, and Matt Beyer, Strategic Alliance Manager at Hyland, to discuss what technologies are best suited for improving healthcare business operations, how to determine which solutions are right for your organization and more.This episode is sponsored by AWS Marketplace.
Tune in to improve your understanding of digital transformation and how it can be leveraged to best support patients and healthcare providers. We talk with Ryan Blackwell, the Leader of Cloud Operations at Tufts Medicine & Tim Brown, the Global Healthcare Customer Lead for AWS Marketplace, to find out the types of technology integrations today's CIOs are prioritizing, to better define "digital transformation" and much more.This episode is sponsored by AWS Marketplace.
About SamSam Nicholls: Veeam's Director of Public Cloud Product Marketing, with 10+ years of sales, alliance management and product marketing experience in IT. Sam has evolved from his on-premises storage days and is now laser-focused on spreading the word about cloud-native backup and recovery, packing in thousands of viewers on his webinars, blogs and webpages.Links Referenced: Veeam AWS Backup: https://www.veeam.com/aws-backup.html Veeam: https://veeam.com TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: This episode is sponsored in part by our friends at Chronosphere. Tired of observability costs going up every year without getting additional value? Or being locked in to a vendor due to proprietary data collection, querying and visualization? Modern day, containerized environments require a new kind of observability technology that accounts for the massive increase in scale and attendant cost of data. With Chronosphere, choose where and how your data is routed and stored, query it easily, and get better context and control. 100% open source compatibility means that no matter what your setup is, they can help. Learn how Chronosphere provides complete and real-time insight into ECS, EKS, and your microservices, whereever they may be at snark.cloud/chronosphere That's snark.cloud/chronosphere Corey: This episode is brought to us by our friends at Pinecone. They believe that all anyone really wants is to be understood, and that includes your users. AI models combined with the Pinecone vector database let your applications understand and act on what your users want… without making them spell it out. Make your search application find results by meaning instead of just keywords, your personalization system make picks based on relevance instead of just tags, and your security applications match threats by resemblance instead of just regular expressions. Pinecone provides the cloud infrastructure that makes this easy, fast, and scalable. Thanks to my friends at Pinecone for sponsoring this episode. Visit Pinecone.io to understand more.Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. This promoted guest episode is brought to us by and sponsored by our friends over at Veeam. And as a part of that, they have thrown one of their own to the proverbial lion. My guest today is Sam Nicholls, Director of Public Cloud over at Veeam. Sam, thank you for joining me.Sam: Hey. Thanks for having me, Corey, and thanks for everyone joining and listening in. I do know that I've been thrown into the lion's den, and I am [laugh] hopefully well-prepared to answer anything and everything that Corey throws my way. Fingers crossed. [laugh].Corey: I don't think there's too much room for criticizing here, to be direct. I mean, Veeam is a company that is solidly and thoroughly built around a problem that absolutely no one cares about. I mean, what could possibly be wrong with that? You do backups; which no one ever cares about. Restores, on the other hand, people care very much about restores. And that's when they learn, “Oh, I really should have cared about backups at any point prior to 20 minutes ago.”Sam: Yeah, it's a great point. It's kind of like taxes and insurance. It's almost like, you know, something that you have to do that you don't necessarily want to do, but when push comes to shove, and something's burning down, a file has been deleted, someone's made their way into your account and, you know, running a right mess within there, that's when you really, kind of, care about what you mentioned, which is the recovery piece, the speed of recovery, the reliability of recovery.Corey: It's been over a decade, and I'm still sore about losing my email archives from 2006 to 2009. There's no way to get it back. I ran my own mail server; it was an iPhone setting that said, “Oh, yeah, automatically delete everything in your trash folder—or archive folder—after 30 days.” It was just a weird default setting back in that era. I didn't realize it was doing that. Yeah, painful stuff.And we learned the hard way in some of these cases. Not that I really have much need for email from that era of my life, but every once in a while it still bugs me. Which gets speaks to the point that the people who are the most fanatical about backing things up are the people who have been burned by not having a backup. And I'm fortunate in that it wasn't someone else's data with which I had been entrusted that really cemented that lesson for me.Sam: Yeah, yeah. It's a good point. I could remember a few years ago, my wife migrated a very aging, polycarbonate white Mac to one of the shiny new aluminum ones and thought everything was good—Corey: As the white polycarbonate Mac becomes yellow, then yeah, all right, you know, it's time to replace it. Yeah. So yeah, so she wiped the drive, and what happened?Sam: That was her moment where she learned the value and importance of backup unless she backs everything up now. I fortunately have never gone through it. But I'm employed by a backup vendor and that's why I care about it. But it's incredibly important to have, of course.Corey: Oh, yes. My spouse has many wonderful qualities, but one that drives me slightly nuts is she's something of a digital packrat where her hard drives on her laptop will periodically fill up. And I used to take the approach of oh, you can be more efficient and do the rest. And I realized no, telling other people they're doing it wrong is generally poor practice, whereas just buying bigger drives is way easier. Let's go ahead and do that. It's small price to pay for domestic tranquility.And there's a lesson in that. We can map that almost perfectly to the corporate world where you folks tend to operate in. You're not doing home backup, last time I checked; you are doing public cloud backup. Actually, I should ask that. Where do you folks start and where do you stop?Sam: Yeah, no, it's a great question. You know, we started over 15 years ago when virtualization, specifically VMware vSphere, was really the up-and-coming thing, and, you know, a lot of folks were there trying to utilize agents to protect their vSphere instances, just like they were doing with physical Windows and Linux boxes. And, you know, it kind of got the job done, but was it the best way of doing it? No. And that's kind of why Veeam was pioneered; it was this agentless backup, image-based backup for vSphere.And, of course, you know, in the last 15 years, we've seen lots of transitions, of course, we're here at Screaming in the Cloud, with you, Corey, so AWS, as well as a number of other public cloud vendors we can help protect as well, as a number of SaaS applications like Microsoft 365, metadata and data within Salesforce. So, Veeam's really kind of come a long way from just virtual machines to really taking a global look at the entirety of modern environments, and how can we best protect each and every single one of those without trying to take a square peg and fit it in a round hole?Corey: It's a good question and a common one. We wind up with an awful lot of folks who are confused by the proliferation of data. And I'm one of them, let's be very clear here. It comes down to a problem where backups are a multifaceted, deep problem, and I don't think that people necessarily think of it that way. But I take a look at all of the different, even AWS services that I use for my various nonsense, and which ones can be used to store data?Well, all of them. Some of them, you have to hold it in a particularly wrong sort of way, but they all store data. And in various contexts, a lot of that data becomes very important. So, what service am I using, in which account am I using, and in what region am I using it, and you wind up with data sprawl, where it's a tremendous amount of data that you can generally only track down by looking at your bills at the end of the month. Okay, so what am I being charged, and for what service?That seems like a good place to start, but where is it getting backed up? How do you think about that? So, some people, I think, tend to ignore the problem, which we're seeing less and less, but other folks tend to go to the opposite extreme and we're just going to backup absolutely everything, and we're going to keep that data for the rest of our natural lives. It feels to me that there's probably an answer that is more appropriate somewhere nestled between those two extremes.Sam: Yeah, snapshot sprawl is a real thing, and it gets very, very expensive very, very quickly. You know, your snapshots of EC2 instances are stored on those attached EBS volumes. Five cents per gig per month doesn't sound like a lot, but when you're dealing with thousands of snapshots for thousands machines, it gets out of hand very, very quickly. And you don't know when to delete them. Like you say, folks are just retaining them forever and dealing with this unfortunate bill shock.So, you know, where to start is automating the lifecycle of a snapshot, right, from its creation—how often do we want to be creating them—from the retention—how long do we want to keep these for—and where do we want to keep them because there are other storage services outside of just EBS volumes. And then, of course, the ultimate: deletion. And that's important even from a compliance perspective as well, right? You've got to retain data for a specific number of years, I think healthcare is like seven years, but then you've—Corey: And then not a day more.Sam: Yeah, and then not a day more because that puts you out of compliance, too. So, policy-based automation is your friend and we see a number of folks building these policies out: gold, silver, bronze tiers based on criticality of data compliance and really just kind of letting the machine do the rest. And you can focus on not babysitting backup.Corey: What was it that led to the rise of snapshots? Because back in my very early days, there was no such thing. We wound up using a bunch of servers stuffed in a rack somewhere and virtualization was not really in play, so we had file systems on physical disks. And how do you back that up? Well, you have an agent of some sort that basically looks at all the files and according to some ruleset that it has, it copies them off somewhere else.It was slow, it was fraught, it had a whole bunch of logic that was pushed out to the very edge, and forget about restoring that data in a timely fashion or even validating a lot of those backups worked other than via checksum. And God help you if you had data that was constantly in the state of flux, where anything changing during the backup run would leave your backups in an inconsistent state. That on some level seems to have largely been solved by snapshots. But what's your take on it? You're a lot closer to this part of the world than I am.Sam: Yeah, snapshots, I think folks have turned to snapshots for the speed, the lack of impact that they have on production performance, and again, just the ease of accessibility. We have access to all different kinds of snapshots for EC2, RDS, EFS throughout the entirety of our AWS environment. So, I think the snapshots are kind of like the default go-to for folks. They can help deliver those very, very quick RPOs, especially in, for example, databases, like you were saying, that change very, very quickly and we all of a sudden are stranded with a crash-consistent backup or snapshot versus an application-consistent snapshot. And then they're also very, very quick to recover from.So, snapshots are very, very appealing, but they absolutely do have their limitations. And I think, you know, it's not a one or the other; it's that they've got to go hand-in-hand with something else. And typically, that is an image-based backup that is stored in a separate location to the snapshot because that snapshot is not independent of the disk that it is protecting.Corey: One of the challenges with snapshots is most of them are created in a copy-on-write sense. It takes basically an instant frozen point in time back—once upon a time when we ran MySQL databases on top of the NetApp Filer—which works surprisingly well—we would have a script that would automatically quiesce the database so that it would be in a consistent state, snapshot the file and then un-quiesce it, which took less than a second, start to finish. And that was awesome, but then you had this snapshot type of thing. It wasn't super portable, it needed to reference a previous snapshot in some cases, and AWS takes the same approach where the first snapshot it captures every block, then subsequent snapshots wind up only taking up as much size as there have been changes since the first snapshots. So, large quantities of data that generally don't get access to a whole lot have remarkably small, subsequent snapshot sizes.But that's not at all obvious from the outside, and looking at these things. They're not the most portable thing in the world. But it's definitely the direction that the industry has trended in. So, rather than having a cron job fire off an AWS API call to take snapshots of my volumes as a sort of the baseline approach that we all started with, what is the value proposition that you folks bring? And please don't say it's, “Well, cron jobs are hard and we have a friendlier interface for that.”Sam: [laugh]. I think it's really starting to look at the proliferation of those snapshots, understanding what they're good at, and what they are good for within your environment—as previously mentioned, low RPOs, low RTOs, how quickly can I take a backup, how frequently can I take a backup, and more importantly, how quickly can I restore—but then looking at their limitations. So, I mentioned that they were not independent of that disk, so that certainly does introduce a single point of failure as well as being not so secure. We've kind of touched on the cost component of that as well. So, what Veeam can come in and do is then take an image-based backup of those snapshots, right—so you've got your initial snapshot and then your incremental ones—we'll take the backup from that snapshot, and then we'll start to store that elsewhere.And that is likely going to be in a different account. We can look at the Well-Architected Framework, AWS deeming accounts as a security boundary, so having that cross-account function is critically important so you don't have that single point of failure. Locking down with IAM roles is also incredibly important so we haven't just got a big wide open door between the two. But that data is then stored in a separate account—potentially in a separate region, maybe in the same region—Amazon S3 storage. And S3 has the wonderful benefit of being still relatively performant, so we can have quick recoveries, but it is much, much cheaper. You're dealing with 2.3 cents per gig per month, instead of—Corey: To start, and it goes down from there with sizeable volumes.Sam: Absolutely, yeah. You can go down to S3 Glacier, where you're looking at, I forget how many points and zeros and nines it is, but it's fractions of a cent per gig per month, but it's going to take you a couple of days to recover that da—Corey: Even infrequent access cuts that in half.Sam: Oh yeah.Corey: And let's be clear, these are snapshot backups; you probably should not be accessing them on a consistent, sustained basis.Sam: Well, exactly. And this is where it's kind of almost like having your cake and eating it as well. Compliance or regulatory mandates or corporate mandates are saying you must keep this data for this length of time. Keeping that—you know, let's just say it's three years' worth of snapshots in an EBS volume is going to be incredibly expensive. What's the likelihood of you needing to recover something from two years—actually, even two months ago? It's very, very small.So, the performance part of S3 is, you don't need to take it as much into consideration. Can you recover? Yes. Is it going to take a little bit longer? Absolutely. But it's going to help you meet those retention requirements while keeping your backup bill low, avoiding that bill shock, right, spending tens and tens of thousands every single month on snapshots. This is what I mean by kind of having your cake and eating it.Corey: I somewhat recently have had a client where EBS snapshots are one of the driving costs behind their bill. It is one of their largest single line items. And I want to be very clear here because if one of those people who listen to this and thinking, “Well, hang on. Wait, they're telling stories about us, even though they're not naming us by name?” Yeah, there were three of you in the last quarter.So, at that point, it becomes clear it is not about something that one individual company has done and more about an overall driving trend. I am personalizing it a little bit by referring to as one company when there were three of you. This is a narrative device, not me breaking confidentiality. Disclaimer over. Now, when you talk to people about, “So, tell me why you've got 80 times more snapshots than you do EBS volumes?” The answer is as, “Well, we wanted to back things up and we needed to get hourly backups to a point, then daily backups, then monthly, and so on and so forth. And when this was set up, there wasn't a great way to do this natively and we don't always necessarily know what we need versus what we don't. And the cost of us backing this up, well, you can see it on the bill. The cost of us deleting too much and needing it as soon as we do? Well, that cost is almost incalculable. So, this is the safe way to go.” And they're not wrong in anything that they're saying. But the world has definitely evolved since then.Sam: Yeah, yeah. It's a really great point. Again, it just folds back into my whole having your cake and eating it conversation. Yes, you need to retain data; it gives you that kind of nice, warm, cozy feeling, it's a nice blanket on a winter's day that that data, irrespective of what happens, you're going to have something to recover from. But the question is does that need to be living on an EBS volume as a snapshot? Why can't it be living on much, much more cost-effective storage that's going to give you the warm and fuzzies, but is going to make your finance team much, much happier [laugh].Corey: One of the inherent challenges I think people have is that snapshots by themselves are almost worthless, in that I have an EBS snapshot, it is sitting there now, it's costing me an undetermined amount of money because it's not exactly clear on a per snapshot basis exactly how large it is, and okay, great. Well, I'm looking for a file that was not modified since X date, as it was on this time. Well, great, you're going to have to take that snapshot, restore it to a volume and then go exploring by hand. Oh, it was the wrong one. Great. Try it again, with a different one.And after, like, the fifth or six in a row, you start doing a binary search approach on this thing. But it's expensive, it's time-consuming, it takes forever, and it's not a fun user experience at all. Part of the problem is it seems that historically, backup systems have no context or no contextual awareness whatsoever around what is actually contained within that backup.Sam: Yeah, yeah. I mean, you kind of highlighted two of the steps. It's more like a ten-step process to do, you know, granular file or folder-level recovery from a snapshot, right? You've got to, like you say, you've got to determine the point in time when that, you know, you knew the last time that it was around, then you're going to have to determine the volume size, the region, the OS, you're going to have to create an EBS volume of the same size, region, from that snapshot, create the EC2 instance with the same OS, connect the two together, boot the EC2 instance, mount the volume search for the files to restore, download them manually, at which point you have your file back. It's not back in the machine where it was, it's now been downloaded locally to whatever machine you're accessing that from. And then you got to tear it all down.And that is again, like you say, predicated on the fact that you knew exactly that that was the right time. It might not be and then you have to start from scratch from a different point in time. So, backup tooling from backup vendors that have been doing this for many, many years, knew about this problem long, long ago, and really seek to not only automate the entirety of that process but make the whole e-discovery, the search, the location of those files, much, much easier. I don't necessarily want to do a vendor pitch, but I will say with Veeam, we have explorer-like functionality, whereby it's just a simple web browser. Once that machine is all spun up again, automatic process, you can just search for your individual file, folder, locate it, you can download it locally, you can inject it back into the instance where it was through Amazon Kinesis or AWS Kinesis—I forget the right terminology for it; some of its AWS, some of its Amazon.But by-the-by, the whole recovery process, especially from a file or folder level, is much more pain-free, but also much faster. And that's ultimately what people care about how reliable is my backup? How quickly can I get stuff online? Because the time that I'm down is costing me an indescribable amount of time or money.Corey: This episode is sponsored in part by our friends at Redis, the company behind the incredibly popular open source database. If you're tired of managing open source Redis on your own, or if you are looking to go beyond just caching and unlocking your data's full potential, these folks have you covered. Redis Enterprise is the go-to managed Redis service that allows you to reimagine how your geo-distributed applications process, deliver, and store data. To learn more from the experts in Redis how to be real-time, right now, from anywhere, visit redis.com/duckbill. That's R - E - D - I - S dot com slash duckbill.Corey: Right, the idea of RPO versus RTO: recovery point objective and recovery time objective. With an RPO, it's great, disaster strikes right now, how long is acceptable to it have been since the last time we backed up data to a restorable point? Sometimes it's measured in minutes, sometimes it's measured in fractions of a second. It really depends on what we're talking about. Payments databases, that needs to be—the RPO is basically an asymptotically approaches zero.The RTO is okay, how long is acceptable before we have that data restored and are back up and running? And that is almost always a longer time, but not always. And there's a different series of trade-offs that go into that. But both of those also presuppose that you've already dealt with the existential question of is it possible for us to recover this data. And that's where I know that you are obviously—you have a position on this that is informed by where you work, but I don't, and I will call this out as what I see in the industry: AWS backup is compelling to me except for one fatal flaw that it has, and that is it starts and stops with AWS.I am not a proponent of multi-cloud. Lord knows I've gotten flack for that position a bunch of times, but the one area where it makes absolute sense to me is backups. Have your data in a rehydrate-the-business level state backed up somewhere that is not your primary cloud provider because you're otherwise single point of failure-ing through a company, through the payment instrument you have on file with that company, in the blast radius of someone who can successfully impersonate you to that vendor. There has to be a gap of some sort for the truly business-critical data. Yes, egress to other providers is expensive, but you know what also is expensive? Irrevocably losing the data that powers your business. Is it likely? No, but I would much rather do it than have to justify why I'm not doing it.Sam: Yeah. Wasn't likely that I was going to win that 2 billion or 2.1 billion on the Powerball, but [laugh] I still play [laugh]. But I understand your standpoint on multi-cloud and I read your newsletters and understand where you're coming from, but I think the reality is that we do live in at least a hybrid cloud world, if not multi-cloud. The number of organizations that are sole-sourced on a single cloud and nothing else is relatively small, single-digit percentage. It's around 80-some percent that are hybrid, and the remainder of them are your favorite: multi-cloud.But again, having something that is one hundred percent sole-source on a single platform or a single vendor does expose you to a certain degree of risk. So, having the ability to do cross-platform backups, recoveries, migrations, for whatever reason, right, because it might not just be a disaster like you'd mentioned, it might also just be… I don't know, the company has been taken over and all of a sudden, the preference is now towards another cloud provider and I want you to refactor and re-architect everything for this other cloud provider. If all that data is locked into one platform, that's going to make your job very, very difficult. So, we mentioned at the beginning of the call, Veeam is capable of protecting a vast number of heterogeneous workloads on different platforms, in different environments, on-premises, in multiple different clouds, but the other key piece is that we always use the same backup file format. And why that's key is because it enables portability.If I have backups of EC2 instances that are stored in S3, I could copy those onto on-premises disk, I could copy those into Azure, I could do the same with my Azure VMs and store those on S3, or again, on-premises disk, and any other endless combination that goes with that. And it's really kind of centered around, like control and ownership of your data. We are not prescriptive by any means. Like, you do what is best for your organization. We just want to provide you with the toolset that enables you to do that without steering you one direction or the other with fee structures, disparate feature sets, whatever it might be.Corey: One of the big challenges that I keep seeing across the board is just a lack of awareness of what the data that matters is, where you see people backing up endless fleets of web server instances that are auto-scaled into existence and then removed, but you can create those things at will; why do you care about the actual data that's on these things? It winds up almost at the library management problem, on some level. And in that scenario, snapshots are almost certainly the wrong answer. One thing that I saw previously that really changed my way of thinking about this was back many years ago when I was working at a startup that had just started using GitHub and they were paying for a third-party service that wound up backing up Git repos. Today, that makes a lot more sense because you have a bunch of other stuff on GitHub that goes well beyond the stuff contained within Git, but at the time, it was silly. It was, why do that? Every Git clone is a full copy of the entire repository history. Just grab it off some developer's laptop somewhere.It's like, “Really? You want to bet the company, slash your job, slash everyone else's job on that being feasible and doable or do you want to spend the 39 bucks a month or whatever it was to wind up getting that out the door now so we don't have to think about it, and they validate that it works?” And that was really a shift in my way of thinking because, yeah, backing up things can get expensive when you have multiple copies of the data living in different places, but what's really expensive is not having a company anymore.Sam: Yeah, yeah, absolutely. We can tie it back to my insurance dynamic earlier where, you know, it's something that you know that you have to have, but you don't necessarily want to pay for it. Well, you know, just like with insurances, there's multiple different ways to go about recovering your data and it's only in crunch time, do you really care about what it is that you've been paying for, right, when it comes to backup?Could you get your backup through a git clone? Absolutely. Could you get your data back—how long is that going to take you? How painful is that going to be? What's going to be the impact to the business where you're trying to figure that out versus, like you say, the 39 bucks a month, a year, or whatever it might be to have something purpose-built for that, that is going to make the recovery process as quick and painless as possible and just get things back up online.Corey: I am not a big fan of the fear, uncertainty, and doubt approach, but I do practice what I preach here in that yeah, there is a real fear against data loss. It's not, “People are coming to get you, so you absolutely have to buy whatever it is I'm selling,” but it is something you absolutely have to think about. My core consulting proposition is that I optimize the AWS bill. And sometimes that means spending more. Okay, that one S3 bucket is extremely important to you and you say you can't sustain the loss of it ever so one zone is not an option. Where is it being backed up? Oh, it's not? Yeah, I suggest you spend more money and back that thing up if it's as irreplaceable as you say. It's about doing the right thing.Sam: Yeah, yeah, it's interesting, and it's going to be hard for you to prove the value of doing that when you are driving their bill up when you're trying to bring it down. But again, you have to look at something that's not itemized on that bill, which is going to be the impact of downtime. I'm not going to pretend to try and recall the exact figures because it also varies depending on your business, your industry, the size, but the impact of downtime is massive financially. Tens of thousands of dollars for small organizations per hour, millions and millions of dollars per hour for much larger organizations. The backup component of that is relatively small in comparison, so having something that is purpose-built, and is going to protect your data and help mitigate that impact of downtime.Because that's ultimately what you're trying to protect against. It is the recovery piece that you're buying is the most important piece. And like you, I would say, at least be cognizant of it and evaluate your options and what can you live with and what can you live without.Corey: That's the big burning question that I think a lot of people do not have a good answer to. And when you don't have an answer, you either backup everything or nothing. And I'm not a big fan of doing either of those things blindly.Sam: Yeah, absolutely. And I think this is why we see varying different backup options as well, you know? You're not going to try and apply the same data protection policies each and every single workload within your environment because they've all got different types of workload criticality. And like you say, some of them might not even need to be backed up at all, just because they don't have data that needs to be protected. So, you need something that is going to be able to be flexible enough to apply across the entirety of your environment, protect it with the right policy, in terms of how frequently do you protect it, where do you store it, how often, or when are you eventually going to delete that and apply that on a workload by workload basis. And this is where the joy of things like tags come into play as well.Corey: One last thing I want to bring up is that I'm a big fan of watching for companies saying the quiet part out loud. And one area in which they do this—because they're forced to by brevity—is in the title tag of their website. I pull up veeam.com and I hover over the tab in my browser, and it says, “Veeam Software: Modern Data Protection.”And I want to call that out because you're not framing it as explicitly backup. So, the last topic I want to get into is the idea of security. Because I think it is not fully appreciated on a lived-experience basis—although people will of course agree to this when they're having ivory tower whiteboard discussions—that every place your data lives is a potential for a security breach to happen. So, you want to have your data living in a bunch of places ideally, for backup and resiliency purposes. But you also want it to be completely unworkable or illegible to anyone who is not authorized to have access to it.How do you balance those trade-offs yourself given that what you're fundamentally saying is, “Trust us with your Holy of Holies when it comes to things that power your entire business?” I mean, I can barely get some companies to agree to show me their AWS bill, let alone this is the data that contains all of this stuff to destroy our company.Sam: Yeah. Yeah, it's a great question. Before I explicitly answer that piece, I will just go to say that modern data protection does absolutely have a security component to it, and I think that backup absolutely needs to be a—I'm going to say this an air quotes—a “first class citizen” of any security strategy. I think when people think about security, their mind goes to the preventative, like how do we keep these bad people out?This is going to be a bit of the FUD that you love, but ultimately, the bad guys on the outside have an infinite number of attempts to get into your environment and only have to be right once to get in and start wreaking havoc. You on the other hand, as the good guy with your cape and whatnot, you have got to be right each and every single one of those times. And we as humans are fallible, right? None of us are perfect, and it's incredibly difficult to defend against these ever-evolving, more complex attacks. So backup, if someone does get in, having a clean, verifiable, recoverable backup, is really going to be the only thing that is going to save your organization, should that actually happen.And what's key to a secure backup? I would say separation, isolation of backup data from the production data, I would say utilizing things like immutability, so in AWS, we've got Amazon S3 object lock, so it's that write once, read many state for whatever retention period that you put on it. So, the data that they're seeking to encrypt, whether it's in production or in their backup, they cannot encrypt it. And then the other piece that I think is becoming more and more into play, and it's almost table stakes is encryption, right? And we can utilize things like AWS KMS for that encryption.But that's there to help defend against the exfiltration attempts. Because these bad guys are realizing, “Hey, people aren't paying me my ransom because they're just recovering from a clean backup, so now I'm going to take that backup data, I'm going to leak the personally identifiable information, trade secrets, or whatever on the internet, and that's going to put them in breach compliance and give them a hefty fine that way unless they pay me my ransom.” So encryption, so they can't read that data. So, not only can they not change it, but they can't read it is equally important. So, I would say those are the three big things for me on what's needed for backup to make sure it is clean and recoverable.Corey: I think that is one of those areas where people need to put additional levels of thought in. I think that if you have access to the production environment and have full administrative rights throughout it, you should definitionally not—at least with that account and ideally not you at all personally—have access to alter the backups. Full stop. I would say, on some level, there should not be the ability to alter backups for some particular workloads, the idea being that if you get hit with a ransomware infection, it's pretty bad, let's be clear, but if you can get all of your data back, it's more of an annoyance than it is, again, the existential business crisis that becomes something that redefines you as a company if you still are a company.Sam: Yeah. Yeah, I mean, we can turn to a number of organizations. Code Spaces always springs to mind for me, I love Code Spaces. It was kind of one of those precursors to—Corey: It's amazing.Sam: Yeah, but they were running on AWS and they had everything, production and backups, all stored in one account. Got into the account. “We're going to delete your data if you don't pay us this ransom.” They were like, “Well, we're not paying you the ransoms. We got backups.” Well, they deleted those, too. And, you know, unfortunately, Code Spaces isn't around anymore. But it really kind of goes to show just the importance of at least logically separating your data across different accounts and not having that god-like access to absolutely everything.Corey: Yeah, when you talked about Code Spaces, I was in [unintelligible 00:32:29] talking about GitHub Codespaces specifically, where they have their developer workstations in the cloud. They're still very much around, at least last time I saw unless you know something I don't.Sam: Precursor to that. I can send you the link—Corey: Oh oh—Sam: You can share it with the listeners.Corey: Oh, yes, please do. I'd love to see that.Sam: Yeah. Yeah, absolutely.Corey: And it's been a long and strange time in this industry. Speaking of links for the show notes, I appreciate you're spending so much time with me. Where can people go to learn more?Sam: Yeah, absolutely. I think veeam.com is kind of the first place that people gravitate towards. Me personally, I'm kind of like a hands-on learning kind of guy, so we always make free product available.And then you can find that on the AWS Marketplace. Simply search ‘Veeam' through there. A number of free products; we don't put time limits on it, we don't put feature limitations. You can backup ten instances, including your VPCs, which we actually didn't talk about today, but I do think is important. But I won't waste any more time on that.Corey: Oh, configuration of these things is critically important. If you don't know how everything was structured and built out, you're basically trying to re-architect from first principles based upon archaeology.Sam: Yeah [laugh], that's a real pain. So, we can help protect those VPCs and we actually don't put any limitations on the number of VPCs that you can protect; it's always free. So, if you're going to use it for anything, use it for that. But hands-on, marketplace, if you want more documentation, want to learn more, want to speak to someone veeam.com is the place to go.Corey: And we will, of course, include that in the show notes. Thank you so much for taking so much time to speak with me today. It's appreciated.Sam: Thank you, Corey, and thanks for all the listeners tuning in today.Corey: Sam Nicholls, Director of Public Cloud at Veeam. I'm Cloud Economist Corey Quinn and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice, along with an angry insulting comment that takes you two hours to type out but then you lose it because you forgot to back it up.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.Announcer: This has been a HumblePod production. Stay humble.
About BrianBrian is an accomplished dealmaker with experience ranging from developer platforms to mobile services. Before InfluxData, Brian led business development at Twilio. Joining at just thirty-five employees, he built over 150 partnerships globally from the company's infancy through its IPO in 2016. He led the company's international expansion, hiring its first teams in Europe, Asia, and Latin America. Prior to Twilio Brian was VP of Business Development at Clearwire and held management roles at Amp'd Mobile, Kivera, and PlaceWare.Links Referenced:InfluxData: https://www.influxdata.com/ TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: This episode is bought to you in part by our friends at Veeam. Do you care about backups? Of course you don't. Nobody cares about backups. Stop lying to yourselves! You care about restores, usually right after you didn't care enough about backups. If you're tired of the vulnerabilities, costs and slow recoveries when using snapshots to restore your data, assuming you even have them at all living in AWS-land, there is an alternative for you. Check out Veeam, thats V-E-E-A-M for secure, zero-fuss AWS backup that won't leave you high and dry when it's time to restore. Stop taking chances with your data. Talk to Veeam. My thanks to them for sponsoring this ridiculous podcast.Corey: This episode is brought to us by our friends at Pinecone. They believe that all anyone really wants is to be understood, and that includes your users. AI models combined with the Pinecone vector database let your applications understand and act on what your users want… without making them spell it out.Make your search application find results by meaning instead of just keywords, your personalization system make picks based on relevance instead of just tags, and your security applications match threats by resemblance instead of just regular expressions. Pinecone provides the cloud infrastructure that makes this easy, fast, and scalable. Thanks to my friends at Pinecone for sponsoring this episode. Visit Pinecone.io to understand more.Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. It's been a year, which means it's once again time to have a promoted guest episode brought to us by our friends at InfluxData. Joining me for a second time is Brian Mullen, CMO over at InfluxData. Brian, thank you for agreeing to do this a second time. You're braver than most.Brian: Thanks, Corey. I'm happy to be here. Second time is the charm.Corey: So, it's been an interesting year to put it mildly and I tend to have the attention span of a goldfish of most days, so for those who are similarly flighty, let's start at the very top. What is an InfluxDB slash InfluxData slash Influx—when you're not sure which one to use, just shorten it and call it good—and why might someone need it?Brian: Sure. So, InfluxDB is what most people understand our product as, a pretty popular open-source product, been out for quite a while. And then our company, InfluxData is the company behind InfluxDB. And InfluxDB is where developers build IoT real-time analytics and cloud applications, typically all based on time series. It's a time-series data platform specifically built to handle time-series data, which we think about is any type of data that is stamped in time in some way.It could be metrics, like, taken every one second, every two seconds, every three seconds, or some kind of event that occurs and is stamped in time in some way. So, our product and platform is really specialized to handle that technical problem.Corey: When last we spoke, I contextualized that in the realm of an IoT sensor that winds up reporting its device ID and its temperature at a given timestamp. That is sort of baseline stuff that I think aligns with what we're talking about. But over the past year, I started to see it in a bit of a different light, specifically viewing logs as time-series data, which hadn't occurred to me until relatively recently. And it makes perfect sense, on some level. It's weird to contextualize what Influx does as being a logging database, but there's absolutely no reason it couldn't be.Brian: Yeah, it certainly could. So typically, we see the world of time-series data in kind of two big realms. One is, as you mentioned the, you know, think of it as the hardware or, you know, physical realm: devices and sensors, these are things that are going to show up in a connected car, in a factory deployment, in renewable energy, you know, wind farm. And those are real devices and pieces of hardware that are out in the physical world, collecting data and emitting, you know, time-series every one second, or five seconds, or ten minutes, or whatever it might be.But it also, as you mentioned, applies to, call it the virtual world, which is really all of the software and infrastructure that is being stood up to run applications and services. And so, in that world, it could be the same—it's just a different type of source, but is really kind of the same technical problem. It's still time-series data being stamped, you know, data being stamped every, you know, one second, every five seconds, in some cases, every millisecond, but it is coming from a source that is actually in the infrastructure. Could be, you know, virtual machines, it could be containers, it could be microservices running within those containers. And so, all of those things together, both in the physical world and this infrastructure world are all emitting time-series data.Corey: When you take a look at the broader ecosystem, what is it that you see that has been the most misunderstood about time-series data as a whole? For example, when I saw AWS talking about a lot of things that they did in the realm of for your data lake, I talked to clients of mine about this and their response is, “Well, that'd be great genius, if we had a data lake.” It's, “What do you think those petabytes of nonsense in S3 are?” “Oh, those are the logs and the assets and a bunch of other nonsense.” “Yeah, that's what other people are calling a data lake.” “Oh.” Do you see similar lights-go-on moment when you talk to clients and prospective clients about what it is that they're doing that they just hadn't considered to be time-series data previously?Brian: Yeah. In fact, that's exactly what we see with many of our customers is they didn't realize that all of a sudden, they are now handling a pretty sizable time-series workload. And if you kind of take a step back and look at a couple of pretty obvious but sometimes unrecognized trends in technology, the first is cloud applications in general are expanding, they're both—horizontally and vertically. So, that means, like, the workloads that are being run in the Netflix's of the world, or all the different infrastructure that's being spun up in the cloud to run these various, you know, applications and services, those workloads are getting bigger and bigger, those companies and their subscriber bases, and the amount of data they're generating is getting bigger and bigger. They're also expanding horizontally by region and geography.So Netflix, for example, running not just in the US, but in every continent and probably every cloud region around the world. So, that's happening in the cloud world, and then also, in the IoT world, there's this massive growth of connected devices, both net-new devices that are being developed kind of, you know, the next Peloton or the next climate control unit that goes in an apartment or house, and also these longtime legacy devices that are been on the factory floor for a couple of decades, but now are being kind of modernized and coming online. So, if you look at all of that growth of the data sources now being built up in the cloud and you look at all that growth of these connected devices, both new and existing, that are kind of coming online, there's a huge now exponential growth in the sources of data. And all of these sources are emitting time-series data. You can just think about a connected car—not even a self-driving car, just a connected car, your everyday, kind of, 2022 model, and nearly every element of the car is emitting time-series data: its engine components, you know, your tires, like, what the climate inside of the car is, statuses of the engine itself, and it's all doing that in real-time, so every one second, every five seconds, whatever.So, I think in general, people just don't realize they're already dealing with a substantial workload of time series. And in most cases, unless they're using something like Influx, they're probably not, you know, especially tuned to handle it from a technology perspective.Corey: So, it's been a year. What has changed over on your side of the world since the last time we spoke? It seems that well, things continue and they're up and to the right. Well, sure, generally speaking, you're clearly still in business. Good job, always appreciative of your custom, as well as the fact that oh, good, even in a world where it seems like there's a macro recession in progress, that there are still companies out there that continue to persist and in some cases, dare I say, even thrive? What have you folks been up to?Brian: Yeah, it's been a big year. So first, we've seen quite a bit of expansion across the use cases. So, we've seen even further expansion in IoT, kind of expanding into consumer, industrial, and now sustainability and clean energy, and that pairs with what we've seen on FinTech and cryptocurrency, gaming and entertainment applications, network telemetry, including some of the biggest names in telecom, and then a little bit more on the cloud side with cloud services, infrastructure, and dev tools and APIs. So, quite a bit more broad set of use cases we're now seeing across the platform. And the second thing is—you might have seen it in the last month or so—is a pretty big announcement we had of our new storage engine.So, this was just announced earlier this month in November and was previously introduced to our community as what we call an IOx, which is how it was known in the open-source. And think of this really as a rebuilt and reimagined storage engine which is built on that open-source project, InfluxDB IOx that allows us to deliver faster queries, and now—pretty exciting for the first time—unlimited time-series, or cardinality as it's known in the space. And then also we introduced SQL for writing queries and BI tool support. And this is, for the first time we're introducing SQL, which is world's most popular data programming language to our platform, enabling developers to query via the API our language Flux, and InfluxQL in addition.Corey: A long time ago, it really seems that the cloud took a vote, for lack of a better term, and decided that when it comes to storage, object store is the way forward. It was a bit of a reimagining from how we all considered using storage previously, but the economics are at minimum of ten to one in favor of objects store, the latency is far better, the durability is off the charts better, you don't have to deal—at least in AWS-land—with the concept of availability zones and the rest, just from an economic and performance perspective, provided the use case embraces it, there's really no substitute.Brian: Yeah, I mean, the way we think about storage is, you know, obviously, it varies quite a bit from customer to customer with our use cases. Especially in IoT, we see some use cases where customers want to have data around for months and in some cases, years. So, it's a pretty substantial data set you're often looking at. And sometimes those customers want to downsample those, they don't necessarily need every single piece of minutia that they may need in real-time, but not in summary, looking backward. So, you really—we're in this kind of world where we're dealing with both hive fidelity—usually in the moment—data and lower fidelity, when people can downsample and have a little bit more of a summarized view of what happened.So, pretty unique for us and we have to kind of design the product in a way that is able to balance both of those because that's what, you know, the customer use cases demand. It's a super hard problem to solve. One of the reasons that you have a product like InfluxDB, which is specialized to handle this kind of thing, is so that you can actually manage that balance in your application service and setting your retention policy, et cetera.Corey: That's always been something that seemed a little on the odd side to me when I'm looking at a variety of different observability tools, where it seems that one of the key dimensions that they all tend to, I guess, operate on and price on is retention period. And I get it; you might not necessarily want to have your load balancer logs from 2012 readily available and paying for the privilege, but it does seem that given the dramatic fall of archival storage pricing, on some level, people do want to be able to retain that data just on the off chance that will be useful. Maybe that's my internal digital packrat chiming in at this point, but I do believe strongly that there is a correlation between how recent the data is and how useful it is, for a variety of different use cases. But that's also not a global truth. How do you view the divide? And what do you actually see people saying they want versus what they're actually using?Brian: It's a really good question and not a simple problem to solve. So, first of all, I would say it probably really depends on the use case and the extent to which that use case is touching real world applications and services. So, in a pure observability setting where you're looking at, perhaps more of a, kind of, operational view of infrastructure monitoring, you want to understand kind of what happened and when those tend to be a little bit more focused on real-time and recent. So, for example, you of course, want to know exactly what's happening in the moment, zero in on whatever anomaly and kind of surrounding data there is, perhaps that means you're digging into something that happened in you know, fairly recent time. So, those do tend to be, not all of them, but they do tend to be a little bit more real-time and recent-oriented.I think it's a little bit different when we look at IoT. Those generally tend to be longer timeframes that people are dealing with. Their physical out-in-the-field devices, you know, many times those devices are kind of coming online and offline, depending on the connectivity, depending on the environment, you can imagine a connected smart agriculture setup, I mean, those are a pretty wide array of devices out and in, you know, who knows what kind of climate and environment, so they tend to be a little bit longer in retention policy, kind of, being able to dig into the data, what's happening. The time frame that people are dealing with is just, in general, much longer in some of those situations.Corey: One story that I've heard a fair bit about observability data and event data is that they inevitably compose down into metrics rather than events or traces or logs, and I have a hard time getting there because I can definitely see a bunch of log entries showing the web servers return codes, okay, here's the number of 500 errors and number of different types of successes that we wind up seeing in the app. Yeah, all right, how many per minute, per second, per hour, whatever it is that makes sense that you can look at aberrations there. But in the development process at least, I find that having detailed log messages tell me about things I didn't see and need to understand or to continue building the dumb thing that I'm in the process of putting out. It feels like once something is productionalized and running, that its behavior is a lot more well understood, and at that point, metrics really seem to take over. How do you see it, given that you fundamentally live at that intersection where one can become the other?Brian: Yeah, we are right at that intersection and our answer probably would be both. Metrics are super important to understand and have that regular cadence and be kind of measuring that state over time, but you can miss things depending on how frequent those metrics are coming in. And increasingly, when you have the amount of data that you're dealing with coming from these various sources, the measurement is getting smaller and smaller. So, unless you have, you know, perfect metrics coming in every half-second, or you know, in some sub-partition of that, in milliseconds, you're likely to miss something. And so, events are really key to understand those things that pop up and then maybe come back down and in a pure metric setting, in your regular interval, you would have just completely missed. So, we see most of our use cases that are showing a balance of the two is kind of the most effective. And from a product perspective, that's how we think about solving the problem, addressing both.Corey: One of the things that I struggled with is it seems that—again, my approach to this is relatively outmoded. I was a systems administrator back when that title was not considered disparaging by a good portion of the technical community the way that it is today. Even though the job is the same, we call them something different now. Great. Okay, whatever smile, nod, and accept the larger paycheck.But my way of thinking about things are okay, you have the logs, they live on the server itself. And maybe if you want to be fancy, you wind up putting them to a centralized rsyslog cluster or whatnot. Yes, you might send them as well to some other processing system for visibility or a third-party monitoring system, but the canonical truth slash source of logs tends to live locally. That said, I got out of running production infrastructure before this idea of ephemeral containers or serverless functions really became a thing. Do you find that these days you are the source of truth slash custodian of record for these log entries, or do you find that you are more of a secondary source for better visibility and analysis, but not what they're going to bust out when the auditor comes calling in three years?Brian: I think, again, it—[laugh] I feel like I'm answering the same way [crosstalk 00:15:53]Corey: Yeah, oh, and of course, let's be clear, use cases are going to vary wildly. This is not advice on anyone's approach to compliance and the rest [laugh]. I don't want to get myself in trouble here.Brian: Exactly. Well, you know, we kind of think about it in terms of profiles. And we see a couple of different profiles of customers using InfluxDB. So, the first is, and this was kind of what we saw most often early on, still see quite a bit of them is kind of more of that operator profile. And these are folks who are going to—they're building some sort of monitor, kind of, source of truth for—that's internally facing to monitor applications or services, perhaps that other teams within their company built.And so that's, kind of like, a little bit more of your kind of pure operator. Yes, they're building up in the stack themselves, but it's to pay attention to essentially something that another team built. And then what we've seen more recently, especially as we've moved more prominently into the cloud and offered a usage-based service with a, you know, APIs and endpoint people can hit, as we see more people come into it from a builder's perspective. And similar in some ways, except that they're still building kind of a, you know, a source of truth for handling this kind of data. But they're also building the applications and services themselves are taken out to market that are in the hands of customers.And so, it's a little bit different mindset. Typically, there's, you know, a little bit more comfort with using one of many services to kind of, you know, be part of the thing that they're building. And so, we've seen a little bit more comfort from that type of profile, using our service running in the cloud, using the API, and not worrying too much about the kind of, you know, underlying setup of the implementation.Corey: Love how serverless helps you scale big and ship fast, but hate debugging your serverless apps? With Lumigo's serverless observability, it's fast and easy (and maybe a little fun, too). End-to-end distributed tracing gives developers full clarity into their most complex serverless and containerized applications, connecting every service from AWS Lambda and Amazon ECS to DynamoDB, API Gateways, Step Functions and more. Try Lumigo free and debug 3x faster, reduce error rate and speed up development. Visit snark.cloud/lumigo That's snark.cloud/L-U-M-I-G-OCorey: So, I've been on record a lot saying that the best database is TXT records stuffed into Route 53, which works super well as a gag, let's be clear, don't actually build something on top of this, that's a disaster waiting to happen. I don't want to destroy anyone's career as I do this. But you do have a much more viable competitive threat on the landscape. And that is quite simply using the open-source version of InfluxDB. What is the tipping point where, “Huh, I can run this myself,” turns into, “But I shouldn't. I should instead give money to other people to run it for me.”Because having been an engineer, where I believe I'm the world's greatest everything, when it comes to my environment—a fact provably untrue, but that hubris never quite goes away entirely—at what point am I basically being negligent not to start dealing with you in a more formalized business context?Brian: First of all, let me say that we have many customers, many developers out there who are running open-source and it works perfectly for them. The workload is just right, the deployment makes sense. And so, there are many production workloads we're using open-source. But typically, the kind of big turning point for people is on scale, scale, and overall performance related to that. And so, that's typically when they come and look at one of the two commercial offers.So, to start, open-source is a great place to, you know, kind of begin the journey, check it out, do that level of experimentation and kind of proof of concept. We also have 60,000-plus developers using our introductory cloud service, which is a free service. You simply sign up and can begin immediately putting data into the platform and building queries, and you don't have to worry about any of the setup and running servers to deploy software. So, both of those, the open-source and our cloud product are excellent ways to get started. And then when it comes time to really think about building in production and moving up in scale, we have our two commercial offers.And the first of those is InfluxDB Cloud, which is our cloud-native fully managed by InfluxData offering. We run this not only in AWS but also in Google Cloud and Microsoft Azure. It's a usage-based service, which means you pay exactly for what you use, and the three components that people pay for our data in, number of queries, and the amount of data you store in storage. We also for those who are interested in actually managing it themselves, we have InfluxDB Enterprise, which is a software subscription-base model, and it is self-managed by the customer in their environment. Now, that environment could be their own private cloud, it also could be on-premises in their own data center.And so, lots of fun people who are a little bit more oriented to kind of manage software themselves rather than using a service gear toward that. But both those commercial offers InfluxDB Cloud and InfluxDB Enterprise are really designed for, you know, massive scale. In the case of Cloud, I mentioned earlier with the new storage engine, you can hit unlimited cardinality, which means you have no limit on the number of time series you can put into the platform, which is a pretty big game-changing concept. And so, that means however many time-series sources you have and however many series they're emitting, you can run that without a problem without any sort of upper limit in our cloud product. Over on the enterprise side with our self-managed product, that means you can deploy a cluster of whatever size you want. It could be a two-by-four, it could be a four-by-eight, or something even larger. And so, it gives people that are managing in their own private cloud or in a data center environment, really their own options to kind of construct exactly what they need for their particular use case.Corey: Does your object storage layer make it easier to dynamically change clusters on the fly? I mean, historically, running things in a pre-provisioned cluster with EBS volumes or local disk was, “Oh, great. You want to resize something? Well, we're going to be either taking an outage or we're going to be building up something, migrating data live, and there's going to be a knife-switch cutover at some point that makes things relatively unfortunate.” It seems that once you abstract the storage layer away from anything resembling an instance that you would be able to get away from some of those architectural constraints.Brian: Yeah, that's really the promise, and what is delivered in our cloud product is that you no longer, as a developer, have to think about that if you're using that product. You don't have to think about how big the cluster is going to be, you don't have to think about these kind of disaster scenarios. It is all kind of pre-architected in the service. And so, the things that we really want to deliver to people, in addition to the elimination of that concern for what the underlying infrastructure looks like and how its operating. And so, with infrastructure concerns kind of out of the way, what we want to deliver on are kind of the things that people care most about: real-time query speed.So, now with this new storage engine, you can query data across any time series within milliseconds, 100 times faster queries against high cardinality data that was previously impossible. And we also have unlimited time-series volume. Again, any total number of time series you have, which is known as cardinality, is now able to run without a problem in the platform. And then we also have kind of opening up, we're opening up the aperture a bit for developers with SQL language support. And so, this is just a whole new world of flexibility for developers to begin building on the platform. And again, this is all in the way that people are using the product without having to worry about the underlying infrastructure.Corey: For most companies—and this does not apply to you—their core competency is not running time-series databases and the infrastructure attendant thereof, so it seems like it is absolutely a great candidate for, “You know, we really could make this someone else's problem and let us instead focus on the differentiated thing that we are doing or building or complaining about.”Brian: Yeah, that's a true statement. Typically what happens with time-series data is that people first of all, don't realize they have it, and then when they realize they have time-series data, you know, the first thing they'll do is look around and say, “Well, what do I have here?” You know, I have this relational database over here or this document database over here, maybe even this, kind of, search database over here, maybe that thing can handle time series. And in a light manner, it probably does the job. But like I said, the sources of data and just the volume of time series is expanding, really across all these different use cases, exponentially.And so, pretty quickly, people realize that thing that may be able to handle time series in some minor manner, is quickly no longer able to do it. They're just not purpose-built for it. And so, that's where really they come to a product like Influx to really handle this specific problem. We're built specifically for this purpose and so as the time-series workload expands when it kind of hits that tipping point, you really need a specialized tool.Corey: Last question, before I turn you loose to prepare for re:Invent, of course—well, I guess we'll ask you a little bit about that afterwards, but first, we can talk a lot theoretically about what your product could or might theoretically do. What are you actually seeing? What are the use cases that other than the stereotypical ones we've talked about, what have you seen people using it for that surprised you?Brian: Yeah, some of it is—it's just really interesting how it connects to, you know, things you see every day and/or use every day. I mean, chances are, many people listening have probably use InfluxDB and, you know, perhaps didn't know it. You know, if anyone has been to a home that has Tesla Powerwalls—Tesla is a customer of ours—then they've seen InfluxDB in action. Tesla's pulling time-series data from these connected Powerwalls that are in solar-powered homes, and they monitor things like health and availability and performance of those solar panels and the battery setup, et cetera. And they're collecting this at the edge and then sending that back into the hub where InfluxDB is running on their back end.So, if you've ever seen this deployed like that's InfluxDB running behind the scenes. Same goes, I'm sure many people have a Nest thermostat in their house. Nest monitors the infrastructure, actually the powers that collection of IoT data collection. So, you think of this as InfluxDB running behind the scenes to monitor what infrastructure is standing up that back-end Nest service. And this includes their use of Kubernetes and other software infrastructure that's run in their platform for collection, managing, transforming, and analyzing all of this aggregate device data that's out there.Another one, especially for those of us that streamed our minds out during the pandemic, Disney+ entertainment, streaming, and delivery of that to applications and to devices in the home. And so, you know, this hugely popular Disney+ streaming service is essentially a global content delivery network for distributing all these, you know, movies and video series to all the users worldwide. And they monitor the movement and performance of that video content through this global CDN using InfluxDB. So, those are a few where you probably walk by something like this multiple times a week, or in our case of Disney+ probably watching it once a day. And it's great to see InfluxDB kind of working behind the scenes there.Corey: It's one of those things where it's, I guess we'll call it plumbing, for lack of a better term. It's not the sort of thing that people are going to put front-and-center into any product or service that they wind up providing, you know, except for you folks. Instead, it's the thing that empowers a capability behind that product or service that is often taken for granted, just because until you understand the dizzying complexity, particularly at scale, of what these things have to do under the hood, it just—well yeah, of course, it works that way. Why shouldn't it? That's an expectation I have of the product because it's always had that. Yeah, but this is how it gets there.Brian: Our thesis really is that data is best understood through the lens of time. And as this data is expanding exponentially, time becomes increasingly the, kind of, common element, the common component that you're using to kind of view what happened. That could be what's running through a telecom network, what's happening with the devices that are connected that network, the movement of data through that network, and when, what's happening with subscribers and content pushing through a CDN on a streaming service, what's happening with climate and home data in hundreds of thousands, if not millions of homes through common device like a Nest thermostat. All of these things they attach to some real-world collection of data, and as long as that's happening, there's going to be a place for time-series data and tools that are optimized to handle it.Corey: So, my last question—for real this time—we are recording this the week before re:Invent 2022. What do you hope to see, what do you expect to see, what do you fear to see?Brian: No fears. Even though it's Vegas, no fears.Corey: I do have the super-spreader event fear, but that's a separate—Brian: [laugh].Corey: That's a separate issue. Neither one of us are deep into the epidemiology weeds, to my understanding. But yeah, let's just bound this to tech, let's be clear.Brian: Yeah, so first of all, we're really excited to go there. We'll have a pretty big presence. We have a few different locations where you can meet us. We'll have a booth on the main show floor, we'll be in the marketplace pavilion, as I mentioned, InfluxDB Cloud is offered across the marketplaces of each of the clouds, AWS, obviously in this case, but also in Azure and Google. But we'll be there in the AWS Marketplace pavilion, showcasing the new engine and a lot of the pretty exciting new use cases that we've been seeing.And we'll have our full team there, so if you're looking to kind of learn more about InfluxDB, or you've checked it out recently and want to understand kind of what the new capability is, we'll have many folks from our technical teams there, from our development team, some our field folks like the SEs and some of the product managers will be there as well. So, we'll have a pretty great collection of experts on InfluxDB to answer any questions and walk people through, you know, demonstrations and use cases.Corey: I look forward to it. I will be doing my traditional Wednesday afternoon tour through the expo halls and nature walk, so if you're listening to this and it's before Wednesday afternoon, come and find me. I am kicking off and ending at the [unintelligible 00:29:15] booth, but I will make it a point to come by the Influx booth and give you folks a hard time because that's what I do.Brian: We love it. Please. You know, being on the tour is—on the walking tour is excellent. We'll be mentally prepared. We'll have some comebacks ready for you.Corey: Therapists are standing by on both sides.Brian: Yes, exactly. Anyway, we're really looking forward to it. This will be my third year on your walking tour. So, the nature walk is one of my favorite parts of AWS re:Invent.Corey: Well, I appreciate that. Thank you. And thank you for your time today. I will let you get back to your no doubt frenzied preparations. At least they are on my side.Brian: We will. Thanks so much for having me and really excited to do it.Corey: Brian Mullen, CMO at InfluxData, I'm Cloud Economist Corey Quinn and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice, along with an insulting comment that you naively believe will be stored as a TXT record in a DNS server somewhere rather than what is almost certainly a time-series database.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.Announcer: This has been a HumblePod production. Stay humble.