Podcasts about yugabyte

  • 32 PODCASTS
  • 47 EPISODES
  • 42m AVG DURATION
  • 1 MONTHLY NEW EPISODE
  • Feb 6, 2025 LATEST

POPULARITY (chart covering 2017 to 2024)


Latest podcast episodes about yugabyte

The Capitalism and Freedom in the Twenty-First Century Podcast
Productivity, Innovation, and the New American Golden Age with Joe Lonsdale

Feb 6, 2025 · 64:53 · Transcription Available


Jon Hartley and Joe Lonsdale discuss Joe's career, co-founding Palantir, Addepar, and OpenGov, venture capital investing, defense tech, DOGE, Elon Musk, regulation, and the prospects for generative artificial intelligence. Recorded on December 12, 2024.

ABOUT THE SPEAKERS:

Joe Lonsdale is the founder and managing partner at 8VC, an early-stage venture capital firm managing over $6 billion in capital. In 2003, he co-founded Palantir Technologies (NYSE:PLTR), a global software company known for its work supporting the defense and intelligence efforts of the US and its allies. Since then, he has founded more than a dozen prominent companies, including Addepar, a wealth management platform with about $5 trillion in assets on the platform, and OpenGov, the leading cloud software provider for local governments. He continues to create and scale companies through the 8VC Build program.

As an investor, Joe was an early backer of companies like Anduril Industries, Oculus (acq. FB), Guardant Health (NASDAQ:GH), Oscar (NYSE:OSCR), Illumio, Wish (NASDAQ:WISH), JoyTunes, Blend (NYSE:BLND), Flexport, Joby Aviation (NYSE:JOBY), Orca Bio, Qualia, Synthego, RelateIQ (acq. CRM), Yugabyte, and others.

Joe and his wife Tayler are active in a variety of philanthropic and institutional pursuits. In 2018, they founded the non-partisan Cicero Institute, which crafts and advances policies to promote effective and accountable governance, and is now successfully battling special interests with teams in over a dozen states. In 2021, Joe became the founding chairman of the board of the University of Austin (UATX), a new university dedicated to restoring the pursuit of truth in higher education. He also sits on the board of the Ronald Reagan Presidential Foundation & Institute.

Joe, Tayler, and their four daughters live in Austin, TX.

Jon Hartley is the host of the Capitalism and Freedom in the 21st Century Podcast at the Hoover Institution and an economics PhD Candidate at Stanford University, where he specializes in finance, labor economics, and macroeconomics. He is also currently an Affiliated Scholar at the Mercatus Center, a Senior Fellow at the Foundation for Research on Equal Opportunity (FREOPP), and a Senior Fellow at the Macdonald-Laurier Institute. Jon is also a member of the Canadian Group of Economists, and serves as chair of the Economic Club of Miami. Jon has previously worked at Goldman Sachs Asset Management as well as in various policy roles at the World Bank, IMF, Committee on Capital Markets Regulation, US Congress Joint Economic Committee, the Federal Reserve Bank of New York, the Federal Reserve Bank of Chicago, and the Bank of Canada.  Jon has also been a regular economics contributor for National Review Online, Forbes, and The Huffington Post and has contributed to The Wall Street Journal, The New York Times, USA Today, Globe and Mail, National Post, and Toronto Star among other outlets. Jon has also appeared on CNBC, Fox Business, Fox News, Bloomberg, and NBC, and was named to the 2017 Forbes 30 Under 30 Law & Policy list, the 2017 Wharton 40 Under 40 list, and was previously a World Economic Forum Global Shaper. ABOUT THE SERIES: Each episode of Capitalism and Freedom in the 21st Century, a video podcast series and the official podcast of the Hoover Economic Policy Working Group, focuses on getting into the weeds of economics, finance, and public policy on important current topics through one-on-one interviews. Host Jon Hartley asks guests about their main ideas and contributions to academic research and policy. 
The podcast is titled after Milton Friedman's famous 1962 bestselling book Capitalism and Freedom, which, after 60 years, remains prescient in its focus on topics that are now at the forefront of economic debates, such as monetary policy and inflation, fiscal policy, occupational licensing, education vouchers, income share agreements, the distribution of income, and negative income taxes, among many others. For more information, visit: capitalismandfreedom.substack.com/

Open Source Startup Podcast
E159: Innovating on Distributed SQL Databases with Yugabyte

Dec 9, 2024 · 41:41


Karthik Ranganathan is Founder & CTO of Yugabyte, the PostgreSQL-compatible distributed database for cloud native applications. Their open source database, also called yugabyte, has almost 10K stars on GitHub. Yugabyte has raised almost $300M and sits at a $1.3B valuation. They've raised from investors including Sapphire Ventures, Lightspeed, and 8VC. In this episode, we dig into the enormous interest Yugabyte had at the outset as transactional databases were due for innovation, the key architecture choices they made, the initial launch, key early customer wins, the importance of positioning as a distributed SQL company, their evolving open source strategy, building alongside the Postgres community, the decision to bring on an outside CEO & much more!

The Business of Open Source
Excellent Open Source User Experiences with Karthik Ranganathan

Jun 19, 2024 · 47:33


This week on The Business of Open Source, I spoke with Karthik Ranganathan, founder and co-CEO of Yugabyte. This is the second time Karthik has been on the podcast, but since three years had passed I thought it'd be a good idea to catch up and see what's changed at Yugabyte and how his perspective on the open source commercial ecosystem has changed. Some really cool topics came up in this conversation. For example:

  • Why engineers don't choose databases based on features (and how this is related to why so many databases are open source). This was super interesting, because I've seen a lot of conversations in the developer tools space about how developers choose their tools based on the features the tool has, and you should therefore market/sell based on features (unlike marketing/selling to any other market). I think this is bullshit and based on a misunderstanding about the difference between a feature and a benefit.
  • Going back to the database market, we talked about how ultimately database users need to develop an intuition around when a particular database is the best choice, and that it takes time to do so.
  • Choosing a database is about choosing what to prioritize for a particular application, and in a way Yugabyte presents its users/customers with a way to prioritize what's important, simplicity or flexibility. Companies that want more simplicity get something that's fully managed (and pay for it); companies that prioritize flexibility above all else are a better fit for the open source version. The database is the same, regardless of whether someone is using the pure open source version or the fully managed service -- and it's important to Yugabyte that everyone gets the same core functionality.
  • How the role of open source and its value for Yugabyte as a company has changed as the company has matured, and in particular how it's a way for people to try out Yugabyte first, and then reach out.
  • Why Yugabyte has invested in making sure the open source user experience is excellent -- because they want users to get value out of the project immediately; no one has time to spend four days figuring out how a new database works. This is part of why they think the open source project has become a lead engine.
  • The importance of messaging in helping people understand quickly what to expect from the project and minimizing the amount of time it takes for them to get value out of it.
  • Whether or not Yugabyte was a bit early to the cloud native party, and the pros and cons of being early.

And much more!

Postgres FM
Custom vs generic plan

May 10, 2024 · 29:00


Nikolay and Michael discuss custom and generic planning in prepared statements -- how it works, how issues can present themselves, some ways to view the generic plan, and some benefits of avoiding planning (not just time).

Here are some links to things they mentioned:

  • PREPARE https://www.postgresql.org/docs/current/sql-prepare.html
  • track_activity_query_size https://www.postgresql.org/docs/current/runtime-config-statistics.html#GUC-TRACK-ACTIVITY-QUERY-SIZE
  • plan_cache_mode https://www.postgresql.org/docs/current/runtime-config-query.html#GUC-PLAN-CACHE-MODE
  • EXPLAIN (GENERIC_PLAN) https://www.postgresql.org/docs/current/sql-explain.html#id-1.9.3.148.8
  • EXPLAIN (GENERIC_PLAN) in PostgreSQL 16 (blog post by Laurenz from Cybertec) https://www.cybertec-postgresql.com/en/explain-generic-plan-postgresql-16/
  • Running EXPLAIN on any query, even with $1 parameters (blog post and video by Lukas Fittl of pganalyze) https://pganalyze.com/blog/5mins-postgres-explain-generic-plan
  • EXPLAIN from pg_stat_statements, how to get the generic plan (blog post by Franck Pachot of Yugabyte) https://dev.to/yugabyte/explain-from-pgstatstatements-normalized-queries-how-to-always-get-the-generic-plan-in--5cfi
  • Rework query relation permission checking (commit by Amit Langote) https://git.postgresql.org/gitweb/postgres.git?p=postgresql.git;a=commit;h=a61b1f74823c9c4f79c95226a461f1e7a367764b
  • Partition pruning, prepared statements and generic vs custom query plans (a follow up blog post and video by Lukas) https://pganalyze.com/blog/5mins-postgres-partition-pruning-prepared-statements-generic-vs-custom-query-plans
  • Our episode on over-indexing (inc LWLock discussion) https://postgres.fm/episodes/over-indexing
  • “The year of the lock manager's revenge” (from blog post by Jeremy Schneider) https://ardentperf.com/2024/03/03/postgres-indexes-partitioning-and-lwlocklockmanager-scalability/

What did you like or not like? What should we discuss next time? Let us know via a YouTube comment, on social media, or by commenting on our Google doc!

Postgres FM is brought to you by:

  • Nikolay Samokhvalov, founder of Postgres.ai
  • Michael Christofides, founder of pgMustard

With special thanks to:

  • Jessie Draws for the amazing artwork
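
For readers new to the topic of this episode, here is a rough sketch of how the choice between custom and generic plans can be modelled. It follows the rule described in the PostgreSQL PREPARE documentation for `plan_cache_mode = auto`: the first five executions use custom plans, after which the generic plan is used if its estimated cost is not much higher than the average custom-plan cost. This is a toy model, not PostgreSQL code. The class name, the `simulate` helper, and the `tolerance` parameter are assumptions made for illustration, and the costs are plain numbers rather than planner estimates.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ToyPreparedStatement:
    """Toy model of plan selection for one prepared statement.

    This is NOT PostgreSQL source code; costs are plain numbers supplied
    by the caller rather than planner estimates.
    """
    generic_cost: float      # estimated cost of the generic plan
    tolerance: float = 0.1   # hypothetical stand-in for "not so much higher"
    custom_costs: List[float] = field(default_factory=list)

    def choose_plan(self, custom_cost: float) -> str:
        """Return "custom" or "generic" for the next execution."""
        # The first five executions always use a custom plan.
        if len(self.custom_costs) < 5:
            self.custom_costs.append(custom_cost)
            return "custom"
        # Afterwards, compare the generic plan with the average custom plan.
        average_custom = sum(self.custom_costs) / len(self.custom_costs)
        if self.generic_cost <= average_custom * (1 + self.tolerance):
            return "generic"
        # Generic plan looks too expensive: keep replanning with a custom plan.
        self.custom_costs.append(custom_cost)
        return "custom"


def simulate(generic_cost: float, custom_cost: float, executions: int) -> List[str]:
    """Run a number of executions with a constant custom-plan cost."""
    stmt = ToyPreparedStatement(generic_cost=generic_cost)
    return [stmt.choose_plan(custom_cost) for _ in range(executions)]


if __name__ == "__main__":
    # Generic plan is close to the custom plans: switches after five runs.
    print(simulate(generic_cost=100.0, custom_cost=95.0, executions=7))
    # Generic plan is far more expensive: custom plans are kept.
    print(simulate(generic_cost=1000.0, custom_cost=10.0, executions=7))
```

In a real server the same behaviour is steered with the `plan_cache_mode` setting linked above, which can also force either kind of plan.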

Hacking Postgres
S2E2: Denis Magda, Director of Developer Relations, Yugabyte

Apr 11, 2024 · 48:23


Denis Magda is the mastermind behind the innovative extension PgCompute and a pivotal figure in the world of Postgres development. With over a decade and a half of experience, Denis cut his teeth on Postgres during its use in high-traffic social networking projects in Eastern Europe and has continued to push the envelope at Yugabyte as the head of Developer Relations.

In this episode we explore:

  • The role of Yugabyte in the Postgres ecosystem
  • The need for local development experiences
  • A more 'magical' and testing-friendly approach to handling stored procedures
  • The intersection of DevRel with sales engineering and marketing
  • The future of Postgres in educational institutions

Links mentioned:

  • Business Wars podcast
  • Postgres FM
  • Founders podcast
  • Denis on X
  • Denis on LinkedIn
  • Yugabyte

Trino Community Broadcast
52: Commander Bun Bun takes a bite out of Yugabyte

Oct 27, 2023 · 61:43


Timestamps:
- 0:00 Intro
- 1:48 Releases 428-430
- 6:30 Introducing Denis Magda from @YugabyteDB
- 7:56 JDBC, Trino's JDBC driver, and the Postgres connector
- 14:08 Introducing YugabyteDB
- 21:33 Demo time! Trino with PostgreSQL
- 29:56 Demoing Trino with YugabyteDB
- 44:57 Failover and resiliency
- 56:05 Upcoming events and Trino Summit soon!

Data Engineering Podcast
Building An Internal Database As A Service Platform At Cloudflare

Aug 28, 2023 · 61:09


Summary Data persistence is one of the most challenging aspects of computer systems. In the era of the cloud most developers rely on hosted services to manage their databases, but what if you are a cloud service? In this episode Vignesh Ravichandran explains how his team at Cloudflare provides PostgreSQL as a service to their developers for low latency and high uptime services at global scale. This is an interesting and insightful look at pragmatic engineering for reliability and scale. Announcements Hello and welcome to the Data Engineering Podcast, the show about modern data management Introducing RudderStack Profiles. RudderStack Profiles takes the SaaS guesswork and SQL grunt work out of building complete customer profiles so you can quickly ship actionable, enriched data to every downstream team. You specify the customer traits, then Profiles runs the joins and computations for you to create complete customer profiles. Get all of the details and try the new product today at dataengineeringpodcast.com/rudderstack (https://www.dataengineeringpodcast.com/rudderstack) This episode is brought to you by Datafold – a testing automation platform for data engineers that finds data quality issues before the code and data are deployed to production. Datafold leverages data-diffing to compare production and development environments and column-level lineage to show you the exact impact of every code change on data, metrics, and BI tools, keeping your team productive and stakeholders happy. Datafold integrates with dbt, the modern data stack, and seamlessly plugs in your data CI for team-wide and automated testing. If you are migrating to a modern data stack, Datafold can also help you automate data and code validation to speed up the migration. Learn more about Datafold by visiting dataengineeringpodcast.com/datafold (https://www.dataengineeringpodcast.com/datafold) You shouldn't have to throw away the database to build with fast-changing data. 
You should be able to keep the familiarity of SQL and the proven architecture of cloud warehouses, but swap the decades-old batch computation model for an efficient incremental engine to get complex queries that are always up-to-date. With Materialize, you can! It's the only true SQL streaming database built from the ground up to meet the needs of modern data products. Whether it's real-time dashboarding and analytics, personalization and segmentation or automation and alerting, Materialize gives you the ability to work with fresh, correct, and scalable results -- all in a familiar SQL interface. Go to dataengineeringpodcast.com/materialize (https://www.dataengineeringpodcast.com/materialize) today to get 2 weeks free! Your host is Tobias Macey and today I'm interviewing Vignesh Ravichandran about building an internal database as a service platform at Cloudflare

Interview

  • Introduction
  • How did you get involved in the area of data management?
  • Can you start by describing the different database workloads that you have at Cloudflare?
  • What are the different methods that you have used for managing database instances?
  • What are the requirements and constraints that you had to account for in designing your current system?
  • Why Postgres?
    • optimizations for Postgres
    • simplification from not supporting multiple engines
    • limitations in postgres that make multi-tenancy challenging
    • scale of operation (data volume, request rate)
  • What are the most interesting, innovative, or unexpected ways that you have seen your DBaaS used?
  • What are the most interesting, unexpected, or challenging lessons that you have learned while working on your internal database platform?
  • When is an internal database as a service the wrong choice?
  • What do you have planned for the future of Postgres hosting at Cloudflare?

Contact Info

  • LinkedIn (https://www.linkedin.com/in/vigneshravichandran28/)
  • Website (https://viggy28.dev/)

Parting Question

  • From your perspective, what is the biggest gap in the tooling or technology for data management today?

Closing Announcements

  • Thank you for listening! Don't forget to check out our other shows. Podcast.__init__ (https://www.pythonpodcast.com) covers the Python language, its community, and the innovative ways it is being used. The Machine Learning Podcast (https://www.themachinelearningpodcast.com) helps you go from idea to production with machine learning.
  • Visit the site (https://www.dataengineeringpodcast.com) to subscribe to the show, sign up for the mailing list, and read the show notes.
  • If you've learned something or tried out a project from the show then tell us about it! Email hosts@dataengineeringpodcast.com (mailto:hosts@dataengineeringpodcast.com) with your story.
  • To help other people find the show please leave a review on Apple Podcasts (https://podcasts.apple.com/us/podcast/data-engineering-podcast/id1193040557) and tell your friends and co-workers

Links

  • Cloudflare (https://www.cloudflare.com/)
  • PostgreSQL (https://www.postgresql.org/)
    • Podcast Episode (https://www.dataengineeringpodcast.com/postgresql-with-jonathan-katz-episode-42/)
  • IP Address Data Type in Postgres (https://www.postgresql.org/docs/current/datatype-net-types.html)
  • CockroachDB (https://www.cockroachlabs.com/)
    • Podcast Episode (https://www.dataengineeringpodcast.com/cockroachdb-with-peter-mattis-episode-35/)
  • Citus (https://www.citusdata.com/)
    • Podcast Episode (https://www.dataengineeringpodcast.com/citus-data-with-ozgun-erdogan-and-craig-kerstiens-episode-13/)
  • Yugabyte (https://www.yugabyte.com/)
    • Podcast Episode (https://www.dataengineeringpodcast.com/yugabytedb-planet-scale-sql-episode-115/)
  • Stolon (https://github.com/sorintlab/stolon)
  • pg_rewind (https://www.postgresql.org/docs/current/app-pgrewind.html)
  • PGBouncer (https://www.pgbouncer.org/)
  • HAProxy Presentation (https://www.youtube.com/watch?v=HIOo4j-Tiq4)
  • Etcd (https://etcd.io/)
  • Patroni (https://patroni.readthedocs.io/en/latest/)
  • pg_upgrade (https://www.postgresql.org/docs/current/pgupgrade.html)
  • Edge Computing (https://en.wikipedia.org/wiki/Edge_computing)

The intro and outro music is from The Hug (http://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/Love_death_and_a_drunken_monkey/04_-_The_Hug) by The Freak Fandango Orchestra (http://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/) / CC BY-SA (http://creativecommons.org/licenses/by-sa/3.0/)

The State of Developer Education
The Importance of Understanding Databases in Developer Education with Denis Magda of Yugabyte

Aug 24, 2023 · 43:53


In this episode of The State of Developer Education, Jon is joined by Denis Magda, the Director of Developer Relations at Yugabyte. They delve into the importance of understanding databases in application development, the value of comprehending the internals of programming languages, and the benefits of being involved in open source communities.

Behind Company Lines
Mastering The Art Of Scaling - Karthik Ranganathan | BCL #307

Aug 10, 2023 · 50:04


Episode 307: Karthik Ranganathan, CTO & Founder of Yugabyte

Join us on this exciting episode of Behind Company Lines as Julian Torres sits down with Karthik Ranganathan, the visionary CTO and Co-Founder of Yugabyte. This remarkable conversation takes you on a journey from Karthik's foundational role in building distributed databases at Facebook to the impressive rise of Yugabyte, now a unicorn company with over $1 billion in valuation. Yugabyte is on its Series C and to date has raised $291M in funding. Its last round was a big one -- $188M. Discover the secrets of startup value discovery, hear about bold moves by tech titans, and gain insights into Facebook's innovative culture. Karthik shares invaluable lessons on scaling success and navigating challenges, all while revealing the dynamic world of data scaling and showcasing innovative use cases for Yugabyte. Don't miss this episode filled with candid discussions about the founder's day-to-day challenges, the power of open-source, and the influential books and people that shaped Karthik's incredible journey. Tune in to uncover the hidden gems behind the tech industry's curtain, and get inspired by the stories and strategies that drive success in today's dynamic landscape.

Connect with Behind Company Lines and HireOtter: Website, Facebook, Twitter, LinkedIn (Behind Company Lines, HireOtter), Instagram, Buzzsprout

The Security Podcast of Silicon Valley
Sergey Stelmakh, Head of Security Engineering at Yugabyte

Aug 3, 2023 · 44:46


Sergey Stelmakh takes on the intriguing question of how to marry innovation (risk taking) with security (risk mitigation), how to build effective teams, and how his life led him down the path into security at engineering-driven companies. His roles have included Head of Security Engineering at Yugabyte, Platform Security Architect at MuleSoft (acquired by Salesforce), and Lead Security Architect at Symphony Communications, all building on his roots in mathematics as an Assistant Professor at Belarusian State University.

EM360 Podcast
The Importance of Data Modernisation

Jun 28, 2023 · 12:58


Data modernisation is becoming increasingly important to CIOs and CTOs in today's world. From leveraging modern technologies to improving performance, scalability and functionality, database modernisation can be a key component to meeting the evolving needs of an organisation. In this episode of the EM360 Podcast, Head of Content Matt Harris speaks to Suda Srinivasan, VP Strategy and Solutions at Yugabyte, to discuss:

  • Database modernisation and why it's so important to CIOs/CTOs
  • Dangers of ties to legacy database technology
  • Supporting businesses in a multi-cloud world

Voice of the DBA
Concurrency Challenges Around Schema Changes

Jun 25, 2023 · 2:40


I saw a great question on Twitter from Franck Pachot, a developer advocate at Yugabyte. He wrote: Without thinking how your preferred database deals with it, what do you expect if:

  • session 1 starts to read table T
  • session 2 drops table T
  • session 1 continues to read

The choices in his poll were: session 2 waits, session 2 fails, session 1 fails, both fail.

My first thought was SQL Server and the default need for session 2 to get an exclusive lock. In that case, session 2 would wait. Most people answered that same way, but then Franck posted a follow-up with a link to his blog. The answer for Yugabyte is that session 1 fails as it gets the message that the table was deleted.

Read the rest of Concurrency Challenges Around Schema Changes
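
Two of the poll outcomes mentioned above, "session 2 waits" and "session 1 fails", can be illustrated with a small toy model. This is only a sketch for building intuition: it does not reflect how SQL Server or YugabyteDB are implemented, and the class, policy names, and `run` helper are all hypothetical.

```python
class TableDropped(Exception):
    """Raised when a reader notices its table no longer exists."""


class ToyDatabase:
    """In-memory stand-in for a database with one table, T."""

    def __init__(self, policy: str):
        if policy not in ("drop_waits", "reader_fails"):
            raise ValueError(f"unknown policy: {policy}")
        self.policy = policy
        self.tables = {"T": ["row1", "row2", "row3"]}
        self.active_readers = {"T": 0}

    def read(self, name: str):
        """Yield rows one at a time, like a session scanning a table."""
        self.active_readers[name] += 1
        try:
            row_count = len(self.tables[name])
            for i in range(row_count):
                # Re-check the catalog before each row is fetched.
                if name not in self.tables:
                    raise TableDropped(name)
                yield self.tables[name][i]
        finally:
            self.active_readers[name] -= 1

    def drop(self, name: str) -> bool:
        """Return True if the table was dropped, False if the drop must wait."""
        if self.policy == "drop_waits" and self.active_readers.get(name, 0) > 0:
            return False  # a lock-style policy: the dropper is blocked
        del self.tables[name]
        return True


def run(policy: str):
    """Replay the three steps from the poll under the given policy."""
    db = ToyDatabase(policy)
    session1 = db.read("T")
    next(session1)              # session 1 starts to read table T
    dropped = db.drop("T")      # session 2 drops table T
    try:
        list(session1)          # session 1 continues to read
        outcome = "session 1 finished"
    except TableDropped:
        outcome = "session 1 failed"
    return dropped, outcome


if __name__ == "__main__":
    for policy in ("drop_waits", "reader_fails"):
        print(policy, run(policy))
```

Under the `drop_waits` policy the drop is refused while the read is in progress and session 1 finishes; under `reader_fails` the drop succeeds immediately and session 1 errors on its next fetch.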

Screaming in the Cloud
Making Open-Source Multi-Cloud Truly Free with AB Periasamy

Mar 28, 2023 · 40:04


AB Periasamy, Co-Founder and CEO of MinIO, joins Corey on Screaming in the Cloud to discuss what it means to be truly open source and the current and future state of multi-cloud. AB explains how MinIO was born from the idea that the world was going to produce a massive amount of data, and what it's been like to see that come true and continue to be the future outlook. AB and Corey explore why some companies are hesitant to move to cloud, and AB describes why he feels the move is inevitable regardless of cost. AB also reveals how he has helped create a truly free open-source software, and how his partnership with Amazon has been beneficial.

About AB

AB Periasamy is the co-founder and CEO of MinIO, an open source provider of high performance, object storage software. In addition to this role, AB is an active investor and advisor to a wide range of technology companies, from H2O.ai and Manetu where he serves on the board to advisor or investor roles with Humio, Isovalent, Starburst, Yugabyte, Tetrate, Postman, Storj, Procurify, and Helpshift. Successful exits include Gitter.im (Gitlab), Treasure Data (ARM) and Fastor (SMART).

AB co-founded Gluster in 2005 to commoditize scalable storage systems. As CTO, he was the primary architect and strategist for the development of the Gluster file system, a pioneer in software defined storage. After the company was acquired by Red Hat in 2011, AB joined Red Hat's Office of the CTO. Prior to Gluster, AB was CTO of California Digital Corporation, where his work led to scaling of the commodity cluster computing to supercomputing class performance. His work there resulted in the development of Lawrence Livermore Laboratory's “Thunder” code, which, at the time was the second fastest in the world.

AB holds a Computer Science Engineering degree from Annamalai University, Tamil Nadu, India.

AB is one of the leading proponents and thinkers on the subject of open source software, articulating the difference between the philosophy and business model. An active contributor to a number of open source projects, he is a board member of India's Free Software Foundation.

Links Referenced:

  • MinIO: https://min.io/
  • Twitter: https://twitter.com/abperiasamy
  • LinkedIn: https://www.linkedin.com/in/abperiasamy/
  • Email: mailto:ab@min.io

Transcript

Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.

Corey: This episode is sponsored in part by our friends at Chronosphere. When it costs more money and time to observe your environment than it does to build it, there's a problem. With Chronosphere, you can shape and transform observability data based on need, context and utility. Learn how to only store the useful data you need to see in order to reduce costs and improve performance at chronosphere.io/corey-quinn. That's chronosphere.io/corey-quinn. And my thanks to them for sponsoring my ridiculous nonsense.

Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn, and I have taken a somewhat strong stance over the years on the relative merits of multi-cloud, and when it makes sense and when it doesn't. And it's time for me to start modifying some of those. To have that conversation and several others as well, with me today on this promoted guest episode is AB Periasamy, CEO and co-founder of MinIO. AB, it's great to have you back.

AB: Yes, it's wonderful to be here again, Corey.

Corey: So, one thing that I want to start with is defining terms.
Because when we talk about multi-cloud, there are -- to my mind at least -- smart ways to do it and ways that are frankly ignorant. The thing that I've never quite seen is, it's greenfield, day one. Time to build something. Let's make sure we can build and deploy it to every cloud provider we might ever want to use.

And that is usually not the right path. Whereas different workloads in different providers, that starts to make a lot more sense. When you do mergers and acquisitions, as big companies tend to do in lieu of doing anything interesting, it seems like they find it oh, we're suddenly in multiple cloud providers, should we move this acquisition to a new cloud? No. No, you should not.

One of the challenges, of course, is that there's a lot of differentiation between the baseline offerings that cloud providers have. MinIO is interesting in that it starts and stops with an object store that is mostly S3 API compatible. Have I nailed the basic premise of what it is you folks do?

AB: Yeah, it's basically an object store. Amazon S3 versus us, it's actually -- that's the comparable, right? Amazon S3 is a hosted cloud storage as a service, but underneath the underlying technology is called object-store. MinIO is a software and it's also open-source and it's the software that you can deploy on the cloud, deploy on the edge, deploy anywhere, and both Amazon S3 and MinIO are exactly S3 API compatible. It's a drop-in replacement. You can write applications on MinIO and take it to AWS S3, and do the reverse. Amazon made S3 API a standard inside AWS, we made S3 API standard across the whole cloud, all the cloud edge, everywhere, rest of the world.

Corey: I want to clarify two points because otherwise I know I'm going to get nibbled to death by ducks on the internet.
When you say open-source, it is actually open-source; you're AGPL, not source available, or, “We've decided now we're going to change our model for licensing because oh, some people are using this without paying us money,” as so many companies seem to fall into that trap. You are actually open-source and no one reasonable is going to be able to disagree with that definition.

The other pedantic part of it is when something says that it's S3 compatible on an API basis, like, the question is always does that include the weird bugs that we wish it wouldn't have, or some of the more esoteric stuff that seems to be a constant source of innovation? To be clear, I don't think that you need to be particularly compatible with those very corner and vertex cases. For me, it's always been the basic CRUD operations: can you store an object? Can you give it back to me? Can you delete the thing? And maybe an update, although generally object stores tend to be atomic. How far do you go down that path of being, I guess, a faithful implementation of what the S3 API does, and at which point you decide that something is just, honestly, lunacy and you feel no need to wind up supporting that?

AB: Yeah, the unfortunate part of it is we have to be very, very deep. It only takes one API to break. And it's not even, like, one API we did not implement; one API under a particular circumstance, right? Like even if you see, like, AWS SDK is, right, Java SDK, different versions of Java SDK will interpret the same API differently. And AWS S3 is an API, it's not a standard.

And Amazon has published the REST specifications, API specs, but they are more like religious text. You can interpret it in many ways. Amazon's own SDK has interpreted, like, this in several ways, right? The only way to get it right is, like, you have to have a massive ecosystem around your application.
And if one thing breaks -- today, if I commit a code and it introduced a regression, I will immediately hear from a whole bunch of community what I broke.

There's no certification process here. There is no industry consortium to control the standard, but then there is an accepted standard. Like, if the application works, they need works. And one way to get it right is, like, Amazon SDKs, all of those language SDKs, to be cleaner, simpler, but applications can even use MinIO SDK to talk to Amazon and Amazon SDK to talk to MinIO. Now, there is a clear, cooperative model.

And I actually have tremendous respect for Amazon engineers. They have only been kind and meaningful, like, reasonable partnership. Like, if our community reports a bug that Amazon rolled out a new update in one of the region and the S3 API broke, they will actually go fix it. They will never argue, “Why are you using MinIO SDK?” Their engineers, they do everything by reason. That's the reason why they gained credibility.

Corey: I think, on some level, that we can trust that the API is not going to meaningfully shift, just because so much has been built on top of it over the last 15, almost 16 years now that even slight changes require massive coordination. I remember there was a little bit of a kerfuffle when they announced that they were going to be disabling the BitTorrent endpoint in S3 and it was no longer going to be supported in new regions, and eventually they were turning it off. There were still people pushing back on that. I'm still annoyed by some of the documentation around the API that says that it may not return a legitimate error code when it errors with certain XML interpretations. It's… it's kind of become very much its own thing.

AB: [unintelligible 00:06:22] a problem, like, we have seen, like, even stupid errors similar to that, right?
Like, HTTP headers are supposed to be case insensitive, but then there are some language SDKs will send us in certain type of casing and they expect the case to be—the response to be same way. And that's not HTTP standard. If we have to accept that bug and respond in the same way, then we are asking a whole bunch of community to go fix that application. And Amazon's problems are our problems too. We have to carry that baggage.

But some places where we actually take a hard stance is, like, Amazon introduced that initially, the bucket policies, like access control list, then finally came IAM, then we actually, for us, like, the best way to teach the community is make best practices the standard. The only way to do it. We have been, like, educating them that we actually implemented ACLs, but we removed it. So, the customers will no longer use it. The scale at which we are growing, if I keep it, then I can never force them to remove.

So, we have been pedantic about, like, how, like, certain things that if it's a good advice, force them to do it. That approach has paid off, but the problem is still quite real. Amazon also admits that S3 API is no longer simple, but at least it's not like POSIX, right? POSIX is a rich set of API, but doesn't do useful things that we need to do. So, Amazon's APIs are built on top of simple primitive foundations that got the storage architecture correct, and then doing sophisticated functionalities on top of the simple primitives, these atomic RESTful APIs, you can finally do it right and you can take it to great lengths and still not break the storage system.

So, I'm not so concerned. I think it's time for both of us to slow down and then make sure that the ease of operation and adoption is the goal, than trying to create an API Bible.

Corey: Well, one differentiation that you have that frankly I wish S3 would wind up implementing is this idea of bucket quotas.
I would give a lot in certain circumstances to be able to say that this S3 bucket should be able to hold five gigabytes of storage and no more. Like, you could fix a lot of free tier problems, for example, by doing something like that. But there's also the problem that you'll see in data centers where, okay, we've now filled up whatever storage system we're using. We need to either expand it at significant cost and it's going to take a while or it's time to go and maybe delete some of the stuff we don't necessarily need to keep in perpetuity.

There is no moment of reckoning in traditional S3 in that sense because, oh, you can just always add one more gigabyte at 2.3 or however many cents it happens to be, and you wind up with an unbounded growth problem that you're never really forced to wrestle with. Because it's infinite storage. They can add drives faster than you can fill them in most cases. So, it just feels like there's an economic story, if nothing else, just from a governance control and make sure this doesn't run away from me, and alert me before we get into the multi-petabyte style of storage for my Hello World WordPress website.

AB: Mm-hm. Yeah, so I always thought that Amazon did not do this—it's not just Amazon, the cloud players, right—they did not do this because they want—is good for their business; they want all the customers' data, like unrestricted growth of data. Certainly it is beneficial for their business, but there is an operational challenge. When you set quota—this is why we grudgingly introduced this feature. We did not have quotas and we didn't want to because Amazon S3 API doesn't talk about quota, but the enterprise community wanted this so badly.

And eventually we [unintelligible 00:09:54] it and we gave. But there is one issue to be aware of, right?
The problem with quota is that you as an object storage administrator, you set a quota, let's say this bucket, this application, I don't see more than 20TB; I'm going to set 100TB quota. And then you forget it. And then you think in six months, they will reach 20TB. The reality is, in six months they reach 100TB.

And then when nobody expected—everybody has forgotten that there was a quota at a certain place—suddenly applications start failing. And when it fails, it doesn't—even though the S3 API responds back saying that insufficient space, but then the application doesn't really pass that error all the way up. When applications fail, they fail in unpredictable ways. By the time the application developer realizes that it's actually object storage ran out of space, the lost time and it's a downtime. So, as long as they have proper observability—because I mean, I've will also asked observability, that it can alert you that you are only going to run out of space soon. If you have those system in place, then go for quota. If not, I would agree with the S3 API standard that is not about cost. It's about operational, unexpected accidents.

Corey: Yeah, on some level, we wound up having to deal with the exact same problem with disk volumes, where my default for most things was, at 70%, I want to start getting pings on it and at 90%, I want to be woken up for it. So, for small volumes, you wind up with a runaway log or whatnot, you have a chance to catch it and whatnot, and for the giant multi-petabyte things, okay, well, why would you alert at 70% on that? Well, because procurement takes a while when we're talking about buying that much disk for that much money. It was a roughly good baseline for these things. The problem, of course, is when you have none of that, and well it got full so oops-a-doozy.

On some level, I wonder if there's a story around soft quotas that just scream at you, but let you keep adding to it.
But that turns into implementation details, and you can build something like that on top of any existing object store if you don't need the hard limit aspect.

AB: Actually, that is the right way to do. That's what I would recommend customers to do. Even though there is hard quota, I will tell, don't use it, but use soft quota. And the soft quota, instead of even soft quota, you monitor them. On the cloud, at least you have some kind of restriction that the more you use, the more you pay; eventually the month end bills, it shows up.

On MinIO, when it's deployed on these large data centers, that it's unrestricted access, quickly you can use a lot of space, no one knows what data to delete, and no one will tell you what data to delete. The way to do this is there has to be some kind of accountability.

The way to do it is—actually [unintelligible 00:12:27] have some chargeback mechanism based on the bucket growth. And the business units have to pay for it, right? That IT doesn't run for free, right? IT has to have a budget and it has to be sponsored by the applications team.

And you measure, instead of setting a hard limit, you actually charge them that based on the usage of your bucket, you're going to pay for it. And this is a observability problem. And you can call it soft quotas, but it has to trigger an alert in observability. It's observability problem. But it actually is interesting to hear that as soft quotas, which makes a lot of sense.

Corey: It's one of those problems that I think people only figure out after they've experienced it once. And then they look like wizards from the future who, “Oh, yeah, you're going to run into a quota storage problem.” Yeah, we all find that out because the first time we smack into something and live to regret it. Now, we can talk a lot about the nuances and implementation and low level detail of this stuff, but let's zoom out of it. What are you folks up to these days?
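
The soft-quota and chargeback ideas from this exchange can be sketched in a few lines. This is only an illustration, not anything MinIO or S3 provides: the 70% and 90% thresholds are the ones Corey mentions for disk volumes, while `BucketUsage`, `soft_quota_status`, `monthly_chargeback`, and the price per TiB are hypothetical names and numbers made up for the example. The point is that the check only classifies usage and never rejects a write.

```python
from dataclasses import dataclass

# Thresholds taken from the conversation: notify at 70%, page at 90%.
WARN_FRACTION = 0.70
PAGE_FRACTION = 0.90


@dataclass(frozen=True)
class BucketUsage:
    """Hypothetical usage record, e.g. filled in from a metrics system."""
    name: str
    used_bytes: int
    soft_quota_bytes: int


def soft_quota_status(usage: BucketUsage) -> str:
    """Classify usage against a soft quota. Never blocks writes."""
    if usage.soft_quota_bytes <= 0:
        raise ValueError("soft quota must be positive")
    fraction = usage.used_bytes / usage.soft_quota_bytes
    if fraction >= PAGE_FRACTION:
        return "page"   # wake someone up
    if fraction >= WARN_FRACTION:
        return "warn"   # start sending notifications
    return "ok"


def monthly_chargeback(usage: BucketUsage, price_per_tib_month: float) -> float:
    """Bill the owning team for what the bucket actually holds.

    The price is an assumed internal rate, not a real vendor price.
    """
    tib = usage.used_bytes / 2**40
    return round(tib * price_per_tib_month, 2)


if __name__ == "__main__":
    TIB = 2**40
    bucket = BucketUsage("analytics", used_bytes=75 * TIB, soft_quota_bytes=100 * TIB)
    print(bucket.name, soft_quota_status(bucket), monthly_chargeback(bucket, 10.0))
```

A bucket that grows past its soft quota simply stays in the "page" state, so applications keep writing while the owners are alerted and billed.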
What is the bigger picture that you're seeing of object storage and the ecosystem?

AB: Yeah. So, when we started, right, our idea was that world is going to produce incredible amount of data. In ten years from now, we are going to drown in data. We've been saying that today and it will be true. Every year, you say ten years from now and it will still be valid, right?

That was the reason for us to play this game. And we saw that every one of these cloud players were incompatible with each other. It's like early Unix days, right? Like a bunch of operating systems, everything was incompatible and applications were beginning to adopt this new standard, but they were stuck. And then the cloud storage players, whatever they had, like, GCS can only run inside Google Cloud, S3 can only run inside AWS, and the cloud player's game was bring all the world's data into the cloud.

And that actually requires enormous amount of bandwidth. And moving data into the cloud at that scale, if you look at the amount of data the world is producing, if the data is produced inside the cloud, it's a different game, but the data is produced everywhere else. MinIO's idea was that instead of introducing yet another API standard, Amazon got the architecture right and that's the right way to build large-scale infrastructure. If we stick to Amazon S3 API instead of introducing yet another standard, [unintelligible 00:14:40] API, and then go after the world's data. When we started in 2014 November—it's really 2015, we started, it was laughable. People thought that there won't be a need for MinIO because the whole world will basically go to AWS S3 and they will be the world's data store. Amazon is capable of doing that; the race is not over, right?

Corey: And it still couldn't be done now. The thing is that they would need to fundamentally rethink their, frankly, usurious data egress charges.
The problem is not that it's expensive to store data in AWS; it's that it's expensive to store data and then move it anywhere else for analysis or use on something else. So, there are entire classes of workload that people should not consider the big three cloud providers as the place where that data should live because you're never getting it back.

AB: Spot on, right? Even if network is free, right, Amazon makes, like, okay, zero egress-ingress charge, the data we're talking about, like, most of MinIO deployments, they start at petabytes. Like, one to ten petabyte, feels like 100 terabyte. For even if network is free, try moving a ten-petabyte infrastructure into the cloud. How are you going to move it?

Even with FedEx and UPS giving you a lot of bandwidth in their trucks, it is not possible, right? I think the data will continue to be produced everywhere else. So, our bet was there we will be [unintelligible 00:15:56]—instead of you moving the data, you can run MinIO where there is data, and then the whole world will look like AWS's S3 compatible object store. We took a very different path. But now, when I say the same story that when what we started with day one, it is no longer laughable, right?

People believe that yes, MinIO is there because our market footprint is now larger than Amazon S3. And as it goes to production, customers are now realizing it's basically growing inside a shadow IT and eventually businesses realize the bulk of their business-critical data is sitting on MinIO and that's how it's surfacing up. So now, what we are seeing, this year particularly, all of these customers are hugely concerned about cost optimization. And as part of the journey, there is also multi-cloud and hybrid-cloud initiatives. They want to make sure that their application can run on any cloud or on the same software can run on their colos like Equinix, or like bunch of, like, Digital Realty, anywhere.

And MinIO's software, this is what we set out to do.
MinIO can run anywhere inside the cloud, all the way to the edge, even on Raspberry Pi. It's now—whatever we started with has now become reality; the timing is perfect for us.

Corey: One of the challenges I've always had with the idea of building an application with the idea to run it anywhere is you can make explicit technology choices around that, and for example, object store is a great example because most places you go now will or can have an object store available for your use. But there seem to be implementation details that get lost. And for example, even load balancers wind up being implemented in different ways with different scaling times and whatnot in various environments. And past a certain point, it's okay, we're just going to have to run it ourselves on top of HAproxy or Nginx, or something like it, running in containers themselves; you're reinventing the wheel. Where is that boundary between, we're going to build this in a way that we can run anywhere and the reality that I keep running into, which is we tried to do that but we implicitly without realizing it built in a lot of assumptions that everything would look just like this environment that we started off in.

AB: The good part is that if you look at the S3 API, every request has the site name, the endpoint, bucket name, the path, and the object name. Every request is completely self-contained. It's literally a HTTP call away. And this means that whether your application is running on Android, iOS, inside a browser, JavaScript engine, anywhere across the world, they don't really care whether the bucket is served from EU or us-east or us-west. It doesn't matter at all, so it actually allows you by API, you can build a globally unified data infrastructure, some buckets here, some buckets there.

That's actually not the problem. The problem comes when you have multiple clouds.
Different teams, like, part M&A, the part—like they—even if you don't do M&A, different teams, no two data engineers would agree on the same software stack. Then where they will all end up with different cloud players and some is still running on old legacy environment.

When you combine them, the problem is, like, let's take just the cloud, right? How do I even apply a policy, that access control policy, how do I establish unified identity? Because I want to know this application is the only one who is allowed to access this bucket. Can I have that same policy on Google Cloud or Azure, even though they are different teams? Like if that employer, that project, or that admin, if he or she leaves the job, how do I make sure that that's all protected?

You want unified identity, you want unified access control policies. Where are the encryption key store? And then the load balancer itself, the load, its—load balancer is not the problem. But then unless you adopt S3 API as your standard, the definition of what a bucket is different from Microsoft to Google to Amazon.

Corey: Yeah, the idea of the PUTs and retrieving of actual data is one thing, but then you have how do you manage it at the control plane layer of the object store and how do you rationalize that? What are the naming conventions? How do you address it? I even ran into something similar somewhat recently when I was doing an experiment with one of the Amazon Snowball edge devices to move some data into S3 on a lark. And the thing shows up and presents itself on the local network as an S3 endpoint, but none of their tooling can accept a different endpoint built into the configuration files; you have to explicitly use it as an environment variable or as a parameter on every invocation of something that talks to it, which is incredibly annoying.

I would give a lot for just to be able to say, oh, when you're talking in this profile, that's always going to be your S3 endpoint. Go. But no, of course not.
Because that would make it easier to use something that wasn't them, so why would they ever be incentivized to bake that in?

AB: Yeah. Snowball is an important element to move data, right? That's the UPS and FedEx way of moving data, but what I find customers doing is they actually use the tools that we built for MinIO because the Snowball appliance also looks like S3 API-compatible object store. And in fact, like, I've been told that, like, when you want to ship multiple Snowball appliances, they actually put MinIO to make it look like one unit because MinIO can erasure code objects across multiple Snowball appliances. And the MC tool, unlike AWS CLI, which is really meant for developers, like low-level calls, MC gives you unique [scoring 00:21:08] tools, like ls, cp, rsync-like tools, and it's easy to move and copy and migrate data. Actually, that's how people deal with it.

Corey: Oh, God. I hadn't even considered the problem of having a fleet of Snowball edges here that you're trying to do a mass data migration on, which is basically how you move petabyte-scale data, is a whole bunch of parallelism. But having to figure that out on a case-by-case basis would be nightmarish. That's right, there is no good way to wind up doing that natively.

AB: Yeah. In fact, Western Digital and a few other players, too, now the Western Digital created a Snowball-like appliance and they put MinIO on it. And they are actually working with some system integrators to help customers move lots of data. But Snowball-like functionality is important and more and more customers need it.

Corey: This episode is sponsored in part by Honeycomb. I'm not going to dance around the problem. Your. Engineers. Are. Burned. Out. They're tired from pagers waking them up at 2 am for something that could have waited until after their morning coffee. Ring Ring, Who's There? It's Nagios, the original call of duty!
They're fed up with relying on two or three different “monitoring tools” that still require them to manually trudge through logs to decipher what might be wrong. Simply put, there's a better way. Observability tools like Honeycomb (and very little else because they do admittedly set the bar) show you the patterns and outliers of how users experience your code in complex and unpredictable environments so you can spend less time firefighting and more time innovating. It's great for your business, great for your engineers, and, most importantly, great for your customers. Try FREE today at honeycomb.io/screaminginthecloud. That's honeycomb.io/screaminginthecloud.

Corey: Increasingly, it felt like, back in the on-prem days, that you'd have a file server somewhere that was either a SAN or it was going to be a NAS. The question was only whether it presented it to various things as a volume or as a file share. And then in cloud, the default storage mechanism, unquestionably, was object store. And now we're starting to see it come back again. So, it started to increasingly feel, in a lot of ways, like Cloud is no longer so much a place that is somewhere else, but instead much more of an operating model for how you wind up addressing things.

I'm wondering when the generation of prosumer networking equipment, for example, is going to say, “Oh, and send these logs over to what object store?” Because right now, it's still write a file and SFTP it somewhere else, at least the good ones; some of the crap ones still want old unencrypted FTP, which is neither here nor there. But I feel like it's coming back around again. Like, when do even home users wind up instead of where do you save this file to having the cloud abstraction, which hopefully, you'll never have to deal with an S3-style endpoint, but that can underpin an awful lot of things. It feels like it's coming back and that cloud is the de facto way of thinking about things. Is that what you're seeing?
Does that align with your belief on this?

AB: I actually, fundamentally believe in the long run, right, applications will go SaaS, right? Like, if you remember the days that you used to install QuickBooks and ACT and stuff, like, on your data center, you used to run your own Exchange servers, like, those days are gone. I think these applications will become SaaS. But then the infrastructure building blocks for these SaaS, whether they are cloud or their own colo, I think that in the long run, it will be multi-cloud and colo all combined and all of them will look alike.

But what I find from the customer's journey, the Old World and the New World is incompatible. When they shifted from bare metal to virtualization, they didn't have to rewrite their application. But this time, it is a tectonic shift. Every single application, you have to rewrite. If you retrofit your application into the cloud, bad idea, right? It's going to cost you more and I would rather not do it.

Even though cloud players are trying to make, like, the file and block, like, file system services [unintelligible 00:24:01] and stuff, they make it available ten times more expensive than object, but it's just to [integrate 00:24:07] some legacy applications, but it's still a bad idea to just move legacy applications there. But what I'm finding is that the cost, if you still run your infrastructure with enterprise IT mindset, you're out of luck. It's going to be super expensive and you're going to be left out modern infrastructure, because of the scale, it has to be treated as code. You have to run infrastructure with software engineers. And this cultural shift has to happen.

And that's why cloud, in the long run, everyone will look like AWS and we always said that and it's now becoming true. Like, Kubernetes and MinIO basically is leveling the ground everywhere. It's giving ECS and S3-like infrastructure inside AWS or outside AWS, everywhere.
But what I find the challenging part is the cultural mindset. If they still have the old cultural mindset and if they want to adopt cloud, it's not going to work.

You have to change the DNA, the culture, the mindset, everything. The best way to do it is go to the cloud-first. Adopt it, modernize your application, learn how to run and manage infrastructure, then ask economics question, the unit economics. Then you will find the answers yourself.

Corey: On some level, that is the path forward. I feel like there's just a very long tail of systems that have been working and have been meeting the business objective. And well, we should go and refactor this because, I don't know, a couple of folks on a podcast said we should isn't the most compelling business case for doing a lot of it. It feels like these things sort of sit there until there is more upside than just cost-cutting to changing the way these things are built and run. That's the reason that people have been talking about getting off of mainframe since the '90s in some companies, and the mainframe is very much still there. It is so ingrained in the way that they do business, they have to rethink a lot of the architectural things that have sprung up around it.

I'm not trying to shame anyone for the [laugh] state that their environment is in. I've never yet met a company that was super proud of its internal infrastructure. Everyone's always apologizing because it's a fire. But they think someone else has figured this out somewhere and it all runs perfectly. I don't think it exists.

AB: What I am finding is that if you are running it the enterprise IT style, you are the one telling the application developers, here you go, you have this many VMs and then you have, like, a VMware license and, like, Jboss, like WebLogic, and like a SQL Server license, now you go build your application, you won't be able to do it.
Because application developers talk about Kafka and Redis and like Kubernetes, they don't speak the same language. And that's when these developers go to the cloud and then finish their application, take it live from zero lines of code before IT can procure infrastructure and provision it to these guys. The change that has to happen is how can you give what the developers want now that reverse journey is also starting. In the long run, everything will look alike, but what I'm finding is if you're running enterprise IT infrastructure, traditional infrastructure, they are ashamed of talking about it.

But then you go to the cloud and then at scale, some parts of it, you want to move for—now you really know why you want to move. For economic reasons, like, particularly the data-intensive workloads becomes very expensive. And at that part, they go to a colo, but leave the applications on the cloud. So, it's the multi-cloud model, I think, is inevitable. The expensive pieces that where you can—if you are looking at yourself as hyperscaler and if your data is growing, if your business focus is data-centric business, parts of the data and data analytics, ML workloads will actually go out, if you're looking at unit economics. If all you are focused on productivity, stick to the cloud and you're still better off.

Corey: I think that's a divide that gets lost sometimes. When people say, “Oh, we're going to move to the cloud to save money.” It's, “No you're not.” At a five-year time horizon, I would be astonished if that juice were worth the squeeze in almost any scenario. The reason you go there is for a capability story when it's right for you.

That also means that steady-state workloads that are well understood can often be run more economically in a place that is not the cloud. Everyone thinks for some reason that I tend to be “it's cloud or it's trash.”
No, I'm a big fan of doing things that are sensible and cloud is not the right answer for every workload under the sun. Conversely, when someone says, “Oh, I'm building a new e-commerce store,” or whatnot, “And I've decided cloud is not for me.” It's, “Ehh, you sure about that?”

That sounds like you are smack-dab in the middle of the cloud use case. But all these things wind up acting as constraints and strategic objectives. And technology and single-vendor answers are rarely going to be a panacea the way that their sales teams say that they will.

AB: Yeah. And I find, like, organizations that have SREs, DevOps, and software engineers running the infrastructure, they actually are ready to go multi-cloud or go to colo because they have the—exactly know. They have the containers and Kubernetes microservices expertise. If you are still on a traditional SAN, NAS, and VM architecture, go to cloud, rewrite your application.

Corey: I think there's a misunderstanding in the ecosystem around what cloud repatriation actually looks like. Everyone claims it doesn't exist because there's basically no companies out there worth mentioning that are, “Yep, we've decided the cloud is terrible, we're taking everything out and we are going to data centers. The end.” In practice, it's individual workloads that do not make sense in the cloud. Sometimes just the back-of-the-envelope analysis means it's not going to work out, other times during proof of concepts, and other times, as things have hit a certain point of scale, where an individual workload being pulled back makes an awful lot of sense. But everything else is probably going to stay in the cloud and these companies don't want to wind up antagonizing the cloud providers by talking about it in public. But that model is very real.

AB: Absolutely.
Actually, what we are finding with the application side, like, parts of their overall ecosystem, right, within the company, they run on the cloud, but the data side, some of the examples, like, these are in the range of 100 to 500 petabytes. The 500-petabyte customer actually started at 500 petabytes and their plan is to go at exascale. And they are actually doing repatriation because for them, their customers, it's consumer-facing and it's extremely price sensitive, but when you're consumer-facing, every dollar you spend counts. And if you don't do it at scale, it matters a lot, right? It will kill the business.

Particularly last two years, the cost part became an important element in their infrastructure, they knew exactly what they want. They are thinking of themselves as hyperscalers. They get commodity—the same hardware, right, just a server with a bunch of [unintelligible 00:30:35] and network and put it on colo or even lease these boxes, they know what their demand is. Even at ten petabytes, the economics starts impacting. If you're processing it, the data side, we have several customers now moving to colo from cloud and this is the range we are talking about.

They don't talk about it publicly because sometimes, like, you don't want to be anti-cloud, but I think for them, they're also not anti-cloud. They don't want to leave the cloud. The completely leaving the cloud, it's a different story. That's not the case. Applications stay there. Data lakes, data infrastructure, object store, particularly if it goes to a colo.

Now, your applications from all the clouds can access this centralized—centralized, meaning that one object store you run on colo and the colos themselves have worldwide data centers. So, you can keep the data infrastructure in a colo, but applications can run on any cloud, some of them, surprisingly, that they have global customer base. And not all of them are cloud.
Sometimes like some applications itself, if you ask what type of edge devices they are running, edge data centers, they said, it's a mix of everything. What really matters is not the infrastructure. Infrastructure in the end is CPU, network, and drive. It's a commodity. It's really the software stack, you want to make sure that it's containerized and easy to deploy, roll out updates, you have to learn the Facebook-Google style running SaaS business. That change is coming.

Corey: It's a matter of time and it's a matter of inevitability. Now, nothing ever stays the same. Everything always inherently changes in the full sweep of things, but I'm pretty happy with where I see the industry going these days. I want to start seeing a little bit less centralization around one or two big companies, but I am confident that we're starting to see an awareness of doing these things for the right reason more broadly permeating.

AB: Right. Like, the competition is always great for customers. They get to benefit from it. So, the decentralization is a path to bringing—like, commoditizing the infrastructure. I think the bigger picture for me, what I'm particularly happy is, for a long time we carried industry baggage in the infrastructure space.

If no one wants to change, no one wants to rewrite application. As part of the equation, we carried the, like, POSIX baggage, like SAN and NAS. You can't even do [unintelligible 00:32:48] as a Service, NFS as a Service. It's too much of a baggage. All of that is getting thrown out. Like, the cloud players helped the customers start with a clean slate. I think to me, that's the biggest advantage. And that now we have a clean slate, we can now go on a whole new evolution of the stack, keeping it simpler and everyone can benefit from this change.

Corey: Before we wind up calling this an episode, I do have one last question for you.
As I mentioned at the start, you're very much open-source, as in legitimate open-source, which means that anyone who wants to can grab an implementation and start running it. How do you, I guess make peace with the fact that the majority of your user base is not paying you? And I guess how do you get people to decide, “You know what? We like the cut of his jib. Let's give him some money.”

AB: Mm-hm. Yeah, if I looked at it that way, right, I have both the [unintelligible 00:33:38], right, on the open-source side as well as the business. But I don't see them to be conflicting. If I run as a charity, right, like, I take donation. If you love the product, here is the donation box, then that doesn't work at all, right?

I shouldn't take investor money and I shouldn't have a team because I have a job to pay their bills, too. But I actually find open-source to be incredibly beneficial. For me, it's about delivering value to the customer. If you pay me $5, I ought to make you feel $50 worth of value. The same software you would buy from a proprietary vendor, why would—if I'm a customer, same software equal in functionality, if it's proprietary, I would actually prefer open-source and pay even more.

But why are, really, customers paying me now and what's our view on open-source? I'm actually the free software guy. Free software and open-source are actually not exactly equal, right? We are the purists of the open-source community and we have strong views on what open-source means, right. That's why we call it free software. And free here means freedom, right? Free does not mean gratis, that free of cost. It's actually about freedom and I deeply care about it.

For me it's a philosophy and it's a way of life. That's why I don't believe in open core and other models that holding—giving crippleware is not open-source, right? I give you some freedom but not all, right, like, it breaks the spirit.
So, MinIO is a hundred percent open-source, but it's open-source for the open-source community. We did not take some community-developed code and then add commercial support on top.

We built the product, we believed in open-source, we still believe and we will always believe. Because of that, we open-sourced our work. And it's open-source for the open-source community. And as you build applications that—like the AGPL license on the derivative works, they have to be compatible with AGPL because we are the creator. If you cannot open-source your application derivative works, you can buy a commercial license from us. We are the creator, we can give you a dual license. That's how the business model works.

That way, the open-source community completely benefits. And it's about the software freedom. There are customers, for them, open-source is a good thing and they want to pay because it's open-source. There are some customers that they want to pay because they can't open-source their application and derivative works, so they pay. It's a happy medium; that way I actually find open-source to be incredibly beneficial.

Open-source gave us that trust, like, more than adoption rate. It's not like free to download and use. More than that, the customers that matter, the community that matters because they can see the code and they can see everything we did, it's not because I said so, marketing and sales, you believe them, whatever they say. You download the product, experience it and fall in love with it, and then when it becomes an important part of your business, that's when they engage with us because they talk about license compatibility and data loss or a data breach, all that becomes important. Open-source isn't—I don't see that to be conflicting for business. It actually is incredibly helpful. And customers see that value in the end.

Corey: I really want to thank you for being so generous with your time.
If people want to learn more, where should they go?

AB: I was on Twitter and now I think I'm spending more time on, maybe, LinkedIn. I think if they—they can send me a request and then we can chat. And I'm always, like, spending time with other entrepreneurs, architects, and engineers, sharing what I learned, what I know, and learning from them. There is also a [community open channel 00:37:04]. And just send me a mail at ab@min.io and I'm always interested in talking to our user base.

Corey: And we will, of course, put links to that in the [show notes 00:37:12]. Thank you so much for your time. I appreciate it.

AB: It's wonderful to be here.

Corey: AB Periasamy, CEO and co-founder of MinIO. I'm Cloud Economist Corey Quinn and this has been a promoted guest episode of Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice that presumably will also include an angry, loud comment that we can access from anywhere because of shared APIs.

Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.

MLOps.community
The Rise of Serverless Databases // Alex DeBrie // MLOps Podcast #147

MLOps.community

Play Episode Listen Later Feb 28, 2023 58:03


MLOps Coffee Sessions #147 with Alex DeBrie, Something About Databases co-hosted by Abi Aryan. // Abstract For databases, it feels like we're in the middle of a big shift. The first 10-15 years of the cloud were mostly about using the same core infrastructure patterns but in the cloud (SQL Server, MySQL, Postgres, Redis, Elasticsearch). In the last few years, we're finally seeing data infrastructure that is truly built for the cloud. Elastic, scalable, resilient, managed, etc. Early examples were Snowflake + DynamoDB. The most recent ones are all the 'NewSQL' contenders (Cockroach, Yugabyte, Spanner) or the 'serverless' ones (Neon, Planetscale). Also seeing improvements in caching, search, etc. Exciting times! // Bio Alex is an AWS Data Hero and self-employed AWS consultant and trainer. He is the author of The DynamoDB Book, a comprehensive guide to data modeling with DynamoDB. Previously, he worked for Stedi and for Serverless, Inc., creators of the Serverless Framework. He loves being involved in the AWS & serverless community, and he lives in Omaha, NE with his family. // MLOps Jobs board https://mlops.pallet.xyz/jobs // MLOps Swag/Merch https://mlops-community.myshopify.com/ // Related Links Key Takeaways from the DynamoDB Paper: https://www.alexdebrie.com/posts/dynamodb-paper/ Understanding Eventual Consistency in DynamoDB: https://www.alexdebrie.com/posts/dynamodb-eventual-consistency/ Two Scoops of Django 1.11: Best Practices for the Django Web Framework: https://www.amazon.com/Two-Scoops-Django-1-11-Practices/dp/0692915729 CAP or no CAP?
Understanding when the CAP theorem applies and what it means: https://www.alexdebrie.com/posts/when-does-cap-theorem-apply/ Stop fighting your database/ The DynamoDB book: https://dynamodbbook.com/ --------------- ✌️Connect With Us ✌️ ------------- Join our slack community: https://go.mlops.community/slack Follow us on Twitter: @mlopscommunity Sign up for the next meetup: https://go.mlops.community/register Catch all episodes, blogs, newsletters, and more: https://mlops.community/ Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/ Connect with Abi on LinkedIn: https://www.linkedin.com/in/abiaryan/ Connect with Alex on LinkedIn: https://www.linkedin.com/in/alex-debrie/ Timestamps: [00:00] Alex's preferred coffee [00:27] Introduction to Alex DeBrie and DynamoDB [01:05] Takeaways [03:47] Please write down your reviews and you might have a book of Alex! [04:57] Alex's journey from being an Attorney to becoming a Data Engineer [07:31] Why engineering? [10:07] Serverless Data [12:54] Before Airflow [15:41] Batch vs streaming [17:03] Difficulties in Batch and Streaming side [19:45] Modern data infrastructure databases [24:37] Cloud Ecosystem Maturity [27:59] New generation type of Snowflake [29:47] Comparing databases [30:58] What's next on connectors from 2 perspectives? [34:25] Management services at the MLOps level [36:38] DynamoDB [39:32] Why do you like DynamoDB? [41:00] Data used in DynamoDB and size limits [43:46] Comparison of tradeoffs between DynamoDB and Redis [45:52] Preferred opinionated databases [48:43] CAP or no CAP? Understanding when the CAP theorem applies and what it means [52:10] The DynamoDB book [56:17] Chapter you want to expand on the book [57:43] Next book to write [59:25] ChatGPT iterations [1:01:59] Data modeling book wished to be written [1:03:27] Wrap up

Red Hat X Podcast Series
As a beginner, what programming language should you start with? feat. Denis Magda at Yugabyte

Red Hat X Podcast Series

Play Episode Listen Later Jan 4, 2023 37:31


You are about to explore computer science and start developing the first applications. What should be your first programming language? Should it be adorable JavaScript, glorious Python, legendary Java, or... something else? Well, as always, it depends. Join Brian and Denis Magda, Head of Developer Relations at Yugabyte, in reflecting on their experiences in an attempt to find that mysterious programming language X for beginners.

Data on Kubernetes Community
Weathering The Cloud Storm- Modern Data Management Patterns for Reliability and Availability (DoK Day EU 2022) // Denis Magda

Data on Kubernetes Community

Play Episode Listen Later May 28, 2022 10:46


https://go.dok.community/slack https://dok.community/ From the DoK Day EU 2022 (https://youtu.be/Xi-h4XNd5tE) “Zero downtime” and “always-on” are illusions. All systems fail sooner or later, whether it's a regional e-commerce website or a major cloud region hosting thousands of applications. That's why, instead of chasing these illusions, it's worth focusing on the nines of availability. Based on true stories, this session walks you through modern data availability and reliability patterns used by architects whose applications withstood major cloud outages. With the focus on the data storage layer and Kubernetes, you'll learn:

* How to architect the data layer in Kubernetes with the server, zone, and region-level resiliency in mind.
* How to find a compromise between latency and availability for multi-region deployments.
* How to ensure the data layer remains reliable (i.e., always returns expected data) even during a major incident.

Denis Magda has spent half of his career working on distributed systems, applications, and databases. His experience spans from the development of distributed database engines and high-performance applications to training and education on the topic of distributed and cloud computing. Presently, Denis runs the Developer Relations team at Yugabyte and serves as a PMC Member for Apache Ignite. He started his professional career at Sun Microsystems and Oracle, where he led one of the Java development groups and worked on technology evangelism efforts.

Hashmap on Tap
#129 The Evolution of Data Engineering with Richie Bachala at Yugabyte

Hashmap on Tap

Play Episode Listen Later May 24, 2022 45:41


Welcome back, Richie Bachala! Richie is a Principal at Yugabyte, a distributed SQL database built to power global-scale, cloud-native applications. On this episode, Richie joins us to walk us through his latest blog post about the Data Engineering Center of Excellence, current trends for where investment is going in data, and his current role at Yugabyte. Listen in to hear some fantastic perspectives on data engineering and what areas you should be keeping an eye on. Show notes: Check out Yugabyte: https://www.yugabyte.com/ Join the Yugabyte Slack community: https://communityinviter.com/apps/yugabyte-db/register Read Richie's blog post: https://towardsdatascience.com/building-a-data-engineering-center-of-excellence-b83d51cedb6a Connect with Richie on LinkedIn: https://www.linkedin.com/in/richiebachala/ Follow Richie on Twitter: https://twitter.com/richiebachala On tap for today's episode: Sparkling Iced Tea and Bulletproof Coffee Contact Us: https://www.hashmapinc.com/reach-out

Percona's HOSS Talks FOSS:  The Open Source Database Podcast
Modernize Relational Databases Through a Cloud-Native Approach – The HOSS 67 /w Denis Magda

Percona's HOSS Talks FOSS: The Open Source Database Podcast

Play Episode Listen Later May 13, 2022 33:19


Distributed SQL (also called New SQL) aims to modernize relational databases by bringing better availability, scale, and performance through a cloud-native approach. Yugabyte is one of the leaders at the forefront of this movement by marrying the rock-solid PostgreSQL client & protocol with a brand new cloud-native backend. Denis Magda stops by and chats with the HOSS not only about Yugabyte but also about his work on various Apache projects.

Paparelli Podcast
Major Success From Major Accounts

Paparelli Podcast

Play Episode Listen Later Apr 29, 2022 87:25


He sold $60mm in software in one year. Russ is clearly the most successful major account seller I know. Russ West is a Senior Account Executive with a red-hot Silicon Valley startup named Yugabyte. In their last funding round, they were valued at $1.2b. I am sure Russ contributed to their valuation by helping grow their SaaS sales in the F50 right here in Atlanta. In this conversation, Russ tells us about his networking approach to prospecting, his service approach to relationship retention, and his methods for managing major accounts with over one hundred stakeholders. And it doesn't end there. Entrepreneurs, business owners, and sales execs should listen to this podcast. I guarantee you'll walk away with some great ideas on how to increase your sales success and how to move toward higher average deals.

Screaming in the Cloud
Developing Storage Solutions Before the Rest with AB Periasamy

Screaming in the Cloud

Play Episode Listen Later Feb 2, 2022 38:54


About AB

AB Periasamy is the co-founder and CEO of MinIO, an open source provider of high performance, object storage software. In addition to this role, AB is an active investor and advisor to a wide range of technology companies, from H2O.ai and Manetu where he serves on the board to advisor or investor roles with Humio, Isovalent, Starburst, Yugabyte, Tetrate, Postman, Storj, Procurify, and Helpshift. Successful exits include Gitter.im (Gitlab), Treasure Data (ARM) and Fastor (SMART).

AB co-founded Gluster in 2005 to commoditize scalable storage systems. As CTO, he was the primary architect and strategist for the development of the Gluster file system, a pioneer in software defined storage. After the company was acquired by Red Hat in 2011, AB joined Red Hat's Office of the CTO. Prior to Gluster, AB was CTO of California Digital Corporation, where his work led to scaling of the commodity cluster computing to supercomputing class performance. His work there resulted in the development of Lawrence Livermore Laboratory's “Thunder” code, which, at the time was the second fastest in the world. AB holds a Computer Science Engineering degree from Annamalai University, Tamil Nadu, India.

AB is one of the leading proponents and thinkers on the subject of open source software - articulating the difference between the philosophy and business model. An active contributor to a number of open source projects, he is a board member of India's Free Software Foundation.

Links:
MinIO: https://min.io/
Twitter: https://twitter.com/abperiasamy
MinIO Slack channel: https://minio.slack.com/join/shared_invite/zt-11qsphhj7-HpmNOaIh14LHGrmndrhocA
LinkedIn: https://www.linkedin.com/in/abperiasamy/

Transcript

Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn.
This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.

Corey: This episode is sponsored in part by our friends at Sysdig. Sysdig is the solution for securing DevOps. They have a blog post that went up recently about how an insecure AWS Lambda function could be used as a pivot point to get access into your environment. They've also gone deep in-depth with a bunch of other approaches to how DevOps and security are inextricably linked. To learn more, visit sysdig.com and tell them I sent you. That's S-Y-S-D-I-G dot com. My thanks to them for their continued support of this ridiculous nonsense.

Corey: This episode is sponsored in part by our friends at Rising Cloud, which I hadn't heard of before, but they're doing something vaguely interesting here. They are using AI, which is usually where my eyes glaze over and I lose attention, but they're using it to help developers be more efficient by reducing repetitive tasks. So, the idea being that you can run stateless things without having to worry about scaling, placement, et cetera, and the rest. They claim significant cost savings, and they're able to wind up taking what you're running as it is, in AWS, with no changes, and run it inside of their data centers that span multiple regions. I'm somewhat skeptical, but their customers seem to really like them, so that's one of those areas where I really have a hard time being too snarky about it because when you solve a customer's problem, and they get out there in public and say, “We're solving a problem,” it's very hard to snark about that. Multus Medical, Construx.ai, and Stax have seen significant results by using them, and it's worth exploring. So, if you're looking for a smarter, faster, cheaper alternative to EC2, Lambda, or batch, consider checking them out.
Visit risingcloud.com/benefits. That's risingcloud.com/benefits, and be sure to tell them that I sent you because watching people wince when you mention my name is one of the guilty pleasures of listening to this podcast.

Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. I'm joined this week by someone who's doing something a bit off the beaten path when we talk about cloud. I've often said that S3 is sort of a modern wonder of the world. It was the first AWS service brought into general availability. Today's promoted guest is the co-founder and CEO of MinIO, Anand Babu Periasamy, or AB as he often goes, depending upon who's talking to him. Thank you so much for taking the time to speak with me today.

AB: It's wonderful to be here, Corey. Thank you for having me.

Corey: So, I want to start with the obvious thing, where you take a look at what is the cloud and you can talk about AWS's ridiculous high-level managed services, like Amazon Chime. Great, we all see how that plays out. And those are the higher-level offerings, ideally aimed at problems customers have, but then they also have the baseline building blocks services, and it's hard to think of a more baseline building block than an object store. That's something every cloud provider has, regardless of how many scare quotes there are around the word cloud; everyone offers the object store. And your solution is to look at this and say, “Ah, that's a market ripe for disruption. We're going to build through an open-source community software that emulates an object store.” I would be sitting here, more or less poking fun at the idea except for the fact that you're a billion-dollar company now.

AB: Yeah.

Corey: How did you get here?

AB: So, when we started, right, we did not actually think about cloud that way, right? “Cloud, it's a hot trend, and let's go disrupt is like that.
It will lead to a lot of opportunity.” Certainly, it's true, it led to the M&A, right, but that's not how we looked at it, right? It's a bad idea to build startups for M&A.

When we looked at the problem, when we got back into this—my previous background, some may not know that it's actually a distributed file system background in the open-source space.

Corey: Yeah, you were one of the co-founders of Gluster—

AB: Yeah.

Corey: —which I have only begrudgingly forgiven you. But please continue.

AB: [laugh]. And back then we got the idea right, but the timing was wrong. And I had—while the data was beginning to grow at a crazy rate, end of the day, GlusterFS has to still look like an FS, it has to look like a file system like NetApp or EMC, and it was hugely limiting what we can do with it. The biggest problem for me was legacy systems. I have to build a modern system that is compatible with a legacy architecture, you cannot innovate.

And that is where when Amazon introduced S3, back then, like, when S3 came, cloud was not big at all, right? When I look at it, the most important message of the cloud was Amazon basically threw everything that is legacy. It's not [iSCSI 00:03:21] as a Service; it's not even FTP as a Service, right? They came up with a simple, RESTful API to store your blobs, whether it's JavaScript, Android, iOS, or [AAML 00:03:30] application, or even Snowflake-type application.

Corey: Oh, we spent ten years rewriting our apps to speak object store, and then they released EFS, which is NFS in the cloud. It's—

AB: Yeah.

Corey: —I didn't realize I could have just been stubborn and waited, and the whole problem would solve itself. But here we are. You're quite right.

AB: Yeah. And even EFS and EBS are more for legacy stock can come in, buy some time, but that's not how you should stay on AWS, right? When Amazon did that, for me, that was the opportunity.
I saw that… while the world is going to continue to produce lots and lots of data, if I built a brand around that, I'm not going to go wrong.

The problem is data at scale. And what do I do there? The opportunity I saw was, Amazon solved one of the largest problems for a long time. All the legacy systems, legacy protocols, they convinced the industry, throw them away and then start all over from scratch with the new API. While it's not compatible, it's not standard, it is ridiculously simple compared to anything else.

No fstabs, no [unintelligible 00:04:27], no [root 00:04:28], nothing, right? From any application anywhere you can access was a big deal. When I saw that, I was like, “Thank you Amazon.” And I also knew Amazon would convince the industry that rewriting their application is going to be better and faster and cheaper than retrofitting legacy applications.

Corey: I wonder how much that's retconned because talking to some of the people involved in the early days, they were not at all convinced they [laugh] would be able to convince the industry to do this.

AB: Actually, if you talk to the analyst reporters, the IDCs, Gartners of the world to the enterprise IT, the VMware community, they would say, “Hell no.” But if you talk to the actual application developers, data infrastructure, data architects, the actual consumers of data, for them, it was so obvious. They actually did not know how to write an fstab. The iSCSI and NFS, you can't even access across the internet, and the modern applications, they ran across the globe, in JavaScript, and all kinds of apps on the device. From [Snap 00:05:21] to Snowflake, today is built on object store. It was more natural for the applications team, but not from the infrastructure team. So, who you asked mattered.

But nevertheless, Amazon convinced the rest of the world, and our bet was that if this is going to be the future, then this is also our opportunity. S3 is going to be limited because it only runs inside AWS.
Bulk of the world's data is produced everywhere and only a tiny fraction will go to AWS. And where will the rest of the data go? Not SAN, NAS, HDFS, or other blob store, Azure Blob, or GCS; it's not going to be fragmented. And if we built a better object store, lightweight, faster, simpler, but fully compatible with S3 API, we can sweep and consolidate the market. And that's what happened.

Corey: And there is a lot of validity to that. We take a look across the industry, when we look at various standards—I mean, one of the big problems with multi-cloud in many respects is the APIs are not quite similar enough. And worse, the failure patterns are very different: I don't just need to know how the load balancer works, I need to know how it breaks so I can detect and plan for that. And then you've got the whole identity problem as well, where you're trying to manage across different frames of reference as you go between providers, and leads to a bit of a mess. What is it that makes MinIO something that has been not just something that has endured since it was created, but clearly been thriving?

AB: The real reason, actually, is not the multi-cloud compatibility, all that, right? Like, while today, it is a big deal for the users because the deployments have grown into 10-plus petabytes, and now the infrastructure team is taking it over and consolidating across the enterprise, so now they are talking about which key management server for storing the encrypted keys, which key management server should I talk to? Look at AWS, Google, or Azure, everyone has their own proprietary API. Outside, they have [YAML2 00:07:18], HashiCorp Vault, and, like, there is no standard here. It is supposed to be a [KMIP 00:07:23] standard, but in reality, it is not. Even different versions of Vault, there are incompatibilities for us.

That is where—like from Key Management Server, Identity Management Server, right, like, everything that you speak around, how do you talk to different ecosystem?
That, actually, MinIO provides connectors; having the large ecosystem support and large community, we are able to address all that. Once you bring MinIO into your application stack like you would bring Elasticsearch or MongoDB or anything else as a container, your application stack is just a Kubernetes YAML file, and you roll it out on any cloud, it becomes easier for them, they're able to go to any cloud they want. But the real reason why it succeeded was not that. They actually wrote their applications as containers on Minikube, then they will push it on a CI/CD environment.

They never wrote code on EC2 or ECS writing objects on S3, and they don't like the idea of [PaaS 00:08:15], where someone is telling you just—like you saw Google App Engine never took off, right? They liked the idea, here are my building blocks. And then I would stitch them together and build my application. We were part of their application development since early days, and when the application matured, it was hard to remove. It is very much like Microsoft Windows when it grew, even though the desktop was Microsoft Windows, the server was NetWare; NetWare lost the game, right?

We got the ecosystem, and it was actually developer productivity, convenience, that really helped. The simplicity of MinIO, today, they are arguing that deploying MinIO inside AWS is easier through their YAML and containers than going to AWS Console and figuring out how to do it.

Corey: As you take a look at how customers are adopting this, it's clear that there is some shift in this because I could see the story for something like MinIO making an awful lot of sense in a data center environment because otherwise, it's, “Great. I need to make this app work with my SAN as well as an object store.” And that's sort of a non-starter for obvious reasons.
But now you're available through cloud marketplaces directly.

AB: Yeah.

Corey: How are you seeing adoption patterns and interactions from customers changing as the industry continues to evolve?

AB: Yeah, actually, that is how my thinking was when I started. If you are inside AWS, I would myself tell them that why don't you use AWS S3? And it made a lot of sense if it's on a colo or your own infrastructure, then there is an object store. It even made a lot of sense if you are deploying on Google Cloud, Azure, Alibaba Cloud, Oracle Cloud, it made a lot of sense because you wanted an S3 compatible object store. Inside AWS, why would you do it, if there is AWS S3?

Nowadays, I hear funny arguments, too. They're like, “Oh, I didn't know that I could use S3. Is S3 MinIO compatible?” Because they will be like, “It came along with the GitLab or GitHub Enterprise, a part of the application stack.” They didn't even know that they could actually switch it over.

And otherwise, most of the time, they developed it on MinIO, now they are too lazy to switch over. That also happens. But the real reason that why it became serious for me—I ignored that the public cloud commercialization; I encouraged the community adoption. And it grew to more than a million instances, like across the cloud, like small and large, but when they start talking about paying us serious dollars, then I took it seriously. And then when I start asking them, why would you guys do it, then I got to know the real reason why they wanted to do was they want to be detached from the cloud infrastructure provider.

They want to look at cloud as CPU, network, and drive as a service. And running their own enterprise IT was more expensive than adopting public cloud, it was productivity for them, reducing the infrastructure, people cost was a lot. It made economic sense.

Corey: Oh, people always cost more than the infrastructure itself does.

AB: Exactly right. 70, 80%, like, goes into people, right? And enterprise IT is too slow.
They cannot innovate fast, and all of those problems. But what I found was for us, while we actually build the community and customers, if you're on AWS, if you're running MinIO on EBS, EBS is three times more expensive than S3.

Corey: Or a single copy of it, too, where if you're trying to go multi-AZ and you have the replication traffic, and not to mention you have to over-provision it, which is a bit of a different story as well. So, like, it winds up being something on the order of 30 times more expensive, in many cases, to do it right. So, I'm looking at this going, the economics of running this purely by itself in AWS don't make sense to me—long experience teaches me the next question of, “What am I missing?” Not, “That's ridiculous and you're doing it wrong.” There's clearly something I'm not getting. What am I missing?

AB: I was telling them until we made some changes, right—because we saw a couple of things happen. I was initially like, [unintelligible 00:12:00] does not make 30 copies. It makes, like, 1.4x, 1.6x.

But still, the underlying block storage is not only three times more expensive than S3, it's also slow. It's a network storage. Trying to put an object store on top of it, another, like, software-defined SAN, like EBS made no sense to me. Smaller deployments, it's okay, but you should never scale that on EBS. So, it did not make economic sense. I would never take it seriously because it would never help them grow to scale.

But what changed in recent times? Amazon saw that this was not only a problem for MinIO-type players. Every database out there today, every modern database, even the message queues like Kafka, they all have gone scale-out. And they all depend on local block store and putting a scale-out distributed database, data processing engines on top of EBS would not scale. And Amazon introduced storage optimized instances.
Essentially, that reduced to bet—the data infrastructure guy, data engineer, or application developer asking IT, “I want a SuperMicro, or Dell server, or even virtual machines.” That's too slow, too inefficient.

They can provision these storage machines on demand, and then I can do it through Kubernetes. These two changes, all the public cloud players now adopted Kubernetes as the standard, and they have to stick to the Kubernetes API standard. If they are incompatible, they won't get adopted. And storage optimized that is local drives, these are machines, like, [I3 EN 00:13:23], like, 24 drives, they have SSDs, and fast network—like, 25-gigabit 200-gigabit type network—availability of these machines, like, what typically would run any database, HDFS cluster, MinIO, all of them, those machines are now available just like any other EC2 instance.

They are efficient. You can actually put MinIO side by side to S3 and still be price competitive. And Amazon wants to—like, just like their retail marketplace, they want to compete and be open. They have enabled it. In that sense, Amazon is actually helping us. And it turned out that now I can help customers build multiple petabyte infrastructure on Amazon and still stay efficient, still stay price competitive.

Corey: I would have said for a long time that if you were to ask me to build out the lingua franca of all the different cloud providers into a common API, the S3 API would be one of them. Now, you are building this out, multi-cloud, you're in all three of the major cloud marketplaces, and the way that you do that and do those deployments seems like it is the modern multi-cloud API of Kubernetes. When you first started building this, Kubernetes was very early on. What was the evolution of getting there? Or were you one of the first early-adoption customers in a Kubernetes space?

AB: So, when we started, there was no Kubernetes. But we saw the problem was very clear.
And there was containers, and then came Docker Compose and Swarm. Then there was Mesos, Cloud Foundry, you name it, right? Like, there was many solutions all the way up to even VMware trying to get into that space.

And what did we do? Early on, I couldn't choose. I couldn't—it's not in our hands, right, who is going to be the winner, so we just simply embrace everybody. It was also tiring to implement native connectors to all of them, different orchestration, like Pivotal Cloud Foundry alone, they have their own standard open service broker that's only popular inside their system. Go outside elsewhere, everybody was incompatible.

And outside that, even, Chef, Ansible, Puppet scripts, too. We just simply embraced everybody until the dust settled down. When it settled down, clearly a declarative model of Kubernetes became easier. Also Kubernetes developers understood the community well. And coming from Borg, I think they understood the right architecture. And also written in Go, unlike Java, right?

It actually matters, these minute new details resonating with the infrastructure community. It took off, and then that helped us immensely. Now, it's not only Kubernetes is popular, it has become the standard, from VMware to OpenShift to all the public cloud providers, GKS, AKS, EKS, whatever, right—GKE. All of them now are basically Kubernetes standard. It made not only our life easier, it made every other [ISV 00:16:11], other open-source project, everybody now can finally write one code that can be operated portably.

It is a big shift. It is not because we chose; we just watched all this, we were riding along the way. And then because we resonated with the infrastructure community, modern infrastructure is dominated by open-source.
We were also the leading open-source object store, and as Kubernetes community adopted us, we were naturally embraced by the community.

Corey: Back when AWS first launched with S3 as its first offering, there were a bunch of folks who were super excited, but object stores didn't make a lot of sense to them intrinsically, so they looked into this and, “Ah, I can build a file system in userspace on top of S3.” And the reaction was, “Holy God don't do that.” And the way that AWS decided to discourage that behavior is a per request charge, which for most workloads is fine, whatever, but there are some that causes a significant burden. With running something like MinIO in a self-hosted way, suddenly that costing doesn't exist in the same way. Does that open the door again to so now I can use it as a file system again, in which case that just seems like using the local file system, only with extra steps?

AB: Yeah.

Corey: Do you see patterns that are emerging with customers' use of MinIO that you would not see with the quote-unquote, “Provider's” quote-unquote, “Native” object storage option, or do the patterns mostly look the same?

AB: Yeah, if you took an application that ran on file and block and brought it over to object storage, that makes sense. But something that is competing with object store or a layer below object store, that is—end of the day, the drives are block devices, you have a block interface, right—trying to bring SAN or NAS on top of object store is actually a step backwards. They completely missed the message that Amazon told that if you brought a file system interface on top of object store, you missed the point, that you are now bringing the legacy things that Amazon intentionally removed from the infrastructure. Trying to bring them on top doesn't make it any better.
If you are arguing for compatibility with some legacy applications, sure, but writing a file system on top of object store will never be better than NetApp, EMC, like EMC Isilon, or anything else. Or even GlusterFS, right?

But if you want a file system, I always tell the community, they ask us, “Why don't you add an FS option and do a multi-protocol system?” I tell them that the whole point of S3 is to remove all those legacy APIs. If I added POSIX, then I'll be a mediocre object storage and a terrible file system. I would never do that. But why not write a FUSE file system, right? Like, S3Fs is there.

In fact, initially, for legacy compatibility, we wrote MinFS and I had to hide it. We actually archived the repository because immediately people started using it. Even simple things like end of the day, can I use Unix Coreutils like cp, ls, like, all these tools I'm familiar with? If it's not file system but object storage, then s3cmd or AWS CLI is, like, too bloatware. And it's not really Unix-like feeling.

Then what I told them, “I'll give you a BusyBox like a single static binary, and it will give you all the Unix tools that works for local filesystem as well as object store.” That's where the mc tool came; it gives you all the Unix-like programmability, all the core tool that's object storage compatible, speaks native object store. But if I have to make object store look like a file system so UNIX tools would run, it would not only be inefficient, Unix tools never scaled for this kind of capacity.

So, it would be a bad idea to take step backwards and bring legacy stuff back inside. For some very small case, if there are simple POSIX calls using ObjectiveFS, S3Fs, and few, for legacy compatibility reasons makes sense, but in general, I would tell the community don't bring file and block.
If you want file and block, leave those on virtual machines and leave that infrastructure in a silo and gradually phase them out.

Corey: This episode is sponsored in part by our friends at Vultr. Spelled V-U-L-T-R because they're all about helping save money, including on things like, you know, vowels. So, what they do is they are a cloud provider that provides surprisingly high performance cloud compute at a price that—while sure they claim it's better than AWS pricing—and when they say that they mean it is less money. Sure, I don't dispute that but what I find interesting is that it's predictable. They tell you in advance on a monthly basis what it's going to cost. They have a bunch of advanced networking features. They have nineteen global locations and scale things elastically. Not to be confused with openly, because apparently elastic and open can mean the same thing sometimes. They have had over a million users. Deployments take less than sixty seconds across twelve pre-selected operating systems. Or, if you're one of those nutters like me, you can bring your own ISO and install basically any operating system you want. Starting with pricing as low as $2.50 a month for Vultr cloud compute they have plans for developers and businesses of all sizes, except maybe Amazon, who stubbornly insists on having something to scale all on their own. Try Vultr today for free by visiting: vultr.com/screaming, and you'll receive $100 in credit. That's v-u-l-t-r.com slash screaming.

Corey: So, my big problem, when I look at what S3 has done is in its name because of course, naming is hard. It's, “Simple Storage Service.” The problem I have is with the word simple because over time, S3 has gotten more and more complex under the hood. It automatically tiers data the way that customers want.
And integrated with things like Athena, you can now query it directly, whenever an object appears, you can wind up automatically firing off Lambda functions and the rest.

And this is increasingly looking a lot less like a place to just dump my unstructured data, and increasingly, a lot like this is sort of a database, in some respects. Now, understand my favorite database is Route 53; I have a long and storied history of misusing services as databases. Is this one of those scenarios, or is there some legitimacy to the idea of turning this into a database?

AB: Actually, there is now S3 Select API that if you're storing unstructured data like CSV, JSON, Parquet, without downloading even a compressed CSV, you can actually send a SQL query into the system. In MinIO particularly the S3 Select is SIMD optimized. We can load, like, every 64k worth of CSV lines into registers and do SIMD operations. It's the fastest SQL filter out there. Now, bringing these kinds of capabilities, we are just a little bit away from a database; should we do database? I would tell definitely no.

The very strength of S3 API is to actually limit all the mutations, right? Particularly if you look at database, they're dealing with metadata, and querying; the biggest value they bring is indexing the metadata. But if I'm dealing with that, then I'm dealing with really small blocks, lots of mutations, the separation of object storage should be dealing with persistence and not mutations. Mutations are [AWS 00:21:57] problem. Separation of database work function and persistence function is where object storage got the storage right.

Otherwise, it will, they will make the mistake of doing POSIX-like behavior, and then not only bringing back all those capabilities, doing IOPS intensive workloads across the HTTP, it wouldn't make sense, right? So, object storage got the API right. But now should it be a database? So, it definitely should not be a database.
In fact, I actually hate the idea of Amazon yielding to the file system developers and giving a file-tree hierarchical namespace so they can write nice file managers.

That was a terrible idea. Writing a hierarchical namespace that's also sorted, now puts tax on how the metadata is indexed and organized. Amazon should have left the core API very simple and told them to solve these problems outside the object store. Many application developers don't need it. Amazon was trying to satisfy everybody's need. Saying no to some of these file system-type, file manager-type users, would have been the right way.

But nevertheless, adding those capabilities, eventually, now you can see, S3 is no longer simple. And we had to keep that compatibility, and I hate that part. I actually don't mind compatibility, but then doing all the wrong things that Amazon is adding, now I have to add because it's compatible. I kind of hate that, right?

But now going to a database would be pushing it to the whole new level. Here is the simple reason why that's a bad idea. The right way to do database—in fact, the database industry is already going in the right direction. Unstructured data, the key-value or graph, different types of data, you cannot possibly solve all that even in a single database. They are trying to be multimodal database; even they are struggling with it.

You can never be a Redis, Cassandra, like, a SQL all-in-one. They tried to say that but in reality, that you will never be better than any one of those focused database solutions out there. Trying to bring that into object store will be a mistake. Instead, let the databases focus on query language implementation and query computation, and leave the persistence to object store.
So, object store can still focus on storing your database segments, the table segments, but the index is still in the memory of the database.

Even the index can be snapshotted once in a while to object store, but use object store for persistence and database for query is the right architecture. And almost all the modern databases now, from Elasticsearch to [unintelligible 00:24:21] to even Kafka, like, message queue. They all have gone that route. Even Microsoft SQL Server, Teradata, Vertica, name it, Splunk, they all have gone object storage route, too. Snowflake itself is a prime example, BigQuery and all of them.

That's the right way. Databases can never be consolidated. There will be many different kinds of databases. Let them specialize on GraphQL or Graph API, or key-value, or SQL. Let them handle the indexing and persistence, they cannot handle petabytes of data. That [unintelligible 00:24:51] to object store is how the industry is shaping up, and it is going in the right direction.

Corey: One of the ways I learned the most about various services is by talking to customers. Every time I think I've seen something, this is amazing. This service is something I completely understand. All I have to do is talk to one more customer. And when I was doing a bill analysis project a couple of years ago, I looked into a customer's account and saw a bucket with okay, that has 280 billion objects in it—and wait was that billion with a B?

And I asked them, “So, what's going on over there?” And there's, “Well, we built our own columnar database on top of S3. This may not have been the best approach.” It's, “I'm going to stop you there.
With no further context, it was not, but please continue.”

It's the sort of thing that would never have occurred to me to even try, do you tend to see similar—I would say they're anti-patterns, except somehow they're made to work—in some of your customer environments, as they are using the service in ways that are very different than ways encouraged or even allowed by the native object store options?

AB: Yeah, when I first started seeing the database-type workloads coming on to MinIO, I was surprised, too. That was exactly my reaction. In fact, they were storing these 256k, sometimes 64k table segments because they need to index it, right, and the table segments were anywhere between 64k to 2MB. And when they started writing table segments, it was more often IOPS-type I/O pattern, than a throughput-type pattern. Throughput is an easier problem to solve, and MinIO always saturated these 100-gigabyte NVMe-type drives, they were I/O intensive, throughput optimized.

When I started seeing the database workloads, I had to optimize for small-object workloads, too. We actually did all that because eventually I got convinced the right way to build a database was to actually leave the persistence out of database; they made actually a compelling argument. If historically, I thought metadata and data, data to be very big and coming to object store make sense. Metadata should be stored in a database, and that's only index page. Take any book, the index pages are only few, database can continue to run adjacent to object store, it's a clean architecture.

But why would you put database itself on object store? When I saw a transactional database like MySQL, changing the InnoDB to RocksDB, and making changes at that layer to write the SSTables [unintelligible 00:27:19] to MinIO, and then I was like, where do you store the memory, the journal? They said, “That will go to Kafka.” And I was like—I thought that was insane when it started.
But it continued to grow and grow.

Nowadays, I see most of the databases have gone to object store, but their argument is, the databases also saw explosive growth in data. And they couldn't scale the persistence part. That is where they realized that they still got very good at the indexing part that object storage would never give. There is no API to do sophisticated query of the data. You cannot peek inside the data, you can just do streaming read and write.

And that is where the databases were still necessary. But databases were also growing in data. One thing that triggered this was the use case moved from data that was generated by people to now data generated by machines. Machines means applications, all kinds of devices. Now, it's like between seven billion people to a trillion devices is how the industry is changing. And this led to lots of machine-generated, semi-structured, structured data at giant scale, coming into database. The databases need to handle scale. There was no other way to solve this problem other than leaving the—[unintelligible 00:28:31] if you looking at columnar data, most of them are machine-generated data, where else would you store? If they tried to build their own object storage embedded into the database, it would make database mentally complicated. Let them focus on what they are good at: Indexing and mutations. Pull the data table segments which are immutable, mutate in memory, and then commit them back give the right mix. What you saw what's the fastest step that happened, we saw that consistently across. Now, it is actually the standard.

Corey: So, you started working on this in 2014, and here we are—what is it—eight years later now, and you've just announced a Series B of $100 million on a billion-dollar valuation. So, it turns out this is not just one of those things people are using for test labs; there is significant momentum behind using this.
How did you get there from—because everything you're saying makes an awful lot of sense, but it feels, at least from where I sit, to be a little bit of a niche. It's a bit of an edge case that is not the common case. Obviously, I'm missing something because your investors are not the types of sophisticated investors who see something ridiculous and, “Yep. That's the thing we're going to go for.” They're right more than they're not.

AB: Yeah. The reason for that was they saw what we were set to do. In fact, these are—if you see the lead investor, Intel, they watched us grow. They came into Series A and they saw, everyday, how we operated and grew. They believed in our message.

And it was actually not about object store, right? Object storage was a means for us to get into the market. When we started, our idea was, ten years from now, what will be a big problem? A lot of times, it's hard to see the future, but if you zoom out, it's hidden in plain sight.

These are simple trends. Every major trend pointed to world producing more data. No one would argue with that. If I solved one important problem that everybody is suffering, I won't go wrong. And when you solve the problem, it's about building a product with fine craftsmanship, attention to details, connecting with the user, all of that standard stuff.

But I picked object storage as the problem because the industry was fragmented across many different data stores, and I knew that won't be the case ten years from now. Applications are not going to adopt different APIs across different clouds, S3 to GCS to Azure Blob to HDFS to everything is incompatible. I saw that if I built a data store for persistence, industry will consolidate around S3 API. Amazon S3, when we started, it looked like they were the giant, there was only one cloud industry, it believed mono-cloud.
Almost everyone was talking to me like AWS will be the world's data center.

I certainly see that possibility, Amazon is capable of doing it, but my bet was the other way, that AWS S3 will be one of many solutions, but not—if it's all incompatible, it's not going to work, industry will consolidate. Our bet was, if world is producing so much data, if you build an object store that is S3 compatible, but ended up as the leading data store of the world and owned the application ecosystem, you cannot go wrong. We kept our heads low and focused on the first six years on massive adoption, build the ecosystem to a scale where we can say now our ecosystem is equal or larger than Amazon, then we are in business. We didn't focus on commercialization; we focused on convincing the industry that this is the right technology for them to use. Once they are convinced, once you solve business problems, making money is not hard because they are already sold, they are in love with the product, then convincing them to pay is not a big deal because data is so critical, central part of their business.

We didn't worry about commercialization, we worried about adoption. And once we got the adoption, now customers are coming to us and they're like, “I don't want open-source license violation. I don't want data breach or data loss.” They are trying to sell to me, and it's an easy relationship game. And it's about long-term partnership with customers.

And so the business started growing, accelerating. That was the reason that now is the time to fill up the gas tank and investors were quite excited about the commercial traction as well. And all the intangible, right, how big we grew in the last few years.

Corey: It really is an interesting segment, that has always been something that I've mostly ignored, like, “Oh, you want to run your own? Okay, great.” I get it; some people want to cosplay as cloud providers themselves. Awesome.
There's clearly a lot more to it than that, and I'm really interested to see what the future holds for you folks.

AB: Yeah, I'm excited. I think end of the day, if I solve real problems, every organization is moving from compute technology-centric to data-centric, and they're all looking at data warehouse, data lake, and whatever name they give data infrastructure. Data is now the centerpiece. Software is a commodity. That's how they are looking at it. And it is translating to each of these large organizations—actually, even the mid, even startups nowadays have petabytes of data—and I see a huge potential here. The timing is perfect for us.

Corey: I'm really excited to see this continue to grow. And I want to thank you for taking so much time to speak with me today. If people want to learn more, where can they find you?

AB: I'm always on the community, right. Twitter and, like, I think the Slack channel, it's quite easy to reach out to me. LinkedIn. I'm always excited to talk to our users or community.

Corey: And we will of course put links to this in the show notes. Thank you so much for your time. I really appreciate it.

AB: Again, wonderful to be here, Corey.

Corey: Anand Babu Periasamy, CEO and co-founder of MinIO. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice, along with what starts out as an angry comment but eventually turns into you, in your position on the S3 product team, writing a thank you note to MinIO for helping validate your market.

Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS.
We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.

Announcer: This has been a HumblePod production. Stay humble.

Winning Retail
NRF 2022 Recap

Winning Retail

Play Episode Listen Later Jan 31, 2022 32:33


This episode features a recap of NRF 2022: Retail's Big Show, which took place in New York City from January 16th-18th.

On today's episode, we hit the convention floor and hear from various companies about how they are tackling some of retail's most pressing matters, from staffing shortages to supply chain issues to leveraging massive amounts of data.

--
Show Notes
(3:38) What were the big themes of this year's convention?
(5:30) Zliide, the automated checkout company
(7:38) Exotec, the warehouse automation company
(10:24) Retail Robotics, the parcel locker company
(14:42) Ulisse, the behavioral AI company
(16:00) Heuritech, the image intelligence company
(17:57) Dressipi, the fashion AI company
(19:23) Nobal Technologies, the interactive mirror company
(21:04) Scandit, the barcode scanning AR company
(22:45) Tokinomo, the shelf advertising robot company
(25:19) Yugabyte, the open-source, multi-cloud company
(27:20) The future of retail technology
(28:58) The advantage of being press at NRF
(30:08) Why New York is the perfect location for NRF

--
Sponsor
This podcast is presented by Dell Technologies and Intel. Together they help you realize digital transformation across retail by driving IT innovation to better engage with today's connected consumer. Learn more at DellTechnologies.com/retail and Intel.com/retail.

Hashmap on Tap
#110 The Hotel California (Without Strings) of Startups with AB Periasamy, Co-Founder and CEO of MinIO

Hashmap on Tap

Play Episode Listen Later Jan 18, 2022 49:35


On this episode of Hashmap on Tap, host Kelly Kohlleffel is joined by AB Periasamy. AB is Co-Founder and CEO at MinIO, where they are delivering high-performance, S3 compatible multi-cloud object storage that is software-defined, 100% open-source, and native to Kubernetes. Prior to starting MinIO, AB co-founded Gluster, which was acquired by Red Hat, and he's also an angel investor and advisor to a range of companies including Starburst, H2O.ai, Manetu, Humio, and Yugabyte. AB shares his story, how a culture of collaboration launched him into the open-source space, and provides sound advice to startups from a startup founder and investor.

Show Notes:
Learn more about MinIO: https://min.io/
Check out MinIO's Blog: https://blog.min.io/
MinIO on Twitter: @Minio
Connect with AB on LinkedIn: https://www.linkedin.com/in/abperiasamy/
Download MinIO: https://min.io/download
On tap for today's episode: Mexican Coffee from La Lucha and Nespresso Mexico
Contact Us: https://www.hashmapinc.com/reach-out

Cloud Database Report
Cloud Database Predictions for 2022

Cloud Database Report

Play Episode Listen Later Jan 13, 2022 6:32


2021 was a busy year for cloud databases, with startups like Cockroach Labs, DataStax, and SingleStore challenging larger, established vendors like Oracle, IBM, and SAP. And of course the Big 3 cloud providers: Microsoft, AWS, and Google Cloud.

There's a lot of momentum carrying into 2022. A few observations on products and platforms.

First, I expect we will see more exabyte-size databases, which are 1,000 times larger than the petabyte databases that many businesses operate today. We're moving into the realm of extreme data, and that's going to require even greater scalability than most companies are experienced with. That will be a challenge.

Second, database migrations from on-premises systems to the cloud will continue to be a major trend, and not always an easy one, which will require new tools and services. Database migrations can actually take weeks and even months to complete.

Third, database management is getting easier. Cloud database providers have begun offering fully managed services, "serverless" capabilities, and autonomous databases, all of which reduce the amount of provisioning and hands-on management required.

And finally, more business people will begin to pay attention to who has access to data and where data is stored, which means conversations about governance and data distribution will become more of a line of business conversation.

A few comments about the competitive landscape. I see 3 major trends.

"Immovable objects meet irresistible forces." Immovable objects are the deeply rooted vendors like Oracle and IBM, and irresistible forces are the cloud-native startups. These emerging companies are coming on strong, and the old guard must continue reinventing themselves.

The Big 3 cloud providers are the new center of gravity for data management.
AWS, Google Cloud, and Microsoft Azure have momentum with their portfolios of purpose-built databases, and other cloud services like analytics and AI.

And last, Snowflake, with its data cloud model, has leapfrogged old-style centralized data warehouses. I expect more database providers to offer their own Snowflake-like services.

For more on the latest trends in the cloud database market, register for Acceleration Economy's Cloud Database Battleground on January 27, 2022. The digital event will be hosted by John Foley, editor of the Cloud Database Report and database analyst with Acceleration Economy. Registration is free. Participating companies include Couchbase, Cockroach Labs, DataStax, Redis, SingleStore, and Yugabyte. Each vendor will answer the same five questions:

How does your database help organizations manage data at scale and speed to lead their industry?
When customers talk about becoming a data-driven organization and creating new revenue streams with data, how do you help them make that a reality?
What are the top reasons developers and IT teams want to use your cloud database for the first time?
In what ways does your cloud database simplify data distribution and sharing across hybrid, multi-cloud, and edge environments?
How does your cloud database provide a trusted data environment through access, security, privacy, and governance controls?

Screaming in the Cloud
GCP's Many Profundities with Miles Ward

Screaming in the Cloud

Play Episode Listen Later Jan 11, 2022 42:06


About Miles

As Chief Technology Officer at SADA, Miles Ward leads SADA's cloud strategy and solutions capabilities. His remit includes delivering next-generation solutions to challenges in big data and analytics, application migration, infrastructure automation, and cost optimization; reinforcing our engineering culture; and engaging with customers on their most complex and ambitious plans around Google Cloud.

Previously, Miles served as Director and Global Lead for Solutions at Google Cloud. He founded the Google Cloud's Solutions Architecture practice, launched hundreds of solutions, built Style-Detection and Hummus AI APIs, built CloudHero, designed the pricing and TCO calculators, and helped thousands of customers like Twitter who migrated the world's largest Hadoop cluster to public cloud and Audi USA who re-platformed to k8s before it was out of alpha, and helped Banco Itau design the intercloud architecture for the bank of the future.

Before Google, Miles helped build the AWS Solutions Architecture team. He wrote the first AWS Well-Architected framework, proposed Trusted Advisor and the Snowmobile, invented GameDay, worked as a core part of the Obama for America 2012 “tech” team, helped NASA stream the Curiosity Mars Rover landing, and rebooted Skype in a pinch.

Earning his Bachelor of Science in Rhetoric and Media Studies from Willamette University, Miles is a three-time technology startup entrepreneur who also plays a mean electric sousaphone.

Links:
SADA.com: https://sada.com
Twitter: https://twitter.com/milesward
Email: miles@sada.com

Transcript

Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize.
This is Screaming in the Cloud.

Corey: It seems like there is a new security breach every day. Are you confident that an old SSH key, or a shared admin account, isn't going to come back and bite you? If not, check out Teleport. Teleport is the easiest, most secure way to access all of your infrastructure. The open source Teleport Access Plane consolidates everything you need for secure access to your Linux and Windows servers—and I assure you there is no third option there. Kubernetes clusters, databases, and internal applications like AWS Management Console, Jenkins, GitLab, Grafana, Jupyter Notebooks, and more. Teleport's unique approach is not only more secure, it also improves developer productivity. To learn more visit: goteleport.com. And no, that is not me telling you to go away, it is: goteleport.com.

Corey: This episode is sponsored in part by our friends at Redis, the company behind the incredibly popular open source database that is not the bind DNS server. If you're tired of managing open source Redis on your own, or you're using one of the vanilla cloud caching services, these folks have you covered with the go-to managed Redis service for global caching and primary database capabilities: Redis Enterprise. To learn more and deploy not only a cache but a single operational data platform for one Redis experience, visit redis.com/hero. That's r-e-d-i-s.com/hero. And my thanks to my friends at Redis for sponsoring my ridiculous nonsense.

Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. I am joined today, once again by my friend and yours, Miles Ward, who's the CTO at SADA. However, he is, as I think of him, the closest thing the Google Cloud world has to Corey Quinn. Now, let's be clear, not the music and dancing part that is Forrest Brazeal, but Forrest works at Google Cloud, whereas Miles is a reasonably salty third-party.
Miles, thank you for coming back and letting me subject you to that introduction.

Miles: Corey, I appreciate that introduction. I am happy to provide substantial salt. It is easy, as I play brass instruments that produce my spit in high volumes. It's the most disgusting part of any possible introduction. For the folks in the audience, I am surrounded by a collection of giant sousaphones, tubas, trombones, baritones, marching baritones, trumpets, and pocket trumpets.

So, Forrest threw down the gauntlet and was like, I can play a keyboard, and sing, and look cute at the same time. And so I decided to fail at all three. We put out a new song just a bit ago that's, like, us thanking all of our customers and partners, covering Kool & the Gang “Celebration,” and I neither look good, [laugh] play piano, or smiling, or capturing any of the notes; I just play the bass part, it's all I got to do.

Corey: So, one thing that I didn't get to talk a lot about because it's not quite in my universe, for one, and for another, it is during the pre re:Invent—pre:Invent, my nonsense thing—run up, which is Google Cloud Next.

Miles: Yes.

Corey: And my gag a few years ago is that I'm not saying that Google is more interested in what they're building and what they're shipping, but even their conference is called Next. Buh dum, hiss.

Miles: [laugh].

Corey: So, I didn't really get to spend a lot of attention on the Google Cloud releases that came out this year, but given that SADA is in fact the, I believe, largest Google Cloud partner on the internet, and thus the world—

Miles: [unintelligible 00:02:27] new year, three years in a row back, baby.

Corey: Fantastic. I assume someone's watch got stuck or something. But good work. So, you have that bias in the way that I have a bias, which is your business is focused around Google Cloud the way that mine is focused on AWS, but neither of us is particularly beholden to that given company.
I mean, you do have the not getting fired as partner, but that's a bit of a heavy lift; I don't think I can mouth off well enough to get you there. So, we have a position of relative independence. So, you were tracking Google Next, the same way that I track re:Invent. Well, not quite the same way I track re:Invent; there are some significant differences. What happened at Cloud Next 2021, that the rest of us should be paying attention to?
Miles: Sure. I presented 10% of the material at the first re:Invent. There are 55 sessions; I did six. And so I have been at Cloud events for a really long time and really excited about Google's willingness to dive into demos in a way that I think they have been a little shy about. Kelsey Hightower is the kind of notable deep exception to that. Historically, he's been ready to dive into the, kind of, heavy hands-on piece but—
Corey: Wait, those were demos? [Thought 00:03:39] was just playing Tetris on stage for the love of it.
Miles: [laugh]. No. And he really codes all that stuff up, him and the whole team.
Corey: Oh, absol—I'm sorry. If I ever grow up, I wish to be Kelsey Hightower.
Miles: [laugh]. You and me both. So, he had kind of led the charge. We did a couple of fun little demos while I was there, but they've really gotten a lot further into that, and I think are doing a better job of packaging the benefits to not just developers, but also operators and data scientists and the broader roles in the cloud ecosystem from the new features that are being launched. 
And I think, different than the in-person events where there's 10, 20,000, 40,000 people in the audience paying attention, I think they have to work double-hard to capture attention and get engineers to tune in to what's being launched. But if you squint and look close, there are some, I think, very interesting trends that sit in the back of some of the very first launches in what I think are going to be whole veins of launches from Google over the course of the next several years that we are working really hard to track along with and make sure we're extracting maximum value from for our customers.
Corey: So, what was it that they announced that is worth paying attention to? Now, through the cacophony of noise, one announcement that [I want to note 00:04:49] was tied to Next was the announcement that CME Group, I believe, is going to be putting their futures exchange core trading systems on Google Cloud. At which point that to me—and I know people are going to yell at me, and I don't even slightly care—that is the last nail in the coffin of the idea that well, Google is going to turn this off in a couple years. Sorry, no. That is not a thing that's going to happen. Worst case, they might just stop investing in it as aggressively as they are now, but even that would be just a clown-shoes move that I have a hard time envisioning.
Miles: Yeah, you're talking now over a dozen, over ten year, over a billion-dollar commitments. So, you've got to just really, really hate your stock price if you're going to decide to vaporize that much shareholder value, right? I mean, we think that, in Google, stock price is a material fraction of the recognition of the growth trajectory for cloud, which is now basically just third place behind YouTube. And I think you can do the curve math, it's not like it's going to take long.
Corey: Right. 
That requires effectively ejecting Thomas Kurian as the head of Google Cloud and replacing him with the former SVP of Bad Decisions at Yahoo.
Miles: [laugh]. Sure. Google has no shyness about continuing to rotate leadership. I was there through three heads of Google Cloud, so I don't expect that Thomas will be the last although I think he may well go down in history as having been the best. The level of rotation to the focuses that I think are most critical, getting enterprise customers happy, successful, committed, building macroscale systems, in systems that are critical to the core of the business on GCP has grown at an incredible rate under his stewardship. So, I think he's doing a great job.
Corey: He gets a lot of criticism—often from Googlers—when I wind up getting the real talk from them, which is, “Can you tell me what you really think?” Their answer is, “No,” I'm like, “Okay, next question. Can I go out and buy you eight beers and then”— and it's like, “Yeah.” And the answer that I get pretty commonly is that he's brought too much Oracle into Google. And okay, that sounds like a bad thing because, you know, Oracle, but let's be clear here, but what are you talking about specifically? And what they say distills down to engineers are no longer the end-all be-all of everything at Google Cloud. Engineers don't get to make sales decisions, or marketing decisions, or in some cases, product decisions. And that is not how Google has historically been run, and they don't like the change. I get it, but engineering is not the only hard thing in the world and it's not the only business area that builds value, let's be clear on this. So, I think that the things that they don't like are in fact, what Google absolutely needs.
Miles: I think, one, the man is exceptionally intimidating and intentionally just hyper, hyper attentive to his business. 
So, one of my best employees, Brad [Svee 00:07:44], he worked together with me to lay out what was the book of our whole department, my team of 86 people there. What are we about? What do we do? And like I wanted this as like a memoriam to teach new hires as got brought in. So, this is, like, 38 pages of detail about our process, our hiring method, our promotional approach, all of it. I showed that to my new boss who had come in at the time, and he thought some of the pictures looked good. When we showed it to TK, he read every paragraph. I watched him highlight the paragraphs as he went through, and he read it twice as fast as I can read the thing. I think he does that to everybody's documents, everywhere. So, there's a level of just manual rigor that he's brought to the practice that was certainly not there before that. So, that alone, it can be intimidating for folks, but I think people that are high performance find that very attractive.
Corey: Well, from my perspective, he is clearly head and shoulders above Adam Selipsky, and Scott Guthrie—the respective heads of AWS and Azure—for one key reason: He is the only one of those three people who follows me on Twitter. And—
Miles: [laugh].
Corey: —honestly, that is how I evaluate vendors.
Miles: That's the thing. That's the only measure, yep. I've worked on for a long time with Selipsky, and I think that it will be interesting to see whether Adam's approach to capital allocation—where he really, I think, thinks of himself as the manager of thousands of startups, as opposed to a manager of a global business—whether that's a more efficient process for creating value for customers, than where I think TK is absolutely trying to build a much more unified, much more singular platform. And a bunch of the launches really speak to that, right? So, one of the product announcements that I think is critical is this idea of the global distributed cloud, Google Distributed Cloud. We started with Kubernetes. 
And then you layer on to that, okay, we'll take care of Kubernetes for you; we call that Anthos. We'll build a bunch of structural controls and features into Anthos to make it so that you can really deal with stuff in a global way. Okay, what does that look like further? How do we get out into edge environments? Out into diverse hardware? How do we partner up with everybody to make sure that, kind of like comparing Apple's approach to Google's approach, you have an Android ecosystem of Kubernetes providers instead of just one place you can buy an outpost. That's generally the idea of GDC. I think that's a spot where you're going to watch Google actually leverage the muscle that it already built in understanding open-source dynamics and understanding collaboration between companies as opposed to feeling like it's got to be built here. We've got to sell it here. It's got to have our brand on it.
Corey: I think that there's a stupendous and extreme story that is still unfolding over at Google Cloud. Now, re:Invent this year, they wound up talking all about how what they were rolling out was a focus on improving primitives. And they're right. I love their managed database service that they launched because it didn't exist.
Miles: Yeah, Werner's slide, “It's primitives, not frameworks.” I was like, I think customers want solutions, not frameworks or primitives. [laugh]. What's your plan?
Corey: Yeah. However, I take a different perspective on all of this, which is that is a terrific spin on the big headline launches all missed the re:Invent timeline, and… oops, so now we're just going to talk about these other things instead. And that's great, but then they start talking about industrial IOT, and mainframe migrations, and the idea of private 5G, and running fleets of robots. And it's—
Miles: Yeah, that's a cool product.
Corey: Which one? 
I'm sorry, they're all very different things.
Miles: Private 5G.
Corey: Yeah, if someone someday will explain to me how it differs from Wavelength, but that's neither here nor there. You're right, they're all interesting, but none of them are actually doing the thing that I do, which is build websites, [unintelligible 00:11:31] looking for web services, it kind of says it in the name. And it feels like it's very much broadening into everything, and it's very difficult for me to identify—and if I have trouble that I guarantee you customers do—of, which services are for me and which are very much not? In some cases, the only answer to that is to check the pricing. I thought Kendra, their corporate information search thing was for me, then it's 7500 bucks a month to get started with that thing, and that is, “I can hire an internal corporate librarian to just go and hunt through our Google Drive.” Great.
Miles: Yeah.
Corey: So, there are—or our Dropbox, or our Slack. We have, like, five different information repositories, and this is how corporate nonsense starts, let me assure you.
Miles: Yes. We call that luxury SaaS, you must enjoy your dozens of overlapping bills for, you know, what Workspace gives you as a single flat rate.
Corey: Well, we have [unintelligible 00:12:22] a lot of this stuff, too. Google Drive is great, but we use Dropbox for holding anything that touches our customer's billing information, just because I—to be clear, I do not distrust Google, but it also seems a little weird to put the confidential billing information for one of their competitors on there to thing if a customer were to ask about it. So, it's the, like, I don't believe anyone's doing anything nefarious, but let's go ahead and just make sure, in this case.
Miles: Go further man. Vimeo runs on GCP. You think YouTube doesn't want to look at Vimeo stats? Like they run everything on GCP, so they have to have arrived at a position of trust somehow. Oh, I know how: it's called encryption. 
You've heard of encryption before? It's the best.
Corey: Oh, yes. I love these rumors that crop up every now and again that Amazon is going to start scanning all of its customer content, somehow. It's, first, do you have any idea how many compute resources that would take, and two, if they can actually do that and access something you're storing in there, against their attestations to the contrary, then that's your story because one of them just makes them look bad, the other one utterly destroys their entire business.
Miles: Yeah.
Corey: I think that that's the one that gets the better clicks. So no, they're not doing that.
Miles: No, they're not doing that. Another product launch that I thought was super interesting that describes, let's call it second place—the third place will be the one where we get off into the technical deep end—but there's a whole set of coordinated work they're calling Cortex. So, let's imagine you go to a customer, they say, “I want to understand what's happening with my business.” You go, “Great.” So, you use SAP, right? So, you're a big corporate shop, and that's your infrastructure of choice. There are a bunch of different options at that layer. When you set up SAP, one of the advantages that something like that has is they have, kind of, pre-built configurations for roughly your business, but whatever behaviors SAP doesn't do, right, say, data warehousing, advanced analytics, regression and projection and stuff like that, maybe that's somewhat outside of the core wheelhouse for SAP, you would expect like, oh okay, I'll bolt on BigQuery. I'll build that stuff over there. We'll stream the data between the two. Yeah, I'm off to the races, but the BigQuery side of the house doesn't have this like bitching menu that says, “You're a retailer, and so you probably want to see these 75 KPIs, and you probably want to chew up your SKUs in exactly this way. 
And here's some presets that make it so that this is operable out of the box.” So, they are doing the three-way combination: Consultancies plus ISVs plus Google products, and doing all the pre-work configuration to go out to a customer and go I know what you probably just want. Why don't I just give you the whole thing so that it does the stuff that you want? That I think—if that's the very first one, this little triangle between SAP, and BigQuery, and a bunch of consultancies like mine, you have to imagine they go a lot further with that a lot faster, right? I mean, what does that look like when they do it with Epic, when they go do it with Go just generally, when they go do it with Apache? I've heard of that software, right? Like, there's no reason not to bundle up what the obvious choices are for a bunch of these combinations.
Corey: The idea of moving up the stack and offering full on solutions, that's what customers actually want. “Well, here's a bunch of things you can do to wind up wiring together to build a solution,” is, “Cool. Then I'm going to go hire a company who's already done that is going to sell it to me at a significant markup because I just don't care.” I pay way more to WP Engine than I would to just run WordPress myself on top of AWS or Google Cloud. In fact, it is on Google Cloud, but okay.
Miles: You and me both, man. WP Engine is the best. I—
Corey: It's great because—
Miles: You're welcome. I designed a bunch of the hosting on the back of that.
Corey: Oh, yeah. But it's also the—I—well, it costs a little bit more that way. Yeah, but guess what's not—guess what's more expensive than that bill, is my time spent doing the care and feeding of this stuff. I like giving money to experts and making it their problem.
Miles: Yeah. I heard it said best, Lego is an incredible business. I love their product, and you can build almost any toy with it. 
And they have not displaced all other plastic toy makers.
Corey: Right.
Miles: Some kids just want to buy a little car. [laugh].
Corey: Oh, yeah, you can build anything you want out of Lego bricks, which are great, which absolutely explains why they are a reference AWS customer.
Miles: Yeah, they're great. But they didn't beat all other toy companies worldwide, and eliminate the rest of that market because they had the better primitive, right? These other solutions are just as valuable, just as interesting, tend to have much bigger markets. Lego is not the largest toy manufacturer in the world. They are not in the top five of toy manufacturers in the world, right? Like, so chasing that thread, and getting all the way down into the spots where I think many of the cloud providers on their own, internally, had been very uncomfortable. Like, you got to go all the way to building this stuff that they need for that division, inside of that company, in that geo, in that industry? That's maybe, like, a little too far afield. I think Google has a natural advantage in its more partner-oriented approach to create these combinations that lower the cost to them and to customers to getting out of that solution quick.
Corey: So, getting into the weeds of Google Next, I suppose, rather than a whole bunch of things that don't seem to apply to anyone except the four or five companies that really could use it, what things did Google release that make the lives of people building, you know, web apps better?
Miles: This is the one. So, I'm at Amazon, hanging out as a part of the team that built up the infrastructure for the Obama campaign in 2012, and there are a bunch of Googlers there, and we are fighting with databases. 
We are fighting so hard, in fact, with RDS that I think we are the only ones that [Raju 00:17:51] has ever allowed to SSH into our RDS instances to screw with them.
Corey: Until now, with the advent of RDS Custom, meaning that you can actually get in as root; where the hell that lands between RDS and EC2 is ridiculous. I just know that RDS can now run containers.
Miles: Yeah. I know how many things we did in there that were good for us, and how many things we did in there that were bad for us. And I have to imagine, this is not a feature that they really ought to let everybody have, myself included. But I will say that what all of the Googlers that I talk to, you know, at the first blush, were I'm the evil Amazon guy in to, sort of, distract them and make them build a system that, you know, was very reliable and ended up winning an election was that they had a better database, and they had Spanner, and they didn't understand why this whole thing wasn't sitting on Spanner. So, we looked, and I read the white paper, and then I got all drooly, and I was like, yes, that is a much better database than everybody else's database, and I don't understand why everybody else isn't on it. Oh, there's that one reason, but you've heard of it: No other software works with it, anywhere in the world, right? It's utterly proprietary to Google. Yes, they were kind—
Corey: Oh, you want to migrate it off somewhere else, or a fraction of it? Great. Step one, redo your data architecture.
Miles: Yeah, take all of my software everywhere, rewrite every bit of it. And, oh all those commercial applications? Yeah, forget all those, you got, too. Right? It was very much where Google was eight years ago. 
So, for me, it was immensely meaningful to see the launch at Next where they described what they are building—and have now built; we have alpha access to it—a Postgres layer for Spanner.
Corey: Is that effectively you have to treat it as Postgres at all times, or is it multimodal access?
Miles: You can get in and tickle it like Spanner, if you want to tickle it like Spanner. And in reality, Spanner is ANSI SQL compliant; you're still writing SQL, you just don't have to talk to it like a REST endpoint, or a gRPC endpoint, or something; you can, you know, have like a—
Corey: So, similar to Azure's Cosmos DB, on some level, except for the part where you can apparently look at other customers' data in that thing?
Miles: [laugh]. Exactly. Yeah, you will not have a sweeping discovery of incredible security violations in the structure of Spanner, in that it is the control system that Google uses to place every ad, and so it does not suck. You can't put a trillion-dollar business on top of a database and not have it be safe. That's kind of a thing.
Corey: The thing that I find is the most interesting area of tech right now is there's been this rise of distributed databases. Yugabyte—or You-ji-byte—Pla-netScale—or PlanetScale, depending on how you pronounce these things.
Miles: [laugh]. Yeah, why, why is G such an adversarial consonant? I don't understand why we've all gotten to this place.
Corey: Oh, yeah. But at the same time, it's—so you take a look at all these—and they all are speaking Postgres; it is pretty clear that ‘Postgres-squeal' is the thing that is taking over the world as far as databases go. If I were building something from scratch that used—
Miles: For folks in the back, that's PostgreSQL, for the rest of us, it's okay, it's going to be all right.
Corey: Same difference. But yeah, it's the thing that is eating the world. 
Although recently, I've got to say, MongoDB is absolutely stepping up in a bunch of really interesting ways.
Miles: I mean, I think the 4.0 release, I'm the guy who wrote the MongoDB on AWS Best Practices white paper, and I would grab a lot of customers and—
Corey: They have to change it since then of, step one: Do not use DocumentDB; if you want to use Mongo, use Mongo.
Miles: Yeah, that's right. No, there were a lot of customers I was on the phone with where Mongo had summarily vaporized their data, and I think they have made huge strides in structural reliability over the course of—you know, especially this 4.0 launch, but the last couple of years, for sure.
Corey: And with all the people they've been hiring from AWS, it's one of those, “Well, we'll look at this now who's losing important things from production?”
Miles: [laugh]. Right? So, maybe there's only actually five humans who know how to do operations, and we just sort of keep moving around these different companies.
Corey: That's sort of my assumption on these things. But Postgres, for those who are not looking to depart from the relational model, is eating the world. And—
Miles: There's this, like, basic emotional thing. My buddy Martin, who set up MySQL, and took it public, and then promptly got it gobbled up by the Oracle people, like, there was a bet there that said, hey, there's going to be a real open database, and then squish, like, the man came and got it. And so like, if you're going to be an independent, open-source software developer, I think you're probably not pushing your pull requests to our friends at Oracle, that seems weird. So instead, I think Postgres has gobbled up the best minds on that stuff. And it works. It's reliable, it's consistent, and it's functional in all these different, sort of, reapplications and subdivisions, right? I mean, you have to sort of squint real hard, but down there in the guts of Redshift, that's Postgres, right? Like, there's Postgres behind all sorts of stuff. 
So, as an interface layer, I'm not as interested about how it manages to be successful at bossing around hardware and getting people the zeros and ones that they ask for back in a timely manner. I'm interested in it as a compatibility standard, right? If I have software that says, “I need to have Postgres under here and then it all will work,” that creates this layer of interop that a bunch of other products can use. So, folks like PlanetScale, and Yugabyte can say, “No, no, no, it's cool. We talk Postgres; that'll make it so your application works right. You can bring SQLAlchemy and plug it into this, or whatever your interface layer looks like.” That's the spot where, if I can trade what is a fairly limited global distribution, global transactional management on literally ridiculously unlimited scalability and zero operations, I can hand the hard parts of running a database over to somebody else, but I get my layer, and my software talks to it, I think that's a huge step.
Corey: This episode is sponsored in part by my friends at Cloud Academy. Something special just for you folks. If you missed their offer on Black Friday or Cyber Monday or whatever day of the week doing sales it is—good news! They've opened up their Black Friday promotion for a very limited time. Same deal, $100 off a yearly plan, $249 a year for the highest quality cloud and tech skills content. Nobody else can get this because they have assured me this is not going to last for much longer. Go to CloudAcademy.com, hit the "start free trial" button on the homepage, and use the Promo code cloud at checkout. That's c-l-o-u-d, like loud, what I am, with a “C” in front of it. It's a free trial, so you'll get 7 days to try it out to make sure it's really a good fit for you, nothing to lose except your ignorance about cloud. My thanks again for sponsoring my ridiculous nonsense.
Corey: I think that there's a strong movement toward building out on something like this. 
If it works, just because—well, I'm not multiregion today, but I can easily see a world in which I'd want to be. So, great. How do you approach the decision between—once this comes out of alpha; let's be clear. Let's turn this into something that actually ships, and no, Google that does not mean slapping a beta label on it for five years is the answer here; you actually have to stand behind this thing—but once it goes GA—
Miles: GA is a good thing.
Corey: Yeah. How do you decide between using that, or PlanetScale? Or Yugabyte?
Miles: Or Cockroach or SingleStore, right? I mean, there's a zillion of them that sit in this market. I think the core of the decision making for me is in every team you're looking at what skills do you bring to bear and what problem that you're off to go solve for customers? Do the nuances of these products make it easier to solve? So, I think there are some products that the nature of what you're building isn't all that dependent on one part of the application talking to another one, or an event happening someplace else mattering to an event over here. But some applications, that's, like, utterly critical, like, totally, totally necessary. So, we worked with a bunch of like Forex exchange trading desks that literally turn off 12 hours out of the day because they can only keep it consistent in one geographical location right near the main exchanges in New York. So, that's a place where I go, “Would you like to trade all day?” And they go, “Yes, but I can't because databases.” So, “Awesome. Let's call the folks on the Spanner side. They can solve that problem.” I go, “Would you like to trade all day and rewrite all your software?” And they go, “No.” And I go, “Oh, okay. What about trade all day, but not rewrite all your software?” There we go. 
Now, we've got a solution to that kind of problem. So like, we built this crazy game, like, totally other end of the ecosystem with the Dragon Ball Z people, hysterical; you're like—you literally play like Rock, Paper, Scissors with your phone, and if you get a rock, I throw a fireball, and you get a paper, then I throw a punch, and we figure out who wins. But they can play these games like Europe versus Japan, thousands of people on each side, real-time, and it works.
Corey: So, let's be clear, I have lobbed a consistent criticism at Google for a while now, which is the Google Cloud global control plane. So, you wind up with things like global service outages from time to time, you wind up with this thing is now broken for everyone everywhere. And that, for a lot of these use cases, is a problem. And I said that AWS's approach to regional isolation is the right way to do it. And I do stand by that assessment, except for the part where it turns out there's a lot of control plane stuff that winds up single tracking through us-east-1, as we learned in the great us-east-1 outage of 2021.
Miles: Yeah, when I see customers move from data center to AWS, what they expect is a higher count of outages that lasts less time. That's the trade off, right? There's going to be more weird spurious stuff, and maybe—maybe—if they're lucky, that outage will be over there at some other region they're not using. I see almost exactly the same promise happening to folks that come from AWS—and in particular from Azure—over onto GCP, which is, there will be probably a higher frequency of outages at a per product level, right? 
So, like sometimes, like, some weird product takes a screw sideways, where there is structural interdependence between quite a few products—we actually published a whole internal structural map of like, you know, it turns out that Cloud SQL runs on top of GCE not on GKE, so you can expect if GKE goes sideways, Cloud SQL is probably not going to go sideways; the two aren't dependent on each other.
Corey: You take the status page and Amazon FreeRTOS in a region is having an outage today or something like that. You're like, “Oh, no. That's terrible. First, let me go look up what the hell that is.” And I'm not using it? Absolutely not. Great. As hyperscalers, well, hyperscale, there are always things that are broken in different ways, in different locations, and if you had a truly accurate status page, it would all be red all the time, or varying shades of red, which is not helpful. So, I understand the challenge there, but very often, it's a partition that is you are not exposed to, or the way that you've architected things, ideally, means it doesn't really matter. And that is a good thing. So, raw outage counts don't solve that. I also maintain that if I were to run in a single region of AWS or even a single AZ, in all likelihood, I will have a significantly better uptime across the board than I would if I ran it myself. Because—
Miles: Oh, for sure.
Corey: —it is—
Miles: For sure they're way better at ops than you are. Me, right?
Corey: Of course.
Miles: Right? Like, ridiculous.
Corey: And they got that way, by learning. Like, I think in 2022, it is unlikely that there's going to be an outage in an AWS availability zone by someone tripping over a power cable, whereas I have actually done that. So, there's a—to be clear in a data center, not an AWS facility; that would not have flown. So, there is the better idea of going in that direction. 
But the things like Route 53's control plane single-tracking through us-east-1, if you can't make DNS changes in an outage scenario, you may as well not have a DR plan, for most use cases.
Miles: To be really clear, it was a part of the internal documentation on the AWS side that we would share with customers to be absolutely explicit with them. It's not just that there are mistakes and accidents which we try to limit to AZs, but no, go further, that we may intentionally cause outages to AZs if that's what allows us to keep broader service health higher, right? They are not just a blast radius because you, oops, pulled the pin on the grenade; they can actually intentionally step on the off button. And that's different than the way Google operates. They think of each of the AZs, and each of the regions, and the global system as an always-on, all the time environment, and they do not have systems where one gets, sort of, sacrificed for the benefit of the rest, right, or they will intentionally plan to take a system offline. There is no planned downtime in the SLA, where the SLAs from my friends at Amazon and Azure are explicit to, if they choose to, they decide to take it offline, they can. Now, that's—I don't know, I kind of want the contract that has the other thing where you don't get that.
Corey: I don't know what the right answer is for a lot of these things. I think multi-cloud is dumb. I think that the idea of having this workload that you're going to seamlessly deploy to two providers in case of an outage, well guess what? The orchestration between those two providers is going to cause you more outages than you would take just sticking on one. And in most cases, unless you are able to have complete duplication of not just functionality but capacity between those two, congratulations, you've now just doubled your number of single points of failure, you made the problem actively worse and more expensive. 
Good job.
Miles: I wrote an article about this, and I think it's important to differentiate between dumb and terrifyingly shockingly expensive, right? So, I have a bunch of customers who I would characterize as rich, as like, shockingly rich, as producing businesses that have 80-plus percent gross margins. And for them, the costs associated with this stuff are utterly rational, and they take on that work, and they are seeing benefits, or they wouldn't be doing it.
Corey: Of course.
Miles: So, I think their trajectory in technology—you know, this is a quote from a Google engineer—it's just like, “Oh, you want to see what the future looks like? Hang out with rich people.” I went into houses when I was a little kid that had whole-home automation. I couldn't afford them; my mom was cleaning house there, but now my house, I can use my phone to turn on the lights. Like—
Corey: You know, unless us-east-1 is having a problem.
Miles: Hey, and then no Roomba for you, right? Like utterly offline. So—
Corey: Roomba has now failed to room.
Miles: Conveniently, my lights are Philips Hue, and that's on Google, so that baby works. But it is definitely a spot where the barrier of entry and the level of complexity required is going down over time. And it is definitely a horrible choice for 99% of the companies that are out there right now. But next year, it'll be 98. And the year after that, it'll probably be 97. [laugh]. And if I go inside of Amazon's data centers, there's not one manufacturer of hard drives, there's a bunch. So, that got so easy that now, of course you use more than one; you got to do—that's just like, sort of, a natural thing, right? These technologies, it'll move over time. 
We just aren't there yet for the vast, vast majority of workloads.
Corey: I hope that in the future, this stuff becomes easier, but data transfer fees are going to continue to be a concern—
Miles: Just—[makes explosion noise]—
Corey: Oh, man—
Miles: —like, right in the face.
Corey: —especially with the Cambrian explosion of data because the data science folks have successfully convinced the entire industry that there's value in those load balancer logs from 2012. Okay, great. We're never deleting anything again, but now you've got to replicate all of that stuff because no one has a decent handle on lifecycle management and won't for the foreseeable future. Great, to multiple providers so that you can work on these things? Like, that is incredibly expensive.
Miles: Yeah. Cool tech, from this announcement at Next that I think is very applicable, and recognize the level of, like, utter technical mastery—and security mastery to our earlier conversation—that something like this requires. The product is called BigQuery Omni. What Omni allows you to do is go into the Google Cloud Console, go to BigQuery, say I want to do analysis on this data that's in S3, or in Azure Blob Storage. Google will spin up an account on your behalf on Amazon and Azure, and run the compute there for you, bring the result back. So, just transfer the answers, not the raw data that you just scanned, and no work on your part, no management, no crapola. So, there's like—that's multi-cloud. If I've got—I can do a join between a bunch of rows that are in real BigQuery over on GCP side and rows that are over there in S3. 
The cross-eyedness of getting something like that to work is mind blowing.Corey: To give this a little more context, just because it gets difficult to reason about these things, I can either have data that is in a private subnet in AWS that traverses their horribly priced Managed NAT Gateways, and then goes out to the internet and sent there once, for the same cost as I could take that same data and store it in S3 in their standard tier for just shy of six full months. That's a little imbalanced, if we're being direct here. And then when you add in things like intelligent tiering and archive access classes, that becomes something that… there's no contest there. It's, if we're talking about things that are now approaching exabyte scale, that's one of those, “Yeah, do you want us to pay by a credit card?”—get serious. You can't at that scale anyway—“Invoice billing, or do we just, like, drive a dump truck full of gold bricks and drop them off in Seattle?”Miles: Sure. Same trajectory, on the multi-cloud thing. So, like a partner of ours, PacketFabric, you know, if you're a big, big company, you go out and you call Amazon and you buy 100 gigabit interconnect on—I think they call theirs Direct Connect, and then you hook that up to the Google one that's called Dedicated Interconnect. And voila, the price goes from twelve cents a gig down to two cents a gig; everybody's much happier. But Jesus, you pay the upfront for that, you got to set the thing up, it takes days to get deployed, and now you're culpable for the whole pipe if you don't use it up. Like, there are charges that are static over the course of the month.So, PacketFabric just buys one of those and lets you rent a slice of it you need. And I think they've got an incredible product. We're working with them on a whole bunch of different projects. But I also expect—like, there's no reason the cloud providers shouldn't be working hard to vend that kind of solution over time. 
If a hundred gigabit is where it is now, what does it look like when I get to ten gigabit? When I get to one gigabit? When I get to half gigabit? You know, utility price that for us so that we get to rational pricing.I think there's a bunch of baked-in business and cost logic that is a part of the pricing system, where egress is the source of all of the funding at Amazon for internal networking, right? I don't pay anything for the switches that connect to this machine to that machine, in region. It's not like those things are cheap or free; they have to be there. But the funding for that comes from egress. So, I think you're going to end up seeing a different model where you'll maybe have different approaches to egress pricing, but you'll be paying like an in-system networking fee.And I think folks will be surprised at how big that fee likely is because of the cost of the level of networking infrastructure that the providers deploy, right? I mean, like, I don't know, if you've gone and tried to buy a 40 port, 40 gig switch anytime recently. It's not like they're those little, you know, blue Netgear ones for 90 bucks.Corey: Exactly. It becomes this, [sigh] I don't know, I keep thinking that's not the right answer, but part of it also is like, well, you know, for things that I really need local and don't want to worry about if the internet's melting today, I kind of just want to get, like, some kind of Raspberry Pi shoved under my desk for some reason.Miles: Yeah. I think there is a lot where as more and more businesses bet bigger and bigger slices of the farm on this kind of thing, I think it's Jassy's line that you're, you know, the fat in the margin in your business is my opportunity. Like, there's a whole ecosystem of partners and competitors that are hunting all of those opportunities. I think that pressure can only be good for customers.Corey: Miles, thank you for taking the time to speak with me. 
If people want to learn more about you, what you're up to, your bad opinions, your ridiculous company, et cetera—
Miles: [laugh].
Corey: —where can they find you?
Miles: Well, it's really easy to spell: SADA.com, S-A-D-A dot com. I'm Miles Ward, it's @milesward on Twitter; you don't have to do too hard of a math. It's miles@sada.com, if you want to send me an email. It's real straightforward. So, eager to reach out, happy to help. We've got a bunch of engineers that like helping people move from Amazon to GCP. So, let us know.
Corey: Excellent. And we will, of course, put links to this in the show notes because that's how we roll.
Miles: Yay.
Corey: Thanks so much for being so generous with your time, and I look forward to seeing what comes out next year from these various cloud companies.
Miles: Oh, I know some of them already, and they're good. Oh, they're super good.
Corey: This is why I don't do predictions because like, the stuff that I know about, like, for example, I was aware that Graviton 3 was coming—
Miles: Sure.
Corey: —and it turns out that if you're—guess what's going to come up and you don't name Graviton 3, it's like, “Are you simple? Did you not see that one coming?” It's like—or if I don't know it's coming and I make that guess—which is not the hardest thing in the world—someone would think I knew and leaked. There's no benefit to doing predictions.
Miles: No. It's very tough, very happy to do predictions in private, for customers. [laugh].
Corey: Absolutely. Thanks again for your time. I appreciate it.
Miles: Cheers.
Corey: Miles Ward, CTO at SADA. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. 
If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice and be very angry in your opinion when you write that obnoxious comment, but then it's going to get lost because it's using MySQL instead of Postgres.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.Announcer: This has been a HumblePod production. Stay humble.

A Bootiful Podcast
Yugabyte CEO Karthik Ranganathan

A Bootiful Podcast

Play Episode Listen Later Dec 23, 2021 64:45


Hi, Spring fans! Welcome to another installment of a _Bootiful Podcast_! How are you doing? In this episode, we've got an extra special holiday treat for you! [Josh Long (@starbuxman)](https://twitter.com/starbuxman) talks to [Yugabyte](https://twitter.com/Yugabyte) CEO and Apache Cassandra, and Apache HBase co-founder [Karthik Ranganathan (@karthikr)](https://twitter.com/karthikr). Merry Christmas (if you celebrate!)

Screaming in the Cloud
“Liqui”fying the Database Bottleneck with Robert Reeves

Screaming in the Cloud

Play Episode Listen Later Dec 16, 2021 50:45


About Robert
R2 advocates for Liquibase customers and provides technical architecture leadership. Prior to co-founding Datical (now Liquibase), Robert was a Director at the Austin Technology Incubator. Robert co-founded Phurnace Software in 2005. He invented and created the flagship product, Phurnace Deliver, which provides middleware infrastructure management to multiple Fortune 500 companies.
Links:
Liquibase: https://www.liquibase.com
Liquibase Community: https://www.liquibase.org
Liquibase AWS Marketplace: https://aws.amazon.com/marketplace/seller-profile?id=7e70900d-dcb2-4ef6-adab-f64590f4a967
GitHub: https://github.com/liquibase
Twitter: https://twitter.com/liquibase
Transcript
Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.
Corey: It seems like there is a new security breach every day. Are you confident that an old SSH key, or a shared admin account, isn't going to come back and bite you? If not, check out Teleport. Teleport is the easiest, most secure way to access all of your infrastructure. The open source Teleport Access Plane consolidates everything you need for secure access to your Linux and Windows servers—and I assure you there is no third option there. Kubernetes clusters, databases, and internal applications like AWS Management Console, Jenkins, GitLab, Grafana, Jupyter Notebooks, and more. Teleport's unique approach is not only more secure, it also improves developer productivity. To learn more visit: goteleport.com. And no, that is not me telling you to go away, it is: goteleport.com.
Corey: You know how Git works, right?
Announcer: Sorta, kinda, not really. Please ask someone else.
Corey: That's all of us. 
Git is how we build things, and Netlify is one of the best ways I've found to build those things quickly for the web. Netlify's Git-based workflows mean you don't have to play slap-and-tickle with integrating arcane nonsense and web hooks, which are themselves about as well understood as Git. Give them a try and see what folks ranging from my fake Twitter for Pets startup, to global Fortune 2000 companies are raving about. If you end up talking to them—because you don't have to; they get why self-service is important—but if you do, be sure to tell them that I sent you and watch all of the blood drain from their faces instantly. You can find them in the AWS marketplace or at www.netlify.com. N-E-T-L-I-F-Y dot com.
Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. This is a promoted episode. What does that mean in practice? Well, it means the company who provides the guest has paid to turn this into a discussion that's much more aligned with the company than it is the individual.
Sometimes it works, sometimes it doesn't, but the key part of that story is I get paid. Why am I bringing this up? Because today's guest is someone I met in person at Monktoberfest, which is the RedMonk conference in Portland, Maine, one of the only reasons to go to Maine, speaking as someone who grew up there. And I spoke there, I met my guest today, and eventually it turned into this, proving that I am the envy of developer advocates everywhere because now I can directly tie me attending one conference to making a fixed sum of money, and right now they're all screaming and tearing off their headphones and closing this episode. But for those of you who are sticking around, thank you. My guest today is the CTO and co-founder of Liquibase. Please welcome Robert Reeves. Robert, thank you for joining me, and suffering the slings and arrows I'm about to hurl directly into your arse, as a warning shot.
Robert: [laugh]. Man. Thanks for having me. 
Corey, I've been looking forward to this for a while. I love hanging out with you.
Corey: One of the things I love about the Monktoberfest conference, and frankly, anything that RedMonk gets up to is, forget what's on stage, which is uniformly excellent; forget the people at RedMonk who are wonderful and I aspire to do more work with them in different ways; they're great, but the people that they attract are invariably interesting, they are invariably incredibly diverse in terms of not just demographics, but interests and proclivities. It's just a wonderful group of people, and every time I get the opportunity to spend time with those folks I do, and I've never once regretted it because I get to meet people like you. Snark and cynicism about sponsoring this nonsense aside—for which I do thank you—you've been a fascinating person to talk to because you're better at a lot of the database-facing things than I am, so I shortcut: instead of forming my own opinions, I just skate off of yours in some cases. You're going to get letters now.
Robert: Well, look, it's an occupational hazard, right? Releasing software, it's hard so you have to learn these platforms, and part of it includes the database. But I tell you, you're spot on about Monktoberfest. I left that conference so motivated. Really opened my eyes, certainly injecting empathy into what I do on a day-to-day basis, but it spurred me to action.
And there's a lot of programs that we've started at Liquibase that the germination for that seed came from Monktoberfest. And certainly, you know, we were bummed out that it's been canceled two years in a row, but we can't wait to get back and sponsor it. No end of love and affection for that team. They're also really smart and right about a hundred percent of the time.
Corey: That's the most amazing part is that they have opinions that generally tend to mirror my own—which, you know—
Robert: [laugh].
Corey: —confirmation bias is awesome, but they almost never get it wrong. 
And that is one of the impressive things is when I do it, I'm shooting from the hip and I already have an apology half-written and ready to go, whereas when dealing with them, they do research on this and they don't have the ‘I'm a loud, abrasive shitposter on Twitter' defense to fall back on to defend opinions. And if they do, I've never seen them do it. They're right, and the fact that I am as aligned with them as I am, you'd think that one of us was cribbing from the other. I assure you that's not the case.
But every time Steve O'Grady or Rachel Stephens, or Kelly—I forget her last name; my apologies, it's all Twitter, but she studied medieval history, I remember that—or James Governor writes something, I'm uniformly looking at this and I feel a sense of dismay, being, “Dammit. I should have written this. It's so well written and it makes such a salient point.” I really envy their ability to be so consistently on point.
Robert: Well, they're the only analysts we pay money to. So, we vote with our dollars with that one. [laugh].
Corey: Yeah. I'm only an analyst when people have analyst budget. Other than that, I'm whatever the hell you describe me. So, let's talk about that thing you're here to show. You know, that little side project thing you founded and are the CTO of.
I wasn't super familiar with what Liquibase does until I looked into it and then had this—I got to say, it really pissed me off because I'm looking at it, and it's how did I not know that this existed back when the exact problems that you solve are the things I was careening headlong into? I was actively annoyed. You're also an open-source project, which means that you're effectively making all of your money by giving things away and hoping for gratitude to come back on you in the fullness of time, right?
Robert: Well, yeah. There's two things there. There's the open-source component, but also, where was this when I was struggling with this problem? 
So, for the folks that don't know, what Liquibase does is automate database schema change. So, if you need to update a database—I don't care what it is—as part of your application deployment, we can help.
Instead of writing a ticket or manually executing a SQL script, or generating a bunch of docs in a NoSQL database, you can have Liquibase help you out with that. And so I was at a conference years ago, at the booth, doing my booth thing, and a managing director of a very large bank came to me, like, “Hey, what do you do?” And saw what we did and got angry, started yelling at me. “Where were you three years ago when I was struggling with this problem?” Like, spitting mad. [laugh]. And I was like, “Dude, we just started”—this was a while ago—it was like, “We just started the company two years ago. We got here as soon as we could.”
But I struggled with this problem when I was a release manager. And so I've been doing this for years and years and years—I don't even want to talk about how long—getting bits from dev to test to production, and the database was always, always, always the bottleneck, whether it was things didn't run the same in test as they did, eventually, in production, environments weren't in sync. It's just really hard. And we've automated so much stuff, we've automated application deployment, lowercase a, compiled bits; we're building things with containers, so everything's in that container. It's not a J2EE app anymore—yay—but we haven't done a damn thing for the database.
And what this means is that we have a whole part of our industry, all of our database professionals, that are frankly struggling. I always say we don't sell software at Liquibase. We sell piano recitals, date nights, happy hours, all the stuff you want to do but you can't because you're stuck dealing with the database. And that's what we do at Liquibase.
Corey: Well, you're talking about database people. That's not how I even do it. 
I would never call myself that, for very good reason because you know, Route 53 remains the only database I use. But the problem I always had was that, “Great. I'm doing a deployment. Oh, I'm going to put out some changes to some web servers. Okay, what's my rollback?” “Well, we have this other commit we can use.” “Oh, we're going to be making a database schema change. What's your rollback strategy?” “Oh, I've updated my resume and made sure that any personal files I had on my work laptop have been backed up somewhere else when I immediately leave the company when we can't roll back.” Because there's not really going to be a company anymore at that point.
It's one of those everyone sort of holds their breath and winces when it comes to anything that resembles a schema change—or an ALTER TABLE as we used to call it—because that is ‘the mistakes will show' territory and you can hope and plan for things in pre-prod environments, but it's always scary. It's always terrifying because production is not like other things. That's why I always call my staging environment ‘theory' because things work in theory but not in production. So, it's how do you avoid the mess of winding up just creating disasters when you're dealing with the reality of your production environments? So, let's back up here. How do you do it? Because it sounds like something people would love to sell me but doesn't exist.
Robert: [laugh]. Well, it's real simple. We have a file, we call it the change log. And this is a ledger. So, databases need to be evolved. You can't drop everything and recreate it from scratch, so you have to apply changes sequentially.
And so what Liquibase will do is it connects to the database, and it says, “Hey, what version are you?” It looks at the change log, and we'll see, ehh, “There's ten change sets”—that's what components of a change log, we call them change sets—“There's ten change sets in there and the database is telling me that only five had been executed.” “Oh, great. 
Well, I'll execute these other five.” Or it asks the database, “Hey, how many have been executed?” And it says, “Ten.”And we've got a couple of meta tables that we have in the database, real simple, ANSI SQL compliant, that store the changes that happen to the database. So, if it's a net new database, say you're running a Docker container with the database in it on your local machine, it's empty, you would run Liquibase, and it says, “Oh, hey. It's got that, you know, new database smell. I can run everything.”And so the interesting thing happens when you start pointing it at an environment that you haven't updated in a while. So, dev and test typically are going to have a lot of releases. And so there's going to be little tiny incremental changes, but when it's time to go to production, Liquibase will catch it up. And so we speak SQL to the database, if it's a NoSQL database, we'll speak their API and make the changes requested. And that's it. It's very simple in how it works.The real complex stuff is when we go a couple of inches deeper, when we start doing things like, well, reverse engineering of your database. How can I get a change log of an existing database? Because nobody starts out using Liquibase for a project. You always do it later.Corey: No, no. It's one of those things where when you're doing a project to see if it works, it's one of those, “Great, I'll run a database in some local Docker container or something just to prove that it works.” And, “Todo: fix this later.” And yeah, that todo becomes load-bearing.Robert: [laugh]. That's scary. And so, you know, we can help, like, reverse engineering an entire database schema, no problem. We also have things called quality checks. So sure, you can test your Liquibase change against an empty database and it will tell you if it's syntactically correct—you'll get an error if you need to fix something—but it doesn't enforce things like corporate standards. 
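Editor's note: the change log mechanism Robert describes (an ordered ledger of change sets, plus a tracking table in the database that records which ones have already run) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Liquibase's code or file format: the `CHANGE_LOG` list, the `applied_changes` table name, and the `update` function are all hypothetical names invented for this sketch, and SQLite stands in for whatever database you actually use.

```python
import sqlite3

# Hypothetical change log: an ordered ledger of (change set id, SQL) pairs.
# The ids and table names are made up for this sketch.
CHANGE_LOG = [
    ("1-create-customer", "CREATE TABLE t_customer (id INTEGER PRIMARY KEY, name TEXT)"),
    ("2-add-email", "ALTER TABLE t_customer ADD COLUMN email TEXT"),
    ("3-create-order", "CREATE TABLE t_order (id INTEGER PRIMARY KEY, customer_id INTEGER)"),
]


def update(conn, change_log):
    """Apply, in order, every change set the database has not recorded yet."""
    # Tracking table (hypothetical name) that remembers which change sets ran.
    conn.execute("CREATE TABLE IF NOT EXISTS applied_changes (id TEXT PRIMARY KEY)")
    applied = {row[0] for row in conn.execute("SELECT id FROM applied_changes")}

    ran = []
    for change_id, sql in change_log:
        if change_id in applied:
            continue  # already executed against this database; skip it
        conn.execute(sql)
        conn.execute("INSERT INTO applied_changes (id) VALUES (?)", (change_id,))
        ran.append(change_id)
    conn.commit()
    return ran


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")   # an empty database: everything can run
    print(update(conn, CHANGE_LOG[:2]))  # an early release with two change sets
    print(update(conn, CHANGE_LOG))      # a later release: only the new one runs
    print(update(conn, CHANGE_LOG))      # nothing left to do
```

Running the same update against an environment that is several releases behind simply executes more of the ledger, which is the "catch it up" behavior described for production. A real tool also has to handle locking, rollback, and checksums of edited change sets, none of which this sketch attempts.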
“Tables start with T underscore.” “Do not create a foreign key unless those columns have an ID already applied.” And that's what our quality checks do. We used to call it rules, but nobody likes rules, so we call it quality checks now.
Corey: How do you avoid the trap of enumerating all the bad things you've seen happen because at some point, it feels like that's what leads to process ossification at large companies where, “Oh, we had this bad thing happen once, like, a disk filled up, so now we have a check that makes sure that all the disks are at least 20 percent empty.” Et cetera. Great. But you keep stacking those, you have thousands and thousands and thousands of those, and even a one-line code change then has to pass through so many different tests to validate that this isn't going to cause the failure mode that happened that one time in a unicorn circumstance. How do you avoid the bloat and the creep of stuff like that?
Robert: Well, let's look at what we've learned from automated testing. We certainly want more and more tests. Look, the DevOps algorithm is, “All right, we had a problem here.” [laugh]. Or SRE algorithm, I should say. “We had a problem here. What happened? What are we going to change in the future to make sure this doesn't happen?” Typically, that involves a new standard.
Now, ossification occurs when a person has to enforce that standard. And what we should do is seek to have automation, have the machine do it for us. Have the humans come up and identify the problem, find a creative way to look for the issue, and then let the machine enforce it. Ossification happens in large organizations when it's people that are responsible, not the machine. The machines are great at running these things over and over again, and they're never hung over, day after Super Bowl Sunday, their kid doesn't get sick, they don't get sick. But we want humans to look at the things that we need that creative energy, that brain power on. 
And then the rote drudgery, hand that off to the machine.Corey: Drudgery seems like sort of a job description for a lot of us who spend time doing operation stuff.Robert: [laugh].Corey: It's drudgery and it's boring, punctuated by moments of sheer terror. On some level, you're more or less taking some of the adrenaline high of this job away from people. And you know, when it comes to databases, I'm kind of okay with that as it turns out.Robert: Yeah. Oh, yeah, we want no surprises in database-land. And that is why over the past several decades—can I say several decades since 1979?Corey: Oh, you can s—it's many decades, I'm sorry to burst your bubble on that.Robert: [laugh]. Thank you, Corey. Thank you.Corey: Five, if we're being honest. Go ahead.Robert: So, it has evolved over these many decades where change is the enemy of stability. And so we don't want change, and we want to lock these things down. And our database professionals have become changed from sentinels of data into traffic cops and TSA. And as we all know, some things slip through those. Sometimes we speed, sometimes things get snuck through TSA.And so what we need to do is create a system where it's not the people that are in charge of that; that we can set these policies and have our database professionals do more valuable things, instead of that adrenaline rush of, “Oh, my God,” how about we get the rush of solving a problem and saving the company millions of dollars? How about that rush? How about the rush of taking our old, busted on-prem databases and figure out a way to scale these up in the cloud, and also provide quick dev and test environments for our developer and test friends? These are exciting things. These are more fun, I would argue.Corey: You have a list of reference customers on your website that are awesome. In fact, we share a reference customer in the form of Ticketmaster. 
And I don't think that they will get too upset if I mention that based upon my work with them, at no point was I left with the impression that they played fast and loose with databases. This was something that they take very seriously because for any company that, you know, sells tickets to things you kind of need an authoritative record of who's bought what, or suddenly you don't really have a ticket-selling business anymore. You also reference customers in the form of UPS, which is important; banks in a variety of different places.Yeah, this is stuff that matters. And you support—from the looks of it—every database people can name except for Route 53. You've got RDS, you've got Redshift, you've got Postgres-squeal, you've got Oracle, Snowflake, Google's Cloud Spanner—lest people think that it winds up being just something from a legacy perspective—Cassandra, et cetera, et cetera, et cetera, CockroachDB. I could go on because you have multiple pages of these things, SAP HANA—whatever the hell that's supposed to be—Yugabyte, and so on, and so forth. And it's like, some of these, like, ‘now you're just making up animals' territory.Robert: Well, that goes back to open-source, you know, you were talking about that earlier. There is no way in hell we could have brought out support for all these database platforms without us being open-source. That is where the community aligns their goals and works to a common end. So, I'll give you an example. So, case in point, recently, let me see Yugabyte, CockroachDB, AWS Redshift, and Google Cloud Spanner.So, these are four folks that reached out to us and said, either A) “Hey, we want Liquibase to support our database,” or B) “We want you to improve the support that's already there.” And so we have what we call—which is a super creative name—the Liquibase test harness, which is just genius because it's an automated way of running a whole suite of tests against an arbitrary database. 
And that helped us partner with these database vendors very quickly and to identify gaps. And so there's certain things that AWS Redshift—certain objects—that AWS Redshift doesn't support, for all the right reasons. Because it's a data warehouse.
Okay, great. And so we didn't have to run those tests. But there were other tests that we had to run, so we create a new test for them. They actually wrote some of those tests. Our friends at Yugabyte, CockroachDB, Cloud Spanner, they wrote these extensions and they came to us and partnered with us.
The only way this works is with open-source, by being open, by being transparent, and aligning what we want out of life. And so what our friends—our database friends—wanted was they wanted more tooling for their platform. We wanted to support their platform. So, by teaming up, we help the most important person, [laugh] the most important person, and that's the customer. That's it. It was not about, “Oh, money,” and all this other stuff. It was, “This makes our customers' lives easier. So, let's do it. Oop, no brainer.”
Corey: There's something to be said for making people's lives easier. I do want to talk about that open-source versus commercial divide. If I Google Liquibase—which, you know, I don't know how typing addresses in browsers works anymore because search engines are so fast—I just type in Liquibase. And the first thing it spits me out to is liquibase.org, which is the Community open-source version. And there's a link there to the Pro paid version and whatnot. And I was just scrolling idly through the comparison chart to see, “Oh, so ‘Community' is just code for shitty and you're holding back advanced features.” But it really doesn't look that way. What's the deal here?
Robert: Oh, no. So, Liquibase open-source project started in 2006 and Liquibase the company, the commercial entity, started after that, 2012; 2014, first deal. And so, for—Nathan Voxland started this, and Nathan was struggling. 
He was working at a company, and he had to have his application—of course—you know, early 2000s, J2EE—support SQL Server and Oracle and he was struggling with it. And so he open-sourced it and added more and more databases.Certainly, as open-source databases grew, obviously he added those: MySQL, Postgres. But we're never going to undo that stuff. There's rollback for free in Liquibase, we're not going to be [laugh] we're not going to be jerks and either A) pull features out or, B) even worse, make Stephen O'Grady's life awful by changing the license [laugh] so he has to write about it. He loves writing about open-source license changes. We're Apache 2.0 and so you can do whatever you want with it.And we believe that the things that make sense for a paying customer, which is database-specific objects, that makes sense. But Liquibase Community, the open-source stuff, that is built so you can go to any database. So, if you have a change log that runs against Oracle, it should be able to run against SQL Server, or MySQL, or Postgres, as long as you don't use platform-specific data types and those sorts of things. And so that's what Community is about. Community is about being able to support any database with the same change log. Pro is about helping you get to that next level of DevOps Nirvana, of reaching those four metrics that Dr. Forsgren tells us are really important.Corey: Oh, yes. You can argue with Nicole Forsgren, but then you're wrong. So, why would you ever do that?Robert: Yeah. Yeah. [laugh]. It's just—it's a sucker's bet. Don't do it. There's a reason why she's got a PhD in CS.Corey: She has been a recurring guest on this show, and I only wish she would come back more often. You and I are fun to talk to, don't get me wrong. We want unbridled intellect that is couched in just a scintillating wit, and someone is great to talk to. Sorry, we're both outclassed.Robert: Yeah, you get entertained with us; you learn with her.Corey: Exactly. 
And you're still entertained while doing it is the best part.

Robert: [laugh]. That's the difference between Community and Pro. Look, at the end of the day, if you're an individual developer just trying to solve a problem and get done and away from the computer and go spend time with your friends and family, yeah, go use Liquibase Community. If it's something that you think can improve the rest of the organization by teaming up and taking advantage of the collaboration features? Yes, sure, let us know. We're happy to help.

Corey: Now, if people wanted to become an attorney, but law school was too expensive, out of reach, too much time, et cetera, but they did have a Twitter account, very often, they'll find that they can scratch that itch by arguing online about open-source licenses. So, I want to be very clear—because those people are odious when they email me—that you are licensed under the Apache License. That is a bona fide OSI-approved open-source license. It is not everyone except big cloud companies, or service providers, which basically are people dancing around—they mean Amazon. So, let's be clear. One, are you worried about Amazon launching a competitive service with a dumb name? And/or have you really been validated as a product if AWS hasn't attempted and failed to launch a competitor?

Robert: [laugh]. Well, I mean, we do have a very large corporation that has embedded Liquibase into one of their flagship products, and that is Oracle. They have embedded Liquibase in SQLcl. We're tickled pink because that means that, one, yes, it does validate Liquibase is the right way to do it, but it also means more people are getting help. Now, for Oracle users, if you're just an Oracle shop, great, have fun. We think it's a great solution. 
But there's not a lot of those.

And so we believe that if you have Liquibase, whether it's open-source or the Pro version, then you're going to be able to support all the databases, and I think that's more important than being tied to a single cloud. Also—this is just my opinion and take it for what it's worth—but if Amazon wanted to do this, well, they're not the only game in town. So, somebody else is going to want to do it, too. And, you know, I would argue even with Amazon's backing that Liquibase is a little stronger brand than anything they would come out with.

Corey: This episode is sponsored by our friends at Oracle. HeatWave is a new high-performance accelerator for the Oracle MySQL Database Service. Although I insist on calling it “my squirrel.” While MySQL has long been the world's most popular open source database, shifting from transacting to analytics required way too much overhead and, ya know, work. With HeatWave you can run your OLTP and OLAP, don't ask me to ever say those acronyms again, workloads directly from your MySQL database and eliminate the time-consuming data movement and integration work, while also performing 1100X faster than Amazon Aurora, and 2.5X faster than Amazon Redshift, at a third of the cost. My thanks again to Oracle Cloud for sponsoring this ridiculous nonsense.

Corey: So, I want to call out though, that on some level, they have already competed with you because one of the databases that you do not support is DynamoDB. Let's ignore the Route 53 stuff because, okay. But the reason behind that, having worked with it myself, is that, “Oh, how do you do a schema change in DynamoDB?” The answer is that you don't because it doesn't do schemas for one—it is schemaless, which is kind of the point of it—as well as oh, you want to change the primary, or the partition, or the sort key index? Great. 
You need a new table because those things are immutable.

So, they've solved this Gordian Knot just like Alexander the Great did by cutting through it. Like, “Oh, how do you wind up doing this?” “You don't do this. The end.” And that is certainly an approach, but there are scenarios where, first, NoSQL is not an acceptable answer for some workloads.

I know Rick [Horahan 00:26:16] is going to yell at me for that as soon as he hears me, but okay. But there are some for which a relational database is kind of a thing, and you need that. So, Dynamo isn't fit for everything. But there are other workloads where, okay, I'm going to just switch over. I'm going to basically dump all the data and add it to a new table. I can't necessarily afford to do that with anything less than maybe, you know, 20 milliseconds of downtime between table one and table two. And there are obnoxious and difficult ways to do it, but for everything else, you do kind of need to make ALTER TABLE changes from time to time as you go through the build and release process.

Robert: Yeah. Well, we certainly have plans for DynamoDB support. We are working our way through all the NoSQLs. Started with Mongo, and—

Corey: Well, back that out a second then for me because there's something I'm clearly not grasping because it's my understanding, DynamoDB is schemaless. You can put whatever you want into various arbitrary fields. How would Liquibase work with something like that?

Robert: Well, that's something I struggled with. I had the same question. Like, “Dude, really, we're a schema change tool. Why would we work with a schemaless database?” And so what happened was a soon-to-be friend of ours in Europe had reached out to me and said, “I built an extension for MongoDB in Liquibase. Can we open-source this, and can y'all take care of the care and feeding of this?” And I said, “Absolutely. 
What does it do?” [laugh].

And so I looked at it and it turns out that it focuses on collections and generating data for test. So, you're right about schemaless because these are just documents and we're not going to go through every single document and change the structure, we're just going to have the application create a new doc in the new format. Maybe there's conversion logic built into the app, who knows. But it's the database professionals that have to apply these collections—you know, indices; that's what they call them in Mongo-land: collections. And so being able to apply these across all environments—dev, test, production—and have consistency, that's important.

Now, what was really interesting is that this came from MasterCard. So, this engineer had a consulting business and worked for MasterCard. And they had a problem, and they said, “Hey, can you fix this with Liquibase?” And he said, “Sure, no problem.” And he built it.

So, that's why if you go to the MongoDB—the liquibase-mongodb repository in our Liquibase org, you'll see that MasterCard has the copyright on all that code. Still Apache 2.0. But for me, that was the validation we needed to start expanding to other things: Dynamo, Couch. And same—

Corey: Oh, yeah. For a lot of contributors, there's a contributor license process you can go through, assign copyright. For everything else, there's MasterCard.

Robert: Yeah. Well, we don't do that. Look, you know, we certainly have a code of conduct with our community, but we don't have a signing copyright and that kind of stuff. Because that's baked into Apache 2.0. So, why would I want to take somebody's ability to get credit and magical internet points and increase the rep by taking that away? That's just rude.

Corey: The problem I keep smacking myself into is just looking at how the entire database space across the board goes, it feels like it's built on lock-in, it's built on it is super finicky to work with, and it generally feels like, okay, great. 
You take something like Postgres-squeal or whatever it is you want to run your database on, yeah, you could theoretically move it a bunch of other places, but moving databases is really hard. Back when I was at my last, “Real job,” quote-unquote, years ago, we were late to the game; we migrated the entire site from EC2 Classic into a VPC, and the biggest pain in the ass with all of that was the RDS instance. Because we had to quiesce the database so it would stop taking writes; we would then snapshot it, shut it down, and then restore a new database from that RDS snapshot.

How long does it take, at least in those days? That is left as an experiment for the reader. So, we booked a four hour maintenance window under the fear that would not be enough. It completed in 45 minutes. So okay, there's that. Sparked the thing up and everything else was tested and good to go. And yay. Okay.

It took a tremendous amount of planning, a tremendous amount of work, and that wasn't moving it very far. It is the only time I've done a late-night deploy, where not a single thing went wrong. Until I was on the way home and the Uber driver sideswiped a city vehicle. So, there we go—

Robert: [laugh].

Corey: —that's the one. But everything else was flawless on this because we planned these things out. But imagine moving to a different provider. Oh, forget it. Or imagine moving to a different database engine? That's good. Tell another one.

Robert: Well, those are the problems that we want our database professionals to solve. We do not want them to be like janitors at an elementary school, cleaning up developer throw-up with sawdust. The issue that you're describing, that's a one time event. This is something that doesn't happen very often. 
You need hands on the keyboard, you want people there to look for problems.

If you can take these database releases away from those folks and automate them safely—you can have safety and speed—then that frees up their time to do these other herculean tasks, these other feats of strength that they're far better at. There is no silver bullet panacea for database issues. All we're trying to do is take about 70% of DBAs' time and free it up to do the fun stuff that you described. There are people that really enjoy that, and we want to free up their time so they can do that. Moving to another platform, going from the data center to the cloud, these sorts of things, this is what we want a human on; we don't want them updating a column three times in a row because dev couldn't get it right. Let's just give them the keys and make sure they stay in their lane.

Corey: There's something glorious about being able to do that. I wish that there were more commonly appreciated ways of addressing those pains, rather than, “Oh, we're going to sell you something big and enterprise-y and it's going to add a bunch of process and not work out super well for you.” You integrate with existing CI/CD systems reasonably well, as best I can tell because the nice thing about CI/CD—and by nice I mean awful—is that there is no consensus. Every pipeline you see, in a release engineering process inherently becomes this beautiful bespoke unicorn.

Robert: Mm-hm. Yeah. And we have to. We have to integrate with whatever CI/CD they have in place. And we do not want customers to just run Liquibase by itself. We want them to integrate it with whatever is driving that application deployment.

We're Switzerland when it comes to databases, and CI/CD. And I certainly have my favorite of those, and it's primarily based on who bought me drinks at the last conference, but we cannot go into somebody's house and start rearranging the furniture. That's just rude. 
If they're deploying the app a certain way, what we tell that customer is, “Hey, we're just going to have that CI/CD tool call Liquibase to update the database. This should be an atomic unit of deployment.” And it should be hidden from the person that pushes that shiny button or the automation that does it.

Corey: I wish that one day that you could automate all of the button pushing, but the thing that always annoyed me in release engineering was the, “Oh, and here's where we stop to have a human press the button.” And I get it. That stuff's scary for some folks, but at the same time, this is the nature of reality. So, you're not going to be able to technology your way around people. At least not successfully and not for very long.

Robert: It's about trust. You have to earn that database professional's trust because if something goes wrong, blaming Liquibase doesn't go very far. In that company, they're going to want a person [laugh] who has a badge to—with a throat to choke. And so I've seen this pattern over and over again.

And this happened at our first customer. Major, major, big, big, big bank, and this was on the consumer side. They were doing their first production push, and they wanted us ready. Not on the call, but ready if there was an issue they needed to escalate and get us to help them out. And so my VP of Engineering and me, we took it. Great. Got VP of engineering and CTO. Right on.

And so Kevin and I, we stayed home, stayed sober [laugh], you know—a lot of places to party in Austin; we fought that temptation—and so we stayed and I'm texting with Kevin, back and forth. “Did you get a call?” “No, I didn't get a call.” It was Friday night. Saturday rolls around. Sunday. “Did you get a—what's going on?” [laugh].

Monday, we're like, “Hey. Everything okay? Did you push to the next weekend?” They're like, “Oh, no. We did. It went great. We forgot to tell you.” [laugh]. But here's what happened. 
The DBAs push the Liquibase ‘make it go’ button, and then they said, “Uh-Oh.” And we're like, “What do you mean, uh-oh?” They said, “Well, something went wrong.” “Well, what went wrong?” “Well, it was too fast.” [laugh]. Something—no way. And so they went through the whole thing—

Corey: That was my downtime when I was supposed to be compiling.

Robert: Yeah. So, they went through the whole thing to verify every single change set. Okay, so that was weekend one. And then they go to weekend two, they do the same thing. All right, all right. Building trust.

By week four, they called a meeting with the release team. And they said, “Hey, process change. We're no longer going to be on these calls. You are going to push the Liquibase button. Now, if you want to integrate it with your CI/CD, go right ahead, but that's not my problem.” Dev—or, the release team is tier one; dev is tier two; we—DBAs—are tier three support, but we'll call you because we'll know something went wrong. And to this day, it's all automated.

And so you have to earn trust to get people to give that up. Once they have trust and you really—it's based on empathy. You have to understand how terrible [laugh] they are sometimes treated, and to actively take care of them, realize the problems they're struggling with, and when you earn that trust, then and only then will they allow automation. But it's hard, but it's something you got to do.

Corey: You mentioned something a minute ago that I want to focus on a little bit more closely, specifically that you're in Austin. Seems like that's a popular choice lately. You've got companies that are relocating their headquarters there, presumably for tax purposes. Oracle's there, Tesla's there. Great. I mean, from my perspective, terrific because it gets a number of notably annoying CEOs out of my backyard. But what's going on? Why is Austin on this meteoric rise and how'd it get there?

Robert: Well, a lot of folks—overnight success, 40 years in the making, I guess. 
But what a lot of people don't realize is that, one, we had a pretty vibrant tech hub prior to all this. It all started with MCC, Microcomputer Consortium, which in the '80s, we were afraid of the Japanese taking over and so we decided to get a bunch of companies together, and Admiral Bobby Inman who was director planted it in Austin. And that's where it started. You certainly have other folks that have a huge impact, obviously, Michael Dell, Austin Ventures, a whole host of folks that have really leaned in on tech in Austin, but it actually started before that.

So, there was a time where Willie Nelson was in Nashville and was just fed up with RCA Records. They would not release his albums because he wanted to change his sound. And so he had some nice friends at Atlantic Records that said, “Willie, we got this. Go to New York, use our studio, cut an album, we'll fix it up.” And so he cut an album called Shotgun Willie, famous for having “Whiskey River” which is what he uses to open and close every show.

But that album sucked as far as sales. It's a good album, I like it. But it didn't sell except for one place in America: in Austin, Texas. It sold more copies in Austin than anywhere else. And so Willie was like, “I need to go check this out.”

And so he shows up in Austin and sees a bunch of rednecks and hippies hanging out together, really geeking out on music. It was a great vibe. And then he calls, you know, Kris, and Waylon, and Merle, and say, “Come on down.” And so what happened here was a bunch of people really wanted to geek out on this new type of country music, outlaw country. And it started a pattern where people just geek out on stuff they really like.

So, same thing with Austin film. You got Robert Rodriguez, you got Richard Linklater, and Slackers, his first movie, that's why I moved to Austin. And I got a job at Les Amis—a coffee shop that's closed—because it had three scenes in that. 
There was a whole scene of people that just really wanted to make different types of films. And we see that with software, we see that with film, we see it with fashion.

And it just seems that Austin is the place where if you're really into something, you're going to find somebody here that really wants to get into it with you, whether it's board gaming, D&D, noise punk, whatever. And that's really comforting. I think it's the community that's just welcoming. And I just hope that we can continue that creativity, that sense of community, and that we don't have large corporations that are coming in and just taking from the system. I hope they inject more.

I think Oracle's done a really good job; their new headquarters is gorgeous, they've done some really good things with the city, doing a land swap, I think it was forty acres for nine acres. They coughed up forty for nine. And it was nine acres the city wasn't even using. Great. So, I think they're being good citizens. I think Tesla's been pretty cool with building that factory where it is. I hope more come. I hope they catch whatever is in the water and the breakfast tacos in Austin.

Corey: [laugh]. I certainly look forward to this pandemic ending; I can come over and find out for myself. I'm looking forward to it. I always enjoyed my time there, I just wish I got to spend more of it.

Robert: How many folks from Duckbill Group are in Austin now?

Corey: One at the moment. Tim Banks. And the challenge, of course, is that if you look across the board, there really aren't that many places that have more than one employee. For example, our operations person, Megan, is here in San Francisco and so is Jesse DeRose, our manager of cloud economics. But my business partner is in Portland; we have people scattered all over the country.

It's kind of fun having a fully-distributed company. We started this way, back when that was easy. And because all right, travel is easy; we'll just go and visit whenever we need to. 
But there's no central office, which I think is sort of the dangerous part of full remote because then you have this idea of second-class citizens hanging out in one part of the country and then they go out to lunch together and that's where the real decisions get made. And then you get caught up to speed. It definitely fosters a writing culture.

Robert: Yeah. When we went to remote work, our lease was up. We just didn't renew. And now we have expanded hiring outside of Austin, we have folks in the Ukraine, Poland, Brazil, more and more coming. We even have folks that are moving out of Austin to places like Minnesota and Virginia, moving back home where their family is located.

And that is wonderful. But we are getting together as a company in January. We're also going to, instead of having an office, we're calling it a ‘Liquibase Lounge.’ So, there's a number of retail places that didn't survive, and so we're going to take one of those spots and just make a little hangout place so that people can come in. And we also want to open it up for the community as well.

But it's very important—and we learned this from our friends at GitLab and their culture. We really studied how they do it, how they've been successful, and it is an awareness of those lunch meetings where the decisions are made. And it is saying, “Nope, this is great we've had this conversation. We need to have this conversation again. Let's bring other people in.” And that's how we're doing it at Liquibase, and so far it seems to work.

Corey: I'm looking forward to seeing what happens, once this whole pandemic ends, and how things continue to thrive. We're long past due for a startup center that isn't San Francisco. The whole thing is based on the idea of disruption. 
“Oh, we're disruptive.” “Yes, we're so disruptive, we've taken a job that can be done from literally anywhere with internet access and created a land crunch in eight square miles, located in an earthquake zone.” Genius, simply genius.

Robert: It's a shame that we had to have such a tragedy to happen to fix that.

Corey: Isn't that the truth?

Robert: It really is. But the toothpaste is out of the tube. You ain't putting that back in. But my bet on the next Tech Hub: Kansas City. That town is cool, it has one hundred percent Google Fiber all throughout, great university. Kauffman Fellows, I believe, is based there, so VC folks are trained there. I believe so; I hope I'm not wrong with that. I know Kauffman Foundation is there. But look, there's something happening in that town. And so if you're a buy low, sell high kind of person, come check us out in Austin. I'm not trying to dissuade anybody from moving to Austin; I'm not one of those people. But if the housing prices [laugh] you don't like them, check out Kansas City, and get that two-gig fiber for peanuts. Well, $75 worth of peanuts.

Corey: Robert, I want to thank you for taking the time to speak with me so extensively about Liquibase, about how awesome RedMonk is, about Austin and so many other topics. If people want to learn more, where can they find you?

Robert: Well, I think the best place to find us right now is in AWS Marketplace. So—

Corey: Now, hang on a second. When you say the best place for anything being the AWS Marketplace, I'm naturally a little suspicious. Tell me more.

Robert: [laugh]. Well, best is, you know, it's—[laugh].

Corey: It is a place that is there and people can find you through it. All right, then.

Robert: I have a list. I have a list. But the first one I'm going to mention is AWS Marketplace. And so that's a really easy way, especially if you're taking advantage of the EDP, Enterprise Discount Program. That's helpful. Burn down those dollars, get a discount, et cetera, et cetera. 
Now, of course, you can go to liquibase.com, download a trial. Or you can find us on GitHub, github.com/liquibase. Of course, talking smack to us on Twitter is always appreciated.

Corey: And we will, of course, include links to that in the [show notes 00:46:37]. Robert Reeves, CTO and co-founder of Liquibase. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice along with an angry comment complaining about how Liquibase doesn't support your database engine of choice, which will quickly be rendered obsolete by the open-source community.

Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.

Announcer: This has been a HumblePod production. Stay humble.
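
Editor's sketch: earlier in this episode Robert describes a change log made of change sets that gets applied the same way in dev, test, and production, with the CI/CD tool triggering the update. The Python below is only a rough illustration of that general idea (apply each change set once, and record it so a second run is a no-op). It is not Liquibase's implementation or API; the `change_log` table, the change-set ids, the SQL, and the `apply_change_sets` helper are all hypothetical, and SQLite stands in for a real database.

```python
import sqlite3

# Hypothetical change sets for illustration only: (id, SQL to run).
CHANGE_SETS = [
    ("1-create-customer", "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)"),
    ("2-add-email", "ALTER TABLE customer ADD COLUMN email TEXT"),
]


def apply_change_sets(conn, change_sets):
    """Apply each change set at most once; return the ids applied in this run."""
    # Tracking table (hypothetical name) that remembers what has already run.
    conn.execute("CREATE TABLE IF NOT EXISTS change_log (id TEXT PRIMARY KEY)")
    already_applied = {row[0] for row in conn.execute("SELECT id FROM change_log")}

    newly_applied = []
    for change_id, sql in change_sets:
        if change_id in already_applied:
            continue  # Already recorded, so skip it on this run.
        conn.execute(sql)
        conn.execute("INSERT INTO change_log (id) VALUES (?)", (change_id,))
        conn.commit()
        newly_applied.append(change_id)
    return newly_applied


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    print(apply_change_sets(conn, CHANGE_SETS))  # first run: both change sets are new
    print(apply_change_sets(conn, CHANGE_SETS))  # second run: nothing left to apply
```

Because the function only needs a connection and a list of change sets, a pipeline step could call it on every deploy without knowing whether the database is already up to date, which is roughly the "atomic unit of deployment" idea from the conversation.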

Talking Serverless
#44 - Franck Pachot Developer Advocate at Yugabyte

Talking Serverless

Nov 19, 2021 65:15


In this episode, host Ryan Jones is joined by the fantastic Franck Pachot. Franck is an AWS Hero, Oracle ACE Director, as well as an Oracle Certified Master. With over 20 years of experience in development, data modeling, infrastructure, and all DBA tasks, it's no surprise Franck is a recognized expert across Oracle, PostgreSQL, and AWS. Currently, Franck is a Developer Advocate at Yugabyte, an open-source cloud-native distributed SQL database. You can keep up with Franck on his blog, Twitter, and podcast. Send in a voice message: https://anchor.fm/talking-serverless/message

Mission Control Center
From Oracle to Developer Advocacy: A Database Career – Franck Pachot, Yugabyte

Mission Control Center

Nov 17, 2021 12:08


Franck Pachot is a Swiss database career expert working as a developer advocate at the open-source distributed SQL database firm Yugabyte. Here's how this Oracle ACE Director, Oracle Certified Master, and AWS Data Hero went from consulting to developer advocacy, his take on new database technologies, and his advice for those looking to go into the field. -- Mission Control Center is your one-stop shop for career advice and tech trends analysis. IT career tips, industry news and insider stories. Every other week, in your inbox and ears. This podcast is brought to you by Mindquest, Europe's new IT recruiting agency. Mindquest accompanies you throughout your entire professional journey with high-quality offers that meet all your needs and aspirations.

Electro Monkeys
Yugabyte, distributed PostgreSQL with Franck Pachot

Electro Monkeys

Nov 16, 2021 63:07


Data persistence has always been complicated, particularly when it comes to tackling problems as tricky as availability or scaling. For a long time we relied on the master/slave model for high availability, but while that more or less solved that problem, it did not solve scaling. Then came NoSQL databases: distributed and "easy" to scale. We thought we had the solution, but that was without reckoning with the fact that they were often poorly suited to transactional workloads. Today we are seeing the rise of new-generation SQL databases such as Google Spanner, Cockroach DB, and Yugabyte. And what better way to chat about it than to welcome Franck Pachot, Developer Advocate at Yugabyte?

Orchestrate all the Things podcast: Connecting the Dots with George Anadiotis
Another globally distributed cloud native SQL database unicorn: Yugabyte raises $188M Series C funding at $1.3B valuation. Featuring CEO Bill Cook, Co-founder Karthik Ranganathan

Orchestrate all the Things podcast: Connecting the Dots with George Anadiotis

Oct 28, 2021 32:23


A number of successive funding rounds have given Yugabyte unicorn status, while positioning the company to aim for a big piece of a growing pie in the shape-shifting database market. Article published on ZDNet.

Screaming in the Cloud
Yugabyte and Database Innovations with Karthik Ranganathan

Screaming in the Cloud

Sep 21, 2021 38:53


About Karthik

Karthik was one of the original database engineers at Facebook responsible for building distributed databases including Cassandra and HBase. He is an Apache HBase committer, and also an early contributor to Cassandra, before it was open-sourced by Facebook. He is currently the co-founder and CTO of the company behind YugabyteDB, a fully open-source distributed SQL database for building cloud-native and geo-distributed applications.

Links:
Yugabyte community Slack channel: https://yugabyte-db.slack.com/
Distributed SQL Summit: https://distributedsql.org
Twitter: https://twitter.com/YugaByte

Transcript

Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.

Corey: You could go ahead and build your own coding and mapping notification system, but it takes time, and it sucks! Alternately, consider Courier, who is sponsoring this episode. They make it easy. You can call a single send API for all of your notifications and channels. You can control the complexity around routing, retries, and deliverability and simplify your notification sequences with automation rules. Visit courier.com today and get started for free. If you wind up talking to them, tell them I sent you and watch them wince—because everyone does when you bring up my name. That's the glorious part of being me. Once again, you could build your own notification system but why on god's flat earth would you do that?

Corey: This episode is sponsored in part by “you”—gabyte. Distributed technologies like Kubernetes are great, citation very much needed, because they make it easier to have resilient, scalable, systems. 
SQL databases haven't kept pace though, certainly not like NoSQL databases have, like Route 53, the world's greatest database. We're still, other than that, using legacy monolithic databases that require ever growing instances of compute. Sometimes we'll try and bolt them together to make them more resilient and scalable, but let's be honest it never works out well. Consider Yugabyte DB, it's a distributed SQL database that solves basically all of this. It is 100% open source, and there's no asterisk next to the “open” on that one. And it's designed to be resilient and scalable out of the box so you don't have to charge yourself to death. It's compatible with PostgreSQL, or “postgresqueal” as I insist on pronouncing it, so you can use it right away without having to learn a new language and refactor everything. And you can distribute it wherever your applications take you, from across availability zones to other regions or even other cloud providers should one of those happen to exist. Go to yugabyte.com, that's Y-U-G-A-B-Y-T-E dot com and try their free beta of Yugabyte Cloud, where they host and manage it for you. Or see what the open source project looks like—it's effortless distributed SQL for global apps. My thanks to Yu—gabyte for sponsoring this episode.

Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. Today's promoted episode comes from the place where a lot of my episodes do: I loudly and stridently insist that Route 53—or DNS in general—is the world's greatest database, and then what happens is a whole bunch of people who work at database companies get upset with what I've said. Now, please don't misunderstand me; they're wrong, but I'm thrilled to have them come on and demonstrate that, which is what's happening today. My guest is CTO and co-founder of Yugabyte. Karthik Ranganathan, thank you so much for spending the time to speak with me today. How are you?

Karthik: I'm doing great. Thanks for having me, Corey. 
We'll just go for YugabyteDB being the second-best database. Let's just keep the first [crosstalk 00:01:13]—

Corey: Okay. We're all fighting for number two, there. And besides, number two tries harder. It's like that whole branding thing from years past. So, you were one of the original database engineers at Facebook, responsible for building a bunch of nonsense, like Cassandra and HBase. You were an HBase committer, early contributor to Cassandra, even before it was open-sourced.

And then you looked around and said, “All right, I'm going to go start a company”—roughly around 2016, if memory serves—“And I'm going to go and build a database and bring it to the world.” Let's start at the beginning. Why on God's flat earth do we need another database?

Karthik: Yeah, that's the question. That's the million-dollar question isn't it, Corey? So, this is one, fortunately, that we've had to answer so many times from 2016, that I guess we've gotten a little good at it. So, here's the learning that a lot of us had from Facebook: we were the original team, like, all three of us founders, we met at Facebook, and we not only built databases, we also ran them. And let me paint a picture.

Back in 2007, the public cloud really wasn't very common, and people were just going into multi-region, multi-datacenter deployments, and Facebook was just starting to take off, to really scale. Now, forward to 2013—I was there through the entire journey—a number of things happened in Facebook: we saw the rise of the equivalent of Kubernetes which was internally built; we saw, for example, microservice—

Corey: Yeah, the Tupperware equivalent, there.

Karthik: Tupperware, exactly. You know the name. Yeah, exactly. And we saw how we went from two data centers to multiple data centers, and nearby and faraway data centers—zones and regions, what you know as today—and a number of such technologies come up. 
And I was on the database side, and we saw how existing databases wouldn't work to distribute data across nodes, failover, et cetera, et cetera.

So, we had to build a new class of databases, what we now know as NoSQL. Now, back in Facebook, I mean, the typical difference between Facebook and an enterprise at large is Facebook has a few really massive applications. For example, you do a set of interactions, you view profiles, you add friends, you talk with them, et cetera, right? These are supermassive in their usage, but they were very few in their access patterns. At Facebook, we were mostly interested in dealing with scale and availability.

Existing databases couldn't do it, so we built NoSQL. Now, forward a number of years, I can't tell you how many times I've had conversations with other people building applications that will say, “Hey, can I get a secondary index on the SQL database?” Or, “How about that transaction? I only need it a couple of times; I don't need it all the time, but could you, for example, do multi-row transactions?” And the answer was always, “No,” because it was never built for that.

So today, what we're seeing is that transactional data and transactional applications are all going cloud-native, and they all need to deal with scale and availability. And so the existing databases don't quite cut it. So, the simple answer to why we need it is we need a relational database that can run in the cloud to satisfy just three properties: it needs to be highly available, failures or no, upgrades or no, it needs to be available; it needs to scale on demand, so simply add or remove nodes and scale up or down; and it needs to be able to replicate data across zones, across regions, and a variety of different topologies. So availability, scale, and geographic distribution, along with retaining most of the RDBMS features, the SQL features.
That's really the gap we're trying to solve.

Corey: I don't know that I've ever told this story on the podcast, but I want to say it was back in 2009. I flew up to Palo Alto and interviewed at Facebook, and it was a different time, a different era; it turns out that I'm not as good on the whiteboard as I am at running my mouth, so all right, I did not receive an offer, but I think everyone can agree at this point that was for the best. But I saw one of the most impressive things I've ever seen, during a part of that interview process. My interview was scheduled for a conference room for what must have been 11 o'clock or something like that, and at 10:59, they're looking at their watch, like, “Hang on ten seconds.” And then the person I was with reached out to knock on the door to let the person know that their meeting was over and the door opened.

So, it's very clear that even in large companies, which Facebook very much was at the time, people had synchronized clocks. This seems to be a thing, as I've learned from reading the parts that I could understand of the Google Spanner paper: when you're doing distributed databases, clocks are super important. At places like Facebook, that is, I'm not going to say it's easy, let's be clear here. Nothing is easy, particularly at scale, but Facebook has advantages in that they can mandate how clocks are going to be handled throughout every piece of their infrastructure. You're building an open-source database and you can't guarantee in what environment and on what hardware that's going to run, and, “You must have an atomic clock hooked up,” is not something you're generally allowed to tell people. How do you get around that?

Karthik: That's a great question. Very insightful, cutting right to the chase. So, the reality is, we cannot rely on atomic clocks, we cannot mandate our users to use them, or, you know, we'd not be very popularly used in a variety of different deployments.
In fact, we also work in on-prem private clouds and hybrid deployments where you really cannot get these atomic clocks. So, the way we do this is we come up with other algorithms to make sure that we're able to get the clocks as synchronized as we can.

So, think about it at a higher level; the reason Google uses atomic clocks is to make sure that they can wait to make sure every other machine is synchronized with them, and the wait time is about seven milliseconds. So, the atomic clock service, or the TrueTime service, says no two machines are farther apart than about seven milliseconds. So, you just wait for seven milliseconds, you know everybody else has caught up with you. And the reason you need this is you don't want to write on a machine, you don't want to write some data, and then go to a machine that has a future or an older time and get inconsistent results. So, just by waiting seven milliseconds, they can ensure that no one is going to be older and therefore serve an older version of the data, so for every write that was written, the other machines see it.

Now, the way we do this is we only have NTP, the Network Time Protocol, which does synchronization of time across machines, except it takes 150 to 200 milliseconds. Now, we wouldn't be a very good database if we said, “Look, every operation is going to take 150 milliseconds.” So, within these 150 milliseconds, we actually do the synchronization in software. So, we replaced the notion of an atomic clock with what is called a hybrid logical clock.
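The hybrid logical clock mentioned here pairs a physical reading (NTP-synchronized wall time) with a logical counter, and merges the two whenever nodes exchange RPCs. As a rough illustration only, the sketch below follows the generic textbook update rules for such a clock; it is not YugabyteDB's implementation, and the class name `HybridLogicalClock`, the methods `tick` and `observe`, and the microsecond unit are all assumptions made for this example.

```python
import time
from typing import Callable, Tuple

# A timestamp is (physical_micros, logical_counter); tuples compare in that order.
Timestamp = Tuple[int, int]


class HybridLogicalClock:
    """Illustrative hybrid logical clock (hypothetical names, not a real API)."""

    def __init__(
        self,
        physical_clock: Callable[[], int] = lambda: time.time_ns() // 1000,
    ) -> None:
        self._now = physical_clock  # wall clock source, e.g. NTP-disciplined
        self._physical = 0          # highest physical time seen so far
        self._logical = 0           # tie-breaker within one physical value

    def tick(self) -> Timestamp:
        """Timestamp a local event or an outgoing RPC."""
        wall = self._now()
        if wall > self._physical:
            # Wall clock moved forward: adopt it and reset the counter.
            self._physical, self._logical = wall, 0
        else:
            # Wall clock stalled or lags: advance only the counter.
            self._logical += 1
        return (self._physical, self._logical)

    def observe(self, remote: Timestamp) -> Timestamp:
        """Merge a timestamp carried on an incoming RPC."""
        wall = self._now()
        remote_physical, remote_logical = remote
        physical = max(wall, self._physical, remote_physical)
        if physical == self._physical and physical == remote_physical:
            logical = max(self._logical, remote_logical) + 1
        elif physical == self._physical:
            logical = self._logical + 1
        elif physical == remote_physical:
            logical = remote_logical + 1
        else:
            # Local wall clock is ahead of everything seen so far.
            logical = 0
        self._physical, self._logical = physical, logical
        return (physical, logical)


if __name__ == "__main__":
    fast = HybridLogicalClock(lambda: 100)  # node whose wall clock reads 100
    slow = HybridLogicalClock(lambda: 90)   # node whose wall clock lags at 90
    sent = fast.tick()
    received = slow.observe(sent)
    print(sent, received, slow.tick())
```

Because timestamps compare as (physical, logical) pairs, a node whose wall clock lags still issues timestamps that sort after any timestamp it has already observed, which is the ordering property the RPC exchange is meant to provide.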
So, one part using NTP and physical time, and another part using counters and logical time and keep exchanging RPCs—which are needed in the course of the database functioning anyway—to make sure we start normalizing time very quickly.

This in fact has some advantages—and disadvantages, everything has trade-offs—but the advantage it has over a TrueTime-style deployment is you don't even have to wait that seven milliseconds in a number of scenarios, you can just instantly respond. So, that means you get even lower latencies in some cases. Of course, the trade-off is there are other cases where you have to do more work, and therefore more latency.

Corey: The idea absolutely makes sense. You started this as an open-source project, and it's thriving. Who's using it and for what purposes?

Karthik: Okay, so one of the fundamental tenets of building this database—I think back to your question of why does the world need another database—is that the hypothesis is not so much the world needs another database API; that's really what users complain against, right? You create a new API and—even if it's SQL—and you tell people, “Look. Here's a new database. It does everything for you,” it'll take them two years to figure out what the hell it does, and build an app, and then put it in production, and then they'll build a second and a third, and then by the time they hit the tenth app, they find out, “Okay, this database cannot do the following things.” But you're five years in; you're stuck, you can only add another database.

That's really the story of how NoSQL evolved. And it wasn't built as a general-purpose database, right? So, in the meanwhile, databases like Postgres, for example, have been around for so long that they absorb and have such a large ecosystem, and usage, and people who know how to use Postgres and so on.
So, we made the decision that we're going to keep the database API compatible with known things, so people really know how to use them from the get-go and enhance it at a lower level to make it cloud-native. So, what does YugabyteDB do for people?

It is the same as Postgres and Postgres features on the upper half—it reuses the code—but it is built on the lower half to be shared-nothing, scalable, resilient, and geographically distributed. So, using the public cloud managed database context, the upper half is built like Amazon Aurora, the lower half is built like Google Spanner. Now, when you think about workloads that can benefit from this, we're a transactional database that can serve user-facing applications and real-time applications that have lower latency. So, the best way to think about it is, people that are building transactional applications on top of, say, a database like Postgres, but the application itself is cloud-native. You'd have to do a lot of work to make this Postgres piece be highly available, and scalable, and replicate data, and so on in the cloud.

Well, with YugabyteDB, we've done all that work for you and it's as open-source as Postgres, so if you're building a cloud-native app on Postgres that's user-facing or transactional, YugabyteDB takes care of making the database layer behave like Postgres but become cloud-native.

Corey: Do you find that your users are using the same database instance, for lack of a better term? I know that instance is sort of a nebulous term; we're talking about something that's distributed. But are they having database instances that span multiple cloud providers, or is that something that is more talk than you're actually seeing in the wild?

Karthik: So, I'd probably replace the word ‘instance' with ‘cluster', just for clarity, right?

Corey: Excellent. Okay.

Karthik: So, a cluster has a bunch—

Corey: I concede the point, absolutely.

Karthik: Okay. [laugh]. Okay.
So, we'll still keep Route 53 on top, though, so it's good. [laugh].

Corey: At that point, the replication strategy is called a zone transfer, but that's neither here nor there. Please, by all means, continue.

Karthik: [laugh]. Okay. So, a cluster database like YugabyteDB has a number of instances. Now, I think the question is, is it theoretical or real? What we're seeing is, it is real, and it is real perhaps in slightly different ways than people imagine it to be.

So, I'll explain what I mean by that. Now, there's one notion of being multi-cloud where you can imagine there's like, say, the same cluster that spans multiple different clouds, and you have your data being written in one cloud and being read from another. This is not a common pattern, although we have had one or two deployments that are attempting to do this. Now, a second deployment shifted once over from there is where you have your multiple instances in a single public cloud, and a bunch of other instances in a private cloud. So, it stretches the database across public and private—you would call this a hybrid deployment topology—that is more common.

So, one of the unique things about YugabyteDB is we support asynchronous replication of data, just like your RDBMSs do, the traditional RDBMSs. In fact, we're the only one that straddles both synchronous replication of data as well as asynchronous replication of data. We do both. So, once shifted over would be a cluster that's deployed in one of the clouds but an asynchronous replica of the data going to another cloud, and so you can keep your reads and writes—even though they're a little stale, you can serve it from a different cloud. And then once again, you can make it an on-prem private cloud, and another public cloud.

And we see all of those deployments, those are massively common.
And then the last one over would be the same instance of an app, or perhaps even different applications, some of them running on one public cloud and some of them running on a different public cloud, and you want the same database underneath to have characteristics of scale and failover. Like for example, if you built an app on Spanner, what would you do if you went to Amazon and wanted to run it for a different set of users?

Corey: That is part of the reason I tend to avoid the idea of picking a database that does not have at least a theoretical exit path because reimagining your entire application's data model in order to migrate is not going to happen, so—

Karthik: Exactly.

Corey: —come hell or high water, you're stuck with something like that where it lives. So, even though I'm a big proponent as a best practice—and again, there are exceptions where this does not make sense, but as a general piece of guidance—I always suggest, pick a provider—I don't care which one—and go all-in. But that also should be shaded with the nuance of, but also, at least have an eye toward theoretically, if you had to leave, consider that if there's a viable alternative. And in some cases in the early days of Spanner, there really wasn't. So, if you needed that functionality, okay, go ahead and use it, but understand the trade-off you're making.

Now, this really comes down to, from my perspective, understand the trade-offs. But the reason I'm interested in your perspective on this is because you are providing an open-source database to people who are actually doing things in the wild. There's not much agenda there, in the same way, among a user community of people reporting what they're doing. So, you have in many ways, one of the least biased perspectives on the entire enterprise.

Karthik: Oh, yeah, absolutely. And like I said, I started from the least common to the most common; maybe I should have gone the other way.
But we absolutely see people that want to run the same application stack in multiple different clouds for a variety of reasons.

Corey: Oh, if you're a SaaS vendor, for example, it's, “Oh, we're only in this one cloud,” potential customers who are in other clouds say, “Well, if that changes, we'll give you money.” “Oh, money. Did you say ‘other cloud?' I thought you said something completely different. Here you go.” Yeah, you've got to at some point. But the core of what you do, beyond what it takes to get that application present somewhere else, you usually keep in your primary cloud provider.

Karthik: Exactly. Yep, exactly. Crazy things sometimes dictate or have to dictate architectural decisions. For example, you're seeing the rise of compliance. Different countries have different regulatory reasons to say, “Keep my data local,” or, “Keep some subset of data local.”

And you simply may not find the right cloud providers present in those countries; you may be a PaaS or an API provider that's helping other people build applications, and the applications that the API provider's customers are running could be across different clouds. And so they would want the data local, otherwise, the transfer costs would be really high. So, a number of reasons dictate—or like a large company may acquire another company that was operating in yet another cloud; everything else is great, but they're in another cloud; they're not going to say, “No because you're operating on another cloud.” It still does what they want, but they still need to be able to have a common base of expertise for their app builders, and so on. So, a number of things dictate why people started looking at cross-cloud databases with common performance and operational characteristics and security characteristics, but don't compromise on the feature set, right?

That's starting to become super important, from our perspective.
I think what's most important is the ability to run the database with ease while not compromising on your developer agility or the ability to build your application. That's the most important thing.

Corey: When you founded the company back in 2016, you are VC-backed, so I imagine your investor pitch meetings must have been something a little bit surreal. They ask hard questions such as, “Why do you think that in 2016, starting a company to go and sell databases to people is a viable business model?” At which point you obviously corrected them and said, “Oh, you misunderstand. We're building an open-source database. We're not charging for it; we're giving it away.”

And they apparently said, “Oh, that's more like it.” And then invested, as of the time of this recording, over $100 million in your company. Let me be the first to say there are aspects of money that I don't fully understand and this is one of those. But what is the plan here? How do you wind up building a business case around effectively giving something away for free?

And I want to be clear here, Yugabyte is open-source, and I don't have an asterisk next to that. It is not one of those ‘source available' licenses, or ‘anyone can do anything they want with it except Amazon' or ‘you're not allowed to host it and offer it as a paid service to other people.' So, how do you have a business, I guess is really my question here?

Karthik: You're right, Corey. We're 100% open-source under Apache 2.0—I mean the database. So, our theory on day one—I mean, of course, this was a hard question and people did ask us this, and then I'll take you guys back to 2016. It was unclear, even as of 2016, if open-source companies were going to succeed. It was just unclear.

And people were like, “Hey, look at Snowflake; it's a completely managed service. They're not open-source; they're doing a great job. Do you really need open-source to succeed?” There were a lot of such questions.
And every company, every project, every space has to follow its own path, just applying learnings.

Like for example, Red Hat was open-source and that really succeeded, but there's a number of others that may or may not have succeeded. So, our plan back then was to tread the waters carefully in the sense we really had to make sure open-source was the business model we wanted to go for. So, under the advisement from our VCs, we said we'd take it slowly; we want to open-source on day one. We've talked to a number of our users and customers and make sure that is indeed the path we've wanted to go. The conversations pretty clearly told us people wanted an open database that was very easy for them to understand because if they are trusting their crown jewels, their most critical data, their systems of record—this is what the business depends on—into a database, they sure as hell want to have some control over it and some transparency as to what goes on, what's planned, what's on the roadmap. “Look, if you don't have time, I will hire my people to go build for it.” They wanted to be able to invest in the database.

So, open-source was absolutely non-negotiable for us. We tried the traditional technique for a couple of years of keeping a small portion of the features of the database itself closed, so it's what you'd call ‘open core.' But on day one, we were pretty clear that the world was headed towards DBaaS—Database as a Service—and make it really easy to consume.

Corey: At least the bad patterns as well, like, “Oh, if you want security, that's a paid feature.”

Karthik: Exactly.

Corey: No. That is not optional. And the list then of what you can wind up adding as paid versus not gets murky, and you're effectively fighting your community when they try and merge some of those features in and it just turns into a mess.

Karthik: Exactly. So, it did for us for a couple of years, and then we said, “Look, we're not doing this nonsense.
We're just going to make everything open and just make it simple.” Because our promise to the users was, we're building everything that looks like Postgres, so it's as valuable as Postgres, and it'll work in the cloud. And people said, “Look, Postgres is completely open and you guys are keeping a few features not open. What gives?”

And so after that, we had to concede the point and just do that. But one of the other founding pieces of a company, the business side, was that DBaaS and ability to consume the database is actually far more critical than whether the database itself is open-source or not. I would compare this to, for example, MySQL and Postgres being completely open-source, but you know, Amazon's Aurora being actually a big business, and similarly, it happens all over the place. So, it is really the ability to consume and run business-critical workloads that seem to be more important for our customers and enterprises that paid us. So, the day-one thesis was, look, the world is headed towards DBaaS.

We saw that already happen inside Facebook; everybody was automated operations, simplified operations, and so on. But the reality is, we're a startup, we're a new database, no one's going to trust everything to us: the database, the operations, the data, “Hey, why don't we put it on this tiny company. And oh, it's just my most business-critical data, so what could go wrong?” So, we said we're going to build a version of our DBaaS that is in software. So, we call this Yugabyte Platform, and it actually understands public clouds: it can spin up machines, it can completely orchestrate software installs, rolling upgrades, turnkey encryption, alerting, the whole nine yards.

That's a completely different offering from the database. It's not the database, it's just on top of the database and helps you run your own private cloud.
So, effectively if you install it on your Amazon account or your Google account, it will convert it into what looks like a DynamoDB, or a Spanner, or what have you, with YugabyteDB as the database inside. So, that is our commercial product; that's source available and that's what we charge for. The database itself, completely open.

Again, the other piece of the thinking is, if we ever charge too much, our customers have the option to say, “Look, I don't want your DBaaS thing; I'm going to the open-source database and we're fine with that.” So, we really want to charge for value. And obviously, we have a completely managed version of our database as well. So, we reuse this platform for our managed version, so you can kind of think of it as portability, not just of the database but also of the control plane, the DBaaS plane.

They can run it themselves, we can run it for them, they could take it to a different cloud, so on and so forth.

Corey: I like that monetization model a lot better than a couple of others. I mean, let's be clear here, you've spent a lot of time developing some of these concepts for the industry when you were at Facebook. And because at Facebook, the other monetization models are kind of terrifying, like, “Okay. We're going to just monetize the data you store in the open-source database,” is terrifying. Only slightly less would be the Google approach of, “Ah, every time you wind up running a SQL query, we're going to insert ads.”

So, I like the model of being able to offer features that only folks who already have expensive problems with money to burn on those problems to solve them will gravitate towards. You're not disadvantaging the community or the small startup who wants it but can't afford it. I like that model.

Karthik: Actually, the funny thing is, we are seeing a lot of startups also consume our product a lot. And the reason is because we only charge for the value we bring.
Typically the problems that a startup faces are actually much simpler than the complex requirements of an enterprise at scale. They are different. So, the value is also proportional to what they want and how much they want to consume, and that takes care of itself.

So, for us, we see that startups, equally so as enterprises, have only limited amount of bandwidth. They don't really want to spend time on operationalizing the database, especially if they have an out to say, “Look, tomorrow, this gets expensive; I can actually put in the time and money to move out and go run this myself. Why don't I just get started because the budget seems fine, and I couldn't have done it better myself anyway because I'd have to put people on it and that's more expensive at this point.” So, it doesn't change the fundamentals of the model; I just want to point out, both sides are actually gravitating to this model.

Corey: This episode is sponsored in part by our friends at Jellyfish. So, you're sitting in front of your office chair, bleary eyed, parked in front of a PowerPoint and—oh my sweet feathery Jesus it's the night before the board meeting, because of course it is! As you slot that crappy screenshot of traffic light colored Excel tables into your deck, or sift through endless spreadsheets looking for just the right data set, have you ever wondered, why is it that sales and marketing get all this shiny, awesome analytics and insight tools? Whereas, engineering basically gets left with the dregs. Well, the founders of Jellyfish certainly did. That's why they created the Jellyfish Engineering Management Platform, but don't you dare call it JEMP! Designed to make it simple to analyze your engineering organization, Jellyfish ingests signals from your tech stack, including Jira, Git, and collaborative tools. Yes, depressing to think of those things as your tech stack but this is 2021.
They use that to create a model that accurately reflects just how the breakdown of engineering work aligns with your wider business objectives. In other words, it translates from code into spreadsheet. When you have to explain what you're doing from an engineering perspective to people whose primary IDE is Microsoft PowerPoint, consider Jellyfish. That's Jellyfish.co and tell them Corey sent you! Watch for the wince, that's my favorite part.

Corey: A number of different surveys have come out that say overwhelmingly companies prefer open-source databases, and this is waved around as a banner of victory by a lot of—well, let's be honest—open-source database companies. I posit that is in fact crap and also bad data because what the open-source purists—of which I admit, I used to be one, and now I solve business problems instead—believe is that people are talking about freedom, and choice, and the rest. In practice, in my experience, what people are really distilling that down to is they don't want a commercial database. And it's not even about they're not willing to pay money for it, but they don't want to have a per-core licensing challenge, or even having to track licensing of where it is installed and how, and wind up having to cut checks for folks. For example, I'm going to dunk on someone because why not?

Azure for a while has had this campaign that it is five times cheaper to run some Microsoft SQL workloads in Azure than it is on AWS as if this was some magic engineering feat of strength or something. It's absolutely not, it's that it is really expensive licensing-wise to run it on things that aren't Azure. And that doesn't make customers feel good. That's the thing they want to get away from, and what open-source license it is, and in many cases, until the source-available stuff starts trending towards, “Oh, you're going to pay us or you're not going to run it at all,” that scares the living hell out of people, then they don't actually care about it being open.
So, at the risk of alienating, I'm sure, some of the more vocal parts of your constituency, where do you fall on that?

Karthik: We are completely open, but for a few reasons, right? Like, multiple different reasons. The debate of whether it purely is open or is completely permissible, to me, I tend to think a little more where people care about the openness more so than just the ability to consume at will without worrying about the license, but for a few different reasons, and it depends on which segment of the market you look at. If you're talking about small and medium businesses and startups, you're absolutely right; it doesn't matter. But if you're looking at larger companies, they actually care that, like for example, if they want a feature, they are able to control their destiny because you don't want to be half-wedded to a database that cannot solve everything, especially when the time pressure comes or you need to do something.

So, you want to be able to control or to influence the roadmap of the project. You want to know how the product is built—the good and the bad—you want a lot of people testing the product and their feedback to come out in the open, so you at least know what's wrong. Many times people often feel like, “Hey, my product doesn't work in these areas,” is actually a bad thing. It's actually a good thing because at least those people won't try it and [laugh] they'll be safe. Customer satisfaction is more important than just the apparent whatever it is that you want to project about the product.

At least that's what I've learned in all these years working with databases. But there's a number of reasons why open-source is actually good.
There's also a very subtle reason that people may not understand which is that legal teams—engineering teams that want to build products don't want to get caught up in a legal review that takes many months to really make sure, look, this may be a unique version of a license, but it's not a license the legal team has seen before, and there's going to be a back and forth for many months, and it's just going to derail their product and their timelines, not because the database didn't do its job or because the team wasn't ready, but because the company doesn't know what the risk it'll face in the future is. There's a number of these aspects where open-source starts to matter for real. I'm not a purist, I would say.

I'm a pragmatist, and I have always been, but I would say that a number of reasons why–you know, I might be sounding like a purist, but a number of reasons why a true open-source is actually useful, right? And at the end of the day, if we have already established, at least at Yugabyte, we're pretty clear about that, the value is in the consumption and is not in the tech if we're pretty clear about that. Because if you want to run a tier-two workload or a hobbyist app at home, would you want to pay for a database? Probably not. I just want to do something for a while and then shut it down and go do my thing. I don't care if the database is commercial or open-source. In that case, being open-source doesn't really take away. But if you're a large company betting, it does take away. So.

Corey: Oh, it goes beyond that because it's not even, in the large company story, whether it costs money because regardless, I assure you, open-source is not free; the most expensive thing that we see in all of our customer accounts—again, our consultancy fixes AWS bills, an expensive problem that hits everyone—the environment in AWS is always less expensive than the people who are working on the environment.
Payroll is an expense that dwarfs the AWS bill for anyone that is not a tiny startup that is still not paying a market-rate salary to its founders. It doesn't work that way. And the idea, for those folks is, not about the money, it's about the predictability. And if there's a 5x price hike from their database manager that suddenly completely disrupts their unit economic model, they're in trouble. That's the value of open-source in that it can go anywhere. It's a form of not being locked into any vendor where it's hosted, as well as, now, no one company that has put it out there into the world.

Karthik: Yeah, and the source-available license, we considered that also. The reason to vote against that was you can get into scenarios where the company gets competitive with its open-source side where the open-source wants a couple other features to really make it work for their own use case, like you know, case in point is the startup, but the company wants to hold those features for the commercial side, and now the startup has that 5x price jump anyway. So, at this point, it comes to a head-on where the company—the startup—is being charged not for value, but because of the monetization model or the business model. So, we said, “You know what? The best way to do this is to truly compete against open-source. If someone wants to operationalize the database, great. But we've already done it for you.” If you think that you can operationalize it at a lower cost than what we've done, great. That's fine.

Corey: I have to ask, there has to have been a question somewhere along the way, during the investment process of, what if AWS moves into your market? And I can already say part of the problem with that line of reasoning is, okay, let's assume that AWS turns Yugabyte into a managed database offering.
First, they're not going to be able to articulate for crap why you should use that over anything else because they tend to mumble when it comes time to explain what it is that they do. But it has to be perceived as a competitive threat. How do you think about that?

Karthik: Yeah, this absolutely came up quite a bit. And like I said, in 2016, this wasn't news back then; this is something that was happening in the world already. So, I'll give you a couple of different points of view on this. The reason why AWS got so successful in building a cloud is not because they wanted to get into the database space; they simply wanted their cloud to be super successful and required value-added services like these databases. Now, every time a new technology shift happens, it gives some set of people an unfair advantage.

In this case, database vendors probably didn't recognize how important the cloud was and how important it was to build a first-class experience on the cloud on day one, as the cloud came up because it wasn't proven, and they had twenty other things to do, and it's rightfully so. Now, AWS comes up, and they're trying to prove a point that the cloud is really useful and absolutely valuable for their customers, and so they start putting value-added services, and now suddenly you're in this open-source battle. At least that's how I would view that it kind of developed. With Yugabyte, obviously, the cloud's already here; we know on day one, so we're kind of putting out our managed service so we'll be as good as AWS or better. The database has its value, but the managed service has its own value, and so we'd want to make sure we provide at least as much value as AWS, but on any cloud, anywhere.

So, that's the other part. And we also talked about the mobility of the DBaaS itself, the moving it to your private account and running the same thing, as well as for public.
So, these are some of the things that we have built that we believe make us super valuable.
Corey: It's a better approach than a lot of your predecessor companies who decided, “Oh, well, we built the thing; obviously, we're going to be the best at running it. The end.” Because they dramatically sold AWS's operational excellence short. And it turns out, they're very good at running things at scale. So, that's a challenging thing to beat them on.
And even if you're able to, it's hard to differentiate among the differences because at that caliber of operational rigor, it's one of those, you can only tell in the very niche cases; it's a hard thing to differentiate on. I like your approach a lot better. Before we go, I have one last question for you, and normally, it's one of those positive uplifting ones of what workloads are best for Yugabyte, but I think that's boring; let's be more cynical and negative. What workloads would run like absolute crap on YugabyteDB?
Karthik: [laugh]. Okay, we do have a thing for this because we don't want to take on workloads and, you know, everybody has a bad experience around. So, we're a transactional database built for user-facing applications, real-time, and so on, right? We're not good at warehousing and analytic workloads. So, for example, if you were using a Snowflake or a Redshift, those workloads are not going to work very well on top of Yugabyte.
Now, we do work with other external systems like Spark and Presto, which are real-time analytic systems, but they translate the queries that the end-user has into a more operational type of query pattern. However, if you're using it straight-up for analytics, we're not a good bet. Similarly, there's cases where people want a very high number of IOPS by reusing a cache or even a persistent cache. Amazon just came out with a [number of 00:31:04] persistent cache that does very high throughput and low-latency serving.
We're not good at that.
We can do reasonably low-latency serving and reasonably high IOPS at scale, but we're not the use case where you want to hit that same lookup over and over and over, millions of times in a second; that's not the use case for us. The third thing I'd say is, we're a system of record, so people care about the data they put, and they absolutely don't want to lose it and they want to show that it's transactional. So, if there's a workload where there's a lot of data and you're okay if you lose some, and it's just some sensor data, and your reasoning is like, “Okay, if I lose a few data points, it's fine.” I mean, you could still use us, but at that point you'd really have to be a fanboy or something for Yugabyte. I mean, there's other databases that probably do it better.
Corey: Yeah, that's the problem is whenever someone says, “Oh, yeah. Database”—or any tool that they've built—“Like, this is great.” “What workloads is it not a fit for?” And their answer is, “Oh, nothing. It's perfect for everything.”
Yeah, I want to believe you, but my inner bullshit sense is tingling on that one because nothing's fit for all purposes; it doesn't work that way. Honestly, this is going to be, I guess, heresy in the engineering world, but even computers aren't always the right answer for things. Who knew?
Karthik: As a founder, I struggled with this answer a lot, initially. I think the problem is, when you're thinking about a problem space, that's all you're thinking about, you don't know what other problem spaces exist, and when you are asked the question, “What workloads is it a fit for?” At least I used to say, initially, “Everything,” because I'm only thinking about that problem space as the world, and it's fit for everything in that problem space, except I don't know how to articulate the problem space—
Corey: Right—
Karthik: —[crosstalk 00:32:33].
[laugh].
Corey: —and at some point, too, you get so locked into one particular way of thinking about the world that when people ask about other cases, it's like, “Oh, that wouldn't count.” And then your follow-up question is, “Wait, what's a bank?” And it becomes a different story. It's, how do you wind up reasoning about these things? I want to thank you for taking all the time you have today to speak with me. If people want to learn more about Yugabyte—either the company or the DB—how can they do that?
Karthik: Yeah, thank you as well for having me. I think to learn about Yugabyte, just come join our community Slack channel. There's a lot of people; there's, like, over 3000 people. They're all asking interesting questions. There's a lot of interesting chatter on there, so that's one way.
We have an industry-wide event, it's called the Distributed SQL Summit. It's coming up September 22nd, 23rd, I think a couple of days; it's a two-day event. That would be a great place to actually learn from practitioners, and people building applications, and people in the general space and its adjacencies. And it's not necessarily just about Yugabyte; it's generally about distributed SQL databases, hence it's called the Distributed SQL Summit. And then you can ask us on Twitter or any of the usual social channels as well. So, we love interaction, so we are a pretty open and transparent company. We love to talk to you guys.
Corey: Well, thank you so much for taking the time to speak with me. We'll, of course, throw links to that into the [show notes 00:33:43]. Thank you again.
Karthik: Awesome. Thanks a lot for having me. It was really fun. Thank you.
Corey: Likewise. Karthik Ranganathan, CTO, and co-founder of YugabyteDB. I'm Cloud Economist Corey Quinn and this is Screaming in the Cloud.
If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice along with an angry comment, halfway through realizing that I'm not charging you anything for this podcast and converting the angry comment into a term sheet for a $100 million investment.
Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.
Announcer: This has been a HumblePod production. Stay humble.

Cloud Database Report
Yugabyte CTO Karthik Ranganathan: Where Data Lives Forever

Cloud Database Report

Play Episode Listen Later Sep 13, 2021 25:50


Ranganathan discusses the design considerations that influenced development of YugabyteDB, including the learnings gleaned from the engineering team's previous work at Facebook. YugabyteDB can be deployed on premises or as a cloud service. With built-in replication, YugabyteDB can be used to distribute data across geographic regions in support of data localization requirements and for high availability.
Key topics in the interview include:
- The Yugabyte engineering team worked on the HBase and Cassandra databases at Facebook, experience that is now carrying over to the work they are doing at Yugabyte.
- How YugabyteDB is different from other distributed SQL databases, including its support for both SQL and NoSQL interfaces.
- Common use cases for YugabyteDB include real-time transactions, microservices, Edge and IoT applications, and geographically-distributed workloads.
- Yugabyte is available via Apache 2.0 license and as self-managed and fully-managed cloud services.
Quotes from the podcast:
- “One of the important characteristics of transactional data is the fact that it needs to live forever.”
- “We reuse the upper half of Postgres, so it literally is Postgres-compatible and has all of the features.”
- “We said we're going to meet developers where they develop. We will support both APIs [SQL and NoSQL]. We're not going to invent a new API, that's what people hate.”
- “It's not the database that people pay money for; it's the operations of the database and making sure it runs in a turnkey manner that people really find valuable in an enterprise setting.”

Enterprise Masters
Kannan Muthukkaruppan of YugabyteDB

Enterprise Masters

Play Episode Listen Later Sep 9, 2021 24:43


Topics include Kannan's background and motivation to start Yugabyte (2:40), his experiences at Facebook with database technology (4:00), how he and the leadership team have retained some of the culture from their previous companies (7:00), how entrepreneurs should think about and approach challenges when building their businesses (12:05), what it's like to develop open source software as a startup and build a profitable business (15:05), and the thought process around bringing in a professional CEO to run the organization (18:01).

Data on Kubernetes Community
DoK Talks #70 - YugabyteDB - Distributed SQL Database on Kubernetes // Amey Banarse

Data on Kubernetes Community

Play Episode Listen Later Aug 4, 2021 72:25


Abstract of the talk: Kubernetes has hit a home run for stateless workloads, but can it do the same for stateful services such as distributed databases? Before we can answer that question, we need to understand the challenges of running stateful workloads on, well, anything. In this talk, we will first look at which stateful workloads, specifically databases, are ideal for running inside Kubernetes. Secondly, we will explore the various concerns around running databases in Kubernetes for production environments, such as:
- The production-readiness of Kubernetes for stateful workloads in general
- The pros and cons of the various deployment architectures
- The failure characteristics of a distributed database inside containers
In this session, we will demonstrate what Kubernetes brings to the table for stateful workloads and what database servers must provide to fit the Kubernetes model. This talk will also highlight some of the modern databases that take full advantage of Kubernetes and offer a peek into what's possible if stateful services can meet Kubernetes halfway. We will go into the details of deployment choices, how the different cloud-vendor managed container offerings differ in what they offer, as well as compare performance and failure characteristics of a Kubernetes-based deployment with an equivalent VM-based deployment.
Bio: Amey is a VP of Data Engineering at Yugabyte with a deep passion for Data Analytics and Cloud-Native technologies. In his current role, he collaborates with Fortune 500 enterprises to architect their business applications with scalable microservices and geo-distributed, fault-tolerant data backend using YugabyteDB. Prior to joining Yugabyte, he spent 5 years at Pivotal as Platform Data Architect and has helped enterprise customers across multiple industry verticals to extend their analytical capabilities using Pivotal & OSS Big Data platforms. He is originally from Mumbai, India, and has a Master's degree in Computer Science from the University of Pennsylvania (UPenn), Philadelphia.
Twitter: @ameybanarse
LinkedIn: linkedin.com/in/ameybanarse/

Heavybit Podcast Network: Master Feed
Ep. #35, Database Software with Karthik Ranganathan of Yugabyte

Heavybit Podcast Network: Master Feed

Play Episode Listen Later Apr 13, 2021 66:25


In episode 35 of EnterpriseReady, Grant speaks with Karthik Ranganathan of Yugabyte. They discuss Karthik's enterprise journey, the fruits of maintaining a customer-driven ethos in business, and building modern databases.

EnterpriseReady
Ep. #35, Database Software with Karthik Ranganathan of Yugabyte

EnterpriseReady

Play Episode Listen Later Apr 13, 2021 66:25


In episode 35 of EnterpriseReady, Grant speaks with Karthik Ranganathan of Yugabyte. They discuss Karthik’s enterprise journey, the fruits of maintaining a customer-driven ethos in business, and building modern databases.

Percona's HOSS Talks FOSS:  The Open Source Database Podcast
The Hoss Talks Foss _ Ep 17 with Karthik Ranganathan CTO at Yugabyte

Percona's HOSS Talks FOSS: The Open Source Database Podcast

Play Episode Listen Later Apr 9, 2021 43:43


Karthik Ranganathan, CTO at Yugabyte, knows databases inside and out, having been on the team that first built Apache Cassandra, helped optimize and scale HBase, and most recently built Yugabyte. What insights does he have from participating in these efforts? He sat down with Percona’s HOSS Matt Yonkovit to talk through what he learned, what he regretted, and how Yugabyte takes those lessons and implements them.

The Business of Open Source
Building a reliable, transactional cloud native database with Karthik Ranganathan

The Business of Open Source

Play Episode Listen Later Mar 10, 2021 37:24


This week, I talked with Karthik Ranganathan about the challenges of going from employee of a large company to startup founder, and why he founded Yugabyte: he wanted a database that was both transactional and highly available.
Highlights:
- Why the ability to scale is important for any cloud native application, including for a cloud native database.
- Why Yugabyte is still open source and why being open source is important to the company.
- Why enterprises wanted an open source database to house their mission-critical data.
- Why the company went from an open core model to an open source / managed service model.
- Why end customers care about open source.
- Why early-stage, small companies have trouble establishing trust and how being open source helps build trust.
- Why building around open source helps nudge customers to ‘buy' instead of build it themselves.
- Why finding the right position and the right message is a major challenge at the beginning of the company.
Links:
- Karthik on LinkedIn
- Karthik on Twitter
- Yugabyte
- Yugabyte Slack

TechCrunch Startups – Spoken Edition
Yugabyte announces $48M Series C as cloud native database makes enterprise push

TechCrunch Startups – Spoken Edition

Play Episode Listen Later Mar 4, 2021 5:07


As demand for cloud native applications grows, Yugabyte, maker of the cloud native, open source YugabyteDB database, is seeing a corresponding rise in demand for its products, especially among large enterprise customers.

More than a refresh: A podcast about data and the people who wrangle it
Episode One: Karthik Ranganathan, Founder & CTO at YugaByte

More than a refresh: A podcast about data and the people who wrangle it

Play Episode Listen Later Feb 2, 2021 55:12


Welcome to episode one of More than a Refresh, with Joshua "JD" Drake. Listen in as he has an honest discussion with Karthik Ranganathan about Yugabyte, PostgreSQL, Open Source, and the best surf movie ever made.

Data on Kubernetes Community
#3 DoK community: Design considerations for operationalizing Distributed SQL on Kubernetes // Nikhil Chandrappa

Data on Kubernetes Community

Play Episode Listen Later Aug 6, 2020 58:17


Distributed databases on Kubernetes. And we just keep rolling along! Round 3 of the Data on Kubernetes community meetup! This time we will be talking with Nikhil Chandrappa, Lead Software Engineer at YugabyteDB. We will take a practical look at running distributed SQL on Kubernetes using YugabyteDB.
Key takeaways:
- Introduction to YugabyteDB Distributed SQL databases and its design principles
- Design considerations for operationalizing Distributed SQL on Kubernetes
- Deployment strategies for clustered databases
- Storage orchestration on Kubernetes
- Yugabyte's approach for DBaaS on Kubernetes
- DB creation, scale up / scale down
- Implementing Day 2 operations for distributed SQL databases
- Upgrades, backups, and monitoring
- Distributed SQL Demo: A real-world e-commerce application
Abstract: This talk is targeted towards cloud-native developers and architects looking to deploy the operational database on Kubernetes. We are going to walk you through the design decisions YugabyteDB's team took when architecting the database as a service on Kubernetes. We are going to cover concepts related to Kubernetes volume provisioning, pod placement strategies for data resilience/high availability, and how cluster events are used for reconciling the k8s workloads during day 2 operations like upgrades and scale-up/down.
Bio: Nikhil is an ecosystem engineer at Yugabyte. He is leading the efforts on YugabyteDB integrations with open source developer tools like GraphQL, Spring Data, R2DBC, and Kubernetes. He also works with the developer community on the adoption of Distributed SQL databases in cloud native apps. Before joining Yugabyte, he worked as a senior data engineer at Pivotal, which is now part of VMware Tanzu, championing the cloud native data APIs and in-memory data grids for Fortune 500 customers. He has presented at major developer conferences: SpringOne Platform, PostgreSQL conf, JPMC tech fest. He is originally from Mysore, India, and graduated with a master's degree in Computer Engineering from Syracuse University.
I am currently looking for speakers who can talk about things such as operators, databases, multicloud/hybrid, or anything else that could be interesting for the SRE engineering crowd.
Join our slack: https://join.slack.com/t/dokcommunity/shared_invite/zt-g3ui5r0g-jDKz5dhh2W1ayElqwKYYAg
Follow us on Twitter: @dokcommunity
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with Nikhil on LinkedIn: https://www.linkedin.com/in/nikhilmc/
This meetup is sponsored by MayaData, which helped start the DOK.community and remains an active supporter. MayaData sponsors two Cloud Native Computing Foundation (CNCF) projects, OpenEBS - the leading open-source container attached storage solution - and Litmus - the leading Kubernetes native chaos engineering project, which was recently donated to the CNCF as a Sandbox project. As of June 2020, MayaData is the sixth-largest contributor to CNCF projects. Well-known users of MayaData products include the CNCF itself, Bloomberg, Comcast, Arista, Orange, Intuit, and others. Check out more info at https://mayadata.io/
||SHOW NOTES||
Slides: https://docs.google.com/presentation/d/1MOYgKm3EuhQHY2ryxSC3qFId2snCI0nPzdKa28wL4EI/edit?usp=sharing
YugaByte CTO's talk about logical clocks: https://blog.yugabyte.com/distributed-postgresql-on-a-google-spanner-architecture-storage-layer/
Link to Yugabyte hiring page: https://blog.yugabyte.com/insert-into-yugabyte-were-hiring-july-2020-edition/
Getting started with YugabyteDB: https://download.yugabyte.com/
Learn more about the internals of Distributed SQL: https://blog.yugabyte.com/distributed-postgresql-on-a-google-spanner-architecture-query-layer/
Learn more about Microservices + YugabyteDB: https://www.yugabyte.com/spring/

The Computing Podcast
Part 2: Yugabyte - Deep dive into a distributed SQL database

The Computing Podcast

Play Episode Listen Later Jun 11, 2020 28:42


Welcome to our 5rd episode. This is the second part of a two-part series where we go deep into the internals of Yugabyte with Karthik and Kannan. Yugabyte is a highly scalable and developer-friendly open source distributed SQL database. Yugabyte is built by an ex-Facebook team that wanted to bring what they learnt running one of the largest databases on the planet out into the open source world. Learn more about how the shared-nothing architecture used by Yugabyte works and how the team built Postgres and other API layers on top of a highly scalable document DB powered by their own fork of RocksDB.
Our guests for this episode are:
- Kannan Muthukkaruppan, Founder & President, Product Dev. @ Yugabyte
- Karthik Ranganathan, Founder & CTO @ YugaByte
Links:
- Kudu: Storage for Fast Analytics on Fast Data - https://kudu.apache.org/kudu.pdf
- Under the Hood: Building and open-sourcing RocksDB - https://www.facebook.com/notes/facebook-engineering/under-the-hood-building-and-open-sourcing-rocksdb/10151822347683920/
- The Log-Structured Merge-Tree (LSM-Tree) - https://www.cs.umb.edu/~poneil/lsmtree.pdf
- Amazon Aurora: Design Considerations for High Throughput Cloud-Native Relational Databases - https://dl.acm.org/doi/epdf/10.1145/3035918.3056101

The Computing Podcast
Part 1: Yugabyte - Deep dive into a distributed SQL database

The Computing Podcast

Play Episode Listen Later Jun 11, 2020 24:16


Welcome to our 5rd episode. This is the first part of a two-part series where we go deep into the internals of Yugabyte with Karthik and Kannan. Yugabyte is a highly scalable and developer-friendly open source distributed SQL database. Yugabyte is built by an ex-Facebook team that wanted to bring what they learnt running one of the largest databases on the planet out into the open source world. One thing I find really fascinating with Yugabyte is that they are fully compatible with Postgres, Redis and Apache Cassandra, which makes it easy to replace a lot of infrastructure with just Yugabyte. Hope you enjoy the listen and remember to subscribe for many more of these deep technical discussions.
Our guests for this episode are:
- Kannan Muthukkaruppan, Founder & President, Product Dev. @ Yugabyte
- Karthik Ranganathan, Founder & CTO @ YugaByte
Links:
- Kudu: Storage for Fast Analytics on Fast Data - https://kudu.apache.org/kudu.pdf
- Under the Hood: Building and open-sourcing RocksDB - https://www.facebook.com/notes/facebook-engineering/under-the-hood-building-and-open-sourcing-rocksdb/10151822347683920/
- The Log-Structured Merge-Tree (LSM-Tree) - https://www.cs.umb.edu/~poneil/lsmtree.pdf
- Amazon Aurora: Design Considerations for High Throughput Cloud-Native Relational Databases - https://dl.acm.org/doi/epdf/10.1145/3035918.3056101

TechCrunch Startups – Spoken Edition
Yugabyte lands $30M Series B as open source database continues to flourish

TechCrunch Startups – Spoken Edition

Play Episode Listen Later Jun 10, 2020 4:35


It's been a big period of positive change for Yugabyte, makers of the open source, cloud native YugabyteDB database. Just last month they brought on former Pivotal CEO Bill Cook as CEO, and today the company announced it has closed a $30 million Series B. 8VC and strategic investor Wipro led the round with participation […]

Orchestrate all the Things podcast: Connecting the Dots with George Anadiotis
Another globally distributed cloud native SQL database on the rise: Yugabyte Raises $30 million in Series B Funding. Backstage chat with CEO and Founders

Orchestrate all the Things podcast: Connecting the Dots with George Anadiotis

Play Episode Listen Later Jun 9, 2020 31:59


Your good old on-premise SQL database is in terminal decline. A pure-play open-source cloud-native PostgreSQL, with support for Apache Cassandra and GraphQL interfaces, is what you need. Or at least, this is what the Yugabyte crew thinks. The company, founded by Facebook data infrastructure veterans, announced that it has raised $30 million in an oversubscribed Series B round to double down on community and team growth. This is a crowded market, but big enough to be a non-zero-sum game. We connected with Yugabyte founders Kannan Muthukkaruppan and Karthik Ranganathan, and newly recruited CEO Bill Cook, previously of Sun Microsystems and Pivotal, for a deep dive in the company, the funding, and the market. Article published on ZDNet in June 2020