A slick new CoreOS-based distro that might be your next home-lab base, easy self-hosted notifications, and Plex stumbles into controversy. Plus: ECC RAM guilt. Special Guests: Brent Gervais and Dusty Mabe.
In this episode, Kyle sits down with Jake Moshenko, co-founder and CEO of AuthZed, to discuss his journey from working at major tech companies to becoming a two-time founder. After his first company Quay (a Docker registry service) was acquired, he founded AuthZed to solve application permission challenges. Jake shares the contrasts between his two founding experiences, transitioning from a bootstrapped two-person company to a venture-backed, enterprise-focused business. He emphasizes the importance of choosing the right problem space and being prepared for the long journey of entrepreneurship. We also discuss the significance of open source in modern software development and his perspectives on AI technology.

Jake Moshenko
Jake Moshenko has been an innovator in the cloud-native ecosystem for over 15 years. After engineering roles at Amazon and Google, Jake founded Quay, the first private Docker registry, which was acquired by CoreOS. Jake then became an engineering leader at CoreOS, which was acquired by Red Hat (and then IBM). He is now the co-founder and CEO of AuthZed, the company commercializing SpiceDB, the industry-leading cloud-native permissions database.

Links from the Show:
LinkedIn: Jake Moshenko
Website: AuthZed
Books: The Martian
Games: Azul

More by Kyle:
Follow Prodity on Twitter and TikTok
Follow Kyle on Twitter and TikTok
Sign up for the Prodity Newsletter for more updates.
Kyle's writing on Medium
Prodity on Medium

If you like our podcast, consider Buying Us a Coffee or supporting us on Patreon.
#SecurityConfidential #DarkRhiinoSecurity

Jake has been an innovator in the cloud-native ecosystem for over 15 years. After engineering roles at Amazon and Google, Jake founded Quay, the first private Docker registry, which was acquired by CoreOS. Jake then became an engineering leader at CoreOS, which was acquired by Red Hat (and then IBM). He is now the co-founder and CEO of AuthZed, the company commercializing SpiceDB, the industry-leading cloud-native permissions database.

00:00 Intro
00:58 Our guest
02:15 The entrepreneur chip on your shoulder
06:58 The fear of failure
09:46 How do you pay salaries on open source when you use it daily
12:40 The basics of a distributed architecture
20:00 Distributed databases
26:43 What if the platform isn't distributed?
31:38 AuthZed
43:21 What will AI do in your world?
47:01 News from Jake

----------------------------------------------------------------------

Kiteworks enables organizations to effectively manage risk in every send, share, receive, and save of sensitive content. To this end, they created a platform that delivers content governance, compliance, and protection to customers. The platform unifies, tracks, controls, and secures sensitive content moving within, into, and out of their organization, significantly improving risk management while ensuring regulatory compliance on all sensitive content communications. To learn more about Kiteworks, visit https://www.kiteworks.com/

----------------------------------------------------------------------

To learn more about Jake visit https://kitcaster.com/jake-moshenko/
To learn more about Dark Rhiino Security visit https://www.darkrhiinosecurity.com

----------------------------------------------------------------------

SOCIAL MEDIA: Stay connected with us on our social media pages where we'll give you snippets, alerts for new podcasts, and even behind the scenes of our studio!

Instagram: @securityconfidential and @Darkrhiinosecurity
Facebook: @Dark-Rhiino-Security-Inc
Twitter: @darkrhiinosec
LinkedIn: @dark-rhiino-security
Youtube: @DarkRhiinoSecurity
In this episode, we're thrilled to have Jacob Moshenko, CEO of AuthZed, join us to discuss the revolutionary approach AuthZed is taking to managing permissions at scale. Jacob shares his journey from being a software engineer at Google to co-founding AuthZed, and why he's passionate about solving complex permissions challenges for developers worldwide. With a rich background in leading technological innovations and his tenure at companies like Google and CoreOS, Jacob brings a wealth of knowledge on the evolution of permissions management.

AuthZed, inspired by Google's Zanzibar, offers an open-source, fine-grained permissions database that aims to redefine how applications handle access control. Throughout the conversation, Jacob dives into the challenges developers face with traditional permissions systems and how AuthZed's SpiceDB addresses these issues by providing a scalable, secure, and easy-to-integrate solution. He also touches on the impact of AuthZed on the future of software development and how it enables developers to build more secure and efficient applications.

Join us as we explore the cutting-edge world of permissions management with Jacob Moshenko and learn how AuthZed is empowering developers to take control of their applications' access controls like never before.

This show is supported by www.matchrelevant.com, a company that helps venture-backed startups find the best people available in the market: candidates who have the skills, experience, and desire to grow. With over a decade of experience in recruitment across multiple domains, they give candidates options at every stage of their career journey.
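An editorial aside for listeners new to Zanzibar-style authorization: it may help to see what "fine-grained permissions as a database" looks like. Below is a minimal sketch in SpiceDB's schema language, following the examples in its public documentation; the document/user model and the relation names are hypothetical, not from the episode:

```
definition user {}

definition document {
    relation reader: user
    relation writer: user

    // Permissions are computed from relationships: anyone who can
    // write can implicitly read, so the application asks one question.
    permission read = reader + writer
    permission write = writer
}
```

The application then stores relationships ("user:emilia is a writer of document:readme") and asks SpiceDB yes/no questions at request time instead of re-implementing access logic in every service.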
Stephen Augustus, Head of Open Source at Cisco, and Liz Rice, Chief Open Source Officer at Isovalent, discuss Cisco's acquisition of Isovalent, which has closed since recording, bringing together two teams with long-standing expertise in open source cloud native technologies, observability, and security. The two share their excitement about working together, emphasizing the alignment of Isovalent with Cisco's security division and the potential enhancements this acquisition brings to open source projects like Cilium and eBPF. They explore the implications for the open source community, and the continuous investment and development in these projects under Cisco's umbrella. We discuss the ways this merger could innovate security practices, enhance infrastructure observability, and leverage AI for more intelligent networking solutions.

00:00 Welcome and Introduction
00:22 Cisco's Acquisition of Isovalent
00:53 The Excitement and Potential of the Acquisition
02:14 Strategic Alignment and Future Vision
04:03 Open Source Commitment and Community Impact
06:53 The Road Ahead: Integration and Innovation
19:49 Exploring AI and Future Technologies at Cisco
26:03 Reflections and Closing Thoughts

Resources:
Cilium, eBPF and Beyond | Open at Intel (podbean.com)
The Art of Open Source: A Conversation with Stephen Augustus | Open at Intel (podbean.com)

Guests:
Liz Rice is Chief Open Source Officer with eBPF specialists Isovalent, creators of the Cilium cloud native networking, security and observability project. She was Chair of the CNCF's Technical Oversight Committee in 2019-2022, and Co-Chair of KubeCon + CloudNativeCon in 2018. She is also the author of Container Security, published by O'Reilly. She has a wealth of software development, team, and product management experience from working on network protocols and distributed systems, and in digital technology sectors such as VOD, music, and VoIP. When not writing code, or talking about it, Liz loves riding bikes in places with better weather than her native London, competing in virtual races on Zwift, and making music under the pseudonym Insider Nine.

Stephen Augustus is a Black engineering director and leader in open source communities. He is the Head of Open Source at Cisco, working within the Strategy, Incubation, & Applications (SIA) organization. For Kubernetes, he has co-founded transformational elements of the project, including the KEP (Kubernetes Enhancement Proposal) process, the Release Engineering subproject, and Working Group Naming. Stephen has also previously served as a chair for both SIG PM and SIG Azure. He continues his work in Kubernetes as a Steering Committee member and a Chair for SIG Release. Across the wider LF (Linux Foundation) ecosystem, Stephen has the pleasure of serving as a member of the OpenSSF Governing Board and the OpenAPI Initiative Business Governing Board. Previously, he was a TODO Group Steering Committee member, a CNCF (Cloud Native Computing Foundation) TAG Contributor Strategy Chair, and one of the Program Chairs for KubeCon / CloudNativeCon, the cloud native community's flagship conference. He is a maintainer for the Scorecard and Dex projects, and a prolific contributor to CNCF projects, amongst the top 40 (as of writing) code/content committers, all-time. In 2020, Stephen co-founded the Inclusive Naming Initiative, a cross-industry group dedicated to helping projects and companies make consistent, responsible choices to remove harmful language across codebases, standards, and documentation.
He has previously held positions at VMware (via Heptio), Red Hat, and CoreOS. Stephen is based in New York City.
In this episode of The Digital Executive, Brian Thomas sits down with Jake Moshenko, the visionary co-founder and CEO of AuthZed and the mind behind SpiceDB. With a robust background in cloud native ecosystems, Jake's career trajectory took him from engineering roles at tech giants like Amazon and Google to founding Quay, the pioneering private Docker registry later acquired by CoreOS. His journey didn't stop there; following CoreOS's acquisition by Red Hat and then IBM, Jake embarked on a new venture with AuthZed to address a critical challenge in the tech world: permissions management.

Jake shares his motivation behind creating AuthZed, aiming to simplify the complex permissions landscape for developers. He highlights the inception of SpiceDB, an industry-leading cloud native permissions database developed to offer scalable and flexible permissions models. Drawing from personal experience, Jake emphasizes the importance of a robust permissions system, sharing how Quay's challenges with customer requirements led to the realization that existing models were insufficient for evolving developer needs.

Throughout the conversation, Jake reflects on the challenges in the cloud native ecosystem, including operationalizing software delivery to cloud platforms and ensuring secure, consistent permissions across microservices. He discusses AuthZed's unique approach to leveraging AI and machine learning to enhance cloud native permissions management, focusing on using existing AI models to accelerate and improve the permissions modeling process for their customers.

Looking to the future, Jake envisions making AuthZed a ubiquitous solution in permissions management, fostering seamless integration across various software solutions, and enabling better collaboration and efficiency. He draws a compelling picture of a future where permissions no longer limit software integration and user experience, inspired by the seamless interactions within Google's ecosystem.

Jake's passion for solving complex technical challenges and his vision for a more integrated and permission-aware software ecosystem shine through in this insightful conversation. It's a deep dive into the world of cloud native permissions and a testament to the power of innovation in addressing fundamental challenges in the tech landscape.
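To make the request-time side of this concrete: with a schema like the one sketched after the earlier AuthZed episode, a permission check is a single query. A rough example using SpiceDB's zed CLI (the document and user names are illustrative):

```
# Can user "emilia" read document "readme"?
zed permission check document:readme read user:emilia
```

The answer is computed by walking the stored relationships, which is what lets the permission model stay consistent across microservices.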
Stephen Augustus, the Head of Open Source at Cisco, shares his experiences and insights about contributing to and maintaining open source projects, including Kubernetes and OpenSSF Scorecard. Stephen highlights the importance of building sustainable practices and the value of having product, program, and project management skills in open source projects. Discussions delve into the inner workings of the Kubernetes project, the role and functionality of the OpenSSF Scorecard, and the process of incorporating new contributors and projects. He further emphasizes the importance of transparency and intentionality in corporations' involvement in open source projects.

00:00 Introduction and Guest Background
00:22 Stephen's Journey into Open Source and Kubernetes
05:41 The Success Factors of Kubernetes
06:09 Maintaining the Maintainers: The Balance of Work in Open Source
06:28 The Role of Corporations in Open Source
09:03 The Overwhelming Nature of Open Source Contribution
10:10 The Impact of Kubernetes on Other Open Source Projects
10:59 The Increasing Complexity in Full Stack Development
12:29 The Importance of Open Source Project Management
20:27 OpenSSF Scorecard

Guest:
Stephen Augustus is a Black engineering director and leader in open source communities. He is the Head of Open Source at Cisco, working within the Strategy, Incubation, & Applications (SIA) organization. For Kubernetes, he has co-founded transformational elements of the project, including the KEP (Kubernetes Enhancement Proposal) process, the Release Engineering subproject, and Working Group Naming. Stephen has also previously served as a chair for both SIG PM and SIG Azure. He continues his work in Kubernetes as a Steering Committee member and a Chair for SIG Release. Across the wider LF (Linux Foundation) ecosystem, Stephen has the pleasure of serving as a member of the OpenSSF Governing Board and the OpenAPI Initiative Business Governing Board. Previously, he was a TODO Group Steering Committee member, a CNCF (Cloud Native Computing Foundation) TAG Contributor Strategy Chair, and one of the Program Chairs for KubeCon / CloudNativeCon, the cloud native community's flagship conference. He is a maintainer for the Scorecard and Dex projects, and a prolific contributor to CNCF projects, amongst the top 40 (as of writing) code/content committers, all-time. In 2020, Stephen co-founded the Inclusive Naming Initiative, a cross-industry group dedicated to helping projects and companies make consistent, responsible choices to remove harmful language across codebases, standards, and documentation. He has previously held positions at VMware (via Heptio), Red Hat, and CoreOS. Stephen is based in New York City.
Kendall "Boo Boo" Howse, a Senior Designer, Illustrator, and Musician, discusses his design journey, which started with creating artwork for his music projects that eventually led to him freelancing full-time. He shares his passion for Dungeons & Dragons and its evolution to becoming a more inclusive and cooperative game. He emphasizes the importance of community and inclusivity in design and shares his experiences advocating for diversity and equity in the tech industry. He also encourages slow progress and reevaluation of solutions to create meaningful change.Key Takeaways:Experience cooperative storytelling and communication through Dungeons & DragonsFollow his journey from creating band merchandise to becoming a full-time freelancer and eventually joining the tech industryExplore his commitment to creating inclusive spaces and advocating for marginalized communities within the Bay Area Black Designers communityLearn about community, collaboration, allyship, and systemic change from his perspectiveReflect on the challenges and opportunities of working in the tech industry as a designer and on the need for progress and meaningful actionQuotes:"If I believe that I am to do this alone, if I believe that the best ideas available to me are already in my head, I am doomed to fail." - Kendall "Boo Boo" Howse"In alternative spaces, we have the power to set our own rules and create new realities." - Kendall "Boo Boo" Howse"Every once in a while, we need to take stock and see if there has actually been progress forward." - Kendall "Boo Boo" HowseTimestamps:(01:28) Introduction of the multi-talented Kendall "Boo Boo" Howse(02:24) Icebreaker: How playing Dungeons & Dragons impacts him as a designer(06:31) The history of his nickname "Boo Boo" and why he sought out alternative culture as a kid(10:02) Feeling isolated after the Murder of George Floyd, the transformation of his music from despair and frustration to hope and encouragement, and the self-made CEO myth(17:39) His path into design, freelancing, and how he almost lost his home(24:03) ERGs and Bay Area Black Designers: His career in tech and inclusion(26:29) The importance of allyship to get through marginalization and isolation and how to make a meaningful impact in the workplace(31:04) How to measure systemic change and the importance of knowing the differences between no progress, slow progress, and false progress(32:21) Ways to get in touch with KendallAbout The Guest:Kendall "Boo Boo" Howse is a Senior Designer, Illustrator, and Musician based in Oakland, California. With a background in DIY punk rock bands, he developed a passion for design and visual communication. He has freelanced for various clients and worked in-house at CoreOS, later acquired by Red Hat and IBM. He also actively promotes diversity and inclusion in the tech industry by helping companies form ERGs, co-chairing Bay Area Black Designers, and facilitating Frame Shift Consulting.Connect with Kendall "Boo Boo" Howse:EmailInstagramReferenced Links:Dungeons & DragonsMass ArrestZulu
Today's guest is Kelsey Hightower, a distinguished engineer and developer advocate at Google and a speaker known for his work with Kubernetes, open source software, and cloud computing.

As a curious and motivated self-learner, Kelsey dropped out of college and taught himself the skills required to start his career as an independent contractor for BellSouth – a telecoms company in Atlanta – helping the community get online. From there, Kelsey set up his own business – an electronics store – before becoming involved in the open source world, working at New Relic, CoreOS, Puppet Labs, and most recently at Google.

A self-taught developer, Kelsey's work on Kubernetes and at Google, from which he just retired, is well-known*, so I wanted to focus our conversation on his life - how he got into tech, his love of learning, what drives him, what it means to be hopeful, and the one piece of advice he would offer a younger Kelsey.

I know I am not meant to have favourites – these conversations are like children - but I have to say this is up there with one of my most loved conversations. I learned so much from Kelsey and I think you will too.

Enjoy!

Kelsey on Twitter
Danielle Twitter / Instagram / Newsletter
Photo of Kelsey is part of the Faces of Open Source Project by Peter Adams

*If you want to learn more about Kelsey's work history, give this episode from Ardan Labs a listen.
[English episode] In this episode we talk with Tim Jones of Sidero Labs. Together we take a deep dive into the world of Talos, the spiritual successor to CoreOS, which acts as a dedicated operating system built entirely around Kubernetes.

During the conversation we take a thorough look at the features and advantages of Talos, which is designed to integrate seamlessly with Kubernetes. Discover how Talos simplifies and optimizes the infrastructure, making Kubernetes clusters run more smoothly than ever before.

But that's not all! We also cover Omni, a powerful tool that lets you effortlessly manage multiple Talos clusters. Listen in and learn how these innovative solutions are transforming the world of container orchestration.

Kubernetes Community Days (KCD Utrecht)
Buy your tickets with discount code KCDUT23-PODCASTFRIENDS and receive 20% off!

Disclaimer: This post contains affiliate links. If you make a purchase, I may receive a commission at no extra cost to you.
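For context on what "a dedicated operating system for Kubernetes" means in practice: Talos ships no SSH and no shell, so nodes are driven entirely through the talosctl API client. A rough sketch of the bootstrap flow from the project's getting-started guide (cluster name and IP addresses are illustrative):

```
# Generate machine configs for a cluster named "homelab"
talosctl gen config homelab https://192.168.1.10:6443

# Apply the config to a node booted from the Talos ISO, then bootstrap etcd
talosctl apply-config --insecure --nodes 192.168.1.10 --file controlplane.yaml
talosctl bootstrap --nodes 192.168.1.10 --endpoints 192.168.1.10 --talosconfig ./talosconfig

# Fetch a kubeconfig for the new cluster
talosctl kubeconfig --nodes 192.168.1.10 --endpoints 192.168.1.10 --talosconfig ./talosconfig
```

Omni, discussed later in the episode, layers fleet management on top of this same API.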
Jeff Geerling, Owner of Midwestern Mac, joins Corey on Screaming in the Cloud to discuss the importance of storytelling, problem-solving, and community in the world of cloud. Jeff shares how and why he creates content that can appeal to anybody, rather than focusing solely on the technical qualifications of his audience, and how that strategy has paid off for him. Corey and Jeff also discuss the impact of leading with storytelling as opposed to features in product launches, and what's been going on in the Raspberry Pi space recently. Jeff also expresses the impact that community has on open-source companies, and reveals his take on the latest moves from Red Hat and HashiCorp.

About Jeff
Jeff is a father, author, developer, and maker. He is sometimes called "an inflammatory enigma".

Links Referenced:
Personal webpage: https://jeffgeerling.com/

Transcript

Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.

Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. A bit off the beaten path of the usual cloud-focused content on this show, today I'm speaking with Jeff Geerling, YouTuber, author, content creator, enigma, and oh, so much more. Jeff, thanks for joining me.

Jeff: Thanks for having me, Corey.

Corey: So, it's hard to figure out where you start versus where you stop, but I do know that as I've been exploring a lot of building up my own home lab stuff, suddenly you are right at the top of every Google search that I wind up conducting. I was building my own Kubernete on top of a Turing Pi 2, and sure enough, your teardown was the first thing that I found that, to be direct, was well-documented, and made it understandable. And that's not the first time this year that that's happened to me. What do you do exactly?

Jeff: I mean, I do everything. And I started off doing web design and then I figured that design is very, I don't know, once it started transitioning to everything being JavaScript, that was not my cup of tea. So, I got into back-end work, databases, and then I realized to make that stuff work well, you got to know the infrastructure. So, I got into that stuff. And then I realized, like, my home lab is a great place to experiment on this, so I got into Raspberry Pis, low-power computing efficiency, building your own home lab, all that kind of stuff.

So, all along the way, with everything I do, I always, like, document everything like crazy. That's something my dad taught me. He's an engineer in radio. And he actually hired me for my first job, he had me write an IT operations manual for the Radio Group in St. Louis. And from that point forward, that's—I always start with documentation. So, I think that was probably what really triggered that whole series. It happens to me too; I search for something, I find my old articles or my own old projects on GitHub or blog posts because I just put everything out there.

Corey: I was about to ask, years ago, I was advised by Scott Hanselman to—the third time I find myself explaining something, write a blog post about it because it's easier to refer people back to that thing than it is for me to try and reconstruct it on the fly, and I'll drop things here and there.
And the trick is, of course, making sure it doesn't sound dismissive and like, "Oh, I wrote a thing. Go read." Instead of having a conversation with people. But as a result, I'll be Googling how to do things from time to time and come up with my own content as a result.

It's at least a half-step up from looking at forums and the rest, where I realized halfway through that I was the one asking the question. Like, "Oh, well, at least this is useful for someone." And I, for better or worse, at least have a pattern of going back and answering how I solved a thing after I get there, just because otherwise, it's someone asked the question ten years ago and never returns, like, how did you solve it? What did you do? It's good to close that loop.

Jeff: Yeah, and I think over 50% of what I do, I've done before. When you're setting up a Kubernetes cluster, there's certain parts of it that you're going to do every time. So, whatever's not automated or the tricky bits, I always document those things. Anything that is not in the readme, is not in the first few steps, because that will help me and will help others. I think that sometimes that's the best success I've found on YouTube is also just sharing an experience.

And I think that's what separates some of the content that really drives growth on a YouTube channel or whatever, or for an organization doing it because you bring the experience, like, I'm a new person to this Home Assistant, for instance, which I use to automate things at my house. I had problems with it and I just shared those problems in my video, and that video has, you know, hundreds of thousands of views. Whereas these other people who know way more than I could ever know about Home Assistant, they're pulling in fewer views because they just get into a tutorial and don't have that perspective of a beginner or somebody that runs into an issue and how do you solve that issue.

So, like I said, I mean, I just always share that stuff. Every time that I have an issue with anything technological, I put it on GitHub somewhere. And then eventually, if it's something that I can really formulate into an outline of what I did, I put a blog post up on my blog. I still, even though I write I don't know how many words per week that goes into my YouTube videos or into my books or anything, I still write two or three blog posts a week that are often pretty heavy into technical detail.

Corey: One of the challenges I've always had is figuring out who exactly I'm storytelling for when I'm putting something out there. Because there's a plethora, at least in cloud, of beginner content of, here's how to think about cloud, here's what the service does, here's why you should use it et cetera, et cetera. And that's all well and good, but often the things that I'm focusing on presuppose a certain baseline level of knowledge that you should have going into this. If you're trying to figure out the best way to get some service configured, I probably shouldn't have to spend the first half of the article talking about what AWS is, as a for instance. And I think that inherently limits the size of the potential audience that would be interested in the content, but it's also the kind of stuff that I wish was out there.

Jeff: Yeah. There's two sides to that, too. One is, you can make content that appeals to anybody, even if they have no clue what you're talking about, or you can make content that appeals to the narrow audience that knows the base level of understanding you need.
So, a lot of times with—especially on my YouTube channel, I'll put things in that is just irrelevant to 99% of the population, but I get so many comments, like, "I have no clue what you said or what you're doing, but this looks really cool." Like, "This is fun or interesting." Just because, again, it's bringing that story into it.

Because really, I think on a base level, a lot of programmers especially don't understand—and infrastructure engineers are off the deep end on this—they don't understand the interpersonal nature of what makes something good or not, what makes something relatable. And trying to bring that into technical documentation a lot of times is what differentiates a project. So, one of the products I love and use and recommend everywhere and have a book on—a best-selling book—is Ansible. And one of the things that brought me into it and has brought so many people is the documentation started—it's gotten a little bit more complex over the years—but it started out as, "Here's some problems. Here's how you solve them."

Here's, you know, things that we all run into, like how do you connect to 12 servers at the same time? How do you have groups of servers? Like, it showed you all these little examples. And then if you wanted to go deeper, there was more documentation linked out of that. But it was giving you real-world scenarios and doing it in a simple way. And it used some little easter eggs and fun things that made it more interesting, but I think that that's missing from a lot of technical discussion and a lot of technical documentation out there is that playfulness, that human side, the get from Point A to Point B and here's why and here's how, but here's a little interesting way to do it instead of just here's how it's done.

Corey: In that same era, I was one of the very early developers behind SaltStack, and I think one of the reasons that Ansible won in the market was that when you started looking into SaltStack, it got wrapped around its own axle talking about how it uses ZeroMQ for a full mesh between all of the systems there, as long—sorry [unintelligible 00:07:39] mesh network that all routes—not really a mesh network at all—it talks through a single controller that then talks to all of its subordinate nodes. Great. That's awesome. How do I use this to install a web server, is the question that people had. And it was so in love with its own cleverness in some ways. Ansible was always much more approachable in that respect and I can't understate just how valuable that was for someone who just wants to get the problem solved.

Jeff: Yeah. I also looked at something like NixOS. It's kind of like the Arch of distributions of—

Corey: You must be at least this smart to use it in some respects—

Jeff: Yeah, it's—

Corey: —has been the every documentation I've had with that.

Jeff: [laugh]. There's, like, this level of pride in what it does, that doesn't get to 'and it solves this problem.' You can get there, but you have to work through the barrier of, like, we're so much better, or—I don't know what—it's not that. Like, it's just it doesn't feel like, "You're new to this and here's how you can solve a problem today, right now." It's more like, "We have this golden architecture and we want you to come up to it." And it's like, well, but I'm not ready for that. I'm just this random developer trying to solve the problem.

Corey: Right.
Like, they should have someone hanging out in their IRC channel and just watch for a week of who comes in and what questions do they have when they're just getting started and address those. Oh, you want to wind up just building a Nix box on EC2 for development? Great, here's how you do that, and here's how to think about your workflow as you go. Instead, I found that I had to piece it together from a bunch of different blog posts and the rest and each one supposed that I had different knowledge coming into it than the others. And I felt like I was getting tangled up very easily.

Jeff: Yeah, and I think it's telling that a lot of people pick up new technology through blog posts and Substack and Medium and whatever [Tedium 00:09:19], all these different platforms because it's somebody that's solving a problem and relating that problem, and then you have the same problem. A lot of times in the documentation, they don't take that approach. They're more like, here's all our features and here's how to use each feature, but they don't take a problem-based approach. And again, I'm harping on Ansible here with how good the documentation was, but it took that approach: you have a bunch of servers, you want to manage them, you want to install stuff on them, and all the examples flowed from that. And then you could get deeper into the direct documentation of how things worked.

As a polar opposite of that, in a community that I'm very much involved in still—well, not as much as I used to be—is Drupal. Their documentation was great for developers but not so great for beginners and that was always—it still is a difficulty in that community. And I think it's a difficulty in many, especially open-source communities where you're trying to build the community, get more people interested because that's where the great stuff comes from. It doesn't come from one corporation that controls it, it comes from the community of users who are passionate about it. And it's also tough because for something like Drupal, it gets more complex over time and the complexity kind of kills off the initial ability to think, like, wow, this is a great little thing and I can get into it and start using it.

And a similar thing is happening with Ansible, I think. When I got started, there were a couple hundred modules. Now there's, like, 4,000 modules, or I don't know how many modules, and there's all these collections, and there's namespaces now, all these things that feel like Java overhead type things leaking into it. And that diminishes that ability for me to see, like, oh, this is my simple tool that's solving these problems.

Corey: I think that that is a lost art in the storytelling side of even cloud marketing, where they're so wrapped around how they do what they do that they forget, customers don't care. Customers care very much about their problem that they're trying to solve. If you have an answer for solving that problem, they're very interested. Otherwise, they do not care. That seems to be a missing gap.

Jeff: I think, like, especially for AWS, Google, Azure cloud platforms, when they build their new services, sometimes you're, like, "And that's for who?" For some things, it's so specialized, like, Snowmobile from Amazon, like, there's only a couple customers on the planet in a given year that need something like that. But it's a cool story, so it's great to put that into your presentation.
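[Editor's note: to make the "groups of servers" documentation style concrete, here is a minimal sketch of the kind of inventory and playbook Ansible's early docs led with. The hostnames, group names, and playbook are illustrative, not taken from the episode.]

```
# inventory.ini — groups let one command or play target many hosts at once.
# web[01:12].example.com expands to twelve hosts via Ansible's range syntax.
[web]
web[01:12].example.com

[db]
db1.example.com
```

```yaml
# site.yml — problem-first: "install nginx on every web server"
- hosts: web
  become: true
  tasks:
    - name: Install nginx
      ansible.builtin.package:
        name: nginx
        state: present
```

Run it with `ansible-playbook -i inventory.ini site.yml`, or ping the whole group ad hoc with `ansible web -i inventory.ini -m ping`.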
But some other things, like, especially nowadays with AI, seems like everybody's throwing tons of AI stuff—spaghetti—at the wall, seeing what will stick and then that's how they're doing it. But that really muddies up everything.

If you have a clear vision, like with Apple, they just had their presentation on the new iPhone and the new neural engine and stuff, they talk about, "We see your heart patterns and we tell you when your heart is having problems." They don't talk about their AI features or anything. I think that leading with that story and saying, like, here's how we use this, here's how customers can build off of it, those stories are the ones that are impactful and make people remember, like, oh, Apple is the company that saves people's lives by making watches that track their heart. People don't think that about Google, even though they might have the same feature. Google says we have all these 75 sensors in our thing and we have this great platform and Android and all that. But they don't lead with the story.

And that's something where I think corporate Apple is better than some of the other organizations, no matter what the technology is. But I get that feeling a lot when I'm watching launches from Amazon and Google and all their big presentations. It seems like they're tech-heavy and they're driven by, like, "What could we do with this? What could you do with this new platform that we're building," but not, "And this is what we did with this other platform," kind of building up through that route.

Corey: Something I've been meaning to ask someone who knows for a while, and you are very clearly one of those people, I spend a lot of time focusing on controlling cloud costs and I used to think that Managed NAT Gateways were very expensive. And then I saw the current going rates for Raspberries Pi. And that has been a whole new level of wild. I mean, you mentioned a few minutes ago that you use Home Assistant. I do too.

But I was contrasting the price between a late-model Raspberry Pi 4—late model; it's three years old at this point if memory serves, maybe four—versus a used small form factor PC from HP, and the second was less expensive and far more capable. Yeah, it draws a bit more power and it's a little bit larger on the shelf, but it was basically no contest. What has been going on in that space?

Jeff: I think one of the big things is we're at a generational improvement with those small form-factor little, like, tiny-size, almost [NUC-sized 00:13:59] PCs that were used all over the place in corporate environments. I still—like every doctor's office you go to, every hospital, they have, like, a thousand of these things. So, every two or three or four years, however long it is on their contract, they just pop all those out the door and then you get an E-waste company that picks up a thousand of these boxes and they got to offload them. So, the nice thing is that it seems like a year or two ago, that really started accelerating to the point where the price was driven down below 100 bucks for a fully built-out little x86 Mini PC.
Sure, it's, you know, like you said, a few generations old and it pulls a little bit more power, usually six to eight watts at least, versus a Raspberry Pi at two to three watts, but especially for those of us in the US, electricity is not that expensive so adding two or three watts to your budget for a home lab computer is not that bad.

The other part of that is, for the past two-and-a-half years because of the global chip shortages and because of the decisions that Raspberry Pi made, there were so few Raspberry Pis available that their prices shot up through the roof if you wanted to get one in any timely fashion. So, that finally is clearing up, although I went to the Micro Center near me yesterday, and they said that they have not had stock of Raspberry Pi 4s for, like, two months now. So, they're coming, but they're not distributed evenly everywhere. And still, the best answer, especially if you're going to run a lot of things on it, is probably to buy one of those little mini PCs if you're starting out a home lab.

Or there's some other content creators who build little Kubernetes clusters with multiple mini PCs. Three of those stack up pretty nicely and they're still super quiet. I think they're great for home labs. I have two of them over on my shelf that I'm using for testing and one of them is actually in my rack. And I have another one on my desk here that I'm trying to set up for a five gigabit home router since I finally got fiber internet after years with cable and I'm still stuck on my old gigabit router.

Corey: Yeah, I wound up switching to a Protectli, I think is what it's called, for—it's one of those things I've installed pfSense on. Which, I'm an old FreeBSD hand and I haven't kept up with it, but that's okay. It feels like going back in time ten years, in some respects—

Jeff: [laugh].

Corey: —so all right. And I have a few others here and there for various things that I want locally. But invariably, I've had the WiFi controller; I've migrated that off. That lives on an EC2 box in Ohio now. And I do wind up embracing cloud services when I don't want it to go down and be consistently available, but for small stuff locally, I mean, I have an antenna on the roof doing an ADS-B receiver dance that's plugged into a Pi Zero.

I have some backlogged stuff on this, but they've gotten expensive as alternatives have dropped in price significantly. But what I'm finding as I'm getting more into 3D printing and a lot of hobbyist maker tools out there, everything is built with the Raspberry Pi in mind; it has the mindshare. And yeah, I can get something with similar specs that are equivalent, but then I've got to do a whole bunch of other stuff as soon as it gets into controlling hardware via GPIO pins or whatnot. And I have to think about it very differently.

Jeff: Yeah, and that's the tough thing. And that's the reason why Raspberry Pis, even though they're three years old, even though they're hard to get, they still are fetching—on the used market—way more than the original MSRP. It's just crazy. But the reason for that is the Raspberry Pi organization.
And there's two: there's the Raspberry Pi Foundation, whose goals are to increase educational computing and accessibility for computers for kids and learning and all that, and then there's the Raspberry Pi trading company that makes the Raspberry Pis.

The Trading Company has engineers who sit there 24/7 working on the software, working on the kernel drivers, working on hardware bugs, listening to people on the forums and in GitHub and everywhere, and they're all English-speaking people there—they're over in the UK—and they manufacture their own boards. So, there's a lot of things on top of that. Even though they're using some silicon—Broadcom chips—that's a little bit locked down and not completely open-source like some other chips might be, there's a phone number you could call if you need the support, or there's a forum that has activity that you can get help in, and their software is supported. And there's a newer Linux kernel and the kernel is updated all the time. So, all those advantages mean you get a little package that will work, it'll sip two watts of power, sitting 24/7. It's reliable hardware.

There's so many people that use it that it's so well tested that almost any problem you could ever run into, someone else has, and there's a blog post or a forum post talking about it. And even though the hardware is not super powerful—it's three years old—you can add on a Coral TPU and do face recognition and object recognition. And throw in Frigate for Home Assistant to get notifications on your phone when your mom walks up to the door. There's so many things you can do with them and they're so flexible that they're still so valuable. I think that they really knocked it out of the park with that model, the Raspberry Pi 4, and the Compute Module 4, which is still impossible to get. I have not been able to buy one for two years now. Luckily, I bought 12 two-and-a-half years ago [laugh] otherwise I would be running out for all my projects that I do.

Corey: Yeah. I got two at the moment and two empty slots in the Turing Pi 2, which I'll care more about if I can actually get the thing up and booted. But it presupposes you have a Windows computer or otherwise, ehh, watch this space; more coming. Great. Like, do I build a virtual machine on top of something else? It leads down the path super quickly of places I thought I'd escaped from.

Jeff: Yeah, you know, outside of the Pi realm, that's the state of the communities. It's a lot of, like, figuring out your own things. I did a project—I don't know if you've heard of Mr. Beast—but we did a project for him that involves a hundred single-board computers. We couldn't find Raspberry Pis so we had to use a different single-board computer that was available.

And so, I bought an older one thinking, oh, this is, like, three or four years old—it's older than the Pi 4—and there must be enough support now. But still, there's, like, little rough edges everywhere I went and we ended up making them work, but it took us probably an extra 30 to 40 hours of development work to get those things running the same way as a Raspberry Pi. And that's just the way of things. There's so much opportunity.

If one of these Chinese manufacturers that makes most of these things, if one of them decided, you know what? We're going to throw tons of money into building support for these things, get some English-speaking members of these forums to build up the community, all that stuff, I think that they could have a shot at Raspberry Pi's giant portion of the market.
But so far, I haven't really seen that happen. So far, they're spamming hardware. And it's like, the hardware is awesome. These chips are great if you know how to deal with them and how to get the software running and how to deal with Linux issues, but if you don't, then they're not great because you might not even get the thing to boot.

Corey: I want to harken back to something you said a minute ago, where there's value in having a community around something, where you can see everyone else has already encountered a problem like this. I think that folks who weren't around for the rise of cloud have no real insight into how difficult it used to be just getting servers into racks and everything up, and okay, they're identical, and seven of them are working, but that eighth one isn't for some strange reason. And you spend four hours troubleshooting what turns out to be a bad cable or something not seated properly and it's awful. Cloud got away from a lot of that nonsense. But it's important—at least to me—to not be Captain Edgecase, where if you pick some new cloud provider and Google for how to set up a load balancer and no one's done it before you, that's not great. Whereas if I'm googling now in the AWS realm and no one has done the thing I'm trying to do, that should be something of a cautionary flag of maybe this isn't how most people go about approaching production. Really think twice about this.

Jeff: Yep. Yeah, we ran into that on a project I was working on that was using Magento—which I don't know if anybody listening uses Magento, but it's not fun—and we ran into some things where it's like, "We're doing this, and it says that they do this on their official supported platform, but I don't know how they are because the code just doesn't exist here." So, we ran into some weird edge cases on AWS with some massive infrastructure for the databases, and I ran into scaling issues. But even there, there were forum posts in AWS here and there that had little nuggets that helped us to figure out a way to get around it. And like you say, that is a massive advantage for AWS.

And we ran into an issue with, we were one of the first customers trying out the new Lambda functions for RDS—or I don't remember exactly what it was called initially—but we ended up not using that. But we ran into some of these issues and figured out we were the first customer running into this weird scaling thing when we had a certain size of database trying to use it with these Lambda calls. And eventually, they got those things solved, but with AWS, they've seen so many things and some other cloud providers haven't seen these things. So, when you have certain types of applications that need to scale in certain ways, that is so valuable, and the community of users, the ability to pull from that community when you need to hire somebody in an emergency, like, we need somebody to help us get this project done and we're having this issue, you can find somebody that is, like, okay, I know how to get you from Point A to Point B and get this project out the door. You can't do that on certain platforms.

And open-source projects, too. We've always had that problem in Drupal. The amount of developers who are deep into Drupal to help with the hard problems is not vast, so the ones who can do that stuff, they're all hired off and paid a handsome sum. And if you have those kinds of problems you realize, I'm either going to need to pay a ton of money or we're just going to have to not do that thing that we wanted to do.
And that's tough.

Corey: What I've found, sort of across the board, has been that there's a lot of, I guess, open-source community ethos that has bled into a lot of this space and I wanted to make sure that we have time to talk about this because I was incensed a while back when Red Hat decided, "Oh, you know that whole ten-year commitment on CentOS? That project that we acquired and are now basically stabbing in the face?"—disclosure. I used to be part of the CentOS project years ago when I was on network staff for the Freenode IRC network—then it was, "Oh yeah, we're just going to basically undermine our commitments to you and now you can pay us if you want to get that support there." And that really set me off. It was nice to see you were right there as well in almost lockstep with me, pointing out that this is terrible, just as far as breaking promises you've made to customers. Has your anger cooled any? Because mine hasn't.

Jeff: It has not. My temper has cooled. My anger has not. I don't think that they get it. After all the backlash that they got after that, I don't think that the VP-level folks at Red Hat understand that this is already impacting them and will impact them much more in the future because people like me and you, people who help other people build infrastructure and people who recommend operating systems and people who recommend patterns and things, we're just going to drop off using CentOS because it doesn't exist. It does exist, and some other people are saying, "Oh, it's actually better to use this new CentOS, you know, Stream. Stream is amazing." It's not. It's not the same thing. It's different. And—

Corey: I used to work at a bank. That was not an option. I mean, granted at the bank for the production systems it was always [RHEL 00:25:18], but being able to spin up a pre-production environment without having to pay license fees on every VM. Yeah.

Jeff: Yeah. And not only that, they did this announcement and framed it a certain way, and the community immediately saw. You know, I think that they're just angry about something, and whether it was a NASA contract with Rocky Linux, or whether it was something Oracle did, who knows, but it seems petty in retrospect, especially in comparison to the amount of backlash that came out of it. And I really don't think that they understand that the thing that they had with Red Hat Enterprise Linux is not a massive growth opportunity for Red Hat. It's, in some ways, a dying product—in terms of, compared to using cloud stuff, it doesn't matter.

You could use CoreOS, you could use NixOS, you could use anything, it doesn't really matter. For people like you and me, we just want to deploy our software. And if it's containers, it really doesn't matter. It's just the people in government or in certain organizations that have these roles that you have to use whatever FIPS and all that kind of stuff. So, it's not like it's a hyper-growth opportunity for them.

CentOS was, like, the only reason why all the software, especially on the open-source side, was compatible with Red Hat, because we could use CentOS and it was easy and simple. They took that—well, they tried to take that away and everybody's like, "That's—what are you doing?" Like, I posted my blog post and I think that sparked off quite a bit of consternation, to the point where there was a lot of personal stuff going on.
I basically said, “I'm not supporting Red Hat Enterprise Linux for any of my work anymore.” Like, “From this point forward, it's not supported.”I'll support OpenELA, I'll support Rocky Linux or Oracle Linux or whatever because I can get free versions that I don't have to sign into a portal and get a license and download the license and integrate it with my CI work. I'm an open-source developer. I'm not going to pay for stuff or use 16 free licenses. Or I was reached out to and they said, “We'll give you more licenses. We'll give you extra.” And it's like, that's not how this works. Like, I don't have to call Debian and Ubuntu and [laugh] I don't even have to call Oracle to get licenses. I can just download their software and run it.So, you know, I don't think they understood the fact that they had that. And the bigger problem for me was the two-layer approach to destroying all the trust that the community had. First was in, I think it was 2019 when they said—we're in the middle of CentOS 8's release cycle—they said, “We're dropping CentOS 8. It's going to be Stream now.” And everybody was up in arms.And then Rocky Linux and [unintelligible 00:27:52] climbed in and gave us what we wanted: basically, CentOS. So, we're all happy and we had a status quo, and Rocky Linux 9 and [unintelligible 00:28:00] Linux nine came out after Red Hat 9, and the world was a happy place. And then they just dumped this thing on us and it's like, two major release cycles in a row, they did it again. Like, I don't know what this guy's thinking, but in one of the interviews, one of the Red Hat representatives said, “Well, we wanted to do this early in Red Hat 9's release cycle because people haven't started migrating.” It's like, well, I already did all my automation upgrades for CI to get all my stuff working in Rocky Linux 9 which was compatible with Red Hat Enterprise Linux 9. Am I not one of the people that's important to you?Like, who's important to you? Is it only the people who pay you money or is it also the people that empower your operating system to be a premier Enterprise Linux operating system? So, I don't know. You can tell. My anger has not died down. The amount of temper that I have about it has definitely diminished because I realize I'm talking at a wall a lot of times, when I'm having conversations on Twitter, private conversations and email, things like that.Corey: People come to argue; they don't come to actually have a discussion.Jeff: Yeah. I think that they just, they don't see the community aspect of it. They just see the business aspect. And the business aspect, if they want to figure out ways that they can get more people to pay them for their software, then maybe they should provide more value and not just cut off value streams. It doesn't make sense to me from a long-term business perspective.From a short term, maybe there were some clients who said, “Oh, shoot. We need this thing stable. We're going to pay for some more licenses.” But the engineers that those places are going to start making plans of, like, how do we make this not happen again. And the way to not make that happen, again is to use, maybe Ubuntu or maybe [unintelligible 00:29:38] or something. Who knows? But it's not going to be increasing our spend with Red Hat.Corey: That's what I think a lot of companies are missing when it comes to community as well, where it's not just a place to go to get support for whatever it is you're doing and it's not a place [where 00:29:57] these companies view prospective customers. 
There's more to it than that. There has to be a social undercurrent on this. I look at the communities I spend time in, and in some of them, dating back long enough, I've made lifelong significant friendships out of those places, just through talking about our lives, in addition to whatever the community is built around. You have to make space for that, and companies don't seem to fully understand that.

Jeff: Yeah, I think that there's this thing that a community has to provide value and monetizable value, but I don't think that you get open-source if you think that that's what it is. I think some people in corporate open-source think that corporate open-source is a value stream opportunity. It's a funnel, it's something that is going to bring you more customers—like you say—but they don't realize that it's a community. It's like a group of people. It's friends, it's people who want to make the world a better place, it's people who want to support your company by wearing your t-shirt to conferences, people who want to put on your red fedora because it's cool. Like, it's all of that. And when you lose some of that, you lose what makes your product differentiated from all the other ones on the market.

Corey: That's what gets missed. I think that there's a goodwill aspect of it. People who have used the technology and understand its pitfalls are likelier to adopt it. I mean, if you tell me to get a website up and running, I am going to build an architecture that resembles what I've run before on providers that I've run on before because I know what the failure modes look like; I know how to get things up and running. If I'm in a hurry, trying to get something out the door, I'm going to choose the devil that I know, on some level.

Don't piss me off as a community member and incentivize me to change that estimation the next time I've got something to build. Well, that doesn't show up on this quarter's numbers. Well, we have so little visibility into how decisions get made at many companies that you'll never know that you have a detractor who's still salty about something you did five years ago, and that's the reason the bank decided not to—because that person called in their political favors to torpedo that deal and have a sweetheart offer from your competitor, et cetera and so on and so forth. It's hard to calculate the actual cost of alienating goodwill. But—

Jeff: Yeah.

Corey: I wish companies had a longer memory for these things.

Jeff: Yeah. I mean, and thinking about that, like, there was also the HashiCorp incident where they kind of torpedoed all developer goodwill with their Terraform and other—Terraform especially, but also other products. Like, I probably, through my book and through my blog posts and my GitHub examples, have brought a lot of people into the HashiCorp ecosystem through Vagrant use, and through Packer and things like that. At this point, because of the way that they treated the open-source community with the license change, a guy like me is not going to be enthusiastic about it anymore and I'm going to—I had already started looking at alternatives for Vagrant because it doesn't mesh with modern infrastructure practices for local development as much, but now it's like that enthusiasm is completely gone.
Like, I had that goodwill, like you said earlier, and now I don't have that goodwill and I'm not going to spread it. I'm not going to advocate for them, I'm not going to wear their t-shirt [laugh], you know, when I go out and about, because it just doesn't feel as clean and cool and awesome to me as it did a month ago.

And I don't know what the deal is. It's partly the economy, money's drying up, things like that, but I don't understand how the people at the top can't see these things. Maybe it's just that their organization isn't set up to surface what the engineers underneath are seeing—and I know some of those engineers are, like, “Yeah, I'm sorry. This was dumb. I still work here because I get a paycheck, and I can't say anything on social media, but thank you for saying what you did on Twitter.” Or X.

Corey: Yeah. It's nice being independent, where you don't really have to fear the, “Well, if I say this thing online, people might get mad at me and stop doing business with me, or fire me.” It's, well, yeah, I mean, I would have to say something pretty controversial to drive away every client and every sponsor I've got at this point. And I don't generally have that type of failure mode when I get it wrong. I really want to thank you for taking the time to talk with me. If people want to learn more, where's the best place for them to find you?

Jeff: Old school: my personal website, jeffgeerling.com. I link to everything from there, and I have an About page with a link to every profile I've ever had, so check that out. It links to my books, my YouTube, all that kind of stuff.

Corey: There's something to be said for picking a place to contact you that will last the rest of your career, as opposed to, back in the olden days, my first email address was the one that my ISP gave me 25 years ago. I don't use that one anymore.

Jeff: Yep.

Corey: And having to tell everyone I corresponded with that it was changing was a pain in the butt. We'll definitely put a link to that one in the [show notes 00:34:44]. Thank you so much for taking the time to speak with me. I appreciate it.

Jeff: Yeah, thanks. Thanks so much for having me.

Corey: Jeff Geerling, YouTuber, author, content creator, and oh so very much more. I'm Cloud Economist Corey Quinn and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice along with an angry comment that we will, of course, read [in action 00:35:13], just as soon as your payment of compute modules for Raspberries Pi shows up in a small unmarked bag.

Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.
This week we explore the history of containers, particularly containerd, with Phil Estes. Do you have something cool to share? Some questions? Let us know: - web: kubernetespodcast.com - mail: kubernetespodcast@google.com - twitter: @kubernetespod News of the week Notary Project announces a major release! (Blog) Kubernetes Legacy Package Repositories Will Be Frozen On September 13, 2023 (Blog) Gateway API v0.8.0: Introducing Service Mesh Support (Blog) Amazon VPC CNI now supports Kubernetes Network Policies (Blog) Introducing VMware Tanzu Developer Portal: Empowering Developers with Enterprise-Grade Backstage Google Cloud Next page Google Cloud Next Blogs Google Cloud Post-Next Videos KubeCon NA 2023 Schedule Rig.dev startup (Blog) Links from the interview Docker Containerd Chroot (archlinux wiki) Linux namespaces (Linux man page) runC announcement (2015) runC on Github Containerd project creation announcement (2016) Containerd donation to CNCF announcement (2017) Containerd graduation announcement (2019) Container Runtime Interface (CRI) Kubernetes SIG Node Dockershim debacle (kubernetes.io blog) Dockershim deprecation FAQ (kubernetes.io blog) Mirantis-owned cri-dockershim on Github Open Container Initiative (OCI) Cloud Native Computing Foundation (CNCF) CoreOS (“What was CoreOS” blog by RedHat) Rkt (“What is Rkt” blog by RedHat) Kinvolk BlaBlaCar BlaBlaCar Case Study on Google Cloud gRPC gVisor Kata Containers Docker && WASM with Justin Cormack (Docker CTO) on the Kubernetes Podcast from Google WasmEdge (A Wasm runtime) CRI-O (lightweight container runtime for Kubernetes) Containerd scope and principles nerdctl: Docker-compatible CLI for containerd Docker Buildkit github.com/container-image, github.com/container-storage Podman Skopeo Firecracker microvms Intel Clear Containers Hyper.sh Open Infrastructure Foundation OpenStack Cloud Native Rejekts “Face off: VMs vs. Containers vs Firecracker” by Alex Ellis at Cloud Native Rejekts EU 2023 Links from the post-interview chat Keynote: Reperforming a Nobel Prize Discovery on Kubernetes - Ricardo Rocha & Lukas Heinrich Keynote: CERN Experiences - Ricardo Rocha & Clenimar Filemon Jesse Frazelle's container escape challenge used to be at contained.af, but it doesn't seem to exist anymore. Containers from Scratch - Liz Rice at GOTO 2018 (there are a bunch of recordings of this talk)
Coming up in this episode: The prying eyes wanna know
Kelsey Hightower joins Corey on Screaming in the Cloud to discuss his reflections on how the tech industry is progressing. Kelsey describes what he's been getting out of retirement so far, and reflects on what he learned throughout his high-profile career - including why feature sprawl is such a driving force behind the complexity of the cloud environment and the tactics he used to create demos that are engaging for the audience. Corey and Kelsey also discuss the importance of remaining authentic throughout your career, and what it means to truly have an authentic voice in tech.
About Kelsey
Kelsey Hightower is a former Distinguished Engineer at Google Cloud, the co-chair of KubeCon, the world's premier Kubernetes conference, and an open source enthusiast. He's also the co-author of Kubernetes Up & Running: Dive into the Future of Infrastructure. Recently, Kelsey announced his retirement after a 25-year career in tech.
Links Referenced: Twitter: https://twitter.com/kelseyhightower
Transcript
Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.

Corey: Do you wish there were cheat codes for database optimization? Well, there are – no, seriously. If you're using Postgres or MySQL on Amazon Aurora or RDS, OtterTune uses AI to automatically optimize your knobs and indexes and queries and other bits and bobs in databases. OtterTune applies optimal settings and recommendations in the background or surfaces them to you and allows you to do it. The best part is that there's no cost to try it. Get a free, thirty-day trial to take it for a test drive. Go to ottertune dot com to learn more. That's O-T-T-E-R-T-U-N-E dot com.

Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. You know, there's a great story from the Bible or Torah—Old Testament, regardless—that I was always a big fan of, where you wind up with the Israelites walking the desert for 40 years in order to figure out what comes next. And Moses led them but could never enter into what came next. Honestly, I feel like my entire life is sort of going to be that direction. Not the biblical aspects, but rather always wondering what's on the other side of a door that I can never cross, and that door is retirement. Today I'm having returning guest Kelsey Hightower, who is no longer at Google. In fact, he is no longer working and has joined the ranks of the gloriously retired. Welcome back, and what's it like?

Kelsey: I'm happy to be here. I think retirement is just like work in some ways: you have to learn how to do it. A lot of people have had no practice in their adult life with what to do with all of their time. We get small dabs of it—like, you get the weekend off, depending on where you work—but you never have enough time to kind of unwind and get into something else. So, I'm being honest with myself. It's going to be a learning curve, what to do with that much time.

You're probably still going to do work, but it's going to be a different type of work than you're used to. And so, that's where I am.
30 days into this, I'm in that learning mode; I'm on-the-job training.

Corey: What's harder than you expected?

Kelsey: It hasn't been that hard, because I think mentally I've been preparing for, like, the last ten years: being a minimalist, learning how to kind of live within my means, learning to appreciate things that are just not work-related or status symbols. And so, to me, it felt like a smooth transition, because I started to value my time more than anything else, right? Just waking up the next day became valuable to me. Spending time in the moment—right, you go to these conferences, there's, like, 10,000 people, but you learn to value those one-on-one encounters, those one-off, kind of, let's-just-go-grab-lunch situations. So, to me, retirement just makes more room for that, right? I no longer have this calendar that is super full, so I think, for me, it was a nice transition in terms of getting more of that valuable time back.

Corey: It seems to me that you're in a similar position to the one that I find myself in, where the job that you were doing—and I still am—is tied, more or less, to a sense of identity as opposed to a particular task or particular role that you fill. You were Kelsey Hightower. That was a complete sentence. People didn't necessarily need to hear the rest of what you were working on or what you were going to be talking about at a given conference or whatnot. So, it seemed, at least from the outside, that an awful lot of what you did was quite simply who you were. Do you feel that your sense of identity has changed?

Kelsey: So, I think when you have that much influence, when you have that much reputation, the words you say travel further; they tend to come with a little bit more respect. And so, when you're working with a team on a new product and you say, “Hey, I think we should change some things,” and they hear those words coming from someone that they trust, or a name that is attached to a reputation, you tend to be able to make a lot of impact with very few words. But what you also find is that no matter what you get involved in—configuration management, distributed systems, serverless, working with customers—it all is helped and aided by the reputation that you bring into that line of work. And so yes, who you are matters, but one thing that I think helped me greatly: people are paying attention maybe to the last eight years of my career—containers, Kubernetes—but my career stretches back to the converting-COBOL-into-Python days; the dawn of DevOps with Puppet, Chef, and Ansible; the Golang appearance and every tool being rewritten from Ruby to Golang; the Docker era.

And so, my identity has stayed with me throughout those transitions. And so, it was very easy for me to walk away from that thing, because I've done it three or four times before in the past, so I know who I am. I've never had, like, a Twitter bio that said, “Company X. X person from Company X.” I learned long ago to just decouple who I am from my current employer, because that is always subject to change.

Corey: I was fortunate enough to not find myself in the public eye until I owned my own company. But I definitely remember times in my previous incarnations where I was, “Oh, today I'm working at this company,” and I believed—usually inaccurately—that this was it. This was where I'd really found my niche. And then, surprise, I'm not there anymore six months later, whether by their decision, my decision, or mutual agreement.
And I was always hesitant about hanging out a shingle that was tied too tightly to any one employer. Even now, I was a little worried about doing it when I went independent, just because, well, what if it doesn't work? “What if,” on some level. I think that there's an authenticity that you can bring with you—and you certainly have—where, for a long time now, whenever you say something, I take it seriously, and a lot of people do. It's not that you're unassailably correct, but I've never known you to say something you did not authentically believe in. And that is an opinion that is very broadly shared in this industry. So, if nothing else, you definitely were a terrific object lesson in speaking the truth as you saw it.

Kelsey: I think what you describe applies whether you're an engineer doing QA or working in the sales department: when you can be honest with the team you're working with, when you can be honest with the customers you're selling to, when you can be honest with the community you're part of, that's where the authenticity gets built, right? With companies, sometimes, on the surface, you believe that they just want you to walk the party line—you know, they give you the lines, you read them verbatim, and you're doing your part. To be honest, you can do that with the website. You can do that with a well-placed ad in the search queries.

What people are actually looking for are real people with real experiences, sharing not just fact—I think when you mix fact and opinion, you get this level of authenticity that you can't get from pure strategic marketing. And so, having that leverage—I remember, back in the day, people used to say, “I'm going to do the right thing, and if it gets me fired, then that's just the way it's going to be. I don't want to go around doing the wrong thing because I'm scared I'm going to lose my job.” You want to find yourself in that situation where doing the right thing is also the best thing for the company, and that's very rare, so I've either had that opportunity or I've tried to create that opportunity, and moved from there.

Corey: It resonates and it shows. I have never had a lot of respect for people who effectively are saying one thing today and another thing the next week based upon which way they think the winds are blowing. But there's also something to be said for being able and willing to publicly recant things you have said previously as technology evolves, as your perspective evolves, and, in light of new information, saying, “I'm now going to change my perspective on something.” I've done that already with multi-cloud, for example. I thought it was ridiculous when I heard about it. But there are also expressions of it that basically every company is using, including my own. And it's a nuanced area. Where I find it challenging is when you see a lot of these perspectives that people are espousing that just so happen to deeply align with where their paycheck comes from in any given week. That doesn't ring quite as true to me.

Kelsey: Yeah, most companies actually don't know how to deal with it either. Now, there have been times at any number of companies where the authentic opinion I put out there was against the party line. And you get those emails from directors and VPs. Like, “Hey, I thought we all agreed to think this way, or to at least say this.” And that's where you have to have that moment of clarity and say, “Listen, that is undeniably wrong.
It's so wrong, in fact, that if you say this in public, whether in a small setting or a large setting, you are going to instantly lose credibility going forward, yourself. Forget the company for a moment. There's going to be a situation where you will no longer be effective in your job, because all of your authenticity is now gone. And so, what I'm trying to do and tell you is: don't do that. You're better off saying nothing.”

But if you go out there and you're telling what is obviously misinformation, or isn't accurate, people are not dumb. They're going to see through it, and you will be classified as a person not to listen to. And so, I think a lot of people struggle with that, because they believe that the enterprise's consensus should also be theirs.

Corey: An argument that I made—we'll call it a prediction—four-and-a-half years ago was that in five years, nobody would really care about Kubernetes. And people misunderstood that initially, and I've clarified since, repeatedly, that I'm not suggesting it's going away—“Oh, turns out that was just a ridiculous fever dream and we're all going back to running bare metal with our hands again”—but rather that it would slip below the surface level of awareness. And I don't know that I got the timing quite right on that; I think it's going to depend on the company and the culture that you find yourself in. But increasingly, when there's an application to run, it's easy to ask someone, “Oh, great. Where does the Kubernetes cluster live so we can throw this on there and just add it to the rest of the pile?”

That is sort of what I was seeing. My intention with that was not purely to be controversial, as much fun as that might be, but also to act as a bit of a warning, where I've known too many people who let their identities become inextricably tangled with the technology. But technologies rise and fall, and at some point—like, you talk about the configuration management days; I learned to speak publicly as a traveling trainer for Puppet. I wrote part of SaltStack once upon a time. But it was clear that that was not the direction the industry was going, so it was time to find something else to focus on. And I fear for people who don't keep their feet underneath them and pay attention to broader market trends.

Kelsey: Yeah, I think about whenever I was personally caught up in linking my identity to technology, like, “I'm a Rubyist,” right? “I'm a Puppeteer.” And you wear those names proudly. But I remember just thinking to myself, like, “You have to take a step back. What's more important, you or the technology?” And at some point, I realized, like, it's me that is more important, right? Like, my independent thinking on this, my independent experience with this, is far more important than the success of this thing.

But also, I think there's a component there. Like, when you talked about Kubernetes, you know, maybe being less relevant in five years, there are two things there. One is the success of all infrastructure things equals irrelevancy. When flights don't crash, when bridges just work, you do not think about them. You just use them because they're so stable and they become very boring. That is the success criteria.

Corey: Utilities. No one's wondering if the faucet's going to work when they turn it on in the morning.

Kelsey: Yeah. So, you know, there's a couple of ways to look at your statement. One is, you believe Kubernetes is on the trajectory where it's going to stabilize itself and hit that success criteria, and then it will be irrelevant.
Or there's the other part of irrelevancy, where something else comes along and replaces that thing, right? I think Cloud Foundry and Mesos are two good examples of that—Kubernetes came along and stole all of the attention from them because those particular products never gained mass adoption. Maybe they got to the stable part, but they never got to the mass adoption part. So, I think when it comes to infrastructure, it's going to be irrelevant either way. It's just, what side of that [laugh] coin do you land on?

Corey: It's similar to folks who used to have to work at a variety of different companies on very specific Linux kernel subsystems, because everyone had to care, because there were significant performance impacts. Time went on, and now there are still a few of those people that very much need to care, but for the rest of us, it is below the level of things that we have to care about. For me, the signs of the unsustainability were: oh, you can run Kubernetes effectively in production? That's a minimum of a quarter-million dollars a year in comp, or up in some cases. Not every company is going to be able to field a team of those people and still remain a going concern in business. Nor, frankly, should they have to.

Kelsey: I'm going to pull on that thread a little bit, because we're hitting that ten-year mark of Kubernetes. So, when Kubernetes came out, why were people drawn to it, right? Why did it even get the time of day to begin with? And I think Docker kind of opened Pandora's box there. This idea of Chef, Puppet, Ansible, ten thousand package managers—honestly, that trajectory was going to continue forever, and it was helping no one. It was literally people doing duplicate work depending on the operating system you were dealing with, and we were wasting time copying bits to servers—literally—in a very glorified way.

So, Docker comes along and gives us this nicer, better abstraction, but it has gaps. It has no orchestration. It's literally this thing where now we've unified the packaging situation—we've learned a lot from Red Hat, YUM, Debian, and the various package repo combinations out there, and so we made this universal thing. Great. We also learned a little bit about orchestration through brute force, bash scripts, config management, you name it, and so we serialized all of that into this thing we call Kubernetes.

It's pretty simple on the surface, but it was probably never worthy of such fanfare, right? But I think a lot of people were relieved that we finally commoditized the expertise that the Googles and the Facebooks of the world had, right—building these systems that can copy bits to other systems very fast. There you go. We've gotten that piece. But I think what the market actually wants is what the mobile space has: if you want to ship software to 300 million people that you don't even know, you can do it with the app store.

There's this appetite that the boring stuff should be easy. Let's Encrypt has made SSL certificates beyond easy. It's just so easy to do the right thing. And I think for this problem we call deployments—you know, shipping apps around—at some point we have to get to a point where that is just crazy easy.
And it still isn't. So, I think some of the frustration people express ten years later is that they're realizing they're trying to recreate a Rube Goldberg machine with Kubernetes as the base element, and we still haven't understood that this whole thing needs to simplify—not sprout ten thousand new pieces so you can build your own adventure.

Corey: It's the idea almost of what I'm seeing AWS go through, and to some extent its large competitors. But building anything on top of AWS from scratch these days is still reminiscent of going to Home Depot—or any hardware store—and walking up and down the aisles getting all the different components to piece together what you want. Sometimes you just want to buy something from Target that's already assembled, instead of having to do all of that work. I'm not saying there isn't value to having a Home Depot down the street, but it's also not the panacea that solves for all use cases. An awful lot of customers just want to get the job done, and I feel that if we cling too tightly to how things used to be, we lose it.

Kelsey: I'm going to tell you, having been in the cloud business for almost eight years: it's the customers that create this. Now, I'm not blaming the customer, but when you start dealing with thousands of customers with tons of money, you end up in a very different situation. You can have one customer willing to pay you a billion dollars a year, and they will dictate things that apply to no one else. “We want this particular set of features that only we will use.” And for a billion bucks a year, times ten years, it's probably worth it from a business standpoint to add that feature.

Now, do this times 500 customers at each major provider. What you end up with is a cloud console that is unbearable, right? Because they also want these things to be first-class citizens. There are always smaller companies trying to mimic larger peers in their segment, so you just end up in that chaos machine of unbound features forever. I don't know how to stop it. Unless you really come out maybe more Apple-style and you tell people, “This is the one and only true way to do things, and if you don't like it, you have to go find an alternative.” The cloud business, I think, still deals in the, “If you have a large payment, we will build it.”

Corey: I think that that is a perspective that is not appreciated until you've been in the position of watching how large enterprises really interact with each other. Because it's, “Well, what customer in the world is asking for yet another way to run containers?” “Uh, this specific one, and their constraints are valid.” Every time I think I've seen everything there is to see in the world of cloud, I just have to go talk to one more customer, and I learn something new. It's inevitable.

I just wish that there were a better way to explain some of this to newcomers when they're looking at, “Oh, I'm going to learn how this cloud thing works. Oh, my stars, look at how many services there are.” And then they wind up lost with analysis paralysis, and every time they get started and ask someone for help, they're pushed in a completely different direction, and they keep spinning their wheels, getting told to start over time and time again, when any of these things can be made to work. But getting there is often harder than it really should be.

Kelsey: Yeah. I mean, I think a lot of people don't realize how far you can get with, like, three VMs, a load balancer, and Postgres.
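(As an aside, here is a minimal sketch of what that stack amounts to, spelled out with boto3. Nothing in it comes from the conversation; every identifier—the AMI, subnets, names, and sizes—is a hypothetical placeholder, and a real setup would also need security groups, target-group registration, and proper secrets handling.)

```python
# A sketch of the "three VMs, a load balancer, and Postgres" stack with boto3.
# All identifiers below are hypothetical placeholders.
import boto3

ec2 = boto3.client("ec2")
elb = boto3.client("elbv2")
rds = boto3.client("rds")

# Three app-tier virtual machines.
ec2.run_instances(
    ImageId="ami-0123456789abcdef0",   # hypothetical AMI
    InstanceType="t3.small",
    MinCount=3,
    MaxCount=3,
)

# One load balancer in front of them (target registration omitted).
elb.create_load_balancer(
    Name="app-lb",
    Subnets=["subnet-aaaa1111", "subnet-bbbb2222"],  # hypothetical subnets
)

# Managed Postgres as the data tier.
rds.create_db_instance(
    DBInstanceIdentifier="app-db",
    Engine="postgres",
    DBInstanceClass="db.t3.micro",
    AllocatedStorage=20,
    MasterUsername="app",
    MasterUserPassword="change-me",    # use a secrets manager in practice
)
```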
My guess is you can probably build pretty much any clone of any service we use today, with at least 1 million customers, on that. Most people never reach that level—I don't even want to say the word scale—but that blueprint is there, and most people would probably be better served by that level of simplicity than by trying to mimic the behaviors of large companies with these elaborate use cases. I don't think they understand the context there. A lot of that stuff is baggage. It's not [laugh] even, like, best-of-breed or great design. It's happenstance from 20 years of trying to buy everything that's been sold to you.

Corey: I agree with that idea wholeheartedly. I was surprising someone the other day when I said that if you were to give me the task of getting some random application up and running by tomorrow, I'd do a traditional three-tier architecture: some virtual machines, a load balancer, and a database service. And is that the way that all the cool kids are doing it today? Well, they're not talking about it, but mostly. The point is that it's what I know, it's where my background is, and the thing you already know when you're trying to solve a new problem is incredibly helpful, rather than trying to learn everything along that new path you're forging down. Is that architecture the best approach? No, but it's perfectly sufficient for an awful lot of stuff.

Kelsey: Yeah. And so, I mean, look, I've benefited my whole career from people fantasizing about [laugh] infrastructure—

Corey: [laugh].

Kelsey: And the truth is that in 2023, this stuff is so powerful that you can do almost anything you want to do with the simplest architecture that's available to us. The three-tier architecture has actually gotten better over the years. I think people have forgotten: CPUs are faster, RAM comes in much bigger quantities, the networks are faster, right, these databases can store more data than ever. It's so good to learn the fundamentals, start there, and worst case, you have a sound architecture people can reason about, and then you can go jump into the deep end once you learn how to swim.

Corey: I think that people would be depressed to understand just how much of the common case for the value that Kubernetes brings is, “Oh yeah, now we can lose a drive or a server and the application stays up.” It feels like it's a bit overkill for that one somewhat paltry use case, but that problem has been hounding companies for decades.

Kelsey: Yeah, I think at some point, the whole ‘SSH is my only interface into these kinds of systems' thing—that's a little low-level, that's a little bare-bones. And there will probably be a future where we start to have—not Infrastructure as Code, not cloud where we put infrastructure behind APIs and you pay per use—but, I think, what Kubernetes hints at: a future where you have APIs that do something. Right now the APIs give you pieces so you can assemble things. In the future, the APIs will just do something: “Run this app. I need it to be available, and here's my money budget, my security budget, and my reliability budget.” And then that thing will say, “Okay, we know how to do that, and here's roughly what it's going to cost.”

And I think that's what people actually want, because that's how requests actually come down from humans, right? We say, “We want this app or this game to be played by millions of people from Australia to New York.” And then for a person with experience, that means something.
You kind of know what architecture you need for that, you know what pieces need to go there. So, we're just moving into a realm where we're going to have APIs that do things all of a sudden.

And so, Kubernetes is the warm-up to that era. And that's why I think that transition is a little rough: it leaks the pieces part, where you can kind of build all the pieces that you want. But we know what's coming. Serverless also hints at this. But that's what people should be looking for: APIs that actually do something.

Corey: This episode is sponsored in part by Panoptica. Panoptica simplifies container deployment, monitoring, and security, protecting the entire application stack from build to runtime. Scalable across clusters and multi-cloud environments, Panoptica secures containers, serverless APIs, and Kubernetes with a unified view, reducing operational complexity and promoting collaboration by integrating with commonly used developer, SRE, and SecOps tools. Panoptica ensures compliance with regulatory mandates and CIS benchmarks for best practice conformity. Privacy teams can monitor API traffic and identify sensitive data, while identifying open-source components vulnerable to attacks that require patching. Proactively addressing security issues with Panoptica allows businesses to focus on mitigating critical risks and protecting their interests. Learn more about Panoptica today at panoptica.app.

Corey: You started the show by talking about how your career began with translating COBOL into Python. I firmly believe someone starting their career today listening to this could absolutely find, by the time their career starts drawing to its own close, that Kubernetes is right in there as far as sounding like the deprecated thing that no one really talks about or thinks about anymore. And I hope so. I want the future to be brighter than the past. I want getting a business or getting software together in a way that helps people to not require the amount of, “First, spend six weeks at a boot camp,” or, “Learn how to write just enough code that you can wind up getting funding and then have it torn apart.”

What's the drag-and-drop story? What's the describe-the-application-to-a-robot-and-it-builds-it-for-you story? I'm optimistic about the future of infrastructure, just based upon its power to potentially make reliability and scale available to folks who have no idea what's involved with that. That's kind of the point. That's the end game of having won this space.

Kelsey: Well, you know what? Kubernetes is providing the metadata to make that possible, right? Like, in the early days, people were writing one-off scripts or, you know, writing little for loops to get things in the right place. And then we got config management that kind of formalized that, but it still had no metadata, right? You'd have things like Puppet report information.

But in the world of, like, Kubernetes, or any cloud provider, now you get semantic meaning: “This app needs this volume with this much space, with this much memory; I need three of these behind this load balancer with these protocols enabled.” There is now so much metadata about applications, their life cycles, and how they work that if you were to design a new system, you could actually use that data to craft a much better API that makes a lot of this boilerplate the default. Oh, that's a web application? You do not need to specify all of this boilerplate.
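(To make that metadata concrete—an illustrative sketch, not anything from the conversation—here is roughly what “three of these behind this load balancer, with this much memory” looks like when expressed through the official Kubernetes Python client. The image and names are hypothetical.)

```python
# Declarative application metadata: replicas, memory, and a load balancer,
# expressed with the official Kubernetes Python client.
from kubernetes import client, config

config.load_kube_config()

labels = {"app": "web"}

deployment = client.V1Deployment(
    metadata=client.V1ObjectMeta(name="web"),
    spec=client.V1DeploymentSpec(
        replicas=3,  # "I need three of these"
        selector=client.V1LabelSelector(match_labels=labels),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels=labels),
            spec=client.V1PodSpec(
                containers=[
                    client.V1Container(
                        name="web",
                        image="example/web:1.0",  # hypothetical image
                        resources=client.V1ResourceRequirements(
                            requests={"memory": "256Mi"}  # "this much memory"
                        ),
                    )
                ]
            ),
        ),
    ),
)

service = client.V1Service(
    metadata=client.V1ObjectMeta(name="web"),
    spec=client.V1ServiceSpec(
        type="LoadBalancer",  # "behind this load balancer"
        selector=labels,
        ports=[client.V1ServicePort(port=80, target_port=80)],
    ),
)

client.AppsV1Api().create_namespaced_deployment("default", deployment)
client.CoreV1Api().create_namespaced_service("default", service)
```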
Now, we can give you much better nouns and verbs to describe what needs to happen.

So, I think this is that transition: all the new people coming up are going to be dealing with semantic meaning to infrastructure, where we were dealing with, like, tribal knowledge and intuition, right? “Run this script, pipe it to this thing, and then this should happen. And if it doesn't, run the script again with this flag.” Versus, “Oh, here's the semantic meaning of a working system.” That's a game-changer.

Corey: One other topic I wanted to ask you about—it's been on my list of things to bring up the next time I ran into you, and then you went ahead and retired, making it harder to run into you. A little while back, I was at a tech conference and someone gave a demo, and it didn't go as well as they had hoped. And a few of us were talking about it afterwards. We've all been speakers; we've all lived that life. Zero shade.

But someone brought you up in particular—unprompted; your legend does precede you—and the phrase that they used was that Kelsey's demos were always picture-perfect. He was so lucky with how the demos worked out. And I just have to ask—because you don't strike me as someone who is not careful, particularly when all eyes are upon you, and real experts make things look easy—did you have demos periodically go wrong that the audience just didn't see going wrong along the way? Or did you actually YOLO all of your demos and get super lucky every single time for the last eight years?

Kelsey: There was a musician who said, “Hey, your demos are like jazz. You improvise the whole thing.” There's no script, there's no video. The way I look at the demo is, like, you've got these instruments: the command prompt and the web browser. You can do whatever you want with them.

Now, I have working code. I wrote the code, I wrote the deployment scenarios, I delete it all and I put it all back. And so, I know how it's supposed to work from the ground up. And what that means is, if anything goes wrong, I can improvise. I can go into fixing the code. I can go into doing a redeploy.

And I'll give you one good example. The first time Kubernetes came out, there was this small meetup in San Francisco with just the core contributors, right? So, there was no community yet, there was no conference yet, just people hacking on Kubernetes. And so, we decided we were going to have the first Kubernetes meetup. And everyone got, like, six, seven minutes, max. That's it. You've got to move.

And so, I was like, “Hey, I noticed that in the lineup, there is no ‘What is Kubernetes?' talk. We're just getting into these nuts and bolts, and I don't think that's fair to the people that will be watching this for the first time.” And I said, “All right, Kelsey, you should give maybe an intro to what it is.” I was like, “You know what I'll do? I'm going to build a Kubernetes cluster from the ground up, starting with VMs on my laptop.”

And I'm in it and I'm feeling confident. So, confidence is the part that makes it look good, right? Where you're confident in the commands you type. One thing I learned to do is just use your history—just hit the up arrow instead of trying to copy all these things out. So, you hit the up arrow, you find the right command, and you talk through it, and no one looks at what's happening. You're cycling through the history.

Or you have multiple tabs where you know the next up arrow is the right history. So, you give yourself shortcuts. And so, I'm halfway through this demo.
We've got three minutes left, and it doesn't work. Like, VMware is doing something weird on my laptop, and there's a guy calling me off stage, like, “Hey, that's it. Cut it now. You're done.”

I'm like, “Oh, nope. Thou shalt not go out like this.” It's time to improvise. And so, I said, “Hey, who wants to see me finish this?” And now everyone is locked in. It's dead silent. And I blow the whole thing away. I bring up the VMs, I [pixie 00:28:20] boot, I install the kubelet, I install Docker. And everyone's clapping. And it's up, it's going, and I say, “Now, if all of this works, we run this command and it should start running the app.” And I do kubectl apply -f, and it comes up, and the place goes crazy.

And I had more to the demo. But you stop. You've gotten the point across, right? This is what Kubernetes is, here's how it works, and look how you do it from scratch. And I remember saying, “And that's the end of my presentation.” You need to know when to stop, you need to know when to pivot, and you need to have confidence that it's supposed to work—and if you've seen it work a couple of times, your confidence is unshaken.

And when I walked off that stage, I remember someone from Red Hat—Clayton Coleman; that's his name—Clayton Coleman walked up to me and said, “You planned that. You planned it to fail just like that, so you could show people how to go from scratch all the way up. That was brilliant.” And I was like, “Sure. That's exactly what I did.”

Corey: “Yeah, I meant to do that.” I like that approach. I've found there are always things I have to plan for in demos. For example, I can never count on having solid WiFi in a conference hall. The show has to go on. It's, okay, the WiFi doesn't work. I've at one point had to give a talk to a bunch of students where the projector just wasn't working. So okay, close the laptop. We're turning this into a question-and-answer session, and it was one of the better talks I've ever given.

But the alternative is getting stuck in how you think a talk absolutely needs to go. Now, keynotes are a little harder, where everything has been scripted and choreographed, and at that point, I've had multiple fallbacks for demos that I've had to switch between. And people never noticed I was doing it, for that exact reason. But it takes work to look polished.

Kelsey: I will tell you that the last Next keynote I gave was completely irresponsible. No dry runs, no rehearsals, no table reads, no speaker notes. And I think there were 30,000 people at that particular Next. And Diane Greene was still CEO, and I remember marketing was like, “Yo, at least make a backup recording.” I was like, “Nah, I don't have anything.”

And that demo was extensive. I mean, I was building an app from scratch: starting with Postgres, adding the schema, building the app, deploying the app. And something went wrong halfway through. They had given me a new Chromebook to do the demo, so it wasn't mine, none of the default settings were there, and I was getting pop-ups all over the place.

And I had come up with this joke on the way to the conference, just to pass the time. I was like, “You know what'd be cool? When I show off the serverless stuff, I'll just copy the code from Stack Overflow.
That'd be like a really cool joke, to say this is what senior engineers do.” And I go to Stack Overflow, and it's getting all of these pop-ups, and my mouse couldn't highlight the text.

So, I'm sitting there like a deer in headlights in front of all of these people, and I'm looking down, and marketing is, like, “This is what… this is what we're talking about.” And so, I'm like, “Man, do I have to end this thing here?” And I remember I kept trying, I kept trying, and then it came to me. Once the mouse finally got in there and I cleared up all the pop-ups, I went for the joke. I said, “Good developers copy.” And I switched over to my terminal, took the text from Stack Overflow, and said, “Great developers paste,” and the whole room started laughing.

And I had them back. And we kept going and continued. And at the end, there was this Google Assistant, and when it was finished, I said, “Thank you,” to the Google Assistant, and it talked back through the live system. It said, “I've got to admit, that was kind of dope.” So, I go to the back, and Diane Greene walks back there—the CEO of Google Cloud—and she pats me on the shoulder. “Kelsey, that was dope.”

But it was the thrill—I had as much thrill as the people watching it. So, in real time, I was going through all these emotions. But I think people forget: the demo is supposed to convey something. The demo is supposed to tell some story. And I've seen people overdo their demos with way too much code, way too many commands, almost as if they're trying to show off their expertise versus telling a story. And so, when I think about the demo, it has to complement the entire narrative. Sometimes you don't need as many commands, you don't need as much code. You can keep things simple, and that gives you a lot more ins and outs in case something does go crazy.

Corey: And I think the key takeaway here that so many people lose sight of is that you have to know the material well enough that whatever happens—well, things don't always go the way I planned during the day, either, and talking through that is something that I think serves as a good example. It feels like a bit more of a challenge when you're trying to demo something that a company is trying to sell someone: “Oh, yeah, it didn't work. But that's okay.” But I'm still reminded of probably one of the best conference demo fails I've ever seen on video. One day, someone was attempting to do a talk that hit Amazon S3, and it didn't work.

And the audience started shouting at him that, yeah, S3 is down right now. Because that was the big day that S3 took a nap for four hours. It was one of those foundational things you'd never stop to consider. Like, well, what if the internet doesn't work tomorrow when I'm doing my demo? That's a tough one to work around. But rough timing.

Kelsey: [breathy sound]

Corey: He nailed the rest of the talk, though. You keep going. That's the thing that people miss. They get stuck in the demo that isn't working; they expect the audience knows as much as they do about what's supposed to happen next. You're the one up there telling a story. People forget it's storytelling.

Kelsey: Now, I would be remiss not to say: I know that the demo gods have been on my side for, like, ten, maybe fifteen years solid. So, I retired from doing live demos. This is why I just don't do them anymore. I know I'm overdue, as an understatement.
But the thing I've learned, though, is that what I find more impressive than the live demo is being able to convey the same narratives through story alone. No slides. No demo. Nothing. But you can still make people feel where you would have tried to go with that live demo.

And it's insanely hard, especially for technologies people have never seen before. But that's the new challenge I've set up for myself. So, if you've seen me at a keynote and noticed I've been choosing these fireside chats, it's mainly because I'm also trying to increase my ability to share narrative, technical concepts, but now in a new form. This new storytelling format through the fireside chat has been my substitute for the live demo, mainly because I think, unless there's something really new to show—something people haven't seen before—the live demo isn't as powerful to me. Once the thing is kind of known… the live demo is kind of more of the same. So, I think live demos really work well when people literally have never seen the thing before, but outside of that, I think you can move on to, like, real-life scenarios and narratives that help people understand the fundamentals and the philosophy behind the tech.

Corey: An awful lot of the tools and tech that we use on a day-to-day basis are, thankfully, optimized for the people using them and the ergonomics of going about your day. That is orthogonal, in my experience, to looking very impressive on stage. It's the rare company that can have a product that not only works well but also presents well. And that is something I don't tend to index on when I'm selecting a tool to do something with. So, it's always a question of, how can I make this more visually entertaining? For a while, I got out of doing demos entirely, just because I'd rather talk about things that have more staying power than a screenshot that is going to wind up being irrelevant the next week when they decide to redo the console for some service yet again.

Kelsey: But you know what? That was my secret to doing software products and projects. When I was at CoreOS, we used to have these meetups every two weeks or so. So, when we were building things like etcd and Fleet—a container management platform that came before Kubernetes—we would always run through them as a user: install them, use them, and ask, how does it feel? These command-line flags, they don't feel right. This isn't a narrative you can present with the software alone.

But once we could, the meetups were that much more engaging. Like, hey, have you ever tried to distribute configuration to, like, a thousand servers? It's insanely hard. Here's how you do it with Puppet. But now I'm going to show you how you do it with etcd. And then the narrative would kind of take care of itself, because the tool was positioned behind what people would actually do with it versus what the tool could do by itself.

Corey: I think that's the missing piece that most marketing doesn't seem to quite grasp: they talk about the tool and how awesome it is. That's why I love customer demos so much. They're showing us how they use a tool to solve a real-world problem.
And honestly, from my snarky side of the world and the attendant perspective there, I can make an awful lot of fun of basically anything a company decides to show me, but put a customer on stage talking about how whatever they've built is solving a real-world problem for them, and that's the point where I generally shut up and listen, because I'm going to learn something from a real-world story. You don't generally get to tell customers to go on stage and just make up a story that makes us sound good and have it come off with any sense of reality whatsoever. I haven't seen that one happen yet, but I'm sure it's out there somewhere.

Kelsey: I don't know how many founders or people building companies listen to your podcast, but this is, right now, I think, the number one problem that venture-backed startups especially have. They tend to have great technology—maybe it's based off some open-source project—with tons of users who just know how that tool works; it's just an ingredient in what they're already trying to do. But that isn't ever going to be your entire customer base. Soon, you'll deal with customers who don't understand the thing you have, and they need more than technology, right? They need a product.

And most of these companies struggle to paint that picture: here's what you can do with it. Or, here's what you can't do now but will be able to do if you were to use this. And since they're missing that, a lot of these companies produce a lot of code, ship a lot of open-source stuff, raise a lot of capital, and then it just goes away; it fades out over time because they can't bring on newcomers. The people who need help the most don't have a narrative crafted for them, so these companies are left hoping for the people who have all the skills in the world—the early adopters. But unfortunately, those people tend to be the ones that don't actually pay. They just kind of do it themselves. It's the people who need the most help.

Corey: How do we monetize the bleeding edge of adoption? In many cases, you don't. They become your community, if you don't hug them to death first.

Kelsey: Exactly.

Corey: Ugh. None of this is easy. I really want to thank you for taking the time to catch up and talk about how you see the remains of a career well spent, now that you're going off into that glorious sunset. But I have a sneaking suspicion you'll still be around. Where should people go if they want to follow up on what you're up to these days?

Kelsey: Right now I still use… I'm going to keep calling it Twitter.

Corey: I agree.

Kelsey: I kind of use that for my real-time interactions. And I'm still attending conferences, doing fireside chats, and just meeting people on those conference floors. That's where I'll be for now. So yeah, I'll still be around, but maybe not as deep. And I'll be spending more time just doing normal life stuff, maybe less building software.

Corey: And we will, of course, put a link to that in the show notes. Thank you so much for taking the time to catch up and share your reflections on how the industry is progressing.

Kelsey: Awesome. Thanks for having me, Corey.

Corey: Kelsey Hightower, now gloriously retired. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud.
If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice along with an angry comment that you're going to type on stage as part of a conference talk, and then accidentally typo all over yourself while you're doing it.

Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.
The story of an open-source hero who became a villain. Special Guest: Alex Kretzschmar.
An airhacks.fm conversation with Kelsey Hightower (@kelseyhightower) about: HP laptop and playing Age of Empires, programming calculators with TI-BASIC, playing Mario on NES, enjoying Metroid on NES, working at a Google datacenter as a contractor, bash is a programming language, working for a financial institution, modernising COBOL with Java, rewriting COBOL to Python, learning Java and using JBoss, contributing to Python to make it better, venv (virtualenv) and pypy, using Puppet for configuration management, Python vs. Ruby, overengineering with Java, Java is lean now, creating the confd project, envsubst and Java, Cost Driven Architectures in the clouds, replacing Java with Go, starting at CoreOS, etcd as coordinator, implementation of RAFT, RAFT and cluster membership, contributing to Packr and Terraform, docker is written in Go, RAFT is understandable Paxos, RAFT did not consider bootstrapping, Apache ZooKeeper is used for coordination, Apache BookKeeper, CoreOS fleet, rkt vs. docker, Salt configuration management, the kubernetes pod, the status field in kubernetes, Google Service Weaver, Google App Engine, check out episode: "#153 Java, Serverless, Google App Engine, gVisor, Kubernetes", writing modular code is important, monoliths and microservices, rust is leaking details, Kubernetes The Hard Way the step-by-step guide, Kubernetes Autopilot Kelsey Hightower on twitter: @kelseyhightower
About Kelsey
Kelsey Hightower is the Principal Developer Advocate at Google, the co-chair of KubeCon, the world's premier Kubernetes conference, and an open source enthusiast. He's also the co-author of Kubernetes Up & Running: Dive into the Future of Infrastructure.
Links: Twitter: @kelseyhightower Company site: Google.com Book: Kubernetes Up & Running: Dive into the Future of Infrastructure
Transcript
Announcer: Hello and welcome to Screaming in the Cloud, with your host, Cloud Economist Corey Quinn. This weekly show features conversations with people doing interesting work in the world of Cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.

Corey: This episode is brought to us by our friends at Pinecone. They believe that all anyone really wants is to be understood, and that includes your users. AI models combined with the Pinecone vector database let your applications understand and act on what your users want… without making them spell it out. Make your search application find results by meaning instead of just keywords, your personalization system make picks based on relevance instead of just tags, and your security applications match threats by resemblance instead of just regular expressions. Pinecone provides the cloud infrastructure that makes this easy, fast, and scalable. Thanks to my friends at Pinecone for sponsoring this episode. Visit Pinecone.io to understand more.

Corey: Welcome to Screaming in the Cloud, I'm Corey Quinn. I'm joined this week by Kelsey Hightower, who claims to be a Principal Developer Advocate at Google, but based upon various keynotes I've seen him in, he basically gets on stage and plays video games like Tetris in front of large audiences. So I assume he is somehow involved with e-sports. Kelsey, welcome to the show.

Kelsey: You've outed me. Most people didn't know that I am a full-time e-sports Tetris champion at home. And the technology thing is just a side gig.

Corey: Exactly. It's one of those things you do just to keep the lights on, like you're waiting to get discovered, but in the meantime, you're waiting tables. Same type of thing. Some people wait tables; you more or less sling Kubernetes, for lack of a better term.

Kelsey: Yes.

Corey: So let's dive right into this. You've been a strong proponent for a long time of Kubernetes and all of its intricacies and all the power that it unlocks, and I've been pretty much the exact opposite of that, as far as saying it tends to be overcomplicated, that it's hype-driven, and a whole bunch of other, shall we say, criticisms that are sometimes grounded in reality and sometimes just because I think it'll be funny when I put them on Twitter. Where do you stand on the state of Kubernetes in 2020?

Kelsey: So I want to make sure it's clear what I do. Because when I started talking about Kubernetes, I was not working at Google. I was actually working at CoreOS, where we had a competitor to Kubernetes called Fleet. And Kubernetes coming out kind of put this fork in our roadmap: like, where do we go from here? What people saw me doing with Kubernetes was basically learning in public. I was really excited about the technology because it's attempting to solve a very complex thing. I think most people will agree building a distributed system is what cloud providers typically do, right? With VMs and hypervisors. Those are very big, complex distributed systems.
And before Kubernetes came out, the closest I'd gotten to a distributed system before working at CoreOS was just reading the various white papers on the subject and hearing stories about how Google had systems like Borg, and how tools like Mesos were being used by some of the largest hyperscalers in the world, but I was never going to have the chance to touch one of those unless I went to work at one of those companies.

So when Kubernetes came out, the fact that it was open source meant I could read the code to understand how it was implemented, to understand how schedulers actually work—and then bonus points for being able to contribute to it. Those early years, what you saw me doing was just being so excited about systems that I'd attempted to build on my own becoming this new thing, just like when Linux came up. So I kind of agree with you that a lot of people look at it as more of a hype thing. They're looking at it regardless of their own needs, regardless of understanding how it works and what problems it's trying to solve. My stance on it: it's a really, really cool tool for the level that it operates at, and in order for it to be successful, people can't know that it's there.

Corey: And I think that might be where part of my disconnect from Kubernetes comes into play. I have a background in ops—more or less, the grumpy Unix sysadmin, because it's not like there's a second kind of Unix sysadmin you're ever going to encounter—where everything in development works in theory, but in practice things pan out a little differently. I always joke that ops is the difference between theory and practice. In theory, devs can do everything and there's no ops needed. In practice, well, it's been a burgeoning career for a while. The challenge with this is that Kubernetes at times exposes certain levels of—sorry, certain levels of detail that generally people would not want to have to think about or deal with, while papering over other things with further layers of abstraction on top. That obscures valuable troubleshooting information from running something in an operational context. It absolutely is a fascinating piece of technology, but it feels today like it is overly complicated for the uses a lot of people are attempting to put it to. Is that a fair criticism from where you sit?

Kelsey: So I think the reason why it's a fair criticism is because there are people attempting to run their own Kubernetes cluster, right? So when we think about the cloud—unless you're in OpenStack land—for the people who look at the cloud, you say, "Wow, this is much easier." There's an API for creating virtual machines, and I don't see the distributed state store that's keeping all of that together. I don't see the farm of hypervisors. So we don't necessarily think about the inherent complexity of a system like that, because we just get to use it. So on one end, if you're just a user of a Kubernetes cluster, maybe using something fully managed, or you have an ops team that's taking care of everything, your interface to the system becomes this Kubernetes configuration language where you say, "Give me a load balancer, give me three copies of this container running." And if we do it well, then you'd think it's a fairly easy system to deal with, because you say, "kubectl apply," and things seem to start running.

Just like in the cloud where you say, "AWS, create this VM," or "gcloud compute instances create." You just submit API calls and things happen.
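(What sits behind those API calls is, at its core, a reconciliation loop: controllers continuously compare desired state against actual state and converge the two. Here is a toy sketch of that idea—stand-in functions and in-memory state only, nothing resembling the real controller code.)

```python
# A toy reconciliation loop: the desired-vs-actual convergence pattern
# behind Kubernetes controllers. All state and actions are stand-ins.
import time

desired = {"web": 3}   # replicas we asked for, per app
actual = {"web": 1}    # replicas currently running

def start_replica(app: str) -> None:
    # Stand-in for scheduling a new copy onto a node.
    actual[app] = actual.get(app, 0) + 1

def stop_replica(app: str) -> None:
    # Stand-in for tearing a copy down.
    actual[app] -= 1

while True:
    for app, want in desired.items():
        have = actual.get(app, 0)
        if have < want:
            start_replica(app)   # converge upward
        elif have > want:
            stop_replica(app)    # converge downward
    time.sleep(1)  # real controllers watch the API server instead of polling
```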
I think the fact that Kubernetes is very transparent to most people means that now you can see the complexity, right? Imagine everyone driving with the hood off the car. You'd be looking at a lot of moving things, but we have hoods on cars to hide the complexity, and all we expose is the steering wheel and the pedals. That car is super complex but we don't see it. So therefore we don't attribute that complexity to the driving experience.Corey: This to some extent feels like it's on the same axis as serverless, with just a different level of abstraction piled onto it. And while I am a large proponent of serverless, and I think it's fantastic for a lot of greenfield projects, the constraints inherent to the model mean that it is almost completely untenable for a tremendous number of existing workloads. Some developers like to call it legacy, but when I hear the term legacy I hear, "it makes actual money." So just treating it as, "Oh, it's a science experiment we can throw into a new environment, spend a bunch of time rewriting it for minimal gains," is just not going to happen as companies undergo digital transformations, if you'll pardon the term.Kelsey: Yeah, so I think you're right. So let's take Amazon's Lambda for example. It's a very opinionated high-level platform that assumes you're going to build apps a certain way. And if that's you, look, go for it. Now, one or two levels below that there is this distributed system. Kubernetes decided to play in that space because everyone that's building other platforms needs a place to start. The analogy I like to think of is the mobile space: iOS and Android deal with the complexities of managing multiple applications on a mobile device, security aspects, app stores, that kind of thing. And then you as a developer build your thing on top of those platforms and APIs and frameworks. Now, it's debatable; someone would say, "Why do we even need an open-source implementation of such a complex system? Why not just have everyone move to the cloud?" And then everyone that's not in a cloud, on-premises, gets left behind.But that's not how open source typically works, right? The reason why we have Linux, the precursor to the cloud, is because someone looked at the big proprietary Unix systems and decided to re-implement them in a way that anyone could run those systems. So when you look at Kubernetes, you have to look at it from that lens. It's the ability to democratize these platform layers in a way that other people can innovate on top of. That doesn't necessarily mean that everyone needs to start with Kubernetes, just like not everyone needs to start with a Linux server, but it's there for you to build the next thing on top of, if that's the route you want to go.Corey: It's been almost a year now since I made an original tweet about this, that in five years, no one will care about Kubernetes. So now I guess I have four years running on that clock, and that attracted a bit of, shall we say, controversy. There were people who thought that I meant that it was going to be a flash in the pan and it would dry up and blow away. But my impression of it is that in, well, four years now, it will have become more or less systemd for the data center, in that there's a bunch of complexity under the hood. It does a bunch of things. No one sensible wants to spend all their time mucking around with it in most companies.
But it's not something that people have to think about on an ongoing basis the way it feels like we do today.Kelsey: Yeah, I mean to me, I kind of see this as the natural evolution, right? It's new, it gets a lot of attention, and kind of the assumption you make in that statement is there's something better that should be able to arise, given that checkpoint. If this is what people think is hot, within five years surely we should see something else that can be deserving of that attention, right? Docker comes out and almost four or five years later you have Kubernetes. So it's obvious that there should be a progression here that steals some of the attention away from Kubernetes, but the thing is, it's so new, right? It's only five years in. Linux is over 20 years old now at this point, and it's still top of mind for a lot of people, right? Microsoft is still porting a lot of Windows-only things into Linux, so we still discuss the differences between Windows and Linux.The cloud, for the most part, is driven by Linux virtual machines; I think the majority of workloads still run on virtual machines to this day, so it's still front and center, especially if you're a system administrator managing VMs, right? You're dealing with tools that target Linux, you know the sysctl interface, and you're thinking about how to secure it and lock it down. Kubernetes is just at the very first part of that life cycle where it's new. We're all interested in even what it is and how it works, and now we're starting to move into that next phase, which is the distro phase. Like in Linux, you had Red Hat, Slackware, Ubuntu, special-purpose distros.Some will consider Android a special-purpose distribution of Linux for mobile devices. And now that we're in this distro phase, that's going to go on for another 5 to 10 years where people start to align themselves around, maybe it's OpenShift, maybe it's GKE, maybe it's Fargate for EKS. These are now distributions built on top of Kubernetes that start to add a little bit more opinionation about how Kubernetes should be put together. And then we'll enter another phase where you'll build a platform on top of Kubernetes, but it won't be worth mentioning that Kubernetes is underneath, because people will be more interested in the thing above.Corey: I think we're already seeing that now, in terms of people no longer really caring that much about what operating system they're running, let alone which distribution of that operating system. The things that you have to care about slip below the surface of awareness, and we've seen this for a long time now. Originally, to install a web server, it wound up taking a few days and an intimate knowledge of GCC compiler flags, then RPM or dpkg and then yum on top of that, then an "ensure installed" once we had configuration management that was halfway decent.Then docker run, whatever it is. And today, with serverless technologies being what they are, it's effectively push a file to S3 or its equivalent somewhere else and you're done. The things that people have to be aware of and the barrier to entry continually lower. The downside to that, of course, is that the things that people specialize in today and effectively make very lucrative careers out of are not going to be front and center in 5 to 10 years the way that they are today. And that's always been the way of technology.
It's a treadmill to some extent.Kelsey: And on the flip side of that, look at all of the new jobs that are centered around these cloud-native technologies, right? So you know, we're just going to make up some numbers here: imagine if there were only 10,000 jobs around just Linux system administration. Now look at this whole Kubernetes landscape, where people are saying we can actually do a better job with metrics and monitoring. Observability is now a thing culturally that people assume you should have, because you're dealing with these distributed systems. The ability to start thinking about multi-regional deployments, when I think that would've been infeasible with the previous tools, or you'd have had to build all those tools yourself. So I think now we're starting to see a lot more opportunities, where instead of 10,000 people, maybe you need 20,000 people, because now you have the tools necessary to tackle bigger projects where you didn't see that before.Corey: That's what's going to be really neat to see. But the challenge is always for people who are steeped in existing technologies: what does this mean for them? I mean, I spent a lot of time early in my career fighting against cloud because I thought that it was taking away a cornerstone of my identity. I was a large-scale Unix administrator, specifically focusing on email. Well, it turns out that there aren't nearly as many companies that need to have that particular skill set in house as there were 10 years ago. And what we're seeing now is this sort of forced evolution of people's skill sets, or they hunker down on a particular area of technology or particular application to try and make a bet that they can ride that out until retirement. It's challenging, but at some point it seems that some folks like to stop learning, and I don't fully pretend to understand that. I'm sure I will someday, where, "No, at this point technology has come far enough. We're just going to stop here, and anything after this is garbage." I hope not, but I can see a world in which that happens.Kelsey: Yeah, and I also think one thing that we don't talk a lot about in the Kubernetes community is that Kubernetes makes hyper-specialization worth doing, because now you start to have a clear separation of concerns. Now the OS can be hyper-focused on security and system calls and not necessarily packaging every programming language under the sun into a single distribution. So we can kind of move part of that layer out of the core OS and start to just think about the OS being a security boundary where we try to lock things down. And for some people that play at that layer, they have a lot of work ahead of them in locking down these system calls, improving the idea of containerization, whether that's something like Firecracker or some of the work that you see VMware doing. That's going to be a whole class of hyper-specialization. And the reason why they're going to be able to focus now is because we're starting to move into a world, whether that's serverless or the Kubernetes API, where we're saying we should deploy applications that don't target machines. I mean, just that step alone is going to allow for so much specialization at the various layers, because even the networking front, which arguably has been a specialization up until this point, can truly specialize, because now IP assignments and how networking fits together have been abstracted away one more step, where you're not asking for interfaces or binding to a specific port or playing with port mappings.
You can now let the platform do that. So I think for some of the people who may not be as interested in moving up the stack, they need to be aware that the number of people we need being hyper-specialized at Linux administration will definitely shrink. And a lot of that work will move up the stack, whether that's Kubernetes or managing a serverless deployment and all the configuration that goes with that. But if Linux is your bread and butter, I think there's going to be an opportunity to go super deep, but you may have to expand into things like security and not just things like configuration management.Corey: Let's call it the unfulfilled promise of Kubernetes. On paper, I love what it hints at being possible. Namely, if I build something that runs well on top of Kubernetes, then we truly have a write once, run anywhere type of environment. Stop me if you've heard that one before, 50,000 times in our industry... or history. But in practice, as has happened before, it seems like it tends to fall down for one reason or another. Now, Amazon is famous for many reasons, but the one that I like to pick on them for is, you can't say the word multi-cloud at their events. Right. That'll change people's perspective, good job. People tend to see multi-cloud through a couple of different lenses.I've been rather anti-multi-cloud, from the perspective that setting out day one to build an application with the idea that it can be run on top of any cloud provider, or even on-premises if that's what you want to do, is generally not the way to proceed. You wind up having to make certain trade-offs along the way, you have to rebuild anything that isn't consistent between those providers, and it slows you down. Kubernetes, on the other hand, hints that if it works and fulfills this promise, you can suddenly abstract an awful lot beyond that and just write generic applications that can run anywhere. Where do you stand on the whole multi-cloud topic?Kelsey: So I think we have to make sure we talk about the different layers that are kind of ready for this thing. So for example, multi-cloud networking, we just call that networking, right? What's the IP address over there? I can just hit it. So we don't make a big deal about multi-cloud networking. Now there's an area where people say, how do I configure the various cloud providers? And I think the healthy way to think about this is like your own data centers, right? So we know a lot of people have investments on-premises. Now, if you were to take the mindset that you only need one provider, then you would try to buy everything from HP, right? You would buy HP storage devices, you'd buy HP racks, power. Maybe HP doesn't sell air conditioners. So you're going to have to buy an air conditioner from a vendor who specializes in making air conditioners, hopefully for a data center and not your house.So now you've entered this world where one vendor doesn't make every single piece that you need. Now in the data center, we don't say, "Oh, I am multi-vendor in my data center." Typically, you just buy the switches that you need, you buy the power racks that you need, you buy the ethernet cables that you need, and they have common interfaces that allow them to connect together, and they typically have different configuration languages and methods for configuring those components. The cloud, on the other hand, represents the same kind of opportunity.
There are some people who really love DynamoDB and S3, but then they may prefer something like BigQuery to analyze the data that they're uploading into S3. Now, if this were a data center, you would just buy all three of those things and put them in the same rack and call it good.But the cloud presents this other challenge. How do you authenticate to those systems? And then there's usually this additional networking cost, egress or ingress charges, that makes it prohibitive to say, "I want to use two different products from two different vendors." And I think that's-Corey: ...winds up causing serious problems.Kelsey: Yes, so that data gravity, the associated cost, becomes a little bit more in your face. Whereas in a data center you kind of feel that the cost has already been paid. I already have a network switch with enough bandwidth, I have an extra port on my switch to plug this thing in, and they're all standard interfaces. Why not? So I think the multi-cloud conversation gets lost in this problem, which is the barrier to entry of leveraging things across two different providers because of networking and configuration practices.Corey: That's often the challenge, I think, that people get bogged down in. On an earlier episode of this show we had Mitchell Hashimoto on, and his entire theory around using Terraform to wind up configuring various bits of infrastructure was not the idea of workload portability, because that feels like the windmill we all keep tilting at and failing to hit, but instead the idea of workflow portability, where different things can wind up being interacted with in the same way. So if this one division is on one cloud provider and the others are on something else, then you at least can have some points of consistency in how you interact with those things. And in the event that you do need to move, you don't have to effectively redo all of your CI/CD process, all of your tooling, et cetera. And I thought that there was something compelling about that argument.Kelsey: And that's actually what Kubernetes does for a lot of people. For Kubernetes, if you think about it, when we start to talk about workflow consistency: if you want to deploy an application, you kubectl apply some config, and you want the application to have a load balancer in front of it, regardless of the cloud provider, because Kubernetes has an extension point we call the cloud provider. And that's where, for Amazon, Azure, Google Cloud, we do all the heavy lifting of mapping the high-level ingress object that specifies, "I want a load balancer, maybe a few options," to the actual implementation detail. So maybe you don't have to use four or five different tools, and that's where that kind of workflow portability comes from. Like if you think about Linux, right? It has a set of system calls, for the most part, even if you're using a different distro at this point, Red Hat or Amazon Linux or Google's container-optimized Linux.If I build a Go binary on my laptop, I can SCP it to any of those Linux machines and it's probably going to run. So you could call that multi-cloud, but that doesn't make a lot of sense, because it's just the way Linux works. Kubernetes does something very similar because it sits right on top of Linux, so you get the portability just from the previous example, and then you get the other workflow portability, like you just stated, where I'm calling kubectl apply, and I'm using the same workflow to get resources spun up on the various cloud providers, even if that configuration isn't one-to-one identical.
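As a sketch of that extension point (placeholder names, not from the episode): the same high-level object, applied with the same kubectl workflow, is mapped by each provider's controller to its own load balancer implementation.

```yaml
# Sketch: one Ingress manifest, one workflow. On EKS, AKS, or GKE, the
# cluster's ingress controller maps this same object to that provider's
# load balancer. Host and backend service names are placeholders.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web
spec:
  rules:
    - host: www.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: web   # the Service your application defines
                port:
                  number: 80
```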
Corey: This episode is sponsored in part by our friends at Uptycs, because they believe that many of you are looking to bolster your security posture with CNAPP and XDR solutions. They offer both cloud and endpoint security in a single UI and data model. Listeners can get Uptycs for up to 1,000 assets through the end of 2023 (that is next year) for $1. But this offer is only available for a limited time on UptycsSecretMenu.com. That's U-P-T-Y-C-S Secret Menu dot com.Corey: One thing I'm curious about is, you wind up walking through the world and seeing companies adopting Kubernetes in different ways. How are you finding the adoption of Kubernetes looks inside of big-E Enterprise-style companies? I don't have as much insight into those environments as I probably should. That's sort of a focus area for the next year for me. But in startups, it seems that it's either someone goes in and rolls it out and suddenly it's fantastic, or they avoid it entirely and do something serverless. In large enterprises, I see a lot of Kubernetes and a lot of Kubernetes stories coming out of it, but what isn't usually told is, what's the tipping point where they say, "Yeah, let's try this." Or, "Here's the problem we're trying to solve for. Let's chase it."Kelsey: What I see is enterprises buy everything. If you're big enough and you have a big enough IT budget, most enterprises have a POC of everything that's for sale, period. There's some team in some pocket, maybe they came through via acquisition. Maybe they live in a different state. Maybe it's just a new project that came out. And what you tend to see, at least from my experiences, if I walk into a typical enterprise, they may tell me something like, "Hey, we have a POC of Pivotal Cloud Foundry, OpenShift, and we want some of that new thing that we just saw from you guys. How do we get a POC going?" So there's always this appetite to evaluate what's for sale, right? So, that's one case. There's another case where, when you start to think about an enterprise, there's a big range of skill sets. Sometimes I'll go to some companies like, "Oh, my insurance is through that company, and there are ex-Googlers that work there." They used to work on things like Borg, or something else, and they kind of know how these systems work.And they have a slightly better edge at evaluating whether Kubernetes is any good for the problem at hand. And you'll see them bring it in. Now that same company, I could drive over to the other campus, maybe it's five miles away, and that team doesn't even know what Kubernetes is. And for them, they're going to be chugging along with what they're currently doing. So then the challenge becomes, if Kubernetes is a great fit, how wide of a fit is it? How many teams at that company should be using it? So what I'm currently seeing is there are some enterprises that have found a way to make Kubernetes the place where they do a lot of new work, because that makes sense. A lot of enterprises, to my surprise though, are actually stepping back and saying, "You know what? We've been stitching together our own platform for the last five years. We had the Netflix stack, we got some Spring Boot, we got Consul, we got Vault, we got Docker.
And now this whole thing is getting a little more fragile because we're doing all of this glue code. We've been trying to build our own Kubernetes, and now that we know what it is and we know what it isn't, we know that we can probably get rid of this kind of bespoke stack ourselves, just because of the ecosystem, right?" If I go to HashiCorp's website, I would probably find the word Kubernetes as much as I find the word Nomad on their site, because they've made things like Consul and Vault become first-class offerings inside of the world of Kubernetes. So I think it's that momentum that you see across people like Oracle, Juniper, Palo Alto Networks; they all seem to have a Kubernetes story. And this is why you start to see the enterprise able to adopt it: because it's so much in their face and it's where the ecosystem is going.Corey: It feels like a lot of the excitement and the promise and even the same problems that Kubernetes is aimed at today could have just as easily been talked about half a decade ago in the context of OpenStack. And for better or worse, OpenStack is nowhere near where it once was. It felt like it had such promise and such potential, and when it didn't pan out, that left a lot of people feeling relatively sad, burnt out, depressed, et cetera. And I'm seeing a lot of parallels today, at least between what was said about OpenStack and what was said about Kubernetes. How do you see those two diverging?Kelsey: I will tell you the big difference that I saw, personally, just from my personal journey outside of Google, just having that option. And I remember I was working at a company and we were like, "We're going to roll our own OpenStack. We're going to buy a FreeBSD box and make it a file server. We're going all open source," like, do whatever you want to do. And that was just having so many issues in terms of first-class integrations, education, people with the skills to even do that. And I was like, "You know what, let's just cut the check for VMware. We want virtualization." VMware, for the cost and what it does, is good enough. Or we can just actually use a cloud provider. That space in many ways was a purely solved problem. Now, let's fast-forward to Kubernetes: even when you get OpenStack finished, you're just back where you started.You've got a bunch of VMs and now you've got to go figure out how to build the real platform that people want to use, because no one just wants a VM. If you think Kubernetes is low-level, look at just having OpenStack: even if OpenStack were perfect, you're still at square one for the most part. Maybe you can just say, "Now I'm paying a little less money for my stack in terms of software licensing costs," but from an abstraction and automation and API standpoint, I don't think OpenStack moved the needle in that regard. Now in the Kubernetes world, it's solving a huge gap.Lots of people had virtual machine sprawl, then they had Docker sprawl, and when you bring in this thing like Kubernetes, it says, "You know what? Let's rein all of that in. Let's build some first-class abstractions, assuming that the layer below us is a solved problem." You've got to remember, when Kubernetes came out, it wasn't trying to replace the hypervisor; it assumed it was there. It also assumed that the hypervisor had APIs for creating virtual machines and attaching disks and creating load balancers, so Kubernetes came out as a complementary technology, not one looking to replace.
And I think that's why it was able to stick: because it solved a problem at another layer where there was not a lot of competition.Corey: I think a more cynical take, at least one of the ones that I've heard articulated and I tend to agree with, was that OpenStack originally seemed super awesome because there were a lot of interesting people behind it, fascinating organizations, but then you wound up looking through the backers of the foundation behind it and the rest, and there were something like 500 companies behind it. An awful lot of them were these giant organizations... they were big corporate IT enterprise software vendors, and you take a look at that. I'm not going to name anyone, because at that point, oh, will we get letters.But at that point, you start seeing so many of those patterns being worked into it that it almost feels like it has to collapse under its own weight. I don't, for better or worse, get the sense that Kubernetes is succumbing to the same thing, despite the CNCF having an awful lot of those same backers behind it and, as far as I can tell, significantly more money; they seem to have all the money to throw at these sorts of things. So I'm wondering how Kubernetes has managed to effectively sidestep, I guess, the open-source miasma that OpenStack didn't quite manage to avoid.Kelsey: Kubernetes gained its own identity before the foundation existed. Its purpose, if you think back to the Borg paper, goes back almost eight years prior, maybe even 10 years prior. It defined this problem really, really well. I think Mesos came out and also had a slightly different take on this problem. And you could just see at that time there was a real need. You had choices between Docker Swarm, Nomad. It seems like everybody was trying to fill in this gap because, across most verticals or industries, this was a true problem worth solving. What Kubernetes did was play in the exact same sandbox, but it kind of got put out with experience. It's not like, "Oh, let's just copy this thing that already exists, but let's just make it open."Because in that case, you don't really have your own identity. It's you versus Amazon; in the case of OpenStack, it's you versus VMware. And that's just really a hard place to be in, because you don't have an identity that stands alone. Kubernetes itself had an identity that stood alone. It comes from this experience of running a system like this. It comes from research and white papers. It comes after previous attempts at solving this problem. So we agree that this problem needs to be solved. We know what layer it needs to be solved at. We just didn't get it right yet, so Kubernetes didn't necessarily try to get it right.It tried to start with only the primitives necessary to focus on the problem at hand. Now to your point, the extension interface of Kubernetes is what keeps it small. Years ago I remember plenty of meetings where we all got in rooms and said, "This thing is done." It doesn't need to be a PaaS. It doesn't need to compete with serverless platforms. The core of Kubernetes, like Linux, is largely done. Here are the core objects, and we're going to make a very great extension interface. We're going to make one for the container runtime level so that people can swap that out if they really want to, and we're going to do one that makes other APIs as first-class as the ones we have, and we don't need to try to boil the ocean in every Kubernetes release.
Everyone else has the ability to deploy extensions, just like Linux, and I think that's why we're avoiding some of this tension in the vendor world: because you don't have to change the core to get something that feels like a native part of Kubernetes.Corey: What do you think is currently the most misinterpreted or misunderstood aspect of Kubernetes in the ecosystem?Kelsey: I think the biggest thing that's misunderstood is what Kubernetes actually is. And the thing that made it click for me, especially when I was writing the tutorial Kubernetes The Hard Way, was that I had to sit down and ask myself, "Where do you start trying to learn what Kubernetes is?" So I start with the database, right? The configuration store isn't Postgres, it isn't MySQL, it's etcd. Why? Because we're not trying to be this generic data store platform. We just need to store configuration data. Great. Now, do we let all the components talk to etcd? No. We have this API server, and between the API server and the chosen data store, that's essentially what Kubernetes is. You can stop there. At that point, you have a valid Kubernetes cluster and it can understand a few things. Like I can say, using the Kubernetes command-line tool, create this configuration map that stores configuration data, and I can read it back.Great. Now I can't do a lot of things that are interesting with that. Maybe I just use it as a configuration store, but then if I want to build a container platform, I can install the Kubernetes kubelet agent on a bunch of machines and have it talk to the API server looking for other objects; you add in the scheduler, all the other components. So what that means is that Kubernetes' most important component is its API, because that's how the whole system is built. It's actually a very simple system when you think about just those two components in isolation. If you want a container management tool, you need a scheduler, controller manager, cloud provider integrations, and now you have a container tool. But let's say you want a service mesh platform. Well, in a service mesh you have a data plane, which can be Nginx or Envoy, and that's going to handle routing traffic. And you need a control plane. That's going to be something that takes in configuration and uses that to configure all the things in the data plane.Well, guess what? Kubernetes is 90% there in terms of a control plane with just those two components, the API server and the data store. So now when you want to build control planes, if you start with the Kubernetes API, we call it the API machinery, you're going to be 95% there. And then what do you get? You get a distributed system that can handle failures on the back end, thanks to etcd. You're going to get RBAC, where you can have permissions on top of your schemas. And there's a built-in framework, we call it custom resource definitions, that allows you to articulate a schema and then have your own control loops provide meaning to that schema. And once you do those two things, you can build any platform you want. And I think that's one thing that it takes a while for people to understand about Kubernetes: the thing we talk about today, for the most part, is just the first system that we built on top of this.Corey: I think that's a very far-reaching story with implications that I'm not entirely sure I am able to wrap my head around. I hope to see it, I really do. I mean, you mentioned writing Kubernetes the Hard Way, and your tutorial, which I'll link to in the show notes.
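For readers following along, here is a minimal sketch of the custom resource definition mechanism Kelsey describes. The `Widget` kind and `example.com` group are hypothetical placeholders; a controller watching these objects would supply the control loop that gives the schema meaning.

```yaml
# Sketch: teach the API server a new, hypothetical object type. After
# `kubectl apply -f widgets.yaml`, the cluster stores, serves, and watches
# Widget objects; your own controller provides the behavior.
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: widgets.example.com      # must be <plural>.<group>
spec:
  group: example.com
  scope: Namespaced
  names:
    kind: Widget
    singular: widget
    plural: widgets
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                size:
                  type: integer  # your schema; the API machinery handles
                                 # storage, RBAC, and watch semantics
```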
I mean, my, of course, sarcastic response to that recently was to register the domain Kubernetes the Easy Way and just re-point it to Amazon's ECS, which is in no way, shape, or form Kubernetes, and basically has the effect of irritating absolutely everyone, as is my typical pattern of behavior on Twitter. But I have been meaning to dive into Kubernetes on a deeper level, and the stuff that you've written, not just the online tutorial but both the books, has always been my first port of call when it comes to that. The hard part, of course, is there's just never enough hours in the day.Kelsey: And one thing that I think about too is the web. We have the internet: there's webpages, there's web browsers. Web browsers talk to web servers over HTTP. There's verbs, there's bodies, there's headers. And if you look at it, that's a very big, complex system. But I can extract out the protocol pieces: this concept of HTTP verbs, GET, PUT, POST, and DELETE; this idea that I can put stuff in a body and I can give it headers to give it other meaning and semantics. If I just take those pieces, I can build RESTful APIs.Hell, I can even build GraphQL, and those are just different systems built on the same API machinery that we call the internet or the web today. But you have to really dig into the details and pull that part out, and then you can build all kinds of other platforms, and I think that's what Kubernetes is. It's probably going to take people a little while longer to see that piece, but it's hidden in there, and that's the piece that, like you said, is probably going to be the foundation for building more control planes. And when people build control planes, I think, if you think about it, maybe Fargate for EKS represents another control plane for making a serverless platform that talks to the Kubernetes API, even though the implementation isn't what you find on GitHub.Corey: That's the truth. Whenever you see something as broadly adopted as Kubernetes, there's always the question of, "Okay, there's an awful lot of blog posts: getting started with it, learn it in 10 minutes." I mean, at some point, I'm sure there are some people still convinced Kubernetes is, in fact, a breakfast cereal, based upon some of the stuff the CNCF has gotten up to. I wouldn't necessarily bet against it: socks today, breakfast cereal tomorrow. But it's hard to find a decent level of quality; finding a trusted source that clears a certain quality bar to get started with is important. Some people believe in the hero's journey style of narrative building.I always prefer to go with the moron's journey because I'm the moron. I touch technologies, I have no idea what they do, I figure it out and go careening into edge and corner cases constantly. And by the end of it I have something that vaguely sort of works and my understanding's improved. But I've gone down so many terrible paths just by picking a bad point to get started. So everyone I've talked to who's actually good at things has pointed to your work in this space as being something that is authoritative and largely correct, and given some of these people, that's high praise.Kelsey: Awesome. I'm going to put that on my next performance review as evidence of my success and impact.Corey: Absolutely. Grouchy people say, "It's all right"; you know, from the right people, that counts.
If people want to learn more about what you're up to and see what you have to say, where can they find you?Kelsey: I aggregate most of my outward interactions on Twitter, so I'm @KelseyHightower and my DMs are open, so I'm happy to field any questions and I attempt to answer as many as I can.Corey: Excellent. Thank you so much for taking the time to speak with me today. I appreciate it.Kelsey: Awesome. I was happy to be here.Corey: Kelsey Hightower, Principal Developer Advocate at Google. I'm Corey Quinn. This is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on Apple Podcasts. If you've hated this podcast, please leave a five-star review on Apple Podcasts and then leave a funny comment. Thanks.Announcer: This has been this week's episode of Screaming in the Cloud. You can also find more Corey at screaminginthecloud.com or wherever fine snark is sold.Announcer: This has been a HumblePod production. Stay humble.
What does the layer beneath Docker actually look like? And how does Kubernetes interact with containers?In episode 46 we worked out which problem Docker actually solves. The container ecosystem, however, is far bigger. So this episode is devoted to the layer underneath. We discuss the modularization of Docker, the extracted high-level runtime containerd, how Kubernetes handles Docker containers, whether Docker containers are the only kind of container Kubernetes supports, what a Container Runtime Interface (CRI) is, what the Open Container Initiative (OCI) is, and whether you, too, can program your own high-level container runtime.Bonus: what the Linux mafia is, and why there will soon be an Austrian container runtime.
Feedback (voice messages welcome too)
Email: stehtisch@engineeringkiosk.dev
Twitter: https://twitter.com/EngKiosk
WhatsApp +49 15678 136776
We're also happy to cover your audio feedback in one of the next episodes; just send an audio file by email or as a WhatsApp voice message to +49 15678 136776
Links
Engineering Kiosk #46 "Welches Problem löst Docker?": https://engineeringkiosk.dev/podcast/episode/46-welches-problem-l%C3%B6st-docker/
c't - Magazin für Computer Technik: https://www.heise.de/ct/
iX - Magazin für professionelle Informationstechnik: https://www.heise.de/ix/
High-level container runtime containerd: https://containerd.io/
Getting started with containerd: https://github.com/containerd/containerd/blob/main/docs/getting-started.md
contaiNERD CTL - Docker-compatible CLI for containerd, with support for Compose, Rootless, eStargz, OCIcrypt, IPFS: https://github.com/containerd/nerdctl
shim: https://de.wikipedia.org/wiki/Shim_(Informatik)
Core OS: https://de.wikipedia.org/wiki/Core_OS
rkt: https://github.com/rkt/rkt
Container Runtime Interface (CRI): https://kubernetes.io/docs/concepts/architecture/cri/
containerd's CRI plugin: https://github.com/containerd/containerd/tree/main/pkg/cri
High-level container runtime CRI-O: https://cri-o.io/
Cloud Native Computing Foundation: https://www.cncf.io/
Linux Foundation: https://www.linuxfoundation.org/
Apache Foundation: https://www.apache.org/
gVisor - Container Security Platform: https://gvisor.dev/
minikube - local Kubernetes cluster: https://minikube.sigs.k8s.io/docs/
Open Container Initiative: https://opencontainers.org/
Chapter marks
(00:00:00) Intro
(00:00:58) Computerbild, c't and iX, and one layer deeper than "What is Docker"
(00:04:31) The next layer down from docker run: the high-level runtime containerd
(00:07:05) What is a container lifecycle, and what's the difference between a high-level and a low-level runtime?
(00:09:37) Is containerd still part of Docker?
(00:10:35) Can containerd be driven by clients other than Docker?
(00:13:06) Does Kubernetes use Docker too? dockershim, CoreOS, rkt
(00:16:01) Kubernetes' Container Runtime Interface (CRI)
(00:19:44) Kubernetes, the API server, the kubelet, and the high-level runtime
(00:20:58) An alternative high-level container runtime to containerd: CRI-O
(00:24:15) The Cloud Native Computing Foundation (CNCF)
(00:28:16) Which is better: containerd or CRI-O?
(00:29:57) dockershim, and does Kubernetes still support Docker?
(00:32:10) Docker containers are really Open Container Initiative (OCI) images and containers
(00:33:21) Where do the Open Container Initiative (OCI) and its standards come from?
(00:34:41) Recap, and your options for writing your own runtime
(00:36:31) A look ahead to the next container episode, and feedback
Hosts
Wolfgang Gassler (https://twitter.com/schafele)
Andy Grunwald (https://twitter.com/andygrunwald)
Feedback (voice messages welcome too)
Email: stehtisch@engineeringkiosk.dev
Twitter: https://twitter.com/EngKiosk
WhatsApp +49 15678 136776
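As a rough sketch of the kubelet-to-runtime relationship the episode covers (an assumption-laden example, not from the show: the field below exists in recent Kubernetes releases, and the socket paths are common defaults rather than guarantees), swapping containerd for CRI-O is essentially a matter of pointing the kubelet at a different CRI socket:

```yaml
# Sketch of a kubelet configuration: the kubelet speaks CRI over a Unix
# socket and does not care which high-level runtime answers on it.
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
containerRuntimeEndpoint: unix:///run/containerd/containerd.sock
# For CRI-O instead of containerd, the endpoint would typically be:
# containerRuntimeEndpoint: unix:///var/run/crio/crio.sock
```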
We tried Fedora 37 on the Pi 4, the Google surprise this week, and our thoughts on the WSL 1.0 release.
This week our host Brandi Starr is joined by Meagen Eisenberg, CMO of TripActions. Meagen has spent over 20 years in high-tech, previously at MongoDB (Nasdaq: MDB) and on the board of G2, Terminus and Reactful. Before MongoDB, Meagen was VP of Demand Generation at DocuSign (Nasdaq: DOCU). Named one of the Top 25 B2B Marketing Influencers, Meagen has advised over 30 tech companies such as Loom, Branch.io, CoreOS, and Productboard, 7 of which were acquired in the past few years. Meagen has an MBA from Yale SOM, and a B.S. degree in MIS and a minor in CSC from Cal Poly – SLO.In this week's episode, Brandi and self-described builder, Meagen, tackle Growth & Scaling. With 17 successful exits since 2011 as an operator and advisor, including 3 IPOs and 14 mergers and acquisitions, CMOs will get the blueprint for Achieving a Successful Exit as CMO directly from a proven expert. Links: Get in touch with Meagen Eisenberg on: LinkedIn Twitter Learn more about TripActions Subscribe, listen, and rate/review Revenue Rehab Podcast on Apple Podcasts, Spotify, Google Podcasts , Amazon Music, or iHeart Radio and find more episodes on our website RevenueRehab.live
Today we chat with Cisco's head of developer content, community, and events, Michael Chenetz. We discuss everything from KubeCon to kindness and Legos! Michael delves into some of the main themes he heard from creators at KubeCon, and we discuss methods for increasing adoption of new concepts in your organization. We have a conversation about attending live conferences, COVID protocol, and COVID shaming, and then we talk about how Legos can be used in talks to demonstrate concepts. We end the conversation with a discussion about combining passions to practice creativity. We discuss our time at KubeCon in Spain (5:51) Themes Michael heard at KubeCon talking with creators (7:46) Increasing adoption of new concepts (9:27) We talk conferences, COVID shaming, and blamelessness (12:21) Legos and reliability (18:04) Michael talks about ways to exercise creativity (23:20) Links: KubeCon October 2022: https://events.linuxfoundation.org/kubecon-cloudnativecon-north-america/ Nintendo Lego Set: https://www.amazon.com/dp/B08HVXMQ87?ref_=cm_sw_r_cp_ud_dp_ED7NVBWPR8ANGT8WNGS5 Cloud Unfiltered podcast episode featuring Julie and Jason:https://podcasts.apple.com/us/podcast/ep125-chaos-engineering-with-julie-gunderson-and-jason/id1215105578?i=1000562393884 Links Referenced: Cisco: https://www.cisco.com/ Cloud Unfiltered Podcast with Julie and Jason: https://podcasts.apple.com/us/podcast/ep125-chaos-engineering-with-julie-gunderson-and-jason/id1215105578?i=1000562393884 Cloud Unfiltered Podcast: https://www.cisco.com/c/en/us/solutions/cloud/podcasts.html Nintendo Lego: https://www.amazon.com/dp/B08HVXMQ87 TranscriptJulie: And for folks that are interested in, too, what day it is—because I think we're all still a little bit confused—it is Monday, May 24th that we are recording this episode.Jason: Uh, Julie's definitely confused on what day it is because it's actually Tuesday, [laugh] May 24th.Michael: Oh, my God. [laugh]. That's great. I love it.Julie: Welcome to Break Things on Purpose, a podcast about reliability, learning from each other, and blamelessness. In this episode, we talk to Michael Chenetz, head of developer content, community, and events at Cisco, about all of the learnings from KubeCon, the importance of being kind to each other, and of course, how Lego translates into technology.Julie: Today, we are joined by Michael Chenetz. Michael, do you want to tell us a little bit about yourself?Michael: Yeah. [laugh]. Well, first of all, thank you for having me on the show. And I'm really good at breaking things, so I guess that's why I was asked to be here: because I'm superb at it. What I'm not so good at is, like, putting things back together.Like when I was a kid, I remember taking my dad's stereo apart; wasn't too happy about that. Wasn't very good at putting it back together. But you know, so that's just going back a little ways there. But yeah, so I work in DevRel at Cisco, and my whole responsibility is, you know, to get people to know a little bit about us in terms of, you know, all the developer-related topics.Julie: Well, and Jason and I had the awesome opportunity to hang out with you at KubeCon, where we got to join your Cloud Unfiltered podcast. So folks, definitely go check out that episode. We have a lot of fun. We'll put a link in the [show notes 00:02:03]. But yeah, let's talk a little bit about KubeCon. So, as of recording this episode, we all just recently traveled back from Spain, for KubeCon EU, which was… amazing. I really enjoyed being there.
My first time in Spain. I got back, I can tell you, less than 24 hours ago. Michael, I think—when did you get back?Michael: So, I got back Saturday night, but my bags have not arrived yet. So, they're still traveling and they're enjoying Europe. And they should be back soon, I guess when they feel like it—you know, when they're back from vacation.Julie: [laugh].Michael: So. [laugh].Julie: Jason, how about you? When did you get home?Jason: I got home on Sunday night. So, I took the train from Valencia to Barcelona on Saturday evening, and then an early morning flight on Sunday and got home late Sunday night.Julie: And for folks that are interested in, too, what day it is—because I think we're all still a little bit confused—it is Monday, May 24th that we are recording this episode.Jason: Uh, Julie's definitely confused on what day it is because it's actually Tuesday, [laugh] May 24th.Michael: Oh, my God. [laugh]. That's great. I love it. By the way, yesterday was my birthday so I'm going to say—Julie: Happy birthday.Michael: —happy birthday to myself.Julie: Oh, my gosh, happy birthday. [laugh].Michael: Thank you [laugh].Julie: So… what is time anyway?Jason: Yeah.Michael: It's all good. It's all relative. Time is relative.Julie: Time is relative. And so, you know, tell us a little bit about—I'd love to know a little bit about why you want folks to know about, like, what is the message you try to get across?Jason: Oh, that's not the question I thought you were going to ask. I thought you were going to ask, "What's on your Amazon wishlist so people can send you birthday presents?"Julie: Yeah, let's back up. Let's do that. So, let's start with your Amazon wishlist. We know that there might be some Legos involved.Michael: Oh, my God, yeah. I mean, you just told me about a cool one, which was Optimus Prime, and I just—I'm already on the website, my credit card is out and I'm ready to buy. So, you know, this is the problem with talking to you guys. [laugh]. It's definitely—you know, that's definitely on my list. So, anything that, anything music-related because obviously behind me is a lot of music equipment—I love music stuff—and anything tech. The combination of tech and music, and if you can combine Legos and that, too, man, that would just check all the boxes. [laugh].Julie: Just to let you know, there's a Lego Con. Like, I did not know this until last night, actually. But it is a virtual conference.Michael: Really.Julie: Yeah. But one of the things I was looking at actually on Lego, when you look at their website, like, to request one of their speakers, to request one of their engineers as a speaker, they actually don't do that because they get so many requests for their folks to speak at conferences; they actually have a dedicated part of their website that talks about this. So, I thought that was interesting.Michael: Well listen, just because of that, if they want somebody that's in, you know, cloud computing, I'm not going to go talk for Lego. And I know they really want somebody from cloud computing talking to Lego, so, you know… it's, you know, quid pro quo there, so that's just the way it's going to work. [laugh].Julie: I want to be best friends with Lego people.Michael: [laugh]. I know, me too.Julie: I'm just going to make it a goal in life now to have one of their engineers speak at DevOpsDays Boise. It's like a challenge.Michael: It is. I accept it.Julie: [laugh].
With that, though, just on other Lego news, before we start talking about all the other things that folks may also want to hear about, there is another new Lego, which is the Van Gogh Starry Night that has been newly released by the time this episode comes out.Michael: With a free ear, right?Julie: I mean—[laugh].Michael: Is that what happens?Julie: —well played. Well, played. [laugh]. So, now you really got to spend a lot of time at KubeCon, you were just really recording podcast after podcast.Michael: Oh, my God. Yeah. So, I mean, it was great. I love—because I'm a techie, so I love tech and I love to find out origin stories of stuff. So, I love to, like, talk to these people and like, “Why did that come about? How did—” you know, “What happened in your life that made you want to do this? Who hurt you?” [laugh].And so, that's what I constantly try and figure out is, like, [laugh], “What is that?” So, it was really cool because I had, like, Jimmy Zelinskie who came from CoreOS, and he came from—you know, they create, you know, Quay and some of this other kinds of stuff. And you know, just to talk about, like, some of the operators and how they came about, and like… those were the original operators, so that was pretty cool. Varun from Tetrate was supposed to come on, and he created Istio, you know? So, there were so many of these things that I just geek out knowing about, you know?And then the other thing that was really high on our list, and it's really high from where I am, is API quality, API testing, API—so really, that's why I got in touch with you guys because I was like, “Wow, that fits in really good, you know? You guys are doing stuff that's around chaos, and you know, I think that's amazing.” So, all of this stuff is just so interesting to me. But man, it was just a whirlwind of every day just recording, and by the end that was just like, you know, “I'm so sorry, but I just, I can't talk anymore.” You know, and that was it. [laugh].Jason: I love that chatting with the creators. We had Zack Butcher on who is also from Tetrate and one of the early Istio—Michael: Yeah, yeah.Jason: Contributors. And I find it fascinating because I feel like when you chat with these folks, you start to understand the context of why things were built. And it—Michael: Yes.Jason: —it opens your brain up to, like, cool, there's a software—oh, now I know exactly why it's doing things that way, right? Like, it's just so, so eye-opening. I love it.Julie: With that, though, like, did you see any trends or any themes as you were talking to all these folks?Michael: Yeah, so a few real big trends. One is everybody wants to know about eBPF. That was the biggest thing at KubeCon, by far, was that, “We want to learn how to do this low-level kernel stuff that's really fast, that can give us all the information we need, and we don't have to use sidecars and things like that.” I mean it was—you know, that was the most excitement that I saw. OTel was another one for OpenTelemetry, which was a big one.The other thing was simplification. You know, a lot of people were looking to simplify the Kubernetes ecosystem because there's so much out there, and there's so many things that you have to learn about that it was super hard, you know, for somebody to come into it to say, “Where do I even start?” You know? So, that was a big theme was simplification.I'm trying to think. I think another one is APIs, for sure. You know, because there's this whole thing about API sprawl. 
And people don't know what their APIs are, people just, like—you know, I always say people can see—like, developers are lazy in a good way, and I consider myself one of them. So, what that means is that when we want to develop something, what we're going to do is we're just going to pull down the nearest API that does what we need, that has the best documentation, that has the best blog, that has the best everything.We don't know what their testing strategy is; we don't know what their security strategy is; we don't know if they use other libraries. And you have to figure that stuff out. And that's the thing that—you know, so everything around APIs is super important. And you really have to test that stuff out. Yes, people, you have to test it [laugh] and know more about it. So, those were the big themes, I think. [laugh].Julie: You know, I know that Kerim and I gave a talk on observability where we kind of talked more high-level about some of the overarching concepts, but folks were really excited about that. I think it was because we briefly touched on OpenTelemetry, which we should have gone into in a little bit more depth, but there's only so much you can fit into a 30-minute talk, so hopefully we'll be able to talk about that more at a KubeCon in the future, we [crosstalk 00:09:54] to the selection committee.Michael: Hashtag topics?Julie: Uh-huh. [laugh]. You know, that said, though, it really did seem like a huge topic that people just wanted to learn more about. I know, too, at the Gremlin booth, a lot of folks were also interested in talking about, like, how do we just get our organization to adopt some of these concepts that we're hearing about here? And I think that was the thing that surprised me the most: I expected people to be coming up to the booth and deep-diving into very, very deep, technical-level questions, and really, a lot of it was how do we get our organization to do this? How can we increase adoption? So, that was a surprise for me.Michael: Yeah, you know what, and I would say two things to that. One is, when you talk about Chaos Engineering, I think people think it's like rocket science, and people are really scared and they don't want to claim to be experts in it, so they're like, "Wow, this is, like, next-level stuff, and you know, we're really scared. You guys are the experts. I don't want to even attempt this." And the other thing is that organizations are scared because they think that it's going to, like, create mass hysteria throughout their organization.And really, none of this is true in either way. In reality, it's very, very scripted, very exacting stuff that you're testing, and you throw stuff out there and see what kind of response you get. So, you know, it's not this, like, you know—I think people just have—there needs to be more education around a lot of areas in cloud-native. But you know, that's one of the areas. So, I think it's really interesting there.
And this was, I feel like, one of the first conferences, that it really started to feel back to normal.Like, there was much better attendance, there was much more just buzz and hallway tracking and everything else that we're used to. Like, the whole reason that we go to conferences is getting together with people and hanging out and stuff, and this one has so far felt the most back-to-normal out of any event that I've been to over the past six months.Michael: Can I just talk about one thing that I think, you know, people have to get over is, you know, I see a lot online, I think it was—I forget who it was that was talking about it. But this whole idea of Covid shaming. I mean, we're going to this event, and it's like, yeah, everybody wants to get out, everybody wants to learn things, but don't shame people just because they got Covid, everybody's getting Covid, okay? That's just the point of life at this point. So, let's just, you know, let's just be nice to each other, be friendly to each other, you know? I just have to say that because I think it's a shame that people are getting shamed, you know, just for going to an event. [laugh].Julie: See, and I think that—that's an interesting—there's been a lot of conversation around this. And I don't think anybody should be Covid-shamed. Look, I think that we all took a calculated risk in coming—Michael: Absolutely.Julie: To this event. I personally gave out a lot of hugs. I hugged some of the folks that have mentioned that they have come up positive from Covid, so there's a calculated risk in going. I think there has been a little bit of pushback on maybe how some of the communication has come out around it. That said, as an organizer of a small conference with, like, 400 people, I think that these are very complicated matters. And what I really think is important is to listen to feedback from attendees and to take that.And then we're always looking to improve, right?Michael: Absolutely.Julie: If everything that we did was perfect right out of the gate, then we wouldn't have Chaos Engineering because there'd be nothing [crosstalk 00:13:45] be just perfectly reliable. And so, if we take away anything, let's take away—just like what you said, first of all, Covid, you should never shame somebody for having Covid. Like, that's not cool. It's not somebody's fault that they caught an illness.Michael: Yes.Julie: I mean unless they were licking doorknobs. And that's a whole different—Michael: Yes. [laugh]. That's a whole different thing, right there.Julie: Conversation. But when we talk about just like these questions around cultural adoption, we talk about blamelessness; we talk about learning from failure; we talked about finding ways to improve, and I think all of that can come into play. So, it'll be interesting to see how we learn and grow as we move forward. And like, thank you to re:Invent, thank you to KubeCon, thank you to DevOpsDays Boise. But these conferences that have started going back in-person, at great risk to organizers and the committee because people are going to be mad, one way or the other.Michael: Yeah. And you can see that people want to be back because it was huge, you know?Julie: Yeah.Michael: Maybe you guys, I'm going to put in a feature request for Gremlin to chaos engineer crowds. Can we do that so we can figure out, like, what's going to happen when we have these big events? Can we do that?Julie: I mean, that sounds fun. 
I think what's going to happen is there's going to be hugs, there's going to be people getting sick, but there's going to be people learning and growing.Michael: Yes.Julie: And ultimately, I just think that we have to remember that just, like, our systems aren't perfect, and neither are people. Like, the fact that we expect people to be perfect, and maybe we should just keep some mask mandates for a little bit longer when we're at conferences with 8000 people.Michael: Sure.Julie: I mean, that's—Michael: That makes sense.Jason: Yeah. I mean, it's all about risk management, right? This is, essentially what we do in SRE is there's always a risk of a massive outage, and so it's that balance of, right, do what you can, but ultimately, that's why we have SLOs and things is, you can never be a hundred percent, so like, where do we draw the line of here are the things that we're going to do to help manage this risk, but you can never shoot for a perfectly, entirely safe space, right? Because then we'd all be having conferences in padded rooms, and not touching each other, and things like that. There's a balance there.And I think we're all just trying to find that, so yeah, as you mentioned, that whole, like, DevOps blamelessness thing, you know, treat each other with the notion that we're all trying to get through this together and do what we think is best. Nobody's just like John Allspaw said, you know, “Nobody goes to work thinking that, like, their intent is to crash everything and destroy the company.” No one's going to KubeCon or any of these conferences thinking, “Yeah, I'm going to be a super-spreader.”Julie: [laugh].Michael: Yeah, that would be [crosstalk 00:16:22].Jason: Like, everyone's trying not to do it. They're doing their best. They're not actively, like, aggressively trying to get you sick or intentionally about it. But you know—so just be kind to one another.Michael: Yeah. And that's the key.Julie: It is.Michael: The key. Be kind to one another, you know? I mean, it's a great community. People are really nice, so, you know, let's keep that up. I think that's something special about the, you know, the community around KubeCon, specifically.Julie: As we can refine this and find ways, I would take all of the hugs over virtual conferences—Michael: Yes.Julie: Any day now. Because, as Jason mentioned, is even just with you, Michael, the time we got to spend with you, or the time I kept going up to Jfrog's booth and Baruch and I would have conversations as he made me a delicious coffee, these hallway tracks, these conversations, that's what no one figured out how to recreate during the virtual events—Michael: Absolutely.Julie: —and it's just not possible, right?Michael: Yeah. I mean, I think it would take a little bit of VR and then maybe some, like, suit that you wear in order to feel the hug. And, you know, so it would take a lot more in order to do that. I mean, I guess it's technologically possible. I don't know if the graphics are there yet, so it might be like a pixelated version, like, you know, like, NES-style, or something like that. But it could look pretty cool. [laugh]. So, we'll have to see, you know?Julie: Everybody listening to this episode, I hope you're getting as much of a kick out of it as we are recording it because I mean, there are so many different topics here. One of the things that Michael and I bonded about years ago, for our listeners that are—not years ago; months ago. Again, what is time?Michael: Yeah. What is time? It's all relative.Julie: It is. 
It was Lego, though, and so we've been talking about that. But Michael, you asked a great question when we were recording with you, which is, like—Michael: Wow.Julie: Can—just one. Only one great question.Michael: [laugh].Julie: [laugh]. Which was, how would you incorporate Lego into a talk? And, like, when we look at our systems breaking and all of that, I've really been thinking about that and how to make our systems more reliable. And here's one of the things I really wanted to clarify that answer. I kind of went… I went talking about my Lego that I build, like, my Optim—not my Optimus Primes, I don't have it, but my Voltron or my Nintendo Lego. And those are all box sets.Michael: Yep.Julie: But one of the things if you're not playing with a box set with instruction, if you're just playing with just the—or excuse me, architecting with just the Lego blocks because it's not playing because we're adults now, I think.Michael: Yes, now it's architecting. Yes.Julie: Yes, now that we're architecting, like, that's one of the things that I was really thinking about this, and I think that it would make something really fun to talk about is how you're building upon each layer and you're testing out these new connection pieces. And then that really goes into, like, when we get into Technics, into dependencies because if you forget that one little one-inch plastic piece that goes from the one to the other, then your whole Lego can fall apart. So anyway, I just thought that was really interesting, and I'd wondered if you or Jason even gave that any more thought, or if it was just fleeting for you.Michael: It was definitely fleeting for me, but I will give it some more thought, you know? But you know, when—as you're saying that though, I'm thinking these Lego pieces really need names because you're like that little two-inch Lego piece that kind of connects this and this, like, we got to give these all names so that people can know, that's x-54 that's—that you're putting between x-53 and x-52. I don't know but you need some kind of name for these parts now.Julie: There are Lego names. You just Google it. There are actual names for all of the parts but—Michael: Wow. [laugh].Julie: Like, Jason, what do you think? I know you've got [unintelligible 00:19:59].Jason: Yeah, I mean, I think it's interesting because I am one of those, like, freeform folks, right? You know, my standard practice when I was growing up with Legos was you build the thing that you bought once and then you immediately, like, tear it apart, and you build whatever the hell you want.Michael: Absolutely.Jason: So, I think that that's kind of an interesting thing as we think about our systems and stuff, right? Like, part of it is, like, yeah, there's best practices and various companies will publish, like, you know, “Here's how to architect such-and-such system.” And it's interesting because that's just not reality, right? You're not going to go and take, like, the Amazon CloudFormation thing, and like, congrats, you're done. You know, you just implement that and your job's done; you just kick back for the rest of the week.It never works that way, right? 
You're taking these little bits of, like, cool, I might have, like, set that up once just to see what's happening but then you immediately, like, deconstruct it, and you take the knowledge of what you learned in those building blocks, and you, like, go and remix it to build the thing that you actually need to build.Michael: But yeah, I mean, that's exactly—so you know, Legos is what got me interested in that as a kid, but when you look at, you know, cloud services and things like that, there's so many different ways to combine things and so many different ways to, like—you know, you could use Terraform, you could use Crossplane, you could use, you know, any of the services in the cloud, you could use FaaS, you could use serverless, you could use, you know, all these different kinds of solutions and tie them together. So, there's so much choice, and what Lego teaches you is that, embrace the choice. Figure out and embrace the different pieces, embrace all the different things that you have and what the art of possibility is, and then start to build on that. So, I think it's a really good thing. And that's why there's so much correlation between, like, kind of, art and tech and things like that because that's the kind of mentality that you need in order to be really successful in tech.Jason: And I think the other thing that works really well with what you said is, as you're playing with Legos, you start to learn these hacks, right? Like, I don't have, like, a four-by-one brick, but I know that if I have three four-by-one flats, I can stack those three and it's the same height as a brick, right?Michael: Yep.Jason: And you can start combining things. And I love that engineering mentality of, like, I have this problem that I need to solve, I have a limited toolbox for whatever constraints, right, and understanding those constraints, and then cool, how can I remix what I've got in my toolbox to get this thing done?Michael: And that's a thing that I'm always doing. Like, when I used to do a lot of development, you know, it was always like, what is the right code? Or what is the library that's going to solve my problem? Or what is the API that's going to solve my problem, you know?And there's so many different ways to do it. I mean, so many people are afraid of, like, making the wrong choice, when really in programming, there is no wrong choice. It's all about how you want to do it and what makes sense to you, you know? There might be better options in formatting and in the way that you kind of, you know, format that code together and put them in different libraries and things like that, but making choices on, like, APIs and things like that, that's all up to the artist. I would say that's an artist. [laugh]. So, you know, I think it all stems though, when you go back from, you know, just being creative with things… so creativity is king.Jason: So Michael, how do you exercise your creativity, then? How do you keep up that creativity?Michael: Yeah, so there's multiple ways. And that's a great segment because one of the things that I really enjoy—so you know, I like development, but I'm also a people person. And I like product management, but I also like dealing with people. So really, to me, it's about how do I relate products, how do I relate solutions, how do I talk to people about solutions that people can understand? And that's a creative process.Like, what is the right media? What is the right demos? What is the right—you know, what do people need? 
And what do people need to, kind of, embrace things? And to me, that's a really creative medium to me, and I love it.So, I love that I can use my technical, I love that I can use my artistic, I love that I can use, you know, all these pieces all at once. And sometimes maybe I'll play guitar and just put it in the intro or something, I don't know. So, that kind of combines that together, too. So, we'll figure that piece out later. Maybe nobody wants to hear me play guitar, that's fine, too. [laugh].But I love to be able to use, you know, both sides of my brain to do these creative aspects. So, that's really what does it. And then sometimes I'll program again and I'll find the need, and I'll say, “Hey, look, you know, I realized there's a need for this,” just like a lot of those creators are. But I haven't created anything cool, but you know, maybe someday I will. I feel like it's just been in between all those different intersections that's really cool.Jason: I love the electric guitar stuff that you mentioned. So, for folks who are listening to this show, during our recording of the Cloud Unfiltered you were talking about bringing that art and technical together with electric guitars, and you've been building electric guitar pickups.Michael: Yes. Yeah. So, I mean, I love anything that can combine my music passion with tech, so I have a CNC machine back here that winds pickups and it does it automatically. So, I can say, “Hey, I need a 57 pickup, you know, whatever it is,” and it'll wind it to that exact spec.But that's not the only thing I do. I mean, I used to design control surfaces for artists that were a big band, and I really can't—a lot of them I can't mention because we're under NDA. But I designed a lot of these big, you know, control surfaces for a lot of the big electronic and rock bands that are out there. I taught people how to use Max for Live, which is an artist's, kind of, programming language that's graphical, so [NMax 00:25:33] and MSP and all that kind of stuff. So, I really, really like to combine that.Nowadays, you know, I'm talking about doing some kind of events that may be combined tech, with art. So, maybe doing things like Algorave, and you know, things that are live-coding music and an art. So, being able to combine all these things together, I love that. That's my ultimate passion.Jason: That is super cool.Julie: I think we have learned quite a bit on this episode of Break Things on Purpose, first of all, from the guy who said he hasn't created much—because you did say that, which I'm going to call you out on that because you just gave a long list of things that you created. And I think we need to remember that we're all creators in our own way, so it's very important to remember that. But I think that right now we've created a couple of options for talks in the future, whether or not it's with Lego, or guitar pickups.Michael: Yeah.Julie: Is that—Michael: Hey—Julie: Because I—Michael: Yeah, why not?Julie: —know you do kind of explain that a little bit to me as well when I was there. So, Michael, this has just been amazing having you. We're going to put a lot of links in the notes for everybody today. So, to Michael's podcast, to some Lego, and to anything else Michael wants to share with us as well. Oh, real quick, is there anything you want to leave our listeners with other than that? You know, are you looking to hire Cisco? 
Is there anything you wanted to share with us?Michael: Yeah, I mean, we're always looking for great people at Cisco, but the biggest thing I'd say is, just realize that we are doing stuff around cloud-native, we're not just network. And I think that's something to note there. But you know, I just love being on the show with you guys. I love doing anything with you guys. You guys are awesome, you know. So.Julie: You're great too, and I think we'll probably do more stuff, all of us together, in the future. And with that, I just want to thank everybody for joining us today.Michael: Thank you. Thanks so much. Thanks for having me.Jason: For links to all the information mentioned, visit our website at gremlin.com/podcast. If you liked this episode, subscribe to the Break Things on Purpose podcast on Spotify, Apple Podcasts, or your favorite podcast platform. Our theme song is called, “Battle of Pogs” by Komiku, and it's available on loyaltyfreakmusic.com.
We just wrapped up our East Coast meetup and have a bunch of great stories to share. Plus some Nix ups and downs, and more. Special Guest: Alex Kretzschmar.
We've grown to rely on “the three pillars” for observability: logs, metrics, and traces. Popular frameworks such as Prometheus have helped popularize these practices. But now people are starting to realize that it's not enough. On this episode, Dotan Horovits will host Frederic Branczyk for a discussion about the unspoken pitfalls of Prometheus and the challenges of current observability coverage. We will also discuss the rise of Continuous Profiling as a new observability signal, what it's about, and where it can help. We'll also review the recent launch of Parca, an open source project for continuous profiling that traces its roots to Red Hat's internal ConProf open source tool.

Frederic is the founder and CEO of Polar Signals. Before founding Polar Signals, he was a senior principal engineer and the main architect for all things observability at Red Hat, which he joined through the CoreOS acquisition. Frederic is a Prometheus and Thanos maintainer, as well as the tech lead for the special interest group for instrumentation in Kubernetes. In a previous life, he was a security researcher working on key management solutions as well as intrusion detection systems. When not working on software, Frederic enjoys obsessing over brewing a perfect cup of coffee.

The episode was live-streamed on 16 December 2021 and the video is available at https://www.youtube.com/watch?v=G02g63oI0IA

OpenObservability Talks episodes are released monthly, on the last Thursday of each month. The episodes are also live-streamed on Twitch and YouTube Live - tune in to see us live, and pitch in with your comments and questions on the live chat.

Show Notes:
- The limitations of the three pillars model of observability
- Prometheus strengths and pitfalls
- How to start with continuous profiling
- How to correlate between different telemetry
- Parca OSS intro
- eBPF turned out perfect for instrumenting continuous profiling
- Parca OSS future plans
- How the performance penalty of continuous profiling is kept low
- What's the solution for high cardinality in Prometheus?
- Will Parca OSS be contributed to an established OSS foundation?
- Prometheus Agent mode released
- OTEL operator now has an instrumentation CR
- Continuous profiling support for interpreted languages

Resources:
- https://www.parca.dev/
- https://github.com/google/pprof
- https://increment.com/containers/observing-containers-pillars-of-observability/
- https://ebpf.io/
- https://research.google/pubs/pub36575/

Social:
- Twitter: https://twitter.com/OpenObserv
- YouTube: https://www.youtube.com/@openobservabilitytalks
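For listeners who want to try this space hands-on: tools in it, Parca included, generally work with pprof-formatted profiles, and a Go service can expose those using nothing but the standard library. A minimal sketch (the port is an arbitrary choice, not anything a particular profiler requires):

```go
package main

import (
	"log"
	"net/http"
	_ "net/http/pprof" // side-effect import: registers /debug/pprof/* handlers on the default mux
)

func main() {
	// Serve the standard pprof endpoints (CPU, heap, goroutine, and so on)
	// on a side port, separate from application traffic. A profiler can now
	// pull profiles continuously instead of only during an incident.
	log.Fatal(http.ListenAndServe("localhost:6060", nil))
}
```

From there, `go tool pprof http://localhost:6060/debug/pprof/profile?seconds=30` pulls a 30-second CPU profile; a continuous profiler effectively automates that collection around the clock (and Parca's agent can instead capture profiles system-wide via eBPF, with no code changes at all).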
At the heart of TikTok there is music, and there are choreographies, which make up an enormous part of the videos its users publish. But why are we drawn to showing others that we can perform a specific sequence of movements in time with the music? That is what Clarisa Herrera asked Sofia Geyer, a specialist in human behavior and a consultant in creativity and innovation; Julia Kaiser, chief content officer at Havas; Laura Jurkowski, a psychologist and the director of reConectarse, a treatment center for addiction to the internet, video games, and technology use; and the dancer Flor Viterbo.
Oxide and Friends Twitter Space: September 13th, 2021

Docker, Inc., an Early Epitaph

We've been holding a Twitter Space weekly on Mondays at 5p for about an hour. Even though it's not (yet?) a feature of Twitter Spaces, we have been recording them all; here is the recording for our Twitter Space for September 13th, 2021.

In addition to Bryan Cantrill and Adam Leventhal, speakers on September 13th included Steve Tuck, Tom Lyon, Dan Cross, Josh Clulow, Ian, Nick Gerace, Aaron Goldman, Drew Vogel, and vint serp. (Did we miss your name and/or get it wrong? Drop a PR!)

Some of the topics we hit on, in the order that we hit them:

- Topic: Scott Carey's article How Docker broke in half
- More by Carey on Docker: Docker Desktop is no longer free for enterprise users; What is Docker? The spark for the container revolution
- Andrej Karpathy's tweet showing InfoWorld.com spamming ads
- Carey talked to: Solomon Hykes (Docker cofounder with Sebastien Pahl), Ben Golub (Docker CEO 2013-2017), Craig McLuckie (Kubernetes cofounder), Nick Stinemates (early employee and former VP of Business Development)
- [@5:21](https://youtu.be/l9LTJdT0sZ8?t=321) Akira Kurosawa's 1950 Rashomon, ~90mins. Watch a 2min trailer
- Box office bomb “The Hottie and the Nottie”. Other stinkers: Gigli, Gotti
- [@9:31](https://youtu.be/l9LTJdT0sZ8?t=571) Jerry Kaplan's 1996 book Startup: A Silicon Valley Adventure
- Steve's take on commercialization
  > Bryan: There's no question that they hit on something very big. We saw a container as an operational vessel, but we failed to see a container as a development vessel.
- [@14:36](https://youtu.be/l9LTJdT0sZ8?t=876) dotCloud (PaaS) struggles to find a buyer; ultimately open sources as last resort
  > All of a sudden a company that nobody had heard of was a company that everybody had heard of. They took too much money.
- [@17:40](https://youtu.be/l9LTJdT0sZ8?t=1060) Pitfalls in raising money and scaling sales by imitating big companies. HBO's Silicon Valley clip, ~1min, with Jan the Man, Keith, and Doug (“I'm shadowing Keith”)
  > Everybody should be spending time arm in arm with customers understanding how is this technology going to solve a problem which they'll want to pay to have a solution.
- Tom: Was there actually a business anyways? Or was it just technology? What if developers are attracted to those things they know cannot be monetized? There was this belief that if a technology is this ubiquitous, it will be readily monetizable.
- [@27:26](https://youtu.be/l9LTJdT0sZ8?t=1646) Docker Swarm and Kubernetes
  > Hykes: We didn't work at Google, we didn't go to Stanford, we didn't have a PhD in computer science.
  Stinemates: (The Kubernetes team) had strong opinions about the need for a service-level API, and Docker technically had its own opinion about a single API from a simplicity standpoint. We couldn't agree.
- DockerCon 2015: No mentioning Kubernetes! Brendan Burns' talk “The distributed system toolkit: Container patterns for modular distributed system design” was unfortunately made private by Docker sometime in the last two years. The internet archive only has this. Burns wrote a blog post about the topics from his talk.
- rkt (“Rocket”), CoreOS
- [@36:11](https://youtu.be/l9LTJdT0sZ8?t=2171) Docker coming to market: enterprise teams wanted support; initial support offerings were expensive and limited (no after hours, no weekends)
  > Bryan: I floated to Solomon in 2014: run container management as a service.
- Rancher Labs, K3s (lightweight Kubernetes)
- People care about GitHub stars (for better or worse)
- [@48:02](https://youtu.be/l9LTJdT0sZ8?t=2882) Monetizing open source technologies; Triton implementing the Docker API; the support relationships are the foothold to figure out the product
- [@54:36](https://youtu.be/l9LTJdT0sZ8?t=3276) Venture capital going into Docker; Docker acquires Tutum; product market fit; acquisitions
- [@1:04:42](https://youtu.be/l9LTJdT0sZ8?t=3882) Could the outcome have been materially different? Who made money on Docker? Cloud companies? Developers? VMware acquires Heptio. Who invented containers? BSD Jails, Plan9 namespaces?
- Tyler Tringas' post about how small teams can create value with little outside investment, as a result of the Peace Dividend of the SaaS Wars

If we got something wrong or missed something, please file a PR! Our next Twitter Space will likely be on Monday at 5p Pacific Time; stay tuned to our Twitter feeds for details. We'd love to have you join us, as we always love to hear from new speakers!
Qualys is currently the only vendor to secure both the container and the host operating system for the OpenShift Container Platform, providing unparalleled visibility, actionable intelligence, and security auditing to the OpenShift stack while using Red Hat CoreOS. Join Qualys' Spencer Brown, Security Solutions Architect, and Alex Mandernack, Director of Product Management, Cloud Security Solutions, as we discuss full-stack security for the OpenShift Container Platform.
David and Redbeard talk about developing your personal network through core communities, how teaching can improve your own knowledge, and how genuine conversation can lead to some amazing opportunities.
Abstract of the talk… Cockroach Labs has built a database architected from the ground up to be distributed. It is a perfect fit for the cloud and Kubernetes, as it naturally scales and survives without manual intervention. The unique architecture of CockroachDB delivers some key innovations that may not only provide value for your applications but might also give you insight into the challenges and solutions in distributed systems. In this session, we will deliver a deep-dive exploration into the internals of the database, exploring the following, and more:

- How the database uses KV at the storage layer to effectively distribute data
- How Raft and MVCC are used to guarantee serializable isolation for transactions
- How Cockroach automates scale and guarantees an always-on, resilient database
- How to tie data to a location to help with performance and data privacy

Bio… Jim has been a product marketer for almost twenty years, and before that he coded professionally in Smalltalk, C++, and Java. He still codes and likes to dive deep into tech so that he can help translate complex topics into consumable forms. Over the course of his career he has focused on emerging tech and has been directly involved in creating six categories. He prides himself on being an advocate of the developer and a rabid open source software promoter. The list of startups that he's helped build includes Servgate, Vontu (acquired), Initiate Systems (acquired), Talend (IPO), Hortonworks (IPO), EverString (acquired), and CoreOS (acquired), and he is currently VP of Product Marketing at pre-IPO Cockroach Labs.

Key take-aways from the talk… We will dive deep into the architecture of the database and explicitly cover the following areas:

- Ranges (partitions): SQL to KV
- Raft
- Distributed data: range distribution, scale, and resilience
- Distributed transactions
- Distributed SQL execution
- Distributed latency
- Distributed performance optimizations
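To make the “SQL to KV” idea above concrete, here is a toy Go sketch. This is not CockroachDB's actual encoding (the real thing uses an ordered binary format with MVCC-timestamped values); it only illustrates the shape: every row becomes entries in a single sorted key space, which is what lets the database split that space into ranges and distribute them across nodes.

```go
package main

import (
	"fmt"
	"sort"
)

// kv is one entry in a sorted key-value store.
type kv struct{ key, value string }

// encodeRow maps a SQL row onto KV pairs, toy-style: one pair per column,
// keyed by /<table>/<primary key>/<column>. Because keys sort together by
// table and primary key, contiguous spans of rows can be carved into ranges.
func encodeRow(table string, pk int, cols map[string]string) []kv {
	names := make([]string, 0, len(cols))
	for name := range cols {
		names = append(names, name)
	}
	sort.Strings(names) // deterministic order, mimicking a sorted key space

	out := make([]kv, 0, len(cols))
	for _, name := range names {
		out = append(out, kv{
			key:   fmt.Sprintf("/%s/%d/%s", table, pk, name),
			value: cols[name],
		})
	}
	return out
}

func main() {
	for _, pair := range encodeRow("users", 42, map[string]string{
		"name": "Ada",
		"city": "London",
	}) {
		fmt.Printf("%-16s => %s\n", pair.key, pair.value)
	}
}
```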
As an engineer at Puppet, CoreOS, and Google Cloud, Kelsey Hightower has been at the forefront of new deployment technologies over the past decade. Along the way, he has built tools like confd, created learning resources like Kubernetes The Hard Way, co-founded KubeCon, and taught multitudes of people about containers, infrastructure as code, service meshes, and the operating system of the cloud.

In this conversation, Kelsey talks about how he learns new technologies, shares stories over the course of Kubernetes history, and explains how one might make sense of the varied ecosystem of infrastructure tools (“engineering organizations are like restaurants”).

Show notes and transcript: https://about.sourcegraph.com/podcast/kelsey-hightower
Meagen is the CMO of TripActions, previously CMO of MongoDB, and a board member at G2, TinyPulse, and Reactful. Before joining MongoDB, she was VP of Demand Generation at DocuSign. Named a Top 25 B2B Marketing Influencer, Meagen has advised over 25 tech companies, such as Branch.io, Chorus.ai, Sendoso, CoreOS, Accompany, TinyPulse, and SumoLogic, seven of which were acquired in the past two years. She has an MBA from Yale SOM.
Some highlights of the show include:

- The company's cloud-native journey, which accelerated with the acquisition of Uswitch.
- How the company assessed risk prior to their migration, and why they ultimately decided the task was worth the gamble.
- Uswitch's transformation into a profitable company resulting from their cloud-native migration.
- The role that multidisciplinary, collaborative teams played in solving problems and moving projects forward. Paul also offers commentary on some of the tensions that resulted between different teams.
- Key influencing factors that caused the company to adopt containerization and Kubernetes. Paul goes into detail about their migration to Kubernetes and the problems that it addressed.
- Paul's thoughts on management and prioritization as CTO. He also explains his favorite engineering tool, which may come as a surprise.

Links:
- RVU Website: https://www.rvu.co.uk/
- Uswitch Website: https://www.uswitch.com/
- Twitter: https://twitter.com/pingles
- GitHub: https://github.com/pingles

Transcript

Announcer: Welcome to The Business of Cloud Native podcast, where we explore how end users talk and think about the transition to Kubernetes and cloud-native architectures.

Emily: Welcome to The Business of Cloud Native. I'm your host, Emily Omier, and today I am chatting with Paul Ingles. Paul, thank you so much for joining me.

Paul: Thank you for having me.

Emily: Could you just introduce yourself: where do you work? What do you do? And include, sort of, some specifics. We all have a job title, but it doesn't always reflect what our actual day-to-day is.

Paul: I am the CTO at a company called RVU in London. We run a couple of reasonably big-ish price comparison, aggregator-type sites. So, we help consumers figure out and compare prices on broadband products, mobile phones, energy—so in the UK, energy is something which is provided through a bunch of different private companies, so you've got a fair amount of choice on that kind of thing. So, we try to make it easier and simpler for people to make better decisions on the household choices that they have. I've been there for about 10 years, so I've had a few different roles. So, as CTO now, I sit on the exec team and try to help inform the business and technology strategy. But I've come through a bunch of teams. So, I've worked on some of the early energy price comparison stuff, some data infrastructure work a while ago, and then some underlying DevOps-type automation and Kubernetes work a couple of years ago.

Emily: So, when you get in to work in the morning, what types of things are usually on your plate?

Paul: So, I keep a journal. I use bullet journaling quite extensively. So, I try to track everything that I've got to keep on top of. Generally, what I try to do each day is catch up with anybody that I specifically need to follow up with. So, at the start of the week, I make a list of every day, and then I also keep a separate column for just general priorities. So, things that are particularly important for the week, themes of work going on, like, technology changes, or things that we're trying to launch, et cetera. And then I will prioritize speaking to people based on those things. So, I'll try and make sure that I'm focusing on the most important thing. I do a weekly meeting with the team. So, we have a few directors that look after different aspects of the business, and so we do a weekly meeting to just run through everything that's going on and share any problems. We use the three Ps model: so, sharing progress, problems, and plans. And we use that to try and steer what we do. And we also look at some other team health metrics.

Yeah, it's interesting, actually. I think when I switched from working in one of the teams to being in the CTO role, things changed quite substantially. That list of things that I had to care about increased hugely, to the point where it far exceeded how much time I had to spend on anything. So, nowadays, I find that I'm much more likely to let some things drop off. And so it's unfortunate, and you can't please everybody, so you just have to say, “I'm really sorry, but this thing is not high on the list of priorities, so I can't spend any time on it this week, but if it's still a problem in a couple of weeks' time, then we'll come back to it.” But yeah, it can vary quite a lot.

Emily: Hmm, interesting. I might ask you more questions about that later. For now, let's sort of dive into the cloud-native journey. What made RVU decide that containerization was a good idea and that Kubernetes was a good idea? What were the motivations, and who was pushing for it?

Paul: That's a really good question. So, I got involved about 10 years ago. So, I worked for a search marketing startup in London called Forward Internet Group, and they acquired Uswitch in 2010. And prior to working at Forward, I'd worked as a consultant at ThoughtWorks in London, so I spent a lot of time working in banks on continuous delivery and things like that. And so when Uswitch came along, there were a few issues around the software release process. Although there was a ton of automation, it was still quite slow to actually get releases out. We were only doing a release every fortnight. And we also had a few issues with the scalability of data. So, it was a monolithic Windows Microsoft stack. So, there were SQL Server databases, and .NET app servers, and things like that. And our traffic can be quite spiky, so when companies are in the news, or there are policy changes and things like that, we would suddenly get an increase in traffic, and the Microsoft solution would just generally kind of fall apart as soon as we hit some kind of threshold.

So, I got involved, partly to try and improve some of the automation and release practices, because at the search startup we were releasing experiments every couple of hours, even. And so we wanted to try and take a bit of that ethos over to Uswitch, and also to try and solve some of the data scalability and system scalability problems. And when we got started doing that, a lot of it was—so that was in the early heyday of AWS. So, this was about 2008 that I was at the search startup, and we were used to using EC2 to spin up Hadoop clusters and a few other bits and pieces that we were playing around with. And when we acquired Uswitch, we felt like it was quickest for us to just create a different environment, stick it under the load balancer so end users wouldn't realize that some requests were being served off of the AWS infrastructure instead, and then just gradually go from there. We found that that was just the fastest way to move. So, I think it was interesting, and it was both a deliberate move, but it was also—I think the degree to which we followed through on it, I don't think we'd really anticipated quite how quickly we would shift everything.

And so when Forward made the acquisition—I joined summer of 2010—myself and a colleague wrote a little two-pager on: here are the problems we see, here are the things that we think we can help with, the ways the technology approach that we'd applied at Forward would carry across, and what benefits we thought it would bring. Fortunately, because Forward was a privately held business—we were relatively small but profitable—the owner of that business was quite risk-affine. He was quite keen on playing blackjack and other stuff. So, he was pretty happy with talking about probabilities of success.

And so we just said, we think there's a future in it if we can get the wheels turning a bit better. And he was up for it. He backed us, and we just took it from there. And so we replaced everything, from self-hosted physical infrastructure running on top of .NET, to all AWS-hosted, running a mix of Ruby, and Clojure, and other bits and pieces, in about two years. And that's just continued from there. So, the move to Kubernetes was a relatively recent one; that was only within the last—I say ‘recent,' it was about two years ago that we started moving things in earnest. And then you asked what was the rationale for switching to Kubernetes—

Emily: Let me first ask you, when you were talking with the owner, what were the odds that you gave him for success?

Paul: [laughs]. That's a good question. I actually don't know. I think we always knew that there was a big impact to be had. I don't think we knew the scale of the upside. So, I don't think we—I mean, at the time, Uswitch was just about breaking even, so we didn't realize that there was an opportunity to radically change that. I think we underestimated how long it would take to do. So, I think we'd originally thought that we could replace maybe most of the stuff that we needed to replace within six months. We had an early prototype out within two or three weeks, because we'd always placed a big emphasis on releasing early, experimenting, iterative delivery, A/B testing, that kind of thing. So, I think it was almost like that middle term that was the harder piece. And there was definitely a point where… I don't know, I think it was this classic situation of pulling on a ball of string, where what we wanted to do was to focus on improving the end-user experience, because our original belief was that, aside from the scalability issues, the existing site just didn't solve the problem sufficiently well—that it needed an overhaul to simplify the journeys, and simplify the process, and improve the experience for people. We were focusing on that, and we didn't want to get drawn into replacing a lot of the back office and integration-type systems, partly because there was a lot of complexity there, but also because you then have to engage with QA environments, and test environments, and sign-offs with the various people that we integrate with. But it was, as I said, this kind of tugging on a ball of string, where every improvement that we made in the end-user experience—so we would increase conversion rate by 10 percent, but through doing that, we would introduce downstream errors in the ways that those systems would integrate—meant we gradually just ended up having to pull in slightly more and more pieces to make it work. I don't think we ever gave odds of success. I think we underestimated how long that middle piece would take. I don't think we really anticipated the degree of upside that we would get as a consequence, through nothing other than just making releases quicker and being able to test and move faster; and focusing on end-user experience was definitely the right thing to focus on.

Emily: Do you think, though, that everybody perceived it as a risk? I'm just asking because you mentioned the blackjack—was this a risk that could fail?

Paul: Well, I think the interesting thing about it was that we knew it was the right thing to do. So, again, I think our experience as consultants at ThoughtWorks was in applying continuous delivery—what we would today call DevOps—applying those practices to software delivery. And so we'd worked on systems where there weren't continuous integration servers and where people weren't releasing every day, and then we'd worked in environments where we were releasing every couple of hours, and we were very quickly able to hone in on what worked and discard things that didn't. And so I think because we'd been able to demonstrate that success within the search business, that carried a great deal of trust. And so when it came to talking about things we could potentially do, we were totally convinced that there were things that we could improve. I think it was a combination of: there was a ton of potential; we knew that there was a new confluence of technologies and approaches that could be successful if we were able to just start over; and then, I think, also probably a healthy degree of naive overconfidence in what we could do, so that we would just throw ourselves into it. So, it was hard work, but yeah, it was ultimately highly successful. So, it's something I'm exceedingly proud of today.

Emily: You said something really interesting, which is that Uswitch was barely profitable. And if I understand correctly, that changed for the better. Can you talk about how this is related?

Paul: Yeah, sure. I think the interesting thing about it was that we knew that there was something we could do better, but we weren't sure what it was. And so the focus was always on being able to release as frequently as we possibly could, to try and understand what that was, as well as trying to just simplify and pay back some of the technical debt—to overcome some of the artificial constraints that existed because of the technology choices that people had made. They were perfectly decent decisions back in the day, but platforms like AWS now offered better alternatives. So, we just focused on being able to deliver iteratively, and kept focusing on continual improvement: releasing, understanding what the problems were, and then getting rid of those little niggly things. The manager I had at Forward was this super—I don't know, he just had the perfect ethos, and he was driven—so we were a team that was focused on doing daily experiments. And so we would rely on data on our spend and data on our revenue. And that would come in on a daily cycle, so a lot of the rhythm of the team was driven off of that cycle. And so as we could run experiments and measure their profitability, we could then inform what we would do on the day.

And so, we had a handful of long-running technology things that we were doing, and then we would also have other tactical things that he would have ideas on; he would have some hypothesis of, well, “Maybe this is the reason that this is happening; let's come up with a test that we can use to try and figure out whether that's true.” We would build something quickly, throw it together to help us either disprove it or support it, put it live, see what happened, and then move on to the next thing. And so what we wanted to do was to instill a bit of that environment in Uswitch. And so a lot of it was being able to release quickly and making sure that people had good data in front of them. I mean, even tools like Google Analytics were something which we were quite au fait with using but which didn't have broad adoption at the time. And so we were using that to look at site behavior and what was going on, and reason about what was happening. So, we just tried to make sure that people were directly using that, rather than making changes on a longer cycle without data at all.

Emily: And can you describe how you were working with the business side, and how you were communicating—what the working relationship was like? Were there any misunderstandings on either side?

Paul: Yeah, it's a good question. So, when I started at Uswitch, the organizational structure was, I guess, relatively classical. So, you had a pooled engineering team. So, it was a monolithic system, deployed onto physical infrastructure. So, there was an engineering team, there was an operations team, and then there were a handful of people that were business-specific in the different markets that we operated in. So, there were a couple of people that focused on, like, the credit card market, and a couple of people that focused on energy, for example. And I used to call it the stand-up swarm: in the morning, we would sit on our desks and you would see almost the entire office move between the different card walls that were based around the office. Although there was a high degree of interaction between the business stakeholders, the engineers, designers, and other people, it always felt slightly weird that you would have almost all of the company interested in almost everything that was going on. And so I think the intuition we had was that a lot of the ways that we would think about structuring software—loosely coupled but highly cohesive—were principles that should or could apply to the organization itself. And so what we tried to do is to make sure that we had multidisciplinary teams that had the people in them to do the work. So, in the early days of the energy work, there were only a couple of us in it. So, we had a couple of engineers, and we had a lady called Emma, who was the product owner. She used to work in the production operations team, focused on data entry from the products that different energy providers would send us, but she had the strongest insight into the domain problem: what problem consumers were trying to overcome, and what ways we could react to it. And so, when we got involved, she had a couple of ideas that she'd been trying to get traction on, that she'd been unable to. And so we had a, I don't know, probably a half-day session in an office. So, we took over the boardroom at the office and just said, “Look, we could really do with a separate space away from everybody to be able to focus on it. And we just want to prove something out for a couple of weeks. And we want to make sure that we've got space for people to focus.”

And so we had a half-day in there, and we had a conversation about, “Okay, well, what's the problem? What's the technical complexity of going after any of these things?” And there were a few nuances, too. Like, if you choose option A, then we have to get all of the historical information around it, as well as the current products in the market, whereas if we choose option B, then we can simplify it down, we don't need to do all of that work, and we can try and experiment with something sooner. So, we wanted it to be as collaborative as possible, because we knew that the way we would be successful was by trying to execute on ideas faster than we'd been able to before. And at the same time, we also wanted to make sure that there was a feeling of momentum—I think there was probably a healthy degree of slight overconfidence, but we were also very keen to be able to show off what we could do. And so we genuinely wanted to try and improve the environment for people, so that we could focus on solving problems quicker, trying out more experiments, being less hung up on whether it was absolutely the right thing to do, and instead just focusing on testing it.

So, were there tensions? I think there were definitely tensions; I don't think there were tensions so much on the technical side. We were very lucky that most of the engineers that already worked there were quite keen on doing something different, and so we would have conversations with them and just say, “Look, we'll try everything we can to try and remove as many of the constraints that exist today.” I think a lot of the disagreement or tension was whether or not it was the right problem to be going after. So, again, the search business that we worked in was making a decent amount of money for the number of people that were there, and we knew there was a problem we could fix, but we didn't know how much runway it would have. And so there was a lot of tension on whether we should be pulling people into extending the search business, or whether we needed to focus on fixing Uswitch. So, there was a fair amount of back and forth about whether or not we needed to move people from one part of the business to another, and that kind of thing.

Emily: Let's talk a little bit about Kubernetes: how Uswitch decided to use Kubernetes, what problem it solved, and who was behind the decision—who was really making the push.

Paul: Yeah, interesting. So, I think containers were something that we'd been experimenting with for a while. So, as I think a lot of the culture was, we were quite risk-affine. So, we were quite keen to be trying out new technologies, and we'd been using modern languages and platforms like Clojure since the early days of them being available. We'd been playing around with containers for a while, and I think we knew there was something in it, but we weren't quite sure what it was. So, although we were playing around with it quite early, I think we were quite slow to choose one platform or another. In the intervening period, we went from the more classical way of running Puppet across a bunch of EC2 instances that ran a version of your application to the next step after that, which was switching over to using ECS, Amazon's container service.

And I guess the thing that prompted a bit more curiosity about Kubernetes was that—I forget the project I was working on, but I was working on a team for a little while, and then I switched to go do something else. And I needed to put a new service up, and rather than just doing the thing that I knew, I thought, “Well, I'll go talk to the other teams. I'll talk to some other people around the company and find out what's the way that I ought to be doing this today.” And there was a lot of work around standardizing the way that you would stand up an ECS cluster. But I think even then, it always felt like we were sharing things in the wrong way. So, if you were working on a team, you had to understand a great deal of Amazon to be able to make progress. And so, back when I got started at Uswitch—when I talk about doing the work on the energy migration—AWS at the time really only offered EC2, load balancers, firewalling, and then eventually relational databases, so back then the amount of complexity to stand up something was relatively small. Come to a couple of years ago, you have to appreciate and understand routing tables, VPCs, and the security rules that would permit traffic to flow between those. It was just relatively non-trivial to do something that was so core to what we needed to be able to do.

And I think the thing that prompted Kubernetes was that, on the Kubernetes project side, we'd seen a gradual growth and evolution of the concepts, and abstractions, and APIs that it offered. And so there was a differentiation between ECS or—I actually forget what CoreOS's equivalent was; I think maybe it was just called CoreOS. But there were a few alternative offerings for running containerized, clustered services, and Kubernetes seemed to take a slightly different approach, in that it was more focused on end-user abstractions. So, you had the notion of making a deployment: that would contain replicas of a container, and you would run multiple instances of your application, and then that would become a service, and you could then expose that via Ingress. So, there was a language that you could use to talk about your application and your system that was available to you in the environment that you're actually using. Whereas AWS, I think, would take the view that, “Well, we've already got these building blocks, so what we want our users to do is assemble the building blocks that already exist.” So, you still have to understand load balancers, you still have to understand security groups—you have to understand a great deal more at a slightly lower level of abstraction. And I think the potential of Kubernetes was that if we chose something that offered better concepts, then you could reasonably have a team that would run some kind of underlying platform, and then have teams build upon that platform without having to understand a great deal about what was going on inside. They could focus more on the applications and the systems that they were hoping to build. And that would be slightly harder with the alternative. So, I think at the time, again, it was one of those fortunate things where I was just coming to the end of another project and was in the fortunate position where I was just looking around at the various different things that we were doing as a business, and what opportunity there was to do something that would help push things on.
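[Editor's note: the end-user vocabulary Paul describes—a Deployment managing replicas of a container, selected by labels and later exposed as a Service—is visible directly in the Kubernetes API. A minimal sketch using the client-go library; the app name, image, and namespace here are made up for illustration, and it assumes a cluster reachable via ~/.kube/config:]

```go
package main

import (
	"context"
	"fmt"

	appsv1 "k8s.io/api/apps/v1"
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Load credentials from the standard kubeconfig location.
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	clientset, err := kubernetes.NewForConfig(config)
	if err != nil {
		panic(err)
	}

	replicas := int32(3)
	labels := map[string]string{"app": "comparison-web"} // hypothetical app name

	// The Deployment speaks in application terms: replicas, labels, ports —
	// not load balancers, security groups, or routing tables.
	deploy := &appsv1.Deployment{
		ObjectMeta: metav1.ObjectMeta{Name: "comparison-web"},
		Spec: appsv1.DeploymentSpec{
			Replicas: &replicas,
			Selector: &metav1.LabelSelector{MatchLabels: labels},
			Template: corev1.PodTemplateSpec{
				ObjectMeta: metav1.ObjectMeta{Labels: labels},
				Spec: corev1.PodSpec{
					Containers: []corev1.Container{{
						Name:  "web",
						Image: "example.com/comparison-web:1.0", // hypothetical image
						Ports: []corev1.ContainerPort{{ContainerPort: 8080}},
					}},
				},
			},
		},
	}

	created, err := clientset.AppsV1().Deployments("default").Create(
		context.TODO(), deploy, metav1.CreateOptions{})
	if err != nil {
		panic(err)
	}
	fmt.Println("created deployment:", created.Name)
}
```

[A Service and an Ingress layered on top use the same label-based vocabulary, which is the platform-team/product-team split Paul goes on to describe.]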
And Kubernetes was one of those things which a couple of us had been talking about, and thinking, “Oh, maybe now is the time to give it a go. There's enough stability and maturity in it; we're starting to hit the problems that it's designed to address. Maybe there's a bit more appetite to do something different.” So, I think we just gave it a go. Built a proof of concept, showed that could run the most complex system that we had, and I think also did a couple of early experiments on the ways in which Kubernetes had support for horizontal scaling and other things which were slightly harder to put into practice in AWS. And so we did all that, I think gradually it just kind of growed out from there, just took the proof of concepts to other teams that were building products and services. We found a team that were struggling to keep their systems running because they were a tiny team. They only had, like, two or three engineers in. They had some stability problems over a weekend because the server ran out of hard disk space, and we just said, “Right. Well, look, if you use this, we'll take on that problem. You can just focus on the application.” It kind of just grew and grew from there.Emily: Was there anything that was a lot harder than you expected? So, I'm looking for surprises as you're adopting Kubernetes.Paul: Oh, surprises. I think there was a non-trivial amount that we had to learn about running it. And again, I think at the point at which we'd picked it up, it was, kind of, early days for automation, so there was—I think maybe Google had just launched Google Kubernetes engine on Google Cloud. Amazon certainly hadn't even announced that hosted Kubernetes would be an option. There was an early project within Kubernetes, called kops that you could use to create a cluster, but even then it didn't fit our network topology because it wouldn't work with the VPC networking that we needed and expected within our production infrastructure. So, there was a lot of that kind of work in the early days, to try and make something work, you had to understand in quite a level of detail what each component of Kubernetes was doing. As we were gradually rolling it out, I think the things that were most surprising were that, for a lot of people, it solved a lot of problems that meant they could move on, and I think people were actually slightly surprised by that. Which, [laughs], it sounds like quite a weird turn of phrase, but I think people were positively surprised at the amount of stuff that they didn't have to do for solving a fair few number of problems that they had. There was a couple of teams that were doing things that are slightly larger scale that we had to spend a bit more time on improving the performance of our setup. So, in particular, there was a team that had a reasonably strong requirement on the latency overheads of Ingress. So, they wanted their application to respond within, I don't know, I think it was maybe 200 milliseconds or something. And we, through setting up the monitoring and other bits and pieces that we had, we realized that Ingress currently was doing all right, but there was a fair amount of additional latency that was added at the tail that was a consequence of a couple of bugs or other things that existed in the infrastructure. So, there was definitely a lot of little niggly things that came up as we were going, but we were always confident that we could overcome it. And, as I said, I think that a lot of teams saw benefits very early on. 
And I think the other teams that were perhaps a little bit more skeptical because they got their own infrastructure already, they knew how to operate it, it was highly tested, they'd already run capacity and load tests on it, they were convinced that it was the most efficient thing that they could possibly run. I think even over the long run, I think they realized that there was more work that they needed to do than they should be focusing on, and so they were quite happy to ultimately switch over to the shared platform and infrastructure that the cloud infrastructure team run.Emily: As we wrap up, there's actually a question I want to go back to, which is how you were talking about the shifting priorities now that you've become CTO. Do you have any sort of examples of, like, what are the top three things that you will always care about, that you will always have the energy to think about? And then I'm curious to have some examples of things that you can't deal with, you can't think about. The things that tend to drop off.Paul: The top three things that I always think about. So, I think, actually, what's interesting about being CTO, that I perhaps wasn't expecting is that you're ever so slightly removed from the work, that you can't rely on the same signals or information to be able to make a decision on things, and so when I give the Kubernetes story, it's one of those, like, because I'd moved from one system to another, and I was starting a new project, I experienced some pain. It's like, “Right. Okay, I've got to go do something to fix this. I've had enough.” And I think the thing that I'm always paying attention to now, is trying to understand where that pain is next, and trying to make sure that I've got a mechanism for being able to appreciate that. So, I think a lot of the things I try to spend time on are things to help me keep track of what's going on, and then help me make decisions off the back of it. So, I think the things that I always spend time on are generally things trying to optimize some process or invest in automation. So, a good example at the moment is, we're talking about starting to do canary deployments. So, starting to automate the actual rollout of some new release, and being able to automate a comparison against the existing service, looking at latency, or some kind of transactional metrics to understand whether it's performing as well or different than something historical. So, I think the things that I tend to spend time on are process-oriented or are things to try and help us go quicker. One of the books that I read that changed my opinion of management was Andy Grove's, High Output Management. And I forget who recommended it to me, but somebody recommended it to me, and it completely altered my opinion of what value a manager can add. So, one of the lenses I try to apply to anything is of everything that's going on, what's the handful of things that are going to have the most impact or leverage across the organization, and try and spend my time on those. I think where it gets tricky is that you have to go broad and deep. So, as much as there are broad things that have a high consequence on the organization as a whole, you also need an appreciation of what's going on in the detail, and I think that's always tricky to manage. I'm sorry, I forgot what was the second part of your question.Emily: The second part was, do you have any examples of the things that you tend to not care about? 
That presumably someone is asking you to care about, and you don’t? Paul: [laughs]. Yeah, it’s a good question. I don’t think it’s that I don’t care about it. I think it’s that there are some questions that come my way that I know that I can defer, or they’re things which are easy to hand off. So, I think the… that is a good question. One of the things that are always tricky to prioritize are things which feel high consequence but are potentially also very close to bikeshedding. And I think that is something which is fair—I’d be interested to hear what other people said. So, a good example is, like, choice of tooling. When I was working on a team, or on a problem, we would focus on choosing the right tool for the job, and we would bias towards experimenting with tools early and figuring out what worked. Now you have to view the same thing through a different lens. So, there’s a degree to which you also incur an organizational cost as a consequence of having high variability in the programming languages that you choose to use. And so I don’t think it’s something I don’t care about, but it is something which, over the time I’ve been doing this role, I’ve gradually learned to let go of: things that I would previously have thoroughly enjoyed getting involved in. And so you have to step back and say, “Well, actually, I’m not the right person to be making a decision about which technology this team should be using. I should be trusting the team to make that decision.” I think that over the time I’ve been doing the role, you learn which are the decisions that are high consequence that you should be involved in, and which are the ones that you have to step back from. And you just have to say, look, I’ve got two hours of unblocked time this week where I can focus on something, so of the things on my priority list—the things that I’ve written in my journal that I want to get done this month—which of those things am I going to focus on, and which of the other things can I leave other people to get on with, and trust that things will work out all right? Emily: That’s actually a very good segue into my final question, which is the same for everyone. And that is, what is an engineering tool that you can’t live without—your favorite? Paul: Oh, that’s a good question. So, I don’t know if this is a cop-out by not mentioning something engineering-related, but I think the tool and technique which has helped me the most, as I had more and more management responsibility and was trying to keep track of things, is bullet journaling. Up until, I don’t know, maybe five years ago, probably, I’d focused on using either iOS apps or note tools on my laptop, phone, and so on, and it never really stuck. Bullet journaling, through using a pen and a notepad, forced me to go a bit slower. It forced me to write things down, to think through what was going on, and there is something about it being physical which makes me treat it slightly differently. So, I think bullet journaling is one of the things which has had the—yeah, it’s really helped me deal with keeping track of what’s going on, and then given me the ability to look back over the week, figure out what were the things that frustrated me, and what I can change going into next week. One of the practices that the person who came up with bullet journaling recommends is this idea of an end-of-week reflection.
And so, one of the things I try to do—it’s been harder doing it now that I’m working at home—is to spend just 15 minutes at the end of the week thinking of, what are the things that I’m really proud of? What are some good achievements that I should feel really good about going into next week? And so I think a lot of the activities that stem from bullet journaling have been really helpful. Yeah, it feels like a bit of a cop-out because it’s not specifically technology related, but bullet journaling is something which has made a big difference to me. Emily: Not at all. That’s totally fair. I think you are the first person who’s had a completely non-technological answer, but I think I’ve had someone answer Slack, something along those lines. Paul: Yeah, I think what’s interesting is that there are loads of those tools that we use all the time. Like, Google Docs is something I can’t live without. So, there’s a ton of things that I use day-to-day that are hard to let go of, but I think the things that have made the most impact are the ones that help me deal with a stressful job and give me the ability to manage myself a little bit. Yeah, it’s been one of the most interesting things I’ve done. Emily: And where could listeners connect with you or follow you? Paul: Cool. So, I am @pingles on Twitter. My DMs are open, so if anybody wants to talk on that, I’m happy to. I’m also on GitHub under pingles, as well. So, @pingles, generally in most places, will get you to me. Emily: Well, thank you so much for joining me. Paul: Thank you for talking. It’s been good fun. Announcer: Thank you for listening to The Business of Cloud Native podcast. Keep up with the latest on the podcast at thebusinessofcloudnative.com and subscribe on iTunes, Spotify, Google Podcasts, or wherever fine podcasts are distributed. We’ll see you next time.
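Paul’s canary idea above boils down to a small gate: fetch the same latency metric for the stable release and the canary, compare them, and only continue the rollout if the canary is close enough. Here is a minimal sketch of that comparison in Go, using the Prometheus client library; the metric name, the `track` label, the Prometheus address, and the 10% tolerance are all invented for illustration, since Paul only mentions "latency, or some kind of transactional metrics".

```go
// canary_check.go - sketch of an automated canary comparison: query p99
// latency for the "stable" and "canary" tracks and decide whether to promote.
package main

import (
	"context"
	"fmt"
	"log"
	"time"

	"github.com/prometheus/client_golang/api"
	promv1 "github.com/prometheus/client_golang/api/prometheus/v1"
	"github.com/prometheus/common/model"
)

// p99 fetches the 99th-percentile request latency for one deployment track.
func p99(ctx context.Context, q promv1.API, track string) (float64, error) {
	query := fmt.Sprintf(
		`histogram_quantile(0.99, sum(rate(http_request_duration_seconds_bucket{track=%q}[5m])) by (le))`,
		track)
	val, _, err := q.Query(ctx, query, time.Now())
	if err != nil {
		return 0, err
	}
	vec, ok := val.(model.Vector)
	if !ok || len(vec) == 0 {
		return 0, fmt.Errorf("no samples for track %q", track)
	}
	return float64(vec[0].Value), nil
}

func main() {
	client, err := api.NewClient(api.Config{Address: "http://prometheus:9090"})
	if err != nil {
		log.Fatal(err)
	}
	q := promv1.NewAPI(client)
	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
	defer cancel()

	stable, err := p99(ctx, q, "stable")
	if err != nil {
		log.Fatal(err)
	}
	canary, err := p99(ctx, q, "canary")
	if err != nil {
		log.Fatal(err)
	}

	// Promote only if the canary's tail latency is within 10% of stable.
	if canary <= stable*1.10 {
		fmt.Printf("canary ok: p99 %.3fs vs stable %.3fs, continue rollout\n", canary, stable)
	} else {
		fmt.Printf("canary regressed: p99 %.3fs vs stable %.3fs, roll back\n", canary, stable)
	}
}
```

In practice, tools such as Argo Rollouts or Flagger run this kind of loop as part of a progressive rollout; the sketch only shows the shape of the decision.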
KubeCon + CloudNativeCon sponsored this podcast. The New Stack will shortly launch the latest edition of “The State of the Kubernetes Ecosystem,” a follow-up to the first edition of the ebook published in 2017. Ahead of its publication, The New Stack spoke with Kelsey Hightower, principal developer advocate at Google Cloud and likely one of the most recognized voices in the Kubernetes space. In this edition of The New Stack Makers podcast, hosted by Alex Williams, founder and publisher of The New Stack, Hightower spoke about his role in Kubernetes since the beginning, his thoughts on the project’s leadership today, and the challenges that lie ahead. During the early days of Kubernetes, there “were no ebooks available” on the subject, Hightower said. The main goal was to help “raise the profile of the people with the job of trying to manage applications." “I think the whole point was when I was showing Kubernetes off [as] a contributor [and] building things around the ecosystem, my product work at CoreOS — we were all trying to solve problems that we all had in the past,” Hightower said. “We were trying to uplift the community. We were pretty sure that technology was going to be okay over time.”
Jeff Gray brings 20 years of experience in sales, software delivery, engineering, operations, and strategy. Most recently he was VP of Business Operations at Domino Data Lab, a machine learning platform that focuses on the needs of enterprises with large teams of code-first data scientists; the company is funded by Sequoia and several other VCs. Prior to that, Jeff was VP of Business Operations at CoreOS, a leader in the Kubernetes community that was acquired by Red Hat for $250M in January 2018. In this episode we talk about scaling a startup from $1 million to $10 million in revenue.
History of Tirefi.re. Immutable OSs: is that really something worth looking into for servers? Could you use ChromeOS's philosophy on servers? Flatpak! Love or hate? CoreOS go boom boom. Fedora CoreOS and Silverblue vs. Flatcar Container Linux. Breakfast: how do you do it? Eggs or no eggs? “Balanced breakfast” - is that still a thing? Books vs. ebooks vs. audiobooks. Have the boom boom times changed how you consume? When do you listen to podcasts now? Special Guest: Shaun Mouton.
Ardour 6 is out with major changes under the hood, CoreOS Container Linux is officially unmaintained, TeleIRC version 2.0.0 lands with a complete rewrite, the FIDO Alliance launches an instructional campaign, and PeerTube outlines its newest fundraising goals.
This interview was recorded when Red Hat acquired CoreOS. At the time, Fedora shipped Fedora Atomic and Fedora Atomic Host, and we talk about Fedora Silverblue and Fedora CoreOS, both brand new at the time.
In this episode Rich speaks with Verónica López from Digital Ocean. Topics include: How Verónica learned to love math and computers, her experience working at CoreOS in its heyday, participating on the Kubernetes CI Signal team, working on the API Engineering team at Digital Ocean, and her advice for people that are new to the Kubernetes community.
If you’re running Kubernetes, you’re running etcd. The distributed key-value store was started as an intern project at CoreOS by Xiang Li, who is still maintaining it but now working on infrastructure at Alibaba. Xiang joins your hosts to discuss. Do you have something cool to share? Some questions? Let us know: web: kubernetespodcast.com mail: kubernetespodcast@google.com twitter: @kubernetespod Chatter of the week Getting toilet paper be like So, stay at home and play with free synth apps! Korg Kaossilator: download for Android or iOS MiniMoog Model D: download for iOS iSongs on YouTube News of the week vSphere 7 and VMware Tanzu announcements Docker announces new strategy and roadmap Hitachi Vantara acquires Containership’s assets Containership’s since-removed “goodbye” post Lens, now from Lakend Labs KEDA and SMI join the CNCF Sandbox AWS Bottlerocket blog post and GitHub repo Enable encryption on App Mesh with custom or ACM certs EKS supports Kubernetes 1.15 Firecracker thread by Micah Hausler gVisor thread by Ian Lewis Kublr adds rolling upgrades Google Cloud moves to its own ACME certificate provider GKE Workload Identity is GA Analysis of Redis operators by Flant Bank Vaults 1.0 and HSM support by Banzai Cloud CNCF joins Google Summer of Code Lifemiles case study Rancher Labs raises $40m Episode 57, with Darren Shepherd Links from the interview etcd etcd on GitHub How Kubernetes uses etcd The history of etcd, including the famous garage Built to handle upgrading CoreOS Container Linux nodes Prior art: Zookeeper: too much JVM Doozer: not enough community Chubby: too private to Google Paxos The paper Paxos Made Live - An Engineering Perspective Multi-Paxos raft The paper Announcing etcd Ben and Blake etcd3 moved from a tree keyspace to flat keyspace Latest version: etcd 3.4 etcd and Kubernetes at Alibaba: Demystifying Kubernetes as a Service – How Alibaba Cloud Manages 10,000s of Kubernetes Clusters Performance optimization of etcd in web scale data scenario The first etcd operator created by Xiang Jepsen tests of 0.4.1 and 3.4.3 CNCF to host etcd in December 2018 etcd roadmap Xiang Li on GitHub Xiang Li on Twitter
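One thread from those show notes is worth making concrete: etcd3 moved from a tree keyspace to a flat keyspace, so etcd2's notion of listing a directory becomes a range read over a key prefix, and watches work the same way. Here is a small hedged sketch using the official Go client (`go.etcd.io/etcd/client/v3`); the endpoint and key names are placeholders.

```go
// etcd_demo.go - put, prefix-read, and watch against the flat etcd3 keyspace.
package main

import (
	"context"
	"fmt"
	"log"
	"time"

	clientv3 "go.etcd.io/etcd/client/v3"
)

func main() {
	cli, err := clientv3.New(clientv3.Config{
		Endpoints:   []string{"localhost:2379"},
		DialTimeout: 5 * time.Second,
	})
	if err != nil {
		log.Fatal(err)
	}
	defer cli.Close()

	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()

	// Write two keys sharing a common prefix.
	if _, err := cli.Put(ctx, "/app/config/replicas", "3"); err != nil {
		log.Fatal(err)
	}
	if _, err := cli.Put(ctx, "/app/config/image", "example:v2"); err != nil {
		log.Fatal(err)
	}

	// Range-read everything under the prefix: the etcd3 replacement for
	// etcd2's directory listing.
	resp, err := cli.Get(ctx, "/app/config/", clientv3.WithPrefix())
	if err != nil {
		log.Fatal(err)
	}
	for _, kv := range resp.Kvs {
		fmt.Printf("%s = %s\n", kv.Key, kv.Value)
	}

	// Watch the same prefix for changes (runs until the process is stopped).
	for wresp := range cli.Watch(context.Background(), "/app/config/", clientv3.WithPrefix()) {
		for _, ev := range wresp.Events {
			fmt.Printf("%s %s -> %s\n", ev.Type, ev.Kv.Key, ev.Kv.Value)
		}
	}
}
```

Kubernetes keeps its own API objects under prefixes such as `/registry/...` and builds its watch machinery on this same primitive, which is why, as the episode title has it, if you're running Kubernetes, you're running etcd.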
Running Kubernetes on conventional operating systems is time-consuming and labor-intensive. Today’s guests Andrew Rynhard and Timothy Gerla have engineered a product that attempts to provide a solution to this problem. They call it Talos, and it is a modern OS designed specifically to host Kubernetes clusters, managed by a flexible and powerful API. Talos is completely stripped down to the bare components required to run Kubernetes and get information from the system. It stays updated by keeping pace with Kubernetes, but also provides the user with a large degree of control in the event that they might need to update a flag. In this episode, Andrew and Timothy get into some of the mechanics and thought processes behind Talos, telling us why they went with a read-only API, how they handle security concerns on the OS, and how a system like theirs might get adopted by the Kubernetes community and laypeople more broadly. They get into the advantages provided by a stripped-down solution for systematizing the use of Kubernetes across communities, and for running new components through clusters rather than on the OS itself. In a space where most participants are largely operating in the dark, it is a pleasure to see innovations like this display such lasting power, so make sure you check out this episode. Follow us: https://twitter.com/thepodlets Website: https://thepodlets.io Feedback: info@thepodlets.io https://github.com/vmware-tanzu/thepodlets/issues Guests: Andrew Rynhard https://twitter.com/andrewrynhard Tim Gerla https://twitter.com/tybstar Hosts: Carlisia Campos Bryan Liles Olive Power Key Points From This Episode: • What a Kubernetes OS is: a stripped-down OS that integrates with Kubernetes. • The difficulties of managing and getting Kubernetes installed on regular OSs. • Why a Kubernetes OS? Less attack surface and fewer OS compatibility issues. • What Talos does: quickly makes nodes part of a Kubernetes cluster by being stripped down. • How replacing SSH with an API alleviates some users’ security concerns. • A command-line interface called OSCTL that allows users to explore the API. • What does ‘stripped-down’ mean? Talos runs the kubelet and gets information from the OS. • The ability to run new components through clusters rather than from the OS. • How the Kubernetes OS evolves with Kubernetes but gets separately controlled too. • Better integrating into Kubernetes by abstracting OS features into Kubernetes as operators. • Security precautions: kernel hardening, SSH and Bash removal, and a read-only OS. • Usability of Talos for the average Joe, and its consistency across base platforms. • Possibilities for interacting with deeper levels of an OS through an API-managed OS. • How Talos might become appealing to laypeople: decreasing costs for porting to it. • Value gained from switching to a purpose-built OS as something which could outweigh costs. • Tendencies to hang onto tried and trusted tech even if its successors are superior.
Quotes: “To me, it’s just about abstracting away the operating system and not even having to worry about it anymore, and looking at Kubernetes and the entire cluster as an operating system.” — Andrew Rynhard [0:05:00] “As rapid as the technology is changing, you need an operating system that is going to evolve with it or at least the operations intelligence to evolve with Kubernetes right alongside it.” — Andrew Rynhard [0:13:08] “The challenge I think for us and for anybody changing the way that operating systems work is: is it better enough than what I have today or what I had before?” — @tybstar [0:26:50] “There’s a lot of companies out there who got us to this point in tech that don’t exist anymore, but if they didn’t do what they did, we would not be here right now.” — @bryanl [0:33:41] Links Mentioned in Today’s Episode: Talos — https://www.talos-systems.com/ Timothy Gerla — http://www.gerla.net/ Timothy Gerla on Twitter — https://twitter.com/tybstar Andrew Rynhard on LinkedIn — https://www.linkedin.com/in/andrewrynhard/ Andrew Rynhard on GitHub — https://github.com/andrewrynhard Jed Salazar on LinkedIn — https://www.linkedin.com/in/jedsalazar/ Bryan Liles on LinkedIn — https://www.linkedin.com/in/bryanliles/ Carlisia Campos on LinkedIn — https://www.linkedin.com/in/carlisia/ Red Hat — https://www.redhat.com/en/technologies/linux-platforms/enterprise-linux Arch — https://www.archlinux.org/ Debian — https://www.debian.org/ Linux — https://www.linux.org/ Bell Labs — http://www.bell-labs.com/ AT&T — https://www.att.com/ Transcript: EPISODE 20 [INTRODUCTION] [0:00:08.7] ANNOUNCER: Welcome to The Podlets Podcast, a weekly show that explores Cloud Native one buzzword at a time. Each week, experts in the field will discuss and contrast distributed systems concepts, practices, tradeoffs and lessons learned to help you on your Cloud Native journey. This space moves fast and we shouldn’t reinvent the wheel. If you’re an engineer, operator or technically minded decision maker, this podcast is for you. [INTERVIEW] [00:00:41] CC: Hi, everybody. Welcome back to the Podlets. Today we have a special episode, and on the show, we have a special guest, Andrew Rynhard. Say hi, Andrew. [00:00:53] AR: Hello, how are you? [00:00:55] CC: We also have Timothy Gerla. Say hi, Tim. [00:00:58] TG: Hi. Thanks for having me. [00:01:00] CC: Yeah. Andrew and Timothy are from Talos. Andrew dropped an issue on our GitHub repo and here we are. It was a great suggestion. What we’re going to talk about today is what they are working on, which is a Kubernetes operating system. We have tons of questions for them for sure. We also have a special participant on the episode today as a co-host, Jed Salazar. Hi, Jed. [00:01:28] JS: Hey, everyone. Jed Salazar here from the CRE team here at VMware. [00:01:31] CC: And Bryan Liles. [00:01:32] BL: Hi. [00:01:33] CC: Hi. And me, Carlisia Campos. Who’d like to get the party started and kick this off? [00:01:41] BL: Oh, I’m here. Let’s throw the gauntlet down. We’re talking about Kubernetes operating systems today. I have an operating system, a Mac, or I have Linux. I can run Kubernetes. What is a Kubernetes operating system and why should I even be thinking about this? [00:01:58] AR: Sure. I’d like to think about a Kubernetes operating system as an operating system that has been stripped down to the absolute bare minimum to run Kubernetes. Everything that is required to run the kubelet, and essentially that’s it, at least in my opinion. It should be super minimal to start with.
Second of all, I also think that it should integrate with Kubernetes as well. The combination of being able to strip down Linux as we know it as small as possible, and then actually integrating with Kubernetes itself, using APIs to figure out things about itself – that, in my opinion, is what I would call a Kubernetes OS. [00:02:42] BL: Interesting. Okay. Now that we know a little bit about Kubernetes operating systems – and like I said, I’m starting in early today as the devil’s advocate. Now, like I said before, I have a Mac and I have Linux and I have Windows on my desktop. There’ve been lots of efforts from lots of people trying to get Kubernetes running on Ubuntu or Fedora, and it’s cool that you’re trying to slim this down, but really, why would I look at a Kubernetes operating system over my Linux that I’m familiar with? I like Ubuntu and Debian. [00:03:17] AR: Sure. That’s a great question. It’s one we get a lot. I like to think that you actually just get less operational overhead when you have a Kubernetes-specific operating system. I think that Kubernetes itself is a job – managing it, getting it installed, unfortunately. It’s getting better, but it’s still a job at the end of the day. Having to manage Kubernetes and the operating system, everything that you need to pass compliance on the operating system, getting all the packages installed – these are all things that we kind of know that Kubernetes needs already, and yet we’re still having to go in and apt install whatever we might need to get Kubernetes up and running. The idea with a Kubernetes operating system, in my mind, is that we should stop worrying about the individual node, the underlying operating system, and start looking at Kubernetes as a whole as a giant machine, and we just add machines – nodes – to this giant machine to give us extra resources. The less that we have to care about the machine or the underlying operating system, the better, in my mind. We get to focus on Kubernetes. Not only that, but because it’s minimal, you get a smaller attack surface. There’re just not things there that you would otherwise have to worry about. I’ve done Kubernetes for three years now, and having to go in and worry about updating packages that are just completely unrelated, it’s something that I think we shouldn’t have to do anymore. If you’re dedicated to running your apps and your stack in Kubernetes, then why are we going in and managing the nodes on an individual basis – and, for that matter, managing things that don’t really have any relevance for running Kubernetes? To me, it’s just about abstracting away the operating system and not even having to worry about it anymore, and looking at Kubernetes and the whole cluster as an operating system. We can’t really get there if we’re having to worry about the two jobs of managing both at the same time. [00:05:17] JS: Andrew, can I ask a follow-up question? [00:05:18] AR: Sure. [00:05:20] JS: I fully agree with all of those statements. I think a general purpose operating system might not be the best fit for a specific role, like being a Kubernetes node. As you mentioned, you have to deal with all the various packages that might be beneficial to you if you’re running it for some general purpose. It’s really supposed to be running a workload as a Kubernetes node, so you can scope that down.
I’m just wondering, when you make this pitch or let these folks know – how do you get folks to relinquish their desire to have full control over their operating system, being able to install their own security management processes on it, or who are a little bit shy about not being able to SSH or use their common patterns of operating system management? [00:06:09] AR: Oh, that’s a great question. I can take this in two parts. Let me first of all talk about the fact that people do want to run things on the host. My answer back is always: can you run it in Kubernetes? Kubernetes is sort of your package manager, if you will. They sit back usually and they’re like, “Hmm. Yeah, I probably could.” If you need to run something on every host, Kubernetes has something for that: it’s a daemon set. Run it on Kubernetes and call it a day. This isn’t something that’s going to work for absolutely everything, I imagine – nothing in the world is like that. But I think for the majority of the use cases out there, for the things that people want to run on the host, you could actually just run it in Kubernetes itself. As far as SSH – and for those that don’t really know what we’ve done in Talos – we’ve actually stripped Talos down to just the kernel and a small Golang binary whose whole goal, basically, is to create a Kubernetes cluster, or make a node part of a Kubernetes cluster, as fast as possible, and that’s really it. We’ve gone so far as ripping out Bash and SSH, and we’ve actually replaced that with an API. My answer back to the SSH question is always: what is it that you’re really trying to get out of SSH? Nine times out of 10, it’s, “I want to get information about what’s wrong. I want to do troubleshooting.” If our answer back to them is, “Oh! We have an API for that,” then at the end of the day, it’s really the information that you’re after. It’s not necessarily that you need SSH to do that. You need a way to get this information without necessarily having to sit there and wait for a Prometheus metric that gets pulled every minute. You want something right on the spot. You want to ask a question and you want to get an immediate answer. I feel like we can answer that with an API. That tends to satisfy the desire for SSH most of the time. I mean, as you said, people are still going to want to hold on to it, but I think over time we’re going to have to educate people that this is a better way. It’s a read-only API that gives operations engineers a way to get the information that they would otherwise get by SSH-ing in and asking, via Unix utilities, for what they want to know. [00:08:27] CC: When you say an API, are you also giving them a command line tool, like in the case of Talos, or only an actual API? [00:08:37] TG: Yeah, we do provide a command line interface to the API. It’s called OSCTL, and it basically wraps our API. Our intention is that it will be used for exploration of the system, automation through scripting languages, etc. Then as you get more sophisticated with your environment, you might begin to build your own tools that interact directly with that API. [00:08:56] CC: Cool. Yeah, this is a really cool subject. I wasn’t even aware that a Kubernetes operating system was a thing until really recently, and I don’t remember how I came across it.
One question I have is, Andrew, you were saying, “Well, we strip down Kubernetes to the bare minimum.” How opinionated is it in your specific case? When you say it’s stripped down to the bare minimum – this statement of bare minimum – would there be a consensus in the community that, yes, this set of functionality is the bare minimum? Or is it your opinion of what the bare minimum should be? [00:09:38] AR: Sure. I think at the absolute bare minimum, we need to run the kubelet. In my mind, that’s really all we need, but you still have this practical issue of, like you said, needing to get information off that machine. You need to be able to manage Kubernetes without needing Kubernetes – a chicken-and-egg problem. That’s where the API was actually born. When I started Talos, I actually just built a very minimal, stripped-down rootfs where all it did was run the kubelet. But figuring out why the kubelet wasn’t running successfully obviously was not very easy. I figured, “You know what? Let’s put an API in front of this. I want to keep this as minimal as possible. I want to keep this read-only.” I threw an API in front of it. I think you need two things, really. You need to have what’s required by the kubelet – you need a CNI, you need all the utilities that the kubelet will run – and you also need a way to query the system. Where other minimal operating systems have decided to ship SSH and all the classic utilities that we all know and love, we went another route with an API. But I don’t think the operating system – the rootfs – should have any more than what’s required by the kubelet. That would be the pie-in-the-sky dream right there. [00:11:01] CC: The two questions that come to my mind are: if I wanted to add Kubernetes components to that, would it be possible? And if I wanted to add anything to the operating system, would it be possible? I think the second question you already answered, which is – well, correct me if I’m wrong – if you need to run something on the operating system that’s not there, you can run it in the actual cluster. [00:11:27] AR: Yeah, that’s the idea: Kubernetes gives us the APIs to do that. We can schedule to specific nodes. We can schedule to a class of nodes. We can schedule to every single node. I think that you can actually handle a lot of the use cases out there for any kind of application with Kubernetes itself. I think that’s really strong, because you get one single, consistent API for managing your infrastructure. I want to deploy applications for this team or this team – at the end of the day, everything is just declarative and Kubernetes will make it happen. You don’t have to worry about the scheduling and all of these different things. The only thing that the operating system is concerned about is making that machine available to the Kubernetes cluster. [00:12:10] BL: This idea of slimmed-down operating systems, it’s not a new one. CoreOS was doing this years ago. One issue that CoreOS ran into was, “Well, what’s current?” Well, it depends on what stream you’re on. How do you manage keeping everything up-to-date? [00:12:28] AR: Our goal is to keep pace with Kubernetes, essentially. I know that, traditionally, there’s long-term support and there’s all these different ways of releasing different versions of an operating system, but Kubernetes isn’t really there yet. There is no notion that I know of, of LTS in Kubernetes yet.
There’s just, I believe, N-2 or something like that, where they actually offer official support. I think that the operating system is bound to that. I think that it needs to follow Kubernetes as closely as possible. There’re constantly different feature gates being opened up; there’re things being graduated to GA. Especially at this time right now, as rapidly as the technology is changing, you need an operating system that is going to evolve with it, or at least the operations intelligence to evolve with Kubernetes right alongside it. [00:13:20] BL: So that brings up an interesting point. I mean, there are two things here. There’s the operating system itself and there’s Kubernetes. Do they upgrade in lockstep or are they upgraded separately? [00:13:29] AR: I can only speak for ourselves. There are people that, I think, actually have upgrades be kind of one and the same, where the operating system and a Kubernetes upgrade both happen. We’ve decided to go the other route, where we actually want to evolve our APIs sort of independently, but then give you a way to still manage Kubernetes on its own. We’ve actually done self-hosted Kubernetes. In Talos, we’ll actually bootstrap a lightweight control plane – a small control plane – and then we’ll spin up another control plane using the Kubernetes API. Now Kubernetes upgrades simply look like a kubectl edit: I’m going to update my daemon set for my API server. Then from there, you basically have to update the kubelet. We use hyperkube for the kubelet; you have to tell Talos, “Use this kubelet image next time you boot.” We’ve separated the two, I think for good reason. I think that the two should be able to evolve independently, to give a little bit more power back to the user. If you combine them – if you couple them really closely – it becomes really, really opinionated. I think we should at least support what Kubernetes supports, and that’s the N-2, and leave it up to the user to configure Kubernetes, but we still have sane best practices out of the box. [00:14:54] BL: Yeah, that makes sense, because yesterday, what did we get? We got Kubernetes 1.15.10, and, I don’t know, a 1.16, but we got 1.17.3 yesterday too. You might not want to move, because you might not – 1.17 introduced a whole bunch of deprecations for custom resource definitions. You’re not ready to move yet. We were on beta 1 for a while for CRDs. I totally see why you moved in that direction. [00:15:20] AR: Yeah, that’s exactly it. We can’t impose too much opinion, but I think that we should drive the opinion at least up to, “Hey, don’t worry about what’s on this machine. I’m going to make it a Kubernetes node for you. Just tell me which version you want.” I think that’s where we should draw the boundary, and then we should still give the controls back to the user as far as what flags do I want to specify, what kind of feature gates – all these various things that you don’t get out of a lot of the different managed products out there. Hopefully we’ll be teetering right on the line of having that managed convenience, but still giving you the power and flexibility to update a flag if you need to. [00:16:04] CC: This episode is so in the style of an interrogation. It’s hilarious. [00:16:08] BL: That’s me. I’m digging in. [00:16:09] CC: I feel like – no, we are all digging in. It’s just because – at least speaking for myself – I’m super curious.
I wanted to ask you, Andrew: at the beginning you were saying that a Kubernetes operating system needs to integrate with Kubernetes, and I was sitting here thinking, “Integrate? It’s supposed to be Kubernetes.” What did you have in mind when you said that? Did you mean being able to interface with another Kubernetes cluster? Was that what you meant? [00:16:36] AR: Not quite. What I meant by that is there’s this really powerful thing that Kubernetes gives us in CRDs and this idea of operators, or controllers. You can actually have a way to use an operator or controller, say, for upgrading your operating system – which we have in Talos. It’s just an upgrade operator that lives in Kubernetes, knows how to talk to Kubernetes, knows how to talk to our API, and sort of orchestrates upgrades across the board. Part of that is, for example, when a Talos node receives the upgrade API call, it actually is aware: “Hey, I’m running Kubernetes. I’m going to cordon myself, because I know I’ve gotten this and I know that I’m not going to be able to schedule workload on me.” I think that’s just one example, but we could probably take that a lot farther one day. I would like to see everything that we know and love about our operating systems today essentially be abstracted and pushed up into Kubernetes as operators. There’s a lot of power in that, where you can actually orchestrate things – like I said, like upgrades. I think that’s one example of how we can integrate better with Kubernetes, as an operating system should, at least. [00:17:45] CC: Got you. [00:17:46] JS: I was wondering if we can maybe just pivot a little bit – maybe to satisfy my own curiosities – but I was kind of hoping we could talk a little bit about some of the selling features. Imagine I’m a hardened sysadmin or security team, and someone comes up and says, “Hey, I want to run this Kubernetes operating system.” Knowing what I know about the state of security today and operating systems, there’s a lot of effort to basically contain things – no pun intended. We have user space operating out of some type of sandbox. We have seccomp to limit syscalls. How does Talos approach security, maybe philosophically or maybe even down to the implementation details? What does security in Talos look like? [00:18:33] AR: Yeah. Again, our goal is basically – we want people to forget about the operating system. But to forget about the operating system, you have to know it’s secure. You have to go to great lengths to secure it, because you can’t forget about it otherwise. We actually go down to the kernel: we apply what’s called the Kernel Self Protection Project. We basically try to harden the kernel, and at boot time, we do a bunch of checks to make sure that your kernel is running at least most of those configurations. I think we have a little bit of work to do as far as enforcing all of them, but we do some checks to ensure that your kernel is compatible with KSPP, for example. That alone has a ton of benefits to it. It’s a statically compiled kernel, so it can’t do any kernel module loading and stuff like that – that’s completely prohibited. That alone just cuts off a lot of security issues in itself. Then, going up the stack further, we’ve actually stripped out SSH. We stripped out Bash. So you have nothing that you can really log on to anymore. Again, that just flat out removes a whole category of potential attacks.
Going even further than that, we actually have Talos running completely out of RAM, and it’s a squashfs, so it’s a read-only file system. The only thing that actually uses a disk is the kubelet. The idea is that we want to make the operating system, again, just go away. Having it read-only, I think, is a really strong thing – and squashfs in particular, because you can’t remount it or rewrite it if you’re a user, or something like that. Then up in Kubernetes, out of the box, we try to deploy it with all of the security best practices – the CIS benchmarks and all of that. We go all the way from the kernel, to our userland, and even to Kubernetes itself; we try to bring security best practices out of the box. I think that’s something I’d love to see for Kubernetes itself upstream, but for now, that’s what we’re doing. [00:20:33] BL: Can we go back to the interrogation? No, let’s not go back to an interrogation. Thinking of – if we take the concept of a Kubernetes operating system that can be updated on a different cadence than the Kubernetes running on it – who is Talos for? Could Joe, as a neophyte or someone who doesn’t really know the space – will this make their life any easier, or is there a special set of expertise that we would need to be fruitful with this? [00:21:06] TG: I think, from our perspective, we hope that everybody who uses Kubernetes would find something useful in Talos, or a system like Talos. Number one, I think Talos would be a great way to get started on your laptop or workstation. It’s got some basic features to stand up a small Kubernetes cluster there. That’s one place to start. As you move further into production, I think that a Kubernetes OS-based platform would be particularly useful in an environment where you might have multiple clusters spread across different geographical locations, spread across different teams, maybe spread across different hosting environments. We’ve talked to a number of folks who have been running Kubernetes in production for a couple of years now, and these clusters kind of come up organically within a larger organization in different areas, doing different things for the business, managed by different teams. Now that a little bit of time has passed, these organizations are realizing, “Hey, we’ve got kind of a Kubernetes sprawl problem. We have this team over here on Amazon managing and running Kubernetes one way. We have a separate team managing and running Kubernetes a different way over here, on a different kind of platform.” Anywhere we can drive some consistency across the tooling, consistency across the base platforms, would be useful. We also think that the minimal aspect of our system and some of the design decisions we’ve made around security make it particularly useful in, maybe, a regulated environment. I think that claim would hold true for any sort of special-purpose operating system or minimal operating system designed for a specific task. [00:22:35] BL: Interesting. Just thinking about the concept of a Kubernetes operating system: what’s next? I’m not asking what’s next from Talos, but given all the opportunity, all the time, and all the knowledge, what should we be doing that we’re not doing right now? [00:22:49] AR: Specifically around operating systems, or Kubernetes? [00:22:52] BL: Well, you know what? You can start with operating systems. I mean, you can go to Kubernetes and then we’ll see if our lists match. [00:22:57] AR: That’s a good question.
Right off the bat, I’m going to say I don’t really know. I think this is a new space. I think that we have a big task in front of us already in getting people to use these kinds of operating systems – hopefully not too big of a task. I’m hoping to see – because you find these big companies, “Oh! We can’t do this. We can’t do that,” because getting a new OS is hard. I think we first of all need to win people over on just these even more minimal operating systems, beyond what CoreOS has done. Personally, I don’t know if I could answer that question honestly without just owing something. [00:23:33] TG: I’ve got a thought here. One of the things that I’m really interested in, beyond just Kubernetes and beyond just the operating system – what is computing going to look like in 5, 10, 15 years? I don’t know if Kubernetes is going to be around. I’m kind of a tech-cynic, right? I’ve seen a lot of fads in my career, things that pop up and are very popular for a couple of years and then sort of disappear. I don’t think Kubernetes is one of those. I think Kubernetes and the concepts and the layers of abstraction that Kubernetes has provided, all of that will remain useful and powerful in the distant future, whether or not it’s called Kubernetes or it’s called something else, some new paradigm. But what I’m really interested in is seeing what we can do with this idea of an API-managed OS. If you look at the general purpose operating systems out there, some aspects of the system might expose an API. But for the most part, you’re still interacting and interfacing with this system like you were 30 years ago – 35, 40 years ago even. That’s fine. What works works, but everything else today has an API. Kubernetes has a powerful and extensible API, and I think that your operating system should have something similar, something comparable, something that you can interact with using the same tools and the same processes and the same ideas that you use at the top of the stack, moving some of those concepts down to the host OS level that we’re talking about today. [00:24:51] CC: This brings up a point that I’m so curious about – not only the idea of having a Kubernetes operating system, but any idea that is new, like what you were just talking about, Tim. So, what works works. For example, every year or every couple of years, I am evaluating a new code editor, or I am evaluating a new note-taking app, or to-do-list app – those three things. I’m continuously finding something to reevaluate because what I have has never worked for me just the way I think. Actually, recently I found a couple of things that are really good. In any case, the thing is they just never worked, for years. They’re very limited. They don’t match my thinking. But an operating system, I would never – well, I’m not an administrator, also, but just from having my own laptops forever, I’m not going to go out there. That’s not true either, but I was going to say I’m not going to go out there and try a new operating system to see if what it’s offering might be better for me than what I already have. But that’s not true, because I have done that many times too. So never mind. But I think the thrust of my question is: how are you communicating to people out there that, “Hey, there is this new thing – maybe you think what you have is working for you, but you just don’t know that there is a new, different way of doing things”? When you do try to do that, how are people responding?
I mean, of course, there are those cases where people just know, they get it, and it immediately resonates with them. But I’m talking about the people who might benefit from this but don’t quite grasp it. How do you break through that barrier? [00:26:38] TG: Sure. Maybe the lay majority. [00:26:40] CC: Yeah, and how are people responding? [00:26:42] TG: Yeah. The great thing about Talos is that people understand pretty immediately what it is, how it works and why we’ve done it. The challenge, I think, for us and for anybody changing the way that operating systems work is: is it better enough than what I have today or what I had before? Is it worth the switching cost? I think that switching cost is something that’s pretty well understood in the industry. People have gone through this process and they’ve moved from virtualization to containers, from Docker to Kubernetes, etc. They understand that process, and they understand there’s a technical cost, there’s a people cost, etc. We have to show that value. I think that progress in our industry is incremental. Our industry is young. We’re not building bridges. We’re not at the level of, like, the internal combustion engine, where the engineering is understood and we know how to build it and we know how to make it so that it doesn’t fall over and explode. Clearly, we’re not quite there yet in the broader world of computing. Anywhere we can show a little bit of incremental improvement – where we can tackle one narrow slice of a problem and make it a little bit better, and get to a point where computing is just a little bit safer and a little bit easier and a little bit faster – I think that’ll be a pretty compelling argument. There’s a lot of details involved, and we have to talk about, how do you get your applications from one operating system to the next? 15 years ago, it may have been a very big ask to ask someone to port their enterprise application from one operating system to another. They’re so inextricably linked; there’re a lot of connections between the OS and the applications. But today, we have these levels of abstraction. We have containers. We have the Kubernetes orchestration mechanisms. And I think that switching cost is going down with every release of Kubernetes; every step along the way, as people change the way that applications are deployed, that switching cost gets a little bit cheaper. It will be easier for us to prove that the value you gain by moving to a purpose-built operating system is greater than the switching cost. [00:28:41] CC: Very good points. [00:28:42] JS: I feel like there’s a lot of emphasis and focus on the move over, the first steps toward migrating to something new. There’s a lot of emphasis on bootstrapping a cluster. There’s a lot of emphasis on how do I get started. I’m part of a team called customer reliability engineering, and we see operators running Kubernetes environments that are durable and have been in the field for many years. I think that there’s kind of a hidden cost in these day-two operations where, today, to effectively be a Kubernetes operator you also need a great deal of understanding of Linux operating system internals. These are abstractions on top, but sometimes those abstractions are leaky, so you need to be able to parse iptables rules. You need to be able to understand how traffic gets routed, all of these aspects of it.
I’m just wondering how we get folks shifted from this mindset of “I’m going to start with something that’s general purpose and then basically make it do what I want by making all of these configuration changes and installing things on top of it,” making it not general purpose but specific, and get people to step back more fundamentally and think, “Well, what if we just started with something that is strictly for running workloads?” We don’t have to worry about installing a security suite on top of it, or making this configuration change, or hardening requirements, or what have you. We’re fundamentally in a better place because we’ve started with something that’s arguably more secure. [00:30:21] BL: You know what – I mean, I’m old. I’m old now. I’m realizing this. When I started – back in my day when we started with Linux, we went through this whole thing of Linux installers, and there were many iterations of Linux installers, and it depended on, “Well, did you like what Red Hat was doing? Did you like what the Debian project was doing? Oh! Did you like what Arch was doing? Oh! Did you want to do it yourself? Did you want to emerge the world with Gentoo?” Really, we’ve come to this point now where no one ever talks about Linux installers anymore. You just put it on there. I think what I’m getting at is that we don’t actually know what we want. I mean, we say that we want it to be simpler. We say we want it to be more secure, but we don’t know. Only time will tell, and I think it’s going to be a lot of chipping away at problems. Then people who want to have the bold ideas are saying, “I’m going to go out there and create a Kubernetes operating system.” In reality, it may work – we hope it works – or it may not work, but at least we gained just a little tiny bit more knowledge on how we want to run this thing. And I’ll just say one more last thing: if you look at, like, Bell Labs – Bell Labs created the vacuum tube, and then like 20 years later, 20 or 30 years later, they created the transistor, twice actually. It took a long time to get the vacuum tube out because it kind of just worked, and they just said, “We can’t throw it back. We just can’t throw that away.” Maybe we’re seeing a lot of that in Kubernetes. We’re holding on to some good things even though some greater things are going to come, but it might not be here this year or next year. It might be 18 months. It might be 24 months. We just got to really pay attention to that. [00:32:02] AR: Bryan, when you said you were old, I was going to shake my head internally, and then you brought up the vacuum tube and I’m like, “Okay.” [00:32:07] BL: I mean, I’m not that old. [00:32:09] JS: Yeah, I think that’s a good point, Bryan. The thing I like to point out is the allegory of the cave. People have been living a certain way for so long they think that these shadows are real, and they just know that way of life, until somebody crazy comes along and says, “Hey, there’s a whole world out there,” and no one believes him. I think we just need to do it – like you said, we just need to do it. Create and make it happen, hopefully educate people in the process, and just keep chipping away at it. Do the good work. [00:32:38] BL: That’s the important piece, and that was the power of Bell Labs. You probably can tell – I just read a book about Bell Labs. I’m an expert now. But that was the power of Bell Labs. They didn’t just focus on making products for AT&T.
They focused on changing the world, like, literally. Who creates a transistor if you didn’t even know what one was? You just don’t create that. That’s like some really crazy stuff. I try to bring the parallel back to what we’re doing here. We can’t just create this perfect Kubernetes thing, because really, we don’t know what it is. I mean, we can be smart and say, “Well, it needs to be secure. It needs to do networking,” and all this stuff. But you know what? We don’t even have cgroups v2 support yet. We don’t even know where we are. Let’s figure it out – let’s just keep going down the path, and we will suss out these better patterns. [00:33:23] CC: Yeah, I like that. [00:33:24] BL: That’s it. It is incremental. Here’s the crazy part, and this is the real tough part: it is incremental, and reality says that not everybody can win. Don’t take your failures as a loss. Take them as, “Well, maybe we shouldn’t have done that,” and keep on moving forward, because there’s a lot of companies out there who got us to this point in tech that don’t exist anymore, but if they didn’t do what they did, we would not be here right now. It’s not [inaudible 00:33:52]. [00:33:53] CC: Why are we talking about failures? [00:33:55] BL: I’m sorry. It’s the ultimate success. [00:34:01] CC: Oh gosh! Let’s not end the show on such a downer. [00:34:04] BL: No. That’s a happy point though. Let me put the bow on the happy point and then I will stop talking. The thing is, it’s not that the glass is half empty; the glass is half full. The path to success is littered with failure, and it’s not a bad thing. It’s a good thing, because it’s good that we can continue making those failures, because we know they lead to successes. That is actually a happy thing. [00:34:29] CC: I wonder if Andrew and Tim want to do a little bit of interrogating of us. I think that would be fair. [00:34:36] AR: I wouldn’t know what to interrogate you guys about. [00:34:40] CC: Well, we are coming up on the top of the hour, so it’s time to say goodbye. It was great having you, Andrew, and you, Tim, on the call. Jed, thank you for participating as well. I think it was very informative. With that, I will say, until next week. Bye, everybody. [00:34:59] TG: Bye. Thanks for having us. [00:35:00] CC: My pleasure. [00:35:00] AR: Bye-bye. Thank you. [00:35:02] JS: Thank you. Bye. [END OF INTERVIEW] [0:35:05.3] ANNOUNCER: Thank you for listening to The Podlets Cloud Native Podcast. Find us on Twitter at https://twitter.com/ThePodlets and on the http://thepodlets.io/ website, where you'll find transcripts and show notes. We'll be back next week. Stay tuned by subscribing. [END] See omnystudio.com/listener for privacy information.
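Andrew’s recurring answer in the episode above, "run it on Kubernetes and call it a day," rests on the DaemonSet pattern: anything you would once have installed on every host (a log shipper, a node monitoring agent) becomes a pod that Kubernetes schedules onto each node, so the host OS never needs a package manager or SSH. Here is a minimal client-go sketch of that idea; the image, names, and kubeconfig handling are illustrative assumptions, not anything shipped by Talos.

```go
// daemonset_demo.go - create a DaemonSet so an agent runs on every node,
// instead of installing it on each host operating system.
package main

import (
	"context"
	"log"

	appsv1 "k8s.io/api/apps/v1"
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		log.Fatal(err)
	}
	cs, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		log.Fatal(err)
	}

	labels := map[string]string{"app": "node-agent"}
	ds := &appsv1.DaemonSet{
		ObjectMeta: metav1.ObjectMeta{Name: "node-agent", Namespace: "kube-system"},
		Spec: appsv1.DaemonSetSpec{
			Selector: &metav1.LabelSelector{MatchLabels: labels},
			Template: corev1.PodTemplateSpec{
				ObjectMeta: metav1.ObjectMeta{Labels: labels},
				Spec: corev1.PodSpec{
					Containers: []corev1.Container{{
						Name:  "agent",
						Image: "registry.example.com/node-agent:1.0", // placeholder image
					}},
				},
			},
		},
	}

	// One pod per node, scheduled by Kubernetes itself: no SSH, no package
	// manager, nothing installed on the host OS.
	if _, err := cs.AppsV1().DaemonSets("kube-system").Create(
		context.Background(), ds, metav1.CreateOptions{}); err != nil {
		log.Fatal(err)
	}
	log.Println("daemonset node-agent created")
}
```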
This week on The Podlets Cloud Native Podcast we have Josh, Carlisia, Duffie, and Nick on the show, and are also happy to be joined by a newcomer, Bryan Liles, who is a senior staff engineer at VMware! The purpose of today’s show is coming to a deeper understanding of the meaning of ‘stateful’ versus ‘stateless’ apps, and how they relate to the cloud native environment. We cover some definitions of ‘state’ initially and then move to consider how ideas of data persistence and coordination across apps complicate or elucidate understandings of ‘stateful’ and ‘stateless’. We then think about the challenging practice of running databases within Kubernetes clusters, which effectively results in an ephemeral system becoming stateful. You’ll then hear some clarifications of the meaning of operators and controllers, the role they play in mediating and regulating states, and also how important they are in a rapidly evolving but skills-scarce environment. Another important theme in this conversation is the CAP theorem, or the impossibility of consistency, availability and partition tolerance all at once, but the way different databases allow for different combinations of two out of the three. We then move on to chat about the fundamental connection between workloads and state and then end off with a quick consideration about how ideas of stateful and stateless play out in the context of networks. Today’s show is a real deep dive offering perspectives from some of the most knowledgeable in the cloud native space, so make sure to tune in! Follow us: https://twitter.com/thepodlets Website: https://thepodlets.io Feedback: info@thepodlets.io https://github.com/vmware-tanzu/thepodlets/issues Hosts: Carlisia Campos Duffie Cooley Bryan Liles Josh Rosso Nicholas Lane Key Points From This Episode: • What ‘stateful’ means in comparison to ‘stateless’. • Understanding ‘state’ as a term referring to data which must persist. • Examples of stateful apps such as databases or apps that revolve around databases. • The idea that ‘persistence’ is debatable, which then problematizes the definition of ‘state’. • Considerations of the push for cloud native to run stateless apps. • How inter-app coordination relates to definitions of stateful and stateless applications. • Considering stateful data as data outside of a stateless cloud native environment. • Why it is challenging to run databases in Kubernetes clusters. • The role of operators in running stateful databases in clusters. • Understanding CRDs and controllers, and how they relate to operators. • Controllers mediate between actual and desired states. • Operators are codified system administrators. • The importance of operators as the number of apps grows in a skills-scarce environment. • Mechanisms around stateful apps are important because they ensure data integrity. • The CAP theorem: the impossibility of consistency, availability, and partition tolerance. • Why different databases allow for different iterations of the CAP theorem. • When partition tolerance can and can’t get sacrificed. • Recommendations on when to run stateful or stateless apps through Kubernetes. • The importance of considering models when thinking about how to run a stateful app. • Varying definitions of workloads. • Pods can run multiple workloads. • Workloads create states, so you can’t have one without the other. • The term ‘workloads’ can refer to multiple processes running at once. • Why the ephemerality of Kubernetes systems makes it hard to run stateful applications.
• Ideas of stateful and stateless concerning networks. • The shift from server to browser in hosting stateful sessions. Quotes: “When I started envisioning this world of stateless apps, to me it was like, ‘Why do we even call them apps? Why don’t we just call them a process?’” — @carlisia [0:02:60] “‘State’ really is just that data which must persist.” — @joshrosso [0:04:26] “From the best that I can surmise, the operator pattern is the combination of a CRD plus a controller that will operate on events from the Kubernetes API based on that CRD’s configuration.” — @bryanl [0:17:00] “Once again, don’t let developers name them anything.” — @bryanl [0:17:35] “Data integrity is so important” — @apinick [0:22:31] “You have to really be careful about the different models that you’re evaluating when trying to think about how to manage a stateful application like a database.” — @mauilion [0:31:34] Links Mentioned in Today’s Episode: KubeCon+CloudNativeCon — https://events19.linuxfoundation.org/events/kubecon-cloudnativecon-north-america-2019/ Google Spanner — https://cloud.google.com/spanner/ CockroachDB — https://www.cockroachlabs.com/ CoreOS — https://coreos.com/ Red Hat — https://www.redhat.com/en Metacontroller — https://metacontroller.app/ Brandon Philips — https://www.redhat.com/en/blog/authors/brandon-phillips MySQL — https://www.mysql.com/ Transcript: EPISODE 009 [INTRODUCTION] [0:00:08.7] ANNOUNCER: Welcome to The Podlets Podcast, a weekly show that explores Cloud Native one buzzword at a time. Each week, experts in the field will discuss and contrast distributed systems concepts, practices, tradeoffs and lessons learned to help you on your cloud native journey. This space moves fast and we shouldn’t reinvent the wheel. If you’re an engineer, operator or technically minded decision maker, this podcast is for you. [INTERVIEW] [00:00:41] JR: All right! Hello, everybody, and welcome to episode 6 of The Podlets Podcast. Today we are going to be discussing the concept of stateful and stateless and what that means in this crazy cloud native landscape that we all work in. I am Josh Rosso. Joining me today is Carlisia. [00:00:59] CC: Hi, everybody. [00:01:01] JR: We also have Duffie. [00:01:03] D: Hey, everybody. [00:01:04] JR: Nicholas. [00:01:05] NL: Yo! [00:01:07] JR: And a newcomer to the podcast, we also have Bryan. Bryan, you want to give us a little intro about yourself? [00:01:12] BL: Hi! I’m Bryan. I work at VMware. I do lots of community stuff, including chairing KubeCon+CloudNativeCon. [00:01:22] JR: Awesome! Cool. All right. We’ve got a pretty good cast this week. So let’s dive right into it. I think one of the first things that we’ve been talking a bit about is the concept of what makes an application stateful? And of course in reverse, what makes an application stateless? Maybe we could try to start by discerning those two. Maybe starting with stateless, if that makes sense? Does someone want to take that on? [00:01:45] CC: Well, I’m going to jump right in. I have always been a developer, as opposed to some of you, or all of you, who have system admin backgrounds. The first time that I heard “stateless app,” I was like, “What?” That wasn’t recent, okay? It was a long time ago, but that was a knot in my head. Why would you have a stateless app? If you have an app, you’re going to need state. I couldn’t imagine what that was. But of course it makes a lot of sense now. That was also when we were more in the monolithic world. [00:02:18] BL: Actually that’s a good point.
Before you go into that, it's a great point. Whenever we start with apps or we start developing apps, we think of an application. An application does everything. It takes input and it does stuff and it gives output. But now in this new world where we have lots of apps, big apps, small apps, we start finding that there are apps that only talk and coordinate with other apps. They don't do anything else. They don't save any data. They don't do anything. That's where we get into this thing called stateless apps: apps that don't have any type of data that they store locally. [00:02:53] CC: Yeah. It's more like what I envision in my head. You said it brilliantly, Bryan. It's almost like a process. When I started envisioning this world of stateless apps, to me it was like, "Why do we even call them apps? Why don't we just call them a process?" They're just shifting data back and forth but they're not – To me, at the beginning, apps were always stateful. They went together. [00:03:17] D: I think, frequently, people think of applications that have only locally relevant stuff that is actually not going to persist to disk, but maybe held in memory or maybe only relevant to the type of connection that's coming through that application, also as stateless, which is interesting, because there's still some state there, but the premise is that you could lose that state and not lose the functionality of that code. [00:03:42] NL: Something that we might want to dive into really quickly when talking about stateless and stateful apps. What do we mean by the word state? When I first learned about these things, that was what always screwed me up. I'm like, "What do you mean state? Like Washington? Yeah. We got it over here." [00:03:57] JR: Oh! State. That's that word. State is one of those words that we use to sound smarter than we actually are 95% of the time, and that's a number I just made up. When people are talking about state, they mean databases. Yeah. But there are other types of state as well. If you maintain a local cache that needs to be persistent, if you have local files that you're dealing with, like you're opening files, that's still state. State really is just that: it's data that must persist. [00:04:32] D: I agree with that definition. I think that state, whether persisted to memory or persisted to disk or persisted to some external system, that's still what we refer to as state. [00:04:41] JR: All right. Makes sense and sounds about like what I got from it as well. [00:04:45] CC: All right. So now we have this world where we talk about stateless apps and stateful apps. Are there even stateful apps? Do we call a database an app? If we have a distributed system where we have one stateless app over here, another stateless app over there and then we have the database that's connected to the two of them, are we calling the database a stateful app or is that whole thing – How do we call this? [00:05:15] NL: Yeah. The database is very much an app with state. I'm very much – [00:05:19] D: That's a close definition. Yeah. [00:05:21] NL: Yeah. Literally, it's the epitome of a stateful app. But then you also have these apps that talk to databases as well and they might have local data, like data that – they start a transaction and then complete it or they have a long distributed type transaction. Any apps that revolve around a database, if they store local data, whether it's within a transaction or something else, they're still stateful apps. [00:05:46] D: Yup. 
I think anything where you can modify and input data, or modify state that has to be persisted in some way, is a stateful app, even though I do think it's confusing because of what – As I said before, I think that there are a bunch of applications that we think of, like not everybody considers Spark jobs to be stateful. Spark jobs, for example, are something that would bring data in, mutate that data in some way, produce some output and go away. The definition there is that Spark would generally push the resulting data into some other external system. It's interesting, because in that model, Spark is not considered to be a stateful app because the Spark job could fail, go away, get recreated, pick up the pieces where it left off or just redo that work until all of the work is done. In many cases, people consider that to be a stateless application. That, I think, is the crux – In my opinion, the crux of the confusion around what a stateful and stateless application is, is that people frequently – I think it's more about where you store – what you mean by persistence and how that actually is realized in your application. If you're pushing your state to an external database, is your application still stateful? [00:06:58] NL: I think it's a good question, or if you are gathering data from an external source and mutating it in some way, but you don't need data to be present when you start up, is that a stateful app or a stateless app? Even though you are taking in data, modifying it and checking it, sending it out to some other mechanism or serving it in your own way, does that become like a stateless app? If that app gets killed and it comes back and it's able to recover, is it stateful or stateless? That's a bit of a gray area, I think. [00:07:26] JR: Yeah. I feel like a lot of the customers I work with, if the application can get killed even if it has some type of local state, they still refer to it as stateless usually, to me at least, when we talk about it, because they think, "I can kind of restart this application and I'm not too worried about losing whatever it may have had." Let's say cache for simplicity, right? I think that kind of leads us into an interesting question. We've talked a lot on this podcast about cloud native infrastructure and cloud native applications and it seems like since the inception of cloud native, there's always been this push that a stateless app is the best candidate to run or the easiest candidate to run. I'm just curious if we could dive into that for a moment. Why in the cloud native infrastructure area has there always been this push for running stateless applications? Why is it simpler? Those kinds of things. [00:08:15] BL: Before we dive into that, we have to realize – And this is just a problem of our whole ecosystem, this whole cloud native. We're very hand-wavy in our descriptions for things. There are a lot of ambiguous descriptions, and state is one of those. Just keep that in mind, that when we're talking today, we're really just talking about these things that store data, and that's the state. Just keep that in mind as you're listening to this. But when it comes to distributed systems in general, the easiest system is a system that doesn't need coordination with any other system. If it happens to die, that's okay. We can just restart it. People like to start there. It's the easiest thing to start with. [00:08:58] NL: Yeah, that was basically what I was going to say. 
If your application needs to tie into other applications, it becomes significantly more complicated to implement, at least for your first time and in your system. These small applications that only – They don't care about anybody else, they just take in data or not, they just do whatever. Those are super easy to start with because they're just like, "Here. Start this up. Who cares? Whatever happens, it happens." [00:09:21] CC: That could be a good boundary to define – I don't want to jump back too far, but a good boundary to define whether an app is stateless, to me, is to look at it as part of a system and ask what it depends on to come back up. Does it depend on something else that has state? [00:09:39] BL: I'll give you an example. I can give you a good example of a stateless app that we use every day, every single one of us, none of us on this call, but when you search Google. You go to google.com and you go to the bar and you type in a search, what's happening is there is a service at the beginning that collects that search and it federates the search over many different, probably, clusters of computers so they can actually do the search concurrently. That app that actually coordinates all that work is a stateless app most likely. All it does is just split it up and allow more CPUs to do the work. If that goes away, probably not a problem. You probably have 10 more of them. That's what I consider stateless. It doesn't really own any of the data. It's the coordinator. [00:10:25] CC: Yeah. If it goes down, it comes back up. It doesn't need to reset itself to the state where it was before. It can truly be considered stateless because it can just, "Okay. I reset. I'm starting from the beginning from this clear state." [00:10:43] BL: Yes. That's a good summary of that. [00:10:45] CC: Because another way to think about stateless – What makes an app a stateful app? Does it have to be combined, or like deployed and shipped together, with the part that maintains the state? That's a more clear cut definition. Then that app is definitely a stateful app. [00:11:05] D: What we frequently talk about in like the cloud native space is like you know that you have a stateless app if you can just create 20 of them and not have to worry about the coordination of them. They are all workers. They are all going to take input. You could spread the load across those 20 in an identical way and not worry about which one you landed on. That's a stateless application. A stateful application is a very different thing. You have to have some coordination. You have to say how many databases can you have on a backend? Because you're persisting data there, you have to be really careful that you only write to the master database, or to the writing database, and you could read off any of the other members of that database cluster, that sort of stuff. [00:11:44] CC: It might seem that we are going so deep into this differentiating between stateful and stateless, but this is so important because clusters are usually designed to be ephemeral. Ephemeral means obviously they die down, they are brought back up – the nodes – and you should worry as little as possible about the state of things. Then going back to what Joshua is saying, when we are in this cloud native world, usually we are talking about stateless apps, stateless workloads, and then we're going to just talk about what workload means. But then if that's the case, where are the stateful apps? It's like we have this vision that the stateful apps live outside the cloud native world? How does it work? 
But it's supposed to work. [00:12:36] BL: Yup. This is the question that keeps a lot of people employed. Making sure my state is available when I need it. You know what? I'm not going to even use that word state. Making sure my data is available wherever I need it and when I need it. I don't want to go too deep in right now, but this is actually a huge problem in the Kubernetes community in general, and we see it because there's been lots of advice given, "Don't run things like databases in your clusters." This is why we see people taking the ideas of Google Spanner and like CockroachDB and actually going through a lot of work to make sure that you can run databases in Kubernetes clusters. The interesting piece about this is that we're actually to the point where we can run these types of workloads in our clusters, but with a caveat, big star at the end: it's very difficult and you have to know what you're doing. [00:13:34] JR: Yeah. I want to dovetail on that, Bryan, because it's something that we see all the time. I feel like when we first started setting up, let's call them clusters, but in our case it was Kubernetes, right? We always saw that data layer being delegated to, like if you're in Amazon, some service that they hosted, and so on. But now I think more and more of the customers that at least I'm seeing – I'm sure Nicholas and Duffie too – they're interested in doing exactly what you just described. Cockroach is an example I literally just worked with recently, and it's just interesting how much more thoughtful they have to be about their cluster operations. Going back to what you said, Carlisia, it's not as easy as just trashing a cluster and instantiating a new one anymore, like they're used to. They need to be more thoughtful about keeping that data integrity intact through things like upgrades and disaster recovery. [00:14:18] D: Another interesting point, kind of to your point, Bryan, is that like, frequently, people are starting to have conversations and concerns around data gravity, which means that I have a whole bunch of data that I need to work with, like for a Spark job, which I mentioned earlier. I need to basically put my compute where that data is. Whether I store that data inside the cluster and use Kubernetes to manage it, or whether I just have to make sure that I have some way of bringing up compute workloads close to that data. It's actually kind of introducing a whole new layer to this whole thing. [00:14:48] BL: Yeah! Whole new layer of work and a whole new layer of complexity, because that's actually – The crux of all this is like where we slide the complexity to, but this is interesting, and I don't want to go too far into this one, definitely. This is why we're seeing more people creating operators around managing data. I've seen operators that bring databases up inside of Kubernetes. I've seen operators that actually can bring up resources outside of Kubernetes using the Kubernetes API. The interesting thing about this is that I looked at both solutions and I said, "I still don't know what the answer is," and that's great. That means that we have a lot to learn about the problem, and at least we have some paths for it. [00:15:29] NL: Actually, that kind of reminds me of the first time I ever heard the word stateful or stateless – I'm an infrastructure guy – which was around the discussion of operators. That was only a couple of years ago, when operators were first introduced at CoreOS, and some people were like, "Oh! 
Well, this is how you now operate a stateful mechanism inside of Kubernetes. This is the way forward that we want to propose." I was just like, "Cool! What is that? What's state? What do you mean stateful and stateless?" I had no idea. Josh, you were there. You're like, "Your frontend doesn't care about state and your backend does." I'm like, "Does it? I don't know. I'm not a developer." [00:16:10] JR: Let's talk about exactly that, because I think these patterns we're starting to see are coming out of the needs that we're all talking about, right? We've seen at least in the Kubernetes community a lot of push for these different constructs, like something called a stateful set, which isn't that important right now, but then also like an operator. Maybe we can start by defining what is an operator? What is that pattern and why does it relate to stateful apps? [00:16:31] CC: I think that would be great. I am not clear what an operator is. I know there's going to be a controller involved. I know it's not a CRD. I am not clear on that at all, because I only work with CRDs and we don't define – like the project I worked on with Velero, we don't categorize it as an operator. I guess an operator uses a specific framework that exists out there. Is it a Kubernetes library? I have no idea. [00:16:56] BL: We did it to ourselves again. We're all doing this to ourselves. From the best that I can surmise, the operator pattern is the combination of a CRD plus a controller that will operate on events from the Kubernetes API based on that CRD's configuration. That's what an operator is. [00:17:17] NL: That's exactly right. [00:17:18] BL: To conflate this, Red Hat created the operator SDK, and then you have [inaudible 00:17:23] and you have Metacontroller, which can help you build operators. Then we actually sometimes conflate and call CRDs operators, and that's pretty confusing for everyone. Once again, don't let developers name anything. [00:17:41] CC: Wait. So let's back up a little. Okay. There is an actual library that's called an operator. [00:17:46] BL: Yes. There's an operator SDK. [00:17:47] CC: Referred to as an operator. I heard that. Okay. Great. But let me back up a little because – [00:17:49] D: The word operator can – [00:17:50] CC: Because if you are developing an app for Kubernetes, if you're extending Kubernetes, you are – Okay, you might not use CRDs, but if you are using CRDs, you need a controller, right? Because how will you do actions? Then every app that has a CRD – because the alternative to having CRDs is just using the API directly without creating CRDs to reflect resources. If you're creating CRDs to reflect resources, you need controllers. All of those apps that have CRDs are operators. [00:18:24] D: Yup, [inaudible 00:18:25] is an operator. [00:18:26] CC: [inaudible 00:18:26] not an operator. How can you extend Kubernetes and not be qualified [inaudible 00:18:31] operator? [00:18:32] BL: Well, there's a way. There is a way. You can actually just create a CRD and use a CRD for data storage, you know, store state, and you can actually query the Kubernetes API for that information. You don't need a controller, but we couple them with controllers a lot to perform actions based on that state we've saved to etcd. [00:18:50] CC: Duffie. [00:18:51] D: I want to back up just for a moment and talk about the controller pattern and what it is and then go from there to operators, because I think it makes it easier to get it in your head. 
The controller pattern is effectively a way to understand desired state and real state and provide some logic or business code that will allow you to converge those two states, your actual state and your desired state. This is a pattern that we see used in almost everything within a distributed system. It's like within Kubernetes, within most of the kind of more interesting systems that are out there. This controller pattern describes a pretty good way of actually managing application flow across distributed systems. Now, operators, when they were initially introduced, we were talking about that this is a slightly different thing. Operators, when we introduced the idea, came more from like the operational burden of these stateful applications, things like databases and those sorts of stuff. With a database, etcd for example, you have a whole bunch of operational and runtime concerns around managing the lifecycle of that system. How do I add a new member to the cluster? What do I do when a member dies? How do I take action? Right now, that's somebody like myself waking up at 2 in the morning and working through a run book to basically make sure that that service remains operational through the night. But the idea of an operator was to take that controller pattern that we described earlier and make it wake up at 2 in the morning to fix this stuff. We're going to actually codify the operational knowledge of managing the burden of these stateful applications so that we don't have to wake up at 2 in the morning and do it anymore. Nobody wants to do that. [A minimal code sketch of this reconcile loop appears after the transcript.] [00:20:32] BL: Yeah. That makes sense. Remember back at KubeCon years ago, I know it was one in Seattle where Brandon Philips was on stage talking about operators. He basically was saying if we think about SysOps, system operators, it was a way to basically automate or capture the knowledge of our system administrators in scripts or in a process or in code, a la operators. [00:20:57] D: The last part that I'll add to this thing, which I think is actually what really describes the value of this idea to me, is that there are only so many people on the planet that do what the people on this podcast do. Maybe you're one of them listening to this podcast. People who are operating software or operating infrastructure at scale, there just aren't that many of us on the planet. So as we add more applications, as more people adopt the cloud native regime or start coming to a place where they can crank out more applications more quickly, we're going to have to get to a place where we are able to automate the burden of managing those applications, because there just aren't enough of us to be able to support the load that is coming. There just aren't enough people on the planet that do this to be able to support that. That's the thing that excites me most about the operator pattern, is that it gives us a place to start. It gives us a place to actually start thinking about managing that burden over time, because if we don't start changing the way we think about managing that burden, we're going to run out of people. We're not going to be able to do it. [00:22:05] NL: Yeah. It's interesting. With stateful apps, we keep kind of bringing them – coming back to stateful apps, because stateful apps are hard and stateless apps are easy, and we've created all these mechanisms around operating things with state because of how just complicated it is to make sure that your data is ready, accessible and has integrity. 
That's the big one that I keep not thinking about as a SysOps person coming into the Dev world. Data integrity is so important, and making sure that your data is exactly what it needs to be and was the last time you checked it is super important. It's only something I'm really starting to grasp. That's why all these things, like operators and all these mechanisms that we keep creating and recreating and recreating, keep coming about: because making sure that your stateful apps have the right data at the right time is so important. [00:22:55] BL: Since you brought this up, and we just talked about why state is so hard, I want to introduce a new term to this conversation, the whole CAP theorem, where, in a distributed system at least, your data can be consistent, or your data can be available, or if your distributed system splits into multiple parts, you can have partition tolerance. This is one of those computer science things where you can actually pick two. You can have it be available and have partition tolerance, but your data won't be consistent, or you can have consistency and availability, but you won't have partition tolerance. If your cluster splits into two for some reason, the data will be bad. This is why it's hard, this is why people have written basically lots of PhD dissertations on this subject, and this is why we are talking about this here today: because managing state, and particularly managing distributed state, is actually a very, very hard problem. But there's software out there that will help us, and Kubernetes is definitely part of that and stateful sets are definitely part of that as well. [00:24:05] JR: I was just going to say, on those three points – consistency, availability and partition tolerance – obviously, we'd want all three if we could have them. Is there one that we most commonly trade off and give up, or does it go case-by-case? [00:24:17] BL: Actually, it's been proven. You can't have all three. It's literally impossible. It depends. If you have a MySQL server and you're using MySQL to actually serve data out of this, you're going to most likely get consistency and availability. If you have it replicated, you might not have partition tolerance. That's something to think about, and there are different databases, and this is actually one of the reasons why there are different databases. This is why people use things like relational databases and they use key value stores, not because we really like the interfaces, but because they have different properties around the data. [A toy illustration of this consistency-versus-availability choice appears after the transcript.] [00:24:55] NL: That's an interesting point and something that I had recently just been thinking about, like why are there so many different types of databases. I just didn't know. I had only recently heard of the CAP theorem as well, just before you mentioned it. I'm like, "Wow! That's so fascinating." The whole thing where you only pick two. You can't get three. Josh, to kind of go back to your question really quickly, I think that partition tolerance is the one that we throw away the most. We're willing to not be able to segregate our database as much as possible because C and A are just too important, I think. At least that's what I'm saying, like I am wearing an [inaudible 00:25:26] shirt and [inaudible 00:25:27] is not partition tolerant. It's bad at it. [00:25:31] BL: This is why Google introduced Spanner, and Spanner in some situations can get all three, with tradeoffs and a lot of really, really smart stuff, but most people can't run at this scale. 
But we do need to think about partition tolerance, especially with data, whenever – Let's say you run a store and you have multiple instances across the world and someone buys something from inventory, what does your inventory look like at any particular point? You don't have to answer my question, of course, but think about that. These are still very important problems: if fiber gets cut across the Atlantic, now I've sold more things than I have. Carlisia, speaking to you as someone who's only been a developer, have you moved your thoughts on state any further? [00:26:19] CC: Well, I feel that I'm clear on – Well, I think you need to clarify your question better for me. If you're asking if I understand what it means, I understand what it means. But I actually was thinking to ask this question to all of you, because I don't know the answer, if that's the question you're asking me. I want to put that to the group. Do you recommend people, as in like now-ish, to run stateful workloads? We need to talk about what workloads mean. Run stateful apps or databases inside, if they're running a Kubernetes cluster or if they're planning for that – do you all as experts recommend that they should already be looking into doing that, or should they be running, for now, their stateful apps or databases outside of the cloud native ecosystem and just connecting the two? Because if that's what your question was, I don't know. [00:27:21] BL: Well, I'll take this first. I think that we should be spending lots more time than we are right now in coming up with community-tested solutions around using stateful sets to their best ability. What that means is, let's say if you're running a database inside of Kubernetes and you're using a stateful set to manage this, what we do need to figure out is what happens when my database goes down? The pod just dies? When I bring up a new version, I need to make sure that I have the correct software to verify integrity, rebuild things, so that when it comes back up, it comes back up correctly. That's what I think we should be doing. [00:27:59] JR: For me, I think working with customers, at least Kubernetes-oriented folks, when they're trying to introduce Kubernetes as the orchestration part of their overall platform, I'm usually just trying to kind of meet them where they're at. If they're new to Kubernetes and distributed systems as a whole, if we have stateless – let's call them maybe simpler – applications to start with, I generally have them lean into that first, because we already have so much in front of us to learn about. I think it was either Bryan or Duffie, you said it introduces a whole bunch more complexity. You have to know what you're doing. You have to know how to operate these things. If they're new to Kubernetes, I generally will advise starting with stateless still. But that being said, so many of our customers that we work with are very interested in running stateful workloads on Kubernetes. [00:28:42] CC: But just to clarify what you said, Josh, because you spoke like an expert, but I still have beginner's ears. You said something that sounded to me like you recommend that you go stateless. It sounded to me like that. What you're really saying is that they take the stateless part of what they have, which they might already have or they might have to change, and put that in. You're not suggesting that, "Oh! You can't do stateful anymore. 
You need to just do everything stateless." What you're saying is take the stateless part of your system, put that in Kubernetes, because that is really well-tested, and keep the stateful outside of that ecosystem. Is that right? [00:29:27] JR: I think that's a better way to put it. Again, it's not that Kubernetes can't do stateful. It's more of a concept of biting off more than you can chew. We still work with a lot of people who are very new to these distributed systems concepts, and to take on running stateful workloads, if we could just delegate that to some other layer, like outside of the cluster, that could be a better place to start, at least in my experience. Nicholas and Duff might have different – [00:29:51] NL: Josh, you basically nailed what I was going to say, where it's like if the team that I'm working with is interested in taking on the complexity of maintaining their databases, their stateful sets and making sure that they have data integrity and availability, then I'm all for them using Kubernetes for a stateful set. Kubernetes can run stateful applications, but there is all this complexity that we keep talking about, and maintaining data and all that. If they're willing to take on that complexity, great, it's there for you. If they're not, if they're a little bit kind of behind as – Not behind, but if they're kind of starting out their Kubernetes journey or their distributed systems journey, I would recommend them to move that complexity to somebody else and start with something a little bit easier, like a stateless application. There are a lot of good services that provide data as a service, right? You've got things like AWS RDS, which is great for creating stateful applications. You can leverage it anytime, and you've got like dedicated wires too. I would point them there first if they don't want to take on that complexity. [00:30:51] D: I completely agree with that. An important thing I would add, which is in response to the stateful set piece here, is that as we've already described, managing a stateful application like a database does come with some complexity. So you should really carefully look at just what these different models provide you. Whether that model is making use of a stateful set, which provides you like ordinality, ensuring that things start up in a particular order, and some of the other capabilities around that stuff. But it won't, for example, manage some of the complexity. A stateful set won't, for example, try and issue a command to the new member to make sure that it's part of an existing database cluster. It won't manage that kind of stuff. So you have to really be careful about the different models that you're evaluating when trying to think about how to manage a stateful application like a database. I think that's actually why the topic of an operator came up kind of earlier, which was that like there are a lot of primitives within Kubernetes in general that provide you a lot of capability for managing things like stateful applications, but they may not entirely suit your needs. Because of the complexity with stateful applications, you have to be really careful about what you adopt and where you jump in. [00:32:04] CC: Yeah. I know just from working with Velero, which is a tool for doing backup, recovery and migration of Kubernetes clusters. I know that we back up volumes. So if you have something mounted on a volume, we can back that up. I know for a fact that people are using that to back up stateful workloads. 
We need to talk about workloads. But in any case, one thing to – I think one of you mentioned is that you definitely also need to look at a backup and recovery strategy, which is ever more important if you're doing stateful workloads. [00:32:46] NL: That's the only time it's important. If you're doing stateless, who cares? [00:32:49] BL: Have we defined what a workload is? [00:32:50] CC: Yeah. But let me say something. Yeah, I think we should do an episode on that maybe, maybe not. We should do an episode on GitOps type of thing for related things, because even though you – Things are stateless, but I don't want to get into it. Your cluster will change state. You can recover stuff from like a fresh version. But as it goes through a lifecycle, it will change state and you might want to keep that state. I don't know. I'm not the expert in that area, but let's talk about workloads, Bryan. Okay. Let me start talking about workloads. I never heard the term workload until I came into the cloud native world, and that was about a year ago, or when I started looking in this space more closely. Maybe a little bit before a year ago. It took me forever to understand what a workload was. Now I understand, especially after today – we were talking about it a little bit before we started recording. Let me hear from you all what it means to you. [00:34:00] BL: This is one of those terms, and I'm sure, if you ask any ex-Googlers about this, they'll probably agree. This is a Google term that we actually have zero context about why it's a term. I'm sure we could ask somebody and they would tell us, but workloads to me personally are anything that ultimately creates a pod. Deployments create replica sets, which create pods. That whole thing is a workload. That's how I look at it. [00:34:29] CC: Before there were pods, were there workloads, or is a workload a new thing that came along with pods? [00:34:35] BL: Once again, these words don't make any sense to us, because they're Google terms. I think that a pod is a part of a workload, like a deployment is a part of a workload, like a replica set is part of a workload. Workload is the term that encompasses an entire set of objects. [00:34:52] D: I think of a workload as a subset of an application. When I think of an application or a set of microservices, I might think of each of the services that make up that entire application as a workload. I think of it that way because that's generally how I would divide it up, to Bryan's point, into different deployments or different stateful sets or different – That sort of stuff. Thinking of them each as their own autonomous piece, and altogether they form an application. That's my way of thinking about it. [00:35:20] CC: To connect to what Bryan said, a deployment will always run in pods, which is super confusing if you're not looking at these things – just so people understand, because it took me forever to understand that – the connection between a workload, a deployment and a pod. Pods contain – If you have a deployment that you're going to ship to Kubernetes – I don't know if ship is the right word – you're going to need it to run on Kubernetes. That deployment needs to run somewhere, in some artifact, and that artifact is called a pod. [00:35:56] NL: Yeah. Going back to what Duffie said really quickly. A workload to me was always a process, kind of like not just a pod necessarily, but like whatever it is that if you're like, "I just need to get this to run," whatever that is. To me that was always a workload, but I think I'm wrong. 
I think I'm oversimplifying it. I'm just like whatever your process is. [00:36:16] BL: Yeah. I would give you – The reason why I would not say that is because a pod can run multiple containers at once, which ergo is multiple processes. That's why I say it that way. [A short sketch of this deployment-replica set-pod hierarchy appears after the transcript.] [00:36:29] NL: Oh! You changed my mind. [00:36:33] BL: The reason I bring this up, and this is probably a great idea for a future show, is about all the jargon and terminology that we use in this land that we just take as everyone knows it, but we don't all know it, and it should be a great conversation to have around that. But the reason I always bring up the whole workload thing is because, when we think about it, you can't have state without workloads, really. I just wanted to make sure that we tied those two things together. [00:36:58] CC: Why can you not have state without workloads? What does that mean? [00:37:01] BL: Well, the reason you can't have state without workloads is because something is going to have to create that state, whether that workload is running in or out of a cluster. Something is going to have to create it. It just doesn't come out of nowhere. [00:37:11] CC: That goes back to what Nick was saying, that he thinks a workload is a process. Was that what you said, Nick? [00:37:18] NL: It is, yeah, but I'm reneging on that. [00:37:23] CC: At least I could see why you said that. Sorry, Bryan. I cut you off. [00:37:28] BL: What I was saying is a workload ultimately is one or more processes. It's not just a process. It's not a single process. It could be 10, it could be 1. [00:37:39] JR: I have one final question, and we can bail on this and edit it out if it's not a good one to end with. I hope it's not too big, but I think maybe one thing we overlooked is just why it's hard to run stateful workloads in these new systems like Kubernetes. We talked about how there's more complexity and stuff, but there might be some room to talk about – People have been spinning up an EC2 server, a server on the web, and running MySQL on it forever. Why, in like the Kubernetes world of like pods and things, is it a little bit harder to run, say, MySQL just [inaudible 00:38:10]. Is that something worth diving into? [00:38:13] NL: Yeah, I think so. I would say that for things like, say, applications like databases particularly, they are less resilient to outages. While Kubernetes itself – or most container orchestrators, but Kubernetes specifically – is dedicated to running your pods continuously as long as it can, it is still somewhat of a shifting landscape. You do have priority and preemption. If you don't set those things up properly, or if there's just like a total failure of your system at large, your stateful application can just go down at any time. Then how do you reconcile the outage in data, whatever data that might have gotten lost? Those sorts of things become significantly more complicated in an environment like Kubernetes where you don't necessarily have access to a command line to run the commands to recover as easily. You may not, but it's the same. [00:39:01] BL: Yes. You've got to understand what databases do. Disk is slow, whether you have spinning disk or you have disk on chip, like SSD. What databases do in a lot of cases is they store things in memory. So if it goes away, it didn't get stored. In other cases, what databases do is they have these huge transactional logs, maybe they write them out in files, and then they process the transaction log whenever they have CPU time. 
If a database dies just suddenly, maybe its state is inconsistent because it had items that were to be processed in a queue that haven't been processed. Now it doesn't know what's going on, which is why – [00:39:39] NL: That's interesting. I didn't know that. [00:39:40] BL: If you kill MySQL, like kill mysqld with a -9, that's why it might not come back up. [00:39:46] JR: Yeah. Going back to Kubernetes as an example, we are living in this newer world where things can get rescheduled and moved around and killed and their IPs changed and things. It seems like this environment is, should I say, more ephemeral, and those types of considerations are becoming more complex. [00:40:04] NL: I think that really nails it. Yeah. I didn't know that there were transactional logs in databases. I should, I feel like, have known that, but I just had no idea. [00:40:11] D: There's one more part to the whole stateful, stateless thing that I think is important to cover, but I don't know if we'll be able to cover it entirely in the time that we have left, and that is from the network perspective. If you think about the types of connections coming into an application, we refer to some of those connections as stateful and stateless. I think that's something we could tackle in our remaining time, or what's everybody's thought? [00:40:33] JR: Why don't you try giving us maybe a quick summary of it, Duffie, and then we can end on that. [00:40:36] CC: Yeah. I think it's a good idea to talk about networks and then address state in the context of networks. I'm just thinking that's an idea for an episode. But give us like a quick rundown. [00:40:45] D: Sure. A lot of the kind of older monolithic applications, the way that you would scale these things is you would have multiple of them, and then you would have some intelligence in the way that you're routing connections down to those applications that would ensure that when Bob accesses a website and he authenticates, he's going to authenticate to one specific instance of this application, and the intelligence up in the frontend is going to handle the routing to make sure that Bob's connection always comes back to that same instance. This is an older pattern. It's been around for a very long time, and it's certainly the way that we first kind of learned to scale applications before we decided to break into microservices and kind of handle a lot of this routing in a more resilient way. That was kind of one of the early versions of how we do this, and that is a pretty good example of a stateful session, in that there is actually some – Perhaps Bob has authenticated and he has a cookie, so that when he comes back to that particular application, a lot of the settings, his browser settings, whether he's using the dark theme or the light theme, that sort of stuff, is persisted on the server side rather than on the client side. That's kind of what I mean by stateful sessions. Stateless sessions mean it doesn't really matter that the user is terminating to the same endpoint, because we've managed to keep the state with the client. We're handling state on the browser side of things rather than on the server side of things. So you're not necessarily gaining anything by pushing that connection back to the same specific instance, but just to a service that is more widely available. There are lots of examples of this. I mean, Bryan's example of Google earlier. Obviously, when I come back to Google, there are some things I want it to remember. 
I want it to remember that I'm logged in as myself. I want it to remember that I've used a particular – I want it to remember my history. I want it to remember that kind of stuff so that I could go back and find things that I looked at before. There are a ton of examples of this when we think about it. [A small sketch of this sticky-session routing appears after the transcript.] [00:42:40] JR: Awesome! All right, everyone. Thank you for joining us in episode 6, Stateful and Stateless. Signing off. I'm Josh Rosso, and going across the line, thank you Nicholas Lane. [00:42:54] NL: Thank you so much. This was really informative for me. [00:42:56] JR: Carlisia Campos. [00:42:57] CC: This was a great conversation. Bye, everybody. [00:42:59] JR: Our newcomer, Bryan Liles. [00:43:01] BL: Until next time. [00:43:03] JR: And Duffie Cooley. [00:43:05] DC: Thank you so much, everybody. [00:43:06] JR: Thanks all. [00:43:07] CC: Bye! [END OF EPISODE] [0:50:00.3] ANNOUNCER: Thank you for listening to The Podlets Cloud Native Podcast. Find us on Twitter at https://twitter.com/ThePodlets and on the http://thepodlets.io/ website, where you'll find transcripts and show notes. We'll be back next week. Stay tuned by subscribing. [END]
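The controller and operator pattern described in the conversation above – observe actual state, compare it with desired state, and converge the two – boils down to a small loop. Below is a minimal sketch in Go; the Database type and the fetchDesired/fetchActual helpers are hypothetical stand-ins for reads against a CRD and the running system, not any real operator SDK API.

```go
package main

import (
	"fmt"
	"time"
)

// Database is a hypothetical desired-state spec, the kind of thing a CRD
// would describe: "I want a database cluster with this many members."
type Database struct {
	Name     string
	Replicas int
}

// fetchDesired and fetchActual are stand-ins for reads against the
// Kubernetes API (the CRD) and the real world, respectively.
func fetchDesired() Database { return Database{Name: "orders-db", Replicas: 3} }
func fetchActual() Database  { return Database{Name: "orders-db", Replicas: 2} }

// converge encodes the "2 in the morning run book": the operational steps
// a human would otherwise perform to move actual state toward desired state.
func converge(desired, actual Database) {
	for actual.Replicas < desired.Replicas {
		fmt.Println("adding a member and joining it to the cluster...")
		actual.Replicas++
	}
	for actual.Replicas > desired.Replicas {
		fmt.Println("draining and removing a member...")
		actual.Replicas--
	}
}

func main() {
	// A real controller reacts to watch events from the API server;
	// a polling loop keeps this sketch short.
	for i := 0; i < 3; i++ {
		converge(fetchDesired(), fetchActual())
		time.Sleep(time.Second)
	}
}
```

A production operator built with something like the operator SDK would use watches instead of polling and carry far more domain logic in converge, but the shape of the loop is the same.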
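The CAP tradeoff Bryan raises can also be made concrete in a few lines. This toy sketch (all names invented for illustration) shows the choice a database replica faces when it is partitioned from the primary: refuse to answer so it never serves stale data (favoring consistency), or answer anyway with possibly stale data (favoring availability).

```go
package main

import (
	"errors"
	"fmt"
)

// Replica is a toy follower holding a possibly stale copy of one value.
type Replica struct {
	value       string
	partitioned bool // true when the primary is unreachable
	preferCP    bool // favor consistency over availability, or the reverse
}

func (r *Replica) Read() (string, error) {
	if r.partitioned && r.preferCP {
		// CP choice: freshness cannot be verified, so refuse to answer.
		return "", errors.New("partitioned: refusing possibly stale read")
	}
	// AP choice: always answer, accepting that the value may be stale.
	return r.value, nil
}

func main() {
	for _, preferCP := range []bool{true, false} {
		r := Replica{value: "inventory=42", partitioned: true, preferCP: preferCP}
		v, err := r.Read()
		fmt.Printf("preferCP=%v -> value=%q err=%v\n", preferCP, v, err)
	}
}
```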
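On the workload question – deployments create replica sets, which create pods, and one pod can run several containers – the hierarchy is visible in the Kubernetes API types themselves. This sketch only constructs a Deployment object in memory (no cluster needed); the names and images are made up for illustration.

```go
package main

import (
	"fmt"

	appsv1 "k8s.io/api/apps/v1"
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func int32Ptr(i int32) *int32 { return &i }

func main() {
	// One "workload" in Bryan's sense: the Deployment manages ReplicaSets,
	// the ReplicaSets manage Pods, and each Pod here runs two containers
	// (two processes), which is why a workload is not a single process.
	labels := map[string]string{"app": "web"}
	d := appsv1.Deployment{
		ObjectMeta: metav1.ObjectMeta{Name: "web"},
		Spec: appsv1.DeploymentSpec{
			Replicas: int32Ptr(3),
			Selector: &metav1.LabelSelector{MatchLabels: labels},
			Template: corev1.PodTemplateSpec{
				ObjectMeta: metav1.ObjectMeta{Labels: labels},
				Spec: corev1.PodSpec{
					Containers: []corev1.Container{
						{Name: "app", Image: "nginx:1.25"},
						{Name: "log-shipper", Image: "fluentd:v1.16"},
					},
				},
			},
		},
	}
	fmt.Printf("%s: %d replicas, %d containers per pod\n",
		d.Name, *d.Spec.Replicas, len(d.Spec.Template.Spec.Containers))
}
```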
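Finally, the stateful-session routing Duffie describes – Bob's connection always landing on the same backend – is commonly implemented with cookie- or hash-based affinity in the load balancer. A minimal sketch of that routing decision, with invented names:

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// pickBackend is the kind of logic a stateful-session load balancer runs:
// the same session ID (say, from Bob's cookie) always hashes to the same
// backend, so server-side session state keeps working. With stateless
// sessions, any backend would do and this affinity becomes unnecessary.
func pickBackend(sessionID string, backends []string) string {
	h := fnv.New32a()
	h.Write([]byte(sessionID))
	return backends[h.Sum32()%uint32(len(backends))]
}

func main() {
	backends := []string{"app-1", "app-2", "app-3"}
	for _, user := range []string{"bob-cookie-123", "alice-cookie-456", "bob-cookie-123"} {
		fmt.Println(user, "->", pickBackend(user, backends))
	}
}
```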
This week on The Podlets Podcast, we talk about cloud native infrastructure. We were interested in discussing this because we've spent some time talking about the different ways that people can use cloud native tooling, but we wanted to get to the root of it, such as where code lives and runs and what it means to create cloud native infrastructure. We also have a conversation about the future of administrative roles in the cloud native space, and explain why there will always be a demand for people in this industry. We dive into the expense for companies when developers run their own scripts and use cloud services as required, and provide some pointers on how to keep costs at a minimum. Joining in, you'll also learn what a well-constructed cloud native environment should look like, which resources to consult, and what infrastructure as code (IaC) really means. We compare containers to virtual machines and then weigh up the advantages and disadvantages of bare metal data centers versus using the cloud. Note: our show changed its name to The Podlets. Follow us: https://twitter.com/thepodlets Website: https://thepodlets.io Feedback: info@thepodlets.io https://github.com/vmware-tanzu/thepodlets/issues Hosts: Carlisia Campos Duffie Cooley Nicholas Lane Key Points From This Episode: • A few perspectives on what cloud native infrastructure means. • Thoughts about the future of admin roles in the cloud native space. • The increasing volume of internet users and the development of new apps daily. • Why people in the infrastructure space will continue to become more valuable. • The cost implications for companies if every developer uses cloud services individually. • The relationship between IaC for cloud native and IaC for the cloud in general. • Features of a well-constructed cloud native environment. • Being aware that not all clouds are created equal and the problem with certain APIs. • A helpful resource for learning more on this topic: Cloud Native Infrastructure. • Unpacking what IaC is not and how Kubernetes really works. • Reflecting on how it was before cloud native infrastructure, including using tools like vSphere. • An explanation of what containers are and how they compare to virtual machines. • Is it worth running bare metal in the cloud age? Weighing up the pros and cons. • Returning to the mainframe and how the cloud almost mimics that idea. • A list of the cloud native infrastructures we use daily. • How you can have your own "private" cloud within your bare metal data center. Quotes: "This isn't about whether we will have jobs, it's about how, when we are so outnumbered, do we as this relatively small force in the world handle the demand that is coming, that is already here." — Duffie Cooley @mauilion [0:07:22] "Not every cloud that you're going to run into is made the same. There are some clouds that exist of which the API is people. You send a request and a human being interprets your request and makes the changes. 
That is a big no-no." — Nicholas Lane @apinick [0:16:19] "If you are in the cloud native workspace you may need 1% of your workforce dedicated to infrastructure, but if you are in the bare metal world, you might need 10 to 20% of your workforce dedicated just to running infrastructure." — Nicholas Lane @apinick [0:41:03] Links Mentioned in Today's Episode: VMware RADIO — https://www.vmware.com/radius/vmware-radio-amplifying-ideas-innovation/CoreOS — https://coreos.com/Brandon Philips on LinkedIn — https://www.linkedin.com/in/brandonphilips Kubernetes — https://kubernetes.io/Apache Mesos — http://mesos.apache.orgAnsible — https://www.ansible.comTerraform — https://www.terraform.ioXenServer (Citrix Hypervisor) — https://xenserver.orgOpenStack — https://www.openstack.orgRed Hat — https://www.redhat.com/Kris Nova on LinkedIn — https://www.linkedin.com/in/kris-novaCloud Native Infrastructure — https://www.amazon.com/Cloud-Native-Infrastructure-Applications-Environment/dp/1491984309Heptio — https://heptio.cloud.vmware.comAWS — https://aws.amazon.comAzure — https://azure.microsoft.com/en-us/vSphere — https://www.vmware.com/products/vsphere.htmlCircuit City — https://www.circuitcity.comNewegg — https://www.newegg.comUber — https://www.uber.com/ Lyft — https://www.lyft.com Transcript: EPISODE 05 [INTRODUCTION] [0:00:08.7] ANNOUNCER: Welcome to The Podlets Podcast, a weekly show that explores Cloud Native one buzzword at a time. Each week, experts in the field will discuss and contrast distributed systems concepts, practices, tradeoffs and lessons learned to help you on your cloud native journey. This space moves fast and we shouldn't reinvent the wheel. If you're an engineer, operator or technically minded decision maker, this podcast is for you. [EPISODE] [0:00:41.1] NL: Hello and welcome to episode five of The Podlets Podcast, the podcast where we explore cloud native topics one topic at a time. This week, we're going to the root of everything, money. No, I mean, infrastructure. My name is Nicholas Lane and joining me this week are Carlisia Campos. [0:00:59.2] CC: Hi everybody. [0:01:01.1] NL: And Duffie Cooley. [0:01:02.4] DC: Hey everybody, good to see you again. [0:01:04.6] NL: How have you guys been? Anything new and exciting going on? For me, this week has been really interesting. There's an internal VMware conference called RADIO where we have a bunch of engineering teams across the entire company kind of get together and talk about the future of pretty much everything, and so I have been kind of sponging that up this week, and that's been really interesting, kind of talking about all the interesting ideas and fascinating new technologies that we're working on. [0:01:29.8] DC: Awesome. [0:01:30.9] NL: Carlisia? [0:01:31.8] CC: My entire team is at RADIO, which is in San Francisco, and I'm not. But I'm sort of glad I didn't have to travel. [0:01:42.8] NL: Yeah, nothing too exciting for me this week. Last week I was on PTO and that was great, so this week it has just been kind of getting spun back up and I'm getting back into the swing of things a bit. [0:01:52.3] CC: Were you spun back up with a script? [0:01:57.1] NL: Yeah, I was. [0:01:58.8] CC: With infrastructure I suppose? [0:02:00.5] NL: Yes, absolutely. This week on The Podlets Podcast, we are going to be talking about cloud native infrastructure. 
Basically, I was interested in talking about this because we've spent some time talking about some of the different ways that people can use cloud native tooling, but I wanted to kind of get to the root of it. Where does your code live, where does it run, what does it mean to create cloud native infrastructure? To start us off, you know, we're going to talk about the concept. To me, cloud native infrastructure is basically any infrastructure tool or service that allows you to programmatically create infrastructure, and by that I mean like your compute nodes, anything running your application, your networking, software defined networking, storage – Ceph, object store, that sort of thing – you can just spin them up in a programmatic, contract-driven way, and then databases as well, which is very nice. Then I also kind of lump in anything that's like a managed service as part of that. Going back to databases, if you use different providers, they have their RDS or RDB tooling that provides databases on the fly and then they manage it for you. Those things to me are cloud native infrastructure. Duffie, what do you think? [0:03:19.7] DC: Yeah, I think it's definitely one of my favorite topics. I spent a lot of my career working with infrastructure one way or the other, whether that meant racking servers in racks and doing it the old school way and figuring out power budgets and, you know, dealing with networking and all of that stuff, or whether that meant finally getting to a point where I have an API and my customer's going to come to me and say, I need 10 new servers, and I can be like, one second. Then run a script, and they have 10 new servers, versus, you know, having to order the hardware, get the hardware delivered, get the hardware racked, replace the stuff that was dead on arrival, kind of go through that whole process and yeah. Cloud native infrastructure or infrastructure as a service is definitely near and dear to my heart. [0:03:58.8] CC: How do you feel about it if you are an admin? You work for VMware and you are a field engineer now. You're basically a consultant, but if you were back in that role of an admin at a company, and the company was practicing cloud native infrastructure things. Basically, what we're talking about is we go back to this theme of self-sufficiency a lot. I think we're going to be going back to this a lot too, as we go through different topics. Mainly, if someone wants a server in that environment now, they can run an existing script that maybe you made for them. But do you have concerns that your job is redundant now that just one script can do a lot of your work? [0:04:53.6] NL: Yeah, in the field engineering org, we kind of have this mantra that we're trying to automate ourselves out of a job. I feel like anyone who is like really getting into cloud native infrastructure, that is the path that they're taking as well. If I were an admin in a world that was like hybrid or anything like that – they had like on-prem or bare metal infrastructure and they had cloud native infrastructure – I would be more than ecstatic to take any amount of the administrative work, of like spinning up new servers, into the cloud native infrastructure. If the people just need somewhere they can go click, I got whatever services I need and they all work together because the cloud makes them work together, awesome. That gives me more time to do other tasks that may be a bit more onerous or less automated. I would be all for it. 
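That "run a script, get 10 servers" workflow is the heart of infrastructure as a service: describe what you want and POST it to an API. Here is a minimal sketch of such a script in Go; the endpoint URL and request fields are invented for illustration, since every provider's real API and SDK differ.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"log"
	"net/http"
)

// serverRequest mirrors the shape of a hypothetical provisioning endpoint:
// describe what you want, send it, and get machines back.
type serverRequest struct {
	Count int    `json:"count"`
	Size  string `json:"size"`
	Image string `json:"image"`
}

func main() {
	body, err := json.Marshal(serverRequest{Count: 10, Size: "m.large", Image: "ubuntu-22.04"})
	if err != nil {
		log.Fatal(err)
	}
	// "One second" instead of weeks of ordering and racking hardware.
	// The URL below is made up; substitute your provider's API.
	resp, err := http.Post("https://cloud.example.com/v1/servers",
		"application/json", bytes.NewReader(body))
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()
	fmt.Println("provisioning request accepted:", resp.Status)
}
```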
[0:05:48.8] CC: You're saying that if you are – because I don't want the admin people listening to this to stop listening and thinking, screw this. You're saying, if you're an admin, there will still be plenty of work for you to do? [0:06:03.8] NL: Yeah, there's always stuff to do, I think. If not, then I guess maybe it's time to find somewhere else to go. [0:06:12.1] DC: There was a really interesting presentation that really stuck with me when I was working for CoreOS, which is another infrastructure company. It was a presentation by our CTO, his name is Brandon Philips, and Brandon put together a presentation around the idea that every single day, there are, you know, so many thousand new users of the Internet coming online for the first time. That's so many thousand people who are like going to be storing their photos there, getting emails, doing all those things that we do in our daily lives with the Internet. That globally, across the whole world, there are only about, I think it was like 250,000 or 300,000 people that do what we do, that understand the infrastructure at a level that they might even be able to automate it, you know? That work in the IT industry and are able to actually facilitate the creation of those resources on which all of those applications will be hosted, right? This isn't even taking into account the number of applications per day that are brought onto the Internet or made available to users, right? That in itself is a whole different thing. How many people are putting up new webpages or putting up new content or what have you every single day. Fundamentally, I think that we have to think about the problem in a slightly different way. This isn't about whether we will have jobs, it's about how, when we are so outnumbered, do we as this relatively small force in the world handle the demand that is coming, that is already here today, right? Those people that are listening, who are working infrastructure today, you're even more valuable when you think about it in those terms, because there just aren't enough people on the planet today to solve those problems using the tools that we are using today, right? Automation is king and it has been for a long time, but it's not going anywhere. We need the people that we have to be able to actually support much larger numbers or a bigger scale of infrastructure than they know how to do today. That's the problem that we have to solve. [0:08:14.8] NL: Yeah, totally. [0:08:16.2] CC: Looking from the perspective of whoever is paying the bills: I think that in the past, as a developer, you had to request a server to run your app in the test environment and eventually you'd get it, and that would be the server that everybody would use to run against, right? Because you're the developer in the group and everybody's developing different features, and that one server is what we would use to push out changes to and do some level of manual testing, or maybe we'd have a QA person who would do it. That's one server or one resource or one virtual machine. Now, maybe I'm wrong, but as a developer, I think what I'm seeing is I will have access to compute and storage and I'll run a script and I boot up that resource just for myself. Is that more or less expensive? You know? If every single developer has this facility to speed things up much quicker, because we're not depending on IT, and if we have a script. 
I mean, the reality is not as easy as just running one command and you get it, but – If it's so easy, and that's what everybody is doing, doesn't it become expensive for the company? [0:09:44.3] NL: It can. I think when cloud native infrastructure really became more popular in the workplace and became more like mainstream, there was a lot of talk about the concept of sticker shock, right? It's the idea of: you had this predictable amount of money that was allocated to your infrastructure before. These things cost this much and their value will degrade over time, right? The server you had in 2005 is not going to be as valuable as the server you buy in 2010, but that might be a refresh cycle of like five years or so. But you have this predictable amount of money. Suddenly, you have this script that we're talking about that can spin up an equivalent resource to one of those servers, and if someone just leaves it running, that will run for a long time, and so irresponsible users of the cloud, or even just regular users of the cloud – it does cost a lot of money to use any of these cloud services, or it can cost a lot of money. Yes, there is some concern about the amount of money that these things cost, because honestly, as we're exploring cloud native topics, one thing keeps coming up: the cloud really just means somebody else's computer, right? You're not eating the cost of maintenance or the time it takes to maintain things, somebody else is, and you're paying that price upfront instead of doing it on like a yearly basis, right? It's less predictable and usually a bit more than people expected, right? But there is value there as well. [0:11:16.0] CC: You're saying if the user is diligent enough to terminate clusters or the machine, that is how you don't rack up unnecessary cost? [0:11:26.6] NL: Right. For a test, like say you want to spin up your code really quickly and just need a quick setup, like networking and compute resources, to test out your code, and you spin up a small, like a tiny instance somewhere in one of the clouds, test out your code and then kill the instance. That won't cost hardly anything. It didn't really cost you much on time either, right? You had this automated process, hopefully, or had a manual process that isn't too onerous, and you get the resource and the things you needed and you're good. If you aren't a good player and you keep it going, that can get very expensive very quickly. Because it's a number of resources used per hour, I think, is how most billing happens in the cloud. That can exponentially – I mean, it won't really exponentially grow, but it will increase in time to a value that you are not expecting to see. You get a bill and you're like, holy crap, what is this? [0:12:26.9] DC: I think it's also – I mean, this is definitely where things like orchestration come in, right? With the level of abstraction that you get from things like Kubernetes or Mesos or some other tools, you're able to provide access to those resources in a more dynamic way, with the expectation, and sometimes an explicit contract, that the workload you deploy will be deployed on common equipment, allowing for things like bin packing, which is a pretty interesting term when it comes to infrastructure and means that you can think of the fact that like, for a particular cluster, I might have 10 of those VMs that we talk about having high value. 
I intend to keep those running, and then my goal is to make sure that I have enough consumers of those 10 nodes to be able to get my value out of them. And so when I split up the environments – maybe each developer has their own namespace, right? – this gets me the ability to effectively oversubscribe those resources that I’m paying for, in a way that will reduce the overall cost of ownership – or not ownership, maybe cost of operation – for those 10 nodes. Let’s take a step back and go down memory lane. [0:13:34.7] NL: When did you first hear about the concept of IaaS, or cloud native infrastructure, or infrastructure as code? Carlisia? [0:13:45.5] CC: I think in the last couple of years; it pretty much coincided with when I started getting into cloud native and Kubernetes. I’m not clear on the difference between infrastructure as code for cloud native versus infrastructure as code for the cloud in general. Is there anything about cloud native that has different requirements and solutions? Or are we just talking about the cloud, and the same applies for cloud native? [0:14:25.8] NL: Yes, I think that they’re the same thing. Infrastructure as code for the cloud is inherently cloud native, right? Cloud native just means that whatever you’re trying to do leverages the tools and the contracts that a cloud provides. Basic infrastructure as code is basically just, how do I use the cloud, and that’s – [0:14:52.9] CC: In an automated way. [0:14:54.7] NL: In an automated way, or just in a way, right? A properly constructed cloud should have a user interface of some kind that uses an API, right? A contract to create these machines, or create these resources, so that the way the cloud creates its own resources is the same way you create the resources if you do it programmatically, right, with an orchestration tool like Ansible or Terraform. The API calls that its own UI makes need to be the same ones available to you, and if we have that, then we have a well-constructed cloud native environment. [0:15:32.7] DC: Yeah, I agree with that. I think, you know, from the perspective of cloud infrastructure, or cloud native infrastructure, the goal is definitely to have – it relies on one of the topics that we covered earlier in the program, around the idea of API first, or API driven, being such an intrinsic quality of any cloud native architecture, right? Because fundamentally, if we can’t do it programmatically, then we’re kind of stuck in that old world of racking servers or going through some human-managed process, and then we’re right back in the same state that we were in before, and there’s no way that we can actually scale our ability to manage these problems, because we’re stuck in a place where it’s one-to-one rather than one-to-many. [0:16:15.6] NL: Yeah, those APIs are critical. [0:16:17.4] DC: The APIs are critical. [0:16:18.4] NL: You bring up a good point; that reminded me of something. Not every cloud that you’re going to run into is made the same. There are some clouds that exist – and I’m not going to specifically call them out – where the API is people. You send in your request, and a human being interprets your request and makes the changes. That is a big no-no. Me no like, no good, my brain stopped. That is a poorly constructed cloud native environment. In fact, I would say it is not a cloud native environment at all; it can barely even call itself a cloud. Duffie, how about you? When was the first time you heard the concept of cloud native infrastructure?
[0:17:01.3] DC: I’m going to take this question in the form of, what was the first programmatic infrastructure-as-a-service that I played with. For me, that was actually back in the Nicira days, when we were virtualizing the network and effectively building an API that would allow you to create network resources, and one of the targets that we were developing for was things like XenServer, which at the time had a reasonable API that would allow you to create virtual machines but didn’t really have a good virtual network solution. There were also technologies like KVM – the ability to actually use KVM to create virtual machines, again with an API – although with KVM it wasn’t quite the same as an API; that’s kind of where things like OpenStack and those technologies came along and wrapped a lot of the capability of KVM behind a RESTful API, which was awesome. But yeah, I would say XenServer was the first one, and that gave me a command line option with which I could stand up and create virtual machines and give them so many resources and all that stuff. You know, from my perspective, Nicira was the first network API that I actually ever saw, and it was also one of the first ones that I worked on, which was pretty neat. [0:18:11.8] CC: When was that, Duffie? [0:18:13.8] DC: Some time ago. I would say like 2006, maybe? [0:18:17.2] NL: The TVs didn’t have color back then. [0:18:21.0] DC: Hey now. [0:18:25.1] NL: For me, for sort of cloud native infrastructure, it was the OpenStack days, I think, when I first heard the phrase infrastructure as a service. At the time, I didn’t even get close to grokking it – I still don’t know if I grok it fully – but I was working at Red Hat at the time, so this was probably back in like 2012, 2013, and we were starting to leverage OpenStack more, and having this API-driven toolset that could spin up these VMs or instances was really cool to me. But it’s something I didn’t really get into using until I got into Kubernetes, and specifically Kubernetes on different clouds such as AWS or Azure. We’ll touch on those a little bit later, but it was using those, and then having a CLI backed by an API that I could reference really easily to spin things up, that I was like, holy crap, this is incredible. That must have been around the 2015, ’16 timeframe, I think. I think I actually heard the phrase cloud native infrastructure first from our friend Kris Nova’s book, Cloud Native Infrastructure. I think that really helped me wrap my brain around what it really is to have something be cloud native infrastructure, and how the different clouds interact in this way. I thought that was a really handy book; I highly recommend it. Also well written and an interesting read. [0:19:47.7] CC: Yes, I read it back to back when I joined Heptio, and I need to read it again, actually. [0:19:53.6] NL: I agree, I need to go back to it. I think we’ve now touched on something that we normally do, which is, what does this topic mean to us. I think we kind of covered that a bit, but is there anything else that you all want to expand upon? Any aspect of infrastructure that we’ve not touched on? [0:20:08.2] CC: Well, I was thinking of saying something that reflects my first encounter with Kubernetes, when I joined Heptio and started using Kubernetes for the very first time, and I had such a misconception of what Kubernetes was.
Bear with what I’m going to say, because it touches on what I want to get at – I wanted to relate what infrastructure as code is not. [0:20:40.0] NL: That’s a very good point actually, I like that. [0:20:42.4] CC: And maybe what Kubernetes is not – I’m not clear yet on what it is I’m going to say. Hang with me. It’s all related. I work on Velero, and Velero runs on Kubernetes, so I have to have Kubernetes running for me to run Velero – whether on my machine, or on a cloud provider, or on-prem. For Kubernetes to run, I need to have a cluster running. Not one instance – because I was used to, like, yeah, I can bring up an instance or two, I’ve done this before – but bringing up a cluster, making sure that everything’s connected? I was like, what is this? When I started having to do that, I was thinking – I thought that’s what Kubernetes did. Why do I have to bother with this? Isn’t this what Kubernetes is supposed to do? Isn’t it like, you just run Kubernetes and all of that gets done? Then I had that realization, you know, that encounter with reality: no, I still have to boot up my infrastructure. That doesn’t go away because we’re doing Kubernetes. Kubernetes is this abstraction layer that sits on top of that infrastructure. Now what? Okay, I could do it manually – TGIK has a great episode where Joe goes and installs everything by hand, every single thing, right? Every single step that you need to do, hooking up the networking pieces. It’s brilliant. It really helps you visualize what happens when all of that stuff is brought up, because ideally you’re not doing it by hand, which is what we’re talking about here. I used, for example, CloudFormation on AWS, with a template that Heptio has in partnership with AWS. There is a template that you can use to bring up a cluster for Kubernetes, and everything’s hooked up – the proper networking piece is there. You have to do that first, and then you have Kubernetes installed as part of that template. But the take-home lesson for me was that I definitely wanted to do it using some sort of automation, because otherwise I don’t have time for that. [0:23:17.8] DC: Ain’t nobody got time for that, exactly. [0:23:19.4] CC: Ain’t got no time for that. That’s not my job. My job is something different; I’m just booting this stuff up to test my software. Absolutely very handy. And for people who are not working with Kubernetes yet, I just wanted to clarify that there is a separation: one thing is bringing your infrastructure up, then you install Kubernetes on top of that, and then, you know, you might have your application running in Kubernetes, or you can have an external application that interacts with Kubernetes as an extension of Kubernetes, right? Which is what the project that I work on is. [0:24:01.0] NL: That’s a good point, and that’s something we should dive into – I’m glad that you brought this up, actually. That’s a good example of a cloud native application using the cloud native infrastructure, and Kubernetes does a pretty good job of that all around. And so there’s the idea of Kubernetes itself as a platform. You have, like, a platform as a service – that’s kind of what you’re talking about – which is like, I just need to spin up a Kubernetes and then boom, I have a platform, and that comes with the infrastructure part of that.
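To make the template-driven bring-up Carlisia just described a bit more concrete: it can be a single scripted call against the cloud’s API. Here is a minimal sketch using boto3 and CloudFormation; the stack name, template URL, and parameters are placeholders for illustration, not the actual Heptio/AWS quick start values.

```python
# Sketch: standing up Kubernetes cluster infrastructure from a
# CloudFormation template, as described above. All names and URLs here
# are placeholders, not the real Heptio/AWS quick start.
import boto3

cf = boto3.client("cloudformation", region_name="us-west-2")

# Kick off stack creation; a template like this provisions the VPC,
# subnets, and instances, and installs Kubernetes on top of them.
cf.create_stack(
    StackName="my-k8s-cluster",
    TemplateURL="https://example.com/k8s-quickstart.template",  # placeholder
    Parameters=[{"ParameterKey": "KeyName", "ParameterValue": "my-ssh-key"}],
    Capabilities=["CAPABILITY_IAM"],  # templates like this create IAM roles
)

# Block until the stack, and the cluster inside it, is fully up.
cf.get_waiter("stack_create_complete").wait(StackName="my-k8s-cluster")
print("cluster infrastructure is up")
```

The point is exactly the one made above: the networking, the instances, and the Kubernetes install all come up from one automated call, and a matching delete_stack call tears it all down again.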
There are more and more of these managed Kubernetes offerings coming out that facilitate that function, and those are an aspect of cloud native infrastructure. Those are the managed services that I was referring to, where the administrators of the cloud are taking it upon themselves to do all that for you and then manage it for you, and I think that’s a great offering for people who just don’t want to get into the weeds, or don’t want to worry about the management of their application. Some of these – for instance, databases – these managed services are an awesome tool. And going back to Kubernetes a little bit as well, it is a great show of how a cloud native application can work with the infrastructure that it’s on. For instance, in Kubernetes, if you spin up a service of type LoadBalancer and it’s connected to a cloud properly, the cloud will create that object inside of itself for you, right? A load balancer in AWS is an ELB, it’s just a load balancer in Azure, and I’m not familiar with the terms that the other clouds use, but they will create these things for you. I think that’s the dopest thing on the planet. That is so cool, where I’m just like, I created this object over here in this one tool, and it told this other thing how to make that work in reality. [0:25:46.5] DC: That is so cool. Orchestration magic. [0:25:49.8] NL: Yeah, absolutely. [0:25:51.6] DC: I agree. And then, actually, I kind of wanted to make a point on that as well, which is that the way I interpreted your original question, Carlisia, was, “What is the difference between these different models – infrastructure as code versus, you know, infrastructure as a service versus platform as a service versus containers as a service – what differentiates these things?” For my part, I feel like it is effectively an evolution of the API – what the right entry point for your consumer is. When we consider container orchestration, we consider things like Mesos and Kubernetes and technologies like that, and we make the assumption that the right entry point for that user is an API in which they can define those things that we want to orchestrate, those containers or those applications. And we are going to provide, in the form of that platform, capabilities like, you know, service discovery, and being able to handle networking, and all of those things for you, so that all you really have to do is define what needs to be deployed and what other services it needs to rely on, and away you go. Whereas when you look at infrastructure as a service, the entry point is fundamentally different, right? We need to be thinking about what the infrastructure needs to look like. I might ask an infrastructure-as-a-service API how many machines I have running, what networks they are associated with, and how much memory and disk is associated with each of those machines. Whereas if I am interacting with a platform as a service, I might ask it questions about the primitives that are exposed by that platform: how many deployments do I have? What namespaces do I have access to? How many pods are running right now, versus how many I asked to be running? Those kinds of questions. [0:27:44.6] NL: Very good point, and yeah, I am glad that we explored that a little bit. Cloud native infrastructure is not nothing, but it is close to useless without a properly leveraged cloud native application of some kind.
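Nick’s LoadBalancer example is easy to show concretely. Here is a minimal sketch using the official Kubernetes Python client, assuming a cluster that is wired up to a cloud provider; the service name, selector, and ports are made up for illustration.

```python
# Sketch of the example above: one Service object of type LoadBalancer.
# On a cluster connected to a cloud provider, creating this single API
# object causes the cloud to provision a real load balancer (an ELB on
# AWS, for instance) and point it at the matching pods.
from kubernetes import client, config

config.load_kube_config()  # authenticate using your local kubeconfig
core = client.CoreV1Api()

service = client.V1Service(
    metadata=client.V1ObjectMeta(name="my-app"),  # illustrative name
    spec=client.V1ServiceSpec(
        type="LoadBalancer",
        selector={"app": "my-app"},  # pods this service fronts
        ports=[client.V1ServicePort(port=80, target_port=8080)],
    ),
)
core.create_namespaced_service(namespace="default", body=service)

# Once the cloud finishes provisioning, the external address appears in
# the service's status.loadBalancer.ingress field.
```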
[0:27:57.1] DC: These APIs all the way down give you this true flexibility and, like, the real functionality that you are looking for in a cloud, because as a developer, I don’t need to care how the networking works, or where my storage is coming from, or where these things are actually located, or how the API does any of these things. I want someone to do it for me, the API does that – success. That is what cloud native infrastructure gets you. Speaking of that, the things that you don’t care about: what was it like before cloud? What did we have before cloud native infrastructure? The things that come to mind are things like vSphere. I think vSphere is a bridge between the bare metal world and the cloud native world, and that is not to say that vSphere itself is necessarily cloud native – there are some limitations. [0:28:44.3] CC: What is vSphere? [0:28:46.0] DC: vSphere is a tool that VMware created a while back. I think it premiered back in the 2000-2001 timeframe, and it was a way to predictably create and manage virtual machines – a virtual machine being a kernel that sits on top of a kernel inside of a piece of hardware. [0:29:11.4] CC: So is vSphere to virtual machines what Kubernetes is to containers? [0:29:15.9] DC: Not quite the same thing, I don’t think, because fundamentally the underlying technologies are very different. Another way of explaining the difference, so that you have that history, is, you know, back in the day – the 2000s, the 90s, even currently – what we have is a layer of people who are involved in dealing with physical hardware. They rack 10 or 20 servers, and before we had orchestration and virtualization and any of those things, we would actually be installing an operating system and applications on those servers. Those servers would be web servers, and those servers would be database servers, and there would be a physical machine dedicated to a single task, like a web server or a database. Where vSphere comes in, and XenServer, and then KVM and those technologies, is that we think about this model fundamentally differently: instead of thinking about it like one application per physical box, we think about each physical box being able to hold a bunch of these virtual boxes that look like virtual machines. And so now those virtual machines are where we put our application code. I have a virtual machine that is a web server. I have a virtual machine that is a database server. I have that level of abstraction, and the benefit of this is that I can get more value out of those hardware boxes than I could before, right? Before, I had to buy one box per application; now I can buy one box for 10 or 20 applications. When we take that to the next level, when we get to container orchestration, we realized, “You know what? Maybe I don’t need the full abstraction of a machine. I just need enough of an abstraction to give me just enough isolation between those applications, such that they have their own file system but can share the kernel, they have their own network but can share the physical network – enough isolation between them that they can’t interact with each other except when intent is present, right?” That we have that sort of level of abstraction. You can see that this is a much more granular level of abstraction, and the benefit of that is that we are not actually trying to create a virtual machine anymore.
We are just trying to effectively isolate processes in a Linux kernel, and so instead of 20 or maybe 30 VMs per physical box, I can get 110 processes all day long, you know, on a physical box. And again, this takes us back to that concept that I mentioned earlier around bin packing. When we are talking about infrastructure, we have been on this eternal quest to make the most of what we have, to make the most of that infrastructure. You know, how do we actually – what tools and tooling do we need to be able to see the most efficiency for the dollar that we spend on that hardware? [0:32:01.4] CC: That is simultaneously a great explanation of containers and of how containers compare with virtual machines. Bravo. [0:32:11.6] DC: That was really well put – that was great. [0:32:14.0] CC: Now explain vSphere please, I still don’t understand it. I don’t know what it does. [0:32:19.0] NL: So vSphere is the one that creates the many virtual machines on top of a physical machine. It gives you the capability of having really good isolation between these virtual machines, and inside of these virtual machines you feel like you have a metal box – but it is not a metal box, it is just a process running on a metal box. [0:32:35.3] CC: All right, so it is a system that holds multiple virtual machines inside the same machine. [0:32:41.8] NL: Yeah. So think of it this way: in the cloud native infrastructure world, vSphere is essentially the API, or you could think of it as the cloud itself – in a sense, an AWS for your own datacenter. The difference being that there isn’t a particularly useful API that vSphere exposes, so it is harder for people to leverage programmatically, which makes it difficult for me to call vSphere a cloud native tool. It is a great tool, and it has worked wonders, and still works wonders, for a lot of companies throughout the years, but I would hesitate to lump it into cloud native functionality. So prior to cloud native, or infrastructure as a service, we had these tools like vSphere, which allowed us to make smaller and smaller VMs, or smaller and smaller compute resources, on a larger compute resource. And going back to something we were talking about – containers and how you spin up these processes – prior to things like containers, in this world of VMs, a lot of times what you would do is create a VM that had your application already installed in it. It is burned into the image, so that when that VM stood up, it would spin up that process. So that would be one way that you would start processes. The other way would be through orchestration tools similar to Ansible – tools that existed before it, or Ansible itself – that essentially just ran SSH against a number of servers to start up these processes, and that is how you’d get distributed systems prior to things like cloud native and containers. [0:34:20.6] CC: Makes sense. [0:34:21.6] NL: And before we had vSphere, we already had XenServer. Before we had virtual machine automation – which is what these tools are – we had bare metal. We just had Joes like Duffie and me cutting our hands on rack equipment. A server was never installed properly unless my blood was on it, essentially, because they are heavy, and it is all metal, and it is sharp in some capacity, and so you would inevitably squash a hand or something.
And so you’d rack up the server, and then you’d plug in all the things, plug in all the plugs, and then set it up manually, and then it’s good to go, and then someone can use it at will – or in a logical manner, hopefully – and that is what we had to do before. It’s like, “Oh, I need a new compute resource. Okay, well, let’s call Circuit City or whoever, let’s call Newegg, and get a new server in, and there you go,” and then there is a process for that. And yeah, I don’t know, I’m not dying of blood loss anymore. [0:35:22.6] CC: And still a lot of companies are using bare metal as a matter of course, which brings up another question, if one would want to ask, which is: is it worth it these days to run bare metal if you have the clouds, right? One example is, we see companies like Uber and Lyft – all of these super high-volume companies – using public clouds, which is to say, paying another company for all the data – traffic of data, storage of data, and compute – and security and everything. And you know, one could say you would save a ton of money having all of that in house on bare metal, and other people would say there is no way – it costs too much to host all of that yourself. [0:36:29.1] NL: I would say it really depends on the environment. So usually, I think that if a company is using only cloud native resources to begin with, it is hard to make the transition to bare metal, because you are used to these tools being in place. A company that is more familiar with it – like, they came from a bare metal background and moved to cloud – may leverage both in a hybrid fashion well, because they should have tooling, or they can now create tooling, so that their bare metal environment can mimic the functionality of the cloud native platform. It really depends on the company, and also on the need for security and data retention and all of these things. If you need this granularity of control, bare metal might be a better approach, because if you need to make sure that your data doesn’t leave your company in a specific way, putting it on somebody else’s computer probably isn’t the best way to handle that. So there is a balancing act: how many resources are you using in the cloud, and how are you using the cloud, and what does that cost look like, versus what does it cost to run a data center – to have the physical real estate, and then run the electricity, the HVAC, and people’s jobs, like their salaries, specifically to manage that location and your compute and network resources and all of these things. That is a question that each company will have to ask. There is no real hard and fast answer, but for many smaller companies, something like the cloud is better, because you don’t have to think about all these other costs associated with running your application. [0:38:04.2] CC: But then if you are a small company – I mean, if you are a small company, it is a no-brainer. It makes sense for you to go to the clouds. But then you said that it is hard to transition from the clouds to bare metal. [0:38:17.3] NL: It can be, and it really depends on the people that you have working for you. Like, if the people you have working for you are good at creating automation and are good at managing infrastructure of any kind, it shouldn’t be too bad. But as we’re moving more and more into a cloud-focused world, I wonder if those people are going to start going away. [0:38:38.8] CC: For people who are just listening on the podcast, Duffie was heavily nodding as Nick was saying that. [0:38:46.5] DC: I was.
I do, I completely agree with the statement that it depends on the people that you have, right? And fundamentally, I think the point I would add to this is: Uber, or a company like Uber, or a company like Lyft – how many people do they employ that are focused on infrastructure, right? I don’t know the answer to this question, but I am positing it, right? This is a pretty hot market for people who understand infrastructure, so they are not going to have a ton of them. So what specific problems do they want those people that they employ, that are focused on infrastructure, to solve, right? And thinking about this as a scale problem: say I have 10 people that are going to manage the infrastructure for all of the applications that Uber has – maybe I have 20 people – but it is going to be a small percentage compared to the number of people that I have developing those applications, or doing marketing, or elsewhere within the company. I am going to quickly find myself off balance in the number of applications that I need to actually support, that I need to operationalize to get deployed, right? Versus the number of people that I have doing the infrastructure work to provide the base layer on which those applications will be deployed, right? So I look around in the market and I see things like orchestration – I see things like Kubernetes and Mesos and technologies like that – that can provide kind of a multiplier, or even AWS or Azure or GCP. These things act as a multiplier for those 10 infrastructure folks that I have, right? Because now they are not trying to figure out how to store all of this data; they are not worried about data centers, they are not worried about servers, they are not worried about networking, right? You can actually have ten people that are decent at infrastructure who can spin up very large amounts of infrastructure to satisfy the development and operational needs of a company the size of Uber, reasonably. But if we had to go back to a place where we had to do all of that in the physical world – rack those servers, deal with the power, deal with the colo space, deal with all the cabling and all of that stuff – 10 people are probably not enough. [0:40:58.6] NL: Yeah, absolutely. To put it into numbers: if you are in the cloud native workspace, you may need 1% of your workforce dedicated to infrastructure, but if you are in the bare metal world, you might need 10 to 20% of your workforce dedicated just to running infrastructure, because the overhead of people’s work is so much greater, and a lot of it is focused on things that are tangible instead of things that are fundamental, like automation, right? So that 1% for Uber – and I am pulling this number totally out of nowhere – that one percent, their job is usually focused on automating the allocation of resources and figuring out the tools that they can use to better leverage those clouds. In the bare metal environment, those people’s jobs are more like, “Oh crap, suddenly it’s 90 degrees in the data center in San Jose, what is going on?” and then having someone go out there and figure out what physical problem is going on.
It is like their day-to-day lives. And something I want to touch on really quickly as well with bare metal: prior to bare metal, we had something kind of interesting that we were able to leverage from an infrastructure standpoint, and that is mainframes – and we are actually going back a little bit to the mainframe idea. The idea of a mainframe is – it is almost like a cloud native technology, but not really. Instead of, like, you spinning up X number of resources and networking and making that work, with a mainframe, the way it works is that everyone uses the same compute resource. It is just one giant compute resource. There is no networking needed, because everyone is connected to the same resource, and they would just do whatever they needed – write their code, run it, be done with it – and then come off. And it is a really interesting idea, I think, where the cloud almost mimics the mainframe idea: everyone just connects to the cloud, does whatever they need, and then comes off, but at a different scale. Yeah – Duffie, do you have any thoughts on that? [0:43:05.4] DC: Yeah, I agree with your point. I think it is interesting to go back to mainframe days, and I think, from the perspective of what a mainframe is versus the idea of a cluster and those sorts of things, it is kind of about the domain of what you’re looking at. A mainframe considers everything to be within the same physical domain, whereas when you start getting into the larger architectures, or some of the more scalable architectures, you find that, just like in any distributed system, we are spreading that work across a number of physical nodes. And so we think about it fundamentally differently, but the parallels are interesting between the work that we are doing today versus what we were doing in mainframe times. [0:43:43.4] NL: Yeah, cool. I think we are getting close to wrap-up time, but something that I wanted to touch on rather quickly: we have mentioned these things by name, but I want to go over some of the cloud native infrastructures that we use on a day-to-day basis. First up, Amazon’s AWS is pretty much the number one cloud, I’m pretty sure, right? It has the most users, and they have a really well-structured API, a really good CLI piece, and a GUI – sometimes there are some problems with it – and it is the thing that people think of when they think cloud native infrastructure. Going back to that point again – agreed, AWS is one of the largest cloud providers and certainly has the most adoption as a base for cloud native infrastructure – it is really interesting to see a number of solutions out there. IBM has one, GCP, Azure; there are a lot of other solutions out there now that are really focused on trying to follow the same – maybe not the exact same patterns that AWS has, but certainly providing this consistent API for all of the same resources, or all of the same services, and maybe some differentiating services as well. So yeah, you can definitely tell it is sort of like follow-the-leader. You can definitely tell that AWS stumbled onto a really great solution there, and all of the other cloud providers are jumping on board, trying to get a piece of that as well. [0:45:07.0] DC: Yep.
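To ground that “really well-structured API” point, here is a minimal sketch of the workflow described earlier in the episode – spin up a tiny test instance through the cloud’s API, use it, and terminate it so it stops billing. It uses boto3, AWS’s Python SDK; the AMI ID, key name, and region are placeholders, not recommendations.

```python
# Sketch: the "spin it up, test it, kill it" workflow via the AWS API.
# The AMI ID and key name below are placeholders; substitute your own.
import boto3

ec2 = boto3.client("ec2", region_name="us-west-2")

resp = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",  # placeholder AMI
    InstanceType="t3.micro",          # tiny and cheap for a quick test
    MinCount=1,
    MaxCount=1,
    KeyName="my-ssh-key",             # placeholder key pair
)
instance_id = resp["Instances"][0]["InstanceId"]

# ... run your test against the instance here ...

# Being a "good player": terminate the instance when done, so the
# per-hour billing stops instead of quietly piling up.
ec2.terminate_instances(InstanceIds=[instance_id])
```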
[0:45:07.8] NL: And also, something we can touch on a little bit as well: from a cloud native infrastructure standpoint, it isn’t wrong to say that a bare metal data center can be a cloud native infrastructure. As long as you have the tooling in place, you can have your own cloud native infrastructure – your own private cloud, essentially. And I know that “private cloud” doesn’t actually make any sense, that is not how cloud works, but you can have a cloud native infrastructure in your own data center. It takes a lot of work, and it takes a lot of management, but it isn’t something that exists solely in the realm of Amazon, IBM, Google, or Microsoft – and I can’t remember the other names, you know, the other ones that are running clouds as well. [0:45:46.5] DC: Yeah, agreed. And actually, one of the questions you asked earlier, Carlisia, that I didn’t get a chance to answer was, do you think it is worth running bare metal today? In my opinion, the answer will always be yes, right? Especially if the line that we draw in the sand is container isolation, or container orchestration – then there will always be a good reason to run on bare metal: to expose resources that are available to us against a single bare metal instance. Things like GPUs, or other physical resources like that. Or maybe we just need really, really fast disk, and we want to make sure that we provide those containers access to the underlying SSDs. There is technology, certainly introduced by VMware, that exposes real hardware from the underlying hypervisor up to the virtual machine where these particular containers might run. But you know, I always come back to the idea that, when thinking about those levels of abstraction that provide access to resources like GPUs and those sorts of things, you have to consider that simplicity is king here, right? As we think about the different fault domains, or the failure domains, as we are coming up with these sorts of infrastructures, we have to think about what it looks like when they fail, how they fail, and how we actually go about allocating those resources across particular machines. And that is why I think that bare metal and technologies like that are not going away – I think they will always be around. But to the next point, and I think we covered this pretty thoroughly in this episode: having an API between me and the infrastructure is not something I am willing to give up. I need that to be able to actually solve my problems at scale – even reasonable scale. [0:47:26.4] NL: Yeah, you mean you don’t want to go back to the bad old days of telnetting into a Juniper switch and setting up your IP – not iptables, it was your IP config commands. [0:47:39.9] DC: Did you just say Telnet? [0:47:41.8] NL: I said Telnet, yeah. Or serial – serial connecting into it. [0:47:46.8] DC: Nice, yeah. [0:47:48.6] NL: All right, I think that pretty much covers it for cloud native infrastructure. Do you all have any finishing thoughts on the topic? [0:47:54.9] CC: No, this was great. Very informative. [0:47:58.1] DC: Yeah, I had a great time. This is a topic that I very much enjoy. Things like Kubernetes and the cloud native infrastructure that we exist in now are what I always wanted us to get to.
When I was in university, I was like, “Oh man, someday we are going to live in a world with virtual machines” – and I didn’t even have the idea of containers yet – “where people can really easily play with applications.” I was amazed that we weren’t there yet, and I am so happy to be in this world now. Not to say that I think we can stop and don’t need to keep improving – of course not. We are not at the end of the journey by far, but I am so happy with where we are at right now. [0:48:34.2] CC: As a developer, I have to say I am too, and I had this thought in my mind as we were having this conversation: I am so happy that we are where we are. Obviously not everybody is there yet, but as people start practicing cloud native development, they will come to realize what it is that we are talking about. I mean, I said it before: I remember the days when, for me to get access to a server, I had to file a ticket and wait for somebody’s approval – maybe I wouldn’t get an approval – and when I say me, I mean my team; one of us would do the request. Then you had that server, and everybody was pushing to that server – like, one main version of our app would run there, and maybe every day we would get a fresh copy. The way it is now, I don’t have to depend on anyone, and yes, it is a little bit of work to have to run this choreography to put up the clusters, but it is so good that it is all mine. [0:49:48.6] DC: Yeah, exactly. [0:49:49.7] CC: I am selfish that way. No, seriously, I don’t have to wait for code to merge and be pushed. It is my code that is right there, sitting on my computer. I push it to the cloud, boom, I can test it. Sometimes I can test it on my machine, but some things I can’t. So when I am doing Velero work and I have to push it to the cloud provider, it is amazing. I don’t know, I feel very autonomous, and that makes me very happy. [0:50:18.7] NL: I totally agree, for that exact reason: testing things out. Maybe something isn’t ready for somebody else to see – it is not ready for prime time – so I need something really quick to test it out. But also, for me, I am the avatar of unwitting chaos, meaning basically everything I touch will eventually blow up. So it is also nice that whatever I do that’s weird isn’t going to affect anybody else either, and that is great. Blast radius is amazing. All right, so I think that pretty much wraps it up for our cloud native infrastructure episode. I had an awesome time. This is always fun, so please send us your concepts and ideas in the GitHub issue tracker. You can see our existing episodes and suggestions, and add your own, at github.com/heptio/thekubelets – go to the issues tab and file a new issue, or see what is already there. All right, we’ll see you next time. [0:51:09.9] DC: Thank you. [0:51:10.8] CC: Thank you, bye. [END OF INTERVIEW] [0:51:13.9] KN: Thank you for listening to The Podlets Cloud Native Podcast. Find us on Twitter at https://twitter.com/ThePodlets and on the https://thepodlets.io website, where you’ll find transcripts and show notes. We’ll be back next week. Stay tuned by subscribing. [END] See omnystudio.com/listener for privacy information.
Welcome to the first episode of The Podlets Podcast! On the show today we’re kicking it off with some introductions to who we all are, how we got involved in VMware and a bit about our career histories up to this point. We share our vision for this podcast and explain the unique angle from which we will approach our conversations, a way that will hopefully illuminate some of the concepts we discuss in a much greater way. We also dive into our various experiences with open source, share what some of our favorite projects have been and then we define what the term “cloud native” means to each of us individually. The contribution that the Cloud Native Computing Foundation (CNCF) is making in the industry is amazing, and we talk about how they advocate the programs they adopt and just generally impact the community. We are so excited to be on this podcast and to get feedback from you, so do follow us on Twitter and be sure to tune in for the next episode! Note: our show changed name to The Podlets. Follow us: https://twitter.com/thepodlets Hosts: Carlisia Campos, Kris Nóva, Josh Rosso, Duffie Cooley, Nicholas Lane. Key Points from This Episode: An introduction to us, our career histories and how we got into the cloud native realm. Contributing to open source and everyone’s favorite project they have worked on. What the purpose of this podcast is and the unique angle we will approach topics from. The importance of understanding the “why” behind tools and concepts. How we are going to be interacting with our audience and create a feedback loop. Unpacking the term “cloud native” and what it means to each of us. Differentiating between the cloud native apps and cloud native infrastructure. The ability to interact with APIs as the heart of cloud native. More about the Cloud Native Computing Foundation (CNCF) and their role in the industry. Some of the great things that happen when a project is donated to the CNCF. The code of conduct that you need to adopt to be part of the CNCF. And much more!
Quotes: “If you tell me the how before I understand what that even is, I'm going to forget.” — @carlisia [0:12:54] “I firmly believe that you can't – that you don't understand a thing if you can't teach it.” — @mauilion [0:13:51] “When you're designing software and you start your main function to be built around the cloud, or to be built around what the cloud enables you to do and the services a cloud offers you, that is when you start to look at cloud native engineering.” — @krisnova [0:16:57] Links Mentioned in Today’s Episode:
Kubernetes — https://kubernetes.io/
The Podlets on Twitter — https://twitter.com/thepodlets
VMware — https://www.vmware.com/
Nicholas Lane on LinkedIn — https://www.linkedin.com/in/nicholas-ross-lane
Red Hat — https://www.redhat.com/
CoreOS — https://coreos.com/
Duffie Cooley on LinkedIn — https://www.linkedin.com/in/mauilion
Apache Mesos — http://mesos.apache.org/
Kris Nova on LinkedIn — https://www.linkedin.com/in/kris-nova
SolidFire — https://www.solidfire.com/
NetApp — https://www.netapp.com/us/index.aspx
Microsoft Azure — https://azure.microsoft.com/
Carlisia Campos on LinkedIn — https://www.linkedin.com/in/carlisia
Fastly — https://www.fastly.com/
FreeBSD — https://www.freebsd.org/
OpenStack — https://www.openstack.org/
Open vSwitch — https://www.openvswitch.org/
Istio — https://istio.io/
The Kubelets on GitHub — https://github.com/heptio/thekubelets
Cloud Native Infrastructure on Amazon — https://www.amazon.com/Cloud-Native-Infrastructure-Applications-Environment/dp/1491984309
Cloud Native Computing Foundation — https://www.cncf.io/
Terraform — https://www.terraform.io/
KubeCon — https://www.cncf.io/community/kubecon-cloudnativecon-events/
The Linux Foundation — https://www.linuxfoundation.org/
Sysdig Falco — https://sysdig.com/opensource/falco/
OpenEBS — https://openebs.io/
Aaron Crickenberger — https://twitter.com/spiffxp
Transcript: [INTRODUCTION] [0:00:08.1] ANNOUNCER: Welcome to The Podlets Podcast, a weekly show that explores Cloud Native one buzzword at a time. Each week, experts in the field will discuss and contrast distributed systems concepts, practices, tradeoffs and lessons learned to help you on your cloud native journey. This space moves fast and we shouldn’t reinvent the wheel. If you’re an engineer, operator or technically minded decision maker, this podcast is for you. [EPISODE] [0:00:41.3] KN: Welcome to the podcast. [0:00:42.5] NL: Hi. I’m Nicholas Lane. I’m a cloud native architect. [0:00:45.0] CC: Who do you work for, Nicholas? [0:00:47.3] NL: I work for VMware, formerly of Heptio. [0:00:50.5] KN: I think we’re all VMware, formerly Heptio, aren’t we? [0:00:52.5] NL: Yes. [0:00:54.0] CC: That is correct. It just happened that way. Now Nick, why don’t you tell us how you got into this space? [0:01:02.4] NL: Okay. I originally got into the cloud native realm working for Red Hat as a consultant. At the time, I was doing OpenShift consultancy. Then my boss, Paul, Paul London, left Red Hat and I decided to follow him to CoreOS, where I met Duffie and Josh. We were on the field engineering team there and the sales engineering team. Then from there, I found myself at Heptio, and now with VMware. Duffie, how about you? [0:01:30.3] DC: My name is Duffie Cooley. I’m also a cloud native architect at VMware, also recently Heptio and CoreOS. I’ve been working in technologies like cloud native for quite a few years now. I started my journey moving from virtual machines into containers with Mesos.
I spent some time working on Mesos, and actually worked with a team of really smart individuals to try and develop an API in front of that crazy Mesos thing. Then we realized, “Well, why are we doing this? There is one that’s called Kubernetes. We should jump on that.” That’s the direction my time with containerization and cloud native stuff has taken. How about you, Josh? [0:02:07.2] JR: Hey, I’m Josh. Similar to Duffie and Nicholas, I came from CoreOS, then to Heptio, and then eventually VMware. I actually got my start in the middleware business, oddly enough, where we worked on the Egregious Spaghetti Box, or the ESB as it’s formally known. I got to see over time how folks were doing a lot of these, I guess, more legacy monolithic applications, and that sparked my interest in learning a bit about some of the cloud native things that were going on. At the time, CoreOS was at the forefront of that. It was a natural progression based on the interests, and I had a really good time working at Heptio with a lot of the folks that are on this call right now. Kris, you want to give us an intro? [0:02:48.4] KN: Sure. Hi, everyone. Kris Nova. I’ve been doing SRE, DevOps, infrastructure for about a decade now. I used to live in Boulder, Colorado. I came out of a couple startups there. I worked at SolidFire; we went to NetApp, and I used to work on the Linux kernel there some. Then I was at Deis for a while, when I first started contributing to Kubernetes. We got bought by Microsoft. I left Microsoft – the Azure team, where I was working on the original managed Kubernetes there – and joined up with Heptio, where I met all of these fabulous folks. I think I wrote a book, and I’ve been doing a lot of public speaking and some other junk along the way. Yeah. Hi. What about you, Carlisia? [0:03:28.2] CC: All right. I think it’s really interesting that all the guys are lined up on one call and all the girls on another call. [0:03:34.1] NL: We should have probably broken it up more. [0:03:36.4] CC: I am a developer and have always been a developer. Before joining Heptio, I was working for Fastly, which is a CDN company, helping them build the latest generation of their TLS management system. At some point during my stay there, Kevin Stuart was posting on Twitter: join Heptio. At this point, Heptio was, I don’t know, between six months and a year old. I saw those tweets go by, and I’m like, “Yeah, that sounds interesting, but I’m happy where I am.” I have a very good friend, Kennedy, actually. He saw those tweets, and he kept saying to me, “You should apply. You should apply, because they are great people. They do great things. Kubernetes is so hot.” I’m like, “I’m happy where I am.” Eventually, I contacted Kevin, and he also said, “Yeah, it would be a perfect match.” Two months later, I decided to apply. The people are amazing. I did think that Kubernetes was really hard, but my decision came down to two things. The people are amazing, and some people who were working there I already knew from previous opportunities. Some of the people that I knew – I mean, I love everyone. The other thing was that it was an opportunity for me to work with open source. I definitely could not pass that up. I could not be happier to have made that decision, and now, with VMware acquiring Heptio, like everybody here, I’m at VMware. Still happy. [0:05:19.7] KN: Has everybody here contributed to open source before? [0:05:22.9] NL: Yup, I have. [0:05:24.0] KN: What’s everybody’s favorite project they’ve worked on?
[0:05:26.4] NL: That’s an interesting question. From a business aspect, I really like Dex. Dex is an identity provider, or middleware for identity providers. It provides an OIDC endpoint for multiple different identity providers, so you can absorb them into Kubernetes. Since Kubernetes only accepts OIDC JWTs for authentication, that functionality that Dex provides is probably my favorite thing. Although, if I’m going to be truly honest, I think right now the thing that I’m most excited about working on is my own project, which ties into my interest in doing chaos engineering. What about you guys? What’s your favorite? [0:06:06.3] KN: I understood some of those words. NL: Those are things we’ll touch on in different episodes. [0:06:12.0] KN: Yeah. I worked on FreeBSD for a while. That was my first welcome to open source. I mean, that was back in the olden days of IRC clients and writing C. I had a lot of fun, and I’m still really close with a lot of folks in the FreeBSD community, so that always has a special place in my heart. I think that was my first experience of, “Oh, this is how you work on a team, and you work collaboratively together, and it’s okay to fail and be open.” [0:06:39.5] NL: Nice. [0:06:40.2] KN: What about you, Josh? [0:06:41.2] JR: I worked on a project at CoreOS – well, a project that’s still out there – called the ALB Ingress Controller. It was a way to bring the AWS ALBs, which are just layer 7 load balancers, together with the Kubernetes Ingress API – to attach those two together so that the ALB could serve ingress. The reason it was the most interesting, technology aside, is just that it went from something that we started, just myself and a colleague, to something that eventually gained community adoption. We had to go through the process of just being us two worrying about our concerns, to having to bring on a large community that had their own business requirements and needs, and having to say no at times, and having to encourage individuals to contribute when they had ideas and issues, because we didn’t have the bandwidth to solve all those problems. It was interesting, not necessarily from a technical standpoint, but just to see what it actually means when something starts to gain traction. That was really cool. Yeah, how about you, Duffie? [0:07:39.7] DC: I’ve worked on a number of projects, but I find that generally where I fit into the ecosystem is helping other people adopt open source technologies. I spent quite a bit of my time working on OpenStack, and I spent some time working on Open vSwitch, and recently on Kubernetes. Generally speaking, I haven’t found myself to be much of a contributor of code to those projects per se; it’s more that my work is enabling people to adopt those technologies, because I understand the breadth of the project more than the detail of some particular aspect. Lately, I’ve been spending some time working more on the SIG Network and SIG Cluster Lifecycle stuff. Some of the projects that have really caught my interest are things like Kind, which is Kubernetes in Docker, and working on kubeadm itself – just making sure that we don’t miss anything obvious in the way that kubeadm is being used to manage infrastructure. [0:08:34.2] KN: What about you, Carlisia? [0:08:36.0] CC: I realize that the project I’m working on at VMware is coincidentally the open source project that is my favorite.
I didn’t have a lot of experience with open source – just minor contributions here and there before this project. I’m working on Velero. It’s a disaster recovery tool for Kubernetes. Like I said, it’s open source. We’re coming up on version 1 pretty soon. The other maintainers are amazing – super knowledgeable, very experienced, mature. It is such a joy to work with them. My favorite. [0:09:13.4] NL: That’s awesome. [0:09:14.7] DC: Should we get into the concept of cloud native and start talking about what we each think of this thing? It seems like a pretty loaded topic. There are a lot of people who would think of cloud native as just a generic term; we should probably try and nail it down here. [0:09:27.9] KN: I’m excited for this one. [0:09:30.1] CC: Maybe we should talk about what this podcast show is going to be? [0:09:34.9] NL: Sure. Yeah. Totally. [0:09:37.9] CC: Since this is our first episode. [0:09:37.8] NL: Carlisia, why don’t you tell us a little bit about the podcast? [0:09:40.4] CC: I will be glad to. The idea that we had was to have a show where we can discuss cloud native concepts. As opposed to talking about particular tools or particular projects, we are going to aim to talk about the concepts themselves, and approach each one from the perspective of a distributed systems idea, or issue, or a cloud native concept. From there, we can talk about what this problem really is, what people or companies have this problem, what the usual solutions are, and what the alternative ways to solve it are. Then we can talk about tools that are out there that people can use. I don’t think there is a show that approaches things from this angle. I’m really excited about bringing this to the community.
When I started working with Kubernetes, a lot of things I didn’t quite grasp – and that’s a super understatement. I noticed that – I mean, I can ask questions, no problem. I will dig through and find out and learn. The problem is that in talking to experts, a lot of the time when people ask a question – well, let me talk about myself. A lot of the time when I ask a question, the experts jump right to the how. What is this? “Oh, this is how you do it.” I don’t know what this is. Back off a little bit, right? Back up. I don’t know what this is. Why is it doing this? I don’t know. If you tell me the how before I understand what that even is, I’m going to forget. That’s what’s going to happen. I mean, it’s great you’re trying to make an effort and show me how to do something, but – and this is personal, the way I learn – I need to understand the what first. This is why I’m so excited about this show. It’s going to be awesome. This is what we’re going to talk about. [0:13:19.2] DC: Yeah, I agree. This is definitely one of the things that excites me about this topic as well. I find my secret superpower is troubleshooting. That means that I can actually understand what the expected relationships between things should be, right? Without really digging into the actual problem, and what the people who were developing the code were trying to solve, or how they thought about it, it’s hard to get to the point where you fully understand that distributed system. I think this is a great place to start. The other thing I’ll say is that I firmly believe that you can’t – that you don’t understand a thing if you can’t teach it. This podcast, for me, is about that. Let’s bring up all the questions, and we should enable our audience to actually ask us questions somehow, and get to a place where we can get as many perspectives on a problem as we can, such that we can really dig into the detail of what the problem is before we ever talk about how to solve it. Good stuff. [0:14:18.4] CC: Yeah, absolutely. [0:14:19.8] KN: Speaking of a feedback loop from our audience, and taking the problem first and the solution second – how do we plan on interacting with our audience? Do we want to maybe start a GitHub repo, or what are we thinking? [0:14:34.2] NL: I think a GitHub repo makes a lot of sense. I also wouldn’t mind doing some social media malarkey – maybe having a Twitter account that we run or something like that, where people can ask questions too. [0:14:46.5] CC: Yes. Yes to all of that. Yeah – having an issue list in a repo where people can just add comments, praise, thank-yous, questions, suggestions for concepts to talk about, and say, “Hey, I have no clue what this means. Can you all talk about it?” Yeah, we’ll talk about it. Twitter, yes – interact with us on Twitter. I believe our Twitter handle is TheKubelets. [0:15:12.1] KN: Oh, we already have one. Nice. [0:15:12.4] NL: Yes. See, I’m learning something new already. [0:15:15.3] CC: We already have one. I thought you all were joking. We have the Kubelets repo – we have a GitHub repo called – [0:15:22.8] NL: Oh, perfect. [0:15:23.4] CC: heptio/thekubelets. [0:15:27.5] DC: The other thing I like that we do in TGIK is this HackMD thing, although I’m trying to figure out how we could really make that work for us in a show that’s recorded every week like this one.
I think maybe what we could do is have it so that when people listen to the recording, they could go to the HackMD document and put in questions or comments around things they would like to hear more about, or maybe share their perspectives on these topics. Maybe in the following week, we could just go back and review what came in during that period of time, or during the next session. [0:15:57.7] KN: Yeah. Maybe we're merging the HackMD on the next recording. [0:16:01.8] DC: Yeah. [0:16:03.3] KN: Okay. I like it. [0:16:03.6] DC: Josh, you have any thoughts? Friendster, MySpace, anything like that? [0:16:07.2] JR: No. I think we could pass on MySpace for now, but everything else sounds great. [0:16:13.4] DC: Do we want to get into the meat of the episode? [0:16:15.3] KN: Yeah. [0:16:17.2] DC: Our true topic: what does cloud native mean to all of us? Kris, I'm interested to hear your thoughts on this. You might have written a book about this? [0:16:28.3] KN: I co-authored a book called Cloud Native Infrastructure, and the term means a lot of things to a lot of people. It's one of those umbrella terms, like DevOps. It's up to you to interpret it. I've spent the past couple of years working in the cloud native space and working directly with the CNCF, the Cloud Native Computing Foundation, as a CNCF ambassador; they're the open source nonprofit folks behind this term cloud native. I think the best definition I've been able to come up with is that when you're designing software, and you start your main function to be built around the cloud, or around what the cloud enables you to do and the services the cloud offers you, that is when you start to look at cloud native engineering. I think all cloud native infrastructure is, is designing software that manages and mutates infrastructure in that same way. I think the underlying theme here is we're no longer copying configuration to disk and doing systemd restarts. Now we're just sending HTTPS API requests and getting messages back. Hopefully, if the cloud has done what we expect it to do, that broadcasts some broader change. As software engineers, we can count on those guarantees to design our software around. I really think that you need to understand that it's starting with the main function first and completely engineering your app around these new ideas and these new paradigms, and not necessarily a migration of a non-cloud-native app. I mean, you technically could go through and do it. Sure, we've seen a lot of people do it, but I don't think that's technically cloud native. That's cloud alien. Yeah. I don't know. That's just my thought. [0:18:10.0] DC: Are you saying that the cloud native approach is a greenfield approach, generally? To be a cloud native application, you're going to take that into account in the DNA of your application? [0:18:20.8] KN: Right. That's exactly what I'm saying. [0:18:23.1] CC: It's interesting that you mentioned cloud alien, because that plays into the way I would describe the meaning of cloud native. I think Nova described it beautifully, and it really shows her know-how. For me, if I have to describe it, I will just parrot things that I have read, including her book. To explain what it means to me, I'm going to use a metaphor. Given my accent, I'm obviously not American born, and so I'm a foreigner. Although I do speak English pretty well, I'm not native. English is not my native tongue.
I speak English really well, but there are certain hiccups that I'm going to have every once in a while. There are things that I'm not going to know how to say, or it's going to take me a bit long to remember. I rarely run into not understanding something in English, but it happens sometimes. That's the same with a cloud native application. If it hasn't been built to run on cloud native platforms and systems, you can migrate an application to a cloud native environment, but it's not going to fully utilize the environment like a native app would. That's my take. [0:19:56.3] KN: Cloud immigrant. [0:19:57.9] CC: Cloud immigrant. Is Nick a cloud alien? [0:20:01.1] KN: Yeah. [0:20:02.8] CC: Are they cloud native aliens? Yeah. [0:20:07.1] JR: On that point, I'd be curious if you all feel there is a need to discern the notion of cloud native infrastructure, or platforms, from the notion of cloud native apps themselves. Where I'm going with this, it's funny hearing the greenfield thing and what you said, Carlisia, with the immigration notion, if you will. Oftentimes, you see these very cloud native platforms, things like Kubernetes, or even Mesos, or whatever it might be. Then you see the applications themselves. Some people are using these platforms that are cloud native to be a forcing function, to make a lot of their legacy stuff adopt more cloud native principles, right? There's this push and pull. It's like, "Do I make my app more cloud native? Do I make my infrastructure more cloud native? Do I do them both at the same time?" I'd be curious what your thoughts are on that, or if that resonates with you at all. [0:21:00.4] KN: I've got a response here, if I can jump in. Of course, Nova with opinions. Who would have thought? I think what I'm hearing here, Josh, is that as we're using these cloud native platforms, we're forcing the hand of our engineers. In a world where we may be used to just sending this blind DNS request out to whatever, and we would be ignorant of where it was going, now in the cloud native world, we know there's a specific DNS implementation that we can count on. It has a feature set that we can build our software around, with guarantees. I think it's a little bit of both, and I think that there is definitely an art to understanding, yes, this is a good idea to do for both applications and infrastructure. I think that's where you get into what it means to be a cloud native engineer. Just as in the traditional legacy infrastructure stack, there are going to be good engineering choices you can make and there are going to be bad ones, and there are many different schools of thought over: do I go minimalist? Do I go all in at once? What does that mean? I think we're seeing a lot of folks try a lot of different patterns here. I think there are pros and cons though. [0:22:03.9] CC: Do you want to talk about those pros and cons? Do you see patterns that are more successful for some kinds of companies versus others? [0:22:11.1] KN: I mean, going back to the greenfield thing that we were talking about earlier, I think if you are lucky enough to build out a greenfield application, you're able to bake in greenfield infrastructure management as well. That's where you get these really interesting hybrid applications, just like Kubernetes, that span both infrastructure and application.
If we were to go into Kubernetes and say, "I want to define a service of type load balancer," it's actually going to go and create a load balancer for you, and actually mutate that underlying infrastructure. The only way we were able to get that power and that paradigm is because on day one, we said we're going to do that as software engineers. Take the infrastructure that was hidden behind the firewall, or hidden behind the load balancer, in the past: the software would have no way to reason about it. It was blind. Greenfield really is going to make or break your ability to even take on the infrastructure layers. [0:23:04.3] NL: I think that's a good distinction to make, because something that I've been seeing in the field a lot is that users will follow cloud native practices, but they'll use a tool to do the cloud native part for them, right? They'll use something along the lines of HashiCorp's Terraform to create the VMs and the load balancers for them. Something I think people forget is that the applications themselves can ask for these resources as well. Terraform is just using an API, and your code can use that same API, in fact. I think that's an important distinction. It forces the developer to think a little bit like a sysadmin sometimes. I think that's a good melding of dev and operations into this new word. Regrettably, that word doesn't exist right now. [0:23:51.2] KN: That word can be cloud native. [0:23:53.3] DC: Cloud native to me breaks down into a different set of topics as well. I remember seeing a talk by Brandon Philips a few years ago. In his talk, he was describing – he had some numbers up on the screen and he was talking about the fact that we were going to quickly become overwhelmed by the desire to continue to develop and put out more applications for our users. His point was that every day, there's another 10,000 new users of the Internet, new consumers that are showing up on the Internet, right? Globally, I think it's something to the tune of about 350,000 people like the people in this room, right? People who understand infrastructure, people who understand how to interact with applications, or how to build them, those sorts of things. There really aren't a lot of people who are in that space today, right? We're surrounded by them all the time, but globally there really just aren't that many. His point is that if we don't radically change the way that we think about the development, the deployment and the management of all of these applications that we're looking at today, we're going to quickly be overrun, right? There aren't going to be enough people on the planet to solve that problem without thinking about the problem in a fundamentally different way. For me, that's where the cloud native piece comes in. With that comes a set of primitives, right? You need some way to automate, or to write software that will manage other software. You need the ability to manage the lifecycle of that software in a resilient way. There are lots of platforms out there that have thought about this problem, right? There are things like Mesos, there are things like Kubernetes. There are a number of different shots on goal here, lots of things that have really tried to think about that problem in a fundamentally different way.
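For anyone who wants to see Kris's service-of-type-LoadBalancer example in practice, here is a minimal sketch, assuming kubectl is pointed at a cluster whose cloud provider supports LoadBalancer services; the service name, selector, and ports are illustrative:

kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  type: LoadBalancer
  selector:
    app: web
  ports:
  - port: 80
    targetPort: 8080
EOF

# The cloud provider goes off and provisions a real load balancer;
# its address shows up under EXTERNAL-IP once it is ready:
kubectl get service web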
I think of those primitives as being able to actually manage the lifecycle of software, being able to think about packaging that software in such a way that it can be truly portable, the idea that you have some API abstraction that brings, again, that portability, such that you can make use of resources that may not be hosted on your own personal infrastructure, but also in the cloud. How do we actually make that API contract so complete that you can just take that application anywhere? These are all part of that cloud native definition, in my opinion. [0:26:08.2] KN: This is so fascinating, because the human race totally already learned this lesson with the Linux kernel in the 90s, right? We had all these hardware manufacturers coming out and building all these different hardware components with different interfaces. Somebody said, "Hey, you know what? There's a lot of noise going on here. We should standardize these and build a contract." That contract then implemented control loops, just like in Kubernetes and Mesos, and poof, we have the Linux kernel now. We're just the distributed Linux kernel, version 2.0. The human race is here repeating itself all over again. [0:26:41.7] NL: Yeah. It seems like the blast radius of Linux kernel 2.0 is significantly higher than the Linux kernel itself. That made it sound like I was pooh-poohing what you're saying. It's more like, we're learning the same lesson, but at a grander scale now. [0:27:00.5] KN: Yeah. I think that's a really elegant way of putting it. [0:27:03.6] DC: You do raise a good point. If you are embarking on a cloud native infrastructure, remember that little changes are big changes, right? Because you're thinking about managing the lifecycle of a thousand applications now, right? If you're going full-on cloud native, you're thinking about operating at scale; it's a byproduct of that. Little changes that you might be able to make to your laptop are now big changes that are going to affect a fleet of a thousand machines, right? [0:27:30.0] KN: We see this in Kubernetes all the time, where a new version of Kubernetes comes out and something totally unexpected happens when it is run at scale. Maybe it worked on 10 nodes, but when we need to fire up a thousand nodes, what happens then? [0:27:42.0] NL: Yeah, absolutely. That actually brings up something that, to me, defines cloud native as well. A lot of my definition of cloud native follows suit with Kris Nova's book, or Kris Nova, because your book was what introduced me to the phrase cloud native. It makes sense that your opinion informs my opinion, but something that I think we were just starting to talk about a little bit is also the concept of stability. Cloud native applications and infrastructure mean coding with instability in mind. It's not being guaranteed that your VM will live forever, because it's on somebody else's hardware, right? Their hardware could go down, and so what do you do? Your application has to move over really quickly, and the guarantees of its API and its endpoints all have to be the same no matter what. All of these things have to exist for your application to live in the cloud. That's something that I find to be very fascinating and that really excites me: not trying to make a barge, but rather trying to make a schooner when you're making an app. Something that, instead of taking over the waves, can be buffeted by the waves and still continue. [0:28:55.6] KN: Yeah.
It's a little more reactive. I think we see this in Kubernetes a lot. When I interviewed Joe Beda a couple years ago for the book, to get a quote from him, he said this magic phrase that has stuck with me over the past few years, which is "goal-seeking behavior." If you look at Kubernetes objects, they all use this concept in Go called embedding. Every Kubernetes object has a status and a spec. All it is: the status is what's actually going on, versus the spec, which is what I told it, what I want to be going on. Then, just like you said with your analogy, all we're doing is trying to be reactive to that and build to that. [0:29:31.1] JR: That's something I wonder if people don't think about a lot. They think about the spec, but not the status part. I think the status part is as important, or maybe more important, than the spec. [0:29:41.3] KN: It totally is. Because I mean, if you have one potentiality for status, your control loop is going to be relatively trivial. As you start understanding more of the problems that you could see, your code starts to mature and harden; those statuses get more complex, you get more edge cases, and your code matures and your code hardens. Then we can take that globally into these larger cloud native patterns. It's really cool. [0:30:06.6] NL: Yeah. Carlisia, you're a developer who's now just getting into the cloud native ecosystem. What are your thoughts on developing with cloud native practices in mind? [0:30:17.7] CC: I'm not sure I can answer that. When I started developing for Kubernetes, I was like, "What is a pod?" What comes first? How does this all fit together? I joined the project [inaudible 00:30:24]. I don't have to think about that. It's basically moving the project along. I don't have to think about what I have to do differently from the way I did things before. [0:30:45.1] DC: One thing that I think you probably ran into in working with the application is the management of state, and how that relates to where you actually end up coupling that state. Before, in development, you might just assume that there is a database somewhere that you would have to interact with. That database is a way of actually pushing that state off of the code that you're actually going to work with. In this way, you might think of being able to write multiple consumers of state, or multiple things that are going to mutate state, all sharing that same database. This is one of the patterns that comes up all the time when we start talking about cloud native architectures, because we have to be very careful about how we manage that state; mainly because one of the other big benefits is the ability to horizontally scale things that are going to mutate, or consume, state. [0:31:37.5] CC: My brain is in its infancy as it relates to Kubernetes. All that I see is APIs all the way down. It's just APIs all the way down. For me as a developer, it's not very different, not very much more complex than developing against the database that sits behind it. Ask me again a year from now and I will have a more interesting answer. [0:32:08.7] KN: This is so fascinating, right? I remember a couple years ago when Kubernetes was first coming out, listening to some of the original "Elders of Kubernetes," and even some of the stuff that we were working on at the time. One of the things that they said was: we hope one day somebody doesn't have to care about what's past these APIs, and gets to look at Kubernetes as APIs only.
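The spec-versus-status split Kris describes is easy to see on any Kubernetes object. A quick sketch, assuming kubectl and an existing deployment named web (the name and label are illustrative):

# spec is what I asked for; status is what is actually going on:
kubectl get deployment web -o jsonpath='{.spec.replicas}'
kubectl get deployment web -o jsonpath='{.status.readyReplicas}'

# The controller's goal-seeking loop converges status toward spec:
# delete a pod and readyReplicas dips, then recovers on its own.
kubectl delete pod -l app=web --wait=false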
Then to hear that come from you authentically, it's like, "Hey, that's our success statement there. We nailed it." It's really cool. [0:32:37.9] CC: Yeah. I don't understand their patterns, though, and I probably should be more cognizant of what these patterns are, even if it's just to articulate them. To me, my day-to-day challenge is understanding the API, understanding what library call I make to make this happen – which is just programming 101, almost. Not different from any other regular project. [0:33:10.1] JR: Yeah. That is something that's nice about programming with Kubernetes in mind, because a lot of times you can use the source code as documentation. I hate to say that, particularly as a non-developer. I'm a sysadmin first getting into development, and documentation is key in my mind. There have been more than a few times where I'm like, "How do I do this?" You can look in the source code for pretty much any application that you're using that's in Kubernetes, or around the Kubernetes ecosystem. The API for that application is there and it'll tell you what you need to do, right? It's like, "Oh, this is how you format your config file. Got it." [0:33:47.7] CC: At the same time, I don't want to minimize that knowing what the patterns are is very useful. I haven't had to do any design for Velero, for our project. Maybe if I had, I would have been forced to look into that. I'm still getting to know the codebase and developing features, but no major design that I had to lead, at least. I think with time, I will recognize those patterns and it will make it easier for me to understand what is happening. What I was saying is that not understanding the patterns that are behind the design of those APIs doesn't preclude me at all from coding against them. [0:34:30.0] KN: I feel this is the heart of cloud native. I think we totally nailed it. The heart of cloud native is in the APIs and your ability to interact with the APIs. That's what makes it programmable, and that's what gives you the interface for you and your software to interact with it. [0:34:45.1] DC: Yeah, I agree with that. API first. On the topic of cloud native, what about the Cloud Native Computing Foundation? What are our thoughts on the CNCF, and what is the CNCF? Josh, you have any thoughts on that? [0:35:00.5] JR: Yeah. I haven't really been as close to the CNCF as I probably should be, to be honest with you. One of the great things that the CNCF has put together are programs around getting projects into this, I don't know if you would call it, vendor-neutral type of program. Maybe somebody can correct me on that. Effectively, there are a lot of different categories, like networking and storage and runtimes for containers and things of that nature. There's a really cool landscape that can show off a lot of these different technologies. A lot of the categories, I'm guessing, we'll be talking about on this podcast too, right? Things like, what does it mean to do cloud native networking, and so on and so forth? That's my perspective on the CNCF. Of course, they put on KubeCon, which is the most important thing to me. I'm sure someone else on this call can talk more deeply, at an organizational level, about what they do. [0:35:50.5] KN: I'm happy to jump in here. I've been working with them for, I think, three years now. I think first, it's important to know that they are a subsidiary of the Linux Foundation.
The Linux Foundation is the original open source nonprofit here, and the CNCF is one of many that sit underneath the broader Linux Foundation umbrella. I think the whole point of the CNCF is to be this neutral party that can help us as we start to grow and mature the ecosystem. Obviously, money is going to be involved here. Obviously, companies are going to be looking out for their best interests. It makes sense to have somebody managing the software who is outside, or external to, these revenue-driven companies. That's where I think the CNCF comes into play, and I think that's what its main responsibility is. What happens when somebody from company A and somebody from company B disagree about the direction the software should go? The CNCF can come in and say, "Hey, you know what? Let's find a happy medium in here and let's find a solution that works for both folks, and let's try to do this the best we can." I think a lot of this came from lessons we learned the hard way with Linux. In a weird way, we are in version 2.0, but we were able to take advantage of some of the prior art here. [0:37:05.4] NL: Do you have any examples of a time when the CNCF jumped in and mediated between two companies? [0:37:11.6] KN: Yeah. I think the steering committee, the Kubernetes steering committee, is a great example of this. It's a relatively new thing. It hasn't been around for a very long time. You look at the history of Kubernetes, and we used to have this incubation process that has since been retired. We've tried a lot of solutions, and the CNCF has been pretty instrumental in guiding the shape of how we're going to manage and solve governance for such a monolithic project. As Kubernetes grows, the problem space grows and more people get involved. We're having to come up with new ways of managing that. That's not necessarily a concrete example of two specific companies, but it's more that as people get involved, the things that used to work for us in the past are no longer working. The CNCF is able to recognize that and guide us out of it. [0:37:57.2] DC: Cool. That's a very good perspective on the CNCF that I didn't have before. Because, like Josh, my perspective on the CNCF was: well, they put on that really cool party three times a year. [0:38:07.8] KN: I mean, they definitely are great at throwing parties. [0:38:12.6] NL: They are that. [0:38:14.1] CC: My perspective on the CNCF is from participating in the Kubernetes meetup here in San Diego. I'm trying to revive our meetup, which is really hard to do, but that's a different topic. I know that they try to make it easier for people to find meetups, because on meetup.com they have an organization. I don't know what the proper name is, but if you go there and you put in your zip code, you'll find any meetup that's associated with them. My meetup here in San Diego is associated, so it can be easily found. They try to give a little bit of money for swag, and we also get ads for the meetup. They offer help with finding speakers, and they also have a speaker catalog on their website. They try to help in those ways, which I think is very helpful, very valuable. [0:39:14.9] DC: Yeah, I agree. I know about the CNCF mostly just from interacting with folks who are working on its behalf, and meeting a bunch of the people who are working on the Kubernetes project on behalf of the CNCF, folks like Ihor and people like that, who constantly amaze me with the amount of work that they do on behalf of the CNCF.
I think it's been really good seeing what it means to provide governance over a project, and I think that's really highlighted by the way that Kubernetes itself is managed. I think a lot of us on the call have probably worked with OpenStack and remember some of the crazy battles that went on between vendors around particular components in that stack. I've yet to actually really see that level of noise creep into the Kubernetes situation. I put that squarely on the CNCF: managing governance, and also running the community, just making it accessible enough that people can plug into it without actually having to get into a battle about taking ownership of CNI, for example. Nobody should own CNI. That should be its own project under its own governance. How you satisfy the needs for something like container networking should be a project that you develop as a company, and you can make the very best one you can make and attract as many customers to it as you want. Fundamentally, your interface to that major project should be something that is abstracted in such a way that it isn't owned by any one company. There should be a contract and an API, that sort of thing. [0:40:58.1] KN: Yeah. I think the best analogy I ever heard was, "We're just building USB plugs." [0:41:02.8] DC: That's actually really great. [0:41:05.7] JR: To that point, Duffie, I think what's interesting is more and more companies are looking to the CNCF to determine what they're going to place their bets on from a technology perspective, right? Because they've been so burned historically by some project owned by one vendor, where they don't really know where it's going to end up, and so on and so forth. It's really become a very serious thing when people consider the technologies they're going to bet their business on. [0:41:32.0] DC: Yeah. When a project is absorbed into the CNCF, or donated to the CNCF, I guess – there are a number of projects that this has happened to. Obviously, if you see that eye chart that is the CNCF landscape, there are just tons of things happening inside of there. It's a really interesting process. For my part, I remember recently seeing Sysdig Falco show up in that list, and seeing Sysdig donate Falco to the CNCF was probably one of the first times that I've really tried to watch what happens when that happens. I think some of the neat stuff here is that now this is an open source project under the governance of the CNCF. It feels to me like a more approachable project, right? I don't feel I have to deal with Sysdig directly to interact with Falco, or to contribute to it. It opens that ecosystem up around the idea, or the genesis of the idea, that they built around Falco, which I think is really powerful. What do you all think of that? [0:42:29.8] KN: To look at it from a different perspective, that's one example of the CNCF helping a project liberate itself. There are plenty of other examples out there where the CNCF is an opt-in feature that is only there if we need it. Take Cluster API, which I'm sure we're going to talk about in a later episode. Just as a quick overview, it's a lot of different vendors implementing the same API and making that composable and modular. Nowhere along the way in the history of that project has the CNCF had to come in and step in. We've been able to operate independently of that.
I think because the CNCF is even there, we all are under this working agreement of: we're going to take everybody's concerns into consideration, we're going to take everybody's use case into consideration, and work together as an ecosystem. It's just having that in place; whether you use it or not is a different story. [0:43:23.4] CC: Do you all know any projects under the CNCF? [0:43:26.1] KN: I have one. [0:43:27.7] JR: Well, I've heard of this one. It's called Kubernetes. [0:43:30.1] CC: Is it called Kubernetes or Kubernetes? [0:43:32.8] JR: It's called Kubernetes. [0:43:36.2] CC: Wow. That's not what Duffie thinks. [0:43:38.3] DC: I don't say it that way. No, it's been pretty fascinating seeing just the breadth of projects that are under there. In fact, I was just recently noticing that OpenEBS is up for joining the CNCF. It's fascinating that the things being generated through the CNCF, and going through that lifecycle as a project, sometimes overlap with one another. It seems it's a delicate balance that the CNCF has to strike to keep from playing favorites, because part of the charter of the CNCF is to promote the project, right? I'm always curious and fascinated to see how this plays out as we see projects that are normally competitive with one another under the auspices of the same organization, like the CNCF. How do they play this in such a way that they remain neutral? It seems like it would take a lot of intention. [0:44:39.9] KN: Yeah. Well, there's a difference between just being a CNCF project and being an official project, or a graduated project. There are different tiers. For instance, Kubicorn, a tool that I wrote: we just adopted, I think, the CNCF code of conduct, and there was another file I had to include in the repo, and poof, we're magically CNCF now. It's easy to get on board. Once you're on board, there are legal implications that come with that. There totally is this tiered ladder structure, which I'm not even super familiar with, for how officially CNCF you can be as your project grows and matures. [0:45:14.7] NL: What are some of the things in the code of conduct that you have to follow to be part of the CNCF? [0:45:20.8] KN: There's a repo on it. I can maybe find it and add it to the notes after this, but there's this whole tutorial that you can go through, and it tells you everything you need to add, what the expectations are, and what the implications are for everything. [0:45:33.5] NL: Awesome. [0:45:34.1] CC: Well, Velero is a CNCF project. We follow the, what is it, the Covenant? [0:45:41.2] KN: Yeah, I think that's what it is. [0:45:43.0] CC: Yes. Which is the same one that Kubernetes follows. I am not sure if there are others that can be adopted, but this is definitely one. [0:45:53.9] NL: Yeah. According to Aaron Crickenberger, who was the release lead for Kubernetes 1.14, the CNCF code of conduct can be summarized as "don't be a jerk." [0:46:06.6] KN: Yeah. I mean, there's more to it than that, but – [0:46:10.7] NL: That was him. [0:46:12.0] KN: Yeah. This is something that I remember seeing in open source my entire career: open source comes with this implication that you need to be well-rounded and polite, and listen, and be able to take others' thoughts and concerns into consideration. I think we are just getting used to working like that as an engineering industry. [0:46:32.6] NL: Agreed. Yeah. Which is a great point. It's something that I hadn't really thought of.
The idea of development back in the day, before there was such a thing as the CNCF or cloud native: it seemed that things were combative, or people were just trying to push their agenda as much as possible, bully their way through. That doesn't seem to happen as much anymore. Do you guys have any thoughts on that? [0:46:58.9] DC: I think what you're highlighting is more the open source piece than the cloud native piece. Open source, I think, has been described a few times as a force multiplier for software development and software adoption, and I think these things are very true. If you look at a lot of the big successful closed source projects, the way that people in this room, and maybe people listening to this podcast, might perceive them is just fundamentally different from some open source project. Mainly because it feels like it's more of a community-driven thing, and it also feels like you're not beholden to a set of developers that you don't know, who are not interested in what's best for you or your organization as you try to achieve whatever you set out to do. With open source, you can be a part of the voice of that project, right? You can jump in and say, "You know, it would really be great if this thing had this feature, or I really like how you did this thing." It really feels a lot more interactive and inclusive. [0:48:03.6] KN: I think that is a natural segue to this idea of: we build everything behind the scenes, and then, hey, it's this new open source project where everything is already done. I don't really think that's open source. We see some of these open source projects out there where, if you go look at the git commit history, it's all from the same company, or the same organization. To me, that's saying that while the source code might technically be open source, the actual act of engineering and architecting the software is not done as a group with multiple parties bought into it. [0:48:37.5] NL: Yeah, that's a great point. [0:48:39.5] DC: Yeah. One of the things I really appreciate about Heptio, actually, is that for all of the projects that we developed there, the developer chat was kept in some neutral space, like the Kubernetes Slack, which I thought was really powerful. Because it means that not only is it open source and you can contribute code to a project, but if you want to talk to the people who are being paid to develop that project, you can just go to the channel and talk to them, right? It's more than open source. It's open community. I thought that was really great. [0:49:08.1] KN: Yeah. That's a really great way of putting it. [0:49:10.1] CC: With that said though, I hate to be a party pooper, but I think we need to say goodbye. [0:49:16.9] KN: Yeah. I think we should wrap it up. [0:49:18.5] JR: Yeah. [0:49:19.0] CC: I would like to re-emphasize that you can go to the issues list and add requests for what you want us to talk about. [0:49:29.1] DC: We should also probably link our HackMD from there, so that if you want to comment on something that we talked about during this episode, feel free to leave comments in it, and we'll try to revisit those comments maybe in our next episode. [0:49:38.9] CC: Exactly. That's a good point. We will drop a link to the HackMD page on the corresponding issue. There is going to be an issue for each episode, so just look for that. [0:49:51.8] KN: Awesome.
Well, thanks for joining, everyone. [0:49:54.1] NL: All right. Thank you. [0:49:54.6] CC: Thank you. I'm really glad to be here. [0:49:56.7] DC: Hope you enjoyed the episode, and I look forward to a bunch more. [END OF EPISODE] [0:50:00.3] ANNOUNCER: Thank you for listening to The Podlets Cloud Native Podcast. Find us on Twitter at https://twitter.com/ThePodlets and on the web at https://thepodlets.io, where you'll find transcripts and show notes. We'll be back next week. Stay tuned by subscribing. [END] See omnystudio.com/listener for privacy information.
Support Mobycast: https://glow.fm/mobycast

In this episode, we cover the following topics:

Container runtimes
- Responsible for setting up namespaces and cgroups for containers, and running commands inside those namespaces and cgroups.
- Types of runtimes:
  - Low-level: handles tasks related to containers, such as creating a container and attaching a process to an existing container.
  - High-level: handles "high level" tasks such as image creation and image management, and defers container tasks to a low-level runtime.

Container standards
- Open Container Initiative (OCI): established in June 2015 by Docker and others. Contains two specifications: the Runtime Specification (runtime-spec) and the Image Specification (image-spec). runc is an implementation of the OCI runtime-spec.

runc
- Low-level container runtime; a CLI tool for spawning and running containers according to the OCI specification.
- Originally developed as part of Docker, later lifted out as a separate open source project.

containerd
- High-level container runtime.
- Provides an abstraction layer for the syscalls and OS-specific functionality required to run a container. Platforms can build on top of the abstraction layer without having to drop down to the kernel; it is much nicer to work with abstractions (Container, Task, Snapshot) than to manage calls to clone() and mount().
- Available as a daemon for Linux and Windows. Designed to be embedded into a larger system (like Docker), and used by container platforms like Docker and Kubernetes. Under the hood, containerd uses runc.
- Only deals with core container management: push/pull functionality, image management, container lifecycle APIs (create, execute and manage containers and their tasks), and snapshot management.
- Out of scope: networking. It is very complicated, and much more platform-specific than abstracting Linux calls. Instead, containerd exposes events so consumers can subscribe to the events they care about, e.g. hook events to add interfaces to a network namespace.
- Exposes functionality via a gRPC API, listening on a Linux socket.

Runtime platforms
- Docker: uses containerd, plus a shim. The shim allows for daemon-less containers, e.g. you can upgrade the Docker daemon without restarting containers.
- rkt: an application container engine, part of CoreOS (Red Hat). rkt has no centralized "init" daemon; instead it launches containers directly from client commands. It was part of the CNCF, but was archived in August 2019. Reasons cited by the CNCF: "end user adoption has severely declined" and "project activity and the number of contributors has also steadily declined over time." containerd and CRI-O are now the CNCF container runtime projects.
- CRI-O: an implementation of the Kubernetes CRI (Container Runtime Interface) to enable using OCI runtimes. A lightweight alternative to using Docker, Moby or rkt as the runtime for k8s; it allows k8s to use any OCI-compliant runtime as the container runtime for running pods. Supports runc and Kata Containers.

Revisit the pseudo code example of creating a container. Steps:
1. Create a root filesystem for the container (spin up busybox in a Docker container, then export its filesystem).
2. Run a "launcher" process that sets up the "child" namespaces; the launcher process forks a new child process (now under the new namespaces).
3. The child process then forks a new process for the container: chroot (to our root filesystem), mount any other filesystems, and set cgroups (e.g. apply CPU constraints).

Links: Open Container Initiative - OCI | runc | containerd | What is containerd? | CoreOS rkt | rkt vs Docker

End Song: Miquel Salla - All is Coming Back (Focalist Remix)

For a full transcription of this episode, please visit the episode webpage. We'd love to hear from you!
You can reach us at:
Web: https://mobycast.fm
Voicemail: 844-818-0993
Email: ask@mobycast.fm
Twitter: https://twitter.com/hashtag/mobycast
Support Mobycast: https://glow.fm/mobycast
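As a footnote to the show notes above, the pseudo code steps for creating a container map fairly directly onto stock tools. A rough shell sketch, assuming a Linux host with Docker and util-linux installed, run as root; names and paths are illustrative:

# 1. Create a root filesystem for the container by exporting busybox:
mkdir rootfs
docker export "$(docker create busybox)" | tar -C rootfs -xf -
mkdir -p rootfs/proc

# 2. The "launcher": set up new namespaces, fork a child inside them,
#    and chroot it into the root filesystem:
unshare --pid --uts --ipc --mount --fork \
  chroot rootfs /bin/sh -c 'mount -t proc proc /proc; exec /bin/sh'

# 3. Set cgroups, e.g. apply a CPU constraint (cgroup v1 layout shown;
#    writing the shell's PID into the tasks file is what attaches it):
mkdir -p /sys/fs/cgroup/cpu/demo
echo 50000 > /sys/fs/cgroup/cpu/demo/cpu.cfs_quota_us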
The Byte - A Byte-sized podcast about Containers, Cloud, and Tech
Website - https://github.com/coreos/clair
SaaS vendors mentioned in this episode: Aqua Security, NeuVector, Twistlock
Episode Transcription: Welcome back to The Byte. In this episode, we're going to talk about Clair, a vulnerability static analysis tool for containers. Before we get started, I want to see a raise of hands: who runs containers in production? Now, keep your hand up if you scan your images that are running in production. Now, this is a question I ask in workshops to various banks and big customers that you would think would be doing this, and it's shocking. If we were all sitting in one room, I would imagine only 20% of us would still have our hands up saying we run production containers, and we scan these containers... scan the container images. Now, Clair is actually a brilliant tool. It was developed by CoreOS, which was acquired by Red Hat, which in turn was acquired by IBM, but it's still going. I mean, it's still active, which is brilliant, because it's an awesome tool. Now, typically in the enterprise world, and the small-medium enterprise, I mean, different segments, you have different options, right? Typically, if you are going to do container security, you're going to go with some sort of SaaS solution from one of the big vendors, and we're talking about Aqua Security, NeuVector, Twistlock, just to name a couple of them. But Clair is actually the open-source option, and obviously, it is open source; you're not getting any SLAs or anything like that, but it does a great job. What it does is static analysis and vulnerability scanning of your container images. How that works: it regularly downloads vulnerability metadata from various sources, stores it in a database, and then compares the metadata against the images that you are running. It then provides you a notification, or lets you know, "Hey, this particular image has vulnerabilities, and I'll notify you, and I'll keep notifying you until you..." Like a siren notification. Additionally, we can also integrate Clair into your CI/CD pipeline. As we build container images and push them to a registry, Clair fires up, scans the image, and then provides you a report on any vulnerabilities inside the image. It integrates into your CI/CD pipeline, it integrates into various container registries, and it has configurable notifications, so we can push notifications to Slack, or email, or whatever notification system you want to use. You can go to Alertmanager, for example. It has a lot of different possibilities there. It does integrate quite well with a bunch of different platforms, so if you go into the documentation on the Clair GitHub page and go to Integrations, you can see it obviously integrates into the CoreOS registry. It integrates into all sorts of different projects. You can look through it. As I said, it's an open-source project. If you're not doing container scanning now, I would highly, highly recommend you use Clair, so that at least you have something, right? Because many times people are not doing any scanning, and it's better to do something, so at least you know: hey, do I have a Heartbleed running around in my production systems? Do I have any vulnerabilities that are, like, super red alert? It's good to know at least the baseline of where I'm sitting. I would recommend Clair if you're not running any security system.
If you have the budget, I would definitely go for an enterprise solution: Aqua, NeuVector, Twistlock, just to name a couple of them, but there are a lot of options out there. Security starts sooner rather than later. I mean, the sooner you can integrate this into your CI/CD pipeline, the better off you are. Give it a try, github.com/coreos/clair. It's a great tool. We've used it for a couple of projects. We're quite happy with it. I mean, obviously, you get what you pay for, right? But at least you're getting some sort of security put in place. This is step one. Obviously, there are a lot more best practices you can incorporate into your building of images, as well as the security in your container environment, but at least with Clair, we have some sort of reporting and the ability to actually scan your images. Give it a try. Clair has great documentation. It's being used quite regularly. It's also being updated quite frequently as well. That's all I have for this episode. Have a great day. We'll see you next time.
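For anyone who wants to kick the tires, the smallest useful Clair setup is just two containers. A sketch, assuming Docker on the host; the image tags, ports, and config path are illustrative and worth double-checking against the Clair README:

# Clair stores its vulnerability metadata in Postgres:
docker run -d --name clair-db -e POSTGRES_PASSWORD=clair postgres:9.6

# Run Clair itself with its config file mounted from the host
# (config.yaml must point at the Postgres instance above):
docker run -d --name clair -p 6060-6061:6060-6061 \
  -v "$PWD/clair_config":/config \
  quay.io/coreos/clair:latest -config=/config/config.yaml

From there, a CI job or a client tool can point at the Clair API on port 6060 to submit image layers for scanning.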
OpenBSD 6.2 is here, style arguments, a second round of viewer interview questions, how to set CPU affinity for FreeBSD jails, containers on FreeNAS & more! Headlines OpenBSD 6.2 Released (https://www.openbsd.org/62.html) OpenBSD continues their six-month release cadence with the release of 6.2, the 44th release On a disappointing note, the song for 6.2 will not be released until December Highlights: Improved hardware support on modern platforms including ARM64/ARMv7 and octeon, while amd64 users will appreciate additional support for the Intel Kaby Lake video cards. Network stack improvements include extensive SMPization improvements and a new FQ-CoDel queueing discipline, as well as enhanced WiFi support in general and improvements to the iwn(4), iwm(4) and athn(4) drivers. Improvements in vmm(4)/vmd include VM migration, as well as various compatibility and performance improvements. Security enhancements include a new freezero(3) function, further pledge(2)ing of base system programs and conversion of several daemons to the fork+exec model. Trapsleds, KARL, and random linking for libcrypto and ld.so dramatically increase security by making it harder to find helpful ROP gadgets, and by creating a unique order of objects per-boot. A unique kernel is now created by the installer to boot from after install/upgrade. The base system compiler on the amd64 and i386 platforms has switched to clang(1). New versions of OpenSSH, OpenSMTPd, LibreSSL and mandoc are also included. The kernel no longer handles IPv6 Stateless Address Autoconfiguration (RFC 4862), allowing cleanup and simplification of the IPv6 network stack. Improved IPv6 checks for IPsec policies and made them consistent with IPv4. Enabled the use of per-CPU caches in the network packet allocators. Improved UTF-8 line editing support for ksh(1) Emacs and Vi input mode. Breaking change for nvme(4) users with GPT: If you are booting from an nvme(4) drive with a GPT disk layout, you are affected by an off-by-one in the driver, with the consequence that the sector count in your partition table may be incorrect. The only way to fix this is to re-initialize the partition table. Back up your data to another disk before you upgrade. In the new bsd.rd, drop to a shell and re-initialize the GPT: fdisk -iy -g -b 960 sdN Why we argue: style (https://www.sandimetz.com/blog/2017/6/1/why-we-argue-style) I've been thinking about why we argue about code, and how we might transform vehement differences of opinion into active forces for good. My thoughts spring from a very specific context. Ten or twelve times a year I go to an arbitrary business and spend three or more days teaching a course in object-oriented design. I'm an outsider, but for a few days these businesses let me in on their secrets. Here's what I've noticed. In some places, folks are generally happy. Programmers get along. They feel as if they are all "in this together." At businesses like this I spend most of my time actually teaching object-oriented design. Other places, folks are surprisingly miserable. There's a lot of discord, and the programmers have devolved into competing "camps." In these situations the course rapidly morphs away from OO design and into wide-ranging group discussions about how to resolve deeply embedded conflicts. Tolstoy famously said that "Happy families are all alike; every unhappy family is unhappy in its own way." This is known as the Anna Karenina Principle, and describes situations in which success depends on meeting all of a number of criteria.
The only way to be happy is to succeed at every one of them. Unhappiness, unfortunately, can be achieved by any combination of failure. Thus, all happy businesses are similar, but unhappy ones appear unique in their misery. Today I'm interested in choices of syntax, i.e. whether or not your shop has agreed upon and follows a style guide. If you're surprised that I'm starting with this apparently mundane issue, consider yourself lucky in your choice of workplace. If you're shaking your head in rueful agreement about the importance of this topic, I feel your pain. I firmly believe that all of the code that I personally have to examine should come to me in a consistent format. Code is read many more times than it is written, which means that the ultimate cost of code is in its reading. It therefore follows that code should be optimized for readability, which in turn dictates that an application's code should all follow the same style. This is why FreeBSD, and most other open source projects, have a preferred style. Some projects are less specific and less strict about it. Most programmers agree with the prior paragraph, but here's where things begin to break down. As far as I'm concerned, my personal formatting style is clearly the best. However, I'm quite sure that you feel the same. It's easy for a group of programmers to agree that all code should follow a common style, but surprisingly difficult to get them to agree on just what that common style should be. Avoid appointing a human "style cop", which just forces someone to be an increasingly ill-tempered nag. Instead, supply programmers with the information they need to remedy their own transgressions. By the time a pull request is submitted, mis-stylings should long since have been put right. Pull request conversations ought to be about what code does rather than how code looks. What about old code? Ignore it. You don't have to re-style all existing code, just do better from this day forward. Defer updating old code until you touch it for other reasons. Following this strategy means that the code you most often work on will gradually take on a common style. It also means that some of your existing code might never get updated, but if you never look at it, who cares? If you choose to re-style code that you otherwise have no need to touch, you're declaring that changing the look of this old code has more value to your business than delivering the next item on the backlog. The opportunity cost of making a purely aesthetic change includes losing the benefit of what you could have done instead. The rule of thumb is: don't bother updating the styling of stable, existing code unless not doing so costs you money. Most open source projects also avoid reformatting code just to change the style, because of the merge conflicts this will cause for downstream consumers. If you disagree with the style guide upon which your team agrees, you have only two honorable options: First, you can obey the guide despite your aversion. As with me in the Elm story above, this act is likely to change your thinking so that over time you come to prefer the new style. It's possible that if you follow the guide you'll begin to like it. Alternatively, you can decide you will not obey the style guide. Making this decision demands that you leave your current project and find some other project whose guide matches your preferred style. Go there and follow that one. Notice that both of these choices have you following a guide. This part is not optional.
The moral of this story? It's more important for all code to be formatted the same than it is for any one of us to get our own way. Commit to agreeing upon and following a style guide. And if you find that your team cannot come to an agreement, step away from this problem and start a discussion about power. There have been many arguments about style, and it can often be one of the first complaints of people new to any open source project. This article covers it fairly well from both sides: a) you should follow the style guide of the project you are contributing to, and b) the project should review your actual code, then comment on the style after, providing gentle guidance towards the right style and avoiding being "style cops" *** Interview - The BSDNow Crew, Part II News Roundup Building FreeBSD for the Onion Omega 2 (https://github.com/sysadminmike/freebsd-onion-omega2-build) I got my Onion Omega 2 devices in the mail quite a while ago, but I had never gotten around to trying to install FreeBSD on them. They are a different MIPS SoC than the Onion Omega 1, so it would not work out of the box at the time. Now, the SoC is supported! This guide provides the steps to build an image for the Omega 2 using the freebsd-wifi-build infrastructure. First, some config files are modified to make the image small enough for the Omega 2's flash chip. The DTS (Device Tree Source) files are not yet included in FreeBSD, so they are fetched from GitHub. Then the build for the ralink SoC is run, with the provided DTS file and the MT7628_FDT kernel config. Once the build is complete, you'll have a tftp image file. Then that image is compressed, and bundled into a uboot image. Write the files to a USB stick, and plug it into the Omega's dock. Turn it on while holding the reset button with the console open. Press 1 to get into the command line. You will need to reset the USB: usb reset Then load the kernel boot image: fatload usb 0:1 0x80800000 kernel.MT7628_FDT.lzma.uImage And boot it: bootm 0x80800000 At this point FreeBSD should boot. Mount a userland, and you should end up in multi-user mode. Hopefully this will get even easier in the next few weeks, and we'll end up with a more streamlined process to tftp boot the device, then write FreeBSD into the onboard flash so it boots automatically. *** Setting the CPU Affinity on FreeBSD Jails with ezjail (https://www.neelc.org/setting-the-cpu-affinity-on-freebsd-jails-with-ezjail/) While there are more advanced resource controls available for FreeBSD jails, one of the most basic ways to control CPU usage is to limit the subset of CPUs that each jail can use. This can make sure that every jail has access to some dedicated resources, while at the same time not having the ability to entirely dominate the machine. I just got a new home server: a HP ProLiant ML110 G6. Being a FreeBSD person myself, it was natural that I used it on my server instead of Linux. I chose to use ezjail to manage the jails on my ProLiant, with the initial one being a Tor middle node. Despite the fact that where my ML110 is, the upstream is only 35 Mbps (which is pretty good for cable), I did not want to give my Tor jail access to all four cores. Setting the CPU affinity lets you choose a specific CPU core (or a range of cores) you want to use. However, it does not just let you pick the number of CPU cores you want and make FreeBSD choose the cores running your jail. Going forward, I assume that you have already created a jail using ezjail-admin.
I also do not cover limiting a jail to a certain percentage of CPU usage. ezjail-admin config -c [CORE_NUMBER_FIRST]-[CORE_NUMBER_LAST] [JAIL_NAME] or ezjail-admin config -c [CORE_NUMBER_FIRST],[CORE_NUMBER_SECOND],...,[CORE_NUMBER_N] [JAIL_NAME] And hopefully, you should have your ezjail-managed FreeBSD jail limited to the CPU cores you want. While I did not cover a CPU percentage or RAM usage, this can be done with rctl. I'll admit: it doesn't really matter which CPU a jail runs on, but it might matter if you don't want a jail to have access to all the CPU cores available and only want [JAIL_NAME] to use one core. Since it's not really possible to just specify the number of CPU cores with ezjail (or even iocell), a fallback is to use CPU affinity, and that requires you to specify an exact CPU core. I know it's not the best solution (it would be better if we could let the scheduler choose, provided a jail only runs on one core), but it's what works. We use this at work on high-core-count machines. When we have multiple databases colocated on the same machine, we make sure each one has a few cores to itself, while it shares other cores with the rest of the machine. We often reserve a core or two for the base system as well. *** A practical guide to containers on FreeNAS for a depraved psychopath. (https://medium.com/@andoriyu/a-practical-guide-to-containers-on-freenas-for-a-depraved-psychopath-c212203c0394) If you are interested in playing with Docker, this guide sets up a Linux VM running on FreeBSD or FreeNAS under bhyve, then runs Linux Docker containers on top of it. You know that jails are dope and I know that jails are dope, yet no one else knows it. So here we are, stuck with Docker. Two years ago I would be the last person to recommend using Docker, but a whole lot of things have changed in the past years… This tutorial uses iohyve to manage the VMs on FreeBSD or FreeNAS. There are many Linux variants you can choose from — RancherOS and CoreOS are the most popular for Docker-only hosts. We're going to use RancherOS because it's more lightweight out of the box. Navigate to the RancherOS website and grab the link to the latest version. sudo iohyve setup pool=zpool kmod=1 net=em0 sudo iohyve fetch https://releases.rancher.com/os/latest/rancheros.iso sudo iohyve renameiso rancheros.iso rancheros-v1.0.4.iso sudo pkg install grub2-bhyve sudo iohyve create rancher 32G sudo iohyve set rancher loader=grub-bhyve ram=8G cpu=8 con=nmdm0 os=debian sudo iohyve install rancher rancheros-v1.0.4.iso sudo iohyve console rancher Then the tutorial does some basic configuration of RancherOS, and some housekeeping in iohyve to make RancherOS come up unattended at boot. The whole point of this guide is to reduce pain, and using the Docker CLI is still painful. There are a lot of web UIs to control Docker. Most of them include a lot of orchestrating services, so it's just overkill. Portainer is very lightweight and can be run even on a Raspberry Pi. Create a config file as described. After reboot, you will be able to access the web UI on port 9000. Setup is very easy, so I won't go over it. The docker tools for FreeBSD are still being worked on.
*** A practical guide to containers on FreeNAS for a depraved psychopath. (https://medium.com/@andoriyu/a-practical-guide-to-containers-on-freenas-for-a-depraved-psychopath-c212203c0394) If you are interested in playing with Docker, this guide sets up a Linux VM running on FreeBSD or FreeNAS under bhyve, then runs Linux docker containers on top of it. You know that jails are dope and I know that jails are dope, yet no one else knows it. So here we are, stuck with docker. Two years ago I would have been the last person to recommend using docker, but a whole lot of things have changed in the past years… This tutorial uses iohyve to manage the VMs on the FreeBSD or FreeNAS host. There are many Linux variants you can choose from; RancherOS and CoreOS are the most popular for docker-only hosts. We are going to use RancherOS because it's more lightweight out of the box. Navigate to the RancherOS website, grab the link to the latest version, and then:
sudo iohyve setup pool=zpool kmod=1 net=em0
sudo iohyve fetch https://releases.rancher.com/os/latest/rancheros.iso
sudo iohyve renameiso rancheros.iso rancheros-v1.0.4.iso
sudo pkg install grub2-bhyve
sudo iohyve create rancher 32G
sudo iohyve set rancher loader=grub-bhyve ram=8G cpu=8 con=nmdm0 os=debian
sudo iohyve install rancher rancheros-v1.0.4.iso
sudo iohyve console rancher
Then the tutorial does some basic configuration of RancherOS, and some housekeeping in iohyve to make RancherOS come up unattended at boot. The whole point of this guide is to reduce pain, and using the docker CLI is still painful. There are a lot of web UIs to control docker, but most of them include a lot of orchestration services, which is overkill here. Portainer is very lightweight and can be run even on a Raspberry Pi. Create a config file as described, and after a reboot you will be able to access the web UI on port 9000. Setup is very easy, so I won't go over it. The docker tools for FreeBSD are still being worked on. Eventually you will be able to host native FreeBSD docker containers in FreeBSD jails, but we are not quite there yet. In the meantime, you can install sysutils/docker and use it to manage the docker instances running on a remote machine, or in this case, the RancherOS VM running in bhyve. *** Beastie Bits The Ghost of Invention: A Visit to Bell Labs, excerpt from the forthcoming book “Kitten Clone: Inside Alcatel-Lucent” (https://www.wired.com/2014/09/coupland-bell-labs/) OpenBSD Cookbook (set of Ansible playbooks) (https://github.com/ligurio/openbsd-cookbooks) 15 useful sockstat commands to find open ports on FreeBSD (https://www.tecmint.com/sockstat-command-examples-to-find-open-ports-in-freebsd/) A prehistory of Slashdot (https://medium.freecodecamp.org/a-pre-history-of-slashdot-6403341dabae) Using ed, the unix line editor (https://medium.com/@claudio.santos.ribeiro/using-ed-the-unix-line-editor-557ed6466660) *** Feedback/Questions Malcolm - ZFS snapshots (http://dpaste.com/16EB3ZA#wrap) Darryn - Zones (http://dpaste.com/1DGHQJP#wrap) Mohammad - SSH Keys (http://dpaste.com/08G3VTB#wrap) Send questions, comments, show ideas/topics, or stories you want mentioned on the show to feedback@bsdnow.tv (mailto:feedback@bsdnow.tv)
Kubernetes Joe Beda @jbeda | Heptio | eightypercent.net Show Notes: 00:51 - What is Kubernetes? Why does it exist? 07:32 - Kubernetes Cluster; Cluster Autoscaling 11:43 - Application Abstraction 14:44 - Services That Implement Kubernetes 16:08 - Starting Heptio 17:58 - Kubernetes vs Services Like Cloud Foundry and OpenShift 22:39 - Getting Started with Kubernetes 27:37 - Working on the Original Internet Explorer Team Resources: Google Compute Engine Google Container Engine Minikube Kubernetes: Up and Running: Dive into the Future of Infrastructure by Kelsey Hightower, Brendan Burns, and Joe Beda Joe Beda: Kubecon Berlin Keynote: Scaling Kubernetes: How do we grow the Kubernetes user base by 10x? Wordpress with Helm Sock Shop: A Microservices Demo Application Kelsey Hightower Keynote: Kubernetes Federation Joe Beda: Kubernetes 101 AWS Quick Start for Kubernetes by Heptio Open Source Bridge: Enter the coupon code PODCAST to get $50 off a ticket! The conference will be held June 20-23, 2017 at The Eliot Center in downtown Portland, Oregon. Transcript: CHARLES: Hello everybody and welcome to The Frontside Podcast, Episode 70. With me is Elrick Ryan. ELRICK: Hey, what's going on? CHARLES: We're going to get started with our guest here who many of you may have heard of before. You've probably heard of the technology that he created or was a key part of creating, a self-described medium deal. [Laughter] JOE: Thanks for having me on. I really appreciate it. CHARLES: Joe, here at The Frontside most of what we do is UI-related, completely frontend, but obviously the frontend is built on backend technology and we need to be running things that serve our clients. Kubernetes is something that I started hearing about, I don't know, maybe a year ago. All of a sudden it just started popping up in my Twitter feed and I was like, "Hmm, that's a weird word," and then people started talking more and more about it, and it moved from something that was behind me into something that was to the side, and now it's edging into our peripheral vision more and more as more people adopt it and build things on top of it. I'm really excited to have you here on the show to just talk about it. I guess we should start by saying, what is the reason for its existence? What is the unique set of problems that you were encountering, or noticed that everybody was encountering, that caused you to want to create this? JOE: That's a really good set up. I think just by way of context, I spent about 10 years at Google. I learned how to do software on the server at Google. Before that, I was at Microsoft working on Internet Explorer and Windows Presentation Foundation, which maybe some of your listeners have actually had to go ahead and use. Because I learned how to write software for the server at Google, my experience of what it takes to build and deploy software was really warped by that. It really doesn't match what pretty much anybody else in the industry does, or at least did. As my career progressed, I ended up starting this project called Google Compute Engine, which is Google's virtual machine as a service, analogous to, say, EC2. Then, as that became more and more of a priority for the company, there was this idea that we wanted internal Google developers to have a shared experience with external users. Internally, Google didn't do anything with virtual machines, hardly.
Everything was with containers, and Google had built up some really sophisticated systems to be able to manage containers across very large clusters of computers. For Google developers, the interface to the world of production -- how you actually launch stuff, monitor it and maintain it -- was through this toolset, Borg, and all the fellow travelers that come along with it inside of Google. Nobody really actually managed machines using traditional configuration management tools like Puppet or Chef or anything like that. It's a completely different experience. We built Compute Engine, GCE, and then I had a new boss because of an executive shuffle. He'd been at Google for a while, and he spun up a VM, and his reaction to the thing was like, "Now what?" He was sitting there at the root prompt going, "I don't know what to do now." It turns out that inside of Google that was actually a common thing. It just felt incredibly primitive to have a raw VM that you could SSH into, because there's so much to be done above that to get to something that you're comfortable building a production grade service on top of. The choice, as Google got more and more serious about cloud, was to either have everybody inside of Google start using raw VMs and live the life that everybody outside of Google is living, or try and bring the experience around Borg -- this idea of very dynamic, container-centric, scheduled-cluster thinking -- outside of Google. Borg was entangled enough with the rest of Google's systems that porting it directly and externalizing it directly wasn't super practical. Me and a couple of other folks, Brendan Burns and Craig McLuckie, pitched this crazy idea of starting a new open source project that borrowed a lot of the ideas from Borg but really melded them with a lot of the needs of folks outside of Google, because again, Google is a bit of a special case in so many ways. The core problem that we're solving here is: how do you move the idea of deploying software away from being based on these physical concepts like virtual machines, where the number of problems you have to solve to actually get that thing up and running is pretty great? How do we move that such that you have a higher, more logical set of abstractions that you're dealing with? Instead of worrying about what kernel you're running on, instead of worrying about individual nodes and what happens if a node goes down, you can instead just say, "Make sure this thing is running," and the system will just do its best to make sure that thing is running. Then you can also do interesting things like make sure 10 of these things are running, which at Google scale ends up being important. CHARLES: When you say a thing, you're talking about like a database server or API server or --? JOE: Yeah, any process that you could want to be running, exactly. The abstraction that you think about when you're deploying stuff into the cloud moves from a virtual machine to a process. When I say process, I mean a process plus all the things that it needs, so that ends up being a container or a Docker image or something along those lines. Now, the way that Google does it internally is slightly different from how it's done with Docker, but you can squint at these things and see a lot of parallels there. When Docker first came out, it was really good. I think with Docker and containers, people look for three things out of it.
The first one is that they want a packaged artifact: something that I can create, run on my laptop, run in a data center, and have be mostly the same thing in both places. That's an incredibly useful thing. On your Mac you have a .app, and it's really a directory, but the Finder treats it as a single thing that you can just drag around, and the thing runs. Containers are that for the server: you have this thing that you can just tell the server to run, and you're pretty sure that it's going to run. That's a huge step forward, and I think that's where most folks really see the value with respect to Docker. The next thing folks look at with containerized technology is a level of efficiency, of being able to pack a lot of stuff onto a little bit of hardware. That was the main driver for Google: Google has so many computers that if you improve utilization by 1%, that ends up being real money. Then the last thing is, I think a lot of folks look at this as a security boundary, and I think there are some real nuanced conversations to have around that. The goal is to take that logical infrastructure and make it such that, instead of talking about raw VMs, you're actually talking about containers and processes and how these things relate to each other. Yet you still have the flexibility of a toolbox that you get with an infrastructure-level system, versus something like Heroku or App Engine or these other platforms as a service, which are relatively fixed-function in terms of the architectures you can build. I think the container cluster stuff that you see with things like Kubernetes is a nice middle ground between raw VMs and a very, very opinionated platform-as-a-service type of thing. It ends up being a building block for building more specialized experiences. There's a lot to digest there, so I apologize. CHARLES: Yeah, there's a lot to digest there, but we can jump right into digesting it. You were talking about the different abstractions, where you have your hardware, your virtual machine and the containers that are running on top of that virtual machine, and then you mentioned -- I think I'm all the way up there, but then you said -- a Kubernetes cluster. What is the anatomy of a Kubernetes cluster, what does that entail, and what can you do with it? JOE: When folks talk about Kubernetes, I think there are two different audiences, and it's important to talk about the experience for each audience. There's the audience from the point of view of what it takes to actually run a cluster -- this is the cluster operator audience -- and then there's the audience in terms of what it takes to use a cluster. Assuming that somebody else is running a cluster for me, what does it look like to go ahead and use this thing? This is really different from a lot of dev ops tools, which really mix these things together. We've tried to create a clean split here. I'm going to skip past what it means to launch and run a Kubernetes cluster, because it turns out that over time this is going to be something that you can just have somebody else do for you. It's like running your own MySQL database versus using RDS in Amazon. At some point you're going to be like, "You know what, that's a pain in the butt. I want to make that somebody else's problem." When it comes to using the cluster, pretty much what it comes down to is that you can tell a cluster what to do through an API, and that API is sort of a spiritual cousin to something like the EC2 API.
You can talk to this API -- it's a RESTful API -- and you can say, "Make sure that you have 10 of these container images running," and Kubernetes will make sure that ten of those things are running. If a node goes down, it'll start another one up, and it will maintain that. That's the first piece of the puzzle. It creates a very dynamic environment where you can actually program these things coming and going, scaling up and down. The next piece of the puzzle that really starts to be necessary is that if you have things moving around, you need a way to find them. There are built-in ideas of defining what a service is and then doing service discovery. Service discovery is a fancy name for naming: I have a name for something, and I want to resolve that name to an IP address so that I can talk to it. Traditionally we use DNS. DNS is problematic in super dynamic environments, so a lot of folks, as they build backend systems within the data center, really start moving past DNS to something that's a lot more dynamic and purpose-built for that. But you can think of it in your mind as a fancy, super-fast DNS. CHARLES: The cluster is itself something that's abstract, so I can change its state and configure it and say, "I want 10 instances of Postgres running," or, "I want between five and 15," and it will handle all of that for you. How do you then make it smart, so that you can react to load? For example, all of a sudden this thing is handling more load, so I need to say... What's the word I'm looking for, I need to handle -- JOE: Autoscale? CHARLES: Yeah, autoscale. Are there primitives for that? JOE: Exactly. Kubernetes itself was meant to be a toolbox that you can build on top of. There are some common community-built primitives for doing what's called -- excuse the nomenclature here, because there's a lot of it in Kubernetes, and I can define it -- Horizontal Pod Autoscaling. It's this idea that you have a set of pods and you want to tune the number of replicas of that pod based on load. That's something that's built in. But now maybe your cluster doesn't have enough nodes as you scale up and down, so there's this idea of cluster autoscaling, where I want to add more capacity that I'm actually launching these things into. Fundamentally, Kubernetes is built on top of virtual machines: at the base there's a bunch of virtual or physical hardware that's running, and then the question is how do I schedule stuff into it and pack things into that cluster. There's this idea of scaling the cluster, but then also scaling workloads running on top of the cluster. And if you find that these algorithms or methods for how you want to scale things, when you want to launch things, how you want to hook them up -- if those things don't work for you, the Kubernetes system itself is programmable, so you can build your own algorithms for how you want to launch and control things. It's really built from the get-go to be an extensible system.
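To ground that exchange, here is a hedged sketch of what the declarative loop and the autoscaler look like from the user's side, using the kubectl CLI of that era, where kubectl run created a Deployment; the name web, the nginx image, and the numbers are placeholders, not anything from the episode:
# declare desired state: ten replicas of a container image
kubectl run web --image=nginx --replicas=10
# change the declaration; Kubernetes converges on five running copies
kubectl scale deployment web --replicas=5
# or let the Horizontal Pod Autoscaler decide, between 5 and 15 replicas, based on CPU load
kubectl autoscale deployment web --min=5 --max=15 --cpu-percent=80
The point is that each command edits the desired state; the control loops inside the cluster do the actual starting and stopping.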
CHARLES: One question that keeps coming up as I hear you describing these things: the Kubernetes cluster, then, is not application-oriented, so you could have multiple applications running on a single cluster? JOE: Very much so. CHARLES: How do you then layer your application abstraction on top of this cluster abstraction? JOE: An application is made up of a bunch of running bits, whether it's a database or, as we move towards microservices, not just one set of code but a bunch of sets of code working together, or a bunch of servers working together. There are these ideas: I want to run 10 of these things, I want to run five of those things, I want to run three of these other things, and then I want them to be able to find each other, and I want to take this one thing and expose it out to the internet through a load balancer on Amazon, for example. Kubernetes can help set up all those pieces. It turns out that Kubernetes doesn't have an idea of an application: there is actually no object inside Kubernetes called an application. There is this idea of running services and exposing services, and if you bring a bunch of services together, that ends up being an application. But in a modern world, you actually have services that can play double duty across applications. One of the things that I think is exciting about Kubernetes is that it can grow with you as you move from a single application to something that really becomes a service mesh as your application and your company grow. Imagine that you have some sort of app, and you also have your customer service portal for your internal employees. You can have those both be frontend applications, both running on a Kubernetes cluster, talking to a common backend with a hidden API that you don't expose to customers but that is exposed to both of those frontends, and then that API may talk to a database. Then, as you understand your problems, you can spawn off different microservices that can be managed separately by different teams. Kubernetes becomes a platform where you can start with something relatively simple and then grow with it, having it stretch from a single application, to a microservice-based application, to a larger cluster that can stretch across multiple teams, and there's a bunch of facilities for folks not stepping on each other's toes as they do this stuff. Just to be clear, this is what Kubernetes is at its base. I think one of the powerful things is that there's a whole host of folks building more platform-as-a-service-like abstractions on top of Kubernetes. I'm not going to say it's a trivial thing, but it's a relatively straightforward thing to build a Heroku-like experience on top of Kubernetes. And the great thing is that if you find that some of the opinions baked into that Heroku-like experience don't work for you, you can drop down to a level that's more useful than going all the way down to a raw VM. Right now, if you're running on Heroku and something doesn't work for you, it's like, "Here's a raw VM. Good luck with that." There's a huge cliff when you actually want to start coloring outside the lines, as I mix my metaphors here, with these platform services. ELRICK: What services are out there that you can use to run Kubernetes? JOE: That's a great question. There is a whole host of them. One of the folks in the community has pulled together a spreadsheet of all the different ways to install and run Kubernetes, and I think there were something like 60 entries on it. It's an open source system, and it's incredibly adaptable in terms of running in all sorts of different places, and there are really active startups that are helping folks run this stuff. In terms of the easiest turnkey things, I would probably start with Google Container Engine, which is honestly one click. It fits within a free tier.
It can get you up and running so that you can actually play with Kubernetes super easily. There's this thing from the folks at CoreOS called minikube that lets you run it on your laptop as a development environment. That's a great way to kick the tires.
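As a concrete, hedged version of that kick-the-tires path, assuming minikube and kubectl are already installed (the hello name and nginx image are stand-ins):
# start a single-node Kubernetes cluster in a local VM
minikube start
# confirm the node is ready
kubectl get nodes
# run a container and expose it outside the cluster
kubectl run hello --image=nginx --port=80
kubectl expose deployment hello --type=NodePort --port=80
# print a URL you can open in a browser
minikube service hello --url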
If you're on Amazon, my company Heptio has a quick start that we did with some of the Amazon community folks. It's a CloudFormation template that launches a Kubernetes stack you can get up and running to really understand what's happening. I think once users understand what value it brings at the user level, they'll figure out whether they want to invest in figuring out the best place and the best way to run it for them. My advice to folks would be to find some way to start getting familiar with it, and then decide if you have to go deep in terms of how to be a cluster operator and how to run the thing. ELRICK: Yup. That was going to be my next question. You just brought up your company, Heptio. What was the reason for starting that startup? JOE: Heptio was founded by Craig McLuckie, one of the other Kubernetes founders, and me. We started about six or seven months ago now. The goal is to bring Kubernetes to enterprises: how do we bridge the gap between how forward-thinking technology companies -- companies like Google and Twitter and Facebook, which have a certain way of thinking about building and deploying software -- work, and how the more mainstream enterprise works? We're really using Kubernetes as the tool to do that, and we're doing a bunch of things to make it happen. The first is that we're offering training, support and services, so right now, if companies want to get started today, they can engage with us and we can help them understand what makes sense there. Over time, we want to make that more self-service and easier to do, so that you actually don't have to hire someone like us to get started and be successful. We want to invest in the community in terms of making Kubernetes easier to approach, easier to run, and applicable to a more diverse set of audiences. This conversation that we're having here -- I'm hoping that at some point we won't have to have it, because Kubernetes will be easy enough and self-describing enough that folks won't feel like they have to dig deep to get started. Then the last thing we're going to be doing is offering commercial services and software that really help stitch Kubernetes into the fabric of how large companies work. I think there's a set of tools that you need as you move from being a startup or a small team to actually dealing with the structure of a large enterprise, and that's really where we're going to be looking to create and sell product. ELRICK: Gotcha. CHARLES: How does Kubernetes then compare and contrast with other technologies that we hear about when we talk about integrating with the enterprise and enterprise clients managing their own infrastructure -- things like Cloud Foundry, for example? For someone who's kind of ignorant of both, how do you discriminate between the two? JOE: Cloud Foundry is more of a traditional platform as a service. There's a lot to like there, and there are some places where the Kubernetes community and the Cloud Foundry community are starting to cooperate. There is a common way for provisioning and creating external services, so you can say, "I want a MySQL database." We're trying to make that idea of "give me a MySQL database, I don't care who is running it or where" work the same across Cloud Foundry and Kubernetes, so there is some effort going on there. But Cloud Foundry is more of a traditional platform as a service: it's opinionated in terms of the right way to create, launch, roll out, and hook services together. Whereas Kubernetes -- at least raw Kubernetes -- is in some ways more of a lower-level building block technology than something like Cloud Foundry. The most applicable competitor to Cloud Foundry in this world, I would say, would be OpenShift from Red Hat. OpenShift is a set of extensions built on top of Kubernetes. Right now it's a little bit of a modified version of Kubernetes, but over time that team is working to make it a set of pure extensions on top of Kubernetes that add a platform-as-a-service layer on top of the container cluster layer. The experience for OpenShift will be comparable to the experience for Cloud Foundry. There are other folks too: Microsoft just bought a small company called Deis, which offers a thing called Workflow that gives you a little bit of the flavor of a platform as a service as well. There are multiple flavors of platforms built on top of Kubernetes that would be more apples-to-apples comparable to something like Cloud Foundry. Now, the interesting thing with Deis' Workflow or OpenShift or some of the other platforms built on top of Kubernetes is that, again, if you find yourself in a spot where that platform doesn't work for you for some reason, you don't have to throw out everything. You can actually start picking and choosing which primitives you want to drop down to in the Kubernetes world without having to go down to raw VMs. Whereas Cloud Foundry really doesn't have a widely supported, more raw interface for running containers and services. It's kind of subtle. CHARLES: Yeah, it's kind of subtle. This is an analogy that just popped into my head while I was listening to you, and I don't know if this is way off base. But when you were describing having... What was the word you used? You said a container clast --? It was a container clustered... JOE: Container orchestrator, container cluster. These are all -- CHARLES: Right, and then kind of hearkening back to the beginning of our conversation, where you were talking about being able to specify, "I want 10 of these processes," or an elastic number of these processes -- that reminded me of the Erlang VM, and how baked into that thing is the concept of these lightweight processes, and being able to manage communication between these lightweight processes, and also supervise these processes, and have layers of supervisors supervising other supervisors, to be able to declare a configuration for a set of processes that should always be running. Then also propagate failure of those processes and escalate and stuff like that. Would you say that there is an analogy there? I know they're completely separate beasts, but is there a co-evolution there? JOE: I've never used Erlang in anger, so it's hard for me to speak super knowledgeably about it. From what I understand, I think there is a lot in common there.
I think Erlang was originally built by Nokia for telecom switches, I believe, where you have these strong availability guarantees. Any time you're aiming for high availability, you need to decouple things, with outside control loops and ways to coordinate across pieces of hardware and software, so that when things fail you can isolate the failure, keep a blast radius around it, and then have higher-level mechanisms that can help recover. That's very much what happens with something like Kubernetes and a container orchestrator. I think there's a ton of parallels there. CHARLES: I'm just trying to grasp at analogies of things that might be -- ELRICK: I think they call that OTP, the Open Telecom Platform, or something like that in Erlang. CHARLES: Yeah, but it just got a lot of these things -- ELRICK: Very similar. CHARLES: Yeah, it seems very similar. ELRICK: Interestingly enough, for someone that's starting from the bottom, a person uninitiated to Kubernetes, containers, Docker images, Docker -- where would they start to ramp themselves up? I know you mentioned that you are writing a book --? JOE: Yes. ELRICK: -- 'Kubernetes: Up and Running'. Would that be a good place to start when it comes out, or is there another place they should start before they get there? What are your thoughts on that? JOE: Definitely check out the book. This is a book that I'm writing with Kelsey Hightower, who's one of the developer evangelists for Google. He is the most dynamic speaker I've ever seen, so if you ever have a chance to see him live, it's pretty great. But Kelsey started this, and he's a busy guy, so he brought in Brendan Burns, one of the other Kubernetes co-founders, and me to help finish that book off, and it should be coming out soon. It's Kubernetes: Up and Running. Definitely check that out. There are also a bunch of good tutorials out there that start introducing you to a lot of the concepts in Kubernetes. I'm not going to go through all of those concepts right now. There are probably half a dozen different concepts and pieces of terminology that you have to learn to really get going with it, and I think that's a problem right now: there's a lot to import before you can get started. I gave a talk at the Kubernetes conference in Berlin a month or two ago, and it was essentially like, yeah, we've got our work cut out for us to make this stuff applicable to a wider audience. But if you want to see the power, one of the things you can do is look at the system built on top of Kubernetes called Helm -- H-E-L-M, like a ship's helm, because we love our nautical analogies here. Helm is a package manager for Kubernetes. Just like you can log in to, say, an Ubuntu machine, do apt-get install mysql, and have a database up and running, with Helm you can say "create a WordPress install on my Kubernetes cluster" and it'll just make that happen. It takes the idea of package management, of describing applications, up to the next level. When you're doing regular sysadmin stuff, you can go through and edit the [Inaudible] files or the [Inaudible] files and copy stuff out and use Puppet and Chef to orchestrate all of that. Or you can take the stuff that the package maintainers for the operating system have done and just go ahead and say, "Get that installed." We want to be able to offer a similar experience at the cluster level. I think that's a great way to start seeing the power.
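A hedged sketch of that Helm flow as it looked in this era (Helm 2, with its in-cluster Tiller component); the chart name comes from the stock charts repository of the time, and the rest is illustrative:
# one-time setup: install Helm's server-side component into the cluster
helm init
# refresh the chart repository index
helm repo update
# install WordPress, along with the database and services it depends on
helm install stable/wordpress
# see what got deployed
helm list
kubectl get pods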
After you understand all these concepts, you see how easy it can be to bring up and run these distributed systems that are real applications. The Weaveworks folks -- they're a company that does container networking and introspection stuff, based out of London -- have this example application called Sock Shop. It's like the pet shop example, but distributed, and built to show off how you can build an application on top of Kubernetes that pulls a lot of moving pieces together. Then there are some other applications out there like that which give you a little bit of an idea of what things look like as you start using this stuff to its fullest extent. I would say start with something that feels concrete, where you can start poking around and seeing how things work before you commit. I know some people are depth first learners and some are breadth first learners. If you're depth first, go read the book and go to the Kubernetes documentation site. If you're breadth first, just start with an application and go from there. ELRICK: Okay. CHARLES: I think I definitely fall into that breadth first category. I want to build something with it first before trying to manage my own cluster. ELRICK: Yeah. True. I think I watched your talk, and I did watch one of Kelsey's talks on container management. There was stuff about replicators and schedulers, and I was like, "The ocean just keeps getting deeper and deeper," as I listened to his talk. JOE: Actually, I think this is one of the cultural gaps to bridge between frontend and backend thinking. I think a lot of backend folks end up being these depth first types of folks, where when they want to use a technology, they want to read all the source code before they first apply it. I'm sure everybody has met that type of developer. Then I think there are folks that are breadth first, where they really just want to understand enough to be effective. They want to get something up and running, and if they hit a problem, they'll go ahead and fix that problem, but other than that, they're very goal-oriented: I want to get this thing running. Kubernetes right now is kind of built by systems engineers for systems engineers, and it shows, so we have our work cut out for us, I think, to bridge that gap. It's going to be an ongoing thing. ELRICK: Yeah, I'm like a depth first person, but I have to keep myself in check because I have to get work done as a developer. [Laughter] JOE: That sounds about right, yeah. Yeah, so you're held accountable for writing code. CHARLES: Yeah. That's where real learning happens, when you're depth first, but you've got deadlines. ELRICK: Yes. CHARLES: I think that's a very effective combination. Before we go, I wanted to switch topics away from Kubernetes for just a little bit, because you mentioned something when we were emailing: that, I guess in a different lifetime, you were actually on the original IE team, or at the very beginning of the Internet Explorer team at Microsoft? JOE: Yes, that's where I started my career. Back in '97, I had done a couple of internships at Microsoft and then went to join full time, moved up here to Seattle, and I had a choice between joining the NT kernel team or the Internet Explorer team. This was after IE3, before IE4. I don't know if this whole internet thing is going to pan out, but it looks like it gives you a lot of interesting stuff. You have to understand, the internet wasn't an assumed thing back then, right? ELRICK: Yeah, that's true. JOE: I don't know, this internet thing. CHARLES: I know.
I was there, and I know that old school IE sometimes gets a bad rap. It does get a bad rap for being a little bit of an albatross, but if you were there for the early days of IE, it really was the thing that blew it all wide open, and people do not give it credit. It was extraordinarily ahead of its time. That was the [Inaudible] team that coined DHTML, back when it was called DHTML. I remember actually using it for the first time, around '97, when I was writing raw HTML for everything. CSS was hardly even a thing. Then I realized that all these static things we rendered were no longer etched in stone: the idea that every one of these properties I already knew was now dynamic and completely reflected, moment to moment. It was just eye-opening. It was mind blowing, and it was kind of the beginning of the next 20 years. I want to just talk a little bit about that: where did those ideas come from, and what was the impetus for them? JOE: Oh, man. There's so much history here. First of all, thank you for calling that out. I think we did a lot of really interesting, groundbreaking work then. I think the sin was not in IE6 as it was but in [inaudible]. I think the fact that -- CHARLES: IE6 was actually an amazing browser. Absolutely an amazing browser. JOE: And then the world moved past it, right? It didn't catch up. That was the problem. For its time, when it was released, I was proud of that release. But four years on, things get a little bit long in the tooth. IE3 was based on a rendering engine that was very static, very similar to Netscape at the time. The thing to keep in mind is that Netscape at that time would download a webpage, parse it and display it. There was no idea of a DOM at Netscape at that point, so it would throw away a lot of the information and only store stuff that was very specific to the display context. Literally, when you resized the window in Netscape back then, it would reparse the original HTML to regenerate things. It wasn't even able to resize the window without going through and reparsing. What we did with IE4 -- and I joined close to the tail end of IE4, so I can't claim too much credit here -- was to bring some of the ideas from something like Visual Basic and merge those into the idea of the browser, where you actually have this programming model, which became the DOM, of where your controls are, how they fit together, and being able to live-modify these things. This was all part and parcel of how people built Windows applications. It turns out that IE4 was the combination of the old IE3 rendering engine -- sort of stealing stuff from there -- with this project that was built as a bunch of ActiveX controls for Office, called [inaudible]. They smashed that stuff together and turned it into a browser rendering engine, and that rendering engine ended up being called Trident. That's the thing that got a nautical theme -- I don't think it's connected -- and that's the thing that I joined and started working on at the time. This whole idea that you actually have a DOM, that you can modify a programmable representation of DHTML and have it be live-updated on screen: that was only with IE4. I don't think anybody had done it at that point. The competing scheme from Netscape was this thing called layers, where it was essentially multiple HTML documents, and you could replace one of the HTML documents and they would be rendered on top of each other. It was awful, and it was lost to the mists of time.
CHARLES: I remember marketing material about layers and hearing how layers was going to be this wonderful thing, but I don't remember: did they ever even ship it? JOE: I don't know if they did or not. The thing that you've got to understand is that anybody who spent any significant amount of time at Microsoft really internalized the idea of a platform like no place else. Microsoft lives and breathes platforms. I think sometimes it does them a disservice. I've been out of Microsoft for like 13 years now, so maybe some of my knowledge is a little outdated here, but I still have friends over there. Microsoft is like the poor schmuck that goes to Vegas, pulls the slot machine, and wins the jackpot on the first pull. I'm not saying that there wasn't a lot of hard work behind Windows, but they hit the goldmine with that from a platform point of view, and then they essentially did it again with Office. You have these two incredibly powerful platforms that ended up being an enormous growth engine for the company over time, and that fundamentally changed the worldview of Microsoft: they really viewed everything as a platform. I think there were some forward-thinking people at Netscape and other companies, but I think Microsoft early on really understood what it meant to be a platform, and we saw back then what the web could be. One of the original IE team members -- I'm going to give a shout out to him -- is Chris Wilson, who's now on the Chrome team, I think; I don't know where he is these days. Chris was on the original IE team and is still heavily involved in web standards. None of this stuff is a surprise to us. After we finished IE6, a lot of the IE team rolled off to do Avalon, which became Windows Presentation Foundation, which was really looking to reinvent Windows UI, importing a bunch of the ideas from the web and modern programming. That's where we came up with XAML, which eventually begat Silverlight, for good or ill. But some of our original demos for Avalon -- if you go back in time and look at them, that was probably... I don't know, 2000 or something like that -- are exactly the type of stuff that people are building with the web platform today. Back then, they'll flex with the thing. We're reinventing this stuff over and over again. I like where it's going. I think we're in a good spot right now, but we see things like the Shadow DOM come up, and I look at that and I'm like, "We had HTC controls which did a lot of Shadow DOM-like stuff in IE early on." These things get reinvented and refined over time, and I think it's great, but it's fascinating to be in the industry long enough that you can see these patterns repeat. CHARLES: It is actually interesting. I remember doing UI in C++ and in Java. We did a lot of Java, and for a long time I felt like I was wandering in the wilderness of the web, where I was like, "Oh, man. I just wish we had these capabilities, the things that we could do in Swing 10 or 15 years ago." But the happy ending is that I really do feel we are in a place now, finally, where you have options, and the developer experience really is truly competitive with the way it was all those years ago. It's also a testament to just how compelling the deployment model of the web is: people were willing to forgo all of that so they could distribute their applications really easily. JOE: Never underestimate the power of view source. CHARLES: Yeah.
[Laughter] ELRICK: I think that's why these sorts of conversations are very powerful: going back in time and looking at the development up until now. Because, like they say, people that don't know their history are doomed to repeat it. I think this is a beautiful conversation. JOE: Yeah. I've done the developer-focused frontend type of stuff, and I've done the backend stuff, and one of the things that I've noticed is that you see patterns repeat over and over again. Let's be honest, it probably took more like a week -- I was going to say a weekend -- to learn React the other day, and the way that it encapsulates state up and down, model view... there are different twists on these things that you see in different places, but you see the same patterns repeat again and again. I look at the way that we do scheduling in Kubernetes. Scheduling is this idea that you have a bunch of workloads that require a certain amount of CPU and RAM, and you want to play this Tetris game of fitting those things in. You look at scheduling like that, and there are echoes of how layout happens in a browser. There is a deeper game going on here, and as you go through your career, if you're like me and you're always interested in trying new things, you never leave it all behind. You always see things that influence your thinking moving forward. CHARLES: Absolutely. I kind of did the opposite: I started out on the backend and then moved over into the frontend, but there's never been a concept I was familiar with from working on server side code that did not come to my aid at some point working on the frontend. I can appreciate that fully. ELRICK: Yup. I can say the same thing. I jump all around the board, learning things that I have no use for currently, but somehow they come back to help me. CHARLES: They will come back to help you. You thread them together at some point. ELRICK: Yup. CHARLES: As they said in one of my favorite video games in high school, Mortal Kombat: there is no knowledge that is not power. JOE: I was all Street Fighter. CHARLES: Really? [Laughter] JOE: I cut class in high school and went to play Street Fighter at the mall. CHARLES: There is no knowledge that isn't power, except... I'm not sure there's much power in the knowledge of all those little mashy key-button combinations. JOE: Well, the Konami code still shows up all the time, right? [Laughter] CHARLES: I'm surprised how that's been passed down from generation to generation. JOE: You still see it show up in places that you wouldn't expect. One of the sad things is that early on in IE, we had all these Internet Explorer Easter eggs, where if you typed the right combination into the address bar, did this thing, clicked, turned around three times and faced west, you actually got this cool DHTML thing. Those things are largely disappearing. People don't make Easter eggs like they used to. I think there are probably legal reasons for making sure that every feature is to spec, but I kind of miss those old Easter eggs that we used to find. CHARLES: Yeah, me too. I guess everybody saves their Easter eggs for April 1st now, but -- JOE: For the release notes, [inaudible]. CHARLES: All right. Well, thank you so much for coming by, Joe. I know I'm personally excited.
I'm going to go find one of those Kubernetes-as-a-service offerings that you mentioned and try to do a little breadth first learning. But whether you're depth first or breadth first, I say go to it, and thank you so much for coming on the show. JOE: Well, thank you so much for having me on. It's been great. CHARLES: Before we go, there is actually one other special item that I wanted to mention. This is Open Source Bridge, a conference being held in Portland, Oregon on the 20th to 23rd of June this year. The tracks are activism, culture, hacks, practice and theory, and podcast listeners can get a discount of $50 off the ticket by entering the code 'podcast' on the Eventbrite page, which we will link to in the show notes. Thank you, Elrick. Thank you, Joe. Thank you everybody, and we will see you next week.