Podcasts about documentdb

  • 25 podcasts
  • 39 episodes
  • 37m avg duration
  • Infrequent episodes
  • Latest: Nov 30, 2023




Latest podcast episodes about documentdb

Screaming in the Cloud
How MongoDB is Paving The Way for Frictionless Innovation with Peder Ulander

Nov 30, 2023 (36:08)


Peder Ulander, Chief Marketing & Strategy Officer at MongoDB, joins Corey on Screaming in the Cloud to discuss how MongoDB is paving the way for innovation. Corey and Peder discuss how Peder made the decision to go from working at Amazon to MongoDB, and Peder explains how MongoDB is seeking to differentiate itself by making it easier for developers to innovate without friction. Peder also describes why he feels databases are more ubiquitous than people realize, and what it truly takes to win the hearts and minds of developers.

About Peder

Peder Ulander, the maestro of marketing mayhem at MongoDB, juggles strategies like a tech wizard on caffeine. As the Chief Marketing & Strategy Officer, he battles buzzwords, slays jargon dragons, and tends to developers with a wink. From pioneering Amazon's cloud heyday as Director of Enterprise and Developer Solutions Marketing to leading the brand behind cloud.com's insurgency, Peder's built a legacy as the swashbuckler of software, leaving a trail of market disruptions one vibrant outfit at a time. Peder is the Scarlett Johansson of tech marketing — always looking forward, always picking the edgy roles that drive what's next in technology.

Links Referenced:

MongoDB: https://mongodb.com

Transcript

Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.

Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. This promoted guest episode of Screaming in the Cloud is brought to us by my friends and yours at MongoDB, and into my veritable verbal grist mill, they have sent Peder Ulander, their Chief Marketing Officer. Peder, an absolute pleasure to talk to you again.

Peder: Always good to see you, Corey.
Thanks for having me.

Corey: So, once upon a time, you worked in marketing over at AWS, and then you transitioned off to Mongo to, again, work in marketing. Imagine that. Almost like there's a narrative arc to your career. A lot of things change when you change companies, but before we dive into things, I just want to call out that you're a bit of an aberration in that every single person that I have spoken to who has worked within your org has nothing but good things to say about you, which means you are incredibly effective at silencing dissent. Good work.

Peder: Or it just shows that I'm a good marketer and make sure that we paint the right picture that the world needs to see.

Corey: Exactly. “Do you have any proof of you being a great person to work for?” “No, just word of mouth,” and everyone, “Ah, that's how marketing works.”

Peder: Exactly. See, I'm glad you picked up somewhere.

Corey: So, let's dive into that a little bit. Why would you leave AWS to go work at Mongo? Again, my usual snark and sarcasm would come up with a half dozen different answers, each more offensive than the last. Let's be serious for a second. At AWS, there's an incredibly powerful engine that drives so much stuff, and the breadth is enormous.

MongoDB, despite an increasingly broad catalog of offerings, is nowhere near that level of just universal applicability. Your product strategy is not a Post-It note with the word ‘yes' written on it. There are things that you do across the board, but they all revolve around databases.

Peder: Yeah. So, going back prior to MongoDB, I think you know, at AWS, I was across a number of different things, from the developer ecosystem, to the enterprise transformation, to the open-source work, et cetera, et cetera.
And being privy to how customers were adopting technology to change their business or change the experiences that they were delivering to their customers or increase the value of the applications that they built, you know, there was a common thread of something that fundamentally needed to change. And I like to go back to just the evolution of tech in that sense. We could talk about going from physical on-prem systems to now we're distributed in the cloud. You could talk about application constructs that started as big fat monolithic apps that moved to virtual, then microservices, and now functions.

Or you think about networking, we've gone from fixed wire line, to network edge, and cellular, and what have you. All of the tech stack has changed with the exception of one layer, and that's the data layer. And I think for the last 20 years, what's been in place has worked okay, but we're now meeting this new level of scale, this new level of reach, where the old systems are not what the new systems, or the new experiences, are going to be built on. And as I was approached by MongoDB, I kind of sat back and said, “You know, I'm super happy at AWS. I love the learning, I love the people, I love the space I was in, but if I were to put my crystal ball together”—here's a Bezos statement of looking around corners—“The data space is probably one of the biggest spaces ripe for disruption and opportunity, and I think Mongo is in an incredible position to go take advantage of that.”

Corey: I mean, there's an easy number of jokes to make about AmazonBasics MongoDB, which is my disparaging name for their DocumentDB first-party offering. And for a time, it really felt like AWS's perspective toward its partners was one of outright hostility, if not antagonism. But that narrative no longer holds true in 2023. There's been a definite shift.
And to be direct, part of the reason that I believe that is the things you have said both personally and professionally in your role as CMO of Mongo that has caused me to reevaluate this because despite all of your faults—a counted list of which I can provide you after the show—

Peder: [laugh].

Corey: You do not say things that you do not believe to be true.

Peder: Correct.

Corey: So, something has changed. What is it?

Peder: So, I think there's an element of coopetition, right? So, I would go as far as to say the media loved to sensationalize—actually even the venture community—loved to sensationalize the strip-mining of open-source communities that Amazon represented a number of years ago. The reality was their intent was pretty simple. They built an incredibly amazing IT stack, and they wanted to run whatever applications and software were important to their customers. And when you think about that, the majority of systems today, people want to run open-source because it removes friction, it removes cost, it enables them to go do cool new things, and be on the bleeding edge of technology.

And Amazon did their best to work with the top open-source projects in the world to make them available to their customers. Now, for the commercial vendors that are leaning into this space, that obviously does present a threat, right? And we've seen that along a number of the cohorts of whether you want to call it single-vendor open-source or companies that have a heavy, vested interest in seeing the success of their enterprise stack match the success of the open-source stack. And that's, I think, where media, analysts, venture, all kind of jumped on the bandwagon of not really, kind of, painting that bigger picture for the future. I think today when I look at Amazon—and candidly, it'll be any of the hyperscalers; they all have a clone of our database—it's an entry point.
They're running just the raw open-source operational database capabilities that we have in our community edition and making that available to customers.

We believe there's a bigger value in going beyond just that database and introducing, you know, anything from the distributed zones to what we do around vector search to what we do around stream processing, and encryption and all of these advanced features and capabilities that enable our customers to scale rapidly on our platform. And the dependency on delivering that is with the hyperscalers, so that's where that coopetition comes in, and that becomes really important for us when we're casting our net to engage with some of the world's largest customers out there. But interestingly enough, we become a big drag of services for an AWS or any of the other hyperscalers out there, meaning that for every dollar that goes to a MongoDB, there's, you know, three, five, ten dollars that goes to these hyperscalers. And so, they're very active in working with us to ensure that, you know, we have fair and competing offers in the marketplace, that they're promoting us through their own marketplace as well as their own channels, and that we're working together to further the success of our customers.

Corey: When you take a look at the exciting things that are happening at the data layer—because you mentioned that we haven't really seen significant innovation in that space for a while—one of the things that I see happening is with the rise of Generative AI, which requires very special math that can only be handled by very special types of computers.
I'm seeing at least a temporary inversion in what has traditionally been thought of as data gravity, where it's easier to move compute close to the data, but in this case, since the compute only lives in the, um, sparkling us-east-1 regions of Virginia, otherwise, it's just generic, sparkling expensive computers, great, you have to effectively move the mountain to Mohammed, so to speak. So, in that context, what else is happening that is driving innovation in the data space right now?

Peder: Yeah, yeah. I love your analogy of moving the mountain to Mohammed because that's actually how we look at the opportunity in the whole Generative AI movement. There are a lot of tools and capabilities out there, whether we're looking at code generation tools, LLM modeling vendors, some of the other vector database companies that are out there, and they're all built on the premise of, bring your data to my tool. And I actually think that's a flawed strategy. I think that these are things that are going to be features in core application databases or operational databases, and it's going to be dependent on the reach and breadth of that database, and the integrations with all of these AI tools, that will define the victor going forward.

And I think that's been a big core part of our platform. When we look at Atlas—111 availability zones across all three hyperscalers with a single, unified, you know, interface—we're actually able to have the customers keep their operational data where it's most important to them and then apply the tools of the hyperscalers or the partners where it makes the most sense without moving the data, right? So, you don't actually have to move the mountain to Mohammed.
We're literally building an experience where those that are running on MongoDB and have been running on MongoDB can gain advantage of these new tools and capabilities instantly, without having to change anything in their architectures or how they're building their applications.

Corey: There was a somewhat over-excited… I guess, over-focus in the space of vector databases because whatever those are—which involves math, and I am in no way, shape, or form smart enough to grasp the nuances thereof, but everyone assures me that it's necessary for Generative AI and machine learning and yadda, yadda, yadda. So, when in doubt, when I'm confronted by things I don't fully understand, I turn to people who do. And the almost universal consensus that I have picked up from people who track databases for a living—as opposed to my own role of inappropriately using everything in the world except databases as a database—is that vector is very much a feature, not a core database type.

Peder: Correct. The best way to think about it—I mean, databases in general, they're dealing with structured and unstructured data, and generally, especially when you're doing searches or relevance, you're limited to the fact that those things in the rows and the columns or in the documents are text, right? And the reality is, there's a whole host of information that can be found in metadata, in images, in sounds, in all of these other sources that were stored as individual files but unsearchable. Vector, vectorization, and vector embeddings actually enable you to take things far beyond the text and numbers that you traditionally were searching against and actually apply more, kind of, intelligence to it, or apply sounds or apply sme—you know, you can vectorize smells to some extent.
And what that does is it actually creates a more pleasing slash relevant experience for how you're actually building the engagements with your customers.

Now, I'll make it a little more simple because that was trying to define vectors, which as you know, is not the easiest thing. But imagine being able to vectorize—let's say I'm a car company—we're actually working with a car company on this—and you're able to store all of the audio files of cars that are showing certain diagnostic issues—the putters and the spurts and the pings and the pangs—and you can actually now isolate these sounds and apply them directly to the problem and resolution for the mechanics that are working on them. Using all of this stuff together, now you actually have a faster time to resolution. You don't want mechanics knowing the mechanics of vectors in that sense, right, so you build an application that abstracts all of that complexity. You don't require them to go through PDFs of data and find all of the options for fixing this stuff.

The relevance comes back and says, “Yes, we've seen that sound 20 times across this vehicle. Here's how you fix it.” Right? And that cuts a significant amount of time, cost, efficiency, and complexity for those auto mechanics. That is such a big push forward, I think, from a technology perspective, on what the true promise of some of these new capabilities are, and why I get excited about what we're doing with vector and how we're enabling our customers to, you know, kind of recreate experiences in a way that are more human, more relevant.

Corey: Now, I have to say that of course you're going to say nice things about your capabilities where vector is concerned. You would be failing in your job if you did not. So, I feel like I can safely discount every positive thing that you say about Mongo's positioning in the vector space and instead turn to, you know, third parties with no formalized relationship with you.
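The car-diagnostics idea Peder describes boils down to nearest-neighbor search over embeddings: each recorded sound becomes a numeric vector, and "relevance" is just similarity between vectors. A minimal sketch of that idea follows; the three-dimensional vectors and fault names are invented for illustration, since real audio embeddings would come from a learned model and have hundreds of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings of known engine-fault sounds (placeholder
# values; a real system would store model-generated embeddings).
known_faults = {
    "worn timing belt":   [0.9, 0.1, 0.2],
    "exhaust leak":       [0.1, 0.8, 0.3],
    "failing water pump": [0.2, 0.2, 0.9],
}

def diagnose(query_embedding):
    """Return known faults ranked by similarity to the query sound."""
    return sorted(
        known_faults,
        key=lambda name: cosine_similarity(known_faults[name], query_embedding),
        reverse=True,
    )

ranked = diagnose([0.85, 0.15, 0.25])
print(ranked[0])  # the closest known fault to the recorded sound
```

A production system would hand this ranking to a vector index rather than scanning every stored vector, but the scoring math is the same.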
Yesterday, Retool's State of AI report came across my desk. I am a very happy Retool customer. They've been a periodic sponsor, from time-to-time, of my ridiculous nonsense, which is neither here nor there, but I want to disclaim the relationship.

And they had a Gartner Magic Quadrant equivalent that on one axis had Net Promoter Score—NPS, which is one of your people's kinds of things—and the other was popularity. And Mongo was so far up and to the right that it was almost hilarious compared to every other entrant in the space. That is a positioning that I do not believe it is possible to market your way into directly. This is something where people who are actually doing these things have to use the product, and it has to stand up. Mongo is clearly effective at doing this in a way that other entrants aren't. Why?

Peder: Yeah, that's a good question. I think a big part of that goes back to the earlier statement I made that vector databases or vector technology, it's a feature, it's not a separate thing, right? And when I think about all of the new entrants, they're creating a new model where now you have to move your data out of your operational database and into their tool to get an answer and then push it back in. The complexity, the integrations, the capabilities, it just slows everything down, right?
And I think when you look at MongoDB's approach of taking this developer data platform vision, getting all of the core tools that developers need to build compelling applications with from a data perspective, and integrating them into one seamless experience, we're able to basically bring classic operational database capabilities and classic text search capabilities, and embed the vector search capabilities as well. It actually creates a richer platform and experience without all of the complexity that's associated with a bolt-on sidecar Gen AI tool or vector database.

Corey: I would say that that's one of those things that, again, can only really be credibly proven by what the market actually does, as opposed to, you know, lip-sticking the heck out of a pig and hoping that people don't dig too deeply into what you're saying. It's definitely something we're seeing adoption of.

Peder: Yeah, I mean, this kind of goes to some of the stuff, you know, you pointed out, the Retool thing. This is not something you can market your way into. This is something where, you know, users are going to dictate the winners in this space, the developers, they're going to dictate the winners in the space. And so, what do you have to do to win the hearts and minds of developers? You have to make the tech extremely approachable, it's got to be scalable to meet their needs, not a lot of friction involved in learning these new capabilities and applying it to all of the stuff that has come before. All of these things put together, really focusing on that developer experience, I mean, that goes to the core of the MongoDB ethos.

I mean, this is who we were when we started the company so long ago, and it's continued to drive the innovation that we do in the platform. And I think this is just yet again, another example of focusing on developer needs, making it super engaging and useful, removing the friction, and enabling them to just go create new things. That's what makes it so fun.
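For a concrete sense of what "embedded in the platform" means here, Atlas exposes vector search as an aggregation stage that composes with ordinary query stages in one pipeline. The sketch below shows roughly that shape; the index name, field path, and query vector are placeholders invented for this example, and the exact options should be checked against the current Atlas Vector Search documentation.

```python
# Sketch of an Atlas Vector Search aggregation pipeline: a $vectorSearch
# stage followed by an ordinary $project stage, so vector retrieval and
# classic query operators live side by side. All names below are
# hypothetical placeholders, not values from this episode.
query_vector = [0.12, -0.07, 0.33]  # would come from an embedding model

pipeline = [
    {
        "$vectorSearch": {
            "index": "engine_sounds_index",  # hypothetical index name
            "path": "sound_embedding",       # document field holding vectors
            "queryVector": query_vector,
            "numCandidates": 100,            # ANN candidates to consider
            "limit": 5,                      # results to return
        }
    },
    {"$project": {"fault": 1, "fix": 1, "_id": 0}},
]

# With a live cluster and pymongo this would run as:
# results = db.diagnostics.aggregate(pipeline)
print(len(pipeline))
```

The point of the pipeline form is the one Peder makes: the vector stage is just another step next to projections, filters, and text search, not a separate datastore to sync with.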
And so, you know, as a marketer, when I get the Retool chart across my desk—we haven't been pitching them, we haven't been marketing to them, we haven't tried to influence this stuff—knowing that this is a true, unbiased audience actually is pretty cool to see. To your point, it was surprising how far up and to the right that we sat, given, you know, where we were in just—we launched this thing… six months ago? We launched it in June. The number of customers that have signed up, are using it, and engaged with us on moving forward has been absolutely amazing.

Corey: I think that there has been so much that gets lost in the noise of marketing. My approach has always been to cut through so much of it—that I think AWS has always done very well with—is—almost at their detriment these days—but if you get on stage, you can say whatever you want about your company's product, and I will, naturally and lovingly, make fun of whatever it is that you say. But when you have a customer coming on stage and saying, “This is how we are using the thing that they have built to solve a very specific business problem that was causing us pain,” then I shut up, and I listen because it's very hard to wind up dismissing that without being an outright jerk about things. I think the failure mode of that is, taken too far, you lose the ability to tell your own story in a coherent way, and it becomes a crutch that becomes very hard to get rid of. But the proof is really in the pudding.

For me, like, the old jokes about—in the early teens—where MongoDB would periodically lose data as configured by default. Like, “MongoDB. It's Snapchat for databases.” Hilarious joke at the time, but it really has worn thin. That's like being angry about what Microsoft did in 2005 and 2006.
It's like, “Yeah, okay, you have a point, but it is also ancient history, and at some point you need to get with the modern era, get with the program.”

And I think that seeing the success and breadth of MongoDB that I do—you are in virtually every customer that I talk to, in some way, shape, or form—and seeing what it is that they're doing with you folks, it is clear that you are not a passing fad, that you are not going away anytime soon.

Peder: Right.

Corey: And even with building things in my spare time and following various tutorials of dubious credibility from various parts of the internet—as those things tend to go—MongoDB is very often a default go-to reference when someone needs a database for which a SQLite file won't do.

Peder: Right. It's fascinating to see the evolution of MongoDB, and today we're lucky to track 45,000-plus customers on our platform doing absolutely incredible things. But I think the biggest—to your point—the biggest proof is in the pudding when you get these customers to stand up on stage and talk about it. And even just recently, through our .local series, some of the customers that we've been highlighting are doing some amazing things using MongoDB in extremely business-critical situations.

My favorite was, I was out doing our .local in Hong Kong, where Cathay Pacific got up on stage, and they talked a little bit about their flight folder. Now, if you remember going through the airport, you always see the captains come through, and they had those two big boxes of paperwork before they got onto the plane. Not only was that killing the environment with all the trees that got cut down for it, it was cumbersome, complex, and added a lot of time and friction with regards to flight operations.
Now, take that from a single flight over all of the fleet that's happening across the world.

We were able to work with Cathay Pacific to digitize their entire flight folder, all of their documentation, removing the need for cutting down trees and minimizing their carbon footprint, but at the same time, actually delivering a solution where, if it goes down, it grounds the entire fleet of the airline. So, imagine that. That's so business-critical, mission-critical, has to be there, reliable, resilient, available for the pilots, or it shuts down the business. Seeing that growth and that transformation while also seeing the environmental benefit for what they have achieved, to me, that makes me proud to work here.

Similarly, we have companies like Ford, another big brand-name company here in the States, where their entire connected car experience and how they're basically operationalizing the connection between the car and their home base, this is all being done using MongoDB as well. So, as they think of these new ideas, recognizing that things are going to be either out at the edges or at a level of scale that you can't just bring it back into classic rows and columns, that's actually where we're so well-suited to grow our footprint. And, you know, I remember back to when I was at Sun—Sun Microsystems. I don't know if anybody remembers that company. That was an old one.

But at one point, it was Jonathan that said, “Everything of value connects to the network.” Right? Those things that are connecting to the network also need applications, they need data, they need all of these services. And the further out they go, the more you need a database that basically scales to meet them where they are, versus trying to get them to come back to where your database happens to sit. And in order to do that, that's where you break the mold.

That's where—I mean, that kind of goes into the core ethos of why we built this company to begin with.
The original founders were not here to build a database; they were building a consumer app that needed to scale to the edges of the earth. They recognized that databases didn't solve for that, so they built MongoDB. That's actually thinking ahead. Everything connecting to the network, everything being distributed, everything basically scaling out to all the citizens of the planet fundamentally needs a new data layer, and that's where I think we've come in and succeeded exceptionally well.Corey: I would agree. Another example I like to come up with, and it's fun that the one that leaps to the top of my mind is not one of the ones that you mentioned, but HSBC—the massive bank—very publicly a few years ago, wound up consolidating, I think it was 46 relational databases onto MongoDB. And the jokes at the time wrote themselves, but let's be serious for a second. Despite the jokes that we all love to tell, they are a bank, a massive bank, and they don't play fast-and-loose or slap-and-tickle with transactional integrity or their data stores for these things.Because there's a definite belief across the banking sector—and I know this having worked in it myself for years—that if at some point, you have the ATMs spitting out the wrong account balances, people will begin rioting in the streets. I don't know if that's strictly accurate or hyperbole, but it's going to cause massive amounts of chaos if it happens. So, that is something that absolutely cannot happen. The fact that they're willing to engage with you folks and your technology and be public about it at that scale, that's really all you need to know from a, “Is this serious technology or clown shoes technology?”Peder: [laugh]. Well, taking that comment, now let's exponentially increase that. You know, if I sit back, and I look at my customer base, financial services is actually one of our biggest verticals as a business. And you mentioned HSBC. 
We had Wells Fargo on the stage last year at our world event.

Nine of the world's top ten banks are using MongoDB in some of their applications, some at the scale of HSBC, some are still just getting started. And it all comes down to the fact that we have proven ourselves, we are aligned to mission-critical business environments. And I think when it comes down to banks, especially that transactional side, you know, building in the capabilities to be able to have high-frequency transactions in the banking world is a hard thing to go do, and we've been able to prove it with some of the largest banks on the planet.

Corey: I also want to give you credit—although it might be that I'm giving you credit for a slow release process; I hope not—but when I visit mongodb.com, it still talks up front that you are—and I want to quote here—oh, good lord, it changes every time I load the page—but it talks about, “Build faster, build smarter,” on this particular version of the load. It talks about the data platform. You have not effectively decided to pivot everything you say in public to tie directly into the Generative AI hype bubble that we are currently experiencing. You have a bunch of different use cases, and you're not suddenly describing what you do in Gen AI terms that make it impossible to understand just what the company-slash-product-slash-services actually do.

Peder: Right.

Corey: So, I want to congratulate you on that.

Peder: Appreciate that, right? Look, it comes down to the core basics. We are a developer data platform. We bring together all of the capabilities, tools, and functions that developers need when building apps as it pertains to their data functions or data layer, right? And that's why this integrated approach of taking our operational database and building in search, or stream processing, or vector search, all of the things that we're bringing to the platform enable developers to move faster.
And what that says is, we're great for all use cases out there, not just Gen AI use cases. We're great for all use cases where customers are building applications to change the way that they're engaging with their customers.

Corey: And what I like about this is that you're clearly integrating this stuff under the hood. You are talking to people who are building fascinating stuff, you're building things yourself, but you're not wrapping yourself in the mantle of, “This is exactly what we do because it's trendy right now.” And I appreciate that. It's still intelligible, and I wouldn't think that I had to congratulate someone on, “Wow, you build marketing that a human being can extract meaning from. That's amazing.” But in 2023, the closing days thereof, it very much is.

Peder: Yep, yep. And it speaks a lot to the technology that we've built because, you know, on one side—it reminds me a lot of the early days of cloud where everything was kind of cloud-washed for a bit, and we're seeing a little bit of that in the hype cycle that we have right now—sticking to our guns and making sure that we are building a technology platform that enables developers to move quickly, removing the friction from the developer lifecycle as it pertains to the data layer, that's where the success is. Right, we have to stay on top of all of the trends, we have to make sure that we're enabling Gen AI, we have to make sure that we're integrating with the Amazon Bedrocks and the CodeWhisperers of the world, right, to go push this stuff forward. But to the point we made earlier, those are capabilities and features of a platform where the higher-level order is to really empower our customers to develop innovative, disruptive, or market-leading technologies for how they engage with their customers.

Corey: Yeah.
And it's neat to be able to see that you are empowering companies to do that without feeling the need to basically claim their achievements as your own, which is an honest-to-God hard thing to do, especially as you become a platform company because increasingly, you are the plumbing that makes a lot of the flashy, interesting stuff possible. It's imperative, you can't have those things without the underlying infrastructure, but it's hard to talk about that infrastructure, too.

Peder: You know, it's funny, I'm sure all of my colleagues would hate me for saying this, but the wheel doesn't turn without the ball bearing. Somebody still has to build the ball bearing in order for that sucker to move, right? And that's the thing. This is the infrastructure, this is the heart of everything that businesses need to build applications. And one of the—you know, another kind of snide comment I've made to some of my colleagues here is, if you think about every market-leading app, in fact, let's go to the biggest experiences you and I use on a daily basis, I'm pretty sure you're booking travel online, you're searching for stuff on Google, you're buying stuff through Amazon, you're renting a house through Airbnb, and you're listening to your music through Spotify. What are those? Those are databases with a search engine.

Corey: The world is full of CRUD applications. These are, effectively, simply pretty front-ends to a database. And as much as we'd like to pretend otherwise, that's very much the reality of it. And we want that to be the case. Different modes of interaction, different requirements around them, but yeah, that is what so much of the world is. And I think to ignore that is to honestly blind yourself to a bunch of very key realities here.

Peder: That kind of goes back to the original vision for when I came here. It's like, look, everything of value for us, everything that I engage with, is—to your point—it's a database with a great experience on top of it.
Now, let's start to layer in this whole Gen AI push, right, what's going on there. We're talking about increased relevance in search, we're talking about new ways of thinking about sourcing information. We've even seen that with some of the latest ChatGPT stuff, where developers are using it to get code snippets and figure out how to solve things within their platform.

The era of the classic search engine is in the middle of a complete change, and the opportunity, I think, that I see as this moves forward is that there is no incumbent. There isn't somebody who owns this space, so we're just at the beginning of what probably will be the next Googles, Airbnbs, and Ubers of the world for the next generation. And that's really exciting to see.

Corey: I'm right there with you. One of the interesting founding stories at Google is that they wound up calling typical storage vendors for what they needed, got basically ‘screw on out of here, kids,' pricing, so they shrugged, and because they had no real way to get enterprise-quality hardware, they built a bunch of highly redundant systems on top of basically a bunch of decommissioned crap boxes from the university they were able to more or less get for free or damn near it, and that led to a whole innovation in technology. One of the glorious things about cloud that I think goes under-sold is that I can build a ridiculous application tonight for maybe, what, 27 cents of IT infrastructure spend, and if it doesn't work, I round up to a dollar, it'll probably get waived because it'll cost more to process the credit card transaction than take my 27 cents. Conversely, if it works, I'm already building with quote-unquote, “Enterprise-grade” components. I don't need to do a massive uplift. I can keep going. And that is no small thing.

Peder: No, it's not. When you step back, every single one of those stories was about abstracting that complexity to the end-user. In Google's case, they built their own systems.
You or I probably didn't know that they were screwing these things together and soldering them in the back room in the middle of the night. Similarly, when Amazon got started, that was about taking something that was only accessible to a few thousand and making it accessible to a few million, at a cost of 27 cents to build an app.You removed the risk, you removed the friction from enabling a developer to be able to build. That next wave—and this is why I think the things we're doing around Gen AI, and our vector search capabilities, and literally how we're building our developer data platform is about removing that friction and those limits and enabling developers to just come in and, you know, effectively do what they do best, which is innovate, versus all of the other things. You know, in the Google world, it's no longer racking and stacking. In the cloud world, it's no longer managing and integrating all the systems. Well, in the data world, it's about making sure that all of those integrations are ready to go and at your fingertips, and you just focus on what you do well, which is creating those new experiences for customers.Corey: So, we're recording this a little bit beforehand, but not by much. You are going to be at re:Invent this year—as am I—for eight nights—Peder: Yes.Corey: Because for me at least, it is crappy cloud Hanukkah, and I've got to deal with that. What have you got coming up? What do you plan to announce? Anything fun, exciting, or are you just there basically to see how many badges you can actually scan in one day?Peder: Yeah [laugh]. Well, you know, it's shaping up to be quite an incredible week, there's no question. We'll see what the week brings. As you know, re:Invent is a huge event for us. We do a lot within that ecosystem, and a lot of the customers that are up on stage talking about the cool things they're doing with AWS are also MongoDB customers. So, we go all out. 
I think you and I spoke before about our position there with SugarCane right on the show floor, I think we've managed to secure you a Friends of Peder all-access pass to SugarCane. So, I look forward to seeing you there, Corey.Corey: Proving my old thesis of, it really is who you know. And thank you for your generosity, please continue.Peder: [laugh]. So, we will be there in full force. We have a number of different innovation talks, we have a bunch of community-related events, working with developers, helping them understand how we play in the space. We're also doing a bunch of hands-on labs and design reviews that help customers basically build better, and build faster, build smarter—to your point earlier on some of the marketing you're getting off of our website. But we're also doing a number of announcements.I think first off, it was actually this last week, we made the announcement of our integrations with Amazon—or—yeah, Amazon CodeWhisperer. So, their code generation tool for developers has now been fully trained on MongoDB so that you can take advantage of some of these code generation tools with MongoDB Atlas on AWS. Similarly, there's been a lot of noise around what Amazon is doing with Bedrock and the ability to automate certain tasks and things for developers. We are going to be announcing our integrations with Agents for Amazon Bedrock being supported inside of MongoDB Atlas, so we're excited to see that, kind of, move forward. And then ultimately, we're really there to celebrate our customers and connect them so that they can share what they're doing with many peers and others in the space to give them that inspiration that you so eloquently talked about, which is, don't market your stuff; let your customers tell what they're able to do with your stuff, and that'll set you up for success in the future.Corey: I'm looking forward to seeing what you announce in conjunction with what AWS announces, and the interplay between those two. 
As always, I'm going to basically ignore 90% of what both companies say and talk instead to customers, and, “What are you doing with it?” Because that's the only way to get truth out of it. And, frankly, I've been paying increasing amounts of attention to MongoDB over the past few years, just because of what people I trust who are actually good at databases have to say about you folks. Like, my friends at RedMonk always like to say—I've stolen the line from them—“You can buy my attention, but not my opinion.”Peder: A hundred percent.Corey: You've earned the opinion that you have, at this point. Thank you for your sponsorship; it doesn't hurt, but again, you don't get to buy endorsements. I like what you're doing. Please keep going.Peder: No, I appreciate that, Corey. You've always been supportive, and definitely appreciate the opportunity to come on Screaming in the Cloud again. And I'll just push back to that Friends of Peder. There's, you know, also a little bit of ulterior motive there. It's not just who you know, but it's [crosstalk 00:34:39]—Corey: It's also validating that you have friends. I get it. I get it.Peder: Oh yeah, I know, right? And I don't have many, but I have a few. But the interesting thing there is we're going to be able to connect you with a number of the customers doing some of these cool things on top of MongoDB Atlas.Corey: I look forward to it. Thank you so much for your time. Peder Ulander, Chief Marketing Officer at MongoDB. I'm Cloud Economist Corey Quinn and this has been a promoted guest episode of Screaming in the Cloud, brought to us by our friends at Mongo. 
If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review in your podcast platform of choice, along with an angry, insulting comment that I will ignore because you basically wrapped it so tightly in Generative AI messaging that I don't know what the hell your point is supposed to be.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business, and we get to the point. Visit duckbillgroup.com to get started.

Screaming in the Cloud
The Man Behind the Curtain at Zoph with Victor Grenu
Oct 25, 2022 28:28
About VictorVictor is an Independent Senior Cloud Infrastructure Architect working mainly on Amazon Web Services (AWS), designing secure, scalable, reliable, and cost-effective cloud architectures and dealing with large-scale and mission-critical distributed systems. He also has long experience in Cloud Operations, Security Advisory, Security Hardening (DevSecOps), Modern Applications Design, Micro-services and Serverless, Infrastructure Refactoring, and Cost Saving (FinOps).Links Referenced: Zoph: https://zoph.io/ unusd.cloud: https://unusd.cloud Twitter: https://twitter.com/zoph LinkedIn: https://www.linkedin.com/in/grenuv/ TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: This episode is brought to us in part by our friends at Datadog. Datadog is a SaaS monitoring and security platform that enables full-stack observability for developers, IT operations, security, and business teams in the cloud age. Datadog's platform, along with 500-plus vendor integrations, allows you to correlate metrics, traces, logs, and security signals across your applications, infrastructure, and third-party services in a single pane of glass.Combine these with drag-and-drop dashboards and machine-learning-based alerts to help teams troubleshoot and collaborate more effectively, prevent downtime, and enhance performance and reliability. Try Datadog in your environment today with a free 14-day trial and get a complimentary T-shirt when you install the agent.To learn more, visit datadoghq.com/screaminginthecloud to get started. That's www.datadoghq.com/screaminginthecloudCorey: Managing shards. Maintenance windows. Overprovisioning. ElastiCache bills. 
I know, I know. It's spooky season and you're already shaking. It's time for caching to be simpler. Momento Serverless Cache lets you forget the backend to focus on good code and great user experiences. With true autoscaling and a pay-per-use pricing model, it makes caching easy. No matter your cloud provider, get going for free at gomomento.co/screaming That's GO M-O-M-E-N-T-O dot co slash screamingCorey: Welcome to Screaming in the Cloud. I'm Corey Quinn. One of the best parts about running a podcast like this and trolling the internet of AWS things is every once in a while, I get to learn something radically different than what I expected. For a long time, there's been this sort of persona or brand in the AWS space, specifically the security side of it, going by Zoph—that's Z-O-P-H—and I just assumed it was a collective or a whole bunch of people working on things, and it turns out that nope, it is just one person. And that one person is my guest today. Victor Grenu is an independent AWS architect. Victor, thank you for joining me.Victor: Hey, Corey, thank you for having me. It's a pleasure to be here.Corey: So, I want to start by diving into the thing that first really put you on my radar, though I didn't realize it was you at the time. You have what can only be described as an army of Twitter bots around the AWS ecosystem. And I don't even know that I'm necessarily following all of them, but what are these bots and what do they do?Victor: Yeah. I have a few bots on Twitter that push notifications, tweets, when things happen in the AWS security space, especially when the AWS managed policies are updated by AWS. It comes from an initial project from Scott Piper. He was running a Git command on his own laptop to track the history of AWS managed policies. And I realized that I could automate this thing using a deployment pipeline and so on, and tweet every time a new change is detected from AWS. 
So, the idea is to monitor every change to these policies.Corey: It's kind of wild because I built a number of somewhat similar Twitter bots, only instead of trying to make them into something useful, I'd make them into something more than a little bit horrifying and extraordinarily obnoxious. Like there's a Cloud Boomer Twitter account that winds up tweeting every time Azure tweets something, only it quote-tweets them in all caps and says something insulting. I have an AWS releases bot called AWS Cwoud—so that's C-W-O-U-D—and that winds up converting it to OwO speak. It's like, “Yay a new auto-scawowing growp.” That sort of thing is obnoxious and offensive, but it makes me laugh.Yours, on the other hand, are things that I have notifications turned on for just because when they announce something, it's generally fairly important. The first one that I discovered was your IAM changes bot. And I found some terrifying things coming out of that from time to time. What's the data source for that? Because I'm just grabbing other people's Twitter feeds or RSS feeds; you're clearly going deeper than that.Victor: Yeah, the data source is the official AWS managed policies. In fact, I run the AWS CLI in the background, just doing the list-policies command, and with this list I'm doing a get of each policy that is returned, so I can enter it in a git repository to get the full history over time. I also craft a list of deprecated policies, and I also run, like, a dog-food initiative: the policy validation analysis from AWS tools, to validate the consistency and the accuracy of their own policies. So, there is a policy validation with their own tool. [laugh].Corey: You would think that wouldn't turn up anything because their policy validator effectively acts as a linter, so if it throws an error, of course, you wouldn't wind up pushing that. 
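The change-detection step Victor describes (fetch every managed policy, keep its history in git, and alert when something differs) boils down to a diff over two snapshots. Here is a minimal sketch of that core step, not his actual pipeline; the function name and snapshot shape are assumptions for illustration.

```python
def diff_policies(previous, current):
    """Compare two snapshots of AWS managed policies.

    Each snapshot maps a policy name to its policy document (for
    example, as fetched with `aws iam list-policies` plus a get of
    each policy's default version). Returns the names of policies
    that were added, removed, or changed between the two snapshots.
    """
    prev_names, curr_names = set(previous), set(current)
    return {
        "added": sorted(curr_names - prev_names),
        "removed": sorted(prev_names - curr_names),
        "changed": sorted(
            name for name in prev_names & curr_names
            if previous[name] != current[name]
        ),
    }
```

In the bot he describes, each fetched document is also committed to a git repository so the full history is preserved, and any non-empty category in a diff like this would trigger a tweet.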
And yet, somehow the fact that you have bothered to hook that up and have findings from it indicates that that's not how the real world works.Victor: Yeah, there are some, let's say, false positives, because we are running the policy validation with their own linter on their own policies, but this is something that is documented by AWS. So, there is an official page where you can find which policies the linter fails on and why. There is an explanation for each finding. I'm thinking of the [unintelligible 00:05:05] managed policy, which is too long, and the policy analyzer crashes because the policy is too long.Corey: Excellent. It's odd to me that you have gone down this path because it's easy enough to look at this and assume that, oh, this must just be something you do for fun or as an aspect of your day job. So, I did a little digging into what your day job is, and this rings very familiar to me: you are an independent AWS consultant, only you're based out of Paris, whereas I was doing this from San Francisco, due to an escalatingly poor series of life choices on my part. What do you focus on in the AWS consulting world?Victor: Yeah. I'm running an AWS consulting boutique in Paris and I'm working for large customers in France. I'm doing mostly infrastructure work, infrastructure design for cloud-native applications, and I'm also doing some security audits and [unintelligible 00:06:07] remediation for my customers.Corey: It seems to me that there's a definite divide as far as how people find the AWS consulting experience to be. And I'm not trying to cast judgment here, but the stories that I hear tend to fall into one of two categories. One of them is the story that you have, where you're doing this independently, you've been on your own for a while working specifically on this, and then there's the stories of, “Oh, yeah, I work for a 500 person consultancy and we do everything as long as they'll pay us money. If they've got money, we'll do it. 
Why not?”And it always seems to me—not to be overly judgy—but the independent consultants just seem happier about it because for better or worse, we get to choose what we focus on in a way that I don't think you do at a larger company.Victor: Yeah. It's the same in France and in Europe; there are a lot of consulting firms. But with the pandemic and with the market where we are working, in the cloud, in cloud-native solutions and so on, there is a lot of demand. And the natural path is to start by working for a consulting firm, and then when you are ready, when you have many AWS certifications, when you have the experience with customers, when you have a network of well-known customers and you have gained trust from your customers, I think it's natural to go out by yourself, to be independent and to choose your own projects and your own customers.Corey: I'm curious to get your take on what your perception of being an AWS consultant is when you're based in Paris versus, in my case, being based in the West Coast of the United States. And I know that's a bit of a strange question, but even when I travel, for example, over to the East Coast, suddenly, my own newsletter sends out three hours later in the day than I expect it to and that throws me for a loop. The AWS announcements don't come out at two or three in the afternoon; they come out at dinnertime. And for you, it must be in the middle of the night when a lot of those things wind up dropping. The AWS stuff, not my newsletter. I imagine you're not excitedly waiting on tenterhooks to see what this week's issue of Last Week in AWS talks about like I am.But I'm curious, even beyond that: how do you experience the market, from what you're perceiving people in the United States talking about as AWS consultants versus what you see in Paris?Victor: It's difficult, but in fact, I don't have so much information about the independents in the US. I know that there are a lot, but I think it's more common in Europe. 
And yeah, it's an advantage to have a ten-hour time [unintelligible 00:08:56] from the US, because a lot of stuff happens on Pacific time, on the Seattle timezone, on the San Francisco timezone. So, for example, for this podcast, my Monday is over right now, so yeah, I have some advantage in time, but yeah.Corey: This is potentially an odd question for you. But I find an awful lot of the AWS documentation to be challenging, we'll call it. I don't always understand exactly what it's trying to tell me, and it's not at all clear that the person writing the documentation about a service in some cases has ever used the service. And in everything I just said, there is no language barrier. This documentation was written—theoretically—in English and I, most days, can stumble through a sentence in English and almost no other language. You obviously speak French as a first language. Given that you live in Paris, it seems to be a relatively common affliction. How do you find interacting with AWS in French goes? Or is it just a complete nonstarter, and it all has to happen in English for you?Victor: No, in fact, for the consultants in Europe, I think—in fact, for my part, I'm using my laptop in English, I'm using my phone in English, I'm using the AWS console in English, and so on. So, for the documentation, I switch to English first, because for the other languages there is sometimes some automated translation that is very dangerous, so we all keep the documentation and the materials in English.Corey: It's wild to me just looking at how challenging so much of the stuff is. Having to then work in a second language on top of that, it just seems almost insurmountable to me. It's good they have automated translation for a lot of this stuff, but that falls down in often hilariously disastrous ways, sometimes. 
It's wild to me that even taking most programming languages that folks have ever heard of, even if you program and speak no English, which happens in a large part of the world, you're still using if statements even if the term ‘if' doesn't mean anything to you localized in your language. It really is, in many respects, an English-centric industry.Victor: Yeah. Completely. Even in French, for our large French customers, I'm writing the PowerPoint presentations in English, and some emails are in English, even if all the folks in the thread are French. So yeah.Corey: One other area that I wanted to explore with you a bit is that you are very clearly focused on security as a primary area of interest. Does that manifest in the work that you do as well? Do you find that your consulting engagements tend to have a high degree of focus on security?Victor: Yeah. In my designs, when I'm doing some AWS architecture, my main objective is to design security architectures and security patterns that apply best practices and least privilege. But often, I'm working on engagements for security audits, for startups, for internal customers, for diverse companies, and then doing some remediation afterward. And to run my audits, I'm using some open-source tooling, some custom scripts, and so on. I have a methodology that I run for each customer. And the goal is sometimes to prepare for a certification, PCI DSS and so on, or maybe to ensure that the best practices are correctly applied on a workload before go-live.Corey: One of the weird things about this to me is that I've said for a long time that cost and security tend to be inextricably linked, as far as being a sort of trailing reactive afterthought for an awful lot of companies. They care about both of those things right after they failed to adequately care about those things. 
At least in the cloud economic space, it's only money as opposed to, “Oops, we accidentally lost our customers' data.” So, I find myself drifting in a security direction if I don't stop myself, just based upon a lot of the cost work I do. Conversely, it seems that you have come from the security side and you find yourself drifting in a costing direction.Your side project is a SaaS offering called unusd.cloud, that's U-N-U-S-D dot cloud. And when you first mentioned this to me, my immediate reaction was, “Oh, great. Another SaaS platform for costing. Let's tear this one apart, too.” Except I actually like what you're building. Tell me about it.Victor: Yeah, unusd.cloud is a side project that I've been working on for, let's say, one year. It was a project that I'd deployed for some of my customers on their own accounts, and it was very useful. And so, I was thinking that it could be a SaaS project. So, I've worked at [unintelligible 00:14:21] so yeah, a few months on shifting the product to a SaaS [unintelligible 00:14:27].The product aims to detect waste on AWS accounts across all AWS regions: it scans all your AWS accounts and all your regions, and it tries to detect unused EC2, RDS, Glue [unintelligible 00:14:45], SageMaker, and so on, and unattached EBS volumes and so on. I don't craft a new dashboard, a new Cost Explorer, and so on. It's just cost awareness, just a notification on email or Slack or Microsoft Teams. You just add your AWS account to the project, you schedule it, let's say, once a day, and it scans and sends you cost awareness, a [unintelligible 00:15:17] detection, and you can act by turning off what is not used.Corey: What I like about this is it cuts at the number one rule of cloud economics, which is turn that shit off if you're not using it. You wouldn't think that I would need to say that except that everyone seems to be missing that, on some level. And it's easy to do. 
When you need to spin something up and it's not there, you're very highly incentivized to spin that thing up. When you're not using it, you have to remember that thing exists, otherwise it just sort of sits there forever and doesn't do anything.It just costs money and doesn't generate any value in return for that. What you got right is you've also eviscerated my most common complaint about tools that claim to do this, which is you build in an explicit rule of either ignore this resource or ignore resources with the following tags. The benefit there is that you're not constantly giving me useless advice, like, “Oh, yeah, turn off this idle thing.” It's, yeah, that's there for a reason, maybe it's my dev box, maybe it's my backup site, maybe it's the entire DR environment that I'm going to need on short notice. It solves for that problem beautifully. And though a lot of tools out there claim to do stuff like this, most of them really fail to deliver on that promise.Victor: Yeah, I just want to keep it simple. I don't want to add an additional console and so on. And you are correct. You can apply a simple tag on your asset, let's say an EC2 instance: you apply the tag ‘in use' and a value, and then the alerting is disabled for this asset. And the detection is based on the CPU [unintelligible 00:17:01] and the network health metrics, so when the instance is not used in the last seven days, with a low CPU average [unintelligible 00:17:10] and low network out, it comes up as a suspect. [laugh].[midroll 00:17:17]Corey: One thing that I like about what you've done, but also have some reservations about, is that you have not done what so many of these tools do, which is, “Oh, just give us all the access in your account. It'll be fine. You can trust us. Don't you want to save money?” And yeah, but I also still want to have a company left when all is said and done.You are very specific on what it is that you're allowed to access, and it's great.
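The detection Victor describes here (seven days of low CPU and low network out, with a tag-based opt-out) reduces to a small predicate. A minimal sketch, with illustrative threshold values since the actual ones aren't stated:

```python
def is_suspect(cpu_avg_percent, network_out_avg_bytes, tag_keys,
               cpu_threshold=2.0, network_threshold=5_000_000):
    """Decide whether an EC2 instance looks unused.

    Takes seven-day averages of the CloudWatch CPUUtilization and
    NetworkOut metrics. An instance carrying the 'in use' tag is
    never flagged, mirroring the opt-out described above. The
    threshold values are guesses for illustration, not the
    product's real ones.
    """
    if "in use" in tag_keys:
        return False  # explicitly marked as in use: never a suspect
    return (cpu_avg_percent < cpu_threshold
            and network_out_avg_bytes < network_threshold)
```

An instance only becomes a suspect when both averages are low, so a box that is quiet on CPU but chatty on the network (or vice versa) is left alone.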
I would argue, on some level, it's almost too restrictive. For example, you have the ability to look at EC2, Glue, IAM—just to look at account aliases, great—RDS, Redshift, and SageMaker. And all of these are simply list and describe. There's no gets in there other than in Cost Explorer, which makes sense. You're not able to go rummaging through my data and see what's there. But that also bounds you, on some level, to being able to look only at particular types of resources. Is that accurate or are you using a lot of the CloudWatch stuff and Cost Explorer stuff to see other areas?Victor: In fact, it's least privilege and read-only permissions because I don't want too many questions from the security team. So, it's full read-only permissions. And I've only added the detections that I currently support. Then if in some weeks, in some months, I'm adding a new detection, let's say for snapshots, for example, I will need to update it, so I will ask my customers to update their template. There is a mechanism inside the project to tell them that the template is obsolete, but it's not a breaking change.So, the detection will continue, but without the new detection, the new snapshot detection, let's say. So yeah, it's least privilege, and all I need is get-metric-statistics from CloudWatch to detect unused assets. I'm also checking [unintelligible 00:19:16] Elastic IPs or [unintelligible 00:19:19] EBS volumes. So, there is no CloudWatch in those detections.
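A narrowly scoped, read-only policy of the kind being discussed might look roughly like the following. The exact action list is an illustrative assumption, not the product's actual template:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "UnusedResourceDetectionReadOnly",
      "Effect": "Allow",
      "Action": [
        "ec2:DescribeInstances",
        "ec2:DescribeAddresses",
        "ec2:DescribeVolumes",
        "rds:DescribeDBInstances",
        "redshift:DescribeClusters",
        "sagemaker:ListEndpoints",
        "glue:GetDevEndpoints",
        "iam:ListAccountAliases",
        "cloudwatch:GetMetricStatistics",
        "ce:GetCostAndUsage"
      ],
      "Resource": "*"
    }
  ]
}
```

Everything here is a list, describe, or get-metrics call; nothing can read data out of the resources themselves, which is the point of the design.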
But just because everyone loves to talk about these exciting, amazing, high-level services that AWS has put up there, for example, oh, what about DocumentDB or all these other—you know, Amazon Basics MongoDB; same thing—or all of these other things that they wind up offering, but you take a look at where customers are spending money and where they're surprised to be spending money, it's EC2, it's a bit of RDS, occasionally it's S3, but that's a lot harder to detect automatically whether that data is unused. It's, “You haven't been using this data very much.” It's, “Well, you see how the bucket is labeled ‘Archive Backups' or ‘Regulatory Logs?'” imagine that. What a ridiculous concept.Yeah. Whereas an idle EC2 instance sort of can wind up being useful on this. I am curious whether you encounter in the wild in your customer base, folks who are having idle-looking EC2 instances, but are in fact, for example, using a whole bunch of RAM, which you can't tell from the outside without custom CloudWatch agents.Victor: Yeah, I'm not detecting this behavior for larger usage of RAM, for example, or for maybe there is some custom application that is low in CPU and don't talk to any other services using the network, but with this detection, with the current state of the detection, I'm covering large majority of waste because what I see from my customer is that there is some teams, some data scientists or data teams who are experimenting a lot with SageMaker with Glue, with Endpoint and so on. And this is very expensive at the end of the day because they don't turn off the light at the end of the day, on Friday evening. 
So, what I'm trying to solve here is to notify the team—so, on Slack—when they forget to turn off the most common sources of waste on AWS: EC2, RDS, Redshift.Corey: I just now wound up installing it while we've been talking on my dedicated shitposting account, and sure enough, it already spat out a single instance it found, which, yeah, was an EC2 instance running on the East Coast from when I was just there, so that I had a DNS server that was a little bit more local. Okay, great. And it's a T4g.micro, so it's not exactly a whole lot of money, but it does exactly what it says on the tin. It didn't wind up nailing the other instances I have in that account that I'm using for a variety of different things, which is good.And it further didn't wind up falling into the trap that so many things do, which is the, “Oh, it's costing you zero and your spend this month is zero because this account is where I dump all of my AWS credit codes.” So many things say, “Oh, well, it's not costing you anything, so what's the problem?” And then that's how you accidentally lose $100,000 in Activate credits because someone left something running way too long. It does a lot of the right things that I would hope and expect it to do, and the fact that you don't do that is kind of amazing.Victor: Yeah. It was a need from my customers and an opportunity. It's a small bet for me because I'm trying to do some small bets, you know, the small bets approach, so the idea is to try a new thing. It's also an excuse for me to learn something new because building a SaaS is challenging.Corey: One thing that I am curious about: in this account, I'm also running the controller for my home WiFi environment. And that's not huge. It's a T3.small, but it is still something out there that sits there because I need it to exist. But it's relatively bored.If I go back and look over the last week of CloudWatch metrics, for example, it doesn't look like it's usually busy. 
I'm sure there's some network traffic in and out as it updates itself and whatnot, but the CPU peaks out at a little under 2% used. It didn't warn on this and it got it right. I'm just curious as to how you did that. What is it looking for to determine whether this instance is unused or not?Victor: It's the magic [laugh]. There is some intelligence artif—no, I'm just kidding. It's just statistics. I'm getting two metrics, the CPU average from the last seven days and the network out. And I'm getting the average on those metrics and making the assumption that this specific EC2 instance is not used based on those averages.Corey: Yeah, it is wild to me just that this is working as well as it is. It's just… like, it does exactly what I would expect it to do. It's clear that—and this is going to sound weird, but I'm going to say it anyway—this was built by someone who was looking to answer the question themselves and not from the perspective of, “Well, we need to build a product and we have access to all of this data from the API. How can we slice and dice it and add some value as we go?” I really liked the approach that you've taken on this. I don't say that often or lightly, particularly when it comes to cloud costing stuff, but this is something I'll be using in some of my own nonsense.Victor: Thanks. I appreciate it.Corey: So, I really want to thank you for taking as much time as you have to talk about who you are and what you're up to. If people want to learn more, where can they find you?Victor: Mainly on Twitter; my handle is @zoph [laugh]. And, you know, on LinkedIn or on my company website at zoph.io.Corey: And we will, of course, put links to that in the [show notes 00:25:23]. Thank you so much for your time today. I really appreciate it.Victor: Thank you, Corey, for having me. It was a pleasure to chat with you.Corey: Victor Grenu, independent AWS architect. 
I'm Cloud Economist Corey Quinn and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice, along with an insulting comment that is going to cost you an absolute arm and a leg because invariably, you're going to forget to turn it off when you're done.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.Announcer: This has been a HumblePod production. Stay humble.

Screaming in the Cloud
The Ever-Changing World of Cloud Native Observability with Ian Smith
Sep 13, 2022 41:58
About IanIan Smith is Field CTO at Chronosphere where he works across sales, marketing, engineering and product to deliver better insights and outcomes to observability teams supporting high-scale cloud-native environments. Previously, he worked with observability teams across the software industry in pre-sales roles at New Relic, Wavefront, PagerDuty and Lightstep.Links Referenced: Chronosphere: https://chronosphere.io Last Tweet in AWS: lasttweetinaws.com TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. Every once in a while, I find that something I'm working on aligns perfectly with a person that I wind up basically convincing to appear on this show. Today's promoted guest is Ian Smith, who's Field CTO at Chronosphere. Ian, thank you for joining me.Ian: Thanks, Corey. Great to be here.Corey: So, the coincidental aspect of what I'm referring to is that Chronosphere is, despite the name, not something that works on bending time, but rather an observability company. Is that directionally accurate?Ian: That's true. Although you could argue it probably bends a little bit of engineering time. But we can talk about that later.Corey: [laugh]. So, observability is one of those areas that I think is suffering from too many definitions, if that makes sense. 
And at first, I couldn't make sense of what it was that people actually meant when they said observability. It sort of clarified for me, at least, when I realized that there were an awful lot of, well, let's be direct and call them ‘legacy monitoring companies' that just chose to take what they were already doing and define that as, “Oh, this is observability.” I don't know that I necessarily agree with that. I know a lot of folks in the industry vehemently disagree.

You've been in a lot of places that have positioned you reasonably well to have opinions on this sort of question. To my understanding, you were at interesting places, such as LightStep, New Relic, Wavefront, and PagerDuty, which I guess technically might count as observability in a very strange way. How do you view observability and what it is?

Ian: Yeah. Well, a lot of definitions, as you said, common ones, they talk about the three pillars, they talk really about data types. For me, it's about outcomes. I think observability is really this transition from the yesteryear of monitoring where things were much simpler and you, sort of, knew all of the questions, you were able to define your dashboards, you were able to define your alerts and that was really the gist of it. And going into this brave new world where there's a lot of unknown things, you're having to ask a lot of sort of unique questions, particularly during a particular incident, and so being able to ask those questions in an ad hoc fashion layers on top of what we've traditionally done with monitoring. So, observability is sort of that more flexible, more dynamic kind of environment that you have to deal with.

Corey: This has always been something that, for me, has been relatively academic. Back when I was running production environments, things tended to be a lot more static, where, “Oh, there's a problem with the database. I will SSH into the database server.” Or, “Hmm, we're having a weird problem with the web tier.
Well, there are ten or 20 or 200 web servers. Great, I can aggregate all of their logs to Syslog, and worst case, I can log in and poke around.”

Now, with a more ephemeral style of environment where you have Kubernetes or whatnot scheduling containers into place that have problems you can't attach to a running container very easily, and by the time you see an error, that container hasn't existed for three hours. And that becomes a problem. Then you've got the Lambda universe, which is a whole ‘nother world of pain, where it becomes very challenging, at least for me, to reason using the old-style approaches about what's actually going on in your environment.

Ian: Yeah, I think there's that and there's also the added complexity of oftentimes you'll see performance or behavioral changes based on even more narrow pathways, right? One particular user is having a problem and the traffic is spread across many containers. Is it making all of these containers perform badly? Not necessarily, but their user experience is being affected. It's very common in say, like, B2B scenarios for you to want to understand the experience of one particular user or the aggregate experience of users at a particular company, particular customer, for example.

There's just more complexity. There's more complexity of the infrastructure and just the technical layer that you're talking about, but there's also more complexity in just the way that we're handling use cases and trying to provide value with all of this software to the myriad of customers in different industries that software now serves.

Corey: For where I sit, I tend to have a little bit of trouble disambiguating, I guess, the three baseline data types that I see talked about again and again in observability. You have logs, which I think I can mostly wrap my head around. That seems to be the baseline story of, “Oh, great. Your application puts out logs. Of course, it's in its own unique, beautiful format.
Why wouldn't it be?” In an ideal scenario, they're structured. Things are never ideal, so great. You're basically tailing log files in some cases. Great. I can reason about those.

Metrics always seem to be a little bit of a step beyond that. It's okay, I have a whole bunch of log lines that are spitting out every 500 error that my app is throwing—and given my terrible code, it throws a lot—but I can then ideally count the number of times that appears and then that winds up incrementing a counter, similar to the way that we used to see with StatsD, for example, and Collectd. Is that directionally correct, as far as the way I reason about, well, so far, logs and metrics?

Ian: I think at a really basic level, yes. I think that, as we've been talking about, sort of greater complexity starts coming in when you have—particularly metrics in today's world of containers—Prometheus—you mentioned StatsD—Prometheus has become sort of like the standard for expressing those things, so you get situations where you have incredibly high cardinality, so cardinality being the interplay between all the different dimensions. So, you might have, my container is a label, but also the type of endpoint is running on that container as a label, then maybe I want to track my customer organizations and maybe I have 5000 of those. I have 3000 containers, and so on and so forth. And you get this massive explosion, almost multiplicatively.

For those in the audience who really live and breathe cardinality, there's probably someone screaming about how, well, it's not truly multiplicative in every sense of the word, but, you know, it's close enough from an approximation standpoint. As you get this massive explosion of data, which obviously has a cost implication but also has, I think, a really big implication on the core reason why you have metrics in the first place, which you alluded to, which is, so a human being can reason about it, right?
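Ian's "almost multiplicative" point can be made concrete with a quick back-of-the-envelope sketch. The label names and counts below are illustrative, taken loosely from the numbers he mentions, not from any real system:

```python
# Hypothetical label dimensions for a single metric; counts echo Ian's example
# (3000 containers, a handful of endpoints, 5000 customer organizations).
labels = {
    "container": 3000,
    "endpoint": 4,
    "customer_org": 5000,
}

# Worst case, every combination of label values becomes its own time series,
# so the series count is the product of the per-label cardinalities.
worst_case_series = 1
for cardinality in labels.values():
    worst_case_series *= cardinality

print(worst_case_series)  # 60000000 potential series from a single metric
```

As Ian hedges, real workloads rarely hit the full product because not every combination of labels actually occurs, but the upper bound is what drives both the bill and the limits of human legibility.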
You don't want to go and look at 5000 log lines; you want to know, out of those 5000 log lines, I have 4000 errors and 1000 OKs. It's very easy for human beings to reason about that from a numbers perspective. When your metrics start to re-explode out into thousands, millions of data points, and unique sort of time series, more numbers for you to track, then you're sort of losing that original goal of metrics.

Corey: I think I mostly have wrapped my head around the concept. But then that brings us to traces, and that tends to be I think one of the hardest things for me to grasp, just because most of the apps I build, for obvious reasons—namely, I'm bad at programming and most of these are proof of concept type of things rather than anything that's large scale running in production—the difference between a trace and logs tends to get very muddled for me. But the idea being that as you have a customer session or a request that talks to different microservices, how do you collate across different systems all of the outputs of that request into a single place so you can see timing information, understand the flow that user took through your application? Is that again, directionally correct? Have I completely missed the plot here? Which is again, eminently possible. You are the expert.

Ian: No, I think that's sort of the fundamental premise or expected value of tracing, for sure. We have something that's akin to a set of logs; they have a common identifier, a trace ID, that tells us that all of these logs essentially belong to the same request. But importantly, there's relationship information. And this is the difference between just having traces—sorry, logs—with just a trace ID attached to them.
So, for example, if you have Service A calling Service B and Service C, the relatively simple thing, you could use time to try to figure this out.

But what if there are things happening in Service B at the same time there are things happening in Service C and D, and so on and so forth? So, one of the things that tracing brings to the table is it tells you what is currently happening, what called that. So oh, I know that I'm Service D. I was actually called by Service B and I'm not just relying on timestamps to try and figure out that connection. So, you have that information and ultimately, the data model allows you to fully sort of reflect what's happening with the request, particularly in complex environments.

And I think this is where, you know, tracing needs to be sort of looked at as not a tool for—just because I'm operating in a modern environment, I'm using some Kubernetes, or I'm using Lambda, is it needs to be used in a scenario where you really have troubles grasping, from a conceptual standpoint, what is happening with the request because you need to actually fully document it. As opposed to, I have a few—let's say three Lambda functions. I maybe have some key metrics about them; I have a little bit of logging. You probably do not need to use tracing to solve, sort of, basic performance problems with those.
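Ian's Service A/B/C/D example can be sketched with a toy span model. The field names below are invented stand-ins for the kind of record real tracing systems keep, not any particular vendor's format; the point is that causality comes from explicit parent links, not timestamps:

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional

@dataclass
class Span:
    trace_id: str             # shared by every span in one request
    span_id: str
    parent_id: Optional[str]  # explicit "who called me", not inferred from time
    service: str

# One request: A calls B and C concurrently, and B calls D.
spans = [
    Span("t1", "s1", None, "service-a"),
    Span("t1", "s2", "s1", "service-b"),
    Span("t1", "s3", "s1", "service-c"),
    Span("t1", "s4", "s2", "service-d"),  # D records that B called it
]

# Rebuild the call tree from parent links alone.
children = defaultdict(list)
for span in spans:
    if span.parent_id is not None:
        children[span.parent_id].append(span.service)

print(dict(children))  # {'s1': ['service-b', 'service-c'], 's2': ['service-d']}
```

With overlapping work in B, C, and D, timestamps alone could not disambiguate who called whom; the `parent_id` field is what makes the reconstruction unambiguous.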
So, you can get yourself into a place where you're over-engineering, you're spending a lot of time with tracing instrumentation and tracing tooling, and I think that's the core of observability is, like, using the right tool, the right data for the job.

But that's also what makes it really difficult because you essentially need to have this, you know, huge set of experience or knowledge about the different data, the different tooling, and what influence your architecture and the available data have, to be able to reason about that and make confident decisions, particularly when you're under a time crunch, which everyone is familiar with as a, sort of like, you know, PagerDuty-style experience of my phone is going off and I have a customer-facing incident. Where is my problem? What do I need to do? Which dashboard do I need to look at? Which tool do I need to investigate? And that's where I think the observability industry has ended up not serving the outcomes of the customers.

Corey: I had a, well, I wouldn't say it's a genius plan, but it was a passing fancy that I've built this online, freely available Twitter client for authoring Twitter threads—because that's what I do instead of having a social life—and it's available at lasttweetinaws.com. I've used that as a testbed for a few things. It's now deployed to roughly 20 AWS regions simultaneously, and this means that I have a bit of a problem as far as how to figure out not even what's wrong or what's broken with this, but who's even using it?

Because I know people are. I see invocations all over the planet that are not me. And sometimes it appears to just be random things crawling the internet—fine, whatever—but then I see people logging in and doing stuff with it. I'd kind of like to log and see who's using it just so I can get information like, is there anyone I should talk to about what it could be doing differently?
I love getting user experience reports on this stuff.

And I figured, ah, this is a perfect little toy application. It runs in a single Lambda function so it's not that complicated. I could instrument this with OpenTelemetry, which, at least according to the instructions on the tin, would then let me send different types of data to different observability tools without having to re-instrument this thing every time I want to kick the tires on something else. That was the promise.

And this led to three weeks of pain because it appears that for all of the promise that it has, OpenTelemetry, particularly in a Lambda environment, is nowhere near ready for being able to carry a workload like this. Am I just foolish on this? Am I stating an unfortunate reality that you've noticed in the OpenTelemetry space? Or, let's be clear here, you do work for a company with opinions on these things. Is OpenTelemetry the wrong approach?

Ian: I think OpenTelemetry is absolutely the right approach. To me, the promise of OpenTelemetry for the individual is, “Hey, I can go and instrument this thing, as you said, and I can go and send the data wherever I want.” The sort of larger view of that is, “Well, I'm no longer beholden to a vendor”—including the ones that I've worked for, including the one that I work for now—“for the definition of the data. I am able to control that, I'm able to choose that, I'm able to enhance that, and any effort I put into it, it's mine. I own that.”

Whereas previously, if you picked, say, for example, an APM vendor, you said, “Oh, I want to have some additional aspects of my information provided, I want to track my customer, or I want to track a particular new metric of how many dollars am I transacting,” that effort was really going to support the value of that individual solution; it's not going to support your outcomes. Which is, I want to be able to use this data wherever I want, wherever it's most valuable. So, the core premise of OpenTelemetry, I think, is great.
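The "send the data wherever I want" property Ian describes comes down to decoupling instrumentation from export. Here is a stdlib-only sketch of that separation; the exporter names are invented, and in practice this wiring is what the OpenTelemetry SDK and Collector handle for you:

```python
# Instrument once: application code only ever calls emit().
# Where the data goes is configuration, swapped without touching the app.
received = {"apm_vendor": [], "data_lake": []}

def apm_exporter(event):
    received["apm_vendor"].append(event)   # stand-in for a vendor backend

def data_lake_exporter(event):
    received["data_lake"].append(event)    # stand-in for async business analysis

# The only thing you reconfigure when switching or adding destinations.
exporters = [apm_exporter, data_lake_exporter]

def emit(event):
    for export in exporters:
        export(event)

emit({"name": "tweet.posted", "duration_ms": 42})

print(len(received["apm_vendor"]), len(received["data_lake"]))  # 1 1
```

The design choice mirrors Ian's point about ownership: the call sites (your IP) never change, while the list of destinations is just configuration you can revisit whenever you want to kick the tires on something else.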
I think it's a massive undertaking to be able to do this for at least three different data types, right? Defining an API across a whole bunch of different languages, across three different data types, and then creating implementations for those.

Because the implementations are the thing that people want, right? You are hoping for the ability to, say, drop in something. Maybe one line of code or preferably just, like, attach a dependency, let's say in Java-land at runtime, and be able to have the information flow through and have it complete. And this is the premise of, you know, vendors I've worked with in the past, like New Relic. That was what New Relic built on: the ability to drop in an agent and get visibility immediately.

So, having that out-of-the-box visibility is obviously a goal of OpenTelemetry where it makes sense—Go, it's very difficult to attach things at runtime, for example—but then saying, well, whatever is provided—let's say your gRPC connections, database, all these things—well, now I want to go and instrument; I want to add some additional value. As you said, maybe you want to track something like, I want to have in my traces the email address of whoever it is or the Twitter handle of whoever it is so I can then go and analyze that stuff later. You want to be able to inject that piece of information or that instrumentation and then decide, well, where is it best utilized? Is it best utilized in some tooling from AWS? Is it best utilized in something that you've built yourself? Is it best utilized in an open-source project?
Is it best utilized in one of the many observability vendors, or, as is even becoming more common, I want to shove everything in a data lake and run, sort of, analysis asynchronously, overlay observability data for essentially business purposes.

All of those things are served by having a very robust, open-source standard, and simple-to-implement way of collecting a really good baseline of data and then making it easy for you to then enhance that while still owning—essentially, it's your IP, right? It's like, the instrumentation is your IP, whereas in the old world of proprietary agents, proprietary APIs, you were basically building that IP, but it was tied to that other vendor that you were investing in.

Corey: One thing that I was consistently annoyed by in my days of running production infrastructures at places, like, you know, large banks, for example, one of the problems I kept running into is that there's this idea that, “Oh, you want to use our tool. Just instrument your applications with our libraries or our instrumentation standards.” And it felt like I was constantly doing and redoing a lot of instrumentation for different aspects. It's not that we were replacing one vendor with another; it's that in an observability toolchain, there are remarkably few one-size-fits-all stories. It feels increasingly like everyone's trying to sell me a multifunction printer, which does one thing well, and a few other things just well enough to technically say they do them, but badly enough that I get irritated every single time.

And having 15 different instrumentation packages in an application, that's either got security ramifications, for one, see large bank, and for another it became this increasingly irritating and obnoxious process where it felt like I was spending more time seeing to the care and feeding of the instrumentation than I was the application itself. That's the gold—that's I guess the ideal light at the end of the tunnel for me in what OpenTelemetry is promising.
Instrument once, and then you're just adjusting configuration as far as where to send it.

Ian: That's correct. The organizations, and you know, I keep in touch with a lot of companies that I've worked with, companies that have in the last two years really invested heavily in OpenTelemetry, they're definitely getting to the point now where they're generating the data once, they're using, say, pieces of the OpenTelemetry pipeline, they're extending it themselves, and then they're able to shove that data in a bunch of different places. Maybe they're putting it in a data lake for, as I said, business analysis purposes or forecasting. They may be putting the data into two different systems, even for incident and analysis purposes, but you're not having that duplication effort. Also, potentially that performance impact, right, of having two different instrumentation packages lined up with each other.

Corey: There is a recurring theme that I've noticed in the observability space that annoys me to no end. And that is—I don't know if it's coming from investor pressure, from folks never being satisfied with what they have, or what it is, but there are so many startups that I have seen and worked with in varying aspects of the observability space that I think, “This is awesome. I love the thing that they do.” And invariably, every time, they start getting more and more features bolted onto them, where, hey, you love this whole thing that winds up just basically doing a tail -f on a log file, so it just streams your logs in the application and you can look for certain patterns. I love this thing. It's great.

Oh, what's this? Now, it's trying to also be the thing that alerts me and wakes me up in the middle of the night. No. That's what PagerDuty does. I want PagerDuty to do that thing, and I want other things—I want you just to be the log analysis thing and the way that I contextualize logs.
And it feels like they keep bolting things on and bolting things on, where everything is more or less trying to evolve into becoming its own version of Datadog. What's up with that?

Ian: Yeah, the sort of, dreaded platform play. I—[laugh] I was at New Relic when there were essentially two products that they sold. And then by the time I left, I think there were seven different products that were being sold, which is kind of a crazy, crazy thing when you think about it. And I think Datadog has definitely exceeded that now. And I definitely see many, many vendors in the market—and even open-source solutions—sort of presenting themselves as, like, this integrated experience.

But to your point, even before, about your experience at these banks, it oftentimes becomes sort of a tick-a-box feature approach of, “Hey, I can do this thing, so buy more. And here's a shared navigation panel.” But are they really integrated? Like, are you getting real value out of it? One of the things that I do in my role is I get to work with our internal product teams very closely, particularly around new initiatives like tracing functionality, and the constant sort of conversation is like, “What is the outcome? What is the value?”

It's not about the feature; it's not about having a list of 19 different features. It's like, “What is the user able to do with this?” And so, for example, there are lots of platforms that have metrics, logs, and tracing. The new one-upmanship is saying, “Well, we have events as well. And we have incident response. And we have security. And all these things sort of tie together, so it's one invoice.”

And constantly I talk to customers, and I ask them, like, “Hey, what are the outcomes that you're getting when you've invested so heavily in one vendor?” And oftentimes, the response is, “Well, I only need to deal with one vendor.” Okay, but that's not an outcome. [laugh].
And it's like the business having a single invoice.

Corey: Yeah, that is something that's already attainable today. If you want to just have one vendor with a whole bunch of crappy offerings, that's what AWS is for. They have AmazonBasics versions of everything you might want to use in production. Oh, you want to go ahead and use MongoDB? Well, use AmazonBasics MongoDB, but they call it DocumentDB because of course they do. And so on and so forth.

There are a bunch of examples of this, but those companies are still in business and doing very well because people often want the genuine article. If everyone was trying to do just everything to check a box for procurement, great. AWS has already beaten you at that game, it seems.

Ian: I do think that, you know, people are hoping for that greater value and those greater outcomes, so being able to actually provide differentiation in that market I don't think is terribly difficult, right? There are still huge gaps in, let's say, root cause analysis during investigation time. There are huge issues with vendors who don't think beyond sort of just the one individual who's looking at a particular dashboard or looking at whatever analysis tool there is. So, getting those things actually tied together, it's not just, “Oh, we have metrics, and logs, and traces together,” but even if you say we have metrics and tracing, how do you move between metrics and tracing?
One of the goals in the way that we're developing product at Chronosphere is that if you are alerted to an incident—you as an engineer; doesn't matter whether you are massively sophisticated, you're a lead architect who has been with the company forever and you know everything, or you're someone who's just come out of onboarding and it's your first time on call—you should not have to think, “Is this a tracing problem, or a metrics problem, or a logging problem?”

And this is one of those things that I mentioned before of requiring that really heavy level of knowledge and understanding about the observability space and your data and your architecture to be effective. And so, with the, you know, particularly observability teams and all of the engineers that I speak with on a regular basis, you get this sort of circumstance where, well, I guess, let's talk about a real outcome and a real pain point because people are like, okay, yeah, this is all fine; it's all coming from a vendor who has a particular agenda, but the thing that constantly resonates is, for large organizations that are moving fast, you know, big startups, unicorns, or even more traditional enterprises that are trying to undergo, like, a rapid transformation and go really cloud-native and make sure their engineers are moving quickly, a common question I will talk about with them is, who are the three people in your organization who always get escalated to? And it's usually, you know, between two and five people—

Corey: And you can almost pick those perso—you say that and you can—at least anyone who's worked in environments or through incidents like this more than a few times will already have thought of specific people in specific companies. And they almost always fall into some very predictable archetypes. But please, continue.

Ian: Yeah. And people think about these people, they always jump to mind.
And one of the things I asked about is, “Okay, so when you did your last innovation around observability”—it's not necessarily buying a new thing, but maybe it was like introducing a new data type or you were doing some big investment in improving instrumentation—“What changed about their experience?” And oftentimes, the most that can come out is, “Oh, they have access to more data.” Okay, that's not great.

It's like, “What changed about their experience? Are they still getting woken up at 3 am? Are they constantly getting pinged all the time?” One of the vendors that I worked at, when they would go down, there were three engineers in the company who were capable of generating a list of customers who were actually impacted by the damage. And so, every single incident, one of those three engineers got paged into the incident.

And it became borderline intolerable for them because nothing changed. And it got worse, you know? The platform got bigger and more complicated, and so there were more incidents and they were the ones having to generate that. But from a business level, from an observability outcomes perspective, if you zoom all the way up, it's like, “Oh, were we able to generate the list of customers?” “Yes.”

And this is where I think the observability industry has sort of gotten stuck—you know, at least one of the ways—is that, “Oh, can you do it?” “Yes.” “But is it effective?” “No.” And by effective, I mean those three engineers become the focal point for an organization.

And when I say three—you know, two to five—it doesn't matter whether you're talking about a team of a hundred or you're talking about a team of a thousand. It's always the same number of people. And as you get bigger and bigger, it becomes more and more of a problem. So, does the tooling actually make a difference to them? And you might ask, “Well, what do you expect from the tooling? What do you expect it to do for them?” Is it you give them deeper analysis tools? Is it, you know, you do AI Ops?
No.

The answer is, how do you take the capabilities that those people have and how do you spread them across a larger population of engineers? And that, I think, is one of those key outcomes of observability that no one, whether it be on the open-source or the vendor side, is really paying a lot of attention to. It's always about, like, “Oh, we can just shove more data in. By the way, we've got petabyte scale and we can deal with, you know, 2 billion active time series, and all these other sorts of vanity measures.” But we've gotten really far away from the outcomes. It's like, “Am I getting return on investment out of my observability tooling?”

And I think tracing is this—as you've said, it can be difficult to reason about, right? And people are not sure. They're feeling, “Well, I'm in a microservices environment; I'm in cloud-native; I need tracing because my older APM tools appear to be failing me. I'm just going to go and wriggle my way through implementing OpenTelemetry.” Which has significant engineering costs. I'm not saying it's not worth it, but there is a significant engineering cost—and then I don't know what to expect, so I'm going to go and throw my data somewhere and see whether we can achieve those outcomes.

And I do a pilot and my most sophisticated engineers are in the pilot. And they're able to solve the problems. Okay, I'm going to go buy that thing. But I've just transferred my problems. My engineers have gone from solving problems in maybe logs and grepping through petabytes worth of logs to using some sort of complex proprietary query language to go through your tens of petabytes of trace data, but actually haven't solved any problem.
I've just moved it around and probably just cost myself a lot, both in terms of engineering time and real dollars spent as well.

Corey: One of the challenges that I'm seeing across the board is that with observability, for certain use cases, once you start to see what it is and its potential for certain applications—certainly not all; I want to hedge that a little bit—it's clear that there is definite and distinct value versus other ways of doing things. The problem is that value often becomes apparent only after you've already done it and can see what that other side looks like. But let's be honest here. Instrumenting an application is going to take some significant level of investment, in many cases. How do you wind up viewing any return on investment that it takes for the very real cost, if only in people's time, to go ahead and instrument for observability in complex environments?

Ian: So, I think that you have to look at the fundamentals, right? You have to look at—pretend we knew nothing about tracing. Pretend that we had just invented logging, and you needed to start small. It's like, I'm not going to go and log everything about every application that I've had forever. What I need to do is I need to find the points where that logging is going to be the most useful, most impactful, across the broadest audience possible.

And one of the useful things about tracing is, because it's built in distributed environments, primarily for distributed environments, you can look at, for example, the biggest intersection of requests. A lot of people have things like API Gateways, or they have parts of a monolith which are still handling a lot of request routing; those tend to be areas to start digging into. And I would say that, just like for anyone who's used Prometheus or decided to move away from Prometheus, no one's ever gone and evaluated a Prometheus solution without having some sort of Prometheus data, right?
You don't go, “Hey, I'm going to evaluate a replacement for Prometheus or my StatsD without having any data, and I'm simultaneously going to generate my data and evaluate the solution at the same time.” It doesn't make any sense.

With tracing, you have decent open-source projects out there that allow you to visualize individual traces and understand sort of the basic value you should be getting out of this data. So, it's a good starting point to go, “Okay, can I reason about a single request? Can I go and look at my request end-to-end, even in a relatively small slice of my environment, and can I see the potential for this? And can I think about the things that I need to be able to solve with many traces?” Once you start developing these ideas, then you can have a better idea of, “Well, where do I go and invest more in instrumentation? Look, databases never appear to be a problem, so I'm not going to focus on database instrumentation. The real problem is my external dependencies. The Facebook API is the one that everyone loves to use. I need to go instrument that.”

And then you start to get more clarity. Tracing has this interesting network effect. You can basically just follow the breadcrumbs. Where is my biggest problem here? Where are my errors coming from? Is there anything else further down the call chain? And you can sort of take that exploratory approach rather than doing everything up front.

But it is important to do something before you start trying to evaluate what is my end state. End state obviously being a sort of nebulous term in today's world, but where do I want to be in two years' time? I would like to have a solution. Maybe it's an open-source solution, maybe it's a vendor solution, maybe it's one of those platform solutions we talked about, but how do I get there?
It's really going to be: I need to take an iterative approach and I need to be very clear about the value and outcomes.

There's no point in doing a whole bunch of instrumentation effort in things that are just working fine, right? You want to go and focus your time and attention on that. And also, you don't want to go and burn just singular engineers. The observability team's purpose in life is probably not to just write instrumentation or just deploy OpenTelemetry. Because then we get back into the land where engineers themselves know nothing about the monitoring or observability they're doing and it just becomes a checkbox of, “I dropped in an agent. Oh, when it comes time for me to actually deal with an incident, I don't know anything about the data and the data is insufficient.”

So, a level of ownership supported by the observability team is really important. On that return on investment side, though, it's not just the instrumentation effort. There's product training and there are some very hard costs. People think oftentimes, “Well, I have the ability to pay a vendor; that's really the only cost that I have.” There are things like egress costs, particularly with large volumes of data. There are the infrastructure costs. A lot of the time there will be elements you need to run in your own environment; those can be very costly as well, and ultimately, they're sort of icebergs in this overall ROI conversation.

The other side of it—you know, return and investment—return: there's a lot of difficulty in reasoning about, as you said, what is the value of this going to be if I go through all this effort? Everyone knows, sort of, you know, the meme or archetype of, “Hey, here are three options; pick two because there's always going to be a trade-off.” Particularly for observability, it's become an element of: I need to pick between performance, data fidelity, or cost. Pick two.
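The data-fidelity leg of that trade-off usually comes down to a sampling decision. One common scheme is consistent head sampling: hash the trace ID so that every service in a request independently reaches the same keep-or-drop verdict. The sketch below is illustrative (the 10% rate and hashing scheme are assumptions, not any vendor's implementation):

```python
import hashlib

SAMPLE_RATE = 0.10  # keep roughly 10% of traces; the rest never reach the backend

def keep_trace(trace_id: str) -> bool:
    # Hashing the trace ID maps it to a uniform bucket in [0, 1).
    # Every service computes the same bucket, so traces are kept or
    # dropped whole rather than arriving with missing spans.
    digest = hashlib.sha256(trace_id.encode()).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return bucket < SAMPLE_RATE

kept = sum(keep_trace(f"trace-{i}") for i in range(10_000))
print(kept)  # roughly 1,000 of 10,000 traces survive
```

The catch Ian raises is exactly this: the ~90% you drop is where the rare, narrow-use-case evidence lives, which is why fidelity, cost, and performance pull against each other.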
And when data fidelity—particularly in tracing—I'm talking about the ability to not sample, right?

If you have edge cases, if you have narrow use cases and ways you need to look at your data, if you heavily sample, you lose data fidelity. But oftentimes, cost is a reason why you do that. And then obviously, performance as you start to get bigger and bigger datasets. So, there's a lot of different things you need to balance on that return. As you said, oftentimes you don't get to understand the magnitude of those until you've got the full data set in and you're trying to do this, sort of, for real. But being prepared and iterative as you go through this effort and not saying, “Okay, well, I'm just going to buy everything from one vendor because I'm going to assume that's going to solve my problem,” is probably that undercurrent there.

Corey: As I take a look across the entire ecosystem, I can't shake the feeling—and my apologies in advance if this is an observation, I guess, that winds up throwing a stone directly at you folks—

Ian: Oh, please.

Corey: But I see that there's a strong observability community out there that is absolutely aligned with the things I care about and things I want to do, and then there's a bunch of SaaS vendors, where it seems that they are, in many cases, yes, advancing the state of the art, I am not suggesting for a second that money is making observability worse. But I do think that when the tool you sell is a hammer, then every problem starts to look like a nail—or in my case, like my thumb. Do you think that there's a chance that SaaS vendors are in some ways making this entire space worse?

Ian: As we've sort of gone into more cloud-native scenarios and people are building things specifically to take advantage of cloud from a complexity standpoint, from a scaling standpoint, you start to get, like, vertical issues happening.
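Ian's earlier point about sampling and data fidelity is easy to make concrete with back-of-the-envelope arithmetic. All the numbers below (traffic volume, error rate, sampling rate) are assumptions chosen purely for illustration:

```python
# How uniform head-based sampling erodes fidelity for rare events.
# Assumed numbers for illustration: 10M requests/day, a 1-in-100,000
# edge-case error, and a 1% head sampling rate.

def expected_survivors(total_requests, error_rate, sample_rate):
    """Expected number of error traces retained after uniform sampling."""
    return total_requests * error_rate * sample_rate

kept = expected_survivors(10_000_000, 1 / 100_000, 0.01)

# ~100 error traces occur per day; each survives sampling with p = 0.01,
# so the chance that *every* trace of the bug is lost on a given day is:
p_all_lost = (1 - 0.01) ** 100

print(round(kept, 2))        # ~1.0: on average, one example of the bug survives
print(round(p_all_lost, 2))  # ~0.37: on over a third of days, none survive
```

That is the trade-off in miniature: a 1% sample rate keeps roughly 1% of the cost, but for a 1-in-100,000 edge case you lose every trace of the bug about a third of the time.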
So, you have things like we're going to charge on a per-container basis; we're going to charge on a per-host basis; we're going to charge based off the amount of gigabytes that you send us. These are sort of like more horizontal pricing models, and the way the SaaS vendors have delivered this is they've made it pretty opaque, right? Everyone has experiences, or horror stories, about overages from observability vendors—massive spikes. I've worked with customers who have used—accidentally used some features and they've been billed a quarter million dollars on a monthly basis for accidental overages from a SaaS vendor.

And these are all terrible things. Like, but we've gotten used to this. Like, we've just accepted it, right, because everyone is operating this way. And I really do believe that the move to SaaS was one of those things. Like, “Oh, well, you're throwing us more data, and we're charging you more for it.” As a vendor—

Corey: Which sort of erodes your own value proposition that you're bringing to the table. I mean, I don't mean to be sitting over here shaking my fist yelling, “Oh, I could build a better version in a weekend,” except that I absolutely know how to build a highly available Rsyslog cluster. I've done it a handful of times already and the technology is still there. Compare and contrast that with, at scale, the fact that I'm paying 50 cents per gigabyte ingested to CloudWatch Logs, or a multiple of that for a lot of other vendors; it's not that much harder for me to scale that fleet out and pay a much smaller marginal cost.
There are vendors out there whose express purpose is to reduce the cost of your logging observability. They just sit in the middle; they are a middleman, right?

Essentially, hey, use our tool and even though you're going to pay us a whole bunch of money, it's going to generate an overall return that is greater than if you had just continued pumping all of your logs over to your existing vendor. So, that's great. What we think really needs to happen, and one of the things we're doing at Chronosphere—unfortunate plug—is we're actually building those capabilities into the solution so it's actually end-to-end. And by end-to-end, I mean a solution where I can ingest my data, I can preprocess my data, I can store it, query it, visualize it, all those things, aligned with open-source standards, but I have control over that data, and I understand what's going on with particularly my cost and my usage. I don't just get a bill at the end of the month going, “Hey, guess what? You've spent an additional $200,000.”

Instead, I can know in real time, well, what is happening with my usage. And I can attribute it. It's this team over here. And it's because they added this particular label. And here's a way for you, right now, to address that and cap it so it doesn't cost you anything and it doesn't have a blast radius of, you know, maybe degraded performance or degraded fidelity of the data.

That though is diametrically opposed to the way that most vendors are set up. And unfortunately, the open-source projects tend to take a lot of their cues, at least recently, from what's happening in the vendor space. One of the ways that you can think about it is a sort of like a speed-of-light problem. Everyone knows that, you know, there's basic fundamental latency; everyone knows how fast disk is; everyone knows the, sort of like, you can't just make your computations happen magically, there's a cost of running things horizontally.
But a lot of the way that the vendors have presented efficiency to the market is, “Oh, we're just going to incrementally get faster as AWS gets faster. We're going to incrementally get better as compression gets better.”

And of course, you can't go and fit a petabyte worth of data into a kilobyte, unless you're really just doing some sort of weird dictionary stuff, so you feel—you're dealing with some fundamental constraints. And the vendors just go, “I'm sorry, you know, we can't violate the speed of light.” But what you can do is you can start taking a look at, well, how is the data valuable, and start giving the people controls on how to make it more valuable. So, one of the things that we do with Chronosphere is we allow you to reshape Prometheus metrics, right? You go and express Prometheus metrics—let's say it's a business metric about how many transactions you're doing as a business—you don't need that on a per-container basis, particularly if you're running 100,000 containers globally.

When you go and take a look at that number on a dashboard, or you alert on it, what is it? It's one number, one time series. Maybe you break it out per region. You have five regions; you don't need 100,000 data points every minute behind that. It's very expensive, it's not very performant, and as we talked about earlier, it's very hard to reason about as a human being.

So, giving the tools to be able to go and condense that data down and make it more actionable and more valuable, you get performance, you get cost reduction, and you get the value that you ultimately need out of the data. And it's one of the reasons why, I guess, I work at Chronosphere. Which I'm hoping is the last observability [laugh] venture I ever work for.

Corey: Yeah, for me a lot of the data that I see in my logs, which is where a lot of this stuff starts and how I still contextualize these things, is nonsense that I don't care about and will never care about.
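The reshaping Ian describes, collapsing 100,000 per-container series into a handful of per-region ones, is at heart a label-dropping aggregation. Here is a toy sketch of the idea; the label names and values are invented, and a real deployment would do this in the metrics pipeline (for example, with Prometheus recording rules) rather than in application code:

```python
# Toy label-dropping aggregation: sum counter samples after removing a
# high-cardinality label, so four per-container series become two
# per-region series. Illustrative only; labels and values are made up.

from collections import defaultdict

def aggregate(samples, drop_label):
    """Sum samples after removing `drop_label` from each label set."""
    out = defaultdict(float)
    for labels, value in samples:
        kept = tuple(sorted((k, v) for k, v in labels.items() if k != drop_label))
        out[kept] += value
    return dict(out)

samples = [
    ({"region": "us-east", "container": "c1"}, 10.0),
    ({"region": "us-east", "container": "c2"}, 15.0),
    ({"region": "eu-west", "container": "c3"}, 7.0),
    ({"region": "eu-west", "container": "c4"}, 3.0),
]

per_region = aggregate(samples, drop_label="container")
print(per_region)
# {(('region', 'us-east'),): 25.0, (('region', 'eu-west'),): 10.0}
```

In PromQL terms this is roughly `sum without (container) (my_metric)`; the point Ian is making is that the cheap, queryable shape gets decided before storage, not at dashboard time.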
I don't care about load balancer health checks. I don't particularly care about 200 results for the favicon when people visit the site. I care about other things, but just weed out the crap, especially when I'm paying by the pound—or at least by the gigabyte—in order to get that data into something. Yeah. It becomes obnoxious and difficult to filter out.

Ian: Yeah. And the vendors just haven't done any of that because why would they, right? If you went and reduced the amount of log—

Corey: Put engineering effort into something that reduces how much I can charge you? That sounds like lunacy. Yeah.

Ian: Exactly. Their business models are entirely based off it. So, if you went and reduced everyone's logging bill by 30%—or everyone's logging volume by 30% and reduced the bills by 30%—it's not going to be a great time if you're a publicly traded company who has built your entire business model on essentially a very SaaS volume-driven—and in my eyes—relatively exploitative pricing and billing model.

Corey: Ian, I want to thank you for taking so much time out of your day to talk to me about this. If people want to learn more, where can they find you? I mean, you are a Field CTO, so clearly you're outstanding in your field. But, assuming that people don't want to go to farm country, where's the best place to find you?

Ian: Yeah. Well, it'll be a bunch of different conferences. I'll be at KubeCon this year. But chronosphere.io is the company website.
I've had the opportunity to talk to a lot of different customers, not from a hard-sell perspective, but, you know, conversations like this about what are the real problems you're having and what are the things that you sort of wish that you could do?

One of the favorite things that I get to ask people is, “If you could wave a magic wand, what would you love to be able to do with your observability solution?” That's, A, a really great part, but oftentimes, B, being able to say, “Well, actually, that thing you want to do, I think I have a way to accomplish that,” is a really rewarding part of this particular role.

Corey: And we will, of course, put links to that in the show notes. Thank you so much for being so generous with your time. I appreciate it.

Ian: Thanks, Corey. It's great to be here.

Corey: Ian Smith, Field CTO at Chronosphere on this promoted guest episode. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice along with an angry comment, which is going to be super easy in your case, because it's just one of the things that the omnibus observability platform that your company sells offers as part of its full suite of things you've never used.

Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.

Announcer: This has been a HumblePod production. Stay humble.

Screaming in the Cloud
GCP's Many Profundities with Miles Ward


Jan 11, 2022 · 42:06


About Miles

As Chief Technology Officer at SADA, Miles Ward leads SADA's cloud strategy and solutions capabilities. His remit includes delivering next-generation solutions to challenges in big data and analytics, application migration, infrastructure automation, and cost optimization; reinforcing our engineering culture; and engaging with customers on their most complex and ambitious plans around Google Cloud.

Previously, Miles served as Director and Global Lead for Solutions at Google Cloud. He founded Google Cloud's Solutions Architecture practice, launched hundreds of solutions, built Style-Detection and Hummus AI APIs, built CloudHero, designed the pricing and TCO calculators, and helped thousands of customers, like Twitter, who migrated the world's largest Hadoop cluster to public cloud, and Audi USA, who re-platformed to k8s before it was out of alpha, and helped Banco Itau design the intercloud architecture for the bank of the future.

Before Google, Miles helped build the AWS Solutions Architecture team. He wrote the first AWS Well-Architected framework, proposed Trusted Advisor and the Snowmobile, invented GameDay, worked as a core part of the Obama for America 2012 “tech” team, helped NASA stream the Curiosity Mars Rover landing, and rebooted Skype in a pinch.

Earning his Bachelor of Science in Rhetoric and Media Studies from Willamette University, Miles is a three-time technology startup entrepreneur who also plays a mean electric sousaphone.

Links: SADA.com: https://sada.com Twitter: https://twitter.com/milesward Email: miles@sada.com

Transcript

Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize.
This is Screaming in the Cloud.

Corey: It seems like there is a new security breach every day. Are you confident that an old SSH key, or a shared admin account, isn't going to come back and bite you? If not, check out Teleport. Teleport is the easiest, most secure way to access all of your infrastructure. The open source Teleport Access Plane consolidates everything you need for secure access to your Linux and Windows servers (and I assure you there is no third option there), Kubernetes clusters, databases, and internal applications like the AWS Management Console, Jenkins, GitLab, Grafana, Jupyter Notebooks, and more. Teleport's unique approach is not only more secure, it also improves developer productivity. To learn more, visit: goteleport.com. And no, that is not me telling you to go away; it is: goteleport.com.

Corey: This episode is sponsored in part by our friends at Redis, the company behind the incredibly popular open source database that is not the BIND DNS server. If you're tired of managing open source Redis on your own, or you're using one of the vanilla cloud caching services, these folks have you covered with the go-to managed Redis service for global caching and primary database capabilities: Redis Enterprise. To learn more and deploy not only a cache but a single operational data platform for one Redis experience, visit redis.com/hero. That's r-e-d-i-s.com/hero. And my thanks to my friends at Redis for sponsoring my ridiculous nonsense.

Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. I am joined today, once again, by my friend and yours, Miles Ward, who's the CTO at SADA. However, he is, as I think of him, the closest thing the Google Cloud world has to Corey Quinn. Now, let's be clear, not the music and dancing part (that is Forrest Brazeal), but Forrest works at Google Cloud, whereas Miles is a reasonably salty third-party.
Miles, thank you for coming back and letting me subject you to that introduction.Miles: Corey, I appreciate that introduction. I am happy to provide substantial salt. It is easy, as I play brass instruments that produce my spit in high volumes. It's the most disgusting part of any possible introduction. For the folks in the audience, I am surrounded by a collection of giant sousaphones, tubas, trombones, baritones, marching baritones, trumpets, and pocket trumpets.So, Forrest threw down the gauntlet and was like, I can play a keyboard, and sing, and look cute at the same time. And so I decided to fail at all three. We put out a new song just a bit ago that's, like, us thanking all of our customers and partners, covering Kool & the Gang “Celebration,” and I neither look good, [laugh] play piano, or smiling, or [capturing 00:01:46] any of the notes; I just play the bass part, it's all I got to do.Corey: So, one thing that I didn't get to talk a lot about because it's not quite in my universe, for one, and for another, it is during the pre re:Invent—pre:Invent, my nonsense thing—run up, which is Google Cloud Next.Miles: Yes.Corey: And my gag a few years ago is that I'm not saying that Google is more interested in what they're building and what they're shipping, but even their conference is called Next. Buh dum, hiss.Miles: [laugh].Corey: So, I didn't really get to spend a lot of attention on the Google Cloud releases that came out this year, but given that SADA is in fact the, I believe, largest Google Cloud partner on the internet, and thus the world—Miles: [unintelligible 00:02:27] new year, three years in a row back, baby.Corey: Fantastic. I assume someone's watch got stuck or something. But good work. So, you have that bias in the way that I have a bias, which is your business is focused around Google Cloud the way that mine is focused on AWS, but neither of us is particularly beholden to that given company. 
I mean, you do have the not getting fired as partner, but that's a bit of a heavy lift; I don't think I can mouth off well enough to get you there.So, we have a position of relative independence. So, you were tracking Google Next, the same way that I track re:Invent. Well, not quite the same way I track re:Invent; there are some significant differences. What happened at Cloud Next 2021, that the worst of us should be paying attention to?Miles: Sure. I presented 10% of the material at the first re:Invent. There are 55 sessions; I did six. And so I have been at Cloud events for a really long time and really excited about Google's willingness to dive into demos in a way that I think they have been a little shy about. Kelsey Hightower is the kind of notable deep exception to that. Historically, he's been ready to dive into the, kind of, heavy hands-on piece but—Corey: Wait, those were demos? [Thought 00:03:39] was just playing Tetris on stage for the love of it.Miles: [laugh]. No. And he really codes all that stuff up, him and the whole team.Corey: Oh, absol—I'm sorry. If I ever grow up, I wish to be Kelsey Hightower.Miles: [laugh]. You and me both. So, he had kind of led the charge. We did a couple of fun little demos while I was there, but they've really gotten a lot further into that, and I think are doing a better job of packaging the benefits to not just developers, but also operators and data scientists and the broader roles in the cloud ecosystem from the new features that are being launched. 
And I think, different than the in-person events where there's 10, 20,000, 40,000 people in the audience paying attention, I think they have to work double-hard to capture attention and get engineers to tune in to what's being launched.But if you squint and look close, there are some, I think, very interesting trends that sit in the back of some of the very first launches in what I think are going to be whole veins of launches from Google over the course of the next several years that we are working really hard to track along with and make sure we're extracting maximum value from for our customers.Corey: So, what was it that they announced that is worth paying attention to? Now, through the cacophony of noise, one announcement that [I want to note 00:04:49] was tied to Next was the announcement that GME group, I believe, is going to be putting their futures exchange core trading systems on Google Cloud. At which point that to me—and I know people are going to yell at me, and I don't even slightly care—that is the last nail in the coffin of the idea that well, Google is going to turn this off in a couple years. Sorry, no. That is not a thing that's going to happen. Worst case, they might just stop investing it as aggressively as they are now, but even that would be just a clown-shoes move that I have a hard time envisioning.Miles: Yeah, you're talking now over a dozen, over ten year, over a billion-dollar commitments. So, you've got to just really, really hate your stock price if you're going to decide to vaporize that much shareholder value, right? I mean, we think that, in Google, stock price is a material fraction of the recognition of the growth trajectory for cloud, which is now basically just third place behind YouTube. And I think you can do the curve math, it's not like it's going to take long.Corey: Right. 
That requires effectively ejecting Thomas Kurian as the head of Google Cloud and replacing him with the former SVP of Bad Decisions at Yahoo.Miles: [laugh]. Sure. Google has no shyness about continuing to rotate leadership. I was there through three heads of Google Cloud, so I don't expect that Thomas will be the last although I think he may well go down in history as having been the best. The level of rotation to the focuses that I think are most critical, getting enterprise customers happy, successful, committed, building macroscale systems, in systems that are critical to the core of the business on GCP has grown at an incredible rate under his stewardship. So, I think he's doing a great job.Corey: He gets a lot of criticism—often from Googlers—when I wind up getting the real talk from them, which is, “Can you tell me what you really think?” Their answer is, “No,” I'm like, “Okay, next question. Can I go out and buy you eight beers and then”— and it's like, “Yeah.” And the answer that I get pretty commonly is that he's brought too much Oracle into Google. And okay, that sounds like a bad thing because, you know, Oracle, but let's be clear here, but what are you talking about specifically? And what they say distills down to engineers are no longer the end-all be-all of everything that Google Cloud. Engineers don't get to make sales decisions, or marketing decisions, or in some cases, product decisions. And that is not how Google has historically been run, and they don't like the change. I get it, but engineering is not the only hard thing in the world and it's not the only business area that builds value, let's be clear on this. So, I think that the things that they don't like are in fact, what Google absolutely needs.Miles: I think, one, the man is exceptionally intimidating and intentionally just hyper, hyper attentive to his business. 
So, one of my best employees, Brad [Svee 00:07:44], he worked together with me to lay out what was the book of our whole department, my team of 86 people there. What are we about? What do we do? And like I wanted this as like a memoriam to teach new hires as got brought in. So, this is, like, 38 pages of detail about our process, our hiring method, our promotional approach, all of it. I showed that to my new boss who had come in at the time, and he thought some of the pictures looked good. When we showed it to TK, he read every paragraph. I watched him highlight the paragraphs as he went through, and he read it twice as fast as I can read the thing. I think he does that to everybody's documents, everywhere. So, there's a level of just manual rigor that he's brought to the practice that was certainly not there before that. So, that alone, it can be intimidating for folks, but I think people that are high performance find that very attractive.Corey: Well, from my perspective, he is clearly head and shoulders above Adam Selipsky, and Scott Guthrie—the respective heads of AWS and Azure—for one key reason: He is the only one of those three people who follows me on Twitter. And—Miles: [laugh].Corey: —honestly, that is how I evaluate vendors.Miles: That's the thing. That's the only measure, yep. I've worked on for a long time with Selipsky, and I think that it will be interesting to see whether Adam's approach to capital allocation—where he really, I think, thinks of himself as the manager of thousands of startups, as opposed to a manager of a global business—whether that's a more efficient process for creating value for customers, then, where I think TK is absolutely trying to build a much more unified, much more singular platform. And a bunch of the launches really speak to that, right? So, one of the product announcements that I think is critical is this idea of the global distributed cloud, Google Distributed Cloud.We started with Kubernetes. 
And then you layer on to that, okay, we'll take care of Kubernetes for you; we call that Anthos. We'll build a bunch of structural controls and features into Anthos to make it so that you can really deal with stuff in a global way. Okay, what does that look like further? How do we get out into edge environments? Out into diverse hardware? How do we partner up with everybody to make sure that, kind of like comparing Apple's approach to Google's approach, you have an Android ecosystem of Kubernetes providers instead of just one place you can buy an outpost. That's generally the idea of GDC. I think that's a spot where you're going to watch Google actually leverage the muscle that it already built in understanding open-source dynamics and understanding collaboration between companies as opposed to feeling like it's got to be built here. We've got to sell it here. It's got to have our brand on it.Corey: I think that there's a stupendous and extreme story that is still unfolding over at Google Cloud. Now, re:Invent this year, they wound up talking all about how what they were rolling out was a focus on improving primitives. And they're right. I love their managed database service that they launched because it didn't exist.Miles: Yeah Werner's slide, “It's primitives, not frameworks.” I was like, I think customers want solutions, not frameworks or primitives. [laugh]. What's your plan?Corey: Yeah. However, I take a different perspective on all of this, which is that is a terrific spin on the big headline launches all missed the re:Invent timeline, and… oops, so now we're just going to talk about these other things instead. And that's great, but then they start talking about industrial IOT, and mainframe migrations, and the idea of private 5G, and running fleets of robots. And it's—Miles: Yeah, that's a cool product.Corey: Which one? 
I'm sorry, they're all very different things.

Miles: Private 5G.

Corey: Yeah, if someone someday will explain to me how it differs from Wavelength, but that's neither here nor there. You're right, they're all interesting, but none of them are actually doing the thing that I do, which is build websites, [unintelligible 00:11:31] looking for web services, it kind of says it in the name. And it feels like it's very much broadening into everything, and it's very difficult for me to identify—and if I have trouble, then I guarantee you customers do—which services are for me and which are very much not. In some cases, the only answer to that is to check the pricing. I thought Kendra, their corporate information search thing, was for me; then it's 7500 bucks a month to get started with that thing, and that is, “I can hire an internal corporate librarian to just go and hunt through our Google Drive.” Great.

Miles: Yeah.

Corey: So, there are—or our Dropbox, or our Slack. We have, like, five different information repositories, and this is how corporate nonsense starts, let me assure you.

Miles: Yes. We call that luxury SaaS. You must enjoy your dozens of overlapping bills for, you know, what Workspace gives you as a single flat rate.

Corey: Well, we have [unintelligible 00:12:22] a lot of this stuff, too. Google Drive is great, but we use Dropbox for holding anything that touches our customers' billing information, just because I—to be clear, I do not distrust Google, but it also seems a little weird to put the confidential billing information for one of their competitors on there, if a customer were to ask about it. So, it's the, like, I don't believe anyone's doing anything nefarious, but let's go ahead and just make sure, in this case.

Miles: Go further, man. Vimeo runs on GCP. You think YouTube doesn't want to look at Vimeo stats? Like, they run everything on GCP, so they have to have arrived at a position of trust somehow. Oh, I know how: it's called encryption.
You've heard of encryption before? It's the best.

Corey: Oh, yes. I love these rumors that crop up every now and again that Amazon is going to start scanning all of its customer content, somehow. It's, first, do you have any idea how many compute resources that would take? And two, if they can actually do that and access something you're storing in there, against their attestations to the contrary, then that's your story, because one of them just makes them look bad and the other one utterly destroys their entire business.

Miles: Yeah.

Corey: I think that that's the one that gets the better clicks. So no, they're not doing that.

Miles: No, they're not doing that. Another product launch that I thought was super interesting that describes, let's call it second place—the third place will be the one where we get off into the technical deep end—but there's a whole set of coordinated work they're calling Cortex. So, let's imagine you go to a customer; they say, “I want to understand what's happening with my business.” You go, “Great.” So, you use SAP, right? So, you're a big corporate shop, and that's your infrastructure of choice. There are a bunch of different options at that layer.

When you set up SAP, one of the advantages that something like that has is they have, kind of, pre-built configurations for roughly your business, but whatever behaviors SAP doesn't do, right—say, data warehousing, advanced analytics, regression and projection and stuff like that—maybe that's somewhat outside of the core wheelhouse for SAP. You would expect, like, oh okay, I'll bolt on BigQuery. I'll build that stuff over there. We'll stream the data between the two. Yeah, I'm off to the races, but the BigQuery side of the house doesn't have this, like, bitching menu that says, “You're a retailer, and so you probably want to see these 75 KPIs, and you probably want to chew up your SKUs in exactly this way.
And here's some presets that make it so that this is operable out of the box.”So, they are doing the three way combination: Consultancies plus ISVs plus Google products, and doing all the pre-work configuration to go out to a customer and go I know what you probably just want. Why don't I just give you the whole thing so that it does the stuff that you want? That I think—if that's the very first one, this little triangle between SAP, and Big Query, and a bunch of consultancies like mine, you have to imagine they go a lot further with that a lot faster, right? I mean, what does that look like when they do it with Epic, when they go do it with Go just generally, when they go do it with Apache? I've heard of that software, right? Like, there's no reason not to bundle up what the obvious choices are for a bunch of these combinations.Corey: The idea of moving up the stack and offering full on solutions, that's what customers actually want. “Well, here's a bunch of things you can do to wind up wiring together to build a solution,” is, “Cool. Then I'm going to go hire a company who's already done that is going to sell it to me at a significant markup because I just don't care.” I pay way more to WP Engine than I would to just run WordPress myself on top of AWS or Google Cloud. In fact, it is on Google Cloud, but okay.Miles: You and me both, man. WP Engine is the best. I—Corey: It's great because—Miles: You're welcome. I designed a bunch of the hosting on the back of that.Corey: Oh, yeah. But it's also the—I—well, it costs a little bit more that way. Yeah, but guess what's not—guess what's more expensive than that bill, is my time spent doing the care and feeding of this stuff. I like giving money to experts and making it their problem.Miles: Yeah. I heard it said best, Lego is an incredible business. I love their product, and you can build almost any toy with it. 
And they have not displaced all other plastic toy makers.Corey: Right.Miles: Some kids just want to buy a little car. [laugh].Corey: Oh, yeah, you can build anything you want out of Lego bricks, which are great, which absolutely explains why they are a reference AWS customer.Miles: Yeah, they're great. But they didn't beat all other toy companies worldwide, and eliminate the rest of that market because they had the better primitive, right? These other solutions are just as valuable, just as interesting, tend to have much bigger markets. Lego is not the largest toy manufacturer in the world. They are not in the top five of toy manufacturers in the world, right?Like, so chasing that thread, and getting all the way down into the spots where I think many of the cloud providers on their own, internally, had been very uncomfortable. Like, you got to go all the way to building this stuff that they need for that division, inside of that company, in that geo, in that industry? That's maybe, like, a little too far afield. I think Google has a natural advantage in its more partner-oriented approach to create these combinations that lower the cost to them and to customers to getting out of that solution quick.Corey: So, getting into the weeds of Google Next, I suppose, rather than a whole bunch of things that don't seem to apply to anyone except the four or five companies that really could use it, what things did Google release that make the lives of people building, you know, web apps better?Miles: This is the one. So, I'm at Amazon, hanging out as a part of the team that built up the infrastructure for the Obama campaign in 2012, and there are a bunch of Googlers there, and we are fighting with databases. 
We are fighting so hard, in fact, with RDS that I think we are the only ones that [Raju 00:17:51] has ever allowed to SSH into our RDS instances to screw with them.
Corey: Until now, with the advent of RDS Custom, meaning that you can actually get in as root; where the hell that lands between RDS and EC2 is ridiculous. I just know that RDS can now run containers.
Miles: Yeah. I know how many things we did in there that were good for us, and how many things we did in there that were bad for us. And I have to imagine this is not a feature that they really ought to let everybody have, myself included. But I will say that what all of the Googlers I talked to—who, at first blush, saw me as the evil Amazon guy in to, sort of, distract them and make them build a system that, you know, was very reliable and ended up winning an election—kept saying was that they had a better database. They had Spanner, and they didn't understand why this whole thing wasn't sitting on Spanner. So, we looked, and I read the white paper, and then I got all drooly, and I was like, yes, that is a much better database than everybody else's database, and I don't understand why everybody else isn't on it. Oh, there's that one reason, but you've heard of it: No other software works with it, anywhere in the world, right? It's utterly proprietary to Google. Yes, they were kind—
Corey: Oh, you want to migrate it off somewhere else, or a fraction of it? Great. Step one, redo your data architecture.
So, for me, it was immensely meaningful to see the launch at Next where they described what they are building—and have now built; we have alpha access to it—a Postgres layer for Spanner.
Corey: Is that effectively you have to treat it as Postgres at all times, or is it multimodal access?
Miles: You can get in and tickle it like Spanner, if you want to tickle it like Spanner. And in reality, Spanner is ANSI SQL compliant; you're still writing SQL, you just don't have to talk to it like a REST endpoint, or a gRPC endpoint, or something; you can, you know, have like a—
Corey: So, similar to Azure's Cosmos DB, on some level, except for the part where you can apparently look at other customers' data in that thing?
Miles: [laugh]. Exactly. Yeah, you will not have a sweeping discovery of incredible security violations in Spanner, in that it is the control system that Google uses to place every ad, and so it does not suck. You can't put a trillion-dollar business on top of a database and not have it be safe. That's kind of a thing.
Corey: The thing that I find is the most interesting area of tech right now is there's been this rise of distributed databases. Yugabyte—or You-ji-byte—Pla-netScale—or PlanetScale, depending on how you pronounce these things.
Miles: [laugh]. Yeah, why, why is G such an adversarial consonant? I don't understand why we've all gotten to this place.
Corey: Oh, yeah. But at the same time, it's—so you take a look at all these—and they all are speaking Postgres; it is pretty clear that ‘Postgres-squeal' is the thing that is taking over the world as far as databases go. If I were building something from scratch that used—
Miles: For folks in the back, that's PostgreSQL; for the rest of us, it's okay, it's going to be all right.
Corey: Same difference. But yeah, it's the thing that is eating the world.
Although recently, I've got to say, MongoDB is absolutely stepping up in a bunch of really interesting ways.
Miles: I mean, I think the 4.0 release—I'm the guy who wrote the MongoDB on AWS Best Practices white paper, and I would grab a lot of customers and—
Corey: They've had to change it since then to, step one: Do not use DocumentDB; if you want to use Mongo, use Mongo.
Miles: Yeah, that's right. No, there were a lot of customers I was on the phone with where Mongo had summarily vaporized their data, and I think they have made huge strides in structural reliability over the course of—you know, especially this 4.0 launch, but the last couple of years, for sure.
Corey: And with all the people they've been hiring from AWS, it's one of those, “Well, we'll look at who's losing important things from production now?”
Miles: [laugh]. Right? So, maybe there's only actually five humans who know how to do operations, and we just sort of keep moving around these different companies.
Corey: That's sort of my assumption on these things. But Postgres, for those who are not looking to depart from the relational model, is eating the world. And—
Miles: There's this, like, basic emotional thing. My buddy Martin, who set up MySQL, and took it public, and then promptly got it gobbled up by the Oracle people—like, there was a bet there that said, hey, there's going to be a real open database, and then squish, like, the man came and got it. And so like, if you're going to be an independent, open-source software developer, I think you're probably not pushing your pull requests to our friends at Oracle; that seems weird. So instead, I think Postgres has gobbled up the best minds on that stuff.
And it works. It's reliable, it's consistent, and it's functional in all these different, sort of, reapplications and subdivisions, right? I mean, you have to sort of squint real hard, but down there in the guts of Redshift, that's Postgres, right? Like, there's Postgres behind all sorts of stuff.
So, as an interface layer, I'm not as interested in how it manages to be successful at bossing around hardware and getting people the zeros and ones that they ask for back in a timely manner.
I'm interested in it as a compatibility standard, right? If I have software that says, “I need to have Postgres under here and then it all will work,” that creates this layer of interop that a bunch of other products can use. So, folks like PlanetScale and Yugabyte can say, “No, no, no, it's cool. We talk Postgres; that'll make it so your application works right. You can bring SQLAlchemy and plug it into this, or whatever your interface layer looks like.”
That's the spot where, if I can trade what is a fairly limited global distribution, global transactional management on literally ridiculously unlimited scalability and zero operations—I can hand the hard parts of running a database over to somebody else, but I get my layer, and my software talks to it—I think that's a huge step.
If it works, just because—well, I'm not multiregion today, but I can easily see a world in which I'd want to be. So, great. How do you approach the decision between—once this comes out of alpha; let's be clear. Let's turn this into something that actually ships, and no, Google, that does not mean slapping a beta label on it for five years is the answer here; you actually have to stand behind this thing—but once it goes GA—
Miles: GA is a good thing.
Corey: Yeah. How do you decide between using that, or PlanetScale? Or Yugabyte?
Miles: Or Cockroach, or SingleStore, right? I mean, there's a zillion of them that sit in this market. I think the core of the decision-making for me is, in every team, you're looking at what skills you bring to bear and what problem you're off to go solve for customers. Do the nuances of these products make it easier to solve? So, I think there are some products where the nature of what you're building isn't all that dependent on one part of the application talking to another one, or an event happening someplace else mattering to an event over here. But some applications, that's, like, utterly critical, like, totally, totally necessary.
So, we worked with a bunch of, like, Forex exchange trading desks that literally turn off 12 hours out of the day because they can only keep it consistent in one geographical location right near the main exchanges in New York. So, that's a place where I go, “Would you like to trade all day?” And they go, “Yes, but I can't because databases.” So, “Awesome. Let's call the folks on the Spanner side. They can solve that problem.”
I go, “Would you like to trade all day and rewrite all your software?” And they go, “No.” And I go, “Oh, okay. What about trade all day, but not rewrite all your software?” There we go.
Now, we've got a solution to that kind of problem.
So like, we built this crazy game—like, totally other end of the ecosystem—with the Dragon Ball Z people, hysterical; you're like—you literally play, like, Rock, Paper, Scissors with your phone, and if you get a rock, I throw a fireball, and you get a paper, then I throw a punch, and we figure out who wins. But they can play these games like Europe versus Japan, thousands of people on each side, real-time, and it works.
Corey: So, let's be clear, I have lobbed a consistent criticism at Google for a while now, which is the Google Cloud global control plane. So, you wind up with things like global service outages from time to time; you wind up with this thing is now broken for everyone everywhere. And that, for a lot of these use cases, is a problem. And I said that AWS's approach to regional isolation is the right way to do it. And I do stand by that assessment, except for the part where it turns out there's a lot of control plane stuff that winds up single-tracking through us-east-1, as we learned in the great us-east-1 outage of 2021.
So, like sometimes, like, some weird product takes a screw sideways, where there is structural interdependence between quite a few products—we actually published a whole internal structural map of, like, you know, it turns out that Cloud SQL runs on top of GCE, not on GKE, so you can expect if GKE goes sideways, Cloud SQL is probably not going to go sideways; the two aren't dependent on each other.
Corey: You take the status page, and Amazon FreeRTOS in a region is having an outage today or something like that. You're like, “Oh, no. That's terrible. First, let me go look up what the hell that is.” And I'm not using it? Absolutely not. Great. As hyperscalers, well, hyperscale, there are always things that are broken in different ways, in different locations, and if you had a truly accurate status page, it would all be red all the time, or varying shades of red, which is not helpful. So, I understand the challenge there, but very often, it's a partition that you are not exposed to, or the way that you've architected things, ideally, means it doesn't really matter. And that is a good thing. So, raw outage counts don't solve that. I also maintain that if I were to run in a single region of AWS or even a single AZ, in all likelihood, I will have a significantly better uptime across the board than I would if I ran it myself. Because—
Miles: Oh, for sure.
Corey: —it is—
Miles: For sure they're way better at ops than you are. Me, right?
Corey: Of course.
Miles: Right? Like, ridiculous.
Corey: And they got that way by learning. Like, I think in 2022, it is unlikely that there's going to be an outage in an AWS availability zone by someone tripping over a power cable, whereas I have actually done that. So, there's a—to be clear, in a data center, not an AWS facility; that would not have flown. So, there is the better idea of going in that direction.
But then things like Route 53's control plane single-tracking through us-east-1—if you can't make DNS changes in an outage scenario, you may as well not have a DR plan, for most use cases.
Miles: To be really clear, it was a part of the internal documentation on the AWS side that we would share with customers to be absolutely explicit with them. It's not just that there are mistakes and accidents which we try to limit to AZs, but no, go further: we may intentionally cause outages to AZs if that's what allows us to keep broader service health higher, right? They are not just a blast radius because you, oops, pulled the pin on the grenade; they can actually intentionally step on the off button. And that's different than the way Google operates. They think of each of the AZs, and each of the regions, and the global system as an always-on, all-the-time environment, and they do not have systems where one gets, sort of, sacrificed for the benefit of the rest, right, or where they will intentionally plan to take a system offline.
There is no planned downtime in the SLA, whereas the SLAs from my friends at Amazon and Azure are explicit that, if they choose to take it offline, they can. Now, that's—I don't know, I kind of want the contract that has the other thing, where you don't get that.
Corey: I don't know what the right answer is for a lot of these things. I think multi-cloud is dumb. I think that the idea of having this workload that you're going to seamlessly deploy to two providers in case of an outage—well, guess what? The orchestration between those two providers is going to cause you more outages than you would take just sticking on one. And in most cases, unless you are able to have complete duplication of not just functionality but capacity between those two, congratulations, you've now just doubled your number of single points of failure; you made the problem actively worse and more expensive.
Good job.
Miles: I wrote an article about this, and I think it's important to differentiate between dumb and terrifyingly, shockingly expensive, right? So, I have a bunch of customers who I would characterize as rich—as, like, shockingly rich—as producing businesses that have 80-plus percent gross margins. And for them, the costs associated with this stuff are utterly rational, and they take on that work, and they are seeing benefits, or they wouldn't be doing it.
Corey: Of course.
Miles: So, I think their trajectory in technology—you know, this is a quote from a Google engineer—it's just like, “Oh, you want to see what the future looks like? Hang out with rich people.” I went into houses when I was a little kid that had whole-home automation. I couldn't afford them; my mom was cleaning house there. But now, in my house, I can use my phone to turn on the lights. Like—
Corey: You know, unless us-east-1 is having a problem.
Miles: Hey, and then no Roomba for you, right? Like, utterly offline. So—
Corey: Roomba has now failed to room.
Miles: Conveniently, my lights are Philips Hue, and that's on Google, so that baby works. But it is definitely a spot where the barrier of entry and the level of complexity required is going down over time. And it is definitely a horrible choice for 99% of the companies that are out there right now. But next year, it'll be 98. And the year after that, it'll probably be 97. [laugh].
And if I go inside of Amazon's data centers, there's not one manufacturer of hard drives; there's a bunch. So, that got so easy that now, of course, you use more than one—that's just, like, sort of, a natural thing, right? These technologies, it'll move over time.
We just aren't there yet for the vast, vast majority of workloads.
Corey: I hope that in the future, this stuff becomes easier, but data transfer fees are going to continue to be a concern—
Miles: Just—[makes explosion noise]—
Corey: Oh, man—
Miles: —like, right in the face.
Corey: —especially with the Cambrian explosion of data, because the data science folks have successfully convinced the entire industry that there's value in those load balancer logs from 2012. Okay, great. We're never deleting anything again, but now you've got to replicate all of that stuff because no one has a decent handle on lifecycle management and won't for the foreseeable future. Great, to multiple providers so that you can work on these things? Like, that is incredibly expensive.
Miles: Yeah. Cool tech from this announcement at Next that I think is very applicable—and recognizes the level of, like, utter technical mastery, and security mastery to our earlier conversation, that something like this requires—the product is called BigQuery Omni. What Omni allows you to do is go into the Google Cloud Console, go to BigQuery, say I want to do analysis on this data that's in S3, or in Azure Blob Storage, and Google will spin up an account on your behalf on Amazon and Azure, run the compute there for you, and bring the result back. So, just transfer the answers, not the raw data that you just scanned, and no work on your part, no management, no crapola. So, there's like—that's multi-cloud. If I've got—I can do a join between a bunch of rows that are in real BigQuery over on the GCP side and rows that are over there in S3.
The cross-eyedness of getting something like that to work is mind-blowing.
Corey: To give this a little more context, just because it gets difficult to reason about these things: I can either have data that is in a private subnet in AWS that traverses their horribly priced Managed NAT Gateways, and then goes out to the internet and is sent there once, for the same cost as I could take that same data and store it in S3 in their standard tier for just shy of six full months. That's a little imbalanced, if we're being direct here. And then when you add in things like intelligent tiering and archive access classes, that becomes something that… there's no contest there. It's, if we're talking about things that are now approaching exabyte scale, that's one of those, “Yeah, do you want us to pay by a credit card?”—get serious. You can't at that scale anyway—“Invoice billing, or do we just, like, drive a dump truck full of gold bricks and drop them off in Seattle?”
Miles: Sure. Same trajectory on the multi-cloud thing. So, like, a partner of ours, PacketFabric: you know, if you're a big, big company, you go out and you call Amazon and you buy 100 gigabit interconnect on—I think they call theirs Direct Connect—and then you hook that up to the Google one that's called Dedicated Interconnect. And voila, the price goes from twelve cents a gig down to two cents a gig; everybody's much happier. But Jesus, you pay the upfront for that, you got to set the thing up, it takes days to get deployed, and now you're culpable for the whole pipe if you don't use it up. Like, there are charges that are static over the course of the month.
So, PacketFabric just buys one of those and lets you rent a slice of it you need. And I think they've got an incredible product. We're working with them on a whole bunch of different projects. But I also expect—like, there's no reason the cloud providers shouldn't be working hard to vend that kind of solution over time.
If a hundred gigabit is where it is now, what does it look like when I get to ten gigabit? When I get to one gigabit? When I get to half a gigabit? You know, utility-price that for us so that we get to rational pricing.
I think there's a bunch of baked-in business and cost logic that is a part of the pricing system, where egress is the source of all of the funding at Amazon for internal networking, right? I don't pay anything for the switches that connect this machine to that machine, in region. It's not like those things are cheap or free; they have to be there. But the funding for that comes from egress. So, I think you're going to end up seeing a different model where you'll maybe have different approaches to egress pricing, but you'll be paying, like, an in-system networking fee.
And I think folks will be surprised at how big that fee likely is, because of the cost of the level of networking infrastructure that the providers deploy, right? I mean, like, I don't know if you've gone and tried to buy a 40-port, 40-gig switch anytime recently. It's not like they're those little, you know, blue Netgear ones for 90 bucks.
Corey: Exactly. It becomes this, [sigh] I don't know, I keep thinking that's not the right answer, but part of it also is like, well, you know, for things that I really need local and don't want to worry about if the internet's melting today, I kind of just want to get, like, some kind of Raspberry Pi shoved under my desk for some reason.
Miles: Yeah. I think there is a lot where, as more and more businesses bet bigger and bigger slices of the farm on this kind of thing—I think it's Jassy's line that, you know, the fat in the margin in your business is my opportunity. Like, there's a whole ecosystem of partners and competitors that are hunting all of those opportunities. I think that pressure can only be good for customers.
Corey: Miles, thank you for taking the time to speak with me.
If people want to learn more about you, what you're up to, your bad opinions, your ridiculous company, et cetera—
Miles: [laugh].
Corey: —where can they find you?
Miles: Well, it's really easy to spell: SADA.com, S-A-D-A dot com. I'm Miles Ward; it's @milesward on Twitter; you don't have to do too hard of a math. It's miles@sada.com, if you want to send me an email. It's real straightforward. So, eager to reach out, happy to help. We've got a bunch of engineers that like helping people move from Amazon to GCP. So, let us know.
Corey: Excellent. And we will, of course, put links to this in the [show notes 00:37:17] because that's how we roll.
Miles: Yay.
Corey: Thanks so much for being so generous with your time, and I look forward to seeing what comes out next year from these various cloud companies.
Miles: Oh, I know some of them already, and they're good. Oh, they're super good.
Corey: This is why I don't do predictions because, like, the stuff that I know about—like, for example, I was aware that Graviton 3 was coming—
Miles: Sure.
Corey: —and it turns out that if your—guess what's going to come up and you don't name Graviton 3, it's like, “Are you simple? Did you not see that one coming?” It's like—or if I don't know it's coming and I make that guess—which is not the hardest thing in the world—someone would think I knew and leaked. There's no benefit to doing predictions.
Miles: No. It's very tough; very happy to do predictions in private, for customers. [laugh].
Corey: Absolutely. Thanks again for your time. I appreciate it.
Miles: Cheers.
Corey: Miles Ward, CTO at SADA. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud.
If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice and be very angry in your opinion when you write that obnoxious comment, but then it's going to get lost because it's using MySQL instead of Postgres.
Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.
Announcer: This has been a HumblePod production. Stay humble.
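The data-transfer economics Corey and Miles trade jabs about in this episode can be sanity-checked with quick arithmetic. The sketch below uses assumed 2021-era us-east-1 list prices (the per-GB and per-GB-month figures are my assumptions, not numbers quoted in the episode; only the twelve-cents-to-two-cents interconnect comparison comes from the conversation itself):

```python
# Back-of-the-envelope check of two claims from the conversation above.
# All unit prices are assumed 2021-era us-east-1 list prices (assumptions,
# not figures from the transcript).

NAT_PROCESSING_PER_GB = 0.045     # NAT Gateway data-processing charge, $/GB (assumed)
EGRESS_PER_GB = 0.09              # data transfer out to the internet, $/GB (assumed)
S3_STANDARD_PER_GB_MONTH = 0.023  # S3 Standard storage, $/GB-month (assumed)

# Claim 1: sending a GB through a Managed NAT Gateway and out to the internet
# once costs about the same as storing it in S3 Standard for ~6 months.
one_way_cost = NAT_PROCESSING_PER_GB + EGRESS_PER_GB               # ≈ $0.135/GB
equivalent_storage_months = one_way_cost / S3_STANDARD_PER_GB_MONTH  # ≈ 5.9 months

print(f"Sending 1 GB once: ${one_way_cost:.3f}")
print(f"Same spend buys {equivalent_storage_months:.1f} months in S3 Standard")

# Claim 2: a dedicated interconnect drops cross-cloud transfer from twelve
# cents to two cents per gig. On a 100 TB/month flow:
volume_gb = 100 * 1000
monthly_savings = volume_gb * (0.12 - 0.02)
print(f"100 TB/month at $0.12 vs $0.02 per GB saves ${monthly_savings:,.0f}/month")
```

With those assumed prices, the "just shy of six full months" line checks out (about 5.9 months), and the interconnect math shows why the static port fees Miles mentions can still pay for themselves quickly at volume.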

Screaming in the Cloud
MongoDB's Purposeful Application Data Platform with Sahir Azam
Dec 14, 2021 35:01
About Sahir
Sahir is responsible for product strategy across the MongoDB portfolio. He joined MongoDB in 2016 as SVP, Cloud Products & GTM to lead MongoDB's cloud products and go-to-market strategy ahead of the launch of Atlas, and helped grow the cloud business from zero to over $150 million annually. Sahir joined MongoDB from Sumo Logic, a SaaS machine-data analytics company, where he managed platform, pricing, packaging, and technology partnerships. Before Sumo Logic, Sahir was the Director of Cloud Management Strategy & Evangelism at VMware, where he launched VMware's first organically developed SaaS management product and helped grow the management tools business to over $1B in revenue. Earlier in his career, Sahir held a variety of technical and sales-focused roles at DynamicOps, BMC Software, and BladeLogic.
Links:
MongoDB: https://www.mongodb.com
Transcript
Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.
Corey: This episode is sponsored in part by our friends at Redis, the company behind the incredibly popular open source database that is not the BIND DNS server. If you're tired of managing open source Redis on your own, or you're using one of the vanilla cloud caching services, these folks have you covered with the go-to managed Redis service for global caching and primary database capabilities: Redis Enterprise. Set up a meeting with a Redis expert during re:Invent, and you'll not only learn how you can become a Redis hero, but also have a chance to win some fun and exciting prizes. To learn more and deploy not only a cache but a single operational data platform for one Redis experience, visit redis.com/hero. That's r-e-d-i-s.com/hero.
And my thanks to my friends at Redis for sponsoring my ridiculous nonsense.
Corey: Are you building cloud applications with a distributed team? Check out Teleport, an open source identity-aware access proxy for cloud resources. Teleport provides secure access to anything running somewhere behind NAT: SSH servers, Kubernetes clusters, internal web apps, and databases. Teleport gives engineers superpowers! Get access to everything via single sign-on with multi-factor. List and see all SSH servers, Kubernetes clusters, or databases available to you. Get instant access to them all using tools you already have. Teleport ensures best security practices like role-based access, preventing data exfiltration, providing visibility, and ensuring compliance. And best of all, Teleport is open source and a pleasure to use. Download Teleport at https://goteleport.com. That's goteleport.com.
Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. For the first in-person recording in ages, we have a promoted guest joining us from MongoDB. Sahir Azam is the Chief Product Officer. Thank you so much for joining me.
Sahir: Thank you for having me, Corey. It's really exciting to be able to talk and actually meet in person.
Corey: I know, it feels a little scandalous these days, when we're in a position of meeting people in person, like you're almost like you're doing something wrong somehow. So, MongoDB has been a staple of the internet for a long time. It's, oh good; another database to keep track of. What do you do these days? What is MongoDB in this ecosystem?
Sahir: That's a great question. I think we're fortunate that MongoDB has been very popular for a very long time. We're seeing, you know, massive adoption grow across the globe, and the massive developer community is sort of adopting the technology. What I would bring across is that today, MongoDB is really one of the leading cloud database companies in the world.
The majority of the company's business comes from our cloud service; we partner very heavily with AWS and other cloud providers on making sure we have global availability of that. That's our flagship product.
And we've invested really heavily in the last, I would say, five or six years in really extending the capabilities of the product to not just be the, sort of, database for modern web-scale applications, but also to be able to handle mission-critical use cases across every vertical—you know, enterprises to startups—and doing so in a way that really empowers a general-purpose strategy for any app they want to build.
Corey: You're talking about general purpose which, I guess, leads to the obvious question that AWS has been pushing for a while: the idea of purpose-built databases, which makes sense from a certain point of view, and then they, of course, take that way beyond the bounds of normalcy. I don't know what the job is for someone whose role is to disambiguate between the 20 different databases that they offer by, who knows, probably the end of this year. And I don't know what that looks like. What's your take on that whole idea of a different database for every problem slash every customer slash every employee slash API request?
Sahir: What we see is customers clearly moved to the cloud because they want to be able to move faster, innovate faster, be more competitive in whatever market or business or organization they're in. And certainly, I think the days of a single-vendor database to rule all use cases are gone. We're not [laugh] by any means supportive of that. However, the idea that you would have 15 different databases that need to be rationalized, integrated, and scripted together, frankly, may be interesting for technical teams who want to cobble together, you know, a bespoke architecture.
But when we look at it from, sort of, a skills, repeatability, cost, and simplicity perspective of architecture, we're seeing these, sort of, almost Rube Goldbergian sorts of architectures.
And in a large organization that wants to adopt the cloud en masse, the idea of every development team coming up with their own architecture and spending all of that time on duplication and integration of work is a distraction from ultimately their core mission, which is driving more capability and differentiation in the application for their end customer. So, to be blunt, we actually think the idea of having 15 different databases, ‘the right tool for the job,' is the wrong approach. We think that there are certain key technologies that most organizations will use for 70, 80% of use cases, and then use the niche technologies where they really need specialized solutions for particular needs.
Corey: So, if you're starting off with a general-purpose database then, what is the divergence point at which point—like in my case, eventually I have to admit that using TXT records in Route 53 as a database starts to fall down for certain use cases. Not many, mind you, but one or two here and there. At what point, when you're sticking with a general-purpose database, does migrating to something else—what's the tipping point there?
Sahir: Yeah, I think what we see is, if you have a general-purpose database that hits the majority of your needs, oftentimes—especially with a microservices kind of modern architecture—it's not necessarily replacing your general-purpose database with a completely different solution; it may be augmenting it. So, you may have a particular need for, I don't know, deep graph capabilities, for example, for a particular traversal use case. Maybe you augment that with a specialized solution for that.
But the idea is that there's a certain velocity you can enable in an organization by building skill set and consolidation around a technology provider that gives much more repeatability, security, less data duplication, and ultimately focuses your organization and teams on innovation as opposed to plumbing. And that's where the 15 different databases being cobbled together may be interesting, but it's not really focusing on innovation; it's focusing more on the technology problems that you solved.

Corey: So, we're recording this on site in Las Vegas, as re:Invent, thankfully and finally, draws to a close. How was your conference?

Sahir: It's been fantastic. And to be clear, we are huge fans and partners of AWS. This is one of our most exciting conferences we sponsor. We go big, [laugh] we throw a party, we have a huge presence, we have hundreds of customer meetings. So, although I'm a little ragged, as you can probably tell from my voice from many meetings and conversations and drinks with friends, it's actually been a really great week.

Corey: It is one of those things where having taken a year off, you forget so much of it, where it's, “Oh, I can definitely walk between those two hotels,” and then you sort of curse the name of God as you wind up going down that path. It was a relief, honestly, to not see, for example, another managed database service being launched that I can recall in that flurry of announcements. Did you catch any?

Sahir: I didn't catch any new particular database services that at least caught my eye. Granted, I've been in meetings most of the time; however, we're really excited about a lot of the infrastructure innovation. You know, I just happened to have a meeting with the compute teams on the Amazon side and what they're doing with, you know, Wavelength, and Local Zones, and new hardware, and chips with Graviton—it's all stuff we're really excited about.
So, it is always interesting to see the innovation coming out of AWS.

Corey: You mentioned that you are a partner with AWS, and I get it, but AWS is also one of those companies whose product strategy is ‘yes.' And they, a couple years ago, launched their DocumentDB, in parentheses with MongoDB compatibility, which they say, “Oh, customers were demanding this,” but no, no, they weren't. I've been talking to customers; what they wanted was actual MongoDB. The couple of folks I'm talking to who are using it are using it for one reason and one reason only, and that is replication traffic between AZs on native AWS services is free; everyone else must pay. So, there's some sub-offering in many respects that is largely MongoDB compatible to a point. Okay, but… how do you wind up, I guess, addressing the idea of continuing to partner with a company that is also heavily advantaging its own first party services, even when those are not the thing that best serves customers?

Sahir: Yeah, I've been in technology for a while, and you know, the idea of working with major platform players in the context of being, in our case, a customer, a partner, and a competitor is something we're more than comfortable with, you know, and any organization at our scale and size is navigating those same dynamics. And I think on the outside, it's very easy to pay way more attention to the competitive dynamics of, oh, you run in AWS but you compete with them, but the reality is, honestly, there's a lot more collaboration, both on the engineering side but also in the field. Like, we go jointly work with customers, getting them onto our platform, way more often than I think the world sees. And that's a really positive relationship.
And we value that, and we're investing heavily on our side to make sure, you know, we're good partners in that sense.

The nuances of DocumentDB versus the real MongoDB—the reality of the situation is, yes, if you want the minimal MongoDB experience for, you know, a narrow percentage of our functionality, you can get that from that technology, but that's not really what customers want. Customers choose MongoDB for the breadth of capabilities that we have, and in particular, in the last few years, it's not just the NoSQL query capability of Mongo: we've integrated rich aggregation capabilities for analytics, transactional guarantees, a globally distributed architecture that scales horizontally and across regions much further than anything a relational architecture can accomplish. And we've integrated other domains of data, so things like full text search, analytics, mobile synchronization are all baked into our Atlas platform. So, to be honest, when customers compare the two on the merits of the technology, we're more than happy to be competitors with AWS.

Corey: No, I think that everyone competes with AWS, including its own product teams amongst each other because, you know, that's how you, I guess, innovate more rapidly. What do I know? I don't run a hyperscale platform. Thankfully.

If I go and pull up your website, it's mongodb.com. It is natural for me to assume that you make a database, but then I start reading; after the big text and the logo, it says that you are an application data platform. Tell me more about that.

Sahir: Yeah, and this has been a relatively new area of focus for us over the last couple of years. You know, I think many people know MongoDB as a non-relational modern database. Clearly, that's our core product. I think in general, we have a lot of capabilities in the database that many customers are unaware of in terms of transactional guarantees and schema management and others, so that's kind of all within the core database.
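The rich aggregation capabilities Sahir mentions refer to MongoDB's aggregation pipeline. As a minimal sketch: the collection name ("orders") and its fields ("status", "total") are hypothetical examples, not from the episode, and the pipeline below is the document you would hand to a driver call such as `collection.aggregate(pipeline)`.

```python
# Minimal sketch of a MongoDB aggregation pipeline, built as plain
# Python dicts. Collection and field names are hypothetical.

pipeline = [
    # Stage 1: filter to the documents we care about
    {"$match": {"status": {"$in": ["shipped", "delivered"]}}},
    # Stage 2: group by status and sum order totals per group
    {"$group": {"_id": "$status", "revenue": {"$sum": "$total"}}},
    # Stage 3: largest revenue first
    {"$sort": {"revenue": -1}},
]

# Against a live cluster with pymongo, this would run as:
#   for row in db.orders.aggregate(pipeline):
#       print(row)
print(len(pipeline))  # → 3
```

Each stage transforms the stream of documents from the previous one, which is what lets the same API cover both queries and the analytics-style rollups discussed here.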
But over the last few years, we've both built and acquired technology—things like Realm, that allows for mobile synchronization; event-driven architectures; APIs to be created on your data easily; Atlas Data Lake, which allows for data transformation and analytics to be done using the same API as the core Mongo database; as I mentioned a couple minutes ago, things like search, where we actually allow customers to remove the need for a separate search engine for their application and make it really seamless operationally, and from the developer experience standpoint.

And you know, there's no real term in the industry for that, so we kind of describe ourselves as an application data platform because really, what we're trying to do is simplify the data architecture for applications, so you don't need ten different niche database technologies to be able to build a powerful, modern, scalable application; you can build it in a unified way with an amazing developer experience that allows your teams to focus on differentiation and competitiveness as opposed to plumbing together the data infrastructure.

Corey: So, when I hear platform, I think about a number of different things that may or may not be accurate, but the first thing that I think is, “Oh. There's code running on this then, as sort of part of an ecosystem.” Effectively, is there code running on the data platform that you built today that wasn't written by people at MongoDB?

Sahir: Yes, but it's typically the customer's code as part of their application. So, you know, I'll give you a couple of simple examples. We provide SDKs to be able to build web and mobile applications. We handle the synchronization of data from the client and front end of an application back to the back end seamlessly through our Realm platform.
So, we're certainly, in that case, operating some of the business logic, or extending beyond sort of just the back end data.

Similarly, a lot of what we focus on is modern event-driven architectures with MongoDB. So, to make it easier to create reactive applications—trigger off of changes in your data—we built functions and triggers natively in the platform. Now, we're not trying to be a full-on application hosting platform; that's not our business, our business is a data platform, but we really invest in making sure that platform is open, accessible, provides APIs, and functional capabilities make it very easy to integrate into any application our customers want to build.

Corey: It seems like a lot of different companies now are trying to, for lack of a better term, get some of the love that Snowflake has been getting for, “Oh, their data cloud is great.” But when you take a step back and talk to people about, “So, what do you think about Mongo?” The invariable response you're going to get every time is, “Oh, you mean the database?” Like, “No, no. The character from the Princess Bride. Yes, the database.” How do you view that?

Sahir: Yeah, it's easy to look at all the data landscape through a simple lens, but the reality is, there's many sub-markets within the database and data market overall. And for MongoDB, we're, frankly, an operational data company. And we're not focused on data warehousing, although you can use MongoDB for various analytical capabilities. We're focused on helping organizations build amazing software, and leveraging data as an enabler for great customer experiences, for digital transformation initiatives, for solving healthcare problems, or [unintelligible 00:12:51] problems in the government, or whatever it might be. We're not really focused on selling customers'—or platforms of data from—not the customers' data, but other—allowing people to monetize their data.
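The event-driven pattern Sahir describes above—triggering off of changes in your data—is exposed in the core database as change streams, which Atlas triggers build on. A sketch, with a hypothetical handler: nothing here connects to a database, and the sample event is hand-built in the documented change-event shape ("operationType", "fullDocument").

```python
# Sketch of reacting to data changes via a change stream. The handler
# and the sample event below are illustrative; no live cluster is used.

watch_pipeline = [{"$match": {"operationType": "insert"}}]

def handle_event(event):
    # Hypothetical business logic: react to a newly inserted document.
    doc = event["fullDocument"]
    return f"insert seen for _id={doc['_id']}"

# Against a live replica set with pymongo, this would look like:
#   with db.orders.watch(watch_pipeline) as stream:
#       for event in stream:
#           print(handle_event(event))

sample_event = {"operationType": "insert", "fullDocument": {"_id": 42}}
print(handle_event(sample_event))  # → insert seen for _id=42
```

The same mechanism is what serverless triggers wrap: your function runs in response to each change event rather than polling the collection.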
We're focused on their applications and developers building those experiences.

Corey: Yeah. So, if you were selling customers' data, you'd just rebrand as FacebookDB and be done with it, or MetaDB now—

Sahir: MetaDB?

Corey: Yeah. As far as the general Zeitgeist around Mongo goes, back when I was first hearing about it, in, I don't know, I want to say the first half of the 2010s, the running gag was, “Oh, Mongo. It's Snapchat for databases,” with the gag being that it lost production data, was unsafe for a bunch of things. To be clear, based upon my Route 53 comments, I am not a database expert by any stretch of the imagination. Now, the most common thing in my experience that loses production data is me being allowed near it. But what was the story? What gave rise to that narrative?

Sahir: Yeah, I think that—thank you for bringing that up. I mean, to be clear, you know, if a database doesn't keep your data safe, consistent, and guaranteed, the rest of the functionality doesn't matter, and we take that extremely seriously at MongoDB. Now, you know, MongoDB has been around a long time, and for better or worse—I think there's, frankly, good things and bad things about this—the database exploded in popularity extremely fast, partially because it was so easy to use for developers and it was also very different than the traditional relational database models. And so I think in many ways, customers' expectations of where the technology was compared to where we were from a maturity standpoint, combined with running and operating it the same way as a traditional system—which was, frankly, wrong for a distributed database—caused, unfortunately, some situations where customers stubbed their toes and, you know, we weren't able to get to them and help them as easily as we could. Thankfully, you know, none of those issues fundamentally are, like, foundational problems.
You know, we've matured the product for many, many years; you know, we work with 30,000-plus customers worldwide on mission-critical applications. I just want to make sure that everyone understands that, like, we take any issue that has to do with data loss or data corruption as sort of the foundational [P zero 00:14:56] problem we always have to solve.

Corey: I tend to form a lot of my opinions based upon very little on what, you know, sorry to say it, execs say and a lot more about what I see. There was a whole buzz going around on Twitter that HSBC was moving a whole bunch of its databases over to Mongo. And everyone was saying, “Oh, they're going to lose all their data.” But I've done work with a fair number of financial services companies, and of all the people I talk to, they're pretty far on one end of that spectrum of, “How cool are we with losing data?” So, voting with a testimonial and a wallet like that—because let's be clear, getting financial services companies to reference anything for anyone anywhere is like pulling teeth—that says a lot more than any, I guess, PR talking points could.

Sahir: Yeah, I appreciate you saying that. I mean, we're very fortunate to have a very broad customer base, everything from the world's largest gaming companies to the world's largest established banks, the world's fastest growing fintechs to healthcare organizations distributing vaccines with technologies built on Mongo. Like, you name it, there's a use case in any vertical, as mission critical as you can think, built on our technology. So, customers absolutely don't take our word for granted. [laugh]. They go, you know, get comfortable with a new database technology over a span of years, but we've really hit sort of mainstream adoption for the majority of organizations.
You mentioned financial services, but it's really any vertical globally, you know, we can count on our customer list.

Corey: How do you, I guess, for lack of a better term, monetize what it is you do when you're one of the open-source—and yes, if you're an open-source zealot who wants to complain about licensing, it's imperative that you do not email me—but you are available for free—for certain definitions of free; I know, I know—that I can get started with at two o'clock in the morning and start running it myself in my environment. What is the tipping point that causes people to say, “Well, that was a good run. Now, I'm going to pay you folks to run it for me.”

Sahir: Yeah, so there's two different sides to that. First and foremost, the majority of our engineering investment for our business goes into our core database, and our core database is free. And the way we actually, you know, survive and make money as a business, so we can keep innovating, you know, on top of the billion dollars of investment we've put in our technology over the years is: for customers who are self-managing in their own data center, we provide a set of management tools, enterprise security integrations, and others that are commercially licensed, to be able to manage MongoDB for mission-critical applications in production. That's a product called Enterprise Advanced. It's typically used for large enterprise accounts in their own data centers. The flagship product for the company these days, the fastest growing part of the business, is a product we call Atlas—or platform we call Atlas. That's a cloud data service.

So, you know, you can go onto our website, sign up with our free tier, swipe a credit card, all consumption-based, available in every AWS region, as well as Azure and GCP, has the ability to run databases across AWS, Azure, and GCP, which is quite unique to us.
And that, like any cloud data technology, is then used in conjunction with a bunch of other application components in the cloud, and customers pay us for the consumption of that database and how much they use.

Corey: This episode is sponsored by our friends at Oracle Cloud. Counting the pennies, but still dreaming of deploying apps instead of "Hello, World" demos? Allow me to introduce you to Oracle's Always Free tier. It provides over 20 free services and infrastructure, networking, databases, observability, management, and security. And—let me be clear here—it's actually free. There's no surprise billing until you intentionally and proactively upgrade your account. This means you can provision a virtual machine instance or spin up an autonomous database that manages itself, all while gaining the networking, load balancing, and storage resources that somehow never quite make it into most free tiers needed to support the application that you want to build. With Always Free, you can do things like run small scale applications or do proof-of-concept testing without spending a dime. You know that I always like to put asterisks next to the word free. This is actually free, no asterisk. Start now. Visit snark.cloud/oci-free. That's snark.cloud/oci-free.

Corey: I want to zero in a little bit on something you just said, where you can have data shared between all three of the primary hyperscalers. That sounds like a story that people like to tell a lot, but you would know far better than I: how common is that use case?

Sahir: It's definitely one from a strategic standpoint, especially in large enterprises, that's really important.
Now, to your point, the actual usage of cross-cloud databases is still very early, but the fact that customers know that we can go, in three minutes, spin up a database cluster that allows them to either migrate, or span data across multiple regions from multiple providers for high availability, or extend their data to another cloud for analytics purposes or whatnot, is something that almost seems like science fiction to them, but it's crucial as a capability I know they will need in the future.

Now, to our surprise, we've seen more real production adoption of it probably sooner than we would have expected, and there's kind of three key use cases that come into play. One—you know, for example, I was with a challenger bank from Latin America yesterday; they need high availability in Latin America. In the countries they're in, no single infrastructure cloud provider has multiple regions. They need to span across multiple regions. They mix and match cloud providers—in their case AWS being their primary, and they have a secondary cloud provider, in their case GCP, for high availability.

But it's also regulatorily driven, because the banking sector regulations in that country state that they need to be able to show portability, because they don't want concentration risk of their banking sector to be on a single cloud provider or single cloud provider's region. So, we see that in multiple countries happening right now. That's one use case.

The other tends to be geographic reach. So, we work with a very large international gaming company; the majority of their use cases happen to be run out of the US. They happen to have a spike of customers using their game [unintelligible 00:19:58]—gamers using it in Taiwan; their cloud provider of choice didn't have a region in Taiwan, but they were able to seamlessly extend a replica into a different cloud to serve low-latency performance in that country.
That's the second.

And then the third, which is a little bit more emerging, is kind of the analytic-style use case where you may have your operational data running in a particular cloud provider, but you want to leverage the best of every cloud provider's newest, fanciest services on top of your data. So, isn't it great if you can just hit a couple clicks, we'll extend your data and keep it in sync in near real time, and allow you to plumb into some new service from another cloud provider.

Corey: In an ideal world with all things being equal, this is a wonderful vision. There's been a lot of noise made—a fair bit of it by me, let's be fair—around the data egress pricing for—it's easy to beat up on AWS because they are the largest cloud provider and it's not particularly close, but they all do it. Does that serve as a brake on that particular pattern?

Sahir: Thankfully, for a database like ours and the various mechanisms we use, it's not a barrier to entry. It's certainly a cost component to enabling this capability, for sure. We absolutely would love to see the industry be more open and use less of egress fees as a way to wall people into particular cloud providers. We certainly have that belief, and would push that notion and continually do in the industry. But it hasn't been a barrier to adoption because it's not the major cost component of operating a multi-cloud database.

Corey: Well, [then you start 00:21:27] doing this whole circular replication thing, at which point, wow. It just goes round and round and round and lives on the network all the time. I'm told that's what a storage area network is because I'm about as good at storage as I am at databases. As you look at Atlas, since you are in all of the major hyperscalers, is the experience different in any way, depending upon which provider you're running in?

Sahir: By and large, it's pretty consistent. However, what we are not doing is building to the lowest common denominator.
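The geographic-reach use case described above—serving low-latency reads from the replica nearest the user—corresponds to MongoDB's read preference settings, configured per connection. A sketch only: the hostname is a placeholder, and nothing here connects to a real cluster.

```python
from urllib.parse import urlencode

# Read-preference options that route queries to the closest replica set
# member. maxStalenessSeconds bounds how far behind the primary a
# chosen secondary may lag (the driver minimum is 90 seconds).
options = {
    "readPreference": "nearest",
    "maxStalenessSeconds": "90",
}

# Placeholder hostname; with a driver such as pymongo you would pass
# this URI to MongoClient.
uri = "mongodb+srv://cluster0.example.net/?" + urlencode(options)
print(uri)
```

The trade-off is that "nearest" reads may return slightly stale data, which is why the staleness bound travels with the read preference.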
If there's a service integration that our customers on AWS want, and that service doesn't integrate—it doesn't exist on another cloud provider, or vice versa—we're not going to stop ourselves from building a great customer experience and integration point. And the same thing goes for infrastructure; if there's some infrastructure innovation that delivers price-performance, great value for our customers, and it's only on a single cloud, we're not going to stop ourselves from delivering that value to customers. So, there's a line there: you know, we want to provide a great experience, portability across the cloud providers, consistency where it makes sense, but we are not going to water down our experience on a particular cloud provider if customers are asking for some native capabilities.

Corey: It always feels like a strange challenge historically to wind up—at least in large, regulated environments—getting a new vendor in. Originally an end run around this was using the AWS Marketplace or whatever marketplace you were using at any given cloud provider. Then procurement caught on and in some cases banned the Marketplace outright, and now the Marketplace is sort of reformed, in some ways, to being a tool for procurement to use. Have you seen significant uptake of your offering through the various cloud marketplaces?

Sahir: We do work with all the cloud marketplaces. In fact, we just made an announcement with AWS that we're going to be implementing the pay-as-you-go marketplace model for self-service as well on AWS. So, it is definitely a driver for our business. It tends to be used most heavily when we're selling with the, you know, sales teams from the cloud providers, and customers want to benefit from a single bill, benefit from, you know, drawing down on their large commitments that they might have with any given cloud providers.
So, it drives really good alignment between the customer, us as a third-party on AWS or Azure or GCP, and the infrastructure cloud provider. And so we're all aligned on a motion. So, in that sense, it's definitely been helpful, but it's largely been a procurement and fulfillment sort of value proposition to drive that alignment, I'd say, by and large today.

Corey: I don't know if you're able to answer this without revealing anything confidential, so please feel free not to, but as you look across the total landscape—since I would say that you have a fairly reasonable snapshot of the industry as a whole—am I right when I say that AWS is the behemoth in the space, or is it a closer horse race than most people would believe, based upon your perspective?

Sahir: I think in general, for sure AWS is the market share leader. It would be crazy to say anything otherwise. They innovated this model; you know, the amount of innovation happening at AWS is incredible, you know, and we're benefiting from it as a customer as well. However, we do believe it's a multi-cloud future. I mean, look at the growth of Azure. You know, we're seeing Google show up in large enterprises across the globe as well.

And even beyond the three American clouds, you know, we work heavily with Alibaba and Tencent in mainland China, which is a completely different market than the Western world. So, I do think the trend over time will be a more heterogeneous, more multi-cloud world—which, I'm biased; that does favor MongoDB, but that's the trend we're seeing—but that doesn't mean that AWS won't continue to still be a leader and a very strong player in that market.

Corey: I want to talk a little bit about Jepsen. And for those who are unaware, jepsen.io is run by Kyle Kingsbury. Kyle is wonderful, and he's also nuts. If you followed him back when he was on Twitter, you've also certainly seen them.

But beyond that, he is the de facto resource I go to when it comes to consistency testing and stress testing of databases.
I'm a little annoyed he hasn't taken on Route 53 yet, but hope does spring eternal. He's evaluated Mongo a number of times, and his conclusions, as always, are mixed—sometimes, shall we say, incendiary—but they always seem relatively fair. What has your experience been working with him? And do you share my opinion of him as being a neutral and fair arbiter of these things?

Sahir: I do. I think he's got real expertise and credibility in beating up distributed database systems and finding the edges of where they don't live up to what we all hope they do, right? Whether it's us or anyone else, just to be clear. And so anytime Kyle finds some flaw in MongoDB, we take it seriously, we add it to our test suite, [laugh] we remediate, and I think we have a pretty good history of that. And in fact, we've actually worked with Kyle to welcome him beating up our database on multiple occasions, too, so it's not an adversarial relationship at all.

Corey: I have to ask, since you are a more modern generation of database than many from the previous century, but there's always been a significant, shall we say… concern, when I wind up looking at [it again in 00:26:33] any given database, and I look in the terms and conditions and, like, “Oh, it's a great database. We're by far the best. Whatever you do, do not publish benchmarks.” What's going on with that?

Sahir: I think benchmarks can be spun in any direction you want, by any vendor. And it's not just database technology. I've been in IT for a while, and you know, that applies to any technology. So, we absolutely do not shy away from our performance or benchmark or comparisons to any technology. We just think that, you know, vendors benchmarking technologies for their—are doing so largely to only make their own technologies look good versus competition.

Corey: I tend to be somewhat skeptical of the various benchmark stuff.
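The consistency properties Jepsen's analyses probe are, in MongoDB, tunable per operation through write concern and read concern. The documents below are the settings a driver sends with each operation; the pymongo calls shown in comments are the real driver API, but the collection name is a hypothetical example.

```python
# Durability-oriented settings: acknowledge writes only after a
# majority of replica set members have them (with the journal flushed),
# and read only majority-committed data.

write_concern = {"w": "majority", "j": True}
read_concern = {"level": "majority"}

# With pymongo, the same settings attach to a collection handle:
#   from pymongo import MongoClient, WriteConcern
#   from pymongo.read_concern import ReadConcern
#   coll = client.shop.get_collection(
#       "orders",
#       write_concern=WriteConcern(w="majority", j=True),
#       read_concern=ReadConcern("majority"),
#   )
print(write_concern["w"], read_concern["level"])
```

Weaker settings (e.g. `w: 1`) trade those guarantees for latency, which is exactly the kind of configuration edge a Jepsen analysis exercises.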
I remember repeatedly oh, I'll wind up running whatever it is—I think it's Geek Speed—on my various devices to see, oh, how snappy and performant is it going to be? But then I'm sitting there opening Microsoft Word and watching the beach ball spin, and spin, and spin, and it turns out, I don't care about benchmarks in a real-world use case in many scenarios.

Sahir: Yeah, it's kind of a good analogy, right? I mean, performance of an application—sure, the database at the heart of it is a crucial component, but there's many more aspects of it that have to do with the overall real world performance than just some raw benchmark results for any database, right? It's the way you model your data, the way the rest of the architecture of the application interacts and hangs together with the database—many, many layers of complexity. So, I don't always think those benchmarks are indicative of how real world performance will look, but at the same time, I'm very confident in MongoDB's performance comparatively to our peers, so it's not something we're afraid of.

Corey: As you take a look at where you've been and where you are now, what's next? Where are you going? Because I have a hard time believing that, “Yep, we're deciding it's feature complete and we're just going to sell this until the end of time exactly as is, we're laying off our entire engineering team and we're going to be doing support from our yacht, parked comfortably in international waters.” That's a slightly different company. What's the plan?

Sahir: So, [laugh] you're—we are not parking anything, anytime soon.
We are continuing to invest heavily in the innovation of the technology, and really, it's two reasons: you know, one, we're seeing an acceleration of adoption of MongoDB, either with any customers that have used us for a long time, but for more important and more use cases, but also just broader adoption globally as more and more developers learn to code, they're choosing Mongo as the place to start, increasingly. And so that's really exciting for us, and we need to keep up with those customer demands and that roadmap of asks that they have.

And at the same time, customer requirements are increasing: as more and more organizations are software-first organizations, the requirements of what they demand from us continually increase, which requires continual innovation in our architecture and our functionality to keep up with those and stay ahead of those customer requirements. So, what you'll see from us is, one, making sure we can build the best modern database we can. That's the core of what we do; everything we do now, especially, is cloud first, so working closely with our cloud partners on that. And even though we're very fortunate to be a high-performance, high-growth company with a very pervasive open technology, we're still in a giant market that has a lot of legacy technologies powering old applications. So, [laugh] you know, we have a long, long runway to become a long-standing major player in this market.

And then we're going to continue this vision of an application data platform, which is really just about simplifying the capabilities and data architecture for organizations and developers so they can focus on building their application and less on the plumbing.

Corey: I want to thank you so much for taking the time to speak with me today. If people want to learn more, where can they go?

Sahir: Clearly, you can go to mongodb.com.
You can also reach out to us on our community sites: our own, or on any of the public sites that you would typically find developers hanging out. We always have folks from our teams or our champions program of advocates worldwide helping out our customers and users. And I just want to thank you, Corey, for having me. I've followed you online for a while; it's great to finally be able to meet in person.

Corey: Uh-oh. It's disturbing having realized some of the things I've said on Twitter and realizing I'm now within range to get punched in the face. But, you know, we take what we can get. Thank you so much for taking the time to speak with me. I appreciate it.

Sahir: My pleasure.

Corey: Sahir Azam, Chief Product Officer at MongoDB. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice along with an angry comment telling me that that is not the reason that AWS is building many new databases. Tell me which one you're building and why it solves a problem other than getting you the promotion you probably don't deserve.

Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.

Announcer: This has been a HumblePod production. Stay humble.

The Cloudcast
The Evolution of MongoDB


Play Episode Listen Later Sep 19, 2021 26:40


The transition of @MongoDB from an open source project to commercially successful public company to cloud provider has been an interesting one, and one that many other software companies are looking to emulate.

SHOW: 550
CLOUD NEWS OF THE WEEK - http://bit.ly/cloudcast-cnotw
CHECK OUT OUR NEW PODCAST - "CLOUDCAST BASICS"

SHOW SPONSORS:
Datadog Synthetic Monitoring: Frontend and Backend Modern Monitoring. Start detecting user-facing issues with API and browser tests with a free 14 day Datadog trial. Listeners of The Cloudcast will also receive a free Datadog T-shirt.
CBT Nuggets: Expert IT Training for individuals and teams. Sign up for a CBT Nuggets Free Learner account.
AWS Data Backup for Dummies (Veeam)
Choose Your Own Cloud Adventure with Veeam and AWS

SHOW NOTES:
History of MongoDB (wikipedia)
MongoDB Atlas is launched (DBaaS) - 2016
Amazon launches DocumentDB (with MongoDB compatibility) - 2019
MongoDB IPO - 2017
SaaS and Moving Downmarket - MongoDB's Transformation
Evolution of Commercial OSS (Cloudcast Eps.492)
How Cloud is Changing OSS Licensing (Cloudcast Eps.493)

FROM OPEN TO COMMERCIAL TO IPO TO CLOUD
Many software companies are trying to make the evolution from customer-operated to cloud-operated business models. MongoDB is an early lighthouse, showing the blueprint for success.

CHANGING (OR GROWING NEW) MARKETS IS VERY DIFFICULT
Solve a technical problem
Create a unique value proposition (simplicity)
[Marketing] Create (and lead) a growing community of users - via open source
[Monetization] Create open-core features to differentiate and solve unique problems
[New GTM, New Markets] Evolve the product to new delivery models
Grow into new markets, through different customer engagement models

FEEDBACK?
Email: show at the cloudcast dot net
Twitter: @thecloudcastnet

AWS Morning Brief
Listener Questions 3 - How to Get Rid of Your Oracle Addiction

AWS Morning Brief

Apr 16, 2021 · 23:34


Links:
Unconventional Guide to AWS Cost Management: https://www.duckbillgroup.com/resources/unconventional-guide-to-aws-cost-management/
Migrate from Oracle to Amazon Aurora: https://aws.amazon.com/getting-started/hands-on/migrate-oracle-to-amazon-aurora/
Transcript
Corey: This episode is sponsored in part by LaunchDarkly. Take a look at what it takes to get your code into production. I’m going to just guess that it’s awful because it’s always awful. No one loves their deployment process. What if launching new features didn’t require you to do a full-on code and possibly infrastructure deploy? What if you could test on a small subset of users and then roll it back immediately if results aren’t what you expect? LaunchDarkly does exactly this. To learn more, visit launchdarkly.com and tell them Corey sent you, and watch for the wince. Pete: Hello, and welcome to the AWS Morning Brief: Fridays From the Field. I am Pete Cheslock. Jesse: I’m Jesse DeRose. Pete: We’re coming at you again with some more listener questions from the Unconventional Guide to AWS Cost Management. I’m excited. People are listening to us, Jesse. Jesse: This is fantastic. I’m really excited that we have one fan. I’ve always wanted one fan. Pete: Well, two fans now. Maybe even more because we keep getting questions. And you can also be one of our Friends of the Pod by going to lastweekinaws.com/QA. And you can give us some feedback, you can give us a question and, like, we’ll totally answer it because we like Friends of the Pod. Jesse: We may or may not enter you into a raffle to get a Members Only jacket that’s branded with ‘Friends with the Pod.’ Pete: We should get some pins made, maybe. Jesse: Ohh… Pete: I think that’s a good idea. Jesse: Yeah. Pete: So, what are we answering today, or attempting to answer for our listener, Jesse? Jesse: So today, we’ve got a really great question from [Godwin 00:01:20]. 
Thank you, Godwin. Godwin writes, “I truly believe that the system that I support is, like, a data hoarder. We do a lot of data ingestion, we recently did a lift-and-shift of the system to AWS, we use an Oracle database. The question is, how do I segregate the data and start thinking about moving it out of traditional relational databases and into other types of databases? Presently, our method is all types of data goes into a quote-unquote, ‘all-purpose database,’ and the database is growing quite fast. Where should I get started?” Pete: Well, I just want to commend you for a lift-and-shift into Amazon. That’s a Herculean feat, no matter what you’re lifting and shifting over. Hopefully, you have maybe started to decommission those original data centers and you don’t just have more data in twice as many locations. Jesse: [laugh]. But I also want to call out well done for thinking about not just the lift-and-shift, but the next step. I feel like that’s the thing that a lot of people forget about. They think about the lift-and-shift, and then they go, “Awesome. We’re hybrid. We’re in AWS, now. We’re in our data center. We’re good. Case closed.” And they forget that there’s a lot more work to do to modernize all those workloads in AWS, once you’ve lifted and shifted. And this is part of that conversation. Pete: Yeah, that’s a really good point because I know we’ve talked about this in the past, the lift-and-shift shot clock: when you don’t start migrating and modernizing those applications to take advantage of things that are more cloud-native, the technical debt is really going to start piling up, and the folks that are going to manage that are going to get more burnt out, and it really is going to end poorly. So, the fact you’re starting to think about this now is a great thing. 
Also, what is available to you now that you’re on AWS is huge compared to a traditional data center.Jesse: Yeah.Pete: And that’s not just talking about the—I don’t even know if I’ve ever counted how many different databases exist on Amazon. I mean, they have a database for, at this point, every type of data. I mean, is there a type of data that they’re going to create, just so that they can create a database to put it into?Jesse: Wouldn’t surprise me at this point.Pete: They’ll find a way [laugh] to come up with that charge on your bill. But when it comes to Oracle, specifically Oracle databases, there’s obviously a big problem in not only the cost of the engine, running the database on a RDS or something to that effect, but you have licensing costs that are added into it as well. Maybe you have a bring-your-own-license or maybe you’re just using the off-the-shelf, but the off-the-shelf, kind of, ‘retail on-demand pricing’ RDS—I’m using air quotes for all these things, but you can’t see that—they will just have the licensing costs baked in as well. So, you’re paying for it—kind of—either way.Jesse: And I think this is something also to think about that we’ll dive into in a minute, but one of the things that a lot of people forget about when they move into AWS says that you’re not just paying for data sitting on a piece of hardware in a data center that’s depreciating, now. You’re paying for storage, you’re paying for I/O costs, you’re paying for data transfer, to Pete’s point, you’re also paying for some of the license as well, potentially. So, there’s lots of different costs associated with keeping an Oracle Database running in AWS. So, that’s actually probably the best place to start thinking about this next step about where to get started. Think about the usage patterns of your data.And this may be something that you need to involve engineering, maybe involve product for if they’re part of these conversations for storage of your product or your feature sets. 
Think about what are the usage patterns of your data?Pete: Yeah, exactly. Now, you may say to yourself, “Well, we’re on Oracle”—and I’m sure people listening are like, “Well, that’s your problem. You should just move off of Oracle.” And since you can’t go back in time and undo that decision—and the reality is, it probably was a good decision at the time. There’s a lot of businesses, including Amazon, who ran all of their systems on Oracle.And then migrated off of them. Understanding the usage patterns, what type of data is going into Oracle, I think is a big one. Because if you can understand the access patterns of the types of data that are going in, that can help you start peeling off where that data should go. Now, let’s say you’re just pushing all new data created. And we don’t even know what your data is, so we’re going to take some wild assumptions here on what you could possibly do—but more so just giving you homework, really—thinking about the type of data going in, right?If you’re just—“I’m pushing all of my data into this database because someday we might need to query it.” That’s actually a situation where you really want to start thinking of leveraging more of a data warehouse-style approach to it, where you have a large amount of data being created, you don’t know if you’re going to need to query it in the future, but you might want to glean some value out of that. Using S3, which is now available to you outside of your data center world, is going to be super valuable to just very cheaply shove data into S3, to be able to go back in later time. And then you can use things like Athena to ad hoc query that data, or leverage a lot of the ingestion services that exist to suck that data into other databases. But thinking about what’s being created, when it is going into places is a big first step to start understanding, well, how quickly does this data need to come back?Can the query be measured in many seconds? Can it be done ad hoc, like in Athena? 
Does it need to be measured in milliseconds? What’s the replication that needs to happen? Is this very valuable data that we need to have multiple backups on?Is it queried more than it’s created? Maybe you need to have multiple replica reader databases that are there. So, all these types of things of really understanding just what’s there to begin with, and it’s probably going to be in talking to a lot of engineering teams.Jesse: Yeah, you can think about this project in the same way that you might move from a monolith to a microservice architecture. So, if you’re moving from a monolith to a microservice architecture, you might start peeling away pieces of the monolith, one at a time. Pieces that can easily be turned into microservices that stand on their own within the cloud, even if they’re running on the same underlying infrastructure as the monolith itself within AWS. And then, as you can pull those pieces away, then start thinking about does this need to be in a relational database? Does this need to have the same amount of uptime and availability as the resources that are sitting in my Oracle Database right now?All those things that Pete just mentioned, start thinking about all of those components to figure out where best to pull off the individual components of data, and ultimately put them in different places within AWS. And to be clear, there’s lots of great guides on the internet that talk about moving from your Oracle database into, gosh, just about any database of choice. AWS even has specific instructions for this, and we’ll throw a link in the [show notes 00:09:02].They really, really want you to move this data to RDS Aurora. 
They go through painstaking detail to talk about using the AWS schema conversion tool to convert your schema over; they talk about the AWS database migration service to migrate the data over, and then they talk about performing post-migration activities such as running SQL queries for validating the object types, object count, things like that. I think that a lot of folks actually don’t know that the database migration service exists, and it’s something worth calling out as a really powerful tool.Pete: Yeah, the Amazon DMS service is honestly I think, a super-underrated service that people just don’t know about. It has the ability to replicate data from both on-premises databases to Amazon databases but also databases already running on Amazon. You could replicate from a database running on EC2 into Aurora. You could replicate that into S3—you know, replicate data into S3 that way, bringing things into sync—replicate that data into S3, and then maybe use it for other purposes. It can replicate data from DocumentDB into other sources.So, they’re clearly doing a big investment in there. And to Jesse’s point, yeah, Amazon really wants this data. So, talk to your account manager as you’re testing out some of these services. Do a small proof of concept, maybe, to see how well it works, if you can understand the queries, or you can point your application over at an Aurora database with some of this data migrated in; that’s a great way to understand how well this could work for your organization. But as Jesse mentioned, they do want that data in Aurora.So, if it turns out that you’re looking at your—you know, migrate some data in there, and it’s starting to work, and you’re kind of getting a feel for the engineering effort to migrate there, stop. 
Talk to your account manager before you spend any more money on Aurora because it’s very likely that they can put together a program—if a program doesn’t already exist—to incentivize you to move that data over; they can give you subject matter expertise; they can provide you credits to help you migrate that data over. Don’t feel like you have to do this on your own. You have an account team; you should definitely reach out to them, and they will provide you a lot of help to get that data in there. They’ve done it for many of their other clients, and they’re happy to do it for you because they know that, long term, when you move that data to Aurora, it’s going to be very sticky in Aurora. You’re probably not going to move off of there. It’s a long game for them; that’s how they play it. So, check out those services; that could be a really great way to help you get rid of your Oracle addiction. Jesse: Yeah, and if you’re able to, as we talked about earlier, if you’re able to identify workloads that don’t need to run in a relational database, or don’t need to run in, maybe, a database at all, for that matter, stick that data in S3. Call it a day. Put them on lifecycle management policies or different storage tiers, and use Athena for ad hoc queries, or maybe Redshift if you’re doing more data warehouse-style tasks. But if that data doesn’t need to live in a relational database, there are many cheaper options for that data. Pete: Exactly. But one last point I will make is don’t shove it into MongoDB just because you want to have schema-less, or— Jesse: Please. Pete: —think about what you’re going to use it for, think about what the data access patterns are, because there is a right place for your data. Don’t just jump into NoSQL just ‘cause, because you’ll probably end up with a bigger problem in the long run. Corey: If your mean time to WTF for a security alert is more than a minute, it's time to look at Lacework. 
Lacework will help you get your security act together for everything from compliance service configurations to container app relationships, all without the need for PhDs in AWS to write the rules. If you're building a secure business on AWS with compliance requirements, you don't really have time to choose between antivirus or firewall companies to help you secure your stack. That's why Lacework is built from the ground up for the Cloud: low effort, high visibility and detection. To learn more, visit lacework.com.Pete: So Jesse, I’m looking at our list of questions. And it turns out, we have another question.Jesse: Ohh.Pete: Two questions came in.Jesse: You like me, you really like me!Pete: It’s so great. Again, you can also send us a question, lastweekinaws.com/QA. You can go there, drop in a question and feel free to put your name. Or not; you can be anonymous, it’s totally fine. We’ll happily answer your question either way. So Jesse, who is our next question from? What is this one about?Jesse: This one’s from [Joseph 00:13:19]. They write in, “Hey, folks. Love the show. Longtime listener, first-time caller.” Thank you. “I would love to know how people manage their costs in AWS Batch. Jobs themselves can’t be tagged for cost allocation, which makes things a bit complicated.” Lord Almighty, yes, it does. “How best should I see if the jobs are right-sized? Are they over-provisioned in terms of memory or compute? What’s the best way to see if EC2 is my better choice, versus Fargate, versus other options? How can I tell if the batch-managed cluster itself is under-utilized?”Pete: Oof. This is a loaded question with a lot of variables.Jesse: Yeah. And so we’re going to break it down because there’s definitely a couple questions here. But I want to start off with what AWS Batch is, just really quick to make sure everybody’s on the same page here. 
AWS Batch, effectively, is a managed service in AWS that schedules and runs your batch computing jobs on top of AWS compute resources. Effectively, it does a lot of the heavy lifting configuration for you so you can just focus on analyzing the results of those jobs. Pete: Yeah, exactly. And Batch supports a really wide variety of tooling that can operate this, and that’s why it’s hard for us to say, specifically, how you might optimize this, but I think some of the optimizations actually mirror a lot of the optimizations we’ve done with optimizing EMR clusters and things of that nature, where you’re running these distributed jobs. And you want to make sure that if you’re running straight off of EC2 instances, then you want to make sure that they are essentially maxed out. If the CPU is anything less than 100% for an on-demand instance, then there’s waste, or there’s opportunity for improvement. And so making sure that your jobs are sized appropriately and balancing out memory and CPU so that, effectively, you’re using all of the memory and all of the CPU, that’s a real basic first step. But honestly, a lot of folks kind of miss out on that. They just kind of run a job and go off and do their own thing. They never really go back and look at those graphs. You can go to CloudWatch, they’re all going to be there for you. Jesse: Yeah. And to this point, there’s always an opportunity to make these workloads more ephemeral. If you have the opportunity to make it more ephemeral, please, please, please, please, absolutely do so. Unless your batch job needs to run 24/7. We’ve seen that in a few cases where they have, essentially, clusters that are running 24/7, but they’re not actually utilized regularly; the workloads are only scheduled for a short amount of time. So, if you don’t need those batch jobs running 24/7, please, by all means, move to more ephemeral resources, like Fargate. 
Fargate on Spot, Spot Instances in general, or even Lambda, which AWS Batch now supports as well. Pete: Yeah, it has some Step Functions support, which is pretty interesting. Yeah, this is a great opportunity to aggressively—aggressively—leverage Spot, if you’re not currently today. The reality is this: check out Fargate on Spot if you don’t need, like, a custom operating system or a custom EBS volume size. If you do, then EC2 on Spot is probably the best option that you really have. But you really do not want to be running anything on on-demand instances. Even on-demand instances with a really good savings plan, you’re still leaving money on the table because Spot Instances are going to be a lot cheaper than even the best savings plan that’s out there. Jesse: And I think that’s a good point, too, Pete, which is if you do need to run these workloads on-demand, 24/7, think about if you can get away with using Spot Instances. If you can’t get away with using Spot Instances, at least purchase a savings plan if you don’t do anything else. If you take nothing else away from this, at least make sure that you have some kind of savings plan in place for these resources so that you’re not paying on-demand costs 24/7. But in most cases, you can likely make them more ephemeral, which is going to save you a lot more money in the long run. 
But if your app needs a little time, or runs for a defined period of time—let’s say your app runs for one hour—you can get a defined duration Spot of one hour, you’ll get a great discount still and you’ll only pay for however long you use it, but you will get that resource for one whole hour, and then you’ll lose it. If that’s still too aggressive, there’s configurable options up to six hours. Again, less discount, but more stability in that resource. So, that’s the trade-off you make when you move over to Spot Instances.Jesse: So, I also want to make sure that we get to the second part of this question, which is about attributing cost to your AWS Batch workloads. According to the AWS Batch documentation, you can tag AWS Batch compute environments, jobs, job definitions, and job queues, but you can’t propagate those tags to the underlying resources that actually run those jobs. Which to me, kind of just defeats the point.Pete: Yeah. [sigh]. Hashtag AWS wishlist here. You know, again, continuing to expand out tagging support for things that don’t support it. I know we’ve seen kind of weird inconsistencies, and just even, like, tagging ECS jobs and where you have to tag them for they’re to apply.So, I know it’s a hard problem, but obviously, it’s something that should be continually worked out on because, yeah, if you’re trying to attribute these costs, you’re left with the only option to run them in separate Amazon accounts, which solves this problem, but again, depending on your organization, could increase just the management overhead of those. But that is the ultimate way. I mean, that is the one way to ensure 100% of costs are encapsulated to a service is to have them run in a dedicated account. 
The downside is that if you have a series of different jobs running across different, maybe, business units, then obviously that’s going to break down super quick. Jesse: Yeah, and it’s also worth calling out that if there are any batch jobs that need to send data to different places—maybe the batch job belongs to product A, but it needs to send data to product B—there’s going to be some amount of data transfer either across regions or across accounts in order to share that data, depending on how your organization, how your products are set up. So, keep in mind that there are potentially some minor charges that may appear with this, but ultimately, if you’re talking about the best ways to really attribute costs for your AWS Batch workloads, linked accounts is the way to go. Pete: Yeah. If you need attribution down to the penny—some of our clients absolutely do. For invoicing purposes, they need attribution for business unit down to the penny. And if you’re an organization that needs that, then the only way to get that, effectively, is segmented accounts. So, keep that in mind. Again, until Amazon comes out with the ability to get a little bit more flexible tagging, but also, too, feel free to yell at your account manager—I mean, ask them nicely. They are people, too. But, you know, let them know that you want this. Amazon builds what the customers want, and if you don’t tell them that you want it, they’re not going to prioritize it. I’m not saying if you tell them, you’re going to get it in a couple of months, but you’re never going to get it if you don’t say anything. So, definitely let people know when there’s something that doesn’t work the way you expect it to. Jesse: Absolutely. Pete: Awesome. Wow. Two questions. I feel like it’s Christmas. Except— Jesse: [laugh]. Pete: —it’s Christmas in almost springtime. It’s great. Well, again, you, too, can join us by being a Friend of the Pod, which Jesse really loves for some reason. [laugh]. Jesse: Yeah. 
Don’t know why, but it’s going to be stuck in my brain.Pete: Exactly. You too can be a Friend of the Pod by going to lastweekinaws.com/QA and you can send us a question. We would love to spend some time in a future episode, answering them for you.If you’ve enjoyed this podcast, please go to lastweekinaws.com/review. Give it a five-star review on your podcast platform of choice, whereas if you hated this podcast, please go to lastweekinaws.com/review and give it a five-star rating on your podcast platform of choice and tell us why you want to be a Friend of the Pod. Thank you.Announcer: This has been a HumblePod production. Stay humble. 
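The "park it cheaply in S3, query it ad hoc with Athena" pattern Pete and Jesse recommend can be sketched in Python with boto3. This is a hedged sketch, not their setup: the bucket, database, table, and helper names (`build_athena_request`, `run_ad_hoc_query`) are invented for illustration.

```python
# Sketch of ad hoc Athena queries over data parked in S3, as discussed above.
# All names here are hypothetical placeholders.

def build_athena_request(database, query, results_bucket):
    """Assemble the parameters for an ad hoc Athena query."""
    return {
        "QueryString": query,
        "QueryExecutionContext": {"Database": database},
        # Athena writes its result files back to S3; those objects cost
        # S3 storage too, so point them at a bucket with a lifecycle policy.
        "ResultConfiguration": {
            "OutputLocation": f"s3://{results_bucket}/athena-results/"
        },
    }

def run_ad_hoc_query(params):
    # boto3 is imported inside the function so the sketch reads (and runs)
    # without AWS credentials configured.
    import boto3
    athena = boto3.client("athena")
    response = athena.start_query_execution(**params)
    return response["QueryExecutionId"]

params = build_athena_request(
    database="ingest_archive",
    query="SELECT event_type, count(*) FROM events GROUP BY event_type",
    results_bucket="example-analytics-results",
)
```

You pay Athena per data scanned, which is why the episode's advice to land data in an open columnar format matters: a well-partitioned layout keeps each ad hoc query cheap.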

Investorideas -Trading & News
The #AI Eye: AWS Announces Amazon ( $AMZN) DocumentDB, Gridsum ( $GSUM) Expands Relationship with NIKE (NYSE: $NKE)

Investorideas -Trading & News

Jan 16, 2020 · 5:16


The #AI Eye: AWS Announces Amazon ( $AMZN) DocumentDB, Gridsum ( $GSUM) Expands Relationship with NIKE (NYSE: $NKE)

Screaming in the Cloud
Episode 50: If You Lose Data, Your Company is Having a Very Bad Day

Screaming in the Cloud

Feb 27, 2019 · 37:01


If you use MongoDB, then you may be feeling ecstatic right now. Why? Amazon Web Services (AWS) just released DocumentDB with MongoDB compatibility. Users who switch from MongoDB to DocumentDB can expect improved speed, scalability, and availability. Today, we’re talking to Shawn Bice, vice president of non-relational databases at AWS, and Rahul Pathak, general manager of big data, data lakes, and blockchain at AWS. They share AWS’ overall database strategy and how to choose the best tool for what you want to build. Some of the highlights of the show include:
Database Categories: Relational, key value, document, graph, in memory, ledger, and time series
AWS database strategy is to have the most popular and best APIs to sustain functionality, performance, and scale
Many database tools are available; pick based on use case and access pattern
Product recommendations feature highly connected data - who do you know who bought what and when?
Analytics Architecture: Use S3 as data lake, put in data via open-data format, and run multiple analyses using preferred tool at the same time on the same data
AWS offers Quantum Ledger Database (QLDB) and Managed Blockchain to address use case and need for blockchain
Authenticity of data is a concern with traditional databases; consider a database tool or service that does not allow data to be changed
Lake Formation lets customers set up, build, and secure data lakes in less time
DocumentDB: Made as simple as possible to improve customer experience
AWS Culture: Awareness and recognition that it takes many to conceive, build, launch, and grow a product - acknowledge every participant, including customers
Links: Amazon DocumentDB, MongoDB, Amazon RDS, React, Aurora, re:Invent, DynamoDB, Amazon Neptune, Amazon ElastiCache, Amazon Quantum Ledger Database, Amazon Timestream, Amazon S3, Amazon EMR, Amazon Athena, Amazon Redshift, Amazon Managed Blockchain, Amazon EC2, Amazon Lake Formation, Perl, CHAOSSEARCH
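The "MongoDB compatibility" in the episode title means DocumentDB speaks the MongoDB wire protocol, so an existing MongoDB driver such as pymongo can connect to it largely unchanged. A minimal sketch, where the cluster endpoint, credentials, and database/collection names are invented placeholders (the CA bundle filename is the one AWS documents for DocumentDB TLS):

```python
# Sketch: connecting a stock MongoDB driver to an Amazon DocumentDB cluster.
# Endpoint and credentials are hypothetical.

def build_docdb_uri(user, password, endpoint, port=27017):
    """Assemble a DocumentDB connection string for a MongoDB driver."""
    return (
        f"mongodb://{user}:{password}@{endpoint}:{port}/"
        # DocumentDB clusters require TLS; AWS publishes the CA bundle.
        "?tls=true&tlsCAFile=rds-combined-ca-bundle.pem"
        "&replicaSet=rs0&readPreference=secondaryPreferred"
    )

def insert_document(uri, doc):
    # pymongo is imported inside the function so the sketch stands alone
    # without the driver installed or a live cluster reachable.
    from pymongo import MongoClient
    client = MongoClient(uri)
    return client["appdb"]["events"].insert_one(doc).inserted_id

uri = build_docdb_uri(
    "appuser",
    "secret",
    "example-cluster.cluster-abc123.us-east-1.docdb.amazonaws.com",
)
```

The application-side code stays MongoDB code; what changes is the connection string and the operational model underneath it, which is exactly the trade the episode discusses.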

Linux Action News
Linux Action News 89

Linux Action News

Jan 20, 2019 · 30:25


Another troubling week for MongoDB, ZFS On Linux lands a kernel workaround, and 600 days of postmarketOS. Plus our thoughts on the new Project Trident release, and Mozilla ending their Test Pilot program.

Scaling Postgres
Episode 47 | pgBouncer | Postgres 11 Gems | DocumentDB | JSON

Scaling Postgres

Jan 20, 2019 · 11:17


In this episode of Scaling Postgres, we review articles covering pgBouncer, Postgres 11 gems, DocumentDB similarities and JSON capabilities. Subscribe at https://www.scalingpostgres.com to get notified of new episodes. Links for this episode:
https://blog.2ndquadrant.com/pg-phriday-pgbouncer-bust/
https://pgbouncer.github.io/changelog.html
https://pgbouncer.github.io/usage.html
https://www.cybertec-postgresql.com/en/unearthing-some-hidden-postgresql-11-gems/
https://www.citusdata.com/blog/2019/01/13/citus-data-top-posts-of-2018/
https://www.enterprisedb.com/blog/documentdb-really-postgresql
https://severalnines.com/blog/overview-json-capabilities-within-postgresql
https://severalnines.com/blog/one-security-system-application-connection-pooling-and-postgresql-case-ldap
https://www.citusdata.com/blog/2019/01/15/contributing-to-postgres/
https://blog.2ndquadrant.com/maintaining-feature-branches-submitting-patches-git/
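As a taste of the Postgres JSON capabilities the linked articles cover, here is a hedged sketch of querying a `jsonb` column from Python. The table, column, and DSN are invented for illustration; psycopg2 is only needed if you actually execute the statements against a live database.

```python
# Sketch of Postgres JSONB querying: ->> extracts a field as text,
# @> tests containment (and can be served by a GIN index).
# Table and column names are hypothetical.

CREATE_SQL = """
CREATE TABLE events (
    id      serial PRIMARY KEY,
    payload jsonb NOT NULL
);
CREATE INDEX events_payload_idx ON events USING gin (payload);
"""

QUERY_SQL = """
SELECT payload ->> 'user' AS username
FROM events
WHERE payload @> '{"type": "login"}';
"""

def run(dsn):
    # psycopg2 is imported inside the function so the sketch reads
    # without a live database or the driver installed.
    import psycopg2
    with psycopg2.connect(dsn) as conn, conn.cursor() as cur:
        cur.execute(QUERY_SQL)
        return cur.fetchall()
```

The containment query is the part DocumentDB-style workloads care about: with a GIN index on the `jsonb` column, Postgres can answer document-shaped queries without a full scan, which is the core of the "is DocumentDB really Postgres?" comparison linked above.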

Software Defined Talk
Episode 162: The diapers.com effect, also, LTS and the mysteries of software pricing

Software Defined Talk

Jan 18, 2019 · 64:37


Are we still on that open source licensing thing? Yes. “The most boring topic of all time.” Also, Slack's logo and long term support software monetization models: how do they work? Summary: “Diapers.com buster (AKA Amazon)” “What is someone really selling with LTS?” “Artful genitals.” “It’s not butt ducks” “I’ve had three dogs since then…” Microsoft laughed. This week’s cover art from TheNextWeb (https://thenextweb.com/apps/2019/01/16/slack-has-a-new-logo-and-umm-you-be-the-judge/). MONGO, MONGO, MONGO! MongoDB Issues New Server Side Public License for MongoDB Community Server (https://www.mongodb.com/press/mongodb-issues-new-server-side-public-license-for-mongodb-community-server) MongoDB not in RHEL 8.0 (https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8-beta/html/8.0_beta_release_notes/new-features#web_servers_databases_dynamic_languages_2) MongoDB "open-source" Server Side Public License rejected (https://www.zdnet.com/article/mongodb-open-source-server-side-public-license-rejected/) AWS vs. open source: DocumentDB is the latest battlefront (https://www.infoworld.com/article/3331903/database/aws-vs-open-source-documentdb-is-the-latest-battlefront.html) AWS gives open source the middle finger (https://techcrunch.com/2019/01/09/aws-gives-open-source-the-middle-finger/) AWS, MongoDB, and the Economic Realities of Open Source (https://stratechery.com/2019/aws-mongodb-and-the-economic-realities-of-open-source/) (Ben Thompson) Fine, fine…but music companies didn’t “sell” CDs, they sold music. Authors don’t “sell” printed books, they sell stories. They sell IP. The medium isn’t the product. “This trade-off is inescapable, and it is fair to wonder if the golden age of VC-funded open source companies will start to fade (although not open source generally). 
The monetization model depends on the friction of on-premise software; once cloud computing is dominant, the economic model is much more challenging.” There’s some ponderous gyrating between public cloud being good at managed hosting/services (they run the stuff well) vs. software (their features are unique/good). Ben’s follow-up (https://stratechery.com/2019/mongodb-follow-up-aws-incentives-batteries-the-iphones-missing-miss/#memberful_done) (subscription required): “ Atlas was only 8% of total revenue last year, which grew 57% year-over-year; that means that Atlas itself grew 330% year-over-year, from $3.3 million to $14.3 million. Of course cost of revenue grew 68% as well, thanks to a $4.1 million increase in hosting costs (AWS wins either way), but particularly given the addition of a free Atlas offering, those costs aren’t out of line.” So, with this “SSPL” thing, AWS would have to open source all of itself, or just the DocumentDB part? Here (https://www.zdnet.com/article/mongodb-open-source-server-side-public-license-rejected/): “The specific objection is that SSPL requires, if you offer services licensed under it, that you must open-source all programs that you use to make the software available as a service. From Mongo’s press release on SSPL, Oct. 2018 (https://www.mongodb.com/press/mongodb-issues-new-server-side-public-license-for-mongodb-community-server): “The only substantive change is an explicit condition that any organization attempting to exploit MongoDB as a service must open source the software that it uses to offer such service.” What would happen if AWS was all open source? Given that few companies could use OpenStack or make their own clouds (even with cloud.com and such), just having the code matters little to a successful cloud business, right? Or, maybe it doesn’t mean all of AWS, just the DocumentDB part. Which is, really, the in the spirit of the GPL. The competitive tactic of forcing competitors to open source their stuff is weird. 
Relevant to your interests Amazon reportedly acquired Israeli disaster recovery service CloudEndure for around $200M (https://techcrunch.com/2019/01/08/amazon-reportedly-acquired-israeli-disaster-recovery-service-cloudendure-for-around-200m/) AWS makes another acquisition grabbing TSO Logic (https://techcrunch.com/2019/01/15/aws-makes-another-acquisition-grabbing-tso-logic/) IBM Just Unveiled The First Commercial Quantum Computer (https://www.sciencealert.com/ibm-unveils-a-quantum-computer-that-will-be-available-to-businesses) “Watson! Whatever happened to ‘unikernel’?” Is that one in the bag and this is the new thing? Announcing TriggerMesh Knative Lambda Runtime (KLR) | Multicloud Serverless Management Platform (https://triggermesh.com/2019/01/09/announcing-triggermesh-knative-lambda-runtime-klr/) Serverless computing: one step forward, two steps back (https://blog.acolyer.org/2019/01/14/serverless-computing-one-step-forward-two-steps-back/) Day Two Kubernetes: Tools for Operability (https://www.infoq.com/presentations/kubernetes-tools) Taking the smarts out of smart TVs would make them more expensive (https://www.theverge.com/2019/1/7/18172397/airplay-2-homekit-vizio-tv-bill-baxter-interview-vergecast-ces-2019) OneLogin snares $100M investment to expand identity solution into new markets (https://techcrunch.com/2019/01/10/onelogin-snares-100m-investment-to-expand-identity-solution-into-new-markets/) Want to get rich from bug bounties? 
You're better off exterminating roaches for a living (http://go.theregister.com/feed/www.theregister.co.uk/2019/01/15/bugs_bounty_salary/) Direct Listings Are a Thing Now (https://www.bloomberg.com/opinion/articles/2019-01-11/direct-listings-are-a-thing-now) Software Maker PagerDuty Files Confidentially for IPO (http://www.bloomberg.com/news/articles/2019-01-15/software-maker-pagerduty-is-said-to-file-confidentially-for-ipo) Slack’s Financials Ahead of Listing Plans (https://www.theinformation.com/articles/slacks-financials-ahead-of-listing-plans) - “As of October 2018, the firm had roughly $900 million in cash on its balance sheet.” Fiserv buying First Data for $22bn (https://techcrunch.com/2019/01/16/fiserv-is-buying-first-data-in-a-22b-fintech-megadeal/?guccounter=1) - FundsXpress (https://www.crunchbase.com/organization/fundsxpress)! The 773 Million Record "Collection #1" Data Breach (https://www.troyhunt.com/the-773-million-record-collection-1-data-reach/) AWS launches Backup, a fully-managed backup service for AWS (https://techcrunch.com/2019/01/16/aws-launches-backup-to-let-you-back-up-your-on-premises-and-aws-data-to-aws/) Nonsense The WELL: State of the World 2019 (https://people.well.com/conf/inkwell.vue/topics/506/State-of-the-World-2019-page01.html) Apple reportedly replaced about 10 times more iPhone batteries than it expected to (https://www.cnbc.com/2019/01/15/apple-upgraded-10-to-11-million-batteries-according-to-report.html) Say hello, new logo (https://slackhq.com/say-hello-new-logo) Sponsors Plastic SCM Visit https://plasticscm.com/SDT (https://www.plasticscm.com/sdt?utm_source=Podcast&utm_medium=jingle&utm_campaign=SDT&utm_term=DevOps&utm_content=mergebots) to find out more and get some sassy t-shirts!! Arrested DevOps Subscribe to the Arrested DevOps podcast by visiting https://www.arresteddevops.com/ Conferences, et al. 2019, a city near you: The 2019 SpringTours are posted (http://springonetour.io/). 
Coté will be speaking at many of these, hopefully all the ones in EMEA. They’re free and all about programming and DevOps things. Free lunch and stickers! Jan 28th to 29th, 2019 - SpringOne Tour Charlotte (https://springonetour.io/2019/charlotte), $50 off with the code S1Tour2019_100. Feb 12th to 13th, 2019 - SpringOne Tour St. Louis (https://springonetour.io/2019/st-louis). $50 off with the code S1Tour2019_100. Mar 7th to 8th, 2019 - Incontro DevOps in Bologna (https://2019.incontrodevops.it/), Coté speaking. Mar 18th to 19th, 2019 - SpringOne Tour London (https://springonetour.io/2019/london). Get £50 off ticket price of £150 with the code S1Tour2019_100. Mar 21st to 22nd, 2019 (https://springonetour.io/2019/amsterdam) - SpringOne Tour Amsterdam. Get €50 off ticket price of €150 with the code S1Tour2019_100. Get a Free SDT T-Shirt Write an iTunes review of SDT and get a free SDT T-Shirt. Write an iTunes Review on the SDT iTunes Page. (https://itunes.apple.com/us/podcast/software-defined-talk/id893738521?mt=2) Send an email to stickers@softwaredefinedtalk.com (mailto:stickers@softwaredefinedtalk.com) and include the following: T-Shirt Size (Only Large or X-Large remain), Preferred Color (Gray, Black) and Postal address. First come, first served, while supplies last! Can only ship T-Shirts within the United States. SDT news & hype Join us in Slack (http://www.softwaredefinedtalk.com/slack). Follow us on Twitter (https://twitter.com/softwaredeftalk), Instagram (https://www.instagram.com/softwaredefinedtalk/) or LinkedIn (https://www.linkedin.com/company/software-defined-talk/) Send your postal address to stickers@softwaredefinedtalk.com (mailto:stickers@softwaredefinedtalk.com) and we will send you a sticker. Listen to the Software Defined Interviews Podcast (https://www.softwaredefinedinterviews.com/). Check out the back catalog (http://cote.coffee/howtotech/). 
Brandon built the Quick Concall iPhone App (https://itunes.apple.com/us/app/quick-concall/id1399948033?mt=8) and he wants you to buy it for $0.99. Recommendations Matt: Neil Gaiman’s Norse Mythology (https://www.amazon.com/dp/B01HQA6EOC/ref=dp-kindle-redirect?_encoding=UTF8&btkr=1). Brandon: DIRECTV Alexa skill (https://www.amazon.com/DIRECTV-LLC/dp/B07FDNYMB6). Coté: Peak (https://www.goodreads.com/book/show/29369213-peak), but read in, like, 4x mode. Summary: (1) model the thing learned, (2) focused exercises, (3) coaching, (4) use feedback loops to improve, (5) stretch yourself. Derry Girls (https://en.wikipedia.org/wiki/Derry_Girls).

Mobycast
AWS Launches DocumentDB - Should Mongo Be Worried?

Mobycast

Play Episode Listen Later Jan 16, 2019 36:34


In episode 44 of Mobycast, we discuss AWS's launch of DocumentDB, and whether or not Mongo should be worried.

LINUX Unplugged
284: Free as in Get Out

LINUX Unplugged

Play Episode Listen Later Jan 15, 2019 62:39


ZFS on Linux is becoming the official upstream project of all major ZFS implementations, even the BSDs. But recent kernel changes prevent ZFS from even building on Linux. Neal Gompa joins us to discuss why it all matters. Plus some surprising community news, and a few great picks! Special Guests: Dalton Durst and Neal Gompa.

Linux Action News
Linux Action News 88

Linux Action News

Play Episode Listen Later Jan 13, 2019 26:39


Choose your own Linux is coming to Chrome OS, GitHub private repos go free, LVFS gets another win, and Amazon released their MongoDB competitor DocumentDB. Plus Homebrew comes to Linux, the recent Ethereum Classic attack, and more.

Linux Action News Video
Linux Action News 88

Linux Action News Video

Play Episode Listen Later Jan 13, 2019


Choose your own Linux is coming to Chrome OS, GitHub private repos go free, LVFS gets another win, and Amazon released their MongoDB competitor DocumentDB.

The AI Eye: stock news & deal tracker
The #AI Eye: AWS Announces Amazon ( $AMZN) DocumentDB, Gridsum ( $GSUM) Expands Relationship with NIKE (NYSE: $NKE)

The AI Eye: stock news & deal tracker

Play Episode Listen Later Jan 10, 2019 5:16


The #AI Eye: AWS Announces Amazon ( $AMZN) DocumentDB, Gridsum ( $GSUM) Expands Relationship with NIKE (NYSE: $NKE)

Techmeme Ride Home
Thu. 01/10 - Foldable Phones Have An Arrival Date

Techmeme Ride Home

Play Episode Listen Later Jan 10, 2019 16:54


We have a good idea when that foldable Samsung phone is coming, Google is actually close to a big legal win in Europe (for a change), the government shutdown might actually be affecting CES and why the “gig economy” might actually be a big nothingburger. Sponsors: Flatironschool.com/podcast Metalab.co Go.BitRide.io/ride Links: Samsung to Show Off Its New Foldable Phone in February (WSJ) Amazon Web Services calls MongoDB’s licensing bluff with DocumentDB, a new managed database (GeekWire) Google Nears Win in Europe Over ‘Right to Be Forgotten’ (WSJ) Google Only Has to Respect Your 'Right to Be Forgotten' in the EU, Court Says (Gizmodo) 2019 is already full of weird and wonderful monitors (The Verge) Government shutdown halts FCC device approvals (Axios) How Estimates of the Gig Economy Went Wrong (WSJ)

JavaScript Jabber
JSJ BONUS: Web Apps on Linux with Jeremy Likness and Michael Crump

JavaScript Jabber

Play Episode Listen Later Sep 12, 2017 59:19


Tweet this episode JSJ BONUS: Web Apps on Linux with Jeremy Likness and Michael Crump In this episode Aimee Knight and Charles Max Wood discuss Microsoft's Web Apps on Linux offering with Jeremy Likness and Michael Crump. [00:37] Michael Crump Introduction Michael is on the developer experience team for Azure. [00:52] Jeremy Likness Introduction Jeremy is on the cloud developer advocacy team. Their mission is to remove friction and support developers and work with teams to build a positive experience. The NodeJS team is headed up by John Papa. They have teams around the world and are involved in many open source communities. They're focused on building documentation and creating great experiences. [02:54] What is it about Azure that people should be getting excited about? Azure is a huge platform. It can be overwhelming. They're trying to help you start with your problem and then see the solution as it exists on Azure. Azure is growing to embrace the needs of developers as they solve these problems. The experience is intended to be open and easy to use for any developer in any language on any platform. It allows you to work in whatever environment you want. Standing up applications in production is tough. Azure provides services and facilities (and interfaces) that make it easy to manage infrastructure. You don't have to be an operations expert. Chuck mentions this messaging as he heard it at Microsoft Connect() last year. It's not about bringing you to .NET. It's about making it easy where you're at. Aimee adds that, as a new-ish person in the community, Azure excites her because the portal and tutorials are easy to follow for many new programmers. A lot of these features are available across command lines, tools, and much more. The documentation is great. See our interview with Dan Fernandez on the Microsoft Docs. [12:04] Web Apps on Linux Web application as a service offering from Microsoft. 
I don't need to worry about the platform, just what's different about my application. Web Apps has traditionally been on Windows. Web Apps on Linux is in preview. You can choose the size of your infrastructure. You only get billed for what you use and can scale up. Setting up multiple servers, managing synchronization and load balancing is a pain. Web Apps gives you a clean interface that makes this management easy. You can also scale across multiple datacenters around the world. [15:06] Why Linux? What's hard about Windows? Node was originally created on Linux and many tools run nicely on Linux. It was later ported to Windows. The toolchains, IDEs, and build processes are in an ecosystem targeted more toward Linux than Windows. This allows people to work in an environment that operates how they expect instead of trying to map to an underlying Windows kernel. Aimee gives the example of trying to set up ImageMagick on Windows. Web Apps on Linux also allows you to build integrations with your tools that let you build, test, and deploy your application automatically. [19:12] Supported Runtimes Web Apps on Linux supports Node, PHP, Ruby, and .NET Core. You can run a docker container with Node up to 6.x. If you want Node 7.x or 8.x you can create your own Docker container. Web Apps on Linux is built on Docker. The containers also have SSH, so developers can log into the docker container and troubleshoot problems on the container. If you can build a container, you can also run it on this service. At certain levels, there's automatic scaling. [22:06] Consistency between containers? Shared ownership of state or assets It depends on how you build your app. The Docker containers have a shared storage where all the containers have access to the same data and state. There's a system called kudu that makes this really simple. You can also pull logs across all systems. You can also use SSH in the browser. [25:23] What's painful about Linux and containers? 
How is the application built and how does it manage state so that you can isolate issues? If you have 20 containers, can you connect to the right one? It's up to you to manage correlation between containers so you can find the information you need. Knowing your traffic and understanding what to do to prepare for it with scaling and automation is sometimes more art than science. [28:28] How should you manage state? A lot of these systems lend themselves to running stateless, but you don't want to run mongodb on each container versus running one mongodb instance that everything attaches to. You want a common place to store data for the entire app for shared state. [30:34] CosmosDB (was DocumentDB) It's an API equivalent to MongoDB. It's a database as a service and you can connect your containers to the CosmosDB in Azure using your portal to make it super easy. You may need to open up some firewall rules, but it should be pretty straightforward. [34:14] Third Party Logging Management Apps Azure has a service that provides metrics (Application Insights) and a logging service. Many other companies use elasticsearch based solutions that solve some of these problems as well. [36:06] How do people use Web Apps on Linux? Companies building new applications many times want to run without managing any infrastructure. So, they use Azure Functions, and other services on Azure. Lift and shift: Take a virtual machine and change it into a web app container that they can run in the cloud. They also move from SQL Server on a server to SQL Server on the cloud. Moving from hosted MongoDB to CosmosDB. You can also use any images on DockerHub. [40:06] Continuous Integration and Continuous Deployment Whether you're using a private registry or cloud registry, when you publish a new image, it'll use a webhook to pull the custom image and deploy it. Or run it through Continuous Integration and then deploy it without any human interaction. 
Chuck mentions the case when you haven't logged into a server for a while, there's a huge backlog of system updates. Updating your container definitions makes upkeep automatic. [42:02] Process files and workers with PM2 format You can set up instances to run across cores with the PM2 definitions. You can also make it run various types of workers on different containers. Why did you use PM2? What other uses are there for this kind of setup? You can tell it which processes to start up on boot. You can also have it restart processes when a file is changed, for example, with a config file you can have it restart the processes that run off that config file. [45:38] How to get started Getting started with Node docs.microsoft.com Trial account with a few hundred dollars in Azure credit. Michael's Links michaelcrump.net @mbcrump github.com/mbcrump Jeremy's Links bit.ly/coderblog @jeremylikness github/jeremylikness Picks Aimee Having a little bit of mindfulness while waiting on code and tests to run. Joe Ozark on Netflix Star Wars: Rogue One Chuck Travelers on Netflix Jeremy Ozark filming in Woodstock, GA Autonomous Smart Desk LED light strips Michael Conference Call Bingo Life (Movie) Get Out (Movie)
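The CosmosDB point in the notes above can be made concrete. This is a hypothetical sketch, not from the episode: because Cosmos DB (formerly DocumentDB) speaks the MongoDB wire protocol, a stock MongoDB driver such as pymongo can point at it. The account name, key, and database/collection names here are placeholders, not real credentials.

```python
# Hypothetical sketch (not from the episode): Cosmos DB (formerly DocumentDB)
# exposes a MongoDB-compatible wire protocol, so a stock MongoDB driver can
# connect to it. Account name and key below are placeholders, not credentials.

def cosmos_mongo_uri(account: str, key: str) -> str:
    """Build the MongoDB-API connection string for an Azure Cosmos DB account."""
    # The MongoDB API endpoint has historically listened on port 10255 with TLS.
    return (
        f"mongodb://{account}:{key}@{account}.documents.azure.com:10255/"
        "?ssl=true&replicaSet=globaldb"
    )

uri = cosmos_mongo_uri("myapp", "fakebase64key==")
print(uri)

# With a real account, the rest is ordinary pymongo usage:
#   from pymongo import MongoClient
#   client = MongoClient(uri)
#   client.appdb.sessions.insert_one({"state": "shared across containers"})
```

Because it is just a connection string, this is also how the "moving from hosted MongoDB to CosmosDB" lift-and-shift works: the application code keeps its existing driver calls and only the URI changes.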

The NoSQL Database Podcast
NDP018: Microsoft DocumentDB for NoSQL in the Cloud

The NoSQL Database Podcast

Play Episode Listen Later May 16, 2017 30:22


In this episode I am joined by Kirill Gavrylyuk, who works at Microsoft on the Azure DocumentDB team, where we talk about NoSQL as a hosted solution as well as Microsoft's NoSQL solution, DocumentDB. Kirill discusses why DocumentDB is such a great solution and how developers and administrators can get started with it.
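Episodes like this one treat DocumentDB as a hosted service you reach with a standard driver and a connection string; the service also speaks the MongoDB wire protocol. As a minimal sketch, here is how such a connection string could be assembled in Node.js. The account name and key are placeholders, and the `<account>.documents.azure.com:10255` host pattern with `ssl=true` is an assumption based on the service's MongoDB-compatible endpoint format of that era, so check the portal's connection-string blade for your actual values.

```javascript
// Build the connection string for the MongoDB-compatible API of a
// hypothetical Azure DocumentDB / Cosmos DB account.
function cosmosMongoUri(account, key) {
  // The key is URI-encoded because account keys are base64 and may
  // contain characters that are not valid in a URL userinfo section.
  return `mongodb://${account}:${encodeURIComponent(key)}` +
    `@${account}.documents.azure.com:10255/?ssl=true`;
}

// The resulting string can be handed to any MongoDB driver, e.g.
// `new MongoClient(uri)` in the Node.js driver.
console.log(cosmosMongoUri('my-account', 'secret+key=='));
```

Keeping the string construction separate from the driver call makes it easy to swap between a local MongoDB instance in development and the hosted service in production via environment variables.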

.NET Rocks!
Data on DocumentDB with Ryan CrawCour

.NET Rocks!

Play Episode Listen Later Sep 29, 2015 58:37


Document databases as a service? For sure! Carl and Richard talk to Ryan CrawCour about Azure DocumentDB. DocumentDB is a JSON store - with an amazing set of features, including SQL querying. What? Ryan talks about how DocumentDB provides a fast, scalable place to store objects and write your queries any way you like. You write the rules for how your data partitions between collections, as well as the performance of each of those collections, and you can change them on the fly. More sophisticated than a simple key-value-pair store, but less structured than a relational database, DocumentDB sits in a great spot in your data storage needs. Check it out!Support this podcast at — https://redcircle.com/net-rocks/donations


.NET Rocks!
Diving into Aurelia with Julie Lerman

.NET Rocks!

Play Episode Listen Later Aug 19, 2015 55:40


So what happens when you dive head-first into the latest JavaScript libraries? Carl and Richard chat with Julie Lerman about her experiences playing with Rob Eisenberg's Aurelia library. Of course, it doesn't stop there: if you're going to learn Aurelia, you're going to change the whole stack, including Node, Express, and DocumentDB! Julie walks through the process of adding each of the bits into the stack, learning online through search engines and Twitter, and what she brought back from this exploration that changed the way she works with C# and Entity Framework!Support this podcast at — https://redcircle.com/net-rocks/donations


dotNETpodcast
DocumentDB visto da vicino con Davide Benvegnù

dotNETpodcast

Play Episode Listen Later Apr 13, 2015 29:52


A close look at DocumentDB, with Davide Benvegnù.

dotNETpodcast
DocumentDB visto da vicino - Davide Benvegnù

dotNETpodcast

Play Episode Listen Later Apr 13, 2015 29:53


DocumentDB is Microsoft's first NoSQL database available on Azure, and its capabilities are genuinely interesting: it was designed to offer the best of both worlds, relational and NoSQL. Our guest, Davide Benvegnù, talks about DocumentDB and the whole series of new features introduced in this new NoSQL database. The episode was recorded during Community Days 2015, held at the end of March 2015 at Microsoft's Milan office.

SQL Down Under
SDU Show 64 with guest Ryan Crawcour

SQL Down Under

Play Episode Listen Later Oct 30, 2014 51:00


SDU Show 64 features Microsoft Azure DocumentDB team member Ryan Crawcour discussing what SQL Server DBAs and developers need to know about DocumentDB.

SQL Server Radio
Show 4 – Push or Pull?

SQL Server Radio

Play Episode Listen Later Sep 10, 2014 43:43


Two weeks have passed, and it’s time for our 4th show. As always, we have a mix of interesting stuff covering performance, data architecture, new services, tips, tricks, and much more. Among other topics, we talk about:
- Why pulling from a Linked Server is A LOT faster than pushing
- The effect of new hardware on Writelog waits
- Ways to improve the throughput of your data loading processes
- The architecture of Transaction Replication
- A weird bug with sequence
- DocumentDB, Microsoft’s new NoSQL database service
- String search solutions in SQL Server
- Why you should wear the Database Architect hat from time to time
- How Paul Simon is related to SQL Server

Items mentioned in the show:
- Guy Glantser – Parameterization Series
- Linchi Shea – Linked servers and performance impact: Direction matters!
- Maria Zakourdaev – Choosing SEQUENCE instead of IDENTITY? Watch your step
- SQLBits Session Recordings
- Klaus Aschenbrenner – Latches, Spinlocks, and Lock Free Data Structures
- 24 Hours of Pass
- Denny Cherry – PASS Summit 2014 Speaker Idol Competition
- SQL Server Days
- Microsoft NoSQL database and full-text search service previews available on Azure
- Bob Beauchemin – New Azure services and evolution of the Service/SQL Server relationship
- Stuart Ainsworth – From DBA to Data Architect: Changing Your Game (zip file for download)

MS Dev Show
DocumentDB with Ryan and Shireesh

MS Dev Show

Play Episode Listen Later Sep 5, 2014 66:32


We talk to Ryan and Shireesh from the DocumentDB team about the new hosted document database option in Azure. Google shows old versions of their homepage in older browsers. John Gruber and Jeff Atwood go head to head over Markdown. Microsoft plays nice.