Hey all, Jason here.

In this episode, recorded live in Las Vegas this Tuesday during Money20/20, I had the chance to talk with Oscilar cofounder and CEO Neha Narkhede. We had the chance to discuss:

* Neha's background as co-creator of Apache Kafka and cofounder of Confluent, which eventually scaled to a $10 billion IPO
* How Oscilar is helping both fintechs and banks — including household names like SoFi — to power real-time risk stacks
* Thinking about risk as a data and AI problem
* Why organizations need to move beyond point solutions and data silos to manage risk effectively
* The importance of real-time decisioning across the customer lifecycle, from onboarding/KYC to fraud, credit risk, and transaction monitoring
* And more!

Get full access to Fintech Business Weekly at fintechbusinessweekly.substack.com/subscribe
Shawn Tierney meets up with Connor Mason of Software Toolbox to learn about their company and products, as well as see a demo of their products in action, in this episode of The Automation Podcast. For any links related to this episode, check out the "Show Notes" located below the video.

Watch The Automation Podcast from The Automation Blog:

Listen to The Automation Podcast from The Automation Blog:

The Automation Podcast, Episode 248 Show Notes:

Special thanks to Software Toolbox for sponsoring this episode so we could release it "ad free!" To learn about Software Toolbox, please check out the links below:

* TOP Server
* Cogent DataHub
* Industries
* Case studies
* Technical blogs

Read the transcript on The Automation Blog: (automatically generated)

Shawn Tierney (Host): Welcome back to The Automation Podcast. My name is Shawn Tierney with Insights In Automation, and I wanna thank you for tuning back in this week. Now this week on the show, I meet up with Connor Mason from Software Toolbox, who gives us an overview of their product suite, and then he gives us a demo at the end. And even if you're listening, I think you're gonna find the demo interesting, because Connor does a great job of talking through what he's doing on the screen. With that said, let's go ahead and jump into this week's episode with Connor Mason from Software Toolbox.

I wanna welcome Connor from Software Toolbox to the show. Connor, it's really exciting to have you. It was just a lot of fun talking to your team as we prepared for this, and I'm really looking forward to it, because I know your company has had so many great solutions over the years. I really just wanna thank you for coming on the show. And before you jump into talking about products and technologies... Yeah. Could you first tell us just a little bit about yourself?

Connor Mason (Guest): Absolutely. Thanks, Shawn, for having us on. Definitely a pleasure to be a part of this environment. So my name is Connor Mason. Again, I'm with Software Toolbox. We've been around for quite a while, so we'll get into some of that history as well before we get into all the fun technical things. But, you know, I've worked a lot with the variety of OT and IT projects that are ongoing at this point. I've come up through our support side. It's definitely where we grow a lot of our technical skills. It's a big portion of our company. We'll get into that a little more. I'm currently a technical application consultant lead. So like I said, I help run our support team and help with these large solutions-based projects and consultations, to find what's best for you guys out there. There's a lot in our industry that is new and exciting. It's fast paced. Definitely keeps me busy. My background was actually in data analytics. I did not come through engineering, did not come through the automation trainings at all. So this was a whole new world for me about five years ago, and I've learned a lot, and I've really enjoyed it. So I really appreciate your time having us on here, Shawn.

Shawn Tierney (Host): Well, I appreciate you coming on. I'm looking forward to what you're gonna show us today. The audience should know I had a little preview of what they were gonna show, so I'm looking forward to it.

Connor Mason (Guest): Awesome. Well, let's jump right into it then. So like I said, we're here at Software Toolbox. We kinda have this ongoing logo and word map of "connect everything," and that's really where we lie.
Some people have called us data plumbers in the past. It’s all these different connections where you have something, maybe legacy or something new, you need to get into another system. Well, how do you connect all those different points to it? And, you know, throughout all these projects we worked on, there’s always something unique in those different projects. And we try to work in between those unique areas and in between all these different integrations and be something that people can come to as an expert, have those high level discussions, find something that works for them at a cost effective solution. So outside of just, you know, products that we offer, we also have a lot of just knowledge in the industry, and we wanna share that. You’ll kinda see along here, there are some product names as well that you might recognize. Our top server and OmniServer, we’ll be talking about LOPA as well. It’s been around in the industry for, you know, decades at this point. And also our symbol factory might be something you you may have heard in other products, that they actually utilize themselves for HMI and and SCADA graphics. That is that is our product. So you may have interacted it with us without even knowing it, and I hope we get to kind of talk more about things that we do. So before we jump into all the fun technical things as well, I kind of want to talk about just the overall software toolbox experience as we call it. We’re we’re more than just someone that wants to sell you a product. We we really do work with, the idea of solutions. How do we provide you value and solve the problems that you are facing as the person that’s actually working out there on the field, on those operation lines, and making things as well. And that’s really our big priority is providing a high level of knowledge, variety of the things we can work with, and then also the support. It’s very dear to me coming through the the support team is still working, you know, day to day throughout that software toolbox, and it’s something that has been ingrained into our heritage. Next year will be thirty years of software toolbox in 2026. So we’re established in 1996. Through those thirty years, we have committed to supporting the people that we work with. And I I I can just tell you that that entire motto lives throughout everyone that’s here. So from that, over 97% of the customers that we interact with through support say they had an awesome or great experience. Having someone that you can call that understands the products you’re working with, understands the environment you’re working in, understands the priority of certain things. If you ever have a plant shut down, we know how stressful that is. Those are things that we work through and help people throughout. So this really is the core pillars of Software Toolbox and who we are, beyond just the products, and and I really think this is something unique that we have continued to grow and stand upon for those thirty years. So jumping right into some of the industry challenges we’ve been seeing over the past few years. This is also a fun one for me, talking about data analytics and tying these things together. In my prior life and education, I worked with just tons of data, and I never fully knew where it might have come from, why it was such a mess, who structured it that way, but it’s my job to get some insights out of that. And knowing what the data actually was and why it matters is a big part of actually getting value. 
So if you have dirty data, if you have data that's just cluttered, that's in silos, very often you're not gonna get much value out of it. This was a study that we found from Gartner Research in 2024. Businesses were asked: what are the top strategic priorities for your data analytics function in 2024? And almost 50%, right at 49%, said that they wanted to improve data quality, and that was a strategic priority. That's about half the industry just talking about data quality, and it's exactly because of those reasons that gave me a headache in my prior life, looking at all these different things where I didn't even know where they came from or why they were so different. And the person that made that may be gone, may not have the context, and making that connection from the people that implemented things to the people that are making decisions is a very big task sometimes. So if we can create a better pipeline of data quality at the beginning, it makes those people's lives a lot easier up front and allows them to get value out of that data a lot quicker. And that's what businesses need.

Shawn Tierney (Host): You know, I wanna touch on data quality. Right? Mhmm. I think a lot of us, when we think of that, we think of, you know, error detection. We think of lost connections. We think of, you know, just garbage data coming through. But I think from an analytical side, there's a different view on that, you know, in line with what you were just saying. So when you're talking to somebody about data quality, how do you get them to shift gears and focus in on what you're talking about and not, like, a quality connection to the device itself?

Connor Mason (Guest): Absolutely. Yeah. I kinda live in both those worlds now. You know, I get to see that connection state. And when you're operating in real time, that quality is also very important to you. Mhmm. And I kind of use that in the same realm. Think of that when you're thinking in real time: if you know what's going on in the operation and where things are running, that's important to you. That's the quality that you're looking for. But you have to think beyond just real time. We're talking about historical data. We're talking about data that's been stored for months and years. Think about the quality of that data once it's made it up to that level. Are they gonna understand what was happening around those periods? Are they gonna understand what those tags even are? Are they gonna understand the conventions that you've implemented to give them insights into this operation? Is that a clear picture? So, yeah, you're absolutely right. There are two levels to this, and that is a big part of it: the real-time data and the historical. And we're gonna get into some of that in our demo as well. It's a big area for the business and for the people working in the operations.

Shawn Tierney (Host): Yeah. I think about quality too. You know, you may have data. It's good data. It was collected correctly. You had a good connection to the device. You got it as often as you want. But that data could really be useless. It could tell you nothing.

Connor Mason (Guest): Right. Exactly.

Shawn Tierney (Host): Right? It could be a flow rate on part of the process that's irrelevant to monitoring the actual production of the product or whatever you're making.
And, you know, I’ve known a lot of people who filled up their databases, their historians, with they just they just logged everything. And it’s like a lot of that data was what I would call low quality because it’s low information value. Right? Absolutely. I’m sure you run into that too. Connor Mason (Guest): Yeah. We we run into a lot of people that, you know, I’ve got x amount of data points in my historian and, you know, then we start digging into, well, I wanna do something with it or wanna migrate. Okay. Like, well, what do you wanna achieve at the end of this? Right? And and asking those questions, you know, it’s great that you have all these things historized. Are you using it? Do you have the right things historized? Are they even set up to be, you know, worked upon once they are historized by someone outside of this this landscape? And I think OT plays such a big role in this, and that’s why we start to see the convergence of the IT and OT teams just because that communication needs to occur sooner. So we’re not just passing along, you know, low quality data, bad quality data as well. And we’ll get into some of that later on. So to jump into some of our products and solutions, I kinda wanna give this overview of the automation pyramid. This is where we work from things like the field device communications. And you you have certain sensors, meters, actuators along the actual lines, wherever you’re working. We work across all the industries, so this can vary between those. Through there, you work up kind of your control area. A lot of control engineers are working. This is where I think a lot of the audience is very familiar with PLCs. Your your typical name, Siemens, Rockwell, your Schneiders that are creating, these hardware products. They’re interacting with things on the operation level, and they’re generating data. That that was kind of our bread and butter for a very long time and still is that communication level of getting data from there, but now getting it up the stack further into the pyramid of your supervisory, MES connections, and it’ll also now open to these ERP. We have a lot of large corporations that have data across variety of different solutions and also want to integrate directly down into their operation levels. There’s a lot of value to doing that, but there’s also a lot of watch outs, and a lot of security concerns. So that’ll be a topic that we’ll be getting into. We also all know that the cloud is here. It’s been here, and it’s it’s gonna continue to push its way into, these cloud providers into OT as well. There there’s a lot of benefit to it, but there there’s also some watch outs as this kind of realm, changes in the landscape that we’ve been used to. So there’s a lot of times that we wanna get data out there. There’s value into AI agents. It’s a hot it’s a hot commodity right now. Analytics as well. How do we get those things directly from shop floor, up into the cloud directly, and how do we do that securely? It’s things that we’ve been working on. We’ve had successful projects, continues to be an interest area and I don’t see it slowing down at all. Now, when we kind of begin this level at the bottom of connectivity, people mostly know us for our top server. This is our platform for industrial device connectivity. It’s a thing that’s talking to all those different PLCs in your plant, whether that’s brownfield or greenfield. We pretty much know that there’s never gonna be a plant that’s a single PLC manufacturer, that exists in one plant. 
There's always gonna be something that's slightly different. Definitely with brownfield, different engineers made different choices, things have been in place for years, and you gotta keep running them. TopServer provides this single platform to connect to a long laundry list of different PLCs. And if this sounds very familiar to KEPServer, well, you're not wrong. KEPServer is the same exact technology that TopServer is. What's the difference, then, is probably the biggest question we usually get. The difference technology-wise is nothing. The difference in the back end is that it's all the same product, same product releases, same price, but we have been the biggest single source of KEPServer, or TopServer, implementations in the market for almost two-plus decades at this point. So as the single biggest purchaser, we have this own-labeled version of KEPServer to provide to our customers. They interact with our support team, our solutions teams as well, and we sell it alongside the stack of other things because it fits so well. And we've been doing this since the early two thousands, when Kepware was a much smaller company than it is now, and we've had a really great relationship with them. So if you've enjoyed the technology of KEPServer, maybe there's some users out there, or if you've ever heard of TopServer and that has been unclear, I hope this clarifies it. But it is a great technology stack that we build upon, and we'll get into some of that in our demo. Now the other question is, what if you don't have a standard communication protocol, like Modbus, or an Allen-Bradley PLC as well? We see this a lot with, you know, testing areas, pharmaceuticals, maybe also in packaging: barcode scanners, weigh scales, printers on the line as well. They may have some form of basic communications that talks over just TCP or serial. And how do you get that information, which is still really valuable, but it's not going through a PLC, it's not going into your typical HMI and SCADA? It might be a very manual process for a lot of these test systems as well, how they're collecting and analyzing the data. Well, you may have heard of our OmniServer as well. It's been around, like I said, for a couple decades, and it's just a proven solution where, without coding, you can go in and build a custom protocol that expects a format from that device, translates it, puts it into standard tags, and now those tags can be accessible through the open standards of OPC, or through AVEVA SuiteLink as well. And that really provides a nice combination of your standard communications and also these more custom communications that may have been done through scripting in the past. Well, you know, put this onto an actual server that can communicate through those protocols natively, and just get that data into those SCADA systems, HMIs, where you need it.

Shawn Tierney (Host): You know, I used that. Many years ago, I had an integrator who came to me. This is back in the RSView days. He's like, Shawn, I got, like, 20 Eurotherm devices on a 485 network, and they speak ASCII, and I gotta get them into RSView32. And, you know, OmniServer, I love that you could basically develop your own protocol (and we did Omega and some other devices too). It's beautiful. And the fact that when you're testing it, it color codes everything. So you know, hey, that part worked. The header worked. The data worked.
Oh, the trailing didn't work, or the terminator didn't work, or the data's not in the right format. It was just a joy to work with back then, and I can imagine it's only gotten better since.

Connor Mason (Guest): Yeah. I think it's like a little engineer playground where you get in there and start really decoding and seeing how these devices communicate. And then once you've got it running, it's one of those things that just performs, and it's saved many people from developing custom code and having to manage that custom code and integrations, you know, for many years. So it's one of those things that's kinda tried and tested, and it's still kind of a staple of our base-level communications. Alright. So moving along our automation pyramid as well. Another part of our large offering is the Cogent DataHub. Some people may have heard of this as well. It's been around for a good while, and it's been part of our portfolio for a while as well. This starts building upon where we had the communication, now up to those higher echelons of the pyramid. This is gonna bring in a lot of different connectivities. If you're listening, it's kind of this hub-and-spoke type of concept for real-time data. We also have historical implementations. You can connect through a variety of different things: OPC, both the classic profiles for alarms and events, and even OPC UA alarms and conditions, which is still getting adoption across the industry, but it is growing as part of the OPC UA standard. We have integrations to MQTT. It can be its own MQTT broker, and it can also be an MQTT client. That has grown a lot. It's one of those things that lives beside OPC UA, not exactly a replacement. If you ever have any questions about that, it's definitely a topic I love to talk about. There's space for this to combine the benefits of both of these, and it's so versatile and flexible for these different types of implementations. On top of that, it's a really strong tool for conversion and aggregation. Like its name says, it's a data hub. You send all the different information to this. It stores it into a hierarchy, with a variety of different modeling that you can do within it. That's gonna store these values in a standard data format. Once I have data in this, on any of those different connections, I can then send data back out. So if I have anything that I know is coming in through a certain plug-in like OPC, I can bring that in and send it out on these other ones: OPC DA over to MQTT. It could even do DDE if I'm still using that, which I probably wouldn't suggest. But overall, there's a lot of benefit from having something that can also be a standardization between all your different connections. I have a lot of different things, maybe a variety of OPC servers, legacy or newer. Bring that into a DataHub, and then all your other connections, your historians, your MES, your SCADAs, can connect to that single point. So it's all getting the same data model and values from a single source rather than going out and making many-to-many connections. A large thing that it was originally used for was getting around DCOM. That word, you know, might send some shivers down people's spines still to this day. It's not a fun thing to deal with DCOM, and also with the security hardening, it's just not something that you really want to do.
I'm sure there's a lot of security professionals that would advise against ever doing it. This tunneling will allow you to have a DataHub that locally talks to any of the DA servers or clients, and communicate between two DataHubs over a tunnel that pushes the data just over TCP, takes away all the COM wrappers, and now you just have values that get streamed in between. Now you don't have to configure any DCOM at all, and it's all local. So for a lot of people transitioning between products, where maybe the server only supports OPC DA and the client now supports OPC UA, and they can't change it yet, this has allowed them to implement a solution quickly and at a cost-effective price, without ripping everything out.

Shawn Tierney (Host): You know, I wanna ask you too. I can see, because this thing is a data hub, so if you're listening and not watching, you're not gonna see, you know, server, client, UA, DA, broker, server, client, just all these different things up here on the slide. How does somebody find out if it does what they need? I mean, do you guys have a line they can call to say, I wanna do this to this, is that something DataHub can do? Or is there a demo? What would you recommend to somebody?

Connor Mason (Guest): Absolutely. Reach out to us. We have a lot of content online, and it's not behind any paywall or sign-in links even. You can always go to our website. It's just softwaretoolbox.com. Mhmm. And that's gonna get you to our product pages. You can download any product directly from there. They have demo timers. So typically with Cogent DataHub, after an hour, it will stop. You can just rerun it. And then call our team. Yeah. We have a solutions team that can work with you on, hey, what do I need? Then our support team, if you run into any issues, can help you troubleshoot that as well. So I'll have some contact information at the end that'll get people to, you know, where they need to go. But you're absolutely right, Shawn. Because this is so versatile, everyone's use case for it is usually something a little bit different. And the best people to come talk to about that is us, because we've seen all those differences.

Shawn Tierney (Host): I think a lot of people run into the fact, like, they have a problem. Maybe it's the one you said, where they have the OPC UA and it needs to connect to an OPC DA client. And, you know, a lot of times they're a little gun-shy about buying a license, because they wanna make sure it's gonna do exactly what they need first. And I think that's where it helps having your people who can, you know, answer their questions, saying, yes, we can do that, or no, we can't do that, or a demo that they could download and run for an hour at a time to actually do a proof of concept for the boss who's gonna sign off on purchasing this. And then the other thing is too, a lot of products like this have options. And you wanna make sure you're ticking the right boxes when you buy your license, because you don't wanna buy something you're not gonna use. You wanna buy the exact pieces you need. So I highly recommend, I mean, this product just does so much. I have, in my mind, like, five things I wanna ask right now, but I'm not gonna. But, yeah, definitely, when it comes to a product like this, it's great to touch base with these folks. They're super friendly and helpful, and they'll put you in the right direction.

Connor Mason (Guest): Yeah.
I I can tell you that’s working someone to support. Selling someone a solution that doesn’t work is not something I’ve been doing. Bad day. Right. Exactly. Yeah. And we work very closely, between anyone that’s looking at products. You know, me being as technical product managers, well, I I’m engaged in those conversations. And Mhmm. Yeah. If you need a demo license, reach out to us to extend that. We wanna make sure that you are buying something that provides you value. Now kind of moving on into a similar realm. This is one of our still somewhat newer offerings, I say, but we’ve been around five five plus years, and it’s really grown. And I kinda said here, it’s called OPC router, and and it’s not it’s not a networking tool. A lot of people may may kinda get that. It’s more of a, kind of a term about, again, all these different type of connections. How do you route them to different ways? It it kind of it it separates itself from the Cogent data hub, and and acting at this base level of being like a visual workflow that you can assign various tasks to. So if I have certain events that occur, I may wanna do some processing on that before I just send data along, where the data hub is really working in between converting, streaming data, real time connections. This gives you a a kind of a playground to work around of if I have certain tasks that are occurring, maybe through a database that I wanna trigger off of a certain value, based on my SCADA system, well, you can build that in in these different workflows to execute exactly what you need. Very, very flexible. Again, it has all these different type of connections. The very unique ones that have also grown into kind of that OT IT convergence, is it can be a REST API server and client as well. So I can be sending out requests to, RESTful servers where we’re seeing that hosted in a lot of new applications. I wanna get data out of them. Or once I have consumed a variety of data, I can become the REST server in OPC router and offer that to other applications to request data from itself. So, again, it can kind of be that centralized area of information. The other thing as we talked about in the automation pyramid is it has connections directly into SAP and ERP systems. So if you have work orders, if you have materials, that you wanna continue to track and maybe trigger things based off information from your your operation floors via PLCs tracking, how they’re using things along the line, and that needs to match up with what the SAP system has for, the amount of materials you have. This can be that bridge. It’s really is built off the mindset of the OT world as well. So we kinda say this helps empower the OT level because we’re now giving them the tools to that they understand what what’s occurring in their operations. And what could you do by having a tool like this to allow you to kind of create automated workflows based off certain values and certain events and automate some of these things that you may be doing manually or doing very convoluted through a variety of solutions. So this is one of those prod, products as well that’s very advanced in the things that supports. Linux and Docker containers is, is definitely could be a hot topic, rightly fleet rightfully so. And this can run on a on a Docker container deployed as well. So we we’ve seen that with the I IT folks that really enjoy being able to control and to higher deployment, allows you to update easily, allows you to control and spin up new containers as well. 
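To make the REST piece of that concrete, the snippet below is only an illustrative sketch of the kind of request/response exchange a workflow tool like OPC Router might sit in the middle of. The endpoint URL, routes, and JSON fields are hypothetical placeholders, not an actual OPC Router or SAP API.

```python
# Hypothetical sketch of the REST pattern described above; the URL, routes, and
# JSON fields are illustrative assumptions, not a real OPC Router or SAP API.
import requests

BASE_URL = "http://gateway.example.local:8080/api"  # assumed REST endpoint

# Push a production count (e.g., read from a PLC by a workflow) to a REST service.
payload = {"tag": "Line1.GoodParts", "value": 1523, "timestamp": "2025-01-01T12:00:00Z"}
requests.post(f"{BASE_URL}/tags", json=payload, timeout=5).raise_for_status()

# Pull a work-order quantity back down so it can be compared against production.
order = requests.get(f"{BASE_URL}/workorders/WO-1001", timeout=5).json()
print("Target quantity:", order.get("quantity"))
```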
This gives you a lot of flexibility to to deploy and manage these systems. Shawn Tierney (Host): You know, I may wanna have you back on to talk about this. I used to there’s an old product called Rascal that I used to use. It was a transaction manager, and it would based on data changing or on a time that as a trigger, it could take data either from the PLC to the database or from the database to the PLC, and it would work with stored procedures. And and this seems like it hits all those points, And it sounds like it’s a visual like you said, right there on the slide, visual workflow builder. Connor Mason (Guest): Yep. Shawn Tierney (Host): So you really piqued my interest with this one, and and it may be something we wanna come back to and and revisit in the future, because, it just it’s just I know that that older product was very useful and, you know, it really solved a lot of old applications back in the day. Connor Mason (Guest): Yeah. Absolutely. And this this just takes that on and builds even more. If you if anyone was, kind of listening at the beginning of this year or two, a conference called Prove It that was very big in the industry, we were there to and we presented on stage a solution that we had. Highly recommend going searching for that. It’s on our web pages. It’s also on their YouTube links, and it’s it’s called Prove It. And OPC router was a big part of that in the back end. I would love to dive in and show you the really unique things. Kind of as a quick overview, we’re able to use Google AI vision to take camera data and detect if someone was wearing a hard hat. All that logic and behind of getting that information to Google AI vision, was through REST with OPC router. Then we were parsing that information back through that, connection and then providing it back to the PLCs. So we go all the way from a camera to a PLC controlling a light stack, up to Google AI vision through OPC router, all on hotel Wi Fi. It’s very imp it’s very, very fun presentation, and, our I think our team did a really great job. So a a a pretty new offering I have I wanna highlight, is our is our data caster. This is a an actual piece of hardware. You know, our software toolbox is we we do have some hardware as well. It’s just, part of the nature of this environment of how we mesh in between things. But the the idea is that, there’s a lot of different use cases for HMI and SCADA. They have grown so much from what they used to be, and they’re very core part of the automation stack. Now a lot of times, these are doing so many things beyond that as well. What we found is that in different areas of operations, you may not need all that different control. You may not even have the space to make up a whole workstation for that as well. What this does, the data caster, is, just simply plug it plugs it into any network and into an HDMI compatible display, and it gives you a very easy configure workplace to put a few key metrics onto a screen. So if I have different things from you can connect directly to PLCs like Allen Bradley. You can connect to SQL databases. You can also connect to rest APIs to gather the data from these different sources and build a a a kind of easy to to view, KPI dashboard in a way. So if you’re on a operation line and you wanna look at your current run rate, maybe you have certain things in the POC tags, you know, flow and pressure that’s very important for those operators to see. They may not be, even the capacity to be interacting with anything. 
They just need visualizations of what's going on. This product can just be installed in, you know, industrial areas with any type of display that you can easily access, and give them something that they can easily look at. It's configured all through a web browser to display what you want. You can put on different colors based on levels of values as well. And it's just, I feel, a very simple thing. Sometimes it seems so simple, but those might be the things that provide value on the actual operation floor. This is, for anyone that's watching, kind of a quick view of a very simple screen. What we're showing here is what it would look like from all the different data sources. So talking directly to a ControlLogix PLC, talking to SQL databases, Micro 800s, a REST client, and, what's coming very soon, definitely by the end of this year, is OPC UA support. So any OPC UA server that's out there that already has your PLC data, etcetera, this could also connect to that and get values from there.

Shawn Tierney (Host): Can I... can you make it... here I go. Can you make it so it, like, changes pages every few seconds?

Connor Mason (Guest): Right now, it is a single page, but this is, like I said, a very new product, so we're taking any feedback. If, yeah, if there's this type of slideshow cycle that would be, you know, valuable to anyone out there, let us know. We're definitely always interested to hear from the people that are actually working out at these operation sites, what's valuable to them. Yeah.

Shawn Tierney (Host): A lot of kiosks you see when you're traveling, it'll say, like, line one... well, I'll just throw that out there. Line one, and that'll be on there for five seconds, and then it'll go to line two. That'll be on there for five seconds, and then line three... you know. I just mentioned that because I can see that being a question that I would get from somebody who was asking me about it.

Connor Mason (Guest): Oh, great question. Appreciate it. Alright. So now we're gonna set time for a little hands-on demo. For anyone that's just listening, I'm gonna talk about this at a high level and walk through everything. But the idea is that we have a few different PLCs, a very common Allen-Bradley and just a Siemens S7-1500 that's in our office, pretty close to me, on the other side of the wall, actually. We're gonna first start by connecting that to our TopServer, like we talked about. This is our industrial communication server that offers OPC DA, OPC UA, and SuiteLink connectivity as well. And then we're gonna bring this into our Cogent DataHub. This, we talked about, is getting those values up to these higher levels. What we'll be doing is also tunneling the data. We talked about being able to share data through the DataHubs themselves. I'll kinda explain why we're doing that here and the value you can add. And then we're also gonna showcase adding MQTT on to this level. Taking data now just from these two PLCs that are sitting on a rack, I can automatically make all that information available in the MQTT broker. So any MQTT client that's out there that wants to subscribe to that data now has that accessible. And I've created this all through a really simple workflow. We also have some databases connected. InfluxDB, which we install with Cogent DataHub, has a free visualization tool that kinda just helps you see what's going on in your processes.
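Before the walkthrough, here is a rough sense of what that OPC UA connectivity looks like from a client's side. This is only a sketch using the open-source python-opcua package (imported as opcua); the endpoint URL and node IDs are made-up placeholders, since real ones depend on how the TopServer channels, devices, and tags are configured.

```python
# Hedged sketch only: reading two tags from an OPC UA server such as TopServer.
# The endpoint URL and node IDs below are hypothetical placeholders.
from opcua import Client  # pip install opcua

client = Client("opc.tcp://localhost:49380")  # hypothetical UA endpoint
client.connect()
try:
    # Node IDs follow the usual Channel.Device.Tag pattern, but yours will differ.
    pressure = client.get_node("ns=2;s=Channel1.Device1.Pressure")
    tank_level = client.get_node("ns=2;s=Channel1.Device1.TankLevel")
    print("Pressure:", pressure.get_value())
    print("Tank level:", tank_level.get_value())
finally:
    client.disconnect()
```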
I wanna showcase a little bit of that visualization as well. Alright. So now jumping into our demo, where we first start off here is our TopServer. Like I mentioned before, if anyone has worked with KEPServer in the past, this is gonna look very similar. That's because it is, the same technology and all the things here. The first thing that I wanted to establish in our demo was our connection to our PLCs. I have a few here. We're only gonna use the Allen-Bradley and the Siemens, for the time that we have on our demo here. But how this builds out as a platform is you create these different channels and the device connections between them. This is gonna be your physical connection to them. It's either an Ethernet TCP/IP connection or maybe your serial connection as well. We have support for all of them. It really is a long list. Anyone watching out there, you can kind of see all the different drivers that we offer. So, bringing this into a single platform, you can have all your connectivity based here. All those different connections that you have up the stack, your SCADA, your historians, MES even as well, can all go to a single source. That makes management, troubleshooting, all of those a bit easier as well. So one of the first things I did here, I have this built out, but I'll kinda walk through what you would typically do. You have your Allen-Bradley ControlLogix Ethernet driver here first. You know, I have some IPs in here I won't show, but, regardless, we have our drivers here, and then we have a set of tags. These are all the global tags in the programming of the PLC. How I got these to map automatically is that in our driver, we're able to create tags automatically. So you're able to send a command to that device and ask for its entire tag database. It can come back, provide all that, map it out for you, and create those tags as well. This saves a lot of time versus, you know, an engineer having to go in and address all the individual items themselves. So once it's defined in the PLC project, you're able to bring this all in automatically. I'll show now how easy that makes connecting to something like the Cogent DataHub. In a very similar fashion, we have a connection over here to the Siemens PLC that I also have. You can see beneath it all these different tag structures, and this was created the exact same way. Where those PLCs support it, you can do an automatic tag generation, bring in all the structure that you've already built out in your PLC programming, and make this available on this OPC server now as well. So that's really the basis. We first need to establish communications to these PLCs, get that tag data, and now, what do we wanna do with it? So in this demo, what I wanted to bring up next was the Cogent DataHub. Here, I see a very similar kind of layout. We have a different set of plugins on the left side. So for anyone listening, the Cogent DataHub again is kind of our aggregation and conversion tool. All these different types of protocols, like OPC UA, OPC DA, and OPC A&E for alarms and events. We also support OPC UA Alarms and Conditions, which is the newer profile for alarms in OPC UA. We have a variety of different ways that you can get data out of things and data into the DataHub. We can also do bridging. This concept is how you share data in between different points.
So let’s say I had a connection to one OPC server, and it was communicating to a certain PLC, and there were certain registers I was getting data from. Well, now I also wanna connect to a different OPC server that has, entirely different brand of PLCs. And then maybe I wanna share data in between them directly. Well, with this software, I can just bridge those points between them. Once they’re in the data hub, I can do kind of whatever I want with them. I can then allow them to write between those PLCs and share data that way, and you’re not now having to do any type of hardwiring directly in between them, and then I’m compatible to communicate to each other. Through the standards of OPC and these variety of different communication levels, I can integrate them together. Shawn Tierney (Host): You know, you bring up a good point. When you do something like that, is there any heartbeat? Like, is there on the general or under under, one of these, topics? Is there are there tags we can use that are from DataHub itself that can be sent to the destination, like a heartbeat or, you know, the merge transactions? Or Connor Mason (Guest): Yeah. Absolutely. So with this as well, there’s pretty strong scripting engine, and I have done that in the past where you can make internal tags. And that that could be a a timer. It could be a counter. And and just kind of allows you to create your own tags as well that you could do the same thing, could share that, through bridge connection to a PLC. So, yeah, there there are definitely some people that had those cert and, you know, use cases where they wanna get something to just track, on this software side and get it out to those hardware PLCs. Absolutely. Shawn Tierney (Host): I mean, when you send out the data out of the PLC, the PLC doesn’t care to take my data. But when you’re getting data into the PLC, you wanna make sure it’s updating and it’s fresh. And so, you know, they throw a counter in there, the script thing, and be able to have that. As as long as you see that incrementing, you know, you got good data coming in. That’s that’s a good feature. Connor Mason (Guest): Absolutely. You know, another big one is the the redundancy. So what this does is beyond just the OPC, we can make redundancy to basically anything that has two things running of it. So any of these different connections. How it’s unique is what it does is it just looks at the buckets of data that you create. So for an example, if I do have two different OPC servers and I put them into two areas of, let’s say, OPC server one and OPC server two, I can what now create an OPC redundancy data bucket. And now any client that connects externally to that and wants that data, it’s gonna go talk to that bucket of data. And that bucket of data is going to automatically change in between sources as things go down, things come back up, and the client would never know what’s hap what that happened unless you wanted to. There are internal tasks to show what’s the current source and things, but the idea is to make this trans kind of hidden that regardless of what’s going on in the operations, if I have this set up, I can have my external applications just reading from a single source without knowing that there’s two things behind it that are actually controlling that. Very important for, you know, historian connections where you wanna have a full complete picture of that data that’s coming in. 
If you're able to make a redundant connection to two different servers and then allow that historian to talk to a single point, it doesn't have to control that switching back and forth. It will just see that data flow seamlessly as either one is up at that time. Beyond that as well, there are quite a few other different things in here. I don't think we have time to cover all of them. But for our demo, what I wanna focus on first is our OPC UA connection. This allows us both to act as an OPC UA client to get data from any servers out there, like our TopServer, and also to act as an OPC UA server itself. So if anything's coming in from, maybe, multiple connections to different servers, multiple connections to other things that aren't OPC as well, I can now provide all this data automatically in my own namespace to allow things to connect to me as well. And that's part of that aggregation feature and kind of the topic I was mentioning before. So with that, I have a connection here. It's pulling data all from my TopServer. I have a few different tags from my Allen-Bradley and my Siemens PLC selected. The next part of this, which I was mentioning, was the tunneling. Like I said, this is very popular to get around DCOM issues, but there's a lot of reasons why you still may use this beyond just the headache of DCOM and what it was. What this runs on is a TCP stream that takes all the data points as a value, a quality, and a timestamp, and it can mirror those into another DataHub instance. So if I wanna get things across a network, like out of my OT side, whereas previously I would have to come in and allow an open port onto my network for any OPC UA clients across the network to access that, I can now actually change the direction of this and tunnel data out of my network without opening up any ports. This is really big for security. If anyone out there is a security professional, or if you're working as an engineer you have to work with your IT and security teams a lot, you don't wanna have an open port, especially into your operations and OT side. So this allows you to change that direction of flow and push data out in this direction, into another area like a DMZ computer or up to a business-level computer as well. The other thing that I have configured in this demo: the benefit of having that tunnel streaming data across this connection is that I can also store this data locally in an InfluxDB database. The purpose of that is that I can actually historize this and then, if this connection ever goes down, backfill any information that was lost while that tunnel connection was down. In real-time data scenarios like OPC UA, unless you have historical access, you would lose a lot of data if that connection ever went down. But with this added layer on, I can actually use the back end of this InfluxDB to buffer any values and, when my connection comes back up, pass them along that stream again. And if I have anything that's historically connected, like another InfluxDB, maybe a PI historian, an AVEVA historian, any historian offering out there that can allow that connection, I can then provide all those records that were originally missed and backfill them into those systems. So I've switched over to a second machine. It's gonna look very similar here as well. This also has an instance of the Cogent DataHub running here.
For anyone not watching, what we actually have on this side is the portion of the tunneler that's sitting here and listening for any data requests coming in. So on my first machine, I was able to connect my PLCs and gather that information into the Cogent DataHub, and now I'm pushing that information across the network into a separate machine that's sitting here and listening to gather information. So what I can quickly do is just make sure I have all my data here. So I have these different points from my Allen-Bradley PLC. I have a few different simulation demo points, like temperature, pressure, tank level, a few statuses, and all this is updating directly through that stream as the PLC is updating it as well. I also have my Siemens controller. I have some current values and a few different counter tags as well. All of this, again, is being directly streamed through that tunnel. I'm not connecting to an OPC server at all on this side. I can show you that here. There are no connections configured. I'm not talking to the PLCs directly on this machine as well. But we're able to pass all the information through without opening up any ports on my OT demo machine, per se. So what's the benefit of that? Well, again, security. Also, the ability to do the store-and-forward mechanisms. On the other side, I was logging directly to an InfluxDB. This could be my buffer, and then I was able to configure it so that if any values were lost, it stores them and forwards them across the network. So now on this side, if I pull up Chronograf, which is a free visualization tool that installs with the DataHub as well, I can see some very nice visual diagrams of what is going on with this data. So I have a pressure that is just a simulator in this Allen-Bradley PLC that ramps up and comes back down. It's not actually connected to anything that's reading a real pressure, but you can see it over time, and I can kind of change through these different windows of time. I might go back a little far, but I have a lot of data that's been stored in here. For a while during my test, I turned this off and made it fail, but then I came back in, and I was able to recreate all the data and backfill it as well. So through these views, I can see that as data disconnects and as it comes back on, I have a very cyclical view of the data, because it was able to recover and store and forward from that source. Like I said, Shawn, data quality is a big thing in this industry. It's a big thing both for people on the operations side and for people making decisions in the business layer. So being able to have a full picture, without gaps, is definitely something that you should be prioritizing when you can.

Shawn Tierney (Host): Now what we're seeing here is you're using InfluxDB on this destination PC, or IT-side PC, and Chronograf, which was that utility or package that comes... gets installed. It's free. But you don't actually have to use that. You could have sent this in to an OSIsoft PI or... Exactly. ...somebody else's historian. Right? Can you name some of the historians you work with? I know OSIsoft PI.

Connor Mason (Guest): Yeah. Yeah. Absolutely. So there's quite a few different ones. As far as what we support in the DataHub natively: Amazon Kinesis, the cloud-hosted historians that we can also do the same things with from here as well, AVEVA Historian, AVEVA Insight, Apache Kafka. This is kind of a newer one as well that used to be a very IT-oriented solution, now getting into OT.
It's kind of a similar database structure where things are stored in different topics that we can stream to. On top of that, just regular old ODBC connections. That opens up a lot of different ways you can do it, or even the old classic OPC HDA. So if you have any historians that can act as an OPC HDA connection, we can also stream it through there.

Shawn Tierney (Host): Excellent. That's a great list.

Connor Mason (Guest): The other thing I wanna show while we still have some time here is that MQTT component. This is really growing, and it's gonna continue to be a part of the industrial automation technology stack and conversations moving forward, for streaming data, you know, from devices, edge devices, up into different layers, both now into the OT, and then maybe out to IT, into our business levels as well, and definitely into the cloud, as we're seeing a lot of growth into it. Like I mentioned with DataHub, the big benefit is I have all these different connections. I can consume all this data. Well, I can also act as an MQTT broker. And what a broker typically does in MQTT is just route data and share data. It's kind of that central point where things come to it to either say, hey, I'm giving you some new values, share it with someone else, or, hey, I need these values, can you give me that? It really fits in super well with what this product is at its core. So all I have to do here is just enable it. What that now allows is, I have an example here, MQTT Explorer. If anyone has worked with MQTT, you're probably familiar with this. There's nothing else I configured beyond just enabling the broker. And you can see within this structure, I have all the same data that was in my DataHub already, the same things I was collecting from my PLCs and TopServer. Now I have these as MQTT points, and now I have them in JSON format with their value and their timestamp. You can even see, like, a little trend here kind of matching what we saw in Influx. And now this enables all those different cloud connectors that wanna speak this language to do it seamlessly.

Shawn Tierney (Host): So you didn't have to set up the PLCs a second time to do this? Nope.

Connor Mason (Guest): Not at all.

Shawn Tierney (Host): You just enabled this, and now the data's going this way as well. Exactly.

Connor Mason (Guest): Yeah. That's a really strong point of the Cogent DataHub: once you have everything in its structure and model, you just enable it to use any of these different connections. You can get really, really creative with these different things, like we talked about with the bridging aspect and getting into different systems, even writing down to the PLCs. You can make custom notifications and email alerts based on any of these values. You could even take something like this MQTT connection, tunnel it across to another DataHub as well, and maybe then convert it to OPC DA. And now you've made a new connection over to something that's very legacy as well.

Shawn Tierney (Host): Yeah. I mean, the options here are just pretty amazing, all the different things that can be done.

Connor Mason (Guest): Absolutely. Well, you know, I wanna jump back into some of our presentation here while we still got the time. Now that we're kinda done with our demo, there's so many different ways that you can use these different tools.
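For readers who want to picture the MQTT side of the demo, here is a minimal, hypothetical sketch of a client subscribing to a broker such as the one built into Cogent DataHub, using the paho-mqtt package. The host, port, topic filter, and JSON payload shape are assumptions for illustration, not the product's documented format.

```python
# Hedged sketch (paho-mqtt 1.x callback signatures assumed); broker host/port,
# topic filter, and payload shape are illustrative assumptions.
import json
import paho.mqtt.client as mqtt  # pip install "paho-mqtt<2"

def on_connect(client, userdata, flags, rc):
    client.subscribe("datahub/#")  # hypothetical topic filter

def on_message(client, userdata, msg):
    try:
        point = json.loads(msg.payload)
        print(msg.topic, "=", point.get("value"), "@", point.get("timestamp"))
    except ValueError:
        print(msg.topic, "= (non-JSON payload)", msg.payload)

client = mqtt.Client()
client.on_connect = on_connect
client.on_message = on_message
client.connect("localhost", 1883)  # hypothetical broker address
client.loop_forever()
```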
This slide is just a really simple view of something that used to be very simple, just connecting OPC servers to a variety of different connections, now expanded on with that store and forward, the local InfluxDB usage, and getting out to things like MQTT as well. But there's a lot more you can do with these solutions. So like Shawn said, reach out to us. We're happy to engage and see what we can help you with. I have a few other things before we wrap up. Just overall, we've worked across nearly every industry. We have installations across the globe, on all continents. And like I said, we've been around for pushing thirty years next year. So we've seen a lot of different things, and we really wanna talk to anyone out there that maybe has some struggles going on with just connectivity, or who has any ongoing projects. If you work in these different industries, or if there's nothing marked here and you have anything going on that you need help with, we're very happy to sit down and let you know if there's something we can do there.

Shawn Tierney (Host): Yeah. For those who are listening, I mean, we see most of the big energy and consumer product companies on that slide. So I'm not gonna read them off, but it's just a lot of car manufacturers. You know, these are the household name brands that everybody knows and loves.

Connor Mason (Guest): So to kind of wrap some things up here. We talked about all the different ways that we've helped solve things in the past, but I wanna highlight some of the unique ones that we've also gone and done some case studies and success stories on. So this one I actually got to work on within the last few years: a plastic packaging manufacturer was looking to track uptime and downtime across multiple different lines, and they had a new cloud solution that they were already evaluating. They were really excited to get it into play. They had a lot of upside to getting things connected to this and starting to use it. Well, what they had was a lot of different PLCs, a lot of different brands, different areas, different, you know, areas of operation that they needed to connect to. So what they did was first get that into our TopServer, kind of similar to what we showed in our demo. We just need to get all the data into a centralized platform first, get that data accessible. Then from there, once they had all that information in a centralized area, they used the Cogent DataHub as well to help aggregate that information and transform it to be sent to the cloud through MQTT. So very similar to the demo here, this is actually a real use case of that: getting information from PLCs, structuring it into how that cloud system needed it for MQTT, and streamlining that data connection to where it's now just running in operation. They constantly have updates about where their lines are in operation, tracking their downtime, tracking their uptime as well, and then being able to do some predictive analytics in that cloud solution based on their history. So this really enabled them to build from what they had existing, which was a lot of manual tracking, into an entirely automated system with management able to see real views of what's going on at the operation level. Another one I wanna talk about: we were able to do this success story with Ace Automation, who worked with a pharmaceutical company.
Ace Automation is an SI, and they were brought in and doing a lot of work with some old DDE connections and some custom Excel macros, and they were just having a hard time maintaining some legacy systems that were a pain to deal with. They were working with these older history files from some old InTouch HMIs, and what they needed was something that was not just based on Excel and custom macros. So one product we didn't get to talk about yet, but that we also carry, is our LGH File Inspector. It's able to take these files and put them out into a standardized format like CSV, and also do a lot of that automation: when should these files be queried? Should they be queried for different lengths? Should they be output to different areas? Can I set these up in a scheduled task so it can be done automatically, rather than someone having to sit down and do it manually in Excel? So they were able to recover over fifty hours of engineering time with the solution, from no longer having to do late-night calls to troubleshoot an Excel macro that stopped working, and from crashing machines, because they were running legacy systems to still support some of the DDE servers, saving them, you know, almost two hundred plus hours of productivity. Another example: we were able to work with a renewable energy customer that's doing a lot of innovative things across North America. They had a very ambitious plan to double their footprint in the next two years. And with that, they had to really look back at their assets and see where they currently stand: how do we make new standards to support us growing into what we want to be? So with this, they had a lot of different data sources currently. They were all kind of siloed at specific areas. Nothing was really connected commonly to a corporate-level area of historization, or control and security. So again, they were able to use our TopServer and put out a standard connectivity platform, and bring in the DataHub as an aggregation tool. So each of these sites would have a TopServer that was individually collecting data from different devices, and then that was able to send it into a single DataHub. So now their corporate level had an entire view of all the information from these different plants in one single application. That then enabled them to connect their historian applications to that DataHub and have a complete view and make visualizations off of their entire operations. What this allowed them to do was grow without replacing everything. And that's a big thing that we strive for: not replacing and ripping out all your existing technologies. That's not something you can do overnight. But how do we provide value and gain efficiency with what's in place, providing newer technologies on top of that without disrupting the actual operation as well? So this was really, really successful. And at the end, I just wanna provide some other contacts and information so people can learn more. We have a blog that goes out every week on Thursdays. A lot of good technical content out there, a lot of recaps of the awesome things we get to do here, the success stories as well, and you can always find that at blog.softwaretoolbox.com. And again, our main website is just softwaretoolbox.com. You can get product information, downloads, and reach out to anyone on our team. Let's discuss what issues you have going on, any new projects; we'll be happy to listen.
Shawn Tierney (Host): Well, Connor, I wanna thank you very much for coming on the show and bringing us up to speed on not only Software Toolbox, but also TOP Server, and for doing that demo with TOP Server and DataHub. Really appreciate that. And I think, you know, like you just said, if anybody has any projects that you think these solutions may be able to solve, please give them a call. And if you've already done something with them, leave a comment. You know, no matter where you're watching or listening to this, let us know what you did. What did you use? Like me, I used OmniServer all those many years ago, and, of course, TOP Server as an OPC server. And, of course, Symbol Factory, I use that all the time. So if you guys have already used Software Toolbox products, let us know in the comments. It's always great to hear from people out there. I know, you know, with thousands of you guys listening every week, I'd love to hear, are you using these products? Or if you have questions, I'll funnel them over to Connor if you put them in the comments. So with that, Connor, did you have anything else you wanted to cover before we close out today's show? Connor Mason (Guest): I think that was it, Shawn. Thanks again for having us on. It was really fun. Shawn Tierney (Host): I hope you enjoyed that episode, and I wanna thank Connor for taking time out of his busy schedule to come on the show and bring us up to speed on Software Toolbox and their suite of products. Really appreciated that demo at the end too, so if you're watching, we actually got a look at their products and how they work. And I just really appreciate them taking all of my questions. I also appreciate the fact that Software Toolbox sponsored this episode, meaning we were able to release it to you without any ads. So I really appreciate them. If you're doing any business with Software Toolbox, please thank them for sponsoring this episode. And with that, I just wanna wish you all good health and happiness. And until next time, my friends, peace. Until next time, Peace ✌️ If you enjoyed this content, please give it a Like, and consider sharing a link to it, as that is the best way for us to grow our audience, which in turn allows us to produce more content.
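The plastic-packaging case study Connor describes above ends with structured PLC data being published to a cloud platform over MQTT (TOP Server for connectivity, Cogent DataHub for aggregation and reshaping). As a rough, hedged illustration of just that last publish step, here is a minimal Java sketch using the Eclipse Paho MQTT client; the broker address, topic name, and JSON payload shape are assumptions for illustration only, and in the real solution the DataHub performs this step without any custom code.

```java
import java.nio.charset.StandardCharsets;
import org.eclipse.paho.client.mqttv3.MqttClient;
import org.eclipse.paho.client.mqttv3.MqttConnectOptions;
import org.eclipse.paho.client.mqttv3.MqttMessage;

public class LineStatusPublisher {
    public static void main(String[] args) throws Exception {
        // Hypothetical broker URL, client id, and topic; in the case study the
        // Cogent DataHub handles this publish step with no custom code.
        MqttClient client = new MqttClient("tcp://broker.example.com:1883", "line7-status-publisher");
        MqttConnectOptions options = new MqttConnectOptions();
        options.setCleanSession(true);
        client.connect(options);

        // One uptime/downtime sample, shaped the way the (assumed) cloud platform expects it.
        String payload = "{\"line\":\"7\",\"state\":\"RUNNING\",\"timestamp\":\"2024-01-01T00:00:00Z\"}";
        MqttMessage message = new MqttMessage(payload.getBytes(StandardCharsets.UTF_8));
        message.setQos(1); // at-least-once delivery, so brief disconnects don't drop samples

        client.publish("plant/line7/status", message);
        client.disconnect();
    }
}
```

QoS 1 is used here so each uptime/downtime sample survives brief connection drops, which mirrors the store-and-forward behavior mentioned in the episode.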
In banking, every second matters. Fraud happens in milliseconds. Customers demand instant answers. And AI can only deliver value if it's powered by live, real-time data. Yet many banks are still relying on batch reports and outdated systems, making decisions based on yesterday's insights. The shift can't wait. Forrester predicts that by 2025, half of all businesses will use AI-powered self-service as their primary customer touchpoint. That future won't be possible without real-time data at the core. Banks that leverage streaming data will transform customer experiences, manage risks more efficiently, and unlock the full potential of AI. Those who don't risk being left behind. Today, I'm joined by Guillaume Aymé, CEO of Lenses.io and a leading voice on data innovation. Together, we'll explore why real-time data is becoming the lifeblood of modern banking, the hurdles institutions must overcome, and how to build the foundation for AI-driven success. This episode of Banking Transformed is sponsored by Lenses. Lenses 6.0 is a Developer Experience designed to empower organizations to modernize applications and systems with real-time data autonomy. This is particularly crucial as AI adoption accelerates, and enterprises operate hundreds of Kafka clusters across multi-cloud environments. As the industry's first multi-Kafka developer experience, Lenses 6.0 allows teams to access, govern and process streaming data across any combination of Apache Kafka-based streaming platforms, from a single interface. https://lenses.io/
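To make the "fraud happens in milliseconds" point concrete, the streaming pattern this episode advocates looks roughly like the following: a consumer subscribed to a stream of transactions scores each event as it arrives instead of waiting for a nightly batch. This is a minimal, hedged sketch using the Apache Kafka Java client; the broker address, topic name, consumer group, and the toy flagging rule are illustrative assumptions, not anything from the episode.

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class TransactionScreener {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");   // assumed broker address
        props.put("group.id", "fraud-screening");           // assumed consumer group
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("transactions"));     // hypothetical topic of payment events
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(100));
                for (ConsumerRecord<String, String> record : records) {
                    // A real system would call a feature store and a risk model here;
                    // this toy rule just stands in for "score the event as it arrives".
                    if (record.value().contains("\"amount_over_10k\":true")) {
                        System.out.printf("flag account=%s offset=%d%n", record.key(), record.offset());
                    }
                }
            }
        }
    }
}
```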
AWS Morning Brief for the week of Monday, June 30th, with Corey Quinn. Links:What is AWS Security Hub?Amazon data center complexCode reviews in your IDEAWS Local Zones Features - AWS Last Week in AWS Slack communityAmazon VPC raises default Route Table capacityAnnouncing Intelligent Search for re:Post and re:Post PrivateHow to Set Up Automated Alerts for Newly Purchased AWS Savings PlansIntroducing AWS Lambda native support for Avro and Protobuf formatted Apache Kafka events
How do you retrofit a clustered data-processing system to use cheap commodity storage? That's the big question in this episode as we look at one of the many attempts to build a version of Kafka that uses object storage services like S3 as its main disk, sacrificing a little latency for cheap, infinitely-scalable disks.There are several companies trying to walk down that road, and it's clearly big business - one of them recently got bought out for a rumoured $250m. But one of them is actively trying to get those changes back into the community, pushing to make Apache Kafka speak object storage natively.Joining me to explain why and how are Josep Prat and Filip Yonov of Aiven. We break down what it takes to make Kafka's storage layer optional on a per-topic basis, how they're making sure it's not a breaking change, and how they plan to get such a foundational feature merged.–Announcement Post: https://aiven.io/blog/guide-diskless-apache-kafka-kip-1150Aiven's (Temporary) Fork, Project Inkless: https://github.com/aiven/inkless/blob/main/docs/inkless/README.mdKafka Improvement Proposal (KIP) Articles: KIP-1150: https://cwiki.apache.org/confluence/display/KAFKA/KIP-1150%3A+Diskless+Topics KIP-1163: Diskless Core: https://cwiki.apache.org/confluence/display/KAFKA/KIP-1163%3A+Diskless+Core KIP-1164: Topic Based Batch Coordinator: https://cwiki.apache.org/confluence/display/KAFKA/KIP-1164%3A+Topic+Based+Batch+Coordinator KIP-1165: Object Compaction for Diskless: https://cwiki.apache.org/confluence/display/KAFKA/KIP-1165%3A+Object+Compaction+for+DisklessSupport Developer Voices on Patreon: https://patreon.com/DeveloperVoicesSupport Developer Voices on YouTube: https://www.youtube.com/@developervoices/joinFilip on LinkedIn: https://www.linkedin.com/in/filipyonovJosep on LinkedIn: https://www.linkedin.com/in/jlprat/Kris on Bluesky: https://bsky.app/profile/krisajenkins.bsky.socialKris on Mastodon: http://mastodon.social/@krisajenkinsKris on LinkedIn: https://www.linkedin.com/in/krisjenkins/
AWS Morning Brief for the week of Tuesday, May 27th with Corey Quinn. Links:Amazon Aurora reduces cross-Region Global Database Switchover time to typically under 30 secondsAmazon MSK adds support for Apache Kafka version 4.0AWS Control Tower releases Enabled controls view for centralized visibility - AWSAWS Cost Anomaly Detection enables advanced alerting through AWS User NotificationsAWS service changesDynamoDB local is now accessible on AWS CloudShellJoin Us at FinOps X 2025: Your Guide to All Things AWSIntroducing the AWS Product Lifecycle page and AWS service availability updatesJoin AWS Cloud Infrastructure Day to learn cutting-edge innovations building global cloud infrastructureHow to secure your instances with multi-factor authenticationCost Optimization for Healthcare on AWSCORS configuration through Amazon CloudFrontIntroducing Strands Agents, an Open Source AI Agents SDK | AWS Open Source BlogAndy Jassy's leadership lesson he practices at work and at home
I'm joined this week by one of the authors of Apache Kafka In Action, to take a look at the state of Kafka, event systems & stream-processing technology. It's an approach (and a whole market) that's had at least a decade to mature, so how has it done? What does Kafka offer to developers and businesses, and which parts do they actually care about? What have streaming data systems promised and what have they actually delivered? What's still left to build?–Apache Kafka in Action: https://www.manning.com/books/apache-kafka-in-actionPat Helland, Data on the Inside vs Data on the Outside: https://queue.acm.org/detail.cfm?id=3415014Out of the Tar Pit: https://curtclifton.net/papers/MoseleyMarks06a.pdfMartin Kleppmann, Turning the Database Inside-Out: https://martin.kleppmann.com/2015/11/05/database-inside-out-at-oredev.htmlData Mesh by Zhamak Dehghani: https://www.amazon.co.uk/Data-Mesh-Delivering-Data-Driven-Value/dp/1492092398Quix Streams: https://github.com/quixio/quix-streamsXTDB: https://xtdb.com/Support Developer Voices on Patreon: https://patreon.com/DeveloperVoicesSupport Developer Voices on YouTube: https://www.youtube.com/@developervoices/joinAnatoly's Website: https://zelenin.de/Kris on Mastodon: http://mastodon.social/@krisajenkinsKris on LinkedIn: https://www.linkedin.com/in/krisjenkins/Kris on Twitter: https://twitter.com/krisajenkins
An airhacks.fm conversation with Francesco Nigro (@forked_franz) about: JCTools as a Java concurrency utility library created by Nitsan Wakart, the history of JCTools and how Cliff Click donated his non-blocking HashMap algorithm to the project, contributions to JCTools including wait-free queue implementations, Apache Storm vs. Apache Kafka, explanation of how JCTools improves upon Java's standard concurrent queues by reducing garbage creation and optimizing memory layout, the difference between linked node implementations in standard Java collections versus array-based implementations in JCTools, detailed explanation of linearizability as a property of concurrent algorithms, the challenges of implementing concurrent data structures that maintain proper ordering guarantees, explanation of lock-free versus wait-free algorithms and their progress guarantees, discussion of the xadd instruction in x86 processors and how it's used in JCTools for atomic operations, the implementation of MessagePassingQueue API in JCTools that provides relaxed guarantees for better performance, comparison between JCTools and other solutions like Disruptor, explanation of how JCTools achieves 400 million operations per second in single-producer single-consumer scenarios, discussion of cooperative algorithms for multi-producer scenarios, the use of padding to avoid false sharing in concurrent data structures, the implementation of code generation in JCTools to create different flavors of queues, the use of Unsafe and AtomicLongFieldUpdater for low-level operations, real-world applications in high-frequency trading and medical data processing, integration of JCTools with quarkus and mutiny frameworks, the importance of proper memory layout for performance Francesco Nigro on twitter: @forked_franz
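For a feel of the single-producer/single-consumer pattern the conversation centers on, here is a small, hedged Java sketch using JCTools' SpscArrayQueue; the capacity, message count, and busy-spin strategy are illustrative choices, not Francesco's benchmark setup.

```java
import org.jctools.queues.SpscArrayQueue;

public class SpscDemo {
    public static void main(String[] args) throws InterruptedException {
        // Bounded, array-backed, single-producer/single-consumer queue
        // (JCTools typically rounds the capacity up to a power of two).
        SpscArrayQueue<Long> queue = new SpscArrayQueue<>(1024);
        final long total = 1_000_000;

        Thread producer = new Thread(() -> {
            for (long i = 0; i < total; i++) {
                while (!queue.offer(i)) {
                    Thread.onSpinWait(); // queue full: busy-spin, no locks, no garbage
                }
            }
        });

        Thread consumer = new Thread(() -> {
            long received = 0;
            while (received < total) {
                Long value = queue.poll();
                if (value == null) {
                    Thread.onSpinWait(); // queue empty: spin until the producer catches up
                } else {
                    received++;
                }
            }
            System.out.println("received " + received + " messages");
        });

        producer.start();
        consumer.start();
        producer.join();
        consumer.join();
    }
}
```

The array-based layout and single-writer/single-reader contract are what let this kind of queue avoid the garbage and contention of the linked-node queues in the standard library, which is the contrast discussed in the episode.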
In an era where real-time decision-making is a serious competitive advantage, streaming-first architectures are revolutionizing how organizations process and act on data. Unlike traditional batch-oriented systems, streaming platforms like Apache Kafka, Redpanda, Apache Flink, and Apache Pulsar enable continuous data ingestion, transformation, and analysis at scale. These technologies empower businesses to break free from the limitations of periodic data updates, unlocking the ability to react instantly to events, personalize customer experiences in real-time, and drive automation with high-velocity insights. By decoupling producers and consumers through scalable, event-driven pipelines, streaming-first architectures not only enhance system resilience but also pave the way for a more agile, intelligence-driven enterprise. Register for this episode of DM Radio to learn how today's innovators are leveraging this rapidly evolving discipline.
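The "decoupling producers and consumers" idea described here is easiest to see in code: a producer appends events to a topic and never needs to know which downstream systems will read them. A minimal, hedged sketch with the Apache Kafka Java client follows; the broker address, topic, key, and payload are placeholders.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class ClickEventProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");   // placeholder broker
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("acks", "all");                            // wait for replication before acknowledging

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // The producer only knows the topic; any number of downstream consumers
            // (analytics, personalization, automation) can read the same events later.
            ProducerRecord<String, String> record =
                    new ProducerRecord<>("clickstream", "user-42", "{\"page\":\"/pricing\"}");
            producer.send(record, (metadata, exception) -> {
                if (exception != null) {
                    exception.printStackTrace();
                } else {
                    System.out.printf("wrote to %s-%d @ offset %d%n",
                            metadata.topic(), metadata.partition(), metadata.offset());
                }
            });
        }
    }
}
```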
This interview was recorded for the GOTO Book Club.http://gotopia.tech/bookclubRead the full transcription of the interview hereKate Stanley - Principal Software Engineer at Red Hat & Co-Author of "Kafka Connect"Mickael Maison - Senior Principal Software Engineer at Red Hat & Co-Author of "Kafka Connect"Danica Fine - Lead Developer Advocate, Open Source at SnowflakeRESOURCESKatehttps://fosstodon.org/@katherishttps://www.linkedin.comMickaelhttps://bsky.app/profile/mickaelmaison.bsky.socialhttps://mas.to/@MickaelMaisonhttps://www.linkedin.comhttps://mickaelmaison.comDanicahttps://bsky.app/profile/thedanicafine.bsky.socialhttps://data-folks.masto.host/@thedanicafinehttps://www.linkedin.comhttps://linktr.ee/thedanicafineLinkshttps://kafka.apache.orghttps://flink.apache.orghttps://debezium.iohttps://strimzi.ioDESCRIPTIONDanica Fine together with the authors of “Kafka Connect” Kate Stanley and Mickael Maison, unpack Kafka Connect's game-changing power for building data pipelines—no tedious custom scripts needed! Kate and Mickael Maison discuss how they structured the book to help everyone, from data engineers to developers, tap into Kafka Connect's strengths, including Change Data Capture (CDC), real-time data flow, and fail-safe reliability.RECOMMENDED BOOKSKate Stanley & Mickael Maison • Kafka ConnectShapira, Palino, Sivaram & Petty • Kafka: The Definitive GuideViktor Gamov, Dylan Scott & Dave Klein • Kafka in ActionBlueskyTwitterInstagramLinkedInFacebookCHANNEL MEMBERSHIP BONUSJoin this channel to get early access to videos & other perks:https://www.youtube.com/channel/UCs_tLP3AiwYKwdUHpltJPuA/joinLooking for a unique learning experience?Attend the next GOTO conference near you! Get your ticket: gotopia.techSUBSCRIBE TO OUR YOUTUBE CHANNEL - new videos posted daily!
Data streaming and stream processing with Apache Kafka and its ecosystem. A whole lot of processes in software development, or in data processing generally, don't have to happen at runtime; they can be handled asynchronously or in a decentralized way. Terms like batch processing and message queueing / pub-sub are familiar for that. But there's a third player in this game: stream processing. Here, Apache Kafka is the flagship, i.e. the distributed event streaming platform that's usually named first. But what actually is stream processing, and how does it differ from batch processing or message queuing? How does Kafka work, and why is it so successful and performant? What are brokers, topics, partitions, producers, and consumers? What does Change Data Capture mean, and what is a sliding window? What do you have to watch out for, and what can go wrong when you want to write and read a message? The answers, and much more, come from our guest Stefan Sprenger. Bonus: how to describe stream processing using a breakfast table for five-year-olds. You can find our current advertising partners at https://engineeringkiosk.dev/partners Quick feedback on this episode:
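Since the episode above asks what topics, partitions, producers, consumers, and windows look like in practice, here is a minimal, hedged Kafka Streams sketch that counts events per key in five-minute tumbling windows (a close cousin of the sliding window discussed). The application id, topic name, and window size are illustrative assumptions, and the window API shown is from recent Kafka 3.x releases.

```java
import java.time.Duration;
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.TimeWindows;

public class PageViewWindowCount {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "pageview-window-count"); // assumed app id
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");     // assumed broker
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        builder.stream("pageviews")                                      // hypothetical input topic
               .groupByKey()                                             // group records by their key
               .windowedBy(TimeWindows.ofSizeWithNoGrace(Duration.ofMinutes(5)))
               .count()                                                  // continuously updated count per window
               .toStream()
               .foreach((windowedKey, count) ->
                       System.out.println(windowedKey + " -> " + count));

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}
```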
In this engaging conversation at the All Things Open conference, Tim Spann, Principal Developer Advocate at Zilliz, discusses the importance of community collaboration in advancing AI technologies. He emphasizes the need for diverse perspectives in solving complex problems and highlights his work with the Milvus open source vector database. Tim also explains the evolving landscape of retrieval augmented generation (RAG) and its applications and shares insights into the future of AI development. The conversation concludes on a lighter note with Tim describing his creative use of Milvus in a fun Halloween project to catalog and identify ghosts. 00:00 Introduction 00:41 Meet Tim Spann: Principal Developer Advocate 01:35 The Importance of Community in AI 02:56 Advanced RAG and Multimodal Models 06:17 The Future of Agentic RAG 09:04 Challenges and Excitement in AI Development 13:35 Building AI the Right Way 17:50 Fun with AI: Capturing Ghosts 19:24 Conclusion and Final Thoughts Guest: Tim Spann is a Principal Developer Advocate for Zilliz and Milvus. He works with Apache NiFi, Apache Kafka, Apache Pulsar, Apache Flink, Flink SQL, Apache Pinot, Trino, Apache Iceberg, DeltaLake, Apache Spark, Big Data, IoT, Cloud, AI/DL, machine learning, and deep learning. Tim has over ten years of experience with the IoT, big data, distributed computing, messaging, streaming technologies, and Java programming. Previously, he was a Principal Developer Advocate at Cloudera, Developer Advocate at StreamNative, Principal DataFlow Field Engineer at Cloudera, a Senior Solutions Engineer at Hortonworks, a Senior Solutions Architect at AirisData, a Senior Field Engineer at Pivotal and a Team Leader at HPE. He blogs for DZone, where he is the Big Data Zone leader, and runs a popular meetup in Princeton & NYC on Big Data, Cloud, IoT, deep learning, streaming, NiFi, the blockchain, and Spark. Tim is a frequent speaker at conferences such as ApacheCon, DeveloperWeek, Pulsar Summit and many more. He holds a BS and MS in computer science.
What does it take to go from leading Kafka development at Confluent to becoming a key figure in the PostgreSQL world? Join us as we talk with Gwen Shapira, co-founder and chief product officer at Nile, about her transition from cloud-native technologies to the vibrant PostgreSQL community. Gwen shares her journey, including the shift from conferences like O'Reilly Strata to PostgresConf and JavaScript events, and how the Postgres community is evolving with tools like Discord that keep it both grounded and dynamic.We dive into the latest developments in PostgreSQL, like hypothetical indexes that enable performance tuning without affecting live environments, and the growing importance of SSL for secure database connections in cloud settings. Plus, we explore the potential of integrating PostgreSQL with Apache Arrow and Parquet, signaling new possibilities for data processing and storage.At the intersection of AI and PostgreSQL, we examine how companies are using vector embeddings in Postgres to meet modern AI demands, balancing specialized vector stores with integrated solutions. Gwen also shares insights from her work at Nile, highlighting how PostgreSQL's flexibility supports SaaS applications across diverse customer needs, making it a top choice for enterprises of all sizes.Follow Gwen on:Nile BlogX (Twitter)LinkedInNile DiscordWhat's New In Data is a data thought leadership series hosted by John Kutay who leads data and products at Striim. What's New In Data hosts industry practitioners to discuss latest trends, common patterns for real world data patterns, and analytics success stories.
In this episode of Spring Office Hours, hosts Dan Vega and DeShaun Carter interview Chris Bono, a Spring team member who works on Spring Cloud Dataflow and Spring Pulsar. They discuss streaming data, comparing Apache Kafka and Apache Pulsar, and explore the features and use cases of Spring Cloud Stream applications. Chris provides insights into the architecture of streaming applications, explains key concepts, and highlights the benefits of using Spring's abstraction layers for working with messaging systems.Show Notes:Introduction to Chris Bono and his work on Spring Cloud Dataflow and Spring PulsarComparison between Apache Kafka and Apache PulsarOverview of Spring Cloud Stream and its bindersExplanation of source, processor, and sink concepts in streaming applicationsIntroduction to Spring Cloud Stream Applications projectDiscussion on Change Data Capture (CDC) and its importance in streamingExploration of various sources, processors, and sinks available in Spring Cloud Stream ApplicationsMention of KEDA (Kubernetes Event-driven Autoscaling) and its potential use with Spring Cloud applicationsUpcoming features in Spring Pulsar 1.2 releaseImportance of community feedback and using GitHub discussions for feature requests and issue reportingThe podcast provides a comprehensive overview of streaming data concepts and how Spring projects can be used to build efficient streaming applications.
Redpanda CEO Alex Gallego joins us to talk about Sovereign AI that never leaves your private environment, highly optimized stream processing, and why the future of data is real time. Discover how Alex's journey from building racing motorcycles and tattoo machines as a child led him to revolutionize stream processing and cloud infrastructure. Alex also gets deep into the internals of Redpanda's C++ implementation that ultimately gives it better performance and lower cost than Apache Kafka, while using the same Kafka-compatible API. We explore the challenges and groundbreaking innovations in data storage and streaming. From Kafka's distributed logs to the pioneering Redpanda, Alex shares the operational advantages of streaming over traditional batch processing. Learn about the core concepts of stream processing through real-world examples, such as fraud detection and real-time reward systems, and see how Redpanda is simplifying these complex distributed systems to make real-time data processing more accessible and efficient for engineers everywhere.Finally, we delve into emerging trends that are reshaping the landscape of data infrastructure. Examine how lightweight, embedded databases are revolutionizing edge computing environments and the growing emphasis on data sovereignty and "Bring Your Own Cloud" solutions. Get a glimpse into the future of data ownership and AI, where local inferencing and traceability of AI models are becoming paramount. Join us for this compelling conversation that not only highlights the evolution from Kafka to Redpanda but paints a visionary picture of the future of real-time systems and data architecture.What's New In Data is a data thought leadership series hosted by John Kutay who leads data and products at Striim. What's New In Data hosts industry practitioners to discuss latest trends, common patterns for real world data patterns, and analytics success stories.
On this episode, we are joined by special co-host Hugh Evans and returning guest Will Xu as we announce Druid Summit 2024 and dive into Druid 30.0's new features and enhancements. Improvements include better ingestion for Amazon Kinesis and Apache Kafka, enhanced support for Delta Lake, and advanced integrations with Google Cloud Storage and Azure Blob Storage. Come for the technical upgrades like GROUP BY and ORDER BY for complex columns and faster query processing with new IN and AND filters, stay for the stabilized concurrent append and replace API for late-arriving streaming data. We also explore experimental features like the centralized data source schema for better performance. Tune in to learn about the latest on arrays, the upcoming GA for window functions, and the benefits of upgrading Druid! To submit a talk or register for Druid Summit 2024, visit https://druidsummit.org/
In this episode of The GeekNarrator podcast, host Kaivalya Apte interviews Ryan and Richie, the founders of WarpStream. They discuss the architecture, benefits, and core functionalities of WarpStream, a drop-in replacement for Apache Kafka. The conversation covers their experience with Kafka, the design decisions behind WarpStream, and the operational challenges it addresses. They also delve into the seamless migration process, the scalability, and cost benefits, the integration with the Kafka ecosystem, and potential future features. This episode is a must-watch for developers and tech enthusiasts interested in modern, distributed data streaming solutions. Chapters: 00:00 Introduction 02:27 Introducing Warpstream: A Kafka Replacement 11:07 Deep Dive into Warpstream's Architecture 35:42 Exploring Kafka's Ordering Guarantees 36:52 Handling Buffering and Compaction 38:44 Efficient Data Reading and File Caching 44:06 WarpStream's Flexibility and Cost Efficiency 01:06:59 Future Features Links: WarpStream : https://www.warpstream.com/ Blog: https://www.warpstream.com/blog X: Ryan: https://x.com/ryanworl Richard Artoul: https://x.com/richardartoul Kaivalya Apte: https://x.com/thegeeknarrator If you like this episode, please hit the like button and share it with your network. Also please subscribe if you haven't yet. Database internals series: https://youtu.be/yV_Zp0Mi3xs Popular playlists: Realtime streaming systems: https://www.youtube.com/playlist?list=PLL7QpTxsA4se-mAKKoVOs3VcaP71X_LA- Software Engineering: https://www.youtube.com/playlist?list=PLL7QpTxsA4sf6By03bot5BhKoMgxDUU17 Distributed systems and databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4sfLDUnjBJXJGFhhz94jDd_d Modern databases: https://www.youtube.com/playlist?list=PLL7QpTxsA4scSeZAsCUXijtnfW5ARlrsN Stay Curios! Keep Learning! #distributedsystems #kafka #s3 #streaming
In the past couple of episodes, we'd gone over what Apache Kafka is and along the way we mentioned some of the pains of managing and running Kafka clusters on your own. In this episode, we discuss some of the ways you can offload those responsibilities and focus on writing streaming applications. Along the way, […]
Topics, Partitions, and APIs oh my! This episode we're getting further into how Apache Kafka works and its use cases. Also, Allen is staying dry, Joe goes for broke, and Michael (eventually) gets on the right page. The full show notes are available on the website at https://www.codingblocks.net/episode236 News Kafka Topics Kafka APIS Use Cases Tip […]
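As a companion to the topics-and-partitions discussion in this episode, this is roughly what creating a topic looks like programmatically with Kafka's AdminClient; the broker address and the partition/replication numbers are placeholder choices, not recommendations from the show.

```java
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.NewTopic;

public class CreateTopicExample {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder broker

        try (AdminClient admin = AdminClient.create(props)) {
            // A topic named "orders" with 6 partitions and replication factor 1
            // (fine for a single-broker dev setup; production clusters typically use 3).
            // Partition count is what bounds consumer-group parallelism.
            NewTopic orders = new NewTopic("orders", 6, (short) 1);
            admin.createTopics(Collections.singleton(orders)).all().get();
            System.out.println("topics now: " + admin.listTopics().names().get());
        }
    }
}
```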
This week on Developer Voices we're talking to Ryan Worl, whose career in big data engineering has taken him from DataDog to Co-Founding WarpStream, an Apache Kafka-compatible streaming system that uses Golang for the brains and S3 for the storage. Ryan tells us about his time at DataDog, along with the things he learnt from doing large-scale systems migration bit-by-bit, before we discuss how and why he started WarpStream. Why re-implement Kafka? What are the practical challenges and cost benefits of moving all your storage to S3? And would he choose Go a second time around?--WarpStream: https://www.warpstream.com/DataDog: https://www.datadoghq.com/Ryan on Twitter: https://x.com/ryanworl Kris on Mastodon: http://mastodon.social/@krisajenkinsKris on LinkedIn: https://www.linkedin.com/in/krisjenkins/Kris on Twitter: https://twitter.com/krisajenkins
We finally start talking about Apache Kafka! Also, Allen is getting acquainted with Aesop, Outlaw is killing clusters, and Joe was paying attention in drama class. The full show notes are available on the website at https://www.codingblocks.net/episode235 News Intro to Apache Kafka What is it? Apache Kafka is an open-source distributed event streaming platform used […]
The Datanation Podcast - Podcast for Data Engineers, Analysts and Scientists
In this episode, we delve into the Apache Iceberg Kafka Connector, a critical tool for streaming data into your data lakehouse. We’ll explore how this connector facilitates seamless data ingestion from Apache Kafka into Apache Iceberg, enhancing your real-time analytics capabilities and data lakehouse efficiency. We’ll cover: Join us to understand how the Apache Iceberg […]
Álvaro Hernández is the founder and CEO of OnGres a company that provides among other things a distribution of Postgres that runs on Kubernetes, called “StackGres”. Álvaro is also an AWS Data Hero and a passionate database and open source software developer Do you have something cool to share? Some questions? Let us know: - web: kubernetespodcast.com - mail: kubernetespodcast@google.com - twitter: @kubernetespod Note: This episode was edited on May 17th to remove a chatter segment from episode 219, which had been mistakenly edited into it. News of the week Kubernetes code cleanup KEP-2395: Removing In-Tree Cloud Provider Code - GitHub KEP Readme Remove gcp in-tree cloud provider and credential providers - GitHub PR Spotlight on SIG Cloud Provider - Blog The Future of Cloud Providers in Kubernetes - Blog Kubernetes 1.29: Cloud Provider Integrations Are Now Separate Components - Blog Google I/O KubeCon + CloudNativeCon Europe 2024 Report KuberTENes Birthday Bash The Kubernetes Community takes over kubernetesio on X WG-Serving on GitHub DoK Community Ambassador Applications Links from the interview Álvaro Hernández: LinkedIn Twitter/X OnGres PostgreSQL Stackgres.io Stackgres github Kubernetes Pg_repack Data on Kubernetes (DoK) Community Data On Kubernetes 2022 Report Data on Kubernetes Whitepaper - Database Patterns - by CNCF TAG Storage Istio Apache Zookeeper Strimzi - CNCF Project for running Apache Kafka on Kubernetes Apache Kafka Postgres extensions The Kubernetes Operator Pattern Presentation about PostreSQL Hooks from PostgreSQL wiki OCI - Open Container Initiative Why Postgres Extensions should be packaged and distributed as OCI images
https://oscourse.win Allegro improved their Kafka produce tail latency by over 80% when they switched from ext4 to xfs. What I enjoyed most about this article is the detailed analysis and tweaking the team made to ext4 before considering switching to xfs. This is a classic example of what a good tech blog post looks like, in my opinion. 0:00 Intro 0:30 Summary 2:35 How Kafka Works? 5:00 Producers Writes are Slow 7:10 Tracing Kafka Protocol 12:00 Tracing Kernel System Calls 16:00 Journaled File Systems 21:00 Improving ext4 26:00 Switching to XFS Blog https://blog.allegro.tech/2024/03/kafka-performance-analysis.html
The “big data infrastructure” world is dominated by Java, but the data-analysis world is dominated by Python. So if you need to analyse and process huge amounts of data, chances are you're in for a less-than-ideal time. The impedance mismatch will probably make your life hard somehow. So there are a lot of projects and companies trying to solve that problem and bridge those two worlds seamlessly, and many of the popular solutions see SQL as the glue. But this week we're going to look at another solution - ignore Java, treat Kafka as a protocol, and build up all the infrastructure tools you need with a pure Python library. It's a lot of work, but in theory it would make Python the one language for data storage, analysis and processing, at scale. Tempting, but is it feasible? Joining me to discuss the pros, cons, and massive scope of that approach is Tomáš Neubauer. He started off doing real time data analysis for the McLaren F1 team, and is now deep in the Python mines effectively rewriting Kafka Streams in Python. But how? How much work is actually involved in porting those ideas to Python-land, and how do you even get started? And perhaps most fundamental of all - even if you succeed, will that be enough to make the job easy, or will you still have to scale the mountain of teaching people how to use the new tools you've built? Let's find out.– Quix Streams on Github: https://github.com/quixio/quix-streamsQuix Streams getting started guide: https://quix.io/get-started-with-quix-streamsQuix: https://quix.io/ Tomáš on LinkedIn: https://www.linkedin.com/in/tom%C3%A1%C5%A1-neubauer-a10bb144Tomáš on Twitter: https://twitter.com/TomasNeubauer0Kris on Mastodon: http://mastodon.social/@krisajenkinsKris on LinkedIn: https://www.linkedin.com/in/krisjenkins/Kris on Twitter: https://twitter.com/krisajenkins --#podcast #softwaredevelopment #datascience #apachekafka #streamprocessing
Follow: https://stree.ai/podcast | Sub: https://stree.ai/sub | Today, Tim dives into the world of Kafka Streams with Matthias Sax, Software Engineer at Confluent and core contributor to Apache Kafka. Matthias updates us on the latest in Interactive Queries, their enhancements in recent releases, insights on stream processing and how Kafka Streams stands out in the real-time analytics landscape. Remember to use the 30% discount Tim mentioned for the Real-Time Analytics Summit: https://stree.ai/rtapod30 (Code: RTAPOD30)
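For listeners new to the Interactive Queries feature discussed here, the core idea is that a Kafka Streams application can expose its own local state stores for direct reads. Below is a minimal, hedged Java sketch; the application id, topic, store name, and lookup key are placeholder assumptions, and real services usually wrap the lookup in an HTTP endpoint and handle rebalancing and retries.

```java
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StoreQueryParameters;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Materialized;
import org.apache.kafka.streams.state.QueryableStoreTypes;
import org.apache.kafka.streams.state.ReadOnlyKeyValueStore;

public class InteractiveQueryExample {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "order-counts");      // assumed app id
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // assumed broker
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        // Count events per key and materialize the result in a named, queryable state store.
        StreamsBuilder builder = new StreamsBuilder();
        builder.stream("orders")                                             // hypothetical input topic
               .groupByKey()
               .count(Materialized.as("order-counts-store"));

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();

        // Crude wait for the instance to reach RUNNING; production code would
        // use a state listener and retry InvalidStateStoreException instead.
        while (streams.state() != KafkaStreams.State.RUNNING) {
            Thread.sleep(100);
        }

        // Read the local state directly, e.g. from an HTTP handler, without another Kafka round trip.
        ReadOnlyKeyValueStore<String, Long> store = streams.store(
                StoreQueryParameters.fromNameAndType("order-counts-store",
                        QueryableStoreTypes.keyValueStore()));
        System.out.println("orders seen for customer-42: " + store.get("customer-42"));

        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}
```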
When the open source streaming service Apache Kafka was created in 2011 at LinkedIn, it was a different world. Most companies were still on prem. Learn more about your ad choices. Visit megaphone.fm/adchoices
Guest Angie Byron Panelist Richard Littauer Show Notes Hello and welcome to Sustain! Richard is in Portland at FOSSY, the Free and Open Source Software Yearly conference that is held by the Software Freedom Conservancy. In this episode, we're joined by Angie Byron, the Director of Community at Aiven, a leading open source data platform. Angie brings us insights from her role overseeing 11 open source projects, explaining how they provide managed services and security updates for several data projects, and highlighting the importance of prioritizing by impact. She also gives us a peek into their “start at the end” exercise used for goal setting and talks about the challenges of transparency and confidentiality in open source projects. Tune in now and download this episode to hear more! [00:00:39] Angie explains that Aiven is an open source data platform that provides managed services and security updates for several open source data projects such as Apache Kafka, MySQL, Postgres, Redis, and Grafana. [00:01:30] Angie shares that she's the Director of Community at Aiven and has been there for a couple of months. She talks about her role as a meta community manager, overseeing 11 open source projects with a small team. [00:02:32] There's a discussion by Angie on the importance of prioritizing by impact and empowering community members, and she explains the “start at the end” exercise she uses for setting their goals, and she explains using the Open Practice Library, which is a division of Red Hat. [00:07:17] Richard asks about the challenges of balancing transparency and confidentiality in open source projects. Angie shares that they're working on a public-facing version of a roadmap with an ideation system. [00:08:23] Angie discusses three main goals of their work: increasing revenue, reducing costs, and mitigating risk. [00:09:59] Angie explains that she internalizes achievement by helping others grow, thrive, and accomplish their goals, with her success and that of her team tied to the success of others. [00:11:24] Find out where you can learn more about Aiven's community efforts, and where you can learn more about Angie online. Links SustainOSS (https://sustainoss.org/) SustainOSS Twitter (https://twitter.com/SustainOSS?ref_src=twsrc%5Egoogle%7Ctwcamp%5Eserp%7Ctwgr%5Eauthor) SustainOSS Discourse (https://discourse.sustainoss.org/) podcast@sustainoss.org (mailto:podcast@sustainoss.org) SustainOSS Mastodon (https://mastodon.social/tags/sustainoss) Richard Littauer Twitter (https://twitter.com/richlitt?lang=en) Software Freedom Conservancy (https://sfconservancy.org/) Open OSS (https://openoss.sourceforge.net/) Angie Byron Tech Blog (https://openpracticelibrary.com/) Angie Byron Twitter (https://twitter.com/webchick) Angie Byron LinkedIn (https://ca.linkedin.com/in/webchick?original_referer=https%3A%2F%2Fwww.google.com%2F) Angie Byron Mastodon (https://mastodon.social/@webchick) Aiven (https://aiven.io/) Open Practice Library (https://openpracticelibrary.com/) Credits Produced by Richard Littauer (https://www.burntfen.com/) Edited by Paul M. Bahr at Peachtree Sound (https://www.peachtreesound.com/) Show notes by DeAnn Bahr Peachtree Sound (https://www.peachtreesound.com/) Special Guest: Angie Byron.
On this episode, we dive into Apache Druid 28. This latest Druid release includes improved ANSI SQL and Apache Calcite support, the addition of window functions as an experimental feature, async queries and query from deep storage going GA, array enhancements, multi-topic Apache Kafka ingestion, and so much more! Will Xu, program manager at Imply returns to give us the full scoop.
Neha Narkhede is a co-founder at Confluent, a data streaming software that raised at a $9.1b valuation in 2021. Neha later co-founded Oscilar, a no-code platform that helps companies detect and manage fraud. Before building these two companies, Neha was a Principal Software Engineer at LinkedIn where she co-created Apache Kafka. Neha is ranked #50 on Forbes' list of “America's Richest Self-Made Women 2023” with an estimated net worth of $520m. — In today's episode we discuss: The origins of Confluent, Kafka, and Oscilar How to become a successful second-time founder Advice for monetizing open source product Neha's unique GTM strategies How Confluent ran two businesses within one company Neha's path to founder market fit — Referenced: Apache Kafka: https://kafka.apache.org/ Confluent: https://www.confluent.io/ Confluent Cloud: https://www.confluent.io/confluent-cloud/ Jay Kreps, co-founder at Confluent: https://www.linkedin.com/in/jaykreps/ Jun Rao, co-founder at Confluent: https://www.linkedin.com/in/junrao/ MongoDB: https://www.mongodb.com/ Oscilar: https://oscilar.com/ — Where to find Neha: LinkedIn: https://www.linkedin.com/in/nehanarkhede/ Twitter/X: https://twitter.com/nehanarkhede Website: https://www.nehanarkhede.com/ — Where to find Brett: LinkedIn: https://www.linkedin.com/in/brett-berson-9986094/ Twitter/X: https://twitter.com/brettberson — Where to find First Round Capital: Website: https://firstround.com/ First Round Review: https://review.firstround.com/ Twitter: https://twitter.com/firstround Youtube: https://www.youtube.com/@FirstRoundCapital This podcast on all platforms: https://review.firstround.com/podcast — Timestamps: (00:00) Introduction (02:14)The origin story of Kafka (05:24) Co-creating Kafka at LinkedIn (07:31) Why open sourcing Kafka was a masterstroke (11:04) The unique nature of Confluent's Zero to One phase (16:35) Building for a specific customer early on (18:42) Inside Confluent's successful launch (20:12) Establishing Confluent as an enterprise company (22:00) The role of developer evangelism in Confluent's success (23:49) Using developer evangelism in category creation (26:41) Navigating early co-founder dynamics (30:06) Leveraging complementary founder skills (31:56) Advice for future founders (32:45) Building Confluent with monetization in mind (34:38) Monetizing open source products (36:05) GTM for subscription Saas versus consumption SaaS (39:48) The importance of founder-led GTM sales (40:58) Neha's order of operations for GTM sales (42:33) When to build out outbound sales (44:34) Adding SaaS to a software business (48:54) Choosing what to license and what to open source (52:38) How Confluent's co-founders decided on SaaS offering (56:04) Neha's journey as a second-time founder (58:54) Building Oscilar differently to Confluent (63:21) Going from speculation to product realization (69:06) Solving problems people are willing to pay for (71:13) Neha's “proactive research sprint” tactic (72:54) How Neha has applied this tactic
Confluent's platform provides infrastructure for enterprises to connect, stream and process data across applications and systems in real time. In this episode of the Tech Disruptors podcast, Confluent's cofounder and CEO Jay Kreps joins Bloomberg Intelligence senior software analyst Sunil Rajgopal to discuss the origins of Apache Kafka and Confluent, the flow of enterprise data and future of software architecture. The two also talk about the opportunity arising from the shift toward real-time data streaming from batch processing, budding artificial intelligence workloads and the company's new products such as Confluent Cloud for Apache Flink, Kora Engine and KSQL database.
In this bonus episode, Eric and Kostas preview their upcoming conversation with David Yaffe and Johnny Graettinger of Estuary.
Guillermo Rauch is the CEO of Vercel, a frontend-as-a-service product that was valued at $2.5b in 2021. Vercel serves customers like Uber, Notion and Zapier, and their React framework - Next.js - is used by over 500,000 developers and designers worldwide. Guillermo started his first company at age 11 in Buenos Aires and moved to San Francisco at age 18. In 2013, he sold his company Cloudup to Automattic (the company behind WordPress), and in 2015 he founded Vercel. — In today's episode we discuss: Guillermo's fascinating path into tech Learnings from building Cloudup and selling the company to Automattic (the company behind WordPress) Vercel's origin story and path to product market fit How to make an open source business successful Vercel's unique philosophy on developer experience Insights and predictions on the future of AI — Referenced: Algolia: https://www.algolia.com/ Apache Zookeeper: https://zookeeper.apache.org/ Apache Kafka: https://kafka.apache.org/ AWS: https://www.aws.training/ C++: https://www.techtarget.com/searchdatamanagement/definition/C Clerk: https://clerk-tech.com/ Cloudup: https://cloudup.com/ Commerce Cloud: https://www.salesforce.com/products/commerce/ Contentful: https://www.contentful.com/ Debian: https://www.debian.org/ Fintool: https://www.fintool.com/ Figma: https://www.figma.com/ GitLab: https://about.gitlab.com/ IRC: https://en.wikipedia.org/wiki/Internet_Relay_Chat KDE: https://kde.org/ Linux: https://en.wikipedia.org/wiki/Linux Mozilla: https://www.mozilla.org MooTools (UI library): https://mootools.net/ Next.js: https://nextjs.org/ React Native: https://reactnative.dev/ Red Hat: https://www.redhat.com/ Redpanda: https://redpanda.com/ Resend: https://resend.com/ Rust: https://www.rust-lang.org/ Salesforce: https://www.salesforce.com Servo: https://servo.org/ Shopify: https://www.shopify.com/ Socket.io: https://socket.io/ Symphony: https://symphony.com/ Trilio: https://trilio.io/ Twilio: https://www.twilio.com Vercel: https://vercel.com/ V0.dev: https://v0.dev/ — Where to find Guillermo: Twitter/x: https://twitter.com/rauchg LinkedIn: https://www.linkedin.com/in/rauchg/ Personal website: https://rauchg.com/ — Where to find Todd Jackson: Twitter: https://twitter.com/tjack LinkedIn: https://www.linkedin.com/in/toddj0 — Where to find First Round Capital: Website: https://firstround.com/ First Round Review: https://review.firstround.com/ Twitter: https://twitter.com/firstround Youtube: https://www.youtube.com/@FirstRoundCapital This podcast on all platforms: https://review.firstround.com/podcast — Timestamps: (02:35) Becoming an “internet celebrity” at age 11 (08:30) Guillermo's first company: Cloudup (11:09) Biggest learnings from Cloudup and WordPress (15:06) The insights behind starting Vercel (17:11) Sources of validation for Vercel (20:29) How Vercel formed its V1 product (23:25) Navigating the early reactions from competitors and users (25:58) The paradox of developers and how it impacted Next.js (31:20) Advice on finding product market fit (34:48) The forces behind a trend towards "Front-end Cloud” (38:35) Why people now pay so much attention to the front-end (40:06) How to make an open source business successful (44:54) Insights on product positioning and category creation (48:52) Vercel's journey through becoming multi-product (51:44) Guillermo's take on the future of AI (53:43) Heuristics for building better product experiences (55:49) AI insights from Vercel's customers (57:37) How AI might change engineering in the next 10-20 years (62:43) 
Guillermo's favorite advice (65:45) Guillermo's advice to himself of 10 years ago
In today's episode, Luan Moreno and Mateus Oliveira interview Brian Olsen, currently Head of Developer Relations at Tabular. Trino is an open-source product for virtualizing data through queries. Imagine a SQL engine capable of querying data from Apache Kafka, cloud storage, databases, and many other sources, simply and extremely effectively. With Trino, you get the following benefits: multiple connectors for many different data sources; the ability to run analytics queries simply and effectively; and support for lakehouse models such as Iceberg and Delta. In this conversation we also cover the following topics: the history of Trino; Trino's capabilities; advanced features; new features; Adaptive Query Execution; and use cases. Learn more about Trino, and how to use this technology to explore data across many different sources, together with one of the leading voices in the community, Brian Olsen. Luan Moreno = https://www.linkedin.com/in/luanmoreno/
Alex Gallego, CEO & Founder of Redpanda, joins Corey on Screaming in the Cloud to discuss his experience founding and scaling a successful data streaming company over the past 4 years. Alex explains how it's been a fun and humbling journey to go from being an engineer to being a founder, and how he's built a team he trusts to hand the production off to. Corey and Alex discuss the benefits and various applications of Redpanda's data streaming services, and Alex reveals why it was so important to him to focus on doing one thing really well when it comes to his product strategy. Alex also shares details on the Hack the Planet scholarship program he founded for individuals in underrepresented communities. About AlexAlex Gallego is the founder and CEO of Redpanda, the streaming data platform for developers. Alex has spent his career immersed in deeply technical environments, and is passionate about finding and building solutions to the challenges of modern data streaming. Prior to Redpanda, Alex was a principal engineer at Akamai, as well as co-founder and CTO of Concord.io, a high-performance stream-processing engine acquired by Akamai in 2016. He has also engineered software at Factset Research Systems, Forex Capital Markets and Yieldmo; and holds a bachelor's degree in computer science and cryptography from NYU. Links Referenced: Redpanda: https://redpanda.com/ Twitter: https://twitter.com/emaxerrno Redpanda community Slack: https://redpandacommunity.slack.com/join/shared_invite/zt-1xq6m0ucj-nI41I7dXWB13aQ2iKBDvDw Hack The Planet Scholarship: https://redpanda.com/scholarship TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: Tired of slow database performance and bottlenecks on MySQL or PostgresSQL when using Amazon RDS or Aurora? How'd you like to reduce query response times by ninety percent? Better yet, how would you like to get me to pronounce database names correctly? Join customers like Zscaler, Intel, Booking.com, and others that use OtterTune's artificial intelligence to automatically optimize and keep their databases healthy. Go to ottertune dot com to learn more and start a free trial. That's O-T-T-E-R-T-U-N-E dot com.Corey: Welcome to Screaming in the Cloud, I'm Corey Quinn, and this promoted guest episode is brought to us by our friends at Redpanda, which I'm thrilled about because I have a personal affinity for companies that have cartoon mascots in the form of animals and are willing to at least be slightly creative with them. My guest is Alex Gallego, the founder and CEO over at Redpanda. Alex, thanks for joining me.Alex: Corey, thanks for having me.Corey: So, I'm not asking about the animal; I'm talking about the company, which I imagine is a frequent source of disambiguation when you meet people at parties and they don't quite understand what it is that you do. And you folks are big in the data streaming space, but data streaming can mean an awful lot of things to an awful lot of people. What is it for you?Alex: Largely it's about enabling developers to build applications that can extract value of every single event, every click, every mouse movement, every transaction, every event that goes through your network. 
This is what Redpanda is about. It's like how do we help you make more money with every single event? How do we help you be more successful? And you know, happy to give examples in finance, or IoT, or oil and gas, if it's helpful for the audience, but really, to me, it's like, okay, if we can give you the framework in which you can build a new application that allows you to extract value out of data, every single event that's going through your network, to me, that's what streaming is about. At large, it's, you know, data contextualized with a timestamp and, largely, a sort of database of event streaming. Corey: One of the things that I find curious about the space is that usually, companies wind up going one of two directions when you're talking about data streaming. Either they're, “Oh, just send it all to us and we'll take care of it for you,” or otherwise, it's, great, they more or less ship something that you run in your own environment. In the olden days of data centers, that usually resembled a box of some sort. You're one of those interesting split-the-difference companies where you offer both models. Do you find that one of those tends to be seeing more adoption these days or that there's an increasing trend toward one direction or the other? Alex: Yeah. So, right now, I think that to me, the future of all these data-intensive products—whether you're a database or a streaming engine—will, because simply of cost of networks transferred between the hybrid clouds and your accounts, sending a gigabyte a second of data between, let's say, you know, your data center and a vendor, it's just so expensive that at some point, from just a cost perspective, like, running the infrastructure, it's in the millions of dollars. And so, running the data inside your VPC, it's sort of the next logical evolution of how we used to consume services. And so, I actually think it's just the evolution: people would self-host because of costs and then they would use services because of operational simplicity. “I don't want to spend team skills and time building this. I want to pay a vendor.” And so, BYOC, to be honest—which is what we call this offering—it was about [laugh] sidestepping the costs of being stuck in the hybrid clouds, whether it's Google or Amazon, where you're paying egress and ingress costs and it's just so expensive, in addition to this whole idea of data residency or data sovereignty and privacy. It's like, yeah, why not both? Like, if I'm an engineer, I want low latency and I don't want to pay you to transfer this thing to the next rack. I mean, my computer's probably, like, you know, a hundred feet away from my customer's computer. Like, why [laugh] is that so complicated? So, you know, my view is that the future of data-intensive products will be in this form of where it—like, data planes are actually owned by companies, and then you offer that as a Software as a Service. Corey: One of the things that catches an awful lot of companies with telemetry use cases—or data streaming as another example of that—by surprise when they start building their own cloud-hosted offering is that they're suddenly seeing a lot more cross-AZ data charges than they would have potentially expected. And that's because unlike cross-region or the really expensive version of this with egress, it's a penny in and a penny out per gigabyte in most AWS regions. Which means that that isn't also bound strictly to an AWS organization. [A back-of-the-envelope sketch of these transfer costs appears after this transcript.]
So, you have customers co-located with you and you're starting to pay ingress charges on customers throwing their data over to you. And, on some level, the most economical solution for you is, well, we're just going to put our listeners somewhere else far away so that we can just have them pay the steep egress fee, but then we can just reflect it back to ourselves for free. And that's a terrible pattern, but it's a byproduct of the absolutely byzantine cross-AZ data transfer pricing, in fact, all of the data transfer pricing that at least AWS tends to present. And it shapes the architectural decisions you make as a result. Alex: You know, as a user, it just didn't make sense. When we launched this product, the number of people that said, like, “Why wouldn't you charge for, you know, effectively renting [unintelligible 00:05:14], and giving a markup to your customers?” That's… we don't add any value on that, you know? I think people should really just pay us for the value that we create for them. And so, you know, for us competing with other companies is relatively easy. Competing with MSK is harder because MSK just has this, you know, muscle where they don't charge you for some particular network traffic between you. And so, it forces companies like us that are trying to be innovative in the data space to, like, put our services in that so that we can actually compete in the market. And so, it's a forcing function of the hybrid clouds having this strong muscle of being able to discount their services in a way that companies just simply don't have access to. And then, you know, it becomes—for the others—latency and sovereignty. Corey: This is the way that effectively all of AWS has first-party offerings of other things go. Replication traffic between AZs is not chargeable. And when I asked them about that, they say, “Oh, yeah. We just price that into the cost of the service.” I don't know that I necessarily buy that because if I try and run this sort of thing on top of EC2, it would cost me more than using their crappy implementation of it, just in data transfer alone for an awful lot of use cases. No third party can touch that level of cost-effectiveness and discounting. It really is probably the clearest example I can think of actual anti-competitive behavior in the market. But it's also complex enough to explain, to, you know, regulators that it doesn't make for exciting exposés and the basis for lawsuits. Yet. Hope springs eternal. Alex: [laugh]. You know—okay, so here is how—if someone is listening to this podcast and is, like, “Okay, well, what can I do?” For us, S3 is the answer. S3 is basically: you need to be able to lean into S3 as a way of replication across [AZ 00:06:56], you need to be able to lean into S3 to read data. And so actually, when I wrote, originally, Redpanda, you know, it's just like this C++ thing using [unintelligible 00:07:04], geared towards super low latency. When we moved it into the cloud, what we realized is, this is cost prohibitive to run either on EBS volumes or local disk. I have to tier all the storage into S3, so that I can use S3's cross-AZ network transfer, which is basically free, to be able to then bring a separate cluster on a different AZ, and then read from the bucket at zero cost. And so, you end up really—like, there are fundamental technical things that you have to do to just be able to compete in a way that's cost-effective for you.
And so, in addition to just, like, the muscle that they can enforce on the companies is—it—there are deep implications of what it translates to at the technical level. Like, at the code level.Corey: In the cloud, more than almost anywhere else, it really does become apparent that cost and architecture are fundamentally the same thing. And I have a bit of an advantage here in that I've seen what you do deployed at least one customer of mine. It's fun. When you have a bunch of logos on your site, it's, “Hey, I recognize some of those.” And what I found interesting was the way that multiple people, when I spoke to them, described what it is that you do because some of them talked about it purely as a cost play, but other people were just as enthusiastic about it being a means of improving feature velocity and unlocking capabilities that they didn't otherwise have or couldn't have gotten to without a whole lot of custom work on their part. Which is it? How do you view what it is that you're bringing to market? Is it a cost play or is it a capability story?Alex: From our customer base, I would say 40% is—of our customer base—is about Redpanda enabling them to do things that they simply couldn't do before. An example is, we have, you know, a Fortune 100 company that they basically run their hedge trading strategy on top of Redpanda. And the reason for that is because we give them a five-millisecond average latency with predictable flight latencies, right? And so, for them, that predictability of Redpanda, you know, and sort of like the architecture that came about from trying to invent a new storage engine, allows them to throw away a bunch of in-house, you know, custom-built pub/sub messaging that, you know, basically gave them the same or worse latency. And so, for them, there's that.For others, I think in the IoT space, or if you have flying vehicles around the world, we have some logos that, you know, I just can't mention them. But they have this, like, flying computers around the world and they want to measure that. And so, like, the profile of the footprint, like, the mechanical footprint of being able to run on a single Pthread with a few megs of memory allows these new deployment models that, you know, simply, it's just, it's not possible with the alternatives where let's say you have to have, you know, like, a zookeeper on the schema registry and an HTTP proxy and a broker and all of these things. That simply just, it cannot run on a single Pthread with a few megs of memory, if you put any sort of workload into that. And so, it's like, the computational efficiencies simply enable new things that you couldn't do before. And that's probably 40%. And then the other, it's just… money was really cheap last year [laugh] or the year before and I think now it's less cheap [unintelligible 00:10:08] yeah.Corey: Yeah, I couldn't help but notice that in my own business, too. It turns out that not giving a shit about the AWS bill was a zero-interest-rate phenomenon. Who knew?Alex: [laugh]. Yeah, exactly. And now people [unintelligible 00:10:17], you know, the CIOs in particular, it's like, help. And so, that's really 60%, and our business has boomed since.Corey: Yeah, one thing that I find interesting is that you've been around for only four years. I know that's weird to say ‘only,' but time moves differently in tech. And you've started showing up in some very strange places that I would not have expected. 
You recently—somewhat recently; time is, of course, a flat circle—completed a $100 million Series C, and I also saw you in places where I didn't expect to see you in the form of, last week, one of your large competitors' earnings calls, where they were asked by an analyst about an unnamed company that had raised a $100 million Series C, and the CEO [unintelligible 00:11:00], "Oh, you're probably talking about Redpanda." And then they gave an answer that was fine. I mean, no one is going to be on an earnings call and not be prepared for questions like that and to not have an answer ready to go. No one's going to say, "Well, we're doomed if it works," because I think that businesses are more sophisticated than that. But it was an interesting shout-out in a place where you normally don't see competitors validate that you're doing something interesting by name-checking you. Alex: What was fundamentally interesting for me about that is that I feel that, as an investor, if you're putting, you know, a 2, 3, 4, or $500 million check into a public position in a company, you want to know: is this money simply going to make returns? That's basically what an investor cares about. And so, the reason for that question is, "Hey, there's a Series C startup company that now has a bunch of these Fortune 2000 logos," and, you know, when we talked to them, like, their customer [unintelligible 00:11:51] phenomena, like, why is that the case? And then, you know, our competitor was forced to name, you know, [laugh] a single win. That's as far as I remember it. We don't know of any additional customers that have switched to that. And so, I think when you have, like, you know, a win rate that is above, whatever, a 95%, 97% ratio, then I think, you know, they're just sort of forced to answer that. And in a way, I just think that they focus on different things. And for me, it was like, "Okay, developer, hands on keyboard, behind the terminal, how do I make you successful?" And that seems to have worked out enough to be mentioned in the earnings call. Corey: On some level, it's a little bit of a dog-and-pony show. I think that as companies hit a certain point of scale, they feel that they need to validate what they're doing to investors at various points—which is always, on some level, of concern—and validate themselves to analysts, both financial—which, okay, whatever—and also industry analysts, where they come with checklists of what they believe customers want, and it's often a little bit off the mark. But the validation that I think matters, that actually determines whether or not something has legs, is what your customers—you know, people paying you money for a thing—have to say and what they take away from what you're doing. And having seen in a couple of cases now myself that usage of Redpanda has increased after initial proofs of concept and putting things onto it, I already sort of know the answer to this, but it seems that you also have a vibrant community of boosters, people who are thrilled to use the thing you're selling them. Alex: You know, Jump Trading recently posted that there was a use case in The New Stack where they, like, put it to use for the most mission-critical workloads. So, for those of you that are listening, Jump Trading is a financial company, and they're a super technical company. One of, like, the hardest things, they'll probably put your [unintelligible 00:13:35] your product through some of the most rigorous testing [unintelligible 00:13:38]. So, when you start doing some of these logos, it gives confidence.
And actually, the majority of the developers that we get to partner with, it was really a friend telling a friend; for [laugh] the longest time, my marketing department was super, super small. And then, what's been fun, a, like, really different use case was the one I mentioned, about these, like, flying vehicles around the world. They fly both in outer space and in airplanes. That was really fun. And then the large one is when you have workloads at, like, 14-and-a-half gigabytes per second, where the alternative of using something like Kinesis in the case of Lacework—which, you know, they wrote a New Stack article about—would be so exorbitantly expensive. And so, in a way, I think that, you know, it's just trying to make the developers successful, really focusing, honestly, on the person who just has to make things work. We don't—by the time we get to the CIO, really the champion was the engineer who had to build an application. "I was just trying to figure out the whack-a-mole of trying to debug alternative systems." Corey: One of the, I think, seductive problems with your entire space is that no one decides day one that they're going to implement a data streaming solution for a very scaled-out, high-traffic site. The early adoption is always a small thing that you're in the process of building. And at that scale and that speed, it just doesn't feel like it's that hard of a problem, because scale introduces its own unique series of challenges, but it's often one that people only really find out for themselves when the simple thing that works in theory but not in production starts to cause problems internally. I used to work with someone who was a deeply passionate believer in Apache Kafka to a point where it almost became a problem, just because their answer to every problem—it almost didn't matter if it was, "How do we get more coffee this morning?"—Kafka would be the answer for all of it. And that's great, but it turned out they became one of these people that borderline took on a product or a technology as their identity. So, anything that would potentially take a workload away from that got a lot of internal resistance. I'm wondering if you find that you're being brought in to replace existing systems or for completely greenfield stuff. And if the former, are you seeing a lot of internal resistance from people who have built a little niche for themselves? Alex: It's true, for the people that have built a career, especially at large banks, it was a pretty good setup: you know, they actually got a team, they got a promotion cycle because they brought in this technology and the technology sort of helped them make money. I personally tend to love to talk to these people. And to me, it's, like, let's talk deeply technical. Let me help you. That obviously doesn't scale because I can't have the same conversation with ten people. So, we do tend to see some of that. Actually, from our customers' standpoint, I would say that the larger part of our customer base, you know, if I'm trying to put numbers on it, maybe 65%, is probably rip-and-replace of, you know, either upstream Apache software or private companies or hosted services, et cetera. And so, I think you're right in saying, "Hey, that resistance," they probably handled the [unintelligible 00:16:38], but what changed in the last year is that the CIO has now stepped in and says, "I am going to fire all of you or you have to come up with $10 million in savings. Help me." [laugh].
And so, you know, then really, my job is to help them look like a hero. It's like, "Hey, look, try it, test it, benchmark it with your own workload, and if it saves you money, then use it." That's been, you know, sort of super helpful, kind of, given the macroeconomic environment. And then the last one is sometimes, you know, you do have to go with a greenfield, right? Like, someone has built a career, they want to gain confidence, they want to ask you questions, they want to trust you that you don't lose data, they want to make sure that you actually do the things that you say. And so, sometimes it's about building trust and building that relationship. And developers are right. Like, there's a bunch of products out there. Like, why should I trust you? And so, it's a little easier time, probably now, you know, with the CIOs wanting to cut costs, and now you have an excuse to go back to the executive team and say, "Look, I made you look smart. We get to [unintelligible 00:17:35], you know, our systems can scale to this." That's easy. Or the second one is we do, you know, we'll start with some side use case or a greenfield. But both exist, and I would say 65% is probably rip-outs. Corey: One question that I love to, I wouldn't call it ambush, but definitely come up with, that catches some folks by surprise, is one of the ways I like to sort out zealots from people who are focused on business problems. Do you have an example of a data streaming workload for which Redpanda would not be a great fit? Alex: Yeah. Database-style queries are not a fit. And so, I think there was a streaming engine before that was trying to build a database on top of it, and, like—and probably it does work at some low volume of traffic, like, say 5, 10 megabytes per second, but when you get to actual large scale, it just doesn't work. And it doesn't work because of what Redpanda is: it gives you two properties as a developer. You can add data to the end or you can truncate the head, right? And so, because those are your only two operations on the log, you then have to build this entire caching level to be able to give it database semantics. And so, you know, I think for that, the future isn't for us to build a database, just as an example; it's really to almost invert it. It's like, hey, what if we make our format an open format like Apache Iceberg and then bring in your favorite database? Like, bring in, you know, Snowflake or Athena or Trino or Spark or [unintelligible 00:18:54] or [unintelligible 00:18:55] or whatever the other [unintelligible 00:18:56] of great databases that are better than we are at doing, you know, just MPP, right, like a massively parallelizable database, do that, and then the job for us, for [unintelligible 00:19:05], is let me just structure your log in a way that allows you to query it, right? And so, for us, when we announced the $100 million Series C funding, it's like, I'm going to put the data in an Iceberg format so you can go and query it with the other ten databases. And they do a better job at that than we do. Corey: It's, frankly, refreshing to see a vendor that knows, okay, this is where we start and this is where we stop, because it just seems that there's been an industry-wide push for a while now of, oh, you built a component in a larger system that works super well. Now, expand to do everything else in the architectural diagram.
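For readers who want the two-operation contract Alex describes spelled out, here is a toy sketch, not Redpanda's API: a log that only supports appending to the tail and truncating a prefix from the head. Anything database-shaped (indexes, point lookups, caching) has to be layered on top of, or next to, this contract, which is the argument for exposing the data in an open table format and letting purpose-built query engines do the querying.

```python
# Toy illustration of an append/truncate-head log; purely for explanation.
from dataclasses import dataclass, field

@dataclass
class Log:
    start_offset: int = 0                          # first offset still retained
    records: list = field(default_factory=list)    # in-memory stand-in for segments

    def append(self, record: bytes) -> int:
        """Operation 1: add data to the end; returns the record's offset."""
        self.records.append(record)
        return self.start_offset + len(self.records) - 1

    def truncate_head(self, new_start: int) -> None:
        """Operation 2: drop everything before new_start (retention/tiering)."""
        drop = new_start - self.start_offset
        if drop > 0:
            del self.records[:drop]
            self.start_offset = new_start

    def read(self, offset: int) -> bytes:
        """Sequential read by offset; no secondary indexes, no SQL."""
        return self.records[offset - self.start_offset]
```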
And you suddenly have databases trying to be network transport layers and queues trying to be data warehouses, and it just doesn't work that way. It just feels like, oh, this is a terrible approach to solving this particular problem. And what's worse, to my mind, is that people who hadn't heard of you before look at you through this lens that does not put you in your best light, and, "Oh, this is a terrible database." Well, it's not supposed to be one. Alex: [laugh]. Corey: But it also—it puts them off as a result. Have you faced pressure to expand beyond your core competency from either investors or customers or analysts or, I don't know, the voices late at night that I hear and I assume everyone else does, too? Alex: Exactly. The 3 a.m. voice where I have to take my phone and take a voice note because it's like, I don't want to lose this idea. Totally. For us, I think there's pressure, like, hey, you built this great engine. Why don't you add, like, the latest, you know, soup du jour in systems, like a vector database? I was like, "This doesn't even make any sense." For me, it's, I want to do one thing really well. And I generally call it internally 'the ring zero.' It's, if you think of the internet, right, like, as a computer, especially with this model we talked about earlier, BYOC, like, we could be the best ring zero, the best sort of, like, you know, messaging platform for people to build real-time applications. And if that's the case, there's just so much low-hanging fruit for us. Like, the developer experience wasn't great in other systems, so, like, why don't we focus on the last mile, like, making that developer, you know, successful at doing this one thing as opposed to being average at a hundred other products? And until we feel, honestly, that we've done a phenomenal job at that—I think we still have some roadmap to get there—I don't want to expand. And, like, if there's pressure, my answer is, like… look, the market is big enough. We don't have to do it. We're still, you know, growing. I think it's obviously not trivial and I'm kind of trivializing a bunch of problems from a business perspective. I'm not trying to degrade anyone else. But for us, it's just being focused. This is what we do well. And bring in every other technology that makes you successful. I don't really care. I just want to do this part well. Corey: I think that that is something that's under-appreciated. I feel like I should get, at some point, to something that's been nagging at the back of my mind. Some would call it a personal attack and I suppose I'll let them, but what I find interesting is your background. Historically, you were a distributed systems engineer at very large scale. And you apparently wrote the first version of Redpanda yourself in—was it C or C++? Alex: C++. Corey: Yeah. And now you are the CEO of a company that is clearly doing very well. Have you gotten the hell out of production yet? The reason I ask this is I have worked in a number of companies where the founder was also the initial engineer, and then they invariably treated main as their feature branch and the rest of us all had to work around them to keep them from, you know, destroying everything we were trying to build around us, due to missing context. In other words, how annoyed with you are your engineers on any given afternoon? Alex: [laugh]. Yeah. I would say that as a company builder now, if I may say that, the team is probably the thing I'm the most proud of.
They're just so talented, such good [unintelligible 00:22:47] of humans. And so—group of humans—I stopped coding about two years ago, roughly. So, the company is four-and-a-half years old; for really the first two-and-a-half years, the first one, two years definitely, I was personally putting in, like, tons and tons of hours working on the code. It was a ton of fun. To me, it was one of the most rewarding technical projects I've ever had a chance to do. I still read pull requests, though, just so that when I have a conversation with a technical leader, I'm not, like, I have no clue how the transactions work. So, I still have to read the code, but I don't write any more code, and my heart was a little broken when my dev prod team removed my write access to the GitHub repo. We got SOC2 compliance, and they're like, "You can't have access to being an admin on Google domains, and you're no longer able to write into main." And so, I think as a—I don't know, maybe my identity—my self-identity is that of a builder, and I think as long as I personally feel like I'm building, today, it's not code, but, you know, it's the company and [unintelligible 00:23:41] sort of culture, then I feel okay [laugh]. But yeah, I no longer write code. And the last story on that is this: an engineer of ours, his name is [Stefan 00:23:51], he's like, "Hey, so Alex wrote this semaphore"—this was actually two days ago—and so they posted a video, and I commented, I was like, "Hey, this was the context of the semaphore. I'm sorry for this bug I caused." But yeah, at least I still remember some context for them. Corey: What's fun is watching things continue to outpace and outgrow you. I mean, one of the hard parts of building a company is the realization that every person you hire for a thing that's now getting off of your plate is better at that thing than you are. It's a constant experience of being humbled. And at some point, things wind up outpacing you to the point where, at least in my case, I've been on calls with customers and I explained how we did some things and how it worked and had to be corrected by my team of, "Well. That used to be true, however…" like, "Oh, dear Lord. I'm falling behind." And that's always been a weird feeling for me. Alex: Totally. You know, it's the feeling of being—before I think I became a CEO, I was a highly comped engineer, and competent, to the extent that it allowed me to build this product. And then you start doing all of these things and you're incompetent, obviously, by definition, because you haven't done those things, and so there's, like, that discomfort [laugh]. But I have to get it done because no one else wants to do, whatever, like say, like, you know, rev ops or marketing or whatever. And then you find somebody who's great and you're like, oh my God, I was so poor, tactically, at doing this thing. And it's definitely humbling every day. And it's almost, like, gosh, you're just—this year was kind of this role where you're just, like, mediocre at, like, a whole lot of things as a company, but you're the only person that has to do the job because you have the context and you just have to go and do it. And so, it's definitely humbling. And in some ways, I'm learning, so for me today, it's still a lot of fun to learn.
But I used to believe that out there was some company whose internal infrastructure for what they'd built was glorious and it would be amazing. And I knew I would never work there, nor would I want to, because when everything's running perfectly, all I can really do is mess that up; there's no way to win and a bunch of ways to lose. But I found that place doesn't exist. Every time I talk to someone about how they built the thing that they built and I ask them, "If you were starting over from scratch, what would you do differently?" the answer often distills down to, "Oh, everything." Because it's an organically evolving system, and, oh yeah, everything's easier the second time. At least you get to find new failure modes going in that way. When you look back at how you designed it originally, are there any missteps that you could have saved yourself a whole lot of grief by not making the first time? Alex: Gosh, so many things. But if I were to give the Hollywood highlights on these things: something that [unintelligible 00:26:35] does well is exposing these high-level data types of, like, streams and lists and maps, et cetera. And I was like, "Well, why couldn't streams offer this as a first-class citizen?" And we got some things right which I think we would still do, like the whole [thread recorder 00:26:49] could—like, the fundamentals of the engine I would still do the same. But, you know, exposing new programming models earlier in the life of the product, I think, would have allowed us to capture even more wildly different use cases. But now we kind of have this production engine, we have to support the Fortune 2000, so, you know, it's kind of like a very delicate evolution of the product. Definitely would have changed—I would have added, like, custom data types upfront, I would have pushed a little harder on, I think, WebAssembly than we did originally. Man, I could just go on for—like, [added detail 00:27:21], I would definitely have changed things. Like, I would have pressed on the first—on the version of the cloud that we talked about early on, as the first deployment mode. If we go back through the stack of all of the products we have, it's funny, like, 11 products that are surfaced to the customers, to, like, business lines, I would change fundamental things about just [laugh], you know, everything else. I think that's maybe the curse of the expert. Like, you know, you could always find improvements. Corey: Oh, always. I still look back at my career before starting this place when I was working in a bunch of finance companies, and—I'll never forget this; it was over a decade ago—we were building out our architecture in AWS, and doing a deal with a large finance company. And they said, "Cool, where's your data center?" And I said, "Oh, it's AWS." And they said, "Ha ha ha ha. Where's your data center?" And that was, oh, okay, great. Now, it feels like if that's their reaction, they have not kept pace with the times. It feels like it is easier to go to a lot of very serious enterprises with very serious businesses and serious workload concerns attendant to those and not get laughed out of the room because you didn't wind up doing a multi-million-dollar data center build-out with an eye toward making it look as enterprise-y as possible. Alex: Yeah. Okay, so here's, I think, maybe something a little bit controversial. I think that's true. People are moving to the cloud, and I don't think that that idea, especially when we go and talk to banks, is true.
They're like, "Hey, I have this contract with one of the hyperscale clouds"—you know, it's usually with two of them, and then you're like—"This is my workload. I want to spend $70 million or $100 million. Who could give me the biggest discount?" And then you kind of shop it around. But what we are seeing is that, effectively, the data transfer costs are so expensive, and running this at such a large volume of traffic is still so, so expensive, that there is an inverse [unintelligible 00:29:09] to host some category of the workload where you don't have dynamism. Actually hosting it in your own data center is, like, a huge boon in terms of cost efficiencies for the companies, especially where we are and especially in finance—you mentioned that—if you're trying to trade and you have this, like, steady-state line from nine to five, whatever, eight to four, whenever the markets are open, it's actually relatively cost-efficient because you can measure, hey, look, you know, the New York Stock Exchange is 1.5 gigabytes per second at market close. Like, I could provision my hardware to beat this. And, like, it'll be that I don't need this dynamism that the cloud gives me. And so yeah, it's kind of fascinating for us because we offer the self-hosted Redpanda, which can adapt to super low latencies with kernel parameter tuning, and the cloud, due to the tiered storage, we talked about S3 being [unintelligible 00:29:52] to, so it's been really fun to participate in deployments where we have both. And you couldn't—they couldn't look more different. I mean, it almost looks like two companies. Corey: One last question before we wind up calling it an episode. I think I saw something fly by on Twitter a while back as I slowly returned to the platform—no, I'm not calling it X—something you're doing involving a scholarship. Can you tell me a bit more about that? Alex: Yeah. So, you know, I'm a Latino CEO, first generation in the States, and some of the things that I felt really frustrated with growing up, that, like, I feel fortunate because I got to [unintelligible 00:30:25], is that, you know, people that look like me are probably given some bullshit QA jobs, so like, you know, a behemoth job, I think, for a bank. And so, I wanted to change that. And so, we give money and mentorship to people and we release all of the intellectual property. And so, we mentor someone—actually, anyone from underrepresented backgrounds—for three months. We give them, like, 1200 bucks a month—or 1500, I can't remember—and mentorship from our top principal-level engineers that have worked at Amazon and Google and Facebook and basically the world's top companies. And so, they meet with them one hour a week, we give them money, they could sit on the couch if they want to. No one has to [unintelligible 00:31:06]. And all we're trying to do is, like, "Hey, if you are part of this group, go and try to build something super hard." [laugh]. And often their ideas are really ambitious, which is great, and they're like, "I want to build an OpenAI competitor in three months, and here's the week-by-week progress." Or, "I want to build a new storage engine, new database in three months." And that's the kind of people that we want to help: these, like, super ambitious people who just haven't had a chance to be mentored by some of the world's best engineers. And I just want to help them. Like, we—this is a non-scalable project. I meet with them once a week.
I don't want to have a team of, like, ten people. Like, to me, I feel like the most valuable thing I could do is to give them my time and to help mentor them. I was like, "Hey, let's think about this problem. Let's decompose this. How do you think about this?" And then bring you the best engineers that, you know, work with me, and let me help you think about problems differently and give you some money. And we just don't care how you use the time or the money; we just want people to work on hard problems. So, it's active. It runs once a year, and if anyone is listening to this, if you want to send it to your friends, we'd love to have that application. It's for anyone in the world, too, as long as we can send the person a check [laugh]. You know, my head of finance is not going to walk to a Moneygram—which we have done in the past—but other than that, as long as you have a bank account that we can send the check to, you should be able to apply. Corey: That is a compelling offer, particularly in the current macro environment that we find ourselves in. We'll definitely put a link to that into the [show notes 00:32:32]. I really want to thank you for taking the time to, I guess, get me up to speed on what it is you're doing. If people want to learn more, where's the best place for them to go? Alex: On Twitter, my handle is @emaxerrno, which stands for the largest error in the kernel. I felt like that was apt for my handle. So, that's one. Feel free to find me on the community Slack. There's a Slack button on the website redpanda.com on the top right. I'm always there if you want to DM me. Feel free to stop by. And yeah, thanks for having me. This was a lot of fun. Corey: Likewise. I look forward to the next time. Alex Gallego, CEO and founder at Redpanda. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice, along with an insulting comment that I will almost certainly never read because they have not figured out how to get data from one place to another. Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.
Yaniv Ben Hemo (@yanivbh1, Founder/CEO at @memphis_Dev) talks about Memphis Cloud, an alternative architecture for delivering streaming data for applications.

SHOW: 747

CLOUD NEWS OF THE WEEK - http://bit.ly/cloudcast-cnotw

NEW TO CLOUD? CHECK OUT - "CLOUDCAST BASICS"

SHOW SPONSORS:
Reduce the complexities of protecting your workloads and applications in a multi-cloud environment. Panoptica provides comprehensive cloud workload protection integrated with API security to protect the entire application lifecycle. Learn more about Panoptica at panoptica.app
CloudZero – Cloud Cost Visibility and Savings. CloudZero provides immediate and ongoing savings with 100% visibility into your total cloud spend.
AWS Insiders is an edgy, entertaining podcast about the services and future of cloud computing at AWS. Listen to AWS Insiders in your favorite podcast player.
Cloudfix Homepage

SHOW NOTES:
Memphis.dev (homepage)
Getting Started with Memphis (docs page)
Apache Kafka vs. Memphis
Memphis on GitHub

Topic 1 - Welcome to the show. Tell us a little bit about your background, and what brought you to create Memphis.Dev.
Topic 2 - Let's start at the beginning. Most folks will want to know why a streaming alternative. Isn't Kafka good enough? What challenges did you personally encounter?
Topic 3 - In reviewing the architecture, it mentions differences between a broker and a streaming stack. Can you elaborate on what that means? What components are typically needed for a proper data streaming solution?
Topic 4 - One of the common issues with Kafka I hear about is operations complexity over time. It isn't uncommon that the more a system scales, the more complex it is to operate, and also maybe the harder it is to get insights and mine for key data, for instance. Have you seen this in your experience?
Topic 5 - Let's talk use cases. How do you envision organizations using Memphis Cloud? What problems are you trying to solve in the market? Is Memphis Cloud a SaaS offering? How would it be implemented in an organization?
Topic 6 - The data management side of all of this tends to always be problematic. Where and how is the data managed? What does the lifecycle of the data look like and what design considerations went into this aspect?
Topic 7 - When building large distributed streaming systems, I'm sure there are trade-offs and optimizations of features to consider. What are you optimizing for and what are the design tradeoffs developers need to consider?

FEEDBACK?
Email: show at the cloudcast dot net
Twitter: @thecloudcastnet
Anna Belak, Director of The Office of Cybersecurity Strategy at Sysdig, joins Corey on Screaming in the Cloud to discuss the findings in this year's newly-released Sysdig Global Cloud Threat Report. Anna explains the challenges that teams face in ensuring their cloud is truly secure, including quantity of data versus quality, automation, and more. Corey and Anna also discuss how much faster attacks are able to occur, and Anna gives practical insights into what can be done to make your cloud environment more secure. About AnnaAnna has nearly ten years of experience researching and advising organizations on cloud adoption with a focus on security best practices. As a Gartner Analyst, Anna spent six years helping more than 500 enterprises with vulnerability management, security monitoring, and DevSecOps initiatives. Anna's research and talks have been used to transform organizations' IT strategies and her research agenda helped to shape markets. Anna is the Director of The Office of Cybersecurity Strategy at Sysdig, using her deep understanding of the security industry to help IT professionals succeed in their cloud-native journey.Anna holds a PhD in Materials Engineering from the University of Michigan, where she developed computational methods to study solar cells and rechargeable batteries.Links Referenced: Sysdig: https://sysdig.com/ Sysdig Global Cloud Threat Report: https://www.sysdig.com/2023threatreport duckbillgroup.com: https://duckbillgroup.com TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. This promoted guest episode is brought to us by our friends over at Sysdig. And once again, I am pleased to welcome Anna Belak, whose title has changed since last we spoke to Director of the Office of Cybersecurity Strategy at Sysdig. Anna, welcome back, and congratulations on all the adjectives.Anna: [laugh]. Thank you so much. It's always a pleasure to hang out with you.Corey: So, we are here today to talk about a thing that has been written. And we're in that weird time thing where while we're discussing it at the moment, it's not yet public but will be when this releases. The Sysdig Global Cloud Threat Report, which I am a fan of. I like quite a bit the things it talks about and the ways it gets me thinking. There are things that I wind up agreeing with, there are things I wind up disagreeing with, and honestly, that makes it an awful lot of fun.But let's start with the whole, I guess, executive summary version of this. What is a Global Cloud Threat Report? Because to me, it seems like there's an argument to be made for just putting all three of the big hyperscale clouds on it and calling it a day because they're all threats to somebody.Anna: To be fair, we didn't think of the cloud providers themselves as the threats, but that's a hot take.Corey: Well, an even hotter one is what I've seen out of Azure lately with their complete lack of security issues, and the attackers somehow got a Microsoft signing key and the rest. I mean, at this point, I feel like Charlie Bell was brought in from Amazon to head cybersecurity and spent the last two years trapped in the executive washroom or something. 
But I can't prove it, of course. No, you target the idea of threats in a different direction, towards what people more commonly think of as threats.Anna: Yeah, the bad guys [laugh]. I mean, I would say that this is the reason you need a third-party security solution, buy my thing, blah, blah, blah, but [laugh], you know? Yeah, so we are—we have a threat research team like I think most self-respecting security vendors these days do. Ours, of course, is the best of them all, and they do all kinds of proactive and reactive research of what the bad guys are up to so that we can help our customers detect the bad guys, should they become their victims.Corey: So, there was a previous version of this report, and then you've, in long-standing tradition, decided to go ahead and update it. Unlike many of the terrible professors I've had in years past, it's not just slap a new version number, change the answers to some things, and force all the students to buy a new copy of the book every year because that's your retirement plan, you actually have updated data. What are the big changes you've seen since the previous incarnation of this?Anna: That is true. In fact, we start from scratch, more or less, every year, so all the data in this report is brand new. Obviously, it builds on our prior research. I'll say one clearly connected piece of data is, last year, we did a supply chain story that talked about the bad stuff you can find in Docker Hub. This time we upleveled that and we actually looked deeper into the nature of said bad stuff and how one might identify that an image is bad.And we found that 10% of the malware scary things inside images actually can't be detected by most of your static tools. So, if you're thinking, like, static analysis of any kind, SCA, vulnerability scanning, just, like, looking at the artifact itself before it's deployed, you actually wouldn't know it was bad. So, that's a pretty cool change, I would say [laugh].Corey: It is. And I'll also say what's going to probably sound like a throwaway joke, but I assure you it's not, where you're right, there is a lot of bad stuff on Docker Hub and part of the challenge is disambiguating malicious-bad and shitty-bad. But there are serious security concerns to code that is not intended to be awful, but it is anyway, and as a result, it leads to something that this report gets into a fair bit, which is the ideas of, effectively, lateralling from one vulnerability to another vulnerability to another vulnerability to the actual story. I mean, Capital One was a great example of this. They didn't do anything that was outright negligent like leaving an S3 bucket open; it was a determined sophisticated attacker who went from one mistake to one mistake to one mistake to, boom, keys to the kingdom. And that at least is a little bit more understandable even if it's not great when it's your bank.Anna: Yeah. I will point out that in the 10% that these things are really bad department, it was 10% of all things that were actually really bad. So, there were many things that were just shitty, but we had pared it down to the things that were definitely malicious, and then 10% of those things you could only identify if you had some sort of runtime analysis. Now, runtime analysis can be a lot of different things. It's just that if you're relying on preventive controls, you might have a bad time, like, one times out of ten, at least.But to your point about, kind of, chaining things together, I think that's actually the key, right? 
Like, that's the most interesting moment is, like, which things can they grab onto, and then where can they pivot? Because it's not like you barge in, open the door, like, you've won. Like, there's multiple steps to this process that are sometimes actually quite nuanced. And I'll call out that, like, one of the other findings we got this year that was pretty cool is that the time it takes to get through those steps is very short. There's a data point from Mandiant that says that the average dwell time for an attacker is 16 days. So like, two weeks, maybe. And in our data, the average dwell time for the attacks we saw was more like ten minutes.Corey: And that is going to be notable for folks. Like, there are times where I have—in years past; not recently, mind you—I have—oh, I'm trying to set something up, but I'm just going to open this port to the internet so I can access it from where I am right now and I'll go back and shut it in a couple hours. There was a time that that was generally okay. These days, everything happens so rapidly. I mean, I've sat there with a stopwatch after intentionally committing AWS credentials to Gif-ub—yes, that's how it's pronounced—and 22 seconds until the first probing attempt started hitting, which was basically impressively fast. Like, the last thing in the entire sequence was, and then I got an alert from Amazon that something might have been up, at which point it is too late. But it's a hard problem and I get it. People don't really appreciate just how quickly some of these things can evolve.Anna: Yeah. And I think the main reason, from at least what we see, is that the bad guys are into the cloud saying, right, like, we good guys love the automation, we love the programmability, we love the immutable infrastructure, like, all this stuff is awesome and it's enabling us to deliver cool products faster to our customers and make more money, but the bad guys are using all the same benefits to perpetrate their evil crimes. So, they're building automation, they're stringing cool things together. Like, they have scripts that they run that basically just scan whatever's out there to see what new things have shown up, and they also have scripts for reconnaissance that will just send a message back to them through Telegram or WhatsApp, letting them know like, “Hey, I've been running, you know, for however long and I see a cool thing you may be able to use.” Then the human being shows up and they're like, “All right. Let's see what I can do with this credential,” or with this misconfiguration or what have you. So, a lot of their initial, kind of, discovery into what they can get at is heavily automated, which is why it's so fast.Corey: I feel like, on some level, this is an unpleasant sharp shock for an awful lot of executives because, “Wait, what do you mean attackers can move that quickly? Our crap-ass engineering teams can't get anything released in less than three sprints. What gives?” And I don't think people have a real conception of just how fast bad actors are capable of moving.Anna: I think we said—actually [unintelligible 00:07:57] last year, but this is a business for them, right? They're trying to make money. And it's a little bleak to think about it, but these guys have a day job and this is it. Like, our guys have a day job, that's shipping code, and then they're supposed to also do security. 
The bad guys just have a day job of breaking your code and stealing your stuff.Corey: And on some level, it feels like you have a choice to make in which side you go at. And it's, like, which one of those do I spend more time in meetings with? And maybe that's not the most legitimate way to pick a job; ethics do come into play. But yeah, there's it takes a certain similar mindset, on some level, to be able to understand just how the security landscape looks from an attacker's point of view.Anna: I'll bet the bad guys have meetings too, actually.Corey: You know, you're probably right. Can you imagine the actual corporate life of a criminal syndicate? That's a sitcom in there that just needs to happen. But again, I'm sorry, I shouldn't talk about that. We're on a writer's strike this week, so there's that.One thing that came out of the report that makes perfect sense—and I've heard about it, but I haven't seen it myself and I wanted to dive into on this—specifically that automation has been weaponized in the cloud. Now, it's easy to misinterpret that the first time you read it—like I did—as, “Oh, you mean the bad guys have discovered the magic of shell scripts? No kidding.” It's more than that. You have reports of people using things like CloudFormation to stand up resources that are then used to attack the rest of the infrastructure.And it's, yeah, it makes perfect sense. Like, back in the data center days, it was a very determined attacker that went through the process of getting an evil server stuffed into a rack somewhere. But it's an API call away in cloud. I'm surprised we haven't seen this before.Anna: Yeah. We probably have; I don't know if we've documented before. And sometimes it's hard to know that that's what's happening, right? I will say that both of those things are true, right? Like the shell scripts are definitely there, and to your point about how long it takes, you know, to stopwatch, these things, on the short end of our dwell time data set, it's zero seconds. It's zero seconds from, like, A to B because it's just a script.And that's not surprising. But the comment about CloudFormation specifically, right, is we're talking about people, kind of, figuring out how to create policy in the cloud to prevent bad stuff from happening because they're reading all the best practices ebooks and whatever, watching the YouTube videos. And so, you understand that you can, say, write policy to prevent users from doing certain things, but sometimes we forget that, like, if you don't want a user to be able to attach user policy to something. If you didn't write the rule that says you also can't do that in CloudFormation, then suddenly, you can't do it in command line, but you can do it in CloudFormation. So there's, kind of, things like this, where for every kind of tool that allows this beautiful, programmable, immutable infrastructure, kind of, paradigm, you now have to make sure that you have security policies that prevent those same tools from being used against you and deploying evil things because you didn't explicitly say that you can't deploy evil things with this tool and that tool and that other tool in this other way. Because there's so many ways to do things, right?Corey: That's part of the weird thing, too, is that back when I was doing the sysadmin dance, it was a matter of taking a bunch of tools that did one thing well—or, you know, aspirationally well—and then chaining them together to achieve things. 
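A hypothetical illustration of the gap Anna describes above, with made-up account and role names: the first policy statement blocks a user from escalating privileges directly, but if a CloudFormation stack executes with a privileged service role, the same change can still happen through a template. Closing that path usually also means controlling which roles the user may pass to CloudFormation, not just denying the direct API call; the exact controls needed depend on your setup.

```python
# Example policy fragments only; not a complete or recommended configuration.
import json

deny_direct_escalation = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "NoDirectPolicyAttachment",
        "Effect": "Deny",
        "Action": ["iam:AttachUserPolicy", "iam:PutUserPolicy"],
        "Resource": "*",
    }],
}

deny_passing_privileged_roles = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "NoPassingAdminRolesToCloudFormation",
        "Effect": "Deny",
        "Action": "iam:PassRole",
        "Resource": "arn:aws:iam::123456789012:role/admin-*",  # hypothetical role naming
        "Condition": {
            "StringEquals": {"iam:PassedToService": "cloudformation.amazonaws.com"}
        },
    }],
}

print(json.dumps([deny_direct_escalation, deny_passing_privileged_roles], indent=2))
```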
Increasingly, it feels like that's what cloud providers have become, where they have all these different services with different capabilities. One of the reasons that I now have a three-part article series, each one titled, “17 Ways to Run Containers on AWS,” adding up for a grand total of 51 different AWS services you can use to run containers with, it's not just there to make fun of the duplication of efforts because they're not all like that. But rather, each container can have bad acting behaviors inside of it. And are you monitoring what's going on across that entire threatened landscape?People were caught flat-footed to discover that, “Wait, Lambda functions can run malware? Wow.” Yes, effectively, anything that can bang two bits together and return a result is capable of running a lot of these malware packages. It's something that I'm not sure a number of, shall we say, non-forward-looking security teams have really wrapped their heads around yet.Anna: Yeah, I think that's fair. And I mean, I always want to be a little sympathetic to the folks, like, in the trenches because it's really hard to know all the 51 ways to run containers in the cloud and then to be like, oh, 51 ways to run malicious containers in the cloud. How do I prevent all of them, when you have a day job?Corey: One point that it makes in the report here is that about who the attacks seem to be targeting. And this is my own level of confusion that I imagine we can probably wind up eviscerating neatly. Back when I was running, like, random servers for me for various projects I was working on—or working at small companies—there was a school of thought in some quarters that, well, security is not that important to us. We don't have any interesting secrets. Nobody actually cares.This was untrue because a lot of these things are running on autopilot. They don't have enough insight to know that you're boring and you have to defend just like everyone else does. But then you see what can only be described as dumb attacks. Like there was the attack on Twitter a few years ago where a bunch of influential accounts tweeted about some bitcoin scam. It's like, you realize with the access you had, you had so many other opportunities to make orders of magnitude more money if you want to go down that path or to start geopolitical conflict or all kinds of other stuff. I have to wonder how much these days are attacks targeted versus well, we found an endpoint that doesn't seem to be very well secured; we're going to just exploit it.Anna: Yeah. So, that's correct intuition, I think. We see tons of opportunistic attacks, like, non-stop. But it's just, like, hitting everything, honeypots, real accounts, our accounts, your accounts, like, everything. Many of them are pretty easy to prevent, honestly, because it's like just mundane stuff, whatever, so if you have decent security hygiene, it's not a big deal.So, I wouldn't say that you're safe if you're not special because none of us are safe and none of us are that special. But what we've done here is we actually deliberately wanted to see what would be attacked as a fraction, right? So, we deployed a honey net that was indicative of what a financial org would look like or what a healthcare org would look like to see who would bite, right? And what we expected to see is that we probably—we thought the finance would be higher because obviously, that's always top tier. But for example, we thought that people would go for defense more or for health care.And we didn't see that. 
We only saw, like, 5% I think for health—very small numbers for healthcare and defense and very high numbers for financial services and telcos, like, around 30% apiece, right? And so, it's a little curious, right, because you—I can theorize as to why this is. Like, telcos and finance, obviously, it's where the money is, like, great [unintelligible 00:14:35] for fraud and all this other stuff, right? Defense, again, maybe people don't think defense and cloud. Healthcare arguably isn't that much in cloud, right? Like, a lot of healthcare stuff is on-premise, so if you see healthcare in cloud, maybe you, like, think it's a honeypot or you don't [laugh] think it's worth your time? You know, whatever. Attacker logic is also weird. But yeah, we were deliberately trying to see which verticals were the most attractive for these folks. So, these attacks are in fact targeted because the victim looked like the kind of thing they should be looking for if they were into that. Corey: And how does it look in that context? I mean, part of me secretly suspects that an awful lot of terrible startup names, where they're so frugal they don't buy vowels, is a defense mechanism. Because when you wind up with something that looks like a cat falling on a keyboard as a company name, no attacker is going to know what the hell your company does, so therefore, they're not going to target you specifically. Clearly, that's not quite how it works. But what are those signals that someone gets into an environment and says, "Ah, this is clearly healthcare," versus telco versus something else? Anna: Right. I think you would be right. If you had, like… hhhijk as your company name, you probably wouldn't see a lot of targeted attacks. But what we're saying is either the company and the name look like a provider of that kind, and-slash-or they actually contain some sort of credential or data inside the honeypot that appears to be, like, a credential for a certain kind of thing. So, it's really just creatively naming things so they look delicious.
They're also usually better at attacking people than the opportunistic guys who will just spam everybody and see what they get, right? But even folks who are not necessarily nation states, right, like, we see a lot of attacks that probably aren't nation states, but they're quite sophisticated because we see them moving through the environment and pivoting and creating things and leveraging things that are quite interesting, right? So, one example is that they might go for a vulnerable EC2 instance—right, because maybe you have Log4J or whatever you have exposed—and then once they're there, they'll look around to see what else they can get. So, they'll pivot to the Cloud Control Plane, if it's possible, or they'll try to.And then in a real scenario we actually saw in an attack, they found a Terraform state file. So, somebody was using Terraform for provisioning whatever. And it requires an access key and this access key was just sitting in an S3 bucket somewhere. And I guess the victim didn't know or didn't think it was an issue. And so, this state file was extracted by the attacker and they found some [unintelligible 00:18:04], and they logged into whatever, and they were basically able to access a bunch of information they shouldn't have been able to see, and this turned into a data [extraction 00:18:11] scenario and some of that data was intellectual property.So, maybe that wasn't useful and maybe that wasn't their target. I don't know. Maybe they sold it. It's hard to say, but we increasingly see these patterns that are indicative of very sophisticated individuals who understand cloud deeply and who are trying to do intentionally malicious things other than just like, I popped [unintelligible 00:18:30]. I'm happy.Corey: This episode is sponsored in part by our friends at Calisti.Introducing Calisti. With Integrated Observability, Calisti provides a single pane of glass for accelerated root cause analysis and remediation. It can set, track, and ensure compliance with Service Level Objectives.Calisti provides secure application connectivity and management from datacenter to cloud, making it the perfect solution for businesses adopting cloud native microservice-based architectures. If you're running Apache Kafka, Calisti offers a turnkey solution with automated operations, seamless integrated security, high-availability, disaster recovery, and observability. So you can easily standardize and simplify microservice security, observability, and traffic management. Simplify your cloud-native operations with Calisti. Learn more about Calisti at calisti.app.Corey: I keep thinking of ransomware as being a corporate IT side of problem. It's a sort of thing you'll have on your Windows computers in your office, et cetera, et cetera, despite the fact that intellectually I know better. There were a number of vendors talking about ransomware attacks and encrypting data within S3, and initially, I thought, “Okay, this sounds like exactly a story people would talk about some that isn't really happening in order to sell their services to guard against it.” And then AWS did a blog post saying, “We have seen this, and here's what we have learned.” It's, “Oh, okay. So, it is in fact real.”But it's still taking me a bit of time to adapt to the new reality. I think part of this is also because back when I was hands-on-keyboard, I was unlucky, and as a result, I was kept from taking my aura near anything expensive or long-term like a database, and instead, it's like, get the stateless web servers. 
I can destroy those and we'll laugh and laugh about it. It'll be fine. But it's not going to destroy the company in the same way. But yeah, there are a lot of important assets in cloud that, if you don't have those assets, you will no longer have a company. Anna: It's funny you say that because I became a theoretical physicist instead of an experimental physicist because when I walked into the room, all the equipment would stop functioning. Corey: Oh, I like that quite a bit. It's one of those ideas of, yeah, your aura just winds up causing problems. Like, "You are under no circumstances to be within 200 feet of the SAN. Is that clear?" Yeah, same type of approach. One thing that I particularly like that showed up in the report, that has honestly been near and dear to my heart, is when you talk about mitigations around compromised credentials: at one point, when GitHub winds up having an AWS credential in it, AWS has scanners and a service that will catch that and apply a quarantine policy to those IAM credentials. The problem is, that policy goes nowhere near far enough at all. I wound up having a fun thought experiment a while back, not necessarily focusing on attacking the cloud so much as a denial-of-wallet attack. With a quarantined key, how much money can I cost? And I had to give up around the $26 billion mark. And okay, that project can't ever see the light of day because it'll just cause grief for people. The problem is that the mitigations around trying to list the bad things and enumerate them mean that you're forever trying to enumerate something that is innumerable in and of itself. It feels like having a hard policy of "once this is compromised, it's not good for anything" would be the right answer. But people argue with me on that.
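A minimal sketch of the harder line Corey is arguing for, assuming an IAM user whose access key has leaked (the user name and key ID are placeholders): rather than relying on a quarantine policy that denies an enumerated list of actions, deactivate the leaked key and pin an explicit deny-all on the principal until a human has rotated credentials and reviewed activity.

```python
# Hypothetical incident-response helper; adapt to your own IAM layout.
import json
import boto3

iam = boto3.client("iam")

DENY_ALL = {
    "Version": "2012-10-17",
    "Statement": [{"Effect": "Deny", "Action": "*", "Resource": "*"}],
}

def lock_down(user_name: str, access_key_id: str) -> None:
    # Stop the leaked key from authenticating at all.
    iam.update_access_key(UserName=user_name, AccessKeyId=access_key_id, Status="Inactive")
    # An explicit deny overrides any allow the principal otherwise carries.
    iam.put_user_policy(
        UserName=user_name,
        PolicyName="incident-deny-all",
        PolicyDocument=json.dumps(DENY_ALL),
    )

# Usage (placeholder values): lock_down("deploy-bot", "AKIAEXAMPLEKEYID")
```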
It's just, like, a very small mistake, but the attacker knew what to do, right? The attacker went and enumerated all these accounts or whatever, like, they see what's in the environment, they see the different one, and they go, “Oh, these suckers created a convention, and like, this joker didn't follow it. And I've won.” Right? So, they know to check with that stuff.But our guys have so much going on that they might forget, or they might just you know, typo, like, whatever. Who cares. Is it case-sensitive? I don't know. Is it not case-sensitive? Like, some policies are, some policies aren't. Do you remember which ones are and which ones aren't? And so, it's a little hopeless and painful as, like, a cloud defender to be faced with that, but that's sort of the reality.And right now we're in kind of like, ah, preventive security is the way to save yourself in cloud mode, and these things just, like, they don't come up on, like, the benchmarks and, like the configuration checks and all this other stuff that's just going, you know, canned, did you, you know, put MFA on your user account? Like, yeah, they did, but [laugh] like, they gave it a wrong name and now it's a bad na—so it's a little bleak.Corey: There's too much data. Filtering it becomes nightmarish. I mean, I have what I think of as the Dependabot problem, where every week, I get this giant list of Dependabot freaking out about every repository I have on Gif-ub and every dependency thereof. And some of the stuff hasn't been deployed in years and I don't care. Other stuff is, okay, I can see how that markdown parser could have malicious input passed to it, but it's for an internal project that only ever has very defined things allowed to talk to it so it doesn't actually matter to me.And then at some point, it's like, you expect to read, like, three-quarters of the way down the list of a thousand things, like, “Oh, and by the way, the basement's on fire.” And then have it keep going on where it's… filtering the signal from noise is such a problem that it feels like people only discover the warning signs after they're doing forensics when something has already happened rather than when it's early enough to be able to fix things. How do you get around that problem?Anna: It's brutal. I mean, I'm going to give you, like, my [unintelligible 00:24:28] vendor answer: “It's just easy. Just do what we said.” But I think [laugh] in all honesty, you do need to have some sort of risk prioritization. I'm not going to say I know the answer to what your algorithm has to be, but our approach of, like, oh, let's just look up the CVSS score on the vulnerabilities. Oh, look, 600,000 criticals. [laugh]. You know, you have to be able to filter past that, too. Like, is this being used by the application? Like, has this thing recently been accessed? Like, does this user have permissions? Have they used those permissions?Like, these kinds of questions that we know to ask, but you really have to kind of like force the security team, if you will, or the DevOps team or whatever team you have to actually, instead of looking at the list and crying, being like, how can we pare this list down? Like anything at all, just anything at all. And do that iteratively, right? And then on the other side, I mean, it's so… defense-in-depth, like, right? 
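Those filtering questions — is the vulnerable package actually in use, is it reachable, has the asset or credential been touched recently — translate directly into a scoring pass over the raw findings list. A toy sketch of that kind of iterative pare-down, with made-up field names and arbitrary weights standing in for whatever your scanner and your risk appetite actually dictate:

```python
from dataclasses import dataclass


@dataclass
class Finding:
    identifier: str
    cvss: float              # base severity reported by the scanner
    package_in_use: bool     # is the vulnerable package actually loaded at runtime?
    internet_exposed: bool   # can untrusted input reach it?
    recently_accessed: bool  # has the affected asset or credential been touched lately?


def risk_score(f: Finding) -> float:
    """Weight raw severity by contextual signals; the weights here are arbitrary
    illustrations, not a recommendation."""
    score = f.cvss
    score *= 1.0 if f.package_in_use else 0.2
    score *= 1.5 if f.internet_exposed else 1.0
    score *= 1.2 if f.recently_accessed else 0.8
    return score


def pare_down(findings: list[Finding], keep: int = 20) -> list[Finding]:
    """Instead of staring at 600,000 criticals, keep the top handful."""
    return sorted(findings, key=risk_score, reverse=True)[:keep]


if __name__ == "__main__":
    sample = [
        Finding("log4j on an internet-facing service", 10.0, True, True, True),
        Finding("markdown parser in an internal-only tool", 9.8, True, False, False),
        Finding("dependency in a repo not deployed in years", 9.1, False, False, False),
    ]
    for f in pare_down(sample, keep=2):
        print(f"{risk_score(f):6.2f}  {f.identifier}")
```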
I know it's—I'm not supposed to say that because it's like, not cool anymore, but it's so true in cloud, like, you have to assume that all these controls will fail and so you have to come up with some—Corey: People will fail, processes will fail, controls will fail, and great—Anna: Yeah.Corey: How do you make sure that one of those things failing isn't winner-take-all?Anna: Yeah. And so, you need some detection mechanism to see when something's failed, and then you, like, have a resilience plan because you know, if you can detect that it's failed, but you can't do anything about it, I mean, big deal, [laugh] right? So detection—Corey: Good job. That's helpful.Anna: And response [laugh]. And response. Actually, mostly response yeah.Corey: Otherwise, it's, “Hey, guess what? You're not going to believe this, but…” it goes downhill from there rapidly.Anna: Just like, how shall we write the news headline for you?Corey: I have to ask, given that you have just completed this report and are absolutely in a place now where you have a sort of bird's eye view on the industry at just the right time, over the past year, we've seen significant macro changes affect an awful lot of different areas, the hiring markets, the VC funding markets, the stock markets. How has, I guess, the threat space evolved—if at all—during that same timeframe?Anna: I'm guessing the bad guys are paying more than the good guys.Corey: Well, there is part of that and I have to imagine also, crypto miners are less popular since sanity seems to have returned to an awful lot of people's perspective on money.Anna: I don't know if they are because, like, even fractions of cents are still cents once you add up enough of them. So, I don't think [they have stopped 00:26:49] mining.Corey: It remains perfectly economical to mine Bitcoin in the cloud, as long as you use someone else's account to do it.Anna: Exactly. Someone else's money is the best kind of money.Corey: That's the VC motto and then some.Anna: [laugh]. Right? I think it's tough, right? I don't want to be cliche and say, “Look, oh automate more stuff.” I do think that if you're in the security space on the blue team and you are, like, afraid of losing your job—you probably shouldn't be afraid if you do your job at all because there's a huge lack of talent, and that pool is not growing quick enough.Corey: You might be out of work for dozens of minutes.Anna: Yeah, maybe even an hour if you spend that hour, like, not emailing people, asking for work. So yeah, I mean, blah, blah, skill up in cloud, like, automate, et cetera. I think what I said earlier is actually the more important piece, right? We have all these really talented people sitting behind these dashboards, just trying to do the right thing, and we're not giving them good data, right? We're giving them too much data and it's not good quality data.So, whatever team you're on or whatever your business is, like, you will have to try to pare down that list of impossible tasks for all of your cloud-adjacent IT teams to a list of things that are actually going to reduce risk to your business. And I know that's really hard to do because you're asking now, folks who are very technical to communicate with folks who are very non-technical, to figure out how to, like, save the business money and keep the business running, and we've never been good at this, but there's no time like the present to actually get good at it.Corey: Let's see, what is it, the best time to plant a tree was 20 years ago. The second best time is now. 
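To tie the detection-and-response point at the top of this exchange back to the earlier quarantine discussion: a response playbook can be very small and still beat a quarantine policy that leaves a leaked key partially usable. A minimal boto3 sketch, assuming some detection feed has already handed you the user and key ID (all identifiers below are hypothetical):

```python
import boto3


def respond_to_compromised_key(user_name: str, access_key_id: str, topic_arn: str) -> None:
    """Disable a flagged access key outright and notify the on-call channel.
    Deactivating (rather than deleting) preserves the key for forensics."""
    iam = boto3.client("iam")
    sns = boto3.client("sns")

    iam.update_access_key(
        UserName=user_name,
        AccessKeyId=access_key_id,
        Status="Inactive",
    )
    sns.publish(
        TopicArn=topic_arn,
        Subject="Compromised access key disabled",
        Message=f"Access key {access_key_id} for {user_name} was set to Inactive.",
    )


if __name__ == "__main__":
    # Hypothetical identifiers, purely for illustration.
    respond_to_compromised_key(
        user_name="adminanna",
        access_key_id="AKIAIOSFODNN7EXAMPLE",
        topic_arn="arn:aws:sns:us-east-1:123456789012:security-oncall",
    )
```

The design choice worth noting is that detection feeds directly into an action plus a human notification; detection that only produces a dashboard entry is, as the conversation puts it, just a head start on writing the headline.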
Same sort of approach. I think that I'm seeing less of the obnoxious whining that I saw for years about how there's a complete shortage of security professionals out there. It's, “Okay, have you considered taking promising people and training them to do cybersecurity?” “No, that will take six months to get them productive.” Then they sit there for two years with the job rec open. It's hmm. Now, I'm not a professor here, but I also sort of feel like there might be a solution that benefits everyone. At least that rhetoric seems to have tamped down.Anna: I think you're probably right. There's a lot of awesome training out there too. So there's, like, folks giving stuff away for free that's super resources, so I think we are doing a good job of training up security folks. And everybody wants to be in security because it's so cool. But yeah, I think the data problem is this decade's struggle, more so than any other decades.Corey: I really want to thank you for taking the time to speak with me. If people want to learn more, where can they go to get their own copy of the report?Anna: It's been an absolute pleasure, Corey, and thanks, as always for having us. If you would like to check out the report—which you absolutely should—you can find it ungated at www.sysdig.com/2023threatreport.Corey: You had me at ungated. Thank you so much for taking the time today. It's appreciated. Anna Belak, Director of the Office of Cybersecurity Strategy at Sysdig. This promoted guest episode has been brought to us by our friends at Sysdig and I'm Cloud Economist Corey Quinn.If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice along with an insulting comment that no doubt will compile into a malicious binary that I can grab off of Docker Hub.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.
Colleen Coll, Account Executive at The Duckbill Group, joins Corey on Screaming in the Cloud to discuss her journey of breaking into tech and why it's so important to make your presence known. Colleen explains how she wound up working for The Duckbill Group after taking the initiative to go and meet Corey at a networking event, and what motivates her to take risks and do things that might feel intimidating in order to advance her career. Colleen and Corey also discuss the power of influencer marketing, as well as the focus The Duckbill Group has on setting a high standard for employee onboarding and culture. About ColleenColleen Coll is a native of Pittsburgh and wannabe tech geek working in tech media sales, events, writing and marketing. She's an advocate for women and underrepresented communities in tech and is extremely proud of her efforts to learn coding languages and engage and connect diversity in the open source circle! When she's not geeking out and traveling the globe (and virtually) producing/ hosting tech podcasts and livestreams, she enjoys trips to local wineries, binging sci-fi, and hosting bolognese dinner parties.Links Referenced: Twitter: https://twitter.com/colleencoll LinkedIn: https://www.linkedin.com/in/colleen-coll-b971505/ Last Week in AWS sponsorship form: https://lastweekinaws.com/sponsorship TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: Human-scale teams use Tailscale to build trusted networks. Tailscale Funnel is a great way to share a local service with your team for collaboration, testing, and experimentation. Funnel securely exposes your dev environment at a stable URL, complete with auto-provisioned TLS certificates. Use it from the command line or the new VS Code extensions. In a few keystrokes, you can securely expose a local port to the internet, right from the IDE.I did this in a talk I gave at Tailscale Up, their first inaugural developer conference. I used it to present my slides and only revealed that that's what I was doing at the end of it. It's awesome, it works! Check it out!Their free plan now includes 3 users & 100 devices. Try it at snark.cloud/tailscalescream Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. Every once in a while, I like to do a bit of a behind-the-scenes episode where I talk to one of the people here at The Duckbill Group that makes the whole thing run. Because I've got to be honest, there's a certain audience and public perception that everyone on our team's page more or less just sits around and claps as I do all the work. And that's not true, at least, you know, 80% of the time. My guest today is a relatively new hire here at Duckbill. Colleen Coll is our account executive in media here at The Duckbill Group. Colleen, thank you for joining me, both in an employment sense and on the podcast.Colleen: [laugh]. Thanks for having me, Corey. This is an honor and privilege [laugh].Corey: You say that now we'll see how you feel by the end of this conversation. There's always that. So, I find when you're trying to tell a story, one of the best places to begin is the beginning. 
And we look at people in the space who are doing things that are aspirational or admirable and we have this tendency to believe that they were always this way as if they were formed fully and sprung live from the forehead of some ancient god. And that isn't how it works. Where do you come from? How do you find yourself now? And where did you start before getting here?Colleen: I come from a long line—well, possibly a short line—of people who wanted to be a journalist. They wanted to be the All the President's Men, and that's what I grew up with. And I took the journalistic ethics classes for it and that's what I wanted to do. I wanted to be hard-hitting. And then after I graduated, I found that it did not pay well at all, and [laugh] it was a hard jobs to get. It was just hard to get in. So, I had to figure out a plan. And somehow I went through nonprofits, hospitals, and all kinds of marketing gigs, and then I ended up in tech and I still don't know how I got here. I think it was a dare basically. It was basically a dare [laugh].Corey: Something that you didn't do post-journalism that, as best I can tell, is the path that so many, I don't want to use the term ‘fallen journalist,' so I won't, but so many folks who have gone through journalism decide to drift into is in many cases PR and corporate comms. And, on some level, we've now hit a point where I think there's two or three PR professionals for every working journalist, at least. And that ratio is even more skewed in tech. You didn't go in that direction. Was that not something that appealed? Did it not exist in the timeframe that you were making that transition in the same numbers that it does these days? What was it that wound up, I guess, making that not a path you went on?Colleen: Actually Corey, I did.Corey: Oh, wonderful.Colleen: Yeah, for a short time. I went in as communications, doing a little bit of PR work. Not a lot because that's not what my focus was. But because—this is what I found out—because you had that writing specialty that you could go into PR, somehow be dropped into it, but then it will lead you to other things like speech writing for whomever VP or C-level. Because I found that a lot of people with Business Economics degrees did not really know how to write or spell for that matter [laugh]. And so, I was sort of used from the bottom once they knew that I had these writing abilities to help in that manner. And that's-I dropped into PR. It was okay, but I love being engaged in the community more, so they put me into events.Corey: I did not realize that. And I apologize. It turns out that we don't do the in-depth, totally invasive style of background check that it seems is becoming more and more common in some places. Imagine that. No, it's one of those weird areas just because it's—I deal with so many PR types in different ways and, on some level, it's easy to fall into the trap of forming a dim view based upon the worst examples because those are the ones that stick in your minds.But a lot of folks have serious challenges in the corporate comms space and communicating authentically and effectively to the audience, which in many cases is something that, as you said, you take it in a slightly different direction. You do know how to write, you know how to spell, and that sounds like I'm being incredibly sarcastic, like, “Yay, and you can tie your shoes. Triple threat, baby,” but no, it's not. 
It is a vanishingly rare skill set in this space, and I see that and get frustrated with it every week when I try and put my newsletter together. And, “What am I going to link to?” I see promising titles that look like it's going to be germane and if I can't get through the first paragraph without facepalming, just based upon how poorly it's written, I don't want to inflict that on the audience.People lose sight of the fact that we're going to record a podcast or we're going to write blog posts, we're going to send out an email newsletter. How do we get the best production value for all of this without focusing on the most important part, which is, what are we going to talk about? How are we going to get people to care about it? And that's the thing that I think gets overlooked the most is the audiences will forgive all kinds of weird production-style issues. It's okay, this was filmed on your iPhone or it was just posted in plain text on a web dumped somewhere. People will read it if it's compelling. If it's not, it doesn't matter how big the production budget was, it's not interesting. And if it's not interesting, no one's going to care.Colleen: And I think that's where I'm sort of a professional in a sense where—and sort of, I'd like to say gifted in a way because I thought it was natural, I still think I'm a natural in storytelling. Regardless of where you are, it's how you tell your story where people would listen is very, very important. Whether you're doing a sales pitch or whether you're trying to sell a bride—from my event management background—just tell the story of how this is going to be successful and to try to sell her the 12-top instead of something else. So yeah, you have to just make it compelling and try to—I like to call myself a kick-ass storyteller, and that's what's gotten me here so far, and will get me to where I need to be in the future.Corey: I would agree with that. You've always had a fascinating curiosity, I think is the best approach to this. I still remember the first time we met. It was over a year ago at Monitorama where I threw the annual drink-up—or basically, the drink-up I throw whenever I'm in town somewhere, otherwise, people get annoyed that I didn't remember to hang out with them personally. And then well, I'll be there for six weeks. That's how many lunches I need to book.And you showed up and introduce yourself, and it was glorious. It was, wow, someone wants to have an actual conversation about things I said on the internet. And for once, I don't think they're about to punch me right in my snarky face. Like, this is amazing. How do we get more people like this showing up? And when you applied to work here it was, “Oh, wait is Colleen? Colleen, Colleen?” And that definitely raised eyebrows.Colleen: Yeah. It was so weird. I love telling that story because it's a great story to tell because I heard about you where I was before, in the tech industry. And I started following him, like, this guy is… he's crazy [laugh]. I want to know more.Corey: Undoubtedly.Colleen: Exactly. And you were so tech-heavy, but you made it to a point where it was just, like, some kind of the humor and people got you, and I was just so—even though I didn't understand—I will admit, I don't understand half the stuff you're talking about, but the engagement that you had, I was just it was just so compelling, I'm just so interested in that. 
Because in any kind of work, you want to see, when somebody says something and people respond, and a lot of people respond whether it's bad or, “Good,” quote-unquote, but I was like, “I got to meet this guy.” And then one time, you were in Portland. Who knew? I thought you'd be somewhere else, and I was, “Like he's at Momo's.” Which I live in Portland; it's right down the street.I was like, I'm going to—I was on the couch, just laying down [laugh], doing nothing. I got dressed up, put on my eyelashes for you, Corey, and I went to the bar [laugh]. And I said, “You know what?” I was still nervous, I was [unintelligible 00:08:42], but you know what? I'm going to just say, “Hey, Corey. I'm Colleen. I work here. Nice to meet you finally.” So. It was so weird, you and Mike and everybody were just so nice and I just ended up having a—and I got the photo with your mouth open. So, that was awesome.Corey: Well, that's the important part. People walk up like I'm the mascot. Like, “Can I get a photo with you? Is that weird?” It's like, “Yes, it's extremely weird and you would not be the even fifth hundredth person to ask me for that this year. So sure, by all means.” My face has more or less become a cautionary tale to small children. “You know they say your face is going to freeze like this if you hold it?” “Yeah.”Colleen: It got me so much street cred though. I was so happy. Like, I didn't have that many followers in tech and then, soon as I put that picture and I tagged you in it, like, “Oh, my God. You met Corey.”I was like, “Shut up.” So it's, thank you. And then how we got here, I have no idea. Again, on Twitter. So [laugh].Corey: One thing leads to another leads to another. And I have to ask, as someone who is explicitly bad at this, namely, approaching someone I don't know and striking up a conversation, I have uniformly been terrible at this my entire life, I was bad at dating and honestly, it's the reason I became a conference speaker because once you give a talk, everyone starts the conversation with you and you're golden. Was it intimidating for you to come up to someone who you only knew is a loudmouth on Twitter who snarked about everything? And if so, what made you decide to do it anyway?Colleen: Yes, it was intimidating, I'm not going to lie. Your presence and how you talk and your directness and I didn't know who you were, I just knew your presence on social media. But you had a lot of followers. But you were here. And I was like, if I don't take this opportunity, then screw it.What made me do it is I had to do it for the women and the minorities who always wanted to be in tech and just let them know that you got to do it to make your presence, and you just get up and do it. It's not that hard. And that's how it started with me in tech in the first place. Somebody—it was a dare that I take a, I take [laugh] a Python class. And I didn't—I was like, okay, we'll do it. It should be easy.And I did. And I got into it. And that's how I learned the culture. So, when that opportunity arose and I came up to you, my heart was pounding so fast. I thought you would just like, “Oh, hey. You know, whatever.” And you were just, like, so engaging, and nice and you smiled, and I was like, “Wow. This is it. This is what I want to tell to people.” Even though it might seem hard, it really isn't once you give your all and just do it and take that chance. 
I know it sounds very cliche, but that's what happened.Corey: It's an interesting problem, just because the upside feels limited and the downside feels vast. And for me, I've gone through an awakening over the last few years as my Twitter audience got to a point where the baseline baked-in assumption I had no longer applied which was, when I started this company, I had less than 1500 followers. And no—all of them had seen me or knew who I was from conferences, so I always assumed that the people who are reading this know who I am, they know what I'm about. And I didn't really think of the use case of this is someone's only exposure to me. And I found out a few years back that I was inadvertently causing people pain, which is not what I set out to do, with the singular exception of Larry Ellison, who is not a person, nor does he feel pain.But as for everyone else, it's a, I'm not here to make your day actively worse. I'm here to advocate on behalf of, in most cases, customers of which I am one usually, and trying to make tomorrow's experience better than it is today, and as a part of that, to send the elevator back down. And I realized I was abdicating some of that responsibility, so it started an intentional shift toward being more mindful about how things I say can resonate. And I still get it wrong a lot. And I spent more time apologizing, but that's something that is going to be a lot more nuanced, tricky, and delicate than I think people give it credit for. And it turns out an apology is not just saying you're sorry. You actually have to change the behavior.Colleen: Hmm.Corey: More people should be aware of that one.Colleen: Yeah. Well, I think that's part of the attraction is that you own up to it [laugh]. So.Corey: I do try.Colleen: Yeah.Corey: During the interview process, you redistinguished yourself again and again. Like, one of my personal favorite memories of that was you asked about, I believe it was our event strategy of what do we do at events? And I said, “Yeah, that doesn't work because the people that buy sponsorships are generally not the same people at the events physically.” And you had the politest framing I can remember of, “Oh, you sweet summer child.” I forget the exact phrasing that you used, but it was simultaneously clear that you did not agree with what I just said in the least, were willing to challenge me on it, but also weren't going to come lunging across the table with, “Now, you listen here,” all of which I have seen before.It was the perfect mix of, “Ah, yeah, you just said something that's complete bullshit, but I'm just going to say something that lets you figure that out for yourself rather than leading you by hand down the journey of discovery of what a dumbass you're being.” And lo and behold, you were absolutely right. That's one of the more useful and also irritating aspects is when you have people who come in who are better in than you are at the job you hired them to do. It's a constant humbling process of hey, I thought I knew what I was doing, and guess what? I absolutely did not. And now I just step out of the way and hope I don't wind up causing problems accidentally.Colleen: Well, you're welcome [laugh]. It's my pleasure.Corey: It's worked out rather well.Colleen: [laugh]. Yes, yes. Yeah, I am glad that I had the opportunity to enlighten you and couple other people in the staff at the Duckbill Group. So yes, I definitely—I will definitely fall—what do you call it? 
A fall on the sword [laugh] or die on a hill that particular topic: event management, sales, community, it all works.Corey: Before this, you spent a few years at The New Stack and that was honestly one of our biggest internal fears. We're a big fan of Alex and what he's been able to build over there. It's like, is he going to hate us if we extend an offer to you? And of course, we could not ask him that question in advance because it's, “So, one of your employees is interviewing and we'd like to make them an offer. Is this going to cause a problem for you?”Yeah, that's called how you potentially just absolutely gut someone who took a chance on talking to you. Confidentiality in those things is required because you never know the actual story. And there's a power imbalance in the job interview process of, “Hey, do you mind if I talk to your current boss about bringing you aboard?” And what did they going to say? “Please don't?” Because, oh, that's going to potentially ring hollow. And there's nothing nefarious in it, but we knew there was no way we could ask it. So it's, well, we'll send him a lovely fruit basket and an apology note. Which I still need to get out. If you're listening to this, Alex, my apologies. It's on my backlog.Colleen: [laugh]. Yes, that's awesome. Now, that was a difficult decision. I wasn't even looking. I was on Twitter and I saw the opportunity. And I love being in the company of journalists for the first time. It's funny, I started off as a journalist, like, post-college and just writing a few things as a freelancer, and I ended up in corporate world.And then somehow I finally got back to working with journalists at The New Stack. I mean, these are writers talking about tech and writing about tech. And it was fantastic and I got to be a part of that. But when you get to phase where I am, my age [laugh], my experience, I wanted to do more. And when I saw that opportunity—and the opportunity to work with your brand because I met you last year—I was like, you know what? Well, we'll see. It's who'll know—who—you know, we'll see what happens. And I ended up having this won—Corey: You meet me for 30 seconds, you're like, “Well, I already had a perfect answer to ‘so why didn't it work out?'” It's like, “Have you met that jackhole? There we go.” No one is going to have a follow-up to that. You're always going to be assured that yep, that one is not going to come back on you.Colleen: Yeah. The New Stack is a wonderful place to start if you want to go into tech media, and just—and it's not a niche market like we have here, but it's just all around what's going on and the trends. And the people there taught me so much. So, I think that's kind of why I had the opportunity here. If I didn't have that opportunity at The New Stack, I don't think I would have had this opportunity to have these conversations with you, this team. So, yeah.Corey: I have to wonder, on some level, though, there's a this is niche upon niche because not only is it, we wind up focusing exclusively on the AWS market, but not only that, we also are incredibly sarcastic and generally make fun of you. Would you like to buy a sponsorship? I always thought that that was a ridiculous pitch that was never going to work, but sponsors have come in repeatedly and they still talk to us and still asked to give us money, which is frankly, somewhat surprising to me. 
And to my understanding that is no longer the exact pitch we give verbatim—because it's not strictly true—but it does feel like it's a harder conversation, then. “So, what do you do?” “We're a news site.” “What do you do there?” “We report the news.” “How do I sponsor?” “You give us money, and we put banner ads on the website, and possibly sponsored content. The end.” This feels like it's much more nuanced and as a result is probably a harder sales conversation.Colleen: You think so? I've noticed just by being in the know of tech and going into and reading everything about media and everything, people are—especially when it comes to buying—people are more apt—and you know this—to buy from influencers and customers than the actual company. And when you have somebody that's constantly in the know, tech-heavy like yourself or somebody with another product, whether it's nail polish or something like that, and they used it, they gave this wonderful review or gave a bad review, they're more apt to buy or not buy. I think that's why, that's the connection with the company, to somebody, the big company to purchase what we have to offer here at Last Week in AWS is because you're buying sort of an influence, and you have the audience of customers who believe in you and believe most of the things that you say [laugh] when you're not shitposting. And [laugh]—Corey: Wait, when am I not shitposting?Colleen: [laugh]. Exactly. And so, people a—I think that that's what these sponsors are buying. They want to buy the influence in the customers' view because they know that they'll get more buy-in. So, I think that whole buy this product because I am Heinz ketchup, that whole generation is gone. It's, buy the Heinz ketchup because what his name is using it on his hot dog all the time and he just absolutely loves it and Tik Toks about it all the time. I'm so I think that's why, I think it's a—I don't want to say it's an easy sell; everything's not that easy, but it's a fun and more compelling sell here.Corey: This episode is sponsored in part by our friends at Calisti.Introducing Calisti. With Integrated Observability, Calisti provides a single pane of glass for accelerated root cause analysis and remediation. It can set, track, and ensure compliance with Service Level Objectives.Calisti provides secure application connectivity and management from datacenter to cloud, making it the perfect solution for businesses adopting cloud native microservice-based architectures. If you're running Apache Kafka, Calisti offers a turnkey solution with automated operations, seamless integrated security, high-availability, disaster recovery, and observability. So you can easily standardize and simplify microservice security, observability, and traffic management. Simplify your cloud-native operations with Calisti. Learn more about Calisti at calisti.app.Corey: I want to be very clear a nuance here that I'm not sure it's fully understood, in that years ago, I made a very intentional choice of severing myself almost completely from the sponsor sales process here. And the reason is, is that I never wanted to find myself in a position of writing the weekly news and, I don't know, let's pick on a former sponsor, for example, Google Cloud does something that I'm about to dunk on. But oh, it turns out that Google is also sponsoring this issue so I probably shouldn't do that. I built my own version of an editorial firewall so I did not have that conflict. 
So, I say what I want, I don't find out who's sponsoring something until afterwards and to be very clear, to this day, I have never had a single complaint or piece of pushback on anything I've written from a sponsor who is sponsoring that issue, which is, frankly, tells me that it's sort of unnecessary from an external perspective, but it makes it work better for me.So, I don't know what a lot of the sales conversations look like. People reach out, “Hey, can I sponsor your stuff?” And it's, “Have you met Colleen?” And I get the hell out of that critical path as fast as possible. Also because I'm bad at email. And that just means that I'm more or less have a mystery box that I throw all of those things into and then sponsors come out the other side. And, lovely, I'll take it.Colleen: Yeah, that's—I mean… should you? You should get updates on who wants to buy and why, but most of the time, it's the audience. They want to connect to the audience that you have created, well, the company has created. And basically they'll know about this, I mean, eventually, they'll know about the services we provide, whether it's consulting, and then of course, the opportunities for ad placements at Last Week in WS—at AWS. Oh, God, I need to really improve that [laugh].But it's the audience. So, why if you would shitpost something or say something that might make them uncomfortable, why they would buy it anyway, I mean, that's a conversation that maybe we should keep having. But I think the answer is clear. I mean, it's the audience who believe in you and believe in what you have to say and believe in our brand. So, they want to get close to it in order to sell their product. And they have the money and means to do so. So, I don't question it too much.Corey: No. It's similar to the whole approach that I always take is I don't think too hard about what keeps the airplane in the air when I'm mid-flight because if I do, it might stop working. Similar here. It's like, I don't know why these people keep showing up and listening to what I have to say and caring about it. I'm not going to look too closely at it because then the magic might break. But that is probably at this point not the most helpful instinct I could have.Colleen: Yeah. That's good point, yeah. I think what you're doing is actually really great and it's keeping tech media interesting.Corey: I try anyway. To turn it around slightly, though, I have to ask you if you knew my public persona for probably entirely too long and then you got to actually work inside the sausage factory, and—which is a polite way of saying abattoir—and as you move aside as someone's, like, shoving a cow into a food processor behind you or whatnot, I have to ask, what's caught you by surprise once you got here that you did not know or expect before you joined?Colleen: I think what caught me by surprise is two things. The onboarding was, for such a small company, was immaculate. I didn't have to ask too many questions; things were there. And the questions that I asked were answered. So, the culture was so open to a point with feedback and how we do things that I didn't have to do most of the work and info gathering when you're going from 30, 60, to 90 days. So, I was shocked by that because usually in my past [laugh], I'm usually just, “Hey, got a job. Figure it out.” [laugh]. 
So, I went in that way, but it was just, it was fantastic.Then I'm not used to this, especially when you're the only… when people don't look like you [laugh] and you're usually the only one—especially when it comes to CEOs and founders—the amount of openness, friendliness, and direct feedback that wasn't as condescending—because this is what we expect most of the time—was just fantastic. It was just friendly and I can do my job without having to be attacked or attack any other mindsets that they might have some stereotypes of how—who I am and how I got here. It was just, you were just so professionally, “Colleen, this is what we expect. How do you feel about this?” The, how do you feel about this? Do you have any other ways or feedback that this could be better?And I know this sounds so corny, but when it comes to people like me, we don't always get that opportunity before—like, so fast, before we have to prove ourselves. I know, that's a long-winded way of saying that [laugh]. And so far, it's been a delight to a point where I sort of have a little bit of PTSD because I'm, like, how do I operate in this non-toxic [laugh] environment?Corey: And I don't know if you recall this, but you made the observation that in many places, there is an undercurrent of bias, be it conscious or otherwise, that causes people to out of hand reject proposals or ideas that come from people who do not look like the traditional person you would expect in that role to be framing those ideas. And how much of that would you encounter here? And that I thought was a poignant question that deserves a great answer. And my answer then remains as it is now, which is, “I don't know. I don't believe that we have that type of culture here.”But again, I wouldn't believe that we have that type of culture here, even if it were rampant. So, I would consider it a personal favor if you see elements of that to please let someone you trust here know that because it is certainly not our intent, it is certainly not who we aspire to be, but societal and systemic patterns are incredibly hard to break. And I don't know what a good answer to that would be. I know the bad answers are obvious of, “Oh, we don't hire anyone like that.” Or, “Nope, that's not a problem.” Or the worst, I suppose is, “How dare someone who doesn't look like me ask me that question,” which I'm pretty sure gets the high score for terrible answers. But I don't know what the good answer to that is, other than we're always learning and trying.Colleen: And that's basically how you did respond. And it was eye-opening. And it's not an easy question to ask. I'm like, “Will you have a problem with someone like me giving you feedback on something like this? Will you have a problem that someone that looks like me working this and doing this, and you know, just trying to do the job, or do I have to make you feel comfortable first?”And I'm at a point in my life where I don't have time to do that. I would rather just go on, let me do my work without, you know, making other people feel uncomfortable or feel comfortable. So, I will tell you, it's almost been two months. This is the first time I haven't had that feeling of trying to make people feel comfortable before I can actually do my job. And I am not kissing ass because you know, I'm really direct in that approach because—Corey: I have not known you to ever kiss ass—Colleen: [laugh].Corey: Which is probably a good thing, and also, some of them may be disturbing, but I don't know. 
It's like, “Oh, you're not authorized to fire me. It's fine.” Which I'm in fact not, so… cool, by all means. But no, it is refreshing. It really is.Colleen: It is. It's very refreshing. I want to—if I can even tell people out there that there are places like that you don't always have to use 50% of your time attacking stereotypes and you can actually do your job. They do exist out there. And this position so far has been living proof. And I do appreciate it.But I also want them to make sure that they know that I worked hard at this to get here and it's good to be appreciated, but it's also good to be respected and valued. And I do believe that you, that's why you hired me is because you saw what I was capable of and you valued my input, my feedback, and it's still going on, and we keep having these conversations. And I did not expect this interview, which I was like, “Is he serious?” I mean really. Why [laugh]?So, this is just another, like, example of how that—what I just talked about is being respected and valued, and regardless of if I don't look like you. And one of the funniest parts of our interview is when you said that whole manel description of how if you were asked to be on a panel, but it was a bunch of white males, and you refer to it as a manel, I've never heard of that [laugh] before, so I u—Corey: It's not my term. I don't want to claim credit for it at all. I heard from some wit on Twitter years ago that is lost to the mysteries of time. But it's a perfect description.Colleen: So, I use it. I steal it. It's awesome.Corey: It seems like one of those weird areas, too, where it's a—like, we're going to get stuff wrong. That is human nature. The question is, is when it's pointed out, how do you react? Do you get hyper-defensive? Do you just, like, turn that into a cudgel to beat other people with? Or do you take the lesson? Do you pass it forward to folks in a way that is constructive and helpful?And I believe one of the rejoinders I asked to you was, if you have an idea, we are absolutely going to hear it out, but there are going to be cases where… like, in the consulting side of our business, whenever I describe that we fix the AWS bill for our clients, I explain that to an engineer, and they think hard on that for two-and-a-half seconds and then they say the same thing in almost every case, which is, “You should charge a percentage of savings,” to a point where now my default reflexive response is, “Holy shit. I never thought of that. This is going to change everything.” Now, there are a variety of reasons that that doesn't work, but it is an obvious line of inquiry. And the only concern I had was, understand that there are things like that scattered throughout the business that things are like they are for a reason.And I'm thrilled to reevaluate and reexamine a bunch of those, but are you going to take the response of, “Well, this is why it is the way that it is,” as shutting down the line of inquiry? And your response was incredibly reassuring. You said, “No, that is strictly a business discussion. That's fine. I just want to be heard.”And I can't commit to always agreeing with ideas you have. I can't even give it to that to my business partner. In fact, correcting him is my favorite part of any given hour for me, but I can at least guarantee you'll be heard or you'll be—or I will hear you out on any of these concerns that you raise or ideas that you have. And I'd like to think that almost three months in, that we've lived up to that. And if not, please let us know. 
You don't need to actually call me out on it now if you don't want to. I realize, like, yeah, well, that's the right time to ask someone for harsh feedback, at a point where they cannot possibly give it to you other than that a really flattering way. Go for it. You need not respond.Colleen: [laugh]. No, I will keep that in mind. And so far, so good. We are good [laugh].Corey: I really want to thank you for taking the time to sit down and basically have to, I suppose justify after the fact why you accepted the job that we offered to you, which feels very strange, and yet here we are. If people want to learn more, where's the best place for them to find you?Colleen: Oh, where the best place to find me? Shall I mention Twitter [laugh]? Or [laugh]—Corey: That's always a bit of a dicey thing these days.Colleen: Well, Twitters, Threads, LinkedIn, wherever you, your heart's desire, you can find me at Colleen Coll. It's really easy.Corey: We'll put links to all of that in the [show notes 00:32:26].Colleen: Yes. But you can also find me in Portland and sometimes in Europe, and always just being open. And I love to meet people. I know that sounds weird, but if I have the opportunity to network, it's going to be—and please, if you ever see me at an event, just please walk up to me and say hello. And I—because I know that I would do the same with you. I did that with Corey [laugh].Corey: Or there's always the guaranteed way to make sure that you see something and that is to fill out the form at lastweekinaws.com/sponsorship. There's a little self-interest behind that one I absolutely am aware of and I'm putting that in there.Colleen: Nice.Corey: Thank you again for your time. I appreciate it.Colleen: No, thank you, Corey. And have fun with the squirrels and FedEx.Corey: I'll do my best. Colleen Coll, account executive here at The Duckbill Group. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice, along with an angry comment complaining about the episode disparaging the value of writing clearly and journalism is particular, and of course failing to have anything remotely resembling a coherent sentence structure while you do.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.
In today's episode, Luan Moreno and Mateus Oliveira interview Neha Pawar, currently a Founding Engineer at StarTree. Apache Pinot is a low-latency OLAP database that was built at LinkedIn for analytical queries. Its goal is to solve a problem that technologies like Apache Kafka do not: querying billions of events with high performance and low latency. With Apache Pinot you get the following benefits: high-performance analytical queries; data residing in Apache Pinot is compressed; and support for thousands of concurrent accesses to the data residing in Apache Pinot. We also discuss: the creation of Apache Pinot; user-facing analytics; deployment options for Apache Pinot; and what's coming next for Apache Pinot. Learn more about Apache Pinot, a technology capable of storing data in real time and executing queries at low latency, down to milliseconds. Neha Pawar = LinkedIn https://pinot.apache.org/ Luan Moreno = https://www.linkedin.com/in/luanmoreno/
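To get a feel for the low-latency query claim, Pinot's broker exposes a SQL endpoint over HTTP, so a first query needs nothing beyond an HTTP client. A minimal sketch, assuming a locally running cluster with the broker on its usual default port and a hypothetical events table:

```python
import requests

# Hypothetical broker address and table name; 8099 is the broker's usual default
# port, but adjust both for your deployment.
BROKER = "http://localhost:8099"


def query_pinot(sql: str) -> dict:
    """POST a SQL query to the Pinot broker's /query/sql endpoint."""
    resp = requests.post(f"{BROKER}/query/sql", json={"sql": sql}, timeout=10)
    resp.raise_for_status()
    return resp.json()


if __name__ == "__main__":
    result = query_pinot(
        "SELECT country, COUNT(*) AS views "
        "FROM events "  # 'events' is a placeholder table name
        "GROUP BY country ORDER BY views DESC LIMIT 10"
    )
    # Result rows come back under resultTable.rows in the broker's JSON response.
    for row in result.get("resultTable", {}).get("rows", []):
        print(row)
```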
Ricardo Gonzalez, Senior Principal Product Manager at Oracle, joins Corey on Screaming in the Cloud to discuss his approach to Product Management and cloud migration. Ricardo explains how a chance conversation landed him a role at Oracle, and why he feels it's so important to always bring your A-game in any conversation. Corey and Ricardo discuss why being a good Product Manager involves empathy for your customers and being able to speak their language as well as the language of your product and development team. Ricardo also explains how he's seen the Oracle product suite grow, and why he feels more and more companies are seeing the value of migrating their data to the cloud. About RicardoRicardo is a Product Manager at Oracle, in charge of Database Migration to the Cloud, and the ZDM and ACFS products.Ricardo is a native Costa Rican and has lived in Mexico, Italy and currently resides in the United States.He is passionate about technology, education, photography, music and cooking. He loves languages and connecting with people from all over the world. In a future life, Ricardo wants to own a taco truck, and share taco happiness with everybody.Links Referenced: Oracle: https://www.oracle.com/ LinkedIn: https://www.linkedin.com/in/ricardogonzaleza/ Twitter: https://twitter.com/productmanaged TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: Human-scale teams use Tailscale to build trusted networks. Tailscale Funnel is a great way to share a local service with your team for collaboration, testing, and experimentation. Funnel securely exposes your dev environment at a stable URL, complete with auto-provisioned TLS certificates. Use it from the command line or the new VS Code extensions. In a few keystrokes, you can securely expose a local port to the internet, right from the IDE.I did this in a talk I gave at Tailscale Up, their first inaugural developer conference. I used it to present my slides and only revealed that that's what I was doing at the end of it. It's awesome, it works! Check it out!Their free plan now includes 3 users & 100 devices. Try it at snark.cloud/tailscalescream Corey: Welcome to Screaming in the Cloud, I'm Corey Quinn. Some wit once said that 90% of life was just showing up. And I'm not going to suggest that today's guest has only the fact that he shows up going for him, but I do want to say that when I first met him, it was at a drink-up that I threw here in San Francisco. And he kept turning up to a variety of community events, not just ones that I wound up hosting, but other people, too. By day, Ricardo Gonzalez is a Senior Principal Product Manager at Oracle. But in the community, he is also much more. Ricardo, thank you for joining me today.Ricardo: Thank you so much for having me, Corey. It's a great pleasure to be here with you today.Corey: So, it is interesting watching you come to what I can only describe as the other side of the tracks. Because you work at Oracle. I make fun of AWS all the time, so yes, I suppose our companies do have that in common, but I digress. 
You also work in the database world, which is, I guess you could say I do and that I misuse things as databases, mostly for laughs and occasionally for production. And you're over in the product manager side of the world, which for me, has always—may as well be a language that I do not understand, let alone speak. Yet you have consistently shown up and made great contributions to every conversation you've ever been a part in. Where did you come from? How did you start doing this?Ricardo: Well, I'm originally from Costa Rica, right, which is I wouldn't say uncommon, but then again, there's just a few of us. And I was doing my master's degree in Mexico when I showed up to a recruitment event dressed up like a business student and realize all of my peers were actually developers—although I'm a computer scientist by trade—looking for a job at Oracle's Development Center in Mexico, right? And by showing up, something magical happened. I stayed at the session, they made a raffle with numbers. I didn't win, but they asked us questions nobody answered, and as you can see, I talk a lot.I raised my hand, and they said, “Okay, answer these questions.” And then it became, like, a competition, and I won. And back then I got, like, a tablet. I think it was an iPad; it was great. I thought, okay, no job for me because I wasn't working—looking for a job in development.And then this person, which is now an SVP in my company, which has been my mentor in many ways, approached me, and he said, “I really liked what you did. It seemed you do have some technical background. We need somebody that can talk like that with customers, but at the same time, understand the requirements for a technical product and work with engineers. Do you want to come to the office tomorrow?” And a week later, I got an offer my life change in ways, like, we've never foreseen.Corey: This is a hard thing to talk about because it's the way the world works, but when you say it, people love to come back and tear you down, like, “You just got lucky.” Or it—“Well, yeah, that works for you, but it doesn't work for other people.” But I've found invariably that the seminal moments that happened in the course of my career have all come from conversations I had with people I didn't need to be talking to at events I didn't need to be attending, but one thing leads to another. Instead of sitting at home and brooding, I put myself in situations where I could, for lack of a better term, make my own luck. Sure, if only one conversation in a thousand winds up turning into a career opportunity, okay, but that means you need to have a thousand conversations to get there, so time to get started. And you are probably one of the best living embodiments of this that I've ever met.Ricardo: Well, it's interesting. You're right. I mean, the luck part plays a factor, I guess, but you have to change your own luck. And it's complicated to talk about that because there's also privilege in both, and being part of—like, I was in college. I had the privilege to go to college, although, I mean, there's a whole, like, list of things that made me get there and the sacrifices from family, et cetera.And not everybody has the same level of field, right? But what I can say though, is that I heard somebody said something that really resonated with me, which is, “For some of us, right, we won't be the main player in the game.” [reading 00:04:25], like, so imagine you have, like, a sports event where—whatever sport you want—and there's a game of playing, right? 
The coach will not call you. But they might call you over the last five minutes, but when they do, you have to be there and score a goal, touchdown, whatever you want to call it, be the best player because that's the opportunity you have and you have to make the most out of it. Some people were born and they have the opportunity to be in the starting lineup. Some of us will be just called at the last minute. But when you do, your A-Game has to be there on top and you have to be the best you can because that's the only way you have to shine.Corey: I think that you're right. There's a tremendous amount of privilege baked into all of this. And privilege is one of those things you can't just set aside. It's something that we wind up all manifesting in different ways to different degrees. But it's a, “Oh, just be like me,” is fundamentally what a lot of advice comes down to, regardless of whoever it is the me in question that's talking about it.But there seems to be just certain things that lend themselves to better possibilities of success. One of the things that has always impressed me is that you just show up and start great conversations with people, left and right. That's a skill that I honestly wish I had. I have to be noisy and public to get people to approach me, whereas you, ah, you don't have the time for that. You just walk up and start talking to them. I've never been good at that.Ricardo: I guess part of my upbringing—also, you know, my home country has a whole history of [horizontalness 00:05:47], but that's a different discussion. And we are, I guess, not shy to just talk to people, right, which sometimes can bring into interesting conversations with management and, like—because if I disagree, I will let you know, right? I will be completely candid about things. But I think it's important, right? Because like, we're all human beings trying to do the same thing, right?We all wake up in the morning with the same set of problems and then get to share moments in between each other. Why don't make them as pleasant as possible and try to see how can we actually grow together? It's important that you're not only getting things and growing yourself but also see how can with that help others grow as well, right? So that's, I think, part of what conversations can be—I mean, starting conversation with anybody just it's really important to say, “Okay, nice to meet you. How can we, you know, make the most out of it for both of us?” And, you know, either even if it's just, like, you'd have a great conversation or, you know, help each other or just me help you, et cetera.Corey: So, I want to talk a little bit about your day job. Given that you work in product management, I have to assume that having people skills is kind of a prerequisite for the role. At least you would think. I've worked in places where that was apparently not the case, and not for nothing, it kind of showed.Ricardo: Yeah, I mean, it's really important. I think product management is one of these positions in which you are in the middle of things. When people ask me—and these are people that don't work in technology—“What do you do?” I tell them, “I'm a translator,” right? 
And when they ask me, like, “Oh, so you do it between languages?” I said, “Well, yeah, I speak different languages with us.”So, the point is, I am able to talk with people that have a less technical acumen or are actually just users of our product, right, and [unintelligible 00:07:17] highly skilled, and then go back to the engineers, which have a different point of view, right? So, I'm always back and forth. But that people skills, as you mentioned, is really important because otherwise you cannot do your job. The thing that is interesting for me is that product management itself is not really a thing that can be defined. I mean, yes, of course, there's, like, books on it and people that have done their careers and, like, saying how it works, but it changes from company to company.And even within the same company, there are different product managers doing different things. What I do—and I've been really fortunate to have really good managers that I've worked for the last seven years, I think—has a lot to do with the people skills that you mentioned, right? And it allows me to be as good as I can with my job and try to do me just, you know, grow every day.Corey: It's easier to sit here and reason about these things in the context of specifics, on some level. And it's also easy for me at least to look at a company and think, “Oh, they do one thing,” but I have it on good authority that Oracle is a large-ish company that might have more than one product at any given point in time. What product do you work with? Where do you start and where do you stop?Ricardo: Okay, well, I've been part of three different teams, if you could call it that way. Although, like, over the last seven years, I've been focusing mostly for—I mean, always within the database organization, so like database development. And then over the last, like, six, seven years, I've been on the high-availability team, which focus on a thing called maximum availability architecture, right, which is basically helping customers to achieve all their requirements. And we're talking about, like, heavy usage of, you know, regular Oracle database with high availability, scalability, I mean, requirements for, like, 24/7, like, great uptime.And I started working with them with the cluster file system, which I still do, but my main job over the last, like, let's say, four, almost five years, have been working towards helping customers come to the cloud, right, to Oracle Cloud. And my product, I'm the product manager for protocols ZDM, Zero Downtime Migration, and it's been in the market for the last three-and-a-half years, right? So, I was there since it's all started as a whole interesting story about cross-work with different teams in Oracle getting together to get this product out. So, that's my day-to-day job, just enabling customers on maximize the usage of the Oracle database in the high-availability realm, and also helping them move to the cloud, the Oracle Cloud, if that's what they want and the mission they have right now in their organizations.Corey: I know that people are going to have opinions about Oracle Cloud, and I'm just going to say something that I think is relatively uncontroversial, in that the technology is freaking solid. I have used it in a bunch of different ways, I've talked to folks who have, and there is remarkably little argument that when you use it as directed, that stuff works. And there's a lot to be said for that. 
So, you focus a lot on the migration story, specifically, to my understanding, databases inward from a variety of other places. Do they tend to find themselves living in on-prem environments? Are they in other cloud providers? Are they, God forbid, well, we have this filing cabinet full of paperwork and we're hoping you can help us digitize it all, which, yes, those projects exist. And no, I don't want to be within 6000 miles of them.Ricardo: Well, mostly, we're talking about on-premises customers, right, that have large fleets of Oracle databases and we're trying to help these customers, either as small businesses, it could be public or enterprises move to the Oracle Cloud when they deem that the strategy they're doing, right? So, my product, what it does is it actually orchestrates, it automates that process for them so that when they're actually doing the migration, it's as seamless as possible for them. Because there's a lot of, like, caveats and a lot of things to consider when we're talking about database migration into the cloud.Corey: When you take a look at what is going on in the larger ecosystem, it's easy for me to sit here and say, “Well, I don't see Oracle databases very often.” And yeah, in the context of companies that I work with, that are very often founded in the last few years and are born in a particular cloud provider—in my case, AWS—yeah, there doesn't seem to be a lot of those things. But at the same time, Oracle rose to its current position by having database technology that was second to none. There's a reason that all of these quote-unquote, “Legacy companies,” by which of course, we mean, companies that made money and had the temerity to be founded more than three years ago, have wound up standardizing across Oracle to a large extent. As a result then, we're seeing a stupendous amount of those companies now looking and weighing, what does moving into the cloud actually look like because we have an increasingly dire raccoon problem in our data center?Ricardo: Yeah, I mean, we have all the latest technology over the last 40 years. Like, Oracle, as you mentioned, right? It has impressive technology and it's quite solid. Now, you're asking me about companies that, you know, that might not be using Oracle or that you're not aware of they're using Oracle. The interesting thing, and when people asked me about this, right, is that it's really easy, both me and you without knowing, use Oracle products today, right?Because you check your bank account, you use certain financial services, you made phone calls, et cetera, right? And a lot of the underlying technology and infrastructure that runs the world today—either you took a plane, et cetera—is running on Oracle, right? There's a lot of deployments there, right? It's just that is not that maybe we're not doing—you know, again, we're talking about the whole ecosystem that runs a lot of infrastructure that normal people would do on a daily basis, but it's right on the back end, so you might not hear about it, or it's not as known, but it is there in the top companies all over the world. So, what we're doing now is helping these companies, right, migrate to the cloud when their needs really adapt to exactly that goal.And sometimes it's actually more, “Okay, how can we actually modernize your data center, right?” So, Oracle actually has Cloud@Customer, and we also help them with that migration as well. 
So, we have a whole set of products and deployments that would work within the customer data center, but within a cloud managed by Oracle.Corey: I think that that's an interesting question in its own right, which is you have these companies that are doing incredibly important things. Things like Oracle databases run hospitals, they run DMVs—Ricardo: Yep.Corey: In various states. They run basically every big piece of infrastructure that you can imagine in a lot of places. They run banks, for example. And now these companies are looking at transforming into a cloud approach, on some level. How on earth do you convince them to move something as critical as a workload on an Oracle database, which in many cases is a bedrock layer upon which aspects of society depend, to, “Oh, yeah, just go ahead and move it to this cloud thing. That'll be fine.” It feels like an almost impossible goal, but it's clearly not. What drives it?Ricardo: Well, it's happening all over the industry, right? People are realizing that cloud, it's—I wouldn't dare to say the future because it's been happening for, you know, the last several years, but clearly for cost management, security, administration, resource scaling, you name it, it's the way to go, right? So, it takes time, and depending on who you're working with, the projects could, you know, span three years, et cetera. But that's the way, you know, the whole ecosystem is going, right? So, what we did, and we didn't reinvent the wheel here, at least with my product, right, was to take technology that has been used for over 40 years as a standard for, you know, backup, export, data transfer, synchronization, security, database management, and integrate it into a single product that would be, like, automated and help the customers.And what we wanted to do, and it was really important for me, is, like, we want you to be in our cloud, we want to help you, so let's make this free. Even if we're using other products that Oracle already has that have a cost, if you're using the migration suite that we offer, it will not cost you money.Corey: There's a lot of value to being able to make assurances like that, but, on some level, it feels like whenever someone migrates anything anywhere, a few things are certain. One is that there are going to be technical challenges with it. There always are. That is the nature of large systems, particularly systems built upon systems built upon systems. And two, as humans, as much as we love talking about the idea of blamelessness, everyone's going to be looking for a scapegoat when something inevitably goes wrong.The database is always an easy thing to blame, and the cloud, aha, that's stuff where it's non-deterministic and we can't go and put our meaty hands on it in the data center the way we used to when things start breaking. How do you avoid becoming the blame center in a scenario like that?Ricardo: That's a great question and it's interesting because it could happen, right, that somebody says, “Well, because of the migration, things are not working as expected,” et cetera. So, we do help customers—there are a lot of implications when you're talking about migration, right—with the proper planning, sizing, are there any architecture implications? Are you doing any cross-endianness conversion? Then, you know, database-wise, Oracle has different architectures, so we have the previous model of non-containerized or non-container databases. Now, we're going to the multitenant-based model.We're working—are you doing an upgrade as well?
Are you doing, you know, you're coming from an older version to a newer version? Are there security implications? Because a lot of the database is on-premises might not have encryption, and we by default encrypt at the target level because it's a requirement in the cloud, right?So, what we work with the customers is two things. First of all, do all the planning and testing as possible before the migration so that you know what you're doing is correct. Is the app certified with newer version and the environment you're going into, right? And we can work with you to do all these tests. And then one thing that we realized was really important in the product is to have a way to have, like, knobs or control of what you're doing, and you could actually do testing before the actual switch over into the cloud.So, you will have, like, a standby database, like, a copy of your database, running in the cloud, [unintelligible 00:16:47] synchronization with your on-prem, your database, right, on your application, but you can use that to just do all the testing you want and then be sure. And only when you're ready, then you will do a switchover, and then things would work as expected, right? But again, there's a lot of process. And we've worked with customers that you know, they know what they're doing, they were, like, super happy and they did it quite fast. There's others that said, “You know what? I am going to do a nine-month testing process because my week that I'm going to be migrating and then the weekend that I'm going to do the switchover is crucial.” And then we work with them over those nine months. But then when it happened, it went, you know, perfectly, right? So, it really depends on the project. But we do ensure that everything is taken care of because as you mentioned, it's a big change, it's the big shift.Corey: Tired of wrestling with Apache Kafka's complexity and cost? Feel like you're stuck in a Kafka novel, but with more latency spikes and less existential dread by at least 10%? You're not alone.What if there was a way to 10x your streaming data performance without having to rob a bank? Enter Redpanda. It's not just another Kafka wannabe. Redpanda powers mission-critical workloads without making your AWS bill look like a phone number.And with full Kafka API compatibility, migration is smoother than a fresh jar of peanut butter. Imagine cutting as much as 50% off your AWS bills. With Redpanda, it's not a pipedream, it's reality.Visit go.redpanda.com/duckbill today. Redpanda: Because your data infrastructure shouldn't give you Kafkaesque nightmares.Corey: I think that there's a very true story about how oh, we just try to close our eyes and cross our fingers and hope for the best and press the migrate button that everything will work out flawlessly. It doesn't work that way. The way that we always wound up handling migrations in places that weren't riddled with dysfunction up, down, and sideways—at least not in this particular way because everyone's environment's terrible—is that we would test these things out, we'd stage them, we would have rollbacks that were tested and known to work. In some cases, we'd begin with the rollback before we started the migration plan, just because we absolutely cannot have this system down outside of a maintenance window or outside of certain constraints. And it feels like a lot of that planning is wasted when things go well. But it's not. It's the reason that important things don't crumble underneath us. 
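A minimal sketch of the flow Ricardo describes: keep a synchronized standby copy of the database in the cloud, test against it for as long as the business needs, and only then switch over, with an evaluation step up front and a fallback path behind you. This is illustrative Python only; every function and name here is a hypothetical placeholder, not Oracle's Zero Downtime Migration CLI or API.

```python
# Illustrative only: a toy orchestration of the evaluate -> replicate ->
# test -> switch over flow described in the conversation. None of these
# functions correspond to real ZDM commands; they are placeholders.

from dataclasses import dataclass


@dataclass
class MigrationPlan:
    source_db: str          # on-premises database
    target_db: str          # cloud target
    test_window_days: int   # how long to validate before switching over


def evaluate(plan: MigrationPlan) -> list[str]:
    """Dry run: report blocking issues (version gaps, endianness,
    missing encryption at the target, app certification) without
    touching the production source."""
    issues: list[str] = []
    # ... checks would go here ...
    return issues


def replicate(plan: MigrationPlan) -> None:
    """Stand up a synchronized standby copy of the source in the cloud."""
    print(f"replicating {plan.source_db} -> {plan.target_db}")


def run_validation_tests(plan: MigrationPlan) -> bool:
    """Point test workloads at the standby and compare results and performance."""
    return True  # placeholder outcome


def switchover(plan: MigrationPlan) -> None:
    """Make the cloud copy primary; keep reverse replication as a fallback."""
    print(f"switching over to {plan.target_db}; reverse replication enabled")


def migrate(plan: MigrationPlan) -> None:
    issues = evaluate(plan)
    if issues:
        raise RuntimeError(f"fix before migrating: {issues}")
    replicate(plan)
    if not run_validation_tests(plan):
        raise RuntimeError("validation failed; source remains primary")
    switchover(plan)


if __name__ == "__main__":
    migrate(MigrationPlan("onprem_erp", "oci_erp", test_window_days=30))
```

The shape is the point Ricardo keeps making: the switchover itself is the smallest step, and most of the effort sits in evaluation, replication, and testing.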
Like, on some level it's, do I feel like I wasted money on my airbags and seatbelts because I've never used them? Not really, no.Ricardo: Well, I mean, this is, like, the classic [unintelligible 00:18:23] ops thing and support thing, right? People always complain when things don't work, but when they do work, they don't realize it because of all the worrying that all the people in infrastructure and planning and support and ops were doing, right? So it's—yeah, there's a lot of time that can be spent in planning, and people would think that it's wasted time, but it actually is super important and crucial for this. The other thing I think is important is that you should always have a fallback plan. There's, like, different configurations in which this might be more cumbersome or complex, but we do have the possibility to, like, keep replicating back to on-premises, so that if anything happens, people do have that option, right?And we do have customers that like the idea of having a disaster recovery configuration in which they have, like, something in the cloud and then another thing on-premises, so there's always an option for you. But planning is crucial, right? So, we even have a thing called, like, evaluation mode in which we can dry run a migration without actually doing it, just to tell you what could happen. Of course, when you do things live, there's always things, right, that could be related to many other factors, but we really, really try to dial in and be sure that when you're doing the migration and you've properly planned, things will be automated and work for you.And so, we've grown over the last three-and-a-half years, and I was doing some research, right, and we've had, like, you know, thousands of databases migrated, great customers that have been using us, and surprises—sometimes we don't know, right, and we find out, oh, somebody's doing a course on one of these learning platforms based on our product, which is really new, but, oh, it's cool. Like, when we're not creating, like, even your [unintelligible 00:19:50] et cetera, right? And I'm really glad that what you're doing has an impact and helps people. That's all you want. You want to help people achieve their goals.Corey: So, I have to ask. On some level, building something that migrates a database from one location to another naively would seem to folks to be a, “Okay, at some point, this gets declared feature-complete and then we go work on other interesting problems.” But yet the fact that you've remained employed in the role that you're in, where you continue to work on the problem, would strongly suggest that this is not, in fact, true. How does the product continue to evolve once you are, let's be clear, shipping this to paying customers?Ricardo: Well, I mean, the product will evolve, as you mentioned, right—Corey: And I want to be clear, that's not just a rephrasing of, “Hey, quick, justify your job.” Obviously, this stuff has to evolve. This is not one of those, “So, what is it you'd say you do here?” crappy questions that isn't really a question so much as an accusation. Those come in a slightly different tone of voice.Ricardo: That's, you know, it's a super valid question and I actually appreciate it a lot because it also makes me reflect on how much we've grown, right? I mean, I think the magic of ZDM and the team behind it is that it's kind of like a startup within Oracle, right?
It all started because different teams [within 00:21:03] Oracle, right, you're banded together, a team propose a prototype based on existing technology, right? So like, again, as I mentioned, like, Oracle technology for database has been over 40 years in the making. And, you know, a team said, “Okay, what are the standard tools to actually do a backup or an export of data transferred, you know, to a location”—in this case, the cloud—“Doing a whole synchronization, encryption, et cetera, and then the switchover?” Right?So, the thing is that databases come in many flavors, there are different options, different ways for databases to work. There's also different targets in the Oracle Cloud and those then change how you would be migrating into, you will have different workflows, physical, logical, you could use different backup locations, so of course, in Oracle Cloud, the standard is the object storage, right? You can do a direct data transfer; you have that technology as well. If you're doing migrations to [cloud 00:21:51] customer, you'll definitely will require, like, external storage, like NFS. If you're doing a conversion from AIX or Solaris into, you know, the cloud target which is Linux, then again, there's other implications, if you're doing an inflight upgrade, if you're changing architectures, from non-multi-tenancy to multi-tenancy, if you're doing, you know, coming from other clouds, there are also certain considerations.So, now that I mentioned all of these, you can see how a product from the get-go can have all those, right? So, we started with a subset of features and we've grown up to six releases now over the last three-and-a-half years that have incorporated everything that I've just mentioned. And we can do all those things, but it keeps getting better. And then there's always, like, things that we realize that customers are using us in ways that maybe were not expected, which is great because oh, okay, cool, then this is something that we can actually, like, make better or enhance, right?And there's always requests from customers on what they want to do or see change in the product. We also integrate with our team. So, there is an advisor that does a pre-check for the database and checks, okay, what are, like, the recommendations on what you should do? So, those integrations and working with our teams across Oracle, again, take time, and hence why, you know, products keep growing and evolving. And you're right, maybe at some point, we will be able to cover everything that there is to do, right?What we're doing now, and we've been working, again, in partnership with our teams at Oracle, right, is, like, be the engine of other Oracle migration strategies. So, there is a native service in Oracle Cloud infrastructure called DMS that has a subset of our features and it uses ZDM under the hood, right? So again, there's always work to do and a lot of it sometimes is go to communication and working with customers, but there's also a lot of, like, going back to the drawing board and see how can the product be improved.Corey: I think that there's a certain lack of attention also given to the fact that every time you think you've seen it all, all it takes is talking to one more customer, where they have a use case that you potentially hadn't considered. And maybe it sounds ridiculous to you, but it's ridiculous in load-bearing ways in an awful lot of these other places. 
Empathy becomes such a key aspect of this that I'm somewhat surprised that more folks don't spend more time than they do thinking about these things.Ricardo: Well, I think as a product manager, this is really important, right? You need to put yourself in the customer's shoes. And you also need to use the product. Sometimes using the product, like, so I use it, like, to create my own workshops that we have. There's a platform called Live Labs in Oracle that has, like, I don't know, 6,000 labs that are free for you to use and learn about our technology, right?So, in order to better understand the product, and then, you know, when we're doing a new release, et cetera, then see the key features, like, we create materials like that and we use it. But that doesn't give us the whole scope of how a customer would be using the product. So, for all internal migrations that we have of Oracle products into the Oracle Cloud, we use it, and that gives us a lot of insights. But then going to a customer and spending time with them, sometimes developing relationships that go more than a year because we're talking about, like, big [fleet 00:24:41] migrations, thousands of databases, you realize, oh, the scope is broader than we expected, but it's actually a really—there's a lot of satisfaction in learning from them and then getting back to the development team. Or even including them. I think that's really important as well.I think a good PM would include development sometimes in the conversation with customers because then there is, like, a better understanding from both sides of the aisle. And even bring them to conferences, et cetera, so that actual, you know, empathy for the customer requests and what they need is created.Corey: Yeah, I think that there's also a presupposition that you can look at a company and say, “Oh, you're using X technology? You must be crappy,” or whatnot. Something I've learned is that every company of a size that is remarkably small compared to what people often think is using basically everything already. Like, I'm at this point at a company that has less than ten employees and we already have five different clouds that we have accounts with, doing different things in different ways. This explosion of different tools and different utilities is like it is for a reason. And it's very tricky to really, I think, appreciate that until you've walked a mile in the shoes of someone who's building things like that.Ricardo: Yeah. It's interesting. There's a whole, like, view of product management, right, of having this idea of building and building and building products, but what you're doing is actually helping people with their needs, and their needs can be really broad, so maybe the solution is not your product. And maybe the solution is not your technology. But I think a good PM, and I think anyone in technology, a good person, would actually, like, help these users or customers to get where they need to, even if it's not using your technology, right?Corey: I would agree wholeheartedly. I really want to thank you for taking the time to go through what it is you're up to and how you view the world. If people want to learn more, where's the best place for them to find you other than, you know, local community meetups when you happen to be in town?Ricardo: Well, I mean, of course, anybody can, like, go to LinkedIn and look me up there. I have a Twitter account @productmanaged, so “product manager,” but without the R and with a D instead, because of course.
Twitter handles are—or handles over on social media are hard to get, although I'm not as active lately on Twitter. And I, you know, I opened an account on Bluesky, which is [@productmanager 00:26:54]. I did get that one. But, um, I'm only starting to use it now, right, so, you know, I guess those three would be the places to go.Corey: Awesome. And we will, of course, put links to that in the [show notes 00:27:04]. Thank you so much for your time, I appreciate it.Ricardo: Anytime. And one thing: if anyone is ever in San Francisco, you know, I'm more than happy to meet up. I love this city. It has changed my life tremendously and I'm happy to show you around. I consider myself now somebody that really, really, really cares for this place, and I'm happy to just, you know, have a good time, talk technology or not. I also love to cook. So anytime, I'm here.Corey: I highly recommend that. He's not just fun to hang out with, he is an excellent cook as well. But I don't know if there's a good way to put that in show notes, so you'll have to take my word [laugh] for it instead.Ricardo: [laugh].Corey: Ricardo Gonzalez, Senior Principal Product Manager at Oracle. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice, along with an angry, insulting comment that one day I will find a tool to migrate into a central database. I know not where.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.
Martin Mao, CEO & Cofounder at Chronosphere, joins Corey on Screaming in the Cloud to discuss the trends he sees in the observability industry. Martin explains why he feels measuring observability costs isn't nearly as important as understanding the velocity of observability costs increasing, and why he feels efficiency is something that has to be built into processes as companies scale new functionality. Corey and Martin also explore how observability can now be used by business executives to provide top line visibility and value, as opposed to just seeing observability as a necessary cost. About MartinMartin is a technologist with a history of solving problems at the largest scale in the world and is passionate about helping enterprises use cloud native observability and open source technologies to succeed on their cloud native journey. He's now the Co-Founder & CEO of Chronosphere, a Series C startup with $255M in funding, backed by Greylock, Lux Capital, General Atlantic, Addition, and Founders Fund. He was previously at Uber, where he led the development and SRE teams that created and operated M3. Previously, he worked at AWS, Microsoft, and Google. He and his family are based in the Seattle area, and he enjoys playing soccer and eating meat pies in his spare time.Links Referenced: Chronosphere: https://chronosphere.io/ LinkedIn: https://www.linkedin.com/in/martinmao/ TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: Human-scale teams use Tailscale to build trusted networks. Tailscale Funnel is a great way to share a local service with your team for collaboration, testing, and experimentation. Funnel securely exposes your dev environment at a stable URL, complete with auto-provisioned TLS certificates. Use it from the command line or the new VS Code extensions. In a few keystrokes, you can securely expose a local port to the internet, right from the IDE.I did this in a talk I gave at Tailscale Up, their first inaugural developer conference. I used it to present my slides and only revealed that that's what I was doing at the end of it. It's awesome, it works! Check it out!Their free plan now includes 3 users & 100 devices. Try it at snark.cloud/tailscalescream Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. This promoted guest episode is brought to us by our friends at Chronosphere. It's been a couple of years since I got to talk to their CEO and co-founder, Martin Mao, who is kind enough to subject himself to my slings and arrows today. Martin, great to talk to you.Martin: Great to talk to you again, Corey, and looking forward to it.Corey: I should probably disclose that I did run into you at Monitorama a week before this recording. So, that was an awful lot of fun to just catch up and see people in person again. But one thing that they started off the conference with, in the welcome-to-the-show style of talk, was the question about benchmarking: what observability spend should be as a percentage of your infrastructure spent. 
And from my perspective, that really feels a lot like a question that looks like, “Well, how long should a piece of string be?” It's always highly contextual.Martin: Mm-hm.Corey: Agree, disagree, or are you hopelessly compromised because you are, in fact, an observability vendor, and it should always be more than it is today?Martin: [laugh]. I would say, definitely agree with you from a exact number perspective. I don't think there is a magic number like 13.82% that this should be. It definitely depends on the context of how observability is used within a company, and really, ultimately, just like anything else you pay for, it really gets derived from the value you get out of it. So, I feel like if you feel like you're getting the value out of it, it's sort of worth the dollars that you put in.I do see why a lot of companies out there and people are interested because they're trying to benchmark, to trying to see, am I doing best practice? So, I do think that there are probably some best practice ranges that I'd say most typical organizations out there that we see. This is one thing I'd say. The other thing I'd say when it comes to observability costs is one of the concerns we've seen talking with companies is that the relative proportion of that cost to the infrastructure is rising over time. And that's probably a bad sign for companies because if you extrapolate, you know, if the relative cost of observability is growing faster than infrastructure, and you extrapolate that out a few years, then the direction in which this is going is bad. So, it's probably more the velocity of growth than the absolute number that folks should be worried about.Corey: I think that that is probably a fair assessment. I get it all the time, at least in years past, where companies will say, “For every 1000 daily active users, what should it cost to service them?” And I finally snapped in one of my talks that I gave at DevOps Enterprise Summit, and said, I think it was something like $7.34.Martin: [laugh]. Right, right.Corey: It's an arbitrary number that has no context on your business, regardless of whether those users are, you know, Twitter users or large banks you have partnerships with. But now you have something to cite. Does it help you? Not really. But we'll it get people to leave you alone and stop asking you awkward questions?Martin: Right, right.Corey: Also not really, but at least now you have a number.Martin: Yeah, a hundred percent. And again, like I said, there's no—and glad magic numbers weren't too far away from each other. But yeah, I mean, there's no exact number there, for sure. One pattern I've been seeing more recently is, like, rather than asking for the number, there's been a lot more clarity in companies on figuring out, “Well, okay, before even pick what the target should be, how much am I spending on this per whatever unit of efficiency is?” Right?And generally, that unit of efficiency, I've actually seen it being mapped more to the business side of things, so perhaps to the number of customers or to customer transactions and whatnot. And those things are generally perhaps easier to model out and easier to justify as opposed to purely, you know, the number of seats or the number of end-users. But I've seen a lot more companies at least focus on the measurement of things. 
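To make Martin's "measure the unit cost and its velocity" point concrete, here is a small worked example. All of the numbers are invented for illustration; they are not from the episode or from Chronosphere.

```python
# Toy figures to illustrate cost per unit of business and cost velocity.
quarters = ["Q1", "Q2", "Q3", "Q4"]
observability_cost = [100_000, 130_000, 170_000, 220_000]     # $ per quarter
infrastructure_cost = [1_000_000, 1_100_000, 1_200_000, 1_300_000]
customer_transactions = [50e6, 60e6, 72e6, 86e6]

for i, q in enumerate(quarters):
    per_txn = observability_cost[i] / customer_transactions[i]
    share = observability_cost[i] / infrastructure_cost[i]
    print(f"{q}: ${per_txn * 1000:.2f} per 1k transactions, "
          f"{share:.1%} of infrastructure spend")

# Velocity: observability here grows roughly 30% per quarter while
# infrastructure grows roughly 10% and transactions roughly 20%, so both
# the share of spend and the cost per transaction trend upward, which is
# the warning sign described above.
obs_growth = observability_cost[-1] / observability_cost[0]
infra_growth = infrastructure_cost[-1] / infrastructure_cost[0]
print(f"observability grew {obs_growth:.1f}x vs infrastructure {infra_growth:.1f}x")
```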
And again, it's been more about this sort of, rather than the absolute number, the relative change in number because I think a lot of these companies are trying to figure out, is my business scaling in a linear fashion or sub-linear fashion or perhaps an exponential fashion, if it's—the costs are, you know, you can imagine growing exponentially, that's a really bad thing that you want to get ahead of.Corey: That I think is probably the real question people are getting at is, it seems like this number only really goes up and to the right, it's not something that we have any real visibility into, and in many cases, it's just the pieces of it that rise to the occasion. A common story is someone who winds up configuring a monitoring system, and they'll be concerned about how much they're paying that vendor, but ignore the fact that, well, it's beating up your CloudWatch API charges all the time on this other side as well, and data egress is not free—surprise, surprise. So, it's the direct costs, it's the indirect costs. And the thing people never talk about, of course, is the cost of people to feed and maintain these systems.Martin: Yeah, a hundred percent, you're spot on. There's the direct costs, there's the indirect costs. Like you mentioned, in observability, network egress is a huge indirect cost. There's the people that you mentioned that need to maintain these systems. And I think those are things that companies definitely should take into account when they think about the total cost of ownership there.I think what's more in observability actually is, and this is perhaps a hard thing to measure, as well, is often we ask companies, “Well, what is the cost of downtime?” Right? Like if you're, if your business is impacted and your customers are impacted and you're down, what is the cost of each additional minute of downtime, perhaps, right? And then the effectiveness of the tool can be evaluated against that because you know, observability is one of these, it's not just any other developer tool; it's the thing that's giving you insight into, is my business or my product or my service operating in the way that I intend. And, you know, is my infrastructure up, for example, as well, right? So, I think there's also the piece of, like, what is the tool really doing in terms of, like, a lost revenue or brand impact? Those are often things that are sort of quite easily overlooked as well.Corey: I am curious to see whether you have noticed a shifting in the narrative lately, where, as someone who sells AWS cost optimization consulting as a service, something that I've noticed is that until about a year ago, no one really seemed to care overly much about what the AWS bill was. And suddenly, my phone's been ringing off the hook. Have you found that the same is true in the observability space, where no one really cared what the observability cost, until suddenly, recently, everyone does or has this been simmering for a while?Martin: We have found that exact same phenomenon. And what I tell most companies out there is, we provide an observability platform that's targeted at cloud-native platforms. So, if your—a cloud-native architecture, so if you're running microservices-oriented architecture on containers, that's the type of architecture that we've optimized our solution for. 
And historically, we've always done two things to try to differentiate: one is, provide a better tool to solve that particular problem in that particular architecture, and the second one is to be a more cost-efficient solution in doing so. And not just cost-efficient, but a tool that shows you the cost and the value of the data that you're storing.So, we've always had both sides of that equation. And to your point, in conversations in the past years, they've generally been led with, “Look, I'm looking for a better solution. If you just happen to be cheaper, great. That's a nice cherry on top.” Whereas this year, the conversations have flipped 180, in which case, most companies are looking for a more cost-efficient solution. If you just happen to be a better tool at the same time, that's more of a nice-to-have than anything else. So, that conversation has definitely flipped 180 for us. And we found a pretty similar experience to what you've been seeing out in the market right now.Corey: Which makes a tremendous amount of sense. I think that there's an awful lot of—oh, we'll just call it strangeness, I think. That's probably the best way to think about it—in terms of people waking up to the grim reality that not caring about your bills was functionally a zero-interest-rate phenomenon in the corporate sense. Now, suddenly, everyone really has to think about this in some unfortunate and some would say displeasing ways.Martin: Yeah, a hundred percent. And, you know, it was a great environment for tech for over a decade, right? So, it was an environment that I think a lot of companies and a lot of individuals got used to, and perhaps a lot of folks that have entered the market in the last decade don't know of another situation or another set of conditions where, you know, efficiency and cost really do matter. So, it's definitely top of mind, and I do think it's top of mind for good reason. I do think a lot of companies got fairly inefficient over the last few years chasing that top-line growth.Corey: Yeah, that has been—and I think it makes sense in the context with which people were operating. Because before a lot of that wound up hitting, it was, well grow, grow, grow at all costs. “What do you mean you're not doing that right now? You should be doing that right now. Are you being irresponsible? Do we need to come down there and talk to you?”Martin: A hundred percent.Corey: Yeah, it's like eating your vegetables. Now, it's time to start paying attention to this.Martin: Yeah, a hundred percent. It's always a trade-off, right? It's like in an individual company and individual team, you only have so many resources and prioritization. I do think, to your point, in a zero interest environment, trying to grow that top line was the main thing to do, and hence, everything was pushed on how quickly can we deliver new functionality, new features, or grow that top line. Whereas, the efficiency is always something I think a lot of companies looked at as something I can go deal with later on and go fix. And you know, I feel like that that time has now just come.Corey: I will say that I somewhat recently had the distinct privilege of working with a company whose observability story was effectively, “We wait for customers to call and tell us there's a problem and then we go looking in into it.” And on the one hand, my immediate former SRE reflexes kicked in, and I recoiled, but this company has been in this industry longer than I have. 
They clearly have a model that is working for them and for their customers. It's not the way I would build something, but it does seem that for some use cases, you absolutely are going to be okay with something like that. And I should probably point out, they were not, for example, a bank where yeah, you kind of want to get some early warning on things that could destabilize the economy.Martin: Right, right. I mean, to your point, depending on the context, and the company, it could definitely make sense, and depending on how they execute it as well, right? So, you know, you called out an example already, where if they were a bank or if any correctness or timeliness of a response was important to that business, perhaps not the best thing to do to have your customers find out, especially if you have a ton of customers at the same time. But however, you know, if it's a different type of business where, you know, the responses are perhaps more asynchronous or you don't have a lot of users encountering at the same time or perhaps you have a great A/B experimentation platform and testing platform, you know, there are definitely conditions in which that could be potentially a viable option.Especially when you weigh up the cost and the benefit, right? If the cost to having a few bad customers have a bad experience is not that much to the business and the benefit is that you don't have to spend a ton on observability, perhaps that's a trade-off that the company is willing to make. In most of the businesses that we've been working with, I would say that probably not been the case, but I do think that there's probably some bias and some skew there in the sense that you can imagine a company that cares about these things, perhaps it's more likely to talk to an observability vendor like us to try to fix these problems.Corey: When we spoke a few years back, you definitely were focused on the large, one would say, almost hyperscale style of cloud-native build-out. Is that still accurate or has the bar to entry changed since we last spoke? I know you've raised an awful lot of money, which good for you. It's a sign of a healthy, robust VC ecosystem. What the counterpoint to that is, they're probably not investing in a company whose total addressable market is, like, 15 companies that must be at least this big.Martin: [laugh]. Yeah, a hundred percent. A hundred percent. So, I'll tell you that the bar to entry definitely has changed, but it's not due to a business decision on our end. If you think about how we started and, you know, the focus area, we're really targeting accounts that are adopting cloud-native technology.And it just so happens that the large tech, [decacorns 00:12:35], and the hyperscalers were the earliest adopters of cloud-native. So containerization, microservices, they were the earliest adopters of that, so hence, there was a high correlation in the companies that had that problem and the companies that we could serve. Luckily, for us, the trend has been that more of the rest of the industry has gone down this route as well. And it's not just new startups; you can imagine any new startup these days probably starts off cloud-native from day one, but what we're finding is the more established, larger enterprises are doing this shift as well. And I think the folks out there like Gartner have studied this and predicted that, you know, by about 2028, I believe was the date, about 95% of applications are going to be containerized in large enterprises. 
So, it's definitely a trend that the rest of the industry will go on. And as they continue down that trend, that's when, sort of, our addressable market will grow because the amount of use cases where our technology shines will grow along with that as well.Corey: I'm also curious about your description of being aimed at cloud-native companies. You gave one example of microservices powered by containers, if I understood correctly. What are the prerequisites for this? When you say that it almost sounds like you're trying to avoid defining a specific architecture that you don't want to deal well with or don't want to support for a variety of reasons? Is that what it is or is there certain you must be built in these ways or the product does not work super well for you? What is it you're trying to say with that, is what I'm trying to get at here.Martin: Yeah, a hundred percent. If you look at the founding story here, it's really myself and my co-founder, found Uber going through this transition of both a new architecture, in the sense that, you know, they were going containers, they were building microservices-oriented architecture there, were also adopting a DevOps mentality as well. So, it was just a new way of building software, almost. And what we found is that when you develop software in this particular way—so you can imagine when you're developing a tiny piece of functionality as a microservice and you're a individual developer, and you're—you know, you can imagine rolling that out into production multiple times a day, in that way of developing software, what we found was that the traditional tools, the application performance monitoring tools, the IT monitoring tools that used to exist pre this way of both architecture and way of developing software just weren't a good fit.So, the whole reason we exist is that we had to figure out a better way of solving this particular problem for the way that Uber built software, which was more of a cloud-native approach. And again, it just so happens that the rest of the industry is moving down this path as well and hence, you know, that problem is larger for a larger portion of the companies out there. You know, I would say some of the things when you look into why the existing solutions can't solve these problems well, you know, if you look at a application performance monitoring tool, an APM tool, it's really focused on introspecting into that application and its interaction with the operating system or the underlying hardware. And yet, these days, that is less important when you're running inside the container. Perhaps you don't even have access to the underlying hardware, or the operating system and what you care about—you can imagine—is how that piece of functionality interacts with all the other pieces of functionality out there, over a network core.So, just the architecture and the conditions ask for a different type of observability, a different type of monitoring, and hence, you just need a different type of solution to go solve for this new world. Along with this, which is sort of related to the cost as well, is that, you know, as we go from virtual machines onto containers, you can imagine the sheer volume of data that gets produced now because everything is much smaller than it was before and a lot more ephemeral than it was before, and hence, every small piece of infrastructure, every small piece of code, you can imagine still needs as much monitoring and observability as it did before, as well. 
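A rough back-of-the-envelope version of the data volume argument Martin is making here, with invented numbers: the same hardware footprint produces far more time series once long-lived VMs become churning containers.

```python
# Invented numbers: why comparable hardware produces far more time series
# once monoliths on VMs become microservices in ephemeral containers.
metrics_per_instance = 200        # assume each instance exposes ~200 metrics

# Before: 50 long-lived VMs, each running a monolith.
vm_series = 50 * metrics_per_instance

# After: 50 services x 20 pods each, and pods churn (deploys, autoscaling),
# so each pod slot is replaced ~10 times over the retention window and every
# replacement carries a new pod label, which means a new set of series.
pod_series = 50 * 20 * metrics_per_instance * 10

print(f"VM world:        {vm_series:,} active series")        # 10,000
print(f"Container world: {pod_series:,} series in storage")   # 2,000,000
print(f"~{pod_series // vm_series}x more data for comparable hardware")
```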
So, just the sheer volume of data is so much larger for the same amount of infrastructure, for the same amount of hardware that that you used to have, and that's really driving a huge problem in terms of being able to scale for it and also being able to pay for these systems as well.Corey: Tired of Apache Kafka's complexity making your AWS bill look like a phone number? Enter Redpanda. You get 10x your streaming data performance without having to rob a bank. And migration? Smoother than a fresh jar of peanut butter. Imagine cutting as much as 50% off your AWS bills. With Redpanda, it's not a dream, it's reality. Visit go.redpanda.com/duckbill. Redpanda: Because Kafka shouldn't cause you nightmares.Corey: I think that there's a common misconception in the industry that people are going to either have ancient servers rotting away in racks, or they're going to build something greenfield, the way that we see done on keynote stage is all the time of companies that have been around with this architecture for less than 18 months. In practice, I find it's awfully frequent that this is much more of a spectrum, and a case-by-case per-workload basis. I haven't met too many data center companies where everything's the disaster that the cloud companies like to paint it as, and vice versa, I also have never yet seen a architecture that really existed as described in a keynote presentation.Martin: A hundred percent agree with you there. And you know, it's not clean-cut from that perspective. And also, you're also forgetting the messy middle as well, right? Like, often what happens is, there's a transition. If you don't start off cloud-native from day one, you do need to transition there from your monolithic applications, from your VM-based architectures, and often the use case can't transform over perfectly.What ends up happening is you start moving some functionality and containerizing some functionality and that still has dependencies between the old architecture and the new architecture. And companies have to live in this middle state, perhaps for a very long time. So, it's definitely true. It's not a clean-cut transition. But you can think about that middle state is actually one that a lot of companies struggle with because all of a sudden, you only have a partial view of the world, or what's happening with your old tools, they're not well suited for the new environments. Perhaps you got to start bringing new tools and new ways of doing things in your new environments, and they're not perhaps the best suited for the old environment as well.So, you do actually end up in this middle state where you need a good solution that can really handle both because there are a lot of interdependencies between the two. And it's actually one of the things that we strive to do here at Chronosphere is to help companies through that transition. So, it's not just all of the new use cases and it's not just all of your new environments. It's actually helping companies through this transition is actually pretty critical as well.Corey: My question for you is, given that you have, I don't want to say a preordained architecture that your customers have to use, but there are certain assumptions you've made based upon both their scale and the environment in which they're operating. How heavy of a lift is it for them to wind up getting Chronosphere into their environments? Just because seems to me that it's not that hard to design an architecture on a whiteboard that can meet almost any requirement. 
The messy part is figuring out how to get something that resembles that into place on a pre-existing, extant architecture.Martin: Yeah. I'd say it's something we spent a lot of time on. The good thing for the industry overall, for the observability industry, is that open-source standards are now created and now exist when they didn't before. So, if you look at the APM-based view, it was all proprietary agents producing the data themselves that would only really work with one vendor product, whereas if you've look at a modern environment, the production of the data has actually been shifted from the vendor down to the companies themselves, and there'll be producing these pieces of data in open-source standard formats like OpenTelemetry for distributed traces, or perhaps Prometheus for metrics.So, the good thing is that for all of the new environments, there's a standard way to produce all of this data and you can send all that data to whichever vendor you want on the back end. So, it just makes the implementation for the new environments so much easier. Now, for the legacy environments, or if a company is shifting over from an existing tool, there is actually a messy migration there because often you're trying to replace proprietary formats and proprietary ways of producing data with open-source standard ones. So, just something that us as Chronosphere just come in and we view that as a particular problem that we need to solve and we take the responsibility of solving for a company because what we're trying to sell companies is not just a tool, what we're really trying to solve them is the solution to the problem, and the problem is they need an observability solution end to end. So, this often involves us coming in and helping them, you can imagine, not just convert the data types over but also move over existing dashboards, existing alerts.There's a huge piece of lift that the end—that perhaps every developer in a company would have to do if we didn't come in and do it on behalf of those companies. So, it's just an additional responsibility. It's not an easy thing to do. We've built some tooling that helps with it, and we just spend a lot of manual hours going through this, but it's a necessary one in order to help a company transition. Now, the good thing is, once they have transitioned into the new way of doing things and they are dependent on open-source standard formats, they are no longer locked in. So, you know, you can imagine future transitions will be much easier, however the current one does have to go through a little bit of effort.Corey: I think that's probably fair. And then there's no such thing, in my experience, as a easy deployment for something that is large enough to matter. And let's be clear, people are not going to be deploying something as large scale as Chronosphere on a lark. This is going to be when they have a serious application with serious observability challenges. So, it feels like, on some level, that even doing a POC is a tricky proposition, just due to the instrumentation part of it. Something I've seen is that very often, enterprise sales teams will decide that by the time that they can get someone to successfully pull off a POC, at that point, the deal win rate is something like 95% just because no one wants to try that in a bake-off with something else.Martin: Yeah, I'd say that we do see high pilot conversion rates, to your point. 
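Circling back to Martin's point about open-source standards: when the application itself emits data in a standard format, any compatible backend can collect it, whether that is Chronosphere, a self-hosted Prometheus, or something else. Below is a minimal sketch using the prometheus_client Python package; the metric names, labels, and port are arbitrary choices for the example, not anything prescribed in the episode.

```python
# Minimal vendor-neutral instrumentation with the Prometheus Python client.
import random
import time

from prometheus_client import Counter, Histogram, start_http_server

REQUESTS = Counter(
    "checkout_requests_total", "Checkout requests handled", ["status"]
)
LATENCY = Histogram(
    "checkout_request_seconds", "Checkout request latency in seconds"
)


def handle_checkout() -> None:
    with LATENCY.time():                         # observe request duration
        time.sleep(random.uniform(0.01, 0.05))   # stand-in for real work
    REQUESTS.labels(status="ok").inc()


if __name__ == "__main__":
    start_http_server(8000)  # metrics exposed at http://localhost:8000/metrics
    while True:
        handle_checkout()
```

Because the instrumentation is produced in an open format at the application, swapping the backend later is a configuration change rather than a re-instrumentation project, which is the lock-in point Martin makes.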
For us, it's perhaps a little bit easier than other solutions out there, in the sense that I think with our type of observability tooling, the good thing is, an individual team could pick this up for their one use case and they could get value out of it. It's not that every team across an environment or every team in an organization needs to adopt. So, while generally, we do see that, you know, a company would want to pilot and it's not something you can play around online with by yourself because it does need a particular deployment, it does need a little bit of setup, generally one single team can come and perform that and see value out of the tool. And that sort of value can be extrapolated and applied to all the other teams as well. So, you're correct, but it hasn't been a huge lift. And you know, these processes end to end, we've seen be as short as perhaps 30-something days end to end, which is generally a pretty fast-moving process there.Corey: Now, I guess, on some level, I'm still trying to wrap my head around the idea of the scale that you operate at, just because as you mentioned, this came out of Uber—which is beyond imagining for most people—and you take a look at a wide variety of different use cases. And in my experience it's never been, “Holy crap, we have no observability and we need to fix that.” It's, “There are a variety of systems in place that just are not living up to the hopes, dreams, and potential that they had when they were originally deployed.” Either due to growth or due to lack of product fit, or the fact that it turns out in a post zero-interest-rate world, most people don't want to have a pipeline of 20 discrete observability tools.Martin: Yep, yep. No, a hundred percent. And, to your point there, ultimately, it's our goal and, you know, in many companies were replacing up to six to eight tools in a single platform. And so, it's great to do. That definitely doesn't happen overnight. It takes time.You know, you can imagine in a pilot or when you're looking at it, we're picking a few of the use cases to demonstrate what our tool could do across many other use cases, and then generally on the onboarding, during the onboarding time or perhaps over a period of months or perhaps even a year plus, we then go on board these use cases a piece by piece. So, it's definitely not a quick overnight process there, but, you know, you can imagine something that can help each end developer in that particular company be more effective and it's something that can really help move the bottom line in terms of far better price efficiency. These things are generally not things that are quick fixes; these are generally things that do take some time and a little bit of investment to achieve the results.Corey: So, a question I do have for you, given that I just watched an awful lot of people talking about observability for three days at Monitorama, what are people not talking about? What did you not see discussed that you think should be?Martin: Yeah, one thing I think often gets overlooked, and especially in today's climate is, I think observability gets relegated to a cost center. It's something that every company must have, every company has today, and it's often looked at a tool that gives you insights about your infrastructure and your applications and it's a backend tool, something you have to have, something you have to pay for and it doesn't really move the direct needle for the business top line. And I think that's often something that companies don't talk about enough. 
And you know, from our experience at Uber and through most of the companies that we work with here at Chronosphere, yes, there are infrastructure problems and application-level problems that we help companies solve, but ultimately, the more mature organizations, when it comes to observability, are often starting to get real-time insights into the business, not just the application layer and the infrastructure layer. And if you think about it, in companies that are cloud-native architected, there's not one single endpoint or one single application that fulfills a single customer request. So, even if you could look at all the individual pieces, what we actually have to do for customers in our products and services spans across so many of them that often you need to introduce a new view, a view that's just focused on your customers, just focused on the business, and sort of apply the same type of techniques you use on your backend infrastructure to your business. Now, this isn't a replacement for your BI tools, you still need those, but what we find is that BI tools are more used for longer-term strategic decisions, whereas you may need to do a lot of more tactical, business-operational functions based on having a live view of the business. So, what we find is often observability is only ever thought about for infrastructure, it's only ever thought about as a cost center, but ultimately observability tooling can actually add a lot directly to your top line by giving you visibility into the products and services that make up that top line. And I would say the more mature organizations that we work with here at Chronosphere all had their executives looking at, you know, monitoring dashboards to really get a good sense of what's happening in their business in real time. So, I think that's something that hopefully a lot more companies evolve into over time, and they really see the full benefit of observability and what it can do for a business's top line. Corey: I think that's probably a fair way of approaching it. It seems similar, in some respects, to what I tend to see over in the cloud cost optimization space. People often want to have something prescriptive of, do this, do that, do the other thing, but it depends entirely on what the needs of the business are internally, it depends upon the stories that they wind up working with, it depends really on what their constraints are, what their architectures are doing. Very often it's a case of let's look and figure out what's going on, and accidentally, they discover they can blow 40% off their spend by just deleting things that aren't in use anymore. That becomes increasingly uncommon with scale, but it's still one of those questions of, "What do we do here and how?" Martin: Yep, a hundred percent. Corey: I really want to thank you for taking the time to speak with me today about what you're seeing. If people want to learn more, where's the best place for them to find you? Martin: Yeah, the best place is probably going to our website, Chronosphere.io, to find out more about the company, or if you want to chat with me directly, LinkedIn is probably the best place to come find me, via my name. Corey: And we will, of course, put links to both of those things in the [show notes 00:28:49]. Thank you so much for suffering the slings and arrows I was able to throw at you today. Martin: Thank you for having me, Corey. Always a pleasure to speak with you, and looking forward to our next conversation. Corey: Likewise.
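Martin's earlier point about shifting telemetry production onto open-source standards is easy to make concrete. Below is a minimal sketch, assuming the standard opentelemetry-sdk Python packages and an OTLP-capable collector; the endpoint, service name, and attributes are illustrative placeholders rather than anything Chronosphere-specific, and the same instrumentation can point at whichever backend vendor you choose.

```python
# Minimal sketch: emit a distributed trace in the vendor-neutral OpenTelemetry
# format so any OTLP-compatible backend can receive it. Endpoint and names are
# placeholders, not a real deployment.
from opentelemetry import trace
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter

provider = TracerProvider(resource=Resource.create({"service.name": "checkout"}))
provider.add_span_processor(
    BatchSpanProcessor(OTLPSpanExporter(endpoint="collector.example.com:4317", insecure=True))
)
trace.set_tracer_provider(provider)

tracer = trace.get_tracer(__name__)
with tracer.start_as_current_span("charge-card") as span:
    span.set_attribute("order.id", "12345")  # application context travels with the span
```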
Martin Mao, CEO and co-founder of Chronosphere. This promoted guest episode has been brought to us by Chronosphere, here on Screaming in the Cloud. And I'm Cloud Economist Corey Quinn. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice along with an insulting comment that I will never notice because I have an observability gap. Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.
Brandon Sherman, Cloud Security Engineer at Temporal Technologies Inc., joins Corey on Screaming in the Cloud to discuss his experiences at recent cloud conferences and the ongoing changes in cloud computing. Brandon shares why he enjoyed fwd:cloudsec more than this year's re:Inforce, and how he's seen AWS events evolve over the years. Brandon and Corey also discuss how the cloud has matured and why Brandon feels ongoing change can be expected to be the continuing state of cloud. Brandon also shares insights on how his perspective on Google Cloud has changed, and why he's excited about the future of Temporal.io. About Brandon: Brandon is currently a Cloud Security Engineer at Temporal Technologies Inc. One of Temporal's goals is to make our software as reliable as running water, but to stretch the metaphor it must also be *clean* water. He has stared into the abyss and it stared back, then bought it a beer before things got too awkward. When not at work, he can be found playing with his kids, working on his truck, or teaching his kids to work on his truck. Links Referenced: Temporal: https://temporal.io/ Personal website: https://brandonsherman.com Transcript: Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud. Corey: In the cloud, ideas turn into innovation at virtually limitless speed and scale. To secure innovation in the cloud, you need Runtime Insights to prioritize critical risks and stay ahead of unknown threats. What's Runtime Insights, you ask? Visit sysdig.com/screaming to learn more. That's S-Y-S-D-I-G.com/screaming. My thanks as well to Sysdig for sponsoring this ridiculous podcast. Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. I'm joined today by my friend who I am disappointed to say I have not dragged on to this show before. Brandon Sherman is a cloud security engineer over at Temporal. Brandon, thank you for finally giving in. Brandon: Thanks, Corey, for finally pestering me enough to convince me to join. Happy to be here. Corey: So, a few weeks ago as of this recording—I know that time is a flexible construct when it comes to the podcast production process—you gave a talk at fwd:cloudsec, the best cloud security conference named after an email subject line. Yes, I know re:Inforce also qualifies; this one's better. Tell me about what you talked about. Brandon: Yeah, definitely agree on this being the better of the two conferences. I gave a talk about how the ground shifts underneath us, kind of touching on how these cloud services that we operate—and I'm mostly experienced in AWS and that's kind of the references that I can give—but these services work on a contract basis, right? We use their APIs and we don't care how they're implemented behind the scenes. At this point, S3 has been rewritten I don't know how many times. I'm sure that other AWS services, especially the longer-lived ones, have gone through that same sort of rejuvenation cycle. But as a security practitioner, these implementation details—which are sort of byproducts of, you know, releasing an API or releasing a managed service—can have big implications for how you can either secure that service or respond to actions or activities that happen in that service.
And when I say actions and activity, I'm kind of focused on, like, security incidents, breaches, your ability to do incident response from that.Corey: One of the reasons I've always felt that cloud providers have been cagey around how the services work under the hood is not because they don't want to talk about it so much as they don't want to find themselves committed to certain patterns that are not guaranteed as a part of the definition of the service. So if, “Yeah, this is how it works under the hood,” and you start making plans and architecting in accordance with that and they rebuild the service out from under you like they do with S3, then very often, those things that you depend upon being true could very easily no longer be true. And there's no announcement around those things.Brandon: No. It's very much Amazon is… you know, they're building a service to meet the needs of their customers. And they're trying to grow these services as the customers grow along with them. And it's absolutely within their right to act that way, to not have to tell us when they make a change because in some contexts, right, Amazon's feature update might be me as a customer a breaking change. And Amazon wants to try and keep that, what they need to tell me, as small as possible, probably not out of malice, but just because there's a lot of people out there using their services and trying to figure out what they've promised to each individual entity through either literal contracts or their API contracts is hard work. And that's not the job I would want.Corey: No. It seems like it's one of those thankless jobs where you don't get praise for basically anything. Instead, all you get to do is deal with the grim reality that people either view as invisible or a problem.Brandon: Yeah. It sort of feels like documentation. Everyone wants more and better documentation, but it's always an auxiliary part of the service creation process. The best documentation always starts out when you write the documentation first and then kind of build backwards from that, but that's rarely how I've seen software get made.Corey: No. I feel like I left them off the hook, on some level, when we say this, but I also believe in being fair. I think there's a lot of things that cloud providers get right and by and large, with any of the large cloud providers, they are going to do a better job of securing the fundamentals than you are yourself. I know that that is a controversial statement to some folks who spent way too much time in the data centers, but I stand by it.Brandon: Yeah, I agree. I've had to work in both environments and some of the easiest, best wins in security is just what do I have, so that way I know what I have to protect, what that is there. But even just that asset inventory, that's the sort of thing that back in the days of data centers—and still today; it was data centers all over the place—to do an inventory you might need to go and send an actual human with an actual clipboard or iPad or whatever, to the actual physical location and hope that they read the labels on hundreds of thousands of servers correctly and get their serial numbers and know what you have. And that doesn't even tell you what's running on them, what ports are open, what stuff you have to care about. In AWS, I can run a couple of describe calls or list calls and that forms the backbone of my inventory.There's no server that, you know, got built into a wall or lost behind and some long-forgotten migration. 
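Brandon's "couple of describe calls or list calls" inventory is straightforward to sketch. A minimal version, assuming boto3 and whatever AWS credentials your environment already provides (nothing here is specific to Temporal), might look like this:

```python
# Minimal sketch of a cloud asset inventory: list every EC2 instance in every
# region the account can see. Extend with more list/describe calls as needed.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")
regions = [r["RegionName"] for r in ec2.describe_regions()["Regions"]]

inventory = []
for region in regions:
    paginator = boto3.client("ec2", region_name=region).get_paginator("describe_instances")
    for page in paginator.paginate():
        for reservation in page["Reservations"]:
            for instance in reservation["Instances"]:
                inventory.append({
                    "region": region,
                    "id": instance["InstanceId"],
                    "type": instance["InstanceType"],
                    "state": instance["State"]["Name"],
                })

print(f"{len(inventory)} instances found")
```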
A lot of that basic stuff really, really helps. Not to mention, with the managed services like S3, you never have to care about patch notes or what an update might do. Plenty of times I've, like, hesitated upgrading a software package because I didn't know what was going to happen. Control Tower, I guess, is kind of an exception to that where you do have to care about the version of your cloud service, but for, yeah, these other services, it's absolutely right: the undifferentiated heavy lifting is taken care of. And hopefully, we always kind of hope that the undifferentiated heavy lifting doesn't become differentiated and heavy and lands on us. Corey: So, now that we've done the obligatory be nice to cloud providers thing, let's potentially be a little bit harsher. While you were speaking at fwd:cloudsec, did you take advantage of the fact that you were in town to also attend re:Inforce? Brandon: I did because I was given a ticket, and I wanted to go see some people who didn't have tickets to fwd:cloudsec. Yeah, we've been nice to cloud providers, but as—I haven't found that I've learned a lot from the re:Inforce sessions. They're all recorded anyway. There's not even an open call for papers, right, for talking at a re:Inforce session about, "Hey, like, this would be important and fresh," or things that I would be wanting to share. And that's not the sort of thing that Amazon does with their conferences. And that's something that I think would be really interesting to change if there was a more community-minded track that let people submit, not just handpicked—although I suppose any kind of Amazon selection committee is going to be involved, but to pick out, from the community, stories or projects that are interesting that don't just have to get filtered through your TAM, but something you can actually talk to and say, "Hey, this is something I'd like to talk about. Maybe other people would find it useful." Corey: One of the things that I found super weird about re:Inforce this year has been that, in a normal year, it would have been a lot more notable, I think. I know for a fact that if I had missed re:Invent, for example, I would have had to be living in a cave not to see all of the various things coming out of that conference on social media, in my email, in all the filters I put out there. But unless you're looking for it, you would not know that they had a conference that costs almost as much. Brandon: Yeah. The re:Invent-driven development cycle is absolutely a real thing. You can always tell in the lead up to re:Invent when there's releases that get pushed out beforehand and you think, "Oh, that's cool. I wonder why this doesn't get a spot at re:Invent, right, some kind of announcement or whatever." And I was looking for that this year for re:Inforce and didn't see any kind of announcement or that kind of pre-release trickle of things that are like, oh, there's a bunch of really cool stuff. And that's not to say that cool stuff didn't happen; it's just that there was a very different marketing feel to it. Hard to say; it's just the vibes around it felt different [laugh]. Corey: Would you recommend that people attend next year—well, let me back up. I've heard that they had not even announced a date for next year. Do you think there will be a re:Inforce next year? Brandon: Making me guess, predict the future, something that I'm— Corey: Yeah, do a prediction. Why not? Brandon: [laugh]. Let's engage in some idle speculation, right?
I think that not announcing it was kind of a clue that there's a decent chance it won't happen because in prior years, it had been pre-announced at the—I think it was either at closing or opening ceremonies. Or at some point. There's always the, "Here's what you can look forward to next year." And that didn't happen, so I think there's a decent chance this may have been the last re:Inforce, especially once all the data is crunched and people look at the numbers. It might just be… I don't know, I'm not a marketing-savvy kind of person, but it might just be that a day at re:Invent next year is dedicated to security. But then again, security is always job zero at Amazon, so maybe re:Invent just becomes re:Inforce all the time, right? Do security, everybody. Corey: It just feels like a different type of conference. At re:Invent, there's something for everyone. At re:Inforce, there's something for everyone as long as they work in InfoSec. Because other than that, you wind up just having these really unfortunate spiels of them speaking to people that are not actually present, and it winds up missing the entire forest for the trees, really. Brandon: I don't know if I'd characterize it as that. I feel like some of the re:Inforce content was for people who were maybe curious about the cloud or making progress in their companies and moving to the cloud—and in Amazon's case when they say the cloud, they mean themselves. They don't mean any other cloud. And re:Inforce tries to dispel the notion there are any other clouds. But at the same time, it feels like an attempt to try and make people feel better. There's a change underway in the industry and it still is going to continue for a while. There's still all kinds of non-cloud environments people are going to operate, probably until the end of time. But at the same time, a lot of these are moving to the cloud and they want the people who are thinking about this or engaged in it to be comforted that Amazon either has these services, or there's a pattern you can follow to do something in a secure manner. I think that was kind of the primary audience of re:Inforce: people who were charged with doing cloud security or were exploring moving their corporate systems to AWS, and they wanted some assurance that they're going to actually be doing things the right way, or that someone else had made those mistakes first. And if that audience has been sort of saturated, then maybe there isn't a need for that style of conference anymore. Corey: It feels like it's not intended to be the same thing as re:Invent, which is probably, I guess, a bigger problem. Re:Invent for a long time has attempted to be all things to all people, and it has grown to a scale where that is no longer possible. So, they've also done a poor job of signaling that, so you wind up attending Adam Selipsky's keynote, and in many cases, find yourself bored absolutely to tears. Or you go in expecting it to be an Andy Jassy style of, "Here are 200 releases, four of them good," and instead, you wind up just having what feels like a relatively paltry number doled out over a period of days. And I don't know that they're wrong to do it; I just think it doesn't align with pre-existing expectations. I also think people expecting to go to re:Inforce to see a whole bunch of feature releases are bound to be disappointed. Brandon: Like, both of those are absolutely correct.
The number of releases on the slide must always increase up and the right; away we go; we're pushing more code and making more changes to services. I mean, if you look at the history, there's always new instance types. Do they count each instance type as a new release, or they not do that?Corey: Yeah, it honestly feels like that sometimes. They also love to do price cuts where they—you wind up digging into them and something like 90% of them are services you've never heard of in regions you couldn't find on a map if your life depended on it. It's not quite the, “Yeah, the bill gets lower all the time,” that they'd love to present it as being.Brandon: Yeah. And you may even find that there's services that had updates that you didn't know about until you go and check the final bill, the Cost and Usage Report, and you look and go, “Oh, hey. Look at all the services that we were using, that our engineers started using after they heard announcements at re:Invent.” And then you find out how much you're actually paying for them. [pause]. Or that they were in use in the first place. There's no better way to find what is actually happening in your environment than, look at the bill.Corey: It's depressing that that's true. At least they finally stopped doing the slides where they talk about year-over-year, they have a histogram of number of feature and service releases. It's, no one feels good about that, even the people building the services and features because they look at that and think, “Oh, whatever I do is going to get lost in the noise.” And they're not wrong. Customers see it and freak out because how am I ever going to keep current with all this stuff? I take a week off and I spend a month getting caught back up again.Brandon: Yeah. And are you going to—you know, what's your strategy for dealing with all these new releases and features? Do you want to have a strategy of saying, “No, you can't touch any of those until we've vetted and understand them?” I mean, you don't even have to talk about security in that context; just the cost alone, understanding it's someone, someone going to run an experiment that bankrupts your company by forgetting about it or by growing into some monster in the bill. Which I suspect helps [laugh] helps you out when those sorts of things happen, right, for companies don't have that strategy.But at the same time, all these things are getting released. There's not really a good way of understanding which of these do I need to care about. Which of these is going to really impact my operational flow, my security impacts? What does this mean to me as a user of the service when there's, I don't know, an uncountable number really, or at least a number that's so big, it stops mattering that it got any bigger?Corey: One thing that I will say was great about re:Invent, I want to say 2021, was how small it felt. It felt like really a harkening back to the old re:Invents. And then you know, 2022 hit, and we go there and half of us wound up getting Covid because of course we did. But it was also this just this massive rush of, we're talking with basically the population of a midsize city just showing up inside of this entire enormous conference. And you couldn't see the people you wanted to see, it was difficult to pay attention to all there was to pay attention to, and it really feels like we've lost something somewhere.Brandon: Yeah, but at the same time is that just because there are more people in this ecosystem now? 
You know, 2021 may have been a callback to that a decade ago. And these things were smaller when it was still niche, but growing in kind of the whole ecosystem. And parts of—let's say, the ecosystem there, I'm talking about like, how—when I say that ecosystem there, I'm kind of talking about how in general, I want to run something in technology, right? I need a server, I need an object store, I need compute, whatever it is that you need, there is more attractive services that Amazon offers to all kinds of customers now.So, is that just because, right, we've been in this for a while and we've seen the cloud grow up and like, oh, wow, you're now in your awkward teenage phase of cloud computing [laugh]? Have we not yet—you know, we're watching the maturity to adulthood, as these things go? I really don't know. But it definitely feels a little, uh… feels a little like we've watched this cloud thing grow from a half dozen services to now, a dozen-thousand services all operating different ways.Corey: Part of me really thinks that we could have done things differently, had we known, once upon a time, what the future was going to hold. So, much of the pain I see in Cloud is functionally people trying to shove things into the cloud that weren't designed with Cloud principles in mind. Yeah, if I was going to build a lot of this stuff from scratch myself, then yeah, I would have absolutely made a whole universe of different choices. But I can't predict the future. And yet, here we are.Brandon: Yep. If I could predict the future, I would have definitely won the lottery a lot more times, avoided doing that one thing I regretted that once back in my history [laugh]. Like, knowing the future change a lot of things. But at least unless you're not letting on with something, then that's something that no one's got the ability to, do not even at Amazon.Corey: So, one of the problems I've always had when I come back from a conference, especially re:Invent, it takes me a few… well, I'll be charitable and say days, but it's more like weeks, to get back into the flow of my day-to-day work life. Was there any of that with you and re:Inforce? I mean, what is your day job these days anyway? What are you up to?Brandon: What is my day job? There's a lot. So, Temporal is a small, but quickly growing company. A lot of really cool customers that are doing really cool things with our technology and we need to build a lot of basics, essentially, making sure that when we grow, that we're going to kind of grow into our security posture. There's not anything talking about predicting the future. My prediction is that the company I work for is going to do well. You can hold your analysis on that [laugh].So, while I'm predicting what the company that I'm working at is going to do well, part of it is also what are the things that I'm going to regret not having in two or three years' time. So, some baseline cloud monitoring, right? I want that asset inventory across all of our accounts; I want to know what's going on there. There's other things that are sort of security adjacent. 
So, things like DNS records, domain names, a lot of those things where if we can capture this and centralize it early and build it in a way—especially that users are less unhappy about, like, not everyone, for example, is hosting their own—buying their own domains on personal cards and filing for reimbursement, that DNS records aren't scattered across a dozen different software projects and manipulated in different ways, then that sets us up.It may not be perfect today, but in a year, year-and-a-half, two years, we have the ability to then say, “Okay, we know what we're pointing at. What are the dangling subdomains? What are the things that are potential avenues of being taken over? What do we have? What are people doing?” And trying to understand how we can better help users with their needs day-to-day.Also as a side part of my day job is advising a startup Common Fate. Does just-in-time access management. And that's been a lot of fun to do as well because fundamentally—this is maybe a hot take—that, in a lot of cases, you really only need admin access and read-only access when you're doing really intensive work. In Temporal day job, we've got infrastructure teams that are building stuff, they need lots of permissions and it'd be very silly to say you can't do your job just because you could potentially use IAM and privilege escalate yourself to administrator. Let's cut that out. Let's pretend that you are a responsible adult. We can monitor you in other ways, we're not going to put restrictions between you and doing your job. Have admin access, just only have it for a short period of time, when you say you're going to need it and not all the time, every account, every service, all the time, all day.Corey: I do want to throw a shout-in for that startup you advise, Common Fate. I've been a big fan of their Granted offering for a while now. granted.dev for those who are unfamiliar. I use that to automatically generate console logins, do all kinds of other things. When you're moving between a bunch of different AWS accounts, which it kind of feels like people building the services don't have to do somehow because of their Isengard system handling it for them. Well, as a customer, can I just say that experience absolutely sucks and Granted goes a long way toward making it tolerable, if not great.Brandon: Mm-hm. Yeah, I remember years ago, the way that I would have to handle this is I would have probably a half-dozen different browsers at the same time, Safari, Chrome, the Safari web developer preview, just so I could have enough browsers to log into with, to see all the accounts I needed to access. And that was an extremely painful experience. And it still feels so odd that the AWS console today still acts like you have one account. You can switch roles, you can type in a [role 00:21:23] on a different account, but it's very clunky to use, and having software out there that makes this easier is definitely, definitely fills a major pain point I have with using these services.Corey: Tired of Apache Kafka's complexity making your AWS bill look like a phone number? Enter Redpanda. You get 10x your streaming data performance without having to rob a bank. And migration? Smoother than a fresh jar of peanut butter. Imagine cutting as much as 50% off your AWS bills. With Redpanda, it's not a dream, it's reality. Visit go.redpanda.com/duckbill. Redpanda: Because Kafka shouldn't cause you nightmares.Corey: Do you believe that there's hope? 
Because we have seen some changes where originally AWS just had the AWS account you'd log into as the root user. Great. Then they had IAM. Now, they're using what used to be known as AWS SSO, which they wound up calling IAM Access Identity Center, or—I forget the exact words they put in order, but it's confusing and annoying. But it does feel like the trend is overall towards something that's a little bit more coherent. Brandon: Mm-hm. Corey: Is the future five years from now better than it looks like today? Brandon: That's certainly the hope. I mean, we've talked about how we both can't predict the future, but I would like to hope that the future gets better. I really like GCP's project model. There are complaints I have with how Google Cloud works, whether it's going to be here next year, and whether the permission model is exactly how I'd like to use it, but I do like the mental organization; it feels like Google was able to come in and solve a lot of those problems with running projects and having a lot of these different things. And part of that is, there are still services in AWS that don't really respect resource-based permissions or tag-based permissions, or I think the new one is attribute-based access control. Corey: One of the challenges I see, too, is that I don't think that there's been a lot of thought put into how a lot of these things are going to work between different AWS accounts. One of my bits of guidance whenever I'm talking to someone who's building anything, be it at AWS or externally, is: imagine an architecture diagram, and now imagine that between any two resources in that diagram is now an account boundary. Because someone somewhere is going to have one there. It sounds ridiculous, but you can imagine a microservices scenario where every component is in its own isolated account. What are you going to do now as a result? Because if you're going to build something that scales, you've got to respect those boundaries. And usually, that just means the person starts drinking. Brandon: Not a bad place to start. The organizational structure—lowercase organizations, not the Amazon service, Organizations—it's still a little tricky to get it in a way that sort of… I guess, I always kind of feel that these things are going to change and that the—right, the only constant is change. That's true. The services we use are going to change. The way that we're going to want to organize them is going to change. Some researcher is going to come out with something and say, "Hey, I found a really cool way to do something really terrible to the stuff in your cloud environment." And that's going to happen eventually, in the fullness of time. So, how do we make sure we're able to react quickly to those kinds of changes? And how can we make sure that if, you know, suddenly, we do need to separate out these services, to, you know, decompose the monolith even more, or whatever the cool, current catchphrase is, we have those account boundaries, which are phenomenal boundaries—they make it so much easier: if you can do multi-account, then you've solved multi-region along the way, you've solved failover, you've solved security issues.
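To make the account-boundary point concrete, here is a minimal sketch of what crossing one of those boundaries looks like with STS: assume a role in the other account and build clients from the temporary credentials it returns. The account ID and role name are hypothetical placeholders, not anything from Temporal's environment.

```python
# Minimal sketch: hop an account boundary by assuming a role in the target
# account, then use the temporary credentials for API calls over there.
import boto3

sts = boto3.client("sts")
creds = sts.assume_role(
    RoleArn="arn:aws:iam::111111111111:role/deploy",  # hypothetical target role
    RoleSessionName="cross-account-demo",
)["Credentials"]

other_account_ec2 = boto3.client(
    "ec2",
    aws_access_key_id=creds["AccessKeyId"],
    aws_secret_access_key=creds["SecretAccessKey"],
    aws_session_token=creds["SessionToken"],
)
print(other_account_ec2.describe_instances()["ResponseMetadata"]["HTTPStatusCode"])
```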
You have not solved the fact that your life is considerably more challenging at the moment, but I would really hope that in you know, even next year, but by the time five years comes around, that that's really been taken to heart within Amazon and it's a lot easier to be working creating services in different accounts that can talk to each other, especially in the current environment where it's kind of a mess to wire these things all together. ClickOps has its place, but some console applications just don't want to believe that you have a KMS key in another account because well, why would you put that over there? It's not like if your current account has a problem, you want to lose all your data that's encrypted.Corey: It's one of those weird things, too, where the clouds almost seem to be arguing against each other. Like, I would be hard-pressed to advise someone not to put a ‘rehydrate the entire business' level of backups into a different cloud provider entirely, but there's so steeped in the orthodoxy of no other clouds ever, that that message is not something that they can effectively communicate. And I think they're doing their customers a giant disservice by that, just because it is so much easier to explain to your auditor that you've done it than to explain why it's not necessary. And it's never true; you always have the single point of failure of the payment instrument, or the contract with that provider that could put things at risk.Is it a likely issue? No. But if you're running a publicly traded company on top of it, you'd be negligent not to think about it that way. So, why pretend otherwise?Brandon: Is that a question for me because [laugh]—Corey: Oh, that was—no, absolutely. That was a rant ending in a rhetorical question. So, don't feel you have to answer it. But getting the statement out there because hopefully, someone at Amazon is listening to this.Brandon: That's, uh, hopefully, if you find out who's the one that listens to this and can affect it, then yeah, I'd like to send them a couple of emails because absolutely. There's room out there, there will always be room for at least two providers.Corey: Yeah, I'd say a third, but I don't know that Google is going to have the attention span to still have a cloud offering by lunchtime today.Brandon: Yeah. I really wish that I had more faith in the services and that they weren't going—you know, speaking of services changing underneath you, that's definitely a—speaking of services changing underneath, you definitely a major disservice if you don't know—if you're going to put into work into architecting and really using cloud providers as they're meant to be used. Not in a, sort of, least common denominator sense, in which case, you're not in good shape.Corey: Right. You should not be building something with an idea toward what if this gets deprecated. You shouldn't have to think about that on a consistent basis.Brandon: Mm-hm. Absolutely. You should expect those things to change because they will, right, the performance impact. I mean, the performance of these services is going to change, the underlying technology that the providers use is going to change, but you should still be able to mostly expect that at least the API calls you make are going to still be there and still be consistent come this time next year.Corey: The thing that really broke me was the recent selling off of Google domains to Squarespace. Nothing against Squarespace, but they have a different target market in many respects. 
And oh, I'm a Google customer, you're now going to give all of my information to a third party I never asked to deal with. Great. And more to the point, if I recommend Google to folks because as has happened in years past, then they canceled the thing that I recommended, then I looked like a buffoon. So, we've gotten to a point now where it has become so steady and so consistent, that I fear I cannot, in good conscience, recommend a Google product without massive caveats. Otherwise, I look like a clown or worse, a paid shill.Brandon: Yeah. And when you want to start incorporating these things into the core of your business, to take that point about, you know, total failover scenarios, you should, you know, from you want it to have a domain registered in a Google service that was provisioned to Google Cloud services, that whole sort of ecosystem involved there, that's now gone, right? If I want to use Google Cloud with a Google Cloud native domain name hosting services, I can't. How am—I just—now I can't [laugh]. There's, like, not workarounds available.I've got to go to some other third-party and it just feels odd that an organization would sort of take those core building blocks and outsource them. [I know 00:29:05] that Google's core offering isn't Google Cloud; it's not their primary focus, and it kind of reflects that, which was a shame. There's things that I'd love to see grow out of Google Cloud and get better. And, you know, competition is good for the whole cloud computing industry.Corey: I think that it's a sad thing, but it's real, that there are people who were passionate defenders of Google over the years. I used to be one. We saw a bunch of them with Stadia fans coming out of the woodwork, and then all those people who have defended Google and said, “No, no, you can trust Google on this service because it's different,” for some reason or other, then wind up looking ridiculous. And some of the staunchest Google defenders that I've seen are starting to come around to my point of view. Eventually, you've run out of people who are willing to get burned if you burn them all.Brandon: Yeah. I've always been a little, uh… maybe this is the security Privacy part of me; I've always been a little leery of the services that really want to capture and gather your data. But I always respected the Google engineering that went into building these things at massive scale. It's something beyond my ability to understand as I haven't worked in something that big before. And Google made it look… maybe not effortless, but they made it look like they knew what they were doing, they could build something really solid.And I don't know if that's still true because it feels like they might know how to build something, and then they'll just dismantle it and turn it over to somebody else, or just dismantle it completely. And I think humans, we do a lot of things because we don't want to look foolish and… now recommending Google Cloud starts to make you wonder, “Am I going to look foolish?” Is this going to be a reflection on me in a year or two years, when you got to come in to say, “Hey, I guess that whole thing we architected around, it's being sold to someone else. It's being closed down. We got to transfer and rearchitect our whole whatever we built because of factors out of our control.” I want to be rearchitecting things because I screwed it up. 
I want to be rearchitecting things because I made an interesting novel mistake, not something that's kind of mundane, like, oh, I guess the thing we were going to use got shut down. Like, that makes it look like not only can I not predict the future, but I can't even pretend to read the tea leaves.Corey: And that's what's hard is because, on some level, our job, when we work in operations and cloud and try and make these decisions, is to convince the business we know what we're talking about. And when we look foolish, we don't make that same mistake again.Brandon: Mm-hm. Billing and security are oftentimes frequently aligned with each other. We're trying to convince the business that we need to build things a certain way to get a certain outcome, right? Either lower costs or more performance for the dollar, so that way, we don't wind up in the front page of newspapers, any kinds of [laugh] any kind of those things.Corey: Oh, yes. I really want to thank you for taking the time to speak with me. If people want to learn more, where's the best place for them to find you?Brandon: The best place to find me, I have a website about me, [brandonsherman.com 00:32:13]. That's where I post stuff. There's some links to—I have a [Mastodon 00:32:18] profile. I'm not much of a social, sort of post your information out there kind of person, but if you want to get a hold of me, then that's probably the best way to find me and contact me. Either that or head out to the desert somewhere, look for a silver truck out in the dunes and without technology around. It's another good spot if you can find me there.Corey: And I will include a link to that, of course, in the [show notes 00:32:45]. Thank you so much for taking the time to speak with me today. As always, I appreciate it.Brandon: Thank you very much for having me, Corey. Good to chat with you.Corey: Brandon Sherman, cloud security engineer at Temporal. I'm Cloud Economist Corey Quinn and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice along with an angry comment that will somehow devolve into you inviting me to your new uninspiring cloud security conference that your vendor is putting on, and is of course named after an email subject line.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.
In today's episode, Luan Moreno and Mateus Oliveira talk about their participation in Kafka Summit London 2023. Kafka Summit is one of the largest technology conferences in the world, where streaming technology companies announce what's new and we can learn more about how they are using these technologies day to day. The conference had three parts: Keynote (announcements); Vendor Hall (where the sponsors are); Sessions (rooms where the speakers give their presentations). In this conversation we also cover the following topics: open-source announcements; Confluent announcements; an overview of the sessions; the sponsor hall; main impressions of the conference. Learn more about technologies such as Apache Kafka, Apache Flink, and other streaming technologies. We'll also look at how companies such as European financial institutions, Apple, Uber, and Netflix are using Apache Kafka to solve business problems. Kafka Summit 2023 London: https://www.confluent.io/events/kafka-summit-london-2023/ Luan Moreno = https://www.linkedin.com/in/luanmoreno/
Jake Gold, Infrastructure Engineer at Bluesky, joins Corey on Screaming in the Cloud to discuss his experience helping to build Bluesky and why he's so excited about it. Jake and Corey discuss the major differences when building a truly open-source social media platform, and Jake highlights his focus on reliability. Jake explains why he feels downtime can actually be a huge benefit to reliability engineers, and how he views abstractions based on the size of the team he's working on. Corey and Jake also discuss whether cloud is truly living up to its original promise of lowered costs. About Jake: Jake Gold leads infrastructure at Bluesky, where the team is developing and deploying the decentralized social media protocol, ATP. Jake has previously managed infrastructure at companies such as Docker and Flipboard, and most recently, he was the founding leader of the Robot Reliability Team at Nuro, an autonomous delivery vehicle company. Links Referenced: Bluesky: https://blueskyweb.xyz/ Bluesky waitlist signup: https://bsky.app Transcript: Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud. Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. In case folks have missed this, I spent an inordinate amount of time on Twitter over the last decade or so, to the point where my wife, my business partner, and a couple of friends all went in over the holidays and got me a leather-bound set of books titled The Collected Works of Corey Quinn. It turns out that I have over a million words of shitpost on Twitter. If you've also been living in a cave for the last year, you'll notice that Twitter has basically been bought and driven into the ground by the world's saddest manchild, so there's been a bit of a diaspora as far as people trying to figure out where community lives. Jake Gold is an infrastructure engineer at Bluesky—which I will continue to be mispronouncing as Blue-ski because that's the kind of person I am—which is, as best I can tell, one of the leading contenders, if not the leading contender, to replace what Twitter was for me. Jake, welcome to the show. Jake: Thanks a lot, Corey. Glad to be here. Corey: So, there's a lot of different angles we can take on this. We can talk about the policy side of it, we can talk about social networks and things we learn watching people in large groups with quasi-anonymity, we can talk about all kinds of different nonsense. But I don't want to do that because I am an old-school Linux systems administrator. And I believe you came from the exact same path, given that as we were making sure that I had, you know, the right person on the show, you came in to work at a company after I'd left previously. So, not only are you good at the whole Linux server thing; you also have seen exactly how good I am not at the Linux server thing. Jake: Well, I don't remember there being any problems at TrueCar, where you worked before me. But yeah, my background is doing Linux systems administration, which turned into, sort of, Linux programming. And these days, we call it, you know, site reliability engineering. But yeah, I discovered Linux in the late-90s, as a teenager, you know, installing Slackware on 50 floppy disks and things like that.
And I just fell in love with the magic of, like, being able to run a web server, you know? I got a hosting account at, you know, my local ISP, and I was like, how do they do that, right?And then I figured out how to do it. I ran Apache, and it was like, still one of my core memories of getting, you know, httpd running and being able to access it over the internet and telling my friends on IRC. And so, I've done a whole bunch of things since then, but that's still, like, the part that I love the most.Corey: The thing that continually surprises me is just what I think I'm out and we've moved into a fully modern world where oh, all I do is I write code anymore, which I didn't realize I was doing until I realized if you call YAML code, you can get away with anything. And I get dragged—myself getting dragged back in. It's the falling back to fundamentals in these weird moments of yes, yes, immutable everything, Infrastructure is code, but when the server is misbehaving and you want to log in and get your hands dirty, the skill set rears its head yet again. At least that's what I've been noticing, at least as far as I've gone down a number of interesting IoT-based projects lately. Is that something you experience or have you evolved fully and not looked back?Jake: Yeah. No, what I try to do is on my personal projects, I'll use all the latest cool, flashy things, any abstraction you want, I'll try out everything, and then what I do it at work, I kind of have, like, a one or two year, sort of, lagging adoption of technologies, like, when I've actually shaken them out in my own stuff, then I use them at work. But yeah, I think one of my favorite quotes is, like, “Programmers first learn the power of abstraction, then they learn the cost of abstraction, and then they're ready to program.” And that's how I view infrastructure, very similar thing where, you know, certain abstractions like container orchestration, or you know, things like that can be super powerful if you need them, but like, you know, that's generally very large companies with lots of teams and things like that. And if you're not that, it pays dividends to not use overly complicated, overly abstracted things. And so, that tends to be [where 00:04:22] I follow up most of the time.Corey: I'm sure someone's going to consider this to be heresy, but if I'm tasked with getting a web application up and running in short order, I'm putting it on an old-school traditional three-tier architecture where you have a database server, a web server or two, and maybe a job server that lives between them. Because is it the hotness? No. Is it going to be resume bait? Not really.But you know, it's deterministic as far as where things live. When something breaks, I know where to find it. And you can miss me with the, “Well, that's not webscale,” response because yeah, by the time I'm getting something up overnight, to this has to serve the entire internet, there's probably a number of architectural iterations I'm going to be able to go through. The question is, what am I most comfortable with and what can I get things up and running with that's tried and tested?I'm also remarkably conservative on things like databases and file systems because mistakes at that level are absolutely going to show. 
Now, I don't know how much you're able to talk about the Blue-ski infrastructure without getting yelled at by various folks, but how modern versus… reliable—I guess that's probably a fair axis to put it on: modernity versus reliability—where on that spectrum, does the official Blue-ski infrastructure land these days?Jake: Yeah. So, I mean, we're in a fortunate position of being an open-source company working on an open protocol, and so we feel very comfortable talking about basically everything. Yeah, and I've talked about this a bit on the app, but the basic idea we have right now is we're using AWS, we have auto-scaling groups, and those auto-scaling groups are just EC2 instances running Docker CE—the Community Edition—for the runtime and for containers. And then we have a load balancer in front and a Postgres multi-AZ instance in the back on RDS, and it is really, really simple.And, like, when I talk about the difference between, like, a reliability engineer and a normal software engineer is, software engineers tend to be very feature-focused, you know, they're adding capabilities to a system. And the goal and the mission of a reliability team is to focus on reliability, right? Like, that's the primary thing that we're worried about. So, what I find to be the best resume builder is that I can say with a lot of certainty that if you talk to any teams that I've worked on, they will say that the infrastructure I ran was very reliable, it was very secure, and it ended up being very scalable because you know, the way we solve the, sort of, integration thing is you just version your infrastructure, right? And I think this works really well.You just say, “Hey, this was the way we did it now and we're going to call that V1. And now we're going to work on V2. And what should V2 be?” And maybe that does need something more complicated. Maybe you need to bring in Kubernetes, you maybe need to bring in a super-cool reverse proxy that has all sorts of capabilities that your current one doesn't.Yeah, but by versioning it, you just—it takes away a lot of the, sort of, interpersonal issues that can happen where, like, “Hey, we're replacing Jake's infrastructure with Bob's infrastructure or whatever.” I just say it's V1, it's V2, it's V3, and then I find that solves a huge number of the problems with that sort of dynamic. But yeah, at Bluesky, like, you know, the big thing that we are focused on is federation is scaling for us because the idea is not for us to run the entire global infrastructure for AT Proto, which is the protocol that Bluesky is based on. The idea is that it's this big open thing like the web, right? Like, you know, Netscape popularized the web, but they didn't run every web server, they didn't run every search engine, right, they didn't run all the payment stuff. They just did all of the core stuff, you know, they created SSL, right, which became TLS, and they did all the things that were necessary to make the whole system large, federated, and scalable. But they didn't run it all. And that's exactly the same goal we have.Corey: The obvious counterexample is, no, but then you take basically their spiritual successor, which is Google, and they build the security, they build—they run a lot of the servers, they have the search engine, they have the payments infrastructure, and then they turn a lot of it off for fun and… I would say profit, except it's the exact opposite of that. But I digress. 
I do have a question for you that I love to throw at people whenever they start talking about how their infrastructure involves auto-scaling. And I found this during the pandemic in that a lot of people believed in their heart-of-hearts that they were auto-scaling, but people lie, mostly to themselves. And you would look at their daily or hourly spend of their infrastructure and their user traffic dropped off a cliff and their spend was so flat you could basically eat off of it and set a table on top of it. If you pull up Cost Explorer and look through your environment, how large are the peaks and valleys over the course of a given day or week cycle?Jake: Yeah, no, that's a really good point. I think my basic approach right now is that we're so small, we don't really need to optimize very much for cost, you know? We have this sort of base level of traffic and it's not worth a huge amount of engineering time to do a lot of dynamic scaling and things like that. The main benefit we get from auto-scaling groups is really just doing the refresh to replace all of them, right? So, we're also doing the immutable server concept, right, which was popularized by Netflix.And so, that's what we're really getting from auto-scaling groups. We're not even doing dynamic scaling, right? So, it's not keyed to some metric, you know, the number of instances that we have at the app server layer. But the cool thing is, you can do that when you're ready for it, right? The big issue is, you know, okay, you're scaling up your app instances, but is your database scaling up, right, because there's not a lot of use in having a whole bunch of app servers if the database is overloaded? And that tends to be the bottleneck for, kind of, any complicated kind of application like ours. So, right now, the bill is very flat; you could eat off, and—if it wasn't for the CDN traffic and the load balancer traffic and things like that, which are relatively minor.Corey: I just want to stop for a second and marvel at just how educated that answer was. It's, I talk to a lot of folks who are early-stage who come and ask me about their AWS bills and what sort of things should they concern themselves with, and my answer tends to surprise them, which is, “You almost certainly should not unless things are bizarre and ridiculous. You are not going to build your way to your next milestone by cutting costs or optimizing your infrastructure.” The one thing that I would make sure to do is plan for a future of success, which means having account segregation where it makes sense, having tags in place so that when, “Huh, this thing's gotten really expensive. What's driving all of that?” Can be answered without a six-week research project attached to it.But those are baseline AWS Hygiene 101. How do I optimize my bill further, usually the right answer is go build. Don't worry about the small stuff. What's always disturbing is people have that perspective and they're spending $300 million a year. But it turns out that not caring about your AWS bill was, in fact, a zero interest rate phenomenon.Jake: Yeah. So, we do all of those basic things. I think I went a little further than many people would where every single one of our—so we have different projects, right? So, we have the big graph server, which is sort of like the indexer for the whole network, and we have the PDS, which is the Personal Data Server, which is, kind of, where all of people's actual social data goes, your likes and your posts and things like that. 
And then we have a dev, staging, sandbox, prod environment for each one of those, right? And there's more services besides. But the way we have it is those are all in completely separated VPCs with no peering whatsoever between them. They are all on distinct IP addresses, IP ranges, so that we could do VPC peering very easily across all of them. Corey: Ah, that's someone who's done data center work before with overlapping IP address ranges and swore, never again. Jake: Exactly. That is when I had been burned. I have cleaned up my mess and other people's messes. And there's nothing less fun than renumbering a large complicated network. But yeah, so once we have all these separate VPCs, it's very easy for us to say, hey, we're going to take this whole stack from here and move it over to a different region, a different provider, you know? And the other thing that we're doing is, we're completely cloud agnostic, right? I really like AWS, I think they are the… the market leader for a reason: they're very reliable. But we're building this large federated network, so we're going to need to place infrastructure in places where AWS doesn't exist, for example, right? So, we need the ability to take an environment and replicate it wherever. And of course, they have very good coverage, but there are places they don't exist. And that's all made much easier by the fact that we've had a very strong separation of concerns. Corey: I always found it fun that when you had these decentralized projects that were invariably NFT or cryptocurrency-driven over the past, eh, five or six years or so, and then AWS would take a us-east-1 outage in a variety of different and exciting ways, and all these projects would go down hard. It's, okay, you talk a lot about decentralization for having hard dependencies on one company in one data center, effectively, doing something right. And it becomes a harder problem in the fullness of time. There is the counterargument, in that when us-east-1 is having problems, most of the internet isn't working, so does your offering need to be up and running at all costs? There are some people for whom that answer is very much, yes. People will die if what we're running is not up and running. Usually, a social network is not on that list. Jake: Yeah. One of the things that is surprising, I think, often when I talk about this as a reliability engineer, is that I think people sometimes over-index on downtime, you know? They just think it's a much bigger deal than it is. You know, I've worked on systems where there was credit card processing where you're losing a million dollars a minute or something. And like, in that case, okay, it matters a lot because you can put a real dollar figure on it, but it's amazing how a few of the bumps in the road we've already had with Bluesky have turned into, sort of, fun events, right? Like, we had a bug in our invite code system where people were getting too many invite codes and it sort of caused a problem, but it was a super fun event. We all think back on it fondly, right? And so, outages are not fun, but they're not life and death, generally. And if you look at the traffic, usually what happens is after an outage traffic tends to go up. And a lot of the people that joined, they're just, they're talking about the fun outage that they missed because they weren't even on the network, right? So, it's like, I also like to remind people that eBay for many years used to have, like, an outage Wednesday, right?
Jake: They could put a huge dollar figure on how much money they lost every Wednesday and yet eBay did quite well, right? Like, it's amazing what you can do if you relax the constraints of downtime a little bit. You can do maintenance things that would be impossible otherwise, which makes the whole thing work better the rest of the time, for example.

Corey: I mean, it's 2023 and the Social Security Administration's website still has business hours. They take a nightly four to six-hour maintenance window. It's like, the last person out of the office turns off the server or something. I imagine some horrifying mainframe job that needs to wind up sweeping up after itself or running some compute jobs. But yeah, for a lot of these use cases, that downtime is absolutely acceptable. I am curious as to… as you just said, you're building this out with an idea that it runs everywhere. So, you're on AWS right now because yeah, they are the market leader for a reason. If I'm building something from scratch, I'd be hard-pressed not to pick AWS for a variety of reasons. If I didn't have cloud expertise, I think I'd be more strongly inclined toward Google, but that's neither here nor there. But the problem is these large cloud providers have certain economic factors that they all treat similarly since they're competing with each other, and that causes me to believe things that aren't necessarily true. One of those is that egress bandwidth to the internet is very expensive. I've worked in data centers. I know how 95th percentile commit bandwidth billing works. It is not overwhelmingly expensive, but you can be forgiven for believing that it is, looking at cloud environments. Today, Blue-ski does not support animated GIFs—however you want to mispronounce that word—they don't support embedded videos, and my immediate thought is, "Oh yeah, those things would be super expensive to wind up sharing." I don't know that that's true. I don't get the sense that those are major cost drivers. I think it's more a matter of complexity than the rest. But how are you making sure that the large cloud provider economic models don't inherently shape your view of what to build versus what not to build?

Jake: Yeah, no, I kind of knew where you were going as soon as you mentioned that because anyone who's worked in data centers knows that the bandwidth pricing is out of control. And I think one of the cool things that Cloudflare did is they stopped charging for egress bandwidth in certain scenarios, which is kind of amazing. And I think it's—the other thing that a lot of people don't realize is that, you know, these network connections tend to be fully symmetric, right? So, if it's a gigabit down, it's also a gigabit up at the same time, right? There's two gigabits that can be transferred per second. And then the other thing that I find a little bit frustrating on the public cloud is that they don't really pass on the compute performance improvements that have happened over the last few years, right? Like, computers are really fast, right? So, if you look at a provider like Hetzner, they're giving you these monster machines for $128 a month or something, right? And then you go and try to buy that same thing on the public, the big cloud providers, and the equivalent is ten times that, right?
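[Editor's note: a minimal sketch of the 95th percentile commit billing Corey references, the scheme most transit providers and colos use: sample utilization every five minutes, throw away the top 5% of samples, and bill the highest remaining sample at a per-Mbps rate. The sample data and the per-Mbps rate below are made-up numbers for illustration.]

```python
# Sketch: 95th percentile ("burstable") bandwidth billing as commonly used by
# transit providers. Sample values and the per-Mbps rate are made up.
import math
import random

random.seed(42)

# Five-minute utilization samples (Mbps) for a 30-day month: 30 * 24 * 12 samples.
samples = [random.uniform(100, 400) for _ in range(30 * 24 * 12)]
samples += [random.uniform(2000, 5000) for _ in range(50)]  # a short traffic spike

# Drop the top 5% of samples; bill at the highest remaining one.
samples.sort()
index = math.ceil(len(samples) * 0.95) - 1
billable_mbps = samples[index]

rate_per_mbps = 0.50  # illustrative commit rate in dollars per Mbps per month
print(f"95th percentile: {billable_mbps:,.0f} Mbps")
print(f"Monthly bandwidth bill: ${billable_mbps * rate_per_mbps:,.2f}")
# The point Corey makes: the spike barely moves the bill, because the top 5%
# of samples (roughly 36 hours per month) are discarded before billing.
```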
Jake: And then if you add in the bandwidth, it's another multiple, depending on how much you're transferring.

Corey: You can get Mac Minis on EC2 now, and you do the math out and the Mac Mini hardware is paid for in the first two or three months of spinning that thing up. And yes, there's value in AWS's engineering and being able to map IAM and EBS to it. In some use cases, yeah, it's well worth having, but not in every case. And the economics get very hard to justify for an awful lot of workloads.

Jake: Yeah, I mean, to your point, though, about, like, limiting product features and things like that, like, one of the goals I have with doing infrastructure at Bluesky is to not let the infrastructure be a limiter on our product decisions. And a lot of that means that we'll put servers on Hetzner, we'll colo servers for things like that. I find that there's a really good hybrid cloud thing where you use AWS or GCP or Azure, and you use them for your most critical things, your relatively low-bandwidth things and the things that need to be the most flexible in terms of region and things like that—and security—and then for these, sort of, bulk services, pushing a lot of video content, right, or pushing a lot of images, those things, you put in a colo somewhere and you have these sort of CDN-like servers. And that kind of gives you the best of both worlds. And so, you know, that's the approach that we'll most likely take at Bluesky.

Corey: I want to emphasize something you said a minute ago about Cloudflare, where when they first announced R2, their object store alternative, when it first came out, I did an analysis on this to explain to people just why this was as big as it was. Let's say you have a one-gigabyte file and it blows up and a million people download it over the course of a month. AWS will come to you with a completely straight face, give you a bill for $65,000 and expect you to pay it. The exact same pattern with R2 in front of it, at the end of the month, you will be faced with a bill for 13 cents rounded up, and you will be expected to pay it, and something like 9 to 12 cents of that initially would have just been the storage cost on S3 and the single egress fee for it. The rest is there is no egress cost tied to it. Now, is Cloudflare going to let you send petabytes to the internet and not charge you on a bandwidth basis? Probably not. But they're also going to reach out with an upsell and they're going to have a conversation with you. "Would you like to transition to our enterprise plan?" Which is a hell of a lot better than, "I got Slashdotted"—or whatever the modern version of that is—"And here's a surprise bill that's going to cost as much as a Tesla."

Jake: Yeah, I mean, I think one of the things that the cloud providers should hopefully eventually do—I hope Cloudflare pushes them in this direction—is to start—the original vision of AWS when I first started using it in 2006 or whenever it launched, was—and they said this—they said they're going to lower your bill every so often, you know, as Moore's law makes their costs lower. And that kind of happened a little bit here and there, but it hasn't happened to the same degree that, you know, I think all of us hoped it would. And I would love to see a cloud provider—and you know, Hetzner does this to some degree, but I'd love to see these really big cloud providers that are so great in so many ways, just pass on the savings of technology to the customer so we'll use more stuff there.
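[Editor's note: a back-of-the-envelope version of the R2 comparison Corey describes. The tiered egress rates below are approximate, illustrative assumptions rather than a current price sheet, so the computed figure differs from the dollar amount Corey quotes; the shape of the comparison is the point.]

```python
# Sketch: rough internet-egress cost for a 1 GB file downloaded 1,000,000 times
# in a month. Tier boundaries and $/GB rates are illustrative assumptions, not
# a current AWS price sheet; real bills vary by region, tier, and date.
total_gb = 1 * 1_000_000  # roughly 1 PB of egress

# tiers of (size in GB, $/GB): approximate rates after any free allowance
tiers = [
    (10 * 1024, 0.09),     # first 10 TB
    (40 * 1024, 0.085),    # next 40 TB
    (100 * 1024, 0.07),    # next 100 TB
    (float("inf"), 0.05),  # beyond 150 TB
]

remaining = total_gb
aws_egress = 0.0
for size_gb, rate in tiers:
    used = min(remaining, size_gb)
    aws_egress += used * rate
    remaining -= used
    if remaining <= 0:
        break

print(f"Approximate tiered egress bill: ${aws_egress:,.0f}")
# Cloudflare R2 charges nothing for egress, so the equivalent R2 bill is the
# storage and per-request pennies Corey mentions. His $65,000 figure reflects
# the rates he analyzed at the time; this estimate uses the assumed rates above.
```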
Jake: I think it's a very enlightened viewpoint to just say, "Hey, we're going to lower the costs, increase the efficiency, and then pass it on to customers, and then they will use more of our services as a result." And I think Cloudflare is kind of leading the way in there, which I love.

Corey: I do need to add something there—because otherwise we're going to get letters and I don't think we want that—where AWS reps will, of course, reach out and say that they have cut prices over a hundred times. And they're going to ignore the fact that a lot of these were a service you don't use in a region you couldn't find on a map if your life depended on it now is going to be 10% less. Great. But let's look at the general case, where from C3 to C4—if you get the same size instance—it cut the price by a lot. C4 to C5, somewhat. C5 to C6 effectively is no change. And now, from C6 to C7, it is 6% more expensive like for like. And they're making noises about price performance is still better, but there are an awful lot of us who say things like, "I need ten of these servers to live over there." That workload gets more expensive when you start treating it that way. And maybe the price performance is there, maybe it's not, but it is clear that "the bill always goes down" is not true.

Jake: Yeah, and I think for certain kinds of organizations, it's totally fine the way that they do it. They do a pretty good job on price and performance. But for sort of more technical companies—especially—it's just, you can see the gaps there that Hetzner is filling and that colocation is still filling. And I personally, you know, if I didn't need to do those things, I wouldn't do them, right? But the fact that you need to do them, I think, says kind of everything.

Corey: Tired of wrestling with Apache Kafka's complexity and cost? Feel like you're stuck in a Kafka novel, but with more latency spikes and less existential dread by at least 10%? You're not alone. What if there was a way to 10x your streaming data performance without having to rob a bank? Enter Redpanda. It's not just another Kafka wannabe. Redpanda powers mission-critical workloads without making your AWS bill look like a phone number. And with full Kafka API compatibility, migration is smoother than a fresh jar of peanut butter. Imagine cutting as much as 50% off your AWS bills. With Redpanda, it's not a pipedream, it's reality. Visit go.redpanda.com/duckbill today. Redpanda: Because your data infrastructure shouldn't give you Kafkaesque nightmares.

Corey: There are so many weird AWS billing stories that all distill down to you not knowing this one piece of trivia about how AWS works, either as a system, as a billing construct, or as something else. And there's a reason this has become my career of tracing these things down. And sometimes I'll talk to prospective clients, and they'll say, "Well, what if you don't discover any misconfigurations like that in our account?" It's, "Well, you would be the first company I've ever seen where that [laugh] was not true." So honestly, I want to do a case study if we do. And I've never had to write that case study, just because it's the tax on not having the forcing function of building in data centers. There's always this idea that in a data center, you're going to run out of power, space, capacity, at some point and it's going to force a reckoning. The cloud has what distills down to infinite capacity; they can add it faster than you can fill it. So, at some point it's always just keep adding more things to it.
There's never a "let's clean out all of the cruft" story. And it just accumulates and the bill continues to go up and to the right.

Jake: Yeah, I mean, one of the things that they've done so well is handle the provisioning part, right, which is kind of what you're getting at there. One of the hardest things in the old days, before we all used AWS and GCP, is you'd have to sort of requisition hardware and there'd be this whole process with legal and financing and there'd be this big lag between the time you need a bunch more servers in your data center and when you actually have them, right, and that's not even counting the time it takes to rack them and get them, you know, on the network. The fact that basically every developer now just gets an unlimited credit card they can just, you know, use, that's hugely empowering, and it's for the benefit of the companies they work for almost all the time. But it is an uncapped credit card. I know they actually support controls and things like that, but in general, the way we treated it—

Corey: Not as much as you would think, as it turns out. But yeah, it's—yeah, and that's a problem. Because again, if I want to spin up $65,000 an hour worth of compute right now, the fact that I can do that is massive. The fact that I could do that accidentally when I don't intend to is also massive.

Jake: Yeah, it's very easy to think you're going to spend a certain amount and then oh, traffic's a lot higher, or, oh, I didn't realize when you enable that thing, it charges you an extra fee or something like that. So, it's very opaque. It's very complicated. All of these things are, you know, the result of just building more and more stuff on top of more and more stuff to support more and more use cases. Which is great, but then it does create this very sort of opaque billing problem, which I think, you know, you're helping companies solve. And I totally get why they need your help.

Corey: What's interesting to me about distributed social networks is that I've been using Mastodon for a little bit and I've started to see some of the challenges around a lot of these things, just from an infrastructure and architecture perspective. Tim Bray, former Distinguished Engineer at AWS, posted a blog post yesterday, and okay, well, if Tim wants to put something up there that he thinks people should read, I advise people generally read it. I have yet to find him wasting my time. And I clicked it and got a, "Server over resource limits." It's like wow, you're very popular. You wound up getting—got effectively Slashdotted. And he said, "No, no. Whenever I post a link to Mastodon, two thousand instances all hit it at the same time." And it's, "Oh, yeah. The hug of death. That becomes a challenge." Not to mention the fact that, depending upon architecture and preferences that you make, running a Mastodon instance can be extraordinarily expensive in terms of storage, just because it'll, by default, attempt to cache everything that it encounters for a period of time. And that gets very heavy very quickly. Does the AT Protocol—AT Protocol? I don't know how you pronounce it officially these days—take into account the challenges of running infrastructure designed for folks who have corporate budgets behind them? Or is that really a future problem for us to worry about when the time comes?

Jake: No, yeah, that's a core thing that we talked about a lot in the recent, sort of, architecture discussions.
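[Editor's note: the "controls" Jake alludes to are things like AWS Budgets alerts; a minimal sketch follows. The account ID, dollar limit, and email address are placeholders, and note that a budget alert notifies you rather than hard-capping the "uncapped credit card."]

```python
# Sketch: a monthly cost budget with an 80%-of-limit email alert, the kind of
# guardrail Jake alludes to. Account ID, limit, and address are placeholders;
# this alerts on spend, it does not stop it.
import boto3

budgets = boto3.client("budgets", region_name="us-east-1")

budgets.create_budget(
    AccountId="123456789012",  # placeholder account ID
    Budget={
        "BudgetName": "monthly-infra-budget",
        "BudgetLimit": {"Amount": "5000", "Unit": "USD"},
        "TimeUnit": "MONTHLY",
        "BudgetType": "COST",
    },
    NotificationsWithSubscribers=[
        {
            "Notification": {
                "NotificationType": "ACTUAL",
                "ComparisonOperator": "GREATER_THAN",
                "Threshold": 80.0,  # percent of the budget limit
                "ThresholdType": "PERCENTAGE",
            },
            "Subscribers": [
                {"SubscriptionType": "EMAIL", "Address": "ops@example.com"}
            ],
        }
    ],
)
```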
Jake: I'm going to go back quite a ways, but there were some changes made about six months ago in our thinking, and one of the big things that we wanted to get right was the ability for people to host their own PDS, which is equivalent to, like, hosting a WordPress or something. It's where you post your content, it's where you post your likes, and all that kind of thing. We call it your repository or your repo. But we wanted to make it so that people could self-host that on a, you know, four-, five-, six-dollar-a-month droplet on DigitalOcean or wherever and that not be a problem, not go down when they got a lot of traffic. And so, the architecture of AT Proto in general, but the Bluesky app on AT Proto, is such that you really don't need a lot of resources. The data is all signed with your cryptographic keys—like, not something you have to worry about as a non-technical user—but all the data is authenticated. That's what—it's the Authenticated Transfer Protocol. And because of that, it doesn't matter where you get the data, right? So, we have this idea of this big indexer that's looking at the entire network called the BGS, the Big Graph Server, and you can go to the BGS and get the data that came from somebody's PDS and it's just as good as if you got it directly from the PDS. And that makes it highly cacheable, highly conducive to CDNs and things like that. So no, we intend to solve that problem entirely.

Corey: I'm looking forward to seeing how that plays out because the idea of self-hosting always kind of appealed to me when I was younger, which is why when I met my wife, I had a two-bedroom apartment—because I lived in Los Angeles, not San Francisco, and could afford such a thing—and the guest bedroom was always, you know, 10 to 15 degrees warmer than the rest of the apartment because I had a bunch of quote-unquote, "servers" there, meaning deprecated desktops that my employer had no use for and said, "It's either going to e-waste or your place if you want some." And, okay, why not? I'll build my own cluster at home. And increasingly over time, I found that it got harder and harder to do things that I liked and that made sense. I used to have a partial rack in downtown LA where I ran my own mail server, among other things. And when I switched to Google for email solutions, I suddenly found that I was spending five bucks a month at the time, instead of the rack rental, and I was spending two hours less a week just fighting spam in a variety of different ways because that is where my technical background lives. Being able to not have to think about problems like that, and just do the fun part was great. But I worry about the centralization that that implies. I was opposed to the idea because I didn't want to give Google access to all of my mail. And then I checked and something like 43% of the people I was emailing were at Gmail-hosted addresses, so they already had my email anyway. What was I really doing by not engaging with them? I worry that self-hosting is going to become passé, so I love projects that do it in sane and simple ways that don't require massive amounts of startup capital to get started with.

Jake: Yeah, the account portability feature of AT Proto is super, super core. You can back up all of your data to your phone—the [AT 00:28:36] doesn't do this yet, but it most likely will in the future—you can back up all of your data to your phone and then you can synchronize it all to another server.
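[Editor's note: a toy illustration of why signed records are location-independent, as Jake describes: anyone holding the author's public key can verify a record no matter which server, cache, or CDN handed it over. This uses a generic Ed25519 signature via the Python cryptography package and is not the actual AT Proto record format or key type.]

```python
# Toy illustration of "it doesn't matter where you get the data": a record
# signed by the author verifies the same whether it came from the PDS, the BGS,
# or a CDN cache. Generic Ed25519 here, not AT Proto's real record format.
import json
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
from cryptography.exceptions import InvalidSignature

author_key = Ed25519PrivateKey.generate()  # belongs to the account, not to any server

record = {"type": "post", "text": "hello from my PDS", "createdAt": "2023-05-31T00:00:00Z"}
payload = json.dumps(record, sort_keys=True).encode()
signature = author_key.sign(payload)

# Any downstream consumer (indexer, cache, another app) verifies with the
# author's public key; which host served the bytes is irrelevant to trust.
public_key = author_key.public_key()
try:
    public_key.verify(signature, payload)
    print("record verified: safe to index or cache regardless of source")
except InvalidSignature:
    print("record rejected: tampered or mis-signed")
```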
Jake: So, if for whatever reason, you're on a PDS instance and it disappears—which is a common problem in the Mastodon world—it's not really a problem. You just sync all that data to a new PDS and you're back where you were. You didn't lose any followers, you didn't lose any posts, you didn't lose any likes. And we're also making sure that this works for non-technical people. So, you know, you don't have to host your own PDS, right? That's something that technical people can self-host if they want to, non-technical people can just get a host from anywhere and it doesn't really matter where your host is. But we are absolutely trying to avoid the fate of SMTP and, you know, other protocols. The web itself, right, is sort of… it's hard to launch a search engine because, first of all, the bar is billions of dollars a year in investment, and a lot of websites will only let you crawl them at a higher rate if you're actually coming from a Google IP, right? They're doing reverse DNS lookups, and things like that, to verify that you are Google. And the problem with that is now there's sort of this centralization with a search engine that can't be fixed. With AT Proto, it's much easier to scrape all of the PDSes, right? So, if you want to crawl all the PDSes out on the AT Proto network, they're designed to be crawled from day one. It's all structured data; we're working on, sort of, how you handle rate limits and things like that still, but the idea is it's very easy to create an index of the entire network, which makes it very easy to create feed generators, search engines, or any other kind of, sort of, big world networking thing out there. And that's without making the PDSes have to be very high-powered, right? So, they can be low power and still scrapeable, still crawlable.

Corey: Yeah, the idea of having portability is super important. Question I've got—you know, while I'm talking to you, it's, we'll turn this into technical support hour as well because why not—I tend to always historically put my Twitter handle on conference slides. When I had the first template made, I used it as soon as it came in and there was an extra n in the @quinnypig username at the bottom. And of course, someone asked about that during Q&A. So, the answer I gave was, of course, n+1 redundancy. But great. If I were to have one domain there today and change it tomorrow, is there a redirect option in place where someone could go and find that on Blue-ski, and oh, they'll get redirected to where I am now? Or is it just one of those 404, sucks-to-be-you moments? Because I can see validity to both.

Jake: Yeah, so the way we handle it right now is if you have a something.bsky.social name and you switch it to your own domain or something like that, we don't yet forward it from the old .bsky.social name. But that is totally feasible. It's totally possible. Like, the way that those are stored in your what's called your [DID record 00:31:16] or [DID document 00:31:17] is that there's, like, a list that currently only has one item in general, but it's a list of all of your different names, right? So, you could have different domain names, different subdomain names, and they would all point back to the same user. And so yeah, so basically, the idea is that you have these aliases and they will forward to the new one, whatever the current canonical one is.

Corey: Excellent. That is something that concerns me because it feels like it's one of those one-way doors, in the same way that picking an email address was a one-way door.
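[Editor's note: the "DID document" Jake mentions is, roughly, a small identity record that lists the account's handles and points at its current PDS. The hypothetical sketch below follows general DID and AT Proto conventions as the editor understands them, with made-up values; treat it as an illustration rather than the exact schema.]

```python
# Hypothetical shape of an account's DID document: an alias list of handles plus
# a pointer to the current PDS. Field names follow general DID/AT Proto
# conventions as understood by the editor; all values are made up.
did_document = {
    "id": "did:plc:abc123exampleonly",
    "alsoKnownAs": [
        "at://alice.example.com",   # current canonical handle
        "at://alice.bsky.social",   # an older alias that could forward to it
    ],
    "service": [
        {
            "id": "#atproto_pds",
            "type": "AtprotoPersonalDataServer",
            "serviceEndpoint": "https://pds.example.com",  # where the repo lives today
        }
    ],
}

# Migrating hosts or renaming is an update to this record, not a new identity:
# followers keep following the same stable DID.
def current_pds(doc: dict) -> str:
    return next(s["serviceEndpoint"] for s in doc["service"] if s["id"] == "#atproto_pds")

print(current_pds(did_document))
```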
Corey: I know people who still pay money to their ancient crappy ISP because they have a few emails that come in once in a while that are super-important. I was fortunate enough to have jumped on the bandwagon early enough that my vanity domain is 22 years old this year. And my email address still works, which, great, every once in a while, I still get stuff to, like, variants of my name I no longer use anymore since 2005. And it's usually spam, but every once in a blue moon, it's something important, like, "Hey, I don't know if you remember me. We went to college together many years ago." It's ho-ly crap, the world is smaller than we think.

Jake: Yeah. I mean, I love that we're using domains. I think that's one of the greatest decisions we made is… is that you own your own domain. You're not really stuck in our namespace, right? Like, one of the things with traditional social networks is you're sort of their domain.com/yourname, right? And with the way AT Proto and Bluesky work is, you can go and get a domain name from any registrar, there's hundreds of them—you know, we like Namecheap, you can go there and you can grab a domain and you can point it to your account. And if you ever don't like anything, you can change your domain, you can change, you know, which PDS you're on; it's all completely controlled by you. And there's nearly no way we as a company can do anything to change that. Like, that's all sort of locked into the way that the protocol works, which creates this really great incentive where, you know, if we want to provide you services or somebody else wants to provide you services, they just have to compete on doing a really good job; you're not locked in. And that's, like, one of my favorite features of the network.

Corey: I just want to point something out because you mentioned, oh, we're big fans of Namecheap. I am too, for weird half-drunk domain registrations on a lark. Like, "Why am I poor?" It's like, $3,000 a month of my budget goes to domain purchases, great. But I did a quick whois on the official Bluesky domain and it's hosted at Route 53, which is Amazon's, of course, premier database offering. But I'm a big fan of using an enterprise registrar for enterprise-y things. Wasabi, if I recall correctly, wound up having their primary domain registered through GoDaddy, and the public domain that their bucket equivalent would serve data out of got shut down for 12 hours because some bad actor put something there that shouldn't have been. And GoDaddy is not an enterprise registrar, despite what they might think—for God's sake, the word 'daddy' is in their name. Do you really think that's enterprise? Good luck. So, the fact that you have a responsible company handling these central singular points of failure speaks very well to just your own implementation of these things. Because that's the sort of thing that everyone figures out the second time.

Jake: Yeah, yeah. I think there's a big difference between corporate domain registration and corporate DNS and, like, your personal handle on social networking. I think a lot of the consumer, sort of, domain registries are—registrars—are great for consumers. And I think if you—yeah, if you're running a big corporate domain, you want to make sure it's, you know, it's transfer-locked and, you know, there's two-factor authentication and you're doing all those kinds of things right, because that is a single point of failure; you can lose a lot by having your domain taken. So, I completely agree with you on there.

Corey: Oh, absolutely.
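[Editor's note: "pointing a domain at your account" works by publishing your DID where the network can find it; as the editor understands Bluesky's handle verification, that is either a `_atproto` TXT record on the domain or a well-known HTTPS path. The sketch below shows the DNS variant with the dnspython package and a made-up handle and DID; check current documentation before relying on the details.]

```python
# Sketch: checking a custom-domain handle against a DID via DNS. The handle and
# DID are made up; the "_atproto" TXT convention reflects the editor's
# understanding of Bluesky's handle verification. Requires: pip install dnspython
import dns.resolver

def resolve_handle_did(handle: str) -> str | None:
    """Return the did=... value published for a handle, or None if absent."""
    try:
        answers = dns.resolver.resolve(f"_atproto.{handle}", "TXT")
    except (dns.resolver.NXDOMAIN, dns.resolver.NoAnswer):
        return None
    for rdata in answers:
        txt = b"".join(rdata.strings).decode()
        if txt.startswith("did="):
            return txt.removeprefix("did=")
    return None

expected_did = "did:plc:abc123exampleonly"  # made-up DID for illustration
published = resolve_handle_did("alice.example.com")
print("handle verified" if published == expected_did else "handle not verified")
```

Because the handle lives in DNS that the user controls, changing hosts or handles never requires the company's permission, which is the lock-in point Jake makes above.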
Corey: I am curious about this to see if it's still the case or not because I haven't checked this in over a year—and they did fix it. Okay. As of at least when we're recording this, which is the end of May 2023, Amazon's authoritative name servers are no longer half at Oracle. Good for them. They now have a bunch of Amazon-specific name servers on them instead of, you know, their competitor that they clearly despise. Good work, good work. I really want to thank you for taking the time to speak with me about how you're viewing these things and honestly giving me a chance to go ambling down memory lane. If people want to learn more about what you're up to, where's the best place for them to find you?

Jake: Yeah, so I'm on Bluesky. It's invite only. I apologize for that right now. But if you check out bsky.app, you can see how to sign up for the waitlist, and we are trying to get people on as quickly as possible.

Corey: And I will, of course, be talking to you there and will put links to that in the show notes. Thank you so much for taking the time to speak with me. I really appreciate it.

Jake: Thanks a lot, Corey. It was great.

Corey: Jake Gold, infrastructure engineer at Bluesky, slash Blue-ski. I'm Cloud Economist Corey Quinn and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice, along with an angry comment that will no doubt result in a surprise $60,000 bill after you posted.

Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.
In today's episode, Luan Moreno & Mateus Oliveira interviewed André Araújo, currently a Field Engineer, Data in Motion at Cloudera. CDP is Cloudera's enterprise data platform, built for versatility across use cases such as a streaming platform, with technologies including Apache Kafka and Apache Flink. With CSP, you get the following benefits: Apache Kafka, the market-leading data streaming storage platform; and Apache Flink, a data processing platform. In this conversation we cover: the Cloudera Data Platform; the Cloudera streaming platform. Cloudera has always been one of the most widely used platforms in the market, and the new version now supports use cases across many scenarios, such as CSP (Cloudera Stream Platform). André Araújo = LinkedIn; Cloudera = webpage; Luan Moreno = https://www.linkedin.com/in/luanmoreno/
Danica Fine is a Senior Developer Advocate at Confluent. She is a big fan of the power of data and has deep expertise in Apache Kafka. She chats with Scott about the importance of a strongly architected data platform and gives tips on when you need to move from the basics of SQL to a true data-rich environment that includes data streaming products. Head over to https://elevateai.com/hanselminutes to sign up today and get started!