Podcasts about Lightstep

shifting sr bloggers text messages career pivots developer advocate observability villela lightstep

Play Episode Listen Later Sep 19, 2024 33:49

It turns out, you don't need to step outside to observe the clouds. On this episode, we're joined by Chronosphere Field CTO Ian Smith. He and Corey delve into the innovative solutions Chronosphere offers, share insights from Ian's experience in the industry, and discuss the future of cloud-native technologies. Whether you're a seasoned cloud professional or new to the field, this conversation with Ian Smith is packed with valuable perspectives and actionable takeaways.Show Highlights:(0:00) Intro(0:42) Chronosphere sponsor read(1:53) The role of Chief of Staff at Chronosphere(2:45) Getting recognized in the Gartner Magic Quadrant(4:42) Talking about the buying process(8:26) The importance of observability(10:18) Guiding customers as a vendor(12:19) Chronosphere sponsor read(12:46) What should you do as an observability buyer(16:01) Helping orgs understand observability(19:56) Avoiding toxicly positive endorsements(24:15) Being transparent as a vendor(27:43) The myth of "winner take all"(30:02) Short term fixes vs. long term solutions(33:54) Where you can find more from Ian and ChronosphereAbout Ian SmithIan Smith is Field CTO at Chronosphere where he works across sales, marketing, engineering and product to deliver better insights and outcomes to observability teams supporting high-scale cloud-native environments. Previously, he worked with observability teams across the software industry in pre-sales roles at New Relic, Wavefront, PagerDuty and Lightstep.LinksChronosphere: https://chronosphere.io/?utm_source=duckbill-group&utm_medium=podcastIan's Twitter: https://x.com/datasmithingIan's LinkedIn: https://www.linkedin.com/in/ismith314159/SponsorChronosphere: https://chronosphere.io/?utm_source=duckbill-group&utm_medium=podcast

amazon chief cloud insider guiding aws devops vendor ian smith new relic pagerduty gartner magic quadrant chronosphere lightstep wavefront last week in aws

Career Pivots, OpenTelemetry, and Shifting to Observability 2.0 with Adriana Villela

The Cloud Gambit

Play Episode Listen Later Jul 2, 2024 41:26 Transcription Available

Send us a Text Message.Adriana Villela is a Sr. Developer Advocate at ServiceNow Cloud Observibility (formerly LightStep), Blogger, CNCF Ambassador, and host of the Geeking Out Podcast. In this conversation, we discuss Adriana's brief pivot out of tech into photography, the beauty of rock climbing, OpenTelemetry, and the shift to Observability 2.0.Where to find AdrianaPodcast: https://bento.me/geekingoutYouTube: https://youtube.com/@adrianamvillelaLinkedIn: https://www.linkedin.com/in/adrianavillela/X: https://x.com/adrianamvillelaInstagram: https://instagram.com/adrianamvillelaMastodon: https://hachyderm.io/@adrianamvillelaO'Reilly Course: https://learning.oreilly.com/videos/fundamentals-of-observability/0636920926597/Follow, Like, and Subscribe!Podcast: https://www.thecloudgambit.com/YouTube: https://www.youtube.com/@TheCloudGambitLinkedIn: https://www.linkedin.com/company/thecloudgambitTwitter: https://twitter.com/TheCloudGambitTikTok: https://www.tiktok.com/@thecloudgambit

152: Lisa Crispin

Agile FM

Play Episode Listen Later May 1, 2024 31:25

Transcript: Agile FM radio for the agile community. [00:00:04] Joe Krebs: In today's episode of Agile FM, I have Lisa Crispin with me. She can be reached at very easy to remember lisacrispin. com. Lisa is an author of a total of five books. There's three I want to highlight here or four actually is Obviously, a lot of people have talked in 2009, when the book Agile Testing came out, a practical guide for testers and Agile teams.Then following that more Agile Testing, right? Then I thought it would be most Agile Testing, but it turned into Agile Testing Condensed in 2019 and just very recently a downloadable book, Holistic Testing, a mini book. Welcome to the podcast, Lisa. [00:00:47] Lisa Crispin: Thank you for inviting me. I'm honored to be part of the podcast.You've had so many amazing people on so many episodes. So it's great. [00:00:54] Joe Krebs: Thank you. And now it's one more with you. So thank you for joining. And we will be talking a little bit about a totally different topic than maybe the last 20 episodes I had maybe way back. I did some testing topics, but I cannot recall maybe the last 20 episodes.So we're not about testing a super important topic. I would not consider myself an expert in that. And I don't know of the audience who has been listening to maybe the last 20 episodes are very familiar with agile testing. Maybe everybody has a feeling about, when they hear the word testing, but there is a huge difference between agile testing.And let's say traditional testing methods. If you just want to summarize like very brief, I know a lot of people are familiar with some of those things, but what it is, if somebody says what is agile testing, why was this different to traditional testing methods? [00:01:47] Lisa Crispin: Yeah. I think that there are a couple of big differences.One is that testing this is just a truth and not necessarily something to do with agile, but testing is really just part of software development. So many people think of it as a phase that happens after you write code, but in modern software development we're testing all the time, all the way around that whole DevOps loop, really.And and so the whole team's getting engaged in it through the whole lifecycle and the focus. Is on bug prevention rather than bug detection. Of course, we want to detect the bugs that make it out to production so we can fix them quickly. But really what we want to do is prevent those bugs from happening in the first place.So there are all these great practices that were popularized by that extreme programming and agile, things like test driven development, continuous integration, test automation all the things that go into, the planning. Workshops and things where we talk about our new features and break them into stories and what's going to be valuable to customers, having those early conversations, getting that shared understanding, things like behavior driven development, where we think about what we're going to code before we code it.That's all really different from, I guess I would say a more traditional software development approach where, Oh, we focus on these requirements. The requirements and a lot of people think about testing is just make sure it met the requirements. But there's so much more to that. We've got all these quality attributes, like security and performance and all the things that we also need to test.So it's a huge area, but it's woven into software development, just like coding, just like design, just like architecture, just like monitoring and observability. It's all part of the process. [00:03:31] Joe Krebs: Yeah. It's like a QA baked in, if you want to see it this way. And then also the automation of all that, right?So automating everything you just said is probably also a concern. Not that's necessarily new to agile, but that's a focus as well now I don't know if I don't have necessarily data points around that but I have worked with a lot of Scrum teams and Agile teams in my career.And it seems, if somebody would say what are the challenges within these teams? And one of them is, you can almost always highlight that, and I say almost purposely because there are good exceptions, is to build an increment of work once per sprint. A lot of teams do not accomplish that, and it's often related to testing activities.Why is that, in your opinion, like when we're seeing these teams struggle to put an increment of work out or a piece of the product or whatever you want to call it if you don't use Scrum necessarily, but to build something that could potentially go out. It's the quality standards of going out. What are the struggles out there for teams, especially on the testing side?I see that as you just said, like it's always happening or often happens at the end, rather than in the front. [00:04:46] Lisa Crispin: Yes. Unfortunately, I see, still see a whole lot of scrum teams and other agile teams doing a mini waterfall where they have testers on the cross functional team. But. The testers are not being involved in the whole process, and the developers aren't taking up practices like tester development, because those things are hard to learn and a lot of places don't enable.The non testers to learn testing skills because they don't put those skills into the skills matrix that those people need to advance their careers. And the places I've worked where we succeeded with this sort of whole team holistic approach, everybody had testing skills in their skills matrix.And we all had to learn from each other and, testers had other skills in their taste, matrix, like database skills and at least the ability to read code and be able to pair or ensemble with somebody. So that's part of it. And I just think. It's people don't focus enough on that, on the early process of the business stakeholder has brought us a new feature.We need to test that idea before we do anything. Is this really giving, what value, what's the purpose of the feature? What value is it going to give to the customer and to the business? And a lot of times we don't ask those questions up front. And the stakeholders don't ask themselves and then they get, you deliver the feature and it's something the customers didn't even want.[00:06:11] Joe Krebs: Lisa, we need to code. We need to get software. Why would we talk about that? Why would we not just code? I'm kidding. [00:06:18] Lisa Crispin: Yeah. Yeah. And that's also required, that's why the whole concept of self organizing team works really well. When you really let the teams be autonomous, because then they can think how best, how can we best accomplish this than they can do?Let's do some risk storing before we try to slice this into stories and let's do good practices to slice that feature into small, consistently sized stories that give us a reliable cadence predictable cadence of the business can plan and. Take those risks that we identified, get concrete examples for the business stakeholders of how this should behave and turn those into tests that guide development.Then we can automate those tests. And now we have regression tests to provide a safety net. So that all fits together. And of course, these days, we also need to put the effort into kind of the right side of the DevOps loop. We're not going to prevent all the bugs. We're not going to know about all the unknown unknowns, no matter how hard we try.And. These cloud architectures are very complex. Our test environments never look like production, so there's always something unexpected that happens. And so we have to really do a good job of the telemetry for our code, gathering all the data, all the events, all the logging data for monitoring. For alerting and also for observability, if something happens that we didn't anticipate, so it wasn't on our dashboard.We didn't have an alert for it. We need to be able to quickly diagnose that problem and know what to do. And if we didn't have. Enough telemetry for diagnosing that problem without having to, Oh, we've got to go back and add more logging and redeploy to production so we could figure it out. Oh, how many times has my team done that?That's all part of it. And then learning from production using those. And we've got fantastic analytics tools these days. Learning from those and what are the customers do? What was most valuable to them? What did they do when they, especially I mostly have worked on web applications.What did they do again? We released this new feature in the UI. How did they use it? And it's, we can learn, we know that stuff now. So that feeds back into what changes should we make next? [00:08:29] Joe Krebs: All right. So it's, it comes full circle, right? What's interesting is there's this company, it's all over the news.It's Boeing, right? We're recording this in 2024 quality related issues. Now, that is an extreme example, obviously, but. We do have these kind of aha and wake up moments in software development too, right? So that we're shipping products and I remember times where testing, I purposely call it testing and not QA, testing personnel was outsourced.That was like many years ago. We actually felt oh, this activity can be outsourced somewhere else. And you just made a point of if we have self organizing teams, And we're starting with it and we're feeding in at the end of a loop back into the development efforts, how important that is and how we treated these activities in the past and how, what we thought of it is, it's shocking now looking back in 24, isn't it?[00:09:23] Lisa Crispin: Yeah, it's just, it just became so much part of our lives to run into that. And the inevitable happened, it generally didn't work very well. I've actually known somebody who led an outsourcing test team in India and was working with companies in the UK and Europe.They actually were able to take an agile approach and keep the testers involved through the whole loop. They had to work really hard to do that. And there were a lot of good practices they embraced to make that work. But you have to be very conscious. And and both sides have to be willing to do that extra work.[00:09:56] Joe Krebs: You just mentioned that there were some really cool analytics tools. I don't know if you want to share any of those because you seem very excited about this, [00:10:05] Lisa Crispin: the most, the one that I found the most useful and I, a couple of different places I worked at used it.It's called full story. And it actually. It captures all the events that are happening in the user interface and plays it back for you as a screencast. Now, it does block anything they type in. It keeps it anonymized. But you can see the cursor. And I can remember one time a team I was on, it's we put a whole new page in our UI, a new feature.We thought people would really love it. And we worked really hard on it and we, we tried to do a minimum viable version of it, but we still put some effort in it and we put it out there. And then we looked at the analytics and full story and we could see that people got to the page. Their cursor moved around and then they navigated off the page.So either it wasn't clear what that page was for, or they just couldn't figure it out. So that was really valuable. I was like, okay, can we come up with a new design for this page? If we think that's what the problem is, or should we just, okay, that was a good. Good learning opportunity. But as a tester, especially there, because we can't reproduce problems, we know there's a problem in production, can't reproduce it.But if we go and watch a session where somebody had the problem, and there are other things, mixed panel, won't play it back for you, but you can see every step that the person did. And even observability tools Honeycomb and LightStep can show you like the whole, they can trace the whole path of what did the user do.And that really helps us not only understand the production problem, but, Oh, there's a whole scenario. We didn't even think about testing that. And so there's so much we can learn because we're so bound by our cognitive bias, our unconscious biases that we know how we wanted it to work.[00:11:54] Joe Krebs: Yeah. [00:11:55] Lisa Crispin: And it's really hard to think outside the box and get away from your biases and really approach it like a customer who never saw it before would do. [00:12:03] Joe Krebs: Yeah. It's this is the typical thing, right? If a software engineer demonstrates their own software they produce and was like eight books on my machine, I'm sure you have heard that.And it's it's obvious that you would do this, right? And it's just not necessarily obvious for somebody else. But if you're like sitting in front of a screen developing something for a long time, it just becomes natural that you would be working like this. I myself have engineered software and and fell into that trap, right?It's oh my God, eye opening event. If somebody else looks at you or. Yeah, [00:12:33] Lisa Crispin: Even when you sometimes have different people, like I can remember an occasion that Timo was on with a, again, a web application and I was just changed in the UI, just adding something in the UI and I tested it. My, my manager tested it.One of the product owner tested it. And we all thought it looked great and it did look great. We didn't notice the other thing we had broken on the screen until we put it in the production and customers were like, Hey, I really do think things like pair programming, pair testing, ensemble, working in ensembles for both programming and testing, doing all the work together.Getting those diversity points does help hugely with that. My theory is we all have different unconscious biases. So maybe if we're all together, somebody will notice a problem. I don't have any science to back that up, but But that's why those kind of practices are especially important. [00:13:28] Joe Krebs: Yeah. [00:13:28] Lisa Crispin: To catch as many things as we can.[00:13:30] Joe Krebs: Yeah. So we both didn't have any kind of science to back this up, but let's talk a little bit about science. Okay. Because metrics, data points, evidence. What are some of the KPIs if somebody listens to this and says Oh, that sounds interesting. And we definitely have shortcomings on testing activities within Agile teams.Obviously there's the traditional way of testing. They're using very different data points. I have used some in the past, and I just want to verify those with you too. It's that's even useful and still up to date. What would be some good KPIs when somebody approaches you and says that's got to have that on your dashboard?[00:14:08] Lisa Crispin: I think you, I actually think one of my favorite metrics to use is cycle time, although that encompasses so many things, but just watching trends and cycle time. And if you're, if you've got, for example, if you've got good test coverage with your automated regression tests, you're going to be able to make changes really quickly and confidently.And if you have, a good deployment pipeline, you're going to Again, there's a lot of testing that goes into making sure your infrastructure is good and your pipeline is performing as it should, because it's all code to that reflects a whole lot of things. It's hard to isolate one thing in your cycle time but what counts is, how consistent are we at being able to frequently deliver small changes?So I think that's an important one. And in terms of. Did we catch all the problems? I think it gets really dangerous to do things like, Oh, let's count how many bugs got in production because all measures can be gained, but that's a really easy one to gain. But things like how many times did we have to roll back or revert a change in product in production?Because there was something we didn't catch and hopefully we detected that ourselves with an alert or with monitoring before the customers saw it. And now that we have so many release strategies where we can do, Canary releases or blue green deploy so that we can do testing and production safely.But still how many times did we have to roll back? How many times did we get to that point and realize we didn't catch everything? That can be a good, that can be a good thing to track. And depending on what caused it. If we had to, if we had a production failure because, somebody pulled the plug of the server out of the wall.That's not, that's just something that happened, but if it is something that the team's process failed in some way, we want to know about that. We want to improve it. And, just how frequently can we deploy I think, the thing with continuous delivery, so many teams are practicing that are trying to practice that you're not going to succeed at that if you're.If you're not preventing defects , and if you don't have good test automation, good automation the whole way through. [00:16:08] Joe Krebs: Yeah. [00:16:08] Lisa Crispin: And I think, deployment frequency, that's another one of the Dora key metrics. Yeah. That's a real that we know that correlates with high performing teams.And of course we shouldn't ignore, how do people feel are people burned out or do they feel happy about their jobs? That's a little harder metric to get. I was on a team, my last full time job, we really focused on cycle time as a metric and we didn't really have that many problems in production.So we didn't really bother to track how many times we had to revert because we were doing a good job, but. But but, how frequently were we going? What was our cycle time? But also we did a little developer joy survey. So once a week, we sent out a little 5 question survey based on Amy Edmondson's work.And now I would base it on I would also use Nicole Forsgren's space survey. Model, but that was just a little before that came out, but just asking just a few questions and multiple, from one to five, how did you feel about this? And it was really interesting because over time, if cycle time was longer, developer joy was down.So there's something happening here that people are not happy. And. Something's going wrong. That's affecting our cycle time. And then the reverse is true. When our cycle time was shorter, joy went up. So I think it's I think it's important and, you don't have to get real fancy with your measurements, just start just, I think you should first focus on what are we trying to improve and then find a metric to guide, to measure that.[00:17:41] Joe Krebs: I'm glad you said or mentioned code coverage. That's one of one of those I mentioned earlier. I've been working with it quite a bit and cycle time. Um, very powerful stuff. Now, with you, such, somebody who has written published about agile testing extensively we are in 2024. There was the years ahead.There are agile conferences. There is a lot going on. What are the trends you currently see in the testing world? What is what's happening right now? What do you think is influencing maybe tomorrow? The days coming, I know you have holistic testing yourself. So maybe that is one but I just want to see, what do you see happening in the agile testing? [00:18:24] Lisa Crispin: Oh, just all of software development, definitely evolving. I think one of the things is that we're starting to be more realistic and realize that executives don't care about testing. They care about how well does their product sell?How much money is the company making? We know that. Product quality and process product quality obviously affects that. And that's from a customer point of view. It's the customer who defines quality. And, back in the nineties, we testers thought we were defining quality. So that's a thing, a change that's occurred over time and really thinking about that and also knowing that our process quality has a huge impact on product quality and what are our, are the, What are practices?What are the most important practices we can be doing? Janet Gregory and who is my coauthor on four of those books and Selena Delesie they've been consultants for years and helped so many huge, even big companies through an agile transformation. And they've distilled their magic into their, what they call a quality practices assessment model.And they identified 10 quality practices that they feel are the most important and things like feedback loops. Things like communication, right? And the model helps you ask questions and see where is the team in these different different aspects of practices that would make them have a quality process, which would help them have a quality product.And it gives teams a kind of a roadmap. It's here's where we are now. What do we need to improve? Oh, we really need to get the continuous delivery and these things are on our way, things like that. So I think that's one realization that it ties back to the idea that testing is just part of software development and we've had for years.So like, how can I make the, president of this company understand that we testers are so important. We're not, but it's important that the team build that quality. [00:20:29] Joe Krebs: But you could also argue that maybe a CEO of a company or the leadership team would say, we also don't care if this is being expressed in one line of code or two lines of code.So it's not necessarily to testing. I think they're just saying we have, here's our product. But I think what has changed is that your competition is just one mouse click away. Yeah. Quality is a determining factor. Now, let's take this hypothetical CEO out there right now listening to our conversation and saying I do want to start to embrace agile testing and agile in general, but more of those things you just mentioned, what would be a good starting point for them?Obviously there's a lot of information right now keywords and buzzwords we just shared today. What would be a good starting point for them to start that journey, because that is obviously not something that's coming overnight. [00:21:20] Lisa Crispin: I think one of the most important things that leadership can do is to make, to enable the whole team to learn testing skills that will help them build quality.And that means making it part of their job description, making it part of their skills matrix for career advancement, because that gives them time. If developers are paid to write lines of code, that's what they're going to do. But if they're, it's okay, you're an autonomous team.You decide what practice you think will work for you. We're going to support you. It's going to, it's going to slow things down at first. Okay, like I was on a team in 2003 that was given this mission. Okay. Do what you think you need to do first. We decided what level of quality we wanted, of course.We wanted to write code that we would take home and show our moms and put it on our refrigerators, and and we all commit to that level of quality. How can we achieve that? We're seeing that test driven development has worked really well for a lot of teams. So let's do test driven development, which is really.Not that easy to learn, but when you have a leadership that lets you have that time to learn and support you, it pays off in the long run because eventually you're a lot more confident. You've got all these regression tests. You can go faster and things like continuous integration, refactoring, all these practices that we know are good.And we were empowered to adopt those. It was all part of all of our job descriptions. And that's, so we became a high performing team, not overnight. Yeah, within a few years and a part of our, part of what we did was spend a lot of time learning the business domain. It's a very complicated business domain.And so when the stakeholders came and said we want this feature and we asked them what, why do they want it? What was it supposed to do? What was the value? We could usually cut out half of what they thought they wanted. We can say, okay, if we did all of this, we think it's going to be this much effort, but we could do 80 percent of it for half the cost.How's that? Oh yeah. Oh yeah. Nobody ever turned us down on that one. So that's another way where you go fast, we eliminate things that customers don't want or need. And so yeah, it's the unicorn magic of a self true self organizing team. [00:23:30] Joe Krebs: Yeah. But I do think what you said is, , this one thing that just stood out to me It is an investment, it is an investment into the future.It's a really good feeling to have later on the capability of releasing software whenever you want. If that is not becoming a massive burden and the whole company needs to come together for all nighters to get a piece of software out of the door. Now you're not only an author here you're also a practitioner.You also work with teams and I just want to come back to that business case of agile testing. One more time. Do you have an example from a client recent or further back where you would say that stands out or that's an easy one? You remember where agile testing made a huge difference for an organization.I'm sure there are tons you have where you would say there was a significant impact for them based on introducing agile testing practices. [00:24:29] Lisa Crispin: I certainly, especially early on in the extreme programming and Agile adoption there was a few occasions where I joined a team that never had testers.They were doing the extreme programming practices and you may recall that the original extreme programming publications did not mention testers. They were all about testing and quality, but they didn't mention testers. And. So these teams were doing test driven development, continuous integration. They were writing really good code and then they were doing, they were doing their two week sprints and maybe, maybe it took them three sprints to develop what the customer wanted and then they give it to the customer and the customer is but that's not what I wanted.So they like, maybe we need a tester. So then they hired me. And I was like, oh okay, let's let's have some of the, some, okay, we're gonna do a new, some new features. Let's have some brainstorming sessions. How are we gonna, what is this new feature for? How are we gonna implement it?What are the risks? And start doing risk assessments. And how are we gonna mitigate those risks? Are we gonna do it through testing? Are we gonna do it through monitoring? And just asking those what if questions? What's the worst thing that could happen? That's my favorite question when we release this.And could we deploy this feature to production and have it not solve the customer problem? And just add, anyone could ask those questions. It doesn't have to be a tester, but I find the teams that don't have professional testers, specialists, they, nobody else thinks of those questions. They could. But they just, testing is a big area.It is a big set of skills. And anybody on that team, I know lots of developers who have those skills, but not every team has a developer like that, other specialists, like business analysts could also help, but there were even fewer business analysts back in the day than there were testers.And as soon, so as soon as the tester, and when I, one team I joined early on, okay, they're like, okay, Lisa you can be our tester. But you can't come to the planning meetings and you can't come to the standups. That's a little weird. I did as best I could without being involved in any of the planning.And so that's the end of the two weeks. They weren't finished. Nothing was really working. And I said, Oh, Hey, can we try it my way? Let me be involved in those early planning discussions. Let me be part of the standup and Oh, amazing. Next time we met our target. And and I was I couldn't support all the, there were 30 developers and one tester, but we agreed that one other person or two other people would wear the testing hat along with me every sprint or at least on a daily basis.And so they all started to get those testing skills. Yeah, they just test, like I say, testing is a big area and you don't know what you don't know. I see teams even today. That they don't have any testers because years ago they were told they didn't need them if they did these extreme programming practices and they're doing testers involvement, they're doing continuous integration.They're maybe even doing a little exploratory testing. They're doing pair programming, even some ensemble or mob programming. They're doing great stuff, but they're missing out all that stuff at the beginning to get the shared understanding with the stakeholders of what to build. [00:27:43] Joe Krebs: All those lines of code that were needed. Wouldn't need to be tested. [00:27:48] Lisa Crispin: And so they release the feature and bugs come in. And they're really, they're missing features. It's not what the customer needed. Too many gaps. . And of course, I want to say those aren't really bugs. But they're bad. Yeah. And if you'd had a risk storming session, if you had planning sessions where you got.Example mapping sessions, for example, where you got the business rules for the story, concrete examples for his business role, and then you turn those into tests to guide your development with behavior driven development. This would have solved your problem, but they didn't know to do that. Anybody could have learned those things, but we can't all know everything.[00:28:25] Joe Krebs: Yeah. We're almost out of time.But there's one question I wanted to ask you and it might be a short answer. I hope you condense it a little bit. But when somebody gets on your LinkedIn page Lisa Crispin there is you in a picture plus a donkey. And you have donkeys yourself.And how does this relate to, does it relate to your work? What do you find inspirational about donkeys? And what, why is, why did you even make your LinkedIn profile? It has to be, it has to be a story around it.[00:28:55] Lisa Crispin: It's interesting. A few years ago at European testing conference, we had an open space and somebody said, Oh, let's have an open space session on Lisa's donkeys.And then we got to talking about this and I actually have learned a lot. About Agile for my donkeys. And I think the biggest thing is trust. So donkeys are work on trust. So with horses, I've ridden horses all my life and had horses all my life as well. You can bribe or bully a horse into doing something, they're just, they're different.If you reward them enough, okay they'll go along with you. If you kick them hard enough, maybe they'll go. Donkeys are not that way. They're looking out for number one. They're looking out for their own safety. And if they think you might be getting them into a situation that's bad for them, they just flat won't do it.So that's how they get the reputation of being stubborn. You could beat a bloody, you could offer them any bribe you want. They're not doing it. And so I learned I had to earn my donkey's trust. That's so true of teams. We all have to trust each other. And when we don't trust each other. We can't make progress and the teams I've been on that have been high performing teams We had that trust so we could have discussions where we had different opinions We could express our opinions without anyone taking it personally Because we knew that we were all in it together and it was okay Anybody could feel safe To ask a question, anybody can feel safe to fail, but you have that trust that there's nothing bad is going to happen.And so I could bring my donkey right in that door in the house. I've taken them in schools. I've taken them to senior centers because they trust me. And if I did anything, if they came to harm while in my care, if I, let's say I was driving the cart and the collar rubbed a big sore on them, that would destroy the trust.And it would be really hard to build it back. And so we always need to be conscious of how we're treating each other in our software teams. [00:30:55] Joe Krebs: Yeah, wonderful. I did hear about the rumor of being stubborn. But I also always knew that donkeys are hardworking animals. [00:31:02] Lisa Crispin: They love to work hard.Yeah. [00:31:05] Joe Krebs: Awesome. Lisa, what a great ending. I'm glad we had time to even touch on that. That was a great insight. Thank you so much for all your insights around testing, but also at the end about donkeys. Thank you so much, Lisa. [00:31:17] Lisa Crispin: Oh, it's my pleasure.

193: Operational Elixir: Observing the Midsize Madness

Thinking Elixir Podcast

Play Episode Listen Later Mar 12, 2024 55:00

In this engaging third episode of our series, Dave Lucia returns to delve into the various systems that support small and medium-sized teams and companies for their Elixir systems. Dave shares insights gained from a range of situations including working at startups on up to Series C and D sized companies, with a particular focus on the critical role of observability tools. Drawing on his extensive experience, Dave discusses how these tools can greatly enhance a team's ability to monitor and troubleshoot applications, ensuring high performance and reliability. Tune in for a comprehensive look at the essential systems and tools that can make a tangible difference in the day-to-day operations of Elixir-powered organizations, and more! Show Notes online - http://podcast.thinkingelixir.com/193 (http://podcast.thinkingelixir.com/193) Elixir Community News - https://twitter.com/josevalim/status/1762921819776934146 (https://twitter.com/josevalim/status/1762921819776934146?utm_source=thinkingelixir&utm_medium=shownotes) – José Valim has teased a new feature for Elixir 1.17 which may include a mix test flag --breakpoints for debugging failed tests. - https://gleam.run/news/gleam-version-1/ (https://gleam.run/news/gleam-version-1/?utm_source=thinkingelixir&utm_medium=shownotes) – Gleam v1.0 has been released, marking the language's stability and readiness for production with a commitment to maintain backwards compatibility. - https://github.com/underjord/entrace (https://github.com/underjord/entrace?utm_source=thinkingelixir&utm_medium=shownotes) – Lars Wikman shared his work on the Entrace tracing project, offering easier tracing support to applications. - https://github.com/underjord/entracelivedashboard (https://github.com/underjord/entrace_live_dashboard?utm_source=thinkingelixir&utm_medium=shownotes) – Entrac LiveDashboard was announced by Lars Wikman to add a tracing page to the LiveDashboard plugin. - https://docs.google.com/forms/d/e/1FAIpQLSeGxJUadP1CaaU6EnTwe7Hv76RnBLIiqT6SJLIBvncHcEzGRg/viewform (https://docs.google.com/forms/d/e/1FAIpQLSeGxJUadP1CaaU6EnTwe7Hv76RnBLIiqT6SJLIBvncHcEzGRg/viewform?utm_source=thinkingelixir&utm_medium=shownotes) – The Call for Proposals for talks at ElixirConfUS is open, including information to book hotel rooms for the event taking place from August 28-30, 2024. Do you have some Elixir news to share? Tell us at @ThinkingElixir (https://twitter.com/ThinkingElixir) or email at show@thinkingelixir.com (mailto:show@thinkingelixir.com) Discussion Resources - https://podcast.thinkingelixir.com/75 (https://podcast.thinkingelixir.com/75?utm_source=thinkingelixir&utm_medium=shownotes) – Previous interview with Dave Lucia talking about RabbitMQ and Commanded - https://podcast.thinkingelixir.com/97 (https://podcast.thinkingelixir.com/97?utm_source=thinkingelixir&utm_medium=shownotes) – Previous interview with Dave Lucia talking about Avro and Elixir - https://podcast.thinkingelixir.com/129 (https://podcast.thinkingelixir.com/129?utm_source=thinkingelixir&utm_medium=shownotes) – Previous interview with Dave Lucia talking about Time series data with Timescale DB - https://Sentry.io (https://Sentry.io?utm_source=thinkingelixir&utm_medium=shownotes) - https://www.servicenow.com/products/observability.html (https://www.servicenow.com/products/observability.html?utm_source=thinkingelixir&utm_medium=shownotes) – Observability tool formerly knowns as LightStep - https://www.honeycomb.io/ (https://www.honeycomb.io/?utm_source=thinkingelixir&utm_medium=shownotes) - https://opentelemetry.io/docs/collector/ (https://opentelemetry.io/docs/collector/?utm_source=thinkingelixir&utm_medium=shownotes) - https://github.com/open-telemetry (https://github.com/open-telemetry?utm_source=thinkingelixir&utm_medium=shownotes) - https://opentelemetry.io/docs/concepts/signals/traces/ (https://opentelemetry.io/docs/concepts/signals/traces/?utm_source=thinkingelixir&utm_medium=shownotes) - https://hex.pm/packages/opentelemetry (https://hex.pm/packages/opentelemetry?utm_source=thinkingelixir&utm_medium=shownotes) - https://hex.pm/packages/opentelemetry_exporter (https://hex.pm/packages/opentelemetry_exporter?utm_source=thinkingelixir&utm_medium=shownotes) - https://davelucia.com/ (https://davelucia.com/?utm_source=thinkingelixir&utm_medium=shownotes) – Dave's personal blog - https://github.com/prometheus/prometheus (https://github.com/prometheus/prometheus?utm_source=thinkingelixir&utm_medium=shownotes) - https://grafana.com/ (https://grafana.com/?utm_source=thinkingelixir&utm_medium=shownotes) - https://grafana.com/docs/loki/latest/send-data/promtail/ (https://grafana.com/docs/loki/latest/send-data/promtail/?utm_source=thinkingelixir&utm_medium=shownotes) - https://fly.io/docs/reference/metrics/ (https://fly.io/docs/reference/metrics/?utm_source=thinkingelixir&utm_medium=shownotes) - https://isburmistrov.substack.com/p/all-you-need-is-wide-events-not-metrics (https://isburmistrov.substack.com/p/all-you-need-is-wide-events-not-metrics?utm_source=thinkingelixir&utm_medium=shownotes) - https://amplitude.com/ (https://amplitude.com/?utm_source=thinkingelixir&utm_medium=shownotes) - Custom LiveView admin pages for dashboards - https://postmarkapp.com/ (https://postmarkapp.com/?utm_source=thinkingelixir&utm_medium=shownotes) - https://sendgrid.com/en-us (https://sendgrid.com/en-us?utm_source=thinkingelixir&utm_medium=shownotes) - https://milkroad.com/ (https://milkroad.com/?utm_source=thinkingelixir&utm_medium=shownotes) – A newsletter company that Dave Lucia worked at. - https://www.beehiiv.com/ (https://www.beehiiv.com/?utm_source=thinkingelixir&utm_medium=shownotes) – Newsletters as a service company - https://ahrefs.com/ (https://ahrefs.com/?utm_source=thinkingelixir&utm_medium=shownotes) – Tracking the back links from other sites to yours for SEO - https://search.google.com/search-console/about (https://search.google.com/search-console/about?utm_source=thinkingelixir&utm_medium=shownotes) - https://github.com/dbernheisel/phoenix_seo (https://github.com/dbernheisel/phoenix_seo?utm_source=thinkingelixir&utm_medium=shownotes) - https://tvlabs.ai/ (https://tvlabs.ai/?utm_source=thinkingelixir&utm_medium=shownotes) – Where Dave Lucia is working now. Guest Information - https://twitter.com/davydog187 (https://twitter.com/davydog187?utm_source=thinkingelixir&utm_medium=shownotes) – on Twitter - https://github.com/davydog187/ (https://github.com/davydog187/?utm_source=thinkingelixir&utm_medium=shownotes) – on Github - https://davelucia.com (https://davelucia.com?utm_source=thinkingelixir&utm_medium=shownotes) – Blog - https://tvlabs.ai (https://tvlabs.ai?utm_source=thinkingelixir&utm_medium=shownotes) – TVLabs company where he works now. Find us online - Message the show - @ThinkingElixir (https://twitter.com/ThinkingElixir) - Message the show on Fediverse - @ThinkingElixir@genserver.social (https://genserver.social/ThinkingElixir) - Email the show - show@thinkingelixir.com (mailto:show@thinkingelixir.com) - Mark Ericksen - @brainlid (https://twitter.com/brainlid) - Mark Ericksen on Fediverse - @brainlid@genserver.social (https://genserver.social/brainlid) - David Bernheisel - @bernheisel (https://twitter.com/bernheisel) - David Bernheisel on Fediverse - @dbern@genserver.social (https://genserver.social/dbern) - Cade Ward - @cadebward (https://twitter.com/cadebward) - Cade Ward on Fediverse - @cadebward@genserver.social (https://genserver.social/cadebward)

Jacob Aronoff - At Least One Person Who Cares To See It Through

Maintainable

Play Episode Listen Later Nov 21, 2023 46:36

Robby has a chat with Staff Software Engineer at Lightstep from ServiceNow, Jacob Aronoff. Their conversation delves into the vital signs of a thriving open source software project. They unpack the characteristics of well-maintained open source endeavors, emphasizing the importance of a passionate community behind the project, rather than misleading indicators like GitHub stars. They discuss the nuances of evaluating a project's health through performance metrics, suggesting that a more holistic view that includes the scrutiny of open issues can provide better insights into the project's robustness and responsiveness to community needs. Furthermore, their discussion highlights a critical, yet often overlooked, aspect of open source software: the project's own dependencies. Jacob argues that understanding these dependencies is crucial before adopting an open source solution, as it could have far-reaching implications on the stability and security of one's own project. They also take a deep dive into the organizational dynamics of the OpenTelemetry community, examining its structured approach to scaling and sustaining the project over time. Their discussion then transitions into the philosophical debate of balancing between the extremes of premature abstraction and delivering a fully opinionated software project. Jacob shares his penchant for “building in the open”, advocating for transparency and community involvement in the development process. He provides valuable advice for both newcomers looking to contribute to open source projects and maintainers seeking to attract new talent. In a personal touch, he extends his gratitude to Robby for creating Oh My Zsh, sharing his own journey in developing a custom theme for it. Moreover, Jacob expresses his preference for pure functional languages, hinting at the broader discussion around programming paradigms and their influence on open source software development. Stay tuned for that and more!Book Recommendations:Killers of the Flower Moon By David Grann and The Hitchhiker's Guide to the Galaxy By Douglas AdamsHelpful Links:Jacob on LinkedInLightstep from ServiceNowJaronoff97 on GitHubJacob's WebsiteJacob on TwitterSubscribe to Maintainable on:Apple PodcastsOvercastSpotifyOr search "Maintainable" wherever you stream your podcasts.Keep up to date with the Maintainable Podcast by joining the newsletter.Thanks to Our Sponsor!Turn hours of debugging into just minutes! AppSignal is a performance monitoring and error tracking tool designed for Ruby, Elixir, Python, Node.js, Javascript, and soon, other frameworks. It offers six powerful features with one simple interface, providing developers with real-time insights into the performance and health of web applications. Keep your coding cool and error-free, one line at a time! Check them out!

Building a Strong Company Culture at Honeycomb with Mike Goldsmith

making sense observability villela lightstep

Play Episode Listen Later Nov 9, 2023 32:31

Mike Goldsmith, Staff Software Engineer at Honeycomb, joins Corey on Screaming in the Cloud to talk about Open Telemetry, company culture, and the pros and cons of Go vs. .NET. Corey and Mike discuss why OTel is such an important tool, while pointing out its double-edged sword of being fully open-source and community-driven. Opening up about Honeycomb's company culture and how to find a work-life balance as a fully-remote employee, Mike points out how core-values and social interaction breathe life into a company like Honeycomb.About MikeMike is an OpenSource focused software engineer that builds tools to help users create, shape and deliver system & application telemetry. Mike contributes to a number of OpenTelemetry initiatives including being a maintainer for Go Auto instrumentation agent, Go proto packages and an emeritus .NET SDK maintainer..Links Referenced: Honeycomb: https://www.honeycomb.io/ Twitter: https://twitter.com/Mike_Goldsmith Honeycomb blog: https://www.honeycomb.io/blog LinkedIn: https://www.linkedin.com/in/mikegoldsmith/ TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. This promoted guest episode is brought to us by our friends at Honeycomb who I just love talking to. And we've gotten to talk to these folks a bunch of different times in a bunch of different ways. They've been a recurring sponsor of this show and my other media nonsense, they've been a reference customer for our consulting work at The Duckbill Group a couple of times now, and we just love working with them just because every time we do we learn something from it. I imagine today is going to be no exception. My guest is Mike Goldsmith, who's a staff software engineer over at Honeycomb. Mike, welcome to the show.Mike: Hello. Thank you for having me on the show today.Corey: So, I have been familiar with Honeycomb for a long time. And I'm still trying to break myself out of the misapprehension that, oh, they're a small, scrappy, 12-person company. You are very much not that anymore. So, we've gotten to a point now where I definitely have to ask the question: what part of the observability universe that Honeycomb encompasses do you focus on?Mike: For myself, I'm very focused on the telemetry side, so the place where I work on the tools that customers deploy in their own infrastructure to collect all of that useful data and make—that we can then send on to Honeycomb to make use of and help identify where the problems are, where things are changing, how we can best serve that data.Corey: You've been, I guess on some level, there's—I'm trying to make this not sound like an accusation, but I don't know if we can necessarily avoid that—you have been heavily involved in OpenTelemetry for a while, both professionally, as well as an open-source contributor in your free time because apparently you also don't know how to walk away from work when the workday is done. So, let's talk about that a little bit because I have a number of questions. Starting at the very beginning, for those who have not gone trekking through that particular part of the wilderness-slash-swamp, what is OpenTelemetry?Mike: So, OpenTelemetry is a vendor-agnostic set of tools that allow anybody to collect data about their system and then send it to a target back-end to make use of that data. The data, the visualization tools, and the tools that make use of that data are a variety of different things, so whether it's tracing data or metrics or logs, and then it's trying to take value from that. The big thing what OpenTelemetry is aimed at doing is making the collection of the data and the transit of the data to wherever you want to send it a community-owned resource, so it's not like you get vendor lock-in by going to using one competitor and then go to a different—you want to go and try a different tool and you've got to re-instrument or change your application heavily to make use of that. OpenTelemetry abstracts all that away, so all you need to know about is what you're instrumented with, what [unintelligible 00:03:22] can make of that data, and then you can send it to one or multiple different tools to make use of that data. So, you can even compare some tools side-by-side if you wanted to.Corey: So, given that it's an open format, from the customer side of the world, this sounds awesome. Is it envisioned that this is something—an instrument that gets instrumented at the application itself or once I send it to another observability vendor, is it envisioned that okay, if I send this data to Honeycomb, I can then instrument what Honeycomb sees about that and then send that onward somewhere else, maybe my ancient rsyslog server, maybe a different observability vendor that has a different emphasis. Like, how is it envisioned unfolding within the ecosystem? Like, in other words, can I build a giant ring of these things that just keep building an infinitely expensive loop?Mike: Yeah. So ideally, you would try and try to pick one or a few tools that will provide the most value that you can send to, and then it could answer all of the questions for you. So, at Honeycomb, we try to—we are primarily focused on tracing because we want to do application-level information to say, this user had this interaction, this is the context of what happened, these are the things that they clicked on, this is the information that flowed through your back-end system, this is the line-item order that was generated, the email content, all of those things all linked together so we know that person did this thing, it took this amount of time, and then over a longer period of time, from the analytics point of view, you can then say, “These are the most popular things that people are doing. This is typically how long it takes.” And then we can highlight outliers to say, “Okay, this person is having an issue.” This individual person, we can identify them and say, “This is an issue. This is what's different about what they're doing.”So, that's quite a unique tracing tool or opportunity there. So, that lets you really drive what's happening rather than what has happened. So, logs and metrics are very backward-looking to say, “This is the thing that this thing happened,” and tries to give you the context about it. Tracing tries to give you that extra layer of context to say that this thing happened and it had all of these things related to it, and why is it interesting?Corey: It's odd to me that vendors would be putting as much energy into OpenTelemetry—or OTel, as it seems to always be abbreviated as when I encounter it, so I'm using the term just so people, “Oh, wait, that's that thing I keep seeing. What is that?” Great—but it seems odd to me that vendors would be as embracing of that technology as they have been, just because historically, I remember whenever I had an application when I was using production in anger—which honestly, ‘anger' is a great name for the production environment—whenever I was trying to instrument things, it was okay, you'd have to grab this APM tools library and instrument there, and then something else as well, and you wound up with an order of operations where which one wrapped the other. And sometimes that caused problems. And of course, changing vendors meant you had to go and redeploy your entire application with different instrumentation and hope nothing broke. There was a lock-in story that was great for the incumbents back when that was state of the art. But even some of those incumbents are now embracing OTel. Why?Mike: I think it's because it's showing that there's such a diverse group of tools there, and [unintelligible 00:06:32] being the one that you've selected a number of years ago and then they could hold on to that. The momentum slowed because they were able to move at a slower pace because they were the organizations that allowed us—they were the de facto tooling. And then once new companies and competitors came around and we're open to trying to get a part of that market share, it's given the opportunity to then really pick the tool that is right for the job, rather than just the best than what is perceived to be the best tool because they're the largest one or the ones that most people are using. OpenTelemetry allows you to make an organization and a tool that's providing those tools focus on being the best at it, rather than just the biggest one.Corey: That is, I think, a more enlightened perspective than frankly, I expect a number of companies out there to have taken, just because it seems like lock-in seems to be the order of the day for an awful lot of companies. Like, “Okay, why are customers going to stay with us?” “Because we make it hard to leave,” is… I can understand the incentive, but that only works for so long if you're not actively solving a problem that customers have. One of the challenges that I ran into, even with OTel, was back when I was last trying to instrument a distributed application—which was built entirely on Lambda—is the fact that I was doing this for an application that was built entirely on Lambda. And it felt like the right answer was to, oh, just use an OTel layer—a Lambda layer that wound up providing the functionality you cared about.But every vendor seemed to have their own. Honeycomb had one, Lightstep had one, AWS had one, and now it's oh, dear, this is just the next evolution of that specific agent problem. How did that play out? Is that still the way it works? Is there other good reasons for this? Or is this just people trying to slap a logo on things?Mike: Yeah, so being a fully open-source project and a community-driven project is a double-edged sword in some ways. One it gives the opportunity for everybody to participate, everybody can move between tools a lot easier and you can try and find the best fit for you. The unfortunate part around open-source-driven projects like that means that it's extremely configuration-heavy because it can do anything; it's not opinionated, which means that if you want to have the opportunity to do everything, every possible use case is available to everyone all of the time. So, if you might have a very narrow use case, say, “I want to learn about this bit of information,” like, “I'm working with the [unintelligible 00:09:00] SDK. I want to talk about—I've got an [unintelligible 00:09:03] application and I want to collect data that's running in a Lambda layer.” The OpenTelemetry SDK that has to serve all of the [other 00:09:10] JavaScript projects across all the different instrumentations, possibly talking about auto-instrumentation, possibly talking about lots of other tools that can be built into that project, it just leads to a very highly configurable but very complicated tool.So, what the vendor specifics, what you've suggested there around like Honeycomb, or other organizations providing the layers, they're trying to simplify the usage of the SDK to make some of those assumptions for you that you are going to be sending telemetry to Honeycomb, you are going to be talking about an API key that is going to be in a particular format, it is easier to pass that information into the SDK so it knows how to communicate rather than—as well as where it's going to communicate that data to.Corey: There's a common story that I tend to find myself smacking into almost against my will, where I have found myself at the perfect intersection of a variety of different challenges, and for some reason, I have stumbled blindly and through no ill intent into ‘this is terrible' territory. I wound to finally getting blocked and getting distracted by something else shiny on this project about two years ago because the problem I was getting into was, okay, I got to start sending traces to various places and that was awesome, but now I wanted to annotate each span with a user identity that could be derived from code, and the way that it interfaced with the various Lambda layers at that point in time was, ooh, that's not going to be great. And I think there were a couple of GitHub issues opened on it as feature enhancements for a couple of layers. And then I, again, was still distracted by shiny things and never went back around to it. But I was left with the distinct impression that building something purely out of Lambda functions—and also probably popsicle sticks—is something of an edge case. Is there a particular software architecture or infrastructure architecture that OTel favors?Mike: I don't think it favors any in particular, but it definitely suffers because it's, as I said earlier, it's trying to do that avail—the single SDK is available to many different use cases, which has its own challenges because then it has to deal with so many different options. But I don't think OpenTelemetry has a specific, like, use case in mind. It's definitely focused on, like—sorry, telemetry tracing—tracing is focused on application telemetry. So, it's focused on about your code that you build yourself and then deploy. There are other tools that can collect operational data, things like the OpenTelemetry Collector is then available to sit outside of that process and say, what's going on in my system?But yeah, I wouldn't say that there's a specific infrastructure that it's aimed at doing. A lot of the cloud operators and tools are trying to make sure that that information is available and OpenTelemetry SDKs are available. But yeah, at the moment, it does require some knowledge around what's best for your application if you're not in complete control of all of the infrastructure that it's running in.Corey: It feels that with most things that are sort of pulled into the orbit of the CNCF—and OTel is no exception to this—that there's an idea that oh, well, everything is going to therefore be running in containers, on top of Kubernetes. And that might be unfair, but it also, frankly, winds up following pretty accurately what a lot of applications I'm seeing in client environments have been doing. Don't take it as a criticism. But it does seem like it is designed with an eye toward everything being microservices running on containers, scheduled which, from a infrastructure perspective, what appears to be willy-nilly abandoned, and how do you wind up gathering useful information out of that without drowning in data? That seems to be, from at least my brief experience with OTel, the direction it heads in. Is that directionally correct?Mike: Yeah, I think so. I think OpenTelemetry has a quite strong relationship with CNCF and therefore Kubernetes. That is a use case that we see as a very common with customers that we engage with, both at the prospect level and then just initial conversations, people using something like Kubernetes to do the application orchestration is very, very common. It's something that OpenTelemetry and Honeycomb are wanting to improve on as well. We want to get by a very good experience because it is so common when we come up to it that we want to have a very good, strong opinion around, well, if you're running in Kubernetes, these are the tools and these are the right ways to use OpenTelemetry to get the best out of it.Corey: I want to change gears a little bit. Something that's interested me about Honeycomb for a while has been its culture. Your founders have been very public about their views on a variety of different things that are not just engineering-centric, but tangential to it, like, engineering management: how not to be terrible at it. And based on a huge number of conversations I've had with folks over there, I'm inclined to agree that the stories they tell in public do align with how things go internally. Or at least if they're not, I would not expect you to admit it on the record, so either way, we'll just take that as a given.What I'm curious about is that you are many timezones away from their very nice office here in San Francisco. What's it like working remote in a company that is not fully distributed? Which is funny, we talk about distributed applications as if they're a given but distributed teams are still something we're wrangling with.Mike: Yeah, it's something that I've dealt with for quite a while, for maybe seven or eight years is worked with a few different organizations that are not based in my timezone. There's been a couple, primarily based in San Francisco area, so Pacific Time. An eight-hour time difference for the UK is challenging, it has its own challenges, but it also has a lot of benefits, too. So typically, I get to really have a lot of focus time on a morning. That means that I can start my day, look through whatever I think is appropriate for that morning, and not get interrupted very easily.I get a lot of time to think and plan and I think that's helped me at, like, the tech lead level because I can really focus on something and think it through without that level of interruption that I think some people do if you're working in the same timezone or even in the same office as someone. That approachability is just not naturally there. But the other side of that is that I have a very limited amount of natural overlap with people I work with on a day-to-day basis, so it's typically meetings from 2 till 5 p.m. most days to try and make sure that I build those social relationships, I'm talking to the right people, giving status updates, planning and that sort of thing. But it works for me. I really enjoy that balance of some ty—like, having a lot of focus time and having, like, then dedicated time to spend with people.And I think that's really important, as well is that a distributed team naturally means that you don't get to spend a lot of time with people and a lot of, like, one-on-one time with people, so that's something that I definitely focus on is doing a lot of social interaction as well. So, it's not just I have a meeting, we've got to stand up, we've got 15 minutes, and then everyone goes and does their own thing. I like to make sure that we have time so we can talk, we can connect to each other, we know each other, things that would—[unintelligible 00:16:35] that allow a space for conversations to happen that would naturally happen if you were sat next to somebody at a desk, or like, the more traditional, like, water cooler conversations. You hear somebody having a conversation, you go talk to them, that naturally evolves.Corey: That was where I ran into a lot of trouble with it myself. My first outing as a manager, I had—most of the people on my team were in the same room as I was, and then we had someone who was in Europe. And as much as we tried to include this person in all of our meetings, there was an intrinsic, “Let's go get a cup of coffee,” or, “Let's have a discussion and figure things out.” And sometimes it's four in the afternoon, we're going to figure something out, and they have long since gone to bed or have a life, hopefully. And it was one of those areas where despite a conscious effort to avoid this problem, it was very clear that they did not have an equal voice in the team dynamic, in the team functioning, in the team culture, and in many cases, some of the decisions we ultimately reached as an outgrowth of those sidebar conversations. This led to something of an almost religious belief for me, for at least a while, was that either everyone's distributed or no one is because otherwise you wind up with the unequal access problem. But it's clearly worked for you folks. How have you gotten around that?Mike: For Honeycomb, it was a conscious decision not long before the Covid pandemic that the team would be distributed first; the whole organization will be distributed first. So, a number of months before that happened, the intention was that anybody across the organization—which at the time, was only North America-based staff—would be able to do their job outside of the office. Because I think around the end of 2019 to the beginning of 2020, a lot of the staff were based in the San Francisco area and that was starting to grow, and want more staff to come into the business. And there were more opportunities for people outside of that area to join the business, so the business decided that if we're going to do this, if we're going to hire people outside of the local area, then we do want to make sure that, as you said, that everybody has an equal access, everyone has equal opportunity, they can participate, and everybody has the same opportunity to do those things. And that has definitely fed through pandemic, and then even when the office reopened and people can go back into the office. More than—I think there's only… maybe 25% of the company now is even in Pacific Time Zone. And then the office space itself is not very large considering the size of the company, so we couldn't fit everybody into our office space if we wanted to.Corey: Yeah, that's one of the constant growing challenges, too, that I understand that a lot of companies do see value in the idea of getting everyone together in a room. I know that I, for example, I'm a lot more effective and productive when I'm around other people. But I'm really expensive to their productivity because I am Captain Interrupter, which, you know, we have to recognize our limitations as we encounter them. But that also means that the office expense exceeds the AWS bill past a certain point of scale, and that is not a small thing. Like, I try not to take too much of a public opinion on should we be migrating everyone back to return-to-office as a mandate, yes, no, et cetera.I can see a bunch of different perspectives on this that are nuanced and I don't think it lends itself to my usual reactionary take on the Twitters, as it were, but it's a hard problem with no easy answer to it. Frankly, I also think it's a big mistake to do full-remote only for junior employees, just because so much of learning how the workforce works is through observation. You don't learn a lot about those unspoken dynamics in any other way than observing it directly.Mike: Yes, I fully agree. I think the stage that Honeycomb was at when I joined and has continued to be is that I think a very junior person joining an organization that is fully distributed is more challenging. It has different challenges, but it has more challenges because it doesn't have those… you can't just see something happening and know that that's the norm or that that's the expectation. You've got to push yourself into those in those different arenas, those different conversations, and it can be quite daunting when you're new to an organization, especially if you are not experienced in that organization or experienced in the role that you're currently occupying. Yeah, I think the distributed organizations is—fully distributed has its challenges and I think that's something that we do at Honeycomb is that we intentionally do that twice a year, maybe three times a year, bring in the people that do work very closely, bringing them together so they have that opportunity to work together, build those social interactions like I mentioned earlier, and then do some work together as well.And it builds a stronger trust relationship because of that, as well because you're reinforcing the social side with the work side in a face-to-face context. And there's just, there's no direct replacement for face-to-face. If you worked for somebody and never met them for over a year, it'd be very difficult to then just be in a room together and have a normal conversation.Corey: It takes a lot of effort because there's so much to a company culture that is not meetings or agenda-driven or talking about the work. I mean, companies get this wrong with community all the time where they think that a community is either a terrible option of people we can sell things to or more correctly, a place where users of our product or service or offering or platform can gather together to solve common challenges and share knowledge with each other. But where they fall flat often is it also has to have a social element. Like ohh, having a conversation about your lives is not on topic for this community Slack team is, great, that strangles community before it can even form, in many cases. And work is no different.Mike: Yeah, I fully agree. We see that with the Honeycomb Pollinators Slack channel. So, we use that as a primary way of community members to participate, talk to each other, share their experiences, and we can definitely see that there is a high level of social interaction alongside of that. They connect because they've got a shared interest or a shared tool or a shared problem that they're trying to solve, but we do see, like, people, the same people, reconnecting or re-communicating with each other because they have built that social connection there as well.And I think that's something that as organizations—like, OpenTelemetry is a community is more welcoming to that. And then you can participate with something that then transcends different organizations that you may work for as well because you're already part of this community. So, if that community then reaches to another organization, there's an opportunity to go, to move between organizations and then maintain a level of connection.Corey: That seems like one of the better approaches that people can have to this stuff. It's just a—the hard part, of course, is how do you change culture? I think the easy way to do it—the only easy way to do it—is you have to build the culture from the beginning. Every time I see companies bringing in outsiders to change the corporate culture, I can't help but feel that they're setting giant piles of money on fire. Culture is one of those things that's organic and just changing it by fiat doesn't work. If I knew how to actually change culture, I would have a much more lucrative target for my consultancy than I do today. You think AWS bills are a big problem? Everyone has a problem with company cultures.Mike: Yeah, I fully agree. I think that culture is something that you're right is very organic, it naturally happens. I think the value when organizations go through, like, a retrospective, like, what is our culture? How would we define it? What are the core values of that and how do we articulate that to people that might be coming into the organization, that's very valuable, too, because those core values are very useful to communicate to people.So, one of the bigger core values that we've got at Honeycomb is that—we refer to as, “We hire adults,” meaning that when somebody needs to do something, they just can go and do it. You don't have to report to somebody, you don't have to go and tell somebody, “I need a doctor appointment,” or, “I've got to go and pick up the kids from school,” or something like that. You're trusted to do your job to the highest level, and if you need additional help, you can ask for it. If somebody requires something of you they ask for it. They do it in a humane way and they expect to be treated like a human and an adult all of the time.Corey: On some level, I've always found, for better or worse, that people will largely respond to how you treat them and live up or down to the expectation placed upon them. You want a bunch of cogs who are going to have to raise their hand to go to the bathroom? Okay, you can staff that way if you want, but don't be surprised when those teams don't volunteer to come up with creative solutions to things either. You can micromanage people to death.Mike: Yeah. Yeah, definitely. I've been in organizations, like, fresh out of college and had to go to work at a particular place and it was very time-managed. And I had inbound sales calls and things like that and it was very, like, you've spent more than three minutes on a wrap-up call from having a previous call, and if you don't finish that call within three minutes, your manager will call your phone to say, “You need to go on to the next call.” And it's… you could have had a really important call or you could have had a very long call. They didn't care. They just wanted—you've had your time now move on to the next one and they didn't care.Corey: One last question I want to ask you about before we wind up calling this an episode, and it distills down to I guess, effectively, your history, for lack of a better term. You have done an awful lot of Go maintenance work—Go meaning the language, not the imperative command, to be clear—but you also historically were the .NET SDK maintainer for something or other. Do you find those languages to be similar or… how did that come to be? I mean, to be clear, my programming languages of choice are twofold: both brute force and enthusiasm. Most people take a slightly different path.Mike: Yeah, I worked with .NET for a very long time, so that was, like, the place—the first place that I joined as a real organization after finishing college was .NET and it just sort of stuck. I enjoyed the language. At the time, sort of, what 15 year—12, 15 years ago, the language itself was moving pretty well, there was things being added to it, it was enjoyable to use.Over the last maybe four or five years, I've had the opportunity to work a lot more in Go. And they are very different. So, Go is much more focused on simplicity and not hiding anything from anybody and just being very efficient at what you can see it does. .NET and many other languages such as Java, Ruby, JavaScript, Python, all have a level of magic to them, so if you're not part of the ecosystem or if you don't know particular really common packages that can do things for you, not knowing something about the ecosystem causes pain.I think Go takes away some of that because if you don't know those ecosystems or if you don't know those tools, you can still solve the problem fairly quickly and fairly simply. Tools will help but they're not required. .NET is probably on the boundary for me. It's still very easy to use, I enjoy using it, but it just… I found that it's not that long ago, I would say that I've switched from thinking like a .NET developer, so whenever I'm forming code in my head, like, how I would solve a problem, for a very long time, it was in .NET and C#.I'd probably say in the last 12 months or so, it's definitely moved more to Go just because of the simplicity. And it's also the tool that is most used within Honeycomb, especially, so if you're talking about Go code, you've got a wider audience to bounce ideas off, to talk to, communicate, get ideas from. .NET is not a very well used language within Honeycomb and probably even, like… even maybe West Coast-based organizations, it seems to be very high-level organizations that are willing to pay their money up for, like, Microsoft support. Like, Go is something that a lot of developers use because it's very simple, very quick, can move quick.Corey: I found that it was very easy for me to pick up Go to build out something ridiculous a few years back when I need to control my video camera through its ‘API' to use the term charitably. And it just works in a way that made an awful lot of sense. But I still find myself reaching for Python or for—God help me—TypeScript if I'm doing some CDK work these days. And honestly, they all tend to achieve more or less the same outcome. It's just different approaches to—well, to be unkind—dependency management in some cases, and also the ecosystem around it and what is done for you.I don't think there's a bad language to learn. I don't want this to be interpreted as language snobbery, but I haven't touched anything in the Microsoft ecosystem for a long time in production, so .NET was just never on my radar. But it's clear they have an absolutely massive community ecosystem built around it and that is no small thing. I'd say it rivals Java.Mike: Yeah definitely. I think over the last ten years or so, the popularity of .NET as a language to be built from enterprise, especially at larger-scale organizations have taken it on, and then, like, six, seven years ago, they introduced the .NET Core Framework, which allowed it to run on non-Windows platforms, and that accelerated the language dramatically, so they have a consistent API that can be used on Windows, on Linux, Mac, and that makes a huge difference for creating a larger audience for people to interact with it. And then also, with Azure becoming much more popular, they can have all of these—this language that people are typically used to using Linux as an operating system that runs infrastructure, but not being forced to use Windows is probably quite a big thing for Azure as well.Corey: I really want to thank you for taking the time to talk about what you're up to over there. If people want to learn more, where's the best place for them to go find you?Mike: Typically, I use Twitter, so it's Mike_Goldsmith. I create blogs on the Honeycomb blog website, which I've done a few different things; I've got a new one coming up soon to talk about different ways of collecting data. So yeah, those are the two main places. LinkedIn is usual as ever, but that's a little bit more work-focused.Corey: It does seem to be. And we'll put links to all of that in the [show notes 00:31:11]. Thank you so much for being so generous with your time, and of course, thank you Honeycomb for sponsoring this episode of my ridiculous podcast.Mike: Yeah, thank you very much for having me on.Corey: Mike Goldsmith, staff software engineer at Honeycomb. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice along with an insulting comment that we will then have instrumented across the board with a unified observability platform to keep our days eventful.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.

covid-19 god amazon culture europe uk starting san francisco microsoft tools north america cloud mac west coast windows slack api open source screaming python aws linux java github company culture tracing devops javascript azure goldsmith kubernetes sdks lambda honeycomb typescript apm cdk cncf corey quinn staff software engineer otel lightstep mike yeah mike so duckbill group mike for chief cloud economist last week in aws

#13 Making Sense of OpenTelemetry and Observability (with Adriana Villela)

S.R.E.path Podcast

Play Episode Listen Later Oct 31, 2023 32:52

Ash Patel talks with Adriana Villela (CNCF Ambassador, OpenTelemetry contributor, and senior developer advocate at Lightstep) about the promise of OpenTelemetry for observability teams, as well as the challenges of doing it right. She also touches on engineering leadership topics, recalling her experience as a leader of platform engineering and observability teams. This is a public episode. If you would like to discuss this with other subscribers or get access to bonus episodes, visit srepath.substack.com

Adriana Villela - On Being a Serial Refactorer

Maintainable

Play Episode Listen Later Aug 22, 2023 51:24

Robby has a chat with Adriana Villela, a Senior Developer Advocate at Lightstep. Adriana highlights that well-maintained software should be software that one can understand when they go into the code even if they're not super familiar with it. She shares why she values being a serial refactorer and describes what beautiful code should look like. Adriana views debuggers as her best friends because as she says, “I do find maintaining documentation very difficult. That's where a debugger comes in very handy so that he can step through the code to figure out what is going on”She will share a story about joining a software project that required a lot of refactoring, why asking for forgiveness is often easier than asking for permission, her involvement with the OpenTelemetry project and the standardization of observability protocols, and how to think about observability on a practical day-to-day level as a software engineer. She will also introduce us to Lightstep and what being a Senior Developer Advocate role is like, and dive into trace-based testing, why every software engineer should develop a trace mindset, the complexities of tooling we have today versus what was available a few decades ago, and what her podcast, On-Call Me Maybe Podcast, is all about. Stay tuned for all that and more.Book Recommendations:Implementing Service Level Objectives by Alex HidalgoHelpful Links:https://oncallmemaybe.com/Adriana on Twitter - @adrianamvillelaMastodon - adrianamvillela@hachyderm.ioAdriana on LinkedInAdriana on Instagram - @adriana.m.VillelaLinktree - https://linktr.ee/adriana_villelaSLOconfVIDEO: SLOconf 2023 - Translating Failures into SLOs - Ana Margarita Medina and Adriana VillelaAbby Bangser: Building Trust In Your Deployment PipelineSubscribe to Maintainable on:Apple PodcastsOvercastSpotifyOr search "Maintainable" wherever you stream your podcasts.Keep up to date with the Maintainable Podcast by joining the newsletter.

code software engineers developers serial coding debugging technical debt villela senior developer advocate codebase lightstep

Observability and SRE Leaders - Tech.Strong.Women. EP 18

Tech. Strong. Women.

Play Episode Listen Later Apr 14, 2023 41:08

In this episode of Tech.Strong.Women, Jodi Ashley and Tracy Ragan are joined by Ana Margarita Medina, Lightstep staff developer advocate, and Adriana Villela, Lightstep developer advocate, two powerhouse technologists who've focused their careers on SRE and observability. With experiences in leading observability and SRE adoption, Adriana and Ana work with developers, open-source projects like OpenTelemetry and technology creators to continue the advancement of observability and share the best practices of SRE. Ana and Adriana host the On-Call Me Maybe podcast, speaking with diverse guests from someone getting started as an SRE to highly experienced, sharing best practices and lessons learned.

women tech leaders strong women sre observability lightstep

2022 Year in Review & 2023 Predictions

The Cloudcast

Play Episode Listen Later Dec 21, 2022 62:07

Aaron and Brian discuss the 2022 Year in Review, highlighting the biggest trends, as well as making 2023 predictions. SHOW: 679CLOUD NEWS OF THE WEEK - http://bit.ly/cloudcast-cnotwSHOW SPONSORS:Eaton HomepageEaton and Tripp Lite have joined forces to bring more sanity to IT pros days, every day. Visit www.eaton.com/audio to learn more!FujiFilm. Your archival and backup data strategy, built on tape. Fujifilm tape is helping businesses get a handle on their vast amounts of data in the most secure, scalable and efficient way. Find out more at builtontape.fujifilmusa.comAWS Insiders is an edgy, entertaining podcast about the services and future of cloud computing at AWS. Listen to AWS Insiders in your favorite podcast player. Cloudfix HomepageSHOW NOTES:THE BASICS:The show grew nearly 20% YoY (2nd year in a row), with our first 2M listen year.The Cloudcast named to Top 20 Kubernetes resources of 2022The Cloudcast hosts named to “Who's Who of Cloud (2022)” listThank you to all our sponsors throughout the year (Datadog, CloudZero, JumpCloud, Mergify, BMC, Teleport, NewRelic, StrongDM, Polyscale, LoadForge, NetApp, Revelo, Lightstep, Granulate, CDN77, Jetbrains, Eaton, Cloudfix)THE BIG NEWS AREAS:Tech layoffs in 2HCY22VMware got acquired by Broadcom (will be part of CA+others)The US made a big investment in CHIPSNVIDIA's acquisition of ARM fell throughAWS - $60B>$85B (+28%), Azure - $35B>$50B (+42%), GCP - $15B>$27B (+38%)Microsoft is now 50/50 in Software and Cloud revenuesAWS re:Invent is different under Adam SelipskyBetween Texts and Images, AI seemed to make a big leap WebAssembly (WASM) is starting to make noise in new ways (PaaS 2.0?)Is Platform Engineering replacing DevOps and SRE?Docker 2.0 is making money2023 PREDICTIONS: Our 2020 PredictionsOur 2021 PredictionsOur 2022 Predictions Aaron's Predictions:We'll see a Twitter clone founded by folks that leftAzure will become #1 public cloud (pulled from 2022 predictions)Docker will become a unicorn again and prove everyone wrong2023 will be the year of the down rounds:Worldwide: 450 unicorn and 24 decacornA unicorn will go underApple will finally give everyone a peek at their EV car in development, just to mess with Elon a bit.Brian's Predictions:We'll start seeing some of the 2020-2022 unicorns acquired as sub-unicorn pricesServerless makes a comeback as a cheaper computer alternativeFinOps conferences become a must-attend eventGCP makes a huge hail-mary acquisitionFEEDBACK?Email: show at the cloudcast dot netTwitter: @thecloudcastnet

ai predictions elon musk microsoft software cloud worldwide images year in review ev arm aws devops 2m invent eaton docker kubernetes sre yoy broadcom paas bmc fujifilm netapp datadog teleport new relic jetbrains 50b revelo jumpcloud 27b lightstep cloudcast 85b aws insiders tripp lite

Understanding Observability in Elixir with Dave Lucia - EMx 195

Elixir Mix

Play Episode Listen Later Nov 23, 2022 55:18

Dave Lucia is a CTO at a media company called Bitfo, which builds high-quality educational content in the cryptocurrency space. He has been an Elixir Developer for about 6 years. He is the author of “Elixir Observability: OpenTelemetry, Lightstep, Honeycomb”. He joins the show to talk about how they were able to build their system and other websites like DeFi Rate and ethereumprice.About this Episode Observability OpenTelemetry OpenTracing Analyzing and Making Data useful Tools used for tracing and metrics Sponsors Chuck's Resume Template Developer Book Club starting with Clean Architecture by Robert C. Martin Become a Top 1% Dev with a Top End Devs Membership Links Elixir Observability: OpenTelemetry, Lightstep, Honeycomb Bitfo DeFi Rate ethereumprice Dave Lucia's Blog GitHub: davydog187 Twitter: @davydog187 Picks Allen - Distributed Services with Go Dave - Software Unscripted Dave - bitfo/timescale Dave - bitfo/ectorange Sascha - ex_union

cto dev elixir honeycomb observability clean architecture lightstep

OpenTelemetry Properly Explained and Demoed

Irish Tech News Audio Articles

Play Episode Listen Later Nov 15, 2022 18:16

OpenTelemetry project offers vendor-neutral integration points that help organizations obtain the raw materials — the "telemetry" — that fuel modern observability tools, and with minimal effort at integration time. But what does OpenTelemetry mean for those who use their favorite observability tools but don't exactly understand how it can help them? How might OpenTelemetry be relevant to the folks who are new to Kuberentes (the majority of KubeCon attendees during the past years) and those who are just getting started with observability? Austin Parker, head of developer relations, Lightstep and Morgan McLean, director of product management, Splunk, discuss during this podcast at KubeCon + CloudNativeCon 2022 how the OpenTelemetry project has created demo services to help cloud native community members better understand cloud native development practices and test out OpenTelemetry, as well as Kubernetes, observability software, etc. At this conjecture in DevOps history, there has been considerable hype around observability for developers and operations teams, and more recently, much attention has been given to helping combine the different observability solutions out there in use through a single interface, and to that end, OpenTelemetry has emerged as a key standard. DevOps teams today need OpenTelemetry since they typically work with a lot of different data sources for observability processes, Parker said. “If you want observability, you need to transform and send that data out to any number of open source or commercial solutions and you need a lingua franca to to be consistent. Every time I have a host, or an IP address, or any kind of metadata, consistency is key and that's what OpenTelemetry provides.” Additionally, as a developer or an operator, OpenTelemetry serves to instrument your system for observability, McLean said. “OpenTelemetry does that through the power of the community working together to define those standards and to provide the components needed to extract that data among hundreds of thousands of different combinations of software and hardware and infrastructure that people are using,” McLean said. Observability and OpenTelemetry, while conceptually straightforward, do require a learning curve to use. To that end, the OpenTelemetry project has released a demo to help. It is intended to both better understand cloud native development practices and to test out OpenTelemetry, as well as Kubernetes, observability software, etc.,the project's creators say. OpenTelemetry Demo v1.0 general release is available on GitHub and on the OpenTelemetry site. The demo helps with learning how to add instrumentation to an application to gather metrics, logs and traces for observability. There is heavy instruction for open source projects like Prometheus for Kubernetes and Jaeger for distributed tracing. How to acquaint yourself with tools such as Grafana to create dashboards are shown. The demo also extends to scenarios in which failures are created and OpenTelemetry data is used for troubleshooting and remediation. The demo was designed for the beginner or the intermediate level user, and can be set up to run on Docker or Kubernetes in about five minutes. “The demo is a great way for people to get started,” Parker said. “We've also seen a lot of great uptake from our commercial partners as well who have said ‘we'll use this to demo our platform.'”

ServiceNow to acquire observability innovator Era Software, helping businesses turn data-driven insights into action

Play Episode Listen Later Oct 6, 2022 3:48

ServiceNow, the leading digital workflow company making the world work better for everyone, has announced it has signed an agreement to acquire observability and log management innovator, Era Software. Combined with ServiceNow's acquisition of Lightstep in 2021, Era Software will help provide customers with a unified observability solution at scale. Customers will be able to gather actionable insights that deliver value across the business, all within a single solution purpose-built for the era of digital business. Observability is foundational to digital transformation as it provides developers with the necessary insights to understand the performance of strategic applications at scale and translate that data into business value. Yet within large enterprises, observability often remains siloed and costly, creating a fragmented and complex experience for DevOps and SRE teams. Era Software's innovative technology and customer-centric approach to log management complements and augments existing features within Lightstep, and accelerates ServiceNow's path toward unified telemetry (logs, metrics, traces). “Digital transformation succeeds or fails based on unified observability,” says Ben Sigelman, general manager of ServiceNow's Lightstep business unit and co-founder of Lightstep. “Together, ServiceNow and Era Software are set up to deliver a unified and seamless observability experience within one solution, designed to scale.” As a founding member of the OpenTelemetry project, Lightstep leads the industry in a vision toward unified telemetry. Together, Era Software and Lightstep will further extend critical, unified observability workflows, removing the confusing context switches that hinder DevOps and SRE productivity at most enterprises today. Unified telemetry allows teams to innovate fast with precision and control, helping modern organizations deliver better outcomes across all their technology investments, capitalizing on the promise of digital transformation. “At Era Software, we created solutions to simplify the complex challenges of managing large volumes of observability data, with a particular focus on log management,” said Todd Persen, CEO and co?founder at Era Software. “We have always believed that observability should span across the enterprise. We are excited to join ServiceNow, as we further build a customer-centric model of observability that can help transform the way people work.” Since its inception, the Era Software team has engineered new approaches to log data management that resolves scale, performance, and cost issues associated with running distributed applications on modern cloud-native architectures. Seattle?based Era Software was co?founded in 2019 by CEO Todd Persen and CTO Robert Winslow. Persen was previously a co-founder and CTO at InfluxData, where he helped engineer the InfluxDB time-series database. With IDC forecasting the growth of the observability market to reach $9.08 billion by 2025, this announcement underscores ServiceNow's organic growth strategy with a focus on talent and technologies that strengthen the Now Platform with new and enhanced features for customers. It follows other recent ServiceNow acquisitions, including Hitch Works, DotWalk, Mapwize, and Gekkobrain. ServiceNow expects to complete the acquisition of Era Software in Q4 2022. Financial terms of the deal were not disclosed. See more stories here.

The Ever-Changing World of Cloud Native Observability with Ian Smith

Play Episode Listen Later Sep 13, 2022 41:58

About IanIan Smith is Field CTO at Chronosphere where he works across sales, marketing, engineering and product to deliver better insights and outcomes to observability teams supporting high-scale cloud-native environments. Previously, he worked with observability teams across the software industry in pre-sales roles at New Relic, Wavefront, PagerDuty and Lightstep.Links Referenced: Chronosphere: https://chronosphere.io Last Tweet in AWS: lasttweetinaws.com TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. Every once in a while, I find that something I'm working on aligns perfectly with a person that I wind up basically convincing to appear on this show. Today's promoted guest is Ian Smith, who's Field CTO at Chronosphere. Ian, thank you for joining me.Ian: Thanks, Corey. Great to be here.Corey: So, the coincidental aspect of what I'm referring to is that Chronosphere is, despite the name, not something that works on bending time, but rather an observability company. Is that directionally accurate?Ian: That's true. Although you could argue it probably bend a little bit of engineering time. But we can talk about that later.Corey: [laugh]. So, observability is one of those areas that I think is suffering from too many definitions, if that makes sense. And at first, I couldn't make sense of what it was that people actually meant when they said observability, this sort of clarified to me at least when I realized that there were an awful lot of, well, let's be direct and call them ‘legacy monitoring companies' that just chose to take what they were already doing and define that as, “Oh, this is observability.” I don't know that I necessarily agree with that. I know a lot of folks in the industry vehemently disagree.You've been in a lot of places that have positioned you reasonably well to have opinions on this sort of question. To my understanding, you were at interesting places, such as LightStep, New Relic, Wavefront, and PagerDuty, which I guess technically might count as observability in a very strange way. How do you view observability and what it is?Ian: Yeah. Well, a lot of definitions, as you said, common ones, they talk about the three pillars, they talk really about data types. For me, it's about outcomes. I think observability is really this transition from the yesteryear of monitoring where things were much simpler and you, sort of, knew all of the questions, you were able to define your dashboards, you were able to define your alerts and that was really the gist of it. And going into this brave new world where there's a lot of unknown things, you're having to ask a lot of sort of unique questions, particularly during a particular instance, and so being able to ask those questions in an ad hoc fashion layers on top of what we've traditionally done with monitoring. So, observability is sort of that more flexible, more dynamic kind of environment that you have to deal with.Corey: This has always been something that, for me, has been relatively academic. Back when I was running production environments, things tended to be a lot more static, where, “Oh, there's a problem with the database. I will SSH into the database server.” Or, “Hmm, we're having a weird problem with the web tier. Well, there are ten or 20 or 200 web servers. Great, I can aggregate all of their logs to Syslog, and worst case, I can log in and poke around.”Now, with a more ephemeral style of environment where you have Kubernetes or whatnot scheduling containers into place that have problems you can't attach to a running container very easily, and by the time you see an error, that container hasn't existed for three hours. And that becomes a problem. Then you've got the Lambda universe, which is a whole ‘nother world pain, where it becomes very challenging, at least for me, in order to reason using the old style approaches about what's actually going on in your environment.Ian: Yeah, I think there's that and there's also the added complexity of oftentimes you'll see performance or behavioral changes based on even more narrow pathways, right? One particular user is having a problem and the traffic is spread across many containers. Is it making all of these containers perform badly? Not necessarily, but their user experience is being affected. It's very common in say, like, B2B scenarios for you to want to understand the experience of one particular user or the aggregate experience of users at a particular company, particular customer, for example.There's just more complexity. There's more complexity of the infrastructure and just the technical layer that you're talking about, but there's also more complexity in just the way that we're handling use cases and trying to provide value with all of this software to the myriad of customers in different industries that software now serves.Corey: For where I sit, I tend to have a little bit of trouble disambiguating, I guess, the three baseline data types that I see talked about again and again in observability. You have logs, which I think I've mostly I can wrap my head around. That seems to be the baseline story of, “Oh, great. Your application puts out logs. Of course, it's in its own unique, beautiful format. Why wouldn't it be?” In an ideal scenario, they're structured. Things are never ideal, so great. You're basically tailing log files in some cases. Great. I can reason about those.Metrics always seem to be a little bit of a step beyond that. It's okay, I have a whole bunch of log lines that are spitting out every 500 error that my app is throwing—and given my terrible code, it throws a lot—but I can then ideally count the number of times that appears and then that winds up incrementing counter, similar to the way that we used to see with StatsD, for example, and Collectd. Is that directionally correct? As far as the way I reason about, well so far, logs and metrics?Ian: I think at a really basic level, yes. I think that, as we've been talking about, sort of greater complexity starts coming in when you have—particularly metrics in today's world of containers—Prometheus—you mentioned StatsD—Prometheus has become sort of like the standard for expressing those things, so you get situations where you have incredibly high cardinality, so cardinality being the interplay between all the different dimensions. So, you might have, my container is a label, but also the type of endpoint is running on that container as a label, then maybe I want to track my customer organizations and maybe I have 5000 of those. I have 3000 containers, and so on and so forth. And you get this massive explosion, almost multiplicatively.For those in the audience who really live and read cardinality, there's probably someone screaming about well, it's not truly multiplicative in every sense of the word, but, you know, it's close enough from an approximation standpoint. As you get this massive explosion of data, which obviously has a cost implication but also has, I think, a really big implication on the core reason why you have metrics in the first place you alluded to, which is, so a human being can reason about it, right? You don't want to go and look at 5000 log lines; you want to know, out of those 5000 log lines of 4000 errors and I have 1000, OKs. It's very easy for human beings to reason about that from a numbers perspective. When your metrics start to re-explode out into thousands, millions of data points, and unique sort of time series more numbers for you to track, then you're sort of losing that original goal of metrics.Corey: I think I mostly have wrapped my head around the concept. But then that brings us to traces, and that tends to be I think one of the hardest things for me to grasp, just because most of the apps I build, for obvious reasons—namely, I'm bad at programming and most of these are proof of concept type of things rather than anything that's large scale running in production—the difference between a trace and logs tends to get very muddled for me. But the idea being that as you have a customer session or a request that talks to different microservices, how do you collate across different systems all of the outputs of that request into a single place so you can see timing information, understand the flow that user took through your application? Is that again, directionally correct? Have I completely missed the plot here? Which is again, eminently possible. You are the expert.Ian: No, I think that's sort of the fundamental premise or expected value of tracing, for sure. We have something that's akin to a set of logs; they have a common identifier, a trace ID, that tells us that all of these logs essentially belong to the same request. But importantly, there's relationship information. And this is the difference between just having traces—sorry, logs—with just a trace ID attached to them. So, for example, if you have Service A calling Service B and Service C, the relatively simple thing, you could use time to try to figure this out.But what if there are things happening in Service B at the same time there are things happening in Service C and D, and so on and so forth? So, one of the things that tracing brings to the table is it tells you what is currently happening, what called that. So oh, I know that I'm Service D. I was actually called by Service B and I'm not just relying on timestamps to try and figure out that connection. So, you have that information and ultimately, the data model allows you to fully sort of reflect what's happening with the request, particularly in complex environments.And I think this is where, you know, tracing needs to be sort of looked at as not a tool for—just because I'm operating in a modern environment, I'm using some Kubernetes, or I'm using Lambda, is it needs to be used in a scenario where you really have troubles grasping, from a conceptual standpoint, what is happening with the request because you need to actually fully document it. As opposed to, I have a few—let's say three Lambda functions. I maybe have some key metrics about them; I have a little bit of logging. You probably do not need to use tracing to solve, sort of, basic performance problems with those. So, you can get yourself into a place where you're over-engineering, you're spending a lot of time with tracing instrumentation and tracing tooling, and I think that's the core of observability is, like, using the right tool, the right data for the job.But that's also what makes it really difficult because you essentially need to have this, you know, huge set of experience or knowledge about the different data, the different tooling, and what influential architecture and the data you have available to be able to reason about that and make confident decisions, particularly when you're under a time crunch which everyone is familiar with a, sort of like, you know, PagerDuty-style experience of my phone is going off and I have a customer-facing incident. Where is my problem? What do I need to do? Which dashboard do I need to look at? Which tool do I need to investigate? And that's where I think the observability industry has become not serving the outcomes of the customers.Corey: I had a, well, I wouldn't say it's a genius plan, but it was a passing fancy that I've built this online, freely available Twitter client for authoring Twitter threads—because that's what I do is that of having a social life—and it's available at lasttweetinaws.com. I've used that as a testbed for a few things. It's now deployed to roughly 20 AWS regions simultaneously, and this means that I have a bit of a problem as far as how to figure out not even what's wrong or what's broken with this, but who's even using it?Because I know people are. I see invocations all over the planet that are not me. And sometimes it appears to just be random things crawling the internet—fine, whatever—but then I see people logging in and doing stuff with it. I'd kind of like to log and see who's using it just so I can get information like, is there anyone I should talk to about what it could be doing differently? I love getting user experience reports on this stuff.And I figured, ah, this is a perfect little toy application. It runs in a single Lambda function so it's not that complicated. I could instrument this with OpenTelemetry, which then, at least according to the instructions on the tin, I could then send different types of data to different observability tools without having to re-instrument this thing every time I want to kick the tires on something else. That was the promise.And this led to three weeks of pain because it appears that for all of the promise that it has, OpenTelemetry, particularly in a Lambda environment, is nowhere near ready for being able to carry a workload like this. Am I just foolish on this? Am I stating an unfortunate reality that you've noticed in the OpenTelemetry space? Or, let's be clear here, you do work for a company with opinions on these things. Is OpenTelemetry the wrong approach?Ian: I think OpenTelemetry is absolutely the right approach. To me, the promise of OpenTelemetry for the individual is, “Hey, I can go and instrument this thing, as you said and I can go and send the data, wherever I want.” The sort of larger view of that is, “Well, I'm no longer beholden to a vendor,”—including the ones that I've worked for, including the one that I work for now—“For the definition of the data. I am able to control that, I'm able to choose that, I'm able to enhance that, and any effort I put into it, it's mine. I own that.”Whereas previously, if you picked, say, for example, an APM vendor, you said, “Oh, I want to have some additional aspects of my information provider, I want to track my customer, or I want to track a particular new metric of how much dollars am I transacting,” that effort really going to support the value of that individual solution, it's not going to support your outcomes. Which is I want to be able to use this data wherever I want, wherever it's most valuable. So, the core premise of OpenTelemetry, I think, is great. I think it's a massive undertaking to be able to do this for at least three different data types, right? Defining an API across a whole bunch of different languages, across three different data types, and then creating implementations for those.Because the implementations are the thing that people want, right? You are hoping for the ability to, say, drop in something. Maybe one line of code or preferably just, like, attach a dependency, let's say in Java-land at runtime, and be able to have the information flow through and have it complete. And this is the premise of, you know, vendors I've worked with in the past, like New Relic. That was what New Relic built on: the ability to drop in an agent and get visibility immediately.So, having that out-of-the-box visibility is obviously a goal of OpenTelemetry where it makes sense—Go, it's very difficult to attach things at runtime, for example—but then saying, well, whatever is provided—let's say your gRPC connections, database, all these things—well, now I want to go and instrument; I want to add some additional value. As you said, maybe you want to track something like I want to have in my traces the email address of whoever it is or the Twitter handle of whoever is so I can then go and analyze that stuff later. You want to be able to inject that piece of information or that instrumentation and then decide, well, where is the best utilized? Is it best utilized in some tooling from AWS? Is it best utilized in something that you've built yourself? Is it best of utilized an open-source project? Is it best utilized in one of the many observability vendors, or is even becoming more common, I want to shove everything in a data lake and run, sort of, analysis asynchronously, overlay observability data for essentially business purposes.All of those things are served by having a very robust, open-source standard, and simple-to-implement way of collecting a really good baseline of data and then make it easy for you to then enhance that while still owning—essentially, it's your IP right? It's like, the instrumentation is your IP, whereas in the old world of proprietary agents, proprietary APIs, that IP was basically building it, but it was tied to that other vendor that you were investing in.Corey: One thing that I was consistently annoyed by in my days of running production infrastructures at places, like, you know, large banks, for example, one of the problems I kept running into is that this, there's this idea that, “Oh, you want to use our tool. Just instrument your applications with our libraries or our instrumentation standards.” And it felt like I was constantly doing and redoing a lot of instrumentation for different aspects. It's not that we were replacing one vendor with another; it's that in an observability, toolchain, there are remarkably few, one-size-fits-all stories. It feels increasingly like everyone's trying to sell me a multifunction printer, which does one thing well, and a few other things just well enough to technically say they do them, but badly enough that I get irritated every single time.And having 15 different instrumentation packages in an application, that's either got security ramifications, for one, see large bank, and for another it became this increasingly irritating and obnoxious process where it felt like I was spending more time seeing the care and feeding of the instrumentation then I was the application itself. That's the gold—that's I guess the ideal light at the end of the tunnel for me in what OpenTelemetry is promising. Instrument once, and then you're just adjusting configuration as far as where to send it.Ian: That's correct. The organization's, and you know, I keep in touch with a lot of companies that I've worked with, companies that have in the last two years really invested heavily in OpenTelemetry, they're definitely getting to the point now where they're generating the data once, they're using, say, pieces of the OpenTelemetry pipeline, they're extending it themselves, and then they're able to shove that data in a bunch of different places. Maybe they're putting in a data lake for, as I said, business analysis purposes or forecasting. They may be putting the data into two different systems, even for incident and analysis purposes, but you're not having that duplication effort. Also, potentially that performance impact, right, of having two different instrumentation packages lined up with each other.Corey: There is a recurring theme that I've noticed in the observability space that annoys me to no end. And that is—I don't know if it's coming from investor pressure, from folks never being satisfied with what they have, or what it is, but there are so many startups that I have seen and worked with in varying aspects of the observability space that I think, “This is awesome. I love the thing that they do.” And invariably, every time they start getting more and more features bolted onto them, where, hey, you love this whole thing that winds up just basically doing a tail-F on a log file, so it just streams your logs in the application and you can look for certain patterns. I love this thing. It's great.Oh, what's this? Now, it's trying to also be the thing that alerts me and wakes me up in the middle of the night. No. That's what PagerDuty does. I want PagerDuty to do that thing, and I want other things—I want you just to be the log analysis thing and the way that I contextualize logs. And it feels like they keep bolting things on and bolting things on, where everything is more or less trying to evolve into becoming its own version of Datadog. What's up with that?Ian: Yeah, the sort of, dreaded platform play. I—[laugh] I was at New Relic when there were essentially two products that they sold. And then by the time I left, I think there was seven different products that were being sold, which is kind of a crazy, crazy thing when you think about it. And I think Datadog has definitely exceeded that now. And I definitely see many, many vendors in the market—and even open-source solutions—sort of presenting themselves as, like, this integrated experience.But to your point, even before about your experience of these banks it oftentimes become sort of a tick-a-box feature approach of, “Hey, I can do this thing, so buy more. And here's a shared navigation panel.” But are they really integrated? Like, are you getting real value out of it? One of the things that I do in my role is I get to work with our internal product teams very closely, particularly around new initiatives like tracing functionality, and the constant sort of conversation is like, “What is the outcome? What is the value?”It's not about the feature; it's not about having a list of 19 different features. It's like, “What is the user able to do with this?” And so, for example, there are lots of platforms that have metrics, logs, and tracing. The new one-upmanship is saying, “Well, we have events as well. And we have incident response. And we have security. And all these things sort of tie together, so it's one invoice.”And constantly I talk to customers, and I ask them, like, “Hey, what are the outcomes that you're getting when you've invested so heavily in one vendor?” And oftentimes, the response is, “Well, I only need to deal with one vendor.” Okay, but that's not an outcome. [laugh]. And it's like the business having a single invoice.Corey: Yeah, that is something that's already attainable today. If you want to just have one vendor with a whole bunch of crappy offerings, that's what AWS is for. They have AmazonBasics versions of everything you might want to use in production. Oh, you want to go ahead and use MongoDB? Well, use AmazonBasics MongoDB, but they call it DocumentDB because of course they do. And so, on and so forth.There are a bunch of examples of this, but those companies are still in business and doing very well because people often want the genuine article. If everyone was trying to do just everything to check a box for procurement, great. AWS has already beaten you at that game, it seems.Ian: I do think that, you know, people are hoping for that greater value and those greater outcomes, so being able to actually provide differentiation in that market I don't think is terribly difficult, right? There are still huge gaps in let's say, root cause analysis during an investigation time. There are huge issues with vendors who don't think beyond sort of just the one individual who's looking at a particular dashboard or looking at whatever analysis tool there is. So, getting those things actually tied together, it's not just, “Oh, we have metrics, and logs, and traces together,” but even if you say we have metrics and tracing, how do you move between metrics and tracing? One of the goals in the way that we're developing product at Chronosphere is that if you are alerted to an incident—you as an engineer; doesn't matter whether you are massively sophisticated, you're a lead architect who has been with the company forever and you know everything or you're someone who's just come out of onboarding and is your first time on call—you should not have to think, “Is this a tracing problem, or a metrics problem, or a logging problem?”And this is one of those things that I mentioned before of requiring that really heavy level of knowledge and understanding about the observability space and your data and your architecture to be effective. And so, with the, you know, particularly observability teams and all of the engineers that I speak with on a regular basis, you get this sort of circumstance where well, I guess, let's talk about a real outcome and a real pain point because people are like, okay, yeah, this is all fine; it's all coming from a vendor who has a particular agenda, but the thing that constantly resonates is for large organizations that are moving fast, you know, big startups, unicorns, or even more traditional enterprises that are trying to undergo, like, a rapid transformation and go really cloud-native and make sure their engineers are moving quickly, a common question I will talk about with them is, who are the three people in your organization who always get escalated to? And it's usually, you know, between two and five people—Corey: And you can almost pick those perso—you say that and you can—at least anyone who's worked in environments or through incidents like this more than a few times, already have thought of specific people in specific companies. And they almost always fall into some very predictable archetypes. But please, continue.Ian: Yeah. And people think about these people, they always jump to mind. And one of the things I asked about is, “Okay, so when you did your last innovation around observably”—it's not necessarily buying a new thing, but it maybe it was like introducing a new data type or it was you're doing some big investment in improving instrumentation—“What changed about their experience?” And oftentimes, the most that can come out is, “Oh, they have access to more data.” Okay, that's not great.It's like, “What changed about their experience? Are they still getting woken up at 3 am? Are they constantly getting pinged all the time?” One of the vendors that I worked at, when they would go down, there were three engineers in the company who were capable of generating list of customers who are actually impacted by damage. And so, every single incident, one of those three engineers got paged into the incident.And it became borderline intolerable for them because nothing changed. And it got worse, you know? The platform got bigger and more complicated, and so there were more incidents and they were the ones having to generate that. But from a business level, from an observability outcomes perspective, if you zoom all the way up, it's like, “Oh, were we able to generate the list of customers?” “Yes.”And this is where I think the observability industry has sort of gotten stuck—you know, at least one of the ways—is that, “Oh, can you do it?” “Yes.” “But is it effective?” “No.” And by effective, I mean those three engineers become the focal point for an organization.And when I say three—you know, two to five—it doesn't matter whether you're talking about a team of a hundred or you're talking about a team of a thousand. It's always the same number of people. And as you get bigger and bigger, it becomes more and more of a problem. So, does the tooling actually make a difference to them? And you might ask, “Well, what do you expect from the tooling? What do you expect to do for them?” Is it you give them deeper analysis tools? Is it, you know, you do AI Ops? No.The answer is, how do you take the capabilities that those people have and how do you spread it across a larger population of engineers? And that, I think, is one of those key outcomes of observability that no one, whether it be in open-source or the vendor side is really paying a lot of attention to. It's always about, like, “Oh, we can just shove more data in. By the way, we've got petabyte scale and we can deal with, you know, 2 billion active time series, and all these other sorts of vanity measures.” But we've gotten really far away from the outcomes. It's like, “Am I getting return on investment of my observability tooling?”And I think tracing is this—as you've said, it can be difficult to reason about right? And people are not sure. They're feeling, “Well, I'm in a microservices environment; I'm in cloud-native; I need tracing because my older APM tools appear to be failing me. I'm just going to go and wriggle my way through implementing OpenTelemetry.” Which has significant engineering costs. I'm not saying it's not worth it, but there is a significant engineering cost—and then I don't know what to expect, so I'm going to go on through my data somewhere and see whether we can achieve those outcomes.And I do a pilot and my most sophisticated engineers are in the pilot. And they're able to solve the problems. Okay, I'm going to go buy that thing. But I've just transferred my problems. My engineers have gone from solving problems in maybe logs and grepping through petabytes worth of logs to using some sort of complex proprietary query language to go through your tens of petabytes of trace data but actually haven't solved any problem. I've just moved it around and probably just cost myself a lot, both in terms of engineering time and real dollars spent as well.Corey: One of the challenges that I'm seeing across the board is that observability, for certain use cases, once you start to see what it is and its potential for certain applications—certainly not all; I want to hedge that a little bit—but it's clear that there is definite and distinct value versus other ways of doing things. The problem is, is that value often becomes apparent only after you've already done it and can see what that other side looks like. But let's be honest here. Instrumenting an application is going to take some significant level of investment, in many cases. How do you wind up viewing any return on investment that it takes for the very real cost, if only in people's time, to go ahead instrumenting for observability in complex environments?Ian: So, I think that you have to look at the fundamentals, right? You have to look at—pretend we knew nothing about tracing. Pretend that we had just invented logging, and you needed to start small. It's like, I'm not going to go and log everything about every application that I've had forever. What I need to do is I need to find the points where that logging is going to be the most useful, most impactful, across the broadest audience possible.And one of the useful things about tracing is because it's built in distributed environments, primarily for distributed environments, you can look at, for example, the biggest intersection of requests. A lot of people have things like API Gateways, or they have parts of a monolith which is still handling a lot of requests routing; those tend to be areas to start digging into. And I would say that, just like for anyone who's used Prometheus or decided to move away from Prometheus, no one's ever gone and evaluated Prometheus solution without having some sort of Prometheus data, right? You don't go, “Hey, I'm going to evaluate a replacement for Prometheus or my StatsD without having any data, and I'm simultaneously going to generate my data and evaluate the solution at the same time.” It doesn't make any sense.With tracing, you have decent open-source projects out there that allow you to visualize individual traces and understand sort of the basic value you should be getting out of this data. So, it's a good starting point to go, “Okay, can I reason about a single request? Can I go and look at my request end-to-end, even in a relatively small slice of my environment, and can I see the potential for this? And can I think about the things that I need to be able to solve with many traces?” Once you start developing these ideas, then you can have a better idea of, “Well, where do I go and invest more in instrumentation? Look, databases never appear to be a problem, so I'm not going to focus on database instrumentation. What's the real problem is my external dependencies. Facebook API is the one that everyone loves to use. I need to go instrument that.”And then you start to get more clarity. Tracing has this interesting network effect. You can basically just follow the breadcrumbs. Where is my biggest problem here? Where are my errors coming from? Is there anything else further down the call chain? And you can sort of take that exploratory approach rather than doing everything up front.But it is important to do something before you start trying to evaluate what is my end state. End state obviously being sort of nebulous term in today's world, but where do I want to be in two years' time? I would like to have a solution. Maybe it's open-source solution, maybe it's a vendor solution, maybe it's one of those platform solutions we talked about, but how do I get there? It's really going to be I need to take an iterative approach and I need to be very clear about the value and outcomes.There's no point in doing a whole bunch of instrumentation effort in things that are just working fine, right? You want to go and focus your time and attention on that. And also you don't want to go and burn just singular engineers. The observability team's purpose in life is probably not to just write instrumentation or just deploy OpenTelemetry. Because then we get back into the land where engineers themselves know nothing about the monitoring or observability they're doing and it just becomes a checkbox of, “I dropped in an agent. Oh, when it comes time for me to actually deal with an incident, I don't know anything about the data and the data is insufficient.”So, a level of ownership supported by the observability team is really important. On that return on investment, sort of, though it's not just the instrumentation effort. There's product training and there are some very hard costs. People think oftentimes, “Well, I have the ability to pay a vendor; that's really the only cost that I have.” There's things like egress costs, particularly volumes of data. There's the infrastructure costs. A lot of the times there will be elements you need to run in your own environment; those can be very costly as well, and ultimately, they're sort of icebergs in this overall ROI conversation.The other side of it—you know, return and investment—return, there's a lot of difficulty in reasoning about, as you said, what is the value of this going to be if I go through all this effort? Everyone knows a sort of, you know, meme or archetype of, “Hey, here are three options; pick two because there's always going to be a trade off.” Particularly for observability, it's become an element of, I need to pick between performance, data fidelity, or cost. Pick two. And when data fidelity—particularly in tracing—I'm talking about the ability to not sample, right?If you have edge cases, if you have narrow use cases and ways you need to look at your data, if you heavily sample, you lose data fidelity. But oftentimes, cost is a reason why you do that. And then obviously, performance as you start to get bigger and bigger datasets. So, there's a lot of different things you need to balance on that return. As you said, oftentimes you don't get to understand the magnitude of those until you've got the full data set in and you're trying to do this, sort of, for real. But being prepared and iterative as you go through this effort and not saying, “Okay, well, I'm just going to buy everything from one vendor because I'm going to assume that's going to solve my problem,” is probably that undercurrent there.Corey: As I take a look across the entire ecosystem, I can't shake the feeling—and my apologies in advance if this is an observation, I guess, that winds up throwing a stone directly at you folks—Ian: Oh, please.Corey: But I see that there's a strong observability community out there that is absolutely aligned with the things I care about and things I want to do, and then there's a bunch of SaaS vendors, where it seems that they are, in many cases, yes, advancing the state of the art, I am not suggesting for a second that money is making observability worse. But I do think that when the tool you sell is a hammer, then every problem starts to look like a nail—or in my case, like my thumb. Do you think that there's a chance that SaaS vendors are in some ways making this entire space worse?Ian: As we've sort of gone into more cloud-native scenarios and people are building things specifically to take advantage of cloud from a complexity standpoint, from a scaling standpoint, you start to get, like, vertical issues happening. So, you have things like we're going to charge on a per-container basis; we're going to charge on a per-host basis; we're going to charge based off the amount of gigabytes that you send us. These are sort of like more horizontal pricing models, and the way the SaaS vendors have delivered this is they've made it pretty opaque, right? Everyone has experiences, or has jerks about overages from observability vendors' massive spikes. I've worked with customers who have used—accidentally used some features and they've been billed a quarter million dollars on a monthly basis for accidental overages from a SaaS vendor.And these are all terrible things. Like, but we've gotten used to this. Like, we've just accepted it, right, because everyone is operating this way. And I really do believe that the move to SaaS was one of those things. Like, “Oh, well, you're throwing us more data, and we're charging you more for it.” As a vendor—Corey: Which sort of erodes your own value proposition that you're bringing to the table. I mean, I don't mean to be sitting over here shaking my fist yelling, “Oh, I could build a better version in a weekend,” except that I absolutely know how to build a highly available Rsyslog cluster. I've done it a handful of times already and the technology is still there. Compare and contrast that with, at scale, the fact that I'm paying 50 cents per gigabyte ingested to CloudWatch logs, or a multiple of that for a lot of other vendors, it's not that much harder for me to scale that fleet out and pay a much smaller marginal cost.Ian: And so, I think the reaction that we're seeing in the market and we're starting to see—we're starting to see the rise of, sort of, a secondary class of vendor. And by secondary, I don't mean that they're lesser; I mean that they're, sort of like, specifically trying to address problems of the primary vendors, right? Everyone's aware of vendors who are attempting to reduce—well, let's take the example you gave on logs, right? There are vendors out there whose express purpose is to reduce the cost of your logging observability. They just sit in the middle; they are a middleman, right?Essentially, hey, use our tool and even though you're going to pay us a whole bunch of money, it's going to generate an overall return that is greater than if you had just continued pumping all of your logs over to your existing vendor. So, that's great. What we think really needs to happen, and one of the things we're doing at Chronosphere—unfortunate plug—is we're actually building those capabilities into the solution so it's actually end-to-end. And by end-to-end, I mean, a solution where I can ingest my data, I can preprocess my data, I can store it, query it, visualize it, all those things, aligned with open-source standards, but I have control over that data, and I understand what's going on with particularly my cost and my usage. I don't just get a bill at the end of the month going, “Hey, guess what? You've spent an additional $200,000.”Instead, I can know in real time, well, what is happening with my usage. And I can attribute it. It's this team over here. And it's because they added this particular label. And here's a way for you, right now, to address that and cap it so it doesn't cost you anything and it doesn't have a blast radius of, you know, maybe degraded performance or degraded fidelity of the data.That though is diametrically opposed to the way that most vendors are set up. And unfortunately, the open-source projects tend to take a lot of their cues, at least recently, from what's happening in the vendor space. One of the ways that you can think about it is a sort of like a speed of light problem. Everyone knows that, you know, there's basic fundamental latency; everyone knows how fast disk is; everyone knows the, sort of like, you can't just make your computations happen magically, there's a cost of running things horizontally. But a lot of the way that the vendors have presented efficiency to the market is, “Oh, we're just going to incrementally get faster as AWS gets faster. We're going to incrementally get better as compression gets better.”And of course, you can't go and fit a petabyte worth of data into a kilobyte, unless you're really just doing some sort of weird dictionary stuff, so you feel—you're dealing with some fundamental constraints. And the vendors just go, “I'm sorry, you know, we can't violate the speed of light.” But what you can do is you can start taking a look at, well, how is the data valuable, and start giving the people controls on how to make it more valuable. So, one of the things that we do with Chronosphere is we allow you to reshape Prometheus metrics, right? You go and express Prometheus metrics—let's say it's a business metric about how many transactions you're doing as a business—you don't need that on a per-container basis, particularly if you're running 100,000 containers globally.When you go and take a look at that number on a dashboard, or you alert on it, what is it? It's one number, one time series. Maybe you break it out per region. You have five regions, you don't need 100,000 data points every minute behind that. It's very expensive, it's not very performant, and as we talked about earlier, it's very hard to reason about as a human being.So, giving the tools to be able to go and condense that data down and make it more actionable and more valuable, you get performance, you get cost reduction, and you get the value that you ultimately need out of the data. And it's one of the reasons why, I guess, I work at Chronosphere. Which I'm hoping is the last observability [laugh] venture I ever work for.Corey: Yeah, for me a lot of the data that I see in my logs, which is where a lot of this stuff starts and how I still contextualize these things, is nonsense that I don't care about and will never care about. I don't care about load balance or health checks. I don't particularly care about 200 results for the favicon when people visit the site. I care about other things, but just weed out the crap, especially when I'm paying by the pound—or at least by the gigabyte—in order to get that data into something. Yeah. It becomes obnoxious and difficult to filter out.Ian: Yeah. And the vendors just haven't done any of that because why would they, right? If you went and reduced the amount of log—Corey: Put engineering effort into something that reduces how much I can charge you? That sounds like lunacy. Yeah.Ian: Exactly. They're business models entirely based off it. So, if you went and reduced every one's logging bill by 30%, or everyone's logging volume by 30% and reduced the bills by 30%, it's not going to be a great time if you're a publicly traded company who has built your entire business model on essentially a very SaaS volume-driven—and in my eyes—relatively exploitative pricing and billing model.Corey: Ian, I want to thank you for taking so much time out of your day to talk to me about this. If people want to learn more, where can they find you? I mean, you are a Field CTO, so clearly you're outstanding in your field. But if, assuming that people don't want to go to farm country, where's the best place to find you?Ian: Yeah. Well, it'll be a bunch of different conferences. I'll be at KubeCon this year. But chronosphere.io is the company website. I've had the opportunity to talk to a lot of different customers, not from a hard sell perspective, but you know, conversations like this about what are the real problems you're having and what are the things that you sort of wish that you could do?One of the favorite things that I get to ask people is, “If you could wave a magic wand, what would you love to be able to do with your observability solution?” That's, A, a really great part, but oftentimes be being able to say, “Well, actually, that thing you want to do, I think I have a way to accomplish that,” is a really rewarding part of this particular role.Corey: And we will, of course, put links to that in the show notes. Thank you so much for being so generous with your time. I appreciate it.Ian: Thanks, Corey. It's great to be here.Corey: Ian Smith, Field CTO at Chronosphere on this promoted guest episode. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice along with an angry comment, which going to be super easy in your case, because it's just one of the things that the omnibus observability platform that your company sells offers as part of its full suite of things you've never used.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.Announcer: This has been a HumblePod production. Stay humble.

amazon defining cloud id roi b2b ip saas compare metrics api instrument screaming aws java prometheus pretend apis tracing devops kubernetes lambda mongodb ian smith apm cloud native ssh observability datadog new relic ever changing world pagerduty kubecon grpc amazon basics oks corey quinn chronosphere lightstep wavefront cloudwatch duckbill group syslog documentdb chief cloud economist ian no last week in aws humblepod

Randy Shoup on Evolving Architecture at eBay

Software Sessions

Play Episode Listen Later Aug 17, 2022 57:52

This episode originally aired on Software Engineering Radio.Randy Shoup is the VP of Engineering and Chief Architect at eBay. He was previously the VP of Engineering at WeWork and Stitch Fix, a Director of Engineering at Google Cloud where he worked on App Engine, and a Chief Engineer and Distinguished Architect at eBay in 2004. Topics covered: eBay's origins as a single C++ class The five-year migration to Java services Sharing a database between the old and new systems Building a distributed tracing system Working with bare metal Why most companies should stick to cloud Why individual services should own their own data storage How scale has caused solutions to change Rejoining a former company The Accelerate Book Improving delivery time. Related Links:@randyshoupOpenTelemetryLightStepHoneycombAccelerate BookThe MemoValue Stream MappingThe Epic Story of Dropbox's Exodus from the Amazon Cloud EmpireTranscript:[00:00:00] Jeremy: Today, I'm talking to Randy Shoup, he's the VP of engineering and chief architect at eBay.[00:00:05] Jeremy: He was previously the VP of engineering at WeWork and stitch fix, and he was also a chief engineer and distinguished architect at eBay back in 2004. Randy, welcome back to software engineering radio. This will be your fifth appearance on the show. I'm pretty sure that's a record.[00:00:22] Randy: Thanks, Jeremy, I'm really excited to come back. I always enjoy listening to, and then also contributing to software engineering radio.Back at, Qcon 2007, you spoke with Markus Volter he's he was the founder of SE radio. And you were talking about developing eBay's new search engine at the time.[00:00:42] Jeremy: And kind of looking back, I wonder if you could talk a little bit about how eBay was structured back then, maybe organizationally, and then we can talk a little bit about the, the tech stack and that sort of thing.[00:00:53] Randy: Oh, sure. Okay. Yeah. Um, so eBay started in 1995. I just want to like, you know, orient everybody. Same, same as the web. Same as Amazon, same as a bunch of stuff. So E-bay was actually almost 10 years old when I joined. That seemingly very old first time. Um, so yeah. What was ebay's tech stack like then? So E-bay current has gone through five generations of its infrastructure.It was transitioning between the second and the third when I joined in 2004. Um, so the. Iteration was Pierre Omidyar, the founder three-day weekend three-day labor day weekend in 1995, playing around with this new cool thing called the web. He wasn't intending to build a business. He just was playing around with auctions and wanted to put up a webpage.So he had a Perl backend and every item was a file and it lived on this little 486 tower or whatever you had at the time. Um, so that wasn't scalable and wasn't meant to be. The second generation of eBay's architecture was what we called V2 very, you know, creatively, uh, that was a C++ monolith. Um, an ISAPI DLL with essentially well at its worst, which grew to 3.4 million lines of code in that single DLL and basically in a single class, not just in a single, like repo or a single file, but in a single class.So that was very unpleasant to work in. As you can imagine, um, eBay had about a thousand engineers at the time and they were, you know, as you can imagine, like really stepping on each other's toes and not being able to make much forward progress. So starting in, I want to call it 2002. So two years before I joined, um, they were migrating to the creatively named V3 and V3 architecture was Java, and.you know, not microservices, but like we didn't even have that term, but it wasn't even that it was mini applications. So I'm actually going to take a step back. V2 was a monolith. So like all of eBay's code in that single DLL and like that was buying and selling and search and everything. And then we had two monster databases, a primary and a backup big Oracle machines on some hardware that was bigger, you know, bigger than refrigerators and that ran eBay for a bunch of years, before we changed the upper part of the stack, we, um, chopped up the, that single monolithic database into a bunch of, um, domain specific databases or entity specific databases, right?So a set of databases around users, you know, sharded by the user ID could talk about all that. If you want, you know, items again, sharded by item ID transactions, sharded by transaction ID... I think when I joined, it was the several hundred instances of, uh, Oracle databases, um, you know, spread around, but still that monolithic front end.And then in 2002, I wanna say we started migrating into that V3 that I was saying, okay. So that's, uh, that was a rewrite in Java, again, many applications. So you take the front end and instead of having it be in one big unit, it was this, uh, ER file, EAR, file, if run and people remember back to, you know, those stays in Java, um, you know, 220 different of those.So like here is the, you know, one of them for the search pages, you know, so the, you know, one application be the search application and it would, you know, do all the search related stuff, the handful of pages around search, uh, ditto for, you know, the buying area, ditto for the, you know, checkout area, ditto for the selling area...220 of those. Um, and that was again, domain, um, vertically sliced domains. And then the relationship between those V3, uh, applications and the databases was a many to many things. So like many applicants, many of those applications interact with items. So they would interact with those item databases. Many of them would interact with users.And so they would interact with a user databases, et cetera, uh, happy to go into as much gory detail as you want about all that. But like, that's what, uh, but we were in the transition period. You know, when I, uh, between the V2 monolith to the V3 mini applications in, uh, 2004, I'm just going to pause there and like, let me know where you want to take it.[00:05:01] Jeremy: Yeah. So you were saying that it was, um, it started as Perl, then it became a C++, and that's kind of interesting that you said it was all in one class, right? So it's wow. That's gotta be a gigantic [00:05:16] Randy: I mean, completely brutal. Yeah. 3.4 million lines of code. Yeah. We were hitting compiler limits on the number of methods per class.[00:05:22] Jeremy: Oh my gosh.[00:05:23] Randy: I'm, uh, uh, scared that I have that. I happen to know that at least at the time, uh, Microsoft allowed you 16 K uh, methods per class, and we were hitting that limit.So, uh, not great.[00:05:36] Jeremy: So it's just kind of interesting to think about how do you walk through that code, right? You have, I guess you just have this giant file.[00:05:45] Randy: Yeah. I mean, there were, you know, different methods. Um, but yeah, it was a big man. I mean, it was a monolith, it was, uh, you know, it was a spaghetti mess. Um, and you know, as you can imagine, Amazon went through a really similar thing by the way. So this wasn't soup. I mean, it was bad, but like we weren't the only people that were making that, making that a mistake.Um, and just like Amazon, where they were, uh, they did like one update a quarter (laughs) , you know, at that period, like 2000, uh, we were doing something really similar, like very, very slow. Um, you know, updates and, uh, when we moved to V3, you know, the idea was to get to do changes much faster. And we were very proud of ourselves starting in 2004 that we, uh, upgraded the whole site every two weeks.And we didn't have to do the whole site, but like each of those individual applications that I was mentioning, right. Those 220 applications, each of those would roll out on this biweekly cadence. Um, and they had interdependencies. And so we rolled them out in this dependency order in any way, lots of, lots of complexity associated with that.Um, yeah, there you go.[00:06:51] Jeremy: the V3 that, that was written in Java, I'm assuming this was a, as a complete rewrite. You, you didn't use the C++ code at all.[00:07:00] Randy: Yeah. And, uh, it was, um, we migrated, uh, page by page. So, uh, you know, in the transition period, which lasted probably five years, um, there were pages, you know, in the beginning, all pages were served by V2. In the end, all pages are served by V3 and, you know, over time you iterate and you like rewrite in parallel, you know, rewrite and maintain in parallel the V3 version of XYZ page and the V2 version of XYZ page.Um, and then when you're ready, you start to test out at low percentages of traffic, you know, what would, what does V3 look like? Is it correct? And when it isn't do you go and fix it, but then ultimately you migrate the traffic over, um, to fully take, get fully be in the V3 world, and then you, you know, remove or comment out or whatever.The, the code that supported that in the V2 monolith.[00:07:54] Jeremy: And then you had mentioned using Oracle databases. Did you have a set for V2 and a separate V3 and you were kind of trying to keep them in sync?[00:08:02] Randy: Oh, great question. Thank you for asking that question. No, uh, no. We had the databases. Um, so again, as I mentioned, we had pre-demonolith that's my that's a technical term, uh, pre broken up the databases starting in, let's call it 2000. Uh, actually I'm almost certain that's 2000. Cause we had a major site outage in 1999, which everybody still remembers who was there at the time.Uh wasn't me or I wasn't there at the time. Uh, but you know, you can look it up. Uh, anyway, so yeah, starting in 2000, we broke up that monolithic database into what I was telling you before those entity aligned databases. Again, one set for items, one set for users, one set for transactions, you know, dot dot, dot, um, and that division of those databases was shared.You know, those databases were shared between. The three using those things and then V sorry, V2 using those things and V3 using those things. Um, and then, you know, so we've completely decoupled the rewrite of the database, you know, kind of data storage layer from the rewrite of the application layer, if that makes sense.[00:09:09] Jeremy: Yeah. So, so you had V2 that was connecting to these individual Oracle databases. You said like they were for different types of entities, like maybe for items and users and things like that. but it was a shared database situation where V2 was connected to the same database as V3. Is that right?[00:09:28] Randy: Correct and also in V3, even when done. Different V3 applications, were also connecting to the same database, again, like anybody who used user, anybody who used the user entity, which is a lot we're connecting to the user suite of databases and anybody who used the item entity, which again is a lot, um, you were connecting to the item databases, et cetera.So yeah, it was this many to many that's, I'm trying to say many to many relationship between applications in the V3 world and databases.[00:10:00] Jeremy: Okay. Yeah, I think I, I got it because[00:10:03] Randy: It's easier with a diagram.[00:10:04] Jeremy: yeah. W 'cause when you, when you think about services now, um, you think of services having dependencies on other services. Whereas in this case you would have multiple services that rather than talking to a different service, they would all just talk to the same database.They all needed users. So they all needed to connect to the user's database.[00:10:24] Randy: Right exactly. And so, uh, I don't want to jump ahead in this conversation, but like the problems that everybody has, everybody who's feeling uncomfortable at the moment. You're right. To feel uncomfortable because that wasn't unpleasant situation and microservices, or more generally the idea that individual services would own their own data.And only in the only are interactions to the service would be through the service interface and not like behind the services back to the, to the data storage layer. Um, that's better. And Amazon discovered that, you know, uh, lots of people discovered that around that same, around that same early two thousands period.And so yeah, we had that situation at eBay at the time. Uh, it was better than it was before. Right, right. Better than a monolithic database and a monolithic application layer, but it definitely also had issues. Uh, as you can imagine,[00:11:14] Jeremy: you know, thinking about back to that time where you were saying it's better than a monolith, um, what were sort of the trade-offs of, you know, you have a monolith connecting to all these databases versus you having all these applications, connecting to all these databases, like what were the things that you gained and what did you lose if that made sense?[00:11:36] Randy: Hmm. Yeah. Well, I mean, why we did it in the first place is develop is like isolation between development teams right? So we were looking for developer productivity or the phrase we used to use was feature velocity, you know, so how quickly would we be able to move? And to the extent that we could move independently, you know, the search team could move independently from the buying team, which could move independently from the selling team, et cetera.Um, that was what we were gaining. Um, what were we losing? Uh, you know, when you're in a monolith situation, If there's an issue, you know, where it is, it's in the monolith. You might not know where in the monolith. Um, but like there's only one place that could be. And so an issue that one has, uh, when you break things up into smaller units, uh, especially when they have this, you know, shared, shared mutable state, essentially in the form of these databases, like who changed that column?What, you know, what's the deal. Uh, actually we did have a solution for that or something that really helped us, which was, um, now 20, more than 20 years ago, we had something that we would now call distributed tracing where, uh, actually I talked about this way back in the 2007 thing, cause it was pretty cool, uh, at the time, uh, You know, just like the spans one would create using a modern distributed tracing, you know, open telemetry or, you know, any of the disruptive tracing vendors.Um, just like you would do that. We, we didn't use the term span, but that same idea where, um, we could, and the goal was the same to like debug stuff. So, uh, every time we were about to make a database call, we would say, Hey, I'm about to make this data, you know, we would log we about to make this database call and then it would happen.And then we would log whether it was successful or not successful. We could see how long it took, et cetera. Um, and so we built our own, you know, monitoring system, which, which we called central application logging or CAL, uh, totally proprietary to eBay. I'm happy to talk about whatever gory details you want to know about that, but it was pretty cool certainly way back in 2000.It was, and that was our mitigation against the thing I'm telling you, which is, you know, when something, when not. Something is weird in the database. We can kind of back up and figure out where it might've happened, or things are slow. What's, you know, what's the deal. And, uh, you know, cause sometimes the database is slow for reasons.Um, and what, which, what thing is, you know, from an application perspective, I'm talking to 20 different databases, but things are slow. Like what is it? And, um, CAL helped us to, to figure out both elements of that, right? Like what applications are talking to, what databases and what backend services and like debug and diagnose from that perspective.And then for a given application, what, you know, databases in backend services are you talking to? And, um, debug that. And then we have the whole, and then we, um, we, we had monitors on those things and we would notice when databases would, where be a lot of errors or where, when database is starting in slower than they used to be.Um, and then. We implemented what people would now call circuit breakers, where we would notice that, oh, you know, everybody who's trying to talk to database 1, 2, 3, 4 is seeing it slow down. I guess 1, 2, 3, 4 is unhappy. So now flip everybody to say, don't talk to 1, 2, 3, 4, and like, just that kind of stuff.You're not going to be able to serve. Uh, but whatever, that's better than stopping everything. So I hope that makes sense. Like, you know, so all these, all these like modern resilience techniques, um, we always had, we had our own proprietary names for them, but you know, we, we implemented a lot of them way back when,[00:15:22] Jeremy: Yeah. And, and I guess just to contextualize it for the audience, I mean, this was back in 2004. Oh it back in 2000.[00:15:32] Randy: Again, because we had this, sorry to interrupt you because we have, the problem is that we were just talking about where application many applications are talking to many services and databases and we didn't know what was going on. And so we needed some visibility into what was going on.Sorry, go ahead.[00:15:48] Jeremy: yeah. Okay. So all the way back in 2000, there's a lot less, Services out there, like nowadays you think about so many software as a service products. if you were building the same thing today, what are some of the services that people today would just go and say like, oh, I'll just, I'll just pay for this and have this company handle it for me. You know, that wasn't available, then[00:16:10] Randy: sure. Well, there. No, essentially, no. Well, there was no cloud cloud didn't happen until 2006. Um, and there were a few software as a service vendors like Salesforce existed at the time, but they weren't usable in the way you're thinking of where I could give you money and you would operate a technical or technological software service on my behalf.Do you know what I mean? So we didn't have any of the monitoring vendors. We didn't have any of the stuff today. So yeah. So what would we do, you know, to solve that specific problem today? Uh, I would, as we do today, I would, uh, instrument everything with open telemetry because that's generic. Thank you, Ben Siegelman and LightStep for starting that whole open sourcing process, uh, of that thing and, and, um, getting all the vendors to, you know, respect it.Um, and then I would shoot, you know, for my backend, I would choose one of the very many wonderful, uh, you know, uh, distributed tracing vendors of which there are so many, I can't remember, but like LightStep is one honeycomb... you know, there were a bunch of, uh, you know, backend, um, distributed tracing vendors in particular, you know, for that.Uh, what else do you have today? I mean, we could go on for hours on this one, but like, we didn't have distributed logging or we didn't have like logging vendors, you know? So there was no, uh, there was no Splunk, there was no, um, you know, any, any of those, uh, any of the many, uh, distributed log, uh, or centralized logging vendor, uh, vendors.So we didn't have any of those things. We didn't. like caveman, you know, we rent, we, uh, you know, had our own data. We built our own data centers. We racked our own servers. We installed all the OSS in them, you know, uh, by the way, we still do all that because it's way cheaper for us at our scale to do that.But happy to talk about that too. Uh, anyway, but yeah, no, the people who live in, I don't know if this is where you want to go in 2022, the software developer has this massive menu of options. You know, if you only have a credit card, uh, and it doesn't usually cost that much, you can get a lot of stuff done from the cloud vendors, from the software service vendors, et cetera, et cetera.And none of that existed in 2000.[00:18:31] Jeremy: it's really interesting to think about how different, I guess the development world is now. Like, cause you mentioned how cloud wasn't even really a thing until 2006, all these, these vendors that people take for granted. Um, none of them existed. And so it just, uh, it must've been a very, very different time.[00:18:52] Randy: Well, we didn't know. It was every, every year is better than the previous year, you know, in software every year. You know? So at that time we were really excited that we had all the tools and capabilities that, that we did have. Uh, and also, you know, you look back from, you know, 20 years in the future and, uh, you know, it looks caveman, you know, from that perspective.But, uh, it was, you know, all those things were cutting edge at the time. What happened really was the big companies rolled their own, right. Everybody, you know, everybody built their own data centers, rack their own servers. Um, so at least at scale and the best you could hope for the most you could pay anybody else to do is rack your servers for you.You know what I mean? Like there were external people, you know, and they still exist. A lot of them, you know, the Rackspaces you know Equinixes, et cetera of the world. Like they would. Have a co-location facility. Uh, and you, you know, you ask them please, you know, I'd like to buy the, these specific machines and please rack these specific machines for me and connect them up on the network in this particular way.Um, that was the thing you could pay for. Um, but you pretty much couldn't pay them to put software on there for you. That was your job. Um, and then operating. It was also your job, if that makes sense.[00:20:06] Jeremy: and then back then, would that be where. Employees would actually have to go to the data center and then, you know, put in their, their windows CD or their Linux CD and, you know, actually do everything right there.[00:20:18] Randy: Yeah. 100%. Yeah. In fact, um, again, anybody who operates data centers, I mean, there's more automation, but the conceptually, when we run three data centers ourselves at eBay right now, um, and all of our, all of our software runs on them. So like we have physical, we have those physical data centers. We have employees that, uh, physically work in those things, physical.Rack and stack the servers again, we're smarter about it now. Like we buy a whole rack, we roll the whole rack in and cable it, you know, with one big chunk, uh, sound, uh, as distinct from, you know, individual wiring and the networks are different and better. So there's a lot less like individual stuff, but you know, at the end of the day, but yeah, everybody in quotes, everybody at that time was doing that or paying somebody to do exactly that.Right. Yeah.[00:21:05] Jeremy: Yeah. And it's, it's interesting too, that you mentioned that it's still being done by eBay. You said you have three, three data centers. because it seems like now maybe it's just assumed that someone's using a cloud service or using AWS or whatnot. And so, oh, go ahead.[00:21:23] Randy: I was just going to say, well, I'm just going to riff off what you said, how the world has changed. I mean, so much, right? So. Uh, it's fine. You didn't need to say my whole LinkedIn, but like I used to work on Google cloud. So I've been, uh, I've been a cloud vendor, uh, at a bunch of previous companies I've been a cloud consumer, uh, at stitch fix and we work in other places.Um, so I'm fully aware, you know, fully, fully, personally aware of, of all that stuff. But yeah, I mean, there's this, um, you know, eBay is in the, uh, eBay is at the size where it is actually. Cost-effective very, cost-effective, uh, can't tell you more than that, uh, for us to operate our own, um, uh, our own infrastructure, right?So, you know, you know, one would expect if Google didn't operate their own infrastructure, nobody would expect Google to use somebody else's right. Like that, that doesn't make any economic sense. Um, and, uh, you know, Facebook is in the same category. Uh, for a while, Twitter and PayPal have been in that category.So there's like this clap, you know, there are the known hyperscalers, right. You know, the, the Google, Amazon, uh, Microsoft that are like cloud vendors in addition to consumers internally have their own, their own clouds. Um, and then there's a whole class of other, um, places that operate their own internal clouds in quotes.Uh, but don't offer them externally and again, uh, Facebook or Meta, uh, you know, is one example. eBay's another, you know, there's a, I'm making this up. Dropbox actually famously started in the cloud and then found it was much cheaper for them to operate their own infrastructure again, for the particular workloads that they had.Um, so yeah, there's probably, I'm making this up. Let's call it two dozen around the world of these, I'm making this term up many hyperscalers, right? Like self hyperscalers or something like that. And eBay's in that category.[00:23:11] Jeremy: I know this is kind of a, you know, a big what if, but you were saying how once you reach a certain scale, that's when it makes sense to move into your own data center. And, uh, I'm wondering if, if E-bay, had started more recently, like, let's say in the last, you know, 10 years, I wonder if it would've made sense for it to start on a public cloud and then move to, um, you know, its own infrastructure after it got bigger, or if you know, it really did make sense to just start with your own infrastructure from the start.[00:23:44] Randy: Oh, I'm so glad you asked that. Um, the, the answer is obvious, but like, I'm so glad you asked that because I love to make this point. No one should ever, ever start by building your own servers and your own (laughs) cloud. Like, No, there's be, uh, you should be so lucky (laughs) after years and years and years that you outgrow the cloud vendors.Right. Um, it happens, but it doesn't happen that often, you know, it happens so rarely that people write articles about it when it happens. Do you know what I mean? Like Dropbox is a good example. So yes, 100% anytime. Where are we? 2022. Any time in, more than the last 10 years? Um, yeah, let's call it. Let's call it 2010, 2012.Right. Um, when cloud had proved itself over and you know, many times over, um, anybody who starts since that time should absolutely start in the public cloud. There's no argument about it. Uh, and again, one should be so lucky that over time, you're seeing successive zeros added to your cloud bill, and it becomes so many zeros that it makes sense to shift your focus toward building and operating your own data centers.That's it. I haven't been part of that transition. I've been the other way, you know, at other places where, you know, I've migrated from owned data centers and colos into, into public cloud. Um, and that's the, that's the more common migration. And again, there are, there are a handful, maybe not even a handful of, uh, companies that have migrated away, but when they do, they've done all the math, right.I mean, uh, Dropbox has done some great, uh, talks and articles about, about their transition and boy, the math makes sense for them. So, yeah.[00:25:30] Jeremy: Yeah. And it also seems like maybe it's for certain types of businesses where moving off of public cloud. Makes sense. Like you mentioned Dropbox where so much of their business is probably centered around storage or centered around, you know, bandwidth and, you know, there's probably certain workloads that it's like need to leave public cloud earlier.[00:25:51] Randy: Um, yeah, I think that's fair. Um, I think that, I think that's a, I think that's an insightful comment. Again, it's all about the economics at some point, you know, it's a big investment to, uh, uh, and it takes years to develop the intern, forget the money that you're paying people, but like just to develop the internal capabilities.So they're very specialized skill sets around building an operating data centers. So like it's a big deal. Um, and, uh, yeah. So are there particular classes of workloads where you would for the same dollar figure or whatever, uh, migrate earlier or later? I'm sure that's probably true. And again, what can absolutely imagine?Well, when they say Dropbox in this example, um, yeah, it's because like they, they need to go direct to the storage. And then, I mean, like, they want to remove every middle person, you know, from the flow of the bytes that are coming into the storage media. Um, and it makes perfect sense for, for them. And when I understood what they were doing, which was a number of years ago, they were hybrid, right. So they had, they had completely, you know, they kept the top, you know, external layer, uh, in public cloud. And then the, the storage layer was all custom. I don't know what they do today, but people could check.[00:27:07] Jeremy: And I'm kind of coming back to your, your first time at eBay. is there anything you felt that you would've done differently with the knowledge you have now?but with the technology that existed, then.[00:27:25] Randy: Gosh, that's the 20, 20 hindsight. Um, the one that comes to mind is the one we touched on a little bit, but I'll say it more starkly, the. If I could, if I could go back in time 20 years and say, Hey, we're about to do this V3 transition at eBay. I would not. I would have had us move directly to what we would now call microservices in the sense that individual services own their own data storage and are only interacted with through the public interface.Um, there's a famous Amazon memo around that same time. So Amazon did the transition from a monolith into what we would now call microservices over about a four or five-year period, 2000 to 2005. And there was a famous Jeff Bezos memo from the early part of that, where, you know, seven, you know, requirements I can't remember them, but you know, essentially it was, you may, you may, you may never, you may never talk to anybody else's database. You may only interact with other services through their public interfaces. I don't care what those public interfaces are, so they didn't standardize around. You know, CORBA or JSON or GRPC, which didn't exist at the time, you know, like they didn't standardize around any, any particular, uh, interaction mechanism, but you did need to again, have this kind of microservice capability, that's modern terminology, um, uh, where, you know, the only services own their own data and nobody can talk in the back door.So that is the one architectural thing that I wish, you know, with 2020 hindsight, uh, that I would bring back in my time travel to 20 years ago, because that would help. That does help a lot. And to be fair, Amazon, um, Amazon was, um, pioneering in that approach and a lot of people internally and externally from Amazon, I'm told, didn't think it would work, uh, and it, and it did famously.So that's, that's the thing I would do.[00:29:30] Jeremy: Yeah. I'm glad you brought that up because, when you had mentioned that, I think you said there were 220 applications or something like that at certain scales, people might think like, oh, that sounds like microservices to me. But when you, you mentioned that microservice to you means it having its own data store.I think that's a good distinction.[00:29:52] Randy: Yeah. So, um, I talk a lot about microservices that have for, for a decade or so. Yeah. I mean, several of the distinguishing characteristics are the micro in microservices is size and scope of the interface, right? So you can have a service oriented architecture with one big service, um, or some very small number of very large services.But the micro in microservice means this thing does, maybe it doesn't have one operation, but it doesn't have a thousand. The several or the handful or several handfuls of operations are all about this one particular thing. So that's the one part of it. And then the other part of it that is critical to the success of that is owning the, owning your own data storage.Um, so each service, you know, again, uh, it's hard to do this with a diagram, but like imagine, imagine the bubble of the service surrounding the data storage, right? So like people, anybody from the outside, whether they're interacting synchronously, asynchronously, messaging, synchronous, whatever HTTP doesn't matter are only interacting to the bubble and never getting inside where the, uh, where the data is I hope that makes sense.[00:31:04] Jeremy: Yeah. I mean, I mean, it's a kind of in direct contrast to before you're talking about how you had all these databases that all of these services shared. So it was probably hard to kind of keep track of, um, who had modified data. Um, you know, one service could modify it, then another service control to get data out and it's been changed, but it didn't change it.So it could be kind of hard to track what's going on.[00:31:28] Randy: Yeah, exactly. Inner integration at the database level is something that people have been doing since probably the 1980s. Um, and so again, I, you know, in retrospect it looks like caveman approach. Uh, it was pretty advanced at the time, actually, even the idea of sharding of, you know, Hey, there are users and the users live in databases, but they don't all live in the same one.Uh, they live in 10 different databases or 20 different databases. And then there's this layer that. For this particular user, it figures out which of the 20 databases it's in and finds it and gets it back. And, um, you know, that was all pretty advanced. And by the way, that's all those capabilities still exist.They're just hidden from everybody behind, you know, nice, simple, uh, software as a service, uh, interfaces anyway, but that takes nothing away from your excellent point, which is, yeah. It's, you know, when you're, again, when there's many to many to relations, when there is this many to many relationship between, um, uh, applications and databases, uh, and there's shared mutable state in those databases that when is shared, like that's bad, you know, it's not bad to have state.It's not bad to have mutable state it's bad to have shared beautiful state.[00:32:41] Jeremy: Yeah. And I think anybody who's kind of interested in learning more about the, you had talked about sharding and things like that. If they go back and listen to your, your first appearance on software engineering radio, um, yeah. It kind of struck me how you were talking about sharding and how it was something that was kind of unique or unusual.Whereas today it feels like it's very, I don't know, if quaint is the right word, but it's like, um, it's something that, that people kind of are accustomed to now.[00:33:09] Randy: Yeah. Yeah. Um, it's obvious. Um, it seems obvious in retrospect. Yeah. You know, at the time, and by the way, he didn't invent charting. As I said, in 2007, you know, Google and Yahoo and, uh, Amazon, and, you know, it was the obvious, it took a while to reach it, but it's one of those things where once, once people have the, you know, brainwave to see, oh, you know what, we don't actually have to stop store this in one, uh, database.We can, we can chop that database up into, you know, into chunks. And that, that looks similar to that herself similar. Um, yeah, that was, uh, that was, uh, that was reinvented by lots of, uh, Lots of the big companies at the same time again, because everybody was solving that same problem at the same time. Um, but yeah, when you look back and you, I mean, like, and honestly, like everything that I said there, it's still like this, all the techniques about how you shard things.And there's lots of, you know, it's not interesting anymore because the problems have been solved, but all those solutions are still the solutions, if that makes any sense, but you know,[00:34:09] Jeremy: Yeah, for sure. I mean, I think anybody who goes back and listens to it. Yeah. Like you said, it's, it's, it's very interesting because it's. it all still applies and it's like, I think the, the solutions that are kind of interesting to me are ones where it's, it's things that could have been implemented long ago, but we just later on realized like, this is how we could do it.[00:34:31] Randy: Well part of it is, as we grow as an industry, we just, we discover new problems. You know, we, we get to the point where, you know, sharding over databases has only a problem when one database doesn't work. You know, when it, when you're the load that you put on that database is too big, or you want the availability of, you know, multiple.Um, and so that's not a, that's not a day one problem, right? That's a day two or day 2000 and kind of problem. Right. Um, and so a lot of these things, yeah, well, you know, it's software. So like we could have done, we could have done any of these things in older languages and older operating systems and with older technology.But for the most part, we didn't have those problems or we didn't have them at sufficiently enough. People didn't have the problem that we, you know, um, for us to have solved it as an industry, if that makes any sense.[00:35:30] Jeremy: yeah, no, that's a good point because you think about when Amazon first started and it was just a bookstore, right. And the number of people using the site where, uh, who knows it was, it might've been tens a day or hundreds a day. I don't, I don't know. And, and so, like you said, the problems that Amazon has now in terms of scale are just like, it's a completely different world than when they started.[00:35:52] Randy: Yeah. I mean, probably I'm making it up, but I don't think that's too off to say that it's a billion times more, their problems are a billion fold. You know, what they, what they were[00:36:05] Jeremy: the next thing I'd like to talk about is you came back to eBay I think about has it been about two years ago.[00:36:14] Randy: Two years yeah.[00:36:15] Jeremy: Yeah. And, and so, so tell me about the experience of coming back to an organization that you had been at, you know, 10 years prior or however long it was like, how is your onboarding different when it's somewhere you've been before?[00:36:31] Randy: Yeah. Sure. So, um, like, like you said, I worked at eBay from 2004 to 2011. Um, and I worked in a different role than I have today. I've worked mostly on eBay search engine. Um, and then, uh, I left to co-found a startup, which was in the 99%. So the one, you know, like didn't really do much. Uh, I joined, I worked at Google in the early days of Google cloud, as I mentioned on Google app engine and had a bunch of other roles including more recently, like you said, stitch fix and we work, um, leading those engineering teams.And, um, so yeah, coming back to eBay as chief architect and, and, you know, leading. Developer platform, essentially a part of eBay. Um, yeah. What was the onboarding like? I mean, lots of things had changed, you know, in the, in the intervening 10 years or so. Uh, and lots had stayed the same, you know, not in a bad way, but just, you know, uh, some of the technologies that we use today are still some of the technologies we used 10 years ago, a lot has changed though.Um, a bunch of the people are still around. So there's something about eBay that, um, people tend to stay a long time. You know, it's not really very strange for people to be at eBay for 20 years. Um, in my particular team of let's call it 150, there are four or five people that have crossed their 20 year anniversary at the company.Um, and I also re I rejoined with a bunch of other boomerangs as the term we use internally. So it's, you know, the, um, including the CEO, by the way. So sort of bringing the band back together, a bunch of people that had gone off and worked at it, but at other places have, have come back for various reasons over the last couple of.So it was both a lot of familiarity, a lot of unfamiliarity, a lot of familiar faces. Um, yup.[00:38:17] Jeremy: So, I mean, having these people who you work with still be there and actually coming back with some of those people, um, what were some of the big, I guess, advantages or benefits you got from, you know, those existing connections?[00:38:33] Randy: Yeah. Well, I mean, as with all things, you know, imagine, I mean, everybody can imagine like getting back together with friends that they had from high school or university, or like you had some people had some schooling at some point and like you get back together with those friends and there's this, you know, there's this implicit trust in most situations of, you know, because you went through a bunch of stuff together and you knew each other, uh, you know, a long time.And so that definitely helps, you know, when you're returning to a place where again, there are a lot of familiar faces where there's a lot of trust built up. Um, and then it's also helpful, you know, eBay's a pretty complicated place and it's 10 years ago, it was too big to hold in any one person's head and it's even harder to hold it in one person said now, but to be able to come back and have a little bit of that, well, more than a little bit of that context about, okay, here's how eBay works.And here, you know, here are the, you know, unique complexities of the marketplace cause it's very unique, you know, um, uh, in the world. Um, and so, yeah, no, I mean, it was helpful. It's helpful a lot. And then also, you know, in my current role, um, uh, my, my main goal actually is to just make all of eBay better, you know, so we have about 4,000 engineers and, you know, my team's job is to make all of them better and more productive and more successful and, uh, being able to combine.Knowing what eBay, knowing the context about eBay and having a bunch of connections to the people that, you know, a bunch of the leaders there, uh, here, um, combining that with 10 years of experience doing other things at other places, you know, that's helpful because you know, now there are things that we do at eBay that, okay, well there, you know, you know, that this other place is doing, this has that same problem and is solving it in a different way.And so maybe we should, you know, look into that option. So,[00:40:19] Jeremy: so, so you mentioned just trying to make developers, work or lives easier. you start the job. How do you decide what to tackle first? Like how do you figure out where the problems are or what to do next?[00:40:32] Randy: yeah, that's a great question. Um, so, uh, again, my, uh, I lead this thing that we internally called the velocity initiative, which is about just making us, giving us the ability to deliver. Features and bug fixes more quickly to customers. Right. And, um, so what do I figure for that problem? How can we deliver things more quickly to customers and improve, you know, get more customer value and business value?Uh, what I did, uh, with, in collaboration with a bunch of people is what one would call a value stream map. And that's a term from lean software and lean manufacturing, where you just look end to end at a process and like say all the steps and how long those steps take. So a value stream, as you can imagine, like all these steps that are happening at the end, there's some value, right?Like we produced some, you know, feature or, you know, hopefully gotten some revenue or like helped out the customer and the business in some way. And so value, you know, mapping that value stream. That's what it means. And, um, Looking for you look at that. And when you can see the end-to-end process, you know, and like really see it in some kind of diagram, uh, you can look for opportunities like, oh, okay, well, you know, if it takes us, I'm making this effort, it takes us a week from when we have an idea to when it shows up on the site.Well, you know, some of those steps take five minutes. That's not worth optimizing, but some of those steps take, you know, five days and that is worth optimizing. And so, um, getting some visibility into the system, you know, looking end to end with some, with a kind of view of the system systems thinking, uh, that will give you the, uh, the knowledge about, or the opportunities about we know what can be improved.And so that's, that's what we did. And we didn't talk with all 4,000, you know, uh, engineers are all, you know, whatever, half a thousand teams or whatever we had. Um, but we sampled. And after we talked with three teams who were already hearing a bunch of the same things, you know, so we were hearing in the whole product life cycle, which I like to divide into four stages.I'd like to say, there's planning. How does an idea become a project or a thing that people work on a software development? How does a project or become committed code software delivery? How does committed code become a feature that people actually use? And then what I call post release iteration, which is okay, it's now there are out there on the site and we're turning it on and off for individual users.We're learning in analytics and usage in the real world and, and experimenting. And so there were opportunities that eBay at all, four of those stages, um, which I'm happy to talk about, but what we ended up seeing again and again, uh, is that that software delivery part was our current bottleneck. So again, that's the, how long does it take from an engineer when she commits her code to, it shows up as a feature on the site.And, you know, before we started the work. You know, two years ago before we started the work that I've been doing for the last two years with a bunch of people, um, on average and eBay was like a week and a half. So, you know, it'd be a week and a half between when someone's finished and then, okay. It gets code reviewed and, you know, dot, dot, dot it gets rolled out.It gets tested, you know, all that stuff. Um, it was, you know, essentially 10 days. And now for the teams that we've been working with, uh, it's down to two. So we used a lot of, um, what people may be familiar with, uh, the accelerate book. So it's called accelerate by Nicole Forsgren. Um, Jez humble and Gene Kim, uh, 2018, like if there's one book anybody should read about software engineering, it's that?Uh, so please read accelerate. Um, it summarizes almost a decade of research from the state of DevOps reports, um, which the three people that I mentioned led. So Nicole Forsgren, you know, is, uh, is a doctor, uh, you know, she's a PhD and, uh, data science. She knows how to do all this stuff. Um, anyway, so, uh, that when your, when your problem happens to be software delivery.The accelerate book tells you all the kind of continuous delivery techniques, trunk based development, uh, all sorts of stuff that you can do to, to solve that, uh, solve those problems. And then there are also four metrics that they use to measure the effectiveness of an organization, software delivery. So people might be familiar with, uh, there's deployment frequency.How often are we deploying a particular application lead time for change? That's that time from when a developer commits her code to when it shows up on the site, uh, change failure rate, which is when we deploy code, how often do we roll it back or hot fix it, or, you know, there's some problem that we need to, you know, address.Um, and then, uh, meantime to re uh, meantime to restore, which is when we have one of those incidents or problems, how, how quickly can we, uh, roll it back or do that hot fix? Um, and again, the beauty of Nicole Forsgren research summarized in the accelerate book is that the science shows that companies cluster, in other words, Mostly the organizations that are not good at, you know, deployment frequency and lead time are also not good at the quality metrics of, uh, meantime to restore and change failure rate and the companies that are excellent at, you know, uh, deployment frequency and lead time are also excellent at meantime, to recover and, uh, change failure rate.Um, so companies or organizations, uh, divided into these four categories. So there's a low performers, medium performers, high performers, and then elite performers. And, uh, eBay was solidly eBay on average at the time. And still on average is solidly in that medium performer category. So, uh, and what we've been able to do with the teams that we've been working with is we've been able to move those teams to the high category.So just super brief. Uh, and I w we'll give you a chance to ask you some more questions, but like in the low category, all those things are kind of measured in months, right. So how long, how often are we deploying, you know, measure that in months? How long does it take us to get a commit to the site? You know, measure that in months, you know, um, where, and then the low performer, sorry.Uh, the medium performers are like, everything's measured in weeks, right? So like, if we were deploy, you know, couple, you know, once every couple of weeks or once a week, uh, lead time is measured in weeks, et cetera. The, uh, the high-performers things are measured in days and the elite performance things are measured in hours.And so you can see there's like order of magnitude improvements when you go from, you know, when you move from one of those kind of clusters to another cluster. Anyway. So what we were focused on again, because our problem was software delivery was moving a whole, a whole set of teams from that medium performer category where things are measured in weeks to the, uh, high-performer category, where things are missing.[00:47:21] Jeremy: throughout all this, you said the, the big thing that you focused on was the delivery time. So somebody wrote code and, they felt that it was ready for deployment, but for some reason it took 10 days to actually get out to the actual site. So I wonder if you could talk a little bit about, uh, maybe a specific team or a specific application, where, where, where was that time being spent?You know, you, you said you moved from 10 days to two days. What, what was happening in the meantime?[00:47:49] Randy: Yeah, no, that's a great question. Thank you. Um, yeah, so, uh, okay, so now, so we, we, we looked end to end of the process and we found that software delivery was the first place to focus, and then there are other issues in other areas, but we'll get to them later. Um, so then for, um, to improve software delivery, now we asked individual teams, we, we, we did something like, um, you know, some conversation like I'm about to say, so we said, hi, it looks like you're deploying kind of once or twice.If I, if I told you, you had to deploy once a day, tell me all the reasons why that's not going to work. And the teams are like, oh, of course, well, it's a build times take too long. And the deployments aren't automated and you know, our testing is flaky. So we have to retry it all the time and, you know, dot, dot, dot, dot, dot.And we said, great, you just gave my team, our backlog. Right. So rather than, you know, just coming and like, let's complain about it. Um, which the teams work it's legit for them to complain. Uh, I was a, you know, we were able, because again, the developer program or sorry, the developer platform, you know, is as part of my team, uh, we said, great, like you just gave us, you just told us all the, all your top, uh, issues or your impediments, as we say, um, and we're going to work on them with you.And so every time we had some idea about, well, I bet we can use Canary deployments to automate the deployment which we have now done. We would pilot that with a bunch of teams, we'd learn what works and doesn't work. And then we would roll that out to everybody. Um, So what were the impediments like? It was a little bit different for each individual team, but in some, it was, uh, the things we ended up focusing on or have been focusing on our build times, you know, so we build everything in Java still.Um, and, uh, even though we're generation five, as opposed to that generation three that I mentioned, um, still build times for a lot of applications we're taking way too long. And so we, we spend a bunch of time improving those things and we were able to take stuff from, you know, hours down to, you know, single digit minutes.So that's a huge improvement to developer productivity. Um, we made a lot of investment in our continuous delivery pipelines. Um, so making all the, making all the automation around, you know, deploying something to one environment and checking it there and then deploying it into a common staging environment and checking it there and then deploying it from there into the production environment.And, um, and then, you know, rolling it out via this Canary mechanism. We invested a lot in something that we call traffic mirroring, which is a, we didn't invent. Other T other places have a different name for this? I don't know if there's a standard industry name. Some people call it shadowing, but the idea is I have a change that I'm making, which is not intended to change the behavior.Like a lots of changes that we make, bug fixes, et cetera, uh, upgrading to new, you know, open source, dependencies, whatever, changing the version of the framework. There's a bunch of changes that we make regularly day-to-day as developers, which are like, refactorings kind of where we're not actually intending to change the behavior.And so a tra traffic mirroring was our idea of. You have the old code that's running in production and you, and you fire a request, a production request at that old code and it responds, but then you also fire that request at the new version and compare the results, you know, did the same, Jason come back, you know, between the old version and the new version.Um, and that's, that's a great way kind of from the outside to sort of black box detect any unintended changes in the, in the behavior. And so we definitely leveraged that very, very aggressively. Um, we've invested in a bunch of other bunch of other things, but, but all those investments are driven by what does the team, what do the particular teams tell us are getting in their way?And there are a bunch of things that the teams themselves have, you know, been motivated to do. So my team's not the only one that's making improvements. You know, teams have. Reoriented, uh, moved, moved from branching development to trunk based development, which makes a big difference. Um, making sure that, uh, PR approvals and like, um, you know, code reviews are happening much more regularly.So like right after, you know, a thing that some teams have started doing is like immediately after standup in the morning, everybody does all the code reviews that you know, are waiting. And so things don't drag on for, you know, two, three days, cause whatever. Um, so there's just like a, you know, everybody kind of works on that much more quickly.Um, teams are building their own automations for things like testing site speed and accessibility and all sorts of stuff. So like all the, all the things that, you know, a team goes through in the development and roll out of their software, they were been spending a lot of time automating and making, making a leaner, making more efficient.[00:52:22] Jeremy: So, so some of those, it sounds like the PR example is really, on the team. Like you're, you're telling them like, Hey, this is something that you internally should change how you work. for things like improving the build time and things like that. Did you have like a separate team that was helping these teams, you know, speed that process up? Or what, what was that [00:52:46] Randy: like?Yeah. Great. I mean, and you did give to those two examples are, are like you say, very different. So I'm going to start from, we just simply showed everybody. Here's your deployment frequency for this application? Here's your lead time for this application? Here's your change failure rate. And here's your meantime to restore.And again, as I didn't mention before. All of the state of DevOps research and the accelerate book prove that by improving those metrics, you get better engineering outcomes and you also get better business outcomes. So like it's scientifically proven that improving those four things matters. Okay. So now we've shown to teams, Hey, you're we would like you to improve, you know, for your own good, but you know, more broadly at eBay, we would like the deployment frequency to be faster.And we would like the lead time to be shorter. And the insight there is when we deploy smaller units of work, when we don't like batch up a week's worth of work, a month's worth of work, uh, it's much, much less risky to just deploy like an hour's worth of work. Right. And the, and the insight is the hours worth of work fits in your head.And if you roll it out and there's an issue. First off rolling backs, no big deal. Cause you only, you know, not, you've only lost an hour of work for a temporary period of time, but also like you never have this thing, like what in the world broke? Cause like with a month's worth of work, there's a lot of things that changed and a lot of stuff that could break, but with an hour's worth of work, it's only like one change that you made.So, you know, when, if something happens, like it's pretty much, pretty much guaranteed to be that thing anyway, that's the back. Uh, that's the backstory. And um, and so yeah, we were just working with individual teams. Oh yeah. So they were, the teams were motivated to like, see what's the biggest bang for the buck in order to improve those things.Like how can we improve those things? And again, some teams were saying, well, you know what, a huge component of our, of that lead time between when somebody commits and it's, it's a feature on the site, a huge percentage of that. Maybe multiple days, it's like waiting for somebody to code review. Okay, great.We can just change our team kind of agreements and our team behavior to make that happen. And then yes, to answer your question about. Were the other things like building the Canary capability and traffic mirroring and build time improvements. Those were done by central, uh, platform and infrastructure teams, you know, some of which were in my group and some of which are in peer peer groups, uh, in, in my part of the organization.So, yeah, so I mean like providing the generic tools and, you know, generic capabilities, those are absolutely things that a platform organization does. Like that's our job. Um, and you know, we did it. And, uh, and then there are a bunch of other things like that around kind of team behavior and how you approach building a particular application that are, are, and should be completely in the control of the individual teams.And we were trying not to be, not trying not to be, we were definitely not being super prescriptive. Like we didn't come in and we say, we didn't come in and say, alright, by next, by next Tuesday, we want you to be doing trunk based development by, you know, the Tuesday after that, we want to see test-driven development, you know, dot, dot, Um, we would just offer to teams, you know, hear it.Here's where you are. Here's where we know you can get, because like we work with other teams and we've seen that they can get there. Um, you know, they just work together on, well, what's the biggest bang for the buck and what would be most helpful for that team? So it's like a menu of options and you don't have to take everything off the menu, if that makes sense.[00:56:10] Jeremy: And, and how did that communication flow from you and your team down to the individual contributor? Like you have, I'm assuming you have engineering managers and technical leads and all these people sort of in the chain. How does it[00:56:24] Randy: Yeah, thanks for asking that. Yeah. I didn't really say how we work as an initiative. So every, um, so there are a bunch of teams that are involved. Um, and we have, uh, every Monday morning, so, uh, just so happens. It's late Monday morning today. So we already did this a couple of hours ago, but once a week we get all the teams that are involved, both like the platform kind of provider teams and also the product.Or we would say domain like consumer teams. And we do a quick scrum of scrums, like a big old kind of stand up. What have you all done this week? What are you working on next week? What are you blocked by kind of idea. And, you know, there are probably 20 or 30 teams again, across the individual platform capabilities and across the teams that, you know, uh, consume this stuff and everybody gives a quick update and they, and, uh, it's a great opportunity for people to say, oh, I have that same problem too.Maybe we should offline try to figure out how to solve that together. You built a tool that automates the site speed stuff. That's great. I would S I would so love to have that. And, um, so it, uh, this weekly meeting has been a great opportunity for us to share wins, share, um, you know, help that people need and then get, uh, get teams to help with each other.And also, similarly, one of the platform teams would say something like, Hey, we're about to be done or beta, let's say, you know, this new Canary capability, I'm making this up. Anybody wanna pilot that for us? And then you get a bunch of hands raised of yo, we would be very happy to pilot that that would be great.Um, so that's how we communicate back and forth. And, you know, it's a big enough. It's kind of like engineering managers are kind of are the kind of level that are involved in that typically. Um, so it's not individual developers, but it's like somebody on most, every team, if that makes any sense. So like, that's kind of how we do that, that like communication, uh, back to the individual developers.If that makes sense.[00:58:26] Jeremy: Yeah. So it sounds like you would have, like you said, the engineering manager go to the standup and um, you said maybe 20 to 30 teams, or like, I'm just trying to get a picture for how many people are in this meeting.[00:58:39] Randy: Yeah. It's like 30 or 40 people.[00:58:41] Jeremy: Okay. Yeah.[00:58:42] Randy: And again, it's quick, right? It's an hour. So we just go, boom, boom, boom, boom. And we've just developed a cadence of people. We have a shared Google doc and like people like write their little summaries, you know, of what they're, what they've worked on and what they're working on.So we've over time made it so that it's pretty efficient with people's time. And. Pretty dense in a good way of like information flow, back and forth. Um, and then also separately, we meet more in more detail with the individual teams that are involved. Again, try to elicit, okay, now, where are you now?Here's where you are. Please let us know what problems you're seeing with this part of the infrastructure or problems you're seeing in the pipelines or something like that. And we're, you know, we're constantly trying to learn and get better and, you know, solicit feedback from teams on what we can do differently.[00:59:29] Jeremy: earlier you had talked a little bit about how there were a few services that got brought over from V2 or V3, basically kind of more legacy or older services that are, have been a part of eBay for quite some time.And I was wondering if there were things about those services that made this process different, like, you know, in terms of how often you could deploy or, um, just what were some key differences between something that was made recently versus something that has been with the company for a long time?[01:00:06] Randy: Yeah, sure. I mean, the stuff that's been with the company for a long time was best in class. As of when we built it, you know, maybe 15 and sometimes 20 years ago. Um, there actually, I wouldn't even less than a handful. There are, as we speak, there are two or three of those V3. Uh, clusters or applications or services still around and they should be gone in a completely migrated away from, in the next a couple of months.So like, we're almost at the end of, um, you know, uh, moving all to more modern things. But yeah, you know, I mean, again, uh, stuff that was state-of-the-art, you know, 20 years ago, which was like deploying things once every two weeks, like that was a big deal in 2000 or 2004. Uh, and it's, you know, like that was fast in 2004 and is slow in 2022.So, um, yeah, I mean, what's the difference? Um, yeah, I mean, a lot of these things, you know, if they haven't already been migrated, there's a reason. And it's because often that they're way in the guts of something that's really important. You know, this is the, this is a core part. I'm making these examples up and they're not even right, but like it's a core part of the payments flow.It's a core part of, you know, uh, how, uh, sellers get paid. And those aren't examples. We have, those are modern, but you see what I'm saying? Like stuff that's like really core to the business and that's why it's kind of lasted.[01:01:34] Jeremy: And, uh, I'm kind of curious from the perspective of some of these new things you're introducing, like you're talking about, um, improving continuous delivery and things like that. Uh, when you're working with some of these services that have been around a long time, are the teams the rate at which they deploy or the rate at which you find defects is that noticeably different from services that are more recent?[01:02:04] Randy: I mean, and that's true of any legacy at any, at any place. Right? So, um, yeah, I mean, people are legitimately, uh, I have some trepidation that say about, you know, changing something that's

ceo director amazon google pr phd er microsoft cost exodus employees services engineering id cd paypal jeff bezos ebay architecture evolving oracle yahoo developers salesforce budapest makes aws dropbox java wework devops ear rack canary xyz google cloud oss v2 stitch fix iteration splunk json chief engineer chief architect jez v3 gene kim grpc dll pierre omidyar nicole forsgren lightstep corba qcon software engineering radio distinguished architect randy shoup jeremy so randy oh

Generating Demand and Building Trust with Anadelia Fadeev

cloud native observability boten otel jessica kerr lightstep liz fong jones

Play Episode Listen Later Jul 28, 2022 36:25

About Anadelia Anadelia is a B2B marketing leader passionate about building tech brands and growing revenue. She is currently the Sr. Director of Demand Generation at Teleport. In her spare time she enjoys live music and craft beer.Links Referenced: Teleport: https://goteleport.com/ @anadeliafadeev: https://twitter.com/anadeliafadeev LinkedIn: https://www.linkedin.com/in/anadeliafadeev/ TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: DoorDash had a problem. As their cloud-native environment scaled and developers delivered new features, their monitoring system kept breaking down. In an organization where data is used to make better decisions about technology and about the business, losing observability means the entire company loses their competitive edge. With Chronosphere, DoorDash is no longer losing visibility into their applications suite. The key? Chronosphere is an open-source compatible, scalable, and reliable observability solution that gives the observability lead at DoorDash business, confidence, and peace of mind. Read the full success story at snark.cloud/chronosphere. That's snark.cloud slash C-H-R-O-N-O-S-P-H-E-R-E.Corey: Let's face it, on-call firefighting at 2am is stressful! So there's good news and there's bad news. The bad news is that you probably can't prevent incidents from happening, but the good news is that incident.io makes incidents less stressful and a lot more valuable. incident.io is a Slack-native incident management platform that allows you to automate incident processes, focus on fixing the issues and learn from incident insights to improve site reliability and fix your vulnerabilities. Try incident.io, recover faster and sleep more.Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. This may surprise some of you to realize, but every once in a while, I mention how these episodes are sponsored by different companies. Well, to peel back a little bit of the mystery behind that curtain, I should probably inform some of you that when I say that, that means that companies have paid me to talk about them. I know, shocking.This is a revelation that will topple the podcast industry if it gets out. That's why it's just between us. My guest today knows this better than most. Anadelia Fadeev is the Senior Director of Demand Generation at Teleport, who does in fact sponsor a number of different things that I do, but this is not a sponsored episode in that context. Anadelia, thank you for joining me today.Anadelia: Thank you for having me.Corey: It's interesting. I always have to double-check where it is that you happen to be working because when we first met you were a Senior Marketing Manager, also in Demand Gen, at InfluxData, then you were a Director of Demand Generation at LightStep, and then you became a Director of Demand Gen and Growth and then a Senior Director of Demand Gen, where you are now at Teleport. And the couple of things that I've noticed are, one, you seem to more or less be not only doing the same role, but advancing within it, and also—selfishly—it turns out that every time you wind up working somewhere, that company winds up sponsoring some of my nonsense. So first, thank you for your business. It's always appreciated. Now, what is demand gen exactly? Because I have to say, when I started podcasting and newslettering and shooting my mouth off on the internet, I had no clue.Anadelia: [laugh]. Well, to put it very simply, demand generation, our goal is to drive awareness and interest in your products or services. It's as simple as that. Now, how we do that, we could definitely dive into the specifics, but it's all about generating awareness and interest. Especially when you work for an early-stage startup, it's all about awareness, right? Just getting your name out there.Corey: Marketing is one of those things that I suspect in some ways is kind of like engineering, where you take a look at, “Oh, what do you do? I'm a software engineer.” Okay, great. For someone who is in that space, does that mean front-end? Does that mean back-end? Does that mean security? Oh, wait, you're crying and awake at weird hours and you're angry all the time. You're a DevOps, aren't you?And you start to realize that there are these breakdowns within engineering. And we realize this and we get offended when people in some cases miscategorize us as, “I am not that kind of engineer. How dare you?” Which I think is unwarranted and ridiculous, but it also sort of slips under our notice in the engineering space that marketing is every bit as divided into different functions, different roles, and the rest. For those of us who think of marketing in the naive approach, like I did when I started this place—“Oh, marketing. So basically, you do Super Bowl ads, right?” And it turns out, there might be more than one or two facets to marketing. What's your journey been like in the wide world of marketing? Where did you start? Where does it stop?Anadelia: Yeah. I have not gotten to the Super Bowl ads phase yet but on my way there. No, but when you think about the different core areas within marketing, right, you have your product marketing team, and this is the team that sets the positioning, the messaging, and the information about who your ideal audience is, what pain points are they having, and how is your product solving those pain points? Right, so they sort of set the direction for the rest of the team, you have another core function, which is the content team, right? So, with the direction from Product Marketing, now that we know what the pain points are and what our value prop for our product is, how do we tell that to the world in a compelling way, right? So, this is where content marketing really comes into play.And then you have your demand generation teams. And some companies might call it growth or revenue or… I guess those two are the ones that come to mind. But this team is taking the direction from Product Marketing, taking the content produced by the content team, and then just making sure that people actually see it, right? And across all those teams, you have a lot of support from operations making sure that there's processes and systems in place to support all of those marketing efforts, you have teams that help support web development and design, and brand.Corey: One of the challenges that I think people have when they don't really understand what marketing is they think back on what they know—maybe they've seen Mad Men, which to my understanding does not much resemble modern all workplaces, but then again, I've been on my own for five years, so one wonders—and they also see things in the context of companies that are targeting more mass-market, in some respects. If you're trying to advertise Coca-Cola, every person on the planet—give or take—knows what Coca-Cola is. And the job is just to resurface it, on some level, in people's awareness, so the correct marketing answer there apparently, is to slap the logo on a bunch of things, be it a stadium, be it a billboard, be it almost anything, whereas when we're talking about earlier stage companies—oh, I don't know Teleport, for example—if you were to slap the Teleport logo on a stadium somewhere for some sports game, I have the impression that most people looking at that, if they notice it at all, would instead respond to some level of confusion of, “Teleport, what is that exactly? Have scientists cracked the way of getting me to Miami from San Francisco in less than ten seconds? Because I feel like I would have heard about that.”There's a matter of targeting beyond just the general public or human beings walking around and starting to target people who might have a problem that you know how to solve. And then, of course, figuring out where those people are gathering and how to get in front of them in a way that resonates instead of being annoying. At least that has been my lived experience of watching the challenges that marketing people have talked to me about over the years. Is that directionally correct or are they all just shining me on and, like, “Oh, Corey, you're adorable, you almost understand how this stuff works. Now, go insult some more things on Twitter. It'll be fine.”Anadelia: [laugh]. The reality is that advertising is a big part of a demand generation program, but it's not all, right? So, good demand generation is meeting people where they are. So, the right channels, the right mediums, the right physical places. So, when you look at it from an inbound and outbound approach, inbound, you have a sign outside of your door inviting people to your house, right, and this is in the form of your website. And outbound is you go out to where people are and you knock on their door to introduce yourself.So, when we look at it from that approach, so on the inbound side, right, the goal is to get people to come to your website because that is where you are telling them what you do and giving them the option to start using your product. So, what reason are you giving people to come to you, right? How are you helping them become better at something or achieve certain results, right? So, understanding the motivations behind it is extremely important.And how are you driving people to you? Well, that's where SEO comes in, right? Search engine optimization.So, what content are you producing that is driving the right search results to get your website to show up and get people to come to you, right? There's also SEM or Search Engine Marketing. So, when people are searching for certain keywords that are relevant to you, are you showing up in those search results?And on the outbound side of things is, what do you do to contribute to existing communities, right? So, this is where things like advertising comes into play. So, I know you have a huge following and I want to be where you are. So, of course, I'm going to sponsor your podcast and your newsletters. And similarly, I'm looking for what events are out there where I know that our potential customers are spending their time and what can we do to join that conversation in a way that adds value?So, that can be in the form of supporting community events and meetups, giving community members a platform to share their experiences, and even supporting local businesses, right, it's all about adding value, and by doing so, you are building trust that will allow you to then talk about how your product can help these communities solve their problems.Corey: It's interesting because when we look at the places that you have been, you were at InfluxData, they are a time-series database company; you were at LightStep, which was effectively an observability company, and now you're at Teleport where you are an authentication and access company. And forgive me, none of these are your terms. These are my understandings of having talked to these folks. And on the one hand, from a product perspective, it sounds like you're hopping between this and that and doing all those other things, and yet, we had conversations about all three of those products and how the companies around them are structured and built, and you've advertised all three of those on this show and others and all three of those companies and products speak specifically to problems that I have dealt with personally in the way I go through my engineering existence as well. So, instead of specializing on a particular product or on a particular niche, it almost feels like you're specializing on a particular audience. Is that how you think about it, or is that just one of those happy accident, or in retrospect, we're just going to retcon everything, and, “Yeah, that's exactly why I did it.” And you're like, “Let we jot that down. That belongs on my resume somewhere.”Anadelia: [laugh]. No, so prior to me joining InfluxData, I was at other companies that were marketing to sales, HR, finance, different audiences, right? And the moment I joined Influx, it was really eye-opening for me to be part of a product that has an open-source community, and between that and marketing to a highly technical audience that probably very likely doesn't want to hear from marketers, I found that to be a really good challenge for myself because it challenged me to elevate my own technical knowledge. And also personally, I just want to be surrounded by people that are smarter than me, and so I know that by being part of a community that markets to a developer audience, I am putting myself in a position where I'm having to constantly continue to learn. So, it's a good challenge for a marketer in our industry. Just like in any others, there's always the latest buzzword or the latest trend, and so it's really easy to get caught up in those things. And I think that being a marketer whose audience is developers really forces you to kind of look at what you're doing and sort of remove the fluff. This happens everywhere.Corey: Well, I have to be careful about selling yourself too short on this because I've talked to a lot of different people who want to wind up promoting what it is that their companies do, and people come from all kinds of different places, and some of the less likely to be successful—in many cases, I turn the business down—are, “Well, this is our first real experience with marketing.” And the reason for that is people expect unrealistic things. I describe what I do as top-of-funnel where we get people's attention and we give them a glimpse and a hook of what it is the product does. And I do that by talking about the painful problem that the product solves. So, when people hear their pain reflected in what we talk about, then that gives them the little bit of a push to go and take a look and see if this solves it.And that's great, but there has to be a process on the other side, where oh, a prospect comes in and starts looking at what it is we do. Do we have a sales funnel that moves them from someone just idly browsing to someone who might sign up for a trial, or try this in their own time, or start to understand how the community views it and the rest because just dropping a bunch of traffic on someone's website doesn't, in isolation, achieve anything without a means to convert that traffic into something that's a bit more meaningful and material to the business? I've talked to other folks who are big on oh, well, we want to wind up just instrument in the living crap out of everything we put out there, so I want to know, when someone clicks on the ad, who they are, what they do for a living, what their signing authority is, et cetera, et cetera, et cetera. And my answer, that's super easy, “Cool. We don't do any of that.”Part of the reason that people like hearing from me, is because I generally tend to respect their time, I'm not supporting invasive tracking of what they do, they don't see my dumb face smiling with a open mouth grin as they travel across the internet on every property. Although one of these days I will see myself on the side of a bus; I'm just waiting for it. And it's really nice to be able to talk to people who get the nuances and the peculiarities of the audience that I tend to speak to the most. You've always had that unlocked, even since our first conversation.Anadelia: Yeah, well, first of all, thank you. And yeah, the reality is that, especially within my world, right—and demand generation, we are very metrics-driven because our goal [tends 00:13:00] to be pipeline, right? Pipeline for the sales team, so we want to generate sales opportunities, and in order to do that, we need to be able to measure what's working and what is not working. But the reality is that good marketing is all about building trust, right? So, that's why I stress the importance of providing something of value to your prospect so that you're not wasting their time, right? The message that you have for them is something that can help them in the future.And if building trust sometimes means I'm not able to measure the direct results of the activity that you're doing, then that is okay, right? Because when you're driving people to your website, there are things that you can measure, like, you have some web visits, and you know that percentage of those visitors might be interested in continue further, right? So, when you look at the journey across the buyer stages, you have to have a compelling offer for a person on each of the possible stages, right? So, if they are just learning about you today because this is the first time that heard your ad, it's probably not expected that they would immediately go to your website and fill out your form, right? They've just heard about you, and now you start building that recognition.Now, if all the stars align, and I actually have a need for a solution that's like yours today, then, of course, you can expect a conversion to happen in that time point. But the reality is that having offers that are aimed at every stage of the buyer's journey is important.Corey: I'm glad to hear you say this. And the reason is that I often feel like when I say it, it sounds incredibly self-serving. But if you imagine the ideal buyer and their journey, they have the exact problem that your product does and there's an ad on my podcast that mentions it. Well, I imagine—and maybe this isn't accurate, but it's how I engage with podcasts myself—I'm probably not sitting in front of a computer ready to type in whatever it is that gets talked about.I'm probably doing dishes or outside harassing a dog or something. And if it resonates is, “Oh, I should look into that.” In an ideal world. I'll remember the short URL that I can go to, but in practice, I might just Google the company name. And oh, this does solve the problem.If it's not just me and there's a team I have to have a buy-in on, I might very well mention it in our next group meeting. And, “Okay, we're going to go ahead and try it out with an open-source version or whatnot.” And, “Oh, this seems to be working. We'll have procurement reach out and see what it takes to wind up generating a longer-term deal.” And the original attribution of the engineer who heard it on a podcast, or the DevOps director who read it in my newsletter, or whatever it is, is long since lost. I've commiserated with marketing people over this, and the adage that I picked up that I love quoting is half your marketing budget is wasted, but you can spend an entire career trying to figure out which half and get nowhere by the end of it.Anadelia: And this sort of touches on the buyer's journey is not linear. On the other side of that ad, or that marketing offer is a human, right? So, of course, as marketers, we're going to try to build this path of once you landed on our website, we want to guide you through all the steps until you do the thing that we want you to do, but the reality is, that does not happen in your example, right? You see something, you come back to it later through another channel, there's no way for us to measure those. And that's okay because that's just the reality of how humans behave.And also, I think it's worth noting that it takes multiple touch points until a person is ready to even hear what you have to say, right? And it sort of goes back to that point of building trust, right? It takes many times until you've gained that person's trust enough for them to listen to what you have to say.Corey: Building trust is important.Anadelia: SIt is very important. And that's why I think that running brand awareness programs are an extremely important part of a marketing mix. And sometimes there's not going to be any direct attribution, and we just have to be okay with it.Corey: I come bearing ill tidings. Developers are responsible for more than ever these days. Not just the code that they write, but also the containers and the cloud infrastructure that their apps run on. Because serverless means it's still somebody's problem. And a big part of that responsibility is app security from code to cloud. And that's where our friend Snyk comes in. Snyk is a frictionless security platform that meets developers where they are - Finding and fixing vulnerabilities right from the CLI, IDEs, Repos, and Pipelines. Snyk integrates seamlessly with AWS offerings like code pipeline, EKS, ECR, and more! As well as things you're actually likely to be using. Deploy on AWS, secure with Snyk. Learn more at Snyk.co/scream That's S-N-Y-K.co/screamCorey: I tend to take a perspective that trust is paramount, on some level, where we have our standard rules of, you know, don't break the law, et cetera, et cetera, that we do require our sponsors to conform to, but there are really two rules that I have that I care about. The first is you're not allowed to lie to the audience. Because if I wind up saying something is true in an ad or whatnot, and it's not, that damages my credibility. And I take this old world approach of, well, I believe trust is built over time, and you continually demonstrate a pattern of doing the right thing, and people eventually are willing to extend a little bit of credulousness when you say something that sounds that might be a little bit beyond their experience.The other is, and this is very nebulous, and difficult to define so I don't think we even have this in writing, but you have to be able to convince me if you're going to advertise something in one of my shows, that it will not, when used as directed, leave the user worse off than they were when they started. And that is a very strange thing. Like, a security product that has a bunch of typos on its page and is rolling its own crypto, for example—if you want an easy example—is one of those things that I will very gracefully decline not to wind up engaging with, just because I have the sneaking suspicion that if you trust that thing, you might very well live to regret it. In other cases, though—and this is almost never a problem because most companies that you have heard of and have established themselves as brands in this space already instinctively get that you're not able to build a lasting business by lying to people and then ripping them off.So, it's a relatively straightforward approach, but every once in a while, I see something that makes me raise an eyebrow. And it's not always bad. Sometimes I think that's a little odd. Teleport is a good example of this because, “Oh, really? You wound up doing access and authentication? That sounds exactly like the kind of thing I want something old and boring, not new and exciting, around, so let's dig into this and figure out whether this might be the one company you work at that doesn't get to sponsor stuff that I do.”But of course you do. You're absolutely focusing on an area that is relevant, useful, and having talked to people on your side of the world, you're doing the right thing. And okay, I would absolutely not be opposed to deploying this in the right production environment. But having that credulousness, having that exploratory conversation, makes it clear that I'm talking to people who know what they're doing and not effectively shilling for the highest bidder, which is not really a position I ever want to find myself in.Anadelia: And look, you have only one opportunity to make a first impression, right? So, being clear about what it is that you can do, and also being clear about what it is that you cannot do is extremely important, right? It kind of goes back to the point of just be a good human, don't waste people's time. You want to provide something of value to your audience. And so, setting those expectations early on is extremely important.And I don't know anyone that does this, but if your goal is only to drive people to your website, you can do that, probably very easily, but nothing will come out of it unless you have the right message.Corey: Oh, all you do is write something incendiary and offensive, and you'll have a lot of traffic. They won't buy anything and they'll hate you, but you'll get traffic, so maybe you want to be a little bit more intentional. It's the same reason that the companies that advertise on what I do pick me to advertise with as opposed to other things. It is more expensive than the mass-market podcasts and whatnot that speak to everyone. But you take a look at those podcasts and the things that they're advertising are things that actually apply to an awful lot more people, things like mattresses, and click-and-design website services, and the baseline stuff that a lot of people would be interested in, whereas the things that advertise on what I do tend to look a lot more like B2B SaaS companies where they're talking to folks who spend a lot of time working in cloud computing.And one of the weird things to think about from that perspective, at least for me, is if one person is listening to a show that I'm putting out and they go through the journey and become a customer, well, at the size of some of these B2B contracts between large companies, that one customer has basically paid for everything I can sell for advertising for the next decade and change, just because the long-term value of some of these customers is enormous. But it's why, for example—and I kept expecting it to happen, but it didn't—I've never been subjected to outreach from the mattress companies of, “Hey, you want to go talk about that to your guests?” No, because for those folks, it is pure raw numbers: how many millions of subscribers do you have? Here, it's—the newsletter is the easy one to get numbers on because lies, damned lies, and podcast statistics. I have 31,000 people that receive emails. Great, that's not the biggest newsletter in the world by a longshot, but the people who are the type of person to sign up for cloud computing-style newsletters, that alone says something very specific about them and it doesn't require anyone do anything creepy to wind up reaching out from that perspective.It doesn't require spying on customers to intuit that, hmm, maybe people who care about what AWS is up to and have big AWS-sized problems might sign up to a newsletter called Last Week in AWS. That's the sort of easy thinking about advertising that I tend to go for, which yeah, admittedly sounds a lot like something out of that Mad Men era. But I think that we got a lot right back then, and everything's new all the time.Anadelia: [laugh]. And actually, that's exactly what demand generation is, right? We want to find the right channels to reach our audience. And so, for a consumer company that sells mattresses, right, anyone might be on the market for a mattress, right? You want to go as broad as possible. But for something that's more specific, you want to find what are the right channels to reach that audience where you know that there's—it might be a smaller audience size, but it's the right people.And we've talked about the other core areas of marketing. So, with demand generation, it's all about finding people where they are, right, and providing them their message to you and attracting them to come to you, right? It kind of goes back to that inbound and outbound motion that I mentioned earlier. But at the end of the day also, if you don't have the right messaging to keep them engaged, once you got them to your website, then that's a different problem, right? So, demand gen alone cannot be successful without really strong product marketing and without really strong content, and everything else that's needed to support that, right? I mentioned the—if your website is not loading fast enough, then you're losing people if your form is not working. So, there's so many, so many different factors that come into play.Corey: Oh, God, the forms. Don't get me started on the forms. Hey, we have a great report that's super useful. Okay, cool. I'll click the link and I'll follow that. I talk to sponsors about this all the time. And it's, you have 30 mandatory fields on that website that I need to fill out. I am never going to do that.What is the absolute bare minimum that you need in an ideal world? Don't put any sort of gateway in front of it and just make it that good that I will reach out to thank you for it or something, but just make it an email address or something and that's it. You don't need to know the size of my company, the industry we're in, the level of my signing authority, et cetera, et cetera, et cetera. Because if this is good, I might very well be in touch. And if it's not, all you're going to do is harass me forever with pointless calls and emails and whatnot, and I don't want to deal with that. There's something to be said for adding value early in the conversation and letting other people sometimes make the first move. But this is also, to be clear, a very inbound type of approach.Anadelia: It's a never-ending debate, to gate or not to gate. And I don't know if there is a right answer. My approach is that if your content is good, people will come back to you. They'll keep coming back, and they'll want to take the next step with you. And so, I have some gated assets, and I have some that are not, and—but—Corey: But your gates have also never been annoying of the type that I'm talking about where it's the, “Oh, great. You need to, like, put in, like, how big is your company? What's the budget?” It feels like I'm answering a survey at some point. AWS is notorious for this.I counted once; there are 19 mandatory fields I had to fill out in order to watch a webinar that AWS was putting on.Anadelia: [laugh].Corey: And the worst part is they asked me the same questions every time I want to watch a different webinar. It's like, for a company that says the data is so valuable, you'd really think they'd be better at managing it.Anadelia: You know, like, some of the questions keep getting stranger. Like, I would not be surprised if people start asking what's your favorite color, or what's the answer to your—Corey: The one they always ask now for, like, big data seminars and whatnot, is where this really gets me, is this in relation to your professional interests or your personal interests? It's… “What do you think my hobbies are over there? Oh, yeah, I like big enterprise software. That's my hobby.” “Okay, I guess.” But I really do wonder what happens if someone checks the personal interest [vibe 00:25:33]. Do they wind up just with various AWS employees showing up want to hang out on the weekends and go surfing or something? I don't know.Anadelia: As somebody who has been on the receiving end of lists like this—for example, we sponsor a conference and we get people stop by to talk to us, and now we get the list of those people. And there's 25 columns. Like, honestly, that data does not come in helpful because at the end of the day, whatever you've marked on the required question is not going to change how I am going to communicate to you after, right, because we just had a conversation in person at this event.Corey: My budget is not material to the reason I let you scan my badge. The reason I let you scan my badge because I really wanted one of those fun plastic toy things, so I waited in line for 45 minutes to get it. But that doesn't mean that I'm going to be a buyer; it just means that now I'm in your funnel, although I could not possibly care less about what you do. One thing I do at re:Invent and a couple other conferences, for example, is I will have swag at a booth—because I don't tend to get booths myself, I don't have the staff to man it and I'm bad at that type of thing. But when people come up to get a sticker for Last Week in AWS or when of our data transfer diagram things or whatnot, the rule that we've always put in place is, you're not going to mandate a badge scan for that.And the kind of company I like doing that with gets it because the people who walk by and are interested will say, “Hey, can you scan my badge as well?” But they don't want to pollute their own lead lists with a bunch of people who are only there to get a sticker featuring a sarcastic platypus, as opposed to getting them confused with people actually care about what it is that they're solving for. And that's a delicate balance to strike sometimes, but the nice thing about being me is I have customers who come back again and again and again. Although I will argue that I probably got better at being a service provider when I started also being a customer at the same time, where I hired out a marketing department here because it turns out that fixing the AWS bill is something that does a fair bit of marketing work. It's not something people talk about at large scale in public, so you have to be noisy enough so that inbound finds its path to you a bunch of times. That's always tricky.And learning about how no matter what it is you do, in the case of my consulting work, we are quite honestly selling money, bring us in for an engagement, you will turn a profit on that engagement and we don't come back with a whole bunch of extra add-ons after the fact to basically claw back more things. It's one of the easiest sales in the world. And it's still nuanced, and challenging, and finding the right way to talk about it to the right people at the right time explains why marketing is the industry that it is. It's hard. None of this is easy.Anadelia: It is. And you know, in your example, you're not scanning that badge, but giving the person the sticker, right? Like, it's all about making a good first impression, and if the person's not ready to talk to you, that is okay. But there are ways that you can stay top-of-mind so that the moment that they have a need, they'll come to you. It kind of goes back again to my earlier points of adding value in supporting existing communities, right? So, what are you doing to stay top-of-mind with that person that wasn't quite ready back then, but the moment they have a need, they'll think of you first because you made a good first impression.Corey: And that's really what it comes down to. It's nice to talk to people who actually work in marketing because a lot of what I do in the marketing space, I've got to be honest, is terrible. Because I've done the old engineering thing of, well, I'm no marketer, but I know how to write code, so how hard could marketing really be and I invent this theory of marketing from first principles, which not only is mostly wrong, but also has a way of being incredibly insulting to people who have actually made this their profession and excel at it. But it's an evolutionary process and trying to figure out the right way to do things and how to think about things from particular point of view has been transformative. Really easy example of this: when I first started selling sponsorships, I was constantly worried that a sponsor was going to reach out and say, “Well, hang on a second. We didn't get the number of clicks that we expected to on this campaign. What do you have to say about that?”Because I'm a consultant. I am used to clients not getting results that they expected having some harsh words for me. In practice, I don't believe I've ever had a deep conversation about that with a marketing person. I've talked to them and they've said, “Well, some of these things worked. Some of these things didn't. Here's what works; here's what didn't, and for our next round, here's what we want to try instead.” Those are the great constructive conversations.The ones that I was fearing somehow would assume that I held this iron grip of control over exactly how many people would be clicking on a thing in a newsletter, and I'm not. We barely provide click-tracking at this point in the aggregate, let alone anything more specific, just because it's so hard to actually tell and get value out of it. You talk as well, about there being brand awareness. Even if someone doesn't click an ad, they're potentially reading it, they're starting to associate your company with the problem space. That's one of those things that are effectively impossible to track, but it does pay dividends.When you suddenly have a problem in a particular area. And there's one or two companies off the top of your mind that you know work in that space. Well, what do you think marketing is? There has been huge money put into making that association in your mind. It's not just about click the link; it's not just about buy the thing; it's about shaping the way that we think about different things.Anadelia: And I spend a lot of time thinking about how people think we talk about what are the things that motivate you. When you have a problem, where do you go to look for a solution, or who do you go to, right? So, just understanding what the thought process is when someone is trying to solve a problem or making a purchasing decision, I think that a lot of demand generation is what are the different ways by which someone is trying to solve a problem that they're having? And I had an interest in psychology growing up; both my parents are psychologists, and I think that marketing tends to bring some aspects of that in business and creativity, which is what led me to a career in marketing.And you ended up being sort of a connector, right? Like your job was to connect to people who would benefit from meeting each other. Just one of them happens to be a product, or you know, it depends on your company, right, but you're just introducing people and making sure they know about each other because there's going to be a mutually beneficial relationship between them.Corey: That seems to be what so many jobs ultimately distilled down to in the final analysis of things. I really want to thank you for being so generous with your time and talking about how you view the world slash industry in which we live. If people want to learn more about what you're up to and how you think about these things, where's the best place to find you?Anadelia: You can follow me on Twitter at @anadeliafadeev, or connect with me on LinkedIn.Corey: Oh, you're one of the LinkedIn peoples. I used to do that a bit, and then I just started getting deluged with all kinds of nonsense, and let me adjust my notification settings, and there are 600 of them. And no, no, no, no, no. And I basically have quit the field, by and large, on LinkedIn. But power to you for not having done that. Links to that will of course be in the [show notes 00:32:38]. Thank you so much for being so generous with your time.Anadelia: Thank you for having me. I appreciate it.Corey: Anadelia Fadeev, Senior Director of Demand Generation at Teleport. I'm Cloud Economist Corey Quinn and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice along with an angry ranting comment about how we got it completely wrong and that marketing does not work on you in the least. And by the way, when you close out that ranting comment, tell me what kind of brand of shoes you're wearing today.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.Announcer: This has been a HumblePod production. Stay humble.

Expensive Observability: The Cardinality Challenge - OpenObservability Talks S3E02

OpenObservability Talks

Play Episode Listen Later Jul 28, 2022 60:03

We all collect logs, metrics and perhaps traces and other data types, in support of our observability. But this can get expensive pretty quickly, especially in microservices based systems, in what is commonly known as “the cardinality problem”. On this episode of OpenObservability Talks I'll host Ben Sigelman, co-founder and the GM of Lightstep, to discuss this data problem and how to overcome it. Ben architected Google's own planet-scale metrics and distributed tracing systems (still in production today), and went on to co-create the open-source OpenTracing and OpenTelemetry projects, both part of the CNCF. The episode was live-streamed on 12 July 2022 and the video is available at https://youtu.be/gJhzwP-mZ2k OpenObservability Talks episodes are released monthly, on the last Thursday of each month and are available for listening on your favorite podcast app and on YouTube. We live-stream the episodes on Twitch and YouTube Live - tune in to see us live, and pitch in with your comments and questions on the live chat.https://www.twitch.tv/openobservabilityhttps://www.youtube.com/channel/UCLKOtaBdQAJVRJqhJDuOlPg Have you got an interesting topic you'd like to share in an episode? Reach out to us and submit your proposal at https://openobservability.io/ Show Notes: The difference between monitoring, observability and APM What comprises the cost of observability How common is the knowledge of cardinality and how to add metrics Controlling cost with sampling, verbosity and retention Lessons from Google's metrics and tracing systems Using metric rollups and aggregations intelligently Semantic conventions for logs, metrics and traces OpenCost project New research paper by Meta on schema-first approach to application telemetry metadata OTEL code contributions - published stats Resources: Monitoring vs. observability: https://twitter.com/el_bhs/status/1349406398388400128 The two drivers of cardinality: https://twitter.com/el_bhs/status/1360276734344450050 Sampling vs verbosity: https://twitter.com/el_bhs/status/1440750741384089608 Observing resources and transactions: https://twitter.com/el_bhs/status/1372636288021524482 Socials: Twitter: https://twitter.com/OpenObserv Twitch: https://www.twitch.tv/openobservability YouTube: https://www.youtube.com/channel/UCLKOtaBdQAJVRJqhJDuOlPg

google lessons reach twitch gm expensive controlling observing sampling semantic observability cncf otel lightstep opentracing cardinality

Leading Business Integration After an Acquisition with Camron Shahmirzadi, Head of Revenue Operations of the Lightstep Business Unit at ServiceNow

Sales Ops Demystified

Play Episode Listen Later Jul 14, 2022 31:20

In this episode of The Revenue Insights podcast, Camron Shahmirzadi, Head of Revenue Operations of the Lightstep Business Unit at ServiceNow, shares valuable insights on how to succeed as a RevOps professional and how to integrate business operations during an acquisition.

head business acquisition servicenow cam'ron revops revenue operations business unit lightstep business integration ebsta

Ep. #54, Cloud Native Observability with Alex Boten of Lightstep

O11ycast

Play Episode Listen Later Jun 30, 2022 39:25

In episode 54 of o11ycast, Liz Fong-Jones and Jessica Kerr speak with Alex Boton of Lightstep. They discuss Alex's book Cloud Native Observability with OpenTelemetry, OTel documentation and community, vendor lock-in, and the pain of instrumentation.

Ep. #54, Cloud Native Observability with Alex Boten of Lightstep

cloud native observability boten otel jessica kerr lightstep liz fong jones

Play Episode Listen Later Jun 30, 2022 39:25

Software at Scale 47 - OpenTelemetry with Ted Young

Software at Scale

Play Episode Listen Later May 26, 2022 93:41

Ted Young is the Director of Developer Education at Lightstep and a co-founder of the OpenTelemetry project.Apple Podcasts | Spotify | Google PodcastsThis episode dives deep into the history of OpenTelemetry, why we need a new telemetry standard, all the work that goes into building generic telemetry processing infrastructure, and the vision for unified logging, metrics and traces.Episode Reading ListInstead of highlights, I’ve attached links to some of our discussion points.HTTP Trace Context - new headers to support a standard way to preserve state across HTTP requests.OpenTelemetry Data CollectionZipkinOpenCensus and OpenTracing - the precursor projects to OpenTelemetry This is a public episode. If you would like to discuss this with other subscribers or get access to bonus episodes, visit www.softwareatscale.dev

director young software scale lightstep opentracing

OpenTelemetry Gets Better Metrics

basic radio shack developer relations devrel lightstep

Play Episode Listen Later May 25, 2022 20:11

OpenTelemetry is defined by its creators as a collection of APIs used to instrument, generate, collect and export telemetry data for observability. This data is in the form of metrics, logs and traces and has emerged as a popular CNCF project. For this interview, we're delving deeper into OpenTelemetry and its metrics support which has just become generally available. The specifications provided for the metrics protocol are designed to connect metrics to other signals and to provide a path to OpenCensus, which enables customers to migrate to OpenTelemetry and to work with existing metrics-instrumentation protocols and standards, including, of course, Prometheus. In this episode of The New Stack Makers podcast, recorded on the show floor of KubeCon + CloudNativeCon Europe 2022 in Valencia, Spain, Morgan McLean, director of product management, Splunk, Ted Young, director of developer education, LightStep and Daniel Dyla, senior open source architect, Dynatrace discussed how OpenTelemetry is evolving and the magic of observability in general for DevOps.

spain metrics makers prometheus apis devops splunk cncf dynatrace lightstep new stack makers

S6 Bonus: Austin Parker, Lightstep

Code Story

Play Episode Listen Later May 5, 2022 47:41

Austin Parker started out at a young age with computers, writing programs in BASIC and hanging out on bulletin boards. Prior to his tech career, he held many other jobs - as a short order cook, waiting tables, and taking tickets at a theater. When he stepped into this industry, he started out in test automation, followed by getting involved in open source communities, which is how he got into Developer Relations. What he likes about this arena is that DevRel is taking your companies story of a product, and making it harmonize with what everyone else is singing in the market. He has a young family, which occupies most of his time these days. But he likes to do photography, tinkering with electronics, and building model aircraft. Back in the day, he experienced the glory days of Radio Shack where you could grab electronic components on a whim.Austin Parker has been at his current company for 4 years, right after stealth mode ended. He has helped enable the company to support developers through their innovative tech stack, built by industry experts.This is Austin's creation story of Lightstep.SponsorsImmediateOrbitPostmarkStytchVerb DataWebapp.ioLinksWebsite: https://lightstep.com/LinkedIn: https://www.linkedin.com/in/austinlparker/Support this podcast at — https://redcircle.com/code-story/donationsAdvertising Inquiries: https://redcircle.com/brandsPrivacy & Opt-Out: https://redcircle.com/privacy

Distributed Tracing Infrastructure with Ben Sigelman and Alex Kehlenbeck

Software Engineering Daily

Play Episode Listen Later Apr 19, 2022 45:37 Very Popular

Ben Sigelman Alex Kehlenbeck Observability consists of metrics, logs, and traces. Lightstep is a company that builds distributed tracing infrastructure, which requires them to store and serve high volumes of trace data. There are numerous architecture challenges that come with managing this data. Ben Sigelman and Alex Kehlenbeck join the show to discuss the implementation The post Distributed Tracing Infrastructure with Ben Sigelman and Alex Kehlenbeck appeared first on Software Engineering Daily.

infrastructure tracing distributed lightstep software engineering daily

About ClintClint is the CEO and a co-founder at Cribl, a company focused on making observability viable for any organization, giving customers visibility and control over their data while maximizing value from existing tools.Prior to co-founding Cribl, Clint spent two decades leading product management and IT operations at technology and software companies, including Splunk and Cricket Communications. As a former practitioner, he has deep expertise in network issues, database administration, and security operations.Links: Cribl: https://cribl.io/ Cribl.io: https://cribl.io Docs.cribl.io: https://docs.cribl.io Sandbox.cribl.io: https://sandbox.cribl.io TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: Today's episode is brought to you in part by our friends at MinIO the high-performance Kubernetes native object store that's built for the multi-cloud, creating a consistent data storage layer for your public cloud instances, your private cloud instances, and even your edge instances, depending upon what the heck you're defining those as, which depends probably on where you work. It's getting that unified is one of the greatest challenges facing developers and architects today. It requires S3 compatibility, enterprise-grade security and resiliency, the speed to run any workload, and the footprint to run anywhere, and that's exactly what MinIO offers. With superb read speeds in excess of 360 gigs and 100 megabyte binary that doesn't eat all the data you've gotten on the system, it's exactly what you've been looking for. Check it out today at min.io/download, and see for yourself. That's min.io/download, and be sure to tell them that I sent you.Corey: This episode is sponsored in part by our friends at Sysdig. Sysdig is the solution for securing DevOps. They have a blog post that went up recently about how an insecure AWS Lambda function could be used as a pivot point to get access into your environment. They've also gone deep in-depth with a bunch of other approaches to how DevOps and security are inextricably linked. To learn more, visit sysdig.com and tell them I sent you. That's S-Y-S-D-I-G dot com. My thanks to them for their continued support of this ridiculous nonsense.Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. I have a repeat guest joining me on this promoted episode. Clint Sharp is the CEO and co-founder of Cribl. Clint, thanks for joining me.Clint: Hey, Corey, nice to be back.Corey: I was super excited when you gave me the premise for this recording because you said you had some news to talk about, and I was really excited that oh, great, they're finally going to buy a vowel so that people look at their name and understand how to pronounce it. And no, that's nowhere near forward-looking enough. It's instead it's some, I guess, I don't know, some product announcement or something. But you know, hope springs eternal. What have you got for us today?Clint: Well, one of the reasons I love talking to your audiences because product announcements actually matter to this audience. It's super interesting, as you get into starting a company, you're such, like, a product person, you're like, “Oh, I have this new set of things that's really going to make your life better.” And then you go out to, like, the general media, and you're like, “Hey, I have this product.” And they're like, “I don't care. What product? Do you have a funding announcement? Do you have something big in the market that—you know, do you have a new executive? Do you”—it's like, “No, but, like, these features, like these things, that we—the way we make our lives better for our customers. Isn't that interesting?” “No.”Corey: Real depressing once you—“Do you have a security breach to announce?” It's, “No. God no. Why would I wind up being that excited about it?” “Well, I don't know. I'd be that excited about it.” And yeah, the stuff that mainstream media wants to write about in the context of tech companies is exactly the sort of thing that tech companies absolutely do not want to be written about for. But fortunately, that is neither here nor there.Clint: Yeah, they want the thing that gets the clicks.Corey: Exactly. You built a product that absolutely resonates in its target market and outside of that market. It's one of those, what is that thing, again? If you could give us a light refresher on what Cribl is and does, you'll probably do a better job of it than I will. We hope.Clint: We'd love to. Yeah, so we are an observability company, fundamentally. I think one of the interesting things to talk about when it comes to observability is that observability and security are merging. And so I like to say observability and include security people. If you're a security person, and you don't feel included by the word observability, sorry.We also include you; you're under our tent here. So, we sell to technology professionals, we help make their lives better. And we do that today through a flagship product called LogStream—which is part of this announcement, we're actually renaming to Stream. In some ways, we're dropping logs—and we are a pipeline company. So, we help you take all of your existing agents, all of your existing data that's moving, and we help you process that data in the stream to control costs and to send it multiple places.And it sounds kind of silly, but one of the biggest problems that we end up solving for a lot of our enterprises is, “Hey, I've got, like, this old Syslog feed coming off of my firewalls”—like, you remember those things, right? Palo Alto firewalls, ASA firewalls—“I actually get that thing to multiple places because, hey, I want to get that data into another security solution. I want to get that data into a data lake. How do I do that?” Well, in today's world, that actually turns out is sort of a neglected set of features, like, the vendors who provide you logging solutions, being able to reshape that data, filter that data, control costs, wasn't necessarily at the top of their priority list.It wasn't nefarious. It wasn't like people are like, “Oh, I'm going to make sure that they can't process this data before it comes into my solution.” It's more just, like, “I'll get around to it eventually.” And the eventually never actually comes. And so our streaming product helps people do that today.And the big announcement that we're making this week is that we're extending that same processing technology down to the endpoint with a new product we're calling Cribl Edge. And so we're taking our existing best-in-class management technology, and we're turning it into an agent. And that seems kind of interesting because… I think everybody sort of assumed that the agent is dead. Okay, well, we've been building agents for a decade or two decades. Isn't everything exactly the same as it was before?But we really saw kind of a dearth of innovation in that area in terms of being able to manage your agents, being able to understand what data is available to be collected, being able to auto-discover the data that needs to be able to be collected, turning those agents into interactive troubleshooting experiences so that we can, kind of, replicate the ability to zoom into a remote endpoint and replicate that Linux command line experience that we're not supposed to be getting anymore because we're not supposed to SSH into boxes anymore. Well, how do I replicate that? How do I see how much disk is on this given endpoint if I can't SSH into that box? And so Cribl Edge is a rethink about making this rich, interactive experience on top of all of these agents that become this really massive distributed system that we can process data all the way out at where the data is being emitted.And so that means that now we don't nec—if you want to process that data in the stream, okay, great, but if you want to process that data at its origination point, we can actually provide you cheaper cost because now you're using a lot of that capacity that's sitting out there on your endpoints that isn't really being used today anyway—the average utilization of a Kubernetes cluster is like 30%—Corey: It's that high. I'm sort of surprised.Clint: Right? I know. So, Datadog puts out the survey every year, which I think is really interesting, and that's a number that always surprised me is just that people are already paying for this capacity, right? It's sitting there, it's on their AWS bill already, and with that average utilization, a lot of the stuff that we're doing in other clusters, or while we're moving that data can actually just be done right there where the data is being emitted. And also, if we're doing things like filtering, we can lower egress charges, there's lots of really, really good goodness that we can do by pushing that processing further closer to its origination point.Corey: You know, the timing of this episode is somewhat apt because as of the time that we're recording this, I spent most of yesterday troubleshooting and fixing my home wireless network, which is a whole Ubiquity-managed thing. And the controller was one of their all-in-one box things that kept more or less power cycling for no apparent reason. How do I figure out why it's doing that? Well, I'm used to, these days, doing everything in a cloud environment where you can instrument things pretty easily, where things start and where things stop is well understood. Finally, I just gave up and used a controller that's sitting on an EC2 instance somewhere, and now great, now I can get useful telemetry out of it because now it's stuff I know how to deal with.It also, turns out that surprise, my EC2 instance is not magically restarting itself due to heat issues. What a concept. So, I have a newfound appreciation for the fact that oh, yeah, not everything lives in a cloud provider's regions. Who knew? This is a revelation that I think is going to be somewhat surprising for folks who've been building startups and believe that anything that's older than 18 months doesn't exist.But there's a lot of data centers out there, there are a lot of agents living all kinds of different places. And workloads continue to surprise me even now, just looking at my own client base. It's a very diverse world when we're talking about whether things are on-prem or whether they're in cloud environments.Clint: Well, also, there's a lot of agents on every endpoint period, just due to the fact that security guys want an agent, the observability guys want an agent, the logging people want an agent. And then suddenly, I'm, you know, I'm looking at every endpoint—cloud, on-prem, whatever—and there's 8, 10 agents sitting there. And so I think a lot of the opportunity that we saw was, we can unify the data collection for metric type of data. So, we have some really cool defaults. [unintelligible 00:07:30] this is one of the things where I think people don't focus much on, kind of, the end-user experience. Like, let's have reasonable defaults.Let's have the thing turn on, and actually, most people's needs are set without tweaking any knobs or buttons, and no diving into YAML files and looking at documentation and trying to figure out exactly the way I need to configure this thing. Let's collect metric data, let's collect log data, let's do it all from one central place with one agent that can send that data to multiple places. And I can send it to Grafana Cloud, if I want to; I can send it to Logz.io, I can send it to Splunk, I can send it to Elasticsearch, I can send it to AWS's new Elasticsearch-y the thing that we don't know what they're going to call it yet after the lawsuit. Any of those can be done right from the endpoint from, like, a rich graphical experience where I think that there's a really a desire now for people to kind of jump into these configuration files where really a lot of these users, this is a part-time job, and so hey, if I need to go set up data collection, do I want to learn about this detailed YAML file configuration that I'm only going to do once or twice, or should I be able to do it in an easy, intuitive way, where I can just sit down in front of the product, get my job done and move on without having to go learn some sort of new configuration language?Corey: Once upon a time, I saw an early circa 2012, 2013 talk from Jordan Sissel, who is the creator of Logstash, and he talked a lot about how challenging it was to wind up parsing all of the variety of log files out there. Even something is relatively straightforward—wink, wink, nudge, nudge—as timestamps was an absolute monstrosity. And a lot of people have been talking in recent years about OpenTelemetry being the lingua franca that everything speaks so that is the wave of the future, but I've got a level with you, looking around, it feels like these people are living in a very different reality than the one that I appear to have stumbled into because the conversations people are having about how great it is sound amazing, but nothing that I'm looking at—granted from a very particular point of view—seems to be embracing it or supporting it. Is that just because I'm hanging out in the wrong places, or is it still a great idea whose time has yet to come, or something else?Clint: So, I think a couple things. One is every conversation I have about OpenTelemetry is always, “Will be.” It's always in the future. And there's certainly a lot of interest. We see this from customer after customer, they're very interested in OpenTelemetry and what the OpenTelemetry strategy is, but as an example OpenTelemetry logging is not yet finalized specification; they believe that they're still six months to a year out. It seems to be perpetually six months to a year out there.They are finalized for metrics and they are finalized for tracing. Where we see OpenTelemetry tends to be with companies like Honeycomb, companies like Datadog with their tracing product, or Lightstep. So, for tracing, we see OpenTelemetry adoption. But tracing adoption is also not that high either, relative to just general metrics of logs.Corey: Yeah, the tracing implementations that I've seen, for example, Epsagon did this super well, where it would take a look at your Lambdas Function built into an application, and ah, we're going to go ahead and instrument this automatically using layers or extensions for you. And life was good because suddenly you got very detailed breakdowns of exactly how data was flowing in the course of a transaction through 15 Lambdas Function. Great. With everything else I've seen, it's, “Oh, you have to instrument all these things by hand.” Let me shortcut that for you: That means no one's going to do it. They never are.It's anytime you have to do that undifferentiated heavy lifting of making sure that you put the finicky code just so into your application's logic, it's a shorthand for it's only going to happen when you have no other choice. And I think that trying to surface that burden to the developer, instead of building it into the platform so they don't have to think about it is inherently the wrong move.Clint: I think there's a strong belief in Silicon Valley that—similar to, like, Hollywood—that the biggest export Silicon Valley is going to have is culture. And so that's going to be this culture of, like, developer supporting their stuff in production. I'm telling you, I sell to banks and governments and telcos and I don't see that culture prevailing. I see a application developed by Accenture that's operated by Tata. That's a lot of inertia to overcome and a lot of regulation to overcome as well, and so, like, we can say that, hey, separation of duties isn't really a thing and developers should be able to support all their own stuff in production.I don't see that happening. It may happen. It'll certainly happen more than zero. And tracing is predicated on the whole idea that the developer is scratching their own itch. Like that I am in production and troubleshooting this and so I need this high-fidelity trace-level information to understand what's going on with this one user's experience, but that doesn't tend to be in the enterprise, how things are actually troubleshot.And so I think that more than anything is the headwind that slowing down distributed tracing adoption. It's because you're putting the onus on solving the problem on a developer who never ends up using the distributed tracing solution to begin with because there's another operations department over there that's actually operating the thing on a day-to-day basis.Corey: Having come from one of those operations departments myself, the way that I would always fix things was—you know, in the era that I was operating it made sense—you'd SSH into a box and kick the tires, poke around, see what's going on, look at the logs locally, look at the behaviors, the way you'd expect it to these days, that is considered a screamingly bad anti-pattern and it's something that companies try their damnedest to avoid doing at all. When did that change? And what is the replacement for that? Because every time I asked people for the sorts of data that I would get from that sort of exploration when they're trying to track something down, I'm more or less met with blank stares.Clint: Yeah. Well, I think that's a huge hole and one of the things that we're actually trying to do with our new product. And I think the… how do I replicate that Linux command line experience? So, for example, something as simple, like, we'd like to think that these nodes are all ephemeral, but there's still a disk, whether it's virtual or not; that thing sometimes fills up, so how do I even do the simple thing like df -kh and see how much disk is there if I don't already have all the metrics collected that I needed, or I need to go dive deep into an application and understand what that application is doing or seeing, what files it's opening, or what log files it's writing even?Let's give some good examples. Like, how do I even know what files an application is running? Actually, all that information is all there; we can go discover that. And so some of the things that we're doing with Edge is trying to make this rich, interactive experience where you can actually teleport into the end node and see all the processes that are running and get a view that looks like top and be able to see how much disk is there and how much disk is being consumed. And really kind of replicating that whole troubleshooting experience that we used to get from the Linux command line, but now instead, it's a tightly controlled experience where you're not actually getting an arbitrary shell, where I could do anything that could give me root level access, or exploit holes in various pieces of software, but really trying to replicate getting you that high fidelity information because you don't need any of that information until you need it.And I think that's part of the problem that's hard with shipping all this data to some centralized platform and getting every metric and every log and moving all that data is the data is worthless until it isn't worthless anymore. And so why do we even move it? Why don't we provide a better experience for getting at the data at the time that we need to be able to get at the data. Or the other thing that we get to change fundamentally is if we have the edge available to us, we have way more capacity. I can store a lot of information in a few kilobytes of RAM on every node, but if I bring thousands of nodes into one central place, now I need a massive amount of RAM and a massive amount of cardinality when really what I need is the ability to actually go interrogate what's running out there.Corey: The thing that frustrates me the most is the way that I go back and find my old debug statements, which is, you know, I print out whatever it is that the current status is and so I can figure out where something's breaking.Clint: [Got here 00:15:08].Corey: Yeah. I do it within AWS Lambda functions, and that's great. And I go back and I remove them later when I notice how expensive CloudWatch logs are getting because at 50 cents per gigabyte of ingest on those things, and you have that Lambda function firing off a fair bit, that starts to add up when you've been excessively wordy with your print statements. It sounds ridiculous, but okay, then you're storing it somewhere. If I want to take that log data and have something else consume it, that's nine cents a gigabyte to get it out of AWS and then you're going to want to move it again from wherever it is over there—potentially to a third system, because why not?—and it seems like the entire purpose of this log data is to sit there and be moved around because every time it gets moved, it winds up somehow costing me yet more money. Why do we do this?Clint: I mean, it's a great question because one of the things that I think we decided 15 years ago was that the reason to move this data was because that data may go poof. So, it was on a, you know, back in my day, it was an HP DL360 1U rackmount server that I threw in there, and it had raid zero discs and so if that thing went dead, well, we didn't care, we'd replace it with another one. But if we wanted to find out why it went dead, we wanted to make sure that the data had moved before the thing went dead. But now that DL360 is a VM.Corey: Yeah, or a container that is going to be gone in 20 minutes. So yeah, you don't want to store it locally on that container. But discs are also a fair bit more durable than they once were, as well. And S3 talks about its 11 nines of durability. That's great and all but most of my application logs don't need that. So, I'm still trying to figure out where we went wrong.Clint: Well, I think it was right for the time. And I think now that we have durable storage at the edge where that blob storage has already replicated three times and we can reattach—if that box crashes, we can reattach new compute to that same block storage. Actually, AWS has some cool features now, you can actually attach multiple VMs to the same block store. So, we could actually even have logs being written by one VM, but processed by another VM. And so there are new primitives available to us in the cloud, which we should be going back and re-questioning all of the things that we did ten to 15 years ago and all the practices that we had because they may not be relevant anymore, but we just never stopped to ask why.Corey: Yeah, multi-attach was rolled out with their IO2 volumes, which are spendy but great. And they do warn you that you need a file system that actively supports that and applications that are aware of it. But cool, they have specific use cases that they're clearly imagining this for. But ten years ago, we were building things out, and, “Ooh, EBS, how do I wind up attaching that from multiple instances?” The answer was, “Ohh, don't do that.”And that shaped all of our perspectives on these things. Now suddenly, you can. Is that, “Ohh don't do that,” gut visceral reaction still valid? People don't tend to go back and re-examine the why behind certain best practices until long after those best practices are now actively harmful.Clint: And that's really what we're trying to do is to say, hey, should we move log data anymore if it's at a durable place at the edge? Should we move metric data at all? Like, hey, we have these big TSDBs that have huge cardinality challenges, but if I just had all that information sitting in RAM at the original endpoint, I can store a lot of information and barely even touch the free RAM that's already sitting out there at that endpoint. So, how to get out that data? Like, how to make that a rich user experience so that we can query it?We have to build some software to do this, but we can start to question from first principles, hey, things are different now. Maybe we can actually revisit a lot of these architectural assumptions, drive cost down, give more capability than we actually had before for fundamentally cheaper. And that's kind of what Cribl does is we're looking at software is to say, “Man, like, let's question everything and let's go back to first principles.” “Why do we want this information?” “Well, I need to troubleshoot stuff.” “Okay, well, if I need to troubleshoot stuff, well, how do I do that?” “Well, today we move it, but do we have to? Do we have to move that data?” “No, we could probably give you an experience where you can dive right into that endpoint and get really, really high fidelity data without having to pay to move that and store it forever.” Because also, like, telemetry information, it's basically worthless after 24 hours, like, if I'm moving that and paying to store it, then now I'm paying for something I'm never going to read back.Corey: This episode is sponsored in part by our friends at Vultr. Spelled V-U-L-T-R because they're all about helping save money, including on things like, you know, vowels. So, what they do is they are a cloud provider that provides surprisingly high performance cloud compute at a price that—while sure they claim its better than AWS pricing—and when they say that they mean it is less money. Sure, I don't dispute that but what I find interesting is that it's predictable. They tell you in advance on a monthly basis what it's going to going to cost. They have a bunch of advanced networking features. They have nineteen global locations and scale things elastically. Not to be confused with openly, because apparently elastic and open can mean the same thing sometimes. They have had over a million users. Deployments take less that sixty seconds across twelve pre-selected operating systems. Or, if you're one of those nutters like me, you can bring your own ISO and install basically any operating system you want. Starting with pricing as low as $2.50 a month for Vultr cloud compute they have plans for developers and businesses of all sizes, except maybe Amazon, who stubbornly insists on having something to scale all on their own. Try Vultr today for free by visiting: vultr.com/screaming, and you'll receive a $100 in credit. Thats V-U-L-T-R.com slash screaming.Corey: And worse, you wind up figuring out, okay, I'm going to store all that data going back to 2012, and it's petabytes upon petabytes. And great, how do I actually search for a thing? Well, I have to use some other expensive thing of compute that's going to start diving through all of that because the way I set up my partitioning, it isn't aligned with anything looking at, like, recency or based upon time period, so right every time I want to look at what happened 20 minutes ago, I'm looking at what happened 20 years ago. And that just gets incredibly expensive, not just to maintain but to query and the rest. Now, to be clear, yes, this is an anti-pattern. It isn't how things should be set up. But how should they be set up? And it is the collective the answer to that right now actually what's best, or is it still harkening back to old patterns that no longer apply?Clint: Well, the future is here, it's just unevenly distributed. So there's, you know, I think an important point about us or how we think about building software is with this customer is first attitude and fundamentally bringing them choice. Because the reality is that doing things the old way may be the right decision for you. You may have compliance requirements to say—there's a lot of financial services institutions, for example, like, they have to keep every byte of data written on any endpoint for seven years. And so we have to accommodate their requirements.Like, is that the right requirement? Well, I don't know. The regulator wrote it that way, so therefore, I have to do it. Whether it's the right thing or the wrong thing for the business, I have no choice. And their decisions are just as right as the person who says this data is worthless and should all just be thrown away.We really want to be able to go and say, like, hey, what decision is right? We're going to give you the option to do it this way, we're going to give you the option to do it this way. Now, the hard part—and that when it comes down to, like, marketing, it's like you want to have this really simple message, like, “This is the one true path.” And a lot of vendors are this way, “There's this new wonderful, right, true path that we are going to take you on, and follow along behind me.” But the reality is, enterprise worlds are gritty and ugly, and they're full of old technology and new technology.And they need to be able to support getting data off the mainframe the same way as they're doing a brand new containerized microservices application. In fact, that brand new containerized microservices application is probably talking to the mainframe through some API. And so all of that has to work at once.Corey: Oh, yeah. And it's all of our payment data is in our PCI environment that PCI needs to have every byte logged. Great. Why is three-quarters of your infrastructure considered the PCI environment? Maybe you can constrain that at some point and suddenly save a whole bunch of effort, time, money, and regulatory drag on this.But as you go through that journey, you need to not only have a tool that will work when you get there but a tool that will work where you are today. And a lot of companies miss that mark, too. It's, “Oh, once you modernize and become the serverless success story of the decade, then our product is going to be right for you.” “Great. We'll send you a postcard if we ever get there and then you can follow up with us.”Alternately, it's well, “Yeah, we're this is how we are today, but we have a visions of a brighter tomorrow.” You've got to be able to meet people where they are at any point of that journey. One of the things I've always respected about Cribl has been the way that you very fluidly tell both sides of that story.Clint: And it's not their fault.Corey: Yeah.Clint: Most of the people who pick a job, they pick the job because, like—look, I live in Kansas City, Missouri, and there's this data processing company that works primarily on mainframes, it's right down the road. And they gave me a job and it pays me $150,000 a year, and I got a big house and things are great. And I'm a sysadmin sitting there. I don't get to play with the new technology. Like, that customer is just as an applicable customer, we want to help them exactly the same as the new Silicon Valley hip kid who's working at you know, a venture-backed startup, they're doing everything natively in the cloud. Those are all right decisions, depending on where you happen to find yourself, and we want to support you with our products, no matter where you find yourself on the technology spectrum.Corey: Speaking of old and new, and the trends of the industry, when you first set up this recording, you mentioned, “Oh, yeah, we should make it a point to maybe talk about the acquisition,” at which point I sprayed coffee across my iMac. Thanks for that. Turns out it wasn't your acquisition we were talking about so much as it is the—at the time we record this—-the yet-to-close rumored acquisition of Splunk by Cisco.Clint: I think it's both interesting and positive for some people, and sad for others. I think Cisco is obviously a phenomenal company. They run the networking world. The fact that they've been moving into observability—they bought companies like AppDynamics, and we were talking about Epsagon before the show, they bought—ServiceNow, just bought Lightstep recently. There's a lot of acquisitions in this space.I think that when it comes to something like Splunk, Splunk is a fast-growing company by compared to Cisco. And so for them, this is something that they think that they can put into their distribution channel, and what Cisco knows how to do is to sell things like they're very good at putting things through their existing sales force and really amplifying the sales of that particular thing that they have just acquired. That being said, I think for a company that was as innovative as Splunk, I do find it a bit sad with the idea that it's going to become part of this much larger behemoth and not really probably driving the observability and security industry forward anymore because I don't think anybody really looks at Cisco as a company that's driving things—not to slam them or anything, but I don't really see them as driving the industry forward.Corey: Somewhere along the way, they got stuck and I don't know how to reconcile that because they were a phenomenally fast-paced innovative company, briefly the most valuable company in the world during the dotcom bubble. And then they just sort of stalled out somewhere and, on some level, not to talk smack about it, but it feels like the level of innovation we've seen from Splunk has curtailed over the past half-decade or so. And selling to Cisco feels almost like a tacit admission that they are effectively out of ideas. And maybe that's unfair.Clint: I mean, we can look at the track record of what's been shipped over the last five years from Splunk. And again they're a partner, their customers are great, I think they still have the best log indexing engine on the market. That was their core product and what has made them the majority of their money. But there's not been a lot new. And I think objectively we can look at that without throwing stones and say like, “Well, what net-new? You bought SignalFX. Like, good for you guys like that seems to be going well. You've launched your observability suite based off of these acquisitions.” But organic product-wise, there's not a lot coming out of the factory.Corey: I'll take it a bit further-slash-sadder, we take a look at some great companies that were acquired—OpenDNS, Duo Security, SignalFX, as you mentioned, Epsagon, ThousandEyes—and once they've gotten acquired by Cisco, they all more or less seem to be frozen in time, like they're trapped in amber, which leads us up to the natural dinosaur analogy that I'll probably make in a less formal setting. It just feels like once a company is bought by Cisco, their velocity peters out, a lot of their staff leaves, and what you see is what you get. And I don't know if that's accurate, I'm just not looking in the right places, but every time I talk to folks in the industry about this, I get a lot of knowing nods that are tied to it. So, whether or not that's true or not, that is very clearly, at least in some corners of the market, the active perception.Clint: There's a very real fact that if you look even at very large companies, innovation is driven from a core set of a handful of people. And when those people start to leave, the innovation really stops. It's those people who think about things back from first principles—like why are we doing things? What different can we do?—and they're the type of drivers that drive change.So, Frank Slootman wrote a book recently called Amp it Up that I've been reading over the last weekend, and he talks—has this article that was on LinkedIn a while back called “Drivers vs. Passengers” and he's always looking for drivers. And those drivers tend to not find themselves as happy in bigger companies and they tend to head for the exits. And so then you end up with the people who are a lot of the passenger type of people, the people who are like—they'll carry it forward, they'll continue to scale it, the business will continue to grow at whatever rate it's going to grow, but you're probably not going to see a lot of the net-new stuff. And I'll put it in comparison to a company like Datadog who I have a vast amount of respect for I think they're incredibly innovative company, and I think they continue to innovate.Still driven by the founders, the people who created the original product are still there driving the vision, driving forward innovation. And that's what tends to move the envelope is the people who have the moral authority inside of an even larger organization to say, “Get behind me. We're going in this direction. We're going to go take that hill. We're going to go make things better for our customers.” And when you start to lose those handful of really critical contributors, that's where you start to see the innovation dry up.Corey: Where do you see the acquisitions coming from? Is it just at some point people shove money at these companies that got acquired that is beyond the wildest dreams of avarice? Is it that they believe that they'll be able to execute better on their mission and they were independently? These are still smart, driven, people who have built something and I don't know that they necessarily see an acquisition as, “Well, time to give up and coast for a while and then I'll leave.” But maybe it is. I've never found myself in that situation, so I can't speak for sure.Clint: You kind of I think, have to look at the business and then whoever's running the business at that time—and I sit in the CEO chair—so you have to look at the business and say, “What do we have inside the house here?” Like, “What more can we do?” If we think that there's the next billion-dollar, multi-billion-dollar product sitting here, even just in our heads, but maybe in the factory and being worked on, then we should absolutely not sell because the value is still there and we're going to grow the company much faster as an independent entity than we would you know, inside of a larger organization. But if you're the board of directors and you're looking around and saying like, hey look, like, I don't see another billion-dollar line of bus—at this scale, right, if your Splunk scale, right? I don't see another billion-dollar line of business sitting here, we could probably go acquire it, we could try to add it in, but you know, in the case of something like a Splunk, I think part of—you know, they're looking for a new CEO right now, so now they have to go find a new leader who's going to come in, re-energize and, kind of, reboot that.But that's the options that they're considering, right? They're like, “Do I find a new CEO who's going to reinvigorate things and be able to attract the type of talent that's going to lead us to the next billion-dollar line of business that we can either build inside or we can acquire and bring in-house? Or is the right path for me just to say, ‘Okay, well, you know, somebody like Cisco's interested?'” or the other path that you may see them go down to something like Silver Lake, so Silver Lake put a billion dollars into the company last year. And so they may be looking at and say, “Okay, well, we really need to do some restructuring here and we want to do it outside the eyes of the public market. We want to be able to change pricing model, we want to be able to really do this without having to worry about the stock price's massive volatility because we're making big changes.”And so I would say there's probably two big options there considering. Like, do we sell to Cisco, do we sell to Silver Lake, or do we really take another run at this? And those are difficult decisions for the stewards of the business and I think it's a different decision if you're the steward of the business that created the business versus the steward of the business for whom this is—the I've been here for five years and I may be here for five years more. For somebody like me, a company like Cribl is literally the thing I plan to leave on this earth.Corey: Yeah. Do you have that sense of personal attachment to it? On some level, The Duckbill Group, that's exactly what I'm staring at where it's great. Someone wants to buy the Last Week in AWS media side of the house.Great. Okay. What is that really, beyond me? Because so much of it's been shaped by my personality. There's an audience, sure, but it's a skeptical audience, one that doesn't generally tend to respond well to mass market, generic advertisements, so monetizing that is not going to go super well.“All right, we're going to start doing data mining on people.” Well, that's explicitly against the terms of service people signed up for, so good luck with that. So, much starts becoming bizarre and strange when you start looking at building something with the idea of, oh, in three years, I'm going to unload this puppy and make it someone else's problem. The argument is that by building something with an eye toward selling it, you build a better-structured business, but it also means you potentially make trade-offs that are best not made. I'm not sure there's a right answer here.Clint: In my spare time, I do some investments, angel investments, and that sort of thing, and that's always a red flag for me when I meet a founder who's like, “In three to five years, I plan to sell it to these people.” If you don't have a vision for how you're fundamentally going to alter the marketplace and our perception of everything else, you're not dreaming big enough. And that to me doesn't look like a great investment. It doesn't look like the—how do you attract employees in that way? Like, “Okay, our goal is to work really hard for the next three years so that we will be attractive to this other bigger thing.” They may be thinking it on the inside as an available option, but if you think that's your default option when starting a company, I don't think you're going to end up with the outcome is truly what you're hoping for.Corey: Oh, yeah. In my case, the only acquisition story I see is some large company buying us just largely to shut me up. But—Clint: [laugh].Corey: —that turns out to be kind of expensive, so all right. I also don't think it serve any of them nearly as well as they think it would.Clint: Well, you'll just become somebody else on Twitter. [laugh].Corey: Yeah, “Time to change my name again. Here we go.” So, if people want to go and learn more about a Cribl Edge, where can they do that?Clint: Yeah, cribl.io. And then if you're more of a technical person, and you'd like to understand the specifics, docs.cribl.io. That's where I always go when I'm checking out a vendor; just skip past the main page and go straight to the docs. So, check that out.And then also, if you're wanting to play with the product, we make online available education called Sandboxes, at sandbox.cribl.io, where you can go spin up your own version of the product, walk through some interactive tutorials, and get a view on how it might work for you.Corey: Such a great pattern, at least for the way that I think about these things. You can have flashy videos, you can have great screenshots, you can have documentation that is the finest thing on this earth, but let me play with it; let me kick the tires on it, even with a sample data set. Because until I can do that, I'm not really going to understand where the product starts and where it stops. That is the right answer from where I sit. Again, I understand that everyone's different, not everyone thinks like I do—thankfully—but for me, that's the best way I've ever learned something.Clint: I love to get my hands on the product, and in fact, I'm always a little bit suspicious of any company when I go to their webpage and I can't either sign up for the product or I can't get to the documentation, and I have to talk to somebody in order to learn. That's pretty much I'm immediately going to the next person in that market to go look for somebody who will let me.Corey: [laugh]. Thank you again for taking so much time to speak with me. I appreciate it. As always, it's a pleasure.Clint: Thanks, Corey. Always enjoy talking to you.Corey: Clint Sharp, CEO and co-founder of Cribl. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice along with an angry comment. And when you hit submit, be sure to follow it up with exactly how many distinct and disparate logging systems that obnoxious comment had to pass through on your end of things.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.Announcer: This has been a HumblePod production. Stay humble.

god ceo amazon time hollywood starting man security missouri silicon valley cloud kansas city stream ram drivers clint api cisco last week screaming accenture aws linux iso vm devops palo alto passengers tata imac s3 silver lake kubernetes lambda servicenow pci splunk honeycomb ohh deployments vms ssh datadog ubiquity elasticsearch ebs alternately aws lambda sharpens ec2 yaml appdynamics thousand eyes corey quinn sandboxes duo security opendns sysdig minio logz frank slootman lightstep cloudwatch signalfx duckbill group syslog chief cloud economist last week in aws humblepod clint you

Co-founder trust & making major pivots w/ Daniela Miao & Khawaja Shams @ Momento

Engineering Founders

Play Episode Listen Later Mar 3, 2022 44:00

How do co-founders know when it's time to pivot? And how do you make sure your team is on board with the new direction? Daniela Miao and Khawaja Shams (Co-Founders @ Momento) join us to talk about their experience pivoting from a consumer Social Fitness app to a B2B SaaS company. The technical co-founders share their story of starting with a team and not a product, how they decided on a new direction, establishing open communication, and getting feedback from the market.ABOUT KHAWAJA SHAMSKhawaja is a technical hands-on leader, passionate about investing in people, setting a bold vision, and execution with his team. At AWS, he owned DynamoDB, a highly available fully managed database service serving at extreme scales! It powers much of Amazon retail, Amazon Video, and control planes of critical AWS Services. Khawaja subsequently owned product and engineering for all 7 of the AWS Media Services, responsible for streaming some of the most visible events in the world, including the Super Bowl and the world's first Live 4K Stream from Space. He was awarded the prestigious NASA Early Career Medal for his contributions to the Mars Rovers.“It took us some time, but we eventually internalized that we're not the domain experts in this. And in some cases, we learned that the investors knew more about the space than we did. And that's a bad sign, right? Like that's a great thing for an entrepreneur... but it's a really difficult position to put the investors in.”- Khawaja Shams ABOUT DANIELA MIAODaniela Miao is the co-founder of Momento, a serverless distributed caching platform. Previously, she was the Director of Platform Engineering at Lightstep, and tech lead at AWS DynamoDB. Daniela has spoken at many events including re:Invent, QCon, and Kubecon. At Momento, she works on distributed system performance, observability, security, and the intersection of engineering with business."The hardest conversation... I think I can speak for both of us when I say this, was actually with each other. You know, imagine sort of that brewing sense of doubt and wanting to broach the conversation.And this is a BIG pivot, right? It's it has nothing to do with each other. And I think that was really profound. It normalized having a pivot... after that, the rest actually felt a lot easier...- Daniela MiaoSHOW NOTES:How Daniela and Khawaja met (2:35)Starting with a team, not a product (3:56)Choosing the idea for a product (5:02)Pivoting from a consumer fitness app to B2B SaaS (6:41)Getting the team on board with a pivot (13:08)How to establish open communication (16:43)Advice for pivoting a startup (20:34)How to know when to pivot (23:52)Why focus so much on values in the early days? (28:04)Go-to-market tips for technical leaders (32:58)Getting feedback from the market (35:19)Rapid Fire Questions (38:06)

What It Takes to Go from CNCF Sandbox to Incubation

acquisition dev servicenow sre observability lightstep

Play Episode Listen Later Nov 23, 2021 12:35

The number of Cloud Native Computing Foundation (CNCF) projects has exploded since Kubernetes came onboard, setting the stage for hundreds of tools and platforms that have achieved the various CNCF project maturity milestones of Sandbox, Incubated or Graduated.With the profound influence the adoption of the projects have had on cloud native notwithstanding, it can be easy to sometimes overlook the monumental effort involved in every project by their contributors. In this The New Stack Makers podcast, we look at two CNCF projects that have gone from sandbox to incubation: Crossplane, a Kubernetes add-on for infrastructure assembly and OpenTelemetry, which supports a collection of tools, APIs, and SDKs for observability.The podcast featured guests involved with the projects including Dan Mangum, senior software engineer, for cloud platform provider Upbound (Crossplane), Constance Caramanolis, principal software engineer, data platform provider Splunk, and on the OpenTelemetry Governance Committee and Ted Young, director of developer education, observability platform provider Lightstep and an OpenTelemetry co-founder who is also on the OpenTelemetry Governance Committee.Alex Williams, founder and publisher of The New Stack, hosted this podcast.

makers apis sandbox graduated kubernetes sdks splunk incubation alex williams cncf incubated new stack lightstep crossplane new stack makers

The acquisition of Lightstep and what it means for observability - Episode 127

What the Dev?

Play Episode Listen Later Oct 5, 2021 16:00

In this week's episode of the SD Times "What the Dev?" podcast, editor-in-chief David Rubinstein discusses the acquisition of observability company Lightstep by ServiceNow and what that means for the observability and SRE space. His guest is Austin Parker, lead developer advocate at Lightstep who will also be talking about the OpenTelemetry project.

What is Observability?

IT Experience Podcast - ServiceNow

Play Episode Listen Later Sep 28, 2021 12:26

Forrest Knight joined us this week to discuss customer challenges around application delivery and how Lightstep solves it via Observability. And, check out this excellent overview of Lightstep by Forrest. See omnystudio.com/listener for privacy information.

observability lightstep

What is Observability?

ServiceNow Podcasts

Play Episode Listen Later Sep 28, 2021 12:26

observability lightstep

142. Brent Hodge Director, Producer of The Holy Game | PharmaBro | I am Chris Farley

Two and a Half Amigos

Play Episode Listen Later Sep 11, 2021 78:08 Transcription Available

Join Scott and Mark with their guest, Brent Hodge, as they talk about the journey of coming to film Industry. Brent Hodge is a Canadian-New Zealander documentary filmmaker and entrepreneur. He is best known for his documentaries I Am Chris Farley, A Brony Tale, The Pistol Shrimps, Freaks and Geeks. He has also done corporate work for ESPN, Time magazine, Karlie Kloss, CBC Music, Tourism Alberta, and National Film Board of Canada (for the movie Hue: A Matter of Colour), as well as technology startups Uber, City Storage Systems, Lightstep, Hootsuite and Steve Russell's analytics startup Prism Skylabs. Early Life: Hodge grew up in the City of St. Albert, Alberta, but moved to Victoria, British Columbia at the age of 12. He was first exposed to filmmaking in his entrepreneur class at Mount Douglas Secondary School. After high school, he attended the University of Victoria for a year before completing a degree in commerce at the University of Otago in Dunedin, New Zealand. Upon completing his degree he returned to Canada, attending School Creative in Vancouver, during which time he did sketch comedy with Chris Kelly, Zahf Paroo as well as Ryan Steele and Amy Good murphy from The Ryan and Amy Show. Hodge holds dual citizenship for both New Zealand and Canada. Story of “I Am Chris Farley”: Chris Farley lived his life full speed and committed to make everyone around him laugh out loud, and I Am Chris Farley will tell his hilarious, touching and wildly entertaining story for the first time ever – from his early days in Madison, Wisconsin, and at Marquette University, through his work at the legendary Second City to his rapid rise to the top of the comedy world on “Saturday Night Live” and in hit films like Tommy Boy and Black Sheep. About Brent Hodge: Brent Hodge is a Canadian-New Zealander documentary filmmaker and entrepreneur. He is best known for his documentaries I Am Chris Farley, A Brony Tale, The Pistol Shrimps, Freaks and Geeks: The Documentary, Chris Farley: Anything for a Laugh, Who Let the Dogs Out and Pharma Bro. He has been nominated for six Leo Awards for his documentary movies Winning America, What Happens Next? and A Brony Tale, winning one for A Brony Tale in 2015. He was nominated for two Shorty Awards under the "director" category in 2014 and 2015 for his work on The Beetle Road trip Sessions and A Brony Tale. Hodge also won a Canadian Screen Award in 2014 for directing The Beetle Roadtrip Sessions with Grant Lawrence. Hodge directed I Am Chris Farley in 2015 with Derik Murray of Network Entertainment. The documentary is based on the life of comedian-actor Chris Farley and features interviews with numerous actors, comedians, and others who worked with Farley during his career. The film was long-listed for an Academy Award. In 2014, Hodge released his critically acclaimed documentary A Brony Tale. It delves into the world of the teenage and adult fans of the television show My Little Pony: Friendship is Magic (called "bronies") through the eyes of musician and voice actress Ashleigh Ball on her trip to the 2012 BronyCon. Outline of the Episode: ● [02:03] Brent Hodge Introduction. ● [04:31] How Brent Came Into Film Industry ● [07:50] How Brent completed the documentary with “ I Am Chris Farley”. ● [13:10] How was the experience with the big names during the documentary. ● [19:58] How you prepare for the Interviews. ● [27:29] Journey during the “Who let the dogs out”. ● [33:27] Talking about the Soccer film. ● [42:10] Insight on Importance of Re-evaluating the work on Consistent Basis. ● [53:40] How Brent follows the north American Soccer team. ● [58:07] Favourite Conspiracy ● [59:46 ] Piece of Information that is Illegal to know. ● [01:04:34] Talking about the discoveries during the documentary that are crazy and how to handle the situation with Crew. ● [01:08:11] Guilty Pleasure Catch! Website: https://www.hodgeefilms.com/ Instagram: http://instagram.com/hodgeepodgee/ Twitter: http://twitter.com/hodgeepodgee Facebook: http://www.facebook.com/hodgeefilms/ Youtube: https://www.youtube.com/user/hodgeepodgeeluvsyou Connect with AmigosPC! Website:https://www.amigospc.net Facebook:https://www.facebook.com/TwoandahalfAmigos Instagram:https://www.instagram.com/amigospc Twitter:https://twitter.com/AmigosPC Check out Offical AmigosPC Merch at: https://teespring.com/stores/amigospc Listen to the AmigosPC podcast on the following platforms: https://podcasts.apple.com/us/podcast/the-3-amigos... https://www.spreaker.com/show/the-two-and-a-half-a... https://www.iheart.com/podcast/256-two-and-a-half-... Join the conversation with the Amigos by becoming a member of Amigospc get direct access to our discord and other cool free stuff amigospc.supercast.tech

Build or Buy: Scaling Through Acquisitions & Ownership w/ Marianna Tessel & Aileen Lee #58

The Engineering Leadership Podcast

Play Episode Listen Later Aug 24, 2021 31:30

Marianna Tessel (CTO @ Intuit) & Aileen Lee (Founder/Managing Partner @ Cowboy Ventures) cover how to navigate the build vs. buy decision! They share the frameworks they use to make a “buy” decision, how they assess engineering talent during acquisitions, how they decide between vendor software vs. open-source vs. building yourself. Plus the leadership skills that help Marianna lead a 5,000+ person team! MARIANNA TESSEL, CTO @ INTUIT Marianna oversees Intuit's technology strategy and leads all of Intuit's product engineering, data science, information technology and information security teams worldwide. Marianna's been at the forefront of significant tech transformations, including virtualization, cloud, and dev ops. Marianna previously served as Executive VP of Strategic Development at Docker, held leadership roles at VMware, Ariba, and General Magic working on the forefront of significant tech transformations, including virtualization, cloud, and dev ops. the forefront of significant tech transformations, including virtualization, cloud, and dev ops. AILEEN LEE, FOUNDER & MANAGING PARTNER @ COWBOY VENTURES Aileen is founding Partner at Cowboy Ventures, a team that backs seed-stage technology companies re-imagining work and life through technology, what they call “life 2.0”. Cowboy Ventures works with startups like Guild Education, Lightstep, Dollar Shave Club, and Tally. Aileen periodically writes about technology insights and is known for coining the business term “unicorn” for public and private companies valued over $1bn. She has been named to the Forbes Midas List of best investors and Forbes Most Powerful Women, as well as to Time Magazine's 100 most influential people. Prior to Cowboy, Aileen was a partner at Kleiner Perkins Caufield & Byers, was founding CEO of RMG Networks, and worked at Gap Inc in operating roles. She has degrees from MIT and HBS, is mom of 3, wife to a startup founder, an Aspen Institute Henry Crown Fellow and co-founder of the non-profit All Raise - aiming to accelerate success for women in the technology ecosystem. SHOW NOTES About Marianna's role at Intuit (2:33) How many acquisitions / build vs. buy decisions have you had to make? (4:35) Marianna's evaluation framework for buying companies (6:26) Assessing engineering talent in acqui-hires (9:37) How do you decide to buy vendor software or build yourself? (15:41) How do you define what's core to the business vs. context? (19:11) Where are you looking to buy instead of build right now? (21:40) Hard & soft skills that helped Marianna advance her career and run a 5000+ person team (23:48) Were you always good at the "developing talent" and "managing" part of being a CTO? (26:50) BROUGHT TO YOU BY... Jellyfish - Jellyfish helps you align engineering work with business priorities and enables you to make better strategic decisions. Learn more at Jellyfish.co/elc Listen to our Bonus Episode w/ Guillermo Fisher, Director of Engineering, Infrastructure @ Handshake on internal mobility, mission-driven decisions, & self-service infrastructure! Listen HERE: https://spoti.fi/3zdNnXn Special thanks to our exclusive accessibility partner Mesmer! Mesmer's AI-bots automate mobile app accessibility testing to ensure your app is always accessible to everybody. To jump-start, your accessibility and inclusion initiative, visit mesmerhq.com/ELC --- Send in a voice message: https://anchor.fm/engineeringleadership/message

Spoons (Daniel Spoonhower) - On building Lightstep, being customer focused, developing systems at Google scale and much more - #12

Software Misadventures

Play Episode Listen Later Jul 9, 2021 74:06

Spoons is the Co-founder and Chief Architect of Lightstep. He joins the show to talk about building systems at Google scale and various aspects that make Google a weird place than other companies. We talked about Spoons's journey of leaving Google and deciding to join Lightstep as a co-founder. We dig into the challenges during the early days of Lightstep and discuss the importance of speaking to customers to build the right product. We talk about what it's like to start a family and run a startup and how one can be intentional about building a company's culture. As always, we go through some of the misadventures and one of them involves a cable being cut under the English channel.

english google technology developing scale spoons sre chief architect observability customer focused lightstep

Season A - Ep. 7: Anadelia Fadeev, Director of Demand Generation and Growth at Teleport on Lead Gen and Brand Awareness

SaaShimi

Play Episode Listen Later Jun 11, 2021 17:21

Anadelia Fadeev, Director of Demand Generation and Growth at Teleport, discusses lead gen and brand awareness, how she measures effectiveness of campaigns, and "must have" online tools.Anadelia Fadeev's BioAnadelia is a Director of Demand Generation and Growth at Teleport, a Kleiner Perkins-backed software company that allows engineers and security professionals to unify access to Servers, web applications, and databases.Prior to Teleport, Anadelia was Demand generation at several tech startups, including LightStep, InfluxData, ToutApp (acq. by Marketo), and Inkling. She started her career at Visage Mobile where she oversaw all aspects of marketing.Time Stamps00:10 Anadelia and Teleport 3:13 Brand Awareness6:50 Lead Generation8:00 Measuring effectiveness of campaigns10:02 Anadelia's Tech Stack11:20 Case Study14:00 Bowtie Funnel16:00 Looking for leadsSIGN UP at https://www.saashimi.cloud to receive transcripts of the interviews and news about upcoming guests and events.

director growth measuring servers brand awareness lead gen demand generation marketo teleport kleiner perkins inkling influxdata lightstep toutapp

Deserted Island DevOps with Austin Parker

ceo co founders software scale monarch devops lightstep

Play Episode Listen Later Jun 8, 2021 37:53

About AustinAustin makes problems with computers, and sometimes solves them. He's an open source maintainer, observability nerd, devops junkie, and poster. You can find him ignoring HN threads and making dumb jokes on Twitter. He wrote a book about distributed tracing, taught some college courses, streams on Twitch, and also ran a DevOps conference in Animal Crossing.Links: Lightstep: https://lightstep.com/ Lightstep Sandbox: https://lightstep.com/sandbox Desert Island DevOps: https://desertedislanddevops.com lastweekinAWS.com Resources: https://lastweekinAWS.com/resources Distributed Tracing in Practice: https://www.amazon.com/Distributed-Tracing-Practice-Instrumenting-Microservices/dp/1492056634 Twitter: https://twitter.com/austinlparker Personal Blog: https://aparker.io TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at the Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: This episode is sponsored in part by Thinkst. This is going to take a minute to explain, so bear with me. I linked against an early version of their tool, canarytokens.org in the very early days of my newsletter, and what it does is relatively simple and straightforward. It winds up embedding credentials, files, that sort of thing in various parts of your environment, wherever you want to; it gives you fake AWS API credentials, for example. And the only thing that these things do is alert you whenever someone attempts to use those things. It's an awesome approach. I've used something similar for years. Check them out. But wait, there's more. They also have an enterprise option that you should be very much aware of canary.tools. You can take a look at this, but what it does is it provides an enterprise approach to drive these things throughout your entire environment. You can get a physical device that hangs out on your network and impersonates whatever you want to. When it gets Nmap scanned, or someone attempts to log into it, or access files on it, you get instant alerts. It's awesome. If you don't do something like this, you're likely to find out that you've gotten breached, the hard way. Take a look at this. It's one of those few things that I look at and say, “Wow, that is an amazing idea. I love it.” That's canarytokens.org and canary.tools. The first one is free. The second one is enterprise-y. Take a look. I'm a big fan of this. More from them in the coming weeks.Corey: This episode is sponsored by ExtraHop. ExtraHop provides threat detection and response for the Enterprise (not the starship). On-prem security doesn't translate well to cloud or multi-cloud environments, and that's not even counting IoT. ExtraHop automatically discovers everything inside the perimeter, including your cloud workloads and IoT devices, detects these threats up to 35 percent faster, and helps you act immediately. Ask for a free trial of detection and response for AWS today at extrahop.com/trial.Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. I'm joined this week by Austin Parker, who's a principal developer advocate at Lightstep. Austin, welcome to the show.Austin: Hey, it's great to be here.Corey: It really is. I love coming here. It's one of my favorite places to go. So, let's get the obvious stuff out of the way. You're a principal developer advocate at Lightstep. I know this because I said it a whole sentence ago, which is about the limit of my attention span. What is Lightstep? And what does your job mean?Austin: So, Lightstep is an observability platform. We take traces, and metrics, and logs, and all that good stuff, throw them together in a big old swamp of data, and then, kind of, give you some really cool workflows to help you make sense of it, figure out, hey, where is the slow SQL query? Where is the performance bad?Corey: The way to figure out, in most of my environments, where's the performance bad is git blame, figure out what part I wrote.Austin: But imagine there were, like, 1000, or 100,000 of you all working on this massive distributed system, and you didn't know half—Corey: It would snark itself to death before it ever got off the ground.Austin: Yeah. I mean, I think that's actually most large companies, right? We deliver shippable software only through inertia.Corey: Yeah. Just because at some point, it bounces off all the walls, there's nowhere else for it to go but to production.Austin: Yep. But yeah, you have thousands of people, hundreds of people, however many people, right? I think the whole distributed workforce thing that most people are dealing with now has really made observability rise to the top of your concern list because you don't have the luxury of just going and poking your head around the corner and saying, “Hey, Joanne. What the heck? Why did things break?” You can't just poke someone anymore. Or you can, but you never know what you're going to have to deal with.Corey: It feels weird to call them at home or bug their family members to poke them or whatnot. It just seems weird.Austin: It does. And until Amazon comes out with a minder drone that just, kind of like, hovers over your shoulder at all times, and pokes you, when someone is like, “Hey, you broke the build.” Then I think we're going to need observability so that people can sort of self-serve, figure out what's going on with their systems.Corey: Cool. One of the things I'm going to point out is that I've had a bunch of people attempt to explain what distributed tracing is and how observability works, and it never really stuck. And one of the things that I found that did help explain it—and we didn't even talk about this in the pre-show, while we figure out how to pronounce each other's names—but one of the things that has always stuck with me is the interactive sandbox on Lightstep, which used to be prominently featured on your page; now it's buried in the menu somewhere. But it's an interactive sandbox that sets up a scenario, problem you're trying to solve, gives you data—so it gets away from the problem of, “Step one, have a distributed application where it's all instrumented and reporting things in.” Because in a lot of shops, that's not exactly a small lift that you can do in an afternoon to start testing things like this out. It's genius. It shows what the product does, how it works, mapped to the type of problems people will generally encounter. And after I played with this, “Oh, my stars, I get it.”Austin: We actually just recently updated that to add some new stuff to it because we shipped a feature called ‘Change Intelligence' where you can take actual time-series metrics, and then overlay those on traces and say, “Hey, I saw a weird spike,” and highlight that, and then we go through, look at all the traces for that service and its related services during that time, and tell you, “Hey, we think it might be this. Here's things that are highly correlated in those time windows.” So, if you haven't checked it out recently, go back and check it out. It's—yeah, a little more hidden than it used to be, but I believe you can find it at lightstep.com/sandbox.Corey: Yeah. And there's no sign up to do this. It's free access. It asked for an email address, but that's okay, I just use yours. No, I'm not kidding. I actually did. And, yeah, it works; it shows exactly what it is. It even has, instead of ‘start' it says ‘play' because that's fundamentally what it is. If you're trying to wrap your head around distributed tracing, take a look at this.Austin: Yes, definitely. I have a long-standing Jira ticket to add achievements to that.Corey: Oh, that could be fun. You could bury some, too, like misusing services as databases—Austin: Ooh.Corey: —or most expensive query to get the right answer.Austin: Yeah. And then maybe, like, there's just one span, kind of, hidden there where it's ‘using Route 53 as a database.'Corey: I keep seeing that cropping up more and more places. That's something I get to own and that's an awful lot of fun. Speaking of gamification and playing in strange ways, one of the things you did last year that I wasn't paying attention to—because, you know, there was a pandemic on—was you were one of the organizers behind Desert Island DevOps which is a strange thing that I've only recently delved into—delven into—gone spelunking inside of. There we go.It wasn't instrumented for observability—buh-dum-tss. But it's fundamentally a DevOpsDays that takes place inside the animated world of Animal Crossing's New Horizon, which is apparently a Nintendo game, which is apparently a game company.Austin: Yeah.Corey: It is not really my space. I don't want to misspeak.Austin: No, you hit it. ‘Deserted.' Deserted Island [crosstalk 00:05:43].Corey: Oh, ‘Deserted.' Ah, got it. And don't spell it as ‘dessert' either, as in this would be a delicious game to play.Austin: I mean, it is a delicious and comforting sort of experience. If you aren't familiar with Animal Crossing, the short 30-second explanation is it is a life simulator, building game where, you as your character, you are on an island, and there are relatively adorable animal NPCs that are your villagers, and you can talk to them, and they will say funny things to you. You can go around and do chores like picking up fruit or fishing. And the purpose is, kind of, do these chores, get some in-game currency, and then go spend that in-game currency on furniture so that you can make a pretty house, or buy pretty clothing. And it came out at a perfect time last year because everyone was about to bundle inside for the—well, we're still inside—but everyone had to go inside. And suddenly, here's this like, “Oh, it's just this cute, sort of like, putz around and do whatever.”Corey: It was community-oriented. It was more of a building-oriented game than a destruction game.Austin: Yeah.Corey: It's the sort of thing that is a great way of taking your mind off your troubles. It is accessible to a bunch of people that aren't generally perceived as gamers when you think of that subculture. It really is an encompassing, warm, wonderful thing—by all accounts—and you looked at it and figured, “All right, how can we ruin something?” And the correct answer you got to is, “Let's pour DevOps on it.”Austin: Yeah. Let's use this as an event platform, and let's really just tech-bro this shit up.Corey: And it seems to work super well. At the time of this recording, I have submitted a talk that I live-streamed my submission around, and I have not heard in either direction. To be perfectly frank, I forget what I wound up submitting, which is always a bit of a challenge, just because I make so many throwaway random jokes that, cool. Well, we'll see how it plays out. I think you were even in the audience for that on the Twitch stream.Austin: Yeah. You found some bugs on the CFP form [laugh] that I had to fix.Corey: To be clear, the reason I do those things is not because it's a look how clever I am, but rather to instead talk about how it's not scary to submit a talk proposal. Everyone has a story that they can tell. And you don't need a big platform or decades of experience in this space to tell a story. And that was my goal, and I think I succeeded. You would have the numbers more than I do; I hope people wound up submitting based upon seeing that. I want to hear voices that, frankly, aren't ours all the time.Austin: I think in, like, a week, we basically got more submissions than we did for the entire CFP last year. One thing that I kind of think is interesting to bring up because you bring up, oh, we don't hear a variety of voices, right? One thing I tell people, and I know that it's not universally applicable advice, but I got into DevRel as a—not quite luck, but, like, everything in my life is luck, on some level. It always plays some level of importance. But I didn't go to school to get into DevRel, I didn't do a lot of things.I have actually been in tech, maybe—depending on how you want to count it—in terms of actually being in a software development job or primarily software development job, maybe, like, five or six years, give or take. And before that, I did a lot of stuff. I was a short-order cook; I worked at gas stations; I did tech support for Blackberry, and I did a lot of community organization. I was a union organizer for a little while. I like DevRel because it's like, oh, this kind of integrates a lot of things I'm interested in, right?I enjoy teaching, helping people, and helping people learn, but I also like talking; I like to go and be a public figure, and I like to build a platform and use that to get a message out. And I think what I did with Deserted Island, or what the impetus there was, we suddenly were in a situation where it's like, “Hey, there's a bunch of people that normally get together and they fly around the globe in decent airplane seats, and people come and see us talk.” Because why? Because they think we know what we're talking about, or because we have something that shows we know what we're talking about, or however you want to say it. But in a lot of cases, I think people are coming for that sort of community, they're coming because, “Hey, I can go to a room and I can sit in some weird little hotel, or conference center, or whatnot, and everyone I look at, everyone I see is someone that is doing what I'm doing, on some level. These are all people that are working in technology, they're building things, they're solving problems.”And that goes away really quickly when you get into this remote-first world, and when we can't travel, and we don't have that visual aspect. So, what I wanted to do with Deserted Island, what I thought what was important about it is, I was already sick of Zoom by the time, everyone went to Zoom; I was already sick of the idea of, oh my god, a year or two years of these sort of events and these community things just being, like, everyone's staring at a bunch of slides and a talking head. Didn't sound very appealing, so what if we try something different? What if we do something where it's like, look, we're going to take people out of their day; we're going to put them in somewhere else. And maybe that's somewhere else is just, hey, you're watching people run around on an Animal Crossing Island on a Twitch stream.But that sort of moment of just, like, this isn't what you would normally be doing, I think takes people's heads out of their normal routine and puts them in a place where they can learn, and they can feel community, and they can feel, like, a kinship. I also think it's really important because it's that whole stupid New Yorker joke of, “On the internet, nobody knows you're a dog.” We have this really cool opportunity to craft who we are as people, and how we present that to the world. And for a lot of people, you're stuck inside; you don't get that self-expression, so here's a way to be expressive, right? Here's a way to communicate who you are on a level that isn't just a profile picture or something, or things that don't work as well over Zoom.It's a way to help project your identity. And that, I think, gives more weight to what you're saying because when you feel like, “Hey, this is more of who I am,” or, “This is a representation of me. I can show something about who I am.” And that helps you speak. And that helps you deliver, I think, an effective talk. And that, again, builds community and builds these bonds.Corey: I want to talk to you about that, specifically because you are one of those people that aligns very much with my view of the world on developer marketing. But I don't want to lead you too much on this, so why don't you start? Take it away. Where do you stand on developer marketing? And what do people get wrong?Austin: I think the thing that a lot of people get wrong is that they try to monetize the idea of community. If you go and you search, insert major company name here; you search “Amazon community,” or you search “Microsoft community,” or you search “Google community,”—well, if you do that, you'll get no results, but whatever, right? You get the picture that marketers in a way have turned the idea of developer community into something that you can just throw a KPI or throw an OKR on and squeeze it for money. And I don't like that. I'm not very comfortable with that idea of community—because I think community in a lot of ways, it's like family. And the families that you like the best are the ones you choose. I think this is—Corey: The family you choose is an important concept.Austin: Right. And for the most part… so much of human experience activity is built around finding those people you choose, and those communities develop out of that. I use AWS sometimes, I don't necessarily know if I would put myself in a community with every other AWS user. I—Corey: Oh, I certainly wouldn't. This is the problem. Everyone thinks when you talk about community or a group of people doing something, they're ‘other people' that are in some level of otherness. And that's—like there are entire communities around AWS that I do not talk to, I do not see, I do not pretend to understand.Austin: Yeah, even at Lightstep. We're not a massive, massive company by any means, but we have a bunch of different users that are using our tool in different ways. And they all have different needs, and they all have different wants. So, I could say, “Oh, here's the Lightstep community.” But it's not a useful abstraction.It's not a useful way to abstract all of our users because any tool that's worth using is going to be this collection of other abstractions and building blocks. Like, you… I don't know, look at something like Notion, or look at something like Airtable, or the popularity of low or no-code stuff, where someone built a platform and then other people are building stuff on top of that platform, if you go to those user groups or you go to those forums, and it's just like, there's a million, million different varied use cases, and people are doing it in different ways, and some people are building this kind of application, or that kind of application, or whatever. So, the idea of, oh, there's a community and we can monetize that community somehow, I'm uncomfortable with that from, sort of, a base level. And I'm uncomfortable with the idea of the DevRel industry—or the developer marketing industry—kind of moving towards this idea of, like, we're going to become community marketers or whatever. I think you have to approach people as individuals.And individuals are motivated by a lot of things. They're motivated by, can you solve this problem? Do I like you? Are you funny? Whatever. And I believe that if you're a developer tool, and you are trying to attract developers, then [sigh] it works a lot better, I think, to have just individuals, to have people that can help influence the much broader—the superset of all developers that might have an interest in what you're doing by being different, I guess.Being something that's like, hey, this is entertaining, or this is informative, or this is interesting. The world is not a meritocracy. The world is governed by many, many different things. You're not going to win over the developer industry simply by going out and having the best white papers, or having one more ad read than your competitor. You need to do something to get people interested and excited in [sigh] a way that they can see themselves using it.It's like, why did Apple go and do ‘Think Different' ads? Because it's like, you using a Mac, that's kind of like being Einstein, or that's kind of like being Picasso. This is basic marketing stuff that I feel like a lot of technical marketers or developer marketers sort of leave at the door because they think the audience is too sophisticated for it, or their—Corey: I'll even soft-launch it here because I haven't at this point in time, talked about it in public, but if you go to lastweekinAWS.com/resources we wrote our own developer marketing guide because I got tired of explaining the same type of thing again, and again, and again. It asks for an email address and it sends it to you—I know, I'm as guilty as any. And I, of course, called it ‘Devreloper,' which is absolutely a problem with me and I talk about things. But I'm right.And it goes to an awful lot of what you're saying. An example that you just talked about of giving people something rather than trying to treat them as metrics, one of the best marketing things I've seen you do, for example, is you wrote O'Reilly's Distributed Tracing in Practice which means if someone has a question about distributed tracing and how it's supposed to work, well, that's not a half-bad resource. And okay, I've read it and I have some further questions. Let me track down the author and ask them. Oh, you work at a company that is in this space? Huh. Maybe I'll look into this. And it's a very long-tail story. And how do you attribute that as far as, did this lead come from someone who read your book or not will drive marketers crazy.Austin: Oh, it's super hard. And it does drive them crazy. [laugh].Corey: Yeah, my answer is, I don't know and I don't care. One of the early sponsors of this podcast sponsored for a month and then didn't continue because they saw no value. A month goes by, they bought out everything that held still long enough, and, “Thank you for your business.” “Can you explain to me what changed?” “Oh, we talked to some of our big customers and it turned out the two of them had heard about us for the first time on your show.”And that inspired them to start digging into it and reaching things out, but big companies, corporate games of telephone, there was no way to attribute that. My firm belief is, on some level, that if you get in front of an audience with a message that resonates and—and this is the part some people miss—is something that solves an actual problem that they have. It works. It's not necessarily predictable and it's hard to say that this thing is going to go big and this thing isn't. So, the solution, on some level is just keep publishing things that speak to your audience. But it works, long term. I'm living proof of this.Austin: Yeah. I think that it makes a lot more sense to… rather than to do, sort of, I don't want to say vanity metrics, but kind of vanity metrics around, like, oh, this many stars, or this many forks, or whatever. There's a lot of people, especially in this OSS proximate world. Where you have a lot of businesses that are implicitly or explicitly built on top of an open-source project, not everyone that is using your open-source project is going to, one, be capable of converting into a paid user, or two, be super interested in it. And I would rather spend time thinking about, well, what is the value someone gets out of this product?And even if that only thing is, is that, hey, we know what we're talking about because we've got a bunch of really smart people that are building this product that would solve their problem. If you want to go out and build your own internal observability solution using completely open-source tools Grafanas and Prometheuses of the world, great. Go for it. I'm not going to hold you back. And for a lot of people, if they come to me and say, “Well, this is what we got, and this what we're thinking about.”I'll say, “Yeah. Go for it. You don't need what we're offering.” But I can guarantee you that as it scales and as it grows, then you're going to have a moment where you have to ask yourself the question of, “Do I want to keep spending a bunch of time stitching together all these different data sources, and care and feeding of these databases, and this long term storage, and dealing with requests from end-users, or I just want to pay someone else to solve that problem for me? And if I'm going to pay someone else, shouldn't I pay the people who literally spend all day every day thinking about these problems and have had decades of experience solving these problems at really big companies that have a lot of time and effort to invest in this?”Corey: This episode is sponsored in part by our friends at Lumigo. If you've built anything from serverless, you know that if there's one thing that can be said universally about these applications, it's that it turns every outage into a murder mystery. Lumigo helps make sense of all of the various functions that wind up tying together to build applications. It offers one-click distributed tracing so you can effortlessly find and fix issues in your serverless and microservices environment. You've created more problems for yourself; make one of them go away. To learn more, visit lumigo.io.Corey: Oh, yeah. We're doing some new content experiments on our site, and what we're doing is we're having some folks write content for us. Now, when people hear that, what a lot of marketers will immediately do is dive down the path of, “Ah. I'm going to go ahead and hire some content farm.” Well, that doesn't work, I found that we wound up working with individual people that work super well.And these are people who are able to talk about these things because their day job is managing a team of 30 SREs or something like that, where they are very clearly experts in the space. And I want to be very clear, I'm not claiming credit for our content writers; they get their own bylines on these things.Austin: Yeah.Corey: And it turns out that that, over time, leads to good outcomes because it helps people what they need. There's the mystical SEO Juju that I don't pretend to understand, but okay, I'm told it's important, so fine, whatever. And it makes for an easier onboarding story, where there are now resources that I can trust and edit if I need to, as things change, that I can point people to, that isn't a rotating selection of sketchy sites.Austin: Mm-hm. I think that's one thing that I would love to see more of, just not in any one particular part of the tech industry, but overall, the one thing I've noticed, at least in the pandemic, during this whole work-from-home, whatever, whatever, we don't talk enough. And it sounds maybe weird, but I think this actually goes back to what you're saying earlier, about everyone having a story to tell. People don't feel comfortable, I think, putting their opinion out there or saying, “Hey, this is what worked. This is what didn't work.”And so if you want to go find that out—like, if I wanted to go write something about, hey, these are the five things you should do to ensure you have great observability, then that's going to involve a lot of me going around and sort of Sherlocking my way through StackOverflow posts, and forums, and reaching out to people individually for stories and comments and whatever. And I would love to see us get to a point where we're just like, “Actually, no. This isn't—we should just be sharing this. Let's write blogs about it.” If you're sitting there thinking no one's going to find this useful, right—like, you solve a problem, or you see something that could have worked better, and you're like, “Eh, no one else is going to find that valuable.”I can almost guarantee you that someone is going to find that valuable. Maybe not today, maybe not tomorrow, but go ahead and write about your experiences, write about the problems you've solved, write about the things that have vexed you, and put that on the internet because it's really easy to publish stuff on the internet.Corey: Yes. Which is a blessing and a curse. That is very much a double-edged sword.Austin: That very much is a double-edged sword. But I think that by biasing towards being more open, by biasing towards transparency and sharing what works, what doesn't work, and having that just kind of be the default state, I'm a big proponent of things like radical transparency in terms of incident reports, or outages, or hiring, or anything. The more information that you can put in the world is going to—it might not make it better, but it at least helps change the conversation, gives more data points. There was a whole blow-up on Twitter this week, where someone posted like, “Hey, this is a salary I'm looking for.” I think you—Corey: Oh, yeah. She's great.Austin: Yeah, she's worth it, right? And the thing that got everyone's bee in a bonnet was, like, she's saying, “Oh, I want $185k.” And it's like, “Well, why don't we just publish that information?” Why isn't everyone just very open and honest about their salary expectations? And I know why: because the paucity of information is a benefit to employers and it works against employees.There was a lady that left—gosh, where was it? [sigh] I forget the company, but she left because she found out she was systemically underpaid compared to their male peers. Having these sort of information imbalances don't really help the people at the bottom of the pyramid. They don't help the little guys. They really only help the people that are in the very large companies with a lot of clout and ability to control narratives.And they want it to stay that way; they don't necessarily want you to know what everyone's salary is because then it gives you, as someone trying to get a job, a better negotiating position because you know what someone with your level of experience is worth to them.Corey: It's important to understand the context behind these salary negotiations and how to go about getting interviews and the rest. The entire job-hunting process is heavily biased in favor of employers because, especially at large employers, they go through this multiple times a week, whereas we go through this, as employees basically, every time we change jobs. Which for most people is every couple of years and for me, because of my mouth, it's every three weeks.Austin: Yeah. I'm not saying it's a simple solution. I am advocating for, sort of, societal, or just cultural shifts, but I think that it all comes full circle in the sense that, hey, a big part of observability is the idea that you need to be able to ask arbitrary questions. You want to know about unknown unknowns. And maybe that's why I like it so much as a field, why I like tracing, why I like this idea.Because, yeah, a lot of things in the world would be interesting, and different, and maybe more equitable if we did have more observability about not just, hey, I use Kafka, I use these parameters on it, and that gives me better throughput, but what if you had observability for how HR runs? What if you had observability for how hiring is done? And that was something that you could see outside of the organization as well. What if we shared all this stuff more, and more, and more, and we treated a few less things as trade secrets? I don't know if that's ever going to happen in my lifetime, but it's my default position. Let's share more rather than less.Corey: Yes, absolutely. Especially those of us with inordinate amounts of privilege. And that privilege takes different forms; there's the usual stuff people are talking about in terms of the fact that we are over-represented in tech in many respects, but there are other forms of privilege, too. There's a privilege that comes with seniority in the space, there is a privilege with being a published author, in your case, there is privilege in having a broad audience, like I do. And it just becomes this incredibly nuanced story.The easiest part of it to lose sight of—at least for me—is I tell stories about what has worked for me and how I've done what I do, and I have to be constantly conscious of the fact that there is that privilege baked in and call it out where I can. I've gotten much better at that, but it's an ongoing process. Because what works for me does not work for other people across a wide variety of different axes. And I don't want people to feel bad based upon what I say.Austin: Oh, yeah, absolutely. I mean, I'm in the same boat. Like, I tend to be very irreverent and/or shitpost-y and I don't have much of an explanation other than, I learned at some point in my life, that it's just… [sigh] I would rather go through life shitposting on Twitter, rather than be employable. It's just who I am. There's—I'm sure some people think I come off as rude. I don't know. I also agree, you'd never punch down. You only punch up. But you never know how other people are going to take that, and I don't think that it always gets interpreted in the spirit it was meant. And I can always do better, right?Corey: As can we all. The hard part for, I think, a lot of us is to suppress that initial flash of defensiveness when someone says you didn't quite get there, and learn from the experience. One of the ways I do that, personally, is I walk away before responding, sometimes. I want to be a better version of myself, but when I get called out of—like, this tweet thread is the whitest thing I've seen since I redid my bathroom walls, and I get a flash of defensiveness, “Excuse me. That's not accurate.”And… and then I stop and I think, and then sanity prevails, where it's, yeah. There's a lot of privilege baked into my existence, and if I don't see it, that doesn't mean it's not there. I have made it a firm rule of not responding defensively to things like that, ever. And there are times when I get called out for aspects of how I present that I don't believe are justified, to be very honest. But that is a me thing; that is not them, and I welcome the feedback, regardless. If you make people feel like a jerk for giving you feedback, they stop giving you feedback. And then where are you?Austin: Yeah. Funny anecdote. I wrote a blog for my personal blog a little while ago about, oh, togetherness, community, something like that. But I wrote—the intro was something like, talking about why people love Sweet Caroline, right? Favorite song in the world.Corey: [sings].Austin: [joins in]. Yeah.Corey: Yeah. I'm not allowed to play with that song here at The Duckbill Group because one of our employees is named Caroline and, firm rule: don't make fun of people's names. They're sensitive about it, and let's not kid ourselves here, I own the company. Even if she says, “It's fine, I love it.” That doesn't help because I own the company. There is a power imbalance here.Austin: Yeah.Corey: I don't know that she would feel that she had the psychological safety to say, “That's not funny.” I absolutely hope she would because that's the culture that I spend significant effort on building, but I can't depend on that. So, I don't go down the path of making those jokes. But I—yes, I love the intro to the song. Please continue.Austin: It's great. Everyone loves it. So, the intro of my initial paragraph was ruminating on that. And this post went around enough that it got submitted to Hacker News a few times, and the only comment it got was some mendacious busybody Hacker News type going on about why I would be so racist against white people. [laugh]. And I was just like, “And this is why I don't come to this website at all.”Corey: Yeah. There are so many things on Twitter that are challenging and difficult and obnoxious, and it's still the best thing we have for a sense of community. This has replaced IRC for me, to be perfectly honest.Austin: Yeah. No, I used to be big on IRC, and then I left because [sigh], well, a couple reasons. One, I really liked being able to post gifs.Corey: Yeah, that is something where the IRC experience is substandard. I was Freenode network staff for years—Austin: Oh wow.Corey: —and that was the thing to do. Now, turns out that the open-source dialogue and the community dialogue have shifted form. And I still hang out there periodically for specific things, but by and large, it's not where the discourse is.Austin: Yeah, it is interesting. It's something that concerns me, kind of, in a long term sense about not only our identity but also, sort of, the actual organic communities we formed, we've put on to these extremely unaccountable privately held platforms whose goal is monetization and growth so that they can continue to make money. And for as much as anyone can rightfully say, “Hey, Twitter's missed the mark,” a lot of times, it is a hard balance to strike. They don't have simple questions to answer, and I don't necessarily know if the nuance of their solutions has really risen to the challenge of answering those well, but it's a hard thing for them to do. That said, I think we're in a really awkward position where suddenly you've got the world's collection of open-source software is being hosted on a platform that is run by Microsoft, and I am old enough to remember. “Embrace, extend, extinguish.”Corey: Oh, yeah. I made an entire personality out of hating Microsoft.Austin: Yeah. And I mean, a lot of people still do. I read MacRumors sometimes, and they're all posting there still. Or Slashdot.Corey: I wondered where they'd gone. I didn't think everyone had changed their mind.Austin: I had just a very out-of-body moment yesterday because someone replied to a comment on mine about Slashdot on it, and then the Slashdot Twitter account liked it. And there exists a photo of me from when I was a teenager, where I owned a Slashdot ballcap. And that picture is somewhere in the world. Probably not on the internet, though, for very good reason.Corey: I'm mostly just still reeling at the discovery that there's a Slashdot Twitter account. But I guess time does evolve.Austin: It does. It makes fools of us all.Corey: It really does. Well, Austin, thank you so much for taking the time to speak with me. If people want to learn more about what you're up to, how you view the world, et cetera, et cetera, et cetera. Where can they find you?Austin: So, you can find me on Twitter, mostly, at @austinlparker. You can find my blog with various musings that is updated frequently at aparker.io and you can learn more about Deserted Island DevOps 2021, coming on April 30th this year, at desertedislanddevops.com.Corey: Excellent. And we will put links to all of that in the [show notes 00:34:01]. Thank you so much for taking the time to speak with me. I appreciate it.Austin: Thank you for having me. This was a lot of fun.Corey: It really was. Austin Parker, principal developer advocate at Lightstep. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice and then a giant series of comments that all reference one another and then completely lose track of how they all interrelate and be unable to diagnose performance issues.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.This has been a HumblePod production. Stay humble.

amazon google apple speaking zoom practice microsoft funny embrace twitch cloud nintendo mac new yorker albert einstein route enterprise excuse iot cfp animal crossing screaming picasso aws kpi blackberry notion devops kafka npcs oss sql okr irc airtable stack overflow jira deserted think different sweet caroline new horizon hn hacker news devrel deserted island macrumors sres corey quinn devopsdays nmap slashdot extrahop lightstep freenode duckbill group thinkst change intelligence chief cloud economist austin it last week in aws humblepod

Episode 300: No more architecture talk

Software Defined Talk

Play Episode Listen Later May 21, 2021 54:32

This week we discuss cloud migration strategies, the rise of Serverless and the future of PaaS. Plus, advice on how to start your day. Rundown AWS introduces new Application Migration Service (https://www.zdnet.com/article/aws-introduces-new-application-migration-service/) AWS App Runner – Fully managed container application service (https://aws.amazon.com/apprunner/) Develop production-scale modern web apps quickly with Azure Static Web Apps (https://azure.microsoft.com/en-us/blog/develop-production-scale-modern-web-apps-quickly-with-azure-static-web-apps/) Relevant to your interests Only 4% of iOS users in US are opting in to ad tracking, report says (https://appleinsider.com/articles/21/05/07/only-4-of-ios-users-in-us-are-opting-in-to-ad-tracking-report-says) Lightstep is joining ServiceNow (https://lightstep.com/blog/lightstep-joins-servicenow/) Pentagon Weighs Ending JEDI Cloud Project Amid Amazon Court Fight (https://www.wsj.com/articles/pentagon-weighs-ending-jedi-cloud-project-amid-amazon-court-fight-11620639001?mod=hp_lead_pos4) Crypto Markets Are Where the Fun Is (https://www.bloomberg.com/opinion/articles/2021-05-11/crypto-markets-are-where-the-fun-is) You can now buy NFTs on eBay, and ‘blockchain-driven collectibles’ are coming soon (https://www.theverge.com/2021/5/11/22430827/ebay-nft-collectibles-blockchain-sale) TikTok is launching a job hiring service to help find gigs for Gen Z (https://www.axios.com/tiktok-job-hiring-tiktok-576f3b99-602c-46ac-afed-218ddf61a9ba.html) VMware CEO Raghuram On ‘Vigorously’ Competing With Nutanix (https://www.crn.com/slide-shows/data-center/vmware-ceo-raghuram-on-vigorously-competing-with-nutanix) Apple employees circulate petition demanding investigation into "misogynistic" new hire (https://www.theverge.com/2021/5/12/22432909/apple-petition-hiring-antonio-garcia-martinez-chaos-monkeys-facebook) A Conversation With a Dogecoin Millionaire (https://overcast.fm/+oIe8VShIo) Apple’s ‘Find My’ Network Exploited via Bluetooth (https://threatpost.com/apple-find-my-exploited-bluetooth/166121/) Jeff Blackburn Returns To Amazon To Oversee Combined Media & Entertainment (https://deadline.com/2021/05/jeff-blackburn-returns-amazon-oversee-media-entertainment-operations-1234755875/?stream=top) Goldman Sachs Executive Quits After Making Fortune on Dogecoin: Report | The Daily Hodl (https://dailyhodl.com/2021/05/14/goldman-sachs-executive-quits-after-making-fortune-on-dogecoin-report/) KubeCon + CloudNativeCon Europe 2021 (https://www.youtube.com/playlist?list=PLj6h78yzYM2MqBm19mRz9SYLsw4kfQBrC) Antonio García Martínez’s controversial exit from Apple — Recode Media (https://overcast.fm/+QL2c_llMU) Google Cloud CEO predicts boom in business-process-as-a-service (https://venturebeat.com/2021/05/13/google-cloud-ceo-predicts-boom-in-business-process-as-a-service/) GitHub now lets all developers upload videos to demo bugs and features (https://venturebeat.com/2021/05/14/github-now-lets-all-developers-upload-videos-to-demo-bugs-and-features/) 1 big thing: The new digital extortion (https://www.axios.com/newsletters/axios-login-d0286b8d-d3fb-4652-8ad5-14f21c899c8d.html?chunk=0&utm_term=emshare) Oracle sues Envisage claiming unauthorized database use amid licensing crackdown (https://www.theregister.com/2021/05/17/oracle_sues_envisage/) Apple robbed the mob's bank | Mobile Dev Memo (https://mobiledevmemo.com/apple-robbed-the-mobs-bank/) Apple Music subscribers will get lossless and spatial audio for free next month (https://arstechnica.com/gadgets/2021/05/apple-music-subscribers-will-get-lossless-and-spatial-audio-for-free-next-month/) Top 15 Kubernetes Podcasts You Must Follow in 2021 (https://blog.feedspot.com/kubernetes_podcasts/) Happy Blurpthday to Discord, a Place for Everything You Can Imagine (https://blog.discord.com/happy-blurpthday-to-discord-a-place-for-everything-you-can-imagine-fc99ee0a77c0) Apple’s M1 is a fast CPU—but M1 Macs feel even faster due to QoS (https://arstechnica.com/gadgets/2021/05/apples-m1-is-a-fast-cpu-but-m1-macs-feel-even-faster-due-to-qos/) First Enterprise-Grade Distribution of the Popular CNCF Project Crossplane Arrives, (https://www.businesswire.com/news/home/20210518005942/en/Industry%E2%80%99s-First-Enterprise-Grade-Distribution-of-the-Popular-CNCF-Project-Crossplane-Arrives-Bringing-the-Kubernetes-Powered-Universal-Control-Plane-Approach-to-Platform-Teams-Everywhere) PlanetScale grabs YouTube-developed open-source tech, promises (https://www.theregister.com/2021/05/18/planetscale_promises_vitess_dbaas/) Project Starline: Feel like you're there, together (https://blog.google/technology/research/project-starline/) Python programming: We want to make the language twice as fast, says its creator (https://www.zdnet.com/article/python-programming-we-want-to-make-the-language-twice-as-fast-says-its-creator/) Google Workspace turns to "smart chips" to weave Docs, Tasks, and Meet together (https://www.theverge.com/2021/5/18/22440226/google-workspace-smart-canvas-features-docs-updates) Google Cloud launches Vertex AI, a new managed machine learning platform – TechCrunch (https://techcrunch.com/2021/05/18/google-cloud-launches-vertex-a-new-managed-machine-learning-platform/) Freenode IRC staff quit after new owner "seizes" control of network (https://boingboing.net/2021/05/19/freenode-irc-staff-quit-after-new-owner-seizes-control.html) Service Mesh Wars, Goodbye Istio (https://medium.com/polymatic-systems/service-mesh-wars-goodbye-istio-b047d9e533c7) Google revives RSS (https://techcrunch.com/2021/05/19/undead-again-google-brings-back-rss/) Google Wants to Make Everyone Use Two Factor Authentication (https://www.vice.com/en/article/93yyqe/google-wants-to-make-everyone-use-two-factor-authentication?utm_source=newsletter&utm_medium=email&utm_campaign=newsletter_axioslogin&stream=top) Cloudflare launches campaign to ‘end the madness’ of CAPTCHAs (https://www.theregister.com/2021/05/14/cloudflare_cryptographic_attestation_of_personhood_captcha_killer/) Media Statement Updated May 8, 2021: Colonial Pipeline System Disruption (https://www.colpipe.com/news/press-releases/media-statement-colonial-pipeline-system-disruption) Snyk Acquires FossID to Accelerate Worldwide Developer-First Security Adoption (https://snyk.io/news/snyk-acquires-fossid-to-accelerate-worldwide-developer-first-security-adoption/) We raised $100M in our Series F: here’s what we’re building next (Circle CI) (https://circleci.com/blog/series-f/) Amazon Said to Make $9 Billion Offer for MGM (https://variety.com/2021/digital/news/amazon-mgm-acquisition-talks-9-billion-1234975168/) What to Know Ahead of the Squarespace's Direct Listing (https://www.barrons.com/articles/squarespace-direct-listing-51621376597?utm_source=newsletter&utm_medium=email&utm_campaign=newsletter_axioslogin&stream=top) WarnerMedia and Discovery combine to create a $150B streaming giant (https://thehustle.co/05182021-WarnerMedia-Discovery/) Cisco to acquire Indy startup Socio to bring hybrid events to Webex (https://techcrunch.com/2021/05/12/cisco-to-acquire-indy-startup-socio-to-bring-hybrid-events-to-webex/) A Big Day for Business Texting: Twilio Acquires Zipwhip for $850 Million (https://www.zipwhip.com/blog/twilio-acquires-zipwhip-for-850m/) SoftBank Reports Highest-Ever Annual Profit for a Japanese Company (https://www.wsj.com/articles/softbank-reports-highest-ever-annual-profit-for-a-japanese-company-11620803131) Netlify Acquires FeaturePeek and Launches Next Generation of Deploy Previews to Streamline Collaboration for Web Teams (https://www.netlify.com/press/netlify-acquires-featurepeek-and-launches-next-generation-of-deploy-previews-to-streamline-collaboration-for-web-teams) Nonsense Video shows man making slow escape in ride-on mower 'test drive' (https://www.abc.net.au/news/2021-05-13/video-shows-alleged-ride-on-mower-thief-cairns-qld/100137376) Company forgets why they exist after 11-week migration to Kubernetes (https://www.theolognion.com/company-forgets-why-they-exist-after-11-week-migration-to-kubernetes/) Reborn: Journals and Notebooks (https://www.amazon.com/Reborn-Notebooks-1947-1963-Susan-Sontag/dp/0312428502/ref=pd_lpo_14_t_0/131-0459878-9390864?_encoding=UTF8&pd_rd_i=0312428502&pd_rd_r=f0c639d8-2be9-4acc-a954-350a8fa89913&pd_rd_w=ZJBem&pd_rd_wg=snlZD&pf_rd_p=a0d6e967-6561-454c-84f8-2ce2c92b79a6&pf_rd_r=Y0JR2H6A0SZVRHVH29P1&psc=1&refRID=Y0JR2H6A0SZVRHVH29P1) Sponsors CBT Nuggets — Training available for IT Pros anytime, anywhere. Start your 7-day Free Trial today at cbtnuggets.com/sdt (https://cbtnuggets.com/sdt) strongDM — Manage and audit remote access to infrastructure. Start your free 14-day trial today at: strongdm.com/SDT (http://strongdm.com/SDT) Conferences RabbitMQ Summit (https://rabbitmqsummit.com), July 13-14, 2021. SpringOne (https://springone.io), Sep 1st to 2nd. June 3rd modernization webinar for EMEA (https://twitter.com/cote/status/1394655403468804105). SDT news & hype Join us in Slack (http://www.softwaredefinedtalk.com/slack). Send your postal address to stickers@softwaredefinedtalk.com (mailto:stickers@softwaredefinedtalk.com) and we will send you free laptop stickers! Follow us on Twitch (https://www.twitch.tv/sdtpodcast), Twitter (https://twitter.com/softwaredeftalk), Instagram (https://www.instagram.com/softwaredefinedtalk/) and LinkedIn (https://www.linkedin.com/company/software-defined-talk/). Brandon built the Quick Concall iPhone App (https://itunes.apple.com/us/app/quick-concall/id1399948033?mt=8) and he wants you to buy it for $0.99. Use the code SDT to get $20 off Coté’s book, (https://leanpub.com/digitalwtf/c/sdt) Digital WTF (https://leanpub.com/digitalwtf/c/sdt), so $5 total. Become a sponsor of Software Defined Talk (https://www.softwaredefinedtalk.com/ads)! Recommendations Brandon: ServerlessChats (https://www.serverlesschats.com) Matt: Apple M1 Macbook Air (https://www.apple.com/macbook-air/) Coté: OBS (https://obsproject.com) Photo Credit (https://unsplash.com/photos/CSJPm2POibQ) Photo Credit (https://unsplash.com/photos/X89VSmdDKE0)

ThingsTHINKING raises funds | Google collaborates with CrowdStrike | Data-logics now on cloud | Servicenow acquires Lightstep | Norgesmøllene AS deploys Infor Cloud Suite

The SaaS News Roundup

Play Episode Listen Later May 11, 2021 3:40

Deeptech AI start-up, thingsTHINKING, has raised €4.5 million in a funding round led by Early-bird and a number of angel investors. The company's propriety tool ‘Semantha', offers textual processing by providing out-of-the-box functionality, bypassing the dreaded training phase for AI-based applications. Earlybird co-founder, Dr. Hendrik Brandis said, “With semantha, domain experts can finally educate an AI system as if it were a co-worker, using just a few and simple examples”. The current funding allows would help them grow their technological advantage, and market faster, while supporting their international partners.Google LLC, collaborates with CrowdStrike Holdings Inc., to deliver half a dozen of their cybersecurity tools, to make it easier for IT teams to spot malware. CrowdStrike's flagship product is a platform called Falcon, that companies use to protect systems such as servers, and employee devices. The platform can now send security information, from a company's environment to Google's cloud-based Chronicle, an analytics platform. Chronicle can store and analyzing petabytes of security data at once. Google and CrowdStrike's collaboration focuses on securing public cloud environments. According to the companies, they will work together, to make it easier to set up Falcon, inside Google Cloud virtual machines.Data-logics, Inc., a source for Adobe PDF, and enterprise document management technologies, announces the launch of Data-logics Cloud, a suite of cloud-based PDF processing products, which includes a free app on Zapier, and a robust API on Amazon Web Services. This exciting next-generation move for the company, provides Data logics users with the same enterprise-grade capabilities to create, edit and convert PDFs. Based in Chicago, IL, they support hundreds of customers worldwide, who are using document management technology in diverse applications. With the addition of these new cloud offerings, that customer base will be able to become even more expansive.Leading digital workflow company, Service now, announced it has signed an agreement to acquire next-generation observability leader, Light step. Acquisition will help ServiceNow customers accelerate digital transformation with insight-driven, action-oriented workflows. ServiceNow is already a recognized market leader in IT service management, IT operations management, and digital workflows. With Lightstep, an emerging pioneer in next-generation application monitoring and observability, ServiceNow will help DevOps engineers build, deploy, run, and monitor state-of-the-art, cloud-native applications.Infor, the industry cloud company, today announced that Norgesmøllene A.S., part of Cernova, the Norway-based flour milling and industrial group, has deployed Infor Cloud Suite Food & Beverage, to support its digital transformation process. Infor CloudSuite Food & Beverage, which also incorporates Eye Share, to manage the workflow of inbound supplier invoices, is expected to streamline processes, and boost visibility for Norgesmøllene A.S.

Software at Scale 15 - Ben Sigelman: CEO, Lightstep

Software at Scale

Play Episode Listen Later Apr 4, 2021 54:24

Listen now | Ben Sigelman is the CEO and Co-Founder of Lightstep, a DevOps observability platform. He was the co-creator of Dapper - Google’s distributed tracing system and Monarch - an in-memory time-series database for metrics. Get on the email list at www.softwareatscale.dev

DevOps Speakeasy S02E02: Austin Parker, on Observability and Laziness

DevOps Speakeasy Podcast

Play Episode Listen Later Mar 26, 2021 37:19

This week, we're joined by Austin Parker (@austinlparker) from Lightstep, the Jason Momoa of DevOps, to talk about observability in the name of laziness. Also, Baruch advocates for Java, Kat brings up old Python drama, we reminisce about the JavaScript framework treadmill, and Austin sells us on OpenTelemetry. How do you pronounce "Haskell?"

python java laziness devops javascript jason momoa speakeasy baruch haskell observability lightstep

Ted Young on Observability and the Release of OpenTelemetry 1.0

The InfoQ Podcast

Play Episode Listen Later Mar 22, 2021 33:25

In this podcast Ted Young, director of developer education at Lightstep, sat down with InfoQ podcast host Daniel Bryant and discussed: observability (and the three pillars), the OpenTelemetry CNCF sandbox project and the 1.0 release, and how to build an effective telemetry collection platform. Read a transcript of this interview: https://bit.ly/3eWqJvF Subscribe to our newsletters: - The InfoQ weekly newsletter: bit.ly/24x3IVq - The Software Architects’ Newsletter [monthly]: www.infoq.com/software-architects-newsletter/ Upcoming Virtual Events - https://events.infoq.com/ InfoQ Live: https://live.infoq.com/ - April 13, 2021 - June 22, 2021 - July 20, 2021 QCon Plus: https://plus.qconferences.com/ - May 17-28, 2021 Follow InfoQ: - Twitter: twitter.com/InfoQ - LinkedIn: www.linkedin.com/company/infoq - Facebook: bit.ly/2jmlyG8 - Instagram: @infoqdotcom - Youtube: www.youtube.com/infoq

young observability infoq lightstep daniel bryant

OpenTelemetry with Austin Parker

The Frontside Podcast

Play Episode Listen Later Mar 15, 2021 46:34

In this episode, Austin Parker, Principal Developer Advocate at Lightstep talks about the OpenTelemetry Framework, which is an observability framework for cloud-native software and a collection of tools, APIs, and SDKs. You use it to instrument, generate, collect, and export telemetry data (metrics, logs, and traces) for analysis in order to understand your software's performance and behavior.

apis sdks lightstep

Esperanto for Robots – Happy OpenTelemetry Day! with Austin Parker

Observy McObservface

Play Episode Listen Later Feb 17, 2021 44:46 Transcription Available

Our first repeat guest ever, Austin Parker, Principal Developer Advocate at Lightstep, talks about OpenTelemetry: an observability framework for cloud-native software that is a collection of tools, APIs, and SDKs. You use it to instrument, generate, collect, and export telemetry data (metrics, logs, and traces) for analysis to understand your software's performance and behavior.The OpenTelemetry specification is now in 1.0! What does this mean?It means we've achieved a major milestone in defining the observability framework for cloud-native applications, and that production-ready 1.0 releases of OpenTelemetry libraries will start being released over the next weeks and months!

robots apis sdks esperanto lightstep

How Lightstep is advancing observability with its new Change Intelligence - Episode 84

What the Dev?

Play Episode Listen Later Feb 9, 2021 14:36

In this week's episode we spoke to Ben Sigelman, the CEO and co-founder of Lightstep, a company that recently released Change Intelligence, which is a solution that analyzes system-wide metrics and tracing data. We covered how ML and AI doesn't do well with operational data unless there is some kind of objective or function that it can be applied to and how Change Intelligence is built to address that problem.

ceo ai advancing ml observability lightstep change intelligence

Episode 1: Daniel 'Spoons' Spoonhower, Chief Architect and Co-Founder of Lightstep

10KMedia Podcast

Play Episode Listen Later Feb 4, 2021 46:19

Adam sits down with Spoons to chat about AI's role in operations, the differences between monitoring and observability, and the announcement of Change Intelligence.

ai co founders architects spoons chief architect lightstep change intelligence

#279 Daniel "Spoons" Spoonhower - CTO at Lightstep

Modern CTO with Joel Beasley

Play Episode Listen Later Jan 25, 2021 33:23

Today we are talking to Daniel "Spoons" Spoonhower the CTO at Lightstep. And we discuss taking an outcome driven approach to time management, how to set the foundation for building successful teams, and what managing change looks like when raising the next generation of leaders. All of this, right here, right now on the Modern CTO Podcast!

cto spoons lightstep

Opening The Island - OpenTelemetry and The Future of Observability with Austin Parker

Observy McObservface

Play Episode Listen Later Nov 24, 2020 37:43 Transcription Available

In this episode, Austin Parker, Principal Developer Advocate at Lightstep, talks about OpenTelemetry, an observability framework for cloud-native software that is a collection of tools, APIs, and SDKs. You use it to instrument, generate, collect, and export telemetry data (metrics, logs, and traces) for analysis in order to understand your software's performance and behavior.He also tells Jonan how he started/threw together Deserted Island DevOps during the height of the first wave of the COVID-19 pandemic, and how overwhelmingly successful and loved it was. DIDevOps was a single-day virtual event that was livestreamed on Twitch this past April. All presentations took place in the world of Animal Crossing: New Horizons and the event pulled ~15,000 unique viewers on the day of the conference alone!Should you find a burning need to share your thoughts or rants about the show please spray them at devrel@newrelic.com. While you’re going to all the trouble of shipping us some bytes, please consider taking a moment to let us know what you’d like to hear on the show in the future. Despite the all-caps flaming you will receive in response, please know that we are sincerely interested in your feedback; we aim to appease. Follow us on the Twitters: @ObservyMcObserv.

covid-19 twitch island apis animal crossing new horizons sdks observability lightstep

What observability into Kubernetes can bring to you with Daniel "Spoons" Spoonhower - Episode 70

What the Dev?

Play Episode Listen Later Nov 17, 2020 12:20

In this week's episode, we spoke to Daniel "Spoons" Spoonhower from the observability company Lightstep about the importance of observability into Kubernetes. Observability is really about understanding changes and the effects of those changes on your application. We also covered where the future of monitoring is going with more organizations embracing OpenTelemetry. Be sure to tune into the virtual KubeCon + CloudNativeCon North America 2020 event that gathers adopters and technologists from leading open source and cloud native communities from November 17 – 20, 2020.

kubernetes spoons observability lightstep

Ep. #28, People Problems with Austin Parker of Lightstep

people problems devrel lightstep

Play Episode Listen Later Oct 29, 2020 37:36

In episode 28 of o11ycast, Charity and Shelby speak with Austin Parker of Lightstep. They explore topics like rethinking human error, purposeful and intentional training, DevRel expectations, and underrated management tools.

Ep. #28, People Problems with Austin Parker of Lightstep

O11ycast

Play Episode Listen Later Oct 29, 2020 37:36

people problems devrel lightstep heavybit

Ep. #28, People Problems with Austin Parker of Lightstep

O11ycast

Play Episode Listen Later Oct 29, 2020 37:36

people problems devrel lightstep

Ep. #28, People Problems with Austin Parker of Lightstep

people problems devrel lightstep heavybit

Play Episode Listen Later Oct 29, 2020 37:36

Anadelia Fadeev - Gravitational

DemandGen Club Podcast

Play Episode Listen Later Oct 29, 2020 27:39 Transcription Available

Our guest today is Anadelia Fadeev. Director of Demand Generation at Gravitational. Gravitational empowers engineers to access and distribute computing resources anywhere on the planet. At Gravitational Anadelia is in charge of scaling the leads and pipeline for the growing sales team. She's extremely process-oriented and has strong experience building high-performing marketing teams at B2B tech startups. Before Gravitational, she was The Director of Demand Generation at LightStep, and before then, Senior Demand Generation Manager at InfluxData.

director b2b demand generation gravitational influxdata lightstep

Ep. #3, OpenTelemetry with Ben Sigelman of LightStep

The Kubelist Podcast

Play Episode Listen Later Oct 14, 2020 40:37

In episode 3 of The Kubelist Podcast, Marc Campbell speaks with Ben Sigelman of LightStep. They discuss the inspiration and origin story behind OpenTelemetry, the challenges of observability, and the path from sandbox to incubation.

lightstep

Ep. #3, OpenTelemetry with Ben Sigelman of LightStep

Play Episode Listen Later Oct 14, 2020 40:37

lightstep

Ep. #3, OpenTelemetry with Ben Sigelman of LightStep

Play Episode Listen Later Oct 14, 2020 40:37

lightstep heavybit

Ben Sigelman - The Future Of Observability & Why It's Not Just Telemetry (Observability Series - Part 2)

Masters of Data Podcast

Play Episode Listen Later Aug 24, 2020 35:44

There are three types of people in the data world: mathematicians, scientists, and engineers. Mathematicians are interested in understanding things that are true or false. Scientists are interested in furthering knowledge and enjoy answering challenging questions. Engineers are interested in building things that are useful, so they can solve a problem that's important. Engineers in the software industry are currently searching for ways to resolve the issues associated with microservices. Right now, the software industry is facing a massive architectural transformation, and engineers have the opportunity to create systems that solve important problems. That's why Ben Sigelman — CEO and co-founder, started Lightstep, to create something useful and impactful. He saw an opportunity to accelerate the industry's transformation while improving the developer and end-user experience, and he took it. Using observability, he built something that could help people gain more confidence and understanding of their own system. As an ex-Googler and co-creator of Dapper, Ben Sigelman witnessed the birth of microservices at Google. He learned a great deal from his experiences, and Lightstep is in many ways a reaction to and a generational improvement beyond those approaches. Sigelman's fascination lies in deep systems and how they break, but he is also passionate about separating the telemetry from the rest of observability. There is a lot of noise in the marketplace and confusion about how to approach observability, but Sigelman is confident that in the next 5-10 years, applications could change the way the software actually works, not just the way we understand it. Listen to Ben Sigelman and Ben Newton discuss the future of observability, and learn more about how this transformation could impact the industry. This week's episode is the second installment of a special three-part series on observability in data. Tune in each week to hear about how the world of observability in transforming into a major player in the data realm.

google scientists engineers mathematicians dapper googlers observability telemetry lightstep ben newton

The Developer's Struggle for Control

google mobile spoons microservices monoliths lightstep

Play Episode Listen Later Aug 10, 2020 45:05

GitLab sponsored this podcast. The developer experience today certainly offers software engineers the freedom to create applications at scale across often highly distributed microservices environments. But with this degree of freedom to create and update deployments at scale, developers are under pressure to deliver faster cadences. They also face security concerns as well as unknowns about the frontend user experience, even once the security and QA teams have properly vetted the code. In this The New Stack Makers podcast, correspondent B. Cameron Gain, speaks with Christopher Lefelhocz, vice president of development at GitLab and Ben Sigelman, CEO and co-founder of Lightstep, about how developers can leverage elasticity and other processes and tools to ensure software remains resilient and secure from the time the code is uploaded to GitLab's repository and throughout the entire deployment and usage cycle.

ceo struggle qa gitlab lightstep new stack makers

Mobile, Monoliths & Microservices w/ LightStep

DevOps Chat

Play Episode Listen Later Apr 2, 2020 17:57

Daniel "Spoons" Spoonhower made a bit of a name for himself while he was at Google, as did his two co-founders. They were there at the birth of the microservices wave that has swept through the software world. They left Google and started LightStep to help all developers build better, more scalable applications. With their new release they have pushed the bar higher. In this podcast here from Spoons as to what they are doing and where he sees things moving.

Lightstep CTO Daniel Spoonhower - The 3 Pillars of Observability

Play Episode Listen Later Apr 2, 2020 31:27

Listen to more episodes here: https://thenewstack.io/podcasts/ In this episode of The New Stack Makers podcast, Daniel Spoonhower, CTO of Lightstep, discussed and described what the “three pillars” concept means for DevOps, how monitoring is different, Lightstep's evolution in developing observability solutions and a number of other related themes. Spoonhower — whose experience in developing observability tools traces back to work as a software engineer at Google — makes it clear that a “three pillar” observability solution consisting of metrics, logs, and distributed tracing represents, in fact, separate capabilities. “I think the thing that we've kind of seen is that thinking of those as three different tools that you can just kind of squish together is not really a great solution. I mean, the way that I think about observability is I like to get away from the what the specific tools are, and just say that observability is the thing that helps you connect the effects that you're seeing — whether that's performance or user experience, or whatever, connecting those effects back to the causes,” Spoonhower said. “And the thing that happened with deep systems is that it's not like there are five or 10 potential causes to those problems, but there are thousands or tens of thousands of those things. And so you need a tool to help you find those.”

google pillars cto devops observability lightstep new stack makers

Episode 110: Kelsey Hightower and Ben Sigelman Debate Microservices vs. Monoliths

Play Episode Listen Later Mar 27, 2020 40:12

Listen to ALL of our shows here: https://thenewstack.io/podcasts/ Welcome to The New Stack Context, a podcast where we discuss the latest news and perspectives in the world of cloud native computing. For this week's episode, we spoke with Kelsey Hightower, a developer advocate at Google, and Ben Sigelman, CEO and co-founder of observability services provider LightStep, about whether or not teams should favor a monolith over a microservices approach when architecting cloud native applications. Hightower recently tweeted a prediction that “Monolithic applications will be back in style after people discover the drawbacks of distributed monolithic applications.” It was quite a surprise for those who have been advocating the for operational benefits of microservices. Why go back to a monolith? As Hightower explains in the podcast: “There are a lot of people who have never left a monolith. So there's really not anything to go back to. So it's really about the challenges of adopting a microservices architecture. From a design perspective, like very few companies talk about, here's how we designed our monolith.” Sigelman, on the other hand, maintained that microservices are necessary for rapid development, which, in turn, is necessary for sustaining a business. “It's not so much that you should use microservices, it's more like, if you don't innovate faster than your competitors, your company will eventually be erased, like, that's the actual problem. And in order to do that, you need to build a lot of differentiated technology,” he said. Microservices is the most logical approach for maintaining a large software team while still maintaining a competitive velocity of development. Later in the show, we discuss some of the top TNS podcasts and news posts of the week, including an interview with IBM's Lin Sun on the importance of the service mesh, as Sysdig's offer of a distributed, scalable Prometheus, a group of chief technology officers who want to help the U.S. government with the current COVID-19 pandemic, and the hidden vulnerabilities that come with open source security. TNS editorial and marketing director Libby Clark hosted this episode, alongside founder and TNS publisher Alex Williams and TNS managing editor Joab Jackson.

covid-19 ceo google debate ibm prometheus hightower microservices monoliths alex williams monolithic tns kelsey hightower sysdig lightstep libby clark joab jackson new stack context

Episode 110: Kelsey Hightower and Ben Sigelman Debate Microservices vs. Monoliths

The New Stack Context

Play Episode Listen Later Mar 27, 2020 40:11

covid-19 ceo google debate ibm prometheus hightower microservices monoliths alex williams monolithic tns kelsey hightower sysdig lightstep libby clark joab jackson new stack context

The Joy of Building Enterprise Software with Ben Sigelman