Podcasts about Equinix Metal

  • 17 PODCASTS
  • 26 EPISODES
  • 52m AVG DURATION
  • 1 MONTHLY NEW EPISODE
  • LATEST: Oct 18, 2024

POPULARITY (chart, 2017–2024)


Best podcasts about Equinix Metal

Latest podcast episodes about Equinix Metal

The Changelog
You'll rent chips and be happy (Friends)

Oct 18, 2024 · 98:10


Zac Smith left his role leading Equinix Metal in June of 2023. Since then, he's been thinking deeply about the present and potential future of data centers, OEMs, chip makers & more.

Changelog Master Feed
You'll rent chips and be happy (Changelog & Friends #66)

Oct 18, 2024 · 98:10


Zac Smith left his role leading Equinix Metal in June of 2023. Since then, he's been thinking deeply about the present and potential future of data centers, OEMs, chip makers & more.

Cables2Clouds
C2C Fortnightly News: Who Doesn't Love a Good Press Release? - NC2C003

Feb 14, 2024 · 24:54 · Transcription available


Imagine the cybersecurity world turned on its head as a cloud security startup notches a jaw-dropping $10 billion valuation. That's where Wiz stands now, and today we're breaking down their recent $300 million windfall, what it signals for the future of cyber threats, and the potential for artificial intelligence to change the game. We're also weighing in on the ex-Zscaler COO's strategic move to Wiz and the ripples it's sending through the industry. Will Wiz's skyrocketing growth see them through to an IPO, or is an acquisition on the horizon? Tune in for an expert analysis that could redefine your understanding of today's cybersecurity landscape.

Then, we switch lanes to the bustling intersection of cloud connectivity and market disruption. We dissect Arista's latest maneuver with Equinix Metal, aiming to take a bite out of Cisco's market share. We're talking the nitty-gritty of their SD-WAN tech, Arista's impressive low vulnerability count, and Arrcus's bold approach to slashing egress costs. This episode is a treasure trove of insights for anyone looking to stay ahead of the curve in cloud technology and network infrastructure. Strap in for a ride that's as informative as it is boundary-pushing.

Check out the Fortnightly Cloud Networking News
Visit our website and subscribe: https://www.cables2clouds.com/
Follow us on Twitter: https://twitter.com/cables2clouds
Follow us on YouTube: https://www.youtube.com/@cables2clouds/
Follow us on TikTok: https://www.tiktok.com/@cables2clouds
Merch Store: https://store.cables2clouds.com/
Join the Discord Study group: https://artofneteng.com/iaatj
Art of Network Engineering (AONE): https://artofnetworkengineering.com

Open at Intel
Nerdy About Networks

Jan 24, 2024 · 26:04


Fen Aldrich, a Developer Advocate with Equinix, talks about their open source partner program and giving back to the open source community. Fen highlights collaborative relationships they've established with key projects, serving as a testing ground and leveraging their surplus of hardware and network resources. We discussed how appreciating the human aspect of tech is crucial, as all technical systems ultimately depend on human innovation and interaction. We also nerd out a bit about the basics of networking and the technology underneath the magic we all take for granted.

00:00 Introduction
02:57 Deep Dive into Networking Basics
09:33 Exploring the Importance of Corporate Open Source Contributions
19:54 Understanding the Interconnectedness in Open Source

Guest: Fen Aldrich is a Developer Advocate at Equinix Metal and an organizer for DevOpsDays events in the northeast US. Passionate about Resilience Engineering and Mental Health in the tech industry, they believe that every technology problem is ultimately, when you get right down to it, a people challenge. Find their work at speaking.crayzeigh.com, and connect on Twitter @crayzeigh or Mastodon @crayzeigh@hachyderm.io.

Virtually Speaking Podcast
VMware Cloud on Equinix

Sep 5, 2023 · 7:37


With VMware Cloud, customers can choose the best cloud environment for their ever-evolving business and application needs because VMware's multi-cloud architecture is extensible to a broad range of cloud endpoints. Case in point: Equinix Metal, in prime metro locations around the world! This year at Explore Las Vegas 2023, VMware announced that VMware Cloud on Equinix Metal (VMCE) is available as an Early Access Release.

Watch the video of this episode
Watch all VMware Explore Recap episodes

The Tech Trek
Hiring for a niche skill set

Jan 10, 2023 · 28:47


In this episode, Shweta Saraf, Head of Engineering, Edge Infrastructure Services Engineering at Equinix, talks about how she handles hiring for a niche skill set and things you can do to help find the right talent.

Key takeaways:
• Focus on optimizing the talent pipeline
• Removing the requirements friction
• Have existing staff find opportunities to grow skill sets
• Training from within
• What happens when you can't check the boxes
• Everything is now engineering

About today's guest: Shweta Saraf is the Head of Engineering for Edge Infrastructure Services at Equinix. She leads the teams responsible for two of the fastest-growing digital products - Equinix Metal and Network Edge. Shweta was recognized on Silicon Valley Business Journal's 40 under 40 list for 2022. She actively mentors tech leaders to give back to the community. She has previously held leadership positions at Packet, DigitalOcean, and Cisco.
LinkedIn: https://www.linkedin.com/in/shwetasaraf/
Twitter: @shwetahari
___
Thank you so much for checking out this episode of The Tech Trek, and we would appreciate it if you would take a minute to rate and review us on your favorite podcast player. Want to learn more about us? Head over to https://www.elevano.com. Have questions or want to cover specific topics with our future guests? Please message me at https://www.linkedin.com/in/amirbormand (Amir Bormand).

CodePen Radio
389: Migrating a Ruby on Rails GraphQL API to a Go GraphQL API

Nov 2, 2022


One thing that's been keeping us very busy at CodePen is moving our main API. We decided on GraphQL long ago and it's served us pretty well. We originally built it in Ruby on Rails alongside a lot of the rest of our app. But while Rails served us well, we've been moving off of it. We like our React architecture and we're better served leaning into that, with frameworks like Next, than staying on Rails. We proved out this combination of technologies for ourselves, building a whole set of admin tools with it. Now we're ready to keep that train going as we build out more of CodePen with the same stack. But removing Rails means moving off of our Rails-based GraphQL implementation. This means re-writing that API in Go, another bit of tech we've had a lot of luck with. Turns out that re-writing an API is more time-consuming than writing it to begin with, especially as we need to run the two side by side and have them behave identically. No refactoring allowed! Unless of course we want to refactor it on both sides and take even more time. Dee joined me this week in talking about all this. It's a huge job! But we've been doing well at it, building our own tooling, doing lots of testing, and ultimately proving that it works by releasing it in small areas on the production site. It's all working out how we hoped it would: fast, cheap, and easier to reason about.

Time Jumps

Sponsor: Equinix Metal's Startup Partner Program
Equinix Metal's Startup Partner Program helps early stage companies level up. Their experts work with startups like Koord and INVISV to build their competitive edge with infrastructure. Equinix Metal provides real time guidance and support to help startups grow faster. With up to $100,000 in infrastructure credit and access to Equinix's global ecosystem of over 10,000 customers and 1,800 networks, they might just be what you need to take your startup global. Visit metal.equinix.com/startups to take your startup to the next level.
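The side-by-side requirement CodePen describes is essentially shadow testing: send the same query to both implementations and diff the results. The episode doesn't detail their internal tooling, so the following is only a minimal sketch of the idea; the endpoint URLs, the sample query, and the schema it assumes are all hypothetical.

```typescript
// Shadow-test a GraphQL query against a legacy (Rails) and a new (Go) backend.
// Both URLs and the example query are hypothetical placeholders.
const LEGACY_URL = "https://legacy.example.com/graphql";
const NEW_URL = "https://next.example.com/graphql";

async function runQuery(url: string, query: string, variables: Record<string, unknown>) {
  const res = await fetch(url, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ query, variables }),
  });
  return res.json();
}

async function shadowCompare(query: string, variables: Record<string, unknown>): Promise<boolean> {
  // Run the same operation against both implementations in parallel.
  const [legacy, next] = await Promise.all([
    runQuery(LEGACY_URL, query, variables),
    runQuery(NEW_URL, query, variables),
  ]);
  // Naive comparison via serialized JSON; real tooling would diff structurally
  // and ignore known-noisy fields such as timestamps.
  const same = JSON.stringify(legacy) === JSON.stringify(next);
  if (!same) {
    console.error("Mismatch between legacy and new API", { query, variables, legacy, next });
  }
  return same;
}

// Example invocation with a made-up query shape.
shadowCompare("query Pen($id: ID!) { pen(id: $id) { id title } }", { id: "abc" }).catch(console.error);
```

Running checks like this against real traffic, a small area of the site at a time, matches the rollout approach the episode describes.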

CodePen Radio
388: Durable Objects

Oct 19, 2022


Robert and I jump on to chat about Cloudflare's Durable Objects product. It's part of their Workers platform, which we already use at CodePen a good bit, but with Durable Objects, Global Uniqueness guarantees that there will be a single instance of a Durable Object class with a given ID running at once, across the world. Requests for a Durable Object ID are routed by the Workers runtime to the Cloudflare data center that owns the Durable Object. In their intro blog post a few years back, they call the "killer app" real-time collaborative document editing, which is obviously of interest to us. So we've been tinkering and playing with how that might work with CodePen's future technology.

Time Jumps

Sponsor: Equinix Metal's Startup Partner Program
Equinix Metal's Startup Partner Program helps early stage companies level up. Their experts work with startups like Koord and INVISV to build their competitive edge with infrastructure. Equinix Metal provides real time guidance and support to help startups grow faster. With up to $100,000 in infrastructure credit and access to Equinix's global ecosystem of over 10,000 customers and 1,800 networks, they might just be what you need to take your startup global. Visit metal.equinix.com/startups to take your startup to the next level.
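The global-uniqueness guarantee is what makes per-object state easy to reason about: all requests for a given ID serialize through one instance. As a rough illustration only, a minimal counter in the Workers module syntax might look like this; the COUNTER binding name (configured in wrangler.toml) and the ambient types from @cloudflare/workers-types are assumptions, not anything from the episode.

```typescript
// Minimal Durable Object sketch (types come from @cloudflare/workers-types).
// Every request routed through idFromName("global") reaches the same single
// instance, wherever in the world it happens to be running.

export interface Env {
  COUNTER: DurableObjectNamespace; // binding name is an assumption, set in wrangler.toml
}

export class Counter {
  constructor(private state: DurableObjectState) {}

  async fetch(_request: Request): Promise<Response> {
    // Storage is scoped to this one object, so there are no cross-instance races.
    const value = ((await this.state.storage.get<number>("value")) ?? 0) + 1;
    await this.state.storage.put("value", value);
    return new Response(String(value));
  }
}

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    // The same name always maps to the same ID, and hence the same instance.
    const id = env.COUNTER.idFromName("global");
    return env.COUNTER.get(id).fetch(request);
  },
};
```

Something shaped like this, with one Durable Object per document, is presumably the pattern behind the collaborative-editing "killer app": every editor's requests converge on the one instance that holds the document's canonical state.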

Screaming in the Cloud
The Controversy of Cloud Repatriation With Amy Tobey of Equinix

Sep 27, 2022 · 38:34


About Amy

Amy Tobey has worked in tech for more than 20 years at companies of every size, working with everything from kernel code to user interfaces. These days she spends her time building an innovative Site Reliability Engineering program at Equinix, where she is a principal engineer. When she's not working, she can be found with her nose in a book, watching anime with her son, making noise with electronics, or doing yoga poses in the sun.

Links Referenced:
Equinix: https://metal.equinix.com
Twitter: https://twitter.com/MissAmyTobey

Transcript

Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.

Corey: Welcome to Screaming in the Cloud, I'm Corey Quinn, and this episode is another one of those real profiles in shitposting type of episodes. I am joined again from a few months ago by Amy Tobey, who is a Senior Principal Engineer at Equinix, back for more. Amy, thank you so much for joining me.

Amy: Welcome. To your show. [laugh].

Corey: Exactly. So, one thing that we have been seeing a lot over the past year, and you struck me as one of the best people to talk about what you're seeing in the wilderness perspective, has been the idea of cloud repatriation. It started off with something that came out of Andreessen Horowitz toward the start of the year about the trillion-dollar paradox, how, at a certain point of scale, repatriating to a data center is the smart and right move. And oh, my stars, did that ruffle some feathers for people.

Amy: Well, I spent all this money moving to the cloud. That was just mean.

Corey: I know. Why would I want to leave the cloud? I mean, for God's sake, my account manager named his kid after me. Wait a minute, how much am I spending on that? Yeah—

Amy: Good question.

Corey: —there is that ever-growing problem. And there are the examples that people give: Dropbox classically did a cloud repatriation exercise, and a second example that no one can ever name. And it seems like okay, this might not necessarily be the direction that the industry is going. But I also tend to not be completely naive when it comes to these things. And I can see repatriation making sense on a workload-by-workload basis. What that implies is that yeah, but a lot of other workloads are not going to be going to a data center. They're going to stay in a cloud provider, who would like very much if you never read a word of this to anyone in public.

Amy: Absolutely, yeah.

Corey: So, if there are workloads repatriating, it would occur to me that there's a vested interest on the part of every major cloud provider to do their best to, I don't know if saying suppress the story is too strongly worded, but it is directionally what I mean.

Amy: They aren't helping get the story out. [laugh].

Corey: Yeah, it's like, "That's a great observation. Could you maybe shut the hell up and never make it ever again in public, or we will end you?" Yeah. You're Amazon. What are you going to do, launch a shitty Amazon Basics version of what my company does? Good luck. Have fun. You're probably doing it already. But the reason I want to talk to you on this is a confluence of a few things.
One, as I mentioned back in May when you were on the show, I am incensed and annoyed that we've been talking for as long as we have, and somehow I never had you on the show. So, great. Come back, please. You're always welcome here. Secondly, you work at Equinix, which is, effectively—let's be relatively direct—it is functionally a data center as far as how people wind up contextualizing this. Yes, you have higher level—Amy: Yeah I guess people contextualize it that way. But we'll get into that.Corey: Yeah, from the outside. I don't work there, to be clear. My talking points don't exist for this. But I think of oh, Equinix. Oh, that means you basically have a colo or colo equivalent. The pricing dynamics have radically different; it looks a lot closer to a data center in my imagination than it does a traditional public cloud. I would also argue that if someone migrates from AWS to Equinix, that would be viewed—arguably correctly—as something of a repatriation. Is that directionally correct?Amy: I would argue incorrectly. For Metal, right?Corey: Ah.Amy: So, Equinix is a data center company, right? Like that's why everybody knows us as. Equinix Metal is a bare metal primitive service, right? So, it's a lot more of a cloud workflow, right, except that you're not getting the rich services that you get in a technically full cloud, right? Like, there's no RDS; there's no S3, even. What you get is bare metal primitives, right? With a really fast network that isn't going to—Corey: Are you really a cloud provider without some ridiculous machine-learning-powered service that's going to wind up taking pictures, perform incredibly expensive operations on it, and then return something that's more than a little racist? I mean, come on. That's not—you're not a cloud until you can do that, right?Amy: We can do that. We have customers that do that. Well, not specifically that, but um—Corey: Yeah, but they have to build it themselves. You don't have the high-level managed service that basically serves as, functionally, bias laundering.Amy: Yeah, you don't get it in a box, right? So, a lot of our customers are doing things that are unique, right, that are maybe not exactly fit into the cloud well. And it comes back down to a lot of Equinix's roots, which is—we talk but going into the cloud, and it's this kind of abstract environment we're reaching for, you know, up in the sky. And it's like, we don't know where it is, except we have regions that—okay, so it's in Virginia. But the rule of real estate applies to technology as often as not, which is location, location, location, right?When we're talking about a lot of applications, a challenge that we face, say in gaming, is that the latency from the customer, so that last mile to your data center, can often be extremely important, right, so a few milliseconds even. And a lot of, like, SaaS applications, the typical stuff that really the cloud was built on, 10 milliseconds, 50 milliseconds, nobody's really going to notice that, right? But in a gaming environment or some very low latency application that needs to run extremely close to the customer, it's hard to do that in the cloud. They're building this stuff out, right? Like, I see, you know, different ones [unintelligible 00:05:53] opening new regions but, you know, there's this other side of the cloud, which is, like, the edge computing thing that's coming alive, and that's more where I think about it.And again, location, location, location. 
The speed of light is really fast, but as most of us in tech know, if you want to go across from the East Coast to the West Coast, you're talking about 80 milliseconds, on average, right? I think that's what it is. I haven't checked in a while. Yeah, that's just basic fundamental speed of light. And so, if everything's in us-east-1—and this is why we do multi-region, sometimes—the latency from the West Coast isn't going to be great. And so, we run the application on both sides.Corey: It has improved though. If you want to talk old school things that are seared into my brain from over 20 years ago, every person who's worked in data centers—or in technology, as a general rule—has a few IP addresses seared. And the one that I've always had on my mind was 130.111.32.11. Kind of arbitrary and ridiculous, but it was one of the two recursive resolvers provided at the University of Maine where I had my first help desk job.And it lives on-prem, in Maine. And generally speaking, I tended to always accept that no matter where I was—unless I was in a data center somewhere—it was about 120 milliseconds. And I just checked now; it is 85 and change from where I am in San Francisco. So, the internet or the speed of light have improved. So, good for whichever one of those it was. But yeah, you've just updated my understanding of these things. All of this is, which is to say, yes, latency is very important.Amy: Right. Let's forget repatriation to really be really honest. Even the Dropbox case or any of them, right? Like, there's an economic story here that I think all of us that have been doing cloud work for a while see pretty clearly that maybe not everybody's seeing that—that's thinking from an on-prem kind of situation, which is that—you know, and I know you do this all the time, right, is, you don't just look at the cost of the data center and the servers and the network, the technical components, the bill of materials—Corey: Oh, lies, damned lies, and TCO analyses. Yeah.Amy: —but there's all these people on top of it, and the organizational complexity, and the contracts that you got to manage. And it's this big, huge operation that is incredibly complex to do well that is almost nobody's business. So the way I look at this, right, and the way I even talk to customers about it is, like, “What is your produ—” And I talk to people internally about this way? It's like, “What are you trying to build?” “Well, I want to build a SaaS.” “Okay. Do you need data center expertise to build a SaaS?” “No.” “Then why the hell are you putting it in a data center?” Like we—you know, and speaking for my employer, right, like, we have Equinix Metal right here. You can build on that and you don't have to do all the most complex part of this, at least in terms of, like, the physical plant, right? Like, right, getting a bare metal server available, we take care of all of that. Even at the primitive level, where we sit, it's higher level than, say, colo.Corey: There's also the question of economics as it ties into it. It's never just a raw cost-of-materials type of approach. Like, my original job in a data center was basically to walk around and replace hard drives, and apparently, to insult people. Now, the cloud has taken one of those two aspects away, and you can follow my Twitter account and figure out which one of those two it is, but what I keep seeing now is there is value to having that task done, but in a cloud environment—and Equinix Metal, let's be clear—that has slipped below the surface level of awareness. 
And well, what are the economic implications of that?Well, okay, you have a whole team of people at large companies whose job it is to do precisely that. Okay, we're going to upskill them and train them to use cloud. Okay. First, not everyone is going to be capable or willing to make that leap from hard drive replacement to, “Congratulations and welcome to JavaScript. You're about to hate everything that comes next.”And if they do make that leap, their baseline market value—by which I mean what the market is willing to pay for them—approximately will double. And whether they wind up being paid more by their current employer or they take a job somewhere else with those skills and get paid what they are worth, the company still has that economic problem. Like it or not, you will generally get what you pay for whether you want to or not; that is the reality of it. And as companies are thinking about this, well, what gets into the TCO analysis and what doesn't, I have yet to see one where the outcome was not predetermined. They're less, let's figure out in good faith whether it's going to be more expensive to move to the cloud, or move out of the cloud, or just burn the building down for insurance money. The outcome is generally the one that the person who commissioned the TCO analysis wants. So, when a vendor is trying to get you to switch to them, and they do one for you, yeah. And I'm not saying they're lying, but there's so much judgment that goes into this. And what do you include and what do you not include? That's hard.Amy: And there's so many hidden costs. And that's one of the things that I love about working at a cloud provider is that I still get to play with all that stuff, and like, I get to see those hidden costs, right? Like you were talking about the person who goes around and swaps out the hard drives. Or early in my career, right, I worked with someone whose job it was this every day, she would go into data center, she'd swap out the tapes, you know, and do a few things other around and, like, take care of the billing system. And that was a job where it was kind of going around and stewarding a whole bunch of things that kind of kept the whole machine running, but most people outside of being right next to the data center didn't have any idea that stuff even happen, right, that went into it.And so, like you were saying, like, when you go to do the TCO analysis, I mean, I've been through this a couple of times prior in my career, where people will look at it and go like, “Well, of course we're not going to list—we'll put, like, two headcount on there.” And it's always a lie because it's never just to headcount. It's never just the network person, or the SRE, or the person who's racking the servers. It's also, like, finance has to do all this extra work, and there's all the logistic work, and there is just so much stuff that just is really hard to include. Not only do people leave it out, but it's also just really hard for people to grapple with the complexity of all the things it takes to run a data center, which is, like, one of the most complex machines on the planet, any single data center.Corey: I've worked in small-scale environments, maybe a couple of mid-sized ones, but never the type of hyperscale facility that you folks have, which I would say is if it's not hyperscale, it's at least directionally close to it. We're talking thousands of servers, and hundreds of racks.Amy: Right.Corey: I've started getting into that, on some level. 
Now, I guess when we say ‘hyperscale,' we're talking about AWS-size things where, oh, that's a region and it's going to have three dozen data center facilities in it. Yeah, I don't work in places like that because honestly, have you met me? Would you trust me around something that's that critical infrastructure? No, you would not, unless you have terrible judgment, which means you should not be working in those environments to begin with.Amy: I mean, you're like a walking chaos exercise. Maybe I would let you in.Corey: Oh, I bring my hardware destruction aura near anything expensive and things are terrible. It's awful. But as I looked at the cloud, regardless of cloud, there is another economic element that I think is underappreciated, and to be fair, this does, I believe, apply as much to Equinix Metal as it does to the public hyperscale cloud providers that have problems with naming things well. And that is, when you are provisioning something as a customer of one of these places, you have an unbounded growth problem. When you're in a data center, you are not going to just absentmindedly sign an $8 million purchase order for new servers—you know, a second time—and then that means you're eventually run out of power, space, places to put things, and you have to go find it somewhere.Whereas in cloud, the only limit is basically your budget where there is no forcing function that reminds you to go and clean up that experiment from five years ago. You have people with three petabytes of data they were using for a project, but they haven't worked there in five years and nothing's touched it since. Because the failure mode of deleting things that are important, or disasters—Amy: That's why Glacier exists.Corey: Oh, exactly. But that failure mode of deleting things that should not be deleted are disastrous for a company, whereas if you've leave them there, well, it's only money. And there's no forcing function to do that, which means you have this infinite growth problem with no natural limit slash predator around it. And that is the economic analysis that I do not see playing out basically anywhere. Because oh, by the time that becomes a problem, we'll have good governance in place. Yeah, pull the other one. It has bells on it.Amy: That's the funny thing, right, is a lot of the early drive in the cloud was those of us who wanted to go faster and we were up against the limitations of our data centers. And then we go out and go, like, “Hey, we got this cloud thing. I'll just, you know, put the credit card in there and I'll spin up a few instances, and ‘hey, I delivered your product.'” And everybody goes, “Yeah, hey, happy.” And then like you mentioned, right, and then we get down the road here, and it's like, “Oh, my God, how much are we spending on this?”And then you're in that funny boat where you have both. But yeah, I mean, like, that's just typical engineering problem, where, you know, we have to deal with our constraints. And the cloud has constraints, right? Like when I was at Netflix, one of the things we would do frequently is bump up against instance limits. And then we go talk to our TAM and be like, “Hey, buddy. Can we have some more instance limit?” And then take care of that, right?But there are some bounds on that. Of course, in the cloud providers—you know, if I have my cloud provider shoes on, I don't necessarily want to put those limits to law because it's a business, the business wants to hoover up all the money. That's what businesses do. 
So, I guess it's just a different constraint that is maybe much too easy to knock down, right? Because as you mentioned, in a data center or in a colo space, I outgrow my cage and I filled up all that space I have, I have to either order more space from my colo provider, I expand to the cloud, right?Corey: The scale I was always at, the limit was not the space because I assure you with enough shoving all things are possible. Don't believe me? Look at what people are putting in the overhead bin on any airline. Enough shoving, you'll get a Volkswagen in there. But it was always power constrained is what I dealt with it. And it's like, “Eh, they're just being conservative.” And the whole building room dies.Amy: You want blade servers because that's how you get blade servers, right? That movement was about bringing the density up and putting more servers in a rack. You know, there were some management stuff and [unintelligible 00:16:08], but a lot of it was just about, like, you know, I remember I'm picturing it, right—Corey: Even without that, I was still power constrained because you have to remember, a lot of my experiences were not in, shall we say, data center facilities that you would call, you know, good.Amy: Well, that brings up a fun thing that's happening, which is that the power envelope of servers is still growing. The newest Intel chips, especially the ones they're shipping for hyperscale and stuff like that, with the really high core counts, and the faster clock speeds, you know, these things are pulling, like, 300 watts. And they also have to egress all that heat. And so, that's one of the places where we're doing some innovations—I think there's a couple of blog posts out about it around—like, liquid cooling or multimode cooling. And what's interesting about this from a cloud or data center perspective, is that the tools and skills and everything has to come together to run a, you know, this year's or next year's servers, where we're pushing thousands of kilowatts into a rack. Thousands; one rack right?The bar to actually bootstrap and run this stuff successfully is rising again, compared to I take my pizza box servers, right—and I worked at a gaming company a long time ago, right, and they would just, like, stack them on the floor. It was just a stack of servers. Like, they were in between the rails, but they weren't screwed down or anything, right? And they would network them all up. Because basically, like, the game would spin up on the servers and if they died, they would just unplug that one and leave it there and spin up another one.It was like you could just stack stuff up and, like, be slinging cables across the data center and stuff back then. I wouldn't do it that way now, but when you add, say liquid cooling and some of these, like, extremely high power situations into the mix, now you need to have, for example, if you're using liquid cooling, you don't want that stuff leaking, right? And so, it's good as the pressure fittings and blind mating and all this stuff that's coming around gets, you still have that element of additional training, and skill, and possibility for mistakes.Corey: The thing that I see as I look at this across the space is that, on some level, it's gotten harder to run a data center than it ever did before. Because again, another reason I wanted to have you on this show is that you do not carry a quota. Although you do often carry the conversation, when you have boring people around you, but quotas, no. You are not here selling things to people. 
You're not actively incentivized to get people to see things a certain way.You are very clearly an engineer in the right ways. I will further point out though, that you do not sound like an engineer, by which I mean, you're going to basically belittle people, in many cases, in the name of being technically correct. You're a human being with a frickin soul. And believe me, it is noticed.Amy: I really appreciate that. If somebody's just listening to hearing my voice and in my name, right, like, I have a low voice. And in most of my career, I was extremely technical, like, to the point where you know, if something was wrong technically, I would fight to the death to get the right technical solution and maybe not see the complexity around the decisions, and why things were the way they were in the way I can today. And that's changed how I sound. It's changed how I talk. It's changed how I look at and talk about technology as well, right? I'm just not that interested in Kubernetes. Because I've kind of started looking up the stack in this kind of pursuit.Corey: Yeah, when I say you don't sound like an engineer, I am in no way shape or form—Amy: I know.Corey: —alluding in any respect to your technical acumen. I feel the need to clarify that statement for people who might be listening, and say, “Hey, wait a minute. Is he being a shithead?” No.Amy: No, no, no.Corey: Well, not the kind you're worried I'm being anyway; I'm a different breed of shithead and that's fine.Amy: Yeah, I should remember that other people don't know we've had conversations that are deeply technical, that aren't on air, that aren't context anybody else has. And so, like, I bring that deep technical knowledge, you know, the ability to talk about PCI Express, and kilovolts [unintelligible 00:19:58] rack, and top-of-rack switches, and network topologies, all of that together now, but what's really fascinating is where the really big impact is, for reliability, for security, for quality, the things that me as a person, that I'm driven by—products are cool, but, like, I like them to be reliable; that's the part that I like—really come down to more leadership, and business acumen, and understanding the business constraints, and then being able to get heard by an audience that isn't necessarily technical, that doesn't necessarily understand the difference between PCI, PCI-X, and PCI Express. There's a difference between those. It doesn't mean anything to the business, right, so when we want to go and talk about why are we doing, for example, multi-region deployment of our application? If I come in and say, “Well, because we want to use Raft.” That's going to fall flat, right?The business is going to go, “I don't care about Raft. What does that have to do with my customers?” Which is the right question to always ask. Instead, when I show up and say, “Okay, what's going on here is we have this application sits in a single region—or in a single data center or whatever, right? I'm using region because that's probably what most of the people listening understand—you know, so I put my application in a single region and it goes down, our customers are going to be unhappy. We have the alternative to spend, okay, not a little bit more money, probably a lot more money to build a second region, and the benefit we will get is that our customers will be able to access the service 24x7, and it will always work and they'll have a wonderful experience. 
And maybe they'll keep coming back and buy more stuff from us.”And so, when I talk about it in those terms, right—and it's usually more nuanced than that—then I start to get the movement at the macro level, right, in the systemic level of the business in the direction I want it to go, which is for the product group to understand why reliability matters to the customer, you know? For the individual engineers to understand why it matters that we use secure coding practices.[midroll 00:21:56]Corey: Getting back to the reason I said that you are not quota-carrying and you are not incentivized to push things in a particular way is that often we'll meet zealots, and I've never known you to be one, you have always been a strong advocate for doing the right thing, even if it doesn't directly benefit any given random employer that you might have. And as a result, one of the things that you've said to me repeatedly is if you're building something from scratch, for God's sake, put it in cloud. What is wrong with you? Do that. The idea of building it yourself on low-lying, underlying primitives for almost every modern SaaS style workload, there's no reason to consider doing something else in almost any case. Is that a fair representation of your position on this?Amy: It is. I mean, the simpler version right, “Is why the hell are you doing undifferentiated lifting?” Right? Things that don't differentiate your product, why would you do it?Corey: The thing that this has empowered then is I can build an experiment tonight—I don't have to wait for provisioning and signed contracts and do all the rest. I can spend 25 cents and get the experiment up and running. If it takes off, though, it has changed how I move going forward as well because there's no difference in the way that there was back when we were in data centers. I'm going to try and experiment I'm going to run it in this, I don't know, crappy Raspberry Pi or my desktop or something under my desk somewhere. And if it takes off and I have to scale up, I got to do a giant migration to real enterprise-grade hardware. With cloud, you are getting all of that out of the box, even if all you're doing with it is something ridiculous and nonsensical.Amy: And you're often getting, like, ridiculously better service. So, 20 years ago, if you and I sat down to build a SaaS app, we would have spun up a Linux box somewhere in a colo, and we would have spun up Apache, MySQL, maybe some Perl or PHP if we were feeling frisky. And the availability of that would be one machine could do, what we could handle in terms of one MySQL instance. But today if I'm spinning up a new stack for some the same kind of SaaS, I'm going to probably deploy it into an ASG, I'm probably going to have some kind of high availability database be on it—and I'm going to use Aurora as an example—because, like, the availability of an Aurora instance, in terms of, like, if I'm building myself up with even the very best kit available in databases, it's going to be really hard to hit the same availability that Aurora does because Aurora is not just a software solution, it's also got a team around it that stewards that 24/7. And it continues to evolve on its own.And so, like, the base, when we start that little tiny startup, instead of being that one machine, we're actually starting at a much higher level of quality, and availability, and even security sometimes because of these primitives that were available. 
And I probably should go on to extend on the thought of undifferentiated lifting, right, and coming back to the colo or the edge story, which is that there are still some little edge cases, right? Like I think for SaaS, duh right? Like, go straight to. But there are still some really interesting things where there's, like, hardware innovations where they're doing things with GPUs and stuff like that.Where the colo experience may be better because you're trying to do, like, custom hardware, in which case you are in a colo. There are businesses doing some really interesting stuff with custom hardware that's behind an application stack. What's really cool about some of that, from my perspective, is that some of that might be sitting on, say, bare metal with us, and maybe the front-end is sitting somewhere else. Because the other thing Equinix does really well is this product we call a Fabric which lets us basically do peering with any of the cloud providers.Corey: Yeah, the reason, I guess I don't consider you as a quote-unquote, “Cloud,” is first and foremost, rooted in the fact that you don't have a bandwidth model that is free and grass and criminally expensive to send it anywhere that isn't to you folks. Like, are you really a cloud if you're not just gouging the living piss out of your customers every time they want to send data somewhere else?Amy: Well, I mean, we like to say we're part of the cloud. And really, that's actually my favorite feature of Metal is that you get, I think—Corey: Yeah, this was a compliment, to be very clear. I'm a big fan of not paying 1998 bandwidth pricing anymore.Amy: Yeah, but this is the part where I get to do a little bit of, like, showing off for Metal a little bit, in that, like, when you buy a Metal server, there's different configurations, right, but, like, I think the lowest one, you have dual 10 Gig ports to the server that you can get either in a bonded mode so that you have a single 20 Gig interface in your operating system, or you can actually do L3 and you can do BGP to your server. And so, this is a capability that you really can't get at all on the other clouds, right? This lets you do things with the network, not only the bandwidth, right, that you have available. Like, you want to stream out 25 gigs of bandwidth out of us, I think that's pretty doable. And the rates—I've only seen a couple of comparisons—are pretty good.So, this is like where some of the business opportunities, right—and I can't get too much into it, but, like, this is all public stuff I've talked about so far—which is, that's part of the opportunity there is sitting at the crossroads of the internet, we can give you a server that has really great networking, and you can do all the cool custom stuff with it, like, BGP, right? Like, so that you can do Anycast, right? You can build Anycast applications.Corey: I miss the days when that was a thing that made sense.Amy: [laugh].Corey: I mean that in the context of, you know, with the internet and networks. These days, it always feels like the network engineering as slipped away within the cloud because you have overlays on top of overlays and it's all abstractions that are living out there right until suddenly you really need to know what's going on. But it has abstracted so much of this away. And that, on some level, is the surprise people are often in for when they wind up outgrowing the cloud for a workload and wanting to move it someplace that doesn't, you know, ride them like naughty ponies for bandwidth. 
And they have to rediscover things that we've mostly forgotten about.I remember having to architect significantly around the context of hard drive failures. I know we've talked about that a fair bit as a thing, but yeah, it's spinning metal, it throws off heat and if you lose the wrong one, your data is gone and you now have serious business problems. In cloud, at least AWS-land, that's not really a thing anymore. The way EBS is provisioned, there's a slight tick in latency if you're looking at just the right time for what I think is a hard drive failure, but it's there. You don't have to think about this anymore.Migrate that workload to a pile of servers in a colo somewhere, guess what? Suddenly your reliability is going to decrease. Amazon, and the other cloud providers as well, have gotten to a point where they are better at operations than you are at your relatively small company with your nascent sysadmin team. I promise. There is an economy of scale here.Amy: And it doesn't have to be good or better, right? It's just simply better resourced—Corey: Yeah.Amy: Than most anybody else can hope. Amazon can throw a billion dollars at it and never miss it. In most organizations out there, you know, and most of the especially enterprise, people are scratching and trying to get resources wherever they can, right? They're all competing for people, for time, for engineering resources, and that's one of the things that gets freed up when you just basically bang an API and you get the thing you want. You don't have to go through that kind of old world internal process that is usually slow and often painful.Just because they're not resourced as well; they're not automated as well. Maybe they could be. I'm sure most of them could, in theory be, but we come back to undifferentiated lifting. None of this helps, say—let me think of another random business—Claire's, whatever, like, any of the shops in the mall, they all have some kind of enterprise behind them for cash processing and all that stuff, point of sale, none of this stuff is differentiating for them because it doesn't impact anything to do with where the money comes in. So again, we're back at why are you doing this?Corey: I think that's also the big challenge as well, when people start talking about repatriation and talking about this idea that they are going to, oh, that cloud is too expensive; we're going to move out. And they make the economics work. Again, I do firmly believe that, by and large, businesses do not intentionally go out and make poor decisions. I think when we see a company doing something inscrutable, there's always context that we're missing, and I think as a general rule of thumb, that at these companies do not hire people who are fools. And there are always constraints that they cannot talk about in public.My general position as a consultant, and ideally as someone who aspires to be a decent human being, is that when I see something I don't understand, I assume that there's simply a lack of context, not that everyone involved in this has been foolish enough to make giant blunders that I can pick out in the first five seconds of looking at it. I'm not quite that self-confident yet.Amy: I mean, that's a big part of, like, the career progression into above senior engineer, right, is, you don't get to sit in your chair and go, like, “Oh, those dummies,” right? You actually have—I don't know about ‘have to,' but, like, the way I operate now, right, is I remember in my youth, I used to be like, “Oh, those business people. 
They don't know, nothing. Like, what are they doing?” You know, it's goofy what they're doing.And then now I have a different mode, which is, “Oh, that's interesting. Can you tell me more?” The feeling is still there, right? Like, “Oh, my God, what is going on here?” But then I get curious, and I go, “So, how did we get here?” [laugh]. And you get that story, and the stories are always fascinating, and they always involve, like, constraints, immovable objects, people doing the best they can with what they have available.Corey: Always. And I want to be clear that very rarely is it the right answer to walk into a room and say, look at the architecture and, “All right, what moron built this?” Because always you're going to be asking that question to said moron. And it doesn't matter how right you are, they're never going to listen to another thing out of your mouth again. And have some respect for what came before even if it's potentially wrong answer, well, great. “Why didn't you just use this service to do this instead?” “Yeah, because this thing predates that by five years, jackass.”There are reasons things are the way they are, if you take any architecture in the world and tell people to rebuild it greenfield, almost none of them would look the same as they do today because we learn things by getting it wrong. That's a great teacher, and it hurts. But it's also true.Amy: And we got to build, right? Like, that's what we're here to do. If we just kind of cycle waiting for the perfect technology, the right choices—and again, to come back to the people who built it at the time used—you know, often we can fault people for this—used the things they know or the things that are nearby, and they make it work. And that's kind of amazing sometimes, right?Like, I'm sure you see architectures frequently, and I see them too, probably less frequently, where you just go, how does this even work in the first place? Like how did you get this to work? Because I'm looking at this diagram or whatever, and I don't understand how this works. Maybe that's a thing that's more a me thing, like, because usually, I can look at a—skim over an architecture document and be, like, be able to build the model up into, like, “Okay, I can see how that kind of works and how the data flows through it.” I get that pretty quickly.And comes back to that, like, just, again, asking, “How did we get here?” And then the cool part about asking how did we get here is it sets everybody up in the room, not just you as the person trying to drive change, but the people you're trying to bring along, the original architects, original engineers, when you ask, how did we get here, you've started them on the path to coming along with you in the future, which is kind of cool. But until—that storytelling mode, again, is so powerful at almost every level of the stack, right? And that's why I just, like, when we were talking about how technical I bring things in, again, like, I'm just not that interested in, like, are you Little Endian or Big Endian? How did we get here is kind of cool. You built a Big Endian architecture in 2022? Like, “Ohh. [laugh]. How do we do that?”Corey: Hey, leave me to my own devices, and I need to build something super quickly to get it up and running, well, what I'm going to do, for a lot of answers is going to look an awful lot like the traditional three-tier architecture that I was running back in 2008. Because I know it, it works well, and I can iterate rapidly on it. Is it a best practice? 
Absolutely not, but given the constraints, sometimes it's the fastest thing to grab? “Well, if you built this in serverless technologies, it would run at a fraction of the cost.” It's, “Yes, but if I run this thing, the way that I'm running it now, it'll be $20 a month, it'll take me two hours instead of 20. And what exactly is your time worth, again?” It comes down to the better economic model of all these things.Amy: Any time you're trying to make a case to the business, the economic model is going to always go further. Just general tip for tech people, right? Like if you can make the better economic case and you go to the business with an economic case that is clear. Businesses listen to that. They're not going to listen to us go on and on about distributed systems.Somebody in finance trying to make a decision about, like, do we go and spend a million bucks on this, that's not really the material thing. It's like, well, how is this going to move the business forward? And how much is it going to cost us to do it? And what other opportunities are we giving up to do that?Corey: I think that's probably a good place to leave it because there's no good answer. We can all think about that until the next episode. I really want to thank you for spending so much time talking to me again. If people want to learn more, where's the best place for them to find you?Amy: Always Twitter for me, MissAmyTobey, and I'll see you there. Say hi.Corey: Thank you again for being as generous with your time as you are. It's deeply appreciated.Amy: It's always fun.Corey: Amy Tobey, Senior Principal Engineer at Equinix Metal. I'm Cloud Economist Corey Quinn and this is Screaming in the Cloud. If you've enjoyed this podcast please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice along with an angry comment that tells me exactly what we got wrong in this episode in the best dialect you have of condescending engineer with zero people skills. I look forward to reading it.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.Announcer: This has been a HumblePod production. Stay humble.

Screaming in the Cloud
Reliability Starts in Cultural Change with Amy Tobey

May 11, 2022 · 46:37


About Amy

Amy Tobey has worked in tech for more than 20 years at companies of every size, working with everything from kernel code to user interfaces. These days she spends her time building an innovative Site Reliability Engineering program at Equinix, where she is a principal engineer. When she's not working, she can be found with her nose in a book, watching anime with her son, making noise with electronics, or doing yoga poses in the sun.

Links Referenced:
Equinix Metal: https://metal.equinix.com
Personal Twitter: https://twitter.com/MissAmyTobey
Personal Blog: https://tobert.github.io/

Transcript

Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.

Corey: This episode is sponsored in part by our friends at Vultr. Optimized cloud compute plans have landed at Vultr to deliver lightning-fast processing power, courtesy of third-gen AMD EPYC processors without the IO or hardware limitations of a traditional multi-tenant cloud server. Starting at just 28 bucks a month, users can deploy general-purpose, CPU, memory, or storage optimized cloud instances in more than 20 locations across five continents. Without looking, I know that once again, Antarctica has gotten the short end of the stick. Launch your Vultr optimized compute instance in 60 seconds or less on your choice of included operating systems, or bring your own. It's time to ditch convoluted and unpredictable giant tech company billing practices and say goodbye to noisy neighbors and egregious egress forever. Vultr delivers the power of the cloud with none of the bloat. "Screaming in the Cloud" listeners can try Vultr for free today with $150 in credit when they visit getvultr.com/screaming. That's G-E-T-V-U-L-T-R dot com slash screaming. My thanks to them for sponsoring this ridiculous podcast.

Corey: Finding skilled DevOps engineers is a pain in the neck! And if you need to deploy a secure and compliant application to AWS, forgettaboutit! But that's where DuploCloud can help. Their comprehensive no-code/low-code software platform guarantees a secure and compliant infrastructure in as little as two weeks, while automating the full DevSecOps lifecycle. Get started with DevOps-as-a-Service from DuploCloud so that your cloud configurations are done right the first time. Tell them I sent you and your first two months are free. To learn more visit: snark.cloud/duplo. That's snark.cloud/D-U-P-L-O-C-L-O-U-D.

Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. Every once in a while I catch up with someone that it feels like I've known for ages, and I realize somehow I have never been able to line up getting them on this show as a guest. Today is just one of those days. And my guest is Amy Tobey who has been someone I've been talking to for ages, even in the before-times, if you can remember such a thing. Today, she's a Senior Principal Engineer at Equinix. Amy, thank you for finally giving in to my endless wheedling.

Amy: Thanks for having me. You mentioned the before-times. Like, I remember it was, like, right before the pandemic we had beers in San Francisco, wasn't it? There was Ian there—

Corey: Yeah, I—

Amy: —and a couple other people. It was a really great time. And then—

Corey: I vaguely remember beer. Yeah.
And then—Amy: And then the world ended.Corey: Oh, my God. Yes. It's still March of 2020, right?Amy: As far as I know. Like, I haven't checked in a couple years.Corey: So, you do an awful lot. And it's always a difficult question to ask someone, so can you encapsulate your entire existence in a paragraph? It's—Amy: [sigh].Corey: —awful, so I'd like to give a bit more structure to it. Let's start with the introduction: You are a Senior Principal Engineer. We know it's high level because of all the adjectives that get put in there, and none of those adjectives are ‘associate' or ‘beginner' or ‘junior,' or all the other diminutives that companies like to play games with to justify paying people less. And you're at Equinix, which is a company that is a bit unlike most of the, shall we say, traditional cloud providers. What do you do over there and both as a company, as a person?Amy: So, as a company Equinix, what most people know about is that we have a whole bunch of data centers all over the world. I think we have the most of any company. And what we do is we lease out space in that data center, and then we have a number of other products that people don't know as well, which one is Equinix Metal, which is what I specifically work on, where we rent you bare-metal servers. None of that fancy stuff that you get any other clouds on top of it, there's things you can get that are… partner things that you can add-on, like, you know, storage and other things like that, but we just deliver you bare-metal servers with really great networking. So, what I work on is the reliability of that whole system. All of the things that go into provisioning the servers, making them come up, making sure that they get delivered to the server, make sure the API works right, all of that stuff.Corey: So, you're on the Equinix cloud side of the world more so than you are on the building data centers by the sweat of your brow, as they say?Amy: Correct. Yeah, yeah. Software side.Corey: Excellent. I spent some time in data centers in the early part of my career before cloud ate that. That was sort of cotemporaneous with the discovery that I'm the hardware destruction bunny, and I should go to great pains to keep my aura from anything expensive and important, like, you know, the SAN. So—Amy: Right, yeah.Corey: Companies moving out of data centers, and me getting out was a great thing.Amy: But the thing about SANs though, is, like, it might not be you. They're just kind of cursed from the start, right? They just always were kind of fussy and easy to break.Corey: Oh, yeah. I used to think—and I kid you not—that I had a limited upside to my career in tech because I sometimes got sloppy and I was fairly slow at crimping ethernet cables.Amy: [laugh].Corey: That is very similar to growing up in third grade when it became apparent that I was going to have problems in my career because my handwriting was sloppy. Yeah, it turns out the future doesn't look like we predicted it would.Amy: Oh, gosh. Are we going to talk about, like, neurological development now or… [laugh] okay, that's a thing I struggle with, too right, is I started typing as soon as they would let—in fact, before they would let me. I remember in high school, I had teachers who would grade me down for typing a paper out. They want me to handwrite it and I would go, “Cool. Go ahead and take a grade off because if I handwrite it, you're going to take two grades off my handwriting, so I'm cool with this deal.”Corey: Yeah, it was pretty easy early on. 
I don't know when the actual shift was, but it became more and more apparent that more and more things are moving towards a world where you could type. And I was almost five when I started working on that stuff, and that really wound up changing a lot of aspects of how I started seeing things. One thing I think you're probably fairly well known for is incidents. I want to be clear when I say that you are not the root cause as—“So, why are things broken?” “It's Amy again. What's she gotten into this time?” Great.Amy: [laugh]. But it does happen, but not all the time.Corey: Exa—it's a learning experience.Amy: Right.Corey: You've also been deeply involved with SREcon and a number of—a lot of aspects of what I will term—and please don't yell at me for this—SRE culture—Amy: Yeah.Corey: Which is sometimes a challenging thing to wind up describing or putting a definition around. The one that I've always been somewhat partial to is, “SRE is DevOps, except you worked at Google for a while.” I don't know how necessarily accurate that is, but it does rile people up.Amy: Yeah, it does. Dave Stanke actually did a really great talk at SREcon San Francisco just a couple weeks ago, about the DORA report. And the new DORA report, they split SRE out into its own function and kind of is pushing against that old model, which actually comes from Liz Fong-Jones—I think it's from her, or older—about, like, class SRE implements DevOps, which is kind of this idea that, like, SREs make DevOps happen. Things have evolved, right, since then. Things have evolved since Google released those books, and we're all just figured out what works and what doesn't a little bit.And so, it's not that we're implementing DevOps so much. In fact, it's that ops stuff that kind of holds us back from the really high impact work that SREs, I think, should be doing, that aren't just, like, fixing the problems, the symptoms down at the bottom layer, right? Like what we did as sysadmins 20 years ago. You know, we'd go and a lot of people are SREs that came out of the sysadmin world and still think in that mode, where it's like, “Well, I set up the systems, and when things break, I go and I fix them.” And, “Why did the developers keep writing crappy code? Why do I have to always getting up in the middle of the night because this thing crashed?”And it turns out that the work we need to do to make things more reliable, there's a ceiling to how far away the platform can take us, right? Like, we can have the best platform in the world with redundancy, and, you know, nine-way replicated data storage and all this crazy stuff, and still if we put crappy software on top, it's going to be unreliable. So, how do we make less crappy software? And for most of my career, people would be, like, “Well, you should test it.” And so, we started doing that, and we still have crappy software, so what's going on here? We still have incidents.So, we write more tests, and we still have incidents. We had a QA group, we still have incidents. We send the developers to training, and we still have incidents. So like, what is the thing we need to do to make things more reliable? And it turns out, most of it is culture work.Corey: My perspective on this stems from being a grumpy old sysadmin. And at some point, I started calling myself a systems engineer or DevOps or production engineer, or SRE. 
It was all from my point of view, the same job, but you know, if you call yourself a sysadmin, you're just asking for a 40% pay cut off the top.Amy: [laugh].Corey: But I still tended to view the world through that lens. I tended to be very good at Linux systems internals, for example, understanding system calls and the rest, but increasingly, as the DevOps wave or SRE wave, or Google-isation of the internet wound up being more and more of a thing, I found myself increasingly in job interviews, where, “Great, now, can you go wind up implementing a sorting algorithm on the whiteboard?” “What on earth? No.” Like, my lingua franca is shitty Bash, and no one tends to write that without a bunch of tab completions and quick checking with manpages—die.net or whatnot—on the fly as you go down that path.And it was awful, and I felt… like my skill set was increasingly eroding. And it wasn't honestly until I started this place where I really got into writing a fair bit of code to do different things because it felt like an orthogonal skill set, but the fullness of time, it seems like it's not. And it's a reskilling. And it made me wonder, does this mean that the areas of technology that I focused on early in my career, was that all a waste? And the answer is not really. Sometimes, sure, in that I don't spend nearly as much time worrying about inodes—for example—as I once did. But every once in a while, I'll run into something and I looked like a wizard from the future, but instead, I'm a wizard from the past.Amy: Yeah, I find that a lot in my work, now. Sometimes things I did 20 years ago, come back, and it's like, oh, yeah, I remember I did all that threading work in 2002 in Perl, and I learned everything the very, very, very hard way. And then, you know, this January, did some threading work to fix some stability issues, and all of it came flooding back, right? Just that the experiences really, more than the code or the learning or the text and stuff; more just the, like, this feels like threads [BLEEP]-ery. Is a diagnostic thing that sometimes we have to say.And then people are like, “Can you prove it?” And I'm like, “Not really,” because it's literally thread [BLEEP]-ery. Like, the definition of it is that there's weird stuff happening that we can't figure out why it's happening. There's something acting in the system that isn't synchronized, that isn't connected to other things, that's happening out of order from what we expect, and if we had a clear signal, we would just fix it, but we don't. We just have, like, weird stuff happening over here and then over there and over there and over there.And, like, that tells me there's just something happening at that layer and then have to go and dig into that right, and like, just basically charge through. My colleagues are like, “Well, maybe you should look at this, and go look at the database,” the things that they're used to looking at and that their experiences inform, whereas then I bring that ancient toiling through the threading mines experiences back and go, “Oh, yeah. So, let's go find where this is happening, where people are doing dangerous things with threads, and see if we can spot something.” But that came from that experience.Corey: And there's so much that just repeats itself. And history rhymes. The challenge is that, do you have 20 years of experience, or do you have one year of experience repeated 20 times? 
And as the tide rises, doing the same task by hand, it really is just a matter of time before your full-time job winds up being something a piece of software does. An easy example is, “Oh, what's your job?” “I manually place containers onto specific hosts.” “Well, I've got news for you, and you're not going to like it at all.”Amy: Yeah, yeah. I think that we share a little bit. I'm allergic to repeated work. I don't know if allergic is the right word, but you know, if I sit and I do something once, fine. Like, I'll just crank it out, you know, it's this form, or it's a datafile I got to write and I'll—fine I'll type it in and do the manual labor.The second time, the difficulty goes up by ten, right? Like, just mentally, just to do it, be like, I've already done this once. Doing it again is anathema to everything that I am. And then sometimes I'll get through it, but after that, like, writing a program is so much easier because it's like exponential, almost, growth in difficulty. You know, the third time I have to do the same thing that's like just typing the same stuff—like, look over here, read this thing and type it over here—I'm out; I can't do it. You know, I got to find a way to automate. And I don't know, maybe normal people aren't driven to live this way, but it's kept me from getting stuck in those spots, too.Corey: It was weird because I spent a lot of time as a consultant going from place to place and it led to some weird changes. For example, “Oh, thank God, I don't have to think about that whole messaging queue thing.” Sure enough, next engagement, it's message queue time. Fantastic. I found that repeating myself drove me nuts, but you also have to be very sensitive not to wind up, you know, stealing IP from the people that you're working with.Amy: Right.Corey: But what I loved about the sysadmin side of the world is that the vast majority of stuff that I've taken with me, lives in my shell config. And what I mean by that is I'm not—there's nothing in there is proprietary, but when you have a weird problem with trying to figure out the best way to figure out which Ruby process is stealing all the CPU, great, turns out that you can chain seven or eight different shell commands together through a bunch of pipes. I don't want to remember that forever. So, that's the sort of thing I would wind up committing as I learned it. I don't remember what company I picked that up at, but it was one of those things that was super helpful.I have a sarcastic—it's a one-liner, except no sane editor setting is going to show it in any less than three—of a whole bunch of Perl, piped into du, piped into the rest, that tells you one of the largest consumers of files in a given part of the system. And it rates them with stars and it winds up doing some neat stuff. I would never sit down and reinvent something like that today, but the fact that it's there means that I can do all kinds of neat tricks when I need to. It's making sure that as you move through your career, on some level, you're picking up skills that are repeatable and applicable beyond one company.Amy: Skills and tooling—Corey: Yeah.Amy: —right? Like, you just described the tool. Another SREcon talk was John Allspaw and Dr. Richard Cook talking about above the line; below the line. And they started with these metaphors about tools, right, showing all the different kinds of hammers.And if you're a blacksmith, a lot of times you craft specialized hammers for very specific jobs. 
And that's one of the properties of a tool that they were trying to get people to think about, right, is that tools get crafted to the job. And what you just described as a bespoke tool that you had created on the fly, that kind of floated under the radar of intellectual property. [laugh].So, let's not tell the security or IP people right? Like, because there's probably billions and billions of dollars of technically, like, made-up IP value—I'm doing air quotes with my fingers—you know, that's just basically people's shell profiles. And my God, the Emacs automation that people have done. If you've ever really seen somebody who's amazing at Emacs and is 10, 20, 30, maybe 40 years of experience encoded in their emacs settings, it's a wonder to behold. Like, I look at it and I go, “Man, I wish I could do that.”It's like listening to a really great guitar player and be like, “Wow, I wish I could play like them.” You see them just flying through stuff. But all that IP in there is both that person's collection of wisdom and experience and working with that code, but also encodes that stuff like you described, right? It's just all these little systems tricks and little fiddly commands and things we don't want to remember and so we encode them into our toolset.Corey: Oh, yeah. Anything I wound up taking, I always would share it with people internally, too. I'd mention, “Yeah, I'm keeping this in my shell files.” Because I disclosed it, which solves a lot of the problem. And also, none of it was even close to proprietary or anything like that. I'm sorry, but the way that you wind up figuring out how much of a disk is being eaten up and where in a more pleasing way, is not a competitive advantage. It just isn't.Amy: It isn't to you or me, but, you know, back in the beginning of our careers, people thought it was worth money and should be proprietary. You know, like, oh, that disk-checking script as a competitive advantage for our company because there are only a few of us doing this work. Like, it was actually being able to, like, manage your—[laugh] actually manage your servers was a competitive advantage. Now, it's kind of commodity.Corey: Let's also be clear that the world has moved on. I wound up buying a DaisyDisk a while back for Mac, which I love. It is a fantastic, pretty effective, “Where's all the stuff on your disk going?” And it does a scan and you can drive and collect things and delete them when trying to clean things out. I was using it the other day, so it's top of mind at the moment.But it's way more polished than that crappy Perl three-liner. And I see both sides, truly I do. The trick also, for those wondering [unintelligible 00:15:45], like, “Where is the line?” It's super easy. Disclose it, what you're doing, in those scenarios in the event someone is no because they believe that finding the right man page section for something is somehow proprietary.Great. When you go home that evening in a completely separate environment, build it yourself from scratch to solve the problem, reimplement it and save that. And you're done. There are lots of ways to do this. 
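For readers wondering what that kind of bespoke disk-usage tool actually looks like, here is a rough sketch of the "largest consumers, rated with stars" idea Corey described a moment ago — written in Go rather than the original Perl-piped-into-du one-liner, with the ten-entry cutoff and the star scaling invented purely for illustration:

```go
// dutop.go: a rough, hypothetical stand-in for the Perl-and-du one-liner
// described above. It sums file sizes under each immediate child of a
// directory and rates the biggest consumers with stars.
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"sort"
	"strings"
)

func main() {
	root := "."
	if len(os.Args) > 1 {
		root = os.Args[1]
	}

	sizes := map[string]int64{}
	// Attribute every file's size to the top-level entry it lives under.
	filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil || d.IsDir() {
			return nil // skip unreadable entries rather than aborting
		}
		rel, err := filepath.Rel(root, path)
		if err != nil {
			return nil
		}
		top := strings.Split(rel, string(filepath.Separator))[0]
		if info, err := d.Info(); err == nil {
			sizes[top] += info.Size()
		}
		return nil
	})

	type entry struct {
		name string
		size int64
	}
	var entries []entry
	var max int64
	for name, size := range sizes {
		entries = append(entries, entry{name, size})
		if size > max {
			max = size
		}
	}
	sort.Slice(entries, func(i, j int) bool { return entries[i].size > entries[j].size })

	// Print the top ten, with a star bar proportional to the largest entry.
	for i, e := range entries {
		if i == 10 {
			break
		}
		stars := int64(1)
		if max > 0 {
			stars = e.size * 10 / max
			if stars < 1 {
				stars = 1
			}
		}
		fmt.Printf("%-30s %12d  %s\n", e.name, e.size, strings.Repeat("*", int(stars)))
	}
}
```

Point it at a directory (`go run dutop.go /var/log`) and it prints the biggest immediate children with their byte counts and a proportional star bar — the same trick, minus twenty years of accumulated Perl.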
Don't steal from your employer, but your employer employs you; they don't own you and the way that you think about these problems.Every person I've met who has had a career that's longer than 20 minutes has a giant doc somewhere on some system of all of the scripts that they wound up putting together, all of the one-liners, the notes on, “Next time you see this, this is the thing to check.”Amy: Yeah, the cheat sheet or the notebook with all the little commands, or again the Emacs config, sometimes for some people, or shell profiles. Yeah.Corey: Here's the awk one-liner that I put that automatically spits out from an Apache log file what—the httpd log file that just tells me what are the most frequent talkers, and what are the—Amy: You should probably let go of that one. You know, like, I think that one's lifetime is kind of past, Corey. Maybe you—Corey: I just have to get it working with Nginx, and we're good to go.Amy: Oh, yeah, there you go. [laugh].Corey: Or S3 access logs. Perish the thought. But yeah, like, what are the five most high-volume talkers, and what are those relative to each other? Huh, that one thing seems super crappy and it's coming from Russia. But that's—hmm, one starts to wonder; maybe it's time to dig back in.So, one of the things that I have found is that a lot of the people talking about SRE seem to have descended from an ivory tower somewhere. And they're talking about how some of the best-in-class companies out there, renowned for their technical cultures—at least externally—are doing these things. But there's a lot more folks who are not there. And honestly, I consider myself one of those people who is not there. I was a competent engineer, but never a terrific one.And looking at the way this was described, I often came away thinking, “Okay, it was the purpose of this conference talk just to reinforce how smart people are, and how I'm not,” and/or, “There are the 18 cultural changes you need to make to your company, and then you can do something kind of like we were just talking about on stage.” It feels like there's a combination of problems here. One is making this stuff more accessible to folks who are not themselves in those environments, and two, how to drive cultural change as an individual contributor if that's even possible. And I'm going to go out on a limb and guess you have thoughts on both aspects of that, and probably some more hit me, please.Amy: So, the ivory tower, right. Let's just be straight up, like, the ivory tower is Google. I mean, that's where it started. And we get it from the other large companies that, you know, want to do conference talks about what this stuff means and what it does. What I've kind of come around to in the last couple of years is that those talks don't really reach the vast majority of engineers, they don't really apply to a large swath of the enterprise especially, which is, like, where a lot of the—the bulk of our industry sits, right? We spend a lot of time talking about the darlings out here on the West Coast in high tech culture and startups and so on.But, like, we were talking about before we started the show, right, like, the interior of even just America, is filled with all these, like, insurance and banks and all of these companies that are cranking out tons of code and servers and stuff, and they're trying to figure out the same problems. 
But they're structured in companies where their tech arm is still, in most cases, considered a cost center, often is bundled under finance, for—that's a whole show of itself about that historical blunder. And so, the tech culture is tend to be very, very different from what we experience in—what do we call it anymore? Like, I don't even want to say West Coast anymore because we've gone remote, but, like, high tech culture we'll say. And so, like, thinking about how to make SRE and all this stuff more accessible comes down to, like, thinking about who those engineers are that are sitting at the computers, writing all the code that runs our banks, all the code that makes sure that—I'm trying to think of examples that are more enterprise-y right?Or shoot buying clothes online. You go to Macy's for example. They have a whole bunch of servers that run their online store and stuff. They have internal IT-ish people who keep all this stuff running and write that code and probably integrating open-source stuff much like we all do. But when you go to try to put in a reliability program that's based on the current SRE models, like SLOs; you put in SLOs and you start doing, like, this incident management program that's, like, you know, you have a form you fill out after every incident, and then you [unintelligible 00:20:25] retros.And it turns out that those things are very high-level skills, skills and capabilities in an organization. And so, when you have this kind of IT mindset or the enterprise mindset, bringing the culture together to make those things work often doesn't happen. Because, you know, they'll go with the prescriptive model and say, like, okay, we're going to implement SLOs, we're going to start measuring SLIs on all of the services, and we're going to hold you accountable for meeting those targets. If you just do that, right, you're just doing more gatekeeping and policing of your tech environment. My bet is, reliability almost never improves in those cases.And that's been my experience, too, and why I get charged up about this is, if you just go slam in these practices, people end up miserable, the practices then become tarnished because people experienced the worst version of them. And then—Corey: And with the remote explosion as well, it turns out that changing jobs basically means their company sends you a different Mac, and the next Monday, you wind up signing into a different Slack team.Amy: Yeah, so the culture really matters, right? You can't cover it over with foosball tables and great lunch. You actually have to deliver tools that developers want to use and you have to deliver a software engineering culture that brings out the best in developers instead of demanding the best from developers. I think that's a fundamental business shift that's kind of happening. If I'm putting on my wizard hat and looking into the future and dreaming about what might change in the world, right, is that there's kind of a change in how we do leadership and how we do business that's shifting more towards that model where we look at what people are capable of and we trust in our people, and we get more out of them, the knowledge work model.If we want more knowledge work, we need people to be happy and to feel engaged in their community. And suddenly we start to see these kind of generational, bigger-pie kind of things start to happen. But how do we get there? It's not SLOs. It maybe it's a little bit starting with incidents. That's where I've had the most success, and you asked me about that. 
So, getting practical, incident management is probably—Corey: Right. Well, as I see it, the problem with SLOs across the board is it feels like it's a very insular community so far, and communicating it to engineers seems to be the focus of where the community has been, but from my understanding of it, you absolutely need buy-in at significantly high executive levels, to at the very least by you air cover while you're doing these things and making these changes, but also to help drive that cultural shift. None of this is something I have the slightest clue how to do, let's be very clear. If I knew how to change a company's culture, I'd have a different job.Amy: Yeah. [laugh]. The biggest omission in the Google SRE books was [Ers 00:22:58]. There was a guy at Google named Ers who owns availability for Google, and when anything is, like, in dispute and bubbles up the management team, it goes to Ers, and he says, “Thou shalt…” right? Makes the call. And that's why it works, right?Like, it's not just that one person, but that system of management where the whole leadership team—there's a large, very well-funded team with a lot of power in the organization that can drive availability, and they can say, this is how you're going to do metrics for your service, and this is the system that you're in. And it's kind of, yeah, sure it works for them because they have all the organizational support in place. What I was saying to my team just the other day—because we're in the middle of our SLO rollout—is that really, I think an SLO program isn't [clear throat] about the engineers at all until late in the game. At the beginning of the game, it's really about getting the leadership team on board to say, “Hey, we want to put in SLIs and SLOs to start to understand the functioning of our software system.” But if they don't have that curiosity in the first place, that desire to understand how well their teams are doing, how healthy their teams are, don't do it. It's not going to work. It's just going to make everyone miserable.Corey: It feels like it's one of those difficult to sell problems as well, in that it requires some tooling changes, absolutely. It requires cultural change and buy-in and whatnot, but in order for that to happen, there has to be a painful problem that a company recognizes and is willing to pay to make go away. The problem with stuff like this is that once you pay, there's a lot of extra work that goes on top of it as well, that does not have a perception—rightly or wrongly—of contributing to feature velocity, of hitting the next milestone. It's, “Really? So, we're going to be spending how much money to make engineers happier? They should get paid an awful lot and they're still complaining and never seem happy. Why do I care if they're happy other than the pure mercenary perspective of otherwise they'll quit?” I'm not saying that it's not worth pursuing; it's not a worthy goal. I am saying that it becomes a very difficult thing to wind up selling as a product.Amy: Well, as a product for sure, right? Because—[sigh] gosh, I have friends in the space who work on these tools. And I want to be careful.Corey: Of course. 
Nothing but love for all of those people, let's be very clear.Amy: But a lot of them, you know, they're pulling metrics from existing monitoring systems, they are doing some interesting math on them, but what you get at the end is a nice service catalog and dashboard, which are things we've been trying to land as products in this industry for as long as I can remember, and—Corey: “We've got it this time, though. This time we'll crack the nut.” Yeah. Get off the island, Gilligan.Amy: And then the other, like, risky thing, right, is the other part that makes me uncomfortable about SLOs, and why I will often tell folks that I talk to out in the industry that are asking me about this, like, one-on-one, “Should I do it here?” And it's like, you can bring the tool in, and if you have a management team that's just looking to have metrics to drive productivity, instead of you know, trying to drive better knowledge work, what you get is just a fancier version of more Taylorism, right, which is basically scientific management, this idea that we can, like, drive workers to maximum efficiency by measuring random things about them and driving those numbers. It turns out, that doesn't really work very well, even in industrial scale, it just happened to work because, you know, we have a bloody enough society that we pushed people into it. But the reality is, if you implement SLOs badly, you get more really bad Taylorism that's bad for you developers. And my suspicion is that you will get worse availability out of it than you would if you just didn't do it at all.Corey: This episode is sponsored by our friends at Revelo. Revelo is the Spanish word of the day, and its spelled R-E-V-E-L-O. It means “I reveal.” Now, have you tried to hire an engineer lately? I assure you it is significantly harder than it sounds. One of the things that Revelo has recognized is something I've been talking about for a while, specifically that while talent is evenly distributed, opportunity is absolutely not. They're exposing a new talent pool to, basically, those of us without a presence in Latin America via their platform. It's the largest tech talent marketplace in Latin America with over a million engineers in their network, which includes—but isn't limited to—talent in Mexico, Costa Rica, Brazil, and Argentina. Now, not only do they wind up spreading all of their talent on English ability, as well as you know, their engineering skills, but they go significantly beyond that. Some of the folks on their platform are hands down the most talented engineers that I've ever spoken to. Let's also not forget that Latin America has high time zone overlap with what we have here in the United States, so you can hire full-time remote engineers who share most of the workday as your team. It's an end-to-end talent service, so you can find and hire engineers in Central and South America without having to worry about, frankly, the colossal pain of cross-border payroll and benefits and compliance because Revelo handles all of it. If you're hiring engineers, check out revelo.io/screaming to get 20% off your first three months. That's R-E-V-E-L-O dot I-O slash screaming.Corey: That is part of the problem is, in some cases, to drive some of these improvements, you have to go backwards to move forwards. And it's one of those, “Great, so we spent all this effort and money in the rest of now things are worse?” No, not necessarily, but suddenly are aware of things that were slipping through the cracks previously.Amy: Yeah. 
Yeah.Corey: Like, the most realistic thing about first The Phoenix Project and then The Unicorn Project, both by Gene Kim, has been the fact that companies have these problems and actively cared enough to change it. In my experience, that feels a little on the rare side.Amy: Yeah, and I think that's actually the key, right? It's for the culture change, and for, like, if you really looking to be, like, do I want to work at this company? Am I investing my myself in here? Is look at the leadership team and be, like, do these people actually give a crap? Are they looking just to punt another number down the road?That's the real question, right? Like, the technology and stuff, at the point where I'm at in my career, I just don't care that much anymore. [laugh]. Just… fine, use Kubernetes, use Postgres, [unintelligible 00:27:30], I don't care. I just don't. Like, Oracle, I might have to ask, you know, go to finance and be like, “Hey, can we spend 20 million for a database?” But like, nobody really asks for that anymore, so. [laugh].Corey: As one does. I will say that I mostly agree with you, but a technology that I found myself getting excited about, given the time of the recording on this is… fun, I spent a bit of time yesterday—from when we're recording this—teaching myself just enough Go to wind up being together a binary that I needed to do something actively ridiculous for my camera here. And I found myself coming away deeply impressed by a lot of things about it, how prescriptive it was for one, how self-contained for another. And after spending far too many years of my life writing shitty Perl, and shitty Bash, and worse Python, et cetera, et cetera, the prescriptiveness was great. The fact that it wound up giving me something I could just run, I could cross-compile for anything I need to run it on, and it just worked. It's been a while since I found a technology that got me this interested in exploring further.Amy: Go is great for that. You mentioned one of my two favorite features of Go. One is usually when a program compiles—at least the way I code in Go—it usually works. I've been working with Go since about 0.9, like, just a little bit before it was released as 1.0, and that's what I've noticed over the years of working with it is that most of the time, if you have a pretty good data structure design and you get the code to compile, usually it's going to work, unless you're doing weird stuff.The other thing I really love about Go and that maybe you'll discover over time is the malleability of it. And the reason why I think about that more than probably most folks is that I work on other people's code most of the time. And maybe this is something that you probably run into with your business, too, right, where you're working on other people's infrastructure. And the way that we encode business rules and things in the languages, in our programming language or our config syntax and stuff has a huge impact on folks like us and how quickly we can come into a situation, assess, figure out what's going on, figure out where things are laid out, and start making changes with confidence.Corey: Forget other people for a minute they're looking at what I built out three or four years ago here, myself, like, I look at past me, it's like, “What was that rat bastard thinking? 
This is awful.” And it's—forget other people's code; hell is your own code, on some level, too, once it's slipped out of the mental stack and you have to re-explore it and, “Oh, well thank God I defensively wound up not including any comments whatsoever explaining what the living hell this thing was.” It's terrible. But you're right, the other people's shell scripts are finicky and odd.I started poking around for help when I got stuck on something, by looking at GitHub, and a few bit of searching here and there. Even these large, complex, well-used projects started making sense to me in a way that I very rarely find. It's, “What the hell is that thing?” is my most common refrain when I'm looking at other people's code, and Go for whatever reason avoids that, I think because it is so prescriptive about formatting, about how things should be done, about the vision that it has. Maybe I'm romanticizing it and I'll hate it and a week from now, and I want to go back and remove this recording, but.Amy: The size of the language helps a lot.Corey: Yeah.Amy: But probably my favorite. It's more of a convention, which actually funny the way I'm going to talk about this because the two languages I work on the most right now are Ruby and Go. And I don't feel like two languages could really be more different.Syntax-wise, they share some things, but really, like, the mental models are so very, very different. Ruby is all the way in on object-oriented programming, and, like, the actual real kind of object-oriented with messaging and stuff, and, like, the whole language kind of springs from that. And it kind of requires you to understand all of these concepts very deeply to be effective in large programs. So, what I find is, when I approach Ruby codebase, I have to load all this crap into my head and remember, “Okay, so yeah, there's this convention, when you do this kind of thing in Ruby”—or especially Ruby on Rails is even worse because they go deep into convention over configuration. But what that's code for is, this code is accessible to people who have a lot of free cognitive capacity to load all this convention into their heads and keep it in their heads so that the code looks pretty, right?And so, that's the trade-off as you said, okay, my developers have to be these people with all these spare brain cycles to understand, like, why I would put the code here in this place versus this place? And all these, like, things that are in the code, like, very compact, dense concepts. And then you go to something like Go, which is, like, “Nah, we're not going to do Lambdas. Nah”—[laugh]—“We're not doing all this fancy stuff.” So, everything is there on the page.This drives some people crazy, right, is that there's all this boilerplate, boilerplate, boilerplate. But the reality is, I can read most Go files from top to the bottom and understand what the hell it's doing, whereas I can go sometimes look at, like, a Ruby thing, or sometimes Python and e—Perl is just [unintelligible 00:32:19] all the time, right, it's there's so much indirection. And it just be, like, “What the [BLEEP] is going on? This is so dense. 
I'm going to have to sit down and write it out in longhand so I can understand what the developer was even doing here.” And—Corey: Well, that's why I got the Mac Studio; for when I'm not doing A/V stuff with it, that means that I'll have one core that I can use for, you know, front-end processing and the rest, and the other 19 cores can be put to work failing to build Nokogiri in Ruby yet again.Amy: [laugh].Corey: I remember the travails of working with Ruby, and the problem—I have similar problems with Python, specifically in that—I don't know if I'm special like this—it feels like it's a SRE DevOps style of working, but I am grabbing random crap off a GitHub constantly and running it, like, small scripts other people have built. And let's be clear, I run them on my test AWS account that has nothing important because I'm not a fool that I read most of it before I run it, but I also—it wants a different version of Python every single time. It wants a whole bunch of other things, too. And okay, so I use ASDF as my version manager for these things, which for whatever reason, does not work for the way that I think about this ergonomically. Okay, great.And I wind up with detritus scattered throughout my system. It's, “Hey, can you make this reproducible on my machine?” “Almost certainly not, but thank you for asking.” It's like ‘Step 17: Master the Wolf' level of instructions.Amy: And I think Docker generally… papers over the worst of it, right, is when we built all this stuff in the aughts, you know, [CPAN 00:33:45]—Corey: Dev containers and VS Code are very nice.Amy: Yeah, yeah. You know, like, we had CPAN back in the day, I was doing chroots, I think in, like, '04 or '05, you know, to solve this problem, right, which is basically I just—screw it; I will compile an entire distro into a directory with a Perl and all of its dependencies so that I can isolate it from the other things I want to run on this machine and not screw up and not have these interactions. And I think that's kind of what you're talking about is, like, the old model, when we deployed servers, there was one of us sitting there and then we'd log into the server and be like, I'm going to install the Perl. You know, I'll compile it into, like, [/app/perl 558 00:34:21] whatever, and then I'll CPAN all this stuff in, and I'll give it over to the developer, tell them to set their shebang to that and everything just works. And now we're in a mode where it's like, okay, you got to set up a thousand of those. “Okay, well, I'll make a tarball.” [laugh]. But it's still like we had to just—Corey: DevOps, but [unintelligible 00:34:37] dev closer to ops. You're interrelating all the time. Yeah, then Docker comes along, and add dev is, like, “Well, here's the container. Good luck, asshole.” And it feels like it's been cast into your yard to worry about.Amy: Yeah, well, I mean, that's just kind of business, or just—Corey: Yeah. Yeah.Amy: I'm not sure if it's business or capitalism or something like that, but just the idea that, you know, if I can hand off the shitty work to some other poor schlub, why wouldn't I? I mean, that's most folks, right? Like, just be like, “Well”—Corey: Which is fair.Amy: —“I got it working. Like, my part is done, I did what I was supposed to do.” And now there's a lot of folks out there, that's how they work, right? “I hit done. I'm done. I shipped it. Sure. It's an old [unintelligible 00:35:16] Ubuntu. Sure, there's a bunch of shell scripts that rip through things. 
Sure”—you know, like, I've worked on repos where there's hundreds of things that need to be addressed.Corey: And passing to someone else is fine. I'm thrilled to do it. Where I run into problems with it is where people assume that well, my part was the hard part and anything you schlubs do is easy. I don't—Amy: Well, that's the underclass. Yeah. That's—Corey: Forget engineering for a second; I throw things to the people over in the finance group here at The Duckbill Group because those people are wizards at solving for this thing. And it's—Amy: Well, that's how we want to do things.Corey: Yeah, specialization works.Amy: But we have this—it's probably more cultural. I don't want to pick, like, capitalism to beat on because this is really, like, human cultural thing, and it's not even really particularly Western. Is the idea that, like, “If I have an underclass, why would I give a shit what their experience is?” And this is why I say, like, ops teams, like, get out of here because most ops teams, the extant ops teams are still called ops, and a lot of them have been renamed SRE—but they still do the same job—are an underclass. And I don't mean that those people are below us. People are treated as an underclass, and they shouldn't be. Absolutely not.Corey: Yes.Amy: Because the idea is that, like, well, I'm a fancy person who writes code at my ivory tower, and then it all flows down, and those people, just faceless people, do the deployment stuff that's beneath me. That attitude is the most toxic thing, I think, in tech orgs to address. Like, if you're trying to be like, “Well, our liability is bad, we have security problems, people won't fix their code.” And go look around and you will find people that are treated as an underclass that are given codes thrown over the wall at them and then they just have to toil through and make it work. I've worked on that a number of times in my career.And I think just like saying, underclass, right, or caste system, is what I found is the most effective way to get people actually thinking about what the hell is going on here. Because most people are just, like, “Well, that's just the way things are. It's just how we've always done it. The developers write to code, then give it to the sysadmins. The sysadmins deploy the code. Isn't that how it always works?”Corey: You'd really like to hope, wouldn't you?Amy: [laugh]. Not me. [laugh].Corey: Again, the way I see it is, in theory—in theory—sysadmins, ops, or that should not exist. People should theoretically be able to write code as developers that just works, the end. And write it correct the first time and never have to change it again. Yeah. There's a reason that I always like to call staging environments in places I work ‘theory' because it works in theory, but not in production, and that is fundamentally the—like, that entire job role is the difference between theory and practice.Amy: Yeah, yeah. Well, I think that's the problem with it. We're already so disconnected from the physical world, right? Like, you and I right now are talking over multiple strands of glass and digital transcodings and things right now, right? Like, we are detached from the physical reality.You mentioned earlier working in data centers, right? The thing I miss about it is, like, the physicality of it. Like, actually, like, I held a server in my arms and put it in the rack and slid it into the rails. I plugged into power myself; I pushed the power button myself. There's a server there. 
I physically touched it.Developers who don't work in production, we talked about empathy and stuff, but really, I think the big problem is when they work out in their idea space and just writing code, they write the unit tests, if we're very lucky, they'll write a functional test, and then they hand that wad off to some poor ops group. They're detached from the reality of operations. It's not even about accountability; it's about experience. The ability to see all of the weird crap we deal with, right? You know, like, “Well, we pushed the code to that server, but there were three bit flips, so we had to do it again. And then the other server, the disk failed. And on the other server…” You know? [laugh].It's just, there's all this weird crap that happens, these systems are so complex that they're always doing something weird. And if you're a developer that just spends all day in your IDE, you don't get to see that. And I can't really be mad at those folks, as individuals, for not understanding our world. I figure out how to help them, and the best thing we've come up with so far is, like, well, we start giving this—some responsibility in a production environment so that they can learn that. People do that, again, is another one that can be done wrong, where it turns into kind of a forced empathy.I actually really hate that mode, where it's like, “We're forcing all the developers online whether they like it or not. On-call whether they like it or not because they have to learn this.” And it's like, you know, maybe slow your roll a little buddy because the stuff is actually hard to learn. Again, minimizing how hard ops work is. “Oh, we'll just put the developers on it. They'll figure it out, right? They're software engineers. They're probably smarter than you sysadmins.” Is the unstated thing when we do that, right? When we throw them in the pit and be like, “Yeah, they'll get it.” [laugh].Corey: And that was my problem [unintelligible 00:39:49] the interview stuff. It was in the write code on a whiteboard. It's, “Look, I understood how the system fundamentally worked under the hood.” Being able to power my way through to get to an outcome even in language I don't know, was sort of part and parcel of the job. But this idea of doing it in artificially constrained environment, in a language I'm not super familiar with, off the top of my head, it took me years to get to a point of being able to do it with a Bash script because who ever starts with an empty editor and starts getting to work in a lot of these scenarios? Especially in an ops role where we're not building something from scratch.Amy: That's the interesting thing, right? In the majority of tech work today—maybe 20 years ago, we did it more because we were literally building the internet we have today. But today, most of the engineers out there working—most of us working stiffs—are working on stuff that already exists. We're making small incremental changes, which is great that's what we're doing. And we're dealing with old code.Corey: We're gluing APIs together, and that's fine. Ugh. I really want to thank you for taking so much time to talk to me about how you see all these things. If people want to learn more about what you're up to, where's the best place to find you?Amy: I'm on Twitter every once in a while as @MissAmyTobey, M-I-S-S-A-M-Y-T-O-B-E-Y. I have a blog I don't write on enough. And there's a couple things on the Equinix Metal blog that I've written, so if you're looking for that. 
Otherwise, mainly Twitter.Corey: And those links will of course be in the [show notes 00:41:08]. Thank you so much for your time. I appreciate it.Amy: I had fun. Thank you.Corey: As did I. Amy Tobey, Senior Principal Engineer at Equinix. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, or on the YouTubes, smash the like and subscribe buttons, as the kids say. Whereas if you've hated this episode, same thing, five-star review all the platforms, smash the buttons, but also include an angry comment telling me that you're about to wind up subpoenaing a copy of my shell script because you're convinced that your intellectual property and secrets are buried within.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.Announcer: This has been a HumblePod production. Stay humble.

The Data Center Podcast
Uptime with DCK: Can't kill the Metal

The Data Center Podcast

Play Episode Listen Later May 9, 2022 28:23


In the latest episode of Uptime with Data Center Knowledge, we look at the evolution of bare metal servers. To find out more about the subject, we chat to brothers Jacob and Zachary Smith – co-founders of Packet, a bare metal hosting service that was acquired by data center giant Equinix in 2020, in a deal worth $335 million. Packet became the foundation of the new Equinix Metal business, led by Zac as its managing director, and Jacob as the VP of bare metal strategy and marketing. Correction: Soon after we recorded this episode, Zac was promoted to head of edge infrastructure services at Equinix, and Jacob to interim lead of the digital services go-to-market. According to the Smiths, the key attractions of bare metal are speed and performance: an Equinix Metal server can be set up in any supported facility in as little as 15 minutes, to run almost any workload on dedicated, physical servers. The process is considerably different from handling servers used to run public cloud applications, where the hardware is often shared between multiple users. Jacob himself jokes that “no one really cares about servers” – but there are plenty of applications that benefit from bare metal, especially in organizations that value automation and are heavily invested in custom software stacks. For such customers, bare metal represents choice – a dedicated server is a blank canvas, unburdened by the multiple layers of complex software that enable typical cloud workloads. The customer alone will decide what the machine will do, and how it will do it. We also discuss:
• Open Source software development at Equinix
• Why Equinix Metal doesn't manage Kubernetes
• How to improve sustainability at the server level
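That 15-minute figure rests on API-driven provisioning: you describe the machine you want and Equinix Metal builds it. As a rough, hypothetical sketch of what such a device request might look like in Go — the project ID and token wiring, metro code, plan, and OS slug below are placeholder values, so check the current Equinix Metal API documentation rather than trusting them:

```go
// provision.go: a minimal sketch of asking the Equinix Metal API for a
// bare-metal server. Field values are illustrative placeholders only.
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"os"
)

func main() {
	projectID := os.Getenv("METAL_PROJECT_ID") // assumed env wiring
	token := os.Getenv("METAL_AUTH_TOKEN")     // assumed env wiring

	// Describe the machine: where it goes, what hardware, what OS.
	body, err := json.Marshal(map[string]string{
		"hostname":         "uptime-demo-01",
		"metro":            "da",           // example metro code
		"plan":             "c3.small.x86", // example server plan
		"operating_system": "ubuntu_20_04", // example OS slug
	})
	if err != nil {
		panic(err)
	}

	url := fmt.Sprintf("https://api.equinix.com/metal/v1/projects/%s/devices", projectID)
	req, err := http.NewRequest(http.MethodPost, url, bytes.NewReader(body))
	if err != nil {
		panic(err)
	}
	req.Header.Set("X-Auth-Token", token)
	req.Header.Set("Content-Type", "application/json")

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	// A 201 response means the server is being built; a real client would
	// then poll the device until it reports an active state.
	fmt.Println("status:", resp.Status)
}
```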

XenTegra - Nutanix Weekly
Nutanix Weekly: Nutanix Cloud Platform on Equinix Metal: Combine the power of the Nutanix cloud stack with the bare-metal performance and global reach of Equinix

XenTegra - Nutanix Weekly

Play Episode Listen Later Mar 18, 2022 30:00 Transcription Available


Fast and flexible access to IT infrastructure has become a critical success factor for modern business. Nutanix and Equinix have joined up to help enterprise IT teams reduce time to market for new applications and services. Equinix Metal™ enables as-a-service infrastructure deployment globally and rapid scalability for hyperconverged infrastructure using the Nutanix® Cloud Platform, letting you shift CapEx to OpEx while reducing hardware spending on key workloads and enhancing your ability to reach the users, partners, and clouds that matter most to your business.
Host: Harvey Green
Co-host: Jirah Cox

Ship It! DevOps, Infra, Cloud Native
Find the infrastructure advantage

Ship It! DevOps, Infra, Cloud Native

Play Episode Listen Later Nov 24, 2021 66:07 Transcription Available


Zac Smith, managing director of Equinix Metal, is sharing how Equinix Metal runs the best hardware and networking in the industry, why pairing magical software with the right hardware is the future, and what Open19 means for sustainability in the data centre. Think modular components that slot in (including CPUs), liquid cooling that converts heat into energy, and a few other solutions that minimise the impact on the environment. But first, Zac tells us about the transition from Packet to Equinix Metal, his reasons for doing what he does, as well as the things that he is really passionate about, such as the most efficient data centres in the world and building for the love of it. This is a great follow-up to episode 18 because it goes deeper into the reasons that make Gerhard excited about the work that Equinix Metal is doing. This conversation with Zac puts it all into perspective. By the way, did you know that Equinix stands for Equality in the Internet Exchange?

Founders Talk
Building on global bare metal

Founders Talk

Play Episode Listen Later Nov 24, 2021 92:39 Transcription Available


This week Adam is joined by Zac Smith, Co-Founder of Packet and now running Equinix Metal. They talk about the early days of the internet infrastructure space, the beginnings of Packet, the “why” of bare metal, transitioning Packet from startup to global company overnight when they were acquired by Equinix, and how all this for Zac is 20 years in the making.

Changelog Master Feed
Find the infrastructure advantage (Ship It! #29)

Changelog Master Feed

Play Episode Listen Later Nov 24, 2021 66:07 Transcription Available


Zac Smith, managing director of Equinix Metal, is sharing how Equinix Metal runs the best hardware and networking in the industry, why pairing magical software with the right hardware is the future, and what Open19 means for sustainability in the data centre. Think modular components that slot in (including CPUs), liquid cooling that converts heat into energy, and a few other solutions that minimise the impact on the environment. But first, Zac tells us about the transition from Packet to Equinix Metal, his reasons for doing what he does, as well as the things that he is really passionate about, such as the most efficient data centres in the world and building for the love of it. This is a great follow-up to episode 18 because it goes deeper into the reasons that make Gerhard excited about the work that Equinix Metal is doing. This conversation with Zac puts it all into perspective. By the way, did you know that Equinix stands for Equality in the Internet Exchange?

Changelog Master Feed
Building on global bare metal (Founders Talk #84)

Changelog Master Feed

Play Episode Listen Later Nov 24, 2021 92:39 Transcription Available


This week Adam is joined by Zac Smith, Co-Founder of Packet and now running Equinix Metal. They talk about the early days of the internet infrastructure space, the beginnings of Packet, the “why” of bare metal, transitioning Packet from startup to global company overnight when they were acquired by Equinix, and how all this for Zac is 20 years in the making.

Ship It! DevOps, Infra, Cloud Native
Bare metal meets Kubernetes

Ship It! DevOps, Infra, Cloud Native

Play Episode Listen Later Sep 9, 2021 67:38 Transcription Available


In this episode, Gerhard talks to David and Marques from Equinix Metal about the importance of bare metal for steady workloads. Terraform, Kubernetes and Tinkerbell come up, as does Crossplane - this conversation is a partial follow-up to episode 15. David Flanagan, a.k.a. Rawkode, needs no introduction. Some of you may remember Marques Johansson from The new changelog.com setup for 2019. Marques was behind the Linode Terraforming that we used at the time, and our infrastructure was simpler because of it! This is not just a great conversation about bare metal and Kubernetes; there is also a Rawkode Live follow-up: Live Debugging Changelog's Production Kubernetes

Changelog Master Feed
Bare metal meets Kubernetes (Ship It! #18)

Changelog Master Feed

Play Episode Listen Later Sep 9, 2021 67:38 Transcription Available


In this episode, Gerhard talks to David and Marques from Equinix Metal about the importance of bare metal for steady workloads. Terraform, Kubernetes and Tinkerbell come up, as does Crossplane - this conversation is a partial follow-up to episode 15. David Flanagan, a.k.a. Rawkode, needs no introduction. Some of you may remember Marques Johansson from The new changelog.com setup for 2019. Marques was behind the Linode Terraforming that we used at the time, and our infrastructure was simpler because of it! This is not just a great conversation about bare metal and Kubernetes; there is also a Rawkode Live follow-up: Live Debugging Changelog's Production Kubernetes

Ship It! DevOps, Infra, Cloud Native
Assemble all your infrastructure

Ship It! DevOps, Infra, Cloud Native

Play Episode Listen Later Aug 18, 2021 60:56 Transcription Available


In this episode, Gerhard follows up on The Changelog #375, which is the last time that he talked Crossplane with Dan and Jared. Many things have changed since then, such as abstractions and compositions, as well as using Crossplane to build platforms, which were mostly ideas at the time. Fast forward 18 months, 2k changes, and a major version, and Crossplane is now an easy choice - some would say the best choice - for platform teams to declare what infrastructure means to them. You can now use Crossplane to define your infrastructure abstractions across multiple vendors, including AWS, GCP & Equinix Metal. The crazy ideas from 2019 are now bold and within reach. Gerhard also has an idea for the changelog.com 2022 setup. Listen to what Jared & Dan think, and then let us know your thoughts too.

Changelog Master Feed
Assemble all your infrastructure (Ship It! #15)

Changelog Master Feed

Play Episode Listen Later Aug 18, 2021 60:56 Transcription Available


In this episode, Gerhard follows up on The Changelog #375, which is the last time that he talked Crossplane with Dan and Jared. Many things have changed since then, such as abstractions and compositions, as well as using Crossplane to build platforms, which were mostly ideas at the time. Fast forward 18 months, 2k changes, and a major version, and Crossplane is now an easy choice - some would say the best choice - for platform teams to declare what infrastructure means to them. You can now use Crossplane to define your infrastructure abstractions across multiple vendors, including AWS, GCP & Equinix Metal. The crazy ideas from 2019 are now bold and within reach. Gerhard also has an idea for the changelog.com 2022 setup. Listen to what Jared & Dan think, and then let us know your thoughts too.

Software Daily
Equinix Metal with Nicole Hubbard

Software Daily

Play Episode Listen Later Apr 2, 2021


A major change in the software industry is the expectation of automation. The infrastructure for deploying code, hosting it, and monitoring it is now being viewed as a fully automatable substrate. Equinix Metal has taken the bare metal servers that you would see in data centers and fitted them with supreme automation and repeatability. This

Software Daily
Equinix Infrastructure with Tim Banks

Software Daily

Play Episode Listen Later Mar 17, 2021


Software-Defined Networking describes a category of technologies that separate the networking control plane from the forwarding plane. This enables more automated provisioning and policy-based management of network resources. Implementing software-defined networking is often the task of Site Reliability Engineers, or SREs. Site reliability engineers work at the intersection of development and operations by bringing software development practices to system administration. Equinix manages co-location data centers and provides networking, security, and cloud-related services to their clients. Equinix is leveraging its status as a market leader in on-prem networking capabilities to expand into cloud and IaaS offerings such as Equinix Metal, which has been referred to as “bare-metal-as-a-service,” and offers integrations with 3rd party cloud technologies with a goal of creating a seamless alternative to modern public clouds for organizations seeking the benefits of colocation.Tim Banks is a Principal Solutions Architect at Equinix and he joins the show to talk about what Equinix offers and how it differs from other cloud providers.

The Pure Report
Pure Storage on Equinix Metal

The Pure Report

Play Episode Listen Later Mar 8, 2021 36:52


Analysts are starting to recognize the value to enterprises of hosted bare metal environments that deliver a cloud-like experience and speed time to market. Hear from Jack Hogan, Pure VP of Technology Strategy, about a new partnership offering with Equinix that delivers bare-metal-as-a-service capabilities and solves challenges with cloud cost overruns, legacy technical debt, and skillset gaps. The new Pure Storage on Equinix Metal offering provides high-performance, on-demand, single-tenant environments with full control over provisioning and administration. Whether you have legacy on-prem applications, high-volume data-driven workloads, or are ramping cloud-native and next-gen apps, the new BMaaS offering delivers a single platform for your cloud journey. For more information: http://www.purestorage.com/baremetal and http://www.purestorage.com/equinix

The Digital Digest
The Digital Digest: The GLF's IoT code; the story of FiBrazil; Equinix heads to Bordeaux; and Turbidite's plans for Asia, plus a look at what's to come for IWD

The Digital Digest

Play Episode Listen Later Mar 4, 2021 40:25


In this episode of the Digital Digest, we round up the biggest stories of the week, from the residential network that carried 400Gbps to the sandwich-sized satellites tracking whales. First up, the ITW Global Leaders' Forum has published a Code of Conduct, defining a framework among global carriers providing IPX-based traffic, to ensure quality of service for critical IoT applications. Next, Natalie brings us the story behind FiBrazil, Brazil's new wholesale network provider, and India's new payment network built by Reliance, Google and Facebook. Meanwhile, in the UK, Virgin Media is testing technology that can deliver 400Gbps on a residential fibre network, and in Egypt, an almost heart-shaped subsea project is connecting Africa in new ways. In the world of data centres, Abigail shares the details on Cologix's new Ohio data centre, while NetActuate has finalised the upgrade of its Johannesburg hub and Equinix Metal has revealed its global expansion plans. And in other news, Digital Realty has snapped up the IP and engineering muscle of Pureport. Alan explains why Bill Barney set up Turbidite and why Digital Colony paid $854 million for Boingo Wireless. Meanwhile, Swarm's IoT in space got 12 new satellites this week which, technically, could be used to track whales, and according to SpaceX and Samsung, Austin, Texas, is the new place to be. We also look ahead to Digital Infra India, taking place online 9-10 March, and Natalie brings us a preview of what's coming up for International Women's Day, including a special feature looking at why equity needs to be part of the equality conversation. Season 2, episode 8 is presented by deputy editor Melanie Mingas, and features editor-at-large Alan Burkitt-Gray and senior reporters Abigail Opiah and Natalie Bannerman.

Heavybit Podcast Network: Master Feed
Ep. #32, Managing Hardware with Gianluca Arbezzano of Equinix Metal

Heavybit Podcast Network: Master Feed

Play Episode Listen Later Jan 14, 2021 33:00


In episode 32 of o11ycast, Liz and Shelby speak with Gianluca Arbezzano of Equinix Metal. They discuss diversifying the way observability is evangelized, the indicators that an organization has outgrown off-the-shelf tools, and managing hardware.

O11ycast
Ep. #32, Managing Hardware with Gianluca Arbezzano of Equinix Metal

O11ycast

Play Episode Listen Later Jan 14, 2021 33:00


In episode 32 of o11ycast, Liz and Shelby speak with Gianluca Arbezzano of Equinix Metal. They discuss diversifying the way observability is evangelized, the indicators that an organization has outgrown off-the-shelf tools, and managing hardware.