POPULARITY
AWS Morning Brief for the week of November 6, 2023, with Corey Quinn. Links Amazon Athena announces one hour reservations for Provisioned Capacity AWS Neuron adds support for Llama-2 70b model and PyTorch 2.0 Filter and Stream Logs from Amazon S3 Logging Buckets into Splunk Using AWS Lambda Announcing Amazon EC2 Capacity Blocks for ML to reserve GPU capacity for your machine learning workloads GoDaddy benchmarking results in up to 24% better price-performance for their Spark workloads with AWS Graviton2 on Amazon EMR Serverless How contact center leaders can evaluate using generative AI for customer experience Detect and fix low cardinality indexes in Amazon DocumentDB How power utilities analyze and detect harmonics issues using power quality and customer usage data with Amazon Timestream: Part 2 Techniques to improve the state-of-the-art in Cloud FinOps using Amazon Neptune Mark your calendars for AWS Mainframe Modernization sessions @ re: Invent 2023 Ready for Flight: Announcing Finch 1.0 GA! Aggregating, searching, and visualizing log data from distributed sources with Amazon Athena and Amazon QuickSight How to monitor Amazon Elastic File System (EFS) storage costs Fullstack generative AI sample app for Amazon Bedrock
Victoria is joined by guest co-host Joe Ferris, CTO at thoughtbot, and Seif Lotfy, the CTO and Co-Founder of Axiom. Seif discusses the journey, challenges, and strategies behind his data analytics and observability platform. Seif, who has a background in robotics and was a 2008 Sony AIBO robotic soccer world champion, shares that Axiom pivoted from being a Datadog competitor to focusing on logs and event data. The company even built its own logs database to provide a cost-effective solution for large-scale analytics. Seif is driven by his passion for his team and the invaluable feedback from the community, emphasizing that sales validate the effectiveness of a product. The conversation also delves into Axiom's shift in focus towards developers to address their need for better and more affordable observability tools. On the business front, Seif reveals the company's challenges in scaling across multiple domains without compromising its core offerings. He discusses the importance of internal values like moving with urgency and high velocity to guide the company's future. Furthermore, he touches on the challenges and strategies of open-sourcing projects and advises avoiding platforms like Reddit and Hacker News to maintain focus. Axiom (https://axiom.co/) Follow Axiom on LinkedIn (https://www.linkedin.com/company/axiomhq/), X (https://twitter.com/AxiomFM), GitHub (https://github.com/axiomhq), or Discord (https://discord.com/invite/axiom-co). Follow Seif Lotfy on LinkedIn (https://www.linkedin.com/in/seiflotfy/) or X (https://twitter.com/seiflotfy). Visit his website at seif.codes (https://seif.codes/). Follow thoughtbot on X (https://twitter.com/thoughtbot) or LinkedIn (https://www.linkedin.com/company/150727/). Become a Sponsor (https://thoughtbot.com/sponsorship) of Giant Robots! Transcript: VICTORIA: This is the Giant Robots Smashing Into Other Giant Robots Podcast, where we explore the design, development, and business of great products. I'm your host, Victoria Guido, and with me today is Seif Lotfy, CTO and Co-Founder of Axiom, the best home for your event data. Seif, thank you for joining me. SEIF: Hey, everybody. Thanks for having me. This is awesome. I love the name of the podcast, given that I used to compete in robotics. VICTORIA: What? All right, we're going to have to talk about that. And I also want to introduce a guest co-host today. Since we're talking about cloud, and observability, and data, I invited Joe Ferris, thoughtbot CTO and Director of Development of our platform engineering team, Mission Control. Welcome, Joe. How are you? JOE: Good, thanks. Good to be back again. VICTORIA: Okay. I am excited to talk to you all about observability. But I need to go back to Seif's comment on competing with robots. Can you tell me a little bit more about what robots you've built in the past? SEIF: I didn't build robots; I used to program them. Remember the Sony AIBOs, where Sony made these dog robots? And we would make them compete. There was an international competition where we made them play soccer, and they had to be completely autonomous. They only communicate via Bluetooth or via wireless protocols. And you only have the camera as your sensor as well as...a chest sensor throws the ball near you, and then yeah, you make them play football against each other, four versus four with a goalkeeper and everything. Just look it up: RoboCup AIBO. Look it up on YouTube. And I...2008 world champion with the German team. VICTORIA: That sounds incredible. What kind of crowds are you drawing out for a robot soccer match? Is that a lot of people involved with that? SEIF: You would be surprised how big the RoboCup competition is. It's ridiculous. VICTORIA: I want to go. I'm ready. I want to, like, I'll look it up and find out when the next one is. SEIF: No more Sony robots but other robots. Now, there's two-legged robots. So, they make them play as two-legged robots, much slower than four-legged robots, but works. VICTORIA: Wait. So, the robots you were playing soccer with had four legs they were running around on? SEIF: Yeah, they were dogs [laughter]. VICTORIA: That's awesome. SEIF: We all get the same robot. It's just a competition on software, right? On a software level. And some other competitions within the RoboCup actually use...you build your own robot and stuff like that. But this one was...it's called the Standard League, where we all have a robot, and we have to program it. JOE: And the standard robot was a dog. SEIF: Yeah, I think back then...we're talking...it's been a long time. I think it started in 2001 or something. I think the competition started in 2001 or 2002. And I compete from 2006 to 2008. Robots back then were just, you know, simple. VICTORIA: Robots today are way too complicated [laughs]. SEIF: Even AI is more complicated. VICTORIA: That's right. Yeah, everything has gotten a lot more complicated [laughs]. I'm so curious how you went from being a world-champion robot dog soccer player [laughs] programmer [laughs] to where you are today with Axiom. Can you tell me a little bit more about your journey? SEIF: The journey is interesting because it came from open source. I used to do open source on the side a lot–part of the GNOME Project. That's where I met Neil and the rest of my team, Mikkel Kamstrup, the whole crowd, basically. We worked on GNOME. We worked on Ubuntu. Like, most of them were working professionally on it. I was working for another company, but we worked on the same project. We ended up at Xamarin, which was bought by Microsoft. And then we ended up doing Axiom. But we've been around each other professionally since 2009, most of us. It's like a little family. But how we ended up exactly in observability, I think it's just trying to fix pain points in my life. VICTORIA: Yeah, I was reading through the docs on Axiom. And there's an interesting point you make about organizations having to choose between how much data they have and how much they want to spend on it. So, maybe you can tell me a little bit more about that pain point and what you really found in the early stages that you wanted to solve. SEIF: So, the early stages of what we wanted to solve we were mainly dealing with...so, the early, early stage, we were actually trying to be a Datadog competitor, where we were going to be self-hosted. Eventually, we focused on logs because we found out that's what was a big problem for most people, just event data, not just metric but generally event data, so logs, traces, et cetera. We built out our own logs database completely from scratch. And one of the things we stumbled upon was; basically, you have three things when it comes to logging, which is low cost, low latency, and large scale. That's what everybody wants. But you can't get all three of them; you can only get two of them. And we opted...like, we chose large scale and low cost. And when it comes to latency, we say it should be just fast enough, right? And that's where we focused on, and this is how we started building it. And with that, this is how we managed to stand out by just having way lower cost than anybody else in the industry and dealing with large scale. VICTORIA: That's really interesting. And how did you approach making the ingestion pipeline for masses amount of data more efficient? SEIF: Just make it coordination-free as possible, right? And get rid of Kafka because Kafka just, you know, drains your...it's where you throw in money. Like maintaining Kafka...it's like back then Elasticsearch, right? Elasticsearch was the biggest part of your infrastructure that would cost money. Now, it's also Kafka. So, we found a way to have our own internal way of queueing things without having to rely on Kafka. As I said, we wrote everything from scratch to make it work. Like, every now and then, I think that we can spin this out of the company and make it a new product. But now, eyes on the prize, right? JOE: It's interesting to hear that somebody who spent so much time in the open-source community ended up rolling their own solution to so many problems. Do you feel like you had some lessons learned from open source that led you to reject solutions like Kafka, or how did that journey go? SEIF: I don't think I'm rejecting Kafka. The problem is how Kafka is built, right? Kafka is still...you have to set up all these servers. They have to communicate, et cetera, etcetera. They didn't build it in a way where it's stateless, and that's what we're trying to go to. We're trying to make things as stateless as possible. So, Kafka was never built for the cloud-native era. And you can't really rely on SQS or something like that because it won't deal with this high throughput. So, that's why I said, like, we will sacrifice some latency, but at least the cost is low. So, if messages show after half a second or a second, I'm good. It doesn't have to be real-time for me. So, I had to write a couple of these things. But also, it doesn't mean that we reject open source. Like, we actually do like open source. We open-source a couple of libraries. We contribute back to open source, right? We needed a solution back then for that problem, and we couldn't find any. And maybe one day, open source will have, right? JOE: Yeah. I was going to ask if you considered open-sourcing any of your high latency, high throughput solutions. SEIF: Not high latency. You make it sound bad. JOE: [laughs] SEIF: You make it sound bad. It's, like, fast enough, right? I'm not going to compete on milliseconds because, also, I'm competing with ClickHouse. I don't want to compete with ClickHouse. ClickHouse is low latency and large scale, right? But then the cost is, you know, off the charts a bit sometimes. I'm going the other route. Like, you know, it's fast enough. Like, how, you know, if it's under two, three seconds, everybody's happy, right? If the results come within two, three seconds, everybody is happy. If you're going to build a real-time trading system on top of it, I'll strongly advise against that. But if you're building, you know, you're looking at dashboards, you're more in the observability field, yeah, we're good. VICTORIA: Yeah, I'm curious what you found, like, which customer personas that market really resonated with. Like, is there a particular, like, industry type where you're noticing they really want to lower their cost, and they're okay with this just fast enough latency? SEIF: Honestly, with the current recession, everybody is okay with giving up some of the speed to reduce the money because I think it's not linear reduction. It's more exponential reduction at this point, right? You give up a second, and you're saving 30%. You give up two seconds, all of a sudden, you're saving 80%. So, I'd say in the beginning, everybody thought they need everything to be very, very fast. And now they're realizing, you know, with limitations you have around your budget and spending, you're like, okay, I'm okay with the speed. And, again, we're not slow. I'm just saying people realize they don't need everything under a second. They're okay with waiting for two seconds. VICTORIA: That totally resonates with me. And I'm curious if you can add maybe a non-technical or a real-life example of, like, how this impacts the operations of a company or organization, like, if you can give us, like, a business-y example of how this impacts how people work. SEIF: I don't know how, like, how do people work on that? Nothing changed, really. They're still doing the, like...really nothing because...and that aspect is you run a query, and, again, as I said, you're not getting the result in a second. You're just waiting two seconds or three seconds, and it's there. So, nothing really changed. I think people can wait three seconds. And we're still like–when I say this, we're still faster than most others. We're just not as fast as people who are trying to compete on a millisecond level. VICTORIA: Yeah, that's okay. Maybe I'll take it back even, like, a step further, right? Like, our audience is really sometimes just founders who almost have no formal technical training or background. So, when we talk about observability, sometimes people who work in DevOps and operations all understand it and kind of know why it's important [laughs] and what we're talking about. So, maybe you could, like, go back to -- SEIF: Oh, if you're asking about new types of people who've been using it -- VICTORIA: Yeah. Like, if you're going to explain to, like, a non-technical founder, like, why your product is important, or, like, how people in their organization might use it, what would you say? SEIF: Oh, okay, if you put it like that. It's more of if you have data, timestamp data, and you want to run analytics on top of it, so that could be transactions, that could be web vitals, rather than count every time somebody visits, you have a timestamp. So, you can count, like, how many visitors visited the website and what, you know, all these kinds of things. That's where you want to use something like Axiom. That's outside the DevOps space, of course. And in DevOps space, there's so many other things you use Axiom for, but that's outside the DevOps space. And we actually...we implemented as zero-config integration with Vercel that kind of went viral. And we were, for a while, the number one enterprise for self-integration because so many people were using it. So, Vercel users are usually not necessarily writing the most complex backends, but a lot of things are happening on the front-end side of things. And we would be giving them dashboards, automated dashboards about, you know, latencies, and how long a request took, and how long the response took, and the content type, and the status codes, et cetera, et cetera. And there's a huge user base around that. VICTORIA: I like that. And it's something, for me, you know, as a managing director of our platform engineering team, I want to talk more to founders about. It's great that you put this product and this app out into the world. But how do you know that people are actually using it? How do you know that people, like, maybe, are they all quitting after the first day and not coming back to your app? Or maybe, like, the page isn't loading or, like, it's not working as they expected it to. And, like, if you don't have anything observing what users are doing in your app, then it's going to be hard to show that you're getting any traction and know where you need to go in and make corrections and adjust. SEIF: We have two ways of doing this. Right now, internally, we use our own tools to see, like, who is sending us data. We have a deployment that's monitoring production deployment. And we're just, you know, seeing how people are using it, how much data they're sending every day, who stopped sending data, who spiked in sending data sets, et cetera. But we're using Mixpanel, and Dominic, our Head of Product, implemented a couple of key metrics to that for that specifically. So, we know, like, what's the average time until somebody starts going from building its own queries with the builder to writing APL, or how long it takes them from, you know, running two queries to five queries. And, you know, we just start measuring these things now. And it's been going...we've been growing healthy around that. So, we tend to measure user interaction, but also, we tend to measure how much data is being sent. Because let's keep in mind, usually, people go in and check for things if there's a problem. So, if there's no problem, the user won't interact with us much unless there's a notification that kicks off. We also just check, like, how much data is being sent to us the whole time. VICTORIA: That makes sense. Like, you can't just rely on, like, well, if it was broken, they would write a [chuckles], like, a question or something. So, how do you get those metrics and that data around their interactions? So, that's really interesting. So, I wonder if we can go back and talk about, you know, we already mentioned a little bit about, like, the early days of Axiom and how you got started. Was there anything that you found in the early discovery process that was surprising and made you pivot strategy? SEIF: A couple of things. Basically, people don't really care about the tech as much as they care [inaudible 12:51] and the packaging, so that's something that we had to learn. And number two, continuous feedback. Continuous feedback changed the way we worked completely, right? And, you know, after that, we had a Slack channel, then we opened a Discord channel. And, like, this continuous feedback coming in just helps with iterating, helps us with prioritizing, et cetera. And that changed the way we actually developed product. VICTORIA: You use Slack and Discord? SEIF: No. No Slack anymore. We had a community Slack. We had a community [inaudible 13:19] Slack. Now, there's no community Slack. We only have a community Discord. And the community Slack is...sorry, internally, we use Slack, but there's a community Discord for the community. JOE: But how do you keep that staffed? Is it, like, everybody is in the Discord during working hours? Is it somebody's job to watch out for community questions? SEIF: I think everybody gets involved now just...and you can see it. If you go on our Discord, you will just see it. Just everyone just gets involved. I think just people are passionate about what they're doing. At least most people are involved on Discord, right? Because there's, like, Discord the help sections, and people are just asking questions and other people answering. And now, we reached a point where people in the community start answering the questions for other people in the community. So, that's how we see it's starting to become a healthy community, et cetera. But that is one of my favorite things: when I see somebody from the community answering somebody else, that's a highlight for me. Actually, we hired somebody from that community because they were so active. JOE: Yeah, I think one of the biggest signs that a product is healthy is when there's a healthy ecosystem building up around it. SEIF: Yeah, and Discord reminds me of the old days of open sources like IRC, just with memes now. But because all of us come from the old IRC days, being on Discord and chatting around, et cetera, et cetera, just gives us this momentum back, gave us this momentum back, whereas Slack always felt a bit too businessy to me. JOE: Slack is like IRC with emoji. Discord is IRC with memes. SEIF: I would say Slack reminds me somehow of MSN Messenger, right? JOE: I feel like there's a huge slam on MSN Messenger here. SEIF: [laughs] What do you guys use internally, Slack or? I think you're using Slack, right? Or Teams. Don't tell me you're using Teams. JOE: No, we're using Slack. SEIF: Okay, good, because I shit talk. Like, there is this, I'll sh*t talk here–when I start talking about Teams, so...I remember that one thing Google did once, and that failed miserably. JOE: Google still has, like, seven active chat products. SEIF: Like, I think every department or every, like, group of engineers just uses one of them internally. I'm not sure. Never got to that point. But hey, who am I to judge? VICTORIA: I just feel like I end up using all of them, and then I'm just rotating between different tabs all day long. You maybe talked me into using Discord. I feel like I've been resisting it, but you got me with the memes. SEIF: Yeah, it's definitely worth it. It's more entertaining. More noise, but more entertaining. You feel it's alive, whereas Slack is...also because there's no, like, history is forever. So, you always go back, and you're like, oh my God, what the hell is this? VICTORIA: Yeah, I have, like, all of them. I'll do anything. SEIF: They should be using Axiom in the background. Just send data to Axiom; we can keep your chat history. VICTORIA: Yeah, maybe. I'm so curious because, you know, you mentioned something about how you realized that it didn't matter really how cool the tech was if the product packaging wasn't also appealing to people. Because you seem really excited about what you've built. So, I'm curious, so just tell us a little bit more about how you went about trying to, like, promote this thing you built. Or was, like, the continuous feedback really early on, or how did that all kind of come together? SEIF: The continuous feedback helped us with performance, but actually getting people to sign up and pay money it started early on. But with Vercel, it kind of skyrocketed, right? And that's mostly because we went with the whole zero-config approach where it's just literally two clicks. And all of a sudden, Vercel is sending your data to Axiom, and that's it. We will create [inaudible 16:33]. And we worked very closely with Vercel to do this, to make this happen, which was awesome. Like, yeah, hats off to them. They were fantastic. And just two clicks, three clicks away, and all of a sudden, we created Axiom organization for you, the data set for you. And then we're sending it...and the data from Vercel is being forwarded to it. I think that packaging was so simple that it made people try it out quickly. And then, the experience of actually using Axiom was sticky, so they continued using it. And then the price was so low because we give 500 gigs for free, right? You send us 500 gigs a month of logs for free, and we don't care. And you can start off here with one terabyte for 25 bucks. So, people just start signing up. Now, before that, it was five terabytes a month for $99, and then we changed the plan. But yeah, it was cheap enough, so people just start sending us more and more and more data eventually. They weren't thinking...we changed the way people start thinking of “what am I going to send to Axiom” or “what am I going to send to my logs provider or log storage?” To how much more can I send? And I think that's what we wanted to reach. We wanted people to think, how much more can I send? JOE: You mentioned latency and cost. I'm curious about...the other big challenge we've seen with observability platforms, including logs, is cardinality of labels. Was there anything you had to sacrifice upfront in terms of cardinality to manage either cost or volume? SEIF: No, not really. Because the way we designed it was that we should be able to deal with high cardinality from scratch, right? I mean, there's open-source ways of doing, like, if you look at how, like, a column store, if you look at a column store and every dimension is its own column, it's just that becomes, like, you can limit on the amount of columns you're creating, but you should never limit on the amount of different values in a column could be. So, if you're having something like stat tags, right? Let's say hosting, like, hostname should be a column, but then the different hostnames you have, we never limit that. So, the cardinality on a value is something that is unlimited for us, and we don't really see it in cost. It doesn't really hit us on cost. It reflects a bit on compression if you get into technical details of that because, you know, high cardinality means a lot of different data. So, compression is harder, but it's not repetitive. But then if you look at, you know, oh, I want to send a lot of different types of fields, not values with fields, so you have hostname, and latency, and whatnot, et cetera, et cetera, yeah, that's where limitation starts because then they have...it's like you're going to a wide range of...and a wider dimension. But even that, we, yeah, we can deal with thousands at this point. And we realize, like, most people will not need more than three or four. It's like a Postgres table. You don't need more than 3,000 to 4000 columns; else, you know, you're doing a lot. JOE: I think it's actually pretty compelling in terms of cost, though. Like, that's one of the things we've had to be most careful about in terms of containing cost for metrics and logs is, a lot of providers will...they'll either charge you based on the number of unique metric combinations or the performance suffers greatly. Like, we've used a lot of Prometheus-based solutions. And so, when we're working with developers, even though they don't need more than, you know, a few dozen metric combinations most of the time, it's hard for people to think of what they need upfront. It's much easier after you deploy it to be able to query your data and slice it retroactively based on what you're seeing. SEIF: That's the detail. When you say we're using Prometheus, a lot of the metrics tools out there are using, just like Prometheus, are using the Gorilla data structure. And the real data structure was never designed to deal with high cardinality labels. So, basically, to put it in a simple way, every combination of tags you send for metrics is its own file on disk. That's, like, the very simple way of explaining this. And then, when you're trying to search through everything, right? And you have a lot of these combinations. I actually have to get all these files from this conversion back together, you know, and then they're chunked, et cetera. So, it's a problem. Generally, how metrics are doing it...most metrics products are using it, even VictoriaMetrics, et cetera. What they're doing is they're using either the Prometheus TSDB data structure, which is based on Gorilla. Influx was doing the same thing. They pivoted to using more and more like the ones we use, and Honeycomb uses, right? So, we might not be as fast on metrics side as these highly optimized. But then when it comes to high [inaudible 20:49], once we start dealing with high cardinality, we will be faster than those solutions. And that's on a very technical level. JOE: That's pretty cool. I realize we're getting pretty technical here. Maybe it's worth defining cardinality for the audience. SEIF: Defining cardinality to the...I mean, we just did that, right? JOE: What do you think, Victoria? Do you know what cardinality is now? [laughs] VICTORIA: All right. Now I'm like, do I know? I was like, I think I know what it means. Cardinality is, like, let's say you have a piece of data like an event or a transaction. SEIF: It's like the distinct count on a property that gives you the cardinality of a property. VICTORIA: Right. It's like how many pieces of information you have about that one event, basically, yeah. JOE: But with some traditional metrics stores, it's easy to make mistakes. For example, you could have unbounded cardinality by including response time as one of the labels -- SEIF: Tags. JOE: And then it's just going to -- SEIF: Oh, no, no. Let me give you a better one. I put in timestamp at some point in my life. JOE: Yeah, I feel like everybody has done that one. [laughter] SEIF: I've put a system timestamp at some point in my life. There was the actual timestamp, and there was a system timestamp that I would put because I wanted to know when the...because I couldn't control the timestamp, and the only timestamp I had was a system timestamp. I would always add the actual timestamp of when that event actually happened into a metric, and yeah, that did not scale. MID-ROLL AD: Are you an entrepreneur or start-up founder looking to gain confidence in the way forward for your idea? At thoughtbot, we know you're tight on time and investment, which is why we've created targeted 1-hour remote workshops to help you develop a concrete plan for your product's next steps. Over four interactive sessions, we work with you on research, product design sprint, critical path, and presentation prep so that you and your team are better equipped with the skills and knowledge for success. Find out how we can help you move the needle at tbot.io/entrepreneurs. VICTORIA: Yeah. I wonder if you could maybe share, like, a story about when it's gone wrong, and you've suddenly charged a lot of money [laughs] just to get information about what's happening in the system. Any, like, personal experiences with observability that kind of informed what you did with Axiom? SEIF: Oof, I have a very bad one, like, a very, very bad one. I used to work for a company. We had to deploy Elasticsearch on Windows Servers, and it was US-East-1. So, just a combination of Elasticsearch back in 2013, 2014 together with Azure and Windows Server was not a good idea. So, you see where this is going, right? JOE: I see where it's going. SEIF: Eventually, we had, like, we get all these problems because we used Elasticsearch and Kibana as our, you know, observability platform to measure everything around the product we were building. And funny enough, it cost us more than actually maintaining the infrastructure of the product. But not just that, it also kept me up longer because most of the downtimes I would get were not because of the product going down. It's because my Elasticsearch cluster started going down, and there's reasons for that. Because back then, Microsoft Azure thought that it's okay for any VM to lose connection with the rest of the VMs for 30 seconds per day. And then, all of a sudden, you have Elasticsearch with a split-brain problem. And there was a phase where I started getting alerted so much that back then, my partner threatened to leave me. So I bought a...what I think was a shock bracelet or a shock collar via Bluetooth, and I connected it to phone for any notification. And I bought that off Alibaba, by the way. And I would charge it at night, put it on my wrist, and go to sleep. And then, when alert happens, it will fully discharge the battery on me every time. JOE: Okay, I have to admit, I did not see where that was going. SEIF: Yeah, did that for a while; definitely did not save my relationship either. But eventually, that was the point where, you know, we started looking into other observability tools like Datadog, et cetera, et cetera, et cetera. And that's where the actual journey began, where we moved away from Elasticsearch and Kibana to look for something, okay, that we don't have to maintain ourselves and we can use, et cetera. So, it's not about the costs as much; it was just pain. VICTORIA: Yeah, pain is a real pain point, actual physical [chuckles] and emotional pain point [laughter]. What, like, motivates you to keep going with Axiom and to keep, like, the wind in your sails to keep working on it? SEIF: There's a couple of things. I love working with my team. So, honestly, I just wake up, and I compliment my team. I just love working with them. They're a lot of fun to work with. And they challenge me, and I challenge them back. And I upset them a lot. And they can't upset me, but I upset them. But I love working with them, and I love working with that team. And the other thing is getting, like, having this constant feedback from customers just makes you want to do more and, you know, close sales, et cetera. It's interesting, like, how I'm a very technical person, and I'm more interested in sales because sales means your product works, the product, the technical parts, et cetera. Because if technically it's not working, you can't build a product on top of it. And if you're not selling it, then what's the point? You only sell when the product is good, more or less, unless you're Oracle. VICTORIA: I had someone ask me about Oracle recently, actually. They're like, "Are you considering going back to it?" And I'm maybe a little allergic to it from having a federal consulting background [laughs]. But maybe they'll come back around. I don't know. We'll see. SEIF: Did you sell your soul back then? VICTORIA: You know, I feel like I just grew up in a place where that's what everyone did was all. SEIF: It was Oracle, IBM, or HP back in the day. VICTORIA: Yeah. Well, basically, when you're working on applications that were built in, like, the '80s, Oracle was, like, this hot, new database technology [laughs] that they just got five years ago. So, that's just, yeah, interesting. SEIF: Although, from a database perspective, they did a lot of the innovations. A lot of first innovations could have come from Oracle. From a technical perspective, they're ridiculous. I'm not sure from a product perspective how good they are. But I know their sales team is so big, so huge. They don't care about the product anymore. They can still sell. VICTORIA: I think, you know, everything in tech is cyclical. So, you know, if they have the right strategy and they're making some interesting changes over there, there's always a chance [laughs]. Certain use cases, I mean, I think that's the interesting point about working in technology is that you know, every company is a tech company. And so, there's just a lot of different types of people, personas, and use cases for different types of products. So, I wonder, you know, you kind of mentioned earlier that, like, everyone is interested in Axiom. But, you know, I don't know, are you narrowing the market? Or, like, how are you trying to kind of focus your messaging and your sales for Axiom? SEIF: I'm trying to focus on developers. So, we're really trying to focus on developers because the experience around observability is crap. It's stupid expensive. Sorry for being straightforward, right? And that's what we're trying to change. And we're targeting developers mainly. We want developers to like us. And we'll find all these different types of developers who are using it, and that's the interesting thing. And because of them, we start adding more and more features, like, you know, we added tracing, and now that enables, like, billions of events pushed through for, you know, again, for almost no money, again, $25 a month for a terabyte of data. And we're doing this with metrics next. And that's just to address the developers who have been giving us feedback and the market demand. I will sum it up, again, like, the experience is crap, and it's stupid expensive. I think that's the [inaudible 28:07] of observability is just that's how I would sum it up. VICTORIA: If you could go back in time and talk to yourself when you were still a developer, now that you're CTO, what advice would you give yourself? JOE: Besides avoiding shock collars. VICTORIA: [laughs] Yes. SEIF: Get people's feedback quickly so you know you're on the right track. I think that's very, very, very, very important. Don't just work in the dark, or don't go too long into stealth mode because, eventually, people catch up. Also, ship when you're 80% ready because 100% is too late. I think it's the same thing here. JOE: Ship often and early. SEIF: Yeah, even if it's not fully ready, it's still feedback. VICTORIA: Ship often and early and talk to people [laughs]. Just, do you feel like, as a developer, did you have the skills you needed to be able to get the most out of those feedback and out of those conversations you were having with people around your product? SEIF: I still don't think I'm good enough. You're just constantly learning, right? I just accepted I'm part of a team, and I have my contributions. But as an individual, I still don't think I know enough. I think there's more I need to learn at this point. VICTORIA: I wonder, what questions do you have for me or Joe? SEIF: How did you start your podcast, and why the name? VICTORIA: Oh, man, I hope I can answer. So, the podcast was started...I think it's, like, we're actually about to be at our 500th Episode. So, I've only been a host for the last year. Maybe Joe even knows more than I do. But what I recall is that one person at thoughtbot thought it would be a great idea to start a podcast, and then they did it. And it seems like the whole company is obsessed with robots. I'm not really sure where that came from. There used to be a tiny robot in the office, is what I remember. And people started using that as, like, the mascot. And then, yeah, that's it, that's the whole thing. SEIF: Was the robot doing anything useful or just being cute? JOE: It was just cute, and it's hard to make a robot cute. SEIF: Was it a real robot, or was it like a -- JOE: No, there was, at one point, a toy robot. The name...I actually forget the origin–origin of the name, but the name Giant Robots comes from our blog. So, we named the podcast the same as the blog: Giant Robots Smashing Into Other Giant Robots. SEIF: Yes, it's called transformers. VICTORIA: Yeah, I like it. It's, I mean, now I feel like -- SEIF: [laughs] VICTORIA: We got to get more, like, robot dogs involved [laughs] in the podcast. SEIF: Like, I wanted to add one thing when we talked about, you know, what gets me going. And I want to mention that I have a six-month-old son now. He definitely adds a lot of motivation for me to wake up in the morning and work. But he also makes me wake up regardless if I want to or not. VICTORIA: Yeah, you said you had invented an alarm clock that never turns off. Never snoozes [laughs]. SEIF: Yes, absolutely. VICTORIA: I have the same thing, but it's my dog. But he does snooze, actually. He'll just, like, get tired and go back to sleep [laughs]. SEIF: Oh, I have a question. Do dogs have a Tamagotchi phase? Because, like, my son, the first three months was like a Tamagotchi. It was easy to read him. VICTORIA: Oh yeah, uh-huh. SEIF: Noisy but easy. VICTORIA: Yes, yes. SEIF: Now, it's just like, yeah, I don't know, like, the last month he has opinions at six months. I think it's because I raised him in Europe. I should take him back to the Middle East [laughs]. No opinions. VICTORIA: No, dogs totally have, like, a communication style, you know, I pretty much know what he, I mean, I can read his mind, obviously [laughs]. SEIF: Sure, but that's when they grow a bit. But what when they were very...when the dog was very young? VICTORIA: Yeah, they, I mean, they also learn, like, your stuff, too. So, they, like, learn how to get you to do stuff or, like, I know she'll feed me if I'm sitting here [laughs]. SEIF: And how much is one dog year, seven years? VICTORIA: Seven years. SEIF: Seven years? VICTORIA: Yeah, seven years? SEIF: Yeah. So, basically, in one year, like, three months, he's already...in one month, he's, you know, seven months old. He's like, yeah. VICTORIA: Yeah. In a year, they're, like, teenagers. And then, in two years, they're, like, full adults. SEIF: Yeah. So, the first month is basically going through the first six months of a human being. So yeah, you pass...the first two days or three days are the Tamagotchi phase that I'm talking about. VICTORIA: [chuckles] I read this book, and it was, like, to understand dogs, it's like, they're just like humans that are trying to, like, maximize the number of positive experiences that they have. So, like, if you think about that framing around all your interactions about, like, maybe you're trying to get your son to do something, you can be like, okay, how do I, like, I don't know, train him that good things happen when he does the things I want him to do? [laughs] That's kind of maybe manipulative but effective. So, you're not learning baby sign language? You're just, like, going off facial expressions? SEIF: I started. I know how Mama looks like. I know how Dada looks like. I know how more looks like, slowly. And he already does this thing that I know that when he's uncomfortable, he starts opening and closing his hands. And when he's completely uncomfortable and basically that he needs to go sleep, he starts pulling his own hair. VICTORIA: [laughs] I do the same thing [laughs]. SEIF: You pull your own hair when you go to sleep? I don't have that. I don't have hair. VICTORIA: I think I do start, like, touching my head though, yeah [inaudible 33:04]. SEIF: Azure took the last bit of hair I had! Went away with Azure, Elasticsearch, and the shock collar. VICTORIA: [laughs] SEIF: I have none of them left. Absolutely nothing. I should sue Elasticsearch for this shit. VICTORIA: [laughs] Let me know how that goes. Maybe there's more people who could join your lawsuit, you know, with a class action. SEIF: [laughs] Yeah. Well, one thing I wanted to also just highlight is, right now, one of the things that also makes the company move forward is we realized that in a single domain, we proved ourselves very valuable to specific companies, right? So, that was a big, big thing, milestone for us. And now we're trying to move into a handful of domains and see which one of those work out the best for us. Does that make sense? VICTORIA: Yeah. And I'm curious: what are the biggest challenges or hurdles that you associate with that? SEIF: At this point, you don't want just feedback. You want constructive criticism. Like, you want to work with people who will criticize the applic...and you iterate with them based on this criticism, right? They're just not happy about you and trying to create design partners. So, for us, it was very important to have these small design partners who can work with us to actually prove ourselves as valuable in a single domain. Right now, we need to find a way to scale this across several domains. And how do you do that without sacrificing? Like, how do you open into other domains without sacrificing the original domain you came from? So, there's a lot of things [inaudible 34:28]. And we are in the middle of this. Honestly, I Forrest Gumped my way through half of this, right? Like, I didn't know what I was doing. I had ideas. I think it's more of luck at this point. And I had luck. No, we did work. We did work a lot. We did sleepless nights and everything. But I think, in the last three years, we became more mature and started thinking more about product. And as I said, like, our CEO, Neil, and Dominic, our head of product, are putting everything behind being a product-led organization, not just a tech-led organization. VICTORIA: That's super interesting. I love to hear that that's the way you're thinking about it. JOE: I was just curious what other domains you're looking at pushing into if you can say. SEIF: So, we are going to start moving into ETL a bit more. We're trying to see how we can fit in specific ML scenarios. I can't say more about the other, though. JOE: Do you think you'll take the same approaches in terms of value proposition, like, low cost, good enough latency? SEIF: Yes, that's definitely one thing. But there's also...so, this is the values we're bringing to the customer. But also, now, our internal values are different. Now it's more of move with urgency and high velocity, as we said before, right? Think big, work small. The values in terms of values we're going to take to the customers it's the same ones. And maybe we'll add some more, but it's still going to be low-cost and large-scale. And, internally, we're just becoming more, excuse my French, agile. I hate that word so much. Should be good with Scrum. VICTORIA: It's painful, but everyone knows what you're talking about [laughs], you know, like -- SEIF: See, I have opinions here about Scrum. I think Scrum should be only used in terms of iceScrum [inaudible 36:04], or something like that. VICTORIA: Oh no [laughter]. Well, it's a Rugby term, right? Like, that's where it should probably stay. SEIF: I did not know it's a rugby term. VICTORIA: Yeah, so it should stay there, but -- SEIF: Yes [laughs]. VICTORIA: Yeah, I think it's interesting. Yeah, I like the being flexible. I like the just, like, continuous feedback and how you all have set up to, like, talk with your customers. Because you mentioned earlier that, like, you might open source some of your projects. And I'm just curious, like, what goes into that decision for you when you're going to do that? Like, what makes you think this project would be good for open source or when you think, actually, we need to, like, keep it? SEIF: So, we open source libraries, right? We actually do that already. And some other big organizations use our libraries; even our competitors use our libraries, that we do. The whole product itself or at least a big part of the product, like database, I'm not sure we're going to open source that, at least not anytime soon. And if we open source, it's going to be at a point where the value-add it brings is nothing compared to how well our product is, right? So, if we can replace whatever's at the back with...the storage engine we have in the back with something else and the product doesn't get affected, that's when we open source it. VICTORIA: That's interesting. That makes sense to me. But yeah, thank you for clarifying that. I just wanted to make sure to circle back. Since you have this big history in open source, yeah, I'm curious if you see... SEIF: Burning me out? VICTORIA: Burning you out, yeah [laughter]. Oh, that's a good question. Yeah, like, because, you know, we're about to be in October here. Do you have any advice or strategies as a maintainer for not getting burned out during the next couple of weeks besides, like, hide in a cave and without internet access [laughs]? SEIF: Stay away from Reddit and Hacker News. That's my goal for October now because I'm always afraid of getting too attached to an idea, or too motivated, or excited by an idea that I drift away from what I am actually supposed to be doing. VICTORIA: Last question is, is there anything else you would like to promote? SEIF: Yeah, check out our website; I think it's at axiom.co. Check it out. Sign up. And comment on Discord and talk to me. I don't bite, sometimes grumpy, but that's just because of lack of sleep in the morning. But, you know, around midday, I'm good. And if you're ever in Berlin and you want to hang out, I'm more than willing to hang out. VICTORIA: Whoo, that's awesome. Yeah, Berlin is great. I was there a couple of years ago but no plans to go back anytime soon, but maybe I'll keep that in mind. You can subscribe to the show and find notes along with a complete transcript for this episode at giantrobots.fm. If you have questions or comments, email us at hosts@giantrobots.fm. And you could find me on Twitter @victori_ousg. And this podcast is brought to you by thoughtbot and produced and edited by Mandy Moore. Thanks for listening. See you next time. Did you know thoughtbot has a referral program? If you introduce us to someone looking for a design or development partner, we will compensate you if they decide to work with us. More info on our website at tbot.io/referral. Or you can email us at referrals@thoughtbot.com with any questions. Special Guests: Joe Ferris and Seif Lotfy.
Introducing the Elements of Cardinality! Personal update: I stepped on my laptop because I am a genius and now I have a broken laptop screen. Still works well enough to get to Prime Day!
Covering Part 6 of Alain Badiou's Being and Event on “The Impasse of Ontology,” Alex and Andrew discuss Badiou's critique of the discernible and constructible as foreclosures of the event. Guest Calvin Warren thinks the catastrophe through the post-metaphysics of anti-math and the problem of the one. Warren is a professor of African American Studies at Emory University. His research interests include Continental Philosophy (particularly post-Heideggerian and nihilistic philosophy), Lacanian psychoanalysis, queer theory, Black Philosophy, Afro-pessimism, and theology. He is the author of Ontological Terror: Blackness, Nihilism, and Emancipation (Duke University Press). Concepts related to The Impasse of Ontology The Cantor-Gödel-Cohen-Easton Symptom, Events as Decisions, James C Scott's Seeing Like a State, The Impasse of Ordinality/Cardinality Set/Number Situation/State and Belonging/Inclusion, Errancy and the Immeasurable, Cardinality, Diagonalization and Cantor/Continuum Hypothesis, Kurt Gödel and Paul Cohen, Jacques Lacan and the Impasse of Formalization, The Power Set and the Size of the State, The Subject and the Abyss, Critiques of Leibniz's Discernible and Constructible Worlds (and Analytic Philosophy's Symbolic Thought), Rousseau's General and Undifferentiated Being of Truth (and Paul Cohen's Absolutization of Errancy), and all Classic Metaphysics that includes Communist Eschatology (and Large Cardinals, the Virtual Being of Theology, and Transcendence). Interview with Calvin Warren Qui Parle on The Catastrophe, Ontological Terror, Alain Badiou and the One as Anti-Black, Denise Ferreira da Silva, Pure Form as Pure Violence, Black aesthetics, Katherine McKittrick, The Ledger as Both the Inclusion of Black Death and the Concealment of Black Life, Catastrophe, Abyss, Nihilism, Nothingness, Pessimism, Post-Metaphysics, Martin Heidegger, Jacques Lacan, Jean-Paul Sartre, Frantz Fanon and the Zone of Non-Being, Subtraction, Aesthetics, Romanticism, Afrofuturism Links Warren profile, https://aas.emory.edu/people/bios/warren-calvin.html Warren papers, https://emory.academia.edu/calvinwarren Warren, Ontological Terror: Blackness, Nihilism, and Emancipation, https://www.dukeupress.edu/ontological-terror Warren, "The Catastrophe: Black Feminist Poethics, (Anti)form, and Mathematical Nihilism," https://muse.jhu.edu/article/749148/pdf
Covering Part 3 of Alain Badiou's Being and Event on “Nature & Infinity,” Alex and Andrew complete the "arithmetic, natural story" that constitutes Badiou's presentation of being within the book so far. Guest Sarah Pourciau explores the history and philosophy of set theory, while also scrutinizing the conclusions Badiou tries to draw from it. Pourciau is a professor of German Studies at Duke University. Her expertise includes 19th Century German thought, including both philosophy and mathematics (Dedekind, Cantor). She is the author of the book The Writing of Spirit: Soul, System, and the Roots of Language Science. Concepts on Nature and Infinity Political Modernism, Math as the Difference between Real and Natural Numbers, Martin Heidegger's Poetic Ontology, Jacques Lacan's Matheme, Physis, Nature, Natural Multiples, the Non-existence of Nature, Cardinality and Ordinality, Ordinal Chain, Infinity and Finitude, Arithmetic and Natural Infinity, Georg Cantor and Richard Dedekind, Five Critiques of GWF Hegel's Notion of Infinity. Interview with Sarah Pourciau Digital Ocean, Richard Dedekind, Platonic Eidos, Georg Cantor and the Abyss, Gender and “The Feminine,” Kantian Intuition, Logos and the Origin of Set Theory, Politics, Naming and Numbers, Spontaneity, Différance, Alan Turing and Kurt Gödel, Computability. Links Pourciau profile, https://scholars.duke.edu/person/sarah.pourciau Pourciau, The Writing of Spirit: Soul, System, and the Roots of Language Science, https://www.fordhampress.com/9780823275632/the-writing-of-spirit/ Pourciau, "A/logos: An Anomalous Episode in the History of Number," https://muse.jhu.edu/article/728110 Pourciau, "On the Digital Ocean," https://www.journals.uchicago.edu/doi/abs/10.1086/717319
Richard talks to Joël Quenneville about his experiences with Ruby and Elm, how dependency graphs can be applied in teaching, and also a concept that's new to the podcast: Conditional Cardinality.
Covering Part 1 of Alain Badiou's Being and Event on the topic of “Being,” Alex and Andrew introduce some foundational concepts and address Badiou's relation to other philosophers. Guest Knox Peden outlines where Badiou fits within the intellectual history of French philosophy, Marxism, and science. Peden is author of Spinoza Contra Phenomenology: French Rationalism from Cavaillès to Deleuze (published in 2014). Knox has also worked as an editor and translator including collaborations on Cahiers pour l'Analyse (published as Concept and Form, volumes 1 and 2) and On Logic and the Theory of Science by Jean Cavaillès. Schools of Philosophy Math and the Philosophy of Mathematics, a Mathematic Ontology based in Set Theory, Being Qua Being, Martin Heidegger and Badiou's Critique of Poetic Ontology, Post-Cartesian Theories of the Subject from Karl Marx, Sigmund Freud, and Jacques Lacan, Logical Positivism and the Vienna Circle. Key Thinkers and Concepts Jean Cavaillès, Albert Lautman, Georg Cantor, and Kurt Gödel, Axiomatic Set Theory (Axiom of Extensionality, Power Sets, Axiom of Union, Axiom of Separation, Axioms of Replacement and Substitution), The Count, The One, Void, ∅ (Mark Naught), Nature, Name, Cardinality. Interview with Knox Peden French Marxism, Marxist Science and Ideology, Rationalism, Empiricism, Phenomenology and Edmund Husserl, Gaston Bachelard and Philosophy of Science, Truth, Cahiers pour l'Analyse including Jacques-Alain Miller and Jean-Claude Milner, “Mark and Lack,” the Subject, Suture. Links Knox Peden profile, https://hass.uq.edu.au/profile/7697/knox-peden Peden, Spinoza Contra Phenomenology: French Rationalism from Cavaillès to Deleuze, https://www.sup.org/books/title/?id=22793 Hallward and Peden, Concept and Form, two volumes dedicated to Cahiers pour l'Analyse, https://www.versobooks.com/series_collections/35-concept-and-form Cahiers pour l'Analyse(electronic edition) http://cahiers.kingston.ac.uk/ Cavaillès, On Logic and the Theory of Science, translated by Peden and Mackay, https://www.urbanomic.com/book/logic-theory-science/
This is the fourth in a 4-part series where Anders Larson and Shea Parkes discuss predictive analytics with high cardinality features. In the prior episodes we focused on approaches to handling individual high cardinality features, but these methods did not explicitly address feature interactions. Factorization Machines can responsibly estimate all pairwise interactions, even when multiple high cardinality features are included. With a healthy selection of high cardinality features, a well tuned Factorization Machine can produce results that are more accurate than any other learning algorithm.
This is the third in a 4-part series where Anders Larson and Shea Parkes discuss predictive analytics with high cardinality features. In this episode they focus on feature engineering via the hashing trick. The hashing trick is most applicable for extremely high cardinality, and at first glance can seem almost ridiculous. In a lot of ways, it is the same as bucketing values at random. But there are times that it is more valuable to include randomly engineered buckets than to exclude the original high cardinality feature entirely.
Joël has been pondering another tool for thought from Maggie Appleton: diagramming. What does drawing complex things reveal? Stephanie has updates on how Soup Group went, plus a clarification from last week's episode re: hexagons and tessellation. They also share the top most impactful articles they read in 2022. This episode is brought to you by Airbrake (https://airbrake.io/?utm_campaign=Q3_2022%3A%20Bike%20Shed%20Podcast%20Ad&utm_source=Bike%20Shed&utm_medium=website). Visit Frictionless error monitoring and performance insight for your app stack. Maggie Appleton tools for thought (https://maggieappleton.com/tools-for-thought) Squint test (https://www.youtube.com/watch?v=8bZh5LMaSmE&themeRefresh=1) Cardinality of types (https://guide.elm-lang.org/appendix/types_as_sets.html) Honeycomb hexagon construction (https://www.nature.com/articles/srep28341) Coachability (https://cate.blog/2021/02/22/coachability/) Strangler Fig Pattern (https://shopify.engineering/refactoring-legacy-code-strangler-fig-pattern) Finding time to refactor (https://thoughtbot.com/blog/finding-the-time-to-refactor) Parse don't validate (https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-validate/) Errors cluster around boundaries (https://thoughtbot.com/blog/debugging-at-the-boundaries) Transcript: STEPHANIE: Hello and welcome to another episode of The Bike Shed, a weekly podcast from your friends at thoughtbot that has basically become a two-person book club between me and Joël. [laughter] JOËL: I love that. STEPHANIE: I'm so sorry, I had to. I think we've been sharing so many things we've been reading in the past couple of episodes, and I've been loving it. I think it's a lot of the conversations we have off-air too, and now we're just bringing it on on-air. And I am going to lean into it. [laughs] JOËL: I like it. STEPHANIE: So, Joël, what's new in your world? JOËL: So, in a recent episode, I think it was two episodes ago, you shared an article by Maggie Appleton about tools for thought. And I've kind of been going back to that article a few times in the past few weeks. And I feel like I always see something new. And one tool for thought that Maggie explicitly mentions in the article is diagramming, and that's something that we've used as an industry for a long time to deal with conditional logic is just writing a flow diagram. And I feel like that's such a useful tool sometimes to move away from code and text into visuals and draw your problem rather than write your problem. It's often useful either when I'm trying to figure out how to structure some of my own code or when I'm reviewing a PR for somebody else, and something just feels not quite right, but I'm not quite sure what I want to say. And so drawing the problem all of a sudden might give me some insights, might help me identify why does something feel off about this code that I can't quite put into words? STEPHANIE: What does drawing complex things reveal for you? Is there a time where you were able to see something that you hadn't seen before? JOËL: One thing I think it can make more obvious is the shape of the problem. When we describe a problem in words, sometimes there's a sense of like, okay, there are two main paths through this problem or something. And then when we do our code, we try to make it DRY, and we try all these things. And it's really hard to see the flow of logic. And we might actually have way more paths through our code than are actually needed by the initial problem definition. I think we talked about this in a past episode as well, structuring a multi-step form or a wizard. And oftentimes, that is structured way more complex than it needs to be. And you can really see that difference when you draw out a flow diagram, the difference between forcing everything down a single linear flow with a bunch of little independent conditions versus branching up front three or four or five ways, however many steps you have. And then, from there, it's just executing code. STEPHANIE: I have two thoughts here. Firstly, it's very tragic that this is an audio medium only [laughs] and not also a visual one. Because I think we've joked in the past about when we've, you know, talked about complex problems and branching conditionals and stuff like that, like, oh, like, if only we could show a visual representation to our listeners. [laughs] And secondly, now that makes a lot more sense why there are so many whiteboards just hanging out in offices everywhere. [laughs] JOËL: We should use them more. It's interesting you mentioned the limitations of an audio format that we have. But even just describing the problem in an audio format is different than implementing it in code. So if I were to describe a problem to you that says, oh, we have a multi-step form that has three different steps to it, in that description, you might initially think, oh, that means I want to branch three ways up front, and then each step will need to do some processing. But if you look at the implementation in the code, maybe whoever coded it, and maybe that's yourself, will have done it totally differently with a lot more branching than just three up front because it's a different medium. STEPHANIE: That's a really good point. I also remember reading something about how you can reason about how many branches a piece of code might have if you just look at the structure of the lines of code in your editor if you either step away from it and are just looking at the code not really able to see the text itself but just the shape that it makes. If you have some shorter lines and then a handful of longer lines, you might be able to see like, oh, like these are multiple conditionals happening, which I think is kind of related to what you're saying about taking a piece of code and then diagramming it out to really see the different paths. And I know that that can also be obscured a little bit if you are stylistically using different syntax. Like, if you are using a guard clause to return early, that's a conditional, but it gets a bit hidden from the visual representation than if you had written out the full if statement, for example. JOËL: I think that's a really interesting distinction that you bring up because a lot of languages provide syntactic sugar for common conditional tasks that we do. And sometimes, that syntactic sugar will almost obfuscate the fact that there is a conditional happening at all, which can be great in a lot of cases. But when it comes to analyzing and particularly comparing different implementations, a second conversion that I like to do is converting all of the conditional code to some standardized form, and, for me, that's typically just your basic if...elsif...else expressions. And so any fancy Boolean operators we're doing, any safe navigation that we're doing around nil, maybe some inline conditionals, early returns, things like that, all of the implicit elses that are involved as well, putting them all into some normalized form then allows me to compare two implementations with each other. And sometimes, two approaches that we initially thought were identical, just with different syntax, turned out to have slightly different behavior because maybe one has this sort of implicit branch that the other one doesn't. And by converting to a normalized syntax, all of a sudden, this difference becomes super obvious. To be clear, this is not something I do necessarily in the actual code that I commit, not necessarily writing everything long-form. But definitely, when I'm trying to think about conditional code or analyzing somebody else's code, I will often convert it to long-form, some normalized shape so that I can then see some things about it that were not obvious in the final form. Or to make a comparison with something else, and then you can compare apples to apples and say, okay, both these approaches that we're considering in normalized form, here's what they look like. There's some difference here that we do care about or don't care about. STEPHANIE: That's really interesting. I find it very curious that there is a value in having the long-form approach of writing the code out and being able to identify things. But then the end result that we commit might not look like that and be shortened and be kind of, quote, unquote, "polished," or at least condensed with syntactic sugar. And I'm kind of wondering why that might be the case. JOËL: I think a lot of that will come down to your personal or your company's style guide. Personally, I think I do lean a little bit more towards a slightly more explicit form. But there are plenty of times that I will use syntactic sugar as well, as long as everybody knows what it does. But sometimes, it will come at the cost of other analysis techniques. You had mentioned the squint test earlier, which I believe is a term coined by Sandi Metz. STEPHANIE: I think it might be. That rings a bell. JOËL: And that is a benefit that you get by writing explicit conditionals all the time. But sometimes, it is much nicer to write code that is a little bit more terse. And so you have to do the trade-offs there. STEPHANIE: Yeah, that's a really good point. JOËL: So that's two of the sort of three formats that I was thinking about for converting conditional code to gain more insight. The other format is honestly a little bit weird. It's almost a stretch. But from my time spent working with the Elm language, I learned how to use its type system, which uses a concept called algebraic data types, or some languages will call these tagged unions, some languages will call these sum types. This concept goes by a lot of different names. But they're used to define types into model data. But there's a really fun property, which is that you can model conditional code using this as well. And so you can convert executable code into these algebraic data types. And now, you can apply a lot of tools and heuristics that you have from the data modeling world to this conditional code. STEPHANIE: Do you have a practical example? JOËL: So a classic thing that data modelers will say is you should make impossible states impossible. So in practice, this means that when you define a type using these algebraic data types, you should not be able to create more distinct values than are actually valid in this particular system. So, for example, if a value is required to always be present for something and there's no way in the system for a value to become not present, then don't allow it to be nullable. We do something similar when we design a database schema when we put a null false on a column because we know that this will never be null. And so, why allow nulls when you know they should never be there? So it's a similar thing with the types. This sort of analysis that you can do looking at...the fancy term is the types cardinality. I'll link to an article that digs into that for people who are curious. But that can show you whether a type can represent, let's say, ten possible values, but the domain you're trying to model only has 5. And so when there's that discrepancy, there are five valid values that can be modeled by your type and an additional extra five that are not valid that just kind of shake out from the way you implemented things. So you can take that technique and apply it to a conditional that you've converted to algebraic data type form. And that can help find things like paths through your conditional code that don't line up with the problem that you're trying to solve. So going back to the example I talked about earlier of a multi-step form with three different steps, that's a problem that should have three paths through your conditional. But depending on your implementation, if it's a bunch of independent if clauses, you might have a bit of a combinatorial explosion. And there might be 25 different paths through that chunk of code. And that means three of them are the ones that your problem wants, and then the extra 22 are things that should quote, unquote, "never happen," but we all know that they eventually will. So that kind of analysis can help maybe give you pointers to the fact that your current structure is not well-suited to the problem that you're trying to solve. STEPHANIE: I think another database schema example that came to mind for me was using an enum to declare acceptable values for a field. And, yeah, I know exactly what you mean when working with code where you might know, because of the way the business works, that this thing is impossible, and yet, you still have to either end up coding defensively for it or just kind of hold that complexity in your head. And that can lead to some gnarly situations, and it makes debugging down the line a lot more difficult too. JOËL: It definitely makes it really hard for somebody else to know the original intention of the code when a conditional has more paths through it than there actually are actual paths in the problem you're trying to solve. Because you have to load all of that in your head, and our programmer brains are trained to think about all the edge cases, and what if this condition fires but this other one doesn't? Could that lead to a bug? Is that just a thing that's like, well, but the inputs will never trigger that, so you can ignore it? And if there are no comments to tell you, and if there are comments, then do you trust them? Because it -- STEPHANIE: Yes. [laughter] I'll just jump in here and say, yeah, I have seen the comments then conflict with the code as well. And so you have these two sources of information that are conflicting with each other, and you have no idea what is true and what's not. JOËL: So I'm a big fan of structuring conditional code such that the number of unique paths through a set of conditions is the same as the sort of, you might say, logical paths through the problem domain that we haven't added extra paths, just sort of accidentally due to the way we implemented things. STEPHANIE: Yeah. And now you have three different ways to visualize that information in your head [laughs] with these mental models. JOËL: Right. So from taking code that is conditional code and then transforming it into one of these other representations, I don't always do all three, but there are tools that I have. And I can gain all sorts of new insights into that code by looking at it through a completely different lens. STEPHANIE: That's super cool. JOËL: So the last episode, you had mentioned that you were going to try a soup club. How did that turn out? STEPHANIE: It turned out great. It was awesome, the inaugural soup group. I had, I think, around eight people total. And I spent...right after work, I went straight to chopping celery [laughs] and onions and just soup prepping. And it was such a good time. I invited a different group of friends than normally come together, and that turned out really well. I think we all kind of had at least one thing in common, which was my goal was just to, you know, have my friends come together and meet new people too. And we had soup, and we had bread. Someone brought a spiced crispy chickpea appetizer that went really well inside of our ribollita vegetable bean soup. And then I had the perfect amount of leftovers. So after making a really big batch of food and spending quite a long time cooking, I wanted to make sure that everyone had their fill. But it was also pretty nice to have two servings left over that I could toss in the freezer just for me and as a reward for my hard work. And then it ended up working out really well because I went on vacation last week. And the night we got back home, we were like, "Oh, it's kind of late. What are we going to do for dinner?" And then I got to pull out the leftover soup from my freezer. And it was the perfect coming home from a big trip, and you have nothing in your fridge kind of deal. So it worked out well. JOËL: I guess that's the advantage of hosting is that you get to keep the leftovers. STEPHANIE: It's true. JOËL: You also have to, you know, make the soup. [laughs] STEPHANIE: Also true. [laughs] But like I said, it wasn't like I had so much soup that I was going to have to eat it every single day for the next week and a half. It was just the amount that I wanted. So I'm excited to keep doing this. I'm hoping to do the next soup group in the next week or two. And then some other folks even offered to host it for next time. So maybe we might experiment with doing a rotating thing. But yeah, it has definitely brought me joy through this winter. JOËL: That's so lovely. What else has been new in your world? STEPHANIE: I have a clarification to make from last week's episode. So last week, we were talking about hexagons and tessellation. And we had mentioned that hexagons and triangles were really strong shapes. And we mentioned that, oh yeah, you can see it in the natural world through honeycomb. And I've since learned that bees don't actually build the hexagon shape themselves. That was something that scientists did think to be true for a little bit, that bees were just geometrically inclined, but it turns out that the accepted theory for how honeycomb gets its shape is that bees build cylindrical cells that later transform into hexagons, which does have a lot of surface area for holding the honey, though the process itself is actually still debated by scientists. So there's some research that has supported the idea that it's formed through physical forces like the changing temperature of the wax that transforms it from a cylinder shape into a hexagon, though, yeah, apparently, the studies are still a bit inconclusive. And the last scientific paper I read about this, just to really get my facts straight [laughs], they were kind of exploring aspects of bee behavior that led to the hexagons eventually forming because that does require that the cylinders are perfectly the same size and are at least built in a hexagonal pattern, even though the cells themselves are not hexagons. JOËL: Fascinating. So it sounds like it's either a social thing where the bees do it based off of some behavior. Or if it's a physical thing, it's some sort of like hexagons are a natural equilibrium point that everything kind of trends to, and so as temperature changes, the beehive will naturally trend towards that. STEPHANIE: Yeah, exactly. I have a good friend who is a beekeeper, so I got to pick her brain a little bit about honeycomb. [laughs] MID-ROLL AD: Debugging errors can be a developer's worst nightmare...but it doesn't have to be. Airbrake is an award-winning error monitoring, performance, and deployment tracking tool created by developers for developers that can actually help cut your debugging time in half. So why do developers love Airbrake? It has all of the information that web developers need to monitor their application - including error management, performance insights, and deploy tracking! Airbrake's debugging tool catches all of your project errors, intelligently groups them, and points you to the issue in the code so you can quickly fix the bug before customers are impacted. In addition to stellar error monitoring, Airbrake's lightweight APM helps developers to track the performance and availability of their application through metrics like HTTP requests, response times, error occurrences, and user satisfaction. Finally, Airbrake Deploy Tracking helps developers track trends, fix bad deploys, and improve code quality. Since 2008, Airbrake has been a staple in the Ruby community and has grown to cover all major programming languages. Airbrake seamlessly integrates with your favorite apps to include modern features like single sign-on and SDK-based installation. From testing to production, Airbrake notifiers have your back. Your time is valuable, so why waste it combing through logs, waiting for user reports, or retrofitting other tools to monitor your application? You literally have nothing to lose. Head on over to airbrake.io/try/bikeshed to create your FREE developer account today! JOËL: So in the past few episodes, we've talked about books we're reading, articles that we're reading. This is kind of turning into the Stephanie and Joël book club. STEPHANIE: I love it. JOËL: That got me thinking about things that I've read that were impactful in the past year. So I'm curious for both of us what might be, let's say, the top two or three most impactful articles that you read in 2022. Or maybe to put it another way, what are the top two or three articles that you reference the most in conversations with other people? STEPHANIE: So listeners might not know this, but I actually joined thoughtbot early last year in February. So I was coming into this new job, and I was so excited to be joining an organization with so many talented developers. And I was really excited to learn from everyone. So I kind of came in with really big goals around my technical growth. And the end of the year just passed, and I got to do a little bit of reflection. And I was quite proud of myself actually for all the things that I had learned and all the ways that I had grown. And I was reminded of this blog post that I think I had in the back of my mind around "Coachability" by Cate, and she talks about how coaching is different from mentorship. And she provides some really cool mental models for different ways of providing support to your teammates. Let's say mentorship is teaching someone how to swim, and maybe helping someone out with a task might be throwing them a life raft. Coaching is more like seeing someone in the water, but you are up on a bridge, and you are kind of seeing all of their surroundings. And you are identifying ways that they can help themselves. So maybe there's a branch, a tree branch, a few feet away from them. And can they go grab that tree branch? How can they help themselves? So I came to this new job at thoughtbot, and I had these really big goals. But I also knew that I wanted to lean on my new co-workers and just be able to not only learn the things that I was really excited to learn but also trust that they had my best interests in mind as well and for them to be able to point out things that could help my career growth. So the idea of coachability was really interesting to me because I had been coming from a workplace that had a really great feedback culture. But I think this article touches on what to do with feedback in a way that I hadn't seen before. So she also describes being coachable as having two axes, one of them being receptiveness to feedback and the other being actionability in response to feedback. So receptiveness is when you hear feedback; do you listen to it? Do you work through it? How does that feedback fit into your mental model of your goals and your skills? And then actionability is like, okay, what do you do with that? How do you change your behavior? How do you change the way you approach problems? And those two things in mind were really helpful in terms of understanding how I respond to feedback and how to really make the most of it when I receive it. Because there are times when I get feedback, and I don't know what to do with it, you know, maybe it just wasn't specific enough. And so, in that sense, I want to work on my actionability and figuring out, okay, someone said that testing would be a really great opportunity for me to learn. But what can I do to learn how to write better tests? And that might involve figuring that out on my own, like, what strategies work for me. Or that might involve asking them, being like, "What do you recommend?" So yeah, I had this really big year of growth. And I'm excited to keep this mental model in mind when I feel like I might be stuck and I'm not getting the growth that I want and using those axes to kind of determine how to move forward. JOËL: I think the first thing that comes to mind for me is the episode that you and I did a while back about the value of precise language. For example, you talked about the distinction between coaching and mentorship, which I think in sort of colloquial speech, we kind of use interchangeably. But having them both mean different things, and then being able to talk about those or at least analyze yourself through the lens of those two words, I think, is really valuable and may be helping to drive either insights or actions that you can take. And similarly, this idea of having two different axes for receptiveness versus...was it changeability you said was the other one? STEPHANIE: Actionability. JOËL: Actionability, I think, is really helpful when you're feeling stuck because now you can realize, oh, is it because I'm not accepting feedback or not getting good feedback? Or is it that I'm getting feedback, but it's hard to take action on it? So just all of a sudden, having those terms and having that mental model, that framework, I feel like equips me to engage with feedback in a way that is much more powerful than when we kind of used all those terms interchangeably. STEPHANIE: Yeah, exactly. I think that it's very well understood that feedback is important and having a good feedback culture is really healthy. But I think we don't always talk about the next step, which is what do you do with feedback? And with the help of this article, I've kind of come to realize that all feedback is valuable, but not all of it is good. And she makes a really excellent point of saying that the way you respond to feedback also depends on the relationship you have with the person giving it. So, ideally, you have a high trust high respect relationship with that person. And so when they give you feedback, you are like, yeah, I'm receptive to this, and I want to do something about it. But sometimes you get feedback from someone, and you might not have that trust in that relationship or that respect. And it just straight up might not be good feedback for you. And the way you engage with it could be figuring out what part of it is helpful for me and what part of it is not? And if it's not helpful in terms of helping your growth, it might at least be informative. And that might help you learn something about the other person or about the circumstances or environment that you're in. JOËL: Again, I love the distinction you're making between helpful and informative. STEPHANIE: Yeah. I think I had to learn that the hard way this year. [laughs] So, yeah, I really hope that folks find this vocabulary or this idea...or consider it when they are thinking about feedback in terms of giving it or receiving it and using it in a way that works for them to grow the way they want to. JOËL: I'm curious, in your interactions, and learning, and growth over the past year, do you feel like you've leaned a little bit more into the mentorship or the coaching side of things? What would you say is the rough percentage breakdown? Are we talking 50-50, 80-20? STEPHANIE: That's such a good question. I think I received both this year. But I think I'm at a point in my career where coaching is more valuable to me. And I'm reminded of a time a few months into joining thoughtbot where I was working and pairing with a principal developer. And he was really turning the workaround on me and asking, like, what do I want to do? What do I see in the code? What areas do I want to explore? And I found it really uncomfortable because I was like, oh, I just want you to tell me what to do because I don't know, or at least at the time, I was really...I found it kind of stressful. But now, looking back on it and with this vocabulary, I'm like, oh, that's what true coaching was because I gained a lot of experience towards my foundational skill set of figuring out how to solve problems or identifying areas of refactoring through that process. And so sometimes coaching can feel really uncomfortable because you are stretching outside of your comfort zone and that your coach is hopefully supporting you but not just giving you the help but teaching you how to help yourself. JOËL: That's a really interesting thing to notice. And I think what I'm hearing is that coaching can feel less comfortable than mentoring because you're being asked to do more of the work yourself. And you're maybe being stretched in some ways that aren't exactly the same as you would get in a more mentoring-focused scenario. Does that sound right? STEPHANIE: Yeah, I think that sounds right because, like I said, I was also receiving mentorship, and I learned about new things. But those didn't always solidify in terms of empowering me next time to be able to do it without the help of someone else. Joël, what was an article that really spoke to you this last year? JOËL: So I really appreciated an article by Adrianna Chang, who's a developer at Shopify, about "Refactoring Legacy Code with the Strangler Fig Pattern." And it talks about this approach to moving refactoring code from one implementation to another. And it's a longer-ranged process, and how to do so incrementally. And a big theme for me this year has been refactoring and incremental change. I've had a lot of conversations with people about how to spot smaller steps. I've written an article on working incrementally. And so I think this was really nice because it gave a very particular technique on how to do so with an example. And so, because these sorts of conversations kept coming up this year, I found myself referencing this article all the time. STEPHANIE: I really loved this article too. And this last year, I also saw a strangler fig tree for the first time in real life in Florida. And I think that was after I had read this article. And it was really cool to make the connection between something I was seeing in nature with a pattern in software development or technique. JOËL: We have this metaphor, and now you get to see the real thing. I was excited because, at RubyConf Mini this year, I actually got to meet Adrianna. So it was really cool. It's like, "Hey, I've been referencing your article all year. It's super cool to meet you in person." STEPHANIE: That's awesome. I love that, just being able to support members of the community. What I really liked about the approach this article advocated for is that it allowed developers to continue working. You don't have to halt everything and dedicate time to refactor and not get any new feature work done. And that's the beauty of the incremental approach that you were talking about earlier, where you can continue development. Sometimes that refactoring might be paused for some reason or another, but then you can pick back up where you left off. And that is really intriguing to me because I think this past year, I was working on a client where refactoring seemed like something we had to dedicate special time for. And it constantly became tough to prioritize and sell to stakeholders. Whereas if you incorporate it into the work and do it in a way that doesn't stop the show [laughs] from going on, it can work really well and work towards sustainability and maintenance, which is another thing that we've talked a lot about on the show. JOËL: Something that's really powerful, I think, with that technique is that it allows you to have all of the intermediate steps get merged into your main branch and get shipped. So you don't have to have this long-running branch with a big change that's constantly going stale, and you're having to keep in sync with the main branch. And, unfortunately, I've often seen even this sort of thing where you create a long-running branch for a big change, a big refactor, and eventually, it just gets abandoned, and you have not locked in any wins. STEPHANIE: Yeah, that's the worst of both worlds where you've dedicated time and resources and don't get the benefits of that work. I also liked that the strangler fig pattern kind of forces you to really understand the existing code. I think working with legacy code can be really challenging. And a lot of people don't like to do it because it involves a lot of spelunking and figuring out, okay, what's really going on. But in order to isolate the pieces to, you know, slowly start to stop making calls to the old code, it requires that you take a hard look at your legacy code and really figure it out. And I honestly think that that then informs the new code that you write to better support both the old feature and also any new features to come. JOËL: Definitely. The really nice thing about this pattern is that it also scales up and down. You can do this really small...even as part of a feature branch; maybe it's just part of your development process, even if you don't necessarily ship all of the intermediate steps. But it helps you work more incrementally and in a tighter scope. And then you can scale it up as big as changing out entire sections of a framework or...I think Adrianna's example is like switching out a data source. And so you can do some really large refactors. But then you could do it as well on just a small feature. I really like using this pattern anytime you're doing things like Rails upgrades, and you've got old gems that might not convert over where it's like, oh, the community abandoned this gem between Rails 4 and Rails 5. But now you need sort of a bridge to get over. And so I think that pattern is particularly powerful when doing something like a Rails upgrade. STEPHANIE: Very Cool. JOËL: So what would be a second article that was really impactful for you in the past year? STEPHANIE: So, speaking of refactoring, I really enjoyed a blog post called Finding Time to Refactor by a former thoughtboter, German Velasco. He makes a really great point that we should think of completeness in our work, not just when the code works as expected or meets the product requirements, but also when it is clear and maintainable. And so he really advocates for baking refactoring into just your normal development process. And like I said, that goes back to this idea that it can be incremental. It doesn't have to be separate or something that we do later, which is kind of what I had learned before coming to thoughtbot. So when I was also speaking about just my technical growth, this shift in philosophy, for me, was a really big part of that. And I just started kind of thinking and seeing ways to just do it in my regular process. And I think that has really helped me to feel better about my work and also see a noticeable improvement in the quality of my code. So he mentioned the three times that he makes sure to refactor, and that is one when he is practicing TDD and going through the red-green-refactor cycle. JOËL: It's in the name. STEPHANIE: [laughs] It really is. Two, when code is difficult to understand, so if he's coming in and fixing a bug and he pays the tax of trying to figure out confusing code, that's a really great opportunity to then reduce that caring cost for others by making it clear while you're in there, so leaving things better than you found it. And then three, when the existing design doesn't work. We, I think, have mentioned the adage, "Make the change easy, and then make the easy change." So if he's coming in to add a new feature and it's just not quite working, then that's a really good opportunity to refactor the existing design to support this new information or new concept. JOËL: I like those three scenarios. And I think that second one, in particular, resonated with me, the making things easier to understand. And in the sort of narrower sense of the word refactoring, traditionally, this means changing the structure of the code without changing its behavior. And I once had a situation where I was dealing with a series of early return expressions in a method that were all returning Booleans. And it was really hard because there were some unlesses, some ifs, some weird negation happening. And I just couldn't figure out what this code was doing. STEPHANIE: Did you draw a diagram? [laughs] JOËL: I did not. But it turns out this code was untested. And so I pretty much just tried, like, it took two Booleans as inputs and gave back a Boolean. So I just tried all the combinations, put it in, saw what it gave me out, and then wrote tests for them. And then realized that the test cases were telling me that this code was always returning false unless both inputs were true. And that's when it kind of hits me, it's like, wait a minute, this is Boolean AND. We've reimplemented Boolean AND with this convoluted set of conditional code. And so, at the end there, once I had that test coverage to feel confident, I went in and did a refactor where I changed the implementation. Instead of being...I think it was like three or four inline conditionals, just rewrote it as argument one and argument two, and that was much easier to read. STEPHANIE: That's a great point. Because the next time someone comes in here, and let's say they have to maybe add another condition or whatever, they're not just tacking on to this really confusing thing. You've hopefully made it easier for them to work with that code. And I also really appreciated, you know, I was mentioning how this article affected my thought process and how I approach development, but it's a really great one to share to then foster a culture of just continuous refactoring, I guess, is what I'm going to call it [laughs] and hopefully, avoiding having to do a massive rewrite or a massive effort to refactor. The phrase that comes to mind is many hands make light work. And if we all incorporated this into our process, perhaps we would just be working all around with more delightful code. Joël, do you have one more article that really stood out to you this year? JOËL: One that I think I really connected with this year is "Parse, Don't Validate" by Alexis King. Long-time listeners of the show will have heard me talk about this a little bit with Chris Toomey when he was a guest on the show this past fall. But the gist of the article is that the process of parsing is converting a broader type into a narrower type with the potential for errors. So traditionally, we think of this as turning a string which a string is very broad. All sorts of things are strings, and then you turn it into something else. So maybe you're parsing JSON. So you take a string of characters and try to turn it into a Ruby hash, but not all strings are valid hashes. So there's also the possibility for errors. And so, JSON.parse() could raise an error in Ruby. This idea, though, can be then expanded because, ideally, you don't want to just check that a value is valid for your stricter rules. You don't want to just check that a string is valid JSON and then pass the string along to the next person. You actually want to transform it. And then everybody else down the line can interact with that hash and not have to do a check again is this valid JSON? You've already validated that you've already converted it into a hash. You don't need to check that it's valid JSON again because, by the nature of being a hash, it's impossible for it to be invalid. Now, you might have some extra requirements on that hash. So maybe you require certain keys to be present and things like that. And I think that's where this idea gets even more powerful because then you can kind of layer this on top and have a second parsing step where you say, I'm going to parse this hash into, let's say, a shopping cart object. And so, not all Ruby hashes are valid shopping carts. And so you try to take a broader value and coerce it into a narrower value or transform it into a narrower value and potentially raise an error for those hashes that are not valid shopping carts. And then, whoever down the line gets a shopping cart object, you can just call items on it. You can call price on it. You don't need to check is this key present? Because now you have that certainty. STEPHANIE: This reminds me of when I was working with TypeScript in the summer of last year. And having come from a dynamically-typed language background, it was really challenging but also really interesting to me because we were also parsing JSON. But once we had transformed or parsed that data into this domain object, we had a lot more confidence about what we were working in. And all the functions we wrote down the line or used on the line, we could know for sure that, okay, it has these properties about it. And that really shaped the code we wrote. JOËL: So use the word confident here, which, for me, it's a keyword. And so you can now assume that certain properties are true because it's been checked once. That can be tricky if you don't actually do a transformation. If you're just sort of passing a raw value down, you'll often end up with code that is defensive that keeps rechecking the same conditions over and over. And you see this lot around nil in Ruby where somebody checks for a value for nil, and then inside that conditional, three or four other conditions deep, we recheck the same value for nil again, even though, in theory, it should not be nil at that point. And so by doing transformations like that, by parsing instead of just validating, we can ensure that we don't have to repeat those conditions. STEPHANIE: Yeah, I mean, that refers back to the analyzing conditional code that we spent a bit of time talking about at the beginning of this episode. Because I remember in that application, we render different components based on the status of this domain object. And there was a condition for when the status was something that was not expected. And then someone had left a comment that was like, technically, this should never happen. But I think that he had to add it to appease the compiler. And I think had we been able to better enforce those boundaries, had we been more thoughtful around our domain modeling, we could have figured out how to make sure that we weren't then introducing that ambiguity down the line. JOËL: I think it's interesting that you immediately went to talking about TypeScript here because TypeScript has a type system. And the "f, Don't Validate" article is written in Haskell, which is another typed language. And types are great for showing you exactly like, here's the boundary. On this side of it, it's a string, and on this side here, it's a richly-typed value that has been parsed. In Ruby, we don't have that, everything is duck-typed, but I think the principle still applies. It's a little bit more implicit, but there are zones of high or low assumptions about the data. So when I'm interacting directly with raw input from a third-party endpoint, I'm really only expecting some kind of raw string from the body of the response. It may or may not be valid. There are all sorts of checks I need to do to make sure I can do anything with it. So that is a very low assumption zone. Later on, in the business logic part of the code, I might expect that I can call a method on the object to get the price of a shopping cart or a list of items or something like that. Now I'm in a much higher assumption zone. And being self-aware about where we transition from low assumptions to high assumptions is, I think, a really key takeaway for how we interact with code in Ruby. Because, oftentimes, where that boundary is a little bit fuzzy or where we think it's in one place but it's actually in a different place is where bugs tend to cluster. STEPHANIE: Do you have any thoughts about how to adhere to those rules that we're making so we're not having to assume in a dynamically-typed language? JOËL: One way that I think is often helpful is trying to use richer objects and to not just rely on primitives all the time. So don't pass a business process a hash and be just like, trust me, I checked it; it's got the right keys because the day will come when you pass it a malformed hash and now we're going to have an error in the business process. And now we have a dilemma because do we want to start adding defensive checks in the business process to be like, oh, are all our keys that we expect present, things like that? Do we need to elsewhere in the code make sure we process the hash correctly? It becomes a little bit messy. And so, oftentimes, it might be better to say, don't pass a raw hash around. Create a domain object that has the actual method that you want, and pass that instead. STEPHANIE: Oh, sounds like a great opportunity to use the new data class in Ruby 3.2 that we talked about in an episode prior. JOËL: That's a great suggestion. I would definitely reach for something like that, I think, in a situation where I'm trying to model something a little bit richer than just a hash. STEPHANIE: I also think that there have been more trends around borrowing concepts from functional programming, and especially with the introduction of classes that represent nil or empty states, so instead of just using the default nil, having at least a bit of context around a nil what or an empty what. That then might have methods that either raise an error or just signal that something is wrong with the assumptions that we're making around the flexibility that we get from duck typing. I'm really glad that you proposed this topic idea for today's episode because it really represented a lot of themes that we have been discussing on the show in the past couple of months. And I am excited to maybe do this again in the future to just capture what's been interesting or inspiring for us throughout the year. JOËL: On that note, shall we wrap up? STEPHANIE: Let's wrap up. Show notes for this episode can be found at bikeshed.fm. JOËL: This show has been produced and edited by Mandy Moore. STEPHANIE: If you enjoyed listening, one really easy way to support the show is to leave us a quick rating or even a review in iTunes. It really helps other folks find the show. JOËL: If you have any feedback for this or any of our other episodes, you can reach us @_bikeshed, or you can reach me @joelquen on Twitter. STEPHANIE: Or reach both of us at hosts@bikeshed.fm via email. JOËL: Thank you so much for listening to The Bike Shed, and we'll see you next week. ALL: Byeeeeeeeeeee!!!!!!!! ANNOUNCER: This podcast is brought to you by thoughtbot, your expert strategy, design, development, and product management partner. We bring digital products from idea to success and teach you how because we care. Learn more at thoughtbot.com.
This is the second in a 4-part series where Anders Larson and Shea Parkes discuss predictive analytics with high cardinality features. In this episode they focus on y-aware feature engineering. Y-aware feature engineering is all about carefully bleeding information from your training response back into your engineered features without grossly misrepresenting your ability to generalize to new data.
This is the first in a 4-part series where Anders Larson and Shea Parkes discuss predictive analytics with high cardinality features. In this episode they focus on introducing what high cardinality features are, why you might want to work with them and some of the basic approaches to including them in predictive models.
In this episode of PING, Luuk Hendricks and Willem Toorop from NLNet talk about their work to embed telemetry in the linux kernel using eXpress Data Path (XDP) Read more about XDP in a series of articles published in the APNIC blog: Journeying into XDP: Part 0 Journeying into XDP: Augmenting the DNS Journeying into XDP: Fully-fledged DNS service augmentation Journeying into XDP: Augmenting the DNS and their blog covering this episode of Ping: Journeying into XDP: XDPerimenting with DNS telemetry The views expressed by the featured speakers are their own and do not necessarily reflect the views of APNIC.
Using cardinality constraints for portfolio optimization opens the doors to new applications for creating innovative portfolios and exchange-traded funds – all while providing better returns with less market risk. Host Konstantinos Karagiannis recently co-authored a paper on portfolio optimization with Sam Palmer from Multiverse Computing. In this discussion with Konstantinos and Sam, discover how the team was able to outperform classical financial index tracking using D-Wave's Hybrid Solver … and a little ingenuity. Also, learn about Multiverse's innovative Singularity software tool.For more on Multiverse Computing, visit https://multiversecomputing.com/. To read the paper “Financial Index Tracking via Quantum Computing with Cardinality Constraints” mentioned in this episode, visit https://arxiv.org/abs/2208.11380. Visit Protiviti at www.protiviti.com/postquantum to learn more about how Protiviti is helping organizations get post-quantum ready.Follow host Konstantinos Karagiannis on Twitter and Instagram: @KonstantHacker and follow Protiviti Technology on LinkedIn and Twitter: @ProtivitiTech. Contact Konstantinos at konstantinos.karagiannis@protiviti.com. Questions and comments are welcome! Theme song by David Schwartz. Copyright 2021.
We all collect logs, metrics and perhaps traces and other data types, in support of our observability. But this can get expensive pretty quickly, especially in microservices based systems, in what is commonly known as “the cardinality problem”. On this episode of OpenObservability Talks I'll host Ben Sigelman, co-founder and the GM of Lightstep, to discuss this data problem and how to overcome it. Ben architected Google's own planet-scale metrics and distributed tracing systems (still in production today), and went on to co-create the open-source OpenTracing and OpenTelemetry projects, both part of the CNCF. The episode was live-streamed on 12 July 2022 and the video is available at https://youtu.be/gJhzwP-mZ2k OpenObservability Talks episodes are released monthly, on the last Thursday of each month and are available for listening on your favorite podcast app and on YouTube. We live-stream the episodes on Twitch and YouTube Live - tune in to see us live, and pitch in with your comments and questions on the live chat.https://www.twitch.tv/openobservabilityhttps://www.youtube.com/channel/UCLKOtaBdQAJVRJqhJDuOlPg Have you got an interesting topic you'd like to share in an episode? Reach out to us and submit your proposal at https://openobservability.io/ Show Notes: The difference between monitoring, observability and APM What comprises the cost of observability How common is the knowledge of cardinality and how to add metrics Controlling cost with sampling, verbosity and retention Lessons from Google's metrics and tracing systems Using metric rollups and aggregations intelligently Semantic conventions for logs, metrics and traces OpenCost project New research paper by Meta on schema-first approach to application telemetry metadata OTEL code contributions - published stats Resources: Monitoring vs. observability: https://twitter.com/el_bhs/status/1349406398388400128 The two drivers of cardinality: https://twitter.com/el_bhs/status/1360276734344450050 Sampling vs verbosity: https://twitter.com/el_bhs/status/1440750741384089608 Observing resources and transactions: https://twitter.com/el_bhs/status/1372636288021524482 Socials: Twitter: https://twitter.com/OpenObserv Twitch: https://www.twitch.tv/openobservability YouTube: https://www.youtube.com/channel/UCLKOtaBdQAJVRJqhJDuOlPg
“Observability is a technique for ensuring that you can understand novel problems in your system. Can you understand what's happening in your system and why, without having to push a new code by slicing and dicing existing telemetry signals that are coming out of your system?" Liz Fong-Jones is the co-author of the “Observability Engineering” book and a Principal Developer Advocate for SRE and Observability at Honeycomb. In this episode, Liz shared in-depth about observability and why it is becoming an important practice in the industry nowadays. Liz started by explaining the fundamentals of observability and how it differs from traditional monitoring. She explained some important concepts, such as the core analysis loop, cardinality and dimensionality, and doing debugging from a first principle. Later, Liz shared the current state of observability and how we can improve our observability by doing observability driven development and improving our practices based on the proposed observability maturity model found in the book. Listen out for: Career Journey - [00:05:44] Observability - [00:06:30] Pillars of Observability - [00:09:57] Monitoring and SLO - [00:12:28] Core Analysis Loop - [00:15:06] Cardinality and Dimensionality - [00:18:41] Debugging from First Principle - [00:21:20] Current State of Observability - [00:26:49] Implementing Observability - [00:30:20] Observability Driven Development - [00:36:53] Having Developers On-Call - [00:39:06] Observability Maturity Model - [00:41:59] 3 Tech Lead Wisdom - [00:44:10] _____ Liz Fong-Jones's Bio Liz is a developer advocate, labor and ethics organizer, and Site Reliability Engineer (SRE) with 17+ years of experience. She is an advocate at Honeycomb for the SRE and Observability communities, and previously was an SRE working on products ranging from the Google Cloud Load Balancer to Google Flights. She lives in Vancouver, BC with her wife Elly, partners, and a Samoyed/Golden Retriever mix, and in Sydney, NSW. She plays classical piano, leads an EVE Online alliance, and advocates for transgender rights. Follow Liz: Twitter – @lizthegrey LinkedIn – https://www.linkedin.com/in/efong/ Website – https://www.lizthegrey.com/ GitHub – https://github.com/lizthegrey Our Sponsor Today's episode is proudly sponsored by Skills Matter, the global community and events platform for software professionals. Skills Matter is an easier way for technologists to grow their careers by connecting you and your peers with the best-in-class tech industry experts and communities. You get on-demand access to their latest content, thought leadership insights as well as the exciting schedule of tech events running across all time zones. Head on over to skillsmatter.com to become part of the tech community that matters most to you - it's free to join and easy to keep up with the latest tech trends. Like this episode? Subscribe on your favorite podcast app and submit your feedback. Follow @techleadjournal on LinkedIn, Twitter, and Instagram. Pledge your support by becoming a patron. For more info about the episode (including quotes and transcript), visit techleadjournal.dev/episodes/88.
SDxCentral 2-Minute Weekly Wrap for May 13, 2022 Plus, Vodafone released its cloud-native Unified Performance Management platform, and Dish is closer to hitting its 5G spectrum license deadline. Google, Microsoft, Apple Move One-Step Toward Passwordless Vodafone Grabs Google Cloud For AI-Powered 5G Data Hub Dish 5G Launch Races Against Time, Complexity SDxCentral 2-Minute Weekly Wrap Podcast Full Transcript Today is May 13, 2022, and this is the SDxCentral 2-Minute Weekly Wrap where we cover the week's top stories on next-generation IT infrastructure. On World Password Day, Apple, Google, and Microsoft teamed up to expand support for a passwordless sign-in standard built by the FIDO (Fi-Doh) Alliance and the World Wide Web Consortium. The parties and the Cybersecurity and Infrastructure Security Agency called this move a “milestone” on the passwordless journey. FIDO argues that password-only authentication is one of the biggest security issues on the web, leading to customer frustration, account takeovers, data breaches, or even stolen identities. The passwordless standard-based capabilities allow users to sign in through the same actions as unlocking their devices via biometric verification or a device pin. FIDO claims this approach is more secure than traditional passwords or legacy multi-factor technologies using one-time, SMS (s-m-s) passcodes. Vodafone this week released its cloud-native Unified Performance Management platform that the operator claims will provide a more reliable mobile experience to Vodafone's European customers. The platform taps Google Cloud's smart analytics portfolio, artificial intelligence, and machine learning tools, along with Cardinality.io's (Car-di-nal-i-ti-dot-i-o) cloud-native DataOps and analytics platform to update Vodafone's pan-European networks more quickly and efficiently via hybrid cloud architecture. The unified performance management platform, which is being launched in 11 European countries, will allow operating companies to gather billions of network performance data points, effectively replacing more than 100 distinct network performance applications used today. This will provide a single source of clean data located in the cloud that companies can analyze and optimize for using artificial intelligence. Dish Network remains steadfast that it will hit looming 5G spectrum license coverage requirements despite admitted past technical snags in launching its cloud-native network. The mobile virtual network operator controls a host of wireless spectrum licenses. Some of those licenses are tagged with very specific coverage requirements. The most pressing is that Dish must cover 20% of the population encompassed within those licensed areas by June. Dish CEO Erik Carlson previously stated that the company had more than 25 major metro markets ready to be deployed before the deadline, including around 100 smaller cities across the country. The company did take its first steps toward that goal this week by launching commercial operations on its own network in Las Vegas. However, it admits there is still work to be done. Learn more about your ad choices. Visit megaphone.fm/adchoices
In this paper, we provide an in-depth analysis of how to tackle high cardinality categorical features with the quantile. Our proposal outperforms state-of-the-art encoders, including the traditional statistical mean target encoder, when considering the Mean Absolute Error, especially in the presence of long tailed or skewed distributions. 2021: Carlos Mougán, D. Masip, Jordi Nin, O. Pujol https://arxiv.org/pdf/2105.13783v2.pdf
Twitch channel with the full show: https://www.twitch.tv/terminusdbAlso check out Matthijs' coding sessions on twitch - they are a delight, you won't regret.As I don't have any documentation on cardinality, I will just plug this new blog instead:8 Reasons to Version Control your database: https://terminusdb.com/blog/8-reasons-to-version-control-your-database/Data mesh by it's founder Zhamak Dehghani - this is an essential introduction to the ideas: https://martinfowler.com/articles/data-mesh-principles.htmlThis is a good engineering intro to data mesh. I appreciated its straightforward descriptions: https://www.datamesh-architecture.com/Three laws of robotics: https://en.wikipedia.org/wiki/Three_Laws_of_Robotics I, Robot (by Asimov - not the 'adaptation' Will Smith movie): https://en.wikipedia.org/wiki/I,_RobotThe robot vacuum cleaner company is called 'I, Robot': https://www.irobot.ie/roomba
get the earbuds in & a light jacket on for a walk to vibe with Cardinality's debut album! we dug deep with AG (he/him) & Ty (they/them) and just how all that magic happens with this ethereal expression of sound :D it's queer, it's intimate, it's bold, it's everything ya need to have an out of body experience with these two crafting untethered moments of artistry now available on Spotify! catch the band on IG: _cardinality_
In this Episode we explain what is Cardinality in databases. Listen and enjoy! Further Reading on Indexes and Cardinality: Indexes Cardinality Don't forget to share feedback on twitter @whattheperf and share episode suggestions at whattheperf@gmail.com Tech News: Britain tracks Covid patients in Excel India leads world in number of financial transactions per day AWSome Day Online Conference What The Crash!: Florida Voter Registration website crash Delhi University Admission Website crash NEET Results website crash
Software Engineering Radio - The Podcast for Professional Software Developers
Rob Skillington discusses the architecture, data management, and operational issues around monitoring and alerting systems with a large number of metrics and resources.
Rabbi Aaron L Raskin: Jewish Holidays, Names, Hebrew Letters and Torah Topics
This podcast is powered by JewishPodcasts.org. Start your own podcast today and share your content with the world. Click jewishpodcasts.fm/signup to get started.
Rabbi Aaron L Raskin: Jewish Holidays, Names, Hebrew Letters and Torah Topics
Splunk [Data Fabric Search and Data Stream Processor] 2019 .conf Videos w/ Slides
How would you go about exploring your data assets using Splunk’s newly available Data Fabric Search? What should you expect when you adopt Data Fabric Search for your Splunk deployments? We will show you how to go about enriching your Splunk searches and navigating through the different search phases to effectively utilize your resources. Speaker(s) Nikhil Roy, Principal Software Engineer, Splunk Asha Andrade, Principal Software Engineer, Data Fabric Search, Splunk Slides PDF link - https://conf.splunk.com/files/2019/slides/FN2143.pdf?podcast=1577146267 Product: Splunk Data Fabric Search and Data Stream Processor Track: Foundations/Platform Level: Intermediate
How would you go about exploring your data assets using Splunk’s newly available Data Fabric Search? What should you expect when you adopt Data Fabric Search for your Splunk deployments? We will show you how to go about enriching your Splunk searches and navigating through the different search phases to effectively utilize your resources. Speaker(s) Nikhil Roy, Principal Software Engineer, Splunk Asha Andrade, Principal Software Engineer, Data Fabric Search, Splunk Slides PDF link - https://conf.splunk.com/files/2019/slides/FN2143.pdf?podcast=1577146224 Product: Splunk Data Fabric Search and Data Stream Processor Track: Foundations/Platform Level: Intermediate
Splunk [Foundations/Platform Track] 2019 .conf Videos w/ Slides
How would you go about exploring your data assets using Splunk’s newly available Data Fabric Search? What should you expect when you adopt Data Fabric Search for your Splunk deployments? We will show you how to go about enriching your Splunk searches and navigating through the different search phases to effectively utilize your resources. Speaker(s) Nikhil Roy, Principal Software Engineer, Splunk Asha Andrade, Principal Software Engineer, Data Fabric Search, Splunk Slides PDF link - https://conf.splunk.com/files/2019/slides/FN2143.pdf?podcast=1577146201 Product: Splunk Data Fabric Search and Data Stream Processor Track: Foundations/Platform Level: Intermediate
The conclusion of the nineteenth lecture in Dr Joel Feinstein's G11FPM Foundations of Pure Mathematics module covers a brief discussion of sequences using up sets. Proof that the union of two countable sets is countable. These videos are also available on YouTube at: https://www.youtube.com/playlist?list=PLpRE0Zu_k-BzsKBqQ-HEqD6WVLIHSNuXa Dr Feinstein's blog may be viewed at: http://explainingmaths.wordpress.com Dr Joel Feinstein is an Associate Professor in Pure Mathematics at the Universit
The nineteenth lecture in Dr Joel Feinstein's G11FPM Foundations of Pure Mathematics module covers a brief discussion of different sizes of infinity. (The video “Beyond Infinity” may also be helpful here.) Sets with the same cardinality. Countable sets and uncountable sets defined. Connections between functions and sequences. Countability in terms of sequences. The union of two countable sets is countable. (Proved at the start of next class.) These videos are also available on YouTube a
The sixteenth lecture in Dr Joel Feinstein's G11FPM Foundations of Pure Mathematics module covers cardinality of finite sets: n-sets and k-subsets of sets. Discussion of injections, surjections and bijections between finite sets, including the Pigeonhole Principle. Contrast with infinite sets. Cardinalities of unions of sets: the Inclusion-Exclusion Principle (see Workshop 8 for more details). Binomial coefficients and the Binomial Theorem. These videos are also available on YouTube at: https
There are vastly many more real numbers than fractions! The MathFactor CD hidden track.
How to do infinitely many things in a few minutes!
Are there more fractions than counting numbers?
An old jalopy zooms up and down Arkansas hills...
Are there only half as many even numbers as counting numbers?
A hotel with an infinite number of rooms, all filled, but room for more?
Finer and finer, a ruler with infinitely many marks...
Which way leads out of the forest of Liars & Truth-tellers? This is the first episode on the Mathfactor CD, and the first mathfactor episode podcast to the world! For more information, listen to the podcast and send us an email! The Mathfactor is broadcast each Sunday morning on KUAF 91.3 FM, in Fayetteville Arkansas.