Podcasts about jeremy you

Jesse Lee Peterson Radio Show

Play Episode Listen Later Sep 24, 2025 64:07

Elizabeth Figura is a Wine developer at Code Weavers. We discuss how Wine and Proton make it possible to run Windows applications on other operating systems. Related links WineHQ Proton Crossover Direct3D MoltenVK XAudio2 Mesa 3D Graphics Library Transcript You can help correct transcripts on GitHub. Intro [00:00:00] Jeremy: Today I am talking to Elizabeth Figuera. She's a wine developer at Code Weavers. And today we're gonna talk about what that is and, uh, all the work that goes into it. [00:00:09] Elizabeth: Thank you Jeremy. I'm glad to be here. What's Wine [00:00:13] Jeremy: I think the first thing we should talk about is maybe saying what Wine is because I think a lot of people aren't familiar with the project. [00:00:20] Elizabeth: So wine is a translation layer. in fact, I would say wine is a Windows emulator. That is what the name originally stood for. it re implements the entire windows. Or you say win 32 API. so that programs that make calls into the API, will then transfer that code to wine and and we allow that Windows programs to run on, things that are not windows. So Linux, Mac, os, other operating systems such as Solaris and BSD. it works not by emulating the CPU, but by re-implementing every API, basically from scratch and translating them to their equivalent or writing new code in case there is no, you know, equivalent. System Calls [00:01:06] Jeremy: I believe what you're doing is you're emulating system calls. Could you explain what those are and, and how that relates to the project? [00:01:15] Elizabeth: Yeah. so system call in general can be used, referred to a call into the operating system, to execute some functionality that's built into the operating system. often it's used in the context of talking to the kernel windows applications actually tend to talk at a much higher level, because there's so much, so much high level functionality built into Windows. When you think about, as opposed to other operating systems that we basically, we end up end implementing much higher level behavior than you would on Linux. [00:01:49] Jeremy: And can you give some examples of what some of those system calls would be and, I suppose how they may be higher level than some of the Linux ones. [00:01:57] Elizabeth: Sure. So of course you have like low level calls like interacting with a file system, you know, created file and read and write and such. you also have, uh, high level APIs who interact with a sound driver. [00:02:12] Elizabeth: There's, uh, one I was working on earlier today, called XAudio where you, actually, you know, build this bank of of sounds. It's meant to be, played in a game and then you can position them in various 3D space. And the, and the operating system in a sense will, take care of all of the math that goes into making that work. [00:02:36] Elizabeth: That's all running on your computer and. And then it'll send that audio data to the sound card once it's transformed it. So it sounds like it's coming from a certain space. a lot of other things like, you know, parsing XML is another big one. That there's a lot of things. The, there, the, the, the space is honestly huge [00:02:59] Jeremy: And yeah, I can sort of see how those might be things you might not expect to be done by the operating system. Like you gave the example of 3D audio and XML parsing and I think XML parsing in, in particular, you would've thought that that would be something that would be handled by the, the standard library of whatever language the person was writing their application as. [00:03:22] Jeremy: So that's interesting that it's built into the os. [00:03:25] Elizabeth: Yeah. Well, and languages like, see it's not, it isn't even part of the standard library. It's higher level than that. It's, you have specific libraries that are widespread but not. Codified in a standard, but in Windows you, in Windows, they are part of the operating system. And in fact, there's several different, XML parsers in the operating system. Microsoft likes to deprecate old APIs and make new ones that do the same thing very often. [00:03:53] Jeremy: And something I've heard about Windows is that they're typically very reluctant to break backwards compatibility. So you say they're deprecated, but do they typically keep all of them still in there? [00:04:04] Elizabeth: It all still It all still works. [00:04:07] Jeremy: And that's all things that wine has to implement as well to make sure that the software works as well. [00:04:14] Jeremy: Yeah. [00:04:14] Elizabeth: Yeah. And, and we also, you know, need to make it work. we also need to implement those things to make old, programs work because there is, uh, a lot of demand, at least from, at least from people using wine for making, for getting some really old programs, working from the. Early nineties even. What people run with Wine (Productivity, build systems, servers) [00:04:36] Jeremy: And that's probably a good, thing to talk about in terms of what, what are the types of software that, that people are trying to run with wine, and what operating system are they typically using? [00:04:46] Elizabeth: Oh, in terms of software, literally all kinds, any software you can imagine that runs on Windows, people will try to run it on wine. So we're talking games, office software productivity, software accounting. people will run, build systems on wine, build their, just run, uh, build their programs using, on visual studio, running on wine. people will run wine on servers, for example, like software as a service kind of things where you don't even know that it's running on wine. really super domain specific stuff. Like I've run astronomy, software, and wine. Design, computer assisted design, even hardware drivers can sometimes work unwind. There's a bit of a gray area. How games are different [00:05:29] Jeremy: Yeah, it's um, I think from. Maybe the general public, or at least from what I've seen, I think a lot of people's exposure to it is for playing games. is there something different about games versus all those other types of, productivity software and office software that, that makes supporting those different. [00:05:53] Elizabeth: Um, there's some things about it that are different. Games of course have gotten a lot of publicity lately because there's been a huge push, largely from valve, but also some other companies to get. A lot of huge, wide range of games working well under wine. And that's really panned out in the, in a way, I think, I think we've largely succeeded. [00:06:13] Elizabeth: We've made huge strides in the past several years. 5, 5, 10 years, I think. so when you talk about what makes games different, I think, one thing games tend to do is they have a very limited set of things they're working with and they often want to make things run fast, and so they're working very close to the me They're not, they're not gonna use an XML parser, for example. [00:06:44] Elizabeth: They're just gonna talk directly as, directly to the graphics driver as they can. Right. And, and probably going to do all their own sound design. You know, I did talk about that XAudio library, but a lot of games will just talk directly as, directly to the sound driver as Windows Let some, so this is a often a blessing, honestly, because it means there's less we have to implement to make them work. when you look at a lot of productivity applications, and especially, the other thing that makes some productivity applications harder is, Microsoft makes 'em, and They like to, make a library, for use in this one program like Microsoft Office and then say, well, you know, other programs might use this as well. Let's. Put it in the operating system and expose it and write an API for it and everything. And maybe some other programs use it. mostly it's just office, but it means that office relies on a lot of things from the operating system that we all have to reimplement. [00:07:44] Jeremy: Yeah, that's somewhat counterintuitive because when you think of games, you think of these really high performance things that that seem really complicated. But it sounds like from what you're saying, because they use the lower level primitives, they're actually easier in some ways to support. [00:08:01] Elizabeth: Yeah, certainly in some ways, they, yeah, they'll do things like re-implement the heap allocator because the built-in heap allocator isn't fast enough for them. That's another good example. What makes some applications hard to support (Some are hard, can't debug other people's apps) [00:08:16] Jeremy: You mentioned Microsoft's more modern, uh, office suites. I, I've noticed there's certain applications that, that aren't supported. Like, for example, I think the modern Adobe Creative Suite. What's the difference with software like that and does that also apply to the modern office suite, or is, or is that actually supported? [00:08:39] Elizabeth: Well, in one case you have, things like Microsoft using their own APIs that I mentioned with Adobe. That applies less, I suppose, but I think to some degree, I think to some degree the answer is that some applications are just hard and there's, and, and there's no way around it. And, and we can only spend so much time on a hard application. I. Debugging things. Debugging things can get very hard with wine. Let's, let me like explain that for a minute because, Because normally when you think about debugging an application, you say, oh, I'm gonna open up my debugger, pop it in, uh, break at this point, see what like all the variables are, or they're not what I expect. Or maybe wait for it to crash and then get a back trace and see where it crashed. And why you can't do that with wine, because you don't have the application, you don't have the symbols, you don't have your debugging symbols. You don't know anything about the code you're running unless you take the time to disassemble and decompile and read through it. And that's difficult every time. It's not only difficult, every time I've, I've looked at a program and been like, I really need to just. I'm gonna just try and figure out what the program is doing. [00:10:00] Elizabeth: It takes so much time and it is never worth it. And sometimes you have to, sometimes you have no other choice, but usually you end up, you ask to rely on seeing what calls it makes into the operating system and trying to guess which one of those is going wrong. Now, sometimes you'll get lucky and it'll crash in wine code, or sometimes it'll make a call into, a function that we don't implement yet, and we know, oh, we need to implement that function. But sometimes it does something, more obscure and we have to figure out, well, like all of these millions of calls it made, which one of them is, which one of them are we implementing incorrectly? So it's returning the wrong result or not doing something that it should. And, then you add onto that the. You know, all these sort of harder to debug things like memory errors that we could make. And it's, it can be very difficult and so sometimes some applications just suffer from those hard bugs. and sometimes it's also just a matter of not enough demand for something for us to spend a lot of time on it. [00:11:11] Elizabeth: Right. [00:11:14] Jeremy: Yeah, I can see how that would be really challenging because you're, like you were saying, you don't have the symbols, so you don't have the source code, so you don't know what any of this software you're supporting, how it was actually written. And you were saying that I. A lot of times, you know, there may be some behavior that's wrong or a crash, but it's not because wine crashed or there was an error in wine. [00:11:42] Jeremy: so you just know the system calls it made, but you don't know which of the system calls didn't behave the way that the application expected. [00:11:50] Elizabeth: Exactly. Test suite (Half the code is tests) [00:11:52] Jeremy: I can see how that would be really challenging. and wine runs so many different applications. I'm, I'm kind of curious how do you even track what's working and what's not as you, you change wine because if you support thousands or tens thousands of applications, you know, how do you know when you've got a, a regression or not? [00:12:15] Elizabeth: So, it's a great question. Um, probably over half of wine by like source code volume. I actually actually check what it is, but I think it's, i, I, I think it's probably over half is what we call is tests. And these tests serve two purposes. The one purpose is a regression test. And the other purpose is they're conformance tests that test, that test how, uh, an API behaves on windows and validates that we are behaving the same way. So we write all these tests, we run them on windows and you know, write the tests to check what the windows returns, and then we run 'em on wine and make sure that that matches. and we have just such a huge body of tests to make sure that, you know, we're not breaking anything. And that every, every, all the code that we, that we get into wine that looks like, wow, it's doing that really well. Nope, that's what Windows does. The test says so. So pretty much any code that we, any new code that we get, it has to have tests to validate, to, to demonstrate that it's doing the right thing. [00:13:31] Jeremy: And so rather than testing against a specific application, seeing if it works, you're making a call to a Windows system call, seeing how it responds, and then making the same call within wine and just making sure they match. [00:13:48] Elizabeth: Yes, exactly. And that is obviously, or that is a lot more, automatable, right? Because otherwise you have to manually, you know, there's all, these are all graphical applications. [00:14:02] Elizabeth: You'd have to manually do the things and make sure they work. Um, but if you write automateable tests, you can just run them all and the machine will complain at you if it fails it continuous integration. How compatibility problems appear to users [00:14:13] Jeremy: And because there's all these potential compatibility issues where maybe a certain call doesn't behave the way an application expects. What, what are the types of what that shows when someone's using software? I mean, I, I think you mentioned crashes, but I imagine there could be all sorts of other types of behavior. [00:14:37] Elizabeth: Yes, very much so. basically anything, anything you can imagine again is, is what will happen. You can have, crashes are the easy ones because you know when and where it crashed and you can work backwards from there. but you can also get, it can, it could hang, it could not render, right? Like maybe render a black screen. for, you know, for games you could very frequently have, graphical glitches where maybe some objects won't render right? Or the entire screen will be read. Who knows? in a very bad case, you could even bring down your system and we usually say that's not wine's fault. That's the graphics library's fault. 'cause they're not supposed to do that, uh, no matter what we do. But, you know, sometimes we have to work around that anyway. but yeah, there's, there's been some very strange and idiosyncratic bugs out there too. [00:15:33] Jeremy: Yeah. And like you mentioned that uh, there's so many different things that could have gone wrong that imagine's very difficult to find. Yeah. And when software runs through wine, I think, Performance is comparable to native [00:15:49] Jeremy: A lot of our listeners will probably be familiar with running things in a virtual machine, and they know that there's a big performance impact from doing that. [00:15:57] Jeremy: How does the performance of applications compare to running natively on the original Windows OS versus virtual machines? [00:16:08] Elizabeth: So. In theory. and I, I haven't actually done this recently, so I can't speak too much to that, but in theory, the idea is it's a lot faster. so there, there, is a bit of a joke acronym to wine. wine is not an emulator, even though I started out by saying wine is an emulator, and it was originally called a Windows emulator. but what this basically means is wine is not a CPU emulator. It doesn't, when you think about emulators in a general sense, they're often, they're often emulators for specific CPUs, often older ones like, you know, the Commodore emulator or an Amiga emulator. but in this case, you have software that's written for an x86 CPU. And it's running on an x86 CPU by giving it the same instructions that it's giving on windows. It's just that when it says, now call this Windows function, it calls us instead. So that all should perform exactly the same. The only performance difference at that point is that all should perform exactly the same as opposed to a, virtual machine where you have to interpret the instructions and maybe translate them to a different instruction set. The only performance difference is going to be, in the functions that we are implementing themselves and we try to, we try to implement them to perform. As well, or almost as well as windows. There's always going to be a bit of a theoretical gap because we have to translate from say, one API to another, but we try to make that as little as possible. And in some cases, the operating system we're running on is, is just better than Windows and the libraries we're using are better than Windows. [00:18:01] Elizabeth: And so our games will run faster, for example. sometimes we can, sometimes we can, do a better job than Windows at implementing something that's, that's under our purview. there there are some games that do actually run a little bit faster in wine than they do on Windows. [00:18:22] Jeremy: Yeah, that, that reminds me of how there's these uh, gaming handhelds out now, and some of the same ones, they have a, they either let you install Linux or install windows, or they just come with a pre-installed, and I believe what I've read is that oftentimes running the same game on both operating systems, running the same game on Linux, the battery life is better and sometimes even the performance is better with these handhelds. [00:18:53] Jeremy: So it's, it's really interesting that that can even be the case. [00:18:57] Elizabeth: Yeah, it's really a testament to the huge amount of work that's gone into that, both on the wine side and on the, side of the graphics team and the colonel team. And, and of course, you know, the years of, the years of, work that's gone into Linux, even before these gaming handhelds were, were even under consideration. Proton and Valve Software's role [00:19:21] Jeremy: And something. So for people who are familiar with the handhelds, like the steam deck, they may have heard of proton. Uh, I wonder if you can explain what proton is and how it relates to wine. [00:19:37] Elizabeth: Yeah. So, proton is basically, how do I describe this? So, proton is a sort of a fork, uh, although we try to avoid the term fork. It's a, we say it's a downstream distribution because we contribute back up to wine. so it is a, it is, it is a alternate distribution fork of wine. And it's also some code that basically glues wine into, an embedding application originally intended for steam, and developed for valve. it has also been used in, others, but it has also been used in other software. it, so where proton differs from wine besides the glue part is it has some, it has some extra hacks in it for bugs that are hard to fix and easy to hack around as some quick hacks for, making games work now that are like in the process of going upstream to wine and getting their code quality improved and going through review. [00:20:54] Elizabeth: But we want the game to work now, when we distribute it. So that'll, that'll go into proton immediately. And then once we have, once the patch makes it upstream, we replace it with the version of the patch from upstream. there's other things to make it interact nicely with steam and so on. And yeah, I think, yeah, I think that's, I got it. [00:21:19] Jeremy: Yeah. And I think for people who aren't familiar, steam is like this, um, I, I don't even know what you call it, like a gaming store and a [00:21:29] Elizabeth: store game distribution service. it's got a huge variety of games on it, and you just publish. And, and it's a great way for publishers to interact with their, you know, with a wider gaming community, uh, after it, just after paying a cut to valve of their profits, they can reach a lot of people that way. And because all these games are on team and, valve wants them to work well on, on their handheld, they contracted us to basically take their entire catalog, which is huge, enormous. And trying and just step by step. Fix every game and make them all work. [00:22:10] Jeremy: So, um, and I guess for people who aren't familiar Valve, uh, softwares the company that runs steam, and so it sounds like they've asked, uh, your company to, to help improve the compatibility of their catalog. [00:22:24] Elizabeth: Yes. valve contracted us and, and again, when you're talking about wine using lower level libraries, they've also contracted a lot of other people outside of wine. Basically, the entire stack has had a tremendous, tremendous investment by valve software to make gaming on Linux work. Well. The entire stack receives changes to improve Wine compatibility [00:22:48] Jeremy: And when you refer to the entire stack, like what are some, some of those pieces, at least at a high level. [00:22:54] Elizabeth: I, I would, let's see, let me think. There is the wine project, the. Mesa Graphics Libraries. that's a, that's another, you know, uh, open source, software project that existed, has existed for a long time. But Valve has put a lot of, uh, funding and effort into it, the Linux kernel in various different ways. [00:23:17] Elizabeth: the, the desktop, uh, environment and Window Manager for, um, are also things they've invested in. [00:23:26] Jeremy: yeah. Everything that the game needs, on any level and, and that the, and that the operating system of the handheld device needs. Wine's history [00:23:37] Jeremy: And wine's been going on for quite a while. I think it's over a decade, right? [00:23:44] Elizabeth: I believe. Oh, more than, oh, far more than a decade. I believe it started in 1990, I wanna say about 1995, mid nineties. I'm, I probably have that date wrong. I believe Wine started about the mid nineties. [00:24:00] Jeremy: Mm. [00:24:00] Elizabeth: it's going on for three decades at this rate. [00:24:03] Jeremy: Wow. Okay. [00:24:06] Jeremy: And so all this time, how has the, the project sort of sustained itself? Like who's been involved and how has it been able to keep going this long? [00:24:18] Elizabeth: Uh, I think as is the case with a lot of free software, it just, it just keeps trudging along. There's been. There's been times where there's a lot of interest in wine. There's been times where there's less, and we are fortunate to be in a time where there's a lot of interest in it. we've had the same maintainer for almost this entire, almost this entire existence. Uh, Alexander Julliard, there was one person starting who started, maintained it before him and, uh, left it maintainer ship to him after a year or two. Uh, Bob Amstat. And there has been a few, there's been a few developers who have been around for a very long time. a lot of developers who have been around for a decent amount of time, but not for the entire duration. And then a very, very large number of people who come and submit a one-off fix for their individual application that they want to make work. [00:25:19] Jeremy: How does crossover relate to the wine project? Like, it sounds like you had mentioned Valve software hired you for subcontract work, but crossover itself has been around for quite a while. So how, how has that been connected to the wine project? [00:25:37] Elizabeth: So I work for, so the, so the company I work for is Code Weavers and, crossover is our flagship software. so Code Weavers is a couple different things. We have a sort of a porting service where companies will come to us and say, can we port my application usually to Mac? And then we also have a retail service where Where we basically have our own, similar to Proton, but you know, older, but the same idea where we will add some hacks into it for very difficult to solve bugs and we have a, a nice graphical interface. And then, the other thing that we're selling with crossover is support. So if you, you know, try to run a certain application and you buy crossover, you can submit a ticket saying this doesn't work and we now have a financial incentive to fix it. You know, we'll try to, we'll try to fix your, we'll spend company resources to fix your bug, right? So that's been so, so code we v has been around since 1996 and crossover, I don't know the date, but it's crossover has been around for probably about two decades, if I'm not mistaken. [00:27:01] Jeremy: And when you mention helping companies port their software to, for example, MacOS. [00:27:07] Jeremy: Is the approach that you would port it natively to MacOS APIs or is it that you would help them get it running using wine on MacOS? [00:27:21] Elizabeth: Right. That's, so that's basically what makes us so unique among porting companies is that instead of rewriting their software, we just, we just basically stick it inside of crossover and, uh, and, and make it run. [00:27:36] Elizabeth: And the idea has always been, you know, the more we implement, the more we get correct, the, the more applications will, you know, work. And sometimes it works out that way. Sometimes not really so much. And there's always work we have to do to get any given application to work, but. Yeah, so it's, it's very unusual because we don't ask companies for any of their code. We don't need it. We just fix the windows API [00:28:07] Jeremy: And, and so in that case, the ports would be let's say someone sells a MacOS version of their software. They would bundle crossover, uh, with their software. [00:28:18] Elizabeth: Right? And usually when you do this, it doesn't look like there's crossover there. Like it just looks like this software is native, but there is soft, there is crossover under the hood. Loading executables and linked libraries [00:28:32] Jeremy: And so earlier we were talking about how you're basically intercepting the system calls that these binaries are making, whether that's the executable or the, the DLLs from Windows. Um, but I think probably a lot of our listeners are not really sure how that's done. Like they, they may have built software, but they don't know, how do I basically hijack, the system calls that this application is making. [00:29:01] Jeremy: So maybe you could talk a little bit about how that works. [00:29:04] Elizabeth: So there, so there's a couple steps to go into it. when you think about a program that's say, that's a big, a big file that's got all the machine code in it, and then it's got stuff at the beginning saying, here's how the program works and here's where in the file the processor should start running. that's, that's your EXE file. And then in your DLL files are libraries that contain shared code and you have like a similar sort of file. It says, here's the entry point. That runs this function, this, you know, this pars XML function or whatever have you. [00:29:42] Elizabeth: And here's this entry point that has the generate XML function and so on and so forth. And, and, then the operating system will basically take the EXE file and see all the bits in it. Say I want to call the pars XML function. It'll load that DLL and hook it up. So it, so the processor ends up just seeing jump directly to this pars XML function and then run that and then return and so on. [00:30:14] Elizabeth: And so what wine does, is it part of wine? That's part of wine is a library, is that, you know, the implementing that parse XML and read XML function, but part of it is the loader, which is the part of the operating system that hooks everything together. And when we load, we. Redirect to our libraries. We don't have Windows libraries. [00:30:38] Elizabeth: We like, we redirect to ours and then we run our code. And then when you jump back to the program and yeah. [00:30:48] Jeremy: So it's the, the loader that's a part of wine. That's actually, I'm not sure if running the executable is the right term. [00:30:58] Elizabeth: no, I think that's, I think that's a good term. It's, it's, it's, it starts in a loader and then we say, okay, now run the, run the machine code and it's executable and then it runs and it jumps between our libraries and back and so on. [00:31:14] Jeremy: And like you were saying before, often times when it's trying to make a system call, it ends up being handled by a function that you've written in wine. And then that in turn will call the, the Linux system calls or the MacOS system calls to try and accomplish the, the same result. [00:31:36] Elizabeth: Right, exactly. [00:31:40] Jeremy: And something that I think maybe not everyone is familiar with is there's this concept of user space versus kernel space. you explain what the difference is? [00:31:51] Elizabeth: So the way I would explain, the way I would describe a kernel is it's the part of the operating system that can do anything, right? So any program, any code that runs on your computer is talking to the processor, and the processor has to be able to do anything the computer can do. [00:32:10] Elizabeth: It has to be able to talk to the hardware, it has to set up the memory space. That, so actually a very complicated task has to be able to switch to another task. and, and, and, and basically talk to another program and. You have to have something there that can do everything, but you don't want any program to be able to do everything. Um, not since the, not since the nineties. It's about when we realized that we can't do that. so the kernel is a part that can do everything. And when you need to do something that requires those, those permissions that you can't give everyone, you have to talk to the colonel and ask it, Hey, can you do this for me please? And in a very restricted way where it's only the safe things you can do. And a degree, it's also like a library, right? It's the kernel. The kernels have always existed, and since they've always just been the core standard library of the computer that does the, that does the things like read and write files, which are very, very complicated tasks under the hood, but look very simple because all you say is write this file. And talk to the hardware and abstract away all the difference between different drivers. So the kernel is doing all of these things. So because the kernel is a part that can do everything and because when you think about the kernel, it is basically one program that is always running on your computer, but it's only one program. So when a user calls the kernel, you are switching from one program to another and you're doing a lot of complicated things as part of this. You're switching to the higher privilege level where you can do anything and you're switching the state from one program to another. And so it's a it. So this is what we mean when we talk about user space, where you're running like a normal program and kernel space where you've suddenly switched into the kernel. [00:34:19] Elizabeth: Now you're executing with increased privileges in a different. idea of the process space and increased responsibility and so on. [00:34:30] Jeremy: And, and so do most applications. When you were talking about the system calls for handling 3D audio or parsing XML. Are those considered, are those system calls considered part of user space and then those things call the kernel space on your behalf, or how, how would you describe that? [00:34:50] Elizabeth: So most, so when you look at Windows, most of most of the Windows library, the vast, vast majority of it is all user space. most of these libraries that we implement never leave user space. They never need to call into the kernel. there's the, there only the core low level stuff. Things like, we need to read a file, that's a kernel call. when you need to sleep and wait for some seconds, that's a kernel. Need to talk to a different process. Things that interact with different processes in general. not just allocate memory, but allocate a page of memory, like a, from the memory manager and then that gets sub allocated by the heap allocator. so things like that. [00:35:31] Jeremy: Yeah, so if I was writing an application and I needed to open a file, for example, does, does that mean that I would have to communicate with the kernel to, to read that file? [00:35:43] Elizabeth: Right, exactly. [00:35:46] Jeremy: And so most applications, it sounds like it's gonna be a mixture. You're gonna have a lot of things that call user space calls. And then a few, you mentioned more low level ones that are gonna require you to communicate with the kernel. [00:36:00] Elizabeth: Yeah, basically. And it's worth noting that in, in all operating systems, you're, you're almost always gonna be calling a user space library. That might just be a thin wrapper over the kernel call. It might, it's gonna do like just a little bit of work in end call the kernel. [00:36:19] Jeremy: [00:36:19] Elizabeth: In fact, in Windows, that's the only way to do it. Uh, in many other operating systems, you can actually say, you can actually tell the processor to make the kernel call. There is a special instruction that does this and just, and it'll go directly to the kernel, and there's a defined interface for this. But in Windows, that interface is not defined. It's not stable. Or backwards compatible like the rest of Windows is. So even if you wanted to use it, you couldn't. and you basically have to call into the high level libraries or low level libraries, as it were, that, that tell you that create a file. And those don't do a lot. [00:37:00] Elizabeth: They just kind of tweak their parameters a little and then pass them right down to the kernel. [00:37:07] Jeremy: And so wine, it sounds like it needs to implement both the user space calls of windows, but then also the, the kernel, calls as well. But, but wine itself does that, is that only in Linux user space or MacOS user space? [00:37:27] Elizabeth: Yes. This is a very tricky thing. but all of wine, basically all of what is wine runs in, in user space and we use. Kernel calls that are already there to talk to the colonel, to talk to the host Colonel. You have to, and you, you get, you get, you get the sort of second nature of thinking about the Windows, user space and kernel. [00:37:50] Elizabeth: And then there's a host user space and Kernel and wine is running all in user, in the user, in the host user space, but it's emulating the Windows kernel. In fact, one of the weirdest, trickiest parts is I mentioned that you can run some drivers in wine. And those drivers actually, they actually are, they think they're running in the Windows kernel. which in a sense works the same way. It has libraries that it can load, and those drivers are basically libraries and they're making, kernel calls and they're, they're making calls into the kernel library that does some very, very low level tasks that. You're normally only supposed to be able to do in a kernel. And, you know, because the kernel requires some privileges, we kind of pretend we have them. And in many cases, you're even the drivers are using abstractions. We can just implement those abstractions kind of over the slightly higher level abstractions that exist in user space. [00:39:00] Jeremy: Yeah, I hadn't even considered the being able to use hardware devices, but I, I suppose if in, in the end, if you're reproducing the kernel, then whether you're running software or you're talking to a hardware device, as long as you implement the calls correctly, then I, I suppose it works. [00:39:18] Elizabeth: Cause you're, you're talking about device, like maybe it's some kind of USB device that has drivers for Windows, but it doesn't for, for Linux. [00:39:28] Elizabeth: no, that's exactly, that's a, that's kind of the, the example I've used. Uh, I think there is, I think I. My, one of my best success stories was, uh, drivers for a graphing calculator. [00:39:41] Jeremy: Oh, wow. [00:39:42] Elizabeth: That connected via USB and I basically just plugged the windows drivers into wine and, and ran it. And I had to implement a lot of things, but it worked. But for example, something like a graphics driver is not something you could implement in wine because you need the graphics driver on the host. We can't talk to the graphics driver while the host is already doing so. [00:40:05] Jeremy: I see. Yeah. And in that case it probably doesn't make sense to do so [00:40:11] Elizabeth: Right? [00:40:12] Elizabeth: Right. It doesn't because, the transition from user into kernel is complicated. You need the graphics driver to be in the kernel and the real kernel. Having it in wine would be a bad idea. Yeah. [00:40:25] Jeremy: I, I think there's, there's enough APIs you have to try and reproduce that. I, I think, uh, doing, doing something where, [00:40:32] Elizabeth: very difficult [00:40:33] Jeremy: right. Poor system call documentation and private APIs [00:40:35] Jeremy: There's so many different, calls both in user space and in kernel space. I imagine the, the user space ones Microsoft must document to some extent, but, oh. Is that, is that a [00:40:51] Elizabeth: well, sometimes, [00:40:54] Jeremy: Sometimes. Okay. [00:40:55] Elizabeth: I think it's actually better now than it used to be. But some, here's where things get fun, because sometimes there will be, you know, regular documented calls. Sometimes those calls are documented, but the documentation isn't very good. Sometimes programs will just sort of look inside Microsoft's DLLs and use calls that they aren't supposed to be using. Sometimes they use calls that they are supposed to be using, but the documentation has disappeared. just because it's that old of an API and Microsoft hasn't kept it around. sometimes some, sometimes Microsoft, Microsoft own software uses, APIs that were never documented because they never wanted anyone else using them, but they still ship them with the operating system. there was actually a kind of a lawsuit about this because it is an antitrust lawsuit, because by shipping things that only they could use, they were kind of creating a trust. and that got some things documented. At least in theory, they kind of haven't stopped doing it, though. [00:42:08] Jeremy: Oh, so even today they're, they're, I guess they would call those private, private APIs, I suppose. [00:42:14] Elizabeth: I suppose. Uh, yeah, you could say private APIs. but if we want to get, you know, newer versions of Microsoft Office running, we still have to figure out what they're doing and implement them. [00:42:25] Jeremy: And given that they're either, like you were saying, the documentation is kind of all over the place. If you don't know how it's supposed to behave, how do you even approach implementing them? [00:42:38] Elizabeth: and that's what the conformance tests are for. And I, yeah, I mentioned earlier we have this huge body of conformance tests that double is regression tests. if we see an API, we don't know what to do with or an API, we do know, we, we think we know what to do with because the documentation can just be wrong and often has been. Then we write tests to figure out what it's supposed to behave. We kind of guess until we, and, and we write tests and we pass some things in and see what comes out and see what. The see what the operating system does until we figure out, oh, so this is what it's supposed to do and these are the exact parameters in, and, and then we, and, and then we implement it according to those tests. [00:43:24] Jeremy: Is there any distinction in approach for when you're trying to implement something that's at the user level versus the kernel level? [00:43:33] Elizabeth: No, not really. And like I, and like I mentioned earlier, like, well, I mean, a kernel call is just like a library call. It's just done in a slightly different way, but it's still got, you know, parameters in, it's still got a set of parameters. They're just encoded differently. And, and again, like the, the way kernel calls are done is on a level just above the kernel where you have a library, that just passes things through. Almost verbatim to the kernel and we implement that library instead. [00:44:10] Jeremy: And, and you've been working on i, I think, wine for over, over six years now. [00:44:18] Elizabeth: That sounds about right. Debugging and having broad knowledge of Wine [00:44:20] Jeremy: What does, uh, your, your day to day look like? What parts of the project do you, do you work on? [00:44:27] Elizabeth: It really varies from day to day. and I, I, a lot of people, a lot of, some people will work on the same parts of wine for years. Uh, some people will switch around and work on all sorts of different things. [00:44:42] Elizabeth: And I'm, I definitely belong to that second group. Like if you name an area of wine, I have almost certainly contributed a patch or two to it. there's some areas I work on more than others, like, 3D graphics, multimedia, a, I had, I worked on a compiler that exists, uh, socket. So networking communication is another thing I work a lot on. day to day, I kind of just get, I, I I kind of just get a bug for some program or another. and I take it and I debug it and figure out why the program's broken and then I fix it. And there's so much variety in that. because a bug can take so many different forms like I described, and, and, and the, and then the fix can be simple or complicated or, and it can be in really anywhere to a degree. [00:45:40] Elizabeth: being able to work on any part of wine is sometimes almost a necessity because if a program is just broken, you don't know why. It could be anything. It could be any sort of API. And sometimes you can hand the API to somebody who's got a lot of experience in that, but sometimes you just do whatever. You just fix whatever's broken and you get an experience that way. [00:46:06] Jeremy: Yeah, I mean, I was gonna ask about the specialized skills to, to work on wine, but it sounds like maybe in your case it's all of them. [00:46:15] Elizabeth: It's, there's a bit of that. it's a wine. We, the skills to work on wine are very, it's a very unique set of skills because, and it largely comes down to debugging because you can't use the tools you normally use debug. [00:46:30] Elizabeth: You have to, you have to be creative and think about it different ways. Sometimes you have to be very creative. and programs will try their hardest to avoid being debugged because they don't want anyone breaking their copy protection, for example, or or hacking, or, you know, hacking in sheets. They want to be, they want, they don't want anyone hacking them like that. [00:46:54] Elizabeth: And we have to do it anyway for good and legitimate purposes. We would argue to make them work better on more operating systems. And so we have to fight that every step of the way. [00:47:07] Jeremy: Yeah, it seems like it's a combination of. F being able, like you, you were saying, being able to, to debug. and you're debugging not necessarily your own code, but you're debugging this like behavior of, [00:47:25] Jeremy: And then based on that behavior, you have to figure out, okay, where in all these different systems within wine could this part be not working? [00:47:35] Jeremy: And I, I suppose you probably build up some kind of, mental map in your head of when you get a, a type of bug or a type of crash, you oh, maybe it's this, maybe it's here, or something [00:47:47] Elizabeth: Yeah. That, yeah, there is a lot of that. there's, you notice some patterns, you know, after experience helps, but because any bug could be new, sometimes experience doesn't help and you just, you just kind of have to start from scratch. Finding a bug related to XAudio [00:48:08] Jeremy: At sort of a high level, can you give an example of where you got a specific bug report and then where you had to look to eventually find which parts of the the system were the issue? [00:48:21] Elizabeth: one, one I think good example, that I've done recently. so I mentioned this, this XAudio library that does 3D audio. And if you say you come across a bug, I'm gonna be a little bit generics here and say you come across a bug where some audio isn't playing right, maybe there's, silence where there should be the audio. So you kind of, you look in and see, well, where's that getting lost? So you can basically look in the input calls and say, here's the buffer it's submitting that's got all the audio data in it. And you look at the output, you look at where you think the output should be, like, that library will internally call a different library, which programs can interact with directly. [00:49:03] Elizabeth: And this our high level library interacts with that is the, give this sound to the audio driver, right? So you've got XAudio on top of, um. mdev, API, which is the other library that gives audio to the driver. And you see, well, the ba the buffer is that XAudio is passing into MM Dev, dev API. They're empty, there's nothing in them. So you have to kind of work through the XAudio library to see where is, where's that sound getting lost? Or maybe, or maybe that's not getting lost. Maybe it's coming through all garbled. And I've had to look at the buffer and see why is it garbled. I'll open up it up in Audacity and look at the weight shape of the wave and say, huh, that shape of the wave looks like it's, it looks like we're putting silence every 10 nanoseconds or something, or, or reversing something or interpreting it wrong. things like that. Um, there's a lot of, you'll do a lot of, putting in print fs basically all throughout wine to see where does the state change. Where was, where is it? Where is it? Right? And then where do things start going wrong? [00:50:14] Jeremy: Yeah. And in the audio example, because they're making a call to your XAudio implementation, you can see that Okay, the, the buffer, the audio that's coming in. That part is good. It, it's just that later on when it sends it to what's gonna actually have it be played by the, the hardware, that's when missing. So, [00:50:37] Elizabeth: We did something wrong in a library that destroyed the buffer. And I think on a very, high level a lot of debugging, wine is about finding where things are good and finding where things are bad, and then narrowing that down until we find the one spot where things go wrong. There's a lot of processes that go like that. [00:50:57] Jeremy: like you were saying, the more you see these problems, hopefully the, the easier it gets to, to narrow down where, [00:51:04] Elizabeth: Often. Yeah. Especially if you keep debugging things in the same area. How much code is OS specific?c [00:51:09] Jeremy: And wine supports more than one operating system. I, I saw there was Linux, MacOS I think free BSD. How much of the code is operating system specific versus how much can just be shared across all of them? [00:51:27] Elizabeth: Not that much is operating system specific actually. so when you think about the volume of wine, the, the, the, vast majority of it is the high level code that doesn't need to interact with the operating system on a low level. Right? Because Windows keeps putting, because Microsoft keeps putting lots and lots of different libraries in their operating system. And a lot of these are high level libraries. and even when we do interact with the operating system, we're, we're using cross-platform libraries or we're using, we're using ics. The, uh, so all these operating systems that we are implementing are con, basically conformed to the posix standard. which is basically like Unix, they're all Unix based. Psic is a Unix based standard. Microsoft is, you know, the big exception that never did implement that. And, and so we have to translate its APIs to Unix, APIs. now that said, there is a lot of very operating system, specific code. Apple makes things difficult by try, by diverging almost wherever they can. And so we have a lot of Apple specific code in there. [00:52:46] Jeremy: another example I can think of is, I believe MacOS doesn't support, Vulkan [00:52:53] Elizabeth: yes. Yeah.Yeah, That's a, yeah, that's a great example of Mac not wanting to use, uh, generic libraries that work on every other operating system. and in some cases we, we look at it and are like, alright, we'll implement a wrapper for that too, on top of Yuri, on top of your, uh, operating system. We've done it for Windows, we can do it for Vulkan. and that's, and then you get the Molten VK project. Uh, and to be clear, we didn't invent molten vk. It was around before us. We have contributed a lot to it. Direct3d, Vulkan, and MoltenVK [00:53:28] Jeremy: Yeah, I think maybe just at a high level might be good to explain the relationship between Direct 3D or Direct X and Vulcan and um, yeah. Yeah. Maybe if you could go into that. [00:53:42] Elizabeth: so Direct 3D is Microsoft's 3D API. the 3D APIs, you know, are, are basically a way to, they're way to firstly abstract out the differences between different graphics, graphics cards, which, you know, look very different on a hardware level. [00:54:03] Elizabeth: Especially. They, they used to look very different and they still do look very different. and secondly, a way to deal with them at a high level because actually talking to the graphics card on a low level is very, very complicated. Even talking to it on a high level is complicated, but it gets, it can get a lot worse if you've ever been a, if you've ever done any graphics, driver development. so you have a, a number of different APIs that achieve these two goals of, of, abstraction and, and of, of, of building a common abstraction and of building a, a high level abstraction. so OpenGL is the broadly the free, the free operating system world, the non Microsoft's world's choice, back in the day. [00:54:53] Elizabeth: And then direct 3D was Microsoft's API and they've and Direct 3D. And both of these have evolved over time and come up with new versions and such. And when any, API exists for too long. It gains a lot of croft and needs to be replaced. And eventually, eventually the people who developed OpenGL decided we need to start over, get rid of the Croft to make it cleaner and make it lower level. [00:55:28] Elizabeth: Because to get in a maximum performance games really want low level access. And so they made Vulcan, Microsoft kind of did the same thing, but they still call it Direct 3D. they just, it's, it's their, the newest version of Direct 3D is lower level. It's called Direct 3D 12. and, and, Mac looked at this and they decided we're gonna do the same thing too, but we're not gonna use Vulcan. [00:55:52] Elizabeth: We're gonna define our own. And they call it metal. And so when we want to translate D 3D 12 into something that another operating system understands. That's probably Vulcan. And, and on Mac, we need to translate it to metal somehow. And we decided instead of having a separate layer from D three 12 to metal, we're just gonna translate it to Vulcan and then translate the Vulcan to metal. And it also lets things written for Vulcan on Windows, which is also a thing that exists that lets them work on metal. [00:56:30] Jeremy: And having to do that translation, does that have a performance impact or is that not really felt? [00:56:38] Elizabeth: yes. It's kind of like, it's kind of like anything, when you talk about performance, like I mentioned this earlier, there's always gonna be overhead from translating from one API to another. But we try to, what we, we put in heroic efforts to. And try, try to make sure that doesn't matter, to, to make sure that stuff that needs to be fast is really as fast as it can possibly be. [00:57:06] Elizabeth: And some very clever things have been done along those lines. and, sometimes the, you know, the graphics drivers underneath are so good that it actually does run better, even despite the translation overhead. And then sometimes to make it run fast, we need to say, well, we're gonna implement a new API that behaves more like windows, so we can do less work translating it. And that's, and sometimes that goes into the graphics library and sometimes that goes into other places. Targeting Wine instead of porting applications [00:57:43] Jeremy: Yeah. Something I've found a little bit interesting about the last few years is [00:57:49] Jeremy: Developers in the past, they would generally target Windows and you might be lucky to get a Mac port or a Linux port. And I wonder, like, in your opinion now, now that a lot of developers are just targeting Windows and relying on wine or, or proton to, to run their software, is there any, I suppose, downside to doing that? [00:58:17] Jeremy: Or is it all just upside, like everyone should target Windows as this common platform? [00:58:23] Elizabeth: Yeah. It's an interesting question. I, there's some people who seem to think it's a bad thing that, that we're not getting native ports in the same sense, and then there's some people who. Who See, no, that's a perfectly valid way to do ports just right for this defacto common API it was never intended as a cross platform common API, but we've made it one. [00:58:47] Elizabeth: Right? And so why is that any worse than if it runs on a different API on on Linux or Mac and I? Yeah, I, I, I guess I tend to, I, that that argument tends to make sense to me. I don't, I don't really see, I don't personally see a lot of reason for, to, to, to say that one library is more pure than another. [00:59:12] Elizabeth: Right now, I do think Windows APIs are generally pretty bad. I, I'm, this might be, you know, just some sort of, this might just be an effect of having to work with them for a very long time and see all their flaws and have to deal with the nonsense that they do. But I think that a lot of the. Native Linux APIs are better. But if you like your Windows API better. And if you want to target Windows and that's the only way to do it, then sure why not? What's wrong with that? [00:59:51] Jeremy: Yeah, and I think the, doing it this way, targeting Windows, I mean if you look in the past, even though you had some software that would be ported to other operating systems without this compatibility layer, without people just targeting Windows, all this software that people can now run on these portable gaming handhelds or on Linux, Most of that software was never gonna be ported. So yeah, absolutely. And [01:00:21] Elizabeth: that's [01:00:22] Jeremy: having that as an option. Yeah. [01:00:24] Elizabeth: That's kind of why wine existed, because people wanted to run their software. You know, that was never gonna be ported. They just wanted, and then the community just spent a lot of effort in, you know, making all these individual programs run. Yeah. [01:00:39] Jeremy: I think it's pretty, pretty amazing too that, that now that's become this official way, I suppose, of distributing your software where you say like, Hey, I made a Windows version, but you're on your Linux machine. it's officially supported because, we have this much belief in this compatibility layer. [01:01:02] Elizabeth: it's kind of incredible to see wine having got this far. I mean, I started working on a, you know, six, seven years ago, and even then, I could never have imagined it would be like this. [01:01:16] Elizabeth: So as we, we wrap up, for the developers that are listening or, or people who are just users of wine, um, is there anything you think they should know about the project that we haven't talked about? [01:01:31] Elizabeth: I don't think there's anything I can think of. [01:01:34] Jeremy: And if people wanna learn, uh, more about the wine project or, or see what you're up to, where, where should they, where should they head? Getting support and contributing [01:01:45] Elizabeth: We don't really have any things like news, unfortunately. Um, read the release notes, uh, follow some, there's some, there's some people who, from Code Weavers who do blogs. So if you, so if you go to codeweavers.com/blog, there's some, there's, there's some codeweavers stuff, uh, some marketing stuff. But there's also some developers who will talk about bugs that they are solving and. And how it's easy and, and the experience of working on wine. [01:02:18] Jeremy: And I suppose if, if someone's. Interested in like, like let's say they have a piece of software, it's not working through wine. what's the best place for them to, to either get help or maybe even get involved with, with trying to fix it? [01:02:37] Elizabeth: yeah. Uh, so you can file a bug on, winehq.org,or, or, you know, find, there's a lot of developer resources there and you can get involved with contributing to the software. And, uh, there, there's links to our mailing list and IRC channels and, uh, and, and the GitLab, where all places you can find developers. [01:03:02] Elizabeth: We love to help you. Debug things. We love to help you fix things. We try our very best to be a welcoming community and we have got a long, we've got a lot of experience working with people who want to get their application working. So, we would love to, we'd love to have another. [01:03:24] Jeremy: Very cool. Yeah, I think wine is a really interesting project because I think for, I guess it would've been for decades, it seemed like very niche, like not many people [01:03:37] Jeremy: were aware of it. And now I think maybe in particular because of the, the Linux gaming handhelds, like the steam deck,wine is now something that a bunch of people who would've never heard about it before, and now they're aware of it. [01:03:53] Elizabeth: Absolutely. I've watched that transformation happen in real time and it's been surreal. [01:04:00] Jeremy: Very cool. Well, Elizabeth, thank you so much for, for joining me today. [01:04:05] Elizabeth: Thank you, Jeremy. I've been glad to be here.

apple design games performance microsoft 3d poor wine os mac windows fix adobe api audacity usb colonel psic linux github valve apis yuri macos amiga loading cpu vulcans commodore figura solaris microsoft office croft redirects irc gitlab proton kernel vulkan unix cpus xml debugging bsd debug exe directx dll opengl adobe creative suite windows os dlls valve software elizabeth it elizabeth you jeremy you direct3d jeremy so elizabeth yeah elizabeth so

Be superior to your former self | JLP Thu 6-19-25

Play Episode Listen Later Jun 19, 2025 180:00

Brandon Liu on Protomaps

Play Episode Listen Later Apr 6, 2025 59:57

Brandon Liu is an open source developer and creator of the Protomaps basemap project. We talk about how static maps help developers build sites that last, the PMTiles file format, the role of OpenStreetMap, and his experience funding and running an open source project full time. Protomaps Protomaps PMTiles (File format used by Protomaps) Self-hosted slippy maps, for novices (like me) Why Deploy Protomaps on a CDN User examples Flickr Pinball Map Toilet Map Related projects OpenStreetMap (Dataset protomaps is based on) Mapzen (Former company that released details on what to display based on zoom levels) Mapbox GL JS (Mapbox developed source available map rendering library) MapLibre GL JS (Open source fork of Mapbox GL JS) Other links HTTP range requests (MDN) Hilbert curve Transcript You can help correct transcripts on GitHub. Intro [00:00:00] Jeremy: I'm talking to Brandon Liu. He's the creator of Protomaps, which is a way to easily create and host your own maps. Let's get into it. [00:00:09] Brandon: Hey, so thanks for having me on the podcast. So I'm Brandon. I work on an open source project called Protomaps. What it really is, is if you're a front end developer and you ever wanted to put maps on a website or on a mobile app, then Protomaps is sort of an open source solution for doing that that I hope is something that's way easier to use than, um, a lot of other open source projects. Why not just use Google Maps? [00:00:36] Jeremy: A lot of people are gonna be familiar with Google Maps. Why should they worry about whether something's open source? Why shouldn't they just go and use the Google maps API? [00:00:47] Brandon: So Google Maps is like an awesome thing it's an awesome product. Probably one of the best tech products ever right? And just to have a map that tells you what restaurants are open and something that I use like all the time especially like when you're traveling it has all that data. And the most amazing part is that it's free for consumers but it's not necessarily free for developers. Like if you wanted to embed that map onto your website or app, that usually has an API cost which still has a free tier and is affordable. But one motivation, one basic reason to use open source is if you have some project that doesn't really fit into that pricing model. You know like where you have to pay the cost of Google Maps, you have a side project, a nonprofit, that's one reason. But there's lots of other reasons related to flexibility or customization where you might want to use open source instead. Protomaps examples [00:01:49] Jeremy: Can you give some examples where people have used Protomaps and where that made sense for them? [00:01:56] Brandon: I follow a lot of the use cases and I also don't know about a lot of them because I don't have an API where I can track a hundred percent of the users. Some of them use the hosted version, but I would say most of them probably use it on their own infrastructure. One of the cool projects I've been seeing is called Toilet Map. And what toilet map is if you're in the UK and you want find a public restroom then it maps out, sort of crowdsourced all of the public restrooms. And that's important for like a lot of people if they have health issues, they need to find that information. And just a lot of different projects in the same vein. There's another one called Pinball Map which is sort of a hobby project to find all the pinball machines in the world. And they wanted to have a customized map that fit in with their theme of pinball. So these sorts of really cool indie projects are the ones I'm most excited about. Basemaps vs Overlays [00:02:57] Jeremy: And if we talk about, like the pinball map as an example, there's this concept of a basemap and then there's the things that you lay on top of it. What is a basemap and then is the pinball locations is that part of it or is that something separate? [00:03:12] Brandon: It's usually something separate. The example I usually use is if you go to a real estate site, like Zillow, you'll open up the map of Seattle and it has a bunch of pins showing all the houses, and then it has some information beneath it. That information beneath it is like labels telling, this neighborhood is Capitol Hill, or there is a park here. But all that information is common to a lot of use cases and it's not specific to real estate. So I think usually that's the distinction people use in the industry between like a base map versus your overlay. The overlay is like the data for your product or your company while the base map is something you could get from Google or from Protomaps or from Apple or from Mapbox that kind of thing. PMTiles for hosting the basemap and overlays [00:03:58] Jeremy: And so Protomaps in particular is responsible for the base map, and that information includes things like the streets and the locations of landmarks and things like that. Where is all that information coming from? [00:04:12] Brandon: So the base map information comes from a project called OpenStreetMap. And I would also, point out that for Protomaps as sort of an ecosystem. You can also put your overlay data into a format called PMTiles, which is sort of the core of what Protomaps is. So it can really do both. It can transform your data into the PMTiles format which you can host and you can also host the base map. So you kind of have both of those sides of the product in one solution. [00:04:43] Jeremy: And so when you say you have both are you saying that the PMTiles file can have, the base map in one file and then you would have the data you're laying on top in another file? Or what are you describing there? [00:04:57] Brandon: That's usually how I recommend to do it. Oftentimes there'll be sort of like, a really big basemap 'cause it has all of that data about like where the rivers are. Or while, if you want to put your map of toilets or park benches or pickleball courts on top, that's another file. But those are all just like assets you can move around like JSON or CSV files. Statically Hosted [00:05:19] Jeremy: And I think one of the things you mentioned was that your goal was to make Protomaps or the, the use of these PMTiles files easy to use. What does that look like for, for a developer? I wanna host a map. What do I actually need to, to put on my servers? [00:05:38] Brandon: So my usual pitch is that basically if you know how to use S3 or cloud storage, that you know how to deploy a map. And that, I think is the main sort of differentiation from most open source projects. Like a lot of them, they call themselves like, like some sort of self-hosted solution. But I've actually avoided using the term self-hosted because I think in most cases that implies a lot of complexity. Like you have to log into a Linux server or you have to use Kubernetes or some sort of Docker thing. What I really want to emphasize is the idea that, for Protomaps, it's self-hosted in the same way like CSS is self-hosted. So you don't really need a service from Amazon to host the JSON files or CSV files. It's really just a static file. [00:06:32] Jeremy: When you say static file that means you could use any static web host to host your HTML file, your JavaScript that actually renders the map. And then you have your PMTiles files, and you're not running a process or anything, you're just putting your files on a static file host. [00:06:50] Brandon: Right. So I think if you're a developer, you can also argue like a static file server is a server. It's you know, it's the cloud, it's just someone else's computer. It's really just nginx under the hood. But I think static storage is sort of special. If you look at things like static site generators, like Jekyll or Hugo, they're really popular because they're a commodity or like the storage is a commodity. And you can take your blog, make it a Jekyll blog, hosted on S3. One day, Amazon's like, we're charging three times as much so you can move it to a different cloud provider. And that's all vendor neutral. So I think that's really the special thing about static storage as a primitive on the web. Why running servers is a problem for resilience [00:07:36] Jeremy: Was there a prior experience you had? Like you've worked with maps for a very long time. Were there particular difficulties you had where you said I just gotta have something that can be statically hosted? [00:07:50] Brandon: That's sort of exactly why I got into this. I've been working sort of in and around the map space for over a decade, and Protomaps is really like me trying to solve the same problem I've had over and over again in the past, just like once and forever right? Because like once this problem is solved, like I don't need to deal with it again in the future. So I've worked at a couple of different companies before, mostly as a contractor, for like a humanitarian nonprofit for a design company doing things like, web applications to visualize climate change. Or for even like museums, like digital signage for museums. And oftentimes they had some sort of data visualization component, but always sort of the challenge of how to like, store and also distribute like that data was something that there wasn't really great open source solutions. So just for map data, that's really what motivated that design for Protomaps. [00:08:55] Jeremy: And in those, those projects in the past, were those things where you had to run your own server, run your own database, things like that? [00:09:04] Brandon: Yeah. And oftentimes we did, we would spin up an EC2 instance, for maybe one client and then we would have to host this server serving map data forever. Maybe the client goes away, or I guess it's good for business if you can sign some sort of like long-term support for that client saying, Hey, you know, like we're done with a project, but you can pay us to maintain the EC2 server for the next 10 years. And that's attractive. but it's also sort of a pain, because usually what happens is if people are given the choice, like a developer between like either I can manage the server on EC2 or on Rackspace or Hetzner or whatever, or I can go pay a SaaS to do it. In most cases, businesses will choose to pay the SaaS. So that's really like what creates a sort of lock-in is this preference for like, so I have this choice between like running the server or paying the SaaS. Like businesses will almost always go and pay the SaaS. [00:10:05] Jeremy: Yeah. And in this case, you either find some kind of free hosting or low-cost hosting just to host your files and you upload the files and then you're good from there. You don't need to maintain anything. [00:10:18] Brandon: Exactly, and that's really the ideal use case. so I have some users these, climate science consulting agencies, and then they might have like a one-off project where they have to generate the data once, but instead of having to maintain this server for the lifetime of that project, they just have a file on S3 and like, who cares? If that costs a couple dollars a month to run, that's fine, but it's not like S3 is gonna be deprecated, like it's gonna be on an insecure version of Ubuntu or something. So that's really the ideal, set of constraints for using Protomaps. [00:10:58] Jeremy: Yeah. Something this also makes me think about is, is like the resilience of sites like remaining online, because I, interviewed, Kyle Drake, he runs Neocities, which is like a modern version of GeoCities. And if I remember correctly, he was mentioning how a lot of old websites from that time, if they were running a server backend, like they were running PHP or something like that, if you were to try to go to those sites, now they're like pretty much all dead because there needed to be someone dedicated to running a Linux server, making sure things were patched and so on and so forth. But for static sites, like the ones that used to be hosted on GeoCities, you can go to the internet archive or other websites and they were just files, right? You can bring 'em right back up, and if anybody just puts 'em on a web server, then you're good. They're still alive. Case study of news room preferring static hosting [00:11:53] Brandon: Yeah, exactly. One place that's kind of surprising but makes sense where this comes up, is for newspapers actually. Some of the users using Protomaps are the Washington Post. And the reason they use it, is not necessarily because they don't want to pay for a SaaS like Google, but because if they make an interactive story, they have to guarantee that it still works in a couple of years. And that's like a policy decision from like the editorial board, which is like, so you can't write an article if people can't view it in five years. But if your like interactive data story is reliant on a third party, API and that third party API becomes deprecated, or it changes the pricing or it, you know, it gets acquired, then your journalism story is not gonna work anymore. So I have seen really good uptake among local news rooms and even big ones to use things like Protomaps just because it makes sense for the requirements. Working on Protomaps as an open source project for five years [00:12:49] Jeremy: How long have you been working on Protomaps and the parts that it's made up of such as PMTiles? [00:12:58] Brandon: I've been working on it for about five years, maybe a little more than that. It's sort of my pandemic era project. But the PMTiles part, which is really the heart of it only came in about halfway. Why not make a SaaS? [00:13:13] Brandon: So honestly, like when I first started it, I thought it was gonna be another SaaS and then I looked at it and looked at what the environment was around it. And I'm like, uh, so I don't really think I wanna do that. [00:13:24] Jeremy: When, when you say you looked at the environment around it what do you mean? Why did you decide not to make it a SaaS? [00:13:31] Brandon: Because there already is a lot of SaaS out there. And I think the opportunity of making something that is unique in terms of those use cases, like I mentioned like newsrooms, was clear. Like it was clear that there was some other solution, that could be built that would fit these needs better while if it was a SaaS, there are plenty of those out there. And I don't necessarily think that they're well differentiated. A lot of them all use OpenStreetMap data. And it seems like they mainly compete on price. It's like who can build the best three column pricing model. And then once you do that, you need to build like billing and metrics and authentication and like those problems don't really interest me. So I think, although I acknowledge sort of the indie hacker ethos now is to build a SaaS product with a monthly subscription, that's something I very much chose not to do, even though it is for sure like the best way to build a business. [00:14:29] Jeremy: Yeah, I mean, I think a lot of people can appreciate that perspective because it's, it's almost like we have SaaS overload, right? Where you have so many little bills for your project where you're like, another $5 a month, another $10 a month, or if you're a business, right? Those, you add a bunch of zeros and at some point it's just how many of these are we gonna stack on here? [00:14:53] Brandon: Yeah. And honestly. So I really think like as programmers, we're not really like great at choosing how to spend money like a $10 SaaS. That's like nothing. You know? So I can go to Starbucks and I can buy a pumpkin spice latte, and that's like $10 basically now, right? And it's like I'm able to make that consumer choice in like an instant just to spend money on that. But then if you're like, oh, like spend $10 on a SaaS that somebody put a lot of work into, then you're like, oh, that's too expensive. I could just do it myself. So I'm someone that also subscribes to a lot of SaaS products. and I think for a lot of things it's a great fit. Many open source SaaS projects are not easy to self host [00:15:37] Brandon: But there's always this tension between an open source project that you might be able to run yourself and a SaaS. And I think a lot of projects are at different parts of the spectrum. But for Protomaps, it's very much like I'm trying to move maps to being it is something that is so easy to run yourself that anyone can do it. [00:16:00] Jeremy: Yeah, and I think you can really see it with, there's a few SaaS projects that are successful and they're open source, but then you go to look at the self-hosting instructions and it's either really difficult to find and you find it, and then the instructions maybe don't work, or it's really complicated. So I think doing the opposite with Protomaps. As a user, I'm sure we're all appreciative, but I wonder in terms of trying to make money, if that's difficult. [00:16:30] Brandon: No, for sure. It is not like a good way to make money because I think like the ideal situation for an open source project that is open that wants to make money is the product itself is fundamentally complicated to where people are scared to run it themselves. Like a good example I can think of is like Supabase. Supabase is sort of like a platform as a service based on Postgres. And if you wanted to run it yourself, well you need to run Postgres and you need to handle backups and authentication and logging, and that stuff all needs to work and be production ready. So I think a lot of people, like they don't trust themselves to run database backups correctly. 'cause if you get it wrong once, then you're kind of screwed. So I think that fundamental aspect of the product, like a database is something that is very, very ripe for being a SaaS while still being open source because it's fundamentally hard to run. Another one I can think of is like tailscale, which is, like a VPN that works end to end. That's something where, you know, it has this networking complexity where a lot of developers don't wanna deal with that. So they'd happily pay, for tailscale as a service. There is a lot of products or open source projects that eventually end up just changing to becoming like a hosted service. Businesses going from open source to closed or restricted licenses [00:17:58] Brandon: But then in that situation why would they keep it open source, right? Like, if it's easy to run yourself well, doesn't that sort of cannibalize their business model? And I think that's really the tension overall in these open source companies. So you saw it happen to things like Elasticsearch to things like Terraform where they eventually change the license to one that makes it difficult for other companies to compete with them. [00:18:23] Jeremy: Yeah, I mean there's been a number of cases like that. I mean, specifically within the mapping community, one I can think of was Mapbox's. They have Mapbox gl. Which was a JavaScript client to visualize maps and they moved from, I forget which license they picked, but they moved to a much more restrictive license. I wonder what your thoughts are on something that releases as open source, but then becomes something maybe a little more muddy. [00:18:55] Brandon: Yeah, I think it totally makes sense because if you look at their business and their funding, it seems like for Mapbox, I haven't used it in a while, but my understanding is like a lot of their business now is car companies and doing in dash navigation. And that is probably way better of a business than trying to serve like people making maps of toilets. And I think sort of the beauty of it is that, so Mapbox, the story is they had a JavaScript renderer called Mapbox GL JS. And they changed that to a source available license a couple years ago. And there's a fork of it that I'm sort of involved in called MapLibre GL. But I think the cool part is Mapbox paid employees for years, probably millions of dollars in total to work on this thing and just gave it away for free. Right? So everyone can benefit from that work they did. It's not like that code went away, like once they changed the license. Well, the old version has been forked. It's going its own way now. It's quite different than the new version of Mapbox, but I think it's extremely generous that they're able to pay people for years, you know, like a competitive salary and just give that away. [00:20:10] Jeremy: Yeah, so we should maybe look at it as, it was a gift while it was open source, and they've given it to the community and they're on continuing on their own path, but at least the community running Map Libre, they can run with it, right? It's not like it just disappeared. [00:20:29] Brandon: Yeah, exactly. And that is something that I use for Protomaps quite extensively. Like it's the primary way of showing maps on the web and I've been trying to like work on some enhancements to it to have like better internationalization for if you are in like South Asia like not show languages correctly. So I think it is being taken in a new direction. And I think like sort of the combination of Protomaps and MapLibre, it addresses a lot of use cases, like I mentioned earlier with like these like hobby projects, indie projects that are almost certainly not interesting to someone like Mapbox or Google as a business. But I'm happy to support as a small business myself. Financially supporting open source work (GitHub sponsors, closed source, contracts) [00:21:12] Jeremy: In my previous interview with Tom, one of the main things he mentioned was that creating a mapping business is incredibly difficult, and he said he probably wouldn't do it again. So in your case, you're building Protomaps, which you've admitted is easy to self-host. So there's not a whole lot of incentive for people to pay you. How is that working out for you? How are you supporting yourself? [00:21:40] Brandon: There's a couple of strategies that I've tried and oftentimes failed at. Just to go down the list, so I do have GitHub sponsors so I do have a hosted version of Protomaps you can use if you don't want to bother copying a big file around. But the way I do the billing for that is through GitHub sponsors. If you wanted to use this thing I provide, then just be a sponsor. And that definitely pays for itself, like the cost of running it. And that's great. GitHub sponsors is so easy to set up. It just removes you having to deal with Stripe or something. 'cause a lot of people, their credit card information is already in GitHub. GitHub sponsors I think is awesome if you want to like cover costs for a project. But I think very few people are able to make that work. A thing that's like a salary job level. It's sort of like Twitch streaming, you know, there's a handful of people that are full-time streamers and then you look down the list on Twitch and it's like a lot of people that have like 10 viewers. But some of the other things I've tried, I actually started out, publishing the base map as a closed source thing, where I would sell sort of like a data package instead of being a SaaS, I'd be like, here's a one-time download, of the premium data and you can buy it. And quite a few people bought it I just priced it at like $500 for this thing. And I thought that was an interesting experiment. The main reason it's interesting is because the people that it attracts to you in terms of like, they're curious about your products, are all people willing to pay money. While if you start out everything being open source, then the people that are gonna be try to do it are only the people that want to get something for free. So what I discovered is actually like once you transition that thing from closed source to open source, a lot of the people that used to pay you money will still keep paying you money because like, it wasn't necessarily that that closed source thing was why they wanted to pay. They just valued that thought you've put into it your expertise, for example. So I think that is one thing, that I tried at the beginning was just start out, closed source proprietary, then make it open source. That's interesting to people. Like if you release something as open source, if you go the other way, like people are really mad if you start out with something open source and then later on you're like, oh, it's some other license. Then people are like that's so rotten. But I think doing it the other way, I think is quite valuable in terms of being able to find an audience. [00:24:29] Jeremy: And when you said it was closed source and paid to open source, do you still sell those map exports? [00:24:39] Brandon: I don't right now. It's something that I might do in the future, you know, like have small customizations of the data that are available, uh, for a fee. still like the core OpenStreetMap based map that's like a hundred gigs you can just download. And that'll always just be like a free download just because that's already out there. All the source code to build it is open source. So even if I said, oh, you have to pay for it, then someone else can just do it right? So there's no real reason like to make that like some sort of like paywall thing. But I think like overall if the project is gonna survive in the long term it's important that I'd ideally like to be able to like grow like a team like have a small group of people that can dedicate the time to growing the project in the long term. But I'm still like trying to figure that out right now. [00:25:34] Jeremy: And when you mentioned that when you went from closed to open and people were still paying you, you don't sell a product anymore. What were they paying for? [00:25:45] Brandon: So I have some contracts with companies basically, like if they need a feature or they need a customization in this way then I am very open to those. And I sort of set it up to make it clear from the beginning that this is not just a free thing on GitHub, this is something that you could pay for if you need help with it, if you need support, if you wanted it. I'm also a little cagey about the word support because I think like it sounds a little bit too wishy-washy. Pretty much like if you need access to the developers of an open source project, I think that's something that businesses are willing to pay for. And I think like making that clear to potential users is a challenge. But I think that is one way that you might be able to make like a living out of open source. [00:26:35] Jeremy: And I think you said you'd been working on it for about five years. Has that mostly been full time? [00:26:42] Brandon: It's been on and off. it's sort of my pandemic era project. But I've spent a lot of time, most of my time working on the open source project at this point. So I have done some things that were more just like I'm doing a customization or like a private deployment for some client. But that's been a minority of the time. Yeah. [00:27:03] Jeremy: It's still impressive to have an open source project that is easy to self-host and yet is still able to support you working on it full time. I think a lot of people might make the assumption that there's nothing to sell if something is, is easy to use. But this sort of sounds like a counterpoint to that. [00:27:25] Brandon: I think I'd like it to be. So when you come back to the point of like, it being easy to self-host. Well, so again, like I think about it as like a primitive of the web. Like for example, if you wanted to start a business today as like hosted CSS files, you know, like where you upload your CSS and then you get developers to pay you a monthly subscription for how many times they fetched a CSS file. Well, I think most developers would be like, that's stupid because it's just an open specification, you just upload a static file. And really my goal is to make Protomaps the same way where it's obvious that there's not really some sort of lock-in or some sort of secret sauce in the server that does this thing. How PMTiles works and building a primitive of the web [00:28:16] Brandon: If you look at video for example, like a lot of the tech for how Protomaps and PMTiles works is based on parts of the HTTP spec that were made for video. And 20 years ago, if you wanted to host a video on the web, you had to have like a real player license or flash. So you had to go license some server software from real media or from macromedia so you could stream video to a browser plugin. But now in HTML you can just embed a video file. And no one's like, oh well I need to go pay for my video serving license. I mean, there is such a thing, like YouTube doesn't really use that for DRM reasons, but people just have the assumption that video is like a primitive on the web. So if we're able to make maps sort of that same way like a primitive on the web then there isn't really some obvious business or licensing model behind how that works. Just because it's a thing and it helps a lot of people do their jobs and people are happy using it. So why bother? [00:29:26] Jeremy: You mentioned that it a tech that was used for streaming video. What tech specifically is it? [00:29:34] Brandon: So it is byte range serving. So when you open a video file on the web, So let's say it's like a 100 megabyte video. You don't have to download the entire video before it starts playing. It streams parts out of the file based on like what frames... I mean, it's based on the frames in the video. So it can start streaming immediately because it's organized in a way to where the first few frames are at the beginning. And what PMTiles really is, is it's just like a video but in space instead of time. So it's organized in a way where these zoomed out views are at the beginning and the most zoomed in views are at the end. So when you're like panning or zooming in the map all you're really doing is fetching byte ranges out of that file the same way as a video. But it's organized in, this tiled way on a space filling curve. IIt's a little bit complicated how it works internally and I think it's kind of cool but that's sort of an like an implementation detail. [00:30:35] Jeremy: And to the person deploying it, it just looks like a single file. [00:30:40] Brandon: Exactly in the same way like an mp3 audio file is or like a JSON file is. [00:30:47] Jeremy: So with a video, I can sort of see how as someone seeks through the video, they start at the beginning and then they go to the middle if they wanna see the middle. For a map, as somebody scrolls around the map, are you seeking all over the file or is the way it's structured have a little less chaos? [00:31:09] Brandon: It's structured. And that's kind of the main technical challenge behind building PMTiles is you have to be sort of clever so you're not spraying the reads everywhere. So it uses something called a hilbert curve, which is a mathematical concept of a space filling curve. Where it's one continuous curve that essentially lets you break 2D space into 1D space. So if you've seen some maps of IP space, it uses this crazy looking curve that hits all the points in one continuous line. And that's the same concept behind PMTiles is if you're looking at one part of the world, you're sort of guaranteed that all of those parts you're looking at are quite close to each other and the data you have to transfer is quite minimal, compared to if you just had it at random. [00:32:02] Jeremy: How big do the files get? If I have a PMTiles of the entire world, what kind of size am I looking at? [00:32:10] Brandon: Right now, the default one I distribute is 128 gigabytes, so it's quite sizable, although you can slice parts out of it remotely. So if you just wanted. if you just wanted California or just wanted LA or just wanted only a couple of zoom levels, like from zero to 10 instead of zero to 15, there is a command line tool that's also called PMTiles that lets you do that. Issues with CDNs and range queries [00:32:35] Jeremy: And when you're working with files of this size, I mean, let's say I am working with a CDN in front of my application. I'm not typically accustomed to hosting something that's that large and something that's where you're seeking all over the file. is that, ever an issue or is that something that's just taken care of by the browser and, and taken care of by, by the hosts? [00:32:58] Brandon: That is an issue actually, so a lot of CDNs don't deal with it correctly. And my recommendation is there is a kind of proxy server or like a serverless proxy thing that I wrote. That runs on like cloudflare workers or on Docker that lets you proxy those range requests into a normal URL and then that is like a hundred percent CDN compatible. So I would say like a lot of the big commercial installations of this thing, they use that because it makes more practical sense. It's also faster. But the idea is that this solution sort of scales up and scales down. If you wanted to host just your city in like a 10 megabyte file, well you can just put that into GitHub pages and you don't have to worry about it. If you want to have a global map for your website that serves a ton of traffic then you probably want a little bit more sophisticated of a solution. It still does not require you to run a Linux server, but it might require (you) to use like Lambda or Lambda in conjunction with like a CDN. [00:34:09] Jeremy: Yeah. And that sort of ties into what you were saying at the beginning where if you can host on something like CloudFlare Workers or Lambda, there's less time you have to spend keeping these things running. [00:34:26] Brandon: Yeah, exactly. and I think also the Lambda or CloudFlare workers solution is not perfect. It's not as perfect as S3 or as just static files, but in my experience, it still is better at building something that lasts on the time span of years than being like I have a server that is on this Ubuntu version and in four years there's all these like security patches that are not being applied. So it's still sort of serverless, although not totally vendor neutral like S3. Customizing the map [00:35:03] Jeremy: We've mostly been talking about how you host the map itself, but for someone who's not familiar with these kind of tools, how would they be customizing the map? [00:35:15] Brandon: For customizing the map there is front end style customization and there's also data customization. So for the front end if you wanted to change the water from the shade of blue to another shade of blue there is a TypeScript API where you can customize it almost like a text editor color scheme. So if you're able to name a bunch of colors, well you can customize the map in that way you can change the fonts. And that's all done using MapLibre GL using a TypeScript API on top of that for customizing the data. So all the pipeline to generate this data from OpenStreetMap is open source. There is a Java program using a library called PlanetTiler which is awesome, which is this super fast multi-core way of building map tiles. And right now there isn't really great hooks to customize what data goes into that. But that's something that I do wanna work on. And finally, because the data comes from OpenStreetMap if you notice data that's missing or you wanted to correct data in OSM then you can go into osm.org. You can get involved in contributing the data to OSM and the Protomaps build is daily. So if you make a change, then within 24 hours you should see the new base map. Have that change. And of course for OSM your improvements would go into every OSM based project that is ingesting that data. So it's not a protomap specific thing. It's like this big shared data source, almost like Wikipedia. OpenStreetMap is a dataset and not a map [00:37:01] Jeremy: I think you were involved with OpenStreetMap to some extent. Can you speak a little bit to that for people who aren't familiar, what OpenStreetMap is? [00:37:11] Brandon: Right. So I've been using OSM as sort of like a tools developer for over a decade now. And one of the number one questions I get from developers about what is Protomaps is why wouldn't I just use OpenStreetMap? What's the distinction between Protomaps and OpenStreetMap? And it's sort of like this funny thing because even though OSM has map in the name it's not really a map in that you can't... In that it's mostly a data set and not a map. It does have a map that you can see that you can pan around to when you go to the website but the way that thing they show you on the website is built is not really that easily reproducible. It involves a lot of c++ software you have to run. But OpenStreetMap itself, the heart of it is almost like a big XML file that has all the data in the map and global. And it has tagged features for example. So you can go in and edit that. It has a web front end to change the data. It does not directly translate into making a map actually. Protomaps decides what shows at each zoom level [00:38:24] Brandon: So a lot of the pipeline, that Java program I mentioned for building this basemap for protomaps is doing things like you have to choose what data you show when you zoom out. You can't show all the data. For example when you're zoomed out and you're looking at all of a state like Colorado you don't see all the Chipotle when you're zoomed all the way out. That'd be weird, right? So you have to make some sort of decision in logic that says this data only shows up at this zoom level. And that's really what is the challenge in optimizing the size of that for the Protomaps map project. [00:39:03] Jeremy: Oh, so those decisions of what to show at different Zoom levels those are decisions made by you when you're creating the PMTiles file with Protomaps. [00:39:14] Brandon: Exactly. It's part of the base maps build pipeline. and those are honestly very subjective decisions. Who really decides when you're zoomed out should this hospital show up or should this museum show up nowadays in Google, I think it shows you ads. Like if someone pays for their car repair shop to show up when you're zoomed out like that that gets surfaced. But because there is no advertising auction in Protomaps that doesn't happen obviously. So we have to sort of make some reasonable choice. A lot of that right now in Protomaps actually comes from another open source project called Mapzen. So Mapzen was a company that went outta business a couple years ago. They did a lot of this work in designing which data shows up at which Zoom level and open sourced it. And then when they shut down, they transferred that code into the Linux Foundation. So it's this totally open source project, that like, again, sort of like Mapbox gl has this awesome legacy in that this company funded it for years for smart people to work on it and now it's just like a free thing you can use. So the logic in Protomaps is really based on mapzen. [00:40:33] Jeremy: And so the visualization of all this... I think I understand what you mean when people say oh, why not use OpenStreetMaps because it's not really clear it's hard to tell is this the tool that's visualizing the data? Is it the data itself? So in the case of using Protomaps, it sounds like Protomaps itself has all of the data from OpenStreetMap and then it has made all the decisions for you in terms of what to show at different Zoom levels and what things to have on the map at all. And then finally, you have to have a separate, UI layer and in this case, it sounds like the one that you recommend is the Map Libre library. [00:41:18] Brandon: Yeah, that's exactly right. For Protomaps, it has a portion or a subset of OSM data. It doesn't have all of it just because there's too much, like there's data in there. people have mapped out different bushes and I don't include that in Protomaps if you wanted to go in and edit like the Java code to add that you can. But really what Protomaps is positioned at is sort of a solution for developers that want to use OSM data to make a map on their app or their website. because OpenStreetMap itself is mostly a data set, it does not really go all the way to having an end-to-end solution. Financials and the idea of a project being complete [00:41:59] Jeremy: So I think it's great that somebody who wants to make a map, they have these tools available, whether it's from what was originally built by Mapbox, what's built by Open StreetMap now, the work you're doing with Protomaps. But I wonder one of the things that I talked about with Tom was he was saying he was trying to build this mapping business and based on the financials of what was coming in he was stressed, right? He was struggling a bit. And I wonder for you, you've been working on this open source project for five years. Do you have similar stressors or do you feel like I could keep going how things are now and I feel comfortable? [00:42:46] Brandon: So I wouldn't say I'm a hundred percent in one bucket or the other. I'm still seeing it play out. One thing, that I really respect in a lot of open source projects, which I'm not saying I'm gonna do for Protomaps is the idea that a project is like finished. I think that is amazing. If a software project can just be done it's sort of like a painting or a novel once you write, finish the last page, have it seen by the editor. I send it off to the press is you're done with a book. And I think one of the pains of software is so few of us can actually do that. And I don't know obviously people will say oh the map is never finished. That's more true of OSM, but I think like for Protomaps. One thing I'm thinking about is how to limit the scope to something that's quite narrow to where we could be feature complete on the core things in the near term timeframe. That means that it does not address a lot of things that people want. Like search, like if you go to Google Maps and you search for a restaurant, you will get some hits. that's like a geocoding issue. And I've already decided that's totally outta scope for Protomaps. So, in terms of trying to think about the future of this, I'm mostly looking for ways to cut scope if possible. There are some things like better tooling around being able to work with PMTiles that are on the roadmap. but for me, I am still enjoying working on the project. It's definitely growing. So I can see on NPM downloads I can see the growth curve of people using it and that's really cool. So I like hearing about when people are using it for cool projects. So it seems to still be going okay for now. [00:44:44] Jeremy: Yeah, that's an interesting perspective about how you were talking about projects being done. Because I think when people look at GitHub projects and they go like, oh, the last commit was X months ago. They go oh well this is dead right? But maybe that's the wrong framing. Maybe you can get a project to a point where it's like, oh, it's because it doesn't need to be updated. [00:45:07] Brandon: Exactly, yeah. Like I used to do a lot of c++ programming and the best part is when you see some LAPACK matrix math library from like 1995 that still works perfectly in c++ and you're like, this is awesome. This is the one I have to use. But if you're like trying to use some like React component library and it hasn't been updated in like a year, you're like, oh, that's a problem. So again, I think there's some middle ground between those that I'm trying to find. I do like for Protomaps, it's quite dependency light in terms of the number of hard dependencies I have in software. but I do still feel like there is a lot of work to be done in terms of project scope that needs to have stuff added. You mostly only hear about problems instead of people's wins [00:45:54] Jeremy: Having run it for this long. Do you have any thoughts on running an open source project in general? On dealing with issues or managing what to work on things like that? [00:46:07] Brandon: Yeah. So I have a lot. I think one thing people point out a lot is that especially because I don't have a direct relationship with a lot of the people using it a lot of times I don't even know that they're using it. Someone sent me a message saying hey, have you seen flickr.com, like the photo site? And I'm like, no. And I went to flickr.com/map and it has Protomaps for it. And I'm like, I had no idea. But that's cool, if they're able to use Protomaps for this giant photo sharing site that's awesome. But that also means I don't really hear about when people use it successfully because you just don't know, I guess they, NPM installed it and it works perfectly and you never hear about it. You only hear about people's negative experiences. You only hear about people that come and open GitHub issues saying this is totally broken, and why doesn't this thing exist? And I'm like, well, it's because there's an infinite amount of things that I want to do, but I have a finite amount of time and I just haven't gone into that yet. And that's honestly a lot of the things and people are like when is this thing gonna be done? So that's, that's honestly part of why I don't have a public roadmap because I want to avoid that sort of bickering about it. I would say that's one of my biggest frustrations with running an open source project is how it's self-selected to only hear the negative experiences with it. Be careful what PRs you accept [00:47:32] Brandon: 'cause you don't hear about those times where it works. I'd say another thing is it's changed my perspective on contributing to open source because I think when I was younger or before I had become a maintainer I would open a pull request on a project unprompted that has a hundred lines and I'd be like, Hey, just merge this thing. But I didn't realize when I was younger well if I just merge it and I disappear, then the maintainer is stuck with what I did forever. You know if I add some feature then that person that maintains the project has to do that indefinitely. And I think that's very asymmetrical and it's changed my perspective a lot on accepting open source contributions. I wanna have it be open to anyone to contribute. But there is some amount of back and forth where it's almost like the default answer for should I accept a PR is no by default because you're the one maintaining it. And do you understand the shape of that solution completely to where you're going to support it for years because the person that's contributing it is not bound to those same obligations that you are. And I think that's also one of the things where I have a lot of trepidation around open source is I used to think of it as a lot more bazaar-like in terms of anyone can just throw their thing in. But then that creates a lot of problems for the people who are expected out of social obligation to continue this thing indefinitely. [00:49:23] Jeremy: Yeah, I can totally see why that causes burnout with a lot of open source maintainers, because you probably to some extent maybe even feel some guilt right? You're like, well, somebody took the time to make this. But then like you said you have to spend a lot of time trying to figure out is this something I wanna maintain long term? And one wrong move and it's like, well, it's in here now. [00:49:53] Brandon: Exactly. To me, I think that is a very common failure mode for open source projects is they're too liberal in the things they accept. And that's a lot of why I was talking about how that choice of what features show up on the map was inherited from the MapZen projects. If I didn't have that then somebody could come in and say hey, you know, I want to show power lines on the map. And they open a PR for power lines and now everybody who's using Protomaps when they're like zoomed out they see power lines are like I didn't want that. So I think that's part of why a lot of open source projects eventually evolve into a plugin system is because there is this demand as the project grows for more and more features. But there is a limit in the maintainers. It's like the demand for features is exponential while the maintainer amount of time and effort is linear. Plugin systems might reduce need for PRs [00:50:56] Brandon: So maybe the solution to smash that exponential down to quadratic maybe is to add a plugin system. But I think that is one of the biggest tensions that only became obvious to me after working on this for a couple of years. [00:51:14] Jeremy: Is that something you're considering doing now? [00:51:18] Brandon: Is the plugin system? Yeah. I think for the data customization, I eventually wanted to have some sort of programmatic API to where you could declare a config file that says I want ski routes. It totally makes sense. The power lines example is maybe a little bit obscure but for example like a skiing app and you want to be able to show ski slopes when you're zoomed out well you're not gonna be able to get that from Mapbox or from Google because they have a one size fits all map that's not specialized to skiing or to golfing or to outdoors. But if you like, in theory, you could do this with Protomaps if you changed the Java code to show data at different zoom levels. And that is to me what makes the most sense for a plugin system and also makes the most product sense because it enables a lot of things you cannot do with the one size fits all map. [00:52:20] Jeremy: It might also increase the complexity of the implementation though, right? [00:52:25] Brandon: Yeah, exactly. So that's like. That's really where a lot of the terrifying thoughts come in, which is like once you create this like config file surface area, well what does that look like? Is that JSON? Is that TOML, is that some weird like everything eventually evolves into some scripting language right? Where you have logic inside of your templates and I honestly do not really know what that looks like right now. That feels like something in the medium term roadmap. [00:52:58] Jeremy: Yeah and then in terms of bug reports or issues, now it's not just your code it's this exponential combination of whatever people put into these config files. [00:53:09] Brandon: Exactly. Yeah. so again, like I really respect the projects that have done this well or that have done plugins well. I'm trying to think of some, I think obsidian has plugins, for example. And that seems to be one of the few solutions to try and satisfy the infinite desire for features with the limited amount of maintainer time. Time split between code vs triage vs talking to users [00:53:36] Jeremy: How would you say your time is split between working on the code versus issue and PR triage? [00:53:43] Brandon: Oh, it varies really. I think working on the code is like a minority of it. I think something that I actually enjoy is talking to people, talking to users, getting feedback on it. I go to quite a few conferences to talk to developers or people that are interested and figure out how to refine the message, how to make it clearer to people, like what this is for. And I would say maybe a plurality of my time is spent dealing with non-technical things that are neither code or GitHub issues. One thing I've been trying to do recently is talk to people that are not really in the mapping space. For example, people that work for newspapers like a lot of them are front end developers and if you ask them to run a Linux server they're like I have no idea. But that really is like one of the best target audiences for Protomaps. So I'd say a lot of the reality of running an open source project is a lot like a business is it has all the same challenges as a business in terms of you have to figure out what is the thing you're offering. You have to deal with people using it. You have to deal with feedback, you have to deal with managing emails and stuff. I don't think the payoff is anywhere near running a business or a startup that's backed by VC money is but it's definitely not the case that if you just want to code, you should start an open source project because I think a lot of the work for an opensource project has nothing to do with just writing the code. It is in my opinion as someone having done a VC backed business before, it is a lot more similar to running, a tech company than just putting some code on GitHub. Running a startup vs open source project [00:55:43] Jeremy: Well, since you've done both at a high level what did you like about running the company versus maintaining the open source project? [00:55:52] Brandon: So I have done some venture capital accelerator programs before and I think there is an element of hype and energy that you get from that that is self perpetuating. Your co-founder is gungho on like, yeah, we're gonna do this thing. And your investors are like, you guys are geniuses. You guys are gonna make a killing doing this thing. And the way it's framed is sort of obvious to everyone that it's like there's a much more traditional set of motivations behind that, that people understand while it's definitely not the case for running an open source project. Sometimes you just wake up and you're like what the hell is this thing for, it is this thing you spend a lot of time on. You don't even know who's using it. The people that use it and make a bunch of money off of it they know nothing about it. And you know, it's just like cool. And then you only hear from people that are complaining about it. And I think like that's honestly discouraging compared to the more clear energy and clearer motivation and vision behind how most people think about a company. But what I like about the open source project is just the lack of those constraints you know? Where you have a mandate that you need to have this many customers that are paying by this amount of time. There's that sort of pressure on delivering a business result instead of just making something that you're proud of that's simple to use and has like an elegant design. I think that's really a difference in motivation as well. Having control [00:57:50] Jeremy: Do you feel like you have more control? Like you mentioned how you've decided I'm not gonna make a public roadmap. I'm the sole developer. I get to decide what goes in. What doesn't. Do you feel like you have more control in your current position than you did running the startup? [00:58:10] Brandon: Definitely for sure. Like that agency is what I value the most. It is possible to go too far. Like, so I'm very wary of the BDFL title, which I think is how a lot of open source projects succeed. But I think there is some element of for a project to succeed there has to be somebody that makes those decisions. Sometimes those decisions will be wrong and then hopefully they can be rectified. But I think going back to what I was talking about with scope, I think the overall vision and the scope of the project is something that I am very opinionated about in that it should do these things. It shouldn't do these things. It should be easy to use for this audience. Is it gonna be appealing to this other audience? I don't know. And I think that is really one of the most important parts of that leadership role, is having the power to decide we're doing this, we're not doing this. I would hope other developers would be able to get on board if they're able to make good use of the project, if they use it for their company, if they use it for their business, if they just think the project is cool. So there are other contributors at this point and I want to get more involved. But I think being able to make those decisions to what I believe is going to be the best project is something that is very special about open source, that isn't necessarily true about running like a SaaS business. [00:59:50] Jeremy: I think that's a good spot to end it on, so if people want to learn more about Protomaps or they wanna see what you're up to, where should they head? [01:00:00] Brandon: So you can go to Protomaps.com, GitHub, or you can find me or Protomaps on bluesky or Mastodon. [01:00:09] Jeremy: All right, Brandon, thank you so much for chatting today. [01:00:12] Brandon: Great. Thank you very much.

amazon time california google uk apple pr running zoom colorado seattle twitch businesses washington post starbucks wikipedia ip saas vc react capitol hill api google maps financially ui chipotle 2d linux java github vpn stripe jekyll zillow mastodon south asia javascript html ubuntu css php financials s3 docker kubernetes prs cloudflare drm cdn lambda customizing plugin iit json xml terraform 1d csv rackspace npm linux foundation elasticsearch osm postgres geocities overlays ec2 openstreetmap cdns tom l supabase mapbox hetzner cloudflare workers bdfl jeremy you jeremy it brandon it jeremy so

Hong Minhee on ActivityPub

Play Episode Listen Later Feb 28, 2025 45:39

Hong Minhee is an open source developer and the creator of the Fedify ActivityPub server framework. We talk about how applications like Mastodon and Misskey communicate with one another using ActivityPub. This includes discussions on built-in activites, extending the specification in a backwards compatible way, difficulties implementing JSON-LD, the inbox model, and his experience implementing the specification. Hong Minhee: activitypub profile fedify hollo Specifications: ActivityPub W3C specification JSON Linked Data Resource Description Framework W3C Semantic Web Standards ActivityPub and WebFinger ActivityPub and HTTP Signatures ActivityPub implementations: Mastodon Misskey Akkoma Pleroma Pixelfed Lemmy Loops GoToSocial ActivityPub support in Ghost Threads has entered the Fediverse ActivityPub tools: ActivityPub Academy BrowserPub fedify CLI -- Transcript You can help correct transcripts on GitHub. What's ActivityPub? [00:00:00] Jeremy: Today, I'm talking to Hong Minhee. He is the developer of Fedify. A TypeScript library for building ActivityPub server applications. The first thing I think we should start with is defining ActivityPub. what is ActivityPub? [00:00:16] Hong: ActivityPub is the protocol that lets social networks talk to each other and it's officially recommended by W3C. It's what powers this thing we call the Fediverse which is basically a way for different social media platforms to work together. Users of ActivityPub [00:00:39] Jeremy: Can you give some examples that people might have heard of -- either users of ActivityPub or things that are a part of this fediverse? [00:00:50] Hong: Mastodon is probably the biggest one out there. And you know what's interesting? Meta threads has actually started implementing ActivityPub this summer. So this still pretty much a one way street right now. In East Asia, especially Japan, there's this really popular microblogging platform called misskey. It's got so many forks that people actually joke around and called them forkeys. but it's not just about Twitter style microblogging, there's Pixelfed which is kind of like Instagram, but for the fediverse. And those same folks recently launched loops. Which is basically doing what TikTok does, but in the Fediverse. Then you've got stuff like Lemmy and which are doing the reddit thing up in the Fediverse. [00:02:00] Jeremy: Oh like Reddit. [00:02:01] Hong: Yeah. There are so much more out there that I haven't even mentioned. Um, most of it is open source, which is pretty cool. [00:02:13] Jeremy: So the first few examples you gave, Mastodon and Meta's threads, they're very similar to, to Twitter, right? So that's what you were calling the, the Microblogging applications. And I think what you had said, which is a little bit interesting is you had said Metas threads is only one way. So could you kind of describe like what you mean by that? [00:02:37] Hong: Currently meta threads only can be followed by other ActivityPub applications but you cannot follow other people in the fediverse. [00:02:55] Jeremy: People who are using another Microblogging platform like Mastodon can follow someone on Meta's Threads platform. But the other way is not true. If you're on threads, you can't follow someone on Mastodon. [00:03:07] Hong: Yes, that's right. [00:03:09] Jeremy: And that's not a limitation of the protocol itself. That's a design decision or a decision made by Meta. [00:03:17] Hong: Yeah. They are slowly implementing ActivityPub and I hope they will implement complete ActivityPub in the future. Interoperability through Activities [00:03:27] Jeremy: And then the other examples you gave, one is I believe it was Pixel Fed is very similar to Instagram. And then the last examples you gave was I think it was Lemmy, you said it's similar to Reddit. Because you mentioned the term Fediverse before and you mentioned that these all use ActivityPub and since these seem like different kinds of applications, what does it mean for them to interact? Because with Mastodon and Threads I can kind of understand because they're both similar to Twitter. So you're posting messages and replying, but, but what does it mean, for example, for someone on Mastodon to interact with someone on Lemmy which is like Reddit because they seem very different. [00:04:16] Hong: People in Lemmy and Mastodon are called actors and can follow each other. They have interactions between them called activities. And there are several types of activities like, create and follow and undo, like, and so on. So, ActivityPub applications tend to, use these vocabulary to implement their features. So, for example, Lemmy uses like activities for upvoting and like activities for down voting and it's translated to likes in Mastodon. So if you submit a post on Lemmy and it shows up on your Mastodon timeline. If you like that post (it) is upvoting in Lemmy. [00:05:36] Jeremy: And probably similarly with Pixelfed, which you said is like Instagram, if you follow someone's Pixelfed account in Mastodon and they post a photo in Pixel Fed, they would see it as a post in Mastodon natively and they could give it a like there. Adding activities or properties [00:05:56] Jeremy: And these activities that you mentioned -- So the like and the dislike are those part of ActivityPub itself? [00:06:05] Hong: Yes, and this vocabulary can be extended. [00:06:10] Jeremy: So you can add, additional actions (activities) or are you adding information (properties) to the existing actions? [00:06:37] Hong: It is called activity vocabulary, and there are, things like accept, add, arrive, block, create lead, dislike, flag, follow, ignore invite, join, and so on. So, basically, almost everything you need to build social media is already there in the vocabulary, but if you want to extend some more, you can define your, own vocabulary. [00:06:56] Jeremy: Most of the things that an Instagram or a Twitter, or a Reddit would need is already there. But you're saying that you can have your own vocabulary. So if there's an action or an activity that is not covered by the specification, you can create one yourself. [00:07:13] Hong: Yes. For example, Misskey and Pleroma defined emoji reactor to represent emoji reactions. [00:07:25] Jeremy: Because the systems can extend the vocabulary. What are some other examples of cases where mastodon or any other of these systems has found that the existing vocabulary is not enough. What are some other examples of applications extending it? [00:07:45] Hong: For example, uh, mastodon defined suspended -- suspended property. They are not activities, but they are properties in the activity. ActivityPub consists of several types of objects and there are activities and normal objects like, article. they can have properties and there are several existing properties, but they can be also extended. So Mastodon extended some properties they need. So for example, they define suspended or discoverable. [00:08:44] Hong: Suspended for to tell if an actor is suspended by moderators. Discoverable tells if an actor itself wants to be, searched and indexed, and there are much more properties. Mastodon extended. Actors [00:09:12] Jeremy: And these are, these are properties of the actor. These are properties of the user? [00:09:19] Hong: Yes. Actors. [00:09:21] Jeremy: Cause I think earlier you mentioned that. The concept of a user is an actor, and it sounds like what you're saying is an actor can have all these properties. There's probably a, a username and things like that, but Mastodon has extended the properties so that, you can have a property on whether you wanna be searched or indexed you can have a property that says you're suspended. So I guess your account, is still there, but can't be used anymore. Something we should probably talk about then is, so you have these actors, you have these activities that I'm assuming the actors are performing on one another. What does that data look like and what does the communication look like? [00:10:09] Hong: Actors have their own dereferencable URI and when you look up that URI you get all the info about the actor in JSON-LD format [00:10:22] Jeremy: JSON-LD? [00:10:23] Hong: Yeah. JSON-LD. linked data. (The) Actor has all the stuff you expect to find on a social account name, bio URL to the profile page, profile picture, head image and more. And there are five main types of actors: application, group, organization, person and service. And you know how sometimes on Mastodon you will see an account marked as a bot? [00:10:58] Jeremy: A bot? [00:10:59] Hong: Yeah. Bot and that's what an actor of type service looks like. And the ActivityPub spec actually let you create other types beyond these five. But I haven't seen anyone actually do that yet. JSON-LD [00:11:15] Jeremy: And you mentioned that these are all JSON objects. but the LD part, the linked data part, I'm not familiar with. So what different about the linked data part of the JSON? [00:11:31] Hong: So JSON-LD is the special way of writing RDF. Which was originally used in the semantic web. Usually RDF uses (a) format (that) is called triples. [00:11:48] Jeremy: Triples? [00:11:49] Hong: Yeah, subject and predicate and object. [00:11:55] Jeremy: Subject, predicate, object. Can you give an example of what those three would be? [00:12:00] Hong: For example, is a person, it's a triple. John is a subject and is a predicate [00:12:11] Jeremy: is, is the predicate. [00:12:12] Hong: Okay. And person is a object. That's great for showing how things are connected, but it is pretty different from how we usually handle data in REST for APIs and stuff. Like normally we say a personal object has property like name, DOB, bio, and so on. And a bunch of subject predicated object triples that's where JSON-LD comes in -- is designed to look more like the JSON we are used to working with, while still being able to represent RDF Graphs. RDF graph are ontology. It's a way to represent factual data, but is, quite different from, how we represent data in relational database. And it's a bunch of triples each subject and objects are nodes and predicates connect these nodes. Semantic Web [00:13:30] Jeremy: You mentioned the Semantic web, what does that mean? What is the semantic web? [00:13:35] Hong: It's a way to represent web in the structural way, is machine readable so that you can, scan the data in the web, using scrapers or crawlers. [00:13:52] Jeremy: Scrapers -- or what was the second one? Crawling. [00:13:59] Hong: Yeah. Then you can have graph data of web and you can, query information about things from the data. [00:14:14] Jeremy: So is the web as it exists now, is that the Semantic web or is it something different? [00:14:24] Hong: I think it is partially semantic web, you have several metadata in Your HTML. For example, there are several specification for semantic web, like, OpenGraph metadata. [00:14:32] Jeremy: Cause when I think about OpenGraph, I think about the metadata on a webpage that, that tells other applications or websites that if you link to this page: show this image or show this title and description. You're saying that specifically you consider part of the semantic web? [00:15:05] Hong: That's, semantic web. To make your website semantic web. Your website should be able to, provide structural data. And other people can make Scrapers to scan, structural data from your website. There are a bunch of attributes and text for HTML to represent metadata. For example you have relation attribute rel so if you have a link with rel=me to your another social profile. Then other people can tell two web pages represent the same person. [00:16:10] Jeremy: Oh, I see. So you could have more than one website. Maybe one is your blog and maybe one is your favorite birds or something like that. But you could put a rel tag with information about you as a person so that someone who scrapes both websites could look at that tag and see that both of these websites are by, Hong, by this person. JSON-LD is difficult to implement and not used as intended [00:16:43] Hong: Yeah. I think JSON-LD is, designed for semantic web, but in reality, ActivityPub implementations, most of them are, not aware of semantic web. [00:17:01] Jeremy: The choice of JSON Linked Data, the JSON-LD, by the people who made the specification -- They had this idea that things that implemented ActivityPub would be a part of this semantic web, but the actual implementation of a Mastodon or a Pixelfed, they use JSON-LD because it's part of the specification, but the way they use it, it ends up not really being a part of this semantic web. [00:17:34] Hong: Yeah, that's exactly.. [00:17:37] Jeremy: You've mentioned that implementing it is difficult. What makes implementing JSON LD particularly hard? [00:17:48] Hong: The JSON-LD is quite complex. Which is why a lot of programming language don't even have JSON-LD implementations and it's pretty slow compared to just working with the regular JSON. So, what happens is a lot of ActivityPub implementations just treat JSON-LD like (it) is regular JSON without using a proper JSON-LD processor. You can do that, but it creates a source of headache. In JSON-LD there are weird equivalences like if a property is missing or if it's an empty array, that means the same thing. Or if a property has one value versus an array with just that one value in it, same thing. So when you are writing code to parse JSON-LD, you've got to keep checking if something's an array how long it is and all that is super easy to mess up. It's not just reading JSON-LD that's tricky. Creating it is just as bad. Like you might forget to include the right context metadata for a vocabulary and end up with a JSON-LD document that's either invalid or means something totally different from what you wanted. Even the big ActivityPub implementations mess this up pretty often. With Fedify we've got a JSON-LD processor built in and we keep running into issues where major ActivityPub implementations create invalidate JSON-LD. We've had to create workaround for all of them, but it's not pretty and causes kind of a mess. [00:19:52] Jeremy: Even though there is a specification for JSON-LD, it sounds like the implementers don't necessarily follow it. So you are kind of parsing JSON-LD, but not really. You're parsing something that. Looks like JSON-LD, but isn't quite it. [00:20:12] Hong: Yes, that's right. [00:20:14] Jeremy: And is that true in the, the biggest implementations, Mastodon, for example, are there things that it sends in its activities that aren't valid JSON-LD? [00:20:26] Hong: Those implementations that had bad JSON-LD tends to fix them soon as a possible. But regressions are so often made. Yeah. [00:20:45] Jeremy: Even within Mastodon, which is probably one of the largest implementers of ActivityPub, there are cases where it's not valid, JSON-LD and somebody fixes it. But then later on there are other messages or other activities that were valid, but aren't valid anymore. And so it's this, it's this back and forth of fixing them and causing new issues it sounds ... [00:21:15] Hong: Yeah. Yeah. Right. [00:21:17] Jeremy: Yeah. That sounds very difficult to deal with. How instances communicate (Inbox) [00:21:20] Jeremy: We've been talking about the messages themselves are this special format of JSON that's very particular. but how do these instances communicate with one another? [00:21:32] Hong: Most of time, it all starts with a follow. Like when John follows Alice, then Alice adds both John and John's inbox URI to her followers list, and after John follows Alice, Whenever Alice posts something new that activities get sent to John's inbox behind the scenes. This is just one HTTP post request. Even though ActivityPub is built on HTTP. It doesn't really care about the HTTP response beyond did it work or not. If you want to reply to an activity, you need to figure out the standard inbox, URI and send or reply activity there. [00:22:27] Jeremy: If we define all the terms, there's the actor, which is the person, each actor can send different activities. those activities are in the form of a JSON linked data. [00:22:40] Hong: Yeah. [00:22:42] Jeremy: And everybody has an inbox. And an inbox is an HTTP URL that people post to. [00:22:50] Hong: Right. [00:22:52] Jeremy: And so when you think about that, you had mentioned that if you have a list of followers, let's say you have a hundred followers, would that mean that you have the URLs to all hundred of those follower's inboxes and that you would send one HTTP post to each inbox every time you had a new message? [00:23:16] Hong: Pretty much all ActivityPub implementations have, a thing called shared inbox, it's exactly what it sounds like. One inbox that all actors on a server share. Private stuff like DMs don't go there (it) is just for public posts and thoughts. [00:23:36] Jeremy: I think we haven't really talked about the fact that, when you have multiple users, usually they're on a server, right? That somebody chooses. So you could have tens of thousands, I don't know how many people can fit on the same server. But, rather than, you having to post to each user individually, you can post to the shared inbox on this server. So let's say, of your 100 followers, 50 them are on the same server, and you have a new post, you only need to post to the shared inbox once. [00:24:16] Hong: Yes, that's right. [00:24:18] Jeremy: And in that message you would I assume have links to each of the profiles or actors that you wanted to send that message to. [00:24:30] Hong: Yeah. Scaling challenges [00:24:31] Jeremy: Something that I've seen in the past is there are people who have challenges with scaling. Their Mastodon instance or their implementations of ActivityPub. As the, the number of followers grow, I've seen a post about, ghost one of the companies you work with mentioning that they've had challenges there. What are the challenges there and, and how do you think those can be resolved? [00:25:04] Hong: To put this in context, when Ghost mentioned the scaling, they were not using Message Queue yet. I'm pretty sure using Message Queue would help a lot of their scaling problems. That said it is definitely true that a lot of activity post software has trouble with scaling right now. I think part of the problem is that everyone's using this purely event driven approach to sending activities around. One of the big issues is that when their delivery fails it's the sender who has to retry and not the receiver. Plus there's all this overhead because the sender has to authenticate itself with HTTP signatures every time. Actually the ActivityPub spec suggests using polling too so I'd love to see more ActivityPub software try using both approaches together. [00:26:16] Jeremy: You mean the followers would poll who they're following instead of the person posting the messages having to send their posts to everyone's inboxes. [00:26:29] Hong: Yeah. [00:26:29] Jeremy: I see. So that's a part of the ActivityPubs specification, but not implemented in a lot of ActivityPub implementations, And so it sounds like maybe that puts a lot of burden on the servers that have people with a lot of followers because they have to post to every single, follower server and maybe the server is slow or they can't reach it. And like you said, they have to just keep trying and trying. There could be a lot of challenges there. [00:27:09] Hong: Right. Account migration [00:27:10] Jeremy: We've talked a little bit about the fact that each person each actor is hosted by a server and those servers can host multiple actors. But if you want to move to another server either because your server is shutting down or you just would like to change servers, what are some of the challenges there? [00:27:38] Hong: ActivityPub and Fediverse already have the specification for an account move. It's called FEP-7628 Move Actor. First thing you need to do when moving an account is prove that both the old and new accounts belong to the same person. You do this by adding the all accounts, add the URI to the new account's AlsoKnownAs property. And then the old account contacts all the other instances it's moving by sending out a move activity. When a server gets this move activity, it checks that both accounts really do belong to the same parts, and then it makes all the accounts that, uh, were following the, all the accounts start to, following the new one instead. that's how the new account gets to keep all the, all the accounts follow us. pretty much all, all the major activity post software has this feature built in, for example, Mastodon Misskey you name it. [00:29:04] Jeremy: This is very similar to the post where when you execute a move, the server that originally hosted that actor, they need to somehow tell every single other server that was following that account that you've moved. And so if there's any issues with communicating with one of those servers, or you miss one, then it just won't recognize that you've moved. You have to make sure that you talk to every single server. [00:29:36] Hong: That's right. [00:29:38] Jeremy: I could see how that could be a difficult problem sometimes if you have a lot of followers. [00:29:45] Hong: Yeah. Fedify [00:29:46] Jeremy: You've created a TypeScript library Fedify for building ActivityPub powered applications. What was the reason you decided to create Fedify? [00:29:58] Hong: Fedify is (a) ActivityPub servers framework I built for TypeScript. It basically takes away a lot of headaches you'd get trying to implement (an) ActivityPub server from scratch. The whole thing started because I wanted to build hollo -- A single user microblogging platform I built. But when I tried, to implement ActivityPub from (the) ground up it was kind of a nightmare. Imagine trying to write a CGI program in Perl or C back in the late nineties, where you are manually printing, HTTP headers and HTML as bias. there just wasn't any good abstraction layer to go with. There were already some libraries and frameworks for ActivityPub out there but none of them really hit the sweet spot I was looking for. They were either too high level and rigid. Like you could only build a mastodon clone or they barely did anything at all. Or they were written in languages I didn't really know. Ghost and Fedify [00:31:24] Jeremy: I saw that you are doing some work with, ghost. How is Ghost using fedify? [00:31:30] Hong: Ghost is an open source publishing platform. They have put some money into fedify which is why I get to work on it full time now. Their ActivityPub feature is still in private beta but it should be available to everyone pretty soon. We work together to improve fedify. Basically they are a user of fedify. They report bugs request new features to fedify then I fix them or implement them, first. [00:32:16] Jeremy: Ghost to my understanding is a blogging platform and a a newsletter platform. So what does it mean for them to implement ActivityPub? What would somebody using Mastodon, for example, get when they follow somebody using Ghost? [00:32:38] Hong: Ghost will have a fediverse handle for each blog. If you follow them in your mastodon or something (similar) then a new post is published. These post will show up (in) your timeline in Mastodon and you can like them or share them. Andin the dashboard of Ghost you can see who liked their posts or shared their posts and so on. It is like how mastodon works but in Ghost. [00:33:26] Jeremy: I see. So if you are writing a ghost blog and somebody follows your blog from Mastodon, sort of like we were talking about earlier, they can like your post, and on the blog itself you could show, oh, I have 200 likes. And those aren't necessarily people who were on your ghost website, they could be people that were liking your post from Mastodon. [00:33:58] Hong: Yes. Misskey / Forkey development in Asia [00:34:00] Jeremy: Something you mentioned at the beginning was there is a community of developers in Asia making forks of I believe of Mastodon, right? [00:34:13] Hong: Yeah. [00:34:14] Jeremy: Do you have experience working in that development community? What's different about it compared to the more Western centric community? [00:34:24] Hong: They are very similar in most ways. The key difference is language of course. They communicate in Japanese primarily. They also accept pull requests with English. But there are tons of comments in Japanese in their code. So you need to translate them into English or your first language to understand what code does. So I think that makes a barrier for Western developers. In fact, many Western developers that contribute to misskey or forkey are able to speak a little Japanese. And many of the developers of misskey and forkey are kind of otaku. [00:35:31] Jeremy: Oh otaku okay. [00:35:33] Hong: It's not a big deal, but you can see (the) difference in a glance. [00:35:41] Jeremy: Yeah. You mentioned one of the things that I believe misskey implemented was the emoji reactions and maybe one of the reasons they wanted that was so that they could react to each other's posts with you know anime pictures or things like that. [00:35:58] Hong: Yeah, that's right. [00:36:01] Jeremy: You've mentioned misskey and forkey. So is misskey a fork of Mastodon and then is forkey a fork of misskey? [00:36:10] Hong: No, misskey is not a fork of mastodon. (It) is built from scratch. It's its own implementation. And forkeys are forks of Mastodon. [00:36:22] Jeremy: Oh, I see. But both of those are primarily built by Japanese developers. [00:36:30] Hong: Yes. Whereas Mastodon (is) written in Ruby. Ruby on Rails. But misskey is built in TypeScript. [00:36:40] Jeremy: And because of ActivityPub -- they all implement it. So you can communicate with people between mastodon and misskey because they all understand the same activities. [00:36:56] Hong: Yes. Backwards compatible activity implementations [00:36:57] Jeremy: You did mention since there are extensions like misskey has the emoji reactions. When there is an activity that an implementation doesn't support what happens between the two servers? Do you send it to a server's inbox and then the server just doesn't do anything with it? [00:37:16] Hong: Some implementers consider backwards compatibility. So they design (it) to work with other implementations that don't support that activity. For example misskey uses like activity for emoji reaction. So if you put an emoji to a Mastodon post then in Mastodon you get one like. So it's intended behavior by misskey developers that they fall back to normal likes. But sometimes ActivityPub implementers introduce entirely new activity types. For example Pleroma introduced the emoji react. And if you put emoji reaction to Mastodon post from Pleroma in Mastodon you have nothing to see because Mastodon just ignores them. [00:38:37] Jeremy: If I understand correctly, both misskey and Pleroma are independent implementations of ActivityPub, but with misskey, they can tell when or their message is backwards compatible where it's if you don't understand the emoji reaction, it'll be embedded inside of a like message. Whereas with Pleroma they send an activity that Mastodon can't understand at all. So it just doesn't do anything. [00:39:11] Hong: Yes, right. But, Misskey also understands (the) emoji react activity. So between pleroma and misskey they have exchanged emoji reactions with no problem. [00:39:27] Jeremy: Oh, I see. So they, they both understand that activity. They both implement it the same way, but then when misskey communicates with Mastodon or with an instance that it knows doesn't understand it, it sends something different. [00:39:45] Hong: Yeah, that's right. [00:39:47] Jeremy: The servers -- can they query one another to know which activities they support? [00:39:53] Hong: Usually ActivityPub implementations also implement NodeInfo specification. It's like a user agent-like thing in Fediverse. Implementations tell the other instance (if it) is Mastodon or something else. You can query the type of server. [00:40:20] Jeremy: Okay, so within ActivityPub are each of the servers -- is the term node is that the word they use for each server? [00:40:31] Hong: Yes. Right. [00:40:32] Jeremy: You have the nodes, which can have any number of actors and the servers send activities to one another, to each other's inboxes. And so those are the way they all communicate. [00:40:49] Hong: Yeah. Building an ActivityPub implementation [00:40:50] Jeremy: You've implemented ActivityPub with Fedify because you found like there weren't good enough implementations or resources already. Did you implement it based off of the specification or did you look at existing implementations while you were building your implementation? [00:41:12] Hong: To be honest, instead of just, diving into the spec. I usually start by looking at actually ActivityPub software code first. The ActivityPub spec is so vague that you can't really build something just from reading it. So when we talk about ActivityPub, we are actually talking about a whole bunch of other technical standards too, WebFinger, HTTP signatures and more. So you need to understand all of these as well. [00:41:47] Jeremy: With the specification alone, you were saying it's too vague and so what ends up being -- I'm not sure if it's right to call it a spec, but looking at the implementations that people have already made that collectively becomes the spec because trying to follow the spec just by itself is maybe too difficult. [00:42:12] Hong: Yes. [00:42:14] Jeremy: Maybe that brings up the issues you were talking about before where you have specifications like JSON-LD where they're so complicated that even the biggest implementations aren't quite following it exactly. [00:42:28] Hong: Yeah. [00:42:29] Jeremy: If somebody wanted to, to get started with understanding a little bit more about ActivityPub or building something with it where would you recommend they start? [00:42:44] Hong: I recommend to dig into a lot of code from actual implementations. First, Mastodon, Misskey, Akkoma and so on. There are are some really cool tools that have been so helpful. For example, ActivityPub Academy is this awesome mastodon server for debugging ActivityPub. It makes it super easy to create a temporary account and see what activities are going back and forth. There is also BrowserPub. BrowserPub is this neat tool for looking up and browsing ActivityPub objects. It's really handy when you want to see how different ActivityPub software handles various features. I also recommend to use Fedify. I've got to mention the Fedify CLI, which comes with some really useful tools. [00:43:46] Jeremy: So if someone uses Fedify they're writing an application in TypeScript, then it sounds like they have to know the high level concepts. They have to know what are the different activities, what is inside of an actor. But the actual implementation of how do I create and parse JSON linked data, those kinds of things are taken care of by the library. [00:44:13] Hong: Yes, right. [00:44:16] Jeremy: So in some ways it seems like it might be good to, like you were saying, use the tools you mentioned to create a test Mastodon account, look at the messages being sent back and forth, and then when you're trying to implement it, starting with something like Fedify might be good because then you can really just focus on the concepts and not worry so much about the, the implementation details. [00:44:43] Hong: Yes, that's right. [00:44:45] Jeremy: Is there anything else you. Wanted to mention or thought we should have talked about? [00:44:52] Hong: Mm. I want to, talk about, a lot of stuff about ActivityPub but it's difficult to speak in English for me, so, it's a shame to talk about it very little. [00:45:15] Jeremy: We need everybody to learn Korean right? [00:45:23] Hong: Yes, please. (laughs) [00:45:23] Jeremy: Yeah. Well, I wanna thank you for taking the time. I know it must have been really challenging to give an interview in, you know, a language that's not your native one. So thank you for spending the time to talk with me. [00:45:38] Hong: Thank you for having me.

Prefetcher on Building PinkSea on the AT Protocol

Play Episode Listen Later Feb 25, 2025 73:14

Kacper "prefetcher" Staroń created the PinkSea oekaki BBS on top of the AT Protocol. He also made the online multiplayer game MicroWorks with Noam "noam 2000" Rubin. He's currently studying Computer Science at the Lublin University of Technology. We discuss the appeal of oekaki BBSs, why and how PinkSea was created, web design of the early 2000s, flash animations, and building an application on top of the AT Protocol. Prefetcher Bluesky Github Personal site Microworks (Free multiplayer game) PinkSea and Harbor PinkSea PinkSea Bluesky Account PinkSea repository Harbor image proxy repository Harbor post from bnewbold.net imgproxy (Image proxy used by Bluesky) Early web design Web Design Museum Pixel Art in Web Design Kaliber10000 Eboy Assembler 2advanced epuls.pl (Polish social networking site) Wipeout 3 aesthetic Restorativland (Geocities archive) Flash sites and animations My Flash Archive (Run by prefetcher) dagobah Z0r Juicy Panic - Otarie IOSYS - Marisa Stole the Precious Thing Geocities style web hosts Neocities Nekoweb AT Protocol / Bluesky PDS Relay AppViews PLC directory Decentralized Identifier lexicon Jetstream XRPC ATProto scraping (List of custom PDS and did:web) Tools to view PDS data PDSls atp.tools ATProto browser Posters mentioned vertigris (Artist that promoted PinkSea) Mary (AT Protocol enthusiast) Brian Newbold (Bluesky employee) Oekaki drawing applets Tegaki chickenpaint Group drawing canvas Drawpile Aggie Other links Bringing Geocities back with Kyle Drake (Interview with creator of Neocities) firesky.tv (View all bluesky posts) ATFile (Use PDS as a file store) PinkSky (Instagram clone) front page (Hacker news clone) Smoke Signal (Meetup clone) -- Transcript You can help correct transcripts on GitHub. Intro [00:00:00] Jeremy: Today I am talking to Kacper Staroń. He created an oekaki BBS called PinkSea built on top of the AT protocol, and he's currently studying computer science at the Lublin University of Technology. We are gonna discuss the appeal of oekaki BBS, the web design of the early 2000s, flash animations, and building an application on top of the AT protocol. Kacper, thanks for talking with me today. [00:00:16] Prefetcher: Hello. Thank you for having me on. I'm Kacper Staroń also probably you know me as Prefetcher online. And as Jeremy's mentioned, PinkSea is an oekaki drawing bulletin board. You log in with your Bluesky account and you can draw and post images. It's styled like a mid to late 2000s website to keep it in the spirit. What's an oekaki BBS? [00:00:43] Jeremy: For someone who isn't familiar with oekaki BBSs what is different about them as opposed to say, a photo sharing website? [00:00:53] Prefetcher: The difference is that a photo sharing website you have the image already premade be it a photo or a drawing made in a separate application. And you basically log in and you upload that image. For example on Instagram or pixiv for artists even Flickr. But in the case of an oekaki BBS the thing that sets it apart is that oekaki BBSes already have the drawing tools built in. You cannot upload an already pre-made image with there being some caveats. Some different oekaki boards allow you to upload your already pre-made work. But Pinksea restricts you to a tool called Tegaki. Tegaki being a drawing applet that was built for one of the other BBSes and all of the drawing tools are inside of it. So you draw from within PinkSea and you upload it to the atmosphere. Every image that's on PinkSea is basically drawn right on it by the artists. No one can technically upload any images from elsewhere. How PinkSea got started and grew [00:01:56] Jeremy: You released this to the world. How did people find it and how many people are using it? [00:02:02] Prefetcher: I'll actually begin with how I've made it 'cause it kind of ties into how PinkSea got semi-popular. One day I was just browsing through Bluesky somewhere in the late 2024s. I was really interested in the AT Protocol and while browsing, one of the artists that I follow vertigris posted a post basically saying they'd really want to see something a drawing canvas like Drawpile or Aggie on AT Protocol or something like an oekaki board. And considering that I was really looking forward to make something on the AT Protocol. I'm like, that sounds fun. I used to be a member of some oekaki boards. I don't draw well but it's an activity that I was thinking this sounds like a fun thing to do. I'm absolutely down for it. From like, the initial idea to what I'd say was the first time I was proud to let someone else use it. I think it was like two weeks. I was posting progress on Bluesky and people seemed eager to use it. That kept me motivated. And yeah. Right as I approached the finish I posted about it as a response to vertigris' posts and people seemed to like it. I sent the early version to a bunch of artists. I basically just made a post calling for them. Got really positive feedback, things to fix, and I released it. And thanks to vertigris the post went semi-viral. The launch I got a lot of people which I would also tie to the fact that it was right after one of the user waves that came to Bluesky from other platforms. The website also seemed really popular in Japan. I remember going to sleep, waking up the next day, and I saw like a Japanese post about PinkSea and it had 2000 reposts and 3000 likes and I was just unable to believe it. Within I think the first week we got like 1000 posts overall which to me is just insane. For a week straight I just kept looking at my phone and clicking, refresh, refresh, refresh, just seeing the new posts flow in. There was a bunch of like really insane talented artists just posting their works. And I just could not believe it. PinkSea got I'd say fairly popular as an alternative AppView. People seem to really want oekaki boards back and I saw people going, oh look, it's like one of those 2000s oekaki boards! Oh, that's so cool! I haven't seen them in forever! The art stands out because it's human made [00:04:58] Prefetcher: And it made me so happy every single time seeing it. It's been since November, like four months, give or take. And today alone we got five posts. That doesn't sound seem like a lot but given that every single post is hand drawn it's still insane. People go on there and spend their time to produce their own original artworks. [00:05:26] Jeremy: This is especially relevant now when you have so much image generation stuff and they're making images that look polished but you're kind of like well... did you draw it? [00:05:39] Prefetcher: Yeah. [00:05:40] Jeremy: And when you see people draw with these oekaki boards using the tools that are there I think there's something very human and very nostalgic about oh... This came from you. [00:05:53] Prefetcher: Honestly, yeah. To me seeing even beginner artists 'cause PinkSea has a lot of really, really talented and popular people (and) also beginner artists that do it as a hobby. Ones that haven't been drawing for a long time. And no matter what you look at you just get like that homely feeling that, oh, that's someone that just spent time. That's someone that just wanted to draw for fun. And at least to me, with generative AI like images it really lacks that human aspect to it. You generate an image, you go, oh, that's cool. And it just fades away. But in this case you see people that spent their time drawing it spent their own personal time. And no matter if it's a masterpiece or not it's still incredibly nice to see people just do it for fun. [00:06:54] Jeremy: Yeah. I think whether it's drawing or writing or anything now more than ever people wanna see something that you made yourself right? They wanna know that a human did this. [00:07:09] Prefetcher: Yeah. absolutely. [00:07:11] Jeremy: So it sounds like, in terms of getting the initial users and the ones that are there now, it really all came out of a single Bluesky posts that an existing artist (vertigris) noticed and boosted. And like you said, you were lucky enough to go viral and that carried you all the way to now and then it just keeps going from there, [00:07:36] Prefetcher: Basically if not for vertigris PinkSea (would) just not exist because I honestly did not think about it. My initial idea on making something on ATProto and maybe in the future I'll do something like that would be a platform like StumbleUpon -- Something that would just allow you to go on a website, press a button, and it gets uploaded to your repo and your friends would be able to see oh -- you visited that website and there would be an AppView that would just recommend you sites based on those categories. I really liked that idea and I was dead set on making it but then like I noticed that post (from vertigris) and I'm like, no, that's better. I really wanna make that. And yeah. So right here I want to give a massive shout out to vertigris 'cause they've been incredibly nice to me. They've even contributed the German translation of PinkSea which was just insane to me. And yeah, massive shout out to every single other artist that, Reposted it, liked it, used it because, it's all just snowballed from there and even recently I've had another wave of new users from the PinkSea account. So there are periods where it goes up and it like goes chill -- and then popular again. Old internet and flash [00:08:59] Jeremy: Yeah. And so something that you mentioned is that some people who came across it they mentioned how it was nostalgic or it looked like the old oekaki BBSs from the early internet. And I noticed that that was something that you posted on your own website that you have an interest in that specifically. I wonder what about that part of the internet interests you? [00:09:26] Prefetcher: That is a really good question. Like, to me, even before PinkSea my interests lie in the early internet. I run on Twitter and also on Bluesky now an account called My Flash Archive, which was an archive of very random, like flash animations. And I still do that just not as much anymore 'cause I have a lot of other things to do. I used to on Google just type in Flash and look through the oldest archived random folders just having flash videos. And I would just go over them save all of that or go on like the dagobah or Z0r or swfchan. 'cause the early internet to me, it was really like more explorative. 'cause like now you have, people just concentrated in those big platforms like Twitter, Instagram, Facebook, whatever. And back then at least to me you had more websites that you would just go on, you would find cool stuff. And the designs were like sometimes very minimal, aesthetically pleasing. I'd named here one of my favorite sites, Kaliber10000 which had just fantastic web design. Like, I, I also spend a lot of time on like the web design museum just like looking at old web design and just in awe. My flash archive on Twitter at least got very popular. I kind of abandoned that account, but I think it was sitting at 12,000 followers if not more? And showed that people also yearn for that early internet vibe. And to me it feels really warm. Really different from the internet nowadays. Even with the death of flash you don't really have interactive experiences like it anymore. 'cause flash was supposed to be replaced by HTML5 and JavaScript and whatever but you don't really make interactive experiences that just come packaged in a single file like flash. You need a website and everything. In flash, it just had a single file. It could be shared on multiple sites and just experienced. That kind of propelled my interest. Plus I, I dunno, I just really like the old internet design aesthetics it really warms me (and really close..?) Flash loops [00:12:01] Jeremy: The flash one specifically. Were they animations or games or was there a specific type of a flash project that spoke to you? [00:12:15] Prefetcher: Something we call loops. Basically, it's sometimes animations. 'cause, surprisingly while I like flash games they weren't my main collection. What spoke to me more were loops. Basically someone would take a song, find a gif they liked, and they would just pair it together. Something like YTMND did. At least from loops I found some of my favorite musical artists, some of my favorite songs, a lot of interesting series, be it anime or TV or whatever. And you basically saw people make stuff about their favorite series and they would just share it online. I would go over those. For example, a good website as an example is z0r.de, which is surprisingly still active and updated to this day. And you would see people making loops about members of that community or whatever they like. And you would for example see like 10 posts about the same thing. So you would know someone decided to make 10 loops and just upload them at once. And yeah, to me, loops basically were like, I mean, they weren't always the highest quality or the most unique thing, but you would see someone liked something enough that they decided to make something about it. And I always found that really cool. I would late at night just browse for loops and I'm like, oh, oh, this series, I remember it. I liked it (laughs)! But of course flash games as well. I mean, I used to play a lot when I was younger, but specifically loops, even animations and especially like when someone took like their time to animate something like really in depth. My favorite example is, the music video to a song by the band Juicy Panic called Otari. Someone liked that song enough that they made an entire flash animated music video, which was basically vectorized art of various series like Azumanga Daioh or Neon Genesis Evangelion as well, and other things. And it was so cool, at least to me, like a lot of these loops just basically have an intense, like immense feeling on me (laughs). I just really liked collecting them. [00:14:38] Jeremy: And in that last example, it sounded more like it was a complete music video, not just a brief loop? [00:14:45] Prefetcher: No, it was like a five minute long music video that someone else made. [00:14:48] Jeremy: Five. Oh my gosh. [00:14:49] Prefetcher: Yeah. You would really see people's creativity shine through on just making those weird things that not a lot of people have seen, but you look at it and it's like, wow. It's different than YouTube (Sharable single file, vectorized) [00:15:01] Jeremy: It's interesting because you can technically do and see a lot of these things on, say, YouTube today, but I think it does feel a little different for some reason. [00:15:16] Prefetcher: It really is. Of course I'm not denying on YouTube you see a lot of creative things and whatever. But first and foremost, the fact that Flash is scalable. You don't lose the quality. So be able to open, I don't know, any of the IOSYS flash music videos for like their Touhou songs and the thing would just scale and you would see like in 4K and it's like, wow. And yeah, the fact that on YouTube you have like a central place where you just like put something and it just stays there. Of course not counting reuploads, but with Flash you just had like this one animation file that you would just be able to share everywhere and I don't know, like the aspect of sharing, just like having those massive collections, you would see this flash right here on this website and on that website and also on this website. And also seeing people's personal collections of flash videos and jrandomly online and you would also see this file and this file that you haven't seen it -- it really gives it, it's like explorative to me and that's what I like. You put in the effort to like go over all those websites and you just like find new and new cool stuff. [00:16:32] Jeremy: Yeah, that's a good point too that I hadn't thought about. You can open these files and you have basically the primitives of how it was made and since, like you said, it's vector based, there's no, oh, can you please upload it in 1080 p or 4K? You can make it as big as you want. [00:16:53] Prefetcher: Yeah. Web design differences, pixel art, non-responsive [00:16:55] Jeremy: I think web design as well it was very distinct. Maybe because the tools just weren't there, so a lot of people were building things more from scratch rather than pulling a template or using a framework. A lot of people were just making the design theirs I think rather than putting words on a page and filling into some template. [00:17:21] Prefetcher: Honestly, you raise a good point here that I did not think much about. 'cause like nowadays we have all of this tooling to make web design easier and you have design languages and whatnot. And you see people make really, in my opinion, still pretty websites, very usable websites on top of that. But all of them have like the same vibes to them. All of them have like a unified design language and all of them look very similar. And you kind of lose that creativity that some people had. Of course, you still find pretty websites that were made from scratch. But you don't really get the same vibes that you did get like back then. Like my favorite, for example, trend that used to be back on like the old internet is pixel art in web design. For example, Kaliber10000, or going off the top of my head, you had the Eboy or all the sites and then Poland, for example, ... (polish website) those websites use minimal graphics, like pixel graphics and everything to build really interesting looking websites. They had their own very massive charm to them that, I don't know, I don't see a lot in more modern internet. And it's also because back then you were limited by screen size, so you didn't have to worry about someone being on a Mac with high DPI or on a 32x9 monitor like I am right now. And just having to scale it up. So you would see people go more for images, like UI elements, images instead of just like building everything from scratch and CSS and whatnot. So, yeah, internet design had to accommodate the change. So we couldn't stay how it was forever 'cause technology changed. Design language has changed, but to me it's really lost its charm. Every single website was different, specific, the web design had like this weird form, at least on websites where it was like. I like to call it futuristic minimalism. They looked very modern and also very minimal and sort of dated. And I dunno, I just really like it. I absolutely recommend checking, on the web design museum fantastic website. I love them and the pixel art in web design sub page. Like those websites to me they just look fantastic. [00:19:52] Jeremy: Yeah, and that's a good point you brought up about the screen sizes where now you have to make sure your website looks good on a phone, on a tablet, on any number of monitor sizes. Back then in the late 90s, early 2000s, I think most people were looking at these websites on their 4x3 small CRT monitors. [00:20:20] Prefetcher: My favorite this website is best viewed with an 800 by 600 monitor. It's like ... what? [00:20:28] Jeremy: Exactly. Even if you open your personal site now the design is very reminiscent of those times and it looks really cool but at the same time on a lot of monitors it's a small box in the middle of the monitor, so it's like -- [00:20:49] Prefetcher: I saw that issue, 'cause I was making it on a 1080p monitor and now I have a 32x9 monitor and it does not scale. I've been working on reworking that website, but, also on the topic of my website, I, I wanna shout out a website from the 2000s that still exists today. 'cause, my website was really inspired by a website called Assembler. And Assembler, from what I could gather, was like a net art or like internet design collective. And the website still works to this day. You still had like, all of their projects, including the website that my website was based off of. [00:21:28] Jeremy: Yeah, I mean there, there definitely was an aesthetic to that time. And it's probably, like you said, it's probably people seeing someone else's site in this case, what, what did you call it? Assem? Assembler? [00:21:42] Prefetcher: Assembler. [00:21:42] Jeremy: Yeah. You see someone else's website and then maybe you try to copy some of the design language or you look at the HTML and the CSS and I mean, really at the time, these websites weren't being made with a ton of JavaScript. There weren't the minifiers, so you really could view source and just pull whatever you wanted from there. [00:22:06] Prefetcher: We also had those design studios, design agencies, notably 2advanced which check in now, their website still works, and their website is still in the same aesthetic as it was those 20 so years ago just dictating this futuristic design style that people really like. 'cause a lot of people nowadays also really like this old futurism minimalism for example a lot of people still love the Wipeout 3 aesthetic that was designed by one of my favorite studios overall the designers republic. And yeah, it's just hard for me to explain, but it feels so soulful in a way. [00:22:53] Jeremy: I think there are some trade offs. There's what we were talking about earlier with the flexibility of screen size. But there used to be with a lot of websites that used Flash, there used to be these very elaborate intros where the site is loading and there's these really neat animations. But at the same time, it's sort of like, well, to actually get to the content, it's a bit much, but, everything is a trade off. [00:23:25] Prefetcher: People had flash at their disposal and they just wanted to make, I have the tooling, I'm going to use all of the tooling and all of it. [00:23:33] Jeremy: Yeah. Yeah. but yeah, I definitely get what you're saying where when I went to make my own website I made it very utilitarian and in some ways boring, right? I think we do kind of miss some of what we used to have. [00:23:54] Prefetcher: I mean, in my opinion, utilitarian websites are just as fine. Like in some cases you don't really need a lot of flashy things and a lot of very modern very CPU intensive or whatever animations. Sometimes it is better to go on a website and just like, see, oh, there's the play button and that's it. [00:24:17] Jeremy: Yeah. Well definitely the animations and the intro and all that stuff. I guess more in terms of the aesthetics or the designs. It's tricky because there's definitely people making very cool things now things that weren't even possible back then. But it does feel like maybe the default is I'll pick this existing style sheet or this existing framework and just go with that. [00:24:47] Prefetcher: A lot of modern websites just go for similar aesthetics, similar designs, which they aren't bad, but they are also very just bland. They, they are futuristic, they are very well designed. But when you see the same website. The same -- five websites have the same feel. And this is especially, at least in my opinion, visible with websites built on top of NextJS or other frameworks. And it just feels corporate kind of dead. Like someone just makes a website that they want to sell something to you and not for fun. [00:25:26] Jeremy: With landing pages especially it's like, wow, this looks the same as every other site, but I guess it must work. [00:25:38] Prefetcher: It works. And it really cuts down on development time. You don't need to think much about it. You just already have a lot of well-established design rules that you just follow and you get a cohesive and responsive design system. Designing the PinkSea look and feel [00:25:56] Jeremy: Let's talk about that in connection with PinkSea. What was your thinking when you designed how PinkSea would look and feel? [00:26:06] Prefetcher: Honestly, at first I have to admit I looked at other websites. I looked at Bluesky first and foremost. I looked at, front page. I looked at Smoke Signal, and I thought that I might also build something that's modern and sleek and I sketched it out in an application and I showed it to some friends. One of them suggested I go for more like a 2000 aesthetic. I'm like, yeah, okay. I like that. As the website was built, I just saw more and more of how much I feel this could sit with others. Especially with the fact that it's an oekaki page an oekaki BBS and as you scroll through oekaki has a very distinct style to it. And as you scroll and you see all of those, pixel shaded, all those dithered images, non anti-aliased pens and whatnot. It feels really really cohesive somehow with the design aesthetic. But of course, PinkSea in itself is a modern website. Like if you were to go to my PinkSea repository. It's a modern website built up on top of Vue3, which talks via like XRPC API calls in real time and it's a single page app and whatever. That's kind of the thing I merged the modern way of making sites with a very oldish design language. And I feel, in my opinion, it somehow just really works. And especially it sets PinkSea apart from the other websites. It gives it that really weird aesthetic. You would go on it and you would not be like, oh, this is a modern site that connects with a modern protocol on top of a big decentralized network. This is just someone's weird BBS stuck in the 2000s that they forgot to shut down. (laughs) [00:28:00] Jeremy: Yeah. And I think that's a good reminder too, that when people are intentional about design, the tools we have now are so much better than what we used to have. There's nothing stopping us from making websites that when people go to them they really feel like something's different. I know I did not just land on Instagram. [00:28:27] Prefetcher: Yeah. And making PinkSea taught me that it's really easy to fall into that full string of thought that every site has to look modern. Because I was like, oh yeah, this is a modern protocol, a modern everything, and it has to look the part. It has to look interesting to people and everything. And after talking with a bunch of friends and other people and just going, huh, that's maybe like the 2000s isn't as bad as I thought. And yeah, the website especially it's design people seem to just really like it. Me too. I, I just absolutely love how PinkSea turned out it is really a reminder that you don't need modernness in web design always. And people really appreciate quirky looking pages, so to say, quirky like interesting. [00:29:23] Jeremy: I interviewed the, the creator of Neocities which is like kind of a modern version of GeoCities and yeah, that's really what one of the aspects that I think makes things so interesting to people from that era is, is that it really felt like you're creating your own thing, and not just everything looks the same. The term I think he used is homesteading. You're taking care of your place and it can match your sensibilities, your style, your likes, rather than having to, like you said, try to force everything to be this, this sort of base modern, look. The old spirit of the internet is coming back [00:30:08] Prefetcher: I mean Neocities and by extension also Nekoweb are websites that I often when I don't have much to do -- I like just going through them because you see a bunch of people just make their own places. And you see that even in 2025 when we have those big social media sites. You have platforms where you can get a ton of followers. You can get a ton of attention and everything. People to some extent still want that aspect of self-expression. They want to be able to make something that's uniquely theirs and you see people just make just really amazing websites build insane things on those old Geocities-like platforms using nothing but a code editor. You see them basically just wanting thing to express, oh, that's mine and no one else has it. So to say that's why. Yeah. I feel like to some extent the old school train of thought when it comes to the internet is slowly coming back. Especially with the advent of protocols like ATProto. And you'll experience more websites that just allow people to make their own homes on the internet. Cause in my opinion, one of the biggest problems is that people do not really want to register on a lot of platforms. 'cause you already have this place where you get all of your followers, you have all of your connections, and then you want to move and then you'll lose all of your connections and everything. But with something like ATProto, you can use the social graph of, for example, Bluesky. I want to add followers on PinkSea. So for example, you have an artist that has like 30,000 followers for example, I can just click import my following from Bluesky. And just like that they would already get all of the artists that they follow on Bluesky already added as followers on PinkSea. And for example, someone else joins and they followed that big artist and they instantly followed them on PinkSea as well. I think that we are slowly coming back to the advent of people owning their place online. PinkSea and ATProto (PDS) [00:32:24] Jeremy: Yeah. So let's talk a little bit more about how PinkSea fits into ATProto. For people who aren't super familiar with ATProto, maybe you could talk about how it's split up. You've got the PDS, the relays, the AppView. What are those and how do those fit into what PinkSea is? [00:32:48] Prefetcher: My favorite analogy, ATProto is a massive network, and at least me, when I saw the initial graph I was just very confused. I absolutely did not know what I'm looking at. But let's start with the base building block, something that ATProto wouldn't exist with. And it's the PDS. Think of the PDS as like a filing cabinet. You have a bunch of folders in which you have files, so to say. So you have a filing cabinet with your ID, this is the DID part that sometimes shows up and scares people. It's what we call a decentralized identifier. Basically that identifier is not really tied to the PDS, it just exists somewhere. And the end goal is that every user controls their DID. So for example, if your PDS shuts down, you can always move to somewhere else. Still keep like, for example, that you are prefetcher.miku.place. But in that filing cabinet the PDS going back to it you have your own little zone, your own cabinets, and that has your identifier, it's uniquely yours. Every single application on the AT protocol creates data. They create data and they store the data in a structured format called a record. A record is basically just a bunch of data that explains what that thing is, be it a like, a post on Bluesky an oekaki on PinkSea and an upvote on front page, or even a pixel on place.blue. And all of those records are organized into folders in your cabinet. And that folder is named with something we call a collection id. So for example, a like is, if I remember correctly, it's app.bsky.feed.like, so you see that it belongs to Bluesky. The app.bsky part. it's a feed thing, and the same way, PinkSea, for example, the oekaki and PinkSea uses com.shinolabs.pinksea.oekaki with com.shinolabs being the the collective that I use as a, pen name, so to say. PinkSea being, well, PinkSea and oekaki just being the name. It's an oekaki. If you want to see that there are a lot of tools, for example, PDSls or atp.tools or ATProto browser, if you had to go into one of those and you would type in for example, prefetcher.miku.place, you would see all of your records, the things that, you've created on the AT protocol network. Relay [00:35:19] Prefetcher: So you have a PDS, you have your data, but for example, imagine you have a PDS that you made yourself, you hosted yourself. How will, for example, Bluesky know that you exist? 'cause it won't, it's just a server in the middle of nowhere. That's where we have a relay. A relay is an application that listens to every single server. So every time you create something or you delete something, or for example, you edit a post, you delete an oekaki. You create a new, like -- Your PDS, your filing cabinet generates a record of that. It generates an event, something we call a commit. So, anytime you do something, your PDS goes, Hey, I did that thing. And relays function as big servers that a PDS can connect to. And it's a massive shout box. The PDS goes, Hey, I made this. Then the relay aggregates all of those PDSs into one and creates a massive stream of every single event that's going on the network at once. That's also where the name firehose comes from. 'cause the, the end result, the stream is like a firehose. It just shoots a lot of data directly at anyone who can connect to it. And the thing that makes AT Protocol open and able to be built on is that anyone can just go, I want to connect to jetstream1.west.bluesky.network. They just make a connection to it and boom they just get everything that's happening. You can, for example, see that via firesky.tv. If you go to it, you would open it in your browser. Every single Bluesky post being made in real time right directly in your computer. So you have the PDSs that store data, you have the relay that aggregates every, like, builds a stream of every single event on the network. AppViews [00:37:26] Prefetcher: You just get records. You can't interact with it. You can see that someone made a new record with that name, but to a human, you won't really understand what a cid is or what property something else is. That's why you have what we call AppViews. An AppView, or in full an application view is an application that runs on the AT protocol network. It connects to the relay and it transforms the network into a state that it can be used by people. That's why it's called an application view. 'cause it's a, a specialized view into the whole network. So, for example, PinkSea connects, and then it goes, hey, I want to listen on every single thing that's happening to com.shinolabs.pinksea.oekaki, and it sees all of those, new records coming in and PinkSea understands, oh, I can turn it into this, and then I can take this thing, store it in the database, and then someone can connect with a PinkSea front end. And then it can like, transform those things, those records into something that the front end understands. And then the front end can just display, for example, the timeline, the same way Bluesky, for example -- Bluesky gets every single event, every single new file, new record coming in from the network. And it goes. Okay, so this will translate into one more like on this post. And this post is a reply to that post. So I should chain it together. Oh. And this is a new feed, so I should probably display it to the user if they ask for feeds. And it basically just gets a lot of those disjoint records and it makes sense of them all. The end user has a different API to the Bluesky AppView. And then they can get a more specialized view into Bluesky. PinkSea does not store the original images, the PDS does [00:39:26] Jeremy: And so in that example, the PDSs, they can be hosted by Bluesky the company, or they could be hosted by any person. And so PinkSea itself, when somebody posts a new oekaki, a new image, they're actually telling PinkSea to go create the image in the user's PDS, right? PinkSea is itself not the the source of truth I guess you could say. [00:40:00] Prefetcher: PinkSea in itself. I don't remember which Bluesky team member said it, but I like the analogy that AppViews are like Google. So in Google, when you search something, Google doesn't have those websites. Google just knows that this thing is on that website. In the same vein, PinkSea, when you create a new oekaki, you tell PinkSea, Hey, go to my PDS and create that record for me. And then the person owns the PDS. So for example, let's say that in a year, of course I won't do it, but hypothetically here, I just go rogue and I shut down PinkSea, I delete the database. You still own the things. So for example, if someone else would clone the PinkSea repository and go here, there's PinkSea 2. They can still use all of those images that were already on the network. So, AppViews in a way basically just work as a search engine for the network. PinkSea doesn't store anything. PinkSea just indexes that a user made a thing on that server. And here I can show you how to get to it somehow. Those images aren't stored by PinkSea, but instead, I know that the image itself is stored, for example, on pds.example.com, and of course to reduce the load, we have a proxy. PinkSea asks the proxy to go to pds.example.com and fetch the image, and then it just returns it to the user. [00:41:37] Jeremy: And so what it sounds like then is if someone were to create oekaki on their own PDS completely independently of Pink Sea the fact that they had created that image would be sent to one of the relays, and then PinkSea would receive an event that says oh, this person created a new image then at that point your index could see, oh, somebody created a new image and they didn't even have to go through the PinkSea website or call the PinkSea APIs. Is that right? Sharing PDS records with other applications [00:42:14] Prefetcher: Yep. That is exactly right. For example, someone could now go, Hey, I'm making my own PinkSea-like application. And then they would go, I want to be compatible with PinkSea. So I'm using the same record. Or what we call a lexicon, basically describe how records look like. I forgot to mention that, but every single record has an attached lexicon. And lexicons serve as a blueprint. So a lexicon specifies, oh, this has an image, this has a for example, the tags attached to it, a description of the image. Validate that the record is correct, that you don't get someone just making up random stuff. But yeah, someone could just go, Hey, I'm making another website. Let's call it GreenForest for example. And GreenForest is also an oekaki website, but it uses, for example, chickenpaint instead of tegaki but I want to be able to interoperate with PinkSea. so I'm also gonna use com.shinolabs.pinksea.oekaki the collection, the same record, the same lexicon. And for example, they have their own servers and the servers just create regular oekaki records. So for example, GreenForest gets a new user, they log in, create, draw their beautiful image, and then they click upload it. So GreenForest goes to that person's PDS and tells the PDS, Hey, I want to make a new. com.shinolabs.pinksea.oekaki record. The PDS goes okay, I've done it for you. Let me just inform the relay that I did so, relay gets the notification that someone made that new PinkSea oekaki record. And so the main PinkSea instance, pinksea.art, which is listening in on the relay, gets a notification from the relay going, Hey, there is this new oekaki record. And PinkSea goes, sure, I'll index it. And so PinkSea just gets that GreenForest image directly in itself. And in the same vein, someone at PinkSea could draw something in tegaki -- their own beautiful character. And the same thing would happen with GreenForest. GreenForest would get that PinkSea image, that PinkSea record, and index it locally. So the two platforms, despite being completely different, doing completely different things, they would still be able to share images with each other. Bluesky PDS stores other AppView's data but they could stop at anytime [00:44:38] Jeremy: And these images, since they're stored in the PDS, what that would mean is that anybody building an application on ATProto, they can basically use Bluesky's PDS or the user's PDS as their storage. They could put any number of images in there and they could get into gigabytes of images. And that's the responsibility of the PDS and not yourself to keep track of. [00:45:12] Prefetcher: Yes, that can be the case. Of course, there is a hard limit on how big a single upload can be, which is, if I remember correctly, I don't wanna lie, I think it's 50 megabytes, I don't recall there being a hard cap on how big a single repository can be. I know of some people whose repositories are in the single gigabyte digits but this kind of is a thing scares app developers. 'cause you never know when Bluesky the company -- 'cause most people registering, are registering on Bluesky. We don't really know whether Bluesky, the company will want to keep it for free. Forever allow us to do something like that. You already have projects like, for example, ATFile, which just allow you to upload any arbitrary data just to store it, on their servers and they are paying for you. So we'll never know whether Bluesky will decide, okay, our services are only for Bluesky if you want to use PinkSea you have to deal with it. Or whether they go, okay, if you want to use alternative AppViews you have to pay us in order to host them. So, that also leads me to the fact that decentralization is an important part of AT protocol as Bluesky themselves say that they are a potential adversary. You cannot trust them in the long term. Right now they are benign right now, they're very nice, but, we never know how Bluesky will end up in a year or two. So if you want to be in the full control of your data, you need to sadly host it by yourself. And it's honestly really easy in order to do so. There is a ton of really useful online content blogs and whatever. I think I've set up my PDS in 10 minutes on a break between classes and university. But to a person that's non-technical that doesn't know much I'd say around an hour to two hours The liability and potential abuse from running a PDS [00:47:14] Jeremy: Yeah, I think the scary thing for a lot of people is technical or not, is even if it's easy to set up, you gotta make sure it keeps running. You gotta have backups. And so it could be a lot. [00:47:30] Prefetcher: Yeah. This is to be expected by the fact that you're in control of your data. Keeping it secure the same way, for your personal photos or your documents, for example, your master's diploma or whatever. And it's on you to keep your Bluesky interaction secure. On one hand, it's easier to get someone to do it, and I expect in the future we'll get people that are hosting public PDSes I sometimes thought of doing that for PinkSea, just like allowing people to register by PinkSea. But, doing so as a person, you also have to be constantly on call for abuse. So if someone decides to register via PinkSea and do some illicit activities, you are solely responsible for it. PDS and AppView moderation liability [00:48:17] Jeremy: So if they were to upload content that's illegal, for example, it's hosted on your servers so then it's your problem. [00:48:27] Prefetcher: Yeah, it is my problem. [00:48:29] Jeremy: At least the way that it works now, the majority of the people, their PDS is gonna be hosted by Bluesky. So if they upload content that's breaks the law, then that's the Bluesky company's problem at least currently. [00:48:44] Prefetcher: Yeah. That is something that Bluesky has to deal with. But I do believe that in the future we are going to have, more like independent entities just building infrastructure for ATProto, not even the relay it's just like PDSs for people to be able to join the atmosphere, but not directly via Bluesky. [00:49:06] Jeremy: I'm kind of curious also with the current PDSs, if it's hosted by Bluesky, are they, are they moderating what people upload to their PDSs? [00:49:16] Prefetcher: Good question. Honestly, I don't think they're moderating everything 'cause, it's infeasible for them to, for example, other than moderate Bluesky to also moderate PinkSea and moderate front page and whatnot. So it's the obvious responsibility to moderate itself and to report abuse. I'd say that if someone started uploading illicit material, I do not think, and this is not legal advice, I do not think that they would catch on until some point let's say. [00:49:52] Jeremy: I mean, from what you were describing too, it seems like the AppViews would also, have issues with this because if, let's say someone created a PinkSea record in their PDS directly and the image they put in was not an oekaki image, it's instead something pretty illegal in the country that your AppView is hosted then, Wouldn't that go straight to the PinkSea users viewing the website? [00:50:20] Prefetcher: Yes, sadly, this is something that you have to sign up as you're making an AppView and especially one with images. Sooner or later you are going to get material that you have to moderate and it's entirely on you. That's why, you have to think of moderation while you're working on an AppView. Bluesky has an insanely complicated, at least in my opinion, moderation system, which is composable and everything, which I like. But for smaller AppViews, I think it's too much to build the same level of tooling. So you have to rely more on manual work. Thankfully so far the user base on PinkSea has been nothing but stellar. I didn't have to deal with any law breaking stuff, but I am absolutely ready for one day where I'll have to sadly make some drastic moderation issues. [00:51:18] Jeremy: Yeah. I think to me that's the most terrifying thing about making any application that's open to user content. [00:51:29] Prefetcher: I get it, sadly. I'm no stranger to having issues with people, abusing my websites. Because since 2016, my, first major project was a text board based off of, a text board in a video game called DANGER/U/. It was semi-popular, during the biggest spike in activity in like 2017 and 2016, it had in the tens of thousands of monthly visitors. And sadly, yeah, even though it was only text, I've had to deal with a lot of annoying issues. So to say the worst I think was I remember waking up and people are telling me that DANGER/U/ is down. So I log in the activity logs and someone hit me with two terabytes of traffic in a day. There was a really dedicated person that just hated my website and just either spam me with posts or just with traffic. So, yeah, sadly I have experience with that. I know what to expect that's something that you sadly have to sign up for making a website that allows user content. Pinksea is a single server [00:52:42] Jeremy: To my understanding so far, PinkSea is just a single server. Is that right? [00:52:47] Prefetcher: It is a single server. Yeah. [00:52:48] Jeremy: That's kind of interesting in that, I think a lot of people when they make a project, they worry about scaling and things like that. But, was it a case where you just had a existing VPS and you're like, well hopefully this is, this is good enough? [00:53:03] Prefetcher: I actually ordered a new one even though it's not really powerful, but my train of thought was that I didn't expect it to blow up. I didn't expect it to require more than a single VPS with 8 gigabytes of RAM and whatnot. And so far it's handling it pretty well. I do not expect ever to reach the amounts of traffic that Bluesky does, so I do not really have to worry about insane scalability and whatever. But yeah. I thought of it always as a toy project until the day I released it and realized that it's a bit more than a toy project at this point. To this day, I just kind of think that that website even if it were popular, I would never expect it to have -- And in the best, most amazing case scenario, like a hundred posts a day. I do not have to deal with the amount of traffic that Bluesky does. So one VPS it is. [00:53:59] Jeremy: Yeah, that makes a lot of sense. I mean the application is also mostly reads, right? Most people are coming to see the posts and like you said, you get a few submissions a day, but all the read stuff can probably be cached. Harbor image proxy [00:54:15] Prefetcher: Yeah. The heaviest, thing that PinkSea requires is the image proxy harbor, and that's something that right now only runs on that server. It's in Luxembourg. I think that's where my coprovider hosts it but yeah, that gets the most reads. 'cause in most cases, PinkSea, all it does, all you get is reads from a database, which is just, it's a solved problem. It's really lightweight. But with something like image proxying, you have this whole new problem. 'cause it's a lot of data, and you somehow have to send it -- it's enough for me to just host it locally on that PinkSea server and just direct people to it. But sooner or later, I can always just put it behind something like Bunny CDN or whatnot to have it be worldwide. [00:55:09] Jeremy: So Harbor is something I think you added recently. How did the images work before and what is Harbor doing in its place? [00:55:18] Prefetcher: Before I did what a lot of us currently do and I just freeload atop of Bluesky CDN 'cause Bluesky CDN is just open so far. But it's something that personally irked me. 'cause, I want PinkSea to be completely independent of Bluesky Corporation. I, I wanted to persevere even if Bluesky just decides to randomly, for example, close, the CDN to others or the relay to others or the PLC directory in the worst case scenario. So I wanted to make my own CDN more like proxy. You can't really call it a CDN because it's not worldwide. It's just a single server but let's just say image proxy. So Harbor whenever a person goes to PinkSea, they start loading in all of the images and every single image instead of going to, for example, the PDS or to cdn.bluesky.app. They go to harbor.pinksea.art, you get attached the identifier of the user and what we call a content identifier. Every single, thing uploaded to a PDS has an attached content identifier, which identifies it in a secure way so to say. So Harbor does in reality a really simple set of things. First and foremost, if the user has not seen it, like, not loaded it before first Harbor asks the local cache, do I have this file? If they do, if Harbor does, it just sends the file and it tells the browser, Hey, by the way, please don't ask me about this file for the next day. And in most cases, after one refresh, the user, all of the images load instantly because the web browser just goes, of those files were already sent. And Harbor asked me not to like, ask it more about the same file. So in the case of the image isn't in harbor's local cache, Harbor, first does a lot of those steps to resolve, the users identifier through their PDS, basically resolving that identifier, the DID to a DID document, which is a document basically explaining how that user, what is their, alias, what is their handle and where can we find them, which PDS. So we find the PDS and we then ask the PDS, Hey, send us this file for this user. The PDS sends it or doesn't, in which case we just throw an error and, Harbor just saves it locally and it sends it to the client. It basically just that. But to my knowledge, it's the first non Bluesky image proxy that's deployed for any AppView. Which also caught the attention of Brian Newbold one of the Bluesky employees and made me really happy. DID PLC Lookup [00:58:14] Jeremy: The lookup when you have the user's, DID and you wanna find out where their PDS is that's talking to something called, I think it's the PLC directory? [00:58:25] Prefetcher: Actually there are two different ways. First is PLC directory, PLC originally standed for a placeholder, and then Bluesky realized that it's not a placeholder anymore, and they stealthily changed it to public ledger of credentials. So we have PLC and we have web, the most common version is PLC. The document, the DID document is stored on Bluesky controlled servers under the moniker of PLC directory. They expose a web API that basically just allows you to say, Hey, give me the document for did:plc, whatever. And, the directory goes, have it. And this is the less decentralized version. You can host your own PLC directory and you can basically ask (their) PLC directory to just send you every single document and just you can have your local copy, which some people already do, you kind of sacrifice the fact that you are not in control of the document. It's still on a centralized server, even if you control the keys. 'cause every single DID document also has a key. And that key is used to sign changes to the document. So technically, if you define your own set of keys, you can prevent anyone else from modifying your document, even Bluesky. 'cause every single document is verifiable back and forth. You can see the previous document and its key is used to sign the next document and the chain of trust is visible and no one can just make random changes to your identity, but yeah, it's still on Bluesky to control service and it's a point of contention. Bluesky eventually wants to move it to a nonprofit standards organization, but we have yet to see anything come out of it, sadly. DID WEB lookup The next method is web. And web instead of -- 'cause in did:plc, you have did:plc, and a random string of characters. [01:00:30] Prefetcher: Web relies on domains. So for example, the domain would already like be the sole authority of where the file is. So for example, if I had did:web:example.com, I would parse the DID and I would see it's hosted at example.com. So I go to example.com, I go to /.wellknown/did.json which is the well-known location for the file. And I would have the same DID document as I would have if I used, for example, a PLC DID resolved via the PLC directory. the web method, you are in control of the document entirely. It's on your server under your domain. While it's the more decentralized version, it's just kind of hard for non-technical people to make them. 'cause it relies on a bunch of things. And also the problem is that if you lose your domain, you also lose your identity. [01:01:23] Jeremy: Yeah. So unlike the PLC where it's not really tied to a specific domain, you can change domains. With the web way, you have to always keep the same domain 'cause it's a part of the DID and yeah, like you said, you can't let your renewal lapse or your credit card not work. 'cause then you just lose everything. [01:01:49] Prefetcher: Yeah. You would still be able to change handles, but you would be tied for that domain to forever send your DID otherwise you would just lose it forever. [01:01:57] Jeremy: Yeah, I had mostly only seen the PLC and I wasn't too familiar with the web, form of identification, but yeah that makes sense. [01:02:06] Prefetcher: I think the web if I remember correctly, there is slightly over 300 accounts total on the entire network that use it. Mary who is a person on Bluesky that does a lot of like, ATProto related things, has a GitHub repository that basically gives insight into the network. And on her GitHub repository, you can find the list of every single custom PDS and also how many DID webs there are in existence. And I think it was slightly over 300. [01:02:38] Jeremy: So are you on that list? [01:02:40] Prefetcher: My PDS Yeah. If you were to scroll down. I don't use a web DID 'cause I registered my account before when I was brand new to ATProto, so I didn't know anything. But if you had to scroll down, you would see pds.ata.moe, which is my custom PDS just running. [01:02:55] Jeremy: Cool. [01:02:57] Prefetcher: Yeah. Harbor image proxy can cache any image blob [01:02:58] Jeremy: So something I noticed about harbor, you take the, I believe you take the DID and then you take the CID, the content identifier. I noticed if you take any of those pairs from the ATProto network, like I go find a image somebody posted on Bluesky, I pass that post DID and CID for the image into harbor. Harbor downloads it and caches it. So it's like, does that mean anybody could technically use you as a ATProto CDN? [01:03:38] Prefetcher: Yes, the same way anyone could use like the Bluesky CDN to for example, run PinkSea like I did. cause I do not know if there is a good way to check if a CID of an image or a blob basically. 'cause files on ATProto are called blobs. I do not think there is a nice way to check if that blob is directly tied to a specific record. But that also allows you to make cool, interesting things. Crossposting to Bluesky talks directly to the PDS [01:04:06] Prefetcher: 'cause for example, PinkSea has that, cross post to Bluesky thing. So when you create an image, You already have an option to cross post it to Bluesky, which a lot of people liked. And it was a suggestion from one of the early users of PinkSea. And the way it works is that when we create a PinkSea record, we upload that image, right? And then PinkSea goes, okay, I'm gonna use that same image, the same content identifier, and just create a Bluesky post. So Bluesky and PinkSea all share the same image. I don't upload it twice, I just upload it once. use it in PinkSea and I also use it in Bluesky. And the same way Bluesky its CDN, can just fetch the image. I can also fetch the image from mine, 'cause blobs aren't tied to specific records. They just exist outside of that realm. And you could just query anything. Not even images. You could probably query a video or even a text file. [01:05:04] Jeremy: So when you cross post to Bluesky, you're creating a record directly in the person's PDS, not going through bluesky's API. [01:05:14] Prefetcher: No, I sidestep Bluesky's API completely. And, I basically directly talk to the PDS at all times. I just tell them, Hey, please, for me, create a app.bsky.feed.post record. And you have the image, the text, which also required me to manually parse text into rich text. 'cause like, Bluesky doesn't automatically detect for example, links or tags And you basically get -- like PinkSea creates a record directly with the link to the image. And all of those tags, like the PinkSea tag and whatever, And I completely sidestep. Bluesky's API. If Bluesky, the AppView would cease to exist, PinkSea would still happily create Bluesky crossposts for you. Other applications put metadata into Bluesky posts so they can treat them differently [01:06:02] Jeremy: And since you're creating the records yourself, then you can include additional metadata or fields where you know that this was a PinkSea post, or originally came from PinkSea. [01:06:13] Prefetcher: I could do that. I don't really do that right now 'cause I don't really have much of a reason other than adding a PinkSea hashtag to every single oekaki. But I, noticed, for example, I think it was PinkSky, interesting name, PinkSky, which is like (a) Bluesky Instagram client. Any single time you make a post via PinkSky it uses the Bluesky APIs. It's Bluesky, but it attaches a hidden hashtag like PinkSky underscore some random letters. In its feed building algorithm, it basically detects posts with that hashtag, that specific hashtag, and it builds a PinkSky only timeline. 'cause it's still a Bluesky post, but it has hidden additional metadata that identifies, Hey, it came from PinkSky. [01:07:02] Jeremy: It's pretty interesting how much control you have over what to put in the PDS. So, I'm sure there's a lot of interesting use cases that people are gonna come up with. [01:07:14] Prefetcher: Yeah, of course. You still lose some of the data when you go through the Bluesky API. 'cause of course it stores the record and it's all in formats and whatnot. But you can attach a lot of metadata that can identify posts and build micro networks within Bluesky itself. I see it like that. Bluesky CDN compression [01:07:37] Jeremy: And I think, this might have been a post from you. I think I saw somebody saying that when you view an image from the CDN that the Bluesky CDN specifically, there's some kind of compression going on that that messes with certain types of art. [01:07:55] Prefetcher: It's especially noticeable artists are complaining about it all the time, left and right. Bluesky is very happy with jpeg compression, by default, their CDN, -- like to every single image it applies a really not good amount of jpeg compression which is especially not small. If you compare an image that's uploaded via PinkSea, view an image on PinkSea, and view the same image, which is, it's the same content id. It's the same blob. And you view it on Bluesky, it loses so much fidelity, it loses so much of that aliasing on the pen. You just see everything become really blurry. And on top of that, when you upload an image via Bluesky itself, if I remember correctly, I don't wanna lie here, but they also downscale the image to 1024 pixels by default. So every single image, not only big ones, and artists usually work with really big canvases, they get, downscaled and also additionally they get jpegified. So for example, PinkSea directly uploads PNG files to the PDS. And for example, Harbor gives back the original file. It does no transformations on it, but Bluesky transforms all of them into JPEG compressed images and for photos, it's fine sometimes. 'cause I've also seen people just compare directly, downloaded images of the PDS versus images viewed on Bluesky. But for art it's especially noticible. And people really (do) not like that. [01:09:31] Jeremy: Yeah, that's kind of odd. 'cause if, if I understand correctly, then if you post directly to your PDS and Bluesky pulls it in you'll avoid that, that 1024 resizing. So your images will be higher quality? [01:09:47] Prefetcher: I actually do not know. That's an interesting question. Cause I know that the maybe their CDN also does that 'cause that's what I've heard from others, that on upload the image gets processed and squashed down. So I don't know if doing it via an alternative AppView would change it or would Bluesky just directly reject this post? Because for example, PinkSea, I have built-in which I think I might change in the future -- PinkSea will reject your post if it's bigger than 800x800. 'cause then it'll notice that something is off. This could not have been made with PinkSea. [01:10:26] Jeremy: Yeah, that's a good point I suppose we know at the very least, they have some third party and internal moderation tools that they feed the images through to, so they, they can do some automatic content tagging. But yeah, I, I don't know, like you said, whether, the resizing and all that stuff is at the CDN level [01:10:50] Prefetcher: The jpegification is definitely at the CDN level. 'cause, Bluesky is actually running an open source image proxy. It's called imgproxy. Brian Newbold talked about it a bit on that harbor post. And, yeah, so a lot of the compression, the end user things are done via image proxy, but that, downscaling, I don't know, you'd have to ask someone who's a bit more intimate with Bluesky's internals. [01:11:19] Jeremy: Cool. yeah, I think we've, we've covered a lot. Is there, is there anything else, you wanted to mention or thought we should have talked about? [01:11:26] Prefetcher: Regarding PinkSea I think I've mentioned a ton both the behind the scenes things and, the user things, the design principles. What I'd want to absolutely say, and it will sound cheesy, and, is that I'm eternally grateful to anyone who's actually visited PinkSea. It's definitely grown outta all of my like dreams for the platform, to the point where I'm sitting here just talking about it. I definitely hope that the future will bring us more applications (in) ATProto. I definitely have ideas on how to expand PinkSea, a lot of ideas, a lot of things I want to do, and I'm also a very busy person, so I never get around them. But yeah, think that's it, at least regarding PinkSea. [01:12:15] Jeremy: Cool. Well, if people want to check out PinkSea or see what you're up to, where can they find you? [01:12:22] Prefetcher: So PinkSea is at pinksea.art. That's the website and Bluesky Handle is at pinksea.art and me, well, search prefetcher on Bluesky, you'll probably find me. My tag is at prefetcher.miku.place. all of my socials are probably there. I'm Prefetcher pretty much every single platform except for the platforms that already had someone called Prefetcher. GitHub, github.com/purifetchi because Prefetcher was taken. And, yeah, hit me up. I'm always eager to talk. I don't bite. [01:13:00] Jeremy: Very cool. Well, Kacper thanks. Thanks for taking the time. This was fun. [01:13:04] Prefetcher: Thank you so much, Jeremy, for having me over. It was a pleasure.

tv ai google technology japan design artist german japanese tools forever web id mac flash poland honestly designing hackers ram polish computer science protocol blue sky api ui 4k sooner github rubin crt luxembourg javascript html harbor relay cpu validate css cid flickr png cdn neon genesis evangelion noam vps wipeout plc bbs jpeg html5 pds dpi smoke signals staro geocities reposted kacper green forest assembler touhou nextjs azumanga daioh bbss bbses eboy ytmnd jeremy you jeremy it jeremy so iosys

Tom MacWright on Shutting down Placemark

Leverage Your Incredible Factor Business Podcast with Darnyelle Jervey Harmon, MBA

Play Episode Listen Later Feb 6, 2025 62:17

Tom MacWright is a prolific contributor in the geospatial open source community. He made geojson.io, Mapbox Studio, and was the lead developer on the OpenStreetMap editor. He's currently on the team at Val Town. In 2021 he bootstrapped a solo business and created the Placemark mapping application. He acquired customers and found steady growth but after spending two years on the project he decided it was financially unsustainable. He open sourced the code and shut down the business. In this interview Tom speaks candidly about why geospatial is difficult, chasing technical rabbit holes, the mental impact of bootstrapping, and his struggles to grow a customer base. If you're interested in geospatial or the good and bad of running a solo business I think you'll enjoy this conversation with Tom. Related Links Tom's blog Placemark Play Placemark GitHub Placemark archive geojson.io Valtown Datawrapper (Visualization tool) Geospatial Companies mentioned Mapbox ArcGIS QGIS Carto -- Transcript You can help correct transcripts on GitHub. [00:00:00] Introduction Jeremy: Today I'm talking to Tom MacWright. He worked at Mapbox as a, a very early employee. He's had a lot of experience in the geospatial community, the open source community. One of his most recent projects was a mapping project called Placemark he started and ran on his own. So I wanted to talk to Tom about his experience going solo and, eventually having to, shut that down. Tom, thanks for agreeing to chat today. Tom: Yeah, thanks for having me. [00:00:32] Tools and Open Source at Mapbox Jeremy: So maybe to give everyone some context on, what your background was before you started Placemark. Um, let's talk a little bit about your experience at, at Mapbox. What did you work on there and, and what would you say are like the big things you learned from that experience? Tom: Yeah, so if you include the time that I was at Development Seed, which essentially turned into Mapbox, I kind of signed the paper to get fired from Development Seed and hired at Mapbox within the same 20 seconds. Uh, I was there for eight and a half years. so it was a lifetime in tech years. and the company really evolved from, uh, working for Human Rights Watch and Amnesty International and the World Bank and doing these small, little like micro websites to the point at which I left it. It had. Raised a lot of money, had a lot of employees. I think it was 350 or so when I left. and yeah, just expanded into a lot of different, uh, try trying to own more and more of the mapping stack. but yeah, I was kind of really focused on the creative and tooling side of it. that's kind of where I see a lot of the, the fun and programming is making these tools where, uh, they can give people the same kind of fun like interaction loop that programming has where you, you know, you do a little bit of math and you see the result and you're able to just play with, uh, what you're working on, letting people have that in other domains. so it was really cool to figure out how to get A map design tool where somebody changes the background color and it just automatically changes that in your browser. and it covered like data editing. It covered, um, map styling and we did, uh, three different versions of that tool over the years. and then Mapbox is also a company that was, it came from, kind of people who are working on the Howard Dean campaign. And so it was pretty ideological and part of the ideology was being pretty hardcore about open source. we hired a lot of people who were working on open source projects before and basically just paid them to work on the open source projects, uh, for their whole time there. And during my time there, I just tried to make as much of my work, uh, open as possible, which was, you know, at the time it was, it was pretty great. I think in the long term it's been, o open source has changed a lot. but during the time that we were there, we both kind of, helped things like leaflet and mapnik and openstreetmap, uh, but also made like some larger contributions to the open source world. yeah, that, that's kind of like the, the internal company facing side. And also like what I try to create as like a more of a, uh, enduring work. I think the open source stuff will hopefully have more of a, a long term, uh, benefit. [00:03:40] How open source has changed (value capture by large companies) Jeremy: When I was working on a project that needed offline maps, um, we couldn't use Google Maps or any of the, the other publicly available, cloud APIs. So yeah, we actually used a, a tool, called Tile Mill that I, I hadn't known that you'd worked on, but recently found out you did. So that actually let us pull in OpenStreetMap data and then use this style, uh, language called carto to, to basically let us choose what the colors would be and how the different, uh, the roads and the buildings would look. What's kind of interesting to me is that it being open source really let us, um, build something we otherwise wouldn't have been able to do. But like, at the same time, we also didn't pay Mapbox any money. (laughs) So I'm, I'm kind of curious, like, if it's changed, like what the thinking was in terms of, you know, we pay for people to build all these things. We make it open source. but then people may just not ever pay us, you know, for all these things we did. Tom: Yeah. Yeah. I think that the main thing that's changed since the era of tilemill is, the dominance of cloud platforms. Like back then, I think, uh, Mapbox was still using, we were using like a little bit of AWS but people were still just on like VPSs and, uh, configuring things in cPanel and sometimes even running their own servers. And the, the danger of people using the product for free was such a small thing for us. especially when tile Mill was also funded by the Knight Foundation, so, you know, that at least paid half of my salary for, or, well, sorry, probably, yeah, maybe half of my salary for the first year that I was there and half of three other people's salaries. but that, yeah, so like when we built Tile Mill, a few companies have really like built on those same tools. Uh, there's a company called Carto coincidentally, they had the same name as Carto CSS, and they built on a lot of the same stack they built on mapnik. Um, and it was, was... I mean, I'm not gonna say that it was all like, you know, sunshine and roses, but it was never a thing that we talked about in terms of like this being a brutal competition between us and these other startups. Mapbox eventually closed source some stuff. they made it a source available license. and eventually Mapbox Studio was a closed source product. Um, and that was actually a decision that I advocated for. And that's mostly just because at one point, Esri, Microsoft, Amazon, all had whitelisted versions of Mapbox code, which, uh, hurts a little bit on a personal level and also makes it pretty hard to think about. working almost like it. You don't want to go to your scrappy open source company and do unpaid labor for Amazon. Uh, you know, Bezos can afford to pay for the labor himself. that's just kind of my personal, uh, that I'm obviously, I haven't worked there in a long time, so I'm not speaking for the company, but that's kind of how it felt like. and it yeah, kind of changed the arithmetic of open source in this way that. It made it less fun and, more risky, um, for people I think. [00:07:11] Don't worry about the small free users Jeremy: Yeah. So it sounds like the thinking was if someone on a small team or an individual, they took the open source software and they used it for their own projects, that was fine. Like you expected that and didn't worry about it. It's more that when these really large organizations like a, a Microsoft comes in and, just like you said, white labels the software, and doesn't really contribute significantly back. That's, that's when it, the, the thinking sort of shifted. Tom: Yeah, like a lot of the people who can't pay full price in USD to use your product are great users and they're doing cool stuff. Like when I was working on Placemark and when I was like selling. The theme for my blog, I would get emails from like some kid in India and it's like, you know, you're selling this for a hundred dollars, which is a ton of money. And like, you know, why, why should I care? Why shouldn't I like, just send them the zip file for free? it's like nothing to me and a lot to them. and mapping tools are really, really expensive. So the fact that Mapbox was able to create a free alternative when, you know, ArcGIS was $500 a month sometimes, um, depending on your license, obviously. That's, that's good. You're always gonna find a way for, like, your salespeople are gonna find a way to charge the big companies a lot of money. They're great at that. Um, and that's what matters really for your, for the revenue. [00:08:44] ESRI to Google Maps with little in-between Jeremy: That's a a good point too about like the, my impression of the, the mapping space, and maybe this has changed more recently, but you had the, probably the biggest player Esri, who's selling things at enterprise prices and then there were, or there are like a few open source options. but they feel like the, the barrier to entry feels a little high. And so, and then I guess you have stuff like Google Maps, right? That's, um, that's very accessible, but it's pretty limited, so. There's this big gap, it feels like right between the, the Esri and the, the Google Maps and open source. It's, it's sort of like, there's almost like there's no sweet spot. guess May, maybe it's just because people's uses are so different, but I'm, I'm not sure, um, what makes maps so unique in that way Tom: Yeah, I have come to understand what Esri and QGIS do as like an extension of what CAD is like. And if you've used CAD software recently, it's just as crazy and as expensive and as powerful. and it's really hard to capture like the people who are motivated enough to make a map but don't want to go down the whole rabbit hole. I think that was one of the hardest things about Placemark was trying to be in the middle of those things and half of the people were mystified by the complexity and half the people wanted more complexity. Uh, and I just couldn't figure out how to get it to the right in between spot. [00:10:25] Placemark and its origins in geojson.io Jeremy: Yeah. So let's, let's talk a little bit about Placemark then, in terms of from its start. What was your, your goal with Placemark and, and what was the product itself? Tom: So the seed of the idea for Placemark, uh, is this website called geojson.io, uh, which is still around. And, Chris Fong (correction -- Whong) at, at Mapbox is still, uh, developing it. And that had become pretty useful for a lot of people who I knew in the industry who were in this position of managing geospatial data but not wanting to boot up ArcGIS uh, geojson.io is based on, I just tweeted, I was like, why? Why is there not a thing where you can edit data on a map and have a GeoJSON representation and just go Back and forth between the two really easily. and it started with that, and then it kind of grew to be a little bit more powerful. And then it was just a tool that was useful for everyone. And my theory was just that I wanted that to be more useful. And I knew just like anything else that you build and you work on for a long time, you know exactly how it could be so much better. And, uh, all the things that you would do better if you did it again. And I was, uh, you know, hoping that there was something where like if you make that more powerful and you make it something that's like so essential that somebody's using every day, then maybe there's some some value in that. And so Placemark kind of started as being like, oh, this is the thing where if you're tasking a satellite and you need a bounding box on a specific city, this is the easiest way to do that. Um, and it grew a little bit into being like a tool for collaborating because people were collaborating on it. And I thought that that would be, you know, an interesting thing to support. but yeah, I think it, it like tried to be in that middle of like, not exactly Google my Maps and certainly a lot, uh, simpler than, uh, QGIS or ArcGIS Jeremy: something I noticed, so I've actually used geojson.io as well when I was first learning how to put stuff on a map and learning that GeoJSON was a format that a lot of things were using, it was actually really helpful to, to be able to draw, uh, polygons and see, okay, this is how the JSO looks and all that stuff. And it was. Like just very simple. I think there's something like very powerful about, websites or applications like that where it, it does this one thing and when you go there, you're like, oh, okay, I, I, I know what I'm doing and it's, it's, uh, you know, it's gonna help me do the, this very specific thing I'm trying to do. [00:13:16] Placemark use cases (Farming, Transportation, Interior mapping, Satellite viewsheds) Jeremy: I think with Placemark, so, one question I would have is, you gave an example of, uh, someone, I think you said for a satellite, they're, are they drawing the, the area? What, what was the area specifically for? Tom: the area of interest, the area where they want the, uh, to point the camera. Jeremy: so yeah, with, with Placemark, I mean, were there, what were some of the specific customers or use cases you had in mind? 'cause that's, that's something about. Um, placemark as a product I noticed was it's sort of like, here's this thing where you can draw polygons put markers and there's all these like things you can do, but I think unless you already have the specific use case, it's not super clear, who uses it for what. So maybe you could give some examples of what you had in mind. Tom: I didn't have much in mind, but I can tell you what people, what some people used it for. so some of the more interesting uses of it, a bunch of, uh, farming oriented use cases, uh, especially like indoor and small scale farming. Um, there were some people who, uh, essentially had a bunch of flower farms and had polygons on the map, and they wanted to, uh, mark the ones that had mites or needed to be watered, other things that could spread in a geometric way. And so it's pretty important to have that geospatial component to it. and then a few places were using it for basically transportation planning. Um, so drawing out routes of where buses would go, uh, in Luxembourg. And, then there was also a little bit of like, kind of interesting, planning of what to buy more or less. Uh, so something of like, do we want to buy this tract of land or do we wanna buy this tract of land or do we wanna buy access to this one high speed internet cable or this other high speed internet cable? and yeah, a lot of those things were kind of like emergent use cases. Um, there's a lot of people who were doing either architecture or internal or in interior mapping essentially. Jeremy: Interior, you mean, inside of a building Tom: yeah. yeah. Jeremy: Hmm. Okay. Tom: Which I don't think it was the best tool for. Uh, but you know, people used it for that. Jeremy: Interesting. Yeah. I guess, would people normally use some kind of a CAD tool for that, or Tom: Yeah. Uh, there's CAD tools and there are a few, uh, companies that do just, there's a company that just does interior maps especially of airports, and that's their whole business model. Um, but it's, it's kind of an interesting, uh, problem because most CAD architecture work is done with like a local coordinate system, and you have like very good resolution of everything, and then you eventually place it in geo geospatial space. Uh, but if you do it all in latitude and longitude, you know, you're, you're moving a door and it's moving the 10th or 12th decimal point, and eventually you have some precision problems. Jeremy: So it's almost like if you start with latitude and longitude, it's hard to go the other way. Right? you have to start more specific and then you can move it into the, the geospatial, uh, area. Tom: Yeah. Uh, that's kind of why we have local projections for towns is that you can do a lot of work just in that local projection. And the numbers are kind of small 'cause your town's small, relatively. Jeremy: yeah, those are kind of interesting. So it sounds like just anytime somebody wants to, like you gave the example of transportation planning or you want to visually see where things are, like your crops or things like that, and that, that kind of makes sense. I mean, I think if you just think about paper maps, if somebody wants to sketch something out and, and sort of track the layout of something, this could serve the same purpose but be editable. and like you said, I think it's also. Collaborative so you can have multiple people editing the same, um, map. that makes sense. I think something that I believe I saw on your website is you said though that it was, it's like an editing tool, but it's not necessarily a visualization tool. Uh, I'm kind of curious what you, what you meant by that. [00:17:39] An editing tool that allows you to export data not a visualization tool Tom: Yeah, I, when you say a map, I think there's, people can interpret that as everything from raw data to satellite imagery and raster data. and then a lot of it is like, can I use this to make a choropleth map of the voter turnout in our, in my country? and that placemark did a little bit, but I think that it was, it was never going to be the, the thing that it did super well. and so, yeah, and also like the, the two things kind of, don't mesh all that well. Like if you have a scale point map and you have that kind of visualization of it and then you're editing the points at the same time and you're dragging around these like gigantic points because this point means a lot of population, it just doesn't really make that much sense. There are probably ways to square that circle and have different views, but, uh, I felt like for visualizations, I mean partly I just think data wrapper is kind of great and uh, I had already worked for observable at that point, which is also, which I think also does like great visualization work. Jeremy: Would that be the case of somebody could make a map inside a placemark and then they would take the GeoJSON and then import that into another visualization tool? Is that what you were kind of imagining people would do? Tom: Yeah. Yeah, exactly. Jeremy: And I could see from the customer's perspective, a lot of them, they may have that end, uh, visualization in mind. So they might look for a tool that kind of just does both. Right. Tom: Yeah. Yeah. Certain people definitely, wanted that. And yeah, it was an interesting direction to go down. I think that market was going to be a lot different than the people who wanted to manage and edit data. And also, I, one thing that I had in mind a lot, uh, was if Placemark didn't work out, how much would people be burned? and I think if I, if I built it in a way that like everyone was heavily relying on the API and embeds, people would be suffer a lot more, if I eventually had to shut it down. every API that you release is really a, a long-term commitment. And instead for me, like guilt wise, having a product where you can easily export everything that you ever did in any format that you want was like the least lock in, kind of. Jeremy: Yeah. And I imagine the, the scope of the project too, you're making it much smaller if you, if you stick to that editing experience and not try to do everything. Tom: Yeah. Yeah. I, the scope was already pretty big. as you can tell from the open source project, it's, it's bigger than I wish it was. the whole time I was really hoping that I could figure out some niche that was much more compact. there's, I forget the name, but there's somebody who has a, an application that's very similar to Placemark in. Technical terms, but is just a hundred percent focused on planning septic systems. And I'm just like, if I just did this just for septic systems, like would that be a much, would that be 10,000 lines of code instead of 40,000 lines of code? And it would be able to perfectly serve those customers. but you know, that I didn't do enough experimentation to figure that out. Um, I, that's, I think one thing that I wish I had done a lot more was, pivot and do experiments. Jeremy: that septic example, do you know if it's a, a business in and of itself where it can actually support one person or a staff of people? Or is it, is that market just too small? Tom: I think it's still a solo bootstrapped project. yeah. And it's, it's so hard to tell whether a company's doing well or not. I could ask the person over DM. [00:21:58] Built the base technology before going public Jeremy: So when you were first starting. placemark. You were, you were doing it as a solo, developer. A solo entrepreneur, reallyyou worked on it for quite a while, I think before you announced, right? Like maybe a year or so? Tom: Yeah, yeah. Almost, almost a year, I think, maybe, maybe 10 months in the dark. Jeremy: I think that there's, there was a lot of overlap between the different directions that I would eventually go in and. So just building a collaborative editor that can edit map data fairly quickly and checks all the boxes of being able to import and export things, um, that is, was a lot of work. and I mean also I, I was, uh, freelancing during part of it, so it wasn't a hundred percent of my time. Tom: But that, that core, I think even now if I were to build something similar, I would probably still use that work. because that, whether you're doing the septic planning application or you're doing a general purpose kind of map editor or some kind of social application, a lot of that stuff will be in common. Um, and so I wanted to really get, like, to figure out that problem space and get a few solutions that I could live with. Jeremy: The base. libraries or technologies you were gonna pick to get the map and have the collaborative aspect. Those are all things you wanted to get settled first. And then you figured, okay, once I have this base, then I can go find the, you know, the, the, the customers or, or find the specifics of what I'm gonna build. Tom: Yeah, exactly. Jeremy: I I think you had said that going forward when you're gonna work on another project, you would probably still start the same way. [00:23:51] Geospatial is a tough industry, no public companies Tom: if I was working on a project in the geospatial space, I would probably heavily reference the work that I already did here. but I don't know if I'll go back to, to maps again. It's a tough industry. Jeremy: Is it because of the, the customer base? Is it because like people don't really understand the market in terms of who actually needs the maps? I'm kind of curious what you feel makes it tough. Tom: I think, well there are no, there are no public mapping companies. Esri is I think one of the 10 largest private companies in the us. but it's not like any of these geospatial companies have ever been like a pure play. And I think that makes it hard. I think maps are just, they're kind of like fonts in a way in which they are this. Very deep well of complexity, which is absolutely fascinating. If you're in it, it's enough fun and engineering to spend an entire career just working on that stuff. And then once you're out of it, you talk to somebody and you're just like, oh, I work on this thing. And they're like, oh, that you Google maps. Um, or, you know, I work at a font type like a, you know, a type factory and it's like, oh, do you make, uh, you know, courier in, uh, word. It's really infrastructure, uh, that we mostly take for granted, which is, that's, that means it's good in some ways. but at the same time, I, it's hard to really find a niche in which the mapping component is that, that is that useful. A lot of the companies that are kind of mapping companies. Like, I think you could say that like Strava and Palantir are kind of geospatial companies, both of them. but Strava is a fitness company and Palantir is a military company. so if you're, uh, a mapping expert, you kind of have to figure out what, how it ties into the real world, how it ties into the business world and revenue. And then maps might be 50% of the solution or 75% of the solution, but it's probably not going to be, this is the company that makes mapping software. Jeremy: Yeah, it's more like, I have this product that I'm gonna sell and it happens to have a map as a part of it. versus I'm going to sell you, tools that, uh, you know, help you make your own map. That seems like a, a harder, harder sell. Tom: yeah. And especially pro tools like the. The idea of people being both invested in terms of paying and invested in terms of wanting to learn the tool. That's, uh, that's a lot to ask out of people. [00:26:49] Knowing the market is tough but going for it anyways Jeremy: I think the things we had just talked about, about mapping being a tough industry and about there being like the low end is taken care of by Google, the high end is taken care of by Esri with ArcGIS. Uh, I think you mentioned in a blog post that when you started Placemark you, you, you knew all this from the start. So I'm kind of curious, like, knowing that, what made you decide like, I'm gonna, I'm gonna go for it and, you know, do it anyways. Tom: uh, I, well, I think that having seen, I, like I am a co-founder of val.town now, and every company that I've worked for, I've been pretty early enough to see how the sausage is made and the sausage is made with chaos. Like every company doesn't know what it's doing and is in an impossible fight against some Goliath figure. And the product that succeeds, if it ever does succeed, is something that you did not think of two or three years in advance. so I looked at this, I looked at the odds, and I was like, oh, these are the typical odds, you know, maybe someday I'll see something where it's, uh, it's an obvious open blue water market opportunity. But I think for the, for the most part, I was expecting to grind. Uh, you know, like even, even if, uh, the odds were worse, I probably would've still done it. I think I, I learned a lot. I should have done a lot more marketing and business and, but I have, I have no regrets about, you know, taking, taking a one try at solving a very hard to solve problem. Jeremy: Yeah, that's a good point in that the, the odds, like you said, are already stacked against you. but sometimes you just gotta try it and see how it goes, Tom: Yeah. And I had the, like I was at a time where I was very aware of how my life was set up. I was like, I could do a startup right now and kind of burn money for a little while and have enough time to work on it, and I would not be abandoning an infant child or, you know, like all of the things that, all the life responsibilities that I will have in the near future. Um. So, you know, uh, the, the time was then, I guess, [00:29:23] Being a solo developer Jeremy: And comparing it to your time at Mapbox and the other startups and, and I suppose now at val.town, when you were working on Placemark, you're the sole developer, you're in charge of everything. how did that feel? Did you enjoy that experience or was it more like, I, I really wish I had other people to, you know, to kind of go through this with, Tom: Uh, around the end I started to chat with people who, like might be co-founders and I even entertained some chats with, uh, venture capital people. I am fine with the, the day to day of working on stuff alone of making a lot of decisions. That's what I have done in a lot of companies anyway. when you're building the prototype or turning a prototype into something that can be in production, I think that having, uh, having other people there, It would've been better for my mentality in terms of not feeling like it was my thing. Um, you know, like feeling detached enough from the product to really see its flaws and really be open to, taking more radical shifts in approach. whereas when it's just you, you know, it's like you and the customers and your email inbox and, uh, your conscience and your existential dread. Uh, and you know, it's not like a co-founder or, uh, somebody to work with is gonna solve all of that stuff for you, but, uh, it probably would've been maybe a little bit better. I don't know. but then again, like I've also seen those kinds of relationships blow up a lot. and I wanted to kind of figure out what I was doing before, adding more people, more complexity, more money into the situation. But maybe you, maybe doing that at the beginning is kind of the same, you know, like you, other people are down for the same kind of risk that you are. Jeremy: I'm sure it's always different trade offs. I mean, I, I think there probably is a power to being able to unilaterally say like, Hey, this is, this is what I wanna do, so I'm gonna do it. Tom: Yeah. [00:31:52] Spending too much time on multiplayer without a business case Jeremy: You mentioned how there were certain flaws or things you may not have seen because you were so in it. Looking back, what, what were some of those things? Tom: I think that, uh, probably the, I I don't think that most technical decisions are all that important, um, that it never seems like the thing that means life or death for companies. And, you know, Facebook is still on PHP, they've fought, fixed, the problem with, with money. but I think I got rabbit holed into a few things where if I had like a business co-founder, then they would've grilled me about like, why are we spending? The, the main thing that comes to mind, uh, is real time multiplayer, real time. It was a fascinating problem and I was so ready to think about that all the time and try to solve it. And I think that took up a lot of my time and energy. And in the long term, most people are not editing a map. At the same time, seeing the cursors move around is a really fun party trick, and it's great for marketing, but I think that if I were to take a real look at that, that was, that was a mistake. Especially when the trade off was things that actually mattered. Like the amount of time, the amount, the amount of data that the, that could be handled at. At the same time, I could have figured out ways to upload a one gigabyte or two gigabyte or three gigabyte shape file and for it to just work in that same time, whereas real time made it harder to solve that problem, which was a lot closer to what, Paying customers cared about and where people's expectations were? Jeremy: When you were working on this realtime collaborative functionality, was this before the product was public? Was this something you, built from the start? Tom: Yeah. I built the whole thing without it and then added it in. Not as like a rewrite, but like as a, as a big change to a lot of stuff. Jeremy: Yeah, I, I could totally see how that could happen because you are trying to envision people using this product, and you think of something like Google Docs, right? It's very powerful to be typing in a document and see the other cursors and, um, see other people typing. So, I could see how you, you would make that leap and say like, oh, the map should, should do that too. Yeah. [00:34:29] Financial pressures of bootstrapping, high COL, and healthcare Tom: Yeah. Yeah. Um, and, you know, Figma is very cool. Like the, it's, it's amazing. It's an amazing thing. But the Figma was in the dark for way longer than I was, and uh, Evan is a lot smarter than I was. Jeremy: He probably had a big bag of money too. Right. Tom: Yeah. Jeremy: I, I don't actually know the history of Figma, but I'm assuming it's, um, it's VC funded, right? Tom: Uh, yeah, they're, they're kind of famous for just having, I don't think they raised that much in the beginning, but they just didn't hire very much and it was just like the two co-founders, or two or three people and they just kept building for long time. I feel like it's like well over three years. Jeremy: Oh wow. Okay. I think like in your case, I, I saw a comment from you where you were saying, this was your sole source of income and you gotta pay for your health insurance, and so you have no outside investments. So, the pressures are, are very different I think. Tom: Yeah. Yeah. And that's really something to on, to appreciate about venture capital. It gives you the. Slack in your, in your budget to make some mistakes and not freak out about it. and sadly, the rent is not going down anytime soon in, in Brooklyn, and the health insurance is not going down anytime soon. I think it's, it's kind of brutal to like leave a job and then realize that like, you know, to, to be admitted to a hospital, you have to pay $500 a month. Jeremy: I'm, I'm sure that was like, shocking, right? The first time you had to pay for it yourself. Tom: Yeah. And it's not even good. Uh, we need to fix this like that. If there's anything that we could do to fix entrepreneurship in this country, it's just like, make it possible to do this without already being wealthy. Um, it was, it was a constant stress. [00:36:29] Growth and customers Jeremy: As you worked on it, and maybe especially as you, after you had shipped, was there a period where. You know, things were going really well in terms of customers and you felt like, okay, this is really gonna work. Tom: I was, so, like, I basically started out by dropping, I think $5,000 in the business bank account. And I was like, if I break even soon, then I'll be happy. And I broke even in the first month. And that was amazing. I mean, the costs were low and everything, but I was really happy to just be at that point and that like, it never went down. I think that probably somebody with more, uh, determination would've kept going after, after I had stopped. but yeah, like, and also The people who used Placemark, who I actually chatted with, and, uh, all that stuff, they were awesome. I wish that there were more of them. but like a lot of the customers were doing cool stuff. They were supportive. They gave me really informative feedback. Um, and that felt really good. but there was never a point at which like the, uh, the growth scale looked like, oh, we're going to hit a point at which this will be a sustainable business within a year. I think it, according to the growth when I left it, it would've been like maybe three years until I would've been, able to pay my rent and health insurance and, live a comfortable life in, in New York. Jeremy: So when you mentioned you broke even that was like the expenses into the business, but not for actually like rent and health insurance and food and all that. Okay. Okay. can you say like roughly how much was coming in or how many customers you had? Tom: Uh, yeah, the revenue initially I think was, uh, 1500 MRR, and eventually it was like 4,000 or so. Jeremy: And the growth was pretty steady. [00:38:37] Bootstrapping vs fundraising Tom: Um, so yeah, I mean, the numbers where you're just like, maybe I could have kept going. but it's, the other weird thing about VCs is just that I think I have this rich understanding of like, if you're, if you're running a business that will be stressful, but be able to pay your bills and you're in control of it, versus running a startup where you might make life changing money and then not have to run a business again. It's like the latter is kind of better. Uh, if stress affects you a lot, and if you're not really wedded to being super independent. so yeah, I don't know between the two ways of like living your life, I, I have some appreciation for, for both. doing what Placemark entailed if I was living cheaply in a, in a cheap city and it didn't stress me out all the time, would've been a pretty good deal. Um, but doing it in Brooklyn with all the stress was not it, it wasn't affecting my life in positive ways and I, I wanted to, you know, go see shows at night with my friends and not worry about the servers going down. Jeremy: Even putting the money aside, I think that's being the only person responsible for the app, right? Probably feels like you can't really take a vacation. Right. Tom: Yeah, I did take a vacation during it. Like I went to visit my partner who was in, uh, Germany at the time, and we were like on a boat, uh, between Germany, across the lake to Switzerland, and like the servers went down and I opened up my laptop and fixed the servers. It's just like, that is, it's a sacrifice that people make, but it is hard. Jeremy: There's, there's on call, but usually it's not just you 24 7. Tom: Yeah. If you don't pick up somebody else [00:40:28] Financial stress and framing money spent as an investment Jeremy: Yeah, yeah, yeah, I guess at what point, because I'm trying to think. You started in 2021 and then maybe wrapped up, was it sometime in 2024? Tom: Uh, I took a job in, uh, I, I mean I joined val.town in the early 2023 and then wrapped up in November, 2023. Jeremy: At what point did you really start feeling the, the stress? Like I, I imagine maybe when you first started out, you said you were doing consulting and stuff, so, um, probably things were okay, but once you kind of shifted away from that, is that kind of when the, the, the worries about money started coming in? Tom: Yeah. Um, I think maybe it was like six or eight months, um, in. Just that I felt like I wasn't finding, uh, like a, a way to grow the product without adding lots of complexity to it. and being a solo founder, the idea of succeeding, but having built like this hulking mess of a product felt just as bad as not succeeding. like ideally it would be something that I could really be happy maintaining for the long term. Uh, but I was just seeing like, oh, maybe I could succeed by adding every feature in QGIS and that's just not, not a, not something that I wanted to commit to. but yeah, I don't, I don't know. I've been, uh, do you know, uh, Ramit Sethie he's like a, Jeremy: I don't. Tom: an internet money guy. He's less scummy than the rest of them, but still, I. an internet money guy. Um, but he does adjust a lot of stuff about like, money psychology. And that has made me realize that a lot of what I thought at the time and even think now is kind of a rational, you know, like, I think one of the main things that I would do differently is just set a budget for Placemark. Like if I had just set away, like, you know, enough money to live on for a year and put that in, like the, this is for Placemark bucket, then it would've felt better to me then having it all be ad hoc, month to month, feeling like you're burning money instead of investing money in a thing. but yeah, nobody told me, uh, how to, how to think about it then. Uh, yeah, you only get experience by experiencing it. Jeremy: You're just seeing your, your bank account shrinking and there's this, psychological toll, right? Where you're not, you're not used to that feeling and it, it probably feels like something's wrong, Tom: Yeah, yeah. I'm, I think it, I'm really impressed by people who can say, oh, I invested, uh, you know, 50 or a hundred thousand dollars into this business and was comfortable with that risk. And like, maybe it works out, maybe it doesn't. Maybe you just like threw a lot of money down into that. and the people, I think with the healthy, productive, uh, relationship with it. Do think of it as like, oh, I, I paid for kind of a bet on a risk. and that's, that's what I was doing anyway. You know, like I was paying my rent and my health insurance and spending all my time working on the product instead of paying, uh, freelance work. but if you don't frame it that way, it doesn't feel like an investment. It feels like you're making a risky gamble. Jeremy: Yeah. And I think that makes sense to, to actually, I think, like you were saying, have a separate account or a separate thing set aside where you are like, this is, this is this money for this purpose. And like you said, look at it as an investment, which with regular investments can go down. Tom: Yeah, exactly. Yeah. Jeremy: Yeah [00:44:26] In hindsight might have raised money or tried smaller bets Jeremy: Were there, there other things, whether technical or or business wise, that, that if you were to to do it again, you would do differently? Tom: I go back and forth on whether I should have raised venture capital. there are, there's kind of a, an assumption in venture capital that once you're on it, you have to go the whole way. You have to become a billion dollar company, uh, or at least really tell people that you're going to be a billion dollar company and I am not. yeah, I, I don't know. I've seen, I've seen other companies in my space, or like our friends of my current company who are not really targeting that, or ones who were, and then they had somewhere in between the billion dollar and the very small outcome. Uh, and that's a little bit of a point in the favor of accepting a big pile of money from the venture capitalists. I'm also a little bit biased right now because val.town has one investor and he's like the, the best venture capitalist that I have ever met. Big fan. don't quote me on that. If he sacks me in like a year, we'll see. Um, but uh, yeah, there, I, I think that I understand more why people take that approach. or I've understood more why people take like the venture capital but not taking $300 million from SoftBank approach. yeah, and I don't know, I think that, trying a lot of things also seems really appealing. Uh, people who do the same kind of. of Maybe 10 months, but they build four or five different products or three different products instead of just one. I think that, that feels, feels like a good idea to me. Jeremy: And in doing that, would that be more of a, like as a solo entrepreneur or you, you're thinking you would take investment and then say, I'm gonna try all these things with, with your money. Tom: Oh, I've seen both. I, that I, yeah, one friend's company has pivoted like four times between very different ideas and yeah, it, it's one way to do it, but I think in the long term, I would want to do that as a solo developer and try to figure out, you know, something. but yeah, I, I think, uh, so much of it is mindset, that even then if I was working on like three different projects, I think I. My qualifications for something being worth, really adopting and spending all my time doing, you just have to accept, uh, a lot of hits and a lot of misses and a lot of like keeping things alive and finding out how to turn them into something. I am really inspired by my friends who like started around the same time that I did and they're not that much further in terms of revenue and they're like still, still doing it because that is what they want to do in life. and if you develop the whole ecosystem and mindset around it, I think that's somewhere that people can stay and, and be happy. just trying to find, trying to find a company that they own and control and they like. Jeremy: While, while making the the expenses work. Tom: Yeah. Yeah. that's the, that's the hard part, like freelancing on the side also. I probably could have kept that up. I liked my freelance clients. I would probably still work with them as well. but I kind of just wanted the, I wanted the focus, I wanted the motivation of, of being without a net. Jeremy: Yeah, I mean, energy wise, do you think that that would've worked? I mean, I imagine that Placemark took a lot of your time when you were working full time, so you're trying to balance, you know, clients and all your customers and everything you're doing with the software. It just feels like it might be a lot. Tom: Yeah. Yeah. Maybe with different freelance clients. I, I loved my freelance clients because I, after. leaving config. I, I wanted to work on climate change stuff and so I was working for climate change foundations and that is not the way to max out your paycheck. It's the way to feel good about your conscience. And so I still feel great about those projects, but in the future, yeah, I would probably just work for, uh, you know, a hedge fund or something. [00:49:02] Marketing to developers but not potential customers Jeremy: I think something you mentioned in one of your posts is that you maybe could have spent more time or had a different approach with marketing. Maybe you could kind of say what you did do and then what maybe worked and what didn't. Tom: Yeah. So I like my sweet spot is writing documentation and blog posts and technical stuff. And so I did a lot of that and a lot of that like worked in a way that didn't matter. I am at this point, weirdly good at writing stuff that gets on Hacker News. I've written a lot of stuff that's gotten to the top of Hacker News and unfortunately, writing about your technical approach and your geospatial project for handling errors, uh, in your JavaScript code is not really a way to get customers. and I think doing a lot of documentation was also great, but it was also, I think that the, the thing that was missing is the thing that I think Mapbox does fairly well now, in which the homepage really pushes you toward use cases immediately. and I should have been saying to each customer who had anything compelling as a use case, like, let's write an article about you and what you're doing, and here's how you use this in your industry. and that probably would've also been like a good, a good way to figure out which of those verticals was the one that was most worth spending all the time on. yeah. So it, it was, it was a lot of good marketing to nerds. and it could have been better in terms of marketing to actual customers and to people who are making the buying decisions. Jeremy: Yeah. Looking at the, the Placemark blog, I can definitely see how as a developer, a lot of the posts are appealing to me, right? It's about how you worked on a technical challenge or decisions you made, but maybe less so to somebody who they wanna. Draw a map to manage their crops. They're like, I don't care about any of this. Right. Tom: Yeah, like the Mapbox blog used to be, just all that stuff as well. We would write about designing protocol buffer layouts, and it was amazing for hiring and amazing for getting nerds in the door. But now it's just, Toyota is launching with, Mapbox Maps or something like that. And that's, that's what you, you should do if you're trying to sell a product. Jeremy: Yeah. And I think the, the sort of technical aspect, it makes sense too. If you're venture funded and you are looking to hire, right? You wanna build your team and you just want to increase like, the amount of stuff you're building and not worrying so much about, am I gonna have a paycheck next Tom: Yeah. Yeah. I, I just kind of do it because it's fun, which is not the right reason to do it, but, Yeah, I mean, I still write my blog mostly just because it's, it's a fun thing to do, but it's not the best way to, um, to run a business. Jeremy: Yeah. Well, the fun part is important too though. Tom: Yeah. Yeah. That's, that's maybe the whole thing. May, that's maybe the most important thing, but you can't do it if you don't do the, the money part. [00:52:35] Most customers came from existing audience Jeremy: Right. So the people who did find you, was it mostly word of mouth from people who did identify with the technical posts, or were there places that surprised you, that people found you? Tom: Uh, a lot of it was people who were familiar with the Mapbox ecosystem or with, with me. and then eventually, yeah, a few of the users came in through, um, through Hacker News, but it was mostly, mostly word of mouth also. The geospatial community is like fairly tight and it's, and it's not too hard to be the person who writes the article about some geospatial challenge that everyone finds. Jeremy: Hmm. Okay. Yeah, that's a good point about like being in that community, especially since you've done so much work in geospatial and in open source that you have this little, this built-in audience, I guess. Tom: yeah. Which I appreciate. It makes me nervous, but yeah. [00:53:43] Val.town marketing to developers Jeremy: Comparing that to something like val.town, how is val.town marketing? How is it finding users? 'cause from what I can tell, it's, it's getting a lot of, uh, a lot of people coming in, right? Tom: Yeah. Uh, well, right now our, our kind of target user, or the user that we think of is a hobbyist, is somebody who's, sometimes a pro developer or somebody, sometimes just somebody who's really interested in the field. And so writing these things that are just about, you know, programming, does super well. Uh, but it, we have exactly the same problem and that that is kind of being revamped as we speak. uh, we hired somebody who actually knows marketing and has a good sense for it. And so a lot of that stuff is shifting to show you what you can do with val.town because it, it suffers from the same problem as well. It's an empty text field in which you can type, type script, code, and it runs. And knowing what you can do with that or what you should do with that is, is hard if you don't have a grasp of TypeScript and web applications. so pretty soon we'll have pages which are like, here's how to connect linear and GitHub with OW Town, or, you know, two nouns connect them, for all of those companies and to do automations and all these like concrete applications. I think that's, you have to do it. You have to figure it out. Jeremy: Just briefly for someone who hasn't heard of val.town, like what, what does it do? Tom: Uh, val.town is a social website, so it has comments and likes and all of that stuff. but it's for writing these little snippets of TypeScript and JavaScript code that run. So a lot of them are websites, some of them are automations, so they receive emails or send emails or connect one service to another. And yeah, it's, it's like combining some aspects of, GitHub or like a code platform, uh, but with the assumption that every time that you save, everything's instantly deployed. Jeremy: So it's maybe a little bit like, um, like a glitch, I guess? Tom: Uh, yeah. Yeah, it takes a lot of experience, a lot of, uh, inspiration from Glitch. Jeremy: And I, I think, like you had mentioned, you enjoy writing the, the technical blog posts and the documentation. And so at least with val.town, your audience is developers versus, the geospatial community who probably largely doesn't care about, TypeScript and the, the different technical decisions there. Tom: Yeah, it, it makes it easier, that's for sure. The customer is, is me. [00:56:30] Shifting from solo to in-person teams Jeremy: Nice. Yeah. Looking at, you know, you, you worked as a, a solo developer for Placemark, and then now you've got a team of, is it like maybe five Tom: Uh, it is seven at the moment. Jeremy: Seven people. Okay. Are you all in person or is it, remote Tom: We all sit around two tables in Brooklyn. It's very nice. Jeremy: So how did that feel? Like shifting from, I'm in, I don't know if you worked from home while you were working on Placemark or if you were in coworking spaces, but you're, you're shifting from I'm like in my own head space doing everything myself to, to, I'm in a room with all these people and we're like working on this thing together. I'm kind of curious like how that felt for you. Tom: Yeah, it's been a big difference. And I think that I was just talking with, um, one, one of our, well an engineer at, at val.town about how everyone kind of had, had been working remote for obvious pandemic world reasons. And this kind of privilege of just being around the same table, if that's what you like is, a huge difference in terms of, I just remember having to. Trick myself into going on a walk around the block because I would get into such a dark mental head space of working on the same project for eight hours straight and skipping lunch. and now there's a little bit more structure. yeah, it's, it's been, it's been a overall, an improvement. Some days I wish that I could go on a run at noon 'cause that's the warmest time of the day. but, uh, overall, like it makes things so much easier. just reading the emotions in people's faces when they're telling you stuff and being able to, uh, not get into discussions that you don't need to get into because you can talk and just like understand each other very quickly. It's, it's very nice. I don't wanna force everyone to do it, you know, but it it for the people who want it, they, they, uh, really enjoy it. Jeremy: Yeah. I think if you have the right set of people, it's definitely more enjoyable. And um, if you don't, maybe not so Tom: Yeah, we haven't hired any, like, extremely loud chewers yet or anything like that, but yeah, maybe my story will change. Jeremy: No, no one microwaving fish. Tom: No, there's, uh, yeah, thankfully the microwave is outside of the office. Jeremy: Do you live close to the office? Tom: Yeah. Yeah. Like most of the team is within a 20 or 30 minute walk of the office and it's very fortunate. I think there's been something of a mass migration to New York. A lot of us didn't live in New York before four years ago, and now all of us do. it's, it's, uh, it's very comfortable to be here. Jeremy: I think that makes, uh, such a big difference. 'cause I think the majority of people, at least within the US you know, you're, you're getting in your car, you're sitting in traffic. and I know people who, during the pandemic, they actually moved further, right? Because they went, oh, like, uh, I don't need to come into the office. but yeah, if you are close enough where you can walk, yeah, I think that makes a big difference. Tom: Oh yeah. If I had to drive to work, I think my blood pressure would be so much higher. Uh, especially in New York. Oh, I feel so bad for the people who have to drive, whereas I'm just walking with, you know, a bagel in hand, enjoying listening to the birds. Jeremy: Yeah. Yeah. well now they have, what is it, the congestion pricing in Tom: Yeah. Yeah. We're all in Brooklyn, so it doesn't affect us that much, but it's supposedly, it's, it's working great. Um, yeah. I hope we can keep it. Jeremy: I've never driven in New York and I, I wouldn't want to Tom: Yeah. It's only for the brave or the crazy. [01:00:37] The value of public writing and work Jeremy: I think that's probably a good place to, to wrap up, but is there any other thoughts you had or things you wanted to mention? Tom: No, I've just, uh, thank you so much. This has been, this has been a lot of fun. You're, you're very good at this as well. I feel like it's, uh, Jeremy: Thank you Tom: It's not easy to, to steer a conversation in a way that makes awkward people sound, uh, normal. Jeremy: I wouldn't say that, but um, what's been actually pretty helpful to me is, you have such a body of work, I guess I would say, in terms of your blogging and, just the amount that you write and the long history of projects that, that there's, you know, there's a lot to talk about and I'm sure it helps, helps your thought process as well. Tom: Yeah. I, I've been lucky to have a lot of jobs where people, where companies were like, cool with publishing everything, you know? so a lot of what I've done is, uh, is public. it's, it's, uh, I'm very, very thankful for like, early on that being a big part of company culture. Jeremy: And you can definitely tell, I think for people who look at the Placemark blog posts or, or now your, your val.town blog posts, like there's, there's a clear difference when somebody like is very intentional and, um, you know, it's good at writing versus you're doing it because, um, it's your corporate responsibility or whatever, like people can tell. Yeah. Tom: Yeah. You can't fake being interested. so you gotta work on things that are interesting. Jeremy: Tom, thanks again for, for agreeing to chat. This was fun. Tom: Yeah thank you so much.

Jeremy Anderson: Serve Your Way to Millions

Play Episode Listen Later Jul 15, 2024 63:13

About Our Guest: Jeremy Anderson, a devoted husband, father, serial entrepreneur, author and celebrated motivational speaker, spreads his message of Next Level Living across various countries. Despite encountering numerous life challenges, Jeremy transcended a lifetime of adversity to embrace a purposeful and successful life. Though his books. Speaking engagements and coaching, Jeremy empowers others to elevate their lives to the Next Level. Companies and organizations worldwide seek Jeremy for his passionate and heartfelt messages on occupational integrity, overcoming obstacles, facing adversity, embracing change and succeeding during uncertain times. Episode Summary: This episode is powered by The Seven Figure Summer Success Series “When I stand before God at the end of my life, I hope not to have a single bit of talent left, saying, ‘I used everything you gave me.'” Irma Bombeck. When I think of this quote, I think of today's guest. There's something to be said for knowing that you have a powerful gift to share with the world. And it's an entirely different thing, when, in leveraging that gift, you remain a servant focused on shaking the planet. That sums up Jeremy Anderson in a nutshell. He's no stranger to working at a level of excellence that deepens the impact he ricochets through the earth. Even though he's achieved so much, he is more concerned with serving that strutting in his success, which is ultimately why he has been blessed with so much. In this powerful episode, Jeremy transparently unpacks his powerful story from sinner to servant, while sparing no detail in between. Through his commitment to live for God, he has created a movement that honors God, marriage, business and service via his Next Level Living message and community. Our conversation offers an example of how to become unapologetic about serving at a level that inspires, influences and impacts. So, if you're ready to stop chasing likes and followers and ready to shift into service that leads to earning millions, grab your pen and Move to Millions Podcast Notebook and listen in to discover: How to stop questioning your worth The significance of surrender The difference a praying family can make when you are not living up to your potential What real faith looks like How tithing 20% shifted everything And so much more Powerful Quotes from The Episode: “God is calling me to go all in.” Jeremy “You don't got to, you get to.” Jeremy “Now we in the overglow.” Jeremy “This is just the beginning, son.” Jeremy “I tend to wave the white flag often and that is when I win.” Jeremy “When I tried it my way, I lost. When I tried it Yahweh, I win.” Jeremy “We felt called to this work.” Jeremy “God is the guarantee.” Darnyelle “If I don't speak, we don't eat.” Jeremy “For six years, we tithed 30%.” Jeremy Move to Millions Wisdom Questions: Book: Cast of Characters Max Lucado and 360 leader John Maxwell Favorite Quote: My condition is not my conclusion Tool Jeremy Swears By: God will bless you with what He can trust you with How to Connect with Jeremy Anderson: Instagram: https://www.instagram.com/1jeremyanderson Facebook: https://www.facebook.com/jeremy.anderson.5268 Website: https://www.jeremyanderson.org/ Incredible One Enterprises, LLC is not responsible for the content and information delivered during the podcast interview by any guest. As always, we suggest that you conduct your own due diligence regarding any proclamations by podcast guests. Incredible One Enterprises, LLC is providing the podcast for informational purposes only. Want more of Darnyelle? Partner With Us To Scale Your Company Join the Move to Millions Facebook Group Register for the Seven Figure Summer Success Series Learn about Haus of Millions The Move to Millions Continuum Episode Social Media Links: http://www.instagram.com/darnyellejerveyharmon http://www.facebook.com/darnyellejerveyharmon http://www.twitter.com/darnyellejervey http://www.linkedin.com/in/darnyellejerveyharmon Subscribe to the Move to Millions Podcast: Listen on iTunes Listen on Google Play Listen on Stitcher Listen on iHeartRadio Listen on Pandora Leave us a review Are you subscribed to my podcast? If you're not, I want to encourage you to do that today. I don't want you to miss an episode. I'm adding a bunch of bonus episodes to the mix and if you're not subscribed there's a good chance you'll miss out on those. Now if you're feeling extra loving, I would be really grateful if you left me a review over on iTunes, too. Those reviews help other people find my podcast and they're also fun for me to go in and read. Just click here to review, select “Ratings and Reviews” and “Write a Review” and let me know what your favorite part of the podcast is. Thank you!

god speaking write serve llc companies millions next level haus ratings yahweh powerful quotes next level living jeremy anderson darnyelle incredible one enterprises jeremy you

Mayra Navarro on Getting to RubyConf (RubyConf 2023)

Play Episode Listen Later Nov 30, 2023 27:40

Mayra Navarro is an organizer of WNB.rb and Ruby Perú. Mayra shares how the Ruby community helped her get to RubyConf, going from project manager to developer, and the different ways people learn and communicate. This is the final interview recorded at RubyConf 2023 in San Diego. -- Mayra's Github Peruvian Digital Platform Codeable bootcamp Groups Ruby Perú WNB.rb Atlanta Ruby People Cody Norman Stefanni Brasil of hexdevs Dave Kimura of Drifting Ruby Conferences RubyConf RailsConf -- Transcript You can help correct transcripts on GitHub. [00:00:00] Jeremy: I hope you've been enjoying the conversations from RubyConf. Before we get started. I just want to say thank you to everyone. I met at the conference, all the guests who were so generous with their time. And to Irene from RubyConf for arranging a space and helping me connect with guests. [00:00:16] This final interview is with Mayra Navarro. She's an organizer of the Ruby community in Lima, Peru and a member of the women and non-binary Ruby online community. She's going to tell us how the community pulled together. Both friends of hers and strangers she had never met. To get her to RubyConf this year, we start the story in April where she's just finished attending the Ruby on rails conference, RailsConf in Atlanta. Getting to RubyConf [00:00:44] Mayra: So the thing is, in the last RailsConf, the last day that I was in the US, um, the day that I was returning to Peru, I got fired. (laughs) Yes. So I was with all the stress. [00:01:01] All the luggages that I had to pick or maybe overweight or something like that. And then I received that [00:01:08] Jeremy: Oh my gosh. [00:01:09] Mayra: And I, oh my gosh, what I can do? [00:01:13] Jeremy: That's terrible. [00:01:15] Mayra: It was awful. Since that, I think it was May 1st or something like that. And I was looking for job like everybody else who were fired all these times It was a difficult time for me. my plan was just before September I get a job so I have enough money and not using my, my savings for, for going to the RailsConf. Sorry, the RubyConf. but, uh, eventually at the end of September, October, I didn't get anything. [00:01:44] I'm Christian. So I, well, God doesn't want me to come here. [00:01:50] it there must be a reason, but there was something inside me that. I just to have to do something else. and I thank to my mom because she's someone that is always fighting for what she wants. [00:02:02] So I say, okay, I went to sleep that day that I say, okay, maybe I don't want to go. So next day I have the idea. Maybe you didn't use your last card. There is something else. That is something that I have from my mom. I can feel that. I say, what if you ask for money? Well, like a fundraising, I learned about that word later, and I say what, what else could you lose you don't have anything to lose right now. [00:02:30] So I say, okay, I'm going to write something. I asked Cody Norman, that is someone that I really appreciate right now. I asked him about suggestions, if that's a good idea or no, maybe not. He said, yes, you can do it. Uh, and I asked him if he could help me with the speech because I tried to write something and also I'm not good at writing things on Twitter and especially asking for money because I had to be open myself and be vulnerable to do that. [00:03:02] And it was like, uh, the last break for myself [00:03:05] a... I sent the speech to Cody, he helped me to update some things that I have to just improve. And I did it. I, also, I didn't know how to open a GoFundMe campaign because that's only for the US and Mexico. I think it doesn't exist in Peru. [00:03:23] So he said, Oh, there is another page that you can go. [00:03:27] I did it. So I just published that. I didn't open that until three, four hours later, because I was like, no, I don't want to see. And then I, I open it and I started to contact with the people who. [00:03:44] Well, who knows me because I like to be connected with a lot of people. I'm part of the FL RB even being in Peru, I am part of the FL RB. I go attend to the Atlanta Ruby Group. I go, I, I know a lot of people because of the conference. I try to help to the woman and non binary community also. I am organizer of the Ruby Peru, but I didn't want to ask them money for them, but I have some close friends from the conference that I, that I go for all. [00:04:13] All these two years. So they helped me to share that. And in two days, I got the money. [00:04:20] It was like a, I can't believe it. It is what, and I'm not good open myself for things like that. I love helping people, but it's difficult when I, you have to help yourself. So. All these people who I could see their names because it's, it's transferred to PayPal. [00:04:39] So I could see their name is like, uh, I really appreciate the thing that they don't know me. Some people, they don't know me, but, but I know them. I know who you are, if you're listening to this and I thank you appreciate for doing that. I also had the opportunity because I need to talk about this. I got a ticket from HexDev, uh, from Stephanie. [00:05:01] Jeremy: Oh... Hexdevs. Yeah. [00:05:02] Mayra: Yes. And, also I applied for the volunteer positions, just in case But I got the volunteer position. So what I did is, besides all my expense, I mean, that trip and also the hotel, expenses, I don't know, does it work? I, I said, I'm going to give this ticket that I have left to one of the women in the wnb community. [00:05:25] So I did that and say, I have a ticket and also I can share the room. I don't want to say her name because She's trying not to be too connected to social media, but she, she accepted sharing the room with me. [00:05:40] Jeremy: Yeah. [00:05:43] Mayra: So she's already here with me and I feel so happy because People not only helping me, they helping me to help and it feel like, wow. [00:05:53] Yeah. And that is, that is my story. And I still, well, I accepted to come to talk about this today because I received a job offer in the morning that I accepted. [00:06:06] So I wanted to. Send a happy ending for all my story about this. Yeah. And especially because I know that in the next conference I'll be with my own money, I could say, expenses. Asking for help [00:06:20] Jeremy: So was this all this year, the RailsConf? That was this year where you, you went to the conference, you enjoyed the talks and you were employed. And so it was the day that it was over that you, you found out that you got laid off. Wow. it's this, you have this high, right, you've met all your friends and, you know, you're learning all this stuff and you're really excited. [00:06:42] And then you get this notice and it's like, what, what happened? [00:06:47] Mayra: Yeah. That is what's happening. [00:06:50] Jeremy: then you, you kind of, like you said, you opened yourself up but yeah, it takes courage to say like, Hey, I need, you know, I need some help. [00:06:57] Mayra: Yeah, it's, this is something that I learned about this is always asking for help. This is something that I have been I bring into my life is always asking for help. I know as a woman, uh, as a woman, I have the thing that Try to be strong sometimes. I can do it by myself. I don't need help. [00:07:16] I don't need help, but sometimes you need, as human, you can open yourself. It's not something related to [00:07:23] gender it's more like humanity. That's how I feel right now. That is the feeling that I have and that is what is going to keep with me. For the rest of my life, I know. (laughs) [00:07:33] Jeremy: Yeah. Cause I, I think when, when people don't know, they, they might assume because you're so involved, right. With your, your local community and then the community internationally where people just assume that, Hey, Mayra is doing great. Right? She's, she's got no problems, no issues, and, there's just no way for people to know unless, you know, you, you share, and then that way people can help you, [00:07:58] Mayra: Yes. [00:07:59] Jeremy: That's a great story. I'm, I'm, I'm glad that, like, getting laid off is never good, right? That's never fun. But at least... Uh, things positive came out of it in terms of people coming out to support you, but also like you said, being able to support another, you know, another member of our community, [00:08:19] Mayra: And I would do it, and I would do it again. [00:08:21] I know that. Attending conferences [00:08:23] Jeremy: You know, now that you've gotten to come out, how, how has the, the conference been for you? Like, [00:08:29] Mayra: It was really good. I feel less insecure than the last time that I was here. (laughs) Actually, my first RubyConf was RubyConf Mini in Rhode Island. So this is like a, the Ruby real not the real one. It's just my more it's different. [00:08:48] Mayra: But, uh, the same time is. It's closer. That's how I feel it. I mean, this is my fifth conference. My first one was in Colombia. RubyConf Colombia. And I got a ticket as a scholarship. but until now I can say that it's like a, the feeling of the Ruby community, not only Ruby on Rails. Ruby community is like a It's really positive. It has changed my life so much since the first time that I joined to community that it's, I'm so happy to be developer instead of what, you know, everybody switched jobs. [00:09:20] I did too so it's like, uh, I won't regret. From project manager to developer [00:09:24] Jeremy: Hmm. Tell, tell me a little bit about that. How did you get interested in Ruby or, or have been involved with the community? [00:09:31] Mayra: I wasn't, I it was because money. Because it is. [00:09:34] Jeremy: That's a good reason. That's a good reason. [00:09:36] Mayra: Yeah. The thing is, I am graduated, uh, of the university. Uh, in Peru, but I was project manager before, well, I've switched a lot of careers because I was looking what I wanted to do and eventually I was project manager also. but I hated me in that position wasn't really good and it wasn't the company. [00:09:59] I knew it was me. I wasn't satisfied with my job and also I didn't like that, uh, working from nine to six every day in an office or something like that. It wasn't for me. So I remember that someone, one friend on Facebook shared something like a bootcamp that was about to start in Peru, that they were teaching Ruby on Rails. [00:10:21] I didn't know what was Ruby on Rails at the time. And then, and also React and JavaScript because, and you have to pay only if you get your first job. [00:10:32] Jeremy: Oh, this is like a bootcamp. [00:10:34] Mayra: Yeah, it's bootcamp, the first one that I met, I know, but that time there was someone called Laboratoria, but it's only related to JavaScript, but this one's a little bit more complete. So I apply, I didn't know that I could make it. I did it and it was an intensive bootcamp, six months from nine to six, but also I remember I didn't leave. [00:10:59] The place until 11 o'clock PM, because we were all 19 people there. So we really wanted to change our future. When everything ends, there was a moment when I, I could feel that Ruby also, especially Ruby on Rails, it was like, uh, something that I really like, uh, the syntax. Things like that. And also our teachers used to say, I can see who could be backend, who's going to be frontend. [00:11:31] I consider myself full stack, but she, she used to say that. I remember that. [00:11:36] Jeremy: So, which one were you? [00:11:38] Mayra: I was the Ruby side. The backend. [00:11:43] Mayra: Then I got an internship in the same companies who that was promoting the, the bootcamp. After three months, the, the, the internship ended because it was part of the contract but I wanted really to work in a place that they had Ruby on Rails. [00:12:05] So that's what I got. It took me more time than the rest of my friends, it's maybe it was like. four of us got a job in Ruby on Rails, uh, and I got mine. I remember my first job, full time job for Ruby on Rails was for the government of Peru. Actually, they use, they use Ruby on Rails for, CMS that they manage, that is called gov.pe. So I started working there. So it was a nice experience and I love, and I learned a lot about that experience also. So that is my story how it started. [00:12:44] Jeremy: Yeah. So you had talked about your friend and your friend referred you to the bootcamp, had you ever done any programming or anything related? [00:12:54] Mayra: When I was project manager, I had the opportunity to, to manage developers. I have developers in charge. So, but I had the kind of person that even I was. Your product manager, I try to help you to solve some things, like something that I say is a pseudocode. Instead of coding, I tried to give you the pseudocode that the things that you could do. So with that, maybe I can help. Well, my, my goal wanted to help you to unlock if you, you, you got stuck in something. [00:13:30] So with that, I just have a little bit of knowledge of what to do, but I. I felt that I hadn't the tools or I hadn't the skill to do that. That's why I decided to, to study in a bootcamp because they can teach me about the, that kind of tools because I couldn't study by myself. I couldn't. I can understand how the things goes right now, but at that time I, I was, I was lost. [00:13:56] Jeremy: But that's like, the, the skill that you already had as a project manager, being able to write the pseudocode and, and talk to your developers about the type of problem they're, they're trying to solve. That, that's a really important skill already. So I think, like, going into the bootcamp, nothing was totally new. [00:14:15] Right? I, I think that's really great that, that you got to see that beforehand and, and hopefully get a sense for like, that you might. Enjoy this, this sort of thing too, right? [00:14:25] Mayra: I love solving puzzles, so that was puzzle for me. I started with Code Wars. I know everybody started with that, but it's like a resolving puzzles. I need something there. And one of the things I really love I love helping people. I discovered that when I was helpdesk before. eventually all this time, even this time without job, I realized I can bring that. [00:14:47] Oh, I being more. aware about that I can help people just coding. So that's, that's what I've been doing all these months because I try to understand about gems or learning things more, but my focus is always going to be helping people. [00:15:03] Jeremy: I, I think that's really great that that's something that really appeals to you because that's something everybody needs. [00:15:12] Mayra: You know, the word that comes to my mind all the time that I say this is server. If you think about the word server, it's what I do. It receives something and gives you something. It's all the time. It's, you know, I know it's a machine or something like that, but if you think about the word, you are receiving something and giving something. [00:15:36] In all the time you are waiting for a, for a request to give something. So that is the word that it's, is for me, it's kind of helping people. [00:15:45] Jeremy: Hmm. we all serve one another in, in, in one way or another. Yeah. the boot camp, you said it was, uh, six months, and... You were, you were staying till 11 at night. So what, what was that experience like? How different people communicate and learn [00:16:02] Mayra: it's a nightmare, don't do it. (laughs) No, it was fun. that bootcamp changed my life. Before that bootcamp, I would say, I'm not going to say I was lack of some of the skills. The thing is, I didn't know how to, how to communicate. And one of the things that I learned besides that you need English or things like that, it's more about how to communicate with people because, there are multiple ways. [00:16:28] You can't talk in a way, for example, there's something that I'm always going to remember about it is you would prefer. WhatsApp, for example, and I would prefer Slack, or maybe voice, voice records instead of writing, or maybe an email. So, there must be a point between you and me. that might help us to communicate in a good way. [00:16:53] I, me, myself, especially, don't have to be forced people to do it in my way. It's a way of two, for two, you and me, and the best way, and try to do the same for the rest of the team, for example, or the rest of the people. Maybe they don't prefer this in this way, maybe another way. I have to be open to that. [00:17:12] Before the bootcamp I didn't know anything about it, but, and I tried to do it in my way. And then, right, thinking about it, it was a little bit selfish, but I need to learn. I need to be aware about [00:17:25] Jeremy: You're, you're referring to the way people. Communicate, or the ways people learn, or... [00:17:31] Mayra: Communicate actually. If we talk about the people, how the people learn. I am the example of, I am bad at listening book, podcasts. [00:17:41] Jeremy: Uh... Oh (laughs)... [00:17:42] Mayra: I'm sorry, I, I have ADHD. So it's hard for me to follow videos and podcasts because I have to. Pause, uh, Rewind, and Play again. And this is something when I miss so many ideas. So I prefer reading blogs or maybe transcriptions because it helps me just do reading. [00:18:04] And then return and continue reading when I can't understand something. it was difficult for me just to understand also that people prefer videos. Yes, it's not only my way to learn, it's their way to learn. And also we need to be open to that. Even when I used to, I mentor a couple of... people So, I had to be open to that also. [00:18:29] I am the kind of person you can write me at 2 a. m. and if I am awake, I'm going to answer you because maybe you are desperate for an answer at the time, but I can understand there are no people who are not. Who doesn't like, don't like that, so I try to be open to that or maybe improve our communications or maybe give some rules and not to think that everything is personal, right? [00:18:56] It's just, I hope the best of you. And I try that you get the best of me. (laughs) [00:19:03] Jeremy: Yeah, it's understanding their expectations, what they feel comfortable with, so that you know, It's okay if I send Mayra a message at 2 AM, but if I send someone else a message at 2 AM, it, it, maybe their phone dings and, you know, now they're distracted and, yeah, so, that, that makes sense. [00:19:26] Mayra: Yeah. If you, for example, I, I, I'm going to ask you because I learned that, can I send you a message at that time? But I, even I have to think about the time zone. [00:19:36] Jeremy: Yeah, oh, that's true, that's true. [00:19:38] Mayra: For example, because now I, I would have think about just my friends from Peru, but now I have friends all over the world because of the Woman Non-binary community. [00:19:49] So I have to think about things like that when I write. So what, all the things that has been useful for me is, for example, is like when you schedule a message that has been useful for me when I have to ask or sending messages. [00:20:03] Jeremy: It's interesting that you mentioned how with learning you prefer blogs and, and books and things like that because this may be a generalization, but maybe with, newer developers or, or younger people, a lot of them really like the video form Yesterday I was, interviewing, David Copeland, who he, he wrote a book about, sustainable web development with Ruby on Rails, and, yeah, we were talking a little bit about, it's like, so many people want video, is there a, is there still a place for me with my, you know, my, my book? [00:20:39] And stuff like that, and I think it's important to remember that there's people like yourself, and, um, I, I'm partly the same way, like, I like to be able to have the text so that I can read it at my own pace and copy and paste stuff and stuff like that. But you're right that everybody learns differently, so it, it makes sense for there to be the videos, for there to be, podcasts and blog posts. [00:21:05] There's different people who learn different ways [00:21:07] Mayra: And also, some people, including me, needs to pay for something if you want to learn something, because sometimes when it is free, you won't have the value that [00:21:22] Mayra: it has. [00:21:23] Jeremy: I totally understand that, yeah. Accessibility for videos and podcasts [00:21:26] Mayra: And there is something I would like to mention after you talk about this is I open to videos and podcasts, I can't take my time because I have now a lot of friends who, who create this type kind of content, but I like to remember that there are people with other difficult things. [00:21:45] No, it's, it's related to accessibility because when you have deaf people, they need. Transcriptions, they need closed captions. So maybe you are in a place with a lot of noise and it can help you, even if the video has closed captions, so people can read it. So it is something like, it's not me. It's more to be more open to people who are really has a disability. [00:22:12] That's a word that was like, so yeah. Or maybe they, their main language is not English and a lot of the content or the majority of the content about coding is in English. So when you have the transcriptions or you have the blog, you can translate it. And it's easy for that is access to them. [00:22:31] Jeremy: Yeah, that's definitely true. And I think even past people who aren't native speakers or have a disability, if you aren't in either of those categories, there's still a lot of people who they want to have a transcript or outside. Ruby or programming, people who watch movies and TV shows, a lot of them turn on the subtitles and they're native English speakers. [00:22:55] The dialogue is in English. They still want to see the subtitles. [00:22:59] Yeah, I think it's becoming very common. So, to your point when you have video having a way for people to still get the information another way, I think is helpful for everyone. Yeah. [00:23:15] Mayra: And it isn't too difficult nowadays because now you have AI or maybe, programs that can get you the, at least the closest words. [00:23:24] Jeremy: It gets you maybe 90 percent of the way there, so it definitely saves time, but I will say it is still work. [00:23:33] Mayra: Yes. Yes. It still work. Actually, it's because it could be a couple of words that need that maybe the, the program needs to improve. You can improve how the program, uh, translated, but it is something behind the meaning that you still can keep. [00:23:49] Jeremy: It being there and not perfect is kind of better than having nothing, I guess, yeah. What's next [00:23:55] Jeremy: Now that you've spent time at RubyConf, like, what are, like what are your plans next? [00:24:02] Mayra: There is a story behind all this, but I'm going to, the TLDR [00:24:06] Jeremy: Okay. [00:24:08] Mayra: Could be easy because I have a plenty of time without working to make a lot of thoughts in my mind. So it's just like, uh, I would like to explore more about Ruby, Ruby without Rails, something like that. So one of the things that I would like to do after this is just. [00:24:27] I would like to investigate more about the use of Ruby out of what is what application. It is something I was talking like a couple of hours ago, because there I found a blog about how is, how are the conference. The Ruby conference in Japan. So it was really interesting. It was, an article that is really old, but it got my attention because I never thought about it because I came from a boot camp and it was like a. [00:24:59] There is something else. So I could see a couple of talks about talking about, for example, Rack. [00:25:06] Mayra: So I will like how, oh, for example, we have another talk about how to create desktop applications with Ruby. So that is something that I would like to investigate. I would like to try also with the Ruby Peru community. [00:25:20] We decided to choose to investigate more about it and prepare talks about it in Spanish, because that is the mission. Our vision of our community is to create content in Spanish. [00:25:34] And also I was planning to give a lightning talk, but I wasn't ready yet to do it because I was nervous about, because I applied for jobs or things like that, I hadn't the time to prepare that, but actually I would like. I dunno if you heard about Dave Kimura and Drifting Ruby? [00:25:52] Jeremy: Uh, yes. Yes. They, they do the videos, right. [00:25:54] Mayra: He has a, blog on how to implement some kind of, uh, when your test test fails, you saw the light can change to red or green based on that The test that you are running. So it was really interesting. It isn't related to rails, but it, it is based on ruby so it's like, wow, I want to learn how to do that. Woman and non-binary community (WNB-rb.dev) [00:26:17] Jeremy: Anything else you want to mention or think we should have talked about? [00:26:22] Mayra: Yeah, because I am a member of the Women and Non-binary community. So if you are a woman or a non-binary person, you're invited to our community, we are open to, to you and we have meetups monthly. Uh, we have book clubs and we are always open to new ideas to share, to help you to do. that's it, I think. Yes. [00:26:47] Jeremy: And where can they find you if they're interested in that? [00:26:51] Mayra: Uh, we have a webpage, [00:26:53] wnb-rb.dev That is dev in English, I think. [00:27:00] Yes. And there you can find us. And also there's a form, where you can give us your, your info. It won't be shared only. No, it won't be shared only for the organizers. And that's all. We keep your privacy there. [00:27:17] Jeremy: Very cool. So [00:27:19] wnb-rb.dev [00:27:23] Mayra: yes, it is. [00:27:26] Jeremy: Well, Myra thank you so much for spending time to talk with me today. [00:27:30] Mayra: Thank you and sorry for my English. Ha ha ha ha [00:27:34] Jeremy: Your English is good, your English is much better than my Spanish. [00:27:38] Mayra: Okay.

god tv women ai english japan woman mexico spanish san diego adhd whatsapp colombia peru paypal communicate gofundme react rhode island slack lima accessibility attending github navarro cms javascript rewind rails rack transcription mayra tldr ruby on rails laboratoria railsconf rubyconf jeremy you jeremy it jeremy so

Sara Jackson on Teaching in Kanazawa (RubyConf 2023)

Play Episode Listen Later Nov 18, 2023 44:17

Sara is a team lead at thoughtbot. She talks about her experience as a professor at Kanazawa Technical College, giant LAN parties in Rochester, transitioning from Java to Ruby, shining a light on maintainers, and her closing thoughts on RubyConf. Recorded at RubyConf 2023 in San Diego. -- A few topics covered: Being an Assistant Arofessor in Kanazawa Teaching naming, formatting, and style Differences between students in Japan vs US Technical terms and programming resources in Japanese LAN parties at Rochester Transitioning from Java to Ruby Consulting The forgotten maintainer RubyConf Other links Sara's mastodon thoughtbot This Week in Open Source testdouble Ruby Central Scholars and Guides Program City Museum Japan International College of Technology Kanazawa RubyKaigi Applying mruby to World-first Small SAR Satellite (Japanese lightning talk) (mruby in space) Rochester Rochester Institute of Technology Electronic Gaming Society Tora-con Strong National Museum of Play Transcript You can help correct transcripts on GitHub. [00:00:00] Jeremy: I'm here at RubyConf, San Diego, with Sara Jackson, thank you for joining me today. [00:00:05] Sara: Thank you for having me. Happy to be here. [00:00:07] Jeremy: Sara right now you're working at, ThoughtBot, as a, as a Ruby developer, is that right? [00:00:12] Sara: Yes, that is correct. Teaching in Japan [00:00:14] Jeremy: But I think before we kind of talk about that, I mean, we're at a Ruby conference, but something that I, I saw, on your LinkedIn that I thought was really interesting was that you were teaching, I think, programming in. Kanazawa, for a couple years. [00:00:26] Sara: Yeah, that's right. So for those that don't know, Kanazawa is a city on the west coast of Japan. If you draw kind of a horizontal line across Japan from Tokyo, it's, it's pretty much right there on the west coast. I was an associate professor in the Global Information and Management major, which is basically computer science or software development. (laughs) Yep. [00:00:55] Jeremy: Couldn't tell from the title. [00:00:56] Sara: You couldn't. No.. so there I was teaching classes for a bunch of different languages and concepts from Java to Python to Unix and Bash scripting, just kind of all over. [00:01:16] Jeremy: And did you plan the curriculum yourself, or did they have anything for you? [00:01:21] Sara: It depended on the class that I was teaching. So some of them, I was the head teacher. In that case, I would be planning the class myself, the... lectures the assignments and grading them, et cetera. if I was assisting on a class, then usually it would, I would be doing grading and then helping in the class. Most of the classes were, uh, started with a lecture and then. Followed up with a lab immediately after, in person. [00:01:54] Jeremy: And I think you went to, is it University of Rochester? [00:01:58] Sara: Uh, close. Uh, Rochester Institute of Technology. So, same city. Yeah. [00:02:03] Jeremy: And so, you were studying computer science there, is that right? [00:02:07] Sara: I, I studied computer science there, but I got a minor in Japanese language. and that's how, that's kind of my origin story of then teaching in Kanazawa. Because Rochester is actually the sister city with Kanazawa. And RIT has a study abroad program for Japanese learning students to go study at KIT, Kanazawa Institute of Technology, in Kanazawa, do a six week kind of immersive program. And KIT just so happens to be under the same board as the school that I went to teach at. [00:02:46] Jeremy: it's great that you can make that connection and get that opportunity, yeah. [00:02:49] Sara: Absolutely. Networking! [00:02:52] Jeremy: And so, like, as a student in Rochester, you got to see how, I suppose, computer science education was there. How did that compare when you went over to Kanazawa? [00:03:02] Sara: I had a lot of freedom with my curriculum, so I was able to actually lean on some of the things that I learned, some of the, the way that the courses were structured that I took, I remember as a freshman in 2006, one of the first courses that we took, involved, learning Unix, learning the command line, things like that. I was able to look up some of the assignments and some of the information from that course that I took to inform then my curriculum for my course, [00:03:36] Jeremy: That's awesome. Yeah. and I guess you probably also remember how you felt as a student, so you know like what worked and maybe what didn't. [00:03:43] Sara: Absolutely. And I was able to lean on that experience as well as knowing. What's important and what, as a student, I didn't think was important. Naming, formatting, and style [00:03:56] Jeremy: So what were some examples of things that were important and some that weren't? [00:04:01] Sara: Mm hmm. For Java in particular, you don't need any white space between any of your characters, but formatting and following the general Guidelines of style makes your code so much easier to read. It's one of those things that you kind of have to drill into your head through muscle memory. And I also tried to pass that on to my students, in their assignments that it's. It's not just to make it look pretty. It's not just because I'm a mean teacher. It is truly valuable for future developers that will end up reading your code. [00:04:39] Jeremy: Yeah, I remember when I went through school. The intro professor, they would actually, they would print out our code and they would mark it up with red pen, basically like a writing assignment and it would be like a bad variable name and like, white space shouldn't be here, stuff like that. And, it seems kind of funny now, but, it actually makes it makes a lot of sense. [00:04:59] Sara: I did that. [00:04:59] Jeremy: Oh, nice. [00:05:00] Sara: I did that for my students. They were not happy about it. (laughs) [00:05:04] Jeremy: Yeah, at that time they're like, why are you like being so picky, right? [00:05:08] Sara: Exactly. But I, I think back to my student, my experience as a student. in some of the classes I've taken, not even necessarily computer related, the teachers that were the sticklers, those lessons stuck the most for me. I hated it at the time. I learned a lot. [00:05:26] Jeremy: Yeah, yeah. so I guess that's an example of things that, that, that matter. The, the aesthetics or the visual part for understanding. What are some things that they were teaching that you thought like, Oh, maybe this isn't so important. [00:05:40] Sara: Hmm. Pause for effect. (laughs) So I think that there wasn't necessarily Any particular class or topic that I didn't feel was as valuable, but there was some things that I thought were valuable that weren't emphasized very well. One of the things that I feel very strongly about, and I'm sure those of you out there can agree. in RubyWorld, that naming is important. The naming of your variables is valuable. It's useful to have something that's understood. and there were some other teachers that I worked with that didn't care so much in their assignments. And maybe the labs that they assigned had less than useful names for things. And that was kind of a disappointment for me. [00:06:34] Jeremy: Yeah, because I think it's maybe hard to teach, a student because a lot of times you are writing these short term assignments and you have it pass the test or do the thing and then you never look at it again. [00:06:49] Sara: Exactly. [00:06:50] Jeremy: So you don't, you don't feel that pain. Yeah, [00:06:53] Sara: Mm hmm. But it's like when you're learning a new spoken language, getting the foundations correct is super valuable. [00:07:05] Jeremy: Absolutely. Yeah. And so I guess when you were teaching in Kanazawa, was there anything you did in particular to emphasize, you know, these names really matter because otherwise you or other people are not going to understand what you were trying to do here? [00:07:22] Sara: Mm hmm. When I would walk around class during labs, kind of peek over the shoulders of my students, look at what they're doing, it's... Easy to maybe point out at something and be like, well, what is this? I can't tell what this is doing. Can you tell me what this does? Well, maybe that's a better name because somebody else who was looking at this, they won't know, I don't know, you know, it's in your head, but you will not always be working solo. my school, a big portion of the students went on to get technical jobs from after right after graduating. it was when you graduated from the school that I was teaching at, KTC, it was the equivalent of an associate's degree. Maybe 50 percent went off to a tech job. Maybe 50 percent went on to a four year university. And, and so as students, it hadn't. Connected with them always yet that oh, this isn't just about the assignment. This is also about learning how to interact with my co workers in the future. Differences between students [00:08:38] Jeremy: Yeah, I mean, I think It's hard, but, group projects are kind of always, uh, that's kind of where you get to work with other people and, read other people's code, but there's always that potential imbalance of where one person is like, uh, I know how to do this. I'll just do it. Right? So I'm not really sure how to solve that problem. Yeah. [00:09:00] Sara: Mm hmm. That's something that I think probably happens to some degree everywhere, but man, Japan really has groups, group work down. They, that's a super generalization. For my students though, when you would put them in a group, they were, they were usually really organized about who was going to do what and, kept on each other about doing things maybe there were some students that were a little bit more slackers, but it was certainly not the kind of polarized dichotomy you would usually see in an American classroom. [00:09:39] Jeremy: Yeah. I've been on both sides. I've been the person who did the work and the slacker. [00:09:44] Sara: Same. [00:09:46] Jeremy: And, uh, I feel bad about it now, but, uh, [00:09:50] Sara: We did what we had to do. [00:09:52] Jeremy: We all got the degree, so we're good. that is interesting, though. I mean, was there anything else, like, culturally different, you felt, from, you know, the Japanese university? [00:10:04] Sara: Yes. Absolutely. A lot of things. In American university, it's kind of the first time in a young person's life, usually, where they have the freedom to choose what they learn, choose where they live, what they're interested in. And so there's usually a lot of investment in your study and being there, being present, paying attention to the lecture. This is not to say that Japanese college students were the opposite. But the cultural feeling is college is your last time to have fun before you enter the real world of jobs and working too many hours. And so the emphasis on paying Super attention or, being perfect in your assignments. There was, there was a scale. There were some students that were 100 percent there. And then there were some students that were like, I'm here to get a degree and maybe I'm going to sleep in class a little bit. (laughs) That is another major difference, cultural aspect. In America, if you fall asleep in a meeting, you fall asleep in class, super rude. Don't do it. In Japan, if you take a nap at work, you take a nap in class, not rude. It's actually viewed as a sign of you are working really hard. You're usually working maybe late into the night. You're not getting enough sleep. So the fact that you need to take maybe a nap here or two here or there throughout the day means that you have put dedication in. [00:11:50] Jeremy: Even if the reason you're asleep is because you were playing games late at night. [00:11:54] Sara: Yep. [00:11:55] Jeremy: But they don't know that. [00:11:56] Sara: Yeah. But it's usually the case for my students. [00:11:59] Jeremy: Okay. I'm glad they were having fun at least [00:12:02] Sara: Me too. Why she moved back [00:12:04] Jeremy: That sounds like a really interesting experience. You did it for about two years? Three years. [00:12:12] Sara: So I had a three year contract with an option to extend up to five, although I did have a There were other teachers in my same situation who were actually there for like 10 years, so it was flexible. [00:12:27] Jeremy: Yeah. So I guess when you made the decision to, to leave, what was sort of your, your thinking there? [00:12:35] Sara: My fiance was in America [00:12:37] Jeremy: Good. [00:12:37] Sara: he didn't want to move to Japan [00:12:39] Jeremy: Good, reason. [00:12:39] Sara: Yeah, he was waiting three years patiently for me. [00:12:44] Jeremy: Okay. Okay. my heart goes out there . He waited patiently. [00:12:49] Sara: We saw each other. We, we were very lucky enough to see each other every three or four months in person. Either I would visit America or he would come visit me in Kanazawa. [00:12:59] Jeremy: Yeah, yeah. You, you couldn't convince him to, to fall in love with the country. [00:13:03] Sara: I'm getting there [00:13:04] Jeremy: Oh, you're getting Oh, [00:13:05] Sara: it's, We're making, we're making way. [00:13:07] Jeremy: Good, that's good. So are you taking like, like yearly trips or something, or? [00:13:11] Sara: That was, that was always my intention when I moved back so I moved back in the Spring of 2018 to America and I did visit. In 2019, the following year, so I could attend the graduation ceremony for the last group of students that I taught. [00:13:26] Jeremy: That's so sweet. [00:13:27] Sara: And then I had plans to go in 2020. We know what happened in 2020 [00:13:32] Jeremy: Yeah. [00:13:33] Sara: The country did not open to tourism again until the fall of 2022. But I did just make a trip last month. [00:13:40] Jeremy: Nice [00:13:40] Sara: To see some really good friends for the first time in four years. [00:13:43] Jeremy: Amazing, yeah. Where did you go? [00:13:46] Sara: I did a few days in Tokyo. I did a few days in Niigata cause I was with a friend who studied abroad there. And then a few days in Kanazawa. [00:13:56] Jeremy: That's really cool, yeah. yeah, I had a friend who lived there, but they were teaching English, yeah. And, I always have a really good time when I'm out there, yeah. [00:14:08] Sara: Absolutely. If anyone out there visiting wants to go to Japan, this is your push. Go do it. Reach out to me on LinkedIn. I will help you plan. [00:14:17] Jeremy: Nice, nice. Um, yeah, I, I, I would say the same. Like, definitely, if you're thinking about it, go. And, uh, sounds like Sara will hook you up. [00:14:28] Sara: Yep, I'm your travel guide. Technical terms in Japanese [00:14:31] Jeremy: So you, you studied, uh, you, you said you had a minor in Japanese? Yeah. So, so when you were teaching there, were you teaching classes in English or was it in Japanese? [00:14:42] Sara: It was a mix. Uh, when I was hired, the job description was no Japanese needed. It was a very, like, Global, international style college, so there was a huge emphasis on learning English. They wanted us to teach only in English. My thought was, it's hard enough learning computer science in your native language, let alone a foreign language, so my lectures were in English, but I would assist the labs in japanese [00:15:14] Jeremy: Oh, nice. Okay. And then, so you were basically fluent then at the time. Middle. Okay. Yeah. Yeah. Hey, well, I think if you're able to, to help people, you know, in labs and stuff, and it's a technical topic, right? So that's gotta be kind of a, an interesting challenge [00:15:34] Sara: I did learn a lot of new computer vocabulary. Yes. [00:15:39] Jeremy: So the words are, like, a lot of them are not the same? Or, you know, for, for specifically related to programming, I guess. [00:15:46] Sara: Hmm. Yeah, there are Japanese specific words. There's a lot of loan words that we use. We. Excuse me. There's a lot of loan words that Japanese uses for computer terms, but there's plenty that are just in Japanese. For example, uh, an array is hairetsu. [00:16:08] Jeremy: Okay. [00:16:08] Sara: And like a screen or the display that your monitor is a gamen, but a keyboard would be keyboard... Kībōdo, probably. [00:16:20] Jeremy: Yeah. So just, uh, so that, they use that as a loan word, I guess. But I'm not sure why not the other two. [00:16:27] Sara: Yeah, it's a mystery. [00:16:29] Jeremy: So it's just, it's just a total mix. Yeah. I'm just picturing you thinking like, okay, is it the English word or is it the Japanese word? You know, like each time you're thinking of a technical term. Yeah. [00:16:39] Sara: Mm hmm. I mostly, I, I I went to the internet. I searched for Japanese computer term dictionary website, and kind of just studied the terms. I also paid a lot of attention to the Japanese professors when they were teaching, what words they were using. Tried to integrate. Also, I was able to lean on my study abroad, because it was a technical Japanese, like there were classes that we took that was on technical Japanese. Computer usage, and also eco technology, like green technology. So I had learned a bunch of them previously. [00:17:16] Jeremy: Mm. So was that for like a summer or a year or something [00:17:20] Sara: It was six weeks [00:17:21] Jeremy: Six weeks. [00:17:21] Sara: During the summer, [00:17:22] Jeremy: Got it. So that's okay. So like, yeah, that must have been an experience like going to, I'm assuming that's the first time you had been [00:17:30] Sara: It was actually the second time [00:17:31] Jeremy: The second [00:17:32] Sara: Yeah. That was in 2010 that I studied abroad. [00:17:35] Jeremy: And then the classes, they were in Japanese or? Yeah. Yeah. That's, uh, that's, that's full immersion right there. [00:17:42] Sara: It was, it was very funny in the, in the very first lesson of kind of just the general language course, there was a student that was asking, I, how do I say this? I don't know this. And she was like, Nihongo de. [00:17:55] Jeremy: Oh (laughs) ! [00:17:56] Sara: You must, must ask your question only in [00:17:59] Jeremy: Yeah, Programming resources in Japanesez [00:17:59] Jeremy: yeah. yeah. That's awesome. So, so it's like, I guess the, the professors, they spoke English, but they were really, really pushing you, like, speak Japanese. Yeah, that's awesome. and maybe this is my bias because I'm an English native, but when you look up. Resources, like you look up blog posts and Stack Overflow and all this stuff. It's all in English, right? So I'm wondering for your, your students, when, when they would search, like, I got this error, you know, what do I do about it? Are they looking at the English pages or are they, you know, you know what I mean? [00:18:31] Sara: There are Japanese resources that they would use. They love Guguru (Google) sensei. [00:18:36] Jeremy: Ah okay. Okay. [00:18:38] Sara: Um, but yeah, there are plenty of Japanese language stack overflow equivalents. I'm not sure if they have stack overflow specifically in Japanese. But there are sites like that, that they, that they used. Some of the more invested students would also use English resources, but that was a minority. [00:19:00] Jeremy: Interesting. So there's a, there's a big enough community, I suppose, of people posting and answering questions and stuff where it's, you don't feel like, there aren't people doing the same thing as you out there. [00:19:14] Sara: Absolutely. Yeah. There's, a large world of software development in Japan, that we don't get to hear. There are questions and answers over here because of that language barrier. [00:19:26] Jeremy: Yeah. I would be, like, kind of curious to, to see, the, the languages and the types of problems they have, if they were similar or if it's, like, I don't know, just different. [00:19:38] Sara: Yeah, now I'm interested in that too, and I bet you there is a lot of research that we could do on Ruby, since Ruby is Japanese. [00:19:51] Jeremy: Right. cause something I've, I've often heard is that, when somebody says they're working with Ruby, Here in, um, the United States, a lot of times people assume it's like, Oh, you're doing a Rails app, [00:20:02] Sara: Mm hmm. [00:20:03] Jeremy: Almost, almost everybody who's using Ruby, not everyone, but you know, the majority I think are using it because of Rails. And I've heard that in Japan, there's actually a lot more usage that's, that's not tied to Rails. [00:20:16] Sara: I've also heard that, and I get the sense of that from RubyKaigi as well. Which I have never been lucky enough to attend. But, yeah, the talks that come out of RubyKaigi, very technical, low to the metal of Ruby, because there's that community that's using it for things other than Rails, other than web apps. [00:20:36] Jeremy: Yeah, I think, one of the ones, I don't know if it was a talk or not, but, somebody was saying that there is Ruby in space. [00:20:42] Sara: That's awesome. Ruby's everywhere. LAN parties in college [00:20:44] Jeremy: So yeah, I guess like another thing I saw, during your time at Rochester is you were, involved with like, there's like a gaming club I wonder if you could talk a little bit about your experience with that. [00:20:55] Sara: Absolutely, I can. So, at RIT, I was an executive board member for three or four years at the Electronic Gaming Society. EGS for short, uh, we hosted weekly console game nights in, the student alumni union area, where there's open space, kind of like a cafeteria. We also hosted quarterly land parties, and we would actually get people from out of state sometimes who weren't even students to come. Uh, and we would usually host the bigger ones in the field house, which is also where concerts are held. And we would hold the smaller ones in conference rooms. I think when I started in 2006, the, the, the LANs were pretty small, maybe like 50, 50 people bring your, your, your huge CRT monitor tower in. [00:21:57] Jeremy: Oh yeah, [00:21:57] Sara: In And then by the time I left in 2012. we were over 300 people for a weekend LAN party, um, and we were actually drawing more power than concerts do. [00:22:13] Jeremy: Incredible. what were, what were people playing at the time? Like when they would the LANs like, [00:22:18] Sara: Yep. Fortnite, early League of Legends, Call of Duty. Battlegrounds. And then also just like fun indie games like Armagedtron, which is kind of like a racing game in the style of [00:22:37] Jeremy: okay. Oh, okay, [00:22:39] Sara: Um, any, there are some like fun browser games where you could just mess with each other. Jackbox. Yeah. [00:22:49] Jeremy: Yeah, it's, it's interesting that, you know, you're talking about stuff like Fortnite and, um, what is it? Battlegrounds is [00:22:55] Sara: not Fortnite. Team Fortress. [00:22:58] Jeremy: Oh Team Fortress! [00:22:59] Sara: Sorry. Yeah. Oh, yeah, I got my, my names mixed up. Fortnite, I think, did not exist at the time, but Team Fortress was big. [00:23:11] Jeremy: Yeah. that's really cool that you're able to get such a big group there. is there something about Rochester, I guess, that that was able to bring together this many people for like these big LAN events? Because I'm... I mean, I'm not sure how it is elsewhere, but I feel like that's probably not what was happening elsewhere in the country. [00:23:31] Sara: Yeah, I mean, if you've ever been to, um, DreamHack, that's, that's a huge LAN party and game convention, that's fun. so... EGS started in the early 2000s, even before I joined, and was just a committed group of people. RIT was a very largely technical school. The majority of students were there for math, science, engineering, or they were in the computer college, [00:24:01] Jeremy: Oh, okay. [00:24:01] Sara: GCIS, G C C I S, the Gossano College of Computing and Information Sciences. So there was a lot of us there. [00:24:10] Jeremy: That does make sense. I mean, it's, it's sort of this, this bias that when there's people doing, uh, technical stuff like software, um, you know, and just IT, [00:24:21] Sara: Mm hmm. [00:24:23] Jeremy: there's kind of this assumption that's like, oh, maybe they play games. And it seems like that was accurate [00:24:27] Sara: It was absolutely accurate. And there were plenty of people that came from different majors. but when I started, there were 17, 000 students and so that's a lot of students and obviously not everyone came to our weekly meetings, but we had enough dedicated people that were on the eboard driving, You know, marketing and advertising for, for our events and things like that, that we were able to get, the good community going. I, I wasn't part of it, but the anime club at RIT is also huge. They run a convention every year that is huge, ToraCon, um. And I think it's just kind of the confluence of there being a lot of geeks and nerds on campus and Rochester is a college town. There's maybe like 10 other universities in [00:25:17] Jeremy: Well, sounds like it was a good time. [00:25:19] Sara: Absolutely would recommend. Strong Museum of Play [00:25:22] Jeremy: I've never, I've never been, but the one thing I have heard about Rochester is there's the, the Strong Museum of Play. [00:25:29] Sara: Yeah, that place is so much fun, even as an adult. It's kind of like, um, the, the Children's Museum in Indiana for, for those that might know that. it just has all the historical toys and pop culture and interactive exhibits. It's so fun. [00:25:48] Jeremy: it's not quite the same, but it, when you were mentioning the Children's Museum in, um, I think it's in St. Louis, there's, uh, it's called the City Museum and it's like a, it's like a giant playground, you know, indoors, outdoors, and it's not just for kids, right? And actually some of this stuff seems like kind of sketch in terms of like, you could kind of hurt yourself, you know, climbing [00:26:10] Sara: When was this made? [00:26:12] Jeremy: I'm not sure, but, uh, [00:26:14] Sara: before regulations maybe. ha. [00:26:16] Jeremy: Yeah. It's, uh, but it's really cool. So at the, at the Museum of Play, though, is it, There's like a video game component, right? But then there's also, like, other types of things, [00:26:26] Sara: Yeah, they have, like, a whole section of the museum that's really, really old toys on display, like, 1900s, 1800s. Um, they have a whole Sesame Street section, and other things like that. Yeah. From Java to Ruby [00:26:42] Jeremy: Check it out if you're in Rochester. maybe now we could talk a little bit about, so like now you're working at Thoughtbot as a Ruby developer. but before we started recording, you were telling me that you started, working with Java. And there was like a, a long path I suppose, you know, changing languages. So maybe you can talk a little bit about your experience there. [00:27:06] Sara: Yeah. for other folks who have switched languages, this might be a familiar story for you, where once you get a job in one technology or one stack, one language, you kind of get typecast after a while. Your next job is probably going to be in the same language, same stack. Companies, they hire based on technology and So, it might be hard, even if you've been playing around with Ruby in your free time, to break, make that barrier jump from one language to another, one stack to another. I mean, these technologies, they can take a little while to ramp up on. They can be a little bit different, especially if you're going from a non object oriented language to an object oriented, don't. Lose hope. (laughs) If you have an interest in Ruby and you're not a Rubyist right now, there's a good company for you that will give you a chance. That's the key that I learned, is as a software developer, the skills that you have that are the most important are not the language that you know. It's the type of thinking that you do, the problem solving, communication, documentation, knowledge sharing, Supporting each other, and as Saron the keynote speaker on Wednesday said, the, the word is love. [00:28:35] Jeremy: [00:28:35] Sara: So when I was job hunting, it was really valuable for me to include those important aspects in my skill, in my resume, in my CV, in my interviews, that like, I'm newer to this language because I had learned it at a rudimentary level before. Never worked in it really professionally for a long time. Um, when I was applying, it was like, look, I'm good at ramping up in technologies. I have been doing software for a long time, and I'm very comfortable with the idea of planning, documenting, problem solving. Give me a chance, please. I was lucky enough to find my place at a company that would give me a chance. Test Double hired me in 2019 as a remote. Software Consultant, and it changed my life. [00:29:34] Jeremy: What, what was it about, Ruby that I'm assuming that this is something that you maybe did in your spare time where you were playing with Ruby or? [00:29:43] Sara: I am one of those people that don't really code in their spare time, which I think is valuable for people to say. The image of a software developer being, well, if you're not coding in your spare time, then you're not passionate about it. That's a myth. That's not true. Some of us, we have other hobbies. I have lots of hobbies. Coding is not the one that I carry outside of the workplace, usually, but, I worked at a company called Constant Contact in 2014 and 2015. And while I was there, I was able to learn Ruby on Rails. [00:30:23] Jeremy: Oh, okay. So that was sort of, I guess, your experience there, on the job. I guess you enjoyed something about the language or something about Rails and then that's what made you decide, like, I would really love to, to... do more of this [00:30:38] Sara: Absolutely. It was amazing. It's such a fun language. The first time I heard about it was in college, maybe 2008 or 2009. And I remember learning, this looks like such a fun language. This looks like it would be so interesting to learn. And I didn't think about it again until 2014. And then I was programming in it. Coming from a Java mindset and it blew my mind, the Rails magic also, I was like, what is happening? This is so cool. Because of my typecasting sort of situation of Java, I wasn't able to get back to it until 2019. And I don't want to leave. I'm so happy. I love the language. I love the community. It's fun. [00:31:32] Jeremy: I can totally see that. I mean, when I first tried out Rails, yeah, it, like, you mentioned the magic, and I know some people are like, ah, I don't like the magic, but when, I think, once I saw what you could do, And how, sort of, little you needed to write, and the fact that so many projects kind of look the same. Um, yeah, that really clicked for me, and I really appreciated that. think that and the Rails console. I think the console is amazing. [00:32:05] Sara: Being able to just check real quick. Hmm, I wonder if this will work. Wait, no, I can check right now. I [00:32:12] Jeremy: And I think that's an important point you brought up too, about, like, not... the, the stereotype and I, I kind of, you know, showed it here where I assumed like, Oh, you were doing Java and then you moved to Ruby. It must've been because you were doing Ruby on the side and thought like, Oh, this is cool. I want to do it for my job. but I, I thought that's really cool that you were able to, not only that you, you don't do the programming stuff outside of work, but that you were able to, to find an opportunity where you could try something different, you know, in your job where you're still being paid. And I wonder, was there any, was there any specific intention behind, like, when you took that job, it was so that I can try something different, or did it just kind of happen? I'm curious what your... The appeal of consulting [00:32:58] Sara: I was wanting to try something different. I also really wanted to get into consulting. [00:33:04] Jeremy: Hmm. [00:33:05] Sara: I have ADHD. And working at a product company long term, I think, was never really going to work out for me. another thing you might notice in my LinkedIn is that a lot of my stays at companies have been relatively short. Because, I don't know, I, my brain gets bored. The consultancy environment is... Perfect. You can go to different clients, different engagements, meet new people, learn a different stack, learn how other people are doing things, help them be better, and maybe every two weeks, two months, three months, six months, a year, change and do it all over again. For some people, that sounds awful. For me, it's perfect. [00:33:51] Jeremy: Yeah, I hadn't thought about that with, with consulting. cause I, I suppose, so you said it's, it's usually about half a year between projects or is It [00:34:01] Sara: varies [00:34:01] Jeremy: It varies widely. [00:34:02] Sara: Widely. I think we try to hit the sweet spot of 3-6 months. For an individual working on a project, the actual contract engagement might be longer than that, but, yeah. Maintainers don't get enough credit [00:34:13] Jeremy: Yeah. And, and your point about how some people, they like to jump on different things and some people like to, to stick to the same thing. I mean, that, that makes a lot of, sense in terms of, I think maintaining software and like building new software. It's, they're both development, [00:34:32] Sara: Mm hmm. [00:34:32] Jeremy: they're very different. Right. [00:34:35] Sara: It's so funny that you bring that up because I highly gravitate towards maintaining over making. I love going to different projects, but I have very little interest in Greenfield, very little interest in making something new. I want to get into the weeds, into 10 years that nobody wants to deal with because the weeds are so high and there's dragons in there. I want to cut it away. I want to add documentation. I want to make it better. It's so important for us to maintain our software. It doesn't get nearly enough credit. The people that work on open source, the people that are doing maintenance work on, on apps internally, externally, Upgrades, making sure dependencies are all good and safe and secure. love that stuff. [00:35:29] Jeremy: That's awesome. We, we need more of you. (laughs) [00:35:31] Sara: There's plenty of us out there, but we don't get the credit (laughs) [00:35:34] Jeremy: Yeah, because it's like with maintenance, well, I would say probably both in companies and in open source when everything is working. Then Nobody nobody knows. Nobody says anything. They're just like, Oh, that's great. It's working. And then if it breaks, then everyone's upset. [00:35:51] Sara: Exactly. [00:35:53] Jeremy: And so like, yeah, you're just there to get yelled at when something goes wrong. But when everything's going good, it's like, [00:35:59] Sara: A job well done is, I was never here. [00:36:02] Jeremy: Yeah. Yeah. Yeah. I don't know how. To, you know, to fix that, I mean, when you think about open source maintainers, right, like a big thing is, is, is burnout, right? Where you are keeping the internet and all of our applications running and, you know, what you get for it is people yelling at you and the issues, right? [00:36:23] Sara: Yeah, it's hard. And I think I actually. Submitted a talk to RubyConf this year about this topic. It didn't get picked. That's okay. Um, we all make mistakes. I'm going to try to give it somewhere in the future, but I think one of the important things that we as an industry should strive for is giving glory. Giving support and kudos to maintenance work. I've been trying to do that. slash I have been doing that at ThoughtBot by, at some cadence. I have been putting out a blog post to the ThoughtBot blog called. This week in open source, the time period that is covered might be a week or longer in those posts. I give a summary of all of the commits that have been made to our open source projects. And the people that made those contributions with highlighting to new version releases, including patch level. And I do this. The time I, I, I took up the torch of doing this from a co worker, Mike Burns, who used to do it 10 years ago. I do this so that people can get acknowledgement for the work they do, even if it's fixing a broken link, even if it's updating some words that maybe don't make sense. All of it is valuable. [00:37:54] Jeremy: Definitely. Yeah. I mean, I, I think that, um, yeah, what's visible to people is when there's a new feature or an API change and Yeah, it's just, uh, people don't, I think a lot of people don't realize, like, how much work goes into just keeping everything running. [00:38:14] Sara: Mm hmm. Especially in the world of open source and Ruby on Rails, all the gems, there's so many different things coming out, things that suddenly this is not compatible. Suddenly you need to change something in your code because a dependency, however many steps apart has changed and it's hard work. The people that do those things are amazing. [00:38:41] Jeremy: So if anybody listening does that work, we, we appreciate you. [00:38:45] Sara: We salute you. Thank you. And if you're interested in contributing to ThoughtBot open source, we have lots of repos. There's one out there for you. Thoughts on RubyConf [00:38:54] Jeremy: You've been doing programming for quite a while, and, you're here at, at RubyConf. I wonder what kind of brings you to these, these conferences? Like, what do you get out of them? Um, I guess, how was this one? That sort of thing. [00:39:09] Sara: Well, first, this one was sick. This one was awesome. Uh, Ruby central pulled out all the stops and that DJ on Monday. In the event, in the exhibit hall. Wow. Amazing. So he told me that he was going to put his set up on Spotify, on Weedmaps Spotify, so go check it out. Anyway, I come to these conferences for people. I just love connecting with people. Those listening might notice that I'm an extrovert. I work remotely. A lot of us work remotely these days. this is an opportunity to see some of my coworkers. There's seven of us here. It's an opportunity to see people I only see at conferences, of which there are a lot. It's a chance to connect with people I've only met on Mastodon, or LinkedIn, or Stack Overflow. It's a chance to meet wonderful podcasters who are putting out great content, keeping our community alive. That's, that's the key for me. And the talks are wonderful, but honestly, they're just a side effect for me. They just come as a result of being here. [00:40:16] Jeremy: Yeah, it's kind of a unique opportunity, you know, to have so many of your, your colleagues and to just all be in the same place. And you know that anybody you talk to here, like if you talk about Ruby or software, they're not going to look at you and go like, I don't know what you're talking about. Like everybody here has at least that in common. So it's, yeah, it's a really cool experience to, to be able to chat with anybody. And it's like, You're all on the same page, [00:40:42] Sara: Mm hmm. We're all in this boat together. [00:40:45] Jeremy: Yup, that we got to keep, got to keep afloat according to matz [00:40:49] Sara: Gotta keep it afloat, yeah. [00:40:51] Jeremy: Though I was like, I was pretty impressed by like during his, his keynote and he had asked, you know, how many of you here, it's your first RubyConf and it felt like it was over half the room. [00:41:04] Sara: Yeah, I got the same sense. I was very glad to see that, very impressed. My first RubyConf was and it was the same sort of showing of [00:41:14] Jeremy: Nice, yeah. Yeah, actually, that was my first one, too. [00:41:17] Sara: Nice! [00:41:19] Jeremy: Uh, that was Nashville, Yeah, yeah, yeah. So it's, yeah, it's really interesting to see because, the meme online is probably like, Ah, Ruby is dead, or Rails is dead. But like you come to these conferences and yeah, there's, there's so many new people. There's like new people that are learning it and experiencing it and, you know, enjoying it the same way we are. So I, I really hope that the, the community can really, yeah, keep this going. [00:41:49] Sara: Continue, continue to grow and share. I love that we had first timer buttons, buttons where people could self identify as this is my first RubyConf and, and then that opens a conversation immediately. It's like, how are you liking it? What was your favorite talk? [00:42:08] Jeremy: Yeah, that's awesome. okay, I think that's probably a good place to start wrapping it. But is there anything else you wanted to mention or thought we should have talked about? [00:42:18] Sara: Can I do a plug for thoughtbot? [00:42:20] Jeremy: yeah, go for it. [00:42:21] Sara: Alright. For those of you out there that might not know what ThoughtBot does, we are a full software lifecycle or company lifecycle consultancy, so we do everything from market fit and rapid prototyping to MVPs to helping with developed companies, developed teams, maybe do a little bit of a Boost when you have a deadline or doing some tech debt. Pay down. We also have a DevOps team, so if you have an idea or a company or a team, you want a little bit of support, we have been around for 20 years. We are here for you. Reach out to us at thoughtbot.com. [00:43:02] Jeremy: I guess the thing about Thoughtbot is that, within the Ruby community specifically, they've been so involved with sponsorships and, and podcasts. And so, uh, when you hear about consultancies, a lot of times it's kind of like, well, I don't know, are they like any good? Do they know what they're doing? But I, I feel like, ThoughtBot has had enough, like enough of a public record. I feel It's like, okay, if you, if you hire them, um, you should be in good hands. [00:43:30] Sara: Yeah. If you have any questions about our abilities, read the blog. [00:43:35] Jeremy: It is a good blog. Sometimes when I'm, uh, searching for how to do something in Rails, it'll pop up, [00:43:40] Sara: Mm hmm. Me too. Every question I ask, one of the first results is our own blog. I'm like, oh yeah, that makes sense. [00:43:47] Jeremy: Probably the peak is if you've written the blog. [00:43:50] Sara: That has happened to my coworkers They're like, wait, I wrote a blog about this nine years ago. [00:43:55] Jeremy: Yeah, yeah. So maybe, maybe that'll happen to you soon. I, I know definitely people who do, um, Stack Overflow. And it's like, Oh, I like, this is a good answer. Oh, I wrote this. (laughs) yeah. Well, Sara, thank you so much for, for chatting with me today. [00:44:13] Sara: Absolutely, Jeremy. Thank you so much for having me. I was really glad to chat today.

united states america american university spotify world children english technology japan giving spring dj teaching management global japanese reach nashville san diego indiana adhd tokyo fall in love companies networking museum computers boost differences connected fortnite technical call of duty programming excuse guidelines cv rochester api naming sesame street coding python bash upgrades java league of legends github lan mvps crt mastodon computing devops rails greenfield submitted battlegrounds stack overflow information science unix ruby on rails rochester institute rit constant contact dreamhack team fortress lans jackbox in american egs saron mike burns niigata city museum maintainers nihongo kanazawa strong museum ktc thoughtbot rubyconf test double software consultant rubyist rubykaigi jeremy you sara jackson jeremy it sara it jeremy so sara well

ChaelCodes on The Joy of Programming Games and Streaming (RubyConf 2023)

Latent Space: The AI Engineer Podcast â€” CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

Play Episode Listen Later Nov 15, 2023 43:53

Episode Notes Rachael Wright-Munn (ChaelCodes) talks about her love of programming games (games with programming elements in them, not how to make games!), starting her streaming career with regex crosswords, and how streaming games and open source every week led her to a voice acting role in one of her favorite programming games. Recorded at RubyConf 2023 in San Diego. mastodon twitch Personal website Programming Games mentioned: Regex Crossword SHENZHEN I/O EXAPUNKS 7 Billion Humans One Dreamer Code Rom@ntic Bitburner Transcript You can help edit this transcript on GitHub. Jeremy: I'm here at RubyConf San Diego with Rachel Wright-Munn, and she goes by Chaelcodes online. Thanks for joining me today. Rachael: Hi, everyone. Hi, Jeremy. Really excited to be here. Jeremy: So probably the first thing I'll ask about is on your web page, and I've noticed you have streams, you say you have an interest in not just regular games, but programming games, so. Rachael: Oh my gosh, I'm so glad you asked about this. Okay, so I absolutely love programming games. When I first started streaming, I did it with Regex Crossword. What I really like about it is the fact that you have this joyful environment where you can solve puzzles and work with programming, and it's really focused on the experience and the joy. Are you familiar with Zach Barth of Zachtronics? Jeremy: Yeah. So, I've tried, what was it? There's TIS-100. And then there's the, what was the other one? He had one that's... Rachael: Opus Magnum? Shenzhen I/O? Jeremy: Yeah, Shenzhen I/O. Rachael: Oh, my gosh. Shenzhen I/O is fantastic. I absolutely love that. The whole conceit of it, which is basically that you're this electronics engineer who's just moved to Shenzhen because you can't find a job in the States. And you're trying to like build different solutions for these like little puzzles and everything. It was literally one of the, I think that was the first programming game that really took off just because of the visuals and everything. And it's one of my absolute favorites. I really like what he says about it in terms of like testing environments and the developer experience. Cause it's built based on assembly, right? He's made a couple of modifications. Like he's talked about it before where it's like The memory allocation is different than what it would actually look like in assembly and the way the registers are handled I believe is different, I wouldn't think of assembly as something that's like fun to write, but somehow in this game it is. How far did you get in it? Jeremy: Uh, so I didn't get too far. So, because like, I really like the vibe and sort of the environment and the whole concept, right, of you being like, oh, you've been shipped off to China because that's the only place that these types of jobs are, and you're working on these problems with bad documentation and stuff like that. And I like the whole concept, but then the actual writing of the software, I was like, I don't know. Rachael: And it's so hard, one of the interesting things about that game is you have components that you drop on the board and you have to connect them together and wire them, but then each component only has a specific number of lines. So like half the time I would be like, oh, I have this solution, but I don't have enough lines to actually run it or I can't fit enough components, then you have to go in and refactor it and everything. And it's just such a, I don't know, it's so much fun for me. I managed to get through all of the bonus levels and actually finish it. Some of them are just real, interesting from both a story perspective and interesting from a puzzle perspective. I don't wanna spoil it too much. You end up outside Shenzhen, I'll just say that. Jeremy: OK. That's some good world building there. Rachael: Yeah. Jeremy: Because in your professional life, you do software development work. So I wonder, what is it about being in a game format where you're like, I'm in it. I can do it more. And this time, I'm not even being paid. I'm just doing it for fun. Rachael: I think for me, software development in general is a very joyful experience. I love it. It's a very human thing. If you think about it like math, language, all these things are human concepts and we built upon that in order to build software in our programs and then on top of that, like the entire purpose of everything that we're building is for humans, right? Like they don't have rats running programs, you know what I mean? So when I think about human expression and when I think about programming, these two concepts are really closely linked for me and I do see it as joyful, But there are a lot of things that don't spark joy in our development processes, right? Like lengthy test suites, or this exhausting back and forth, or sometimes the designs, and I just, I don't know how to describe it, but sometimes you're dealing with ugly code, sometimes you're dealing with code smells, and in your professional developer life, sometimes you have to put up with that in order to ship features. But when you're working in a programming game, It's just about the experience. And also there is a correct solution, not necessarily a correct solution, but like there's at least one correct solution. You know for a fact that there's, that it's a solvable problem. And for me, that's really fun. But also the environment and the story and the world building is fun as well, right? So one of my favorite ones, we mentioned Shenzhen, but Zachtronics also has Exapunks. And that one's really fun because you have been infected by a disease. And like a rogue AI is the only one that can provide you with the medicine you need to prevent it. And what this disease is doing is it is converting parts of your body into like mechanical components, like wires and everything. So what you have to do as an engineer is you have to write the code to keep your body running. Like at one point, you were literally programming your heart to beat. I don't have problems like that in my day job. In my day job, it's like, hey, can we like charge our customers more? Like, can we put some banners on these pages? Like, I'm not hacking anybody's hearts to keep them alive. Jeremy: The stakes are a little more interesting. Yeah, yeah. Rachael: Yeah, and in general, I'm a gamer. So like having the opportunity to mix two of my passions is really fun. Jeremy: That's awesome. Yeah, because that makes sense where you were saying that there's a lot of things in professional work where it's you do it because you have to do it. Whereas if it's in the context of a game, they can go like, OK, we can take the fun problem solving part. We can bring in the stories. And you don't have to worry about how we're going to wrangle up issue tickets. Rachael: Yeah, there are no Jira tickets in programming games. Jeremy: Yeah, yeah. Rachael: I love what you said there about the problem solving part of it, because I do think that that's an itch that a lot of us as engineers have. It's like we see a problem, and we want to solve it, and we want to play with it, and we want to try and find a way to fix it. And programming games are like this really small, compact way of getting that dopamine hit. Jeremy: For sure. Yeah, it's like. Sometimes when you're doing software for work or for an actual purpose, there may be a feeling where you want to optimize something or make it look really nice or perform really well. And sometimes it just doesn't matter, right? It's just like we need to just put it out and it's good enough. Whereas if it's in the context of a game, you can really focus on like, I want to make this thing look pretty. I want to feel good about this thing I'm making. Rachael: You can make it look good, or you can make it look ugly. You don't have to maintain it. After it runs, it's done. Right, right, right. There's this one game. It's 7 Billion Humans. And it's built by the creators of World of Goo. And it's like this drag and drop programming solution. And what you do is you program each worker. And they go solve a puzzle. And they pick up blocks and whatever. But they have these shredders, right? And the thing is, you need to give to the shredder if you have like a, they have these like little data blocks that you're handing them. If you're not holding a data block and you give to the shredder, the worker gives themself to the shredder. Now that's not ideal inside a typical corporate workplace, right? Like we don't want employees shredding themselves. We don't want our workers terminating early or like anything like that. But inside the context of a game, in order to get the most optimal solution, They have like a lines of code versus fastest execution and sometimes in order to win the end like Lines of code. You just kind of have to shred all your workers at the, When I'm on stream and I do that when I'm always like, okay everybody close your eyes That's pretty good it's Yeah, I mean cuz like in the context of the game. Jeremy: I think I've seen where they're like little They're like little gray people with big eyes Yes, yes, yes, yes. Yeah, so it's like, sorry, people. It's for the good of the company, right? Rachael: It's for my optimal lines of code solution. I always draw like a, I always write a humane solution before I shred them. Jeremy: Oh, OK. So it's, you know, I could save you all, but I don't have to. Rachael: I could save you all, but I would really like the trophy for it. There's like a dot that's going to show up in the elevator bay if I shred you. Jeremy: It's always good to know what's important. But so at the start, you mentioned there was a regular expression crossword or something like that. Is that how you got started with all this? Rachael: My first programming game was Regex Crossword. I absolutely loved it. That's how I learned Regex. Rachael: I love it a lot. I will say one thing that's been kind of interesting is I learned Regex through Regex Crossword, which means there's actually these really interesting gaps in my knowledge. What was it? at Link Tech Retreat, they had like a little Regex puzzle, and it was like forward slash T and then a plus, right? And I was like, I have no idea what that character is, right? Like, I know all the rest of them. But the problem is that forward slash T is tab, and Regex crossword is a browser game. So you can't have a solution that has tab in it. And have that be easy for users. Also, the idea of like greedy evaluation versus lazy evaluation doesn't apply, because you're trying to find a word that satisfies the regex. So it's not necessarily about what the regex is going to take. So it's been interesting finding those gaps, but I really think that some of the value there was around how regex operates and the rules underlying it and building enough experience that I can now use the documentation to fill in any gaps. Jeremy: So the crossword, is it where you know the word and you have to write a regular expression to match it? Or what's the? Rachael: They give you regex. And there's a couple of different versions, right? The first one, you have two regex patterns. There's one going up and down, and there's one going left and right. And you have to fill the crossword block with something that matches both regular expressions. Rachael: Then we get into hexagonal ones. Yeah, where you have angles and a hexagon, and you end up with like three regular expressions. What's kind of interesting about that one is I actually think that the hexagonal regex crosswords are a little bit easier because you have more rules and constraints, which are more hints about what goes in that box. Jeremy: Interesting. OK, so it's the opposite of what I was thinking. They give you the regex rules, and then you put in a word that's going to satisfy all the regex you see. Rachael: Exactly. When I originally did it, they didn't have any sort of hints or anything like that. It was just empty. Now it's like you click a box, and then they've got a suggestion of five possible letters that could go in there. And it just breaks my heart. I liked the old version that was plainer, and didn't have any hints, and was harder. But I acknowledge that the new version is prettier, and probably easier, and more friendly. But I feel like part of the joy that comes from games, that comes from puzzles, It comes from the challenge, and I miss the challenge. Jeremy: I guess someone, it would be interesting to see people who are new to it, if they had tried the old way, if they would have bounced off of it. Rachael: I think you're probably right. I just want them to give me a toggle somewhere. Jeremy: Yeah, oh, so they don't even let you turn off the hints, they're just like, this is how it is. Rachael: Yep. Jeremy: Okay. Well, we know all about feature flags. Rachael: And how difficult they are to maintain in perpetuity. Jeremy: Yeah, but no, that sounds really cool because I think some things, like you can look up a lot of stuff, right? You can look up things about regex or look up how to use them. But I think without the repetition and without the forcing yourself to actually go through the motion, without that it's really hard to like learn and pick it up. Rachael: I completely agree with you. I think the repetition, the practice, and learning the paradigm and patterns is huge. Because like even though I didn't know what forward slash t plus was, I knew that forward slash t was going to be some sort of character type. Jeremy: Yeah, it kind of reminds me of, there was, I'm not sure if you've heard of Vim Adventures, but... Rachael: I did! I went through the free levels. I had a streamerversary and my chat had completed a challenge where I had to go learn Vim. So I played a little bit of Vim Adventures. Jeremy: So I guess it didn't sell you. Rachael: Nope, I got Vim Extensions turned on. Jeremy: Oh, you did? Rachael: Yeah, I have the Vim extension turned on in VS Code. So I play with a little bit of sprinkling of Vim in my everyday. Jeremy: It's kind of funny, because I am not a Vim user in the sense that I don't use it as my daily editor or anything like that. But I do the same thing with the extensions in the browser. I like being able to navigate with the keyboard and all that stuff. Rachael: Oh, that is interesting. That's interesting. You have a point like memorizing all of the different patterns when it comes to like Keyboard navigation and things like that is very similar to navigating in Vim. I often describe writing code in Vim is kind of like solving a puzzle in order to write your code So I think that goes back to that Puzzle feeling that puzzle solving feeling we were having we were talking about before. Jeremy: Yeah, I personally can't remember, but whenever I watch somebody who's, really good at using Vim, it is interesting to see them go, oh, yes, I will go to the fifth word, and I will swap out just this part. And it's all just a few keystrokes, yeah. Rachael: Very impressive. Can be done just as well with backspace and, like, keyboard, like, little arrows and everything. But there is something fun about it and it is... Faster-ish. Jeremy: Yeah, I think it's like I guess it depends on the person, but for some people it's like they, they can think and do things at the speed that they type, you know, and so for them, I guess the the flow of, I'm doing stuff super fast using all these shortcuts is probably helpful to them. Rachael: I was talking to someone last night who was saying that they don't even think about it in Vim anymore. They just do it. I'm not there yet. (laughs) Jeremy: Yeah, I'll probably never be there (laughs) But yeah, it is something to see when you've got someone who's really good at it. Rachael: Definitely. I'm kind of glad that my chat encouraged and pressured me to work with Vim. One of the really cool things is when I'm working on stuff, I'll sometimes be like, oh, I want to do this. Is there a command in Vim for that? And then I'll get multiple suggestions or what people think, and ideas for how I can handle things better. Someone recently told me that if you want to delete to the end of a line, you can use capital D. And this whole time I was doing lowercase d dollar sign. Jeremy: Oh, right, right, right. Yeah. Yeah, it's like there's so many things there that, I mean, we should probably talk about your experiences streaming. But that seems like a really great benefit that you can be working through a problem or just doing anything, really. And then there's people who they're watching, and they're like, I know how to do it better. And they'll actually tell you, yeah. Rachael: I think that being open to that is one of the things that's most important as a streamer. A lot of people get into this cycle where they're very defensive and where they feel like they have to be the expert. But one of the things that I love about my chat is the fact that they do come to me with these suggestions. And then I can be open to them, and I can learn from them. And what I can do is I can take those learnings from one person and pass it on to the other people in chat. I can become a conduit for all of us to learn. Jeremy: So when you first decided to start streaming, I guess what inspired you to give it a shot? Like, what were you thinking? Rachael: That's a great question. It's also kind of a painful question. So the company that I was working for, I found out that there were some pay issues with regards to me being a senior, promotion track, things like that. And it wasn't the first time this had happened, right? Like, I often find that I'm swapping careers every two to three years because of some miserable experience at the company. Like you start and the first year is great. It's fantastic. It's awesome. But at the end of it, you're starting to see the skeletons and that two to three years later you're burnt out. And what I found was that every two to three years I was losing everything, right? Like all of my library of examples, the code that I would reference, like that's in their private repo. When it came to my professional network, the co -workers that liked and respected me, we had always communicated through the workplace Slack. So it's really hard to get people to move from the workplace Slack to like Instagram or Twitter or one of those other places if that's not where, if that's not a place where you're already used to talking to them. And then the other thing is your accomplishments get wiped out, right? Like when you start at the next company and you start talking about promotion and things like that, the work that you did at previous companies doesn't matter. They want you to be a team lead at that company. They want you to lead a massive project at that company and that takes time. It takes opportunities and Eventually, I decided that I wanted to exist outside my company. Like I wanted to have a reputation that went beyond that and that's what originally inspired me to stream And it's pretty hard to jump from like oh. I got really frustrated and burnt out at my company to I've got it I'm gonna do some regex crossword on stream, but honestly, that's what it was right was I just wanted to slowly build this reputation in this community outside of of my company and it's been enormously valuable in terms of my confidence, in terms of my opportunities. I've been able to pick up some really interesting jobs and I'm able to leverage some of those experiences in really clear professional ways and it's really driven me to contribute more to open source. I mentioned that I have a lot of people like giving me advice and suggestions and feedback. That's enormously helpful when you're going out there and you're trying to like get started in open source and you're trying to build that confidence and you're trying to build that reputation. I often talk about having a library of examples, right? Like your best code that you reference again and again and again. If I'm streaming on Twitch, everything that I write has to be open source because I'm literally showing it on video, right? So it's really encouraged me to build that out. And now when I'm talking to my coworkers and companies, I can be like, oh, we need to talk about single table inheritance. I did that in Hunter's Keepers. Why don't we go pull that up and we'll take a look at it. Or are we building a Docker image? I did that in Hunter's Keepers and Conf Buddies. Why don't we look at these, compare them, and see if we can get something working here, right? Like I have all of these examples, and I even have examples from other apps as well. Like I added Twitch Clips to 4M. So when I want to look at how to build a liquid tag, because Jekyll uses liquid tags as well. So when I'm looking at that, I can hop to those examples and hop between them, and I'm never going to lose access to them. Jeremy: Yeah, I mean, that's a really good point where I think a lot of people, they do their work at their job and it's never going to be seen by anyone and you can sort of talk about it, but you can't actually show anybody what you did. So it's like, and I think to that point too, is that there's some knowledge that is very domain specific or specific to that company. And so when you're actually doing open source work, it's something that anybody can pick up and use and has utility way beyond just your company. And the whole point of creating this record, that makes a lot of sense too, because if I wanna know if you know how to code, I can just see like, wow, she streams every Thursday. She's clearly she knows what she's doing and you know, you have these also these open source contributions as well So it's it's sort of like it's not this question of if I interview you It's it's not I'm just going off of your word that and I believe what you're saying. But rather it's kind of the proof is all it's all out there. Rachael: Oh, definitely if I were to think about my goals and aspirations for the future I've been doing this for four years still continuing But I think I would like to get to the point where I don't really have to interview. Where an interview is more of a conversation between me and somebody who already knows they want to hire me. Jeremy: Have you already started seeing a difference? Like you've been streaming for about four years I think Rachael: I had a really interesting job for about eight months doing developer relations with New Relic. That was a really interesting experience. And I think it really pushed the boundaries of what I understood myself to be capable of because I was able to spend 40 hours a week really focused on content creation, on blogging, on podcasting, on YouTube videos and things like that. Obviously there was a lot of event organization and things like that as well. But a lot of the stuff that came out of that time is some of my best work. Like I, I'm trying to remember exactly what I did while I was at New Relic, but I saw a clear decrease afterwards. But yeah, I think that was probably close to the tipping point. I don't for sure know if I'm there yet, right? Like you never know if you're at the point where you don't have to interview anymore until you don't have to interview. But the last two jobs, no, I haven't had to interview. Jeremy: So, doing it full -time, how did you feel about that versus having a more traditional lead or software developer role? Rachael: It was definitely a trade-off. So I spent a lot less time coding and a lot more time with content, and I think a little bit of it was me trying to balance the needs and desires of my audience against the needs and desires of my company. For me, and this is probably going to hurt my chances of getting one of those jobs where I don't have to interview in the future, but my community comes first, right? They're the people who are gonna stick with me when I swap between jobs, but that was definitely something that I constantly had to think about is like, how do I balance what my company wants from me with the responsibility that I have to my community? But also like my first talk, your first open source contribution, which was at RubyConf Denver, Like, that was written while I was at New Relic. Like, would I have had the time to work on a talk in addition to the streaming schedule and everything else? Um, for a period of time, I was hosting Ruby Galaxy, which was a virtual meetup. It didn't last very long, and we have deprecated it. Um, I deprecated it before I left the company because I wanted to give it, like, a good, clean ending versus, um, necessarily having it, like, linger on and be a responsibility for other people. but... I don't think I would have done those if I was trying to balance it with my day job. So, I think that that was an incredible experience. That said, I'm very glad it's over. I'm very glad that the only people I'm beholden to are my community now. Jeremy: So, is it the sheer amount that you had to do that was the main issue? Or is it more that that tension between, like you said, serving your audience and your community versus serving your employer? Rachael: Oh, a lot of it was tension. A lot of it was hectic, event management in general. I think if you're like planning and organizing events, that's a very challenging thing to do. And it's something that kind of like goes down to the deadline, right? And it's something where everybody's trying to like scramble and pull things together and keep things organized. And that was something that I don't think I really enjoyed. I like to have everything like nice and planned out and organized and all that sort of stuff, and I don't think that that's Something that happens very often in event management at least not from my experience So these were like in -person events or what types of events like I actually skipped out before the in -person events. They would have been in -person events. We had future stack at New Relic, which is basically like this big gathering where you talk about things you can do with New Relic and that sort of stuff. We all put together talks for that. We put together an entire like. Oh gosh, I'm trying to remember the tool that we use, but it was something similar to gather round where you like interact with people. And there's just a lot that goes into that from marketing to event planning to coordinating with everyone. I'm grateful for my time at New Relic and I made some incredible friends and some incredible connections and I did a lot, but yeah, I'm very glad I'm not in DevRel anymore. I don't, if you ask any DevRel, They'll tell you it's hectic, they'll tell you it's chaotic, and they'll tell you it's a lot of work. Jeremy: Yeah. So it sounds like maybe the streaming and podcasting or recording videos, talks, that part you enjoy, but it's the I'm responsible for planning this event for all these people to, you know. That's the part where you're like, OK, maybe not for me. Rachael: Yeah, kind of. I describe myself as like a content creator because I like to just like dabble and make things, right? Like I like to think about like, what is the best possible way to craft this tweet or this post or like to sit there and be like, okay, how can I structure this blog post to really communicate what I want people to understand? When it comes to my streams, what I actually do is I start with the hero's journey as a concept. So every single stream, we start with an issue in the normal world, right? And then what we do is we get drawn into the chaos realm as we're like debugging and trying to build things and going Back and forth and there's code flying everywhere and the tests are red and then they're green and then they're red and then they're green and then finally at the end we come back to the normal world as we create this PR and, Submit it neither merge it or wait for maintainer feedback. And for me that Story arc is really key and I like I'm a little bit of an artist. I like the artistry of it. I like the artistry of the code, and I like the artistry of creating the content. I think I've had guests on the show before, and sometimes it's hard to explain to them, like, no, no, no, this is a code show. We can write code, and that's great, but that's not what it's about. It's not just about the end product. It's about bringing people along with us on the journey. And sometimes it's been three hours, and I'm not doing a great job of bringing people along on the journey so like you know I'm tooting my own horn a little bit here but like that is important to me. Jeremy: So when you're working through a problem, When you're doing it on stream versus you're doing it by yourself, what are the key differences in how you approach the problem or how you work through it? Rachael: I think it's largely the same. It's like almost exactly the same. What I always do is, when I'm on stream, I pause, I describe the problem, I build a test for it, and then I start working on trying to fix what's wrong. I'm a huge fan of test -driven development. The way I see it, you want that bug to be reproducible, and a test gives you the easiest way to reproduce it. For me, it's about being easy as much as it is about it being the right way or not. But yeah, I would say that I approach it largely in the same way. I was in the content creator open space a little bit earlier, and I had to give them a bit of a confession. There is one small difference when I'm doing something on stream versus when I'm doing something alone. Sometimes, I have a lot of incredible senior staff, smart, incredible people in my chat. I'll describe the problem in vivid detail, and then I'll take my time writing the test, and by the time I'm done writing the test, somebody will have figured out what the problem is, and talk back to me about it. I very rarely do that. It's more often when it's an ops or an infrastructure or something like that. A great example of this is like the other day I was having an issue, I mentioned the Vim extensions. If I do command P on the code section, Vim extensions was capturing that, and so it wasn't opening the file. So one of my chatters was like, oh, you know, you can fix that if you Google it. I was like, oh, I don't know. I mean, I could Google it, but it will take so long and distract from the stream. Literally less than 15 minutes later a chatter had replied with like, here's exactly what to add to your VS Code extension, and I knew that was gonna happen. So that's my little secret confession. That's the only difference when I'm debugging things on stream is sometimes I'll let chat do it for me. Jeremy: Yeah, that's a superpower right there. Rachael: It is, and I think that happens because I am open to feedback and I want people to engage with me and I support that and encourage that in my community. I think a lot of people sometimes get defensive when it comes to code, right? Like when it comes to the languages or the frameworks that we use, right? There's a little bit of insecurity because you dive so deep and you gain so much knowledge that you're kind of scared that there might be something that's just as good because it means you might not have made the right decision. And I think that affects us when it comes to code reviews. I think it affects us when we're like writing in public. And I think, yeah, and I think it affects a lot of people when they're streaming, where they're like, if I'm not the smartest person in the room, and why am I the one with a camera and a microphone? But I try to set that aside and be like, we're all learning here. Jeremy: And when people give that feedback, and it's good feedback, I think it's really helpful when people are really respectful about it and kind about it. Have you had any issues like having to moderate that or make sure it stays positive in the context of the stream? Rachael: I have had moderation issues before, right? Like, I'm a woman on the internet, I'm going to have moderation issues. But for me, when it comes to feedback and suggestions, I try to be generous with my interpretation and my understanding of what they're going with. Like people pop in and they'll say things like, Ruby is dead, Rails is dead. And I have commands for that to like remind them, no, actually Twitch is a Rails app. So like, no, it's definitely not dead. You just used it to send a message. But like, I try to be understanding of where people are coming from and to meet them where they are, even if they're not being the most respectful. And I think what I've actually noticed is that when I do that, their tone tends to change. So I have two honorary trolls in my chat, Kego and John Sugar, and they show up and they troll me pretty frequently. But I think that that openness, that honesty, like that conversation back and forth it tends to defuse any sort of aggressive tension or anything. Jeremy: Yeah, and it's probably partly a function of how you respond, and then maybe the vibe of your stream in general probably brings people that are. Rachael: No, I definitely agree. I think so. Jeremy: Yeah. Rachael: It's the energy, you get a lot of the energy that you put out. Jeremy: And you've been doing this for about four years, and I'm having trouble picturing what it's even like, you know, you've never done a stream and you decide I'm gonna turn on the camera and I'm gonna code live and, you know, like, what was kind of going through your mind? How did you prepare? And like, what did, like, what was that like? Rachael: Thank you so much. That's a great question. So, actually, I started with Regex Crossword because it was structured, right? Like, I didn't necessarily know what I wanted to do and what I wanted to work on, but with Regex Crossword, you have a problem and you're solving it. It felt very structured and like a very controlled environment, and that gave me the confidence to get comfortable with, like, I'm here, I have a moderator, right? Like we're talking back and forth, I'm interacting with chatters, and that allowed me to kind of build up some skills. I'm actually a big fan of Hacktoberfest. I know a lot of people don't like it. I know a lot of people are like, oh, there are all these terrible spam PRs that show up during Hacktoberfest and open source repositories. But I'm a really big fan because I've always used it to push my boundaries, right? Like every single year, I've tried to take a new approach on it. So the first year that I did it, I decided that what I wanted to do to push my boundaries was to actually work on an application. So this one was called Hunter's Keepers. It was an app for managing characters in Monster of the Week and it was a Reels app because that's what I do professionally and that's what I like to work on. So I started just building that for Hacktoberfest and people loved it. It got a ton of engagement, way more than Regex Crossword and a little bit, like those open source streams continue to do better than the programming games, but I love the programming games so much that I don't wanna lose them, but that's where it kind of started, right? Was me sitting there and saying like, oh, I wanna work on these Rails apps. The Hacktoberfest after that one, And I was like, OK, I worked on my own app in the open, and I've been doing that for basically a year. I want to work on somebody else's app. So I pushed myself to contribute to four different open source repositories. One of the ones I pushed myself to work on was 4M. They did not have Twitch clips as embeds. They had YouTube videos and everything else. And I looked into how to do it, and I found out how liquids tags work, and I had a ton of other examples. I feel like extensions like that are really great contributions to open source because it's an easy way with a ton of examples that you can provide value to the project, and it's the sort of thing where, like, if you need it, other people probably need it as well. So I went and I worked on that, and I made some Twitch clips. And that was like one of my first like external open source project contributions. And that kind of snowballed, right? Because I now knew how to make a liquid tag. So when I started working on my Jekyll site, and I found out that they had liquid tags that were wrapped in gems, I used that as an opportunity to learn how to build a gem. And like how to create a gem that's wrapped around a liquid tag. And that exists now and is a thing that I've done. And so it's all of these little changes and moments that have stacked on top of each other, right? Like it's me going in and saying, OK, today I'd like to customize my alerts. Or like, today I'd like to buy a better microphone and set it up and do these changes. It's not something that changed all at once, right? It's just this small putting in the time day by day, improving. I say like the content gears are always grinding. You always need something new to do, right? And that's basically how my stream has gone for the last four years, is I'm just always looking for something new to do. We haven't talked about this yet, but I'm a voice actress in the programming video game, One Dreamer. And I actually collaborated with the creator of another one, Compressor, who like reached out to me about that Steam key. But the reason that I was able to talk to these people and I was able to reach out to them is rooted in Regex Crossword, right? Cause I finished Regex Crossword and Thursday night was like my programming game stream. And I loved them, so I kept doing them. And I kept picking up new games to play, and I kept exploring new things. So at the end of it, I ended up in this place where I had this like backlog in knowledge and history around programming games. So when Compressor was developed, I think he's like the creator, Charlie Bridge is like a VP at Arm or something. And okay, I should back up a little bit. Compressor is this game where you build CPUs with Steam. So it's like Steam Punk, like, electrical engineering components. Ah, it's so much fun. And like, the characters are all cool, because it's like you're talking to Nikola Tesla, and like Charles Babbage, and Ada Lovelace, and all this sort of stuff. It's just super fun. But the reason he reached out to me was because of that reputation, that backlog, that feedback. Like, when you think about how you became a developer, right, it's day by day, right? when you develop your experience. There's a moment where you look back and you're like, I just have all of these tools in my toolkit. I have all of these experiences. I've done all these things, and they just stack to become something meaningful. And that's kind of how it's gone with my stream, is just every single day I was trying to push, do something new. Well, not every day. Sometimes I have a lazy day, but like, but like I am continuously trying to find new ground to tread. Jeremy: Yeah, I mean that's really awesome thinking about how it went from streaming you solving these regex crosswords to all the way to ending up in one of these games that you play. Yeah, that's pretty pretty cool. Rachael: By the way, that is my absolute favorite game. So the whole reason that I'm in the game is because I played the demo on stream. Jeremy: Oh, nice. Rachael: And I loved it. Like I immediately was like, I'm going to go join the creators discord. This is going to be my game of the year. I can't wait to like make a video on this game. What's really cool about this one is that it uses programming as a mechanic and the story is the real driver. It's got this emotional impact and story. The colors are gorgeous and the way you interact with the world, like it is a genuine puzzle game where the puzzles are small, little, simple programming puzzles. And not like I walk up to this and like I solve a puzzle and the door opens. No, it's like you're interacting with different components in the world and wiring them together in order to get the code working. The whole premise is that there's an indie game developer who's gone through this really traumatic experience with his game, and now he's got the broken game, and he's trying to fix it in time for a really important game demo. I think it's like, it's like Vig something. Video game indie gaming. But what happened is I started following the creator, and I was super interested in them. And then he actually reached out to me about like the Steve workshop and then he was looking for people to voice act and I was like me please yes so yeah that's how I got involved with it yeah that's awesome it's like everything came full circle I guess it's like where you started and yeah no absolutely it's amazing. Jeremy: And so what was that experience like the voice acting bit? I'm assuming you didn't have professional experience with that before. Rachael: No, no, no, no. I had to do a lot of research into like how to voice act. My original ones were tossed out. I just, OK, so there's one line in it. This is going to this is so embarrassing. I can't believe I'm saying this on a podcast. There's one line that's like, it's a beautiful day to code. It's like a, because I'm an NPC, right? So like you can keep interacting with me and one of the like cycling ones is like, it's a beautiful day to code. Well, I tried to deliver it wistfully. Like I was staring out a window and I was like, it's a beautiful day to code. And every single person who heard it told me that it sounded like somewhat sensual, sexy. And I was dying because I had just sent this to this like indie game developer that like I appreciated and he replies back and he's like, I'm not sure if there was an audio issue with some of these, but could you like rerecord some of these? So I was very inexperienced. I did a lot of practicing, a lot of vocal exercises, but I think that it turned out well. Jeremy: That's awesome. So you kind of just kept trying and sending samples, or did they have anybody like try and coach you? Rachael: No, I just kept sending samples. I did watch some YouTube videos from like real voice actors. To try and like figure out what the vocal exercises were. One of the things that I did at first was I sent him like one audio, like the best one in my opinion. And he replied back being like, no, just record this like 10, 20 times. Send it to me and I'll chop the one I want. Jeremy: So the, anytime you did that, the one they picked, was it ever the one you thought was the best one? Rachael: Oh gosh, I don't think I actually like, Wow, I don't think I've gone back over the recordings to figure out which one I thought was the best one. Or like checked which one he picked out of the ones that I recorded. Oh, that's interesting. I'm going to have to do that after this. Jeremy: You're going to listen to all the, it's a beautiful day to code. Rachael: The final version is like a nice, neutral like, it's a beautiful day to code. One of the really cool things about that, though, is my character actually triggers the end of game scene, which is really fun. You know how you get a little hint that's like, oh, this is where the end of the game is, my character gets to do that. Jeremy: That's a big responsibility. Rachael: It is. I was so excited when I found out. Jeremy: That's awesome. Cool. Well, I think that's probably a good place to wrap it up on. But is there anything else you want to mention, or any games you want to recommend? Rachael: Oh, I think I mentioned all of them. I think if you look at Code Romantic, AXA Punks, Bitburner, is an idle JavaScript game that can be played in the browser where you write the custom files and build it and you're going off and hacking servers and stuff like that. It's a little light on story. One Dreamer, yeah. I think if you look at those four to five games, you will find one you like. Oh, it's 7 Billion Humans. Jeremy: Oh, right, yeah. Rachael: I haven't written the blog post yet, but that's my five programming video games that you should try if you've never done one before. 7 million humans is on mobile, so if you've got a long flight back from RubyConf, it might be a great choice. Jeremy: Oh, there you go. Rachael: Yeah. Other than that, it can be found at chael.codes, chael.codes/links for the socials, chael.codes/about for more information about me. And yeah, thank you so much for having me. This has been so much fun. Jeremy: Awesome. Well, Rachel, thank you so much for taking the time. Rachael: Thank you.

world ai google china pr games video story san diego twitch monster states streaming lines steam programming slack arm puzzle reels github npc jekyll keepers javascript rails keyboard nikola tesla 4m steampunk docker prs shenzhen cpus jira goo ada lovelace vig vim vs code new relic devrel compressor charles babbage hacktoberfest regex rubyconf zachtronics billion humans jeremy you jeremy it exapunks jeremy so zach barth shenzhen i o

The End of Finetuning — with Jeremy Howard of Fast.ai

Play Episode Listen Later Oct 19, 2023 69:15

Thanks to the over 17,000 people who have joined the first AI Engineer Summit! A full recap is coming. Last call to fill out the State of AI Engineering survey! See our Community page for upcoming meetups in SF, Paris and NYC.This episode had good interest on Twitter.Fast.ai's “Practical Deep Learning” courses been watched by over >6,000,000 people, and the fastai library has over 25,000 stars on Github. Jeremy Howard, one of the creators of Fast, is now one of the most prominent and respected voices in the machine learning industry; but that wasn't always the case. Being non-consensus and right In 2018, Jeremy and Sebastian Ruder published a paper on ULMFiT (Universal Language Model Fine-tuning), a 3-step transfer learning technique for NLP tasks: The paper demonstrated that pre-trained language models could be fine-tuned on a specific task with a relatively small amount of data to achieve state-of-the-art results. They trained a 24M parameters model on WikiText-103 which was beat most benchmarks.While the paper had great results, the methods behind weren't taken seriously by the community: “Everybody hated fine tuning. Everybody hated transfer learning. I literally did tours trying to get people to start doing transfer learning and nobody was interested, particularly after GPT showed such good results with zero shot and few shot learning […] which I was convinced was not the right direction, but who's going to listen to me, cause as you said, I don't have a PhD, not at a university… I don't have a big set of computers to fine tune huge transformer models.”Five years later, fine-tuning is at the center of most major discussion topics in AI (we covered some like fine tuning vs RAG and small models fine tuning), and we might have gotten here earlier if Jeremy had OpenAI-level access to compute and distribution. At heart, Jeremy has always been “GPU poor”:“I've always been somebody who does not want to build stuff on lots of big computers because most people don't have lots of big computers and I hate creating stuff that most people can't use.”This story is a good reminder of how some of the best ideas are hiding in plain sight; we recently covered RWKV and will continue to highlight the most interesting research that isn't being done in the large labs. Replacing fine-tuning with continued pre-trainingEven though fine-tuning is now mainstream, we still have a lot to learn. The issue of “catastrophic forgetting” and potential solutions have been brought up in many papers: at the fine-tuning stage, the model can forget tasks it previously knew how to solve in favor of new ones. The other issue is apparent memorization of the dataset even after a single epoch, which Jeremy covered Can LLMs learn from a single example? but we still don't have the answer to. Despite being the creator of ULMFiT, Jeremy still professes that there are a lot of open questions on finetuning:“So I still don't know how to fine tune language models properly and I haven't found anybody who feels like they do.”He now advocates for "continued pre-training" - maintaining a diversity of data throughout the training process rather than separate pre-training and fine-tuning stages. Mixing instructional data, exercises, code, and other modalities while gradually curating higher quality data can avoid catastrophic forgetting and lead to more robust capabilities (something we covered in Datasets 101).“Even though I originally created three-step approach that everybody now does, my view is it's actually wrong and we shouldn't use it… the right way to do this is to fine-tune language models, is to actually throw away the idea of fine-tuning. There's no such thing. There's only continued pre-training. And pre-training is something where from the very start, you try to include all the kinds of data that you care about, all the kinds of problems that you care about, instructions, exercises, code, general purpose document completion, whatever. And then as you train, you gradually curate that, you know, you gradually make that higher and higher quality and more and more specific to the kinds of tasks you want it to do. But you never throw away any data….So yeah, that's now my view, is I think ULMFiT is the wrong approach. And that's why we're seeing a lot of these so-called alignment tax… I think it's actually because people are training them wrong.An example of this phenomena is CodeLlama, a LLaMA2 model finetuned on 500B tokens of code: while the model is much better at code, it's worse on generic tasks that LLaMA2 knew how to solve well before the fine-tuning. In the episode we also dive into all the places where open source model development and research is happening (academia vs Discords - tracked on our Communities list and on our survey), and how Jeremy recommends getting the most out of these diffuse, pseudonymous communities (similar to the Eleuther AI Mafia).Show Notes* Jeremy's Background* FastMail* Optimal Decisions* Kaggle* Enlitic* fast.ai* Rachel Thomas* Practical Deep Learning* fastai for PyTorch* nbdev* fastec2 (the underrated library we describe)* Can LLMs learn from a single example?* the Kaggle LLM Science Exam competition, which “challenges participants to answer difficult science-based questions written by a Large Language Model”.* Sebastian Ruder* Alec Radford* Sylvain Gugger* Stephen Merity* Chris Lattner* Modular.ai / Mojo* Jono Whittaker* Zeiler and Fergus paper* ULM Fit* DAWNBench* Phi-1* Code Llama* AlexNetTimestamps* [00:00:00] Intros and Jeremy's background* [00:05:28] Creating ULM Fit - a breakthrough in NLP using transfer learning* [00:06:32] The rise of GPT and the appeal of few-shot learning over fine-tuning* [00:10:00] Starting Fast.ai to distribute AI capabilities beyond elite academics* [00:14:30] How modern LMs like ChatGPT still follow the ULM Fit 3-step approach* [00:17:23] Meeting with Chris Lattner on Swift for TensorFlow at Google* [00:20:00] Continued pre-training as a fine-tuning alternative* [00:22:16] Fast.ai and looking for impact vs profit maximization* [00:26:39] Using Fast.ai to create an "army" of AI experts to improve their domains* [00:29:32] Fast.ai's 3 focus areas - research, software, and courses* [00:38:42] Fine-tuning memorization and training curve "clunks" before each epoch* [00:46:47] Poor training and fine-tuning practices may be causing alignment failures* [00:48:38] Academia vs Discords* [00:53:41] Jeremy's high hopes for Chris Lattner's Mojo and its potential* [01:05:00] Adding capabilities like SQL generation through quick fine-tuning* [01:10:12] Rethinking Fast.ai courses for the AI-assisted coding era* [01:14:53] Rapid model development has created major technical debt* [01:17:08] Lightning RoundAI Summary (beta)This is the first episode we're trying this. Here's an overview of the main topics before you dive in the transcript. * Jeremy's background and philosophies on AI* Studied philosophy and cognitive science in college* Focused on ethics and thinking about AI even 30 years ago* Believes AI should be accessible to more people, not just elite academics/programmers* Created fast.ai to make deep learning more accessible* Development of transfer learning and ULMFit* Idea of transfer learning critical for making deep learning accessible* ULMFit pioneered transfer learning for NLP* Proposed training general language models on large corpora then fine-tuning - this became standard practice* Faced skepticism that this approach would work from NLP community* Showed state-of-the-art results on text classification soon after trying it* Current open questions around fine-tuning LLMs* Models appear to memorize training data extremely quickly (after 1 epoch)* This may hurt training dynamics and cause catastrophic forgetting* Unclear how best to fine-tune models to incorporate new information/capabilities* Need more research on model training dynamics and ideal data mixing* Exciting new developments* Mojo and new programming languages like Swift could enable faster model innovation* Still lots of room for improvements in computer vision-like innovations in transformers* Small models with fine-tuning may be surprisingly capable for many real-world tasks* Prompting strategies enable models like GPT-3 to achieve new skills like playing chess at superhuman levels* LLMs are like computer vision in 2013 - on the cusp of huge new breakthroughs in capabilities* Access to AI research* Many key convos happen in private Discord channels and forums* Becoming part of these communities can provide great learning opportunities* Being willing to do real work, not just talk about ideas, is key to gaining access* The future of practical AI* Coding becoming more accessible to non-programmers through AI assistance* Pre-requisite programming experience for learning AI may no longer be needed* Huge open questions remain about how to best train, fine-tune, and prompt LLMsTranscriptAlessio: Hey everyone, welcome to the Latent Space Podcast. This is Alessio, partner and CTO at Residence at Decibel Partners, and I'm joined by my co-host Swyx, founder of Smol AI. [00:00:21]Swyx: Hey, and today we have in the remote studio, Jeremy Howard all the way from Australia. Good morning. [00:00:27]Jeremy: The remote studio, also known as my house. Good morning. Nice to see you. [00:00:32]Swyx: Nice to see you too. I'm actually very used to seeing you in your mask as a message to people, but today we're mostly audio. But thank you for doing the very important public service of COVID awareness. It was a pleasure. [00:00:46]Jeremy: It was all very annoying and frustrating and tedious, but somebody had to do it. [00:00:52]Swyx: Somebody had to do it, especially somebody with your profile. I think it really drives home the message. So we tend to introduce people for them and then ask people to fill in the blanks on the personal side. Something I did not know about you was that you graduated with a BA in philosophy from the University of Melbourne. I assumed you had a PhD. [00:01:14]Jeremy: No, I mean, I barely got through my BA because I was working 80 to 100 hour weeks at McKinsey and Company from 19 years old onwards. So I actually didn't attend any lectures in second and third year university. [00:01:35]Swyx: Well, I guess you didn't need it or you're very sort of self-driven and self-motivated. [00:01:39]Jeremy: I took two weeks off before each exam period when I was working at McKinsey. And then, I mean, I can't believe I got away with this in hindsight, I would go to all my professors and say, oh, I was meant to be in your class this semester and I didn't quite turn up. Were there any assignments I was meant to have done, whatever. I can't believe all of them let me basically have it. They basically always would say like, okay, well, if you can have this written by tomorrow, I'll accept it. So yeah, stressful way to get through university, but. [00:02:12]Swyx: Well, it shows that, I guess, you min-maxed the opportunities. That definitely was a precursor. [00:02:18]Jeremy: I mean, funnily, like in as much as I, you know, in philosophy, the things I found interesting and focused on in the little bit of time I did spend on it was ethics and cognitive science. And it's kind of really amazing that it's now come back around and those are actually genuinely useful things to know about, which I never thought would happen. [00:02:38]Swyx: A lot of, yeah, a lot of relevant conversations there. So you were a consultant for a while and then in the magical month of June 1989, you founded both Optimal Decisions and Fastmeal, which I also briefly used. So thank you for that. [00:02:53]Jeremy: Oh, good for you. Yeah. Cause I had read the statistics, which is that like 90% or something of small businesses fail. So I thought if I start two businesses, I have a higher chance. In hindsight, I was thinking of it as some kind of stochastic thing I didn't have control over, but it's a bit odd, but anyway. [00:03:10]Swyx: And then you were president and chief scientist at Kaggle, which obviously is the sort of composition platform of machine learning. And then Enlitic, where you were working on using deep learning to improve medical diagnostics and clinical decisions. Yeah. [00:03:28]Jeremy: I was actually the first company to use deep learning in medicine, so I kind of founded the field. [00:03:33]Swyx: And even now that's still like a pretty early phase. And I actually heard you on your new podcast with Tanish, where you went very, very deep into the stuff, the kind of work that he's doing, such a young prodigy at his age. [00:03:47]Jeremy: Maybe he's too old to be called a prodigy now, ex-prodigy. No, no. [00:03:51]Swyx: I think he still counts. And anyway, just to round out the bio, you have a lot more other credentials, obviously, but most recently you started Fast.ai, which is still, I guess, your primary identity with Rachel Thomas. So welcome. [00:04:05]Jeremy: Yep. [00:04:06]Swyx: Thanks to my wife. Thank you. Yeah. Doing a lot of public service there with getting people involved in AI, and I can't imagine a better way to describe it than fast, fast.ai. You teach people from nothing to stable diffusion in seven weeks or something, and that's amazing. Yeah, yeah. [00:04:22]Jeremy: I mean, it's funny, you know, when we started that, what was that, like 2016 or something, the idea that deep learning was something that you could make more accessible was generally considered stupid. Everybody knew that deep learning was a thing that you got a math or a computer science PhD, you know, there was one of five labs that could give you the appropriate skills and that you would join, yeah, basically from one of those labs, you might be able to write some papers. So yeah, the idea that normal people could use that technology to do good work was considered kind of ridiculous when we started it. And we weren't sure if it was possible either, but we kind of felt like we had to give it a go because the alternative was we were pretty sure that deep learning was on its way to becoming, you know, the most or one of the most, you know, important technologies in human history. And if the only people that could use it were a handful of computer science PhDs, that seemed like A, a big waste and B, kind of dangerous. [00:05:28]Swyx: Yeah. [00:05:29]Alessio: And, you know, well, I just wanted to know one thing on your bio that at Kaggle, you were also the top rank participant in both 2010 and 2011. So sometimes you see a lot of founders running companies that are not really in touch with the problem, but you were clearly building something that you knew a lot about, which is awesome. Talking about deep learning, you created, published a paper on ULM fit, which was kind of the predecessor to multitask learning and a lot of the groundwork that then went to into Transformers. I've read back on the paper and you turned this model, AWD LSTM, which I did the math and it was like 24 to 33 million parameters, depending on what training data set you use today. That's kind of like not even small, it's like super small. What were some of the kind of like contrarian takes that you had at the time and maybe set the stage a little bit for the rest of the audience on what was kind of like the state of the art, so to speak, at the time and what people were working towards? [00:06:32]Jeremy: Yeah, the whole thing was a contrarian take, you know. So okay, so we started Fast.ai, my wife and I, and we thought, yeah, so we're trying to think, okay, how do we make it more accessible? So when we started thinking about it, it was probably 2015 and then 2016, we started doing something about it. Why is it inaccessible? Okay, well, A, no one knows how to do it other than a few number of people. And then when we asked those few number of people, well, how do you actually get good results? They would say like, oh, it's like, you know, a box of tricks that aren't published. So you have to join one of the labs and learn the tricks. So a bunch of unpublished tricks, not much software around, but thankfully there was Theano and rappers and particularly Lasagna, the rapper, but yeah, not much software around, not much in the way of data sets, you know, very hard to get started in terms of the compute. Like how do you get that set up? So yeah, no, everything was kind of inaccessible. And you know, as we started looking into it, we had a key insight, which was like, you know what, most of the compute and data for image recognition, for example, we don't need to do it. You know, there's this thing which nobody knows about, nobody talks about called transfer learning, where you take somebody else's model, where they already figured out like how to detect edges and gradients and corners and text and whatever else, and then you can fine tune it to do the thing you want to do. And we thought that's the key. That's the key to becoming more accessible in terms of compute and data requirements. So when we started Fast.ai, we focused from day one on transfer learning. Lesson one, in fact, was transfer learning, literally lesson one, something not normally even mentioned in, I mean, there wasn't much in the way of courses, you know, the courses out there were PhD programs that had happened to have recorded their lessons and they would rarely mention it at all. We wanted to show how to do four things that seemed really useful. You know, work with vision, work with tables of data, work with kind of recommendation systems and collaborative filtering and work with text, because we felt like those four kind of modalities covered a lot of the stuff that, you know, are useful in real life. And no one was doing anything much useful with text. Everybody was talking about word2vec, you know, like king plus queen minus woman and blah, blah, blah. It was like cool experiments, but nobody's doing anything like useful with it. NLP was all like lemmatization and stop words and topic models and bigrams and SPMs. And it was really academic and not practical. But I mean, to be honest, I've been thinking about this crazy idea for nearly 30 years since I had done cognitive science at university, where we talked a lot about the CELS Chinese room experiment. This idea of like, what if there was somebody that could kind of like, knew all of the symbolic manipulations required to answer questions in Chinese, but they didn't speak Chinese and they were kind of inside a room with no other way to talk to the outside world other than taking in slips of paper with Chinese written on them and then they do all their rules and then they pass back a piece of paper with Chinese back. And this room with a person in is actually fantastically good at answering any question you give them written in Chinese. You know, do they understand Chinese? And is this, you know, something that's intelligently working with Chinese? Ever since that time, I'd say the most thought, to me, the most thoughtful and compelling philosophical response is yes. You know, intuitively it feels like no, because that's just because we can't imagine such a large kind of system. But you know, if it looks like a duck and acts like a duck, it's a duck, you know, or to all intents and purposes. And so I always kind of thought, you know, so this is basically a kind of analysis of the limits of text. And I kind of felt like, yeah, if something could ingest enough text and could use the patterns it saw to then generate text in response to text, it could appear to be intelligent, you know. And whether that means it is intelligent or not is a different discussion and not one I find very interesting. Yeah. And then when I came across neural nets when I was about 20, you know, what I learned about the universal approximation theorem and stuff, and I started thinking like, oh, I wonder if like a neural net could ever get big enough and take in enough data to be a Chinese room experiment. You know, with that background and this kind of like interest in transfer learning, you know, I'd been thinking about this thing for kind of 30 years and I thought like, oh, I wonder if we're there yet, you know, because we have a lot of text. Like I can literally download Wikipedia, which is a lot of text. And I thought, you know, how would something learn to kind of answer questions or, you know, respond to text? And I thought, well, what if we used a language model? So language models are already a thing, you know, they were not a popular or well-known thing, but they were a thing. But language models exist to this idea that you could train a model to fill in the gaps. Or actually in those days it wasn't fill in the gaps, it was finish a string. And in fact, Andrej Karpathy did his fantastic RNN demonstration from this at a similar time where he showed like you can have it ingest Shakespeare and it will generate something that looks a bit like Shakespeare. I thought, okay, so if I do this at a much bigger scale, using all of Wikipedia, what would it need to be able to do to finish a sentence in Wikipedia effectively, to do it quite accurately quite often? I thought, geez, it would actually have to know a lot about the world, you know, it'd have to know that there is a world and that there are objects and that objects relate to each other through time and cause each other to react in ways and that causes proceed effects and that, you know, when there are animals and there are people and that people can be in certain positions during certain timeframes and then you could, you know, all that together, you can then finish a sentence like this was signed into law in 2016 by US President X and it would fill in the gap, you know. So that's why I tried to create what in those days was considered a big language model trained on the entirety on Wikipedia, which is that was, you know, a bit unheard of. And my interest was not in, you know, just having a language model. My interest was in like, what latent capabilities would such a system have that would allow it to finish those kind of sentences? Because I was pretty sure, based on our work with transfer learning and vision, that I could then suck out those latent capabilities by transfer learning, you know, by fine-tuning it on a task data set or whatever. So we generated this three-step system. So step one was train a language model on a big corpus. Step two was fine-tune a language model on a more curated corpus. And step three was further fine-tune that model on a task. And of course, that's what everybody still does today, right? That's what ChatGPT is. And so the first time I tried it within hours, I had a new state-of-the-art academic result on IMDB. And I was like, holy s**t, it does work. And so you asked, to what degree was this kind of like pushing against the established wisdom? You know, every way. Like the reason it took me so long to try it was because I asked all my friends in NLP if this could work. And everybody said, no, it definitely won't work. It wasn't like, oh, maybe. Everybody was like, it definitely won't work. NLP is much more complicated than vision. Language is a much more vastly complicated domain. You know, and you've got problems like the grounding problem. We know from like philosophy and theory of mind that it's actually impossible for it to work. So yeah, so don't waste your time. [00:15:10]Alessio: Jeremy, had people not tried because it was like too complicated to actually get the data and like set up the training? Or like, were people just lazy and kind of like, hey, this is just not going to work? [00:15:20]Jeremy: No, everybody wasn't lazy. So like, so the person I thought at that time who, you know, there were two people I thought at that time, actually, who were the strongest at language models were Stephen Merity and Alec Radford. And at the time I didn't know Alec, but I, after we had both, after I'd released ULM Fit and he had released GPT, I organized a chat for both of us with Kate Metz in the New York Times. And Kate Metz answered, sorry, and Alec answered this question for Kate. And Kate was like, so how did, you know, GPT come about? And he said, well, I was pretty sure that pre-training on a general large corpus wouldn't work. So I hadn't tried it. And then I read ULM Fit and turns out it did work. And so I did it, you know, bigger and it worked even better. And similar with, with Stephen, you know, I asked Stephen Merity, like, why don't we just find, you know, take your AWD-ASTLM and like train it on all of Wikipedia and fine tune it? And he's kind of like, well, I don't think that's going to really lie. Like two years before I did a very popular talk at KDD, the conference where everybody in NLP was in the audience. I recognized half the faces, you know, and I told them all this, I'm sure transfer learning is the key. I'm sure ImageNet, you know, is going to be an NLP thing as well. And, you know, everybody was interested and people asked me questions afterwards and, but not just, yeah, nobody followed up because everybody knew that it didn't work. I mean, even like, so we were scooped a little bit by Dai and Lee, Kwok Lee at Google. They had, they had, I already, I didn't even realize this, which is a bit embarrassing. They had already done a large language model and fine tuned it. But again, they didn't create a general purpose, large language model on a general purpose corpus. They only ever tested a domain specific corpus. And I haven't spoken to Kwok actually about that, but I assume that the reason was the same. It probably just didn't occur to them that the general approach could work. So maybe it was that kind of 30 years of mulling over the, the cell Chinese room experiment that had convinced me that it probably would work. I don't know. Yeah. [00:17:48]Alessio: Interesting. I just dug up Alec announcement tweet from 2018. He said, inspired by Cobe, Elmo, and Yola, I'm fit. We should have a single transformer language model can be fine tuned to a wide variety. It's interesting because, you know, today people think of AI as the leader, kind of kind of like the research lab pushing forward the field. What was that at the time? You know, like kind of like going back five years, people think of it as an overnight success, but obviously it took a while. [00:18:16]Swyx: Yeah. Yeah. [00:18:17]Jeremy: No, I mean, absolutely. And I'll say like, you know, it's interesting that it mentioned Elmo because in some ways that was kind of diametrically opposed to, to ULM fit. You know, there was these kind of like, so there was a lot of, there was a lot of activity at the same time as ULM fits released. So there was, um, so before it, as Brian McCann, I think at Salesforce had come out with this neat model that did a kind of multitask learning, but again, they didn't create a general fine tune language model first. There was Elmo, um, which I think was a lip, you know, actually quite a few months after the first ULM fit example, I think. Um, but yeah, there was a bit of this stuff going on. And the problem was everybody was doing, and particularly after GPT came out, then everybody wanted to focus on zero shot and few shot learning. You know, everybody hated fine tuning. Everybody hated transfer learning. And like, I literally did tours trying to get people to start doing transfer learning and people, you know, nobody was interested, particularly after GPT showed such good results with zero shot and few shot learning. And so I actually feel like we kind of went backwards for years and, and not to be honest, I mean, I'm a bit sad about this now, but I kind of got so disappointed and dissuaded by like, it felt like these bigger lab, much bigger labs, you know, like fast AI had only ever been just me and Rachel were getting all of this attention for an approach I thought was the wrong way to do it. You know, I was convinced was the wrong way to do it. And so, yeah, for years people were really focused on getting better at zero shot and few shots and it wasn't until, you know, this key idea of like, well, let's take the ULM fit approach, but for step two, rather than fine tuning on a kind of a domain corpus, let's fine tune on an instruction corpus. And then in step three, rather than fine tuning on a reasonably specific task classification, let's fine tune on a, on a RLHF task classification. And so that was really, that was really key, you know, so I was kind of like out of the NLP field for a few years there because yeah, it just felt like, I don't know, pushing uphill against this vast tide, which I was convinced was not the right direction, but who's going to listen to me, you know, cause I, as you said, I don't have a PhD, not at a university, or at least I wasn't then. I don't have a big set of computers to fine tune huge transformer models. So yeah, it was definitely difficult. It's always been hard. You know, it's always been hard. Like I've always been somebody who does not want to build stuff on lots of big computers because most people don't have lots of big computers and I hate creating stuff that most people can't use, you know, and also stuff that's created on lots of big computers has always been like much more media friendly. So like, it might seem like a recent thing, but actually throughout my 30 years in data science, the attention's always been on, you know, the big iron results. So when I first started, everybody was talking about data warehouses and it was all about Teradata and it'd be like, oh, this big bank has this huge room full of computers and they have like terabytes of data available, you know, at the press of a button. And yeah, that's always what people want to talk about, what people want to write about. And then of course, students coming out of their PhDs and stuff, that's where they want to go work because that's where they read about. And to me, it's a huge distraction, you know, because like I say, most people don't have unlimited compute and I want to help most people, not the small subset of the most well-off people. [00:22:16]Alessio: That's awesome. And it's great to hear, you do such a great job educating that a lot of times you're not telling your own story, you know? So I love this conversation. And the other thing before we jump into Fast.AI, actually, a lot of people that I know, they run across a new architecture and whatnot, they're like, I got to start a company and raise a bunch of money and do all of this stuff. And say, you were like, I want everybody to have access to this. Why was that the case for you? Was it because you already had a successful venture in like FastMail and you were more interested in that? What was the reasoning? [00:22:52]Jeremy: It's a really good question. So I guess the answer is yes, that's the reason why. So when I was a teenager, I thought it would be really cool to like have my own company. You know, I didn't know the word startup. I didn't know the word entrepreneur. I didn't know the word VC. And I didn't really know what any of those things were really until after we started Kaggle, to be honest. Even the way it started to what we now call startups. I just thought they were just small businesses. You know, they were just companies. So yeah, so those two companies were FastMail and Optimal Decisions. FastMail was the first kind of synchronized email provider for non-businesses. So something you can get your same email at home, on your laptop, at work, on your phone, whatever. And then Optimal Decisions invented a new approach to insurance pricing. Something called profit-optimized insurance pricing. So I saw both of those companies, you know, after 10 years. And at that point, I had achieved the thing that as a teenager I had wanted to do. You know, it took a lot longer than it should have because I spent way longer in management consulting than I should have because I got caught up in that stupid rat race. But, you know, eventually I got there and I remember my mom saying to me, you must be so proud. You know, because she remembered my dream. She's like, you've done it. And I kind of reflected and I was like, I'm not proud at all. You know, like people quite liked FastMail. You know, it's quite nice to have synchronized email. It probably would have happened anyway. Yeah, I'm certainly not proud that I've helped some insurance companies suck more money out of their customers. Yeah, no, I'm not proud. You know, it's actually, I haven't really helped the world very much. You know, maybe in the insurance case I've made it a little bit worse. I don't know. So, yeah, I was determined to not waste more years of my life doing things, working hard to do things which I could not be reasonably sure would have a lot of value. So, you know, I took some time off. I wasn't sure if I'd ever work again, actually. I didn't particularly want to, because it felt like, yeah, it felt like such a disappointment. And, but, you know, and I didn't need to. I had enough money. Like, I wasn't super rich, but I had enough money. I didn't need to work. And I certainly recognized that amongst the other people I knew who had enough money that they didn't need to work, they all worked ridiculously hard, you know, and constantly put themselves in extremely stressful situations. And I thought, I don't want to be one of those idiots who's tied to, you know, buying a bigger plane than the next guy or whatever. You know, Kaggle came along and I mainly kind of did that just because it was fun and interesting to hang out with interesting people. But, you know, with Fast.ai in particular, you know, Rachel and I had a very explicit, you know, long series of conversations over a long period of time about like, well, how can we be the most helpful to society as a whole, and particularly to those people who maybe need more help, you know? And so we definitely saw the world going in a potentially pretty dystopian direction if the world's most powerful technology was controlled by a small group of elites. So we thought, yeah, we should focus on trying to help that not happen. You know, sadly, it looks like it still is likely to happen. But I mean, I feel like we've helped make it a little bit less likely. So we've done our bit. [00:26:39]Swyx: You've shown that it's possible. And I think your constant advocacy, your courses, your research that you publish, you know, just the other day you published a finding on, you know, learning that I think is still something that people are still talking about quite a lot. I think that that is the origin story of a lot of people who are going to be, you know, little Jeremy Howards, furthering your mission with, you know, you don't have to do everything by yourself is what I'm saying. No, definitely. Definitely. [00:27:10]Jeremy: You know, that was a big takeaway from like, analytic was analytic. It definitely felt like we had to do everything ourselves. And I kind of, I wanted to solve medicine. I'll say, yeah, okay, solving medicine is actually quite difficult. And I can't do it on my own. And there's a lot of other things I'd like to solve, and I can't do those either. So that was definitely the other piece was like, yeah, you know, can we create an army of passionate domain experts who can change their little part of the world? And that's definitely happened. Like I find nowadays, at least half the time, probably quite a bit more that I get in contact with somebody who's done really interesting work in some domain. Most of the time I'd say, they say, yeah, I got my start with fast.ai. So it's definitely, I can see that. And I also know from talking to folks at places like Amazon and Adobe and stuff, which, you know, there's lots of alumni there. And they say, oh my God, I got here. And like half of the people are fast.ai alumni. So it's fantastic. [00:28:13]Swyx: Yeah. [00:28:14]Jeremy: Actually, Andre Kapathy grabbed me when I saw him at NeurIPS a few years ago. And he was like, I have to tell you, thanks for the fast.ai courses. When people come to Tesla and they need to know more about deep learning, we always send them to your course. And the OpenAI Scholars Program was doing the same thing. So it's kind of like, yeah, it's had a surprising impact, you know, that's just one of like three things we do is the course, you know. [00:28:40]Swyx: Yes. [00:28:40]Jeremy: And it's only ever been at most two people, either me and Rachel or me and Sylvia nowadays, it's just me. So yeah, I think it shows you don't necessarily need a huge amount of money and a huge team of people to make an impact. [00:28:56]Swyx: Yeah. So just to reintroduce fast.ai for people who may not have dived into it much, there is the courses that you do. There is the library that is very well loved. And I kind of think of it as a nicer layer on top of PyTorch that people should start with by default and use it as the basis for a lot of your courses. And then you have like NBDev, which I don't know, is that the third one? [00:29:27]Jeremy: Oh, so the three areas were research, software, and courses. [00:29:32]Swyx: Oh, sorry. [00:29:32]Jeremy: So then in software, you know, fast.ai is the main thing, but NBDev is not far behind. But then there's also things like FastCore, GHAPI, I mean, dozens of open source projects that I've created and some of them have been pretty popular and some of them are still a little bit hidden, actually. Some of them I should try to do a better job of telling people about. [00:30:01]Swyx: What are you thinking about? Yeah, what's on the course of my way? Oh, I don't know, just like little things. [00:30:04]Jeremy: Like, for example, for working with EC2 and AWS, I created a FastEC2 library, which I think is like way more convenient and nice to use than anything else out there. And it's literally got a whole autocomplete, dynamic autocomplete that works both on the command line and in notebooks that'll like auto-complete your instance names and everything like that. You know, just little things like that. I try to make like, when I work with some domain, I try to make it like, I want to make it as enjoyable as possible for me to do that. So I always try to kind of like, like with GHAPI, for example, I think that GitHub API is incredibly powerful, but I didn't find it good to work with because I didn't particularly like the libraries that are out there. So like GHAPI, like FastEC2, it like autocompletes both at the command line or in a notebook or whatever, like literally the entire GitHub API. The entire thing is like, I think it's like less than 100K of code because it actually, as far as I know, the only one that grabs it directly from the official open API spec that GitHub produces. And like if you're in GitHub and you just type an API, you know, autocomplete API method and hit enter, it prints out the docs with brief docs and then gives you a link to the actual documentation page. You know, GitHub Actions, I can write now in Python, which is just so much easier than writing them in TypeScript and stuff. So, you know, just little things like that. [00:31:40]Swyx: I think that's an approach which more developers took to publish some of their work along the way. You described the third arm of FastAI as research. It's not something I see often. Obviously, you do do some research. And how do you run your research? What are your research interests? [00:31:59]Jeremy: Yeah, so research is what I spend the vast majority of my time on. And the artifacts that come out of that are largely software and courses. You know, so to me, the main artifact shouldn't be papers because papers are things read by a small exclusive group of people. You know, to me, the main artifacts should be like something teaching people, here's how to use this insight and here's software you can use that builds it in. So I think I've only ever done three first-person papers in my life, you know, and none of those are ones I wanted to do. You know, they were all ones that, like, so one was ULM Fit, where Sebastian Ruder reached out to me after seeing the course and said, like, you have to publish this as a paper, you know. And he said, I'll write it. He said, I want to write it because if I do, I can put it on my PhD and that would be great. And it's like, okay, well, I want to help you with your PhD. And that sounds great. So like, you know, one was the masks paper, which just had to exist and nobody else was writing it. And then the third was the Fast.ai library paper, which again, somebody reached out and said, please, please write this. We will waive the fee for the journal and everything and actually help you get it through publishing and stuff. So yeah, so I don't, other than that, I've never written a first author paper. So the research is like, well, so for example, you know, Dawn Bench was a competition, which Stanford ran a few years ago. It was kind of the first big competition of like, who can train neural nets the fastest rather than the most accurate. And specifically it was who can train ImageNet the fastest. And again, this was like one of these things where it was created by necessity. So Google had just released their TPUs. And so I heard from my friends at Google that they had put together this big team to smash Dawn Bench so that they could prove to people that they had to use Google Cloud and use their TPUs and show how good their TPUs were. And we kind of thought, oh s**t, this would be a disaster if they do that, because then everybody's going to be like, oh, deep learning is not accessible. [00:34:20]Swyx: You know, to actually be good at it, [00:34:21]Jeremy: you have to be Google and you have to use special silicon. And so, you know, we only found out about this 10 days before the competition finished. But, you know, we basically got together an emergency bunch of our students and Rachel and I and sat for the next 10 days and just tried to crunch through and try to use all of our best ideas that had come from our research. And so particularly progressive resizing, just basically train mainly on small things, train on non-square things, you know, stuff like that. And so, yeah, we ended up winning, thank God. And so, you know, we turned it around from being like, like, oh s**t, you know, this is going to show that you have to be Google and have TPUs to being like, oh my God, even the little guy can do deep learning. So that's an example of the kind of like research artifacts we do. And yeah, so all of my research is always, how do we do more with less, you know? So how do we get better results with less data, with less compute, with less complexity, with less education, you know, stuff like that. So ULM fits obviously a good example of that. [00:35:37]Swyx: And most recently you published, can LLMs learn from a single example? Maybe could you tell the story a little bit behind that? And maybe that goes a little bit too far into the learning of very low resource, the literature. [00:35:52]Jeremy: Yeah, yeah. So me and my friend, Jono Whittaker, basically had been playing around with this fun Kaggle competition, which is actually still running as we speak, which is, can you create a model which can answer multiple choice questions about anything that's in Wikipedia? And the thing that makes it interesting is that your model has to run on Kaggle within nine hours. And Kaggle's very, very limited. So you've only got 14 gig RAM, only two CPUs, and a small, very old GPU. So this is cool, you know, if you can do well at this, then this is a good example of like, oh, you can do more with less. So yeah, Jono and I were playing around with fine tuning, of course, transfer learning, pre-trained language models. And we saw this, like, so we always, you know, plot our losses as we go. So here's another thing we created. Actually, Sylvain Guuger, when he worked with us, created called fast progress, which is kind of like TQEDM, but we think a lot better. So we look at our fast progress curves, and they kind of go down, down, down, down, down, down, down, a little bit, little bit, little bit. And then suddenly go clunk, and they drop. And then down, down, down, down, down a little bit, and then suddenly clunk, they drop. We're like, what the hell? These clunks are occurring at the end of each epoch. So normally in deep learning, this would be, this is, you know, I've seen this before. It's always been a bug. It's always turned out that like, oh, we accidentally forgot to turn on eval mode during the validation set. So I was actually learning then, or, oh, we accidentally were calculating moving average statistics throughout the epoch. So, you know, so it's recently moving average or whatever. And so we were using Hugging Face Trainer. So, you know, I did not give my friends at Hugging Face the benefit of the doubt. I thought, oh, they've fucked up Hugging Face Trainer, you know, idiots. Well, you'll use the Fast AI Trainer instead. So we switched over to Learner. We still saw the clunks and, you know, that's, yeah, it shouldn't really happen because semantically speaking in the epoch, isn't like, it's not a thing, you know, like nothing happens. Well, nothing's meant to happen when you go from ending one epoch to starting the next one. So there shouldn't be a clunk, you know. So I kind of asked around on the open source discords. That's like, what's going on here? And everybody was just like, oh, that's just what, that's just what these training curves look like. Those all look like that. Don't worry about it. And I was like, oh, are you all using Trainer? Yes. Oh, well, there must be some bug with Trainer. And I was like, well, we also saw it in Learner [00:38:42]Swyx: and somebody else is like, [00:38:42]Jeremy: no, we've got our own Trainer. We get it as well. They're just like, don't worry about it. It's just something we see. It's just normal. [00:38:48]Swyx: I can't do that. [00:38:49]Jeremy: I can't just be like, here's something that's like in the previous 30 years of neural networks, nobody ever saw it. And now suddenly we see it. [00:38:57]Swyx: So don't worry about it. [00:38:59]Jeremy: I just, I have to know why. [00:39:01]Swyx: Can I clarify? This is, was everyone that you're talking to, were they all seeing it for the same dataset or in different datasets? [00:39:08]Jeremy: Different datasets, different Trainers. They're just like, no, this is just, this is just what it looks like when you fine tune language models. Don't worry about it. You know, I hadn't seen it before, but I'd been kind of like, as I say, I, you know, I kept working on them for a couple of years after ULM fit. And then I kind of moved on to other things, partly out of frustration. So I hadn't been fine tuning, you know, I mean, Lama's only been out for a few months, right? But I wasn't one of those people who jumped straight into it, you know? So I was relatively new to the kind of Lama fine tuning world, where else these guys had been, you know, doing it since day one. [00:39:49]Swyx: It was only a few months ago, [00:39:51]Jeremy: but it's still quite a bit of time. So, so yeah, they're just like, no, this is all what we see. [00:39:56]Swyx: Don't worry about it. [00:39:56]Jeremy: So yeah, I, I've got a very kind of like, I don't know, I've just got this brain where I have to know why things are. And so I kind of, I ask people like, well, why, why do you think it's happening? And they'd be like, oh, it would pretty obviously, cause it's like memorize the data set. It's just like, that can't be right. It's only seen it once. Like, look at this, the loss has dropped by 0.3, 0.3, which is like, basically it knows the answer. And like, no, no, it's just, it is, it's just memorize the data set. So yeah. So look, Jono and I did not discover this and Jono and I did not come up with a hypothesis. You know, I guess we were just the ones, I guess, who had been around for long enough to recognize that like, this, this isn't how it's meant to work. And so we, we, you know, and so we went back and like, okay, let's just run some experiments, you know, cause nobody seems to have actually published anything about this. [00:40:51]Well, not quite true.Some people had published things, but nobody ever actually stepped back and said like, what the hell, you know, how can this be possible? Is it possible? Is this what's happening? And so, yeah, we created a bunch of experiments where we basically predicted ahead of time. It's like, okay, if this hypothesis is correct, that it's memorized in the training set, then we ought to see blah, under conditions, blah, but not under these conditions. And so we ran a bunch of experiments and all of them supported the hypothesis that it was memorizing the data set in a single thing at once. And it's a pretty big data set, you know, which in hindsight, it's not totally surprising because the theory, remember, of the ULMFiT theory was like, well, it's kind of creating all these latent capabilities to make it easier for it to predict the next token. So if it's got all this kind of latent capability, it ought to also be really good at compressing new tokens because it can immediately recognize it as like, oh, that's just a version of this. So it's not so crazy, you know, but it is, it requires us to rethink everything because like, and nobody knows like, okay, so how do we fine tune these things? Because like, it doesn't even matter. Like maybe it's fine. Like maybe it's fine that it's memorized the data set after one go and you do a second go and okay, the validation loss is terrible because it's now really overconfident. [00:42:20]Swyx: That's fine. [00:42:22]Jeremy: Don't, you know, don't, I keep telling people, don't track validation loss, track validation accuracy because at least that will still be useful. Just another thing that's got lost since ULMFiT, nobody tracks accuracy of language models anymore. But you know, it'll still keep learning and it does, it does keep improving. But is it worse? You know, like, is it like, now that it's kind of memorized it, it's probably getting a less strong signal, you know, I don't know. So I still don't know how to fine tune language models properly and I haven't found anybody who feels like they do, like nobody really knows whether this memorization thing is, it's probably a feature in some ways. It's probably some things that you can do usefully with it. It's probably, yeah, I have a feeling it's messing up training dynamics as well. [00:43:13]Swyx: And does it come at the cost of catastrophic forgetting as well, right? Like, which is the other side of the coin. [00:43:18]Jeremy: It does to some extent, like we know it does, like look at Code Llama, for example. So Code Llama was a, I think it was like a 500 billion token fine tuning of Llama 2 using code. And also pros about code that Meta did. And honestly, they kind of blew it because Code Llama is good at coding, but it's bad at everything else, you know, and it used to be good. Yeah, I was pretty sure it was like, before they released it, me and lots of people in the open source discords were like, oh my God, you know, we know this is coming, Jan Lukinsk saying it's coming. I hope they kept at least like 50% non-code data because otherwise it's going to forget everything else. And they didn't, only like 0.3% of their epochs were non-code data. So it did, it forgot everything else. So now it's good at code and it's bad at everything else. So we definitely have catastrophic forgetting. It's fixable, just somebody has to do, you know, somebody has to spend their time training a model on a good mix of data. Like, so, okay, so here's the thing. Even though I originally created three-step approach that everybody now does, my view is it's actually wrong and we shouldn't use it. [00:44:36]Jeremy: And that's because people are using it in a way different to why I created it. You know, I created it thinking the task-specific models would be more specific. You know, it's like, oh, this is like a sentiment classifier as an example of a task, you know, but the tasks now are like a, you know, RLHF, which is basically like answer questions that make people feel happy about your answer. So that's a much more general task and it's a really cool approach. And so we see, for example, RLHF also breaks models like, you know, like GPT-4, RLHDEFT, we know from kind of the work that Microsoft did, you know, the pre, the earlier, less aligned version was better. And these are all kind of examples of catastrophic forgetting. And so to me, the right way to do this is to fine-tune language models, is to actually throw away the idea of fine-tuning. There's no such thing. There's only continued pre-training. And pre-training is something where from the very start, you try to include all the kinds of data that you care about, all the kinds of problems that you care about, instructions, exercises, code, general purpose document completion, whatever. And then as you train, you gradually curate that, you know, you gradually make that higher and higher quality and more and more specific to the kinds of tasks you want it to do. But you never throw away any data. You always keep all of the data types there in reasonably high quantities. You know, maybe the quality filter, you stop training on low quality data, because that's probably fine to forget how to write badly, maybe. So yeah, that's now my view, is I think ULM fit is the wrong approach. And that's why we're seeing a lot of these, you know, so-called alignment tacks and this view of like, oh, a model can't both code and do other things. And, you know, I think it's actually because people are training them wrong. [00:46:47]Swyx: Yeah, well, I think you have a clear [00:46:51]Alessio: anti-laziness approach. I think other people are not as good hearted, you know, they're like, [00:46:57]Swyx: hey, they told me this thing works. [00:46:59]Alessio: And if I release a model this way, people will appreciate it, I'll get promoted and I'll kind of make more money. [00:47:06]Jeremy: Yeah, and it's not just money. It's like, this is how citations work most badly, you know, so if you want to get cited, you need to write a paper that people in your field recognize as an advancement on things that we know are good. And so we've seen this happen again and again. So like I say, like zero shot and few shot learning, everybody was writing about that. Or, you know, with image generation, everybody just was writing about GANs, you know, and I was trying to say like, no, GANs are not the right approach. You know, and I showed again through research that we demonstrated in our videos that you can do better than GANs, much faster and with much less data. And nobody cared because again, like if you want to get published, you write a GAN paper that slightly improves this part of GANs and this tiny field, you'll get published, you know. So it's, yeah, it's not set up for real innovation. It's, you know, again, it's really helpful for me, you know, I have my own research lab with nobody telling me what to do and I don't even publish. So it doesn't matter if I get citations. And so I just write what I think actually matters. I wish there was, and, you know, and actually places like OpenAI, you know, the researchers there can do that as well. It's a shame, you know, I wish there was more academic, open venues in which people can focus on like genuine innovation. [00:48:38]Swyx: Twitter, which is unironically has become a little bit of that forum. I wanted to follow up on one thing that you mentioned, which is that you checked around the open source discords. I don't know if it's too, I don't know if it's a pusher to ask like what discords are lively or useful right now. I think that something I definitely felt like I missed out on was the early days of Luther AI, which is a very hard bit. And, you know, like what is the new Luther? And you actually shouted out the alignment lab AI discord in your blog post. And that was the first time I even knew, like I saw them on Twitter, never knew they had a discord, never knew that there was actually substantive discussions going on in there and that you were an active member of it. Okay, yeah. [00:49:23]Jeremy: And then even then, if you do know about that and you go there, it'll look like it's totally dead. And that's because unfortunately, nearly all the discords, nearly all of the conversation happens in private channels. You know, and that's, I guess. [00:49:35]Swyx: How does someone get into that world? Because it's obviously very, very instructive, right? [00:49:42]Jeremy: You could just come to the first AI discord, which I'll be honest with you, it's less bustling than some of the others, but it's not terrible. And so like, at least, to be fair, one of Emma's bustling channels is private. [00:49:57]Swyx: I guess. [00:49:59]Jeremy: So I'm just thinking. [00:50:01]Swyx: It's just the nature of quality discussion, right? Yeah, I guess when I think about it, [00:50:05]Jeremy: I didn't have any private discussions on our discord for years, but there was a lot of people who came in with like, oh, I just had this amazing idea for AGI. If you just thought about like, if you imagine that AI is a brain, then we, you know, this just, I don't want to talk about it. You know, I don't want to like, you don't want to be dismissive or whatever. And it's like, oh, well, that's an interesting comment, but maybe you should like, try training some models first to see if that aligns with your intuition. Like, oh, but how could I possibly learn? It's like, well, we have a course, just actually spend time learning. Like, you know, anyway. And there's like, okay, I know the people who always have good answers there. And so I created a private channel and put them all in it. And I got to admit, that's where I post more often because there's much less, you know, flight of fancy views about how we could solve AGI, blah, blah, blah. So there is a bit of that. But having said that, like, I think the bar is pretty low. Like if you join a Discord and you can hit the like participants or community or whatever button, you can see who's in it. And then you'll see at the top, who the admins or moderators or people in the dev role are. And just DM one of them and say like, oh, here's my GitHub. Well, here's some blog posts I wrote. You know, I'm interested in talking about this, you know, can I join the private channels? And I've never heard of anybody saying no. I will say, you know, Alutha's all pretty open. So you can do the Alutha Discord still. You know, one problem with the Alutha Discord is it's been going on for so long that it's like, it's very inside baseball. It's quite hard to get started. Yeah. Carpa AI looks, I think it's all open. That's just less stability. That's more accessible. [00:52:03]Swyx: Yeah. [00:52:04]Jeremy: There's also just recently, now it's research that does like the Hermes models and data set just opened. They've got some private channels, but it's pretty open, I think. You mentioned Alignment Lab, that one it's all the interesting stuff is on private channels. So just ask. If you know me, ask me, cause I've got admin on that one. There's also, yeah, OS Skunkworks, OS Skunkworks AI is a good Discord, which I think it's open. So yeah, they're all pretty good. [00:52:40]Swyx: I don't want you to leak any, you know, Discords that don't want any publicity, but this is all helpful. [00:52:46]Jeremy: We all want people, like we all want people. [00:52:49]Swyx: We just want people who like, [00:52:51]Jeremy: want to build stuff, rather than people who, and like, it's fine to not know anything as well, but if you don't know anything, but you want to tell everybody else what to do and how to do it, that's annoying. If you don't know anything and want to be told like, here's a really small kind of task that as somebody who doesn't know anything is going to take you a really long time to do, but it would still be helpful. Then, and then you go and do it. That would be great. The truth is, yeah, [00:53:19]Swyx: like, I don't know, [00:53:20]Jeremy: maybe 5% of people who come in with great enthusiasm and saying that they want to learn and they'll do anything. [00:53:25]Swyx: And then somebody says like, [00:53:25]Jeremy: okay, here's some work you can do. Almost nobody does that work. So if you're somebody who actually does the work and follows up, you will massively stand out. That's an extreme rarity. And everybody will then want to help you do more work. [00:53:41]Swyx: So yeah. [00:53:41]Jeremy: So just, yeah, just do work and people will want to support you. [00:53:47]Alessio: Our Discord used to be referral only for a long time. We didn't have a public invite and then we opened it and they're kind of like channel gating. Yeah. A lot of people just want to do, I remember it used to be like, you know, a forum moderator. [00:54:00]Swyx: It's like people just want to do [00:54:01]Alessio: like drive-by posting, [00:54:03]Swyx: you know, and like, [00:54:03]Alessio: they don't want to help the community. They just want to get their question answered. [00:54:07]Jeremy: I mean, the funny thing is our forum community does not have any of that garbage. You know, there's something specific about the low latency thing where people like expect an instant answer. And yeah, we're all somehow in a forum thread where they know it's like there forever. People are a bit more thoughtful, but then the forums are less active than they used to be because Discord has got more popular, you know? So it's all a bit of a compromise, you know, running a healthy community is, yeah, it's always a bit of a challenge. All right, we got so many more things [00:54:47]Alessio: we want to dive in, but I don't want to keep you here for hours. [00:54:50]Swyx: This is not the Lex Friedman podcast [00:54:52]Alessio: we always like to say. One topic I would love to maybe chat a bit about is Mojo, modular, you know, CrystalLiner, not many of you on the podcast. So we want to spend a little time there. You recently did a hacker's guide to language models and you ran through everything from quantized model to like smaller models, larger models, and all of that. But obviously modular is taking its own approach. Yeah, what got you excited? I know you and Chris have been talking about this for like years and a lot of the ideas you had, so. [00:55:23]Jeremy: Yeah, yeah, yeah, yeah, no, absolutely. So I met Chris, I think it was at the first TensorFlow Dev Summit. And I don't think he had even like, I'm not sure if he'd even officially started his employment with Google at that point. So I don't know, you know, certainly nothing had been mentioned. So I, you know, I admired him from afar with LLVM and Swift and whatever. And so I saw him walk into the courtyard at Google. It's just like, oh s**t, man, that's Chris Latner. I wonder if he would lower his standards enough to talk to me. Well, worth a try. So I caught up my courage because like nobody was talking to him. He looked a bit lost and I wandered over and it's like, oh, you're Chris Latner, right? It's like, what are you doing here? What are you doing here? And I was like, yeah, yeah, yeah. It's like, oh, I'm Jeremy Howard. It's like, oh, do you do some of this AI stuff? And I was like, yeah, yeah, I like this AI stuff. Are you doing AI stuff? It's like, well, I'm thinking about starting to do some AI stuff. Yeah, I think it's going to be cool. And it's like, wow. So like, I spent the next half hour just basically brain dumping all the ways in which AI was stupid to him. And he listened patiently. And I thought he probably wasn't even remember or care or whatever. But yeah, then I kind of like, I guess I re-caught up with him a few months later. And it's like, I've been thinking about everything you said in that conversation. And he like narrated back his response to every part of it, projects he was planning to do. And it's just like, oh, this dude follows up. Holy s**t. And I was like, wow, okay. And he was like, yeah, so we're going to create this new thing called Swift for TensorFlow. And it's going to be like, it's going to be a compiler with auto differentiation built in. And blah, blah, blah. And I was like, why would that help? [00:57:10]Swyx: You know, why would you? [00:57:10]Jeremy: And he was like, okay, with a compiler during the forward pass, you don't have to worry about saving context, you know, because a lot will be optimized in the backward. But I was like, oh my God. Because I didn't really know much about compilers. You know, I spent enough to kind of like, understand the ideas, but it hadn't occurred to me that a compiler basically solves a lot of the problems we have as end users. I was like, wow, that's amazing. Okay, you do know, right, that nobody's going to use this unless it's like usable. It's like, yeah, I know, right. So I was thinking you should create like a fast AI for this. So, okay, but I don't even know Swift. And he was like, well, why don't you start learning it? And if you have any questions, ask me. It's just like, holy s**t. Like, not only has Chris Latner lowered his standards enough to talk to me, but he's offering me personal tutoring on the programming language that he made. So I was just like, I'm not g

covid-19 christmas god university amazon community new york city ai australia google state new york times phd chinese development microsoft dm holy current language 3d chatgpt poor tesla created lesson melbourne discord trainers stanford shakespeare communities honestly exciting wikipedia focused tom cruise cto ram vc swift react nlp academia transformers rapid openai salesforce sf residence imdb adobe faced api mixing mckinsey replacing luther mojo python gpt aws lama github llama hermes elmo 2b copilot llm dai learner gpu showed 3b agi elo google cloud sql ragnar jono lasagna rag internally codex gpus unclear anthropic large language models 7b ulm fergus gan alessio lms fine tuning triton gans prompting cpus lex fridman rac typescript yola tensorflow cuda rfc kwok datasets sram github actions pytorch kaggle 500b 24m ec2 teradata cobe jeremy howard brian mccann discords imagenet llvm zeiler andrej karpathy chris lattner neurips rachel thomas so google fastmail rlhf spms rnn phy kdd stockfish code llama theano llama2 jeremy you jeremy it enlitic mlir jeremy so practical deep learning tensorflow dev summit

Daniel Zingaro and Leo Porter on learning to program with LLMs

Play Episode Listen Later Sep 20, 2023 60:46

Dr. Daniel Zingaro and Dr. Leo Porter are co-authors of the book Learn AI-Assisted Python Programming. Leo will teach an introductory computer science course this quarter at UCSD using this book. We discuss how tools like GitHub Copilot let people new to programming focus on breaking down problems instead of language syntax. Dr. Zingaro is an Associate Professor of Computer Science at University of Toronto Mississauga and Dr. Porter is an Associate Professor at University of California San Diego. This episode was originally posted on Software Engineering Radio. Topics covered: Making programming more accessible Teaching problem decomposition instead of language syntax The importance of reading and testing untrusted generated code The rise of throwaway or one-off code Concerns about relying on commercial tools Rethinking how to assess students Related Links Learn AI-Assisted Python Programming Leo Porter Daniel Zingaro GitHub Copilot Transcript You can help edit this transcript on GitHub. Note the timestamps and audio for this transcript will not completely match. Intro [00:00:00] Jeremy: Today I'm talking to Dr. Leo Porter. He's an associate teaching professor of computer science at the University of California San Diego, and he co-founded the computing education research laboratory there. I'm also joined by Dr. Daniel Zingaro who is an associate teaching professor of computer science at the University of Toronto. And he's also the author of the book, learn to Code by Solving Problems and the Book, Algorithmic Thinking. They are co-authors of the book, learn AI Assisted Python programming. Leo and Dan, welcome to Software Engineering Radio. [00:00:37] Leo: Thank you for having us, Jeremy. I really appreciate your podcast, so thanks. Great to be here. [00:00:41] Dan: Thanks Jeremy. Writing a book for Leo's CS1 class [00:00:43] Jeremy: The first thing we could start with is, is why this book? And, and why now? How did you decide on like, okay, this is the thing we need to do now. [00:00:51] Leo: So, uh, this is Dan. Uh, so Dan, um, like really early when LLMs first kind of were coming out and being seen on the scene for programming, uh, he started playing with them, uh, for programming projects. And I think Dan really quickly realized that they'd had this, a big impact on how we teach programming. so he reached out to me, uh, and said, I really need to give em a try. And, uh, after I played with them for a little while, I had the exact same realization that this is gonna change, uh, how we teach programming, uh, in a pretty dramatic way. So having realized that, having realized that we had to change our, uh, introductory CS1 courses, we knew we needed to do that, but in order to teach that class, we'd have to have a book that we could assign our students that that would go along with the class. And so we knew we had to change the class, but we also knew we had to have a book for it. And given the, the timeline to write books, we started in the book first. Um, and so that's how it got started. LLMs for Syntax, Humans for breaking down problems [00:01:45] Dan: I guess we figured out that our course had to change first, before we knew exactly, um, how it had to change. One thing we, um, learned early on was that the kinds of assignments we give in our introductory courses, they're just solved by, by these tools like ChatGPT and copilot. So, uh, we knew something had to change, and then it is just a matter of figuring out what. And so we spent, um, quite a bit of time with these tools and we started to realize that what's gonna change is the skills that our students need to learn, uh, to be effective using these tools. So like b before these tools, we would spend a lot of time teaching syntax. Um, and students struggle quite a bit with learning syntax, which I mean, it's very, it's, it's very frustrating, right? Cuz you can't even do anything until you get the syntax right? And you're getting all these errors like missing colons and, you know, mismatched braces and stuff like that. Uh, so it's actually good, that, the LLMs are doing the syntax for the students. But you know, just because that skill's, uh, not needed as much, uh, doesn't mean that there aren't still skills for students to learn. So instead of syntax, other things become more important. Uh, so for example, uh, Leo and I, realize that reading code is gonna be extremely important even more so than before. I think if, if that, if that's even possible. Uh, and that's because sometimes you're gonna get back code that just doesn't work. And so we realized that students are gonna need to be able to read, the response that they get to see if the code looks reasonable, or not, right? And then if the code, uh, I is unreasonable, then they need to read more code, uh, and look at other solutions, right, that they get from the, uh, LLM. Uh, there are other, uh, things they can do as well, like messing around with the prompt and so on. But they're gonna need to be able to read code, uh, throughout the process. And then, so we just kind of kept on using these tools and documenting the skills that students are gonna need. And we just kinda realized that all the skills students are gonna need are skills we would want to teach anyway. So like, uh, one more example is testing, right? So, students may now not have, uh, an understanding of every last detail of, you know, the Python language like they would before. And so then that makes testing even more important, right? Than it was they need to verify that the code they're getting is correct. And so they have to be very good at writing test cases. and, and, you know, similar, similar for debugging, we need our students to have strong debugging skills, again, even potentially stronger than before, right? Because if the code isn't working, they need to first determine what the code is doing to be able to fix it. And then I guess one more I'll mention is problem decomposition. And this is a big one. I think this is gonna come up a couple times probably in our talk today, but LLMs struggle when you give them tasks that are too large and students need to know how to break problems down into small components so that, that, LLM can solve each one and, you know, have a good chance of getting it right. [00:04:56] Leo: Yeah, I, I think, um, kind of to, to piggyback off of that, you, you may be hearing these skills and saying, oh, these are absolutely essential skills. Every software engineer should know, uh, these are being taught right now. Right? Um, and the answer is not really, like these aren't core topics in a lot of introductory CS classes because so much time is spent on syntax. And so fairly early on when we kind of realized these skills would be so essential, Uh, we got really excited because these are skills we want to teach in our classes, and the LLMs are now giving us the ability to do that more. [00:05:27] Dan: Mm-hmm. [00:05:28] Jeremy: I think that's interesting about the syntax comment because you were saying how reading is gonna be more important than ever because you have LLM generating the code. Um, and you need to understand that code that's being generated and understand that it does what it, uh, you think it does. And so I wonder if when you say you spend less time on syntax, is it because you feel like they're gonna generate this code and they're sort of organically gonna pick up syntax that way versus having to focus on it at the start? I'm just trying to picture what you see changing there. [00:06:05] Dan: Yeah, Jeremy. So, uh, I, I was, I guess speaking specifically about syntax errors, which don't generally happen when you're using LLMs, and I also agree with you, you need to know what the code is doing, but, um, you can do that without worrying about each specific piece of syntax. Like, um, you're gonna need to know what the keywords do for sure, but, missing, you know, brackets and colons and, uh, oh, there needs to be like a blank line here. indentation, uh, a lot of this kind of thing. Is done for the most part, correctly by the LLMs. So yeah, I agree with you. You need to be able to identify the structures. So in our, in our book actually, Leo and I have, um, a couple of chapters on reading code and, I don't think we ever break breakdown, a line of code into its individual tokens. We do talk about the main structures, like ifs and loops and functions and all that. but compared to other books, I, I think or other, uh, other ways of teaching where you would focus on the micro level, we try to focus on the line level now, cuz we want our students to be able to grasp what each line is doing, I guess more than each token. [00:07:27] Leo: Yeah, maybe to, to add to that a bit, it's almost, uh, if you think about the advent of block-based languages, it was to make sure that the, essentially the, the author can't make syntax mistakes, right? Is the whole purpose of kind of block-based languages. And they're, they're huge for introductory programming, especially in like K through 12. in a sense, LLMs do this because they'd never give you back wrong syntax, or they almost, almost never give you back wrong syntax. And so it takes away that kind of cognitive burden of making sure you handle the, the token level. as uh Dan was saying LLM generated code needs test cases to catch logical errors [00:08:00] Jeremy: I, I'm curious, so you said the syntax is correct, but what are the, the typical mistakes you see coming back from these LLMs? Is it a, a logical mistake or is it ever something that. Actually doesn't compile. I'm, I'm kind of curious what your experience has been. [00:08:19] Leo: I think the, uh, more common errors that we've been seeing are logical. So it misinterprets the prompt that you're giving it. It essentially tries to solve a problem that's different than what you're trying to solve. It may have bugs in it, so it is in fact trying to solve the right problem, but it, it's off by one, um, is maybe replicating some mistake that it found in, in the large code base. And so most mistakes are gonna be you need to write test cases, run it. That mistake is then gonna show up when the test cases catch it, and then you'll have to try to fix it. if the students can read the code, uh, if we train them well to read the code, often you'll look at the response. And if the response is just not even trying to solve the right problem, you can usually pick that up pretty quick. Uh, and I think, I think the students will be learn to do that and then they can just say, okay, this is clearly not the right answer. And, and use the different tools in say vscode to find another answer, and then pick one that's right or change their prompt to get a response that's right. Go through that whole flow. But then some point or other it will give an answer that looks right. And then I think all of us as software engineers know that even the code looks right, it may not be. And so then they have to actually write the test cases, get some level of confidence that's actually working right before they'll know. And so sometimes, sometimes, you know, really quick is that it's just clearly wrong at solving the wrong problem. And sometimes it looks right, but it actually has some bugs that need to be fixed. [00:09:49] Dan: I guess one thing that struck me is how much a change in the prompt can, can matter. Uh, Leo, you know, um, we've, we've seen this over and over again where we'll write a prompt. It seems fine to us. And then we'll realize, oh, there are actually two different ways of interpreting this. and, uh, the ambiguity of, of English strikes again, right? And so it's just amazing to me how clarifying the prompts, how many times that fixes the code. Not always. We've definitely have examples where that's not the case, but, um, more, more often than not, in my experience, changing the prompt, uh, appropriately has a bigger than, than, um, anticipated effect on the, on the code. It's amazing. [00:10:36] Leo: And for thinking of the prompt, uh, in terms of like doc strings for functions, uh, adding the test cases certainly help. Um, sometimes it is, surprising sometimes that you can add the test cases to the prompt and it'll still give you back code that does not actually pass that test case because it, vscode and copilot doesn't actually run the code that comes back from the LLM. Uh, but I do find the test cases do tend to help with the quality response you get back. [00:11:01] Jeremy: As a part of your prompt, you're asking it to implement some functionality, and you're also asking it to write these tests for that same functionality? [00:11:11] Leo: Oh no, sorry. I, I, it's more the, um, doc test kind of format. So it, it, um, you're writing, let's say you, you've written your function signature and then you have the description of the function in a doc string. And then at towards the end of the doc string, I'm articulating the test cases that I intend to use. Um, and the articulating the test cases that I intend to use helps it come with a better prompt. Um, I haven't found it to be great at writing test cases. I haven't spent a ton of time with this, but the time that I have spent, it tends to want to do almost like a brute force search of all possible inputs, uh, as opposed to doing, okay, well here's a couple common. Here are the edge cases. Now I can feel fairly good about it. It doesn't seem to have that, um, intuition yet. [00:11:55] Jeremy: [00:11:55] Leo: For the most part, we're writing the test cases our ourselves, and we're gonna be teaching the students how to write the test cases themselves [00:12:01] Dan: Yeah, Yeah. So Leo and I have actually made a conscious decision to have students write test cases from scratch. Even though you could play around with the LLM and have it, you know, try to generate test cases, whether it's flawed or not, we still want students to do this from scratch. We think that writing test cases is a skill we want our students to have. [00:12:23] Jeremy: Sometimes what these models will generate, like you were saying, has logical errors. And hopefully if you're writing the test cases, you've put some thought into 'em, and your test cases are actually checking the correct behavior. So then you have the LLM generate the implementation. It's running against tests where you know what the correct answer should be. And so if it generates something that's incorrect, you've, you've kind of caught it. You're not totally relying on it. Telling you everything is, is good, you know? Um, It's confidence in something that's like you personally can't see. It's just what the machine gave you. [00:13:05] Dan: Maybe it takes away one layer of uncertainty too, Jeremy, right? Like, so the code could be wrong, right? And then if it generates test cases, okay, the test cases could be wrong too. And maybe you get unlucky and two wrongs make a right and then your test cases pass for the wrong reason. So yeah, we really wanna hone this skill in our students. And, and like Leo said earlier, these intro courses used to be so full of low level syntax concerns that we, we didn't do testing properly. I mean, you know, we all try to cover testing, but I think we're gonna be able to cover it a lot more, detailed now. LLMs could encourage students to test more since their output is untrusted [00:13:41] Leo: And I, I think we're enthusiastic about, uh, how students will approach testing when you're working with the LLM is what we. This is fairly anecdotal, but uh, when they interact with us talking about testing, often students aren't testing their code because they wrote it. And so of course it's Right. Right. This is like this really famous, uh, kind of bug in human thinking, right? Is that if you write it, of course the computer's gonna interpret what you're saying, right? Um, and so students tend to trust their code in a way that professional software engineers never would. and I think because it's coming from this third party that you know is wrong, it's coming from the LLM that can, that can often make mistakes. I think they're gonna be more inclined to actually engage in those testing practices. Uh, kind of knowing about the fallibility of the LLM, [00:14:27] Jeremy: You're shifting the order. I mean, there is test driven development that some people practice, but I feel like probably what's most common is you write the implementation yourself and then, then you'll go and see like, oh, did this thing I, I wrote. Did it do what I thought it should do? Um, whereas this is kind of flipping it, where it's the large language model is gonna write my code, so I'm just gonna start with the test and then I'll ask it to, to write me the code. And maybe that will kind of make test driven development be the default. [00:15:02] Leo: So yeah, I, I, I think that students may wanna engage more in kind of test driven development because they wanna think more about, uh, what exactly should this function be doing? Uh, how should behave, what kind of inputs and output should it expect? And then it can kind of write the prompt to co-pilot or whatever LLM is using, uh, to express those inputs and outputs. Well, they're more apt to get good answer from the LLM and they've kind already got their test cases worked out as well, so they can immediately just go right into the testing agency if the prompt came back right. Using LLMs at the function level instead of a broader scope [00:15:35] Jeremy: And you mentioned writing a prompt to implement a specific function. Have you found that they work well at the function level? But if you try to ask it to build something more broad, that that's kind of when it has problems? [00:15:53] Dan: So, I think in general, LLMs do work best at the function level. We have tried to get it to generate bigger apps, collections of functions, and it can work, but sometimes it does, uh, it does do worse. But also we want students to do the problem decomposition for themselves and break up the problem into individual functions. Even though maybe the LLM could work, uh, with, uh, bigger chunks of code, we want students to do it. And one reason is so that they can customize what they get from the LLM. So, in the book, we have a bunch of examples where you could probably just throw it at the LLM and get an answer and, you know, eventually get it to work. But I think at that point, making changes to it might be trickier than it would be if you knew, uh, the architecture of what you were, what you were building. So in the book, we have a bunch of top-down design diagrams, and we want students to understand what they're building at that level, like at the function level instead of, like we said earlier, instead of like at the token level or the line level. Potential issues with outsourcing high level design to an LLM [00:17:03] Jeremy: And so like in this example, you're thinking more from a, a learning perspective. You want the student to look at the big picture, figure out, okay, what are all the different functions or parts of my application? Break that down and then feed those individually. To, um, these large language models. I, I'm wondering from like, let's say you're a, a professional software engineer and your interest is more in I want to make the thing and less so, in I want to learn how to make the thing. in that case, do you feel like you could feel confident in, in giving the large language model a larger piece of the design, or do you still feel like it's good to have that overall structure done by the, the developer and then just be very targeted about how you use the large language model? [00:18:03] Leo: I think that's a tricky question because we haven't worked with these tools heavily in a professional programming setting. I think often when we're thinking about large design of software, you're gonna be working on teams, talking with other members of the team about the interfaces and things like that. And so I'd be pretty hesitant to to outsource that, that thinking to the, the l lm cuz you, the communication between the teams still has to happen. Uh, even if it weren't for that. Um, I kinda think of it as a probabilities. So essentially whenever you ask co copilot or any of these LMS to, to do a task, the more it has to right, get the kind of more likely it's gonna make a mistake. Um, and so, uh, that's kind of why I like the functional level. It seems like I. Partially because it's not that much code that tends to write. Um, so you help to avoid kinda the probabilistic problem, but also because it's learned on a huge code base that has lots and lots of functions that have been implemented. It tends to do well at that, that solving the function kind of task. [00:19:10] Jeremy: Yeah. And I, I think the way you put it as outsourcing that designer, that decision is, is interesting because yeah, if you are working on a team and whether it's in code review or just in a discussion, often people will ask, well, well, why did you do it this way? Or Why, why is this the, you know, the good way to design it? And if you kind of handed that off to an l l m, maybe your answer is, I don't know. It's just what it it told me, which (laughs) [00:19:39] Dan: Yeah. [00:19:42] Leo: That isn't an answer I want to u use talking to my boss. Right. Well the chat GPT told me I should have it this way. That doesn't seem like a good answer. Choosing GitHub Copilot for CS1 [00:19:50] Jeremy: I think we, we've kind of been talking in more a general sense of working with LLMs and you've mentioned how you're gonna be teaching introductory computer science courses this coming, quarter or semester. And so when you teach these classes, what tools are you gonna recommend your students use? And yeah, maybe you could go into that a bit. [00:20:13] Leo: Absolutely. So we're gonna be recommending, um, At least, at least for my class, I'm gonna be recommending that they use, uh, vs code with copilot. Um, I just like the integration of the IDE with the, uh, interactions with the LLM uh, I think it avoids just a whole bunch of copy pasting from another interface into your IDE to then, uh, run it. I think it also reduces the barrier of them kinda immediately getting the code and then testing it right there in the environment. I'm sure any of the other tools would work, it's just, that seems to have worked well for us, uh, when we were writing the book. And that's, that's actually the technique we recommend in the book as well. Um, so that would be the primary tool for the students writing the code. In addition to having them using copilot with, uh, in the IDE for a lot of the code generation, depending on where things are at with copilot x, um, which is right now, um, available through wait list. Uh, if that's, that's available publicly, I think we're gonna be recommending that because it has a copilot chat feature, uh, which can be really nice to interact with. And, uh, the main use that, that we're gonna be encouraging students to use, whether it be co-pilot chat or a ChatGPT is in just a conversation with the LLM about, particularly modules and libraries. So if you are diving into, merging PDFs, which, uh, Dan did a great job in one of the chapters in our book talking about, if you wanna dive into that, well, what libraries should we be using in Python for that. Uh, and we found that the LLMs do a really good job at this, of actually saying, here are the different libraries you could use. Here are the pros and cons of them. These are the ones that, uh, need to be actually have additional install done. Or these ones that come in with, vanilla Python. they're actually really good at kind of giving you the what you should use for the various libraries. Um, and so that's, that's one other way that we were gonna be encouraging the students to use the LLM. Types of questions to ask the LLM [00:22:07] Dan: Yeah. So whenever the students or the junior programmer, doesn't know how or doesn't think they can, uh, do something in base Python, we have them interact with the chat and, and ask. So another example that comes to mind from the book is we have a chapter writing some games. And so for most games, including the two that, uh, we've got in the book, you need to be able to generate random numbers, right? So how do you do that? And so in the past you would've used a search engine stack overflow or something, and you would've found, some sample code and you would've pasted it in to your file and made variable name changes and things like that. And so what we do now is we ask chat, okay, I need to generate some random numbers. How do I do it? And then it will come back to you with a few options, and then you can systematically work through those options if you like. Uh, and you can ask, okay, is this one built into Python or not? And then it will tell you, oh, this one's not. We don't need to memorize API docs [00:23:11] Dan: And you say, oh, well, okay, so like, how do I install this? And then no, does it work on all OSS or just Windows? Right? So, uh, we guide the reader through these questions that you could have, uh, to help you make a decision. Um, and I think what I like the most about this is not having to learn. APIs, like yet another api. Like I don't, I don't think I have room, you know, in my, like, brain for any more APIs. And, and what's cool is I, I've forgotten like every API that, uh, we've used in the book. So we have like examples of emerging PDFs and, uh, removing duplicate images from directories, uh, from like people's phones, and, and stuff like that. And I don't know, I don't know which library it's using. Uh, and I'm, I'm totally okay with that, right? Like I just, I, I wanted to get the job done. I wanted to write a tool, and the tool got written and it used some sort of library and it worked great. And I didn't have to look through the documentation for that library and figure out like, which functions do I have to call and things like that. So, I, I know it, it can be fun, you know, it could be fun to really learn an API well, but a lot of people, they don't want to program for programming sake. Like, they just wanna get work done, right? So, you know, while I, I, I fully admit to, enjoying programming just for the sake of programming. I do a lot of competitive programming problems just for fun. You know, it's like Sunday morning and it's like, Hey, yeah, I got like an hour and I got an hour to work on something. Let me work on this little competitive programming problem. But, uh, a lot of people, they're not motivated by that. They're motivated by consequences of code. And this is one thing about LLMs that I'm very excited about, is you can just, make a lot more progress, without having to learn what these, people may believe is just useless knowledge, right? Like, does it really matter how I should invoke this api Right, to merge PDF files? I mean, the answer for many people is no. Like, they just want the result to happen. And I love how we can kinda match what they, uh, deem important, right? With the LLMs, it's like a new level of abstraction, for for many people. LLMs make building software possible for more people [00:25:28] Leo: There's a couple of audiences that come to our introductory classes, and what Dan's talking about here is one of the things I'm most excited about with this, and that's the students who come and take just one. Programming class. I know it's probably a different audience than, uh, a lot of the people listening right now. Um, but the people who just take one programming class, it's required for, for their major. They, I just wanted to explore it a little bit, but they, they don't go into this as a, as a career. I think a lot of those students right now, uh, if you ask them a year later to program something, do any of these tasks that we're talking about right now, I doubt they're able to, even if they did really well in that class. Uh, and that's really disappointing, right? If they've taken a programming class, they should be able to, to do something with that, a year or even five years later. And I really believe that if you teach them the skills of interacting with these LLMs, they'll be able to do these tasks later. They'll be able to come back and go, you know, I don't remember any of the Python syntax. I don't remember, uh, even how to get started with this. But you know what, I'm just gonna ask, uh, copilot, how do, how do I go about merging these PDFs, having this directory? And then, uh, the copilot chat comes back and says, oh, you might use this and that. And then they go, oh, I remember, I remember how to, how to write these functions. And I just said, you have to go over a prompt. I think they could really do it. And that, that's a bit of a game changer, right? That means a larger portion of our society will be able to, uh, write code and using a useful way. And I'm just really excited about that. I think it's gonna be really nice, uh, after the changes happen. More people might stick with Computer Science [00:26:58] Jeremy: I can totally see in the context of someone who's, not seeing it as a career, or someone who is like, hasn't done it in a while. It could be. These tools can be incredibly useful, right? Or it can even get you interested in this field at all, right? Like a lot of people, they, they struggle through the syntax and then they decide like, oh, this is not for me. Even though like they had something really cool they wanted to build and, and maybe these kind of tools can, can get them over that hump. [00:27:31] Leo: Exactly. I think there's a population of students, um, and it varies a bit by demographics, who come to computer science, with really the best motives in mind, right? They wanna make their goals in their life are to make the world a better place, and they want to achieve those goals. And if you spend the first three quarters or three semesters working with them and all they're seeing is syntax and they're not actually solving anything meaningful, um, it starts to create this disconnect of what their goals are for their life and what they think the goals of are, are career are. Of course as, as, as a computer science, I wanna say, stick it out. You know, if you, if you go into the fourth, fifth class, you'll start seeing how these are really useful tools that can make society a better place. But it'd be really nice to front load that and have them solving useful problems much earlier and seeing that, uh, computer science, uh, can be used in really nice ways. Efficency can be taught later [00:28:26] Jeremy: And, and so within the, the context of. People who are studying computer science will eventually, who may become professional software developers, things like that. Something more long term where it becomes more of a craft, the, the code that comes back from these large language models. Sometimes it could be something that's like not maybe the most easy to read or it may be doing something inefficiently. And I'm wondering from your perspective how users of these tools should, should think about that and, and recognize when that's a problem. [00:29:06] Dan: We in, in, in the first couple of courses, typically in the CS program, um, we don't spend much time on efficiency. the reason is that there's just so much to learn early on, and, um, we worry about overwhelming people with, know, too much, for them to, to process it at once. And we don't wanna prevent students from becoming interested, by. Giving them all of these requirements early on. So typically we, you know, we push efficiency, down the, down the road into like a data structures course, for example. But your question points to another reason why, we've decided to teach some of the skills we teach early on. So if, if a student, you know, came up to Leo or, or me and said, Hey, you know, like I wanna generate efficient code, how do I do it? My answer would, would be, so like, get, get familiar with programming first, but you are learning the skills necessary where you'll be able to look at code later because you know how to read it still, right? It's not, uh, something that you don't understand. You're gonna, you're gonna know it. We're gonna spend lots of time on code reading, and so later I think we can just teach efficiency the way we always did. Um, so, you know, doing, uh, time complexity analysis on, on the code and they're still gonna understand what the code is doing. So, um, I, I, I don't think this is going to, this is going to change much in, in the earliest courses. LLMs can expose students to different types of code [00:30:35] Leo: To the, to the point about code readability, I might add that, uh, certainly they're gonna get back some, some code that's maybe not the best style and it may not be as readable. Uh, but what's kinda interesting is that students aren't exposed to a lot of different styles kind of in our existing courses, right? They, they see the code that they write and they see the code that the professor writes and gives them, and there's not much else. And so, I mean, we're gonna need data and we're gonna need research to, to, to know this for sure, but it, it, I suspect them seeing lots of different code styles and having to read those different code styles may actually inform them better than we do now about what makes code more readable. Uh, and then they might be able to employ that as they go forward. [00:31:21] Jeremy: And, and when you're saying they're gonna read different styles and things like that, are you referring to code they're gonna see from the LLM or are you talking about them reading just other code bases in their classes or their professional work? [00:31:39] Leo: Oh, I'm sorry. Yeah, I was referring to the code. They'll see from the LLM Right [00:31:43] Jeremy: Oh I see [00:31:43] Leo: LLM will come back in all these different ways. They'll have different styles and they'll, uh, have different approaches to solving it. Right? Sometimes they'll, uh, come back with like this one line Lambda expression thing that solves it, and they'll have no idea how that works. And they'll, they'll ask for a different answer and they'll get, uh, a much more, uh, user-friendly first, uh, first programing experience kind of code back. And they'll be able to understand that and go, okay, this is the kind of code that I wanna see. Not this thing that was completely non-readable. [00:32:11] Dan: Yeah, Leo, I just thought of something. So, uh, so you know, by default you can get it to give you 10, uh, code segments to solve the problem, right? So it'd be kind of cool, if we ask students about each of them, right? Each of the 10, which ones are right, which ones have bugs, which ones have good style, which ones have bad style, it's like a built-in learning opportunity right there. So yeah. [00:32:34] Leo: Oh, it's true. Yeah. And, and so the 10 things that, uh, Dan I was referring to is if you do control, enter in vs code when you're working with a copilot, it'll give you back 10. Possible responses. And you're totally right Dan. You could just say of these 10, how readable are they? Are they right? Um, there's lots of fun things you can do to ask students questions. [00:32:51] Dan: and often many of them are right with just subtly different ways of, of, of, of solving the problem. I mean, I'll, I'll admit to having some fun looking through all of the suggestions just to kind of see what the variability is and when there's a lot of variability. I really like it because, uh, like Leo said, it exposes people to different styles they may not have seen before. And, um, may it may, it may, um, encourage you to ask questions, right? Like, why does this one work? Right? I've tested it. It doesn't look like it should work. Why does it work? I feel like that's the beginning of a pr pretty powerful learning experience right there. [00:33:30] Jeremy: Yeah, that makes sense to me because I, I think about how when a lot of people are doing software development before all these LLMs, they will search on the internet and go, okay, what's an existing answer for this thing I'm trying to do? They'll find a post on Stack Overflow and they'll find the accepted answer and it'll be like, okay, this is it. This is the solution. Whereas, at least in this case, it seems like you can go like, okay, well here's, here's 10, 10 potential solutions, and at least you get a little bit more exposure to, um, what are the different ways you could do it. [00:34:06] Leo: Exactly, and, and it's nice for 'em to see these different options. And I think there is, for professional software engineers seeing that stack overflow post, like, here's the accepted answer, integrating that into your code isn't a big jump for, for a lot of us. Um, but I do wanna stress that for the intro students, it often is a really big jump. Uh, just the, oh, how do I change around this? Oh, this was the interface for this function, but I'm been asked to have this other interface with a function and, and they really can struggle in that domain. And so I think copilot and these LLMs are nice in that they give back answers that are more tailored to the existing code that they're working with, um, and will reduce that barrier of them trying to incorporate the answer. Optimization can come later, most code is straightforward [00:34:50] Jeremy: So it seems kind of overall, when you're talking about people who are using programming in a more professional capacity, the code style and efficiency that will probably be taught very similarly to however it is now, where you basically have to get exposed to different styles and types of code, get exposed to the algorithms and and that will allow you to read the answers you get back better. So the answers you get back from the LLM with the knowledge you gain from these later courses, you'll be able to tell like, oh, okay, this is, this. Level of complexity, or this has like, you know, exponential, performance implications, that kind of thing. [00:35:43] Leo: So I think the performance piece is really important. Um, and I appreciate your, you bringing it up. I think, I'm, I'm kind of curious, uh, uh, what percentage of the time professional programmers are really spent, uh, are spending optimizing, uh, the code that they write? Um, I suspect a lot of the code that's written, uh, is pretty straightforward. Uh, you, you already know how to work with the database you're working with. You already know how to write the queries for that. You're, you're, you're just, uh, you're still doing something that, that's certainly thought provoking, but it's not the hard work of, oh, how am I gonna write design the right algorithm for this to get the exact best runtime? And so I think there are some times that that does matter, but those may be the times that the LLMs aren't as helpful and there's still gonna be a, a pretty big need for programmers who know how to do that, uh, themselves. [00:36:33] Jeremy: Yeah. I mean, I, I think that of course this is gonna vary from industry to industry, but Dan, you were talking about learning APIs and I feel like a lot of jobs are learning APIs and gluing them together. [00:36:49] Dan: Yeah. Um, I would agree, but I wonder what can happen if some of that's automated. Right? So maybe, people who are gluing APIs together will be able to. Get even more done, right? Incorporate even more, APIs in the same amount of time that they've been doing it. Now, I don't, I don't know if that job changes as dramatically as it, it seems, um, I guess there's this tension between people, having to change jobs or become more efficient in the current job. And, you know, obviously I, I hope it's the latter and there is some recent evidence that it could end up being, the latter, just more productive people overall, building, know, bigger software in incorporating more APIs than, than before and, and not overloading yourself. So, we'll, we'll see, you know, how it, how it all, um, how it all turns out. But I'm, I'm hopeful that we'll just be doing our jobs better. Reading code as a skill [00:37:51] Jeremy: In that, that context, sometimes people will say that the, the reading of code and comprehending code can sometimes be more difficult than writing the, the code. And in fact, can sometimes take you more time, like, let's say you've built out a project and now you need to add new features. Well, to add the feature, you have to understand the, the code base that existed before and so. When we talk about LLMs and the context of not programming, but just general writing, people talk about the fact that it's easy to generate more writing, right? We can generate more documents, blog posts, more articles, that sort of thing. And with code, it sounds like it'll be similar, right? Where it'll be easier for us to write more code, generate more code. Um, but I wonder if either of you have thought or, or think it's a concern that we'll be generating so much code that now we'll have so much we won't be able to even have the time to understand all of it, [00:38:55] Leo: I haven't thought much about the generating so much code that you can't understand. I mean, I think if, if we're generating code, I, I'm really hoping someone's testing and making sure it works right and stuff. And so I guess it depends on what kind of, uh, what level of the interface are we, we looking at. Um, but I have thought about a fair bit about the, the, what you described early on in your question, which was. Diving into a big code base, figuring out what needs to be changed and changing it, that is a really common task, especially for like new software engineers, uh, in their, their first jobs. Right. And it is also one that's really well documented in the, the education literature, uh, education literature, uh, that we aren't teaching them to do. Like we almost always are giving them, uh, right, these functions are really well defined or, uh, write the code all from yourself, but we rarely ever give them large code bases to learn from. Now I don't think diving into a large code base and trying to understand how it works is the right thing for like an intro class. And then we're mainly talking about, uh, students first learning your program here. Uh, but I am encouraged that we are teaching code reading as kind of a first level skill when I think current programming courses teach code reading right? In parallel with writing. So a lot of the writing's happening very early before they even know how to read well. Um, and so I think there's some optimism here that if we teach code reading first and make it a core skill, they'll be better set up in the later classes to maybe take on those large projects where they tackle the exact problem you're describing, which is also the exact thing they're gonna have to do when they get to, to their jobs. The amount of code we throw away may increase exponentionally [00:40:37] Jeremy: Yeah, it, it also kind of, I wonder sometimes when you're writing code, you'll write it in a certain way because it's tedious to write a lot of code, right? Like you'll, you'll make something generic in such a way where you can reuse it, and maybe reduce the amount of lines of code. But then when you have something, generate that code, maybe it'll be a solution that. Is a lot more code than you would've written personally, and it works. But, by nature, the fact that it was easy to generate, you chose that solution versus one that, that maybe was more generic and um, had less code. I, I'm not sure if that makes sense, but I'm kind of curious if the use of these models will sort of change maybe how we write code [00:41:30] Dan: I'm kind of wondering if the amount of code we throw away is going to increase exponentially. Because, because, um, you spend time working on something, you're probably gonna keep it. But I, I wonder because, uh, Jeremy, like what you said, it's, it's so easy to generate code now. so I, I've had this thought where, what, not sure how, how, um, how much I believe myself here, but, uh, should we be storing the, the prompt, like not the dot py file, right? Like just store the prompt and then if you do have to regenerate the code later, maybe you gotta make some tweaks or something. You just change the prompt and then, and then rerun it. So, because, because, because code is, um, It's not there yet, but it's, it's becoming free, right? It's becoming, you can generate as much of it as you want. And so I, I wonder how much, how much of it is, so there's, there's a lot of code already that you write once, and you run it once and then, and then you get rid of it or lose it or whatever. And I wonder if that, that practice will increase. So it's like, okay, you know, I wanna do this data analysis. Okay. So you write a prompt, you get some code, you generate some graph, and then you just don't even think about it. You just get rid of it, and then maybe later you want another similar analysis and you just do it again. Right. So I kind of wonder, because there's maybe less ownership now of code, right? You didn't like sweat as much to write the code. So maybe, maybe more of it gets thrown away. [00:43:03] Leo: I, I completely see what you're saying, Dan. So you have the prompt and you had it perform some form of data analysis and you wanna tweak it to do a slightly different data analysis. Uh, I wouldn't go into the, I mean, right now if I wrote the code from scratch, I would go into the code and find that one spot that I need to change and I would tweak it. But if I'm just generating the code, I would just tweak the prompt and then get a new piece of code that does exactly what I want there without having to, to [00:43:26] Dan: yeah. You know, how, how, it can take a, a long time to re-familiarize yourself with a program that you wrote six months ago. You know, it's like, oh, I, I called this variable temp one. Like, what's this for again? Right. you know, maybe, yeah, [00:43:41] Leo: Wait, I think we've all been there. Keeping the prompt instead of the code [00:43:43] Dan: Uh, but yeah, I don't know. It's just, just a thought I've been having. It's like, it, so, so when, when, now when, when I hear people talking about code maintenance, for example, like using, you know, good variable names and consistent style and stuff, in my head I'm thinking, well, you know, is, is the code the artifact now? Is it still the artifact? And right now, you know, of course it is. But, um, but, you know, fast forward a little while, maybe, maybe some of what I just said, uh, sort of becomes true eventually. [00:44:11] Leo: That's getting to perhaps kind a larger issue about what is the interface that we're, we work with as programmers. I've been thinking about this a lot, uh, just because I, I teach my, my background's. I have a PhD in computer architecture, and so I teach the classes that do machine code and assembly code, and they're, they're, they're core classes for computer scientists because you need to know how computers work. And, um, I think that's a core component, understanding that, But we don't start by teaching the students machine code. Like no one wants to learn how to program a machine. Um, at least I can't imagine anyone wanting to learn that. Um, and we've kind of cognitively picked Python or Java right now, the most common two programming language to learn from. Because they're easy to learn, they're easy to, to read. The code tends to be more understandable when you read it. It tends to be a little bit more forgiving when you write it. Um, and so we picked these because we think they're nice interfaces. They're, they're convenient for programmers and they're convenient for, for new learners. And it just seems to make sense that the LLM may be that next step of interface that we start choosing. The, the catch is because it can be wrong. It's not like a compiler. A compiler is deterministic. It's gonna be, uh, shy of that. Maybe one time in your career you find a compiler bug, like the compiler's always right. This time the LLM isn't always right and so I, I'm not sure how this is all gonna play out. Um, you can imagine the LLM as the new interface and all we ever store is, is code prompts and we don't ever even see the code, perhaps as one scenario. And the other is we, we do in fact still interact with the LLMs and still interact after the code. Um, but I think it's too early to kind of know where this is all gonna fall. But, um, we could see some big shifts, I think, in the field over the next few years. [00:45:52] Jeremy: Yeah, I think that's pretty interesting to think about what, what Dan had mentioned where yeah, you could check in your prompt and maybe a set of test cases for the app that's supposed to come out and yeah, maybe that's your alternative to the actual source code. Um, especially for things that, like you were saying, are, are used not that frequently or maybe you only use it once and so the, um, the quality of the actual code is. Maybe less so important in terms of readability and things like that. And as long as you can reliably reproduce that thing, yeah, maybe, maybe that does make sense. [00:46:39] Leo: The reliable reproduction could be the tricky part. And you there may be even saying that you, you start doing where you tag don't, don't try to reproduce this. Like, we actually spend a whole bunch of time on this. It's super optimized. Like, don't think the LLMs gonna give you this answer again. So, uh, keep the code along with the prompt. Keep the code too. Don't, don't scratch that because the LLMs not gonna do better. Um, and then in some cases you're like, yeah, the LLM's gonna do a pretty good job on this and [00:47:07] Dan: Yeah. Leo, maybe we have to Maybe we have to distinguish between code that you can just get out of an LLM no problem. And code that people have spent time working on. I like that. Yeah. Yeah, [00:47:21] Leo: some you're like, hashtag don't change. [00:47:23] Dan: Humans were here. [00:47:25] Leo: exactly. The concerns about relying on commercial tools [00:47:27] Jeremy: Yeah. this is the 30th iteration of this code we generated and we verified that this one's good. So just, just, it's a interesting, interesting future. We, we might be heading into, so, so one thing you, you mentioned a little bit earlier is that the tools that you're gonna recommend to your students, it sounds like it's primarily going to be GitHub copilot and GitHub copilot X for the, the chat interface. And one thing about these tools is these are tools by commercial companies, right? These are tools by OpenAI and Microsoft. They're tools that you have to pay a subscription fee to use. You have to send your code to a commercial server. And I wonder if that aspect concerns you at all. The, the fact that the foundations that our students are learning on is kind of reliant on these companies and these cloud services. [00:48:31] Leo: I think it's an amazing question. Uh, I think to some degree these are the tools that professional software engineers are using, and so we need, there's, there's a bit of an obligation as instructors to teach them the tools that they're gonna be using as professionals going forward. I think right now they're free. Uh, to use for, for education's sake. and so as long as that stays the case, I'm a little, more comfortable with it. If it started to move to a pay model for education, I think there could be some really big problems with equity. and I think it's not just true for, for computer science, but I'll start with computer science. I mean, if it's computer science and we start making it where you would have to pay to get access to these models or use these models, then whether we tell the students they can use it or not, they still can use them. And so there's gonna be some students that, the wealthier students who may have access to these, who are being able to learn better from these, being able to solve better homeworks with these, that's super scary. And you could imagine the same thing for even just K through 12 education, right? If you're thinking about them writing essays for homeworks or anything else, if it's a pay model, then the students who have, uh, the money will pay for it and get access to these tools. And the students who don't, won't. You could imagine the, all these kind of socioeconomic, uh, divides that already exist, only being exacerbated by these tools if they switch to this pay model. Um, so that has me very worried. Um, and there's some real ethical issues we have to think about when we're, we're using them. Yeah. Um, the other ethical issue I kinda wanna mention is just the, the copyright and the notion of ownership. Um, and I think it's important for us as instructors to engage students in the conversation about what it means to create content and intellectual property and how these models are built and what they're building off of. Um, and just engage in that ethical conversation with the students. I don't think we as a society have figured this out. I don't, I think there's gonna be some time both legally and ethically before we have the right answers. but at the very least, you need to talk to the students about, uh, these challenges so they know what's going on and they can engage in the debate. [00:50:45] Dan: Yeah, just to underscore that, Leo, this is the reason we're doing research on the first version of the course that Leo's teaching. We need research on the impact of LLMs, on students. especially, we need to know if students benefit from this, in what ways they benefit. How are these benefits distributed across demographic groups? We have a long and sad history in, computer science of inequities, in who takes our courses, who succeeds in our courses. we're very aware of this and it's, uh, unacceptable to make that situation, uh, worse than it already is. So, um, we're, we're gonna be carefully doing our research on this, uh, first offering of the course. A downside is students might bypass fundamentals [00:51:30] Jeremy: So we've mostly been talking about the benefits of using these tools in classes and in education. we just mentioned the possible inequities if you don't have access to those things, I, I wonder if from either of you, if there are negatives you see to this technology, whether that's the impact on what people learn or in anything else. Like are there downsides you see to the use of this technology? [00:52:04] Dan: Yeah. So in addition to, uh, the important, uh, inequity concerns that, uh, we just talked about, I have a concern about students using the tools in ways that. Don't help them learn the skills we think they need. So it's a, it's a, it's a power tool and you can, uh, you can get pretty far, I think with, without, um, being systematic in, in how you work with it and without testing, without debugging, um, it's, you know, it's, it's kind of magic right now. And so I can imagine, a lot of students just taking off at, you know, a hundred miles an hour. and so I'm one, one of, one of, uh, the things we have to worry about in these initial courses is, convincing students that there really are principles to using this technology. You can't just type something and get an answer and then go party. and, and, and so that, that is one of my concerns. That's one of the negatives. It's super powerful. And, like, like, so before you, you can't just type some Python and make it work and, but now you can sort of type in whatever you want and kind of get something back. and so part of our job as educators is to help students use these tools, in in a way that. Will ensure their long-term success with, with these tools, right? So, I, I'm not saying that they can't just do whatever they want and, and make some of their first assignments work. I, I think they could, I think they could be like un principled with the prompts and just throw it in there and get code and, you know, submit that, submit that code. But, uh, we're, we're going, you know, we're going for longer term, uh, effectiveness here, right? We have students who may not take another CS course. We need to keep them in mind. We have students who are gonna wanna eventually be software engineers, uh, security experts, PhDs in computer science, right? So we have a number of audiences that we're talking about, and we think they all need to know the fundamental skills of programming still. Even though, you know, they have this, this power tool at their expense now. [00:54:07] Leo: Speaking of the fundamental skills for programming, I, because of my, my hardware background, I'm this huge fan of teaching mental models in classes. Like what is the mental model of computation? Like, how, how do you imagine the computer is executing as you write the code? And, uh, ideally a professional computer scientist should be able to take, okay, well this is kind of the, my interpretation, this is my mental model for when I'm working at Python. If I really, really wanna drill this down, I can turn that into assembly. And if I really had to and turn to machine and even think about how this is working within the cash subsystems and virtual memory and all these things, I want 'em to be able to play those things out. We are changing the first class, and I think the first class is gonna be doing some things much better than before, like teaching problem decomposition and things like that. I'll, I'll mention that in a second, but, we are doing some things better. but we may not be teaching at how is the computer working as well. And so you can't just change one course and think the rest of the curriculum's gonna work. And so I think the entire curriculum's gonna need to adjust some, um, in, in a way of just adapting to these LLMs. Rethinking how to assess students [00:55:10] Leo: Um, the second piece for things getting potentially more challenging, uh, is instructors, we're in a good place right now as instructors, uh, in terms of how we assign and grade homework. Um, so grading, uh, this probably isn't gonna be a shock, is not one of our favorite things to do as faculty. I mean, it's actually really important. Uh, it's, it's central to us understanding how our students have learned, but it's generally not the most favorite thing that we do. And what a lot of instructors have done, myself included, is for much the introductory sequence. We have created assignments that can be entirely auto grade. So we define functions incredibly well. Like, write a really good description, this is exactly what it needs to do, and the students write that one piece of code and, uh, whether we like it or not. That is exactly when copilot does very well, and the LLMs do really well. And so the LLMs are gonna solve those very easily already. So we have to fix our assignments just like it, it's a given. Um, but it means that we're probably gonna have to rethink how we do assessment. Um, and so we're probably gonna be writing assignments that are much more open-ended and we're probably gonna have to be grading those, uh, putting more care and time integrating those potentially by hand. Uh, but I think these are all good things for the community and for the field. Um, but you can imagine how it's gonna be a bit of a, a shift for faculty and, and may take some time, uh, to be adopted as a result. [00:56:41] Jeremy: And, and so if you're shifting to homework that is more broad in scope, has more code, needs more human eyes on it, how how does that scale within the educators side? Right. You were, you were talking about how you've got, um, things that could be auto graded before and then now you're letting somebody generate this whole project. How does that work from your end? [00:57:09] Leo: I, I think there's a few things that are at play. Um, we, at, at large institutions like Dan and I are at, we have kind of armies of, uh, instructor assistants, instructional assistants that help us, uh, and so we can engage 'em in, in various tasks. And so, uh, one of the roles they heavily have now is helping students in the labs solve these auto grade assignments. and so you can imagine they will still be in the labs helping the students with these creative assignments, but now they're gonna have to have potentially a larger role in assessing the success of those. Um, there's been some really creative work, uh, in, in assessment and so I'll, I'll, I'll mention a couple of the ones, but there's, I, I'm sure I'm gonna be omitting some. But, uh, one is, Students could complete their project, and then they have to record a short video of them explaining the code that was in their project and how it worked, and you actually assess them on that video and their explanation of the code and how it works. Right? Because those can be perhaps shorter than trying to go through a really big project and, and see how it works. Um, there's a tool out of a UIUC, um, called Prairie Learn that helps with, um, uh, these are still auto graded, but uh, it helps with the, the test setting where you can write questions and have them, uh, graded kinda in a, in a exam or homework setting. the, the neat feature of that is that it can be randomized and so you don't have to worry as much about students kind of leaking information to each other about, test content from quarter to quarter. And so, because the randomization, they have to learn, actually learn the skills, and so you can, um, kinda engage with 'em in these test centers. And so right now a big grading burden on, on faculty is exams. And so you can actually give more exams, give more frequent feedback to the students and with, without the same grading burden. and so that, that's the other kinda exciting assessment piece. [00:59:01] Dan: Current assessment is not effective [00:59:01] Jeremy: In the different types of assessments, like the example of the video you gave, I'm just thinking to myself, well, the person could ask copilot or ChatGPT to give 'em a script, right? And they can rehearse that when they, um, send you a video. [00:59:18] Leo: I think, but I think that's, um, I think this is a philosophical shift in assessment that's kind of been gaining momentum over the years and that's that the assignments are all formative and they should all be. Pretty low stakes and the students should be doing them for the process of learning. and then, and, and it's unfair in some ways. There's a, there's a lot of things right now where you kind of grade them on, were you present at this time? Did you, did you meet this deadline at this time? Which if you're thinking about the, a diverse population of students, like you can imagine like a, a working mother who's also trying to do this, grading them on where you here at this time doesn't feel very equitable to me. And so there's this whole movement for grading for equity that shifts much of the assessment onto the exams. And so, yeah, the students could, uh, find multiple ways to cheat on the homeworks, but that's not the point of the homework and the homework's just to learn. It's a small scale, the grade, so. But you still then have those kinda controlled environments where they're taking these tests and that's where the grade actually comes from. Um, it's gonna take some time to make that shift, at, at, at least at a number of schools, my own included assess that those ho take home assignments are a huge portion of the grade. And students will love that because they can get all this help. And they can, especially with the auto graders, that they don't even write their own test cases. They just use the auto graders, the test cases. Right. Um, which is really depressing. Um, and they go to the, the, the instructional staff. The instructional staff tends to, to give away the answers. That's actually a paper that we, uh, published a few years ago. Um, and so the students love this high stakes, but tons of help version of assessment, but that may not actually measure their, their level of knowledge. And so it's gonna take a little bit of adjustment, for students and for faculty to do the shift, uh, to where the, as assess the, the exams are the Give students something interesting to build and don't worry about cheating [01:01:09] Dan: Yeah. Also, I'm, I'm not convinced that cheating is gonna be a problem here. it's very possible, for example, that students cheat on our previous assignments because the assignments were not authentic. Um, you know, in industry you're never going to, no one's gonna come up to you and say, Hey, like, from scratch, you know, write this exact function, takes two lists and determines, you know, how many values are equal between them. It's like, it's like, that's not gonna happen, right? You're gonna be doing something that has some sort of business purpose. And I kind of wonder, um, and this, this will, you know, this will play out, um, one way or another in the next, in the next, uh, few months. But I kind of wonder if we give students authentic tasks. Now you're cheating yourself right out of doing some, some something of value, right? Like before you were. You were probably cheating yourself out of a learning opportunity, but how, how can, you know, how can students know that? Right. The assignments boring, right? It's like, write all these functions and then something, something happens because of the magic, you know, starter glue code we wrote. So I

university learning english giving phd writing reading toronto microsoft students chatgpt code humans types associate professor diving windows rethinking programming openai computer science api cs python gpt optimization java github apis incorporate llm pdfs oss ide cuz ucsd partially solving problems lambda lms stack overflow california san diego syntax github copilot uiuc both dan toronto mississauga zingaro cs1 software engineering radio dan so jeremy you jeremy so

Anita Zhang and Alvaro Levia on systemd at Meta

Play Episode Listen Later Jun 14, 2023 63:48

systemd is a service manager for Linux. It is the first process that runs on many Linux distributions and manages all other user processes. It includes utilities for logging, process isolation, process dependencies, socket activation, and many other tasks. psystemd is a python library to communicate with systemd over dbus from python as an alternative to shelling out from an application to control services. Anita Zhang is an engineerd managerd at Meta and Alvaro Levia is a production engineer at Meta. I attended their systemd workshop at the Southern California Linux Expo. Topics covered: What's systemd? Giving talks and workshops cgroups and namespaces systemd timers vs cron Migrating from CentOS 6 to 7 Production engineers need to go lower in the stack to debug applications Meta's Linux userspace team Use of public cloud at Meta Meta's bootcamp Pystemd Mastodon Anita Zhang Alvaro Leiva Workshop systemd workshop Conference talks Journey into the Heart of systemd - Scale 19x Systemd: why you should care as a Python developer - PyCon 2018 Move Fast without Breaking things - Scale 18x Solving All the Problems with systemd - LISA18 Using systemd to high level languages - All Systems Go! The Curious Case of Memory Growth - Scale 19x Related Links systemd psystemd systemd-run systemd-timers Transcript You can help edit this transcript on GitHub. Introductions [00:00:00] Jeremy: So today I'm talking to Avaro Leiva and Anita Zhang. Avaro is the author of the pystemd library and he's a production engineer at Meta. And Anita is an engineerd managerd at Meta, and I'll let her explain that further. [00:00:19] Jeremy: But thank you both for joining me today. [00:00:21] Anita: Yeah, thanks for having us. [00:00:24] Jeremy: I guess where we could start, Anita, maybe you could explain a little bit your, your title that I just gave you there. engineerd managerd [00:00:31] Anita: Yeah, so by default I, I should be a software engineering manager, but when I transitioned to management, I was not, Ready to go public with, um, my transition. So I kind of hid it by, changing the title. we have some weird systems in place that grep on like the word engineer. So I had to keep engineer in there somehow. and so I kind of polled my friends what I should change my title to, and they're like, oh, you're gonna support the systemd team, so you should change it to like managerd. So I was like, sounds good. engineerd, managerd. I didn't wanna get kicked out of any workplace groups, for example, that required me to be an engineer. [00:01:15] Jeremy: Oh, okay. [00:01:17] Anita: Or like engineering function, I guess. [00:01:19] Jeremy: Yeah. Yeah. And you just gotta title it yourself, so as long as you got engineer in it, you're good. [00:01:24] Anita: Yeah, pretty much. Some people have really fun titles like Chief Potato Officer and things like that. [00:01:32] Jeremy: So what groups does the, uh, the potato officer get to go in? [00:01:37] Anita: Yeah. Not the C level ones. (laughs) What's systemd? [00:01:42] Jeremy: I guess maybe to, to start, we should explain to people who aren't familiar, uh, what systemd is. So if either of you wanna wanna take that one. [00:01:52] Alvaro: so people who doesn't know, right? So systemd is today is your init system, right? Is the thing that manage your, your process. and the best way to understand this, it is like when your computer, it needs to execute something. And that's something is what we call pid one. And that pid one is the thing that is gonna manage everything from now from there on, right? Uh, in the most basic level, if you remember how to, how does program start, how does like an idea becomes a program? Uh, you need to fork exec, right? So that means that something has to be at the top of that tree and that is systemd. now that can be anything, right? So there was a time where that was like systemv and there was also like upstart, uh, today's systemd is the thing that, uh, it's shipped in most distributions. [00:02:37] Jeremy: Yeah, because I, I definitely remember when I first started working with Linux, uh, it was with CentOS 6, and when I would want to run a service, I would have to go and write a bash script and kind of have all these checks for, is this thing running? Does it have permission to these things, which user is it running as, and so there was a lot of stuff that I remember having to do before systemd came out. [00:03:08] Alvaro: The good old days as we call them, [00:03:11] Jeremy: Or the bad old days. [00:03:13] Anita: Yeah. Depending on who you ask. [00:03:15] Alvaro: Yeah. So, so that is super interesting because, um, During those time, like you said, you have to write a first script. That means that you were basically yourself, your own service manager, right? So ideas as simple as, is my program running? There was no real answer. You have to figure it out, right? So if you run a program, uh, you maybe would create a pid file which hold the p or the pid of the process, of the main process, right? And then something needs to check, oh, is this file exist? Does the file exist and does the content of this file actually match to a process? And then you grab the process. So it was all these ideas that you had to do, and then for, you have to do it for every single software that you would deploy on your machine, right? That also makes really hard to parallelize stuff, right? Because you have no concept of dependencies. So if your computer has to put, uh, I, I dunno if you remember like long time ago, like Linux machine would, takes like five minutes to boot like your desktop. I remember like openSUSE. I can't remember, like 2008, 2007. Uh, it would take like five minutes to boot and then Ubuntu came and, and it start like immediately. And it was because, you can parallelize things, but you cannot do that if all you're running are bash script. Why was systemd chosen to be included in Linux distributions? [00:04:26] Jeremy: I remember before the Linux distributions didn't include it. And I wonder if you have any insight into how systemd got chosen to be the thing to manage our processes and basically how we got to where we are today. [00:04:44] Anita: I mean, we can kind of speculate a little bit. at the time when Lennart started systemd, um, with. Kai Sievers probably messed up his name there. Um, they were all at Red Hat and Red Hat manages fedora these days and I believe fedoras kind of like the bleeding edge for a lot of the new software ideas. Um, and when they picked up systemd as the defaults, um, eventually it started to trickle down to the rest of their distributions through RHEL and to CentOS and at the same time, I think other distributions started to see how useful it was in terms of managing all the different processes and services. Um, I know Debian at one point had kind of a vote on like whether they should make systemd either default or like, make it easy to switch between both. And then they decided to just stick with systemd because it's, I mean, the public agrees that it's like easy to use and it's more useful. It abstracts away a lot of things that they had to manually do before Who is interested in systemd? Who comes to your talks and workshops? [00:05:43] Jeremy: Something I've been kind of curious about. So just this year at SCaLE uh, you ran a, a workshop teaching people how to use systemd and, and sort of what it is about. I guess when, when you get people coming to these workshops, what are they typically, where are they typically coming from? Are they like system administrators or are they software developers? Like when you run these workshops, who are you looking for as your audience? [00:06:13] Alvaro: To be fair, this was the first time that we actually did a workshop for this. But we have like, talk about this in, in many like conferences. here's what happened, right? So every time that you put systemd on the title of, uh, of a talk, you are like baiting people into coming in, right? Because you do want to hear like some people who are still like reluctant from that war that happened like a few years ago. Between systemd and Ups tart right? most of the people who we get are, I would say like, software engineers, people who do software, and at least the question that I always get a lot, it is like, why should I care about systemd um, if I run everything on my containers in my Docker containers, right? The other type of audience that you get, you do get system administrators. Uh, but in general those people only cares about starting and stopping services don't really care about like the, like the nice other features that systemd has to offer. And then you get people who just wanna start like flame wars and I'm here for them. Why give talks and workshops on systemd? [00:07:13] Jeremy: In previous years, you've given conference talks and, and things like that related to systemd. And I wonder for, for both of you where, where the, the interests came from, where this is something that you feel strongly enough about that you wanna give talks about it. Because it's like, a lot of times when people give a conference talk, it's about, like new front end technology or some, you know, new shiny thing. Whereas systemd is like, it's like very valuable, but it's something that I feel like a lot of people don't think about. And so I'm just kind of curious where the interest came for, for both of you. [00:07:52] Anita: I think I just like giving talks and teaching in general. So if I have work that I found really exciting or interesting, then I'd want to like tell people about it and like teach them and like show them something cool. I think systemd is kind of a really good topic in that case because a lot of people want to learn more about it. Today there's like lots of new developments going on in systemd. So there's like a lot of basic stuff that you can learn, but also a lot of new advanced topics that are changing every year as well. aside from that, there's also like more generally applicable things. Like everyone wants to know how to debug something if you're like a software engineer or developer or even a sysadmin. Um, so last year I did a debugging talk. there's a lot of overlap I'd say how about you Alvaro? [00:08:48] Alvaro: For me, it, my interest in systemd started in, back when I was working on Instagram, we needed to migrate from CentOS6 to CentOS7. and that was the transition where you would have like a random init system to systemd, right? So we needed to migrate all of our scripts from like shell script to whatever shell script is going to interact with systemd. And that's when I was like, I don't like this. So I also have a thing where if I find something that doesn't have an Python API for it, I go and create a Python api. So I, I create pystemd like during that time. And I guess for me, the first reaction was when I was digging up systemd was like, whoa, can systemd do that? Like, like really, like I can like manage, network firewalls, right? Can I, can I stop my service from actually accessing the internet without having to deal with iptables at the time? So that's kind of like the feeling that I wanted to show people when I, when we do these these talks and, and these workshops, right? It's why like most of our talks, eh, have light demos in them because we do want to show people like, Hey, like, this is real. You can use it. [00:09:55] Jeremy: I don't know if this was a conscious decision on your part, but the thing about things like systemd is they, they feel like more foundational things that don't change that quickly. Like if you look at front end development, for example, at at meta you've got React, and that ecosystem changes so often that it's like there's always this new thing, you learn the way to do it and then it changes, right? Whereas I feel like when you're in the Linux user space and you're with systemd, like they're adding new things, but the, the. Foundations kind of stay the same. I'm not sure if that sounds accurate to both of you. [00:10:38] Anita: Yeah, I'd say a lot of the, there are a lot of stable building blocks in systemd, but at Meadow we also have a kernel team, which is working on like new kernel features all the time. They take years possibly to adopt, but with systemd, if we're able to influence the community and like get those kernel features in earlier, then like we can start to really shape what the future of operating systems look like. So it's not, it's very like not short term, uh, work that we're doing. It's a lot of long term, uh, work. [00:11:11] Jeremy: Yeah, that's, that's interesting in that I didn't even think about the fact that you are sitting at the, the user level with systemd, but you kind of know what you want. And so if there's things that the kernel can do to support that, you're having that involvement. With the open source community, make sure that you have your, your say get put in there. Yeah. [00:11:33] Anita: Mm-hmm. [00:11:35] Alvaro: It, it goes both way, right? So one part it is like, yeah, sure, we want features and we create them. Um, and we actually wanted to those to be upstream because we like, one thing that you should, you should never do is manage internal patches for like, things like the kernel, because that's rebase hell. Um, but you also want to be like part of the community and, and, and, and get the benefit of like, being part of it. Who should care about systemd? [00:11:59] Jeremy: And so, like one thing you mentioned ear earlier, Alvaro, is that people will sometimes ask you, I'm running my application in, in Docker containers. Why do I care about systemd? So, so maybe you could explain like, how you would respond to that. Yeah. [00:12:17] Alvaro: Well for more, for most people who actually run their application container I'd say like, no, you probably shouldn't care. Right? Like, you're good where you are. But in general, like, like system is foundational in the sense that it is the first thing that your computer boots your computer doesn't boot off of Docker or Kubernetes or, or any like that. So like something has to run these applications. there's also like a lot of value is that not all applications exist in the vacuum. Like, uh, like let me give you an example. Like if you have a web server, When people are uploading stuff to the web server, you will upload temporary things and then you have to clean it up after a while. So you may want to take advantage of systemd timers or cron or, or whatever you want, right? While the classical container view is that your pid one of the container is the application that you're running, right? So you do want to have like this whole ecosystem, Not all companies can run on containers. not everything can run in containers. So that's basically where all the things start to, to getting into shape. There's a lot of value in understanding how programs actually like exist, right? With the thing that I told you at the beginning of how an idea becomes a program understanding like, like you hit, you are in your bash, right? And you hit ls Star full enter, right? What happened in your machine? Understanding all the things, uh, there is a lot of value and understanding how systemd works. It's, it, it provides, uh, like that knowledge for you. [00:13:39] Jeremy: So for the average engineer at Meta who is relying on your team to deploy their, their code, I guess, if that's the right term, do you think that they're ever needing to think about systemd or is that kind of more like the responsibility of your team and they're just worried about like, I put my thing into my container and I don't, I don't worry about it. [00:14:04] Anita: I think there's like a whole level of the stack that sh ideally we should not even care or know that we're running systemd below them. I think that's, say we're doing our job well, cuz then the abstraction is good enough that they don't have to worry about it. But there's like a whole class of engineers below that that have to, you know, support the systems that run our on bare metal and infrastructure and make it happen. And those are the people who really care about what we're putting in systemd or like what the corner cases are and things like that. [00:14:37] Jeremy: Yeah, that, that makes sense. I mean, one of the talks that was at SCaLE was, uh, Brian Cantrill um, he gave a talk about the forgotten operator, and he was talking about how people forget that there are actual servers behind all the things we're deploying to, right? [00:14:55] Anita: Mm-hmm. [00:14:55] Jeremy: There is a person that you're racking the machines and plugging the power, and like, even though there's all these abstractions in front, that still exists. And so it sounds like things happening at the kernel level and the Linux user space and systemd that's also true because all this infrastructure that people are using to deploy their software on your team is the one who has to keep that running and to keep that running, they need to understand, uh, systemd and, and all these foundational Linux pieces. Yeah. [00:15:27] Anita: Mm-hmm. Yeah. [00:15:29] Alvaro: Like with that said um, I, and maybe it's because I'm very close to to, to the source. Um, and, and you know, like, like I said, like when, when all your tool is a hammer, everything looks like a nail? Well, that hammer for me, a lot of the times it is like even like cgroups or, or namespaces or even like systemd itself, right? there is a lot of times where, um, like for instance, a few years ago we have not, like, like last year or something, uh, we had an application that was very was very hard to load, right? It used a lot of memory. And so we start with, with a model where we would load like a, like a parent process and then child process would deal with, with, um, with the actual work of the thing, the classical model of our server. Now, the thing is that each of the sub process that would run would need to run, uh, on a separate set of privileges, right? So it would really need to run as different users. And that was like very easy to do. But now we actually wanted to some process to run with a, with only view of the file system while the parent process actually doesn't have to do that, right? Uh, or we want to limit the amount of CPU that a child process would use. So like all of these things, we were able like to, to swap it out uh, with using like systemd and, and, uh, like, like a good, Strategy for like, you create a process, you create a new cgroup, you put that into the cgroup, you create the namespace, uh, you add this process into that namespace, and then you have like all this architecture, and it's pretty free because forking it's free in general. [00:17:01] Anita: Actually, Alvaro's comment reminded me of like why we even ended up building the systemd team in the first place. It's kind of like if we have all these teams trying to touch cgroups on their own or like manage processes on their own, they're all gonna do it a different way and not, all of them will be ideal or like, to put it bluntly, I guess, we're really aiming to try and provide like a unified, really good foundational experience, for the layers above us. And so, systemd and the other things that go into the operating system are a step to get there. What are cgroups and namespaces? [00:17:40] Jeremy: And so for someone who's not familiar with the concept of cgroups or of namespaces, could you kind of give like a brief description? [00:17:50] Anita: so namespaces are, uh, we're talking about the kernel feature where, um, there are different ways to isolate, uh, different resources to the process or like, so that they have their own view of certain things, the network or, the processes and things like that. Um, and Cgroup stand for control groups. It's, at meta we only use Cgroups v2 which is a way to organize your processes into, Kind of like a directory view. but processes will be grouped into different, folders, shall you say, but that allows you to, uh, manage the resources between different groups of processes, which is how systemd does its services. [00:18:33] Alvaro: So a, a control group will allow you to impose restrictions on how each system uses the resources, right? So with a cgroup, you can say, only use 20% of cpu, and the, and the kernel will take care of that. Uh, while namespace it is basically how you view the system around you. So like your mount directory like, like where does your home points to? that's, I would say it's more on the namespace side of things. So one is the view then one is the actual, the restrictions. And like Anita said, like systemd does a very clever thing. It doesn't have two, is not the. It's not why cgroups exist, but every time that you start a systemd service, systemd will create a cgroup for that service and will put every process in that cgroup, even though all cgroups would end up being the same, for instance. But eh, you can now like have a consolidated list of what process belong to a service. So a simple question like, like what services has my Apache web service started? That's show you how old I am. But yeah, you can answer that now because you just look at the cgroup, you don't look at the process tree. [00:19:42] Jeremy: So it, it sounds like the, the namespacing is maybe more for the purposes of security, like you said, giving you a certain view of your, your system. and the cgroups are more for restricting resources, but also, like you said, being able to see what are all the processes, um, are associated. Um, so that you, you don't have a process that spins up other processes and then you don't know who owns those, and then you don't know how to shut 'em all down. That, that takes care of that for you. [00:20:17] Alvaro: So I, I always reluctant to use the word security or privacy. I would like to use the word isolation. Yeah. And then if people want to impose the idea of security and privacy to those, that's fine, but it's, but it's mostly about isolation. [00:20:32] Anita: Yeah. Namespaces are what back all the container technologies are. Anytime you run things in a container, it's probably using some kind of name spacing. But yeah, you, you kind of hit the nail in the head. Isolation versus like resource control [00:20:46] Alvaro: As Anita just said that's what fits on containers, uh, namespaces and cgroup like a big mix of those. But that doesn't mean that the only reason why those things exist are for containers. You can take advantage of those technologies without actually having to think of a container. systemd timers vs cron [00:21:04] Jeremy: Something you had mentioned a little bit earlier is, is how systemd has other features and one of them was, was timers. And I was kind of curious, cuz you said you could, you wanna schedule a job, you can run it using cron or you can run it using systemd timers. And it, I feel like whenever I see people scheduling jobs, they're always talking about cron but, but not so much about systemd timers. So I was curious if you had any thoughts on that. [00:21:32] Anita: I don't know. I feel like it's used pretty interchangeably these days. Um, like even when people say cron they're actually running a systemd timer with the cron format, for their time. [00:21:46] Alvaro: So the, the advantage of of systemd timers over cron is, is basically two, right? The first one it is that, you get more control on the time, right? So you have monotonic and absolute times, right? Which is basically like, you can say like this, start five minutes after the previous run. Or you can say this, start after five minutes after the vote, right? So those are two type of time, that is the first one, uh, which may be irrelevant for most people, but that's it. Uh, the other one is that you actually have advantage over the, you take full advantage of systemd, right? In current you say run this process, right? And how that process run, it's basically controlled by the process itself, right? So if you, uh, like if the crontab is for the user, that's good for you, but if you want to like nice it or make it use less cpu, that's what it is. Well, with systemd you say, This cron will start the service and the service, you take full fledged advantage of all the things a service can do. [00:22:45] Jeremy: From what I could tell, looking at the, the timers api, it, it felt like it would be a lot easier to kind of see when things ran, get, you know, get a log of, I ran this time job and it, it failed. Um, it seemed like systemd had a lot more kind of built in to, to kind of look into that. but, uh, yeah, like Anita was saying, like when you, you hear kind of cron all the time, but like you said, maybe it's, maybe they're not actually using cron all the time. They're just saying cron [00:23:18] Alvaro: Well, I would say this for cron like the, the time, the time, uh, syntax for it, it's pretty, it's pretty easy to understand, even though I never remember where, I remember where weekday is, right? The fourth, which one is which? [00:23:32] Jeremy: I, I'm with Anita. I need to look it up whenever I'm gonna use it. (laughs) [00:23:36] Anita: Yeah. I use a cron translator when I have to use cron format. [00:23:41] Alvaro: This is like, like a flags to tar, right? Like, I never remember which, which flags to put. [00:23:48] Anita: Yeah, that's true. [00:23:50] Alvaro: We didn't talk about this, we haven't talked about systemd-run, but one of the advantages of the, one of the advantages of using timers is that you can schedule them on demand, right? So like cron if you wanna schedule something over time, you need to modify the cron the cron file. Uh, and that's, it's problem right? With systemd, you can have like ephemeral units and so you can say like, just for now, go and run this process five hours from now. Like, and after that, just forget about it. [00:24:21] Jeremy: Yeah, the, during the workshop you mentioned systemd-run and I hadn't even heard of it. And after I saw that I was like, wow, this, this could be really useful. [00:24:32] Alvaro: It is quite useful. How have things changed at meta? [00:24:34] Jeremy: One of the things you had mentioned, I, I guess you've, you've been at Meta for, for quite a while and you were talking about how you started with having all these scripts you were running on CentOS 6 and getting off of that to something more standard. I wonder if you could speak a little bit to that, that process. Like what did things look like then and, and how have they they changed over the years? [00:25:01] Alvaro: I would say the following thing, right? Like Anita said, like for most engineers, the day to day of things don't really change that much, because this is foundational things, right? So if you have to fundamentally change the way that you run applications every couple of years, then you waste a lot of time, right? It's not the same as you say, like react where, or, or in the old days, angular where angular one, angular two, angular three, and then it's gone, right? Like, so, so I, I would say it like for the average engineers things don't change that much, uh, for the other type of engineers, like, like us who we, who that we really care about, like how things run. like having a, an API where you can like query the state of your service. Like if like asking like, is my service running with an API that returns true or false, that is actually like a volume value that you can like, Transferring in your application, uh, that, that helps a lot on, on distributed systems. a lot of like our container infrastructure that we use internally at Meta is based on a lot of these ideas and technologies. [00:26:05] Anita: Yeah, thinking back to the CentOS 6 to 7 migration, I wasn't on like the any operating systems team at the time, but I was working with them and I also was on a team that had to migrate, figure out how to migrate our scripts and things over. so the one thing that did make it easy is that the OS team, uh, we deploy all our things using Chef. Maybe you've heard like Puppet and Ansible, that's our version, the Open Source Chef code. Um, and they wrote some really good documentation on how to migrate, from Runit, which is what we were using before to systemd. it was. a very large scale effort across multiple teams to kind of make sure their stuff works, do the OS upgrade and then get used to using systemd. [00:26:54] Jeremy: And so the, the team who is performing this migration, that's not the product team. That would be the, is it production engineering? Is that, is that what you called that? [00:27:09] Alvaro: So, so I was at the other side of, of that, of that table where I, the same as Anita, we were doing the migration more how most things work at Facebook is that it's a combination of the team that is responsible for the technology and the teams who uses the technology. Right. So we are a company, so we. Can like, move together. it's the same thing when you upgrade kernels. Most of the time the kernel team will do the effort to upgrade the kernels, and when they hit a roadblock or something, they will call for the owner of the service and the owner of the service can help debug uh, for the case of CentOS 6 and CentOS 7, eh, I was the PE at Instagram P Stand for Production Engineer. I was the PE at Instagram who did most of the migration of our fleet. So I, I rewrote most of the things because I understand how our things work, and the OS team provide like the support to understanding like, like when can I use some things, when can I use not other things. There was the equivalent of ChatGPT at those days, right? I was just ask them how to do stuff. They will gimme recipes. so, so it it's kind of like, like a mix, uh, work, uh, between those two teams. Uh, Anita, maybe you can talk a little bit about what you talk when you were upgrading the version of systemd and you found a bug? [00:28:23] Anita: Oh, the, like regular systemd upgrades nowadays? I, I'd say it's a lot easier these days. I mean, since the, at the time when we did the CentOS 6 to 7 migration, it was like, our fleet was a lot more fragmented. I'd say nowadays it's a lot more homogenous, which makes, which makes it easier. yeah, in the early versions there were some kind of obscure like, interactions with the kernel or like, um, we, we make pretty heavy use of systemd to run our container system. So, uh, if we run into any corner cases, um, like pretty obscure stuff sometimes, because we make pretty heavy use of the resource control properties. we usually those end up on the GitHub tracker, things like that. [00:29:13] Alvaro: That's the side effect of hiring very smart people. They do very smart things that are very hard to understand. (laughs) [00:29:21] Jeremy: That's kind of an interesting point about you, you saying you're using these, these features, you know, of the kernel very heavily because, you're kind of running your own infrastructure, I think even your own data centers, so you're kind of forced to go to this level, it sounds like just because of the sheer number of services you're running and the fact that like, you have to find a way to pack 'em all onto the same machine. Does that, does that sound right? [00:29:54] Anita: Yeah, I'd say at, at our scale, like it's more cost effective to act, own the servers and run all everything on it ourselves versus like, you know, leasing from, uh, AWS or something, which we've also explored in the past. But that also means we need more engineers to build and run things on our servers. [00:30:16] Jeremy: Yeah. So the, the distinction between, let's say you're a, a small company or a mid-size company and you pay AWS or, or Google to, to do your hosting for you, then you may not necessarily get exposed to a lot of the, the kernel level problems or even the Linux user space problems because you're, you're working at a higher level and that's why you don't necessarily encounter those kinds of things. [00:30:46] Anita: I'd say not, not necessarily. I think, once you get even like slightly lower in the stack where you're just like on your own server, Then you will want to start really looking into like what systemd's doing, how does it interact with other, uh, services, um, on your server, and how can you like connect these different features together? [00:31:08] Alvaro: One of the things that every developer who who works like has to worry about is log right, and that, and that's the first time that you actually start interacting with systemdata available, right? So you have to understand, like maybe it's not just tail /var/log foo, but log right. Maybe it's just journalctl and it's like, what? But yeah. [00:31:32] Jeremy: Yeah. That's a good point too about whenever you're working with the operating system, like you're deploying onto a Linux machine. Regardless of the distribution, if you're the person who's responsible for that, you, you need to know this stuff. Right. Otherwise it's kind of like, you're just putting stuff out there and hoping for the best. Yeah. [00:31:54] Alvaro: Yeah. There, there's also another thing that, I dunno if I've said this before, but, a lot of the times you don't have to know these technologies, but knowing them will help you do your work better. [00:32:05] Jeremy: Yeah, totally. I mean, I think that that applies to pretty much anything in, in development, right? I, I've heard often that some people will say, you take the level that you work at currently and then kind of just go down one level. Right. And then, so you can kind of see what's underneath that. And you don't necessarily need to keep digging, cuz eventually if you keep digging, you're getting into, you know, machine instructions and whatnot. But, um, Yeah, maybe just one level is, is good to, to give you a better sense of what's happening. Production engineers need to go lower in the stack to be able to debug applications [00:32:36] Alvaro: Um, every time that I, that I, that somebody ask me like, what is the difference between a PE and a SWE, uh, software engineer, production engineer, typical conference, uh, one of the biggest difference that I, that I say is that a PE would tends to ask a lot of questions going the same thing that you're saying, we're trying to go down the stack, right? And I always ask the following question, eh, do you know how time dot sleep is implemented? Right? Do you like, like if you, if you were to see time dot sleep on your Python program, like do you actually know what is doing under the hood, right? Is it a while true? While the time, is it doing a signal interrupt? Is it doing a select on a file descriptor with a timeout? Like what is it doing? would you be able to implement it? And the reason why I say this, because like when you're debugging an application, like somebody something's using your cpu, right? And then you see that line on your code, you. You can debug every single line of your code. But also there's a lot of value to say like, no time.sleep doesn't cause CPU to spike. Right. Because it's implemented in a way that it would not be possible to do that. Meta's linux user space team [00:33:39] Jeremy: Another thing that I think might be kind of interesting to talk about is, so Meta has this Linux user space team. And I, I wonder like including your role in it, but just as a whole, like what does that actually mean day to day? Like, what are the kinds of problems people are facing that, a user space team would be handling? [00:34:04] Anita: Hmm. It's kind of large cuz now that the team's grown out to encompass a few other things as well. But I'll focus on the Linux user space part. the team started off, on the software engineering side as the systemd developer team. So our job was really to contribute to the community. and both, you know, help with, problems and bugs that show up in upstream, um, while also bringing in new features, that we think would be useful both at Meta and to like, folks, in the Linux community as a whole. so we still play a heavy role in, systemd. We also support it, uh, within the fleet, like we roll out new releases and things like that. but we're also working on a few other projects in. User space. Um, BP filter is one of them, which is, uh, how can we convert like IP tables and network filtering, into BPF programs. Um, on the production engineering side, they focus a lot on, the community engagements. So in addition to supporting CentOS they also handle, or they like support several packages in Fedora, Debian and other distributions, really figuring out how we can, be a better member of the open source community, and, you know, make connections there and things like that. [00:35:30] Jeremy: And, and what was your, your process for getting in involved with this team? Because it sounded like maybe it either didn't exist at the start, or it was really small and, and now it's really, really grown. [00:35:44] Anita: So I was kind of the first member of like the systemd team, if you would call it that. Um, it spun out of containers. So my manager at the time, who's now my director, was he kind of made a call out on workplace looking for people who'd be willing to, contribute to systemd. He was, supporting the containers team at the time who after the CentOS 7 migration, they realized the potential that systemd could have, making their jobs a lot easier when it came to developing the container backend. and so along with that, they also needed someone to help, you know, fix bugs, put in new features and things that would, tie into the goals of the containers team. Um, and eventually now our host management team, I was the first person who reached out to him and said, Hey, I wanna give this a try. I was on the security team at the time and I always had dreams of going back into like, operating systems development and getting better at it. So yeah, that's kind of how I ended up in this space. A few years later, he decided, Hey, we should build a team and you should like hire some people who will also do this with you and increase our investments in systemd. so that's how we kind of built out the Linux user space team to encompass systemd and more like operating system, projects. Working on the internal security team vs the linux userspace team [00:37:12] Jeremy: And so when you were working on the security team before, was that on software internal to meta or were you also involved with, you know, the open source, user space side as well? [00:37:24] Anita: That was all internal at the time. Which was kind of a regret because there was a lot of stuff that I would've liked to talk about externally. But I think, moving to Linux user space made me realize like, oh, there's so much more potential in open source projects, in security, which is still like very closed source from our side. [00:37:48] Jeremy: And, and so like in your experience, what have been some of the big differences? I mean, definitely getting to talk about it is a big one. but like in terms of your day-to-day, what are the big differences between working on something internal versus something that that's open source? [00:38:04] Anita: I have to talk more with external folks. we're, pretty regular members of like the systemd like conclave sync that we have with the other upstream maintainers. Um, Oh yeah. There's a lot more like cross company or an external open source community building that we have to do. it kind of puts into perspective like how we manage our time and also our relationships versus like internally, like everyone you work with works at Meta. we kind of have, uh, some shared leadership at the top. it is a little faster to turn around, um, because, you know, you can just ping people on work chat. But the, all of the systems there are closed source. So, um, there's not like this swath of people outside that you can ask about when it comes to open source things. [00:38:58] Jeremy: You can't, can't look in, discord or whatever for questions about, internal meta infrastructure to other people. It's gotta be. all in the same place. Yeah. [00:39:10] Anita: Yeah. And I'd say with like the open source projects, there's a lot of potential to tap into, expertise and talent that just doesn't exist internally. That's what I found really valuable, cuz people have really great ideas outside as well. Um, and we should like, listen to them and figure out how to build that into their systems and also ours Alvaro's work at meta [00:39:31] Jeremy: And, Avaro, I don't know when you first started, was that on internal, infrastructure and tooling as well? [00:39:39] Alvaro: Yeah, so, um, my path is different than Anita and actually my path and Anita doesn't share any common edges. so I, I don't work at the user space or the Linux kernel or anything. I always work in teams adjacent to it. Uh, but. It's always been very interesting to know these technologies, right? So I started working on Instagram and then I did a lot of the work in containers in migrations at where, where we build psystemd and also like getting to know more about that technologies. We did, uh, a small pilot on using casync which is a very old tool that like, it's only for the fans, (laughs) it's still on systemd repository, I dunno if that's used or anything, but it was like a very cool idea of how to distribute images. Uh, and in Instagram we do very fast deployments. So we deploy, or back then we used to deploy the source code, of Instagram every seven minutes, right? So every seven minutes, every time that a developer did commit to master, uh, we pushed that into production in less than an hour and we did that every seven minutes. So we were like planning to, to use those technologies for that. Um, And then I moved to another team inside of Meta, which is called Cloud Foundation, where we do a lot of like cloud infrastructure, uh, like public cloud. Uh, that's the area, that is very much not talked much about. but I keep like contributing to, to like this world. never really work on, on, on those teams inside of Meta. [00:41:11] Jeremy: So I guess it's your, your team is responsible for working with the engineers who work on product to be able to take their code and, and deploy it. And it's kind of like you work in combination with the user space team or the systemd team to make sure that what you're doing can be supported by them. Is that kind of an accurate description? [00:41:35] Alvaro: Yeah, that's, that's, that's definitely not an exhaustive description, but yeah, that's the, we, we, we do that. Public cloud at meta [00:41:42] Jeremy: It's interesting that you're, you're talking about public cloud now. So when you move to public cloud, are you using VMs kind of like you would in a data center, or is it, you're actually looking at the more managed services and things like that? [00:41:57] Alvaro: So I'm gonna take a small detour and say like, something that is funny. When I got hired by Facebook, we were working on Instagram. So Instagram was just an acquisition for, for, for meta right. And Instagram ran on AWS. So why wasn't the original team who were moving stuff from AWS into the internal data centers at Meta? On the team that I work now, uh, we work to support workloads that cannot run on meta infrastructure either for legal reasons, or for, for practical reasons. Right, because we don't have the hardware, uh, capability or legal resource because the government ask us, like, this cannot be on, on your data center or security, right? We don't wanna run this, this binary that we don't understand on our network. We do want to work in isolation. and the same thing that Anita was saying, where their team are building the common ways of using these tools, like systemd, and user space. we do the same thing, but for using cloud technologies. So in a way that is more similar to meta. So that's the detour now the, to answer your actual question, uh, we do a potpourri of things, right? So since we manage infrastructure and then teams deploy their code, they are better suited to know how their code, gets to run. Uh, with that said, we do have our preferred ways of how you would run stuff. and it's a combination of user containers, uh, open source containers, and and also like VMs There's a big difference between VMs and meta and in public cloud [00:43:23] Jeremy: So it, it sounds like in this case, you're, you're still using VMs even in public cloud, so the way that you do deployments, the location is different, but the actual software and infrastructure that you're running is, is similar. [00:43:39] Alvaro: So there's there's a lot of difference. Between the two things, right. So, the uniformity of hardware at Facebook, or our data centers, makes deploying things very simple, right? while in, in the cloud, you first, you don't get that uniformity because everybody like builds their AMIs as, as they want to build it. But also like a meta, we use, one operating system, in the cloud, you are a little bit more free of what you want. And one of the reasons why you want to go to the cloud is because you can run stuff on. On, on, on the way that that meta will run. Right? So, so even though we have something that are similar, it's not as simple like, oh, just change your deployment from like this data center to like whatever us is one think you would run. [00:44:28] Jeremy: Can, can you give an example of something where you wouldn't be able to run it on Meta's, image that they would choose to go to public cloud to run a different image for? [00:44:41] Alvaro: So, um, so in, in general, like if the government ask us, like, this is not necessarily like, like the US government, right? So, and like if the government ask us like, hey, like you need to keep this transaction on, on our territory, right? for logs, for all the reasons, for whatever, right? like, and, and we wanted to be in the place, we would have to comply. And that's where we will probably use this, this kind of technologies security is another one that is pretty good. And the other one, it is like, in it general, like, like, uh, like disaster recovery, right? If, if meta is down in a way where we cannot communicate with each other using metas technologies, right? Like you would need to have like a bootstrap point. [00:45:23] Jeremy: Is, is it the case where you are not able to put, uh, meta's image up into public cloud? Because you were, The examples you gave was more about location, right? Where you're saying we need to host in public cloud because it needs to be in this country, but then I think you were also saying the, the actual images you would use on AWS right. Would be. I don't know, maybe you'd be using Amazon Linux or maybe you'd be using a different, os entirely. And is that mainly because you're just not able to deploy the same images you have, uh, in-house? [00:46:03] Alvaro: So in, in, in general, uh, this is kind of like very hard to to explain, but, but, uh, if, if we would have to deploy code to a, machine and that machine would, would, would be accessed by people who are not like meta employees and we have no way of getting them to sign NDAs then we would not deploy meta code into that machine. Uh, because that's Sorry. No, not Pi PI's personal information. I was, uh, ip, sorry, that's that's the word. Yeah. Yeah. [00:46:31] Jeremy: So, okay. So if there's, so if you're in public cloud, there's certain things that you just won't put there just because. Those are only allowed to run on Metas own infrastructure. [00:46:44] Alvaro: Yeah Meta's bootcamp [00:46:44] Jeremy: Earlier you were talking about Instagram was an acquisition and they were in AWS were, were you there at the time or you joined, after? [00:46:54] Alvaro: No, I joined. I joined after I joined to, to meta. The way that Meta does hiring, at least for my area, is that you get hired as a production engineer, but you don't get assigned to a team. So you go through a process called boot camp where you get to try different teams and figure out what things you like. I try a couple of different teams, turns out that I like it to work at the Instagram. [00:47:15] Jeremy: And so at that time they were already running on Facebook's internal infrastructure and they had migrated off of AWS [00:47:24] Alvaro: We were on the process of finishing that migration. [00:47:28] Jeremy: So by the time you were there, yeah. Basically get, getting everything out of AWS and then into meta's internal. [00:47:35] Alvaro: Yeah. And, and, and everything is, is a very hard terms to, to define. Uh, I would say like, like most of all, like the bulk of things we were putting it in inside, like, at least what we call our Django servers. Like they were all just moving into internal infrastructure. How Anita started [00:47:52] Jeremy: This kind of touches on the, the whole boot camp thing, but, Anita, I saw that you, you interned at Facebook and then you took a position there, when you ended up taking a position, I'm kind of curious what were the different projects you looked at or, or how did you end up settling on the one you chose? [00:48:11] Anita: Yeah, I interned, um, and I joined straight out of university. I went into bootcamp similar to Alvaro and I got the chance to explore several different teams. I knew I was never gonna do UI that was just like not my thing. Um, so I focused, uh, my search on all like backend infrastructure teams. Um, obviously security, uh, was one of them because that's the team I was in interning on. Um, I also explored, the kind of testing infra team. we call it sandcastle. It runs our internal like unit tests and things. and I also explored one of the, ads infrastructure backend teams. so it was mainly just, you know, getting to know the people, um, seeing which projects appealed to me the most. Um, and then, you know, I kind of chose based on that, I, I think I've always chosen. My work based on how interesting the project sounded, uh, which has worked out in my favor as far as I could tell. How Alvaro started [00:49:14] Jeremy: How, how about, you Alvaro what were the, the different projects you looked at when you first started? [00:49:20] Alvaro: So, As a PE you do have a more restrictive, uh, number of teams that you can, that you can join. Uh, like I don't get an option to work in ui. Not that I wanted, but, (laughs) I, I, it's, it's so long ago. Uh, I remember I did look at, um, at MySQL as a team, uh, that was also one of the cool team. Uh, we had at that time, uh, distribute, uh, engine, uh, to, to run work, like if, like celery or something like that. But internally, I really like the constable distribute like workloads, um, and. I can't remember. I think I did put, come with the Messenger team, that I, I ended up having like a good relationship with their TL their tech lead, uh, but never actually like joined that team. And I believe because she have me have a, a PHP task and it was like, no, I'm not down for doing PHP [00:50:20] Jeremy: Only Python. Huh? [00:50:21] Alvaro: Exactly. Python. Python. Because it's just above C level. Psystemd [00:50:27] Jeremy: I mean related to that, you, you started the, the psystemd project. And so I wonder if you could explain what the context behind that was. Like what sparked I need to make this, this library? [00:50:41] Alvaro: So it's, it's a confluence of two things. The first one, it is like, again, if I see something that doesn't have a Python API for it, I. Feels the strong urge to create one. I have done this a couple of times, mostly internally, but also externally. that was one. And when, while we were doing the migration, I, I, I honestly, I really hate text processing. So the classical thing was like, if you wanna know if your application's running, you do systemctl, you shell out to systemctl status, then parse the output, find the, find the status column. Okay. And I didn't like that. And I start reading about like, systemd uh, and I got in contact with the or I saw like the dbus implementation of systemd. And that was, I thought that was a very interesting idea how that opened all the doors. Right? Uh, so I got a demo working like in a couple of hours. and then I said like, okay, now how do we make this pythonic? And then I created that and I just created, again, just for migrating Instagram. That was the idea. Then, uh, one of the team members who work with Anita, but also one who doesn't work with us anymore, they saw this and said like, Hey, like this looks like a good thing to open source it. So it was like, sure, like I'm happy to opensource it. So we opensource it and then we went to all System Go, which is a very nice interesting conference that happened in Berlin where like all the head for like user space get together. and, and I talk about it and people seems to like it, and that's the story of that. [00:52:15] Jeremy: And so this was replacing, I guess, like you were saying, a lot of people were shelling out and running cat commands and things like that from their Python scripts. And this was meant to be a layer on top of that. [00:52:30] Alvaro: Yes. So it, it does a couple of things. So first of all, inspecting the processes or, or like the services, getting that information out. That's one of the main usage. But also like starting or stopping or like doing all that operations that you want to do. Uh, knowing the state of, of, of services, uh, that's also another thing that people take advantage of. The other thing that people take advantage of is to modify the status of the, of the processes at runtime, like changing properties, like increasing or decreasing the CPU threshold. because systemd provides a very nice API or interface to modify the cgroups properties that otherwise you would need to kind of understand the tree structure that, uh, that, that whatever. so that's what people tend to use this mostly internally. [00:53:23] Jeremy: And so it, it sounds like at least on the production engineering side, you're primarily working in, in Python. is that something that's the teams before were using Python and so everybody just continues using Python? Or is there kind of like more structure or thought put into that? [00:53:41] Alvaro: I would say the following thing about it, um, like in in general, uh, there's, there's not a direction on which language you should use. It's pretty natural which language you should use, but with without said, there's not a Potpourri of languages inside of, of meta. most teams use c c plus plus Python and rust and that's it. There's go, that appears every once in a while there. Sorry, I should not talk about this like, like, or talk like this about this, but eh, there are team who are actually like very fond of go and they use it and they contribute a lot to that space. It's just not. That much, uh, use internally. I have always gravitated towards Python. That has been the language that teach me how to do real coding. and that's the language that got me a job at meta. So I tends to work mostly on that. Yeah. [00:54:31] Anita: Hey, you forgot hack Alvaro. Our web services. (laughs) [00:54:37] Alvaro: Yes. Yes. Uh, so I would say like, the most used language at Meta is actually PHP it's just like used by, by one particular product. That, that is the Facebook product. Yes. So our, our entire web interface, eh, or web stack uses a combination of hack, which is a compiled php, which is better than uncompiled php, also known as vanilla php. Uh, there is a lot of like GraphQL, React, and, I think that's it. [00:55:07] Anita: Infrastructure is largely like c plus plus Python, and now Rust is getting a huge following as well. [00:55:15] Alvaro: Yeah. Like, like Rust. Rust is, I I would say it's the fastest growing language inside, inside of Meta. And the thing is that there is also what you call like the bootstrap problem. Um, there's like today, if I wanted do my python program and I have a function that fails one every three times, I can add a decorator that is retry, that retries every time that something fails for a timeout, right? And that's built in and it's there used and it's documented. And I can look at source code that uses this to understand how, how works. When you start with a new language, you don't get the things. So people have to build them. So there's the bootstrap problem. [00:55:55] Jeremy: That's also an opportunity as well, right? Like if you are the ones building sort of the foundations, then you, you have an opportunity to be the ones who have the core libraries that people are, are using every day. Whereas if a language has been around a while, it's kind of, some of that stuff is already set, right? And you may or may not like the APIs, but that's what people use. So that's what we, that's what we do. One of the last things I'd kind of like to ask, so Anita, you moved into management in just the last year or two or so, and I'm kind of curious what your experience has. Been like, was that a conscious decision where you wanted to go from engineering, uh, software engineering to management? Or maybe you could talk a little bit to that. [00:56:50] Anita: Oh man, it hasn't even been a year yet. I feel like so much time has passed already. Uh, no, I never had any plans to go into management. I love being an engineer. I love being in the code. but, I'd say my, my current manager and uh, my director, you know, who hired me into the Linux user space team, kind of. Sold me a little bit on the idea of like, Hey, if you wanna like, keep pushing more projects, you wanna build out the team that you wanna see working on these things, um, you can consider going into management, taking it slow in a, what we call a T L M role, which is like a tech lead manager, role where you kind of spend some time doing development, and leading the team while also supporting, the engineers as a manager doing the hiring and the relationship building and things that you do in management. so that actually worked out quite well for me, despite Alvaro shaking his head at first. I really enjoyed being able to split my time into kind of the key projects that I really wanted to work on, um, while also supporting the engineers and having them build out, um, New features in systemd and kind of getting their own foothold in the community as well. but I'd say like in the past few months, it's been pretty crazy. I, I probably naively thought that I'd have a little more control over, I don't know. My destiny has a manager and that's like a hundred percent not true. (laughs) Um, you're, you are kind of at both the whims of your engineers and also the people above you. And you kind of have to strike that balance. But, um, my favorite part still, just being able to hide the nasty stuff away from the engineers, let them focus on their work and enjoy what engineers wanna do best, which is just like coding, designing, and like, you know, doing fun, open source stuff. [00:58:56] Alvaro: I will say like, Anita may laugh about me for, because like she's on the other side, but one thing that I least I find very cool at Meta is that managers are not seen as your boss. Right? They're still like a teammate who just basically has a different roles. This is why like when you're an engineer, you can transition to be a manager and that's, it's not considered a promotion that's considered like a, a like an horizontal step and vice versa, you can come back, right. from a manager into, into like an engineer. Yeah. [00:59:25] Jeremy: That was what I would say. And, uh, I guess when you were shaking your head, I'm guessing this means you, you don't wanna become a manager anytime soon. [00:59:35] Alvaro: So I, I never closed the door on that, but I was checking my head to the work of a tlm. Right. Uh, so the tlm TL stands for Tech Lead and m stands for manager. so you're basically both, but with the time of only one. So, uh, Anita was able to pull it off. I don't think I would be able to pull up like, double duty on that. [00:59:56] Anita: Yeah. Unfortunately I support too many people now to do the TL stuff as deeply as I used to, but I still have find some time to code a little bit here and there. [01:00:09] Jeremy: So you were talking a little bit about how things have been crazy the last few months. If, if someone is making the transition into management, like what are the kinds of things that you would tell them to, to look out for or to be aware that's coming? [01:00:27] Anita: Um, when I, before I transitioned, I talked to a lot of managers about like, oh, what was like, you know, the hardest part about management. And they all have kind of their own horror story about what happened to them when they transitioned or even like, difficult things that happened to them during management. I'd say don't expect it to be easy. you're gonna make a lot of mistakes usually in like the interpersonal relationship side, and it's really just about learning how to learn from your mistakes, pick back up and do better next time. I think, um, you know, if people like books, the Making of a Manager by Julie Jo, she was a designer, and also a manager, at then Facebook. She's no longer here. but she has a really good book on like what you can expect when you transition into management. the other thing I'd say is don't go into management without having a management chain that you can really trust. I'd say that can kind of make or break your first few years as a manager, whether you'll enjoy it or not, or even like whether you'll be able to get through the hard times. [01:01:42] Jeremy: Good point. Yeah. I mean, I think whenever you take on anything new, right? Having the support of the people above you or just around you as well is like, that makes such a big difference, right? Even like the situation can be bad, but if everyone is supportive, then you can, you can get through it. [01:02:02] Anita: Yeah, that's absolutely right. [01:02:04] Jeremy: I think that's a good place to wrap up unless either of you have anything else that you thought we should have talked about. so if people want to check out what you're working on, what you're up to, um, how can they find you? [01:02:20] Anita: well, I guess we're both on matrix now. Uh, I'm Anita Zha on Matrix, a n i t a z h A. we both have Twitters as well. If you just search up our names. Nope. Yeah, you're on Twitter. Yeah. [01:02:36] Alvaro: There is an impostor with my name, right? Actually it's not an impostor. It's just me. I just never log into Twitter anymore. [01:02:40] Anita: We both have Mastodon now as well? Yes. Fosstodon we're both frequently at conferences as well. what's, what's coming up next? I think it's, uh, devconf cZ in the Czech Republic. and then, uh, all systems go in September. [01:02:57] Alvaro: You sent something in Canada? [01:03:01] Anita: Oh, yeah. L F F L F S M M B P F is coming up. That's a, that's more of a kernel conference, though. [01:03:09] Alvaro: An acryonym that is longer than the actual word. Yes. Yeah. [01:03:12] Jeremy: That's a lot. That's a lot of letters. [01:03:14] Anita: It's a, it's a mouthful. (laughs) [01:03:18] Jeremy: That's very neat that you get to, to kind of go to all these different conferences and, and actually get, to meet the people in, in person that are, you know, working with the same things you are and, get to be in the same room. I think that's a, that's a real privilege. Yeah. [01:03:35] Anita: Yeah, for sure. [01:03:38] Jeremy: All right. Well, Anita and Alvaro, thank you so much for chatting with me today. [01:03:43] Alvaro: Thank you for hosting. [01:03:45] Anita: Yeah. Thanks for the opportunity. This is a lot of fun.

canada google strategy giving heart public berlin chefs chatgpt matrix os production scale isolation foundations sold ip pe depending react messenger ups user rust api puppets czech republic python ui bp aws metas linux github apis curious case mastodon zhang apache tl potpourri ubuntu cpu transferring amis php django red hat docker kubernetes ndas lennart mysql graphql vms tech leads ansible debian swe centos rhel bpf systemd opensuse avaro all systems go namespaces production engineer cloud foundation amazon linux jeremy you jeremy it jeremy so

David Cramer on Application Monitoring with Sentry

Choppin' It Up With PScott

Play Episode Listen Later Jun 14, 2023 76:03

Sentry is an application monitoring tool that surfaces errors and performance problems. It minimizes the need to manually look at logs or dashboards by identifying common problems across applications and frameworks. David Cramer is the co-founder and CTO of Sentry. This episode originally aired on Software Engineering Radio. Topics covered: What's Sentry? Treating performance problems as errors Why you might no need logs Identifying common problems in applications and frameworks Issues with Open Telemetry data Why front-end applications are difficult to instrument The evolution of Sentry's architecture Switching from a permissive license to the Business Source License Related Links Sentry David's Blog Sentry 9.1 and Upcoming Changes Re-Licensing Sentry Transcript You can help edit this transcript on GitHub. [00:00:00] Jeremy: Today I'm talking to David Kramer. He's the founder and CTO of Sentry. David, welcome to Software Engineering Radio. [00:00:08] David: Thanks for having me. Excited for today's conversation. What's Sentry? [00:00:11] Jeremy: I think the first thing we could start with is defining what Sentry is. I know some people refer to it as an error tracker. Some people have referred to it as, an application performance monitoring tool. I wonder if you could kind of describe in, in your words what it is. [00:00:30] David: You know, as somebody who doesn't work in marketing, I just tell it how it is. So Sentry started out doing error monitoring, which. You know, dependent on who you talk to, you might just think of as logging, right? Like that's the honest truth. It is just logging just a different shape or form. these days it's hard to not classify us as just an APM tool that's like the industry that exists. It's like the tools people understand. So I would just say it's an APM tool, right? We do a bunch of things within that space, and maybe it's not, you know, item for item the same as say a product like New Relic. but a lot of the overlap's there, so it's like errors performance, which is like latency and sort of throughput. And then we have some stuff that just goes a little bit deeper within that. The, the one thing i would say that is different for us versus a lot of these tools is we actually only do application monitoring. So we don't do any since like systems or infrastructure monitoring. Meaning Sentry is not gonna tell you when you need to replace a hard drive or even that you need new hard, like more disk space or something like that because it's just, it's a domain that we don't think is relevant for sort of our customers and product. Application Performance Monitoring is about finding crashes and performance problems that users would associate with bugs [00:01:31] Jeremy: For people who aren't familiar with the term application performance monitoring, what is that compared to just error tracking? [00:01:41] David: The way I always reason about it, this is what I tell new hires and what I would tell, like my mother, if I had to explain what I do, is like, you load Uber and it crashes. We all know that's bad, right? That's error monitoring. We capture the crash report, we send it to developers. You load Uber and it's a 30 second spinner, like a loading indicator as a customer. Same outcome for me. I assume the app is broken, right? So we also know that's bad. Um, but that's different than a crash. Okay. Sentry captures that same thing and send it to developers. lastly the third example we use, which is a little bit more. I think, untraditional, but a non-traditional rather, uh, you load the Uber app and it's like a blank screen or there's no button to submit, like log in or something like this. So it's kind of like a, it's broken, but it maybe isn't erroring and it's not like a slow thing. Right. Same outcome. It's probably a bug of some sorts. Like it's what an end user would describe it as a bug. So for me, APM just translates to there are bugs, user perceived bugs in your application and we're able to monitor and, and help the software teams sort of prioritize and resolve those, those concerns. [00:02:42] Jeremy: Earlier you were talking about actual crashes, and then your second case is, may be more of if the app is running slowly, then that's not necessarily a crash, but it's still something that an APM would monitor. [00:02:57] David: Yeah. Yeah. And I, I think to be fair, APM, historically, it's not a very meaningful term. Like I as a, when I was more of just an individual contributor, I would associate APM to, like, there's a dashboard that will tell me what's slow in my application, which it does. And that is kind of core to APM, but it would also, none of the traditional tools, pre sentry would actually tell you why it's broken, like when there's an error, a crash. It was like most of those tools were kind of useless. And I don't know, I do actually know, but I'm gonna pretend I don't know about most people and just say for myself. But most of the time my problems are errors. They are not like it's fast or slow, you know? and so we just think of it as like it's a holistic thing to say, when I've changed the application and something's broken, or it's a bug, you know, what is that bug? How do we help people fix it? And that comes from a lot of different, like data signals and things like that. the end result is still the same. You either are gonna fix it or it's not important and you ignore it. I don't know. So it's a pretty straightforward, premise for us. But again, most companies in the space, like the traditional company is when you grow a big company, what happens is like you build one thing and then you build lots of check boxes to sell more things. And so I think a lot of the APM vendors, like they've created a lot of different products. Like RUM is a good example of another acronym that lives with an APM. And I would tell you RUM is completely meaningless. It, it stands for real user monitoring. And so I'm like, well, what's not real about monitoring the application? Well, nothing's not real, but like they created a new category because that's how marketing engines work. And that new category is more like analytics than it is like application telemetry. And it's only because they couldn't collect the app, the application telemetry at the time. And so there's just a lot of fluff, i would say. But at the end of the day too, like developers or engineering teams, it's like new version of the application. You broke something, let's tell you about it so you can fix it. You might not need logging or performance monitoring [00:04:40] Jeremy: And, and so earlier you were saying how this is a kind of logging, but there's also other companies, other products that are considered like logging infrastructure. Like I, I would think of companies like Paper Trail or Log Tail. So what space does Sentry fill that's that's different than that kind of logging? [00:05:03] David: Um, so the way I always think about it, and this is both personally true, and what I advise other folks is when you're building something new, when you start from zero, right, you can often take Sentry put it in, and that's good enough. You don't even need performance monitoring. You just need like errors, right? Like you're just causing bugs all the time. And you could do that with logging, but like the delta between air monitoring and logging is night and day. From a user experience, like error monitoring for us, or what we built at the very least, aggregates the errors. It, it helps you understand the frequency. It helps you when they're new versus old. it really gives you a lot of detail where logs don't, and so you don't need logging often. And I will tell you today at Sentry. Engineers do not use logs for the most part. Uh, I had a debate with one of our, our team members about it, like, why does he use logs recently? But you should not need them because logs serve a different purpose. Like if you have traces which tell you like, like fast and slow in a bunch of other network data and you have this sort of crash report collection or error monitoring thing, logs become like a compliance or an audit trail or like a security forensics, tool, and there's just not a lot of value that you would get out of them otherwise, like once in a while, maybe there's like some weird obscure use case, but generally speaking, you can just pretend that you don't need logs most days. Um, and to me that's like an evolution of the industry. And so when, when Sentry is getting started, most people were still logs. And if you go talk to SRE teams, they're like, oh, login is what we know. Some of that's changed a little bit, but. But at the end of the day, they should only be needed for more complicated audit trails because they're just not a good solution to the problem. It's just free form data. Structured or not, doesn't really matter. It's not aggregated. It's not something that you can really use. And it's why whenever you see logging tools, um, not even the papertrails of the world, but the bigger ones like Splunk or Cabana, it's like this weird, what we describe as choose your own adventure. Like go have fun, build your dashboards and try to make the logs useful kind of story. Whereas like something like Sentry, it's just like, why would you waste any time trying to build dashboards when we can just tell you when something new is broken? Like that's the ideal situation. [00:06:59] Jeremy: So it sounds like maybe the distinction is with a more general logging tool, like you mentioned Splunk and Kibana it's a collection of all this information. of things happening, even though nothing's necessarily wrong, whereas Sentry is more Sentry is it's going to log things, but it's only going to log things if Sentry believes something is wrong, either because of a crash or because of some kind of performance issue. People don't want to dig through logs or dashboards, they want to be told when something is wrong and whyMost software is built the same way, so we know common problems [00:07:28] David: Yeah. I, i would say it's about like actionability, right? Like, like nobody wants to spend their time digging through logs, digging through dashboards. Metrics are another good example of this. Like just charts with metrics on them. Yeah. They tell me something's happening. If there's lots of log statements, they tell me something's going on, but they're not, they're not optimized to like, help me solve a problem, right? And so our philosophy was always like, we haven't necessarily nailed this in all cases for what it's worth, but. It was like, the goal is we identify an actual problem, like close to like a root cause kind of problem, and we escalate that up and that's it. Uh, versus asking somebody to like go have to like build these dashboards, build these things, figure out what data matters and all this because most software looks exactly the same. Like if you have a web service, it doesn't matter what language it's written in, it doesn't matter how different you think your architecture is from somebody else's, they're all the same. It's like you've got a request, you've got a database, you've got some cache, you've got all these like known, known quantity things, and the slowness comes from the same places. Errors are structured while logs are not [00:08:25] David: The errors come from the same places. They're all exhibiting the same kinds of behavior. So logging is very unstructured. And what I mean by that is like there's no schema. Like you can hypothetically like make it JSON and everybody does that, but it's still unstructured. Whereas like errors, it's, it's a tight schema. It's like there's a type of error, there's a message for the error, there's a stack trace, there's all these things that you know. Right. And as soon as you know and you define those things, you can just build better products. And so distributed tracing is similar. Hypothetically, it's a little bit abstract to be fair, but hypothetically, distributed tracing is creating a schema out of basically network annotations. And somebody will yell at me for just simplifying it to that. I would tell 'em that's what it is. But, same goal in mind. If you know what the data is, you can take action on it. It's not quite entirely true. Um, because tracing is much more freeform. For example, it doesn't say if you have a SQL statement, it should be like this, it should be formatted this way, things like that. whereas like stack traces, there's a file name, there's there's a line number, there's like all these things, right? And so that's how I think about the delta between what is useful information and what isn't, I guess. And what allows you to actually build things like Sentry versus just build abstract exploration. Inferring problems rather than having user identify them [00:09:36] Jeremy: Kind of paint the picture of how someone would get started with a tool like Sentry. Do they need to tell Sentry anything about their application? Do they need to modify their source code at all? give us a picture of how that works. [00:09:50] David: Yeah, like one of our fundamentals, which I think applies for any real business these days is you've gotta like reduce user friction, right? Like you've gotta make it dead simple to use. Uh, and for us there were, there was like kind of a fundamental driving constraint behind that. So in many situations, um, APM vendors especially will require you to run an agent a basically like some kind of process that runs on your servers somewhere. Well, if you look at modern tech stacks, that doesn't really work because I don't run the servers half my stuff's in the browser, or it's a mobile app or a desktop app, and. Even if I do have those servers, it's like an entirely different team that controls them. So deploying like a sidecar, an agent is actually like much more complicated. And so we, we looked at that and also because like, it's much easier to have control if you just ship within the application. We're like, okay, let's build like an SDK and dependency that just injects into the, the application that runs, set an API key and then you're done. And so what that translates for Sentry is we spend a lot of time knowing what Django is or what Rails is or what expresses like all these frameworks. And just knowing how to plug into the right signals in those frameworks. And then at that point, like the user doesn't have to do anything. And so like the ideal outcome for Sentry is like you install the dependency in whatever language makes sense, right? You somehow configure the API key and maybe there's a couple other minor settings you add and that gives you the bare bones and that's it. Like it should just work from there. Now there's a lot you can do on top of that to enrich data and whatnot, but for the most part, especially for errors, like that's good enough. And that, that's always been a fundamental goal of ours. And I, I think we actually do it phenomenally well. [00:11:23] Jeremy: So it sounds like it infers things about the application without manual configuration. Can you give some examples of the kind of things that Sentry knows without the user having to tell it? [00:11:38] David: Yeah. So a good example. So on the errors side, we know literally everything because an error object in each language has all these attributes with it. It, it gives you the stack trace, it gives you a lot of these things. So that one's straightforward. On the performance side, we use a combination of leveraging some like open source, I guess implementations, like open telemetry where it's got all this instrumentation already and we can just soak that in, um, as well as we automatically instrument a bunch of stuff. So for example, say you've got like a Python application and you're using, let's say like SQL Alchemy or something. I don't actually know if this is how our SDK works right now, but, we will build something that's aware of that library and make sure it can automatically instrument the things it needs to get the right information out of it. And be fair. That's always been true for like APM vendors and stuff like that. The delta is, we've often gone a lot deeper. And so for Python for example, you plug it into an application, we'll capture things like the error, error object, which is like exception class name exception value, right? Stack trace, file, name, line number, all those normal things, function name. We'll also collect source code. So we'll, we'll give you sort of surrounding source code blocks for each line in the stack trace, which makes it infinitely easier to consume. And then in Python and, and php, and I forget if we do this anywhere else right now, we'll actually even allow you to collect what are called stack locals. So it'll, it'll give you basically the variables that are defined almost like a debugger. And that is actually, actually like game changing from a development point of view. Because if I can go look in production when there's an incident or a bug and I can actually see the state of the application. , I, I never need to know like, oh, what was going on here? Oh, what if like, do I need to go reproduce this somehow? I always have the right information. And so all of that for us is automatic and we only succeed like, it, it's, it's like by definition inside of Sentry, it has to be automatic. Like if we ask the user to do anything whatsoever, we're failing. And so whenever we design any product or anything, and to be fair, this is how every product company should operate. it's gotta be with as little user input as humanly possible. And so you can't always pull that off. Sometimes you have to have users configure stuff, but the goal should always be no input. Detecting errors through unhandled exceptions [00:13:42] Jeremy: So you, you're talking about getting a stack trace, getting, the state of variables, source code. That sounds like that's primarily gonna be through unhandled exceptions. Would you say that's, that's the primary way that you get error? [00:13:58] David: Yeah, you can integrate in other ways. So you can like trigger our API to capture an, uh, an exception. You can also, for better or worse, it's not always good. You can integrate through logging adapters. So if you're already using a logging framework and you log their errors there, we can often capture those. However, I will say in most cases, people use the logging APIs wrong and the data becomes junk. A good, a good example of this is like, uh, it varies per language. So I'm just gonna go to Python because Python is like sort of core to Sentry. Um, in Python you have the ability to log messages, you can log them as errors, you can log like actual error objects as errors. But what usually happens is somebody does a try-catch. They, they capture the error they rescue from it. They create a logging call, like log dot error or something, put the, the error message or value in there. And then they send that upstream. And what happens is the stack trace is gone because we don't know that it's an error object. And so for example, in Python, there's actually an an A flag. You pass the logging call to make sure that stack trace stays present. But if you don't know that the data becomes junk all of a sudden, and if we don't have a stack trace, we can't actually aggregate data because like there's just not enough information to like, to run hashing on it. And so, so there are a lot of ways, I guess, to capture the information, but there are like good ways and there are bad ways and I think it, it's in everybody's benefit when they design their, their apt to like build some of these abstractions. And so like as an example, when, whenever I would start a new project these days, I will add some kind of helper function for me to like log an exception when I like, try catch and then I can just plug in whatever I need later if I want to enrich the data or if I wanna send that to Sentry manually or send it to logs manually. And it just makes life a lot easier versus having to go back and like augment every single call in the code base. [00:15:37] Jeremy: So it, it sounds like. When you're using a tool like Sentry, there's gonna be the, the unhandled exceptions, which are ones that you weren't expecting. So those should I guess happen without you catching them. And then the ones that you perhaps do anticipate, but you still consider to be a problem, you would catch that and then you would add some kind of logging statement to your code that talks to Sentry directly. Finding issues like performance problems (N+1 queries) that are not explicit errorsz [00:16:05] David: Potentially. Yeah. It becomes a, a personal choice to be fair at that, at that point. but yeah, the, the way, one of the ways we've been thinking about this lately, because we've been changing our error monitoring product to not just be about errors, so we call it issues, and that's in the guise of like, it's like an issue tracker, a bug tracker. And so we started, we started putting what are effectively like, almost like static analysis concerns inside of this issue tracker. So for example, In our performance monitor, we'll do something called like detect n plus one queries, which is where you execute a, a duplicate query in a loop. It's not necessarily an error. It might not be causing a problem, but it could be causing a problem in the future. But it's like, you know, the, the, the qualities of it are not the same as an error. Like it's not necessarily causing the user to experience a bug. And so we've started thinking more about this, and, and this is the same as like logging errors that you handle. It's like, well, they're not really, they're not really bugs. It's like expected behavior, but maybe you still want to keep it like tracking somewhere. And I think about like, you know, Lins and things like that, where it's like, well, I've got some things that I definitely should be fixing. Then I've got a bunch of other stuff that's like informing me that maybe I should take action on or not. But only I, the human can really know at the end of the day, right, if I, if I should prioritize that or not. And so that's how I kind of think about like, if I'm gonna try catch and then log. Yeah, you should probably collect that data. It's probably less important than like the, these other concerns, like, like an actual unhandled exception. But you do, you do want to know that they're happening and whatnot. And so, I dunno, Sentry has not had a strong opinion on this historically. We're just like, send us whatever you want to capture in this regard, and you can pay for it, that's fine. It's like usage based, you know? we're starting to think a lot more about what should that look like if we, if we go back to like, what's the, what's the opinion we have for how you should use the product or how you should solve these kinds of software problems. [00:17:46] Jeremy: So you gave the example of detecting n plus one queries is, is that like being aware of the framework or the ORM the person is using and that's how you're determining this? Or is it at more of a lower level than that? [00:18:03] David: it is, yeah. It's at the framework level. So this is actually where Open Telemetry causes a lot of harm, uh, for us because we need to know what a database query is. Uh, we need to know like the structure of the query because we actually wanna parse it out in a lot of cases. Cause we actually need to identify if it's duplicate, right? And we need to know that it's a database query, not a random annotation that you've added. Um, and so what we do is within these traces, which is like if you, if you don't know what a trace is, it's basically just like, it's a tree, like a tree structure. So it's like A calls B, calls C, B also calls D and E and et cetera, right? And so you just, you know, it's a trace. Um, and so we actually just look at that trace data. We try to find these patterns, which is like, okay, B was a, a SQL query or something. And every single sibling of B is that same SQL query, but sort of removing certain parameters and stuff for the value. So we'll look at that data and we'll try to pull out anomalies. So m plus one is an example of like a fairly obvious anti pattern that everybody knows is bad and can be optimized. Uh, but there's a lot of other that are a little bit more subjective. I'll give you an example. If you execute three SQL statements back to back, one could argue that you could just batch those SQL statements together. I would argue most of the time it doesn't matter and I don't need to do that. And also it's not guaranteed that that is better. So it becomes much more like, well, in my particular situation this is valuable, but in this other situation it might not be. And that's where I go back to like, it's almost like a linter, you know? But we're trying to infer all of that from the data stream. So, so Sentry's kind of, we're kind of a backwards product company. So we build our product from a technology vision, not from customers want this, or we have this great product vision or anything like that. And so in our case, the technology vision is like, there's a lot of application data that comes in, a lot of telemetry, right? Errors, traces. We have a bunch of other streams now. within that telemetry there is like signal. And so one, it's all structured data so we know what it is so we can actually interpret it. And then we can identify that signal that might be a problem. And that signal in our case is often going to translate to like this issue concept. And then the goal is like, well, can we identify these problems for people and surface them versus the choose your own adventure model, which is like, we'll just capture everything and feed it to the user and they can figure out what matters. Because again, a web service is a web service. A database is a database. They're all the same problems for everybody. All you know, it's just, and so that's kind of the model we've built and are continuing to evolve on and, and so far works pretty well to, to curate a lot of these workflows. Want to infer everything, but there are challenges [00:20:26] Jeremy: You talked a little bit about how people will sometimes use tracing. And in cases like that, they may need some kind of session ID to track. Somebody making a call to a service and that talks to a database and that talks to other services. And you, inside of your application, you have to instrument some way of tracking. This all came from this one request. Is that something that Sentry can infer or is there something that the developer has to put into play so that you can track that sort of thing? [00:21:01] David: Yeah, so it's, it's like a bit of both. And i would say our goal is that we can infer everything. The reality is there is so much complexity and there's too much of a, like, too many technologies in the world. Like I was complaining about this the other day, like, the classic example on web service is if we have a middleware hook, We kind of know request response, usually that's how middleware would work, right? And so we can infer a lot from there. Like basically we can infer the boundaries, which is a really big deal. Okay. That's one thing is boundaries is a problem. What we, we describe that as a transaction. So like when the request starts. When the request ends, right? That's a very important boundary for everybody to understand because when I'm working on the api, I care about the API boundary. I actually don't care about what the database is doing at its low level or what the JavaScript application might be doing above it. I want my boundary. So that's one that we kind of can do. But it's hard in a lot of situations because of the way frameworks and technology has been designed, but at least traditional stuff like a, a traditional web stack, it works like a Rails app or a DDjango app or PHP app kind of thing, right? And then within that it becomes, well, how do you actually build a trace versus just have a bunch of arbitrary labels? And so we have a bunch of complicated tech within each language that tries to establish that tree. and then we annotate a lot of things along the way. And so we will either leverage Open Telemetry, which is an open format spec that ideally has very high quality data. Ideally, not realistically, but ideally it has high quality data. Every library author implements it great, everybody's happy. We don't have to do anything ever again. The reality is that data is like all over the map because there's not like strict requirements for what, how the data should be labeled and stuff. And not everything even has that data. Like not everything's instrumented with open telemetry. So we also have a bunch of stuff that, unrelated to using that we'll say, okay, we know what this library is, we're gonna try to infer some characteristics from this library, or we know what maybe like the DDjango template engine is. So we're gonna try to infer like when the template renders so you can capture that block of information. it is a very imperfect science and I would tell you like it's not, even though like Open Telemetry is a very fun topic for people. It is not necessarily good, like it's not in a good state. Could will it ever be good? I don't know in all honesty, but like the data quality is like all over the map and so that's honestly one of our biggest challenges to making this experience that, you know, tells you what's going on in your database so it tells you what's going on with the cash or things like this is like, I dunno, the cash might be called something completely random in one implementation and something totally different in another. And so it's a lot of like, like data normalization that you have to deal with. But for the most part, those libraries of things you don't control can and will be instrumented. Now the other interesting thing, which we'll see how this works out, so, so one thing Sentry tries to do there, we have all these layers of telemetry, so we have errors and traces, right? Those are pretty high level concepts. We also have profiling data, which is very, very, very, very low level. So it's usually only if you have like disc. I like. It's where is all the CPU time being spent in my application? Mostly not waiting. Like waiting's usually like a network call, right? But it's like, okay, I have a loop that's doing a lot of math, or I'm writing a bunch of stuff to disc and that's really slow. Like often those are not instrumented or it's like these black box areas of a performance. And so what we're trying to do with profiling data, instead of just showing you flame charts and stuff, is actually say, could we fill in these gaps in these traces? Like basically like, Hey, I've got a long period of time where the app's doing something. You know, here's an API call, here's the database stuff. But then there's this block, okay, what's that function or something? Can we pull that out of the profiling data? And so in that case, again, that's just automatic because the profile actually knows everything about the application and know it. It has full access to the function and the stack and everything, right? And so the dream is that you would just always have everything filled in the, the customer never has to do anything with one minor asterisk. And the asterisk is what I would call like business context. So a good example would be, You might wanna associate requests with a specific customer or something like that. Like you might wanna say, well it's uh, I don't know, Goldman Sachs or one of these big companies or something. So you can know like, well when Goldman Sachs is having performance issues or whatever it is, oh maybe I should focus on them cuz maybe they pay you a lot of money or something. Right. Sentry would never know that at the end of the day. So we also have these like kind of tagging contextual APIs that will say like, tell us some informations, maybe it's like customer, maybe it's something else that's relevant to your application. And we'll keep that data associated with the telemetry that's like present, you know, um, but the, at least the telemetry, like again, application's just worth the same, should be, there should be a day in the next few years that it's just all automatic. and again, the only challenge today is like, can it be high quality and automatic? And so that, that's like to be determined. [00:25:50] Jeremy: What you're kind of saying is the ideal is being able to look at this profiling information and be able to build a full picture of. a, a call from beginning to end, all the different things to talk to, but I guess what's the, what's the reality today? Like, what, what is Sentry able to determine, in the world we live in right now? [00:26:11] David: So we've done a lot of this like performance detection stuff already. So we actually can do a lot now. We put a lot of time into it and I, I will tell you, if you look at other tools trying to do tracing, their approach is much more abstract. It's like your traditional monitoring tool that's like, we're just gonna collect a lot of signals and maybe we'll find magic anomaly detection or something going on in it, which, you know, props, but that can figure that out. But, a lot of what we've done is like, okay, we kind of know what this data looks like. Let's go after this very like known quantity problem. Let's normalize the data. And let's make it happen like that's today. Um, the enrichment of profiles is new for us, but it, we actually can already do it. It's not perfect. Detection of blocking the UI thread in mobile apps [00:26:49] David: Um, and I think we're launching something in April or May, something around the, that timeframe where hopefully for the, the technologies we can instrument, we're actually able to surface that in a useful way. but as an example that, that concept that I was talking about, like with n plus one queries, the team built something using profiling data. and I think this, this might be for like a mobile app more so than anything where mobile apps have this problem of, it's, you've got a main thread and if you block that main thread, the app is basically frozen. You see this on desktop apps all the time. You, you very rarely see it on web apps anymore. But, but it's a really big problem when you have a web, uh, a mobile or desktop app because you don't want that like thing to be non-responsive. Right? And so one of the things they did was detect when you're doing like file io on the main thread, you know, right. When you're writing a disc, which is probably a slow thing or something like that, that's gonna block the whole thing. Because you should just do it on a separate thread. It's like an easy fix, potentially may not be a problem, but it could become a problem. Same thing as n plus one. But what's really interesting about it is what the team did is like they used the profiling data to detect it because we already know threads and everything in there, and then they actually recreated a stack trace out of that profiling data when it's surfaced. So it's actually like useful data with that. You could like that I or you as a developer might know how to take and actually be like, oh, this is where it happens at the source code. I can actually figure it out and go fix it myself. And to me, like as like I, I'm still very much in the weeds with software that is like one of the biggest gaps to most things. Is it just, it doesn't make it easy to consume or like take action on, right? Like if I've got a, a chart that says my error rate is high, what am I gonna do with that? I'm like, okay, what's breaking? That's immediately my next question. Right? Okay. This is the error. Where is that error happening at? Again, my next question, it, it's literally just root cause analysis, right? Um, and so that, that to me is very exciting. and I, I don't know that we're the first people to do that, I'm not sure. But like, if we can make that kind of data, that level of actionable and consumable, that's like a big deal for me because I will tell you is like I have 20 years of software experience. I still hate flame charts and like I struggle to use them. Like they're not a friendly visualization. They're almost like a, a hypothetically necessary evil. But I also think one where nobody said like, do we even need to use that? Do we need that to be like the way we operate? and so anyways, like I guess that's my long-winded way of saying like, I'm very excited for how we can leverage that data and change how it's used. [00:29:10] Jeremy: Yeah. So it sounds like in this example, both in the mobile app blocking the UI or the n plus one query is the Sentry, suppose, SDK or instrumentation that's hooked inside of your application. There are certain behaviors that it knows are, are not like ideal I guess, just based on. people's prior experience, like your own developers know that, hey, if you block the UI thread in this mobile application, then you're gonna have performance problems. And so that way, rather than just telling you, Hey, your app is slow, it can tell you your app is slow and it's because you're blocking the UI thread. Don't just aggregate metrics, the error tracker should have an opinion on what actual problems are [00:29:55] David: Exactly, and I, and I actually think, I don't know why so many people don't recognize this gap, because at the end of the day, like, I don't know, I don't need more people to tell me response times are bad or anything. I need you to have an opinion about what's good because. The only way it's like math education, right? Like, yeah, you learn the basics, but you're not expected to say, go to calc, but, and then like, do all the fundamentals. You're like, don't get a calculator and start simplifying the problem. Like, yeah, we're gonna teach you a few of these things so you understand it. We're gonna teach you how to use a calculator and then just use the calculator and then make it easier for everybody else. But we're also not teaching you how to build a calculator because who cares? Like, that's not the purpose of it. And so for me, this is like, we should be helping people sort of get to the finish line instead of making them run the entirety of the race over and over if they don't need to. I don't, I don't know if that's a good analogy, but that has been the biggest gap, I think, in so much of this software throughout the industry. And it's, it's, it's common everywhere. And there's no reason for that gap to exist these days. Like the technology's fine. And the technology's been fine for like 10 years. Like Sentry started in oh eight at this point. And I think there was only one other company I recall at the time that was doing anything that was even similar to like air monitoring and Sentry when we built it, we're just like, what if we just go deeper? What if we collect all this information that will help you debug the problem instead of just stopping it like a log aggregator or something kind of thing, so we can actually have an opinion about it. And I, I genuinely, it baffles me that more people do not think this way because it was not a hard problem at the time. It's certainly not hard these days, but there's still very, I mean, a lot more people do it now. They've seen Sentry successful and there's a lot of similar implementations, but it's, it's just amazes me. It's like, why don't you, why don't people try to make the data more actionable and more useful, the teams versus just collect more of it, you know? 40 people working on learning the common issues with languages and frameworks [00:31:41] Jeremy: it, it sounds like maybe the, the popularity of the stack the person is using or of the framework means that you're gonna have better insights, right? Like if somebody makes a, a Django application or a Rails application, there's all these lessons that your team has picked up in terms of, Hey, if you use the ORM this way, your application is gonna be slow. Whereas if somebody builds something totally homegrown, you won't know these patterns and you won't be able to like help as much basically. [00:32:18] David: Yeah. Yeah, that's exactly, and, and you might think that that is a challenge, but then you look at how many employees exist at like large tech companies and it's, it's not that big of a deal, like, , you might even think collecting all the information for each, like programming, runtime or framework is a challenge. We have like 40 people that work on that and it's totally fine. Like, and, and so I think actually all these scale just fine. Um, but you do have to understand like the domain, right? And so the counter version of this is if you look at say like browser applications, like very rich, uh, single page application type experiences. It's not really obvious like what the opinions are. Like, like if, if you, and this is like real, like if you go to Sentry, it's, it's kind of slow, like the app is kind of slow. Uh, we even make fun of ourselves for how slow it is cuz it's a lot of JavaScript and stuff. If you ask somebody internally, Hey, how would we make pick a page fast? They're gonna have no clue. Like, even if they have like infinite domain experience, they're gonna be like, I'm not entirely sure. Because there's a lot of like moving parts and it's not even clear what like, like good is right? Like we know n plus one is bad. So we can say not doing that is the better solution. And so if you have a JavaScript app, which is like where a lot of the slowness will come from is like the render times itself. Like how do you fix it? You, you can't actually build a product that tells you what to fix without knowing how to fix it, right? And so some of these newer and very fast moving targets are, are frankly very difficult for us. Um, and so that's one thing that I think is a challenge for the entire industry. And so, like, as an example, a lot of the browser folks have latched onto web vitals, which are just metrics that hopefully tell you something about the application, but they're not always actionable either. It'll be like, the idea with like web vitals is like, okay, time to interactive is an an important metric. It's like how long until the page loads that a user can do what they're probably there to do. Okay. Like abstractly, it makes sense to us, but like put into action. How do I optimize time to interactive? Don't block the page. That's one thing. I don't know. Defer assets, that's another thing. Okay. So you've gotta like, you've gotta build a technology that knows these assets could be deferred and aren't. Okay, which ones can be deferred? I don't know. Like, it, it, it's like such a deep rabbit hole. And then the problem is, six months from now, the tech will have completely changed, right? And it won't have like, necessarily solved some of these problems. It will just have changed and they're now a completely different shape of problem. But still the same fundamental like user experience is the same, you know? Um, and to me that's like the biggest challenge in the industry right now is that like dilemma of the browser at the end of the day. And so even from our end, we're like, okay, maybe we should step back, focus on servers again, focus on web services. Those are known quantities. We can do that really well. We can sort of change that to be better than it's been in the past and easier to consume with things like our n plus one detections. Um, and then take like a holistic, fresh look at browser and say, okay, now how would we solve this to make sure we can actually really latch onto the problems that like people have and, and we understand, right? And, you know, we'll see when we get there. I don't think any product does a great job these days for helping, uh, solve those problems. . But I think even without the, the products, like I said, like even our team would be like, fixing this is gonna take months because it's gonna take months just to figure out exactly where the, the common bottlenecks are and all these other things within an application. And so I, I guess what I mean to say with that is there's a lot of opportunity, I think with the moving landscape of technology, we can find a way to, whether it's standardized or Sentry, can find a way to make that data actionable want it something in between there. There are many ways to build things on the frontend with JavaScript which makes it harder to detect common problems compared to backend [00:35:52] Jeremy: So it sounds like what you're saying, With the, the back end, there's almost like a standard way of doing things or a way that a lot of people do it the same way. Whereas on the front end, even if you're looking at a React application, you could look at tenant react applications and they could all be doing state management a totally different way. They could be like the, the way that the application is structured could be totally different, and that makes it difficult for you to infer sort of these standard patterns on the front end side. [00:36:32] David: Yeah, that's definitely true. And it, it goes, it's even worse than that because well, one, there's just like the nature of JavaScript, which is asynchronous in the sense of like, it's a lot of callbacks and things like that. And so that already makes it hard to understand what's going on, uh, where things are happening. And then you have these abstractions like React, which are very good, but like they pull a lot of that away. And so, as an example of a common problem, you load the application, it has to do a lot of stuff to make the page render. You might call that hydration or whatever. Okay. And then there's a completely different state, which is going from, it's already hydrated. Page one, I, I've done an interaction or something. Or maybe I've navigated a page too, that's an entirely different, like, sort of performance problem. But that hydration time, that's like a known thing. That's kind of like time to interactive, right? But if the problem is in your framework, which a lot of it is like a lot of the problems today exist because of frameworks, not because of the technology's bad or the framework's bad, but just because it's abstracted and it's really hard to make it work in all these situations, it's complicated. And again, they have the same problem where it's like changing non sem. And so if the problem is the framework is somehow incorrectly re rendering the page as an example, and this came up recently, for some big technology stack, it's re rendering the page. That's a really bad problem for the, the customer because it's making the, it's probably actually causing a lot of CPU seconds. This is why like your Chrome browser tabs are using so much memory in cpu, right? How do you fix that? Can you even fix that? Do you just say, I don't know, blame the technology? Is that the solution? Maybe that is right, but how would we even blame the technology like that alone, just to identify why it's happening. and you need to know the why. Right? Like, that is such a hard problem these days. And, and personally, I think the only solution is if the industry sort of almost like standardizes on a way to like, on a belief of how this should be optimized and how it should be measured and monitored kind of thing. Because like how errors work is like a standardization effectively. It may not be like a formal like declaration of like, this is what an error is, but more or less they always have the same attributes because we've all kind of understood that. Like those are the valuable things, right? Okay. I've got a server rendered application that has client interaction, which is sort of the current generation of the technology. We need to standardize on what, like that web request, like response life cycle is, right? and what are the moving targets within there. And it just, to me, I, I honestly feel like a lot of what we use every day in technology is like beta. Right. And it's, I think it's one of the reasons why we're constantly always having to up, like upgrade and, and refactor and, and, and shift dependencies and things like that because it is not, it's very much a prototype, right? It's a moving target, which I personally do not think is great for the industry because like customers do not care. They do not care that you're using some technology that like needs a change every few months and things like that. now it has improved things to be fair. Like web applications are much more like interactive and responsive sometimes. Um, but it is a very hard problem I think for a lot of people in the world. [00:39:26] Jeremy: And, and when you refer to, to things feeling like beta, I suppose, are, are you referring to the frameworks people are using or the libraries they're using to support their front end development? I, I'm curious what you're, you're thinking there. [00:39:41] David: Um, I think it's everything. Even like the browser APIs are constantly shifting. It's, that's gotten a little bit better. But even the idea like type script and stuff, it's just like we're running like basically compilers to make all this code work. And, and so the, even that they're constantly adding features just because they can, which means behaviors are constantly changing. But like, if you look at a real world example, like React is like the, the most dominant technology. It's very well designed for managing the dom. It's basically just a rendering engine at the end of the day. It's like it's managed to process updates to the dom. Okay. Makes sense. But we've all learned that these massive single page applications where you build all your application logic and loaded into a bundle is a problem. Like, like, I don't know how big Sentry's bundle is, but it's multiple megs in size and it takes a little while for like a, even on fast fiber here in the Bay Area, it takes a, you know, several seconds for the UI to load. And that's not ideal. Like, it's like at some point half of us became okay with this. So we're like, okay, what we need to do is go back, literally just go back 10 years and we need to render it on the server. And then we need some stuff that makes interactions, you know, highly responsive in the UI or dynamic content in the ui, you know, bring, it's like bringing back jQuery or something. And so we're kind of going full circle, but that is actually like very complicated because the way people are trying to do is like, okay, we wanna, we wanna have the rendering engine operate the same on the server and is on as on the client, right? So it's like we just write one, path of code that basically it's like a template engine to some degree, right? And okay, that makes sense. Like we can all get behind that kind of model. But that is actually really hard to make work with a lot of people's software and, and I think the challenge and framers have adopted it, right? So they've taken this, so for example, it's like, uh, react server components, which is basically just like, can we render it on the server and then also keep that same interaction in the ui. But the problem is like frameworks take that, they abstract it and so it's another layer of complexity on something that is already enormously complex. And then they add their own flavor onto it, like their own opinions for maybe what the world way the world is going. And I will say like personally, I find those. Those flavors to be very hard to adapt to like things that are tried and true or importantly in this context, things that we know how to monitor and fix, right? And so I, I don't know what, what the be all end all is, but my thesis on this is you need to treat the UI like a template engine, and that's it. Remove all like complexity behind it. And so if you think about that, the term I've labeled it as, which I did not come up with, I saw this from somebody at some point, is like, it's like your front end as a service. Like you need to take that application that renders on the server and the front end, and it's just an entirely different application, which is annoying. and it just calls your APIs and that's how it gets the data it needs. So you're literally just treating it as if it's like a single page application that can't connect to your database. But the frameworks have not quite done that. And they're like, no, no, no. We'll connect to the database and we'll do all this stuff, but then it doesn't work because you've got, like, it works this way on the back end and this way on the front end anyways. Again, long winded way of saying like, it's very complicated. I don't think the technology can solve it today. I think the technology has to change before these problems can actually genuinely become solvable. And that's why I think the whole thing is like a beta, it's like, it's very much like a moving target that we're eventually we'll get there and it's definitely had value, but I don't know that, um, responsiveness for low latency connections is where the value has been created. You know, for like folks with bad internet and say remote Africa or something, like I'm sure the internet is not a very fun place for them to use these days. Some frontend code runs on the server and some in the browser which creates challenges [00:43:05] Jeremy: I guess one of the things you mentioned is there's this, almost like this split where you have the application running on the server. It has its own set of rules because it, like you said, has access to the database and it can do things that you can't do in the browser, and then you have to sort of run the same application in the browser, but it's not quite the same application because it doesn't have access to the same things in the browser. So you have this weird disconnect, I suppose. [00:43:35] David: Yeah. Yeah. And, and, and then the challenges is like a developer that's actually complicated for you from the experience point of view, cuz you have to know somehow, okay, these things are ta, these are actually running on the server and only on the server. And like, so I think the two biggest technologies that try to do this, um, or at least do it well enough, or the two that I've used, there might be some others, um, are NextJS and remix and they have very different takes on how to do this. But, remix is the one I use most recently. So I, I'll comment on that. But like, there's a, a way that you kind of say, well, this only runs on, I think the client as an example. And that helps you a little bit. You're like, okay, this is only gonna render on the client. I can, I actually can think about that and reason about that. But then there's this thing like, okay, sometimes this runs on the server, only this part runs on the server. And it's, it just becomes like the mental capacity to figure out what's going on and debug it is like so difficult. And that database problem is like the, the normal problem, right? Like of like, I can only query the database on the server because I need secure credentials or something. Okay. I understand that as a developer, but I don't understand how to make sure the application is doing what I expect it to do and how to fix it if something goes wrong. And that, that's why I think. , I'm a, I'm a believer in constraints. The only way you make progress is you simplify problems. Like you just give up on solving the complicated thing and you make the problem simpler. Right? And so for me, that's why I'm like, just take the database outta the equation. We can create APIs from the client, from the server, same security levels. Okay? Make it so it can only do that and it has to be run as almost like a UI only thing. Now that creates complexity cuz you have to run this other service, right? And, and like I personally do not wanna have to spin up a bunch of containers just to write like a simple like web application. but again, I, I think the problem has not been simplified yet for a lot of folks. Like React did this to be fair, um, it made it a lot easier to, to build UI that was responsive and, and just updated values when they changed, you know, which was a big deal for a long period of time. But I feel like everything after has not quite reached that that area, whereas it's simple and even react is hard to debug when it doesn't do what you want it to do. So I don't know, there, there's so gaps I guess is what i would say. And. Hopefully, hopefully, you know, in the next five years we'll kind of see this come to completion because it does feel like it's, it's getting closer to that compromise. You know, where like we used to have pure server rendered apps with some weird janky JavaScript on top. Now we've got this bridge of really complicated, you know, JavaScript on top, and the server apps are also complicated and it's just, it's a nightmare. And then this newer generation of these frameworks that work for some types of technology, but not all. And, and we're kind of almost coming full circle to like server rendered, you know, everything. But with like allowing the same level of interactions that we've been desiring, I guess, on the web. So, and I, fingers crossed this gets better, but right now I do not see like a clear like, oh, it's definitely there. I can see it coming. I'm like, well, we're kind of making progress. I don't love being the beta tester of the whole thing, but we're kind of getting there. And so, you know, we'll see. There are multiple ways to write mobile apps as well (flutter, react native, web views) [00:46:36] Jeremy: I guess you, you've been saying this whole shifting landscape of how Front End works has made it difficult for Sentry to provide like automatic instrumentation and things like that for, for mobile apps. Is that a different story? Like is it pretty standardized in terms of how do you instrument an Android app or an iOS app. [00:46:58] David: Sort of, but also, no, like, a good example here is like early days mobile, it's a native application. You ship a binary known quantity, right? Or maybe you embedded a web browser, but like, that was like a very different thing. Okay. And then they did things where like, okay, more of it's like embedded web browser type stuff, or dynamically render content. So that's now a moving target. the current version of that, which I'm not a mobile dev so like people have strong opinions on both sides of this fence, but it's like, okay, do you use like a, a hybrid framework which allows you to build. Say, uh, react native, which is like arou you to sort of write a JavaScript ish thing and it runs on both Android and mobile, but not really well on either. Um, or do you write a native, native app, which is like a known quantity, but then you may maintain like two code bases, have two degrees of expertise and stuff. Flutters the same thing. so there's still that version of complexity that goes on within it. And I, I think people care less about mobile cuz it impacts people less. Like, you know, there's that whole generation of like, oh, mobile's the future, everything's gonna be mobile, let's not become true. Uh, mobile's very important, but like we have desktops still. We use web software all the time, half the time on mobile. We're just using the web software at the end of the day, so at least we know that's a thing. And I think, so I think that investment in mobile has died down some. Um, but some companies like mobile is like their main experience or one of their driving experience is like a, like a company like DoorDash, mobile is as important as web, if not more, right? Because of like the types of customers. Spotify probably same thing, but I don't know, Sentry. We don't need a mobile app, who cares? It's irrelevant to the problem space, right? And so I, I think it's just not quite taken on. And so mobile is still like this secondary citizen at a lot of companies, and I think the evolution of it has been like complicated. And so I, I think a lot of the problems are known, but maybe people care less or there's just less customers. And so the weight doesn't, like, the weight is wildly different. Like JavaScript's probably like a hundred times the size from an investment point of view for everyone in the world than say mobile applications are, is how I would think about it. And so whether mobile is or isn't solved is almost irrelevant to the, the, the like general problem at hand. and I think at the very least, like mobile applications, there's like, there's like a tool chain where you can debug a lot of stuff that works fairly well and hasn't changed over the years, whereas like the web you have like browser tools, but that's about it. So. Mobile apps can have large binaries or pull in lots of dependencies at runtime [00:49:16] Jeremy: So I guess with mobile. Um, I was initially thinking of native apps, but you're, you're bringing up that there's actually people who would make a native app that's just a web view for a webpage, or there's React native or there's flutters, so there's actually, it really isn't standard how to make a mobile app. [00:49:36] David: Yeah. And even within those, it comes back to like, okay, is it now the same problem where we're loading in a bunch of JavaScript or downloading a bunch of JavaScript and content remotely and stuff? And like, you'll see this when you install a mobile app, and sometimes the binaries are huge, right? Sometimes they're really small, and then you load it up and it's downloading like several gigs of data and stuff, right? And those are completely different patterns. And even within those like subsets, I'm sure the implementations are wildly different, right? And so, you know, I, that may not be the same as like the runtime kind of changing, but I remember there was this, uh, this must be a decade ago. I, I used, I still am a gamer, but. Um, early in my career I worked a lot with like games like World of Warcraft and stuff, and I remember when games started launching progressive loading where it's like you could download a small chunk of the game and actually start playing and maybe the textures were lower, uh, like resolution and everything was lower fidelity and, and you could only go so far until the game fully installed. But like, imagine like if you're like focused on performance or something like that, measuring it there is completely different than measuring it once, say everything's installed, you know? And so I think those often become very complex use cases. And I think that used to be like an extreme edge case that was like such a, a hyper-specific optimization for like what The Warcraft, which is like one of the biggest games of all time that it made sense, you know, okay, whatever. They can build their own custom tooling and figure it out from there. And now we've taken that degree of complexity and tried to apply it to everything in the world. And it's like uhoh, like nobody has the teams or the, the, the talent or the, the experience to necessarily debug a lot of these complicated problems just like Sentry like. You know, we're not dealing with React internals. If something's wrong in the React internals, it's like somebody might be able to figure it out, but it's gonna take us so much time to figure out what's going on, versus, oh, we're rendering some html. Cool. We understand how it works. It's, it's a known, known problem. We can debug it. Like there's nothing to even debug most of the time. Right. And so, I, I don't know, I think the industry has to get to a place where you can reason about the software, where you have the calculator, right. And you don't have to figure out how the calculator works. You just can trust that it's gonna work for you. How Sentry's stack has become more complex over time [00:51:35] Jeremy: so kind of. Shifting over a little bit to Sentry's internals. You, you said that Sentry started in, was it 2008 you said? [00:51:47] David: Uh, the open source project was in 2008. Yeah. [00:51:50] Jeremy: The stack that's used in Sentry has evolved. Like I remembered that there was a period where I think you could run it with a pretty minimal stack, like I think it may have even supported SQLite. [00:52:02] David: Yeah. [00:52:03] Jeremy: And so it was something that people could run pretty easily on their own. But things have, have obviously changed a lot. And so I, I wonder if you could speak to sort of the evolution of that process. Like when do you decide like, Hey, this thing that I built in 2008, Is, you know, not gonna cut it. And I really need to re-architect what this system is. [00:52:25] David: Yeah, so I don't know if that's actually the reality of why things have changed, that it's like, oh, this doesn't work anymore. We've definitely introduced complexity in the sense of like, probably the biggest shift for Sentry was like, it used to be everything, and it was a SQL database, and everything was kind of optional. I think half that was maintainable because it was mostly built by. And so I could maintain like an architectural vision that kept it minimal. I had the experience to figure it out and duct tape the right things. Um, so that was one thing. And I think eventually, you know, that doesn't scale as you're trying to do more and build more into the product. So there's some complexity there. but for the most part you can, it can still

god amazon spotify google hell africa table uber android shifting id mobile ios engineers excited bay area remove saas application treating cto react suite goldman sachs monitoring rabbit metrics api intent stack chrome doordash makes python world of warcraft ui errors aws warcraft java github rum apis javascript detection rails apache kafka versus structured cpu php detecting django s3 sql cuz atlassian frontend sdks cockroach vps sentry elastic gitlab sre cabana splunk json lins mongo hypothetically apm defer jquery datadog mq orm new relic gpl bsd bsl paper trails sqlite nginx linode rabbitmq david kramer david you nextjs inferring flutters david so kibana david yeah david cramer riak david there software engineering radio mssql david thanks jeremy you jeremy so business source license

An Odyssey of Oddities

Play Episode Listen Later Feb 27, 2023 73:09

Newly published author Jeremy Eichenberger joins to show - We discuss the initial purpose of the book, and how it transformed, as well as he did - Break the glass ceiling - Suicidal thoughts - Shakespear is overrated - Not a fan of Jake Paul - CALLER ID, BRO! And so much more - Please make sure to share this episode with a friend - We dive into so many talking points that men usually won't dive into- This episode means a lot to me, and Jeremy- You can find the podcast on Apple Podcast, Google Podcast, Spotify and Anchor - Stay Up and Stay Blessed

spotify apple podcast google podcasts odyssey bro suicidal oddities stay blessed shakespear jeremy you

The Evolution of DevRel with Jeremy Meiss

Screaming in the Cloud

Play Episode Listen Later Feb 2, 2023 30:12

About JeremyJeremy is the Director of DevRel & Community at CircleCI, formerly at Solace, Auth0, and XDA. He is active in the DevRel Community, and is a co-creator of DevOpsPartyGames.com. A lover of all things coffee, community, open source, and tech, he is also house-broken, and (generally) plays well with others.Links Referenced: CircleCI: https://circleci.com/ DevOps Party Games: https://devopspartygames.com/ Twitter: Iamjerdog LinkedIn: https://www.linkedin.com/in/jeremymeiss/ TranscriptAnnouncer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.Corey: This episode is sponsored by our friends at Logicworks. Getting to the cloud is challenging enough for many places, especially maintaining security, resiliency, cost control, agility, etc, etc, etc. Things break, configurations drift, technology advances, and organizations, frankly, need to evolve. How can you get to the cloud faster and ensure you have the right team in place to maintain success over time? Day 2 matters. Work with a partner who gets it - Logicworks combines the cloud expertise and platform automation to customize solutions to meet your unique requirements. Get started by chatting with a cloud specialist today at snark.cloud/logicworks. That's snark.cloud/logicworksCorey: Welcome to Screaming in the Cloud. I'm Corey Quinn. I generally try to have people that I know in the ecosystem on this show from time to time, but somehow today's guest has never made it onto the show. And honestly, I have no excuse other than that, I guess I just like being contrary about it. Jeremy Meiss is the Director of DevRel and Community at CircleCI. Jeremy, thank you for finally getting on the show.Jeremy: Hey, you know what? I woke up months and months ago hoping I would be able to join and never have, so I appreciate you finally, you know, getting that celestial kick in the ass.Corey: I love the fact that this is what you lie awake at night worrying about. As all people should. So, let's get into it. You have been at CircleCI in their DevRel org—heading their DevRel org—for approximately 20 years, but in real-time and non-tech company timeframes, three years. But it feels like 20. How's that been? It's been an interesting three years, I'll say that much with the plague o'er the land.Jeremy: Yes, absolutely. No, it was definitely a time to join. I joined two weeks before the world went to shit, or shittier than it already was. And yeah, it's been a ride. Definitely see how everything's changed, but it's also been one that I couldn't be happier where I'm at and seeing the company grow.Corey: I've got to level with you. For the longest time, I kept encountering CircleCI in the same timeframes and context, as I did Travis CI. They both have CI in the name and I sort of got stuck on that. And telling one of the companies apart from the other was super tricky at the time. Now, it's way easier because Travis CI got acquired and then promptly imploded.Security issues that they tried to hide left and right, everyone I knew there long since vanished, and at this point, it is borderline negligence from my point of view to wind up using them in production. So oh, yeah, CircleCI, that's the one that's not trash. I don't know that you necessarily want to put that on a billboard somewhere, but that's my mental shortcut for it.Jeremy: You know, I'm not going to disagree with that. I think, you know, it had its place, I think there's probably only one or two companies nowadays actually propping it up as a business, and I think even they are actively trying to get out of it. So yeah, not going to argue there.Corey: I have been on record previously as talking about CI/CD—Continuous Integration slash Continuous Deployment—or for those who have not gone tumbling down that rabbit hole, the idea that when you push a commit to a particular branch on Git—or those who have not gotten to that point, push the button, suddenly code winds up deploying to different environments, occasionally production, sometimes staging, sometimes development, sometimes by accident—and there are a bunch of options in that space. AWS has a bunch of services under their CodeStar suite: CodeBuild, CodeDeploy, CodePipeline, and that's basically there as a marketing exercise by CI/CD companies that are effective because after having attempted to set those things up with the native offerings, you go scrambling to something else, anything else. GitHub Actions has also been heavily in that space because it's low friction to integrate, it's already there in GitHub, and that's awesome in some ways, terrible in others. But CircleCI has persistently been something that I see in a lot of different environments, both the open-source world, as well as among my clients, where they are using you folks to go from developer laptops to production safely and sanely.Jeremy: Absolutely, yeah. And I think that's one thing for us is, there's a niche of—you know, you can start if you're into AWS or you're into Google, or you're in—any of those big ecosystems, you can certainly use what they have, but those are always, like, add-on things, they're always like an afterthought of, “Oh, we're going to go add this,” or, “We're going to go add that.” And so, I think you adequately described it of, you know, once you start hitting scale, you're eventually going to start to want to use something, and I think that's where we generally fit in that space of, you know, you can start, but now you're going to eventually end up here and use best-in-class. I spent years Auth0 in the identity space, and it was the same kind of boat is that, you know, sure you can start with hopefully not rolling your own, but eventually you're going to end up wanting to use something best-in-class that does everything that you want it to do and does it right.Corey: The thing that just completely blows my mind is how much for all these companies, no matter who they are and how I talk to them, everyone talks about their CI/CD flow with almost a sense of embarrassment. And back in the days when I was running production environments, we use Jenkins as sort of a go-to answer for this. And that was always a giant screaming exemption to the infrastructure-as-code approach because you could configure it via the dashboard and the web interface and it would write that out as XML files. So, you wound up with bespoke thing lots of folks could interact with in different ways, and oh, by the way, it has access into development, staging, and production. Surely, there will be no disasters that happened as a result of this.And that felt terrible. And now we've gotten into a place where most folks are not doing that anymore, at least with the folks that I talk to, but I'm still amazed by how few best practices around a lot of this stuff has really emerged. Every time I see a CI/CD pipeline, it feels like it is a reimplementation locally of solving a global problem. You're the director of DevRel and have been for a few years now. Why haven't you fixed this yet?Jeremy: Primarily because I'm still stuck on the fact you mentioned, pushing a button and getting to XML. That just kind of stuck me there and sent me back that I can't come up with a solution at this point.Corey: Yeah, it's the way that you solve the gap—the schism as it were—between JSON and YAML. “Cool, we're going to use XML.” And everyone's like, “Oh, God, not that.” It's like, “Cool, now you're going to settle your differences or I'm going to implement other things, too.”Jeremy: That's right, yeah. I mean, then we're going to go use some bespoke company's own way of doing IAC. No, I think there's an element here where—I mean, it goes back to still using best-in-class. I think Hudson, which eventually became Jenkins, after you know, Cisco—was it Cisco? No, it was Sun—after Sun, you know, got their hands all over it, it was the thing. It's kind of, well, we're just going to spin this up and do it ourselves.But as the industry changes, we do more and more things on the cloud and we do it primarily because we're relocating the things that we don't want to have to manage ourselves with all of the overhead and all of the other stuff. We're going to go spit it over to the cloud for that. And so, I think there's been this shift in the industry that they still do, like you said, look at their pipelines with a little bit of embarrassment [laugh], I think, yeah. I chuckle when I think about that, but there is a piece where more and more people are recognizing that there is a better way and that you can—you don't have to look at your pipelines as this thing you hate and you can start to look at what better options there are than something you have to host yourself.Corey: What I'm wondering about now, though, because you've been fairly active in the space for a long time, which is a polite way of saying you have opinions—and you should hear the capital O and ‘Opinions' when I say it that way—let's fight about DevRel. What does DevRel mean to you? Or as I refer to it, ‘devrelopers?'Jeremy: Uh, devrelopers. Yes. You know, not to take from the standard DevOps answer, but I think it depends.Corey: That's the standard lawyer answer to anything up to and including, is it legal for me to murder someone? And it's also the senior consultant answer, to anything, too, because it turns out the world is baked and nuanced and doesn't lend itself to being resolved in 280 characters or less. That's what threads are for.Jeremy: Right [laugh]. Trademark. That is ultimately the answer, I think, with DevRel. For me, it is depending on what your company is trying to do. You ultimately want to start with building relationships with your developers because they're the ones using your product, and if you can get them excited about what they're doing with your product—or get excited about your product with what they're doing—then you have something to stand on.But you also have to have a product fit. You have to actually know what the hell your product is doing and is it going to integrate with whatever your developers want. And so, DevRel kind of stands in that gap that says, “Okay, here's what the community wants,” and advocates for the community, and then you have—it's going to advocate for the company back to the community. And hopefully, at the end of the day, they all shake hands. But also I've been around enough to recognize that there comes that point where you either a have to say, “Hey, our product for that thing is probably not the best thing for what you're trying to do. Here, you should maybe start at this other point.”And also understanding to take that even, to the next step to finish up the answer, like, my biggest piece now is all the fights that we have constantly around DevRel in the space of what is it and what is it not, DevRel is marketing. DevRel is sales. DevRel is product. And each of those, if you're not doing those things as a member of the company, you're not doing your job. Everybody in the company is the product. Everybody in the company is sales. Everybody in the company is marketing.Corey: Not everyone in the company realizes this, but I agree—Jeremy: Yes.Corey: Wholeheartedly.Jeremy: Yes. And so, that's where it's like yes, DevRel is marketing. Yes, it is sales. Because if you're not out there, spreading whatever the news is about your product and you're not actually, you know, showing people how to use it and making things easier for people, you're not going to have a job. And too often, these companies that—or too often I think a lot of DevRel teams find themselves in places where they're the first that get dropped when the company goes through things because sometimes it is just the fact that the company has not figured out what they really want, but also, sometimes it's the team hasn't really figured out how to position themselves inside the business.Corey: One of the biggest, I'll call it challenges that I see in the DevRel space comes down to defining what it is, first and foremost. I think that it is collectively a mistake for an awful lot of practitioners of developer relations, to wind up saying first and foremost that we're not marketing. Well, what is it that you believe that marketing is? In fact, I'll take it a step beyond that. I think that marketing is inherently the only place in most companies where we know that doing these things leads to good results, but it's very difficult to attribute or define that value, so how do we make sure that we're not first up on the chopping block?That has been marketing's entire existence. It's, you know that doing a whole bunch of things in marketing will go well for you, but as the old chestnut says, half your marketing budget is wasted and you'll go broke figuring out which half it is.Jeremy: Yeah. And whenever you have to make cuts, generally, they always, you know, always come to the marketing teams because hey, they're the ones spending, you know, millions of dollars a quarter on ads, or whatever it is. And so yeah, marketing has, in many ways figured this out. They're also the team that spends the most money in a company. So, I don't really know where to go with that isn't completely off the rails, but it is the reality. Like, that's where things happen, and they are the most in touch with what the direction of the company is going to ultimately be received as, and how it's going to be spoken about. And DevRel has great opportunities there.Corey: I find that when people are particularly militant about not liking sales or marketing or any other business function out there, one of the ways to get through them is to ask, “Great. In your own words, describe to me what you believe that department does. What is that?” And people will talk about marketing in a bunch of tropes—or sales in a bunch of tropes—where it is the worst examples of that.It's, “Terrific, great. Do you want me to wind up describing what you do as an engineer—in many cases—in the most toxic stereotype of Uber and 2015-style engineer I can come up with?” I think, in most cases if we're having a conversation and I haven't ended it by now, you would be horrified by that descriptor. Yeah. Not every salesperson is the skeezy used car salesman trying to trick you into something awful. Actual selling comes down to how do we wind up taking your pain away. One of my lines is, “I'm a consultant. You have problems and money. I will take both.”Jeremy: That's right [laugh]. Yeah, that's right.Corey: If you don't have a painful problem, I have nothing to sell you and all I'm doing is wasting my breath trying.Jeremy: Yeah, exactly. And that's where—I'll say it two ways—the difference between good marketing teams are, is understanding that pain point of the people that they're trying to sell to. And it's also a difference between, like, good and bad, even, DevRel teams is understanding what are the challenges that your users are having you're trying to express to, you're trying to fix? Figure that out because if you can't figure that out, then you or your marketing team are probably soon to be on the block and they're going to bring someone else in.Corey: I'm going to fight you a little bit, I suspect, in that a line I've heard is that, “Oh, DevRel is part of product because we are the voice of the community back into the development cycle of what product is building.” And the reason that I question that is I think that it glosses over an awful lot of what makes product competent as a department and not just a function done by other people. It's, “Oh, you're part of the product. Well, great. How much formal training have you had as part of your job on conducting user research and interviews with users and the rest?”And the answer invariably rounds to zero and, okay, in other words, you're just giving feedback in a drive-by fashion that not structured in any way and your product people are polite enough not to call you out on it. And that's when the fighting and slapping begins.Jeremy: Yeah. I don't think we're going to disagree too much there. I think one of the challenges, though, is for the very reason you just mentioned, that the product teams tend to hear your product sucks. And we've heard all the people telling us that, like, people in the community say that, they hear that so much and they've been so conditioned to it that it just rolls off their back, like, “Okay, whatever.” So, for DevRel teams, even if you're in product, which we can come back to that, regardless of where you're at, like, bringing any type of feedback you bring should have a person, a name associated with it with, like, Corey at Duckbill Group hates this product.Corey: Uh-oh [laugh]. Whenever my name is tied to feedback, it never goes well for me, but that will teach me eventually, ideally, to keep my mouth shut.Jeremy: Yeah. Well, how's that working for you?Corey: I'll let you know if it ever happens.Jeremy: Good. But once you start making the feedback like an actual person, it changes the conversation. Because now it's like, oh, it's not this nebulous, like, thing I can not listen to. It's now oh, it's actually a person at a specific company. So, that's one of the challenges in working with product that you have to overcome.When I think about DevRel in product, while I don't think that's a great spot for it, I think DevRel is an extension of product. That's part of where that, like, the big developer experience craze comes from, and why it is a valuable place for DevRel to be able to have input into is because you tend to be the closest to the people actually using the product. So, you have a lot of opportunities and a big surface area to have some impact.Corey: This episode is sponsored in part by our friends at Strata. Are you struggling to keep up with the demands of managing and securing identity in your distributed enterprise IT environment? You're not alone, but you shouldn't let that hold you back. With Strata's Identity Orchestration Platform, you can secure all your apps on any cloud with any IDP, so your IT teams will never have to refactor for identity again. Imagine modernizing app identity in minutes instead of months, deploying passwordless on any tricky old app, and achieving business resilience with always-on identity, all from one lightweight and flexible platform.Want to see it in action? Share your identity challenge with them on a discovery call and they'll hook you up with a complimentary pair of AirPods Pro. Don't miss out, visit Strata.io/ScreamingCloud. That's Strata dot io slash ScreamingCloud.Corey: I think that that is a deceptively nuanced statement. One of the things I learned from an earlier episode I had with Dr. Christina Maslach, is contributors to occupational burnout, so much of it really distills down—using [unintelligible 00:16:35] crappy layman's terms—to a lack of, I guess what I'm going to call relevance or a lack—a feeling like you are not significant to what the company is actually doing in any meaningful way. And I will confess to having had a number of those challenges in my career when I was working in production environments because, yeah, I kept the servers up and the applications up, but if you really think about it, one of the benefits of working in the system space—or the production engineers base, or DevOps, or platform engineering, or don't even start with me these days—is that what you do conveys almost seamlessly from company to company. Like, the same reason that I can do what I do now, I don't care what your company does, necessarily, I just know that the AWS bill is a bounded problem space and I can reason about it almost regardless of what you do.And if I'm keeping the site up, okay, it doesn't matter if we're streaming movies or selling widgets or doing anything, just so long as I don't find that it contradicts my own values. And that's great, but it also is isolating because you feel like you're not really relevant to the direction of what the company actually does. It's, “Okay, so what does this company do?” “We make rubber bands,” and well, I'm not really a rubber band connoisseur, but I could make sure that the website stays up. But it just feels like there's a disconnect element happening.Jeremy: That is real. It is very real. And one of the ways that I tried to kind of combat that, and I help my team kind of really try and keep this in mind, is we try to meet as much as possible with the people that are actually doing the direction, whether it be product marketing, or whether it's in product managers, or it's even, you know, in engineering is have some regular conversations with what we do as a company. How are we going to fit with that in what we do and what we say and all of our objectives, and making sure that everything we do ties to something that helps other teams and that fits within the business and where it's going so that we grow our understanding of what the company is trying to do so that we don't kind of feel like a ship that's without a sail and just floating wherever things go.Corey: On some level, I am curious as to what you're seeing as we navigate this—I don't know if it's a recession,' I don't know if it's a correction; I'm not sure what to call it—but my gut tells me that a lot of things that were aimed at, let's call it developer quality of life, they were something of a necessity in the unprecedented bull market that we've seen for the last decade at some point because most companies cannot afford to compete with the giant tech company compensation packages, so you have to instead talk about quality of life and what work-life balance looks like, and here's why all of the tools and processes here won't drive you to madness. And now it feels like, “Oh, we don't actually have to invest in a lot of those things, just because oh yeah, like, the benefits here are you're still going to be employed next week. So, how about that?” And I don't think that's a particularly healthy way to interact with people—it's certainly not how I do—but it does seem that worrying about keeping developers absolutely thrilled with every aspect of their jobs has taken something of a backseat during the downturn.Jeremy: I don't know. I feel like developer satisfaction is still an important piece, even though, you know, we have a changing market. And as you described, if you're not happy with the tool you're using, you're not going to be as productive than using the tool or using—you know, whether it's an actual developer productivity tool, or it's even just the fact that you might need two monitors, you're not going to be as productive if you're not enjoying what you're doing. So, there is a piece of it, I think, the companies are recognizing that there are some tools that do ultimately benefit and there's some things that they can say, we're not going to invest in that area right now. We're good with where we're at.Corey: On some level, being able to say, “No, we're not going to invest in that right now,” is the right decision. It is challenging, in some cases, to wind up talking to some team members in some orgs, who do not have the context that is required to understand why that decision is being made. Because without context, it looks like, “Mmm, no. I'm just canceling Christmas for you personally this year. Sorry, doesn't it suck to be you? [singing] Dut, dut.” And that is very rarely how executives make decisions, except apparently if they're Elon Musk.Jeremy: Right. Well, the [Muskrat 00:21:23] can, you know, sink any company—Corey: [laugh].Jeremy: — and get away with it. And that's one thing I've really been happy with where I'm at now, is you have a leadership team that says, “Hey, here's where things are, and here's what it looks like. And here's how we're all contributing to where we're going, and here's the decisions we're going to make, and here's how—” they're very open with what's going on. And it's not a surprise to anybody that the economic time means that we maybe can't go to 65 events next year. Like, that's just reality.But at the end of the day, we still have to go and do a job and help grow the company. So, how can we do that more efficiently? Which means that we—it leaves it better to try and figure that out than to be so nebulous, with like, “Yep, nope. You can't go do that.” That's where true leadership comes to is, like, laying it out there, and just, you know, getting people alongside with you.Corey: How do you see DevRel evolving? Because I think we had a giant evolution over the past few years. Because suddenly, the old vision of DevRel—at least in some quarters, which I admit I fell a little too deeply into—was, I'm going to go to all the conferences and give all of the talks, even though most of them are not related to the core of what I do. And maybe that's a viable strategy; maybe it's not. I think it depends on what your business does.And I don't disagree with the assertion that going and doing something in public can have excellent downstream effects, even if the connection is not obvious. But suddenly, we weren't able to do that, and people were forced to almost reinvent how a lot of that works. Now, that the world is, for better or worse, starting to open up again, how do you see it evolving? Are we going right back to a different DevOps days in a different city every week?Jeremy: I think it's a lot more strategic now. I think generally, there is less mountains of money that you can pull from to go and do whatever the hell you want. You have to be more strategic. I said that a few times. Like, there's looking at it and making sure, like, yeah, it would be great to go and, you know, get in front of 50,000 people this quarter or this year, whatever you want to do, but is that really going to move the bottom line for us? Is that really going to help the business, or is that just helping your Delta miles?What is really the best bang for the buck? So, I think DevRel as it evolves, in the next few years, has to come to a good recognition moment of we need to be a little bit more prescriptive in how we do things within our company and not so willy-nilly return to you know, what we generally used to get away with. That means you're going to see a lot more people have to be held to account within their companies of, is what you're doing actually match up to our business goal here? How does that fit? And having to explain more of that, and that's, I think, for some people will be easy. Some people are going to have to stretch that muscle, and others are going to be in a real tough pickle.Corey: One last topic I want to get into with you is devopspartygames.com, an online more or less DevOps, quote-unquote, “Personality” assortment of folks who wind up playing online games. I was invited once and promptly never invited back ever again. So first, was it something I said—obviously—and two what is that and how—is that still going in this post-pandemic-ish era?Jeremy: I like how you answered your own question first; that way I don't have to answer it. The second one, the way it came about was just, you know, Matty and I had started missing that interaction that we would tend to have in person. And so, one of the ways we started realizing is we play these, you know, Jackbox games, and why can't we just do this with DevOps tech prompts? So, that's kind of how it kicked off. We started playing around doing it for fun and then I was like, “You know, we should—we could do this as a big, big deal for foreseeable future.”Where's that now is, we actually have not done one online for—what is it? So probably, like, eight, nine months, primarily because it's harder and harder to do so as everybody [laugh]—we're now doing a little bit more travel, and it's hard to do those—as you know, doing podcasts, it takes a lot of work. It's not an easy kind of thing. And so, we've kind of put that on pause. But we actually did our first in-person DevOps Party Games at DevOpsDays Chicago recently, and that was a big hit, I think, and opportunity to kind of take what we're doing virtually, and the fun and excitement that we generally would have—relatively half-drunk—to actually doing it actually in-person at an event. And in the different—like, just as giving talks in person was a different level of interaction with the crowd, the same thing is doing it in person. So, it was just kind of a fun thing and an opportunity maybe to continue to do it in person.Corey: I think we all got a hell of a lot better very quickly at speaking to cameras instead of audiences and the rest. It also forced us to be more focused because the camera gives you nothing in a way that the audience absolutely does.Jeremy: They say make love to the camera, but it doesn't work anyways.Corey: I really want to thank you for spending as much time as you have talking to me. If people want to learn more about who you are and what you're up to, where should they go?Jeremy: Well, for the foreseeable future, or at least what we can guess, you can find me on the Twitters at @Iamjerdog. You can find me there or you can find me at, you know, LinkedIn, at jeremymeiss, LinkedIn. And you know, probably come into your local DevOpsDays or other conference as well.Corey: Of course. And we will, of course, put links to that in the show notes.Jeremy: Excellent.Corey: Thank you so much for being so generous with your time. It is always appreciated. And I do love talking with you.Jeremy: And I appreciate it, Corey. It was great beyond, finally. I won't hold it against you anymore.Corey: Jeremy Meiss, Director of DevRel at CircleCI. I'm Cloud Economist Corey Quinn and this is Screaming in the Cloud. If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice, along with an angry, irritated comment talking about how CI/CD is nonsense and the correct way to deploy to production is via the tried-and-true method of copying and pasting.Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.

christmas god director amazon community google work evolution elon musk security uber sun cloud figure personality delta jenkins cisco screaming aws github devops trademark solace terrific mmm git ci cd strata json iac xml idp jackbox devrel auth0 github actions yaml circleci muskrat xda continuous deployment corey quinn dut christina maslach devopsdays travis ci duckbill group jeremy you chief cloud economist last week in aws

Megan Cutrofello on Leaguepedia

Play Episode Listen Later Jan 10, 2023 97:55

Leaguepedia is a MediaWiki instance that covers tournaments, teams, and players in the League of Legends esports community. It's relied on by fans, analysts, and broadcasters from around the world. Megan "River" Cutrofello joined Leaguepedia in 2014 as a community manager and by the end of her tenure in 2022 was the lead for Fandom's esports wikis. She built up a community of contributing editors in addition to her role as the primary MediaWiki developer. She writes on her blog and is a frequent speaker at the Enterprise MediaWiki Conference Topics covered: When to use MediaWiki Visual vs code editor MediaWiki's rough syntax Templates and markup Limiting user input to simplify pages Choosing not to transliterate long player names in certain languages Handling mobile clients Building aliases for search results Creating a single source of truth Roster changes and caching Cargo (Query data in MediaWiki templates using SQL) Hiding implementation details from editors Optimizing for the editor, not a clean codebase Training your users to use workarounds MediaWiki only supports es5 The wiki aesthetic Who is working on the wiki + onboarding Who is using the wiki The future of Leaguepedia How Megan got into wiki development Issues as opportunities to onboard Related Links River Writes - Megan's Blog Leaguepedia - League of Legends esports wiki MediaWiki VisualEditor VueJS in MediaWiki Open issue to support ES6 in MediaWiki Whitespace programming language Lua MediaWiki extensions CharInsert - Add code snippets into the MediaWiki editor Semantic MediaWiki (SMW) - Store and query data inside Wiki pages Cargo - Replaced SMW at Leaguepedia Conference Talks Usage of Cargo with Lua on LoL Gamepedia Mediawiker SublimeText plugin Cargo/Lua Best Practices, and When Not To Use Them MediaWiki Lua Tutorial Editing your wiki with Python is easier than you think Other podcast appearances Between the Brackets Transcript You can help edit this transcript on GitHub. [00:00:00] Jeremy: Today I'm talking to Megan Cutrofello. She managed the Leaguepedia eSports wiki for eight years, and in 2017 she got an award for being the unsung hero of the year for eSports. So Megan, thanks for joining me today. [00:00:17] Megan: Thanks for having me. [00:00:19] Jeremy: A lot of the people I talk to are into web development, so they work with web frameworks and things like that. And I guess when you think about it, wikis are web development, but they're kind of their own world, I suppose. for someone who's going to build some kind of a site, like when does it make sense for them to use a wiki versus, uh, a content management system or just like a more traditional web framework? [00:00:55] Megan: I think it makes the most sense to use a wiki if you're going to have a lot of contributors and you don't want all of your contributors to have access to your server. also if your contributors aren't necessarily as tech savvy as you are, um, it can make sense to use a wiki. if you have experience with MediaWiki, I guess it makes sense to use a Wiki. Anytime I'm building something, my instinct is always, oh, I wanna make a Wiki (laughs) . Um, so even if it's not necessarily the most appropriate tool for the job, I always. My, my first thought is, hmm, let's see, I'm, I'm making a blog. Should I make my blog in in MediaWiki? Um, so, so I always, I always wanna do that. but I think it's always, when you're collaborating is pretty much, you always wanna do MediaWiki [00:01:47] Jeremy: And I, I think that's maybe an important point when you say people are collaborating. When I think about Wikis, I think of Wikipedia, uh, and the fact that I can click the edit button and I can see the markup right there, make a change and, and click save. And I didn't even have to log in or anything. And it seems like that workflow is built into a wiki, but maybe not so much into your typical CMS or WordPress or something like that. [00:02:18] Megan: Yeah. Having a public ability to solicit contributions from anyone. so for Leaguepedia, we actually didn't have open contributions from the public. You did have to create an account, but it's still that open anyone can make an account and all you have to do is like, go through that one step of create an account. Admittedly, sometimes people are like, I don't wanna make an account that's so much work. And we're like, just make the account. Come on. It's not that hard. but, uh, you still, you're a community and you want people to come and contribute ideas and you want people to come and be a part of that community to, document your open source project or, record the history of eSports or write down all of the easter eggs that you find in a video game or in a TV show, or in your favorite fantasy novels. Um, and it's really about community and working together to create something where the whole is bigger than the sum of its parts. [00:03:20] Jeremy: And in a lot of cases when people are contributing, I've noticed that on Wikipedia when you edit, there's an option for a, a visual editor, and then there's one for looking at the raw markup. in, in your experience, are people who are doing the edits, are they typically using the visual editor or are they mostly actually editing the, the markup? [00:03:48] Megan: So we actually disabled the Visual editor on Leaguepedia, because the visual editor is not fantastic at knowing things about templates. Um, so a template is when you have one page that gets its content pulled into the larger page, and there's a special syntax for that, and the visual editor doesn't know a lot about that. Um, so that's the first reason. And then the second reason is that, there's this, uh, one extension that we use that allows you to make a clickable, piece of text. It's called (https://www.mediawiki.org/wiki/Extension:CharInsert) CharInserts, uh, for character inserts. so I made a lot of these things that is sort of along the same philosophy as Visual Editor, where it's to help people not have to have the same burden of knowledge, of knowing every exact piece of source that has to be inserted into the page. So you click the thing that says like, um, insert a pick and band prefill, and then a little piece of JavaScript fires and it inserts a whole bunch of Wiki text and then you just enter the champions in the correct places. In the prefills of champions are like the characters that you play in, uh, league of Legends. And so then you have like the text is prefilled for you and you only have to fill in into this outline. so Visual Editor would conflict with CharInserts, and I much preferred the CharInserts approach where you have this compromise in between the never interacting with source and having to have all of the source memorized. So between the fact that Visual Editor like is not a perfect tool and has these bugs in it, and also the fact that I preferred CharInserts, we didn't use Visual Editor at all. I know that some wikis do like to use Visual Editor quite a bit, and especially if you're not working with these templates where you have all of these prefills, it can be a lot more preferred to use Visual Editor. Visual Editor is an experience much more similar to editing something like Microsoft Word, It doesn't feel like you're editing code. and editing code is, I mean, it's scary. Like for, and when I said like, MediaWiki is when you have editors who aren't as tech savvy, as the person who set up the Wiki. for people who don't have that experience, I mean, when you just said like you have to edit a wiki, like someone who's never done that before, they can be very intimidated by it. And you're trying to build a sense of community. You don't want to scare away your potential editors. You want everyone to be included there. So you wanna do everything possible to make everyone feel safe, to contribute their ideas to the Wiki. and if you make them have to memorize syntax, like even something that to me feels as simple as like two open brackets and then the name of a page, and then two closed brackets means linking the page. Like, I mean, I'm used to memorizing a lot of syntax because like, I'm a programmer, but someone who's never written code before, I mean, they're not used to memorizing things like that. So they wanna be able to click a button that says insert link, and then type the name of the page in the middle of the things that pop up there. Um, so visual editor is. It's a lot safer to use. so a lot of wikis do prefer that. and if it, if it didn't have the bugs with the type of editing that my Wiki required, and if we weren't using CharInserts so much, we definitely would've gone for it. But, um, it wasn't conducive to the wiki that I built, so we didn't use it at all. [00:07:42] Jeremy: And the, the compromise you're referring to, is it where the editor sees the raw markup, but then they can, there's like little buttons on the side they can click and they'll know, okay, if I click this one, then it's going to give me the text for creating a list or something like that. [00:08:03] Megan: Yeah, it's a little bit more high level than creating a list because I would never even insert the raw syntax for creating a list. It would be a template that's going to insert a list at the very end. but basically that, yeah, [00:08:18] Jeremy: And I, I know for myself, even though I do software development, if I click at it on a wiki and there's all the different curly brace tags, there's the square tags, and. I think if you spend some time with it, you can kind of get a sense of what it means. But for the average person who doesn't work with software in their day to day, do, do you find that, is that a big barrier for them where they, they click edit and there's all this stuff that they don't really understand? Is that where some people just, they go, oh, I don't, I don't know what to do. [00:08:59] Megan: I think the biggest barrier is actually clicking at it in the first place. so that was a big barrier to me actually. I didn't wanna click at it in the first place, and I guess my reasons were maybe a little bit different where for me it was like, I know that if I click edit, this is going to be a huge rabbit hole and I'm going to learn way too much about wikis and this is going to consume my entire life and look where I ended up. So I guess I was pretty right about that. I don't know if other people feel the same way or if they just like, don't wanna get involved at all. but I think once people, click edit, they're able to figure it out pretty well. I think there's, there's two barriers or maybe three barriers. the first one is clicking edit in the first place. The second one is if they learn to code templates at all. Media Wiki syntax is literally the worst I have encountered other than programming languages that are literally parodies. So like the white space language is worse (laughs https://en.wikipedia.org/wiki/Whitespace_(programming_language)) , but like it's two curly braces for a template and it's three curly braces for a variable. And like, are you actually kidding me? One of my blog posts is like a plea to editors to write a comment saying the name of the template that they're ending because media wiki like doesn't provide any syntax for what you're ending. And there's no, like, there's no indentation. So you can't visually see what you're ending. And there's no. So when I said the white sp white space language, that was maybe appropriate because MediaWiki prints all of the white space because it's really just like, PHP functions that are put into the text that you're literally putting onto the page. So any white space that you put gets printed. So the only way to put white space into your code is if you comment it out. So anytime you wanna put a new line, you have to comment out your new line. And if you wanna indent your code, you have to comment out the indents. So it's just, I, I'm , I'm not exaggerating here. It's, it's just the worst. Occasionally you can put a little bit of white space. Because there's like some divisions in parser functions that get handled when it gets sent to the parser. And, but I mean, for the most part it's just, it's just terrible. so if I'm like writing an if statement, I'll write if, and then I'll write a commented out endif at the end, so once an editor starts to write templates, like with parser functions and stuff, that's another big barrier because, and that's not because like people don't know how to code, it's just because the MediaWiki language, and I use language very loosely, it's like this collection of PHP functions poured into this just disaster It's just, it's not good! (laughs) And the, the next barrier is when people start to jump to Lua, which is just, I mean, it's just Lua where you can write Lua modules and then, Lua is fine. It's great, it has white space and you can make new lines and it's absolutely fine and you can write an entire code base and as long as you're writing Lua, it's, it's absolutely fantastic and there's nothing wrong with it anymore (laughs) So as much as I just insulted the MediaWiki language, like writing Lua in MediaWiki is great (laughs) . So for, for most of my time I was writing Lua. Um, and I have absolutely no complaints about that except that Lua is one index, but actually the one indexing of Lua is fine because MediaWiki itself is one indexed. So people complain about Lua being one index, and I'm like, what are you talking about? If it's, if another language were used, then you'd have all of this offsetting when you go to your scripting language because you'd have like the first argument from your template in MediaWiki going into your scripting language, and then you'd have to offset it to zero and everyone would be like vastly confused about what's going on. So you should be thankful that they picked a language that's one index because it saves you all of this headache. So anyway, sorry for that tangent, but it's very good that we picked a one index language. [00:13:17] Jeremy: When you were talking about the, the if statement and having to put in comments to have white space, is it, cuz like when I think about an if statement in most languages, the, the if statement isn't itself rendering anything, it's like deciding if you're going to do something inside of the, if so. like what, what would that white space do if you didn't comment it out in the context of the if? [00:13:44] Megan: So actually you would be able to put some white space inside of an if statement, but you would not be able to put any white space after an if statement. and there, most likely inside of the if statement, you're printing variables or putting other parser functions. and the other parser functions also end in like two curly braces. And, depending on what you're printing, you're likely ending with a series of like five or eight, or, I don't know, some very large set of curly braces. And so what I like to do is I would like to be able to see all of the things that I'm ending with, and I wanna know like how far the nesting goes, right. So I wanna write like an end if, and so I have to comment that out because there's no like end if statement. so I comment out an end if there, it's more that you can't indent the statements inside of the if, because anything that you would be printing inside of your code would get printed. So if I like write text inside of the code, then that indentation would get printed into the page. And then if I put any white space after the if statement, then that would also get printed. So technically you can have a little bit of white space before the curly braces, but that's only because it's right before the curly braces and PHP will strip the contents right inside of the parser function. So basically if PHP is stripping something, then you're allowed to have white space there. But if PHP isn't stripping anything, then all of the white space is going to be printed and it's like so inconsistent that for the most part it's not safe to put white space anywhere because you don't, you have to like keep track of am I in a location where PHP is going to be stripping something right now or not? and I, I wanna know what statement or what variable or what template I'm closing at any location. So I always want to, write out what I'm closing everywhere. And then I have to comment that because there was no foresight to put like an end if clause in this white space, sensitive language. [00:16:22] Jeremy: Yeah, I, I think I see what you mean. So you have, if you're gonna start an, if you have the, if inside these curly braces, but then, inside the, if you typically are going to render some text to the page, and so intuitively you would indent it so that it's indented in from the if statement. But then if you do that, then it's gonna be shifted to the right on, on the Wiki. Did I get that right? [00:16:53] Megan: Yeah. So you have the flexibility to put white space immediately because PHP will strip immediately, but then you don't have flexibility to put any white space after that, if that makes sense. [00:17:11] Jeremy: So, so when you say immediately, is that on the following line or is that [00:17:15] Megan: yeah, so any white space before the first clause, you have flexibility. So like if you were to put an if statement, so it's like if, and then there's a colon, all of the next white space will get stripped. Um, so then you can put some text, but then, if you wanted to like put some text and then another if statement nested within the first if statement. It's not like Lua where you could like assign a variable and then put a comment and then put some more white space and then put another statement. And it's white space insensitive because you're just writing code and you haven't returned anything yet. it, it's more like Jinja (View templating language) than Python for, for an analogy. So everything is getting printed because you're in like a, this templating language, not actually a programming language. Um, so you have to work as if you're in a templating language about, you know, 70% of the time , unless you're in this like very specific location where PHP is stripping your white space because you're at the edge of an argument that's being sent there. So it's like incredibly inconsistent. And every now and then you get to like, pretend that you're in an actual language and you have some white space, that you can indent or whatever. it's just incredibly inconsistent, which is like what you absolutely want out of a programming language (laughs) yeah, it's like you're, you're writing templates, but like, it seems like because of the fact that it's using php, there's [00:18:56] Jeremy: weird exceptions to the behavior. Yeah. [00:18:59] Megan: Exactly. Yeah. [00:19:01] Jeremy: and then you also mentioned these, these templates. So, if I understand correctly, this is kind of like how a lot of web frameworks will have, partials, I guess, where you'll, you'll be able to have a webpage, but it's made up of different I don't know if you would call them components, but you're able to build a full page that's made up of a bunch of different pieces. So you could have a [00:19:31] Megan: Yeah Yeah that's a good analogy. [00:19:33] Jeremy: Where it's like, here's my table of contents, or here's my info box, or things like that. And those are all things that you would create a MediaWiki template for, and then somehow the, the data gets passed into those templates and the template decides how to, to render it out. [00:19:55] Megan: Yeah. [00:19:56] Jeremy: And for these, these templates, I, I noticed on some of the Leaguepedia pages, I noticed there's some html in some of them. I was curious if that's typical to write them with HTML or if there are different ways native to Media Wiki for, for, creating these templates. [00:20:23] Megan: Um, it depends on what you're doing. MediaWiki has a special syntax for tables specifically. I would say that it's not necessarily recommended to use the special syntax because occasionally you can get things to not work out fantastically if people slightly break things. But it's easier to use it. So if you know that everything's going to work out perfectly, you can use it. and it's a simple shortcut. if you go to the help page about tables on Wikipedia, everything is explained, and not all HTML works, um, for security reasons. So there's like a list of allowed, things that you can use, allowed tags, so you can't put like forms and stuff natively, but there's the widgets extension that you can use and widgets just automatically renders all html that you put inside of a widget. Uh, and then the security layer there is that you have to have a special permission to edit a widget. so, you only give trusted people that permission and then they can put the whatever html they want there. So, we have a few forms on Leaguepedia that are there because I edited, uh, whichever widgets, and then put the widgets into a Lua module and then put the Lua module into a template and then put the template onto the page. I was gonna say, it's not that complicated. It's not as complicated as it sounds, but I guess it really is as complicated as it sounds (laughs) . Um, so, uh, I, I won't say that. I don't know how standard it is on other wikis to use that much html, I guess Leaguepedia is pretty unique in how complicated it is. There aren't that many wikis that do as many things as we did there. but tables are pretty common. I would say like putting divs places to style them, uh, is also pretty common. but beyond that, usually there's not too many HTML elements just because you typically wanna be mobile friendly and it's relatively hard to stay mobile friendly within the bounds of MediaWiki if you're like putting too many elements everywhere. And then also allowing users to put whatever content inside of them that they want. The reason that we were able to get away with it is because despite the fact that we had so many editors, our content was actually pretty limited. Like if there's a bracket, it's only short team names going into it. So, and short team names were like at most five or six characters long, so we don't have to worry about like overflow of team names. Although we designed the brackets to support overflow of team names, and the team names would wrap around and the bracket would not break. And a lot of CSS Magic went into making that work that, we worked really hard on and then did not end up using (laughsz) [00:23:39] Jeremy: Oh no. [00:23:41] Megan: Only short team names go into brackets. But, that's okay. uh, and then for example, like in, uh, schedules and stuff, a lot of fields like only contain numbers or only contain timestamps. there's like a lot of tables again where like there's only two digit numbers with one decimal point and stuff like that. So a lot of the stuff that I was designing, I knew the content was extremely constrained, and if it wasn't then I said, well, too bad. This is how I'm telling you to put the content . Um, and for technical reasons, that's the content that's gonna go here and I don't care. so there's like, A lot of understanding that if I said for technical reasons, this is how we have to do it. Then for technical reasons, that was how we had to do it. And I was very lucky that all of the people that I worked with like had a very big appreciation with like, for technical reasons, like argument over. This is what's happening. And I know that with like different people on staff, like they would not be willing to compromise that way. Um, so I always felt like extremely lucky that like if I couldn't figure out a way to redesign or recode something in order to be more flexible, then like that would just be respected. And that was like how we designed something. But in general, like it's, if you are not working with something as rigid as, I mean, and like the history of eSports sounds like a very fluid thing, but when you think about it, like it's mostly names of teams, names of players and statistics. There's not that much like variable stuff going on with it. It's very easy to put in relational databases. It's very easy to put in fixed width tables. It's very easy to put in like charts that look the same on every single page. I'm not saying. It was always easy to like write everything that I wrote, and it's not, it wasn't always easy to like, deal with designs and stuff, but like relative to other topics that you can pick, it was much easier to put constraints on what was going to go where because everything was very similar across regions, across, although actually one thing. Okay, so this will be like the, the exception that proves the rule. uh, we would trans iterate players' names when we, showed them in team rosters. So, uh, for example, when we were showing the hangul, the Korean player's names, we would show an English translation also. Um, and we would do this for every single alphabet. but Hungarian players' names are really, really, really long. And so the transliteration doesn't fit in the table when we show the translation to the Roman alphabet. And so we couldn't do this, so we actually had to make a cargo table. Of alphabets that are allowed to be transliterated into the Roman alphabet, uh, when we have players names in that alphabet. So we had, like, hangul was allowed and Arabic was allowed, and I can't remember the exact list, but we had like three alphabet, three or four alphabets were allowed and the rest of the alphabets were dis allowed to be transliterate into, uh, the Roman alphabet. and so again, we made up a rule that was like a hard rule across the entire Wiki where we forced the set of alphabets that were transliterated so that this tables could be the same size roughly across every single team page because these Hungarian player names are too long (laughs) So I guess even this exception ended up being part of the rule of everything had to be standardized because these tables were just way too wide and they were running into the info box. They couldn't fit on the side. so it's really hard when you have like arbitrary user entered content to fit it into the HTML that you design. And if you don't have people who all agree to the same standards, I mean, Even when we did have people who agreed to all of the same standards, it was really, really, really hard. And we ended up having things like a table of which alphabets to transliterate. Like that's not the kind of thing that you think you're going to end up having when you say, let's catalog the history of League of Legends eSports, [00:28:40] Jeremy: And, and so when, let's say you had a language that you couldn't trans iterate, what would go into the table. [00:28:49] Megan: uh, just the native alphabet. [00:28:51] Jeremy: Oh I see. Okay. [00:28:53] Megan: Yeah. And then if they went to the player page, then you would be able to see it transliterated. But it wouldn't show up on the team page. [00:29:00] Jeremy: I see. And then to help people visualize what some of these things you're talking about look like when you're talking about a, a bracket, it's, is it kind of like a tree structure where you're showing which teams are facing which teams and okay, [00:29:19] Megan: We had a very cool, CSS grid structure that used like before and after pseudo elements to generate the lines, uh, between the teams and then the teams themselves were the elements of the grid. Um, and it's very cool. Uh, I didn't design it. Um, I have a friend who I very, very fortunately have a friend who's amazing at CSS because I am like mediocre at css and she did all of our CSS for us. And she also like did most of our designs too. Uh, so the Wiki would not be like anything like what it is without her. [00:30:00] Jeremy: And when you're talking about making sure the designs fit on desktop and, and mobile, um, I think when you were talking earlier, you're talking about how you have these, these templates to build these tables and the, these, these brackets. Um, so I guess in which part of the wiki is it ensuring that it looks different or that it fits when you're working with these different screen sizes [00:30:32] Megan: Usually it's a peer CSS solution. Every now and then we hide an element on mobile altogether, and some of that is actually MediaWiki core, for example, in, uh, nav boxes don't show up on mobile. And that's actually on Wikipedia too. Uh, well, I guess, yeah. I mean, being MediaWiki core, So if you've ever noticed the nav boxes that are at the bottom of pages on Wikipedia, just don't show up on like en.m.wikipedia.org. and that way you're not like loading, you're not loading, but display noneing elements on mobile. but for the most part it's pure CSS Solutions. Um, so we use a lot of, uh, display flex to make stuff, uh, appropriate for mobile. Um, some media roles. sometimes we display none stuff for mobile. Uh, we try to avoid that because obviously then mobile users aren't getting like the full content. Occasionally we have like overflow rules, so you're getting scroll bars on mobile and then every now and then we sort of just say, too bad if you're on mobile, you're gonna have not the greatest solution or not the greatest, uh, experience. that's typically for large data tables. so the general belief at fandom was like, if you can't make it a good experience on mobile, don't put it on the Wiki. And I just think that's like the worst philosophy because like then no one gets a good experie. And you're just putting less content on the Wiki so no one gets to enjoy it, and no one gets to like use the content that could exist. So my philosophy has always been like the, the, core overview pages should be, as good as possible for both PC and mobile. And if you have to optimize for one, then you slightly optimize for mobile because the majority of traffic is mobile. but attempt not to optimize for either one and just make it a good experience on both. but then the pages behind that, I say behind because we like have tabs views, so they're like sort of literally behind because it looks like folders sort of, or it looks like the tabs in a folder and you can, like, I, I don't know, it, it looks like it's behind (laughs) , the, the more detailed views where it's just really hard to design for mobile and it's really easy to design for pc and it just feels very unlikely that users on mobile are going to be looking at these pages in depth. And it's the sort of thing. A PC user is much more likely to be looking at, and you're going to have like multiple windows open and you're gonna be tapping between them and you're gonna be doing all of your research at PC. You absolutely optimize this for PC users. Like, what the hell this is? These are like stats pages. It's pages and pages and pages of stats. It's totally fine to optimize this for PC users. And if the option is like, optimized for PC users or don't create it at all, what are you thinking To not create it at all, like make it a good experience for someone? So I don't, I don't understand that philosophy at all. [00:34:06] Jeremy: Did you, um, have any statistics in terms of knowing on these types of pages, these pages that are information dense or have really big tables? Could you tell that? Oh, most of the people coming here are on computers or, or larger screens. [00:34:26] Megan: I didn't have stats for individual pages. Um, mobile I accidentally lost Google Analytics access at some point, and honestly I wasn't interested enough to go through the process of trying to get it back. when I had it, it didn't really affect what I put time into, because it was, it was just so much what I expected it to be. That it, it didn't really affect much. What I actually spent the most time on was looking, so you can, uh, you get URLs for search results. And so I would look through our search results, and I would look at the URL of the failed search results and, so there would be like 45 results for this particular failed search. And then I would turn that into a redirect for what I thought the target was supposed to be. So I would make sure that people's failed searches would actually resolve to the correct thing. So if they're like typo something, then I make the typo actually resolve. So we had a lot of redirects of like common typos, or if they're using the wrong name for a tournament, then I make the wrong name for the tournament resolve. So the analytics were actually really helpful for that. But beyond that, I, I didn't really find it that useful. [00:35:48] Jeremy: And then when you're talking about people searching, are these people using a search box on the Wiki itself And not finding what they were looking for? [00:36:00] Megan: Yeah. So like the internal search, so like if you search Wikipedia for like New York City, but you spell it C I Y T, , then you're not going to get a result. But it might say, did you mean New York City t y? If like 45 people did that in one month, then that would show up for me. And then I don't want them to be getting, like, that's a bad experience. Sure. They're eventually getting there, but I mean, I don't want them to have to spend that extra time. So I'm gonna make an automatic redirect from c Y T to c i t Y [00:36:39] Jeremy: And, and. Maybe we should have talked about this a little earlier, but the, all the information on Leaguepedia is, it's about all of the different matches and players, um, who play League of Legends. so when you edit a, a page on Wikipedia, all of that information, or a lot of it I think is, is hand entered by, by people and on Leagueapedia, which has all this information about like what, how teams did in a tournament or, intricate stats about how a game went. That seems like a lot of information for someone to be hand entering. So I was wondering how much of that information is somebody actually manually editing those things and how much is, is done automatically or programmatically. [00:37:39] Megan: So it's mostly hand entered. We do have a little bit of it that's automated, via a couple scripts, but for the most part it's hand entered. But after being handed, entered into a couple of data pages, it gets propagated a lot of times based on a bunch of Lua modules and the cargo extension. So when I originally joined the Wiki back in 2014, it was hand entered. Not just once, but probably, I don't know, seven times for tournament results and probably 10 or 12 times for roster changes. It was, it was a lot. And starting in 2017, I started rewriting all of the code so that it was entered exactly one time for everything. Tournament results get entered one time into a data page and roster changes get entered one time into a data page. And, for roster changes, that was very difficult because, for a roster change that needs to update the team history on a player page, which goes, from a join to a leave and it needs to update the, the like roster, change portal for the off season, which goes from a leave to a join because it's showing like the deltas over the off season. And it needs to update the current team in the, player's info box, which means that the current team has to be calculated from all of the deltas that have ever occurred in that player's history and it needs to update. Current rosters in the team pages, which means that the team page needs to know all of the current players who are currently on the team, which again, needs to know all of the deltas from all of history because all that you're entering is the roster changes. You're not entering anyone's current team. So nowhere on the wiki does it ever store a current team anymore. It only stores the roster changes. So that was a lot of code to write and deciding even what was going to be entered was a lot because, all I knew was that I was going to single source of truth that somehow and I needed to decide what was I going to single source of truth. So I decided, um, that I was going to be this Delta and then deciding what to do with that, uh, how to store it in a relational database. It was, it was a big project. and I didn't have a background as a developer either. so this was like, I don't know, this was like my third big project ever. So, that was, that was pretty intense. but it was, it was a lot of fun. so it is hand entered but I feel like that's underselling it a little bit. [00:40:52] Jeremy: Yeah, cuz I was initially, I was a little confused when you mentioned how somebody might need to enter the same information multiple times. But, if I understood correctly, it would be if somebody's changing which team they're on, they would have to update, for example, the player's page and say like, oh, this player is on this team now. And then you would have to go to their old team and remove them from the roster there. Go to the new team, add them to the roster there, And you can see where it would kind [00:41:22] Megan: Yeah. And then there's the roster, there's the roster nav box, and there's like the old team, you have to say, like the next team. Cuz in the previous players list, like we show former team members from the old team and you have to say like the next team. Uh, so if they had like already left their old team, you'd have to say like, new team. Yeah, there's a, there's a lot of, a lot of places. [00:41:50] Jeremy: And so now what it sounds like is, I'm not sure this is exactly how it works, but if you go to any location that would need that information, which team is this player on? When you go to that page, for example, if you were to go to, uh, a teams page, then it would make a SQL query to figure out I guess who most recently had a, I forget what you called it, but like a join row maybe, or like a, they, they had the action of joining this team, and now, now there's a row in the database that says they did this. [00:42:30] Megan: it actually looks at the ten-- so I have an in in between table called tenures. And so it looks at the tenures table instead of querying all the way through the joins and leaves table and doing like the whole list of deltas. yeah. So, and it's also cached so you, it doesn't do the SQL query every time that you load the page. So the only time that the SQL queries actually happen is if you do a save on the page. And then otherwise the entire generated HTML of the page is actually cached on the server. So you're, you're not doing that many database queries every time you load the page, so don't worry about that. but there, there can actually be something like a hundred SQL queries sometimes, when you're, saving a page. So it would be absolute murder if you were doing that every time you went to the page. But yeah, it works. Something like that. [00:43:22] Jeremy: Okay, so this, this tenures table is, that's kind of like what's the current state of all these players and where they are, and then. [00:43:33] Megan: Um, the, the tenures table, caches sort of, or I guess the tenure table captures is a better word than caches um, every, join to leave historically from every team. Um, and then I save that for two reasons. The first one is so that I don't have to recompute it, uh, when I'm doing the team's table, because I have to know both the current members and the former members. And then the second reason is also that we have a public api and so people can query that. if they're building tools, like a lot of people use the public api, uh, for various things. And, one person built like, sort of like a six degrees of Kevin Bacon except for League of Legends, uh, using our tenures tables. So, part of the reason that that exists is so that uh, people can use it for whatever projects that they're doing. Cause the join, the join leave table is like pretty unfriendly and I didn't wanna have to really document that for anyone to use. So I made tenures so that that was the table I could document for people to use. [00:44:39] Jeremy: Yeah. That, that's interesting in that, yeah, when you provide an api, then there's so many different things people can do that even if your wiki didn't really need it, they can build their own apps or their own pages built on all this information you've aggregated. [00:44:58] Megan: Yeah. It's nice because then when someone says like, oh, can you build this as a feature request? I can say no, but you can (laughs) [00:45:05] Jeremy: Well you've, you've done the, the hard part for them (laughs) [00:45:09] Megan: Yeah. exactly. [00:45:11] Jeremy: So that's cool. Yeah. that's, that's interesting too about the, the caching because yeah, I guess when you think about a wiki, most of the people who are visiting it are just visiting it to see what's on there. So the, provided that they're not logged in and they don't need anything specific to them. Yeah, you should be able to cache the whole response. It sounds like. [00:45:41] Megan: Yeah. yeah. Caching was actually a nightmare with this in this particular thing. the, the team roster changes, because, so cargo, which I mentioned a couple times is the database extension that we used. Um, and it's basically a SQL wrapper that like, doesn't port 80% of the features that SQL has. so you can create tables and you can query, but you can't make, uh, like sub-select queries. So your queries have to be like very simple. which is good for like most users of MediaWiki because like the average MediaWiki user doesn't have that much coding experience, but if you do have coding experience, then you're like, what, what, what am I doing? I can't, I can't do anything. Um, but it's a very powerful tool, still compared to most of what you could do with Media Wiki without this, basically you're adding a database layer to your software stack, which I mean, I, I, that's what you're doing, (laughs) Um, so you get a huge amount of power from adding cargo to a wiki. Um, in exchange it's, it's very performance. It's like, it's, it, it's resource heavy. uh, it hurts your performance a lot. and if you don't need it, then you shouldn't use it. But frequently you need it when you're doing, difficult or not necessarily difficult, but like intensive things. Um, anytime that you need to pull data from one page to another, you wanna use something like that. Um, So cargo, uh, one of the things that it doesn't do is it doesn't allow you to, uh, set a primary key easily. so you have to like, just like pretend that one row in the table is your primary key, basically. it internally automatically sets one, but it won't be static or it won't be the same every time that you rebuild the table because it rebuilds the table in a random order and it just uses an auto increment primary key. So you set a row in the table to pretend to be your ran, to pretend to be your primary key. But editors don't know what, your editors don't understand anything about primary keys. And you wanna hide this from them completely. Like, you cannot tell an editor, protect this random number, please don't change this. So you have to hide it completely. So if you're making your own auto increment, like an editor cannot know that that exists. Like this is back to when we were talking about like visual editor. This is like, one of the things about making the wiki safe for people is like not exposing them to the internals of like, anything scary like that. So for example, if an editor accidentally reorders two rows and your roster change data like that has to not matter. Because that can't break the entire wiki. They, you can't make an editor like freak out because they just reordered two rows in, in the page. And you can't put like a scary notice somewhere saying, under no circumstances reorder two rows here. Like, that's gonna scare people away. And you wanna be very welcoming and say like, it's impossible to break this page no matter how hard you tried. Don't worry. Anything you do, we can just fix it. Don't worry. But the thing is that everything's going to be cached. And so in particular, um, when I said I made that tenures table, one thing I did not wanna do was resave every single row from the join leave table. So you had to join back to, sorry, I'm going to use, join in two different connotations. you had to join back to the join leave table in order to get like all of the auxiliary data, like all of the extra columns, like, I don't know, like role, date, team name and stuff. Because otherwise the tenures table would've had like 50 columns or something. So I needed to store the fake primary key in the tenures table, but the tenures table is cached on the player page and the join leave table is on the data page, which means that I need to purge the cache on the player page anytime that someone edits the data on the data page. Which means that, so there's like some JavaScript that does that, but if someone like changes the order of the lines, then that primary key is going to change because I have an auto increment going on. And so I had to like very, very carefully pick a primary key here so that it was literally impossible for any kind of order change to affect what the primary key was so that the cash on the player page wasn't going to be changed by anything that the editor did in unless they were going to then update the cash on that player page after making that change. If that makes sense. So after an editor makes a change on the news page, they're going to press a button to update the cache on the player page, but they're only going to update the player page for the one line that they change on the news page. These, uh, primary keys had to be like super invariant for accidental row moves, or also later on, like entire moves of separating a bunch of these data pages into like separate subpages because the pages were getting too big and it was like timing out the server because there were too many stores to the database on a single page every time you save the page. And anyway, it took me like five iterations of making the primary key like more and more specific to the single line because my auto increment was like originally including every single line I was auto incrementing and then I auto incremented only when that single player was was involved. And then I auto incremented only when that player and the team was involved. And then I reset the auto increment for that date. So, and it was just got like more and more convoluted what my primary key was. It was, it was a mess. Anyway, this is just like another thing when you're working with volunteers who don't know what's going on and they're editing the page and they can contribute content, you have to code for the editor and not code for like minimizing complexity, The editor's experience matters more than the cleanliness of your code base, and you just end up with these like absolute messes that make no sense whatsoever because the editor's experience matters and you always have to code to the editor. And Media Wiki is all about community, and the editor just becomes part of the software and part of the consideration of your code base, and it's very, very different from any other kind of development because they're like, the UX is just built so deeply into how you're developing. [00:53:33] Jeremy: if I am following correctly, when I, when I think of using SQL when you were first talking about cargo and you were talking about how you make your own tables, and I'm envisioning the, the columns and the rows and, it's very common for the primary key to either be auto incrementing or some kind of GUID But then if I understood correctly, I think what you were saying is that anytime an editor makes changes to the data, it regenerates the whole table. Is that did I get that right? [00:54:11] Megan: It regenerates all of the rows on that page. [00:54:14] Jeremy: and when you talk about this, these data pages, there's some kind of media wiki or cargo specific markup where people are filling in what is going to go into the rows. And the actual primary key that's in MySQL is not exposed anywhere when they're editing the data. [00:54:42] Megan: That's right [00:54:44] Jeremy: And so when you're talking about trying to come up with a primary key, um, I'm trying to, I guess I'm trying to picture [00:54:57] Megan: So usually I do page name underscore an auto increment. But then if people can rearrange the rows which they do because they wanna get the rows chronological, but some people just put it at the top of the page and then other people are like, oh my God, it's not chronological. And then they fix it and then other people are like, oh my God, you messed up the time zone. And then they rearrange it again. Then, I mean, normally I wouldn't care because I don't really care like what the primary key is. I just care that it exists. But then because I have it cached on these player pages, I really, really do care what the primary key is. And because I need the primary key to actually agree with what it is on the data page, because I'm actually joining these things together. and people aren't going to be updating the cache on the player page if they don't think that they edited the row because rearranging isn't actually editing and people aren't going to realize that. And again, this is burden of knowledge. People can't, I can't make them know that because they have to feel safe to make any edits. It's bad enough that they have to know that they have to click this button to update the cache after making an edit in the first place. so, the auto increment isn't enough, so it has to be like an auto increment, but only within the set of rows that incorporate that one player. And then rearranging is pretty safe because they'd have to rearrange two pieces of news, including the same player. And that's really unlikely to happen. It's really unlikely that someone's going to flip the order of two pieces of news that involve the same player without realizing that they're actually are editing that single player except maybe they are. So then I include the team in that also. So they'd have to rearrange two pieces of news, including the same player and the same team. And that's like unlikely to happen in the first place. And then like, maybe a mistake happens like once a year. And at the end of the day, the thing that really saves us is that we're a wiki. We're not an official source. And so if we have a mistake once a year, like no one cares really. So we're not going for like five nines or anything. We're going for like, you know, two (laughs) . Um, so [00:57:28] Jeremy: so [00:57:28] Megan: We were having like mistakes constantly until I added player and team and date to the set of things that I was auto incrementing against. and once I got all of those, it was pretty stable. [00:57:42] Jeremy: And for the caching part, so when you're making a cargo query or a SQL query on one page and it needs to join on or get data from another page, it goes to this cache that you have instead of going directly to the actual table in the database. And the only way to get the right data is for the editor to click this button on the website that tells it to update the cache did I get that right? [00:58:23] Megan: Not quite. So it, well, or Yes, you did sort of, it goes to the actual table. The issue here is that, the table was last updated, the last time that a page was saved. And the last time the data got saved was the last time that the page that contains the parser function that generates those rows got saved. So, let me say that again. So, some of the data is being saved from the data page where the users manually enter it, and that's fine because the only time that gets updated is when the users manually enter it and then the page gets saved. But then these tenures tables are stored by my lua code on the player pages, and those aren't going to get updated unless the player page gets blank edited or null edited, or a save action happens from the player page. And so the way to make a, an edit happen from the player page is either to manually go there and click edit, and then click save, which is called a blank edit because. Blank edited, you didn't do anything but you pressed save or to use my JavaScript gadget, which is clicking a button from the data page that just basically does that for you using the api. And then that's going to update the table and then the database table, because that's where the, the cargo parser function is that writes to the database and updates the tables there. with the information, Hey, the primary key changed, because that's where the parser function is physically located in the wiki because one of them is on the data page and one of them is on the player page. So you get this disconnect in the cache where it's on two different pages and so you have to press a save action in both of them before the table is consistent again. [01:00:31] Jeremy: Okay. It be, it's, so this is really all about the tenure table, which the user will never mod or the editor will never modify directly. You need your code running on the data page and the player's page to run, to update the The tenure table? [01:00:55] Megan: Yeah, exactly. [01:00:57] Jeremy: yeah, it's totally hidden that this exists to the editor, but it's something that, that you as the person who put this all together, um, have to always be aware of, yeah. [01:01:11] Megan: Right. So there was just so many things like this, where you just had to press this one button. I call it refresh overview because originally it was on a tournament page and you had to press, the refresh overview button to purge the cache on the overview page of the tournament. after editing the data and you would refresh, overview, to deal with this cache lag. And everyone knew you have to refresh overview, otherwise none of your data entry is gonna like, be worth anything because it's not, the cache is just gonna lag. but every editor learned, like if there's a refresh overview button, make sure you press the refresh overview button, , otherwise nothing's gonna happen. Um, and there is just like tons of these littered across the Wiki. and like to most people, it just like, looks like a simple little button, but like so many things happen when you press this button. so it is, it is very important. [01:02:10] Jeremy: Are there, no ways inside of media wiki to if somebody edits one page, for example, to force it to go and, do, I forget what you called it, like a blank save or blank edit on another page? [01:02:27] Megan: So that wouldn't even really work because, we had 11,000 player pages. And you don't know which one the user just edited. so it, it's unclear to MediaWiki what just happened when the user just edited some part of the data page. and like the whole point here is that I can't even blank edit every single player page that the data page links to because the data page probably links to, I don't know, 200 different player pages. So I wanna link, I wanna blank it like the five that this one news line links to. so I do that, through like HTML attributes, in the JavaScript, [01:03:14] Jeremy: Oh, so that's why you're using JavaScript so that you can tell what the person edited because there isn't really a way to know natively in, in MediaWiki. what just changed? [01:03:30] Megan: there's like a diff so I could, like, MediaWiki knows the characters that got changed, but it doesn't really know like semantically what happened. So it doesn't know, like, oh, a link to this just got edited and especially because, I mean it's like templates that got edited, not really like the final HTML or anything. So Media Wiki has no idea what's going on. so yeah, so the JavaScript, uh, looks at the HTML attributes and then runs a couple API queries, and then the blank edits happen and then a couple purges after that so that the cache gets purged after the blank edit. [01:04:08] Jeremy: Yeah. So it, it seems like on these Wiki pages, you have the html, you have the CSS you have the ability to describe these data pages, which I, I guess in the end, end up being rows in in SQL. And then finally you have JavaScript. So it kind of seems like you can do almost everything in the context of a a Wiki page. You have so many, so many of these tools at your, at your disposal. [01:04:45] Megan: Yeah. Except write es6 code. [01:04:48] Jeremy: Oh, still, still only es5. [01:04:52] Megan: Yeah, [01:04:52] Jeremy: Oh no. do, do you know if that's something that they are considering changing or [01:05:01] Megan: There's a Phabricator ticket open. [01:05:05] Jeremy: How, um, how, how many years? [01:05:06] Megan: It has a lot of comments, oh a lot of years. I think it's since like 2014 or something [01:05:14] Jeremy: Oh yeah. I, I guess the, the one maybe, well now now the browsers all, all support es6, but I, I guess one of the things, it sounds like media wiki, maybe side stepped is the whole, front end ecosystem in, in terms of node packages and build tools and things like that. is, is that right? It's basically you can write JavaScript and there, yeah, [01:05:47] Megan: You can even write jQuery. [01:05:49] Jeremy: Oh, okay. That's built in as well. [01:05:52] Megan: Yeah .So I have to admit, like my, my front end knowledge is like a decade out of date or something because it's like what MediaWiki can do and there's like this entire ecosystem out there that I just like, don't have access to. And so I like barely know about. So I have this like side project that uses React that I've like, kind of sort of been working on. And so like I know this tiny little bit of react and I'm like, why? Why doesn't MediaWiki do this? Um, they are adding Vue support. So in theory I'll get to learn vue so that'll be fun. [01:06:38] Jeremy: So I'm, I'm curious, just from the limited experience you've had, outside of, MediaWiki, are, are there like specific things, uh, in your experience working with React where you're, you really wish you had in inside of Media Wiki? [01:06:55] Megan: Well, really the big thing is like es6, like I really wish we could use arrow functions , like that would be fantastic. Being able to build components would be really nice. Yeah, we can't do that. [01:07:09] Jeremy: I, I suppose you, you've touched a little bit on performance before, but I, I guess that's one thing about Wikis is that, putting what's happening in the back end, aside the, the front end experience of Wikis, they, they feel pretty consistent since they're generally mostly server rendered. And the actual JavaScript is, is pretty light, at least from, from Wikis I've seen. [01:07:40] Megan: Yeah. I mean you can add as much JavaScript as you want, so I guess it depends on what the users decide to do. But it's, it's definitely true that wikis tend to load faster than some websites that I've seen. [01:07:54] Jeremy: Yeah, I mean, I guess when you think of a wiki, it's, you're there cuz you wanna get specific information and so the goal is not to necessarily reproduce like some crazy complex app or something. It's, It's, to get you the, the, information. Yeah. [01:08:14] Megan: Yeah. No, that's actually one thing that I really like about Wikis also is that you don't have the pressure to make them look nice. I know that some people are gonna hear that and just like, totally cringe and be like, oh my God, what is she saying? ? Um, but it's actually really true. Like there's an aesthetic that Wikis and Media Wiki in particular have, and you kind of stick to that. And within that aesthetic, I mean, you make them look as nice as you can. Um, and you certainly don't wanna like, make them deliberately ugly, but there's not a pressure to like go over the top with like marketing and branding and like, you know, you, you just make them look reasonably nice. And then the focus is on the information and the focus is on making the information as easy to understand as possible. And a wiki that looks really nice is a wiki that's very understandable and very intuitive, and one where you. I mean, one, that the information is the joy and, you know, not, not the presentation, I guess. So it's like the presentation of the information instead of the presentation of the brand. so I, I really appreciate that about wikis. [01:09:30] Jeremy: Yeah, that's a good point about the aesthetics in the sense of like, they have a certain look and yeah, maybe it's an authoritative look, , which, uh, is interesting cuz it's, like a, a wiki that I'll, I'll commonly go to for example, is there's the, the PC gaming Wiki. And when you look at how it's styled, it feels like very dated or it doesn't look like, I guess you could say normal webpages, but it's very much in line with what you expect a wiki to look like. So it's, it's interesting how they have that, shared aesthetic, I guess. [01:10:13] Megan: Yeah. yeah. No, I really like it. The Wiki experience, [01:10:18] Jeremy: We, we kind of touched on this near the beginning, but sometimes when. I would see wikis and, and projects like Leaguepedia I would kind of wonder, you know, what's the decision between or behind it being a wiki versus something being like a custom CMS in, in the case of Leaguepedia but, you know, talking to you about how it's so, like wikis are structured so that people can contribute. and then like you were saying, you have like this consistent look that brings the data to the user. Um, I actually, it gives me a better understanding of why so many people choose wikis as, as ways to present this information. [01:11:07] Megan: Yeah, a a lot of people have asked me over the years why, why MediaWiki when it always feels like I'm jumping through so many hoops. Um, I mean, when I just described the caching thing to you, and that's just like one of, I don't know, dozens of struggles that I've had where, MediaWiki has gotten in the way of what I need to do. Because really Leaguepedia is an entire software layer on top of MediaWiki, and so you might ask why. Why MediaWiki? Why not just build the software layer on top of something easier? And my answer is always, it's about the community. MediaWiki lends itself so well to community and people enjoy contributing to wikis and wikis. Wikis are just kind of synonymous with community, and they always have been. And Wikipedia sort of set the example when they launched, and it's sort of always been that way. And, you know, I feel like I'm a part of a community when I say a Wiki. And if it was just if it were a custom site that had the ability to contribute to it, you know, it just feels like it's not the same. [01:12:33] Jeremy: I think just even seeing the edit button on Wikis is such a different experience than having the expectation, well, I guess in the case of Leaguepedia, you do have to create an account, but even without creating the account, you can still click edit and you can look at the source and you can see how all this information, or a lot of it, how it got filled in. And I feel like it's kind of more similar to the earlier days of webpages where people could right click a site and click view source and then look at the HTML and the css, and kind of see how it was put together. versus, now with a lot of sites, the, the code has been minified or there's build tools involved so that when you look at view source on websites, it just looks crazy and you're not sure what's going on. So I, I, I feel like wikis in some ways are, kind of closer to the, the spirit of, like the earlier H T M L sites. Yeah. [01:13:46] Megan: And the knowledge transfers too. If you've edit, if you've, if you've ever edited Wikipedia, then you know that like open bracket, open bracket, closed bracket. Closed bracket is how you link a page. and that knowledge transfers to admittedly maybe a little bit less so for Leaguepedia, since there, you need to know how all the templates work and there's not so much direct source editing. it's mostly like clicking the CharInsert prefills. but there's still a lot of cross knowledge transfer, if you've edited one wiki and then change to editing another. And then it goes the other way too. If you edit Leaguepedia, then you want to go at it for the Zelda Wiki, that knowledge will transfer. [01:14:38] Jeremy: And, and talking about the community and the editors. I, I imagine on Wikipedia, most of the people editing are volunteers. Is it the same with Leaguepedia in your experience? [01:14:55] Megan: Um, yeah, so I was contracted, uh, or I was not contracted. My LLC was contract and then I subcontracted. Um, it changed a bit over the years, um, as people left. Uh, so at first I subcontracted quite a few people. Um, and then I guess, as you can imagine, as, there was a lot more data entry that had to be done at the start. And less had to be done later on, as I, expanded the code base so that it was more a single source of truth, and less stuff had to be duplicated. And I guess it was, it probably became a lot more fun too, uh, when you didn't have to edit, enter the same thing multiple times. but, uh, a bunch of people, uh, moved on over the years. and so by the end I was only subcontracting, three people. Um, and everyone else was volunteer. [01:15:55] Jeremy: And and the people that you were subcontracting, that was for only data entry, or was that also for the actual code? [01:16:05] Megan: No, that wasn't for data entry at all. Um, and actually that was for all of my wikis, uh, because I was. Managing like all of the eSports wikis. or on

Victor Adossi on Yak Shaving

Love Where You Are with Somer Colbert

Play Episode Listen Later Jan 2, 2023 110:47

Victor is a software consultant in Tokyo who describes himself as a yak shaver. He writes on his blog at vadosware and curates Awesome F/OSS, a mailing list of open source products. He's also a contributor to the Open Core Ventures blog. Before our conversation Victor wrote a structured summary of how he works on projects. I recommend checking that out in addition to the episode. Topics covered: Most people should use Dokku or CapRover But he uses Kubernetes anyways Hosting a Database in Kubernetes Learning technology You don't really know a thing until something goes wrong History of Frontend Development Context from lower layers of the stack and historical projects Good project pages have comparisons to other products Choosing technologies Language choice affects maintainability Knowing an ecosystem Victor's preferred stack Technology bake offs Posting findings means you get free corrections Why people use medium instead of personal sites Victor VADOSWARE - Blog How Victor works on Projects - Companion post for this episode Awesome FOSS - Curated list of OSS projects NimbusWS - Hosted OSS built on top of budget cloud providers Unvalidated Ideas - Startup ideas for side project inspiration PodcastSaver - Podcast index that allows you to choose Postgres or MeiliSearch and compare performance and results of each Victor's preferred stack Docker - Containers Kubernetes - Container provisioning (Though at the beginning of the episode he suggests Dokku for single server or CapRover for multiple) TypeScript - JavaScript with syntax for types. Victor's default choice. Rust - Language he uses if doing embedded work, performance is critical, or more correctness is desired Haskell - Language he uses if correctness and type system is the most important for the project Postgresql - General purpose database that's good enough for most use cases including full text search. KeyDB - Redis compatible database for caching. Acquired by Snap and then made open source. Victor uses it over Redis because it is multi threaded and supports flash storage without a Redis Enterprise license. Pulumi - Provision infrastructure with the languages you're already using instead of a specialized one or YAML Svelte and SvelteKit - Preferred frontend stack. Previously used Nuxt. Search engines Postgres Full Text Search vs the rest Optimizing Postgres Text Search with Trigrams OpenSearch - Amazon's fork of Elasticsearch typesense meilisearch sonic Quickwit JavaScript build tools Babel SWC Webpack esbuild parcel Vite Turbopack JavaScript frameworks React Vue Svelte Ember Frameworks built on top of frameworks Next - React Nuxt - Vue SvelteKit - Svelte Astro - Multiple Historical JavaScript tools and frameworks Underscore jQuery MooTools Backbone AngularJS Knockout Aurelia GWT Bower - Frontend package manager Grunt - Task runner Gulp - Task runner Related Links Dokku - Open source single-host alternative to Heroku Cloud Native Buildpacks - Buildpacks created by Heroku and Pivotal and used by Dokku CapRover - An open source PaaS-like abstraction built on top of Docker Swarm Kelsey Hightower's tweet about being cautious about running databases on Kubernetes Settling the Myth of Transparent HugePages for Databases Kubernetes Container Storage Interface (CSI) Kubernetes Local Persistent Volumes Longhorn - Distributed block storage for Kubernetes Postgres docs Postgres TOAST Everything I've seen on optimizing Postgres on ZFS Kubernetes Workload Resources Kubernetes Network Plugins Kubernetes Ingress Traefik Kubernetes the Hard Way (Setting up a cluster in a way that optimizes for learning) How does TLS work Let's Encrypt Cert manager for Kubernetes Choose Boring Technology A Linux user's guide to Logical Volume Management Docker networking overview Kubernetes Scheduler Tauri - Build desktop applications with web technology and Rust ripgrep - CLI tool to recursively search directory for a regex pattern (Meant to be a rust replacement for grep) angle-grinder / ag - CLI tool to parse and process log files written in rust Object.observe ECMAScript Proposal to be Withdrawn Ruby on Rails - Ruby web framework Django - Python web framework Laravel - PHP web framework Adonis - JavaScript NestJS - JavaScript What is a NullPointerException, and how do I fix it? Mastodon Clap - CLI argument parser for Rust AWS CDK - Provision AWS infrastructure using programming languages Terraform - Provision infrastructure with terraform language URL canonicalization of duplicate pages and the use of the canonical tag - Used by dev.to to send google traffic to the original blogpost instead of dev.to Transcript You can help edit this transcript on GitHub. [00:00:00] Jeremy: This episode, I talk to Victor Adossi who describes himself as a yak shaver. Someone who likes trying a whole bunch of different technologies, seeing the different options. We talk about what he uses, the evolution of front end development, and his various projects. Talking to just different people it's always good to get where they're coming from because something that works for Google at their scale is going to be different than what you're doing with one of your smaller projects. [00:00:31] Victor: Yeah, the context. Of course in direct conflict with that statement, I definitely use Google technology despite not needing to at all right? Like, you know, 99% of people who are doing like people like to call it indiehacking or building small products could probably get by with just Dokku. If you know Dokku or like CapRover. Are two projects that'll be like, Oh, you can just push your code here, we'll build it up like a little mini Heroku PaaS thing and just go on one big server, right? Like 99% of the people could just use that. But of course I'm not doing that. So I'm a bit of a hypocrite in that sense. I know what I should be doing, but I'm not doing that. I am writing a Kubernetes cluster with like five nodes for no reason. Uh, yeah, I dunno, people don't normally count the controllers. [00:01:24] Jeremy: Dokku and CapRover, I think those are where it's supposed to create a heroku like experience I think it's based off of the heroku buildpacks right? At least Dokku is? [00:01:36] Victor: Yeah Buildpacks has actually been spun out into like a community thing so like pivotal and heroku, it's like buildpacks.io, they're trying to build a wider standard around it so that more people can get involved. And buildpacks are actually obviously fantastic as a technology and as a a process piece. There's not much else like them and you know, that's obvious from like Heroku's success and everything. I know Dokku uses that. I don't know that Caprover does, but I haven't, I haven't really run Caprover that much. They, they probably do. Like at this point if you're going to support building from code, it seems silly to try and build your own buildpacks. Cause that's what you will do, eventually. So you might as well use what's there. Anyway, this is like just getting to like my personal opinions at this point, but like, if you think containers are a bad idea in 2022, You're wrong, you should, you should stop. Like you should, you should stop. Think about it. I mean, obviously there's not, um, I got a really great question at an interview once, which is, where are containers a bad idea? That's probably one of the best like recent interview questions I've ever gotten cause I was like, Oh yeah, I mean, like, you can't, it can't be perfect everywhere, right? Nothing's perfect everywhere. So it's like, where is it? Uh, and of course the answer was networking, right? (unintelligible) So if you need absolute performance, but like for just about everything else. Containers are kind of it at this point. Like, time has born it out, I think. So yeah, I always just like bias at taking containers at this point. So I'm probably more of a CapRover person than a Dokku person, even though I have not used, I don't use CapRover. [00:03:09] Jeremy: Well, like something that I've heard with containers, and maybe it's changed recently, but, but something that was kind of holdout was when people would host a database sometimes they would oh we just don't wanna put this in a container and I wonder if like that matches with your thinking or if things have changed. [00:03:27] Victor: I am not a database administrator right like I read postgres docs and I read the, uh, the Postgres documentation, and I think I know a bit about postgres but I don't commit right like so and I also haven't, like, oh, managed X terabytes on one server that you are making sure never goes down kind of deal. But the stickiness for me, at least from when I've run, So I've done a lot of tests with like ZFS and Postgres and like, um, and also like just trying to figure out, and I run Postgres in Kubernetes of course, like on my cluster and a lot of the stuff I found around is, is like fiddly kernel things like sort of base kernel settings that you need to have set. Like, you know, stuff like should you be using transparent huge pages, like stuff like that. But once you have that settled. Containers are just processes with name spacing and resource control, right? Like, that's it. there are some other ins and outs, but for the most part, if you're fine running a process, so people ran processes, right? And they were just completely like unprotected. Then people made users for the processes and they limited the users and ran the processes, right? Then the next step is now you can run a process and then do the limiting the name spaces in cgroups dynamically. Like there, there's, there's sort of not a humongous difference, unless you're hitting something very specific. Uh, but yeah, databases have been a point of contention, but I think, Kelsey Hightower had that tweet yeah. That was like, um, don't run databases in Kubernetes. And I think he called it back. [00:04:56] Victor: I don't know, but I, I know that was uh, was one of those things that people were really unsure about at first, but then after people sort of like felt it out, they were like, Oh, it's actually fine. Yeah. [00:05:06] Jeremy: Yeah I vaguely remember one of the concerns having to do with persistent storage. Like there were challenges with Kubernetes and needing to keep that storage around and I don't know if that's changed yeah or if that's still a concern. [00:05:18] Victor: Uh, I'd say that definitely has changed. Uh, and it was, it was a concern, depending on where you were. Mostly people who are running AKS or EKS or you know, all those other managed Kubernetes, they're just using EBS or like whatever storage provider is like offering for storage. Most of those people don't actually have that much of a problem with, storage in general. Now, high performance storage is obviously different, right? So like, so you'll, you're gonna have to start doing manual, like local volume management and stuff like that. it was a problem, because obviously CSI (Kubernetes Container Storage Interface) didn't exist for some period of time, and like there was, it was hard to know what to do for if you were just running a Kubernetes cluster. I think a lot of people were just using local, first of all, local didn't even exist for a bit. Um, they were just using host path, right? And just like, Oh, it's on the disk somewhere. Where do we, we have to go get it right? Or we have to like, sort of manage that. So that was something most people weren't ready for, especially if you were just, if you weren't like sort of a, a, a traditional sysadmin and used to doing that stuff. And then of course local volumes came out, but I think they still had to be, um, pre-provisioned. So that's sysadmin stuff that most people, you know, maybe aren't, aren't necessarily ready for. Uh, and then most of the general solutions were slow. So like, I used Longhorn (https://longhorn.io) for a long time and Longhorn, Longhorn's great. And super easy to set up, but it can be slower and you can have some, like, delays in mount time. it wasn't ideal for, for most people. So yeah, I, overall it's true. Databases, Databases in Kubernetes were kind of fraught with peril for a while, but it wasn't for the reason that, it wasn't for the fundamental reason that Kubernetes was just wrong or like, it wasn't the reason most people think of, which is just like, Oh, you're gonna break your database. It's more like, running a database is hard and Kubernetes hasn't solved all the hard problems. Like, cuz that's what Kubernetes does. It basically solves a lot of problems in a very generic way. Right. So it just hadn't solved all those problems yet at this point. I think it's got decent answers on a lot of them. So I, I mean, I don't know. I I do it. Don't, don't take what I'm saying to your, you know, PM meeting or your standup meeting, uh, anyone who's listening. But it's more like if you could solve the problems with databases in the sense before. You could probably solve 'em on Kubernetes now with a good understanding of Kubernetes. Cause at the end of the day, it's all the same stuff. Just Kubernetes makes it a little easier to, uh, do it dynamically. [00:07:50] Jeremy: It sounds like you could do it before, but some of the, I guess the tools or the ways of doing persistent storage were not quite there yet, or they were difficult to use. And so that was why people at the start were like, Okay, maybe it's not a good idea, but, now maybe there's some established practices for how you should run a database in Kubernetes. And I, I suppose the other aspect too is that, like you were saying, Kubernetes is its own thing. You gotta learn Kubernetes and all its intricacies. And then running a database is also its own challenge. So if you stack the two of them together and, and the path was not really clear then maybe at the start it wasn't the best idea. Um, uh, if somebody was going to try it out now, was there like a specific resource you looked at or a specific path to where like okay this is is how I'm going to do it. [00:08:55] Victor: I'll just say what I normally recommend to everybody. Cause it depends on which path you wanna go right? If you wanna go down like running a database path first and figure that out, fill out that skill tree. Like go read the Postgres docs. Well, first of all, use Postgres. That's the first tip there. But like, read those documents. And obviously you don't have to understand everything. You won't understand everything. But knowing the big pieces and sort of letting your brain see the mention of like a whole bunch of things, like what is toast? Oh, you can do compression on columns. Like, you can do some, some things concurrently. Um, you know, what ALTER TABLE looks like. You get all that stuff kind of in your head. Um, and then I personally really believe in sort of learning by building and just like iterating. you won't get it right the first time. It's just like, it's not gonna happen. You're get, you can, you can get better the first time, right? By being really prepared and like, and leave yourself lots of outs, but you kind of have to like, get it out there. Do do your best to make sure that you can't fail, uh, catastrophically, right? So this is like, goes back to that decision to like use ZFS as the bottom of this I'm just like, All right, well, I, I'm not a file systems expert, but if I. I could delegate some of that, you know, some of that, I can get some of that knowledge from someone else. Um, and I can make it easier for me to not fail catastrophically. For the database side, actually read documentation on Postgres or the whatever database you're going to use, make sure you at least understand that. Then start running it like locally or whatever. Again, Docker use, use Docker locally. It's, it's, it's fine. and then, you know, sort of graduate to running sort of more progressively, more complicated versions. what I would say for the Kubernetes side is actually similar. the Kubernetes docs are really good. they're very large. but they're good. So you can actually go through and know all the, like, workload, workload resources, know, like what a config map is, what a secret is, right? Like what etcd is doing in this whole situation. you know, what a kublet is versus an API server, right? Like the, the general stuff, like if you go through all that, you should have like a whole bunch of ideas at least floating around in your head. And then once you try and start setting up a server, they will all start to pop up again, right? And they'll all start to like, you, like, Oh, okay, I need a CNI (Container Networking) plugin because something needs to make the services available, right? Or something needs to power the ingress, right? Like, if I wanna be able to get traffic, I need an ingress object. But what listens, what does that, what makes that ingress object do anything? Oh, it's an ingress controller. nginx, you know, almost everyone's heard of nginx, so they're like, okay. Um, nginx, has an ingress control. Actually there's, there used to be two, I assume there's still two, but there's like one that's maintained by Kubernetes, one that's maintained by nginx, the company or whatever. I use traefik, it's fantastic. but yeah, so I think those things kind of fall out and that is almost always my first way to explain it and to start building. And tinkering iteratively. So like, read the documentation, get a good first grasp of it, and then start building yourself because you'll, you'll get way more questions that way. Like, you'll ask way more questions, you won't be able to make progress. Uh, and then of course you can, you know, hop into slacks or like start looking around and, and searching on the internet. oh, one of the things that really helped me out early learning Kubernetes was, Kelsey Hightower's, um, learn Kubernetes the hard way. I'm also a big believer in doing things the hard way, at least knowing what you're choosing to not know, right? distributing file system, Deltas, right? Or like changes to a file system over the network is not a new problem. Other people have solved it. There's a lot of complexity there. but if you at least know the sort of surface level of what the thing does and what it's supposed to do and how it's supposed to do it, you can make a decision on, Oh, how deep am I going to go? Right? To prevent yourself from like, making a mistake or going too deep in the rabbit hole. If you have an idea of the sort of ecosystem and especially like, Oh, here, like the basics of how I can use this thing, that's generally very good. And doing things the hard way is a great way to get a, a feel for that, right? Cause if you take some chunk and like, you know, the first level of doing things the hard way, uh, or, you know, Kelsey Hightower's guide is like, get a machine, right? Like, so, like, if you somehow were like, Oh, I wanna run a Kubernetes cluster. but, you know, I don't want use necessarily EKS and you wanna learn it the hard way. You have to go get a machine, right? If you, if you're not familiar, if you run on Heroku the whole time, like you didn't manage your own machines, you gotta go like, figure out EC2, right? Or, I personally use, hetzner I love hetzner, so you have to go figure out hetzner, digital ocean, whatever. Right. And then the next thing's like, you know, the guide's changed a lot, and I haven't, I haven't looked at it in like, in years, actually a while since I, since I've sort of been, I guess living it, but it's, it's like generate certificates, right? So if you've never dealt with SSL and like, sort of like, or I should say TLS uh, and generating certificates and how that whole dance works, right? Which is fascinating because it's like, oh, right, nothing's secure on the internet, except that we distribute root certificates on computers that are deployed in every OS, right? Like, that's a sort of fundamental understanding you may not go deep enough to realize, but if you are fascinated by it, trying to do it manually would lead you down that path. You'd be like, Oh, what, like what is this thing? What is a CSR? Like, why, who is signing my request? Right? And it's like, why do we trust those people? Right? And it's like, you know, that kind of thing comes out and I feel like you can only get there from trying to do it, you know, answering the questions you can. Right. And again, it takes some judgment to know when you should not go down a rabbit hole. uh, and then iterating. of course there are people who are excellent at explaining. you can find some resources that are shortcuts. But, uh, I think particularly my bread and butter has been just to try and do it the hard way. Avoid pitfalls or like rabbit holes when you can. But know that the rabbit hole is there, and then keep going. And sometimes if something's just too hard, you're not gonna get it the first time. Like maybe you'll have to wait like another three months, you'll try again and you'll know more sort of ambiently about everything else. You get a little further that time. that's how I feel about that. Anyway. [00:15:06] Jeremy: That makes sense to me. I think sometimes when people take on a project, they try to learn too many things at the same time. I, I think the example of Kubernetes and Postgres is pretty good example, where if you're not familiar with how do I install Postgres on bare metal or a vm, trying to make sense of that while you're trying to into is probably gonna be pretty difficult. So, so splitting them up and learning them individually, that makes a lot of sense to me. And the whole deciding how deep you wanna go. That's interesting too, because I think that's very specific to the person right because sometimes you wanna go a little deeper because otherwise you don't understand how the two things connect together. But other times it's just like with the example with certificates, some people they may go like, I just put in let's encrypt it gives me my cert I don't care right then, and then, and some people they wanna know like okay how does the whole certificate infrastructure work which I think is interesting, depending on who you are, maybe you go ahh maybe it doesn't really matter right. [00:16:23] Victor: Yeah, and, you know, shout out to Let's Encrypt . It's, it's amazing, right? think Singlehandedly the most, most of the deployment of HTTPS that happens these days, right? so many so many of like internet providers and uh, sort of service providers will use it right? Under the covers. Like, Hey, we've got you free SSL through Let's Encrypt, right? Like, kind of like under the, under the covers. which is awesome. And they, and they do it. So if you're listening to this, donate to them. I've done it. So now that, now the pressure is on whoever's listening, but yeah, and, and I, I wanna say I am that person as well, right? Like, I use, Cert Manager on my cluster, right? So I'm just like, I don't wanna think about it, but I, you know, but I, I feel like I thought about it one time. I have a decent grasp. If something changes, then I guess I have to dive back in. I think it, you've heard the, um, innovation tokens idea, right? I can't remember the site. It's like, um, do, like do boring tech or something.com (https://boringtechnology.club/) . Like it shows up on sort of hacker news from time to time, essentially. But it's like, you know, you have a certain amount of tokens and sort of, uh, we'll call them tokens, but tolerance for complexity or tolerance for new, new ideas or new ways of doing things, new processes. Uh, and you spend those as you build any project, right? you can be devastatingly effective by just sticking to the stack, you know, and not introducing anything new, even if it's bad, right? and there's nothing wrong with LAMP stack, I don't wanna annoy anybody, but like if you, if you're running LAMP or if you run on a hostgator, right? Like, if you run on so, you know, some, some service that's really old but really works for you isn't, you know, too terribly insecure or like, has the features you need, don't learn Kubernetes then, right? Especially if you wanna go fast. cuz you, you're spending tokens, right? You're spending, essentially brain power, right? On learning whatever other thing. So, but yeah, like going back to that, databases versus databases on Kubernetes thing, you should probably know one of those before you, like, if you're gonna do that, do that thing. You either know Kubernetes and you like, at least feel comfortable, you know, knowing Kubernetes extremely difficult obviously, but you feel comfortable and you feel like you can debug. Little bit of a tangent, but maybe that's even a better, sort of watermark if you know how to debug a thing. If, if it's gone wrong, maybe one or five or 10 or 20 times and you've gotten out. Not without documentation, of course, cuz well, if you did, you're superhuman. But, um, but you've been able to sort of feel your way out, right? Like, Oh, this has gone wrong and you have enough of a model of the system in your head to be like, these are the three places that maybe have something wrong with them. Uh, and then like, oh, and then of course it's just like, you know, a mad dash to kind of like, find, find the thing that's wrong. You should have confidence about probably one of those things before you try and do both when it's like, you know, complex things like databases and distributed systems management, uh, and orchestration. [00:19:18] Jeremy: That's, that's so true in, in terms of you are comfortable enough being able to debug a problem because it's, I think when you are learning about something, a lot of times you start with some kind of guide or some kind of tutorial and you follow the steps. And if it all works, then great. Right? But I think it's such a large leap from that to something went wrong and I have to figure it out. Right. Whether it's something's not right in my Dockerfile or my postgres instance uh, the queries are timing out. so many things that could go wrong, that is the moment where you're forced to figure out, okay, what do I really know about this not thing? [00:20:10] Victor: Exactly. Yeah. Like the, the rubber's hitting the road it's uh you know the car's about to crash or has already crashed like if I open the bonnet, do I know what's happening right or am I just looking at (unintelligible). And that's, it's, I feel sort a little sorry or sad for, for devs that start today because there's so much. Complexity that's been built up. And a lot of it has a point, but you need to kind of have seen the before to understand the point, right? So I like, I like to use front end as an example, right? Like the front end ecosystem is crazy, and it has been crazy for a very long time, but the steps are actually usually logical, right? Like, so like you start with, you know, HTML, CSS and JavaScript, just plain, right? And like, and you can actually go in lots of directions. Like HTML has its own thing. CSS has its own sort of evolution sort of thing. But if we look at JavaScript, you're like, you're just writing JavaScript on every page, right? And like, just like putting in script tags and putting in whatever, and it's, you get spaghetti, you get spaghetti, you start like writing, copying the same function on multiple pages, right? You just, it, it's not good. So then people, people make jquery, right? And now, now you've got like a, a bundled set of like good, good defaults that you can, you can go for, right? And then like, you know, libraries like underscore come out for like, sort of like not dom related stuff that you do want, you do want everywhere. and then people go from there and they go to like backbone or whatever. it's because Jquery sort of also becomes spaghetti at some point and it becomes hard to manage and people are like, Okay, we need to sort of like encapsulate this stuff somehow, right? And like the new tools or whatever is around at the same timeframe. And you, you, you like backbone views for example. and you have people who are kind of like, ah, but that's not really good. It's getting kind of slow. Uh, and then you have, MVC stuff comes out, right? Like Angular comes out and it's like, okay, we're, we're gonna do this thing called dirty checking, and it's gonna be, it's gonna be faster and it's gonna be like, it's gonna be less sort of spaghetti and it's like a little bit more structured. And now you have sort of like the rails paradigm, but on the front end, and it takes people to get a while to get adjusted to that, but then that gets too heavy, right? And then dirty checking is realized to be a mistake. And then, you get stuff like MVVM, right? So you get knockout, like knockout js and you got like Durandal, and like some, some other like sort of front end technologies that come up to address that problem. Uh, and then after that, like, you know, it just keeps going, right? Like, and if you come in at the very end, you're just like, What is happening? Right? Like if it, if it, if someone doesn't sort of boil down the complexity and reduce it a little bit, you, you're just like, why, why do we do this like this? Right? and sometimes there's no good reason. Sometimes the complexity is just like, is unnecessary, but having the steps helps you explain it, uh, or helps you understand how you got there. and, and so I feel like that is something younger people or, or newer devs don't necessarily get a chance to see. Cause it just, it would take, it would take very long right? And if you're like a new dev, let's say you jumped into like a coding bootcamp. I mean, I've got opinions on coding boot camps, but you know, it's just like, let's say you jumped into one and you, you came out, you, you made it. It's just, there's too much to know. sure, you could probably do like HTML in one month. Well, okay, let's say like two weeks or whatever, right? If you were, if you're literally brand new, two weeks of like concerted effort almost, you know, class level, you know, work days right on, on html, you're probably decently comfortable with it. Very comfortable. CSS, a little harder because this is where things get hard. Cause if you, if you give two weeks for, for HTML, CSS is harder than HTML kind of, right? Because the interactions are way more varied. Right? Like, and, and maybe it's one of those things where you just, like, you, you get somewhat comfortable and then just like know that in the future you're gonna see something you don't understand and have to figure it out. Uh, but then JavaScript, like, how many months do you give JavaScript? Because if you go through that first like, sort of progression that I, I I, I, I mentioned everyone would have a perfect sort of, not perfect but good understanding of the pieces, right? Like, why did we start transpiling at all? Right? Like, uh, or why did you know, why did we adopt libraries? Like why did Bower exist? No one talks about Bower anymore, obviously, but like, Bower was like a way to distribute front end only packages, right? Um, what is it? Um, Uh, yes, there's grunt. There's like the whole build system thing, right? Once, once we decide we're gonna, we're gonna do stuff to files before we, before we push. So there's grunt, there's, uh, gulp, which is like grunt, but like, Oh, we're gonna do it all in memory. We're gonna pipe, we're gonna use this pipes thing to make sure everything goes fast. then there's like, of course that leads like the insanity that's webpack. And then there's like parcel, which did better. There's vite there's like, there's all this, there's this progression, but how many months would it take to know that progression? It, it's too long. So they end up just like, Hey, you're gonna learn react. Which is the right thing because it's like, that's what people hire for, right? But then you're gonna be in react and be like, What's webpack, right? And it's like, but you can't go down. You can't, you don't have the time. You, you can't sort of approach that problem from the other direction where you, which would give you better understanding cause you just don't have the time. I think it's hard for newer devs to overcome this. Um, but I think there are some, there's some hope on the horizon cuz some things are simpler, right? Like some projects do reduce complexity, like, by watching another project sort of innovate so like react. Wasn't the first component, first framework, right? Like technically, I, I think, I think you, you might have to give that to like, to maybe backbone because like they had views and like marionette also went with that. Like maybe, I don't know, someone, someone I'm sure will get in like, send me an angry email, uh, cuz I forgot you Moo tools or like, you know, Ember Ember. They've also, they've also been around, I used to be a huge Ember fan, still, still kind of am, but I don't use it. but if you have these, if you have these tools, right? Like people aren't gonna know how to use them and Vue was able to realize that React had some inefficiencies, right? So React innovates the sort of component. So Reintroduces the component based model component first, uh, front end development model. Vue sees that and it's like, wait a second, if we just export this like data object, and of course that's not the only innovation of Vue, but if we just export this data object, you don't have to do this fine grained tracking yourself anymore, right? You don't have to tell React or tell your the system which things change when other things change, right? Like you, you don't have to set up this watching and stuff, right? Um, and that's one of the reasons, like Vue is just, I, I, I remember picking up Vue and being like, Oh, I'm done. I'm done with React now. Because it just doesn't make sense to use React because they Vue essentially either, you know, you could just say they learned from them or they, they realize a better way to do things that is simpler and it's much easier to write. Uh, and you know, functionally similar, right? Um, similar enough that it's just like, oh they boil down some of that complexity and we're a step forward and, you know, in other ways, I think. Uh, so that's, that's awesome. Every once in a while you get like a compression in the complexity and then it starts to ramp up again and you get maybe another compression. So like joining the projects that do a compression. Or like starting to adopting those is really, can be really awesome. So there's, there's like, there's some hope, right? Cause sometimes there is a compression in that complexity and you you might be lucky enough to, to use that instead of, the thing that's really complex after years of building on it. [00:27:53] Jeremy: I think you're talking about newer developers having a tough time making sense of the current frameworks but the example you gave of somebody starting from HTML and JavaScript going to jquery backbone through the whole chain, that that's just by nature of you've put in a lot of time right you've done a lot of work working with each of these technologies you see the progression as if someone is starting new just by nature of you being new you won't have been able to spend that time [00:28:28] Victor: Do you think it could work? again, the, the, the time aspect is like really hard to get like how can you just avoid spending time um to to learn things that's like a general problem I think that problem is called education in the general sense. But like, does it make sense for a, let's say a bootcamp or, or any, you know, school right? To attempt to guide people through the previous solutions that didn't work, right? Like in math, you don't start with calculus, right? It just wouldn't, it doesn't make sense, right? But we try and start with calculus in software, right? We're just like, okay, here's the complexity. You've got all of it. Don't worry. Just look at this little bit. If, you know, if the compiler ever spits out a weird error uh oh, like, you're, you're, you're in for trouble cuz you, you just didn't get the. get the basics. And I think that's maybe some of what is missing. And the thing is, it is like the constraints are hard, right? No one has infinite time, right? Or like, you know, even like, just tons of time to devote to learning, learning just front end, right? That's not even all of computing, That's not even the algorithm stuff that some companies love to throw at you, right? Uh, or the computer sciencey stuff. I wonder if it makes more sense to spend some time taking people through the progression, right? Because discovering that we should do things via components, let's say, or, or at least encapsulate our functionality to components and compose that way, is something we, we not everyone knew, right? Or, you know, we didn't know wild widely. And so it feels like it might make sense to touch on that sort of realization and sort of guide the student through, you know, maybe it's like make five projects in a week and you just get progressively more complex. But then again, that's also hard cause effort, right? It's just like, it's a hard problem. But, but I think right now, uh, people who come in at the end and sort of like see a bunch of complexity and just don't know why it's there, right? Like, if you've like, sort of like, this is, this applies also very, this applies to general, but it applies very well to the Kubernetes problem as well. Like if you've never managed nginx on more than one machine, or if you've never tried to set up a, like a, to format your file system on the machine you just rented because it just, you know, comes with nothing, right? Or like, maybe, maybe some stuff was installed, but, you know, if you had to like install LVM (Logical Volume Manager) yourself, if you've never done any of that, Kubernetes would be harder to understand. It's just like, it's gonna be hard to understand. overlay networks are hard for everyone to understand, uh, except for network people who like really know networking stuff. I think it would be better. But unfortunately, it takes a lot of time for people to take a sort of more iterative approach to, to learning. I try and write blog posts in this way sometimes, but it's really hard. And so like, I'll often have like an idea, like, so I call these, or I think of these as like onion, onion style posts, right? Where you either build up an onion sort of from the inside and kind of like go out and like add more and more layers or whatever. Or you can, you can go from the outside and sort of take off like layers. Like, oh, uh, Kubernetes has a scheduler. Why do they need a scheduler? Like, and like, you know, kind of like, go, go down. but I think that might be one of the best ways to learn, but it just takes time. Or geniuses and geniuses who are good at two things, right? Good at the actual technology and good at teaching. Cuz teaching is a skill and it's very hard. and, you know, shout out to teachers cuz that's, it's, it's very difficult, extremely frustrating. it's hard to find determinism in, in like methods and solutions. And there's research of course, but it's like, yeah, that's, that's a lot harder than the computer being like, Nope, that doesn't work. Right? Like, if you can't, if you can't, like if you, if the function call doesn't work, it doesn't work. Right. If the person learned suboptimally, you won't know Right. Until like 10 years down the road when, when they can't answer some question or like, you know, when they, they don't understand. It's a missing fundamental piece anyway. [00:32:24] Jeremy: I think with the example of front end, maybe you don't have time to walk through the whole history of every single library and framework that came but I think at the very least, if you show someone, or you teach someone how to work with css, and you have them, like you were talking about components before you have them build a site where there's a lot of stuff that gets reused, right? Maybe you have five pages and they all have the same nav bar. [00:33:02] Victor: Yeah, you kind of like make them do it. [00:33:04] Jeremy: Yeah. You make 'em do it and they make all the HTML files, they copy and paste it, and probably your students are thinking like, ah, this, this kind of sucks [00:33:16] Victor: Yeah [00:33:18] Jeremy: And yeah, so then you, you come to that realization, and then after you've done that, then you can bring in, okay, this is why we have components. And similarly you brought up, manual dom manipulation with jQuery and things like that. I, I'm sure you could come up with an example of you don't even necessarily need to use jQuery. I think people can probably skip that step and just use the the, the API that comes with the browser. But you can have them go in like, Oh, you gotta find this element by the id and you gotta change this based on this, and let them experience the. I don't know if I would call it pain, but let them experience like how it was. Right. And, and give them a complex enough task where they feel like something is wrong right. Or, or like, there, should be something better. And then you can go to you could go straight to vue or react. I'm not sure if we need to go like, Here's backbone, here's knockout. [00:34:22] Victor: Yeah. That's like historical. Interesting. [00:34:27] Jeremy: I, I think that would be an interesting college course or something that. Like, I remember when, I went through school, one of the classes was programming languages. So we would learn things like, Fortran and stuff like that. And I, I think for a more frontend centered or modern equivalent you could go through, Hey, here's the history of frontend development here's what we used to do and here's how we got to where we are today. I think that could be actually a pretty interesting class yeah [00:35:10] Victor: I'm a bit interested to know you learned fortran in your PL class. I, think when I went, I was like, lisp and then some, some other, like, higher classes taught haskell but, um, but I wasn't ready for haskell, not many people but fortran is interesting, I kinda wanna hear about that. [00:35:25] Jeremy: I think it was more in terms of just getting you exposed to historically this is how things were. Right. And it wasn't so much of like, You can take strategies you used in Fortran into programming as a whole. I think it was just more of like a, a survey of like, Hey, here's, you know, here's Fortran and like you were saying, here's Lisp and all, all these different languages nd like at least you, you get to see them and go like, yeah, this is kind of a pain. [00:35:54] Victor: Yeah [00:35:55] Jeremy: And like, I understand why people don't choose to use this anymore but I couldn't take away like a broad like, Oh, I, I really wish we had this feature from, I think we were, I think we were using Fortran 77 or something like that. I think there's Fortran 77, a Fortran 90, and then there's, um, I think, [00:36:16] Victor: Like old fortran, deprecated [00:36:18] Jeremy: Yeah, yeah, yeah. So, so I think, I think, uh, I actually don't know if they're, they're continuing to, um, you know, add new things or maintain it or it's just static. But, it's, it's more, uh, interesting in terms of, like we were talking front end where it's, as somebody who's learning frontend development who is new and you get to see how, backbone worked or how Knockout worked how grunt and gulp worked. It, it's like the kind of thing where it's like, Oh, okay, like, this is interesting, but let us not use this again. Right? [00:36:53] Victor: Yeah. Yeah. Right. But I also don't need this, and I will never again [00:36:58] Jeremy: yeah, yeah. It's, um, but you do definitely see the, the parallels, right? Like you were saying where you had your, your Bower and now you have NPM and you had Grunt and Gulp and now you have many choices [00:37:14] Victor: Yeah. [00:37:15] Jeremy: yeah. I, I think having he history context, you know, it's interesting and it can be helpful, but if somebody was. Came to me and said hey I want to learn how to build websites. I get into front end development. I would not be like, Okay, first you gotta start moo tools or GWT. I don't think I would do that but it I think at a academic level or just in terms of seeing how things became the way they are sure, for sure it's interesting. [00:37:59] Victor: Yeah. And I, I, think another thing I don't remember who asked or why, why I had to think of this lately. um but it was, knowing the differentiators between other technologies is also extremely helpful right? So, What's the difference between ES build and SWC, right? Again, we're, we're, we're leaning heavy front end, but you know, just like these, uh, sorry for context, of course, it's not everyone a front end developer, but these are two different, uh, build tools, right? For, for JavaScript, right? Essentially you can think of 'em as transpilers, but they, I think, you know, I think they also bundle like, uh, generally I'm not exactly sure if, if ESbuild will bundle as well. Um, but it's like one is written in go, the other one's written in Rust, right? And sort of there's, um, there's, in addition, there's vite which is like vite does bundle and vite does a lot of things. Like, like there's a lot of innovation in vite that has to have to do with like, making local development as fast as possible and also getting like, you're sort of making sure as many things as possible are strippable, right? Or, or, or tree shakeable. Sorry, is is is the better, is the better term. Um, but yeah, knowing, knowing the, um, the differences between projects is often enough to sort of make it less confusing for me. Um, as far as like, Oh, which one of these things should I use? You know, outside of just going with what people are recommending. Cause generally there is some people with wisdom sometimes lead the crowd sometimes, right? So, so sometimes it's okay to be, you know, a crowd member as long as you're listening to the, to, to someone worth listening to. Um, and, and so yeah, I, I think that's another thing that is like the mark of a good project or, or it's not exclusive, right? It's not, the condition's not necessarily sufficient, but it's like a good projects have the why use this versus x right section in the Readme, right? They're like, Hey, we know you could use Y but here's why you should use us instead. Or we know you could use X, but here's what we do better than X. That might, you might care about, right? That's, um, a, a really strong indicator of a project. That's good cuz that means the person who's writing the project is like, they've done this, the survey. And like, this is kind of like, um, how good research happens, right? It's like most of research is reading what's happening, right? To knowing, knowing the boundary you're about to push, right? Or try and sort of like push one, make one step forward in, um, so that's something that I think the, the rigor isn't in necessarily software development everywhere, right? Which is good and bad. but someone who's sort of done that sort of rigor or, and like, and, and has, and or I should say, has been rigorous about knowing the boundary, and then they can explain that to you. They can be like, Oh, here's where the boundary was. These people were doing this, these people were doing this, these people were doing this, but I wanna do this. So you just learned now whether it's right for you and sort of the other points in the space, which is awesome. Yeah. Going to your point, I feel like that's, that's also important, it's probably not a good idea to try and get everyone to go through historical artifacts, but if just a, a quick explainer and sort of, uh, note on the differentiation, Could help for sure. Yeah. I feel like we've skewed too much frontend. No, no more frontend discussion this point. [00:41:20] Jeremy: It's just like, I, I think there's so many more choices where the, the mental thought that has to go into, Okay, what do I use next I feel is bigger on frontend. I guess it depends on the project you're working on but if you're going to work on anything front end if you haven't done it before or you don't have a lot of experience there's so many build tools so many frameworks, so many libraries that yeah, but we [00:41:51] Victor: Iterate yeah, in every direction, like the, it's good and bad, but frontend just goes in every direction at the same time Like, there's so many people who are so enthusiastic and so committed and and it's so approachable that like everyone just goes in every direction at the same time and like a lot of people make progress and then unfortunately you have try and pick which, which branch makes sense. [00:42:20] Jeremy: We've been kind of talking about, some of your experiences with a few things and I wonder if you could explain the the context you're thinking of in terms of the types of projects you typically work on like what are they what's the scale of them that sort of thing. [00:42:32] Victor: So I guess I've, I've gone through a lot of phases, right? In sort of what I use in in my tooling and what I thought was cool. I wrote enterprise java like everybody else. Like, like it really doesn't talk about it, but like, it's like almost at some point it was like, you're either a rail shop or a Java shop, for so many people. And I wrote enterprise Java for a, a long time, and I was lucky enough to have friends who were really into, other kinds of computing and other kinds of programming. a lot of my projects were wrapped around, were, were ideas that I was expressing via some new technology, let's say. Right? So, I wrote a lot of haskell for, for, for a while, right? But what did I end up building with that was actually a job board that honestly didn't go very far because I was spending much more time sort of doing, haskell things, right? And so I learned a lot about sort of what I think is like the pinnacle of sort of like type development in, in the non-research world, right? Like, like right on the edge of research and actual usability. But a lot of my ideas, sort of getting back to the, the ideas question are just things I want to build for myself. Um, or things I think could be commercially viable or like do, like, be, be well used, uh, and, and sort of, and profitable things, things that I think should be built. Or like if, if I see some, some projects as like, Oh, I wish they were doing this in this way, Right? Like, I, I often consider like, Oh, I want, I think I could build something that would be separate and maybe do like, inspired from other projects, I should say, Right? Um, and sort of making me understand a sort of a different, a different ecosystem. but a lot of times I have to say like, the stuff I build is mostly to scratch an itch I have. Um, and or something I think would be profitable or utilizing technology that I've seen that I don't think anyone's done in the same way. Right? So like learning Kubernetes for example, or like investing the time to learn Kubernetes opened up an entire world of sort of like infrastructure ideas, right? Because like the leverage you get is so high, right? So you're just like, Oh, I could run an aws, right? Like now that I, now that I know this cuz it's like, it's actually not bad, it's kind of usable. Like, couldn't I do that? Right? That kind of thing. Right? Or um, I feel like a lot of the times I'll learn a technology and it'll, it'll make me feel like certain things are possible that they, that weren't before. Uh, like Rust is another one of those, right? Like, cuz like Rust will go from like embedded all the way to WASM, which is like a crazy vertical stack. Right? It's, that's a lot, That's a wide range of computing that you can, you can touch, right? And, and there's, it's, it's hard to learn, right? The, the, the, the, uh, the, the ramp to learning it is quite steep, but, it opens up a lot of things you can write, right? It, it opens up a lot of areas you can go into, right? Like, if you ever had an idea for like a desktop app, right? You could actually write it in Rust. There's like, there's, there's ways, there's like is and there's like, um, Tauri is one of my personal favorites, which uses web technology, but it's either I'm inspired by some technology and I'm just like, Oh, what can I use this on? And like, what would this really be good at doing? or it's, you know, it's one of those other things, like either I think it's gonna be, Oh, this would be cool to build and it would be profitable. Uh, or like, I'm scratching my own itch. Yeah. I think, I think those are basically the three sources. [00:46:10] Jeremy: It's, it's interesting about Rust where it seems so trendy, I guess, in lots of people wanna do something with rust, but then in a lot of they also are not sure does it make sense to write in rust? Um, I, I think the, the embedded stuff, of course, that makes a lot of sense. And, uh, you, you've seen a sort of surge in command line apps, stuff ripgrep and ag, stuff like that, and places like that. It's, I think the benefits are pretty clear in terms of you've got the performance and you have the strong typing and whatnot and I think where there's sort of the inbetween section that's kind of unclear to me at least would I build a web application in rust I'm not sure that sort of thing [00:47:12] Victor: Yeah. I would, I characterize it as kind of like, it's a tool toolkit, so it really depends on the problem. And think we have many tools that there's no, almost never a real reason to pick one in particular right? Like there's, Cause it seems like just most of, a lot of the work, like, unless you're, you're really doing something interesting, right? Like, uh, something that like, oh, I need to, I need to, like, I'm gonna run, you know, billions and billions of processes. Like, yeah, maybe you want erlang at that point, right? Like, maybe, maybe you should, that should be, you know, your, your thing. Um, but computers are so fast these days, and most languages have, have sort of borrowed, not borrowed, but like adopted features from others that there's, it's really hard to find a, a specific use case, for one particular tool. Uh, so I often just categorize it by what I want out of the project, right? Or like, either my goals or project goals, right? Depending on, and, or like business goals, if you're, you know, doing this for a business, right? Um, so like, uh, I, I basically, if I want to go fast and I want to like, you know, reduce time to market, I use type script, right? Oh, and also I'm a, I'm a, like a type zealot. I, I'd say so. Like, I don't believe in not having types, right? Like, it's just like there's, I think it's crazy that you would like have a function but not know what the inputs could be. And they could actually be anything, right? , you're just like, and then you have to kind of just keep that in your head. I think that's silly. Now that we have good, we, we have, uh, ways to avoid the, uh, ceremony, right? You've got like hindley Milner type systems, like you have a way to avoid the, you can, you know, predict what types of things will be, and you can, you don't have to write everything everywhere. So like, it's not that. But anyway, so if I wanna go fast, the, the point is that going back to that early, like the JS ecosystem goes everywhere at the same time. Typescript is excellent because the ecosystem goes everywhere at the same time. And so you've got really good ecosystem support for just about everything you could do. Um, uh, you could write TypeScript that's very loose on the types and go even faster, but in general it's not very hard. There's not too much ceremony and just like, you know, putting some stuff that shows you what you're using and like, you know, the objects you're working with. and then generally if I wanna like, get it really right, I I'll like reach for haskell, right? Cause it's just like the sort of contortions, and again, this takes time, this not fast, but, right. the contortions you can do in the type system will make it really hard to write incorrect code or code that doesn't, that isn't logical with itself. Of course interfacing with the outside world. Like if you do a web request, it's gonna fail sometimes, right? Like the network might be down, right? So you have to, you basically pull that, you sort of wrap that uncertainty in your system to whatever degree you're okay with. And then, but I know it'll be correct, right? But and correctness is just not important. Most of like, Oh, I should , that's a bad quote. Uh, it's not that correct is not important. It's like if you need to get to market, you do not necessarily need every single piece of your code to be correct, Right? If someone calls some, some function with like, negative one and it's not an important, it's not tied to money or it's like, you know, whatever, then maybe it's fine. They just see an error and then like you get an error in your back and you're like, Oh, I better fix that. Right? Um, and then generally if I want to be correct and fast, I choose rust these days. Right? Um, these days. and going back to your point, a lot of times that means that I'm going to write in Typescript for a lot of projects. So that's what I'll do for a lot of projects is cuz I'll just be like, ah, do I need like absolute correctness or like some really, you know, fancy sort of type stuff. No. So I don't pick haskell. Right. And it's like, do I need to be like mega fast? No, probably not. Cuz like, cuz so I don't necessarily don't necessarily need rust. Um, maybe it's interesting to me in terms of like a long, long term thing, right? Like if I, if I'm think, oh, but I want x like for example, tight, tight, uh, integration with WASM, for example, if I'm just like, oh, I could see myself like, but that's more of like, you know, for a fun thing that I'm doing, right? Like, it's just like, it's, it's, you don't need it. You don't, that's premature, like, you know, that's a premature optimization thing. But if I'm just like, ah, I really want the ability to like maybe consider refactoring some of this out into like a WebAssembly thing later, then I'm like, Okay, maybe, maybe I'll, I'll pick Rust. Or like, if I, if I like, I do want, you know, really, really fast, then I'll like, then I'll go Rust. But most of the time it's just like, I want a good ecosystem so I don't have to build stuff myself most of the time. Uh, and you know, type script is good enough. So my stack ends up being a lot of the time just in type script, right? Yeah. [00:52:05] Jeremy: Yeah, I think you've encapsulated the reason why there's so many packages on NPM and why there's so much usage of JavaScript and TypeScript in general is that it, it, it fits the, it's good enough. Right? And in terms of, in terms of speed, like you said, most of the time you don't need of rust. Um, and so typescript I think is a lot more approachable a lot of people have to use it because they do front end work anyways. And so that kinda just becomes the I don't know if I should say the default but I would say it's probably the most common in terms of when somebody's building a backend today certainly there's other languages but JavaScript and TypeScript is everywhere. [00:52:57] Victor: Yeah. Uh, I, I, I, another thing is like, I mean, I'm, of ignored the, like, unreasonable effectiveness of like rails Cause there's just a, there's tons of just like rails warriors out there, and that's great. They're they're fantastic. I'm not a, I'm not personally a huge fan of rails but that's, uh, that's to my own detriment, right? In, in some, in some ways. But like, Rails and Django sort of just like, people who, like, I'm gonna learn this framework it's gonna be excellent. It most, they have a, they have carved out a great ecosystem for themselves. Um, or like, you know, even php right? PHP and like Laravel, or whatever. Uh, and so I'm ignoring those, like, those pockets of productivity, right? Those pockets of like intense productivity that people like, have all their needs met in that same way. Um, but as far as like general, general sort of ecosystem size and speed for me, um, like what you said, like applies to me. Like if I, if I'm just like, especially if I'm just like, Oh, I just wanna build a backend, Like, I wanna build something that's like super small and just does like, you know, maybe a few, a couple, you know, endpoints or whatever and just, I just wanna throw it out there. Right? Uh, I, I will pick, yeah. Typescript. It just like, it makes sense to me. I also think note is a better. VM or platform to build on than any of the others as well. So like, like I, by any of the others, I mean, Python, Perl, Ruby, right? Like sort of in the same class of, of tool. So I I am kind of convinced that, um, Node is better, than those as far as core abilities, right? Like threading Right. Versus the just multi-processing and like, you know, other, other, other solutions and like, stuff like that. So, if you want a boring stack, if I don't wanna use any tokens, right? Any innovation tokens I reach for TypeScript. [00:54:46] Jeremy: I think it's good that you brought up. Rails and, and Django because, uh, personally I've done, I've done work with Rails, and you're right in that Rails has so many built in, and the ways to do them are so well established that your ability to be productive and build something really fast hard to compete with, at least in my experience with available in the Node ecosystem. Um, on the other hand, like I, I also see what you mean by the runtimes. Like with Node, you're, you're built on top of V8 and there's so many resources being poured into it to making it fast and making it run pretty much everywhere. I think you probably don't do too much work with managed services, but if you go to a managed service to run your code, like a platform as a service, they're gonna support Node. Will they support your other preferred language? Maybe, maybe not, You know that they will, they'll be able to run node apps so but yeah I don't know if it will ever happen or maybe I'm just not familiar with it, but feel like there isn't a real rails of javascript. [00:56:14] Victor: Yeah, you're, totally right. There are, there are. It's, it's weird. It's actually weird that there, like Uh, but, but, I kind of agree with you. There's projects that are trying it recently. There's like Adonis, um, there is, there are backends that also do, like, will do basic templating, like Nest, NestJS is like really excellent. It's like one of the best sort of backend, projects out there. I I, I but like back in the day, there were projects like Sails, which was like very much trying to do exactly what Rails did, but it just didn't seem to take off and reach that critical mass possibly because of the size of the ecosystem, right? Like, how many alternatives to Rails are there? Not many, right? And, and now, anyway, maybe let's say the rest of 'em sort of like died out over the years, but there's also like, um, hapi HAPI, uh, which is like also, you know, similarly, it was like angling themselves to be that, but they just never, they never found the traction they needed. I think, um, or at least to be as wide, widely known as Rails is for, for, for the, for the Ruby ecosystem, um, but also for people to kind of know the magic, cause. Like I feel like you're productive in Rails only when you imbibe the magic, right? You, you, know all the magic context and you know the incantations and they're comforting to you, right? Like you've, you've, you have the, you have the sort of like, uh, convention. You're like, if you're living and breathing the convention, everything's amazing, right? Like, like you can't beat that. You're just like, you're in the zone but you need people to get in that zone. And I don't think node has, people are just too, they're too frazzled. They're going like, there's too much options. They can't, it's hard to commit, right? Like, imagine if you'd committed to backbone. Like you got, you can't, It's, it's over. Oh, it's not over. I mean, I don't, no, I don't wanna, you know, disparage the backbone project. I don't use it, but, you know, maybe they're still doing stuff and you know, I'm sure people are still working on it, but you can't, you, it's hard to commit and sort of really imbibe that sort of convention or, or, or sort of like, make yourself sort of breathe that product when there's like 10 products that are kind of similar and could be useful as well. Yeah, I think that's, that's that's kind of big. It's weird that there isn't a rails, for NodeJS, but, but people are working on it obviously. Like I mentioned Adonis, there's, there's more. I'm leaving a bunch of them out, but that's part of the problem. [00:58:52] Jeremy: On, on one hand, it's really cool that people are trying so many different things because hopefully maybe they can find something that like other people wouldn't have thought of if they all stick same framework. but on the other hand, it's ... how much time have we spent jumping between all these different frameworks when what we could have if we had a rails. [00:59:23] Victor: Yeah the, the sort of wasted time is, is crazy to think about it uh, I do think about that from time to time. And you know, and personally I waste a lot of my own time. Like, just, just rec

amazon google running myth tokyo os speed medium seo pl windows workers ebay sonic snap hackers saas personally depending meant nest react wordpress assurance platforms complexity rust ux api languages python astro aws java databases github lamp assuming mastodon maus vm pivotal javascript rails html clap knockout acquired versus csr css longhorns ssd containers php django docker node js kubernetes cuz sails v8 foss pelican vue ssl haskell yc elastic gitlab bower bake off milner moo grunt paas tls gulp v1 terraform cli typescript mvc jquery npm nodejs redis readme webassembly encrypt lisp heroku hacker news postgres ansible wasm ebs laravel eks aks cdk sqlite nginx ec2 fortran deltas swc hapi kelsey hightower activitypub zfs nuxt gwt dsls lvm mvvm redis labs lucene dockerfile sidekiq durandal yak shaving nestjs jeremy you jeremy it jeremy so victor it victor there

The Power of Hope When Catastrophe Crashes In with Jeremy and Caleb Freeman

Play Episode Listen Later Oct 24, 2022 45:31

Welcome to the Love Where You Are podcast! In today's episode, Somer is joined with Jeremy Freeman and his son, Caleb. Jeremy Freeman is lead pastor at First Baptist Church in Newcastle, Oklahoma, and served as first vice president for the Oklahoma Baptist convention. Jeremy is a sought-after speaker for inspirational events hosted by churches, conventions, conferences, businesses, and schools. I'm just going to go ahead and give you fair warning: this week's episode is going to change you. It did me. Pastor Jeremy Freeman and his family have endured seasons of tremendous loss and tragedy. After losing their son, Trey, to a rare illness they nearly lost their son Caleb to a traumatic brain injury after a serious car crash. The doctors gave Caleb 10% chance of living, and he was unlikely to talk or walk again. The Freemans navigated remaining in ministry while grieving and walking with God in dark places. They witnessed God's hand at work as Caleb not only survived but speaks alongside his father today to share hope in Christ even in the gravest of circumstances. In this episode, Somer, Jeremy, and Caleb talk about their new book "#butGod," their ministry, and how those simple words "but God" transformed the way they look at life altering circumstances. Thank you for being here today. Now, let's dive into the conversation! Connect with the Freemans: Buy their book "#butGod" to read their story in more depth! Join their Facebook group, Pray for Caleb Freeman Instagram Twitter FAVORITE TAKE-AWAYS: “I'm pretty much the biggest nobody on this earth today but for some reason God has chosen to write this amazing story using my life. God has put me on this earth to tell that greater story and I live to tell that story to anyone who will listen.” - Caleb Freeman "The church that suffers well together grows well together" - Jeremy Freeman "God, use our story for your glory" - Jeremy "As believers we have to get to a place where our faith becomes so real to us that suffering doesn't destroy your faith, but it refines it" - Caleb "Faith that can't be tested, can't be trusted" - Caleb "God paves the way even in tragedy" - Somer "When you say 'thy will be done' that is when the cloud of darkness starts to lift" - Jeremy "If you don't take your grief to the Lord, it can control and consume you" - Jeremy "Suffering has been my greatest theologian" - Martin Luther "God always knows what He is doing and most of the time He is up to something bigger" - Caleb "Preach these Truths to yourself until you believe them" - John Piper "We believe suffering can be used to draw people to God's heart" - Jeremy "You can impress people with your strengths but you connect with people through your weaknesses" - Caleb Connect with Somer! Stay connected to Somer and the Love Where You Are podcast through her Facebook & Instagram! Now, go love where you are and live on mission for Jesus today.

The Power of Hope When Catastrophe Crashes In with Jeremy and Caleb Freeman

Love Where You Are with Somer Colbert

Play Episode Listen Later Oct 24, 2022 45:31

Xe Iaso on Tailscale

Play Episode Listen Later Oct 1, 2022 54:10

Xe Iaso is the Archmage of Infrastructure at Tailscale and previously worked at Heroku.This episode originally aired on Software Engineering Radio but includes some additional discussion about their blog near the end of the episode.Topics covered: Use cases for VPNs Simplifying service authentication by identifying users via IP Peer-to-peer vs centralized "Virtual Pain Networks" Tailscale's tech stack and why they forked the go compiler DERP relay servers Struggling with the iOS network extension size limit The surprisingly small amount of infrastructure required to run a VPN Running your company on your own product Working at Heroku vs Tailscale Using the socratic style of debate in technical blog posts Related Links @theprincessxena Xe's Blog ACL samples Go links origin story How Tailscale works Tailscale SSH How Tailscale assigns IP addresses Hey linker, can you spare a meg? My Blog is Hilariously Overengineered to the Point People Think it's a Static Site The Sheer Terror of PAM Transcript[00:00:00] Jeremy: Today I'm talking to Xe Iaso, they're the archmage of infrastructure at tailscale, and they also have a great blog everyone should check out. Xe, welcome to software engineering radio.[00:00:12] Xe: Thanks. It's great to be here. [00:00:14] Jeremy: I think the first thing we should start with, is what's a, a VPN, because I think some people they may have used it to remote into their workplace or something like that. But I think the, the scope of what it's good for and what it does is a lot broader than that. So maybe you could talk a little bit about that first.[00:00:31] Xe: Okay. a VPN is short for virtual private network. It's basically a fake network that's overlaid on top of existing networks. And then you can use that network to do whatever you would with a normal computer network. this term has been co-opted by companies that are attempting to get into the, like hide my ass style market, where, you know, you encrypt your internet information and keep it safe from hackers.But, uh, so it makes it really annoying and hard to talk about what a VPN actually is. Because tailscale, uh, the company I work for is closer to like the actual intent of a VPN and not just, you know, like hide your internet traffic. That's already encrypted anyway with another level of encryption and just make a great access point for, uh, three letter agencies.But are there, use cases, past that, like when you're developing a piece of software, why would you decide to use a VPN outside of just because I want my, you know, my workers to be able to get access to this stuff.[00:01:42] Xe: So something that's come up, uh, when I've been working at tailscale is that sometimes we'll make changes to something. And it'll be changes to like the user experience of something on the admin panel or something. So in a lot of other places I've worked in order to have other people test that, you know, you'd have to push it to the cloud.It would have to spin up a review app in Heroku or some terrifying terraform of abomination would have to put it out onto like an actual cluster or something. But with tail scale, you know, if your app is running locally, you just give like the name of your computer and the port number. And you know, other people are able to just see it and poke it and experience it.And that basically turns the, uh, feedback cycle from, you know, like having to wait for like the state of the world to converge, to, you know, make a change, press F five, give the URL to a coworker and be like, Hey, is this Gucci?they can connect to your app as if you were both connected to the same switch.[00:02:52] Jeremy: You don't have to worry about, pushing to a cloud service or opening ports, things like that.[00:02:57] Xe: Yep. It will act like it's in the same room, even when they're not it'll even work. if you're at both at Starbucks and the Starbucks has reasonable policies, like holy crap, don't allow devices to connect to each other directly. so you know, you're working on. Like your screenplay app at your Starbucks or something, and you have a coworker there and you're like, Hey, uh, check this out and, uh, give them the link.And then, you know, they're also seeing the screenplay editor.[00:03:27] Jeremy: in terms of security and things like that. I mean, I'm picturing it kind of like we were sitting in the same room and there's a switch and we both plugged in. Normally when you do something like that, you kind of have, full access to whatever else is on the switch. Uh, you know, provided that's not being blocked by a, a firewall.is there like a layer of security on top of that, that a VPN service like tailscale would provide.[00:03:53] Xe: Yes. Um, there are these things called access control lists, which are kind of like firewall rules, except you don't have to deal with like the nightmare of writing an IP tables rule that also works in windows firewall and whatever they use in Mac OS. The ACL rules are applied at the tailnet level for every device in the tailnet.So if you have like developer machines, you can put people into groups as things like developers and say that developer machines can talk to production, but not people in QA. They can only talk to testing and people on SRE have, you know, permissions to go everywhere and people within their own teams can connect to each other. you can make more complicated policies like that fairly easily.[00:04:44] Jeremy: And when we think about infrastructure for, for companies, you were talking about how there could be development, infrastructure, production, infrastructure, and you kind of separate it all out. when you're working with cloud infrastructure. A lot of times, there's the, I always forget what it stands for, but there's like IAM.There's like policies that you can set up with the cloud provider that says these users can access this, or these machines can access this. And, and I wonder from your perspective, when you would choose to use that versus use something at the, the network or the, the VPN level.[00:05:20] Xe: The way I think about it is that things like IAM enforce, permissions for like more granularly scoped things like can create EC2 instances or can delete EC2 instances or something like that. And that's just kind of a different level of thing. uh, tailscale, ACLs are more, you know, X is allowed to connect to Y or with tailscale, SSH X is allowed to connect as user Y.and that's really different than like arbitrary capability things like IAM offers.you could think about it as an IAM system, but the main permissions that it's exposing are can X connect to Y on Zed port.[00:06:05] Jeremy: What are some other use cases where if you weren't using a VPN, you'd have to do a lot more work or there's a lot more complexity, kind of what are some cases where it's like, okay, using a VPN here makes a lot of sense.(The quick and simple guide to go links https://www.trot.to/go-links) [00:06:18] Xe: There is a service internal to tailscale called go, which is a, clone of Google's so-called go links where it's basically a URL shortener that lives at http://go. And, you know, you have go/something to get to some internal admin service or another thing to get to like, you know, the company directory and notion or something, and this kind of thing you could do with a normal setup, you know, you could set it up and have to do OAuth challenges everywhere and, you know, have to put and make sure that everyone has the right DNS configuration so that, it shows up in the right place.And then you have to deal with HTTPS um, because OAuth requires HTTPS for understandable and kind of important reasons. And it's just a mess. Like there's so many layers of stuff like the, the barrier to get, you know, like just a darn URL, shortener up turns from 20 minutes into three days of effort trying to, you know, understand how these various arcane things work together.You need to have state for your OAuth implementation. You need to worry about what the hell a a JWT is (sigh) . It's it it's just bad. And I really think that something like tailscale with everybody has an IP address. In order to get into the network, you have to sign in with your, auth provider, your, a provider tells tailscale who you are.So transitively every IP address is tied to an owner, which means that you can enforce access permission based on the IP address and the metadata about it that you grab from the tailscale. daemon, it's just so much simpler. Like you don't have to think about, oh, how do I set up OAuth this time? What the hell is an oauth proxy?Um, what is a Kubernetes? That sort of thing you just think about like doing the thing and you just do it. And then everything else gets taken care of it. It's like kind of the ultimate network infrastructure, because it's both omnipresent and something you don't have to think about. And I think that's really the power of tailscale.[00:08:39] Jeremy: typically when you would spin up a, a service that you want your developers or your system admins, to be able to log into, you would have to have some way of authenticating and authorizing that user. And so you were talking about bringing in OAuth and having your, your service understand that.But I, I guess what you're saying is that when you have something like tailscale, that's kind of front loaded, I guess you, you authenticate with tail scale, you get onto the network, you get your IP. And then from that point on you can access all these different services that know like, Hey, because you're on the network, we know you're authenticated and those services can just maybe map that IP that's not gonna change to like users in some kind of table. Um, and not have to worry about figuring out how do I authenticate this user.[00:09:34] Xe: I would personally more suggest that you use the, uh, whois, uh, look up route in the tailscale daemon's local API, but basically, yeah, you don't really have to worry too much about like the authentication layer because the authentication layer has already been done. You know, you've already done your two factor with Gmail or whatever, and then you can just transitively push that property onto your other machines.[00:10:01] Jeremy: So when you talk about this, this whois daemon, can you give an example of I'm in the network now I'm gonna make a service call to an application. what, what am I doing with this? This whois daemon?[00:10:14] Xe: It's more of like a internal API call that we expose via tailscaled's, uh, Unix, socket. but basically you give it an IP address and a port, and it tells you who the person is. It's kind of like the Unix ident protocol in a way, except completely not. And at a high level, you know, if you have something like a proxy for Grafana, you have that proxy for Grafana, make a call to the local tailscale daemon, and be like, Hey, who was this person?And the tailscale, daemon will spit back at JSON object. Like, oh, it's this person on this device and there you can do additional logic like maybe you shouldn't be allowed to delete things from an iOS device, you know, crazy ideas like that. there's not really support for like arbitrary capabilities and tailscaled at the time of recording, but we've had some thoughts would be cool.[00:11:17] Jeremy: would that also include things like having roles, for example, even if it's just strings, um, that you get back so that your application would know, okay. This person, is supposed to have admin access to this service based on what I got back from, this, this service.[00:11:35] Xe: Not currently, uh, you can probably do it via convention or something, but what's currently implemented in the actual, like, source code and user experience that they, you can't do that right now. Um, it is something that I've been, trying to think about different ways to solve, but it's also a problem.That's a bit big for me personally, to tackle.[00:11:59] Jeremy: there's, there's so many, I guess, different ways of doing it. That it's kind of interesting to think of a solution that's kind of built into the, the network. Yeah.[00:12:10] Xe: Yeah. and when I describe that authentication thing to some people, it makes them recoil in shock because there's kind of a Stockholm syndrome type effect with security, for a lot of things where, the easy way to do something and the secure way to do something are, you know, like completely opposite and directly conflicting with each other in almost every way.And over time, people have come to associate security or like corporate VPNs as annoying, complicated, and difficult. And the idea of something that isn't annoying, complicated or difficult will make people reject it, like just on principle, because you know, they've been trained that, you know, VPN equals virtual pain network and it, it's hard to get that association outta people's heads because you know, a lot of VPNs are virtual pain networks.Like. I used to work for Salesforce and Salesforce had this corporate VPN where no matter what you did, all of your traffic would go out to the internet from their data center. I think it was in San Francisco or something. And I was in the Seattle area. So whenever I had the VPN on my latency to Google shot up by like eight times and being a software person, you know, I use Google the same way that others breathe and it, it was just not fun.And I only had the VPN on for the bare minimum of when I needed it. And, oh God, it was so bad.[00:13:50] Jeremy: like some people, when they picture a VPN, they picture exactly what you're describing, where all of my traffic is gonna get routed to some central point. It's gonna go connect to the thing for me and then send the result back. so maybe you could talk a little bit about why that's, that's maybe a wrong assumption, I guess, in the case of tailscale, or maybe in the case of just more modern VPN solutions.[00:14:13] Xe: Yeah. So the thing that I was describing is what I've been lovingly calling the, uh, single point of failure as a service type model of VPN, where, you know, you have like the big server somewhere, it concentrates all the connections and, you know, like does things to make the computer feel like they've teleported over there, but overall it's a single point of failure.And if that falls over, you know, like goodbye, VPN. everybody's just totally screwed. And in contrast, tailscale does a more peer-to-peer thing so that everyone is basically on equal footing. Everyone can send traffic directly to each other, and if it can't get directly to there, it'll use a network of, uh, relay servers, uh, lovingly called Derp and you don't have to worry about, your single point of failure in your cluster, because there's just no single point of failure.Everything will directly communicate as much as possible. And if it can't, it'll still communicate anyway.[00:15:18] Jeremy: let's say I start up my computer and I wanna connect to a server in a data center somewhere at the very beginning, am I connecting to some server hosted at tailscale? And then. There's some kind of negotiation process where after that I connect directly or do I just connect directly straight away?[00:15:39] Xe: If you just turn on your laptop and log in, you know, to it signs into tailscale and gets you on the tailnet and whatnot, then it will actually start all connections via Derp just so that it can negotiate the, uh, direct connection. And in case it can't, you know, it's already connected via Derp so it just continues the connection with Derp and this creates a kind of seamless magic type experience where doing things over Derp is slower.Yes, it is measurably slower because you know, like you're not going directly, you're doing TCP inside of TCP. And you know, that comes with a average minefield of lasers or whatever you call it. And it does work though. It's not ideal if you wanna do things like copy large amounts of data, but if you want just want ssh into prod and see the logs for what the heck is going on and why you're getting paged at 3:00 AM. it's pretty great.[00:16:40] Jeremy: What you, you were calling Derp is it where you have servers kind of all over the world and somehow it determines which one's, I guess, is it which one's closest to your destination or which one's closest to you. I'm kind of[00:16:54] Xe: It's really interesting. It's one of the most weird distributed systems, uh, type things that I've ever seen. It's the kind of thing that could only come outta the mind of an X Googler, but basically every tailscale, every tailscale node has a connection to all of the Derp servers and through process of, you know, latency testing.It figures out which connection is the fastest and the lowest latency. And it calls that it's home Derp but because it's connected to everything is connected to every Derp you can have two people with different home Derps getting their packets relayed too other clients from different Derps.So, you know, if you have a laptop in Ottawa and a laptop in San Francisco, the laptop in San Francisco will probably use the, uh, Derp that's closest to it. But the laptop in Ottawa will also use the Derp that's closest to it. So you get this sort of like asynchronous thing, and it actually works out a lot better in practice, than you're probably imagining.[00:17:52] Jeremy: And then these servers, what was the, the technical term for them? Are they like relays or what's[00:17:58] Xe: They're relays. Uh, they only really deal with encrypted wire guard packets, and there's, no way for us at tailscale, to see the contents of Derp messages, it is literally just a forwarder. It, it literally just forwards things based on the key ID.[00:18:17] Jeremy: I guess if tail scale isn't able to decrypt the traffic, is, is that because the, the keys are only on the user's devices, like it's on their laptop and on the server they're trying to reach, or[00:18:31] Xe: Yeah. The private keys are live and die with those devices or the devices they were minted on. And the public keys are given to the coordination server and the coordination server spreads those around to every device in your tailnet. It does some limiting so that like, if you don't have ACL access to something, you don't get the private key, you don't get the, uh, public key for it.The public key, not the private key, the public key, not the private key. And yeah. Then, you know, you just go that way and it'll just figure it out. It's pretty nice.[00:19:03] Jeremy: When we're kind of talking about situations where it can't connect directly, that's where you would use the relay. what are kind of the typical cases where that happens, where you, you aren't able to just connect directly?[00:19:17] Xe: Hotel, wifi and paranoid network security setups, hotel wifi is the most notorious one because you know, you have like an overpriced wifi connection. And if you bring, like, I don't know like, You you're recording a bunch of footage on your iPhone. And because in, 2022. The iPhone has the USB2 connection on it.And you know, you wanna copy that. You wanna use the network, but you can't. So you could just let it upload through iCloud or something, or, you know, do the bare minimum. You need to get the, to get the data off with Derp it wouldn't be ideal, but it would work. And ironically enough, that entire complexity involved with, you know, doing TCP inside of TCP to copy a video file over to your laptop might actually be faster than USB2, which is something that I did the math for a while ago.And I just started laughing.[00:20:21] Jeremy: Yeah, that that is pretty, pretty ridiculous [00:20:23] Xe: welcome to the future, man (laughs) .[00:20:27] Jeremy: in terms of connecting directly, usually when you have a computer on the internet, you don't have all your ports open, you don't necessarily allow, just anybody to send you traffic over UDP and so forth. let's say I wanna send, UDP data to a, a server on my network, but, you know, maybe it has some TCP ports open. I I'm assuming once I connect into the network via the VPN, I'm able to use other protocols and ports that weren't necessarily exposed. Is that correct?[00:21:01] Xe: Yeah, you can use UDP. you can do basically anything you would do on a normal network except multicast um, because multicast is weird.I mean, there's thoughts on how to handle multicast, but the main problem is that like wireguard, which is what is tail tailscale is built on top of, is, so called OSI model layer three network, where it's at like, you know, the IP address level and multicast is a layer two or data link layer type thing.And, those are different numbers and, you can't really easily put, you know, like broadcast packets into IP, uh, IPV4 thinks otherwise, but, uh, in practice, no people don't actually use the broadcast address.[00:21:48] Jeremy: so for someone who's, they, they have a project or their company wants to get started. I mean, what does onboarding look like? What, what do they have to do to get all these devices talking to one another?[00:22:02] Xe: basically you, install tail scale, you log in with a little GUI thing or on a Linux server, you run tailscale up, and then you all log to the, to a, like a G suite account with the same domain name. So, you know, if your domain is like example.com, then everybody logs in with their example.com G suite account.And, there is no step three, everything is allowed and everything can just connect and you can change the permissions from there. By default, the ACLs are set to a, you know, very permissive allow everyone to talk to everyone on any port. Uh, just so that people can verify that it's working, you know, you can ping to your heart's content.You can play Minecraft with others. You can, you know, host an HTTP server. You can SSH into your development box and and write blog post with emacs, whatever you want.[00:22:58] Jeremy: okay, you install the, the software on your servers, your workstations, your laptops, and so on. And then at, after that there's some kind of webpage or dashboard you would go in and say, I want these people to be able to access these things and [00:23:14] Xe: Mm-hmm [00:23:15] Jeremy: these ports and so on.[00:23:17] Xe: you, uh, can customize the access control rules with something that looks like JSON, but with trailing commas and comments allowed, and you can go from there to customize basically anything to your heart's content. you can set rules so that people on the DevOps team can access everything, but you know, maybe marketing doesn't need access to the production database.So you don't have to worry about that as much.[00:23:45] Jeremy: there's, there's kind of different options for VPNs. CloudFlare access, zero tier, there's, there's some kind of, I think it's Nebula from slack or something like that. so I was kind of curious from your perspective, what's the, difference between those kinds of services and, and tailscale.[00:24:04] Xe: I'm gonna lead this out by saying that I don't totally understand the differences between a lot of them, because I've only really worked with tailscale. I know things about the other options, but, uh, I have the most experience with tailscale but from what I've been able to tell, there are things that tailscale offers that others don't like reverse mapping of IP addresses to people, or, there's this other feature that we've been working on, where you can embed tail scale as a library inside your go application, and then write a internal admin service that isn't exposed to the internet, but it's only exposed over tailscale.And I haven't seen a way to do those things with those others, but again, I haven't done much research. Um, I understand that zero tier has some layer, two capabilities, but I've, I don't have enough time in the day to look into.[00:25:01] Jeremy: There's been different, I guess you would call them VPN protocols. I mean, there's people have probably worked with IP sec in some situations they may have heard of OpenVPN, wireguard. in the case of tailscale, I believe you chose to build it on top of wireguard.So I wonder if you could talk a little bit about why, you chose wireguard and, and maybe what makes it unique.[00:25:27] Xe: I wasn't on the team that initially wrote like the core of tailscale itself. But from what I understand, wire guard was chosen because, what overhead, uh, it's literally, you just encrypt the packets, you send it to the other server, the other server decrypts them. And you know, you're done. it's also based purely on the public key. Um, the key pairs involved. And from what I understand, like at the wireguard protocol level, there's no reason why you, why you would need an IP address at all in theory, but in practice, you kind of need an IP address because you know, everything sucks. But also wire guard is like UDP only, which I think it at it's like core implementation, which is a step up from like AnyConnect and OpenVPN where they have TCP modes.So you can experience the, uh, glorious, trash fire of TCP in TCP. And from what I understand with wireguard, you don't need to set up a certificate authority or figure out how the heck to revoke certificates. Uh, you just have key pairs and if a node needs to be removed, you delete the key pair and you're done.And I think that really matches up with a lot of the philosophy behind how tailscale networks work a lot better. You know, you have a list of keys and if the network changes the list of keys changes, that's, that's the end of the story.So maybe one of the big selling points was just What has the least amount of things I guess, to deal with, or what's the, the simplest, when you're using a component that you want to put into your own product, you kind of want the least amount of things that could go wrong, I guess.[00:27:14] Xe: Yeah. It's more like simple, but not like limiting. Like, for example, a set of tinker toys is simple in that, you know, you can build things that you don't have to worry too much about the material science, but a set of tinker toys is also limiting because you know, like they're little wooden, dowels and little circles made out of wind that you stick the dowels into, you know, you can only do so much with it.And I think that in comparison, wireguard is simple. You know, there's just key pairs. They're just encryption. And it's simple in it's like overall theory and it's implementation, but it's not limiting. Like you can do pretty much anything you want with it.inherently whenever we build something, that's what we want, but that's a, that's an interesting way of putting it. Yeah.[00:28:05] Xe: Yeah. It. It can be kind of annoyingly hard to figure out how to make things as simple as they need to be, but still allow for complexity to occur. So you don't have to like set up a keyboard macro to write if error not equals nil over and over.[00:28:21] Jeremy: I guess the next thing I'd like to talk a little bit about is. We we've covered it a little bit, but at a high level, I understand that that tailscale uses wireguard, which is the open source, VPN protocol, I guess you could call it. And then there's the client software. You're saying you need to install on each of the servers and workstations.But there's also a, a control plane. and I wonder if you could kind of talk a little bit about I guess at a high level, what are all the different components of, of tailscale?[00:28:54] Xe: There's the agent that you install in your devices. The agent is basically the same between all the devices. It's all written in go, and it turns out that go can actually cross compile fairly well. So you have. Your, you know, your implementation in go, that is basically the, the same code, more or less running on windows, MacOS, freeBSD, Android, ChromeOS, iOS, Linux.I think I just listed all the platforms. I'm not sure, but you have that. And then there's the sort of control plane on tailscale's side, the control plane is basically like control, uh, which is, uh, I think a get smart reference. and that is basically a key dropbox. So, you know, you You authenticate through there. That's where the admin panel's hosted. And that's what tells the different tailscale nodes uh, the keys of all the other machines on the tailnet. And also on tailscale side there's, uh, Derp which is a fleet of a bunch of different VPSs in various clouds, all over the world, both to try to minimize cost and to, uh, have resiliency because if both digital ocean and Vultr go down globally, we probably have bigger problems.[00:30:15] Jeremy: I believe you mentioned that the, the clients were written in go, are the control plane and the relay, the Derp portion. Are those also written in go or are they[00:30:27] Xe: They're all written and go, yeah,go as much as possible. Yeah.It's kind of what happens when you have some ex go team members is the core people involved in tail scale, like. There's a go compiler fork that has some additional patches that go upstream either can't accept, uh, won't accept or hasn't yet accepted, for a while. It was how we did things like trying to shave off by bites from binary size to attempt to fit it into the iOS network extension limit.Because for some reason they only allowed you to have 15 megabytes of Ram for both like your application and working Ram. And it turns out that 15 megabytes of Ram is way more than enough to do something like OpenVPN. But you know, when you have a peer-to-peer VPN engine, it doesn't really work that well.So, you know, that's a lot of interesting engineering challenge.[00:31:28] Jeremy: That was specifically for iOS. So to run it on an iPhone.[00:31:32] Xe: Yeah. Um, and amazingly after the person who did all of the optimization to the linker, trying to get the binary size down as much as possible, like replacing Unicode packages was something that's more coefficient, you know, like basically all but compressing parts of the binary to try to save space. Then the iOS, I think 15 beta dropped and we found out that they increased the network extension Ram limit to 50 megabytes and the look of defeat on that poor person's face. I feel very bad for him.[00:32:09] Jeremy: you got what you wanted, but you're sad about it,[00:32:12] Xe: Yeah.[00:32:14] Jeremy: so that's interesting too. you were using a fork of the go compiler [00:32:19] Xe: Basically everything that is built is built using, uh, the tailscale fork, of the go compiler.[00:32:27] Jeremy: Going forward is the sort of assumption is that's what you'll do, or is it you're, you're hoping you can get this stuff upstreamed and then eventually move off of it.[00:32:36] Xe: I'm pretty sure that, I, I don't know if I can really make a forward looking statement like that, but, I've come to accept the fact that there's a fork of the go compiler. And as a result, it allows a lot more experimentation and a bit more of control, a bit more control over what's going on. like I'm, I'm not like the most happy with it, but I've, I understand why it exists and I'm, I've made my peace with it.[00:33:07] Jeremy: And I suppose it, it helps somewhat that the people who are working on it actually originally worked on the, go compiler at Google. Is that right?[00:33:16] Xe: Oh yeah. If, uh, there weren't ex go team people working on that, then I would definitely feel way less comfortable about it. But I trust that the people that are working on it, know what they're doing at least enough.[00:33:30] Jeremy: I, I feel like, that's, that's kind of the position we put ourselves in with software in general, right? Is like, do we trust our ourselves enough to do this thing we're doing?[00:33:39] Xe: Yeah. And trust is a bitch.[00:33:44] Jeremy: um, I think one of the things that's interesting about tail scale is that it's a product that's kind of it's like network infrastructure, right? It's to connect you to your other devices. And that's a little different than somebody running a software as a service. And so. how do you test something that's like built to support a network and, and how is that different than just making a web app or something like that.[00:34:11] Xe: Um, well, it's a lot more complicated for one, especially when you have to have multiple devices in the mix with multiple different operating systems. And I was working on some integration tests, doing stuff for a while, and it was really complicated. You have to spin up virtual machines, you know, you have to like make sure the virtual machines are attempting to download the version of the tailscale client you wanna test and. It's it's quite a lot in practice.[00:34:42] Jeremy: I mean, do you have a, a lab, you know, with Android phones and iPhones and laptops and all this sort of stuff, and you have some kind of automated test suite to see like, Hey, if these machines are in Ottawa and, my servers in San Francisco, like you're mentioning before that I can get from my iPhone to this server and the data center over here, that kind of thing.[00:35:06] Xe: What's the right way to phrase this without making things look bad. Um, it's a work in progress. It it's, it's really a hard problem to solve, uh, especially when the company is fully remote and, uh, like. Address that's listed on the business records is literally one of the founders condos because you know, the company has no office.So that makes the logistics for a lot of this. Even more fun.[00:35:37] Jeremy: Probably any company that's in an early stage feels the same way where it's like, everything's a work in progress and we're just gonna, we're gonna keep going and we're gonna get there. And as long as everything keeps running, we're good.[00:35:50] Xe: Yeah. I, I don't like thinking about it in that way, because it kind of sounds like pessimistic or defeatist, but at some level it's, it, it really is a work in progress because it's, it's a hard problem and hard problems take a lot of time to solve, especially if you want a solution that you're happy with.[00:36:10] Jeremy: And, and I think it's kind of a unique case too, where it's not like if it goes down, it's like people can't do their job. Right. So it's yeah.[00:36:21] Xe: Actually, if tail scales like control plane goes down, I don't think people would notice until they tried to like boot up a, a reboot, a laptop, or connect a new device to their tailnet. Because once, once all the tailscale agents have all of the information they need from the control plate, you know, they just, they just continue on independently and don't have to care.Derp is also fairly independent of the, like the key dropbox component. And, you know, if that, if that goes down Derp doesn't care at all,[00:37:00] Jeremy: Oh, okay. So if the control plane is down, as long as you had authenticated earlier in the day, you can still, I don't know if it's cached or something, but you can still continue to reach the relay servers, the Derp servers or your, [00:37:15] Xe: other nodes. Yeah. I, I'm pretty sure that in most cases, the control plane could be down for several hours a day and nobody would notice unless they're trying to deal with the admin panel.[00:37:28] Jeremy: Got it. that's a little bit of a relief, I suppose, for, for all of you running it,[00:37:33] Xe: Yeah. Um, it's also kind of hard to sell people on the idea of here is a VPN thing. You don't need to self host it and they're like, what? Why? And yeah, it can be fun.[00:37:49] Jeremy: though, I mean, I feel like anybody who has, self-hosted a VPN, they probably like don't really wanna do it. I don't know. Maybe I'm wrong.[00:38:00] Xe: well, so a lot of the idea of wanting to self host it is, uh, I think it's more of like trying to be self-sufficient and not have to rely on other companies, failures dictating your company's downtime. And, you know, like from some level that's very understandable. And, you know, if, you know, like tail scale were to get bought out and the new owners would, you know, like basically kill the product, they'd still have something that would work for them.I don't know if like such a defeatist attitude is like productive. But it is certainly the opinion that I have received when I have asked people why they wanna self-host. other people, don't want to deal with identity providers or the, like, they wanna just use their, they wanna use their own identity provider.And what was hilarious was there was one, there was one thing where they were like our old VPN server died once and we got locked out of our network. So therefore we wanna, we wanna self-host tailscale in the future so that this won't happen again.And I'm like, buddy, let's, let's just, let's just take a moment and retrace our steps here. CAuse I don't think you mean what you think you mean.[00:39:17] Jeremy: yeah, yeah. [00:39:19] Xe: In general, like I suggest people that, you know, even if they're like way deep into the tailscale, Kool-Aid they still have at least one other method of getting into their servers. Ideally, two. I, I admit that I'm, I come from an SRE style background and I am way more paranoid than most, but it, I usually like having, uh, a backup just in case.[00:39:44] Jeremy: So I, I suppose, on, on that note, let's, let's talk a little bit about your role at tailscale. the title of the archmage of infrastructure is one of the, the coolest titles I've, uh, I've seen. So maybe you can go a little bit into what that entails at, at tailscale.[00:40:02] Xe: I started that title as a joke that kind of stuck, uh, my intent, my initial intent was that every time someone asked, I'd say, I'd have a different, you know, like mystic sounding title, but, uh, archmage of infrastructure kind of stuck. And since then, I've actually been pivoting more into developer relations stuff rather than pure software engineering.And, from the feedback that I've gotten at the various conferences I've spoken at, they like that title, even though it doesn't really fit with developer relations work at all, it it's like it fits because it doesn't. You know, that kind of coney kind of way.[00:40:40] Jeremy: I guess this would go more into the, the infrastructure side, but. What does the, the scale of your infrastructure look like? I mean, I, I think that you touched a little bit on the fact that you have relay servers all over the place and you've got this control plane, but I wonder if you could give people a little bit of perspective of what kind of undertaking this is.[00:41:04] Xe: I am pretty sure at this point we have more developer laptops and the like, than we do production servers. Um, I'm pretty sure that the scale of the production of production servers are in the tens, at most. Um, it turns out that computers are pretty darn and efficient and, uh, you don't really need like a lot of computers to do something amazing.[00:41:27] Jeremy: the part that I guess surprises me a little bit is, is the relay servers, I suppose, because, I would imagine there's a lot of traffic that goes through those. are you finding that just most of the time they just aren't needed and usually you can make a direct connection and that's why you don't need too many of these.[00:41:45] Xe: From what I understand. I don't know if we actually have a way to tell, like what percentage of data is going over the relays versus not. And I think that was an intentional decision, um, that may have been revisited I'm operating based off of like six to 12 month old information right now. But in general, like the only state that the relay servers has is in Ram.And whenever the relay, whenever you disconnect the server, the state is dropped.[00:42:18] Jeremy: Okay.[00:42:19] Xe: and even then that state is like, you know, this key is listening. It is, uh, connected, uh, in case you wanna send packets over here, I guess. it's a bit less bandwidth than you're probably thinking it's not like enough to max it out 24/7, but it is, you know, measurable and there are some, you know, costs associated with it. This is also why it's on digital ocean and vulture and not AWS. but in general, it's a lot less than you'd think. I'm pretty sure that like, if I had to give a baseless assumption, I'd say that probably about like 85% of traffic goes directly.And the remaining is like the few cases in the whole punching engine that we haven't figured out yet. Like Palo Alto fire walls. Oh God. Those things are a nightmare.[00:43:13] Jeremy: I see. So it's most of the traffic actually ends up. Being straight peer to peer. Doesn't have to go through your infrastructure. And, and therefore it's like, you don't need too many machines, uh, to, to make this whole thing work.[00:43:28] Xe: Yeah. it turns out that computers are pretty darn fast and that copying data is something that computers are really good at doing. Um, so if you have, you know, some pretty darn fast computers, basically just sitting there and copying data back and forth all day, like it, you can do a lot with shockingly little.Um, when I first started, I believe that the Derp VMs were using like sometimes as little as one core and 512 megabytes of Ram as like a primary Derp. And, you know, we only noticed when, there were some weird connection issues for people that were only on Derp because there were enough users that the machine had ran out of memory.So we just, you know, upped the, uh, virtual machine size and called it a day. But it's, it's truly remarkable how mu how far you can get with very little[00:44:23] Jeremy: And you mentioned the relay servers, the, the Derp servers were on services like digital ocean and Vultr. I'm assuming because of the, the bandwidth cost, for the control plane, is, is that on AWS or some other big cloud provider?[00:44:39] Xe: it's on AWS. I believe it's in EU central 1.[00:44:44] Jeremy: You're helping people connect from device to device and in a situation like that. what does monitoring look like in, in incidents? Like what are you looking for to determine like, Hey, something's not working.[00:44:59] Xe: there's monitoring with, you know, Prometheus, Grafana, all of that stuff. there are some external probing things. there's also some continuous functional testing for trying to connect to tailscale and like log in as an account. And if that fails like twice in a row, then, you know, something's very wrong and, you know, raise the alarm.But in general. A lot of our monitoring is kind of hard at some level because you know, we're tailscale at a tailscale can't always benefit from tailscale to help operate tail scale because you know, it's tailscale. Um, so it, it still trying to figure out how to detangle the chicken and egg situation.It's really annoying.there's the, the term dog fooding, right? Where they're saying like, oh, we, we run, um, our own development on our own platform or our own software. but I could see when your product is network infrastructure, VPNs, where that could be a little, little dicey.[00:46:06] Xe: Yeah, it is very annoying. But I I'm pretty sure we'll figure something out. It is just a matter of when, another thing that's come up is we've kind of wanted to use tailscale's SSH features, where you specify ACLs in your, you specify ACL rules to allow people to SSH, to other nodes as various users.but if that becomes your main access to production, then you know, like if tailscale is down and you're tailscale, like how do you get in, uh, then there's been various philosophical discussions about this. it's also slightly worse if you use what's called check mode in SSH, where, uh, tail scale, SSH without check mode, you know, you just, it, the, the server checks against the policy rules and the ACL and if it. if it's okay, it lets you in. And if not, it says no, but with check mode, there's also this like eight hour, there's this like eight hour quote unquote lifetime for you to have like sudo mode on GitHub, where you do an auth an auth challenge with your auth aprovider. And then, you know, you're given a, uh, Hey, this person has done this thing type verification.And if that's down and that goes through the control plane, and if the control plane is down and you're tailscale, trying to debug the control plane, and in order to get into the control plane over tailscale, you need to use the, uh, control plane. It, you know, that's like chicken and egg problem level 78,which is a mythical level of chicken egg problem that, uh, has only been foretold in the legends of yore or something.[00:47:52] Jeremy: at that point, it sounds like somebody just needs to, to drive to the data center and plug into the switch.[00:47:59] Xe: I mean, It's not, it's not going to, it probably wouldn't be like, you know, we need to get a person with an angle grinder off of Craigslist type bad. Like it was with the Facebook BGP outage, but it it's definitely a chicken and egg problem in its own right.it makes you do a lot of lateral thinking too, which is also kind of interesting.[00:48:20] Jeremy: When, when you say lateral thinking, I'm just kind of curious, um, if you have an example of what you mean.[00:48:27] Xe: I don't know of any example that isn't NDAed. Um, but basically, you know, tail scale is getting to the, to the point where tailscale is relying on tailscale to make tailscale function and you know, yeah. This is classic oroboros style problem.I've heard a, uh, a wise friend of mine said that that is an ideal problem to have, which sounds weird at face value. But if you're getting to that point, that means that you're successful enough that, you know, you're having that problem, which is in itself a good thing, paradoxically.[00:49:07] Jeremy: better to have that problem than to have nobody care about the product. Right.[00:49:12] Xe: Yeah.[00:49:13] Jeremy: kind of on that, that note, um, you mentioned you worked at, at Salesforce, uh, I believe that was working on Heroku. I wonder if you could talk a little about your experience working at, you know, tailscale, which is kind of more of a, you know, early startup versus, uh, an established company like Salesforce.[00:49:36] Xe: So at the time I was working at Heroku, it definitely didn't feel like I was working at Salesforce for the majority of it. It felt like I was working, you know, at Heroku, like on my resume, I listed as Heroku. When I talked about it to people, I said, I worked at Heroku and that sales force was this, you know, mythical, Ohana thing that I didn't have to deal with unless I absolutely had to.By the end of the time I was working at Heroku, uh, the salesforce, uh, sort of started to creep in and, you know, we moved from tracking issues in GitHub issues. Like we were used to, to using their, oh, what's the polite way to say this, their creation, which is, which was like the moral equivalent of JIRA implemented on top of Salesforce.You had to be behind the VPN for it. And, you know, every ticket had 20 fields and, uh, there were no templates. And in comparison with tail scale, you know, we just use GitHub issues, maybe some like things in notion for doing like longer term tracking or Kanban stuff, but it's nice to not have. you know, all of the pomp and ceremony of filling out 20 fields in a ticket for like two sentences of this thing is obviously wrong and it's causing X to happen.Please fix.[00:51:08] Jeremy: I, I like that, that phrase, the, the creation, that's a very, very diplomatic term.[00:51:14] Xe: I mean, I can think of other ways to describe it, but I'm pretty sure those ways wouldn't be allowed on the podcast. So[00:51:25] Jeremy: Um, but, but yeah, I, I know what you mean for sure where, it, it feels like there's this movement from, Hey, let's just do what we need. Like let's fill in the information that's actually relevant and don't do anything else to a shift to, we need to fill in these 10 fields because that's the thing we do.Yeah.[00:51:48] Xe: Yeah. and in the time I've been working for tail scale, I'm like employee ID 12. And, uh, tail scale has gone from a company where I literally know everyone to just recently to the point where I don't know everyone anymore. And it's a really weird feeling. I've never been in a, like a small stage startup that's gotten to this size before, and I've described some of my feelings to other people who have been there and they're like, yeah, welcome to the club. So I figure a lot of it is normal. from what I understand, though, there's a lot of intentionality to try to prevent tail skill from becoming, you know, like Google style, complexity, organizational complexity, unless that is absolutely necessary to do something.[00:52:36] Jeremy: it's a function of size, right? Like as you have more people, more teams, then more process comes in. that's a really tricky balance to, to grow and still keep that feeling of, I'm just doing the thing, I'm doing the work rather than all this other process stuff.[00:52:57] Xe: Yeah, but it, I've also kind of managed to pigeonhole myself off into a corner with devrel stuff. And that's been nice. I've been working a bunch with, uh, like marketing people and, uh, helping out with support occasionally and doing a, like a godawful amount of writing.[00:53:17] Jeremy: the, the writing, for our audience's benefit, I, I think they should, they should really check out your blog because I think that the way you write your, your articles is very thoughtful in terms of the balance of the actual example code or example scripts and the descriptions and, and some there's a little bit of a narrative sometimes too.So, [00:53:40] Xe: Um, I'm actually more of a prose writer just by like how I naturally write things. And a lot of the style of how I write things is, I will take elements from, uh, the Socratic style of dialogue where, you know, you have the student and the teacher. And, you know, sometimes the student will ask questions that the teacher will answer.And I found that that's a particularly useful way to help model understanding or, you know, like put side concepts off into their own little blurbs or other things like that. I also started doing those conversation things with, uh, furry art, specifically to dunk on a homophobe that was getting very angry at furry art being in, uh, another person's blog.And that's it, it's occasionally fun to go into the, uh, orange website of bad takes and see the comments when people complain about it. oh gosh, the bad takes are hilariously good. Sometimes.[00:54:45] Jeremy: it's good that you have like a, a positive, mindset around that. I know some people can read, uh, that sort of stuff and go, you know, just get really bummed out. [00:54:54] Xe: One of the ways I see it is that a lot of the time algorithms are based on like sheer numbers. So if you like get something that makes people argue in the comments, that number will go up and because there's more comments on it, it makes more people more likely to, to read the article and click on it.So, sometimes I have been known to sprinkle, what's the polite way to say this. I've been known to sprinkle like intentionally kind of things that will, uh, get people and make them want to argue about it in the comments. Purely to make the engagement numbers rise up, which makes more people likely to read the article.And, it's kind of a dirty practice, but you know, it makes more people read the article and more people benefit. So, you know, like it's kind of morally neutral, I guess.[00:55:52] Jeremy: usually that, that seems like, a sketchy thing. But I feel like if it's in service to, uh, like a technical blog post, I mean, why not? Right.[00:56:04] Xe: And a lot of the times I'll usually have the like, uh, kind of bad take, be in a little conversation blurb thing so that people will additionally argue about the characterization of, you know, the imaginary cartoon shark or whatever.[00:56:20] Jeremy: That's good. It's the, uh, it's the Xe Xe universe that they're, they're stepping into.[00:56:27] Xe: I've heard people describe it, uh, lovingly as the xeiaso.net cinematic universe.I've had some ideas on how to expand it in the future with more characters that have more different kind of diverse backgrounds. But, uh, it turns out that writing this stuff is hard. Like actually very hard because you have to get this right.You have to get the right balance of like snark satire, uh, like enlightenment. Andit's, it's surprisingly harder than you'd think. Um, but after a while, I've just sort of managed to like figure out as I'm writing where the side tangents come off and which ones I should keep and which ones I should, uh, prune and which ones can also help, Gain deeper understanding with a little like Socratic dialogue to start with a Mo like an incomplete assumption, like an incomplete picture.And then, you know, a question of, wait, what about this thing? Doesn't that conflict with that? And like, well, yes. technically it does, but realistically we don't have to worry about that as much. So we can think about it just in terms of this bigger model and, uh, that's okay. Like, uh, I mentioned the OSI model earlier, you know, like the seven layer OSI model it's, you know, genuinely overkill for basically everything, except it's a really great conceptual model for figuring out the difference between, you know, like an ethernet cable, an ethernet, like the ethernet card, the IP stack TCP and, you know, TLS or whatever.I have a couple talks that are gonna be up by the time this is published. Uh, one of them is my, uh, rustconf talk on my, or what was it called? I think it was called the surreal horrors of PAM or something where I discussed my experience, trying to bug a PAM module in rust, uh, for work. And, uh, it's the kind of story where, you know, it's bad when you have a break point on dlopen.[00:58:31] Jeremy: That sounds like a nightmare.[00:58:32] Xe: Oh yeah. Like part of the attempting to fix that process involved, going very deep. We're talking like an HTML frame set in the internet archive for sunOS documentation that was written around the time that PAM was used. Like it's things that are bad enough were like everything in the frame set, but the contents had eroded away through bit rot and you know, you're very lucky just to have what you do.[00:59:02] Jeremy: well, I'm, I'm glad it was. It was you and not me. we'll get to, to hear about it and, and not have to go through the, the suffering ourselves.[00:59:11] Xe: yeah. One of the things I've been telling people is that I'm not like a brilliant programmer. Like I know a bunch of people who are definitely way smarter than me, but what I am is determined and, uh, determination is a bit stronger of a force than you'd think.[00:59:27] Jeremy: Yeah. I mean, without it, nothing gets done. Right.[00:59:30] Xe: Yeah.[00:59:31] Jeremy: as we wrap up, is there anything we missed or anything else you wanna mention? [00:59:36] Xe: if you wanna look at my blog, it's on xeiaso.net. That's X, E I a S o.net. Um, that's where I post things. You can see, like the 280 something articles at time of recording. It's probably gonna get to 300 at some point, oh God, it's gonna get to 300 at some point. Um, and yeah, from, I try to post articles about weekly, uh, depending on facts and circumstances, I have a bunch of talks coming up, like one about the hilarious over engineering I did in my blog.And maybe some more. If I get back positive responses from calls for paper submissions,[01:00:21] Jeremy: Very cool. Well, Xe thank you so much for, for coming on software engineering radio.[01:00:27] Xe: Yeah. Thank you for having me. I hope you have a good day and, uh, try out tailscale, uh, note my bias, but I think it's great.

god google san francisco seattle european union iphone android starbucks id ios address ip infrastructure i am stockholm ram ottawa minecraft gucci salesforce api acl gmail craigslist aws linux github prometheus kool aid vpn qa devops macos html nebula icloud dns oh god kubernetes cloudflare socratic kanban authentication vpns ohana zed sre purely json unix tls jira chrome os osi derp ssh acls tcp xe heroku oauth unicode jwt udp ipv4 technical writing ec2 grafana freebsd virtual private networks my blog tailscale archmage openvpn software engineering radio jeremy you jeremy so sunos

Jonathan Shariat on Tragic Design

Play Episode Listen Later Sep 9, 2022 55:19

Jonathan Shariat is the coauthor of the book Tragic Design and co-host of the Design Review Podcast. He's currently a Sr. Interaction Designer & Accessibility Program Lead at Google.This episode originally aired on Software Engineering Radio.Topics covered: How poor design kills in medical environments Causing harm with features meant to bring joy Considerations during the product development cycle Industry specific checklists and testing requirements Creating guiding principles for a team Why medical software often has poor UX Designing for crisis situations Why dark patterns can be bad in the long term Related Links @designuxui Tragic Design How Bad UX Killed Jenny Design Review podcast Deceptive Design TranscriptYou can help edit this transcript on GitHub.[00:00:00] Jeremy: Today I'm talking to Jonathan Shariat, he's the co-author of Tragic design. The host of the design review podcast. And he's currently a senior interaction designer and accessibility program lead at Google. Jonathan, welcome to software engineering radio.[00:00:15] Jonathan: Hi, Jeremy, thank you So much for having me on.[00:00:18] Jeremy: the title of your book is tragic design. And I think that people can take a lot of different meanings from that. So I wonder if you could start by explaining what tragic design means to you.[00:00:33] Jonathan: Hmm. For me, it really started with this story that we have in the beginning of the book. It's also online. Uh, I originally wrote it as a medium article and th that's really what opened my eyes to, Hey, you know, design has, is, is this kind of invisible world all around us that we actually depend on very critically in some cases.And So this story was about a girl, you know, a nameless girl, but we named her Jenny for the story. And in short, she came for treatment of cancer at the hospital, uh, was given the medication and the nurses that were taking care of her were so distracted with the software they were using to chart, make orders, things like that, that they miss the fact that she needed hydration and that she wasn't getting it.And then because of that, she passed away. And I still remember that feeling of just kind of outrage. And, you know, when we hear a lot of news stories, A lot of them are outraging. they, they touch us, but some of them, some of those feelings stay and they stick with you.And for me, that stuck with me, I just couldn't let it go because I think a lot of your listeners will relate to this. Like we get into technology because we really care about the potential of technology. What could it do? What are all the awesome things that could do, but we come at a problem and we think of all the ways it could be solved with technology and here it was doing the exact opposite.It was causing problems. It was causing harm and the design of that, or, you know, the way that was built or whatever it was failing Jenny, it was failing the nurses too, right? Like a lot of times we blame that end user and, and it caused it. So to me, that story was so tragic. Something that deeply saddened me and was regrettable and cut short someone's uh, you know, life and that's the definition of tragic, and there's a lot of other examples with varying degrees of tragic, but, um, you know, as we look at the impact technology has, and then the impact we have in creating those technologies that have such large impacts, we have a responsibility to, to really look into that and make sure we're doing as best of job as we can and avoid those as much as possible.Because the biggest thing I learned in researching all these stories was, Hey, these aren't bad people. These aren't, you know, people who are clueless and making these, you know, terrible mistakes. They're me, they're you, they're they're people. Um, just like you and I, that could make the same mistakes.[00:03:14] Jeremy: I think it's pretty clear to our audience where there was a loss of life, someone, someone died and that's, that's clearly tragic. Right? So I think a lot of things in the healthcare field, if there's a real negative outcome, whether it's death or severe harm, we can clearly see that as tragic.and I, I know in your book you talk about a lot of other types of, I guess negative things that software can cause. So I wonder if you could, explain a little bit about now past the death and the severe injury. What's tragic to you.[00:03:58] Jonathan: Yeah. still in that line of like of injury and death, And, you know, the side that most of us will actually, um, impact, our work day-to-day is also physical harm. Like, creating this software in a car. I think that's a fairly common one, but also, ergonomics, right?Like when we bring it back to something like less impactful, but still like multiplied over the impact of, multiplied over the impact of a product rather, it can be quite, quite big, right? Like if we're designing software in a way that's very repetitive or, you know, everyone's, everyone's got that, that like scroll, thumb, scroll, you know, issue.Right. if, uh, our phones aren't designed well, so there's a lot of ways that it can still physically impact you ergonomically. And that can cause you a lot of problem arthritis and pain, but yeah, there's, there's other, there's other, other ways that are still really impactful. So the other one is by saddening or angry.You know, that emotional harm is very real. And oftentimes sometimes it gets overlooked a little bit because it's, um, you know, physical harm is what is so real to us, but sometimes emotional harm isn't. But, you know, we talk about in the book, the example of Facebook, putting together this great feature, which takes your most liked photo, and, you know, celebrates your whole year by you saying, Hey, look at as a hero, you're in review this, the top photo from the year, they add some great, you know, well done illustrations behind it, of, of balloons and confetti and, people dancing.But some people had a bad year. Some people's most liked engaged photo is because something bad happened and they totally missed. And because of that, people had a really bad time with this where, you know, they lost their child that year. They lost their loved one that year, their house burnt down. Um, something really bad happened to them.And here was Facebook putting that photo of their, of their dead child up with, you know, balloons and confetti and people dancing around it. And that was really hard for people. They didn't want to be reminded of that. And especially in that way, and these emotional harms also come into the, in the play of, on anger.You know, we talk about, well, one, you know, there's, there's a lot of software out there that, that, um, tries to bring up news stories that anger us and which equals engagement. Um, but also ones that, um, use dark patterns to trick us into purchasing and buying and forgetting about that free trial. So they charge us for a yearly subscription and won't refund us.Uh, if you've ever tried to cancel a subscription, you start to see some real their their real colors. Um, so emotional harm and, uh, anger is a, is a big one. We also talk about injustice in the book where there are products that are supposed to be providing justice. Um, and you know, in very real ways like voting or, you know, getting people the help that they need from the government, or, uh, for people to see their loved ones in jail.Um, or, you know, you're getting a ticket unfairly because you couldn't read the sign was you're trying to read the sign and you, and you couldn't understand it. so yeah, we look at a lot of different ways that design and our saw the software that we create can have very real impact on people's lives and in a negative way, if we're not careful. [00:07:25] Jeremy: the impression I get, when you talk about tragic design, it's really about anything that could harm a person, whether physically, emotionally, you know, make them angry, make them sad. And I think the, the most liked photo example is a great one, because like you said, I think the people may be building something that, that harms and they may have no idea that they're doing it.[00:07:53] Jonathan: Exactly like that. I love that story because not, not to just jump on the bandwagon of saying bad things about like Facebook or something. No, I love that story because I can see myself designing the exact same thing, like being a part of that product, you know, building it, you know, looking at the, uh, the, the specifications, the, um, the, the PM, you know, put it that put together and the decks that we had, you know, like I could totally see that happening.And just never, I think, never having the thought, because our we're so focused on like delighting our users and, you know, we have these metrics and these things in mind. So that's why, like, in the book, we really talk about a few different processes that need to be part of. Product development cycle to stop, pause, and think about like, well, what are the, what are the negative aspects here?Like what are the things that could go wrong? What are the, what are the other life experiences that are negative? Um, that could be a part of this and you don't need to be a genius to think of every single thing out there. You know, like in this example, I think just talking about, you know, like, oh, well, some people might've had, you know, if they would have taken probably like, you know, one hour out of their entire project, or maybe even 10 minutes, they might've come up with like, oh, there could be bad thing.Right. But, um, so if you don't have that, that, that moment to pause that moment to just say, okay, we have time to brainstorm together about like how this could go wrong or how, you know, the negative of life could be impacted by this, um, feature that that's all that it takes. It doesn't necessarily mean that you need to do.You know, giant study around the impact, potential impact of this product and all the, all the ways, but really just having a part of your process that takes a moment to think about that will just create a better product and better, product outcomes. You know, if you think about all of life's experiences and Facebook can say, Hey, condolences, and like, you know, and show that thoughtfulness that would be, uh, I would have that have higher engagement that would have higher, uh, satisfaction, right?So they could have created a better outcome by considering these things and obviously avoid the impact negative impact to users and the negative impact to their product. [00:10:12] Jeremy: continuing on with that thought you're a senior interaction designer and you're an accessibility program lead. And so I wonder on the projects that you work on, and maybe you can give us a specific example, but how are you ensuring that you're, you're not running up against these problems where you build something that you think is going to be really great, um, for your users, but in reality ends up being harmful and specifically.[00:10:41] Jonathan: Yeah, one of the best ways is, I mean, it should be part of multiple parts of your cycle. If, if you want something, if you want a specific outcome out of your product development life cycle, um, it needs to be from the very beginning and then a few more times, so that it's not, you know, uh, I think, uh, programmers, uh, will all latch onto this, where they have the worst end of the stick, right?Because a and Q and QA as well. Because, you know, any bad decision or assumption that's happened early on with, you know, the, the business team or, or the PM, you know, gets like multiplied when they talk to the designer and then gets multiplied again, they hand it off. And it's always the engineer who has to, has to put the final foot down, be like, this doesn't make sense.Or I think users are going to react this way, or, you know, this is the implication of that, that assumption. So, um, it's the same thing, you know, in our team, we have it in the very early stage when someone's putting together the idea for the feature, our project, we want to work on it's right there. There's a few, there's like a section about accessibility and a few other sections, uh, talking about like looking out for this negative impact.So right away, we can have a discussion about it when we're talking about like what we should do about this and the D and the different, implications of implementing it. That's the perfect place for it. You know, like maybe, maybe when you're a brainstorm. Uh, about like, what should we should do? Maybe it's not okay there because you're trying to be creative.Right. You're trying to think. But at the very next step, when you're saying, okay, like what would it mean to build this that's exactly where I should start showing up and, you know, the discussion from the team. And it depends also the, the risk involved, right? Like, uh, it depends, which is attached to how much, uh, time and effort and resources you should put towards avoiding that risk it's risk management.So, you know, if you work, um, like my, um, you know, colleagues, uh, or, you know, some of my friends were working in the automotive industry and you're creating a software and you're worried that it might be distracting. There might be a lot more time and effort or the healthcare industry. Um, those were, those are, those might need to take a lot more resources, but if you're a, maybe a building, um, you know, SaaS software for engineers to spin up, you know, they're, um, you know resources.Um, there might be a different amount of resources. It never is zero, uh, because you still have, are dealing with people and you'll impact them. And, you know, maybe, you know, that service goes down and that was a healthcare service that went down because of your, you know, so you really have to think about what the risk is.And then you can map that back to how much time and effort you need to be spending on getting that. Right. And accessibility is one of those things too, where a lot of people think that it takes a lot of effort, a lot of resources to be accessible. And it really isn't. It just, um, it's just like tech debt, you know, if, if you have ignored your tech debt for, you know, five years, and then they're saying, Hey, let's all fix all the tech debt. Yeah. Nobody's going to be on board for that as much. Versus like, if, if addressing that and finding the right level of tech debt that you're okay with and when you address it and how, um, because, and just better practice. That's the same thing with accessibility is like, if you're just building it correctly, as you go, it's, it's very low effort and it just creates a better product, better decisions.Um, and it is totally worth the increased amount of people who can use it and the improved quality for all users. So, um, yeah, it's just kind of like a win-win situation.[00:14:26] Jeremy: one of the things you mentioned was that this should all start. At the very beginning or at least right after you've decided on what kind of product you're going to build, and that's going to make it much easier than if you come in later and try to, make fixes then, I wonder when you're all getting together and you're trying to come up with these scenarios, trying to figure out negative impacts, what kind of accessibility, needs you need to have, who are the people who are involved in that conversation?Like, um, you know, you have a team of 50 people who needs to be in the room from the very beginning to start working this out.[00:15:05] Jonathan: I think it would be the same people who are there for the project planning, like, um, at, on my team, we have our eng counter counterparts there. at least the team lead, if, if, if there's a lot of them, but you know, if they would go to the project kickoff, uh, they should be there.you know, we, we have everybody in their PM, design, engineers, um, our project manager, like anyone who wants to contribute, uh, should really be there because the more minds you have with this the better, and you'll, you'll tease out much, much more of, of of all the potential problems because you have a more, more, um, diverse set of brains and life experiences to draw from.And so you'll, you'll get closer to that 80% mark, uh, that you can just quickly take off a lot of those big items off the table, right? [00:16:00] Jeremy: Is there any kind of formal process you follow or is it more just, people are thinking of ideas, putting them out there and just having a conversation.[00:16:11] Jonathan: Yeah, again, it depends which industry you're in, what the risk is. So I previously worked at a healthcare industry, um, and for us to make sure that we get that right, and how it's going to impact the patients, especially though is cancer care. And they were using our product to get early warnings of adverse effects.Our, system of figuring that like, you know, if that was going to be an issue was more formalized. Um, in, in some cases, uh, like, like actually like healthcare and especially if the, if it's a device or, or in certain software circumstances, it's determined by the FDA to be a certain category, you literally have a, uh, governmental version of this.So the only reason that's there is because it can prevent a lot of harm, right? So, um, that one is enforced, but there's, there's reasons, uh, outside of the FDA to have that exact formalized part of your process. And it can, the size of it should scale depending on what the risk is. So on my team, the risk is, is actually somewhat low.it's really just part of the planning process. We do have moments where we, we, um, when we're, uh, brainstorming like what we should do and how the feature will actually work. Where we talk about like what those risks are and calling out the accessibility issues. And then we address those. And then as we are ready to, um, get ready to ship, we have another, um, formalized part of the process.There will be check if the accessibility has been taken care of and, you know, if everything makes sense as far as, you know, impact to users. So we have those places, but in healthcare, but it was much stronger where we had to, um, make sure that we re we we've tested it. We've, uh, it's robust. It's going to work on, we think it's going to work.Um, we, you know, we do user testing has to pass that user testing, things like that before we're able to ship it, uh, to the end user.[00:18:12] Jeremy: So in healthcare, you said that the FDA actually provides, is it like a checklist of things to follow where you must have done this? As you're testing and you must have verified these, these things that's actually given to you by the government.[00:18:26] Jonathan: That's right. Yeah. It's like a checklist and the testing requirement. Um, and there's also levels there. So, I have, I've only, I've only done the lowest level. I know. There's like, I think like two more levels above that. Um, and again, that's like, because the risk is higher and higher and there's more stricter requirements there where maybe somebody in the FDA needs to review it at some point.And, um, so again, like mapping it back to the risk that your company has is, is really important to understanding that is going to help you avoid and, and build a better product, avoid, you know, the bad impact and build a better product. And, and I think that's one of the things I would like to focus on as well.And I'd like to highlight for your, for your listeners, is that, it's not just about avoiding tragic design because one thing I've discovered since writing the book and sharing it with a lot of people. Is that the exact opposite thing is usually, you know, in a vast majority of the cases ends up being a strategically great thing to pursue for the product and the company.You know, if you think about, that, that example with, with Facebook, okay. You've run into a problem that you want to avoid, but if you actually do a 180 there and you find ways to engage with people, when they're grieving, you find people to, to develop features that help people who are grieving, you've created a value to your users, that you can help build the company off of.Right. Um, cause they were already building a bunch of joy features. Right. Um, you know, and also like user privacy, like I, we see apple doing that really well, where they say, okay, you know, we are going to do our ML on device. We are going to do, you know, let users decide on every permission and things like that.And that, um, is a strategy. We also see that with like something like T-Mobile, when they initially started out, they were like one of the nobody, uh, telecoms in the world. And they said, okay, what are all the unethical bad things that, uh, our competitors are doing? They're charging extra fees, you know, um, they have these weird data caps that are really confusing and don't make any sense their contracts, you get locked into for many years.They just did the exact opposite of that. And that became their business strategy and it, and it worked for them now. They're, they're like the top, uh, company. So, um, I think there's a lot of things like that, where you just look at the exact opposite and, you, one you get to avoid the bad, tragic design, but you also see boom, you see an opportunity that, um, become, become a business strategy.[00:21:03] Jeremy: So, so when you referred to exact opposite, I guess you're, you're looking for the potentially negative outcomes that could happen. there was the Facebook example of, of seeing a photo or being reminded of a really sad event and figuring out can I build a product around, still having that same picture, but recontextualizing it like showing you that picture in a way that's not going to make you sad or upset, but is actually a positive.[00:21:35] Jonathan: Yeah. I mean, I don't know maybe what the solution was, but like one example that comes to mind is some companies. Now, before mother's day, we'll send you an email and say, Hey, this is coming up. Do you want us to send you emails about mother's day? Because for some people that's Can, be very painful. That's that's very thoughtful.Right. And that's a great way to show that you, that you care. Um, but yeah, like, you know, uh, thinking about that Facebook example, like if there's a formalized way to engage with, with grieving, like, I would use Facebook for that. I don't use Facebook very often or almost at all, but you know, if somebody passed away, I would engage right with my, my Facebook account.And I would say, okay, look, there's like, there's this whole formalized, you know, feature around, you know, uh, and, and Facebook understands grieving and Facebook understands like this w this event and may like smooth that process, you know, creates comfort for the community that's value and engagement. that is worthwhile versus artificial engagement.That's for the sake of engagement. and that would create, uh, a better feeling towards Facebook. Uh, I would maybe like then spend more time on Facebook. So it's in their mutual interest to do it the right way. Um, and so it's great to focus on these things to avoid harm, but also to start to see new opportunities for innovation.And we see this a lot already in accessibility where there's so many innovations that have come from just fixing accessibility issues like closed captions. We all use it, on our TVs, in busy crowded spaces, on, you know, videos that have no, um, uh, translation for us in different places.So, SEO is, is the same thing. Like you get a lot of SEO benefit from, you know, describing your images and, and making everything semantic and things like that. And that also helps screen readers. and different innovations have come because somebody wanted to solve an accessibility need.And then the one I love, I think it's the most common one is readability, like contrast and tech size. Sure. There's some people who won't be able to read it at all, but it hurts my eyes to read bad contrast and bad text size. And so it just benefits. Everyone creates a better design. And one of the things that comes up so often when I'm, you know, I'm the accessibility program lead.And so I see a lot of our bugs is so many issues that, that are caught because of our, our audits and our, like our test cases around accessibility that just our bad design and our bad experience for everyone. And so we're able to fix that. And, uh, and it's just like an another driver of innovation and there's, there's, there's a ton of accessibility examples, and I think there's also a ton of these other, you know, ethical examples or, you know, uh, avoiding harm where you just can see it. It's an opportunity area where it's like, oh, let's avoid that. But then if you turn around, you can see that there's a big opportunity to create a business strategy out of it.[00:24:37] Jeremy: Can, can you think of any specific examples where you've seen that? Where somebody, you know, doesn't treat it as something to avoid, but, but actually sees that as an opportunity.[00:24:47] Jonathan: Yeah. I mean, I, I think that the, um, the apple example is a really good one where from the beginning, like they, they saw like, okay, in the market, there's a lot of abuse of information and people don't like that. So they created a business strategy around that And that's become a big differentiator for them.Right. Like they, they have like ML on the device. They do. Um, they have a lot of these permission settings, you know, the Facebook. It was very much focused right. On, on using customer data and a lot of it without really asking their permission. And so once apple said, okay, now all apps need to show what you're tracking.And, and then, um, and asked for permission to do that. A lot of people said no, and that caused about $10 billion of loss for, for Facebook. and for, for apple, it's, you know, they advertise on that now that we're, you know, ethical that, you know, we, we source things ethically and we, we care about user privacy and that's a strong position, right?Uh, I think there's a lot of other examples out there. Like I mentioned accessibility and others, but like it they're kind of overflowing, so it's hard to pick one.[00:25:58] Jeremy: Yeah. And I think what's interesting about that too, is with the example of focusing on user privacy or trying to be more sensitive around, death or things like that, as I think that other people in the industry will, will notice that, and then in their own products, then they may start to incorporate those things as well.[00:26:18] Jonathan: Yeah. Yeah, exactly what the example of with T-Mobile. once that worked really, really well and they just ate up the entire market, all the other companies followed suit, right? Like now, um, having those data caps that, you know, are, are very rare, having those surprise fees are a lot, uh, rare.Um, you know, there's, there's no more like deep contracts that lock you in and et cetera, et cetera. A lot of those have become industry standard now. Um, and so It, and it does improve the environment for everyone because, because now it becomes a competitive advantage that everybody needs to meet. Um, so yeah, I think that's really, really important.So when you're going through your product's life cycle, you might not have the ability to make these big strategic decisions. Like, you know, we want to, you know, not have data caps or whatever, but, you know, if you, if you're on that Facebook level and you run into that issue, you could say, well, look, what could we do to address this?What could we could do to, to help this and make, make that a robust feature? You know, when we talk about, lot of these dating apps, one of the problems was a lot of abuse, where women were being harassed or, you know, after the day didn't go well and you know, things were happening. And so a lot of apps have now dif uh, these dating apps have differentiated themselves and attracted a lot of that market because they deal with that really well.And they have, you know, it's built into the strategy. It's oftentimes like a really good place to start too, because one it's not something we generally think about very, very well, which means your competitors. Haven't thought about it very well, which means it's a great place to, to build products, ideas off of. [00:27:57] Jeremy: Yeah, that's a good point because I think so many applications now are like social media applications, their messaging applications there, their video chat, that sort of thing. I think when those applications were first built, they didn't really think so much about what if someone is, you know, sending hateful messages or sending, pictures that people really don't want to see.Um, people are doing abusive things. It was like, they just assume that, oh, people will be, people will be good to each other and it'll be fine. But, uh, you know, in the last 10 years, pretty much all of the major social media companies have tried to figure out like, okay, um, what do I do if someone is being abusive and, and what's the process for that?And basically they all have to do something now. Um,Um [00:28:47] Jonathan: Yeah. And that's a hard thing to like, if, if that, uh, unethical or that, um, bad design decision is deep within your business strategy and your company's strategy. It's hard to undo that like some companies are still, still have to do that very suddenly and deal with it. Right. Like, uh, I know Uber had a big, big part of them, like, uh, and some other companies, but, uh, we're like almost suddenly, like everything will come to a head and they'll need to deal with it.Or, you know, like, Twitter now try to try to get, be acquired by Elon Musk. Uh, some of those things are coming to light, but, I, what I find really interesting is that these these areas are like really ripe for innovation. So if you're interested in, a startup idea or you're, or you're working in a startup, or, you know, you're about to start one, you know, there's a lot of maybe a lot of people out there who are thinking about side projects right now, this is a great way to differentiate and win that market against other well-established competitors is to say, okay, well, what are they, what are they doing right now that is unethical. And it's like, you know, core to their business strategy and doing that differently is really what will help you, to win that market. And we see that happening all the time, you know, especially the ones that are like these established, uh, leaders in the market. they can't pivot like you can, so being able to say, I'm, we're going to do this ethically.We're going to do this, uh, with, you know, with these tragic design in mind and doing the opposite, that's going to help you to, to find your, your attraction in the market.[00:30:25] Jeremy: Earlier, we were talking about. How in the medical field, there is specific regulation or at least requirements to, to try and avoid this kind of tragic design. Uh, I noticed you also worked for Intuit before. Uh, um, so for financial services, I was wondering if there was anything similar where the government is stepping in and saying like, you need to make sure that, these things happen to avoid, these harmful things that can come up.[00:30:54] Jonathan: Yeah, I don't know. I mean, I didn't work on TurboTax, so I worked on QuickBooks, which is like a accounting software for small businesses. And I was surprised, like we didn't have a lot, like a lot of those robust things, we just relied on user feedback to tell us like, things were not going well. And, you know, and I think we should have, like, I think, I think that that was a missed opportunity, um, to.Show your users that you understand them and you care, and to find those opportunity areas. So we didn't have enough of that. And there was things that we shipped that didn't work correctly right out of the box, which, you know, it happens, but had a negative impact to users. So it's like, okay, well, what do we do about that?How do we fix that? Um, and if the more you formalize that and make it part of your process, the more you get out of it. And actually this is like, this is a good, a good, um, uh, pausing point bit that I think will affect a lot of engineers listening to this. So if you remember in the book, we talk about the Ford Pinto story and there isn't, I want to talk about this story and why I added it to the book.Is that, uh, one, I think this is the thing that engineers deal with the most, um, and, and designers do too, which is that okay. we see the problem, but we don't think it's worth fixing. Okay. Um, so that, that's what I'm going. That's what we're going to dig into here. So it's a, hold on for a second while I explain some, some history about this car.So the Ford Pinto, if you're not familiar is notorious, uh, because it was designed, um, and built and shipped and there, they knowingly had this problem where if it was rear-ended at even like a pretty low speed, it would burst into flames because the gas tank would rupture the, and then oftentimes the, the, the doors would get jammed.And so it became a death trap of fire and caused many deaths, a lot of injuries. And, um, in an interview with the CEO at the time, like almost destroyed Ford like very seriously would have brought the whole company down and during the design of it, uh, and design meaning in the engineering sense. Uh, and the engineering design of it, they say they found this problem and the engineers came up with their best solution.Was this a rubber block. Um, and the cost was, uh, I forget how many dollars let's say it was like $9. let's say $6, but this is again, uh, back then. And also the margin on these cars was very, very, very thin and very important to have the lowest price in the market to win those markets. The customers were very price sensitive, so they, uh, they being like the legal team looked at like some recent, cases where they have the value of life and started to come up with like a here's how many people would sue us and here's how much it would cost to, uh, to, to settle all those.And then here's how much it would cost to add this to all the cars. And it was cheaper for them to just go with the lawsuits and they, they found. Um, and I think why, I think why this is so important is because of the two things that happened afterward, one, they were wrong. it was a lot more people it affected and the lawsuits were for a lot more money.And two after all this was going crazy and it was about to destroy the company, they went back to the drawing board and what did the engineers find? They found a cheaper solution. They were able to rework that, that rubber block and and get it under the margin and be able to hit the mark that they wanted to.And I think that's, there's a lot of focus on the first part because it's so unethical to the value of life and, and, um, and doing that calculation and being like we're willing to have people die, but in some industries, it's really hard to get away with that, but it's also very easy. To get into that.It's very easy to get lulled into this sense of like, oh, we're just going to crunch the numbers and see how many users it affects. And we're okay with that. Um, versus when you have principals and you have kind of a hard line and you, and you care a lot more than you should. And, and you really push yourself to create a more ethical, more, a safer, you know, avoiding, tragic design, then you, there there's a solution out there.Like you actually get to innovation, you actually get to the solving the problem versus when you just rely on, oh, you know, the cost benefit analysis we did is that it's going to take an engineer in a month to fix this and blah blah blah. But if, if you have those values, if you have those principles and you're like, you know what, we're not okay shipping this, then you'll, you'll find that.They're like, okay, there's, there's a cheaper way to, to fix this. There's another way we could address this. And that happens so often. and I know a lot of engineers deal with that. A lot of saying like, oh, you know, this is not worth our time to fix. This is not worth our time to fix. And that's why you need those principles is because oftentimes you don't see it and it's, but it's right there at right outside of the edge of your vision. [00:36:12] Jeremy: Yeah. I mean, with the Pinto example, I'm just picturing, you know, obviously there wasn't JIRA back then, but you can imagine that somebody's having an issue that, Hey, when somebody hits the back of the car, it's going to catch on fire. Um, and, and going like, well, how do I prioritize that? Right? Like, is this a medium ticket?Is this a high ticket? And it's just like, it's just, it just seems insane, right? That you could, make the decision like, oh no, this isn't that big an issue. You know, we can move it down to low priority and, and, and, ship it.Okay. [00:36:45] Jonathan: Yeah. And, and, and that's really what principals do for you, right? Is they help you make the tough decisions. You don't need a principle for an easy one. Uh, and that's why I really encourage people in the book to come together as a team and come up with what are your guiding principles. Um, and that way it's not a discussion point every single time.It's like, Hey, we've agreed that this is something that we, that we're going to care about. This is something that we are going to stop and, fix. Like, one of the things I really like about my team at Google is product excellence is very important to us. and. there are certain things that, uh, we're, you know, we're Okay. with, um, letting slip and fixing at a next iteration.And, you know, obviously we make sure we actually do that. Um, so it's not like we, we, we always address everything, but because it's one of our principles. We care more. We have more, we take on more of those tickets and we take on more of those things and make sure that they ship before, um, can make sure that they're fixed before we ship.And, and it shows like to the end user that th that this company cares and they have quality. Um, so it's one of it. You need a principal to kind of guide you through those difficult things that aren't obvious on a decision to decision basis, but, you know, strategically get you in somewhere important, you know, and, and like, like design debt or, um, our technical debt where it's like, this should be optimized, you know, this chunk of code, like, nah, but you know, in, in it grouping together with a hundred of those decisions.Yeah. It's gonna, it's gonna slow it down every single project from here on out. So that's why you need those principles.[00:38:24] Jeremy: So in the book, uh, there are a few examples of software in healthcare. And when you think about principles, you would think. Generally everybody on the team would be on board that we want to give whatever patient that's involved. We want to give them good care. We want them to be healthy. We don't want them to be harmed.And given that I I'm wondering because you, you interviewed multiple people in the book, you have a few different case studies. Um, why do you think that medical software in particular seems to be, so it seems to have such poor UX or has so many issues.[00:39:08] Jonathan: Yeah, that's a, complicated topic. I would summarize it with a few, maybe three different reasons. Um, one which I think is, uh, maybe a driving factor of, of some of the other ones. Is that the way that the medical, uh, industry works is the person who purchases the software. It's not the end user. So it's not like you have doctors and nurses voting on, on which software to use.Um, and so oftentimes it's, it's more of like a sales deal and then just gets pushed out and they, and they also have to commit to these things like, um, the software is very expensive and, uh, initially with, you know, like in the early days was very much like it needs to be installed, maintain, there has to be training.So there was a lot to money to be made, in those, in that software. And, and so the investment from the hospital was a lot, so they can't just be like, oh, can it be to actually, don't like this one, we're going to switch to the next one. So, because like, once it's sold, it's really easy to just like, keep that customer.There's very little incentive to like really improve it unless you're selling them a new feature. So there's a lot of feature add ons. Because they can charge more for those, but improving the experience and all that kind of stuff. There is less of that. I think also there's just generally a lot less like, uh, understanding of design, in that field.And there's a lot more because there's sort of like traditions of things. they end up putting a lot of the pressure and the, that responsibility on the end individuals. So, you know, you've heard recently of that nurse who made a medication error and she's going to jail for that. And sh you know, And oftentimes we blame that end, that end person.So the, the nurse gets all the blame or the doctor gets all the blame. Well, what about the software, you know, who like made that confusing or, you know, what about the medication that looks exactly like this other medication? Or what about the pump tool that you have to, you know, type everything in very specifically, and the nurses are very busy.They're doing a lot of work. There's a 12 hour shifts. They're dealing with lots of different patients, a lot of changing things for them to have to worry about having to type something a specific way. And yet when those problems happen, what do they do? They don't go in like redesign the devices. Are they more training, more training, more training, more training, and people only can absorb so much training.and so I think that's part of the problem is that like, there's no desire to change. They blame the end, the wrong person, and. Uh, lastly, I think that, um, it is starting to change. I think we're starting to see like the ability for, because of the fact that the government is pushing healthcare records to be more interoperable, meaning like I can take my health records anywhere, that a lot of the power comes in where the data is.And so, um, I'm hoping that, uh, you know, as the government and people and, um, and initiatives push these big companies, like epic to be more open, that things will improve. One is because they'll have to keep up with their competitors and that more competitors will be out there to improve things. Because I, I think that there's, there's the know-how out there, but like, because the there's no incentive to change and, and, and there's no like turnover and systems and there's the blaming of the end user.We're not going to see a change anytime soon.[00:42:35] Jeremy: that's a, that's a good point in terms of like, it, it seems like even though you have all these people who may have good ideas may want to do a startup, uh, if you've got all these hospitals that already locked into this very expensive system, then yeah. Where's, where's the room to kind of get in there in and have that change.[00:42:54] Jonathan: yeah. [00:42:56] Jeremy: Uh, another thing that you talk about in the book is about how, when you're in a crisis situation, the way that a user interacts with something is, is very different. And I wonder if you have any specific examples for software when, when that can happen.[00:43:15] Jonathan: yeah. Designing for crisis is a very important part of every software because, it might be hard for you to imagine being in that situation, but, it, it definitely will still happen so. one example that comes to mind is, uh, you know, let's say you're working on a cloud, um, software, like, uh, AWS or Google cloud.Right. there's definitely use cases and user journeys in your product where somebody would be very panicked. Right. Um, and if you've ever been on an on-call with, with something and it goes south, and it's a big deal, you don't think. Right. Right. Like when we're in crisis, our brains go into a totally different mode of like that fight or flight mode.And we don't think the way we do, it's really hard to read and comprehend very hard. and we might not make this, the right decisions and things like that. So, you know, thinking about that, like maybe your, your let's say, like, going back to that, the cloud software, like let's say you're, you're, you're working on that, like.Are you relying on the user reading a bunch of texts about this button, or is it very clear from the way you've crafted that exact button copy and how big it is? And, and it's where it is relation to a bunch of other content? Like what exactly it does. It's going to shut down the instance where it's gonna, you know, it's, it's gonna, do it at a delay or whatever, like be able to all those little decisions, like are really impactful.And when you, when you run them through the, um, the, the furnace of, of, of, uh, um, a user journey that's relying on, on a really urgent situation, you'll obviously help that. And you'll, you'll start to see problems in your UI that you hadn't noticed before, or, or different problems in the way you're implementing things that you didn't notice before, because you're seeing it from a different way.And that's one of the great things about, um, the, the systems and the book that we talk about around, like, thinking about how things could go wrong, or, you know, thinking about, you know, designing for crisis. Is it makes you think of some new use cases, which makes you think of some new ways to improve your product.You know, that improvement you make to make it so obvious that someone could do it in a crisis would help everyone, even when they're not in a crisis. Um, so that, that's why it's important to, to focus on those things. [00:45:30] Jeremy: And for someone who is working on these products, it's kind of hard to trigger that feeling of crisis. If there isn't actually a crisis happening. So I wonder if you can talk a little bit about how you, you try to design for that when it's not really happening to you. You're just trying to imagine what it would feel like.[00:45:53] Jonathan: yeah. Um, you're never really going to be able to do that. Like, so some of it has to be simulated, One of the ways that we are able to sort of simulate what we call cognitive load. Which is one of the things that happen during a crisis. But what also happened when someone's very distracted, they might be using your product while they're multitasking.We have a bunch of kids, a toddler constantly pulling on their arm and they're trying to get something done in your app. So, one of the ways that has been shown to help, uh, test that is, um, like the foot tapping method. So when you're doing user research, you have the user doing something else, like tapping or like, You know, uh, make it sound like they have a second task that they're doing on the side.It's manageable, like tapping their feet and their, their hands or something. And then they also have to do your task. Um, so like you can like build up what those tabs with those extra things are that they have to do while they're also working on, uh, finishing the task you've given them. and, and that's one way to sort of simulate cognitive load.some of the other things is, is really just, um, you know, listening to users, stories and, and find out, okay, this user was in crisis. Okay, great. Let's talk to them and interview them about that. Uh, if it was fairly recently within like the past six months or something like that. but, but sometimes you don't like, you just have to run through it and do your best.Um, and you know, those black Swan events or those, even if you're able to simulate it yourself, like put your, put your, put yourself into that exact position and be in panic, which, you know, you're not able to, but if you were that still would only be your experience and you wouldn't know all the different ways that people could experience this.So, and there's going to be some point in time where you're gonna need to extrapolate a little bit and, you know, extrapolate from what you know, to be true, but also from user testing and things like that. And, um, and then wait for a real data [00:47:48] Jeremy: You have a chapter in the book on design that angers and there were, there were a lot of examples in there, on, on things that are just annoying or, you know, make you upset while you're using software. I wonder for like our audience, if you could share just like a few of your, your favorites or your ones that really stand out.[00:48:08] Jonathan: My favorite one is Clippy because, um, you know, I remember growing up, uh, you know, writing software, writing, writing documents, and Clippy popping up. And, I was reading an article about it and obviously just like everybody else, I hated it. You know, as a little character, it was fun, but like when you're actually trying to get some work done, it was very annoying.And then I remember, uh, a while later reading this article about how much work the teams put into clubby. Like, I mean, if you think about it now, It had a lot of like, um, so the AI that we're playing with just now, um, around like natural language processing, understanding, like what, what type of thing you're writing and coming up with contextualized responses, like it was pretty advanced for the, uh, very advanced for the time, you know, uh, adding animation triggers to that and all, all that.Um, and they had done a lot of user research. I was like, what you did research in, like you had that reaction. And I love that example because, oh, and also by the way, I love how they, uh, took Clippy out and S and highlighted that as like one of the features of the next version of the office, uh, software.but I love that example again, because I see myself in that and, you know, you ha you have a team doing something technologically amazing doing user research, uh, and putting out a very great product, but he totally missing. And a lot of products do that. A lot of teams do that. And why is that? It's because they're, um, they're not thinking about, uh, they're putting their, they're putting the business needs or the team's needs first and they're putting the user's needs second.And whenever we do that, whenever we put ourselves first, we become a jerk, right? Like if you're in a relationship and you're always putting yourself first, that relationship is not going to last long or it's not going to go very well. And yet we Do that with our relationship with users where we're constantly just like, Hey, well, what is the business?The business wants users to not cancel here so let's make it very difficult for people to cancel. And that's a great way to lose customers. That's a great way to create, this dissonance with your, with your users. And, um, and so if you, if you're, focused on like, this is what the we need to accomplish with the users, and then you work backwards from.You're you're, you're, you're, you're lower your chances of missing it, of getting it wrong of angering your users. and const always think about like, you sometimes have to be very real with yourselves and your team. And I think that's really hard for a lot of teams because we have we don't want to look bad.We don't want to, but what I found is those are the people who actually, um, get promoted. Like, you know, if you look at the managers and directors and stuff, those are the people who can be brutally honest. Right. Um, who can say, like, I don't think this is ready. I don't, I don't think this is good. And so you actually, I, I, you know, I've done that in the front of like our CEO and things like that.And I've always had really good responses from them to say, like, we really appreciate that you, you know, uh, you can call that out and you can just call it like, it is like, Hey, this is what we see this user. Maybe we shouldn't do this at all. Maybe. Um, and that can, uh, you know, at Google that's one of the criteria that we have in our software engineers and the designers of being able to spot things that are, you know, things that we shouldn't should stop doing.Um, and so I think that's really important for the development of, of a senior engineer, uh, to be able to, to know that that's something like, Hey, this project, I would want it to work, but in its current form is not good. And being able to call that out is very important.[00:51:55] Jeremy: Do you have any specific examples where there was something that was like very obvious to you? To the rest of the team or to a lot of other people that wasn't.[00:52:06] Jonathan: um, yeah, so here's an example I finally got, I was early on in my career and I finally got to lead in our whole project. So we are redesigning our business micro-site um, and I got to, I got, uh, assigned two engineers and another designer and I got to lead the whole. I was, I was like, this is my chance.Right? So, and we had a very short timeline as well, and I put together all these designs. And, um, one of the things that we aligned on at the time was like as really cool, uh, so I put together this really cool design for the contact form, where you have like, essentially, I kind of like ad-lib, it looks like a letter.and you know, by the way, give me a little bit of, of, uh, of, of leeway here. Cause this was like 10 years ago, but, uh, it was like a letter and you would say like, you're addressing it to our company. And so it had all the things we wanted to get out of you around like your company size, your team, like, and so our sales team would then reach out to this customer.I designed it and I had shown it to the team and everybody loved it. Like my manager signed off on it. Like all the engineers signed off on it, even though we had a short timeline, they're like, yeah, well we don't care. That's so cool. We're going to build it. But as I put it through that test of like, does this make sense for the, what the user wants answers just kept saying no to me.So I had to go and back in and pitch everybody and argue with them around not doing the cool idea that I wanted to do. And, um, eventually they came around and that form performed once we launched it performed really well. And I think about like, what if users had to go through this really wonky thing?Like this is the whole point of the website is to get this contact form. It should be as easy and as straightforward as possible. So I'm really glad we did that. And I can think of many, many more of those situations where, you know, um, we had to be brutally honest with ourselves with like this isn't where it needs to be, or this isn't what we should be doing.And we can avoid a lot of harm that way too, where it's like, you know, I don't, I don't think this is what we should be building. Right.[00:54:17] Jeremy: So in the case of this form, was it more like you, you had a bunch of drop-downs or S you know, selections where you would say like, okay, these are the types of information that I want to get from the person filling out the form as a company. but you weren't looking so much at, as the person filling out the form, this is going to be really annoying.Was that kind [00:54:38] Jonathan: exactly, exactly. Like, so their experience would have been like, they come up, they come at the end of this page or on like contact us and it's like a letter to our company. And like, we're essentially putting words in their mouth because they're, they're filling out the, letter. Um, and then, yeah, it's like, you know, you have to like read and then understand like what, what that part of this, the, the page was asking you and, you know, versus like a form where you're, you know, it's very easy.Well-known bam. You're, you're you're on this page. So you're interested in, so like, get it, get them in there. So we were able to, to decide against that and that, you know, we, we also had to, um, say no to a few other things, but like we said yes, to some things that were great, like responsive design, um, making sure that our website worked at every single use case, which is not like a hard requirement at the time, but was really important to us and ended up helping us a lot because we had a lot of, you know, business people who are on their phone, on the go, who wanted to, to check in and fill out the form and do a bunch of other stuff and learn about us.So that, that, that sales, uh, micro-site did really well because I think we made the right decisions and all those kinds of areas. And like those, those general, those principles helped us say no to the right things, even though it was a really cool thing, it probably would have looked really great in my portfolio for a while, but it just wasn't the right thing to do for the, the, the goal that we had.[00:56:00] Jeremy: So did it end up being more like just a text box? You know, a contact us fill in. Yeah.[00:56:06] Jonathan: You know, with usability, you know, if someone's familiar with something and it's, it's tired, everybody does it, but that means everybody knows how to use it. So usability constantly has that problem of innovation being less usable. Um, and so sometimes it's worth the trade-off because you want to attract people because of the innovation and they'll bill get over that hump with you because the innovation is interesting.So sometimes it's worth it and sometimes it's not, and you really have to, I'd say most times it's not. Um, and So you have to find like, what is, when is it time to innovate and when is it time to do the what's tried and true. Um, and on a business microsite, I think it's time to do tried and true. [00:56:51] Jeremy: So in your research for the book and all the jobs you've worked previously, are there certain. Mistakes or just UX things that you've noticed that you think that our audience should know about?[00:57:08] Jonathan: I think dark patterns are one of the most common, you know, tragic design mistakes that we see, because again, you're putting the company first and the user second. And you know, if you go to a trash, sorry, if you go to a dark patterns.org, you can see a great list. Um, there's a few other sites that have a nice list of them and actually Vox media did a nice video about, uh, dark patterns as well.So it's gaining a lot of traction, but you know, things like if you try to cancel your search, like Comcast service or your Amazon service, it's very hard. Like I think I wrote this in the book, but. Literally re researched what's the fastest way to delete it to, to, you know, uh, remove your Comcast account.I prepared everything. I did it through chat because that was the fastest way for first, not to mention finding chat by the way was very, very hard for me. Um, so I took me, even though I was like, okay, I have to find I'm going to do it through chat. I'm gonna do all this. It took me a while to find like chat, which I couldn't find it.So once I finally found it from that point to deleting from having them finally delete my account was about an hour. And I knew what to do going in just to say all the things to just have them not bother me. So th that's on purpose they've purposely. Cause it's easier to just say like fine, I'll take the discount thing.You're throwing in my face at the last second. And it's almost become a joke now that like, you know, you have to cancel your Comcast every year, so you can keep the costs down. Um, you know, and Amazon too, like trying to find that, you know, delete my account is like so buried. You know, they do that on purpose and a lot of companies will do things like, you know, make it very easy to sign up for a free trial and, and hide the fact that they're going to charge you for a year high.The fact that they're automatically going to bill you not remind you when it's about to expire so that they can like surprise, get you in to forget about this billing subscription or like, you know, if you've ever gotten Adobe software, um, they are really bad at that. They, they trick you into like getting this like monthly sufficient, but actually you've committed to a year.And if you want to cancel early, we'll charge you like 80% of the year. And, uh, and there's a really hard to contact anybody about it. So, um, it happens quite often. If the more you read into those, um, different things, uh, different patterns, you'll start to see them everywhere. And users are really catching onto a lot of those things and are responding.To those in a very negative way. And like, um, we recently, uh, looked at a case study where, you know, this free trial, um, this company had a free trial and they had like the standard free trial, um, uh, kind of design. And then their test was really just focusing on like, Hey, we're not going to scam you. If I had to summarize that the entire direction of the second one, it was like, you know, cancel any time.Here's exactly how much you'll be charged. And on the, it'll be on this date, uh, at five days before that we'll remind you to cancel and all this stuff, um, that ended up performing about 30% better than the other one. And the reason is that people are now burned by that trick so much so that every time they see a free trial, they're like, forget it.I don't, I don't want to deal with all this trickery. Like, oh, I didn't even care about to try the product versus like. We were not going to trick you. We really want you to actually try the product and, you know, we'll make sure that if you're not wanting to move forward with this, that you have plenty of time and plenty of chances to lead and that people respond to that now.So that's what we talked about earlier in the show of doing the exact opposite. This is another example of that. [01:00:51] Jeremy: Yeah, because I think a lot of people are familiar with, like you said, trying to cancel Comcast or trying to cancel their, their New York times subscription. And they, you know, everybody is just like, they get so mad at the process, but I think they also may be assume that it's a positive for the company, but what you're saying is that maybe, maybe that's actually not in the company's best interest.[01:01:15] Jonathan: Yeah. Oftentimes what we find with these like dark patterns or these unethical decisions is that th they are successful because, um, when you look at the most impactful, like immediate metric, you can look at, it looks like it worked right. Like, um, you know, let's say for that, those free trials, it's like, okay, we implemented like all this trickery and our subscriptions went up.But if you look at like the end, uh, result, um, which is like farther on in the process, it's always a lot harder to track that impact. But we all know, like when we look at each other, like when we, uh, we, we, we talk to each other about these different, um, examples. Like we know it to be true, that we all hate that.And we all hate those companies and we don't want to engage with them. And we don't, sometimes we don't use the products at all. So, um, yeah, it, it, it's, it's one of those things where it actually has like that, very real impact, but harder to track. Um, and so oftentimes that's how these, these patterns become very pervasive is the oh, and page views went up, uh, this was, this was a really, you know, this is high engagement, but it was page views because people were refreshing the page trying to figure out where the heck to go. Right. So um, oftentimes they they're less effective, but they're easier to track[01:02:32] Jeremy: So I think that's, that's a good place to, to wrap things up, but, um, if people want to check out the book or learn more about what you're working on your podcast, where should they head?[01:02:44] Jonathan: Um, yeah, just, uh, check out tragic design.com and our podcast. You can find on any of your podcasting software, just search design review podcast. [01:02:55] Jeremy: Jonathan, thank you so much for joining me on software engineering radio.[01:02:59] Jonathan: alright, thanks Jeremy. Thanks everyone. And, um, hope you had a good time. I did.

ceo new york amazon ai google design mistakes elon musk uber product seo sr designing fda saas adobe ux tragic generally vox swan ui tvs aws ml comcast github pinto t mobile qa user experience versus intuit quickbooks turbotax jira clippy dark patterns ford pinto shariat interaction designer software engineering radio jeremy you jonathan yeah jonathan that jeremy so

301 Jeremy Enns - Going Narrow & Being Uncomfortably Specific

Podcast Junkies

Play Episode Listen Later Sep 6, 2022 83:11

Episode SummaryJeremy Enns is the Founder of Podcast Marketing Academy, where he helps scrappy brands and Creators hit their podcast growth milestones with a step-by-step playbook. Through his extensive experience in audio, Jeremy understands that podcasting is one of the best ways to consistently generate leads, make sales, and elevate your profile. Today, Harry and Jeremy discuss Jeremy's passion for writing and creating content, what separates a consistent show with a growing, successful one and the value of pursuing a niche and narrow audience. Episode SponsorFocusrite –http://pcjk.es/focusrite ( )http://pcjk.es/vocaster (http://pcjk.es/vocaster) FullCast –https://fullcast.co/ ( https://fullcast.co/) Key Takeaways06:25 – Founder of Podcast Marketing Academy, Jeremy Enns joins the show to share his plans for moving to Portugal as well as his passion for audio, music and photography 13:13 – The ‘Wildman' 23:01 – Content creation, Jeremy's podcasting origin story and overcoming Imposter Syndrome as an entrepreneur 34:14 – Jeremy reflects on returning to his love of writing 41:04 – How Jeremy plans and maps out his content 49:17 – The decision to begin curating courses 54:16 – What separates a consistent show with a growing, successful show 1:03:05 – The value of going narrower and pursuing niche audiences 1:13:51 – Something Jeremy has changed his mind about recently 1:19:07 – Harry thanks Jeremy for joining the show and lets listeners know where they can go to connect with him and learn more about Podcast Marketing Academy Tweetable Quotes“I think I read some stat recently that per capita more Canadians listen to podcasts than Americans do now.” (09:58) (Jeremy) “It's interesting when I look back at especially my lack of a corporate career history, I'm very aware of both the pros and cons. A lot of people I've talked to get conditioned where it's very hard to go off on your own path because you've just worked so long in a structure. I didn't have any of that baggage, but one of the interesting things that I've always thought is that everyone has some sort of Imposter Syndrome. But I don't know that you can get to this without going through it.” (27:31) (Jeremy) “You hear a lot of people who write regularly talk about how writing really clarifies your own thinking. That, for me, was the case where I thought, ‘At this point, I don't even care if anybody ever reads anything I write. It's so valuable to me personally just to keep writing because it was structuring my thoughts.'” (35:37) (Jeremy) “I tend to skew to the philosophical, higher-level strategy side of things than tactical side. But, I've also recognized that it's a lot harder to market and sell vague, philosophical things even if there's value to it and people like it.” (43:56) (Jeremy) “I think a lot of times it actually comes back to how specific you are about the person that you're actually trying to reach. Everybody, I'm sure who's listened to hundreds of episodes of your shows, has heard before that if you create a show for everybody, it's for nobody. We don't want to believe that, but it's the truth.” (55:32) (Jeremy) “For many audiences and many niches, the narrower you go, the more you can potentially charge sponsors. And, people don't really think of it that way.” (1:05:48) (Jeremy) Resources MentionedJeremy's Website & Links – https://counterweightcreative.co/junkies/ (https://counterweightcreative.co/junkies/) Podcast Marketing Academy – https://counterweightcreative.co/podcast-marketing-academy/ (https://counterweightcreative.co/podcast-marketing-academy/) Podcast Marketing Academy Twitter – https://twitter.com/PodGrowthSchool (https://twitter.com/PodGrowthSchool) Jeremy's LinkedIn – https://www.linkedin.com/in/jeremy-enns/ (https://www.linkedin.com/in/jeremy-enns/) Jeremy's Twitter – https://twitter.com/iamjeremyenns (https://twitter.com/iamjeremyenns) Seth Godin's Blog – https://seths.blog/ (https://seths.blog/)...

founders americans canadian blog portugal imposter syndrome creators narrow seth godin wildman fullcast uncomfortably jeremy enns podcast marketing academy jeremy you jeremy it

Ant Wilson on Supabase

The Leadership Stack Podcast

Play Episode Listen Later May 11, 2022 59:08

This episode originally aired on Software Engineering Radio.A few topics covered Building on top of open source Forking their GoTrue dependency Relying on Postgres features like row level security Adding realtime support based on Postgres's write ahead log Generating an API layer based on the database schema with PostgREST Creating separate EC2 instances for each customer's database How Postgres could scale in the future Monitoring postgres Common support tickets Permissive open source licenses Related Links @antwilson Supabase Supabase GitHub Firebase Airtable PostgREST GoTrue Elixir Prometheus VictoriaMetrics Logflare BigQuery Netlify Y Combinator Postgres PostgreSQL Write-Ahead Logging Row Security Policies pg_stat_statements pgAdmin PostGIS Amazon Aurora Transcript You can help edit this transcript on GitHub. [00:00:00] Jeremy: Today I'm talking to Ant Wilson, he's the co-founder and CTO of Supabase. Ant welcome to software engineering radio.[00:00:07] Ant: Thanks so much. Great to be here. [00:00:09] Jeremy: When I hear about Supabase, I always hear about it in relation to two other products. The first is Postgres, which is a open source relational database. And second is Firebase, which is a backend as a service product from Google cloud that provides a no SQL data store.It provides authentication and authorization. It has a functions as a service component. It's really meant to be a replacement for you needing to have your own server, create your own backend. You can have that all be done from Firebase. I think a good place for us to start would be walking us through what supabase is and how it relates to those two products.[00:00:55] Ant: Yeah. So, so we brand ourselves as the open source Firebase alternativethat came primarily from the fact that we ourselves do use the, as the alternative to Firebase. So, so my co-founder Paul in his previous startup was using fire store. And as they started to scale, they hit certain limitations, technical scaling limitations and he'd always been a huge Postgres fan.So we swapped it out for Postgres and then just started plugging in. The bits that we're missing, like the real-time streams. Um, He used the tool called PostgREST with a T for the, for the CRUD APIs. And sohe just built like the open source Firebase alternative on Postgres, And that's kind of where the tagline came from.But the main difference obviously is that it's relational database and not a no SQL database which means that it's not actually a drop-in replacement. But it does mean that it kind of opens the door to a lot more functionality actually. Um, Which, which is hopefully an advantage for us. [00:02:03] Jeremy: it's a, a hosted form of Postgres. So you mentioned that Firebase is, is different. It's uh NoSQL. People are putting in their, their JSON objects and things like that. So when people are working with Supabase is the experience of, is it just, I'm connecting to a Postgres database I'm writing SQL.And in that regard, it's kind of not really similar to Firebase at all. Is that, is that kind of right?[00:02:31] Ant: Yeah, I mean, the other thing, the other important thing to notice that you can communicate with Supabase directly from the client, which is what people love about fire base. You just like put the credentials on the client and you write some security rules, and then you just start sending your data. Obviously with supabase, you do need to create your schema because it's relational.But apart from that, the experience of client side development is very much the same or very similar the interface, obviously the API is a little bit different. But, but it's similar in that regard. But I, I think, like I said, we're moving, we are just a database company actually. And the tagline, just explained really, well, kind of the concept of, of what it is like a backend as a service. It has the real-time streams. It has the auth layer. It has the also generated APIs. So I don't know how long we'll stick with the tagline. I think we'll probably outgrow it at some point. Um, But it does do a good job of communicating roughly what the service is.[00:03:39] Jeremy: So when we talk about it being similar to Firebase, the part that's similar to fire base is that you could be a person building the front end part of the website, and you don't need to necessarily have a backend application because all of that could talk to supabase and supabase can handle the authentication, the real-time notifications all those sorts of things, similar to Firebase, where we're basically you only need to write the front end part, and then you have to know how to, to set up super base in this case.[00:04:14] Ant: Yeah, exactly. And some of the other, like we took w we love fire based, by the way. We're not building an alternative to try and destroy it. It's kind of like, we're just building the SQL alternative and we take a lot of inspiration from it. And the other thing we love is that you can administer your database from the browser.So you go into Firebase and you have the, you can see the object tree, and when you're in development, you can edit some of the documents in real time. And, and so we took that experience and effectively built like a spreadsheet view inside of our dashboard. And also obviously have a SQL editor in there as well.And trying to, create this, this like a similar developer experience, because that's where Firebase just excels is. The DX is incredible. And so we, we take a lot of inspiration from it in, in those respects.[00:05:08] Jeremy: and to to make it clear to our listeners as well. When you talk about this interface, that's kind of like a spreadsheet and things like that. I suppose it's similar to somebody opening up pgAdmin, I suppose, and going in and editing the rows. But, but maybe you've got like another layer on top that just makes it a little more user-friendly a little bit more like something you would get from Firebase, I guess.[00:05:33] Ant: Yeah.And, you know, we, we take a lot of inspiration from pgAdmin. PG admin is also open source. So I think we we've contributed a few things and, or trying to upstream a few things into PG admin. The other thing that we took a lot of inspiration from for the table editor, what we call it is airtable.And because airtable is effectively. a a relational database and that you can just come in and, you know, click to add your columns, click to add a new table. And so we just want to reproduce that experience again, backed up by a full Postgres dedicated database. [00:06:13] Jeremy: so when you're working with a Postgres database, normally you need some kind of layer in front of it, right? That the person can't open up their website and connect directly to Postgres from their browser. And you mentioned PostgREST before. I wonder if you could explain a little bit about what that is and how it works.[00:06:34] Ant: Yeah, definitely. so yeah, PostgREST has been around for a while. Um, It's basically an, a server that you connect to, to your Postgres database and it introspects your schemas and generates an API for you based on the table names, the column names. And then you can basically then communicate with your Postgres database via this restful API.So you can do pretty much, most of the filtering operations that you can do in SQL um, uh, equality filters. You can even do full text search over the API. So it just means that whenever you obviously add a new table or a new schema or a new column the API just updates instantly. So you, you don't have to worry about writing that, that middle layer which is, was always the drag right.When, what have you started a new project. It's like, okay, I've got my schema, I've got my client. Now I have to do all the connecting code in the middle of which is kind of, yeah, no, no developers should need to write that layer in 2022.[00:07:46] Jeremy: so this the layer you're referring to, when I think of a traditional. Web application. I think of having to write routes controllers and, and create this, this sort of structure where I know all the tables in my database, but the controllers I create may not map one to one with those tables. And so you mentioned a little bit about how PostgREST looks at the schema and starts to build an API automatically.And I wonder if you could explain a little bit about how it does those mappings or if you're writing those yourself. [00:08:21] Ant: Yeah, it basically does them automatically by default, it will, you know, map every table, every column. When you want to start restricting things. Well, there's two, there's two parts to this. There's one thing which I'm sure we'll get into, which is how is this secure since you are communicating direct from the client.But the other part is what you mentioned giving like a reduced view of a particular date, bit of data. And for that, we just use Postgres views. So you define a view which might be, you know it might have joins across a couple of different tables or it might just be a limited set of columns on one of your tables. And then you can choose to just expose that view. [00:09:05] Jeremy: so it sounds like when you would typically create a controller and create a route. Instead you create a view within your Postgres database and then PostgREST can take that view and create an end point for it, map it to that.[00:09:21] Ant: Yeah, exactly (laughs) . [00:09:24] Jeremy: And, and PostgREST is an open source project. Right. I wonder if you could talk a little bit about sort of what its its history was. How did you come to choose it? [00:09:37] Ant: Yeah.I think, I think Paul probably read about it on hacker news at some point. Anytime it appears on hacker news, it just gets voted to the front page because it's, it's So awesome. And we, we got connected to the maintainer, Steve Chavez. At some point I think he just took an interest in, or we took an interest in Postgres and we kind of got acquainted.And then we found out that, you know, Steve was open to work and this kind of like probably shaped a lot of the way we think about building out supabase as a project and as a company in that we then decided to employ Steve full time, but just to work on PostgREST because it's obviously a huge benefit for us.We're very reliant on it. We want it to succeed because it helps our business. And then as we started to add the other components, we decided that we would then always look for existing tools, existing opensource projects that exist before we decided to build something from scratch. So as we're starting to try and replicate the features of Firebase we would and auth is a great example.We did a full audit of what are all the authorization, authentication, authentication open-source tools that are out there and which one was, if any, would fit best. And we found, and Netlify had built a library called gotrue written in go, which did pretty much exactly what we needed. So we just adopted that.And now obviously, you know, we, we just have a lot of people on the team contributing to, to gotrue as well.[00:11:17] Jeremy: you touched on this a little bit earlier. Normally when you connect to a Postgres database your user has permission to, to basically everything I guess, by default, anyways. And so. So, how does that work? Where when you want to restrict people's permissions, make sure they only get to see records they're allowed to see how has that all configured in PostgREST and what's happening behind the scenes?[00:11:44] Ant: Yeah, we, the great thing about Postgres is it's got this concept of row level security, which actually, I don't think I even rarely looked at until we were building out this auth feature where the security rules live in your database as SQL. So you do like a create policy query, and you say anytime someone tries to select or insert or update apply this policy.And then how it all fits together is our auth server go true. Someone will basically make a request to sign in or sign up with email and password, and we create that user inside the, database. They get issued a URL. And they get issued a JSON, web token, a JWT, and which, you know, when they, when they have it on the, client side, proves that they are this, you, you ID, they have access to this data.Then when they make a request via PostgREST, they send the JWT in the authorization header. Then Postgres will pull out that JWT check the sub claim, which is the UID and compare it to any rows in the database, according to the policy that you wrote. So, so the most basic one is you say in order to, to access this row, it must have a column you UID and it must match whatever is in the JWT.So we basically push the authorization down into the database which actually has, you know, a lot of other benefits in that as you write new clients, You don't need to have, have it live, you know, on an API layer on the client. It's kind of just, everything is managed from the database.[00:13:33] Jeremy: So the, the, you, you ID, you mentioned that represents the user, correct. [00:13:39] Ant: Yeah. [00:13:41] Jeremy: Is that, does that map to a user in post graphs or is there some other way that you're mapping those permissions?[00:13:50] Ant: Yeah. When, so when you connect go true, which is the auth server to your Postgres database for the first time, it installs its own schema. So you'll have an auth schema and inside will be all start users with a list of the users. It'll have a uh, auth dot tokens which will store all the access tokens that it's issued.So, and one of the columns on the auth start user's table will be UUID, and then whenever you write application specific schemers, you can just join a, do a foreign key relation to the author users table. So, so it all gets into schema design and and hopefully we do a good job of having some good education content in the docs as well.Because one of the things we struggled with from the start was how much do we abstract away from SQL away from Postgres and how much do we educate? And we actually landed on the educate sides because I mean, once you start learning about Postgres, it becomes kind of a superpower for you as a developer.So we'd much rather. Have people discover us because we're a firebase alternatives frontend devs then we help them with things like schema design landing about row level security. Because ultimately like every, if you try and abstract that stuff it gets kind of crappy. And maybe not such a great experience. [00:15:20] Jeremy: to make sure I understand correctly. So you have GoTrue, which is uh, a Netlify open-source project that GoTrue project creates some tables in your, your database that has like, you've mentioned the tokens, the, the different users. Somebody makes a request to GoTrue. Like here's my username, my password go true.Gives them back a JWT. And then from your front end, you send that JWT to the PostgREST endpoint. And from that JWT, it's able to know which user you are and then uses postgres' built in a row level security to figure out which rows you're, you're allowed to bring back. Did I, did I get that right?[00:16:07] Ant: That is pretty much exactly how it works. And it's impressive that you garnered that without looking at a single diagram (laughs) But yeah, and, and, and obviously we, we provide a client library supabase JS, which actually does a lot of this work for you. So you don't need to manually attach the JJ JWT in a header.If you've authenticated with supabase JS, then every request sent to PostgREST. After that point, the header will just be attached automatically, and you'll be in a session as that user. [00:16:43] Jeremy: and, and the users that we're talking about when we talk about Postgres' row level security. Are those actual users in PostgreSQL. Like if I was to log in with psql, I could actually log in with those users.[00:17:00] Ant: They're not, you could potentially structure it that way. But it would be more advanced it's it's basically just users in, in the auth.users table, the way, the way it's currently done. [00:17:12] Jeremy: I see and postgrest has the, that row level security is able to work with that table. You, you don't need to have actual Postgres users.[00:17:23] Ant: Exactly. And, and it's, it's basically turing complete. I mean, you can write extremely complex auth policies. You can say, you know, only give access to this particular admin group on a Thursday afternoon between six and 8:00 PM. You can get really, yeah. really as fancy as you want. [00:17:44] Jeremy: Is that all written in SQL or are there other languages they allow you to use?[00:17:50] Ant: Yeah. It's the default is plain SQL. Within Postgres itself, you can useI think you can use, like there's a Python extension. There's a JavaScript extension, which is a, I think it's a subsets of, of JavaScripts. I mean, this is the thing with Postgres, it's super extensible and people have probably got all kinds of interpreters.So you, yeah, you can use whatever you want, but the typical user will just use SQL. [00:18:17] Jeremy: interesting. And that applies to logic in general, I suppose, where if you were writing a rails application, you might write Ruby. Um, If you're writing a node application, you write JavaScript, but you're, you're saying in a lot of cases with PostgREST, you're actually able to do what you want to do, whether that's serialization or mapping objects, do that all through SQL.[00:18:44] Ant: Yeah, exactly, exactly. And then obviously like there's a lot of awesome other stuff that Postgres has like this postGIS, which if you're doing geo, if you've got like a geo application, it'll load it up with a geo types for you, which you can just use. If you're doing like encryption and decryption, we just added PG libsodium, which is a new and awesome cryptography extension.And so you can use all of these, these all add like functions, like SQL functions which you can kind of use in, in any parts of the logic or in the role level policies. Yeah.[00:19:22] Jeremy: and something I thought was a little unique about PostgREST is that I believe it's written in Haskell. Is that right?[00:19:29] Ant: Yeah, exactly. And it makes it fairly inaccessible to me as a result. But the good thing is it's got a thriving community of its own and, you know, people who on there's people who contribute probably because it's written in haskell. And it's, it's just a really awesome project and it's an excuse to, to contribute to it.But yeah. I, I think I did probably the intro course, like many people and beyond that, it's just, yeah, kind of inaccessible to me. [00:19:59] Jeremy: yeah, I suppose that's the trade-off right. Is you have a, a really passionate community about like people who really want to use Haskell and then you've got the, the, I guess the group like yourselves that looks at it and goes, oh, I don't, I don't know about this.[00:20:13] Ant: I would, I would love to have the time to, to invest in uh, but not practical right now. [00:20:21] Jeremy: You talked a little bit about the GoTrue project from Netlify. I think I saw on one of your blog posts that you actually forked it. Can you sort of explain the reasoning behind doing that?[00:20:34] Ant: Yeah, initially it was because we were trying to move extremely fast. So, so we did Y Combinator in 2020. And when you do Y Combinator, you get like a part, a group partner, they call it one of the, the partners from YC and they add a huge amount of external pressure to move very quickly. And, and our biggest feature that we are working on in that period was auth.And we just kept getting the question of like, when are you going to ship auth? You know, and every single week we'd be like, we're working on it, we're working on it. And um, and one of the ways we could do it was we just had to iterate extremely quickly and we didn't rarely have the time to, to upstream things correctly.And actually like the way we use it in our stack is slightly differently. They connected to MySQL, we connected to Postgres. So we had to make some structural changes to do that. And the dream would be now that we, we spend some time upstream and a lot of the changes. And hopefully we do get around to that.But the, yeah, the pace at which we've had to move over the last uh, year and a half has been kind of scary and, and that's the main reason, but you know, hopefully now we're a little bit more established. We can hire some more people to, to just focus on, go true and, and bringing the two folks back together. [00:22:01] Jeremy: it's just a matter of, like you said speed, I suppose, because the PostgREST you, you chose to continue working off of the existing open source project, right? [00:22:15] Ant: Yeah, exactly. Exactly. And I think the other thing is it's not a major part of Netlify's business, as I understand it. I think if it was and if both companies had more resource behind it, it would make sense to obviously focus on on the single codebase but I think both companies don't contribute as much resource as as we would like to, but um, but it's, it's for me, it's, it's one of my favorite parts of the stack to work on because it's written in go and I kind of enjoy how that it all fits together.So Yeah. I, I like to dive in there. [00:22:55] Jeremy: w w what about go, or what about how it's structured? Do you particularly enjoy about the, that part of the project?[00:23:02] Ant: I think it's so I actually learned learned go through, gotrue and I'm, I have like a Python and C plus plus background And I hate the fact that I don't get to use Python and C plus posts rarely in my day to day job. It's obviously a lot of type script. And then when we inherited this code base, it was kind of, as I was picking it up I, it just reminded me a lot of, you know, a lot of the things I loved about Python and C plus plus, and, and the tooling around it as well. I just found to be exceptional. So, you know, you just do like a small amounts of conflig. Uh config, And it makes it very difficult to, to write bad code, if that makes sense.So the compiler will just, boot you back if you try and do something silly which isn't necessarily the case with, with JavaScript. I think TypeScript is a little bit better now, but Yeah, I just, it just reminded me a lot of my Python and C days.[00:24:01] Jeremy: Yeah, I'm not too familiar with go, but my understanding is that there's, there's a formatter that's a part of the language, so there's kind of a consistency there. And then the language itself tries to get people to, to build things in the same way, or maybe have simpler ways of building things. Um, I don't, I don't know.Maybe that's part of the appeal.[00:24:25] Ant: Yeah, exactly. And the package manager as well is great. It just does a lot of the importing automatically. and makes sure like all the declarations at the top are formatted correctly and, and are definitely there. So Yeah. just all of that tool chain is just really easy to pick up.[00:24:46] Jeremy: Yeah. And I, and I think compiled languages as well, when you have the static type checking. By the compiler, you know, not having things blow up and run time. That's, that's just such a big relief, at least for me in a lot of cases,[00:25:00] Ant: And I just loved the dopamine hits of when you compile something on it actually compiles this. I lose that with, with working with JavaScript. [00:25:11] Jeremy: for sure. One of the topics you mentioned earlier was how super base provides real-time database updates. And which is something that as far as I know is not natively a part of Postgres. So I wonder if you could explain a little bit about how that works and how that came about.[00:25:31] Ant: Yeah. So, So Postgres, when you add replication databases the way it does is it writes everything to this thing called the write ahead log, which is basically all the changes that uh, have, are going to be applied to, to the database. And when you connect to like a replication database. It basically streams that log across.And that's how the replica knows what, what changes to, to add. So we wrote a server, which basically pretends to be a Postgres rep, replica receives the right ahead log encodes it into JSON. And then you can subscribe to that server over web sockets. And so you can choose whether to subscribe, to changes on a particular schema or a particular table or particular columns, and even do equality matches on rows and things like this.And then we recently added the role level security policies to the real-time stream as well. So that was something that took us a while to, cause it was probably one of the largest technical challenges we've faced. But now that it's in the real-time stream is, is fully secure and you can apply these, these same policies that you apply over the CRUD API as well.[00:26:48] Jeremy: So for that part, did you have to look into the internals of Postgres and how it did its row level security and try to duplicate that in your own code?[00:26:59] Ant: Yeah, pretty much. I mean it's yeah, it's fairly complex and there's a guy on our team who, well, for him, it didn't seem as complex, let's say (laughs) , but yeah, that's pretty much it it's just a lot of it's effectively a SQL um, a Postgres extension itself, uh which in-in interprets those policies and applies them to, to the, to the, the right ahead log.[00:27:26] Jeremy: and this piece that you wrote, that's listening to the right ahead log. what was it written in and, and how did you choose that, that language or that stack?[00:27:36] Ant: Yeah. That's written in the Elixir framework which is based on Erlang very horizontally scalable. So any applications that you write in Elixir can kind of just scale horizontally the message passing and, you know, go into the billions and it's no problem. So it just seemed like a sensible choice for this type of application where you don't know.How large the wall is going to be. So it could just be like a few changes per second. It could be a million changes per second, then you need to be able to scale out. And I think Paul who's my co-founder originally, he wrote the first version of it and I think he wrote it as an excuse to learn Elixir, which is how, a lot of probably how PostgREST ended up being Haskell, I imagine.But uh, but it's meant that the Elixir community is still like relatively small. But it's a group of like very passionate and very um, highly skilled developers. So when we hire from that pool everyone who comes on board is just like, yeah, just, just really good and really enjoy is working with Elixir.So it's been a good source of a good source for hires as well. Just, just using those tools. [00:28:53] Jeremy: with a feature like this, I'm assuming it's where somebody goes to their website. They make a web socket connection to your application and they receive the updates that way. How have you seen how far you're able to push that in terms of connections, in terms of throughput, things like that?[00:29:12] Ant: Yeah, I don't actually have the numbers at hand. But we have, yeah, we have a team focused on obviously maximizing that but yeah, I don't I don't don't have those numbers right now. [00:29:24] Jeremy: one of the last things you've you've got on your website is a storage project or a storage product, I should say. And I believe it's written in TypeScript, so I was curious, we've got PostGrest, which is in Haskell. We've got go true and go. Uh, We've got the real-time database part in elixir.And so with storage, how did we finally get to TypeScript?[00:29:50] Ant: (Laughs) Well, the policy we kind of landed on was best tool for the job. Again, the good thing about being an open source is we're not resource constrained by the number of people who are in our team. It's by the number of people who are in the community and I'm willing to contribute. And so for that, I think one of the guys just went through a few different options that we could have went with, go just to keep it in line with a couple of the other APIs.But we just decided, you know, a lot of people well, everyone in the team like TypeScript is kind of just a given. And, and again, it was kind of down to speed, like what's the fastest uh we can get this up and running. And I think if we use TypeScript, it was, it was the best solution there. But yeah, but we just always go with whatever is best.Um, We don't worry too much uh, about, you know, the resources we have because the open source community has just been so great in helping us build supabase. And building supabase is like building like five companies at the same time actually, because each of these vertical stacks could be its own startup, like the auth stack And the storage layer, and all of this stuff.And you know, each has, it does have its own dedicated team. So yeah. So we're not too worried about the variation in languages.[00:31:13] Jeremy: And the storage layer is this basically a wrapper around S3 or like what is that product doing?[00:31:21] Ant: Yeah, exactly. It's it's wraparound as three. It, it would also work with all of the S3 compatible storage systems. There's a few Backblaze and a few others. So if you wanted to self host and use one of those alternatives, you could, we just have everything in our own S3 booklets inside of AWS.And then the other awesome thing about the storage system is that because we store the metadata inside of Postgres. So basically the object tree of what buckets and folders and files are there. You can write your role level policies against the object tree. So you can say this, this user should only access this folder and it's, and it's children which was kind of. Kind of an accident. We just landed on that. But it's one of my favorite things now about writing applications and supervisors is the rollover policies kind of work everywhere.[00:32:21] Jeremy: Yeah, it's interesting. It sounds like everything. Whether it's the storage or the authentication it's all comes back to postgres, right? At all. It's using the row level security. It's using everything that you put into the tables there, and everything's just kind of digging into that to get what it needs.[00:32:42] Ant: Yeah. And that's why I say we are a database company. We are a Postgres company. We're all in on postgres. We got asked in the early days. Oh, well, would you also make it my SQL compatible compatible with something else? And, but the amounts. Features Postgres has, if we just like continue to leverage them then it, it just makes the stack way more powerful than if we try to you know, go thin across multiple different databases.[00:33:16] Jeremy: And so that, that kind of brings me to, you mentioned how your Postgres companies, so when somebody signs up for supabase they create their first instance. What's what's happening behind the scenes. Are you creating a Postgres instance for them in a container, for example, how do you size it? That sort of thing.[00:33:37] Ant: Yeah. So it's basically just easy to under the hood for us we, we have plans eventually to be multi-cloud. But again, going down to the speed of execution that the. The fastest way was to just spin up a dedicated instance, a dedicated Postgres instance per user on EC2. We do also package all of the API APIs together in a second EC2 instance.But we're starting to break those out into clustered services. So for example, you know, not every user will use the storage API, so it doesn't make sense to Rooney for every user regardless. So we've, we've made that multitenant, the application code, and now we just run a huge global cluster which people connect through to access the S3 bucket.Basically and we're gonna, we have plans to do that for the other services as well. So right now it's you got two EC2 instances. But over time it will be just the Postgres instance and, and we wanted. Give everyone a dedicated instance, because there's nothing worse than sharing database resource with all the users, especially when you don't know how heavily they're going to use it, whether they're going to be bursty.So I think one of the things we just said from the start is everyone gets a Postgres instance and you get access to it as well. You can use your Postgres connection string to, to log in from the command line and kind of do whatever you want. It's yours.[00:35:12] Jeremy: so did it, did I get it right? That when I sign up, I create a super base account. You're actually creating an two instance for me specifically. So it's like every customer gets their, their own isolated it's their own CPU, their own Ram, that sort of thing.[00:35:29] Ant: Yeah, exactly, exactly. And, and the way the. We've set up the monitoring as well, is that we can expose basically all of that to you in the dashboard as well. so you can, you have some control over like the resource you want to use. If you want to a more powerful instance, we can do that. A lot of that stuff is automated.So if someone scales beyond the allocated disk size, the disk will automatically scale up by 50% each time. And we're working on automating a bunch of these, these other things as well.[00:36:03] Jeremy: so is it, is it where, when you first create the account, you might create, for example, a micro instance, and then you have internal monitoring tools that see, oh, the CPU is getting heady hit pretty hard. So we need to migrate this person to a bigger instance, that kind of thing.[00:36:22] Ant: Yeah, pretty much exactly. [00:36:25] Jeremy: And is that, is that something that the user would even see or is it the case of where you send them an email and go like, Hey, we notice you're hitting the limits here. Here's what's going to happen. [00:36:37] Ant: Yeah.In, in most cases it's handled automatically. There are people who come in and from day one, they say has my requirements. I'm going to have this much traffic. And I'm going to have, you know, a hundred thousand users hitting this every hour. And in those cases we will over-provisioned from the start.But if it's just the self service case, then it will be start on a smaller instance and an upgrade over time. And this is one of our biggest challenges over the next five years is we want to move to a more scalable Postgres. So cloud native Postgres. But the cool thing about this is there's a lot of.Different companies and individuals working on this and upstreaming into Postgres itself. So for us, we don't need to, and we, and we would never want to fork Postgres and, you know, and try and separate the storage and the the computes. But more we're gonna fund people who are already working on this so that it gets upstreamed into Postgres itself.And it's more cloud native. [00:37:46] Jeremy: Yeah. So I think the, like we talked a little bit about how Firebase was the original inspiration and when you work with Firebase, you, you don't think about an instance at all, right? You, you just put data in, you get data out. And it sounds like in this case, you're, you're kind of working from the standpoint of, we're going to give you this single Postgres instance.As you hit the limits, we'll give you a bigger one. But at some point you, you will hit a limit of where just that one instance is not enough. And I wonder if there's you have any plans for that, or if you're doing anything currently to, to handle that.[00:38:28] Ant: Yeah. So, so the medium goal is to do replication like horizontal scaling. We, we do that for some users already but we manually set that up. we do want to bring that to the self serve model as well, where you can just choose from the start. So I want, you know, replicas in these, in these zones and in these different data centers.But then, like I said, the long-term goal is that. it's not based on. Horizontally scaling a number of instances it's just a Postgres itself can, can scale out. And I think we will get to, I think, honestly, the race at which the Postgres community is working, I think we'll be there in two years.And, and if we can contribute resource towards that, that goal, I think yeah, like we'd love to do that, but yeah, but for now, it's, we're working on this intermediate solution of, of what people already do with, Postgres, which is, you know, have you replicas to make it highly available.[00:39:30] Jeremy: And with, with that, I, I suppose at least in the short term, the goal is that your monitoring software and your team is handling the scaling up the instance or creating the read replicas. So to the user, it, for the most part feels like a managed service. And then yeah, the next step would be to, to get something more similar to maybe Amazon's Aurora, I suppose, where it just kind of, you pay per use.[00:40:01] Ant: Yeah, exactly. Exactly. Aurora was kind of the goal from the start. It's just a shame that it's proprietary. Obviously. [00:40:08] Jeremy: right. Um, but it sounds, [00:40:10] Ant: the world would be a better place. If aurora was opensource. [00:40:15] Jeremy: yeah. And it sounds like you said, there's people in the open source community that are, that are trying to get there. just it'll take time. to, to all this, about making it feel seamless, making it feel like a serverless experience, even though internally, it really isn't, I'm guessing you must have a fair amount of monitoring or ways that you're making these decisions.I wonder if you can talk a little bit about, you know, what are the metrics you're looking at and what are the applications you're you have to, to help you make these decisions?[00:40:48] Ant: Yeah. definitely. So we started with Prometheus which is a, you know, metrics gathering tool. And then we moved to Victoria metrics which was just easier for us to scale out. I think soon we'll be managing like a hundred thousand Postgres databases will have been deployed on, on supabase. So definitely, definitely some scale. So this kind of tooling needs to scale to that as well. And then we have agents kind of everywhere on each application on, on the database itself. And we listen for things like the CPU and the Ram and the network IO. We also poll. Uh, Postgres itself. Th there's a extension called PG stats statements, which will give us information about what are, the intensive queries that are running on that, on that box.So we just collect as much of this as possible um, which we then obviously use internally. We set alerts to, to know when, when we need to upgrade in a certain direction, but we also have an end point where the dashboard subscribes to these metrics as well. So the user themselves can see a lot of this information.And we, I think at the moment we do a lot of the, the Ram the CPU, that kind of stuff, but we're working on adding just more and more of these observability metrics uh, so people can can know it could, because it also helps with Let's say you might be lacking an index on a particular table and not know about it.And so if we can expose that to you and give you alerts about that kind of thing, then it obviously helps with the developer experience as well.[00:42:29] Jeremy: Yeah. And th that brings me to something that I, I hear from platform as a service companies, where if a user has a problem, whether that's a crash or a performance problem, sometimes it can be difficult to distinguish between is it a problem in their application or is this a problem in super base or, you know, and I wonder how your support team kind of approaches that.[00:42:52] Ant: Yeah, no, it's, it's, it's a great question. And it's definitely something we, we deal with every day, I think because of where we're at as a company we've always seen, like, we actually have a huge advantage in that.we can provide. Rarely good support. So anytime an engineer joins super base, we tell them your primary job is actually frontline support.Everything you do afterwards is, is secondary. And so everyone does a four hour shift per week of, of working directly with the customers to help determine this kind of thing. And where we are at the moment is we are happy to dive in and help people with their application code because it helps our engineers land about how it's being used and where the pitfalls are, where we need better documentation, where we need education.So it's, that is all part of the product at the moment, actually. And, and like I said, because we're not a 10,000 person company we, it's an advantage that we have, that we can deliver that level of support at the moment. [00:44:01] Jeremy: w w what are some of the most common things you see happening? Like, is it I would expect you mentioned indexing problems, but I'm wondering if there's any specific things that just come up again and again,[00:44:15] Ant: I think like the most common is people not batching their requests. So they'll write an application, which, you know, needs to, needs to pull 10,000 rows and they send 10,000 requests (laughs) . That that's, that's a typical one for, for people just getting started maybe. Yeah. and, then I think the other thing we faced in the early days was. People storing blobs in the database which we obviously solve that problem by introducing file storage. But people will be trying to store, you know, 50 megabytes, a hundred megabyte files in Postgres itself, and then asking why the performance was so bad.So I think we've, we've mitigated that one by, by introducing the blob storage.[00:45:03] Jeremy: and when you're, you mentioned you have. Over a hundred thousand instances running. I imagine there have to be cases where an incident occurs, where something doesn't go quite right. And I wonder if you could give an example of one and how it was resolved.[00:45:24] Ant: Yeah, it's a good question. I think, yeah, w w we've improved the systems since then, but there was a period where our real time server wasn't able to handle rarely large uh, right ahead logs. So w there was a period where people would just make tons and tons of requests and updates to, to Postgres. And the real time subscriptions were failing. But like I said, we have some really great Elixir devs on the team, so they were able to jump on that fairly quickly. And now, you know, the application is, is way more scalable as a result. And that's just kind of how the support model works is you have a period where everything is breaking and then uh, then you can just, you know, tackle these things one by one. [00:46:15] Jeremy: Yeah, I think any, anybody at a, an early startup is going to run into that. Right? You put it out there and then you find out what's broken, you fix it and you just get better and better as it goes along.[00:46:28] Ant: Yeah, And the funny thing was this model of, of deploying EC2 instances. We had that in like the first week of starting super base, just me and Paul. And it was never intended to be the final solution. We just kind of did it quickly and to get something up and running for our first handful of users But it's scaled surprisingly well.And actually the things that broke as we started to get a lot of traffic and a lot of attention where was just silly things. Like we give everyone their own domain when they start a new project. So you'll have project ref dot super base dot in or co. And the things that were breaking where like, you know, we'd ran out of sub-domains with our DNS provider and then, but, and those things always happen in periods of like intense traffic.So we ha we were on the front page of hacker news, or we had a tech crunch article, and then you discover that you've ran out of sub domains and the last thousand people couldn't deploy their projects. So that's always a fun a fun challenge because you are then dependent on the external providers as well and theirs and their support systems.So yeah, I think. We did a surprisingly good job of, of putting in good infrastructure from the start. But yeah, all of these crazy things just break when obviously when you get a lot of, a lot of traffic[00:48:00] Jeremy: Yeah, I find it interesting that you mentioned how you started with creating the EC2 instances and it turned out that just work. I wonder if you could walk me through a little bit about how it worked in the beginning, like, was it the two of you going in and creating instances as people signed up and then how it went from there to where it is today?[00:48:20] Ant: yeah. So there's a good story about, about our fast user, actually. So me and Paul used to contract for a company in Singapore, which was an NFT company. And so we knew the lead developer very well. And we also still had the Postgres credentials on, on our own machines. And so what we did was we set up the th th the other funny thing is when we first started, we didn't intend to host the database.We, we thought we were just gonna host the applications that would connect to your existing Postgres instance. And so what we did was we hooked up the applications to, to the, to the Postgres instance of this, of this startup that we knew very well. And then we took the bus to their office and we sat with the lead developer, and we said, look, we've already set this thing up for you.What do you think. know, when, when you think like, ah, we've, we've got the best thing ever, but it's not until you put it in front of someone and you see them, you know, contemplating it and you're like, oh, maybe, maybe it's not so good. Maybe we don't have anything. And we had that moment of panic of like, oh, maybe we just don't maybe this isn't great.And then what happened was he didn't like use us. He didn't become a supabase user. He asked to join the team. [00:49:45] Jeremy: nice, nice.[00:49:46] Ant: that was a good a good kind of a moment where we thought, okay, maybe we have got something, maybe this is maybe this isn't terrible. So, so yeah, so he became our first employee. Yeah. [00:49:59] Jeremy: And so yeah, so, so that case was, you know, the very beginning you set everything up from, from scratch. Now that you have people signing up and you have, you know, I don't know how many signups you get a day. Did you write custom infrastructure or applications to do the provisioning or is there an open source project that you're using to handle that[00:50:21] Ant: Yeah. It's, it's actually mostly custom. And you know, AWS does a lot of the heavy lifting for you. They just provide you with a bunch of API end points. So a lot of that is just written in TypeScript fairly straightforward and, and like I said, you never intended to be the thing that last. Two years into the business.But it's, it's just scaled surprisingly well. And I'm sure at some point we'll, we'll swap it out for some I don't orchestration tooling like Pulumi or something like this. But actually the, what we've got just works really well.[00:50:59] Ant: Be because we're so into Postgres our queuing system is a Postgres extension called PG boss. And then we have a fleet of workers, which are. Uh, We manage on EC ECS. Um, So it's just a bunch of VMs basically which just subscribed to the, to the queue, which lives inside the database.And just performs all the, whether it be a project creation, deletion modification a whole, whole suite of these things. Yeah. [00:51:29] Jeremy: very cool. And so even your provisioning is, is based on Postgres.[00:51:33] Ant: Yeah, exactly. Exactly (laughs) . [00:51:36] Jeremy: I guess in that case, I think, did you say you're using the right ahead log there to in order to get notifications?[00:51:44] Ant: We do use real time, and this is the fun thing about building supabase is we use supabase to build supabase. And a lot of the features start with things that we build for ourselves. So the, the observability features we have a huge logging division. So, so w we were very early users of a tool called a log flare, which is also written in Elixir.It's basically a log sync backed up by BigQuery. And we loved it so much and we became like super log flare power users that it was kind of, we decided to eventually acquire the company. And now we can just offer log flare to all of our customers as well as part of using supabase. So you can query your logs and get really good business intelligence on what your users um, consuming in from your database.[00:52:35] Jeremy: the lock flare you're mentioning though, you said that that's a log sink and that that's actually not going to Postgres, right. That's going to a different type of store.[00:52:43] Ant: Yeah. That is going to big query actually. [00:52:46] Jeremy: Oh, big query. Okay. [00:52:47] Ant: yeah, and maybe eventually, and this is the cool thing about watching the Postgres progression is it's become. It's bringing like transactional and analytical databases together. So it's traditionally been a great transactional database, but if you look at a lot of the changes that have been made in recent versions, it's becoming closer and closer to an analytical database.So maybe at some point we will use it, but yeah, but big query works just great. [00:53:18] Jeremy: Yeah. It's, it's interesting to see, like, I, I know that we've had episodes on different extensions to Postgres where I believe they change out how the storage works. So there's yeah, it's really interesting how it's it's this one database, but it seems like it can take so many different forms. [00:53:36] Ant: It's just so extensible and that's why we're so bullish on it because okay. Maybe it wasn't always the best database, but now it seems like it is becoming the best database and the rate at which it's moving. It's like, where's it going to be in five years? And we're just, yeah, we're just very bullish on, on Postgres.As you can tell from the amount of mentions it's had in this episode.[00:54:01] Jeremy: yeah, we'll have to count how many times it's been said. I'm sure. It's, I'm sure it's up there. Is there anything else we, we missed or think you should have mentioned.[00:54:12] Ant: No, some of the things we're excited about are cloud functions. So it's the thing we just get asked for the most at anytime we post anything on Twitter, you're guaranteed to get a reply, which is like when functions. And we're very pleased to say that it's, it's almost there. So um, that will hopefully be a really good developer experience where also we launched like a, a graph QL Postgres extension where the resolver lives inside of Postgres.And that's still in early alpha, but I think I'm quite excited for when we can start offering that on the on the hosted platform as well. People will have that option to, to use GraphQL instead of, or as well as the restful API.[00:55:02] Jeremy: the, the common thread here is that PostgreSQL you're able to take it really, really far. Right. In terms of scale up, eventually you'll have the read replicas. Hopefully you'll have. Some kind of I don't know what you would call Aurora, but it's, it's almost like self provisioning, maybe not sharing what, how you describe it.But I wonder as a, as a company, like we talked about big query, right? I wonder if there's any use cases that you've come across, either from customers or in your own work where you're like, I just, I just can't get it to fit into Postgres.[00:55:38] Ant: I think like, not very often, but sometimes we'll, we will respond to support requests and recommend that people use Firebase. they're rarelylike if, if they really do have like large amounts of unstructured data, which is which, you know, documented storage is, is kind of perfect for that. We'll just say, you know, maybe you should just use Firebase.So we definitely come across things like that. And, and like I said, we love, we love Firebase, so we're definitely not trying to, to uh, destroy as a tool. I think it, it has its use cases where it's an incredible tool yeah. And provides a lot of inspiration for, for what we're building as well. [00:56:28] Jeremy: all right. Well, I think that's a good place to, to wrap it up, but where can people hear more about you hear more about supabase?[00:56:38] Ant: Yeah, so supeabase is at supabase.com. I'm on Twitter at ant Wilson. Supabase is on Twitter at super base. Just hits us up. We're quite active on the and then definitely check out the repose gets up.com/super base. There's lots of great stuff to dig into as we discussed. There's a lot of different languages, so kind of whatever you're into, you'll probably find something where you can contribute. [00:57:04] Jeremy: Yeah, and we, we sorta touched on this, but I think everything we've talked about with the exception of the provisioning part and the monitoring part is all open source. Is that correct? [00:57:16] Ant: Yeah, exactly.And as, yeah. And hopefully everything we build moving forward, including functions and graph QL we'll continue to be open source.[00:57:31] Jeremy: And then I suppose the one thing I, I did mean to touch on is what, what is the, the license for all the components you're using that are open source?[00:57:41] Ant: It's mostly Apache2 or MIT. And then obviously Postgres has its own Postgres license. So as long as it's, it's one of those, then we, we're not too precious. I, As I said, we inherit a fair amounts of projects. So we contribute to and adopt projects. So as long as it's just very permissive, then we don't care too much.[00:58:05] Jeremy: As far as the projects that your team has worked on, I've noticed that over the years, we've seen a lot of companies move to things like the business source license or there's, there's all these different licenses that are not quite so permissive. And I wonder like what your thoughts are on that for the future of your company and why you think that you'll be able to stay permissive.[00:58:32] Ant: Yeah, I really, really, rarely hope that we can stay permissive. forever. It's, it's a philosophical thing for, for us. You know, when we, we started the business, it's what just very, just very, as individuals into the idea of open source. And you know, if, if, if AWS come along at some point and offer hosted supabase on AWS, then it will be a signal that where we're doing something.Right. And at that point we just, I think we just need to be. The best team to continue to move super boost forward. And if we are that, and I, I think we will be there and then hopefully we will never have to tackle this this licensing issue. [00:59:19] Jeremy: All right. Well, I wish you, I wish you luck.[00:59:23] Ant: Thanks. Thanks for having me. [00:59:25] Jeremy: This has been Jeremy Jung for software engineering radio. Thanks for listening.

amazon google mit web nfts singapore id cto ram ant pg api io python aws github prometheus apis javascript rooney y combinator cpu dns s3 sql elixir js dx haskell yc json mysql typescript graphql vms nosql postgresql backblaze firebase postgres netlify jwt ec2 erlang bigquery ql horizontally supabase usei uid postgis software engineering radio jeremy you jeremy so

Ep 384: A Founding CEO's Guide to Finding Successors

Play Episode Listen Later Apr 14, 2022 16:34

Sean: But when you hire your own CEO, that means that you found someone who is very capable of replacing you and running the business so that it can work on its own. How did that look like for you? What are some key traits that you were looking for? Was it so expensive to do that? Jeremy: Yeah, I think the fundamental fact is that, you know, work is work. And, you know, companies are made of people and the environment changes, the company changes, and you change. And so, a lot of people are like stuck in a sense that like, 'okay, I will change when I want to change, right? Or change doesn't happen unless I approve it, or I influence it.' But the truth is, every day, you know, the markets change, your competitors get smarter, your friends have different requirements, your clients choose to do different things. That stuff is really changing. Your company is changing because you're already building company. You put your time and effort and if you have that team, they are themselves making decisions about how to grow, how to get better, how to get smarter. And the company is no longer the company you founded but is now a company that's going on to do things by itself, it has culture, has values and so, so forth. And lastly, you change, right? You know, like you're getting all the time and entropy is happening. And truth is all of us are going to be gone for sure in 100 years. Jeremy: And the truth is, if you think about it like we must give ourselves permissions that we ourselves are allowed to change, that we can want to do different things, that we want to build different things, that we want to create different things. Jeremy: You may say, like, I want to do something different, but the company shouldn't change. Then it's very hard to find someone who's going to be replacing you - I think that's one way you described it. But I always remember my mentor was talking to me and he said, Jeremy, your job is to find a successor who's better than you for this next phase of the company. Because the truth is, at a start of this company, there is no one better than you at the founder in building this first stage of company. But your job is to find a replacement who's going to be better than you. Jeremy: I think the tricky part that you must do is to say, I'm looking for someone who's better than me for this next phase of the company, because the company has changed, and you look for someone who is better fit for that. And the reason why you're leaving, I would like to leave or find someone is because you don't feel like you're the best person for that job because you may have the skills even, but you maybe don't have the passion anymore for it as well. And that's okay to acknowledge because I mean, you know, in today's world, you know, they say Gen Z and millennials are like switching jobs every two years, right, you know? And then here's the founder CEO hanging out for seven, ten more years. Right. And then, you know, like give me enough clothes to get to explore different things. - - - Youtube: https://www.youtube.com/leadershipstack Join our community and ask questions here: from.sean.si/discord Facebook: https://www.facebook.com/leadershipstack

ceo guide gen z successors founding ceo jeremy you

Taking Notes on Serverless

Play Episode Listen Later Nov 10, 2021 49:23

Swizec is the author of the Serverless Handbook and a software engineer at Tia.Swizec Swizec's personal site Serverless Handbook AWS Lambda API Gateway Operating Lambda (The cold start problem) Provisioned Concurrency DynamoDB Relational Database Service Aurora Simple Queue Service CloudFormation CloudWatch Other serverless function hosting providers Gatsby Cloud Functions Vercel Serverless Functions Netlify Functions Cloud Functions for Firebase Related topics Serverless Framework Jamstack Lighthouse What is a Static Site Generator? What is a CDN? Keeping Server-Side Rendering Cool With React Hydration TypeScript TranscriptYou can help edit this transcript on GitHub.[00:00:00] Jeremy: Today, I'm talking to Swiz Teller. He's a senior software engineer at Tia. The author of the serverless handbook and he's also got a bunch of other courses and I don't know is it thousands of blog posts now you have a lot of them.[00:00:13] Swizec: It is actually thousands of, uh, it's like 1500. So I don't know if that's exactly thousands, but it's over a thousand.I'm cheating a little bit. Cause I started in high school back when blogs were still considered social media and then I just kind of kept going on the same domain.Do you have some kind of process where you're, you're always thinking of what to write next? Or are you writing things down while you're working at your job? Things like that. I'm just curious how you come up with that. [00:00:41] Swizec: So I'm one of those people who likes to use writing as a way to process things and to learn. So one of the best ways I found to learn something new is to kind of learn it and then figure out how to explain it to other people and through explaining it, you really, you really spot, oh shit. I don't actually understand that part at all, because if I understood it, I would be able to explain it.And it's also really good as a reference for later. So some, one of my favorite things to do is to spot a problem at work and be like, oh, Hey, this is similar to that side project. I did once for a weekend experiment I did, and I wrote about it so we can kind of crib off of my method and now use it. So we don't have to figure things out from scratch.And part of it is like you said, that just always thinking about what I can write next. I like to keep a schedule. So I keep myself to posting two articles per week. It used to be every day, but I got too busy for that. when you have that schedule and, you know, okay on Tuesday morning, I'm going to sit down and I have an hour or two hours to write, whatever is on top of mind, you kind of start spotting more and more of these opportunities where it's like a coworker asked me something and I explained it in a slack thread and it, we had an hour. Maybe not an hour, but half an hour of back and forth. And you actually just wrote like three or 400 words to explain something. If you take those 400 words and just polish them up a little bit, or rephrase them a different way so that they're easier to understand for somebody who is not your coworker, Hey, that's a blog post and you can post it on your blog and it might help others.[00:02:29] Jeremy: It sounds like taking the conversations most people have in their day to day. And writing that down in a more formal way. [00:02:37] Swizec: Yeah. not even maybe in a more formal way, but more, more about in a way that a broader audience can appreciate. if it's, I'm super gnarly, detailed, deep in our infrastructure in our stack, I would have to explain so much of the stuff around it for anyone to even understand that it's useless, but you often get these nuggets where, oh, this is actually a really good insight that I can share with others and then others can learn from it. I can learn from it. [00:03:09] Jeremy: What's the most accessible way or the way that I can share this information with the most people who don't have all this context that I have from working in this place. [00:03:21] Swizec: Exactly. And then the power move, if you're a bit of an asshole is to, instead of answering your coworkers question is to think about the answer, write a blog post and then share the link with them.I think that's pushing it a little bit.[00:03:38] Jeremy: Yeah, It's like you're being helpful, but it also feels a little bit passive aggressive.[00:03:44] Swizec: Exactly. Although that's a really good way to write documentation. One thing I've noticed at work is if people keep asking me the same questions, I try to stop writing my replies in slack and instead put it on confluence or whatever internal wiki that we have, and then share that link. and that has always been super appreciated by everyone.[00:04:09] Jeremy: I think it's easy to, have that reply in slack and, and solve that problem right then. But when you're creating these Wiki pages or these documents, how're people generally finding these. Cause I know you can go through all this trouble to make this document. And then people just don't know to look or where to go. [00:04:30] Swizec: Yeah. Discoverability is a really big problem, especially what happens with a lot of internal documentation is that it's kind of this wasteland of good ideas that doesn't get updated and nobody maintains. So people stop even looking at it. And then if you've stopped looking at it before, stop updating it, people stop contributing and it kind of just falls apart.And the other problem that often happens is that you start writing this documentation in a vacuum. So there's no audience for it, so it's not help. So it's not helpful. That's why I like the slack first approach where you first answered the question is. And now, you know exactly what you're answering and exactly who the audiences.And then you can even just copy paste from slack, put it in a conf in JIRA board or wherever you put these things. spice it up a little, maybe effect some punctuation. And then next time when somebody asks you the same question, you can be like, oh, Hey, I remember where that is. Go find the link and share it with them and kind of also trains people to start looking at the wiki.I don't know, maybe it's just the way my brain works, but I'm really bad at remembering information, but I'm really good at remembering how to find it. Like my brain works like a huge reference network and it's very easy for me to remember, oh, I wrote that down and it's over there even if I don't remember the answer, I almost always remember where I wrote it down if I wrote it down, whereas in slack it just kind of gets lost.[00:06:07] Jeremy: Do you also take more informal notes? Like, do you have notes locally? You look through or something? That's not a straight up Wiki. [00:06:15] Swizec: I'm actually really bad at that. I, one of the things I do is that when I'm coding, I write down. so I have almost like an engineering log book where everything, I, almost everything I think about, uh, problems I'm working on. I'm always writing them down on by hand, on a piece of paper. And then I never look at those notes again.And it's almost like it helps me think it helps me organize my thoughts. And I find that I'm really bad at actually referencing my notes and reading them later because, and this again is probably a quirk of my brain, but I've always been like this. Once I write it down, I rarely have to look at it again.But if I don't write it down, I immediately forget what it is.What I do really like doing is writing down SOPs. So if I notice that I keep doing something repeatedly, I write a, uh, standard operating procedure. For my personal life and for work as well, I have a huge, oh, it's not that huge, but I have a repository of standard procedures where, okay, I need to do X.So you pull up the right recipe and you just follow the recipe. And if you spot a bug in the recipe, you fix the recipe. And then once you have that polished, it's really easy to turn that into an automated process that can do it for you, or even outsource it to somebody else who can work. So we did, you don't have to keep doing the same stuff and figuring out, figuring it out from scratch every time.[00:07:55] Jeremy: And these standard operating procedures, they sound a little bit like runbooks I guess. [00:08:01] Swizec: Yep. Run books or I think in DevOps, I think the big red book or the red binder where you take it out and you're like, we're having this emergency, this alert is firing. Here are the next steps of what we have to check.[00:08:15] Jeremy: So for those kinds of things, those are more for incidents and things like that. But in your case, it sounds like it's more, uh, I need to get started with the next JS project, or I need to set up a Postgres database things like that. [00:08:30] Swizec: Yeah. Or I need to reset a user to initial states for testing or create a new user. That's sort of thing.[00:08:39] Jeremy: These probably aren't in that handwritten log book.[00:08:44] Swizec: The wiki. That's also really good way to share them with new engineers who are coming on to the team.[00:08:50] Jeremy: Is it where you just basically dump them all on one page or is it where you, you organize them somehow so that people know that this is where, where they need to go. [00:09:00] Swizec: I like to keep a pretty flat structure because, I think the, the idea of categorization outlived its prime. We have really good search algorithms now and really good fuzzy searching. So it's almost easier if everything is just dumped and it's designed to be easy to search. a really interesting anecdote from, I think they were they were professors at some school and they realized that they try to organize everything into four files and folders.And they're trying to explain this to their younger students, people who are in their early twenties and the young students just couldn't understand. Why would you put anything in a folder? Like what is a folder? What is why? You just dump everything on your desktop and then command F and you find it. Why would you, why would you even worry about what the file name is? Where the file is? Like, who cares? It's there somewhere.[00:09:58] Jeremy: Yeah, I think I saw the same article. I think it was on the verge, right?I mean, I think that's that's right, because when you're using, say a Mac and you don't go look for the application or the document you want to run a lot of times you open up spotlight and just type it and it comes up.Though, I think what's also sort of interesting is, uh, at least in the note taking space, there's a lot of people who like setting up things like tags and things like that. And in a way that feels a lot like folders, I guess [00:10:35] Swizec: Yeah. The difference between tags and categories is that the same file can have multiple tags and it cannot be in multiple folders. So that's why categorization systems usually fall apart. You mentioned note taking systems and my opinion on those has always been that it's very easy to fall into the trap of feeling productive because you are working on your note or productivity system, but you're not actually achieving anything.You're just creating work for work sake. I try to keep everything as simple as possible and kind of avoid the overhead.[00:11:15] Jeremy: People can definitely spend hours upon hours curating what's my note taking system going to be, the same way that you can try to set up your blog for two weeks and not write any articles. [00:11:31] Swizec: Yeah. exactly.[00:11:32] Jeremy: When I take notes, a lot of times I'll just create a new note in apple notes or in a markdown file and I'll just write stuff, but it ends up being very similar to what you described with your, your log book in that, like, because it's, it's not really organized in any way. Um, it can be tricky to go back and actually, find useful information though, Though, I suppose the main difference though, is that when it is digital, uh, sometimes if I search for a specific, uh, software application or a specific tool, then at least I can find, um, those bits there [00:12:12] Swizec: Yeah. That's true. the other approach I'd like to use is called the good shit stays. So if I can't remember it, it probably wasn't important enough. And you can, especially these days with the internet, when it comes to details and facts, you can always find them. I find that it's pretty easy to find facts as long as you can remember some sort of reference to it.[00:12:38] Jeremy: You can find specific errors or like you say specific facts, but I think if you haven't been working with a specific technology or in a specific domain for a certain amount of time, you, it, it can be hard to, to find like the right thing to look for, or to even know if the solution you're looking at is, is the right one. [00:13:07] Swizec: That is very true. Yeah. Yeah, I don't really have a solution for that one other than relearn it again. And it's usually faster the second time. But if you had notes, you would still have to reread the notes. Anyway, I guess that's a little faster, cause it's customized to you personally.[00:13:26] Jeremy: Where it's helpful is that sometimes when you're looking online, you have to jump through a bunch of different sites to kind of get all the information together. And by that time you've, you've lost your flow a little bit, or you you've lost, kind of what you were working on, uh, to begin with. Yeah. [00:13:45] Swizec: Yeah. That definitely happens.[00:13:47] Jeremy: Next I'd like to talk about the serverless handbook. Something that you've talked about publicly a little bit is that when you try to work on something, you don't think it's a great idea to just go look at a bunch of blog posts. Um, you think it's better to, to go to a book or some kind of more, uh, I don't know what you would call it like larger or authoritative resource. And I wonder what the process was for, for you. Like when you decided I'm going to go learn how to do serverless you know, what was your process for doing that? [00:14:23] Swizec: Yeah. When I started learning serverless, I noticed that maybe I just wasn't good at finding them. That's one thing I've noticed with Google is that when you're jumping into a new technical. It's often hard to find stuff because you don't really know what you're searching for. And Google also likes to tune the algorithms to you personally a little bit.So it can be hard to find what you want if you are, if you haven't been in that space. So I couldn't really find a lot of good resources, uh, which resulted in me doing a lot of exploration, essentially from scratch or piecing together different blogs and scraps of information here and there. I know that I spend ridiculous amounts of time in even as deep as GitHub issues on closed issues that came up in Google and answer something or figure, or people were figuring out how something works and then kind of piecing all of that together and doing a lot of kind of manual banging my head against the wall until the wall broke.And I got through. I decided after all of that, that I really liked serverless as a technology. And I really think it's the future of how backend systems are going to be built. I think it's unclear yet. What kind of systems is appropriate for and what kind of kind of systems it isn't.It does have pros and cons. it does resolve a lot of the very annoying parts of building a modern website or building upon backend go away when you go serverless. So I figured I really liked this and I've learned a lot trying to piece it together over a couple of years.And if combined, I felt like I was able to do that because I had previous experience with building full stack websites, building full stack apps and understanding how backends work in general. So it wasn't like, oh, How do I do this from scratch? It was more okay. I know how this is supposed to work in theory.And I understand the principles. What are the new things that I have to add to that to figure out serverless? So I wrote the serverless handbook basically as a, as a reference or as a resource that I wish I had when I started learning this stuff. It gives you a lot of the background of just how backends work in general, how databases connect, what different databases are, how they're, how they work.Then I talked some, some about distributed systems because that comes up surprisingly quickly when you're going with serverless approaches, because everything is a lot more distributed. And it talks about infrastructure as code because that kind of simplifies a lot of the, they have opposite parts of the process and then talks about how you can piece it together in the ends to get a full product. and I approached it from the perspective of, I didn't want to write a tutorial that teaches you how to do something specific from start to finish, because I personally don't find those to be super useful. Um, they're great for getting started. They're great for building stuff. If you're building something, that's exactly the same as the tutorial you found.But they don't help you really understand how it works. It's kind of like if you just learn how to cook risotto, you know how to cook risotto, but nobody told you that, Hey, you actually, now that you know how to cook risotto, you also know how to just make rice and peas. It's pretty much the same process.Uh, and if you don't have that understanding, it's very hard to then transition between technologies and it's hard to apply them to your specific situation. So I try to avoid that and write more from the perspective. How I can give somebody who knows JavaScript who's a front end engineer, or just a JavaScript developer, how I can give them enough to really understand how serverless and backends works and be able to apply those approaches to any project.[00:18:29] Jeremy: When people hear serverless, a lot of times they're not really sure what that actually means. I think a lot of times people think about Lambdas, they think about functions as a service. but I wonder to you what does serverless mean? [00:18:45] Swizec: It's not that there's no server, there's almost always some server somewhere. There has to be a machine that actually runs your code. The idea of serverless is that the machine and the system that handles that stuff is trans is invisible to you. You're offloading all of the dev ops work to somebody else so that you can full focus on the business problems that you're trying to solve.You can focus on the stuff that is specific and unique to your situation because, you know, there's a million different ways to set up a server that runs on a machine somewhere and answers, a, API requests with adjacent. And some people have done that. Thousands of times, new people, new folks have probably never done it.And honestly, it's really boring, very brittle and kind of annoying, frustrating work that I personally never liked. So with serverless, you can kind of hand that off to a whole team of engineers at AWS or at Google or, whatever other providers there are, and they can deal with that stuff. And you can, you can work on the level of, I have this JavaScript function.I want this JavaScript function to run when somebody hits this URL and that's it. That's all, that's essentially all you have to think about. So that's what serverless means to me. It's essentially a cloud functions, I guess.[00:20:12] Jeremy: I mean, there been services like Heroku, for example, that, that have let people make rails apps or Django apps and things like that, where the user doesn't really have to think about the operating system, um, or about creating databases and things like that. And I wonder, to you, if, if that is serverless or if that's something different and, and what the difference there might be. [00:20:37] Swizec: I think of that as an intermediary step between on prem or handling your own servers and full serverless, because you still have to think about provisioning. You still have to think of your server as a whole blob or a whole glob of things that runs together and runs somewhere and lives or lifts somewhere.You have to provision capacity. You have to still think about how many servers you have on Heroku. They're called dynos. you still have to deal with the routing. You have to deal with connecting it to the database. Uh, you always have to think about that a little bit, but you're, you're still dealing with a lot of the frameworky stuff where you have to, okay, I'm going to declare a route. And then once I've declared the route, I'm going to tell it how to take data from the, from the request, put it to the function. That's actually doing the work. And then you're still dealing with all of that. Whereas with full serverless, first of all, it can scale down to zero, which is really useful.If you don't have a lot of traffic, you can have, you're not paying anything unless somebody is actually using your app. The other thing is that you don't deal with any of the routing or any of that. You're just saying, I want this URL to exist, and I want it to run that function, that you don't deal with anything more than that.And then you just write, the actual function that's doing the work. So it ends up being as a normal jobs function that accepts a request as an argument and returns a JSON response, or even just a JSON object and the serverless machinery handles everything else, which I personally find a lot easier. And you don't have to have these, what I call JSON bureaucracy, where you're piping an object through a bunch of different functions to get from the request to the actual part that's doing the work. You're just doing the core interesting work.[00:22:40] Jeremy: Sort of sounds like one of the big distinctions is with something like Heroku or something similar. You may not have a server, but you have the dyno, which is basically a server. You have something that is consistently running, Whereas with what you consider to be serverless, it's, it's something that basically only launches on when it's invoked. Um, whether that's a API call or, or something else. The, the routing thing is a little bit interesting because the, when I was going through the course, there are still the routes that you write. It's just that you're telling, I guess the API gateway Amazon's API gateway, how to route to your functions, which was very similar to how to route to a controller action or something like that in other languages.[00:23:37] Swizec: Yeah. I think that part is actually is pretty similar where, I think it kind of depends on what kind of framework you end up building. Yeah, it can be very simple. I know with rails, it's relatively simple to define a new route. I think you have to touch three or four different files. I've also worked in large express apps where.Hooking up the controller with all of the swagger definitions or open API definitions, and everything else ends up being like six or seven different files that have to have functions that are named just right. And you have to copy paste it around. And I, I find that to be kind of a waste of effort, with the serverless framework.What I like is you have this YAML file and you say, this route is handled by this function. And then the rest happens on its own with next JS or with Gatsby functions, Gatsby cloud functions. They've gone even a step further, which I really like. You have the slash API directory in your project and you just pop a file in there.And whatever that file is named, that becomes your API route and you don't even have to configure anything. You're just, in both of them, if you put a JavaScript file in slash API called hello, That exports, a handler function that is automatically a route and everything else happens behind the scenes.[00:25:05] Jeremy: So that that's more of a matter of the framework you're using and how easy does it make it to, to handle routing? Whether that's a pain or a not.[00:25:15] Swizec: Yeah. and I think with the serverless frameworks, it's because serverless itself, as a concept makes it easier to set this up. We've been able to have these modern frameworks with really good developer experience Gatsby now with how did they have Gatsby cloud and NextJS with Vercel and I think Netlify is working on it as well.They can have this really good integration between really tight coupling and tight integration between a web framework and the deployment environment, because serverless is enabling them to spin that up. So easily.[00:25:53] Jeremy: One of the things about your courses, this isn't the only thing you focus on, but one of the use cases is basically replacing a traditional server rendered application or a traditional rails, django, spring application, where you've got Amazon's API gateway in front, which is serving as the load balancer.And then you have your Lambda functions, which are basically what would be a controller action in a lot of frameworks. and then you're hooking it up to a database which could be Amazon. It could be any database, I suppose. And I wonder in your experience having worked with serverless at your job or in side projects, whether that's like something you would use as a default or whether serverless is more for background jobs and things like that.[00:26:51] Swizec: I think the underlying hidden question you're asking is about cold starts and API, and the response times, is one of the concerns that people have with serverless is that if your app is not used a lot, your servers scale down to zero. So then when somebody new comes on, it can take a really long time to respond.And they're going to bail and be upset with you. One way that I've solved, that is using kind of a more JAM Stacky approach. I feel like that buzzword is still kind of in flux, but the idea is that the actual app front-end app, the client app is running off of CDNs and doesn't even touch your servers.So that first load is of the entire app and of the entire client system is really fast because it comes from a CDN that's running somewhere as close as possible to the user. And it's only the actual APIs are hitting your server. So in the, for example, if you have something like a blog, you can, most blogs are pretty static.Most of the content is very static. I use that on my blog as well. you can pre-render that when you're deploying the project. So you, you kind of, pre-render everything that's static when you deploy. And then it becomes just static files that are served from the CDN. So you get the initial article. I think if you, I haven't tested in a while, but I think if you load one of my articles on swizec.com, it's readable, like on lighthouse reports, if you look at the lighthouse where it gives you the series of screenshots, the first screenshot is already fully readable.I think that means it's probably under 30 or 40 milliseconds to get the content and start reading, but then, then it rehydrates and becomes a react app. and then when it's a react app, it can make for their API calls to the backend. So usually on user interaction, like if you have upvotes or comments or something like that, Only when the user clicks something, you then make an API call to your server, and that then calls a Lambda or Gatsby function or a Netlify cloud function, or even a Firebase function, which then then wakes up and talks to the database and does things, and usually people are a lot more forgiving of that one taking 50 milliseconds to respond instead of 10 milliseconds, but, you know, 50 milliseconds is still pretty good.And I think there were recently some experiments shared where they were comparing cold start times. And if you write your, uh, cloud functions in JavaScript, the average cold startup time is something like a hundred milliseconds. And a big part of that is because you're not wrapping this entire framework, like express or rails into your function. It's just a small function. So the server only has to load up something like, I don't know. I think my biggest cloud functions have been maybe 10 kilobytes with all of the dependencies and everything bundled in, and that's pretty fast for a server to, to load run, start no JS and start serving your request.It's way fast enough. And then if you need even more speed, you can go to rust or go, which are even faster. As long as you avoid the java, .net, C-sharp those kinds of things. It's usually fine.[00:30:36] Jeremy: One of the reasons I was curious is because I was going through the rest example you've got, where it's basically going through Amazon's API gateway, um, goes to a Lambda function written in JavaScript, and then talks to dynamoDB gives you a record back or creates a record and, I, I found that just making those calls, making a few calls, hopefully to account for the cold start I getting response times of maybe 150 to 250 milliseconds, which is not terrible, but, it's also not what I would call fast either.So I was just kind of curious, when you have a real app, like, are, are there things that you've come across where Lambda maybe might have some issues or at least there's tricks you need to do to, to work around them? [00:31:27] Swizec: Yeah. So the big problem there is, as soon as a database is involved, that tends to get. Especially if that database is not co-located with your Lambda. So it's usually, or when I've experimented, it was a really bad idea to go from a Vercel API function, talk to dynamo DB in AWS that goes over the open internet.And it becomes really slow very quickly. at my previous job, I experimented with serverless and connecting it to RDS. If you have RDS in a separate private network, then RDS is that they, the Postgres database service they have, if that's running in a separate private network, then your functions, it immediately adds 200 or 300 milliseconds to your response times.If you keep them together, it usually works a lot faster. ANd then there are ways to keeping them. Pre-warned usually it doesn't work as well as you would want. There are ways on AWS to, I forget what it's called right now, but they have now what's, some, some sort of automatic rewarming, if you really need response times that are smaller than a hundred, 200 milliseconds.But yeah, it mostly depends on what you're doing. As soon as you're making API calls or database calls. You're essentially talking to a different server that is going to be slower on a lambda then it is if you have a packaged pserver, that's running the database and the server itself on the same machine.[00:33:11] Jeremy: And are there any specific challenges related to say you mentioned RDS earlier? I know with some databases, like for example, Postgres sometimes, uh, when you have a traditional server application, the server will pool the connections. So it'll make some connection into your data database and just keep reusing them.Whereas with the Lambda is it making a new connection every time? [00:33:41] Swizec: Almost. So Lambdas. I think you can configure how long it stays warm, but what AWS tries to do is reuse your laptops. So when the Lambda wakes up, it doesn't die immediately. After that initial request, it stays, it stays alive for the next, let's say it's one minute. Or even if it's 10 minutes, it's, there's a life for the next couple of minutes.And during that time, it can accept new requests, new requests and serve them. So anything that you put in the global namespace of your phone. We'll potentially remain alive between functions and you can use that to build a connection pool to your database so that you can reuse the connections instead of having to open new connections every time.What you have to be careful with is that if you get simultaneous requests at actually simultaneous requests, not like 10 requests in 10 milliseconds, if you get 10 requests at the same millisecond, you're going to wake up multiple Lambdas and you're going to have multiple connection pools running in parallel.So it's very easy to crash your RDS server with something like AWS Lambda, because I think the default concurrency limit is a thousand Lambdas. And if each of those can have a pool of, let's say 10 requests, that's 10,000 open requests or your RDS server. And. You were probably not paying for high enough tier for the RDS server to survive that that's where it gets really tricky.I think AWS now has a service that lets you kind of offload a connection pool so that you can take your Lambda and connect it to the connection pool. And the connection pool is keeping warm connections to your server. but an even better approach is to use something like Aurora DB, which is also an on AWS or dynamo DB, which are designed from the ground up to work with serverless applications.[00:35:47] Jeremy: It's things that work, but you have to know sort of the little, uh, gotchas, I guess, that are out there. [00:35:54] Swizec: Yeah, exactly. There's sharp edges to be found everywhere. part of that is also that. serverless, isn't that old yet I think AWS Lambda launched in 2014 or 2015, which is one forever in internet time, but it's still not that long ago. So we're still figuring out how to make things better.And, it's also where, where you mentioned earlier that whether it's more appropriate for backend processes or for user-facing processes, it does work really well for backend processes because you CA you have better control over the maximum number of Lambdas that run, and you have more patience for them being slow, being slow sometimes. And so on.[00:36:41] Jeremy: It sounds like even for front end processes as long as you know, like you said, the sharp edges and you could do things like putting a CDN in front where your Lambdas don't even get hit until some later time. There's a lot of things you can do to make it where it is a good choice or a good I guess what you're saying, when you're building an application, do you default to using a serverless type of stack? [00:37:14] Swizec: Yes, for all of my side projects, I default to using serverless. Um, I have a bunch of apps running that way, even when serverless is just no servers at all. Like my blog doesn't have any cloud functions right now. It's all running from CDNs, basically. I think the only, I don't know if you could even count that as a cloud function is w my email signup forms go to an API with my email provider.So there's also not, I don't have any servers there. It's directly from the front end. I would totally recommend it if you are a startup that just got tens of millions of dollars in funding, and you are planning to have a million requests per second by tomorrow, then maybe not. That's going to be very expensive very quickly.But there's always a trade off. I think that with serverless, it's a lot easier to build in terms of dev ops and in terms of handling your infrastructure, it's, it takes a bit of a mind shift in how you're building when it comes to the actual logic and the actual, the server system that you're building.And then in terms of costs, it really depends on what you're doing. If you're a super huge company, it probably doesn't make sense to go and serverless, but if you're that. Or if you have that much traffic, you hopefully are also making enough money to essentially build your own serverless system for yourself.[00:38:48] Jeremy: For someone who's interested in trying serverless, like I know for myself when I was going through the tutorial you're using the serverless framework and it creates all these different things in AWS for you and at a high level I could follow. Okay. You know, it has the API gateway and you've got your simple queue service and DynamoDB, and the lambdas all that sort of thing.So at a high level, I could follow along. But when I log into the AWS console, not knowing a whole lot about AWS, it's creating a ton of stuff for you. And I'm wondering from your perspective for somebody who's learning about serverless, how much do they need to really dive into the AWS internals and understand what's going on there. [00:39:41] Swizec: That's a tough one because personally I try to stay away as much as possible. And especially with the serverless framework, what I like is configuring everything through the framework rather than doing it manually. Um, because there's a lot of sharp edges there as well. Where if you go in and you manually change something, then AWS can't allow serverless framework to clean up anymore and you can have ghost processes running.At Tia, we've had that as a really interesting challenge. We're not using serverless framework, we're using something called cloud formation, which is essentially. One lower level of abstraction, then serverless framework, we're doing a lot more work. We're creating a lot more work for ourselves, but that's what we have. And that's what we're working with. these decisions predate me. So I'm just going along with what we have and we wanted to have more control, because again, we have dev ops people on the team and they want more control because they also know what they're doing and we keep having trouble with, oh, we were trying to use infrastructure as code, but then there's this little part where you do have to go into the AWS console and click around a million times to find the right thing and click it.And we've had interesting issues with hanging deploys where something gets stuck on the AWS side and we can take it back. We can tear it down, we can stop it. And it's just a hanging process and you have to wait like seven hours for AWS to do. Oh, okay. Yeah. If it's been there for seven hours, it's probably not needed and then kills it and then you can deploy.So that kind of stuff gets really frustrating very quickly.[00:41:27] Jeremy: Sounds like maybe in your personal projects, you've been able to, to stick to the serverless framework abstraction and not necessarily have to understand or dive into the details of AWS and it's worked out okay for you. [00:41:43] Swizec: Yeah, exactly. it's useful to know from a high, from a high level what's there and what the different parts are doing, but I would not recommend configuring them through the, through the AWS console because then you're going to always be in the, in the AWS console. And it's very easy to get something slightly wrong.[00:42:04] Jeremy: Yeah. I mean, I know for myself just going through the handbook, just going into the console and finding out where I could look at my logs or, um, what was actually running in AWS. It wasn't that straightforward. So, even knowing the bare minimum for somebody who's new to, it was like a little daunting. [00:42:26] Swizec: Yeah, it's super daunting. And they have thousands, if not hundreds of different products on AWS. and when it comes to, like you mentioned logs, I, I don't think I put this in the handbook because I either didn't know about it yet, or it wasn't available quite yet, but serverless can all the serverless framework also let you look at logs through the servers framework.So you can say SLS function, name, logs, and it shows you the latest logs. it also lets you run functions locally to an extent. it's really useful from that perspective. And I personally find the AWS console super daunting as well. So I try to stay away as much as possible.[00:43:13] Jeremy: It's pretty wild when you first log in and you click the button that shows you the services and it's covering your whole screen. Right. And you're like, I just want to see what I just pushed. [00:43:24] Swizec: Yeah, exactly. And there's so many different ones and they're all they have these obscure names that I don't find meaningful at all.[00:43:34] Jeremy: I think another thing that I found a little bit challenging was that when I develop applications, I'm used to having the feedback cycle of writing the code, running the application or running a test and seeing like, did it work? And if it didn't, what's the stack trace, what, what happened? And I found the process of going into CloudWatch and looking at the logs and waiting for them to eventually refresh and all that to be, a little challenging. And, and, um, so I was wondering in your, your experience, um, how you've worked through, you know, how are you able to get a fast feedback loop or is this just kind of just part of it. [00:44:21] Swizec: I am very lazy when it comes to writing tests, or when it comes to fast feedback loops. I like having them I'm really bad at actually setting them up. But what I found works pretty well for serverless is first of all, if you write your backend a or if you write your cloud functions in TypeScript that immediately resolves most of the most common issues, most common sources of bugs, it makes sure that you're not using something that doesn't exist.Make sure you're not making typos, make sure you're not holding a function wrong, which I personally find very helpful because I have pretty fast and I make typos. And it's so nice to be able to say, if it's completely. I know that it's at least going to run. I'm not going to have some stupid issue of a missing semi-colon or some weird fiddly detail.So that's already a super fast feedback cycle that runs right in your IDE the next step is because you're just writing the business logic function and you know, that the function itself is going to run. You can write unit tests that treat that function as a normal function. I'm personally really bad at writing those unit tests, but they can really speed up the, the actual process of testing because you can go and you can be like, okay.So I know that the code is doing what I want it to be doing if it's running in isolation. And that, that can be pretty fast. The next step that is, uh, Another level in abstraction and and gives you more feedback is with serverless. You can locally invoke most Lambdas. The problem with locally running your Lambdas is that it's not the same environment as on AWS.And I asked one of the original developers of the same serverless framework, and he said, just forget about accurately replicating AWS on your system. There are so many dragons there it's never going to work. and I had an interesting example about that when I was building a little project for my girlfriend that sends her photos from our relationship to an IOT device every day or something like that.It worked when I ran SLS invoke and it ran and it even called all of the APIs and everything worked. It was amazing. And then when I deployed it, it didn't work and it turned out that it was a permissions issue. I forgot to give myself a specific, I am role for something to work. That's kind of like a stair-stepping process of having fast feedback cycles first, if it compiles, that means that you're not doing anything absolutely wrong.If the tests are running, that means it's at least doing what you think it's doing. If it's invoking locally, it means that you're holding the API APIs and the third-party stuff correctly. And then the last step is deploying it to AWS and actually running it with a curl or some sort of request and seeing if it works in production.And that then tells you if it's actually going to work with AWS. And the nice thing there is because uh serverless framework does this. I think it does a sort of incremental deploys. The, that cycle is pretty fast. You're not waiting half an hour for your C code pipeline or for your CIO to run an integration test, to do stuff.One minute, it takes one minute and it's up and you can call it and you immediately see if it's working.[00:47:58] Jeremy: Basically you're, you're trying to do everything you can. Static typing and, running tests just on the functions. But I guess when it comes down to it, you really do have to push everything, update AWS, have it all run, um, in order to, to really know. Um, and so I guess it's, it's sort of a trade-off right. Versus being able to, if you're writing a rails application and you've got all your dependencies on your machine, um, you can spin it up and you don't really have to wait for it to, to push anywhere, but, [00:48:36] Swizec: Yeah. But you still don't know if, what if your database is misconfigured in production?[00:48:42] Jeremy: right, right. So it's, it's never, never the same as production. It's just closer. Right? Yeah. Yeah, I totally get When you don't have the real services or the real databases, then there's always going to be stuff that you can miss. Yeah, [00:49:00] Swizec: Yeah. it's not working until it's working in production.[00:49:03] Jeremy: That's a good place to end it on, but is there anything else you want to mention before we go?[00:49:10] Swizec: No, I think that's good. Uh, I think we talked about a lot of really interesting stuff.[00:49:16] Jeremy: Cool. Well, Swiz, thank you so much for chatting with me today. [00:49:19] Swizec: Yeah. Thank you for having me.

amazon google run mac thousands iot api cio aws github apis db static devops tia javascript wiki versus django sops ide software development js gatsby software engineering cdn lambda rds json sls serverless jira typescript hooking heroku firebase postgres discoverability taking notes netlify aws lambda yaml cdns dynamodb nextjs lambdas cloudwatch swiz jeremy you jeremy it jeremy so

Episode #107: Serverless Infrastructure as Code with Ben Kehoe

Play Episode Listen Later Jun 28, 2021 79:04

About Ben KehoeBen Kehoe is a Cloud Robotics Research Scientist at iRobot and an AWS Serverless Hero. As a serverless practitioner, Ben focuses on enabling rapid, secure-by-design development of business value by using managed services and ephemeral compute (like FaaS). Ben also seeks to amplify voices from dev, ops, and security to help the community shape the evolution of serverless and event-driven designs.Twitter: @ben11kehoeMedium: ben11kehoeGitHub: benkehoeLinkedIn: ben11kehoeiRobot: www.irobot.comWatch this episode on YouTube: https://youtu.be/B0QChfAGvB0 This episode is sponsored by CBT Nuggets and Lumigo.TranscriptJeremy: Hi, everyone. I'm Jeremy Daly.Rebecca: And I'm Rebecca Marshburn.Jeremy: And this is Serverless Chats. And this is a momentous occasion on Serverless Chats because we are welcoming in Rebecca Marshburn as an official co-host of Serverless Chats.Rebecca: I'm pretty excited to be here. Thanks so much, Jeremy.Jeremy: So for those of you that have been listening for hopefully a long time, and we've done over 100 episodes. And I don't know, Rebecca, do I look tired? I feel tired.Rebecca: I've never seen you look tired.Jeremy: Okay. Well, I feel tired because we've done a lot of these episodes and we've published a new episode every single week for the last 107 weeks, I think at this point. And so what we're going to do is with you coming on as a new co-host, we're going to take a break over the summer. We're going to revamp. We're going to do some work. We're going to put together some great content. And then we're going to come back on, I think it's August 30th with a new episode and a whole new show. Again, it's going to be about serverless, but what we're thinking is ... And, Rebecca, I would love to hear your thoughts on this as I come at things from a very technical angle, because I'm an overly technical person, but there's so much more to serverless. There's so many other sides to it that I think that bringing in more perspectives and really being able to interview these guests and have a different perspective I think is going to be really helpful. I don't know what your thoughts are on that.Rebecca: Yeah. I love the tech side of things. I am not as deep in the technicalities of tech and I come at it I think from a way of loving the stories behind how people got there and perhaps who they worked with to get there, the ideas of collaboration and community because nothing happens in a vacuum and there's so much stuff happening and sharing knowledge and education and uplifting each other. And so I'm super excited to be here and super excited that one of the first episodes I get to work on with you is with Ben Kehoe because he's all about both the technicalities of tech, and also it's actually on his Twitter, a new compassionate tech values around humility, and inclusion, and cooperation, and learning, and being a mentor. So couldn't have a better guest to join you in the Serverless Chats community and being here for this.Jeremy: I totally agree. And I am looking forward to this. I'm excited. I do want the listeners to know we are testing in production, right? So we haven't run any unit tests, no integration tests. I mean, this is straight test in production.Rebecca: That's the best practice, right? Total best practice to test in production.Jeremy: Best practice. Right. Exactly.Rebecca: Straight to production, always test in production.Jeremy: Push code to the cloud. Here we go.Rebecca: Right away.Jeremy: Right. So if it's a little bit choppy, we'd love your feedback though. The listeners can be our observability tool and give us some feedback and we can ... And hopefully continue to make the show better. So speaking of Ben Kehoe, for those of you who don't know Ben Kehoe, I'm going to let him introduce himself, but I have always been a big fan of his. He was very, very early in the serverless space. I read all his blogs very early on. He was an early AWS Serverless Hero. So joining us today is Ben Kehoe. He is a cloud robotics research scientist at iRobot, as I said, an AWS Serverless Hero. Ben, welcome to the show.Ben: Thanks for having me. And I'm excited to be a guinea pig for this new exciting format.Rebecca: So many observability tools watching you be a guinea pig too. There's lots of layers to this.Jeremy: Amazing. All right. So Ben, why don't you tell the listeners for those that don't know you a little bit about yourself and what you do with serverless?Ben: Yeah. So I mean, as with all software, software is people, right? It's like Soylent Green. And so I'm really excited for this format being about the greater things that technology really involves in how we create it and set it up. And serverless is about removing the things that don't matter so that you can focus on the things that do matter.Jeremy: Right.Ben: So I've been interested in that since I learned about it. And at the time saw that I could build things without running servers, without needing to deal with the scaling of stuff. I've been working on that at iRobot for over five years now. As you said early on in serverless at the first serverless con organized by A Cloud Guru, now plural sites.Jeremy: Right.Ben: And yeah. And it's been really exciting to see it grow into the large-scale community that it is today and all of the ways in which community are built like this podcast.Jeremy: Right. Yeah. I love everything that you've done. I love the analogies you've used. I mean, you've always gone down this road of how do you explain serverless in a way to show really the adoption of it and how people can take that on. Serverless is a ladder. Some of these other things that you would ... I guess the analogies you use were always great and always helped me. And of course, I don't think we've ever really come to a good definition of serverless, but we're not talking about that today. But ...Ben: There isn't one.Jeremy: There isn't one, which is also a really good point. So yeah. So welcome to the show. And again, like I said, testing in production here. So, Rebecca, jump in when you have questions and we'll beat up Ben from both sides on this, but, really ...Rebecca: We're going to have Ben from both sides.Jeremy: There you go. We'll embrace him from both sides. There you go.Rebecca: Yeah. Yeah.Jeremy: So one of the things though that, Ben, you have also been very outspoken on which I absolutely love, because I'm in very much closely aligned on this topic here. But is about infrastructure as code. And so let's start just quickly. I mean, I think a lot of people know or I think people working in the cloud know what infrastructure as code is, but I also think there's a lot of people who don't. So let's just take a quick second, explain what infrastructure as code is and what we mean by that.Ben: Sure. To my mind, infrastructure as code is about having a definition of the state of your infrastructure that you want to see in the cloud. So rather than using operations directly to modify that state, you have a unified definition of some kind. I actually think infrastructure is now the wrong word with serverless. It used to be with servers, you could manage your fleet of servers separate from the software that you were deploying onto the servers. And so infrastructure being the structure below made sense. But now as your code is intimately entwined in the rest of your resources, I tend to think of resource graph definitions rather than infrastructure as code. It's a less convenient term, but I think it's worth understanding the distinction or the difference in perspective.Jeremy: Yeah. No, and I totally get that. I mean, I remember even early days of cloud when we were using the Chefs and the Puppets and things like that, that we were just deploying the actual infrastructure itself. And sometimes you deploy software as part of that, but it was supporting software. It was the stuff that ran in the runtime and some of those and some configurations, but yeah, but the application code that was a whole separate process, and now with serverless, it seems like you're deploying all those things at the same time.Ben: Yeah. There's no way to pick it apart.Jeremy: Right. Right.Rebecca: Ben, there's something that I've always really admired about you and that is how strongly you hold your opinions. You're fervent about them, but it's also because they're based on this thorough nature of investigation and debate and challenging different people and yourself to think about things in different ways. And I know that the rest of this episode is going to be full with a lot of opinions. And so before we even get there, I'm curious if you can share a little bit about how you end up arriving at these, right? And holding them so steady.Ben: It's a good question. Well, I hope that I'm not inflexible in these strong opinions that I hold. I mean, it's one of those strong opinions loosely held kind of things that new information can change how you think about things. But I do try and do as much thinking as possible so that there's less new information that I have to encounter to change an opinion.Rebecca: Yeah. Yeah.Ben: Yeah. I think I tend to try and think about how people ... But again, because it's always people. How people interact with the technology, how people behave, how organizations behave, and then how technology fits into that. Because sometimes we talk about technology in a vacuum and it's really not. Technology that works for one context doesn't work for another. I mean, a lot of my strong opinions are that there is no one right answer kind of a thing, or here's a framework for understanding how to think about this stuff. And then how that fits into a given person is just finding where they are in that more general space. Does that make sense? So it's less about finding out here's the one way to do things and more about finding what are the different options, how do you think about the different options that are out there.Rebecca: Yeah, totally makes sense. And I do want to compliment you. I do feel like you are very good at inviting new information in if people have it and then you're like, "Aha, I've already thought of that."Ben: I hope so. Yeah. I was going to say, there's always a balance between trying to think ahead so that when you discover something you're like, "Oh, that fits into what I thought." And the danger of that being that you're twisting the information to fit into your preexisting structures. I hope that I find a good balance there, but I don't have a principle way of determining that balance or knowing where you are in that it's good versus it's dangerous kind of spectrum.Jeremy: Right. So one of the opinions that you hold that I tend to agree with, I have some thoughts about some of the benefits, but I also really agree with the other piece of it. And this really has to do with the CDK and this idea of using CloudFormation or any sort of DSL, maybe Terraform, things like that, something that is more domain-specific, right? Or I guess declarative, right? As opposed to something that is imperative like the CDK. So just to get everybody on the same page here, what is the top reasons why you believe, or you think that DSL approach is better than that iterative approach or interpretive approach, I guess?Ben: Yeah. So I think we get caught up in the imperative versus declarative part of it. I do think that declarative has benefits that can be there, but the way that I think about it is with the CDK and infrastructure as code in general, I'm like mildly against imperative definitions of resources. And we can get into that part, but that's not my smallest objection to the CDK. I'm moderately against not being able to enforce deterministic builds. And the CDK program can do anything. Can use a random number generator and go out to the internet to go ask a question, right? It can do anything in that program and that means that you have no guarantees that what's coming out of it you're going to be able to repeat.So even if you check the source code in, you may not be able to go back to the same infrastructure that you had before. And you can if you're disciplined about it, but I like tools that help give you guardrails so that you don't have to be as disciplined. So that's my moderately against. My strongly against piece is I'm strongly against developer intent remaining client side. And this is not an inherent flaw in the CDK, is a choice that the CDK team has made to turn organizational dysfunction in AWS into ownership for their customers. And I don't think that's a good approach to take, but that's also fixable.So I think if we want to start with the imperative versus declarative thing, right? When I think about the developers expressing an intent, and I want that intent to flow entirely into the cloud so that developers can understand what's deployed in the cloud in terms of the things that they've written. The CDK takes this approach of flattening it down, flattening the richness of the program the developer has written into ... They think of it as assembly language. I think that is a misinterpretation of what's happening. The assembly language in the process is the imperative plan generated inside the CloudFormation engine that says, "Here's how I'm going to take this definition and turn it into an actual change in the cloud.Jeremy: Right.Ben: They're just translating between two definition formats in CDK scene. But it's a flattening process, it's a lossy process. So then when the developer goes to the Console or the API has to go say, "What's deployed here? What's going wrong? What do I need to fix?" None of it is framed in terms of the things that they wrote in their original language.Jeremy: Right.Ben: And I think that's the biggest problem, right? So drift detection is an important thing, right? What happened when someone went in through the Console? Went and tweaked some stuff to fix something, and now it's different from the definition that's in your source repository. And in CloudFormation, it can tell you that. But what I would want if I was running CDK is that it should produce another CDK program that represents the current state of the cloud with a meaningful file-level diff with my original program.Jeremy: Right. I'm just thinking this through, if I deploy something to CDK and I've got all these loops and they're generating functions and they're using some naming and all this kind of stuff, whatever, now it produces this output. And again, my naming of my functions might be some function that gets called to generate the names of the function. And so now I've got all of these functions named and I have to go in. There's no one-to-one map like you said, and I can imagine somebody who's not familiar with CloudFormation which is ultimately what CDK synthesizes and produces, if you're not familiar with what that output is and how that maps back to the constructs that you created, I can see that as being really difficult, especially for younger developers or developers who are just getting started in that.Ben: And the CDK really takes the attitude that it's going to hide those things from those developers rather than help them learn it. And so when they do have to dive into that, the CDK refers to it as an escape hatch.Jeremy: Yeah.Ben: And I think of escape hatches on submarines, where you go from being warm and dry and having air to breathe to being hundreds of feet below the sea, right? It's not the sort of thing you want to go through. Whereas some tools like Amplify talk about graduation. In Amplify they aim to help you understand the things that Amplify is doing for you, such that when you grow beyond what Amplify can provide you, you have the tools to do that, to take the thing that you built and then say, "Okay, I know enough now that I understand this and can add onto it in ways that Amplify can't help with."Jeremy: Right.Ben: Now, how successful they are in doing that is a separate question I think, but the attitude is there to say, "We're looking to help developers understand these things." Now the CDK could also if the CDK was a managed service, right? Would not need developers to understand those things. If you could take your program directly to the cloud and say, "Here's my program, go make this real." And when it made it real, you could interact with the cloud in an understanding where you could list your deployed constructs, right? That you can understand the program that you wrote when you're looking at the resources that are deployed all together in the cloud everywhere. That would be a thing where you don't need to learn CloudFormation.Jeremy: Right.Ben: Right? That's where you then end up in the imperative versus declarative part where, okay, there's some reasons that I think declarative is better. But the major thing is that disconnect that's currently built into the way that CDK works. And the reason that they're doing that is because CloudFormation is not moving fast enough, which is not always on the CloudFormation team. It's often on the service teams that aren't building the resources fast enough. And that's AWS's problem, AWS as an entire company, as an organization. And this one team is saying, "Well, we can fix that by doing all this client side."What that means is that the customers are then responsible for all the things that are happening on the client side. The reason that they can go fast is because the CDK team doesn't have ownership of it, which just means the ownership is being pushed on customers, right? The CDK deploys Lambda functions into your account that they don't tell you about that you're now responsible for. Right? Both the security and operations of. If there are security updates that the CDK team has to push out, you have to take action to update those things, right? That's ownership that's being pushed onto the customer to fix a lack of ACM certificate management, right?Jeremy: Right. Right.Ben: That is ACM not building the thing that's needed. And so AWS says, "Okay, great. We'll just make that the customer's problem."Jeremy: Right.Ben: And I don't agree with that approach.Rebecca: So I'm sure as an AWS Hero you certainly have pretty good, strong, open communication channels with a lot of different team members across teams. And I certainly know that they're listening to you and are at least hearing you, I should say, and watching you and they know how you feel about this. And so I'm curious how some of those conversations have gone. And some teams as compared to others at AWS are really, really good about opening their roadmap or at least saying, "Hey, we hear this, and here's our path to a solution or a success." And I'm curious if there's any light you can shed on whether or not those conversations have been fruitful in terms of actually being able to get somewhere in terms of customer and AWS terms, right? Customer obsession first.Ben: Yeah. Well, customer obsession can mean two things, right? Customer obsession can mean giving the customer what they want or it can mean giving the customer what they need and different AWS teams' approach fall differently on that scale. The reason that many of those things are not available in CloudFormation is that those teams are ... It could be under-resourced. They could have a larger majority of customer that want new features rather than infrastructure as code support. Because as much as we all like infrastructure as code, there are many, many organizations out there that are not there yet. And with the CDK in particular, I'm a relatively lone voice out there saying, "I don't think this ownership that's being pushed onto the customer is a good thing." And there are lots of developers who are eating up CDK saying, "I don't care."That's not something that's in their worry. And because the CDK has been enormously successful, right? It's fixing these problems that exists. And I don't begrudge them trying to fix those problems. I think it's a question of do those developers who are grabbing onto those things and taking them understand the full total cost of ownership that the CDK is bringing with it. And if they don't understand it, I think AWS has a responsibility to understand it and work with it to help those customers either understand it and deal with it, right? Which is where the CDK takes this approach, "Well, if you do get Ops, it's all fine." And that's somewhat true, but also many developers who can use the CDK do not control their CI/CD process. So there's all sorts of ways in which ... Yeah, so I think every team is trying to do the best that they can, right?They're all working hard and they all have ... Are pulled in many different directions by customers. And most of them are making, I think, the right choices given their incentives, right? Given what their customers are asking for. I think not all of them balance where customers ... meeting customers where they are versus leading them where they should, like where they need to go as well as I would like. But I think ... I had a conclusion to that. Oh, but I think that's always a debate as to where that balance is. And then the other thing when I talk about the CDK, that my ideal audience there is less AWS itself and more AWS customers ...Rebecca: Sure.Ben: ... to understand what they're getting into and therefore to demand better of AWS. Which is in general, I think, the approach that I take with AWS, is complaining about AWS in public, because I do have the ability to go to teams and say, "Hey, I want this thing," right? There are plenty of teams where I could just email them and say, "Hey, this feature could be nice", but I put it on Twitter because other people can see that and say, "Oh, that's something that I want or I don't think that's helpful," right? "I don't care about that," or, "I think it's the wrong thing to ask for," right? All of those things are better when it's not just me saying I think this is a good thing for AWS, but it being a conversation among the community differently.Rebecca: Yeah. I think in the spirit too of trying to publicize types of what might be best next for customers, you said total cost of ownership. Even though it might seem silly to ask this, I think oftentimes we say the words total cost of ownership, but there's actually many dimensions to total cost of ownership or TCO, right? And so I think it would be great if you could enumerate what you think of as total cost of ownership, because there might be dimensions along that matrices, matrix, that people haven't considered when they're actually thinking about total cost of ownership. They're like, "Yeah, yeah, I got it. Some Ops and some security stuff I have to do and some patches," but they might only be thinking of five dimensions when you're like, "Actually the framework is probably 10 to 12 to 14." And so if you could outline that a bit, what you mean when you think of a holistic total cost of ownership, I think that could be super helpful.Ben: I'm bad at enumeration. So I would miss out on dimensions that are obvious if I was attempting to do that. But I think a way that I can, I think effectively answer that question is to talk about some of the ways in which we misunderstand TCO. So I think it's important when working in an organization to think about the organization as a whole, not just your perspective and that your team's perspective in it. And so when you're working for the lowest TCO it's not what's the lowest cost of ownership for my team if that's pushing a larger burden onto another team. Now if it's reducing the burden on your team and only increasing the burden on another team a little bit, that can be a lower total cost of ownership overall. But it's also something that then feeds into things like political capital, right?Is that increased ownership that you're handing to that team something that they're going to be happy with, something that's not going to cause other problems down the line, right? Those are the sorts of things that fit into that calculus because it's not just about what ... Moving away from that topic for a second. I think about when we talk about how does this increase our velocity, right? There's the piece of, "Okay, well, if I can deploy to production faster, right? My feedback loop is faster and I can move faster." Right? But the other part of that equation is how many different threads can you be operating on and how long are those threads in time? So when you're trying to ship a feature, if you can ship it and then never look at it again, that means you have increased bandwidth in the future to take on other features to develop other new features.And so even if you think about, "It's going to take me longer to finish this particular feature," but then there's no maintenance for that feature, that can be a lower cost of ownership in time than, "I can ship it 50% faster, but then I'm going to periodically have to revisit it and that's going to disrupt my ability to ship other things," right? So this is where I had conversations recently about increasing use of Step Functions, right? And being able to replace Lambda functions with Step Functions express workflows because you never have to go back to those Lambdas and update dependencies in them because dependent bot has told you that you need to or a version of Python is getting deprecated, right? All of those things, just if you have your Amazon States Language however it's been defined, right?Once it's in there, you never have to touch it again if nothing else changes and that means, okay, great, that piece is now out of your work stream forever unless it needs to change. And that means that you have more bandwidth for future things, which serverless is about in general, right? Of say, "Okay, I don't have to deal with this scaling problems here. So those scaling things. Once I have an auto-scaling group, I don't have to go back and tweak it later." And so the same thing happens at the feature level if you build it in ways that allow you to do that. And so I think that's one of the places where when we focus on, okay, how fast is this getting me into production, it's okay, but how often do you have to revisit it ...Jeremy: Right. And so ... So you mentioned a couple of things in there, and not only in that question, but in the previous questions as you were talking about the CDK in general, and I am 100% behind you on this idea of deterministic builds because I want to know exactly what's being deployed. I want to be able to audit that and map that back. And you can audit, I mean, you could run CDK synth and then audit the CloudFormation and test against certain things. But if you are changing stuff, right? Then you have to understand not only the CDK but also the CloudFormation that it actually generates. But in terms of solving problems, some of the things that the CDK does really, really well, and this is something where I've always had this issue with just trying to use raw CloudFormation or Serverless Framework or SAM or any of these things is the fact that there's a lot of boilerplate that you often have to do.There's ways that companies want to do something specifically. I basically probably always need 1,400 lines of CloudFormation. And for every project I do, it's probably close to the same, and then add a little bit more to actually make it adaptive for my product. And so one thing that I love about the CDK is constructs. And I love this idea of being able to package these best practices for your company or these compliance requirements, excuse me, compliance requirements for your company, whatever it is, be able to package these and just hand them to developers. And so I'm just curious on your thoughts on that because that seems like a really good move in the right direction, but without the deterministic builds, without some of these other problems that you talked about, is there another solution to that that would be more declarative?Ben: Yeah. In theory, if the CDK was able to produce an artifact that represented all of the non-deterministic dependencies that it had, right? That allowed you to then store that artifacts as you'd come back and put that into the program and say, "I'm going to get out the same thing," but because the CDK doesn't control upstream of it, the code that the developers are writing, there isn't a way to do that. Right? So on the abstraction front, the constructs are super useful, right? CloudFormation now has modules which allow you to say, "Here's a template and I'm going to represent this as a CloudFormation type itself," right? So instead of saying that I need X different things, I'm going to say, "I packaged that all up here. It is as a type."Now, currently, modules can only be playing CloudFormation templates and there's a lot of constraints in what you can express inside a CloudFormation template. And I think the answer for me is ... What I want to see is more richness in the CloudFormation language, right? One of the things that people do in the CDK that's really helpful is say, "I need a copy of this in every AZ."Jeremy: Right.Ben: Right? There's so much boilerplate in server-based things. And CloudFormation can't do that, right? But if you imagine that it had a map function that allowed you to say, "For every AZ, stamp me out a copy of this little bit." And then that the CDK constructs allowed to translate. Instead of it doing all this generation only down to the L one piece, instead being able to say, "I'm going to translate this into more rich CloudFormation templates so that the CloudFormation template was as advanced as possible."Right? Then it could do things like say, "Oh, I know we need to do this in every AZ, I'm going to use this map function in the CloudFormation template rather than just stamping it out." Right? And so I think that's possible. Now, modules should also be able to be defined as CDK programs. Right? You should be able to register a construct as a CloudFormation tag.Jeremy: It would be pretty cool.Ben: There's no reason you shouldn't be able to. Yeah. Because I think the declarative versus imperative thing is, again, not the most important piece, it's how do we move ... It's shifting right in this case, right? That how do you shift what's happening with the developer further into the process of deployment so that more of their context is present? And so one of the things that the CDK does that's hard to replicate is have non-local effects. And this is both convenient and I think of code smell often.So you can pass a bucket resource from another stack into a piece of code in your CDK program that's creating a different stack and you say, "Oh great, I've got this Lambda function, it needs permissions to that bucket. So add permissions." And it's possible for the CDK programs to either be adding the permissions onto the IAM role of that function, or non-locally adding to that bucket's resource policy, which is weird, right? That you can be creating a stack and the thing that you do to that stack or resource or whatever is not happening there, it's happening elsewhere. I don't think that's a great approach, but it's certainly convenient to be able to do it in a lot of situations.Now, that's not representable within a module. A module is a contained piece of functionality that can't touch anything else. So things like SAM where you can add events onto a function that can go and create ... You create the API events on different functions and then SAM aggregates them and creates an API gateway for you. Right? If AWS serverless function was a module, it couldn't do that because you'd have these in different places and you couldn't aggregate something between all of them and put them in the top-level thing, right?This is what CloudFormation macros enable, but they don't have a... There's no proper interface to them, right? They don't define, "This is what I'm doing. This is the kind of resources I can create." There's none of that that would help you understand them. So they're infinitely flexible, but then also maybe less principled for that reason. So I think there are ways to evolve, but it's investment in the CloudFormation language that allows us to shift that burden from being a flattening inside client-side code from the developer and shifting it to be able to be represented in the cloud.Jeremy: Right. Yeah. And I think from that standpoint too if we go back to the solving people's problems standpoint, that everything you explained there, they're loaded with nuances, it's loaded with gotchas, right? Like, "Oh, you can't do this, you can't do that." So that's just why I think the CDK is so popular because it's like you can do so much with it so quickly and it's very, very fast. And I think that trade-off, people are just willing to make it.Ben: Yes. And that's where they're willing to make it, do they fully understand the consequences of it? Then does AWS communicate those consequences well? Before I get into that question of, okay, you're a developer that's brand new to AWS and you've been tasked with standing up some Kubernetes cluster and you're like, "Great. I can use a CDK to do this." Something is malfunctioning. You're also tasked with the operations and something is malfunctioning. You go in through the Console and maybe figure out all the things that are out there are new to you because they're hidden inside L3 constructs, right?You're two levels down from where you were defining what you want, and then you find out what's wrong and you have no idea how to turn that into a change in your CDK program. So instead of going back and doing the thing that infrastructure as code is for, which is tweaking your program to go fix the problem, you go and you tweak it in the Console ...Jeremy: Right. Which you should never do.Ben: ... and you fix it that way. Right. Well, and that's the thing that I struggle with, with the CDK is how does the CDK help the developer who's in that situation? And I don't think they have a good story around that. Now, I don't know. I haven't talked with enough junior developers who are using the CDK about how often they get into that situation. Right? But I always say client-side code is not a replacement for a managed service because when it's client-side code, you still own the result.Jeremy: Right.Ben: If a particular CDK construct was a managed service in AWS, then all of the resources that would be created underneath AWS's problem to make work. And the interface that the developer has is the only level of ownership that they have. Fargate is this. Because you could do all the things that Fargate does with a CDK construct, right? Set up EC2, do all the things, and represent it as something that looks like Fargate in your CDK program. But every time your EC2 fleet is unhealthy that's your problem. With Fargate, that's AWS's problem. If we didn't have Fargate, that's essentially what CDK would be trying to do for ECS.And I think we all recognize that Fargate is very necessary and helpful in that case, right? And I just want that for all the things, right? Whenever I have an abstraction, if it's an abstraction that I understand, then I should have a way of zooming into it while not having to switch languages, right? So that's where you shouldn't dump me out the CloudFormation to understand what you're doing. You should help me understand the low-level things in the same language. And if it's not something that I need to understand, it should be a managed service. It shouldn't be a bunch of stuff that I still own that I haven't looked at.Jeremy: Makes sense. Got a question, Rebecca? Because I was waiting for you to jump in.Rebecca: No, but I was going to make a joke, but then the joke passed, and then I was like, "But should I still make it?" I was going to be like, "Yeah, but does the CDK let you test in production?" But that was a 32nd ago joke and then I was really wrestling with whether or not I should tell it, but I told it anyway, hopefully, someone gets a laugh.Ben: Yeah. I mean, there's the thing that Charity Majors says, right? Which is that everybody tests in production. Some people are lucky enough to have a development environment in production. No, sorry. I said that the wrong way. It's everybody has a test environment. Some people are lucky enough that it's not in production.Rebecca: Yeah. Swap that. Reverse it. Yeah.Ben: Yeah.Jeremy: All right. So speaking of talking to developers and getting feedback from them, so I actually put a question out on Twitter a couple of weeks ago and got a lot of really interesting reactions. And essentially I asked, "What do you love or hate about infrastructure as code?" And there were a lot of really interesting things here. I don't know, maybe it might be fun to go through a couple of these and get your thoughts on them. So this is probably not a great one to start with, but I thought it was interesting because this I think represents the frustration that a lot of us feel. And it was basically that they love that automation minimizes future work, right? But they hate that it makes life harder over time. And that pretty much every approach to infrastructure in, sorry, yeah, infrastructure in code at the present is flawed, right? So really there are no good solutions right now.Ben: Yeah. CloudFormation is still a pain to learn and deal with. If you're operating in certain IDEs, you can get tab completion.Jeremy: Right.Ben: If you go to CDK you get tab completion, which is, I think probably most of the value that developers want out of it and then the abstraction, and then all the other fancy things it does like pipelines, which again, should be a managed service. I do think that person is absolutely right to complain about how difficult it is. That there are many ways that it could be better. One of the things that I think about when I'm using tools is it's not inherently bad for a tool to have some friction to use it.Jeremy: Right.Ben: And this goes to another infrastructure as code tool that goes even further than the CDK and says, "You can define your Lambda code in line with your infrastructure definition." So this is fine with me. And there's some other ... I think Punchcard also lets you do some of this. Basically extracts out the bits of your code that you say, "This is a custom thing that glues together two things I'm defining in here and I'll make that a Lambda function for you." And for me, that is too little friction to defining a Lambda function.Because when I define a Lambda function, just going back to that bringing in ownership, every time I add a Lambda function, that's something that I own, that's something that I have to maintain, that I'm responsible for, that can go wrong. So if I'm thinking about, "Well, I could have API Gateway direct into DynamoDB, but it'd be nice if I could change some of these fields. And so I'm just going to drop in a little sprinkle of code, three lines of code in between here to do some transformation that I want." That is all of sudden an entire Lambda function you've brought into your infrastructure.Jeremy: Right. That's a good point.Ben: And so I want a little bit of friction to do that, to make me think about it, to make me say, "Oh, yeah, downstream of this decision that I am making, there are consequences that I would not otherwise think about if I'm just trying to accomplish the problem," right? Because I think developers, humans, in general, tend to be a bit shortsighted when you have a goal especially, and you're being pressured to complete that goal and you're like, "Okay, well I can complete it." The consequences for later are always a secondary concern.And so you can change your incentives in that moment to say, "Okay, well, this is going to guide me to say, "Ah, I don't really need this Lambda function in here. Then I'm better off in the long term while accomplishing that goal in the short term." So I do think that there is a place for tools making things difficult. That's not to say that the amount of difficult that infrastructure as code is today is at all reasonable, but I do think it's worth thinking about, right?I'd rather take on the pain of creating an ASL definition by hand for express workflow than the easier thing of writing Lambda code. Because I know the long-term consequences of that. Now, if that could be flipped where it was harder to write something that took more ownership, it'd be just easy to do, right? You'd always do the right thing. But I think it's always worth saying, "Can I do the harder thing now to pay off to pay off later?"Jeremy: And I always call those shortcuts "tomorrow-Jeremy's" problem. That's how I like to look at those.Ben: Yeah. Yes.Jeremy: And the funny thing about that too is I remember right when EventBridge came out and there was no CloudFormation support for a long time, which was super frustrating. But Serverless Framework, for example, implemented a custom resource in order to do that. And I remember looking at a clean stack and being like, "Why are there two Lambda functions there that I have no idea?" I'm like, "I didn't publish ..." I honestly thought my account was compromised that somebody had published a Lambda function in there because I'm like, "I didn't do that." And then it took me a while to realize, I'm like, "Oh, this is what this is." But if it is that easy to just create little transform functions here and there, I can imagine there being thousands of those in your account without anybody knowing that they even exist.Ben: Now, don't get me wrong. I would love to have the ability to drop in little transforms that did not involve Lambda functions. So in other words, I mean, the thing that VTL does for API Gateway, REST APIs but without it being VTL and being ... Because that's hard and then also restricted in what you can do, right? It's not, "Oh, I can drop in arbitrary code in here." But enough to say, "Oh, I want to flip ... These fields should go from a key-value mapping to a list of key-value, right? In the way that it addresses inconsistent with how tags are defined across services, those kinds of things. Right? And you could drop that in any service, but once you've defined it, there's no maintenance for you, right?You're writing JavaScript. It's not actually a JavaScript engine underneath or something. It's just getting translated into some big multi-tenant fancy thing. And I have a hypothesis that that should be possible. You should be able to do it where you could even do it in the parsing of JSON, being able to do transforms without ever having to have the whole object in memory. And if we could get that then, "Oh, sure. Now I have sprinkled all over the place all of these little transforms." Now there's a little bit of overhead if the transform is defined correctly or not, right? But once it is, then it just works. And having all those little transforms everywhere is then fine, right? And that incentive to make it harder it doesn't need to be there because it's not bringing ownership with it.Rebecca: Yeah. It's almost like taking the idea of tomorrow-Jeremy's problem and actually switching it to say tomorrow-Jeremy's celebration where tomorrow-Jeremy gets to look back at past-Jeremy and be like, "Nice. Thank you for making that decision past-Jeremy." Because I think we often do look at it in terms of tomorrow-Jeremy will think of this, we'll solve this problem rather than how do we approach it by saying, how do I make tomorrow-Jeremy thankful for it today-Jeremy? And that's a simple language, linguistic switch, but a hard switch to actually make decisions based on.Ben: Yeah. I don't think tomorrow-Ben is ever thankful for today-Ben. I think it's tomorrow-Ben is thankful for yesterday-Ben setting up the incentives correctly so that today-Ben will do the right thing for tomorrow-Ben. Right? When I think about people, I think it's easier to convince people to accept a change in their incentives than to convince them to fight against their incentives sustainably.Jeremy: Right. And I think developers and I'm guilty of this too, I mean, we make decisions based off of expediency. We want to get things done fast. And when you get stuck on that problem you're like, "You know what? I'm not going to figure it out. I'm just going to write a loop or I'm going to do whatever I can do just to make it work." Another if statement here, "Isn't going to hurt anybody." All right. So let's move to ... Sorry, go ahead.Ben: We shouldn't feel bad about that.Jeremy: You're right.Ben: I was going to say, we shouldn't feel bad about that. That's where I don't want tomorrow-Ben to have to be thankful for today-Ben, because that's the implication there is that today-Ben is fighting against his incentives to do good things for tomorrow-Ben. And if I don't need to have to get to that point where just the right path is the easiest path, right? Which means putting friction in the right places than today-Ben ... It's never a question of whether today-Ben is doing something that's worth being thankful for. It's just doing the job, right?Jeremy: Right. No, that makes sense. All right. I got another question here, I think falls under the category of service discovery, which I know is another topic that you love. So this person said, "I love IaC, but hate the fuzzy boundaries where certain software awkwardly fall. So like Istio and Prometheus and cert-manager. That they can be considered part of the infrastructure, but then it's awkward to deploy them when something like Terraform due to circular dependencies relating to K8s and things like that."So, I mean, I know that we don't have to get into the actual details of that, but I think that is an important aspect of infrastructure as code where best practices sometimes are deploy a stack that has your permanent resources and then deploy a stack that maybe has your more femoral or the ones that are going to be changing, the more mutable ones, maybe your Lambda functions and some of those sort of things. If you're using Terraform or you're using some of these other services as well, you do have that really awkward mix where you're trying to use outputs from one stack into another stack and trying to do all that. And really, I mean, there are some good tools that help with it, but I mean just overall thoughts on that.Ben: Well, we certainly need to demand better of AWS services when they design new things that they need to be designed so that infrastructure as code will work. So this is the S3 bucket notification problem. A very long time ago, S3 decided that they were going to put bucket notifications as part of the S3 bucket. Well, CloudFormation at that point decided that they were going to put bucket notifications as part of the bucket resource. And S3 decided that they were going to check permissions when the notification configuration is defined so that you have to have the permissions before you create the configuration.This creates a circular dependency when you're hooking it up to anything in CloudFormation because the dependency depends on the resource policy on an SNS topic, and SQS queue or a Lambda function depends on the bucket name if you're letting CloudFormation name the bucket, which is the best practice. Then bucket name has to exist, which means the resource has to have been created. But the notification depends on the thing that's notifying, which doesn't have the names and the resource policy doesn't exist so it all fails. And this is solved in a couple of different ways. One of which is name your bucket explicitly, again, not a good practice. Another is what SAM does, which says, "The Lambda function will say I will allow all S3 buckets to invoke me."So it has a star permission in it's resource policy. So then the notification will work. None of which is good or there's custom resources that get created, right? Now, if those resources have been designed with infrastructure as code as part of the process, then it would have been obvious, "Oh, you end up with a circular pendency. We need to split out bucket notifications as a separate resource." And not enough teams are doing this. Often they're constrained by the API that they develop first ...Jeremy: That's a good point.Ben: ... they come up with the API, which often makes sense for a Console experience that they desire. So this is where API Gateway has this whole thing where you create all the routes and the resources and the methods and everything, right? And then you say, "Great, deploy." And in the Console you only need one mutable working copy of that at a time, but it means that you can't create two deployments or update two stages in parallel through infrastructure as code and API Gateway because they both talk to this mutable working copy state and would overwrite each other.And if infrastructure as code had been on their list would have been, "Oh, if you have a definition of your API, you should be able to go straight to the deployment," right? And so trying to push that upstream, which to me is more important than infrastructure as code support at launch, but people are often like, "Oh, I want CloudFormation support at launch." But that often means that they get no feedback from customers on the design and therefore make it bad. KMS asymmetric keys should have been a different resource type so that you can easily tell which key types are in your template.Jeremy: Good point. Yeah.Ben: Right? So that you can use things like CloudFormation Guard more easily on those. Sure, you can control the properties or whatever, but you should be able to think in terms of, "I have a symmetric key or an asymmetric key in here." And they're treated completely separately because you use them completely differently, right? They don't get used to the same place.Jeremy: Yeah. And it's funny that you mentioned the lacking support at launch because that was another complaint. That was quite prevalent in this thread here, was people complaining that they don't get that CloudFormation support right away. But I think you made a very good point where they do build the APIs first. And that's another thing. I don't know which question asked me or which one of these mentioned it, but there was a lot of anger over the fact that you go to the API docs or you go to the docs for AWS and it focuses on the Console and it focuses on the CLI and then it gives you the API stuff and very little mention of CloudFormation at all. And usually, you have to go to a whole separate set of docs to find the CloudFormation. And it really doesn't tie all the concepts together, right? So you get just a block of JSON or of YAML and you're like, "Am I supposed to know what everything does here?"Ben: Yeah. I assume that's data-driven. Right? And we exist in this bubble where everybody loves infrastructure as code.Jeremy: True.Ben: And that AWS has many more customers who set things up using Console, people who learn by doing it first through the Console. I assume that's true, if it's not, then the AWS has somehow gotten on the extremely wrong track. But I imagine that's how they find that they get the right engagement. Now maybe the CDK will change some of this, right? Maybe the amount of interest that is generating, we'll get it to the point where blogs get written with CDK programs being written there. I think that presents different problems about what that CDK program might hide from when you're learning about a service. But yeah, it's definitely not ... I wrote a blog for AWS and my first draft had it as CloudFormation and then we changed it to the Console. Right? And ...Jeremy: That must have hurt. Did you die a little inside when that happened?Ben: I mean, no, because they're definitely our users, right? That's the way in which they interact with data, with us and they should be able to learn from that, their company, right? Because again, developers are often not fully in control of this process.Jeremy: Right. That's a good point.Ben: And so they may not be able to say, "I want to update this through CloudFormation," right? Either because their organization says it or just because their team doesn't work that way. And I think AWS gets requests to prevent people from using the Console, but also to force people to use the Console. I know that at least one of them is possible in IAM. I don't remember which, because I've never encountered it, but I think it's possible to make people use the Console. I'm not sure, but I know that there are companies who want both, right? There are companies who say, "We don't want to let people use the API. We want to force them to use the Console." There are companies who say, "We don't want people using the Console at all. We want to force them to use the APIs."Jeremy: Interesting.Ben: Yeah. There's a lot of AWS customers, right? And there's every possible variety of organization and AWS should be serving all of them, right? They're all customers. And certainly, I want AWS to be leading the ones that are earlier in their cloud journey and on the serverless ladder to getting further but you can't leave them behind, I think it's important.Jeremy: So that people argument and those different levels and coming in at a different, I guess, level or comfortability with APIs versus infrastructure as code and so forth. There was another question or another comment on this that said, "I love the idea of committing everything that makes my solution to text and resurrect an entire solution out of nothing other than an account key. Loved the ability to compare versions and unit tests, every bit of my solution, and not having to remember that one weird setting if you're using the Console. But hate that it makes some people believe that any coder is now an infrastructure wizard."And I think this is a good point, right? And I don't 100% agree with it, but I think it's a good point that it basically ... Back to your point about creating these little transformations in Pulumi, you could do a lot of damage, I mean, good or bad, right? When you are using these tools. What are your thoughts on that? I mean, is this something where ... And again, the CDK makes it so easy for people to write these constructs pretty quickly and spin up tons of infrastructure without a lot of guard rails to protect them.Ben: So I think if we tweak the statement slightly, I think there's truth there, which isn't about the self-perception but about what they need to be. Right? That I think this is more about serverless than about infrastructure as code. Infrastructure as code is just saying that you can define it. Right? I think it's more about the resources that are in a particular definition that require that. My former colleague, Aaron Camera says, "Serverless means every developer is an architect" because you're not in that situation where the code you write goes onto something, you write the whole thing. Right?And so you do need to have those ... You do need to be an infrastructure wizard whether you're given the tools to do that and the education to do that, right? Not always, like if you're lucky. And the self-perception is again an even different thing, right? Especially if coders think that there's nothing to be learned ... If programmers, software developers, think that there's nothing to be learned from the folks who traditionally define the infrastructure, which is Ops, right? They think, "Those people have nothing to teach me because now I can do all the things that they did." Well, you can create the things that they created and it does not mean that you're as good at it ...Jeremy: Or responsible for monitoring it too. Right.Ben: ... and have the ... Right. The monitoring, the experience of saying these are the things that will come back to bite you that are obvious, right? This is how much ownership you're getting into. There's very much a long-standing problem there of devaluing Ops as a function and as a career. And for my money when I look at serverless, I think serverless is also making the software development easier because there's so much less software you need to write. You need to write less software that deals with the hard parts of these architectures, the scaling, the distributed computing problems.You still have this, your big computing problems, but you're considering them functionally rather than coding things that address them, right? And so I see a lot of operations folks who come into serverless learn or learn a new programming language or just upscale, right? They're writing Python scripts to control stuff and then they learn more about Python to be able to do software development in it. And then they bring all of that Ops experience and expertise into it and look at something and say, "Oh, I'd much rather have step functions here than something where I'm running code for it because I know how much my script break and those kinds of things when an API changes or ... I have to update it or whatever it is."And I think that's something that Tom McLaughlin talks about having come from an outside ground into serverless. And so I think there's definitely a challenge there in both directions, right? That Ops needs to learn more about software development to be more engaged in that process. Software development does need to learn much more about infrastructure and is also at this risk of approaching it from, "I know the syntax, but not the semantics, sort of thing." Right? We can create ...Jeremy: Just because I can doesn't mean I should.Ben: ... an infrastructure. Yeah.Rebecca: So Ben, as we're looping around this conversation and coming back to this idea that software is people and that really software should enable you to focus on the things that do matter. I'm wondering if you can perhaps think of, as pristine as possible, an example of when you saw this working, maybe it was while you've been at iRobot or a project that you worked on your own outside of that, but this moment where you saw software really working as it should, and that how it enabled you or your team to focus on the things that matter. If there's a concrete example that you can give when you see it working really well and what that looks like.Ben: Yeah. I mean, iRobot is a great example of this having been the company without need for software that scaled to consumer electronics volumes, right? Roomba volumes. And needing to build a IOT cloud application to run connected Roombas and being able to do that without having to gain that expertise. So without having to build a team that could deal with auto-scaling fleets of servers, all of those things was able to build up completely serverlessly. And so skip an entire level of organizational expertise, because that's just not necessary to accomplish those tasks anymore.Rebecca: It sounds quite nice.Ben: It's really great.Jeremy: Well, I have one more question here that I think could probably end up ... We could talk about for another hour. So I will only throw it out there and maybe you can give me a quick answer on this, but I actually had another Twitter thread on this not too long ago that addressed this very, very problem. And this is the idea of the feedback cycle on these infrastructure as code tools where oftentimes to deploy infrastructure changes, I mean, it just takes time. In many cases things can run in parallel, but as you said, there's race conditions and things like that, that sometimes things have to be ... They just have to be synchronous. So is this something where there are ways where you see in the future these mutations to your infrastructure or things like that potentially happening faster to get a better feedback cycle, or do you think that's just something that we're going to have to deal with for a while?Ben: Yeah, I think it's definitely a very extensive topic. I think there's a few things. One is that the deployment cycle needs to get shortened. And part of that I think is splitting dev deployments from prod deployments. In prod it's okay for it to take 30 seconds, right? Or a minute or however long because that's at the end of a CI/CD pipeline, right? There's other things that are happening as part of that. Now, you don't want that to be hours or whatever it is. Right? But it's okay for that to be proper and to fully manage exactly what's going on in a principled manner.When you're doing for development, it would be okay to, for example, change the Lambda code without going through CloudFormation to change the Lambda code, right? And this is what an architect does, is there's a notion of a dirty deploy which just packages up. Now, if your resource graph has changed, you do need to deploy again. Right? But if the only thing that's changing is your code, sure, you can go and say, "Update function code," on that Lambda directly and that's faster.But calling it a dirty deploy is I think important because that is not something that you want to do in prod, right? You don't want there to be drift between what the infrastructure as code service understands, but then you go further than that and imagine there's no reason that you actually have to do this whole zip file process. You could be R sinking the code directly, or you could be operating over SSH on the code remotely, right? There's many different ways in which the loop from I have a change in my Lambda code to that Lambda having that change could be even shorter than that, right?And for me, that's what it's really about. I don't think that local mocking is the answer. You and Brian Rue were talking about this recently. I mean, I agree with both of you. So I think about it as I want unit tests of my business logic, but my business logic doesn't deal with AWS services. So I want to unit test something that says, "Okay, I'm performing this change in something and that's entirely within my custom code." Right? It's not touching other services. It doesn't mean that I actually need adapters, right? I could be dealing with the native formats that I'm getting back from a given service, but I'm not actually making calls out of the code. I'm mocking out, "Well, here's what the response would look like."And so I think that's definitely necessary in the unit testing sense of saying, "Is my business logic correct? I can do that locally. But then is the wiring all correct?" Is something that should only happen in the cloud. There's no reason to mock API gateway into Lambda locally in my mind. You should just be dealing with the Lambda side of it in your local unit tests rather than trying to set up this multiple thing. Another part of the story is, okay, so these deploys have to happen faster, right? And then how do we help set up those end-to-end test and give you observability into it? Right? X-Ray helps, but until X-Ray can sort through all the services that you might use in the serverless architecture, can deal with how does it work in my Lambda function when it's batching from Kinesis or SQS into my function?So multiple traces are now being handled by one invocation, right? These are problems that aren't solved yet. Until we get that kind of inspection, it's going to be hard for us to feel as good about cloud development. And again, this is where I feel sometimes there's more friction there, but there's bigger payoff. Is one of those things where again, fighting against your incentives which is not the place that you want to be.Jeremy: I'm going to stop you before you disagree with me anymore. No, just kidding! So, Rebecca, you have any final thoughts or questions for Ben?Rebecca: No. I just want to say to both of you and to everyone listening that I hope your today self is celebrating your yesterday-self right now.Jeremy: Perfect. Well, Ben, thank you so much for joining us and being a guinea pig as we said on this new format that we are trying. Excellent guinea pig. Excellent.Rebecca: An excellent human too but also great guinea pig.Jeremy: Right. Right. Pretty much so. So if people want to find out more about you, read some of the stuff you're doing and working on, how do they do that?Ben: I'm on Twitter. That's the primary place. I'm on LinkedIn, I don't post much there. And then I write articles that show up on Medium.Rebecca: And just so everyone knows your Twitter handle I'll say it out loud too. It's @ben11kehoe, K-E-H-O-E, ben11kehoe.Jeremy: Right. Perfect. All right. Well, we will put all that in the show notes and hopefully people will like this new format. And again, we'd love your feedback on this, things that you'd like us to do in the future, any ideas you have. And of course, make sure you reach out to Ben. He's an amazing resource for serverless. So again, thank you for everything you do, and thank you for being on the show.Ben: Yeah. Thanks so much for having me. This was great.Rebecca: Good to see you. Thank you.

technology moving chefs code loved software medium cloud reverse infrastructure i am iot api puppets swap aha python aws amplify console prometheus faa apis javascript ops x ray sns roomba asl s3 ides kubernetes irobot lambda acm baas ci cd json serverless iac soylent green terraform cli dsl ssh ecs l3 tco rest apis kms cdk ec2 yaml k8s kinesis istio charity majors dynamodb api gateway lambdas fargate cloud guru cloudformation sqs vtl ben it tom mclaughlin serverless framework aws hero ben there eventbridge jeremy you jeremy it aws serverless hero ben right ben well cbt nuggets ben kehoe rebecca so ben thanks jeremy so

Episode #106: Building Apps on the Decentralized Web with Nader Dabit

Play Episode Listen Later Jun 21, 2021 59:04

About Nader DabitNader Dabit is a web and mobile developer, author, and Developer Relations Engineer building the decentralized future at Edge and Node. Previously, he worked as a Developer Advocate at AWS Mobile working with projects like AWS AppSync and AWS Amplify. He is also the author and editor of React Native in Action and OpenGraphQL.Nader Dabit Twitter: @dabit3Edge and Node Twitter: @edgeandnodeGraph protocol Twitter: @graphprotocolEdge and Node: edgeandnode.com Everest: everest.link YouTube: YouTube.com/naderdabitWhat is Web3? The Decentralized Internet of the Future ExplainedWatch this episode on YouTube: https://youtu.be/pSv_cCQyCPQ This episode is sponsored by CBT Nuggets and Fauna. TranscriptJeremy: Hi everyone. I'm Jeremy Daly and this is Serverless Chats. Today I am joined again by Nader Dabit. Hey Nader, thanks for joining me.Nader: Hey Jeremy. Thanks for having me.Jeremy: You are now a developer relations engineer at Edge & Node. I would love it if you could tell the listeners a little bit about yourself. I think a lot of people probably know you already, but a little bit about your background and then what Edge & Node is.Nader: Yeah, totally. My name is Nader Dabit like you mentioned, and I've been a developer for about, I guess, nine or ten years now. A lot of people might know me from my work with AWS, where I worked with the Amplify team with the front end web and mobile team, doing a lot of full stack stuff there as well as serverless. I've been working as a developer relations person, developer advocate, actually, leading the front end web and mobile team at AWS for a little over three years I was there. I was a manager for the last year and I became really, really interested in serverless while I was there. It led to me writing a book, which is Full Stack Serverless. It also just led me down the rabbit hole of managed services and philosophy and all this stuff.It's been really, really cool to learn about everything in the space. Edge & Node is my next step, I would say, in doing work and what I consider maybe a serverless area, but it's an area that a lot of people might not associate with the traditional, I would say definition of serverless or the types of companies they often associate with serverless. But Edge & Node is a company that was spun off from a team that created a decentralized API protocol, which is called the Graph protocol. And the Graph protocol started being built in 2017. It was officially launched in a decentralized way at the end of 2020. Now we are currently finalizing that migration from a hosted service to a decentralized service actually this month.A lot of really exciting things going on. We'll talk a lot about that and what all that means. But Edge & Node itself, we do support the Graph protocol, that's part of what we do, but we also build out decentralized applications ourselves. We have a couple of applications that we're building as engineers. We're also doing a lot of work within the Web3 ecosystem, which is known as the decentralized web ecosystem by investing in different people and companies and supporting different things and spreading awareness around some of the things that are going on here because it does have a lot to do with maybe the work that people are doing in the Web2 space, which would be the traditional webspace, the space that I was in before.Jeremy: Right, right. Here I am. I follow you on Twitter. Love the videos that you do on your YouTube channel. You're like a shining example of what a really good developer relations dev advocate is. You just produce so much content, things like that, and you're doing all this stuff on serverless and I'm loving it. And then all of a sudden, I see you post this thing saying, hey, I'm leaving AWS Amplify. And you mentioned something about blockchain and I'm like, okay, wait a minute. What is this that Nader is now doing? Explain to me this, or maybe explain to me and hopefully the audience as well. What is the blockchain have to do with this decentralized applications or decentralized, I guess Web3?Nader: Web3 as defined by definition, what you might see if you do some research, would be what a lot of people are talking about as the next evolution of the web as we know it. In a lot of these articles and stuff that people are trying to formalize ideas and stuff, the original web was the read-only web where we were not creators, the only creators were maybe the developers themselves. Early on, I might've gone and read a website and been able to only interact with the website by reading information. The current version that we're currently experiencing might be considered as Web2 where everyone's a creator. All of the interfaces, all of the applications that we interact with are built specifically for input. I can actually create a comment, I can upload a video, I can share stuff, and I can write to the web. And I can read.And then the next evolution, a lot of people are categorizing, yes, is Web3. It's like taking a lot of the great things that we have today and maybe improving upon those. A lot of people and everyone kind of, this is just a really, a very old discussion around some of the trade-offs that we currently make in today's web around our data, around advertising, around the way a lot of business models are created for monetization. Essentially, they all come down to the manipulation of user data and different tricks and ways to steal people's data and use that essentially to create targeted advertising. Not only does this lead to a lot of times a negative experience. I just saw a tweet yesterday that resonated a lot with me that said, "YouTube is no longer a video platform, it's now an ad platform with videos in between." And that's the way I feel about YouTube. My kids ...Jeremy: Totally.Nader: ... I have kids that use YouTube and it's interesting to watch them because they know exactly what to do when the ads come up and exactly how to time it because they're used to, ads are just part of their experience. That's just what they're used to. And it's not just YouTube, it's every site that's out there, that's a social site, Instagram, LinkedIn. I think that that's not the original vision that people had, right, for the web. I don't think this was part of it. There have been a lot of people proposing solutions, but the core fundamental problem is how these applications are engineered, but also how the applications are paid for. How do these companies pay for developers to build. It's a really complex problem that, the simplest solution is just sell ads or maybe create something like a developer platform where you're charging a weekly or monthly or yearly or something like that.I would say a lot of the ideas around Web3 are aiming to solve this exact problem. In order to do that you have to rethink how we build applications. You have to rethink how we store data. You have to rethink about how we think about identity as well, because again, how do you build an application that deals with user data without making it public in some way? Right? How do we deal with that? A lot of those problems are the things that people are thinking about and building ways to address those in this decentralized Web3 world. It became really fascinating to me when I started looking into it because I'm very passionate about what I'm doing. I really enjoy being a developer and going out and helping other people, but I always felt there was something missing because I'm sitting here and I love AWS still.In fact, I would 100% go back and work there or any of these big companies, right? Because you can't really look at a company as, in my opinion, a black or white, good or bad thing, there's companies are doing good things and bad things at the same time. For instance, at AWS, I would meet a developer, teach them something at a workshop, a year later they would contact me and be like, hey, I got my first job or I created a business, or I landed my first client. So you're actually helping improve people's lives, at the same time you're reading these articles about Amazon in the news with some of the negative stuff going on. The way that I look at it is, I can't sit there and say any company is good or bad, but I felt a lot of the applications that people were building were also, at the end goal when you hear some of these VC discussions or people raising money, a lot of the end goal for some of the people I was working with were just selling advertising.And I'm like, is this really what we're here to do? It doesn't feel fulfilling anymore when you start seeing that over and over and over. I think the really thing that fascinated me was that people are actually building applications that are monetized in a different way. And then I started diving into the infrastructure that enabled this and realized that there was a lot of similarities between serverless and how developers would deploy and build applications in this way. And it was the entry point to my rabbit hole.Jeremy: I talked to you about this and I've been reading some of the stuff that you've been putting out and trying to educate myself on some of this. It seems very much so that show Silicon Valley on HBO, right? This decentralized web and things like that, but there's kind of, and totally correct me if I'm wrong here, but I feel there's two sides of this. You've got one side that is the blockchain, that I think some people are familiar with in the, I guess in the context of cryptocurrency, right? This is a very popular use of the blockchain because you have that redundancy and you have the agreement amongst multiple places, it's decentralized. And so you have that security there around that. But there's other uses for the blockchain as well.Especially things like banking and real estate and some of those other use cases that I'd like to talk about. And then there's another side of it that is this decentralized piece. Is the decentralized piece of it like building apps? How is that related to the blockchain or are those two separate things?Nader: Yeah, absolutely. I'm a big fan of Silicon Valley. Working in tech, it's almost like every single episode resonates with you if you've been in here long enough because you've been in one of those situations. The blockchain is part of the discussion. Crypto is part of the discussion, and those things never really interested me, to be honest. I was a speculator in crypto from 2015 until now. It's been fun, but I never really looked at crypto in any other way other than that. Blockchain had a really negative, I would say, association in my mind for a long time, I just never really saw any good things that people were doing with it. I just didn't do any research, maybe didn't understand what was going on.When I started diving into it originally what really got me interested is the Graph protocol, which is one of the things that we work on at Edge & Node. I started actually understanding, why does this thing exist? Why is it there? That led me to understanding why it was there and the fact that 90% of dApps, decentralized apps in the Ethereum ecosystem are using it. And billions of queries, companies with billions of dollars in transactions are all using this stuff. I'm like, okay, this whole world exists, but why does it exist? I guess to give you an example, I guess we can talk about the Graph protocol. And there are a lot of other web, I would say Web3 or decentralized infrastructure protocols that are out there that are similar, but they all are doing similar things in the sense of how they're actually built and how they allow participation and stuff like that.When you think of something like AWS, you think of, AWS has all of these different services. I want to build an app, I need storage. I need some type of authentication layer, maybe with Cognito, and then maybe I need someplace to execute some business logic. So maybe I'll spin up some serverless functions or create an EC2 instance, whatever. You have all these building blocks. Essentially what a lot of these decentralized protocols like the Graph are doing, are building out the same types of web infrastructure, but doing so in a decentralized way. Why does that even matter? Why is that important? Well, for instance, when you live, let's say for example in another country, I don't know, in South America and outside the United States, or even in the United States in the future, you never know. Let's say that you have some application and you've said something rude about maybe the president or something like that.Let's say that for whatever reason, somebody hacks the server that you're dealing with or whatever, at the end of the day, there is a single point of failure, right? You have your data that's controlled by the cloud provider or the government can come in and they can have control over that. The idea around some of, pretty much all of the decentralized protocols is that they are built and distributed in a way that there is no single point of failure, but there's also no single point of control. That's important when you're living in areas that have to even worry about stuff like that. So maybe we don't have to worry about that as much here, but in other countries, they might.Building something like a server is not a big deal, right? With AWS, but how would you build a server and make it available for anyone in the world to basically deploy and do so in a decentralized way? I think that's the problem that a lot of these protocols are trying to solve. For the Graph in particular, if you want to build an application using data that's stored on a blockchain. There's a lot of applications out there that are basically using the blockchain for mainly, right now it's for financial, transactional reasons because a lot of the transactions actually cost a lot of money. For instance, Uniswap is one of these applications. If you want to basically query data from a blockchain, it's not as easy as querying data from a traditional server or database.For us we are used to using something like DynamoDB, or some type of SQL database, that's very optimized for queries. But on the blockchain, you're basically having these blocks that add up every time. You create a transaction, you save it. And then someone comes behind them and they save another transaction. Over time you build up this data that's aggregated over time. But let's say you want to hit that database with the, quote-unquote, database with a query and you want to retrieve data over time, or you want to have some type of filtering mechanism. You can't do that. You can't just query blockchains the way you can from a regular database. Similar to how a database basically indexes data and stores it and makes it efficient for retrieval, the Graph protocol basically does that, but for blockchain data.Anyone that wants to build an application, one of these decentralized apps on top of blockchain data has a couple of options. They can either build their own indexing server and deploy it to somewhere like AWS. That takes away the whole idea of decentralization because then you have a single point of failure again. You can query data directly from the blockchain, from your client application, which takes a very long time. Both of those are not, I would say the most optimal way to build. But also if you're building your own indexing server, every time you want to come up with a new idea also, you have to think about the resources and time that go into it. Basically, I want to come up with a new idea and test it out, I have to basically build a server index, all this data, create APIs around it. It's time-intensive.What the Graph protocol allows you to do is, as a developer you can basically define a subgraph using YAML, similar to something like cloud formation or a very condensed version of that maybe more Serverless Framework where you're defining, I want to query data from this data source, and I want to save these entities and you deploy that to the network. And that subgraph will basically then go and look into that blockchain. And will look for all the transactions that have happened, and it will go ahead and save those and make those available for public retrieval. And also, again, one of the things that you might think of is, all of this data is public. All of the data that's on the blockchain is public.Jeremy: Right. Right. All right. Let me see if I could repeat what you said and you tell me if I'm right about this. Because this was one of those things where blockchain ... you're right. To me, it had a negative connotation. Why would you use the blockchain, unless you were building your own cryptocurrency? Right. That just seemed like that's what it was for. Then when AWS comes out with QLDB or they announced that or whatever it was. I'm like, okay, so this is interesting, but why would you use it, again, unless you're building your own cryptocurrency or something because that's the only thing I could think of you would use the blockchain for.But as you said, with these blockchains now, you have highly sensitive transactions that can be public, but a real estate transaction, for example, is something really interesting, where like, we still live in a world where if Bank of America or one of these other giant banks, JPMorgan Chase or something like that gets hacked, they could wipe out financial data. Right? And I know that's backed up in multiple regions and so forth, but this is the thing where if you're doing some transaction, that you want to make sure that transaction lives forever and isn't manipulated, then the blockchain is a good place to do that. But like you said, it's expensive to write there. But it's even harder to read off the blockchain because it's that ledger, right? It's just information coming in and coming in.So event storming or if you were doing event sourcing or something that, it's that idea. The idea with these indexers are these basically separate apps that run, and again, I'm assuming that these protocols, their software, and things that you don't have to build this yourself, essentially you can just deploy these things. Right? But this will read off of the blockchain and do that aggregation for you and then make that. Basically, it caches the blockchain. Right? And makes that available to you. And that you could deploy that to multiple indexers if you wanted to. Right? And then you would have access to that data across multiple providers.Nader: Right. No single point of failure. That's exactly right. You basically deploy a very concise configuration file that defines how you want your data stored and made available. And then it goes, and it just starts at the very beginning and it queries all those blocks or reads all those blocks, saves the data in a database, and then it keeps up with additional new updates. If someone writes a new transaction after that, it also saves that and makes it available for efficient retrieval. This is just for blockchain data. This is the data layer for, but it's not just a blockchain data in the future. You can also query from IPFS, which is a file storage layer, somewhat S3. You can query from other chains other than Ethereum, which is kind of like the main chamber.In the future really what we're hoping to have is a complete API on top of all public data. Anybody that wants to have some data set available can basically deploy a subgraph and index it and then anyone can then essentially query for it. It's like when you think of public data, we're not really used to thinking of data in this way. And also I think a good thing to talk about in a moment is the types of apps that you can build because you wouldn't want to store private messages on a blockchain or something like that. Right? The types of apps that people are building right now at least are not 100% in line with everything. You can't do everything I would say right now in Web3 that you can do in Web2.There are only certain types of applications, but those applications that are successful seem to be wildly successful and have a lot of people interested in them and using them. That's the general idea, is like you have this way to basically deploy APIs and the technology that we use to query is GraphQL. That was one of the reasons that I became interested as well. Right now the main data sources are blockchains like Ethereum, but in the future, we would like to make that available to other data sources as well.Jeremy: Right. You mentioned earlier too because there are apps obviously being built on this that you said are successful. And the problem though, I think right now, because I remember I speculated a little bit with Bitcoin and I bought a whole bunch of Ripple, so I'm still hanging on to it. Ripple XPR whatever, let's go. Anyways, but it was expensive to make a transaction. Right? Reading off of the blockchain itself, I think just connecting generally doesn't cost money, but if you're, and I know there's some costs with indexers and that's how that works. But in terms of the real cost, it's writing to the blockchain. I remember moving some Bitcoin at one point, I think cost me $30 to make one transaction, to move something like that.I can see if you're writing a $300,000 real estate transaction, or maybe some really large wire transfer or something that you want to record, something that makes sense where you could charge a fee of $30 or $40 in order to do that. I can't see you doing that for ... certainly not for web streaming or click tracking or something like that. That wouldn't make sense. But even for smaller things there might be writing more to it, $30 or whatever that would be ... seems quite expensive. What's the hope around that?Nader: That was one of the biggest challenges and that was one of the reasons that when I first, I would say maybe even considered this as a technology back in the day, that I would be considering as something that would possibly be usable for the types of applications I'm used to seeing. It just was like a no-brainer, like, no. I think right now, and that's one of the things that attracted me right now to some of the things that are happening, is a lot of those solutions are finally coming to fruition for fixing those sorts of things. There's two things that are happening right now that solve that problem. One of them is, they are merging in a couple of updates to the base layer, layer one, which would be considered something like Ethereum or Bitcoin. But Ethereum is the main one that a lot of the financial stuff that I see is happening.Basically, there are two different updates that are happening, I think the main one that will make this fee transactional price go down a little bit is sharding. Sharding is basically going to increase the number of, I believe nodes that are basically able to process the transactions by some number. Basically, that will reduce the cost somewhat, but I don't think it's ever going to get it down to a usable level. Instead what the solutions seem to be right now and one of the solutions that seems to actually be working, people are using it in production really recently, this really just started happening in the last couple of months, is these layer 2 solutions. There are a couple of different layer 2 solutions that are basically layers that run on top of the layer one, which would be something like Ethereum.And they treat Ethereum as the settlement layer. It's almost like when you interact with the bank and you're running your debit card. You're probably not talking to the bank directly and they are doing that. Instead, you have something like Visa who has this layer 2 on top of the banks that are managing thousands of transactions per second. And then they take all of those transactions and they settle those in an underlying layer. There's a couple different layer 2s that seem to be really working well right now in the Ethereum ecosystem. One of those is Arbitrum and then the other is I think Matic, but I think they have a different name now. Both of those seem to be working and they bring the cost of a transaction down to a fraction of a penny.You have, instead of paying $20 or $30 for a transaction, you're now paying almost nothing. But now that's still not cheap enough to probably treat a blockchain as a traditional database, a high throughput database, but it does open the door for a lot of other types of applications. The applications that you see building on layer one where the transactions really are $5 to $20 or $30 or typically higher value transactions. Things like governance, things like financial transactions, you've heard of NFTs. And that might make sense because if someone's going to spend a thousand bucks or 500 bucks, whatever ...Jeremy: NFTs don't make sense to me.Nader: They're not my thing either, the way they're being, I would say, talked about today especially, but I think in the future, the idea behind NFTs is interesting, but yeah, I'm in the same boat as you. But still to those people, if you're paying a thousand dollars for something then that 5 or 10 or 20 bucks might make sense, but it's not going to make sense if I just want to go to an e-commerce store and pay $5 for something. Right? I think that these layer 2s are starting to unlock those potential opportunities where people can start building these true financial applications that allow these transactions to happen at the same cost or actually a lot cheaper maybe than what you're paying for a credit card transaction, or even what those vendors, right? If you're running a store, you're paying percentages to those companies.The idea around decentralization comes back to this discussion of getting rid of the middleman, and a lot of times that means getting rid of the inefficiencies. If you can offload this business logic to some type of computer, then you've basically abstracted away a lot of inefficiencies. How many billions of dollars are spent every year by banks flying their people around the world and private jets and these skyscrapers and stuff. Now, where does that money come from? It comes from the consumer and them basically taking fees. They're taking money here and there. Right? That's the idea behind technology in general. They're like whenever something new and groundbreaking comes in, it's often unforeseen, but then you look back five years later and you're like, this is a no-brainer. Right?For instance Blockbuster and Netflix, there's a million of them. I don't have to go into that. I feel this is what that is for maybe the financial institutions and how we think about finance, especially in a global world. I think this was maybe even accelerated by COVID and stuff. If you want to build an application today, imagine limiting yourself to developers in your city. Unless you're maybe in San Francisco or New York, where that might still work. If I'm here in Mississippi and I want to build an application, I'm not going to just look for developers in a 30-mile radius. That is just insane. And I don't use that word mildly, it's just wild to think about that. You wouldn't do that.Instead, you want to look in your nation, but really you might want to look around the world because you now have things like Slack and Discord and all these asynchronous ways of doing work. And you might be able to find the best developer in the world for 25% or 50% of what you would typically find locally and an easy way to pay them might just be to just send them some crypto. Right? You don't have to go find out all their banking information and do all the wiring and all this other stuff. You just open your wallet, you send them the money and that's it. It's a done deal. But that's just one thing to think about. To me when I think about building apps in Web2 versus Web3, I don't think you're going to see the Facebook or Instagram use case anytime in the next year or two. I think the killer app for right now, it's going to be financial and e-commerce stuff.But I do think in maybe five years you will see someone crack that application for, something like a social media app where we're basically building something that we use today, but maybe in a better way. And that will be done using some off-chain storage solution. You're not going to be writing all these transactions again to a blockchain. You're going to have maybe a protocol like Graph that allows you to have a distributed database that is managed by one of these networks that you can write to. I think the ideas that we're talking about now are the things that really excite me anyway.Jeremy: Let's go back to GraphQL for a second, though. If you were going to build an app on top of this, and again, that's super exciting getting those transaction fees down, because I do feel every time you try to move money between banks or it's the $3 fee, if you go to a foreign ATM and you take money out of an ATM, they charge you. Everybody wants to take a cut somewhere along, and there's probably reasons for it, but also corporate jets cost money. So that makes sense as well. But in terms of the GraphQL protocol here, so if I wanted to build an application on top of it, and maybe my application doesn't write to the blockchain, it just reads from it, with one of these indexers, because maybe I'm summing up some financial transactions or something, or I've got an app we can look things up or whatever, I'm building something.I'm querying using the GraphQL, this makes sense. I have to use one of these indexers that's aggregating that data for me. But what if I did want to write to the blockchain, can I use GraphQL to do a mutation and actually write something to the blockchain? Or do I have to write to it directly?Nader: Yeah, that's actually a really, really good question. And that's one of the things that we are currently working on with the Graph. Right now if you want to write a transaction, you typically are going to be using one of these JSON RPC wallets and using some type of client library that interacts with the wallet and signs the transaction with the private key. And then that sends the transaction to the blockchain directly. And you're talking to the blockchain and you're just using something like the Graph to query. But I think what would be ideal and what we think would be ideal, is if someone could use a single technology, a single language, and a single abstraction to do everything, not only with reading and writing but also with subscriptions for real-time updates.That's where we think the whole idea for this will ultimately be, and that's what we're working on now. Right now you can only query. And if you want to write a transaction, you basically are still going to be using something like ethers.js or Web3 or one of these other libraries that allows you to sign a transaction using your wallet. But in the future and in fact, we're already building this right now as having an end-to-end GraphQL library that allows you to write transactions as well as read. That way someone just learns a single API and it's a lot easier. It would also make it easier for developers that are coming from a traditional web background to come in because there's a little bit of learning curve for understanding how to create one of these signed providers and write the transaction. It's not that much code, but it is a new way of thinking about things.Jeremy: Well I think both of us coming from the serverless space, we know that new way of thinking about things certainly can throw a wrench in the system when a new developer is trying to pick that stuff up.Nader: Yeah.Jeremy: All right. So that's the blockchain side of things with the data piece of it. I think people could wrap their head around that. I think it makes a lot of sense. But I'm still, the decentralized, the other things that you talked about. You mentioned an S3, something that's sort of an S3 type protocol that you can use. And what are some of the other ones? I think I've written some of them down here. Acash was one, Filecoin, Livepeer. These are all different protocols or services that are hosted by the indexers, or is this a different thing than the indexers? How does that work? And then how would you use that to save data, maybe save some blob, a blob storage or something like that?Nader: Let's talk about the tokenomics idea around how crypto fits into this and how it actually powers a protocol like this. And then we'll talk about some of those other protocols. How do people actually build all this stuff and do it for, are they getting paid for it? Is it free? How does that work and how does this network actually stay up? Because everything costs money, developers' time costs money, and so on and so forth. For something like the Graph, basically during the building phase of this protocol, basically, there was white papers and there was blog posts, and there was people in Discords talking about the ideas that were here. They basically had this idea to build this protocol. And this is a very typical life cycle, I would say.You have someone that comes up with an idea, they document some of it, they start building it. And the people that start building it are going to be basically part of essentially the founding team you could think of, in the sense of they're going to be having equity. Because at the end of the day, to actually launch one of these decentralized protocols, the way that crypto comes into it, there's typically some type of a token offering. The tokens need to be for a network like this, some type of utility token to keep the network running in the future. You're not just going to create some crypto and that's it like, right? I think that's the whole idea that I thought was going on when in reality, these tokens are typically used for powering the protocol.But let's say early on you have let's say 20 developers and they all build 5% of the system, whatever percentage that you want to talk about, whatever. Let's say you have these people helping out and then you actually build the thing and you want to go ahead and launch it and you have something that's working. A lot of times what people will do is they'll basically have a token offering, where they'll basically say, okay, let's go ahead and we're going to mint X number of tokens, and we're going to put these on the market and we're going to also pay these people that helped build this system, X number of tokens, and that's going to be their payment. And then they can go and sell those or keep those or trade those or whatever they would like to do.And then you have the tokens that are then put on the public market essentially. Once you've launched the protocol, you have to have tokens to basically continue to power the protocol and fund it. There are different people that interact with the protocol in different ways. You have the indexers themselves, which are basically software engineers that are deploying whatever infrastructure to something like AWS or GCP. These people are still using these cloud providers or they're maybe doing it at their house, whatever. All you basically need is a server and you want to basically run this indexer node, which is software that is open source, and you run this node. Basically, you can go ahead and say, okay, I want to start being an indexer and I want to be one of the different nodes on the network.To do that you basically buy some GRT, Graph Token, and in our case you stake it, meaning you are putting this money up to basically affirm that you are an indexer on the protocol and you are going to be accepting subgraph developers to deploy their subgraphs to your indexer. You stake that money and then when people use the API, they're basically paying money just like they might pay money to somewhere like API gateway or AppSync. Instead, they're paying money for their subgraph and that money is paid in GRT and it's distributed to the people in the ecosystem. Like me as a developer, I'm deploying the subgraph, and then if I have a million people using it, then I make some money. That's one way to use tokens in the system.Another way is basically to, as an outside person looking in, I can say, this indexer is really, really good. They know what they're doing. They're a very strong engineer. I'm going to basically put some money into their indexer and I'm basically backing them as an indexer. And then I will also share the money that comes in from the query fees. And then there are also people that are subgraph developers, which is the stuff that I've been working with mainly, where I can basically come up with a new API. I can be like, it'd be cool if I took data from this blockchain and this file system and merged it together, and I made this really cool API that people can use to build their apps with. I can deploy that. And basically, people can signal to this subgraph using tokens. And when people do that, they can say that they believe that this is a good subgraph to use.And then when people use that, I can also make money in that way. Basically, people are using tokens to be part of the system itself, but also to use that. If I'm a front end application like Uniswap and I want to basically use the Graph, I can basically say, okay, I'm going to put a thousand dollars in GRT tokens and I'm going to be using this API endpoint, which is a subgraph. And then all of the money that I have put up as someone that's using this, is going to be taken as the people start using it. Let's say I have a million queries and each query is one, 1000th of a cent, then after those million queries are up, I've spent $100 or something like that. Kind of similar to how you might pay AWS, you're now paying, you know, subgraph developers and indexers.Jeremy: Right. Okay. That makes sense. So then that's the payment method of that. So then these other protocols that get built on top of it, the Acash and Filecoin and Livepeer. So those ...Nader: They're all operating in a very similar fashion.Jeremy: Okay. All right. And so it's ...Nader: They have some type of node software that's run and people can basically run this node on some server somewhere and make it available as part of the network. And then they can use the tokens to participate. There's Filecoin for file storage. There's also IPFS, which is actually more of, it's a completely free service, but it's also not something that's as reliable as something like S3 or Filecoin. And then you have, like you mentioned, I believe Acash, which is a way to execute arbitrary code, business logic, and stuff like that. You have Ceramic Network, which is something that you can use for authentication. You have Livepeer which is something you use for live streaming. So you have all these ideas, these decentralized services fitting in these different niches.Jeremy: Right, right. Okay. So then now you've got a bunch of people. Now you mentioned this idea of, you could say, this is a good indexer. What about bad indexers? Right?Nader: That's a really good question.Jeremy: Yeah. You're relying on people to take data off of a public blockchain, and then you're relying on them to process it correctly and give you back good data. I'm assuming they could manipulate that data if they wanted to. I don't know why, but let's say they did. Is there a way to guarantee that you're getting the correct data?Nader: Yeah. That's a whole part of how the system works. There's this whole idea and this whole, really, really deep rabbit hole of crypto-economics and how these protocols are structured to incentivize and also disincentivize. In our protocol, basically, you have this idea of slashing and this is also a fairly known and used thing in the ecosystem and in the space. It's this idea of slashing. Basically, you incentivize people to go out and find people that are serving incorrect data. And if that person finds someone that's serving incorrect data, then the person that's serving the incorrect data is, quote-unquote, slashed. And that basically means that they're not only not going to receive the money from the queries that they were serving, but they also might lose the money that they put up to be a part of the network.I mentioned you have to actually put up money to deploy an indexer to the network, that money could also be at risk. You're very, very, very much so financially disincentivized to do that. And there's actually, again, incentives in the network for people to go and find those people. It's all-around incentives, game theory, and things like that.Jeremy: Which makes a ton of sense. That's good to know. You mentioned, you threw out the number, five years from now, somebody might build the killer app or whatever, they'll figure out some of these things. Where are we with this though? Because this sounds really early, right? There's still things that need to be figured out. Again, it's public data on the blockchain. How do you see this evolving? When do you think Web3 will be more accessible to the masses?Nader: Today people are actually building really, really interesting applications that are fitting the current technology stack, what are the things that you can build? People are already building those. But when you think about the current state of the web, where you have something like Twitter, or Facebook or Instagram, where I would say, especially maybe something like Facebook, that's extremely, extremely complex with a lot of UI interaction, a lot of private data, messages and stuff. I think to build something like that, yeah, it's going to be a couple of years. And then you might not even see certain types of applications being built. I don't think there is going to be this thing where there is no longer these types of applications. There are only these new types. I think it's more of a new type of application that people are going to be building, and it's not going to be a winner takes all just like in all tech in my opinion.I wouldn't say all but in many areas of tech where you're thinking of something as a zero-sum game where I don't think this is. But I do think that the most interesting stuff is around how Web3 essentially enables native payments and how people are going to use these native payments in interesting ways that maybe we haven't thought of yet. One of the ways that you're starting to see people doing, and a lot of venture capitalists are now investing in a lot of these companies, if you look at a lot of the companies coming out of YC and a lot of the new companies that these traditional venture capitalists are investing in, are a lot of TOMS crypto companies.When you think about the financial incentives, the things that we talked about early on, let's say you want to have the next version of YouTube and you don't want to have ads. How would that even work? Right? You still need to enable payments. But there's a couple of things that could happen there. Well, first of all, if you're building an application in the way that I've talked about, where you basically have these native payments or these native tokens that can be part of the whole process now, instead of waiting 10 years to do an IPO for an application that has been around for those 10 years and then paying back all his investors and all of those people that had been basically pulling money out their pockets to take part in.What if someone that has a really interesting idea and maybe they have a really good track record, they come out with a new application and they're basically saying, okay, if you want to own a piece of this, we're going to basically create a token and you can have ownership in it. You might see people doing these ICO's, initial coin offerings, or whatever, where basically they're offering portions of the company to anyone that wants to own it and then incentivizing people to basically use those, to govern how the application is built in the future. Let's say I own 1% of this company and a proposal is put up to do something new. I can basically say, I can use that portion of my ownership to vote on things. And then people that are speculating can say, this company is doing interesting things. I'm going to buy into it, therefore driving the price up or down.Kind of like the same way that you see the traditional stock market there, but without all of the regulation and friction that comes with that. I think that's interesting and you're already seeing companies doing that. You're not seeing the majority of companies doing that or anything like that, but you are starting to see those types of things happening. And that brings around the discussion of regulations. Is ... can you even do something like that in the United States? Well, maybe, maybe not. Does that mean people are going to start building these companies elsewhere? That's an interesting discussion as well. Right now if you want to build an application this way, you need to have some type of utility that these tokens are there for. You can't just do them purely on speculation, at least right now. But I think it's going to be interesting for sure, to watch.Jeremy: Right. And I think too that, I'm just thinking if you're a bank, right? And you maybe have a bunch of private transactions that you want to keep private. Because again, I don't even know how, I don't know how we get to private transactions on the blockchain. I could see you wanting to have some transactions that were public blockchain and some that were private and maybe a hybrid approach would make sense for some companies.Nader: I think the idea that we haven't really talked about at all is identity and how identity works compared to how we're used to identity. The way that we're used to identity working is, we basically go to a new website and we're like, this looks awesome. Let me try it out. And they're like, oh wait, we need your name, your email address, your phone number, and possibly your credit card and all this other stuff. We do that over and over and over, and over time we've now given our personal information to 500 people. And then you start getting these emails, your data has been breached, every week you get one of these emails, if you're someone like me, I don't know. Maybe I'm just signing up for too much stuff. Maybe not every week, but maybe every month or two. But you're giving out your personal data.But we're used to identity as being tied to our own physical name and address and things like that. But what if identity was something that was more abstract? And I think that that's the way that you typically see identity managed in Web3. When you're dealing with authentication mechanisms, one of the most interesting things that I think that is part of this whole discussion is this idea of a single sign-on mechanism, that you own your identity and you can transfer it across all the applications and no one else is in control of it. When you use something like an Ethereum wallet, like MetaMask, for example, it's an extension you can just download and put crypto in and basically make payments on the web with. When you create a wallet, you're given a wallet address. And the wallet address is basically created using public key cryptography, where basically you start with this private key, your public key is derived from the private key, and then your address is dropped from the public key.And when you send a transaction, you basically sign the transaction with your private key and you send your public key along with the transaction, and the person that receives that can decode the transaction with the public key to verify that that's who signed the transaction. Using this public key cryptography that only you can basically sign with your own address and your own password, it's all stored on the blockchain or in some decentralized manner. Actually in this case stored on the blockchain or it depends on how you use it really, I guess. But anyway, the whole idea here is that you completely own your identity. If you never decide to associate that identity with your name and your phone number, then who knows who's sending these transactions and who knows what's going on, because why would you need to associate your own name and phone number with all of these types of things, in these situations where you're making payments and stuff like that. Right?What is the idea of a user profile anyway, and why do you actually need it? Well, you might need it on certain applications. You might need it or want it on social network, or maybe not, or you might come up with a pseudonym, because maybe you don't want to associate yourself with whatever. You might want to in other cases, but that's completely up to you and you can have multiple wallet addresses. You might have a public wallet address that you associate your name with that you are using on social media. You might have a private wallet address that you're never associating with your name, that you're using for financial transactions. It's completely up to you, but no one can change that information. One of the applications that I recently built was called Decentralized Identity. I built it and release it a few days ago.And it's an implementation of this and it's using some of these Web3 technologies. One of them is IDX. One of them is Ceramic, which is a decentralized protocol similar to the Graph but for identity. And then it's using something called DIDs, which are decentralized identifiers, which are a way to have a completely unique ID based off of your address. And then you own the control over that. You can basically go in and make updates to that profile. And then any application across the web that you choose to use can then access that information. You're only dealing with it stored in one place. You have full control over it, at any time you can go in and delete that. You can go in and change it. No one has control over it except for you.The idea of identity is a mind-bending thing in this space because I think we're so used to just handing everybody our real names and our real phone numbers and all of our personal information and just having our fingers crossed, that we're just not used to anything else.Jeremy: It's all super interesting. You mentioned earlier about, would it be legal in the United States? I'm thinking of all these recent ransomware attacks and I think they were able to trace back some Bitcoin transaction, they were actually able to trace it back to the individual group that accepted the payment. It opens up a whole can of worms. I love this idea of being anonymous and not being tracked, but then it's also like, what could bad actors do with anonymous financial transactions and things like that? So ...Nader: There kind of has been anonymous transactional layer for a long time. Cash brought in, you can't really do a lot of illegal stuff these days without cash. So should we get rid of cash? I think with any technology ...Jeremy: No, but I mean, there's a limit though, right? You can't withdraw more than $10,000 worth of cash without the FBI being flagged and you can't deposit more, you know what I mean?Nader: You can't take a million dollars worth of Bitcoin that you've gotten from ransomware and turn it into cash either.Jeremy: That's also true. Right.Nader: Because it's all tracked on the blockchain, that's probably how they caught those people. Right? They somehow had their personal information tied to a transaction, because if you follow these transactions long enough, you're going to find some origination point. I agree though. There's definitely trade-offs with everything. I don't think I'm ever the type to argue that. There's good things and there's bad things. I think you have to look at the whole picture and decide for yourself, what you think. I'm the type that's like, let's lay out all of the ideas and let the market decide.Jeremy: Right. Yeah. I totally agree with that. All this stuff is fascinating, there is way too much more for me to learn at this point. I think my brain is filled at this point. Anything else about Edge & Node? Any cool things you're working on there or anything you want people to know?Nader: We're working on a couple of different projects. I can't really talk about some of them because they're not released yet, but we are working on a new version of something called Everest, and Everest is already out. If you want to check it out, it's at everest.link. It's basically a repository of a bunch of different applications that have already been built in the Web3 ecosystem. It also ties in a lot of the stuff that we talked about, like identity and stuff like that. You can basically sign in with your Ethereum wallet. You can basically interact with different applications and stuff, but you can also just see the types of stuff people are building. It's categorized into games, financial apps. If you've listened to this and you're like, this sounds cool, but are people actually building stuff? This is a place to see hundreds of apps that people have are already built and that are out there and successful.Jeremy: Awesome. All right. Well, listen, Nader, this was awesome. Thank you so much for sharing this with me. I know I learned a ton. I hope the listeners learned a ton. If people want to learn more about this or just follow you and keep up with what you're doing, what's the best way to do that?Nader: I would say check out Twitter, we're on Twitter @dabit3 for me, @edgeandnode for Edge & Node, and of course @graphprotocol for Graph protocol.Jeremy: Okay. And then edgeandnode.com. Your YouTube channel is just youtube.com/naderdabit, N-A-D-E-R D-A-B-I-T. And then you had an article on Web3 and I'll put it in the show notes.Nader: Yeah. Put it in the show notes. For freeCodeCamp, it's called what is Web3. And it's really a condensed version of a lot of the stuff we talked about. Maybe go into a little bit more depth around native payments and how people might build companies in the way that we've talked about here.Jeremy: Awesome. All right. Well, I will get all that stuff into the show notes. Thanks again, Nader.Nader: Thanks for having me. It was good to talk.

covid-19 united states america love new york amazon netflix action san francisco building reading hbo bank fbi bitcoin silicon valley nfts discord cloud crypto mississippi id south america blockchain mount everest explain visa vc similar slack ipo blockbuster api ethereum web3 ui atm aws amplify faa apis ripple jp morgan chase decentralized ico graphs fauna s3 sql node nader web2 lambda yc baas matic serverless ceramic dapps graphql gcp metamask uniswap developer advocate react native ipfs filecoin ec2 yaml arbitrum cognito sharding discords idx dynamodb building apps grt decentralized internet dids decentralized web nader dabit aws amplify livepeer developer relations engineer serverless framework aws appsync appsync jeremy you jeremy it cbt nuggets aws mobile

Episode #105: Building a Serverless Banking Platform with Patrick Strzelec

Play Episode Listen Later Jun 14, 2021 66:01

About Patrick StrzelecPatrick Strzelec is a fullstack developer with a focus on building GraphQL gateways and serverless microservices. He is currently working as a technical lead at NorthOne making banking effortless for small businesses.LinkedIn: Patrick StrzelecNorthOne Careers: www.northone.com/about/careersWatch this episode on YouTube: https://youtu.be/8W6lRc03QNU This episode sponsored by CBT Nuggets and Lumigo. TranscriptJeremy: Hi everyone. I'm Jeremy Daly, and this is Serverless Chats. Today, I'm joined by Patrick Strzelec. Hey, Patrick, thanks for joining me.Patrick: Hey, thanks for having me.Jeremy: You are a lead developer at NorthOne. I'd love it if you could tell the listeners a little bit about yourself, your background, and what NorthOne does.Patrick: Yeah, totally. I'm a lead developer here at NorthOne, I've been focusing on building out our GraphQL gateway here, as well as some of our serverless microservices. What NorthOne does, we are a banking experience for small businesses. Effectively, we are a deposit account, with many integrations that act almost like an operating system for small businesses. Basically, we choose the best partners we can to do things like check deposits, just your regular transactions you would do, as well as any insights, and the use cases will grow. I'd like to call us a very tailored banking experience for small businesses.Jeremy: Very nice. The thing that is fascinating, I think about this, is that you have just completely embraced serverless, right?Patrick: Yeah, totally. We started off early on with this vision of being fully event driven, and we started off with a monolith, like a Python Django big monolith, and we've been experimenting with serverless all the way through, and somewhere along the journey, we decided this is the tool for us, and it just totally made sense on the business side, on the tech side. It's been absolutely great.Jeremy: Let's talk about that because this is one of those things where I think you get a business and a business that's a banking platform. You're handling some serious transactions here. You've got a lot of transactions that are going through, and you've totally embraced this. I'd love to have you take the listeners through why you thought it was a good idea, what were the business cases for it? Then we can talk a little bit about the adoption process, and then I know there's a whole bunch of stuff that you did with event driven stuff, which is absolutely fascinating.Then we could probably follow up with maybe a couple of challenges, and some of the issues you face. Why don't we start there. Let's start, like who in your organization, because I am always fascinated to know if somebody in your organization says, “Hey we absolutely need to do serverless," and just starts beating that drum. What was that business and technical case that made your organization swallow that pill?Patrick: Yeah, totally. I think just at a high level we're a user experience company, we want to make sure we offer small businesses the best banking experience possible. We don't want to spend a lot of time on operations, and trying to, and also reliability is incredibly important. If we can offload that burden and move faster, that's what we need to do. When we're talking about who's beating that drum, I would say our VP, Blake, really early on, seemed to see serverless as this amazing fit. I joined about three years ago today, so I guess this is my anniversary at the company. We were just deciding what to build. At the time there was a lot of architecture diagrams, and Blake hypothesized that serverless was a great fit.We had a lot of versions of the world, some with Apache Kafka, and a bunch of microservices going through there. There's other versions with serverless in the mix, and some of the tooling around that, and this other hypothesis that maybe we want GraphQL gateway in the middle of there. It was one of those things that we wanted to test our hypothesis as we go. That ties into this innovation velocity that serverless allows for. It's very cheap to put a new piece of infrastructure up in serverless. Just the other day we wanted to test Kinesis for an event streaming use case, and that was just a half an hour to set up that config, and you could put it live in production and test it out, which is completely awesome.I think that innovation velocity was the hypothesis. We could just try things out really quickly. They don't cost much at all. You only pay for what you use for the most part. We were able to try that out, and as well as reliability. AWS really does a good job of making sure everything's available all the time. Something that maybe a young startup isn't ready to take on. When I joined the company, Blake proposed, “Okay, let's try out GraphQL as a gateway, as a concept. Build me a prototype." In that prototype, there was a really good opportunity to try serverless. They just ... Apollo server launched the serverless package, that was just super easy to deploy.It was a complete no-brainer. We tried it out, we built the case. We just started with this GraphQL gateway running on serverless. AWS Lambda. It's funny because at first, it's like, we're just trying to sell them development. Nobody's going to be hitting our services. It was still a year out from when we were going into production. Once we went into prod, this Lambda's hot all the time, which is interesting. I think the cost case breaks down there because if you're running this thing, think forever, but it was this GraphQL server in front of our Python Django monolift, with this vision of event driven microservices, which has fit well for banking. If you just think about the banking world, everything is pretty much eventually consistent.Just, that's the way the systems are designed. You send out a transaction, it doesn't settle for a while. We were always going to do event driven, but when you're starting out with a team of three developers, you're not going to build this whole microservices environment and everything. We started with that monolith with the GraphQL gateway in front, which scaled pretty nicely, because we were able to sort of, even today we have the same GraphQL gateway. We just changed the services backing it, which was really sweet. The adoption process was like, let's try it out. We tried it out with GraphQL first, and then as we were heading into launch, we had this monolith that we needed to manage. I mean, manually managing AWS resources, it's easier than back in the day when you're managing your own virtual machines and stuff, but it's still not great.We didn't have a lot of time, and there was a lot of last-minute changes we needed to make. A big refactor to our scheduling transactions functions happened right before launch. That was an amazing serverless use case. And there's our second one, where we're like, “Okay, we need to get this live really quickly." We created this work performance pattern really quickly as a test with serverless, and it worked beautifully. We also had another use case come up, which was just a simple phone scheduling service. We just wrapped an API, and just exposed some endpoints, but it was just a lot easier to do with serverless. Just threw it off to two developers, figure out how you do it, and it was ready to be live. And then ...Jeremy: I'm sorry to interrupt you, but I want to get to this point, because you're talking about standing up infrastructure, using infrastructure as code, or the tools you're using. How many developers were working on this thing?Patrick: How many, I think at the time, maybe four developers on backend functionality before launch, when we were just starting out.Jeremy: But you're building a banking platform here, so this is pretty sophisticated. I can imagine another business case for serverless is just the sense that we don't have to hire an operations team.Patrick: Yeah, exactly. We were well through launching it. I think it would have been a couple of months where we were live, or where we hired our first dev ops engineer. Which is incredible. Our VP took a lot of that too, I'm sure he had his hands a little more dirty than he did like early on. But it was just amazing. We were able to manage all that infrastructure, and scale was never a concern. In the early stages, maybe it shouldn't be just yet, but it was just really, really easy.Jeremy: Now you started with four, and I think, what are you now? Somewhere around 25 developers? Somewhere in that space now?Patrick: About 25 developers now, we're growing really fast. We doubled this year during COVID, which is just crazy to think about, and somehow have been scaling somewhat smoothly at least, in terms of just being able to output as a dev team promote. We'll probably double again this year. This is maybe where I shamelessly plug that we're hiring, and we always are, and you could visit northone.com and just check out the careers page, or just hit me up for a warm intro. It's been crazy, and that's one of the things that serverless has helped with us too. We haven't had this scaling bottleneck, which is an operations team. We don't need to hire X operations people for a certain number of developers.Onboarding has been easier. There was one example of during a major project, we hired a developer. He was new to serverless, but just very experienced developer, and he had a production-ready serverless service ready in a month, which was just an insane ramp-up time. I haven't seen that very often. He didn't have to talk to any of our operation staff, and we'd already used serverless long enough that we had all of our presets and boilerplates ready, and permissions locked down, so it was just super easy. It's super empowering just for him to be able to just play around with the different services. Because we hit that point where we've invested enough that every developer when they opened a branch, that branch deploys its own stage, which has all of the services, AWS infrastructure deployed.You might have a PR open that launches an instance of Kinesis, and five SQS queues, and 10 Lambdas, and a bunch of other things, and then tear down almost immediately, and the cost isn't something we really worry about. The innovation velocity there has been really, really good. Just being able to try things out. If you're thinking about something like Kinesis, where it's like a Kafka, that's my understanding, and if you think about the organizational buy-in you need for something like Kafka, because you need to support it, come up with opinions, and all this other stuff, you'll spend weeks trying it out, but for one of our developers, it's like this seems great.We're streaming events, we want this to be real-time. Let's just try it out. This was for our analytics use case, and it's live in production now. It seems to be doing the thing, and we're testing out that use case, and there isn't that roadblock. We could always switch off to a different design if you want. The experimentation piece there has been awesome. We've changed, during major projects we've changed the way we've thought about our resources a few times, and in the end it works out, and often it is about resiliency. It's just jamming queues into places we didn't think about in the first place, but that's been awesome.Jeremy: I'm curious with that, though, with 25 developers ... Kinesis for the most part works pretty well, but you do have to watch those iterator ages, and make sure that they're not backing up, or that you're losing events. If they get flooded or whatever, and also sticking queues everywhere, sounds like a really good idea, and I'm a big fan of that, but it also, that means there's a lot of queues you have to manage, and watch, and set alarms and all that kind of stuff. Then you also talked about a pretty, what sounds like a pretty great CI/CD process to spin up new branches and things like that. There's a lot of dev ops-y ops work that is still there. How are you handling that now? Do you have dedicated ops people, or do you just have your developers looking after that piece of it?Patrick: I would say we have a very spirited group of developers who are inspired. We do a lot of our code-sharing via internal packages. A few of our developers just figured out some of our patterns that we need, whether it's like CI, or how we structure our events stores, or how we do our Q subscriptions. We manage these internal packages. This won't scale well, by the way. This is just us being inspired and trying to reduce some of this burden. It is interesting, I've listened to this podcast and a few others, and this idea of infrastructure as code being part of every developer's toolbox, it's starting to really resonate with our team.In our migration, or our swift shift to full, I'd say doing serverless properly, we've learned to really think in it. Think in terms of infrastructure in our creating solutions. Not saying we're doing serverless the right way now, but we certainly did it the wrong way in the past, where we would spin up a bunch of API gateways that would talk to each other. A lot of REST calls going around the spider web of communication. Also, I'll call these monster Lambdas, that have a whole procedure list that they need to get through, and a lot of points of failure. When we were thinking about the way we're going to do Lambda now, we try to keep one Lambda doing one thing, and then there's pieces of infrastructure stitching that together. EventBridge between domain boundaries, SQS for commands where we can, instead of using API gateway. I think that transitions pretty well into our big break. I'm talking about this as our migration to serverless. I want to talk more about that.Jeremy: Before we jump into that, I just want to ask this question about, because again, I call those fat, some people call them fat Lambdas, I call them Lambda lifts. I think there's Lambda lifts, then fat Lambdas, then your single-purpose functions. It's interesting, again, moving towards that direction, and I think it's super important that just admitting that you're like, we were definitely doing this wrong. Because I think so many companies find that adopting serverless is very much so an evolution, and it's a learning thing where the teams have to figure out what works for them, and in some cases discovering best practices on your own. I think that you've gone through that process, I think is great, so definitely kudos to you for that.Before we get into that adoption and the migration or the evolution process that you went through to get to where you are now, one other business or technical case for serverless, especially with something as complex as banking, I think I still don't understand why I can't transfer personal money or money from my personal TD Bank account to my wife's local checking account, why that's so hard to do. But, it seems like there's a lot of steps. Steps that have to work. You can't get halfway through five steps in some transaction, and then be like, oops we can't go any further. You get to roll that back and things like that. I would imagine orchestration is a huge piece of this as well.Patrick: Yeah, 100%. The banking lends itself really well to these workflows, I'll call them. If you're thinking about even just the start of any banking process, there's this whole application process where you put in all your personal information, you send off a request to your bank, and then now there's this whole waterfall of things that needs to happen. All kinds of checks and making sure people aren't on any fraud lists, or money laundering lists, or even just getting a second dive from our compliance department. There's a lot of steps there, and even just keeping our own systems in sync, with our off-provider and other places. We definitely lean on using step functions a lot. I think they work really, really well for our use case. Just the visual, being able to see this is where a customer is in their onboarding journey, is very, very powerful.Being able to restart at any point of their, or even just giving our compliance team a view into that process, or even adding a pause portion. I think that's one of the biggest wins there, is that we could process somebody through any one of our pipelines, and we may need a human eye there at least for this point in time. That's one of the interesting things about the banking industry is. There are still manual processes behind the scenes, and there are, I find this term funny, but there are wire rooms in banks where there are people reviewing things and all that. There are a lot of workflows that just lend themselves well to step functions. That pausing capability and being able to return later with a response, so that allows you to build other internal applications for your compliance teams and other teams, or just behind the scenes calls back, and says, "Okay, resume this waterfall."I think that was the visualization, especially in an events world when you're talking about like sagas, I guess, we're talking about distributed transactions here in a way, where there's a lot of things happening, and a common pattern now is the saga pattern. You probably don't want to be doing two-phase commits and all this other stuff, but when we're looking at sagas, it's the orchestration you could do or the choreography. Choreography gets very messy because there's a lot of simplistic behavior. I'm a service and I know what I need to do when these events come through, and I know which compensating events I need to dump, and all this other stuff. But now there's a very limited view.If a developer is trying to gain context in a certain domain, and understand the chain of events, although you are decoupled, there's still this extra coupling now, having to understand what's going on in your system, and being able to share it with external stakeholders. Using step functions, that's the I guess the serverless way of doing orchestration. Just being able to share that view. We had this process where we needed to move a lot of accounts to, or a lot of user data to a different system. We were able to just use an orchestrator there as well, just to keep an eye on everything that's going on.We might be paused in migrating, but let's say we're moving over contacts, a transaction list, and one other thing, you could visualize which one of those are in the red, and which one we need to come in and fix, and also share that progress with external stakeholders. Also, it makes for fun launch parties I'd say. It's kind of funny because when developers do their job, you press a button, and everything launches, and there's not really anything to share or show.Jeremy: There's no balloons or anything like that.Patrick: Yeah. But it was kind of cool to look at these like, the customer is going through this branch of the logic. I know it's all green. Then I think one of the coolest things was just the retry ability as well. When somebody does fail, or when one of these workflows fails, you could see exactly which step, you can see the logs, and all that. I think one of the challenges we ran into there though, was because we are working in the banking space, we're dealing with sensitive data. Something I almost wish AWS solved out of the box, would be being able to obfuscate some of that data. Maybe you can't, I'm not sure, but we had to think of patterns for tokenization for instance.Stripe does this a lot where certain parts of their platform, you just get it, you put in personal information, you get back a token, and you use that reference everywhere. We do tokenization, as well as we limit the amount of details flowing through steps in our orchestrators. We'll use an event store with identifiers flowing through, and we'll be doing reads back to that event store in between steps, to do what we need to do. You lose some of that debug-ability, you can't see exactly what information is flowing through, but we need to keep user data safe.Jeremy: Because it's the use case for it. I think that you mentioned a good point about orchestration versus choreography, and I'm a big fan of choreography when it makes sense. But I think one of the hardest lessons you learn when you start building distributed systems is knowing when to use choreography, and knowing when to use orchestration. Certainly in banking, orchestration is super important. Again, with those saga patterns built-in, that's the kind of thing where you can get to a point in the process and you don't even need to do automated rollbacks. You can get to a failure state, and then from there, that can be a pause, and then you can essentially kick off the unwinding of those things and do some of that.I love that idea that the token pattern and using just rehydrating certain steps where you need to. I think that makes a ton of sense. All right. Let's move on to the adoption and the migration process, because I know this is something that really excites you and it should because it is cool. I always know, as you're building out applications and you start to add more capabilities and more functionality and start really embracing serverless as a methodology, then it can get really exciting. Let's take a step back. You had a champion in your organization that was beating the drum like, "Let's try this. This is going to make a lot of sense." You build an Apollo Lambda or a Lambda running Apollo server on it, and you are using that as a strangler pattern, routing all your stuff through now to your backend. What happens next?Patrick: I would say when we needed to build new features, developers just gravitated towards using serverless, it was just easier. We were using TypeScript instead of Python, which we just tend to like as an organization, so it's just easier to hop into TypeScript land, but I think it was just easier to get something live. Now we had all these Lambdas popping up, and doing their job, but I think the problem that happened was we weren't using them properly. Also, there was a lot of difference between each of our serverless setups. We would learn each time and we'd be like, okay, we'll use this parser function here to simplify some of it, because it is very bare-bones if you're just pulling the Serverless Framework, and it took a little ...Every service looked very different, I would say. Also, we never really took the time to sit back and say, “Okay, how do we think about this? How do we use what serverless gives us to enable us, instead of it just being an easy thing to spin up?" I think that's where it started. It was just easy to start. But we didn't embrace it fully. I remember having a conversation at some point with our VP being like, “Hey, how about we just put Express into one of our Lambdas, and we create this," now I know it's a Lambda lift. I was like, it was just easier. Everybody knows how to use Express, why don't we just do this? Why are we writing our own parsers for all these things? We have 10 versions of a make response helper function that was copy-pasted between repos, and we didn't really have a good pattern for sharing that code yet in private packages.We realized that we liked serverless, but we realized we needed to do it better. We started with having a serverless chapter reading between some of our team members, and we made some moves there. We created a shared boilerplate at some point, so it reduced some of the differences you'd see between some of the repositories, but we needed a step-change difference in our thinking, when I look back, and we got lucky that opportunity came up. At this point, we probably had another six Lambda services, maybe more actually. I want to say around, we'd probably have around 15 services at this point, without a governing body around patterns.At this time, we had this interesting opportunity where we found out we're going to be re-platforming. A big announcement we just made last month was that we moved on to a new bank partner called Bancorp. The bank partner that supports Chime, and they're like, I'll call them an engine boost. We put in a much larger, more efficient engine for our small businesses. If you just look at the capabilities they provide, they're just absolutely amazing. It's what we need to build forward. Their events API is amazing as well as just their base banking capabilities, the unit economics they can offer, the times on there, things were just better. We found out we're doing an engine swap. The people on the business side on our company trusted our technical team to do what we needed to do.Obviously, we need to put together a case, but they trusted us to choose our technology, which was awesome. I think we just had a really good track record of delivering, so we had free reign to decide what do we do. But the timeline was tight, so what we decided to do, and this was COVID times too, was a few of our developers got COVID tested, and we rented a house and we did a bubble situation. How in the NHL or MBA you have a bubble. We had a dev bubble.Jeremy: The all-star team.Patrick: The all-star team, yeah. We decided let's sit down, let's figure out what patterns are going to take us forward. How do we make the step-change at the same time as step-change in our technology stack, at the same time as we're swapping out this bank, this engine essentially for the business. In this house, we watched almost every YouTube video you can imagine on event driven and serverless, and I think leading up. I think just knowing that we were going to be doing this, I think all of us independently started prototyping, and watching videos, and reading a lot of your content, and Alex DeBrie and Yan Cui. We all had a lot of ideas already going in.When we all got to this house, we started off with this exercise, an event storming exercise, just popular in the domain-driven design community, where we just threw down our entire business on a wall with sticky notes, and it would have been better to have every business stakeholder there, but luckily we had two people from our product team there as representatives. That's how invested we were in building this outright, that we have products sitting in the room with us to figure it out.We slapped down our entire business on a wall, this took days, and then drew circles around it and iterated on that for a while. Then started looking at what the technology looks like. What are our domain boundaries, and what prototypes do we need to make? For a few weeks there, we were just prototyping. We built out what I'd called baby's first balance. That was the running joke where, how do we get an account opened with a balance, with the transactions minimally, with some new patterns. We really embraced some of this domain-driven-design thinking, as well as just event driven thinking. When we were rethinking architecture, three concepts became very important for us, not entirely new, but important. Item potency was a big one, dealing with distributed transactions was another one of those, as well as the eventual consistency. The eventual consistency portion is kind of funny because we were already doing it a lot.Our transactions wouldn't always settle very quickly. We didn't know about it, but now our whole system becomes eventually consistent typically if you now divide all of your architecture across domains, and decouple everything. We created some early prototypes, we created our own version of an event store, which is, I would just say an opinionated scheme around DynamoDB, where we keep track of revisions, payload, timestamp, all the things you'd want to be able to do event sourcing. That's another thing we decided on. Event sourcing seemed like the right approach for state, for a lot of our use cases. Banking, if you just think about a banking ledger, it is events or an accounting ledger. You're just adding up rows, add, subtract, add, subtract.We created a lot of prototypes for these things. Our events store pattern became basically just a DynamoDB with opinions around the schema, as well as a package of a shared code package with a simple dispatch function. One dispatch function that really looks at enforcing optimistic concurrency, and one that's a little bit more relaxed. Then we also had some reducer functions built into there. That was one of the packages that we created, as well as another prototype around that was how do we create the actual subscriptions to this event store? We landed on SNS to SQS fan-out, and it seems like fan-out first is the serverless way of doing a lot of things. We learned that along the way, and it makes sense. It was one of those things we read from a lot of these blogs and YouTube videos, and it really made sense in production, when all the data is streaming from one place, and then now you just add subscribers all over the place. Just new queues. Fan-out first, highly recommend. We just landed on there by following best practices.Jeremy: Great. You mentioned a bunch of different things in there, which is awesome, but so you get together in this house, you come up with all the events, you do this event storming session, which is always a great exercise. You get a pretty good visualization of how the business is going to run from an event standpoint. Then you start building out this event driven architecture, and you mentioned some packages that you built, we talked about step functions and the orchestration piece of this. Just give me a quick overview of the actual system itself. You said it's backed by DynamoDB, but then you have a bunch of packages that run in between there, and then there's a whole bunch of queues, and then you're using some custom packages. I think I already said that but you're using ... are you using EventBridge in there? What's some of the architecture behind all that?Patrick: Really, really good question. Once we created these domain boundaries, we needed to figure out how do we communicate between domains and within domains. We landed on really differentiating milestone events and domain events. I guess milestone events in other terms might be called integration events, but this idea that these are key business milestones. An account was open, an application was approved or rejected, things that every domain may need to know about. Then within our domains, or domain boundaries, we had these domain events, which might reduce to a milestone event, and we can maintain those contracts in the future and change those up. We needed to think about how do we message all these things across? How do we communicate? We landed on EventBridge for our milestone events. We have one event bus that we talked to all of our, between domain boundaries basically.EventBridge there, and then each of our services now subscribed to that EventBridge, and maintain their own events store. That's backed by DynamoDB. Each of our services have their own data store. It's usually an event stream or a projection database, but it's almost all Dynamo, which is interesting because our old platform used Postgres, and we did have relational data. It was interesting. I was really scared at first, how are we going to maintain relations and things? It became a non-issue. I don't even know why now that I think about it. Just like every service maintains its nice projection through events, and builds its own view of the world, which brings its own problems. We have DynamoDB in there, and then SNS to SQS fan-out. Then when we're talking about packages ...Jeremy: That's Office Streams?Patrick: Exactly, yeah. We're Dynamo streams to SNS, to SQS. Then we use shared code packages to make those subscriptions very easy. If you're looking at doing that SNS to SQS fan-out, or just creating SQS queues, there is a lot of cloud formation boilerplate that we were creating, and we needed to move really quick on this project. We got pretty opinionated quick, and we created our own subscription function that just generates all this cloud formation with naming conventions, which was nice. I think the opinions were good because early on we weren't opinionated enough, I would say. When you look in your AWS dashboard, the read for these aren't prefixed correctly, and there's all this garbage. You're able to have consistent naming throughout, make it really easy to subscribe to an event.We would publish packages to help with certain things. Our events store package was one of those. We also created a Lambda handlers package, which leverages, there's like a Lambda middlewares compose package out there, which is quite nice, and we basically, all the common functionality we're doing a lot of, like parsing a body from S3, or SQS or API gateway. That's just the middleware that we now publish. Validation in and out. We highly recommend the library Zod, we really embrace the TypeScript first object validation. Really, really cool package. We created all these middlewares now. Then subscription packages. We have a lot of shared code in this internal NPM repository that we install across.I think one challenge we had there was, eventually you extracted away too much from the cloud formation, and it's hard for new developers to ... It's easy for them to create events subscriptions, it's hard for them to evolve our serverless thinking because they're so far removed from it. I still think it was the right call in the end. I think this is the next step of the journey, is figuring out how do we share code effectively while not hiding away too much of serverless, especially because it's changing so fast.Jeremy: It's also interesting though that you take that approach to hide some of that complexity, and bake in some of that boilerplate that, someone's mostly didn't have to write themselves anyways. Like you said, they're copying and pasting between services, is not the best way to do it. I tried the whole shared packages thing one time, and it kind of worked. It's just like when you make a small change to that package and you have 14 services, that then you have to update to get the newest version. Sometimes that's a little frustrating. Lambda layers haven't been a huge help with some of that stuff either. But anyways, it's interesting, because again you've mentioned this a number of times about using queues.You did mention resiliency in there, but I want to touch on that point a little bit because that's one of those things too, where I would assume in a banking platform, you do not want to lose events. You don't want to lose things. and so if something breaks, or something gets throttled or whatever, having to go and retry those events, having the alerts in place to know that a queue is backed up or whatever. Then just, I'm thinking ordering issues and things like that. What kinds of issues did you face, and tell me a little bit more about what you've done for reliability?Patrick: Totally. Queues are definitely ... like SQS is a workhorse for our company right now. We use a lot of it. Dropping messages is one of the scariest things, so you're dead-on there. When we were moving to event driven, that was what scared me the most. What if we drop an event? A good example of that is if you're using EventBridge and you're subscribing Lambdas to it, I was under the impression early on that EventBridge retries forever. But I'm pretty sure it'll retry until it invokes twice. I think that's what we landed on.Jeremy: Interesting.Patrick: I think so, and don't quote me on this. That was an example of where drop message could be a problem. We put a queue in front of there, an SQS queue as the subscription there. That way, if there's any failure to deliver there, it's just going to retry all the time for a number of days. At that point we got to think about DLQs, and that's something we're still thinking about. But yeah, I think the reason we've been using queues everywhere is that now queues are in charge of all your retry abilities. Now that we've decomposed these Lambdas into one Lambda lift, into five Lambdas with queues in between, if anything fails in there, it just pops back into the queue, and it'll retry indefinitely. You can drop messages after a few days, and that's something we learned luckily in the prototyping stage, where there are a few places where we use dead letter queues. But one of the issues there as well was ordering. Ordering didn't play too well with ...Jeremy: Not with DLQs. No, it does not, no.Patrick: I think that's one lesson I'd want to share, is that only use ordering when you absolutely need it. We found ways to design some of our architecture where we didn't need ordering. There's places we were using FIFO SQS, which was something that just launched when we were building this thing. When we were thinking about messaging, we're like, "Oh, well we can't use SQS because they don't respect ordering, or it doesn't respect ordering." Then bam, the next day we see this blog article. We got really hyped on that and used FIFO everywhere, and then realized it's unnecessary in most use cases. So when we were going live, we actually changed those FIFO queues into just regular SQS queues in as many places as we can. Then so, in that use case, you could really easily attach a dead letter queue and you don't have to worry about anything, but with FIFO things get really, really gnarly.Ordering is an interesting one. Another place we got burned I think on dead-letter queues, or a tough thing to do with dead letter queues is when you're using our state machines, we needed to limit the concurrency of our state machines is another wishlist item in AWS. I wish there was just at the top of the file, a limit concurrent executions of your state machine. Maybe it exists. Maybe we just didn't learn to use it properly, but we needed to. There's a few patterns out there. I've seen the [INAUDIBLE] pattern where you can use the actual state machine flow to look back at how many concurrent executions you have, and pause. We landed on setting reserved concurrency in a number of Lambdas, and throwing errors. If we've hit the max concurrency and it'll pause that Lambda, but the problem with DLQs there was, these are all errors. They're coming back as errors.We're like, we're fine with them. This is a throttle error. That's fine. But it's hard to distinguish that from a poison message in your queue, so when do you dump those into DLQ? If it's just a throttling thing, I don't think it matters to us. That was another challenge we had. We're still figuring out dead letter queues and alerting. I think for now we just relied on CloudWatch alarms a lot for our alerting, and there's a lot you could do. Even just in the state machines, you can get pretty granular there. I know once certain things fail, and announced to your Slack channel. We use that Slack integration, it's pretty easy. You just go on a Slack channel, there's an email in there, you plop it into the console in AWS, and you have your very early alerting mechanism there.Jeremy: The thing with Elasticsearch ... not Elasticsearch, I'm sorry. I'm totally off-topic here. The thing with EventBridge and Lambda, these are one of those things that, again, they're nuances, but event bridge, as long as it can deliver to the Lambda service, then the Lambda service kicks off and queues it automatically. Then that will retry at a certain number of times. I think you can control that now. But then eventually if that retries multiple times and eventually fails, then that kicks it over to the DLQ or whatever. There's all different ways that it works like that, but that's why I always liked the idea of putting a queue in between there as well, because I felt you just had a little bit more control over exactly what happens.As long as it gets to the queue, then you know you haven't lost the message, or you hope you haven't lost a message. That's super interesting. Let's move on a little bit about the adoption issues. You mentioned a few of these things, obviously issues with concurrency and ordering, and some of that other stuff. What about some of the other challenges you had? You mentioned this idea of writing all these packages, and it pulls devs away from the CloudFormation a little bit. I do like that in that it, I think, accelerates a lot of things, but what are some of the other maybe challenges that you've been having just getting this thing up and running?Patrick: I would say IAM is an interesting one. Because we are in the banking space, we want to be very careful about what access do you give to what machines or developers, I think machines are important too. There've been cases where ... so we do have a separate developer set up with their own permissions, in development's really easy to spin up all your services within reason. But now when we're going into production, there's times where our CI doesn't have the permissions to delete a queue or create a queue, or certain things, and there's a lot of tweaking you have to do there, and you got to do a lot of thinking about your IAM policies as an organization, especially because now every developer's touching infrastructure.That becomes this shared operational overhead that serverless did introduce. We're still figuring that out. Right now we're functioning on least privilege, so it's better to just not be able to deploy than deploy something you shouldn't or read the logs that you shouldn't, and that's where we're starting. But that's something that, it will be a challenge for a little while I think. There's all kinds of interesting things out there. I think temporary IAM permissions is a really cool one. There are times we're in production and we need to view certain logs, or be able to access a certain queue, and there's tooling out there where you can, or at least so I've heard, you can give temporary permissions. You have this queue permission for 30 minutes, and it expires and it's audited, and I think there's some CloudTrail tie-in you could do there. I'm speaking about my wishlist for our next evolution here. I hope my team is listening ...Jeremy: Your team's listening to you.Patrick: ... will be inspired as well.Jeremy: What about ... because this is something too that I always found to be a challenge, especially when you start having multiple services, and you've talked about these domain events, but then milestone events. You've got different services that need to communicate across services, or across domains, and realize certain things like that. Service discovery in and of itself, and which queue are we mapping to, or which service am I talking to, and which version of the service am I talking to? Things like that. How have you been dealing with that stuff?Patrick: Not well, I would say. Very, very ad hoc. I think like right now, at least we have tight communication between the teams, so we roughly know which service we need to talk to, and we output our URLs in the cloud formation output, so at least you could reference the URLs across services, a little easier. Really, a GraphQL is one of the only service that really talks to a lot of our API gateways. At least there's less of that, knowing which endpoint to hit. Most of our services will read into EventBridge, and then within services, a lot of that's abstracted away, like the queue subscription's a little easier. Service discovery is a bit of a nightmare.Once our services grow, it'll be, I don't know. It'll be a huge challenge to understand. Even which services are using older versions of Node, for instance. I saw that AWS is now deprecating version 10 and we'll have to take a look internally, are we using version 10 anywhere, and how do we make sure that's fine, or even things like just knowing which services now have vulnerabilities in their NPM packages because we're using Node. That's another thing. I don't even know if that falls in service discovery, but it's an overhead of ...Jeremy: It's a service management too. It's a lot there. That actually made me, it brings me to this idea of observability too. You mentioned doing some CloudWatch alerts and some of that stuff, but what about using some observability tool or tracing like x-ray, and things like that? Have you been implementing any of that, and if you have, have you had any success and or problems with it?Patrick: I wish we had a better view of some of the observability tools. I think we were just building so quickly that we never really invested the time into trying them out. We did use X-Ray, so we rolled our own tooling internally to at least do what we know. X-Ray was one of those, but the problem with X-Ray is, we do subscribe all of our services, but X-Ray isn't implemented everywhere internally in AWS, so we lose our trail somewhere in that Dynamo stream to SNS, or SQS. It's not a full trace. Also, just digesting that huge graph of information is just very difficult. I don't use it often, I think it's a really cool graphic to show, “Hey, look, how many services are running, and it's going so fast."It's a really cool thing to look at, but it hasn't been very useful. I think our most useful tool for debugging and observability has been just our logging. We created a JSON logger package, so we get up JSON logs and we can actually filter off of different properties, and we ship those to Elasticsearch. Now you can have a view of all of the functions within a given domain at any point in time. You could really see the story. Because I think early on when we were opening up CloudWatch and you'd have like 10 tabs, and you're trying to understand this flow of information, it was very difficult.We also implemented our own trace ID pattern, and I think we just followed a Lumigo article where we introduced some properties, and in each of our Lambdas at a higher level, and one of our middlewares, and we were able to trace through. It's not ideal. Observability is something that we'll probably have to work on next. It's been tolerable for now, but I can't see the scaling that long.Jeremy: That's the other thing too, is even the shared package issue. It's like when you have an observability tool, they'll just install a layer or something, where you don't necessarily have to worry about updating your own tool. I always find if you are embracing serverless and you want to get rid of all that undifferentiated heavy lifting, observability tools, there's a lot of really good ones out there that are doing some great stuff, and they're specializing in it. It might be worth letting someone else handle that for you than trying to do it yourself internally.Patrick: Yeah, 100%. Do you have any that you've used that are particularly good? I know you work with serverless so-Jeremy: I played around with all of them, because I love this stuff, so it's always fun, but I mean, obviously Lumigo and Epsagon, and Thundra, and New Relic. They're all great. They all do things slightly differently, but they all follow a similar implementation pattern so that it's very easy to install them. We can talk more about some recommendations. I think it's just one of those things where in a modern application not having that insight is really hard. It can be really hard to debug stuff. If you look at some of the tools that AWS offers, I think they're there, it's just, they are maybe a little harder to implement, and not quite as refined and targeted as some of the observability tools. But still, you got to get there. Again, that's why I keep saying it's an evolution, it's a process. Maybe one time you get burned, and you're like, we really needed to have observability, then that's when it becomes more of a priority when you're moving fast like you are.Patrick: Yeah, 100%. I think there's got to be a priority earlier than later. I think I'll do some reading now that you've dropped some of these options. I have seen them floating around, but it's one of those things that when it's too late, it's too late.Jeremy: It's never too late to add observability though, so it should. Actually, a lot of them now, again, it makes it really, really easy. So I'm not trying to pitch any particular company, but take a look at some of them, because they are really great. Just one other challenge that I also find a lot of people run into, especially with serverless because there's all these artificial account limits in place. Even the number of queues you can create, and the number of concurrent Lambda functions in a particular region, and stuff like that. Have you run into any of those account limit issues?Patrick: Yeah. I could give you the easiest way to run into an account on that issue, and that is replay your entire EventBridge archive to every subscriber, and you will find a bottleneck somewhere. That's something ...Jeremy: Somewhere it'll fall over? Nice.Patrick: 100%. It's a good way to do some quick check and development to see where you might need to buffer something, but we have run into that. I think the solution there, and a lot of places was just really playing with concurrency where we needed to, and being thoughtful about where is their main concurrency in places that we absolutely needed to stay functioning. I think the challenge there is that eats into your total account concurrency, which was an interesting learning there. Definitely playing around there, and just being thoughtful about where you are replaying. A couple of things. We use replays a lot. Because we are using these milestone events between service boundaries, now when you launch a new service, you want to replay that whole history all the way through.We've done a lot of replaying, and that was one of the really cool things about EventBridge. It just was so easy. You just set up an archive, and it'll record everything coming through, and then you just press a button in the console, and it'll replay all of them. That was really awesome. But just being very mindful of where you're replaying to. If you replay to all of your subscriptions, you'll hit Lambda concurrency limits real quick. Even just like another case, early on we needed to replace ... we have our own domain events store. We want to replace some of those events, and those are coming off the Dynamo stream, so we were using dynamo to kick those to a stream, to SNS, and fan-out to all of our SQS queues. But there would only be one or two queues you actually needed to subtract to those events, so we created an internal utility just to dump those events directly into the SQS queue we needed. I think it's just about not being wasteful with your resources, because they are cheap. Sure.Jeremy: But if you use them, they start to cost money.Patrick: Yeah. They start to cost some money as well as they could lock down, they can lock you out of other functionality. If you hit your Lambda limits, now our API gateway is tapped.Jeremy: That's a good point.Patrick: You could take down your whole system if you just aren't mindful about those limits, and now you could call up AWS in a panic and be like, “Hey, can you update our limits?" Luckily we haven't had to do that yet, but it's definitely something in your back pocket if you need it, if you can make the case to AWS, that maybe you do need bigger limits than the default. I think just not being wasteful, being mindful of where you're replaying. I think another interesting thing there is dealing with partners too. It's really easy to scale in the Lambda world, but not every partner could handle that volume really quickly. If you're not buffering any event coming through EventBridge to your new service that hits a partner every time, you're going to hit their API rate limit really quickly, because they're just going to just go right through it.You might be doing thousands of API calls when you're instantiating a new service. That's one of those interesting things that we have to deal with, and particularly in our orchestrators, because they are talking to different partners, that's why we need to really make sure we could limit the concurrent executions of the state machines themselves. In a way, some of our architecture is too fast to scale.Jeremy: It's too good.Patrick: You still have to consider downstream. That, and even just, if you are using relational databases or anything else in your system, now you have to worry about connection limits and ...Jeremy: I have a whole talk I gave on that.Patrick: ... spikes in traffic.Jeremy: Yes, absolutely.Patrick: Really cool.Jeremy: I know all about it. Any final advice for companies like you that are trying to bite off a piece of the serverless apple, I guess, That's really bad. Anyways, any advice for people looking to get into this?Patrick: Yeah, totally. I would say start small. I think we were wise to just try it out. It might not land with your development team. If you don't really buy in, it's one of those things that could just end up unnecessarily messy, so start small, see if you like it in-shop, and then reevaluate, once you hit a certain point. That, and I would say shared boilerplate packages sooner than later. I know shared code is a problem, but it is nice to have an un-opinionated starter pack, that you're at least not doing anything really crazy. Even just things like having opinions around logging. In our industry, it's really important that you're not logging sensitive details.For us doing things like wrapping our HTTP clients to make sure we're not logging sensitive details, or having short Lambda packages that make sure out-of-the-box you're opinionated about not doing something terribly awful. I would say those two things. Start small and a boiler package, and maybe the third thing is just pay attention to the code smell of a growing Lambda. If you are doing three API calls in one Lambda, chances are you could probably break that up, and think about it in a more resilient way. If any one of those pieces fail, now you could have retry ability in each one of those. Those are the three things I would say. I could probably talk forever about the rest of our journey.Jeremy: I think that was great advice, and I love hearing about how companies are going through this process, what that process looks like, and I hope, I hope, I hope that companies listen to this and can skip a lot of these mistakes. I don't want to call them all mistakes, and I think it's just evolution. The stuff that you've done, we've all made them, we've all gone through that process, and the more we can solidify these practices and stuff like that, I think that more companies will benefit from hearing stories like these. Thank you very much for sharing that. Again, thank you so much for spending the time to do this and sharing all of this knowledge, and this journey that you've been on, and are continuing to be on. It would great to continue to get updates from you. If people want to contact you, I know you're not on Twitter, but what's the best way to reach out to you?Patrick: I almost wish I had a Twitter. It's the developer thing to have, so maybe in the future. Just on LinkedIn would be great. LinkedIn would be great, as well as if anybody's interested in working with our team, and just figuring out how to take serverless to the next level, just hit me up on LinkedIn or look at our careers page at northone.com, and I could give you a warm intro.Jeremy: That's great. Just your last name is spelled S-T-R-Z-E-L-E-C. How do you say that again? Say it in Polish, because I know I said it wrong in the beginning.Patrick: I guess for most people it would just be Strzelec, but if there are any Slavs in the audience, it's "Strzelec." Very intense four consonants last name.Jeremy: That is a lot of consonants. Anyways again, Patrick, thanks again. This was great.Patrick: Yeah, thank you so much, Jeremy. This has been awesome.

covid-19 pr service event mba cloud id nhl platform express apollo i am banking dropping polish slack validation api python onboarding aws faa stripe item kafka ordering chime x ray sns s3 dynamo node urls choreography lambda baas zod ci cd json serverless td bank typescript queues graphql observability fifo npm elasticsearch new relic postgres aws lambda inaudible slavs apache kafka kinesis dynamodb lambdas cloudformation bancorp our vp northone sqs cloudwatch serverless framework yan cui strzelec eventbridge thundra jeremy you jeremy it patrick you cbt nuggets patrick yeah jeremy great

Change For The Better Is Possible, Small Steps Add Up To Big Losses

Feeding Fatty

Play Episode Listen Later Jun 8, 2021 54:54

Change For The Better Is Possible, Small Steps Add Up To Big Losses with The Fit Mess Guys In this world of instant gratification, we want to take action today and see the results tomorrow. It usually takes us a while to get to the point we feel we need to make a change. We have to realize it takes time to see results. Change is possible, and taking small measured action is generally better for sustainable change. Get started today, it's never too late. About Jeremy and Zach Zach Tucker and Jeremy Grater are the founders and hosts of The Fit Mess. For nearly a decade, they have pushed themselves to learn more about their own physical, emotional, and mental health. This has created a passion for using their acquired knowledge to help others. As hosts of the show for over 2 years, they have had the opportunity to speak to a wide range of guests, including some of the biggest names in health and wellness. Zach Tucker, along with his wife and young daughter resides in Albany, NY. Zach has a passion for helping people define and meet their wellness goals. He thinks that every person is different and there is no one size fits all solution for someone's wellness. He is on a mission to share his story and some of the tools that helped him on his own wellness journey. Zach is certified to teach yoga and Insanity® Live.Jeremy Grater is a married father of two young girls and also lives in Seattle. He's spent the majority of the last decade experimenting with a variety of wellness tools to improve his mental health, lose 70 pounds and share what's helped along the way. He's also been in the broadcasting and podcasting business for about two decades. www.thefitmess.com www.feedingfatty.com Full Transcript Below Change For The Better Is Possible, Small Steps Add Up To Big Losses with The Fit Mess Guys 00:00:15 Roy Hello, and welcome to another episode of feeding fatty. I'm your host, Roy Terry. So we're the podcasts. That's chronicling my journey, Terry, my helper through this journey of wellness, losing some weight, getting in better shape. And, from time to time, we have, professionals in their field that come on and today is no different than Terry. I'll let you make our introduction of Jeremy and Zach. All right, 00:00:38 Terry Zach Tucker and Jeremy Grater are the founders and host of The Fit Mess for nearly a decade. They've pushed themselves to learn more about their own physical and emotional and mental health. This has created a passion for using their acquired knowledge to help others as hosts of the show. For over two years, they've had the opportunity to speak to a wide range of guests, including some of the biggest names in health and wellness, Zach and Jeremy. Welcome to the show. 00:01:12 Roy Yeah. Thanks guys. Thanks for taking time out of y'all's day to be with us. We just, I guess we'll throw it out there and see kind of what's new, as were talking in pre-show, some different new weight equipment and things out there, and Jeremy said he has started running, so we'll just kinda first off. It'd be nice just to find out, kind of how you guys found yourself here. 00:01:36 Zach Yeah. I'll start off because I think, my story goes all the way back to being a baby, of just being set up on a, poor, starting line in life. I had some heart defects. I had an absent mother, just, lots of issues, lots of trauma as a child, which led tons and tons of overeating. And, at one point in my life, I was, close to 300 pounds and eating terrible, not exercising. I was, not in great shape. Over the last 20 years, I've made small little changes to turn the health around, to make myself look better, feel better. I had to focus on the mental health and it was a long journey. Jeremy was also following behind on that as well. At some point I was giving him some advice and, we sat down one day and we're like talking about being vulnerable about, the journey that we're on and talking about the struggle and how hard it is. 00:02:42 Zach And, we came to the conclusion that, we need to get this out there and we need to share these conversations with other people, especially men who are sometimes afraid to share their feelings. Yeah. 00:02:56 Roy Wow. That's yeah. That's a, basically a carbon copy of where were, I've struggled. I've had, I can definitely go back and see, I'm kind of the opposite as a kid, I was very skinny, like to the point, people were like, oh my gosh, you got to eat to live. As an adult, definitely four or five stages where, put on 20 pounds, 25 pounds at this stage for this life change. Then, now I find myself here and it's just, it's hard because, we do good sporadically. We can do good for a couple months. It's like, we have an event change and, then things change. I typically go back to my old ways. So, we talk a lot about that. Surgery may be an option for people, but for me, I always thought if I don't change the way I think then I'll never, I'll just go back to my old ways. 00:03:50 Roy Anyway, but thanks for relating that story. I mean, it is, I think it is good to be vulnerable and that's why, Terry encouraged me. She's like, Hey, you need to get this out there because people are struggling. So anyway, Jeremy, what's your story? 00:04:05 Jeremy Similar, not quite the same trauma as a child, mind was much more garden variety, alcohol in the family and kind of absent father stuff. Yeah. Both parents working. A lot of, latchkey kid or the eighties thing. And it is funny. I remember I was around third grade when I really developed some really bad eating habits and it was like, my best friend would come over and we would just hang out and watch TV and raid the pantry during every commercial break. Right. That set me on a course of doing that for far too long. Yo-yo dieted most of my life and, got into, my career and about midway through my career. I really started recognizing a lot of imposter syndrome and really feeling like man, one of these days, they're going to figure out they made a huge mistake hiring me, and they're gonna, they're going to show me the door, know, and that lasted for about 10 years. 00:04:53 Jeremy I finally started getting into some therapy and that turned my life around. My therapist introduced me to meditation. That really helped, calm all of that noise and quiet. A lot of those voices that told me it wasn't good enough. I still don't get me wrong. I, I battle every single day I've fought depression my entire life. Meditation was the thing that set me on the course to doing other things. Along the way I had a bike crash got injured pretty badly. And, but I was finding that through biking. I was, I was also meditating. It was just this very in the moment experience where the whole point is to not die on the way to work. You have to be so in the moment and so focused on what's happening right now that I started trying to push that into other areas of my life. 00:05:41 Jeremy Like Zach said, along the way he and I met through, just through parent groups, we are our friends, our kids are we're friends, and we started hanging out and were just having these really authentic conversations. That's something I've always looked for in a relationship with any of my is who can I have real conversations with and who am I talking about the Mariners with? Who do I want to spend my time with? Especially because the Mariners suck. So, but like Zach said, we, the more we have these conversations, just, camping and whatever, were just like, we don't hear enough guys talking like this. So, like I said, we just felt like this is a show that needs to exist. 00:06:19 Roy Yeah. For the rest of the baseball world out there, you may want to tell them who the Mariners are. 00:06:31 Speaker 4 The Rangers, I mean, come on. Yeah. Right. Oh yeah. 00:06:37 Roy Yeah. That's why we holler for the red Sox or the Yankees, for winter. So, let's talk about that because, your ride at, the good thing about me I'm to an age that it's like, I am who I am. I mean, just because I don't admit that I'm overweight to myself, it doesn't mean that other people that look at me can't I can't figure that out. So, I mean, at some point you just have to say, Hey, I got to own this, but it's the funny thing for me is I don't think about myself as being overweight or, I picture, and I don't picture myself as old either. I'm thinking like that 25 year old in better shape self that's, just who I am until, you walk by a mirror or try to put some clothes on after being in quarantine for a year. And it's like, oh my goodness. 00:07:28 Roy Things are different, but so, let's talk about the difference in Mindshift body, habit shifts, things like that. Was there a point that for y'all either one that the, it was the, your mind engaged and then that's where your habits followed or did you kind of get on that path to habits and then the mind followed after that or some form of that? 00:07:56 Zach My first, my first, real big change was, so I used to smoke cigarettes as well. Okay. And, ate a McDonald's every day. I mean, it was, there was some pretty bad habits there, and it was, my mind shifted because of a comment that my boss made. And, I interviewed for the job and I decided to wear a nicotine patch that day. So I didn't smell like cigarette smoke. I got the job and I got hired in my first day, my boss looked at me and said, oh, you smoke. If I had known that I would have never hired you. Oh, wow. Right. Like that hit me really hard. I was like, wow, this habit, this unhealthy thing is now going to be a, a detriment to my professional career. Yeah. That was the first step of like, well, these are gone and my mind started shifting of what else am I doing? That's bad for me that will, not look in my professional career and right. 00:08:57 Zach That's when I started running and doing the little things, but that was the main shift for me, was this external judgment from somebody who said something like that. It just completely 180 degree my thought process of what I was doing to myself. Yeah. 00:09:14 Terry Yeah. Somebody else may have, you want, you always want to be in control. You want to think like, you're, you feel like you're in control, but it's for you to be able to know, to recognize that and actually act on it. That's those are the keys right there, just going ahead and act in on something that was a judgment call to you. That's that's good. 00:09:40 Zach Yeah. Acting on it was, I, I say, I quit smoking, right. That was probably one of the hardest things I ever did in my entire life. Like that is no joke, not easy. So, it's easy to say now. Right. But, at the time it was incredibly difficult, very painful and a big struggle, but it was worth it to make that change. That's awesome. 00:10:07 Jeremy I think I've had probably three really significant, moments that really changed everything for me. And, one was when I, I had the dumbest knee injury you've ever heard of in your life. We, I don't know if you guys have this out there, but Amazon, they deliver groceries. Yeah. We are, our daughter was very young. She was like, still in the crib. One morning the Amazon guy shows up at like six in the morning and he's dropping the groceries off and the dog loses his mind, starts barking. If you've ever had kids, that anyone who wakes up your kid deserves to be. 00:10:41 Speaker 4 Punished. Yes. I just, 00:10:43 Jeremy I rolled out of bed to like quickly try and quiet my dog. Somehow like rolled my knee and just collapsed on the ground. Couldn't couldn't walk, totally hurt my knee, literally getting out of bed. And it was the stupidest injury. And, so I started going to physical therapy for it. The physical therapist learned about my background and all that. She said, if you don't get on a bike, you're going to end up replacing both of your knees. Now. I was like, wow. Okay. I can do that. I bought a bike on Craigslist and started biking to work. And, and it was funny. I was having a conversation with my brother about it. I was like, yeah, I don't know if I'm gonna be able to, it's a 10 mile ride. I don't know if we'll be able to do this every day. He's like, dude, you just need to decide. 00:11:20 Jeremy You're that weird guy, that bikes to work every day, just make that mental shift and you'll do it. And he was absolutely right. Literally the next day I started biking to, and from work every day. So that was the first one. Another one was like I mentioned, yo-yo dieted my entire life. The conversations I was having with Zach, were camping one weekend sitting around the campfire and he was talking about how he had just lost, like, I don't know, 40 pounds or something in a few months doing the keto diet and like most conversations I often have with Zach, I started with, you're crazy. How do you do that? That's insane. Okay. I gotta try it. I tried it and I lost 70 pounds using the keto diet and just like some light exercise and of fasting and came off in a few months. So that was the other one. 00:12:07 Jeremy The other was, I've battled depression, my entire life on and off of various meds, to just to try and curb it. None of them really worked for me. And, it was shortly after my bike crash. I was on some opioids for awhile. I was like, man, these are really good. I stopped them after like a week. Cause like, that's, this is a dangerous road to go down. Once I stopped taking those, I also stopped taking my antidepressants and then it was new years. I had just a few beers, like nothing crazy woke up the next day with a hangover and just was like, I am just done. Like, those few beers were not worth feeling like this. Yeah. Literally quit drinking. That was like three or four years ago now. Those three things have been it, there's just, this really locked in mental shift where I just went, I'm done living that way. 00:12:57 Jeremy Yeah. I'm going to live this way now. Those three things I've never really looked back on it. 00:13:03 Roy Yeah. I'll just go, I can relate to the Amazon incident because I had almost a similar one. Usually they come in the afternoon or whatever. We've got a little, we live out in kind of in the wilderness. So, the front camera went off and usually it's like a varmint Armadillo or raccoon or something walking across. I just rolled over, pulled up my phone and looked at it and there was this car sitting up on the road that just won't leave. So, I did the same thing. I catapulted out of bed over both dogs and, threw the shoes and, slipped on the floor and getting out there. And it's just the Amazon grocery. I'm like doodles. She was at cigar in our space. Yeah. He delivers groceries at six 15 in the morning. Good. 00:13:47 Speaker 4 Tell him, I forgot to tell him, 00:13:51 Roy Talk about, Zach mentioned the opinions of others. It's funny because I being a heavy person, when I look at heavy people, I still have, negative thoughts. Let me just put it that way. I don't ever think about people thinking that of me. You really stop and think about, oh my gosh, all these, negative opinions that people form of me just because they see me without even meeting me or knowing anything. Now I try not to make the cover of the Walmart people page on Facebook with my pants riding down and my shirt riding up. I do try to stay, totally covered up, but still, I mean, it's just like people view you as inferior or damaged or something, but I don't know. I think about that, hasn't been enough to change me and it's like, I still continue on doing these things. So I don't know. 00:14:51 Roy It's I gotta give you credit for being able to, make that change based on that, because I, I wished I could, it's just been a struggle, finding that thing that works, that continues. We talked a little pre-show about, I'm good for about two, three months of something. If we have a little change in the way life goes, that, were doing good back in the fall and then Terry was gone for about three weeks. And, yeah. I mean, then McDonald's and taco bell became my best friend again, driving up there to try to get something to eat. Instead of cooking in, anyway, that's just kind of my downfall. It just be nice to be able to string together six months or a year, and stay on that path. Any advice either, but any, either one of y'all want to throw out about that would be very helpful. 00:15:41 Zach Yeah. A couple things. So, so the mindset of like what you look like. I do have like a quick, I I've, being 300 pounds, I have always felt as though I was a big guy. And internally it was never quite enough. Like I would push myself so hard to like work out, to eat, right. To the point where I made a comment to my wife one day that, man, I'm still, really fat. She was like, dude, you realize that I can see your abs. Right. She made me take a picture and like, they weren't showing much, but you could see a couple of ad muscles. That was a moment where like in our heads, we, for me anyway, even though I wasn't really that big anymore, I still felt really big. I was working really hard on it. Like that internal mindset of like what you think of yourself is really important to make sure that you're telling yourself the truth there. 00:16:43 Zach Yeah. To the other question of like, how do you stick to it? You can't like, that's what I had to accept. I would go really hard for three months and then I would cave. Yeah. I would beat myself up because I caved and cave more because I beat myself up and it would be this never ending spiral. Yeah. Three months later, two months of those games were gone. I've really just come back to like the 80, 20 principle. Right. If you're doing the right thing, 80% of the time, it's okay to go wild 20% of the time. Yeah. Right. Be okay with that and be forgiving of yourself for it. 00:17:24 Roy The thing is that forgiveness, because you're exactly right. When you fall off the, whatever you're doing, the Eaton exercise. I, I realized to lose weight exercise, doesn't need to be, we can lose weight with our diets 70, 80% or whatever, but I just feel like I've got to have that exercise. It makes me feel so much better. Anyway, it doesn't matter if we're talking about diet exercise or whatever. It's like, once you fall off, then it's like, so hard on ourself and then the non forgiveness. It's like, well, I've already, had three tacos that, what's it going to hurt? I'll have another, I'm going to go ahead and eat bad today. And then tomorrow I'm going to change. Tomorrow goodness knows something else pops up to where, it just, you just kicked that can down the road further and further. 00:18:12 Zach Yeah. I also try and schedule things too. So, ing that I'm going to be 80, 20. Saturday is kind of my, and I know some people don't like cheat days, but this is for me works well. Yeah. On Saturdays. I do whatever I want. If I don't want to work out, I don't work out if I want to eat pizza and ice cream, I do that. Having that schedule for me is actually really helpful knowing that I'm going to have a 20% of going crazy. Yeah. On Wednesday when I have that craving, I, I just go, okay, let's write that down. You're going to have that on Saturday. 00:18:44 Roy That's a, that's a good tip. I like that. 00:18:48 Terry Yeah. Just having something to look forward to that helps. 00:18:52 Zach The, the problem with that is once you do have the cheat day though. For me on Saturday, I eat all, I have sugar and junk food in your body. On Sunday you have that craving and it's really hard to go back into it. It doesn't work for everyone, but it definitely works for me. I wouldn't w it's something you can try. 00:19:13 Roy Yeah. Yeah. I've, I'm that way about, I guess I feel sometimes like I have to be totally strict and can't let anything creep back in because just that one, that little door opening for me, it's like back in the day when they came out with the little skinny blue, the skinny bell ice cream sandwiches, I mean, those are great. They're probably work well if you eat one, but if you eat a whole box of was like, well, that was so good. Let me just have six of them, and I, I'm an ice cream fanatic too. That's, that's even harder to give up something that is, and we talked about this with the, more of a psychologist type person the other day that, I, I grew up with ice cream, it was a reward. Now I'm thinking about that instead of thinking about, Hey, let's go celebrate and have some ice cream. 00:20:07 Roy It's like, what I get on my, TX band and do, 25 pull-ups or something, trying to shift that mindset from it's a reward because I think it goes back to the validation of us is that's what growing up, that was always the reward or the treat is, Hey, let's stop by dairy queen and let's go up to the soda fountain and, get a banana split. Trying to change that from, we can still celebrate, but now, it needs to be a physical activity, either walking or pulling up doing something to better ourselves, better myself. 00:20:43 Terry Jeremy, you were talking about, I think it was you, did you say something about meditating? 00:20:48 Jeremy Yeah. I was just going to say kind of two things on what you're talking about there. One I know for me, I look at my weight issues as being a symptom of my emotional and mental health issues. I can tell by looking in the mirror, how much I still have to do inside. That's where for me, the more I meditate, the more that I do that inner work, the more strength I find to do the outer work, to do the running and the other things. That's something to consider, but also, I just want to commend you on the fact that you're on this journey. This is an incredibly difficult journey at any stage of your life. Well, thanks. But, but especially in a system where the food that is affordable and accessible is garbage, right. Much of what you have available to eat goes against your body's biology and the way that it wants to live. 00:21:44 Jeremy You're fighting a losing game almost every bite fall. Yeah. To the amount of knowledge I think you need from a nutrition standpoint is almost impossible. I think for most people who have busy lives and things to do, because it's just not something that we're taught in a way that makes sense as we're growing up. Yeah. Every ad is, oh, you deserve a treat today. Well, yeah, maybe, but maybe I don't, maybe I'm just lazy. Maybe I just want to eat garbage because it tastes good. That makes me want to eat more garbage. So, so that, so two things there, but one, I think that is really important. I do think that for a lot of people, food is an escape from whatever the issue is that we're struggling with whatever trauma we haven't dealt with from our childhood, whatever. I do think that the, at least in my experience, and I'm no doctor here. 00:22:38 Jeremy The more work you can do, dealing with what's going on inside will allow you the freedom and the desire to do the work on the outside. And, and for me, exercise is almost never about building muscle or getting stronger, or it's all in my head. The more I run, the more I bike, the harder, the yoga, whatever I'm doing to push my body is all about releasing those demons. I like the phrase I like to say is a tired muscles, quiet, dark voices, because the more you quiet those dark voices, the more you're going to be able to live a healthier life and make better choices. 00:23:13 Roy Yeah. No, and I think you're right. And that's something, yeah. I don't think I've told, we haven't really put this out there, but what, cause I think back about that, what am I not dealing with because it's just been so difficult. I just I've become to feel that if I wasn't dealing with some, or wasn't not dealing with something, if there was an unresolved issue, it would be a lot easier. I'm actually going to do, I'll let Terry say the name of it. Reiki, Reiki. 00:23:41 Terry I did my first Reiki session yesterday. 00:23:43 Roy I'm actually going to go try that. And, supposedly it can, it's supposed to be a big release with some things like that. So, I'm open to try and I just I'm to the point in life that I want to make changes and it's, things that I've done in the past just haven't really worked or they've worked. Then, like Zach said, you put it all back on eventually. And, I'm just ready to make a change, for a lot of reasons, but just to get all this behind me. So. 00:24:13 Jeremy Something, something to consider too, like you seem to be aware of the pattern, right? Like it two or three months of this is going to fail me. Yeah. Just when you go with, go into whatever you're going into go, I'm going to do this for 30. I'm going to do this for a week next week. I'm going to do that. Like, just give yourself smaller goals because none of us is going to stick to anything for a year, for 10 years for it's never going to happen too much is going to get in the way. Right. Today I'm going to decide every day, this week I'm going to go walk for 20 minutes next week, it'll be 30 minutes, whatever it is, but just give yourself those smaller goals. And, and the more you pick up those little wins, the confidence grows and you'll be able to score those bigger wins. 00:24:53 Jeremy That's. 00:24:53 Roy Great. Yeah, that's a good point. Cause I, I am too much in that, looking at a year lifetime or whatever it is instead of just, and I guess this gets back to the meditation and being in the moment is just take it one meal at a time, one day at a time, one step at a time. It's, I guess it's the other part I, I have to say is it's daunting because it's like, the impatience is all I can say is like, it, and I think why haven't I been impatient, putting on all this weight now, all of a sudden I'm here now, all of a sudden I'm impatient. I want it all gone. It took years to get here. It may take years to get away from here, but it's like, nah, if I do something, if I go out and walk around the block, I expect to come home and I've lost 15 pounds or something ridiculous like that. 00:25:46 Roy It's like, it's just take it that whatever I did today, exercise wise and eating wise, that's better than what I was yesterday. I, trying to break it down into those smaller wins is definitely something I can work on for sure. Sure. 00:26:00 Terry Yep. Have you guys done a plant-based diet where we're delving into that now? Do much about the plant-based? 00:26:12 Zach I can I'll let Jeremy talk about that because I am a carnivore and I love hamburgers and steaks and chicken. Yeah. I've, I think I've tried it once or twice for maybe a couple of days. And, while I'm sure there are some health benefits and I'm sure there's some good stuff about it. I just really enjoy hamburgers. Jeremy's the vegetarian though, so we can talk. Okay. 00:26:39 Jeremy Yeah. I've been a vegetarian for probably 25. Oh, w w with occasional cheats here and there. Yeah. Any diet, if you don't do it right. You're, it's just a mess. When I talked to him nutrition at one point and a nutritionist at one point, and they said, I kind of told them what I, and oh, you're not a vegetarian, you're a cheese and bread. Attarian, you know, that's not good. You don't want to combine your fats and your carbs at every meal because that's how you put on a bunch of weight. So, definitely I think any diet, if you do it well, is going to beneficial. There are loads and loads of studies that will tell you a plant-based diet is ideal, is what you should be doing. I, I read a book, man, I'm forgetting the author's name now, but he broke down diet in a really simple way. 00:27:29 Jeremy It was just like, eat food, not too much. Oh, what's the third thing, eat food, not too much. Mostly plants. Yeah. His whole point was like, so much of what we eat is not food. Much of what is in the store is not food. If it didn't grow out of ground or it doesn't walk on the ground, it's not food. Right. You know, I, I eat impossible burgers. I eat, I eat tons of like processed fake meats to get my protein because I'm lazy. Because I have childhood associations with, a meat and potatoes diet. It's a lot of just a conditioning that I need to break. Yeah. But if you can eat, yeah. I mean the more locally sourced and plant-based, and not too much is a pretty simple rule to live by. And, and, I definitely think it's worth experimenting with, if you haven't before. 00:28:17 Jeremy Yeah. Why not? 00:28:18 Roy I haven't in a, we're in a touch second week and I feel like we're doing good and we're not the we'll never eat meat again. Were talking the other day that, if we get hungry for Terry was craving a hamburger yesterday and out so bad and she had to go somewhere and I'll say, well, just get it on, stop by and get one on your way. It's not a big deal. We, I think because the other part of this is, I think the more you tell yourself, oh, you can't have that. The more, psychologically, we want that. It's like, if we want to eat a piece of meat or a piece of fish, just we can do it, but I just, we're trying, I want to eliminate that because I've gotten to feel how, when I eat beef, especially it just, it seems to weigh me down and bother me the next day. 00:29:01 Roy Like being, not being able to get up. I'm telling you that, I felt my inflammation go down the last couple of weeks. Also just being, much, easier flinging up out of bed with a lot of energy and feeling good, but that's, and it's have to be fair to say, that's not meat or beef or proteins fault. That was, that was my fault for eating at the wrong time of day. That's another thing that we kind of have to talk about that goes with this. We, we're doing the intermittent fasting, so we try to eat our last meal by seven and then we're not eating again until lunch. So, I feel like that has been the most beneficial for me because I'm a late night eater. I mean, at nine, 10 o'clock at night, man, I am ready. Before, before Terri, man, I would physically get in the car at nine o'clock at night and go up to the convenience store and raid the shell, and get the little nacho bowl and a candy bar and come home and eat it. 00:30:03 Roy It's something about going to bed, feeling that full, satisfied feeling. Trying to get rid of that as well as the other thing, I kind of laugh about is that if I get just a little twinge of hunger and empty stomach, I'm like, oh my God, that must be death knocking at my door. It's like trying to realize that feeling can be there for a couple of hours and I'm probably not going to pass out or have any negative reactions. So, that's something that's like, oh my God, I felt my stomach. I need to run to the cabinet and load up on something, ? 00:30:36 Zach Yeah. I I've so I intermittent fast most days. It's actually 1:30 PM here right now and I still haven't eaten anything. I probably won't until dinner at this point, just because, if I eat now, my dinner will be ruined. Like my dad would always tell me. But interim and fasting is great. It really teaches you the difference between I am actually hungry for food and my body is craving food and I need the energy from food and the, I have something emotional or mental going on that I want to go crack the cupboard open for it. I did a three-day fast one day, or over three days, I did three days. That was such an eyeopening moment because you go three days without actually eating anything. All of these little, I had all these moments of like, I need to go eat Y oh, you're about to do a task that is hard or something that you want to avoid or remind you of something. 00:31:44 Zach I had all these connections in those three days of why I wanted to go eat. None of it had anything to do with being hungry. Yeah. It was, it was a really great realization. 00:31:54 Roy That's awesome. I had, I I'm going to brag. I don't usually do that. I want to brag about last night. I was sitting here putting together one of our episodes and, I don't even know why I just got up, went to the cabinet, opened it. This is like, what? Nine 30 got up, went to cabinet, opened it up and it was chickpeas. So, I mean, I have to say it wasn't like Doritos, but, and I had the chickpeas out. I had them open. I had my hand in it and I thought, why am I eating these it's nine o'clock I'm not supposed to be, I don't want to be doing this because it'll make me and I actually put them up without eating them. For me, that's a huge victory because what I have found too, is that it's that mindless eating, like in the past, I would have gone over there, ate a handful of, and then, 10 minutes, 15 minutes later realized, wow, I wish I hadn't had done that. 00:32:43 Roy Or, you know why I wasn't hungry. It was just a habit. So, yeah, I'm kind of fighting that too is just breaking these mindless habits of always feeling like you have to have something in your hand, eating. 00:32:56 Zach Good for you though. You were at the point of no return. That's, that's hard to turn back from that point. 00:33:01 Roy I, it was so bad. I actually had to wash my hands to get all the salt off of my hands. I have my eye on that board. 00:33:11 Jeremy We had a really interesting conversation. A few episodes ago. We had a guest, Dr. Judson brewer, and he's really big on curiosity when it comes to breaking habits and whether it's addiction, food, whatever. And we talked about that specific thing. Because I'm the same way I have to fast, otherwise I will nine, 10 o'clock whereas the popcorn where something quick that I can just emotionally escape into. Yep. And he raised the point. He's like, just get really curious every time that those urges come up, rather than giving into them, rather than fighting them, rather than like having this volatile relationship with that experience. Just get really curious with why do I need eat right now? What does is my body sending me a signal that says, Hey, you're really hungry. You're about to snap somebodies head off, or are you bored? Are you sad? Are you lonely? What's driving this? And it is amazing how just asking yourself, why am I doing this? Yeah. 00:34:05 Jeremy Will completely shift your relationship with that moment. To put you on a course, to your bigger, better offer, right? Your bigger, better goal. If I want to feel better, I don't want to wake up with this Dorito hangover. I want to wake up clear of brain fog and I'm ready to take on the day and know that I don't need to eat until lunch because I'm not going to starve. I'm in one of the wealthiest countries in the world, I'll find foods. 00:34:28 Roy And the paramedics are at nine one. They're just a moment. Give me an Abbey and get me back up on the ground. 00:34:34 Jeremy I don't think they delivered Doritos. I mean, 00:34:38 Speaker 4 By the way, I go by QT and grab us some bags and stuff. 00:34:43 Roy Yeah. That's a, I guess a good question for you guys since y'all have y'all are pretty far down the journey, but one of the, I was just thinking about the name of our podcast is feeding fatty. It was kind of a irreverent comment when were kicking it around, cause I just said, we're trying to find a name and all the healthy ones are taken or sound too corny. Were like, eh, and I just said, yeah, we could just say something like, feed no fatty. And Terry said, yeah, that's it. The other thing I think about is that, I think I will always be, even if I lost weight and got to where I want to be, I feel like I will always be that heavy person inside that I always have to be on guard. It's like, and I'm going to ask, I asked this question of both of y'all, but it's like being a reformed smoker, being a reformed drinker, because I, I'm not, I never smoked like packs and packs a day, but on the may have one and there are times that cigarettes, you smell very good. 00:35:41 Roy I mean, they smell good enough. You could eat one. It's like, I'm not even a, really a smoker, but that temptation is always there. I assume that temptation is always going to be here with food. That it's something you got to fight forever or not fight, but deal with put it that way. 00:35:57 Zach Yeah, it is. I mean, it's always there's a lake Jeremy side, right? There's a whole industry on making sure that, this kind of food is there and there's marketing and it's made to make you want more. The way I've tried to live is, for every, all of every single attempt that I've made at eating healthier, of it has stuck. The next time, a little more stuck and the next time, a little more stuck. Now if I look back like, I had to take my mom somewhere one day and she wanted to go to McDonald's and I had been eating healthy for three months and I ate McDonald's and I felt so bad after eating. McDonald's like brain fog. My body hurt my stomach was in doing somersaults. Even the next day for a couple of days, I didn't feel good. I thought about it and I was like, okay, I have now conditioned myself to feel good when I eat good food, because 20 years ago I would eat McDonald's for lunch and dinner and then I'd go back for a snack. 00:37:09 Zach And that was just normal for me. Every time I've tried to eat healthy, right. You're just going to reinforce it more and you're gonna fail and you're always going to go backwards. For me, that's always been the case like now where I'm at now, while I still don't feel like I eat as well as I want to eat, I look back at the journey and how much it's different. Yeah. How much change there is, how much the, those little steps have now added up to this one big thing where I'm doing pretty well. I could still grow, but I'm doing pretty well. Yeah. 00:37:46 Terry It's good that you realize that now because you guys, I mean, you guys have young kids too, right? I mean that you have to factor all a nine-year-old, you have to factor all of that in and what your world can handle and, how much time you have and trying to do, that's what we just get caught up in the rat race, going through the drive-through having everything right there at our Beck and call. That's gotta be an issue too. I mean, that part of it. 00:38:19 Zach Yeah. Well, and also, I mean, having my daughter watch what I'm doing, right. It's for me, it's, I watched my dad, I watched my mom and it was poor choices. Those poor choices lived on in me. It's my responsibility kind of break that chain and have my daughter watch me make good choices most of the time yeah. With the occasional reward. Right. And, and break that chain. She doesn't grow up in the same way, having those, poor relationship with food that I always have. Yeah. 00:38:51 Terry Right. Yeah. I always, I have, my daughters are almost 23 and 29 now. I used to tell them all the time, I thought you were only three. I'm sorry, did I say 29? And. 00:39:07 Speaker 4 Let the cat out of the bag there. But. 00:39:11 Terry I used to tell him all the time and I still do just joking with him, add that to your list for therapy, for your future therapist. That's just something that you're going to have to, but the more that you, like you were saying, your daughter's nine and she needs to emulate you and your decision, your choices. She doesn't have to deal with this later on in life. Yeah. 00:39:35 Zach Trust me that there's plenty of other things that I'm doing that is going to need therapy later. She has, she'll. 00:39:41 Terry Have a running list. It'll go forever. 00:39:44 Zach She has a little notebook that says notes to my future therapist. 00:39:48 Terry Get rid of that. I mean, gold frame it, 00:39:51 Roy Well, the other thing is that we're, I'm getting to a point of, we start thinking about outliving our wellness and that's a new little catch phrase that we like to use is because, you see so many people that they age and, either they're, they've got a good mind and their body's broken down or their body's broken down and they still have a good mind. A lot of times they have to have intensive care from somebody else, a family member typically. So, at some point I've started really thinking that one through that, I wish I'd have started earlier, but it's never too late to start, but the sooner we can think about what are we going to look like when we're 80, because, nowadays they talk about, diabetes or they talk about dementia being the, type three diabetes. And, just some of the things that we've done with the quality of food that we've eaten and, it really can lead to our detriment. 00:40:51 Roy Sometimes that doesn't manifest it until later in life. 00:40:56 Terry Yeah. The closer we get to that's something I, 00:40:59 Speaker 5 Go ahead. Sorry. 00:41:01 Jeremy Yeah. I was just gonna say that's something that I think about a lot too. I've had a number of friends and family members that have had cancer. I always think it's so interesting how we need that. You're going to die really soon unless you start eating this way. I, I constantly say to myself, get online and Google what your doctor would prescribe. If you were diagnosed with cancer and start eating that right now, it seems really simple. There there's an outline, there's a nutrition plan to fight off cancer, but we don't start eating it until we have it. Exactly. Yeah, I think about that a lot. I also think that is one of those things similar to, I'm going to do this for a year. I'm going to do this for five years. Yeah. If I worry about how I'm going to feel in 30 years, I'm going to lose sight of what do I need to do today. 00:41:48 Jeremy Right. To me, it's just, it's those baby steps. It's whatever I can do today to better than I was yesterday is ultimately what's going to happen when I'm 65, 70, whatever I look back and go, oh, those are the, I made good choices. I did the best I could with. 00:42:02 Terry I know. That's what we just got a, book is called how not to die. It's, it's amazing, but it talks about all those different diseases, the type two diabetes cancers, I just, all of that and inflammation. That's why we decided to kind of go with the plant-based diet because, we have different medications that we're on, for different things. And, just trying to get off of those, you can get off of those medications through eating properly, what you eat. Yeah. It's, that's what kinda turned us in toward that direction. Not only that, but, because it's better for us, not to pack on the pounds and all of that. I know we're. 00:42:49 Roy Way long on time guys, but just one more topic I want to bring up is the emotional, cause we're talking about this and, one, another big thing I'm struggling with and trying to break is that, not necessarily like the bad stuff, but more the good stuff, the celebrations, everything is emotional. Like let's go eat or get the ice cream for this to celebrate, but also, factoring in or thinking about it as fun because going out with friends and family and you go somewhere and you eat and chat and the reality is we can go eat healthy, but we can go chat with people and not have to eat bad or eat food that is not as good for us as we would like. So, not tell the story that it's very fresh in my mind that Terry was out running around one day and she called and said, I'm on my way home. 00:43:46 Roy Have you eaten yet? And I said, no. I started welling up inside, like, she's going to say, great. I am going by Chick-fil-A to pick up some stuff. She's like, okay, well there's some salmon and some asparagus in their fridge. I just felt like I just was deflated. It was like, all the air just blew out of me. That again, it's a realization that, oh my gosh, you have this, such an attachment to food that it, what does it matter? Does it really matter if I ate the chick? whatever I eat it's as long as it's healthy and it gives me fuel to live for another day. That's really all this is about, but these, all these attachments that we have, it's just crazy. 00:44:31 Jeremy I know in my case, all of that links back to sugar, the more processed and sugary foods I have, the way easier it is to cave into those cravings, to cave, to those emotions and eat that way. I, I mentioned that I, I started doing the keto diet after a conversation with Zach, and it wasn't even that strict about it. I just went, what, I'm just a hundred carbs a day. That's it, that's my limit. I'm going to experiment with that. And we'll play with that. See where that leads me. I got more and more strict about it. I started working out every day and doing these things. It was so funny because I distinctly remember this day at work when, one of my colleagues was eating wheat thins. She's just sitting at her desk snacking. I mean, she was like a cheerleader for the Houston rockets. 00:45:13 Jeremy I mean, she's in fantastic shape and she's, she can eat whatever she wants. She offered me some and I looked at it and it was as though she was offering me a hockey puck was just like, I was like, that's not even food. Like how, what you're asking by offering me this. I can't even process that as a question because you're not offering me food. My relationship just completely changed with those kinds of foods because I had just so completely cut them out of my diet and was feeling so good. They're just like, it just doesn't I two and three makes 47. That does not compute. I can't do that. Yeah. Just changing that relationship was huge for me. Again, not to be repetitive, but I just think that why question is going to be the thing that is a game changer for so many people. 00:45:58 Jeremy When I, when I still get an occasional, craving to drink a beer, I love beer. Love it, delicious, amazing. It just, it's, it causes more problems for me than solves them, although it does solve some problems, but by shifting to non-alcoholic beers, I've found some really high end ones that are really good, that tastes like real beer. You mentioned those reward systems and like the hanging out with friends, I can drink those and have that in my hand. It feels exactly the same as it used to only I don't wake up feeling horrible the next day. 00:46:36 Zach Yeah. Yeah. I quit drinking probably five years ago now, too. I was gonna use that as an example, the, it was a very hard thing to do because so of all of my friend group, I was the first one to do it. We would, but one of the things we would do is we would meet at a pub to socialize and everyone would drink beer. I was showing up and not drinking beer and I was drinking water. It was really hard to do, mentally, right. Because you go, you're going to drink beer. From the food perspective, I will say, I started earlier, like I did a three-day fast and having, that data of really reflective moments of why am I hungry? Yeah. Like what is the real here? I know I'm not going to die. What's the real reason. And, and just exploring that and being curious about it. 00:47:37 Zach I think that's one of the things that's really helped me is the curiosity, the, having the courage and to be brave enough to explore that stuff. Because if we're all being honest, right, you open some of those doors, you don't want to look inside. Right. Right. You want to keep them closed. If you can get to the underlying problem as to what's going on, right. It, it really does change your relationship now. Now my reward for, Hey, I want to go eat something. My reward is I go do 10 pushups. Yeah. I switched it. If I can. 00:48:14 Jeremy Jump on something that you said there too, even it doesn't always have to be the dark demon in the closet. I know for me, I have a habit of, during my Workday, I just, I need a reason to get up and walk away from the desk. Right. Way too often, it's go grab a snack from the kitchen. It used to be, go across and get another cup of coffee and a snack from the kitchen or whatever. When I get real honest with myself, the reason I'm doing it is I don't like my job. I don't want to be doing it. I won't come up with any distraction to go do something else, even if it's for two minutes. And even if it poisons my body. That's not like, dad beat me up when I was three. That's just, Hey, I got to find another job. 00:48:53 Jeremy This isn't working out for me. It doesn't always have to be basing your deepest, darkest fear. Sometimes it's just overcoming whatever the small hurdle is. That's right in front of your face. Yep. 00:49:02 Roy Yeah. Are feeling like we need that reason. Cause I'm the same way of being like, I love my job. Thank goodness. It's like, task within that, you bump up against something that's kind of hard. It makes you have to sit and think, and it, and I'm the same. It's like, well, let me just get up and get something to eat while I think about this. The other thing too is I, I feel so much better if I just go for a walk. If I take the same five minutes and go outside fresh air and clear my head, I would come back and even perform better, but not eating and then feeling bad. And, talking about the beer and stuff, around here going and having some, Mexican food with margarita and some chips, but I'm telling you, I could, and it's it, wasn't a hangover, but I could drink one margarita and the chips and I could feel so bad the next morning, till 10 or 11, o'clock it just a cloud and feel like, for my eyes and above was just full of congestion. 00:50:00 Roy So, that's been a big reason too, for me to switch some things and not to go do that it's because I just got where it makes me so feel so bad the next day, I'm trying to get out from under it got worse and worse. So. 00:50:14 Jeremy Yeah, your body's going to be the best teacher in all of this. It's just pay attention to attention or do whatever it is you're going to do and you'll make them, you'll make the right. 00:50:21 Roy Choices, guys. I know we're way late, for, we go couple things. What is a tool, a habit? What is something that each of y'all do in your daily life that adds a lot of value, professional, personal whichever. 00:50:34 Zach Yeah. I'll I want to make sure I mentioned something that I haven't mentioned already. So, as weird as this sounds journaling for me is a huge one. I, I currently use a, a product called the full focus planner and it's just a kind of a methodology and this planner that you get, your three biggest things done and you schedule everything. Within that, it's professionally personally, right? I schedule my healing time. I schedule my work time. I schedule my, family time and it allows me to be productive and not put things off, but it also gives me a place to get my thoughts out on a piece of paper, because once they're out of my head, they're not circling around and causing trouble anymore. So, and that's my own preference for journal. If you can just spend a couple of minutes with yourself, getting the thoughts out of your head onto a piece of paper and being grateful about it can make a huge difference in your day. 00:51:38 Zach Yeah. Way to steal my thunder. And you. 00:51:43 Jeremy Say the exact same thing. It is, it is so important to me every day to envision the day, the way I want it to go so that when I do run up against those roadblocks and I feel like, oh, I need to, I want to eat something I want to not work out. I can look back and go, well, what did I tell myself this morning when I was alert and on it and focused and knew what I wanted out of life. Yeah. And I can refer to that. It just, it really keeps me accountable. Most days what I wrote down doesn't happen. A lot of it does some of it doesn't and that's again, getting comfortable with forgiving yourself and forgiving that you cannot create the exact life you want, but you can work toward goals. Yeah. The closer you get, the better you'll feel along the way. 00:52:22 Jeremy Then, just in terms of something practical or something more tangible that I have been experimenting with, and it might sound weird, and this is probably a whole other show, but I've been experimenting with using CBD as a supplement. Because I've found that it really regulates my anxiety and my depression and it helps keep me focused. It's not, there's no high, there's no psychotropic effect. It's just, it's about balancing your body's natural cannabinoids to a level that is the way they should be functioning. That so many things like diet and busy lifestyle center. 00:52:56 Roy Okay. Interesting. Yeah. The journaling, it's funny, you mentioned it cause I've been trying to do better about that. And, not just, and I think like Zach said, it seemed like just getting things. It doesn't have to be stressed, but just getting the thought, whatever it is out of my head and on paper, it just seems to be so much more clearing. I feel like I've had a clear head, doing it too. So, 00:53:21 Terry Yeah. Zach, you mentioned it's so important to be grateful. I mean, just think about being grateful at least a few times a day, that has a lot to do with it. Just keeping that positivity go. And. 00:53:37 Roy Most of us have a lot to be thinking, yes, we do. You get it instead of getting focused on that negative and why we want to go eat, because let's focus on the, all the great things. We talk a lot about that too. If you've got 10 things on your to-do list today, and you get seven done, we beat herself up over these other three, but let's celebrate the seven. Let's take away some gratitude. Anyway, I'm going to wrap it up. You guys, we could talk for a lot longer, so thank y'all tell everybody course, how can they reach out and get a hold of, y'all talk about your, podcast. 00:54:10 Jeremy Yeah, the website is, thefitmess.com and we're fitness guys on pretty much every social platform that's out. 00:54:17 Roy There. Okay. All right. Well, great. Well, thanks so much. I can find us, of course, at www.feedingfatty.com. We're on all the major social media platforms, as well as the major podcast platforms, iTunes, Stitcher, Google, Spotify. If we're not one that you listened to, please reach out, be glad to get it added. So till next time I'm rolling. I'm Terry take care of yourself and take care of your family. Thanks guys. Thank you. Thank you. www.thefitmess.com www.feedingfatty.com

god tv love amazon spotify google mindset change meditation speaker seattle ny mexican mcdonald walmart tx acting stitcher cbd surgery new york yankees workout reiki rangers beck losses chick albany craigslist mariners doritos workday eaton small steps sox punished qt add up mindshift dorito armadillo excercise change for the better fit mess jeremy you jeremy it about jeremy zach yeah

Episode 6 with Jeremy Grant

Hello, User

Play Episode Listen Later Jun 7, 2021 37:28

Description: Today we welcome to the podcast Jeremy Grant of the Better Identity Coalition. Jeremy's roots in identity go back all the way to his college days at University of Michigan —one of the earliest adopters in the higher education space of the Smart Card. He has been involved since the early days of government internet regulation working with Bill Clinton, Virginia senator Chuck Robb, the Obama administration, the Department of Defense, NIST, and doing various legislative work as a staffer. In addition to being the Coordinator for the Better Identity Coalition, Jeremy now works for DC-based law firm Venable as well as a consultant to clients in several sectors. His focus is primarily in financial services, healthcare, IT, and recently in unemployment fraud due to billions of dollars lost during the pandemic. Jeremy dives into the specifics of why digital identity and cybersecurity are national issues that the private sectors simply cannot tackle on their own. Richard and Jeremy share their sentiments on creating a centralized and holistic approach to protecting and regulating identity in the United States. Key Takeaways: [2:26] Jeremy's background and how he came to be working in the identity space. [7:57] What is Jeremy's day-to-day work like? [11:29] Where is DC's attention in terms of identity? [13:45] Jeremy's thoughts on the landscape of digital identity.[18:30] What is the tipping point for when action takes place in legislation? [22:15] How much can any private organization do to push the needle for digital identity and cybersecurity? [29:56] What does Jeremy see for state-based privacy initiatives? Quotes: ●“People tell me all the time I have a dream job.” —Richard ●“I do have a great job. I really love it.” —Jeremy ●“How does stuff get done in DC? I mean you've got a mix of things in the executive branch and the legislative branch in and around identity where there's just a ton of attention these days.” —Jeremy ●“We've now seen bipartisan legislation introduced through the Digital Identity Act that's getting a lot more attention partially this past year with what we've been going through with the pandemic and digital becoming that much more important.” —Jeremy ●“A lot of regulators are just interested in getting perspective on how we look at identity.” —Jeremy ●“People are being recognized for these transformative changes and a lot of this stuff is being driven off of identity.” —Richard ●“How do we solve this nationally at a digital level in a way that might actually work? You can't just solve it in UI, and you can't just solve it in banking, and you can't just solve it in-house. You need a more holistic approach where people can do this everywhere.” —Jeremy ●“It is a challenge at times to get people to sort of take things up a level and look at this more of a national priority as opposed to a sector priority.” —Jeremy ●“At the end of the day when it comes to the identity proofing side of things, the government is the only authoritative assurer of identity. And everybody in the industry who sells an identity verification product is trying to guess what only the government knows.” —Jeremy ●“You cannot solve this without government help.” —Jeremy ●“It almost seems like we have elevated the digital capabilities of everything in this world except for identity. It's like we left identity behind.” —Richard ●“One of the things that I've found challenging in my digital identity evangelizing and proselytizing is this absence of a national data privacy standard here in the United States.” —Richard ●“If you keep protecting the stuff but not protecting the people, all I gotta do is be you and I get your stuff.” —Richard ●“Industry has come around to the idea that we need a national privacy law probably five years too late.” —Jeremy ●“Doing nothing is also a policy choice.” —Jeremy Mentioned in This Episode: The Better Identity Coalition Indentiverse NIST Venable LLP

united states university technology michigan dc barack obama defense coordinators bill clinton key takeaways ui nist venable ping identity description today jeremy you jeremy it

Episode #104: The Rise of Data Services with Patrick McFadin

Play Episode Listen Later Jun 7, 2021 49:06

About Patrick McFadinPatrick McFadin is the VP of Developer Relations at DataStax, where he leads a team devoted to making users of Apache Cassandra successful. He has also worked as Chief Evangelist for Apache Cassandra and consultant for DataStax, where he helped build some of the largest and exciting deployments in production. Previous to DataStax, he was Chief Architect at Hobsons and an Oracle DBA/Developer for over 15 years.Twitter: @PatrickMcFadinLinkedIn: Patrick McFadin DataStax website: datastax.comK8ssandra: k8ssandra.ioStargate: stargate.ioDataStax Astra: Cassandra-as-a-ServiceWatch this episode on YouTube: https://youtu.be/-BcIL3VlrjEThis episode sponsored by CBT Nuggets and Fauna.TranscriptJeremy: Hi everyone, I'm Jeremy Daly and this is Serverless Chats. Today I'm chatting with Patrick McFadin. Hey Patrick, thanks for joining me.Patrick: Hi Jeremy. How are you doing today?Jeremy: I am doing really well. So you are the VP of Developer Relations at DataStax, so I'd love it if you could tell the listeners a little bit about yourself and what DataStax is all about.Patrick: Sure. Well, I mean mostly I'm just a nerd with a cool job. I get to talk about technology a lot and work with technology. So DataStax, we're a company that was founded around Apache Cassandra, just supporting and making it awesome. And that's really where I came to the company. I've been working with Apache Cassandra for about 10 years now. I've been a part of the project as a contributor.But yeah, I mean mostly data infrastructure has been my life for most of my career. I did this in the dotcom era, back when it was really crazy when we had dozens of users. And when that washed out, I'm like, oh, then real scale started and during that period of time I worked a lot in just trying to scale infrastructure. It seems like that's been what I've been doing for like 30 years it seems like, 20 years, 20 years, I'm not that old. Yeah. But yeah, right now, I spend a lot of my time just working with developers on what's next in Kubernetes and I'm part of CNCF now, so yeah. I just can't to seem to stay in one place.Jeremy: Well, so I'm super interested in the work that DataStax is doing because I have had the pleasure/misfortune of managing a Cassandra ring for a start-up that I was at. And it was a very painful process, but once it was set up and it was running, it wasn't too, too bad. I mean, we always had some issues here and there, but this idea of taking a really good database, because Cassandra's great, it's an excellent data store, but managing it is a nightmare and finding people who can manage it is sort of a nightmare, and all that kind of stuff. And so this idea of taking these services and DataStax isn't the only one to do this, but to take these open-source services and turn them into these hosted solutions is pretty fantastic. So can you tell me a little bit more, though? What this shift is about? This moving away from hosting your own databases to using databases as a service?Patrick: Yeah. Well, you touched on something important. You want to take that power, I mean Cassandra was a database that was built in the scale world. It was built to solve a problem, but it was also built by engineers who really loved distributed computing, like myself, and it's funny you say like, "Oh, once I got it running, it was great," well, that's kind of the experience with most distributed databases, is it's hard to reason around having, "Oh, I have 100 mouths to feed now. And if one of them goes nuts, then I have to figure it out."But it's the power, that power, it's like stealing fire from the gods, right? It's like, "Oh, we could take the technology that Netflix and Apple and Facebook use and use it in our own stuff." But you got to pay the price, the gods demand their payment. And that's something that we've been really trying to tackle at DataStax for a couple of years now, actually three, which is how ... Because the era of running your own database is coming to an end. You should not run your own database. And my philosophy as a technologist is that proper, really important technology like your data layer should just fade into the background and it's just something you use, it's not something you have to reason through very much.There's lots of technology that's like that today. How many times have you ... When was the last time you managed your own memory in your code?Jeremy: Right. Right. Good point. I know.Patrick: Thank god, huh?Jeremy: Exactly.Patrick: Whew.Jeremy: But I think that you make a really good point, because you do have these larger companies like Facebook or whatever that are using these technologies and you mentioned data layers, which I don't think I've worked for a single company, I don't think I actually ... I founded a start-up one time and we built a data layer as well, because it's like, the complexity of understanding the transaction models and the routing, especially if you're doing things like sharding and all kinds of crazy stuff like that, hiding that complexity from your developers so that they can just say, "I need to get this piece of information," or, "I need to set this piece of information," is really powerful.But then you get stuck with these data layers that are bespoke and they're generally fragile and things like that, so how is that you can take data as a service and maybe get rid of some of that, I don't know, some of that liability I guess?Patrick: Yeah. It's funny because you were talking about sharding and things like that. These are things that we force on developers to reason through, and it's just cognitive load. I have an app to get out, and I have some business desire to get this application online, the last thing I need to worry about is my sharding algorithm. Jeremy, friends don't let friends shard.Jeremy: Right. That's right. That's a good point.Patrick: But yeah, I mean I think we actually have all the parts that we need and it's just about, this is closer than you think. Look at where we've already started going, and that is with APIs, using REST. Now GraphQL, which I think is deserving its hotness, is starting to bring together some things that are really important for this kind of world we want to live in. GraphQL is uni-fettering data and collecting and actual queries, it's a QL, and why they call it Graph, I have no idea. But it gives you this ability to have this more abstract layer.I think GraphQL will, here's a prediction is that it's going to be like the SQL of working with data services on the internet and for cloud-native applications. And so what does that mean? Well, that means I just have to know, well, I need some data and I don't really care what's underneath it. I don't care if I have this field indexed or anything like that. And that's pretty exciting to me because then we're writing apps at that point.Jeremy: Right. Yeah. And actually, that's one of the things I really like about GraphQL too is just this idea that it's almost like a universal data access layer in a sense because it does, you still have to know it, you have to know what you're requesting if you're an end developer, but it makes it easier to request the things that you need and have those mutations set and have some of those other things standardized across the company, but in a common format because isn't that another problem? Where it's like, I'm working with company A and I move to company B maybe and now company B is using a different technology and a different bespoke data layer and some of these other things.So, I think data as a service for one, maybe with GraphQL in front of it is a great way to have this alignment across companies, or I guess, just makes it easier for developers to switch and start developing right away when they move into a new company.Patrick: Yeah, and this is a concept I've been trying to push pretty hard and it's driven by some conversations I've had with some friends that they're engineering leaders and they have this common desire. We want to have a zero day dev, which is the first day that someone starts, they should be producing production code. And I don't think that's crazy talk, we can do this, but there's a lot of things that are in front of it. And the database is one of them. I think that's one of the first things you do when you show up at company X is like, "Okay, what database are you using? What flavor of SQL or GRPC or CQL, Cassandra query language? What's the data model? Quick, where's that big diagram on the wall with my ERD? I got to go look at that for a while."Jeremy: How poorly did you structure your Git repositories? Yeah.Patrick: Yeah, exactly. It's like all these things. And no, I would love to see a world where the most troublesome part of your first day is figuring out where the coffee and the bathroom are, and then the rest of it is just total, "Hey, I can do this. This is what I get paid to do."Jeremy: Right. Yeah. So that idea of zero day developer, I love that idea and I know other companies are trying to do that, but what enables that? Is it getting the idea of having to understand something bespoke? Is it getting that off of the table? Or not having to deal with the low-level database aspect of things? I mean because APIs, I had this conversation with Rob Sutter, actually, a couple weeks ago. And we were talking about the API economy and how everything is moving towards APIs. And even data, it was around data as well.So, is that the interface, you think, of the future that just says, "Look, trying to interface directly with a database or trying to work with some other layer of abstraction just doesn't make sense, let's just go straight from code right to the data, with a very simple API interface?"Patrick: Yeah, I think so. And it's this idea of data services because if you think of if you're doing React, or something like a front-end code, I don't want to have a driver. Drivers are a total impediment. It's like, driver hell can be difficult at large organizations, getting the matching right. Oh, we're using this database so you have to use this driver. And if you don't, you are now rejected at the gate. So it's using HTTP protocols, but it's also things like when you're using React or Angular, View, whatever you're using on the front-end, you have direct access.But most times what you're needing is just a collection or an object. And so just do a get, "I need this thing right now. I'm doing a pick list. I need your collection." I don't need a complicated setup and spend the first three days figuring out which driver I'm using and make sure my Gradle file is just perfect. Yeah. So, I think that's it.Jeremy: Yeah. No, I'd be curious how you feel about ORMs, or O-R-Ms, certainly for relational databases, I know a lot of people love them. I can't stand them. I think it adds a layer of abstraction and just more complexity where I just want access to the database. I want to write the query myself, and as soon as you start adding in all this extra stuff on top of it to try to make it easier, I don't know, it just seems to mess it up for me.Patrick: All right. So yeah, I think we have an accord. I am really not a fan of ORMs at all. And I mean this goes back to Hibernate. Everyone's like, "Oh, Hibernate's going to be the end of databases." No, it's not. Oh yeah, it was the end of the database at the other side because it would create these ridiculous queries. It's like, why is every query a full table scan?Jeremy: Exactly.Patrick: Because that's the way Hibernate wanted it. Yeah. I actually banned Hibernate at one company I was working at. I was Chief Architect there and I just said, "Don't ever put Hibernate in our production." Because I had more meetings about what it was doing wrong than what it was doing right.Jeremy: Right. Right. Yeah. No, that's sounds, yeah.Patrick: Is that a long answer? Like, no.Jeremy: No, I've had the same experience where certain ORMs you're just like, no. Certain things, you can't do this because it's going to one, I think it locks you in in a sense, I mean there's all kind of lock-in in the cloud, and if you're using a data service or an API or you're using something native in AWS, or IBM Cloud, you're still going to be locked in in some way, but I do feel like whenever you start going down that path of building custom things, or forcing developers to get really low level, that just builds up all kinds of tech debt, right? That you eventually are going to have to work down.Patrick: Well, it's organizational inertia. When you start getting into this, when you start using annotations in Hibernate where you're just cutting through all the layers and now you're way down in the weeds, try to move that. There's a couple of companies that I've worked with now that are looking at the true reality of portability in their data stores. Like, "Oh, we want to move from one to a different, from a key value to a document without developers knowing." Well, how do you get to that point?Jeremy: Right. Yeah.Patrick: And it's just, that's not giving access to those things, first of all, but this is that tech debt that's going to get in your way. We're really good, technologists, we're really good at just wracking up the charges on our tech debt credit card, especially whenever we're trying to get things out the door quickly. And I think that's actually one of the problems that we all face. I mean, I don't think I've ever talked to a developer who was ahead of schedule and didn't have somebody breathing down their neck.Jeremy: Very true.Patrick: You take shortcuts. You're like, "We've got to shift this code this week. Skip the annotations and go straight into the database and get the data you need." Or something. You start making trade-offs real fast.Jeremy: What can we hard code that will just get us past.Patrick: Yeah. Is it green? Shift it. Yeah.Jeremy: Yeah, no, I totally, totally agree. All right. So let's talk a little bit more about, I guess, skillsets and things like that. Because there are so many different databases out there. Cassandra is just one and if you're a developer working just at the driver level, I guess, with something like Cassandra, it's not horrible to work with. It's relatively easy once a lot of these things are set up for you.Same is true of MongoBD, or I mean, DynamoDB, or any of these other ones where the interface to it isn't overly difficult, but there's always some sort of something you want to build on top of it to make it a little bit easier. But I'm just curious, in terms of learning these different things and switching between organizations and so forth, there is a cognitive load going from saying, "I'm working on Cassandra," to going to saying, "I'm working on DynamoDB," or something like that. There's going to be a shift in understanding of how the data can be brought back, what the limitations are, just a whole bunch of things that you kind of have to think about. And that's not even including managing the actual thing. That's a whole other thing.So, hiring people, I guess, or hiring developers, how much do we want developers to know? Are you on board with me where it's like, I mean I like understanding how Cassandra works and I like understanding how DynamoDB works, and I like knowing the limits, but I also don't want to think about them when I'm writing code.Patrick: Yeah. Well, it's interesting because Cassandra, one of the things I really loved about Cassandra initially was just how it works. As a computer scientist, I was like, "This is really neat." I mean, my degree field is in distributed computing, so of course, I'm going to nerd out.Jeremy: There you go.Patrick: But that doesn't mean that it doesn't have mass appeal because it's doing the thing that people want. And I think that's going to be the challenge of any properly built service layer. I think I've mentioned to you before we started this, I work on a project called Stargate. And Stargate is a project that is meant to build a data layer on top of databases. And right now it's with Cassandra. And it's abstracting away some of the harder to understand or reason things.For instance, with distributed computing, we're trying to reduce the reliance on coordination. There is a great article about this by Pat Helland about how coordination is the last really expensive thing that we have in development. Memory, CPU, super cheap. I can rent that all day long. Coordination is really, really hard, and I don't expect a new programmer to understand, to reason through coordination problems. "Oh, yeah, the just in time race conditions," and things like that.And I think that's where distributed computing, it's super powerful, but then whenever people see what eventual consistency are, they freak out and they're like, "I just want my SQL Lite on my laptop. It's very safe." But that's not going to get you there. That's not a global database, it's not going to be able to take you to a billion users. Come on, don't cut ...Jeremy: Maybe you don't need to be.Patrick: ... your apps short Jeremy. You're going to have a billion users.Jeremy: You should strive for it, at least, is how I feel about it. So that's, I guess, the point I was trying to get to is that if the developers are the ones that you don't want learning some of this stuff, and there's ways to abstract it away again, going like we talked about data as a service and APIs and so forth. And I think that's where I would love to see things shifting. And as you said earlier, that's probably where things are going.But if you did want to run your own database cluster, and you wanted to do this on your own, I mean you have to hire people that know how to do this stuff. And the more I see the market heating up for this type of person, there is very, very few specialists out there that are probably available. So how would you even hire somebody to run your Cassandra ring? They probably all work at DataStax.Patrick: No, not all of them. There's a few that work at Target and FedEx, Apple, the biggest Cassandra users in the world. Huawei. We just found out lately that Huawei now has the biggest cluster on the planet. Yeah. They just showed up at ApacheCon and said, "Oh yeah, hold my beer." But I mean, you're right, it's a specialized skillset and one of the things we're doing at DataStax, we feel, yeah, you should just rent that. And so we have Astra, which is our database as a service.It's fully compatible with open-source Cassandra. If you don't like it, you can just take it over and use open-source. But we agree and we actually can run Cassandra cheaper than you can, and it's just because we can do it at scale. And right now Astra, the way we run it is truly serverless, you only pay for what you need, and that's something that we're bringing to the open-source side of Cassandra as well, but we're getting Cassandra closer to Kubernetes internally.So if you don't want to think about Kubernetes, if you don't want to think about all that stuff, you can just rent it from us, or you could just go use it in open-source, either way. But you're right. I mean, it should not be a 2020s skillset is, "Get better at running Cassandra." I think those days should be, leave it to, if you want to go work at DataStax and run Cassandra, great, we're hiring right now, you will love it. You don't have to. Yeah.Jeremy: So the idea of it being open-source, so again, I'm not a huge fan of this idea of vendor lock-in. I think if you want to run on AWS Lambda, yeah, most of what you can do can only run on AWS Lambda, but changing the compute, switching that over to Azure or switching that over to GCP or something like that, the compute itself is probably not that hard to move, right? I think especially depending on what you're doing, setting up an entire Kubernetes cluster just to run a few functions is probably not worth it. I mean, obviously, if you've got a much bigger implementation, that's a little different.But with data, data is just locked in. No matter where you go, it is very hard to move a lot of data. So even with the open-source flair that you have there, do you still see a worry about lock in from a data side?Patrick: Yeah. And it's becoming more of a concern with larger companies too, because options, #options. There was a pretty famous story a few years ago where the CEO of Target said, "I am not paying Amazon any more money," and they just picked up shop and moved from AWS to Google Cloud. And the CEO made a technical decision. It was like everybody downstream had to deal with that. And I think that luckily Target's a huge Cassandra shop and they were just like, "Okay, we'll just move it over there."But the thing is that you're right, I mean, and I love talking about this because back when cloud was first starting and I was talking about it and thinking about it, just what do the clouds promise you? Oh, you get commodity scale of CPU and network and storage. And that's what they want to sell you because that what they're building. Those big buildings in north Virginia, they are full of compute network and storage, but the thing they know they need to hook you in and the way that they're hooking you in, there's some services that are really handy, they're great, but really the hook is the data.Once you get into the database, the bespoke database for the cloud, one of the features of that database is it will not connect to any other database outside of that cloud, and they know that. I mean, and this is why I really strongly am starting to advocate this idea of this move towards data on Kubernetes is a way where open-source gets to take back the cloud. Because now we're deploying these virtual data centers and using open-source technology to create this portability. So we can use the compute network and storage, a Google, Amazon, Azure, OnPrem wherever, doesn't matter.But you need to think of like, "All right. How is that going to work?" And that's why we're like, "If you rent your Cassandra from DataStax with Astra, you can also use the open-source Cassandra as well." And if we aren't keeping you happy, you should feel totally fine with moving it to an open-source workload. And we're good with that. One way or the other, we would love for you to use a database that works for you.Jeremy: Right. And so this Stargate project that you're working on, is that the one that allows you to basically route to multiple databases?Patrick: That's the dream. Right now it just does Cassandra, but there's been some really interesting ... There's some folks coming out of the woodwork that really want to bring their database technology to Stargate. And that's what I'm encouraged by. It's an open-source project, Stargate.io, and you can contribute any of the connectors for underlying data store, but if we're using GraphQL, if you're using GRPC, if you're using REST, the underlying data store is really somewhat irrelevant in that case. You're just doing gets and puts, or gets and sets. Gets and puts, yeah, that's right. Gets, sets, puts, it's a lot of words.Jeremy: Whatever words. Yeah. Exactly.Patrick: That's what I love about standard, Jeremy, there's so many to pick from.Jeremy: Right, because there are ... Exactly, which standard do you choose? Yeah. So, because that's an interesting thing for me too, is just this idea of, I mean, it would be great to live in a perfect little cloud where you could say like, "Oh, well AWS has all the services I need. And I can just keep all my stuff there, whatever." But best of breed services, or again, the cost of hosting something in AWS maybe if you're hosting a Cassandra cluster there, versus maybe hosting it in GCP or maybe hosting it with you, you said you could host it cheaper than those could, or that we could host it ourselves.And so I do think that there is ... and again, we've had this conversation about multi-cloud and things like that where it's not about agnostic, it's not about being cloud agnostic, it's about using the best of breed for any service that you want to use. And APIs seem to be the way to get you there. So I love this idea of the Stargate project because it just seems like that's the way where it could be that standard across all these different clouds and onto all these different databases, well I mean, right now Cassandra, but eventually these other ones. I don't know, that seems like a pretty powerful project to me.Patrick: Well, the time has come. It's cloud native ... I work a lot with CNCF and cloud-native data is a kind of emerging topic. It's so emerging that I'm actually in the middle of writing a book, an O'Reilly book on it. So, yeah. Surprise. I just dropped it. This just in.Yeah, because I can see that this is going to be the future, but when we build cloud-native, cloud applications, cloud-native applications, we want scale, we want elasticity, and we want self-healing. Those are the three cloud-native things that we want. And that doesn't give us a whole lot ... So if I want to crank out a quick REACT app, that's what I'm going to use. And Netlify's a great example, or Vercel, they're creating this abstraction layer. But Netlify and Vercel are both working, they've been partnering with us on the Stargate project, because they're seeing like, "Okay, we want to have that very light touch, developers just come in and use it," in building cloud-native applications.And whenever you're building your application, you're just paying for what you use. And I think that's really key, not spinning up a bunch of infrastructure that you get a monthly bill for. And that bill can be expensive.Jeremy: It seems crazy. Doesn't it seem crazy nowadays? Actually provisioning an EC2 instance and paying for it to run even if it does nothing. That seems crazy to me.Patrick: There are start-ups around the idea of finding the instance that's running that's causing you money that you're not using.Jeremy: Which is crazy, isn't it? It's crazy. All right. So let's go a little bit more into standards, because you mentioned standards. So there are standards now for a lot of things, and again, GraphQL being a great example, I think. But also from a database perspective, looking at things like TSQL and developers come into an organization and they're familiar with MySQL, or they're familiar with PostgreSQL, whatever it is. Or maybe they're familiar with Cassandra or something like that, but I think most people, at least from what I've seen, have been very, very comfortable with the TSQL approach to getting data. So, how do you bring developers in and start teaching them or getting them to understand more of that NoSQL feel?Patrick: I think it's already happened, it's just the translation hasn't happened in a lot of minds. When you go to build an application, you're designing your application around the workflows your application's going to have. You're always thinking about like, "I click on this. I go there." I mean, this is where we wireframe out the application. At that point, your database is now involved and I don't think a lot of folks know that.It's like, at every point you need to put data or get data. And I think this is where we've taught could be anybody building applications, which makes it really difficult to be like, "No, no, no, start with your data domain first and build out all those models. And then you write your application to go against those models." And I'll tell you, I've been involved in a few of these application boot camps, like JavaScript boot camps and things, they don't go into data modeling. It's just not a part of it.Jeremy: Really?Patrick: And I think this is that thing where we have to acknowledge like, "Yeah, we don't really need that anymore as much, because we're just building applications." If I build a React app, and I have a form and I'm managing the authentication and I click a button and then I get a profile information, I just described every database interaction that I need and the objects that I need. And I'm going to put my user profile at some point, I'm going to click my ID and get that profile back as an object. Those are the interactions that I need. At no point did I say, "And then I'm going to write select from where." No, I just need to get that data.Jeremy: And I love thinking about data as objects anyways. It makes more sense, rather than rows of spreadsheets essentially that you join together, describing an object even if it's got nested data, like a document form or things like that, I think makes a ton of sense. But is SQL, is it still relevant do you think? I mean, in the world we're moving into? Should I be teaching my daughters how to write TSQL? Or would I be wasting my time?Patrick: Yeah. Well, yes and no. Depends on what your kid's doing. I think that SQL will go to where it originally started and where it will eventually end, which is in data engineering and data science. And I mean, I still use SQL every once in a while, Bigtable, that sort of thing, for exploring my data. I mean for an analytics career or reporting data and things like that, SQL is very expressive. I don't see any reason to change that. But this is a guy who's been writing SQL for a million years.But I mean, that world is still really moving. I mean, like a Presto and Snowflake and all these, Redshift, they all use Bigtable, they all use SQL to express the reporting capabilities. But ... And I think this is how you and I got sucked into this is like, well that was the database that we had, so we started using reporting languages to build applications. And how'd that work out?Jeremy: Yeah. Well, it certainly didn't scale very well, I can tell you that, going back to sharding, because that is always something that was very hard to do. So I guess, I get the point that essentially if you're going to be in the data sciences and you actually need to analyze that data and maybe you do need to do joins, or maybe you need to work with big data in a way, that's a specialized aspect of it and I think people could dabble in that if they were just regular developers and they didn't want to go too deep.But it sounds like the bigger, or the end goal here, maybe altruistic, is to just give people access to data. So even if they don't know SQL or they don't know something complex, just make it so that whatever data is there that anybody, with whatever level is, they can consume it.Patrick: Yeah. And move fast with the thing that you're building. Actually, I use a Facebook term, but Facebook does do this. Internally there's a system called Occhio that provides gets and puts for your data, but it abstracts things like geographics and things like that. But the companies that are trying to move quickly, they understood this a long time ago. If you have to reason through, "Am I doing a full table scan? Is that an efficient interjoin?" If you have to reason through that, you're not moving fast anymore.Jeremy: Right. Right. All right. Cool. All right, so let's talk about Astra a little bit more and this whole idea of, because Astra is the serverless version, the hosted version, the serverless version of Cassandra, right? Through DataStax?Patrick: Right. And ...Jeremy: Did I get that right?Patrick: You got it right. And so it gives you full access. You could do Port 9042 if you still want to use a driver, but it gives you access via GraphQL, REST, and there's also a document API. So if you just want to persist your JavaScript API or JavaScript and then pull it back out your JSON, it does full documents. So it emulates what a MongoDB or DocumenDB does. But the important thing, and this is the somewhat revolutionary side of this, and again, this is something that we're looking to put into open-source, is the serverless nature of it.You only pay for what you use. And when you want to create a Cassandra database, we don't even call it a Cassandra database on the Astra panel anymore. We just create a database. You give it a name. You click. And it's ready. And it will scale infinitely. As long as we can find some compute and network for you to use somewhere, it'll just keep scaling and that's kind of that true portion of serverless that we're really trying to make happen. And for me, that's exciting because finally, all that power that I feel like I've been hoarding for a long time is now available for so many more people.And then if you do a million writes per second for 10 minutes and then you turn it off, you only pay for that little short amount of time. And it scales back. You're not paying a persistent charge forever.Jeremy: I'm just curious from a technical implementation, because I'm thinking about PTSD or nightmares back of my days running Cassandra, and so I'm just trying to think how this works. Is it a shared tenancy model? Or is there a way to do single tenancy if you wanted that as a service?Patrick: Under the covers, yes, it is multi-tenant, but the way that we are created ... so we had to do some really interesting engineering inside. So my RCO's going to kill me if I talk about this, but hey, you know what, Jeremy? We're friends, we can do this. He's like, "Don't talk about the underlying architecture." I'm talking about the underlying architecture. The thing that we did was we took Cassandra and we decomposed it into microservices mostly. That's probably, it's still Cassandra, it's just how we run it makes it way more amenable to doing multi-tenant and scale in that fashion where the queries are separated from the storage and things that are running in the background, like if you're familiar with Cassandra because it's a log structure storage, you ask to do compactions and things like that, all that's just kind of on the side. It doesn't impact your query.But it gives us the ability to, if you create a database and all of a sudden you just hammer it with a million writes per second, there's enough infrastructure in total to cover it. And then we'll spin up more in the back to cover everything else. And then whenever you're done, we retract it back. That's how we keep our costs down. But then the storage side is separated and away from the compute side, and the storage side can scale its own way as well.And so whenever you need to store a petabyte of Cassandra data, you're just storing, you're just charged for the petabyte of storage on disk, not the thousandth of a cluster that you just created. Yeah.Jeremy: No. I love that. Thank you for explaining that though, because that is, every time I talk to somebody who's building a database or running some complex thing for a database, there's always magic. Somebody has to build some magic to make it actually work the way everyone hopes it would work. And so if anybody is listening to this and is like, "Ah, I'm just getting ready to spin up our own Cassandra ring," just think about these things because these are the really hard problems that are great to have a team of people working on that can solve this specific problem for you and abstract all of that crap away.Patrick: Yeah. Well, I mean it goes back to the Dynamo paper, and how distributed databases work, but it requires that they have a certain baseline. And they're all working together in some way. And Cassandra is a share-nothing architecture. I mean you don't have a leader note or anything like that. But like I said, because that data is spread out, you could have these little intermittent problems that you don't want to have to think about. Just leave that to somebody else. Somebody else has got a Grafana dashboard that's freaking out. Let them deal with it. But you can route around those problems really easily.Jeremy: Yeah. No, that's amazing. All right. So a couple more technical questions, because I'm always curious how some of these things work. So if somebody signs up and they set up this database and they want to connect to it, you mentioned you could use the driver, you mentioned you can use GraphQL or the REST API, or the Document API. What's the authentication method look like for that?Patrick: Yeah. So, it's a pretty standard thing with tokens. You create your access tokens, so when you create the database, you define the way that you access it with the token, and then whenever you connect to it, if you're using JavaScript, there's a couple of collection libraries that just have that as one of the environment variables.And so it's pretty standard for connecting the cloud databases now where you have your authentication token. And you can revoke that token at any time. So for instance, if you mistakenly commit that into your Git ...Jeremy: Say GitHub. We've never done that before.Patrick: No judging. You can revoke it immediately. But it also gives you our back, the controls over it's a read or write or admin, if you need to create new tables and that sort of thing. You can give that level of access to whatever that token is. So, very simple model, but then at that point, you're just interacting through a REST call or using any of the HTTP protocols or SQL protocol.Jeremy: And now, can you create multiple tokens with different levels of permission or is it all just token gives you full access?Patrick: No, it's multiple levels of protection and actually that's probably the best way to do it, for instance, if your CI/CD system, has the ability to, it should be able to create databases and tear them down, right? That would be a good use for that, but if you have, for instance, a very basic application, you just want it to be able to read and write. You don't want to change any of the underlying data structures.Jeremy: Right. Right.Patrick: That's a good layer of control, and so you can have all these layers going on one single database. But you can even have read-only access too, for ... I think that's something that's becoming more and more common now that there's reporting systems that are on the side.Jeremy: Right. Right. Good.Patrick: No, you can only read from the database.Jeremy: And what about data backups or exporting data or anything like that?Patrick: Yeah, we have a pretty rudimentary backup now, and we will probably, we're working on some more sophisticated versions of it. Data backup in Cassandra is pretty simple because it's all based on snapshots because if you know Cassandra the database, the data you write is immutable and that's a great way to start when you come to backup data. But yeah, we have a rudimentary backup system now where you have to, if you need to restore your data, you need to put in a ticket to have it restored at a certain point.I don't personally like that as much. I like the self-service model, and that's what we're working towards. And with more granularity, because with snapshots you can do things like snapshot, this is one of the things that we're working on, is doing like a snapshot of your production database and restoring it into a QA cluster. So, works for my house, oh, try it again. Yeah.Jeremy: That's awesome. No, so this is amazing. And I love this idea of just taking that pain of managing a database away from you. I love the idea of just make it simple to access the data. Don't create these complex things where people have to build more, and if people want to build a data access layer, the data access layer should maybe just be enforcing a model or something like that, and not having to figure out if you're on this shard, we route you to this particular port, or whatever. All that stuff is just insane, so yeah, I mean maybe go back to kind of the idea of this whole episode here, which is just, stop using databases. Start using these data services because they're so much easier to use. I mean, I'm sure there's concerns for some people, especially when you get to larger companies and you have all the compliance and things like that. I'm sure Astra and DataStax has all the compliance things and things like that. But yeah, just any final words, advice to people who might still be thinking databases are a good idea?Patrick: Well, I have an old 6502 on a breadboard, which I love to play with. It doesn't make it relevant. I'm sorry. That was a little catty, wasn't it?Jeremy: A little bit, but point well taken. I totally get what you're saying.Patrick: I mean, I think that it's, what do we do with the next generation? And this is one of the things, this will be the thought that I leave us with is, it's incumbent on a generation of engineers and programmers to make the next generation's job easier, right? We should always make it easier. So this is our chance. If you're currently working with database technology, this is your chance to not put that pain on the next generation, the people that will go past where you are. And so, this is how we move forward as a group.Jeremy: Yeah. Love it. Okay. Well Patrick, thank you so much for sharing all this and telling us about DataStax and Astra. So if people want to find out more about you or they want to find out more about Astra and DataStax, how do they do that?Patrick: All right. Well, plenty of ways at www.datastax.com and astra.datastax.com if you just want the good stuff. Cut the marketing, go to the good stuff, astra.datastax.com. You can find me on LinkedIn, Patrick McFadin. And I'm everywhere. If you want to connect with me on LinkedIn or on Twitter, I love connecting with folks and finding out what you're working on, so please feel free. I get more messages now on LinkedIn than anything, and it's great.Jeremy: Yeah. It's been picking up a lot. I know. It's kind of crazy. Linked in has really picked up. It's ...Patrick: I'm good with it. Yeah.Jeremy: Yeah. It's ...Patrick: I'm really good with it.Jeremy: It's a little bit better format maybe. So you also have, we mentioned the Stargate project, so that's just Stargate.io. We didn't talk about the K8ssandra project. Is that how you say that?Patrick: Yeah, the K8ssandra project.Jeremy: K8ssandra? Is that how you say it?Patrick: K8ssandra. Isn't that a cute name?Jeremy: It's K-8-S-S-A-N-D-R-A.io.Patrick: Right.Jeremy: What's that again? That's the idea of moving Cassandra onto Kubernetes, right?Patrick: Yeah. It's not Cassandra on Kubernetes, it's Cassandra in Kubernetes.Jeremy: In Kubernetes. Oh.Patrick: So it's like in concert and working with how Kubernetes works. Yes. So it's using Cassandra as your default data store for Kubernetes. It's a very, actually it's another one of the projects that's just taking off. KubeCon was last week from where we're recording now, or two weeks ago, and it was just a huge hit because again, it's like, "Kubernetes makes my infrastructure to run easier, and Cassandra is hard, put those together. Hey, I like this idea."Jeremy: Awesome.Patrick: So, yeah.Jeremy: Cool. All right. Well, if anybody wants to find out about that stuff, I will put all of these links in the show notes. Thanks again, Patrick. Really appreciate it.Patrick: Great. Thanks, Jeremy.

love ceo amazon netflix google apple data surprise ptsd shift target memory cloud id skip port previous react drivers api fedex huawei aws snowflakes faa apis qa stargate javascript azure cpu coordination graphs google cloud fauna sql dynamo git kubernetes internally presto chief evangelist lambda angular baas erd mongodb ci cd json serverless chief architect mysql graphql gcp developer relations nosql occhio postgresql redshift rest apis hibernate aws lambda ec2 cncf grafana data services grpc kubecon r co datastax ibm cloud dynamodb ql orms gradle apache cassandra javascript apis hobsons patrick mcfadin t sql jeremy you jeremy it cql patrick you cbt nuggets patrick yeah patrick no jeremy so patrick well patrick so patrick there

Episode #103: Differing Serverless Perspectives Between Cloud Providers with Mahdi Azarboon

Play Episode Listen Later May 31, 2021 51:07

About Mahdi AzarboonMahdi Aazarboon started working as a serverless specialist and evangelizing it through blog posts, conference talks and open source projects. He climbed up the corporate ladder, and currently works as Senior Manager - Cloud Presales at Cognizant. He helps big and traditional corporations to move into the cloud and improve their existing cloud environment. Having a hands-on background and currently working at the corporate level of cloud journeys, he has matured his overall understanding of serverless.Linkedin: linkedin.com/in/azarboon/Twitter: @m_azarboonWatch this episode on YouTube: https://youtu.be/QG-N3hf1zqIThis episode sponsored by CBT Nuggets and Lumigo.Transcript:Jeremy: Hi, everyone. I'm Jeremy Daly, and this is Serverless Chats. Today, I'm joined by Mahdi Azarboon. Hey, Mahdi. Thanks for joining me.Mahdi: Hi. Thanks for having me.Jeremy: So, you are a senior manager for cloud pre-sales in the Nordic region for Cognizant. So, I'd love it if you could tell the listeners a little bit about yourself, your background, and what it is that you do at Cognizant?Mahdi: Yeah. Just a little bit of background, I started as a full stack developer, then I joined Accenture as a serverless specialist, and over there I started to play with AWS Lambda specifically. Started to do some geeky stuff, writing blog posts, and speaking at conferences and so on. Then, I was developing several solutions for multiple corporations in Finland, then I joined another consultancy company, Eficode, which are known for DevOps. It is very good, they have a good reputation for that in Nordic region. I was as a practice lead, AWS practice lead driving their business. Then, I joined my current company, Cognizant, and here I work as a pre-sales capacity. I'm not hands-on anymore, but basically I do whatever is needed to make our customers happy and make them to go to the cloud. So that means high-level solutioning, talking with the customer and as a senior architect, I comment about stuff, I make diagrams, And I translate business and technical stuff requirements, basically as an interface between the delivery and the customer side. Yeah, that's all.Jeremy: Right. Awesome. All right. Well, so you mentioned in some of the blog posts that you were writing and some of that was a little while ago. And it's actually, I think there is some interesting perspective there. So I want to get into that in a little while, but I want to start by this idea or this post that you wrote about sort of what you need to know about Azure functions versus AWS Lambda and vice versa and it was sort of this lead-in to this concept of multi-cloud and not cloud-agnostic like being able to run the same workloads, but being able to understand the differences or maybe some of the nuances in Azure versus AWS and of course, that got extended to GCP and IBM cloud and some of these other things. But I'm curious why understanding different serverless services or different cloud services across clouds in this multi-cloud world we are living in now, why is that so important?Mahdi: Yeah. That's a good question. First of all, I would like to clarify that whatever I'm telling in this podcast is just my personal opinion and doesn't reflect my employer. This is just to save myself.Jeremy: Absolutely. Like a standard Twitter handle route.Mahdi: Yeah.Jeremy: Views are my own, right? Yeah.Mahdi: I don't want to answer to my boss after this podcast. Answering to your question, the thing is that multi-cloud is inevitable and even AWS which was ... In the best practices, I remember like a few years ago, they were saying that, no, try to avoid that. They started to even admitting through their offerings that they are trying to embracing that multi-cloud with their Kubernetes offerings. The thing is that, well, whether AWS fans like it or not, Azure is gaining a lot of market share and it depends on the country. For example, in Finland at least AWS is really popular. But now I'm dealing, for example, in other countries like Norway or UK, Azure is very popular. I mean, you can just exclude yourself to be only with one cloud, but in my opinion, you are missing a lot of opportunities, both to learn and just as a company to embrace the capacities, because whether ...Well, Azure provides some stuff which are better than AWS. I mean, I heard from a corporation that they really like AI capabilities of Azure much better than AWS and they do a lot of analytics. So it's inevitable whether many people like to admit it or not.Jeremy: Right. Right. But so even the fact that it's inevitable and we talk about, multi-cloud is one of those terms ... I just talked to Rob Sutter about multi-cloud a couple of episodes ago and it's so expansive. I mean, everything from SaaS providers to, obviously the public cloud providers, to maybe even on-prem cloud, I know that sounds weird, but like your hybrid cloud and things like that. So the problem is that there are a lot of providers, there are a lot of SaaS products, things like that. I mean, are you advocating that people will try to become experts in multiple clouds or how do you sort of ... What level of knowledge do you think you need to have in order to work across multiple clouds?Mahdi: I haven't met a single person who can claim to be expert in more than one cloud provider and I have talked with many experts because I have been running serverless in Finland and so I have been talking with many experts. None of them dared to claim that they knew it. I mean, even keeping up with one single cloud provider is a lot of work and I don't consider myself expert in any of them either, because I'm not hands-on anymore. The thing is that ... No, you don't have to be experts to work with different stuff. Of course, at some level you need some ... For example, you might need an Azure expert to work with Azure, AWS expert to work with AWS. But in my opinion, if you really want to keep up with the technology and so you need to be good in one provider, really good with that and then, know the fundamentals of the cloud, the best practices which are, I would say, it's irrespective of which cloud provider you are using there and be willing to learn.For example, it happened to me. At that time, I mean, when I wrote that blog post, I was only working with AWS. Then they said to me that, okay, you have this project on Azure, go for it and I never touched Azure before. It was a lot of pain, but I learned a lot. So I mean, as I said, the fundamentals are same and now be expert in one and be willing to learn. In my opinion, that should be good enough.Jeremy: Right. I'm curious, I think that's good advice to sort of be well-rounded. I mean, that's always good advice I think for technologists, going a mile wide and an inch deep is usually good enough. But like you said, being able to be an expert in a specific field or a specific technology or something like that can really help. So you think that's certainly a good career choice to sort of start to broaden your perspective a little bit?Mahdi: Definitely. Actually, I was one of those AWS fans that really was following this Hero, Serverless Heroes, and so on, basically was parroting whatever AWS was telling and I was saying that I just want to come to work with AWS. Actually, it happened to be like that, but when I joined my current company, my manager said that most of the opportunities that you are filling, I mean, in my department, so is mostly Azure. So basically they said that it is as it is, and cope with it. And I felt very happy actually. When I, for example, see ... Well, I'm sure that anyone who is in the cloud gets many job offers from recruiters. I was thinking about it, at some point when I was AWS guy, at least in my experience, half of those job ads were irrelevant and ...Jeremy: Right. Right.Mahdi: ... depending on the country. For example in Finland, if you are Azure ... AWS is very popular at least and if you are Azure expert, you are going to miss a lot of opportunities. But at least in my experience, if you say that you are with that, you have worked with the other one, you know something, a lot of career opportunities opens up. This is my observation.Jeremy: Right. Right. Yeah. And I think actually, you made a really good point and that's certainly, in terms of AWS heroes and so forth. I'm an AWS Serverless Hero and we get inside information but we spend a lot of time thinking about things the AWS way. AWS is very good at what they are doing with serverless and they have an interesting perspective in terms of what they believe serverless is supposed to be and what that roadmap looks like. But even just hosting this show and talking to so many different people in different clouds and different ways that they do it, getting that different perspective of how other people or other clouds think about serverless and how they are building it out. I think that's actually really good context to have.Mahdi: Yeah, I agree. Actually, you are one of my heroes also, I was following you. But I should say that it has its own advantages and disadvantage was that I was in a kind of AWS bubble. But when I started to see that, okay, even AWS itself opens up having this multi-cloud offering and some serverless heroes start to write about that, I was like, okay, that's time for opening of your thing. But I mean, by that time actually, I already started to use Azure. So again, I mean, I would say that what you have been doing, actually heroes are doing a great job, really doing a great job.Jeremy: Absolutely, totally agree.Mahdi: Azure also have similar. If I remember correctly, they tried MVP, something like that.Jeremy: I guess, that's MVP, yeah.Mahdi: The thing is that, at least based on my observation, they have more or less same level of dynamics or a narrative between themselves. They also consider Azure more and AWS more and so. But I was lucky, maybe by the choice and so that somehow I had to join or use or attach to both communities. Yeah, it has been a very valuable experience.Jeremy: Yeah. Yeah. So you went through that process, you were sort of an AWS convert or I guess, an Azure convert from AWS, and you stayed connected. But I know, that idea of transferring your skills and transferring the concepts and you mentioned sort of the pillars are the same as they are in AWS and you sort of have some of the general concepts, but as someone who went through that, what were the challenges that, what were some of the, I guess, challenges and the barriers that you faced going from AWS and that way of thinking into the Microsoft world?Mahdi: That's a very good question. The thing is that in the department, at that time I was working at Accenture and actually all of us were big AWS fans because at least Accenture owned Avanade, so Azure was very separate, we were in an AWS bubble. Yeah. I'm sure that definitely AWS is much more mature in many aspects than Azure, no doubt. At least it was like that and I'm sure it's still like that. Their gap has been narrower, but that still might be the case. I remember at that time, many of my colleagues were really bashing down Azure, really bashing down and they were right. I mean, some of their services were really immature. But then again, I had the chance to ... Actually, it wasn't quite choice, they said to me, okay, this is an Azure project. Basically, it was a team, I would say quite junior, developed something on Azure, something that you never probably want to hear.They developed everything in browser, infrastructure as a code nothing at all, they were junior, so they made quite many mistakes also, but they just made the app up and running. It didn't matter how or what, it was just running and that's all. So they told me that, okay, we need some little improvement, this was little improvement and that little improvement basically forced me to reverse engineer whatever they had done, and that required me to upgrade the whole application, because as I said, there was no infrastructure as a code, if I want to use it I had to use ... If I wanted to do local development, I had to use Windows, I had only Mac, so I had to change the complete platform. It was a very tedious process by itself. On top of that, I had to start to see how Azure functions work and that was another pain for that.The thing is that I had AWS mindset and I was thinking that, okay, AWS is the best, they came out first with the cloud and Lambda, so Azure should be something like that. As I elaborated in the blog post, no, actually they are different and there are some small patches or nuances that makes some even days to find it out, but you need to find it out, otherwise, your app doesn't work. After a while when I reflected the things, I realized that, okay, of course, I was angry and pissed ... I was really bashing down Azure, it was fight of the dynamics over there, but after a while when I reflected through my whole process and actually I wrote in the blog post, I realized that part of the blame was on me because I was expecting Azure to work in the AWS way. No, that's not how it works.I mean, when you look at, for example, authentication or the mindset, it's different. That requires a learning curve, I mean, you need to find out Stack Overflow and actually, the Azure community is really supportive. I really like it. They have their own community which is really supportive. So the pain basically was that ... Yeah, I had to find out how things work in Azure and what's different. But now that I'm working basically pre-sales in both of the cloud I can say that, again, fundamentals are same.Jeremy: Right.Mahdi: And these AWS architecture framework, there are five pillars. You can see that Azure has copied from AWS, it's obvious. Even they haven't changed the name. The naming is similar and you can find that it's just a bad copy. At least like few months ago that they had to implement for that. But at the end, I mean, Azure is catching up fast.Jeremy: Right.Mahdi: It's undeniable. And fundamentals are more or less same. I mean, if you want to make your app ... For example, you want to innovate, you should have shorter time to market. Basically, you need to use infrastructure as a code If you want to make your app really high-level appeal, you need to follow best practices, do maybe SRA. At the high level it's same, but when it comes to the detail level, it can be very different. Even the documentation was really confusing and it wasn't just me telling that.Jeremy: Out of curiosity, was the documentation for AWS more confusing or was the documentation for Azure more confusing?Mahdi: This is a million-dollar question. Actually, I thought that maybe it's me. I found the Azure doc very confusing, but I thought it's me, so I asked I think nine of my friends who are AWS experts that, "What's your opinion? Have you worked with Azure? Do you find documentation readable?" I think all of them said that it's confusing.Jeremy: Yeah.Mahdi: So I was like that, okay, then it's confusing. Then I talked with a few Azure experts who, they breathe in Azure, they are Windows guys and they never touched AWS and they said that, "No, documentation is good. Everything is fine." Actually, if I remember correctly one of them said that, "Actually, I find the AWS documentation confusing." It seemed like two different worlds, you know?Jeremy: Right. I find them both confusing, actually.Mahdi: Maybe now it has changed.Jeremy: Right. Yeah. So, that's interesting. I mean, I think the documentation is a good ... Well, first of all having good documentation is important and I think they both have good documentation, but I do think it's organized differently, right?Mahdi: Yes.Jeremy: And again, it's organized more towards I think maybe that different mindset. But let's just talk a little bit about the maturity of those, because to be fair to Azure, I mean, Azure or Azure Functions, it has come a very, very long way. I remember way back in 2018, way back, I mean it seems like a long time ago at this point, seeing very early demos of Durable Functions and I remember thinking like, oh, that's just a mess, like that is not the way that you want to do that. Now fast forward three years, Durable Functions are pretty cool and they do a lot of really interesting things. It does take time to catch up. So certainly I would think your criticism of Azure Functions back then in terms of what it is now, that's probably there is a huge gap there.Mahdi: Yeah. I'm sure that most of the criticism, the detailed one that I mentioned the blog post, I'm sure that many of them have either been fully addressed or they have been improved a lot. So that's why I don't want to focus that much on detail and I would focus more on the high-level things. Yeah.Jeremy: Right. So speaking of the high-level things, let's go there for a second. So you mentioned like a well-architected framework, sort of this idea of their being something very, very similar, maybe even a carbon copy in Microsoft. But what about getting down, you said that your individual skills are kind of when you get into the weeds there, that is certainly different, so I mean for the most part though, event-driven, stateless computes, things like that, do those skills transfer over?Mahdi: Yeah, they do. It's just a matter of implementation. For example, I can tell you, yes, those ones ... Well, there is some caveat. For example, I remember in Azure community, I was at that time, this probably has been changed, but I think it shows some kind of mindset. I was struggling to find out the observability tools of Azure, if I remember correctly it's what's called Application Insight, one of the tools, and they had some event driven insight, something like that which was, they call it near real-time. I remember that basically when I want to get the logs from the functions, it took three minutes to come up, three minutes. At the same time CloudWatch, for example, it was coming in 20 seconds, something like that, 10, 20 seconds and I mentioned it in their community.If I remember correctly, it was a notable dude, either one of the product team, or he was a very notable dude and he said that three minutes time is, in my opinion, is near real-time. He said that and I remember we made a lot of joke out of that sentence with my colleagues about that.Jeremy: I can imagine.Mahdi: But that shows some kind of mindset. I mean, three minutes, I don't think is near real-time. Most probably this time has been reduced, but I just wanted to tell you their mindset about that. But, yeah, event-to-event stateless stuff, they are transferable. But when it comes to implementation, it's different. For example, as I mentioned that blog post, there was some stuff that you can do with an authentication with some, certain some, environmental variables in AWS, but that same thing in Azure, if I remember correctly, is done through something like service principles, it's different. So if you try to play with environmental variables, it turns out no, it doesn't work that way. It gets to very detailed stuff, that gets different. Yeah.Jeremy: Yeah. Right. Right. Yeah. I'm curious to hear about like another sort of interconnectivity of what you would connect. I'm now trying to remember what they call bindings or triggers and bindings in Azure functions as opposed to events or actually event sources, I think we call them in the Lambda world. So would you look at the way that you connect to other services? Is that another thing that is similar between the two?Mahdi: Okay. I should say that I don't remember that much of these details anymore, but as far as I remember, again, the high levels were more or less the same. Okay, they call it three gears, but I don't remember now what does AWS Lambda calls it. But it was more or less the same.Jeremy: I can't even remember what it's called, it's like event sources or something like that.Mahdi: Yeah. It was more or less same. Yeah, yeah, yeah.Jeremy: Yeah.Mahdi: And they had something like a bus, events bus in order to have a centralized event driven thing. It's same I would say.Jeremy: Yeah.Mahdi: Again, when it comes to poor person who has to implement it if he hasn't done it before. But the person who is doing the high-level architecture and so, I can easily see that, I mean, I don't see that much difference. But I know that if someone has to implement it and hasn't done it before, he will go through the most pain, because he has to find this small configuration things that, unfortunately, you need to make them. Otherwise, it doesn't work out. But high-level, it's same. It's event ...Jeremy: Yeah.Mahdi: Yeah.Jeremy: I think the nuances are always those tough things. So thinking of the overall mindset here and sort of maybe the approach to serverless. So I know you went from AWS to Azure, but I'm curious, do you think it would be easier to go from Azure to AWS or easier to go from AWS to Azure?Mahdi: Well, I came from this part of the river to the other one, so I can just speculate about the other part. But I would say it's more or less same, because again when I talk with a few Azure people who really have been breathing always in Azure and never touched or barely touched AWS, I felt that they are feeling same thing about AWS. So I would say it's more or less same. They need to go through the same pain, they will find AWS stuff very confusing, especially that they will not have that great community support of AWS, but they need to either do the Stack Overflow thing or have a enterprise support of AWS. I would say it's more or less same for them.Jeremy: Yeah. I mean, I think that's interesting too just, that it is different enough that there is pain there, right? I mean, it would be nice if there was some standards and I know there's like the opening, the Cloud Computing Foundation is like open events and some of those things whatever, not that that's all working out for ... I think Kubernetes and Knative and those and some of those teams are implementing it or those projects, but I'm not sure the same things fall into AWS. But anyways, go ahead. You have any thoughts on that?Mahdi: Actually, that Cloud Foundation, I was working at Eficode and they are really working that stuff. They are so good in Kubernetes. I find that also another world completely.Jeremy: Yeah.Mahdi: This Cloud Foundation stuff. I never had to implement any of that for any of our customers in any of the companies that I worked, that they were AWS or Azure. Yeah, some of them they used Kubernetes also, but that CNC or whatever it was ...Jeremy: Yeah, CNCF.Mahdi: Yeah, yeah. I found it, that's a different world for me also, I should say. Sometimes out of curiosity, I played with it, but I never ... Nobody ever asked me that, do you want to use that?Jeremy: Right. Right. Yeah. No, that makes sense. All right. So we talked a lot about, we've been talking about the difficulties in switching between different cloud providers, but also the value of knowing those different cloud providers. And more so, so that you can build serverless applications. So let's talk about serverless in general. I know you are a little bit outside of the ... You're not in the developer role anymore. But this actually, could be really interesting to get your perspective on the management approach to this and how other companies are thinking about the value of serverless at a management level as opposed to ... I guess, even as a sort of planning level. So let me ask you this question then. Are you seeing companies looking at serverless and adopting serverless and that serverless mindset and then maybe a follow-up question would be, if they are not, why do you think they should be embracing serverless?Mahdi: Okay. Firstly I'll answer the second part. Basically, the thing is that nowadays the world is fast changing. Many companies, many corporations basically, are benefiting from their existing market share or regulatories or the monopoly that they have. For now, it works. If they don't want to change basically if they have the mindset that things are working, what's the point for change. Most probably within a decade or so they are going to die, their business is going to die. Because the world is fast-changing and they need to have them to adapt to the market.So ideally, they need to go through the pain and disrupt themselves. Disruption always brings pain. You cannot disrupt yourself and feel that everything runs smoothly. Ideally, they need to disrupt themselves, go through the pain and so become really agile in order to understand the customer feedback and deliver the value to the customer, what really the customer wants. They can either have this phase or they can ignore it and say that, okay, things are working, we are making money through our monopoly, regulatory, existing market share, whatever and then, their business is going to go away. These two choices, that's all. Yeah. Painful process to become more competitive and be ahead of customers or assume that everything is okay, and then at that time that's going to be very late.Jeremy: Right. So let me go back to that first question then. So you are seeing people not doing that?Mahdi: Okay. The thing is that what I'm telling is going to be biased because I'm working in a cloud team and whatever opportunity that they are going to bring to me, of course, you have the departments and the companies that they are interested in the cloud. So my mindset is a bit biased, but what I'm seeing is that it varies a lot and I mostly focus on corporations, because ... Yeah, of course, for startups it's much easier to go for that.Jeremy: Right. Of course.Mahdi: At least in Finland, my observation was that there are two ways. Either they are very ... it depends on the executive leadership. For example, a major bank in Finland, they say that, we want to go to the cloud and be, we want to go for that. And once, one of these big ones goes through that, there is going to be a domino effect on others. But there are some other ones say that, no, it's cloud, who is going to take care of the data? We are not going to do that and they don't touch it.There are some other companies and their departments, I would say there are departments who are interested in trying things out and then, they have to fight internally with the more conservative departments. So I'm sure that there are three levels of that. But mostly, I work with the ones who are inclined toward using the cloud.Jeremy: Right. Right. So then, the ones that are starting to dabble in the cloud, is that something where you see ... I mean, clearly there's lift and shift, right? Which I think we probably all understand at this point, it is not the best implementation or the best use of the cloud, right? That it is better to maybe use more native cloud services or cloud native services, I guess, to do that. So in terms of people just rehosting or maybe re-platforming, are you seeing this sort of rearchitecture, or I guess, this refactoring or is that something where companies are staying away from that?Mahdi: First of all, I respectfully maybe have to disagree with you.Jeremy: Okay.Mahdi: Actually, I think rehosting is actually a good approach and that's what even AWS promotes for conservative companies who want to start working with the cloud and they want to get the fastest result in the shortest period of time, with the least amount of pain, it's better to do migration through the easiest one which is lift and shift. Easiest, everything is relatively.Jeremy: Right.Mahdi: And then, have a data-driven approach to see what really needs to be improved and then refactor or rearchitect or re-platform based on data. So in AWS terms, I'm sure you're right there with me, have that evolutionary architecture in a data-driven approach. So lift and shift, I don't consider bad at all. Actually, I consider it a very good cornerstone, stepping stone at the beginning, for the beginning.Jeremy: Interesting. Okay.Mahdi: Yeah. What was the other question?Jeremy: No. I was just going to say, so you've got companies that are lift and shift, and, yeah ...Mahdi: Oh, okay. Sorry. Sorry. Yeah. Sorry, I just remembered.Jeremy: Yeah.Mahdi: Sorry to interrupt you. Actually, I'm a bit careful about using the word cloud native. I remember, in a previous company that I was working, we had some philosophical fight about that and I'm sure that then everyone was dissatisfied and I had to have an authoritarian appearance that this is the definition of cloud native. I'm sure many of them hated me after that. But the word cloud native, I really struggle to find a consensus of what does it mean and if you spend some time, you realize that you will find a variety of definitions of that. So I'm picky for the word cloud native. There is a lot of fight can happen, what is exactly cloud native. Some consider Kubernetes cloud native. Some consider using AWS or Azure cloud native. So this is the picky ... this is a very controversial term, I would say. Yeah.Jeremy: Well, let me interrupt you for a second. So when I think of cloud native, what I'm thinking of are services and components that are built specifically to run in the cloud, things like your API gateway at AWS or Azure functions or things that are like very much so built to run in the cloud environment where they do things. It's that serverless aspect. I think of it more serverlessly. I mean, I know containers and so forth fit in there as well. But that's how I think of cloud native. I think of cloud native as going beyond just your traditional VM and running everything on the VM and moving to the higher-level services that are more managed for you.Mahdi: May I challenge you?Jeremy: Absolutely.Mahdi: So you just said that basically things that use cloud, like API gateway and so. And now I should ask more of a technical question. What is cloud?Jeremy: Right. Well, that's another good question. Right.Mahdi: Okay. I can tell you, based on these several definitions that I read and I reflected on them, I have this definition of cloud native, most probably many people I'm sure will disagree. So that's fine because it's very controversial. In my opinion, cloud native is very simple. If your application is architectured in a way that it can leverage the advantages of the cloud environment, then it's cloud native. Doesn't matter if it is on Kubernetes, if it's on AWS, if it's on Azure or so. If it can scale to zero and theoretically to infinity and you pay for only what you use, then it's cloud native. That's my definition of that and I read so many definitions, so I came up with this. But feel free to disagree with that, because many people disagreed with me. I'm fine with that.Jeremy: That's all right. You are not the only one I'm sure, has differing opinions of what cloud native are. So let me ask this though because I think that's interesting, the way you explained the strategy of lift and shift of basically being able to say it's the, probably the lowest risk way to take an application that's on-prem and move that into the cloud and then to use data and so forth to kind of figure out what parts of the application might you want to migrate to, maybe again I don't want to overload the term, but more cloud native things. I think that's actually really interesting. I have found and I have seen many companies that seem to do this where it's more that they move things, they just rehost without really thinking through what that strategy is going to be and then they basically just end up having their on-prem in the cloud and not benefiting from some of those managed services and some of the benefits of the cloud that you get, they don't transfer on to them. That's what I have seen.Mahdi: Well, you know it better than me. Your cloud environment is never perfect and it's always an ongoing operation. So I mean, going to the cloud ... Again, if you put your own frame in, put them I don't know, use EC2 or which VM or the AWS or Azure, that's a very good first step ...Jeremy: Right. That's probably true.Mahdi: ... but you need to be able to start leveraging that. At least get the data, which one is being used and hopefully, hopefully when you are going to the cloud, you have done some analysis and you have realized that some of the services even are not working with the cloud. Some of them need to retire, some of them cannot be rehosted. They must be rearchitected, because they are so legacy for that. But even again, assuming that you have done your homework and you have done rehosting, okay, you need to leverage that and go and see that all things that AWS or Azure provide, how much over-used or over-utilized or under-utilized are your CPUs and this kind of thing and according to that, do right sizing for that.Jeremy: Right.Mahdi: That's a good step for that. Then if they want, requires refactoring, try to I don't know, do refactoring and use more managed services for that. So again, rehosting is a good first step, but cloud is a long journey. I don't know who came up that cloud is cheap, I really don't know.Jeremy: Right. No, I totally agree. You are right about the first step and I actually loved your point about which services might you be able to retire and not move at all because I think in a lot of these big companies, there are a lot of services that you probably don't need anymore or they are redundant or whatever and you could get rid of those moving to cloud. Good point. All right. I got a couple of more minutes and I want to go back to an article that you wrote. Now, this I think is like three years old and in terms of reading the article now, it's not relevant, because so many things have changed. But what's relevant is, what has changed and this was an article that was about the worrying and promising signals from the serverless community. I think this was an event you went to in Germany, they did this, and you have a couple of different points that you called out.One of the points was that users have ignored security and that was a worrying sign for you. Where do you think sort of cloud security or more specifically serverless security is now? Do you think people are still thinking about it or have brought it front and center like it probably should be or do you think it's still a worrying factor?Mahdi: Since I have implemented cloud solutions for I'll say mostly enterprises and a few startups, I haven't seen a single one of them using, having a cloud security specialist. Most of the corporations when they, at least in my experience, when they want to go to the cloud, they must address the security of it and typically because of the customer requirements, so they bring a security guy who has worked with this, let's put it this way, all their security stuff and he has to come in on the cloud part and it's funny that actually, sometimes I have to teach them basically. I remember they had a head of security for a customer. I really had to teach him and actually, I had the Lambda functioning in front of him and he was like, wow, is it really like that? I had to teach him what are the attack methods and it was funny. He had to sign off my solution that it is secure, but basically, I had to tell him what are the priorities.Jeremy: You had to tell him what it was.Mahdi: They address it from a traditional way. Yeah, they do some kind of a test, automated test and this kind of thing which is, yeah, definitely ... Again, I'm not a security expert, but as far as I understand, again they have some fundamentals which are safe, that's true, but when it comes to the cloud especially serverless and functional service, you will see that there is a lot of more attack vectors and unfortunately, these security experts, I have not seen any of them who have any expertise in that. I learned about it because I was curious about it and I started to work with basically professionals, some startups which provide professional security solutions for serverless. So that's how I got that, but again I had to go through the pain. It took few months to read so many stuff. But I haven't seen any security specialists who have been working on cloud projects who have done this.Jeremy: Yeah.Mahdi: So I would say customers, they consider it, but no, there is still a lot more way to mature.Jeremy: They are not addressing it. Yeah. It's funny because I remember that in 2018, 2019, there were a couple of companies that were in the serverless security space and they were all acquired. So now they are part of larger platforms which is ...Mahdi: Exactly.Jeremy: ... great for them, don't get me wrong. All right. So then another thing you said and I think this is important, because the biggest complaint that I always hear about serverless is, just the workflows are not easy. So you had mentioned that DevOps was finding its way and that was sort of a promising signal, you think that we've ... I mean, we have got a lot of tools for serverless now. Speaking of Azure, the way to deploy an Azure function right through VS Code now with the plugins is really, really slick and Serverless Frameworks, SAM, CDK, all these are there, Terraform and so forth. I mean, have we gotten to some stability around serverless and sort of mixing in DevOps there?Mahdi: Based on my experience, at least the ones that I have worked with, I can say that, yes, DevOps is now a part of a solution that's provided to the customer and maybe it's correct because personally, I went through the pain whenever I proposed any solution for the customer, so they are always using infrastructure as a code and always try to have a DevOps-centric viewpoint about your solution. So I try to push for that and, yes, I find customers receptive about that. It seems to me that, now DevOps is not one of those buzzwords for cool kids who just want to do this stuff, even the corporate guys are more receptive with that. Again, there is more way to really do the DevOps stuff, because you know that many companies claim that they are doing DevOps, but in reality, they are not. You know this better than me.Jeremy: Right. Of course.Mahdi: But, yeah, it's good. I'm happy for that. I mean, a few years ago DevOps was one of those buzzwords, but now I don't think it's buzzword anymore.Jeremy: Yeah. Yeah. And I think that serverless has actually opened up a lot of making it easier for teams to do automation and things like that, there's a lot that you can do because you have that little bit of compute power that you can do something with. So I think that's definitely promising. So speaking of sort of compute power and other things that you can do with it, one of the things you mentioned was that you saw as a promising sort of signal was, that serverless-based prototypes were on the rise, meaning different services, so whether it was cues or whatever or I guess Lex and things like that, all kinds of services that allow you to do different things that are specializing in different capabilities. So how do you feel about that now? Because there are a lot of those APIs out there.Mahdi: Yeah. Actually, I also find that even from these legacy corporations that I have been working with, I like the idea that now, they definitely when they want to do migration especially or this kind of thing or do anything cloud, first they do POC. Yes, I find it good. Honestly, I was sometimes impressed that, oh, from some people that I would never expect them to use this one, first let's do POC, then see what's come out. Oh, really? Yeah, it's good in my opinion. It's finding its way.Jeremy: Yeah. Yeah. No, I like that too and I think you are right about proof of concept, because it's just one of those things where even if it's expensive to use the Google Vision API or something like that, it's a really good way to prove out how that fits into whatever the business use case you have for that and then like you said, you can certainly take a step further and create more sophisticated or I won't say sophisticated but maybe more integrated tools or something like that, that would work around that. So I think that's interesting, allow people to fail fast, learn quickly, and just build out their applications.Mahdi: Yeah. When we say POC, I should say that I wouldn't exclude it only to this cool new serverless or what the AI stuff that AWS and Azure provide. Even for migration actually, POC is highly recommended. Again, I was working for some period of time for, I would say, one of the most conservative banks in Finland, small and conservative, for consultancy, but even then as we are trying to push the cloud and even then they said that, "Yeah, first let's do a POC of migration and see what's going to happen." Again, there really I was surprised. I would never expect it from them. But the idea of fail fast and learn fast, I think at least that it requires some level of maturity to reach that.Jeremy: True.Mahdi: That really needs more room for improvement, fail fast, learn fast. Yeah. Just something, I don't know, I would like to address about this cloud stuff if I can.Jeremy: Yeah, absolutely.Mahdi: Yeah. Basically, when companies or customers decide to go to the cloud, I'll recommend that don't look at only the technical aspect of it, because I see that there is, at least there is lot of debate for example ... At least it was like that. AWS, Azure, or this kind of thing, at the end I'll say that most of the things it doesn't matter that much. I mean, it depends on their, sometimes company policy, how much discount you can get, how much funding you can get from the cloud provider. So it's not really the technical people who decide, sometime it's the executive who decides.Jeremy: Right.Mahdi: But even then, when you go to the cloud, in my opinion as much as the technology and maturity of the cloud provider matters, the amount that your company is ready to change its operations is also important. This is my favorite example, that I developed and I would say at that time at least a state-of-the-art serverless solution, DevOps, or CI/DC stuff for a major bank in Finland and I was the first one who managed to do that among so many consultants that they have. It was really good. I'm proud of what I did and actually, I open-sourced that. It was really basically we could deploy multiple times per day and we went to their release manager and I said that, "Okay. It's like that. Everything is perfect. DevOps, CI/CD, we can release multiple times per day." And she said that, "No. It doesn't work like that. We need to release once per month," and we have to go through a very painstaking process, fill out so many useless documents.It didn't matter how much I tried to convince her that, "Well, the idea is different. I mean, you need to do small deployment. This way actually you have less risk. You deploy once per month. Still every time something goes wrong, but when you do a more frequent deployment, your risk is lowered." She said, "No. We are a bank. It is as it is. Sorry." Most of that effort that I made at least at that time went to waste basically, because the process was legacy, even though the technology was good. I'm sure that by now, they have changed because I was among those innovators basically or the early adopters who made through that. But in my opinion, technology matters but operation and processes and release stuff also matters and everything needs to change. So basically it needs to be holistic approach of going to the cloud, not just implementing from technical viewpoint.Jeremy: Mahdi, thank you so much for having this conversation with me. This was a lot of fun and then I love people who have sort of experienced, from moving from one cloud to another. It's a huge shift, but I think your advice here is great, just to sort of know those basics on those other platforms and do that. So if people want to reach out to you or find out more about, follow you on Twitter, things like that, how do they do that?Mahdi: Well, I have a Twitter account, but nowadays I mostly put non-service stuff, but LinkedIn is a good option for me.Jeremy: Okay. Great. And it's m_azarboon on Twitter and then, I will put LinkedIn and Twitter and that in the show notes as well. Thanks again, Mahdi.Mahdi: Thank you very much for having me. Bye-bye. Thank you.

ai uk speaking germany microsoft hero started mvp cloud mac honestly norway answering windows perspectives finland ibm saas disruption api painful nordic poc accenture aws faa apis lex vm devops azure providers kubernetes cnc sra differing lambda baas stack overflow ci cd serverless mahdi cpus cognizant terraform gcp vs code cdk aws lambda ec2 cncf avanade azure functions knative cloudwatch cloud foundation jeremy you aws serverless hero cbt nuggets jeremy so

Episode #102: Creating and Evolving Technical Content with Amy Arambulo Negrette

Play Episode Listen Later May 24, 2021 57:51

About Amy Arambulo NegretteWith over ten years industry experience, Amy Arambulo Negrette has built web applications for a variety of industries including Yahoo!, Fantasy Sports, and NASA Ames Research Center. One of her projects modernized two legacy systems impacting the entire research center and won her a Certificate of Excellence from the Ames Contractor Council. She has built APIs for enterprise clients for cloud consulting firms and led a team of Cloud Software Engineers. Currently, she works as a Cloud Economist at the Duckbill Group doing bill analyses and leading cost optimization projects. Amy has survived acquisitions, layoffs, and balancing life with two small children.Website: www.amy-codes.comTwitter: @nerdypawsLinkedin: linkedin.com/in/amycodesWatch this episode on YouTube: https://youtu.be/xc2rkR5VCxoThis episode sponsored by CBT Nuggets and Lumigo.TranscriptJeremy: Hi everyone, I'm Jeremy Daly, and this is Serverless Chats. Today, I'm joined by Amy Arambulo Negrette. Hey, Amy thanks for joining me.Amy: Thank you, glad to be here.Jeremy: You are a Cloud Economist at the Duckbill Group, so I'd love it if you could tell the listeners a little bit about yourself and your background and what you do at the Duckbill Group.Amy: Sure thing. I used to be an application developer, I did a bunch of AWS stuff for a while, and now at the Duckbill Group, a cloud economist is someone who goes through cost explorer and your usage report and tries to figure out where you're spending too much money and how the best to help you. It is the best-known use of a small skill I have, which is about being able to dig through someone's receipts and find out what their story is.Jeremy: Sounds like a forensic accountant, maybe forensic cloud economist or something to that effect.Amy: Yep. That's basically what we do.Jeremy: Well, I'm super excited to have you here. First of all, I have to ask this question, I've known Corey for quite some time, and I can imagine that working with him is either amazing or an absolute nightmare. I'm just curious, which one is it?Amy: It is not my job to control Corey, so it's great. He's great to talk to. He really is fully engaged in any conversation you have with him. You've talked to him before, I'm sure you know that. He loves knowing what other people think on things, which I think is a really healthy attitude to have.Jeremy: I totally agree, and hopefully he will subtweet this episode. Anyways, getting into this episode, one of the things that I've noticed that you've done quite a bit, is you create technical content. I've seen a lot of the talks that you've given, and I think that's something that you've done such a great job of not only coming up with content and making content interesting.Sometimes when you put together technical content, it's not super exciting. But you have a very good way of taking that technical content and making it interesting. But then also, following up with it. You have this series of talks where you started talking about managing FaaS, and then you went to the whole frenemies thing with Fargate versus Lambda. Now we're talking about, I think the latest one you did was about Lambda and the container support within Lambda. Maybe we can just go back, or start at a point where, for people who are interested in maybe doing talks, what is the reason for even creating some of these talks in the first place?Amy: I feel a lot of engineers have the same problem, just day-to-day where they will run into a bug, and then they'll go hit the all-knowing software engineer, which is the Google search engine, and have absolutely either nothing come up or have six posts that say, I'm having this problem, but you won't ever get an answer. This is just a fast way of answering those questions before someone has to ask.Jeremy: Right. When you come up with these ... You run into this bug, and you're thinking to yourself, you can't find the answer. So, you do the research, you spend the time digging through, and finding the right way to solve it. When you put these talks together, do you get a sense that it's helping people and then that it's just another way to connect with the community?Amy: Yeah. When I do it, it's really great, because after our talk, I'll see people either in the hallway, or I'll meet someone at a booth, and they'll even say, it's like, I ran into this exact same problem, and I gave up because it was such a strange edge case that it was too hard to fix, and we just moved on to another solution, which is entirely possible.I also get to express to just the general public that I do, in fact, know what I'm talking about, because someone has given me a stage to talk for 30 minutes, and just put up all of my proofs. That's an actually fun and weirdly empowering place to be.Jeremy: Yeah. I actually think that's really interesting. Again, for me, I loved your talks, and some of those things are ... I put those things at the back of my mind, but I know for people who give talks, who maybe get judged for other reasons or whatever, that it certainly is empowering. Is that something where you certainly shouldn't have to do it. There certainly should be that same level of respect. But is that something that you found that doing these talks really just sets the tone, right off the bat?Amy: Yeah, I feel it does. It helps that when someone Googles you, a bunch of YouTube videos on how to solve their problem comes up, that is extremely helpful, especially ... I do a lot of consulting, so if I ever have to go onsite, and someone wants to know what I do, I can pull up an actual YouTube playlist of things that I've done. It's like being in developer relations without having to write all of that content, I get to write a fraction of that content.Jeremy: Right. Unfortunately, that is a fact that we live with right now, which is, it is completely unfair, but I think that, again, the fact that you do that, you put that out there, and that gives you that credibility, which again, you should have from your resume, but at the same time, I think it's an interesting way to circumvent that, given the current world we live in.Amy: It also helps when there are either younger engineers or even other younger professionals who are looking at the tech industry, and the tech industry, especially right now, it does not have the best reputation to be able to see that there are people who are from different backgrounds, either educationally or financially, or what have you, and are able to go out and see someone who has something similar being a subject matter expert in whatever it is that they're talking about.Jeremy: Right. I definitely agree with that. That's that thing, where the more that we can amplify those types of voices and make sure that people can see that diversity, it's incredibly important. Good for you, obviously, for pushing through that, because I know that I've heard a lot of horror stories around that stuff that makes my blood boil.Let's talk to some of these people out here who potentially want to do some of these talks, and want to use this as a way to, again, sell themselves. Because I can tell you one thing, once I started writing blog posts and doing talks and doing those sorts of things, clearly, I have a very different background, but it just gave me a bunch of exposure; job offers and consulting clients and things like that, those just become much easier to get when you can actually go out there and do some of this stuff.If you're interested in doing that, I think one of the hard things for most people is, what even makes a good talk? You've come up with some really great talks. What's that secret sauce? How do you do that?Amy: I think it can also be very intimidating since a lot of the talks that get a lot of promotion are always huge vendor events that they're trying to push their product, they're trying to push a solution. That usually takes up a lot of advertising real estate, essentially, where that's what you see, that's what you see all the threads and everything. When you actually get to these community conferences, or even when I would speak at AWS Summit, it was ... I had a very specific problem that I needed to solve. I ran into a bug, the bug was not in the documentation, because why would it be?Jeremy: Why would you put that in there, right?Amy: Of course. Then Google, three pages down, maybe put me on the path to finding the right answer, and it's the journey of trying to put all of the bug fixes in place to make it work for your specific environment and then being able to share that.Jeremy: Right, yeah. That idea of taking these experiences that you've had, or trying to solve a problem, and then finding the nuances maybe in solving the problem as opposed to the happy path, which it's always great when you're following a blog post and it says, run this command, then run this command, then run this command. Well, what happens on that third command when the thing blows up, and you have no idea what to do? Then you end up Googling for five hours trying to find your way out of that.You take this path of, find those bugs or find that non-happy path and solve it. Then what do you do around there? How do you then take that ... You got to make that interesting somehow.Amy: Yes. A lot of people use gifs and memes. I use pictures of food and screencaps from Dungeons and Dragons. That's usually just different enough that it'll snap someone just out of their phone going, "Why is there a huge elf on my screen trying to attack people screaming elf errors." Well, that's because that's what they thought it would be great to call it. It's not a great error code. It doesn't explain what it is, and it makes you very confused.Jeremy: Right. Part of that is, and again, there's that relatability when you create talks, and you want to connect with the audience in some way. But you also ... This is the other thing that I've always found the hardest when I'm creating talks, is trying to find the right level. Because AWS always does this thing where they're like, it's a 200 level, or it's a 400 level, and so forth. I think that's helpful, but you're going to get people of all different skill levels, and so forth. How do you take a problem like that, and then make it relatable, or understandable, probably? Find that right level?Amy: The way I see it, there's going to be at least one person of these two types in the room that are not going to be your target audience, someone who doesn't know what you're talking about, but sees that a tool that they're considering is going to pose a problem, and they want to know how difficult it is to fix it. Or there's going to be a business person who has no technical background, and they just want to know if what they're evaluating is worth evaluating, if this error is going to be so difficult to narrow down and try to resolve that, yes, why would we go through something that my engineers are going to spend hours to try to fix something that's essentially a configuration issue?When I write any section of a talk, I make sure that it addresses a person who may not have come into that with that exact problem in mind. For the people who have, they'll understand the ... In animation, it's called key images, where there are very specific slots where you understand the topic of what is happening and the context around it. I always produce more verbose notes that go with my presentation. I usually release it either at the end of the day, or later on that week, once everyone has had time to settle, and it provides a tutorial-esque experience where this is what you saw, this is how you would actually do it if you were in front of a screen.Jeremy: Yeah.Amy: There are people who go to technical talks with a laptop on their lap because they're also working while they're trying to do it. But most of the time, they're not going to have the console open while you're walking through the demo. So, how are you going to address that issue? It's just easier that way.Jeremy: I like that idea too, of ... I try to do high-level bullet points, and then talk about the bullet point. Because one thing that I try to do, and I'd love to hear your thoughts on this as well. Here I am picking your brain trying to make my own talks better. But basically, I do a bullet point, and then I talk through it. I actually animate the bullet points coming in.I'm not a huge fan of showing an entire slide with all the bullet points and then letting people read ahead, I bring a bullet point in, talk about the bullet point, bring another bullet point in. Is that something you recommend doing too? Or do you just present all the concepts and then walk people through it?Amy: I think it depends. I tend to have very dense slides, which is not great for reading, especially if you're several rows back. I truly understand that. But the way I see it, because I also talk very fast when I'm on stage that I want there to be enough context around what's happening, so that if I gloss over a concept, then you visually can understand what's happening.That said, if that's because the entire bullet block on my slide is going to be about a very specific thing that's happening. It's not something that you have to view step-by-step. Now, I do have a few where, especially in a more workshop scenario, where you're going, I want you to think about this first and then go on to this next concept. I totally hide stuff. I just discovered for a talk that I was constructing the other day, that there's an animation that drops them down like index cards, and that's now my favorite animation right now.Jeremy: When you're doing that, like because this is the other thing, just for people who have ever ... If you're out there and you've ever written a talker or you've given a talk, the first iteration of it is never going to be the right one. You have to go through and you have to revise. It is sort of weird, and I don't know, maybe you felt this way too, in the pre-pandemic world, when you would give talks in person, most of the time, you'd give it to a relatively small audience, a couple of hundred people or whatever, as opposed to now, when we do talks, post-pandemic, and they're online, it's like, they're immediately available online.It's hard to give the same talk over and over and over and over again, without somebody potentially having seen it. A lot of work goes into a single talk. Not being able to use the same time over and over again, is not great. But, how do you refine it? Is it that you tested it with a live audience, or do you use a family member or a friend, or a colleague? How do you test and refine your talks?Amy: I'm actually an organizer at a meetup group, and specifically built around giving people of marginalized gender identities, and a place to stage and write technical content. It is a very specific audience.Jeremy: I can imagine.Amy: But it addresses that issue I had earlier about visibility, it also does help you ... If you don't have a lot of contacts in this industry, just as an aside, technical speaking is a way to do it, because everyone loves talking to each other after the stress has worn off, and you become the friendliest person after you've done that.But also, there are meetup groups out there, specifically about doing technical feedback, or just general speaking feedback. If you want to do something general, Toastmasters is a great organization to do. If you want to do strictly technical, if you do any cloud-related stuff, the DevOps communities are super friendly, even if it's not specifically about DevOps. I'm not a DevOps person, but I have a lot of DevOps friends. Some of my best friends are DevOps people.And you can get on a meetup or a Zoom call and just burn through your slides for about 10 or 15 minutes and see ... Your friends will be very honest with you, in a small group.Jeremy: Right. One of the things I did notice, too, giving a speech in person or giving your talk in person versus giving a talk via Zoom call, is sometimes when you don't hear any laughs or chuckles from a little joke that you make in there, it can feel very lonely in that space after you're waiting for something in there, but. It's a little bit ...Amy: It's worse when there are people in the room. I assure you, it is so much worse.Jeremy: That is very true. If something falls flat, that's a good point. Just going back to more this idea of creating good talks, and what makes a good talk. Where do you find ... You mentioned, maybe it's a vendor conference or something and you maybe install the vendor stuff, and you find the bugs and so forth. But is there any other places that you get inspiration from? Are there any resources you use to sort build some of these talks?Amy: Again, the communities help. The communities will tell you, really, it's like, I don't understand this thing, can someone hop on a call with me for real quick minute and explain why this concept is so hard? That's a very good place to base your talk off. As far as making them engaging, and interesting, I tend to clone video gaming videos, just because that's what I watch. I know, if it's going to be interesting to me, then it will probably be at least different than the content that's out there.Jeremy: Right. That's a good way to think of things too, is if it's something that you find interesting, chances are, there are lots of other people that will find that interesting. All right, let's go back to just this idea of creating new talks. You had mentioned this idea of, again, finding the bugs and so forth. But one of the things that I think we see quite a bit is always that bleeding edge stuff. People always want to write content about something new that happened.I'm guilty of this, I would think from a serverless standpoint where you're talking about things that are really, really bleeding edge. It's useful and they're interesting. Certainly, if you go to a conference about serverless, then it's really nice to see you have these talks and what might be possible. But sometimes when you're going to more practical type things. Again, even DevOps Days, and some of those other things, I think you've got attendees or talk listeners who are looking for very practical advice.I guess the question is like, how do you take a new piece of content, one of these problems, whatever it is. I guess, how do you keep finding new content is probably the better way to ask that question?Amy: Well, to just roll back just a little bit. My problem with bleeding edge content, I love watching it, but bleeding edge content will almost always be a product demo because it's someone who developed a new solution, and they want to share with everybody, which is just going to walk you through how it's used, which is great, except, and this is just a nature of what the cloud industry is like, all of this stuff, it changes day-to-day.These tools may not be applicable in a few months, or they may become the new standard. There's no way to tell until you're already six months out, and by then, they've already gone through several product revisions. I once did a talk where I was talking about best practices, and AWS released their updated best practices the day before my talk, and I had to update three slides. It threw off my timing, it was great.That's just one of those kind of pitfalls that you have to roll with. As far as getting new content, though, especially if you're dealing ... It depends who your audience is, because my audience tends to be either ICs or technical leads, and by then you're usually in a company ... If you're not developing these bleeding edge solutions, you're just using the tools that's out there already.You had brought up my "Serverless Frenemies," which is still my favorite title of any talk that I've ever made, because when I did the managing containers one, and I love all my Devro friends, but they all got into my mentions about why don't you just use Fargate? If you're at the containerization stage, why don't you just use Fargate, because it's not even close to the same thing, it is closer to Kubernetes than it is to Lambda, and I'm looking for a Lambda-like solution. That's what that whole deal was about, and I was able to stretch that out into I think 30 minutes because Twitter will tell you what's wrong, whether or not it's accurate or not, and whether or not they're actually your friends. They are my friends, but come on.Jeremy: Twitter can definitely be brutal. I think that, and maybe unpack a little bit what you were saying, is you're creating content around existing tools. One way to do it is, you're using existing tools, you're creating content around that, or you can create content around that. Looking at those solutions, you introduce a new solution to something, or you're even using an existing tool, nothing's perfect. You had mentioned that idea of bugs and so forth. But just, I guess new solutions, or just solutions, in general, maybe higher-level abstractions, everything creates some new type of problem that you have to deal with, and that's probably a pretty effective way to generate new content.Amy: It is. If you ever have to write down an RCA, which, for those who have not had the pleasure of doing one is called a root cause analysis, where you took down production, and you had to explain why.Jeremy: Yep.Amy: Or you ever did this, hopefully, in stage, or hopefully, in development where you ran into a situation where ... I had a situation once where Lambda would not delete itself. I call it my Skynet problem where it just hit a stage where it was both trying to save and delete at the same time. It would lock itself and I had to destroy the entire stack and send that command several times just to force that command through.If you ever have a problem like that, that is a thing that you write up instantly, and then you turn it into slide decks, and then you go to SlidesCarnival, you throw a very flashy background on it, and next thing you know, you have a TED talk, or a technical talk.Jeremy: Right. The other thing too, is, I find use cases to be an interesting, just like ... Non-traditional use cases are kind of fun too, how can I use this in a way that it wasn't meant to be used, and do something like that?Amy: I love those. Those are my favorite. I love watching people break away from what the tutorial says you have to do, and I'm going to get a little weird with it, and that to me is totally fascinating. When the whole, I fed these scripts into a computer meme came out, I thought that was super fascinating because that was something a company I had worked for did, they used analytics ... I used to work for Fantasy Sports, to write color commentary for your Fantasy Football team, and they would send it out.If you did really well, you would get a really raving review, and if you did really poorly, you would get roasted by a computer, and then that gets sent to everyone in the league, and it's hilarious. But that is not a thing that you would just assume a computer would do, is just write hot takes on your Fantasy Football team.Jeremy: That's ... Sure, go ahead.Amy: It's so much fun. I love watching people get weird with the tools that are there.Jeremy: There are times where you could do something like that, you could maybe create a content around some strange use case or whatever, and I love that idea of getting weird with that. The other part of it, though, is that, I guess, if you're sitting through a talk, and it's some super interesting problem that you're listening to, and again, I don't know, maybe it's some database replication thing, that you're just really into, whatever. That makes sense. But I think the majority of problems that developers have, are not that interesting, they're just frustrating.Probably the worst thing to do is wanting to sit through a talk that talks about some frustrating issue you have. Is there a way to basically say, "Look, I have a problem that I want to talk about. It's not the most interesting problem, but how can you flip that and take a problem that's not interesting and make it interesting?Amy: The batching containers and the frenemies talk was all based off of a bin library error from within the Lambda AMI. That, on paper is extremely boring, and should be a thing that you can easily look up, it is not. When I went around it trying to make tracking down library errors interesting, just saying it is very slow and can drain the energy out of your voice.But, I put a lot of energy into my work in general, and that's just how I had to approach pulling these talk is like, I like what I do, just, generally. When I try to explain what I do to people, it sounds super boring, and I own that. Now I'm doing it with spreadsheets, which is much, much worse. But when I tell people, it's not about the error itself, it's about everything that happened to make this one particular error happen. The reason why this error happened was because Lambda uses AWS's very specific Linux AMI when they did not used to, and they left stuff out for either security or performance purposes.Whether or not we as a group agree with that, that's a business decision that they made. How does their business decision affect your future business decisions and your future technical ones? Well, that becomes a way more interesting conversation, because it's like, we know this is going to break at this part, do we still want to use SSH? Do we still need it for this reason? You can approach it more from a narrative standpoint of, I wasted way too much time with this, did I need to? It's like, well, you shouldn't have, this should not have happened, but no bug should have happened, right?Jeremy: Right.Amy: You work through your process of finding a solution instead of concentrating on what the solution is because the solution they can look up in your show notes later.Jeremy: Right. No, I love that idea of documenting your process as opposed to just the solution itself. You find the problem, you pull the thread and where does that take you? I think to myself, a lot of times I go down the rabbit hole on trying to find the solution to a problem that I have or a bug fix, whatever. Sometimes, the resolution is underwhelming. Maybe it's not worth sharing. But other times, there's a revelation in there. I think you're right, with a little bit of storytelling, you can usually take that and turn that into a really interesting talk.Amy: One of the things it will also do, if you look at it from a process and from a narrative standpoint, is that when you take this video, and you send it to either a technical lead or a product manager, they'll understand what the problem was because you did not bog it down with code. There's very little live code in mine because I understand that people build things differently, just because every code is as different as every person. I get that and I've come to terms with it. This is the best way to share that information.Jeremy: Absolutely. All right, let's wrap up the idea of building talks. What is your advice to someone who is starting out new? What's the best way for them to get started, or what's just some general advice for people starting to build talks?Amy: The best content new engineers can do, and that's mostly because this is never the standpoint from which tutorials are ever written in, is that, as someone who knows very little of the way a language or a framework should work, write down your process, the entire thing on you getting either a framework onboarded, how you build, and a messaging system, things that people have written a billion times because chances are, one, you got that work from someone else's blog post or their documentation, and you can cite that. And two, when you do it that way, you not only get into the habit of writing, but you get in the habit of editing it in a way that makes it more palatable for people who are not in your specific experience.When you do it this way, people can actually see, from an outsider's perspective, exactly what is hard about the thing that they built, or what people who do not have a different level of experience are going through. If a tutorial is targeted at engineers who know where the memory leaks in PHP are, that's the thing that comes with experience, that is not the thing that can be trained.When a new engineer hits that point, and they found it in a new framework where you fix it, then you start knowing where to fix other problems. That way more senior engineers and more vetted people can learn from your experience, and then they will contact you and they will teach you how to find these issues, so you don't run into them again, and you end up with someone you can just bounce ideas off of. That's how you get pulled into these technical communities. It's a really self-healing process.Jeremy: Yeah. I love that. I think this idea of you approaching something from a slightly different angle, your experience, the way that you do it, the way that you see it, the way that you perceive the word or the next prompt that comes back, or how you read an error message or any of those things, you sharing your experience around that is hugely valuable to the people that are building these things. But also, you may run into problems that other people like you run into, and it's just ... Sometimes, all it takes is just a tiny twisting of the words, rearranging a sentence in a way that now that clicks with somebody where the other time it didn't. I love that.That's why I always encourage people, just even if somebody has written his content 100 times before, whatever slight difference there is in your content, that could have a powerful effect on someone else.Amy: Yeah, it really can.Jeremy: Awesome. All right, let me ask you a couple of questions about Lambda and Functions as a Service because I know that you spent quite a bit of time on this stuff. I guess a question, especially, maybe even from a cloud economist, what's next for Lambda and Functions as a Service? Because I know you've written about the Lambda containers, but what's maybe that next evolution?Amy: What AWS did recently when they released Lambda Containers is basically put it at feature parity with Azure and GCP, which already had that ability, they had either a function service or a function to Json service where you could upload your own container. They finally released the base image, where, granted, if you knew where to look, you could get it before, but they actually released it, and announced it to the general public, so you don't have to know someone in order to be able to use it.What I see a lot of people being able to do with this now is they really want to do local development testing, so they don't have to push anything to their account and rack up those charges, when all that you want to do is make sure that whatever one line update you made, actually worked and you didn't put the space or the cab in the wrong place, which is, I guess, how it works now and it takes down the entire stack, which again, we've all done at least once, so don't worry about it. If you've ever taken down production, don't worry, you're not the only one, I promise you. You can't throw a t-shirt into an empty conference room and not hit a dude who took down production. I'm going to save that for later.Local development testing, live simulation is a really big thing. I've seen asked to do full-on data science just on Lambda containers, so they don't have to use Kubernetes anymore, because speaking of cost stuff, it's easier to track cost-wise than Kubernetes is, because Kubernetes is purely consumption-based, and you have to tie a bunch of stuff together in order to make that tracking work. That would be great.I think from here on, and a lot of the FaaS changes, they're not going to be front ends anymore, it's all going to be optimizations by the providers, you're not going to see much of that anymore. It's not like before, where they would add three more fields and make a blog post about it. I think everything is just going to be tuning just from Lambda's perspective now. That and hooking it to more things, because they love their integrations. What good is Lambda if you can't integrate it yourself?Jeremy: Right, if you can't hook it up to events. It's interesting, though, this move to support containers as a packaging format. You're right, I think this has been available in IBM, it's been available in Google, it's been available in Microsoft, these capabilities have existed for a while to use a container, and again, that's a very overloaded word, I know, but to use that as a packaging format. But moving to that, the parity there with the other cloud providers is one thing, but who's that conversation for? Whose mind does that change about serverless, or FaaS, I guess.Amy: The security team.Jeremy: Security, okay.Amy: Because if you talk to any engineer, if it's a technical problem, they'll find a way to fix it. That's just the way, especially at the individual contributor level, that's how the brain works is like, oh, this is a small thing, I bet I can fix it with a few days, or a weekend. Weekend turns into a month, but that's a completely different problem. I've had clients who did not want to use Lambda because they could not control the containerization system. You would be pushing your code into containers that were owned by Amazon, and the way they saw that, they saw that as liability.While it does have some very strong technical implications, because you're now able to choose the kind of runtime you do, easier than trying to hamstring layers together, because I know layers is supposed to fix this problem, but it's so hard. It's so hard for something that you should be able to download off of Docker and then play with it and then put it back. It's so unnecessarily hard, and it makes me so angry.If you're willing to incur that responsibility, you can tweak your memory and you have more technical control, but also you have more control at a business level too, and that is a conversation that will go way easier as far as adoption.Jeremy: Right. The other thing, in terms of, I guess the complexity of running K8s or running Kubernetes is one of those things where that just seems like a lot of complexity. You mentioned the billing aspect of it and trying to track cost. Not that everyone's trying to narrow down exactly how much this Lambda container ran them, maybe you have more insight into that than I do, but the idea of just the complexity.It seems to me that if you start thinking about cost, that the total cost of ownership of running a container and a Lambda function or running it in Fargate, versus having to install and maintain ... I would say, even if you're using one of the managed services like EKS, or something like that, that the total cost of ownership of going down the serverless route has got to be better.Amy: Yeah, especially if you're one of these apps that are very user generater based. You're tracking mostly events and content, and not even a huge amount of content, you're not streaming video, you're sharing pictures, or sharing ... If you were trying to rebuild Foursquare, you would just be sharing Geo data, which is comparatively an extremely small piece of data.You don't need an entire instance, or an entire container to do that. You can do that on a very small scale, and build that out really quickly. That said, if you go from one of these three-person teams, and then there's interest in your product, for whatever reason, and it explodes, then not just your cost, but if you had to manage the traffic of that, if you had to manage the actual resources of that, and you did not think your usage would stick with your bill, that's not great.Being able to, at least in the first few years of the company, just use Lambda for everything, that's probably just a safer solution, because you're still rapidly iterating, and you're still changing things very quickly, and you're still transmitting very small bits of data. That said, it's like there are also large enterprise companies that are heavy Lambda users, and even their Lambda bill compared to their Kubernetes bill, it is ... If you round it to down there Kubernetes bill, you would get their Lambda bill.Jeremy: Right. Gotcha. I think that's really interesting because I do ... I actually would love to know your thoughts and whether you even see this. I don't know if we have enough data yet to know this, but this idea of using Lambda, especially early on in startups, or even projects within an enterprise, being able to have that flexibility and the low operational overhead and so forth, I think is really great. But do you see that, or is that something that you think will happen is, you'll get to a point where you'll say we've found some sort of stability point with this product, where we now need to move it over to something like Kubernetes, or a container management system because overall, it's going to end up being cheaper in the long run.Amy: What usually happens when you're making that transition from Lambda to either even ECS or Fargate, or eventually Kubernetes is that your business logic has now become so complex, or your infrastructure requirements have become so complex that Lambda can't do it cleanly anymore. You end up maxing out on either memory or CPU utilization, or because you're ... Apparently Lambda has a limit on how many times you can invoke it at the same time, which some people have hit in real life.Those are times when it stops being a cheaper solution, and it stops being a target solution because you can run your own FaaS environment within instances, and then you can have a similar environment to what you're building so you don't have to rebuild everything, but you don't have to incur that on-demand cost anymore. That's one path I've seen someone take, and that's usually the decision is that Lambda, before, when it was limited, can't hold it.Now that you can put your own container, so long as it fits in that requirement, you can pad that runway out a little bit, and you can stretch out how long you have before you do a full conversion to ECS environment. But that is usually how it is because you just try to overload or you have, maybe, 50 Lambdas trying to support one application, which is totally a thing you can do, it may not be the best ... Even with Step, even with everything else. When that becomes too complex, and you end up just going through containers, anyway.Jeremy: Right. I think that's interesting, and I think any company that grows to the point where that they need to start thinking about that next little infrastructure, it's probably a good thing. It's a good point to start having those conversations.All right, I got just one more question for you, because I'm really interested. You mentioned what you do as a cloud economist, reading through people's bills and things like that. Now, I thought Corey just made this thing up. I didn't even know this thing existed until, Corey comes out, and he probably coined the term. But in terms of that ...Amy: That's what he tells people.Jeremy: He does tell people that, right. I think he did. So, I will definitely give him credit there. But in terms of that role, of being a cloud economist and having to look through people's bills, and trying to find them ways to save it, that's pretty insane that we need people like you to do that, isn't it?Amy: Yes, it's a bananas job. I cannot believe this is a job that I'm actually doing. It's also a lot of fun. But if you think about it, that when I was starting out, and everything was LAMP stack, when I started. That was a hot new tech when I started, was the LAMP stack. The solution to all of those problems were we're going to throw more hardware at it. Then the following question was, why are we spending so much on hardware?Their solution to that problem was, we're going to buy real estate to store all of the hardware on. Now that you don't have to do that, you still have the problem of, I'm going to solve this problem by throwing more hardware at it. That's still a mindset that is alive and well, and you still end up with the same problem, except now you don't have the excuse that at least we own the facility that data is in because you don't anymore.Since you don't actually own the cases and the plates and everything, you don't have to worry about disposing of them and having to use stuff that you don't actually use anymore. A lot of my problems are, one of our services has gone out of control, we don't know why. Then I will tell you, who is spending that money. I will talk to that team to make sure that they know that it's happening because sometimes they don't even know what's happening. Something got spun up into their account, and maybe it was a testbed, maybe it was a demo, maybe they hired a vendor to load something into their environment and those costs got out of control.It's not like I'm going out trying to tell you that you did something wrong. It's like, this is where the problem is, let's go find out what happened. Forensic cloud bill person, I'm going to workshop that into a business card, because that sounds way better than the title that Corey uses.Jeremy: Forensic cloud accountant or something like that.Amy: Yes.Jeremy: I think it's also interesting that billing is, and the bills you get from AWS are a leading indicator of things that are potentially going wrong. Interesting, because I don't know if people connect this. Maybe I'm underestimating people here, but the idea that a bill that runs, or that you're seeing EC2 instances cost spiking, or you're seeing a higher load or higher bandwidth or things like that. Those can all be indicators of poorly written code, it can be indicators of the bad compression or missing compression settings, all kinds of things that it can jump out at you. Unless somebody is paying attention to those bills, I don't think most developers and most teams, they're not going to see that.Amy: Yeah. The only time they pay attention when things start spiraling out of control, and ... Okay, this sounds like an intuitive issue, and first thing people will do, will go, "We're going to log everything, and we're going to find out where the problem is."Jeremy: It'll cost you more money.Amy: There is a threshold where cloud watch becomes very expensive.Jeremy: Right, absolutely.Amy: Then they hit that threshold, and now their bill is four times as much.Jeremy: Right.Amy: A lot of the times it's misconfiguration, it's like, very rarely does any product get to the point where they just can't ... It's built so poorly that it can barely hold itself up. That's never been the case. It's always been, this has been turned off, or AWS also offers S3 analytics. You have to turn them on per bucket, that's not a policy that's usually written in anyone's AWS config. When they launch it, they just launch it without any analytics. They don't know if the thing is supposed to be sending things to Glacier, if it's highly used data, there's no way to tell.It's trying to find little holes like that, where it seems like it shouldn't be a problem, but the minute it becomes a problem, it's because you spent $20,000.Jeremy: Right. Yeah. No, you can spend money very, very fast in the cloud. I think that is a lesson learned by many, many people.Amy: The difference between being on metal and throwing hardware at a problem and being on the cloud and throwing hardware at a problem is that you can throw hardware at a problem at scale on the cloud.Jeremy: Exactly. Right. There's no stopping point like we have to go by using servers ...Amy: No one will stop you.Jeremy: No one will stop you. Just maybe the credit card company or whatever. Anyways, Amy, you are doing some amazing work with that, because I actually find that to be very, very fascinating. I think, in terms of what that can do, and the need for it, it's a fascinating field, and super interesting. Good for Corey for really digging into that and calling it out. Then again, for people like you who are willing to take that job, because that seems to me like poring through those numbers can't be the most interesting thing to do. But it must feel good when you do find a way to save somebody some money.Amy: Spreadsheets can be interesting. Again, it's like everything else about my job. If I try to explain why it's interesting, I just make it sound more boring.Jeremy: Awesome. All right. Well, let's leave it there. Amy, thank you again, for joining me, this was awesome. If people want to find out more about you, or maybe they have horribly large AWS cloud bills, and they want to check out the Duckbill Group, how do they do that?Amy: Honestly, if you search for Corey Quinn, you can find the Duckbill Group real fast. If you want to go talk to me because I like doing community engagement, and I like doing talks, and I like roasting people on Twitter just about different stuff, you can hit me up on Twitter @nerdypaws. If you want to be a professional, I'm also on LinkedIn under Amy Codes.Jeremy: All right, and then you also have a website, Amy-codes.com.Amy: Amy-codes.com is the archive of all my talks. It's currently only showing the talks from last year because for some reason, it's somehow became very hard to find a spot for the past year. Who knew?Jeremy: A lot of people doing talks. But anyways, all right, Amy, thank you again. Appreciate it.Amy: Thank you. Had so much fun.

amazon google service zoom local microsoft excellence cloud ibm evolving dungeons and dragons yahoo technical fantasy football certificates aws lamp faa apis devops azure functions googling forensic toastmasters cpu geo skynet php s3 glacier docker kubernetes rca gotcha fantasy sports lambda foursquare baas ics json serverless gcp ssh ecs nasa ames research center eks ec2 k8s corey quinn devopsdays lambdas fargate amy amy aws summit amy it amy you duckbill group jeremy you jeremy it cbt nuggets amy thank amy well

Episode #101: How Serverless is Becoming More Extensible with Julian Wood

Play Episode Listen Later May 17, 2021 64:40

About Julian WoodJulian Wood is a Senior Developer Advocate for the AWS Serverless Team. He loves helping developers and builders learn about, and love, how serverless technologies can transform the way they build and run applications at any scale. Julian was an infrastructure architect and manager in global enterprises and start-ups for more than 25 years before going all-in on serverless at AWS.Twitter: @julian_woodAll things Serverless @ AWS: ServerlessLandServerless Patterns CollectionServerless Office Hours – every Tuesday 10am PTLambda ExtensionsLambda Container ImagesWatch this episode on YouTube: https://youtu.be/jtNLt3Y51-gThis episode sponsored by CBT Nuggets and Lumigo.TranscriptJeremy: Hi everyone, I'm Jeremy Daly and this is Serverless Chats. Today I'm joined by Julian Wood. Hey Julian, thanks for joining me.Julian: Hey Jeremy, thank you so much for inviting me.Jeremy: Well, I am super excited to have you here. I have been following your work for a very long time and of course, big fan of AWS. So you are a Serverless Developer Advocate at AWS, and I'd love it if you could just tell the listeners a little bit about your background, so they get to know you a bit. And then also, sort of what your role is at AWS.Julian: Yeah, certainly. Well, I'm Julian Wood. I am based in London, but yeah, please don't let my accent fool you. I'm actually originally from South Africa, so the language purists aren't scratching their heads anymore. But yeah, I work within the Serverless Team at AWS, and hopefully do a number of things. First of all, explain what we're up to and how our sort of serverless things work and sort of, I like to sometimes say a bit cheekily, basically help the world fall in love with serverless as I have. And then also from the other side is to be a proxy and sort of be the voice of builders, and developers and whoever's building service applications, and be their voices internally. So you can also keep us on our toes to help build the things that will brighten your days.And just before, I've worked for too many years probably, as an infrastructure racker, stacker, architect, and manager. I've worked in global enterprises babysitting their Windows and Linux servers, and running virtualization, and doing all the operations kind of stuff to support that. But, I was always thinking there's a better way to do this and we weren't doing the best for the developers and internal customers. And so when this, you know in inverted commas, "serverless way" of things started to appear, I just knew that this was going to be the future. And I could happily leave the server side to much better and cleverer people than me. So by some weird, auspicious alignment of the stars, a while later, I managed to get my current dream job talking about serverless and talking to you.Jeremy: Yeah. Well, I tell you, I think a lot of serverless people or people who love serverless are recovering ops and infrastructure people that were doing racking and stacking. Because I too am also recovering from that and I still have nightmares.I thought that it was interesting too, how you mentioned though, developer advocacy. It's funny, you work for a specific company, AWS obviously, but even developer advocacy in general, who is that for? Who are you advocating for? Are you advocating for the developers to use the service from the company? Are you advocating for the developers so that the company can provide the services that they actually need? Interesting balance there.Julian: Yeah, it's true. I mean, the honest answer is we don't have great terms for this kind of role, but yeah, I think primarily we are advocating for the people who are developing the applications and on the outside. And to advocate for them means we've got to build the right stuff for them and get their voices internally. And there are many ways of doing that. Some people raise support requests and other kind of things, but I mean, sometimes some of our great ideas come from trolling Twitter, or yes, I know even Hacker News or that kind of thing. But also, we may get responses from 10 different people about something and that will formulate something in our brain and we'll chat with other kind of people. And that sort of starts a thing. It's not just necessarily each time, some good idea in Twitter comes in, it gets mashed into some big surface database that we all pick off.But part of our job is to be out there and try and think and be developers in whatever backgrounds we come from. And I mean, I'm not a pure software developer where I've come from, and I come, I suppose, from infrastructure, but maybe you'd call that a bit of systems engineering. So yeah, I try and bring that background to try and give input on whatever we do, hopefully, the right stuff.Jeremy: Right. Yeah. And then I think part of the job too, is just getting the information out there and getting the examples out there. And trying to create those best practices or at least surface those best practices, and encourage the community to do a lot of that work and to follow that. And you've done a lot of work with that, obviously, writing for the AWS blog. I know you have a series on the Serverless Lens and the Well-Architected Framework, and we can talk about that in a little while. But I really want to talk to you about, I guess, just the expansion of serverless over the last couple of years.I mean, it was very narrowly focused, probably, when it first came out. Lambda was ... FaaS as a whole new concept for a lot of people. And then as this progressed and we've gotten more APIs, and more services and things that it can integrate with, it just becomes complex and complicated. And that's a good thing, but also maybe a bad thing. But one of the things that AWS has done, and I think this is clearly in reaction to the developers needing it, is the ability to extend what you can do with a Lambda function, right? I mean, the idea of just putting your code in there and then, boom, that's it, that's all you have to do. That's great. But what if you do need access to lifecycle hooks? Or what if you do want to manipulate the underlying runtime or something like that? And AWS, I think has done a great job with that.So maybe we can start there. So just about the extensibility of Lambda in general. And one of the new things that was launched recently was, and recently, I don't know what was it? Seven months ago at this point? I'm not even sure. But was launched fairly recently, let's say that, is Lambda Extensions, and a couple of different flavors of that as well. Could you kind of just give the users an over, the users, wow, the listeners an overview of what Lambda Extensions are?Julian: I could hear the ops background coming in, talking about our users. Yeah. But I mean, from the get-go, serverless was always a terrible term because, why on earth would you name something for what it isn't? I mean, you know? I remember talking to DBAs, talking about noSQL, and they go, "Well, if it's not SQL, then what is it?" So we're terrible at that, serverless as well. And yeah, Lambda was very constrained when it came out. Lambda was never built being a serverless thing, that's what was the outcome. Sometimes we focus too much on the tools rather than the outcome. And the story is S3, just turning 15. And the genesis of Lambda was being an event trigger for S3, and people thought you'd upload something to S3, fire off a Lambda function, how cool is that? And then obviously the clever clubs at the time were like, "Well, hang on, let's not just do this for S3, let's do this for a whole bunch of kind of things."So Lambda was born out of that, as that got that great history, which is created an arc sort of into the present and into the future, which I know we're also going to get on about, the power of event driven applications. But the power of Lambda has always been its simplicity, and removing that operational burden, and that heavy lifting. But, sometimes that line is a bit of a gray area and there're people who can be purists about serverless and can be purists about FaaS and say, "Everything needs to be ephemeral. Lambda functions can't extend to anything else. There shouldn't be any state, shouldn't be any storage, shouldn't be any ..." All this kind of thing.And I think both of us can agree, but I don't want to speak for you, but I think both of us would agree that in some sense, yeah, that's fine. But we live in the real world and there's other stuff that needs to connect to and we're not here about building purist kind of stuff. So Lambda Extensions is a new way basically to integrate Lambda with your favorite tools. And that's the sort of headline thing we like to talk about. And the big idea is to open up Lambda to more effectively work mainly with partners, but also your own tools if you want to write them. And to sort of have deeper hooks into the Lambda lifecycle.And other partners are awesome and they do a whole bunch of stuff for serverless, plus customers also have connections to on-prem staff, or EC2 staff, or containers, or all kind of things. How can we make the tools more seamless in a way? How can we have a common set of tools maybe that you even use on-prem or in the cloud or containers or whatever? Why does Lambda have to be unique or different or that kind of thing? And Extensions is sort of one of the starts of that, is to be able to use these kind of tools and get more out of Lambda. So I mean, just the kind of tools that we've already got on board, there's things like Splunk and AppDynamics. And Lumigo, Epsagon, HashiCorp, Honeycomb, CoreLogic, Dynatrace, I can't think. Thundra and Sumo Logic, Check Point. Yeah, I'm sorry. Sorry for any partners who I've forgotten a few.Jeremy: No, right, no. That's very good. Shout them out, shout them out. No, I mean just, and not to interrupt you here, but ...Julian: No, please.Jeremy: ... I think that's great. I mean, I think that's one of the things that I like about the way that AWS deals with partners, is that ... I mean, I think AWS knows they can't solve all these problems on their own. I mean, maybe they could, right? But they would be their own way of solving the problems and there's other people who are solving these problems differently and giving you the ability to extend your Lambda functions into those partners is, there's a huge win for not only the partners because it creates that ecosystem for them, but also for AWS because it makes the product itself more valuable.Julian: Well, never mind the big win for customers because ultimately they're the one who then gets a common deployment tool, or a common observability tool, or a HashiCorp Vault that you can manage secrets and a Lambda function from HashiCorp Vault. I mean, that's super cool. I mean, also AWS services are picking this up because that's easy for them to do stuff. So if anybody's used Lambda Insights or even seen Lambda Insights in the console, it's somewhere in the monitoring thing, and you just click something over and you get this tool which can pull stuff that you can't normally get from a Lambda function. So things like CPU time and network throughput, which you couldn't normally get. But actually, under the hoods, Lambda Insights is using Lambda extensions. And you can see that if you look. It automatically adds the Lambda layer and job done.So anyway, this is how a lot of the tools work, that a layer is just added to a Lambda function and off you go, the tool can do its work. So also there's a very much a simplicity angle on this, that in a lot of cases you don't have to do anything. You configure some of the extensions via environment variables, if that's cooled you may just have an API key or a log retention value or something like that, I don't know, any kind of example of that. But you just configure that as a normal Lambda environment variable at this partner extension, which is just a Lambda layer, and off you go. Super simple.Jeremy: Right. So explain Extensions exactly, because I think that's one of those things because now we have Lambda layers and we have Lambda Extensions. And there's also like the runtime API and then something else. I mean, even I'm not 100% sure what all of the naming conventions. I'm pretty sure I know what they do ...Julian: Yeah, fair enough.Jeremy: ... but maybe we could say the names and say exactly what they do as well.Julian: Yeah, cool. You get an API, I get an API, everybody gets an API. So Lambda layers, let's just start, because that's, although it's not related to Extensions, it's how Extensions are delivered to the power core functions. And Lambda layers is just another way to add code to a Lambda function or not even code, it can be a dependency. It's just a way that you could, and it's cool because they are shareable. So you have some dependencies, or you have a library, or an SDK, or some training data for something, a Lambda layer just allows you to add some bits and bobs to your Lambda function. That's a horrible explanation. There's another word I was thinking of, because I don't want to use the word code, because it's not necessarily code, but it's dependency, whatever. It's just another way of adding something. I'll wake up in a cold sweat tonight thinking of the word I was thinking of, but anyway.But Lambda Extensions introduces a whole new companion API. So the runtime API is the little bit of code that allows your function to talk to the Lambda service. So when an event comes in, this is from the outside. This could be via API gateway or via the Lambda API, or where else, EventBridge or Step Functions or wherever. When you then transports that data rise in the Lambda services and HTTP call, and Lambda transposes that into an event and sends that onto the Lambda function. And it's that API that manages that. And just as a sidebar, what I find it cool on a sort of geeky, technical thing is, that actually API sits within the execution environment. People are like, "Oh, that's weird. Why would your Lambda API sit within the execution environment basically within the bubble that contains your function rather than it on the Lambda service?"And the cool answer for that is it's actually for a security mechanism. Like your function can then only ever talk to the Lambda runtime API, which is in that secure execution environment. And so our security can be a lot stronger because we know that no function code can ever talk directly out of your function into the Lambda service, it's all got to talk locally. And then the Lambda service gets that response from the runtime API and sends it back to the caller or whatever. Anyway, sidebar, thought that was nerdy and interesting. So what we've now done is we've released a new Extensions API. So the Extensions API is another API that an extension can use to get information from Lambda. And they're two different types of extensions, just briefly, internal and external extensions.Now, internal extensions run within the runtime process so that it's just basically another thread. So you can use this for Python or Java or something and say, when the Python runtime starts, let's start it with another parameter and also run this Java file that may do some observability, or logging, or tracing, or finding out how long the modules take to launch, for example. I know there's an example for Python. So that's one way of doing extensions. So it's internal extensions, they're two different flavors, but I'll send you a link. I'll provide a link to the blog posts before we go too far down the rabbit hole on that.And then the other part of extensions are external extensions. And this is a cool part because they actually run as completely separate processes, but still within that secure bubble, that secure execution environment that Lambda runs it. And this gives you some superpowers if you want. Because first of all, an extension can run in any language because it's a separate process. So if you've got a Node function, you could run an extension in other kind of languages. Now, what do we do recommend is you do run your extension in a compiled binary, just because you've got to provide the runtime that the extensions got to run in any way, so as a compiled binary, it's super easy and super useful. So is something like Go, a lot of people are doing because you write a single extension and Go, and then you can use it on your Node functions, your Java functions, your PowerShell functions, whatever. So that's a really good, simple way that you can have the portability.But now, what can these extensions do? Well, the extensions basically register with extensions API, and then they say to Lambda, "Lambda, I want to know about what happens when my functions invoke?" So the extension can start up, maybe it's got some initialization code, maybe it needs to connect to a database, or log into an observability platform, or pull down a secret order. That it can do, it's got its own init that can happen. And then it's basically ready to go before the function invokes. And then when the extension then registers and says, "I want to know when the function invokes and when it shuts down. Cool." And that's just something that registers with the API. Then what happens is, when a functioning invoke comes in, it tells the runtime API, "Hello, you now have an event," sends it off to the Lambda function, which the runtime manages, but also extension or extensions, multiple ones, hears information about that event. And so it can tell you the time it's going to run and has some metadata about that event. So it doesn't have the actual event data itself, but it's like the sort of Lambda context, a version of that that it's going to send to the extension.So the extension can use that to do various things. It can start collecting telemetry data. It can alter instrument some of your code. It could be managing a secret as a separate process that it is going to cache in the background. For example, we've got one with AppConfig, which is really cool. AppConfig is a service where you manage parameters external to your Lambda function. Well, each time your Lambda function warm invokes if you've got to do an external API call to retrieve that, well, it's going to be a little bit efficient. First of all, you're going to pay for it and it's going to take some time.So how about when the Lambda function runs and the extension could run before the Lambda function, why don't we just cache that locally? And then when your Lambda function runs, it just makes a local HTTP call to the extension to retrieve that value, which is going to be super quick. And some extensions are super clever because they're their own process. They will go, "Well, my value is for 30 minutes and every 30 minutes if I haven't been run, I will then update the value from that." So that's useful. Extensions can then also, when the runtime ... Sorry, let me back up.When the runtime is finished, it sends its response back to the runtime API, and extensions when they're done doing, so the runtime can send it back and the extension can carry on processing saying, "Oh, I've got the information about this. I know that this Lambda function has done X, Y, Z, so let me do, do some telemetry. Let me maybe, if I'm writing logs, I could write a log to S3 or to Kinesis or whatever. Do some kind of thing after the actual function invocation has happened." And then when it says it's ready, it says, "Hello, extensions API, I'm telling you I'm done." And then it's gone. And then Lambda freezes the execution environment, including the runtime and the extensions until another invocation happens. And the cycle then will happen.And then the last little bit that happens is, instead of an invoke coming in, we've extended the Lambda life cycles, so when the environment is going to be shut down, the extension can receive the shutdown and actually do some stuff and say, "Okay, well, I was connected to my observer HTTP platform, so let me close that connection. I've got some extra logs to flush out. I've got whatever else I need to do," and just be able to cleanly shut down that extra process that is running in parallel to the Lambda function.Jeremy: All right.Julian: So that was a lot of words.Jeremy: That was a lot and I bet you that would be great conversation for a dinner party. Really kicks things up. Now, the good news is that, first of all, thank you for that though. I mean, that's super technical and super in-depth. And for anyone listening who ...Julian: You did ask, I did warn you.Jeremy ... kind of lost their way ... Yes, but something that is really important to remember is that you likely don't have to write these yourself, right? There is all those companies you mentioned earlier, all those partners, they've already done this work. They've already figured this out and they're providing you access to their tools via this, that allows you to build things.Julian: Exactly.Jeremy: So if you want to build an extension and you want to integrate your product with Lambda or so forth, then maybe go back and listen to this at half speed. But for those of you who just want to take advantage of it because of the great functionality, a lot of these companies have already done that for you.Julian: Correct. And that's the sort of easiness thing, of just adding the Lambda layer or including in a container image. And yeah, you don't have to worry any about that, but behind the scenes, there's some really cool functionality that we're literally opening up our Lambda operates and allowing you to impact when a function responds.Jeremy: All right. All right. So let me ask another, maybe an overly technical question. I have heard, and I haven't experienced this, but that when it runs the life cycle that ends the Lambda function, I've heard something like it doesn't send the information right away, right? You have to wait for that Lambda to expire or something like that?Julian: Well, yes, for now, about to change. So currently Extensions is actually in preview. And that's not because it's in Beta or anything like that, but it's because we spoke to the partners and we didn't want to dump Extensions on the world. And all the partners had to come out with their extensions on day one and then try and figure out how customers are going to use them and everything. So what we really did, which I think in this case works out really well, is we worked with the partners and said, "Well, let's release this in preview mode and then give everybody a whole bunch of months to work out what's the best use cases, how can we best use this?" And some partners have said, "Oh, amazing. We're ready to go." And some partners have said, "Ah, it wasn't quite what we thought. Maybe we're going to wait a bit, or we're going to do something differently, or we've got some cool ideas, just give us time." And so that's what this time has been.The one other thing that has happened is we've actually added some performance enhancements during it. So yes, currently during the preview, the runtime and all extensions need to finish before we give you your response back to your Lambda function. So if you're in an asynchronous mode, you don't really care, but obviously if you're in a synchronous mode behind an API, yeah, you don't really want that. But when Extensions goes GA, which isn't going to be long, then that is no longer the case. So basically what'll happen is the runtime will respond and the result goes directly back to whoever's calling that, maybe API gateway, and the extensions can carry on, partly asynchronously in the background.Jeremy: Yep. Awesome. All right. And I know that the plan is to go GA soon. I'm not sure when around when this episode comes out, that that will be, but soon, so that's good to know that that is ...Julian: And in fact, when we go GA that performance enhancement is part of the GA. So when it goes GA, then you know, it's not something else you need to wait for.Jeremy: Perfect. Okay. All right. So let's move on to another bit of, I don't know if this is extensibility of the actual product itself or more so I think extensibility of maybe the workflow that you use to deploy to Lambda and deploy your serverless applications, and that's container image support. I mean, we've discussed it a lot. I think people kind of have an idea, but just give me your quick overview of what that is to set some context here.Julian: Yeah, sure. Well, container image support in a simple sort of headline thing is to be able to build and package your functions as a container image. So you basically build a function using a Docker file. So before if you use a zip function, but a lot of people use Serverless Framework or SAM, or whatever, that's all abstracted away from you, but it's actually creating a zip file and uploading it to Lambda or S3. So with container image support, you use a Docker file to build your Lambda function. That's the headline of what's happening.Jeremy: Right. And so the idea of creating, and this is also, and again, you mentioned packaging, right? I mean, that is the big thing here. This is a packaging format. You're not actually running the container in a Lambda function.Julian: Correct. Yeah, let's maybe think, because I mean, "containers," in inverted commas again for people who are on the audio, is ...Jeremy: What does it even mean?Julian: Yeah, exactly. And can be quite an overload of terms and definitely causes some confusion. And I sort of think maybe there's sort of four things that are in the container world. One, containers is an isolation mechanism. So on Linux, this is UNC Group, seccomp, other bits and pieces that can be used to isolate processes or maybe groups of processes. And then a second one, containers as the packaging mechanism. This is what Docker really popularized and this is about taking some code and the dependencies needed to run the code, and then packaging them all out together, maybe with some metadata to describe it.And then, three is containers as also a design philosophy. This is the idea, if we can package and isolate software, it's easier to run. Maybe smaller pieces of software is easy to reason about and manage independently. So I don't want to necessarily use microservices, but there's some component of that with it. And the emphasis here is on software rather than services, and standardized tooling to simplify your ops. And then the fourth thing is containers as an ecosystem. This is where all the products, tools, know how, all the actual things to how to do containers. And I mean, these are certain useful, but I wouldn't say there're anything about the other kind of things.What is cool and worth appreciating is how maybe independent these things are. So when I spoke about containers as isolation, well, we could actually replace containers as isolation with micro VMs such as we do with Firecracker, and there's no real change in the operational properties. So one, if we think, what are we doing with containers and why? One of those is in a way ticked off with Lambda. Lambda does have secure isolation. And containers as a packaging format. I mean, you could replace it with static linking, then maybe won't really be a change, but there's less convenience. And the design philosophy, that could really be applicable if we're talking microservices, you can have instances and certainly functions, but containers are all the same kind of thing.So if we talk about the packaging of Lambda functions, it's really for people who are more familiar with containers, why does Lambda have to be different? You've got, why does Lambda to have to be a snowflake in a way that you have to manage differently? And if you are packaging dependencies, and you're doing npm or pip install, and you're used to building Docker files, well, why can't we just do that for Lambda at the same things? And we've got some other things that come with that, larger function sizes, up to 10 gig, which is enabled with some of this technology. So it's a packaging format, but on the backend, there's a whole bunch of different stuff, which has to be done to to allow this. Benefits are, use your tooling. You've got your CI/CD pipelines already for containers, well, you can use that.Jeremy: Yeah, yeah. And I actually like that idea too. And when I first heard of it, I was like, I have nothing against containers, the containers are great. But when I was thinking about it, I'm like, "Wait container? No, what's happening here? We're losing something." But I will say, like when Lambda layers came out, which was I think maybe 2019 or something like that, maybe 2018, the idea of it made a lot of sense, being able to kind of supplement, add additional dependencies or code or whatever. But it always just seemed awkward. And some of the publishing for it was a little bit awkward. The versioning used like a numbered versioning instead of like semantic versioning and things like that. And then you had to share it to multiple places and if you published it as a SAR app, then you got global distri ... Anyways, it was a little bit hard to use.And so when you're trying to package large dependencies and put those in a layer and then combine them with a Lambda function, the other problem you had was you still had a maximum size that you could use for those, when those were combined. So I like this idea of saying like, "Look, I'd like to just kind of create this little isolate," like you said, "put my dependencies in there." Whether that's PyCharm or some other thing that is a big dependency that maybe I don't want to install, directly in a Lambda layer, or I don't want to do directly in my Lambda function. But you do that together and then that whole process just is a lot easier. And then you can actually run those containers, you could run those locally and test those if you wanted to.Julian: Correct. So that's also one of the sort of superpowers of this. And that's when I was talking about, just being able to package them up. Well, that now enables a whole bunch of extra kind of stuff. So yes, first of all is you can then use those container images that you've created as your local testing. And I know, it's silly for anyone to poo poo local testing. And we do like to say, "Well, bring your testing to the cloud rather than bringing the cloud to your testing." But testing locally for unit tests is super great. It's going to be super fast. You can iterate, have your Lambda functions, but we don't want to be mocking all of DynamoDB, all of building harebrained S3 options locally.But the cool thing is you've got the same Docker file that you're going to run in Lambda can be the same Docker file to build your function that you run locally. And it is literally exactly the same Lambda function that's going to run. And yes, that may be locally, but, with a bit of a stretch of kind of stuff, you could also run those Lambda functions elsewhere. So even if you need to run it on EC2 instances or ECS or Fargate or some kind of thing, this gives you a lot more opportunities to be able to use the same Lambda function, maybe in different way, shapes or forms, even if is on-prem. Now, obviously you can't recreate all of Lambda because that's connected to IM and it's got huge availability, and scalability, and latency and all that kind of things, but you can actually run a Lambda function in a lot more places.Jeremy: Yeah. Which is interesting. And then the other thing I had mentioned earlier was the size. So now the size of these container or these packages can be much, much bigger.Julian: Yeah, up to 10 gig. So the serverless purists in the back are shouting, "What about cold starts? What about cold starts?"Jeremy: That was my next question, yes.Julian: Yeah. I mean, back on zip functional archives are also all available, nothing changes with that Lambda layers, many people use and love, that's all available. This isn't a replacement it's just a new way of doing it. So now we've got Lambda functions that can be up to 10 gig in size and surely, surely that's got to be insane for cold starts. But actually, part of what I was talking about earlier of some of the work we've done on the backend to support this is to be able to support these super large package sizes. And the high level thing is that we actually cache those things really close to where the Lambda layer is going to be run.Now, if you run the Docker ecosystem, you build your Docker files based on base images, and so this needs to be Linux. One of the super things with the container image support is you don't have to use Amazon Linux or Amazon Linux 2 for Lambda functions, you can actually now build your Lambda functions also on Ubuntu, DBN or Alpine or whatever else. And so that also gives you a lot more functionality and flexibility. You can use the same Linux distribution, maybe across your entire suite, be it on-prem or anywhere else.Jeremy: Right. Right.Julian: And the two little components, there's an interface client, what you install, it's just another Docker layer. And that's that runtime API shim that talks to the runtime API. And then there's a runtime interface emulator and that's the thing that pretends to be Lambda, so you can shunt those events between HTTP and JSON. And that's the thing you would use to run locally. So runtime interface client means you can use any Linux distribution at the runtime interface client and you're compatible with Lambda, and then the interface emulators, what you would use for local testing, or if you want to spread your wings and run your Lambda functions elsewhere.Jeremy: Right. Awesome. Okay. So the other thing I think that container support does, I think it opens up a broader set of, or I guess a larger audience of people who are familiar with containerization and how that works, bringing those two Lambda functions. And one of the things that you really don't get when you run a container, I guess, on EC2, or, not EC2, I'm sorry, ECS, or Fargate or something like that, without kind of adding another layer on top of it, is the eventing aspect of it. I mean, Lambda just is naturally an event driven, a compute layer, right? And so, eventing and this idea of event driven applications and so forth has just become much more popular and I think much more mainstream. So what are your thoughts? What are you seeing in terms of, especially working with so many customers and businesses that are using this now, how are you seeing this sort of evolution or adoption of event driven applications?Julian: Yeah. I mean, it's quite funny to think that actually the event of an application was the genesis of Lambda rather than it being Serverless. I mentioned earlier about starting with S3. Yeah, the whole crux of Lambda has been, I respond to an event of an API gateway, or something on SQS, or via the API or anything. And so the whole point in a way of Lambda has been this event driven computing, which I think people are starting to sort of understand in a bigger thing than, "Oh, this is just the way you have to do Lambda." Because, I do think that serverless has a unique challenge where there is a new conceptual learning maybe that you have to go through. And one other thing that holds back service development is, people are used to a client's server and maybe ports and sockets. And even if you're doing containers or on-prem, or EC2, you're talking IP addresses and load balances, and sockets and firewalls, and all this kind of thing.But ultimately, when we're building these applications that are going to be composed of multiple services talking together through using APIs and events, the events is actually going to be a super part of it. And I know he is, not for so much longer, but my ultimate boss, but I can blame Jeff Bezos just a little bit, because he did say that, "If you want to talk via anything, talk via an API." And he was 100% right and that was great. But now we're sort of evolving that it doesn't just have to be an API and it doesn't have to be something behind API gateway or some API that you can run. And you can use the sort of power of events, particularly in an asynchronous model to not just be "forced" again in inverted commas to use APIs, but have far more flexibility of how data and information is going to flow through, maybe not just your application, but your suite of applications, or to and from your partners, or where that is.And ultimately authentications are going to be distributed, and maybe that is connecting to partners, that could be SaaS partners, or it's going to be an on-prem component, or maybe things in other kind of places. And those things need to communicate. And so the way of thinking about events is a super powerful way of thinking about that.Jeremy: Right. And it's not necessarily new. I mean, we've been doing web hooks for quite some time. And that idea of, something is going to happen somewhere and I want to be notified of it, is again, not a new concept. But I think certainly the way that it's evolved with Lambda and the way that other FaaS products had done eventing and things like that, is just those tight integrations and just all of the, I guess, the connective tissue that runs between those things to make sure that the events get delivered, and that you can DLQ them, and you can do all these other things with retries and stuff like that, is pretty powerful.I know you have, I actually just mentioned this on the last episode, about one of my favorite books, I think that changed my thinking and really got me thinking about how microservices communicate with one another. And that was Building Microservices by Sam Newman, which I actually said was sort of like my Bible for a couple of years, yes, I use that. So what are some of the other, like I know you have a favorite book on this.Julian: Well, that Building Microservices, Sam Newman, and I think there's a part two. I think it's part two, or there's another one ...Jeremy: Hopefully.Julian: ... in the works. I think even on O'Riley's website, you can go and see some preview copies of it. I actually haven't seen that. But yeah, I mean that is a great kind of Bible talking. And sometimes we do conflate this microservices things with a whole bunch of stuff, but if you are talking events, you're talking about separating things. But yeah, the book recommendation I have is one called Flow Architectures by James Urquhart. And James Urquhart actually works with VMware, but he's written this book which is looking sort of at the current state and also looking into the future about how does information flow through our applications and between companies and all this kind of thing.And he goes into some of the technology. When we talk about flow, we are talking about streams and we're talking about events. So streams would be, let's maybe put some AWS words around it, so streams would be something like Kinesis and events would be something like EventBridge, and topics would be SNS, and SQS would be queues. And I know we've got all these things and I wish some clever person would create the one flow service to rule them all, but we're not there. And they've got also different properties, which are helpful for different things and I know confusingly some of them merge. But James' sort of big idea is, in the future we are going to be able to moving data around between businesses, between applications. So how can we think of that as a flow? And what does that mean for designing applications and how we handle that?And Lambda is part of it, but even more nicely, I think is even some of the native integrations where you don't have to have a Lambda function. So if you've got API gateway talking to Step Functions directly, for example, well, that's even better. I mean, you don't have any code to manage and if it's certainly any code that I've written, you probably don't want to manage it. So yeah. I mean this idea of flow, Lambda's great for doing some of this moving around. But we are even evolving to be able to flow data around our applications without having to do anything and just wire up some things in a console or in a terminal.Jeremy: Right. Well, so you mentioned, someone could build the ultimate sort of flow control system or whatever. I mean, I honestly think EventBridge is very close to that. And I actually had Mike Deck on the show. I think it was like episode five. So two years ago, whenever it was when the show came out. I mean, when EventBridge came out. And we were talking and I sort of made the joke, I'm like, so this is like serverless web hooks, essentially being able, because there was the partner integrations where partners could push events onto an event bus, which they still can do. But this has evolved, right? Because the issue was always sort of like, I would have to subscribe to web books, I'd have to build a web hook to get events from a particular company. Which was great, always worked fine, but you're still maintaining that infrastructure.So EventBridge comes along, it creates these partner integrations and now you can just push an event on that now your applications, whether it's a Lambda function or other services, you can push them to an SQS queue, you can push them into a Kinesis stream, all these different destinations. You can go ahead and pull that data in and that's just there. So you don't have to worry about maintaining that infrastructure. And then, the EventBridge team went ahead and released the destination API, I think it's called.Julian: Yeah, API destinations.Jeremy: Event API destinations, right, where now you can set up these integrations with other companies, so you don't even have to make the API call yourself anymore, but instead you get all of the retries, you get the throttling, you get all that stuff kind of built in. So I mean, it's just really, really interesting where this is going. And actually, I mean, if you want to take a second to tell people about EventBridge API destinations, what that can do, because I think that now sort of creates both sides of that equation for you.Julian: It does. And I was just thinking over there, you've done a 10 times better job at explaining API destinations than I have, so you've nailed it on the head. And packet is that kind of simple. And it is just, events land up in your EventBridge and you can just pump events to any arbitrary endpoint. So it doesn't have to be in AWS, it can be on-prem. It can be to your Raspberry PI, it can literally be anywhere. But it's not just about pumping the events over there because, okay, how do we handle failover? And how do we handle over throttling? And so this is part of the extra cool goodies that came with API destinations, is that you can, for instance, if you are sending events to some external API and you only licensed for 1,000 invocations, not invocations, that could be too Lambda-ish, but 1,000 hits on the API every minute.Jeremy: Quotas. I think we call them quotas.Julian: Quotas, something like that. That's a much better term. Thank you, Jeremy. And some sort of quota, well, you can just apply that in API destinations and it'll basically store the data in the meantime in EventBridge and fire that off to the API destination. If the API destination is in that sort of throttle and if the API destination is down, well, it's going to be able to do some exponential back-off or calm down a little bit, don't over-flood this external API. And then eventually when the API does come back, it will be able to send those events. So that does just really give you excellent power rather than maintaining all these individual API endpoints yourself, and you're not handling the availability of the endpoint API, but of whatever your code is that needs to talk to that destination.Jeremy: Right. And I don't want to oversell this to anybody, but that also ...Julian: No, keep going. Keep going.Jeremy: ... adds the capability of enhanced security, because you're not exposing those API keys to your developers or anybody else, they're all baked in and stored within, the API destinations or within an EventBridge. You have the ability, you mentioned this idea of not needing Lambda to maybe talk directly, API gateway to DynamoDB or to step function or something like that. I mean, the cool thing about this is you do have translation capabilities, or transformation capabilities in EventBridge where you can transform the event. I haven't tried this, but I'm assuming it's possible to say, get an event from Salesforce and then pipe it into Stripe or some other API that you might want to pipe it into.So I mean, just that idea of having that centralized bus that can communicate with all these different things. I mean, we're talking about distributed systems here, right? So why is it different sending an event from my microservice A to my microservice B? Why can't I send it from my microservice A to company-wise, microservice B or whatever? And being able to do that in a secure, reliable, just with all of that stuff kind of built in for you, I think it's amazing. So I love EventBridge. To me EventBridge is one of those services that rivals Lambda. It's as, I guess as important as Lambda is, in this whole serverless equation.Julian: Absolutely, Jeremy. I mean, I'm just sitting here. I don't actually have to say anything. This is a brilliant interview and Jeremy, you're the expert. And you're just like laying down all of the excellent use cases. And exactly it. I mean, I like to think we've got sort of three interlinked services which do three different things, but are awesome. Lambda, we love if you need to do some processing or you need to do something that's literally your business logic. You've got EventBridge that can route data from in and out of SaaS partners to any other kind of API. And then you've got Step Functions that can do some coordination. And they all work together, but you've got three different things that really have sort of superpowers in terms of the amount of stuff you can do with it. And yes, start with them. If you land up bumping up against any kind of things that it doesn't work, well, first of all, get in touch with me, I'll work on that.But then you can maybe start thinking about, is it containers or EC2, or that kind of thing? But using literally just Lambda, Step Functions and EventBridge, okay. Yes, maybe you're going to need some queues, topics and APIs, and that kind of thing. But ...Jeremy: I was just going to say, add DynamoDB in there for some permanent state or for some data persistence. Right? Yeah. But other than that, no, I think you nailed it. Honestly, sometimes you're starting to build applications and yeah, you're right. You maybe need a queue here and there and things like that. But for the most part, no, I mean, you could build a lot with those three or four services.Julian: Yeah. Well, I mean, even think of it what you used to do before with API destinations. Maybe you drop something on a queue, you'd have Lambda pull that from a queue. You have Lambda concurrency, which would be set to five per second to then send that to an external API. If it failed going to that API, well, you've got to then dump it to Lambda destinations or to another SQS queue. You then got something ... You know, I'm going down the rabbit hole, or just put it on EventBridge ...Jeremy: You just have it magically happen.Julian: ... or we talk about removing serverless infrastructure, not normal infrastructure, and just removing even the serverless bits, which is great.Jeremy: Yeah, no. I think that's amazing. So we talked about a couple of these different services, and we talked about packaging formats and we talked about event driven applications, and all these other things. And a lot of this stuff, even though some of it may be familiar and you could probably equate it or relate it to things that developers might already know, there is still a lot of new stuff here. And I think, my biggest complaint about serverless was not about the capabilities of it, it was basically the education and the ability to get people to adopt it and understand the power behind it. So let's talk about that a little bit because ... What's that?Julian: It sounds like my job description, perfectly.Jeremy: Right. So there we go. Right, that's what you're supposed to be doing, Julian. Why aren't you doing it? No, but you are doing it. You are doing it. No, and that's why I want to talk to you about it. So you have that series on the Well-Architected Framework and we can talk about that. There's a whole bunch of really good resources on this. Obviously, you're doing videos and conferences, well, you used to be doing conferences. I think you probably still do some of those virtual ones, right? Which are not the same thing.Julian: Not quite, no.Jeremy: I mean, it was fun seeing you in Cardiff and where else were you?Julian: Yeah, Belfast.Jeremy: Cardiff and Northern Ireland.Julian: Yeah, exactly.Jeremy: Yeah, we were all over the place together.Julian: With the Guinness and all of us. It was brilliant.Jeremy: Right. So tell me a little bit about, sort of, the education process that you're trying to do. Or maybe even where you sort of see the state of Serverless education now, and just sort of where it's evolved, where we're getting best practices from, what's out there for people. And that's a really long question, but I don't know, maybe you can distill that down to something usable.Julian: No, that's quite right. I'm thinking back to my extensions explanation, which is a really long answer. So we're doing really long stuff, but that's fine. But I like to also bring this back to also thinking about the people aspect of IT. And we talk a lot about the technology and Lambda is amazing and S3 is amazing and all those kinds of things. But ultimately it is still sort of people lashing together these services and building the serverless applications, and deciding what you even need to do. And so the education is very much tied with, of course, having the products and features that do lots of kinds of things. And Serverless, there's always this lever, I suppose, between simplicity and functionality. And we are adding lots of knobs and levers and everything to Lambda to make it more feature-rich, but we've got to try and keep it simple at the same time.So there is sort of that trade-off, and of course with that, that obviously means not just the education side, but education about Lambda and serverless, but generally, how do I build applications? What do I do? And so you did mention the Well-Architected Framework. And so for people who don't know, this came out in 2015, and in 2017, there was a Serverless Lens which was added to it; what is basically serverless specific information for Well-Architected. And Well-Architected means bringing best practices to serverless applications. If you're building prod applications in the cloud, you're normally looking to build and operate them following best practices. And this is useful stuff throughout the software life cycle, it's not just at the end to tick a few boxes and go, "Yes, we've done that." So start early with the well-architected journey, it'll help you.And just sort of answer the question, am I well architected? And I mean, that is a bit of a fuzzy, what is that question? But the idea is to give you more confidence in the architecture and operations of your workloads, and that's not a goal it's in, but it's to reduce and minimize the impact of any issues that can happen. So what we do is we try and distill some of our questions and thoughts on how you could do things, and we built that into the Well-Architected Framework. And so the ServiceLens has a few questions on its operational excellence, security, reliability, performance, efficiency, and cost optimization. Excellent. I knew I was going to forget one of them and I didn't. So yeah, these are things like, how do you control access to an API? How do you do lifecycle management? How do you build resiliency into your application? All these kinds of things.And so the Well-Architected Framework with Serverless Lens there's a whole bunch of guidance to help you do that. And I have been slowly writing a blog series to literally cover all of the questions, they're nine questions in the Well-Architected Serverless Lens. And I'm about halfway through, and I had to pause because we have this little conference called re:Invent, which requires one or two slides to be created. But yeah, I'm desperately keen to pick that up again. And yeah, that's just providing some really and sort of more opinionated stuff, because the documentation is awesome and it's very in-depth and it's great when you need all that kind of stuff. But sometimes you want to know, well, okay, just tell me what to do or what do you think is best rather than these are the seven different options.Jeremy: Just tell me what to do.Julian: Yeah.Jeremy: I think that's a common question.Julian: Exactly. And I'll launch off from that to mention my colleague, James Beswick, he writes one or two things on serverless ...Jeremy: Yeah, I mean, every once in a while you see something from it. Yeah.Julian: ... every day. The Besbot machine of serverless. He's amazing. James, he's so knowledgeable and writes like a machine. He's brilliant. Yeah, I'm lucky to be on his team. So when you talk about education, I learn from him. But anyway, in a roundabout way, he's created this blog series and other series called the Lambda Operations Guide. And this is literally a whole in-depth study on how to operate Lambda. And it goes into a whole bunch of things, it's sort of linked to the Serverless Lens because there are a lot of common kind of stuff, but it's also a great read if you are more nerdily interested in Lambda than just firing off a function, just to read through it. It's written in an accessible way. And it has got a whole bunch of information on how to operate Lambda and some of the stuff under the scenes, how to work, just so you can understand it better.Jeremy: Right. Right. Yeah. And I think you mentioned this idea of confidence too. And I can tell you right now I've been writing serverless applications, well, let's see, what year is it? 2021. So I started in 2015, writing or building applications with Lambda. So I've been doing this for a while and I still get to a point every once in a while, where I'm trying to put something in cloud formation or I'm using the Serverless Framework or whatever, and you're trying to configure something and you think about, well, wait, how do I want to do this? Or is this the right way to do it? And you just have that moment where you're like, well, let me just search and see what other people are doing. And there are a lot of myths about serverless.There's as much good information is out there, there's a lot of bad information out there too. And that's something that is kind of hard to combat, but I think that maybe we could end it there. What are some of the things, the questions people are having, maybe some of the myths, maybe some of the concerns, what are those top ones that you think you could sort of ...Julian: Dispel.Jeremy: ... to tell people, dispel, yeah. That you could say, "Look, these are these aren't things to worry about. And again, go and read your blog post series, go and read James' blog post series, and you're going to get the right answers to these things."Julian: Yeah. I mean, there are misconceptions and some of them are just historical where people think the Lambda functions can only run for five minutes, they can run for 15 minutes. Lambda functions can also now run up to 10 gig of RAM. At re:Invent it was only 3 gig of RAM. That's a three times increase in Lambda functions within a three times proportional increase in CPU. So I like to say, if you had a CPU-intensive job that took 40 minutes and you couldn't run it on Lambda, you've now got three times the CPU. Maybe you can run it on Lambda and now because that would work. So yeah, some of those historical things that have just changed. We've got EFS for Lambda, that's some kind of thing you can't do state with Lambda. EFS and NFS isn't everybody's cup of tea, but that's certainly going to help some people out.And then the other big one is also cold starts. And this is an interesting one because, obviously we've sort of solved the cold start issue with connecting Lambda functions to VPC, so that's no longer an issue. And that's been a barrier for lots of people, for good reason, and that's now no longer the case. But the other thing for cold starts is interesting because, people do still get caught up at cold starts, but particularly for development because they create a Lambda function, they run it, that's a cold start and then update it and they run it and then go, oh, that's a cold start. And they don't sort of grok that the more you run your Lambda function the less cold starts you have, just because they're warm starts. And it's literally the number of Lambda functions that are running at exactly the same time will have a cold start, but then every subsequent Lambda function invocation for quite a while will be using a warm function.And so as it ramps up, we see, in the small percentages of cold starts that are actually going to happen. And when we're talking again about the container image support, that's got a whole bunch of complexity, which people are trying to understand. Hopefully, people are learning from this podcast about that as well. But also with the cold starts with that, those are huge and they're particular ways that you can construct your Lambda functions to really reduce those cold starts, and it's best practices anyway. But yeah, cold starts is also definitely one of those myths. And the other one ...Jeremy: Well, one note on cold starts too, just as something that I find to be interesting. I know that we, I even had to spend time battling with that earlier on, especially with VPC cold starts, that's all sort of gone away now, so much more efficient. The other thing is like provision concurrency. If you're using provision concurrency to get your cold starts down, I'm not even sure that's the right use for provision concurrency. I think provision concurrency is more just to make sure you have enough capacity because of the ramp-up time for Lambda. You certainly can use it for cold starts, but I don't think you need to, that's just my two cents on that.Julian: Yeah. No, that is true. And they're two different use cases for the same kind of thing. Yeah. As you say, Lambda is pretty scalable, but there is a bit of a ramp-up to get up to many, many, many, many thousands or tens of thousands of concurrent executions. And so yeah, using provision currency, you can get that up in advance. And yeah, some people do also use it for provision concurrency for getting those cold starts done. And yet that is another very valid use case, but it's only an issue for synchronous workloads as well. Anything that is synchronous you really shouldn't be carrying too much. Other than for cost perspective because it's going to take longer to run.Jeremy: Sure. Sure. I have a feeling that the last one you were going to mention, because this one bugs me quite a bit, is this idea of no ops or some people call it ops-less, which I think is kind of funny. But that's one of those things where, oh, it drives me nuts when I hear this.Julian: Yeah, exactly. And it's a frustrating thing. And I think often, sometimes when people are talking about no ops, they either have something to sell you. And sometimes what they're selling you is getting rid of something, which never is the case. It's not as though we develop serverless applications and we can then get rid of half of our development team, it just doesn't work like that. And it's crazy, in fact. And when I was talking about the people aspect of IT, this is a super important thing. And me coming from an infrastructure background, everybody is dying in their jobs to do more meaningful work and to do more interesting things and have the agility to try those experiments or try something else. Or do something that's better or even improve the way your build or improve the way your CI/CD pipeline runs or anything, rather than just having to do a lot of work in the lower levels.And this is what serverless really helps you do, is to be able to, we'll take over a whole lot of the ops for you, but it's not all of the ops, because in a way there's never an end to ops. Because you can always do stuff better. And it's not just the operations of deploying Lambda functions and limits and all that kind of thing. But I mean, think of observability and not knowing just about your application, but knowing about your business. Think of if you had the time that you weren't just monitoring function invocations and monitoring how long things were happening, but imagine if you were able to pull together dashboards of exactly what each transaction costs as it flows through your whole entire application. Think of the benefit of that to your business, or think of the benefit that in real-time, even if it's on Lambda function usage or something, you can say, "Well, oh, there's an immediate drop-off or pick-up in one region in the world or one particular application." You can spot that immediately. That kind of stuff, you just haven't had time to play with to actually build.But if we can take over some of the operational stuff with you and run one or two or trillions of Lambda functions in the background, just to keep this all ticking along nicely, you're always going to have an opportunity to do more ops. But I think the exciting bit is that ops is not just IT infrastructure, plumbing ops, but you can start even doing even better business ops where you can have more business visibility and more cool stuff for your business because we're not writing apps just for funsies.Jeremy: Right. Right. And I think that's probably maybe a good way to describe serverless, is it allows you to focus on more meaningful work and more meaningful tasks maybe. Or maybe not more meaningful, but more impactful on the business. Anyways, Julian, listen, this was a great conversation. I appreciate it. I appreciate the work that you're doing over at AWS ...Julian: Thank you.Jeremy: ... and the stuff that you're doing. And I hope that there will be a conference soon that we will be able to attend together ...Julian: I hope so too.Jeremy: ... maybe grab a drink. So if people want to get a hold of you or find out more about serverless and what AWS is doing with that, how do they do that?Julian: Yeah, absolutely. Well, please get hold of me anytime on Twitter, is the easiest way probably, julian_wood. Happy to answer your question about anything Serverless or Lambda. And if I don't know the answer, I'll always ask Jeremy, so you're covered twice over there. And then, three different things. James is, if you're talking specifically Lambda, James Beswick's operations guide, have a look at that. Just so much nuggets of super information. We've got another thing we did just sort of jump around, you were talking about cloud formation and the spark was going off in my head. We have something which we're calling the Serverless Patterns Collection, and this is really super cool. We didn't quite get to talk about it, but if you're building applications using SAM or serverless application model, or using the CDK, so either way, we've got a whole bunch of patterns which you can grab.So if you're pulling something from S3 to Lambda, or from Lambda to EventBridge, or SNS to SQS with a filter, all these kind of things, they're literally copy and paste patterns that you can put immediately into your cloud formation or your CDK templates. So when you are down the rabbit hole of Hacker News or Reddit or Stack Overflow, this is another resource that you can use to copy and paste. So go for that. And that's all hosted on our cool site called serverlessland.com. So that's serverlessland.com and that's an aggregation site that we run because we've got video talks, and we've got blog posts, and we've got learning path series, and we've got a whole bunch of stuff. Personally, I've got a learning path series coming out shortly on Lambda extensions and also one on Lambda observability. There's one coming out shortly on container image supports. And our team is talking all over as many things as we can virtually. I'm actually speaking about container images of DockerCon, which is coming up, which is exciting.And yeah, so serverlessland.com, that's got a whole bunch of information. That's just an easy one-stop-shop where you can get as much information about AWS services as you can. And if not yet, get in touch, I'm happy to help. I'm happy to also carry your feedback. And yeah, at the moment, just inside, we're sort of doing our planning for the next cycle of what Lambda and what all the service stuff we're going to do. So if you've got an awesome idea, please send it on. And I'm sure you'll be super excited when something pops out in the near issue, maybe just in future for a cool new functionality you could have been involved in.Jeremy: Well, I know that serverlessland.com is an excellent resource, and it's not that the AWS Compute blog is hard to parse through or anything, but serverlessland.com is certainly a much easier resource to get there. S

bible benefits south africa fall in love ga reddit cloud honestly jeff bezos windows ip saas beta ram personally salesforce northern ireland api belfast guinness python aws linux java cardiff faa apis stripe alpine invent sar ubuntu cpu extensions checkpoint vmware sns raspberry pi s3 sql docker node sdks lambda baas stack overflow splunk ci cd firecrackers json serverless honeycomb vms ecs nosql powershell corelogic woodall hashicorp hacker news nfs cdk ec2 efs dynatrace dbn appdynamics dbas kinesis vpc sam newman sumo logic dynamodb senior developer advocate extensible fargate pycharm sqs dockercon building microservices serverless framework amazon linux julian wood james urquhart eventbridge jeremy you thundra cbt nuggets jeremy so julian it

Episode #99: You already have a Multi-Cloud Strategy with Rob Sutter

AMIGOS LEARNING LANGUAGES

Play Episode Listen Later May 3, 2021 62:29

About Rob SutterRob Sutter, a Principal Developer Advocate at Fauna, has woven application development into his entire career, from time in the U.S. Army and U.S. Government to stints with the Big Four and Amazon Web Services. He has started his own company – twice – once providing consulting services and most recently with WorkFone, a software as a service startup that provided virtual digital identities to government clients. Rob loves to build in public with cloud architectures, Node.js or Go, and all things serverless!Twitter: @rts_robPersonal email: rob@fauna.com Personal website: robsutter.comFauna Homepage Learn more about Fauna Supported Languages and Frameworks Try Fauna for FreeThe Calvin PaperThis episode sponsored by CBT Nuggets: https://www.cbtnuggets.com/Watch this video on YouTube: https://youtu.be/CUx1KMJCbvkTranscriptJeremy: Hi, everyone. I'm Jeremy Daly, and this is Serverless Chats. Today, I'm joined by Rob Sutter. Hey, Rob. Thanks for joining me.Rob: Hey, Jeremy. Thanks for having me.Jeremy: So you are now the or a Principal Developer Advocate at Fauna. So I'd love it if you could tell the listeners a little bit about your background and what Fauna is all about.Rob: Right. So as you've said, I'm a DA at Fauna. I've been a serverless user in production since 2017, started the Serverless User Group in Dubai. So that's how I got into serverless in general. Previously, I was a DA on the Serverless Team at AWS, and I've been a SaaS startup co-founder, a government employee, an IT auditor, and like a lot of serverless people I find, I have a lot of Ops in my background, which is why I don't want to do it anymore. There's a lot of us that end up here that way, I think. Fauna is the data API for modern applications. So it's a database that you access as an API just as you would access Stripe for payments, Twilio for messaging. You just put your data into Fauna and access it that way. It's flexible, serverless. It's transactional. So it's a distributed database with asset transactions, and again, it's as simple as accessing any other API that you're already accessing as a developer so that you can simplify your code and ship faster.Jeremy: Awesome. All right. Well, so I want to talk more about Fauna, but I want to talk about it actually in a broader ... I think in the broader ecosystem of what's happening in the cloud right now, and we hear this term "multicloud" all the time. By the way, I'm super excited to have you on. I wanted to have you on for the longest time, and then just schedules, and it's like ...Rob: Yeah.Jeremy: You know how it is, but anyways.Rob: Thank you.Jeremy: No. But seriously, I'm super excited because your tweets, and everything that you've written, and the things that you were doing at AWS and things like that I think just all reinforced this idea that we are living in this multicloud world, right, and that when people think of multicloud ... and this is something I try to be very clear on. Multicloud is not cloud-agnostic, right?Rob: Right.Jeremy: It's a very different thing, right? We're not talking about running the same work load in parallel on multiple service providers or whatever.Rob: Right.Jeremy: We're talking about this idea of using the best services that are available to you across the spectrum of providers, whether those are cloud service providers, whether those are SaaS companies, or even to some degree, some open-source projects that are out there that make up this strategy. So let's start there right from the beginning. Just give me your thoughts on this idea of what multicloud is.Rob: Right. Well, it's sort of a dirty word today, and people like to rail against it. I think rightly so because it's that multicloud 1.0, the idea of, as you said, cloud-agnostic that "write once, run everywhere." All that is, is a race to the bottom, right? It's the lowest common denominator. It's, "What do I have available on every cloud service provider? And then let me write for that as a risk management strategy." That's a cost center when you want to put it in business terms.Jeremy: Right.Rob: Right? You're not generating any value there. You're managing risk by investing against that. In contrast, what you and I are talking about today is this idea of, "Let me use best in class everywhere," and that's a value generation strategy. This cloud service provider offers something that this team understands, and wants to build with, and creates value for the customer more quickly. So they're going to write on that cloud service provider. This team over here has different needs, different customers. Let them write over there. Quite frankly, a lot of this is already happening today at medium businesses and enterprises. It's just not called multicloud, right?Jeremy: Right.Rob: So it's this bottom-up approach that individual teams are consuming according to their needs to create the greatest value for customers, and that's what I like to see, and that's what I like to promote.Jeremy: Yeah, yeah. I love that idea of bottom-up because I think that is absolutely true, and I don't think you've seen this as aggressively as you have in the last probably five years as more SaaS companies have become or SaaS has become a household name. I mean, probably 10 years ago, I think Salesforce was around, and some of these other things were around, right?Rob: Yeah.Jeremy: But they just weren't ... They weren't the household names that they are now. Now, you watch any sport, any professional sport, and you see advertisements for all these SaaS companies now because that seems to be the modern economy. But the idea of the bottom-up approach, that is something where you basically give a developer or maybe you don't give them, but the developer takes the liberties, I would say, to maybe try and experiment with something new without having to do years of research, go through procurement, get approval to use some platform. Even companies trying to move to AWS, or on to Azure, or something like that, they have to go through hoops. Basically, jump through hoops in order to get them there. So this idea of the bottom-up approach, the developers are the ones who are experimenting. Very low-risk experiments, by the way, with some of these other services. That approach, that seems like the right marketing approach for companies that are building these services, right?Rob: Yeah. It seems like a powerful approach for them. Maybe not necessarily the only one, but it is a good one. I mean, there's a historical lesson here as well, right? I want to come back to your point about the developers after, but I think of this as shadow cloud. Right? You saw this with the early days of SaaS where people would go out and sign up for accounts for their business and use them. They weren't necessarily regulated, but we saw even before that with shadow IT, right, where people were bringing their own software in?Jeremy: Right.Rob: So for enterprises that are afraid of this that are heavily risk-focused or control-focused top-down, I would say don't be so afraid because there's an entire set of lessons you can learn about this as you bring it, as you come forward to it. Then, with the developers, I think it's even more powerful than the way you put it because a lot of times, it's not an experiment. I mean, you've seen the same things on Twitter I've seen, the great tech turnover of 2021, right? That's normal for tech. There's such a turnover that a lot of times, people are coming in already having the skills that they know will enhance delivery and add customer value more quickly. So it's not even an experiment. They already have the evidence, and they're able to get their team skilled up and building quickly. If you hire someone who's coming from an AWS shop, you hire someone who's coming from an Azure shop on to two different teams, they're likely going to evolve that excellence or that capability independently, and I don't necessarily think there's a reason to stop that as long as you have the right controls around it.Jeremy: Right. I mean, and you mentioned controls, and I think that if I'm the CTO of some company or whatever, or I'm the CIO because we're dealing in a super enterprise-y world, that you have developers that are starting to use tools ... Maybe not Stripe, but maybe like a Twilio or maybe they're using, I don't know, ChaosSearch or something, something where data that is from within their corporate walls are going out somewhere or being stored somewhere else, like the security risk around that. I mean, there's something there though, right?Rob: Yeah, there absolutely is. I think it's incumbent on the organizations to understand what's going on and adapt. I think it's also imcu,bent on the cloud service providers to understand those organizational concerns and evolve their product to address them, right?Jeremy: Right.Rob: This is one thing. My classic example of this is data exfiltration in a Lambda function. Some companies get ... I want to be able to inspect every packet that leaves, and they have that hard requirement for reasons, right?Jeremy: Right.Rob: You can't argue with them that they're right or wrong. They made that decision as a company. But then, they have to understand the impact of that is, "Okay. Well, every single Lambda function that you ever create is going to run inside of VPC or is going to run connected to a VPC." Now, you have the added complexity of managing a VPC, managing your firewall rules, your NACLs, your security groups. All of this stuff that ... Maybe you still have to do it. Maybe it really is a requirement. But if you examine your requirements from a business perspective and say, "Okay. There's another way we can address this with tightly-scoped IAM permissions that only allow me to read certain records or from certain tables, or access certain keys, or whatever." Then, we assume all that traffic goes out and that's okay. Then, you get to throw all of that complexity away and get back to delivering value more quickly. So they have to meet together, right? They have to meet.Jeremy: Right.Rob: This led to a lot of the work that AWS did with VPC networking for Lambda functions or removing the cold start because a lot ... Those enterprises weren't ready to let go of that requirement, and AWS can't tell them, "You're wrong." It's their business. It's their requirement. So AWS built that for them to reduce the cold start so that Lambda became a viable platform for them to build against.Jeremy: Right, and so if you're a developer and you're looking at some solution because ... By the way, I mean, like you said, choosing the best of breed. There are just a lot of good services out there. There are thousands and thousands of SaaS companies, and I think ... I don't know if we made this point, but I certainly consider SaaS companies themselves to be part of the cloud. I would think you would probably agree with that, right?Rob: Yeah.Jeremy: It might as well be cloud providers themselves. Most of them run on top of the cloud providers anyways, but they found ...Rob: But they don't have to, and that's interesting to me and another truth that you could be consuming services from somebody else's data center and that's still multicloud.Jeremy: Right, right. So, anyway. So my thought here or I guess the question I have is if you're a developer and you're trying to choose something best in breed, right? Whatever that is. Let's say I'm trying to send text messages, and I just think Twilio is ... It's got all the features that I need. I want to go with Twilio. If you're a developer, what are the things that you need to look for in some of these companies that maybe don't have ... I mean, I would say Twilio does, but like don't necessarily have the trust or the years of experience or I guess years under their belts of providing these services, and keeping data secure, and things like that. What's the advice that you give to developers looking to choose something like that to be aware of?Rob: To developers in particular I think is a different answer ...Jeremy: Well, I mean, yeah. Well, answer it both ways then.Rob: Yeah, because there's the builder and the buyer.Jeremy: Right.Rob: Right?Jeremy: Right.Rob: Whoever the buyer is, and a lot of times, that could just be the software development manager who's the buyer, and they still would approach it different ways. I think the developer is first going to be concerned with, "Does it solve my problem?" Right? "Overall, does it allow me to ship faster?" The next thing has to be stability. You have to expect that this company will be around, which means there is a certain level of evidence that you need of, "Okay. This company has been around and has serviced," and that's a bit of a chicken and an egg problem.Jeremy: Right.Rob: I think the developer is going to be a lot less concerned with that and more concerned with the immediacy of the problem that they're facing. The buyer, whether that's a manager, or CIO, or anywhere in between, is going to need to be concerned with other things according to their size, right? You even get the weird multicloud corner cases of, "Well, we're a major supplier of Walmart, and this tool only runs on a certain cloud service provider that they don't want us to use. So we're not going to use it." Again, that's a business decision, like would I build my software that way? No, but I'm not subject to that constraint. So that means nothing in that equation.Jeremy: Right. So you mentioned a little bit earlier this idea of bringing people in from different organizations like somebody comes in and they can pick up where somebody else left off. One of the things that I've noticed quite a bit in some of the companies that I've worked with is that they like to build custom tools. They build custom tools to solve a job, right? That's great. But as soon as Fred or Sarah leave, right, then all of a sudden, it's like, "Well, who takes over this project?" That's one of the things where I mentioned ... I said "experiments," and I said "low-risk." I think something that's probably more low-risk than building your own thing is choosing an API or a service that solves your problem because then, there's likely someone else who knows that API or that service that can come in, and can replace it, and then can have that seamless transition.And as you said, with all the turnover that's been happening lately, it's probably a good thing if you have some backup, and even if you don't necessarily have that person, if you have a custom system built in-house, there is no one that can support that. But if you have a custom ... or if you have a system you've used, you're interfacing with Twilio, or Stripe, or whatever it is, you can find a lot of developers who could come in even as consultants and continue to maintain and solve your problems for you.Rob: Yeah, and it's not just those external providers. It's the internal tooling as well.Jeremy: Right.Rob: Right?Jeremy: Right.Rob: We're guilty of this in my company. We wrote a lot of stuff. Everybody is, right, like you like to do it?Jeremy: Right.Rob: It's a problem that you recognized. It feels good to solve it. It's a quick win, and it's almost always the wrong answer. But when you get into things like ... a lot of cases it doesn't matter what specific tool you use. 10 years ago, if you had chosen Puppet, or Chef, or Ansible, it wouldn't be as important which one as the fact that you chose one of those so that you could then go out and find someone who knew it. Today, of course, we've got Pulumi, Terraform, and all these other things that you could choose from, and it's just better than writing a bunch of Bash scripts to tile the stuff together. I believe Bash should more or less be banned in the cloud, but that's another ... That's my hot take for another time. Come at me on Twitter if you don't like that one.Jeremy: So, yeah. So I think just bringing up this idea of tooling is important because the other thing that you potentially run into is with the variety of tooling that's out there, and you mentioned the original IAC. I guess they would... Right? We call those like Ansible and those sort of things, right?Rob: Right.Jeremy: All of those things, the Chefs and the Puppets. Those were great because you could have repeatable deployments and things like that. I mean, there's still work to be done there, but that was great because you made the choice to not building something yourself, right?Rob: Right.Jeremy: Something that somebody else could jump in on. But now with Terraform and with ... You mentioned Pulumi. We've got CloudFormation and even Microsoft has their own ... I think it's called ARM or something like that that is infrastructure as code. We've got the Serverless Framework. We've got SAM. We've got Begin. You've got ... or Architect, right? You've got all of these choices, and I think what happens too is that if teams don't necessarily ... If they don't rally around a tool, then you end up with a bunch of people using a bunch of different tools. Maybe not all these tools are going to be compatible, but I've seen really interesting mixes of people using Terraform, and CloudFormation, and SAM, and Serverless Framework, like binding it all together, and I think that just becomes ... I think that becomes a huge mess.Rob: It does, and I get back to my favorite quote about complexity, right? "Simplicity before complexity is worthless. Simplicity beyond complexity is priceless." I find it hard to get to one tool that's like artificial, premature optimization, fake simplicity.Jeremy: Yeah.Rob: If you force yourself into one tool all the time, then you're going to find it doing what it wasn't built to do. A good example of this, you talked about Terraform and the Serverless Framework. My opinion, they weren't great together, but your Terraform comes for your persistent infrastructure, and your Serverless Framework comes in and consumes the outputs of those Terraform stacks, but then builds the constantly churning infrastructure pieces of it. Right? There's a blast radius issue there as well where you can't take down your database, or S3 bucket, or all of this from a bad deploy when all of that is done in Terraform either by your team, or by another team, or by another process. Right? So there's a certain irreducible complexity that we get to, but you don't want to have duplication of effort with multiple tools, right?Jeremy: Right.Rob: You don't want to use CloudFormation to manage your persistent data over here and Terraform to manage your persistent data over here because then you're not ... That's like that agnostic model where you're not benefiting from the excellent features in each. You're only using whatever is common between them.Jeremy: Right, right, and I totally agree with you. I do like the idea of consuming. I mean, I have been using AWS for a very, very long time like 2007, 2008.Rob: Yeah, same. Oh, yeah.Jeremy: Right when EC2 instances were a thing. I guess 2008. But the biggest thing for me looking at using Terraform or something like that, I always felt like keeping it in, keeping it in the family. That's the wrong way to say it, but like using CloudFormation made a lot of sense, right, because I knew that CloudFormation ... or I thought I knew that CloudFormation would always support the services that needed to be built, and that was one of my big complaints about it. It was like you had this delay between ... They would release some service, and you had to either do it through the CLI or through the console. But then, CloudFormation support came months later. The problem that you have with some of that was then again other tools that were generating CloudFormation, like a Serverless Framework, that they would have to wait to get CloudFormation support before they could support it, and that would be another delay or they'd have to build something custom, which is not always the cleanest way to do it.Rob: Right.Jeremy: So anyways, I've always felt like the CloudFormation route was great if you could get to that CloudFormation, but things have happened with CDK. We didn't even mention CDK, but CDK, and Pulumi, and Terraform, and all of these other things, they've all provided these different ways to do things. But the thing that I always thought was funny was, and this is ... and maybe you have some insight into this if you can share it, but with SAM, for example, SAM wasn't extensible, right? You would just run into issues where you're like, "Oh, I can't do that with SAM." Whereas the Serverless Framework had this really great third-party plug-in system that allowed you to do some of these other things. Now, granted not all third-party plug-ins were super stable and were the best way to do something, right, because they'd either interact with APIs directly or whatever, but at least it gave you ... It unblocked you. Whereas I felt like with SAM and even CloudFormation when it didn't support something would block you.Rob: Yeah. Yeah, and those are just two different implementation philosophies from two different companies at two different stages of their existence, right? Like AWS ... and let's separate the reality from the theory here. The theory is that a large company can exert control over release cycles and limit what it delivers, but deliver it with a bar of excellence. A small company can open things up, and it depends on its community members for contributions to solve problems. It's very much like this is the cathedral and the Bazaar of cloud tooling, right?AWS has that CloudFormation architecture that they're working around with its own goals and approach, the one way to do it. Serverless Framework is, "Look, you need to ... You want to set up a stall here and insert IAM policies per function. Set up a stall. It will be great. Maybe people come and maybe they don't," and the system inherently sorts or bubbles up the value, right? So you see things like the Step Functions plug-in for Serverless Framework. It was one of the early ones that became very popular very quickly, whereas Step Functions supporting SAM trailed, but eventually came in. I think that team, by the way, deserves a lot of credit for really being focused on developers, but that's not the point of the difference between the two.A small young company like Serverless Framework that is moving very quickly can't have that cathedral approach to it, and both are valid, right? They're both just different strategies and good for the marketplace, quite frankly. I have my preferred approach, which is not about AWS or SAM vs Serverless Framework. It's the extensibility of plug-in frameworks to me are a key component of tooling that adapts as quickly as the clouds change, and you see this. Like Terraform was the first place that I really learned about plug-ins, and their plug-in framework is fantastic, the way they do providers. Serverless Framework as well is another good example, but you can't know how developers are going to build with your services. You just can't.You do customer development. You talk to them ahead of time. You get all this research. You talk to a thousand customers, and then you release it to 14 million customers, right? You're never going to guess, so let them. Let them build it, and if people ... They put the work in. People find there's value in it. Sometimes you can bring it in. Sometimes you leave it up to the community to maintain, but you just ... You have to be willing to accept that customers are going to use your product in different ways than you envisioned, and that's a good thing because it means customers are using your product.Jeremy: Right, right. Yeah. So I mean, from your perspective though ... because let's talk about SAM for a minute because I was excited when SAM came out. I was thinking to myself. I'm like, "All right. A simplified tooling that is focused on serverless. Right? Like gives me all the things that I think I'm going to need." And then I did ... from a developer experience standpoint, and let's call out the elephant in the room. AWS and developer experience are not always the same. They don't always give you that developer experience that you would want. They give you tons of tools, right, but they don't always give you that ...Rob: Funny enough, you can spell "developer experience" without AWS.Jeremy: Right. So I mean, that's my ... I was disappointed when I started using SAM, and I immediately reverted back to the Serverless Framework. Not because I thought that it wasn't good or that it wasn't well-thought-out. Like you said, there was a level of excellence there, which certainly you cannot diminish, It just didn't do the things I needed it to do, and I'm just curious if that was a consistent feedback that you got as being someone on the dev advocate team there. Was that something that you felt as well?Rob: I need to give two answers to this to be fair, to be honest. That was something that I felt as well. I never got as comfortable with SAM as I am with the Serverless Framework, but there's another side to this coin, and that's that enterprise uptake of SAM CLI has been really strong.Jeremy: Right.Rob: Enterprise it, it does what they need it to do, and it addresses their concerns, and they liked getting tooling from AWS. It just goes back to there being a place for both, right?Jeremy: Right. Rob: Enterprises are much more likely to build cathedrals. They want that top-down, "Okay, everybody. This is how you define something. In fact, we've created a module for you. Consume it here. Thou shalt not write new S3 to web server configuration in your SAM templates. Thou shalt consume this." That's not wrong, and the usage numbers don't lie with SAM. It's got a lot of fans, and it's got a lot of uptake, but that's an entirely different answer from how I feel about it. I think it also goes back to I'm not running an enterprise. I've never run an enterprise. The biggest I've got in terms of responsibility is at best a small company, right? So I think it's natural for me to feel that way when I try to use a tool that has such popularity amongst enterprise. Now, of course, you have the switch, right? You have enterprises using Serverless Framework, and you have small builders using SAM. But in general, I think the success there was with the enterprise, and it's a validation of their strategy.Jeremy: Right, right. So let's talk about enterprises for a second because this is where we look at tools like the CDK and SAM, Serverless Framework, things like that. You look at all those different tools, and like you said, there's adoption across some of those. But at the end of the day, most of those tools are compiling down to a CloudFormation or they're compiling down to ... What's it called? The Azure Resource Manager Language or whatever the heck it is, right?Rob: ARM templates.Jeremy: ARM templates. What's the value now in CloudFormation and those sort of things that the final product that you get to ... I mean, certainly, it's so much easier to build those using these frameworks, but do we need CloudFormation in those things anymore? Do we need to know those? Does an individual developer need to be able to understand those, or can they just basically take a step back and say, "Look, CDK does it for me," or, "Pulumi does it for me. So why do I need to know what's baked into those templates?"Rob: Yeah. So let's set Terraform aside and talk about it after because it's different. I think the choice of JSON and YAML as implementation languages for most of this tooling, most of this tooling is very ... It was a very effective choice because you don't necessarily have to know CloudFormation to look at a template and define what it's doing.Jeremy: Right.Rob: Right? You don't have to understand transforms. You don't have to understand parameter replacement and all of this stuff to look at the final transformed template in CloudFormation in the console and get a very quick reasoning about what's happening. That's good. Do I think there's value in learning to create multi-thousand-line CloudFormation templates by hand? I don't. It's the assembly language of the cloud, right?Jeremy: Right.Rob: It's there when you need it, and just like with procedural languages, you might want to look underneath at the instructions, how it unrolled certain loops, how it decided not to unroll others so that you can make changes at the next level. But I think that's rare, and that's optimization. In terms of getting things done and getting things shipped and delivered, to start, I wouldn't start with plain CloudFormation for any of these, especially of anything for any meaningful production size. That's not a criticism of CloudFormation. It's just like you said. It's all these other tooling is there to help us generate what we want consistently.The other benefit of it is once you have that underlying lingua franca of the cloud, you can build visualization, and debugging, and monitoring, and like ... I mean, all of these other evaluatory forensic. "Evaluatory?" Is that a word? It's a word now. You heard it here first on this podcast. Like forensic, cloud forensic type tooling that lets you see what's going on because it is a universal language among all of the tools.Jeremy: Yeah, and I want to get back to Terraform because I know you mentioned that, but I also want to be clear. I don't suggest you write CloudFormation. I think it is horribly verbose, but probably needs to be, right?Rob: Yeah.Jeremy: It probably needs to have that level of fidelity there or that just has all that descriptive information. Yeah. I would not suggest. I'm with you, don't suggest that people choose that as their way to go. I'm just wondering if it's one of those things where we don't need to be able to look at ones and zeroes anymore, and understand what they do, right?Rob: Right.Jeremy: We've got higher-level constructs that do that for us. I wouldn't quite put ... I get the assembly language comparison. I think that's a good comparison, but it's just that if you are an enterprise, right, is that ... Do you trust? Do you trust something like CDK to do everything you need it to do and make sure that it's covering all its bases because, again, you're writing layers of abstraction on top of a layer of abstraction that then abstracts it even more. So yeah, I'm just wondering. You had mentioned forensic tools. I think there's value there in being able to understand exactly what gets put into the cloud.Rob: Yeah, and I'd agree with that. It takes 15 seconds to run into the limit of my knowledge on CDK, by the way. But that knowledge includes the fact that CDK Synth is there, which generates the CloudFormation template for you, and that's actually the thing, which is uploaded to the CloudFormation service and executed. You'd have to bring in somebody like Richard Boyd or someone to talk about the guard rails that are there around it. I know they exist. I don't know what they are. It's wildly popular, and adoption is through the roof. So again, whether I philosophically think it's a good idea or not is irrelevant. Developers want it and want to build with it, right?Jeremy: Yeah.Rob: It's a Bazaar-type tool where they give you some basic constructs, and you can write your own constructs around it and get whatever you need. But ultimately, that comes back to CloudFormation, which is then subject to all the controls that your organization puts around CloudFormation, so it is ... There's value there. It can't be denied.Jeremy: Right. No, and the thing that I like about the CDK is the idea of being able to create those constructs because I think, especially from a ... What's the right word? Compliance standpoint or something like that, that you can write in these constructs that you say, "You need to use these constructs when you deploy a microservice or you deploy this," or whatever it is, and then you have those guard rails as you mentioned or whatever, but those ... All of those checkboxes are ticked because you can put that all into one construct. So I totally think that's great. All right. So let's talk about Terraform.Rob: Yeah, so there's ... First, it's a completely different model, right?Jeremy: Right.Rob: This is an interesting discussion to have because it's API calls. You write your provider, whatever your infrastructure is, and anything that can be an API call can now be a Terraform declarative resource. So it's that mapping between declarative and imperative that I find fascinating. Well, also building the dependency graph for you. So it handles all of those aspects of it, which is a really powerful tool. The thing that they did so well ... Terraform is equally verbose as CloudFormation. You've got to configure all the same options. You get the same defaults, et cetera. It can be terribly verbose, but it's modular. Every Terraform file that you have in one directory is concatenated, and that is a huge distinction between how CloudFormation wants everything in one template, or well, you can refer to something in an S3 bucket, but that's not actually useful to me as a developer.Jeremy: Right, right.Rob: I can't mount an S3 bucket as a drive on my workstation and compose all of these independent files at once and do them that way. Sidebar here. Maybe I can. Maybe it supports that, and I haven't been able to discover it, right? Whereas Terraform by default, out of the box put everything in its own file according to function. It's very easy to look in your databases.tf and understand what's in there, look in your VPC.tf and understand what's in there, and not have to go through thousands of lines of code at once. Yes, we have find and replace. Yes, we have search, and you ... Anybody who's ever built any of this stuff knows that's not the same thing. It's not the same thing as being able to open a hundred lines in your text editor, and look at everything all at once, and gain an understanding of it, and then dive into the next level of detail a hundred lines at a time and understand that.Jeremy: Right. But now, just a question here, without... because the API thing, I love that idea, and actually, Serverless Components used an API thing to do it and bypass CloudFormation. Actually, I believe Architect originally was using APIs, and then switched to CloudFormation, but the question I have about that is, if you don't have a CloudFormation template, if you don't have that assembly language of the web, and that's not sitting in your CloudFormation tool built into the dashboard, you don't get the drift protection, right, or the detection, and you don't get ... You don't have that resource map necessarily up there, right?Rob: First, I don't think CloudFormation is the assembly language of the web. I think it's the assembly language of AWS.Jeremy: I'm sorry. Right. Yes. Yeah. Yeah.Rob: That leads into my point here, which is, "Okay. AWS gives you the CloudFormation dashboard, but what if you're now consuming things from Datadog, or from Fauna, or from other places that don't map this the same way?Jeremy: Right.Rob: Terraform actually does manage that. You can do a plan against your existing file, and it will go out and check the actual existing state of all of your resources, and compare them to what you've asked for declaratively, and show you what the changeset will be. If it's zero, there's no drift. If there is something, then there's either drift or you've added new functionality. Now, with Terraform Cloud, which I've only used at a basic level, I'm not sure how automatic that is or whether it provides that for you. If you're from HashiCorp and listening to this, I would love to learn more. Get in touch with me. Please tell me. But the tooling is there to do that, but it's there to do it across anything that can be treated as an API that has ... really just create and retrieve. You don't even necessarily need the update and delete functionality there.Jeremy: Right, right. Yeah, and I certainly ... I am a fan of Terraform and all of these services that make it easier for you to build clouds more easily, but let's talk about APIs because you mentioned anything with an API. Well, everything has APIs now, right? I mean, essentially, we are living in a ... I mentioned SaaS, but now we're sort of ... This whole idea of the API economy. So just overall, what are your thoughts on this idea of basically anything you want to do is available now via an API?Rob: It's not anything you want to do. It's everything you want to do short of fulfilling your customer's needs, short of solving your customer's specific problem is available as an API. That means that you get to choose best-in-class for everything, right? Your customer's need isn't, "I want to spend $25 on my credit card." Your customer's need is, "I need a book."Jeremy: Right.Rob: So it's not, "I want to store information about books in a database." It's, "I need a book." So everything and every step of the way there can now be consumed from an API. Again, it's like serverless in general, right? It allows you to focus purely or as close to purely as we can right now on generating customer value and addressing customer problems so that you can ship faster, and so that you have it as a competitive advantage. I can write a payment processing program. I know I can because I've done it back in 2004, and it was horrible, and it was awful. It wasn't a very good one, and it worked. It took your money, but this was like pre-PCIDSS.If I had to comply with all of those things, why would I do that? I'm not a credit card payment processor. Stripe is, and they have specialists in all of the areas related to the problem of, "I need to take and process payments." That's the customer problem that they're solving. The specialization of labor that comes along with the API economy is fantastic. Ops never went away. All the ops people work at the cloud service providers now.Jeremy: Right.Rob: Right? Audit never went away. All the auditors have disappeared from view and gone into internal roles in payments companies. All of this continues to happen where the specialists are taking their deep, deep knowledge and bringing it inside companies that specialize in that domain.Jeremy: Right, and I think the domain expertise value that you get from whatever it is, whether it's running a database company or whether it's running a payment company, the number of people that you would need to hire to have a level of specialization for what you're paying for two cents per transaction or whatever, $50 a month for some service, you couldn't even begin. The total cost of ownership on those things are ... It's not even a conversation you would want to have, but I also built a payment processing system, and I did have to pass PCI, which we did pass, but it was ...Rob: Oh, good for you.Jeremy: Let's put it this way. It was for a customer, and we lost money on that customer because we had to go through PCI compliance, but it was good. It was a good experience to have, and it's a good experience to have because now I know I never want to do it again.Rob: Yeah, yeah. Back to my earlier point on Ops and serverless.Jeremy: Right. Exactly.Rob: These things are hard, right?Jeremy: Right, right.Rob: Sorry.Jeremy: No, no. Go ahead.Rob: Not to interrupt, but these are all really hard problems that people with graduate degrees and post-graduate research who have ... They're 30 years old when they start working on the problem are solving. There's a supply question there as well, right? There's just not enough people, and so you and I can like ... Well, I'm not going to project this on to you. I can stumble through an implementation and check off the requirements just like I worked in an optical microscopy lab in college, and I could create computer programs that modeled those concepts, but I was not an optical microscopist. I was not a PhD-level generating understandings of these things, and all of these, they're just so hard. Why would you do that when customer problems are equally hard and this set is solved?Jeremy: Right, right.Rob: This set of problems over here is solved, and you can't differentiate yourself by solving it better, and you're not likely to solve it better either. But even if you did, it wouldn't matter. This set of problems are completely unsolved. Why not just assemble the pieces from best-in-class so that you can attack those problems yourself?Jeremy: Again, I think that makes a ton of sense. So speaking about expertise, let's talk about what you might have to pay say a database administrator. If you were to hire a database administrator to maintain all the databases for you and keep all that uptime, and maybe you have to hire six database administrators in order for them to ... Well, I'm thinking multi-region and all that kind of stuff. Maybe you have to hire a hundred, depending on what it is. I mean, I'm getting a little ahead of myself, but ... So if I can buy a service like Fauna, so tell me a little bit more about just how that works.Rob: Right. Well, I mean, six database engineers in the US, you're over a million dollars a year easily, right?Jeremy: Right.Rob: I don't know what the exact number is, but when you consider benefits, and total cost, and all of that, it's a million dollars a year for six database engineers. Then, there are some very difficult problems in especially distributed databases and database scaling that Fauna solves. A number of other products or services solve some of them. I'm biased, of course, but I happen to think Fauna solves all of them in a way that no other product does, but you're looking ... You mentioned distributed transactions. Fauna is built atop the Calvin paper, which came out of Yale. It's a very brief, but dense academic research paper. It's a PHC research paper, and it talks about a model for distributed transactions and databases. It's a layer, a serialization layer, that sits atop your database.So let's say you wanted to replicate something like Fauna. So not only do you need to get six database engineers who understand the underlying database, but you need to find engineers that understand this paper, understand the limitations of the theory in the paper, and how to overcome them in operations. In reality, what happens when you actually start running regions around the world, replicating transactions between those regions? Quite frankly, there's a level of sophistication there that most of the set of people who satisfy that criteria already work at Fauna. So there's not much of a supply there. Now, there are other database competitors that solve this problem in different ways, and most of the specialists already work at those companies as well, right? I'm not saying that they aren't equally competent database engineers. I'm just saying there's not a lot of them.Jeremy: Right.Rob: So if you're thing is to sell books at a certain scale like Amazon, then that makes sense to have them because you are a database creator as well. But if your thing is to sell books at some level below that, why would you compete for that talent rather than just consuming it?Jeremy: Right. Yeah, and I would say unless you're a horse with a horn on your head, it's probably not worth maintaining your own database and things like that. So let's talk a little bit more, though, about that. I guess just this idea of maybe a shortage of people, like if you're ... You're right. There's a limited number of resources, right? I'm sure there's brilliant database engineers all around the world, and they have the experience where, right, they could come in and they could probably really help you maintain your database. Even if you could afford six of them and you wanted to do that, I think the problem is it's got to be the interestingness of the problem. I don't think "interestingness" is a word either, but like if I'm a database engineer, wouldn't I want to be working on something like Fauna that I could help millions and millions of people as opposed to helping some trucking company maintain their internal database or something like that?Rob: Yeah, and I would hope so. I hope it's okay that I mention we're hiring. So come to Fauna.com and look at our roles database engineers.Jeremy: You just read that Calvin paper first. Go ahead.Rob: But read the Calvin paper first. I think it's only like 12 pages, and even just the first page is enough. I'm happy to talk about that at any length because I find it fascinating and it's public. It is an interesting problem and the ... It's the reification or the implementation of theory. It's bringing that theory to the real world and ... Okay. First off, the theory is brilliant. This is not to take away from it, but the theory is conceived inside someone's mind. They do some tests, they prove it, and there's a world of difference between that point, which is foundational, and deploying it to production where people are trusting their workloads on it around the world. You're actually replicating across multiple cloud providers, and you're actually replicating across multiple regions, and you're actually delivering on the promise of the paper.What's described in the paper is not what we run at Fauna other than as a kernel, as a nugget, right, as the starting point or the first principle. That I think is wildly interesting for that class of talent like you talked about, the really world-class engineers who want to do something that can't be done anywhere else. I think one thing that Fauna did smartly early was be a remote-first company, which means that they can take advantage of those world-class engineers and their thirst for innovation regardless of wherever Fauna finds them. So that's a big deal, right? Would you rather work on a world-class or global problem or would you rather work on a local problem? Now, look, we need people working on local problems too. It's not to disparage that, but if this is your wheelhouse, if innovation is the thing that you want to do, if you want to be doing things with databases that nobody else is doing, this is where you want to be. So I think there's a strong argument there for coming to work in a place like Fauna.Jeremy: Yeah, and I want to make sure I apologize to any database engineer working at a trucking company because I'm sure there are actually probably really interesting problems with logistics and things like that that they are solving. So maybe not the best example. Maybe. I don't know. I can't think of another example. I don't want to offend anybody who's chosen a more local problem because you're right. I mean, there are local problems that need to be solved, but I do think that there are people ... I mean, even someone like me. I want to work on a bigger problem. You know what I mean? I owned a web development company for 12 years, and I was solving other people's problems by building them a website or whatever, and it just got to a point where I'm like, "I'm not making enough of an impact here." You're not solving a big enough problem. You want to work on something more interesting.Rob: Yeah. Humans crave challenge, right? Challenge is a necessary precondition for growth, and at least most of us, we want to grow. We want to be better at whatever it is we're doing or just however we think of ourselves next year that we aren't today, and you can't do that with challenge. If you build other people's websites for 12 years, eventually, you get to a point where maybe you're too good at it. Maybe that's great from a business perspective, but it's not so great from a personal fulfillment perspective.Jeremy: Right.Rob: If it's, "Oh, look, another brochure website. Okay. Here you go. Oh, you need a contact form?" Again, it's not to disparage this. It's the fact that if you do anything for 12 years, sometimes mastery is stasis. Not always.Jeremy: Right, and I have nightmares of contact forms, of building contact forms, by the way, but...Rob: It makes sense. Yeah. You know what you should do is just put all of those directly into Fauna and don't worry about it.Jeremy: Easy enough. Easy enough.Rob: Yeah, but it's not necessarily stasis, but I think about craftsmen and people who actually make things with their hands, physical builders, and I think a lot of that ... Like if you're making furniture, you're a cabinet maker. I think a lot of that is every time, it's just a little bit wrong, right? Not wrong, but just a little bit off from your optimum no matter how long you do it, and so everything has a chance to evolve. That's there with software to a certain extent, but the problem is never changing.Jeremy: Right.Rob: So, yeah. I can see both sides of it, but for me, I ... You can see it when I was on serverless four years ago and now that I'm on a serverless database now. I like to be out at the edge, pushing that edge out for everyone who's coming behind. It can be challenging because sometimes there's just no way forward, and sometimes everybody is not ready to come with you. In a lot of ways, being early is the same as being wrong.Jeremy: Right. Well, I've been ...Rob: Not an original statement, but ...Jeremy: No, but I've been early on many things as well where like five years after we tried to do something, like then, all of a sudden, it was like this magical thing where everybody is doing it, but you mentioned the edge. That would be something ... or you said on the edge. I know you mean this way, but the edge in terms of the actual edge. That's going to be an interesting data problem to solve.Rob: Oh, that's a fascinating data problem, especially for us at Fauna. Yeah, compute, and Andy Jassy, when he was at AWS, talked about how compute was bifurcating, right? It's either moving all the way out to the edge or it's moving all the way into the cloud, and that's true. But I think at Fauna, we take that a step further, right? The edge part is true, and a lot of the work that we've done recently, announcements with CloudFlare workers. We're ready for that. We believe in it, and we like pushing that out as close to the user as possible. One thing that we do uniquely is we have this concept of user-defined functions, and anybody who's written T-SQL back in the day, who wrote store procedures is going to be familiar with this, but it's ... You bring that business logic and that code to your data. Not near your data, to your data.Jeremy: Right.Rob: So you bring the compute not just to the cloud where it still needs to pass through top-of-rack and all of this. You bring it literally on to the same instance as your data where these functions execute against them there. So you get not just the database, but you get a compute layer in there, and this helps for things like filtering for things like the equivalent of joins, stuff that just ... If you've got to load gigabytes of data and move it somewhere, compute against it, reduce it to something, and then store that back, the speed of light still matters. Even if it's the speed of light across a couple switches, it still matters, and so there are some really interesting things that you can begin to do as you pull more and more of that logic into your data layer, and you also protect that logic from other components of your application.So I like that because things like GraphQL that endpoints already speak and already understand, just send it over, and again, they don't care about the architectural, quite frankly, genius--I can say that because I didn't create it--the genius behind all of this stuff. They just care that, "Look, I send this request over and I get it back," and entire workflows, and complex processes, and everything are executing behind the scenes just so that the endpoints can send and retrieve what they need more effectively and more quickly. The edge is fascinating. The thing I regret the most about the edge is I have no hardware skills, right? So I can't make fun things to do fun things in my house. I have to buy them, but you can't do everything.Jeremy: Yeah. Well, no. I think you make a good point though about bringing the compute to the data side, and other people have said there's no ... Ben Kehoe has been talking about this for a while too where like it just makes sense. Run the compute where the data is, and then send that data somewhere else, right, because there's more things that can be done with data after that initial bit of compute. But certainly, like you said, filtering it down or getting the bits that are relevant and moving a small amount of data as opposed to a large amount of data I think is hugely important.Now, the other thing I just want to mention before I let you go or I want to talk about quickly is this idea of going back to the API economy aspect of things and buying versus building. If you think about what you've had to do at Fauna, and I know you're relatively recent there, but you know what they've done and the work that had to go in in order to build this distributed system. I mean, I think about most systems now, and I think like anything I'm going to build now, I got to think about scale, right?I don't necessarily have to build to scale right away, especially if I'm doing an MVP or something like that. But if I was going to build a service that did something, I need to think about multi-region, and I need to think about failover, and I need to think about potentially providing it at the edge, and all these other things. So you come down to this thing, and I will just use the database example. But even if you were say using like MySQL, or Postgres, or something like that, that's going to scale. That's going to scale pretty well to get to a certain point, and then you're going to have to start sharding, right? When data gets hard, it's time to shard, right? You just have to start sharding everything.Rob: Yeah.Jeremy: Essentially, what you end up doing is rebuilding DynamoDB, or trying to rebuild Fauna, or something like that. So just thinking about that, anything you're building ... Maybe you have some advice for developers who ... I know we've talked about this a little bit, but I just go back to this idea of like, if you think about how complex some of these SaaS companies and these services that are being built out right now, why would you ever want to take that complexity on yourself?Rob: Pride. Hubris. I mean, the correct answer is you wouldn't. You shouldn't.Jeremy: People do.Rob: Yeah, they do. I would beg and plead with them like, "Look, we did take a lot of that on. Fauna scales. You don't need to plan for sharding. You don't need to plan for global replication. All of these things are happening." I raise that as an example of understanding the customer's problem. The customer didn't want to think about, "Okay, past a thousand TPS, I need to create a new read replica. Past a million TPS, I need to have another region with active-active." The customer wanted to store some data and get that data, knowing that they had the ASA guarantees around it, right, and that's what the customer has.So get that good understanding of what your customer really wants. If you can buy that, then you don't have a product yet. This is even out of software development and into product ideation at startups, right? If you can go ... Your customer's problem isn't they can't send text messages programmatically. They can do that through Twilio. They can do that through Amazon. They can do that through a number of different services, right? Your customer's problem is something else. So really get a good understanding of it. This is where knowing a little ... Like Joe Emison loves to rage against senior developers for knowing not quite enough. This is where knowing like, "Oh, yeah, Postgres. You can just shard it." "Just," the worst word in computer science, right?Jeremy: Right.Rob: "You can just shard it." Okay. Now, we're back to those database engineers that you talked about, and your customer doesn't want to shard a database. Your customer wants to store and retrieve data.Jeremy: Right.Rob: So any time that you can really focus in, and I guess I really got this one, this customer obsession beaten into me from my time at AWS. Really focus in on what the customer is asking you to do or what the customer needs even if they don't know how to express it, and build for that.Jeremy: Right, right. Like the saying. I forgot who said it. Somebody from Harvard Business Review, but, "Your customers don't want a quarter-inch drill. They want a quarter-inch hole."Rob: Right, right.Jeremy: That's basically true. I mean, the complexity that goes behind the scenes are something that I think a vast majority of customers don't necessarily want, and you're right. If you focus on that product ideation thing, I think that's a big piece of that. All right. Well, anyway. So I have one more question for you just to ...Rob: Please.Jeremy: We've been talking for a while here. Hopefully, we haven't been boring people to death with our talk about APIs and stuff like that, but I would like to get a little bit academic here and go into that Calvin paper just a tiny bit because I think most people probably will not want to read it. Not because they don't want to, but because people are busy, right, and so they're listening to the podcast.Rob: Yeah.Jeremy: Just give us a quick summary of this because I think this is actually really fascinating and has benefits beyond just I think solving data problems.Rob: Yeah. So what I would say first. I actually have this paper on my desk at all times. I would say read Section 1. It's one page, front and back. So if you're interested in it, you don't have to read the whole paper. Read that, and then listeners to this podcast will probably understand when I say this. Previously, for distributed databases and distributed transactions, you had what was called a two-phase commit. The first was you'd go out to all of your replicas, and you say, "Hey, I need lock." When everybody comes back, and acknowledges, and says, "Okay. You have the lock," then you do your transaction, and then you replicate it, and then you say, "Hey, everybody. I'm done. Release the lock." So it's a two-phase commit. If something went wrong, you rolled it all the way back and you said, "Hey, everybody. Forget it."Calvin is event-sourcing for databases. So if I could distill the entire paper down into one concept, that's it. Right? Instead of saying, "Hey, everybody. Give me a lock, I'm going to do something," you say, "Hey, everybody. Here's what we're going to do." It's a deterministic application of the transaction so that you can ... You both create the lock and execute the transaction at the same time. So rather than having this outbound round trip, and then doing the thing in an outbound round trip, you have the outbound round trip, and you're done.They all apply it independently, and then it gets into how you structure the guarantees around that, which again is very similar to event-sourcing in that you use snapshotting or checkpointing. "So, hey. At this point, we all agree. So we can forget all of our old transactions, and we roll forward from here." If somebody leaves the cluster, they come back in at that checkpoint. They reapply all of the events that occurred prior to that. Those events are transactions. They'd get up to working speed, and then off they go. The paper I think uses Paxos. That's an implementation detail, but the really interesting thing about it is you're not having a double round trip.Jeremy: Yeah.Rob: Again, I love the idea of event-sourcing. I think Amazon EventBridge is the best service that they've released in the past couple years.Jeremy: Totally.Rob: If you understand all of that and are already building serverless applications that way, well, that's what we're doing, just event-sourcing for database. That's it. That's it.Jeremy: Just event-sourcing. It's easy. Simple. All right. All the words you never want to hear. Simple, easy, just. Right. Yeah. Perfect.Rob: Yeah, but we do the hard work, so you don't have to. You don't care about all of that. You want to write your data somewhere, and you want to retrieve your data from an API, and that's what Fauna gives you.Jeremy: Which I think is the main point here. So awesome. All right. Well, Rob, listen. This was great, and I'm super happy that I finally got you on the show. Congratulations for the new role at Fauna and on what's happening over there because it is pretty exciting.Rob: Thank you.Jeremy: I love companies that are innovating, right? It's not just another hosted database. You're actually building something here that is innovative, which is pretty amazing. So if people want to find out more about you, follow you on Twitter, or find out more about Fauna, how do they do that?Rob: Right. Twitter, rts_rob. Probably the easiest way, go to my website, robsutter.com, and you will link to me from there. From there, of course, you'll get to fauna.com and all of our resources there. Always open to answer questions on Twitter. Yeah. Oh, rob@fauna.com. If you're old-school like me and you prefer the email, there you go.Jeremy: All right. Awesome. Well, I will get all that into the show notes. Thanks again, Rob.Rob: Thank you, Jeremy. Thanks for having me.

amazon personal phd government simple microsoft army chefs run humans walmart mvp cloud dubai congratulations yale i am saas developers architects cto simplicity compliance arm salesforce audit api cio puppets harvard business review bash aws faa apis consume stripe bazaar azure ops amazon web services fauna s3 big four node hubris cloudflare twilio sutter tps lambda pci baas sidebar json serverless iac mysql terraform cli graphql datadog multicloud andy jassy hashicorp postgres ansible cdk paxos ec2 yaml pci dss cloud strategy vpc dynamodb phc cloudformation rob it richard boyd serverless framework rob you t sql jeremy you jeremy it cbt nuggets ben kehoe jeremy so rob so

EP17. False Cognates. Dinero = Plata = Bread = Diner = Dinner; College = Colegio; Avocado = Abogado; Advice = Aviso

Play Episode Listen Later Apr 24, 2021 13:55

Falsos amigos al ataque nuevamente. Escucha a Jeremy y Joffre descifrar las diferencias entre estos falsos amigos. FREE FULL TRANSCRIPT CLICK HERE Introducción: Bienvenido a Amigos Learning Languages, este es un podcast hecho por amigos para amigos que aprenden lenguajes. Podrás escuchar a nativos hablando de su cultura, experiencias y consejos. In addition, you would be able to listen to people who are already on this path of learning and how they manage to get where they are. Enjoy this journey with us! Joffre: Okay. Hello, my friend Jeremy! It's a pleasure to talk with you again. Jeremy: You too! Hi Joffre, how is it going? Joffre: It's going well, thank you. I have a great topic to talk about with you and it is false friends, ¿has escuchado acerca de esto? Jeremy: This phrase and expression: false friends, is this a very common phrase in Spanish? Joffre: Yes! If you just Google it: false friends Spanish and English you will have a big list of different words. Jeremy: Because you know, false friends in English, you know, it means something totally different. Joffre: Yes! In Spanish is the same, you could have a false friend too, like, literally you could have a false friend. Jeremy: Okay, okay. Joffre: Someone who. Jeremy: Is fake. Joffre: Exactly, exactly, fake, a fake friend. Jeremy. Okay. So, when you first told me I was like: okay, we're going to talk about fake friends! I know a lot of fake people; you know what I mean? I got people at work that I work with that are fake, I have family members that are fake, what's up? Which one you want to talk about? Let's go, Joffre! I got a whole of bunch to say. I didn't know until when you called me today and you said: Today we're going to talk about the comparison of different words and I'm like: ohhh, he's talking about false friends between words, okay, okay. Joffre: Exactly, exactly., exactly. False friends for people that are learning English or Spanish. Jeremy; Okay, yeah. Joffre: So, for me and for you those words are false friends. Are you ready? Jeremy: I think so, I hope so. Joffre: Okay! La primera pareja es: dinero y dinner, dinero y dinner y diner son muy parecidas, en español dinero y en inglés tenemos diner y dinner. Could you help us with an explanation of those, please? Diner and dinner. Jeremy: So, diner? You want me to explain it in English? Joffre: Yes, please! Jeremy: Okay, so, diner is what we call a small restaurant where is usually a restaurant that sometimes you see it in small towns, very small towns and it's not a franchise, it's not a chain of restaurants like, I don't know, Red Lobster or Friday's or anything like that, but it is usually just one restaurant own by a small, you know, family or something like that, so that's what a diner is, it's just a small restaurant you can stop and grab something to eat really quick, something like that. Dinner is when you have dinner with your family in the evening whether is, I don't know, 5 o'clock or 6 o'clock you're having dinner or you go out to dinner with a friend or you wife or whatever, you can go out to a restaurant and have dinner too in a restaurant, so, that's the difference. Joffre: Okay, so we have two different meanings. The first one is like a un restaurant pequeño como que está cerca de la vía o un restaurant de paso, algo pequeño. Jeremy: Yeah, uh-hum. Joffre: Y el siguiente que es dinner es la merienda o la cena, que es la hora a la cual nosotros comemos.

english google college advice spanish bread dinner false podr dinero diner plata avocado falsos colegio aviso abogado red lobster in spanish joffre jeremy you jeremy so

Episode #97: How Serverless Fits in to the Cyclical Nature of the Industry with Gojko Adzic

Play Episode Listen Later Apr 19, 2021 63:19

About Gojko AdzicGojko Adzic is a partner at Neuri Consulting LLP. He one of the 2019 AWS Serverless Heroes, the winner of the 2016 European Software Testing Outstanding Achievement Award, and the 2011 Most Influential Agile Testing Professional Award. Gojko’s book Specification by Example won the Jolt Award for the best book of 2012, and his blog won the UK Agile Award for the best online publication in 2010.Gojko is a frequent speaker at software development conferences and one of the authors of MindMup and Narakeet.As a consultant, Gojko has helped companies around the world improve their software delivery, from some of the largest financial institutions to small innovative startups. Gojko specializes in agile and lean quality improvement, in particular impact mapping, agile testing, specification by example, and behavior driven development.Twitter: @gojkoadzicNarakeet: https://www.narakeet.comPersonal website: https://gojko.netWatch this video on YouTube: https://youtu.be/kCDDli7uzn8This episode is sponsored by CBT Nuggets: https://www.cbtnuggets.com/TranscriptJeremy: Hi everyone, I'm Jeremy Daly and this is Serverless Chats. Today my guest is Gojko Adzic. Hey Gojko, thanks for joining me.Gojko: Hey, thanks for inviting me.Jeremy: You are a partner at Neuri Consulting, you're an AWS Serverless Hero, you've written I think, what? I think 6,842 books or something like that about technology and serverless and all that kind of stuff. I'd love it if you could tell listeners a little bit about your background and what you've been working on lately.Gojko: I'm a developer. I started developing software when I was six and a half. My dad bought a Commodore 64 and I think my mom would have kicked him out of the house if he told her that he bought it for himself, so it was officially for me.Jeremy: Nice.Gojko: And I was the only kid in the neighborhood that had a computer, but didn't have any ways of loading games on it because he didn't buy it for games. I stayed up and copied and pasted PEEKs and POKEs in a book I couldn't even understand until I made the computer make weird sounds and print rubbish on the screen. And that's my background. Basically, ever since, I only wanted to build software really. I didn't have any other hobbies or anything like that. Currently, I'm building a product for helping tech people who are not video editing professionals create videos very easily. Previously, I've done a lot of work around consulting. I've built a lot of product that is used by millions of school children worldwide collaborate and brainstorm through mind-mapping. And since 2016, most of my development work has been on Lambda and on team stuff.Jeremy: That's awesome. I joke a little bit about the number of books that you wrote, but the ones that you have, one of them's called Running Serverless. I think that was maybe two years ago. That is an excellent book for people getting started with serverless. And then, one of my probably favorite books is Humans Vs Computers. I just love that collection of tales of all these things where humans just build really bad interfaces into software and just things go terribly.Gojko: Thank you very much. I enjoyed writing that book a lot. One of my passions is finding edge cases. I think people with a slight OCD like to find edge cases and in order to be a good developer, I think somebody really needs to have that kind of intent, and really look for edge cases everywhere. And I think collecting these things was my idea to help people first of all think about building better software, and to realize that stuff we might glance over like, nobody's ever going to do this, actually might cause hundreds of millions of dollars of damage ten years later. And thanks very much for liking the book.Jeremy: If people haven't read that book, I don't know, when did that come out? Maybe 2016? 2015?Gojko: Yeah, five or six years ago, I think.Jeremy: Yeah. It's still completely relevant now though and there's just so many great examples in there, and I don't want to spent the whole time talking about that book, but if you haven't read it, go check it out because it's these crazy things like police officers entering in no plates whenever they're giving parking tickets. And then, when somebody actually gets that, ends up with thousands of parking tickets, and it's just crazy stuff like that. Or, not using the middle initial or something like that for the name, or the birthdate or whatever it was, and people constantly getting just ... It's a fascinating book. Definitely check that out.But speaking of edge cases and just all this experience that you have just dealing with this idea of, I guess finding the problems with software. Or maybe even better, I guess a good way to put it is finding the limitations that we build into software mostly unknowingly. We do this unknowingly. And you and I were having a conversation the other day and we were talking about way, way back in the 1970s. I was born in the late '70s. I'm old but hopefully not that old. But way back then, time-sharing was a thing where we would basically have just a few large computers and we would have to borrow time against them. And there's a parallel there to what we were doing back then and I think what we're doing now with cloud computing. What are your thoughts on that?Gojko: Yeah, I think absolutely. We are I think going in a slightly cyclic way here. Maybe not cyclic, maybe spirals. We came to the same horizontal position but vertically, we're slightly better than we were. Again, I didn't start working then. I'm like you, I was born in late '70s. I wasn't there when people were doing punch cards and massive mainframes and time-sharing. My first experience came from home PC computers and later PCs. The whole serverless thing, people were disparaging about that when the marketing buzzword came around. I don't remember exactly when serverless became serverless because we were talking about microservices and Lambda was a way to run microservices and execute code on demand. And all of a sudden, I think the JAWS people realized that JAWS is a horrible marketing name, and decided to rename it to serverless. I think it most important, and it was probably 2017 or something like that. 2000 ...Jeremy: Something like that, yeah.Gojko: Something like that. And then, because it is a horrible marketing name, but it's catchy, it caught on and then people were complaining how it's not serverless, it's just somebody else's servers. And I think there's some truth to that, but actually, it's not even somebody else's servers. It really is somebody else's mainframe in a sense. You know in the '70s and early '80s, before the PC revolution, if you wanted to be a small software house or a small product operator, you probably were not running your own data center. What you would do is you would rent it based on paying for time to one of these massive, massive, massive operators. And in fact, we ended up with AWS being a massive data center. As far as you and I are concerned, it's just a blob. It's not a collection of computers, it's a data center we learn something from and Google is another one and then Microsoft is another one.And I remember reading a book about Andy Grove who was the CEO of Intel where they were thinking about the market for PC computers in the late '70s when somebody came to them with the idea that they could repurpose what became a 8080 processor. They were doing this I think for some Japanese calculator and then somebody said, "We can attach a screen to this and make this a universal computer and sell it." And they realized maybe there's a market for four or five computers in the world like that. And I think that that's ... You know, we ended up with four or five computers, it's just the definition of a computer changed.Jeremy: Right. I think that's a good point because you think about after the PC revolution, once the web started becoming really big, people started building data centers and collocation facilities like crazy. This is way before the cloud, and everybody was buying racks and Dell was getting really popular because people buying servers from Dell, and installing these in their data centers and doing this. And it just became this massive, whole industry built around doing that. And then you have these few companies that say, "Well, what if we just handled all that stuff for you? Rather than just racking stuff for you," but started just managing the software, and started managing the networking, and the backups, and all this stuff for you? And that's where the cloud was born.But I think you make a really good point where the cloud, whatever it is, Amazon or Google or whatever, you might as well just assume that that's just one big piece of processing that you're renting and you're renting some piece of that. And maybe we have. Maybe we've moved back to this idea where ... Even though everybody's got a massive computer in their pocket now, tons of compute power, in terms of the real business work that's being done, and the real global value, and the things that are powering global commerce and everything else like that, those are starting to move back to run in four, five, massive computers.Gojko: Again, there's a cyclic nature to all of this. I remember reading about the advent of power networks. Because before people had electric power, there were physical machines and movement through physical power, and there were water-powered plants and things like that. And these whole systems of shafts and belts and things like that powering factories. And you had this one kind of power load in a factory that was somewhere in the middle, and then from there, you actually have physical belts, rotating cogs in other buildings, and that was rotating some shafts that were rotating other cogs, and things like that.First of all, when people were able to package up electricity into something that's distributable, and they were running their own small electricity generators next to these big massive machines that were affecting early factories. And one of the first effects of that was they could reuse 30% of their factories better because it was up to 30% of the workspace in the factory that was taken up by all the belts and shafts. And all that movement was producing a lot of air movement and a lot of dust and people were getting sick. But now, you just plug a cable and you no longer have all this bad air and you don't have employees going sick and things like that. Things started changing quite a lot and then all of a sudden, you had this completely new revolution where you no longer had to operate your own electric generator. You could just plug in and get power from the network.And I think part of that is again, cyclic, what's happening in our industry now, where, as you said, we were getting machines. I used to make money as a Linux admin a long time ago and I could set up my own servers and things like that. I had a company in 2007 where we were operating our own gaming system, and we actually had physical servers in a physical server room with all the LEDs and lights, and bleeps, and things like that. Around that time, AWS really made it easy to get virtual machines on EC2 and I realized how stupid the whole, let's manage everything ourself is. But, we are getting to the point where people had to run their own generators, and now you can actually just plug into the electricity network. And of course, there is some standardization. Maybe U.S. still has 110 volts and Europe has 220, and we never really get global standardization there.But I assume before that, every factory could run their own voltage they wanted. It was difficult to manufacture for these things but now you have standardization, it's easier for everybody to plug into the ecosystem and then the whole ecosystem emerged. And I think that's partially what's happening now where things like S3 is an API or Lambda is an API. It's basically the electric socket in your wall.Jeremy: Right, and that's that whole Wardley maps idea, they become utilities. And that's the thing where if you look at that from an enterprise standpoint or from a small business standpoint if you're a startup right now and you are ordering servers to put into a data center somewhere unless you're doing something that's specifically for servers, that's just crazy. Use the cloud.Gojko: This product I mentioned that we built for mind mapping, there's only two of us in the whole company. We do everything from presales, to development testing supports, to everything. And we're competing with companies that have several orders of magnitude more employees, and we can actually compete and win because we can benefit from this ecosystem. And I think this is totally wonderful and amazing and for anybody thinking about starting a product, it's easier to start a product now than ever. And, another thing that's totally I think crazy about this whole serverless thing is how in effect we got a bookstore to offer that first.You mentioned the world utility. I remember I was the editor of a magazine in 2001 in Serbia, and we had licensing with IDG to translate some of their content. And I remember working on this kind of piece from I think PC World in the U.S. where they were interviewing Hewlett Packard people about utility computing. And people from Hewlett Packard back then were predicting that in a few years' time, companies would not operate their own stuff, they would use utility and things like that. And it's totally amazing that in order to reach us over there, that had to be something that was already evaluated and tested, and there was probably a prototype and things like that. And you had all these giants. Hewlett Packard in 2001 was an IT giant. Amazon was just up-and-coming then and they were a bookstore then. They were not even anything more than a bookstore. And you had, what? A decade later, the tables completely turned where HP's ... I don't know ...Jeremy: I think they bought Compaq at some point too.Gojko: You had all these giants, IBM completely missed it. IBM totally missed ...Jeremy: It really did.Gojko: ... the whole mobile and web and everything revolution. Oracle completely missed it. They're trying to catch up now but fat chance. Really, we are down to just a couple of massive clouds, or whatever that means, that we interact with as we're interacting with electricity sockets now.Jeremy: And going back to that utility comparison, or, not really a comparison. It is a utility now. Compute is offered as a utility. Yes, you can buy and generate compute yourself and you can still do that. And I know a lot of enterprises still will. I think cloud is like 4% of the total IT market or something. It's a fraction of it right now. But just from that utility aspect of it, from your experience, you mentioned you had two people and you built, is it MindMup.com?Gojko: MindMup, yeah.Jeremy: You built that with just two people and you've got tons of people using it. But just from your experience, especially coming from the world of being a Linux administrator, which again, I didn't administer ... Well, I guess I was. I did a lot of work in data centers in my younger days. But, coming from that idea and seeing how companies were building in the past and how companies are still building now, because not every company is still using the cloud, far from it. But not taking advantage of that utility, what are those major disadvantages? How badly do you think that's going to slow companies down that are trying to innovate?Gojko: I can give you a story about MindMup. You mentioned MindMup. When was it? 2018, there was the Intel processor vulnerabilities that were discovered.Jeremy: Right, yes.Gojko: I'm not entirely sure what the year was. A few years ago anyway. We got a email from a concerned university admin when the second one was discovered. The first one made all the news and a month later a second one was discovered. Now everybody knew that, they were in panic and things like that. After the second one was discovered, we got a email from a university admin. And universities are big users, they need to protect the data and things like that. And he was insisting that we tell him what our plan was for mitigating this thing because he knows we're on the cloud.I'm working on European time. The customer was in the U.S., probably somewhere U.S. Pacific because it arrived in the middle of the night. I woke up, I'm still trying to get my head around and drinking coffee and there's this whole sausage CV number that he sent me. I have no idea what it's about. I took that, pasted it into Google to figure out what's going on. The first result I got from Google was that AWS Lambda was already patched. Copy, paste, my day's done. And I assume lots and lots of other people were having a totally different conversation with their IT department that day. And that's why I said I think for products like the one I'm building with video and for the MindMup, being able to rent operations as a utility, but really totally rent ops as a utility, not have to worry about anything below my unique business level is really, really important.And yes, we can hire people to work on that it could even end up being slightly cheaper technically but in terms of my time and where my focus goes and my interruptions, I think deploying on a utility platform, whatever that utility platform is, as long as it's reliable, lets me focus on adding value where I can actually add value. That makes my product unique rather than the generic stuff.Jeremy: You mentioned the video product that you're working on too, and something that is really interesting I think too about taking advantage of the cloud is the scalability aspect of it. I remember, it was maybe 2002, maybe 2003, I was running my own little consulting company at the time, and my local high school always has a rivalry football game every Thanksgiving. And I thought it'd be really interesting if I was to stream the audio from the local AM radio station. I set up a server in my office with ReelCast Streaming or something running or whatever it was. And I remember thinking as long as we don't go over 140 subscribers, we'll be okay. Anything over that, it'll probably crash or the bandwidth won't be enough or whatever.Gojko: And that's just one of those things now, if you're doing any type of massive processing or you need bandwidth, bandwidth alone ... I remember T1 lines being great and then all of a sudden it was like, well, now you need a T3 line or something crazy in order to get the bandwidth that you need. Just from that aspect of it, the ability to scale quickly, that just seems like such a huge blocker for companies that need to order provision servers, maybe get a utility company to come in and install more bandwidth for them, and things like that. That's just stuff that's so far out of scope for building a business to me. At least building a software business or building any business. It's crazy.When I was doing consulting, I did a bit of work for what used to be one of the largest telecom companies in the world.Jeremy: Used to be.Gojko: I don't want to name names on a public chat. Somewhere around 2006, '07 let's say, we did a software project where they just needed to deploy it internally. And it took them seven months to provision a bunch of virtual machines to deploy it internally. Seven months.Jeremy: Wow.Gojko: Because of all the red tape and all the bureaucracy and all the wait for capacity and things like that. That's around the time where Amazon when EC2 became commercially available. I remember working with another client and they were waiting for some servers to arrive so they can install more capacity. And I remember just turning on the Amazon console. I didn't have anything useful to running it then but just being able to start up a virtual machine in about, I think it was less than half an hour, but that was totally fascinating back then. Here's a new Linux machine and in less than half an hour, you can use it. And it was totally crazy. Now we're getting to the point where Lambda will start up in less than 10 milliseconds or something like that. Waiting for that kind of capacity is just insane.With the video thing I'm building, because of Corona and all of this remote teaching stuff, for some reason, we ended up getting lots of teachers using the product. It was one of these half-baked experiments because I didn't have time to build the full user interface for everything, and I realized that lots of people are using PowerPoint to prepare that kind of video. I thought well, how about if I shorten that loop, so just take your PowerPoint and convert it into video. Just type up what you want in the speaker notes, and we'll use these neurometrics to generate audio and things like that. Teachers like it for one reason or the other.We had this influential blogger from Russia explain it on his video blog and then it got picked up, my best guess from what I could see from Google Translate, some virtual meeting of teachers of Russia where they recommended people to try it out. I woke up the next day, the metrics went totally crazy because a significant portion of teachers in Russia tried my tool overnight in a short space of time. Something like that, I couldn't predict it. It's lovely but as you said, as long as we don't go over a hundred subscribers, we're fine. If I was in a situation like that, the thing would completely crash because it's unexpected. We'd have a thing that's amazingly good for marketing that would be amazingly bad for business because it would crash all our capacity we had. Or we had to prepare for a lot more capacity than we needed, but because this is all running on Lambda, Fargate, and other auto-scaling things, it's just fine. No sweat at all. It was a lovely thing to see actually.Jeremy: You actually have two problems there. If you're not running in the cloud or not running on-demand compute, is the fact that one, you would've potentially failed, things would've fallen over and you would've lost all those potential customers, and you wouldn't have been able to grow.Gojko: Plus you've lost paying customers who are using your systems, who've paid you.Jeremy: Right, that's the other thing too. But, on the other side of that problem would be you can't necessarily anticipate some of those things. What do you do? Over-provision and just hope that maybe someday you'll get whatever? That's the crazy thing where the elasticity piece of the cloud to me, is such a no-brainer. Because I know people always talk about, well, if you have predictable workloads. Well yeah, I know we have predictable workloads for some things, but if you're a startup or you're a business that has like ... Maybe you'd pick up some press. I worked for a company that we picked up some press. We had 10,000 signups in a matter of like 30 seconds and it completely killed our backend, my SQL database. Those are hard to prepare for if you're hosting your own equipment.Gojko: Absolutely, not even if you're hosting your own.Jeremy: Also true, right.Gojko: Before moving to Lambda, the app was deployed to Heroku. That was basically, you need to predict how many virtual machines you need. Yes, it's in the cloud, but if you're running on EC2 and you have your 10, 50, 100 virtual machines, whatever running there, and all of a sudden you get a lot more traffic, will it scale or will it not scale? Have you designed it to scale like that? And one of the best things that I think Lambda brought as a constraint was forcing people to design this stuff in a way that scales.Jeremy: Yes.Gojko: I can deploy stuff in the cloud and make it all distributed monolith, so it doesn't really scale well, but with Lambda because it was so constrained when it launched, and this is one thing you mentioned, partially we're losing those constraints now, but it was so constrained when it launched, it was really forcing people to design things that were easy to scale. We had total isolation, there was no way of sharing things, there was no session stickiness and things like that. And then you have to come up with actually good ways of resolving that.I think one of the most challenging things about serverless is that even a Hello World is a distributed transaction processing system, and people don't get that. They think about, well, I had this DigitalOcean five-dollar-a-month server and it was running my, you know, Rails up correctly. I'm just going to use the same ideas to redesign it in Lambda. Yes, you can, but then you're not going to really get the benefits of all of this other stuff. And if you design it as a massively distributed transaction processing system from the start, then yes, it scales like crazy. And it scales up and down and it's lovely, but as Lambda's maturing, I have this slide deck that I've been using since 2016 to talk about Lambda at conferences. And every time I need to do another talk, I pull it out and adjust it a bit. And I have this whole Git history of it because I do markdown to slides and I keep the markdown in Git so I can go back. There's this slide about limitations where originally it's only ... I don't remember what was the time limitation, but something very short.Jeremy: Five minutes originally.Gojko: Yeah, something like that and then it was no PCI compliance and the retries are difficult, and all of this stuff basically became sold. And one of the last things that was there, there was don't even try to put it in a VPC, definitely, you can but it's going to take 10 minutes to start. Now that's reasonably okay as well. One thing that I remember as a really important design constraint was effectively it was a share nothing platform because you could not share data between two Lambdas running at the same time very easily in the same VM. Now that we can connect Lambdas to EFS, you effectively can do that as well. You can have two Lambdas, one writing into an EFS, the other reading the same EFS at the same time. No problem at all. You can pump it into a file and the other thing can just stay in a file and get the data out.As the platform is maturing, I think we're losing some of these design constraints, and sometimes constraints breed creativity. And yes, you still of course can design the system to be good, but it's going to be interesting to see. And this 15-minute limit that we have in Lamdba now is just an artificial number that somebody thought.Jeremy: Yeah, it's arbitrary.Gojko: And at some point when somebody who is important enough asks AWS to give them half-hour Lamdbas, they will get that. Or 24-hour Lambdas. It's going to be interesting to see if Lambda ends up as just another way of running EC2 and starting EC2 that's simpler because you don't have to manage the operating system. And I think the big difference we'll get between EC2 and Lambda is what percentage of ops your developers are responsible for, and what percentage of ops Amazon's developers are responsible for.Because if you look at all these different offerings that Amazon has like Lightsail and EC2 and Fargate and AWS Batch and CodeDeploy, and I don't know how many other things you can run code on in Lambda. The big difference with Lambda is really, at least until very recently was that apart from your application, Amazon is responsible for everything. But now, we're losing design constraints, you can put a Docker container in, you can be responsible for the OS image as well, which is a bit again, interesting to look at.Jeremy: Well, I also wonder too, if you took all those event sources that you can point at Lambda and you add those to Fargate, what's the difference? It seems like they're just merging into two very similar products.Gojko: For the video build platform, the last step runs in Fargate because people are uploading things that are massive, massive, massive for video processing, and just they don't finish in 15 minutes. I have to run to Fargate, and the big difference is the container I packaged up for Fargate takes about 40 seconds to actually deploy. A new event at the moment with the stuff I've packaged in Fargate takes about 40 seconds to deploy. I can optimize that, but I can't optimize it too much. Fargate is still order of magnitude of tens of seconds to process an event. I think as Fargate gets faster and as Lambda gets more of these capabilities, it's going to be very difficult to tell them apart I think.With Fargate, you're intended to manage the container image yourself. You're responsible for patching software, you're responsible for patching OS vulnerabilities and things like that. With Lambda, Amazon, unless you use a container image, Amazon is responsible for that. They come close. When looking at this video building for the first time, I was actually comparing code. I was considering using CodeBuild for that because CodeBuild is also a way to run things on demand and containers, and you actually can get quite decent machines with CodeBuild. And it's also event-driven, and Fargate is event-driven, AWS Batch is event-driven, and all of these things are converging to each other. And really, AWS is famous for having 10 products that do the same thing effectively and you can't tell them apart, and maybe that's where we'll end.Jeremy: And I'm wondering too, the thing that was great about Lambda, at least for me like you said, the shared nothing architecture where it was like, you almost didn't have to rely on anything other than the event that came in, and the processing of that Lambda function. And if you designed your systems well, you may have some bottleneck up front, but especially if you used distributed transactions and you used async invocations of downstream functions, where you could basically take some data that you needed to pass into it, and then you wouldn't necessarily need that to communicate with anything other than itself to process that data. The scale there was massive. You could just keep scaling and scaling and scaling. As you add things like EFS and that adds constraints in terms of the number of transactions and connections that, that can make and all those sort of things. Do these things, do they become less reliable? By allowing it to do more, are we building systems that are less reliable because we're not using some of those tried-and-true constraints that were there?Gojko: Possibly, but every time you add a new moving part, you create one more potential point of failure there. And I think for me, one of the big lessons when I was working on ... I spent a few years working on very high throughput transaction processing systems. That's why this whole thing rings a bell a lot. A lot of it really was how do you figure out what type of messages you send and where you send them. The craze of these messages and distributed transaction processing systems in early 2000s, created this whole craze of enterprise service buses later that came. We now have this... What is it called? It's not called enterprise service bus, it's called EventBridge, or something like that.Jeremy: EventBridge, yes.Gojko: That's effectively an enterprise service bus, it's just the enterprise is the Amazon cloud. The big challenge in designing things like that is decoupling. And it's realizing that when you have a complicated system like that, stuff is going to fail. And especially when we were operating around hardware, stuff is going to fail badly or occasionally, and you need to not bring the whole house down where some storage starts working a bit slower. You create circuit breakers, you create layers and layers of stuff that disconnect things. I remember when we were looking originally at Lambdas and trying to get the head around that and experimenting, should one Lambda call another? Or should one Lambda not call another? And things like that.I realized, let's say for now, until we realize we want to do something else, a Lambda should only ever talk to SNS and nothing else. Or SQS or something like that. When one Lambda completes, it's going to track a message somewhere and we need to design these messages to be good so that we can decouple different parts of the process. And so far, that helps too as a constraint. I think very, very few times we have one Lambda calling another. Mostly when we actually need a synchronized response back, and for security reasons, we wanted to isolate something to a single Lambda, but that's effectively just a black box security isolation. Since creating these isolation layers through messages, through queues, through topics, becomes a fundamental part of designing these systems.I remember speaking at the conference to somebody. I forgot the name of the person who was talking about airline. And he was presenting after me and he said, "Look, I can relate to a lot of what you said." And in the airline community basically they often talk about, apparently, I'm not an airline programmer, he told me that in the airline community, talk about designing the protocol being the biggest challenge. Once you design the protocol between your components, the message is who sends what where, you can recover from almost any other design flaw because it's decoupled so if you've made a mess in one Lambda, you can redesign that Lambda, throw it away, rewrite it, decouple things a different way. If the global protocol is good, you get all the flexibility. If you mess up the protocol for communication, then nothing's going to save you at the end.Now we have EFS and Lambda can talk to an EFS. Should this Lambda talk directly to an EFS or should this Lambda just send some messages to a topic, and then some other Lambdas that are maybe reserved, maybe more constrained talk to EFS? And again, the platform's evolved quite a lot over the last few years. One thing that is particularly useful in that regard is the SQS FIFO queues that came out last year I think. With Corona ...Jeremy: Yeah, whenever it was.Gojko: Yeah, I don't remember if it was last year or two years ago. But one of the things it allows us to do is really run lots and lots of Lambdas in parallel where you can guarantee that no two Lambdas access the same kind of business entity that you have in the same type. For example, for this mind mapping thing, we have lots and lots of people modifying lots and lots of files in parallel, but we need to aggregate a single map. If we have 50 people over here working with a single map and 60 people on a map working a different map, aggregation can run in parallel but I never ever, ever want two people modifying the same map their aggregation to run in parallel.And for Lambda, that was a massive challenge. You had to put Kinesis between Lambda and other Lambdas and things like that. Kinesis' provision capacity, it costs a lot, it doesn't auto-scale. But now with SQS FIFO queues, you can just send a message and you can say the kind of FIFO ID is this map ID that we have. Which means that SQS can run thousands of Lambdas in parallel but they'll never run more than one Lambda for the same map idea at the same time. Designing your protocols like that becomes how you decouple one end of your app that's massively scalable and massively parallel, and another end of your app that we have some reserved capacity or limits.Like for this kind of video thing, the original idea of that was letting me build marketing videos easier and I can't get rid of this accent. Unfortunately, everything I do sounds like I'm threatening someone to blackmail them. I'm like a cheap Bond villain, and that's not good, but I can't do anything else. I can pay other people to do it for me and we used to do that, but then that becomes a big problem when you want to modify tiny things. We paid this lady to professionally record audio for a marketing video that we needed and then six months later, we wanted to change one screen and now the narration is incorrect. And we paid the same woman again. Same equipment, same person, but the sound is totally different because two different equipment.Jeremy: Totally different, right.Gojko: You can't just stitch it up. Then you end up like, okay, do we go and pay for the whole thing again? And I realized the neurometric text-to-speech has learned so much that it can do English better than I can. You're a native English speaker so you can probably defeat those machines, but I can't.Jeremy: I don't know if I could. They're pretty good now. It's kind of scary.Gojko: I started looking at one like why don't they just put stuff in a Markdown and use Markdown to generate videos and things like that? All of these things, you get quota limits still. I thought we were limited on Google. Google gave us something like five requests per second in parallel, and it took me a really long time to even raise these quotas and things like that. I don't want to have lots of people requesting stuff and then in parallel trashing this other thing over there. We need to create these layers of running things in a decent limit, and I think that's where I think designing the protocol for this distributed system becomes an importance.Jeremy: I want to go back because I think you bring up a really good point just about a different type of architecture, or the architectural design of decoupling systems and these event-driven things. You mentioned a Lambda function processes something and sends it to SQS or sends it to SNS to it can do a fan-out pattern or in the case of the FIFO queue, doing an ordered pattern for sequential processing, which those were all great patterns. And even things that AWS has done, such as add things like Lambda destination. Now if you run an asynchronous Lambda function, you still have to write some code or you used to have to write some code that said, "When this is finished processing, now call some other component." And there's just another opportunity for failure there. They basically said, "Well, if it succeeds, then you can actually just forward it off to one of these other services automatically and we'll handle all of the retries and all the failures and that kind of stuff."And those things have been added in to basically give you that warm and fuzzy feeling that if an event doesn't reach where it's supposed to go, that some sort of cloud trickery will kick in and make sure that gets processed. But what that is introduced I think is a cognitive overload for a lot of developers that are designing these systems because you're no longer just writing a script that does X, Y, and Z and makes a few database calls. Now you're saying, okay, I've got to write a script that can massively scale and take the transactions that I need to maybe parallelize or that I maybe need to queue or delay or throttle or whatever, and pass those down to another subsystem. And then that subsystem has to pick those up and maybe that has to parallelize those or maybe there are failure modes in there and I've got all these other things that I have to think about.Just that effect on your average developer, I think you and I think about these things. I would consider myself to be a cloud architect, if that's a thing. But essentially, do you see this being I guess a wall for a lot of developers and something that really requires quite a bit of education to ramp them up to be able to start designing these systems?Gojko: One of the topics we touched upon is the cyclic nature of things, and I think we're going back to where moving from apps working on a single machine to client server architectures was a massive brain melt for a lot of people, and three-tier architectures, which is later, we're not just client server, but three-tier architectures ended up with their own host of problems and then design problems and things like that. That's where a lot of these architectural patterns and design patterns emerged like circuit breakers and things like that. I think there's a whole body of knowledge there for people to research. It's not something that's entirely new and I think you can get started with Lambda quite easily and not necessarily make a mess, but make something that won't necessarily scale well and then start improving it later.That's why I was mentioning that earlier in the discussion where, as long as the protocol makes sense, you can salvage almost anything late. Designing that protocol is important, but then we're going to good software design. I think teaching people how to do that is something that every 10 years, we have to recycle and reinvent and figure it out because people don't like to read books from more than 10 years ago. All of this stuff like designing fault tolerance systems and fail-safe systems, and things like that. There's a ton of books about that from 20 years ago, from 10 years ago. Amazon, for people listening to you and me, they probably use Amazon more for compute than they use for getting books. But Amazon has all these books. Use it for what Amazon was originally intended for and then get some books there and read through this stuff. And I think looking at design of distributed systems and stuff like that becomes really, really critical for Lambdas.Jeremy: Yeah, definitely. All right, we've got a few minutes left and I'd love to go back to something we were talking about a little bit earlier and that was everything moving onto a few of these major cloud providers. And one of the things, you've got scale. Scale is a problem when we talked about oh, we can spin up as many VMs as we want to, and now with serverless, we have unlimited capacity really. I know we didn't say that, but I think that's the general idea. The cloud just provides this unlimited capacity.Gojko: Until something else decides it's not unlimited.Jeremy: And that's my point here where every major cloud provider that I've been involved with and I've heard the stories of, where you start to move the needle at all, there's always an SA that reaches out to you and really wants to understand what your usage is going to be, and what your patterns were going to be. And that's because they need to make sure that where you're running your applications, that they provision enough capacity because there is not enough capacity, or there's not unlimited capacity in the cloud.Gojko: It's physically limited. There's only so much buildings where you can have data centers on the surface of Earth.Jeremy: And I guess that's where my question comes in because you always hear these things about lock-in. Like, well serverless, if you use Lambda, you're going to be locked in. And again, if you're using Oracle, you're locked in. Or, you're using MySQL you're locked in. Or, you're using any of the other things, you're locked in.Gojko: You're actually not locked in physically. There's a key and a lock.Jeremy: Right, but this idea of being locked in not to a specific cloud provider, but just locked into a cloud in general and relying on the cloud to do that scaling for you, where do you think the limitations there are?Gojko: I think again, going back to cyclic, cyclic, cyclic. The PC revolution started when a lot more edge compute was needed on mainframes, and people wanted to get stuff done on their own devices. And I think probably, if we do ever see the limitations of this and it goes into a next cycle, my best guess it's going to be driven by lots of tiny devices connected to a cloud. Not necessarily computers as we know computers today. I pulled out some research preparing for this from IDC. They are predicting basically from 18.3 zettabytes of data needed for IOT in 2019, to be 73.1 zettabytes by 2025. That's like times three in a space of six years. If you went to Amazon now and told them, "You need to have three times more data space in three years," I'm not sure how they would react to that.This stuff, everything is taking more and more data, and everything is more and more connected to the cloud. The impact of something like that going down now is becoming totally crazy. There was a case in 2017 where S3 started getting a bit more latency than usual in U.S. East 1, in I think February of 2018, or something like that. There were cases where people couldn't turn the lights on in their houses because the management software was working on S3 and depending on S3. Expecting S3 to be indestructible. Last year, in November, Kinesis pretty much went offline as far as everybody else outside AWS concerend for about 15 hours I think. There were people on Twitter that they can't go back into their house because their smart lock is no longer that smart.And I think we are getting to places where there will be more need for compute on the edge. First of all, there's going to be a lot more demand for data centers and cloud power and I think that's going to keep going on for the next five, ten years. But then people will realize they've hit some limitation of that, and they're going to start moving towards the edge. And we're going from mainframe back into client server computing I think. We're getting these products now. I assume most of your listeners have seen one like all these fancy ubiquity Wi-Fi thingies that are costing hundreds of dollars and they look like pieces of furniture that's just sitting discretely on the wall. And there was a massive security breach yesterday published. Somebody took their AWS keys and took all the customer data and everything.The big advantage over all the ugly routers was that it's just like a thin piece of glass that sits on your wall, and it's amazing and it looks good, but the reason why they could do a very thin piece of glass is the minimal amount of software is running on that piece of glass, the rest is running the cloud. It's not just locking in terms of is it on Amazon or Google, it's that it's so tightly coupled with something totally outside of your home, where your network router needs Amazon to be alive now in a very specific region of Amazon where everybody's been deploying for the last 15 years, and it's running out of capacity very often. Not very often but often enough.There's some really interesting questions that I guess we'll answer in the next five, ten years. We're on the verge of IOT I think exploding because people are trying to come up with these new products that you wouldn't even think before that you'd have smart shoes and smart whatnot. Smart glasses and things like that. And when that gets into consumer technology, we're no longer going to have five or ten computer devices per person, we'll have dozens and dozens of computing. I guess think about it this way, fifteen years ago, how many computer devices were you carrying with yourself? Probably mobile phone and laptop. Probably not more. Now, in the headphones you have there that's Bose ...Jeremy: Watch.Gojko: ... you have a microprocessor in the headphones, you have your watch, you have a ton of other stuff carrying with you that's low-powered, all doing a bit of processing there. A lot of that processing is probably happening on the cloud somewhere.Jeremy: Or, it's just sending data. It's just sending, hey here's the information. And you're right. For me, I got my Apple Watch, my thermostat is connected to Wi-Fi and to the cloud, my wife just bought a humidifier for our living room that is connected to Wi-Fi and I'm assuming it's sending data to the cloud. I'm not 100% sure, but the question is, I don't know why we need to keep track of the humidity in my living room. But that's the kind of thing too where, you mentioned from a security standpoint, I have a bunch of AWS access keys on my computer that I send over the network, and I'm assuming they're secure. But if I've got another device that can access my network and somebody hacked something on the cloud side and then they can get in, it gets really dangerous.But you're right, the amount of data that we are now generating and compute that we're using in the cloud for probably some really dumb things like humidity in my living room, is that going to get to a point where... You said there's going to be a limitation like five years, ten years, whatever it is. What does the cloud do then? What does the cloud do when it can no longer keep up with the pace of these IOT devices?Gojko: Well, if history is repeating and we'll see if history is repeating, people will start getting throttled and all of a sudden, your unlimited supply of Lambdas will no longer be unlimited supply of Lambdas. It will be something that you have to reserve upfront and pay upfront, and who knows, we'll see when we get there. Or, we get things that we have with power networks like you had a Texas power cut there that was completely severe, and you get a IT cut. I don't know. We'll see. The more we go into utility, the more we'll start seeing parallels between compute and power networks. And maybe power networks are something that you can look at and later name. That's why I think the next cycle is probably going to be some equivalent of client server computing reemerging.Jeremy: Yeah. All right, well, I got one more question for you and this is just something where it may be a little bit of a tongue-in-cheek question. Because we talked it a little bit ... we talked about the merging of Lambda, and of Fargate, and some of these other things. But just from your perspective, serverless in five years from now, where do you see that going? Do you see that just becoming the main ... This idea of utility computing, on-demand computing without setting up servers and managing ops and some of these other things, do you see that as the future of serverless and it just becoming just the way we build applications? Or do you think that it's got a different path?Gojko: There was a tweet by Simon Wardley. You mentioned Simon Wardley earlier in the talk. There was a tweet a few days ago where he mentioned some data. I'm not sure where he pulled it from. This might be unverified, but generally Simon knows what he's talking about. Amazon itself is deploying roughly 50% of all new apps they're building on serverless. I think five years from now, that way of running stuff, I'm not sure if it's Lambda or some new service that Amazon starts and gives it some even more confusing name that runs in parallel to everything. But, that kind of stuff where the operator takes care of all the ops, which they really should be doing, is going to be the default way of getting utility compute out.I think a lot of these other things will probably remain useful for specialists' use cases, where you can't really deploy it in that way, or you need more stability, or it's not transient and things like that. My best guess is first of all, we'll get Lambda's that run for longer, and I assume that after we get Lambdas that run for longer, we'll probably get some ways of controlling routing to Lambdas because you already can set up pre-provisioned Lambdas and hot Lambdas and reserved capacity and things like that. When you have reserved capacity and you have longer running Lambdas, the next logical thing there is to have session stickiness, and routing, and things like that. And I think we'll get a lot of the stuff that was really complicated to do earlier, and you had to run EC2 instances or you had to run complicated networks of services, you'll be able to do in Lambda.And Lambda is, I wouldn't be surprised if they launch a totally new service with some AWS cloud socket, whatever. Something that is a implementation of the same principle, just in a different way, that becomes a default we are running computer for lots of people. And I think GPUs are still a bit limited. I don't think you can run GPU utility anywhere now, and that's limiting for a whole host of use cases. And I think again, it's not like they don't have the technology to do it, it's just they probably didn't get around to doing it yet. But I assume in five years time, you'll be able to do GPUs on-demand, and processing GPUs, and things like that. I think that the buzzword itself will lose really any special meaning and that's going to just be a way of running stuff.Jeremy: Yeah, absolutely. Totally agree. Well, listen Gojko, thank you so much for spending the time chatting with me. Always great to talk with you.Gojko: You, too.Jeremy: If people want to get in touch with you, find out more about what you're doing, how do they do that?Gojko: Well, I'm very easy to find online because there's not a lot of people called Gojko. Type Gojko into Google, you'll find me. And gojko.networks, gojko.com works, gojko.org works, and all these other things. I was lucky enough to get all those domains.Jeremy: That's G-O-J-K-O ...Gojko: Yes, G-O-J-K-O.Jeremy: ... for people who need the spelling.Gojko: Excellent. Well, thanks very much for having me, this was a blast.Jeremy: All right, yeah. And make sure you check out ... You mentioned Narakeet. It's a speech thing?Gojko: Yeah, for developers that want to build videos without hassle, and want to put videos in continuous integration, and things like that. Narakeet, that's like parakeet with an N for narration. Check that out and thanks for plugging it.Jeremy: Awesome. And then, check out MindMup as well. Awesome stuff. I've got all the stuff in the show notes. Thanks again, Gojko.Gojko: Thank you. Bye-bye.

ceo amazon texas thanksgiving europe english google earth nature russia european corona japanese microsoft smart east teachers os pc scale cloud bond id pacific designing ibm wifi oracle intel obsessive compulsive disorder jaws copy cv iot hp api apple watches serbia fits powerpoint aws linux faa pcs vm rails gpu hewlett packard commodore bose sns s3 hello world sql docker git t3 gpus leds google translate lambda cyclical idc pci baas t1 compute serverless mysql vms pokes specifications digital ocean markdown fifo heroku compaq aws lambda idg andy grove ec2 efs pc world peeks kinesis vpc wardley cyclical nature lightsail lambdas fargate sqs simon wardley gojko compersonal netwatch gojko adzic eventbridge mindmup jeremy you jeremy it aws serverless hero cbt nuggets aws batch jolt award

第1015期:Proud Papa

Play Episode Listen Later Oct 26, 2020 3:34

Abidemi: Jeremy, I heard you have a new baby. Congrats on that.Jeremy: Thanks.Abidemi: So how has fatherhood changed you in any way?Jeremy: You know, it's changed me probably in every way. You know, when you think of how your life is different, it's basically from when you wake up until – well, you eventually get to go to sleep. Everything is different.But, you know, everything about it is positive, even things that you'd think would annoy you actually aren't really that annoying. Like having a toddler come and wake you up at 6 o'clock in the morning when you've got a cold and you've taken a bunch of sleeping medication the night before, like which happened this morning. And even that, having a toddler wake you up and just looking at you and laughing, it's awesome. It really is awesome.So, you know, I always kind of worried before when I was thinking about having a kid about all these things – oh, I can't go out at night or I'm going to have to wake up so early in the morning and I'm going to be woken up in the middle of the night by a screaming child.All of that stuff really doesn't bother me. It just, you know, it's part of the experience and yeah, sometimes maybe I'd like to sleep a little bit longer. But yeah, waking up and then having a little guy sit on my lap while he plays with the remote control and changes channels over and over again, I mean, that's somehow a very, very enjoyable experience.But, you know, you get to see a little person change everyday and start figuring things out. And of course, every person, every parent thinks that their child is the smartest child in the world. He can turn off and on the light switch, you know, but it's those types of things that you just marvel at.So, you know, I can say like most people that it's everything about it is very positive. But it's often just the really, really small things that make you appreciate this little wonder.Abidemi: And looking back on the experience now, what are some things that you would wish that you had known before your little boy came along?Jeremy: Yeah, it's a good question. You know, I was so stressed out for the first couple of weeks after he was born. And from the moment he was born until he came back home, and then for the first number of weeks that we was home, we basically just hovered over him for weeks. And, you know, of course, you have to protective and careful of a new born baby. But not everything was as serious as we thought it was.And, you know, crying – we were worried about waking up our neighbors or we're worrying that, you know, something was seriously wrong. Babies cry. And if you really get stressed out about it then you're going to make your life miserable.So, you know, having done it once, just realizing that, you know, kids cry, kids get sick, kids throw up, kids need their nappies changed, I would just take a deep breath if I have to do it again and just realize, okay, this is just part of the experience. And you just got to roll with it.Abidemi: Okay. That's great.

babies proud papa jeremy you

Why wouldn’t someone like free swag? That’s not a rhetorical question. In fact, Jeremy Parker has been trying to answer that question since he co-founded Swag.com in 2016. Jeremy knew that swag and other promotional items were becoming key marketing tools, and he saw an opportunity to build a business that brought those items straight to the people who needed them. On this episode of Up Next in Commerce, Jeremy takes us behind the scenes of what it was like building Swag.com, including how he went from 3,000 organic site visitors in a month to more than 40,000 organic visitors. The journey to that success was paved with many hiccups, including the difficulty that comes with building an ecommerce platform from scratch, and trying to land their first big-name customer by walking around that company’s campus until they found a buyer. But today, Swag.com can handle unlimited orders, and that first customer was a little company called Facebook. How did it happen? Learn that and more on this episode. Main Takeaways: The Snowball Effect — Attracting customers is always easier when you have a proven track record that you can point to. Therefore, it is critical to land key accounts in the early days that can be referenced in future sales conversations. Because when you can point to one successful company that works with you, other companies will follow suit. What To Know About SEO — Good SEO doesn’t happen by accident. Even though you might have great products and a thriving customer base, organic growth doesn’t happen unless you’re paying attention to your content strategy and making the necessary little tweaks that will bump you up in the search results. If You Build It, They Will Come — When deciding on your product offerings, you have to get inside your customers’ heads and build up an inventory of things that people actually want. Sometimes that means you have to get your hands dirty, do some testing and try things that don’t scale before finally settling on the right blend of offerings. For an in-depth look at this episode, check out the full transcript below. Quotes have been edited for clarity and length. --- Up Next in Commerce is brought to you by Salesforce Commerce Cloud. Respond quickly to changing customer needs with flexible Ecommerce connected to marketing, sales, and service. Deliver intelligent commerce experiences your customers can trust, across every channel. Together, we’re ready for what’s next in commerce. Learn more at salesforce.com/commerce --- Transcript: Stephanie: Welcome back to Up Next in Commerce. I'm your host, Stephanie Postles co-founder of mission.org. Today on the show we have Jeremy Parker, the co-founder and CEO at Swag.com. Jeremy, how's it going? Jeremy: Hey, thanks so much for having me. Stephanie: I'm excited to talk all things swag. You saw my shirt hoodie. I was ready for you this morning. I have everything branded mission. Jeremy: Every everyone needs a little schtickle of swag in their life. Stephanie: I agree. What is the first piece of swag that you remember? Jeremy: Oh, wow. For myself, I've been going to a ridiculous number of trade shows and events over the years. Honestly the earliest swagger member was stuff that I ended up throwing away and that's one that gave me one of the ideas for Swag.com and we wanted to make sure we only offer products that people actually want to keep. That was my main mission from the very beginning. Stephanie: Yeah, same here. I remember getting a bunch of stuff and throwing it away, but I remember being so excited it was back I think in 2010, it was like my first finance conference and I got like a Koozie. I was so excited because it was like the first thing that I'd ever gotten for free maybe and finances a little bit. Sticklers is about giving stuff away for free. And I look back and laugh now because I would go and collect all this stuff and it would ultimately end up being nothing that I really used. Jeremy: 100%. From the very beginning of our business, we were thinking of swag as an amazing marketing tool if it's used right, so obviously that's a big caveat. And when you think of just marketing in general and you have TV commercials and everyone's trained to now fast forward through commercials and you get a magazine, you flip through the ads, or you put your ad blocker on your computer. If you give somebody really high quality swag, they say, “Thank you.” It's really a powerful tool if it's done really right. And it has to be something that people are actually going to want to use. We don't really like to push the flashiest thing or the new hottest thing. It's all about what are people that actually use every day and get those impressions of. Stephanie: Yeah. I love that. Before we dive way too deep into Swag.com, I want to hear a little bit about your background because I see you've done a lot of things in your previous life. And I wanted to kind of hear what your journey was like before founding Swag.com. Jeremy: Sure. I was a documentary filmmaker actually in college, that's why I went to school for. I actually never wanted to be a filmmaker when I went to Boston University. And I looked at the curriculum and I really wanted it to be in high school my whole … Before college life I always wanted to be a marketing guy. I was always into branding and commercials and how to tell stories through marketing. When I went to school and I looked at the syllabus of film and marketing, they really were the exact same thing, except for film taught me how to make videos. And this is right at the onset of like YouTube. I thought that would become valuable. I became like probably the first filmmaker at BU history that never actually wanted to be a filmmaker. Jeremy: But as I was in school for those four years, I ended up making a feature length documentary that ended up winning the audience award at the Vail Film Festival. And I was [inaudible] and I walked down and the brunch the next day after the award ceremony and half the room are these major celebrities and half the room are these struggling filmmakers. And I did kind of an internal gut check of, am I good enough? Is this what I want to do with my life? And it wasn't, so right after I won this award, when people primarily feel like on a high, they're like, “Oh, I'm going to become the biggest filmmaker,” my thought was, “What else am I going to do? What's my plan? What's really my plan? What am I good at?” Jeremy: And when I graduated college, I didn't know what I was good at. I had no real experience in business or anything, but I thought maybe I should start something and just learn what I'm good at, what I enjoyed. I started a t-shirt company right out of college when I was 21, 22. And really I thought t-shirt sounds so simple, but really you're learning manufacturing, PR, marketing, building an Ecommerce experience, all the different aspects of business, fulfillment, all these different things. And I tried to figure out what I was really good at. Jeremy: And over the last 10 years, I've done a lot of different things. I started the company with my brother and Jesse Itzler. Jesse is the co-founder of Marquis Jet, private jet company. He sold ZICO Coconut Water to Coca-Cola. He's one of the owners of Atlanta Hawks. I started a company with him where we partnered up with different celebrity influencers and we owned their celebrity rights to Twitter and Facebook feeds before people knew how valuable it was. This was nine years ago or so. Jeremy: So [inaudible] a lot of celebrities, buying their rights. That company ultimately got bought by a publicly traded company. I then went on to start a social networking app that ultimately failed. Never start a social networking app, I'll tell you that. Extremely difficult. Stephanie: Semi-hard. Jeremy: Yeah, it's semi-hard to do. And we built an app called Vouch. That basically was about like Oprah's favorite things democratize for everybody. You could vouch for your favorite movie and book and charity and anything you'd want to vouch for and people who follow you really get to know what you like. Really kind of making the like button with its own platform. We ended up having 100,000 plus users. We had tons of influences. It just never materialized. And after doing that for three years, I realized that the next business I want to start, it needs to be something where we made money from day one, I could give a service and a product and I started Swag.com. Jeremy: So, it's been almost five years at this point with Swag. We were just named the 218 fastest growing company on the Inc. 5000. We have 5,000 companies from Facebook, Google, Amazon, Netflix, TikTok, Spotify buying on our site and we spent a really big portion of that building is automated experience for purchasing swag. And now it's about, now how do you handle the distribution of swag? It's more than just making it easy to buy. How do you get into the hands of people? And especially now with this pandemic, that's really the most important thing. Stephanie: Yeah, I was just going to touch on that. I know everyone's probably wondering with everything going on, where conferences are being obviously canceled and not coming back for a while. How are you guys handling that? Because I'm that the swag industry right now is down overall. What are you guys doing right now to not be part of that downward spiral? Jeremy: Yeah, that's a 100% true. They just came up with numbers. ASI, which is like the big organization for promotional products, just came out to number that over 92% of companies in our industry are down approximately 50% in Q2 this year versus last year. So, it's really bad. And then obviously it makes sense on the surface where you have our core buyer was like the HR manager buying for onboarding of new hires. That was one of our big purchases and no one's hiring right now. That business goes away. And then you have the marketing teams buying for trade shows and there's no trade shows happening, so that business goes away. Jeremy: Then you have the office manager buying for internal office and company culture, and no one's in the office right now. You have all these different buyers that really are not buying swag for the normal, the typical reasons to buy swag. So like everyone in our industry, we were very nervous like what's going to happen. And what we've been able to do is take this platform, our swag distribution platform, which is what we're really pushing and what we're really excited about. We'd been building this really amazing platform over the last two years, specifically for marketing managers. That was the initial idea of it. Allowing marketing managers to easily be able to buy swag and then send swag to the remote customers or to best leads to close sales. Jeremy: That was their initial intention. But obviously with this pandemic and everyone's working remotely, it's transitioned to office managers and HR managers really buying swag in bulk and sending it to the remote employees addresses to keep the company culture thriving, even when no one's in the office, so much so that not only are we not one of the 92%, that's downloading over 50% our Q2 this year was more than our Q2 of last year and July was almost double our last year July. And it was our best month ever and August is even better than that. We're really growing frankly in a crazy time for everyone. Stephanie: That's amazing. Now, I'm thinking about it. I ordered swag for our team maybe two years ago and the process, it was crazy. It was so much back and forth of like, “Here's your quote. Oh, you want to more of this? Okay. Here's your new quote? Here's what the design might look like.” It was just a lot. And then of course the big box came to me and then I had to maybe ship things out individually or wait until I saw people in-person if I was being a little cheap. What does it look like now I'm thinking about reordering hoodies and shirts for our team members? But of course I would have to individually maybe shift them out again or are you guys different? What is your process look like that's so different than others? Jeremy: Really simple. On our site, we have very curated selection of products. You're not going to be overwhelmed with too many options. Say the top 25 mugs, you find a mug you could use our filtering tools, really easy to search by color or price point or your type of brands. You find the product you upload your logo. Our system will detect how many colors are in your logo, in the nearest Pantone match. We're making sure we're printing, Coca-Cola red and not Staples red. Once the logo is uploaded, you can maneuver the rounds, you can mark everything up. You select on your quantity price adjuster in real time and checkout. It literally takes less than three minutes to buy swag. There's no back and forth. You can also use our instant quote tool, if you wanted to quote things out before you want to go through the design process on our site, you can upload your different variables, the quantity that you're looking for, how many print locations, the number of colors in the print. It takes two seconds and you're coordinating things out. Jeremy: So, there's no back and forth emails, there's no phone calls, there's no presentation decks. It's none of it. It's really completely automated streamlined. And then when you're going through the checkout flow, obviously you can input your own address, so we'll ship everything to your office. Or if you want us to handle all the distribution for you, there's a pink button on that shipping page that says, one is to hold your swag and inventory easily distributed, [inaudible 00:09:29]. You click on the button, you follow the onboarding and then we hold all of your swag in this online Swag closet if you will, where you can manage all of your inventory in real time. If you're ever running low in stock, we'll send you smart notifications to restock. If you want to send 1,000 different locations, you upload a CSV file we'll calculate the shipping costs in real time, based on the product you selected, where they're going. Jeremy: Once you pay for that, we grab those products off the shelf and we're shipping it all over the world for you. We really streamlined the entire experience. We take it a step further if you wanted it, some companies want this, some companies don't, but we have a whole ability to create different inventory closets for location or for a department. You can have a marketing closet versus a sales closet, versus your London office or New York office. Different people should get access to it. There's different permission settings, approval flows, et cetera. You could really break it down by department, by location and we're doing this with a lot of global main companies all over the world. Also, a lot of small startups who just want to use our service as a way to distribute swag. Stephanie: I was looking through your site and I saw products there that I haven't seen in other swag companies. And I wanted to talk a little bit about how you guys go about picking your products because all of them seemed high quality where oftentimes, I'll go through it and I'll find 50 different shirts on a custom t-shirt company website. And I'm like, “Oh my gosh, actually let me look through all the reviews. Let me see if they're good. Okay. 95% of them are all bad. They all have bad reviews, bad fits, whatever.” How do you go about making sure that you only have high quality stuff there that people will actually want? Jeremy: That's a great question. And that was the challenge. And it's an ongoing challenge, always. From the very beginning, me and my co-founder, each invested $25,000 of our own money. That was our first startup budget. What be used primarily for that was buying samples. We went out and we went to different trade, shows all over the country and we bought samples from tons of different suppliers. And we saw exactly what customers typically see when they buy from sites. A lot of this stuff was really poor quality, would end up in the trash and we would never feel comfortable selling it. We were really kind of laser focused on only offering a curated selection of products that we would actually want to keep ourselves. It's a lot of testing, it's constant testing. Jeremy: How we kind of look at the whole process is we want to have the best of what's out there. It could be the relatively inexpensive, or it could be premium. It doesn't really matter we have to have stuff in all price points. We don't want it to be known as the premium quality supplier. We want to be known as the quality supplier. We have a lot of products there high-end brands, Public Rec, Rowan, Top Wood Designs, Patagonia, different products like that. We also have no name products that you had never heard of, but they're really, really quality. We have a product sourcing team that's constantly contacting a lot of direct to consumer type of products and brands that are not traditionally found in the promotional product space and going after them as well, because we want to be known as the company that has products that no other company in our space offers. Jeremy: What we've been seeing is that a lot of companies that are okay featuring their stuff on our site or are happy to feature their stuff on our site like Bellroy Backpacks, they've never done it in other promotional product sites because the other sites, feel schlocky or throw away or cheap in some way. And we are really, really not that. We're really trying to focus on quality products, stuff that people would be proud to show off, stuff that when you get it, you're going to want to wear it every single day because that's really the only real reason why Swag is a true benefit is that people actually want to use it. Stephanie: Yeah. So, now that you can't go to trade shows and try things out, and are you still going through that process when it comes to finding new products, like just ordering things that you think are great and trying them out, or is it different than what it used to be? Jeremy: No, it's exactly that. It's less expensive in some ways and more expensive in other ways. We want to make sure we have the right products that we're constantly spending a ton on samples. And now at this point in the business also, we're almost five years in and we're somewhat known in our industry. We're the fastest growing company in the promotional product space. A lot of different, great suppliers and direct to consumer brands have heard of us, so they're willing to send us free samples. We don't necessarily have to pay for it anymore. But we're just constantly sourcing more products and taking some products that maybe were cool last year, but we don't think they're going to be good this year and replacing it with new stuff. Jeremy: We don't want to keep just adding and adding and adding because it then makes it very complicated for customers to make a decision. So, we're constantly, always looking at our site and saying, “Is this the right blend and mix of products?” And we're always never happy. We're always constantly trying to improve it. Stephanie: Very cool. I'm guessing there's also a bit of like a data element where you can probably look into the data and see what people are either enjoying. Do you do reviews? Do you use customer feedback to also influence the products that you choose? Jeremy: Yeah. 100%, yeah. After everyone places an order, we always have a survey that automatically goes on the time of delivery, very basic. It's like one question like, “How satisfied were you?” So we can get our ranking and see how people like the products and how they turned out. If we ever get any sort of bad or not 100% amazing feedback about a product, we'll look into it and maybe there's something wrong, maybe the print quality wasn't great for that order or maybe the product itself wasn't as great as what we thought and we'll just remove from our site. We're constantly listening to our customers, understanding do we have the right products at all times? Because that's very important for us. We need to have that. Jeremy: We're constantly testing more and more products. And obviously we're learning what people are adding to their cart. How many products are being … What products together go? We sometimes find that if somebody buys a tote bag, they're going to buy other products that could fit into that tote, like smaller products. Or if they buy a backpack, other types of products are usually bought with backpacks. We're constantly looking at data and trying to make sure we have the right mix of products that go with each other, so we can start positioning certain products. When you buy a backpack, the products that are featured as you might also like actually make sense. So, not just what we think, but what the data is telling us. Stephanie: I love that. Along with maybe getting personalized recommendations, depending on the product they chose, are you also personalizing the experience based on maybe what company is looking around? If a LinkedIn's looking versus Google, maybe you know that Google always buys hoodies where LinkedIn buys coffee mugs. I don't know. Are you personalizing it based on who's actually browsing? Jeremy: At this point, we're not. And we've been constantly thinking about that. The challenge is that there's so many different buyers within companies. Even if we worked, let's say with LinkedIn, which we do or Google, which we do, there's so many different divisions within Google that are completely different. We're selling to the HR team or the marketing team or the sales team or office manager, or just somebody who's buying it for their local team. Everyone's looking for different things. We've done for Google complete stuff, obviously the normal stuff of notebooks and t-shirts and sweatshirts, backpacks and water bottles. But we've also done custom Allbirds Sneakers. It's hard to kind of match up always and all the buyers are necessarily not always the same. Jeremy: So, it's constantly changing, but as we're growing as our processes and we're able to handle a lot more orders and we're analyzing more data, I think that will be a shift in the future of really making the experience as personal as possible and that might be not making it personal at all based on companies or that might be going the opposite way and making it super, super personal. We're kind of learning what's the right mix at this point. Stephanie: So, to talk a little bit about maybe the backend, the tech stack, it seems like there's a lot going on behind the scenes. I first wanted to start with, I saw that you were quoted saying the platform's able to handle unlimited orders in a day. And I was wondering, is that because you guys are leveraging cloud infrastructure or have you built some kind of scaling methodology? What does that look like behind the scenes to allow you to have unlimited orders? Jeremy: Yes. We do work with AWS, which for the cloud obviously makes things a little bit easier, but our entire platform is fully custom. Every single aspect of our site is custom. We're not using any other services. Obviously we're using like Intercom live chat. We're not going to be building our own chat, but the entire platform itself and all of our pricing is very complicated. That's why there's not a lot of companies in the space that could do what we do because it's fully dynamic. Every price tasting consideration, the quantity that you're looking to buy, how many print locations, the number of colors in the print, all these different variables that have to be in play. And now if we have 3,000 products on our site and 200 core products, they all have different pricing structures, they all have different under base charges, they all have different kinds of printing methods from screen printing, embroidery, laser engraving, all of these come with different complications. Jeremy: So, we really had to build our site from this place from the very beginning we couldn't just take an out of the box solution. And frankly I would have loved to take an Ad Box solution for this because it's been taking me five years to build [inaudible] building. We have a 15-person tech team and we're growing, we keep developing more and more and more because it's important. And we want to always stay one step ahead. At this point, like yesterday we did north of a 100,000 in sales all through our Ecommerce site. Things we could really scale and that same day. The day before we did 50,000 in sales and then hopefully today we do more than a 100,000 sales. Every day could literally be completely different and it's completely the same automation. Somebody could buy 5,000 notebooks or they could buy 50 notebooks or 15 notebooks or 20,000 notebooks. And it takes the same processing ability, same exact time for checkout. Stephanie: Very cool. Yeah, that's great. When you're thinking about back in the day, starting out with a custom website versus maybe pulling something like using a platform that is already out there, how did you go about deciding that you wanted something custom and then what did that process look like? What were some of maybe the mishaps or failures along the way where you're like, “Oh, if you guys are trying to build something custom make sure you don't do this or that you avoid this.” What kind of learnings did you get from doing a custom? Jeremy: Actually the truth of the matter was in the very beginning of the business, we went all in on Shopify. And we went all in say, “You know what? Why are we building our own Ecommerce experience when somebody else could do it significantly better than us or is worked through all the kinks?” The challenges, when we start to really build a Shopify, we realize how complicated our specific industry is in terms of pricing. And there's no really easy way. There's no Ad Box solution that could really do it. We spent literally two and a half months building the Shopify store only to then realize, which was a big mistake on our part, that the pricing was not able to be done. Jeremy: We had to really scrap it and start from scratch. And we realized it's going to take us a lot longer to get where we want to be, but we're still not where we want to be, but it makes the most sense. It's really the only way to streamline it as much as we want to streamline it. Now, the typical process of promotional products, as you mentioned before, it's a lot about phone calls. It's back and forth, this quote versus that quote. You change one little element, the whole quote changes. We didn't want to deal with that. We wanted customers to be able to do it themselves, no talking to anybody. If you don't want it, obviously if you want to call us, we love to hear from you, but you don't have to. You could do every single thing yourself and we want it to make that effortless. Jeremy: You want to hold things in inventory, click on the button and now it's all in inventory. You want to distribute swag, upload this and it ships out. Every single thing on our site, we wanted to make it as easy as possible and historically, and traditionally it's not been easy at all. And it makes sense because of how custom the product offering is. Stephanie: If you would have, maybe on day one started out with like, here's the kind of things that we're most interested in. Did you know that you wanted this custom pricing option and did you go and kind of look at different platforms to see if they could do that? Or did you just jump right in? Jeremy: Yeah. From the very beginning, honestly, we spent a year before we built any platform. Our initial idea was we don't really know the platform to build, we knew that the industry needed to be shaken up a little bit. We knew how old and fragmented the industry was, but frankly, I think most entrepreneurs could agree. You honestly know what the right answer is. Most people don't, they think they do, they don't. From the very beginning, our idea was let's just learn as much as possible. Let's reach out to as many office managers, HR managers, people that we know within industries by swag. And let's ask them what they like and hate about the current buying experience that they're having. And we would show up at meetings and we would literally say, “What sites you buy from?” And they would give us some site names and we would look over their shoulder really. Jeremy: This is what we did for the first year. We spoke to over 200 different office managers, HR-related buyers. And some ended up buying became customers of ours for many years and some moved on to other things, but just to see how they purchase swag was a big tell for us, really what the process was. Looking at their email back and forth 40, 50 emails with a rep just didn't make any sense to us. We kept kind of thinking. That was kind of the first six months of our business. The second six months of our business, the remainder of the year was about, “Well, let's do it the old school way. Let's just launch a landing page. Let's go out there. Let's be a traveling salesman and try to sell some stuff.” And we really learned how painful it is. It's like it takes forever to quote, there's a lot of manual labor. Jeremy: Every single thing that was painful for us, we then figured out a solution to automate it. And we kept just chipping away at it. Stephanie: That's so important. I think it's Paul Graham who said do things that don't scale. And that's how you actually learn, like what's working, what's not working and what to build going forward. That's really smart. Jeremy: Exactly right. And that's the same thing with even getting our customers. Now, I haven't made a sales call in four years, but in the beginning I was doing everything. Me and that co-founder, Josh, we would show up at offices and try to sell. And we sold to Facebook as one of our first customers. First customer, really actually. We had a friend who worked at Facebook, got us in the door. We ended up walking around Facebook's office in New York just speaking to whoever we could to see somebody who would buy swag. Ultimately ended up selling them a couple of t-shirts, like 100 or 200 t-shirts. We barely broke even on it. I think we made like 5% margin, like barely anything, but didn't matter to us. It was just about getting that Facebook logo. I remember two days later we showed up at WeWork in New York and WeWork asked us who else we work with? And we said Facebook. Jeremy: They assumed we probably had thousands of others because we had Swag.com brand and Facebook, but really it was just Facebook. And we got, WeWork and we continued that cycle to get that really five core blue chip companies. It was doing the really unscalable things like showing up, showing the products in-person, making the sale, really learning the process as much as possible, and then automating the experience and making that whole buying experience effortless. Now, people don't need to speak to anyone if you don't want to, that's really what our main goal was. Stephanie: That's great. Yeah. I think we've had a couple people on the show. Who've talked about just finding that first customer that you can kind of leverage as the brand name and then just pointing to them and be like, “Yeah, they work with us. Like you should too.” So I think that's a good lesson for a lot of companies starting up. If you find that one brand name and you can reference them, it'll probably help with all future sales. Jeremy: 100%. It's all about social proof, at least what we have learned, it's everything. People are not going to work with you if they don't feel confident. To build up the confidence, obviously you have to have a great platform, but that doesn't happen overnight. That takes time. You have to have a great brand and a great design, make people feel confident, but other ways are who else you've worked with? A lot of our shirts and what's big reason why we've been able to scale with very little money is a lot of our t-shirts and apparel has a Swag.com in your label. We do our own products. Jeremy: When people [inaudible] t-shirts that's 5,000 people knowing about Swag.com. They see the t-shirts, they see the quality, they feel how great it is, they see the print, they have the instant social proof that Facebook uses them or Google uses them, whoever is getting the product and they see Swag.com and it drives a ton of traffic to us. That only really works if the products are great as well. Obviously people are getting really poor quality and everything's says, “Swag.com,” no one will use us. It'd be opposite [inaudible 00:25:52]. Every single thing really has to work hand-to-hand Stephanie: Yeah, that's a really. Jeremy: Yeah. We were thinking about it initially because I wear jeans a lot. I was thinking like I buy one pair of jeans for like three years. It's kind of looks cooler, the more you wear jeans, it gets more faded. But with Swag.com or swag in general, people buy stuff for a specific reason. You're buying it to give it away and then you need more stuff. If they're buying it to give it away, we have to make the experience of giving away products that other people actually want and see. And then that new person, that person who just bought that 5,000 t-shirts now they need more stuff for the next event. It's a completely different kind of business. And we just try to figure out, we have to make sure that our logo is everywhere that it can be, obviously within reason. Stephanie: I love that. Let's talk a little bit about the backend when it comes to warehousing your inventory. How does it work behind the scenes? If you're able to allow someone to essentially have their cart saved and then say, “Okay, ship this to one person in California and then ship this to one person in Florida.” What does the backend look like to make those logistics work? Jeremy: Yeah. Upfront in terms of the actual buying swag and bulk, we have integrations with different kind of the best vendors in each industry. So, like the best one for drink wear, best one water bottles and obviously we have a big selection of product. When somebody buys 1,000 mugs or something on our site, it's automatically connects to our supplier network that produces the highest quality mug with their logo and then drop ships it directly to the customer's office or wherever. But if they're holding stuff in inventory, it ships into our 3PL. We have four strategic locations all throughout the US and we're adding more locations in Canada and Europe right now to make it cheaper for global distribution. Once the products are in our fulfillment center, then they log into the, my inventory portal and they see all of their inventory in real time. Jeremy: So, if you're ever running low in stock, we'll send you smart notifications to restock. They can easily upload their CSV file. We'll calculate the shipping costs in real time. They pay for it. We grab it off the shelves and we're shipping it to 1,000 different locations. We also have this feature called the Swag Giveaway. Oftentimes, especially now, people don't necessarily know where their remote employees are living. Say you went to a trade show. God willing the world opens up, we have trade shows again, and people go to your booth and they give you their email address. You'll know what t-shirt size they are. You'll know where they live, but you still might want to engage with them. We built the Swag Giveaway feature, where you can create a fully recipient branded landing page. Let's imagine Google just uploads their logo and their colors. Jeremy: And they could easily blast out to a CSV file of just having the person's first name and email, that recipient will click on the link. It will be branded with Google, they'll select their t-shirt size or they'll select their mug, or water bottle, they'll be able to choose which product they want. Input their address, submit. It all speaks to our system. And now we have the address that we can distribute. We're building all of these tools to allow people to distribute if they're shipping to one address, thousands of addresses, or even if they don't have the recipients addresses, but easy way to capture that and also distribute. Stephanie: Wow, that's a lot going on behind the scenes. Jeremy: Yeah. Stephanie: How are you thinking about the front end part of the website because to me when you're ordering swag or something where you really want to see the details of like, is that embroidery right, are the colors right and also just like making sure that you have people who are converting and not just sitting on maybe their design or their shopping cart? How are you moving people along through the website and what kind of best practices have you seen when setting up the front end user interface? Jeremy: Good question. It's probably the most challenging thing for our business because it is custom and everyone is somewhat concerned about, is this going to come out perfect, is it could be the right logo color, is it going to be the right positioning. What we've learned is obviously we built our patent technologies is one of the first things we built to detect the number of colors and the nearest panto match in your logo when you're uploading it. So, to make people feel really safe, they're going to get their exact color. Now, obviously it can never be 100% because web colors are not the same as Pantone colors that are used for printing and t-shirts, but it really gets to the closest match. And if you want, and if you know your Pantone, which a lot of companies do, it allows them to easily input their exact Pantone, so it overrides everything and it makes it really easy. Jeremy: Obviously they can maneuver their logo, they can mock it up. And what we say is, after you place your order, we're always going to create for them a virtual production mock-up to approve before we ever start to print. We'll never go into production until they give us the green light. Really customers should feel super safe that even if they upload their logo and they're not sure is this straight or is this exactly the right position. It doesn't really matter. We're always going to create that mock-up and they can make as many revisions as they want before we start with the print. That makes it really easy. And in terms of our distribution, obviously they can always just add this stuff to inventory and just easily distribute. The process on the front end, we try to make it really effortless and streamlined. Jeremy: It's taken us four years. We're constantly adding more and more features to make that experience better. We're launching a feature very soon called the Company Art Folder. Imagine you bought something and 20 other people in your company buys different things. It should lump all of your artwork together as a company art folder, so you never have to really hunt down the designer to make sure you have the right file or is this the right logo, is this the approved logo for swag? You can always, when you're uploading your logos, select the pre-approved designs that have been used and purchased by other people in your company, so you feel more safe. Or unlike my orders page, let's say Jennifer on your team is out sick one day, you can log into your account, you'll see all of your orders and then there's another tab that says company orders. You see everyone else in your company what they purchased, so you could easily reorder what somebody else ordered and easily subtract and make sure you're using the exact same artwork. Jeremy: We're trying to build this platform as effortless on the front end to make it really, really streamlined. And in terms of getting people through the funnel, what we've seen is our platform really does work well. I think that the more simple features that really solve a problem. And as you mentioned before, Paul said, “Do things unscalable before you scale it.” Every single thing we do, it has to be super painful for us, for us to spend time developing a solution for it. Once it's overwhelmingly painful, then we build the solution to make it easy. Jeremy: Then obviously we see their abandoned carts. We can track everyone's abandoned carts. And then we have our SDRs calling all these abandoned carts within like 10 minutes of the time that they'd been in to make sure that there's no experience that's wrong. Sometimes people say, “The shipping is too high.” Or, “It doesn't seem I can get in my in hand date.” There's certain things that we could actually help out and maneuver possible. And if it's not possible, we'll let the customer know it's not possible. But getting in front of them right when they're thinking about, are they going to purchase or not and they might have issues, that's really, I think we found the most important thing for us. Stephanie: Yeah. That's really smart. Have you seen people pick up their phone right away or have you experimented with texting instead? Jeremy: We haven't done texting and I've been researching some companies and I think it's actually a really good idea. We've seen a lot of people if they're actively looking on our site or they've just left in 10 minutes, they're likely to pick up their phone. Even when people fill out a form on our site and we have a lot of … Obviously our core business up until this point was Ecommerce experience adding it to distribution, but we have a whole ‘nother business where people could buy swag boxes in bulk, giving a really great unboxing experience for new hires or engaging with your best clients. That's fully custom branded boxes inside the boxes, as custom notebook and water bottle and pen and custom note card, crinkle paper. We've allowed people to custom build those boxes effortlessly through our site. Jeremy: All you have to do is upload your logo, the same process as buying and adding to your cart, you click on the button that says, “Build a box,” and it lumps, all those products together as a box listing. It makes the entire experience super simple. And we've seen with those bigger box orders. A lot of times it might be like a two to three week sales cycle. When Ecommerce could be like they land on a site and they check out that same day, boxes, they're fairly larger sizes. Typically, they're usually using our distribution platform for distributing because no one has room in their office for boxes or wants to boxing up themselves. So, they actively use our distribution platform for that. And that cycle takes a little longer. Getting on the phone with them, really talking through the challenges or what their issues are and what their questions are, we find is really, really important. Stephanie: Yeah. That's really great. Oftentimes when you're talking or this can happen, when you're talking to a customer, they don't always tell you exactly what they need. One example you gave there was, you want to be able to go into a library where your logos are there, which is huge. I remember ordering swag back in the day at other companies and it was always kind of a review and escalation process of like, “Is this the most recent logo? Are these the right colors? Is this our team logo?” Okay. How would you find out something like that that maybe a customer wouldn't know to tell you, but it would just make it easier if they did have that there? How do you go about getting in your customer's head? Jeremy: Yeah. I think it's just like just being their teammate. We like to think of in all of our customer success likes to think of it, is that we're an extension of your brand. Obviously if you're buying swag on our site, it has to really be the quality, but it's only going to be quality and only what you like, if the logo is right, the positioning is right, it's exactly what you want. Especially dealing with bigger orders, we like to jump on a call with customers, have a conversation, try to understand what the use case for their swag is, what their budget is, what their timeline is who the audience is. And we like to suggest ideas and obviously customers can go on the site and not talk to us if they want to talk to us and use our filtering tools and our search tools and just our browsable experience and find what they want. Jeremy: But if they want our help, we want to be there to help them. I think it's just constantly trying to understand, the reason for them buying swag and with the use cases. And then we constantly offer different suggested items that we know that we work with that other companies in the similar space have worked with. And we give other solutions for them to kind of play with. And I think it just gives a great experience where they could do their own kind of sourcing and they can also use us as a guiding tool to find them exactly what they're looking for. Stephanie: When you're thinking about getting new customers, what kind of acquisition channels are you using or finding success in right now to get these large companies using you guys? Jeremy: Yeah. I like to think about marketing and it's not always going to be the same traction channel is always going to work. Now, from the very beginning, we were doing a lot more Google ads because we wanted to get paid back fairly quickly and we've realized early on, at least for our business prospecting on Facebook is a little more challenging when you're dealing with B2B buyers. But for Google, when is looking specifically for swag it's quite challenging [inaudible] Google, obviously it's very expensive. In the beginning it maybe makes sense to do Google just to get those early wins and get the credibility. But then maybe you kind of shift away from Google and you do some more SEO. SEO for us has been tremendously successful. We started really diving deep into SEO about 18 months ago, just to put things in perspective. Jeremy: Last January, we had about 3,000 organic visitors to our site, in 2019 January. January of 2020, we had North of 20,000 organic visitors. And last month we had nearly 40,000 organic visitors. Really growing the base in terms of organic, putting out tons of content, always it's content that maybe has stuff to do with swag buyer like buying swag or maybe just has to do with the audience, HR managers, the best HR solution tools. Doesn't necessarily have to be about swag, but it's a valid topic related to the buyer. And then ultimately when the buyer comes on our site, reads about it and then is going to Facebook or Google or any of their other properties, we can re-target them. That's been a really great driving force for us, but also partnerships. Jeremy: There's a lot of different companies in our space that don't necessarily sell swag. They sell other products to the office manager or other products to the HR manager, that we could really parlay and work on. We could promote them to our audience. They could promote us to their audience. We've been trying a lot of different things, affiliate marketing, a lot of different stuff, but usually it's always the one or two kind of traction channels that are the most beneficial at that time. And right now it's SEO [inaudible] hands down has been the best driver of customers for us. Stephanie: Okay. I want to dive into that a bit then, because I hear people are always talking about SEO. There's so many SEO agency, they'll do all this SEO stuff for you. I think there's like tons of bar jokes, maybe not bar jokes. Maybe just be regular jokes about SEO agencies and consultants and stuff. I want to dive into, what are you guys actually doing when it comes to your SEO strategy because it sounds like it's been successful? How are you finding out what topics to write about? What are you seeing work? Give me all the nitty-gritty on what you all are doing behind the scenes. Jeremy: Yeah. I think from the very beginning with SEO, it was about making our site compatible and making it work for Google traffic. Our site, at the very beginning … I'm a branding person. My background is in branding and user experience design for the customer. There's a lot of things that are behind the scenes that Google looks at, that the customers don't even realize. And frankly, it doesn't even mean anything to the customers. I had to learn that. I didn't know anything about that. Frankly, I'm fairly new to SEO. We started really 18 months ago and I realized our organic rankings should be a lot higher based on our brand, based on these experience. We're getting a quality product out there and it should be getting a lot more traffic. The first step was just analyzing our site and realizing, “Well, how do I make the site faster?” Jeremy: Or, “How do I make the site make more sense in terms of Google?” So for example, on every single product page, 18 months ago, we had no other associated products below the fold. Now, most people don't necessarily look at those below the product the fold because they're trying to upload their logo, mock-up things. There's a lot of stuff for them to do on the product builder page to add to cart. But you need to add those other products below the fold, so that in terms of Google, they see that that product listing is connected to four other products or so, right. There's all these small kind of tweaks or theoretically, you want to keep adding and making your site feel refreshed. You're not going to be refreshing your homepage every single two weeks. It doesn't happen. Jeremy: You're not going to be redesigning your product builder page every two weeks or your browsable experience every two weeks. What you can do, is you can maybe put like a blog post in your footer, make it like the latest blog posts. Every time you update your blog, every day or every two days, your site is getting refresh constantly. There's all these kinds of small kind of tweak things that you could do in terms of overall site. And then it's about kind of pinpointing the content that you really want to go after and saying, “Well, who is our buyer?” So, really understanding who your customer is and trying to write really good content, not just like throw away stuff, really good content with great subject lines that get people to read something and learn something, get real value out of it that might not be about swag related, but has to do with swag adjacent, if you will. Jeremy: If someone's looking for office holiday party ideas. They might not be looking for swag, but maybe we could get swag in there somehow. Or best ways to engage your remote employees or something like that. Or what healthy snack food to have in the office, literally has nothing to do with swag, but the person who is looking for that is ultimately going to be looking for swag. And we don't necessarily need to convert them today, at this point, we could convert them a month from now. When they are looking for swag, just be on the top of mind, re-target them and ultimately convert them. Just putting out consistency. I think in general, whether it's SEO or whether it's being a startup founder or whether it's anything you do in life, I think it's just really all about consistency and just trying to have more good days than bad days. Jeremy: Constantly just trying to keep pushing as hard as you can because at the end of the day, you're going to get to a much better place if you're consistent with it, you keep pushing forward and no small setbacks really affect you. Stephanie: Yeah. I completely agree. Are you all doing the content creation and things like that in-house still? Or have you hired that out? At what point would you say like, “Oh, it's about time to hire this out,” to have someone else work on it instead of maybe an entrepreneur doing it all themselves? Jeremy: So, initially it was all me writing the content, then it became use some freelancers and now it's becoming, now we have the resources we're hiring actually this week, a full-time writer for our own team to be writing content and doing all of the stuff that we want to do. I think in everything, it always starts with the founders. Me and my co-founder, I think we've done really ridiculous, crazy things over the last four years to get to where we are. We've driven u-hauls 11 hours making deliveries at 11 o'clock at night. Having my family and my grandma, my aunts and uncles rolling t-shirts for three days straight trying to win these big deals and having no resources to do it. You're always kind of founder, CEO and head intern all at the same time. Now, at this point we're able to hire some of those roles that doesn't really make sense for us to be doing at this point or frankly, people who are just a lot better at it than we are. And that's where we're really excited to get to. Stephanie: I love that. I'm sure your grandma thanks you. Poor grandma, she's a real VIP over there, rolling t-shirts. Jeremy: Yeah. She was making fun that she hurt her back and that's why her back is messed up because of the [inaudible 00:43:33]. Stephanie: All because of you, Jeremy. Jeez. That's great. Before we move into a lightning round, is there anything that you wanted to cover today that I missed? Jeremy: No, no, this has been great. Stephanie: Okay, cool. Yeah, it has been a blast. All right. So, let's move onto the lightning round brought to you by our friends at Salesforce Commerce Cloud. This is where I'm going to throw a question your way and you have a minute or less to answer. Are you ready, Jeremy? Jeremy: I'm ready for it. Stephanie: All right. What's up next on your Netflix queue? Jeremy: Oh, cool. I started watching … I was in the Hamptons this weekend. Stephanie: Fancy. Jeremy: I know, very fancy. My mother-in-law's in town and she wanted to go out. [inaudible] and there's all these ads all over the place for million dollar beach house or something. I think I started watching some real estate show. Stephanie: There you go. I saw that also on Netflix. I was watching Selling Sunset, though I need to finish that one first. Jeremy: That's fine. Stephanie: All right. What is the best promotional item you've ever received? And what's the worst one? Jeremy: Well, okay. The worst one is obviously easy. It's all about the schlocky pens that don't write. Stephanie: Oh my gosh, yes. Jeremy: Pop socket, lighter. There's some of these things, stuff when they do the hybrid stuff, it's just kind of ridiculous. Like the highlighter that also acts as a compass. It's like, “No, that's not the right thing.” So, a lot of those. And a lot of people trying to sell me on selling their stuff and it's not good. Stephanie: You're like, “No, this is no.” Jeremy: Yeah. I don't want to be mean to anybody. I just say, “I don't think it's the right fit,” or something like that, but it's not good. And the best stuff I think is honestly anything really you're going to keep it. A really high quality water bottle, something you're going to see every single day it's could be on your desk and you're going to get those impressions. I'm really proud of that, but obviously we've done bicycles for companies. We made fully custom bicycles. A company came to us and they had their whole executive team. They're very into health. They want to do something a little bit different, unique. They have a campus. We create a really cool custom, fully custom logo, colors, everything bicycles. That was a really cool project to work on. And obviously we've done really cool backpacks. We did a backpack for Facebook, which I thought was really cool where the logo was nowhere on the outside. Jeremy: [inaudible] was we wanted to make the product so kind of premium. These are like very nice backpacks, that it didn't like scream Facebook. No one even know about it, except for the people who are wearing it. So, it was black-on-black logo on the inside of the backpack, so like when you open, only the people who are wearing it, see it. That's, I think is very important. I knew this was going to happen because frankly I started getting a sense that socks were going to become very popular. We sell [inaudible] socks and clearly socks is very … No one really sees it, but it's all about the person who's putting on the socks, is wearing it, who were seeing your logo. It starts to feel that kind of connection to your brand and eventually becomes that brand evangelist. It's all about that internal. Stephanie: Yeah, that's awesome. What is a new Ecommerce tool that you're trying out that you're loving right now? Jeremy: New Ecommerce tool? We're using a company called [Tend 00:46:36]. It's very early in it, but you're able to kind of track all the different people who are coming to your site without them inputting real information, which I think is kind of spooky, but kind of cool, just to see who's checking what. Stephanie: Cool. Jeremy: For me, it's kind of the core stuff. It's the Intercom, it's the HubSpot, it's just the marketing automation, streamlining things. And there's two different things, obviously with Intercom, which is our real life customer success. People are always here to help and jumping on the phone call. Then you have the HubSpot, which is really automating the experience. Having both sides for our type of businesses is very important. Stephanie: Great. All right. The last one, a little bit harder. What one thing will have the biggest impact on Ecommerce in the next year? Jeremy: Wow. Stephanie: Yeah. Jeremy: That's a good question. I'm still laser focused on swag. I don't necessarily always think about the broader industry as a whole. I think for swag, I think it's easy. I think it is swag distribution. Everyone's working remotely. I don't see people getting back into the office anytime soon. Even if they do, it's going to be somewhat of a new normal, maybe not every day. People are still going to be able to need to engage with the remote employees or the best customers. And who's going to want to fly across the country, maybe to that trade show. They might want to do things a little more remote and automated. For Swag, that's where we're going and we're going to be automating the distribution of Swag. I think that's our next phase. Jeremy: Or somebody's one year anniversary, send them automatically Swag in the mail. Or somebody's had a baby, send them Happy Mother's Day or Happy Father's Day type of swag in the mail. So, really automating different life activities where you want swag. Stephanie: Awesome. Love it. All right, Jeremy. This has been a blast. Where can people find out more about you and Swag.com? Jeremy: Yeah. You can obviously reach out to me on LinkedIn Jeremy Parker, and obviously come visit us at Swag.com. That's S-W-A-G.com and we would love to work with you on your next order. Stephanie: Awesome. Thanks so much for joining. Jeremy: Thank you so much for having me, guys.

god tv love ceo new york amazon spotify netflix california tiktok canada europe google marketing pr north poor oprah winfrey seo commerce respond honestly ecommerce vip b2b coca cola giveaways deliver harnessing quotes shopify orders boston university bu happy mother inventory input patagonia happy fathers aws tend hubspot wework swag atlanta hawks staples asi hamptons upfront up next selling sunset intercom pantone jeez paul graham sdrs jesse itzler csv 3pl koozies vouch if you build it marquis jet zico coconut water salesforce commerce cloud stephanie you jeremy you jeremy it stephanie yeah stephanie how jeremy so stephanie there stephanie oh google so

第991期:Best Teacher

Play Episode Listen Later Oct 2, 2020 3:59

更多英语知识，请关注微信公众号: VOA英语每日一听 Jeremy: Abidemi, so looking back at your life as a student, can you tell me a little bit about your experience with the best teacher that you had, and maybe a little bit about your worst teacher?Abidemi: Sure. I think I've had many, many great teachers. I've been really blessed in that way. Thinking back now, I remember my teacher when I was in primary 6 in Nigeria, actually. His name was Mr. Oleaer. And this teacher was a math and science teacher. And I think the best thing about him was he really took a personal interest in all of us. We could tell that he really cared. Although, he was very much a teacher, he took that authority role but at the same time, he was like our friend. Like during the break time, we would approach him, we would talk to him. And over the summer, he tutored a few of us for the entrance examination to high school because we have that system in Nigeria.So it was just fun. Every time we saw him, we just have a good time with him. And I remembered he talked to me. He mentored—he was like a mentor. He came to me, and he was wondering what I wanted to be in the future and he made some suggestions because of the scores I had in his class. And to this day, I still remember him. He sticks in my mind. And having taught for a little bit as well, I think for me too, he's one of the people that I tried to model myself after. I model myself after him. Maybe not even consciously, but I liked the fact that he was a teacher and we knew it but at the same time, he was very friendly. So I really love that aspect of him.How about you, Jeremy? Anyone comes to mind?Jeremy: You know, it was more just sort of teacher qualities that I remember from a number of different teachers. And what I always really—when I look back and think about teachers that made a difference, it was teachers that recognized students' weakness and tried to counsel them or to sort of—in my case, it was that I was a very, very shy person. And a teacher who would talk to me a little bit and give me encouragement if I did a presentation, which was just the worst thing I could possibly ever do because I was so nervous. But teachers that recognized I had a real serious shy problem, and that they would just give a little bit of encouragement. And those with little bits of encouragement, over time I started to gain a little bit more confidence. And by the time I was in university, I still had a problem with giving presentations but—even in university, I had teachers who would compliment me on a job attempted—and I wouldn't really say well done, but I tried.And those were the things that really stood out, teachers that recognize when people have issues in their life and they really try to encourage them to overcome those issues. It wasn't just one teacher. It was a number of teachers. But not all teachers did that. Some of them, it wasn't part of their job description to be a counselor, I guess, and it was more work and they just didn't attempted or didn't care or maybe didn't recognize it. But some people really thought, here's a guy who maybe needs a little bit of extra boost, and they would try to help me out. Those are the probably the things that stick out in my mind. And like you, I try to model myself after those teachers to try to find students who need a little bit of extra encouragement and make them feel better about themselves.Abidemi: I think that's great. I think as teachers, everyone, if you have an opportunity to be a teacher, one of the best things you can do is to try and find the potential in each and every one of your student because everyone has something special in them. And for a teacher to recognize that, it's so awesome. It's amazing. Yeah.Jeremy: I think you're right.

thinking nigeria best teachers jeremy you

第991期:Best Teacher

Play Episode Listen Later Oct 2, 2020 3:59

thinking nigeria best teachers jeremy you

第991期:Best Teacher

Play Episode Listen Later Oct 2, 2020 3:59

thinking nigeria best teachers jeremy you

第981期:Shockingly Different

Play Episode Listen Later Sep 22, 2020 3:15

更多英语知识，请关注微信公众号: VOA英语每日一听Jeremy: So you were 12 when you came to Canada.Abidemi: That's right.Jeremy: Do you remember anything that was either really similar or really different from how teachers taught or from the classroom experience? Were they more or less the same or were they shockingly different?Abidemi: Actually, I think they were different in a lot of ways. For some reason, I had finished my grade 6, which is like primary 6; they called it in Nigeria, which is the end of high school—no, sorry. The end of elementary school in Nigeria.So when I came to Canada, they put me into grade 7 thinking it's the next level. But my level was higher so they put me into grade 8 after that. So I got to skip a grade, which was really great. But I remember in terms of the way of thinking; in my English literature class, we had a conversation and we had to finish a story. And it was like, “Suddenly, something appeared in the sky. What is it?”And all my classmates were, “It's a UFO.” For me, I'd never heard of a UFO. So they were like it's a UFO. It makes somebody disappear, it picks up somebody, and one of their classmate and another person disappears.But in my essay, I'm like—it was just weird. I had a totally different way of thinking and processing things. And that really surprised my teachers because when they wrote my composition, they compared it to the rest of the class, to my classmates and they were like, “Wow, this is interesting that you're the same age and we're all speaking the same language, but the way we view things is very different.”And I guess, to me as well, it was my first exposure to cultural differences. And then after that, many years later, coming to Japan and being different again, in a different setting, it made me think back to that time thinking, wow, you can't always expect the same things. Differences come in weird places sometimes.Jeremy: I see.Abidemi: So yeah, that's one thing I remember. But a lot of things were the same. We are the same, and yeah, different.Jeremy: So Abidemi, just in terms of sort of the simple things in life, when you came to Canada, were there any foods or drinks that you were particularly fond of right away or thought were particularly strange?Abidemi: Hmm, very good question. Food, a lot of food was all right, right off the bat, like we didn't have any difficulties with that. But one thing I do remember was nacho chips. We were not used to cheese, the taste of cheese. So my sisters and I, what we would do is we would buy nacho chips, like, Doritos—I don't know if everyone knows that—and we would wash it because the cheesy taste was strange. So we would wash it before eating it and then we would eat it. And I have some friends who will just stare at us, like, “What's the point? Why would you buy that flavor of chips? That's the whole point, to taste that.” But we were like, “No, we can't eat this.”But eventually now, I love Doritos. I have grown accustomed to the taste.Jeremy: You probably made it healthier though.Abidemi: That's true.

canada english japan food ufos nigeria differences doritos shockingly jeremy you

第981期:Shockingly Different

Play Episode Listen Later Sep 22, 2020 3:15

canada english japan food ufos nigeria differences doritos shockingly jeremy you

第981期:Shockingly Different