Tianqi Chen is an Assistant Professor in the Machine Learning Department and Computer Science Department at Carnegie Mellon University and the Chief Technologist of OctoML. His research focuses on the intersection of machine learning and systems. Tianqi's PhD thesis, "Scalable and Intelligent Learning Systems," was completed in 2019 at the University of Washington. We discuss his influential work on machine learning systems, starting with the development of XGBoost, an optimized distributed gradient boosting library that has had an enormous impact in the field. We also cover his contributions to deep learning frameworks like MXNet and machine learning compilation with TVM, and connect these to modern generative AI.
- Episode notes: www.wellecks.com/thesisreview/episode48.html
- Follow the Thesis Review (@thesisreview) and Sean Welleck (@wellecks) on Twitter
- Follow Tianqi Chen on Twitter (@tqchenml)
- Support The Thesis Review at www.patreon.com/thesisreview or www.buymeacoffee.com/thesisreview
We have just announced our first set of speakers at AI Engineer Summit! Sign up for the livestream or email sponsors@ai.engineer if you'd like to support.

We are facing a massive GPU crunch. As both startups and VCs hoard Nvidia GPUs like countries count nuclear stockpiles, tweets about GPU shortages have become increasingly common. But what if we could run LLMs with AMD cards, or without a GPU at all? There's just one weird trick: compilation. And there's one person uniquely qualified to do it.

We had the pleasure to sit down with Tianqi Chen, who's an Assistant Professor at CMU, where he both teaches the MLC course and runs the MLC group. You might also know him as the creator of XGBoost, Apache TVM, and MXNet, as well as the co-founder of OctoML. The MLC (short for Machine Learning Compilation) group has released a lot of interesting projects:
* MLC Chat: an iPhone app that lets you run models like RedPajama-3B and Vicuna-7B on-device. It gets up to 30 tok/s!
* Web LLM: Run models like LLaMA-70B in your browser (!!) to offer local inference in your product.
* MLC LLM: a framework that allows any language model to be deployed natively on different hardware and software stacks.

The MLC group has just announced new support for AMD cards; we previously talked about the shortcomings of ROCm, but using MLC you can get performance very close to NVIDIA's counterparts. This is great news for founders and builders, as AMD cards are more readily available. Here are their latest results on AMD's 7900s vs some of the top NVIDIA consumer cards.

If you just can't get a GPU at all, MLC LLM also supports ARM and x86 CPU architectures as targets by leveraging LLVM. While the speed isn't comparable, it allows non-time-sensitive inference to be run on commodity hardware.

We also enjoyed getting a peek into TQ's process, which involves a lot of sketching.

With all the other work going on in this space with projects like ggml and Ollama, we're excited to see GPUs becoming less and less of an issue in getting models into the hands of more people, and to see innovative software solutions to hardware problems!

Show Notes
* TQ's Projects:
* XGBoost
* Apache TVM
* MXNet
* MLC
* OctoML
* CMU Catalyst
* ONNX
* GGML
* Mojo
* WebLLM
* RWKV
* HiPPO
* Tri Dao's Episode
* George Hotz Episode
People:
* Carlos Guestrin
* Albert Gu

Timestamps
* [00:00:00] Intros
* [00:03:41] The creation of XGBoost and its surprising popularity
* [00:06:01] Comparing tree-based models vs deep learning
* [00:10:33] Overview of TVM and how it works with ONNX
* [00:17:18] MLC deep dive
* [00:28:10] Using int4 quantization for inference of language models
* [00:30:32] Comparison of MLC to other model optimization projects
* [00:35:02] Running large language models in the browser with WebLLM
* [00:37:47] Integrating browser models into applications
* [00:41:15] OctoAI and self-optimizing compute
* [00:45:45] Lightning Round

Transcript

Alessio: Hey everyone, welcome to the Latent Space podcast. This is Alessio, Partner and CTO in Residence at Decibel Partners, and I'm joined by my co-host Swyx, writer and editor of Latent Space. [00:00:20]Swyx: Okay, and we are here with Tianqi Chen, or TQ as people call him, who is an assistant professor in machine learning and computer science at CMU, Carnegie Mellon University, also helping to run the Catalyst Group, also chief technologist of OctoML. You wear many hats. Are those, you know, your primary identities these days? Of course, of course. [00:00:42]Tianqi: I'm also, you know, very enthusiastic about open source.
So I'm also a VP and PMC member of the Apache TVM project and so on. But yeah, these are the things I've been up to so far. [00:00:53]Swyx: Yeah. So you did Apache TVM, XGBoost, and MXNet, and we can cover any of those in any amount of detail. But maybe what's one thing about you that people might not learn from your official bio or LinkedIn, you know, on the personal side? [00:01:08]Tianqi: Let me say, yeah, so normally, even though I'm trying to run all those things, I really love coding. So one thing that I keep as a habit is sketchbooks. I have real sketchbooks to draw out the design diagrams, and I've kept sketching over the years, and now I have like three or four of them. And it's usually a fun experience of thinking the design through, and also seeing how an open source project evolves, and also looking back at the sketches that we had in the past to say, you know, all these ideas really turned into code nowadays. [00:01:43]Alessio: How many sketchbooks did you get through to build all this stuff? I mean, if one person alone built just one of those projects, they'd be a very accomplished engineer. You built like three of these. What's that process like for you? Is the sketchbook, like, the start, and then you think about the code, or... [00:01:59]Swyx: Yeah. [00:02:00]Tianqi: So usually I start sketching on high-level architectures, and for a project that runs over years, we also start to think about, you know, new directions. Like, of course, generative AI and language models come in; how is it going to evolve? So normally I would say it takes like one book a year, roughly at that rate. I find it's much easier to sketch things out, and then it gives a more high-level architectural guide for some of the future items. Yeah. [00:02:28]Swyx: Have you ever published these sketchbooks? Cause I think people would be very interested, at least on a historical basis. Like, this is the time where XGBoost was born, you know? Yeah, not really. [00:02:37]Tianqi: I started sketching after XGBoost. So that's a kind of missing piece, but a lot of design details in TVM are actually part of the books that I try to keep a record of. [00:02:48]Swyx: Yeah, we'll try to get them published, maybe in a journal. Maybe you can grab a little snapshot for a visual aid. Sounds good. [00:02:57]Alessio: Yeah. And yeah, talking about XGBoost, a lot of people in the audience might know it's a gradient boosting library, probably the most popular out there. And it became super popular because many people started using it in machine learning competitions. And I think there's like a whole Wikipedia page of state-of-the-art models that use XGBoost, and it's a really long list. When you were working on it... So we just had Tri Dao, who's the creator of FlashAttention, on the podcast. And I asked him this question: when you were building FlashAttention, did you know that almost every transformer-based model would use it? And so I asked the same question to you: when you were coming up with XGBoost, could you predict it would be so popular, or what was the creation process? And when you published it, what did you expect? We had no idea. [00:03:41]Tianqi: Like, actually, the original reason that we built that library is that at that time, deep learning had just come out. Like, that was the time when AlexNet just came out.
And one of the ambitious missions that myself and my advisor, Carlos Guestrin, had then was that we wanted to test a hypothesis: can we find alternatives to deep learning models? Because then, you know, there are other alternatives like, you know, support vector machines, linear models, and of course, tree-based models. And our question was, if you build those models and feed them with big enough data, because usually one of the key characteristics of deep learning is that it's taking a lot of data, right? [00:04:23]Tianqi: Then we would be able to get the same amount of performance. That's the hypothesis we set out to test. Of course, if you look at now, right, that's a wrong hypothesis, but as a byproduct, what we found out is that, you know, most of the gradient boosting libraries out there were not efficient enough for us to test that hypothesis. So I happened to have quite a bit of experience in the past of building gradient boosting trees and their variants. So XGBoost was kind of like a byproduct of that hypothesis testing. At that time, I was also competing a bit in data science challenges, like I worked on KDD Cup, and then Kaggle kind of became bigger, right? So I kind of thought maybe it's becoming useful to others. One of my friends convinced me to try to do a Python binding of it. That turned out to be a very good decision, right, to be effective. Usually when I build something, we feel like maybe a command line interface is okay. And once we had a Python binding, we had R bindings. And then, you know, it started getting interesting. People started contributing different perspectives, like visualization and so on. So we started to push a bit more on building distributed support to make sure it works on any platform and so on. And even at that point, when I talked to Carlos, my advisor, later, he said he never anticipated that we would get to that level of success. And actually, why I pushed for gradient boosting trees, interestingly, at that time, he also disagreed. He thought that maybe we should go for kernel machines then. And it turns out, you know, actually, we were both wrong in some sense, and deep neural networks were the king of the hill. But at least the gradient boosting direction got into something fruitful. [00:06:01]Swyx: Interesting. [00:06:02]Alessio: I'm always curious when it comes to these improvements, like, what's the design process in terms of coming up with them? And how much of it is collaborative with other people that you're working with, versus, you know, obviously, in academia, it's very paper-driven, research-driven. [00:06:19]Tianqi: I would say the XGBoost improvements at that time were more about, you know, me trying to figure things out, right, but combining lessons. Before that, I did work on some other libraries on matrix factorization. That was my first open source experience. Nobody knew about it, because you'll find, likely, if you go and try to search for the package SVDFeature, you'll find some SVN repo somewhere. But it's actually being used for some of the recommender system packages. So I was trying to apply some of the previous lessons there and trying to combine them. The later projects like MXNet and then TVM were much, much more collaborative, in a sense that... But, of course, XGBoost has become bigger, right? So when we started that project, it was myself, and then, it's really amazing to see people come in.
Michael, who was a lawyer and now works in the AI space as well, contributed visualizations. Now we have people from our community contributing different things. So XGBoost even today, right, it's a community of committers driving the project. So it's definitely something collaborative, moving forward on getting some of the things continuously improved for our community. [00:07:37]Alessio: Let's talk a bit about TVM too, because we got a lot of things to run through in this episode. [00:07:42]Swyx: I would say that at some point, I'd love to talk about this comparison between XGBoost, or tree-based type AI or machine learning, compared to deep learning, because I think there is a lot of interest around, I guess, merging the two disciplines, right? And we can talk more about that. I don't know where to insert that, by the way, so we can come back to it later. Yeah. [00:08:04]Tianqi: Actually, what I said about when we tested the hypothesis, the hypothesis is, I would say, partially wrong, because the hypothesis we would want to test now is, can you run tree-based models on image classification tasks, where deep learning is certainly a no-brainer right now today, right? [00:08:18]Tianqi: But if you try to run it on tabular data, still, you'll find that most people opt for tree-based models. And there's a reason for that, in the sense that when you are looking at tree-based models, the decision boundaries are naturally rules that you're looking at, right? And they also have nice properties, like being agnostic to the scale of the input and being able to automatically compose features together. And I know there are attempts at building neural network models that work for tabular data, and I also sometimes follow them. I do feel like it's good to have a bit of diversity in the modeling space. Actually, when we're building TVM, we build cost models for the programs, and actually we are using XGBoost for that as well. I still think tree-based models are going to be quite relevant, because first of all, it's really easy to get them to work out of the box. And also, you will be able to get a bit of interpretability, and control monotonicity, and so on. [00:09:19]Tianqi: So yes, it's still going to be relevant. I also sometimes keep coming back to think about, are there possible improvements that we can build on top of these models? And definitely, I feel like it's a space that can have some potential in the future. [00:09:34]Swyx: Are there any current projects that you would call out as promising in terms of merging the two directions? [00:09:41]Tianqi: I think there are projects that try to bring a transformer-type model to tabular data. I don't remember the specifics of them, but I think even nowadays, if you look at what people are using, tree-based models are still one of their toolkits. So I think maybe eventually it's not even a replacement; it will be just an ensemble of models that you can call.
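To make the tabular point concrete: the out-of-the-box behavior and monotonicity control TQ mentions are exposed directly in the xgboost Python package. A minimal sketch on synthetic data (the dataset and hyperparameters here are purely illustrative):

```python
import numpy as np
import xgboost as xgb

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))
# Target increases monotonically with feature 0; features 1 and 2 are noise.
y = 2.0 * X[:, 0] + rng.normal(scale=0.1, size=1000)

model = xgb.XGBRegressor(
    n_estimators=200,
    max_depth=4,
    # Constrain the learned function to be non-decreasing in feature 0,
    # unconstrained in the other two: the monotonicity control mentioned above.
    monotone_constraints=(1, 0, 0),
)
model.fit(X, y)
print(model.predict(X[:5]))
```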
Perfect. [00:10:07]Alessio: Next up, about three years after XGBoost, you built this thing called TVM, which is now a very popular compiler framework for models. So this came out at about the same time as ONNX. I think it would be great if you could give a little bit of an overview of how the two things work together, because it's kind of like the model goes to ONNX, then goes to TVM. But I think a lot of people don't understand the nuances. Can we get a bit of a backstory on that? [00:10:33]Tianqi: So actually, that's kind of ancient history. Before XGBoost, I worked on deep learning for two or three years. I got a master's before I started my PhD. And during my master's, my thesis focused on applying convolutional restricted Boltzmann machines to ImageNet classification. That is the thing I was working on, and that was before the AlexNet moment. So effectively, I had to handcraft NVIDIA CUDA kernels on, I think, a GTX 2070 card. It took me about six months to get one model working. And eventually, that model was not so good, and we should have picked a better model. But that was like ancient history that really got me into this deep learning field. And of course, eventually, we found it didn't work out. So in my master's, I ended up working on recommender systems, which got me a paper, and I applied and got a PhD. But I always wanted to come back to work on the deep learning field. So after XGBoost, I started to work with some folks on MXNet. At that time, it was the era of frameworks like Caffe and Theano; PyTorch hadn't yet come out. And we were really working hard to optimize for performance on GPUs. At that time, I found it's really hard, even for NVIDIA GPUs. It took me six months. And then it's amazing to see, across different hardware, how hard it is to go and optimize code for the platforms that are interesting. So that got me thinking: can we build something more generic and automatic, so that I don't need an entire team of so many people to go and build those frameworks? So that's the motivation for starting to work on TVM. There is really too much machine learning engineering needed to support deep learning models on the platforms that we're interested in. I think it started a bit earlier than ONNX, but once ONNX got announced, I think they were in a similar time period. So overall, how it works is that TVM will be able to take a subset of machine learning programs that are represented in what we call a computational graph. Nowadays, we can also represent loop-level programs ingested from your machine learning models. Usually, you have model formats like ONNX, or in PyTorch, they have the FX tracer that allows you to trace the FX graph. And then it goes through TVM. We also realized that, well, yes, it needs to be more customizable, so it will be able to perform some of the compilation optimizations, like fusing operators together, doing smart memory planning, and more importantly, generating low-level code. So that works for NVIDIA and is also portable to other GPU backends, even non-GPU backends out there. [00:13:37]Tianqi: So that's a project that has actually been my primary focus over the past few years. And it's great to see how it started from where, I think, we were the very early initiators of machine learning compilation. I remember one day, one of the students asked me, are you still working on deep learning frameworks? I told them that I'm working on ML compilation. And they said, okay, compilation, that sounds very ancient. It sounds like a very old field. And why are you working on this? And now it's starting to get more traction, like if you look at Torch Compile and other things. I'm really glad to see this field starting to pick up. And also we have to continue innovating here. [00:14:17]Alessio: I think the other thing that I noticed is, it's kind of a big jump in terms of area of focus to go from XGBoost to TVM; it's kind of like a different part of the stack.
Why did you decide to do that? And I think the other thing about compiling to different GPUs and eventually CPUs too: did you already see some of the strain that models could have from being focused on one runtime, only being on CUDA, and how much of that went into it? [00:14:50]Tianqi: I think it's less about trying to get impact, more about wanting to have fun. I like to hack code; I had great fun hacking CUDA code. Of course, being able to generate CUDA code is cool, right? But now, after being able to generate CUDA code, okay, by the way, you can do it on other platforms, isn't that amazing? So it's more of that attitude that got me started on this. And also, I think when we look at different researchers, I am more of a problem solver type. So I like to look at a problem and say, okay, what kind of tools do we need to solve that problem? So regardless, it could be building better models. For example, while we built XGBoost, we built certain regularizations into it so that it's more robust. It also means building system optimizations, writing low-level code, maybe trying to write assembly and build compilers and so on. So as long as they solve the problem, definitely go and try to do them together. And I also see it's a common trend right now. Like, if you want to be able to solve machine learning problems, it's no longer just at the algorithm layer, right? You kind of need to solve it from the algorithm, data, and systems angles. And this entire field of machine learning systems, I think it's kind of emerging. And there's now a conference around it. And it's really good to see a lot more people starting to look into this. [00:16:10]Swyx: Yeah. Are you talking about ICML or something else? [00:16:13]Tianqi: So machine learning and systems, right? So not only machine learning, but machine learning and systems. So there's a conference called MLSys. It's definitely a smaller community than ICML, but I think it's also an emerging and growing community where people are talking about the implications of building systems for machine learning, right? And how do you go and optimize things around that and co-design models and systems together? [00:16:37]Swyx: Yeah. And you were area chair for ICML and NeurIPS as well. So you've just had a lot of conference and community organization experience. Is that also an important part of your work? [00:16:48]Tianqi: Well, it's kind of expected for an academic. If I hold an academic job, I need to do service for the community. Okay, great. [00:16:53]Swyx: Your most recent venture in ML systems is going to the phone with MLC LLM. You announced this in April. I have it on my phone. It's great. I'm running Llama 2, Vicuna. I don't know what other models you offer. But maybe just kind of describe your journey into MLC. And I don't know how this coincides with your work at CMU. Is that some kind of outgrowth? [00:17:18]Tianqi: I think it's more like a focused effort that we want in the area of machine learning compilation. So it's kind of related to what we built in TVM. So when we built TVM, that was five years ago, right? And a lot of things happened. We built the end-to-end machine learning compiler that works, the first one that works. But then we captured a lot of lessons there. So now we are building a second iteration called TVM Unity. That allows ML engineers to quickly capture new models and build optimizations for them on demand. And MLC LLM is kind of like...
It's more like a vertically driven effort, where we go and build tutorials and projects, like bringing LLMs to solutions, so that we can really show, okay, you can take machine learning compilation technology, apply it, and bring something fun forward. Yeah. So yes, it runs on phones, which is really cool. But the goal here is not only making it run on phones, right? The goal is making it deploy universally. So we do run the 70 billion models on Apple M2 Macs. Actually, on single-batch inference, more recently on CUDA, we get, I think, the best performance you can get out there on 4-bit inference. Actually, as I alluded to earlier before the podcast, we just had a result on AMD. And on a single batch, actually, we can get the latest AMD GPU, this is a consumer card, to about 80% of the 4090, NVIDIA's best consumer card out there. So it's not yet on par, but thinking about the diversity it enables, and the price you can get that card at, it's really amazing what you can do with this kind of technology. [00:19:10]Swyx: So one thing I'm a little bit confused by is that most of these models are in PyTorch, but you're running this inside TVM. I don't know. Was there any fundamental change that you needed to do, or was this basically the fundamental design of TVM? [00:19:25]Tianqi: So the idea is that, of course, it comes back to program representation, right? So effectively, TVM has this program representation called TVMScript that contains both computational graph and operator-level representations. So yes, initially, we do need to take a bit of effort to bring those models onto the program representation that TVM supports. Usually, there is a mix of ways, depending on the kind of model you're looking at. For example, for vision models and stable diffusion models, usually we can just do tracing that takes a PyTorch model onto TVM. That part is still being robustified so that we can bring more models in. On language model tasks, actually, what we do is we directly build some of the model constructors and try to directly map from Hugging Face models. The goal is, if you have a Hugging Face configuration, we will be able to bring that in and apply optimizations on it. So one fun thing about model compilation is that your optimization doesn't happen only at the source language, right? For example, if you're writing PyTorch code, you just go and try to use a better fused operator at the source code level. Torch Compile might help you do a bit of things in there. In most model compilation, it not only happens at the beginning stage, but we also apply generic transformations in between, also through a Python API. So you can tweak some of that. So that part of the optimization helps a lot in uplifting both performance and portability across environments. And another thing that we do have is what we call universal deployment. So if you get the ML program into this TVMScript format, where there are functions that take in tensors and output tensors, we will be able to have a way to compile it, so that you will be able to load the function in any of the language runtimes that TVM supports. So you could load it in JavaScript, and it becomes a JavaScript function that takes in tensors and outputs tensors. You can load it in Python, of course, and C++ and Java. So the goal there is really to bring the ML model to the language that people care about and be able to run it on the platform they like.
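As a rough illustration of the trace-compile-load flow TQ describes, here is a sketch using TVM's documented Relay front end. The newer TVM Unity/Relax pipeline that MLC LLM builds on follows the same shape, but this is not the exact MLC code path, and the model and input names are our own choices:

```python
import torch
import torchvision
import tvm
from tvm import relay
from tvm.contrib import graph_executor

# 1. Trace a PyTorch vision model into a graph-level program.
model = torchvision.models.resnet18(weights=None).eval()
example = torch.randn(1, 3, 224, 224)
scripted = torch.jit.trace(model, example)

# 2. Import the trace into TVM's program representation.
mod, params = relay.frontend.from_pytorch(scripted, [("input", (1, 3, 224, 224))])

# 3. Compile for a target. "llvm" means local CPU; this is the knob where
#    "cuda", "metal", "vulkan", or "webgpu" would slot in instead.
lib = relay.build(mod, target="llvm", params=params)

# 4. Load and invoke the compiled function from the host-language runtime.
dev = tvm.cpu()
rt = graph_executor.GraphModule(lib["default"](dev))
rt.set_input("input", example.numpy())
rt.run()
print(rt.get_output(0).numpy().shape)  # (1, 1000)
```

The same compiled artifact idea is what lets the function be loaded from JavaScript, C++, or Java runtimes rather than only Python.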
[00:21:37]Swyx: It strikes me that I've talked to a lot of compiler people, but you don't have a traditional compiler background. You're inventing your own discipline called machine learning compilation, or MLC. Do you think that this will be a bigger field going forward? [00:21:52]Tianqi: First of all, I do work with people working on compilation as well. So we're also taking inspiration from a lot of early innovations in the field. Like, for example, TVM initially took a lot of inspiration from Halide, which is an image processing compiler. And of course, since then, we have evolved quite a bit to focus on machine learning related compilation. If you look at some of our conference publications, you'll find that machine learning compilation is already kind of a subfield. So if you look at papers in machine learning venues, the MLSys conference of course, and also systems venues, every year there will be papers around machine learning compilation. And in the compiler conference called CGO, there's a C4ML workshop that is also trying to focus on this area. So definitely it's already starting to gain traction and become a field. I wouldn't claim that I invented this field, but definitely I helped to work with a lot of folks there. And I try to bring a perspective, of course, trying to learn a lot from compiler optimizations, as well as trying to bring knowledge of machine learning and systems together. [00:23:07]Alessio: So we had George Hotz on the podcast a few episodes ago, and he had a lot to say about AMD and their software. So when you think about TVM, are you still restricted in a way by the performance of the underlying kernel, so to speak? If your target is like a CUDA runtime, you still get better performance, no matter how TVM helps you get there; that level you don't take care of, right? [00:23:35]Tianqi: There are two parts in here, right? So first of all, there is the lower-level runtime, like the CUDA runtime. And then actually, for NVIDIA, a lot of the moat comes from their libraries, like CUTLASS, cuDNN, right? Those library optimizations. And also, for specialized workloads, actually you can specialize them. Because in a lot of cases, you'll find that if you go and do benchmarks, it's very interesting. Like two years ago, if you tried to benchmark ResNet, for example, usually the NVIDIA library gives you the best performance. [00:24:06]Tianqi: It's really hard to beat them. But as soon as you start to change the model to something, maybe a bit of a variation of ResNet, not for the traditional ImageNet detections, but for latent detection and so on, there will be some room for optimization, because people sometimes overfit to benchmarks. These are people who go and optimize things, right? So people overfit the benchmarks. So that's the largest barrier: being able to get low-level kernel libraries, right? In that sense, the goal of TVM is actually that we try to have a generic layer that will, of course, leverage libraries when available, but also be able to automatically generate libraries when possible. [00:24:46]Tianqi: So in that sense, we are not restricted by the libraries that they have to offer. That's why we are able to run on Apple M2 or WebGPU, where there's no library available, because we are kind of automatically generating libraries. That makes it easier to support less well-supported hardware, right? For example, WebGPU is one example.
From a runtime perspective, AMD, I think before, their Vulkan driver was not very well supported. Recently, they are getting good. But even before that, we were able to support AMD through this GPU graphics backend called Vulkan, which is not as performant, but it gives you decent portability across those hardware. [00:25:29]Alessio: And I know we got other MLC stuff to talk about, like WebLLM, but I want to wrap up on the optimizations that you're doing. So there are kind of four core things, right? Kernel fusion, which we talked a bit about in the FlashAttention episode and the tinygrad one, memory planning, and loop optimization. I think those are pretty, you know, self-explanatory. I think those are the ones that people have the most questions about; can you quickly explain them? [00:25:54]Tianqi: So there are kind of different things, right? Kernel fusion means that, you know, if you have an operator like a convolution, or in the case of a transformer like an MLP, you have other operators that follow it, right? You don't want to launch two GPU kernels. You want to be able to put them together in a smart way, right? And as for memory planning, it's more about, you know, hey, if you run Python code, every time you generate a new array, you are effectively allocating a new piece of memory, right? Of course, PyTorch and other frameworks try to optimize that for you, so there is a smart memory allocator behind the scenes. But actually, in a lot of cases, it's much better to statically allocate and plan everything ahead of time. And that's where a compiler can come in. First of all, actually, for language models, it's much harder, because of dynamic shapes. So you need to be able to do what we call symbolic shape tracing. So we have a symbolic variable that tells you the shape of the first tensor is n by 12, and the shape of the third tensor is also n by 12, or maybe it's n times 2 by 12. Although you don't know what n is, right, you will be able to know that relation and use it to reason about fusion and other decisions. So besides this, I think loop transformation is quite important. And it's actually non-traditional. Originally, if you simply write code and you want to get performance, it's very hard. For example, you know, if you write a matrix multiply, the simplest thing you can do is the triple loop: for i, j, k: C[i][j] += A[i][k] * B[k][j]. But that code is 100 times slower than the best available code that you can get. So we do a lot of transformation, like being able to take the original code, trying to put things into shared memory, making use of tensor cores, making use of memory copies, and all this. Actually, all these things, we also realize that, you know, we cannot do all of them. So we also make the ML compilation framework a Python package, so that people will be able to continuously improve that part of the engineering in a more transparent way. We find that's very useful, actually, for us to be able to get good performance very quickly on some of the new models. Like when Llama 2 came out, we were able to go and look at the whole thing, see here's the bottleneck, and go and optimize those.
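To see how large the gap TQ mentions really is, here's a self-contained comparison of the naive triple loop against a tuned BLAS call via NumPy. This is CPU-side Python rather than GPU shared-memory tiling, but it illustrates the same order-of-magnitude point:

```python
import time
import numpy as np

n = 128
A = np.random.rand(n, n)
B = np.random.rand(n, n)

def naive_matmul(A, B):
    # The "for i, j, k" loop from the conversation, written out in full.
    rows, inner, cols = A.shape[0], A.shape[1], B.shape[1]
    C = np.zeros((rows, cols))
    for i in range(rows):
        for j in range(cols):
            for k in range(inner):
                C[i, j] += A[i, k] * B[k, j]
    return C

start = time.perf_counter()
C_naive = naive_matmul(A, B)
naive_s = time.perf_counter() - start

start = time.perf_counter()
C_blas = A @ B  # dispatches to a tuned BLAS kernel
blas_s = time.perf_counter() - start

assert np.allclose(C_naive, C_blas)
print(f"naive: {naive_s:.3f}s, BLAS: {blas_s:.6f}s, speedup: {naive_s / blas_s:.0f}x")
```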
[00:28:10]Alessio: And then the fourth one being weight quantization. So everybody wants to know about that. And just to give people an idea of the memory saving, if you're doing FP32, it's like four bytes per parameter; int8 is like one byte per parameter. So you can really shrink down the memory footprint. What are some of the trade-offs there? How do you figure out what the right target is? And what are the precision trade-offs, too? [00:28:37]Tianqi: Right now, a lot of people mostly use int4 for language models. So that really shrinks things down a lot. And more recently, actually, we started to think that, at least in MLC, we don't want to have a strong opinion on what kind of quantization we want to bring, because there are so many researchers in the field. So what we can do is allow developers to customize the quantization they want, but we still bring the optimized code for them. So we are working on this item called bring-your-own-quantization. In fact, hopefully MLC will be able to support more quantization formats. And definitely, I think it's an open field that's being explored. Can you bring in more sparsity? Can you quantize activations as much as possible, and so on? And it's going to be something that's going to be relevant for quite a while. [00:29:27]Swyx: You mentioned something I wanted to double back on, which is that most people use int4 for language models. This is actually not obvious to me. Are you talking about the GGML type people, or are even the researchers who are training the models also using int4? [00:29:40]Tianqi: Sorry, so I'm mainly talking about inference, not training, right? So when you're doing training, of course, int4 is harder, right? Maybe you could do some form of mixed precision for inference. I think int4 is kind of like, in a lot of cases, you will be able to get away with int4. And actually, that does bring a lot of savings in terms of the memory overhead, and so on.
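The arithmetic behind those numbers, as a quick sketch (this ignores activations, the KV cache, and quantization scale metadata, which add real overhead in practice):

```python
# Bytes per parameter times parameter count; weights only.
PARAMS = 7e9  # a 7B-parameter model, as a concrete example

for name, bytes_per_param in [("fp32", 4), ("fp16", 2), ("int8", 1), ("int4", 0.5)]:
    print(f"{name:>5}: {PARAMS * bytes_per_param / 1e9:5.1f} GB")
# fp32: 28.0 GB | fp16: 14.0 GB | int8: 7.0 GB | int4: 3.5 GB
```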
[00:30:09]Alessio: Yeah, that's great. Let's talk a bit about maybe GGML, and then there's Mojo. How should people think about MLC? How do all these things play together? I think GGML is focused on model-level re-implementation and improvements. Mojo is a language, a superset of Python. You're more at the compiler level. Do you all work together? Do people choose between them? [00:30:32]Tianqi: So I think in this case, it's great to see the ecosystem become so rich, with so many different approaches. So in our case, GGML is more like you're implementing something from scratch in C, right? So that gives you the ability to go and customize each particular hardware backend. But then you will need to write your own CUDA kernels, and write optimally for AMD, and so on. So the kind of engineering effort is a bit more broadened in that sense. Mojo, I have not looked at the specific details yet. I think it's good to start by saying it's a language, right? I believe there will also be machine learning compilation technologies behind it. So it's fair to say it sits in an interesting place there. In the case of MLC, our take is that we do not want to have an opinion on how, where, or in which language people want to develop and deploy, and so on. And we also realize that actually there are two phases. You want to be able to develop and optimize your model. By optimization, I mean really bringing in the best CUDA kernels and doing some of the machine learning engineering in there. And then there's a phase where you want to deploy it as a part of the app. So if you look at the space, you'll find that GGML is more like, I'm going to develop and optimize in the C language, and the other low-level languages they have. And Mojo is that you want to develop and optimize in Mojo, right? And you deploy in Mojo. In fact, that's the philosophy they want to push for. In the MLC case, we find that actually, if you want to develop models, the machine learning community likes Python. Python is the language that you should focus on. So in the case of MLC, we really want to be able to enable not only just defining your model in Python, that's very common, right? But also doing ML optimization, like engineering optimization, CUDA kernel optimization, memory planning, all those things in Python, which makes it customizable and so on. But when you do deployment, we realize that people want a bit of a universal flavor. If you are a web developer, you want JavaScript, right? If you're maybe an embedded systems person, maybe you would prefer C++ or C or Rust. And people sometimes do like Python in a lot of cases. So in the case of MLC, we really want to have this vision of: you build a generic optimization in Python, then you deploy that universally onto the environments that people like. [00:32:54]Swyx: That's a great perspective and comparison, I guess. One thing I wanted to make sure that we cover is that I think you are one of this emerging set of academics that also very much focus on your artifacts of delivery. Of course. Something we talked about with Tri Dao as well, that he was very focused on his GitHub. And obviously you treated XGBoost like a product, you know? And then now you're publishing an iPhone app. Okay. Yeah. What is your thinking about academics getting involved in shipping products? [00:33:24]Tianqi: I think there are different ways of making impact, right? Definitely, you know, there are academics that are writing papers and building insights for people, so that people can build products on top of them. In my case, I think in the particular field I'm working on, machine learning systems, I feel like we really need to be able to get it into the hands of people, so that we really see the problem, right? And we show that we can solve a problem. And it's a different way of making impact. And there are academics doing similar things. Like, you know, if you look at some of the people from Berkeley, right? Every few years, they will come up with big open source projects. Certainly, I think it's just a healthy ecosystem to have different ways of making impact. And I feel like really being able to do open source and work with the open source community is really rewarding, because we have a real problem to work on when we build our research. Actually, that research comes together, and people will be able to make use of it. And we also start to see interesting research challenges that we wouldn't otherwise see, right, if we were just trying to do a prototype and so on. So I feel like it's one interesting way of making impact, making contributions. [00:34:40]Swyx: Yeah, you definitely have a lot of impact there. And having experience publishing Mac stuff before, the Apple App Store is no joke. It is the hardest compilation, human compilation effort. So one thing that we definitely wanted to cover is running in the browser. You have a 70 billion parameter model running in the browser. That's right. Can you just talk about how? Yeah, of course. [00:35:02]Tianqi: So I think there are a few elements that need to come in, right? First of all, you know, we do need a MacBook, the latest one, like an M2 Max, because you need the memory to be big enough to cover that. So for a 70 billion model, it takes you about, I think, 50 gigabytes of RAM.
So the M2 Max, the upper version, will be able to run it, right? And it also leverages machine learning compilation. Again, what we are doing is the same: whether it's running on an iPhone, on server cloud GPUs, on AMD, or on a MacBook, we all go through that same MLC pipeline. Of course, in certain cases, maybe we'll do a bit of customized iteration for one of them. And then it runs in the browser runtime, this package called WebLLM. So what we do is we take that original model and compile it to what we call WebGPU, and then WebLLM will pick it up. And WebGPU is this latest GPU technology that major browsers are shipping right now. So you can get it in Chrome already. It allows you to access your native GPUs from a browser. And then effectively, that language model is just invoking the WebGPU kernels through there. So actually, when Llama 2 came out, initially, we asked the question: can you run 70 billion on a MacBook? That was the question we were asking. So first, we actually... Jin Lu, who is the engineer pushing this, he got 70 billion on a MacBook. We had a CLI version. So in MLC, you will be able to... That runs through a Metal accelerator. So effectively, you use the Metal programming language to get the GPU acceleration. So we found, okay, it works for the MacBook. Then we asked, we have a WebGPU backend, why not try it there? So we just tried it out. And it's really amazing to see everything up and running. And actually, it runs smoothly in that case. So I do think there are some interesting use cases already in this, because everybody has a browser. You don't need to install anything. I think it doesn't make sense yet to really run a 70 billion model in a browser, because you kind of need to be able to download the weights and so on. But I think we're getting there. Effectively, the most powerful models, you will be able to run on a consumer device. It's kind of really amazing. And also, in a lot of cases, there might be use cases. For example, if I'm going to build a chatbot that I talk to and it answers questions, maybe some of the components, like the voice-to-text, could run on the client side. And so there are a lot of possibilities of being able to have something hybrid that contains an edge component or something that runs on a server. [00:37:47]Alessio: Do these browser models have a way for applications to hook into them? So if I'm using, say, you can use OpenAI or you can use the local model. Of course. [00:37:56]Tianqi: Right now, actually, we are building... So there's an NPM package called WebLLM, so that if you want to embed it in your web app, you will be able to directly depend on WebLLM and use it. We also have a REST API that's OpenAI compatible. So that REST API, I think, right now, actually runs on a native backend, so that if you have a CUDA server, it's faster to run on the native backend. But we also have a WebGPU version of it that you can go and run. So yeah, we do want to have easier integrations with existing applications. And the OpenAI API is certainly one way to do that.
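Since the REST API is OpenAI compatible, any standard OpenAI client should be able to point at it. A hedged sketch in Python: the base URL, port, and model id below are placeholders rather than documented MLC defaults, so check the MLC LLM docs for the actual server invocation:

```python
from openai import OpenAI

client = OpenAI(
    base_url="http://127.0.0.1:8000/v1",  # hypothetical local MLC server address
    api_key="not-needed-for-local",       # local servers typically ignore the key
)

resp = client.chat.completions.create(
    model="local-llama-2-7b",  # placeholder model id
    messages=[{"role": "user", "content": "Say hello from a local model."}],
)
print(resp.choices[0].message.content)
```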
Yeah, this is great. [00:38:37]Swyx: I actually did not know there's an NPM package that makes it very, very easy to try out and use. I want to actually... One thing I'm unclear about is the chronology, because as far as I know, Chrome shipped WebGPU the same time that you shipped WebLLM. Okay, yeah. So did you have some kind of secret chat with Chrome? [00:38:57]Tianqi: The good news is that Chrome is doing a very good job of trying to have early releases. So although the official shipment of Chrome's WebGPU is the same time as WebLLM, you were actually already able to try out WebGPU technology in Chrome. There is an unstable version called Canary. I think as early as two years ago, there was a WebGPU version. Of course, it's getting better. So we had a TVM-based WebGPU backend two years ago. Of course, at that time, there were no language models. It was running on less interesting, well, still quite interesting models. And then this year, we really started to see it getting mature and the performance keeping up. So we had a more serious push of bringing the language-model-compatible runtime onto WebGPU. [00:39:45]Swyx: I think you'd agree that the hardest part is the model download. Have there been conversations about a one-time model download and sharing between all the apps that might use this API? That is a great point. [00:39:58]Tianqi: I think it's already supported in some sense. When we download the model, WebLLM will cache it in a special Chrome cache. So if a different web app uses the same WebLLM JavaScript package, you don't need to redownload the model again. So there is already something there. But of course, you have to download the model at least once to be able to use it. [00:40:19]Swyx: Okay. One more thing just in general before we're about to zoom out to OctoAI. The last question is, you're not the only project working on, I guess, local models. That's right. Alternative models. There's GPT4All, there's Ollama that just recently came out, and there's a bunch of these. What would be your advice to them on what's a valuable problem to work on? And what are just thin wrappers around ggml? Like, what are the interesting problems in this space, basically? [00:40:45]Tianqi: I think making the API better is certainly something useful, right? In general, one thing that we do try to push very hard on is this idea of easier universal deployment. So we are also looking forward to having more integration with MLC. That's why we're trying to build APIs like WebLLM and other things. So we're also looking forward to collaborating with all those ecosystems, and working to support bringing in models more universally and keeping up the best performance when possible, in a more push-button way. [00:41:15]Alessio: So as we mentioned in the beginning, you're also the co-founder of OctoML. Recently, OctoML released OctoAI, which is a compute service that basically focuses on optimizing model runtimes and acceleration and compilation. What has been the evolution there? So OctoML started as kind of a traditional MLOps tool, where people were building their own models and you helped them on that side. And then it seems like now most of the market is shifting to starting from pre-trained generative models. Yeah, what has that experience been like for you, and how have you seen the market evolve? And how did you decide to release OctoAI? [00:41:52]Tianqi: One thing that we found out is that, on one hand, it's really easy to go and get something up and running, right? But if you start to consider all the possible availability and scalability issues, and even integration issues, things become kind of interesting and complicated. So we really want to make sure to help people get that part easy, right?
And now, if we look at the customers we talk to and the market, certainly generative AI is something that is very interesting. So that is something that we really hope to help elevate. And also building on top of the technology we build to enable things like portability across hardware. And you will be able to not worry about the specific details, right? Just focus on getting the model out. We'll try to work on the infrastructure and other things that help on the other end. [00:42:45]Alessio: And when it comes to getting optimization at runtime, I see, when we run an early adopters community, that for most enterprises the issue is how to actually run these models. Do you see that as one of the big bottlenecks now? I think a few years ago it was like, well, we don't have a lot of machine learning talent, we cannot develop our own models. Versus now it's like, there are these great models you can use, but I don't know how to run them efficiently. [00:43:12]Tianqi: That depends on what you mean by running, right? On one hand, it's easy to download MLC, like, you download it, you run it on a laptop. But then there are also different decisions, right? What if you are trying to serve a larger user request? What if that request changes? What if the availability of hardware changes? Right now it's really hard to get the latest NVIDIA hardware, unfortunately, because everybody's trying to work on things using the hardware that's out there. So I think when the definition of "run" changes, there are a lot more questions around it. And also, in a lot of cases, it's not only about running models, it's also about being able to solve problems around them. How do you manage your model locations, and how do you make sure that you get your model close to your execution environment more efficiently? So definitely a lot of engineering challenges out there that we hope to elevate, yeah. And also, if you think about our future, definitely I feel like, given the technology and the kind of hardware availability we have today, we will need to make use of all the possible hardware available out there. That will include mechanisms for cutting down costs, bringing something to the edge and the cloud in a more natural way. So I feel like this is still a very early stage of where we are, but it's already good to see a lot of interesting progress. [00:44:35]Alessio: Yeah, that's awesome. I would love, I don't know how much we're going to go in depth into it, but what does it take to actually abstract all of this from the end user? You know, like, they don't need to know what GPUs you run, what cloud you're running them on. You take all of that away. What was that like as an engineering challenge? [00:44:51]Tianqi: So I think there are engineering challenges there. In fact, first of all, you will need to be able to support all the kinds of hardware backends you have, right? On one hand, if you look at the NVIDIA libraries, you'll find, very surprisingly, not too surprisingly, most of the latest libraries work well on the latest GPU. But there are other GPUs out there in the cloud as well. So certainly, having the know-how and being able to do model optimization is one thing, right? Also infrastructure, being able to scale things up, locate models. And in a lot of cases, we do find that on typical models, it also requires kind of vertical iteration. So it's not about, you know, building a silver bullet and that silver bullet is going to solve all the problems.
It's more about, you know, we're building a product, we work with the users, and we find out there are interesting opportunities at a certain point. And then our engineers will go and solve that, and it will automatically be reflected in the service. [00:45:45]Swyx: Awesome. [00:45:46]Alessio: We can jump into the lightning round, unless, I don't know, Sean, if you have more questions, or TQ, if you have more stuff you wanted to talk about that we didn't get a chance to touch on. [00:45:54]Alessio: Yeah, we have talked a lot. [00:45:55]Swyx: So, yeah. We always would like to ask, you know, do you have a commentary on other parts of AI and ML that are interesting to you? [00:46:03]Tianqi: So right now, I think one thing that we are really pushing hard for is this question about how far we can bring open source, right? I'm kind of like a hacker, and I really like to put things together. So I think it's unclear what the future of AI looks like. On one hand, it could be possible that, you know, you just have a few big players, you just try to talk to those bigger language models, and they can do everything, right? On the other hand, one of the things that we as academics are really excited about and pushing for, and that's one reason why I'm pushing for MLC, is: can we build something where you have different models? You have personal models that know the best movies you like, but you also have bigger models that maybe know more, and you get those models to interact with each other, right? And be able to have a wide ecosystem of AI agents that help each person, while still being able to do things like personalization. Some of them can run locally, some of them, of course, run on a cloud, and how do they interact with each other? So I think that is a very exciting time, where the future is yet undecided, but I feel like there is something we can do to shape that future as well. [00:47:18]Swyx: One more thing, which is something I'm also pursuing, and this kind of goes back into predictions, but also back into your history: do you have any idea, or are you looking out for anything, post-transformers as far as architecture is concerned? [00:47:32]Tianqi: I think, you know, in a lot of these cases, you can find there are already promising models for long contexts, right? There are state space models, like, you know, one of our colleagues, Albert Gu, who worked on the HiPPO models, right? And then there is an open source version called RWKV. It's like a recurrent model that allows you to summarize things. Actually, we are bringing RWKV to MLC as well, so maybe you will be able to see one of those models. [00:48:00]Swyx: We actually recorded an episode with one of the RWKV core members. It's unclear because there's no academic backing. It's just open source people. Oh, I see. So you like the merging of recurrent networks and transformers? [00:48:13]Tianqi: I do love to see this model space continue growing, right? And I feel like, in a lot of cases, it's just that the attention mechanism is getting changed in some sense. So I feel like definitely there are still a lot of things to be explored here. And that is also one reason why we want to keep pushing machine learning compilation, because one of the things we are trying to push on is productivity for machine learning engineering, so that as soon as some of these models come out, we will be able to, you know, bring them onto the environments that are out there.
[00:48:43]Swyx: Yeah, it's a really good mission. Okay. Very excited to see that RWKV and state space model stuff. I'm hearing increasing chatter about that stuff. Okay. Lightning round, as always fun. I'll take the first one. Acceleration. What has already happened in AI that you thought would take much longer? [00:48:59]Tianqi: The emergence of this conversational chatbot ability is something that kind of surprised me before it came out. This is one piece that I feel originally I thought would take much longer, but yeah, it happened. [00:49:11]Swyx: And it's funny, because the original Eliza chatbot was something that goes all the way back in time, right? And then it just suddenly came back again. Yeah. [00:49:21]Tianqi: It's always interesting to think about, but with a kind of different technology in some sense. [00:49:25]Alessio: What about the most interesting unsolved question in AI? [00:49:31]Swyx: That's a hard one, right? [00:49:32]Tianqi: So I can tell you what I'm excited about. So I think I have always been excited about this idea of continuous learning and lifelong learning in some sense. So, how does AI continue to evolve with the knowledge that has been there? It seems that we're getting much closer with all those recent technologies. So being able to develop systems support, and being able to think about how AI continues to evolve, is something that I'm really excited about. [00:50:01]Swyx: So specifically, just to double click on this, are you talking about continuous training? [00:50:06]Tianqi: I feel like, you know, training, adaptation, they're all similar things, right? You want to think about the entire life cycle, right? The life cycle of collecting data, training, fine-tuning, and maybe having your local context get continuously curated and fed into models. So I think all these things are interesting and relevant here. [00:50:29]Swyx: Yeah. I think this is something that people are really asking about. You know, right now we have moved a lot into the pre-training phase and off-the-shelf model downloads and stuff like that, which seems very counterintuitive compared to the continuous training paradigm that people want. So I guess the last question would be for takeaways. What's basically one message that you want every listener, every person, to remember today? [00:50:54]Tianqi: I think it's getting more obvious now, but one of the things that I always want to mention in my talks is that, you know, when you're thinking about AI applications, originally people thought about algorithms a lot more, right? The algorithms, the models, they are still very important. But usually when you build AI applications, it takes, you know, the algorithm side, the system optimizations, and the data curation, right? So it takes a combination of so many facets to bring together an AI system, and being able to look at it from that holistic perspective is really useful when we start to build modern applications. I think it's going to continue to be more important in the future. [00:51:35]Swyx: Yeah. Thank you for showing the way on this, and honestly, just making things possible that I thought would take a lot longer. So thanks for everything you've done. [00:51:46]Tianqi: Thank you for having me. [00:51:47]Swyx: Yeah. [00:51:47]Alessio: Thanks for coming on, TQ. [00:51:49]Swyx: Have a good one. [00:51:49] Get full access to Latent Space at www.latent.space/subscribe
AI Unraveled: Latest AI News & Trends, Master GPT, Gemini, Generative AI, LLMs, Prompting, GPT Store
10 Best Open-Source Deep Learning Tools to Know in 2023: TensorFlow, PyTorch, Keras, MXNet, Caffe, Theano, Torch, Chainer, DeepLearning4j, Caffe2; Google says it'll scrape everything you post online for AI; Microsoft uses ChatGPT to instruct and interact with robots; Will.i.am hails AI technology as ‘new renaissance' in music; Benchmarking LLMs searching scientific evidence; MIT Unveils Revolutionary AI Tool: Enhancing Chart Interpretation and Accessibility with Adaptive, Detail-Rich Captions for Users of All Abilities; Moonlander launches AI-based platform for immersive 3D game development; Mozilla adds AI Help that does the opposite; Panic about overhyped AI risk could lead to the wrong kind of regulation; It only took five hours for an AI model to design a functional computer; Daily AI Update News from Microsoft, Humane, Nvidia, and Moonlander; US Senator Believes AI Should Be Aligned With Democratic Values. This podcast is generated using the Wondercraft AI platform, a tool that makes it super easy to start your own podcast, by enabling you to use hyper-realistic AI voices as your host. Like mine! Attention AI Unraveled podcast listeners! Are you eager to expand your understanding of artificial intelligence? Look no further than the essential book "AI Unraveled: Demystifying Frequently Asked Questions on Artificial Intelligence," by Etienne Noumen, now available at Google, Apple and Amazon! This engaging read answers your burning questions and provides valuable insights into the captivating world of AI. Don't miss this opportunity to elevate your knowledge and stay ahead of the curve. Get your copy from Apple, Google, or Amazon today!
Everyone is talking about #AI, about #ChatGPT, about what's coming, about what could be. But the truth is that the revolution began a while ago, and there are very powerful tools that, whether you like them or not, will make the difference in the IT industry. In this episode we explore concepts and assistants, utilities that will let you get started or stand out in your field. I invite you to listen closely. Some of the apps mentioned in this episode: - GitHub Copilot: https://github.com/features/copilot - Replit: https://replit.com/ - Helper Programming: https://www.programming-helper.com/ - AutoRegex: https://www.autoregex.xyz/ - Amazon CodeWhisperer: https://aws.amazon.com/es/codewhisperer/ And if you want to build these tools or AI models yourself, there are platforms such as Google ML Kit, AutoML, TensorFlow, Theano, MXNet, PyTorch, and OpenNN, among others. If you know of another application that uses artificial intelligence and think it should be mentioned in these comments, it will be a pleasure to share it with the community. I invite you to join my contacts at: https://www.linkedin.com/in/soleralejandro Thanks for being there every week, and if this podcast had an impact on you or seemed useful, the best way to help is to rate it or share it with someone else so it can reach more people. - https://www.facebook.com/codigotecno - https://www.instagram.com/codigotecno Join the community on YouTube: https://bit.ly/2JLaKRj On Telegram we are starting to build a channel where we share material that can contribute to your training, resources, job offers, and interesting finds. We await you at: https://t.me/codigotecno And if you want to take part in the community: https://t.me/elgrupodecodigotecno Send me an email: codigotecno (at) hotmail.com Follow us on the most popular podcast networks: * On Spotify: https://spoti.fi/31Dp4Sq * On Ivoox: https://bit.ly/2JoLotl * On iTunes: https://apple.co/2WNKWHV * On Anchor.fm: https://bit.ly/3OiVCsN I'll be waiting, go for it! Great code to everyone, and until next time!
Summary
Machine learning is a force multiplier that can generate an outsized impact on your organization. Unfortunately, if you are feeding your ML model garbage data, then you will get orders of magnitude more garbage out of it. The team behind Galileo experienced that pain for themselves and have set out to make data management and cleaning for machine learning a first-class concern in your workflow. In this episode Vikram Chatterji shares the story of how Galileo got started and how you can use their platform to fix your ML data so that you can get back to the fun parts.
Announcements
Hello and welcome to the Machine Learning Podcast, the podcast about machine learning and how to bring it from idea to delivery.
Predibase is a low-code ML platform without low-code limits. Built on top of our open source foundations of Ludwig and Horovod, our platform allows you to train state-of-the-art ML and deep learning models on your datasets at scale. Our platform works on text, images, tabular, audio and multi-modal data using our novel compositional model architecture. We allow users to operationalize models on top of the modern data stack, through REST and PQL – an extension of SQL that puts predictive power in the hands of data practitioners. Go to themachinelearningpodcast.com/predibase today to learn more and try it out!
Do you wish you could use artificial intelligence to drive your business the way Big Tech does, but don’t have a money printer? Graft is a cloud-native platform that aims to make the AI of the 1% accessible to the 99%. Wield the most advanced techniques for unlocking the value of data, including text, images, video, audio, and graphs. No machine learning skills required, no team to hire, and no infrastructure to build or maintain. For more information on Graft or to schedule a demo, visit themachinelearningpodcast.com/graft today and tell them Tobias sent you.
Building good ML models is hard, but testing them properly is even harder. At Deepchecks, they built an open-source testing framework that follows best practices, ensuring that your models behave as expected. Get started quickly using their built-in library of checks for testing and validating your model’s behavior and performance, and extend it to meet your specific needs as your model evolves. Accelerate your machine learning projects by building trust in your models and automating the testing that you used to do manually. Go to themachinelearningpodcast.com/deepchecks today to get started!
Your host is Tobias Macey and today I’m interviewing Vikram Chatterji about Galileo, a platform for uncovering and addressing data problems to improve your model quality
Interview
Introduction
How did you get involved in machine learning?
Can you describe what Galileo is and the story behind it?
Who are the target users of the platform and what are the tools/workflows that you are replacing?
How does that focus inform and influence the design and prioritization of features in the platform?
What are some of the real-world impacts that you have experienced as a result of the kinds of data problems that you are addressing with Galileo?
Can you describe how the Galileo product is implemented?
What are some of the assumptions that you had formed from your own experiences that have been challenged as you worked with early design partners?
The toolchains and model architectures of any given team are unlikely to be a perfect match across departments or organizations.
What are the core principles/concepts that you have hooked into in order to provide the broadest compatibility?
What are the model types/frameworks/etc. that you have had to forego support for in the early versions of your product?
Can you describe the workflow for someone building a machine learning model and how Galileo fits across the various stages of that cycle?
What are some of the biggest difficulties posed by the non-linear nature of the experimentation cycle in model development?
What are some of the ways that you work to quantify the impact of your tool on the productivity and profit contributions of an ML team/organization?
What are the most interesting, innovative, or unexpected ways that you have seen Galileo used?
What are the most interesting, unexpected, or challenging lessons that you have learned while working on Galileo?
When is Galileo the wrong choice?
What do you have planned for the future of Galileo?
Contact Info
LinkedIn
Parting Question
From your perspective, what is the biggest barrier to adoption of machine learning today?
Closing Announcements
Thank you for listening! Don’t forget to check out our other shows. The Data Engineering Podcast covers the latest on modern data management. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email hosts@themachinelearningpodcast.com with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers.
Links
Galileo
F1 Score
Tensorflow
Keras
SpaCy
Podcast.__init__ Episode
Pytorch
Podcast.__init__ Episode
MXNet
Jax
The intro and outro music is from Hitman’s Lovesong feat. Paola Graziano by The Freak Fandango Orchestra/CC BY-SA 3.0
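One common way the "garbage data" problem described above gets surfaced in practice is by ranking training samples by per-sample loss, so a human can inspect the ones the model disagrees with most. Here is a generic sketch of that idea (illustrative only, not Galileo's implementation; the names and toy values are invented):

import numpy as np

def suspicious_samples(probs, labels, top_k=10):
    """Return indices of the top_k highest-loss (most suspicious) samples.
    probs: (n_samples, n_classes) predicted probabilities
    labels: (n_samples,) integer class labels"""
    eps = 1e-12  # avoid log(0)
    # Per-sample cross-entropy: high loss often flags mislabeled or odd data
    losses = -np.log(probs[np.arange(len(labels)), labels] + eps)
    return np.argsort(losses)[::-1][:top_k]

# Toy usage: sample 2 is confidently predicted class 1 but labeled class 0
probs = np.array([[0.9, 0.1], [0.2, 0.8], [0.05, 0.95]])
labels = np.array([0, 1, 0])
print(suspicious_samples(probs, labels, top_k=1))  # -> [2]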
Travis Addair - Horovod and the Evolution of Deep Learning at Scale. Deep neural networks are pushing the state of the art in numerous machine learning research domains, from computer vision to natural language processing, and even tabular business data. However, scaling such models to train efficiently on large datasets imposes a unique set of challenges that traditional batch data processing systems were not designed to solve. Horovod is an open source framework that scales models written in TensorFlow, PyTorch, and MXNet to train seamlessly on hundreds of GPUs in parallel. In this talk, we'll explain the concepts and unique constraints that led to the development of Horovod at Uber, and discuss how the latest trends in deep learning research are informing the future direction of the project within the Linux Foundation. We'll explore how Horovod fits into production ML workflows in industry, and how tools like Spark and Ray can combine with Horovod to make productionizing deep learning at scale on remote data centers as simple as running locally on your laptop. Finally, we'll share some thoughts on what's next for large scale deep learning, including new distributed training architectures and how the larger ecosystem of production ML tooling is evolving.
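As a concrete illustration of the pattern the talk describes, here is a minimal sketch of Horovod's PyTorch integration, based on the library's documented usage rather than code from the talk (the model and batch are toy placeholders):

import torch
import horovod.torch as hvd

hvd.init()  # start the communication layer across worker processes
if torch.cuda.is_available():
    torch.cuda.set_device(hvd.local_rank())  # pin each worker to one GPU

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = torch.nn.Linear(10, 1).to(device)  # toy model standing in for a real network

# Scale the learning rate by the worker count, the usual Horovod convention
optimizer = torch.optim.SGD(model.parameters(), lr=0.01 * hvd.size())

# Wrap the optimizer so gradients are averaged across all workers via allreduce
optimizer = hvd.DistributedOptimizer(optimizer, named_parameters=model.named_parameters())

# Make every worker start from identical weights and optimizer state
hvd.broadcast_parameters(model.state_dict(), root_rank=0)
hvd.broadcast_optimizer_state(optimizer, root_rank=0)

x, y = torch.randn(32, 10).to(device), torch.randn(32, 1).to(device)  # toy batch
loss = torch.nn.functional.mse_loss(model(x), y)
optimizer.zero_grad()
loss.backward()   # gradients are allreduced across workers here
optimizer.step()

Launched with something like horovodrun -np 4 python train.py, each process trains on its own shard of the data while Horovod keeps the replicas in sync.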
There are lots of networked video ecosystems out there, but why is MXNet a step above the rest? Take a listen as John Tumbleson, former Apple engineer and product manager for MXNet, dives into the technology behind the networked video ecosystem and why it's the best solution on the market! For more info please visit the AVPro Edge website at www.avproedge.com/mxnet.html Sponsored by Pro Tech Marketing: www.protechm.com Please email your suggestions to reps@protechm.com
TensorFlow, Spark MLlib, Scikit-learn, PyTorch, MXNet, and Keras shine for building and training machine learning and deep learning models --- Send in a voice message: https://anchor.fm/tonyphoang/message
Show Notes
(2:00) Alexey studied Information Systems and Technologies at a local university in his hometown in eastern Russia.
(4:54) Alexey commented on his experience working as a Java developer in the first three years after college in Russia and Poland, along with his initial exposure to Machine Learning thanks to Coursera.
(7:55) Alexey talked about his decision to pursue the IT4BI Master Program specializing in Large-Scale Business Intelligence in 2013.
(9:42) Alexey discussed his time working as a Research Assistant on Apache Flink at the DIMA Group at TU Berlin.
(12:28) Alexey’s Master Thesis is called Semantification of Identifiers in Mathematics for Better Math Information Retrieval, which was later presented at the SIGIR conference on R&D in Information Retrieval in 2016.
(14:35) Alexey discussed his first job as a Data Scientist at Searchmetrics - working on projects to help content marketers improve SEO ranking for their articles.
(18:54) Alexey’s next role was with the ad-tech company Simplaex. There, he designed, developed, and maintained the ML infrastructure for processing 3+ billion events per day with 100+ million unique daily users - working with tools like Spark for data engineering tasks.
(22:17) Alexey reflected on his journey participating in Kaggle competitions.
(25:35) Alexey also participated in other competitions at academic conferences: winning 2nd place at the Web Search and Data Mining 2017 challenge on Vandalism Detection and winning 1st place at the NIPS 2017 challenge on Ad Placement.
(29:59) Alexey authored his first book, Mastering Java for Data Science, which teaches readers how to create data science applications with Java.
(31:40) Alexey then transitioned to a Data Scientist role at OLX Group, a global marketplace for online classified advertisements.
(33:23) Alexey explained the ML system that detects duplicates of images submitted to the OLX marketplace, which he presented at PyData Berlin 2019. Read his two-part blog series: the first post presents a two-step framework for duplicate detection, and the second post explains how his team served and deployed this framework at scale.
(38:12) Alexey was recently involved in building an infrastructure for serving image models at OLX.
Read his two-part blog series on this evolution of image model serving at OLX, including the transition from AWS SageMaker to Kubernetes for model deployment, as well as the utilization of AWS Athena and MXNet for design simplification.
(42:39) Alexey is in the process of writing a technical book called Machine Learning Bookcamp, which encourages readers to learn machine learning by doing projects.
(46:17) Alexey discussed common struggles during data science interviews, referring to his talk on Getting a Data Science Job.
(48:32) Alexey has put together a neat GitHub page that includes both theoretical and technical questions for people who are preparing for interviews.
(52:19) Alexey extrapolated on the steps needed to become a better data scientist, in conjunction with his LinkedIn post a while back.
(56:40) Alexey gave his advice for software engineers looking to transition into data science.
(58:32) Alexey shared his opinion on the data science community in Berlin.
(01:01:53) Closing segment.
His Contact Info: Website, Twitter, LinkedIn, GitHub, Kaggle, Quora, Google Scholar, Medium
His Recommended Resources: Apache Flink, Kubeflow, Data Science Interviews GitHub Repo, PyData Berlin, Berlin Buzzwords, Andrew Ng, Designing Data-Intensive Applications by Martin Kleppmann, Machine Learning Bookcamp
Permanent 40$ discount code: poddcast19
5 free eBook codes (each good for one sample of the book): mlbdrt-D452, mlbdrt-5922, mlbdrt-2C4D, mlbdrt-3034, mlbdrt-1DD1
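As a rough sketch of the general two-step shape such a duplicate-detection pipeline can take (cheap candidate generation, then a more careful verification pass), here is an illustrative example using perceptual hashes; the function names, thresholds, and embed callback are assumptions for illustration, not the actual OLX code:

from PIL import Image
import imagehash
import numpy as np

def candidate_pairs(paths, max_hamming=8):
    """Step 1: propose pairs whose perceptual hashes are close."""
    hashes = {p: imagehash.phash(Image.open(p)) for p in paths}
    items = list(hashes.items())
    pairs = []
    for i, (p1, h1) in enumerate(items):
        for p2, h2 in items[i + 1:]:
            if h1 - h2 <= max_hamming:  # Hamming distance between the two hashes
                pairs.append((p1, p2))
    return pairs

def confirm(pairs, embed, threshold=0.9):
    """Step 2: keep pairs whose embedding vectors are highly similar.
    `embed` is any callable mapping an image path to a vector (placeholder)."""
    keep = []
    for p1, p2 in pairs:
        e1, e2 = embed(p1), embed(p2)
        cos = float(np.dot(e1, e2) / (np.linalg.norm(e1) * np.linalg.norm(e2)))
        if cos >= threshold:
            keep.append((p1, p2))
    return keep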
In this Episode of AWS TechChat, Shane and Pete embark on a different style of the show and share with you a lot of updates - over 30 updates - and we tackle it like speed dating. We start the show with some updates: there are now an additional 2 AWS regions, Milan in Italy and Cape Town in South Africa. This brings the region count to 24 Regions and 76 Availability Zones. Amazon GuardDuty has a price reduction for the customers who are consuming it on the upper end of the scale; VPC flow log scanning is now 40% cheaper when your logs are more than 10,000GB. Lots of database engine updates: • Database engine version updates across almost all engines. Microsoft SSAS (SQL Server Analysis Services) is now available on Amazon Relational Database Service (Amazon RDS) for SQL Server. • If you are currently running SSAS on Amazon Elastic Compute Cloud (Amazon EC2), you can now save costs by running SSAS directly on the same Amazon RDS DB instance as your SQL Server database. SSAS is currently available on Amazon RDS for SQL Server 2016 and SQL Server 2017 in the single-AZ configuration on both the Standard and Enterprise editions. • NoSQL Workbench for Amazon DynamoDB is now generally available. NoSQL Workbench is a client-side application, available for Windows and macOS, that helps developers build scalable, high-performance data models, and simplifies query development and testing. • Apache Kafka is an option for AWS Database Migration Service, and Amazon Managed Apache Cassandra Service is now available in public preview. • Microsoft SQL Server on RDS now supports Read Replicas. Storage updates: • More Nitro-based Amazon EC2 systems receive IO performance updates. • Amazon FSx for Windows File Server now has a magnetic HDD option which brings storage down to 1.3 cents per GB. • Amazon Elastic File System (Amazon EFS) announces a 400% increase in read operations for General Purpose mode file systems. On the development front: • AWS Lambda@Edge now supports Node 12.x and Python 3.8. • Amplify CLI adds support for additional AWS Lambda runtimes (Java, Go, .NET and Python) and Lambda cron jobs. • AWS Lambda now supports .NET Core 3.1. • Receive notifications for AWS CodeBuild, AWS CodeCommit, AWS CodeDeploy, and AWS CodePipeline in Slack, with no need to use Amazon Simple Notification Service (SNS) and AWS Glue. • Amazon MSK adds support for Apache Kafka version 2.4.1. • Updates to AWS Deep Learning Containers for PyTorch 1.4.0 and MXNet 1.6.0. Containers updates: • AWS Fargate launches platform version 1.4, which brings a raft of improvements. • Amazon Elastic Kubernetes Service (Amazon EKS) updates its service level agreement to 99.95%. • Amazon EKS now supports service-linked roles. • Amazon EKS adds envelope encryption for secrets with AWS Key Management Service (KMS). • Amazon EKS now supports Kubernetes version 1.15. • Amazon ECS supports, in preview, updating placement strategy and constraints for existing Amazon ECS Services without recreating the service. Connect your managed call centre in the cloud: • Introducing Voicemail for Amazon Connect. • Amazon Connect adds custom terminating keypress for DTMF. Other updates: • New versions of Elasticsearch available for Amazon Elasticsearch Service.
• AWS DeepComposer is now shipping from Amazon.com. Speakers: Shane Baldacchino - Solutions Architect, ANZ, AWS; Peter Stanski - Head of Solution Architecture, AWS. AWS Events: AWS Summit Online https://aws.amazon.com/events/summits/online/ AWSome Day Online Conference https://aws.amazon.com/events/awsome-day/awsome-day-online/ AWS Innovate AIML Edition on-demand https://aws.amazon.com/events/aws-innovate/machine-learning/ AWS Events and Webinars https://aws.amazon.com/events/
Sponsored by Datadog: pythonbytes.fm/datadog Special guest: Vicki Boykis: @vboykis
Michael #1: clize: Turn functions into command-line interfaces (via Marcelo). Follow-up to Typer on episode 164. Features: create command-line interfaces by creating functions and passing them to clize.run (https://clize.readthedocs.io/en/stable/api.html#clize.run). Enjoy a CLI automatically created from your functions’ parameters. Bring your users familiar --help messages generated from your docstrings. Reuse functionality across multiple commands using decorators. Extend Clize with new parameter behavior. I love how this is pure Python without its own API for the default case (a minimal sketch appears after these show notes).
Vicki #2: How to cheat at Kaggle AI contests. Kaggle is a platform, now owned by Google, that allows data scientists to find data sets, learn data science, and participate in competitions. Many people participate in Kaggle competitions to sharpen their data science/modeling skills. Recently, a competition related to analyzing pet shelter data resulted in a huge controversy. Petfinder.my is a platform that helps people find pets to rescue from shelters in Malaysia. In 2019, they announced a collaboration with Kaggle to create a machine learning algorithm to predict which pets (worldwide) were more likely to be adopted based on the metadata of the descriptions on the site. The total prize offered was $25,000. After several months, a contestant won. He was previously a Kaggle grandmaster, and won $10k. A volunteer, Benjamin Minixhofer, offered to put the algorithm in production, and when he did, he found that there was a huge discrepancy between first and second place. Technical aspects of the controversy: the data asked the contestants to predict the speed at which a pet would be adopted, from 1-5, and included input features like type of animal, breed, coloration, whether the animal was vaccinated, and adoption fee. The initial training set had 15k animals, and after a couple of months the teams were given 4k animals that their algorithms had not seen before, as a test of how accurate they were (common machine learning best practice). In a Jupyter notebook kernel on Kaggle, Minixhofer explains how the winning team cheated: first, they individually scraped Petfinder.my to find the answers for the 4k test data. Using MD5, they created a hash for each unique pet and looked up the score for each hash from the external dataset; there were 3,500 overlaps. They then did Pandas column manipulation to get at the hidden prediction variable for every 10th pet and replaced the prediction that should have been generated by the algorithm with the actual value, using mostly obfuscated functions, Pandas, dictionaries, and MD5 hashes. Fallout: he was fired from H2O.ai, and Kaggle issued an apology.
Michael #3: Configuring uWSGI for Production Deployment. We run a lot of uWSGI-backed services; I’ve spoken in depth back on Talk Python 215: The software powering Talk Python courses and podcast about this. This is guidance from Bloomberg Engineering’s Structured Products Applications group. We chose uWSGI as our host because of its performance and feature set. But, while powerful, uWSGI’s defaults are driven by backward compatibility and are not ideal for new deployments. There is also an official Things to Know doc. Unbit, the developer of uWSGI, has “decided to fix all of the bad defaults (especially for the Python plugin) in the 2.1 branch.” The 2.1 branch is not released yet.
Warning: I had trouble with die-on-term and systemctl. Settings I’m using:

# This option tells uWSGI to fail to start if any parameter
# in the configuration file isn’t explicitly understood by uWSGI.
strict = true
# The master uWSGI process is necessary to gracefully re-spawn
# and pre-fork workers, consolidate logs, and manage many other features
master = true
# uWSGI disables Python threads by default, as described in the Things to Know doc.
enable-threads = true
# This option will instruct uWSGI to clean up any temporary files or UNIX sockets it created
vacuum = true
# By default, uWSGI starts in multiple interpreter mode
single-interpreter = true
# Prevents uWSGI from starting if it is unable to find or load your application module
need-app = true
# uWSGI provides some functionality which can help identify the workers
auto-procname = true
procname-prefix = pythonbytes-
# Forcefully kill workers after 60 seconds. Without this feature,
# a stuck process could stay stuck forever.
harakiri = 60
harakiri-verbose = true

Vicki #4: Thinc: A functional take on deep learning, compatible with Tensorflow, PyTorch, and MXNet. A deep learning library that abstracts away some TF and PyTorch boilerplate, from Explosion. Already runs under the covers in SpaCy, an NLP library used for deep learning. Type checking, particularly helpful for Tensors: PyTorchWrapper and TensorFlowWrapper classes allow the intermingling of both. Deep support for numpy structures and semantics. Assumes you’re going to be using stochastic gradient descent, and operates in batches. Also cleans up the configuration and hyperparameters. Mainly hopes to make it easier and more flexible to do matrix manipulations, using a codebase that already existed but was not customer-facing. Examples and code are all available in notebooks in the GitHub repo.
Michael #5: pandas-vet (via Jacob Deppen). A plugin for Flake8 that checks pandas code. Starting with pandas can be daunting: the usual internet help sites are littered with different ways to do the same thing, and some features that the pandas docs themselves discourage live on in the API. Makes pandas a little more friendly for newcomers by taking some opinionated stances about pandas best practices. The idea to create a linter was sparked by Ania Kapuścińska's talk at PyCascades 2019, "Lint your code responsibly!"
Vicki #6: NumPy beginner documentation. NumPy is the backbone of numerical computing in Python: Pandas (which I mentioned before), scikit-learn, Tensorflow, and PyTorch all lean heavily on, if not directly depend on, its core concepts, which include matrix operations through a data structure known as a NumPy array (which is different than a Python list) - the ndarray. Anne Bonner wrote up new documentation for NumPy that introduces these fundamental concepts to beginners coming to both Python and scientific computing. Before, you went directly to the section about arrays and had to search through it to find what you wanted. The new guide, which is very nice, includes a step-by-step on how arrays work, how to reshape them, and illustrated guides on basic array operations.
Extras: Vicki: I write a newsletter, Normcore Tech, about all things tech that I’m not seeing covered in the mainstream tech media. I’ve written before about machine learning, data for NLP, Elon Musk memes, and Nginx. There’s a free version that goes out once a week and paid subscribers get access to one more newsletter per week, but really it’s more about the idea of supporting in-depth writing about tech:
vicki.substack.com
Michael: pip 20.0 Released - Default to doing a user install (as if --user was passed) when the main site-packages directory is not writeable and user site-packages are enabled, cache wheels built from Git requirements, and more. Homebrew: brew install python@3.8
Joke: An SEO expert walks into a bar, bars, pub, public house, Irish pub, tavern, bartender, beer, liquor, wine, alcohol, spirits...
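Picking up the clize item above, here is a minimal sketch of the pattern (the hello function and its option are invented for illustration; clize.run is the library's real entry point):

from clize import run

def hello(name, *, shout=False):
    """Greet someone from the command line.

    name: Who to greet

    shout: Print the greeting in upper case
    """
    msg = "Hello, {}!".format(name)
    print(msg.upper() if shout else msg)

if __name__ == '__main__':
    # Positional parameters become arguments, keyword-only ones become
    # --options, and --help is generated from the docstring above.
    run(hello)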
The Apache MXNet deep learning framework is used for developing, training, and deploying diverse artificial intelligence (AI) applications, including computer vision, speech recognition, and natural language processing (NLP). In this session, learn how to develop deep learning models with MXNet on Amazon SageMaker. Hear from the BBC about how it built a BERT-based NLP application to allow its website users to find relevant clips from recorded shows. We use the BBC's NLP application to demonstrate how to leverage MXNet's GluonNLP library to quickly build, train, and deploy deep learning models.
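For a taste of the workflow the session describes, here is a minimal hedged sketch of loading a pretrained BERT through GluonNLP's get_model API; the model and dataset names follow the library's documented catalog, while the SageMaker wiring and fine-tuning steps are omitted:

import mxnet as mx
import gluonnlp as nlp

ctx = mx.cpu()  # switch to mx.gpu(0) when a GPU is available
bert, vocab = nlp.model.get_model(
    'bert_12_768_12',                            # 12-layer BERT base
    dataset_name='book_corpus_wiki_en_uncased',  # pretrained weights and vocab
    pretrained=True,
    ctx=ctx,
    use_pooler=True,       # pooled [CLS] output, used by classification heads
    use_decoder=False,     # drop the masked-LM decoder for fine-tuning
    use_classifier=False,  # drop the next-sentence-prediction head
)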
Amazon SageMaker helps provide the best model performance for less cost. In this session, we walk through a TCO analysis of Amazon SageMaker, exploring its three modules: build, train, and deploy. Learn how Amazon SageMaker automatically configures and optimizes ML frameworks such as TensorFlow, MXNet, and PyTorch, and see how to use pre-built algorithms that are tuned for scale, speed, and accuracy. We explain how the automatic model tuning feature performs hyperparameter optimization by discovering interesting features in your data and learning how those features interact to affect accuracy. Learn how to deploy your model with one click and how to lower inference costs using Amazon Elastic Inference. We end by showing how Aramex uses Amazon SageMaker.
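To illustrate the automatic model tuning feature mentioned above, here is a rough sketch using the SageMaker Python SDK; the entry-point script, role ARN, S3 path, and metric regex are placeholders for illustration, not values from the session:

from sagemaker.mxnet import MXNet
from sagemaker.tuner import HyperparameterTuner, ContinuousParameter

# Placeholder estimator: script name, role ARN, and instance type are hypothetical
estimator = MXNet(entry_point='train.py',
                  role='arn:aws:iam::123456789012:role/SageMakerRole',
                  instance_count=1,
                  instance_type='ml.m5.xlarge',
                  framework_version='1.6.0',
                  py_version='py3')

tuner = HyperparameterTuner(
    estimator=estimator,
    objective_metric_name='validation:accuracy',
    hyperparameter_ranges={'learning_rate': ContinuousParameter(1e-5, 1e-1)},
    metric_definitions=[{'Name': 'validation:accuracy',
                         'Regex': 'val_acc=([0-9\\.]+)'}],  # parsed from training logs
    max_jobs=20,          # total training jobs the tuner will launch
    max_parallel_jobs=3,  # concurrency cap
)
tuner.fit({'train': 's3://my-bucket/train'})  # hypothetical S3 input channel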
Here are the TensorFlow skills that are a must for both machine learning and deep learning. TensorFlow is an open-source software library built for deep learning and artificial neural networks; with it, you can create neural networks and computation models using flow graphs. It is one of the most well-maintained and popular open-source libraries available for deep learning, and the framework is available in C++ and Python. Other similar deep learning frameworks that are based on Python include Theano, Torch, Lasagne, Blocks, MXNet, PyTorch, and Caffe. You can use TensorBoard for easy visualization and to see the computation pipeline. Its flexible architecture allows you to deploy easily on different kinds of devices. We at BEPEC are ready to help you and make you shift your career at any cost. For more details visit: https://www.bepec.in/ BEPEC registration form: https://www.bepec.in/registration-form Check our YouTube channel for more videos and please subscribe: https://www.youtube.com/channel/UCn1U... Check our Instagram page: https://instagram.com/bepec_solutions/ Check our Facebook page: https://www.facebook.com/Bepecsolutions/ For any help or for any guidance please email enquiry@bepec.in
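As a concrete starting point for that skill, here is a minimal sketch of building and compiling a small network with TensorFlow's Keras API (toy layer sizes, purely illustrative):

import tensorflow as tf

# A tiny feed-forward network; TensorFlow builds the underlying
# computation graph, which TensorBoard can visualize.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation='relu', input_shape=(10,)),
    tf.keras.layers.Dense(1, activation='sigmoid'),
])
model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])
model.summary()  # prints the layer-by-layer structure of the graph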
Simon and Nicki share a bumper-crop of interesting, useful and cool new services and features for AWS customers! Chapter Timings 00:01:17 Storage 00:03:15 Compute 00:07:13 Network 00:10:27 Databases 00:16:04 Migration 00:17:43 Developer Tools 00:22:47 Analytics 00:27:07 IoT 00:28:14 End User Computing 00:29:25 Machine Learning 00:30:49 Application Integration 00:34:18 Management and Governance 00:41:42 Customer Engagement 00:42:47 Media 00:44:03 Security 00:46:26 Gaming 00:47:54 AWS Marketplace 00:49:07 Robotics Shownotes Topic || Storage Optimize Cost with Amazon EFS Infrequent Access Lifecycle Management | https://aws.amazon.com/about-aws/whats-new/2019/07/optimize-cost-amazon-efs-infrequent-access-lifecycle-management/ Amazon FSx for Windows File Server Now Enables You to Use File Systems Directly With Your Organization’s Self-Managed Active Directory | https://aws.amazon.com/about-aws/whats-new/2019/06/amazon-fsx-for-windows-file-server-now-enables-you-to-use-file-systems-directly-with-your-organizations-self-managed-active-directory/ Amazon FSx for Windows File Server now enables you to use a single AWS Managed AD with file systems across VPCs or accounts | https://aws.amazon.com/about-aws/whats-new/2019/06/amazon-fsx-for-windows-file-server-now-enables-you-to-use-a-single-aws-managed-ad-with-file-systems-across-vpcs-or-accounts/ AWS Storage Gateway now supports Amazon VPC endpoints with AWS PrivateLink | https://aws.amazon.com/about-aws/whats-new/2019/06/aws-storage-gateway-now-supports-amazon-vpc-endpoints-aws-privatelink/ File Gateway adds encryption & signing options for SMB clients – Amazon Web Services | https://aws.amazon.com/about-aws/whats-new/2019/06/file-gateway-adds-options-to-enforce-encryption-and-signing-for-smb-shares/ New AWS Public Datasets Available from Facebook, Yale, Allen Institute for Brain Science, NOAA, and others | https://aws.amazon.com/about-aws/whats-new/2019/07/new-aws-public-datasets-available-from-facebook-yale-allen/ Topic || Compute Introducing Amazon EC2 Instance Connect | https://aws.amazon.com/about-aws/whats-new/2019/06/introducing-amazon-ec2-instance-connect/ Introducing New Instances Sizes for Amazon EC2 M5 and R5 Instances | https://aws.amazon.com/about-aws/whats-new/2019/06/introducing-new-instances-sizes-for-amazon-ec2-m5-and-r5-instances/ Introducing New Instance Sizes for Amazon EC2 C5 Instances | https://aws.amazon.com/about-aws/whats-new/2019/06/introducing-new-instance-sizes-for-amazon-ec2-c5-instances/ Amazon ECS now supports additional resource-level permissions and tag-based access controls | https://aws.amazon.com/about-aws/whats-new/2019/06/amazon-ecs-now-supports-resource-level-permissions-and-tag-based-access-controls/ Amazon ECS now offers improved capabilities for local testing | https://aws.amazon.com/about-aws/whats-new/2019/07/amazon-ecs-now-offers-improved-capabilities-for-local-testing/ AWS Container Services launches AWS For Fluent Bit | https://aws.amazon.com/about-aws/whats-new/2019/07/aws-container-services-launches-aws-for-fluent-bit/ Amazon EKS now supports Kubernetes version 1.13, ECR PrivateLink, and Kubernetes Pod Security Policies | https://aws.amazon.com/about-aws/whats-new/2019/06/amazon-eks-now-supports-kubernetes113-ecr-privatelink-kubernetes-pod-security/ AWS VPC CNI Version 1.5.0 Now Default for Amazon EKS Clusters | https://aws.amazon.com/about-aws/whats-new/2019/07/aws-vpc-cni-version-150-now-default-for-amazon-eks-clusters/ Announcing Enhanced Lambda@Edge Monitoring within the Amazon CloudFront Console | 
https://aws.amazon.com/about-aws/whats-new/2019/06/announcing-enhanced-lambda-edge-monitoring-amazon-cloudfront-console/ AWS Lambda Console shows recent invocations using CloudWatch Logs Insights | https://aws.amazon.com/about-aws/whats-new/2019/06/aws-lambda-console-recent-invocations-using-cloudwatch-logs-insights/ AWS Thinkbox Deadline with Resource Tracker | https://aws.amazon.com/about-aws/whats-new/2019/06/thinkbox-deadline-resource-tracker/ Topic || Network Network Load Balancer Now Supports UDP Protocol | https://aws.amazon.com/about-aws/whats-new/2019/06/network-load-balancer-now-supports-udp-protocol/ Announcing Amazon VPC Traffic Mirroring for Amazon EC2 Instances | https://aws.amazon.com/about-aws/whats-new/2019/06/announcing-amazon-vpc-traffic-mirroring-for-amazon-ec2-instances/ AWS ParallelCluster now supports Elastic Fabric Adapter (EFA) | https://aws.amazon.com/about-aws/whats-new/2019/06/aws-parallelcluster-supports-elastic-fabric-adapter/ AWS Direct Connect launches first location in Italy | https://aws.amazon.com/about-aws/whats-new/2019/06/aws_direct_connect_locations_in_italy/ Amazon CloudFront announces seven new Edge locations in North America, Europe, and Australia | https://aws.amazon.com/about-aws/whats-new/2019/06/cloudfront-seven-edge-locations-june2019/ Now Add Endpoint Policies to Interface Endpoints for AWS Services | https://aws.amazon.com/about-aws/whats-new/2019/06/now-add-endpoint-policies-to-interface-endpoints-for-aws-services/ Topic || Databases Amazon Aurora with PostgreSQL Compatibility Supports Serverless | https://aws.amazon.com/about-aws/whats-new/2019/07/amazon-aurora-with-postgresql-compatibility-supports-serverless/ Amazon RDS now supports Storage Auto Scaling | https://aws.amazon.com/about-aws/whats-new/2019/06/rds-storage-auto-scaling/ Amazon RDS Introduces Compatibility Checks for Upgrades from MySQL 5.7 to MySQL 8.0 | https://aws.amazon.com/about-aws/whats-new/2019/07/amazon_rds_introduces_compatibility_checks/ Amazon RDS for PostgreSQL Supports New Minor Versions 11.4, 10.9, 9.6.14, 9.5.18, and 9.4.23 | https://aws.amazon.com/about-aws/whats-new/2019/07/amazon-rds-postgresql-supports-minor-version-114/ Amazon Aurora with PostgreSQL Compatibility Supports Cluster Cache Management | https://aws.amazon.com/about-aws/whats-new/2019/06/amazon-aurora-with-postgresql-compatibility-supports-cluster-cache-management/ Amazon Aurora with PostgreSQL Compatibility Supports Data Import from Amazon S3 | https://aws.amazon.com/about-aws/whats-new/2019/06/amazon-aurora-with-postgresql-compatibility-supports-data-import-from-amazon-s3/ Amazon Aurora Supports Cloning Across AWS Accounts | https://aws.amazon.com/about-aws/whats-new/2019/07/amazon_aurora_supportscloningacrossawsaccounts-/ Amazon RDS for Oracle now supports z1d instance types | https://aws.amazon.com/about-aws/whats-new/2019/06/amazon-rds-for-oracle-now-supports-z1d-instance-types/ Amazon RDS for Oracle Supports Oracle Application Express (APEX) Version 19.1 | https://aws.amazon.com/about-aws/whats-new/2019/06/amazon-rds-oracle-supports-oracle-application-express-version-191/ Amazon ElastiCache launches reader endpoints for Redis | https://aws.amazon.com/about-aws/whats-new/2019/06/amazon-elasticache-launches-reader-endpoint-for-redis/ Amazon DocumentDB (with MongoDB compatibility) Now Supports Stopping and Starting Clusters | https://aws.amazon.com/about-aws/whats-new/2019/07/amazon-documentdb-supports-stopping-starting-cluters/ Amazon DocumentDB (with MongoDB compatibility) Now Provides 
Cluster Deletion Protection | https://aws.amazon.com/about-aws/whats-new/2019/07/amazon-documentdb-provides-cluster-deletion-protection/ You can now publish Amazon Neptune Audit Logs to Cloudwatch | https://aws.amazon.com/about-aws/whats-new/2019/06/you-can-now-publish-amazon-neptune-audit-logs-to-cloudwatch/ Amazon DynamoDB now supports deleting a global secondary index before it finishes building | https://aws.amazon.com/about-aws/whats-new/2019/07/amazon-dynamodb-now-supports-deleting-a-global-secondary-index-before-it-finishes-building/ Amazon DynamoDB now supports up to 25 unique items and 4 MB of data per transactional request | https://aws.amazon.com/about-aws/whats-new/2019/06/amazon-dynamodb-now-supports-up-to-25-unique-items-and-4-mb-of-data-per-transactional-request/ Topic || Migration CloudEndure Migration is now available at no charge | https://aws.amazon.com/about-aws/whats-new/2019/06/cloudendure-migration-available-at-no-charge/ New AWS ISV Workload Migration Program | https://aws.amazon.com/about-aws/whats-new/2019/06/isv-workload-migration/ AWS Migration Hub Adds Support for Service-Linked Roles | https://aws.amazon.com/about-aws/whats-new/2019/06/aws_migration_hub_adds_support_for_service_linked_roles/ Topic || Developer Tools The AWS Toolkit for Visual Studio Code is Now Generally Available | https://aws.amazon.com/about-aws/whats-new/2019/07/announcing-aws-toolkit-for-visual-studio-code/ The AWS Cloud Development Kit (AWS CDK) is Now Generally Available | https://aws.amazon.com/about-aws/whats-new/2019/07/the-aws-cloud-development-kit-aws-cdk-is-now-generally-available1/ AWS CodeCommit Supports Two Additional Merge Strategies and Merge Conflict Resolution | https://aws.amazon.com/about-aws/whats-new/2019/06/aws-codecommit-supports-2-additional-merge-strategies-and-merge-conflict-resolution/ AWS CodeCommit Now Supports Resource Tagging | https://aws.amazon.com/about-aws/whats-new/2019/07/aws-codecommit-now-supports-resource-tagging/ AWS CodeBuild adds Support for Polyglot Builds | https://aws.amazon.com/about-aws/whats-new/2019/07/aws-codebuild-adds-support-for-polyglot-builds/ AWS Amplify Console Updates Build image with SAM CLI and Custom Container Support | https://aws.amazon.com/about-aws/whats-new/2019/07/aws-amplify-console-updates-build-image-sam-cli-and-custom-container-support/ AWS Amplify Console announces Manual Deploys for Static Web Hosting | https://aws.amazon.com/about-aws/whats-new/2019/07/aws-amplify-console-announces-manual-deploys-for-static-web-hosting/ Amplify Framework now Supports Adding AWS Lambda Triggers for events in Auth and Storage categories | https://aws.amazon.com/about-aws/whats-new/2019/07/amplify-framework-now-supports-adding-aws-lambda-triggers-for-events-auth-storage-categories/ AWS Amplify Console now supports AWS CloudFormation | https://aws.amazon.com/about-aws/whats-new/2019/06/aws-amplify-console-supports-aws-cloudformation/ AWS CloudFormation updates for Amazon EC2, Amazon ECS, Amazon EFS, Amazon S3 and more | https://aws.amazon.com/about-aws/whats-new/2019/06/aws-cloudformation-updates-amazon-ec2-ecs-efs-s3-and-more/ Topic || Analytics Amazon QuickSight launches multi-sheet dashboards, new visual types and more | https://aws.amazon.com/about-aws/whats-new/2019/06/amazon-quickSight-launches-multi-sheet-dashboards-new-visual-types-and-more/ Amazon QuickSight now supports fine-grained access control over Amazon S3 and Amazon Athena! 
| https://aws.amazon.com/about-aws/whats-new/2019/06/amazon-quickSight-now-supports-fine-grained-access-control-over-amazon-S3-and-amazon-athena/ Announcing EMR Release 5.24.0: With performance improvements in Spark, new versions of Flink, Presto, and Hue, and enhanced CloudFormation support for EMR Instance Fleets | https://aws.amazon.com/about-aws/whats-new/2019/06/announcing-emr-release-5240-with-performance-improvements-in-spark-new-versions-of-flink-presto-Hue-and-cloudformation-support-for-launching-clusters-in-multiple-subnets-through-emr-instance-fleets/ AWS Glue now provides workflows to orchestrate your ETL workloads | https://aws.amazon.com/about-aws/whats-new/2019/06/aws-glue-now-provides-workflows-to-orchestrate-etl-workloads/ Amazon Elasticsearch Service increases data protection with automated hourly snapshots at no extra charge | https://aws.amazon.com/about-aws/whats-new/2019/07/amazon-elasticsearch-service-increases-data-protection-with-automated-hourly-snapshots-at-no-extra-charge/ Amazon MSK is Now Integrated with AWS CloudFormation and Terraform | https://aws.amazon.com/about-aws/whats-new/2019/06/amazon_msk_is_now_integrated_with_aws_cloudformation_and_terraform/ Kinesis Video Streams adds support for Dynamic Adaptive Streaming over HTTP (DASH) and H.265 video | https://aws.amazon.com/about-aws/whats-new/2019/07/kinesis-video-streams-adds-support-for-dynamic-adaptive-streaming-over-http-dash-and-h-2-6-5-video/ Announcing the availability of Amazon Kinesis Video Producer SDK in C | https://aws.amazon.com/about-aws/whats-new/2019/07/announcing-availability-of-amazon-kinesis-video-producer-sdk-in-c/ Topic || IoT AWS IoT Expands Globally | https://aws.amazon.com/about-aws/whats-new/2019/07/aws-iot-expands-globally/ Bluetooth Low Energy Support and New MQTT Library Now Generally Available in Amazon FreeRTOS 201906.00 Major | https://aws.amazon.com/about-aws/whats-new/2019/06/bluetooth-low-energy-support-amazon-freertos-now-available/ AWS IoT Greengrass 1.9.2 With Support for OpenWrt and AWS IoT Device Tester is Now Available | https://aws.amazon.com/about-aws/whats-new/2019/06/aws-iot-greengrass-support-openwrt-aws-iot-device-tester-available/ Topic || End User Computing Amazon Chime Achieves HIPAA Eligibility | https://aws.amazon.com/about-aws/whats-new/2019/06/chime_hipaa_eligibility/ Amazon WorkSpaces now supports copying Images across AWS Regions | https://aws.amazon.com/about-aws/whats-new/2019/06/amazon_workspaces_now_supports_copying_images_across_aws_regions/ Amazon AppStream 2.0 adds support for Windows Server 2016 and Windows Server 2019 | https://aws.amazon.com/about-aws/whats-new/2019/06/amazon-appstream-20-adds-support-for-windows-server-2016-and-windows-server-2019/ AWS Client VPN now includes support for AWS CloudFormation | https://aws.amazon.com/about-aws/whats-new/2019/06/aws-client-vpn-includes-support-for-aws-cloudformation/ Topic || Machine Learning Amazon Comprehend Medical is now Available in Sydney, London, and Canada | https://aws.amazon.com/about-aws/whats-new/2019/06/comprehend-medical-available-in-asia-pacific-eu-canada/ Amazon Personalize Now Generally Available | https://aws.amazon.com/about-aws/whats-new/2019/06/amazon-personalize-now-generally-available/ New in AWS Deep Learning Containers: Support for Amazon SageMaker and MXNet 1.4.1 with CUDA 10.0 | https://aws.amazon.com/about-aws/whats-new/2019/06/new-in-aws-deep-learning-containers-support-for-amazon-sagemaker-libraries-and-mxnet-1-4-1-with-cuda-10-0/ Topic || Application Integration 
Introducing Amazon EventBridge | https://aws.amazon.com/about-aws/whats-new/2019/07/introducing-amazon-eventbridge/ AWS App Mesh Service Discovery with AWS Cloud Map generally available. | https://aws.amazon.com/about-aws/whats-new/2019/06/aws-app-mesh-service-discovery-with-aws-cloud-map-generally-available/ Amazon API Gateway Now Supports Tag-Based Access Control and Tags on WebSocket APIs | https://aws.amazon.com/about-aws/whats-new/2019/07/amazon-api-gateway-supports-tag-based-access-control-tags-on-websocket/ Amazon API Gateway Adds Configurable Transport Layer Security Version for Custom Domains | https://aws.amazon.com/about-aws/whats-new/2019/06/amazon-api-gateway-adds-configurable-transport-layer-security-version-custom-domains/ Topic || Management and Governance Introducing AWS Systems Manager OpsCenter to enable faster issue resolution | https://aws.amazon.com/about-aws/whats-new/2019/06/introducing-aws-systems-manager-opscenter-to-enable-faster-issue-resolution/ Introducing Service Quotas: View and manage your quotas for AWS services from one central location | https://aws.amazon.com/about-aws/whats-new/2019/06/introducing-service-quotas-view-and-manage-quotas-for-aws-services-from-one-location/ Introducing AWS Budgets Reports | https://aws.amazon.com/about-aws/whats-new/2019/07/introducing-aws-budgets-reports/ Introducing Amazon CloudWatch Anomaly Detection – Now in Preview | https://aws.amazon.com/about-aws/whats-new/2019/07/introducing-amazon-cloudwatch-anomaly-detection-now-in-preview/ Amazon CloudWatch Launches Dynamic Labels on Dashboards | https://aws.amazon.com/about-aws/whats-new/2019/06/amazon-cloudwatch-launches-dynamic-labels-on-dashboards/ Amazon CloudWatch Adds Visibility for your .NET and SQL Server Application Health | https://aws.amazon.com/about-aws/whats-new/2019/06/amazon-cloudwatch-adds-visibility-for-your-net-sql-server-application-health/ Amazon CloudWatch Events Now Supports Amazon CloudWatch Logs as a Target and Tagging of CloudWatch Events Rules | https://aws.amazon.com/about-aws/whats-new/2019/06/amazon-cloudwatch-events-now-supports-amazon-cloudwatch-logs-target-tagging-cloudwatch-events-rules/ Introducing Amazon CloudWatch Container Insights for Amazon ECS and AWS Fargate - Now in Preview | https://aws.amazon.com/about-aws/whats-new/2019/07/introducing-container-insights-for-ecs-and-aws-fargate-in-preview/ AWS Config now enables you to provision AWS Config rules across all AWS accounts in your organization | https://aws.amazon.com/about-aws/whats-new/2019/07/aws-config-now-enables-you-to-provision-config-rules-across-all-aws-accounts-in-your-organization/ Session Manager launches Run As to start interactive sessions with your own operating system user account | https://aws.amazon.com/about-aws/whats-new/2019/07/session-manager-launches-run-as-to-start-interactive-sessions-with-your-own-operating-system-user-account/ Session Manager launches tunneling support for SSH and SCP | https://aws.amazon.com/about-aws/whats-new/2019/07/session-manager-launches-tunneling-support-for-ssh-and-scp/ Use IAM access advisor with AWS Organizations to set permission guardrails confidently | https://aws.amazon.com/about-aws/whats-new/2019/06/now-use-iam-access-advisor-with-aws-organizations-to-set-permission-guardrails-confidently/ AWS Resource Groups is Now SOC Compliant | https://aws.amazon.com/about-aws/whats-new/2019/07/aws-resource-groups-is-now-soc-compliant/ Topic || Customer Engagement Introducing AI Powered Speech Analytics for Amazon Connect | 
https://aws.amazon.com/about-aws/whats-new/2019/06/introducing-ai-powered-speech-analytics-for-amazon-connect/ Amazon Connect Launches Contact Flow Versioning | https://aws.amazon.com/about-aws/whats-new/2019/06/amazon-connect-launches-contact-flow-versioning/ Topic || Media AWS Elemental MediaConnect Now Supports SPEKE for Conditional Access | https://aws.amazon.com/about-aws/whats-new/2019/06/aws-elemental-mediaconnect-now-supports-speke-for-conditional-access/ AWS Elemental MediaLive Now Supports AWS CloudFormation | https://aws.amazon.com/about-aws/whats-new/2019/07/aws-elemental-medialive-now-supports-aws-cloudformation/ AWS Elemental MediaConvert Now Ingests Files from HTTPS Sources | https://aws.amazon.com/about-aws/whats-new/2019/07/aws-elemental-mediaconvert-now-ingests-files-from-https-sources/ Topic || Security AWS Certificate Manager Private Certificate Authority now supports root CA hierarchies | https://aws.amazon.com/about-aws/whats-new/2019/06/aws-certificate-manager-private-certificate-authority-now-supports-root-CA-heirarchies/ AWS Control Tower is now generally available | https://aws.amazon.com/about-aws/whats-new/2019/06/aws-control-tower-is-now-generally-available/ AWS Security Hub is now generally available | https://aws.amazon.com/about-aws/whats-new/2019/06/aws-security-hub-now-generally-available/ AWS Single Sign-On now makes it easy to access more business applications including Asana and Jamf | https://aws.amazon.com/about-aws/whats-new/2019/06/aws-single-sign-on-access-business-applications-including-asana-and-jamf/ Topic || Gaming Large Match Support for Amazon GameLift Now Available | https://aws.amazon.com/about-aws/whats-new/2019/07/large-match-support-for-amazon-gameLift-now-available/ New Dynamic Vegetation System in Lumberyard Beta 1.19 – Available Now | https://aws.amazon.com/about-aws/whats-new/2019/06/lumberyard-beta-119-available-now/ Topic || AWS Marketplace AWS Marketplace now integrates with your procurement systems | https://aws.amazon.com/about-aws/whats-new/2019/06/aws-marketplace-now-integrates-with-your-procurement-systems/ Topic || Robotics AWS RoboMaker announces support for Robot Operating System (ROS) Melodic | https://aws.amazon.com/about-aws/whats-new/2019/07/aws-robomaker-support-robot-operating-system-melodic/
Simon hosts an update show with lots of great new features and capabilities! Chapters: Developer Tools 0:26 Storage 3:02 Compute 5:10 Database 10:31 Networking 13:41 Analytics 16:38 IoT 18:23 End User Computing 20:19 Machine Learning 21:12 Application Integration 24:02 Management and Governance 24:23 Migration 26:05 Security 26:56 Training and Certification 29:57 Blockchain 30:27 Quickstarts 31:06 Shownotes: Topic || Developer Tools Announcing AWS X-Ray Analytics – An Interactive approach to Trace Analysis | https://aws.amazon.com/about-aws/whats-new/2019/04/aws_x_ray_interactive_approach_analyze_traces/ Quickly Search for Resources across Services in the AWS Developer Tools Console | https://aws.amazon.com/about-aws/whats-new/2019/05/search-resources-across-services-developer-tools-console/ AWS Amplify Console adds support for Incoming Webhooks | https://aws.amazon.com/about-aws/whats-new/2019/05/aws-amplify-console-adds-support-for-incoming-webhooks/ AWS Amplify launches an online community for fullstack serverless app developers | https://aws.amazon.com/about-aws/whats-new/2019/04/aws-amplify-launches-an-online-community-for-fullstack-serverless-app-developers/ AWS AppSync Now Enables More Visibility into Performance and Health of GraphQL Operations | https://aws.amazon.com/about-aws/whats-new/2019/05/aws-appsync-now-enables-more-visibility-into-performance-and-hea/ AWS AppSync Now Supports Configuring Multiple Authorization Types for GraphQL APIs | https://aws.amazon.com/about-aws/whats-new/2019/05/aws-appsync-now-supports-configuring-multiple-authorization-type/ Topic || Storage Amazon S3 Introduces S3 Batch Operations for Object Management | https://aws.amazon.com/about-aws/whats-new/2019/04/Amazon-S3-Introduces-S3-Batch-Operations-for-Object-Management/ AWS Snowball Edge adds block storage – Amazon Web Services | https://aws.amazon.com/about-aws/whats-new/2019/04/aws-snowball-edge-adds-block-storage-for-edge-computing-workload/ Amazon FSx for Windows File Server Adds Support for File System Monitoring with Amazon CloudWatch | https://aws.amazon.com/about-aws/whats-new/2019/05/amazon-fsx-for-windows-file-server-adds-support-for-cloudwatch/ AWS Storage Gateway enhances access control for SMB shares to store and access objects in Amazon S3 buckets | https://aws.amazon.com/about-aws/whats-new/2019/05/AWS-Storage-Gateway-enhances-access-control-for-SMB-shares-to-access-objects-in-Amazon-s3/ Topic || Compute AWS Lambda adds support for Node.js v10 | https://aws.amazon.com/about-aws/whats-new/2019/05/aws_lambda_adds_support_for_node_js_v10/ AWS Serverless Application Model (SAM) supports IAM permissions and custom responses for Amazon API Gateway | https://aws.amazon.com/about-aws/whats-new/2019/aws_serverless_application_Model_support_IAM/ AWS Step Functions Adds Support for Workflow Execution Events | https://aws.amazon.com/about-aws/whats-new/2019/05/aws-step-functions-adds-support-for-workflow-execution-events/ Amazon EC2 I3en instances, offering up to 60 TB of NVMe SSD instance storage, are now generally available | https://aws.amazon.com/about-aws/whats-new/2019/05/amazon-ec2-i3en-instances-are-now-generally-available/ Now Create Amazon EC2 On-Demand Capacity Reservations Through AWS CloudFormation | https://aws.amazon.com/about-aws/whats-new/2019/04/now-create-amazon-ec2-on-demand-capacity-reservations-through-aws-cloudformation/ Share encrypted AMIs across accounts to launch instances in a single step | 
https://aws.amazon.com/about-aws/whats-new/2019/05/share-encrypted-amis-across-accounts-to-launch-instances-in-a-single-step/ Launch encrypted EBS backed EC2 instances from unencrypted AMIs in a single step | https://aws.amazon.com/about-aws/whats-new/2019/05/launch-encrypted-ebs-backed-ec2-instances-from-unencrypted-amis-in-a-single-step/ Amazon EKS Releases Deep Learning Benchmarking Utility | https://aws.amazon.com/about-aws/whats-new/2019/05/-amazon-eks-releases-deep-learning-benchmarking-utility-/ Amazon EKS Adds Support for Public IP Addresses Within Cluster VPCs | https://aws.amazon.com/about-aws/whats-new/2019/05/amazon-eks-adds-support-for-public-ip-addresses-within-cluster-v/ Amazon EKS Simplifies Kubernetes Cluster Authentication | https://aws.amazon.com/about-aws/whats-new/2019/05/amazon-eks-simplifies-kubernetes-cluster-authentication/ Amazon ECS Console support for ECS-optimized Amazon Linux 2 AMI and Amazon EC2 A1 instance family now available | https://aws.amazon.com/about-aws/whats-new/2019/05/amazon-ecs-console-support-for-ecs-optimized-amazon-linux-2-ami-/ AWS Fargate PV1.3 now supports the Splunk log driver | https://aws.amazon.com/about-aws/whats-new/2019/05/aws-fargate-pv1-3-now-supports-the-splunk-log-driver/ Topic || Databases Amazon Aurora Serverless Supports Capacity of 1 Unit and a New Scaling Option | https://aws.amazon.com/about-aws/whats-new/2019/04/amazon_aurora_serverless_now_supports_a_minimum_capacity_of_1_unit_and_a_new_scaling_option/ Aurora Global Database Expands Availability to 14 AWS Regions | https://aws.amazon.com/about-aws/whats-new/2019/05/Aurora_Global_Database_Expands_Availability_to_14_AWS_Regions/ Amazon DocumentDB (with MongoDB compatibility) now supports per-second billing | https://aws.amazon.com/about-aws/whats-new/2019/05/amazon-documentdb-now-supports-per-second-billing/ Performance Insights is Generally Available on Amazon Aurora MySQL 5.7 | https://aws.amazon.com/about-aws/whats-new/2019/05/Performance-Insights-GA-Aurora-MySQL-57/ Performance Insights Supports Counter Metrics on Amazon RDS for Oracle | https://aws.amazon.com/about-aws/whats-new/2019/05/performance-insights-countermetrics-on-oracle/ Performance Insights Supports Amazon Aurora Global Database | https://aws.amazon.com/about-aws/whats-new/2019/05/performance-insights-global-datatabase/ Amazon ElastiCache for Redis adds support for Redis 5.0.4 | https://aws.amazon.com/about-aws/whats-new/2019/05/elasticache-redis-5-0-4/ Amazon RDS for MySQL Supports Password Validation | https://aws.amazon.com/about-aws/whats-new/2019/05/amazon-rds-for-mysql-supports-password-validation/ Amazon RDS for PostgreSQL Supports New Minor Versions 11.2, 10.7, 9.6.12, 9.5.16, and 9.4.21 | https://aws.amazon.com/about-aws/whats-new/2019/05/amazon-rds-postgresql-supports-minor-version-112/ Amazon RDS for Oracle now supports April Oracle Patch Set Updates (PSU) and Release Updates (RU) | https://aws.amazon.com/about-aws/whats-new/2019/05/amazon-rds-for-oracle-now-supports-april-oracle-patch-set-updates-psu-and-release-updates-ru/ Topic || Networking Elastic Fabric Adapter Is Now Generally Available | https://aws.amazon.com/about-aws/whats-new/2019/04/elastic-fabric-adapter-is-now-generally-available/ Migrate Your AWS Site-to-Site VPN Connections from a Virtual Private Gateway to an AWS Transit Gateway | https://aws.amazon.com/about-aws/whats-new/2019/04/migrate-your-aws-site-to-site-vpn-connections-from-a-virtual-private-gateway-to-an-aws-transit-gateway/ Announcing AWS Direct Connect Support for AWS 
Transit Gateway | https://aws.amazon.com/about-aws/whats-new/2019/04/announcing-aws-direct-connect-support-for-aws-transit-gateway/ Amazon CloudFront announces 11 new Edge locations in India, Japan, and the United States | https://aws.amazon.com/about-aws/whats-new/2019/05/cloudfront-11locations-7may2019/ Amazon VPC Endpoints Now Support Tagging for Gateway Endpoints, Interface Endpoints, and Endpoint Services | https://aws.amazon.com/about-aws/whats-new/2019/05/amazon-vpc-endpoints-now-support-tagging-for-gateway-endpoints-interface-endpoints-and-endpoint-services/ Topic || Analytics Amazon EMR announces Support for Multiple Master nodes to enable High Availability for EMR applications | https://aws.amazon.com/about-aws/whats-new/2019/04/amazon-emr-announces-support-for-multiple-master-nodes-to-enable-high-availability-for-EMR-applications/ Amazon EMR now supports Multiple Master nodes to enable High Availability for HBase clusters | https://aws.amazon.com/about-aws/whats-new/2019/05/amazon-emr-now-supports-multiple-master-nodes-to-enable-high-availability-for-hbase-clusters/ Amazon EMR announces Support for Reconfiguring Applications on Running EMR Clusters | https://aws.amazon.com/about-aws/whats-new/2019/05/amazon-emr-announces-support-for-reconfiguring-applications-on-running-emr-clusters/ Amazon Kinesis Data Analytics now allows you to assign AWS resource tags to your real-time applications | https://aws.amazon.com/about-aws/whats-new/2019/05/amazon_kinesis_data_analytics_now_allows_you_to_assign_aws_resource_tags_to_your_real_time_applications/ AWS Glue crawlers now support existing Data Catalog tables as sources | https://aws.amazon.com/about-aws/whats-new/2019/05/aws-glue-crawlers-now-support-existing-data-catalog-tables-as-sources/ Topic || IoT AWS IoT Analytics Now Supports Faster SQL Data Set Refresh Intervals | https://aws.amazon.com/about-aws/whats-new/2019/04/aws-iot-analytics-now-supports-faster-sql-data-set-refresh-intervals/ AWS IoT Greengrass Adds Support for Python 3.7, Node v8.10.0, and Expands Support for Elliptic-Curve Cryptography | https://aws.amazon.com/about-aws/whats-new/2019/04/aws-iot-greengrass-adds-support-python-3-7-node-v-8-10-0-and-expands-support-elliptic-curve-cryptography/ AWS Releases Additional Preconfigured Examples for FreeRTOS on Armv8-M | https://aws.amazon.com/about-aws/whats-new/2019/05/aws-releases-additional-freertos-preconfigured-examples-armv8m/ AWS IoT Device Defender supports monitoring behavior of unregistered devices | https://aws.amazon.com/about-aws/whats-new/2019/05/aws-iot-device-defender-supports-monitoring-behavior-of-unregistered-devices/ AWS IoT Analytics Now Supports Data Set Content Delivery to Amazon S3 | https://aws.amazon.com/about-aws/whats-new/2019/05/aws-iot-analytics-now-supports-data-set-content-delivery-to-amaz/ Topic || End User Computing Amazon AppStream 2.0 adds configurable timeouts for idle sessions | https://aws.amazon.com/about-aws/whats-new/2019/05/amazon-appstream-2-0-adds-configurable-timeouts-for-idle-session/ Monitor Emails in Your Workmail Organization Using Cloudwatch Metrics and Logs | https://aws.amazon.com/about-aws/whats-new/2019/05/monitor-emails-in-your-workmail-organization-using-cloudwatch-me/ You can now use custom chat bots with Amazon Chime | https://aws.amazon.com/about-aws/whats-new/2019/05/you-can-now-use-custom-chat-bots-with-amazon-chime/ Topic || Machine Learning Developers, start your engines! The AWS DeepRacer Virtual League kicks off today. 
| https://aws.amazon.com/about-aws/whats-new/2019/04/AWSDeepRacerVirtualLeague/ Amazon SageMaker announces new features to the built-in Object2Vec algorithm | https://aws.amazon.com/about-aws/whats-new/2019/05/amazon-sagemaker-announces-new-features-to-the-built-in-object2v/ Amazon SageMaker Ground Truth Now Supports Automated Email Notifications for Manual Data Labeling | https://aws.amazon.com/about-aws/whats-new/2019/05/amazon-sagemaker-ground-truth-now-supports-automated-email-notif/ Amazon Translate Adds Support for Hindi, Farsi, Malay, and Norwegian | https://aws.amazon.com/about-aws/whats-new/2019/05/amazon_translate_support_hindi_farsi_malay_norwegian/ Amazon Transcribe now supports Hindi and Indian-accented English | https://aws.amazon.com/about-aws/whats-new/2019/05/amazon-transcribe-supports-hindi-indian-accented-english/ Amazon Comprehend batch jobs now supports Amazon Virtual Private Cloud | https://aws.amazon.com/about-aws/whats-new/2019/05/amazon-comprehend-batch-jobs-now-supports-amazon-virtual-private-cloud/ New in AWS Deep Learning AMIs: PyTorch 1.1, Chainer 5.4, and CUDA 10 support for MXNet | https://aws.amazon.com/about-aws/whats-new/2019/05/new-in-aws-deep-learning-amis-pytorch-1-1-chainer-5-4-cuda10-for-mxnet/ Topic || Application Integration Amazon MQ Now Supports Resource-Level and Tag-Based Permissions | https://aws.amazon.com/about-aws/whats-new/2019/04/amazon-mq-now-supports-resource-level-and-tag-based-permissions/ Amazon SNS Adds Support for Cost Allocation Tags | https://aws.amazon.com/about-aws/whats-new/2019/05/amazon-sns-adds-support-for-cost-allocation-tags/ Topic || Management and Governance Reservation Expiration Alerts Now Available in AWS Cost Explorer | https://aws.amazon.com/about-aws/whats-new/2019/05/reservation-expiration-alerts-now-available-in-aws-cost-explorer/ AWS Systems Manager Patch Manager Supports Microsoft Application Patching | https://aws.amazon.com/about-aws/whats-new/2019/05/aws-systems-manager-patch-manager-supports-microsoft-application-patching/ AWS OpsWorks for Chef Automate now supports Chef Automate 2 | https://aws.amazon.com/about-aws/whats-new/2019/05/aws-opsworks-for-chef-automate-now-supports-chef-automate-2/ AWS Service Catalog Connector for ServiceNow supports CloudFormation StackSets | https://aws.amazon.com/about-aws/whats-new/2019/05/service-catalog-servicenow-connector-now-supports-stacksets/ Topic || Migration AWS Migration Hub EC2 Recommendations | https://aws.amazon.com/about-aws/whats-new/2019/05/aws-migration-hub-ec2-recommendations/ Topic || Security Amazon GuardDuty Adds Two New Threat Detections | https://aws.amazon.com/about-aws/whats-new/2019/05/amazon-guardduty-adds-two-new-threat-detections/ AWS Security Token Service (STS) now supports enabling the global STS endpoint to issue session tokens compatible with all AWS Regions | https://aws.amazon.com/about-aws/whats-new/2019/04/aws-security-token-service-sts-now-supports-enabling-the-global-sts-endpoint-to-issue-session-tokens-compatible-with-all-aws-regions/ AWS WAF Security Automations Now Supports Log Analysis | https://aws.amazon.com/about-aws/whats-new/2019/04/aws-waf-security-automations-now-supports-log-analysis/ AWS Certificate Manager Private Certificate Authority Increases Certificate Limit To One Million | https://aws.amazon.com/about-aws/whats-new/2019/04/aws-certificate-manager-private-certificate-authority-increases-certificate-limit-to-one-million/ Amazon Cognito launches enhanced user password reset API for administrators | 
https://aws.amazon.com/about-aws/whats-new/2019/05/amazon-cognito-launches-enhanced-user-password-reset-api-for-administrators/ AWS Secrets Manager supports more client-side caching libraries to improve secrets availability and reduce cost | https://aws.amazon.com/about-aws/whats-new/2019/05/Secrets-Manager-Client-Side-Caching-Libraries-in-Python-NET-Go/ Create fine-grained session permissions using AWS Identity and Access Management (IAM) managed policies | https://aws.amazon.com/about-aws/whats-new/2019/05/session-permissions/ Topic || Training and Certification New VMware Cloud on AWS Navigate Track | https://aws.amazon.com/about-aws/whats-new/2019/04/vmware-navigate-track/ Topic || Blockchain Amazon Managed Blockchain What's New | https://aws.amazon.com/about-aws/whats-new/2019/04/introducing-amazon-managed-blockchain/ Topic || Quick Starts New Quick Start deploys SAP S/4HANA on AWS | https://aws.amazon.com/about-aws/whats-new/2019/05/new-quick-start-deploys-sap-s4-hana-on-aws/
Simon speaks with Randall Hunt about how he designed, built, and deployed the @WhereML Twitter bot that can identify where in the world a picture was taken using only the pixels in the image. We’ll dive deep on artificial intelligence and deep learning with the MXNet framework and also talk about working with the Twitter Account Activity API. The bot is entirely autoscaling and powered by Amazon API Gateway and AWS Lambda which means, as a customer, you don’t manage any infrastructure. Finally we’ll close with a discussion around custom authorizers in API Gateway and when to use them. https://github.com/ranman/WhereML
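For a sense of what the Twitter Account Activity API integration involves: before Twitter delivers events to a webhook, it sends a challenge-response check (CRC) that the endpoint must answer with an HMAC of the crc_token. Below is a minimal, illustrative Lambda handler for that step, not the actual WhereML source; the TWITTER_CONSUMER_SECRET environment variable and the API Gateway proxy-integration event shape are assumptions.

```python
# Minimal sketch of answering Twitter's CRC check from a Lambda function
# behind an API Gateway proxy integration (illustrative, not WhereML's code).
import base64
import hashlib
import hmac
import json
import os


def handler(event, context):
    secret = os.environ["TWITTER_CONSUMER_SECRET"].encode()  # assumed env var
    params = event.get("queryStringParameters") or {}
    if "crc_token" in params:
        # Twitter expects a base64-encoded HMAC-SHA256 of the crc_token,
        # prefixed with "sha256=", returned within a few seconds.
        digest = hmac.new(secret, params["crc_token"].encode(), hashlib.sha256).digest()
        token = "sha256=" + base64.b64encode(digest).decode()
        return {"statusCode": 200, "body": json.dumps({"response_token": token})}
    # Otherwise this would be an incoming account-activity event (mentions, DMs, ...).
    return {"statusCode": 200, "body": "ok"}
```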
Simon and Nicki cover almost 100 updates! Check out the chapter timings to see where things of interest to you might be. Infrastructure 00:42 Storage 1:17 Databases 4:14 Analytics 8:28 Compute 9:52 IoT 15:17 End User Computing 17:40 Machine Learning 19:10 Networking 21:57 Developer Tools 23:21 Application Integration 25:42 Game Tech 26:29 Media 27:37 Management and Governance 28:11 Robotics 30:35 Security 31:30 Solutions 32:40 Topic || Infrastructure In the Works – AWS Region in Indonesia | https://aws.amazon.com/blogs/aws/in-the-works-aws-region-in-indonesia/ Topic || Storage New Amazon S3 Storage Class – Glacier Deep Archive | https://aws.amazon.com/blogs/aws/new-amazon-s3-storage-class-glacier-deep-archive/ File Gateway Supports Amazon S3 Object Lock - Amazon Web Services | https://aws.amazon.com/about-aws/whats-new/2019/03/file-gateway-supports-amazon-s3-object-lock/ AWS Storage Gateway Tape Gateway Deep Archive | https://aws.amazon.com/about-aws/whats-new/2019/03/aws-storage-gateway-service-integrates-tape-gateway-with-amazon-s3-glacier-deeparchive-storage-class/ AWS Transfer for SFTP supports AWS Privatelink – Amazon Web Services | https://aws.amazon.com/about-aws/whats-new/2019/03/aws-transfer-for-sftp-now-supports-aws-privatelink/ Amazon FSx for Lustre Now Supports Access from Amazon Linux | https://aws.amazon.com/about-aws/whats-new/2019/03/amazon-fsx-for-lustre-now-supports-access-from-amazon-linux/ AWS introduces CSI Drivers for Amazon EFS and Amazon FSx for Lustre | https://aws.amazon.com/about-aws/whats-new/2019/04/aws-introduces-csi-drivers-for-amazon-efs-and-amazon-fsx-for-lus/ Topic || Databases Amazon DynamoDB drops the price of global tables by eliminating associated charges for DynamoDB Streams | https://aws.amazon.com/about-aws/whats-new/2019/04/amazon-dynamodb-drops-the-price-of-global-tables-by-eliminating-associated-charges-for-dynamodb-streams/ Amazon ElastiCache for Redis 5.0.3 enhances I/O handling to boost performance | https://aws.amazon.com/about-aws/whats-new/2019/03/amazon-elasticache-for-redis-503-enhances-io-handling-to-boost-performance/ Amazon Redshift announces Concurrency Scaling: Consistently fast performance during bursts of user activity | https://aws.amazon.com/about-aws/whats-new/2019/03/AmazonRedshift-ConcurrencyScaling/ Performance Insights is Generally Available on Amazon RDS for MariaDB | https://aws.amazon.com/about-aws/whats-new/2019/03/performance-insights-is-generally-available-for-mariadb/ Amazon RDS adds support for MySQL Versions 5.7.25, 5.7.24, and MariaDB Version 10.2.21 | https://aws.amazon.com/about-aws/whats-new/2019/03/amazon-rds-mysql-minor-5725-5725-and-mariadb-10221/ Amazon Aurora with MySQL 5.7 Compatibility Supports GTID-Based Replication | https://aws.amazon.com/about-aws/whats-new/2019/03/amazon-aurora-with-mysql-5-7-compatibility-supports-gtid-based-replication/ PostgreSQL 11 now Supported in Amazon RDS | https://aws.amazon.com/about-aws/whats-new/2019/03/postgresql11-now-supported-in-amazon-rds/ Amazon Aurora with PostgreSQL Compatibility Supports Logical Replication | https://aws.amazon.com/about-aws/whats-new/2019/03/amazon-aurora-with-postgresql-compatibility-supports-logical-replication/ Restore an Encrypted Amazon Aurora PostgreSQL Database from an Unencrypted Snapshot | https://aws.amazon.com/about-aws/whats-new/2019/03/restore-an-encrypted-aurora-postgresql-database-from-an-unencrypted-snapshot/ Amazon RDS for Oracle Now Supports In-region Read Replicas with Active Data Guard for Read Scalability and Availability | 
https://aws.amazon.com/about-aws/whats-new/2019/03/Amazon-RDS-for-Oracle-Now-Supports-In-region-Read-Replicas-with-Active-Data-Guard-for-Read-Scalability-and-Availability/ AWS Schema Conversion Tool Adds Support for Migrating Oracle ETL Jobs to AWS Glue | https://aws.amazon.com/about-aws/whats-new/2019/03/aws-schema-conversion-tool-adds-support-for-migrating-oracle-etl/ AWS Schema Conversion Tool Adds New Conversion Features | https://aws.amazon.com/about-aws/whats-new/2019/03/aws-sct-adds-support-for-new-endpoints/ Amazon Neptune Announces 99.9% Service Level Agreement | https://aws.amazon.com/about-aws/whats-new/2019/03/amazon-neptune-announces-service-level-agreement/ Topic || Analytics Amazon QuickSight Announces General Availability of ML Insights | https://aws.amazon.com/about-aws/whats-new/2019/03/amazon_quicksight_announced_general_availability_of_mL_insights/ AWS Glue enables running Apache Spark SQL queries | https://aws.amazon.com/about-aws/whats-new/2019/03/aws-glue-enables-running-apache-spark-sql-queries/ AWS Glue now supports resource tagging | https://aws.amazon.com/about-aws/whats-new/2019/03/aws-glue-now-supports-resource-tagging/ Amazon Kinesis Data Analytics Supports AWS CloudTrail Logging | https://aws.amazon.com/about-aws/whats-new/2019/03/amazon-kinesis-data-analytics-supports-aws-cloudtrail-logging/ Tag-on Create and Tag-Based IAM Application for Amazon Kinesis Data Firehose | https://aws.amazon.com/about-aws/whats-new/2019/03/tag-on-create-and-tag-based-iam-application-for-amazon-kinesis-data-firehose/ Topic || Compute Amazon EKS Introduces Kubernetes API Server Endpoint Access Control | https://aws.amazon.com/about-aws/whats-new/2019/03/amazon-eks-introduces-kubernetes-api-server-endpoint-access-cont/ Amazon EKS Opens Public Preview of Windows Container Support | https://aws.amazon.com/about-aws/whats-new/2019/03/amazon-eks-opens-public-preview-of-windows-container-support/ Amazon EKS now supports Kubernetes version 1.12 and Cluster Version Updates Via CloudFormation | https://aws.amazon.com/about-aws/whats-new/2019/03/amazon-eks-now-supports-kubernetes-version-1-12-and-cluster-vers/ New Local Testing Tools Now Available for Amazon ECS | https://aws.amazon.com/about-aws/whats-new/2019/03/new-local-testing-tools-now-available-for-amazon-ecs/ AWS Fargate and Amazon ECS Support External Deployment Controllers for ECS Services | https://aws.amazon.com/about-aws/whats-new/2019/03/aws-fargate-and-amazon-ecs-support-external-deployment-controlle/ AWS Fargate PV1.3 adds secrets and enhanced container dependency management | https://aws.amazon.com/about-aws/whats-new/2019/04/aws-fargate-pv1-3-adds-secrets-and-enhanced-container-dependency/ AWS Event Fork Pipelines – Nested Applications for Event-Driven Serverless Architectures | https://aws.amazon.com/about-aws/whats-new/2019/03/introducing-aws-event-fork-pipelines-nested-applications-for-event-driven-serverless-architectures/ New Amazon EC2 M5ad and R5ad Featuring AMD EPYC Processors are Now Available | https://aws.amazon.com/about-aws/whats-new/2019/03/new-amazon-ec2-m5ad-and-r5ad-featuring-amd-epyc-processors-are-now-available/ Announcing the Ability to Pick the Time for Amazon EC2 Scheduled Events | https://aws.amazon.com/about-aws/whats-new/2019/03/announcing-the-ability-to-pick-the-time-for-amazon-ec2-scheduled-events/ Topic || IoT AWS IoT Analytics now supports Single Step Setup of IoT Analytics Resources | 
https://aws.amazon.com/about-aws/whats-new/2019/03/aws-iot-analytics-now-supports-single-step-setup-of-iot-analytic/ AWS IoT Greengrass Adds New Connector for AWS IoT Analytics, Support for AWS CloudFormation Templates, and Integration with Fleet Indexing | https://aws.amazon.com/about-aws/whats-new/2019/03/aws-iot-greengrass-adds-new-connector-aws-iot-analytics-support-aws-cloudformation-templates-integration-fleet-indexing/ AWS IoT Device Tester v1.1 is Now Available for AWS IoT Greengrass v1.8.0 | https://aws.amazon.com/about-aws/whats-new/2019/03/aws-iot-device-tester-now-available-aws-iot-greengrass-v180/ AWS IoT Core Now Supports HTTP REST APIs with X.509 Client Certificate-Based Authentication On Port 443 | https://aws.amazon.com/about-aws/whats-new/2019/03/aws-iot-core-now-supports-http-rest-apis-with-x509-client-certificate-based-authentication-on-port-443/ Generate Fleet Metrics with New Capabilities of AWS IoT Device Management | https://aws.amazon.com/about-aws/whats-new/2019/03/generate-fleet-metrics-with-new-capabilities-of-aws-iot-device-management/ Topic || End User Computing Amazon AppStream 2.0 Now Supports iPad and Android Tablets and Touch Gestures | https://aws.amazon.com/about-aws/whats-new/2019/03/amazon-appstream-2-0-now-supports-ipad-and-android-tablets-and-t/ Amazon WorkDocs Drive now supports offline content and offline search | https://aws.amazon.com/about-aws/whats-new/2019/03/amazon-workdocs-drive-now-supports-offline-content-and-offline-s/ Introducing Amazon Chime Business Calling | https://aws.amazon.com/about-aws/whats-new/2019/03/introducing-amazon-chime-business-calling/ Introducing Amazon Chime Voice Connector | https://aws.amazon.com/about-aws/whats-new/2019/03/introducing-amazon-chime-voice-connector/ Alexa for Business now lets you create Alexa skills for your organization using Skill Blueprints | https://aws.amazon.com/about-aws/whats-new/2019/03/alexa-for-business-now-lets-you-create-alexa-skills-for-your-org/ Topic || Machine Learning New AWS Deep Learning AMIs: Amazon Linux 2, TensorFlow 1.13.1, MXNet 1.4.0, and Chainer 5.3.0 | https://aws.amazon.com/about-aws/whats-new/2019/03/new-aws-deep-learning-amis-amazon-linux2-tensorflow-13-1-mxnet1-4-0-chainer5-3-0/ Introducing AWS Deep Learning Containers | https://aws.amazon.com/about-aws/whats-new/2019/03/introducing-aws-deep-learning-containers/ Amazon Transcribe now supports speech-to-text in German and Korean | https://aws.amazon.com/about-aws/whats-new/2019/03/amazon-transcribe-now-supports-speech-to-text-in-german-and-korean/ Amazon Transcribe enhances custom vocabulary with custom pronunciations and display forms | https://aws.amazon.com/about-aws/whats-new/2019/03/amazon-transcribe-enhances-custom-vocabulary-with-custom-pronunciations-and-display-forms/ Amazon Comprehend now supports AWS KMS Encryption | https://aws.amazon.com/about-aws/whats-new/2019/03/amazon-comprehend-now-supports-aws-kms-encryption/ New Setup Tool To Get Started Quickly with Amazon Elastic Inference | https://aws.amazon.com/about-aws/whats-new/2019/04/new-python-script-to-get-started-quickly-with-amazon-elastic-inference/ Topic || Networking Application Load Balancers now Support Advanced Request Routing | https://aws.amazon.com/about-aws/whats-new/2019/03/application-load-balancers-now-support-advanced-request-routing/ Announcing Multi-Account Support for Direct Connect Gateway | https://aws.amazon.com/about-aws/whats-new/2019/03/announcing-multi-account-support-for-direct-connect-gateway/ Topic || Developer Tools AWS App 
Mesh is now generally available | https://aws.amazon.com/about-aws/whats-new/2019/03/aws-app-mesh-is-now-generally-available/ The AWS Toolkit for IntelliJ is Now Generally Available | https://aws.amazon.com/about-aws/whats-new/2019/03/the-aws-toolkit-for-intellij-is-now-generally-available/ The AWS Toolkit for Visual Studio Code (Developer Preview) is Now Available for Download from in the Visual Studio Marketplace | https://aws.amazon.com/about-aws/whats-new/2019/03/the-aws-toolkit-for-visual-studio-code--developer-preview--is-now-available-for-download-from-vs-marketplace/ AWS Cloud9 announces support for Ubuntu development environments | https://aws.amazon.com/about-aws/whats-new/2019/04/aws-cloud9-announces-support-for-ubuntu-development-environments/ Amplify Framework Adds Enhancements to Authentication for iOS, Android, and React Native Developers | https://aws.amazon.com/about-aws/whats-new/2019/03/amplify-framework-adds-enhancements-to-authentication-for-ios-android-and-react-native-developers/ AWS CodePipeline Adds Action-Level Details to Pipeline Execution History | https://aws.amazon.com/about-aws/whats-new/2019/03/aws-codepipeline-adds-action-level-details-to-pipeline-execution-history/ Topic || Application Integration Amazon API Gateway Improves API Publishing and Adds Features to Enhance User Experience | https://aws.amazon.com/about-aws/whats-new/2019/03/amazon-api-gateway-improves-api-publishing-and-adds-features/ Topic || Game Tech AWS Whats New - Lumberyard Beta 118 - Amazon Web Services | https://aws.amazon.com/about-aws/whats-new/2019/03/over-190-updates-come-to-lumberyard-beta-118-available-now/ Amazon GameLift Realtime Servers Now in Preview | https://aws.amazon.com/about-aws/whats-new/2019/03/amazon-gamelift-realtime-servers-now-in-preview/ Topic || Media Services Detailed Job Progress Status and Server-Side S3 Encryption Now Available with AWS Elemental MediaConvert | https://aws.amazon.com/about-aws/whats-new/2019/03/detailed-job-progress-status-and-server-side-s3-encryption-now-available-with-aws-elemental-mediaconvert/ Introducing Live Streaming with Automated Multi-Language Subtitling | https://aws.amazon.com/about-aws/whats-new/2019/03/introducing-live-streaming-with-automated-multi-language-subtitling/ Video on Demand Now Leverages AWS Elemental MediaConvert QVBR Mode | https://aws.amazon.com/about-aws/whats-new/2019/04/video-on-demand-now-leverages-aws-elemental-mediaconvert-qvbr-mode/ Topic || Management and Governance Use AWS Config Rules to Remediate Noncompliant Resources | https://aws.amazon.com/about-aws/whats-new/2019/03/use-aws-config-to-remediate-noncompliant-resources/ AWS Config Now Supports Tagging of AWS Config Resources | https://aws.amazon.com/about-aws/whats-new/2019/03/aws-config-now-supports-tagging-of-aws-config-resources/ Now You Can Query Based on Resource Configuration Properties in AWS Config | https://aws.amazon.com/about-aws/whats-new/2019/03/now-you-can-query-based-on-resource-configuration-properties-in-aws-config/ AWS Config Adds Support for Amazon API Gateway | https://aws.amazon.com/about-aws/whats-new/2019/03/aws-config-adds-support-for-amazon-api-gateway/ Amazon Inspector adds support for Amazon EC2 A1 instances | https://aws.amazon.com/about-aws/whats-new/2019/03/amazon-inspector-adds-support-for-amazon-ec2-a1-instances/ Service control policies in AWS Organizations enable fine-grained permission controls | https://aws.amazon.com/about-aws/whats-new/2019/03/service-control-policies-enable-fine-grained-permission-controls/ You 
can now use resource level policies for Amazon CloudWatch Alarms | https://aws.amazon.com/about-aws/whats-new/2019/04/you-can-now-use-resource-level-permissions-for-amazon-cloudwatch/ Amazon CloudWatch Launches Search Expressions | https://aws.amazon.com/about-aws/whats-new/2019/04/amazon-cloudwatch-launches-search-expressions/ AWS Systems Manager Announces 99.9% Service Level Agreement | https://aws.amazon.com/about-aws/whats-new/2019/03/aws-systems-manager-announces-service-level-agreement/ Topic || Robotics AWS RoboMaker Announces 99.9% Service Level Agreement | https://aws.amazon.com/about-aws/whats-new/2019/03/aws-robomaker-announces-service-level-agreement/ AWS RoboMaker announces new build and bundle feature that makes it up to 10x faster to update a simulation job or a robot | https://aws.amazon.com/about-aws/whats-new/2019/03/robomaker-new-build-and-bundle/ Topic || Security Announcing the renewal command for AWS Certificate Manager | https://aws.amazon.com/about-aws/whats-new/2019/03/Announcing-the-renewal-command-for-AWS-Certificate-Manager/ AWS Key Management Service Increases API Requests Per Second Limits | https://aws.amazon.com/about-aws/whats-new/2019/03/aws-key-management-service-increases-api-requests-per-second-limits/ Announcing AWS Firewall Manager Support For AWS Shield Advanced | https://aws.amazon.com/about-aws/whats-new/2019/03/announcing-aws-firewall-manager-support-for-aws-shield-advanced/ Topic || Solutions New AWS SAP Navigate Track | https://aws.amazon.com/about-aws/whats-new/2019/03/sap-navigate-track/ Deploy Micro Focus PlateSpin Migrate on AWS with New Quick Start | https://aws.amazon.com/about-aws/whats-new/2019/03/deploy-micro-focus-platespin-migrate-on-aws-with-new-quick-start/
Carin Meier talks about the Clojure MXNet package, MXNet, Scala interop with Java, and ML in society. Links: MXNet | Clojure MXNet | tvm-clj on The REPL | Scala - Clojure interop utility namespace | MXNet Issues tagged with Clojure | Clojure MXNet contribution needs | Clojurians Slack - #mxnet | MXNet Slack joining info | Other machine learning libraries: Cortex, dl4clj, jutsu.ai | Funny reinforcement learning outcomes: Twitter thread on unexpected RL outcomes, The Surprising Creativity of Digital Evolution
The AWS machine learning services are more examples of the newer offerings. Nevertheless, they are growing fast and can help you embrace cutting-edge technology. Machine learning is a recent technology in general, so the time you spend understanding these services may help you land that next job.

Amazon SageMaker
This service provides a method for building, training, and deploying machine learning models at any scale, and it is a great way to try out machine learning. The time you spend here will look good on your next resume update. You do need to put some data on S3 to analyze, and then check out the use cases. There is a free tier for the first two months.

Amazon Comprehend
Quick and easy text analysis. Send your text to this service to analyze it for keywords, among many other options. There is a free tier you can use to try it out and find ways to organize and mine your content.

Amazon Lex
This service allows you to build voice and chat bots using the technology that drives Alexa. There are some templates, and the interface makes it easy to get started quickly.

Amazon Polly
If you want to create audio from your content, then this is the service for you. Try out the service a few thousand words at a time for free, and you can even download the audio in MP3 format.

Amazon Rekognition
What Comprehend provides for text, Rekognition brings to the video world. This service analyzes video and can highlight or recognize people, objects, and other details you might search for in a stream.

Amazon Translate
This service provides a quick and easy way to translate text between supported languages. Much like Google Translate, it is quick and provides an API that you can use to significantly increase your audience.

Amazon Transcribe
If you have ever wondered about transcribing audio notes (or a podcast), then this is the service for you. It is quick and easy to customize, even for highly technical terms. The accuracy varies based on the clarity of the audio and background noise.

AWS DeepLens
This one is best understood by working through the tutorials. It provides a way to analyze videos for objects, faces, and activities. An essential difference from the others is that this is a piece of hardware, not just a service: an HD camera with onboard analysis tools for real-time processing of video.

AWS Deep Learning AMIs
These provide quick-start machine learning on EC2. Configuring a machine learning development environment can be tedious and time-consuming; these AMIs offer a shortcut to get working sooner.

Apache MXNet on AWS
This is a machine learning framework. Apache MXNet is a fast and scalable training and inference framework with an easy-to-use, concise API for machine learning. MXNet includes the Gluon interface, which allows developers of all skill levels to get started with deep learning in the cloud, on edge devices, and in mobile apps. In just a few lines of Gluon code, you can build linear regression, convolutional networks, and recurrent LSTMs for object detection, speech recognition, recommendation, and personalization.

TensorFlow on AWS
This is a machine learning framework on AWS. I think their own description works best and avoids any ignorance about it on my end: "TensorFlow™ enables developers to quickly and easily get started with deep learning in the cloud.
The framework has broad support in the industry and has become a popular choice for deep learning research and application development, particularly in areas such as computer vision, natural language understanding, and speech translation. You can get started on AWS with a fully-managed TensorFlow experience with Amazon SageMaker, a platform to build, train, and deploy machine learning models at scale. Or, you can use the AWS Deep Learning AMIs to build custom environments and workflows with TensorFlow and other popular frameworks including Apache MXNet, PyTorch, Caffe, Caffe2, Chainer, Gluon, Keras, and Microsoft Cognitive Toolkit."
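The "few lines of Gluon code" claim above is easy to check. Here is a minimal, illustrative sketch: linear regression with MXNet's Gluon API on synthetic data (the data and hyperparameters are invented for the example):

```python
# Linear regression in a few lines of MXNet Gluon, on synthetic data.
import mxnet as mx
from mxnet import autograd, gluon, nd

# Synthetic data: y = 2x + 1 plus a little noise.
X = nd.random.normal(shape=(256, 1))
y = 2 * X + 1 + 0.1 * nd.random.normal(shape=(256, 1))

net = gluon.nn.Dense(1)                      # one linear output unit
net.initialize(mx.init.Normal(sigma=0.1))
loss_fn = gluon.loss.L2Loss()
trainer = gluon.Trainer(net.collect_params(), "sgd", {"learning_rate": 0.1})

for epoch in range(20):
    with autograd.record():                  # record the forward pass
        loss = loss_fn(net(X), y)
    loss.backward()                          # backpropagate
    trainer.step(batch_size=X.shape[0])      # SGD update, normalized by batch size

print(net.weight.data(), net.bias.data())    # should approach 2 and 1
```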
The Apache MXNet deep learning framework is used for developing, training, and deploying diverse AI applications, including computer vision, speech recognition, natural language processing, and more, at scale. In this session, learn how to get started with Apache MXNet on the Amazon SageMaker machine learning platform. Chick-fil-A shares how it got started with MXNet on Amazon SageMaker to measure waffle fry freshness and how it leverages AWS services to improve the Chick-fil-A guest experience. Complete Title: AWS re:Invent 2018: [REPEAT 1] Build Deep Learning Applications Using Apache MXNet - Featuring Chick-fil-A (AIM407-R1)
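As a rough sketch of the getting-started flow the session covers, launching an MXNet training script on SageMaker with the Python SDK looks roughly like this (v1-era argument names, matching this 2018 session; the script name, IAM role, and S3 paths are placeholders):

```python
# Hedged sketch: train and deploy an MXNet script on SageMaker (SDK v1 names).
from sagemaker.mxnet import MXNet

estimator = MXNet(
    entry_point="train.py",          # your Gluon/MXNet training script (placeholder)
    role="arn:aws:iam::123456789012:role/SageMakerRole",  # placeholder IAM role
    train_instance_count=1,
    train_instance_type="ml.p3.2xlarge",
    framework_version="1.3.0",
    py_version="py3",
    hyperparameters={"epochs": 10, "learning-rate": 0.01},
)

# Data previously uploaded to S3; SageMaker provisions the instance,
# runs train.py against it, and tears everything down afterwards.
estimator.fit("s3://my-bucket/mxnet-training-data")

# Deploy the trained model behind a managed HTTPS endpoint.
predictor = estimator.deploy(initial_instance_count=1, instance_type="ml.m5.xlarge")
```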
Dr. Andres Rodriguez, a Senior Principal Engineer in the Data Center Group at Intel, stops by to talk about why it is so critical to optimize frameworks and software tools for artificial intelligence applications. Intel has worked hard over the last two years to optimize popular frameworks like Caffe, TensorFlow, MXNet, and PyTorch for Intel® Xeon® processors. We’ve also developed the Intel® Math Kernel Library for Deep Neural Networks (Intel® MKL-DNN) to accelerate deep learning workloads on Intel® architecture. Customers are now seeing the benefits of using their existing Intel Xeon processors for artificial intelligence workloads with increasingly optimized performance. For more on this topic, visit: http://ai.intel.com. Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Performance varies depending on system configuration. No computer system can be absolutely secure. Check with your system manufacturer or retailer or learn more at intel.com. Intel and the Intel logo are trademarks of Intel Corporation or its subsidiaries in the U.S. and/or other countries. *Other names and brands may be claimed as the property of others. © Intel Corporation.
Simon takes you through another BIG set of updates - what will catch your imagination? Shownotes: AWS re:Invent 2018: https://reinvent.awsevents.com/ AWS Public Sector Summit Canberra: https://aws.amazon.com/summits/canberra-public-sector/ Amazon QuickSight announces Pay-per-Session pricing, Private VPC Connectivity and more! | https://aws.amazon.com/about-aws/whats-new/2018/05/Amazon-QuickSight-announces-Pay-per-Session-pricing-Private-VPC-Connectivity-and-more/ Introducing Amazon EC2 M5d Instances | https://aws.amazon.com/about-aws/whats-new/2018/06/introducing-amazon-ec2-m5d-instances/ Amazon Polly Introduces a New French Female Voice, Léa | https://aws.amazon.com/about-aws/whats-new/2018/06/amazon-polly-introduces-a-new-french-female-voice-lea/ Amazon Neptune is now generally available to build fast, reliable graph applications | https://aws.amazon.com/about-aws/whats-new/2018/05/amazon-neptune-is-now-generally-available/ Amazon Athena releases support for Views | https://aws.amazon.com/about-aws/whats-new/2018/06/athena-support-for-views/ Amazon Redshift Can Now COPY from Parquet and ORC File Formats | https://aws.amazon.com/about-aws/whats-new/2018/06/amazon-redshift-can-now-copy-from-parquet-and-orc-file-formats/ Amazon DynamoDB Announces 99.999% Service Level Agreement for Global Tables | https://aws.amazon.com/about-aws/whats-new/2018/06/amazon-dynamodb-announces-a-monthly-service-level-agreement/ Amazon DynamoDB Backup and Restore Regional Expansion | https://aws.amazon.com/about-aws/whats-new/2018/06/amazon-dynamodb-backup-and-restore-regional-expansion/ Amazon DynamoDB Accelerator (DAX) SDK for Go Now Available | https://aws.amazon.com/about-aws/whats-new/2018/06/amazon-dynamodb-accelerator--dax--sdk-for-go-now-available/ Introducing Optimize CPUs for Amazon RDS for Oracle | https://aws.amazon.com/about-aws/whats-new/2018/06/introducing-optimize-cpus-for-amazon-rds-for-oracle/ Announcing General Availability of Performance Insights | https://aws.amazon.com/about-aws/whats-new/2018/06/announcing-general-availability-of-performance-insights/ Amazon RDS for PostgreSQL Read Replicas now support Multi-AZ Deployments | https://aws.amazon.com/about-aws/whats-new/2018/06/rds-postgres-supports-readreplicas-multiaz/ AWS Database Migration Service Can Start Replication Anywhere in a Transaction Log | https://aws.amazon.com/about-aws/whats-new/2018/06/aws-dms-can-start-replication-anywhere-in-a-transaction-log/ AWS Storage Gateway Adds SMB Support to Store and Access Objects in Amazon S3 Buckets | https://aws.amazon.com/about-aws/whats-new/2018/06/aws-storage-gateway-adds-smb-support-to-store-objects-in-amazon-s3/ Amazon EBS Extends Elastic Volumes to Support EBS Magnetic (Standard) Volume Type | https://aws.amazon.com/about-aws/whats-new/2018/06/amazon-ebs-extends-elastic-volumes-to-support-ebs-magnetic--standard--volume-type/ AWS CloudFormation StackSets Supports Multiple Execution Roles and Selective Update Operation on Stack Instances | https://aws.amazon.com/about-aws/whats-new/2018/05/aws-cloudformation-stacksets-supports-multiple-execution-roles-a/ Introducing CloudFormation Support for AWS PrivateLink Resources | https://aws.amazon.com/about-aws/whats-new/2018/06/cloudformation-support-for-aws-privatelink-resources/ Application Load Balancer Simplifies User Authentication for Your Applications | https://aws.amazon.com/about-aws/whats-new/2018/05/application-load-balancer-simplifies-user-authentication-for-your-applications/ Amazon MQ Now Supports AWS CloudFormation | 
https://aws.amazon.com/about-aws/whats-new/2018/06/amazon-mq-now-supports-aws-cloudformation/ Application Load Balancer Adds New Security Policies Including Policy for Forward Secrecy | https://aws.amazon.com/about-aws/whats-new/2018/06/application-load-balancer-adds-new-security-policies-including-policy-for-forward-secrecy/ Amazon Cognito Now Supports Custom Domains for a Unified Login Experience | https://aws.amazon.com/about-aws/whats-new/2018/06/amazon-cognito-now-supports-custom-domains-for-a-unified-login-experience/ Amazon Cognito Protection for Unusual Sign-in Activity and Compromised Credentials Is Now Generally Available | https://aws.amazon.com/about-aws/whats-new/2018/06/amazon-cognito-advanced-security-features/ AWS Shield Advanced Announces New Onboarding Wizard | https://aws.amazon.com/about-aws/whats-new/2018/06/shield-advanced-onboarding-sign-up-wizard-drt-permission/ AWS WAF Announces Two New Features | https://aws.amazon.com/about-aws/whats-new/2018/06/waf-new-features-queryargs-cidr/ Amazon EC2 Auto Recovery is now available for Dedicated Instances | https://aws.amazon.com/about-aws/whats-new/2018/05/amazon-ec2-auto-recovery-is-now-available-for-dedicated-instances/ Amazon SageMaker Now Supports PyTorch and TensorFlow 1.8 | https://aws.amazon.com/about-aws/whats-new/2018/06/amazon-sagemaker-now-supports-pytorch-and-tensorflow-1-8/ Amazon SageMaker now Provides Chainer integration, Support for AWS CloudFormation, and Availability in the Asia Pacific (Tokyo) AWS Region | https://aws.amazon.com/about-aws/whats-new/2018/05/amazon-sagemaker-chainer-nrt-cloud-formation-support/ Amazon SageMaker Inference Calls are now supported on AWS PrivateLink | https://aws.amazon.com/about-aws/whats-new/2018/06/amazon-sagemaker-inference-calls-are-supported-on-aws-privatelink/ Now Clone a Model Training Job on the Amazon SageMaker Console | https://aws.amazon.com/about-aws/whats-new/2018/05/now-clone-a-model-training-job-on-the-amazon-sagemaker-console/ Automatic Model Tuning is now Generally Available | https://aws.amazon.com/about-aws/whats-new/2018/05/automatic-model-tuning-is-now-generally-available/ Announcing AWS DeepLens support for TensorFlow and Caffe, expanded MXNet layer support, integration with Kinesis Video Streams, new sample project, and availability to buy on Amazon.com | https://aws.amazon.com/about-aws/whats-new/2018/06/aws-deeplens-tensorflow-caffe-mxnet-kinesis-video-streams-buy-now/ Amazon Elastic Container Service for Kubernetes Now Generally Available | https://aws.amazon.com/about-aws/whats-new/2018/06/amazon-elastic-container-service-for-kubernetes-eks-now-ga/ Amazon Sumerian Regional Expansions | https://aws.amazon.com/about-aws/whats-new/2018/06/Amazon-Sumerian-Regional-Expansions/ Amazon Sumerian Regional and Feature Expansion | https://aws.amazon.com/about-aws/whats-new/2018/06/amazon-sumerian-regional-and-feature-expansion/ Amazon API Gateway Supports Private APIs | https://aws.amazon.com/about-aws/whats-new/2018/06/api-gateway-supports-private-apis/ Amazon CloudWatch Adds VPC Endpoint Support to AWS PrivateLink | https://aws.amazon.com/about-aws/whats-new/2018/06/amazon-cloudwatch-adds-vpc-endpoint-support-to-aws-privatelink/ Announcing Amazon Linux 2 with Long Term Support (LTS) | https://aws.amazon.com/about-aws/whats-new/2018/06/announcing-amazon-linux-2-with-long-term-support/ AWS Introduces Amazon Linux WorkSpaces | https://aws.amazon.com/about-aws/whats-new/2018/06/aws-introduces-amazon-linux-workspaces/ Amazon CloudWatch Metric Math Supports Bulk 
Transformations | https://aws.amazon.com/about-aws/whats-new/2018/06/Amazon-CloudWatch-Metric-Math-Supports-Bulk-Transformations/ AWS CloudTrail Event History Now Includes All Management Events | https://aws.amazon.com/about-aws/whats-new/2018/06/aws-cloud-trail-event-history-now-includes-all-management-events/ Amazon ElastiCache for Redis announces support for Redis 4.0 with caching improvements and better memory management for high-performance in-memory data processing | https://aws.amazon.com/about-aws/whats-new/2018/06/amazon-elastiCache-for-redis-announces-support-for-redis-40/ AWS Config Introduces New Lower Pricing for AWS Config Rules | https://aws.amazon.com/about-aws/whats-new/2018/06/aws-config-introduces-new-lower-pricing-for-aws-config-rules/ AWS Marketplace Launches New Website Workflow | https://aws.amazon.com/about-aws/whats-new/2018/06/aws_marketplace_launches_new_website_workflow/
With libraries such as TensorFlow, PyTorch, scikit-learn, and MXNet freely available, it is easier than ever to start a deep learning project. Unfortunately, it is still difficult to manage the scaling and reproduction of training for these projects. Mourad Mourafiq built Polyaxon on top of Kubernetes to address this shortcoming. In this episode he shares his reasons for starting the project, how it works, and how you can start using it today.
We talked with nishigori and Tsubasa about AWS re:Invent 2017. AWS re:Invent AWS re:Invent 2017 Keynote: Andy Jassy - YouTube Amazon EKS - Highly available and scalable Kubernetes service Introducing AWS Fargate - Run Containers without Managing Infrastructure Amazon SageMaker | Build, train, and deploy machine learning models at scale Twitter | At MCL345 - NEW LAUNCH! Integrating Amazon SageMaker into your Enterprise … Accelerate Machine Learning with Amazon SageMaker | All Things Distributed Using Your Own Training Algorithms | Amazon SageMaker Amazon EC2 Bare Metal Instances with Direct Access to Hardware Amazon DynamoDB Update - Global Tables and On-Demand Backup AWS DeepLens DeepLens turned out to be a device packed with new AWS services AWS DeepLens - Get Hands-On Experience with Deep Learning With Our New Video Camera GitHub - aws-samples/reinvent-2017-deeplens-workshop: A reference Lambda function that predicts image labels for a image using a MXNet built deep learning model AWS GreenGrass Greengrass Bootcamp - Basic Introducing Amazon Kinesis Video Streams The Non-Profit Hackathon | AWS re:Invent Amazon GuardDuty | Intelligent threat detection and continuous monitoring to protect your AWS accounts and workloads. AWS Lambda Edge Workshops Lab 3: Simple API | Lambda@Edge Workshop AWS EC2 Virtualization 2017 | Brendan Gregg’s Blog Announcing Amazon EC2 Bare Metal Instances (Preview) VMware Cloud on AWS Amazon Pinpoint Alexa for Business | Empower your organization with Alexa Amazon Transcribe | Automatic speech recognition Introducing Amazon Translate – Real-time Language Translation Amazon Comprehend | Discover insights and relationships in text DEEP LEARNING SUMMIT AWS Media Services | Build video workflows in the cloud Videos from re:Invent 2017 are already starting to appear: YouTube / AWS re:Invent 2017 | Breakout Sessions
Reinforcement Learning (RL) can be used to solve real-world problems in robotics and conversational engines without supervision. AI algorithms that observe their surroundings and learn are considered the ultimate form of AI. RL shines in multi-agent scenarios, where each agent reacts in real time to a changing situation. In this session, we explain RL, the theory, and the algorithms used. We show an MXNet-based demo in which an agent learns to play a game, taking actions to win. Initially the agent makes very little progress, but after a few dozen iterations it can play the game better than any human being. You can generalize this approach to real-world problems. RL is used today in robotics, gaming, autonomous vehicle control, spoken language systems, and more. In this talk, I will be using Amazon EC2 P2 instances, the AWS Deep Learning AMI, the MXNet deep learning framework, Amazon EBS, and Amazon S3.
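As a generic illustration of the loop described above (not the session's MXNet demo), here is tabular Q-learning on a toy corridor game. Early episodes wander; after enough iterations, the greedy policy heads straight for the goal:

```python
# Tabular Q-learning on a toy 1-D corridor: the agent must walk right to a goal.
import random

N_STATES, ACTIONS = 6, (-1, +1)          # positions 0..5, move left/right
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
alpha, gamma, eps = 0.1, 0.9, 0.2        # learning rate, discount, exploration

for episode in range(500):
    s = 0
    while s != N_STATES - 1:             # play until the goal state is reached
        # Epsilon-greedy selection: mostly exploit, sometimes explore.
        if random.random() < eps:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda act: Q[(s, act)])
        s2 = min(max(s + a, 0), N_STATES - 1)
        r = 1.0 if s2 == N_STATES - 1 else 0.0
        # Q-learning update: nudge Q toward reward plus discounted best future value.
        best_next = max(Q[(s2, a2)] for a2 in ACTIONS)
        Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
        s = s2

# The learned greedy policy: every state should prefer moving right (+1).
print({s: max(ACTIONS, key=lambda act: Q[(s, act)]) for s in range(N_STATES - 1)})
```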
Startups and enterprises are increasingly using open source projects in their architectures. AWS customers and partners also run their own open source programs and contribute key technologies to the industry (see DCS201). At AWS, we engage with open source projects in several ways: through bug fixes and enhancements to popular projects, including work with the Hadoop ecosystem (see BDM401), Chromium (see BAP305), and Boto, and through standalone projects like the security library s2n (see NET405) and the machine learning project MXNet (see MAC401). We also have services like Amazon ECS for Docker (see CON316) and Amazon RDS for MySQL and PostgreSQL (see DAT305) that make open source easier to use. In this session, learn more about existing AWS open source work and our next steps.
Over the next decade, accelerating autonomous driving technology, including advances in artificial intelligence, sensors, cameras, radar, and data analytics, is set to transform how we commute. In this session, you learn how to use Amazon AI for a highly productive, on-demand, and scalable autonomous driving development environment. We compare the most popular AI frameworks, including TensorFlow and MXNet, for use in autonomous driving workloads. You learn about the AWS optimizations on MXNet that yield near-linear scalability for training deep neural networks and convolutional neural networks. We demonstrate how easy it is to get started with AI on AWS by building an object detection model from a sample training dataset. This session is intended for audiences who have some exposure to the underlying concepts of AI-based autonomous driving development.
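One hedged sketch of that getting-started step, using the GluonCV toolkit on top of MXNet to run a pretrained object detector (the image path is a placeholder; install with pip install mxnet gluoncv):

```python
# Run a pretrained SSD object detector on one image with GluonCV (illustrative).
from gluoncv import model_zoo, data

# Pretrained SSD-512 with a ResNet-50 backbone, trained on Pascal VOC.
net = model_zoo.get_model("ssd_512_resnet50_v1_voc", pretrained=True)

# Load and preprocess a test image (resizes the short side to 512 pixels).
x, img = data.transforms.presets.ssd.load_test("street_scene.jpg", short=512)

# Forward pass: class IDs, confidence scores, and bounding boxes per detection.
class_ids, scores, bounding_boxes = net(x)
print(class_ids.shape, scores.shape, bounding_boxes.shape)
```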
Financial services companies are using machine learning to reduce fraud, streamline processes, and improve their bottom line. AWS provides tools that help them easily use AI frameworks like MXNet and TensorFlow to perform predictive analytics, clustering, and more advanced data analyses. In this session, hear how IHS Markit has used machine learning on AWS to help global banking institutions manage their commodities portfolios. Learn how Amazon Machine Learning can take the hassle out of AI.
We take a look at two-faced Oracle, cover a FAMP installation, how Netflix works the complex stuff, and show you who the patron of yak shaving is. This episode was brought to you by

Headlines

Why is Oracle so two-faced over open source? (https://www.theregister.co.uk/2017/10/12/oracle_must_grow_up_on_open_source/)

Oracle loves open source. Except when the database giant hates open source. Which, according to its recent lobbying of the US federal government, seems to be "most of the time". Yes, Oracle has recently joined the Cloud Native Computing Foundation (CNCF) to up its support for open-source Kubernetes and, yes, it has long supported (and contributed to) Linux. And, yes, Oracle has even gone so far as to (finally) open up Java development by putting it under a foundation's stewardship. Yet this same, seemingly open Oracle has actively hammered the US government to consider that "there is no math that can justify open source from a cost perspective as the cost of support plus the opportunity cost of forgoing features, functions, automation and security overwhelm any presumed cost savings." That punch to the face was delivered in a letter to Christopher Liddell, a former Microsoft CFO and now director of Trump's American Technology Council, by Kenneth Glueck, Oracle senior vice president. The US government had courted input on its IT modernisation programme. Others writing back to Liddell included AT&T, Cisco, Microsoft and VMware. In other words, based on its letter, what Oracle wants us to believe is that open source leads to greater costs and poorly secured, limply featured software. Nor is Oracle content to leave it there, also arguing that open source is exactly how the private sector does not function, seemingly forgetting that most of the leading infrastructure, big data, and mobile software today is open source. Details! Rather than take this counterproductive detour into self-serving silliness, Oracle would do better to follow Microsoft's path. Microsoft, too, used to Janus-face its way through open source, simultaneously supporting and bashing it. Only under chief executive Satya Nadella's reign did Microsoft realise it's OK to fully embrace open source, and its financial results have loved the commitment. Oracle has much to learn, and emulate, in Microsoft's approach.

I love you, you're perfect. Now change

Oracle has never been particularly warm and fuzzy about open source. As founder Larry Ellison might put it, Oracle is a profit-seeking corporation, not a peace-loving charity. To the extent that Oracle embraces open source, therefore it does so for financial reward, just like every other corporation. Few, however, are as blunt as Oracle about this fact of corporate open-source life. As Ellison told the Financial Times back in 2006: "If an open-source product gets good enough, we'll simply take it. So the great thing about open source is nobody owns it – a company like Oracle is free to take it for nothing, include it in our products and charge for support, and that's what we'll do. "So it is not disruptive at all – you have to find places to add value. Once open source gets good enough, competing with it would be insane... We don't have to fight open source, we have to exploit open source." "Exploit" sounds about right. While Oracle doesn't crack the top-10 corporate contributors to the Linux kernel, it does register a respectable number 12, which helps it influence the platform enough to feel comfortable building its IaaS offering on Linux (and Xen for virtualisation).
Oracle has also managed to continue growing MySQL's clout in the industry while improving it as a product and business. As for Kubernetes, Oracle's decision to join the CNCF also came with P&L strings attached. "CNCF technologies such as Kubernetes, Prometheus, gRPC and OpenTracing are critical parts of both our own and our customers' development toolchains," said Mark Cavage, vice president of software development at Oracle. One can argue that Oracle has figured out the exploitation angle reasonably well. This, however, refers to the right kind of exploitation, the kind that even free software activist Richard Stallman can love (or, at least, tolerate). But when it comes to government lobbying, Oracle looks a lot more like Mr Hyde than Dr Jekyll.

Lies, damned lies, and Oracle lobbying

The current US president has many problems (OK, many, many problems), but his decision to follow the Obama administration's support for IT modernisation is commendable. Most recently, the Trump White House asked for feedback on how best to continue improving government IT. Oracle's response is high comedy in many respects. As TechDirt's Mike Masnick summarises, Oracle's "latest crusade is against open-source technology being used by the federal government – and against the government hiring people out of Silicon Valley to help create more modern systems. Instead, Oracle would apparently prefer the government just give it lots of money." Oracle is very good at making lots of money. As such, its request for even more isn't too surprising. What is surprising is the brazenness of its position. As Masnick opines: "The sheer contempt found in Oracle's submission on IT modernization is pretty stunning." Why? Because Oracle contradicts much that it publicly states in other forums about open source and innovation. More than this, Oracle contradicts much of what we now know is essential to competitive differentiation in an increasingly software and data-driven world. Take, for example, Oracle's contention that "significant IT development expertise is not... central to successful modernization efforts". What? In our "software is eating the world" existence Oracle clearly believes that CIOs are buyers, not doers: "The most important skill set of CIOs today is to critically compete and evaluate commercial alternatives to capture the benefits of innovation conducted at scale, and then to manage the implementation of those technologies efficiently." While there is some truth to Oracle's claim – every project shouldn't be a custom one-off that must be supported forever – it's crazy to think that a CIO – government or otherwise – is doing their job effectively by simply shovelling cash into vendors' bank accounts. Indeed, as Masnick points out: "If it weren't for Oracle's failures, there might not even be a USDS [the US Digital Service created in 2014 to modernise federal IT]. USDS really grew out of the emergency hiring of some top-notch internet engineers in response to the Healthcare.gov rollout debacle. And if you don't recall, a big part of that debacle was blamed on Oracle's technology." In short, blindly giving money to Oracle and other big vendors is the opposite of IT modernisation. In its letter to Liddell, Oracle proceeded to make the fantastic (by which I mean "silly and false") claim that "the fact is that the use of open-source software has been declining rapidly in the private sector". What?!? This is so incredibly untrue that Oracle should score points for being willing to say it out loud.
Take a stroll through the most prominent software in big data (Hadoop, Spark, Kafka, etc.), mobile (Android), application development (Kubernetes, Docker), and machine learning/AI (TensorFlow, MXNet), and compare it to Oracle's statement. One conclusion must be that Oracle believes its CIO audience is incredibly stupid. Oracle then tells a half-truth by declaring: "There is no math that can justify open source from a cost perspective." How so? Because "the cost of support plus the opportunity cost of forgoing features, functions, automation and security overwhelm any presumed cost savings." Which I guess is why Oracle doesn't use any open source like Linux, Kubernetes, etc. in its services. Oops.

The Vendor Formerly Known As Satan

The thing is, Oracle doesn't need to do this and, for its own good, shouldn't do this. After all, we already know how this plays out. We need only look at what happened with Microsoft. Remember when Microsoft wanted us to "get the facts" about Linux? Now it's a big-time contributor to Linux. Remember when it told us open source was anti-American and a cancer? Now it aggressively contributes to a huge variety of open-source projects, some of them homegrown in Redmond, and tells the world that "Microsoft loves open source." Of course, Microsoft loves open source for the same reason any corporation does: it drives revenue as developers look to build applications filled with open-source components on Azure. There's nothing wrong with that. Would Microsoft prefer government IT to purchase SQL Server instead of open-source-licensed PostgreSQL? Sure. But look for a single line in its response to the Trump executive order that signals "open source is bad". You won't find it. Why? Because Microsoft understands that open source is a friend, not foe, and has learned how to monetise it. Microsoft, in short, is no longer conflicted about open source. It can compete at the product level while embracing open source at the project level, which helps fuel its overall product and business strategy. Oracle isn't there yet, and is still stuck where Microsoft was a decade ago. It's time to grow up, Oracle. For a company that builds great software and understands that it increasingly needs to depend on open source to build that software, it's disingenuous at best to lobby the US government to put the freeze on open source. Oracle needs to learn from Microsoft, stop worrying and love the open-source bomb. It was a key ingredient in Microsoft's resurgence. Maybe it could help Oracle get a cloud clue, too.

Install FAMP on FreeBSD (https://www.linuxsecrets.com/home/3164-install-famp-on-freebsd)

The acronym FAMP refers to a set of free, open-source applications commonly used in web server environments (Apache, MySQL, and PHP) running on the FreeBSD operating system; together they provide a stack for web serving, databases, and dynamic pages.

Prerequisites: sudo installed and working (see the link above for installing sudo); Apache; PHP 5 or PHP 7; MySQL or MariaDB; and your favorite editor (ours is vi). Note: you don't need to upgrade FreeBSD, but make sure all patches have been installed and your ports tree is up to date if you plan to update by ports. Fetch the ports tree with portsnap fetch. You must use sudo for each individual command during installation.

Search the available Apache versions to install with pkg search apache, then install Apache 2.4 using pkg. The user account managing Apache in FreeBSD is www.
```
pkg install apache24
```

Hit y at the confirmation prompt to install Apache 2.4. This installs Apache and its dependencies.

Enable Apache with sysrc so the service starts at boot time (the command below adds apache24_enable="YES" to the /etc/rc.conf file), then start the service:

```
sysrc apache24_enable=yes
service apache24 start
```

Verify by visiting your server's public IP address in your web browser. If you do not know what your server's public IP address is, there are a number of ways to find it; usually, it is the address you use to connect to your server through SSH:

```
ifconfig vtnet0 | grep "inet " | awk '{ print $2 }'
```

Now that you have the public IP address, you can use it in your web browser's address bar to access your web server.

Install MySQL. Now that we have our web server up and running, it is time to install MySQL, the relational database management system. The MySQL server will organize and provide access to databases where our server can store information. Install MySQL 5.7 using pkg:

```
pkg install mysql57-server
```

Enter y at the confirmation prompt. This installs the MySQL server and client packages. To enable the MySQL server as a service, add mysql_enable="YES" to the /etc/rc.conf file with sysrc, start the server, and then run the security script that removes some dangerous defaults and slightly restricts access to your database system:

```
sysrc mysql_enable=yes
service mysql-server start
mysql_secure_installation
```

Answer all questions to secure your newly installed MySQL database; at "Enter current password for root (enter for none):" just press RETURN. Your database system is now set up and we can move on.

Install PHP 5.6 or PHP 7.0. You can see what is available with pkg search php70; to install PHP 7.0 you would type pkg install php70-mysqli mod_php70. Note: in these instructions we are using PHP 5.6, not PHP 7.0; we will be coming out with PHP 7.0 instructions with FPM. PHP is the component of our setup that will process code to display dynamic content. It can run scripts, connect to MySQL databases to get information, and hand the processed content over to the web server to display. We are going to install the mod_php56, php56-mysql, and php56-mysqli packages, copy the sample PHP configuration file into place, and regenerate the system's cached information about installed executable files:

```
pkg install mod_php56 php56-mysql php56-mysqli
cp /usr/local/etc/php.ini-production /usr/local/etc/php.ini
rehash
```

Before using PHP, you must configure it to work with Apache.

Install PHP modules (optional). To enhance the functionality of PHP, we can optionally install some additional modules. To see the available options for PHP 5.6 modules and libraries, type pkg search php56. To get more information about a module, look at the long description of the package with pkg search -f, for example pkg search -f php56-calendar. Optional install example: pkg install php56-calendar.

Configure Apache to use the PHP module. Open the Apache configuration file with vim /usr/local/etc/apache24/Includes/php.conf and make sure it contains:

```
DirectoryIndex index.php index.html
```

Next, we will configure Apache to process requested PHP files with the PHP processor.
Install PHP 5.6 or PHP 7.0

pkg search php70

To install PHP 7.0 you would type:

pkg install php70-mysqli mod_php70

Note: In these instructions we are using PHP 5.6, not PHP 7.0. We will be coming out with PHP 7.0 instructions with FPM.

PHP is the component of our setup that will process code to display dynamic content. It can run scripts, connect to MySQL databases to get information, and hand the processed content over to the web server to display. We're going to install the mod_php, php-mysql, and php-mysqli packages. To install PHP 5.6 with pkg, run this command:

pkg install mod_php56 php56-mysql php56-mysqli

Copy the sample PHP configuration file into place:

cp /usr/local/etc/php.ini-production /usr/local/etc/php.ini

Regenerate the system's cached information about your installed executable files:

rehash

Before using PHP, you must configure it to work with Apache.

Install PHP modules (optional)

To enhance the functionality of PHP, we can optionally install some additional modules. To see the available options for PHP 5.6 modules and libraries, type:

pkg search php56

To get more information about a package, look at its long description, for example:

pkg search -f apache24

Optional install example:

pkg install php56-calendar

Configure Apache to use the PHP module

Open the Apache configuration file:

vim /usr/local/etc/apache24/Includes/php.conf

DirectoryIndex index.php index.html

Next, we will configure Apache to process requested PHP files with the PHP processor. Add these lines to the end of the file:

<FilesMatch "\.php$">
    SetHandler application/x-httpd-php
</FilesMatch>
<FilesMatch "\.phps$">
    SetHandler application/x-httpd-php-source
</FilesMatch>

Now restart Apache to put the changes into effect:

service apache24 restart

Test PHP processing

By default, the DocumentRoot is set to /usr/local/www/apache24/data. We can create the info.php file under that location by typing:

vim /usr/local/www/apache24/data/info.php

Add the following line to info.php and save it:

<?php phpinfo(); ?>

The info.php file gives you information about your server from the perspective of PHP. It's useful for debugging and to ensure that your settings are being applied correctly. If this was successful, then your PHP is working as expected. You probably want to remove info.php after testing, because it could give information about your server to unauthorized users. Remove the file by typing:

rm /usr/local/www/apache24/data/info.php

Note: Make sure the www user, which should have been created during the Apache install, is the owner of the /usr/local/www structure.
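To check that PHP can actually talk to MySQL, here is a minimal test page in the same spirit as info.php (a sketch; the dbtest.php name and the use of the root account are illustrative only, so substitute your own password):

```
# create a throwaway test page under the DocumentRoot
cat > /usr/local/www/apache24/data/dbtest.php <<'EOF'
<?php
// try a local MySQL connection and report the outcome
$conn = mysqli_connect("localhost", "root", "your-root-password");
echo $conn ? "MySQL connection OK" : "MySQL connection failed: " . mysqli_connect_error();
?>
EOF
```

Load it in your browser the same way as info.php, and remove it with rm once you've seen the result.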
That explains FAMP on FreeBSD.

IXsystems: TrueNAS X10 Torture Test & Fail Over Systems In Action with the ZFS File System (https://www.youtube.com/watch?v=GG_NvKuh530)

How Netflix works: what happens every time you hit Play (https://medium.com/refraction-tech-everything/how-netflix-works-the-hugely-simplified-complex-stuff-that-happens-every-time-you-hit-play-3a40c9be254b)

Not long ago, House of Cards came back for its fifth season, finally ending a long wait for binge watchers across the world interested in an American politician's ruthless ascent to the presidency. For them, kicking off a marathon is as simple as reaching for a device or remote, opening the Netflix app and hitting Play. Simple, fast and instantly gratifying. What isn't as simple is what goes into running Netflix, a service that streams around 250 million hours of video per day to around 98 million paying subscribers in 190 countries. At this scale, providing quality entertainment in a matter of a few seconds to every user is no joke. And as much as it means building top-notch infrastructure at a scale no other Internet service has done before, it also means that a lot of participants in the experience have to be negotiated with and kept satiated, from the production companies supplying the content to the internet providers dealing with the network traffic Netflix brings upon them. This is, in short and in the most layman terms, how Netflix works.

Let us just try to understand how Netflix is structured on the technological side with a simple example. Netflix ushered in a revolution around ten years ago by rewriting the applications that run the entire service to fit into a microservices architecture, which means that each application, or microservice, has code and resources that are its very own. It will not share any of it with any other app by nature. And when two applications do need to talk to each other, they use an application programming interface (API), a tightly controlled set of rules that both programs can handle. Developers can now make many changes, small or huge, to each application as long as they ensure that it plays well with the API. And since each program knows the other's API properly, no change will break the exchange of information.

Netflix estimates that it uses around 700 microservices to control each of the many parts of what makes up the entire Netflix service: one microservice stores which shows you watched, one deducts the monthly fee from your credit card, one provides your device with the correct video files it can play, one looks at your watching history and uses algorithms to guess a list of movies you will like, and one provides the names and images of these movies to be shown in a list on the main menu. And that's just the tip of the iceberg. Netflix engineers can make changes to any part of the application and can introduce new changes rapidly while ensuring that nothing else in the entire service breaks down.

They made a courageous decision to stop maintaining their own servers and move all of their stuff to the cloud, i.e. run everything on the servers of someone else who deals with maintaining the hardware while Netflix engineers write hundreds of programs and deploy them rapidly. The someone else they chose for their cloud-based infrastructure is Amazon Web Services (AWS).

Netflix works on thousands of devices, and each of them plays a different format of video and sound files. A set of AWS servers takes the original film file and converts it into hundreds of files, each meant to play the entire show or film on a particular type of device and a particular screen size or video quality. One file will work exclusively on the iPad, one on a full HD Android phone, one on a Sony TV that can play 4K video and Dolby sound, one on a Windows computer, and so on. Even more of these files can be made with varying video qualities so that they are easier to load on a poor network connection. This is a process known as transcoding. A special piece of code is also added to these files to lock them with what is called digital rights management, or DRM, a technological measure which prevents piracy of films.

The Netflix app or website determines what particular device you are using to watch, and fetches the exact file for that show meant to play on your particular device, with a particular video quality based on how fast your internet is at that moment. Here, instead of relying on AWS servers, Netflix installs its very own servers around the world, with only one purpose: to store content smartly and deliver it to users. Netflix strikes deals with internet service providers and provides them its red Open Connect box at no cost. ISPs install these alongside their own servers. These Open Connect boxes download the Netflix library for their region from the main servers in the US; if there are multiple boxes, each will preferentially store the content that is more popular with Netflix users in the region, to prioritise speed. So a rarely watched film might take longer to load than a Stranger Things episode. Now, when you connect to Netflix, the closest Open Connect box delivers the content you need, so videos load faster than if your Netflix app tried to load them from the main servers in the US.

In a nutshell... This is what happens when you hit that Play button: Hundreds of microservices, or tiny independent programs, work together to make one large Netflix service. Content legally acquired or licensed is converted into a size that fits your screen, and protected from being copied. Servers across the world make a copy of it and store it so that the closest one to you delivers it at max quality and speed.
When you select a show, your Netflix app cherry-picks which of these servers it will load the video from. You are now gripped by Frank Underwood's chilling tactics, given depression by BoJack Horseman's rollercoaster life, tickled by Dev in Master of None and made phobic to the future of technology by the stories in Black Mirror. And your lifespan decreases as your binge watching turns you into a couch potato. It looked so simple before, right?

News Roundup

Moving FreshPorts (http://dan.langille.org/2017/11/15/moving-freshports/)

Today I moved the FreshPorts website from one server to another. My goal is for nobody to notice. In preparation for this move, I have:

* reduced the DNS TTL to 60s
* posted to Twitter
* updated the status page
* put the website in offline mode

What was missed: I turned off commit processing on the new server, but I did not do this on the old server. I should have run:

sudo svc -d /var/service/freshports

That stops processing of incoming commits. No data is lost, but it keeps the two databases at the same spot in history. Commit processing could continue during the database dumping, but that does not affect the dump, which will be consistent regardless.

The offline code: Here is the basic stuff I used to put the website into offline mode. The main points are:

header("HTTP/1.1 503 Service Unavailable");
ErrorDocument 404 /index.php

I move the DocumentRoot to a new directory containing only index.php. Every error invokes index.php, which returns a 503 code.

The dump: The database dump just started (Sun Nov 5 17:07:22 UTC 2017).

root@pg96:~ # /usr/bin/time pg_dump -h 206.127.23.226 -Fc -U dan freshports.org > freshports.org.9.6.dump

That should take about 30 minutes. I have set a timer to remind me. Total time was: 1464.82 real 1324.96 user 37.22 sys

The MD5 is: MD5 (freshports.org.9.6.dump) = 5249b45a93332b8344c9ce01245a05d5

It is now: Sun Nov 5 17:34:07 UTC 2017

The rsync: The rsync should take about 10-20 minutes. I have already done an rsync of yesterday's dump file; the rsync today should copy over only the deltas (i.e. differences). The rsync started at about Sun Nov 5 17:36:05 UTC 2017. That took 2m9.091s. The MD5 matches.

The restore: The restore should take about 30 minutes (I ran this test yesterday). It is now Sun Nov 5 17:40:03 UTC 2017.

$ createdb -T template0 -E SQL_ASCII freshports.testing
$ time pg_restore -j 16 -d freshports.testing freshports.org.9.6.dump

Done. real 25m21.108s user 1m57.508s sys 0m15.172s

It is now Sun Nov 5 18:06:22 UTC 2017.

Insert break here: About here, I took a 30 minute break to run an errand. It was worth it.

Changing DNS: I'm ready to change DNS now. It is Sun Nov 5 19:49:20 EST 2017. Done. And nearly immediately, traffic started.

How many misses? During this process, XXXXX requests were declined:

$ grep -c '" 503 ' /usr/websites/log/freshports.org-access.log
XXXXX

That's it, we're done. Total elapsed time: 1 hour 48 minutes. There are still a number of things to follow up on, but the transfer itself is complete.
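For flavour, here is a minimal sketch of what that offline mode can look like (the path and wording are illustrative, not Dan's actual code): DocumentRoot points at a directory holding only index.php, the ErrorDocument directive funnels every request there, and the page answers with a 503.

```
# build a one-file offline DocumentRoot (path is illustrative)
mkdir -p /usr/local/www/offline
cat > /usr/local/www/offline/index.php <<'EOF'
<?php
// every request lands here via "ErrorDocument 404 /index.php"
header("HTTP/1.1 503 Service Unavailable");
echo "We are briefly offline for a server move. Back soon.";
?>
EOF
```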
The new FreshPorts Server (http://dan.langille.org/2017/11/17/x8dtu-3/) ***

Using bhyve on top of CEPH (https://lists.freebsd.org/pipermail/freebsd-virtualization/2017-November/005876.html)

Hi, just an info point. I'm preparing for a lecture tomorrow, and thought why not do an actual demo.... I like to be friends with Murphy :)

So after I started the cluster (5 jails with 7 OSDs), this is what I manually needed to do to boot a memory stick and start a bhyve instance:

rbd --dest-pool rbddata --no-progress import memstick.img memstick
rbd-ggate map rbddata/memstick

The ggate device is available on /dev/ggate1.

kldload vmm
kldload nmdm
kldload if_tap
kldload if_bridge
kldload cpuctl
sysctl net.link.tap.up_on_open=1
ifconfig bridge0 create
ifconfig bridge0 addm em0 up
ifconfig tap11 create
ifconfig bridge0 addm tap11
ifconfig tap11 up

Load the ggate disk in bhyve:

bhyveload -c /dev/nmdm11A -m 2G -d /dev/ggate1 FB11

and boot a single VM from it:

bhyve -H -P -A -c 1 -m 2G -l com1,/dev/nmdm11A -s 0:0,hostbridge -s 1:0,lpc -s 2:0,virtio-net,tap11 -s 4,ahci-hd,/dev/ggate1 FB11 &
bhyvectl --vm=FB11 --get-stats

Connect to the VM:

cu -l /dev/nmdm11B

And that'll give you a bhyve VM running on an RBD image over ggate. In the installer I tested reading from the boot disk:

root@:/ # dd if=/dev/ada0 of=/dev/null bs=32M
21+1 records in
21+1 records out
734077952 bytes transferred in 5.306260 secs (138341865 bytes/sec)

which is a nice 138 MB/sec. Hope the demonstration does work out tomorrow. --WjW ***
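If you replicate the demo, a teardown sketch for afterwards may be handy (hedged; the names match the demo above, and you should check rbd-ggate(8) on your system for the exact unmap syntax):

```
# stop and destroy the VM
bhyvectl --vm=FB11 --destroy
# detach the RBD-backed GEOM gate device
rbd-ggate unmap /dev/ggate1
# remove the demo network plumbing
ifconfig tap11 destroy
ifconfig bridge0 destroy
```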
Donald Knuth - The Patron Saint of Yak Shaves (http://yakshav.es/the-patron-saint-of-yakshaves/)

Excerpts: In 2015, I gave a talk in which I called Donald Knuth the Patron Saint of Yak Shaves. The reason is that Donald Knuth achieved the most perfect and long-running yak shave: TeX. I figured this is worth repeating.

How to achieve the ultimate yak shave: The ultimate yak shave is the combination of improbable circumstance, the privilege to be able to shave at your heart's will, and the will to follow things through to the end. Here's the way it was achieved with TeX. The recounting is purely mine, inaccurate, and obviously there for fun. I'll avoid the most boring facts that everyone always tells, such as why Knuth's checks have their own Wikipedia page.

Community shaving is best shaving: Since the release of TeX, the community has been busy working on using it as a platform. If you ever downloaded the full TeX distribution, please bear in mind that you are downloading the amassed work of over 40 years, to make sure that each and every TeX document ever written still builds. We're talking about documents here. But mostly, two big projects sprang out of that. The first is LaTeX by Leslie Lamport. Lamport is a very productive researcher, famous for his research in formal methods through TLA+ and also known for laying the groundwork for many distributed algorithms. LaTeX is based on the idea of separating presentation and content. It is built around document classes, which describe the way a certain document is laid out. Think Markdown, just much more complex. The second is ConTeXt, which is far more focused on fine-grained layout control.

The moral of the story: Whenever you feel like "can't we just replace this whole thing, it can't be so hard" when handling TeX, don't forget how many years of work and especially knowledge were poured into that system. Typesetting isn't the most popular knowledge among programmers. See it especially in the context of the space it is in: they can't remove legacy. Ever. That would break documents. TeX is also not a programming language. It might resemble one, but mostly it should be approached as a typesetting system first. A lot of its confusing lingo makes much more sense then; it's not programming lingo.

By approaching TeX with an understanding of its history, a lot can be learned from it. And yes, a replacement would be great, but it would take ages. In any case, I hope I thoroughly convinced you why Donald Knuth is the Patron Saint of Yak Shaves.

Extra credits: This comes out of an enjoyable discussion with [Arne from Lambda Island](https://lambdaisland.com/), who listened and said "you should totally turn this into a talk".

Vincent's trip to EuroBSDCon 2017 (http://www.vincentdelft.be/post/post_20171016)

My EuroBSDCon 2017. Posted on 2017-10-16 09:43:00 from Vincent in OpenBSD.

Let me just share my feedback on those 2 days spent in Paris for EuroBSDCon, my 1st BSDCon. I'm not a developer, contributor, ... so do not expect to improve your OpenBSD skills with this text :-) I know, we are on October 16th and the EuroBSDCon of Paris was 3 weeks ago :( I'm not quick!!! Sorry for that.

I arrive at 10h, too late for the start of the keynote. The few people behind the desk welcome me by talking in Dutch, mainly because of my name. Indeed, Delft is a city in the Netherlands, and also a well-known university. I inform them that I'm from Belgium, and the discussion moves on to the fact that Fosdem is located in Brussels. I receive my nice white and blue T-shirt, a bit like a marine T-shirt but with the nice EuroBSDCon logo. I ask where the different rooms reserved for the BSD event are: one big room on the 1st floor, one medium room one level below, and two small ones one level above, all really easy to access. In the entrance hall there are 4 or 5 tables with people representing their companies, mainly the big sponsors of the event, providing details about their activities and business. I chat a little with StormShield and Gandi. At other tables people are selling BSD T-shirts, which quickly sell out.

"Is it done yet?" The never ending story of pkg tools: At the last Fosdem I had already heard Antoine and Baptiste presenting the OpenBSD and FreeBSD battle, so I decide to listen to Marc Espie in the medium room, called Karnak. Marc explains that he has completely rewritten the pkg_add command. He explains that, contrary to other parts of OpenBSD, the package tools must stay backward compatible and stable over a longer period than 12 months (the support period for OpenBSD). On the funny side, he explains that he gets his best ideas in the bath. Hackathons are also used to validate ideas with other OpenBSD developers. All in all, he explains that the most time-consuming part is coming up with a good solution; coding it is quite straightforward. He adds that the better an idea is, the shorter the implementation will be.

A Tale of six motherboards, three BSDs and coreboot: After lunch I decide to listen to the talk about coreboot. Indeed, 1 or 2 years ago I had listened to a talk on the Libreboot project at Fosdem, and since they made several references to coreboot, this is a perfect occasion to hear about the project in more detail. Piotr and Katarzyna Kubaj explain to us how to boot a machine without the native BIOS. Indeed, coreboot can replace the BIOS and thereby avoid several binaries imposed by the vendor. They explain that some motherboards support their code, but they also show how difficult it is to flash a BIOS and replace it with coreboot. They even destroyed a motherboard during an installation, apparently because the power supply they were using was not stable enough on the 3V line.
It's really amazing to see that open source developers can go, by themselves, to such a deep technical level.

State of the DragonFly's graphics stack: After this coreboot talk, I decide to stay in the room to follow the presentation of François Tigeot. François is now one of the core developers of DragonFly BSD, an amazing BSD system with its own filesystem, called HAMMER. HAMMER offers several amazing features like snapshots, checksummed data integrity, deduplication, ... François has spent the last years integrating the video drivers developed for Linux into DragonFly BSD. He explains that instead of adapting the video card code to the DragonFly BSD kernel API, he has "simply" built an intermediate layer between the DragonFly BSD kernel and the video drivers. This is not said in the talk, but the effort is very impressive; it is more or less a Linux emulator inside DragonFly BSD. François explains that he started with the Intel video driver (drm/i915), but he is now able to run drm/radeon quite well, and also drm/amdgpu and drm/nouveau.

Discovering OpenBSD on AWS: Then I move to the small room on the upper level to follow a presentation by Laurent Bernaille on OpenBSD and AWS. First, Laurent explains that he is re-using the work done by Antoine Jacoutot on integrating OpenBSD into AWS, but on top of that he has integrated several other open source solutions, allowing him to build OpenBSD machines very quickly with one command; moreover, those machines come with the network config, the required packages, ... Beyond the slides, he shows us in a live demo how the system works. An amazing presentation which shows that, by putting the correct tools together, one machine can build and configure other machines in one go.

OpenBSD Testing Infrastructure Behind bluhm.genua.de: Here Jan Klemkow explains that he has set up a lab where he is able to run different OpenBSD architectures. The system has been designed to install, on demand, a given version of OpenBSD on the different available machines. On top of that, a regression test script can be triggered, producing reports that show what works and what no longer works on the different machines. If I've understood well, Jan wants to offer this lab to the OpenBSD core developers so they can validate their code easily and quickly. Some more effort is needed to reach this goal, but with what exists today, Jan and his colleague are quite close. Since his company uses OpenBSD in its business, in his eyes this system is a "tit for tat" to the OpenBSD community.

French story on cybercrime: Then comes the second keynote of the day in the big auditorium, given by a colonel of the French gendarmerie, Mr Freyssinet, head of the cybercrime unit of the Gendarmerie. Mr Freyssinet explains that the "bad guys" are more and more mobile across countries, and more and more organized. The lone hacker in his room is no longer the reality. As a consequence, the national police investigators of different countries collaborate more inside an organization called Interpol. What is striking in his talk is that Mr Freyssinet talks about "crime as a service": more and more hackers are selling their services to "bad and temporary organizations".

Social event: It's now time for the famous social event on the river, la Seine. The organizers ask us to walk, in small groups, to a station, a 15 minute walk through Paris.
Luckily, the weather is perfect. To be clearly identifiable, several organizers carry a "beastie fork" in their hands as they walk along the sidewalk, generating some amazing reactions from citizens and tourists; some of them recognize the FreeBSD logo and ask us for details. Amazing :-) We walk along small and big sidewalks until we reach a small staircase going under the street, where we find a train station, a bit like a metro station. Three stops later they ask us to get off. We walk a few minutes and arrive in front of a boat with two decks: one inside, with nice tables and chairs, and one on the roof. The crew asks us to go up to the second deck, where we are welcomed with a glass of wine. The Eiffel Tower is just a few hundred meters away, and every hour it blinks for 5 minutes with thousands of small lights. Brilliant :-) We also see the Statue of Liberty (the small one), which stands on a small island in the middle of the river. During the whole night the bar is open with drinks and some appetizers, snacks, ... Such a walking dinner is perfect for talking with many different people. I've chatted with several people who just use BSD; they are, like me, not deep and specialized developers. One was from Switzerland, another from Austria, and another from the Netherlands. But I've also followed a discussion with Theo de Raadt and several people from the FreeBSD Foundation. Some are very technical guys, others just users, like me, but all with the same passion for one of the BSD systems. An amazing evening.

OpenBSD's small steps towards DTrace (a tale about DDB and CTF): On the second day, I decide to sleep long enough to have the resources to drive back home (3 hours by car). So I miss the first presentations and arrive at the venue around 10h30. Lots of people are already present; some faces are less "fresh" than others. I decide to listen to the DTrace in OpenBSD talk. After 10 minutes I am so lost in the very technical explanations that I decide to open my PC and look at it instead. My OpenBSD laptop rarely leaves my home, so I've never needed a screen locking system, but in a crowded environment it is better to have one, so I was looking for a simple solution. I've looked at how to use xlock, combined it with the /etc/apm/suspend script, ... Always very easy to use, OpenBSD :-)

The OpenBSD web stack: Then I decide to follow the presentation of Michael W Lucas, well known for his books such as "Absolute OpenBSD", "Relayd", ... Michael talks about the httpd daemon in OpenBSD, but he also presents its integration with CARP, relayd, PF, FastCGI, and the rules based on Lua-style patterns (as opposed to Perl regexps), ... For sure, he emphasizes the security aspects of those tools: privilege separation, chroot, ...

OpenSMTPD, current state of affairs: Then I follow the presentation of Gilles Chehade about the OpenSMTPD project. An amazing presentation that, on top of the technical challenges, shows how to manage such a project over the years. Gilles has been working on OpenSMTPD since 2007, thus 10 years!!! He explains the different decisions they took to make the software as simple as possible to use, but as secure as possible too: privilege separation, chroot, pledge, randomized malloc, ... Development started on BSD systems, but once the project became quite well known, they received lots of contributions from Linux developers.

Hoisting: lessons learned integrating pledge into 500 programs: After a small break, I decide to listen to Theo de Raadt, the founder of OpenBSD.
In his own style, with trekking boots, shorts and backpack, Theo starts by saying that pledge is the outcome of nightmares. Theo explains that the paper "Hacking Blind", presenting BROP (Blind Return Oriented Programming), has worried him for a few years. That's why he developed pledge, a tool that kills a process as soon as possible when the process shows unforeseen behavior. For example, with pledge a program which can only write to disk will be immediately killed if it tries to reach the network. By implementing pledge in the ~500 programs present in the base system, OpenBSD is becoming more secure and more robust.

Conclusion: My first EuroBSDCon was a great, interesting and cool event. I've talked with several BSD enthusiasts. I've been using OpenBSD since 2010, but I'm not a developer, so I was worried about being "lost" in the middle of experts. In fact, that was not the case: at EuroBSDCon you meet many different types of BSD enthusiasts and users. What is nice about EuroBSDCon is that the organizers take care of everything for you; you just have to sit and listen. They even arrange, in a funny and very cool way, the Saturday evening. The small drawback is that all of this has a cost: in my case the whole weekend cost me a bit more than 500 euro. Given what I've learned and seen, this is a very acceptable price; nearly all the presentations I saw gave me valuable input for my daily job. For sure, the total price is also linked to my personal choices: hotel, parking. And I'm surely biased because I'm used to going to Fosdem in Brussels, which costs nothing (entrance) and is approximately 45 minutes from my home. But Fosdem doesn't have the same atmosphere, and the presentations are less linked to my daily job. I do not regret my trip to EuroBSDCon and will surely plan more.

Beastie Bits

* Important munitions lawyering (https://www.jwz.org/blog/2017/10/important-munitions-lawyering/)
* AsiaBSDCon 2018 CFP is now open, until December 15th (https://2018.asiabsdcon.org/)
* ZSTD Compression for ZFS by Allan Jude (https://www.youtube.com/watch?v=hWnWEitDPlM&feature=share)
* NetBSD on Allwinner SoCs Update (https://blog.netbsd.org/tnf/entry/netbsd_on_allwinner_socs_update)

***

Feedback/Questions

* Tim - Creating Multi Boot USB sticks (http://dpaste.com/0FKTJK3#wrap)
* Nomen - ZFS Questions (http://dpaste.com/1HY5MFB)
* JJ - Questions (http://dpaste.com/3ZGNSK9#wrap)
* Lars - Hardening Diffie-Hellman (http://dpaste.com/3TRXXN4)

***
* ERRATA (as reported by Peter): "The book Peter mentioned (at 46:20) by Stuart Russell, 'Do the Right Thing', was published in 2003, and not recently."

In this session Peter Morgan, CEO of Deep Learning Partnership, sat down with Vishal Kumar, CEO of AnalyticsWeek, and shared his thoughts on Deep Learning, Machine Learning and Artificial Intelligence. They discussed some of the best practices when it comes to picking the right solution and the right vendor, and what some of the key terms mean.

Here's Peter's bio: Peter Morgan is a scientist-entrepreneur who started out in high energy physics, enrolled in the PhD program at the University of Massachusetts at Amherst. After leaving UMass and founding his own company, Peter moved into computer networks, designing, implementing and troubleshooting global IP networks for companies such as Cisco, IBM and BT Labs. After getting an MBA and dabbling in financial trading algorithms, Peter worked for three years on an experiment led by Stanford University to measure the mass of the neutrino. Since 2012 he has been working in Data Science and Deep Learning, founding an AI solutions company in January 2016. As an entrepreneur, Peter has founded companies in the AI, social media, and music industries, and has served on the advisory boards of technology startups. Peter is a popular speaker at conferences, meetups and webinars, and has cofounded and currently organizes meetups in the deep learning space. Peter has business experience in the USA, UK and Europe. Today, as CEO of Deep Learning Partnership, he leads the strategic direction and business development across products and services. This includes sales and marketing, lead generation, client engagement, recruitment, content creation and platform development. Deep learning technologies used include computer vision and natural language processing, with frameworks like TensorFlow, Keras and MXNet. Deep Learning Partnership designs and implements AI solutions for its clients across all business domains.

Interested in sharing your thought leadership with our global listeners? Register your interest @ http://play.analyticsweek.com/guest/
Over the last few years, we have seen a dramatic increase in the use of open source projects as the mainstay of architectures in both startups and enterprises. Many of our customers and partners also run their own open source programs and contribute key technologies to the industry as a whole (see DCS201). At AWS we engage with open source projects in a number of ways. We contribute bug fixes and enhancements to popular projects, including our work with the Hadoop ecosystem (see BDM401), Chromium (see BAP305) and (obviously) Boto. We have our own standalone projects, including the security library s2n (see NET405) and the machine learning project MXNet (see MAC401). We also have services that make open source easier to use, like ECS for Docker (see CON316) and RDS for MySQL and PostgreSQL (see DAT305). In this session you will learn about our existing open source work across AWS and our next steps.
For many companies, recommendation systems solve important machine learning problems. But as recommendation systems grow to millions of users and millions of items, they pose significant challenges when deployed at scale. The user-item matrix can have trillions of entries (or more), most of which are zero, so making common ML techniques practical requires special handling of sparse data. Learn how to use MXNet to build neural network models for recommendation systems that can scale efficiently to large sparse datasets.