We welcome back specialist Samuel Matioli to talk about the Fortune 500's favorite columnar database. Apache Cassandra is the database used by large companies such as Uber, Facebook, Netflix, Instagram, Spotify, and Instacart. In this chat about NoSQL databases we cover the following topics: the growth of NoSQL adoption in the market; the difference between HBase and Apache Cassandra; what Apache Cassandra is; deployment types and usage options; use cases; and which problems Apache Cassandra solves. Apache Cassandra = https://cassandra.apache.org/ Samuel Matioli = https://www.linkedin.com/in/samuelmatioli/ On YouTube we have a Data Engineering channel covering the most important topics in the field, with live streams every Wednesday. https://www.youtube.com/channel/UCnErAicaumKqIo4sanLo7vQ Want to keep up with this field through weekly posts and updates? Follow us on LinkedIn so you don't miss any news. https://www.linkedin.com/in/luanmoreno/ Available on Spotify and Apple Podcasts: https://open.spotify.com/show/5n9mOmAcjra9KbhKYpOMqY https://podcasts.apple.com/br/podcast/engenharia-de-dados-cast/ Luan Moreno = https://www.linkedin.com/in/luanmoreno/
On this episode of the Post Podcast, Thomas More Prep-Marian principal Chad Meitner shares information about the upcoming high school graduation as the end of the year approaches. Transcript: EDITOR'S NOTE: Transcripts are provided by an automated service and are not verified for accuracy. James Bell Thomas More Prep-Marian is getting ready for their graduation. Principal Chad Meitner stops by to share with us some of the details on this episode of the Post Podcast. Chad Meitner It's the end-of-the-year craziness. But yeah, nothing out of the ordinary. We have our last days of school; the 18th of May is our last day at TMP-Marian. But actually the seniors get out May 16, a couple of days early, hopefully a bonus for them. Junior high gets out May 17, one day early. So yeah, it's starting to get to that last week. James Bell Very cool. And we're gonna talk a lot about that. Before we get there, though: last time, or maybe it was two weeks ago, either way, the ACE auction wrapped up. And from what you were telling me right before we went on, it went really well this year, right? Chad Meitner Oh, yeah, it goes well every year, and this year was no exception. And, you know, with a lot of economic factors that we all hear a lot about, there's always a question mark: how's it going? How well will it do this year compared to past years? And, you know, we're still getting in bills and donations, so we don't have final numbers. But it's really looking good. And it could possibly be one of the best ever, which is exciting. James Bell Yeah, I love to hear that. And, as I already mentioned to you just a little bit ago, before we got on air, we were talking to Sarah earlier, and charitable giving right now is kind of a hard thing for people to do. So it's amazing to hear that people stepped up and really helped out the Catholic school during that ACE auction, because that's one of your big fundraisers for the year, right?
Chad Meitner Yes, it's for our operations, for any capital needs. You know, that's how we have to tackle it. And so for people who've always been so generous, near and far, we just continue to say thank you with humble gratitude, because it's a great school, it's a great asset for our community. And it wouldn't be possible without so many people pitching in what they can. You know, some give a little, some give a lot, but it all helps us achieve our mission. James Bell Absolutely. So as we mentioned, we're gonna talk a little bit about the year. We've got, well, gosh, you said, what was it, the 16th? Chad Meitner Yeah, for the seniors, the 16th. Like, one week away. James Bell Yeah. So how has the year gone this year? Chad Meitner It's been a great year. I mean, of course, it's been a more normal year than the last two, because we've just been able to go about our business without anything particularly out of the ordinary. And so this senior class, they were sophomores when they lost their spring, like every other student did, in the spring of 2020. And then they were part of last year, where it was masks for most of the school year. And so this year, it was kind of coming out from all that, with more of a celebratory feel, which is good because teenagers tend to not always be the most positive, optimistic personalities. And so for them to be so grateful and thankful and take advantage of the opportunities they had is refreshing, because we as adults know that you need to be thankful to appreciate those gifts that you're given to make life more enjoyable. So yeah, one week left, and there's a lot to squeeze into that week, but the graduation will be May 22.
So basically a week from this coming Sunday. They have to do finals, and we have award ceremonies, not just for the seniors but for all junior high and underclassmen students as well, coming up this week, and concerts. And of course spring sports don't stop; they're still rolling all the way through till even Memorial Day weekend. So busy, busy time. James Bell You know, I'm curious, and pardon my ignorance; everyone knows I'm not much of a sports fan. I wonder what sports you have down there at TMP during the spring? I know baseball is probably one of them, right? Chad Meitner Basically, we have quite a few for a school our size, which is great for our kids, because you can come to a small school and get engaged in a lot of different ways. And I think that's part of the recipe of our success: the kids almost have to get involved, and that pushes them to better themselves and get out of their comfort zone. But yeah, for sports we've got baseball and softball, and we do have girls soccer. We have boys golf in the spring, girls golf in the fall. We have track and field. What am I missing? I think that was everything. So lots of different activities. In junior high they have track and field, where basically the entire student body goes out. School days when we have junior high track meets are always interesting, because you only have about 30 kids left in school and the rest of them are at the track meet. But again, those keep the kids busy and give them a little reprieve when they can get outside, when the weather cooperates, and get away from the schoolwork a little bit and just be with their friends, working out and being outside. James Bell No, absolutely. Yeah, exactly, and this year it's just been so rough because the weather has been crazy unpredictable. It's always unpredictable out here, but this year it's been even a little bit more so.
Chad Meitner I feel like the wind, even for Kansas, right, has been a little above average for us this time of year. Early on it was really windy but dry, but it was still so windy that we had to postpone a lot of events. And then we did start to get some rain, which was much needed, and we're thankful for that. But that rain did also postpone some events. So we're trying to squeeze them in as much as we can these last weeks, as if they weren't busy enough. But it's good. What helps us wrap up this school year in style is that everybody can have these opportunities to show what they can do and have a good finish to the school year. And this graduating class, the class of 2022: every year it's interesting, because each class has its own personality. Well, for this class, the word that pops to mind is just work ethic, hard work. These kids work extremely hard. And, you know, I look at just the academics as one metric of that: just under half of the students in this class have a 96% or higher GPA. That's the equivalent of a 4.0. So when you almost have half of your class get a 4.0, that tells you a lot about how hard they work, because academics is not all about how smart you are. It's about being organized. It's about good parenting. It's about our curriculum at TMP-Marian. And it's about how hard these kids work. So I'm excited to celebrate the class of 2022 because of a lot of things, and their willingness to work hard is one of them. James Bell Yeah, that's incredible. And especially considering what you mentioned a little bit ago, everything that class had to deal with over the last four years, to overcome that, it's incredible. I can't even imagine how proud everybody is of that group. Chad Meitner Yeah. And if you think, well, maybe there's a big drop-off: the very worst GPA is 82%, which comes out to over 3.0.
So, you know, you could argue maybe grade inflation, but these are the same teachers we've had for quite a few years. These teachers care about kids, but they're not going to let kids just pass on by with gaps in their understanding or gaps in their study skills. So, like I said, it's a combination of the curriculum, good parenting, and then just great work ethic. And it's fun to see them have success, and we hope that translates into success after high school. James Bell Absolutely. You know, and for those kiddos that are wrapping up the year and have got to come back, I imagine this time of year they're thinking about all of those things that they accomplished and got out of their way, but also thinking about that break, and then what comes up next year as well. Chad Meitner Yeah, everyone thinks about the break. And our theme this year has been "finish strong," particularly for the seniors, because they're the ones that get the senioritis, and, you know, they've got the most to feel like celebrating. But everybody, too. It's like, okay, we've got a week, but that doesn't mean we're done. And, you know, a lot of times you get students who say, well, gosh, there's only like three days left. Can we just not have school? We're not doing as much as we usually do. And it's like, well, there's got to be a last day at some point; no matter when that date is, you're going to come up to it thinking, can't we just finish? So you've got to get to the finish line. And, you know, finishing strong is what you will be more proud of a couple of weeks down the road than if you just skip the last couple of days and don't do your work, and then you spend the whole summer looking at how you dropped the ball there at the end. And we don't want that; we want them to enjoy their break. And how do you do that? You finish strong, so that you can really enjoy the break and be proud of what you've accomplished. James Bell Absolutely.
Again, do you want to share those dates? The graduation's coming up. I've already forgotten. Chad Meitner Yeah, that's May 22. It's Sunday, May 22. The Baccalaureate Mass, which is a big part of our graduation ceremony, is at Immaculate Heart of Mary Church, and that is at 2pm on the 22nd. And then we have graduation in the Fieldhouse at 4pm, two hours later. So everyone is of course invited: if you have a graduate or know the families of a graduate, come to the Fieldhouse and celebrate with us. And there are parties all weekend: Friday, Saturday and Sunday of that weekend. And then we'll have Project Graduation, which is our safe party site for our seniors, that night of graduation, the 22nd. And we'll have activities and games. They're going to be over at the Fort; big shout-out to the Fort for helping us host our seniors on that night. And we'll keep them busy all the way till two o'clock in the morning. And after that weekend, even an 18-year-old will be exhausted. I pretty much guarantee that. James Bell Absolutely. Well, any other last thoughts or anything else you wanna hit on before we go? Chad Meitner No, just, you know, again, it's such a privilege to be in this community that supports Catholic education and education in general. And we're so excited to be finishing up the year. But please come out, if you can, to an awards night. We have the band concerts, the choir sings, and then we have awards after that. So it's a nice evening of just celebrating all the students' accomplishments, from academics to faith formation to leadership, sports, all those things. It's a great time of year. Tiring, but it's worth celebrating. These kids have done amazing things and we all should be proud of them.
Dhruba Borthakur is CTO at Rockset and a passionate data engineer. Before co-founding Rockset he played a big role in the development of Hadoop HDFS at Yahoo as well as HBase and RocksDB at Facebook. His current project is the serverless Rockset platform, where you can gain real-time analytics insights into your data. I tried it out before our talk and really liked it.
This is an EXTRA HOUR of the John & Heidi Show, featuring Dan Farris each weekday morning from about 6:35 to about 7:40. These breaks are only heard on the flagship station for The John And Heidi Show, Sunny 93.3 FM in Sioux Falls, SD (and now SiouxFallsNewsRadio.com). Part of the show is a visit from Ranger Dan, for Ranger Dan's Critter Corner. This is a fun program that features the most reckless but well-meaning park ranger, who SAYS he works for the state but cannot verify any of that! The rest of this show is syndicated on over 285 stations. To hear these breaks, you can listen to Sunny 93.3 from 5am to 10am or hear the podcast version at JohnAndHeidiShow.com
Matt Yonkovit, The HOSS at Percona, sits down with Nagavamsi (Vamsi) Ponnekanti, Software Engineer at Quora. During the show we dive into the details on how and why Quora moved from HBase to RocksDB via MyRocks. We also learn about the need to reduce latency and improve predictability in performance in large infrastructure and database systems. Vamsi highlights some of his favorite features and tools and gives us tips and tricks on database migrations.
About Karthik
Karthik was one of the original database engineers at Facebook responsible for building distributed databases including Cassandra and HBase. He is an Apache HBase committer, and also an early contributor to Cassandra, before it was open-sourced by Facebook. He is currently the co-founder and CTO of the company behind YugabyteDB, a fully open-source distributed SQL database for building cloud-native and geo-distributed applications.
Links: Yugabyte community Slack channel: https://yugabyte-db.slack.com/ Distributed SQL Summit: https://distributedsql.org Twitter: https://twitter.com/YugaByte
Transcript
Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.
Corey: You could go ahead and build your own coding and mapping notification system, but it takes time, and it sucks! Alternately, consider Courier, who is sponsoring this episode. They make it easy. You can call a single send API for all of your notifications and channels. You can control the complexity around routing, retries, and deliverability and simplify your notification sequences with automation rules. Visit courier.com today and get started for free. If you wind up talking to them, tell them I sent you and watch them wince, because everyone does when you bring up my name. That's the glorious part of being me. Once again, you could build your own notification system, but why on god's flat earth would you do that?
Corey: This episode is sponsored in part by “you”-gabyte. Distributed technologies like Kubernetes are great, citation very much needed, because they make it easier to have resilient, scalable systems.
SQL databases haven't kept pace though, certainly not like NoSQL databases have, like Route 53, the world's greatest database. We're still, other than that, using legacy monolithic databases that require ever-growing instances of compute. Sometimes we'll try and bolt them together to make them more resilient and scalable, but let's be honest, it never works out well. Consider YugabyteDB: it's a distributed SQL database that solves basically all of this. It is 100% open source, and there's no asterisk next to the “open” on that one. And it's designed to be resilient and scalable out of the box so you don't have to charge yourself to death. It's compatible with PostgreSQL, or “postgresqueal” as I insist on pronouncing it, so you can use it right away without having to learn a new language and refactor everything. And you can distribute it wherever your applications take you, from across availability zones to other regions or even other cloud providers should one of those happen to exist. Go to yugabyte.com, that's Y-U-G-A-B-Y-T-E dot com, and try their free beta of Yugabyte Cloud, where they host and manage it for you. Or see what the open source project looks like: it's effortless distributed SQL for global apps. My thanks to Yu-gabyte for sponsoring this episode.
Corey: Welcome to Screaming in the Cloud. I'm Corey Quinn. Today's promoted episode comes from the place where a lot of my episodes do: I loudly and stridently insist that Route 53, or DNS in general, is the world's greatest database, and then what happens is a whole bunch of people who work at database companies get upset with what I've said. Now, please don't misunderstand me; they're wrong, but I'm thrilled to have them come on and demonstrate that, which is what's happening today. My guest is CTO and co-founder of Yugabyte. Karthik Ranganathan, thank you so much for spending the time to speak with me today. How are you?
Karthik: I'm doing great. Thanks for having me, Corey.
We'll just go for YugabyteDB being the second-best database. Let's just keep the first [crosstalk 00:01:13]
Corey: Okay. We're all fighting for number two, there. And besides, number two tries harder. It's like that whole branding thing from years past. So, you were one of the original database engineers at Facebook, responsible for building a bunch of nonsense, like Cassandra and HBase. You were an HBase committer, early contributor to Cassandra, even before it was open-sourced.
And then you looked around and said, “All right, I'm going to go start a company,” roughly around 2016, if memory serves, “and I'm going to go and build a database and bring it to the world.” Let's start at the beginning. Why on God's flat earth do we need another database?
Karthik: Yeah, that's the question. That's the million-dollar question, isn't it, Corey? So, this is one, fortunately, that we've had to answer so many times since 2016 that I guess we've gotten a little good at it. So, here's the learning that a lot of us had from Facebook: we were the original team, like, all three of us founders, we met at Facebook, and we not only built databases, we also ran them. And let me paint a picture.
Back in 2007, the public cloud really wasn't very common, and people were just going into multi-region, multi-datacenter deployments, and Facebook was just starting to take off, to really scale. Now, fast-forward to 2013 (I was there through the entire journey): a number of things happened in Facebook. We saw the rise of the equivalent of Kubernetes, which was internally built; we saw, for example, microservice-
Corey: Yeah, the Tupperware equivalent, there.
Karthik: Tupperware, exactly. You know the name. Yeah, exactly. And we saw how we went from two data centers to multiple data centers, and nearby and faraway data centers, what you know today as zones and regions, and a number of such technologies came up.
And I was on the database side, and we saw how existing databases wouldn't work to distribute data across nodes, failover, et cetera, et cetera.
So, we had to build a new class of databases, what we now know as NoSQL. Now, back in Facebook, I mean, the typical difference between Facebook and an enterprise at large is Facebook has a few really massive applications. For example, you do a set of interactions, you view profiles, you add friends, you talk with them, et cetera, right? These are supermassive in their usage, but they were very few in their access patterns. At Facebook, we were mostly interested in dealing with scale and availability.
Existing databases couldn't do it, so we built NoSQL. Now, fast-forward a number of years, and I can't tell you how many times I've had conversations with other people building applications who will say, “Hey, can I get a secondary index on the SQL database?” Or, “How about that transaction? I only need it a couple of times; I don't need it all the time, but could you, for example, do multi-row transactions?” And the answer was always, “No,” because it was never built for that.
So today, what we're seeing is that transactional data and transactional applications are all going cloud-native, and they all need to deal with scale and availability. And so the existing databases don't quite cut it. So, the simple answer to why we need it is we need a relational database that can run in the cloud to satisfy just three properties: it needs to be highly available, failures or no, upgrades or no, it needs to be available; it needs to scale on demand, so simply add or remove nodes and scale up or down; and it needs to be able to replicate data across zones, across regions, and a variety of different topologies. So availability, scale, and geographic distribution, along with retaining most of the RDBMS features, the SQL features.
That's really the gap we're trying to solve.

Corey: I don't know that I've ever told this story on the podcast, but I want to say it was back in 2009. I flew up to Palo Alto and interviewed at Facebook, and it was a different time, a different era; it turns out that I'm not as good on the whiteboard as I am at running my mouth, so all right, I did not receive an offer, but I think everyone can agree at this point that was for the best. But I saw one of the most impressive things I've ever seen, during a part of that interview process. My interview was scheduled for a conference room for, it must have been, 11 o'clock or something like that, and at 10:59, they're looking at their watch, like, “Hang on ten seconds.” And then the person I was with reached out to knock on the door to let the person know that their meeting was over and the door opened.

So, it's very clear that even in large companies, which Facebook very much was at the time, people had synchronized clocks. This seems to be a thing, as I've learned from reading the parts that I could understand of the Google Spanner paper: when you're doing distributed databases, clocks are super important. At places like Facebook, that is, I'm not going to say it's easy, let's be clear here. Nothing is easy, particularly at scale, but Facebook has advantages in that they can mandate how clocks are going to be handled throughout every piece of their infrastructure. You're building an open-source database and you can't guarantee in what environment and on what hardware that's going to run, and, “You must have an atomic clock hooked up,” is not something you're generally allowed to tell people. How do you get around that?

Karthik: That's a great question. Very insightful, cutting right to the chase. So, the reality is, we cannot rely on atomic clocks, we cannot mandate our users to use them, or, you know, we'd not be very popularly used in a variety of different deployments.
In fact, we also work in on-prem private clouds and hybrid deployments where you really cannot get these atomic clocks. So, the way we do this is we come up with other algorithms to make sure that we're able to get the clocks as synchronized as we can.

So, think about it at a higher level; the reason Google uses atomic clocks is to make sure that they can wait to make sure every other machine is synchronized with them, and the wait time is about seven milliseconds. So, the atomic clock service, or the TrueTime service, says no two machines are farther apart than about seven milliseconds. So, you just wait for seven milliseconds, you know everybody else has caught up with you. And the reason you need this is you don't want to write on a machine, you don't want to write some data, and then go to a machine that has a future or an older time and get inconsistent results. So, just by waiting seven milliseconds, they can ensure that no one is going to be older and therefore serve an older version of the data, so every write that was made on another machine is seen.

Now, the way we do this is we only have NTP, the Network Time Protocol, which does synchronization of time across machines, except it takes 150 to 200 milliseconds. Now, we wouldn't be a very good database, if we said, “Look, every operation is going to take 150 milliseconds.” So, within these 150 milliseconds, we actually do the synchronization in software. So, we replaced the notion of an atomic clock with what is called a hybrid logical clock.
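Editor's note: the hybrid logical clock named here, which Karthik goes on to describe as physical time plus a counter exchanged over RPCs, might look roughly like the Python sketch below. It follows the general hybrid logical clock algorithm as commonly published and is not taken from YugabyteDB's code; the names `Timestamp`, `HybridLogicalClock`, `now`, and `observe` are hypothetical, chosen only for illustration.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True, order=True)
class Timestamp:
    physical: int  # wall-clock part, e.g. microseconds from an NTP-synced clock
    logical: int   # counter that breaks ties when physical time has not advanced


class HybridLogicalClock:
    """Illustrative hybrid logical clock (hypothetical names, not YugabyteDB's API)."""

    def __init__(self, wall_clock: Callable[[], int] = lambda: time.time_ns() // 1000):
        self._wall_clock = wall_clock
        self._last = Timestamp(0, 0)

    def now(self) -> Timestamp:
        """Timestamp a local event or an outgoing RPC."""
        wall = self._wall_clock()
        if wall > self._last.physical:
            # Physical time moved forward: adopt it and reset the counter.
            self._last = Timestamp(wall, 0)
        else:
            # Physical time stalled or went backwards: bump the counter instead.
            self._last = Timestamp(self._last.physical, self._last.logical + 1)
        return self._last

    def observe(self, remote: Timestamp) -> Timestamp:
        """Merge the timestamp carried on an incoming RPC into the local clock."""
        wall = self._wall_clock()
        physical = max(wall, self._last.physical, remote.physical)
        if physical == self._last.physical and physical == remote.physical:
            logical = max(self._last.logical, remote.logical) + 1
        elif physical == self._last.physical:
            logical = self._last.logical + 1
        elif physical == remote.physical:
            logical = remote.logical + 1
        else:
            # The local wall clock is ahead of everything seen so far.
            logical = 0
        self._last = Timestamp(physical, logical)
        return self._last
```

In this sketch, a node whose wall clock lags still produces timestamps that order after any message it has received, because `observe` adopts the larger physical component and advances the counter. That is the sense in which routine RPC traffic can keep nodes' notion of time converging without a fixed wait.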
So, one part using NTP and physical time, and another part using counters and logical time and keep exchanging RPCs—which are needed in the course of the database functioning anyway—to make sure we start normalizing time very quickly.

This in fact has some advantages—and disadvantages, everything has trade-offs—but the advantage it has over a TrueTime-style deployment is you don't even have to wait that seven milliseconds in a number of scenarios, you can just instantly respond. So, that means you get even lower latencies in some cases. Of course, the trade-off is there are other cases where you have to do more work, and therefore more latency.

Corey: The idea absolutely makes sense. You started this as an open-source project, and it's thriving. Who's using it and for what purposes?

Karthik: Okay, so one of the fundamental tenets of building this database—I think back to your question of why does the world need another database—is that the hypothesis is not so much the world needs another database API; that's really what users complain against, right? You create a new API and—even if it's SQL—and you tell people, “Look. Here's a new database. It does everything for you,” it'll take them two years to figure out what the hell it does, and build an app, and then put it in production, and then they'll build a second and a third, and then by the time they hit the tenth app, they find out, “Okay, this database cannot do the following things.” But you're five years in; you're stuck, you can only add another database.

That's really the story of how NoSQL evolved. And it wasn't built as a general-purpose database, right? So, in the meanwhile, databases like Postgres, for example, have been around for so long that they absorb and have such a large ecosystem, and usage, and people who know how to use Postgres and so on.
So, we made the decision that we're going to keep the database API compatible with known things, so people really know how to use them from the get-go and enhance it at a lower level to make it cloud-native. So, what does YugabyteDB do for people?

It is the same as Postgres, with Postgres features in the upper half—it reuses the code—but it is built on the lower half to be shared-nothing, scalable, resilient, and geographically distributed. So, using the public cloud managed database context, the upper half is built like Amazon Aurora, the lower half is built like Google Spanner. Now, when you think about workloads that can benefit from this, we're a transactional database that can serve user-facing applications and real-time applications that have lower latency. So, the best way to think about it is, people that are building transactional applications on top of, say, a database like Postgres, but the application itself is cloud-native. You'd have to do a lot of work to make this Postgres piece be highly available, and scalable, and replicate data, and so on in the cloud.

Well, with YugabyteDB, we've done all that work for you and it's as open-source as Postgres, so if you're building a cloud-native app on Postgres that's user-facing or transactional, YugabyteDB takes care of making the database layer behave like Postgres but become cloud-native.

Corey: Do you find that your users are using the same database instance, for lack of a better term? I know that instance is sort of a nebulous term; we're talking about something that's distributed. But are they having database instances that span multiple cloud providers, or is that something that is more talk than you're actually seeing in the wild?

Karthik: So, I'd probably replace the word ‘instance' with ‘cluster', just for clarity, right?

Corey: Excellent. Okay.

Karthik: So, a cluster has a bunch—

Corey: I concede the point, absolutely.

Karthik: Okay. [laugh]. Okay.
So, we'll still keep Route 53 on top, though, so it's good. [laugh].

Corey: At that point, the replication strategy is called a zone transfer, but that's neither here nor there. Please, by all means, continue.

Karthik: [laugh]. Okay. So, a cluster database like YugabyteDB has a number of instances. Now, I think the question is, is it theoretical or real? What we're seeing is, it is real, and it is real perhaps in slightly different ways than people imagine it to be.

So, I'll explain what I mean by that. Now, there's one notion of being multi-cloud where you can imagine there's like, say, the same cluster that spans multiple different clouds, and you have your data being written in one cloud and being read from another. This is not a common pattern, although we have had one or two deployments that are attempting to do this. Now, a second deployment shifted once over from there is where you have your multiple instances in a single public cloud, and a bunch of other instances in a private cloud. So, it stretches the database across public and private—you would call this a hybrid deployment topology—that is more common.

So, one of the unique things about YugabyteDB is we support asynchronous replication of data, just like your RDBMSs do, the traditional RDBMSs. In fact, we're the only one that straddles both synchronous replication of data as well as asynchronous replication of data. We do both. So, once shifted over would be a cluster that's deployed in one of the clouds but an asynchronous replica of the data going to another cloud, and so you can keep your reads and writes—even though they're a little stale, you can serve it from a different cloud. And then once again, you can make it an on-prem private cloud, and another public cloud.

And we see all of those deployments, those are massively common.
And then the last one over would be the same instance of an app, or perhaps even different applications, some of them running on one public cloud and some of them running on a different public cloud, and you want the same database underneath to have characteristics of scale and failover. Like for example, if you built an app on Spanner, what would you do if you went to Amazon and wanted to run it for a different set of users?

Corey: That is part of the reason I tend to avoid the idea of picking a database that does not have at least a theoretical exit path because reimagining your entire application's data model in order to migrate is not going to happen, so—

Karthik: Exactly.

Corey: —come hell or high water, you're stuck with something like that where it lives. So, even though I'm a big proponent as a best practice—and again, there are exceptions where this does not make sense, but as a general piece of guidance—I always suggest, pick a provider—I don't care which one—and go all-in. But that also should be shaded with the nuance of, but also, at least have an eye toward theoretically, if you had to leave, consider whether there's a viable alternative. And in some cases in the early days of Spanner, there really wasn't. So, if you needed that functionality, okay, go ahead and use it, but understand the trade-off you're making.

Now, this really comes down to, from my perspective, understand the trade-offs. But the reason I'm interested in your perspective on this is because you are providing an open-source database to people who are actually doing things in the wild. There's not much agenda there, in the same way, among a user community of people reporting what they're doing. So, you have in many ways, one of the least biased perspectives on the entire enterprise.

Karthik: Oh, yeah, absolutely. And like I said, I started from the least common to the most common; maybe I should have gone the other way.
But we absolutely see people that want to run the same application stack in multiple different clouds for a variety of reasons.

Corey: Oh, if you're a SaaS vendor, for example, it's, “Oh, we're only in this one cloud,” potential customers who are in other clouds say, “Well, if that changes, we'll give you money.” “Oh, money. Did you say ‘other cloud?' I thought you said something completely different. Here you go.” Yeah, you've got to at some point. But the core of what you do, beyond what it takes to get that application present somewhere else, you usually keep in your primary cloud provider.

Karthik: Exactly. Yep, exactly. Crazy things sometimes dictate or have to dictate architectural decisions. For example, you're seeing the rise of compliance. Different countries have different regulatory reasons to say, “Keep my data local,” or, “Keep some subset of data local.”

And you simply may not find the right cloud providers present in those countries; you may be a PaaS or an API provider that's helping other people build applications, and the applications that the API provider's customers are running could be across different clouds. And so they would want the data local, otherwise, the transfer costs would be really high. So, a number of reasons dictate—or like a large company may acquire another company that was operating in yet another cloud; everything else is great, but they're in another cloud; they're not going to say, “No because you're operating on another cloud.” It still does what they want, but they still need to be able to have a common base of expertise for their app builders, and so on. So, a number of things dictate why people started looking at cross-cloud databases with common performance and operational characteristics and security characteristics, but don't compromise on the feature set, right?

That's starting to become super important, from our perspective.
I think what's most important is the ability to run the database with ease while not compromising on your developer agility or the ability to build your application. That's the most important thing.

Corey: When you founded the company back in 2016, you are VC-backed, so I imagine your investor pitch meetings must have been something a little bit surreal. They ask hard questions such as, “Why do you think that in 2016, starting a company to go and sell databases to people is a viable business model?” At which point you obviously corrected them and said, “Oh, you misunderstand. We're building an open-source database. We're not charging for it; we're giving it away.”

And they apparently said, “Oh, that's more like it.” And then invested, as of the time of this recording, over $100 million in your company. Let me be the first to say there are aspects of money that I don't fully understand and this is one of those. But what is the plan here? How do you wind up building a business case around effectively giving something away for free?

And I want to be clear here, Yugabyte is open-source, and I don't have an asterisk next to that. It is not one of those ‘source available' licenses, or ‘anyone can do anything they want with it except Amazon' or ‘you're not allowed to host it and offer it as a paid service to other people.' So, how do you have a business, I guess is really my question here?

Karthik: You're right, Corey. We're 100% open-source under Apache 2.0—I mean the database. So, our theory on day one—I mean, of course, this was a hard question and people did ask us this, and then I'll take you guys back to 2016. It was unclear, even as of 2016, if open-source companies were going to succeed. It was just unclear.

And people were like, “Hey, look at Snowflake; it's a completely managed service. They're not open-source; they're doing a great job. Do you really need open-source to succeed?” There were a lot of such questions.
And every company, every project, every space has to follow its own path, just applying learnings.

Like for example, Red Hat was open-source and that really succeeded, but there's a number of others that may or may not have succeeded. So, our plan back then was to tread the waters carefully in the sense we really had to make sure open-source was the business model we wanted to go for. So, under the advisement from our VCs, we said we'd take it slowly; we want to open-source on day one. We talked to a number of our users and customers to make sure that was indeed the path we wanted to go. The conversations pretty clearly told us people wanted an open database that was very easy for them to understand because if they are trusting their crown jewels, their most critical data, their systems of record—this is what the business depends on—into a database, they sure as hell want to have some control over it and some transparency as to what goes on, what's planned, what's on the roadmap. “Look, if you don't have time, I will hire my people to go build for it.” They wanted to be able to invest in the database.

So, open-source was absolutely non-negotiable for us. We tried the traditional technique for a couple of years of keeping a small portion of the features of the database itself closed, so it's what you'd call ‘open core.' But on day one, we were pretty clear that the world was headed towards DBaaS—Database as a Service—and make it really easy to consume.

Corey: It leads to bad patterns as well, like, “Oh, if you want security, that's a paid feature.”

Karthik: Exactly.

Corey: No. That is not optional. And the list then of what you can wind up adding as paid versus not gets murky, and you're effectively fighting your community when they try and merge some of those features in and it just turns into a mess.

Karthik: Exactly. So, it did for us for a couple of years, and then we said, “Look, we're not doing this nonsense.
We're just going to make everything open and just make it simple.” Because our promise to the users was, we're building everything that looks like Postgres, so it's as valuable as Postgres, and it'll work in the cloud. And people said, “Look, Postgres is completely open and you guys are keeping a few features not open. What gives?”

And so after that, we had to concede the point and just do that. But one of the other founding pieces of the company, the business side, was that DBaaS and ability to consume the database is actually far more critical than whether the database itself is open-source or not. I would compare this to, for example, MySQL and Postgres being completely open-source, but you know, Amazon's Aurora being actually a big business, and similarly, it happens all over the place. So, it is really the ability to consume and run business-critical workloads that seem to be more important for our customers and enterprises that paid us. So, the day-one thesis was, look, the world is headed towards DBaaS.

We saw that already happen inside Facebook; everybody was automated operations, simplified operations, and so on. But the reality is, we're a startup, we're a new database, no one's going to trust everything to us: the database, the operations, the data, “Hey, why don't we put it on this tiny company. And oh, it's just my most business-critical data, so what could go wrong?” So, we said we're going to build a version of our DBaaS that is in software. So, we call this Yugabyte Platform, and it actually understands public clouds: it can spin up machines, it can completely orchestrate software installs, rolling upgrades, turnkey encryption, alerting, the whole nine yards.

That's a completely different offering from the database. It's not the database, it's just on top of the database and helps you run your own private cloud.
So, effectively if you install it on your Amazon account or your Google account, it will convert it into what looks like a DynamoDB, or a Spanner, or what have you, with YugabyteDB as the database inside. So, that is our commercial product; that's source available and that's what we charge for. The database itself, completely open.

Again, the other piece of the thinking is, if we ever charge too much, our customers have the option to say, “Look, I don't want your DBaaS thing; I'm going to the open-source database and we're fine with that.” So, we really want to charge for value. And obviously, we have a completely managed version of our database as well. So, we reuse this platform for our managed version, so you can kind of think of it as portability, not just of the database but also of the control plane, the DBaaS plane.

They can run it themselves, we can run it for them, they could take it to a different cloud, so on and so forth.

Corey: I like that monetization model a lot better than a couple of others. I mean, let's be clear here, you've spent a lot of time developing some of these concepts for the industry when you were at Facebook. And because at Facebook, the other monetization models are kind of terrifying, like, “Okay. We're going to just monetize the data you store in the open-source database,” is terrifying. Only slightly less would be the Google approach of, “Ah, every time you wind up running a SQL query, we're going to insert ads.”

So, I like the model of being able to offer features that only folks who already have expensive problems with money to burn on those problems to solve them will gravitate towards. You're not disadvantaging the community or the small startup who wants it but can't afford it. I like that model.

Karthik: Actually, the funny thing is, we are seeing a lot of startups also consume our product a lot. And the reason is because we only charge for the value we bring.
Typically the problems that a startup faces are actually much simpler than the complex requirements of an enterprise at scale. They are different. So, the value is also proportional to what they want and how much they want to consume, and that takes care of itself.

So, for us, we see that startups, equally so as enterprises, have only limited amount of bandwidth. They don't really want to spend time on operationalizing the database, especially if they have an out to say, “Look, tomorrow, this gets expensive; I can actually put in the time and money to move out and go run this myself. Why don't I just get started because the budget seems fine, and I couldn't have done it better myself anyway because I'd have to put people on it and that's more expensive at this point.” So, it doesn't change the fundamentals of the model; I just want to point out, both sides are actually gravitating to this model.

Corey: This episode is sponsored in part by our friends at Jellyfish. So, you're sitting in front of your office chair, bleary eyed, parked in front of a PowerPoint and—oh my sweet feathery Jesus, it's the night before the board meeting, because of course it is! As you slot that crappy screenshot of traffic light colored Excel tables into your deck, or sift through endless spreadsheets looking for just the right data set, have you ever wondered, why is it that sales and marketing get all this shiny, awesome analytics and insight tools? Whereas, engineering basically gets left with the dregs. Well, the founders of Jellyfish certainly did. That's why they created the Jellyfish Engineering Management Platform, but don't you dare call it JEMP! Designed to make it simple to analyze your engineering organization, Jellyfish ingests signals from your tech stack. Including JIRA, Git, and collaborative tools. Yes, depressing to think of those things as your tech stack but this is 2021.
They use that to create a model that accurately reflects just how the breakdown of engineering work aligns with your wider business objectives. In other words, it translates from code into spreadsheet. When you have to explain what you're doing from an engineering perspective to people whose primary IDE is Microsoft PowerPoint, consider Jellyfish. That's Jellyfish.co and tell them Corey sent you! Watch for the wince, that's my favorite part.

Corey: A number of different surveys have come out that say overwhelmingly companies prefer open-source databases, and this is waved around as a banner of victory by a lot of—well, let's be honest—open-source database companies. I posit that is in fact crap and also bad data because the open-source purists—of which I admit, I used to be one, and now I solve business problems instead—believe that people are talking about freedom, and choice, and the rest. In practice, in my experience, what people are really distilling that down to is they don't want a commercial database. And it's not even about they're not willing to pay money for it, but they don't want to have a per-core licensing challenge, or even having to track licensing of where it is installed and how, and wind up having to cut checks for folks. For example, I'm going to dunk on someone because why not?

Azure for a while has had this campaign that it is five times cheaper to run some Microsoft SQL workloads in Azure than it is on AWS as if this was some magic engineering feat of strength or something. It's absolutely not, it's that it is really expensive licensing-wise to run it on things that aren't Azure. And that doesn't make customers feel good. That's the thing they want to get away from, and what open-source license it is, and in many cases, until the source-available stuff starts trending towards, “Oh, you're going to pay us or you're not going to run it at all,” that scares the living hell out of people, then they don't actually care about it being open.
So, at the risk of alienating, I'm sure, some of the more vocal parts of your constituency, where do you fall on that?

Karthik: We are completely open, but for a few reasons right? Like, multiple different reasons. The debate of whether it purely is open or is completely permissible, to me, I tend to think a little more where people care about the openness more so than just the ability to consume at will without worrying about the license, but for a few different reasons, and it depends on which segment of the market you look at. If you're talking about small and medium businesses and startups, you're absolutely right; it doesn't matter. But if you're looking at larger companies, they actually care that, like for example, if they want a feature, they are able to control their destiny because you don't want to be half-wedded to a database that cannot solve everything, especially when the time pressure comes or you need to do something.

So, you want to be able to control or to influence the roadmap of the project. You want to know how the product is built—the good and the bad—you want a lot of people testing the product and their feedback to come out in the open, so you at least know what's wrong. Many times people often feel like, “Hey, my product doesn't work in these areas,” is actually a bad thing. It's actually a good thing because at least those people won't try it and [laugh] they'll be safe. Customer satisfaction is more important than just the apparent whatever it is that you want to project about the product.

At least that's what I've learned in all these years working with databases. But there's a number of reasons why open-source is actually good.
There's also a very subtle reason that people may not understand which is that legal teams—engineering teams that want to build products don't want to get caught up in a legal review that takes many months to really make sure, look, this may be a unique version of a license, but it's not a license the legal team has seen before, and there's going to be a back and forth for many months, and it's just going to derail their product and their timelines, not because the database didn't do its job or because the team wasn't ready, but because the company doesn't know what the risk it'll face in the future is. There's a number of these aspects where open-source starts to matter for real. I'm not a purist, I would say.

I'm a pragmatist, and I have always been, but I would say that a number of reasons why–you know, I might be sounding like a purist, but a number of reasons why a true open-source is actually useful, right? And at the end of the day, if we have already established, at least at Yugabyte, we're pretty clear about that, the value is in the consumption and is not in the tech if we're pretty clear about that. Because if you want to run a tier-two workload or a hobbyist app at home, would you want to pay for a database? Probably not. I just want to do something for a while and then shut it down and go do my thing. I don't care if the database is commercial or open-source. In that case, being open-source doesn't really take away. But if you're a large company betting, it does take away. So.

Corey: Oh, it goes beyond that because it's not even, in the large company story, whether it costs money because regardless, I assure you, open-source is not free; the most expensive thing that we see in all of our customer accounts—again, our consultancy fixes AWS bills, an expensive problem that hits everyone—the environment in AWS is always less expensive than the people who are working on the environment.
Payroll is an expense that dwarfs the AWS bill for anyone that is not a tiny startup that is still not paying a market-rate salary to its founders. It doesn't work that way. And the idea, for those folks is, not about the money, it's about the predictability. And if there's a 5x price hike from their database manager that suddenly completely disrupts their unit economic model, they're in trouble. That's the value of open-source in that it can go anywhere. It's a form of not being locked into any vendor where it's hosted, as well as, now, no one company that has put it out there into the world.

Karthik: Yeah, and the source-available license, we considered that also. The reason to vote against that was you can get into scenarios where the company gets competitive with its open-source side where the open-source wants a couple other features to really make it work for their own use case, like you know, case in point is the startup, but the company wants to hold those features for the commercial side, and now the startup has that 5x price jump anyway. So, at this point, it comes to a head-on where the company—the startup—is being charged not for value, but because of the monetization model or the business model. So, we said, “You know what? The best way to do this is to truly compete against open-source. If someone wants to operationalize the database, great. But we've already done it for you.” If you think that you can operationalize it at a lower cost than what we've done, great. That's fine.

Corey: I have to ask, there has to have been a question somewhere along the way, during the investment process of, what if AWS moves into your market? And I can already say part of the problem with that line of reasoning is, okay, let's assume that AWS turns Yugabyte into a managed database offering.
First, they're not going to be able to articulate for crap why you should use that over anything else because they tend to mumble when it comes time to explain what it is that they do. But it has to be perceived as a competitive threat. How do you think about that?

Karthik: Yeah, this absolutely came up quite a bit. And like I said, in 2016, this wasn't news back then; this is something that was happening in the world already. So, I'll give you a couple of different points of view on this. The reason why AWS got so successful in building a cloud is not because they wanted to get into the database space; they simply wanted their cloud to be super successful and required value-added services like these databases. Now, every time a new technology shift happens, it gives some set of people an unfair advantage.

In this case, database vendors probably didn't recognize how important the cloud was and how important it was to build a first-class experience on the cloud on day one, as the cloud came up because it wasn't proven, and they had twenty other things to do, and it's rightfully so. Now, AWS comes up, and they're trying to prove a point that the cloud is really useful and absolutely valuable for their customers, and so they start putting value-added services, and now suddenly you're in this open-source battle. At least that's how I would view that it kind of developed. With Yugabyte, obviously, the cloud's already here; we know on day one, so we're kind of putting out our managed service so we'll be as good as AWS or better. The database has its value, but the managed service has its own value, and so we'd want to make sure we provide at least as much value as AWS, but on any cloud, anywhere.

So, that's the other part. And we also talked about the mobility of the DBaaS itself, the moving it to your private account and running the same thing, as well as for public.
So, these are some of the things that we have built that we believe makes us super valuable.

Corey: It's a better approach than a lot of your predecessor companies who decided, “Oh, well, we built the thing; obviously, we're going to be the best at running it. The end.” Because they dramatically sold AWS's operational excellence short. And it turns out, they're very good at running things at scale. So, that's a challenging thing to beat them on.

And even if you're able to, it's hard to differentiate among the differences because at that caliber of operational rigor, it's one of those, you can only tell in the very niche cases; it's a hard thing to differentiate on. I like your approach a lot better. Before we go, I have one last question for you, and normally, it's one of those positive uplifting ones of what workloads are best for Yugabyte, but I think that's boring; let's be more cynical and negative. What workloads would run like absolute crap on YugabyteDB?

Karthik: [laugh]. Okay, we do have a thing for this because we don't want to take on workloads and, you know, have everybody end up with a bad experience. So, we're a transactional database built for user-facing applications, real-time, and so on, right? We're not good at warehousing and analytic workloads. So, for example, if you were using a Snowflake or a Redshift, those workloads are not going to work very well on top of Yugabyte.

Now, we do work with other external systems like Spark, and Presto, which are real-time analytic systems, but they translate the queries that the end user has into a more operational type of query pattern. However, if you're using it straight-up for analytics, we're not a good bet. Similarly, there's cases where people want very high number of IOPS by reusing a cache or even a persistent cache. Amazon just came out with a [number of 00:31:04] persistent cache that does very high throughput and low-latency serving.
We're not good at that.We can do reasonably low-latency serving and reasonably high IOPS at scale, but we're not the use case where you want to hit that same lookup over and over and over, millions of times in a second; that's not the use case for us. The third thing I'd say is, we're a system of record, so people care about the data they put, and they don't absolutely don't want to lose it and they want to show that it's transactional. So, if there's a workload where there's a lot of data and you're okay if you want to lose, and it's just some sensor data, and your reasoning is like, “Okay, if I lose a few data points, it's fine.” I mean, you could still use us, but at that point you'd really have to be a fanboy or something for Yugabyte. I mean, there's other databases that probably do it better.Corey: Yeah, that's the problem is whenever someone says, “Oh, yeah. Database”—or any tool that they've built—“Like, this is great.” “What workloads is it not a fit for?” And their answer is, “Oh, nothing. It's perfect for everything.”Yeah, I want to believe you, but my inner bullshit sense is tingling on that one because nothing's fit for all purposes; it doesn't work that way. Honestly, this is going to be, I guess, heresy in the engineering world, but even computers aren't always the right answer for things. Who knew?Karthik: As a founder, I struggled with this answer a lot, initially. I think the problem is, when you're thinking about a problem space, that's all you're thinking about, you don't know what other problem spaces exist, and when you are asked the question, “What workloads is it a fit for?” At least I used to say, initially, “Everything,” because I'm only thinking about that problem space as the world, and it's fit for everything in that problem space, except I don't know how to articulate the problem space—Corey: Right—Karthik: —[crosstalk 00:32:33]. 
[laugh].

Corey: ...and at some point, too, you get so locked into one particular way of thinking about the world that when people ask about other cases, you're like, “Oh, that wouldn't count.” And then your follow-up question is, “Wait, what's a bank?” And it becomes a different story. It's, how do you wind up reasoning about these things? I want to thank you for taking all the time you have today to speak with me. If people want to learn more about Yugabyte, either the company or the DB, how can they do that?

Karthik: Yeah, thank you as well for having me. I think to learn about Yugabyte, just come join our community Slack channel. There's a lot of people; there's, like, over 3000 people. They're all asking interesting questions. There's a lot of interesting chatter on there, so that's one way.

We have an industry-wide event, it's called the Distributed SQL Summit. It's coming up September 22nd, 23rd, I think a couple of days; it's a two-day event. That would be a great place to actually learn from practitioners, and people building applications, and people in the general space and its adjacencies. And it's not necessarily just about Yugabyte; it's about distributed SQL databases in general, hence it's called the Distributed SQL Summit. And then you can ask us on Twitter or any of the usual social channels as well. So, we love interaction; we are a pretty open and transparent company. We love to talk to you guys.

Corey: Well, thank you so much for taking the time to speak with me. We'll, of course, throw links to that into the [show notes 00:33:43]. Thank you again.

Karthik: Awesome. Thanks a lot for having me. It was really fun. Thank you.

Corey: Likewise. Karthik Ranganathan, CTO and co-founder of YugabyteDB. I'm Cloud Economist Corey Quinn and this is Screaming in the Cloud.
If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice along with an angry comment, halfway through realizing that I'm not charging you anything for this podcast and converting the angry comment into a term sheet for $100 million investment.

Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.

Announcer: This has been a HumblePod production. Stay humble.
This is an EXTRA HOUR of the John & Heidi Show, featuring Dan Farris each weekday morning from about 6:35 to about 7:40. These breaks are only heard on the flagship station for The John And Heidi Show, Sunny 93.3 FM in Sioux Falls, SD (and now SiouxFallsNewsRadio.com). Part of the show is a visit from Ranger Dan, for Ranger Dan's Critter Corner. This is a fun program that features the most reckless but well-meaning park ranger, who SAYS he works for the state, but he cannot verify any of that! The rest of this show is syndicated on over 285 stations. To hear these breaks, you can listen to Sunny 93.3 from 5am to 10am or hear the podcast version at JohnAndHeidiShow.com
Ranganathan discusses the design considerations that influenced development of YugabyteDB, including the lessons learned from the engineering team's previous work at Facebook. YugabyteDB can be deployed on premises or as a cloud service. With built-in replication, YugabyteDB can be used to distribute data across geographic regions in support of data localization requirements and for high availability.

Key topics in the interview include:
The Yugabyte engineering team worked on the HBase and Cassandra databases at Facebook, experience that is now carrying over to the work they are doing at Yugabyte.
How YugabyteDB is different from other distributed SQL databases, including its support for both SQL and NoSQL interfaces.
Common use cases for YugabyteDB include real-time transactions, microservices, edge and IoT applications, and geographically distributed workloads.
YugabyteDB is available under the Apache 2.0 license and as self-managed and fully managed cloud services.

Quotes from the podcast:
“One of the important characteristics of transactional data is the fact that it needs to live forever.”
“We reuse the upper half of Postgres, so it literally is Postgres-compatible and has all of the features.”
“We said we're going to meet developers where they develop. We will support both APIs [SQL and NoSQL]. We're not going to invent a new API; that's what people hate.”
“It's not the database that people pay money for; it's the operations of the database and making sure it runs in a turnkey manner that people really find valuable in an enterprise setting.”
This is an EXTRA HOUR of the John & Heidi Show, featuring Dan Farris each weekday morning from about 6:35 to about 7:40. These breaks are only heard on the flagship station for The John And Heidi Show, Sunny 93.3 FM in Sioux Falls, SD (and now SiouxFallsNewsRadio.com). Part of the show is a visit from Ranger Dan, for Ranger Dan's Critter Corner. This is a fun program that features the most reckless but well-meaning park ranger, who SAYS he works for the state, but he cannot verify any of that! The rest of this show is syndicated on over 185 stations. To hear these breaks, you can listen to Sunny 93.3 from 5am to 10am or hear the podcast version at JohnAndHeidiShow.com
Karthik Ranganathan, CTO at Yugabyte, knows databases inside and out, having been on the team that first built Apache Cassandra, helped optimize and scale HBase, and most recently built Yugabyte. What insights does he have from participating in these efforts? He sat down with Percona’s HOSS Matt Yonkovit to talk through what he learned, what he regretted, and how Yugabyte takes those lessons and implements them.
This conversation covers:
How Bloomberg is demystifying bond trading and pricing, and bringing transparency to financial markets through its various digital offerings.
Andrey's role as CTO of compute architecture at Bloomberg, where he oversees research and implementation of new compute-related technologies to support Bloomberg's business and engineering objectives.
Why factors like speed and reliability are integral to Bloomberg's operations. Andrey also talks about how they impact his approach to technology, and why his team uses cloud-native technology.
How Andrey and his team use containers to scale and ensure reliability.
Why portability is important to Bloomberg's applications.
Bloomberg's journey to cloud-native.
Some of the open-source services that Andrey and his team are using at Bloomberg.
Unexpected challenges that Andrey has encountered at Bloomberg.
Primary business value that Bloomberg has experienced from their cloud-native transition.

Links
Bloomberg
Bloomberg GitHub
Follow Andrey on Twitter
Connect with Andrey on LinkedIn

Transcript

Emily: Hi everyone. I'm Emily Omier, your host, and my day job is helping companies position themselves in the cloud-native ecosystem so that their product's value is obvious to end-users. I started this podcast because organizations embark on the cloud-native journey for business reasons, but in general, the industry doesn't talk about them. Instead, we talk a lot about technical reasons. I'm hoping that with this podcast, we focus more on the business goals and business motivations that lead organizations to adopt cloud-native and Kubernetes. I hope you'll join me.

Emily: Welcome to The Business of Cloud Native. I'm your host, Emily Omier, and today I'm chatting with Andrey Rybka from Bloomberg. Thank you so much for joining us, Andrey.

Andrey: Thank you for your invitation.

Emily: Course. So, first of all, can you tell us a little bit about yourself and about Bloomberg?

Andrey: Sure.
So, I lead the compute architecture team, as the name suggests, in the CTO office. And our mission is to help with research and implementation of new compute-related technologies to support our business and engineering objectives. But more specifically, we work on ways to faster provision, manage, and elastically scale compute infrastructure, as well as support rapid application development and delivery. And we also work on developing and articulating the company's compute strategic direction, which includes the compute, storage, middleware, and application technologies, and we also help as product owners for the specific offerings that we have in-house. And as far as Bloomberg, so Bloomberg was founded in 1981 and it's got a very large presence: about 325,000 Bloomberg subscribers in about 170 countries, about 20,000 employees, and more news reporters than The New York Times, Washington Post, and Chicago Tribune combined. And we have about 6000 plus software engineers, so a pretty large team of very talented people, and we have quite a lot of data scientists and some specialized technologists. And some impressive points, I guess: we run one of the largest private networks in the world, and we move about a hundred and twenty billion pieces of data from financial markets each day, with a peak of more than 10 million messages a second. We generate about 2 million news stories, and they're published every day, and the news content we consume comes from about 125,000 sources. And the platform supports about 1 million messages and chats handled every day. So, it's a very large and high-performance kind of deployment.

Emily: And can you tell me just a little bit more about the types of applications that Bloomberg is working on or that Bloomberg offers? Maybe not everybody is familiar with why people subscribe to Bloomberg, what the main value is.
And I'm also curious how the different applications fit into that.Andrey: The core product is Bloomberg Terminal, which is Software as a Service offering that is delivering diverse array of information of news and analytics to facilitate financial decision-making. And Bloomberg has been doing a lot of things that make financial markets quite a bit more transparent. The original platform helped to demystify a lot of bond trading and pricing. So, the Bloomberg Terminal is the core product, but there's a lot of products that are focused on the trading solutions, there is enterprise data distribution for market data and such, and there is a lot of verticals such as Bloomberg Media: that's bloomberg.com, TV, and radio, and news articles that are consumer-facing. But also there is Bloomberg Law, which is offering for the attorneys, and there is other verticals like New Energy Finance, which helps with all the green energy and information that helps a lot to do with helping with climate change. And then there's Bloomberg Government, which is focused on, specifically, research around government-specific data feeds. And so in general, you've got finance, government, law, and new energy as the key solutions.Emily: And how important is speed?Andrey: It is extremely important because, well, first of all, obviously, for traders, although we're not in high-frequency game, we definitely want to deliver the news as fast as possible. We want to deliver actionable financial information as fast as possible, so definitely it is a major factor, but also not the only factor because there's other considerations like reliability and quality of service as well.Emily: And then how does this translate to your approach to new technology in general? And then also, why did you think cloud-native might be a good technology to look into and to adopt?Andrey: So, I guess if we define cloud-native, a little because I think there's different definitions; many people think of containers immediately. 
But I think that we need to think outside of just containers, I guess, to container orchestration and scaling elastically, up and down, and those, I guess, primitives. So, when we originally started on our cloud-native journey, we had this problem that we were treating our machines as pets. If you know the paradigm of pets versus cattle, a pet is something that you care for, and there's, like, literally a name for it; you take it to the vet if it gets sick. And when you think of a herd of cattle, there's many of them, and you can replace them, and you have quite a lot of understanding of scalability with the herd versus pets. So, we started moving in that direction because we wanted to have more uniform infrastructure, more homogeneous. And we started with VMs. So, we didn't necessarily jump to containers. And then we started thinking, like, “Are VMs the right abstraction?” And for some workloads they are, but then in some cases, we started thinking, “Well, maybe we need something more lightweight.” So, that's how we started looking at containers because you could provision them faster, and they could start up faster, and developers seem to be gravitating towards containers quite a bit because it's very easy to bootstrap your local dev environment with containers. And when you ship a container to the higher environment, it actually works. It used to be a problem where you developed on your local machine and you'd ship your code to production or a higher environment, and it doesn't work because some dependency got missed.
And that's where containers came about, to help with that problem.Emily: And then how does that fit in with your core business needs?Andrey: So, one of the big things is obviously, we need to ship products faster—and that's probably common to a lot of businesses—but we also want to ensure that we have highest availability possible, and that's where the containers help us to scale out our workloads and ensure that there's some resurrection happens with things like Kubernetes when something dies. And we also wanted to maximize our machine utilization. So, we have very large data centers and edge deployments—which I guess could be referenced as a private cloud—so we want to maximize utilization in our data centers. So, that's where virtualization and containers help quite a bit. But also, we wanted to make sure our workloads are portable across all the environments, from private cloud to the public cloud, or the edge. And that's where containerized technologies could help quite a bit. Because not only you can have, let's say Kubernetes clusters on-prem on the edge, but also, now all the three major cloud providers support a managed Kubernetes offering. And in this case, you have basically highly portable deployments across all the clouds, private and public.Emily: And why was that important?Andrey: Basically, we wanted to have, more or less, very generic way to deploy something, an application, right? And if you think of containers, that's pretty much, like I say, Docker is pretty standard these days. And developers, we were challenged with different package formats. So, if you do any application Ruby and Rails, or Java and Python, there is a native packages that you can use to package your application, distribute it, but it's not as uniformly support it outside of Bloomberg or even across various deployment platforms. But containers do get you that abstraction layer that helps you to basically build once and deploy many different targets in very uniform way. 
So, whether we do it on-premises, or to the edge, or to the public cloud, we can effectively use the same packaging mechanism. But not only for deployment, which is the one problem, but also for post-deployment. So, if we need to self-heal the workload. So, all those primitives are built there in the, I guess, Kubernetes fabric.Emily: But why is being portable important? What does it give you? What advantage? I mean, I understand that's one of the advantages of containers. But why specifically for Bloomberg, why do you care? I mean, are you moving applications around between public cloud providers, and—Andrey: So, we're definitely adopting public cloud quite a bit, but I guess what I was trying to hint is we have to support the private cloud deployments as our primary, I guess, delivery mechanism. But the edge deployments, when we actually deploy something closer to the customers, to your point about being faster, to deliver things faster to our customers, we have to deliver things to the edge, which is what I'm describing as something that is close to the customer. And then as far as the public clouds, we started moving a lot of workloads to public cloud, and that definitely required some rethinking of how do we want to adopt public cloud. But whether it's private or public, our main goal, I think, here is to make it easier for developers to package and deploy things and effectively, run faster, or deploy things faster, but also do it in more reliable way, right? Because it used to be that we could deploy things to a particular target of machines, and we could do it relatively reliably, but there was no auto-healing, necessarily, in place there. So, resilience and reliability wasn't quite as good as what we get with Kubernetes. And what I mentioned before, machine utilization, or actually ability to elastically scale workloads, and—within vertical and horizontal—vertical, we generally knew how to do that. 
Although I think with containers and VMs, you can do it much better, to a higher degree, but also horizontally, obviously, this was pretty challenging to do before Kubernetes came about. You had to bootstrap a bunch of VMs in different availability zones, figure out how you're going to deploy to them, and it just wasn't quite there as far as automation and ease of use.

Emily: Let's change gears just a little bit to talk about your journey to cloud-native. You mentioned that you started with VMs, and then you moved to containers. What time frame are we talking about? In addition to containers, what technologies do you use?

Andrey: So, yeah, I guess we started about eight years ago or so, with OpenStack as a primary virtualization platform, and if you look at github.com/bloomberg, you will see that we actually open-sourced our OpenStack distribution, so anyone can look and see if, potentially, they can benefit from that. And so OpenStack provided the VMs, basic storage, and some basic Infrastructure as a Service concepts. But then we also started getting into object storage, so there was a lot of investment made into S3-compatible storage, similar to AWS's S3 object storage, and it's based on the Ceph open-source framework. So, those were our foundational blocks. And then, very shortly thereafter, we started looking at Kubernetes to build a general-purpose Platform as a Service. Because effectively developers generally don't really want to manage virtual machines; they want to just write applications and deploy them somewhere, right? But they don't really care that much whether they use Red Hat or Ubuntu, and they don't really care to configure proxies or anything like that. So, we started rolling out a general-purpose Platform as a Service based on Kubernetes. So, that was with the initial alpha release of Kubernetes; we already started adopting it.
And then thereafter, we also started looking into how we can leverage Kubernetes for a data science platform. Well, now we have a world-class data science platform that allows data scientists to train and run inference on the various large clusters of compute with GPUs. Then we quickly realized that if we're building this on-prem, we need to have similar constructs to what you normally find in public cloud providers. So, as AWS has Identity and Access Management, we started introducing that on-prem as well. But more importantly, we needed something that would be a discovery layer as a service, where, if I'm looking for a service, I need to go somewhere to look it up. And DNS was not necessarily the right construct, although it's certainly very important. So, we started looking into leveraging Consul as a primary discovery as a service. And that actually paid quite a lot of dividends and it's helped us quite a bit. We also looked into Databases as a Service because everything that I described so far was really good for stateless workloads. To a greater degree with Kubernetes, I think you can get really good at running stateless workloads, but for something stateful, I think you needed something, basically, that will not necessarily run with Kubernetes. So, that's where we started looking at offering more Database as a Service, which Bloomberg had been doing quite a bit before that. We open-sourced our core relational database called Comdb2. But we also wanted to offer that for MySQL, for Postgres, and some other database flavors. So, I think we have a pretty decent offering right now, which offers a variety of Databases as a Service, and I would argue that you can provision some of those databases faster than you can do it on AWS.

Emily: It sounds like, and correct me if I'm wrong, you've ended up building a lot of things in-house. I mean, you used Kubernetes, but you've also done a lot of in-house custom work.

Andrey: Right.
So, custom, but with the principle of using open-source. Everything that I described actually has an open-source framework behind it. And this open first principle is something that now this is becoming more normal. Before we build something in-house, we looked at open-source frameworks, and we look at which open-source community we can leverage that has a lot of contributors, but also, can we contribute back, right? So, we contribute back to a lot of open-source projects, like for example, Solr. So, we offer search as a service based on the Solr open-source framework. But also, we have Redis caching as a service, queuing as a service based on RabbitMQ. Kafka as a service, so distributed event streaming as a service. So, quite a few open-source frameworks. We're always thought of, “Can we start with something that's open-source, participate in the community, and contribute back?”Emily: And tell me a little bit about what has gone really well? And also what has been possibly unexpectedly challenging, or even expectedly, but it's always even more interesting to hear what was surprising.Andrey: I think, generally open-source first, as a strategy worked out pretty well. I think we have, I've listed only some of the services that we have in-house, but we certainly have quite a bit more. And the benefits, I think, I don't even know how to quantify it, but it certainly enabled us to go fast and deliver business value as soon as possible versus waiting for years before we build our alternative technology. And I think developer happiness also improved quite a bit because we started investing heavily into our developer experience and as a major effort. And this everything as a service makes it extremely easy for developers to deliver new products. So, all of the investment we've made so far paid huge dividends. Challenges: I think that as with anything, starting with open-source projects, you certainly have bugs and things like that. 
So, in this case, we preferred to partner with a company that effectively has inside knowledge of the open-source project, so we can have, at least for a couple of years, somebody who can help guide us, and potentially, by actually investing money into the project, we get it to the point where it's mature enough and actually meets certain quality criteria. And we invested heavily in some projects that many people probably don't know about, like the Chromium Project. So, many people use Chrome, but Bloomberg has been sponsoring Chromium and WebKit open-source development quite a bit. The JavaScript V8 engine, even newer technologies like WebAssembly: we're heavily invested in sponsoring those. But again, one thing that's very clear: it's not just that we're going to be the consumers of open source, but we're going to be contributors back, with either our developers helping on the projects, or investing to help the actual open-source project we're leveraging to be successful, and not just by saying, like, Bloomberg consumes it, but actually investing back. So, that's one of the things that was a big lesson learned. But currently, I think we have a really good system in place where we always adopt open-source projects in a very conscious and serious way, with investment going back into the open-source community.

Emily: You mentioned being able to deliver business value sooner. What do you think are the primary two or three business values that you get from this cloud-native transition?

Andrey: So, ability to go faster. That's one thing that's very clear. Ability to elastically scale workloads, and ability to achieve uniformity of deployments across various environments: private, edge, public. So we are now able to deliver products to our customers as they transition to the public cloud, for example, much faster because we have a lot of standardization and a lot of technologies that helped us with adoption.
And including Kubernetes is one of them, but not only Kubernetes. We also use Terraform, extensively, some other multi-cloud frameworks.And then also delivering things more reliably. That's I think, one of the things that is not always recognized, but I think reliability is a huge differentiator, and some of it has to do with how we deliver things to the customer with some resiliency and redundancy. So, we run very large private content delivery network as a service, and it's also based on open-source technologies. And the reliability is one of the main things that I would say we get from a lot of this technologies because if we do it on our own, yes, it would be generally Bloomberg working on this problem and solving it, but you get a, actually, a worldwide number of experts from different companies who're contributing back to this technologies, and I see this as a, obviously, a huge benefit because it's not just Bloomberg working on solving some distributed system framework, but it's actually people worldwide working on this.Emily: And would you say there's anything in moving to cloud-native that you would do differently?Andrey: I think what I see as the big challenge, especially with Kubernetes, is adoption of stateful workloads because I still think it's not quite there yet. Generally, the way we're thinking right now is we leverage Kubernetes for our stateless workloads, but some stateful workloads require some cloud-native storage primitives to be there, and this is where I think it's still not quite mature. You can certainly leverage various vendors for that, but I really would like to see better support for stateful workloads in the open-source world. And definitely still looking for a project to partner with to deliver better stateful workloads on Kubernetes. And I think, to a various degree, the public cloud providers, so hyperscalers are getting pretty good at this, but that is still private to them. 
So, whether it's Google, Amazon, or Azure, they deliver the statefulness to varying degrees of reliability. But I would like this to be something that you can leverage on private clouds or anywhere else, and having it somewhere well-supported through an open-source community would be, I think, hugely beneficial to quite a few people. So, Kubernetes, I think, is the right compute fabric for the future, but it still doesn't support some of the workload types that I would like to be there.
Emily: Are there any other continuing challenges that you're working on, or problems that you haven't quite solved yet, either that you feel like you haven't solved, maybe internally, that might be specific to you, or that you just feel like the community hasn't quite figured out yet?
Andrey: So, this whole idea of multi-cloud deployments. We leverage quite a few technologies, from Terraform, to Vault, to Consul, to some other frameworks that help with some of it, but the day-two alerting, monitoring, and troubleshooting with multi-cloud deployments is still not quite there. So, yes, you can solve it for one particular cloud provider, but as soon as you go to two, I think there are quite a few challenges that are left unaddressed, just from, like, a single pane of glass, the view of all of your workloads, right. And that's definitely something that I would like to address: reliability, alerting across all the cloud providers, security across all the cloud providers. So, that's one of the challenges that I'm still working on, or, actually, quite a few people are working on at Bloomberg. As I said, we have 6000 plus talented engineers who are working on this.
Emily: Excellent. Anything else that you'd like to add?
Andrey: You know, I'm very excited about the future. I think this is almost like a compute renaissance. And it's really exciting to see all of these things that are happening, and I'm really excited about the future, I guess.
Emily: Fabulous. Just a couple more questions.
First of all, is there a tool that you feel like you couldn't do your job without?
Andrey: Right. Yes, VI editor or [laugh] [00:27:42 unintelligible]? No. So, I think obviously, Docker has done quite a bit for containerization. And I know we're looking at alternatives to Docker at this point, but I do give Docker quite a bit of credit, because right now we bootstrap the local development environment with Docker, and we ship it as a deployment mechanism all over the place. So, I would say Docker and Kubernetes are the two primary ones, but I don't necessarily want to pick favorites. [laugh]. I really like a lot of HashiCorp tools, you know, Terraform, Consul, Vault: fantastic tools, a really good community. I really like Jenkins. We run Jenkins, the service; really good. Kafka has been extremely reliable, and scalability-wise, Kafka is just amazing. For caching, Redis is really one of my favorite tools.
There's probably a lot to mention. I've mentioned. [00:28:43 unintelligible] databases, Postgres is one of my favorite databases, or in so many varieties and different types of workload. But we also gain quite a lot from Hadoop and HBase. But one of my favorite NoSQL databases is Cassandra: extremely reliable, and the replication across, I guess, low-quality bandwidth kind of environment has been really awesome. So, I guess I'm not answering with just one but many tools, but I really like all of those tools.
Emily: Excellent. Okay, well, just the last question is, where could listeners connect with you or follow you?
Andrey: I am on Twitter as @andrey_rybka. I'm happy to get any direct messages. We are hiring. We're always hiring. A lot of great opportunities. As I said, we're an open-source-first company these days, and we definitely have a lot of exciting new projects. I haven't mentioned even probably 90 probably of other exciting projects that we have.
We also have github.com/bloomberg, so you're welcome to browse and look at some of the cool open-source projects that we have as well.
Emily: Excellent. Cool. Well, thank you so much for joining me.
Andrey: Thank you very much.
Emily: Thanks for listening. I hope you've learned just a little bit more about The Business of Cloud Native. If you'd like to connect with me or learn more about my positioning services, look me up on LinkedIn: I'm Emily Omier (that's O-M-I-E-R), or visit my website, which is emilyomier.com. Thank you, and until next time.
Announcer: This has been a HumblePod production. Stay humble.
This is an EXTRA HOUR of the John & Heidi Show, featuring Dan Farris each weekday morning from about 6:35 to about 7:40. These breaks are only heard on the flagship station for The John And Heidi Show, Sunny 93.3 FM in Sioux Falls, SD (and now SiouxFallsNewsRadio.com). Part of the show is a visit from Ranger Dan for Ranger Dan's Critter Corner. This is a fun program that features the most reckless but well-meaning park ranger, who SAYS he works for the state but cannot verify any of that! The rest of this show is syndicated on over 165 stations. To hear these breaks, you can listen to Sunny 93.3 from 5am to 10am or hear the podcast version at JohnAndHeidiShow.com
Apache HBase is a distributed, scalable, big data store. I spoke with Josh Elser about what that means, and about the state of the HBase project, on its 10th birthday. Apache HBase – http://hbase.apache.org/ More project info – https://projects.apache.org/committee.html?hbase Prefer video? …
Google’s own Billy Jacobson joins hosts Mark Mandel and Mark Mirchandani this week to dive deeper into Cloud Bigtable. Bigtable is Google’s petabyte-scale, fully managed NoSQL database. Billy elaborates on what projects Bigtable works best with, like time-series data and user analytics, and why it’s such a great tool. It offers huge scalability with the benefits of a managed system, and it’s flexible and easily customized so users can turn on and off the pieces they need. Later, we learn about other programs that are compatible with Bigtable, such as JanusGraph, Open TSDB, and GeoMesa. Bigtable also supports the API for HBase, an open-source project similar to Bigtable. Because of this, it’s easy for HBase users to move to Bigtable, and the Bigtable community has access to many open-source libraries. Billy also talks more about the nine clients available, and when customers might want to use Bigtable instead of, or in conjunction with, other Google services such as Spanner and BigQuery.
Billy Jacobson
Billy Jacobson is a developer programs engineer focusing on Cloud Bigtable.
Cool things of the week
Introducing Cloud Run Button: Click-to-deploy your git repos to Google Cloud (blog)
Firebase Unity Solutions: Update game behavior without deploying with Remote Config (blog)
Introducing the BigQuery Terraform module (blog)
Macy’s uses Google Cloud to streamline retail operations (blog)
Interview
Cloud Bigtable (site)
GCP Podcast Episode 18: Bigtable with Ian Lewis (podcast)
BigQuery (site)
Bigtable Documentation (docs)
Codelab: Introduction to Cloud Bigtable (site)
Key Visualizer (docs)
Bigtable Replication Documentation (docs)
Bigtable and HBase Documentation (docs)
HBase (site)
JanusGraph (site)
Open TSDB (site)
GeoMesa (site)
Bigtable Client Libraries (docs)
Cloud Spanner (site)
Managing IoT Storage with Google’s Cloud Platform (Google I/O’19) (video)
Cloud Datastore (site)
Cloud Firestore (site)
Mapping the invisible: Street View cars add air pollution sensors (site)
Breathing Easy with Bigtable (article)
Question of the week
If I have an organization, how do I break down my billing data by folder?
Where can you find us next?
Mark Mirch is working around town but will be headed to LA soon. Mark Mandel will be at Pax Dev, Pax West, Kubecon, and the GDC Online Games Technology Summit.
016: Okera Data Management | Amandeep Khurana is CEO & Co-Founder of Cerebro Data. His company works with cloud-native data management and governance software for enterprises. Before this he was the Principal Solutions Architect at Cloudera Inc, where he worked with Cloudera’s customers to help them with their adoption and usage of the Hadoop ecosystem. Amandeep has been involved in the big data ecosystem since 2009 and worked at AWS prior to Cloudera. Amandeep has also co-authored HBase In Action, a book on developing applications on the popular NoSQL datastore, HBase.*** For Show Notes, Key Points, Contact Info, & Resources Mentioned on this episode visit here: Amandeep Khurana Interview. ***
What is the difference between SQL and NoSQL? In this episode I use HBase as an example to show you how a key/value store works.
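For listeners who want something concrete before the episode, here is a minimal Python sketch of the idea, not HBase itself and not anything taken from the episode. It assumes the commonly described HBase data model: a cell is addressed by a row key plus a "family:qualifier" column name, each cell can hold several timestamped versions, a read returns the newest version, and scans walk row keys in sorted order. The class name `TinyWideColumnStore` and its methods are hypothetical names made up for illustration.

```python
from bisect import insort
from collections import defaultdict


class TinyWideColumnStore:
    """Toy in-memory key/value store loosely modeled on HBase's data model.

    Hypothetical illustration only: no persistence, no regions, no real HBase API.
    """

    def __init__(self):
        # row key -> "family:qualifier" -> sorted list of (timestamp, value)
        self._rows = defaultdict(lambda: defaultdict(list))

    def put(self, row_key, column, value, timestamp):
        # Keep every version; insort keeps the list ordered by timestamp.
        insort(self._rows[row_key][column], (timestamp, value))

    def get(self, row_key, column):
        # Return the newest version of one cell, or None if it does not exist.
        versions = self._rows.get(row_key, {}).get(column)
        return versions[-1][1] if versions else None

    def scan(self, start_row, stop_row):
        # Yield rows in sorted key order; start is inclusive, stop is exclusive.
        for row_key in sorted(self._rows):
            if start_row <= row_key < stop_row:
                latest = {col: vers[-1][1] for col, vers in self._rows[row_key].items()}
                yield row_key, latest
```

The contrast with SQL is that there is no schema of typed columns and no joins here: everything is a lookup or a range scan on the row key, which is why row key design matters so much in stores of this kind.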
Simon hosts an update show with lots of great new features and capabilities! Chapters: Developer Tools 0:26 Storage 3:02 Compute 5:10 Database 10:31 Networking 13:41 Analytics 16:38 IoT 18:23 End User Computing 20:19 Machine Learning 21:12 Application Integration 24:02 Management and Governance 24:23 Migration 26:05 Security 26:56 Training and Certification 29:57 Blockchain 30:27 Quickstarts 31:06 Shownotes: Topic || Developer Tools Announcing AWS X-Ray Analytics – An Interactive approach to Trace Analysis | https://aws.amazon.com/about-aws/whats-new/2019/04/aws_x_ray_interactive_approach_analyze_traces/ Quickly Search for Resources across Services in the AWS Developer Tools Console | https://aws.amazon.com/about-aws/whats-new/2019/05/search-resources-across-services-developer-tools-console/ AWS Amplify Console adds support for Incoming Webhooks | https://aws.amazon.com/about-aws/whats-new/2019/05/aws-amplify-console-adds-support-for-incoming-webhooks/ AWS Amplify launches an online community for fullstack serverless app developers | https://aws.amazon.com/about-aws/whats-new/2019/04/aws-amplify-launches-an-online-community-for-fullstack-serverless-app-developers/ AWS AppSync Now Enables More Visibility into Performance and Health of GraphQL Operations | https://aws.amazon.com/about-aws/whats-new/2019/05/aws-appsync-now-enables-more-visibility-into-performance-and-hea/ AWS AppSync Now Supports Configuring Multiple Authorization Types for GraphQL APIs | https://aws.amazon.com/about-aws/whats-new/2019/05/aws-appsync-now-supports-configuring-multiple-authorization-type/ Topic || Storage Amazon S3 Introduces S3 Batch Operations for Object Management | https://aws.amazon.com/about-aws/whats-new/2019/04/Amazon-S3-Introduces-S3-Batch-Operations-for-Object-Management/ AWS Snowball Edge adds block storage – Amazon Web Services | https://aws.amazon.com/about-aws/whats-new/2019/04/aws-snowball-edge-adds-block-storage-for-edge-computing-workload/ Amazon FSx for Windows File 
Server Adds Support for File System Monitoring with Amazon CloudWatch | https://aws.amazon.com/about-aws/whats-new/2019/05/amazon-fsx-for-windows-file-server-adds-support-for-cloudwatch/ AWS Storage Gateway enhances access control for SMB shares to store and access objects in Amazon S3 buckets | https://aws.amazon.com/about-aws/whats-new/2019/05/AWS-Storage-Gateway-enhances-access-control-for-SMB-shares-to-access-objects-in-Amazon-s3/ Topic || Compute AWS Lambda adds support for Node.js v10 | https://aws.amazon.com/about-aws/whats-new/2019/05/aws_lambda_adds_support_for_node_js_v10/ AWS Serverless Application Model (SAM) supports IAM permissions and custom responses for Amazon API Gateway | https://aws.amazon.com/about-aws/whats-new/2019/aws_serverless_application_Model_support_IAM/ AWS Step Functions Adds Support for Workflow Execution Events | https://aws.amazon.com/about-aws/whats-new/2019/05/aws-step-functions-adds-support-for-workflow-execution-events/ Amazon EC2 I3en instances, offering up to 60 TB of NVMe SSD instance storage, are now generally available | https://aws.amazon.com/about-aws/whats-new/2019/05/amazon-ec2-i3en-instances-are-now-generally-available/ Now Create Amazon EC2 On-Demand Capacity Reservations Through AWS CloudFormation | https://aws.amazon.com/about-aws/whats-new/2019/04/now-create-amazon-ec2-on-demand-capacity-reservations-through-aws-cloudformation/ Share encrypted AMIs across accounts to launch instances in a single step | https://aws.amazon.com/about-aws/whats-new/2019/05/share-encrypted-amis-across-accounts-to-launch-instances-in-a-single-step/ Launch encrypted EBS backed EC2 instances from unencrypted AMIs in a single step | https://aws.amazon.com/about-aws/whats-new/2019/05/launch-encrypted-ebs-backed-ec2-instances-from-unencrypted-amis-in-a-single-step/ Amazon EKS Releases Deep Learning Benchmarking Utility | https://aws.amazon.com/about-aws/whats-new/2019/05/-amazon-eks-releases-deep-learning-benchmarking-utility-/ Amazon EKS 
Adds Support for Public IP Addresses Within Cluster VPCs | https://aws.amazon.com/about-aws/whats-new/2019/05/amazon-eks-adds-support-for-public-ip-addresses-within-cluster-v/ Amazon EKS Simplifies Kubernetes Cluster Authentication | https://aws.amazon.com/about-aws/whats-new/2019/05/amazon-eks-simplifies-kubernetes-cluster-authentication/ Amazon ECS Console support for ECS-optimized Amazon Linux 2 AMI and Amazon EC2 A1 instance family now available | https://aws.amazon.com/about-aws/whats-new/2019/05/amazon-ecs-console-support-for-ecs-optimized-amazon-linux-2-ami-/ AWS Fargate PV1.3 now supports the Splunk log driver | https://aws.amazon.com/about-aws/whats-new/2019/05/aws-fargate-pv1-3-now-supports-the-splunk-log-driver/ Topic || Databases Amazon Aurora Serverless Supports Capacity of 1 Unit and a New Scaling Option | https://aws.amazon.com/about-aws/whats-new/2019/04/amazon_aurora_serverless_now_supports_a_minimum_capacity_of_1_unit_and_a_new_scaling_option/ Aurora Global Database Expands Availability to 14 AWS Regions | https://aws.amazon.com/about-aws/whats-new/2019/05/Aurora_Global_Database_Expands_Availability_to_14_AWS_Regions/ Amazon DocumentDB (with MongoDB compatibility) now supports per-second billing | https://aws.amazon.com/about-aws/whats-new/2019/05/amazon-documentdb-now-supports-per-second-billing/ Performance Insights is Generally Available on Amazon Aurora MySQL 5.7 | https://aws.amazon.com/about-aws/whats-new/2019/05/Performance-Insights-GA-Aurora-MySQL-57/ Performance Insights Supports Counter Metrics on Amazon RDS for Oracle | https://aws.amazon.com/about-aws/whats-new/2019/05/performance-insights-countermetrics-on-oracle/ Performance Insights Supports Amazon Aurora Global Database | https://aws.amazon.com/about-aws/whats-new/2019/05/performance-insights-global-datatabase/ Amazon ElastiCache for Redis adds support for Redis 5.0.4 | https://aws.amazon.com/about-aws/whats-new/2019/05/elasticache-redis-5-0-4/ Amazon RDS for MySQL Supports 
Password Validation | https://aws.amazon.com/about-aws/whats-new/2019/05/amazon-rds-for-mysql-supports-password-validation/ Amazon RDS for PostgreSQL Supports New Minor Versions 11.2, 10.7, 9.6.12, 9.5.16, and 9.4.21 | https://aws.amazon.com/about-aws/whats-new/2019/05/amazon-rds-postgresql-supports-minor-version-112/ Amazon RDS for Oracle now supports April Oracle Patch Set Updates (PSU) and Release Updates (RU) | https://aws.amazon.com/about-aws/whats-new/2019/05/amazon-rds-for-oracle-now-supports-april-oracle-patch-set-updates-psu-and-release-updates-ru/ Topic || Networking Elastic Fabric Adapter Is Now Generally Available | https://aws.amazon.com/about-aws/whats-new/2019/04/elastic-fabric-adapter-is-now-generally-available/ Migrate Your AWS Site-to-Site VPN Connections from a Virtual Private Gateway to an AWS Transit Gateway | https://aws.amazon.com/about-aws/whats-new/2019/04/migrate-your-aws-site-to-site-vpn-connections-from-a-virtual-private-gateway-to-an-aws-transit-gateway/ Announcing AWS Direct Connect Support for AWS Transit Gateway | https://aws.amazon.com/about-aws/whats-new/2019/04/announcing-aws-direct-connect-support-for-aws-transit-gateway/ Amazon CloudFront announces 11 new Edge locations in India, Japan, and the United States | https://aws.amazon.com/about-aws/whats-new/2019/05/cloudfront-11locations-7may2019/ Amazon VPC Endpoints Now Support Tagging for Gateway Endpoints, Interface Endpoints, and Endpoint Services | https://aws.amazon.com/about-aws/whats-new/2019/05/amazon-vpc-endpoints-now-support-tagging-for-gateway-endpoints-interface-endpoints-and-endpoint-services/ Topic || Analytics Amazon EMR announces Support for Multiple Master nodes to enable High Availability for EMR applications | https://aws.amazon.com/about-aws/whats-new/2019/04/amazon-emr-announces-support-for-multiple-master-nodes-to-enable-high-availability-for-EMR-applications/ Amazon EMR now supports Multiple Master nodes to enable High Availability for HBase clusters | 
https://aws.amazon.com/about-aws/whats-new/2019/05/amazon-emr-now-supports-multiple-master-nodes-to-enable-high-availability-for-hbase-clusters/ Amazon EMR announces Support for Reconfiguring Applications on Running EMR Clusters | https://aws.amazon.com/about-aws/whats-new/2019/05/amazon-emr-announces-support-for-reconfiguring-applications-on-running-emr-clusters/ Amazon Kinesis Data Analytics now allows you to assign AWS resource tags to your real-time applications | https://aws.amazon.com/about-aws/whats-new/2019/05/amazon_kinesis_data_analytics_now_allows_you_to_assign_aws_resource_tags_to_your_real_time_applications/ AWS Glue crawlers now support existing Data Catalog tables as sources | https://aws.amazon.com/about-aws/whats-new/2019/05/aws-glue-crawlers-now-support-existing-data-catalog-tables-as-sources/ Topic || IoT AWS IoT Analytics Now Supports Faster SQL Data Set Refresh Intervals | https://aws.amazon.com/about-aws/whats-new/2019/04/aws-iot-analytics-now-supports-faster-sql-data-set-refresh-intervals/ AWS IoT Greengrass Adds Support for Python 3.7, Node v8.10.0, and Expands Support for Elliptic-Curve Cryptography | https://aws.amazon.com/about-aws/whats-new/2019/04/aws-iot-greengrass-adds-support-python-3-7-node-v-8-10-0-and-expands-support-elliptic-curve-cryptography/ AWS Releases Additional Preconfigured Examples for FreeRTOS on Armv8-M | https://aws.amazon.com/about-aws/whats-new/2019/05/aws-releases-additional-freertos-preconfigured-examples-armv8m/ AWS IoT Device Defender supports monitoring behavior of unregistered devices | https://aws.amazon.com/about-aws/whats-new/2019/05/aws-iot-device-defender-supports-monitoring-behavior-of-unregistered-devices/ AWS IoT Analytics Now Supports Data Set Content Delivery to Amazon S3 | https://aws.amazon.com/about-aws/whats-new/2019/05/aws-iot-analytics-now-supports-data-set-content-delivery-to-amaz/ Topic || End User Computing Amazon AppStream 2.0 adds configurable timeouts for idle sessions | 
https://aws.amazon.com/about-aws/whats-new/2019/05/amazon-appstream-2-0-adds-configurable-timeouts-for-idle-session/ Monitor Emails in Your Workmail Organization Using Cloudwatch Metrics and Logs | https://aws.amazon.com/about-aws/whats-new/2019/05/monitor-emails-in-your-workmail-organization-using-cloudwatch-me/ You can now use custom chat bots with Amazon Chime | https://aws.amazon.com/about-aws/whats-new/2019/05/you-can-now-use-custom-chat-bots-with-amazon-chime/ Topic || Machine Learning Developers, start your engines! The AWS DeepRacer Virtual League kicks off today. | https://aws.amazon.com/about-aws/whats-new/2019/04/AWSDeepRacerVirtualLeague/ Amazon SageMaker announces new features to the built-in Object2Vec algorithm | https://aws.amazon.com/about-aws/whats-new/2019/05/amazon-sagemaker-announces-new-features-to-the-built-in-object2v/ Amazon SageMaker Ground Truth Now Supports Automated Email Notifications for Manual Data Labeling | https://aws.amazon.com/about-aws/whats-new/2019/05/amazon-sagemaker-ground-truth-now-supports-automated-email-notif/ Amazon Translate Adds Support for Hindi, Farsi, Malay, and Norwegian | https://aws.amazon.com/about-aws/whats-new/2019/05/amazon_translate_support_hindi_farsi_malay_norwegian/ Amazon Transcribe now supports Hindi and Indian-accented English | https://aws.amazon.com/about-aws/whats-new/2019/05/amazon-transcribe-supports-hindi-indian-accented-english/ Amazon Comprehend batch jobs now supports Amazon Virtual Private Cloud | https://aws.amazon.com/about-aws/whats-new/2019/05/amazon-comprehend-batch-jobs-now-supports-amazon-virtual-private-cloud/ New in AWS Deep Learning AMIs: PyTorch 1.1, Chainer 5.4, and CUDA 10 support for MXNet | https://aws.amazon.com/about-aws/whats-new/2019/05/new-in-aws-deep-learning-amis-pytorch-1-1-chainer-5-4-cuda10-for-mxnet/ Topic || Application Integration Amazon MQ Now Supports Resource-Level and Tag-Based Permissions | 
https://aws.amazon.com/about-aws/whats-new/2019/04/amazon-mq-now-supports-resource-level-and-tag-based-permissions/ Amazon SNS Adds Support for Cost Allocation Tags | https://aws.amazon.com/about-aws/whats-new/2019/05/amazon-sns-adds-support-for-cost-allocation-tags/ Topic || Management and Governance Reservation Expiration Alerts Now Available in AWS Cost Explorer | https://aws.amazon.com/about-aws/whats-new/2019/05/reservation-expiration-alerts-now-available-in-aws-cost-explorer/ AWS Systems Manager Patch Manager Supports Microsoft Application Patching | https://aws.amazon.com/about-aws/whats-new/2019/05/aws-systems-manager-patch-manager-supports-microsoft-application-patching/ AWS OpsWorks for Chef Automate now supports Chef Automate 2 | https://aws.amazon.com/about-aws/whats-new/2019/05/aws-opsworks-for-chef-automate-now-supports-chef-automate-2/ AWS Service Catalog Connector for ServiceNow supports CloudFormation StackSets | https://aws.amazon.com/about-aws/whats-new/2019/05/service-catalog-servicenow-connector-now-supports-stacksets/ Topic || Migration AWS Migration Hub EC2 Recommendations | https://aws.amazon.com/about-aws/whats-new/2019/05/aws-migration-hub-ec2-recommendations/ Topic || Security Amazon GuardDuty Adds Two New Threat Detections | https://aws.amazon.com/about-aws/whats-new/2019/05/amazon-guardduty-adds-two-new-threat-detections/ AWS Security Token Service (STS) now supports enabling the global STS endpoint to issue session tokens compatible with all AWS Regions | https://aws.amazon.com/about-aws/whats-new/2019/04/aws-security-token-service-sts-now-supports-enabling-the-global-sts-endpoint-to-issue-session-tokens-compatible-with-all-aws-regions/ AWS WAF Security Automations Now Supports Log Analysis | https://aws.amazon.com/about-aws/whats-new/2019/04/aws-waf-security-automations-now-supports-log-analysis/ AWS Certificate Manager Private Certificate Authority Increases Certificate Limit To One Million | 
https://aws.amazon.com/about-aws/whats-new/2019/04/aws-certificate-manager-private-certificate-authority-increases-certificate-limit-to-one-million/ Amazon Cognito launches enhanced user password reset API for administrators | https://aws.amazon.com/about-aws/whats-new/2019/05/amazon-cognito-launches-enhanced-user-password-reset-api-for-administrators/ AWS Secrets Manager supports more client-side caching libraries to improve secrets availability and reduce cost | https://aws.amazon.com/about-aws/whats-new/2019/05/Secrets-Manager-Client-Side-Caching-Libraries-in-Python-NET-Go/ Create fine-grained session permissions using AWS Identity and Access Management (IAM) managed policies | https://aws.amazon.com/about-aws/whats-new/2019/05/session-permissions/ Topic || Training and Certification New VMware Cloud on AWS Navigate Track | https://aws.amazon.com/about-aws/whats-new/2019/04/vmware-navigate-track/ Topic || Blockchain Amazon Managed Blockchain What's New | https://aws.amazon.com/about-aws/whats-new/2019/04/introducing-amazon-managed-blockchain/ Topic || Quick Starts New Quick Start deploys SAP S/4HANA on AWS | https://aws.amazon.com/about-aws/whats-new/2019/05/new-quick-start-deploys-sap-s4-hana-on-aws/
Simon & Nicki are joined by a live audience to record a great set of cool updates for customers! Chapters: 1:20 Infrastructure 1:33 Developer Tools 3:50 Storage 4:28 Compute 6:13 Database 10:22 Analytics 13:01 IoT 13:23 End User Computing 14:08 Machine Learning 17:03 Networking 18:22 Customer Engagement 18:37 Application Integration 19:12 Game Tech 19:47 Media Services 20:44 Management and Governance 23:20 Robotics 24:26 Migration 25:03 Security 25:38 Training & Certification 26:05 Audience Q&A Shownotes: Topic || Infrastructure Announcing the AWS Asia Pacific (Hong Kong) Region | https://aws.amazon.com/about-aws/whats-new/2019/04/announcing-the-aws-asia-pacific-hong-kong-region/ Topic || Developer Tools AWS Amplify Console Now Supports Deploying Fullstack Serverless Applications with a Single Click | https://aws.amazon.com/about-aws/whats-new/2019/04/aws-amplify-console-now-supports-deploying-fullstack-serverless-/ Amplify Framework Simplifies Configuring OAuth 2.0 Flows, Hosted UI, and AR/VR Scenes for Mobile and Web Apps | https://aws.amazon.com/about-aws/whats-new/2019/04/amplify-framework-simplifies-configuring-oauth-2-0-flows--hosted/ Amplify Framework Announces New Amazon Aurora Serverless, GraphQL, and OAuth Capabilities | https://aws.amazon.com/about-aws/whats-new/2019/04/aws-amplify-announces-new-amazon-aurora-serverless--graphql--and/ AWS Amplify Console adds support for Custom Headers | https://aws.amazon.com/about-aws/whats-new/2019/04/aws-amplify-console-adds-support-for-custom-headers/ AWS Amplify Console Now Available in Five Additional Regions | https://aws.amazon.com/about-aws/whats-new/2019/04/amplify-console-now-available-in-five-additional-regions/ AWS Device Farm Remote Access for Manual Testing on real Android and iOS devices now supports Android OS 8+ and iOS 11+ devices | https://aws.amazon.com/about-aws/whats-new/2019/04/aws-device-farm-remote-access-for-manual-testing-on-real-android/ Topic || Storage New AWS Public Datasets Available 
from National Renewable Energy Laboratory, Nanyang Technological University, Stanford, Software Heritage and others | https://aws.amazon.com/about-aws/whats-new/2019/04/new-aws-public-datasets-available-from-national-renewable-energy/ Topic || Compute Amazon EC2 T3a Instances Are Now Generally Available | https://aws.amazon.com/about-aws/whats-new/2019/04/amazon-ec2-t3a-instances-are-now-generally-available/ Amazon EKS Now Delivers Kubernetes Control Plane Logs to Amazon CloudWatch | https://aws.amazon.com/about-aws/whats-new/2019/04/amazon-eks-now-delivers-kubernetes-control-plane-logs-to-amazon-/ Amazon EKS Supports EC2 A1 Instances as a Public Preview | https://aws.amazon.com/about-aws/whats-new/2019/04/-amazon-eks-supports-ec2-a1-instances-as-a-public-preview-/ AWS Elastic Beanstalk extends Tag-Based Permissions | https://aws.amazon.com/about-aws/whats-new/2019/04/aws_elastic_beanstalk_extends_tag-based_permissions/ AWS ParallelCluster 2.3.1 with enhanced support for Slurm Workload Manager is available now | https://aws.amazon.com/about-aws/whats-new/2019/04/aws-parallelcluster-slurm-enhancements/ Topic || Databases Amazon RDS now supports per-second billing | https://aws.amazon.com/about-aws/whats-new/2019/04/aws-rds-per-second-billing/ Amazon RDS for Oracle Now Supports Database Storage Size up to 64TiB | https://aws.amazon.com/about-aws/whats-new/2019/04/amazon-rds-for-oracle-now-supports-64tib/ Amazon RDS Enhanced Monitoring Adds New Storage and Host Metrics | https://aws.amazon.com/about-aws/whats-new/2019/04/enhanced-monitoring-supports-additional-metrics/ Amazon RDS for PostgreSQL Now Supports Multi Major Version Upgrades to PostgreSQL 11 | https://aws.amazon.com/about-aws/whats-new/2019/04/amazon-rds-postgresql-supports-multi-major-version-upgrades/ Amazon RDS for PostgreSQL Now Supports Data Import from Amazon S3 | https://aws.amazon.com/about-aws/whats-new/2019/04/amazon-rds-postgresql-supports-data-import-from-amazon-s3/ Amazon Aurora and Amazon RDS 
Enable Faster Migration from MySQL 5.7 Databases | https://aws.amazon.com/about-aws/whats-new/2019/04/amazon_aurora_and_amazon_rds_enable_faster_migration_from_mysql_57_databases/ Amazon Aurora Serverless Supports Sharing and Cross-Region Copying of Snapshots | https://aws.amazon.com/about-aws/whats-new/2019/04/amazon_aurora_serverless_now_supports_sharing_and_cross-region_copying_of_snapshots/ AWS simplifies replatforming of Microsoft SQL Server databases from Windows to Linux | https://aws.amazon.com/about-aws/whats-new/2019/04/windows-to-linux-replatforming-assistant-sql-server-databases/ Amazon Redshift now provides more control over snapshots | https://aws.amazon.com/about-aws/whats-new/2019/04/amazon-redshift-now-provides-more-control-over-snapshots/ AWS specifies the IP address ranges for Amazon DynamoDB endpoints | https://aws.amazon.com/about-aws/whats-new/2019/04/aws-specifies-the-ip-address-ranges-for-amazon-dynamodb-endpoints/ Now you can tag Amazon DynamoDB tables when you create them | https://aws.amazon.com/about-aws/whats-new/2019/04/now-you-can-tag-amazon-dynamodb-tables-when-you-create-them/ DynamoDBMapper now supports Amazon DynamoDB transactional API calls | https://aws.amazon.com/about-aws/whats-new/2019/04/dynamodbmapper-now-supports-amazon-dynamodb-transactional-api-calls/ Topic || Analytics Amazon Elasticsearch Service announces support for Elasticsearch 6.5 | https://aws.amazon.com/about-aws/whats-new/2019/04/amazon-elasticsearch-service-announces-support-for-elasticsearch-6-5/ Amazon Elasticsearch Service adds event monitoring and alerting support | https://aws.amazon.com/about-aws/whats-new/2019/04/amazon-elasticsearch-service-adds-event-monitoring-and-alerting-support/ Amazon Elasticsearch Service now offers improved performance at lower costs with C5, M5, and R5 instances | https://aws.amazon.com/about-aws/whats-new/2019/04/amazon-elasticsearch-service-now-offers-improved-performance-at-lower-costs-with-C5-M5-R5-instances/ AWS Glue now 
supports additional configuration options for memory-intensive jobs | https://aws.amazon.com/about-aws/whats-new/2019/04/aws-glue-now-supports-additional-configuration-options-for-memory-intensive-jobs/ Announcing EMR release 5.22.0: Support for new versions of HBase, Oozie, Flink, and optimized EBS configuration for improved IO performance for applications such as Spark | https://aws.amazon.com/about-aws/whats-new/2019/04/announcing-emr-release-5220-support-for-new-versions-of-hbase-oozie-flink-and-optimized-ebs-configuration-for-improved-io-performance-for-applications-such-as-spark/ Amazon Kinesis Data Streams changes license for its consumer library to Apache License 2.0 | https://aws.amazon.com/about-aws/whats-new/2019/04/amazon_kinesis_data_streams_changes_license_for_its_consumer_library_to_apache_license_2_0/ Amazon MSK expands its open preview into AP (Singapore) and AP (Sydney) AWS Regions | https://aws.amazon.com/about-aws/whats-new/2019/04/amazon_msk_expands_its_open_preview_into_ap_singapore_and_ap_sydney_aws_regions/ Amazon QuickSight now supports localization, percentile calculations and more | https://aws.amazon.com/about-aws/whats-new/2019/04/Amazon_QuickSight_now_supports_localization_percentile_calculations_and_more/ Topic || IoT Amazon FreeRTOS Now Supports Resource Tagging | https://aws.amazon.com/about-aws/whats-new/2019/04/amazon-freertos-now-supports-resource-tagging/ AWS IoT Analytics Now Supports Single Step Setup of IoT Analytics Resources from AWS IoT Core | https://aws.amazon.com/about-aws/whats-new/2019/04/aws-iot-analytics-now-supports-single-step-setup-of-iot-analytic/ Topic || End User Computing AWS Client VPN is Now Available in Four Additional AWS Regions | https://aws.amazon.com/about-aws/whats-new/2019/04/aws-client-vpn-is-now-available-in-four-additional-aws-regions/ Amazon WorkDocs Migration Service | https://aws.amazon.com/about-aws/whats-new/2019/04/amazon_workdocs_migration_service/ Amazon WorkDocs Document Approvals | 
https://aws.amazon.com/about-aws/whats-new/2019/04/amazon-workdocs-document-approval/ Topic || Machine Learning Amazon SageMaker Now Offers Reduced Prices in the Asia Pacific (Tokyo) and Asia Pacific (Seoul) AWS Regions | https://aws.amazon.com/about-aws/whats-new/2019/04/amazon-sagemaker-now-offers-reduced-prices-in-the-asia-pacific--/ Amazon SageMaker Now Supports Greater Control of Root Access to Notebook Instances | https://aws.amazon.com/about-aws/whats-new/2019/04/amazon-sagemaker-now-supports-greater-control-of-root-access-to-/ Amazon SageMaker Ground Truth announces new features to simplify workflows, new data labeling vendors, and expansion in the Asia Pacific region | https://aws.amazon.com/about-aws/whats-new/2019/04/amazon-sagemaker-ground-truth-announces-new-features-to-simplify/ Amazon Transcribe now supports real-time speech-to-text in British English, French, and Canadian French | https://aws.amazon.com/about-aws/whats-new/2019/04/amazon-transcribe-now-supports-real-time-speech-to-text-in-british-english-french-and-canadian-french/ Amazon Polly Adds Arabic Language Support | https://aws.amazon.com/about-aws/whats-new/2019/04/amazon-polly-adds-arabic-language-support/ Amazon Comprehend Now Supports Confusion Matrices for Custom Classification | https://aws.amazon.com/about-aws/whats-new/2019/04/amazon-comprehend-now-supports-confusion-matrices-for-custom-classification/ AWS DeepLens Introduces New Bird Classification Project Template | https://aws.amazon.com/about-aws/whats-new/2019/04/aws-deeplens-bird-classification/ Topic || Networking Amazon CloudFront enhances the security for adding alternate domain names to a distribution | https://aws.amazon.com/about-aws/whats-new/2019/04/amazon-cloudfront-enhances-the-security-for-adding-alternate-domain-names-to-a-distribution/ Amazon CloudFront is now Available in Mainland China | https://aws.amazon.com/about-aws/whats-new/2019/04/amazon-cloudfront-is-now-available-in-mainland-china/ Expanding AWS 
PrivateLink support for Amazon Kinesis Data Firehose | https://aws.amazon.com/about-aws/whats-new/2019/04/expanding_aws_privatelink_support_for_amazon_kinesis_data_firehose/ AWS Global Accelerator is Now Available in Six Additional Regions | https://aws.amazon.com/about-aws/whats-new/2019/04/aws-global-accelerator-is-now-available-in-six-additional-regions/ Topic || Customer Engagement Amazon Pinpoint Now Offers an Analytics Dashboard for Transactional SMS Messages | https://aws.amazon.com/about-aws/whats-new/2019/04/amazon-pinpoint-now-offers-an-analytics-dashboard-for-transactional-sms-messages/ Topic || Application Integration AWS AppSync Now Supports Tagging GraphQL APIs | https://aws.amazon.com/about-aws/whats-new/2019/04/aws-appsync-now-supports-tagging-graphql-apis/ Amazon MQ now supports ActiveMQ Minor Version 5.15.9 | https://aws.amazon.com/about-aws/whats-new/2019/04/amazon-mq-now-supports-activemq-minor-version-5-15-9/ Topic || Game Tech Amazon GameLift Realtime Servers Now Available | https://aws.amazon.com/about-aws/whats-new/2019/04/amazon-gameLift-realtime-servers-now-available/ Topic || Media Services AWS Elemental MediaPackage and MediaTailor improve support for DASH Endpoints and Monetization | https://aws.amazon.com/about-aws/whats-new/2019/04/aws-elemental-mediapackage-and-mediatailor-improve-support-for-dash-endpoints-and-monetization/ AWS Elemental MediaLive Offers Lower Cost Live Channels with Single-Pipeline Option | https://aws.amazon.com/about-aws/whats-new/2019/04/aws-elemental-medialive-offers-lower-cost-live-channels-with-single-pipeline-option/ Speed Up Video Processing With New Accelerated Transcoding in AWS Elemental MediaConvert | https://aws.amazon.com/about-aws/whats-new/2019/04/speed-up-video-processing-with-new-accelerated-transcoding-in-aws-elemental-mediaconvert/ AWS Elemental MediaStore Now Supports Chunked Object Transfer to Enable Ultra-Low Latency Video Workflows | 
https://aws.amazon.com/about-aws/whats-new/2019/04/aws-elemental-mediastore-now-supports-chunked-object-transfer-to-enabling-ultra-low-latency-video-workflows/ Topic || Management and Governance AWS CloudFormation Coverage Updates for Amazon EC2, Amazon ECS and Amazon Elastic Load Balancer | https://aws.amazon.com/about-aws/whats-new/2019/04/aws-cloudformation-coverage-updates-for-amazon-ec2--amazon-ecs-a/ AWS Systems Manager Session Manager Enables Session Encryption Using Customer Keys | https://aws.amazon.com/about-aws/whats-new/2019/04/AWS-Systems-Manager-Session-Manager-Enables-Session-Encryption-Using-Customer-Keys/ AWS Systems Manager Now Supports Use of Parameter Store at Higher API Throughput | https://aws.amazon.com/about-aws/whats-new/2019/04/aws_systems_manager_now_supports_use_of_parameter_store_at_higher_api_throughput/ AWS Systems Manager Parameter Store Introduces Advanced Parameters | https://aws.amazon.com/about-aws/whats-new/2019/04/aws_systems_manager_parameter_store_introduces_advanced_parameters/ Query AWS Regions Endpoints and More | https://aws.amazon.com/blogs/aws/new-query-for-aws-regions-endpoints-and-more-using-aws-systems-manager-parameter-store/ AWS Service Catalog Announces Tag Updating | https://aws.amazon.com/about-aws/whats-new/2019/04/aws-service-catalog-announces-tag-updating/ Topic || Robotics Announcing AWS RoboMaker Cloud Extensions for Robot Operating System (ROS) Melodic | https://aws.amazon.com/about-aws/whats-new/2019/04/announcing-aws-robomaker-cloud-extensions-for-robot-operating-sy/ NICE DCV Now Supports MacOS Native Clients | https://aws.amazon.com/about-aws/whats-new/2019/04/nice-dcv-now-supports-macos-native-clients/ Topic || Migration Announcing Azure to AWS migration support in AWS Server Migration Service | https://aws.amazon.com/about-aws/whats-new/2019/04/announcing_azure_awsmigration_servermigrationservice/ Topic || Security AWS Certificate Manager Private Certificate Authority is now available in five 
additional regions | https://aws.amazon.com/about-aws/whats-new/2019/04/AWS-Certificate-Manager-Private-Certificate-Authority-is-now-available-in-five-additional-regions/ AWS Single Sign-On now offers certificate customization to support your corporate policies | https://aws.amazon.com/about-aws/whats-new/2019/04/you-can-now-customize-the-aws-single-sign-on-certificate-to-meet-your-corporate-security-requirements/ Topic || Training and Certification AWS Certification Triples its Testing Locations, Making it Even More Convenient to Get Certified | https://aws.amazon.com/about-aws/whats-new/2019/04/aws-certification-triples-testing-locations/ Announcing the New AWS Certified Alexa Skill Builder - Specialty Exam | https://aws.amazon.com/about-aws/whats-new/2019/04/new-awsexam-certified-alexa-skill-builder-specialty/
Luc Perkins joins the show to talk about "Seven Databases in Seven Weeks: A guide to modern databases and the NoSQL movement." We discuss a bit about each database: Redis, Neo4J, CouchDB, MongoDB, HBase, Postgres, and DynamoDB. Special Guest: Luc Perkins.
NoSQL databases like HBase are awesome! But why and when should you use them? How does a key-value store like HBase work? Today I am talking about exactly that.
Amandeep Khurana is CEO & Co-Founder of Cerebro Data. His company works with cloud-native data management and governance software for enterprises. Before this he was the Principal Solutions Architect at Cloudera Inc, where he worked with Cloudera’s customers to help them with their adoption and usage of the Hadoop ecosystem. Amandeep has been involved in the big data ecosystem since 2009 and worked at AWS prior to Cloudera. Amandeep has also co-authored HBase In Action, a book on developing applications on the popular NoSQL datastore, HBase. *** For Show Notes, Key Points, Contact Info, & Resources Mentioned on this episode visit here: Amandeep Khurana Interview. ***
In this edition of Roaring News, Dave covers the release of Apache Metron based HCP 1.3 and an HBase vs Cassandra benchmark battle. Jhon talks about some Spark tuning and scheduler inner-workings and finishes with a tale of a compliance kettle... Dave HCP 1.3 release https://hortonworks.com/blog/hortonworks-cybersecurity-platform-big-data-cybersecurity-solution/ https://docs.hortonworks.com/HDPDocuments/HCP1/HCP-1.3.0/bk_release-notes/content/ch01.html Battle of the Apache NoSQL heavyweights https://hortonworks.com/blog/hbase-cassandra-benchmark/ Jhon Spark Performance Tuning: A Checklist https://medium.com/zero-gravity-labs/spark-performance-tuning-a-checklist-abb3c80efb44 How the Spark Scheduler Work http://www.russellspitzer.com/2017/09/01/Spark-Locality/ A tale of a compliance kettle… https://cupfighter.net/2017/09/a-tale-of-a-compliance-kettle Please use the Contact Form on this blog or our twitter feed to send us your questions, or to suggest future episode topics you would like us to cover.
FINRA partnered with AWS product teams to leverage Amazon EMR and Amazon S3 extensively to build an advanced analytics solution. In this session, you'll hear how FINRA implemented a data lake on S3 to provide a single source for their big data analytics platform. FINRA ingests 75 billion records each day of stock market transactions, with an AWS storage footprint of 20 petabytes across S3 and Amazon Glacier. To deal with this workload, FINRA has architected a platform that separates storage from compute to manage capacity for each independently, leading to improved performance and cost effectiveness. You'll also learn how FINRA was able to leverage HBase on Amazon EMR to achieve significant benefits over running HBase on a fixed capacity cluster. FINRA was able to implement a system that seamlessly scales in response to data growth and can scale quickly in response to user traffic. By working with multiple clusters, FINRA can now isolate ETL and user query workloads and has achieved rapid, built-in disaster recovery capability by leveraging data storage on S3 to run from multiple AZs and across regions.
During this session Greg Brandt and Liyin Tang, Data Infrastructure engineers from Airbnb, will discuss the design and architecture of Airbnb's streaming ETL infrastructure, which exports data from RDS for MySQL and DynamoDB into Airbnb's data warehouse, using a system called SpinalTap. We will also discuss how we leverage Spark Streaming to compute derived data from tracking topics and/or database tables, and HBase to provide immediate data access and generate cleanly time-partitioned Hive tables.
In this episode of DatabaseCast, Mauro Pichiliani (@pichiliani), Wagner Crivelini (@wcrivelini) and guest Felipe Gasparini (felipe.gasparini@movile.com) swap recipes for soups and instant noodles that use Hadoop as an ingredient. In this episode you will learn what Hadoop is, along with HDFS, Map/Reduce, YARN, Pig, Hive, Spark, HBase, ZooKeeper, Mahout, Redshift, Storm and other software that makes up the Hadoop ecosystem. But don't forget to distribute your data and to worry about the link and about cloud failures. RELEASE: A book on the MongoDB database written by Mauro Pichiliani! See the book on Amazon and at Clube de Autores via the links below: https://www.amazon.com.br/dp/B01L4PERBC https://www.clubedeautores.com.br/book/216555--Introducao_ao_MongoDB Second class of the in-person machine learning with Python course taught by Mauro Pichiliani! See the link below for more details about the in-person introduction to machine learning course taught by our co-producer Mauro Pichiliani. Sign up soon, as seats are limited and about to run out! http://getitup.com.br/treinamentos/2a-turma-introducao-ao-machine-learning-com-python/ Visit the DatabaseCast channel on YouTube: https://www.youtube.com/channel/UC8EUZ3gYTxJi-gr4azFJGYA Check out the promotional price of the DatabaseCast Fluxo Matrix T-shirt with special fabric (traditional economy type) http://www.zazzle.com.br/camiseta_fluxo_matrix_t_shirt-235338811658509024 See the Datas SQL mug with the date-handling syntax for Oracle, SQL Server, MySQL and PostgreSQL. http://www.zazzle.com.br/caneca_datassql_branca_325ml-168900583784663517 Check out the book "Conversando sobre banco de dados" at: http://www.amazon.com.br/Conversando-sobre-Banco-Dados-publicados-ebook/dp/B00JV3B7VI/ and http://clubedeautores.com.br/book/126042--Conversando_sobre_banco_de_dados Check out the DatabaseCast T-shirts with fractal prints: http://www.zazzle.com.br/databasecast Don't forget to encourage us by leaving a comment at the end of this article, emailing databasecast@gmail.com, following our Twitter @databasecast, checking behind-the-scenes information on our Tumblr and liking our page on Facebook and Google+.
In the eighteenth episode of this podcast, your hosts Francesc and Mark interview Ian Lewis, a Google Cloud Platform Developer Advocate based in Tokyo about Bigtable. About Ian Ian is a Developer Advocate on the Google Cloud Platform team working out of Tokyo. Ian loves Python and Go and helps run the largest Python event in Japan, PyCon JP. Ian is also interested in Docker and Kubernetes and hopes to help Google Cloud Platform users achieve their highest potential. Cool thing of the week We're live at GCPNext with our mics! If you're around come say hi, and if not follow the event from one of the many local viewing parties or via the live stream. Interview Resources: Bigtable: A Distributed Storage System for Structured Data pdf Google Cloud Bigtable docs Differences between the HBase and Cloud Bigtable APIs docs Cloud Bigtable Pricing Question of the week How to limit what users can do on the resources of your project? Google Cloud Identity and Access Management docs
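This episode is hosted by Dingding, who invited Jiang Hong (江宏), founder of AVOS Cloud, to Teahour to talk about the story and technical architecture behind AVOS Cloud. AVOS Cloud is a cloud solution that provides backend services for mobile developers, including modules for storage, account management, social sharing, and push notifications. In the show, Jiang Hong shares the origins of AVOS Cloud and its current architecture, as well as his work experience on the Google Search Infrastructure team after completing his PhD. He is also an early initiator of the Chinese Clojure community, and most AVOS Cloud members come from that community; he shares some of his views on languages such as Clojure, Python, and C++. AVOS Cloud is currently hiring iOS developers, Android developers, and DevOps engineers; anyone interested can listen to Jiang Hong's introduction to the team and its open culture. Bonus easter egg in this episode: what you should learn to stand out among many competitors when looking for a job. AVOS Cloud BAAS Parse MixBit MySQL MongoDB HBase WebSocket Cloud on the cloud: AVOS Cloud's experience building cloud services on a cloud platform Vesper Vesper Sync Story Docker Clojure AVOS Cloud open resources Hard Thing About Things Building Off Screen insight Video interview: Slack founder Stewart Special Guest: Jiang Hong (江宏).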
Higher Availability, Increased Scale and Enhanced Security on Apache HBase: a talk with Michael Stack, Lars Hofhansl and Andrew Purtell
This episode is hosted by Kevin Wang, with Dingding Ye as co-host, and features Eric Redmond, author of "Seven Databases in Seven Weeks", for a conversation about databases. Eric is currently a core developer of Riak. Over two hours, Eric introduces the design ideas, strengths, and weaknesses of PostgreSQL, MongoDB, CouchDB, HBase, Cassandra, Redis, Riak, and Neo4J, and at the end shares some of the principles he personally considers when choosing a database. Basho Seven Languages in Seven Weeks MongoHQ CAP theorem PostgreSQL MongoDB CouchDB HBase Cassandra Redis Riak Neo4J Dynamo VoltDB JUNG Cypher Eventual Consistency Google Glass Explorer Program The Little Riak Book Antifragile: Things That Gain from Disorder Hashrocket Lunch n' Learn with Eric Redmond Hashrocket Guest Star Interview: Eric Redmond Special Guest: Eric Redmond.
Brian talks with Doug Hairfield (@knucklesandwich, Manager of Systems Engineering and Continuous Integration @Bronto) about how he’s transformed the way they do continuous deployment and “manage hardware as code”. They talk about how they transformed their environment from two deployments a year to 10-20 deployments a day, to deliver advanced marketing tools and analytics.
NoSQL is a terrible term for a collection of widely varied databases. You have key-value stores like Redis, Tokyo Cabinet, Memcached, etc. You also have document databases like CouchDB and MongoDB. Finally you have column-based systems like Cassandra. And still others like HBase. In a lot of cases, there are gems for these. I've used several of them, such as the gems for managing Cassandra, CouchDB, and MongoDB.
Recorded on November 19, 2010 Devoxx http://devoxx.com Nicolas Martignole Le Touilleur Express http://www.touilleur-express.fr/ http://twitter.com/nmartignole Michael Figuiere http://blog.xebia.fr/author/mfiguiere/ Xebia http://www.xebia.fr/ Paris JUG http://www.parisjug.org Java SE JSR project coins http://jcp.org/en/jsr/detail?id=334 JSR Lambda expression http://jcp.org/en/jsr/detail?id=335 JSR Java SE 7 http://jcp.org/en/jsr/detail?id=336 JSR Java SE 8 http://jcp.org/en/jsr/detail?id=337 Java Modules and Jigsaw http://openjdk.java.net/projects/jigsaw/ Devops Michael Cote http://www.redmonk.com/cote/ John Willis http://www.johnmwillis.com/about/ Devops http://en.wikipedia.org/wiki/DevOps Alternative languages Stephan Colebourne http://jroller.com/scolebourne/ Next Big Language http://www.jroller.com/scolebourne/entry/the_next_big_jvm_language1 NoSQL Cassandra http://cassandra.apache.org/ Project Voldemort http://project-voldemort.com/ Hadoop http://hadoop.apache.org/ HBase http://hbase.apache.org/ Infinispan http://jboss.org/infinispan MongoDB http://www.mongodb.org/ Hive http://hive.apache.org/ Pig http://pig.apache.org/ Performance Joshua Bloch http://en.wikipedia.org/wiki/Joshua_Bloch David Gageot http://blog.javabien.net/ AlgoDeal https://beta.algodeal.com/ Kirk Pepperdine Java Specialists Newsletter http://www.javaspecialists.eu/archive/archive.jsp Hudson https://hudson.dev.java.net/ Miscellaneous Groovy http://groovy.codehaus.org/ Tools of the week Parleys http://parleys.com/ Contact us Contact us via Twitter http://twitter.com/lescastcodeurs on the Google group http://groups.google.com/group/lescastcodeurs or on the website http://lescastcodeurs.com/ Flattr us at http://lescastcodeurs.com/
NoSQL HBase and Hadoop with Todd Lipcon from Cloudera