On this episode of Alexa's Input (AI), we're diving deep into the world of distributed databases with Patrick McFadin, Principal Technical Strategist at DataStax and a leading voice in the Apache Cassandra community. Patrick shares his journey into tech and how he became one of the foremost experts on Cassandra, an open-source, highly scalable NoSQL database that powers mission-critical applications across the globe.
We explore Cassandra's unique architecture, its approach to the CAP theorem, real-world use cases, and how it continues to evolve in the era of AI and real-time analytics. Whether you're a developer, architect, or just database-curious, this episode offers a clear, insightful look at how Cassandra handles scale, availability, and open-source innovation.
Links:
LinkedIn: https://www.linkedin.com/in/patrick-mcfadin-53a8046/
DataStax: https://www.datastax.com/our-people/patrick-mcfadin
X: https://x.com/patrickmcfadin
GitHub: https://github.com/pmcfadin
You can support this podcast on the creators page. Make sure to subscribe and follow Alexa's Input Twitter account to get notified when a new podcast episode comes out.
A slightly different episode of The Business of Open Source today! I spoke with Patrick McFadin and Mick Semb Wever about the relationship between Apache Cassandra and DataStax: how it was at the beginning and how the relationship has evolved over the years. We talked about:
- How there was a dynamic around Cassandra where many of the contributors ended up being sucked into the DataStax orbit, simply because it allowed those contributors to work on Cassandra full-time
- How there can be tensions between different stakeholders simply because everyone involved ultimately has their own interests at heart, and those interests are not always aligned
- How it is actually hard to really have open discussions about new features, and how often a new feature can be dropped into a project that clearly had been developed behind closed doors for some time, which sometimes created tension in the community
- How some open source projects are just too complex to be hobby projects: Cassandra is so complex that you won't become a code contributor unless you're working full-time on Cassandra, because that's the level of skill you need to keep up
- How the relationship between a company and a project often changes as the technology matures
- The importance of addressing tensions between company and community head-on, as adults, when they occur, as well as why you need to remember to treat people as humans and remember that they have good days, bad days, goals and interests
Patrick on LinkedIn
Mick on LinkedIn
In this episode 68, I had an engaging conversation with Patrick McFadin from DataStax on the aspects of vector databases and their impact on production level GenAI. We talked about how these vector databases are revolutionizing the way we store, access and analyze data making GenAI more efficient and effective than ever before. Stay tuned for more interesting conversations from the (XTrawAI.com) podcast series on machine learning and AI applications. --- Send in a voice message: https://podcasters.spotify.com/pod/show/raghu-banda/message
Glauber Costa (@glcst) is the founder of Turso and the co-creator of libSQL, an open source, open contribution fork of the database engine library, SQLite. Most people believe that SQLite is open-source software, but it actually exists in the public domain and doesn't accept external contributions. With their big fork, Glauber and his team have set out to evolve SQLite into a modern database with support for distributed data, an asynchronous interface, compatibility with WASM and Linux, and more. Subscribe to Contributor on Substack for email notifications, and join our Slack community! In this episode we discuss: Community reactions to forking SQLite How Glauber was spoiled by starting his career developing for Linux The controversial decision to launch libSQL without writing a single line of code The plan for incorporating upstream changes from SQLite Examples of how application developers need to move code “to the edge” Links: libSQL SQLite Turso LiteFS Litestream rqlite VLCN People mentioned: Avi Kivity (@AviKivity) Dor Laor (@DorLaor) Ben Johnson (@benbjohnson) Phillip O'Toole (@general_order24) Matt Tantaman (@tantaman) Other episodes: Scylla with Dor Laor Apache Cassandra with Patrick McFadin
Ruben Fiszel (@rubenfiszel) is the creator of Windmill, the open-source developer platform that lets users easily turn scripts into workflows and internal apps with auto-generated UIs. Windmill doesn't force engineers to change their coding style or adopt a convoluted API, and its low-code design makes it accessible to non-technical users. Tune in to find out how Windmill offers speed, performance and flexibility, while avoiding the limitations of rigid tools. Subscribe to Contributor on Substack for email notifications, and join our Slack community! In this episode we discuss: Why many engineers try to reinvent the wheel when it comes to workflow engines When Ruben first saw the need for a platform like Windmill while working at Palantir “Today is the nicest period to build open-source…” Ruben's incredible presence with support and bug fixes Windmill's generous open-source offerings and the future of the business Links: Windmill Retool Tokio Apache Airflow Apache Spark Other episodes: Prefect with Jeremiah Lowin Dagster with Nick Schrock Temporal with Maxim Fateev Temporal (Part 2) with Maxim Fateev and Dominik Tornow Apache Cassandra with Patrick McFadin
Hey Everyone, In this video I talk to Patrick McFadin from DataStax. We uncovered the new features in Cassandra 5.0 and discussed how ACID transactions are achieved in the new version. This is a deep dive into the features of Cassandra, consensus protocols, and how Accord differs from Paxos, Raft, Spanner and Calvin. Chapters: Cassandra 5.0 - ACID transactions and Vector Search 00:00 Introduction 01:45 List of features in the new Cassandra 04:51 Who needs ACID properties? 07:20 Why didn't Cassandra have ACID properties so far? 10:35 Why is the Accord consensus protocol well suited for Cassandra? 16:40 Let's take a gaming example to see how transactions work 21:55 What's happening behind the scenes in a transaction? 27:44 What happens when there are failures? 33:41 What is the upgrade to the new version going to look like? 35:48 How is the latency impacted by transactions? 40:23 What was missing in lightweight transactions? 42:24 Vector Search - What is it? How does it work? Previous episode on Cassandra: • Apache Cassandra ... Other playlists to watch: Distributed systems and Databases: • Distributed Syste... Software Engineering: • Software Engineering Distributed systems: • Distributed Systems Modern Databases: • Modern Databases Patrick's LinkedIn: https://www.linkedin.com/in/patrick-m... Astra: astra.datastax.com Cassandra: https://cassandra.apache.org/_/index.... I hope you liked this episode. If you did, please hit the like button, share it with your network and subscribe to the channel. Cheers, The GeekNarrator
Patrick McFadin, VP of Developer Relations at DataStax and Chief Evangelist for Apache Cassandra, joins the Hacking Open Source Business Podcast on Episode 26 to deep dive into open source. In this episode Patrick talks about:
- His time working in the open source database community, including Apache Cassandra's journey and upcoming developments.
- The role of evangelism and contributors in driving adoption and getting people to try your project.
- The challenges and mistakes companies make when commercializing open source, with lessons he has learned from his time in the database community.
- How new features are chosen, based on his experience with Cassandra, highlighting features such as transactions and the open-source tool Guardrails.
- Does open source innovation slow down as products mature?
- What is cloud-native anyway? And what does it mean in the database context?
- Building a diverse and global team by building trust.
- DevRel best practices, including how to measure DevRel success.
- Patrick McFadin's LinkedIn profile: https://www.linkedin.com/in/patrick-mcfadin-53a8046/
- Learn more about Apache Cassandra: https://cassandra.apache.org/
Check out our other interviews, clips, and videos: https://l.hosbp.com/Youtube
Don't forget to visit the open-source business community at: https://opensourcebusiness.community/
Visit our primary sponsor, Scarf, for tools to help analyze your #opensource growth and adoption: https://about.scarf.sh/
Subscribe to the podcast on your favorite app:
Spotify: https://l.hosbp.com/Spotify
Apple: https://l.hosbp.com/Apple
Google: https://l.hosbp.com/Google
Buzzsprout: https://l.hosbp.com/Buzzsprout
This special episode of Open||Source||Data features an interview with Patrick McFadin. Patrick has been a distributed systems hacker since he first plugged a modem into his Atari computer. Looking for adventure, he joined the US Navy, working on the Naval Tactical Data System (NTDS), which cemented his love of distributed systems. He is now an Apache Cassandra Committer, and is the Vice President of Developer Relations at DataStax. Sam catches up with Patrick at Data Day Texas to discuss his book Managing Cloud Native Data on Kubernetes, Cassandra Forward, and the future of Apache Cassandra.
-------------------
“I can now use my Parquet file in Iceberg or DuckDB, and this is data that I created with Cassandra. And we're not getting to the point where we have to reinvent an entire database. We can just connect the Lego parts together and if they're open, then I don't have these encumbrances. I'm not like, ‘Well, I can connect that if I call a salesperson and get a license.’ [...] That's what's exciting to me about Cassandra, the way that the ecosystem is evolving around Cassandra. It's not, ‘Cassandra's at the center, it's just a player.’ It's at the party.” – Patrick McFadin
-------------------
Episode Timestamps:
(01:06): What open source data means to Patrick
(02:11): Patrick discusses his book Managing Cloud Native Data on Kubernetes
(10:02): Patrick discusses Cassandra Forward
(11:09): The future of Apache Cassandra
-------------------
Links:
LinkedIn - Connect with Patrick
Cassandra Forward
From the DoK Day North America 2022 (https://youtu.be/YWTa-DiVljY) ABSTRACT Kubernetes has crossed the chasm, but what about stateful applications and databases? Join us for this panel discussion and learn more about how organizations are deploying different databases like PostgreSQL and Cassandra on Kubernetes, what are the benefits of running databases on Kubernetes, and how the ecosystem is working towards making these things boring, so you can focus on your applications! We will have an interactive discussion with the hosts and guests of the Kubernetes Bytes podcast, and open it up to the audience to ask questions and learn more about the what, why, and how about Databases on Kubernetes!
Apache Cassandra paved the way for today's biggest digital platforms to scale into the much bigger global scene. Patrick McFadin of Datastax is one of the people involved in this open-source project and saw first-hand how it burst into the world. He joins Ben Rometsch to share how Cassandra was developed, the many challenges they faced in its optimization, its relationship with Datastax, and how it changed database engine creation and data modeling. Patrick also talks about the measures they are implementing to continuously improve Cassandra and limit open-source access to ensure quality.
Hey Everyone, In this episode I invited Patrick McFadin, who is an expert in the world of Cassandra and Data Modelling. Patrick currently works for DataStax as a VP of Dev Rel. Patrick has given several tech talks on Cassandra and the ecosystem around it. We have covered the architecture of Cassandra in depth. Here's what we have covered: 00:00 Introduction 04:00 History of Cassandra 07:18 Patrick Apache Cassandra? 14:30 How do writes work in Cassandra? 21:30 How many copies are written on a single write? 25:44 How does replication work? 32:00 How do reads work? (Read consistency levels) 39:00 Why is Allow Filtering not recommended? 43:00 Data Modelling in Cassandra 50:45 Modeling a Chat Application 01:05:00 How does the CAP theorem fit Cassandra? 01:07:06 New features in Cassandra? References: Patrick McFadin: https://www.linkedin.com/in/patrick-m... Kaivalya Apte: https://www.linkedin.com/in/kaivalya-... Astra: astra.datastax.com Cassandra: https://cassandra.apache.org/_/index.... Webinar on Data Modeling: https://www.youtube.com/watch?v=4D39w... Playlist on Distributed Systems and Databases: https://www.youtube.com/playlist?list... I hope you enjoyed our discussion and learned from it. Please like, share and subscribe to the channel and keep supporting. Cheers, The GeekNarrator
I have Patrick McFadin back on the show to discuss using Apache Pulsar for distributed cloud-native streaming and how it fits into DataStax's plans and business goals. --- Send in a voice message: https://anchor.fm/chinchillasqueaks/message
Patrick McFadin joins the show to chat about his upcoming book, "Managing Cloud Native Data on Kubernetes", as well as the general practices behind this. #dataengineering #cloudnative #kubernetes --------------------------------- TERNARY DATA We are Matt and Joe, and we're "recovering data scientists". Together, we run a data architecture company called Ternary Data. Ternary Data is not your typical data consultancy. Get no-nonsense, no BS data engineering strategy, coaching, and advice. Trusted by great companies, both huge and small. Check out Fundamentals of Data Engineering (O'Reilly): https://amzn.to/3bftdoQ Subscribe to our newsletter, or check out our services at Ternary Data Site - https://ternarydata.com Please follow our LinkedIn page - https://www.linkedin.com/company/ternary-data/ Subscribe to our YouTube and smash the like button! - https://www.youtube.com/channel/UC3H60XHMp6BrUzR5eUZDyZg Thanks for your support!
The HOSS Talks FOSS is welcoming Patrick McFadin, Developer Relations at DataStax, for the second time. Catch up with open-source databases in general, design, and architecture. Learn more about Pulsar and all ongoing projects from Cassandra, and DataStax. Matt asked a few rapid-fire questions to know more about Patrick
Eric Anderson (@ericmander) returns to Temporal with co-founder Maxim Fateev (@mfateev) and principal engineer Dominik Tornow (@DominikTornow). When Maxim joined us in September of 2020, the company called their project a “workflow orchestrator.” Today, Temporal has grown in popularity and usability, but the terminology around that abstraction has changed. Tune in to track the evolution of what Maxim calls a genuinely “new category of software.” In this episode we discuss: New features and developments in the last 2 years The proper way to pronounce “Temporal” How Temporal guarantees that workflow execution actually runs to execution Describing Temporal as a new pair of glasses Replay, Temporal's first developer conference on August 25-26, in Seattle Links: Temporal Cadence Apache Cassandra Replay People mentioned: Samar Abbas (@samarabbas77) Other episodes: Temporal with Maxim Fateev Apache Cassandra with Patrick McFadin
https://go.dok.community/slack https://dok.community ABSTRACT OF THE TALK What about your streaming and analytic workloads? If you are all-in on Kubernetes you can't forget about these important parts of your infrastructure. I'll talk about the current state of the art. Why organizations may hesitate to go beyond deploying databases in Kubernetes and most important, some key things you need to be successful. BIO Patrick McFadin is the co-author of the upcoming O'Reilly book “Managing Cloud-Native Data on Kubernetes” He currently works at DataStax in Developer Relations and as a contributor to the Apache Cassandra project. Patrick has worked as Chief Evangelist for Apache Cassandra and as a consultant for DataStax, where he had a great time building some of the largest deployments in production. Previous to DataStax, he held positions as Chief Architect, Engineering Lead and Database DBA/Developer. KEY TAKE-AWAYS FROM THE TALK People should walk away with a better understanding of what it takes to deploy streaming and analytic workloads in Kubernetes.
This bonus episode features conversations from season 1 of the Open||Source||Data podcast. In this episode, you'll hear from Kelsey Hightower, Principal Engineer at Google Cloud; Lachlan Evenson, Principal Program Manager at Microsoft Azure; and Patrick McFadin, Head of Developer Relations at DataStax. Sam sat down with each guest to discuss Data on Kubernetes and how they're making progress on a stateless infrastructure.
You can listen to the full episodes from Kelsey Hightower, Lachlan Evenson, and Patrick McFadin by clicking the links below.
-------------------
Timestamps:
(00:39): Kelsey Hightower
(01:33): Lachlan Evenson
(02:06): Patrick McFadin
-------------------
Links:
Listen to Kelsey's episode
Listen to Lachlan's episode
Listen to Patrick's episode
In this episode, Ryan and Bhavin interview Patrick McFadin, VP of Developer Relations at Datastax, who is a co-author of the upcoming O'Reilly book “Managing Cloud-Native Data on Kubernetes” and a contributor to the Apache Cassandra project. The discussion dives into how K8ssandra helps users deploy Cassandra on Kubernetes clusters, and how customers are using Cassandra as the NoSQL, Distributed DB backend for their applications. We talk about the challenges, benefits, and best practices for running Cassandra on Kubernetes, and what users can look forward to in the near future. Show links: Patrick McFadin - LinkedIn - Twitter K8ssandra.io - https://k8ssandra.io Introduction to Cassandra - Crash Course - Youtube series - https://youtube.com/playlist?list=PL2g2h-wyI4SqCdxdiyi8enEyWvACcUa9R AWS Marketplace - https://aws.amazon.com/marketplace/pp/prodview-iy7gagaxm2foa Cassandra Discord community - https://discord.com/invite/qP5tAt6Uwt Data On Kubernetes - https://www.meetup.com/Data-on-Kubernetes-community/events/ Managing Cloud-Native Data on Kubernetes - https://portworx.com/resource/ebook-managing-cloud-native-data-on-kubernetes/ Cloud-Native News: Docker raises Series-C funding Garden.io raises Series A - $16M funding to combat waste in cloud development Are you Ready for K8s 1.24 NetApp acquires InstaClustr Spring4Shell - Zero Day Remote Code Execution Vulnerability Portworx Enterprise 2.10 Etcd v3.5.[0-2] is not recommended for production Announcing Postgres container apps: Easy deploy Postgres apps
It's another VOICES OF THE COMMUNITY Episode! This episode, Toby Bee is back with TWO interviews! First, Toby gets an EXCLUSIVE interview w/ Louis Ifer, the Legendary CEO of withoutrelent.com! Then, Toby interviews Patrick McFadin, VP of Developer Relations @ DataStax. We discuss the metaverse and the future of computing!
https://go.dok.community/slack https://dok.community/ ABSTRACT OF THE TALK Patrick is a Data on Kubernetes Community veteran. He did the very first session "Is k8s even ready for data?" in July 2020 and has seen the growth of the community since then. Jeff Carpenter is a Software Engineer at Datastax where he works on the Stargate.io project. If you have questions you want to be answered in the session, please feel free to message Bart on Slack.
How Kubernetes environments might be able to offer hooks for storage, databases and other sources of persistent data still is a question in the minds of many potential users. To that end, a new consortium called the Data on Kubernetes Community (DoKC) was formed to help organizations find the best ways of working with stateful data on Kubernetes.
In this latest episode of The New Stack Maker podcast, two members of the group discuss the challenges associated with running stateful workloads on Kubernetes and how DoKC can help.
Participants for this conversation were Melissa Logan, principal of Constantia.io, an open source and enterprise tech marketing firm, and director of DoKC; Patrick McFadin, vice president, developer relations and chief evangelist for the Apache Cassandra NoSQL database platform from DataStax; and Evan Powell, advisor, investor and board member, MayaData, a Kubernetes-environment storage-solution provider.
TNS Editor Joab Jackson hosted the podcast.
On this episode, we chat with Patrick McFadin. He is a technologist, member of the CNCF, author, and Vice President of Developer Relations at DataStax. We discuss the Cassandra database, release 4.0, Kubernetes, and technology foundations. Eric and Brandon had way too much fun with this one!
Destination Linux Network (https://destinationlinux.network)
Sudo Show Website (https://sudo.show)
Sponsor: Bitwarden (https://bitwarden.com/dln)
Sponsor: Digital Ocean (https://do.co/dln-mongo)
Sudo Show Swag (https://sudo.show/swag)
Contact Us:
DLN Discourse (https://sudo.show/discuss)
Email Us! (mailto:contact@sudo.show)
Sudo Matrix Room (https://sudo.show/matrix)
Apache Cassandra (https://cassandra.apache.org/_/index.html)
DataStax: What is NoSQL? (https://www.datastax.com/nosql)
Apache Cassandra 4.0 is Here (https://cassandra.apache.org/_/blog/Apache-Cassandra-4.0-is-Here.html)
DataStax (https://www.datastax.com)
Cloud Native Computing Foundation (https://www.cncf.io)
KubeCon 2021 (https://events.linuxfoundation.org/kubecon-cloudnativecon-north-america)
K8ssandra (https://k8ssandra.io)
Chapters
00:00 Intro
00:42 Welcome
01:56 Sponsor - Digital Ocean
03:00 Sponsor - Bitwarden
04:31 Meet Patrick McFadin
06:42 Who Is Patrick?
08:14 What Is Cassandra?
13:37 Cassandra 4.0!
17:52 DataStax Mission
26:38 What is the CNCF?
31:22 Data on Kubernetes
41:00 What's Next?
47:08 Wrap Up
Special Guest: Patrick McFadin.
Cassandra Database reaches 4.0. Nearly six years on from the release of Apache Cassandra 3.0, the community behind the popular open-source distributed database has announced the release of v4.0 of Apache Cassandra. Patrick McFadin, VP of Developer Relations at DataStax, and Ben Bromhead, CTO of Instaclustr, are with Swapnil Bhartiya to talk about it. The first issue to be addressed is the importance Cassandra holds in the modern world. McFadin starts off by talking about what workloads Cassandra is focused on, which are websites and mobile applications. McFadin says, "When you use a mobile app on your phone, you're probably using Cassandra." Since its inception, Cassandra has developed into a "really awesome, general-purpose database," adds Bromhead. More importantly, he makes mention of scalability when he says, "As people reach the limits of scalability or availability when it comes to some of the other databases out there (such as MySQL and PostgreSQL), we see developers reaching for Apache Cassandra." The discussion then shifts to the new features available in Cassandra v4.0. Bromhead talks about structural changes based around the Netty networking framework, which has enabled several really cool features, such as zero-copy streaming which allows an Apache Cassandra node to stream the data it's responsible for and leads to wire-level streaming speeds between nodes. Practically speaking, that means users can now run denser nodes. The 4.0 release also saw the deprecation of the Thrift protocol, in favor of the CQL protocol, which was a major change. As far as the upgrade process is concerned, version 4.0 should be considerably easier than previous releases. "If you had been upgrading Cassandra, before, like in the three and twos, there was always a long list of intermediate patches that you had to put into place, or you had to do some extra work mid-upgrade. Because of that, the developers decided it was of utmost importance to make it simple," explains McFadin. 
Bromhead calls out to developers and admins to "not stress too much about this one. Still run through all the track checks and the standard processes you do. But again, this has been pretty well battle-tested." To further highlight the upgrade process, McFadin mentions that the maintainers had a lot of discussion about the project and how improvements to the upgrade start at the developer level. McFadin says, "Instead of just having someone drop code in and ask everyone what they think, we have a proposal process. So you outline the change that you want to make, we have good discussions about it, and make some changes before there's actual code." Processes like this certainly go a long way in making a project more stable over time.
About Patrick McFadin
Patrick McFadin is the VP of Developer Relations at DataStax, where he leads a team devoted to making users of Apache Cassandra successful. He has also worked as Chief Evangelist for Apache Cassandra and consultant for DataStax, where he helped build some of the largest and most exciting deployments in production. Previous to DataStax, he was Chief Architect at Hobsons and an Oracle DBA/Developer for over 15 years.
Twitter: @PatrickMcFadin
LinkedIn: Patrick McFadin
DataStax website: datastax.com
K8ssandra: k8ssandra.io
Stargate: stargate.io
DataStax Astra: Cassandra-as-a-Service
Watch this episode on YouTube: https://youtu.be/-BcIL3VlrjE
This episode sponsored by CBT Nuggets and Fauna.
Transcript
Jeremy: Hi everyone, I'm Jeremy Daly and this is Serverless Chats. Today I'm chatting with Patrick McFadin. Hey Patrick, thanks for joining me.
Patrick: Hi Jeremy. How are you doing today?
Jeremy: I am doing really well. So you are the VP of Developer Relations at DataStax, so I'd love it if you could tell the listeners a little bit about yourself and what DataStax is all about.
Patrick: Sure. Well, I mean mostly I'm just a nerd with a cool job. I get to talk about technology a lot and work with technology. So DataStax, we're a company that was founded around Apache Cassandra, just supporting and making it awesome. And that's really where I came to the company. I've been working with Apache Cassandra for about 10 years now. I've been a part of the project as a contributor.
But yeah, I mean mostly data infrastructure has been my life for most of my career. I did this in the dotcom era, back when it was really crazy when we had dozens of users. And when that washed out, I'm like, oh, then real scale started and during that period of time I worked a lot in just trying to scale infrastructure. It seems like that's been what I've been doing for like 30 years it seems like, 20 years, 20 years, I'm not that old. Yeah.
But yeah, right now, I spend a lot of my time just working with developers on what's next in Kubernetes and I'm part of CNCF now, so yeah. I just can't seem to stay in one place.
Jeremy: Well, so I'm super interested in the work that DataStax is doing because I have had the pleasure/misfortune of managing a Cassandra ring for a start-up that I was at. And it was a very painful process, but once it was set up and it was running, it wasn't too, too bad. I mean, we always had some issues here and there, but this idea of taking a really good database, because Cassandra's great, it's an excellent data store, but managing it is a nightmare and finding people who can manage it is sort of a nightmare, and all that kind of stuff. And so this idea of taking these services and DataStax isn't the only one to do this, but to take these open-source services and turn them into these hosted solutions is pretty fantastic. So can you tell me a little bit more, though? What this shift is about? This moving away from hosting your own databases to using databases as a service?
Patrick: Yeah. Well, you touched on something important. You want to take that power, I mean Cassandra was a database that was built in the scale world. It was built to solve a problem, but it was also built by engineers who really loved distributed computing, like myself, and it's funny you say like, "Oh, once I got it running, it was great," well, that's kind of the experience with most distributed databases, is it's hard to reason around having, "Oh, I have 100 mouths to feed now. And if one of them goes nuts, then I have to figure it out."
But it's the power, that power, it's like stealing fire from the gods, right? It's like, "Oh, we could take the technology that Netflix and Apple and Facebook use and use it in our own stuff." But you got to pay the price, the gods demand their payment.
And that's something that we've been really trying to tackle at DataStax for a couple of years now, actually three, which is how ... Because the era of running your own database is coming to an end. You should not run your own database. And my philosophy as a technologist is that proper, really important technology like your data layer should just fade into the background and it's just something you use, it's not something you have to reason through very much.
There's lots of technology that's like that today. How many times have you ... When was the last time you managed your own memory in your code?
Jeremy: Right. Right. Good point. I know.
Patrick: Thank god, huh?
Jeremy: Exactly.
Patrick: Whew.
Jeremy: But I think that you make a really good point, because you do have these larger companies like Facebook or whatever that are using these technologies and you mentioned data layers, which I don't think I've worked for a single company, I don't think I actually ... I founded a start-up one time and we built a data layer as well, because it's like, the complexity of understanding the transaction models and the routing, especially if you're doing things like sharding and all kinds of crazy stuff like that, hiding that complexity from your developers so that they can just say, "I need to get this piece of information," or, "I need to set this piece of information," is really powerful.
But then you get stuck with these data layers that are bespoke and they're generally fragile and things like that, so how is it that you can take data as a service and maybe get rid of some of that, I don't know, some of that liability I guess?
Patrick: Yeah. It's funny because you were talking about sharding and things like that. These are things that we force on developers to reason through, and it's just cognitive load. I have an app to get out, and I have some business desire to get this application online, the last thing I need to worry about is my sharding algorithm.
Jeremy, friends don't let friends shard.
Jeremy: Right. That's right. That's a good point.
Patrick: But yeah, I mean I think we actually have all the parts that we need and it's just about, this is closer than you think. Look at where we've already started going, and that is with APIs, using REST. Now GraphQL, which I think is deserving its hotness, is starting to bring together some things that are really important for this kind of world we want to live in. GraphQL is uni-fettering data and collecting and actual queries, it's a QL, and why they call it Graph, I have no idea. But it gives you this ability to have this more abstract layer.
I think GraphQL will, here's a prediction is that it's going to be like the SQL of working with data services on the internet and for cloud-native applications. And so what does that mean? Well, that means I just have to know, well, I need some data and I don't really care what's underneath it. I don't care if I have this field indexed or anything like that. And that's pretty exciting to me because then we're writing apps at that point.
Jeremy: Right. Yeah. And actually, that's one of the things I really like about GraphQL too is just this idea that it's almost like a universal data access layer in a sense because it does, you still have to know it, you have to know what you're requesting if you're an end developer, but it makes it easier to request the things that you need and have those mutations set and have some of those other things standardized across the company, but in a common format because isn't that another problem?
Where it's like, I'm working with company A and I move to company B maybe and now company B is using a different technology and a different bespoke data layer and some of these other things.

So, I think data as a service for one, maybe with GraphQL in front of it, is a great way to have this alignment across companies, or I guess, just makes it easier for developers to switch and start developing right away when they move into a new company.

Patrick: Yeah, and this is a concept I've been trying to push pretty hard and it's driven by some conversations I've had with some friends who are engineering leaders and they have this common desire. We want to have a zero day dev, which is the first day that someone starts, they should be producing production code. And I don't think that's crazy talk, we can do this, but there's a lot of things that are in front of it. And the database is one of them. I think that's one of the first things you do when you show up at company X is like, "Okay, what database are you using? What flavor of SQL or gRPC or CQL, Cassandra Query Language? What's the data model? Quick, where's that big diagram on the wall with my ERD? I got to go look at that for a while."

Jeremy: How poorly did you structure your Git repositories? Yeah.

Patrick: Yeah, exactly. It's like all these things. And no, I would love to see a world where the most troublesome part of your first day is figuring out where the coffee and the bathroom are, and then the rest of it is just total, "Hey, I can do this. This is what I get paid to do."

Jeremy: Right. Yeah. So that idea of zero day developer, I love that idea and I know other companies are trying to do that, but what enables that? Is it getting the idea of having to understand something bespoke? Is it getting that off of the table? Or not having to deal with the low-level database aspect of things? I mean because APIs, I had this conversation with Rob Sutter, actually, a couple weeks ago.
And we were talking about the API economy and how everything is moving towards APIs. And even data, it was around data as well.

So, is that the interface, you think, of the future that just says, "Look, trying to interface directly with a database or trying to work with some other layer of abstraction just doesn't make sense, let's just go straight from code right to the data, with a very simple API interface?"

Patrick: Yeah, I think so. And it's this idea of data services because if you think of if you're doing React, or something like front-end code, I don't want to have a driver. Drivers are a total impediment. It's like, driver hell can be difficult at large organizations, getting the matching right. Oh, we're using this database so you have to use this driver. And if you don't, you are now rejected at the gate. So it's using HTTP protocols, but it's also things like when you're using React or Angular, Vue, whatever you're using on the front-end, you have direct access.

But most times what you're needing is just a collection or an object. And so just do a get, "I need this thing right now. I'm doing a pick list. I need your collection." I don't need a complicated setup and spend the first three days figuring out which driver I'm using and make sure my Gradle file is just perfect. Yeah. So, I think that's it.

Jeremy: Yeah. No, I'd be curious how you feel about ORMs, or O-R-Ms, certainly for relational databases, I know a lot of people love them. I can't stand them. I think it adds a layer of abstraction and just more complexity where I just want access to the database. I want to write the query myself, and as soon as you start adding in all this extra stuff on top of it to try to make it easier, I don't know, it just seems to mess it up for me.

Patrick: All right. So yeah, I think we have an accord. I am really not a fan of ORMs at all. And I mean this goes back to Hibernate. Everyone's like, "Oh, Hibernate's going to be the end of databases." No, it's not.
Oh yeah, it was the end of the database at the other side because it would create these ridiculous queries. It's like, why is every query a full table scan?

Jeremy: Exactly.

Patrick: Because that's the way Hibernate wanted it. Yeah. I actually banned Hibernate at one company I was working at. I was Chief Architect there and I just said, "Don't ever put Hibernate in our production." Because I had more meetings about what it was doing wrong than what it was doing right.

Jeremy: Right. Right. Yeah. No, that sounds, yeah.

Patrick: Is that a long answer? Like, no.

Jeremy: No, I've had the same experience where certain ORMs you're just like, no. Certain things, you can't do this because it's going to one, I think it locks you in in a sense, I mean there's all kinds of lock-in in the cloud, and if you're using a data service or an API or you're using something native in AWS, or IBM Cloud, you're still going to be locked in in some way, but I do feel like whenever you start going down that path of building custom things, or forcing developers to get really low level, that just builds up all kinds of tech debt, right? That you eventually are going to have to work down.

Patrick: Well, it's organizational inertia. When you start getting into this, when you start using annotations in Hibernate where you're just cutting through all the layers and now you're way down in the weeds, try to move that. There's a couple of companies that I've worked with now that are looking at the true reality of portability in their data stores. Like, "Oh, we want to move from one to a different, from a key value to a document without developers knowing." Well, how do you get to that point?

Jeremy: Right. Yeah.

Patrick: And it's just, that's not giving access to those things, first of all, but this is that tech debt that's going to get in your way.
We're really good, technologists, we're really good at just racking up the charges on our tech debt credit card, especially whenever we're trying to get things out the door quickly. And I think that's actually one of the problems that we all face. I mean, I don't think I've ever talked to a developer who was ahead of schedule and didn't have somebody breathing down their neck.

Jeremy: Very true.

Patrick: You take shortcuts. You're like, "We've got to ship this code this week. Skip the annotations and go straight into the database and get the data you need." Or something. You start making trade-offs real fast.

Jeremy: What can we hard code that will just get us past.

Patrick: Yeah. Is it green? Ship it. Yeah.

Jeremy: Yeah, no, I totally, totally agree. All right. So let's talk a little bit more about, I guess, skillsets and things like that. Because there are so many different databases out there. Cassandra is just one and if you're a developer working just at the driver level, I guess, with something like Cassandra, it's not horrible to work with. It's relatively easy once a lot of these things are set up for you.

Same is true of MongoDB, or I mean, DynamoDB, or any of these other ones where the interface to it isn't overly difficult, but there's always some sort of something you want to build on top of it to make it a little bit easier. But I'm just curious, in terms of learning these different things and switching between organizations and so forth, there is a cognitive load going from saying, "I'm working on Cassandra," to going to saying, "I'm working on DynamoDB," or something like that. There's going to be a shift in understanding of how the data can be brought back, what the limitations are, just a whole bunch of things that you kind of have to think about. And that's not even including managing the actual thing. That's a whole other thing.

So, hiring people, I guess, or hiring developers, how much do we want developers to know?
Are you on board with me where it's like, I mean I like understanding how Cassandra works and I like understanding how DynamoDB works, and I like knowing the limits, but I also don't want to think about them when I'm writing code.

Patrick: Yeah. Well, it's interesting because Cassandra, one of the things I really loved about Cassandra initially was just how it works. As a computer scientist, I was like, "This is really neat." I mean, my degree field is in distributed computing, so of course, I'm going to nerd out.

Jeremy: There you go.

Patrick: But that doesn't mean that it doesn't have mass appeal because it's doing the thing that people want. And I think that's going to be the challenge of any properly built service layer. I think I've mentioned to you before we started this, I work on a project called Stargate. And Stargate is a project that is meant to build a data layer on top of databases. And right now it's with Cassandra. And it's abstracting away some of the things that are harder to understand or reason through.

For instance, with distributed computing, we're trying to reduce the reliance on coordination. There is a great article about this by Pat Helland about how coordination is the last really expensive thing that we have in development. Memory, CPU, super cheap. I can rent that all day long. Coordination is really, really hard, and I don't expect a new programmer to understand, to reason through coordination problems. "Oh, yeah, the just in time race conditions," and things like that.

And I think that's where distributed computing, it's super powerful, but then whenever people see what eventual consistency is, they freak out and they're like, "I just want my SQLite on my laptop. It's very safe." But that's not going to get you there. That's not a global database, it's not going to be able to take you to a billion users. Come on, don't cut ...

Jeremy: Maybe you don't need to be.

Patrick: ... your apps short, Jeremy.
You're going to have a billion users.

Jeremy: You should strive for it, at least, is how I feel about it. So that's, I guess, the point I was trying to get to is that if the developers are the ones that you don't want learning some of this stuff, and there's ways to abstract it away again, going like we talked about data as a service and APIs and so forth. And I think that's where I would love to see things shifting. And as you said earlier, that's probably where things are going.

But if you did want to run your own database cluster, and you wanted to do this on your own, I mean you have to hire people that know how to do this stuff. And the more I see the market heating up for this type of person, there are very, very few specialists out there that are probably available. So how would you even hire somebody to run your Cassandra ring? They probably all work at DataStax.

Patrick: No, not all of them. There's a few that work at Target and FedEx, Apple, the biggest Cassandra users in the world. Huawei. We just found out lately that Huawei now has the biggest cluster on the planet. Yeah. They just showed up at ApacheCon and said, "Oh yeah, hold my beer." But I mean, you're right, it's a specialized skillset and one of the things we're doing at DataStax, we feel, yeah, you should just rent that. And so we have Astra, which is our database as a service.

It's fully compatible with open-source Cassandra. If you don't like it, you can just take it over and use open-source. But we agree and we actually can run Cassandra cheaper than you can, and it's just because we can do it at scale.
And right now Astra, the way we run it is truly serverless, you only pay for what you need, and that's something that we're bringing to the open-source side of Cassandra as well, but we're getting Cassandra closer to Kubernetes internally.

So if you don't want to think about Kubernetes, if you don't want to think about all that stuff, you can just rent it from us, or you could just go use it in open-source, either way. But you're right. I mean, a 2020s skillset should not be, "Get better at running Cassandra." I think those days should be, leave it to, if you want to go work at DataStax and run Cassandra, great, we're hiring right now, you will love it. You don't have to. Yeah.

Jeremy: So the idea of it being open-source, so again, I'm not a huge fan of this idea of vendor lock-in. I think if you want to run on AWS Lambda, yeah, most of what you can do can only run on AWS Lambda, but changing the compute, switching that over to Azure or switching that over to GCP or something like that, the compute itself is probably not that hard to move, right? I think especially depending on what you're doing, setting up an entire Kubernetes cluster just to run a few functions is probably not worth it. I mean, obviously, if you've got a much bigger implementation, that's a little different.

But with data, data is just locked in. No matter where you go, it is very hard to move a lot of data. So even with the open-source flair that you have there, do you still see a worry about lock-in from a data side?

Patrick: Yeah. And it's becoming more of a concern with larger companies too, because options, #options. There was a pretty famous story a few years ago where the CEO of Target said, "I am not paying Amazon any more money," and they just picked up shop and moved from AWS to Google Cloud. And the CEO made a technical decision. It was like everybody downstream had to deal with that.
And I think that luckily Target's a huge Cassandra shop and they were just like, "Okay, we'll just move it over there."

But the thing is that you're right, I mean, and I love talking about this because back when cloud was first starting and I was talking about it and thinking about it, just what do the clouds promise you? Oh, you get commodity scale of CPU and network and storage. And that's what they want to sell you because that's what they're building. Those big buildings in Northern Virginia, they are full of compute, network, and storage, but the thing is they know they need to hook you in and the way that they're hooking you in, there's some services that are really handy, they're great, but really the hook is the data.

Once you get into the database, the bespoke database for the cloud, one of the features of that database is it will not connect to any other database outside of that cloud, and they know that. I mean, and this is why I really strongly am starting to advocate this idea of this move towards data on Kubernetes as a way where open-source gets to take back the cloud. Because now we're deploying these virtual data centers and using open-source technology to create this portability. So we can use the compute, network, and storage of Google, Amazon, Azure, on-prem, wherever, doesn't matter.

But you need to think of like, "All right. How is that going to work?" And that's why we're like, "If you rent your Cassandra from DataStax with Astra, you can also use the open-source Cassandra as well." And if we aren't keeping you happy, you should feel totally fine with moving it to an open-source workload. And we're good with that. One way or the other, we would love for you to use a database that works for you.

Jeremy: Right. And so this Stargate project that you're working on, is that the one that allows you to basically route to multiple databases?

Patrick: That's the dream. Right now it just does Cassandra, but there's been some really interesting ...
There's some folks coming out of the woodwork that really want to bring their database technology to Stargate. And that's what I'm encouraged by. It's an open-source project, Stargate.io, and you can contribute any of the connectors for the underlying data store, but if we're using GraphQL, if you're using gRPC, if you're using REST, the underlying data store is really somewhat irrelevant in that case. You're just doing gets and puts, or gets and sets. Gets and puts, yeah, that's right. Gets, sets, puts, it's a lot of words.

Jeremy: Whatever words. Yeah. Exactly.

Patrick: That's what I love about standards, Jeremy, there's so many to pick from.

Jeremy: Right, because there are ... Exactly, which standard do you choose? Yeah. So, because that's an interesting thing for me too, is just this idea of, I mean, it would be great to live in a perfect little cloud where you could say like, "Oh, well AWS has all the services I need. And I can just keep all my stuff there, whatever." But best of breed services, or again, the cost of hosting something in AWS maybe if you're hosting a Cassandra cluster there, versus maybe hosting it in GCP or maybe hosting it with you, you said you could host it cheaper than those could, or that we could host it ourselves.

And so I do think that there is ... and again, we've had this conversation about multi-cloud and things like that where it's not about agnostic, it's not about being cloud agnostic, it's about using the best of breed for any service that you want to use. And APIs seem to be the way to get you there. So I love this idea of the Stargate project because it just seems like that's the way where it could be that standard across all these different clouds and onto all these different databases, well I mean, right now Cassandra, but eventually these other ones. I don't know, that seems like a pretty powerful project to me.

Patrick: Well, the time has come. It's cloud native ...
I work a lot with CNCF and cloud-native data is a kind of emerging topic. It's so emerging that I'm actually in the middle of writing a book, an O'Reilly book on it. So, yeah. Surprise. I just dropped it. This just in.

Yeah, because I can see that this is going to be the future, but when we build cloud-native, cloud applications, cloud-native applications, we want scale, we want elasticity, and we want self-healing. Those are the three cloud-native things that we want. And that doesn't give us a whole lot ... So if I want to crank out a quick React app, that's what I'm going to use. And Netlify's a great example, or Vercel, they're creating this abstraction layer. But Netlify and Vercel are both working, they've been partnering with us on the Stargate project, because they're seeing like, "Okay, we want to have that very light touch, developers just come in and use it," in building cloud-native applications.

And whenever you're building your application, you're just paying for what you use. And I think that's really key, not spinning up a bunch of infrastructure that you get a monthly bill for. And that bill can be expensive.

Jeremy: It seems crazy. Doesn't it seem crazy nowadays? Actually provisioning an EC2 instance and paying for it to run even if it does nothing. That seems crazy to me.

Patrick: There are start-ups around the idea of finding the instance that's running that's costing you money that you're not using.

Jeremy: Which is crazy, isn't it? It's crazy. All right. So let's go a little bit more into standards, because you mentioned standards. So there are standards now for a lot of things, and again, GraphQL being a great example, I think. But also from a database perspective, looking at things like T-SQL and developers come into an organization and they're familiar with MySQL, or they're familiar with PostgreSQL, whatever it is.
Or maybe they're familiar with Cassandra or something like that, but I think most people, at least from what I've seen, have been very, very comfortable with the T-SQL approach to getting data. So, how do you bring developers in and start teaching them or getting them to understand more of that NoSQL feel?

Patrick: I think it's already happened, it's just the translation hasn't happened in a lot of minds. When you go to build an application, you're designing your application around the workflows your application's going to have. You're always thinking about like, "I click on this. I go there." I mean, this is where we wireframe out the application. At that point, your database is now involved and I don't think a lot of folks know that.

It's like, at every point you need to put data or get data. And I think this is where we've taught could be anybody building applications, which makes it really difficult to be like, "No, no, no, start with your data domain first and build out all those models. And then you write your application to go against those models." And I'll tell you, I've been involved in a few of these application boot camps, like JavaScript boot camps and things, they don't go into data modeling. It's just not a part of it.

Jeremy: Really?

Patrick: And I think this is that thing where we have to acknowledge like, "Yeah, we don't really need that anymore as much, because we're just building applications." If I build a React app, and I have a form and I'm managing the authentication and I click a button and then I get profile information, I just described every database interaction that I need and the objects that I need. And I'm going to put my user profile at some point, I'm going to click my ID and get that profile back as an object. Those are the interactions that I need. At no point did I say, "And then I'm going to write select from where." No, I just need to get that data.

Jeremy: And I love thinking about data as objects anyways.
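
To make the put-a-profile, get-it-back-by-ID interaction concrete, here is a minimal sketch. It assumes a hypothetical `ProfileService` whose in-memory dict stands in for whatever data service sits underneath; the class, the field names, and the sample values are illustrative only and do not come from any real API.

```python
from dataclasses import dataclass, asdict
from typing import Dict, Optional


@dataclass
class Profile:
    """Illustrative user profile object; the fields are assumptions."""
    user_id: str
    name: str
    email: str


class ProfileService:
    """Hypothetical data service: a dict plays the role of the database."""

    def __init__(self) -> None:
        self._store: Dict[str, dict] = {}

    def put(self, profile: Profile) -> None:
        # Persist the whole object under its ID; no query language involved.
        self._store[profile.user_id] = asdict(profile)

    def get(self, user_id: str) -> Optional[Profile]:
        # Return the stored object, or None when the ID is unknown.
        raw = self._store.get(user_id)
        return Profile(**raw) if raw is not None else None


service = ProfileService()
service.put(Profile(user_id="u1", name="Ada", email="ada@example.com"))
profile = service.get("u1")
```

The application code only ever sees `put` and `get` on objects, which is the whole point being made in the conversation: the storage behind the service could change without the caller noticing.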
It makes more sense, rather than rows of spreadsheets essentially that you join together, describing an object even if it's got nested data, like a document form or things like that, I think makes a ton of sense. But is SQL, is it still relevant do you think? I mean, in the world we're moving into? Should I be teaching my daughters how to write T-SQL? Or would I be wasting my time?

Patrick: Yeah. Well, yes and no. Depends on what your kid's doing. I think that SQL will go to where it originally started and where it will eventually end, which is in data engineering and data science. And I mean, I still use SQL every once in a while, Bigtable, that sort of thing, for exploring my data. I mean for an analytics career or reporting data and things like that, SQL is very expressive. I don't see any reason to change that. But this is a guy who's been writing SQL for a million years.

But I mean, that world is still really moving. I mean, like a Presto and Snowflake and all these, Redshift, they all use Bigtable, they all use SQL to express the reporting capabilities. But ... And I think this is how you and I got sucked into this is like, well that was the database that we had, so we started using reporting languages to build applications. And how'd that work out?

Jeremy: Yeah. Well, it certainly didn't scale very well, I can tell you that, going back to sharding, because that is always something that was very hard to do. So I guess, I get the point that essentially if you're going to be in the data sciences and you actually need to analyze that data and maybe you do need to do joins, or maybe you need to work with big data in a way, that's a specialized aspect of it and I think people could dabble in that if they were just regular developers and they didn't want to go too deep.

But it sounds like the bigger, or the end goal here, maybe altruistic, is to just give people access to data.
So even if they don't know SQL or they don't know something complex, just make it so that whatever data is there, anybody, at whatever level they are, can consume it.

Patrick: Yeah. And move fast with the thing that you're building. Actually, I use a Facebook term, but Facebook does do this. Internally there's a system called Occhio that provides gets and puts for your data, but it abstracts things like geographics and things like that. But the companies that are trying to move quickly, they understood this a long time ago. If you have to reason through, "Am I doing a full table scan? Is that an efficient inner join?" If you have to reason through that, you're not moving fast anymore.

Jeremy: Right. Right. All right. Cool. All right, so let's talk about Astra a little bit more and this whole idea of, because Astra is the serverless version, the hosted version, the serverless version of Cassandra, right? Through DataStax?

Patrick: Right. And ...

Jeremy: Did I get that right?

Patrick: You got it right. And so it gives you full access. You could do port 9042 if you still want to use a driver, but it gives you access via GraphQL, REST, and there's also a document API. So if you just want to persist your JavaScript objects and then pull your JSON back out, it does full documents. So it emulates what a MongoDB or DocumentDB does. But the important thing, and this is the somewhat revolutionary side of this, and again, this is something that we're looking to put into open-source, is the serverless nature of it.

You only pay for what you use. And when you want to create a Cassandra database, we don't even call it a Cassandra database on the Astra panel anymore. We just create a database. You give it a name. You click. And it's ready. And it will scale infinitely. As long as we can find some compute and network for you to use somewhere, it'll just keep scaling and that's kind of that true portion of serverless that we're really trying to make happen.
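
As a rough illustration of what driverless, document-style access over HTTP can look like, the sketch below only builds request descriptions and sends nothing. The base URL, the path layout, and the bearer-token header are all assumptions made for the example; they are not the actual Astra or Stargate API.

```python
import json
from typing import Any, Dict, Optional

# Hypothetical endpoint, used purely for illustration.
BASE_URL = "https://data.example.invalid/v1"


def document_request(
    method: str,
    collection: str,
    doc_id: str,
    token: str,
    document: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Describe an HTTP call that stores or fetches one JSON document."""
    # Serialize the document only when there is one to send (a write).
    body = None if document is None else json.dumps(document, sort_keys=True)
    return {
        "method": method,
        "url": f"{BASE_URL}/collections/{collection}/{doc_id}",
        "headers": {
            # Assumed auth scheme: an access token sent as a bearer token.
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        "body": body,
    }


put_req = document_request("PUT", "profiles", "u1", "my-token", {"name": "Ada"})
get_req = document_request("GET", "profiles", "u1", "my-token")
```

Either dictionary could be handed to any HTTP client; nothing in the calling code depends on a database driver or on knowing which store answers the request.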
And for me, that's exciting because finally, all that power that I feel like I've been hoarding for a long time is now available for so many more people.

And then if you do a million writes per second for 10 minutes and then you turn it off, you only pay for that little short amount of time. And it scales back. You're not paying a persistent charge forever.

Jeremy: I'm just curious from a technical implementation, because I'm thinking about PTSD or nightmares back from my days running Cassandra, and so I'm just trying to think how this works. Is it a shared tenancy model? Or is there a way to do single tenancy if you wanted that as a service?

Patrick: Under the covers, yes, it is multi-tenant, but the way that we are created ... so we had to do some really interesting engineering inside. So my RCO's going to kill me if I talk about this, but hey, you know what, Jeremy? We're friends, we can do this. He's like, "Don't talk about the underlying architecture." I'm talking about the underlying architecture. The thing that we did was we took Cassandra and we decomposed it into microservices mostly. That's probably, it's still Cassandra, it's just how we run it makes it way more amenable to doing multi-tenant and scale in that fashion where the queries are separated from the storage and things that are running in the background, like if you're familiar with Cassandra because it's log-structured storage, you have to do compactions and things like that, all that's just kind of on the side. It doesn't impact your query.

But it gives us the ability to, if you create a database and all of a sudden you just hammer it with a million writes per second, there's enough infrastructure in total to cover it. And then we'll spin up more in the back to cover everything else. And then whenever you're done, we retract it back. That's how we keep our costs down.
But then the storage side is separated and away from the compute side, and the storage side can scale its own way as well.

And so whenever you need to store a petabyte of Cassandra data, you're just storing, you're just charged for the petabyte of storage on disk, not the thousandth of a cluster that you just created. Yeah.

Jeremy: No. I love that. Thank you for explaining that though, because that is, every time I talk to somebody who's building a database or running some complex thing for a database, there's always magic. Somebody has to build some magic to make it actually work the way everyone hopes it would work. And so if anybody is listening to this and is like, "Ah, I'm just getting ready to spin up our own Cassandra ring," just think about these things because these are the really hard problems that are great to have a team of people working on that can solve this specific problem for you and abstract all of that crap away.

Patrick: Yeah. Well, I mean it goes back to the Dynamo paper, and how distributed databases work, but it requires that they have a certain baseline. And they're all working together in some way. And Cassandra is a shared-nothing architecture. I mean you don't have a leader node or anything like that. But like I said, because that data is spread out, you could have these little intermittent problems that you don't want to have to think about. Just leave that to somebody else. Somebody else has got a Grafana dashboard that's freaking out. Let them deal with it. But you can route around those problems really easily.

Jeremy: Yeah. No, that's amazing. All right. So a couple more technical questions, because I'm always curious how some of these things work. So if somebody signs up and they set up this database and they want to connect to it, you mentioned you could use the driver, you mentioned you can use GraphQL or the REST API, or the Document API. What's the authentication method look like for that?

Patrick: Yeah.
So, it's a pretty standard thing with tokens. You create your access tokens, so when you create the database, you define the way that you access it with the token, and then whenever you connect to it, if you're using JavaScript, there's a couple of collection libraries that just have that as one of the environment variables.

And so it's pretty standard for connecting to cloud databases now where you have your authentication token. And you can revoke that token at any time. So for instance, if you mistakenly commit that into your Git ...

Jeremy: Say GitHub. We've never done that before.

Patrick: No judging. You can revoke it immediately. But it also gives you RBAC, the controls over whether it's a read or write or admin, if you need to create new tables and that sort of thing. You can give that level of access to whatever that token is. So, very simple model, but then at that point, you're just interacting through a REST call or using any of the HTTP protocols or CQL protocol.

Jeremy: And now, can you create multiple tokens with different levels of permission or is it all just token gives you full access?

Patrick: No, it's multiple levels of protection and actually that's probably the best way to do it, for instance, if your CI/CD system has the ability to, it should be able to create databases and tear them down, right? That would be a good use for that, but if you have, for instance, a very basic application, you just want it to be able to read and write. You don't want to change any of the underlying data structures.

Jeremy: Right. Right.

Patrick: That's a good layer of control, and so you can have all these layers going on one single database. But you can even have read-only access too, for ... I think that's something that's becoming more and more common now that there's reporting systems that are on the side.

Jeremy: Right. Right.
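
The token model described here (several tokens per database, each with its own level of access, any of them revocable at once) can be sketched as follows. This is an illustration under stated assumptions rather than how Astra implements it: the `TokenRegistry` class is hypothetical, and treating the three roles as a strict hierarchy in which admin includes write and write includes read is a simplification chosen for the example.

```python
import secrets
from enum import IntEnum
from typing import Dict


class Role(IntEnum):
    # Assumed ordering: a higher role includes everything below it.
    READ = 1
    WRITE = 2
    ADMIN = 3


class TokenRegistry:
    """Hypothetical registry mapping access tokens to roles."""

    def __init__(self) -> None:
        self._tokens: Dict[str, Role] = {}

    def issue(self, role: Role) -> str:
        # Generate an unguessable token and remember its role.
        token = secrets.token_urlsafe(16)
        self._tokens[token] = role
        return token

    def revoke(self, token: str) -> None:
        # Revoking an unknown token is a no-op.
        self._tokens.pop(token, None)

    def allows(self, token: str, needed: Role) -> bool:
        # Unknown or revoked tokens are denied everything.
        role = self._tokens.get(token)
        return role is not None and role >= needed


registry = TokenRegistry()
ci_token = registry.issue(Role.ADMIN)       # CI/CD: may create and drop tables
app_token = registry.issue(Role.WRITE)      # application: read and write only
report_token = registry.issue(Role.READ)    # reporting system: read-only
```

If `app_token` were committed to a repository by mistake, `registry.revoke(app_token)` would cut off that one token while the CI and reporting tokens keep working.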
Good.

Patrick: No, you can only read from the database.

Jeremy: And what about data backups or exporting data or anything like that?

Patrick: Yeah, we have a pretty rudimentary backup now, and we will probably, we're working on some more sophisticated versions of it. Data backup in Cassandra is pretty simple because it's all based on snapshots because if you know Cassandra the database, the data you write is immutable and that's a great way to start when you come to backup data. But yeah, we have a rudimentary backup system now where you have to, if you need to restore your data, you need to put in a ticket to have it restored at a certain point.

I don't personally like that as much. I like the self-service model, and that's what we're working towards. And with more granularity, because with snapshots you can do things like snapshot, this is one of the things that we're working on, is doing like a snapshot of your production database and restoring it into a QA cluster. So, works for my house, oh, try it again. Yeah.

Jeremy: That's awesome. No, so this is amazing. And I love this idea of just taking that pain of managing a database away from you. I love the idea of just make it simple to access the data. Don't create these complex things where people have to build more, and if people want to build a data access layer, the data access layer should maybe just be enforcing a model or something like that, and not having to figure out if you're on this shard, we route you to this particular port, or whatever. All that stuff is just insane, so yeah, I mean maybe go back to kind of the idea of this whole episode here, which is just, stop using databases. Start using these data services because they're so much easier to use. I mean, I'm sure there's concerns for some people, especially when you get to larger companies and you have all the compliance and things like that. I'm sure Astra and DataStax has all the compliance things and things like that.
But yeah, just any final words, advice to people who might still be thinking databases are a good idea?
Patrick: Well, I have an old 6502 on a breadboard, which I love to play with. It doesn't make it relevant. I'm sorry. That was a little catty, wasn't it?
Jeremy: A little bit, but point well taken. I totally get what you're saying.
Patrick: I mean, I think that it's, what do we do with the next generation? And this is one of the things, this will be the thought that I leave us with is, it's incumbent on a generation of engineers and programmers to make the next generation's job easier, right? We should always make it easier. So this is our chance. If you're currently working with database technology, this is your chance to not put that pain on the next generation, the people that will go past where you are. And so, this is how we move forward as a group.
Jeremy: Yeah. Love it. Okay. Well Patrick, thank you so much for sharing all this and telling us about DataStax and Astra. So if people want to find out more about you or they want to find out more about Astra and DataStax, how do they do that?
Patrick: All right. Well, plenty of ways at www.datastax.com and astra.datastax.com if you just want the good stuff. Cut the marketing, go to the good stuff, astra.datastax.com. You can find me on LinkedIn, Patrick McFadin. And I'm everywhere. If you want to connect with me on LinkedIn or on Twitter, I love connecting with folks and finding out what you're working on, so please feel free. I get more messages now on LinkedIn than anything, and it's great.
Jeremy: Yeah. It's been picking up a lot. I know. It's kind of crazy. LinkedIn has really picked up. It's ...
Patrick: I'm good with it. Yeah.
Jeremy: Yeah. It's ...
Patrick: I'm really good with it.
Jeremy: It's a little bit better format maybe. So you also have, we mentioned the Stargate project, so that's just Stargate.io. We didn't talk about the K8ssandra project.
Is that how you say that?
Patrick: Yeah, the K8ssandra project.
Jeremy: K8ssandra? Is that how you say it?
Patrick: K8ssandra. Isn't that a cute name?
Jeremy: It's K-8-S-S-A-N-D-R-A.io.
Patrick: Right.
Jeremy: What's that again? That's the idea of moving Cassandra onto Kubernetes, right?
Patrick: Yeah. It's not Cassandra on Kubernetes, it's Cassandra in Kubernetes.
Jeremy: In Kubernetes. Oh.
Patrick: So it's like in concert and working with how Kubernetes works. Yes. So it's using Cassandra as your default data store for Kubernetes. It's a very, actually it's another one of the projects that's just taking off. KubeCon was last week from where we're recording now, or two weeks ago, and it was just a huge hit because again, it's like, "Kubernetes makes my infrastructure easier to run, and Cassandra is hard, put those together. Hey, I like this idea."
Jeremy: Awesome.
Patrick: So, yeah.
Jeremy: Cool. All right. Well, if anybody wants to find out about that stuff, I will put all of these links in the show notes. Thanks again, Patrick. Really appreciate it.
Patrick: Great. Thanks, Jeremy.
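Editor's sketch: the token model Patrick describes in this conversation (a token read from an environment variable, sent with each REST call, and scoped to read-only, read/write, or admin) might look roughly like the Python below. This is a minimal illustration under stated assumptions, not DataStax's client code: the URL, the `DB_APPLICATION_TOKEN` variable name, the `X-Cassandra-Token` header name, and the three role names are all placeholders chosen for the example, and nothing is actually sent over the network.

```python
import os
import urllib.request

# Every name below is an assumption made for illustration, not a documented API.
BASE_URL = "https://example.invalid/api/rest/v2/keyspaces/demo/users"
TOKEN_HEADER = "X-Cassandra-Token"       # assumed header name
TOKEN_ENV_VAR = "DB_APPLICATION_TOKEN"   # assumed environment variable name

# Hypothetical roles mirroring the levels mentioned in the conversation:
# reporting systems read, basic applications read and write, CI/CD can change schema.
ROLE_PERMISSIONS = {
    "read_only": frozenset({"read"}),
    "read_write": frozenset({"read", "write"}),
    "admin": frozenset({"read", "write", "schema"}),
}


def is_allowed(role: str, operation: str) -> bool:
    """Return True if a token with this role may perform the operation."""
    return operation in ROLE_PERMISSIONS.get(role, frozenset())


def token_from_env(var_name: str = TOKEN_ENV_VAR) -> str:
    """Read the token from the environment so it never lives in source control."""
    token = os.environ.get(var_name)
    if not token:
        raise RuntimeError(f"{var_name} is not set")
    return token


def build_request(url: str, token: str, method: str = "GET") -> urllib.request.Request:
    """Build (but do not send) an HTTP request that carries the token in a header."""
    return urllib.request.Request(url, headers={TOKEN_HEADER: token}, method=method)


if __name__ == "__main__":
    os.environ.setdefault(TOKEN_ENV_VAR, "example-token")
    request = build_request(BASE_URL, token_from_env())
    print(request.get_method(), request.full_url)
    print("read_only may write:", is_allowed("read_only", "write"))
```

Keeping the token in the environment rather than in code is what makes the "revoke it immediately" story workable: a leaked token can be revoked and replaced without touching the application source.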
Eric Anderson (@ericmander) and Patrick McFadin (@PatrickMcFadin) delve into the history of Apache Cassandra, the open-source NoSQL database born and bred around cloud over a decade ago. Patrick is the VP of Developer Relations at DataStax, and a member of the Cassandra Project Management Committee. On today's episode, Patrick shares his philosophy on developer advocacy and experience in open-source. In this episode we discuss: Behind the NoSQL explosion that made Cassandra the darling of the valley; Comparing different eras of commercializing open-source, then and now; How Patrick became a pioneer in evangelizing and community-building; The two kinds of people to recruit for developer relations; Why Patrick says open-source is going to “start eating clouds”. Links: Apache Cassandra, DataStax, DataStax Astra. People mentioned: Avinash Lakshman (@HedvigEng), Prashant Malik (@pmalik), Adrian Cockcroft (@adrianco), Kelsey Hightower (@kelseyhightower). Other episodes: Chef with Adam Jacob.
Further democratizing Apache Cassandra, DataStax (https://www.datastax.com) has announced a serverless option for Cassandra via DataStax Astra, its Cassandra-as-a-Service offering. “You can now deploy Cassandra without having to think about capacity,” said Patrick McFadin, VP, Developer Relations at DataStax. “It’s a huge shift for Cassandra as developers don’t have to worry about over-provisioning their capacity, which will have a direct impact on what they are going to pay for that capacity. Now in a pure serverless fashion, they can pay what they use.” In this episode of TFiR Insights, we dug deeper into this announcement. Here are some of the topics we discussed: • What is serverless Astra? • Why did DataStax feel a need to offer this now? • How is it more about giving developers greater control over what they pay? • What are the direct benefits for developers? • The evolution of DataStax Astra over time • Discussion on Apache Cassandra, the heart and soul of Astra. Patrick McFadin is the VP of Developer Relations at DataStax, where he leads a team devoted to making users of Apache Cassandra successful. He has also worked as Chief Evangelist for Apache Cassandra and consultant for DataStax, where he helped build some of the largest and most exciting deployments in production. Prior to DataStax, he was Chief Architect at Hobsons and an Oracle DBA/Developer for over 15 years.
The two open-source veterans of over 20 years sit down and talk about coming to open source, growing the community, the changes to open source licensing, and of course the new role of SRE/DBRE and how DBAs are stepping up and turning into awesome contributors in this new role. We will also hear what is new and exciting with Apache Cassandra and what the folks over at DataStax are working on!
Today I had a fun and engaging conversation with Patrick McFadin, the VP of Developer Relations at DataStax. Patrick is one of the leading experts on Apache Cassandra, which is an open-source database management system. Patrick and I also discussed the changing attitudes towards open source from the ’90s to today. He also gives examples of how open source has reset the economics of building software in the past and goes on to explain how Kubernetes is again resetting the economics. Another interesting topic we discussed is the growing importance of Site Reliability Engineers in the database world. Link to transcript: https://openteams.com/landing/podcast-ep-14/ Follow OpenTeams on: Twitter: https://twitter.com/openteamsinc LinkedIn: https://www.linkedin.com/company/openteams Facebook: https://www.facebook.com/openteamsinc Instagram: https://www.instagram.com/openteams/ Support this podcast by liking this video and subscribing to OpenTeams’ YouTube channel: https://bit.ly/2ZBPGnt You can also show support for this podcast by leaving a rating and review for the podcast on Apple Podcasts. Link to podcast channel: https://apple.co/3itAzne Thanks for listening!
With data comes DBAs and with Kubernetes comes SREs. Listen in as Patrick McFadin and Sam discuss what’s in store in 2021 for Data on Kubernetes, how experienced DBA roles can evolve into very effective SREs, and why today is THE day to learn Kubernetes. See omnystudio.com/listener for privacy information.
In our inaugural DOKC meet-up, Patrick McFadin, Developer Advocate at DataStax, emphasized the challenges of running Cassandra on Kubernetes, concluding at one point that “Kubernetes might not be ready for Cassandra.” Since that meeting, the use of the open-source Container Attached Storage project OpenEBS as simple, high-performance, per-workload storage for Cassandra has proliferated. The Cassandra Operator from DataStax, aka “CaSS”, has progressed as well. So where are we now? Is CaSS on CAS working well? What is the future of collaboration between DataStax / Cassandra and MayaData / OpenEBS? Is Kubernetes now ready for Cassandra? What are the emerging technologies that might shape storage and Kubernetes in the near future? What are the reasons people avoid running DBs on Kubernetes? What makes it easier?
Joining us this week is Patrick McFadin, Chief Evangelist for Apache Cassandra and VP, Developer Relations at DataStax. About DataStax: DataStax helps companies compete in a rapidly changing world where expectations are high and new innovations happen daily. DataStax is an experienced partner in on-premises, hybrid, and multi-cloud deployments and offers a suite of distributed data management products and cloud services. We make it easy for enterprises to deliver killer apps that crush the competition. More than 400 of the world’s leading enterprises including Capital One, Cisco, Comcast, Delta Airlines, eBay, Macy’s, McDonald’s, Safeway, Sony, and Walmart use DataStax to build modern applications that can be deployed across any cloud.
Hi, Spring fans! In this episode [Josh Long (@starbuxman)](http://twitter.com/starbuxman) talks about all the wonderful shows he's been at, the epic new support for Kubernetes-ready native images in Spring Boot 2.3 and Spring Boot 2.4, and then he talks to DataStax's own [Patrick McFadin (@PatrickMcFadin)](http://twitter.com/PatrickMcFadin), a legend in the JVM and Cassandra communities, and an all-around amiable gent.
Kubernetes is becoming boring and that's a good thing — it's what's on top of Kubernetes that counts. In this The New Stack Analysts podcast, TNS Founder & Publisher Alex Williams asked KubeCon attendees to join him for a short “stack” at our Virtual Pancake & Podcast to discuss “What's on your stack?” The podcast featured guest speakers Janakiram MSV, principal analyst, Janakiram & Associates, Priyanka Sharma, general manager, CNCF, Patrick McFadin, chief evangelist for Apache Cassandra and vice president, developer relations, DataStax and Bill Zajac, regional director of solution engineering, Dynatrace. The group passed the virtual syrup and talked Kubernetes, which may be stateless, but also means there's plenty of room for sides.
In this episode we talk about Cassandra 4.0 with Patrick McFadin.
Our inaugural Data on Kubernetes event kicked off with Patrick McFadin, VP of Developer Relations at DataStax, talking about his vision for the future of doing data on k8s. Kubernetes has been a great solution for deploying application infrastructure. Trying to manage your data with the same control plane has been less than ideal. This has been even more true when using distributed databases like Apache Cassandra. Once you get past the storage and stateful sets, you still have a lot to do. Let's have a frank talk about the new opportunities to make Kubernetes ready for data. Patrick McFadin is the VP of Developer Relations at DataStax, where he leads a team devoted to making users of Apache Cassandra successful. He has also worked as Chief Evangelist for Apache Cassandra and consultant for DataStax, where he helped build some of the largest and most exciting deployments in production. Patrick is active in the Apache Cassandra project and a member of the CNCF TOC. The new Data on k8s Community (DOKC) is an openly governed and self-organizing group of curious and experienced operators and engineers concerned with running data-intensive workloads on Kubernetes. We will have weekly meetups on Tuesdays at 5pm UK / 9am PST and everything will be recorded and put up on YouTube and podcast land. I am currently looking for speakers who can talk about things such as operators, databases, multicloud/hybrid, or anything else that could be interesting for the SRE engineering crowd. Join our Slack: https://join.slack.com/t/dokcommunity/shared_invite/zt-g3ui5r0g-jDKz5dhh2W1ayElqwKYYAg Follow us on Twitter: @dokcommunity Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/ Connect with Patrick on LinkedIn: https://www.linkedin.com/in/patrick-mcfadin-53a8046/ This meetup is sponsored by MayaData, which helped start the DOK.community and remains an active supporter.
MayaData sponsors two Cloud Native Computing Foundation (CNCF) projects, OpenEBS - the leading open-source container attached storage solution - and Litmus - the leading Kubernetes native chaos engineering project, which was recently donated to the CNCF as a Sandbox project. As of June 2020, MayaData is the sixth-largest contributor to CNCF projects. Well-known users of MayaData products include the CNCF itself, Bloomberg, Comcast, Arista, Orange, Intuit, and others. Check out more info at https://mayadata.io/ Come learn about running Cassandra in their hands-on workshop: https://www.datastax.com/events/cassandra-workshop-series
DataStax sponsored this podcast. About 10 years ago the tech industry rejected the single relational database, and demanded a way to scale, at scale, with distributed systems. This movement saw the birth of Riak, Cassandra, MongoDB, and Tokyo Cabinet, all to better manage distributed data. “All those databases that grew from: ‘Hey, we have a scaled data problem and this single relational database is not solving it.’ And I think that was the first time we really had to solve scale problems and use distributed technology to make it work,” said Patrick McFadin, chief evangelist for Apache Cassandra and vice president of developer relations at DataStax. McFadin joined colleague Kathryn Erickson, head of strategy and product at DataStax, for this episode of The New Stack Makers. They sat down with Alex Williams, founder and publisher of The New Stack, to reflect on how the industry has seen a sudden explosion of scale and how that's now guiding the next steps toward fully self-service architecture.
Trick question of the week: What’s in the future of developer relations, especially after COVID-19? The pandemic forced all companies to shift strategies mid-Q1. But how does this affect developer relations? Events, the cause of burnout for many developer marketers, advocates and evangelists, are currently out of the mix. So is the opportunity to reach developers 1-on-1, face-to-face. Many software applications have stepped in to fill this gap. What does the future hold and how can we connect to developers when we can’t reach them in person? Patrick McFadin, VP of Developer Relations at DataStax, joins us in this episode to discuss connecting with developers in the COVID era and post-COVID era, open source and the challenges developer relations professionals face. In this episode, we introduce the “Let’s talk Data” section where we pick one graph from the DevRelx Trends page and analyze the results. Patrick's pick for this week: programming language communities and growth (https://www.devrelx.com/trends?lightbox=dataItem-k9mq59cz1?utm_source=PodcastDescription&utm_medium=PatrickMcFadin&utm_campaign=Desctiption). Patrick McFadin is the Vice President of Developer Relations at DataStax. He is one of the leading experts on Apache Cassandra and data modeling techniques and has helped build some of the largest and most exciting deployments in production. Prior to DataStax, he was Chief Architect at Hobsons and an Oracle DBA/Developer for over 15 years.
In this episode we have an in-depth discussion on Apache Cassandra with Patrick McFadin, VP Developer Relations at DataStax. We had a great conversation with Patrick about topics ranging from basic NoSQL topics to the more in-depth applicability of Apache Cassandra. Apache Cassandra really is one of the most used NoSQL solutions out there and this information should be useful for anybody working in Big Technology! Apache Cassandra: More information on Apache Cassandra can be found on the Apache Cassandra website and on the website of DataStax. DataStax Accelerate: Meet the creators of Apache Cassandra at a DataStax Accelerate event near you. They will be in San Diego starting May 11th and in London from June 2nd. We do have discount codes for our listeners who want to attend these events. Use the promo code ELEPHANT20 for a 20% discount on the ticket price! Unfortunately, we've been informed that these DataStax Accelerate events have been cancelled due to COVID-19. More information is available at https://www.datastax.com/accelerate Please use the Contact Form on this blog or our Twitter feed to send us your questions, or to suggest future episode topics you would like us to cover.
In this episode I speak with Patrick McFadin about DataStax and their plans with Apache Cassandra, ponder on the release of MIDI 2.0 after 37 years, plane security, what happened in Iowa, and more. https://www.datastax.com/ https://cassandra.apache.org/ chrischinchilla.com
Join us on this podcast as Patrick McFadin and Denise Gosnell talk about the popularity of Python and the real surge it has seen in the past few years. Pandas, TensorFlow and killer robots? Data Engineering is standardizing around Python and making lives better. Find out about what it is, where to get started and maybe a few things you never knew.
Patrick McFadin and Jeff Carpenter discuss how to know when you've run out of capacity on your relational database and provide a recipe for migrating to Cassandra: 1) adapting your data model, 2) adapting your application, 3) planning your Cassandra deployment, 4) executing the migration.
We're dusting off our crystal ball to give you our top 8 predictions in the world of databases for the coming new year. Join us on this podcast as Patrick McFadin breaks down how 5G, Kubernetes, Graph, Apache Cassandra 4.0 and so much more will impact your database in 2020! You won't want to miss this!
Matija Gobec shares with Patrick McFadin why he started working on a new compaction strategy for Apache Cassandra and how the Cassandra community can collaborate more effectively to introduce new capabilities such as partition-based compaction. Highlights: 0:00 - Patrick welcomes Matija Gobec to the show. 1:30 - Matija introduces the concept of compaction in Cassandra and some of the challenges with existing compaction strategies 2:53 - Existing strategies include size-tiered Compaction (the default) and […]
Patrick McFadin connects with Carlos Rolo from Pythian at ApacheCon NA to recap the talk Carlos gave on some of the most common issues he sees in production Cassandra clusters and how to avoid them. You can listen to the full talk at https://feathercast.apache.org/2019/09/12/day-to-day-with-cassandra-the-weirdest-and-complex-situations-we-found-carlos-rolo/ 0:00 - Patrick welcomes Carlos back to the show to recap his talk at ApacheCon about some of the worst cases he's seen with Cassandra clusters. New use […]
Patrick McFadin and Jeff Carpenter recap their favorite talks and hallway conversations from ApacheCon North America 2019 including DataStax announcements from the keynote by DataStax CTO Jonathan Ellis. Highlights: 0:00 - Enough talk - let's fight! 1:53 - Next Generation Cassandra Conference (NGCC) - the conference within a conference. Thanks to the Apache Software Foundation for making space. 3:42 - NGCC was focused around Cassandra 4.0 including the release of the first alpha and the testing that will be […]
With 2018 coming to a close, we are excited to see what 2019 has in store for the database world. Join us on this podcast as Patrick McFadin, VP of Developer Relations at DataStax, gives us his take on the future of databases, microservices/containers, AI and machine learning, and more!
After running six DataStax Developer Days events in multiple cities around the world, Patrick McFadin and Cedrick Lunven take a few minutes to wrap up, share some developer feedback and trending technical subjects, and look ahead to next year. Highlights: 00:15 Introduction of Patrick and Cedrick 00:30 Last Developer Day of six in Paris 01:15 Cedrick gives overview of the Developer Days 02:00 Wanted to hear from the developers attending and their use cases 02:45 What's on the minds of developer […]
Patrick McFadin is the VP of Developer Relations at DataStax, where he leads a team devoted to making users of DataStax products successful. He has also worked as Chief Evangelist for Apache Cassandra and consultant for DataStax, where he helped build some of the largest and most exciting deployments in production. Prior to DataStax, he was Chief Architect at Hobsons and an Oracle DBA/Developer for over 15 years.
Patrick introduces new DataStax evangelist Adron Hall and they discuss the challenges of running distributed data across multiple cloud providers.
DataStax Head of Developer Relations Patrick McFadin and DataStax co-founder Jonathan Ellis talk Cassandra’s beginnings, how DataStax grew out of it, and not hating on Python.
In this week’s show we interview DataStax’s Patrick McFadin about data, development and containerised services. We also review the latest book from Tom Peters, The Excellence Dividend, and choose the next title for the WB40 Bookclub: John Doerr’s Measure What Matters. You can find the long list for the next book here. Don’t […]
Patrick McFadin (@patrickmcfadin) Chief Evangelist at DataStax (@Datastax) joins us this week on The Hot Aisle to continue our education on NoSQL, Cassandra, Apache Spark, and so much more. Your hosts Brent Piatti (@brentpiatti) and Brian Carpenter (@intheDC) look to learn a bit about how Cassandra got here (Thanks Facebook!), the inspirations behind Cassandra (like the […]