Podcasts about Apache Cassandra

77 PODCASTS
159 EPISODES
43m AVG DURATION
1 MONTHLY NEW EPISODE
Nov 17, 2025 LATEST

POPULARITY

[Chart: popularity by year, 2017 to 2024]

Best podcasts about Apache Cassandra

Distributed Data Show

16 episodes with Apache Cassandra

Bigdata Hebdo

11 episodes with Apache Cassandra

airhacks.fm podcast with adam bien

6 episodes with Apache Cassandra

DataSnax Podcast

10 episodes with Apache Cassandra

The Cloud Pod

3 episodes with Apache Cassandra

Azure Friday (HD) - Channel 9

3 episodes with Apache Cassandra

Software Engineering Daily

2 episodes with Apache Cassandra

Cassandra Community Podcasts

10 episodes with Apache Cassandra

The New Stack Podcast

4 episodes with Apache Cassandra

Latest podcast episodes about Apache Cassandra

pre:Invent Drumbeat

AWS Morning Brief

Nov 17, 2025 · 5:55

AWS Morning Brief for the week of November 17th, with Corey Quinn.
Links:
- Custom domain names for VPC Lattice resources
- AWS Lambda networking over IPv6
- AWS Control Tower supports automatic enrollment of accounts
- Amazon Braket Notebook Environments Now Support CUDA-Q Natively
- Amazon MSK Express brokers now support Intelligent Rebalancing for 180 times faster operation performance
- Amazon Keyspaces now supports logged batches for atomic, multi-statement operations
- Amazon CloudWatch Composite Alarms adds threshold-based alerting
- Amazon Keyspaces (for Apache Cassandra) now supports Logged Batches
- Amazon Elastic Kubernetes Service gets independent affirmation of its zero operator access design
- AWS Fault Injection Service (FIS) launches new test scenarios for partial failures
- AWS CloudFormation Hooks adds granular invocation details for Hooks invocation summary
- Introducing structured output for Custom Model Import in Amazon Bedrock

amazon cloud aws devops invent hooks drumbeat corey quinn apache cassandra amazon bedrock last week in aws

Ep162: Improving Search for Generative AI Developers with DataStax and AWS

AWS for Software Companies Podcast

Oct 24, 2025 · 27:31

Learn how DataStax transformed customer feedback into a hybrid search solution that powers Fortune 500 companies through their partnership with AWS.
Topics Include:
- AWS and DataStax discuss how quality data powers AI workloads and applications.
- DataStax built on Apache Cassandra powers Starbucks, Netflix, and Uber at scale.
- Their TIL app collects outside-in customer feedback to drive product development decisions.
- Hybrid search and BM25 kept trending in customer requests for several months.
- Customers wanted to go beyond pure vector search, not specifically BM25 itself.
- Research showed hybrid search improves accuracy up to 40% over single methods.
- ML-based re-rankers substantially outperform score-based ones despite added latency and cost.
- DataStax repositioned their product as a knowledge layer above the data layer.
- Developer-first design prioritizes simple interfaces and eliminates manual data modeling headaches.
- Hybrid search API uses simple dollar-sign parameters and integrates with Langflow automatically.
- AWS PrivateLink ensures security while Graviton processors boost efficiency and tenant density.
- Graviton reduced total platform operating costs by 20-30% with higher throughput.
Participants:
- Alejandro Cantarero - Field CTO, AI, DataStax
- Ruskin Dantra - Senior ISV Solution Architect, AWS, Amazon Web Services
See how Amazon Web Services gives you the freedom to migrate, innovate, and scale your software company at https://aws.amazon.com/isv/
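To make the hybrid search idea in this episode concrete, here is a minimal sketch of one common score-based way to merge a keyword (BM25) ranking with a vector-search ranking: reciprocal rank fusion. This is an illustration only. The episode description does not say which fusion or re-ranking method DataStax uses, and the document ids, the function name, and the constant `k=60` are assumptions chosen for the example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into a single ranking.

    Each document earns 1 / (k + rank) from every list it appears in,
    so documents ranked well by both retrievers rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id to keep output stable.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from two independent retrievers.
bm25_ranking = ["doc_a", "doc_b", "doc_c"]      # keyword (BM25) order
vector_ranking = ["doc_c", "doc_a", "doc_d"]    # vector-similarity order

fused = reciprocal_rank_fusion([bm25_ranking, vector_ranking])
print(fused)
```

A score-based merge like this is cheap because it needs only the two rank orders. The ML-based re-rankers mentioned in the topics would instead score each query and document pair with a model, which is where the added latency and cost come from.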

#083 | DataStax

Of Je Stopt De Stekker Er In

Sep 9, 2025 · 31:01

In this episode we dive into the world of DataStax with Michel de Ru, pre-sales specialist and expert in enterprise data solutions. DataStax, now part of IBM, offers powerful technologies built around the Apache Cassandra database, including Astra DB, Astra Streaming, the Hyper-Converged Database, DataStax Enterprise, and the innovative Langflow.
Michel walks us through what exactly DataStax is, why it is such an important player in the data landscape, and what its solutions can mean for organizations that want to move faster with data, scalability, and AI. We discuss the unique value of their products, the link with artificial intelligence, and how these technologies contribute to modern, intelligent automation.

ai ibm denk ru datastax apache cassandra datastax enterprise

Kodsnack 654 - German-style strings, with Matt Topol

Kodsnack in English

Aug 5, 2025 · 53:20

Fredrik talks to Matt Topol about Arrow and how the Arrow ecosystem is evolving. Arrow is an open source, columnar in-memory data format designed for efficient data processing and analytics, which means passing data between things without needing to transform it, and ideally even without needing to copy it. What makes the ecosystem grow, and why is it very cool to have Arrow on the GPU? What is the connection between Arrow, machine learning, and Hugging Face? Matt emphasizes the value of open standards: even when they work with or within more closed systems, they can help open things up and bring about more modular solutions, so that developers can focus on doing their core area really well. This episode can be seen as a follow-up to episode 567, where Matt first joined to discuss everything Arrow. Recorded during Øredev 2024.
Thank you Cloudnet for sponsoring our VPS! Comments, questions or tips? We are @kodsnack, @tobiashieta, @oferlund and @bjoreman on Twitter, have a page on Facebook and can be emailed at info@kodsnack.se if you want to write longer. We read everything we receive. If you enjoy Kodsnack we would love a review in iTunes! You can also support the podcast by buying us a coffee (or two!) through Ko-fi.
Links: Matt; Matt’s Øredev 2023 talks: State of the Apache Arrow ecosystem: How your project can leverage Arrow! and Leveraging Apache Arrow for ML workflows; Previous episodes with Matt; Øredev 2024; Matt’s Øredev 2024 talks - on Arrow ADBC and Composable and modular data systems; ADBC - Arrow database connectivity; Arrow; Snowflake; Snowflake drivers for ADBC; Bigquery; The Bigquery driver; Microsoft Fabric; Duckdb; Postgres; SQLite; Arrow flight - RPC framework for services based on Arrow data; Arrow flight SQL; Microsoft Power BI; Velox; Apache datafusion; Query planning; Substrait - query IR; Polaris; Libcudf; Nvidia RAPIDS; Pytorch; Tensorflow; Arrow device interface; DLPack - in-memory tensor structure; Tensors; Nanoarrow; Voltron data - where Matt used to work. He’s now at Columnar; Theseus GPU compute engine; The composable data management system manifesto; Support us on Ko-fi!; Matt’s book - In-memory analytics with Apache Arrow; Spark; Spark connect; RPC; UDFs; Photon; Datafusion; Apache Cassandra; ODBC; JDBC; R - programming language for statistical computing; Hugging face; Ray; Stringview - “German-style strings”; Scaling up with R and Arrow - the book on using Arrow with R
Titles: It’s gotten a lot bigger; The bones of it are in the repo; (Powered by ADBC); Individual compute components; Feed it substrate; Where the ecosystem is going; Arrow on the GPU; The data stays on the GPU; A forced copy; Leverage that device interface; Without forcing the copy; Shy of that last mile; Turtles all the way down; The guy who said yes; German-style strings
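For readers curious about the "German-style strings" in the episode title, the following is a minimal sketch of the string-view idea as it is commonly described for Arrow: every string is represented by a fixed 16-byte view, short strings (12 bytes or fewer) live entirely inside the view, and longer strings keep a 4-byte prefix in the view plus a buffer index and offset into a separate data buffer. The function names and the single shared buffer are assumptions made for illustration; this is not Arrow's implementation or API.

```python
import struct

INLINE_MAX = 12   # longest string stored entirely inside the 16-byte view
VIEW_SIZE = 16


def build_views(strings):
    """Encode strings as 16-byte views plus one shared data buffer."""
    data = bytearray()          # buffer 0: holds the bytes of long strings
    views = []
    for text in strings:
        raw = text.encode("utf-8")
        if len(raw) <= INLINE_MAX:
            # Short string: 4-byte length, then the bytes, zero-padded to 12.
            views.append(struct.pack("<i12s", len(raw), raw))
        else:
            # Long string: 4-byte length, 4-byte prefix, buffer index, offset.
            views.append(struct.pack("<i4sii", len(raw), raw[:4], 0, len(data)))
            data.extend(raw)
    return views, [bytes(data)]


def read_view(view, buffers):
    """Decode one 16-byte view back into a Python string."""
    (length,) = struct.unpack_from("<i", view)
    if length <= INLINE_MAX:
        return view[4:4 + length].decode("utf-8")
    _, _prefix, buffer_index, offset = struct.unpack("<i4sii", view)
    return buffers[buffer_index][offset:offset + length].decode("utf-8")


views, buffers = build_views(["Arrow", "German-style strings"])
decoded = [read_view(view, buffers) for view in views]
print(decoded)
```

Because every view has the same size, a column of strings can be indexed directly, and the inline prefix lets many comparisons finish without touching the data buffer at all.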

Oracle GoldenGate 23ai: New Features & Product Family

Oracle University Podcast

May 6, 2025 · 17:39

In this episode, Lois Houston and Nikita Abraham continue their deep dive into Oracle GoldenGate 23ai, focusing on its evolution and the extensive features it offers. They are joined once again by Nick Wagner, who provides valuable insights into the product's journey. Nick talks about the various iterations of Oracle GoldenGate, highlighting the significant advancements from version 12c to the latest 23ai release. The discussion then shifts to the extensive new features in 23ai, including AI-related capabilities, UI enhancements, and database function integration. Oracle GoldenGate 23ai: Fundamentals: https://mylearn.oracle.com/ou/course/oracle-goldengate-23ai-fundamentals/145884/237273 Oracle University Learning Community: https://education.oracle.com/ou-community LinkedIn: https://www.linkedin.com/showcase/oracle-university/ X: https://x.com/Oracle_Edu Special thanks to Arijit Ghosh, David Wright, Kris-Ann Nansen, Radhika Banka, and the OU Studio Team for helping us create this episode. ----------------------------------------------------------------- Episode Transcript: 00:00 Welcome to the Oracle University Podcast, the first stop on your cloud journey. During this series of informative podcasts, we'll bring you foundational training on the most popular Oracle technologies. Let's get started! 00:25 Lois: Hello and welcome to the Oracle University Podcast! I'm Lois Houston, Director of Innovation Programs with Oracle University, and with me is Nikita Abraham, Team Lead: Editorial Services. Nikita: Hi everyone! Last week, we introduced Oracle GoldenGate and its capabilities, and also spoke about GoldenGate 23ai. In today's episode, we'll talk about the various iterations of Oracle GoldenGate since its inception. And we'll also take a look at some new features and the Oracle GoldenGate product family. 00:57 Lois: And we have Nick Wagner back with us. Nick is a Senior Director of Product Management for GoldenGate at Oracle. Hi Nick! 
I think the last time we had an Oracle University course was when Oracle GoldenGate 12c was out. I'm sure there's been a lot of advancements since then. Can you walk us through those? Nick: GoldenGate 12.3 introduced the microservices architecture. GoldenGate 18c introduced support for Oracle Autonomous Data Warehouse and Autonomous Transaction Processing Databases. In GoldenGate 19c, we added the ability to do cross-endian remote capture for Oracle, making it easier to set up the GoldenGate OCI service to capture from environments like Solaris, SPARC, and HP-UX and replicate into the Cloud. Also, GoldenGate 19c introduced a simpler process for upgrades and installation of GoldenGate where we released something called a unified build. This means that when you install GoldenGate for a particular database, you don't need to worry about the database version when you install GoldenGate. Prior to this, you would have to install a version-specific and database-specific version of GoldenGate. So this really simplified that whole process. In GoldenGate 23ai, which is where we are now, this really is a huge release. 02:16 Nikita: Yeah, we covered some of the distributed AI features and high availability environments in our last episode. But can you give us an overview of everything that's in the 23ai release? I know there's a lot to get into but maybe you could highlight just the major ones? Nick: Within the AI and streaming environments, we've got interoperability for database vector types, heterogeneous capture and apply as well. Again, this is not just replication between Oracle-to-Oracle vector or Postgres to Postgres vector, it is heterogeneous just like the rest of GoldenGate. The entire UI has been redesigned and optimized for high speed. And so we have a lot of customers that have dozens and dozens of extracts and replicats and processes running and it was taking a long time for the UI to refresh those and to show what's going on within those systems. 
So the UI has been optimized to be able to handle those environments much better. We now have the ability to call database functions directly from COLMAP. And so when you do transformation with GoldenGate, we have about 50 or 60 built-in transformation routines for string conversion, arithmetic operation, date manipulation. But we never had the ability to directly call a database function. 03:28 Lois: And now we do? Nick: So now you can actually call that database function, database stored procedure, database package, return a value and that can be used for transformation within GoldenGate. We have integration with identity providers, being able to use token-based authentication and integrate in with things like Azure Active Directory and your other single sign-on for the GoldenGate product itself. Within Oracle 23ai, there's a number of new features. One of those cool features is something called lock-free reservation columns. So this allows you to have a row, a single row within a table and you can identify a column within that row that's like an inventory column. And you can have multiple different users and multiple different transactions all updating that column within that same exact row at that same time. So you no longer have row-level locking for these reservation columns. And it allows you to do things like shopping carts very easily. If I have 500 widgets to sell, I'm going to let any number of transactions come in and subtract from that inventory column. And then once it gets below a certain point, then I'll start enforcing that row-level locking. 04:43 Lois: That's really cool… Nick: The one key thing that I wanted to mention here is that because of the way that the lock-free reservations work, you can have multiple transactions open on the same row. This is only supported for Oracle to Oracle. You need to have that same lock-free reservation data type and availability on that target system if GoldenGate is going to replicate into it. 
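The lock-free reservation behavior Nick describes can be pictured with a small toy model. The sketch below is a conceptual illustration only, not how Oracle Database implements reservation columns: the class, its method names, and the simple "reject when the floor would be crossed" rule are all assumptions made for the example. The point it shows is that several open transactions can each hold a pending decrement against the same inventory value, and the value itself changes only when a transaction commits.

```python
class ReservableColumn:
    """Toy model of an inventory column that accepts concurrent reservations."""

    def __init__(self, value, minimum=0):
        self.value = value          # committed quantity
        self.minimum = minimum      # floor the column may never go below
        self.pending = {}           # txn id -> amount reserved by that open txn

    def reserve(self, txn_id, amount):
        """Record a pending decrement if it cannot push the value below the floor."""
        outstanding = sum(self.pending.values())
        if self.value - outstanding - amount < self.minimum:
            return False            # not enough unreserved inventory left
        self.pending[txn_id] = self.pending.get(txn_id, 0) + amount
        return True

    def commit(self, txn_id):
        """Apply this transaction's reservation to the committed value."""
        self.value -= self.pending.pop(txn_id, 0)

    def rollback(self, txn_id):
        """Release this transaction's reservation without changing the value."""
        self.pending.pop(txn_id, None)


widgets = ReservableColumn(500)
first = widgets.reserve("t1", 200)      # two transactions open on the same row
second = widgets.reserve("t2", 250)
third = widgets.reserve("t3", 100)      # would overdraw, so it is refused
widgets.rollback("t2")                  # t2 abandons its shopping cart
retry = widgets.reserve("t3", 100)      # now there is room
widgets.commit("t1")
widgets.commit("t3")
print(widgets.value)
```

In this model no transaction waits for another while inventory is plentiful; contention only matters near the floor, which matches the spirit of the shopping-cart example in the transcript.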
05:05 Nikita: Are there any new features related to the diagnosability and observability of GoldenGate? Nick: We've improved the AWR reports in Oracle 23ai. There are now seven sections that are specific to Oracle GoldenGate to allow you to really go in and see exactly what the GoldenGate processes are doing and how they're behaving inside the database itself. And there's a Replication Performance Advisor package inside that database, and that's been integrated into the Web UI as well. So now you can actually get information out of the replication advisor package in Oracle directly from the UI without having to log into the database and try to run any database procedures to get it. We've also added the ability to support a per-PDB Extract. So in the past, when GoldenGate would run on a multitenant database, a multitenant database in Oracle, all the redo data from any pluggable database gets sent to that one redo stream. And so you would have to configure GoldenGate at the container or root level and it would be able to access anything at any PDB. Now, there's better security and better performance by doing what we call per-PDB Extract. And this means that for a single pluggable database, I can have an extract that runs at that database level that's going to capture information just from that pluggable database. 06:22 Lois: And what about non-Oracle environments, Nick? Nick: We've also enhanced the non-Oracle environments as well. For example, in Postgres, we've added support for precise instantiation using Postgres snapshots. This eliminates the need to handle collisions when you're doing Postgres to Postgres replication and initial instantiation. On the GoldenGate for big data side, we've renamed that product more aptly to Distributed Applications and Analytics, which is really what it does, and we've added a whole bunch of new features here too. The ability to move data into Databricks, doing Google Pub/Sub delivery. 
We now have support for XAG within the GoldenGate for distributed applications and analytics. What that means is that now you can follow all of our MAA best practices for GoldenGate for Oracle, but it also works for the DAA product as well, meaning that if it's running on one node of a cluster and that node fails, it'll restart itself on another node in the cluster. We've also added the ability to deliver data to Redis, Google BigQuery, stage and merge functionality for better performance into the BigQuery product. And then we've added a completely new feature, and this is something called streaming data and apps and we're calling it AsyncAPI and CloudEvent data streaming. It's a long name, but what that means is that we now have the ability to publish changes from a GoldenGate trail file out to end users. And so this allows through the Web UI or through the REST API, you can now come into GoldenGate and through the distributed applications and analytics product, actually set up a subscription to a GoldenGate trail file. And so this allows us to push data into messaging environments, or you can simply subscribe to changes and it doesn't have to be the whole trail file, it can just be a subset. You can specify exactly which tables and you can put filters on that. You can also set up your topologies as well. So, it's a really cool feature that we've added here. 08:26 Nikita: Ok, you've given us a lot of updates about what GoldenGate can support. But can we also get some specifics? Nick: So as far as what we have, on the Oracle Database side, there's a ton of different Oracle databases we support, including the Autonomous Databases and all the different flavors of them, your Oracle Database Appliance, your Base Database Service within OCI, your of course, Standard and Enterprise Edition, as well as all the different flavors of Exadata, are all supported with GoldenGate. This is all for capture and delivery. And this is all versions as well. 
GoldenGate supports Oracle 23ai and below. We also have a ton of non-Oracle databases and different cloud stores. On the non-Oracle side, we support everything from application-specific databases like FairCom DB, all the way to more advanced applications like Snowflake, which has a vast user base. We also support a lot of different cloud stores and these again, are non-Oracle, nonrelational systems, or they can be relational databases. We also support a lot of big data platforms and this is part of the distributed applications and analytics side of things where you have the ability to replicate to different Apache environments, different Cloudera environments. We also support a number of open-source systems, including things like Apache Cassandra, MySQL Community Edition, a lot of different Postgres open source databases along with MariaDB. And then we have a bunch of streaming event products, NoSQL data stores, and even Oracle applications that we support. So there's absolutely a ton of different environments that GoldenGate supports. There are additional Oracle databases that we support and this includes the Oracle Metadata Service, as well as Oracle MySQL, including MySQL HeatWave. Oracle also has Oracle NoSQL, Spatial and Graph, and TimesTen products, which again are all supported by GoldenGate. 10:23 Lois: Wow, that's a lot of information! Nick: One of the things that we didn't really cover was the different SaaS applications, which we've got like Cerner, Fusion Cloud, Hospitality, Retail, MICROS, Oracle Transportation, JD Edwards, Siebel, and on and on and on. And again, because of the nature of GoldenGate, it's heterogeneous. Any source can talk to any target. And so it doesn't have to be, oh, I'm pulling from Oracle Fusion Cloud, that means I have to go to an Oracle Database on the target, not necessarily. 10:51 Lois: So, there's really a massive amount of flexibility built into the system. 
11:00 Unlock the power of AI Vector Search with our new course and certification. Get more accurate search results, handle complex datasets easily, and supercharge your data-driven decisions. From now through May 15, 2025, we are waiving the certification exam fee (valued at $245). Visit mylearn.oracle.com to enroll. 11:26 Nikita: Welcome back! Now that we've gone through the base product, what other features or products are in the GoldenGate family itself, Nick? Nick: So we have quite a few. We've kind of touched already on GoldenGate for Oracle databases and non-Oracle databases. We also have something called GoldenGate for Mainframe, which right now is covered under the GoldenGate for non-Oracle, but there is a licensing difference there. So that's something to be aware of. We also have the OCI GoldenGate product. We are announcing and we have announced that OCI GoldenGate will also be made available as part of the Oracle Database@Azure and Oracle Database@Google Cloud partnerships. And then you'll be able to use that vendor's cloud credits to actually pay for the OCI GoldenGate product. One of the cool things about this is it will have full feature parity with OCI GoldenGate running in OCI. So all the same features, all the same sources and targets, all the same topologies, and you'll be able to migrate data in and out of those clouds at will, just like you do with OCI GoldenGate today running in OCI. We have Oracle GoldenGate Free. This is a completely free edition of GoldenGate to use. It is limited on the number of platforms that it supports as far as sources and targets and the size of the database. 12:45 Lois: But it's a great way for developers to really experience GoldenGate without worrying about a license, right? What's next, Nick? Nick: We have GoldenGate for Distributed Applications and Analytics, which was formerly called GoldenGate for big data, and that allows us to do all the streaming. That's also where the GoldenGate AsyncAPI integration is done. 
So in order to publish the GoldenGate trail files or allow people to subscribe to them, it would be covered under the Oracle GoldenGate Distributed Applications and Analytics license. We also have OCI GoldenGate Marketplace, which allows you to run essentially the on-premises version of GoldenGate but within OCI. So a little bit more flexibility there. It also has a hub architecture. So if you need that 99.99% availability, you can get it within the OCI Marketplace environment. We have GoldenGate for Oracle Enterprise Manager Cloud Control, which used to be called Oracle Enterprise Manager. And this allows you to use Enterprise Manager Cloud Control to get all the statistics and details about GoldenGate. So all the reporting information, all the analytics, all the statistics, how fast GoldenGate is replicating, what's the lag, what's the performance of each of the processes, how much data am I sending across a network. All that's available within the plug-in. We also have Oracle GoldenGate Veridata. This is a nice utility and tool that allows you to compare two databases, whether or not GoldenGate is running between them and actually tell you, hey, these two systems are out of sync. And if they are out of sync, it actually allows you to repair the data too. 14:25 Nikita: That's really valuable…. Nick: And it does this comparison without locking the source or the target tables. The other really cool thing about Veridata is it does this while there's data in flight. So let's say that the GoldenGate lag is 15 or 20 seconds and I want to compare this table that has 10 million rows in it. The Veridata product will go out, run its comparison once. Once that comparison is done the first time, it's then going to have a list of rows that are potentially out of sync. Well, some of those rows could have been moved over or could have been modified during that 10 to 15 second window. And so the next time you run Veridata, it's actually going to go through. 
It's going to check just those rows that were potentially out of sync to see if they're really out of sync or not. And if it comes back and says, hey, out of those potential rows, there's two out of sync, it'll actually produce a script that allows you to resynchronize those systems and repair them. So it's a very cool product. 15:19 Nikita: What about GoldenGate Stream Analytics? I know you mentioned it in the last episode, but in the context of this discussion, can you tell us a little more about it? Nick: This is the ability to essentially stream data from a GoldenGate trail file, and then do real-time analytics on it. And also things like geofencing or real-time series analysis of it. 15:40 Lois: Could you give us an example of this? Nick: If I'm working in tracking stock market information and stocks, it's not really that important on how much or how far down a stock goes. What's really important is how quickly did that stock rise or how quickly did that stock fall. And that's something that GoldenGate Stream Analytics product can do. Another thing that it's very valuable for is the geofencing. I can have an application on my phone and I can track where the user is based on that application and all that information goes into a database. I can then use the geofencing tool to say that, hey, if one of those users on that app gets within a certain distance of one of my brick-and-mortar stores, I can actually send them a push notification to say, hey, come on in and you can order your favorite drink just by clicking Yes, and we'll have it ready for you. And so there's a lot of things that you can do there to help upsell your customers and to get more revenue just through GoldenGate itself. And then we also have a GoldenGate Migration Utility, which allows customers to migrate from the classic architecture into the microservices architecture. 16:44 Nikita: Thanks Nick for that comprehensive overview. 
Lois: In our next episode, we'll have Nick back with us to talk about commonly used terminology and the GoldenGate architecture. And if you want to learn more about what we discussed today, visit mylearn.oracle.com and take a look at the Oracle GoldenGate 23ai Fundamentals course. Until next time, this is Lois Houston… Nikita: And Nikita Abraham, signing off! 17:10 That's all for this episode of the Oracle University Podcast. If you enjoyed listening, please click Subscribe to get all the latest episodes. We'd also love it if you would take a moment to rate and review us on your podcast app. See you again on the next episode of the Oracle University Podcast.
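The two-pass comparison Nick describes for Veridata (compare once, then re-check only the rows that looked out of sync after in-flight changes have had time to arrive) can be sketched in a few lines. This is a simplified illustration under stated assumptions, not Veridata's actual algorithm: tables are modeled as in-memory dictionaries keyed by primary key, rows are compared by hash, and the function names are hypothetical.

```python
import hashlib


def row_hash(row):
    """Hash a row so only compact digests need to be compared."""
    return hashlib.sha256(repr(row).encode("utf-8")).hexdigest()


def first_pass(source, target):
    """Return keys that look out of sync: missing on one side or differing."""
    candidates = set()
    for key in set(source) | set(target):
        if key not in source or key not in target:
            candidates.add(key)
        elif row_hash(source[key]) != row_hash(target[key]):
            candidates.add(key)
    return candidates


def confirm_pass(source, target, candidates):
    """Re-check only the candidate keys; return those still out of sync."""
    return sorted(key for key in candidates if source.get(key) != target.get(key))


source = {1: ("a", 10), 2: ("b", 20), 3: ("c", 30)}
target = {1: ("a", 10), 2: ("b", 99), 3: ("c", 31)}

candidates = first_pass(source, target)     # rows 2 and 3 look different

# Simulate replication lag: the change to row 3 was simply still in flight
# during the first pass and has now been applied on the target.
target[3] = source[3]

confirmed = confirm_pass(source, target, candidates)
print(confirmed)
```

The second pass touches only the candidate rows, which is why rows that were merely in flight during the first comparison drop out of the report while genuinely divergent rows remain and can be repaired.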

family director ai product cloud unlock retail hospitality spark oracle analytics saas senior director fundamentals ui snowflakes product management apache graphs solaris david wright maa databricks mainframe micros nosql redis postgres cerner rest apis oci cloudera bigquery daa siebel awr mariadb pdb azure active directory apache cassandra oracle database web ui google bigquery oracle university enterprise edition jd edwards innovation programs nick wagner exadata nick so nick one

The Database That Doesn't Quit: Apache Cassandra with Patrick McFadin

Alexa's Input (AI)

Apr 8, 2025 · 60:08

On this episode of Alexa's Input (AI), we're diving deep into the world of distributed databases with Patrick McFadin, Principal Technical Strategist at DataStax and a leading voice in the Apache Cassandra community. Patrick shares his journey into tech and how he became one of the foremost experts on Cassandra, an open-source, highly scalable NoSQL database that powers mission-critical applications across the globe.
We explore Cassandra's unique architecture, its approach to the CAP theorem, real-world use cases, and how it continues to evolve in the era of AI and real-time analytics. Whether you're a developer, architect, or just database-curious, this episode offers a clear, insightful look at how Cassandra handles scale, availability, and open-source innovation.
Links:
LinkedIn: https://www.linkedin.com/in/patrick-mcfadin-53a8046/
DataStax: https://www.datastax.com/our-people/patrick-mcfadin
X: https://x.com/patrickmcfadin
Github: https://github.com/pmcfadin
You can support this podcast on the creators page. Make sure to subscribe and follow Alexa's Input Twitter account to get notified when a new podcast episode comes out.

ai quit cap databases nosql datastax apache cassandra patrick mcfadin

DataStax and the Future of Real-Time Data Applications with Jonathan Ellis

Software Engineering Daily

Nov 19, 2024 · 43:24

DataStax is known for its expertise in scalable data solutions, particularly for Apache Cassandra, a leading NoSQL database. Recently, the company has focused on enhancing platform support for AI-driven applications, including vector search capabilities. Jonathan Ellis is the Co-founder of DataStax. He maintains a technical role at the company and has recently worked on developing [...]

ai future applications nosql real time data datastax apache cassandra software engineering daily jonathan ellis

From Apache Cassandra to Serverless: Exploring Cloud-Native Databases

airhacks.fm podcast with adam bien

Oct 5, 2024 · 75:47

An airhacks.fm conversation with Jake Luciani (@tjake) about: from Commodore 64 to cloud databases, early programming experiences with BASIC and Excel macros, studying cognitive science and its influence on his career, transition to computer science, working at Bell Labs on R language, developing open-source projects like Night Rider MP3 player, creating a NoSQL database that led to involvement with Cassandra, building search API on top of Cassandra, joining DataStax as an early employee, working on various aspects of Cassandra including compaction and streaming, challenges of byte buffer implementation, development of CQL (Cassandra Query Language), transition from NoSQL to SQL-like interfaces, separation of compute and storage in cloud databases, using S3 as the source of truth for Astra DB, implementing a Java file system abstraction for S3 integration, using etcd as a transactional cache for metadata, offering multiple APIs including REST and CQL drivers for Astra DB, implementing JSON document storage and querying capabilities, cross-AZ cost considerations in cloud deployments, Java as a language for database development, future plans for jlama (Java-based LLM inference engine), the importance of open-source in cloud technologies, cost-driven architectures in cloud deployments, serverless vs. traditional deployments trade-offs, integration of Astra DB with cloud marketplaces and security considerations.
Jake Luciani on twitter: @tjake

basic excel api java databases apis db llm commodore s3 sql json serverless cloud native bell labs nosql apache cassandra cql

DataStax with Ed Anuff

Software Engineering Daily

Jun 25, 2024

DataStax is a generative AI data company that provides tools and services to build AI and other data-intensive applications. Ed Anuff is the Chief Product Officer at DataStax. He joins the show to talk about making Apache Cassandra accessible, adding vector support at DataStax, envisioning the future application stack for AI, and more. Full Disclosure: [...]

ai chief product officer full disclosure datastax apache cassandra software engineering daily

High-Performance Java, Or How JVector Happened

airhacks.fm podcast with adam bien

May 18, 2024 · 61:16

An airhacks.fm conversation with Jonathan Ellis (@spyced) about: Jonathan's first computer experiences with IBM PC 8086 and ThinkPad laptop with Red Hat Linux, becoming a key contributor to Apache Cassandra and founding DataStax, starting DataStax to provide commercial support for Cassandra, early experiences with Java, C++, and Python, discussion about the evolution of Java and its ecosystem, the importance of vector databases for semantic search and retrieval-augmented generation, the development of JVector for high-performance vector search in Java, the potential of integrating JVector with LangChain for Java / langchain4j and Quarkus for serverless deployment, the advantages of Java's productivity and performance for building concurrent data structures, the shift from locally installed software to cloud-based services, the challenges of being a manager and the benefits of taking a sabbatical to focus on creative pursuits, the importance of separating storage and compute in cloud databases, Cassandra's write-optimized architecture and improvements in read performance, DataStax's investment in Apache Pulsar for stream processing, the llama2java project for high-performance language models in Java.
Jonathan Ellis on twitter: @spyced

high performance java thinkpad datastax ibm pc langchain apache cassandra jonathan ellis red hat linux apache pulsar

The Evolving Relationship between Apache Cassandra and DataStax

The Business of Open Source

Play Episode Listen Later Feb 28, 2024 40:26

Slightly different The Business of Open Source episode today! I spoke with Patrick McFadin and Mick Semb Wever about the relationship between Apache Cassandra and DataStax: how it was at the beginning and how it has evolved over the years. We talked about:
- How many of Cassandra's contributors ended up being pulled into the DataStax orbit, simply because it allowed them to work on Cassandra full-time
- How tensions can arise between different stakeholders simply because everyone involved ultimately has their own interests at heart, and those interests are not always aligned
- How hard it is to have truly open discussions about new features, and how often a new feature can be dropped into a project after clearly having been developed behind closed doors for some time, which sometimes created tension in the community
- How some open source projects are just too complex to be hobby projects: Cassandra is so complex that you won't become a code contributor unless you're working on it full-time, because that's the level of skill you need to keep up
- How the relationship between a company and a project often changes as the technology matures
- The importance of addressing tensions between company and community head-on, as adults, when they occur, as well as why you need to remember to treat people as humans who have good days, bad days, goals and interests
Patrick on LinkedIn
Mick on LinkedIn

relationships business evolving real world open source slightly founder stories datastax apache cassandra patrick mcfadin

Stackd 66: Streams, Messages, Events, and a Java User Group

Enterprise Java Newscast

Play Episode Listen Later Aug 11, 2023 121:43

Ian, Kito, and Josh are joined by Java Champion, Streaming Developer Advocate at DataStax, and President of Chicago-JUG, Mary Grygleski. They discuss news about Capacitor, Angular, PrimeNG Designer for Tailwind, JetBrains Compose Multiplatform for iOS, JDK 21, AI developer tools, Jakarta EE 10, and more. Kito announces the work he is doing on the Jakarta EE Tutorial, and then they delve into Mary's background and event streaming with Apache Pulsar, plus tools like Apache Pinot, Apache Flink, RisingWave, ByteWax and Apache Cassandra.
We thank Datadog for sponsoring this podcast! https://www.pubhouse.net/datadog
Front End
- Announcing Capacitor 5.0 - Ionic Blog (https://ionic.io/blog/announcing-capacitor-5)
- Angular v16 is here! (https://blog.angular.io/angular-v16-is-here-4d7a28ec680d)
- Compose Multiplatform (https://blog.jetbrains.com/kotlin/2023/05/compose-multiplatform-for-ios-is-in-alpha/)
- PrimeNG Designer - Tailwind (Q3 2023) (https://www.primefaces.org/primeng-theme-designer-with-tailwind/)
Server Side Java
- Kito is working with Bauke Scholtz and Arjan Tijms to refresh the Jakarta EE Tutorial
- Eclipse Documentation for Jakarta EE (https://projects.eclipse.org/projects/ee4j.jakartaee-documentation)
- Antora (https://antora.org)
- Asciidoc (http://asciidoc.org)
- Jakarta EE 10; MicroProfile 6; Java SE 20; Open Liberty (https://openliberty.io/blog/2023/04/04/23.0.0.3.html)
- Jakarta EE Starter (https://start.jakarta.ee/)
AI/ML
- Phind - AI search engine for developers (https://www.phind.com/)
- 92% of devs using AI coding assistants (https://www.zdnet.com/article/github-developer-survey-finds-92-of-programmers-using-ai-tools/)
Java Platform
- JDK 21, the next LTS release, due out in September (https://www.infoworld.com/article/3689880/jdk-21-the-new-features-in-java-21.html)
IDE and Tools
- Grazie Professional - IntelliJ IDEs Plugin | Marketplace (https://plugins.jetbrains.com/plugin/16136-grazie-professional)
Chat w/Mary
- Twitter: @mgrygles (https://twitter.com/mgrygles)
- Discord server: https://discord.gg/RMU4Juw
- LinkedIn: https://www.linkedin.com/in/mary-grygleski/
- Apache Pulsar (https://pulsar.apache.org/)
- Apache Pinot (https://pinot.apache.org/)
- Apache Flink (https://flink.apache.org/)
- RisingWave (https://www.risingwave.dev/)
- ByteWax (https://bytewax.io/)
- Apache Cassandra (https://cassandra.apache.org/)
- Apache Kafka (https://kafka.apache.org/)
Picks
- Quantum Energy Squares (Kito) (https://quantumsquares.com/)
- JBOSS EAP on Azure (Josh) (https://learn.microsoft.com/en-us/azure/developer/java/ee/jboss-on-azure)
- Interstellar (Mary) (https://www.imdb.com/title/tt0816692/)
- Black Mirror Season 6 Episode 1 - Joan Is Awful - Netflix (Ian) (https://www.rottentomatoes.com/tv/black_mirror/s06/e01)
Other Pubhouse Network podcasts
- Breaking into Open Source (https://www.pubhouse.net/breaking-into-open-source)
- OffHeap (https://www.javaoffheap.com/)
- Java Pubhouse (https://www.javapubhouse.com/)
Events
- Lone Star Software Symposium - July 14 - 15, Austin, TX, USA (https://nofluffjuststuff.com/austin)
- ÜberConf - July 18 - 21, Denver, CO, USA (https://uberconf.com/)
- Nebraska.code() - July 19-20, Lincoln, NE, USA (https://nebraskacode.amegala.com/)

210: The Cloud Pod Deep Inspects Itself

The Cloud Pod

Play Episode Listen Later May 4, 2023 59:35

Welcome to the newest episode of The Cloud Pod podcast! Justin, Ryan and Matthew are your hosts this week as we discuss all the latest news and announcements in the world of the cloud and AI, including what's new with Google DeepMind, as well as goings-on over at the FinOps X Conference. Join us!

ceo amazon ai google starting france moving news deep west cross brand dc microsoft expectations san diego employees wall street os snow cloud customers prime oracle fantastic saas intel reducing organizations select titles api bingo certification bard python aws initiatives linux java tee databases inspectors devops azure amazon web services ubuntu spotted ec flush s3 sql eta kubernetes public service announcement deepmind microsoft azure zookeepers typescript sidenote vms google deepmind tcp andy jassy redesigned bueller cable news gci oci push notifications amazon amazon chalk talk microsoft cloud amazon s3 waf ec2 foghorn apache kafka google research 512gb apache cassandra justinas matt yeah amazon redshift javascript apis dolby digital matt there justin it etcd amazon inspector aws systems manager cloud pod foghorn consulting

Open Source Adoption, DevRel, and FOSS: Learning from Apache Cassandra w/ Patrick McFadin - EP. 26

The Hacking Open Source Business Podcast

Play Episode Listen Later Apr 28, 2023 57:29

Patrick McFadin, VP of Developer Relations at DataStax and Chief Evangelist for Apache Cassandra, joins the Hacking Open Source Business Podcast on Episode 26 to deep dive into open source. In this episode Patrick talks about:
- His time working in the open source database community, including Apache Cassandra's journey and upcoming developments.
- The role of evangelism and contributors in driving adoption and getting people to try your project.
- The challenges and mistakes companies make when commercializing open source, with lessons he has learned from his time in the database community.
- How new features are chosen, based on his experience with Cassandra, highlighting features such as transactions and the open-source tool Guardrails.
- Does open source innovation slow down as products mature?
- What is cloud-native anyway? And what does it mean in the database context?
- Building a diverse and global team by building trust.
- DevRel best practices, including how to measure DevRel success.
Patrick McFadin's LinkedIn profile: https://www.linkedin.com/in/patrick-mcfadin-53a8046/
Learn more about Apache Cassandra: https://cassandra.apache.org/
Check out our other interviews, clips, and videos: https://l.hosbp.com/Youtube
Don't forget to visit the open-source business community at: https://opensourcebusiness.community/
Visit our primary sponsor, Scarf, for tools to help analyze your #opensource growth and adoption: https://about.scarf.sh/
Subscribe to the podcast on your favorite app:
Spotify: https://l.hosbp.com/Spotify
Apple: https://l.hosbp.com/Apple
Google: https://l.hosbp.com/Google
Buzzsprout: https://l.hosbp.com/Buzzsprout

spotify learning building adoption checkout open source buzzsprout guardrails foss chief evangelist scarf apple google developer relations devrel datastax apache cassandra patrick mcfadin

204: Amazon eats Pi with their own version of S3FS

The Cloud Pod

Play Episode Listen Later Mar 23, 2023 50:38

On this episode of The Cloud Pod, the team discusses Amazon Pi Day, Google's upcoming I/O conference, the agricultural data manager by Microsoft, and the downturn in net profits of Oracle. They also round up cloud migrations by highlighting tools from different cloud service providers that are useful for the process. A big thanks to this week's sponsor, Foghorn Consulting, which provides full-stack cloud solutions with a focus on strategy, planning and execution for enterprises seeking to take advantage of the transformative capabilities of AWS, Google Cloud and Azure.

amazon ai google microsoft public wall street pivot oracle api eats io aws silicon valley bank azure resolver google cloud google i o mountain view ftp pi day gcp ipv6 oci eks amazon s3 amazon connect apache cassandra illumio amazon ses aws amazon amazon route apache hive microsoft active directory amazon emr aws systems manager cloud pod foghorn consulting

How Apache Roller Happened

airhacks.fm podcast with adam bien

Play Episode Listen Later Mar 11, 2023 63:15

An airhacks.fm conversation with Dave Johnson (@snoopdave) about: PDP-8 with a paper tape reader, airhacks.tv questions and answers, TRS-80, playing Asteroids, Asteroids, Defender and Battlezone were based on vector graphics, learning Pascal and C, Data General Eclipse MV/8000, Geographic Resources Analysis Support System (GRASS GIS), working for University of Kingston, working on jfactory for Rogue Wave, HAHT Software, The Soul of a New Machine, distributed Visual Basic application server, using xdoclet to generate EJB, using castor for persistence, Apache Roller started as sample application, Sun hires Dave, working on Lotus Notes social, starting at Wayin, Roller supports Pingback, Lotus is using Roller, using RightScale to deploy Java software to AWS, using Jenkins and CloudFormation, episode with Scott McNealy "#19 SUN, JavaSoft, Java, Oracle", Roller uses Apache Velocity, working on RSS parser Rome, switching from MongoDB to Apache Cassandra, UserGrid data store, Oracle acquires Apiary, starting at CloudBees, episode with Kohsuke Kawaguchi "#143 How Hudson and Jenkins happened", starting at Apollo, several thousand blogs on Roller. Dave Johnson on twitter: @snoopdave

Special Episode: Data on Kubernetes and Cassandra Forward with Patrick McFadin

Open||Source||Data

Play Episode Listen Later Feb 22, 2023 18:44

This special episode of Open||Source||Data features an interview with Patrick McFadin. Patrick has been a distributed systems hacker since he first plugged a modem into his Atari computer. Looking for adventure, he joined the US Navy, working on the Naval Tactical Data System (NTDS), which cemented his love of distributed systems. He is now an Apache Cassandra Committer, and is the Vice President of Developer Relations at DataStax. Sam catches up with Patrick at Data Day Texas to discuss his book Managing Cloud Native Data on Kubernetes, Cassandra Forward, and the future of Apache Cassandra.
“I can now use my Parquet file in Iceberg or DuckDB, and this is data that I created with Cassandra. And we're not getting to the point where we have to reinvent an entire database. We can just connect the Lego parts together and if they're open, then I don't have these encumbrances. I'm not like, ‘Well, I can connect that if I call a salesperson and get a license.’ [...] That's what's exciting to me about Cassandra, the way that the ecosystem is evolving around Cassandra. It's not, ‘Cassandra's at the center, it's just a player.’ It's at the party.” – Patrick McFadin
Episode Timestamps:
(01:06): What open source data means to Patrick
(02:11): Patrick discusses his book Managing Cloud Native Data on Kubernetes
(10:02): Patrick discusses Cassandra Forward
(11:09): The future of Apache Cassandra
Links:
LinkedIn - Connect with Patrick
Cassandra Forward

data vice president forward lego us navy atari iceberg kubernetes developer relations parquet datastax apache cassandra open source data duckdb patrick mcfadin

Star Trek, Star Wars, Transactions, SQL, NoSQL and almost Streaming

airhacks.fm podcast with adam bien

Play Episode Listen Later Feb 5, 2023 62:10

An airhacks.fm conversation with Mary Grygleski (@mgrygles) about: 808X as first computer, Hong Kong was high tech, enjoying space missions, Star Trek and Star Wars, the intriguing registration terminal, writing code in Pascal, 3 GL programming languages and SQL, set theory and SQL, the seven layers of OSI, OSI model, IBM MVS, AS/400 is the opposite of microservices, developers get bored too early, learning X-Windows, working with early Oracle databases, using dBASE, Clipper and FoxPro, transarc, stratos tx, Transarc the transaction file system, Transaction Processing: Concepts and Techniques, working on SMTP / MTA, CouchDB and Lotus Notes, the Sun Ultra 30 workstation, starting at Sybase, EA server Sybase / Jaguar, using emacs for Java development, then NetBeans, Java EE and the hierarchical class loaders, working on EJB 3 specs, mobile apps with Apache Cordova, reactive systems at IBM, using Akka, Eclipse Vert.x and MicroProfile, working for DataStax and Pulsar, DataStax provides support for Apache Cassandra and Apache Pulsar, separating the compute from the storage, Astra the managed cloud platform. Mary Grygleski on twitter: @mgrygles

Low Code, No Code, WYSIWYG …and some CRaC

airhacks.fm podcast with adam bien

Play Episode Listen Later Nov 13, 2022 61:08

An airhacks.fm conversation with John Ceccarelli (@jceccarelli1) about: Macintosh 512K, writing short stories and playing Dark Castle, studying European politics, enjoying Brno and Prague, learning Czech from a communist book, technical writing for Sun Microsystems, working on NetBeans Matisse, WYSIWYG precision is challenging, NetBeans Visual Web Pack was extremely popular, Sun's JSF Woodstock, separation of generated and implemented code is challenging, explaining AWS Lambdas with EJBs, visual representation of complex code is challenging, NetBeans vs. IntelliJ strategies, installing Java support in Visual Studio Code, working on JVM internals at Azul Systems, Azul JVMs Zulu vs. Prime, the Falcon JIT, optimising JVM for Apache Cassandra, the Renaissance Suite, memento and OpenJDK CRaC, Azul's CRaC optimization, crowdsourcing the optimizations, Quarkus on Azul's CRaC, Azul Prime is based on LLVM, Foojay and Azul. John Ceccarelli on twitter: @jceccarelli1

The Kubernetes Native Database // Jeffrey Carpenter (DoK Day North America 2022)

Data on Kubernetes Community

Play Episode Listen Later Nov 2, 2022 16:26

From the DoK Day North America 2022 (https://youtu.be/YWTa-DiVljY)
ABSTRACT
In the software industry we're fond of terms that define major trends, like “cloud native”, “Kubernetes native” and “serverless”. As more and more organizations move stateful workloads to Kubernetes, we've started to see these terms applied to data infrastructure, where they can get overtaken by marketing hype unless we work to define them. In this talk, we'll examine two different databases, TiDB and Apache Cassandra, in order to identify what it means for a database to be Kubernetes native and why it matters. We'll look at points including:
- The differences between cloud native, Kubernetes native, and serverless
- How databases become Kubernetes native
- Benefits of Kubernetes native databases
- How Kubernetes can better support databases

benefits north america native carpenter databases kubernetes apache cassandra

Datastax

The Craft Of Open Source

Play Episode Listen Later Oct 28, 2022 50:44

Apache Cassandra paved the way for today's biggest digital platforms to scale into the much bigger global scene. Patrick McFadin of DataStax is one of the people involved in this open-source project and saw first-hand how it burst into the world. He joins Ben Rometsch to share how Cassandra was developed, the many challenges they faced in its optimization, its relationship with DataStax, and how it changed database engine creation and data modeling. Patrick also talks about the measures they are implementing to continuously improve Cassandra and limit open-source access to ensure quality.

datastax apache cassandra patrick mcfadin

How ScyllaDB Helped an AdTech Company Focus on Core Business

The New Stack Podcast

Play Episode Listen Later Oct 20, 2022 26:51

GumGum is a company whose platform serves up online ads related to the context in which potential customers are already shopping or searching. (For instance: it will send ads for Zurich restaurants to someone who's booked travel to Switzerland.) To handle that granular targeting, it relies on its proprietary machine learning platform, Verity.
“For all of our publishers, we send a list of URLs to Verity,” according to Keith Sader, GumGum's director of engineering. “Verity goes in and basically categorizes those URLs as different [internal bus] categories. So the IB has tons of taxonomies, based on autos, based upon clothing based upon entertainment. And then that's how we do our targeting.”
Verity's targeting data is stored in DynamoDB, but the rest of GumGum's data is stored in managed MySQL and its daily tracking data is stored in ScyllaDB, a database designed for data-intensive applications. Scylla, Sader said, helps his company avoid serving audiences the same ads over and over again, by keeping track of which ads customers have already seen. “That's where Scylla comes into the picture for us,” he said. “Scylla is our rate limiter on ad serving.”
In this episode of The New Stack's Makers podcast, Sader and Dor Laor, CEO and co-founder of Scylla, told how GumGum has used ScyllaDB to shift more IT resources to its core business and keep it from repeating ads to audiences that have already seen them, no matter where they travel. This case study episode of Makers, hosted by Heather Joslyn, TNS features editor, was sponsored by ScyllaDB.
‘Where Do We Spend Our Limited Funds?’
Before adding ScyllaDB to its stack, Sader said, “We had a Cassandra-based system that some very smart people put in. But Cassandra relies upon you to have an engineering staff to support it.
“That's great. But like many types of systems, managing Cassandra databases is not really what our business makes money at.”
GumGum was hosting its Cassandra database, installed on Amazon Web Services, by itself, and the drain on resources brought the company's teams to a crossroads, Sader said. “Where do we spend our limited funds? Do we spend it on Cassandra maintenance? Or do we hire someone to do it for us? And that's really what determined the switch away from a sort of self-installed, self-managed Cassandra to another provider.”
A core issue for GumGum, Sader said, was making sure that it wasn't over-serving consumers, even as they moved around the globe. “If you see an ad in one place, we need to make sure, if you fly across the country, you don't see it again,” he said. That's an issue Cassandra solved for his company, he said. Because ScyllaDB is a drop-in replacement for Apache Cassandra, it also helped prevent over-serving in all regions of the globe, thus preventing GumGum from losing money.
In addition to managing its database for GumGum and other customers, Laor said that an advantage ScyllaDB brings is an “always on” guarantee. “We have a big legacy of infrastructure that's supposed to be resilient,” he said. “For example, every implementation of ours has consistent configurable consistency, so you can have multiple replicas.”
Laor added, “Many many times organizations have multiple data centers. Sometimes it's for disaster recovery, sometimes it's also to shorten the latency and be closer to the client.” Replica databases located in data centers that are geographically distributed, he said, protect against failure in any one data center.
Seeing Results
Bringing ScyllaDB to GumGum was not without challenges, both Sader and Laor said. When ScyllaDB is added to an organization's stack, Laor said, it likes to start with as small a deployment as possible.
“But in the GumGum case, all of these clients were new processes,” Laor said. “So hundreds or thousands of processes, all trying to connect to the database, it's really a connection storm.”
Scylla's team created a private version of its database to work on the problem and eventually solved it: “We had to massage the algorithm and make sure that all of the [open source] code committers upstream are summing it up.” It ultimately designed an admission control mechanism that measures the number of parallel requests that the distributed database is handling and slows down requests that arrive for the first time from a new process. “We tried to have the complexity on our end,” Laor said.
GumGum has seen the results of handing off that complexity and toil to a managed database. “We have pretty much reduced our entire operations effort with Scylla, to almost nothing,” Sader said. He added, “We're coming into our busy point of the year, ads really get picked up in Q4. So we reach out so we go, ‘Hey, we need more nodes in these regions, can you make that happen for us?’ They go, ‘Yep.’ Give us the things, we pay the money. And it happens.”
In 2021, Sader said, “we increased our volume by probably 75% plus 50%, over our standard. The toughest thing to do in this industry is make things look easy. And Scylla helped us make ad serving look easy.”
Check out the podcast to get more detail about GumGum's move to a managed database.
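The admission control idea described in the episode summary (track how many requests are being handled in parallel, and slow down requests that arrive for the first time from a new process) can be sketched roughly as below. This is only an illustration of the general pattern, not ScyllaDB's implementation: the class name `AdmissionController`, the parameters `max_in_flight` and `newcomer_delay`, and the fixed-delay policy for first-time clients are all assumptions made for the example.

```python
import threading
import time


class AdmissionController:
    """Toy admission control: cap parallel requests and delay first-time clients.

    Hypothetical sketch; the names and the fixed-delay policy are assumptions.
    """

    def __init__(self, max_in_flight, newcomer_delay=0.05):
        # Semaphore slots bound how many requests run at the same time.
        self._slots = threading.BoundedSemaphore(max_in_flight)
        self._seen = set()              # client ids that have connected before
        self._lock = threading.Lock()   # protects self._seen
        self.newcomer_delay = newcomer_delay

    def _is_newcomer(self, client_id):
        """Return True only the first time a given client id is seen."""
        with self._lock:
            if client_id in self._seen:
                return False
            self._seen.add(client_id)
            return True

    def handle(self, client_id, request):
        """Run request() under the concurrency cap; pause new clients first."""
        if self._is_newcomer(client_id):
            # Spread out a "connection storm" of brand-new processes.
            time.sleep(self.newcomer_delay)
        with self._slots:               # blocks while the cap is reached
            return request()
```

A caller would wrap each incoming request, for example `controller.handle("process-17", run_query)`, where `run_query` is any zero-argument callable. A real system would more likely adapt the delay to the measured load instead of using a constant.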

Apache Cassandra Masterclass with Patrick McFadin

The GeekNarrator

Play Episode Listen Later Sep 1, 2022 76:11

Hey Everyone, In this episode I invited Patrick McFadin, who is an expert in the world of Cassandra and Data Modelling. Patrick currently works for DataStax as a VP of DevRel. Patrick has given several tech talks on Cassandra and the ecosystem around it. We have covered the architecture of Cassandra in depth. Here's what we have covered:
00:00 Introduction
04:00 History of Cassandra
07:18 Patrick Apache Cassandra?
14:30 How writes work in Cassandra?
21:30 How many copies are written on a single write?
25:44 How does replication work?
32:00 How do reads work? (Read consistency levels)
39:00 Why is Allow Filtering not recommended?
43:00 Data Modelling in Cassandra
50:45 Modeling a Chat Application
01:05:00 How does CAP theorem fit Cassandra?
01:07:06 New features in Cassandra?
References:
Patrick McFadin: https://www.linkedin.com/in/patrick-m...
Kaivalya Apte: https://www.linkedin.com/in/kaivalya-...
Astra: astra.datastax.com
Cassandra: https://cassandra.apache.org/_/index....
Webinar on Data Modeling: https://www.youtube.com/watch?v=4D39w...
Playlist on Distributed Systems and Databases: https://www.youtube.com/playlist?list...
I hope you enjoyed our discussion and learned from it. Please like, share and subscribe to the channel and keep supporting. Cheers, The GeekNarrator

history cheers masterclass playlist cap webinars modeling databases distributed systems datastax data modeling apache cassandra patrick mcfadin

Apache Cassandra: O Banco de Dados NoSQL de Missão Crítica e Tempo-Real da Fortune 500

Engenharia de Dados [Cast]

Play Episode Listen Later Aug 12, 2022 63:01

We welcome back specialist Samuel Matioli to talk about the Fortune 500's favorite columnar database. Apache Cassandra is the database used by large companies such as Uber, Facebook, Netflix, Instagram, Spotify and Instacart. In this chat about NoSQL databases we cover the following topics:
- Growth in NoSQL adoption in the market
- Difference between HBase and Apache Cassandra
- What Apache Cassandra is
- Deployment types and usage options
- Use cases
- Which problems Apache Cassandra solves
Apache Cassandra = https://cassandra.apache.org/
Samuel Matioli = https://www.linkedin.com/in/samuelmatioli/
On YouTube we have a Data Engineering channel covering the most important topics in this area, with live streams every Wednesday. https://www.youtube.com/channel/UCnErAicaumKqIo4sanLo7vQ
Want to stay on top of this area with weekly posts and updates? Then head to LinkedIn so you don't miss any news. https://www.linkedin.com/in/luanmoreno/
Available on Spotify and Apple Podcasts
https://open.spotify.com/show/5n9mOmAcjra9KbhKYpOMqY
https://podcasts.apple.com/br/podcast/engenharia-de-dados-cast/
Luan Moreno = https://www.linkedin.com/in/luanmoreno/

Consistent Hashing | The Backend Engineering Show

IGeometry

Play Episode Listen Later Aug 6, 2022 24:42

In this episode of the Backend Engineering Show I discuss consistent hashing, a very important algorithm in distributed computing, especially in database systems such as Apache Cassandra and DynamoDB.
0:00 Intro
2:00 Problem of Distributed Systems
5:00 When to Distribute
7:00 Simple Hashing
9:30 Where Simple Hashing Breaks
11:40 Consistent Hashing
18:00 Adding a Server
21:15 Removing a Server
22:30 Limitations
Support this podcast: https://anchor.fm/hnasr/support
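For readers who have not met the technique before, a minimal sketch of a consistent-hash ring follows, covering the lookup, "adding a server" and "removing a server" steps listed above. It is not taken from the episode: the `HashRing` class, the `vnodes` parameter (virtual nodes per server), and the use of MD5 as the ring hash are assumptions chosen to keep the example short, and production systems use their own partitioners.

```python
import bisect
import hashlib


def ring_hash(key: str) -> int:
    """Map a string to a 64-bit position on the ring (MD5 is an arbitrary choice)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    """Toy consistent-hash ring with virtual nodes (hypothetical helper class)."""

    def __init__(self, servers=(), vnodes=100):
        self.vnodes = vnodes
        self._positions = []   # sorted ring positions
        self._owner = {}       # ring position -> server name
        for server in servers:
            self.add(server)

    def add(self, server: str) -> None:
        """Place `vnodes` points for the server on the ring."""
        for i in range(self.vnodes):
            pos = ring_hash(f"{server}#{i}")
            if pos in self._owner:
                continue       # position already taken (vanishingly unlikely)
            bisect.insort(self._positions, pos)
            self._owner[pos] = server

    def remove(self, server: str) -> None:
        """Drop every point that belongs to the server."""
        self._positions = [p for p in self._positions if self._owner[p] != server]
        self._owner = {p: s for p, s in self._owner.items() if s != server}

    def lookup(self, key: str) -> str:
        """Return the server owning the first point clockwise from the key."""
        if not self._positions:
            raise LookupError("ring is empty")
        idx = bisect.bisect_right(self._positions, ring_hash(key))
        return self._owner[self._positions[idx % len(self._positions)]]
```

With a ring such as `HashRing(["a", "b", "c"])`, calling `add("d")` only reassigns the keys that now fall just before one of the new server's points; every other key keeps its owner. Simple `hash(key) % N` placement, by contrast, remaps most keys whenever `N` changes.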

engineering consistent backend hashing dynamodb apache cassandra

PostgreSQL und MariaDB

Python Podcast

Play Episode Listen Later Jun 14, 2022 163:49

More than three years ago we already did an episode about databases. Since that was a while back, we thought it might be time to talk about the topic again. This time we (Dominik and Jochen) sat down with Susanne, who has been offering consulting and training on the subject for many years. The old database episode was our longest episode so far, and somehow this one has also turned out longer than usual. Apparently there is more to say about databases than about other topics.

news development thema mac billion consulting released dazu beta unterschiede python kommentare anregungen szene sql array jochen cypher mysql graphql offenbar schulungen geoffrey hinton nosql postgresql lizenzen elasticsearch postgres datenbanken neo4j mariadb numpy apache cassandra postgis kombinatorik mengenlehre europython postgressql

DoK Talks #135 - DoK isn't just Database on Kubernetes // Patrick McFadin

Data on Kubernetes Community

Play Episode Listen Later Jun 10, 2022 46:00

https://go.dok.community/slack
https://dok.community
ABSTRACT OF THE TALK
What about your streaming and analytic workloads? If you are all-in on Kubernetes you can't forget about these important parts of your infrastructure. I'll talk about the current state of the art, why organizations may hesitate to go beyond deploying databases in Kubernetes and, most importantly, some key things you need to be successful.
BIO
Patrick McFadin is the co-author of the upcoming O'Reilly book “Managing Cloud-Native Data on Kubernetes”. He currently works at DataStax in Developer Relations and as a contributor to the Apache Cassandra project. Patrick has worked as Chief Evangelist for Apache Cassandra and as a consultant for DataStax, where he had a great time building some of the largest deployments in production. Prior to DataStax, he held positions as Chief Architect, Engineering Lead and Database DBA/Developer.
KEY TAKE-AWAYS FROM THE TALK
People should walk away with a better understanding of what it takes to deploy streaming and analytic workloads in Kubernetes.

previous databases kubernetes chief evangelist chief architect developer relations datastax engineering lead apache cassandra patrick mcfadin

Bringing Apache Cassandra closer to Kubernetes (DoK Day EU 2022) // Jake Luciani

Data on Kubernetes Community

Play Episode Listen Later May 27, 2022 9:41

https://go.dok.community/slack
https://dok.community/
From the DoK Day EU 2022 (https://youtu.be/Xi-h4XNd5tE)
What does Kubernetes provide that allows us to reduce the complexity of Apache Cassandra while making it better suited for cloud native deployments? That was the question we started with as we began a mission to bring Cassandra closer to Kubernetes and eliminate the redundancy. Many great open source databases have been adapted to run on Kubernetes, without relying on the deep ecosystem of projects that it takes to run in Kubernetes (there is a difference). This talk will discuss the design and implementation of the Astra Serverless Database, which re-architected Apache Cassandra to run only on Kubernetes infrastructure. Built to be optimized for multi-tenancy and auto-scaling, we set out with a design goal to completely separate compute and storage. Decoupling different aspects of Cassandra into scalable services and relying on the benefits of Kubernetes and its ecosystem created a simpler, more powerful database service than a standalone, bare-metal Cassandra cluster. The entire system is now built on Apache Cassandra, Stargate, etcd, Prometheus, and object storage like MinIO or Ceph. In this talk we will discuss the downstream changes coming to several open source projects based on the work we have done.
Jake is a lead developer and software architect at DataStax with over 20 years of experience in the areas of distributed systems, finance, and manufacturing. He is a member of the Apache Foundation and is on the project committee of the Apache Cassandra, Arrow, and Thrift projects. Jake has a reputation for developing creative solutions to solve difficult problems and fostering a culture of trust and innovation. He believes the best software is built by small diverse teams who are encouraged to think freely. Jake received his B.S. in Computer Science from Lehigh University along with a minor in Cognitive Science.

european union built closer computer science arrow prometheus stargate kubernetes cognitive science thrift lehigh university decoupling luciani datastax ceph minio apache cassandra apache foundation etcd

Tech with project RapGOD (DoK Day EU 2022) // Abhijith Ganesh

Data on Kubernetes Community

Play Episode Listen Later May 27, 2022 8:50

https://go.dok.community/slack
https://dok.community/
From the DoK Day EU 2022 (https://youtu.be/Xi-h4XNd5tE)
The Rap God project acts as a great entry point for incoming open-source enthusiasts who are interested in learning about the cloud native ecosystem. It uses Kubernetes orchestration for a stateful use case, which is an emerging topic, and acts as a demonstration of how to use such features of Kubernetes. The project will use StatefulSets to deploy Apache Cassandra (for its first cycle), and eventually it will implement the same API endpoints for various databases that run on Kubernetes. We in the community intend to do this with PersistentVolumes and PersistentVolumeClaims. Keeping in mind the issues various developers face, we will also provide options for storage classes. The project will allow members to explore how they can customize the whole storage class setup according to their environment. The project will bring Helm, Cassandra, Kubernetes and Argo under its watch and will actively expand its implementation in further iterations.
Abhijith Ganesh is an undergraduate computer science major, currently in his freshman year. His areas of interest include DevOps, Kubernetes and open source projects. He is an active member of the DoK Community, where he is currently an intern. He is also a member of the Pyrsia and SeaQL communities.

project tech european union api freshman helm devops argo kubernetes ganesh open source projects rap god apache cassandra stateful

Cassandra on Kubernetes using K8ssandra

Kubernetes Bytes

Play Episode Listen Later Apr 14, 2022 50:07

In this episode, Ryan and Bhavin interview Patrick McFadin, VP of Developer Relations at DataStax, who is a co-author of the upcoming O'Reilly book “Managing Cloud-Native Data on Kubernetes” and a contributor to the Apache Cassandra project. The discussion dives into how K8ssandra helps users deploy Cassandra on Kubernetes clusters, and how customers are using Cassandra as the distributed NoSQL database backend for their applications. We talk about the challenges, benefits, and best practices for running Cassandra on Kubernetes, and what users can look forward to in the near future. Show links: Patrick McFadin - LinkedIn - Twitter K8ssandra.io - https://k8ssandra.io Introduction to Cassandra - Crash Course - YouTube series - https://youtube.com/playlist?list=PL2g2h-wyI4SqCdxdiyi8enEyWvACcUa9R AWS Marketplace - https://aws.amazon.com/marketplace/pp/prodview-iy7gagaxm2foa Cassandra Discord community - https://discord.com/invite/qP5tAt6Uwt Data On Kubernetes - https://www.meetup.com/Data-on-Kubernetes-community/events/ Managing Cloud-Native Data on Kubernetes - https://portworx.com/resource/ebook-managing-cloud-native-data-on-kubernetes/ Cloud-Native News: Docker raises Series-C funding Garden.io raises Series A - $16M funding to combat waste in cloud development Are you Ready for K8s 1.24 NetApp acquires InstaClustr Spring4Shell - Zero Day Remote Code Execution Vulnerability Portworx Enterprise 2.10 Etcd v3.5.[0-2] is not recommended for production Announcing Postgres container apps: Easy deploy Postgres apps

data gardens kubernetes series c netapp developer relations nosql postgres k8s datastax apache cassandra bhavin patrick mcfadin etcd

Intro to distributed databases on Kubernetes

Kubernetes Bytes

Play Episode Listen Later Jan 20, 2022 37:00

In today's episode of Kubernetes Bytes, hosts Ryan Wallner and Bhavin Shah discuss the basics of running distributed databases like Apache Cassandra and Kafka, along with Mongo, CockroachDB and others, on Kubernetes. There are various capabilities of Kubernetes that were designed for these types of data services, and this podcast should help you get a basic understanding of the landscape as well as WHY you may want to run them on Kubernetes. Show Links: https://thenewstack.io/new-tools-for-optimizing-data-resilience-in-kubernetes/ https://awesome-kubernetes.readthedocs.io/ https://nubenetes.com/ https://www.containiq.com/post/should-you-run-a-database-on-kubernetes Log4j recap - https://blog.aquasec.com/log4j-vulnerabilities-overview IPv6 support for EKS - https://aws.amazon.com/blogs/aws/amazon-elastic-kubernetes-service-adds-ipv6-networking/ https://thenewstack.io/testkube-a-new-approach-to-cloud-native-testing/ GigaOM DP report 2 https://gigaom.com/report/gigaom-radar-for-kubernetes-data-protection-2/ https://portworx.com/blog/kubernetes-failover-mongodb/ https://thenewstack.io/the-perfect-pair-kubernetes-and-distributed-sql/ https://www.purestorage.com/docs.html?item=/type/pdf/subtype/doc/path/content/dam/pdf/en/white-papers/wp-kafka-on-kubernetes-with-portworx.pdf https://docs.datastax.com/en/cassandra-oss/3.0/cassandra/dml/dmlAboutDataConsistency.html https://developer.ibm.com/tutorials/ba-multi-data-center-cassandra-cluster-kubernetes-platform/ https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/

databases kafka distributed kubernetes mongo ipv6 eks cockroachdb apache cassandra

What's new in October 2021

Melbourne AWS User Group

Play Episode Listen Later Jan 17, 2022 69:58

A lot of things happened in October, and we talked about them all in early November. In this episode Arjen, Guy, and JM discuss a whole bunch of cool things that were released and may be a bit harsh on everything Microsoft. News Finally in Sydney Amazon EC2 Mac instances are now available in seven additional AWS Regions Amazon MemoryDB for Redis is now available in 11 additional AWS Regions Serverless Lambda AWS Lambda now supports triggering Lambda functions from an Amazon SQS queue in a different account AWS Lambda now supports IAM authentication for Amazon MSK as an event source Step Functions Now — AWS Step Functions Supports 200 AWS Services To Enable Easier Workflow Automation | AWS News Blog AWS Batch adds console support for visualizing AWS Step Functions workflows Amplify Announcing General Availability of Amplify Geo for AWS Amplify AWS Amplify for JavaScript now supports resumable file uploads for Storage Other Accelerating serverless development with AWS SAM Accelerate | AWS Compute Blog Containers Amazon EKS Managed Node Groups adds native support for Bottlerocket AWS Fargate now supports Amazon ECS Windows containers Announcing the general availability of cdk8s and support for Go | Containers Monitoring clock accuracy on AWS Fargate with Amazon ECS Amazon ECS Anywhere now supports GPU-based workloads AWS Console Mobile Application adds support for Amazon Elastic Container Service AWS Load Balancer Controller version 2.3 now available with support for ALB IPv6 targets AWS App Mesh Metric Extension is now generally available EC2 & VPC New – Amazon EC2 C6i Instances Powered by the Latest Generation Intel Xeon Scalable Processors | AWS News Blog Amazon EC2 now supports sharing Amazon Machine Images across AWS Organizations and Organizational Units Amazon EC2 Hibernation adds support for Ubuntu 20.04 LTS Announcing Amazon EC2 Capacity Reservation Fleet a way to easily migrate Amazon EC2 Capacity Reservations across instance types Amazon EC2 Auto Scaling 
now supports describing Auto Scaling groups using tags Amazon EC2 now offers Microsoft SQL Server on Microsoft Windows Server 2022 AMIs AWS Elastic Beanstalk supports Database Decoupling in an Elastic Beanstalk Environment AWS FPGA developer kit now supports Jumbo frames in virtual ethernet frameworks for Amazon EC2 F1 instances Amazon VPC Flow Logs now supports Apache Parquet, Hive-compatible prefixes and Hourly partitioned files Network Load Balancer now supports TLS 1.3 New – Attribute-Based Instance Type Selection for EC2 Auto Scaling and EC2 Fleet | AWS News Blog Amazon Lightsail now supports AWS CloudFormation for instances, disks and databases Dev & Ops CLI AWS Cloud Control API, a Uniform API to Access AWS & Third-Party Services | AWS News Blog Now programmatically manage alternate contacts on AWS accounts CodeGuru Amazon CodeGuru now includes recommendations powered by Infer Amazon CodeGuru announces Security detectors for Python applications and security analysis powered by Bandit Amazon CodeGuru Reviewer adds detectors for AWS Java SDK v2's best practices and features IaC AWS CDK releases v1.121.0 - v1.125.0 with features for faster development cycles using hotswap deployments and rollback control AWS CloudFormation customers can now manage their applications in AWS Systems Manager Other NoSQL Workbench for Amazon DynamoDB now enables you to import and automatically populate sample data to help build and visualize your data models Amazon Corretto October Quarterly Updates Bulk Editing of OpsItems in AWS Systems Manager OpsCenter AWS Fault Injection Simulator now supports Spot Interruptions AWS Fault Injection Simulator now injects Spot Instance Interruptions Security Firewalls AWS Firewall Manager now supports centralized logging of AWS Network Firewall logs AWS Network Firewall Adds New Configuration Options for Rule Ordering and Default Drop Backups AWS Backup Audit Manager adds compliance reports AWS Backup adds an additional layer for backup 
protection with the availability of AWS Backup Vault Lock Other AWS Security Hub adds support for cross-Region aggregation of findings to simplify how you evaluate and improve your AWS security posture Amazon SES now supports 2048-bit DKIM keys AWS License Manager now supports Delegated Administrator for Managed entitlements Data Storage & Processing Goodbye Microsoft SQL Server, Hello Babelfish | AWS News Blog Announcing availability of the Babelfish for PostgreSQL open source project Announcing Amazon RDS Custom for Oracle AWS announces AWS Snowcone SSD Amazon RDS Proxy now supports Amazon RDS for MySQL Version 8.0 Amazon OpenSearch Service (successor to Amazon Elasticsearch Service) announces support for Cross-Cluster Replication Amazon OpenSearch Service (successor to Amazon Elasticsearch Service) now comes with an improved management console AWS Transfer Family customers can now use Amazon S3 Access Point aliases for granular and simplified data access controls Amazon EMR now supports Apache Spark SQL to insert data into and update Apache Hive metadata tables when Apache Ranger integration is enabled Amazon Neptune now supports Auto Scaling for Read Replicas AWS Glue Crawlers support Amazon S3 event notifications Amazon Keyspaces (for Apache Cassandra) now supports automatic data expiration by using Time to Live (TTL) settings New – AWS Data Exchange for Amazon Redshift | AWS News Blog AI & ML SageMaker Announcing Fast File Mode for Amazon SageMaker Amazon SageMaker Projects now supports Image Building CI/CD templates Amazon SageMaker Data Wrangler now supports Amazon Athena Workgroups, feature correlation, and customer managed keys Other Amazon Kendra launches support for 34 additional languages Amazon Fraud Detector now supports event datasets AWS announces a price reduction of up to 56% for Amazon Fraud Detector machine learning fraud predictions Amazon Fraud Detector launches new ML model for online transaction fraud detection Amazon Transcribe now 
supports custom language models for streaming transcription Amazon Textract launches TIFF support and adds asynchronous support for receipts and invoices processing Announcing Amazon EC2 DL1 instances for cost efficient training of deep learning models Other Cool Stuff AWS IoT Core now makes it optional for customers to send the entire trust chain when provisioning devices using Just-in-Time Provisioning and Just-in-Time Registration AWS IoT SiteWise announces support for using the same asset models across different hierarchies VMware Cloud on AWS Outposts Brings VMware SDDC as a Fully Managed Service on Premises | AWS News Blog AWS Outposts adds new CloudWatch dimension for capacity monitoring Amazon Monitron launches iOS app Amazon Braket offers D-Wave's Advantage 4.1 system for quantum annealing Amazon QuickSight adds support for Pixel-Perfect dashboards Amazon WorkMail adds Mobile Device Access Override API and MDM integration capabilities Announcing Amazon WorkSpaces API to create new updated images with latest AWS drivers Computer Vision at the Edge with AWS Panorama | AWS News Blog Amazon Connect launches API to configure hours of operation programmatically New region availability and Graviton2 support now available for Amazon GameLift Sponsors CMD Solutions Silver Sponsors Cevo Versent

time microsoft security advantage i am region api python aws ml hive javascript ubuntu gpu jumbo jm lambda tls mdm redis arjen postgresql amazon s3 d wave babel fish microsoft sql server apache cassandra auto scaling amazon rds aws fargate cloudwatch aws cloudformation amazon dynamodb microsoft windows server aws organizations amazon sqs apache hive amazon elasticsearch service amazon msk

Yugabyte CEO Karthik Ranganathan

A Bootiful Podcast

Play Episode Listen Later Dec 23, 2021 64:45

Hi, Spring fans! Welcome to another installment of a _Bootiful Podcast_! How are you doing? In this episode, we've got an extra special holiday treat for you! [Josh Long (@starbuxman)](https://twitter.com/starbuxman) talks to [Yugabyte](https://twitter.com/Yugabyte) CEO and Apache Cassandra and Apache HBase co-founder [Karthik Ranganathan (@karthikr)](https://twitter.com/karthikr). Merry Christmas (if you celebrate!)

spring merry christmas karthik ranganathan josh long apache cassandra yugabyte

DataStax and the Startup Mentality with Jonathan Ellis

The Business of Open Source

Play Episode Listen Later Dec 22, 2021 29:46

Jonathan Ellis, CTO and co-founder of DataStax, has always had a startup mindset. In this episode, Jonathan joins me to discuss his journey and entrepreneurial roadmap thus far. In our conversation, Jonathan shares how he became involved with the Apache Cassandra project and his transition to founding DataStax. He also shares insight on the importance of hiring a go-to-market team, why hiring executives proves to be more challenging than hiring engineers, building a company based around an open-source project, and more. Highlights: Jonathan's views on his identity as a founder and scratching his coding itch through art. (00:23) A look at Jonathan's journey from Mozy to the Apache Cassandra project. (05:40) The history of DataStax - and Jonathan explores the benefits of building a company around open source. (11:33) Lessons learned: the importance of implementing a go-to-market team, DataStax Kubernetes adoption, and why hiring executives is a challenge. (15:58) Jonathan's advice to technical founders - and his perspective and insight on remote work. (27:39) Links: Jonathan LinkedIn: https://www.linkedin.com/in/jbellis/ Twitter: https://twitter.com/spyced DataStax: https://www.datastax.com/

business startups cto mentality real world kubernetes cloud native datastax apache cassandra mozy jonathan ellis

Building Distributed Cognition into Your Business with Sam Ramji

Screaming in the Cloud

Play Episode Listen Later Dec 9, 2021 39:56

About Sam
A 25-year veteran of the Silicon Valley and Seattle technology scenes, Sam Ramji led Kubernetes and DevOps product management for Google Cloud, founded the Cloud Foundry Foundation, has helped build two multi-billion dollar markets (API Management at Apigee and Enterprise Service Bus at BEA Systems) and redefined Microsoft's open source and Linux strategy from “extinguish” to “embrace”. He is nerdy about open source, platform economics, middleware, and cloud computing with emphasis on developer experience and enterprise software. He is an advisor to multiple companies including Dell Technologies, Accenture, Observable, Fletch, Orbit, OSS Capital, and the Linux Foundation. Sam received his B.S. in Cognitive Science from UC San Diego, the home of transdisciplinary innovation, in 1994 and is still excited about artificial intelligence, neuroscience, and cognitive psychology.
Links: DataStax: https://www.datastax.com Sam Ramji Twitter: https://twitter.com/sramji Open||Source||Data: https://www.datastax.com/resources/podcast/open-source-data Screaming in the Cloud Episode 243 with Craig McLuckie: https://www.lastweekinaws.com/podcast/screaming-in-the-cloud/innovating-in-the-cloud-with-craig-mcluckie/ Screaming in the Cloud Episode 261 with Jason Warner: https://www.lastweekinaws.com/podcast/screaming-in-the-cloud/what-github-can-give-to-microsoft-with-jason-warner/
Transcript
Announcer: Hello, and welcome to Screaming in the Cloud with your host, Chief Cloud Economist at The Duckbill Group, Corey Quinn. This weekly show features conversations with people doing interesting work in the world of cloud, thoughtful commentary on the state of the technical world, and ridiculous titles for which Corey refuses to apologize. This is Screaming in the Cloud.
Corey: This episode is sponsored in part by our friends at Redis, the company behind the incredibly popular open source database that is not the bind DNS server. 
If you're tired of managing open source Redis on your own, or you're using one of the vanilla cloud caching services, these folks have you covered with the go-to managed Redis service for global caching and primary database capabilities: Redis Enterprise. Set up a meeting with a Redis expert during re:Invent, and you'll not only learn how you can become a Redis hero, but also have a chance to win some fun and exciting prizes. To learn more and deploy not only a cache but a single operational data platform for one Redis experience, visit redis.com/hero. That's r-e-d-i-s.com/hero. And my thanks to my friends at Redis for sponsoring my ridiculous nonsense.
Corey: Are you building cloud applications with a distributed team? Check out Teleport, an open source identity-aware access proxy for cloud resources. Teleport provides secure access to anything running somewhere behind NAT: SSH servers, Kubernetes clusters, internal web apps and databases. Teleport gives engineers superpowers! Get access to everything via single sign-on with multi-factor. List and see all SSH servers, Kubernetes clusters or databases available to you. Get instant access to them all using tools you already have. Teleport ensures best security practices like role-based access, preventing data exfiltration, providing visibility and ensuring compliance. And best of all, Teleport is open source and a pleasure to use. Download Teleport at https://goteleport.com. That's goteleport.com.
Corey: Welcome to Screaming in the Cloud, I'm Cloud Economist Corey Quinn, and a recurring effort that this show goes to is to showcase people in their best light. 
Today's guest has done an awful lot: he led Kubernetes and DevOps Product Management for Google Cloud; he founded the Cloud Foundry Foundation; he set open-source strategy for Microsoft in the naughts; he advises companies including Dell, Accenture, the Linux Foundation; and tying all of that together, it's hard to present a lot of that in a great light because given my own proclivities, that sounds an awful lot like a personal attack. Sam Ramji is the Chief Strategy Officer at DataStax. Sam, thank you for joining me, and it's weird when your resume starts to read like, “Oh, I hate all of these things.”
Sam: [laugh]. It's weird, but it's true. And it's the only life I could have lived apparently because here I am. Corey, it's a thrill to meet you. I've been an admirer of your public speaking, and public tweeting, and your writing for a long time.
Corey: Well, thank you. The hard part is getting over the voice saying don't do it because it turns out that there's no real other side of public shutting up, which is something that I was never good at anyway, so I figured I'd lean into it. And again, I mean that in the sense of where you have been historically in terms of your career, not, “Look what you've done,” which is a subtext that I could be accused of throwing in sometimes.
Sam: I used to hear that a lot from my parents, actually.
Corey: Oh, yeah. That was my name growing up. But you've done a lot of things, and you've transitioned from notable company making significant impact on the industry, to the next one, to the next one. And you've been in high-flying roles, doing lots of really interesting stuff. What's the common thread between all those things?
Sam: I'm an intensely curious person, and the thing that I'm most curious about is distributed cognition. And that might not be obvious from what you see as kind of the… Lego blocks of my career, but I studied cognitive science in college when that was not really something that was super well known. 
So, I graduated from UC San Diego in '94 doing neuroscience, artificial intelligence, and psychology, because I just couldn't stop thinking about thinking; I was just fascinated with how it worked. So, then I wanted to build software systems that would help people learn. And then I wanted to build distributed software systems. And then I wanted to learn how to work with people who were thinking about building the distributed software systems. So, you end up kind of going up this curve of, like, complexity about how do we think? How do we think alone? How do we learn to think? How do we think together? And that's the directed path through my software engineering career, into management, into middleware at BEA, into open-source at Microsoft because that's an amazing demonstration of distributed cognition, how, you know, at the time in 2007, I think, SourceForge had 100,000 open-source projects, which was, like, mind-boggling. Some of them even worked together, but all of them represented these groups of people, flung around the world, collaborating on something that was just fundamentally useful, that they were curious about. Kind of did the same thing into APIs because APIs are an even better way to reuse for some cases than having the source code—at Apigee. And kept growing up through that into, how are we building larger-scale thinking systems like Cloud Foundry, which took me into Google and Kubernetes, and then some applications of that in Autodesk and now DataStax. So, I love building companies. I love helping people build companies because I think business is distributed cognition. So, those businesses that build distributed systems, for me, are the most fascinating.
Corey: You were basically handed a heck of a challenge as far as, “Well, help set open-source strategy,” back at Microsoft, in the days where that was a punchline. 
And credit where due, I have to look at Microsoft of today, and it's not a joke. You can have your arguments about them, but again, in those days, a lot of us built our entire personality on hating Microsoft. Some folks never quite evolved beyond that, but it's a new ballgame and it's very clear that the Microsoft of yesteryear and the Microsoft of today are not completely congruent. What was it like at that point, understanding that as you're working with open-source communities, you're doing that from a place of employment with a company that was widely reviled in the space?
Sam: It was not lost on me. The irony, of course, was that—
Corey: Well, thank God because otherwise the answer would have been, “What do you mean they didn't like us?”
Sam: [laugh].
Corey: Which, on some levels, like, yeah, that's about the level of awareness I would have expected in that era, but contrary to popular opinion, execs at these companies are not generally oblivious.
Sam: Yeah, well, if I'd been clever as a creative humorist, I would have given you that answer instead of my serious answer, but for some reason, my role in life is always to be the straight guy. I used to have Slashdot as my homepage, right? I love when I'd see some conspiracy theory about, you know, Bill Gates dressed up as the Borg, taking over the world. My first startup, actually in '97, was crushed by Microsoft. They copied our product, copied the marketing, and bundled it into Office, so I had lots of reasons to dislike Microsoft. But in 2004, I was recruited into their venture capital team, which I couldn't believe. 
It was really a place that they were like, “Hey, we could do better at helping startups succeed, so we're going to evangelize their success—if they're building with Microsoft technologies—to VCs, to enterprises, we'll help you get your first big enterprise deal.” I was like, “Man, if I had this a few years ago, I might not be working.” So, let's go try to pay it forward. I ended up in open-source by accident. I started going to these conferences on Software as a Service. This is back in 2005 when people were just starting to light up, like, Silicon Valley Forum with, you know, the CEO of Demandware would talk, right? We'd hear all these different ways of building a new business, and they all kept talking about their tech stack was Linux, Apache, MySQL, and PHP. I went to one eight-hour conference, and Microsoft technologies were mentioned for about 12 seconds in two separate chunks. So, six seconds, he was like, “Oh, and also we really like Microsoft SQL Server for our data layer.”
Corey: Oh, Microsoft SQL Server was fantastic. And I know that's a weird thing for people to hear me say, just because I've been renowned recently for using Route 53 as the primary data store for everything that I can. But there was nothing quite like that as far as having multiple write nodes, being able to handle sharding effectively. It was expensive, and you would take a bath on the price come audit time, but people were not rolling it out unaware of those things. This was a trade-off that they were making. Oracle has a similar story with databases. It's yeah, people love to talk smack about Oracle and its business practices for a variety of excellent reasons, at least in the database space that hasn't quite made it to cloud yet—knock on wood—but people weren't deploying it because they thought Oracle was warm and cuddly as a vendor; they did it because they can tolerate the rest of it because their stuff works.
Sam: That's so well said, and people don't give them the credit that's due. 
Like, when they built hypergrowth in their business, like… they had a great product; it really worked. They made it expensive, and they made a lot of money on it, and I think that was why you saw MySQL so successful and why, if you were looking for a spec that worked, that you could talk to through an open driver like ODBC or JDBC or whatever, you could swap to Microsoft SQL Server. But I walked out of that and came back to the VC team and said, “Microsoft has a huge problem. This is a massive market wave that's coming. We're not doing anything in it. They use a little bit of SQL Server, but there's nothing else in your tech stack that they want, or like, or can afford because they don't know if their businesses are going to succeed or not. And they're going to go out of business trying to figure out how much licensing costs they would pay to you in order to consider using your software. They can't even start there. They have to start with open-source. So, if you're going to deal with SaaS, you're going to have to have open-source, and get it right.” So, I worked with some folks in the industry, wrote a ten-page paper, sent it up to Bill Gates for Think Week. Didn't hear much back. Brought a new strategy to the head of developer platform evangelism, Sanjay Parthasarathy, who suggested that the idea of discounting software to zero for startups, with the hope that they would end up doing really well with it in the future as a Software as a Service company, was dead on arrival. Dumb idea; bring it back; that actually became BizSpark, the most popular program in Microsoft partner history. And then about three months later, I got a call from this guy, Bill Hilf. And he said, “Hey, this is Bill Hilf. I do open-source at Microsoft. I work with Bill Gates. He sent me your paper. I really like it. 
Would you consider coming up and having a conversation with me because I want you to think about running open-source technology strategy for the company.” And at this time I'm, like, 33 or 34. And I'm like, “Who me? You've got to be joking.” And he goes, “Oh, and also, you'll be responsible for doing quarterly deep technical briefings with Bill… Gates.” I was like, “You must be kidding.” And so of course I had to check it out. One thing led to another and all of a sudden, with not a lot of history in the open-source community but coming in it with a strategist's eye and with a technologist's eye, saying, “This is a problem we got to solve. How do we get after this pragmatically?” And the rest is history, as they say.
Corey: I have to say that you are the Chief Strategy Officer at DataStax, and I pull up your website quickly here and a lot of what I tell earlier stage companies is effectively more or less what you have already done. You haven't named yourself after the open-source project that underlies the bones of what you have built so you're not going to wind up in the same glorious challenges that, for example, Elastic or MongoDB have in some ways. You have a pricing page that speaks both to the reality of, “It's two in the morning. I'm trying to get something up and running and I want you the hell out of my way. Just give me something that I can work with, a reasonable free tier, and don't make me talk to a salesperson.” But also, your enterprise tier is, “Click here to talk to a human being,” which is speaking enterprise slash procurement slash, oh, there will be contract negotiation on these things. It's being able to serve different ends of your market depending upon who it is that encounters you without being off-putting to any of those. And it's deceptively challenging for companies to pull off or get right. So clearly, you've learned lessons by doing this. That was the big problem with Microsoft for the longest time. 
It's, if I want to use some Microsoft stuff, once you were able to download things from the internet, it changed slightly, but even then it was one of those, “What exactly am I committing to here as far as signing up for this? And am I giving them audit rights into my environment? Is the BSA about to come out of nowhere and hit me with a surprise audit and find out that various folks throughout the company have installed this somewhere and now I owe more than the company's worth?” That was always the haunting fear that companies had back then. These days, I like the approach that companies are taking with the SaaS offering: you pay for usage. On some level, I'd prefer it slightly differently in a pay-per-seat model because at least then you can predict the pricing, but no one is getting surprise submarined with this type of thing on an audit basis, and then they owe damages and payment in arrears and someone has them over a barrel. It's just, “Oh. The bill this month was higher than we expected.” I like that model. I think the industry does, too.
Sam: I think that's super well said. As I used to joke at BEA Systems, nothing says ‘I love you' to a customer like an audit, right? That's kind of a one-time use strategy. If you're going to go audit licenses to get your revenue in place, you might be inducing some churn there. It's a huge fix for the structural problem in pricing that I think packaged software had, right? When we looked at Microsoft software versus open-source software, and particularly Windows versus Linux, you would have a structure where sales reps were really compensated to sell as much as possible upfront so they could get the best possible commission on what might be used perpetually. But then if you think about it, like, the boxes in a curve, right, if you do that calculus approximation of a smooth curve, a perpetual software license is a huge box and there's an enormous amount of waste in there. 
And customers figured that out, so as soon as you can go to a pay-per-use or pay-as-you-go, you start to smooth that curve, and now what you get is what you deserve, right, as opposed to getting filled with way more cost than you expect. So, I think this model is really super well understood now. Kind of the long-run high point of open-source meets cloud meets Software as a Service: you look at what companies like MongoDB, and Confluent, and Elastic, and Databricks are doing. And they've really established a very good path through the jungle of how to succeed as a software company. So, it's still difficult to implement, but there are really world-class guides right now.
Corey: Moving beyond where Microsoft was back in the naughts, you were then hired as a VP over at Google. And in that era, the fact that you were hired as a VP at Google is fascinating. They preferred to grow those internally, generally from engineering. So, first question, when you were being hired as a VP in the product org, did they make you solve algorithms on a whiteboard to get there?
Sam: [laugh]. They did not. I did have somewhat of an advantage because they could see me working pretty closely as the CEO of the Cloud Foundry Foundation. I'd worked closely with Craig McLuckie who notably brought Kubernetes to the world along with Joe Beda, and with Eric Brewer, and a number of others. And he was my champion at Google. He was like, “Look, you know, we need him doing Kubernetes. Let's bring Sam in to do that.” So, that was helpful. I also wrote a [laugh] 2000-word strategy document, just to get some thoughts out of my head. And I said, “Hey, if you like this, great. If you don't, throw it away.” So, the interviews were actually very much not solving problems on a whiteboard. They were super collaborative, really excellent conversations. It was slow—
Corey: Let's be clear, Craig McLuckie's most notable achievement was being a guest on this podcast back in Episode 243. 
But I'll say that this is a close second.

Sam: [laugh]. You're not wrong. And of course now with Heptio and their acquisition by VMware.

Corey: Ehh, they're making money beyond the wildest dreams of avarice, that's all well and good, but an invite to this podcast, that's where it's at.

Sam: Well, he should really come on again, he can double down and beat everybody. That can be his landmark achievement, a two-timer on Screaming in [the] Cloud.

Corey: You were at Google; you were at Microsoft. These are the big titans of their era, in some respect—not to imply that they're has-beens; they're bigger than ever—but it's also a more crowded field in some ways. I guess completing the trifecta would be Amazon, but you've had the good judgment never to work there, directly of course. Now they're clearly in your market. You're at DataStax, which is among other things, built on Apache Cassandra, and they launched their own Cassandra service named Keyspaces because no one really knows why or how they name things.

And of course, looking under the hood at the pricing model, it's pretty clear that it really is just DynamoDB wearing some Groucho Marx glasses with a slight upcharge for API level compatibility. Great. So, I don't see it a lot in the real world and that's fine, but I'm curious as to your take on looking at all three of those companies at different eras. There was always the threat in the open-source world that they are going to come in and crush you. You said earlier that Microsoft crushed your first startup.

Google is an interesting competitor in some respects; people don't really have that concern about them. And your job as a Chief Strategy Officer at Amazon is taken over by a Post-it Note that simply says ‘yes' on it because there's nothing they're not going to do, or try, and experiment with.
So, from your perspective, if you look at the titans, who is it that you see as the largest competitive threat these days, if that's even a thing?

Sam: If you think about Sun Tzu and the Art of War, right—a lot of strategy comes from what we've learned from military environments—fighting a symmetric war, right, using the same weapons and the same army against a symmetric opponent, but having 1/100th of the personnel and 1/100th of the money is not a good plan.

Corey: “We're going to lose money, going to be outcompeted; we'll make it up in volume. Oh, by the way, we're also slower than they are.”

Sam: [laugh]. So, you know, trying to come after AWS, or Microsoft, or Google as an independent software company, pound-for-pound, face-to-face, right, full-frontal assault is psychotic. What you have to do, I think, at this point is to understand that these are each companies that are much like we thought about Linux, and you know, Macintosh, and Windows as operating systems. They're now the operating systems of the planet. So, that creates some economies of scale, some efficiencies for them. And for us. Look at how cheap object storage is now, right? So, there's never been a better time in human history to create a database company because we can take the storage out of the database and hand it over to Amazon, or Google, or Microsoft to handle it with 13 nines of durability on a constantly falling cost basis.

So, that's super interesting. So, you have to prosecute the structure of the world as it is, based on where the giants are and where they'll be in the future. Then you have to turn around and say, like, “What can they never sell?”

So, Amazon can never sell something that is standalone, right?
They're a parts factory and if you buy into the Amazon-first strategy of cloud computing—which we did at Autodesk when I was VP of cloud platform there—everything is a primitive that works inside Amazon, but they're not going to build things that don't work outside of the Amazon primitives. So, your company has to be built on the idea that there's a set of people who value something that is purpose-built for a particular use case that you can start to broaden out. It's really helpful if they would like it to be something that can help them escape a really valuable asset away from the center of gravity that is a cloud. And that's why data is super interesting. Nobody wakes up in the morning and says, “Boy, I had such a great conversation with Oracle over the last 20 years beating me up on licensing. Let me go find a cloud vendor and dump all of my data in that so they can beat me up for the next 20 years.” Nobody says that.

Corey: It's the idea of data portability that drives decision-making, which makes people, of course, feel better about not actually moving in anywhere. But the fact that they're not locked in strategically, in a way that requires a full software re-architecture and data model rewrite is compelling. I'm a big believer in convincing people to make decisions that look a lot like that.

Sam: Right. And so that's the key, right? So, when I was at Autodesk, we went from our 100 million dollar, you know, committed spend with 19% discount on the big three services to, like—we started to realize when we were going to burn through that, we were spending $60 million or so a year on 20% annual growth as the cloud part of the business grew. Thought, “Okay, let's renegotiate. Let's go and do a $250 million deal. I'm sure they'll give us a much better discount than 19%.” Short story is they came back and said, “You know, we're going to take you from an already generous 19% to an outstanding 22%.” We thought, “Wait a minute, we already talked to Intuit.
They're getting a 40% discount on a $400 million spend.”

So, you know, math is hard, but, like, 40% minus 22% is 18% times $250 million is a lot of money. So, we thought, “What is going on here?” And we realized we just had no credible threat of leaving, and Intuit did because they had built a cross-cloud capable architecture. And we had not. So, now stepping back into the kind of the world that we're living in 2021, if you're an independent software company, especially if you have the unreasonable advantage of being an open-source software company, you have got to be doing your customers good by giving them cross-cloud capability. It could be simply like the Amdahl coffee cup that Amdahl reps used to put as landmines for the IBM reps, later—I can tell you that story if you want—even if it's only a way to save money for your customer by using your software, when it gets up to tens and hundreds of millions of dollars, that's a really big deal.

But they also know that data is super important, so the option value of being able to move if they have to, that they have to be able to pull that stick, instead of saying, “Nice doggy,” we have to be on their side, right? So, there's almost a detente that we have to create now, as cloud vendors, working in a world that's invented and operated by the giants.

Corey: This episode is sponsored by our friends at Oracle. HeatWave is a new high-performance accelerator for the Oracle MySQL Database Service. Although I insist on calling it “my squirrel.” While MySQL has long been the world's most popular open source database, shifting from transacting to analytics required way too much overhead and, ya know, work. With HeatWave you can run your OLTP and OLAP, don't ask me to ever say those acronyms again, workloads directly from your MySQL database and eliminate the time-consuming data movement and integration work, while also performing 1100X faster than Amazon Aurora, and 2.5X faster than Amazon Redshift, at a third of the cost.
My thanks again to Oracle Cloud for sponsoring this ridiculous nonsense.

Corey: When we look across the, I guess, the ecosystem as it's currently unfolding, a recurring challenge that I have to the existing incumbent cloud providers is they're great at offering the bricks that you can use to build things, but if I'm starting a company today, I'm not going to look at building it myself out of, “Ooh, I'm going to take a bunch of EC2 instances, or Lambda functions, or popsicles and string and turn it into this thing.” I'm going to want to tie together things that are way higher level. In my own case, now I wind up paying for Retool, which is, effectively, yeah, it runs on some containers somewhere, presumably, I think in Azure, but don't quote me on that. And that's great. Could I build my own thing like that?

Absolutely not. I would rather pay someone to tie it together. Same story. Instead of building my own CRM by running some open-source software on an EC2 instance, I wind up paying for Salesforce or Pipedrive or something in that space. And so on, and so forth.

And a lot of these companies that I'm doing business with aren't themselves running on top of AWS. But for web hosting, for example; if I look at the reference architecture for a WordPress site, AWS's diagram looks like a punchline. It is incredibly overcomplicated. And I say this as someone who ran large WordPress installations at Media Temple many years ago. Now, I have the good sense to pay WP Engine. And on a monthly basis, I give them money and they make the website work.

Sure, under the hood, it's running on top of GCP or AWS somewhere. But I don't have to think about it; I don't have to build this stuff together and think about the backups and the failover strategy and the rest. The website just works. And that is increasingly the direction that business is going; things commoditize over time.
And AWS in particular has done a terrible job, in my experience, of differentiating what it is they're doing in the language that their customers speak.

They're great at selling things to existing infrastructure engineers, but folks who are building something from scratch aren't usually in that cohort. It's a longer story with time and, “Well, we're great at being able to sell EC2 instances by the gallon.” Great. Are you capable of going to a small doctor's office somewhere in the American Midwest and offering them an end-to-end solution for managing patient data? Of course not. You can offer them a bunch of things they can tie together to something that will suffice if they all happen to be software engineers, but that's not the opportunity.

So instead, other companies are building those solutions on top of AWS, capturing the margin. And if there's one thing guaranteed to keep Amazon execs awake at night, it's the idea of someone who isn't them making money somehow somewhere, so I know that's got to rankle them, but they do not speak that language. At all. Longer-term, I only see that as a more and more significant crutch. A long enough timeframe here, we're talking about them becoming the CenturyLinks of the world, the tier one backbone provider that everyone uses, but no one really thinks about because they're not a household name.

Sam: That is a really thoughtful perspective. I think the diseconomies of scale that you're pointing to start to creep in, right? Because when you have to sell compute units by the gallon, right, you can't care if it's a gallon of milk, [laugh] or a gallon of oil, or you know, a gallon of poison. You just have to keep moving it through. So, the shift that I think they're going to end up having to make pragmatically, and you start to see some signs of it, like, you know, they hired but could not retain Matt [Acey 00:23:48].
He did an amazing job of bringing them to some pragmatic realization that they need to partner with open-source, but more broadly, when I think about Microsoft in the 2000s as they were starting to learn their open-source lessons, we were also being able to pull on Microsoft's deep competency in partners. So, most people didn't do the math on this. I was part of the field governance council so I understood exactly how the Microsoft business worked to the level that I was capable. When they had $65 billion in revenue, they produced $24 billion in profit through an ecosystem that generated $450 billion in revenue. So, for every dollar Microsoft made, it was $8 to partners. It was a fundamentally platform-shaped business, and that was how they were able to get into doctors' offices in the Midwest, and kind of fit the curve that you're describing of all of those longtail opportunities that require so much care and that are complex to prosecute. They solved for their diseconomies of scale by having 1.2 million partner companies. So, will Amazon figure that out and will they hire, right, enough people who've done this before from Microsoft to become world-class in partnering, that's kind of an exercise left to the [laugh] reader, right? Where will that go over time? But I don't see another better mathematical model for dealing with the diseconomies of scale you have when you're one of the very largest providers on the planet.

Corey: The hardest problem as I look at this is, at some point, you hit a point of scale where smaller things look a lot less interesting. I get that all the time when people say, “Oh, you fix AWS bills, aren't you missing out by not targeting Google bills and Azure bills as well?” And it's, yeah. I'm not VC-backed. It turns out that if I limit the customer base that I can effectively service to only AWS customers, yeah, turns out, I'm not going to starve anytime soon. Who knew?
I don't need to conquer the world and that feels increasingly antiquated, at least going by the stories everyone loves to tell.

Sam: Yeah, it's interesting to see how cloud makes strange bedfellows, right? We started seeing this in, like, 2014, 2015, weird partnerships that you're like, “There's no way this would happen.” But the cloud economics which go back to utilization, rather than what it used to be, which was software lock-in, just changed who people were willing to hang out with. And now you see companies like Databricks going, you know, we do an amazing amount of business, effectively competing with Amazon, selling Spark services on top of predominantly Amazon infrastructure, and everybody seems happy with it. So, there's some hint of a new sensibility of what the future of partnering will be. We used to call it coopetition a long time ago, which is kind of a terrible word, but at least it shows that there's some nuance in you can't compete with everybody because it's just too hard.

Corey: I wish there were better ways of articulating these things because it seems from the outside world, you have companies like Amazon and Microsoft and Google who go and build out partner networks because they need that external accessibility into various customer profiles that they can't speak to super well themselves, but they're also coming out with things that wind up competing directly or indirectly, with all of those partners at the same time. And I don't get it. I wish that there were smarter ways to do it.

Sam: It is hard to even talk about it, right? One of the things that I think we've learned from philosophy is if we don't have a word for it, we can't be intelligent about it. So, there's a missing semantics here for being able to describe the complexity of where are you partnering? Where are you competing? Where are you differentiating?
In an ecosystem, which is moving and changing.

I tend to look at the tools of game theory for this, which is to look at things as either, you know, nonzero-sum games or zero-sum games. And if it's a nonzero-sum game, which I think are the most interesting ones, can you make it a positive sum game? And who can you play positive-sum games with? An organization as big as Amazon, or as big as Microsoft, or even as big as Google isn't ever completely coherent with itself. So, thinking about this as an independent software company, it doesn't matter if part of one of these hyperscalers has a part of their business that competes with your entire business because your business probably drives utilization of a completely different resource in their company that you can partner within them against them, effectively. Right?

For example, Cassandra is an amazingly powerful but demanding workload on Kubernetes. So, there's a lot of Cassandra on EKS. You grow a lot of workload, and EKS business does super well. Does that prevent us from working with Amazon because they have Dynamo or because they have Keyspaces? Absolutely not, right?

So, this is when those companies get so big that they are almost their own forest, right, of complexity, you can kind of get in, hang out, do well, and pretty much never see the competitive product, unless you're explicitly looking for it, which I think is a huge danger for us as independent software companies. And I would say this to anybody doing strategy for an organization like this, which is, don't obsess over the tiny part of their business that competes with yours, and do not pay attention to any of the marketing that they put out that looks competitive with what you have. Because if you can't figure out how to make a better product and sell it better to your customers as a single purpose corporation, you have bigger problems.

Corey: I want to change gears slightly to something that's probably a fair bit more insulting, but that's okay.
We're going to roll with it. That seems to be the theme of this episode. You have been, in effect, a CIO a number of times at different companies. And if we take a look at the typical CIO tenure, industry-wide, it's not long; it approaches the territory from an executive perspective of, “Be sure not to buy green bananas. You might not be here by the time they ripen.” And I'm wondering what it is that drives that and how you make a mark in a relatively short time frame when you're providing inputs and deciding on strategy, and those decisions may not bear fruit for years.

Sam: CIO used to—we used to say it stood for ‘Career Is Over' because the tenure is so short. I think there's a couple of reasons why it's so short. And I think there's a way I believe you can have impact in a short amount of time. I think the reason that it's been short is because people aren't sure what they want the CIO role to be.

Do they want it to be a glorified finance person who's got a lot of data processing experience, but now really has got, you know, maybe even an MBA in finance, but is not focusing on value creation? Do they want it to be somebody who's all-singing, all-dancing Chief Data Officer with a CTO background who did something amazing and solved a really hard problem? The definition of success is difficult. Often CIOs now also have security under them, which is literally a job I would never ever want to have. Do security for a public corporation? Good Lord, that's a way to lose most of your life. You're the only executive other than the CEO that the board wants to hear from. Every sing—

Corey: You don't sleep; you wait, in those scenarios. And oh, yeah, people joke about ablative CSOs in those scenarios. Yeah, after SolarWinds, you try and get an ablative intern instead, but those don't work as well. It's a matter of waiting for an inevitability. One of the things I think is misunderstood about management broadly, is that you are delegating work, but not the responsibility.
The responsibility rests with you.

So, when companies have these statements blaming some third-party contractor, it's no, no, no. I'm dealing with you. You were the one that gave my data to some sketchy randos. It is your responsibility that data has now been compromised. And people don't want to hear that, but it's true.

Sam: I think that's absolutely right. So, you have this high risk, medium reward, very fungible job definition, right? If you ask all of the CIO's peers what their job is, they'll probably all tell you something different that represents their wish list. The thing that I learned at Autodesk, I was only there for 15 months, but we established a fundamental transformation of the work of how cloud platform is done at the company that's still in place a couple years later.

You have to realize that you're a change agent, right? You're actually being hired to bring in the bulk of all the different biases and experiences you have to solve a problem that is not working, right? So, when I got to Autodesk, they didn't even know what their uptime was. It took three months to teach the team how to measure the uptime. Turned out the uptime was 97.7% for the cloud, for the world's largest engineering software company.

That is 200 hours a year of unplanned downtime, right? That is not good. So, a complete overhaul [laugh] was needed. Understanding that as a change agent, your half-life is 12 to 18 months, you have to measure success not on tenure, but on your ability to take good care of the patient, right? It's going to be a lot of pain, you're going to work super hard, you're going to have to build trust with everyone, and then people are still going to hate you at the end.
That is something you just have to kind of take on.

A friend of mine, Jason Warner, joined Redpoint Ventures recently; he said this when he was the CTO of GitHub: “No one is a villain in their own story.” So, you realize, going into a big organization, people are going to make you a villain, but you still have to do incredibly thoughtful, careful work, that's going to take care of them for a long time to come. And those are the kinds of CIOs that I can relate to very well.

Corey: Jason is great. You're name-dropping all the guests we've had. My God, keep going. It's a hard thing to rationalize and wrap heads around. It's one of those areas where you will not be measured during your tenure in the role, in some respects. And, of course, that leads to the cynical perspective as well, where well, someone's not going to be here long and if they say, “Yeah, we're just going to keep being stewards of the change that's already underway,” well, that doesn't look great, so quick, time to do a cloud migration, or a cloud repatriation, or time to roll something else out. A bit of a different story.

Sam: One of the biggest challenges is how do you get the hearts and the minds of the people who are in the organization when they are no fools, and their expectation is like, “Hey, this company's been around for decades, and we go through cloud leaders or CIOs, like Wendy's goes through hamburgers.” They could just cloud-wash, right, or change-wash all their language. They could use the new language to describe the old thing because all they have to do is get through the performance review and outwait you. So, there's always going to be a level of defection because it's hard to change; it's hard to think about new things.

So, the most important thing is how do you get into people's hearts and minds and enable them to believe that the best thing they could do for their career is to come along with the change?
And I think that was what we ended up getting right in the Autodesk cloud transformation. And that requires endless optimism, and there's no room for cynicism because the cynicism is going to creep in around the edges. So, what I found on the job is, you just have to get up every morning and believe everything is possible and transmit that belief to everybody.

So, if it seems naive or ingenuous, I think that doesn't matter as long as you can move people's hearts in each conversation towards, like, “Oh, this person cares about me. They care about a good outcome from me. I should listen a little bit more and maybe make a 1% change in what I'm doing.” Because 1% compounded daily for a year, you can actually get something done in the lifetime of a CIO.

Corey: And I think that's probably a great place to leave it. If people want to learn more about what you're up to, how you think about these things, how you view the world, where can they find you?

Sam: You can find me on Twitter, I'm @sramji, S-R-A-M-J-I, and I have a podcast that I host called Open||Source||Data where I invite innovators, data nerds, computational networking nerds to hang out and explain to me, a software programmer, what is the big world of open-source data all about, what's happening with machine learning, and what would it be like if you could put data in a container, just like you could put code in a container, and how might the world change? So, that's Open||Source||Data podcast.

Corey: And we'll of course include links to that in the [show notes 00:35:58]. Thanks so much for your time. I appreciate it.

Sam: Corey, it's been a privilege. Thank you so much for having me.

Corey: Likewise. Sam Ramji, Chief Strategy Officer at DataStax. I'm Cloud Economist Corey Quinn, and this is Screaming in the Cloud.
If you've enjoyed this podcast, please leave a five-star review on your podcast platform of choice, whereas if you've hated this podcast, please leave a five-star review on your podcast platform of choice, along with a comment telling me exactly which item in Sam's background that I made fun of is the place that you work at.

Corey: If your AWS bill keeps rising and your blood pressure is doing the same, then you need The Duckbill Group. We help companies fix their AWS bill by making it smaller and less horrifying. The Duckbill Group works for you, not AWS. We tailor recommendations to your business and we get to the point. Visit duckbillgroup.com to get started.

Announcer: This has been a HumblePod production. Stay humble.

god ceo amazon art google man service war office seattle microsoft mba silicon valley software cloud midwest longer lego boy windows spark ibm bill gates oracle saas route cto vc crm bought salesforce wordpress api cio screaming accenture aws cios linux chief strategy officer apis orbits borg vcs my god devops azure apache invent cognition intuit distributed vmware google cloud php dns uc san diego dynamo kubernetes macintosh sun tzu cognitive science solarwinds autodesk fletch lambda elastic dell technologies mongodb 5x chief data officers groucho marx mysql gcp good lord databricks ssh bsa american midwest redis linux foundation wp engine teleport pipedrive sql server eks csos confluent ec2 observable samit oracle cloud datastax dynamodb corey quinn redpoint ventures samone samwell cloud foundry eric brewer api management olap microsoft sql server apache cassandra slashdot open source data oltp sourceforge heptio demandware jason warner apigee amazon redshift bea systems jdbc joe beda duckbill group amdahl amazon aurora media temple odbc sam you bizspark chief cloud economist last week in aws cloud episode enterprise service bus humblepod

DataStax's Christopher Bradford on the Apache Cassandra operator for Kubernetes, K8ssandra

A Bootiful Podcast

Dec 3, 2021 64:33

Hi, Spring fans! In this installment, [Josh Long (@starbuxman)](https://twitter.com/starbuxman) talks to DataStax's [Christopher Bradford](https://www.linkedin.com/in/bradfordcp/) on the Apache Cassandra operator for Kubernetes, [K8ssandra](https://k8ssandra.io/).

* [KubeCon Talk blogpost part 1](https://thenewstack.io/taking-your-database-beyond-a-single-kubernetes-cluster/)
* [KubeCon Talk blogpost part 2](https://medium.com/building-the-open-data-stack/managing-distributed-applications-in-kubernetes-using-cilium-and-istio-with-helm-and-operator-for-9652d71d6432)
* [We Pushed Helm to the Limit, then Built a Kubernetes Operator](https://thenewstack.io/we-pushed-helm-to-the-limit-then-built-a-kubernetes-operator/)
* [A Case for Databases on Kubernetes from a Former Skeptic](https://thenewstack.io/a-case-for-databases-on-kubernetes-from-a-former-skeptic/)

spring built limit operator bradford databases kubernetes datastax josh long apache cassandra

DoK Talks #97- Learn about Developing a Multicluster Operator with K8ssandra Operator // John Sanda

Data on Kubernetes Community

Oct 29, 2021 62:10

https://go.dok.community/slack
https://dok.community/

ABSTRACT OF THE TALK

Cassandra is a highly scalable database with an architecture that makes it well suited for multi-region workloads. A Kubernetes cluster often spans multiple zones within a single region. Multi-region Kubernetes clusters are less common, though, due to the challenges that they present. This has led to a growing number of multi-cluster solutions. In this presentation John Sanda introduces K8ssandra Operator. It is designed from the ground up for multi-cluster deployments. John will discuss how to reconcile objects across multiple clusters, how to manage secrets, pitfalls to avoid, and testing strategies.

BIO

John Sanda is a DataStax engineer working on the K8ssandra project. He is passionate about Cassandra and Kubernetes and loves being involved in open source. Prior to joining DataStax, John worked for a year at The Last Pickle as an Apache Cassandra consultant. Prior to that, he spent a number of years at Red Hat as an engineer. It was during that time that John got involved with Cassandra, when he redesigned a metrics data store and built it with Cassandra in place of an RDBMS. He got his first exposure to Cassandra and Kubernetes together when the metrics storage engine was later used in OpenShift.

developing operator red hat kubernetes sanda openshift datastax apache cassandra rdbms

Finding Search Success with Cloud Flexibility

Craft of Code

Oct 15, 2021 28:11

How important is it to find a hosting partner that's a good fit for your company? Rock de Vocht is the Director of Technology, CTO, and Co-Founder at SimSage, the AI powered search platform designed to make finding information more efficient. There are plenty of obstacles to overcome when building a search engine but finding a hosting solution that's suitable and flexible shouldn't be one of them. Rock joined episode three, season two, of our Craft of Code podcast to discuss the technology, infrastructure, processing, and even the language theory behind SimSage's development. We also talked to Rock about his partnership with Linode, including why he switched from Google to Linode, the benefits of cloud hosting, and why human customer support is fundamental to success.

In this episode, we discussed:

* How SimSage connects people with information in the workplace
* How Rock's background in computational linguistics and languages impacted SimSage's development
* Where many go wrong with neural networks
* Why the cloud is a natural fit for SimSage
* The technology infrastructure behind SimSage
* SimSage's roadmap for scaling
* Why Rock made the switch from GCP to Linode
* Rock's advice for future technologists

You can find out more by visiting https://www.linode.com/craft-of-code/

Important Links & Mentions

* https://simsage.ai/ (SimSage)
* https://kotlinlang.org/ (Kotlin)
* https://cassandra.apache.org/_/index.html (Apache Cassandra)
* https://kubernetes.io/ (Kubernetes)
* https://www.docker.com/ (Docker)

Follow Us

* https://github.com/linode/ (GitHub)
* https://www.instagram.com/linode/ (Instagram)
* https://www.linkedin.com/company/linode/ (LinkedIn)
* https://twitter.com/linode (Twitter)
* https://www.youtube.com/linode (YouTube)

If you enjoyed our show, then please rate and review us on your podcast app of choice.

director success ai google rock technology co founders search code cloud craft cto flexibility github kubernetes gcp kotlin linode apache cassandra vocht

Ep. 483 w/ Chet Kapoor Chairman & CEO at DataStax

Building The Future Show - Radio / TV / Podcast

Sep 23, 2021 47:21

DataStax is the open, multi-cloud stack for modern data apps. DataStax gives enterprises the freedom of choice, simplicity, and true cloud economics to deploy massive data, delivered via APIs, powering rich interactions on multi-cloud, open source and Kubernetes.

DataStax is built on proven Apache Cassandra™ and the Stargate™ open source API platform. DataStax Astra is the new stack for modern data apps as-a-service, built on the scale-out, cloud-native, open source K8ssandra™.

DataStax powers modern data apps for 500 of the world's most demanding enterprises including The Home Depot, T-Mobile, Intuit and half of the Fortune 100.

https://www.datastax.com/

founders technology entrepreneur fortune startups investors api home depot apis t mobile stargate intuit kubernetes kapoor datastax apache cassandra

162: Framework Laptop, Pop!_OS Rolling Release, Linux Mint, WayDroid | This Week in Linux

This Week in Linux

Play Episode Listen Later Aug 1, 2021 45:21

On this episode of This Week in Linux, we'll cover the modular laptop that respects your Right to Repair called the Framework laptop. In the Distro News, we've got updates from Linux Mint and a very interesting potential plan for a Rolling Release edition of Pop!_OS from System76. We're going to jump into an enterprise grade tool with Apache Cassandra 4.0. Then in the Linux Mobile News, we've got an interview with Rudi Timmermans of the WayDroid project which is looking to make it possible to run Android Apps on Linux Phones like Ubuntu Touch. We've also got news from NVIDIA, they've released new Drivers with a lot of great features and they've even Open Sourced some content. All that and so much more on episode 162 of This Week in Linux, recorded live on July 31, 2021. Your Weekly Source for Linux GNews! SPONSORED BY: Digital Ocean ►► https://do.co/dln-mongo Bitwarden ►► https://bitwarden.com/dln TWITTER ►► https://twitter.com/michaeltunnell MASTODON ►► https://mastodon.social/@MichaelTunnell DLN COMMUNITY ►► https://destinationlinux.network/contact FRONT PAGE LINUX ►► https://frontpagelinux.com MERCH ►► https://dlnstore.com BECOME A PATRON ►► https://tuxdigital.com/contribute This Week in Linux is produced by the Destination Linux Network: https://destinationlinux.network SHOW NOTES ►► https://tuxdigital.com/twil162 00:00 = Welcome to TWIL 162 01:24 = Framework Modular Laptop Respects Your Right to Repair 09:37 = Element Raises $30 Million 13:50 = Linux Mint Getting New Website 16:59 = Digital Ocean: Managed MongoDB ( https://do.co/dln-mongo ) 18:16 = Pop!_OS Rolling Release? 
22:16 = Apache Cassandra 4.0 Released 25:08 = WayDroid: Android Apps On Linux Phones 32:00 = Bitwarden Password Manager ( https://bitwarden.com/dln ) 33:59 = NVIDIA Drivers Security Bugs & Open Source 37:47 = K-9 Mail 5.800 Released 40:27 = Humble Bundles: Programming Games & More 42:58 = Outro Other Videos: 7 Reasons Why Firefox Is My Favorite Web Browser: https://youtu.be/bGTBH9yr8uw How To Use Firefox's Best Feature, Multi-Account Containers: https://youtu.be/FfN5L5zAJUo 5 Reasons Why I Use KDE Plasma: https://youtu.be/b0KA6IsO1M8 6 Cool Things You Didn't Know About Linux's History: https://youtu.be/u9ZY41mNB9I Thanks For Watching! #Linux #TechNews #Podcast

Azure Cosmos DB cache, serverless MongoDB and Managed Apache Cassandra

Azure Friday (HD) - Channel 9

Play Episode Listen Later Jul 9, 2021

Kirill Gavrylyuk and friends join Scott Hanselman to discuss Azure Cosmos DB updates: integrated cache, serverless for MongoDB API, and Managed Instance for Apache Cassandra with dual write proxy.
[0:00:00] – Opening
[0:01:33] – Integrated cache with Tim Sander
[0:17:36] – Serverless for MongoDB API with Gahl Levy
[0:24:00] – Managed Instance for Apache Cassandra with Theo van Kraay
[0:37:35] – Wrap-up
How to configure the Azure Cosmos DB integrated cache (Preview)
Azure Cosmos DB serverless
Azure Managed Instance for Apache Cassandra documentation
GitHub - Azure-Samples / cassandra-proxy
Learning path: Work with NoSQL data in Azure Cosmos DB
Create a free account (Azure)

work integrated managed azure cache mongodb serverless nosql scott hanselman apache cassandra azure cosmos db

Azure Cosmos DB cache, serverless MongoDB and Managed Apache Cassandra

Azure Friday (Audio) - Channel 9

Play Episode Listen Later Jul 9, 2021

work integrated managed azure cache mongodb serverless nosql scott hanselman apache cassandra azure cosmos db

Stargate, a GraphQL for databases from DataStax. First stop - Cassandra. Featuring Ed Anuff, DataStax CPO

Orchestrate all the Things podcast: Connecting the Dots with George Anadiotis

Play Episode Listen Later Dec 9, 2020 35:41

A flexible API is key to database accessibility and developer friendliness today. Apache Cassandra was lacking in that department, and DataStax is trying to address this with the release of a new API layer called Stargate. A discussion with Ed Anuff, formerly of Apogee and Google Cloud, and currently DataStax Chief Product Officer, on the rationale behind Stargate, its architecture and operation, how it compares to GraphQL, and a roadmap for the future. Article published on ZDNet

api databases stargate google cloud graphql apogee zdnet datastax apache cassandra

Part 1: Yugabyte - Deep dive into a distributed SQL database

The Computing Podcast

Play Episode Listen Later Jun 11, 2020 24:16

Welcome to our 5rd episode. This is the second part of a two-part series where we go deep into the internals of Yugabyte with Karthik and Kannan. Yugabyte is a highly scalable and developer-friendly open source distributed SQL database. Yugabyte is built by an ex-Facebook team that wanted to bring what they learnt running one of the latest databases on the planet out into the open source world. One thing I find really fascinating about Yugabyte is that it is compatible with Postgres, Redis and Apache Cassandra, which makes it easy to replace a lot of infrastructure with just Yugabyte. Hope you enjoy the listen, and remember to subscribe for many more of these deep technical discussions.
Our guests for this episode are:
Kannan Muthukkaruppan, Founder & President, Product Dev. @ Yugabyte
Karthik Ranganathan, Founder & CTO @ Yugabyte
Links:
Kudu: Storage for Fast Analytics on Fast Data - https://kudu.apache.org/kudu.pdf
Under the Hood: Building and open-sourcing RocksDB - https://www.facebook.com/notes/facebook-engineering/under-the-hood-building-and-open-sourcing-rocksdb/10151822347683920/
The Log-Structured Merge-Tree (LSM-Tree) - https://www.cs.umb.edu/~poneil/lsmtree.pdf
Amazon Aurora: Design Considerations for High Throughput Cloud-Native Relational Databases - https://dl.acm.org/doi/epdf/10.1145/3035918.3056101

founders president deep dive databases distributed sql karthik redis postgres kannan apache cassandra yugabyte rocksdb

Another globally distributed cloud native SQL database on the rise: Yugabyte Raises $30 million in Series B Funding. Backstage chat with CEO and Founders

Orchestrate all the Things podcast: Connecting the Dots with George Anadiotis

Play Episode Listen Later Jun 9, 2020 31:59

Your good old on-premises SQL database is in terminal decline. A pure-play open-source cloud-native PostgreSQL, with support for Apache Cassandra and GraphQL interfaces, is what you need. Or at least, this is what the Yugabyte crew thinks. The company, founded by Facebook data infrastructure veterans, announced that it has raised $30 million in an oversubscribed Series B round to double down on community and team growth. This is a crowded market, but big enough to be a non-zero-sum game. We connected with Yugabyte founders Kannan Muthukkaruppan and Karthik Ranganathan, and newly recruited CEO Bill Cook, previously of Sun Microsystems and Pivotal, for a deep dive into the company, the funding, and the market. Article published on ZDNet in June 2020

founders funding raises backstage globally databases pivotal distributed sql series b sun microsystems graphql cloud native postgresql zdnet bill cook apache cassandra yugabyte

Ep. #25, Data Stack Orchestration with Jonathan Ellis of DataStax

Heavybit Podcast Network: Master Feed

Play Episode Listen Later Feb 6, 2020 70:02

In episode 25 of EnterpriseReady, Grant speaks with Jonathan Ellis of DataStax about Apache Cassandra and the complexity of distributed storage systems, as well as recruiting, interviewing, and hiring executives and engineers.

stack orchestration datastax apache cassandra jonathan ellis

What's New in Azure Cosmos DB

Azure Friday (HD) - Channel 9

Play Episode Listen Later Feb 15, 2018

Kirill Gavrylyuk returns to Azure Friday to update Scott Hanselman on what's new in Azure Cosmos DB, such as the Cassandra API for applications that are written for Apache Cassandra, updates to the Azure Table storage API, the Apache Spark Connector, the Graph API, partitioned collections, 99.999% (five 9s) SLA, and more.
Dear Cassandra Developers, welcome to Azure #CosmosDB!
Introduction to Azure Cosmos DB Table API
Apache Spark to Azure #CosmosDB Connector is now generally available
Azure #CosmosDB Graph API now generally available
Partition and scale in Azure Cosmos DB
Create a Free Account (Azure)
Follow @SHanselman
Follow @AzureFriday
Follow @kirillg_msft

api sla scott hanselman apache cassandra azure cosmos db

What's New in Azure Cosmos DB

Azure Friday (Audio) - Channel 9

Play Episode Listen Later Feb 15, 2018

api sla scott hanselman apache cassandra azure cosmos db
