Colin Zima is the cofounder and CEO of Omni, a data platform that combines the consistency of a shared data model with the speed and freedom of SQL. Omni recently raised a $69M Series B led by ICONIQ Growth. Colin was previously the Chief Analytics Officer at Looker.

Colin's favorite book: Blink (Author: Malcolm Gladwell)

(00:01) Introduction
(01:10) What Is a Data Model and Why It Matters
(03:27) Gaps in the Modern Data Stack
(05:38) The Staying Power of SQL
(07:29) Origin Story: Why Omni Was Created
(10:13) Lessons from Building the MVP
(12:48) Go-to-Market Insights: Zero to Ten Customers
(16:02) Founder-Led Sales and Marketing Tactics
(18:58) Company Building: Recruiting and Product Challenges
(21:34) Product Positioning in a Crowded Market
(23:26) Design Philosophy in Enterprise Software
(28:21) Omni's Tech Stack and Development Strategy
(28:57) Real-World Use of AI Inside the Company
(31:01) Future of Data Tooling and Role of AI
(33:49) Rapid Fire Round

--------

Where to find Colin Zima:
LinkedIn: https://www.linkedin.com/in/colinzima/

--------

Where to find Prateek Joshi:
Newsletter: https://prateekjoshi.substack.com
Website: https://prateekj.com
LinkedIn: https://www.linkedin.com/in/prateek-joshi-infinite
X: https://x.com/prateekvjoshi
In this episode, we sit down with Sridhar Ramaswamy, CEO of Snowflake, for an in-depth conversation about the company's transformation from a cloud analytics platform into a comprehensive AI data cloud. Sridhar shares insights on Snowflake's shift toward open formats like Apache Iceberg and why monetizing storage was, in his view, a strategic misstep.

We also dive into Snowflake's growing AI capabilities, including tools like Cortex Analyst and Cortex Search, and discuss how the company scaled AI deployments at an impressive pace. Sridhar reflects on lessons from his previous startup, Neeva, and offers candid thoughts on the search landscape, the future of BI tools, real-time analytics, and why partnering with OpenAI and Anthropic made more sense than building Snowflake's own foundation models.

Snowflake
Website - https://www.snowflake.com
X/Twitter - https://x.com/snowflakedb

Sridhar Ramaswamy
LinkedIn - https://www.linkedin.com/in/sridhar-ramaswamy
X/Twitter - https://x.com/RamaswmySridhar

FIRSTMARK
Website - https://firstmark.com
X/Twitter - https://twitter.com/FirstMarkCap

Matt Turck (Managing Director)
LinkedIn - https://www.linkedin.com/in/turck/
X/Twitter - https://twitter.com/mattturck

(00:00) Intro and current market tumult
(02:48) The evolution of Snowflake from IPO to today
(07:22) Why Snowflake's earliest adopters came from financial services
(15:33) Resistance to change and the philosophical gap between structured data and AI
(17:12) What is the AI Data Cloud?
(23:15) Snowflake's AI agents: Cortex Search and Cortex Analyst
(25:03) How did Sridhar's experience at Google and Neeva shape his product vision?
(29:43) Was Neeva simply ahead of its time?
(38:37) The Epiphany mafia
(40:08) The current state of search and Google's conundrum
(46:45) "There's no AI strategy without a data strategy"
(56:49) Embracing open data formats with Iceberg
(01:01:45) The Modern Data Stack and the future of BI
(01:08:22) The role of real-time data
(01:11:44) Current state of enterprise AI: from PoCs to production
(01:17:54) Building your own models vs. using foundation models
(01:19:47) DeepSeek and open source AI
(01:21:17) Snowflake's 1M Minds program
(01:21:51) Snowflake AI Hub
In this podcast episode, we talked with Adrian Brudaru about the past, present, and future of data engineering.

About the speaker:
Adrian Brudaru studied economics in Romania but soon got bored with how creative the industry was, and chose instead to pursue the more factual side. He ended up in Berlin at the age of 25 and started a role as a business analyst. At 30, he had had enough of startups and decided to join a corporation, but quickly found that it did not provide the challenge he wanted.

As going back to startups was not a desirable option either, he decided to postpone the decision by taking freelance work, and he has never looked back. Five years later, he co-founded a company in the data space to try new things. This company is also looking to release open source tools to help democratize data engineering.

0:00 Introduction to DataTalks.Club
1:05 Discussing trends in data engineering with Adrian
2:03 Adrian's background and journey into data engineering
5:04 Growth and updates on Adrian's company, DLT Hub
9:05 Challenges and specialization in data engineering today
13:00 Opportunities for data engineers entering the field
15:00 The "Modern Data Stack" and its evolution
17:25 Emerging trends: AI integration and Iceberg technology
27:40 DuckDB and the emergence of portable, cost-effective data stacks
32:14 The rise and impact of dbt in data engineering
34:08 Alternatives to dbt: SQLMesh and others
35:25 Workflow orchestration tools: Airflow, Dagster, Prefect, and GitHub Actions
37:20 Audience questions: Career focus in data roles and AI engineering overlaps
39:00 The role of semantics in data and AI workflows
41:11 Focusing on learning concepts over tools when entering the field
45:15 Transitioning from backend to data engineering: challenges and opportunities
47:48 Current state of the data engineering job market in Europe and beyond
49:05 Introduction to Apache Iceberg, Delta, and Hudi file formats
50:40 Suitability of these formats for batch and streaming workloads
52:29 Tools for streaming: Kafka, SQS, and related trends
58:07 Building AI agents and enabling intelligent data applications
59:09 Closing discussion on the place of tools like dbt in the ecosystem
In this episode, Sean Zinsmeister from Google joins Mark Rittman to discuss the latest developments for Looker, including the integration of Looker Studio, new modeling capabilities, and the exciting potential of generative AI for BI.

We discuss how Looker is evolving to be a more open, composable platform that can power advanced analytics and data storytelling, with Sean sharing insights on Google's purpose-built Gemini models for natural language to SQL translation and how Looker customers can leverage these AI capabilities. We also explore Looker's agentic API strategy, the long-term vision of using Looker Studio as the primary Looker front-end, and the opening up of LookML to tools beyond just Looker.

Driving Looker customer innovations in the generative AI era
Previewing Studio in Looker, the (Eventual) Future of Self-Service Reporting for Looker
Looker now available from Google Cloud console
Delivering the third wave of BI in the AI era with Looker
Drill to Detail Ep.100 Special 'Past, Present and Future of the Modern Data Stack' with Special Guests Keenan Rice, Stewart Bryson and Jake Stein
Drill to Detail Ep.73 'Luck, Thinking Different and Designing Looker Data Platform' with Special Guest Colin Zima
Christophe Blefari is a Staff Data Engineer, co-founder of nao, and author of the best-known data newsletter in the French ecosystem (Blef.fr). A member of the DataGen freelance collective, he is, in my view, one of the biggest data experts in France.
Highlights from this week's conversation include:

Pedram's Background and Journey in Data (0:47)
Joining Dagster Labs (1:41)
Synergies Between Teams (2:56)
Developer Marketing Preferences (6:06)
Bridging Technical Gaps (9:54)
Understanding Data Orchestration (11:05)
Dagster's Unique Features (16:07)
The Future of Orchestration (18:09)
Freeing Up Team Resources (20:30)
Market Readiness of the Modern Data Stack (22:20)
Career Journey into DevRel and Marketing (26:09)
Understanding Technical Audiences (29:33)
Building Trust Through Open Source (31:36)
Understanding Vendor Lock-In (34:40)
AI and Data Orchestration (36:11)
Modern Data Stack Evolution (39:09)
The Cost of AI Services (41:58)
Differentiation Through Integration (44:13)
Language and Frameworks in Orchestration (49:45)
Future of Orchestration and Closing Thoughts (51:54)

The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we'll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.

RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack, visit rudderstack.com.
Join Mark Rittman in this special end-of-year episode as he speaks with Noel Gomez, co-founder of DataCoves, about the challenges and opportunities of orchestrating dbt and other tools within the open-source Modern Data Stack, navigating the evolving semantic layer landscape, and the future of modular, vendor-agnostic data solutions.

Datacoves Platform Overview
Build vs Buy Analytics Platform: Hosting Open-Source Tools
Scale the benefits of Core with dbt Cloud
Dagster vs. Airflow
This morning, a great article came across my feed that gave me PTSD, asking whether Iceberg is the Hadoop of the Modern Data Stack. In this rant, I bring the discussion back to a central question you should ask about any hot technology: do you need it at all? Do you need a tool built for the top 1% of companies operating at sufficient data scale, or is a spreadsheet good enough?

Link: https://blog.det.life/apache-iceberg-the-hadoop-of-the-modern-data-stack-c83f63a4ebb9
Charlotte Ledoux is a Data Governance expert and a successful LinkedIn content creator (+30K followers). She joined us to deliver a masterclass on her specialty: Data Governance, with a focus on roles.
My guest today is Molham Aref, CEO of RelationalAI, a company that recently closed a $75 million Series B funding round. Molham shares his incredible journey spanning over 30 years in the AI and machine learning space, offering invaluable insights for aspiring entrepreneurs and tech enthusiasts alike. Molham brings a wealth of experience, having previously led LogicBlox and Predictix, and now spearheads RelationalAI's mission to simplify intelligent application development.

→ Website: https://relational.ai/
→ LinkedIn: https://www.linkedin.com/in/molham/

Nataraj is the host and creator of the Startup Project podcast; he is a full-time product manager at Microsoft and an early-stage investor and advisor.

→ LinkedIn: https://www.linkedin.com/in/natarajsindam/
→ Twitter: https://x.com/natarajsindam
→ Email updates: https://startupproject.substack.com/
→ Website: https://thestartupproject.io

Podcast Highlights:
This episode covers a wide range of fascinating topics, from Molham's extensive career journey to the intricacies of the modern data stack and the transformative potential of RelationalAI's technology. We unravel the complexities of descriptive, predictive, and prescriptive analytics, demystifying these crucial concepts for a broader audience. We also discuss the challenges of finding those first five customers in the B2B world, the strategic decision to build on Snowflake, and the potential for future competition and cannibalization by larger platforms. Molham thoughtfully shares his perspective on the current hype surrounding Generative AI and its practical applications in the enterprise space. We finish with advice on leadership, mentorship, and the overall challenges and rewards of a career in tech.
Timestamps:
00:00 - Introduction and Guest Introduction
01:55 - Molham Aref's Career Journey and Transition to RelationalAI
08:30 - Understanding Descriptive, Predictive, and Prescriptive Analytics
12:00 - Early Use Cases and Target Customers for RelationalAI
17:30 - The Decision to Build on Snowflake: Strategy and Competition
22:15 - Securing the First Five Customers in the B2B World
27:40 - The Modern Data Stack and RelationalAI's Place Within It
34:30 - Generative AI: Hype, Reality, and Enterprise Applications
40:00 - Leveraging Generative AI Internally and for Customer Value
45:00 - B2B Sales Strategies: Content, Relationships, and Customer Focus
51:30 - RelationalAI's Future Plans and Growth Strategy
54:00 - Molham's Consumption Habits: Historical Insights and Mentorship
58:30 - Lessons Learned as a Founder and CEO

Don't forget to like and subscribe for more insightful conversations about the world of AI!
→ YouTube: https://youtu.be/9-J4eV8qvZg
→ Spotify: https://open.spotify.com/episode/3Og8mbra1cokQ5cRJdjZn1?si=iqEOqKLLSqSbk8ehkniFqg
→ Apple Podcasts: https://podcasts.apple.com/us/podcast/85-ai-should-not-be-regulated-author-ml-researcher/id1551300319?i=1000673806783
→ Email updates: https://startupproject.substack.com/
→ Others: https://spotifyanchor-web.app.link/e/qYaG6vhTRNb

#ModernDataStack #RelationalAI #AI #MachineLearning #DataAnalytics #PredictiveAnalytics #PrescriptiveAnalytics #GenerativeAI #Snowflake #B2B #Entrepreneurship #TechPodcast #DataManagement #BusinessIntelligence #CloudComputing #TechLeadership #CareerAdvice #Innovation #DataStrategy
Pierre Pessarossi is the Data Science Lead at Back Market, the French unicorn behind a marketplace for refurbished products. He talks to us about their GenAI strategy.
Christophe Blefari is a Staff Data Engineer, author of the best-known data newsletter in the French ecosystem (Blef.fr), and recently co-founder of nao. He is also, in my view, one of the biggest data experts in France.
A founding engineer on Google BigQuery and now at the helm of MotherDuck, Jordan Tigani challenges the decade-long dominance of Big Data and introduces a compelling alternative that could change how companies handle data. Jordan discusses why Big Data technologies are overkill for most companies, how MotherDuck and DuckDB offer fast analytical queries, and lessons learned as a technical founder building his first startup.

Watch the episode with Tomasz Tunguz: https://youtu.be/gU6dGmZzmvI

MotherDuck
Website - https://motherduck.com
Twitter - https://x.com/motherduck

Jordan Tigani
LinkedIn - https://www.linkedin.com/in/jordantigani
Twitter - https://x.com/jrdntgn

FIRSTMARK
Website - https://firstmark.com
Twitter - https://twitter.com/FirstMarkCap

Matt Turck (Managing Director)
LinkedIn - https://www.linkedin.com/in/turck/
Twitter - https://twitter.com/mattturck

(00:00) Intro
(00:56) What is Small Data?
(06:56) Marketing strategy of MotherDuck
(08:39) Processing Small Data with a Big Data stack
(15:30) DuckDB
(17:21) Creation of DuckDB
(18:48) Founding story of MotherDuck
(24:08) MotherDuck's community
(25:25) MotherDuck today ($100M raised)
(33:15) Why are MotherDuck and DuckDB so fast?
(39:08) The limitations and the future of MotherDuck's platform
(39:49) Small Models
(42:37) Small Data and the Modern Data Stack
(46:47) Making things simpler with a shift from Big Data to Small Data
(50:04) Jordan Tigani's entrepreneurial journey
(58:31) Outro
Anaïs Ghelfi is Head of Data Platform at Malt, the platform that connects companies with freelancers. Today she tells us about the AI Assistant project built by the Data teams and its widespread adoption (+50%) across the company's various business teams.
Lukas Schulte is Co-Founder and CEO of SDF Labs (Semantic Data Fabric), the data transformation layer and query engine platform. They're an open-core company powered by the Apache DataFusion query engine. SDF Labs has raised $9M from investors including RTP Global and Two Sigma Ventures. In this episode, we dig into the complications and pain points of the Modern Data Stack, shifting left with data (i.e., moving more over to the client), competing with dbt by adding a query engine, why building in Rust was important, why their CLI is closed source, the importance of a strong partner strategy as a data company, and more!
Enzo Rideau, an expert on Fabric (Microsoft's new tool), is a Microsoft Analytics Solution Leader at delaware, the leading consulting firm for Microsoft and SAP solutions in France. He also founded the House of Fabric podcast and regularly shares content on LinkedIn (with his 10,000+ followers).
Melvyn Peignon is a Product Manager at ClickHouse, the real-time data warehouse used by Netflix, Uber, Disney, and Contentsquare. We cover:
Mark Rittman is joined by returning guest David Jayatillake, VP of AI at Cube.dev, to talk about Delphi Labs' journey from a standalone data analytics chatbot to becoming the basis of Cube's new AI features within its composable semantic model product.

Drill to Detail Ep.102 'LLMs, Semantic Models and Bringing AI to the Modern Data Stack' with Special Guest David Jayatillake
Drill to Detail Ep.107 'Cube, Headless BI and the AI Semantic Layer' with Special Guest Artyom Keydunov
Introducing the AI API and Chart Prototyping in Cube Cloud
A Practical Guide to Getting Started with Cube's AI API
Cube Rollup London: Bringing Cube Users Together
Benn Stancil's weekly Substack on data and technology provides a fascinating perspective on the modern data stack & the industry building it. On this episode, Benn joins Jerod to dissect a few of his essays, discuss opportunities he sees during this slowdown & discuss why he thinks maybe we should disband the analytics team.
Jake Yormak of Story Ventures joins Nate to discuss why hardware is attractive, the most interesting areas in AI outside of GenAI, and the Modern Data Stack.

In this episode we cover:
Concentrating on Early-Stage Companies with Potential for Growth
Investing in Hardware Companies: Challenges and Opportunities
Focusing on Power Law Outliers
AI Commoditization and Its Impact on Profit Pools, with a Focus on Computer Vision and Proprietary Data
AI in Workflows: Incentivizing Users to Contribute Context

Guest Links:
LinkedIn
X
Story Ventures

The hosts of The Full Ratchet are Nick Moran and Nate Pierotti of New Stack Ventures, a venture capital firm committed to investing in founders outside of the Bay Area. Want to keep up to date with The Full Ratchet? Follow us on social. You can learn more about New Stack Ventures by visiting our LinkedIn and Twitter. Are you a founder looking for your next investor? Visit our free tool VC-Rank and we'll send a list of potential investors right to your inbox!
In this episode, we sat down with Tomasz Tunguz (https://twitter.com/ttunguz), the founder of Theory Ventures and a leading voice in the tech investment space. We discussed the transformative potential of Ethereum as a database company, the importance of data security in a decentralized world, and the evolving landscape of AI technologies from foundational models to AI-native applications.
Safiyy Momen and I chat about the good and bad of the Modern Data Stack, controlling cloud costs, boring engineering, and much more. LinkedIn: https://www.linkedin.com/in/safiyy-momen/
Highlights from this week's conversation include:

The Evolution of Data Systems (0:47)
The Role of Open Source Software (2:39)
Challenges of Time Series Data (6:38)
Architecting InfluxDB (9:34)
High Cardinality Concepts (11:36)
Trade-Offs in Time Series Databases (15:35)
High Cardinality Data (18:24)
Evolution to InfluxDB 3.0 (21:06)
Modern Data Stack (23:04)
Evolution of Database Systems (29:48)
InfluxDB Re-Architecture (33:14)
Building an Analytic System with DataFusion (37:33)
Challenges of Mapping Time Series Data into a Relational Model (44:55)
Adoption and Future of DataFusion (46:51)
Externalized Joins and Technical Challenges (51:11)
Exciting Opportunities in Data Tooling (55:20)
Emergence of New Architectures (56:35)
Final Thoughts and Takeaways (57:47)
Modern Data Stack

Hello, this is Hall T. Martin with the Startup Funding Espresso -- your daily shot of startup funding and investing.

The modern data stack is the term for the tools used by tech companies to analyze and integrate data. It's cloud-based, which alleviates many of the challenges of analyzing data with legacy systems. Here are the components of the modern data stack:

Data sources -- this includes databases, company products that produce a stream of data, and event streams that log each action a user takes.
Data warehouse -- these are the tools used to store the voluminous amounts of data that feed data analysis work. This includes data lakes and other large-scale formats for storing the data.
Data analytics -- this includes the ability to query the data sets and apply analytics to the data.
Data transformation -- this moves the data into a format that end users can use for their own queries and analysis.
Data monitoring -- this captures metrics about the data, such as how often the data is being used and for what applications.
Data governance -- this monitors the use of the data to comply with government regulations.
Data applications -- the set of applications that use the data output from the system for purposes such as business intelligence.

In setting up a data analytics program at your company, consider the modern data stack and its components.

Thank you for joining us for the Startup Funding Espresso, where we help startups and investors connect for funding. Let's go startup something today.

For more episodes from Investor Connect, please visit the site. For feedback, please contact info@tencapital.group.
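The layered components described above can be sketched as a tiny pipeline model. This is purely illustrative: the class and layer names below are my own assumptions for the example, not a reference to any particular vendor's product.

```python
# Illustrative sketch only: models the modern data stack's layers as an
# ordered list of components, mirroring the episode's breakdown.
from dataclasses import dataclass, field


@dataclass
class StackComponent:
    layer: str  # e.g. "sources", "warehouse", "transformation"
    role: str   # what this layer contributes to the pipeline


@dataclass
class ModernDataStack:
    components: list = field(default_factory=list)

    def add(self, layer: str, role: str) -> "ModernDataStack":
        # Append a layer and return self so calls can be chained.
        self.components.append(StackComponent(layer, role))
        return self

    def layers(self) -> list:
        return [c.layer for c in self.components]


stack = (
    ModernDataStack()
    .add("sources", "databases, product data streams, user event streams")
    .add("warehouse", "large-scale storage, including data lakes")
    .add("analytics", "query the data sets and apply analytics")
    .add("transformation", "reshape data for end-user queries and analysis")
    .add("monitoring", "metrics on how and where the data is used")
    .add("governance", "compliance with government regulations")
    .add("applications", "BI and other consumers of the output")
)
print(stack.layers())
```

Chaining `add` keeps the layer ordering explicit, which is the main point of the breakdown: each layer consumes what the previous one produces.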
Highlights from this week's conversation include:

Chad's background and journey in data (0:46)
Importance of Data Supply Chain (2:19)
Challenges with Modern Data Stack (3:28)
Comparing Data Supply Chain to Real-world Supply Chains (4:49)
Overview of Gable.ai (8:05)
Rethinking Data Catalogs (11:42)
New Ideas for Managing Data (15:16)
Data Discovery and Governance Challenges (18:51)
Static Code Analysis and AI Impact on Data (24:55)
Creating Contracts and Defining Data Lineage (27:31)
Data Quality Issues and Upstream Problems (32:32)
Challenges with Third-Party Vendors and External Data (34:29)
Incentivizing Engineers for Data Quality (40:28)
Feedback Loops and Actionability in Data Catalogs (45:30)
Missing Metadata (48:57)
Role of AI in Data Semantics (50:27)
Data as a Product (54:26)
Slowing Down to Go Faster (57:38)
Quantifying the Cost of Data Changes (1:01:24)
Investor Sabrina Wu hosts Bobsled Co-founder and CEO Jake Graham for the latest episode of Founded & Funded. Jake is revolutionizing data sharing across platforms, enabling customers to get to analysis faster, directly in the platforms where they work. Madrona co-led Bobsled's $17 million Series A last year, which put the company at an $87 million valuation. In this episode, Jake — who had stints at Neo4j, Intel, and Microsoft — provides his perspective on why enabling cross-cloud data sharing is often cumbersome yet so important in the age of AI. He also shares why you can't PLG the enterprise, how to convince customers to adopt new technologies in a post-zero-interest-rate environment, and what it takes to land and partner with the hyperscalers.

Transcript here: https://www.madrona.com/bobsled-cross-cloud-data-sharing/

(00:00) Introduction
(01:36) Why found a startup?
(03:00) The Genesis of Bobsled: From Inspiration to Reality
(05:26) Understanding Bobsled's Functionality: Cross-Cloud Data Sharing
(09:48) The Role of Cross-Cloud Data Sharing in the Age of AI
(13:05) Redefining the Modern Data Stack and Its Future
(18:04) Navigating Enterprise Sales and Partnerships in Tech
(23:22) Strategic Partnerships and Navigating the Hyperscaler Landscape
(29:40) Leadership Lessons and Vision for Bobsled
Benn Stancil, cofounder and CTO at Mode, returns to The Analytics Engineering Podcast to discuss the evolution of the term "modern data stack" and its value today. Tristan wrote on this idea for The Analytics Engineering Roundup in Is the Modern Data Stack Still a Useful Idea? For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com. The Analytics Engineering Podcast is sponsored by dbt Labs.
In this episode, we explore the dynamic world of modern analytics with Tristan Handy, CEO of dbt Labs (https://twitter.com/jthandy). dbt Labs, which helps more than 30,000 enterprises ship trusted data products faster, has raised more than $400 million, most recently at a $4B valuation.

We discuss how dbt has revolutionized analytics engineering, enabling seamless data transformation and orchestration in the cloud. This innovation fosters greater collaboration among data teams and integrates software engineering principles into data analytics workflows.

We also talk about dbt's Semantic Layer, a game-changer that streamlines data operations by standardizing key business metrics for consistent use across various analytical tools.

In this conversation, we tackle pressing questions about the current state and future of data management and analytics. Is the "modern data stack" becoming obsolete? What's next for data engineering? And how is AI reshaping the analytics landscape?

Tune in to discover our insights.
Joe Reis and Matt Housley are back for another listener Q&A. They chat about the demise of the Modern Data Stack, architecture, data modeling, AI, and much more.
My voice is sort of working, and I chat about Tristan Handy's article that raised quite a ruckus this week, "Is the 'Modern Data Stack' Still a Useful Idea?" In the end, the Modern Data Stack won - people use the cloud for analytics. And everything ends, so I'm excited for what's next.

Article: https://roundup.getdbt.com/p/is-the-modern-data-stack-still-a?r=oc02
In this bonus episode, Eric and Kostas preview their upcoming conversation with Artyom Keydunov of Cube Dev.
Summary

The "modern data stack" promised a scalable, composable data platform that gave everyone the flexibility to use the best tools for every job. The reality was that it left data teams spending all of their engineering effort on integrating systems that weren't designed with compatible user experiences. The team at 5X understands the pain involved and the barriers to productivity, and set out to solve it by pre-integrating the best tools from each layer of the stack. In this episode, founder Tarush Aggarwal explains how the realities of the modern data stack are impacting data teams and the work that they are doing to accelerate time to value.

Announcements

Hello and welcome to the Data Engineering Podcast, the show about modern data management.

Introducing RudderStack Profiles. RudderStack Profiles takes the SaaS guesswork and SQL grunt work out of building complete customer profiles so you can quickly ship actionable, enriched data to every downstream team. You specify the customer traits, then Profiles runs the joins and computations for you to create complete customer profiles. Get all of the details and try the new product today at dataengineeringpodcast.com/rudderstack (https://www.dataengineeringpodcast.com/rudderstack)

You shouldn't have to throw away the database to build with fast-changing data. You should be able to keep the familiarity of SQL and the proven architecture of cloud warehouses, but swap the decades-old batch computation model for an efficient incremental engine to get complex queries that are always up-to-date. With Materialize, you can! It's the only true SQL streaming database built from the ground up to meet the needs of modern data products. Whether it's real-time dashboarding and analytics, personalization and segmentation, or automation and alerting, Materialize gives you the ability to work with fresh, correct, and scalable results — all in a familiar SQL interface. Go to dataengineeringpodcast.com/materialize (https://www.dataengineeringpodcast.com/materialize) today to get 2 weeks free!

Data lakes are notoriously complex. For data engineers who battle to build and scale high quality data workflows on the data lake, Starburst powers petabyte-scale SQL analytics fast, at a fraction of the cost of traditional methods, so that you can meet all your data needs, ranging from AI to data applications to complete analytics. Trusted by teams of all sizes, including Comcast and Doordash, Starburst is a data lake analytics platform that delivers the adaptability and flexibility a lakehouse ecosystem promises. And Starburst does all of this on an open architecture with first-class support for Apache Iceberg, Delta Lake and Hudi, so you always maintain ownership of your data. Want to see Starburst in action? Go to dataengineeringpodcast.com/starburst (https://www.dataengineeringpodcast.com/starburst) and get $500 in credits to try Starburst Galaxy today, the easiest and fastest way to get started using Trino.

Your host is Tobias Macey, and today I'm welcoming back Tarush Aggarwal to talk about what he and his team at 5X are building to improve the user experience of the modern data stack.

Interview

Introduction
How did you get involved in the area of data management?
Can you describe what 5X is and the story behind it?
We last spoke in March of 2022. What are the notable changes in the 5X business and product?
What are the notable shifts in the data ecosystem that have influenced your adoption and product direction?
What trends are you most focused on tracking as you plan the continued evolution of your offerings?
What are the points of friction that teams run into when trying to build their data platform?
Can you describe the design of the system that you have built?
What are the strategies that you rely on to support adaptability and speed of onboarding for new integrations?
What are some of the types of edge cases that you have to deal with while integrating and operating the platform implementations that you design for your customers?
What is your process for selection of vendors to support?
How would you characterize your relationships with the vendors that you rely on?
For customers who have a pre-existing investment in a portion of the data stack, what is your process for engaging with them to understand how best to support their goals?
What are the most interesting, innovative, or unexpected ways that you have seen 5X used?
What are the most interesting, unexpected, or challenging lessons that you have learned while working on 5X?
When is 5X the wrong choice?
What do you have planned for the future of 5X?

Contact Info

LinkedIn (https://www.linkedin.com/in/tarushaggarwal/)
@tarush (https://twitter.com/tarush) on Twitter

Parting Question

From your perspective, what is the biggest gap in the tooling or technology for data management today?

Closing Announcements

Thank you for listening! Don't forget to check out our other shows. Podcast.__init__ (https://www.pythonpodcast.com) covers the Python language, its community, and the innovative ways it is being used. The Machine Learning Podcast (https://www.themachinelearningpodcast.com) helps you go from idea to production with machine learning.
Visit the site (https://www.dataengineeringpodcast.com) to subscribe to the show, sign up for the mailing list, and read the show notes.
If you've learned something or tried out a project from the show, then tell us about it! Email hosts@dataengineeringpodcast.com (mailto:hosts@dataengineeringpodcast.com) with your story.
To help other people find the show, please leave a review on Apple Podcasts (https://podcasts.apple.com/us/podcast/data-engineering-podcast/id1193040557) and tell your friends and co-workers.

Links

5X (https://5x.co)
Informatica (https://www.informatica.com/)
Snowflake (https://www.snowflake.com/en/)
Podcast Episode (https://www.dataengineeringpodcast.com/snowflakedb-cloud-data-warehouse-episode-110/)
Looker (https://cloud.google.com/looker/)
Podcast Episode (https://www.dataengineeringpodcast.com/looker-with-daniel-mintz-episode-55/)
DuckDB (https://duckdb.org/)
Podcast Episode (https://www.dataengineeringpodcast.com/duckdb-in-process-olap-database-episode-270/)
Redshift (https://aws.amazon.com/redshift/)
Reverse ETL (https://medium.com/memory-leak/reverse-etl-a-primer-4e6694dcc7fb)
Fivetran (https://www.fivetran.com/)
Podcast Episode (https://www.dataengineeringpodcast.com/fivetran-data-replication-episode-93/)
RudderStack (https://www.rudderstack.com/)
Podcast Episode (https://www.dataengineeringpodcast.com/rudderstack-open-source-customer-data-platform-episode-263/)
Peak.ai (https://peak.ai/)

The intro and outro music is from The Hug (http://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/Love_death_and_a_drunken_monkey/04_-_The_Hug) by The Freak Fandango Orchestra (http://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/) / CC BY-SA (http://creativecommons.org/licenses/by-sa/3.0/)
Key Points: The rush to categorize all of our tooling in data has caused many issues - we will see a big shake-up coming in the future, much like what happened in application development tooling. So much of data people's time is spent on work that doesn't add value in itself and should be automated. We need to fix that so data work is about delivering value. We can learn a lot from virtualization, but data virtualization is not where things should go in general. Containerization is merely an implementation detail. Much like software developers don't really care about process containers, the same will happen with data product containers - it's all about the experience, and containers significantly improve the experience. The pendulum swung towards decoupled data tech instead of monolithic offerings with 'The Modern Data Stack', but most of the technologies were not that easy to stitch together. Going forward, we want to keep the decoupled strategy, but we need a better way to integrate - APIs are how it worked in software, why not in data? Sponsored by NextData, Zhamak's company that is helping ease data product creation. For more great content from Zhamak, check out her book on data mesh, a book she collaborated on, her LinkedIn, and her Twitter. Sign up for Data Mesh Understanding's free roundtable and introduction programs here: https://landing.datameshunderstanding.com/ Please Rate and Review us on your podcast app of choice! If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here. Data Mesh Radio episode list and links to all available episode transcripts are here. Provided as a free resource by Data Mesh Understanding / Scott Hirleman. 
Get in touch with Scott on LinkedIn if you want to chat data mesh. If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/ All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm, MondayHopes, SergeQuadrado, ItsWatR, Lexin_Music, and/or
The modern data stack is a collection of cloud-based tools and technologies used to collect, store, process, and analyze data in a scalable way. It is a departure from traditional data stacks, which were often based on on-premises infrastructure and were not as well-suited for handling large volumes of data or complex data pipelines. But with this new approach comes complexity, and organizations must determine if the value outweighs the cost. Another, newer route has emerged for companies interested in serious analytical power: Hyperscale! This new architecture leverages an array of technical advances, including compute-adjacent storage, simplified data pipelines that make data available to more users, built-in integrations for a whole host of data sources, and machine learning algorithms baked into the architecture. Learn more on this episode of DM Radio as Host @eric_kavanagh interviews Chris Gladwin, CEO of Ocient, and Hyoun Park of Amalgam Insights.
Egor Gryaznov joins me to chat about the "Non-Modern Data Stack", getting out of our data bubble, and much more. If you like a refreshing conversation talking about the past, present, and future of our industry, this is for you. BigEye: https://webflow.bigeye.com/ LinkedIn: https://www.linkedin.com/in/egorgryaznov/
With Grupo Boticário we have already explored topics ranging from what it's like to work with data to how they use the Modern Data Stack. Now we want to know how AI is changing the way of working at one of the most admired companies in Latin America, according to the State of Data Brazil survey. In this episode of Data Hackers - the largest AI and Data Science community in Brazil - meet this team of specialists: Isabella Becker - DPO (Data Protection Officer); and Bruno Gobbet - Senior Data Manager; both working in Grupo Boticário's data area. Remember that you can find all the Data Hackers community podcasts on Spotify, iTunes, Google Podcast, Castbox, and many other platforms. If you like, you can also listen to the episode right here in this post! Link on Medium: https://medium.com/data-hackers/como-ia-est%C3%A1-mudando-a-forma-do-grupo-botic%C3%A1rio-trabalhar-data-hackers-podcast-74-c45006b64d67 Covered in the episode Meet our guests: Isabella Becker - DPO (Data Protection Officer) Bruno Gobbet - Senior Data Manager Data Hackers panel: Paulo Vasconcellos Monique Femme Reference links: GH TECH (Medium): https://medium.com/gbtech Data Hackers News (weekly news about data, AI, and technology) - https://podcasters.spotify.com/pod/show/datahackers/episodes/Data-Hackers-News-1---Amazon-investe-US-4-bi-na-Anthropic--Microsoft-anuncia-Copilot-para-Windows-11--OpenAI-anuncia-DALL-E-3-e29r06f Netflix series Coded Bias: https://www.netflix.com/br/title/81328723 Book (Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy): https://www.amazon.com/Weapons-Math-Destruction-Increases-Inequality/dp/0553418815 --- Send in a voice message: https://podcasters.spotify.com/pod/show/datahackers/message
Michel Tricot (CEO of Airbyte) joins me to chat about the impact of AI on the modern data stack, ETL for AI, the challenges of moving from open source to a paid product, and much more. Airbyte & Pinecone - https://airbyte.com/tutorials/chat-with-your-data-using-openai-pinecone-airbyte-and-langchain Note from Joe - we had audio issues because Michel got a new computer and didn't use the correct mic :(
Model deployment, data warehouse options for running models, and how to best leverage BI tools: Harry Glaser and Jon Krohn discuss Modelbit's capabilities to automate ML models from notebooks into production-ready models, reducing the time and effort spent 'translating' information from one mode to another. Harry's conversation with host Jon Krohn expanded on the importance of automating this task, and how developments in ML modeling have widened access so that entire teams can analyze data, whatever their level of expertise. This episode is brought to you by the AWS Insiders Podcast (https://pod.link/1608453414). Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information. In this episode you will learn: • What the modern data stack is [03:28] • Version control for data scientists [13:30] • CI/CD, load balancing and logging [20:38] • Snowflake vs. Redshift [30:10] • How tools like Looker and Tableau help monitor models [35:26] Additional materials: www.superdatascience.com/699
In today's episode, Luan Moreno and Mateus Oliveira talk with Matheus Willian, currently Head of Data Engineering at One Way Solution. dbt is one of the most talked-about and widely used technologies abroad, enabling teams of all sizes to work with the Modern Data Stack concept and making the development of data transformations simple and SQL-based. With dbt, you get the following benefits: development of data pipelines using SQL; code reuse through git structures; simplification of the data stack; and processing in Modern Data Warehouses, among other adapters. In this conversation we also cover the following topics: data as a central pillar; dbt; and modern BI teams. Learn more about dbt and how to use a Modern Data Stack technology together with the One Way Solution team, which does so much to boost the community - with content as well as training and events - helping Brazilian data professionals land jobs inside and outside the country. Matheus Willian = https://www.linkedin.com/in/matheuswillian/ https://www.getdbt.com/ Luan Moreno = https://www.linkedin.com/in/luanmoreno/
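The SQL-first, git-reusable workflow described in these notes can be sketched as a pair of dbt models; the schema, table, and column names here are hypothetical, purely for illustration of the pattern rather than any project discussed in the episode.

```sql
-- models/stg_orders.sql (hypothetical staging model)
-- Each dbt model is a single SELECT in its own file; dbt compiles it
-- into a view or table in the warehouse.
select
    order_id,
    customer_id,
    order_total,
    created_at
from raw.shop.orders
```

```sql
-- models/daily_revenue.sql (hypothetical downstream model)
-- ref() resolves to stg_orders and records the dependency, which is
-- how dbt enables reuse and lineage across git-managed SQL files.
select
    date_trunc('day', created_at) as order_date,
    sum(order_total) as revenue
from {{ ref('stg_orders') }}
group by 1
```

Running `dbt run` would then build both models in dependency order inside the warehouse.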
In this episode John Kutay hosts Aron Clymer to discuss the current state of the Modern Data Stack. Aron shares his expertise as he details how far we've come in data and what the next big milestones are. Aron also discusses Data Governance, the Semantic Layer, and how teams can best modernize their stack. Aron Clymer is Founder & CEO of Data Clymer, a next-gen data & analytics consulting firm that empowers every client's success by unlocking the value of data. The Data Clymer team implements modern cloud data solutions that drive positive results through data accessibility and actionable insights. Aron previously established and built the Product Intelligence team at Salesforce for 7 years to support all data and analytics needs of 400+ product managers. Subsequently, Aron headed up Data at PopSugar, where his team democratized data and supported analytics/data science across the company. Aron has grown Data Clymer over the past 6 years into a nationwide team of deeply experienced cloud data professionals. Follow Aron Clymer on LinkedIn Learn more about Data Clymer What's New In Data is a data thought leadership series hosted by John Kutay, who leads data and products at Striim. What's New In Data hosts industry practitioners to discuss the latest trends, common patterns for real-world data, and analytics success stories.
The modern data stack is a loose collection of technologies, often cloud-based, that collaboratively process and store data to support modern analytics. It must be automated, low code/no code, AI-assisted, graph-enabled, multimodal, streaming, distributed, meshy, converged, polyglot, open, and governed. Published at: https://www.eckerson.com/articles/twelve-must-have-characteristics-of-a-modern-data-stack
Summary The data ecosystem has been building momentum for several years now. As a venture capital investor Matt Turck has been trying to keep track of the main trends and has compiled his findings into the MAD (ML, AI, and Data) landscape reports each year. In this episode he shares his experiences building those reports and the perspective he has gained from the exercise. Announcements Hello and welcome to the Data Engineering Podcast, the show about modern data management Businesses that adapt well to change grow 3 times faster than the industry average. As your business adapts, so should your data. RudderStack Transformations lets you customize your event data in real-time with your own JavaScript or Python code. Join The RudderStack Transformation Challenge today for a chance to win a $1,000 cash prize just by submitting a Transformation to the open-source RudderStack Transformation library. Visit dataengineeringpodcast.com/rudderstack (https://www.dataengineeringpodcast.com/rudderstack) today to learn more Your host is Tobias Macey and today I'm interviewing Matt Turck about his annual report on the Machine Learning, AI, & Data landscape and the insights around data infrastructure that he has gained in the process Interview Introduction How did you get involved in the area of data management? Can you describe what the MAD landscape report is and the story behind it? At a high level, what is your goal in the compilation and maintenance of your landscape document? What are your guidelines for what to include in the landscape? As the data landscape matures, how have you seen that influence the types of projects/companies that are founded? What are the product categories that were only viable when capital was plentiful and easy to obtain? What are the product categories that you think will be swallowed by adjacent concerns, and which are likely to consolidate to remain competitive? 
The rapid growth and proliferation of data tools helped establish the "Modern Data Stack" as a de-facto architectural paradigm. As we move into this phase of contraction, what are your predictions for how the "Modern Data Stack" will evolve? Is there a different architectural paradigm that you see as growing to take its place? How has your presentation and the types of information that you collate in the MAD landscape evolved since you first started it? What are the most interesting, innovative, or unexpected product and positioning approaches that you have seen while tracking data infrastructure as a VC and maintainer of the MAD landscape? What are the most interesting, unexpected, or challenging lessons that you have learned while working on the MAD landscape over the years? What do you have planned for future iterations of the MAD landscape? Contact Info Website (https://mattturck.com/) @mattturck (https://twitter.com/mattturck) on Twitter MAD Landscape Comments Email (mailto:mad2023@firstmarkcap.com) Parting Question From your perspective, what is the biggest gap in the tooling or technology for data management today? 
Links MAD Landscape (https://mad.firstmarkcap.com) First Mark Capital (https://firstmark.com/) Bayesian Learning (https://en.wikipedia.org/wiki/Bayesian_inference) AI Winter (https://en.wikipedia.org/wiki/AI_winter) Databricks (https://www.databricks.com/) Cloud Native Landscape (https://landscape.cncf.io/) LUMA Scape (https://lumapartners.com/lumascapes/) Hadoop Ecosystem (https://www.analyticsvidhya.com/blog/2020/10/introduction-hadoop-ecosystem/) Modern Data Stack (https://www.fivetran.com/blog/what-is-the-modern-data-stack) Reverse ETL (https://medium.com/memory-leak/reverse-etl-a-primer-4e6694dcc7fb) Generative AI (https://generativeai.net/) dbt (https://www.getdbt.com/) Transform (https://transform.co/) Podcast Episode (https://www.dataengineeringpodcast.com/transform-co-metrics-layer-episode-206/) Snowflake IPO (https://www.cnn.com/2020/09/16/investing/snowflake-ipo/index.html) Dataiku (https://www.dataiku.com/) Iceberg (https://iceberg.apache.org/) Podcast Episode (https://www.dataengineeringpodcast.com/tabular-iceberg-lakehouse-tables-episode-363) Hudi (https://hudi.apache.org/) Podcast Episode (https://www.dataengineeringpodcast.com/hudi-streaming-data-lake-episode-209/) DuckDB (https://duckdb.org/) Podcast Episode (https://www.dataengineeringpodcast.com/duckdb-in-process-olap-database-episode-270/) Trino (https://trino.io/) Y42 (https://www.y42.com/) Podcast Episode (https://www.dataengineeringpodcast.com/y42-full-stack-data-platform-episode-295) Mozart Data (https://www.mozartdata.com/) Podcast Episode (https://www.dataengineeringpodcast.com/mozart-data-modern-data-stack-episode-242/) Keboola (https://www.keboola.com/) MPP Database (https://www.techtarget.com/searchdatamanagement/definition/MPP-database-massively-parallel-processing-database)
The Top Entrepreneurs in Money, Marketing, Business and Life
Modern Data Stack.
Today I had the pleasure of interviewing Chris Tabb. Chris is the Co-Founder & CCO at LEIT DATA. In this episode he schools me on the modern data stack and gives me a great look into the history of data and data management. He started his career in the Business Intelligence/Analytics domain 30 years ago, beginning at Cognos in the '90s, working in the back office before becoming an expert in all their products and leaving to become an independent BI consultant in 1998. It is safe to say he loves data and always has. He followed the evolution of the analytics industry, working hands-on with all the technologies in the ecosystem: databases, ETL/ELT, BI/OLAP/visualisation tools, big data technologies, and infrastructure on-premises and in the cloud, across many vendors, some old, some new. Nowadays he works at a more strategic level, providing technical roadmaps, vendor selection, migration strategies, data management, and data & application architecture, but he still likes to keep hands-on with products in the data ecosystem. Chris's Links: LinkedIn - https://www.linkedin.com/in/chris-tabb-datatips/?originalSubdomain=uk Ski Event - https://skiersindata.com/ Leit Data - https://www.leit-data.com/
dbt is known as being part of the Modern Data Stack for ELT processes. Being in the MDS, dbt Labs believes in having the best of breed for every part of the stack. Oftentimes folks are using an EL tool like Fivetran to pull data from the database into the warehouse, then using dbt to manage the transformations in the warehouse. Analysts can then build dashboards on top of that data, or execute tests. It's possible for an analyst to adapt this process for use with a microservice application using Apache Kafka® and the same method to pull batch data out of each and every database; however, in this episode, Amy Chen (Partner Engineering Manager, dbt Labs) tells Kris about a better way forward for analysts willing to adopt the streaming mindset: reusable pipelines using dbt models that immediately pull events into the warehouse and materialize as views by default. dbt Labs is the company that makes and maintains dbt. dbt Core is the open-source data transformation framework that allows data teams to operate with software engineering's best practices. dbt Cloud is the fastest and most reliable way to deploy dbt. Inside the world of event streaming, there is a push to expand data access beyond the programmers writing the code and towards everyone involved in the business. Over at dbt Labs they're attempting something of the reverse: to get data analysts to adopt the best practices of software engineers, and more recently, of streaming programmers. They're improving the process of building data pipelines while empowering businesses to bring more contributors into the analytics process, with an easy-to-deploy, easy-to-maintain platform. 
It offers version control to analysts who traditionally don't have access to git, along with the ability to easily automate testing, all in the same place. In this episode, Kris and Amy explore: how to revolutionize testing for analysts with two of dbt's core functionalities; what streaming in a batch-based analytics world should look like; what can be done to improve workflows; and how to democratize access to data for everyone in the business. EPISODE LINKS: Learn more about dbt Labs; An Analytics Engineer's Guide to Streaming; Panel discussion: If Streaming Is the Answer, Why Are We Still Doing Batch?; All Current 2022 sessions and slides; Watch the video version of this podcast; Kris Jenkins' Twitter; Streaming Audio Playlist; Join the Confluent Community; Learn more with Kafka tutorials, resources, and guides at Confluent Developer; Live demo: Intro to Event-Driven Microservices with Confluent; Use PODCAST100 to get an additional $100 of free Confluent Cloud usage (details)
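The view-by-default materialization and automated testing mentioned above can be sketched as a dbt configuration fragment; the project, model, and column names are hypothetical, shown only to illustrate the shape of the config.

```yaml
# dbt_project.yml (excerpt): materialize models as views by default
models:
  my_project:            # hypothetical project name
    +materialized: view
```

```yaml
# models/schema.yml (excerpt): declarative tests an analyst can run
# with `dbt test`, no custom code required
version: 2
models:
  - name: orders         # hypothetical model
    columns:
      - name: order_id
        tests:
          - unique
          - not_null
```

With this in place, `dbt build` creates the views and runs the declared tests in one pass, which is what puts software-engineering-style checks within reach of analysts.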
In this episode we talk with Maayan Salom about dbt and Elementary, and how these two tools have been helping data teams implement data pipelines efficiently and safely. dbt has become one of the most widely used tools for transforming data inside the data warehouse, because it makes it easy to use SQL for data processing. With dbt it is possible to have a broad view of what is happening inside your analytical source of truth, in addition to several interesting capabilities for teams that want to scale quickly and in a structured way. Elementary is an open-source product whose responsibility is to apply the concept of observability to the data pipelines built in dbt. This solution delivers reports, anomaly detection, and pipeline performance validation, and can even send alerts to Slack, all to improve and enrich your ETL process. In this conversation you will understand how dbt and Elementary can reduce complexity when building and observing your data pipelines, and bring your data team into a reliable, monitored environment. dbt Elementary Maayan Salom Luan Moreno = https://www.linkedin.com/in/luanmoreno/
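As a rough sketch of how Elementary attaches to a dbt project: it is installed as a dbt package, and its monitoring models are built alongside your own. The version number below is illustrative only; check the Elementary docs for the current release.

```yaml
# packages.yml: add Elementary as a dbt package
packages:
  - package: elementary-data/elementary
    version: 0.13.0   # illustrative version, see the Elementary docs

# then, roughly:
#   dbt deps                      # install the package
#   dbt run --select elementary   # build Elementary's monitoring models
```

From there, Elementary reads dbt's run and test results to produce the reports, anomaly detection, and Slack alerts described in the episode.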
With the increasing rate at which new data tools and platforms are being created, the modern data stack risks becoming just another buzzword data leaders use when talking about how they solve problems. Alongside the arrival of new data tools is the need for leaders to see beyond just the modern data stack and think deeply about how their data work can align with business outcomes, otherwise, they risk falling behind trying to create value from innovative, but irrelevant technology. In this episode, Yali Sassoon joins the show to explore what the modern data stack really means, how to rethink the modern data stack in terms of value creation, data collection versus data creation, and the right way businesses should approach data ingestion, and much more. Yali is the Co-Founder and Chief Strategy Officer at Snowplow Analytics, a behavioral data platform that empowers data teams to solve complex data challenges. Yali is an expert in data with a background in both strategy and operations consulting teaching companies how to use data properly to evolve their operations and improve their results.