The Datanation Podcast - Podcast for Data Engineers, Analysts and Scientists

Explaining concepts and ideas relevant to data engineering and analytics. Follow Alex on Twitter @amdatalakehouse. Find articles Alex has written on data-related topics at Dremio.com/Subsurface

Alex Merced Podcasts


    • May 27, 2025 LATEST EPISODE
    • monthly NEW EPISODES
    • 56 EPISODES


    Latest episodes from The Datanation Podcast - Podcast for Data Engineers, Analysts and Scientists

    Data News: DuckLake, Confluent’s TableFlow, New Book!

    Play Episode Listen Later May 27, 2025


    Go to DataLakehouseHub.com and join my Slack community. Download Free Iceberg Book: https://drmevn.fyi/podcast52725iceberg Download Free Polaris Book: https://drmevn.fyi/podcast52725Polaris

    Bonus – What is MCP? (Model Context Protocol – Modern AI 101)

    Play Episode Listen Later Apr 6, 2025


    Alex Merced discusses modern AI, the new MCP standard, and what it means. Follow Alex’s blogs and socials at AlexMerced.com

    Will Apache Iceberg and Delta Lake Merge?

    Play Episode Listen Later Feb 21, 2025


    Alex Merced discusses the idea of whether Apache Iceberg and Delta Lake could merge. Follow my blog: https://medium.alexmerced.blog

    2025, Data Lakehouse Looking Forward

    Play Episode Listen Later Dec 30, 2024


    Alex Merced discusses what he looks forward to in 2025 in the data lakehouse space. Alex Merced Event Listings: https://lu.ma/Lakehouselinkups Alex on Bluesky: https://bsky.app/profile/alextalksdatalakehouses.fyi Alex on Twitter: https://x.com/AMdatalakehouse Alex on LinkedIn: https://www.linkedin.com/in/alexmerced/

    63 – Reinvent, AWS S3 Table Buckets and Apache Iceberg

    Play Episode Listen Later Dec 6, 2024


    Alex Merced discusses his experience at AWS re:Invent. Follow Alex at AlexMerced.com/data

    BONUS: Data Lakehouse Crash Course (Polaris, Nessie, Unity, Gravitino, Lakekeeper and more!)

    Play Episode Listen Later Nov 5, 2024


    Register for the catalog Course: https://drmevn.fyi/catalogcourse1024 Watch the Iceberg Crash Course: https://drmevn.fyi/icebergcourse1024 London Meetup: https://lu.ma/Lakehouselinkups Paris Meetup: https://drmevn.fyi/1120-france-meetup My Calendar of Events: https://lu.ma/Lakehouselinkups

    62 – Why Catalogs are so hot right now in the data space?

    Play Episode Listen Later Oct 30, 2024


    Alex Merced discusses why catalogs are so important in data.

    61 – What’s New In dbt? (dbt coalesce 2024)

    Play Episode Listen Later Oct 11, 2024


    Alex Merced discusses the news and announcements from dbt Coalesce 2024. Announcements Alex didn’t mention: – dbt Apache Iceberg support, which works through Iceberg-supporting query engines like Dremio – Healthtiles with more information on your dashboard about the health of your models – Auto-exposures in Tableau triggering BI dashboard updates when models […]

    FREE Apache Iceberg Crash Course

    Play Episode Listen Later Jul 9, 2024


    Register for the Course: https://bit.ly/am-2024-iceberg-live-crash-course-1 Free Copy of Apache Iceberg Book: https://bit.ly/am-iceberg-book My social and blog links: https://bio.alexmerced.com/data

    60 – Interoperability of Data Lake Table Format (Apache Iceberg, Apache Hudi, Delta Lake)

    Play Episode Listen Later Jun 28, 2024


    Alex Merced discusses where interoperability tools like Apache XTable and UniForm

    #59 – Apache Iceberg Catalogs (Nessie) vs Enterprise Data Catalogs (Collibra)

    Play Episode Listen Later Jun 25, 2024


    Alex Merced discusses the difference between Apache Iceberg catalogs and enterprise data catalogs to help clarify the discussions around catalogs in today’s data trends. Follow Alex -> https://bio.alexmerced.com/data

    58 – Databricks Announcements (Open Source Unity Catalog, Liquid Clustering, Nvidia)

    Play Episode Listen Later Jun 12, 2024


    Alex Merced discusses some of the Databricks announcements at the Data + AI Summit. Follow Alex by visiting https://bio.alexmerced.com/data

    57 – Databricks buys Tabular

    Play Episode Listen Later Jun 5, 2024


    I talk about the big news of the day. Follow me on Twitter @amdatalakehouse

    56 – Open Source Apache Iceberg Catalogs (Nessie, Polaris, Gravitino)

    Play Episode Listen Later Jun 4, 2024


    Alex Merced discusses the value of Open Source Apache Iceberg catalogs in creating a truly open lakehouse environment without Vendor lock-in. Check out my article on the subject: https://open.substack.com/pub/amdatalakehouse/p/open-source-table-format-open-source?r=h4f8p&utm_campaign=post&utm_medium=web&showWelcomeOnShare=true Follow me on twitter at @amdatalakehouse

    55 – Discussing the Apache Iceberg Kafka Connect Connector

    Play Episode Listen Later May 16, 2024


    In this episode, we delve into the Apache Iceberg Kafka Connector, a critical tool for streaming data into your data lakehouse. We’ll explore how this connector facilitates seamless data ingestion from Apache Kafka into Apache Iceberg, enhancing your real-time analytics capabilities and data lakehouse efficiency. We’ll cover: Join us to understand how the Apache Iceberg […]

    54 – Major Architectural Differences between Apache Iceberg and Delta Lake (Partition Evolution and Hidden Partitioning)

    Play Episode Listen Later Apr 20, 2024


    Alex Merced discusses some of the major differences in how Apache Iceberg and Delta Lake work that lead to: Follow me on social https://bio.alexmerced.com/data

    53-Why Do Snowflake Bills Get So Large?

    Play Episode Listen Later Apr 17, 2024


    Alex Merced discusses the mistakes that make Snowflake bills get so large. Hands-On Lakehouse Laptop Exercises: – MongoDB with Dremio: https://bit.ly/am-mongodb-dashboard – SQLServer with Dremio: https://bit.ly/am-sqlserver-dashboard – Postgres with Dremio: https://bit.ly/am-postgres-to-dashboard https://bio.alexmerced.com/data

    52 – Apache Iceberg, Dremio and PuppyGraph

    Play Episode Listen Later Mar 28, 2024


    Alex Merced discusses the benefits of Apache Iceberg’s open data ecosystem! Build a Data Lakehouse on Your Laptop. Deploy into Production.

    #1 – intro to catalogs, manifests and metadata. Oh my!

    Play Episode Listen Later Mar 25, 2024


    In this episode, Alex Merced introduces his new podcast “Catalogs, Manifests, and Metadata. Oh my!” covering open-source data projects like Apache Iceberg and others. Make sure to subscribe; this podcast will show up in podcast directories within a week or so of this episode’s publication. Follow Alex Merced, find all links […]

    51 – Open Data Standards (Apache Iceberg, Apache Parquet, Apache Arrow, Apache Ibis, Apache Substrait)

    Play Episode Listen Later Mar 18, 2024


    Alex Merced discusses many of the open source projects aiming to reduce friction in the heavily fragmented data world. Follow me on socials: https://bio.alexmerced.com/data

    50 – Thinking about the flow of Streaming/Real-Time Data

    Play Episode Listen Later Feb 21, 2024


    Alex reflects on the development of real-time data pipelines.

    48 – Understanding how Lakehouse Table Formats are Implemented in your Favorite Tools

    Play Episode Listen Later Feb 2, 2024


    Alex Merced discusses how formats like Apache Iceberg, Apache Hudi and Delta Lake work and are implemented in your favorite tools, distinguishing what is the responsibility of the format and what is the responsibility of the engine. Follow Alex on social, find all links at: https://bio.alexmerced.com/data

    47 – Understanding your cloud costs (Storage, Egress, Compute, Serverless, etc.)

    Play Episode Listen Later Jan 21, 2024


    Alex Merced discusses cloud costs. Alex’s Links: https://bio.alexmerced.com/data

    Bonus: New Youtube Channel, State of the Data Lakehouse

    Play Episode Listen Later Jan 20, 2024


    Find all my data resources below: https://bio.alexmerced.com/data Listen to the State of the Data Lakehouse Podcast here: https://em360tech.com/podcast/dremio-state-data-lakehouse?utm_source=podcasts&utm_medium=podcast&utm_content=content&utm_campaign=alexmercedcontent&utm_term=iceberg+lakehouse+nessie

    2024 Preview – Data/Web Content

    Play Episode Listen Later Jan 9, 2024


    youtube.com/@alexmercedcoder youtube.com/@alexmerceddata twitter.com/alexmercedcoder twitter.com/amdatalakehouse

    46 – Apache Iceberg vs Delta Lake: Understanding the Table Format Debate

    Play Episode Listen Later Dec 8, 2023


    ZeroETL & Virtual Data Marts Presentation: https://www.youtube.com/watch?v=mDwpsg8btto Blog for getting hands on with Dremio on Laptop: https://www.dremio.com/blog/intro-to-dremio-nessie-and-apache-iceberg-on-your-laptop/

    45 – BI Dashboard Acceleration (Extracts, Cubes and Reflections)

    Play Episode Listen Later Nov 1, 2023


    Alex Merced discusses different techniques to speed up BI Dashboard performance.

    Call for Speakers – Subsurface 2024 (Live in NYC May 2024)

    Play Episode Listen Later Oct 30, 2023


    Submit your talks here: https://www.dremio.com/subsurface/

    44 – Multi-Table Versioning and why Abstractions Matter

    Play Episode Listen Later Oct 19, 2023


    There is a reason the Git-for-Data paradigm of Nessie catalogs is so essential: not only for the versioning features it provides, but also for the level of abstraction at which it provides them. In this episode, I discuss this more.

    43 – Building a Data Lakehouse on your Laptop

    Play Episode Listen Later Aug 23, 2023


    In just a few commands, you can have everything you need to practice ingestion and querying with popular data software. Just install Docker and then run the commands in the image. You can also follow the directions in this blog: https://lnkd.in/eDiC8fc6 Also try out this video series: https://lnkd.in/gp843ErM

    42 – Window Functions and Apache Iceberg Metadata Tables

    Play Episode Listen Later Jul 12, 2023


    Alex Merced describes what window functions are and how they can be applied to Apache Iceberg metadata tables.

    41 – Databricks’ “Open” Problem and the Need for an Agnostic Intermediate Data Lakehouse Table Format

    Play Episode Listen Later Jun 29, 2023


    Alex Merced discusses some of the fallout from Databricks’ UNIFormat announcement, and the innovation the industry needs to unlock the data lakehouse. Follow me on twitter @amdatalakehouse

    40 – Big Announcements for Apache Iceberg, Delta Lake and Apache Hudi from Snowflake and Databricks

    Play Episode Listen Later Jun 28, 2023


    Alex Merced discusses some of the big announcements from this week’s conferences. Make sure to check out Gnarly Data Waves on your favorite podcast app.

    39 – What are Dremio’s Data Reflections and why are they so cool!

    Play Episode Listen Later Jun 23, 2023


    Alex Merced explains what Dremio reflections are and how they bring you speed and reduce storage costs while keeping things easy for your end users. Follow Alex on Twitter @amdatalakehouse

    38 – What is a DAG? (Directed Acyclic Graphs)

    Play Episode Listen Later Jun 20, 2023


    Follow Alex Merced @amdatalakehouse

    37 – Dremio, Data Lakehouses and Generative AI

    Play Episode Listen Later Jun 16, 2023


    Alex Merced discusses Dremio’s new generative AI Features and the future of Data Lakehouses. Follow Alex on twitter @amdatalakehouse

    36 – ELT & ETL: The Good, The Bad and the Ugly

    Play Episode Listen Later May 23, 2023


    Alex Merced reflects on a recent article from Lauren Balik on the topic of ELT. Here is the article: https://medium.com/@laurengreerbalik/how-fivetran-dbt-actually-fail-3a20083b2506 Lauren’s Twitter: @laurenbalik My Twitter handle: @amdatalakehouse

    35 – Data Lakehouse Statistics (Understanding Parquet and Iceberg)

    Play Episode Listen Later May 8, 2023


    Alex Merced helps explain how stats are collected and used when working with Parquet files and Apache Iceberg tables. Follow Alex on twitter @amdatalakehouse

    BONUS: What is Object Storage like AWS S3, Minio and more!

    Play Episode Listen Later Apr 12, 2023


    Alex Merced discusses what object storage is and the history of file systems. Join the community at datanation.click

    Happy New Year Web Dev 101 & DataNation

    Play Episode Listen Later Jan 6, 2023


    Alex Merced kicks off the new year with some updates. Follow him on Twitter: @amdatalakehouse and @alexmercedcoder

    26 – pyIceberg 0.2.0 Released – Apache Iceberg Python API

    Play Episode Listen Later Dec 9, 2022


    Alex Merced discusses the features of pyIceberg 0.2.0.

    25 – end of year and Dremio Test Drive

    Play Episode Listen Later Dec 2, 2022


    Alex Merced encourages everyone to go to Dremio.com and try out the free Dremio test-drive and gives end of year thoughts.

    24 – Upcoming Presentations in the Data World (OSA Con, Data Day Texas)

    Play Episode Listen Later Nov 12, 2022


    Follow Alex on Twitter @amdatalakehouse

    23 – Data Lake Migration On-Prem to Cloud

    Play Episode Listen Later Oct 14, 2022


    Alex Merced discusses Data Lake migration. Follow Alex on twitter @amdatalakehouse

    22- What is a Semantic Layer and why it matters

    Play Episode Listen Later Oct 7, 2022


    Alex Merced discusses what a semantic layer is and how it helps curate your data. Follow Alex @amdatalakehouse. Join the community at datanation.click

    Don’t use GPDHost and cool content announcements

    Play Episode Listen Later Sep 29, 2022


    AlexMercedCoder.dev

    21 – Data Stacks: Complex and Simple

    Play Episode Listen Later Sep 16, 2022


    Alex Merced discusses how complex data stacks can become and how simple they can be with data lakehouse technologies.

    20 – Optimizing Performance at the Data, File, Table and Engine Level on the Data Lakehouse

    Play Episode Listen Later Sep 2, 2022


    Alex Merced discusses the many angles from which a data engineer should think about engineering data for better performance. Read more at dremio.com/subsurface

    New Podcasts! Economics and Finance for N00bs & Living Better

    Play Episode Listen Later Aug 30, 2022


    Alex Merced introduces some new podcasts that will soon show up on Spotify, Google Play, Stitcher and iTunes.

    19 – What is the difference between a Spec, a Library and a Service

    Play Episode Listen Later Aug 24, 2022


    Alex talks about what a spec, a library and a service are, and how the distinction helps in understanding modern tech frameworks.

    18 – Enabling Startups to Have Big Data Architecture (Dremio and Apache Iceberg)

    Play Episode Listen Later Aug 13, 2022


    Alex Merced discusses how Dremio enables companies of all sizes to have big data scalability without breaking the bank, so they can be data-oriented earlier in their lifecycle.
