Every Monday morning, two data nerds - Matt Housley and Joe Reis - have candid and unscripted chats about data.
That's it folks! We're done. This is the final episode of the Monday Morning Data Chat. Find Joe and Matt wherever they might be.
The amount of hype and bullsh*t in AI today is sky high. But there are also some awesome achievements happening in the field. How do you separate what's real and what's hype? Paco Nathan has been at the forefront of ML and AI, among other fields, for several decades. He joins us to chat about the myths and reality of AI.
Weimo Liu joins us to chat about all things Iceberg, including why it's the unified storage for SQL and NoSQL.
Matthew Mullins (CTO Coginiti) joins us to chat about a lot of topics, perhaps including bridging the gap between data engineers and analysts, small data, why we're still not learning from devops, the commoditization of data warehousing, and more.
Ricky Thomas and Paul Dudley join us to chat about streaming CDC and much more.
Andrew Ng joins us to chat about AI, data engineering, and education. Enjoy!
Ghalib Suleiman joins us once again to talk about data stacks, open table formats, data roles, and much more.
Should data engineers focus on the technology? Finding ways to transform data into something valuable? Both? Tevje Olin joins us to chat about what he thinks data engineers should focus on. LinkedIn: https://www.linkedin.com/in/tevjeolin/ Agile Data Engine: https://www.agiledataengine.com/
Rob Harmon joins us to chat about small data, being efficient, data modeling, and much more.
We're back from our Summer break and ready to rant about all kinds of stuff. We talk about the tech downturn, AI hype bubbles, and much more.
Nick Schrock and Wes McKinney join us for a chat about composable data stacks, open table formats, managing complexity, and much more.
Joe and Matt are going to take the summer off from the Monday Morning Data Chat. They'll chat about what's on their minds, summer plans, and more. Plus, special guest Zhamak Dehghani discusses the current state of Data Mesh, and what she's up to at NextData.
Ghalib Suleiman (Polytomic) joins the show to chat about how the zero-interest rate hangover is affecting data and AI. We dive into the AI hype cycle, "data influencers", data vendors, data teams, and much more.
Bart Vandekerckhove (Raito) joins us to chat about data security, and the challenges data teams face when using traditional IAM technology and workflows for data access/security management.
It seems like LLMs are taking the analytics world by storm. But how do you use them to support the analytics workflow? Yali Sassoon (CTO, Co-founder of Snowplow) joins us to chat about this and much more. We'll also likely dive into behavioral analytics and more. Snowplow: https://snowplow.io/
David Yaffe (Estuary) and John Kutay (Striim) join the show to chat about the state of streaming and change data capture (CDC) in 2024 and beyond. There is a lot to cover and learn in this show. Estuary: https://estuary.dev/ Striim: https://www.striim.com/
Data products are all the rage. But what are they? And what the heck does "customer-facing" mean? Will your old-school BI tool handle customer-facing needs? Solomon Kahn joins us to chat about customer-facing data products and much more. He's one of the people we consider to be at the bleeding edge of modern analytics and data products, so definitely check this out. Delivery Layer: https://www.deliverylayer.com/ LinkedIn: https://www.linkedin.com/in/solomonkahn/
Katharine Jarmul is an AI/ML privacy and security expert, and the author of Practical Data Privacy. She joins us to chat about whether we are solving the "right" problems with AI/ML/data science, exploring what "safe", "responsible", and "ethical" AI means, and much more.
Cedric Chin (CommonCog) and Sam Taylor join us to chat about communicating sophisticated stuff to stakeholders, statistical process control, XmR charts, and a ton more. CommonCog: https://commoncog.com/ XMRIT: https://xmrit.com/
Martin Musiol (Founder and CEO of generativeAI.net and GenAI Lead for Europe at Infosys) joins us to chat about all things generative AI, and his new book, "Generative AI: Navigating the Course to the Artificial General Intelligence Future."
Veteran data industry analyst Tony Baer joins us to chat about his outlook for generative AI in 2024 and beyond. Tony goes way deeper than most people in his analysis, and if you're interested in where things are going with generative AI, you'd better tune in.
Ethan Aaron (CEO of Portable) joins us to chat about whether data is a job or a skill, what's stopping companies from running their analytics out of a Google Sheet, and much more.
Jean-Georges Perrin joins the show to chat about data mesh, data contracts, modern data engineering standards, Bitol (an open standard project with the Linux Foundation), data architecture, and much more. This is a wide-reaching discussion. Enjoy!
Joe Reis and Matt Housley are back for another listener Q&A. They chat about the demise of the Modern Data Stack, architecture, data modeling, AI, and much more.
Our favorite Valentine's Week guest and all around love doctor, Scot Taylor, joins the show to chat about explaining "value" to the business, puppets, and much more.
Michel Tricot (CEO of Airbyte) joins the show to chat about the impact of AI on traditional data practices (e.g. ETL/ELT), building a company, and much more.
Benn Stancil joins the show to chat about 2024 predictions, how GenAI impacts product development, writing, and more. Please note - Internet problems occasionally affected Benn's audio during the talk.
Dave Langer returns to the show to chat about Python in Excel, data science in SMBs without "data scientists", and much more.
It's Joe and Matt today, taking listener questions and ranting about whatever's on their minds.
Alex Gallego (founder and CEO of Redpanda) joins the show to chat about the streaming data renaissance, why open formats for tiered storage are the future of data, and much more.
Mike Ferguson (Managing Director, Intelligent Business Strategies, Chairman of Big Data London) joins the show to chat about the key trends he's seeing in data management and analytics - GenAI, architecture modernization, FinOps, and much more.
Tristan Handy (CEO of dbt Labs) joins the show to chat about the data tooling landscape, business moats, semantic layers, the data engineering ecosystem, and much more. We covered a ton of ground in an hour and probably could've kept going for another hour. Enjoy!
Sol Rashidi is a heavy hitter in the enterprise data space, having been CXO at Estee Lauder, Sony, Merck, and more. She joins us to chat about getting business value from data, the CXO playbook, AWS ReInvent, and more.
Sarah Nagy (CEO, Seek.ai) joins us to chat about automating analytics with generative AI, the generative AI space in general, and much more.
Dave McComb (Semantic Arts) is a pioneer in the use of knowledge graphs and semantics in data management. He joins to chat about these topics, and much more.
Kai Zenner joins us to chat about all things EU AI Act. If you've wanted to learn about this upcoming piece of critical regulation, tune in.
Nadine Farah joins the show to chat about Apache Hudi's core primitives: indexing, CDC, table services, faster UPSERTs, incremental processing framework, and more.
Yoav Cohen (co-founder & CTO at Satori) joins the show to chat about why data security is hard, strategies companies use to deal with analytics over sensitive data, security and compliance requirements that data teams need to meet, and much more.
Our favorite nerd sniper, Kevin Hu (CEO of Metaplane), joins the show to help us recap some major conferences we attended last week. Lots of data news, gossip, anecdotes, and more.
Why are semantics important for a data warehouse? Lukas Schulte joins us to chat about why semantics are important, the heterogeneity of data systems, how semantics relate to SQL compilers, his project SDF, and much more. Please be aware that this discussion will get into the nitty-gritty and technical weeds of all things data. #sql #data #datawarehouse
This is a bit of a different episode, but it's a topic that is long overdue for discussion. Between long hours sitting in front of a monitor, "hustle culture", and prevalent alcohol and drug use, our profession is literally killing us. The negative effects on health and wellness among techies are insane. We've seen our friends go to the ER from stress, diet, and lifestyle-related emergencies. We've lost other friends along the way. Colleen Fotsch is uniquely qualified to discuss this issue. She is used to operating at the highest levels of sports, being an NCAA D1-champion swimmer, multiple-time CrossFit Games athlete and coach, and former US Bobsled team member. She also works as a data analyst and part-time coach for Opex, a leader in fitness education coaching (she's Joe's coach). It's time we wake up and look at how we can improve our health and wellness, and bring our best selves to our work and life. Colleen's IG: https://www.instagram.com/colleenfotsch/?hl=en
Matt Housley and Joe Reis chat about where data engineering is going, and take audience questions.
The data job market is certainly evolving. Matt, Chris, and Joe have a candid chat and AMA about career advice going into 2024.
Amit Prakash (CTO/Co-Founder of Thoughtspot) joins the show to chat about the future of generative AI in data analytics. Thoughtspot has been a leader in searchable analytics, and it will be interesting to get Amit's take on where the field of analytics is heading next.
Max Howell created Homebrew, one of the most popular open-source software (OSS) packages on the planet. He's also the founder of tea.xyz, which is helping incentivize developers to pursue their OSS projects. In this episode, we chat about the realities and future of OSS, how developers can be remunerated for their OSS projects, and much more. Tea: tea.xyz
Aaron Hunsaker joins Matthew Housley and me to chat about data grifters, dealing with vendors, how data people should converse with "the business", and much more. Aaron doesn't hold back. Enjoy this very candid in-person chat.
Hala Nelson joins the show to chat about writing books, teaching math, and much more. It's not often we get three math nerds, professors, and authors in the same conversation, and this is a lot of fun. Enjoy!
David Yaffe and Johnny Graettinger (both from Estuary) join the show to do a deep dive into streaming data processing. We also cover how to scale change data capture (CDC) and where transformations belong in data pipelines. Estuary: https://estuary.dev/ Gazette: https://gazette.readthedocs.io/en/latest/ Note - Joe's audio was having issues for this episode. Apologies.
Multiple products, versions, platforms, targets technologies, formats, and locales? How do you make sense of the "multiple of multiples" challenge from a technical perspective? The "language of the business" and data in all its structured, semi-structured, and 'unstructured' forms helps drive this home. John O'Gorman has world-class expertise in language, semantics, and tying this together for the business. We hope you learn something new from this episode.
Brian Olsen joins us again to chat about why Apache Iceberg won the table format war. We also finish our chat from last time about Data Mesh. #dataengineering #datalake #datamesh
David Langer joins the show to chat about programming languages for data science, and how BI teams have a unique advantage in helping introduce data science into their organizations.