Podcasts about data engineering

  • 381 PODCASTS
  • 1,113 EPISODES
  • 38m AVG DURATION
  • 5 WEEKLY NEW EPISODES
  • Jun 11, 2026 LATEST

POPULARITY (chart by year, 2019 to 2026)


Best podcasts about data engineering


Latest podcast episodes about data engineering

MY DATA IS BETTER THAN YOURS
Why AI Needs Better Data Structures - with Stefan N., Outdooractive

Jun 11, 2026 38:01 Transcription Available


Why does a platform recommend exactly this tour and not another? In this episode, Jonas Rashedi talks with Stefan Neubig of Outdooractive about recommender systems, knowledge graphs, and the technical foundations of modern recommendation technology. Stefan gives insight into building a graph with around 92 million nodes and 140 million relationships. The discussion covers users, tours, regions, and interactions, and how machine learning models derive personalized recommendations from them. Especially interesting: why explainability is becoming ever more important, which surprising findings Outdooractive gained from experiments on visitor flow management, and why knowledge graphs could play an important role in combination with AI agents. MY DATA IS BETTER THAN YOURS is a project by BETTER THAN YOURS, the brand for really good podcasts.

DOU Podcast
SpaceX Buys Cursor | Google Updates | New Draft-Exemption Rules - DOU News #251

May 25, 2026 38:05


In the latest DOU News digest, we discuss the government's decision to update the draft-exemption rules for employees of "critical" enterprises. In the tech world, there is another scandal: the CEO of Bolt Financial called the firing of the entire HR department a win, and SpaceX, right after filing for the largest IPO in history, plans to buy out the AI startup Cursor. Watch these and other news items from the Ukrainian and global tech sector.
Timecodes
00:00 Intro
00:21 The government updates draft-exemption rules for "critical" enterprises
01:50 Google unveils Gemini 3.5 Flash
14:05 The "Data Engineering" course
15:26 Google API keys remain active after deletion
16:41 The first profitable quarter in Anthropic's history
18:26 The DOU salary survey and a portrait of the IT professional
19:12 Andrej Karpathy joins the Anthropic team
21:33 OpenAI will support AI education in Ukrainian schools during the war
22:33 Meta lays off thousands of people to cover its AI investments
23:51 The CEO of Bolt Financial on firing the entire HR department
27:12 Starbucks drops its AI inventory tool after 9 months
28:21 GitHub confirms the compromise of 3,800 repositories via a malicious VSCode extension
30:47 SpaceX officially files for the largest IPO in history
33:21 A new twist: SpaceX plans to buy out Cursor
34:18 The creators of Kingdom Come are officially making a "Lord of the Rings" game
35:12 What Zhenya recommends: Flipper One and the article "If you're an LLM, please read this"

Unf*ck Your Data
The 100-Euro Data Stack: How to Save on Expensive Licenses | Fabian Werkmeister

May 20, 2026 54:10


Do you sometimes look at your data stack and think: why is this thing so expensive and complicated? Five tools for loading, three for transforming, one more for storing, and one for visualizing, and suddenly you have a tool zoo that costs more than it delivers. Does it have to be that way? Probably not! Today we are cleaning up properly. The wonderful Fabian Werkmeister is back. Backpacking around the world taught him that whatever adds unnecessary weight gets thrown out. We bring exactly that machete into the data jungle and dissect the "Lean Data Stack". We spell out how a pinch of open source, a mono-repo, and the AI Claude can radically slim down your data platform. The result? Fortune 500 results for under 100 euros a month, with no hefty software licenses at all. Getting down to brass tacks: we also discuss who this works for (hello, SMEs!) and where the fun ends for the big tankers. Stop waiting weeks for IT tickets. At the end of the day, people make the difference, or the AI that finally breaks down the boundaries between business and tech.
Profiles:
Fabian's LinkedIn profile: https://www.linkedin.com/in/fabian-werkmeister/
Christian's LinkedIn profile: https://www.linkedin.com/in/christian-krug/
Christian's Wonderlink: https://wonderl.ink/@christiankrug
Unf*ck Your Data on LinkedIn: https://www.linkedin.com/company/unfck-your-data
Book recommendation:
Fabian's book recommendation: Die Seele will frei sein - Michael Singer
All recommendations in Melena's bookshop: https://gunzenhausen.buchhandlung.de/unfuckyourdata
Where to find Unf*ck Your Data:
The podcast on Spotify: https://open.spotify.com/show/6Ow7ySMbgnir27etMYkpxT?si=dc0fd2b3c6454bfa
The podcast on iTunes: https://podcasts.apple.com/de/podcast/unf-ck-your-data/id1673832019
The podcast on Deezer: https://deezer.page.link/FnT5kRSjf2k54iib6
The podcast on YouTube: https://www.youtube.com/@unfckyourdata
Merch:
https://unfckyourdata-shop.de/
Contact:
Email: christian@uyd-podcast.com
Timestamps:
00:00 Intro: The colorful tool zoo and what data engineering has to do with backpacking.
04:15 Ballast in the data stack: Why too many tools make you slow and poor.
08:37 The BI tool as religion: On Power BI, Tableau, and dangerous vendor lock-in.
10:02 The solution: A mono-repo, open source, and Claude as an absolute game changer.
13:07 Business vs. devs: How AI cuts the ticket madness from weeks to a few minutes.
23:45 Reality check: Why the lean data stack is perfect for mid-sized companies and why Siemens would sink with it.
28:23 Brass tacks: Fortune 500 results for a measly 100 euros a month.
36:31 Who builds this for us? The new power of the data analyst without a big IT department.
42:07 No fear of code: How AI translates complex data structures into plain language for the business.
50:01 Outro: Food in Japan, drum & bass beats, and philosophical reading.

The Joe Reis Show
Why 90% of Data Teams Are Failing at Modeling - Freestyle Friday (May 15, 2026)

May 15, 2026 16:07


NOTE - Sorry for the edits in this video. I used Descript to edit out the umms and uhhs, and it was a bit too aggressive. Will make it less jarring in future videos. Thanks.
Freestyle Friday, May 15, 2026
Walking around Salt Lake City and unpacking the April 2026 data modeling survey results (334 respondents). Across three surveys now: January's State of Data Engineering (1,100), March's AI usage poll (193), and April's data modeling deep-dive. Not surprisingly, the same two pain points keep surfacing: time pressure and lack of clear ownership.
90% of respondents have a data modeling pain point. When asked what would actually help, only 4.8% wanted better tools. Training, business requirements, time, and ownership crushed tooling in the rankings. Will AI improve things or make them worse? Time will tell...
Also covered:
• Why physical data modeling has become the default (and why that's a problem)
• Data modeling vs. schema design - they're not the same thing
• Semantic layers (yay or nay?), Lloyd Tabb, and Malloy
• Conway's Law, Reis's Law, and what changes when org charts get flattened by AI
• Why leadership is under more pressure than ever
• The June half-year survey is coming

Arumugam's Podcast
How I Switched to Cloud Data Engineering from Tech Support | Real Career Journey

May 11, 2026 65:20


In this Tamilboomi Weekly Meet, Saravanan shares his real journey of switching from Tech Support to Cloud Data Engineering while working and building his skills step by step.
If you are someone trying to move into Data Engineering, Cloud, or Data Careers, this session will give you practical insights on how to plan your transition.
• How to switch careers while managing a full-time job
• What skills are needed to become a Cloud/Data Engineer
• Practical learning strategies that worked in the real world
• Mistakes to avoid during your transition
• Career tips for aspiring data professionals
This is a real experience-sharing session from a Tamilboomi community member who successfully made the switch.
Tamilboomi is a tech learning community helping people grow their careers in Data Engineering, Cloud, and AI through workshops, meetups, and knowledge sharing.
Website: https://www.tamilboomi.com
Join our Weekly Tech Meetups and learning community.
• Data Engineering
• AI & Data Careers
• Career Switch Stories
• Tech Community Talks
In this session, you will learn:

The Joe Reis Show
Zach Wilson - Data Engineering in 2026, Traveling, and more - Freestyle Fridays - May 8, 2026

May 9, 2026 10:55


Zach Wilson and I happen to be in Stockholm, Sweden, this evening. In this Freestyle Friday chat, we talk about what it takes to be a data engineer in 2026 and much more.

The Joe Reis Show
AI Agents Can't Fix Data - Josh Wills on Where AI Breaks in Data Engineering

May 7, 2026 55:02


Josh Wills has spent 25 years writing data pipelines, with a career spanning Cloudera, Slack (as Director of Data Engineering), the dbt DuckDB adapter, and now training foundation models at Datology AI. He uses coding agents every day. And he keeps running into the same wall: the agents jump to conclusions, fix the wrong thing, and ship pipelines no one understands.
In this conversation, we unpack why AI agents struggle with the messiest, highest-stakes parts of data work, and what it means for the engineers managing them.
We get into:
- Big Data is back
- Why AI agents jump to conclusions on benchmarks and complex bottlenecks
- The $200K vibe-coded pipeline problem nobody wants to talk about
- Why there's no training data for the gnarly enterprise pipelines that actually power businesses
- "We're all managers now" - managing unreliable agents like managing unreliable people
- Wicked problems and the limits of intelligence
- Why politics is the last human endeavor to fall to LLMs (the data is never written down)
- Whether classical ML still has a place (yes)
- What Josh would tell a new grad starting in data today

The Measure Pod
#140 Taming BigQuery costs with Alvin.ai (with Martin Sahlen)

May 2, 2026 63:15


Full show notes and transcript - https://bit.ly/bq-cost-taming
Watch on YouTube - https://youtu.be/2QxXQH6waLk
-----
Episode Summary:
In this episode of The Measure Pod, Dara and Matthew welcome Martin Sahlen, CEO and co-founder of Alvin.ai. Martin shares his journey from studying computer science in Norway to serial entrepreneurship, eventually settling in Tallinn, Estonia, where he founded Alvin. He explains how the company pivoted from data lineage and observability into a focused BigQuery cost optimisation platform that automatically routes queries between billing models to deliver savings, charging a percentage of what it saves. The conversation covers Alvin's transparent, no-lock-in approach, the duality of cost and performance optimisation, and the competitive dynamics of operating alongside Google's own tooling.
-----
About The Measure Pod:
The Measure Pod is your go-to fortnightly podcast hosted by seasoned analytics pros. Join Dara Fitzgerald (Co-Founder at Measurelab) & Matthew Hooson (Head of Engineering at Measurelab) as they dive into the world of data, analytics and measurement, with a side of fun.
-----
If you liked this episode, don't forget to subscribe to The Measure Pod on your favourite podcast platform and leave us a review. Let's make sense of the analytics industry together!

Data Gen
#267 - Agentic AI Is Accelerating: What Impact on the Data Team? With Blef

Apr 27, 2026 34:32


Christophe Blefari has been Head of Data Engineering, Staff Data Engineer, and Head of Data at startups and large corporations, and he co-founded Nao, an open-source AI agent for analytics.
We cover:

Arumugam's Podcast
Cloud Data Engineering with AI 2026 | Introduction

Apr 19, 2026 65:26


You can join our WhatsApp group for discussion and keep yourself updated on data engineering.
https://chat.whatsapp.com/JpsCWtJMaIYLP27ehk3Bjf

The Catalyst by Softchoice
The Curiosity Episode: You're Not What You Know

Apr 8, 2026 24:13 Transcription Available


What got you here won't get you there. For most IT leaders, the path to the top was paved with expertise: knowing the systems, owning the decisions, having the answers. But something happens when that playbook stops working. Not a crash. Not a failure. Just a quiet plateau that tells you something needs to change.
In this episode of The Catalyst, we explore what's on the other side of that wall: a shift toward curiosity, empowerment, and a fundamentally different way of leading. Featuring leadership coach Kirsten Schmidtke, curiosity researcher Dr. Deb Clary, and Benevity VP of Engineering Rob Woolley: three voices who all landed in the same place.
What you'll take away:
• Why expertise becomes a trap for senior IT leaders, and how to recognize when it's happening to you
• The one behaviour change Rob Woolley made that created what he calls "titanic shifts" in his leadership
• What MIT-commissioned research reveals about the direct link between curiosity and organizational performance
• Why the best leaders aren't the ones with the most answers, and what they do instead
Featuring: Rob Woolley, VP Core Platform & Data Engineering at Benevity | Kirsten Schmidtke, Leadership Coach & Growth Advisor | Dr. Deb Clary, Author of The Curiosity Curve (Fast Company Press)
Learn more about Kirsten at kirstenschmidtke.com
Take Deb's curiosity assessment at debraclary.com
#ITLeadership #MidMarketIT #TheCatalyst #CuriousLeadership #Softchoice #LeadershipDevelopment #FutureOfIT
The Catalyst by Softchoice is the podcast dedicated to exploring the intersection of humans and technology.

The Joe Reis Show
Breaking Into Data Engineering in 2026: AI Tools, Standout Resumes, and more w/ Chris Gambill

Mar 31, 2026 50:37


In this episode, I sit down with Chris Gambill, a data strategy and engineering leader, fractional consultant, and career coach. We dive into the realities of the data engineering job market in 2026, exploring what it takes to stand out, the massive shift AI coding tools are causing, and why mastering the fundamentals of data engineering remains crucial.
Chris shares his unfiltered thoughts on coaching career switchers into data engineering, why finance professionals make great data engineers, and the exact resume and portfolio strategies hiring managers are actually looking for. We also get into the weeds on the latest AI development tools, comparing GitHub Copilot, Claude, and Codex. If you're looking for solid, no BS advice on the field of data engineering in 2026, this is a great discussion!
Gambill Data Engineering: https://www.gambilldataengineering.com/
LinkedIn: https://www.linkedin.com/in/databasemanagement/

Arumugam's Podcast
How AI is changing Data Engineering? in Tamil

Mar 29, 2026 70:20


Without coding, how to implement RAG on our own?
For more:
WhatsApp Group Link: https://chat.whatsapp.com/EVEyPBDLB28EFiPyTgypBU
LinkedIn: https://www.linkedin.com/company/tamilboomi-technologies/
Instagram: https://www.instagram.com/tamilboomitechnologies/?hl=en
WhatsApp: +91 9619663272
Website: https://www.tamilboomi.com/
Email: arumugam@tamilboomi.com

DataTalks.Club
Data Engineer Career in 2026: Roles, Specializations, and What Companies Look for - Slawomir Tulski

Mar 27, 2026 68:43


In this talk, Slawomir Tulski, Data Leadership Consultant and former Meta Data Engineering Manager, shares his ten-year journey through the evolution of data systems, from researching glaciers in Poland to scaling the ads ranking infrastructure at one of the world's largest tech giants. We explore the shifting definition of the Data Engineer, the "Actionable Data" philosophy, and how to navigate the 2026 hiring market amidst the rise of AI.
You'll learn about:
- How to distinguish between Platform DE, Product DE, and Analytics Engineering.
- Why most teams over-engineer their stacks and how to build "Value-First" instead of "Tool-First."
- Why being "cloud-cost-conscious" is the most underrated competitive advantage in modern data teams.
- How to identify "Legacy Traps" and choose a company culture that fosters growth.
- Why strategic builders will thrive while "DBT Monkeys" and manual triaging roles are at risk of automation.
- How to frame side projects and end-to-end "Toy Platforms" to stand out to recruiters without a Big Tech pedigree.
TIMECODES:
00:00 From Measuring Glaciers to London's Tech Scene
06:47 Hadoop vs. AI: Lessons from the Original Big Data Hype
11:54 The Data Identity Crisis: Platform vs. Product Engineering
17:29 Tech-Native vs. Tech-by-Necessity Company Cultures
25:33 The Competitive Advantage of Cost-Aware Engineering
30:56 Avoiding Over-Engineered Platforms and Modern Data Stacks
38:01 The Real-Time Myth: When to Use Kafka and Spark
42:08 Breaking into Data Engineering: 2026 Market Reality
51:04 AI Automation: Why Strategic Builders Outlast "DBT Monkeys"
57:35 Portfolio Strategy: Framing Side Projects for Maximum Impact
1:04:42 The Ultimate Portfolio Project: Building End-to-End Platforms
1:07:49 Networking Advice and Local Gdansk Culture
This talk is designed for ambitious data professionals including engineers, analysts, and career-switchers who want a pragmatic, "fluff-free" roadmap for surviving and thriving in the 2026 data landscape. It is particularly valuable for hiring managers and senior leaders looking to audit their recruitment processes, as well as those in traditional corporate environments seeking to implement the agile, high-impact engineering cultures found in Big Tech giants like Meta.
Connect with Slawomir:
- Linkedin - https://www.linkedin.com/in/slawomir-tulski-091611116/
- Form for DE role Ebook - https://docs.google.com/forms/d/e/1FAIpQLSdSCLaBdTtuRlgV_nukKckumR60VOovECtlRIRI5DMUIk36EQ/viewform?usp=dialog
Connect with DataTalks.Club:
- Join the community - https://datatalks.club/slack.html
- Subscribe to our Google calendar to have all our events in your calendar - https://calendar.google.com/calendar/r?cid=ZjhxaWRqbnEwamhzY3A4ODA5azFlZ2hzNjBAZ3JvdXAuY2FsZW5kYXIuZ29vZ2xlLmNvbQ
- Check other upcoming events - https://lu.ma/dtc-events
- GitHub: https://github.com/DataTalksClub
- LinkedIn - https://www.linkedin.com/company/datatalks-club/
- Twitter - https://twitter.com/DataTalksClub
- Website - https://datatalks.club/

MY DATA IS BETTER THAN YOURS
Why Data Mesh Fails Without Data Citizens - with Jonas K., Siemens Energy

Mar 26, 2026 44:45 Transcription Available


Data mesh is on everyone's lips, but how do you actually implement it? Jonas Kell, Product Owner at Siemens Energy, explains why the key lies not in technology but in people. From data citizens to data products to the concrete architecture: this is how Siemens Energy is building a data-driven organization.
Key Takeaways:
→ Why data mesh does not work without data citizens
→ How Siemens Energy builds data literacy in the business units
→ AI as a lever for better data quality, not the other way around
→ The path from data citizen to data product owner
→ Why efficiency gains from data products must be measurable
About the guest: Jonas Kell is Product Owner at Siemens Energy and is responsible for building data products in the manufacturing area. Previously he worked at Accenture DACH in the low-code space. He studied at FOM Hochschule.
Jonas Kell's LinkedIn profile: https://www.linkedin.com/in/jonas-kell-24b213202/
Siemens Energy homepage: https://www.siemens-energy.com/global/en/home.html
All links related to Jonas Rashedi: https://linktr.ee/jonas.rashedi

The Data Engineering Show
The Data Fusion Secret & Why Custom Query Engines Fail with Nikita Lapkov

Mar 24, 2026 18:11


What if building a distributed SQL engine meant rethinking everything about how query execution works at scale? In this episode, Benjamin sits down with Nikita, Senior Software Engineer at Cloudflare, to explore how R2 SQL leverages object storage and distributed computing to power analytics across 300 global locations, why backward compatibility becomes critical when you can't control infrastructure rollouts, and the key strategies for handling joins and adaptive query execution in a stateless, point-to-point network architecture. Whether you're designing distributed systems or curious about how Cloudflare processes petabytes of data, this conversation reveals the real-world engineering challenges and innovations shaping the future of cloud data platforms.

Data Culture Podcast
How AI Is Democratizing Data Roles and Reshaping Team Management – with Angelita Frozza Sanches, Scout24

Mar 16, 2026 34:05


"We might spend a lot of time trying to define a role that will change within a month."

The PolicyViz Podcast
From PDFs to Pit Lane: Building a Real-Time Data Product for McLaren Racing

Mar 11, 2026 37:11


In this week's episode of the PolicyViz Podcast, I chat with Michael Gethers, former Head of Data & Strategy for the McLaren IndyCar team, about how a personal side project analyzing IndyCar timing PDFs turned into a job building real-time data tools for a professional race team. We dig into what it's like to design data products for engineers, strategists, and drivers who need to understand information instantly while a car is on track. Michael shares how he moved from making public visualizations on Twitter to building an internal analytics application from scratch, why “pretty charts” weren't enough for the engineers, and how user feedback shaped the product. We also talk about race strategy as a probabilistic data science problem, the difference between dashboards and data products, and what he learned about designing for cognition under extreme time pressure. If you care about dashboards, data storytelling, or building tools people truly use, this conversation is a goldmine.
Keywords: data dashboards, data product design, data visualization, motorsports analytics, race strategy, McLaren IndyCar, telemetry data, timing data, data science in sports, user centered design, dashboard design, real time analytics, D3 visualization, data engineering, analytics application
Subscribe to the PolicyViz Podcast wherever you get your podcasts.
Become a patron of the PolicyViz Podcast for as little as a buck a month
Follow me on Instagram, LinkedIn, Substack, Twitter, Website, YouTube
Email: jon@policyviz.com

The Data Stack Show
Re-Air: Data Tools, Templates, and the Trouble with “Easy” Solutions with the Cynical Data Guy

Mar 11, 2026 41:47


This episode is a re-air of one of our most popular conversations, featuring insights worth revisiting. This week on The Data Stack Show, John and Matt bring you another edition of the Cynical Data Guy. John and Matt dive into the evolution of customer data infrastructure, the growing influence of low-code tools like Clay, and the blurred lines around the “engineer” title in modern data roles. They also discuss the trade-offs between SaaS adoption and building custom solutions, the pitfalls of enterprise software buying, and the realities of platform lock-in, using Palantir's unique business model as a case study. Key takeaways include the importance of simplicity and scalability in data engineering, the need for clear requirements when evaluating tools, and a healthy skepticism toward sales pitches and “art of the possible” features. Don't miss this month's Cynical Data Guy.
Highlights from this week's conversation include:
Reacting to the Rise of the GTM Engineer (1:11)
Is "Engineer" the Right Term? (4:49)
Low-Code Tools, AI, and Future Workflows (7:14)
Simplicity in Data Engineering (14:38)
The Pitfalls of "Simple" Solutions (15:18)
Choosing SaaS vs. Building In-House (18:26)
Business Process Abstraction and SaaS Adoption (21:31)
Enterprise Software: Art of the Possible vs. Practicality (24:31)
Sales Advice: Focus on Customer Needs (27:11)
Forward Deployed Engineers and Delivery Models (29:05)
Platform Lock-In: When Is It a Dirty Word? (36:41)
Legacy Systems and the Reality of Lock-In (39:53)
Final Thoughts and Takeaways (40:55)
The Data Stack Show is a weekly podcast powered by RudderStack, customer data infrastructure that enables you to deliver real-time customer event data everywhere it's needed to power smarter decisions and better customer experiences. Each week, we'll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data. RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack visit rudderstack.com. Hosted by Simplecast, an AdsWizz company. See pcm.adswizz.com for information about our collection and use of personal data for advertising.

The Geospatial Index
Geoharbor - Herding the Data Engineering Cats

Mar 1, 2026 52:11


Geoharbor is a solution from SketchMyView. It is a way to avoid notebook hell: all data engineers on a team work from a common framework. I recommend watching to see Rakesh show it in action.

The Data Engineering Show
The Geo-Data Problem Nobody Talks About And How Voi Solved It ft. Magnus Dahlbäck

Feb 19, 2026 16:06


What if your data platform could power both critical business decisions and real-time product features at scale? In this episode, host Benjamin sits down with Magnus Dahlbäck, Senior Director of Data and Platform at Voi, to explore how a metrics-first approach and semantic layers transform data accessibility, why traditional ML and LLMs require different strategies for different problems, and how to balance FinOps costs while processing billions of IoT events daily. Whether you're building data infrastructure for a high-growth company or rethinking how your organization consumes data, this conversation is packed with practical strategies for unlocking data value and preparing your platform for AI. Tune in to discover how Voi ditched traditional BI tools and revolutionized their approach to enterprise analytics.

The Joe Reis Show
Freestyle Fridays - The State of Data Engineering in 2026, Book Writing, and More

Feb 13, 2026 21:52


The 2026 Practical Data Community State of Data Engineering dropped this week. It's full of some obvious and very counterintuitive information about the state of data engineers around the globe, in all sizes and types of organizations. Check it out!
Also, I talk about the book writing process, where I messed up on this latest book, its progress toward publication, and more.
Survey: https://joereis.github.io/practical_data_data_eng_survey
---------------------
This episode is brought to you by Ellie.ai
Ellie makes data modeling as easy as sketching on a whiteboard, so even business stakeholders can contribute effortlessly. By skipping redraws, rework, and forgotten context, and by keeping all dependencies in sync, teams report saving up to 78% of modeling time.
Check out Ellie: https://ellie.ai/

Hustle in Faith
Ep. 373 The Tech Career Roadmap Nobody Explains with Jimmy Willis

Feb 12, 2026 45:53


In this episode, I had the pleasure of speaking with Jimmy Willis, a Senior Manager of Data Engineering at an AdTech company, where he builds systems that turn massive amounts of raw data into useful information. He is a self-taught programmer without a tech degree who was able to get an internship at JP Morgan Chase and leveraged that opportunity into a 6-figure job. Jimmy is currently writing a book and is on a mission to get 10,000 Black people into tech by learning Python and other real-world tech skills.
https://www.rovion.co/
Sign up for Activate Your Calling: Create, Build, & Promote Your Gift: https://bit.ly/4r0QixG
Sign up to be notified about Faith to Launch Community: https://bit.ly/FaithtoLaunch
Please join me in my YouTube only series, 30 Days to Becoming a Stronger, More Confident You in Christ: https://www.youtube.com/playlist?list=PLfkkBA4-h1A56MxObeO__s873pdUnnWQ5

Heavybit Podcast Network: Master Feed
Ep. #31, Developer-First Data Engineering with dltHub

Feb 12, 2026 40:58


In episode 31 of Open Source Ready, Brian and John sit down with Matthaus Krzykowski, Thierry Jean, and Elvis Kahoro to explore how dlt and dltHub are changing the way developers build data pipelines. The conversation dives into DuckDB, LLM-driven workflows, and the growing shift toward developer-first data engineering. They also discuss open source adoption, AI orchestration, and what it means to be a “10x engineer” in 2026.

Standard Deviation: A podcast from Juliana Jackson
2026 - we are back and it aint sayf

Jan 29, 2026 45:52


This Podcast is sponsored by Team Simmer.
Go to TeamSimmer and use the coupon code DEVIATE for 10% off individual course purchases.
The Technical Marketing Handbook provides a comprehensive journey through technical marketing principles.
Sign up to the Simmer Newsletter for the latest news in Technical Marketing.
NEW SIMMER COURSE ALERT! - Data Analysis with R - taught by Arben Kqiku (coupon code doesn't apply to this course)
Latest content from Simo Ahava
Run Server-side Google Tag Manager On Localhost Article
Latest content from Juliana Jackson
The distance between what gets funded and what works has never been wider. (subscribe to the newsletter for more amazing content)
Mentioned in the episode:
Superweek Analytics Summit
Measurecamp Helsinki
Connect with Sayf Sharif:
Linkedin
Three Bears Data
OptiMeasure
This podcast is brought to you by Juliana Jackson and Simo Ahava.

Stories from the Hackery
Career Shift: From Photographer & Marketer to Application Engineer

Jan 28, 2026 21:48


What happens when a side hustle photo business turns into a decade-long marketing career that no longer fits? In this episode, Michael Galo shares his non-linear journey to Nashville Software School (NSS). After feeling "stuck" in marketing and communications, Michael decided to follow the advice of local coffee shop regulars and dive into tech. Michael discusses the intensity of the six-month Software Development bootcamp, the "fire hose" of learning, and why he chose to immediately specialize further by joining NSS's brand-new Data Engineering program through the ProTech initiative.
01:33 Life Before NSS: A Decade in Photo Production, Marketing & Communications
02:31 The Spark: Too Many Alumni at the Coffee Shop
02:57 Why Software Development?
04:59 Navigating the Bootcamp Challenge: The Capstones
06:52 The Importance of Community and Teamwork
08:01 Specializing with Data Engineering and ProTech
10:41 Deepening Backend Skills and Data Architecture
12:18 Expanding the Job Search Target
14:24 Career Development: Beyond the Resume
16:18 Advice for the Job Search: Stay Connected
18:00 Is Now the Right Time to Invest in Yourself?
20:10 Final Thoughts: Busting Through the Walls

The Joe Reis Show
Live with Joe Reis - January 2026 AMA. Ontologies, Data Modeling, Data Engineering, and More

Jan 3, 2026 46:28


Welcome to 2026! In this spontaneous Friday AMA, I take listener questions on ontologies, the “leaky abstractions” of AI coding tools, why the “button pusher” era of engineering is a professional dead end, and the shifting landscape of data engineering. I also provide an update on my upcoming book, Mixed Model Arts (launching in March 2026), and discuss the unexpected convergence of library science, ontologies, and traditional data modeling, something that was not on my 2025 bingo card. Great turnout, especially for no notice. Thanks to everyone who showed up!

BIFocal - Clarifying Business Intelligence
Episode 314 - Fabric November 2025 Feature Summary part 2

Dec 30, 2025 28:50


This is episode 314 recorded on December 15th, 2025, where John & Jason talk about the Fabric November 2025 Feature Summary part 2 including updates to Data Engineering & Data Science. For show notes please visit www.bifocal.show

Engineering Kiosk
#248 Data as a Product: Structuring and Scaling Data Teams with Mario Müller of Veeva

Dec 30, 2025 78:44 Transcription Available


Data as a Product: what is behind it? Why is AI everywhere, yet the path from the database to "Wow, the model can do that" often looks like a black hole? You dutifully log events, the data lands in assorted silos, and the decisive question still remains open: who actually makes sure that raw data becomes a reliable, sellable data product? In this episode we switch on the light right there. Together with Mario Müller, Director of Data Engineering at Veeva Systems, we look at what data teams really are, how "Data as a Product" works in practice, and why data engineering is more than shoving a few CSVs over FTP. We talk about team structures from the one-man show to the cross-functional squad, about ownership of the data, data governance, and how you really measure data quality, including monitoring, alerts, SQL rules, and human quality control. On top of that comes a solid helping of tech: Spark, AWS S3 as primary storage, Delta Lake, Athena, Glue, Airflow, push-pull instead of event overkill, and the decision for batch processing even though everyone is calling for streaming. And of course we also clarify what happens when AI tinkers with the data: where AI helps with bootstrapping, why production and scale get tricky, and why responsibility for a commit is not taken over by an LLM. If you want to build data teams, have to deliver data products, or simply want to understand how data turns into reliable business impact, you are in exactly the right place. Bonus: batch jobs get a small comeback today. You can find our current advertising partners at https://engineeringkiosk.dev/partners

The Data Engineering Show
The $100M Problem: How Lyft's Data Platform Prevents ML Failures with Ritesh Varyani at Lyft

Dec 16, 2025 25:46


What if your data platform could serve AI-native workloads while scaling reliably across your entire organization? In this episode, Benjamin sits down with Ritesh, Staff Engineer at Lyft, to explore how to build a unified data stack with Spark, Trino, and ClickHouse, why AI is reshaping infrastructure decisions, and the strategies powering one of the industry's most sophisticated data platforms. Whether you're architecting data systems at scale or integrating AI into your analytics workflow, this conversation delivers actionable insights into reliability, modernization, and the future of data engineering. Tune in to discover how Lyft is balancing open-source investments with cutting-edge AI capabilities to unlock better insights from data.

GOTO - Today, Tomorrow and the Future
Fundamentals of Data Engineering • Matt Housley & Joe Reis

Dec 16, 2025 34:08


This interview was recorded for the GOTO Book Club. http://gotopia.tech/bookclub
Read the full transcription of the interview here: https://gotopia.tech/episodes/399
Matt Housley - Co-Author of "Fundamentals of Data Engineering", Keynote Speaker & Podcaster
Joe Reis - Co-Author of "Fundamentals of Data Engineering", Keynote Speaker, Professor & Podcaster
RESOURCES
Matt: https://www.linkedin.com/in/housleymatthew
Joe: https://www.linkedin.com/in/josephreis, https://github.com/JoeReis, https://joereis.substack.com
Link: https://mathstodon.xyz/@tao/114915604830689046
DESCRIPTION
Joe Reis and Matt Housley, co-authors of "Fundamentals of Data Engineering," discuss the evolution of their field three years after their book's publication. They explore how the rise of AI tools has transformed data engineering practices, the ongoing importance of foundational knowledge, and the challenges facing junior engineers in an AI-dominated landscape. The conversation covers the balance between leveraging AI assistance and maintaining core expertise, the resurgence of classical techniques, and why fundamental principles remain more relevant than ever.
RECOMMENDED BOOKS
Joe Reis & Matt Housley • Fundamentals of Data Engineering • https://amzn.to/4n85049
Karen Hao • Empire of AI • https://amzn.to/46qeL6B
Keach Hagey • The Optimist • https://amzn.to/4nlcS20
Parmy Olson • Supremacy • https://amzn.to/3IpHdgI
Peter Norvig & Stuart Russell • Artificial Intelligence • https://amzn.to/420ZgR8
David Foster • Generative Deep Learning • https://amzn.to/48ZgP4x
Sol Rashidi • Your AI Survival Guide • https://amzn.to/3UFYnKC
CHANNEL MEMBERSHIP BONUS
Join this channel to get early access to videos & other perks: https://www.youtube.com/channel/UCs_tLP3AiwYKwdUHpltJPuA/join
Looking for a unique learning experience? Attend the next GOTO conference near you! Get your ticket: gotopia.tech
SUBSCRIBE TO OUR YOUTUBE CHANNEL - new videos posted daily!

The Chris Voss Show
The Chris Voss Show Podcast – Artificial Intelligence and Machine Learning in Human Resources: A Concise Guide by Dr. C. Rasmussen

Nov 28, 2025 44:50


Artificial Intelligence and Machine Learning in Human Resources: A Concise Guide by Dr. C. Rasmussen https://www.amazon.com/Artificial-Intelligence-Machine-Learning-Resources/dp/B0FWZQXHMG Curtisrasmussen.focalpointcoaching.com What if a computer could help find the perfect employee or predict who might leave a job? This exciting idea opens the door to a new way of working. Overview This guide explains how artificial intelligence (AI) and machine learning (ML) are transforming human resources (HR). Smart computer programs can quickly review thousands of job applications to find the best candidates, suggest training tailored to employees’ needs, and predict which workers might quit, helping managers take action to keep them. The book includes real-world examples, like how large companies use AI to save time, and covers benefits, such as improved hiring, as well as key concerns, like protecting personal information. At just 61 pages, it's concise by design, following Richard Feynman's wisdom: “If you can’t explain something simply, you don’t understand it well enough.” More pages don't equal more value; in fact, lengthy texts can bury useful insights. Since every organization is unique, this book equips HR professionals and managers with the right questions to ask rather than a rigid roadmap, making it a practical tool for anyone curious about the future of work. About the author Dr. Curtis “Curt” Rasmussen is a leading expert in industrial-organizational psychology with a Ph.D. from Walden University. He specializes in blending human skills with artificial intelligence (AI) and machine learning (ML) to make workplaces better and more efficient. With years of experience in research, consulting, and government roles, he helps businesses use data and tech wisely. His career highlights include owning Cyber-Human Performance Tech, LLC, where he advises small and mid-sized companies on adding AI to hiring and daily tasks while keeping things ethical. 
He also guides students in George Mason University’s Data Engineering program, focusing on AI tools like natural language processing and computer vision. At the Cybersecurity and Infrastructure Security Agency (CISA), he led workforce planning as a senior I/O psychologist, creating surveys and frameworks that improved employee satisfaction by 45% and helped with smarter hiring. Earlier, he reviewed AI and data science proposals for the Department of Commerce, National Academy of Medicine, and the Office of the Director of National Intelligence, making sure projects were strong and fair. Dr. Rasmussen has invented patent-pending tools like the Multidimensional Algorithm Structure (MAS), which picks the best AI methods by checking data and company needs, and the eXplainable Artificial Intelligence Construct (XAIC), which makes AI easy to understand and trust by involving people in decisions. These ideas help fix common AI problems, like failures or hidden biases.

Software Engineering Radio - The Podcast for Professional Software Developers
SE Radio 696: Flavia Saldanha on Data Engineering for AI

Nov 25, 2025 74:25


Flavia Saldanha, a consulting data engineer, joins host Kanchan Shringi to discuss the evolution of data engineering from ETL (extract, transform, load) and data lakes to modern lakehouse architectures enriched with vector databases and embeddings. Flavia explains the industry's shift from treating data as a service to treating it as a product, emphasizing ownership, trust, and business context as critical for AI-readiness. She describes how unified pipelines now serve both business intelligence and AI use cases, combining structured and unstructured data while ensuring semantic enrichment and a single source of truth. She outlines key components of a modern data stack, including data marketplaces, observability tools, data quality checks, orchestration, and embedded governance with lineage tracking. This episode highlights strategies for abstracting tooling, future-proofing architectures, enforcing data privacy, and controlling AI-serving layers to prevent hallucinations. Saldanha concludes that data engineers must move beyond pure ETL thinking, embrace product and NLP skills, and work closely with MLOps, using AI as a co-pilot rather than a replacement. Brought to you by IEEE Computer Society and IEEE Software magazine.

The Joe Reis Show
From Data Engineering to Context Engineering w/ Nick Schrock

Nov 20, 2025 44:57


Data engineering is undergoing a fundamental shift. In this episode, I sit down with Nick Schrock, founder and CTO of Dagster, to discuss why he went from being an "AI moderate" to believing 90% of code will be written by AI. Being hands-on also led to a massive pivot in Dagster's roadmap and a new focus on managing and engineering context. We dive deep into why simply feeding data to LLMs isn't enough. Nick explains why real-time context tools (like MCPs) can become "token hogs" that lack precision and why the future belongs to "context pipelines": offline, batch-computed context that is governed, versioned, and treated like code. We also explore Compass, Dagster's new collaborative agent that lives in Slack, bridging the gap between business stakeholders and data teams. If you're wondering how your role as a data engineer will evolve in an agentic world, this conversation maps out the territory. Dagster: dagster.io. Nick Schrock on X: @schrockn

The Data Engineering Show
60 Billion Predictions Daily: Inside Credit Karma's Agentic Data Layer with Maddie Daianu

Nov 19, 2025 19:55


What does MLOps look like when you are deploying 22,000 models a month? Maddie Daianu, Head of Data and AI at Intuit Credit Karma, joins the Data Bros to pull back the curtain on one of the most high-volume data environments in FinTech. With a 100-person team serving 140 million members, standard data practices break down. Maddie shares how her team manages terabytes of daily data on Google Cloud and explains the massive strategic pivot they are undertaking right now: The move from "Information" to "Agency."

The Data Exchange with Ben Lorica
Making Data Engineering Safe for Automation and Agents

Nov 13, 2025 49:41


Ciro Greco, Co-founder & CEO at Bauplan, joins the podcast to discuss a new paradigm for data engineering rooted in software engineering principles. He explains how treating the data lakehouse like a software project — with version control, branching, and transactional pipelines — creates a robust and safe environment for development. Subscribe to the Gradient Flow Newsletter

Acxiom Podcast
#75 - Unlocking Incrementally in a Privacy-First World | Real Talk about Marketing an Acxiom Podcast

Oct 24, 2025 35:19 Transcription Available


In this episode of Real Talk with Anant Veeravalli, the discussion revolves around the evolving data landscape and the necessity for strategic partnerships to achieve holistic measurement. The team unpacks the importance of ethical data sourcing, privacy compliance, and the utilization of clean room environments like Snowflake and Databricks to bridge data gaps. Enabling secure and scalable data connectivity and facilitating real-time data sharing is key for brands to derive meaningful intelligence, including predictive modeling and AI-driven insights. This episode is essential listening for anyone focused on governance, security, and future-proofing data systems. Thanks for listening! Follow us on Twitter and Instagram or find us on Facebook.

The Joe Reis Show
Navigating Career Growth and the Content Gap w/ Yordan Ivanov

Oct 22, 2025 52:44


There's no shortage of technical content for data engineers, but a massive gap exists when it comes to the non-technical skills required to advance beyond a senior role. I sit down with Yordan Ivanov, Head of Data Engineering and writer of "Data Gibberish," to talk about this disconnect. We dive into his personal journey of failing as a manager the first time, learning the crucial "people" skills, and his current mission to help data engineers learn how to speak the language of business.
Key areas we explore:
- The Senior-Level Content Gap: Yordan explains why his non-technical content on career strategy and stakeholder communication gets "terrible" engagement compared to technical posts, even though it's what's needed to advance.
- The Managerial Trap: Yordan's candid story about his first attempt at management, where he failed because he cared only about code and wasn't equipped for the people-centric aspects and politics of the role.
- The Danger of AI Over-reliance: A deep discussion on how leaning too heavily on AI can prevent the development of fundamental thinking and problem-solving skills, both in coding and in life.
- The Maturing Data Landscape: We reflect on the end of the "modern data stack euphoria" and what the wave of acquisitions means for innovation and the future of data tooling.
- AI Adoption in Europe vs. the US: A look at how AI adoption is perceived as massive and mandatory in Europe, while US census data shows surprisingly low enterprise adoption rates.

alphalist.CTO Podcast - For CTOs and Technical Leaders
#130 - From PhD Research to DuckDB: Building the Next Generation of Analytical DBs with Mark Raasveldt // CTO @ DuckDB

Oct 16, 2025 53:12 Transcription Available


Mark Raasveldt, co-founder and CTO of DuckDB Labs, shares his journey from academic research at CWI Amsterdam to creating one of the most innovative analytical databases of the last decade. Mark discusses the technical challenges of building DuckDB from scratch, the philosophy behind embedded analytical databases, and why single-node performance still matters in our cloud-first world. He provides insights into open source business models, the evolution of data formats like Parquet, and how DuckDB is democratizing high-performance analytics for developers everywhere.

The Data Engineering Show
Block Bad Data Before the Write with Nike's Ashok Singamaneni

Oct 7, 2025 20:20


Nike's Principal Data Engineer Ashok Singamaneni joins Benjamin and Eldad to discuss his open-source data quality framework, Spark Expectations. Ashok explains how the tool, which was inspired by Databricks DLT Expectations, shifts data quality checks to before the data is written to a final table. This proactive approach uses row-level, aggregation-level, and query data quality checks to fail jobs, drop bad records, or alert teams - ultimately saving huge costs on recompute and engineering effort in mission-critical data pipelines.
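The idea of checking data before it reaches the final table can be sketched in plain Python. This is an illustration only and not the Spark Expectations API: the function `validate_before_write`, the rule tuples, the action names `"fail"` and `"drop"`, and the `min_rows` threshold are all hypothetical, and the "table" is simply a list of dicts.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, object]
# (rule name, predicate a good row must satisfy, action when violated)
RowRule = Tuple[str, Callable[[Row], bool], str]


class DataQualityError(Exception):
    """Raised when a check demands that the whole job fails."""


def validate_before_write(
    rows: List[Row],
    row_rules: List[RowRule],
    min_rows: int = 1,
) -> Tuple[List[Row], List[Tuple[str, Row]]]:
    """Run hypothetical row-level and aggregate checks before any write.

    Returns (rows that passed, rejected rows paired with the violated rule).
    """
    good: List[Row] = []
    rejected: List[Tuple[str, Row]] = []
    for row in rows:
        violated = None
        for name, predicate, action in row_rules:
            if not predicate(row):
                if action == "fail":
                    # Critical rule: abort before anything is written.
                    raise DataQualityError(f"rule {name!r} failed for {row}")
                # Any other action is treated as "drop" in this sketch.
                violated = name
                break
        if violated is None:
            good.append(row)
        else:
            rejected.append((violated, row))
    # Aggregation-level check on the surviving rows.
    if len(good) < min_rows:
        raise DataQualityError(f"only {len(good)} valid rows, need {min_rows}")
    return good, rejected


rules: List[RowRule] = [
    ("order_id_present", lambda r: r.get("order_id") is not None, "fail"),
    ("amount_non_negative", lambda r: r["amount"] >= 0, "drop"),
]
incoming = [
    {"order_id": 1, "amount": 25.0},
    {"order_id": 2, "amount": -3.0},
    {"order_id": 3, "amount": 10.0},
]
clean, rejected = validate_before_write(incoming, rules)
final_table = list(clean)  # the "write" happens only after the checks pass
print(len(final_table), "rows written;", len(rejected), "rejected")
```

In a real pipeline the rejected rows would typically go to an error table or trigger an alert rather than being printed; the point of the sketch is only that the gate sits in front of the write, so no recompute is needed to clean up afterwards.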

Speaking of Data
Data Engineering for Analytics and AI with Prashanth Southekal

Oct 6, 2025 37:49


Prashanth Southekal, Ph.D., managing principal for DBP Institute, joins host Andrew Miller to discuss data engineering for analytics and AI - including the four types of business data, practical data wrangling enrichment techniques, and managing governance, ethics and risk. For more information on Prashanth's course at TDWI Orlando please visit Data Engineering for Analytics and AI.
More information:
- TDWI Conferences: https://bit.ly/3XqBhGH
- TDWI Modern Data Leader's Summits: https://bit.ly/4902fuu
- TDWI Virtual Summits: https://bit.ly/31HJ2xr
- Seminars: https://bit.ly/3WxQPr4
- More Speaking of Data Episodes: https://bit.ly/3JsQPWo
Follow Us on:
- LinkedIn - https://bit.ly/42zCZZB
- Facebook - https://bit.ly/49uej7j
- Instagram - https://bit.ly/3HM8x57
- X - https://bit.ly/3SsYu9P

The Joe Reis Show
The Rise of the Context Company: Reshaping Data Engineering with Saket Saurabh

Oct 1, 2025 48:09


In this episode, I sit down with Saket Saurabh (CEO of Nexla) to discuss the fundamental shift happening in the AI landscape. The conversation is moving beyond the race to build the biggest foundational models and towards a new battleground: context. We explore what it means to be a "model company" versus a "context company" and how this changes everything for data strategy and enterprise AI.
Join us as we cover:
- Model vs. Context Companies: The emerging divide between companies building models (like OpenAI) and those whose advantage lies in their unique data and integrations.
- The Limits of Current Models: Why we might be hitting an asymptote with the current transformer architecture for solving complex, reliable business processes.
- "Context Engineering": What this term really means, from RAG to stitching together tools, data, and memory to feed AI systems.
- The Resurgence of Knowledge Graphs: Why graph databases are becoming critical for providing deterministic, reliable information to probabilistic AI models, moving beyond simple vector similarity.
- AI's Impact on Tooling: How tools like Lovable and Cursor are changing workflows for prototyping and coding, and the risk of creating the "-10x engineer."
- The Future of Data Engineering: How the field is expanding as AI becomes the primary consumer of data, requiring a new focus on architecture, semantics, and managing complexity at scale.

The InfoQ Podcast
AI, ML, and Data Engineering InfoQ Trends Report 2025

Sep 24, 2025 53:02


In this episode of the podcast, members of the InfoQ editorial staff and friends of InfoQ discuss the current trends in the domain of AI, ML and Data Engineering. One of the regular features of InfoQ is the trends reports, each of which focuses on a different aspect of software development. These reports provide InfoQ readers and listeners with a high-level overview of the topics to pay attention to this year. The InfoQ AI, ML and Data Engineering editorial team met with external guests to discuss the trends in AI and ML, and what to watch out for over the next 12 months. In addition to the written report and trends graph, this podcast provides a recording of a discussion where expert panelists discuss how innovative AI technologies are disrupting the industry. Read a transcript of this interview: http://bit.ly/4nRpvlF Subscribe to the Software Architects' Newsletter for your monthly guide to the essential news and experience from industry peers on emerging patterns and technologies: https://www.infoq.com/software-architects-newsletter Upcoming Events: InfoQ Dev Summit Munich (October 15-16, 2025) Essential insights on critical software development priorities. https://devsummit.infoq.com/conference/munich2025 QCon San Francisco 2025 (November 17-21, 2025) Get practical inspiration and best practices on emerging software trends directly from senior software developers at early adopter companies. https://qconsf.com/ QCon AI New York 2025 (December 16-17, 2025) https://ai.qconferences.com/ QCon London 2026 (March 16-19, 2026) https://qconlondon.com/ The InfoQ Podcasts: Weekly inspiration to drive innovation and build great teams from senior software leaders.
Listen to all our podcasts and read interview transcripts: - The InfoQ Podcast https://www.infoq.com/podcasts/ - Engineering Culture Podcast by InfoQ https://www.infoq.com/podcasts/#engineering_culture - Generally AI: https://www.infoq.com/generally-ai-podcast/ Follow InfoQ: - Mastodon: https://techhub.social/@infoq - X: https://x.com/InfoQ?from=@ - LinkedIn: https://www.linkedin.com/company/infoq/ - Facebook: https://www.facebook.com/InfoQdotcom# - Instagram: https://www.instagram.com/infoqdotcom/?hl=en - Youtube: https://www.youtube.com/infoq - Bluesky: https://bsky.app/profile/infoq.com Write for InfoQ: Learn and share the changes and innovations in professional software development. - Join a community of experts. - Increase your visibility. - Grow your career. https://www.infoq.com/write-for-infoq

MLOps.community
The DuckLake Lakehouse Format // Hannes Mühleisen // #339

Sep 19, 2025 57:24


The DuckLake Lakehouse Format // MLOps Podcast #339 with Hannes Mühleisen, Co-founder and CEO of DuckDB Labs.
Join the Community: https://go.mlops.community/YTJoinIn
Get the newsletter: https://go.mlops.community/YTNewsletter
// Abstract
Managing data on Object Stores has been a painful affair. Users had to choose between data swamp chaos or a maze of metadata files with catalog servers on top. DuckLake is a new paradigm for managing data on object stores: First, it uses classical SQL data management systems to manage metadata. Second, actual data is stored in Parquet files on pretty arbitrary storage. Third, processing queries is done client-side, or anywhere really. DuckDB is the first system to integrate with DuckLake using an extension with the same name. Conceptually, DuckLake enables central control over truth while decentralizing compute and storage entirely. DuckLake turns data warehouse architecture upside down by departing from the integrated metadata/compute layer towards a fully disconnected operation with only centralized metadata. For the first time, DuckLake allows a “multi-player” experience with DuckDB, where computation stays fully local, but transactional control is centralized.
// Bio
Hannes Mühleisen
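The split the abstract describes, with metadata in an ordinary SQL database and data in files on arbitrary storage, can be sketched with Python's built-in `sqlite3`. This is a conceptual sketch under loud assumptions: the table names `snapshots` and `data_files`, the helper functions, and the `s3://` paths are all hypothetical, and none of this reflects DuckLake's actual catalog schema or the DuckDB extension's interface. No Parquet files are read; the catalog only records where they would live.

```python
import sqlite3
from typing import List


def open_catalog(path: str = ":memory:") -> sqlite3.Connection:
    """Create a toy metadata catalog (hypothetical schema)."""
    con = sqlite3.connect(path)
    con.execute(
        "CREATE TABLE IF NOT EXISTS snapshots ("
        " snapshot_id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " note TEXT)"
    )
    con.execute(
        "CREATE TABLE IF NOT EXISTS data_files ("
        " table_name TEXT NOT NULL,"
        " file_path TEXT NOT NULL,"
        " added_in INTEGER NOT NULL REFERENCES snapshots(snapshot_id))"
    )
    con.commit()
    return con


def commit_files(
    con: sqlite3.Connection, table_name: str, file_paths: List[str], note: str = ""
) -> int:
    """Register new data files under a new snapshot in ONE transaction."""
    with con:  # commits on success, rolls back if anything raises
        cur = con.execute("INSERT INTO snapshots (note) VALUES (?)", (note,))
        snapshot_id = cur.lastrowid
        con.executemany(
            "INSERT INTO data_files (table_name, file_path, added_in)"
            " VALUES (?, ?, ?)",
            [(table_name, p, snapshot_id) for p in file_paths],
        )
    return snapshot_id


def files_as_of(con: sqlite3.Connection, table_name: str, snapshot_id: int) -> List[str]:
    """List the files a client would scan to read the table at a snapshot."""
    rows = con.execute(
        "SELECT file_path FROM data_files"
        " WHERE table_name = ? AND added_in <= ?"
        " ORDER BY added_in, file_path",
        (table_name, snapshot_id),
    ).fetchall()
    return [r[0] for r in rows]


catalog = open_catalog()
s1 = commit_files(catalog, "trips", ["s3://bucket/trips/part-0001.parquet"], "initial load")
s2 = commit_files(catalog, "trips", ["s3://bucket/trips/part-0002.parquet"], "daily append")
print(files_as_of(catalog, "trips", s1))
print(files_as_of(catalog, "trips", s2))
```

The design point this tries to capture is that the transactional "truth" is one small SQL database, while any client that can read the file list can do the actual scanning locally.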

Chuck Yates Needs A Job
The Secret Data Engineering Behind Industry AI

Sep 17, 2025 45:13


This episode is packed with big-picture energy talk and some seriously nerdy (but fun) data breakdowns. John Kalfayan from Collide and Chuck start with what's really happening in oil and gas today before shifting into the challenges of putting AI to work in the field. From there, things get deep: contract dedications, what RAG actually means, how data chunking works, and the never-ending battle with duplicate info. We also weigh the costs of storage, querying, and running models, plus the tradeoffs between RAG and foundational models. If you've ever wondered about vector databases, data strategy, or just why we have a rant about sand, it's all here. By the end, we hit on the human side too: education, privacy, and making sure the right people can access the right data.
Join the conversation shaping the future of energy. Collide is the community where oil & gas professionals connect, share insights, and solve real-world problems together. No noise. No fluff. Just the discussions that move our industry forward. Apply today at collide.io
00:00 - Intro
01:51 - Oil and Gas Industry Insights
06:34 - AI Deployment Challenges
09:12 - Contract Dedications Explained
10:32 - Understanding RAG
12:52 - What is RAG in Data Management
13:43 - Data Chunking Techniques
17:17 - Cost Considerations in Data
18:03 - RAG vs Foundational Models
19:21 - Vectorized Databases Overview
23:47 - Managing Duplicate Data
26:28 - Data Strategy Considerations
28:24 - Sand Rant
31:32 - Identifying Gaps in Data
33:10 - The Cost of Storage
33:56 - Effective Data Querying
35:50 - AI Education and Awareness
37:53 - Privacy Concerns with Language Models
40:54 - Data Access and Availability
https://twitter.com/collide_io
https://www.tiktok.com/@collide.io
https://www.facebook.com/collide.io
https://www.instagram.com/collide.io
https://www.youtube.com/@collide_io
https://bsky.app/profile/digitalwildcatters.bsky.social
https://www.linkedin.com/company/collide-digital-wildcatters

Stories from the Hackery
The Engine Behind AI: Why Data Engineering is in Demand | Stories From The Hackery

Sep 10, 2025 67:21


In this episode of Stories from the Hackery, we talk with Nashville tech leader and hiring manager Jason Turan about one of tech's most in-demand fields: data engineering. Jason, a long-time friend of NSS, was one of the first people to tell us that Nashville needed more data engineers. He shares his perspective on what a data engineer does, describing the role as the "connective tissue between data producers and data consumers".
Listen in to hear us discuss:
- Why data engineers are essential for flipping the 80/20 rule, allowing data scientists and analysts to spend less time cleaning data and more time finding insights.
- How the rise of generative AI has acted as an "accelerant," increasing the need for high-quality data and the professionals who can provide it.
- Actionable advice for getting started in the field, including the importance of focusing on a "T-shaped skillset" with SQL at its core.
- Why Jason's number one piece of advice is to be curious, experiment, and "go out and do the thing".
01:20 Meet Jason Turan: His Tech Origin Story
03:04 Jason's History with NSS and Hiring Grads
07:28 Defining Data Engineering: The "Connective Tissue" of Tech
11:15 Why Nashville is a Hub for Data Engineers
13:56 Healthcare's Impact on Nashville's Data Jobs
20:35 How GenAI Accelerates the Need for Data Engineers
31:33 Getting Started: Lower Barriers to Entry
39:03 A Top Use Case for AI: Understanding Your Codebase
52:21 Misconceptions & the "T-Shaped Skillset"
55:29 The Value of Hands-On Learning: "Go Do the Thing"
58:52 Lightning Round: Favorite Tech Tools
01:00:32 Lightning Round: Top Reads & Resources
Links
- Metabase: https://www.metabase.com/
- DuckDB: https://duckdb.org/
- MotherDuck: https://motherduck.com/
- Ralph Kimball: The Data Warehouse Toolkit: https://www.amazon.com/gp/product/1118530802
- Bill Inmon: Building the Data Warehouse: https://www.amazon.com/Building-Data-Warehouse-W-Inmon/dp/0764599445
- Edward Tufte: The Visual Display of Quantitative Information: https://www.amazon.com/Visual-Display-Quantitative-Information/dp/0961392142
- Brendan Keeler: The Health API Guy: https://healthapiguy.substack.com/
- TLDR Newsletter: https://tldr.tech/
- Nashville Technology Council (NTC): https://technologycouncil.com/

BIFocal - Clarifying Business Intelligence
Episode 304 - Microsoft Fabric August 2025 Feature Summary

Sep 9, 2025 41:34


This is episode 304, recorded on September 4th, 2025, where John & Jason talk about the Microsoft Fabric August 2025 Feature Summary, including a new Flat list view in Deployment pipelines, Bursting controls for Data Engineering workloads, new test capabilities for User Data Functions, the ability to serve real-time predictions with ML model endpoints, several updates to Data Warehouse, Database tree in edit tile and AzMon data sources for RTI, the ability to use Python Notebooks to read/write to Fabric SQL Databases, Auto table creation on destination in copy job in Data Factory, and much, much more. For show notes please visit www.bifocal.show

Engenharia de Dados [Cast]
Data AI Sunset Meetup Brasília - The Future of Data Engineering, Community, AI, and Careers

Aug 24, 2025 63:43


Get ready for a behind-the-scenes look at the latest data engineering meetup in Brasília and discover the trends shaping the future of the field. In this episode, Vitor Ramos talks with Wesley Outeiro and other attendees to share the main insights and lessons from the in-person event, organized by Engenharia de Dados Academy with Luan Moreno as speaker. It is a candid conversation about the importance of in-person interaction, the evolution of the data community, and the impact of artificial intelligence on practitioners' day-to-day work.
What you will learn in this episode:
- The importance of networking and community for personal and professional growth in data.
- How in-person interaction at events boosts learning and collaboration.
- The main data and AI trends creating new opportunities and challenges for the market.
- Why mastering the fundamental concepts is more crucial than ever for success in data engineering.
- The relevance of FinOps for managing cloud costs efficiently in data projects.
- Reflections on how event dynamics and knowledge sharing are evolving.
- The power of connecting with industry leaders for inspiration and career motivation.
Luan Moreno = https://www.linkedin.com/in/luanmoreno/

Alter Everything
191: Data Hot Topics with DataFramed

Aug 13, 2025 39:25


Join us in a discussion with Richie Cotton, Senior Data Evangelist at DataCamp and host of the DataFramed podcast, for a special crossover episode exploring the hottest topics in data science, analytics, and artificial intelligence! Don't miss the full video of this conversation on YouTube! Discover whether AI will reduce the need to learn coding, the real-world applications of AI agents, and what it truly means to be an "AI-first" company. Get expert insights, practical advice for building a career in data and AI, and learn how to stay ahead in a rapidly changing tech landscape.
Panelists: Richie Cotton, Senior Data Evangelist @ DataCamp (LinkedIn, X); Megan Bowers, Sr. Content Manager @ Alteryx (@MeganBowers, LinkedIn)
Show notes: DataFramed podcast; DataCamp Skill Track: Alteryx Fundamentals; Gartner Agentic AI article; WSJ Agentic AI article
Interested in sharing your feedback with the Alter Everything team? Take our feedback survey here! This episode was produced by Megan Bowers, Mike Cusic, and Matt Rotundo. Special thanks to Andy Uttley for the theme music.

Engenharia de Dados [Cast]
The Data Engineering & GenAI Era: Insights with Eduardo Ordax

Jun 23, 2025 55:38


The impact of generative AI on the present and future of data. Get ready for a top-level conversation about how generative AI is transforming the world of data, companies, and careers. In this episode, Luan Moreno welcomes Eduardo Ordax, Generative AI Lead at AWS, and Mateus Oliveira to discuss, without beating around the bush, the real impact of AI on the market.
What you will learn in this episode:
- How generative AI is changing the way we build data pipelines, products, and solutions.
- The main challenges companies face when implementing GenAI, and why the problem is no longer technology but people and data.
- The role of data engineering in the world of AI and how it connects with concepts such as LLMOps, fine-tuning, prompt engineering, and data-centric AI.
- Why mastering the fundamentals has never been so important for anyone who works (or wants to work) with data and AI.
- Reflections on the future of careers in data and AI: will data engineers, data scientists, and developers be replaced, or will they play an even more relevant role?
- The differences between using AI to play around in ChatGPT and bringing AI to solve real-world business problems, at scale and in production.
This is a conversation about AI. It is a complete immersion in the challenges, opportunities, and outlook for anyone working with data, engineering, machine learning, and artificial intelligence. Luan Moreno = https://www.linkedin.com/in/luanmoreno/