Podcasts about data warehouses

  • 274 PODCASTS
  • 515 EPISODES
  • 36m AVG DURATION
  • 5 WEEKLY NEW EPISODES
  • Oct 28, 2025 LATEST

POPULARITY

(popularity trend chart, 2017–2024)


Best podcasts about data warehouses

Latest podcast episodes about data warehouses

CX-Talks - Podcast für Customer Experience Management
#150 Mixed Mode Analytics. So trifft congstar bessere Entscheidungen. Christoph Kelzenberg (congstar) bei Peter Pirner

Oct 28, 2025 · 40:28


This episode is about the interplay of different data sources and analysis methods, known as mixed mode analytics. We show what it takes and how congstar uses this approach to make better decisions in CX management. All links and more information can be found on the website www.cx-talks.com and in the show notes on Spotify (podcast subscribers) or Apple ("episode website"), or alternatively at https://cx-talks.podigee.io

The .NET Core Podcast
Data, AI, and the Human Touch: Michael Washington on Building Trustworthy Applications

Oct 24, 2025 · 62:28


Strategic Technology Consultation Services This episode of The Modern .NET Show is supported, in part, by RJJ Software's Strategic Technology Consultation Services. If you're an SME (Small to Medium Enterprise) leader wondering why your technology investments aren't delivering, or you're facing critical decisions about AI, modernization, or team productivity, let's talk. Show Notes "What do I mean by compute? Compute is whenever you want a computer to do a thing, okay, it requires the CPU to exist and I want the CPU to do a thing. How well it can do it is based upon what kind of CPU you have, what kind of CPU they have, since they have it in a miniature chip. So, if you have an NVIDIA chip, it does a lot of really good things, but as we know, they're very expensive, and that's why NVIDIA is like what, I guess, the largest company in the world right now."— Michael Washington Hey everyone, and welcome back to The Modern .NET Show; the premier .NET podcast, focusing entirely on the knowledge, tools, and frameworks that all .NET developers should have in their toolbox. I'm your host Jamie Taylor, bringing you conversations with the brightest minds in the .NET ecosystem. Today, Michael Washington joined us to talk about his open source project "Personal Data Warehouse", what a data warehouse is, and why we collect data in our applications. We also talk about the differences between storing data in the database and storing it in a data warehouse—one of the biggest differences, as you'll find out, is the difference in cost. "The only reason why we collect any data is because at some point a human being needs this data to make a decision. Seriously, and I challenge anyone to come up with any exceptions to that."— Michael Washington Along the way, we talked about the benefits and pitfalls of leveraging AI (particularly LLMs) in your applications. Both Michael and I agree that there is little "intelligence" in LLMs in the traditional sense, and Michael brings up the most important point when deciding to use an LLM in your application: that a human must always make decisions based on what data they have and what the LLM can provide. We must never hand over decision making to LLMs. Before we jump in, a quick reminder: if The Modern .NET Show has become part of your learning journey, please consider supporting us through Patreon or Buy Me A Coffee. Every contribution helps us continue bringing you these in-depth conversations with industry experts. You'll find all the links in the show notes. Anyway, without further ado, let's sit back, open up a terminal, type in `dotnet new podcast` and we'll dive into the core of Modern .NET. Full Show Notes The full show notes, including links to some of the things we discussed and a full transcription of this episode, can be found at: https://dotnetcore.show/season-8/data-ai-and-the-human-touch-michael-washington-on-building-trustworthy-applications/ Useful Links: Apache Parquet; Personal Data Warehouse on the Windows App Store and GitHub; Michael on Find an MVP, GitHub, and Bluesky; Blazor Help Website; blazordata.net; AI Story Builders. Supporting the show: leave a rating or review; buy the show a coffee; become a patron. Getting in Touch: via the contact page; joining the Discord. Remember to rate and review the show on Apple Podcasts, Podchaser, or wherever you find your podcasts; this will help the show's audience grow. Or you can just share the show with a friend. And don't forget to reach out via our Contact page. We're very interested in your opinion of the show, so please get in touch.
You can support the show by making a monthly donation on the show's Patreon page at: https://www.patreon.com/TheDotNetCorePodcast. Music created by Mono Memory Music, licensed to RJJ Software for use in The Modern .NET Show. Editing and post-production services for this episode were provided by MB Podcast Services.
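The episode's database-versus-warehouse cost point, and the Apache Parquet link above, can be made concrete with a small sketch: operational rows are copied out of the application's database into a columnar Parquet file and then queried analytically. This is illustrative only (it assumes pandas, pyarrow, and duckdb are installed; the table, columns, and file path are made up, not taken from the episode or from Personal Data Warehouse).

```python
# Minimal sketch: land a slice of operational data in cheap, columnar
# warehouse-style storage (Parquet) and query it analytically with DuckDB.
# Table, column names, and file path are illustrative.
import pandas as pd
import duckdb

# Pretend this came from the application's transactional database.
orders = pd.DataFrame({
    "order_id": [1, 2, 3, 4],
    "customer": ["acme", "acme", "globex", "initech"],
    "amount":   [120.0, 75.5, 310.0, 42.0],
})

# Columnar files are what make warehouse-style storage cheap to keep and scan.
orders.to_parquet("orders.parquet", index=False)

# Analytical query straight over the file; no database server involved.
print(duckdb.sql(
    "SELECT customer, SUM(amount) AS total FROM 'orders.parquet' GROUP BY customer"
).fetchall())
```

The point of the sketch is the storage split: the row store keeps serving the application, while the analytical copy lives in files that are cheap to retain and fast to scan.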

Der Performance Manager Podcast | Für Controller & CFO, die noch erfolgreicher sein wollen

1952 löste Grace Hopper ein Problem, das heute in jedem Unternehmen existiert. Die IT spricht von ETL-Prozessen und Data Warehouses. Das Controlling denkt in KPIs und Forecasts. Das Marketing fokussiert sich auf Conversion-Rates. Alle sprechen eine andere „Datensprache“ - genau wie damals die unverständlichen Maschinencodes. Grace Hoppers Lösung? Der Compiler - ein Übersetzer zwischen Mensch und Maschine. Was Unternehmen heute brauchen: Einen Compiler für ihre BI-Projekte. Jemanden, der zwischen den Abteilungen dolmetscht und aus scheinbar gegensätzlichen Anforderungen eine gemeinsame Lösung entwickelt. Im aktuellen Podcast zeige ich, woran Sie echte BI-Kompetenz erkennen - und warum die meisten Berater zu schnell mit Softwarevorschlägen kommen. 

Data Culture Podcast
Aufbau einer Data-driven Culture bei Picnic – mit Anna Hannemann

Oct 13, 2025 · 29:50


MY DATA IS BETTER THAN YOURS
Warum Datenprojekte mehr Kultur als Technik brauchen - mit Sören E., Provinzial

Oct 9, 2025 · 41:42 · Transcription available


What happens when two companies have to merge not just their systems but also their ways of thinking? In the second episode of the new podcast series from MY DATA IS BETTER THAN YOURS, host Jonas Rashedi talks with Dr. Sören Erdweg of Provinzial about exactly this challenge, and about the cultural and technical rebuild after a merger. Sören started out as a data scientist and today, as an IT project lead, is responsible for large data projects across the group. He describes how operational systems and historical data holdings are being consolidated, with the goal of building a consolidated data warehouse that can be used not only for reporting but also for modern AI models. One thing becomes clear: the biggest challenge is not the technology. It is the differing models, domain logics, and expectations, and the work of bringing them into alignment. Sören explains how Provinzial creates a shared target picture, how cross-functional teams spanning IT and the business are built, and what it takes to turn individual initiatives into sustainable platforms. Concrete use cases are also discussed: from predictive models and customer behavior analysis to applying GPT models in a heavily text-based environment like insurance. Finally, the question: is the data game a sprint or a marathon? Sören's answer: both, and that is exactly what makes it so exciting. MY DATA IS BETTER THAN YOURS is a project by BETTER THAN YOURS, the brand for really good podcasts. Sören's profile: https://www.xing.com/profile/Soeren_Erdweg Provinzial Versicherung website: https://www.provinzial.de/west/ All the important links about Jonas and the podcast: https://linktr.ee/jonas.rashedi Timestamps with topics: 00:00 Intro and welcome 01:05 Recap of part 1 01:28 Introducing Sören 02:15 Data projects and system consolidation after the merger 04:33 From data scientist to IT project lead 07:31 Challenges of data modeling in insurance 09:51 Cultural differences and decentralized data logics 13:54 Target picture: a shared data warehouse 16:50 Business departments as the key to the data strategy 18:00 A hub-and-spoke model in practice 20:38 Learnings from earlier projects 25:39 AI applications 27:54 Challenges with recommender systems in insurance 30:23 GPT and text processing in the insurance sector 33:31 Room for innovation versus resource reality 34:57 Lessons learned: what works in practice? 39:50 Personal data use

Der Performance Manager Podcast | Für Controller & CFO, die noch erfolgreicher sein wollen
#757 Der Weg zur datengesteuerten Organisation – Ausgabe 5/2025 der Fachzeitschrift Controlling

Oct 6, 2025 · 30:24


"Data is the new oil": the metaphor has long since worn out. The real question is: how do companies create genuine value from their data holdings? And what role do controllers play in that? The new issue 5/2025 of the journal Controlling shows four different paths to the data-driven organization: from systematic use case assessments that avoid subjectivity in data projects, to modular data mesh architectures that challenge centralized data warehouses, to knowledge graphs that open up new possibilities for corporate steering. Added to that is an often overlooked aspect: controllers should learn to tell the story behind their numbers, with data storytelling as the bridge between complex analyses and understandable communication. The central insight: controllers do not become pure data experts; rather, they must extend their established role as business partners with technical skills. But where is the balance between efficiency and organizational challenges? ATVISIO managing director Peter Bluhm discusses this in the Performance Manager Podcast with Prof. Dr. Ulrike Baumöl of the University of St. Gallen, co-editor of the journal Controlling.

BIFocal - Clarifying Business Intelligence
Episode 307 - Microsoft Fabric September 2025 Feature Summary part 2

Sep 30, 2025 · 32:09


This is episode 307 recorded on September 19th, 2025, where John & Jason continue talking about the Microsoft Fabric September 2025 Feature Summary including the Data Agent improvements, Merge Transact SQL in Data Warehouse, new Real-Time Intelligence Features, and much more. For show notes please visit www.bifocal.show
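One of the features mentioned, MERGE Transact-SQL in the Data Warehouse, is essentially an upsert in a single statement. Below is a rough, generic illustration of what such a MERGE looks like when issued from Python; the connection string, schema, and column names are placeholders and nothing here is taken from the episode or from Microsoft's documentation of the Fabric-specific behavior.

```python
# Hypothetical upsert into a warehouse dimension table using T-SQL MERGE,
# sent through pyodbc. DSN, table, and column names are placeholders.
import pyodbc

conn = pyodbc.connect("DSN=my_fabric_warehouse")  # placeholder connection
cur = conn.cursor()
cur.execute("""
    MERGE dbo.dim_customer AS target
    USING dbo.stg_customer AS source
        ON target.customer_id = source.customer_id
    WHEN MATCHED THEN
        UPDATE SET target.name = source.name, target.city = source.city
    WHEN NOT MATCHED THEN
        INSERT (customer_id, name, city)
        VALUES (source.customer_id, source.name, source.city);
""")
conn.commit()
```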

Data Gen
#225 - Qover : Structurer son Data Warehouse & Modéliser ses Données (dbt, Médaillon…)

Sep 22, 2025 · 21:44


Grégoire Hornung is Head of Data at Qover, a Belgian InsurTech gem that has raised 70 million euros and counts clients such as Revolut, Qonto, and Mastercard. We cover:

Stories from the Hackery
The Engine Behind AI: Why Data Engineering is in Demand | Stories From The Hackery

Sep 10, 2025 · 67:21


In this episode of Stories from the Hackery, we talk with Nashville tech leader and hiring manager Jason Turan about one of tech's most in-demand fields: data engineering. Jason, a long-time friend of NSS, was one of the first people to tell us that Nashville needed more data engineers. He shares his perspective on what a data engineer does, describing the role as the "connective tissue between data producers and data consumers". Listen in to hear us discuss: - Why data engineers are essential for flipping the 80/20 rule, allowing data scientists and analysts to spend less time cleaning data and more time finding insights. - How the rise of generative AI has acted as an "accelerant," increasing the need for high-quality data and the professionals who can provide it. - Actionable advice for getting started in the field, including the importance of focusing on a "T-shaped skillset" with SQL at its core. - Why Jason's number one piece of advice is to be curious, experiment, and "go out and do the thing". 01:20 Meet Jason Turan: His Tech Origin Story 03:04 Jason's History with NSS and Hiring Grads 07:28 Defining Data Engineering: The "Connective Tissue" of Tech 11:15 Why Nashville is a Hub for Data Engineers 13:56 Healthcare's Impact on Nashville's Data Jobs 20:35 How GenAI Accelerates the Need for Data Engineers 31:33 Getting Started: Lower Barriers to Entry 39:03 A Top Use Case for AI: Understanding Your Codebase 52:21 Misconceptions & the "T-Shaped Skillset" 55:29 The Value of Hands-On Learning: "Go Do the Thing" 58:52 Lightning Round: Favorite Tech Tools 01:00:32 Lightning Round: Top Reads & Resources Links Metabase: https://www.metabase.com/ DuckDB: https://duckdb.org/ MotherDuck: https://motherduck.com/ Ralph Kimball: The Data Warehouse Toolkit: https://www.amazon.com/gp/product/1118530802 Bill Inmon: Building the Data Warehouse: https://www.amazon.com/Building-Data-Warehouse-W-Inmon/dp/0764599445 Edward Tufte: The Visual Display of Quantitative Information: https://www.amazon.com/Visual-Display-Quantitative-Information/dp/0961392142 Brendan Keeler: The Health API Guy: https://healthapiguy.substack.com/ TLDR Newsletter: https://tldr.tech/ Nashville Technology Council (NTC): https://technologycouncil.com/

Heverton Anunciação e Universidade do Consumidor te inspiram a inovar na relação empresa e clientes
Eu entrevistei Bill Inmon, o pai que criou o conceito de Data Warehouse #cienciadedados #analytics

Sep 10, 2025 · 2:58


I interviewed Bill Inmon, the father of the Data Warehouse concept, an essential building block for data analysis and Artificial Intelligence. Want to watch the full interview? https://youtube.com/live/Jvejm2kX2kE Watch it on our YouTube channel, along with hundreds of other rich debates. #atendimentoaocliente #experienciadocliente #inovação #customerexperience #callcenter #crm #ciênciadedados #satisfaçãodocliente #jornadadocliente #consumidor #ouvidoria #marketing #empreendedorismo #vendas #sucessodoocliente

BIFocal - Clarifying Business Intelligence
Episode 304 - Microsoft Fabric August 2025 Feature Summary

Sep 9, 2025 · 41:34


This is episode 304 recorded on September 4th, 2025, where John & Jason talk about the Microsoft Fabric August 2025 Feature Summary including a new Flat list view in Deployment pipelines, Bursting controls for Data Engineering workloads, new test capabilities for User Data Functions, the ability to serve real-time predictions with ML model endpoints, several updates to Data Warehouse, Database tree in edit tile and AzMon data sources for RTI, the ability to use Python Notebooks to read/write to Fabric SQL Databases, Auto table creation on destination in copy job in Data Factory, and much, much more. For show notes please visit www.bifocal.show

Industrie 4.0 – der Expertentalk für den Mittelstand
Industrie 4.0 to go #22: So geht smarte Steuerung mit Echtzeit-Reporting

Sep 9, 2025 · 9:07


In today's business world, decisions made on gut feeling are long outdated: real-time reporting is the key to well-founded, data-based decisions. Check out concrete examples and reporting dashboards on our website.

Der Performance Manager Podcast | Für Controller & CFO, die noch erfolgreicher sein wollen
#750 Architekturen für BI & Analytics – Prof. Dr. Peter Gluchowski im Gespräch (Teil 2v2)

Sep 4, 2025 · 36:42


In this episode of the Performance Manager Podcast I talk with Prof. Dr. Peter Gluchowski (TU Chemnitz) about modern BI and analytics architectures. We clarify the differences between data lake, data lakehouse, data mesh, and data warehouse, and discuss why classic approaches often no longer suffice. Prof. Dr. Gluchowski gives practical recommendations on how companies can use their data more efficiently and modernize their BI landscapes. The book "Architekturen für BI & Analytics": https://bit.ly/3Vx2tnG

Der Performance Manager Podcast | Für Controller & CFO, die noch erfolgreicher sein wollen
#749 Architekturen für BI & Analytics – Prof. Dr. Peter Gluchowski im Gespräch (Teil 1v2)

Sep 2, 2025 · 24:16


In this episode of the Performance Manager Podcast I talk with Prof. Dr. Peter Gluchowski (TU Chemnitz) about modern BI and analytics architectures. We clarify the differences between data lake, data lakehouse, data mesh, and data warehouse, and discuss why classic approaches often no longer suffice. Prof. Dr. Gluchowski gives practical recommendations on how companies can use their data more efficiently and modernize their BI landscapes. The book "Architekturen für BI & Analytics": https://bit.ly/3Vx2tnG

Explicit Measures Podcast
454: Looking at the Data Warehouse Roadmap

Aug 29, 2025 · 72:46


Mike & Tommy are joined by Brad Schact at Microsoft to review what is new & upcoming in the Microsoft Fabric Data Warehouse. Get in touch:Send in your questions or topics you want us to discuss by tweeting to @PowerBITips with the hashtag #empMailbag or submit on the PowerBI.tips Podcast Page.Visit PowerBI.tips: https://powerbi.tips/Watch the episodes live every Tuesday and Thursday morning at 730am CST on YouTube: https://www.youtube.com/powerbitipsSubscribe on Spotify: https://open.spotify.com/show/230fp78XmHHRXTiYICRLVvSubscribe on Apple: https://podcasts.apple.com/us/podcast/explicit-measures-podcast/id1568944083‎Check Out Community Jam: https://jam.powerbi.tipsFollow Mike: https://www.linkedin.com/in/michaelcarlo/Follow Seth: https://www.linkedin.com/in/seth-bauer/Follow Tommy: https://www.linkedin.com/in/tommypuglia/

The Data Engineering Show
Is Self-Service BI a False Promise? Lei Tang of Fabi.ai Thinks So

Aug 28, 2025 · 21:07


AI is reshaping business intelligence by enabling true self-service analytics and transforming how organizations interact with their data through natural language processing. In this episode of The Data Engineering Show, host Benjamin interviews Lei, Co-founder and CTO of Fabi.ai, to explore how AI-native BI platforms are reshaping data analytics and empowering non-technical users to derive meaningful insights from complex datasets.

Der AWS-Podcast auf Deutsch
116 - Die Data Platform Transformation bei Tom Tailor

Aug 28, 2025 · 27:12


In this episode we talk with Christoph Scheyk, Head of Data, Analytics & AI at Tom Tailor, about the fashion company's successful transformation through the use of modern AWS technology. Together with Josephine Plath (AWS) we trace the path from migrating the data warehouse to Amazon Redshift, through using the data platform as an enabler for digitalization, to applying generative AI for product visualization. An inspiring conversation about learnings, best practices, and the role of data in an industry in transition. Core topics of the episode: challenges and motivation for the cloud migration; building a scalable data platform with Redshift; GenAI in the fashion industry: image generation and new use cases; sustainability, e-commerce, and technological opportunities; recommendations for companies in transformation; innovation roadmap: how Tom Tailor is shaping the future with AWS. Highlights: how Tom Tailor accelerates innovation with a stable data foundation; concrete GenAI projects for visualizing collections; practical tips from a fashion company's cloud transformation; visionary perspectives on the role of technology in the fashion industry.

Explicit Measures Podcast
453: Creature Comforts of Data Warehouse

Aug 26, 2025 · 66:16


Mike & Tommy are joined by Brad Schacht on the Data Warehouse team for Microsoft to dive into how to get eased and comfortable with the Warehouse experience in Microsoft Fabric.Get in touch:Send in your questions or topics you want us to discuss by tweeting to @PowerBITips with the hashtag #empMailbag or submit on the PowerBI.tips Podcast Page.Visit PowerBI.tips: https://powerbi.tips/Watch the episodes live every Tuesday and Thursday morning at 730am CST on YouTube: https://www.youtube.com/powerbitipsSubscribe on Spotify: https://open.spotify.com/show/230fp78XmHHRXTiYICRLVvSubscribe on Apple: https://podcasts.apple.com/us/podcast/explicit-measures-podcast/id1568944083‎Check Out Community Jam: https://jam.powerbi.tipsFollow Mike: https://www.linkedin.com/in/michaelcarlo/Follow Seth: https://www.linkedin.com/in/seth-bauer/Follow Tommy: https://www.linkedin.com/in/tommypuglia/

Data Culture Podcast
Data Warehouse Automation: Benefits and Market Overview – with Florian Bigelmaier, BARC

Aug 18, 2025 · 34:10


SQL Data Partners Podcast
Episode 285: Who Is Using Microsoft Fabric

May 29, 2025 · 34:05


Fabric personas were originally designed to break down the various functional roles within Microsoft Fabric—such as Power BI, Data Factory, Data Activator, Data Engineering, Data Science, Data Warehouse, and Real-time Analytics—into more manageable, bite-sized sections. The goal was to prevent users from feeling overwhelmed by the platform's breadth. However, this feature has since been discontinued, as it did not effectively communicate the seamless integration between these roles. Still, the underlying concepts can be useful when thinking about how you might approach Fabric from a functional standpoint. Do you like the change to one large canvas, or did personas have a use for you? Let us know in the comments below. We hope you enjoyed this conversation on personas in Microsoft Fabric. If you have questions or comments, please send them our way. We would love to answer your questions on a future episode. Leave us a comment and some love ❤️ on LinkedIn, X, Facebook, or Instagram. The show notes for today's episode can be found at Episode 285: Who is Using Microsoft Fabric. Have fun on the SQL Trail!

Weaver: Beyond the Numbers
Five Common Data Warehouse Issues

May 15, 2025 · 5:54


BIFocal - Clarifying Business Intelligence
Episode 293 - Microsoft Fabric March 2025 Feature Summary Part 2

May 8, 2025 · 42:07


This is episode 293 recorded on May 6th, 2025 where John & Jason talk about the Microsoft Fabric March 2025 Feature Summary including Data Science, Data Warehouse, Real-time Intelligence, and Data Factory.

Weaver: Beyond the Numbers
How Does a Data Warehouse Work?

Apr 9, 2025 · 5:17


BIFocal - Clarifying Business Intelligence
Episode 287 - Microsoft Fabric January 2025 Feature Summary

Mar 20, 2025 · 35:31


This is episode 287 recorded on March 13th, 2025 where John & Jason talk about the Microsoft Fabric January 2025 Feature Summary including Python Notebooks in preview, lineage enhancements to Spark notebooks, lots of DBA enhancements to Data Warehouse, Tenant Level Private Link for Databases, and CI/CD preview for most of Fabric.

BlockHash: Exploring the Blockchain
Ep. 497 Catherine Daly | Verifiable Data Warehouse with Space and Time

Mar 17, 2025 · 34:01


For episode 497, Head of Marketing Catherine Daly joins Brandon Zemp to talk about Space and Time, an AI-driven Web3 data warehouse. SxT replaces blockchain indexing, databases, and API servers with a decentralized solution. Catherine is a senior marketing strategist with a passion for building community around emerging technology. Prior to Space and Time, Catherine managed full-funnel marketing for both startups and established global organizations in the semiconductor industry. She is accomplished in developing data-driven integrated communications strategies to accelerate growth for businesses across the Web3 and technology ecosystem.

The Marketing Hero Podcast
Integrating Your CRM With Your Data Warehouse

Mar 4, 2025 · 38:29


Companies have crucial data stored across multiple systems in their organization, and the bigger the company, the more systems there are. Sales, marketing, finance, ERP, inventory, contract management, billing, and service delivery are just some of the data and systems that tell the story of the business. Many times your CRM by itself can't show all of that! Because of this, many companies have started setting up a centralized data warehouse like Snowflake or Redshift to pull in data from their CRM and other systems so they can run more advanced, centralized reporting across it all. But if you want to set this up, where do you start? How do you manage it? How do you protect it? How do you keep it maintained? How do you actually derive value from all the hard work of implementing it? I sat down with Ryan Severns to talk through all of these questions and more, starting with: Why set up a CRM data warehouse infrastructure in the first place? How do you build the integration? What important considerations do you need to plan for? We dig deeper into the details from there, covering ETL tooling, dbt, data governance, cross-platform analytics, building effective business intelligence systems, and more. If you're ready to level up your customer data skills and advance to the next level of your RevOps hero journey, this episode is for you! Give it a watch and a like, and hit that subscribe button so you'll always get notified of future episodes of The RevOps Hero Podcast.
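To make the CRM-to-warehouse idea concrete, here is a toy extract-and-load sketch under stated assumptions: the CRM endpoint, token, field names, and staging table are all invented for illustration, and a local DuckDB file stands in for Snowflake or Redshift. Real setups would typically use a managed connector (or custom ETL) plus dbt models on top, but the shape of the work is the same.

```python
# Toy extract-and-load: pull contacts from a hypothetical CRM REST API and
# land them in a staging table. Endpoint, token, and names are placeholders.
import requests
import duckdb

resp = requests.get(
    "https://crm.example.com/api/contacts",          # hypothetical endpoint
    headers={"Authorization": "Bearer <token>"},     # placeholder credential
    timeout=30,
)
resp.raise_for_status()
contacts = resp.json()  # assume a list of flat JSON records

con = duckdb.connect("warehouse.duckdb")  # stand-in for the real warehouse
con.execute("""
    CREATE TABLE IF NOT EXISTS stg_crm_contacts (
        id VARCHAR, email VARCHAR, lifecycle_stage VARCHAR
    )
""")
con.executemany(
    "INSERT INTO stg_crm_contacts VALUES (?, ?, ?)",
    [(c["id"], c["email"], c["lifecycle_stage"]) for c in contacts],
)
```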

Dynasty Dingers
Jon Anderson of MLB Data Warehouse Joins The Show

Feb 17, 2025 · 82:24


Doc and Matt welcome Jon Anderson of MLB Data Warehouse to the show to talk about the value of dynasty leagues and what Jon looks for in his model to help dominate redraft. Follow Jon @JonPgh on X

Les Cast Codeurs Podcast
LCC 322 - Maaaaveeeeen 4 !

Feb 9, 2025 · 77:13


Arnaud and Emmanuel discuss this month's news. Topics include JVM integrity, JDBC fetch size, MCP, prompt engineering, DeepSeek of course, but also Maven 4 and Maven repository proxies. And more besides, happy reading. Recorded on February 7, 2025. Download the episode LesCastCodeurs-Episode-322.mp3 or watch it on YouTube. News Languages JVM evolutions to strengthen integrity https://inside.java/2025/01/03/evolving-default-integrity/ an article on why framework vendors and users are tearing their hair out, and why the JVM will keep guaranteeing the integrity of code and data by removing historically available APIs: dynamic agents, setAccessible, Unsafe, JNI. The article explains the risks as perceived by the JVM maintainers. Frankly the article is a bit light on the causes, somewhat self-promotional. JavaScript Temporal, at last a clean, modern API for handling dates in JS https://developer.mozilla.org/en-US/blog/javascript-temporal-is-coming/ JavaScript Temporal is a new object designed to replace the flawed Date object. It fixes problems such as missing time zone support and mutability. Temporal introduces concepts such as instants, plain (civil) times, and durations. It provides classes for handling various date/time representations, both time-zone-aware and time-zone-naive. Temporal simplifies working with different calendars (for example Chinese or Hebrew). It includes methods for comparing, converting, and formatting dates and times. Browser support is experimental, with Firefox Nightly having the most complete implementation. A polyfill is available to try Temporal in any browser. Libraries An article on JDBC fetch size and its impact on your applications https://in.relation.to/2025/01/24/jdbc-fetch-size/ who knows their driver's default fetch size?
Depending on your use cases it can be devastating: for example, an app that returns 12 rows against Oracle's default fetch size of 10 makes 2 round trips for nothing, and if 50 rows are returned the database becomes the limiting factor, not Java. So raising the fetch size pays off; you trade Java memory to avoid latency. Quarkus announces the MCP servers project to collect MCP servers written in Java https://quarkus.io/blog/introducing-mcp-servers/ Anthropic's MCP; a JDBC database introspector; a filesystem reader; drawing in JavaFX; easy to start with jbang and tested with Claude Desktop, goose, and mcp-cli; lets your AI tap the power of Java libraries. Spring, by the way, is at version 0.6 of its MCP support https://spring.io/blog/2025/01/23/spring-ai-mcp-0 Infrastructure Apache Flink on Kubernetes https://www.decodable.co/blog/get-running-with-apache-flink-on-kubernetes-2 a very thorough two-part article on running Flink on Kubernetes: installation and setup, but also checkpointing, HA, and observability. Data and Artificial Intelligence 10 prompt engineering techniques https://medium.com/google-cloud/10-prompt-engineering-techniques-every-beginner-should-know-bf6c195916c7 If you want to go further, the article references a very good white paper on prompt engineering https://www.kaggle.com/whitepaper-prompt-engineering The techniques covered: Zero-Shot Prompting: you ask the AI to answer a question directly without giving it any prior example, like asking someone a question with no context. Few-Shot Prompting: you give the AI one or more examples of the task you want it to perform, like showing someone how to do something before asking them to do it. System Prompting: you define the overall context and purpose of the task for the AI, like giving it general instructions about what it should do. Role Prompting: you assign the AI a specific role (teacher, journalist, etc.), like asking someone to play a specific part. Contextual Prompting: you provide additional information or context for the task, like giving someone everything they need to answer a question. Step-Back Prompting: you first ask a general question, then use the answer to ask a more specific one, like asking an open question before a closed one. Chain-of-Thought Prompting: you ask the AI to show, step by step, how it reaches its conclusion, like asking someone to explain their reasoning. Self-Consistency Prompting: you ask the AI the same question several times and compare the answers to find the most consistent one, like checking an answer by asking it in different ways. Tree-of-Thoughts Prompting: you let the AI explore several reasoning paths at the same time, like considering every possible option before deciding. ReAct Prompting: you let the AI interact with external tools to solve complex problems, like giving someone the tools they need to solve a problem.
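To make a couple of the techniques above concrete, here is a small sketch showing how zero-shot and few-shot prompts differ purely at the prompt-construction level. The `call_llm` function is a hypothetical stand-in for whatever model client you use; it is not a real API and the example reviews are invented.

```python
# Zero-shot vs. few-shot prompting, shown purely as prompt construction.
def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in your actual model client here")

review = "The battery died after two days."

# Zero-shot: no examples, just the task.
zero_shot = f"Classify the sentiment of this review as positive or negative:\n{review}"

# Few-shot: a couple of worked examples show the model the expected format.
few_shot = (
    "Classify the sentiment of each review as positive or negative.\n"
    "Review: 'Great screen, fast delivery.' -> positive\n"
    "Review: 'Stopped working after a week.' -> negative\n"
    f"Review: '{review}' ->"
)

# Same question either way; only the prompt scaffolding changes.
# print(call_llm(zero_shot)); print(call_llm(few_shot))
```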
The Thoughtworks GenAI patterns https://martinfowler.com/articles/gen-ai-patterns/ very introductory and pre-RAG. Direct prompting, which is a direct call to the LLM: limited knowledge and limited control over the experience. Evals: evaluating an LLM's output with several techniques, but fundamentally a function that takes the request and the response and produces a numeric score; evaluation by an LLM (the same one or another), or human evaluation; run the evaluations from the build pipeline but also live, since LLMs can evolve. Describes embeddings, notably of images but also of text, with the notion of context. DeepSeek and the end of NVIDIA's dominance https://youtubetranscriptoptimizer.com/blog/05_the_short_case_for_nvda an article on why NVIDIA is going to be challenged on its margins: 90% margins, after all, thanks to the biggest GPUs and proprietary CUDA; but alternative hardware approaches exist that are more efficient (TPUs and big wafer chips); Google, Microsoft, and others are building their own alternative GPUs; CUDA is becoming less and less the lingua franca, with investment in alternative intermediate languages by Apple, Google, OpenAI, and others. The article discusses DeepSeek, which landed a slap in the world of LLMs. They built a competitor to GPT-4o and o1 for 5 million dollars, with impressive reasoning capabilities. The key was a lot of optimization tricks, but the biggest was using 8-bit neural weights versus 32 bits for the others, and thus quantizing on the fly and during training; also a lot of innovative reinforcement learning and Mixture of Experts, so roughly 50x cheaper than OpenAI. So no more need for GPUs with tons of vRAM. Oh, and DeepSeek is open source. An article from SemiAnalysis shifts the narrative a bit: the DeepSeek paper says a lot through its omissions; for example, the 6M is just GPU inference, not the research costs and the various trials and errors; by comparison Claude Sonnet cost 10M in inference; DeepSeek has a lot of compute, pre-ban and some post-ban, estimated at 5 billion in investment. Their advances and their openness remain extremely interesting. An intro to Apache Iceberg http://blog.ippon.fr/2025/01/17/la-revolution-des-donnees-lavenement-des-lakehouses-avec-apache-iceberg/ born from the limits of the data lake, which is unstructured, and of data warehouses, which are limited in data diversity and volume; enter the lakehouses, and in particular Apache Iceberg, which came out of Netflix: schema management, but flexible; the notion of copy-on-write versus merge-on-read depending on needs; guarantees atomicity, consistency, isolation, and durability; time travel and rollback; hidden partitions (which abstract the partition and its transformations) and partition evolution; compatible with compute engines such as Spark, Trino, Flink, etc.; explains the structure of the metadata and the data. Guillaume has fun generating short science-fiction stories by programming AI agents with LangChain4j, and also with workflows https://glaforge.dev/posts/2025/01/27/an-ai-agent-to-generate-short-scifi-stories/ https://glaforge.dev/posts/2025/01/31/a-genai-agent-with-a-real-workflow/ Building an automated science-fiction short-story generator using Gemini and Imagen in Java, with LangChain4j, on Google Cloud. The system generates stories every night, complete with illustrations created by the Imagen 3 model, and publishes them to a website.
A self-reflection step uses Gemini to select the best image for each chapter. The agent uses an explicit workflow, driven by the Java code, where the steps are predefined in the code rather than relying on LLM-based planning. The code is available on GitHub and the application is deployed on Google Cloud. The article contrasts explicit workflow agents with autonomous agents, highlighting the trade-offs of each approach. Because sometimes autonomous AI agents that manage their own planning hallucinate a bit too much and fail to build a proper plan, or don't follow it as they should, or even hallucinate "function calls". The project uses Cloud Build, Cloud Run jobs, Cloud Scheduler, Firestore as the database, and Firebase for deployment and frontend automation. In the second article the approach is different: Guillaume uses a workflow tool rather than driving the planning with Java code. The imperative approach uses explicit Java code to orchestrate the workflow, offering precise control and parallelization. The declarative approach uses a YAML file to define the workflow, specifying the steps, inputs, outputs, and execution order. The workflow includes the steps to generate a story with Gemini 2, create an image prompt, generate images with Imagen 3, and save the result in Cloud Firestore (a NoSQL database). The main advantages of the imperative approach are precise control, explicit parallelization, and familiar programming tools. The main advantages of the declarative approach are workflow definitions that are perhaps easier to understand (even if it is YAML, yuck!), visualization, scalability, and simplified maintenance (you can just change the YAML in the console, like the good old days of PHP in production). The drawbacks of the imperative approach include the need for programming knowledge, potential maintenance challenges, and container management. The drawbacks of the declarative approach include painful YAML authoring, limited control over parallelization, no local emulator, and less intuitive debugging. The choice between the approaches depends on the project's requirements, with the declarative one suited to simpler workflows. The article concludes that declarative planning can help AI agents stay focused and predictable. Tooling Maven proxy vulnerabilities https://github.blog/security/vulnerability-research/attacks-on-maven-proxy-repositories/ Whatever the language or technology, it is highly recommended to put repository managers in place as proxies to better control the dependencies that go into your products. Michael Stepankin of the GitHub Security Lab set out to find out whether these managers are not themselves a source of vulnerabilities, by studying a few CVEs in products such as JFrog Artifactory, Sonatype Nexus, and Reposilite. Some flaws come from the products' UI, which can render artifacts (e.g. put some JavaScript in a POM file) and even browse inside them (e.g. view the contents of a jar/zip, with the API then exploited to read or even modify server files outside the archives). Artifacts can also be compromised by playing with proprietary URL parameters or with naming and encodings.
In short, nothing is simple at any level. Every system adds complexity, and it is important to keep them up to date. You have to actively monitor your supply chain through various means and not bet everything on the repository manager. The author gave a talk on the subject: https://www.youtube.com/watch?v=0Z_QXtk0Z54 Apache Maven 4… soon, promise…. what will be in it? https://gnodet.github.io/maven4-presentation/ And also https://github.com/Bukama/MavenStuff/blob/main/Maven4/whatsnewinmaven4.md Apache Maven 4: slowly but surely…. that is the principle of the project. Maven 4.0.0-rc-2 is available (Dec 2024). Maven is more than 20 years old and widely used in the Java ecosystem. Backward compatibility has always been a priority, but it has limited flexibility. Maven 4 introduces significant changes, notably a new build schema and code improvements. POM changes. Separation of the Build-POM and the Consumer-POM: the Build-POM contains build-specific information (e.g. plugins, configuration); the Consumer-POM contains only the information needed by artifact consumers (e.g. dependencies). New model version 4.1.0: used only for the Build-POM, while the Consumer-POM stays at 4.0.0 for compatibility; introduces new elements and marks some as deprecated. Modules renamed to subprojects: "modules" become "subprojects" to avoid confusion with Java modules; the new element replaces the old one (which remains supported). New packaging type "bom" (Bill of Materials): distinguishes parent POMs from dependency-management BOMs; supports exclusions and classifier-based imports. Explicit declaration of the root directory: lets you define the project root directory explicitly, removing any ambiguity about where project roots are located. New directory variables: ${project.rootDirectory}, ${session.topDirectory}, and ${session.rootDirectory} for better path handling; replaces the old unofficial workarounds and deprecated internal variables. Support for alternative POM syntaxes: introduction of the ModelParser SPI allowing alternative POM syntaxes; the Apache Maven Hocon Extension is an early example of this feature. Improvements for subprojects. Automatic parent versioning: it is no longer necessary to define the parent version in every subproject; works with model version 4.1.0 and extends to project-internal dependencies. Full support for CI-friendly variables: the Flatten Maven Plugin is no longer required; supports variables such as ${revision} for versioning; can be set via maven.config or on the command line (mvn verify -Drevision=4.0.1). Reactor improvements and fixes. Bug fix: better handling of --also-make when resuming builds. New --resume (-r) option to restart from the last failed subproject; subprojects that already built successfully are skipped on resume. Subfolder-aware builds: tools can be run on selected subprojects only. Recommendation: use mvn verify rather than mvn clean install. Other improvements: consistent timestamps for all subprojects in packaged archives.
Improved deployment: deployment only happens if all subprojects build successfully. Workflow, lifecycle, and execution changes. Java 17 required to run Maven: Java 17 is the minimum JDK required to run Maven 4; older Java versions can still be targeted for compilation via Maven Toolchains; Java 17 was preferred over Java 21 because of its longer long-term support. Plugin updates and application maintenance: removal of deprecated features (e.g. Plexus Containers, ${pom.} expressions); the Super POM is updated, changing the default plugin versions; builds may behave differently, so pin plugin versions to avoid unexpected changes; Maven 4 shows a warning if default versions are used. New "Fail on Severity" setting: the build can fail if log messages reach a specific severity level (e.g. WARN); usable via --fail-on-severity WARN or -fos WARN. Maven Shell (mvnsh): previously every run of mvn required a full Java/Maven restart; Maven 4 introduces Maven Shell (mvnsh), which keeps a single resident Maven process open across several commands; improves performance and reduces build times. Alternative: use Maven Daemon (mvnd), which manages a pool of resident Maven processes. Architecture An article on feature flags with Unleash https://feeds.feedblitz.com//911939960/0/baeldungImplement-Feature-Flags-in-Java-With-Unleash For A/B testing and faster development cycles, to "test in prod"; shows how to run Unleash under Docker and add the library to Java code to test a feature flag. Security Keycloak 26.1 https://www.keycloak.org/2025/01/keycloak-2610-released.html node detection via database probing instead of network exchanges; virtual threads for Infinispan and JGroups; OpenTelemetry tracing supported; and plenty of security features. Law, society, and organization The big pieces of a conference's costs and revenues. Here http://bdx.io|bdx.io https://bsky.app/profile/ameliebenoit33.bsky.social/post/3lgzslhedzk2a 44% tickets, 52% sponsors; 38% venue rental, 29% catering and coffee, 12% booth builder, 5% speaker expenses (so not all of them). Ask Me Anything Julien de Provin: I really like Quarkus's "continuous testing" mode, and I was wondering whether an alternative exists outside Quarkus, or failing that, resources on how it works? I would love an agnostic tool I could use on the non-Quarkus projects I work on, even if it means putting in a bit of elbow grease (or finger grease in this case).
https://github.com/infinitest/infinitest/ Conferences The list of conferences comes from Developers Conferences Agenda/List by Aurélie Vache and contributors: February 6-7, 2025: Touraine Tech - Tours (France) February 21, 2025: LyonJS 100 - Lyon (France) February 28, 2025: Paris TS La Conf - Paris (France) March 6, 2025: DevCon #24: 100% IA - Paris (France) March 13, 2025: Oracle CloudWorld Tour Paris - Paris (France) March 14, 2025: Rust In Paris 2025 - Paris (France) March 19-21, 2025: React Paris - Paris (France) March 20, 2025: PGDay Paris - Paris (France) March 20-21, 2025: Agile Niort - Niort (France) March 25, 2025: ParisTestConf - Paris (France) March 26-29, 2025: JChateau Unconference 2025 - Cour-Cheverny (France) March 27-28, 2025: SymfonyLive Paris 2025 - Paris (France) March 28, 2025: DataDays - Lille (France) March 28-29, 2025: Agile Games France 2025 - Lille (France) April 3, 2025: DotJS - Paris (France) April 3, 2025: SoCraTes Rennes 2025 - Rennes (France) April 4, 2025: Flutter Connection 2025 - Paris (France) April 4, 2025: aMP Orléans 04-04-2025 - Orléans (France) April 10-11, 2025: Android Makers - Montrouge (France) April 10-12, 2025: Devoxx Greece - Athens (Greece) April 16-18, 2025: Devoxx France - Paris (France) April 23-25, 2025: MODERN ENDPOINT MANAGEMENT EMEA SUMMIT 2025 - Paris (France) April 24, 2025: IA Data Day 2025 - Strasbourg (France) April 29-30, 2025: MixIT - Lyon (France) May 7-9, 2025: Devoxx UK - London (UK) May 15, 2025: Cloud Toulouse - Toulouse (France) May 16, 2025: AFUP Day 2025 Lille - Lille (France) May 16, 2025: AFUP Day 2025 Lyon - Lyon (France) May 16, 2025: AFUP Day 2025 Poitiers - Poitiers (France) May 24, 2025: Polycloud - Montpellier (France) May 24, 2025: NG Baguette Conf 2025 - Nantes (France) June 5-6, 2025: AlpesCraft - Grenoble (France) June 5-6, 2025: Devquest 2025 - Niort (France) June 10-11, 2025: Modern Workplace Conference Paris 2025 - Paris (France) June 11-13, 2025: Devoxx Poland - Krakow (Poland) June 12-13, 2025: Agile Tour Toulouse - Toulouse (France) June 12-13, 2025: DevLille - Lille (France) June 13, 2025: Tech F'Est 2025 - Nancy (France) June 17, 2025: Mobilis In Mobile - Nantes (France) June 24, 2025: WAX 2025 - Aix-en-Provence (France) June 25-26, 2025: Agi'Lille 2025 - Lille (France) June 25-27, 2025: BreizhCamp 2025 - Rennes (France) June 26-27, 2025: Sunny Tech - Montpellier (France) July 1-4, 2025: Open edX Conference - 2025 - Palaiseau (France) July 7-9, 2025: Riviera DEV 2025 - Sophia Antipolis (France) September 18-19, 2025: API Platform Conference - Lille (France) & Online October 2-3, 2025: Volcamp - Clermont-Ferrand (France) October 6-10, 2025: Devoxx Belgium - Antwerp (Belgium) October 9-10, 2025: Forum PHP 2025 - Marne-la-Vallée (France) October 16-17, 2025: DevFest Nantes - Nantes (France) November 4-7, 2025: NewCrafts 2025 - Paris (France) November 6, 2025: dotAI 2025 - Paris (France) November 7, 2025: BDX I/O - Bordeaux (France) November 12-14, 2025: Devoxx Morocco - Marrakech (Morocco) January 28-31, 2026: SnowCamp 2026 - Grenoble (France) April 23-25, 2026: Devoxx Greece - Athens (Greece) June 17, 2026: Devoxx Poland - Krakow (Poland) Getting in touch To react to this episode, come discuss it on the Google group https://groups.google.com/group/lescastcodeurs Contact us via X/Twitter https://twitter.com/lescastcodeurs or Bluesky https://bsky.app/profile/lescastcodeurs.com Submit a crowdcast or a crowdquestion Support Les Cast Codeurs on Patreon https://www.patreon.com/LesCastCodeurs All the
episodes and all the info at https://lescastcodeurs.com/

Bigdata Hebdo
Episode 211 - Motherduck

Jan 23, 2025 · 55:19


BigDataHebdo welcomes Mehdi, Developer Advocate at MotherDuck, to explore the world of DuckDB and MotherDuck. On the agenda: the academic origins of DuckDB, its evolution into a high-performance analytical SQL engine, and its MotherDuck extension, which lets you use it as an online data warehouse. Show notes at http://bigdatahebdo.com/podcast/episode-211-motherduck/
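As a taste of the DuckDB-then-MotherDuck workflow discussed in the episode, a minimal sketch follows. It assumes the duckdb Python package; the MotherDuck part additionally assumes an account and an authentication token, and the `md:` connection prefix and database name are shown as the documented convention rather than something verified here.

```python
# Local DuckDB query, then (optionally) the same engine pointed at MotherDuck.
import duckdb

# Purely local, in-process analytics: classic DuckDB, no server required.
local = duckdb.connect()
print(local.sql("SELECT 40 + 2 AS answer").fetchall())

# Same SQL, but the database lives in MotherDuck's cloud service.
# Requires a MotherDuck account/token; treat as a sketch of the convention.
# cloud = duckdb.connect("md:my_db")   # 'md:' prefix attaches MotherDuck
# cloud.sql("CREATE TABLE IF NOT EXISTS t AS SELECT 1 AS x")
```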

SQL Data Partners Podcast
Episode 283: Data Lakehouse vs Data Warehouse vs My House

Jan 2, 2025 · 48:59


Microsoft Fabric offers two enterprise-scale, open-standard format workloads for data storage: Warehouse and Lakehouse. Which service should you choose? In this episode, we dive into the technical components of OneLake, along with some of the decisions you'll be asked to make as you start to build out your data infrastructure. These are two good articles we mention in the podcast that could help inform your decision on the services to implement in your OneLake. Microsoft Fabric Decision Guide: Choose between Warehouse and Lakehouse - Microsoft Fabric | Microsoft Learn Lakehouse vs Data Warehouse vs Real-Time Analytics/KQL Database: Deep Dive into Use Cases, Differences, and Architecture Designs | Microsoft Fabric Blog | Microsoft Fabric We hope you enjoyed this conversation on the nuances of data storage within Microsoft OneLake! If you have questions or comments, please send them our way. We would love to answer your questions on a future episode. Leave us a comment and some love ❤️ on LinkedIn, X, Facebook, or Instagram. The show notes for today's episode can be found at Episode 283: Data Lakehouse vs Data Warehouse vs My House. Have fun on the SQL Trail!

Voice of the DBA
The Load of Real Time Data Warehouses

Oct 6, 2024 · 3:52


If you have a data warehouse, what do you think your ratio of reads to writes is on any given day? Do you think 1:1, as in one read for each write? Is it 10:1, with 10 reads for each write? 100:1? Do you track this in any way? One would think that most of the databases we work on in the transactional world have many more reads than writes. I'd have assumed the ratios might be higher for data warehouses, where we load data that is queried (read) as the primary use case. After all, I expect that there are lots of people querying data that is loaded into this warehouse, with relatively few changes. Read the rest of The Load of Real Time Data Warehouses
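One way to stop guessing at that ratio, at least on SQL Server-family systems, is to compare cumulative read operations against update operations in the index usage statistics. The sketch below is illustrative, not from the article: the DMV and column names are standard SQL Server ones, the connection string is a placeholder, and the counters reset on instance restart, so the number is an approximation rather than an audit.

```python
# Rough read:write ratio from SQL Server's index usage stats, via pyodbc.
import pyodbc

conn = pyodbc.connect("DSN=my_warehouse")  # placeholder connection
row = conn.cursor().execute("""
    SELECT SUM(user_seeks + user_scans + user_lookups) AS reads,
           SUM(user_updates)                           AS writes
    FROM sys.dm_db_index_usage_stats
    WHERE database_id = DB_ID();
""").fetchone()

reads, writes = row.reads or 0, row.writes or 0
print(f"reads={reads} writes={writes} ratio={reads / max(writes, 1):.1f}:1")
```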

Over The Edge
Leveraging Open Source Technologies for Data Lakehouses with Alex Merced, Senior Tech Evangelist at Dremio

Oct 2, 2024 · 44:01


What makes data lakehouses a game changer in modern data management? In this episode, Bill sits down with Alex Merced, Senior Tech Evangelist at Dremio, to explore the evolution of data lakehouses and their role in bridging the gap between data lakes and data warehouses. Alex breaks down the components of data lakehouses and dives into the rise of Apache Iceberg. Key Quotes: "I love just get really deep into technology, really see what it does. And then scream at the rooftops how cool it is. And basically that was my charter. And [Apache] Iceberg, the more I learned about it, the more I realized this is really interesting." "Interoperability and data. Basically, a lot of the things that kept data in silos is now breaking apart." "So here we're talking about something that's going to be a standard. And that's when I think of the highest levels of openness matter because if it's something that a whole industry is going to build on, it should be something that the whole industry has to say in its evolution…And that's the beauty of openness that it does create these nice sort of places where we can collaborate and compete together." Timestamps: (01:32) How Alex got started in his career (03:54) Breaking down data lakehouses (07:08) The idea behind an open data lakehouse (10:10) Alex's involvement with Apache Iceberg (15:13) Key components of a data lakehouse (23:41) The growth of Apache Iceberg (32:07) Dremio's Apache Iceberg crash course (38:43) Explaining self-service analytics Sponsor: Over the Edge is brought to you by Dell Technologies to unlock the potential of your infrastructure with edge solutions. From hardware and software to data and operations, across your entire multi-cloud environment, we're here to help you simplify your edge so you can generate more value. Learn more by visiting dell.com/edge for more information or click on the link in the show notes. Credits: Over the Edge is hosted by Bill Pfeifer, and was created by Matt Trifiro and Ian Faison. Executive producers are Matt Trifiro, Ian Faison, Jon Libbey and Kyle Rusca. The show producer is Erin Stenhouse. The audio engineer is Brian Thomas. Additional production support from Elisabeth Plutko. Links: Follow Bill on LinkedIn; Follow Alex on LinkedIn

Secrets of Data Analytics Leaders
A Novel Approach for Reducing Cloud Data Warehouse Expenses from Coginiti - Audio Blog

Oct 1, 2024 · 6:31


As organizations grapple with data spread across various storage locations, solutions like Coginiti Hybrid Query offer a much-needed alternative to fragmented tools. Published at: https://www.eckerson.com/articles/a-novel-approach-for-reducing-cloud-data-warehouse-expenses-from-coginiti

Masters of Privacy
Jonathan Mendez: making the most of first-party data in the age of AI

Sep 29, 2024 · 42:16


Jonathan Mendez has been a founder and leader in Adtech and Martech for two decades, with a focus on building first-party data products to optimize media performance.  He is the founder and CEO at Neuralift AI, having prior to that been Chief Digital Officer at a major cruise line, and having also spent five years building composable CDPs (Customer Data Platform) for global retail brands and telcos. He was also the Founder and CEO of Yieldbot, which in 2016 was the fourth largest Digital Advertising Network. He was also the CSO at Offermatica, eventually acquired by Omniture, now part of Adobe.  Jonathan's blog has been active for 17 years and is a recognized source of insights into AdTech, MarTech or Media. References: Jonathan Mendez (blog): Optimize & Prophesize Neuralift AI Jonathan Mendez on X Jonathan Mendez on LinkedIn Tejas Manohar (Hightouch): data activation and composable CDPs in a privacy-first world (Masters of Privacy) Nicola Newitt (Infosum): the legal case for Data Clean Rooms (Masters of Privacy) Matthias Eigenmann (Decentriq): Confidential Computing, contractual relationships and legal bases for Data Clean Rooms (Masters of Privacy)  

Tech Optimist
#49 - Meet the Startup Building Secure Data Warehouses for Our Ever-Changing World

Sep 6, 2024 · 25:09


In this captivating episode of the Tech Optimist podcast, Managing Partner Ray Wu sits down with Nate Holiday, the visionary co-founder and CEO of Space and Time. Dive into the intricate world of decentralized data warehouses as they discuss how Space and Time ensures verifiability and transparency in database operations. Learn about their innovative technology, Proof of SQL, which secures data interactions within blockchain environments, offering a new level of security and integrity for enterprise applications. Tune in to uncover how Space and Time is transforming data management and security, forging a path toward a more trustworthy digital infrastructure. Nate's Ask: Nate invites the community to join and engage with Space and Time's community of developers, contributing to their open-source projects on GitHub. He also encourages the adoption of their verifiable database solutions, available through platforms like Microsoft's Azure Marketplace and Google's marketplace, to enhance data security and transparency in enterprise applications. To Learn More: Alumni Ventures (AV); AV LinkedIn; AV AI Fund; Tech Optimist; Space and Time. Speakers: Ray Wu - Guest; Nate Holiday - Guest. Chapters: (00:00) - Intro (01:51) - Interview (17:02) - Nate's Ask (24:31) - Closing. Legal Disclosure: https://av-funds.com/tech-optimist-disclosures

BIFocal - Clarifying Business Intelligence
Episode 281 - Interview With Charles Webb

BIFocal - Clarifying Business Intelligence

Play Episode Listen Later Aug 13, 2024 34:48


This is episode 281, recorded on August 9th, 2024, where John & Jason talk to Charles Webb, Principal PM Manager for Data Warehouse in Microsoft Fabric, about Fabric, Data Warehouse, his history with Power BI & Dataflows, AI, college sports & business, his work in philanthropy, and much more.

Partially Redacted: Data Privacy, Security & Compliance
Demystifying Data Warehouses with Felicis Ventures's Eric Flaningam

Partially Redacted: Data Privacy, Security & Compliance

Play Episode Listen Later Jul 17, 2024 32:44


In this episode, host Sean Falconer sits down with Eric Flaningam, a researcher at Felicis Ventures, to explore the fascinating world of data warehouses. They dive into the history, evolution, and future trends of data warehousing, shedding light on its importance. Key topics discussed include an overview of the article "A Primer on Data Warehouses," and the definition and key characteristics of data warehouses. They also cover the historical evolution and major milestones in data warehousing, the shift from batch processing to real-time data, and the convergence of data warehouses and SQL. Eric and Sean discuss the impact of unstructured and complex data, advancements in technology and their effect on data warehouses, and the technical architecture and components of a typical data warehouse. They share real-world benefits and use cases of data warehouses, common challenges in implementing and maintaining data warehouses, and future trends and the influence of AI and machine learning on data warehouses. For further reading, check out Eric Flaningam's article, A Primer on Data Warehouses: https://www.generativevalue.com/p/a-primer-on-data-warehouses

Beyond Rent: Exploring Property Management
The Importance of Data Warehouses

Beyond Rent: Exploring Property Management

Play Episode Listen Later May 19, 2024 39:19


Companies are more successful and efficient when they have current, accurate data. Employing metrics, dashboards, and data warehouses—a central repository of data from different systems—can help businesses achieve their goals. Saad Shah of RentViewer joins the podcast to discuss the importance of utilizing data and analytics in the real estate management industry. Doing so gives companies better visibility and business intelligence, eliminates manual work, and streamlines expense and work order management. By evaluating the processes and tasks that are costing you the most money, you can use data to improve your bottom line.

Learn more about Rent Manager's industry-leading accounting, reporting, maintenance, and communication features at RentManager.com, or connect with us on LinkedIn, Facebook, Instagram, YouTube, and Twitter. You can learn more about Saad Shah on LinkedIn, and RentViewer on the company's website. Visit RentManager.com/Podcast to submit an idea for an upcoming episode of Beyond Rent and discover more about the program.

The Data Stack Show
189: Customer Data Modeling, The Data Warehouse, Reverse ETL, and Data Activation with Ryan McCrary of RudderStack

The Data Stack Show

Play Episode Listen Later May 16, 2024 63:52


Highlights from this week's conversation include:

Ryan's Background and Roles in Data (0:05)
Data Activation and Dashboard Staleness (1:27)
Profiles and Data Activation (2:54)
Customer-Facing Experience and Product Management (3:40)
Profiles Product Overview (5:10)
Use Cases for Profiles (6:44)
Challenges with Data Projects (9:19)
Entity Management and Account Views (15:33)
Handling Entities and Duplicates (17:55)
Challenges in Entity Management (22:18)
Product Management and Data Solutions (26:08)
Reverse ETL and Data Movement (31:58)
Accessibility of Data Warehouses (36:14)
Profiles and Entity Features (37:47)
Cohorts Creation and Use Cases (41:17)
Customer Data and Targeting (43:09)
Activations and Reverse ETL (45:57)
ML and AI Use Cases (55:53)
Data Activation and ML Predictions (57:02)
Spicy Take and Future Product Features (59:47)
ETL Evolution and Cloud Tools (1:00:50)
Unbundling and Future Trends (1:02:10)

The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we'll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.

RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack, visit rudderstack.com.

DH Unplugged
DHUnplugged #701: Sentiment Pulse

DH Unplugged

Play Episode Listen Later May 8, 2024 65:28


Earnings season - better and stats - BIGGEST BUYBACK EVER - We are gauging investor sentiment --- Remember - Confidence and Sentiment (Cheer-leading helps). PLUS we are now on Spotify and Amazon Music/Podcasts!

Click HERE for Show Notes and Links

DHUnplugged is now streaming live - with listener chat. Click on the link in the right sidebar.

Love the Show? Then how about a Donation?

Follow John C. Dvorak on Twitter
Follow Andrew Horowitz on Twitter
DONATE - Show 700 Campaign

Warm Up
- Earnings season - better and stats
- BIGGEST BUYBACK EVER
- We are gauging investor sentiment --- Remember - Confidence and Sentiment (Cheer-leading helps)
- Announcing the WINNER - CTP for Apple
- Fake Work?

Market Update
- If down - buy.... Names that were hammered due to earnings are catching bids again
- Follow up - Utilities
- Fed Speaks - Can't stop the Dove
- Employment - Excitement about the Unemployment Rate

Earnings Season Update:
- Overall, 80% of the companies in the S&P 500 have reported actual results for Q1 2024 to date.
- Of these companies, 77% have reported actual EPS above estimates, which is equal to the 5-year average of 77% but above the 10-year average of 74%.
- In aggregate, companies are reporting earnings that are 7.5% above estimates, which is below the 5-year average of 8.5% but above the 10-year average of 6.7%.
- Eight of the eleven sectors are reporting year-over-year earnings growth, led by the Communication Services, Utilities, Consumer Discretionary, and Information Technology sectors.
- Three sectors are reporting a year-over-year decline in earnings: Energy, Health Care, and Materials.
- Revenue - up again - estimated to be 4.1% when all is said and done. If 4.1% is the actual revenue growth rate for the quarter, it will mark the 14th consecutive quarter of revenue growth for the index.

Fake Work
- An investor at famed Silicon Valley firm Andreessen Horowitz is the latest VC to get involved in the debate around "fake work" in the tech industry.
- Ulevitch went on to point the finger at Google specifically, calling it "an amazing example."
- "I don't think it's crazy to believe that half the white-collar staff at Google probably does no real work," he said. "The company has spent billions and billions of dollars per year on projects that go nowhere for over a decade, and all that money could have been returned to shareholders who have retirement accounts."
- Marc Andreessen has criticized a managerial "laptop class" and tweeted in 2022, "The good big companies are overstaffed by 2x. The bad big companies are overstaffed by 4x or more."

Buy 'em
- Companies that took a hit after earnings (NFLX, AMD) are getting bid again
- NFLX gapped lower from ~$608 to $551 and is now $592
- AMD dropped from $160 to $140 and is now $156
- SPY, IWM, and QQQ - now above the 50-day moving average again

Follow Up - Utilities
- Just wanted to provide this idea again - Data Warehouses and other AI power-hungry places
- Symbol list of some utilities to look at further: SO, NEE, EXC, CMS
- Natural gas producers are planning for a significant spike in demand over the next decade, as artificial intelligence drives a surge in electricity consumption that renewables may struggle to meet alone.
- After a decade of flat power growth in the U.S., electricity demand is forecast to grow as much as 20% by 2030, according to a Wells Fargo analysis published in April.
- Power companies are moving to quickly secure energy as the rise of AI coincides with the expansion of domestic semiconductor and battery manufacturing as well as the electrification of the nation's vehicle fleet.
- AI data centers alone are expected to add about 323 terawatt-hours of electricity demand in the U.S. by 2030
- Utilities ETF

Apple
- Earnings - Nothing great in the earnings. A few pockets of sunshine....
- Raises dividend and announces a $110 BILLION buyback - the largest buyback EVER ...

Software Huddle
Operational Data Warehouse with Nikhil Benesch

Software Huddle

Play Episode Listen Later Apr 30, 2024 65:56


Today's episode is with Nikhil Benesch, who's the co-founder and CTO at Materialize, an Operational Data Warehouse. Materialize gets you the best of both worlds, combining the capabilities of your data warehouse with the immediacy of streaming. This fusion allows businesses to operate on data in real time. We discussed the data infrastructure behind it: how they built it, how they think about billing, how they think about cloud primitives, and what they wish they had.

Good Data, Better Marketing
Building Flexible Data Architectures for Enhanced Customer Engagement with Kevin Niparko, VP of Product for Twilio Segment CDP

Good Data, Better Marketing

Play Episode Listen Later Apr 11, 2024 38:12


This episode features an interview with Kevin Niparko, Vice President of Product for Twilio Segment CDP. Kevin joined the team in 2015 to lead Growth & Analytics, before helping form Segment's Product Management organization. He's led a variety of Twilio Segment's products over the years, including Connections, Cloud Sources and ETL, and Profiles.

In this episode, Kailey and Kevin discuss future-proofing organizations to take advantage of AI breakthroughs, accelerating time to value, and solving problems through data strategy alignment.

Key Takeaways:
- Keeping up with the evolving landscape of data management requires flexibility, extensibility, and interoperability built into your data architecture.
- How modern enterprises can quickly and continuously adapt to the proliferation of tools and technologies.
- The importance of creating a data strategy to evolve with the needs of your business and serve your cross-functional stakeholders.

"The problems that we see our customers running into that really feel intractable are the ones more on the people and the process side of data. It's something that technology can help with. It's something that CDPs can play a role in. But I think we're also realistic that no tech or software is going to be the silver bullet. It's about different parts of the organization coming together and aligning on an overall data strategy that everybody will abide by." – Kevin Niparko

Episode Timestamps:
(02:59) - Kevin's career journey
(05:52) - Trends impacting technology and customer engagement
(09:04) - Components of a flexible enterprise
(16:55) - How AI intersects with data management
(30:04) - How Kevin defines "good data"
(36:50) - Kevin's recommendations for upleveling inclusive marketing strategies

Links:
Connect with Kevin on LinkedIn
Connect with Kailey on LinkedIn
Learn more about Caspian Studios

Sponsor:
Good Data, Better Marketing is brought to you by Twilio Segment. In today's digital-first economy, being data-driven is no longer aspirational. It's necessary. Find out why over 20,000 businesses trust Segment to enable personalized, consistent, real-time customer experiences by visiting Segment.com.

Explicit Measures Podcast
305: Fabric Lakehouse or Data Warehouse?

Explicit Measures Podcast

Play Episode Listen Later Mar 27, 2024 52:49


Mike, Seth, & Tommy discuss a great article by Sam Debruyn on what a Fabric Lakehouse really is, and where we should spend our time.

Get in touch: Send in your questions or topics you want us to discuss by tweeting to @PowerBITips with the hashtag #empMailbag or submit on the PowerBI.tips Podcast Page.

Visit PowerBI.tips: https://powerbi.tips/
Watch the episodes live every Tuesday and Thursday morning at 7:30am CST on YouTube: https://www.youtube.com/powerbitips
Subscribe on Spotify: https://open.spotify.com/show/230fp78XmHHRXTiYICRLVv
Subscribe on Apple: https://podcasts.apple.com/us/podcast/explicit-measures-podcast/id1568944083
Check Out Community Jam: https://jam.powerbi.tips
Follow Mike: https://www.linkedin.com/in/michaelcarlo/
Follow Seth: https://www.linkedin.com/in/seth-bauer/
Follow Tommy: https://www.linkedin.com/in/tommypuglia/

Packet Pushers - Full Podcast Feed
KU051: Getting Under the Hood of Yellowbrick's K8s Data Warehouse (Sponsored)

Packet Pushers - Full Podcast Feed

Play Episode Listen Later Mar 21, 2024 33:52


In this episode of the Kubernetes Unpacked Podcast, Kristina and Michael catch up with Mark from Yellowbrick to talk about all things underlying architecture. Very rarely do we get a vendor to chat about what's going on underneath the hood and how a particular application stack/tool is running, so this was an awesome episode! Mark... Read more »

Packet Pushers - Fat Pipe
KU051: Getting Under the Hood of Yellowbrick's K8s Data Warehouse (Sponsored)

Packet Pushers - Fat Pipe

Play Episode Listen Later Mar 21, 2024 33:52


In this episode of the Kubernetes Unpacked Podcast, Kristina and Michael catch up with Mark from Yellowbrick to talk about all things underlying architecture. Very rarely do we get a vendor to chat about what's going on underneath the hood and how a particular application stack/tool is running, so this was an awesome episode! Mark... Read more »

Kubernetes Unpacked
KU051: Getting Under the Hood of Yellowbrick's K8s Data Warehouse (Sponsored)

Kubernetes Unpacked

Play Episode Listen Later Mar 21, 2024 33:52


In this episode of the Kubernetes Unpacked Podcast, Kristina and Michael catch up with Mark from Yellowbrick to talk about all things underlying architecture. Very rarely do we get a vendor to chat about what's going on underneath the hood and how a particular application stack/tool is running, so this was an awesome episode! Mark... Read more »

The CUInsight Network
Data Warehousing - Lodestar

The CUInsight Network

Play Episode Listen Later Feb 23, 2024 22:10


“We love working with credit unions to become more data-driven, so they can better support their members.” - Andrea Brown

Thank you for tuning in to The CUInsight Network, with your host, Lauren Culp, President & CEO of CUInsight. In The CUInsight Network, we take a deeper dive with the thought leaders who support the credit union community. We discuss issues and challenges facing credit unions and identify best practices to learn and grow together.

My guest on today's show is Andrea Brown, SVP of Growth at Lodestar. Andrea is a return guest to the podcast and shares what has changed since last year and what remains a common focus for leaders. Lodestar is a data warehouse and analytics partner for credit unions. They provide a full-service analytics platform of data connectors, visuals, workflows, and strategic guidance to help credit unions move forward in their analytics journey.

During our conversation, Andrea discusses which data strategies credit unions should be focusing on to benefit business goals. She explains how leveraging data analytics and choosing the right data warehouse is crucial for a successful core conversion. Listen as Andrea talks about growth plans for the future and continuing to support credit unions with the complex technology systems needed to embrace efficiency and sustainability.

As we wrap up the episode, Andrea talks about spending time with her family, splurging on new experiences, and preferring the audiobook of this recent read. Enjoy my conversation with Andrea Brown!

Find the full show notes on cuinsight.com.

Connect with Andrea:
Andrea Brown, SVP of Growth at Lodestar
andrea.brown@lodestartech.ca
lodestartech.ca
Andrea: LinkedIn
Lodestar: LinkedIn

Speaking of Data
Building a Data Warehouse in the Cloud with Norbert Kremer

Speaking of Data

Play Episode Listen Later Jan 8, 2024 24:45


Norbert Kremer, Ph.D., cloud solution architect and TDWI faculty member, joins host Andrew Miller to discuss his upcoming course on building a data warehouse in the cloud. For more information on Norbert's course, please visit Building a Data Warehouse the Google Cloud Way, and for the full agenda, please visit TDWI Transform 24 Las Vegas.

Engenharia de Dados [Cast]
The Data Lakehouse Paradigm with Bill Inmon - The Father of Data Warehouse

Engenharia de Dados [Cast]

Play Episode Listen Later Oct 12, 2023 43:19


In today's episode, Luan Moreno, Mateus Oliveira, and Orlando Marley interview Bill Inmon, creator of the Data Warehouse concept and author of numerous books on data topics.

The Data Warehouse is the concept of centralizing an organization's analytical data in order to build a structured 360° view of the business.

In this episode, you will learn:
Differences between OLTP and OLAP;
The history of data for decision-making;
How to create a resilient process for understanding the facts in your data.

In this conversation, we also cover the following topics:
Bill Inmon's story;
Pillars of analytical systems;
The new generation of analytical data platforms.

Learn more about data analysis, and how to use technology to make your analytical environment reliable and resilient, in the words of the father of the Data Warehouse.

Bill Inmon = LinkedIn
Luan Moreno = https://www.linkedin.com/in/luanmoreno/