Podcasts about Data modeling

  • 114 PODCASTS
  • 180 EPISODES
  • 43m AVG DURATION
  • 1 EPISODE EVERY OTHER WEEK
  • LATEST: May 19, 2025
Data modeling

POPULARITY (trend chart, 2017–2024)


Latest podcast episodes about Data modeling

Data Culture Podcast
Trends in Data Engineering – from AI to Automation – with Joe Reis

May 19, 2025 · 37:00


Joe Reis and Carsten discuss the evolving landscape of data engineering.

AgileBI
The Hook data modeling pattern with Andrew Foad - Episode #63

May 16, 2025 · 47:56


Join Shane Gibson as he chats with Andrew Foad about his data modeling pattern, "Hook". You can get in touch with Andrew via LinkedIn or read more on Substack: https://hookcookbook.substack.com

If you want to download the transcript for the podcast, head over to: https://agiledata.io/podcast/agiledata-podcast/the-hook-data-modeling-pattern-with-andrew-foad/#read
If you want to read a summary generated with GenAI, head over to: https://agiledata.substack.com/p/the-hook-data-modeling-pattern-with
Listen to more podcasts on applying AgileData patterns over at https://agiledata.io/podcasts/
Read more on the AgileData Way of Working over at https://wow.agiledata.io/
If you want to join us on the next podcast, get in touch over at https://agiledata.io/podcasts/#contact
Or if you just want to talk about making magic happen with agile and data, you can connect with Shane @shagility on LinkedIn.

Subscribe: Apple Podcast | Spotify | Google Podcast | Amazon Audible | TuneIn | iHeartRadio | PlayerFM | Listen Notes | Podchaser | Deezer | Podcast Addict | Simply Magical Data

The Joe Reis Show
John Giles - The Data Elephant in the Board Room, Data Modeling, and More

May 6, 2025 · 80:23


John Giles joins me to discuss his new book, "The Data Elephant in the Board Room," conceptual modeling, the unreasonable effectiveness of data modeling patterns, and more.

MetaDAMA - Data Management in the Nordics
4#13 - Juha Korpela - Data Consulting and the Role of Data Modeling (Eng)

Apr 14, 2025 · 43:40


«You bring in the knowledge of what works in real life and what doesn't. That is actually what you are being paid for.»

With a year behind him as a solo entrepreneur in his own company, Datakor Consulting, Juha Korpela takes us on a journey through fact-finding missions at what he calls "the middle layer" of organizations — the strategic area between high-level business strategy and tactical project execution. It is here, he believes, that data consultants can create the most significant and lasting value. We discuss the pitfalls of standardized frameworks and "blueprint" approaches offered by many consulting firms, and why tailored solutions based on a deep understanding of organizational culture always yield better results. Juha shares his methods for knowledge transfer that ensure organizations can continue succeeding with their data work long after the consultant has left the project.

Here are Winfried's key takeaways:

Skills
• The key skill as a data consultant, whether at the strategic or the solution/project level, is to understand «what the customer really needs.»
• Listening: active listening is the key to understanding.
• Create mental models: when talking to stakeholders, you need to be able to put the information you capture together into a mental model.
• Understanding first. Tech comes after.
• Working with data modeling is about listening to stories about how the business works. Understanding business processes is key.
• Understanding stories about the business and what is relevant for data modeling is a skill everyone can profit from, but it is seldom taught.
• Data modeling is a fact-finding mission: understanding what the organization does, how it does things, and where this could be improved.

Impact
• A data consultant's impact depends on the organization, its structure, and its level of maturity.
• If there is a CDO or CIO to connect to, that can be a good way to create results and visibility.
• As a data consultant, it is also important to find a place in the organization where you have shared views and understanding.
• If you begin bottom-up, you need to be ready to sell this upwards in the organization.

Limits
• Consultants can help with the initial projects to get you started.
• Consultants can help figure out processes and the operating model, and design what is needed.
• Organizations need to create long-term ownership in house.
• Running and maintaining needs to fit with the organization's culture, structure, needs, maturity, etc.
• Models, blueprints, and frameworks that you get from the outside can get you started, but they do not work in the long run.

Patterns
• Data consultants can see certain patterns emerging across an industry.
• That knowledge of patterns, lessons learned, and experience is valuable to apply.
• The knowledge you bring in is what defines your value, more than specific skills.
• It is easy for people in organizations to get stuck. Consultants can help as a fresh wind.

Knowledge transfer
• As a consultant you bring in new knowledge, and you need to account for the fact that organizations want to transfer that knowledge to internal staff.
• Find ways to create custom training packages to facilitate knowledge sharing.
• You aim for the organization to succeed with its work, also after the consultants are gone.

Consultant aaS
• Do we move from being consultants to becoming a service offering?
• Service models can create a distance between consultants and clients.
• You need a clear understanding of the impact of models that include ownership and responsibility transfer, e.g., outsourcing operational tasks.

Catalog & Cocktails
TAKEAWAYS - What Do Data Modeling, Data Vault and Knowledge Graphs Have in Common?

Catalog & Cocktails

Play Episode Listen Later Mar 28, 2025 6:33


Patrick Cuba, Snowflake Architect, explores the fundamental connections between Data Modeling, Data Vault, and Knowledge Graphs—revealing how these approaches all center on the same core elements: business entities, their relationships, and their historical states. Patrick unpacks why, despite the AI revolution, human expertise remains irreplaceable for accountability and real business value. If you're wrestling with cognitive overload in the face of data explosion or wondering how different modeling disciplines can complement each other, this episode delivers the practical insights you need.

Catalog & Cocktails
What Do Data Modeling, Data Vault and Knowledge Graphs Have in Common?

Catalog & Cocktails

Play Episode Listen Later Mar 28, 2025 70:55


Patrick Cuba, Snowflake Architect, explores the fundamental connections between Data Modeling, Data Vault, and Knowledge Graphs—revealing how these approaches all center on the same core elements: business entities, their relationships, and their historical states. Patrick unpacks why, despite the AI revolution, human expertise remains irreplaceable for accountability and real business value. If you're wrestling with cognitive overload in the face of data explosion or wondering how different modeling disciplines can complement each other, this episode delivers the practical insights you need.

Digital Health Talks - Changemakers Focused on Fixing Healthcare
AI-Driven Healthcare: Sutter Health's Journey to Scale Clinical Innovation

Mar 25, 2025 · 27:38


Join Kiran Mysore, Chief Data & Analytics Officer at Sutter Health, as he shares insights on scaling AI adoption, building sustainable innovation infrastructure, and transforming healthcare delivery through data-driven approaches. Learn how one of the nation's largest health systems is successfully integrating advanced analytics and AI into clinical practice while maintaining governance and ethical standards.

• Real-world implementation of clinical AI at scale
• Building sustainable innovation infrastructure
• Data strategy for improved patient and provider experiences
• Governance frameworks for responsible AI adoption
• Implementation and impact, from digital scribes to diagnostics

Guests: Kiran Mysore, Chief Data & Analytics Officer at Sutter Health; Shahid Shah, Chairman of the Board, Netspective Foundation

The Joe Reis Show
Freestyle Fridays - Old School vs. New School Data Modeling

Feb 7, 2025 · 17:44


In data modeling - and pretty much anything else - do you choose "old school" or "new school"? In other words, do you move slowly and methodically, or fast?

MetaDAMA - Data Management in the Nordics
4#10 - Geir Myrind - The Revival of Data Modeling (Nor)

Feb 3, 2025 · 41:25


"Vi modellerer for å forstå, organisere og strukturere dataene." / "We model to understand, organize, and structure the data."This episode with Geir Myrind, Chief Information Architect, offers a deep dive into the value of data modeling in organizations. We explore how unified models can enhance the value of data analysis across platforms and discuss the technological development trends that have shaped this field. Historical shifts toward more customized systems have also challenged the way we approach data modeling in public agencies such as the Norwegian Tax Administration.Here are my key takeaways:StandardizationStandardization is a starting point to build a foundation, but not something that let you advance beyond best practice.Use standards to agree on ground rules, that can frame our work, make it interoperable.Conceptual modeling is about understanding a domain, its semantics and key concepts, using standards to ensure consistency and support interoperability.Data ModelingModeling is an important method to bridge business and data.More and more these conceptual models gain relevance for people outside data and IT to understand how things relate.Models make it possible to be understood by both humans and machines.If you are too application focused, data will not reach its potential and you will not be able to utilize data models to their full benefits.This application focus which has been prominent in mainstream IT for many years now is probably the reason why data modeling has lost some of its popularity.Tool advancement and new technology can have an impact on Data Management practices.New tools need a certain data readiness, a foundation to create value, e.g. a good metadata foundation.Data Modeling has often been viewed as a bureaucratic process with little flexibility.Agility in Data Modeling is about modeling being an integrated part of the work - be present, involved, addressed.The information architect and data modeling cannot be a secretary to the development process but needs to be involved as an active part in the cross-functional teams.Information needs to be connected across domains and therefore information modeling should be connected to business architecture and process modeling.Modeling tools are too often connected only to the discipline you are modeling within (e.g. different tools for Data vs. Process Modeling).There is substantial value in understanding what information and data is used in which processes and in what way.The greatest potential is within reusability of data, its semantics and the knowledge it represents.The role of Information ArchitectInformation Architects have played a central role for decades.While the role itself is stable it has to face different challenges today.Information is fluctuant and its movement needs to be understood, be it through applications or processes.Whilst modeling is a vital part of the work, Information Architects need to keep a focus on the big picture and the overhauling architecture.Information architects are needed both in projects and within domains.There is a difference between Information and Data Architects. 
Data Architects focus on the data layer, within the information architecture, much closer to decisions made in IT.The biggest change in skills and competency needs for Information Architects is that they have to navigate a much more complex and interdisciplinary landscape.MetadataData Catalogs typically include components on Metadata Management.We need to define Metadata broader - it includes much more than data about data, but rather data about things.

The Joe Reis Show
Remco Broekmans - Data Modeling, Data Vault, and More

Jan 29, 2025 · 71:17


Remco Broekmans and I chat about data modeling and the business, Data Vault, and using AI to accelerate data modeling.

The Joe Reis Show
Jamie Davidson - Modern Data Modeling

Jan 21, 2025 · 51:07


Jamie Davidson (Chief Product Officer at Omni, Former VP of Product at Looker) joins me to chat about "modern" data modeling, going from a startup to Google and back to a startup, and much more. Omni: https://omni.co/

The Joe Reis Show
Freestyle Fridays - The Year Ahead (Why Data Modeling Matters, AI, Being Human, etc)

Jan 3, 2025 · 21:39


It's 2025! We made it! ;) In this podcast, I rant about why data modeling matters more than ever, AI, and why humans will seek out "human" things in 2025 and beyond. ❤️ Your support means a lot. Please like and rate this podcast on your favorite podcast platform.

The Data Stack Show
222: The Future of Data Modeling: Breaking Free from Tables with Best-Selling Author, Joe Reis of Ternary Data

Dec 31, 2024 · 60:48


Highlights from this week's conversation include:
• Joe's Recent Projects and Work (0:55)
• Joe's New Book and Inspiration for Writing It (4:39)
• Challenges in Data Education (7:00)
• Internal Training Programs (10:02)
• Creative Problem Solving (17:46)
• Evaluating Candidates' Skills (21:18)
• Market Value and Career Growth (24:03)
• AI's Impact on Hiring (27:47)
• Content Production and Quality (31:56)
• The Evolution of AI and Data (34:00)
• Challenges of Automation (36:12)
• Convergence of Data Fields (40:26)
• Shortcomings of Relational Models (42:09)
• Inefficiencies of Poor Data Modeling (47:10)
• Discussion on Resource Constraints (51:50)
• The Role of Language Models (53:13)
• AI in Migration Projects (57:00)
• Joe's Teaser for a New Project (59:05)
• Final Thoughts and Closing Remarks (1:00:07)

The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we'll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data. RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack, visit rudderstack.com.

The Data Stack Show
The PRQL: From Tables to AI: The Future of Data Modeling with Best-Selling Author, Joe Reis of Ternary Data

Dec 30, 2024 · 4:29


The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we'll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data. RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack, visit rudderstack.com.

MetaDAMA - Data Management in the Nordics
Holiday Special: Joe Reis - A Journey around the World of Data (Eng)

Dec 16, 2024 · 53:47


«Data Management is an interesting one: If it fails, what's the feedback loop?»

For the Holiday Special of Season 4, we've invited the author of «Fundamentals of Data Engineering», host of «The Joe Reis Show», «Mixed Model Arts» sensei, and «recovering Data Scientist» Joe Reis. Joe has been a transformative voice in the field of data engineering and beyond. He is also the author of an upcoming book with the working title "Mixed Model Arts", which redefines data modeling for the modern era.

This episode covers the evolution of data science, its early promise, and its current challenges. Joe reflects on how the role of the data scientist has been misunderstood and diluted, emphasizing the importance of data engineering as a foundational discipline. We explore why data modeling—a once-vital skill—has fallen by the wayside and why it must be revived to support today's complex data ecosystems. Joe offers insights into the nuances of real-time systems, the significance of data contracts, and the role of governance in creating accountability and fostering collaboration.

We also highlight two major book releases: Joe's "Mixed Model Arts", a guide to modernizing data modeling practices, and our host Winfried Etzel's book on federated Data Governance, which outlines practical approaches to governing data in fast-evolving decentralized organizations. Together, these works promise to provide actionable solutions to some of the most pressing challenges in data management today.

Join us for a forward-thinking conversation that challenges conventional wisdom and equips you with insights to start rethinking how data is managed, modeled, and governed in your organization.

Some key takeaways:

Make Data Management tangible
• Data Management is often not clear enough to be understood, to have feedback loops, or to ensure responsibility for what good looks like.
• Because Data Management is not always clear enough, there is pressure to make it more tangible.
• That pressure is also applied to Data Governance, through new roles like Data Governance Engineers, DataGovOps, etc.
• These roles mash together enforcing policies and designing policies.

Data Contracts
• "Shift Left" in data needs to be understood more clearly, towards a closer understanding of and collaboration with source systems.
• Data Contracts are necessary, but they are no different from interface files in software: they are about understanding behavior and expectations.
• Data Contracts are not only about control, but also about making issues visible.

Data Governance
• Think of Data Governance as political parties. Some might be liberal, some more conservative.
• We need to make Data Governance lean, integrated, and collaborative, while at the same time ensuring oversight and accountability.
• People need a reason to care about governance rules and to be held accountable.
• If not, Data Governance «(...) ends up being that committee of waste.»
• The current way Data Governance is done doesn't work. It needs a new look.
• Enforcing rules that people see no connection to, or ownership of, is doomed to fail.
• We need to view ownership from two perspectives, legal and business. They are different.

Data Modeling
• Business processes, domains, and standards are some of the building blocks for data.
• Data Modeling should be an intentional act, not something you do on the side.
• The literature on Data Modeling is old; we are stuck in a table-centric view of the world.

The MongoDB Podcast
EP. 252 Mastering Data Modeling in MongoDB: Insights from a Strategy Expert

Dec 13, 2024 · 11:36


Join us for an insightful discussion with a MongoDB expert as we delve into the intricacies of data modeling and design reviews. With over a decade of experience, our guest shares valuable strategies for optimizing data models, avoiding common pitfalls, and ensuring successful implementations. Learn about the importance of aligning your data structure with application needs and discover practical design patterns that can enhance your MongoDB projects. Whether you're new to MongoDB or looking to refine your skills, this episode is packed with actionable insights to elevate your data modeling game.

Crazy Wisdom
Episode #415: Rethinking Databases: EdgeDB's Blueprint for a Developer-Friendly Future

Dec 6, 2024 · 55:03


On this episode of the Crazy Wisdom Podcast, host Stewart Alsop is joined by Yury Selivanov, the CEO and co-founder of EdgeDB, for a fascinating discussion about the reinvention of relational databases. Yury explains how EdgeDB addresses modern application development challenges by improving developer experience and rethinking decades-old database paradigms. They explore how foundational technologies evolve, the parallels between software and real-world systems like the electrical grid, and the emerging role of AI in coding and system design. You can connect with Yury through his personal Twitter account @1st1 (https://twitter.com/1st1) and EdgeDB's official Twitter @EdgeDatabase (https://twitter.com/edgedatabase). Check out this GPT we trained on the conversation!

Timestamps
• 00:00 Introduction to the Crazy Wisdom Podcast
• 00:27 What is EdgeDB?
• 00:58 The Evolution of Databases
• 04:36 Understanding SQL and Relational Databases
• 07:48 The Importance of Database Relationships
• 09:27 Schema vs. No-Schema Databases
• 14:14 EdgeDB: SQL 2.0 and Developer Experience
• 23:09 The Future of Databases and AI Integration
• 26:43 AI's Role in Software Development
• 27:20 Challenges with AI-Generated Code
• 29:56 Human-AI Collaboration in Coding
• 34:00 Future of Programming Languages
• 44:28 Junior Developers and AI Tools
• 50:02 EdgeDB's Vision and Future Plans

Key Insights
• Reimagining Relational Databases: Yury Selivanov explains how EdgeDB represents a modern rethinking of relational databases. Unlike traditional databases designed with 1970s paradigms, EdgeDB focuses on improving developer experience by introducing object-oriented schemas and hierarchical query capabilities, bridging the gap between modern programming needs and legacy systems.
• Bridging Data Models and Code: A key challenge in software development is the object-relational impedance mismatch, where relational database tables do not naturally map to object-based data models in programming languages. EdgeDB addresses this by providing a high-level data model and query language that aligns with how developers think and work, eliminating the need for complex ORMs.
• Advancing Query Language Design: Traditional SQL, while powerful, can be cumbersome for application development. EdgeDB introduces EdgeQL, a modern query language designed for readability, hierarchical data handling, and developer productivity. This new language reduces the friction of working with relational data in real-world software projects.
• AI as a Tool, Not a Replacement: While AI has transformed coding productivity, Yury emphasizes that it is a tool to assist, not replace, developers. LLMs like GPT can generate code, but the resulting systems still require human oversight for debugging, optimization, and long-term maintenance, highlighting the enduring importance of experienced engineers.
• The Role of Schema in Data Integrity: Schema-defined databases like EdgeDB allow developers to codify business logic and enforce data integrity directly within the database. This reduces the need for application-level checks, simplifying the codebase while ensuring robust data consistency—a feature that remains critical even in the era of AI.
• Integrating AI into Databases: EdgeDB is exploring innovative integrations of AI, such as automatic embedding generation and retrieval-augmented generation (RAG) endpoints, to enhance data usability and simplify complex workflows. These capabilities position EdgeDB as a forward-thinking tool in the rapidly evolving landscape of AI-enhanced software.
• Balancing Adoption and Usability: To encourage adoption, EdgeDB is incorporating familiar tools like SQL alongside its advanced features, lowering the learning curve for new users. This approach combines innovation with accessibility, ensuring that developers can transition seamlessly to the platform while benefiting from its modern capabilities.

The Jason Cavness Experience
Bridger (Waleed) Ammar has been leading top-tier, high-impact data-modeling projects since 2006 in research, education, engineering and product.

Nov 10, 2024 · 129:52


Bridger (Waleed) Ammar has been leading top-tier, high-impact data-modeling projects since 2006 in research, education, engineering, and product.

Sponsor: The Jason Cavness Experience is sponsored by CavnessHR. CavnessHR provides HR to companies with 49 or fewer people, through a tech platform that automates HR while providing access to a dedicated HR Business Partner. www.CavnessHR.com

Go to www.thejasoncavnessexperience.com for the podcast on your favorite platforms.

Bridger's Bio
Bridger (Waleed) Ammar has been leading top-tier, high-impact data-modeling projects since 2006 in research, education, engineering, and product. A few experiences which particularly helped shape his thinking:
• Co-founded the ACM chapter at Alexandria University. Defended his PhD in Language-Universal Large Models (L-ULM) in 2016, with Tom Mitchell and Kuzman Ganchev as examiners.
• Taught at Alexandria University, Carnegie Mellon University, and the University of Washington. Published in Nature, JAMA, NeurIPS, ACL, and EMNLP, among other top-tier venues. Advised mission-critical organizations on AI strategy, including the NSF (USA), SDAIA (KSA), a leading gaming platform (USA), and a leading freight forwarding platform (KSA). At King Saud University, he learned the holistic power of safely integrating different cultures for global good.
• At Alexandria University, he contributed to a digital model for historical artifacts, in collaboration with the Alexandria Library. At P&G, he learned the holistic power of mapping the manufacturing process in a data model.
• At IBM, he contributed to the state of the art (SOTA) in using statistics to model biological sequences, in collaboration with DARPA.
• At eSpace, he learned the basics of building sustainable businesses, in collaboration with Alexandria University.
• At Microsoft, he contributed to the then-SOTA in statistical machine translation models, in collaboration with the Cairo Microsoft Innovation Center. At Carnegie Mellon University, as a Google PhD fellow, he developed the SOTA in language-universal models (L-UMs). At Google Shopping, he contributed to the SOTA in mixing random forests with neural networks.
• At the Allen Institute for Artificial Intelligence, he learned the SOTA in managing science from his mentor Oren Etzioni, then developed the SOTA in modeling science. At Google Health, he contributed to the SOTA in building the digital manifestation of living cells in species-agnostic models.
• At Google Research, he learned the SOTA in cost-effective scaling of LLM inference to a billion users.
• At Google Assistant, he learned the SOTA in scalable distribution of data products. At Burning Man, he learned how to safely integrate freedom and self-expression.

We talked about the following and other items:
• Burning Man Experience and Philosophy
• Scientific Progress and Its Impact
• Ethics in Science and Peer Review
• Purpose of Science and Future Discoveries
• Encouraging Young Scientists and Scientific Discoveries
• Future of AI and Its Impact on Various Industries
• Global AI Development and Personal Background
• Is the Singularity coming?
• Paddle boarding and dancing
• AI/ML
• How were the pyramids built?
• Are humans becoming smarter?
• AI ethics

Bridger's Social Media
• Bridger's LinkedIn: https://www.linkedin.com/in/waleedammar/
• Bridger's Email: wammar@higg.world
• Company Website: https://higg.world/
• Company Instagram: https://www.instagram.com/holistic_intelligence/

Speaking of Data
Data Modeling Techniques with Mark Peco

Oct 1, 2024 · 22:33


Mark Peco, CBIP, analytics consultant and instructor, joins host Andrew Miller to discuss data modeling techniques, including data and business capabilities, types of data models, and technical debt. Please visit Data Modeling Essentials for more information on the upcoming TDWI seminar with Mark, where attendees can earn a certificate.

More information:
• TDWI Conferences: https://bit.ly/3XqBhGH
• TDWI Modern Data Leader's Summits: https://bit.ly/4902fuu
• TDWI Virtual Summits: https://bit.ly/31HJ2xr
• Seminars: https://bit.ly/3WxQPr4
• More Speaking of Data Episodes: https://bit.ly/3JsQPWo

Follow us on:
• LinkedIn: https://bit.ly/42zCZZB
• Facebook: https://bit.ly/49uej7j
• Instagram: https://bit.ly/3HM8x57
• X: https://bit.ly/3SsYu9P

Monday Morning Data Chat
#178 - Rob Harmon - Small Data, Efficiency, and Data Modeling

Aug 19, 2024 · 63:41


Rob Harmon joins us to chat about small data, being efficient, data modeling, and much more.

Bricks & Bytes
Will AI Replace Construction Managers? - Amir Berman, Go-To-Market Strategist At Buildots

Bricks & Bytes

Play Episode Listen Later Jul 16, 2024 68:35


In this episode, Amir Berman of Buildots challenges the status quo in construction technology. We explore how AI and data are silently revolutionizing the industry, despite widespread skepticism. Berman shares lessons from his failed startup, proposing controversial new models for ROI and product management in construction tech. We delve into the clash between digital solutions and physical realities, and uncover strategies to bridge this divide. Berman's insights on Performance-Driven Construction Management hint at a future where data, not intuition, drives decisions. As AI begins to automate traditionally human-centric processes, we confront the implications for the workforce and the industry's future. Prepare to have your assumptions challenged.

Tune in to find out about:
✅ Why old ways of tracking progress might be costing you money
✅ The hard truth about getting value from construction tech
✅ How AI spots delays before humans can
✅ Why your project data might be useless without the right tools

Sign up to the #1 Newsletter in Construction Tech. Join over 1,000 like-minded Founders, Investors, and Techies disrupting the way we build, forever: https://bricks-bytes.beehiiv.com/subscribe
LinkedIn: https://www.linkedin.com/company/bricks-bytes/
X/Twitter: https://twitter.com/bricksbytespod
YouTube: https://www.youtube.com/channel/UCmNbunUTIIQDzbJgGJt9_Zg
Instagram: https://www.instagram.com/bricksbytes/

Timestamps:
00:00 - Intro
02:12 - Lessons From Starting & Failing A Startup
08:15 - Innovation Happening In London In Construction
10:42 - Measuring ROI in Construction
12:45 - Product Management in Construction vs Other Industries
15:31 - Buildots: Progress Tracking and Data Modeling for Construction
36:30 - Capturing and Utilizing Performance Data
38:08 - Creating Data Standards for 4D
38:37 - The Power of Productivity Data
41:42 - Innovation and Continuous Improvement
48:35 - Showcasing Value through Case Studies
51:51 - Go-to-Market Strategies in Construction Tech
56:15 - The Challenge of Creating a Clear ROI Formula
59:33 - The New Layer Strategy: Performance-Driven Construction Management
01:06:42 - Charging Based on Construction Volume

The MongoDB Podcast
EP. 222 Vector Search and Data Modeling with MongoDB

Jul 12, 2024 · 11:37


In this episode, recorded live at the Javits Center in New York City, we talk with Henry Weller, Product Manager at MongoDB. Henry shares the latest developments in vector search, now available in the MongoDB Community Edition, and provides expert advice on data modeling for unstructured data. Discover how to leverage implicit structure, optimize search results, and implement best practices for information retrieval. This episode is essential for developers and tech enthusiasts looking to stay ahead in the evolving landscape of data technology.

The Joe Reis Show
Steven Macleod - The Importance of Operational Data Modeling

Jul 2, 2024 · 71:10


Steven Macleod joins me to discuss the importance of operational data modeling, the flaws in current data modeling practices, and the challenges of data modeling at startups and in 3rd party tools. Occam Works: https://occam.works/ LinkedIn: https://www.linkedin.com/in/stevenmacleod/

The Joe Reis Show
Juha Korpela - The Power of Conceptual Data Modeling

Jun 5, 2024 · 57:26


Juha Korpela is a world-renowned expert in conceptual data modeling. He joins me to discuss the power of conceptual data modeling, why the data modeling world is broken today, data products, and much more. LinkedIn: https://www.linkedin.com/in/jkorpela/

DMRadio Podcast
Snowflake Data Cloud Summit Preview

May 30, 2024 · 44:53


The Snowflake Conference cometh! Some 200 vendors and 20,000 attendees will converge on the Moscone Convention Center in San Francisco to hear the latest and greatest in the world of data. AI, Analytics, Data Modeling and more will be on the docket. Learn more by checking out this episode of DM Radio, as Host @eric_kavanagh interviews Hyoun Park of Amalgam Insights, and Keith Belanger of SQLdbm!

Tech Lead Journal
#175 - How to Solve Real-World Data Analysis Problems - David Asboth

May 20, 2024 · 57:10


“All data scientists and analysts should spend more time in the business, outside the data sets, just to see how the actual business works. Because then you have the context, and then you understand the columns you're seeing in the data."

David Asboth, author of “Solve Any Data Analysis Problem” and co-host of the “Half Stack Data Science” podcast, shares practical tips for solving real-world data analysis challenges. He highlights the gap between academic training and industry demands, emphasizing the importance of understanding the business problem and maintaining a results-driven approach. David offers practical insights on data dictionaries, data modeling, data cleaning, data lakes, and prediction analysis. We also explore AI's impact on data analysis and the importance of critical thinking when leveraging AI solutions. Tune in to level up your skills and become an indispensable, results-driven data analyst.

Listen out for:
• Career Journey - [00:01:38]
• Half Stack Data Science Podcast - [00:06:33]
• Real-World Data Analysis Gaps - [00:10:46]
• Understanding the Business/Problem - [00:15:36]
• Result-Driven Data Analysis - [00:18:28]
• Feedback Iteration - [00:21:44]
• Data Dictionary - [00:23:48]
• Data Modeling - [00:27:18]
• Data Cleaning - [00:30:43]
• Data Lake - [00:35:05]
• Common Data Analysis Tasks - [00:36:50]
• Prediction Analysis - [00:40:23]
• The Impact of AI on Data Analysis - [00:43:15]
• Importance of Critical Thinking - [00:47:05]
• Common Tasks Solved by AI - [00:50:07]
• 3 Tech Lead Wisdom - [00:53:10]

David Asboth's Bio
David is a “data generalist”; currently a freelance data consultant and educator with an MSc. in Data Science and a background in software and web development. With over 6 years' experience teaching, he has taught everyone from junior analysts up to C-level executives in industries like banking and management consulting how to successfully apply data science, machine learning, and AI to their day-to-day roles. He co-hosts the Half Stack Data Science podcast about data science in the real world and is the author of Solve Any Data Analysis Problem, a book about the data skills that aspiring analysts actually need in their jobs, which will be published by Manning in 2024.

Follow David:
• LinkedIn – linkedin.com/in/david-asboth-9256772
• Website – davidasboth.com
• Podcast – halfstackdatascience.com

Our Sponsors
Manning Publications is a premier publisher of technical books on computer and software development topics for both experienced developers and new learners alike. Manning prides itself on being independently owned and operated, and on paving the way for innovative initiatives, such as early access book content and protection-free PDF formats that are now industry standard. Get a 45% discount for Tech Lead Journal listeners by using the code techlead45 for all products in all formats.

Like this episode? Show notes & transcript: techleadjournal.dev/episodes/175. Follow @techleadjournal on LinkedIn, Twitter, and Instagram. Buy me a coffee or become a patron.

The Joe Reis Show
5 Minute Friday - Is Data Modeling a Waste of Time?

May 17, 2024 · 12:22


Is data modeling a waste of time? I meet a number of people who say it is. In this episode, I dissect some of the arguments against data modeling, and give reasons why it matters more than ever today.

The Data Stack Show
189: Customer Data Modeling, The Data Warehouse, Reverse ETL, and Data Activation with Ryan McCrary of RudderStack

May 16, 2024 · 63:52


Highlights from this week's conversation include:
• Ryan's Background and Roles in Data (0:05)
• Data Activation and Dashboard Staleness (1:27)
• Profiles and Data Activation (2:54)
• Customer-Facing Experience and Product Management (3:40)
• Profiles Product Overview (5:10)
• Use Cases for Profiles (6:44)
• Challenges with Data Projects (9:19)
• Entity Management and Account Views (15:33)
• Handling Entities and Duplicates (17:55)
• Challenges in Entity Management (22:18)
• Product Management and Data Solutions (26:08)
• Reverse ETL and Data Movement (31:58)
• Accessibility of Data Warehouses (36:14)
• Profiles and Entity Features (37:47)
• Cohorts Creation and Use Cases (41:17)
• Customer Data and Targeting (43:09)
• Activations and Reverse ETL (45:57)
• ML and AI Use Cases (55:53)
• Data Activation and ML Predictions (57:02)
• Spicy Take and Future Product Features (59:47)
• ETL Evolution and Cloud Tools (1:00:50)
• Unbundling and Future Trends (1:02:10)

The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we'll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data. RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack, visit rudderstack.com.

The Bike Shed
425: Modeling Associations in Rails

May 7, 2024 · 29:39


Stephanie shares an intriguing discovery about the origins of design patterns in software, tracing them back to architect Christopher Alexander's ideas in architecture. Joël is an official member of the Boston bike share system, and he loves it. He even got a notification on the app this week: "Congratulations. You have now visited 10% of all docking stations in the Boston metro area." #AchievementUnlocked, Joël! Joël and Stephanie transition into a broader discussion on data modeling within software systems, particularly how entities like companies, employees, and devices interconnect within a database. They debate the semantics of database relationships and the practical implications of various database design decisions, providing insights into the complexities of backend development. Christopher Alexander and Design Patterns (https://www.designsystems.com/christopher-alexander-the-father-of-pattern-language/) Rails guide to choosing between belongsto and hasone (https://edgeguides.rubyonrails.org/association_basics.html#choosing-between-belongs-to-and-has-one) Making impossible states impossible (https://www.youtube.com/watch?v=IcgmSRJHu_8) Transcript: We're excited to announce a new workshop series for helping you get that startup idea you have out of your head and into the world. It's called Vision to Value. Over a series of 90-minute working sessions, you'll work with a thoughtbot product strategist and a handful of other founders to start testing your idea in the market and make a plan for building an MVP. Join for all seven of the weekly sessions, or pick and choose the ones that address your biggest challenge right now. Learn more and sign up at tbot.io/visionvalue. JOËL: Hello and welcome to another episode of The Bike Shed, a weekly podcast from your friends at thoughtbot about developing great software. I'm Joël Quenneville. STEPHANIE: And I'm Stephanie Minn. And together, we're here to share a bit of what we've learned along the way. JOËL: So, Stephanie, what's new in your world? STEPHANIE: So, I learned a very interesting tidbit. I don't know if it's historical; I don't know if I would label it that. But, I recently learned about where the idea of design patterns in software came from. Are you familiar with that at all? JOËL: I read an article about that a while back, and I forget exactly, but there is, like, a design patterns movement, I think, that predates the software world. STEPHANIE: Yeah, exactly. So, as far as I understand it, there is an architect named Christopher Alexander, and he's kind of the one who proposed this idea of a pattern language. And he developed these ideas from the lens of architecture and building spaces. And he wrote a book called A Pattern Language that compiles, like, all these time-tested solutions to how to create spaces that meet people's needs, essentially. And I just thought that was really neat that software design adopted that philosophy, kind of taking a lot of these interdisciplinary ideas and bringing them into something technical. But also, what I was really compelled by was that the point of these patterns is to make these spaces comfortable and enjoyable for humans. And I have that same feeling evoked when I'm in a codebase that's really well designed, and I am just, like, totally comfortable in it, and I can kind of understand what's going on and know how to navigate it. That's a very visceral feeling, I think. JOËL: I love the kind of human-centric approach that you're using and the language that you're using, right? 
A place that is comfortable for humans. We want that for our homes. It's kind of nice in our codebases, too. STEPHANIE: Yeah. I have really enjoyed this framing because instead of just saying like, "Oh, it's quote, unquote, "best practice" to follow these design patterns," it kind of gives me more of a reason. It's more of a compelling reason to me to say like, "Following these design patterns makes the codebase, like, easier to navigate, or easier to change, or easier to work with." And that I can get kind of on board with rather than just saying, "This way is, like, the better way, or the superior way, or the way to do things." JOËL: At the end of the day, design patterns are a means to an end. They're not an end in of itself. And I think that's where it's very easy to get into trouble is where you're just sort of, I don't know, trying to rack up engineering points, I guess, for using a lot of design patterns, and they're not necessarily in service to some broader goal. STEPHANIE: Yeah, yeah, exactly. I like the way you put that. When you said that, for some reason, I was thinking about catching Pokémon or something like filling your Pokédex [laughs] with all the different design patterns. And it's not just, you know, like you said, to check off those boxes, but for something that is maybe a little more meaningful than that. JOËL: You're just trying to, like, hit the completionist achievement on the design patterns. STEPHANIE: Yeah, if someone ever reaches that, you know, gets that achievement trophy, let me know [laughs]. JOËL: Can I get a badge on GitHub for having PRs that use every single Gang of Four pattern? STEPHANIE: Anyway, Joël, what's new in your world? JOËL: So, on the topic of completing things and getting badges for them, I am a part of the Boston bike share...project makes it sound like it's a, I don't know, an exclusive club. It's Boston's bike share system. I have a subscription with them, and I love it. It's so practical. You can go everywhere. You don't have to worry about, like, a bike getting stolen or something because, like, you drop it off at a docking station, and then it's not your responsibility anymore. Yeah, it's very convenient. I love it. I got a notification on the app this week that said, "Congratulations. You have now visited 10% of all docking stations in the Boston metro area." STEPHANIE: Whoa, that's actually a pretty cool accomplishment. JOËL: I didn't even know they tracked that, and it's kind of cool. And the achievement shows me, like, here are all the different stations you've visited. STEPHANIE: You know what I think would be really fun? Is kind of the equivalent of a Spotify Wrapped, but for your biking in a year kind of around the city. JOËL: [laughs] STEPHANIE: That would be really neat, I think, just to be like, oh yeah, like, I took this bike trip here. Like, I docked at this station to go meet up with a friend in this neighborhood. Yeah, I think that would be really fun [laughs]. JOËL: You definitely see some patterns come up, right? You're like, oh yeah, well, you know, this is my commute into work every day. Or this is that one friend where, you know, every Tuesday night, we go and do this thing. STEPHANIE: Yeah, it's almost like a travelogue by bike. JOËL: Yeah. I'll bet there's a lot of really interesting information that could surface from that. It might be a little bit disturbing to find out that a company has that data on you because you can, like, pick up so much. STEPHANIE: That's -- JOËL: But it's also kind of fun to look at it. 
And you mentioned Spotify Wrapped, right? STEPHANIE: Right. JOËL: I love Spotify Wrapped. I have so much fun looking at it every year. STEPHANIE: Yeah. It's always kind of funny, you know, when products kind of track that kind of stuff because it's like, oh, like, it feels like you're really seen [laughs] in terms of what insights it's able to come up with. But yeah, I do think it's cool that you have this little badge. I would be curious to know if there's anyone who's, you know, managed to hit a hundred percent of all the docking stations. They must be a Boston bike messenger or something [laughs]. JOËL: Now that I know that they track it, maybe I should go for completion. STEPHANIE: That would be a very cool flex, in my opinion. JOËL: [laughs] And, you know, of course, they're always expanding the network, which is a good thing. I'll bet it's the kind of thing where you get, like, 99%, and then it's just really hard to, like, keep up. STEPHANIE: Yeah, nice. JOËL: But I guess it's very appropriate, right? For a podcast titled The Bike Shed to be enthusiastic about a bike share program. STEPHANIE: That's true. So, for today's topic, I wanted to pick your brain a little bit on a data modeling question that I posed to some other developers at thoughtbot, specifically when it comes to associations and associations through other associations [laughs]. So, I'm just going to kind of try to share in words what this data model looks like and kind of see what you think about it. So, if you had a company that has many employees and then the employee can also have many devices and you wanted to be able to associate that device with the company, so some kind of method like device dot company, how do you think you would go about making that association happen so that convenience method is available to you in the code? JOËL: As a convenience for not doing device dot employee dot company. STEPHANIE: Yeah, exactly. JOËL: I think a classic is, at least the other way, is that it has many through. I forget if you can do a belongs to through or not. You could also write, effectively, a delegation method on the device to effectively do dot employee dot company. STEPHANIE: Yeah. So, I had that same inkling as you as well, where at first I tried to do a belongs to through, but it turns out that belongs to does not support the through option. And then, I kind of went down the next path of thinking about if I could do a has one, a device has one company through employee, right? But the more I thought about it, the kind of stranger it felt to me in terms of the semantics of saying that a device has a company as opposed to a company having a device. It made more sense in plain English to think about it in terms of a device belonging to a company. JOËL: That's interesting, right? Because those are ways of describing relationships in sort of ActiveRecord's language. And in sort of a richer situation, you might have all sorts of different adjectives to describe relationships. Instead of just belongs to has many, you have things like an employee owns a device, an employee works for a company, you know because an employee doesn't literally belong to a company in the literal sense. That's kind of messed up. So, I think what ActiveRecord's language is trying to use is less trying to, like, hit maybe, like, the English domain language of how these things relate to, and it's more about where the foreign keys are in the database. STEPHANIE: Yeah. 
I like that point where even though, you know, these are the things that are available to us, that doesn't actually necessarily, you know, capture what we want it to mean. And I had gone to see what Rails' recommendation was, not necessarily for the situation I shared. But they have a section for choosing between which model should have the belongs to, as opposed to, like, it has one association on it. And it says, like you mentioned, you know, the distinction is where you place the foreign key, but you should kind of think about the actual meaning of the data. And, you know, we've talked a lot about, I think, domain modeling [chuckles] on the show. But their kind of documentation says that...the has something relationship says that one of something is yours, that it can, like, point back to you. And in the example I shared, it still felt to me like, you know, really, the device wanted to point to the company that it is owned by. And if we think about it in real-world terms, too, if that device, like, is company property, for example, then that's a way that that does make sense. But the couple of paths forward that I saw in front of me were to rework that association, maybe add a new column onto the device, and go down that path of codifying it at the database level. Or kind of maybe something as, like, an in-between step is delegating the method to the employee. And that's what I ended up doing because I wasn't quite ready to do that data migration. JOËL: Adding more columns is interesting because then you can run into sort of data consistency issues. Let's say on the device you have a company ID to see who the device belongs to. Now, there are sort of two different independent paths. You can ask, "Which company does this device belong to?" You can either check the company ID and then look it up in the company table. Or you can join on the employee and join the employee back under company. And those might give you different answers and that can be a problem with data consistency if those two need to stay in sync. STEPHANIE: Yeah, that is a good point. JOËL: There could be scenarios where those two are allowed to diverge, right? You can imagine a scenario where maybe a company owns the device, but an employee of a potentially different company is using the device. And so, now it's okay to have sort of two different chains because the path through the employee is about what company is using our devices versus which company actually owns them. And those are, like, two different kinds of relationships. But if you're trying to get the same thing through two different paths of joining, then that can set you up for some data inconsistency issues. STEPHANIE: Wow. I really liked what you said there because I don't think enough thought goes into the emergent relationships between models after they've been introduced to a codebase. At least in my experience, I've seen a lot of thought go up front into how we might want to model an ActiveRecord, but then less thought into seeing what patterns kind of show up over time as we introduce more functionality to these models, and kind of understand how they should exist in our codebase. Is that something that you find yourself kind of noticing? Like, how do you kind of pick up on the cue that maybe there's some more thought that needs to happen when it comes to existing database tables? JOËL: I think it's something that definitely is a bit of a red flag, for me, is when there are multiple paths to connect to sort of establish a relationship. 
So, if I were to draw out some sort of, like, diagram of the models, boxes, and arrows or something like that, and then I could sort of overlay different paths through that diagram to connect two models and realize that those things need to be in sync, I think that's when I started thinking, ooh, that's a potential danger. STEPHANIE: Yeah, that's a really great point because, you know, the example I shared was actually a kind of contrived one based on what I was seeing in a client codebase, not, you know, I'm not actually working with devices, companies, and employees [laughs]. But it was encoded as, essentially, a device having one company. And I ended up drawing it out because I just couldn't wrap my head around that idea. And I had, essentially, an arrow from device pointing to company when I could also see that you could go take the path of going through employee [laughs]. And I was just curious if that was intentional or was it just kind of a convenient way to have that direct method available? I don't currently have enough context to determine but would be something I want to pay attention to. Like you said, it does feel like, if not a red flag, at least an orange one. JOËL: And there's a whole kind of science to some of this called database normalization, where they're sort of, like, they all have rather arcane names. They're the first normal form, the second normal form, the third normal form, you know, it goes on. If you look at the definition, they're all also a little bit arcane, like every element in a relation must depend solely upon the primary key. And you're just like, well, what does that mean? And how do I know if my table is compliant with that? So, I think it's worth, if you're Googling for some of these, find an article that sort of explains these a little bit more in layman's terms, if you will. But the general idea is that there are sort of stricter and stricter levels of the amount of sort of duplicate sources of truth you can have. In a sense, it's almost like DRY but for databases, and for your database schema in particular. Because when you have multiple sources of truth, like who does this device belong to, and now you get two different answers, or three different answers, now you've got a data corruption issue. Unlike bugs in code where it's, you know, it can be a problem because the site is down, or users have incorrect behavior, but then you can fix it later, and then go to production, and disruption to your clients is the worst that happened, this sort of problem in data is sometimes unrecoverable. Like, it's just, hey, -- STEPHANIE: Whoa, that sounds scary. JOËL: Yeah, no, data problems scare me in a way that code problems don't. STEPHANIE: Whoa. Could you...I think I interrupted you. But where were you going to go about once you have corrupted data? Like, it's unrecoverable. What happens then? JOËL: Because, like, if I look at the database, do I know who the real owner of this...if I want to fix it, let's say I fix my schema, but now I've got all this data where I've got devices that have two different owners, and I don't know which one is the real one. And maybe the answer is, I just sort of pick one and say, "Oh, the one that was through this association is sort of the canonical one, and we can just sort of ignore the other one." Do I have confidence in that decision? Well, maybe depending on some of the other context maybe, I'm lucky that I can have that. 
The doomsday scenario is that it's a little bit of one, a little bit of the other because there were different code paths that would write to one way or another. And there's no real way of knowing. If there's not too many devices, maybe I do an audit. Maybe I have to, like, follow up with all of my customers and say, "Hey, can you tell me which ones are really your devices?" That's not going to scale. Like, real worst case scenario, you almost have to do, like, a bit of a bankruptcy, where you say, "Hey, all the data prior to this date there's a bit of a question mark on it. We're not a hundred percent sure about it." And that does not feel great. So, now you're talking about mitigation strategies. STEPHANIE: Oof. Wow. Yeah, you did make it sound [laughs] very scary. I think I've kind of been on the periphery of a situation like this before, where it's not just that we couldn't trust the code. It's that we couldn't trust the data in the database either to tell us how things work, you know, for our users and should work from a product perspective. And I was on a previous client project where they had to, yeah, like, hire a bunch of people to go through that data and kind of make those determinations, like you said, to kind of figure out it out for, you know, all of these customers to determine the source of truth there. And it did not sound like an easy feat at all, right? That's so much time and investment that you have to put into that once you get to that point. JOËL: And there's a little bit of, like, different problems at different layers. You know, at the database layer, generally, you want all of that data to be really in a sort of single source of truth. Sometimes that makes it annoying to query because you've got to do all these joins. And so, there are various denormalization strategies that you can use to make that. Or sometimes it's a risk you're going to take. You're going to say, "Look, this table is not going to be totally normalized. There's going to be some amount of duplication, and we're comfortable with the risk if that comes up." Sometimes you also build layers of abstractions on top, so you might have your data sort of at rest in database tables fully normalized and separated out, but it's really clunky to query. So, you build out a database view on top of that that returns data in sort of denormalized fashion. But that's okay because you can always get your correct answer by querying the underlying tables. STEPHANIE: Wow. Okay. I have a lot of thoughts about this because I feel like database normalization, and I guess denormalization now, are skills that I am certainly not an expert at. And so, when it comes to, like, your average developer, how much do you think that people need to be thinking about this? Or what strategies do you have for, you know, a typical Rails dev in terms of how deep they should go [laughs]? JOËL: So, the classic advice is you probably want to go to, like, third to fourth normal form, usually three. There's also like 3.5 for some reason. That's also, I think, sometimes called BNF. Anyway, sort of levels of how much you normalize. Some of these things are, like, really, really basic things that Rails just builds into its defaults with that convention over configuration, so things like every table should have a primary key. And that primary key should be something that's fixed and unique. So, don't use something like combination of first name, last name as your primary key because there could be multiple people with the same name. 
Also, people change their names, and that's not great. But it's great that people can change their names. It's not great to rely on that as a primary key. There are things like look for repeating columns. If you've got columns in your schema with a number prefix at the end, that's probably a sign that you want to extract a table. So, I don't know, you have a movie, and you want to list the actors for a movie. If your movie table has actor 1, actor 2, actor 3, actor 4, actor 5, you know, like, all the way up to actor 20, and you're just like, "Yeah, no, we fill, like, actor 1 through N, and if there's any space left over, we just put nulls in those columns," that's a pretty big sign that, hey, why don't you instead have a, like, actor's table, and then make a, like, has many association? So, a lot of the, like, really basic normalization things, I think, are either built into Rails or built into sort of best practices around Rails. I think something that's really useful for developers to get as a sense beyond learning the actual different normal forms is think about it like DRY for your schema. Be wary of sort of multiple sources of truth for your data, and that will get you most of the way there. When you're designing sort of models and tables, oftentimes, we think of DRY more in terms of code. Do you ever think about that a little bit in terms of your tables as well? STEPHANIE: Yeah, I would say so. I think a lot of the time rather than references to another table just starting to grow on a certain model, I would usually lean towards introducing a join table there, both because it kind of encapsulates this idea that there is a connection, and it makes the space for that idea to grow if it needs to in the future. I don't know if I have really been disciplined in thinking about like, oh, you know, there should really...every time I kind of am designing my database tables, thinking about, like, there should only be one source of truth. But I think that's a really good rule of thumb to follow. And in fact, I can actually think of an example right now where we are a little bit tempted to break that rule. And you're making me reconsider [laughter] if there's another way of doing so. One thing that I have been kind of appreciative of lately is on my current client project; there's just, like, a lot of data. It's a very data-intensive and sensitive application. And so, when we introduce migrations, those PRs get tagged for review by someone over from the DevOps side, just to kind of provide some guidance around, you know, making sure that we're setting up our models to scale well. One of the things that he's been asking me on my couple of code changes I introduced was, like, when I introduced an index, like, it happened to be, like, a composite index with a couple of different columns, and the particular order of those columns mattered. And he kind of prompted me to, like, share what my use cases for this index were, just to make sure that, like, some thought went into it, right? Like, it's not so much that the way that I had done it was wrong, but just that I had, like, thought about it. And I like that as a way of kind of thinking about things at the abstraction that I need to to do my dev work day to day and then kind of mapping that to, like you were saying, those best practices around keeping things kind of performant at the database level. 
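Both refactors described just above, extracting a repeating-columns table and declaring a composite index whose column order matches the queries, can be sketched in a single hypothetical migration. The table, column, and index choices are illustrative assumptions, not the client code being discussed.

# Hypothetical migration: replace movies.actor_1 .. movies.actor_20 with
# an extracted actors table, and add a composite index that matches how
# the rows are actually looked up.
class ExtractActorsFromMovies < ActiveRecord::Migration[7.1]
  def change
    create_table :actors do |t|
      t.references :movie, null: false, foreign_key: true, index: false
      t.string :name, null: false
      t.timestamps
    end

    # Column order matters: this index serves lookups by movie_id alone,
    # or by movie_id and name together, but not by name alone.
    add_index :actors, [:movie_id, :name]
  end
end

class Movie < ApplicationRecord
  has_many :actors # instead of actor_1 .. actor_20 columns
end

If lookups by name alone were also common, a separate index on name would be the usual answer.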
JOËL: I think there's a bit of a parallel world that people could really benefit from dipping a toe in, and that's sort of the typed programming world, this idea of making impossible states impossible or making illegal states unrepresentable. That in the sort of now it's not schemas of database tables or schemas of types that you're creating but trying to prevent data coming into a state where someone could plausibly construct an instance of your object or your type that would be nonsensical in the context of your app, kind of trying to lock that down. And I think a lot of the ways that people in those communities think about...in a sense, it's kind of like database normalization for developers. So, if you're not wanting to, like, dip your toe in more of the sort of database-centric world and, like, read an article from a DBA, it might be worthwhile to look at some of those worlds as well. And I think a great starting point for that is a talk by Richard Feldman called Making Impossible States Impossible. It's for the Elm language. And there are equivalents, I think, in many others as well. STEPHANIE: That's really cool that you are making that connection. I know we've kind of briefly talked about workshops in the past on the show. But if there were a workshop for, you know, that kind of database normalization for developers, I would be the first to sign up [laughs]. JOËL: Hint, hint, RailsConf idea. There's something from your original question that I think is interesting to circle back to, and that's the fact that it was awkward to work through in Ruby to do the work that you wanted to do because the tables were laid out in a certain way. And sometimes, there's certain ways that you need the tables to be in order to be sort of safe to represent data, but they're not the optimal way that we would like to interact with them at the Ruby level. And I think it's okay for not everything in Ruby to be 100% reflective of the structure of the tables underneath. ActiveRecord gives us a great pattern, but everything is kind of one-to-one. And it's okay to layer on some things on top, add some extra methods to build some, like, connections in Ruby that rely on this normalized data underneath but that make life easier for you, or they better just represent or describe the relationships that you have. STEPHANIE: 100%. I was really compelled by your idea of introducing helpers that use more descriptive adjectives for what that relationship is like. We've talked about how Rails abstracted things from the database level, you know, for our convenience, but that should not stop us from, like, leaning on that further, right? And kind of introducing our own abstractions for those connections that we see in our domain. So, I feel really inspired. I might even kind of reconsider the way I handled the original example and see what I can make of it. JOËL: And I think your original solution of doing the delegation is a great example of this as well. You want the idea that a device belongs to a company or has an association called company, and you just don't want to go through that long chain, or at least you don't want that to be visible as an implementation detail. So, in this case, you delegate it through a chain of methods in Ruby. It could also be that you have a much longer chain of tables, and maybe they don't all have associations in Rails and all that. 
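A short sketch of the delegation approach mentioned at the end of that exchange, again with model names assumed from the running example rather than taken from the episode:

class Employee < ApplicationRecord
  belongs_to :company
  has_many :devices
end

class Device < ApplicationRecord
  belongs_to :employee

  # Callers can write device.company; the fact that ownership is derived
  # through the employee stays an implementation detail. allow_nil guards
  # devices that have not been assigned yet.
  delegate :company, to: :employee, allow_nil: true
end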
And I think it would be totally fine as well to define a method on an object where, I don't know, a device, I don't know, has many...let's call it technicians, which is everybody who's ever touched this device or, you know, is on a log somewhere for having done maintenance. And maybe that list of technicians is not a thing you can just get through regular Rails associations. Maybe there's a whole, like, custom query underlying that, and that's okay. STEPHANIE: Yeah, as you were saying that, I was thinking about that's actually kind of, like, active models are the great spot to put those methods and that logic. And I think you've made a really good case for that. JOËL: On that note, shall we wrap up? STEPHANIE: Let's wrap up. Show notes for this episode can be found at bikeshed.fm. JOËL: This show has been produced and edited by Mandy Moore. STEPHANIE: If you enjoyed listening, one really easy way to support the show is to leave us a quick rating or even a review in iTunes. It really helps other folks find the show. JOËL: If you have any feedback for this or any of our other episodes, you can reach us @_bikeshed, or you can reach me @joelquen on Twitter. STEPHANIE: Or reach both of us at hosts@bikeshed.fm via email. JOËL: Thanks so much for listening to The Bike Shed, and we'll see you next week. ALL: Byeeeeeeee!!!!!!!!!! AD: Did you know thoughtbot has a referral program? If you introduce us to someone looking for a design or development partner, we will compensate you if they decide to work with us. More info on our website at: tbot.io/referral. Or you can email us at: referrals@thoughtbot.com with any questions.

The Joe Reis Show
Keith Belanger - The Art of Data Modeling

The Joe Reis Show

Play Episode Listen Later Apr 10, 2024 59:17


Keith Belanger is an OG data modeling practitioner, having been in the game for decades. We chat about a wide range of data modeling topics: What's changed and what's stayed the same? How to model data to fit the business's needs. Agile data modeling. When it works, when it doesn't. Data modeling for data mesh and decentralization. The art of data modeling. How to teach conceptual data modeling to new practitioners. Keith brings a wealth of experience and a practical, no-nonsense perspective. If you're interested in data modeling, don't miss this! LinkedIn: https://www.linkedin.com/in/krbelanger/

The Joe Reis Show
5 Minute Friday - Your Mileage WILL Vary With Analytical Data Modeling

The Joe Reis Show

Play Episode Listen Later Apr 5, 2024 8:13


This morning, the Practical Data Modeling Community held its first group discussion (to be posted very soon). People from all sorts of organizations (the biggest companies in the world, universities, small companies) discussed how they approach analytical data modeling. My major takeaway: your mileage will vary. There's the ideal way of data modeling we're taught, and there's reality. Everyone's situation is different, and there's no one-size-fits-all approach that will work for everyone. The discussion was awesome, and we'll do it again soon. If you're not part of the Practical Data Modeling Community, please join here: https://practicaldatamodeling.substack.com/

Speaking of Data
Data Modeling with Rich Fox

Speaking of Data

Play Episode Listen Later Mar 12, 2024 24:42


Rich Fox, analytics consultant and educator, joins host Andrew Miller to discuss data modeling - including data modeling basics, why data modeling is still relevant, and a preview of the TDWI Virtual Seminar on Data Modeling Essentials.
More information:
· TDWI Conferences: https://bit.ly/3XqBhGH
· TDWI Modern Data Leader's Summits: https://bit.ly/4902fuu
· TDWI Virtual Summits: https://bit.ly/31HJ2xr
· Seminars: https://bit.ly/3WxQPr4
· More Speaking of Data Episodes: https://bit.ly/3JsQPWo
Subscribe to our TDWI Upside newsletter: https://bit.ly/3Ss4Ypz
Follow us on:
· LinkedIn: https://bit.ly/42zCZZB
· Facebook: https://bit.ly/49uej7j
· Instagram: https://bit.ly/3HM8x57
· X: https://bit.ly/3SsYu9P

The Data Stack Show
179: Time Series Data Management and Data Modeling with Tony Wang of Stanford University

The Data Stack Show

Play Episode Listen Later Feb 28, 2024 50:42


Highlights from this week's conversation include:
· Tony's background and research focus (3:35)
· Challenges in academia and industry (6:15)
· Ph.D. student's routine (10:47)
· Academic paper review process (15:26)
· Aha moments in research (20:05)
· Academic lab structure (23:09)
· The decision to move from hardware to data research (24:43)
· Research focus on time series data management (27:40)
· Data modeling in time series and OLAP systems (32:01)
· Issues and potential solutions for parquet format (37:32)
· Role of external indices in parquet files (42:19)
· Tony's open source project (47:11)
· Final thoughts and takeaways (49:30)
The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we'll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.
RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack visit rudderstack.com.

Software Huddle
The Real Work of Data Engineering with Joe Reis

Software Huddle

Play Episode Listen Later Feb 27, 2024 59:00


Today, we have Joe Reis on the show. Joe is the co-author of Fundamentals of Data Engineering, probably the best and most comprehensive book on data engineering you could hope to read. We talk about the culture of Data Engineering, its relationship with Data Science, the downside of chasing bleeding-edge technology, and approaches to Data Modeling. Joe's got lots to say, lots of opinions, and is super knowledgeable. So even if Data Engineering or Data Science isn't your thing, we think you're still going to really enjoy listening to the interview.

Financial Modeler's Corner
Beyond the Numbers: Lance Rubin's Insights on AI, Power BI, and the Future of Financial Modeling

Financial Modeler's Corner

Play Episode Listen Later Feb 15, 2024 49:03


In this episode, Paul Barnhurst is joined by Lance Rubin, who has a wealth of financial modeling experience, including two decades working for corporations (PwC & KPMG, Investec Bank Corporate Finance & Advisory, National Australia Bank) before starting his own practice. It was during this time that he gained a love of modeling. Lance was previously the CFO of fin-tech start-up Banjo (SME lender) and Sequel CFO, whilst founding Model Citizn, a financial modeling, analytics, and automation consultancy firm, following his 20 years in corporate. He has delivered a number of online training workshops in financial modeling and Power BI, is a certified trainer for the FMI, and wrote a large portion of the CA (ANZ) study guide on financial modeling. He was also a judge at the world's first Financial Modeling Innovation Awards and presented at the Power BI Global Summit in 2022.
Listen to this episode as Lance shares:
· His learning from the worst models he has come across.
· His journey into Financial Modeling and how he fell in love with modeling.
· The key to Financial Transformations.
· The HACK Framework.
· The importance of Power BI and similar tech.
· His advice on the use of tools and shortcuts.
· His position on controversial modeling issues, including circular references, dynamic arrays, modeling standards, AI in modeling, and more.
Quotes:
“Financial Transformation is the combination of process, tech, and people. With people being the most important.”
“The HACK Framework (Hygiene, Automation, Capability, Knowledge) allows you to bring technical skills and soft skills together... You need to develop capability and knowledge and you need to bring that together with hygiene and automation.”
“Data Modeling and Financial Modeling sound the same but are fundamentally quite different.”
Sign up for the Advanced Financial Modeler Accreditation or FMI Fundamentals today and receive 15% off by using the special show code 'Podcast'. Visit www.fminstitute.com/podcast and use code Podcast to save 15% when you register.
Go to https://earmarkcpe.com, download the app, take the quiz, and you can receive CPE credit.
Follow Lance:
· Website - https://www.modelcitizn.com/
· LinkedIn - https://www.linkedin.com/in/financial-modelling/
Follow Paul:
· Website - https://www.thefpandaguy.com/
· LinkedIn - https://www.linkedin.com/in/thefpandaguy/
· TikTok - https://www.tiktok.com/@thefpandaguy
· YouTube - https://www.youtube.com/@thefpaguy8376
Follow Financial Modeler's Corner:
· LinkedIn Page - https://www.linkedin.com/company/financial-modeler-s-corner/?viewAsMember=true
· Newsletter - Subscribe on LinkedIn: https://www.linkedin.com/build-relation/newsletter-follow?entityUrn=7079020077076905984
In today's episode:
(00:22) Intro
(00:47) Welcoming Lance
(01:00) The worst financial model Lance has ever seen
(05:14) Takeaway from the worst financial model
(08:02) Lance's background
(11:27) Key to Finance Transformation
(17:37) The HACK Framework
(18:43 - 19:29) Validate your Financial Modeling Skills with FMI's Accreditation Program (ad)
(19:30) What led to Lance's love for Modeling?
(22:38) The most interesting model
(26:30) Importance of Power BI and similar Tech
(31:51) AI and Financial Modeling
(36:10) The learning that saved Lance a lot of time while Modeling
(41:16) Rapid Fire
(46:58) Connect with Lance
(48:15) Outro

The Joe Reis Show
Steve Hoberman - Data Modeling's Past, Present, and Future.

The Joe Reis Show

Play Episode Listen Later Feb 13, 2024 54:05


I consider Steve Hoberman to be one of the original data modelers, having practiced and taught data modeling since the 1990s. He also runs the venerable Technics Publications, which I consider the foremost publisher of data-oriented books. Steve and I discuss data modeling's past, present, and future. If you're into data modeling, this is a must-listen. Enjoy! Technics Publications: https://technicspub.com/ Steve Hoberman LinkedIn - https://www.linkedin.com/in/stevehoberman/

The Joe Reis Show
5 Minute Friday - Data Day Texas, Practical Data Modeling Updates, and More

The Joe Reis Show

Play Episode Listen Later Feb 2, 2024 7:50


Today's rant is a random grab bag of stuff - my thoughts on Data Day Texas, updates to Practical Data Modeling, and more.

The Joe Reis Show
5 Minute Friday - How I Define Data Modeling (today)

The Joe Reis Show

Play Episode Listen Later Jan 19, 2024 10:30


I chat about my working (and evolving) definition of data modeling. In short, we need to move beyond a human-centric view of data modeling, as most data is consumed by machines.

Learning Tech Talks
Revolutionizing Business Intelligence with AI: Transforming Data Analytics Across Industries with Steve Wasick

Learning Tech Talks

Play Episode Listen Later Jan 16, 2024 63:47


Are we on the brink of a narrative revolution in AI analytics? This week I had a conversation with Steve Wasick, Founder & CEO of infoSentience, where we explored the intersection of AI, storytelling, and data analytics. Unlike the conventional approach of large language models (LLMs) like GPT, we explored the world of 'conceptual automata', a game-changing technology that redefines how we perceive and generate narratives from complex data sets. We touched on the challenges that traditional LLMs face in maintaining accuracy and context, especially when dealing with large volumes of data. We also highlighted the unique flexibility and customization capabilities of conceptual automata, which allow for more accurate, interactive, and context-aware reporting – akin to having an intelligent, data-savvy journalist at your fingertips. But, as is true with all my discussions, this wasn't just about the tech. We hit on the future of AI in business, the ethics of AI-generated content, and most importantly, the irreplaceable value of the human element in interpreting and strategizing from AI-provided insights. For anyone intrigued by the potential of AI to transform not just how we analyze data but how we narrate and understand it, you'll want to check it out.
Show Notes:
00:00 - Introduction: AI in Reporting and Analytics
13:56 - InfoSentience's Unique AI Approach
28:09 - AI's Impact on the Role of Analysts
32:09 - Human Behavior and Psychology in Tech
38:15 - Adapting to Technological Shifts in Professional Roles
46:04 - Customization and Data Modeling in AI Solutions
50:16 - Content Creation and Transparency in AI
57:40 - Limitless Potential of AI in Business Intelligence

The AI Fundamentalists
Non-parametric statistics

The AI Fundamentalists

Play Episode Listen Later Jan 10, 2024 32:49 Transcription Available


Get ready for 2024 and a brand new episode! We discuss non-parametric statistics in data analysis and AI modeling. Learn more about applications in user research methods, as well as the importance of key assumptions in statistics and data modeling that must not be overlooked. After you listen to the episode, be sure to check out the supplement material in Exploring non-parametric statistics.
Welcome to 2024 (0:03)
· AI, privacy, and marketing in the tech industry
· OpenAI's GPT store launch. (The Verge)
· Google's changes to third-party cookies. (Gizmodo)
Non-parametric statistics and its applications (6:49)
· A solution for modeling in environments where data knowledge is limited.
· Contrast non-parametric statistics with parametric statistics, plus their respective strengths and weaknesses.
Assumptions in statistics and data modeling (9:48)
· The importance of understanding statistics in data science, particularly in modeling and machine learning. (Probability distributions, Wikipedia)
· Discussion about a common assumption of representing data with a normal distribution, which oversimplifies complex real-world phenomena.
· The importance of understanding data assumptions when using statistical models.
Statistical distributions and their importance in data analysis (15:08)
· Discuss the importance of subject matter experts in evaluating data distributions, as assumptions about data shape can lead to missed power and incorrect modeling.
· Examples of different distributions used in various situations, such as Poisson for wait times and counts, and discrete distributions like uniform and Gaussian normal for continuous events.
· Consider the complexity of selecting the appropriate distribution for statistical analysis; understand the specific distribution and its properties.
Non-parametric statistics and its applications in data analysis (19:31)
· Non-parametric statistics are more robust to outliers and can generalize across different datasets without requiring domain expertise or data massaging.
· Methods rely on rank ordering and have less statistical power compared to parametric methods, but are more flexible and can handle complex data sets better.
· Discussion about the usefulness and limitations, which require more data to detect meaningful changes compared to parametric tests.
Non-parametric tests for comparing data sets (24:15)
· Non-parametric tests, including the K-S test and chi-square test, which can compare two sets of data without assuming a specific distribution.
· Can also be used for machine learning, classification, and regression tasks, even when the underlying dat
What did you think? Let us know.
Do you have a question or a discussion topic for the AI Fundamentalists? Connect with them to comment on your favorite topics:
· LinkedIn - Episode summaries, shares of cited articles, and more.
· YouTube - Was it something that we said? Good. Share your favorite quotes.
· Visit our page - see past episodes and submit your feedback! It continues to inspire future episodes.

The Joe Reis Show
5 Minute Friday - Practical Data Modeling

The Joe Reis Show

Play Episode Listen Later Jan 5, 2024 12:44


I often get some questions - What happened to data modeling? Where do I learn data modeling? Where the heck is your new book? Well, at least some of your questions will be answered in this podcast. I'm launching a new project called Practical Data Modeling on Substack. You'll get weekly articles, early chapters of my new data modeling book, community discussions, and much more. Subscribe to Practical Data Modeling: https://practicaldatamodeling.substack.com/

Secrets of Data Analytics Leaders
A Fresh Look at Data Modeling Part 2: Rediscovering the Lost Art of Data Modeling - Audio Blog

Secrets of Data Analytics Leaders

Play Episode Listen Later Dec 14, 2023 11:28


Data modeling is a core skill of data engineering, but it is missing or inadequate in many data engineering teams. These teams focus on moving data with little attention to shaping the data. They engineer processes, not products. Full data engineering is both process and product engineering, and that calls for data modeling. Published at: https://www.eckerson.com/articles/a-fresh-look-at-data-modeling-part-2-rediscovering-the-lost-art-of-data-modeling

Teach Better Talk
Strategies for Effective Data Modeling, Training & Communication - School Administrator Mastermind

Teach Better Talk

Play Episode Listen Later Dec 13, 2023 15:03


Welcome to another enlightening episode of the School Administrator Mastermind Recap! Join hosts Joshua Stamper and Jeff Gargas as they unravel the intricacies of "Strategies for Effective Data Modeling, Training, and Communication." In this dynamic recap, Joshua and Jeff dive deep into the world of data, offering practical insights and actionable strategies for administrators seeking to enhance their approach to data modeling, training, and communication. Discover innovative methods to transform raw data into meaningful insights, cultivate a culture of data literacy among your team, and effectively communicate data-driven decisions. Key Highlights:

The Data Stack Show
165: SQL Queries, Data Modeling, and Data Visualization with Colin Zima of Omni

The Data Stack Show

Play Episode Listen Later Nov 22, 2023 54:23


Highlights from this week's conversation include:
· Colin's Background and Starting Omni (1:48)
· Defining "good" at Google search early in his career (4:42)
· Looker's Unique Approach to Analytics (9:48)
· The paradigm shift in analytics (10:52)
· The architecture of Looker and its influence (12:04)
· Combatting the challenge of unbundling in the data stack (14:26)
· The evolution of analytics engineering (21:50)
· Enhancing user flexibility in Omni (23:44)
· The evolution of BI tools (32:53)
· What does the future look like for BI tools? (35:14)
· The role of Python and notebooks in BI (39:48)
· The product experience of Omni and its vision (45:27)
· Expectations for the future of Omni (47:52)
· The relationship between algorithms and business logic (50:51)
The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we'll talk to data engineers, analysts, and data scientists about their experience around building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data.
RudderStack helps businesses make the most out of their customer data while ensuring data privacy and security. To learn more about RudderStack visit rudderstack.com.

Voice of the DBA
Having Good Data Modeling Standards

Voice of the DBA

Play Episode Listen Later Nov 22, 2023 3:43


While I was working with a customer recently, they mentioned that they have certain standards for their objects. They require a PK, and it's the name of the table with _PK added. They also have some standards, like CustomerName vs. CustomerNames, for various data items. In fact, they have enough that they built a tool to scan their database code to ensure that changes to the QA and UAT environments adhere to these modeling standards. I wonder how many organizations have formal standards. While I've often tried to set some naming guidelines, I often haven't seen anything (or created anything) formal enough to build a tool around. I would like to, and I think it's a good idea, but it's often something that isn't handled in advance. Read the rest of Having Data Modeling Standards

Secrets of Data Analytics Leaders
A Fresh Look at Data Modeling Part 1: The What and Why of Data Modeling - Audio Blog

Secrets of Data Analytics Leaders

Play Episode Listen Later Nov 14, 2023 10:53


Many organizations abandoned data modeling as they embraced big data and NoSQL. Now they find that data modeling continues to be important, perhaps more important today than ever before. With a fresh look you'll see that today's data modeling is different from past practices – much more than physical design for relational data. Published at: https://www.eckerson.com/articles/a-fresh-look-at-data-modeling-part-1-the-what-and-why-of-data-modeling

Web and Mobile App Development (Language Agnostic, and Based on Real-life experience!)
Snowpal Education: JSON Data Modeling - A Simple Example

Web and Mobile App Development (Language Agnostic, and Based on Real-life experience!)

Play Episode Listen Later Oct 31, 2023 0:44


You can represent hierarchical data in many ways, with one of the most popular formats being JSON. If you are a UI developer, you are likely consuming JSON Data, and if you are a server-side engineer, you are providing JSON Data (via REST or Graph APIs, for instance). It is imperative that your JSON Schema looks accurate and is a true structural representation of the problem you are setting out to solve. If it isn't, it's surely going to cause a bit of pain as your product's adoption grows (think backward compatibility, refactoring, extensibility, and more such challenges). In this course, we will take a recent feature we implemented on our Web App and design the actual JSON Data Model alongside exploring alternative structures.
Purchase the course in one of 2 ways:
1. Go to https://getsnowpal.com and purchase it on the Web.
2. On your phone:
   (i) If you are an iPhone user, go to http://ios.snowpal.com and watch the course on the go.
   (ii) If you are an Android user, go to http://android.snowpal.com.

Critical Point
Artificial intelligence and insurance, part 2: Rise of the machine-learning models

Critical Point

Play Episode Listen Later Oct 16, 2023 38:23 Transcription Available


In our second Critical Point episode about AI applications in insurance, we drill down into the topic of machine learning and particularly its evolving uses in healthcare. Milliman Principal and Consulting Actuary Robert Eaton leads a conversation with fellow data science leaders about the models they use, the challenges of data accessibility and quality, and working with regulators to ensure fairness. They also pick sides in the great debate of Team Stochastic Parrot versus Team Sparks AGI. You can read the episode transcript on our website.

Data Engineering Podcast
Reduce Friction In Your Business Analytics Through Entity Centric Data Modeling

Data Engineering Podcast

Play Episode Listen Later Jul 9, 2023 72:54


Summary
For business analytics, the way that you model the data in your warehouse has a lasting impact on what types of questions can be answered quickly and easily. The major strategies in use today were created decades ago, when the software and hardware for warehouse databases were far more constrained. In this episode Maxime Beauchemin of Airflow and Superset fame shares his vision for the entity-centric data model and how you can incorporate it into your own warehouse design.
Announcements
Hello and welcome to the Data Engineering Podcast, the show about modern data management. Introducing RudderStack Profiles. RudderStack Profiles takes the SaaS guesswork and SQL grunt work out of building complete customer profiles so you can quickly ship actionable, enriched data to every downstream team. You specify the customer traits, then Profiles runs the joins and computations for you to create complete customer profiles. Get all of the details and try the new product today at dataengineeringpodcast.com/rudderstack (https://www.dataengineeringpodcast.com/rudderstack). Your host is Tobias Macey and today I'm interviewing Max Beauchemin about the concept of entity-centric data modeling for analytical use cases.
Interview
· Introduction
· How did you get involved in the area of data management?
· Can you describe what entity-centric modeling (ECM) is and the story behind it?
· How does it compare to dimensional modeling strategies?
· What are some of the other competing methods?
· Comparison to activity schema
· What impact does this have on ML teams? (e.g. feature engineering)
· What role does the tooling of a team have in the ways that they end up thinking about modeling? (e.g. dbt vs. informatica vs. ETL scripts, etc.)
· What is the impact of the underlying compute engine on the modeling strategies used?
· What are some examples of data sources or problem domains for which this approach is well suited?
· What are some cases where entity-centric modeling techniques might be counterproductive?
· What are the ways that the benefits of ECM manifest in use cases that are downstream from the warehouse?
· What are some concrete tactical steps that teams should be thinking about to implement a workable domain model using entity-centric principles?
· How does this work across business domains within a given organization (especially at "enterprise" scale)?
· What are the most interesting, innovative, or unexpected ways that you have seen ECM used?
· What are the most interesting, unexpected, or challenging lessons that you have learned while working on ECM?
· When is ECM the wrong choice?
· What are your predictions for the future direction/adoption of ECM or other modeling techniques?
Contact Info
· mistercrunch (https://github.com/mistercrunch) on GitHub
· LinkedIn (https://www.linkedin.com/in/maximebeauchemin/)
Parting Question
· From your perspective, what is the biggest gap in the tooling or technology for data management today?
Closing Announcements
Thank you for listening! Don't forget to check out our other shows. Podcast.__init__ (https://www.pythonpodcast.com) covers the Python language, its community, and the innovative ways it is being used. The Machine Learning Podcast (https://www.themachinelearningpodcast.com) helps you go from idea to production with machine learning. Visit the site (https://www.dataengineeringpodcast.com) to subscribe to the show, sign up for the mailing list, and read the show notes. If you've learned something or tried out a project from the show then tell us about it! Email hosts@dataengineeringpodcast.com (mailto:hosts@dataengineeringpodcast.com) with your story. To help other people find the show please leave a review on Apple Podcasts (https://podcasts.apple.com/us/podcast/data-engineering-podcast/id1193040557) and tell your friends and co-workers.
Links
· Entity Centric Modeling Blog Post (https://preset.io/blog/introducing-entity-centric-data-modeling-for-analytics/?utm_source=pocket_saves)
· Max's Previous Appearances:
  · Defining Data Engineering with Maxime Beauchemin (https://www.dataengineeringpodcast.com/episode-3-defining-data-engineering-with-maxime-beauchemin)
  · Self Service Data Exploration And Dashboarding With Superset (https://www.dataengineeringpodcast.com/superset-data-exploration-episode-182)
  · Exploring The Evolving Role Of Data Engineers (https://www.dataengineeringpodcast.com/redefining-data-engineering-episode-249)
  · Alumni Of AirBnB's Early Years Reflect On What They Learned About Building Data Driven Organizations (https://www.dataengineeringpodcast.com/airbnb-alumni-data-driven-organization-episode-319)
· Apache Airflow (https://airflow.apache.org/)
· Apache Superset (https://superset.apache.org/)
· Preset (https://preset.io/)
· Ubisoft (https://www.ubisoft.com/en-us/)
· Ralph Kimball (https://en.wikipedia.org/wiki/Ralph_Kimball)
· The Rise Of The Data Engineer (https://www.freecodecamp.org/news/the-rise-of-the-data-engineer-91be18f1e603/)
· The Downfall Of The Data Engineer (https://maximebeauchemin.medium.com/the-downfall-of-the-data-engineer-5bfb701e5d6b)
· The Rise Of The Data Scientist (https://flowingdata.com/2009/06/04/rise-of-the-data-scientist/)
· Dimensional Data Modeling (https://www.thoughtspot.com/data-trends/data-modeling/dimensional-data-modeling)
· Star Schema (https://en.wikipedia.org/wiki/Star_schema)
· Database Normalization (https://en.wikipedia.org/wiki/Database_normalization)
· Feature Engineering (https://en.wikipedia.org/wiki/Feature_engineering)
· DRY == Don't Repeat Yourself (https://en.wikipedia.org/wiki/Don%27t_repeat_yourself)
· Activity Schema (https://www.activityschema.com/)
  · Podcast Episode (https://www.dataengineeringpodcast.com/narrator-exploratory-analytics-episode-234/)
· Corporate Information Factory (https://amzn.to/3NK4dpB) (affiliate link)
The intro and outro music is from The Hug (http://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/Love_death_and_a_drunken_monkey/04_-_The_Hug) by The Freak Fandango Orchestra (http://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/) / CC BY-SA (http://creativecommons.org/licenses/by-sa/3.0/)