In this episode, we dive deep into the world of data decentralization with Zhamak Dehghani, creator of data mesh and CEO of NextData. Zhamak, a pioneer in the field, shares her experience and insights into how data mesh is reshaping the way organizations manage and leverage their data. She explains the key concepts behind data mesh and its importance in solving data scalability challenges. Drawing on her expertise in designing data-driven solutions, she walks us through real-world use cases and the core principles that make data mesh a game-changer for modern businesses.

Host: Jake Aaron Villarreal leads a top AI recruitment firm in Silicon Valley, www.matchrelevant.com, uncovering the stories of funded startups and going behind the scenes to tell their founders' journeys. If you are a growing AI startup or have a great story to tell, email us at jake.villarreal@matchrelevant.com
Please Rate and Review us on your podcast app of choice!

Get involved with Data Mesh Understanding's free community roundtables and introductions: https://landing.datameshunderstanding.com/

If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here. Episode list and links to all available episode transcripts here. Provided as a free resource by Data Mesh Understanding. Get in touch with Scott on LinkedIn.

Transcript for this episode (link) provided by Starburst. You can download their Data Products for Dummies e-book (info-gated) here and their Data Mesh for Dummies e-book (info-gated) here.

Alyona's LinkedIn: https://www.linkedin.com/in/alyonagalyeva/

In this episode, Scott interviewed Alyona Galyeva, Principal Data Engineer at Thoughtworks. To be clear, she was only representing her own views on the episode.

Some key takeaways/thoughts from Alyona's point of view:

?Controversial? People keep coming up with simple phrasing and a few sentences about where to focus in data mesh. But if you're headed in the right direction, data mesh will be hard; it's a big change. You might want things to be simple, but simplistic answers aren't going to lead to lasting, high-value change in the way your org does data. Be prepared to put in the effort to make mesh a success at your organization; there are no magic answers.

?Controversial? Stop focusing so much on the data work as the point. It's a way to derive and deliver value, but the data work isn't the value itself. Relatedly, ask what key decisions people need to make and what is currently preventing them from making those decisions. Those are likely to be your best use cases.

When it comes to Zhamak's data mesh book, use it as a source of inspiration rather than a manual. Large concepts like data mesh cannot be copy/pasted; they must be adapted to your organization.

It's really important to understand your internal data flows.
Many people inside organizations - especially the data people - think they know how data flows across the organization, especially for key use cases. But when you dig in, they often don't. Those are some key places to investigate deeply first to add value.

On centralization versus decentralization, it's better to think of each decision as a slider rather than one or the other. You need to find your balances, and it's okay to take your time as you shift more towards decentralization in many aspects. Change management is best done incrementally.

?Controversial? A major misunderstanding of data mesh among some long-time data people is that it is just sticking a better self-serve consumption...
Darren's LinkedIn: https://www.linkedin.com/in/darrenjwoodagileheadofproduct/

Darren's Big Data LDN presentation: https://youtu.be/vUjoJrl_MEs?si=WzB0sBStVIAyqDJs

In this episode, Scott interviewed Darren Wood, Head of Data Product Strategy at ITV, a UK media and broadcast company. To be clear, he was only representing his own views on the episode.

Scott note: I use "coalition of the willing" to refer to those willing to participate early in your data mesh implementation. I wasn't aware of the historical context here, especially when it came to being used in war, e.g. the Iraq war of the early 2000s. I apologize for using a phrase like this.

Some key takeaways/thoughts from Darren's point of view:

Overall, when thinking about moving to product thinking in data, it's as much about behavior change as action. You have to understand how humans react to change and support that. You can't expect change to happen overnight - patience, persistence, and empathy are all crucial. Transformation takes time and teamwork.

?Controversial? In data mesh, it's crucial to think about the flexibility and adaptability of your approach. Things will change; your understanding of how you deliver value will change. Your key targets will change.
Be prepared or you will miss the main point of product thinking in data.

When choosing your initial domains and use cases in data mesh, think about big-picture benefits. You aren't looking for exact value measurements for return on investment, but you do want to target a tangible impact, e.g. if we do X, we think we can increase revenue in Y part of the business by Z%.

Zhamak defines a data product quite well in her book on data mesh. But data as a product is a much broader notion of bringing product management best practices to data. That's harder to define but quite important to get right.

When thinking about product discovery - what do data consumers actually need...
Key points:

Thus far, most of the generative AI work Zhamak has seen is not much of a differentiator. Companies are building far better chatbots, but that hasn't really changed the game.

When it comes to any ML work - and GenAI is just a subset of ML work - engineers need data products to make their data work easy: reliable sources of data, the ability to version, etc. Data mesh obviously plays well there.

Relatedly, we need to continue to make it easier for people to leverage data products for GenAI. Engineers shouldn't have to spend all their time moving data around and juggling many systems.

GenAI really could be game-changing in data mesh, but right now we don't have enough information to do it well. We need far more metadata around things like data products.

GenAI often gives extremely shallow answers that just aren't that helpful. If we can get better answers, amazing. But right now, it's not there.

Sponsored by NextData, Zhamak's company that is helping ease data product creation.

For more great content from Zhamak, check out her book on data mesh, a book she collaborated on, her LinkedIn, and her Twitter.

Sign up for Data Mesh Understanding's free roundtable and introduction programs here: https://landing.datameshunderstanding.com/

Please Rate and Review us on your podcast app of choice!

If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here. Data Mesh Radio episode list and links to all available episode transcripts here. Provided as a free resource by Data Mesh Understanding / Scott Hirleman. Get in touch with Scott on LinkedIn if you want to chat data mesh. If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/

All music used in this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): Lesfm, MondayHopes, SergeQuadrado, ItsWatR, Lexin_Music, and/or nevesf
Key Points:

We need API-first technologies in data - not just offering APIs, but being able to integrate seamlessly with each other via API. We have that in software, but it's been a long time coming in data. If we want an actual modern data stack, tooling providers need to make a real change.

Simple made easy: we need to make things simple for data product development and consumption. That's not simplistic - it removes unnecessary complexities.

Overall, there is a trend in data where people aren't building things that remove toil. There is an assumption of ever-increasing complexity of use cases, but so much of the work is not that complicated. We need to make it so most people can do most of the work relatively easily without making it overly simplistic - easier said than done, of course.
Highlights from this week's conversation include:

Defining data mesh (6:37)
Addressing the scale of organizational complexity and usage (9:04)
The shift from monolithic to microservices (12:24)
The sociological structure in data mesh (13:59)
Data product generation and sharing in data mesh (17:27)
Data Mesh: Simplifying Data Work (24:09)
Getting Started with Data Mesh (29:14)
Building products for Data Mesh (36:42)
Building a customizable and extensible platform to shape data practice (39:28)
The characteristics of a data product (48:40)
Defining what a data product is not (50:45)
The origin of the term "mesh" in data mesh (53:32)

The Data Stack Show is a weekly podcast powered by RudderStack, the CDP for developers. Each week we'll talk to data engineers, analysts, and data scientists about their experience building and maintaining data infrastructure, delivering data and data products, and driving better outcomes across their businesses with data. RudderStack helps businesses make the most of their customer data while ensuring data privacy and security. To learn more about RudderStack, visit rudderstack.com.
In this bonus episode, Eric and Kostas preview their upcoming conversation regarding Data Mesh with Paolo Platter, Zhamak Dehghani, and Melissa Logan.
Sean's LinkedIn: https://www.linkedin.com/in/seangustafson/

In this episode, Scott interviewed Sean Gustafson, Director of the Data Platform at Delivery Hero. Delivery Hero has been on the data mesh journey for longer than most organizations - over three years.

Some key takeaways/thoughts from Sean's point of view:

It's extremely hard but still important to try to shape your culture through things like your data platform. Who are you trying to make information available to? How do you make it accessible? How do you make data ownership easier?

A key role of the data platform is providing that golden/easy path: showing people easy ways to accomplish what they need with data products. Embed best practices into the platform when possible.

You need a product manager on your data platform team. It's easy-ish to build cool things in data, but understanding and building to user needs is harder and a must. Treat your data platform as a product!

Relatedly, there isn't anything all that special about product management for the data platform. You can take what we've learned from other disciplines - especially software - and tweak it a bit for data. It's not some arcane art.

Focus on KPIs around what you are building and why, especially for your data platform.
It's very hard to measure developer productivity, but that doesn't mean you shouldn't measure it at all.

?Controversial? Be prepared to deal with a lot of qualitative data when measuring success around your data platform. Surveys work far better than most might think.

Good product managers balance the short and long term. You don't want to make drastic, breaking changes to your data platform often, but that doesn't mean you can't take bigger bets and shake things up. Just balance iterative improvements against the bigger picture. Scott note: Zhamak talks about Thomas Kuhn and cumulative progress versus paradigm shifts.

In the same vein, make small bets where small bets will do, but don't be afraid to make big bets when necessary.

?Controversial? It...
Key Points:

The rush to categorize all of our tooling in data has caused many issues - we will see a big shake-up in the future, much like what happened in application development tooling.

So much of data people's time is spent on work that doesn't add value itself - work that should be automated. We need to fix that so data work is about delivering value.

We can learn a lot from virtualization, but data virtualization is not where things should go in general.

Containerization is merely an implementation detail. Much like software developers don't care much about process containers, the same will happen with data product containers - it's all about the experience, and containers significantly improve the experience.

The pendulum swung towards decoupled data tech instead of monolithic offerings with 'The Modern Data Stack', but most of the technologies were not that easy to stitch together. Going forward, we want to keep the decoupled strategy, but we need a better way to integrate - APIs are how it worked in software, why not in data?
Key Points:

The current data product developer experience sucks. In software, we have one simple interface that manages all the pieces needed to get the job done and packages those components. In data, we have to jump through hoops and interfaces across many tools.

Right now, the developer has to manage everything themselves, which is a ton of work. It's also a big risk for lifecycle management - if everything isn't packaged and deployed as one, dependencies drift from each other.

There are many places in software we can learn from for how to do this containerization. Ruby on Rails and Cloud Foundry are good places to look.
Key takeaways:

If you want to say you are doing data mesh, go ahead. If you really want to know whether you are, look to Zhamak's book and her talks - how much are you attempting, and are you going with the thin-slice model, not ignoring any of the pillars? Have some empathy for yourself.

If others want to say they are doing data mesh, it's not really a big deal as long as we aren't trying to emulate the ones fooling themselves or over-marketing.

I'd rather welcome more folks into the tent than gatekeep. Maybe that is just my own prerogative, but I have empathy for people who see what they think the end journey should look like. It's like that transformation picture of building a skateboard, then a bike, then a moped, then a motorcycle, then a little two-door car, etc., with the end idea of building a sedan.
Key Points:

Containers in software abstracted away a number of very cumbersome tasks and encapsulated many of the dependencies software had on its environment. Combined, that meant developers could focus on delivering value instead of on the infra. We need to do the same in data.

It's all about sharing data in a responsible and easy way. That means putting all the components together so you don't have to manage many versions - just like microservices.

How do you make this easy for the data product developer? Bundle everything together. But centralized data products create a lot of potential risk to scalability and flexibility - we just keep trying to centralize in data.
Takeaways:

This is just scratching the surface of generative AI and data mesh - we will have much deeper discussions in future episodes.

Zhamak believes generative AI has a ton of positive real-world potential, especially in data mesh; Scott is more skeptical. But if things like GenAI can only be leveraged by a few large companies trying to collect as much information as possible - especially sensitive information - there are some big potential societal issues that might come from that. We need to democratize the ability to leverage these types of tools.

ChatGPT set off a frenzy. It can be tempting to move incredibly fast towards implementing generative AI. But companies don't have such vast amounts of data that they can throw moderate - or worse - quality data at a model and get something useful out. Garbage in, garbage out is a real concern.

Because they have far less data than essentially the sum of the internet that OpenAI used for ChatGPT, companies need to focus on providing quality data to an LLM (large language model) for it to actually provide good results. Again, otherwise it is garbage in, garbage out.
Due to health-related issues, we are on a temporary hiatus for new episodes. Please enjoy this rerelease of episode 177. As stated in the original show notes, this is one to revisit often, as it is a great level-setting on why we are doing what we do in data mesh.

This is likely to be an episode to revisit. Zhamak explains a simple concept - data should not be copied unless it is owned by a data product - but the why is multi-layered and important. It might be one of the most important yet underestimated aspects of data mesh, because when done right, it truly ensures trust in data - for consumers but also producers. There's a lot of nuance in how Zhamak is thinking about this, but the actual application is quite easy :)
Takeaways:

A data product, by the data mesh definition, cannot be a silo, because it is meant to be consumed by the rest of the organization as part of the greater picture of the organization. Whether that holds in practice remains to be seen :)

APIs are the way we've learned to interconnect services in software, so we need to learn how to leverage the same approaches in data.

Within a single organization, we may be able to get by without a great set of industry-wide interconnectivity standards - the company can fully control its own sharing. But we know that to truly unlock the value of our organizations, cross-organization data sharing - in a scalable and useful way - is necessary. So we need better ways to do that: standards and tooling/technology.

It remains to be seen which will emerge first, the standards or the tooling/technology - a chicken-and-egg issue - but they really must be intertwined to work well.
Takeaways:

It's important to understand that we need enablement to do data mesh well - enablement through technology and enablement through organizational approaches/behavior changes. Doing only one will likely not work.

"...they need to move fast, they cannot be bogged down by centralization of any kind, organization or technology." Scott note: we discuss later the need for centrally provided enablers, but central bottlenecks are the speed and flexibility killer - look to prevent and remove them where possible.

People want to produce data as a normal part of doing their job and make it consumable for the rest of the organization. How can we enable that? Why is it so tough? How do we make it interoperable - and more importantly interconnectable - too?

Right now, the missing core component to do data mesh well is an easy way to create and manage data products. Everyone is having to cobble things together and then layer on the observability, the access control, the interconnectivity, etc. But it's built on a shaky foundation.
Transcript for this episode (link) provided by Starburst. See their Data Mesh Summit recordings here and their great data mesh resource center here. You can download their Data Mesh for Dummies e-book (info-gated) here.

Tania's LinkedIn: https://www.linkedin.com/in/sofia-tania/

Presentation: "Data Mesh testing: An opinionated view of what good looks like": https://www.youtube.com/watch?v=stNZQESndAA

In this episode, Scott interviewed Sofia Tania (she goes by Tania), Tech Principal at Thoughtworks. To be clear, she was only representing her own views on the episode. Scott asked her on especially because of a presentation she did on applying testing - especially important for data contracts - in data mesh.

Scott note: I was apparently getting extremely sick throughout this call, so if I ramble a bit, I apologize. Tania's dog also _really_ wanted to be part of the conversation, so you might hear us both chuckling a bit about her antics. Tania has some really great insights, so I asked her probably the hardest questions of any guest to date. She did a great job answering them though! A lot of the takeaways are about whether we are actually ready to do the testing necessary to ensure quality around data, which I don't think has a clear answer yet :)

Some key takeaways/thoughts from Tania's point of view:

We have to bring software best practices to data, but we should do it smartly and not repeat the mistakes we made in software - let's start from a leveled-up position.
Zhamak has said the same. The question becomes how, but looking at how practices evolved in software should bring us a lot of learnings.

Just pushing ownership of data to the domains won't suddenly solve data quality challenges. The new owners - the domains - have to really understand what ownership means and what quality...
Takeaways:

We should be thinking about how we can get out of batch mode and into streaming mode - yes, technologically, but also: how can we get to making decisions based on smaller amounts of data more frequently, both for automated systems like AI and for our people? Instead of making adjustments or decisions based on big batches of data, we can make smaller course corrections.

"Data mesh is about building responsibility into data and the quality of the data you share and being explicit about that quality."

Make the cost of mistakes that much smaller by making smaller decisions that add up to the bigger decisions - it's not one giant leap, it's many steps that let you avoid hazards as you come across them.

"Make decisions at the speed of the market" is crucial to being nimble - able to react to opportunities or new challenges. To do that, we need to put data in the hands of those closest to the market: the domains.
Takeaways: We can do better in data than what we did learning decentralization in services/the operational side: "We have to level up. We can't repeat the past mistakes. Let's not be silly and fool ourselves just because we have a schema, now we have an amazing system." The services world has learned good ways of communicating between producers and consumers. We should look to learn more from them and look to adapt, then adopt, what works well. We need to change our approach to measuring and reflecting on past decisions - we might have made a decision based on not-great information; does that mean the decision was bad simply because it didn't work out? Probably not, but as Ari Gold said in Entourage, "There are no asterisks in life, only scoreboards…" Can we really get to a place where we allow those asterisks? Zhamak believes we can adopt many software development practices across data - that's pretty key to data mesh - but one area people seem to be skipping over is things like decision records: what were you thinking when you made a past decision, what did you know, and what were your hypotheses? It's easy to judge results, but it's better to judge the judgment :)
Zhamak Dehghani is the CEO and Founder of Nextdata and is also well known as the creator of Data Mesh. Data Mesh is a set of core principles that enables organizations to become data-driven by changing the way they organize their teams and their data architecture. Zhamak walks us through her deep expertise working with enterprise data teams and the story of how Data Mesh came to be. Zhamak also discusses the power of decentralization and how it allows enterprises to scale insights. Zhamak Dehghani's vision for Nextdata is building a world where AI and ML are powered by equitable and responsible data ownership via decentralization. We also discuss data products as a first-class primitive and how data teams can start their Data Mesh journey with practical steps of "change through movement."

Follow Zhamak Dehghani on LinkedIn and Twitter. Follow Nextdata here. Check out "Data Mesh: Delivering Data-Driven Value at Scale" by Zhamak Dehghani on Amazon.

What's New In Data is a data thought leadership series hosted by John Kutay, who leads data and products at Striim. What's New In Data hosts industry practitioners to discuss the latest trends, common patterns in real-world data, and analytics success stories.
Key Takeaways: Postel's Law: be conservative in what you do, be liberal in what you accept from others. We can do better in data than what we did learning decentralization in services: "We have to level up. We can't repeat the past mistakes. Let's not be silly and fool ourselves just because we have a schema, now we have an amazing system." The services world has learned good ways of communicating between producers and consumers. We should look to learn more from them and look to adapt, then adopt, what works well. Zhamak believes we have to learn to prepare our data for future use cases. Scott note: if she means reuse of data being generated for current use cases, most agree. If she means creating data that doesn't currently serve a use case, almost everyone else seems to disagree. Time will tell.

More on Postel's Law: https://ardalis.com/postels-law-robustness-principle/
Key Takeaways: We need to be better about getting on the same page regarding some semantics in data mesh. Otherwise, it's hard to work together internally and across organizations to move the industry forward. There are so many things we've learned about how to break systems into smaller components on the services side, about preventing tight coupling, but the data world has yet to apply those learnings. We're heading down some paths that we don't need to if we follow past learnings. As an example of the above, early data contract approaches are too tightly coupled around schema. We need to be a little less rigid there, but how remains to be determined. Postel's Law: "Be conservative in what you do, be liberal in what you accept from others." Learn it and think about how to apply it to data so we create more resiliency across our internal data ecosystems. Right now, there isn't much out there on the how. Resiliency at scale is possible on the operational plane, so why not the data plane? We need to be "very mindful and not naïve" about how we integrate in the data world so we don't make the same mistakes we made on the services side.

Postel's Law: https://ardalis.com/postels-law-robustness-principle/

Semantic Diffusion article Zhamak mentioned: https://www.martinfowler.com/bliki/SemanticDiffusion.html
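Since the episode notes say there isn't much out there on "the how" of applying Postel's Law to data, here is one minimal sketch of what a tolerant reader for data records could look like - entirely illustrative, with invented field names and rules, not anything from the episode: liberal on input (tolerate unknown or missing-optional fields), conservative on output (emit only a strictly checked contract).

```python
# A tolerant reader for data records, in the spirit of Postel's Law.
# All field names here are illustrative, not from any real contract.

REQUIRED = {"order_id", "amount"}
OPTIONAL_DEFAULTS = {"currency": "USD"}

def read_record(raw: dict) -> dict:
    """Liberal input: tolerate extra fields and missing optional ones."""
    missing = REQUIRED - raw.keys()
    if missing:
        raise ValueError(f"missing required fields: {sorted(missing)}")
    record = {k: raw[k] for k in sorted(REQUIRED)}
    for key, default in OPTIONAL_DEFAULTS.items():
        record[key] = raw.get(key, default)
    return record  # unknown fields are ignored, not treated as errors

def write_record(record: dict) -> dict:
    """Conservative output: emit exactly the agreed contract, checked."""
    contract = REQUIRED | OPTIONAL_DEFAULTS.keys()
    if record.keys() != contract:
        raise ValueError("record does not match the output contract")
    return dict(record)

# A producer adding a "coupon" field later does not break this consumer:
rec = read_record({"order_id": 7, "amount": 9.5, "coupon": "SAVE5"})
```

The point of the sketch is the asymmetry: the reader survives schema additions upstream, while the writer refuses to leak anything beyond the published contract - a looser coupling than rejecting any record whose schema doesn't match exactly.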
Kim's LinkedIn: https://www.linkedin.com/in/vtkthies/
Mike's LinkedIn: https://www.linkedin.com/in/2mikealvarez/
Ferd's LinkedIn: https://www.linkedin.com/in/ferdscheepers/
Omar's LinkedIn: https://www.linkedin.com/in/kmaomar/

In this episode, guest host Kim Thies, Director of Intelligence Automation at PayPal, facilitated a discussion with Ferd Scheepers, Chief Information Architect at ING, Mike Alvarez, former VP of Digital Services at a large healthcare distribution company (guest of episode #236), and Omar Khawaja, Head of Business Intelligence at Roche (guest of episode #96). As per usual, all guests were only reflecting their own views.

Scott note: I wanted to share my takeaways rather than trying to reflect the nuance of the panelists' views individually.

Before we jump in, I think the main takeaway here is that a data mesh implementation leader's journey can be a lonely one. Find peers and exchange information. You can reach out to me (Scott), but there are also many leaders who want to exchange information with each other. The other takeaway is the meaning of "journey": it's never done. Be prepared to continue to push - it can feel Sisyphean, but it's important to keep moving forward and to expect to continue to drive buy-in.

Scott's Top Takeaways: Everyone sees the "Instagram photos" version of other organizations' data mesh journeys - it's not the reality.
Everyone is struggling with certain aspects of data mesh because, if this were easy, people would read Zhamak's book and be done with it. It's just not realistic to expect that; give your leaders (yourself?) a...
Key Takeaways: We need to make the data product the first-class primitive of information sharing - make data products the basic building block of how we create our internal data/AI ecosystem. (repeat from ZC20) "Tools reshape behavior" - if we want developers to change their relationship to data, we need to give them the capability to do so easily. Or even if not easily, at least not arduously :) The industry sees the value of data mesh, but we need to find much lower-effort ways to create sustainable, large-scale change. We all need to be finding catalysts. There is no reason to try to reinvent a lot of the technology in data at the physical layer. Lake storage, streaming technologies, and ML libraries _at the physical layer_ are great. But we need new ways of accessing and leveraging them to make it far easier to create and manage data products. While everyone seems to be talking data products, there seem to be many different definitions. This has led to a weaker market pull on vendors to improve their tooling to make data mesh more easily possible.

Semantic Diffusion article Zhamak mentioned: https://www.martinfowler.com/bliki/SemanticDiffusion.html
Data Mesh Radio Patreon - get access to interviews well before they are released.

Provided as a free resource by DataStax AstraDB; George Trujillo's contact info: email (george.trujillo@datastax.com) and LinkedIn.

For more great content from Zhamak, check out her book on data mesh, a book she collaborated on, her LinkedIn, and her Twitter.

Takeaways: People are seeing early value from data mesh, and we can see where it could be much greater, but we are still held back by the organizational challenges and even more by the tooling. In many cases, the tooling isn't good enough yet to change developer behavior - if we remove friction, developers will likely want to lean in more on data mesh. We have to find catalysts at the micro level in data mesh to make a massive shift at the macro level. We can't try to change everything through pure force of organizational will. But we haven't found these catalysts yet. It's easy to get lost in the vastness of change in a data transformation around data mesh. Try to focus more at the micro level with a goal of creating cascading reactions that drive the macro. We need to make mesh data products the first-class primitives of information sharing - make them the basic building block of how we create our internal data/AI ecosystem.

Data Mesh Radio is hosted by Scott Hirleman. If you want to connect with Scott, reach out to him at community at datameshlearning.com or on LinkedIn: https://www.linkedin.com/in/scotthirleman/
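As a rough illustration of the "data products as first-class primitives" takeaway - a sketch with invented names, not anything from the episode or from Nextdata's actual product - the idea is that the shared unit bundles data access together with ownership, meaning, and explicit guarantees, instead of being a bare table in a warehouse:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class DataProduct:
    """A data product as a first-class unit of sharing: the data access
    (an 'output port') travels together with ownership, semantics, and
    an explicit quality promise."""
    name: str
    domain: str
    owner: str                       # someone a consumer can actually ask
    description: str
    freshness_sla_hours: int         # an explicit, published guarantee
    output_port: Callable[[], list]  # how consumers read the data

    def read(self) -> list:
        return self.output_port()

orders = DataProduct(
    name="orders",
    domain="sales",
    owner="sales-data-team@example.com",
    description="Confirmed customer orders, one row per order.",
    freshness_sla_hours=24,
    output_port=lambda: [{"order_id": 1, "amount": 9.5}],
)

print(orders.owner)   # consumers can discover who to ask
print(orders.read())  # and read through the declared port
```

The design point is that discoverability and accountability are properties of the primitive itself, not something bolted on in a separate catalog after the fact.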
This episode is part of the greater AI/ML conversation I had with Zhamak, but it's super important to emphasize the importance of trust - enough so that I created a separate quick episode on it. Not just trust in the data itself, but trust that there is easy access and that there will be going forward. A lot of what we have done in data historically has been defensive in nature - especially grabbing a copy of the data now, because who knows when you'll get access to it again. What if we could implicitly trust that there has been care and foresight in the preparation of the data I find, that there is an owner I can ask if I'm confused or curious, that my access won't suddenly go away, and that what's there won't suddenly change without my knowledge? In ML/AI, data scientists have done things in ways that made sense for their situation and challenges. What happens when we make trust inherent? What incremental value does that drive?
This episode is part of the greater AI/ML conversation I had with Zhamak. To start, Zhamak recognizes we aren't where we want to be in terms of capabilities - ways of working or tooling - to make this a reality just yet. But if we can make it so data scientists can trust and easily consume from data products - data products that don't care about the use case type, regular analytics or AI/ML - can we remove a lot of the complexity they face? Do they need feature stores for data they aren't transforming? If they can get continued access and know the quality, why create a separate, fragile process instead of trusting the data product owners upstream?

I wasn't smart enough in the moment to ask whether we need to keep a copy of the training data itself for reproducibility, but folks smarter on ML than I am can answer that one, probably in the affirmative. Overall, there is a lot of complexity in the way we do AI/ML because data scientists can't trust the sources of their data, and they feel the need to take control because, if they don't, their models break. So we need to earn their trust and show them a better way. But again, we aren't there yet, so let's work to make this a reality in the future.
So, continuing the conversation about AI and ML's place in data mesh, we start the episode with Zhamak discussing an unnecessary complication we've created in data - why do data sets/assets have to serve only one user or even one user persona? Yes, product thinking is about creating reuse, but are we thinking about reuse across regular analytics and ML/AI at the same time? We need to make it easy to give access in the language - the native mode of access - of the data consumer. We shouldn't have to care what the data is used for: regular analytics, ML, or anything in between. There's also a very painful bifurcation between upstream data production and data science: the second data enters the data science realm of influence, it's copied over and you lose sight of it for discoverability, governance, security, quality, etc. They pull it in, and then it's essentially impossible to track. That creates all kinds of problems. So why don't we extend data mesh into what they are doing? Do they need to make copies of the data in the feature store? If they have a trusted source of access to the data, do they care?
Humans, by our very nature, categorize things - otherwise how can we really differentiate? How can we learn about new ideas and experiences if not by finding a way to store them in our mental models? And in data, we've been treating diagnostic and descriptive analytics as an entirely different category from the predictive analytics of AI and ML. The way we partition the world in data is around how data will be used, and we then prepare the data as such, to be very fit for purpose. What if instead we partition around the data domain and don't really care about who uses it or how - we want to serve all consumers - what changes? Can we create data that is simply usable by many? Does that actually reduce complexity overall by not having data production designed for specific purposes? Do we really need to treat AI/ML as if its consumption is all that different or special?
We missed our window to record, so I am interpreting what Zhamak says in her Medium post about why she created her company and the general state of the tooling market around data mesh. I also added the full 50-minute recording of our second session, which I had broken up into episodes 4-7.

Data Mesh Radio is brought to you as a community resource by DataStax. Check out their high-scale, multi-region database offering (w/ lots of great APIs) and use code DAAP500 for a free $500 credit (apply under "add payment"): AstraDB
Zhamak Dehghani (CEO & Founder of Next Data) discusses why she started Next Data, and how her company will finally help Data Mesh become a reality. Next Data: https://www.nextdata.com/ Announcement: https://medium.com/@zhamakd/why-we-started-nextdata-dd30b8528fca #datamesh #data
We wrap up another recording of Zhamak's Corner by talking about how we actually start to build data products in a post-pipeline data world. Data tools right now are kind of duct-taped to each other and duct-taped to the pipeline - how do we rethink things starting from the end product, the mesh data product, and hook the tools to that to make interacting with it better? If you build a system that truly focuses on intentionality and responsibility that people can see, it creates trust. Away with the data black box!
This is likely to be an episode to revisit. Zhamak explains a simple concept - data should not be copied unless it is owned by a data product - but the why is multi-layered and important. It might be one of the most important yet underestimated aspects of data mesh because, when done right, it truly ensures trust in data - for the consumer but also the producer. There's a lot of nuance in how Zhamak is thinking about this, but the actual application is quite easy :)
In this episode, Scott and Zhamak covered the misconception many have when she says "leave the data where it is" - it's about leaving the ownership with those who should have it, not leaving the data in the source system. If you only keep data in the source system, all you have is current state, so your data isn't even immutable or bi-temporal! They also discussed the need to be smarter about processing data - should it happen at the source or at query time? There isn't a universal approach, but we also shouldn't have to move the data around just to process it; bring the processing to the data. Lastly, we need systems to get smarter about efficient processing. We have far too much manual work in making data processing efficient.
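To make the immutability/bi-temporality point concrete, here is a minimal sketch (the table and field names are illustrative, not from the episode): a bi-temporal record keeps both when a fact held in the real world (valid time) and when the system learned it (processing time), so corrections append rather than overwrite - unlike a current-state-only source system, where the old value is simply gone.

```python
from datetime import date

# Bi-temporal records: each row carries valid time (when the fact held
# in the real world) and processing time (when we recorded it). Updates
# append new rows instead of overwriting, so history is preserved.
records = [
    # (customer, address,    valid_from,        recorded_on)
    ("c1", "12 Oak St", date(2023, 1, 1), date(2023, 1, 2)),
    # A late-arriving correction: on 2023-06-05 we learn the customer
    # had actually moved on 2023-06-01.
    ("c1", "9 Elm Ave", date(2023, 6, 1), date(2023, 6, 5)),
]

def address_as_of(customer, valid_date, known_date):
    """What address did we believe, as of known_date, the customer
    had on valid_date? Answerable only because nothing is overwritten."""
    candidates = [
        r for r in records
        if r[0] == customer and r[2] <= valid_date and r[3] <= known_date
    ]
    return max(candidates, key=lambda r: r[2])[1] if candidates else None

# On 2023-06-02 the move had happened but wasn't recorded yet:
print(address_as_of("c1", date(2023, 6, 2), date(2023, 6, 2)))  # 12 Oak St
# With later knowledge, the same question gets the corrected answer:
print(address_as_of("c1", date(2023, 6, 2), date(2023, 7, 1)))  # 9 Elm Ave
```

A source system holding only current state can answer neither question - it would return today's address for every date, which is exactly the loss the episode warns about.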
What tech is already available that could be used for data mesh? There are so many amazing approaches and technologies in data, but they've been used only for the pipeline approach. We need to think more like developers - not accepting the grunt work or the death-by-a-thousand-cuts of data - and take a hard look at what we've done historically in data and what should be replaced. Oh, and Zhamak is hiring.
What do we need to develop to provide a better experience for data product developers and consumers? How can we make it easy for consumers to move from discover to learn to trust to use? We still have a long, long way to go with analytical APIs - they are basically only for sharing raw data at the moment - how can we better share the embedded information? And how can we give data product developers the ability to do the right thing by default, optimizing for ease of use and for more generalist developers instead of for hyper-specialized tools experts? There have been two very separate universes - the operational and analytical planes - how do we change that and leverage software engineering practices in analytics?
What can we do now relative to data mesh with what we have? People want to move, not wait for the tools to evolve. We can start to shift in anticipation of tooling getting better. It might not make things a ton better now, but when tools start to emerge, then we can jump ahead quickly. Learn from what happened in the API revolution and don't compromise on interoperability - that will just lead to high quality data silos, which is not a great outcome. And we need to get to a place with data where consumers have a delightful experience going from discover to learn to trust to use with little friction.
Data Mesh has taken the technology and data industry by storm and is easily one of the hottest topics today. Oftentimes, the harder parts of Data Mesh don't get as much attention or coverage. This was an open session at the Utah Data Engineering Meetup (November 2022) to ask Zhamak about the hard parts of data mesh, the parts that inspired her to start a company to address them head-on.
Who will be the data product developer in data mesh? There has been a misconception, in Zhamak's view, that application developers should be the ones building the data products as well - but she thinks they already have a full-time role :) Still, we need someone applying software engineering practices and data know-how to building data products. Right now, to do data work, you need way too much tool knowledge instead of data understanding. We have hyper-specialized data roles - ML engineer, data engineer, etc. - when we should have data developers who can tackle these challenges better.
Inverse Conway Maneuver Definition: https://www.thoughtworks.com/radar/techniques/inverse-conway-maneuver
“If you want to unlock the value of your data by generating data-driven value, and you want to do it reliably and resiliently at scale, then you need to consider data mesh." Zhamak Dehghani is the author of the "Data Mesh" book. In this episode, we discussed data mesh in depth - a concept she founded in 2018 that has since become an industry trend. We started our conversation by discussing the current challenges of working with data, such as the data centralization approach and why current data tools are still inadequate. Zhamak then described data mesh and why organizations should adopt it to generate data-driven value at scale. She then explained the four principles of data mesh: domain ownership, data as a product, the self-serve data platform, and federated computational governance.

Listen out for:
Career Journey - [00:06:49]
Challenges Working with Data - [00:10:19]
Centralization of Data - [00:13:53]
Why Current Tools Are Not Adequate - [00:16:00]
Data Mesh & Its Drivers - [00:19:32]
Principle of Domain Ownership - [00:25:54]
Principle of Data as a Product - [00:35:57]
Principle of the Self-Serve Data Platform - [00:40:51]
Principle of Federated Computational Governance - [00:46:01]
3 Tech Lead Wisdom - [00:52:23]

_____

Zhamak Dehghani's Bio
Zhamak Dehghani is the CEO and founder of a stealth tech startup reimagining the future of the data developer experience. She founded the concept of Data Mesh in 2018 and has since been implementing the concept and evangelizing it with the wider industry. She is the author of the Software Architecture: The Hard Parts and Data Mesh books. Zhamak serves on multiple tech advisory boards. She has worked as a technologist for over 24 years and has contributed to multiple patents in distributed computing communications. She is an advocate for the decentralization of all things, including architecture, data, and ultimately power.
Follow Zhamak: Twitter – @zhamakd LinkedIn – linkedin.com/in/zhamak-dehghani

Our Sponsors
Mental well-being is a silent pandemic. According to the WHO, depression and anxiety cost the global economy over USD 1 trillion every year. It's time to make a difference! Learn how to enhance your life through a master class on mental wellness. Visit founderswellbeing.com/masterclass and enter TLJ20 for a 20% discount.

The iSAQB® Software Architecture Gathering is the international conference highlight for all those working on solution structures in IT projects: primarily software architects, developers, professionals in quality assurance, and also system analysts. A selection of well-known international experts will share their practical knowledge on the most important topics in state-of-the-art software architecture. The conference takes place online from November 14 to 17, 2022, and we have a 15% discount code for you: TLJ_MP_15.

DevTernity 2022 (devternity.com) is the top international software development conference with an emphasis on coding, architecture, and tech leadership skills. The lineup is truly stellar and features many legends of software development like Robert "Uncle Bob" Martin, Kent Beck, Scott Hanselman, Venkat Subramaniam, Kevlin Henney, and many others! The conference takes place online, and we have a 10% discount code for you: AWSM_TLJ.

Like this episode? Subscribe on your podcast app. Follow @techleadjournal on LinkedIn, Twitter, and Instagram. Pledge your support by becoming a patron. For episode show notes, visit techleadjournal.dev/episodes/107.
With the data space more complex than ever, many people throw their hands in the air, and use solutions that fix many (but not all) of their problems. Today's guest saw that complexity, and leaned into it. Zhamak Dehghani is the creator of the data mesh. Where others saw chaos, she saw an opportunity. Zhamak dives into decentralization, all things data mesh, how data architecture can empower your workers, and much more.--------“Seeing the real world problems made me curious about the data space—scratching the surface and seeing the discord between the reality of the complex world we're living in with data. The solutions weren't up to the task for dealing with that complexity, so it got me to look further for a solution.” — Zhamak Dehghani--------Time Stamps* (0:00) A brief history of data architecture* (2:45) An intro to the data mesh* (13:08) How analytics impact data engineers* (18:54) The socio-technical approach* (23:26) Getting started in the data mesh* (28:59) Data mesh vs. Data fabric--------SponsorThis podcast is presented by Alation.Hear more radical perspectives on leading data culture at Alation.com/podcast--------LinksConnect with Zhamak on LinkedInCheck out Zhamak's book Data Mesh
Transcript for this episode (https://docs.google.com/document/d/1uQ-o4BewpWADsUWTUXz6vayygnIiqAnp-q0X_PS6We4/edit?usp=sharing (link)) provided by Starburst. See their Data Mesh Summit recordings https://www.starburst.io/learn/events-webinars/datanova-on-demand/?utm_campaign=starburst-brand&utm_medium=outbound&utm_source=&utm_type=&utm_content=dmradiodnvid&utm_term= (here) and their great data mesh resource center https://www.starburst.io/info/distributed-data-mesh-resource-center/?utm_campaign=starburst-brand&utm_medium=outbound&utm_source=&utm_type=&utm_content=dmradiodmcenter&utm_term= (here). You can download their Data Mesh for Dummies e-book (info gated) https://starburst.io/info/data-mesh-for-dummies/?utm_campaign=starburst-brand&utm_medium=outbound&utm_source=&utm_type=&utm_content=dmradiodnvid&utm_term= (here).
Data Mesh at PayPal blog post: https://medium.com/paypal-tech/the-next-generation-of-data-platforms-is-the-data-mesh-b7df4b825522
JGP's All Things Open talk (free virtual registration): https://2022.allthingsopen.org/sessions/building-a-data-mesh-with-open-source-technologies/
JGP's LinkedIn: https://www.linkedin.com/in/jgperrin/
JGP's Twitter: @jgperrin / https://twitter.com/jgperrin
JGP's YouTube: https://www.youtube.com/c/JeanGeorgesPerrin
JGP's Website: https://jgp.ai/

In this episode, Scott interviewed Jean-Georges Perrin, AKA JGP, Intelligence Platform Lead at PayPal. JGP is probably the first guest to lean into using "data quantum" instead of "data product". JGP did want to emphasize that, as of now, he was only discussing the implementation for his team, the GCSC IA (Global Credit Risk, Seller Risk, Collections Intelligence Automation), within PayPal.

Some key takeaways/thoughts from JGP's point of view:
Data mesh as it's been laid out by Zhamak obviously leaves a lot of room for innovation. For some, that's great. Others want the blueprint, and it's okay to wait for the blueprint. But JGP and team are excited to innovate!
PayPal's 3 main initial target outcomes from data mesh: A) faster and easier data discovery, B) easier to use the data in a governed way, and C) increase data consumer trust in data.
PayPal's initial data consumers are data scientists, so their platform and data quanta are built to serve that audience first.
Really consider what you want to prove out in your MVP. Is that a minimum viable A) data quantum, B) data platform, C) data mesh, or D) something else? Only doing a data quantum probably sets you up for trouble, and a platform alone won't be tested until it has data quanta on it.
Data contracts are crucial to making trustability actually measurable and agreed upon. Otherwise, it's far too easy to have miscommunication between data producers and consumers, which leads to lack/loss of trust.
Producers, don't set your data contract terms too strictly when first launching a data quantum. There's no need to over-engineer - despite how interesting that can sometimes be... For too long, we have tried to keep software...
It is once again that time of year when our host, Cindi Howson, shares her favorite data and analytics book recommendations. In this special annual episode, we feature three of the industry's top data writers, thinkers, and fellow podcasters. Tim Harford comes to the conversation with his new book, The Data Detective, and big-picture ideas about how traits like curiosity serve data scientists so well. Zhamak Dehghani shares her concept of the Data Mesh, especially as it relates to sharing data across business verticals. Finally, in his book, Effective Data Storytelling, Brent Dykes compels readers to think carefully about the way they craft the message or narrative around the data they're interpreting.

Tune in to learn:
Making sense of the world through a data lens with Tim Harford (05:18)
Tim's favorite of the ten data commandments (07:11)
Guard against using data as a control mechanism (15:14)
How does curiosity create a healthier relationship with data? (19:57)
Data Mesh with Zhamak Dehghani (32:31)
Blending centralized and decentralized schools of thought (36:39)
Data Mesh isn't for everyone (48:11)
How do you begin your data mesh transformation? (53:28)
Brent Dykes and the importance of data storytelling (1:02:17)
Why do data scientists need to be better communicators? (1:11:13)
The Venn diagram of data storytelling (1:15:14)

Mentions:
The Data Detective: Ten Easy Rules to Make Sense of Statistics by Tim Harford
Data Mesh: Delivering Data-Driven Value at Scale by Zhamak Dehghani
Effective Data Storytelling: How to Drive Change with Data, Narrative, and Visuals by Brent Dykes

Get even more insights from data and analytics leaders like Tim, Zhamak, and Brent on The Data Chief. Mission.org is a media studio producing content for world-class clients. Learn more at mission.org.
We talked about:
Zhamak's background
What is Data Mesh?
Domain ownership
Determining what to optimize for with Data Mesh
Decentralization
Data as a product
Self-serve data platforms
Data governance
Understanding Data Mesh
Adopting Data Mesh
Resources on implementing Data Mesh

Links:
Free 30-day code from O'Reilly: https://learning.oreilly.com/get-learning/?code=DATATALKS22
Data Mesh book: https://learning.oreilly.com/library/view/data-mesh/9781492092384/
LinkedIn: https://www.linkedin.com/in/zhamak-dehghani
ML Zoomcamp: https://github.com/alexeygrigorev/mlbookcamp-code/tree/master/course-zoomcamp
Join DataTalks.Club: https://datatalks.club/slack.html
Our events: https://datatalks.club/events.html
Transcript for this episode (https://docs.google.com/document/d/1saxAgpV9y2wdfWVwfmFFyb6YFJqoXugMMMXH7fOVaQw/edit?usp=sharing (link)) provided by Starburst. See their Data Mesh Summit recordings https://www.starburst.io/learn/events-webinars/datanova-on-demand/?datameshradio (here) and their great data mesh resource center https://www.starburst.io/info/distributed-data-mesh-resource-center/?datameshradio (here).

In this episode, Scott interviewed Argyris Argyrou, Head of Data, and Konstantinos "Kostas" Siaterlis, Director of Big Data at Orfium. There is a ton of useful information on anti-patterns, what is going well now, advice, etc. in this one. From here forward in this write-up, A&K will refer to Argyris and Kostas rather than trying to specifically call out who said which part in most cases.

Some key takeaways/thoughts from A&K's points of view:
On a data mesh journey: "It's not a sprint, it's a marathon." Pace yourself. It's okay to go at your own pace; don't worry about what other people are doing with data mesh, do what's right for you.
Really focusing on the why and showing people results was a far better driver of buy-in and participation than any amount of selling data mesh as a practice. Calling it data mesh when trying to explain it to people outside the data team didn't go well either...
Orfium's "Data Doctor" approach - low-friction, low-pressure office hours with a general staff data engineer - has really helped people with data challenges and spread good data practices, without the "Doctor" becoming a bottleneck. The Data Doctor's role is to answer questions and provide guidance but not do the work for people. Then, take what was discussed and the best practice and document it for others to learn from - providing good leverage for scaling best data practices.
In a smaller company like Orfium (~250 people), it's hard to justify a lot of full-time heads to implement data mesh. And trying to treat a data mesh implementation like a side project also creates issues. There isn't a great answer here on exactly what to do, except possibly take things slower than most startups are used to. Your data will still be waiting for you a few months later.
If you are having difficulty driving broad buy-in, showing people what data mesh can do in action really helped at Orfium. Once they saw the approach delivering value, they wanted to participate. When trying to drive buy-in, specifically talking about data mesh didn't work well with non-data folks. It's very easy to get confused about data mesh for data folks - just imagine it for non-data folks.
Trying to use Zhamak's articles as the optimal early state - where you need to be just to get moving - requires far too much work. Get to a place where you can try, learn, iterate, and repeat on your way to driving value. It's a journey!
It's probably not a great idea for your first use case to be your most advanced or complicated - you will build your platform to focus on serving those needs instead of general affordances. Jen Tedrow's episode covers this quite nicely.
Really assess how much additional work your data products will be for a data product owner. For Orfium, it was something to add to the existing product managers' plates, as it wasn't a huge incremental burden just yet.
Consider splitting your mesh data product ownership between business context ownership...
For more great content from Zhamak, check out her book on https://www.oreilly.com/library/view/data-mesh/9781492092384/ (data mesh), https://www.oreilly.com/library/view/software-architecture-the/9781492086888/ (a book she collaborated on), https://www.linkedin.com/in/zhamak-dehghani/ (her LinkedIn), and https://twitter.com/zhamakd (her Twitter). Data Mesh Radio is hosted by Scott Hirleman. If you want to connect with Scott, reach out to him at community at datameshlearning.com or on LinkedIn: https://www.linkedin.com/in/scotthirleman/ (https://www.linkedin.com/in/scotthirleman/) If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/ (https://datameshlearning.com/community/) If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see https://docs.google.com/document/d/1WkXLhSH7mnbjfTChD0uuYeIF5Tj0UBLUP4Jvl20Ym10/edit?usp=sharing (here) All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): https://pixabay.com/users/lesfm-22579021/ (Lesfm), https://pixabay.com/users/mondayhopes-22948862/?tab=audio (MondayHopes), https://pixabay.com/users/sergequadrado-24990007/ (SergeQuadrado), and/or https://pixabay.com/users/nevesf-5724572/ (nevesf) Data Mesh Radio is brought to you as a community resource by DataStax. Check out their high-scale, multi-region database offering (w/ lots of great APIs) and use code DAAP500 for a free $500 credit (apply under "add payment"): https://www.datastax.com/products/datastax-astra?utm_source=DataMeshRadio (AstraDB)
Transcript for this episode (https://docs.google.com/document/d/1J4ikSSLv_VgY0yZYqz_xqwzB7BC0P4Q8P5PaCNZ18WU/edit?usp=sharing (link)) provided by Starburst. Data Governance In Action: What Does Good Governance Look Like in Data Mesh - Interview w/ Shawn Kyzer and Gustavo Drachenberg In this episode, Scott interviewed Shawn Kyzer, Principal Data Engineer, and Gustavo Drachenberg, Delivery Lead at Thoughtworks. Both have worked on multiple data mesh engagements including with Glovo starting 2+ years ago. From here forward in this write-up, S&G will refer to Shawn and Gustavo rather than trying to specifically call out who said which part. Some key takeaways/thoughts from Shawn and Gustavo's point of view: It's very easy for centralized governance to become a bottleneck. Make sure any central governance team/board that is making decisions has a way to quickly work through backlog through good delegation. Not every decision needs deep scrutiny from top management. To do federated governance right, you need to enable the enforcement - or often more appropriately the application - of policies through the platform wherever possible.
Take the burden off the engineers to comply with your governance standards/requirements. Domains should have the freedom to apply policies to their data products in a way that best benefits the data product consumers. So if there are data quality standard policies, a data product should adhere to the standard for measuring completeness as an aspect of data quality but might be optimized for something other than completeness. The cost of getting anything "wrong" in data has historically been quite high because of how rigid things have been - the cost of change was high. But with data mesh, we are finding new ways to lower the cost of change. So it is okay to start with policies that aren't complete and will evolve as you move along. If you have an existing centralized governance board, that will sometimes make moving to federated governance ... challenging at best ... so you will need a top-down mandate to reshape the board. Look to get the necessary representation across your capabilities (e.g. product, security, platform, engineering, etc.) while avoiding creating a political issue if possible. Look to add incremental value through each governance policy. And look to iterate quickly on policy decisions where you can. Create a feedback loop on your policies to iterate and adjust. It's okay to not get your policies perfect the first time, you can adjust them. Really figure out what you are trying to prove out in your initial proof of value/concept. If it's full data mesh capabilities, that can easily take 4-6 months. An interesting incremental insight: Zhamak has warned about organizations trying to scale too fast as an anti-pattern that may result in lots of tech debt or a failure of your implementation. Another interesting incremental insight: in all of the data mesh implementations S&G have worked on thus far, the initial data product has not had any PII as that adds significant complications, probably beyond the value add of including PII in most cases....
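To make the "apply policies through the platform" idea concrete, here is a minimal sketch (the function names, fields, and thresholds are illustrative assumptions, not anything from the episode): a shared standard defines how completeness is measured, while each domain chooses the target that fits its own data product's consumers.

```python
# Illustrative sketch of federated governance: the *measurement* of
# completeness is a shared, platform-level standard, but each domain
# picks its own target threshold for its data products.

def completeness(rows, required_fields):
    """Shared standard: fraction of rows with every required field populated."""
    if not rows:
        return 0.0
    complete = sum(
        1 for row in rows
        if all(row.get(f) not in (None, "") for f in required_fields)
    )
    return complete / len(rows)

def check_policy(rows, required_fields, target):
    """The platform applies the policy; `target` is the domain's own choice."""
    score = completeness(rows, required_fields)
    return {"completeness": score, "passed": score >= target}

# A domain optimizing for something other than completeness (e.g. freshness)
# might set a lower target than another domain would:
result = check_policy(
    [{"id": 1, "email": "a@example.com"}, {"id": 2, "email": None}],
    required_fields=["id", "email"],
    target=0.8,
)
```

The key design point is that domains never reimplement the measurement, only parameterize it, so the standard stays comparable across the mesh.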
Transcript for this episode (https://docs.google.com/document/d/1P-Xjxgz7GafDgkCnBF67Tj5LuQXcjvQerU5FLxyyxNM/edit?usp=sharing (link)) provided by Starburst. In this episode, Scott interviewed Stéphanie Bergamo and Simon Maurin of Leboncoin. Stéphanie is a Lead Data Engineer and Simon is a Lead Architect at Leboncoin. From here on, S&S will refer to Stéphanie and Simon. Some key takeaways/thoughts from Stéphanie and Simon's point of view: "Bet on curious people", "just have people talk to each other", and "lower the cognitive costs of using the tooling" - if you can do that, you'll raise your chance of success with your data mesh implementation. Leboncoin requires teams to share information on the enterprise service bus that might not be directly useful to the originating domain on the operational plane. They are using a similar approach with data for data mesh - sharing information that might not be useful directly to the originating domain by default. Leboncoin presses teams to get data requests to other teams early so they can prioritize it.
There isn't an expectation of producing new data very quickly after a new request, which is probably a healthy approach to data work/collaboration. Embedding a data engineer into a domain doesn't make everything easy, it's not magic. Software engineers will still need a lot of training and help to really understand data engineering practices. Tooling and frameworks can only go so far. Be prepared for friction. Similarly, getting data engineers to realize that data engineering is just software engineering but for data - and to actually treat it as such - might be even harder. Software engineers generally don't know how to write good tests relative to data. Neither do data engineers. But testing is possibly more important in data than in software. We all need to get better at data testing. Start with building the self-service platform to solve the challenges of the data producers first. You may make it very easy to discover and consume data but if the producers aren't producing any data... If your software engineers are doing data pipelines at all before starting to work with them in a data mesh implementation, you can probably expect they aren't using best practices. It's pretty common for good/best practices to be known by only a few people inside an organization, such as with a specialty-focused guild. Look for ways to cross-pollinate information so more people are at least aware of best practices if not able to fully implement them yet. Trying to force people to share data in a data mesh fashion didn't work for Leboncoin and probably won't in most organizations. Find curious developers and help them accomplish something with data, that will drive buy-in. As part of #10, data products often start as something serving the producing domain and then evolve to serve additional use cases. They start by serving a specific business need and evolve from there. Look to build your tooling to enforce your data governance requirements/needs. 
Trying to put too much on the plate of software engineers probably won't go well. Around the time Zhamak's first post...
Transcript for this episode (https://docs.google.com/document/d/1r2GDDj3IQ0L4UO3iGv-5sLPCU0rKErjrbUzN7vMTzTY/edit?usp=sharing (link)) provided by Starburst. In this episode, Scott interviewed Dave Colls, Director of Data and AI at Thoughtworks Australia. Scott invited Dave on due to a few pieces of content including a webinar on fitness functions with Zhamak in 2021. There aren't any actual bears, as guests or referenced, in the episode :) To start, some key takeaways/thoughts and remaining questions: Fitness functions are a very useful tool to assess questions of progress/success at a granular and easy-to-answer level. Those answers can then be summed up into a greater big picture. You should start with fitness functions early in your data mesh journey so you can also measure your progress along the way. To develop your fitness functions, ask "what does good look like?" Focus your fitness functions on measuring things that you will act on or are important to measuring success. Something like amount of data processed is probably a vanity metric - drive towards value-based measurements instead. Your fitness functions may lose relevance and that is okay. You should be measuring how well you are doing overall, not locking on to measuring the same thing every X time period.
What helps you assess your success? Again, measure things you will act on, otherwise it's just a metric. Dave believes the reason to create - or genesis of - a mesh data product should be a specific use case. The data product can evolve to serve multiple consumers but to start, you should not create data products unless you know how they will (likely?) be consumed and have at least one consumer. Team Topologies can be an effective approach to implementing data mesh. Using the TT approach, the enablement team should focus simultaneously on 1) speeding the time to value of the specific stream-aligned teams they are collaborating with and 2) looking for reusable patterns and implementation details to add to the platform to make future data product creation and management easier. We still don't have a great approach to evolving our data products to keep our analytical plane in sync with "the changing reality" of the actual domain on the operational plane. On the one hand, we want to maintain a picture of reality. On the other, data product evolution can cause issues for data consumers. So we must balance reflecting a fast-changing reality against disrupting data consumers, including downstream and cross data product interoperability. There aren't great patterns for how to do that yet. There is a tradeoff to consider regarding mesh data product size. Dave recommends you resist the pull of historical data ways - and woes - of trying to tackle too much at once. The smaller the data product, the less scope it has, which makes it easier to maintain and quicker to deploy and iterate on. But smaller-scope data products will increase the total number of data products, likely making data discovery harder. And do we then end up with data product owners managing many data products in their portfolios? Dave recommends using the Agile Triangle framework to figure out a good data product scope (link at the end). Dave mentioned he first started discussing fitness functions regarding...
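To illustrate the "granular, easy-to-answer questions summed into a big picture" idea, here is a minimal sketch of mesh fitness functions (the specific checks, field names, and the flat averaging are invented for illustration - they are not Dave's actual functions):

```python
# Each fitness function answers one granular "what does good look like?"
# question about a data product; per-function pass rates then roll up
# into an overall scorecard for the mesh.

fitness_functions = {
    "has_named_owner":  lambda dp: dp.get("owner") is not None,
    "has_description":  lambda dp: bool(dp.get("description")),
    "fresh_within_sla": lambda dp: dp.get("hours_since_update", float("inf"))
                                   <= dp.get("freshness_sla_hours", 24),
}

def scorecard(data_products):
    """Pass rate per fitness function, plus a naive overall average."""
    results = {
        name: sum(fn(dp) for dp in data_products) / len(data_products)
        for name, fn in fitness_functions.items()
    }
    results["overall"] = sum(results.values()) / len(results)
    return results

mesh = [
    {"owner": "checkout-team", "description": "Completed orders", "hours_since_update": 2},
    {"owner": None, "description": "", "hours_since_update": 48},
]
```

Note the checks deliberately measure actionable qualities (ownership, documentation, freshness) rather than vanity metrics like volume of data processed.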
Data mesh has the potential to solve a big data problem. The book, Data Mesh: Delivering Data-Driven Value at Scale, authored by Zhamak Dehghani, the creator of data mesh, is a guide for practitioners, architects, technical leaders, and decision makers on their journey from traditional big data architecture to a distributed and multidimensional approach to analytical data management. In this episode, Zhamak and Jesse talk about the book, unravel some of the complex and critiqued areas of data mesh, and shed light on how to succeed with data mesh. This episode is part of a trilogy of conversations with Zhamak and Jesse, as they delve into the different sides of data mesh. Before you dive in, do make sure you've listened to their first and second conversations. More about our host, Jesse Anderson Read the transcript of this episode Listen to our first conversation with Zhamak from 2021, Data for Everyone, or her second conversation with Jesse, Reflections. Get Zhamak's book Data Mesh: Delivering Data-Driven Value at Scale Connect with us on social media: Twitter, LinkedIn Find book recommendations and more resources for data professionals at https://dreamteam.soda.io From Soda, the provider of data reliability tools and an observability platform that enables data teams to discover, prioritize, and resolve data issues.
Ole's Book (O'Reilly Early Release): https://www.oreilly.com/library/view/the-enterprise-data/9781492098706/ (https://www.oreilly.com/library/view/the-enterprise-data/9781492098706/) Transcript for this episode (https://docs.google.com/document/d/1wWCM6AQJ0GtGmYJ7XFR-kSUfWv2jzxVp4jUApNDSyMk/edit?usp=sharing (link)) provided by Starburst. In this episode, Scott interviewed Ole Olesen-Bagneux, an Enterprise Architect who focuses on data at GN and the author of an upcoming book on data catalogs with O'Reilly. To be clear, Ole was only representing himself and not GN. The two main topics, which are somewhat intertwined, were: 1) how can we better understand and handle the concept of a domain when discussing data; and 2) how can we build systems that better enable us to search "for" data, not just search "in" data that we know exists? Some practical advice and general conclusions from Ole: Leverage what the Library and Information Sciences discipline - which is centuries (millennia?) old - has formed around the domain concept. It will help you better dig into the actual business dealings of the domain first before trying to focus too much on the technical/software aspects.
When you start from a DDD perspective in data, the software aspects hinder your initial domain mapping - especially in depth - and your business context understanding. Spend a lot more time on enabling people to understand what data is available. We focus a lot on optimizing for searching "in" data, but we don't spend nearly enough time setting up our systems to allow people to search "for" data. To do that, work seriously on your metadata tools system and look for ways to harmonize data across those tools. Ole started the conversation by sharing his view that Domain Driven Design (DDD) has some shortcomings, especially when used for data domain mapping and for data in general. In his view, DDD is overly tied to software engineering, so there is too much of a technical bent to understanding and even mapping out domains. He recommends taking domain analysis and domain theory learnings from the Library and Information Sciences discipline, using those to start your domain mapping, and then looking to bring in DDD after you get a good initial understanding of your domains. DDD and domain analysis can work together harmoniously - they don't really contradict each other - but domain analysis focuses on the knowledge first instead of the technical first. While Ole was inspired by Zhamak's book as well as the book by Piethein Strengholt, he believes domain analysis lowers the significant friction and often frustration organizations feel when trying to start doing DDD for data. Domain analysis digs much more into what the domain does and why instead of how the domain communicates via software. He believes that data mesh should focus more on the information sharing and less on the software and that DDD will overcomplicate your domain mapping. For Ole, DDD is overly concerned with modeling domains into software, but you need to get a deeper understanding of your domains and organization first before focusing in on your model.
It may be that you truly can't fully communicate your domain's context in a data model either and it's good to know that upfront and take steps to communicate in other...
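Ole's distinction between searching "in" data and searching "for" data can be sketched roughly as follows - a toy keyword match over a hypothetical metadata catalog (the entries and scoring are illustrative assumptions, not a real catalog product):

```python
# Searching "for" data means indexing dataset *metadata* (names,
# descriptions, tags) so people can discover datasets they don't yet
# know exist - as opposed to querying inside data they already know about.

catalog = [
    {"name": "orders",   "description": "completed customer orders",
     "tags": ["sales", "checkout"]},
    {"name": "sessions", "description": "raw web session events",
     "tags": ["clickstream"]},
]

def search_for_data(query, catalog):
    """Rank catalog entries by how many query terms their metadata matches."""
    terms = query.lower().split()
    hits = []
    for entry in catalog:
        haystack = " ".join(
            [entry["name"], entry["description"], *entry["tags"]]
        ).lower()
        score = sum(term in haystack for term in terms)
        if score:
            hits.append((score, entry["name"]))
    return [name for _, name in sorted(hits, reverse=True)]
```

A production catalog would use proper indexing and richer metadata, but the design point stands: the search surface is descriptions of data, not the data itself.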
Zhamak Dehghani is the founder of data mesh and author of the book, Data Mesh: Delivering Data-Driven Value at Scale. Zhamak returns to the Soda Podcast. In part one, Zhamak and Jesse will delve into data mesh and how that is redefining how we manage data. They talk through the reality, the dream, and the execution, and bring us up to speed on what's been happening since they met in season one. Part two - coming soon - will deep-dive into Zhamak's book. More about our host, Jesse Anderson Read the transcript of this episode Get Zhamak's book Data Mesh: Delivering Data-Driven Value at Scale Connect with us on social media: Twitter, LinkedIn Find book recommendations and more resources for data professionals at https://dreamteam.soda.io From Soda, the provider of data reliability tools and observability platform to enable data teams to discover, prioritize, and resolve data issues.
Zhamak Dehghani is the founder of data mesh and author of Data Mesh: Delivering Data-Driven Value at Scale. We're delighted to welcome Zhamak back to the Soda Podcast to talk about the reality, the dream, and the execution of data mesh, writing a book, and life in general. More about our host, Jesse Anderson Read the transcript of this episode Connect with us on social media: Twitter, LinkedIn, Facebook Find book recommendations and more resources for data professionals at https://dreamteam.soda.io From Soda, the provider of data reliability tools and observability platform to enable data teams to discover, prioritize, and resolve data issues.
Transcript for this episode (https://docs.google.com/document/d/1ozNbiGSetfyoBHAWofygmGLgFN0PKza1U0RlaoChMgc/edit?usp=sharing (link)) provided by Starburst. In this episode, Scott interviewed Tim Gasper, VP of Product at data.world and the co-host of the Catalog & Cocktails podcast. They covered two main topics - 1) the skeptic's view of data mesh and 2) Tim's/the data.world team's "ABCs of Data Products" framework. Skeptics have a few main pushbacks on data mesh in Tim's view. Tim listed the top 6 that he sees and then discussed them with Scott. #1: Data mesh isn't for every organization depending on size, number of domains, data/problem space complexity, etc. Tim said this. Zhamak has said this. Most data mesh advocates/fans say this regularly. This is one of the myths of data mesh - that it's designed for everyone. Don't go to a decentralized data setup if you don't need to. Tim made the very good point that we need more conversations and better guidance on how to measure whether centralization of your data team and processes is your actual challenge. #2: Tooling doesn't exist - yet? - to make it easy for domains to take over data ownership.
A big conceptual myth of data mesh is that it has to solve every data problem, even the most difficult, right out of the gate. Tim mentioned that your team needs to really think about self-service being about empowerment, not necessarily a single big red easy button. And your implementation will evolve - it MUST evolve. It's not easy yet and if your team isn't prepared to roll up their sleeves, it's okay to wait to implement. #3: There shouldn't be anyone who "owns" the data. Tim made a really good point here on accountability for sharing your data versus the "fiefdom" model - where someone has complete control over how the data is used. Yes, someone shouldn't be able to prevent other domains from using data. But that's not at all in the spirit of data mesh anyway. Why would you make data reusable and discoverable if people can't use it? #4: There aren't enough case studies yet. Tim mentioned this briefly. It is a bit of a chicken and egg issue: if we wait for people to be "done" with their journeys, it will be another 5 years before good case studies emerge. It's okay to need more proof before wanting to go forward but it might mean lost opportunity. And there are good examples out there, including guests from this podcast (20+ so far). #5: Lacking guidance on exactly how to handle cross-domain data combinations. Tim mentioned there is the question of how those combinations get managed; right now, in a data warehouse or data lake world, there is a clear owner - the data team. Unfortunately for those who want a direct data mesh playbook, this is situational - you have to figure it out for each situation and be ready to evolve. #6: Data mesh will create data silos. Sure, if you have the data mart model of old where data is created only for the domains to use internally. But that's not data mesh. Tim talked about how important iteration and collaboration are to prevent data silos. So much is about the intent to not let data silos...
This bonus episode features conversations from seasons 1 and 2 of the Open||Source||Data podcast. In this episode, you'll hear from Zhamak Dehghani, Director of Emerging Technologies at ThoughtWorks North America; David Thomas, Principal at Deloitte; and Shirshanka Das, Founder of LinkedIn DataHub and Acryl Data. Sam sat down with each guest to discuss data meshes, fabrics, and discovery. You can listen to the full episodes from Zhamak Dehghani, David Thomas, and Shirshanka Das by clicking the links below. Episode Timestamps: (00:36) Zhamak Dehghani, (01:41) David Thomas, (02:43) Shirshanka Das. Links: Listen to Zhamak's episode, Listen to David's episode, Listen to Shirshanka's episode.
Zhamak Dehghani joins the show to chat about all things Data Mesh, and her new O'Reilly book by the same title. This is a rare opportunity to learn all about Data Mesh from Zhamak in an unscripted and candid live chat, with Q&A welcome. Streamed live on LinkedIn and YouTube. #datamesh #dataengineering #data Note - this was initially supposed to be on the Monday Morning Data Chat, but moved to the TGIF, Let's Talk Data Show! (podcast coming soon). Enjoy!
This interview was recorded for the GOTO Book Club: gotopia.tech/bookclub. Zhamak Dehghani - Author of "Data Mesh" & Director of Emerging Technologies at Thoughtworks. Samia Rahman - Director of Data & AI at Seagen. DESCRIPTION: How can modern organizations handle their data in a way that delivers value at scale? Zhamak Dehghani, author of "Data Mesh: Delivering Data-Driven Value at Scale," covers the key principles of data mesh and how it can help organizations move beyond the data lake to provide meaningful insights. She's joined by Samia Rahman, director of data and AI at Seagen, as they also explore the concept of the earliest explorable data. The interview is based on Zhamak's book "Data Mesh". Read the full transcription of the interview here. RECOMMENDED BOOKS: Zhamak Dehghani • Data Mesh; Zhamak Dehghani, Neal Ford, Mark Richards & Pramod Sadalage • Software Architecture; Sam Newman • Monolith to Microservices; Sam Newman • Building Microservices; Sandeep Uttamchandani • The Self-Service Data Roadmap; Piethein Strengholt • Data Management at Scale; Martin Kleppmann • Designing Data-Intensive Applications. Twitter, LinkedIn, Facebook. Looking for a unique learning experience? Attend the next GOTO conference near you! Get your ticket at gotopia.tech.
Transcript for this episode (https://docs.google.com/document/d/1ZCJGJU5jh5qqYN5wVrWMgTGRGOYF3PbkNO181I5PoKM/edit?usp=sharing (link)) provided by Starburst. Knowledge Graph Conference website: https://www.knowledgegraph.tech/ (https://www.knowledgegraph.tech/) Free Ticket Raffle for Knowledge Graph Conference (submissions must be by April 18 at 11:59pm PST): https://forms.gle/Gy8KSMNDxbBfib2Z6 (Google Form) In this episode of the Knowledge Graph Conference takeover week, special guest host Ellie Young (https://www.linkedin.com/in/sellieyoung/ (Link)) interviewed Veronika Haderlein-Høgberg, PhD. Veronika was employed at Fraunhofer-Gesellschaft at the time of recording but was representing only her own views and experiences. She was invited for her special mix of both data mesh and knowledge graph know-how. At Fraunhofer-Gesellschaft, she and her team were implementing a knowledge graph to help with decision support for the organization. Previously, Veronika worked on a data mesh-like implementation in the Norwegian public sector at the Norwegian tax authority, before the data mesh concept was coalesced into a singular form by Zhamak. Veronika and Ellie wrapped the conversation with a few key insights: to share data, groups need to agree on common standards to represent it, and they also need to be able to share information with each other about that data into the future.
To develop these initial data standards, and to build the relationships to coordinate around that data long term, different departments in the enterprise have to converse with each other. Building conversations across departments also requires building trust, and curiosity is a crucial ingredient for this, at the individual level as well as at the domain and organizational levels. If people don't feel comfortable asking questions, they can't understand each other's perspectives well enough to contribute to that shared context. What does this look like in practice? Different departments come together to discuss the different definitions they have of terms and find out what data they need from each other - and therefore what data they must collect and what protocols they must develop. And computer scientists discuss data with business people - understanding what the business requirements are and conveying what data systems need in order to provide organized, quality data. Veronika's recent organization, Fraunhofer, is using a knowledge graph because they need to make their investment decisions much more data driven. They need to do analysis across many different sources - they have some slight control over internal data sources but essentially none over external sources. They are repeatedly doing harmonization across these sources, often the same harmonizations. Veronika believes they shouldn't have to do the harmonization manually, so they needed a translation layer - the knowledge graph. To build out their knowledge graph, they need business experts to work with the ontology experts - however, it is a struggle to get time and attention from the business experts, who need to learn both why ontologies matter and how to do them. This is when Ellie mentioned that centralizing the integration might cost a lot of effort up front, but it's necessary if you only want to do the harmonization work once.
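The "do the harmonization once" idea behind the translation layer can be sketched as a simple canonical-term mapping (the source names and fields below are invented for illustration; a real knowledge graph would express these mappings as ontology relations rather than a dictionary):

```python
# A translation layer maps each source's local vocabulary onto canonical
# terms one time, so the same harmonization is not redone by hand for
# every new analysis across sources.

CANONICAL_TERMS = {
    "revenue":  {"crm": "total_sales", "erp": "rev_amt"},
    "customer": {"crm": "account_name", "erp": "client"},
}

def harmonize(record, source):
    """Translate one source record into the canonical vocabulary."""
    out = {}
    for canonical, per_source in CANONICAL_TERMS.items():
        local_name = per_source.get(source)
        if local_name in record:
            out[canonical] = record[local_name]
    return out

# Records from different sources land in the same canonical shape,
# e.g. harmonize({"rev_amt": 120, "client": "Acme"}, "erp")
```

Building the mapping still requires the business experts and ontology experts to agree on the canonical terms up front - which is exactly the conversation-heavy work described above.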
For Veronika, thinking in the data as a product mindset and having data owners is crucial...
Transcript for this episode (https://docs.google.com/document/d/1PblP9tgQcqJp5ljyIH1MjNsSZmq9ot8pY3LzDiDBG1A/edit?usp=sharing (link)) provided by Starburst. In this episode, Scott interviewed Chris Riccomini, a Software Engineer, Author, and Investor. Chris led the infrastructure team at WePay when they embarked on a data mesh journey and made a well-written post on thinking about data mesh in DevOps terms. Like a number of people/organizations that have come on the podcast, at WePay, Chris was pursuing the general goals of data mesh and was applying some of the approaches as well - but it was not nearly as cohesive as Zhamak laid things out. Their initial setup had two teams managing the pipeline/transformation infrastructure. Chris's team was mostly handling the extracting and loading and then there was a team of analytics engineers handling the transformations. The Transformation team saw a major increase in demand and quickly became overloaded -> a bottleneck. Chris's team also started to get overloaded so they knew they had to evolve. One way the team started to address the bottlenecks was by decentralizing the pipelines. Teams could make a request and a scalable and reliable pipeline would essentially get automatically set up for them. WePay is in the financial services space, so as part of those pipelines, to prevent risk, teams could mark their sensitive/PII columns and the infra team also put in some autodetection capabilities to make sure they didn't miss any.
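The two-pronged approach described - producers marking sensitive columns, plus platform autodetection as a backstop - might look roughly like this (the patterns, names, and sampling approach are illustrative assumptions; WePay's actual implementation is not detailed in this write-up):

```python
import re

# Backstop autodetection: a column counts as sensitive if the producer
# marked it, or if sampled values match common PII patterns.

PII_PATTERNS = {
    "email":  re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def is_sensitive(column_name, sample_values, marked_columns=frozenset()):
    """True if producer-marked, or if any sampled value looks like PII."""
    if column_name in marked_columns:
        return True
    return any(
        pattern.search(str(value))
        for value in sample_values
        for pattern in PII_PATTERNS.values()
    )
```

Pattern matching on samples will miss some PII and false-positive on others, which is exactly why it works as a backstop to explicit marking rather than a replacement for it.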
WePay created a "canonical data representation" or CDR, which is pretty analogous to a data product in data mesh. Chris really liked WePay's use of the embedded analytics engineer as a data product developer. One key innovation at WePay was tooling to enable safe application schema evolution: it looked for things like dropped columns and ran more comprehensive data contract checks, allowing developers to test changes pre-commit. 80-90% of data breakages were changes the developers had no idea would cause an issue, and they reverted them. The other 10-20% of the time, the developers still wanted to go through with the changes, and that kicked off negotiations with data consumers - a forced conversation that was very helpful for a few reasons. Chris talked about standardizing around technologies for the platform while allowing teams to roll their own if they wanted - but being super clear that the infra team wouldn't support those other technologies, even though teams were allowed to use them. He also sees a major need for an API gateway concept for data; currently, everything around versioning, auto-documentation, etc. is far too manual and high friction. Chris talked about taking the learnings from DevOps and applying them to data mesh. One good one to look at, per Chris, is the embedded SRE concept - should you do the same with a data or analytics engineer? There is also a need for many standards and replicable patterns. They launched a data review committee, similar to a design review committee, that helped come up with standard data models and other standards. Sherpas, not gatekeepers - build out your review functions as councils that guide and disseminate knowledge. The team's role should be about assisting where they can, being a trusted partner. And what WePay saw was that as people went through more reviews, there was less need for the committee, because people had learned the good/best practices. Lastly, Chris...
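The pre-commit schema check described above can be sketched as a small compatibility function. This is a minimal hypothetical sketch, not WePay's tooling: it only catches dropped columns and type changes, whereas their checks were described as more comprehensive.

```python
# Hypothetical pre-commit schema check in the spirit of WePay's tooling:
# compare the old and proposed {column: type} mappings and report anything
# that would break downstream consumers.
def check_schema_change(old, new):
    """Return a list of human-readable breaking changes (empty if compatible)."""
    breaks = []
    for col, col_type in old.items():
        if col not in new:
            breaks.append(f"dropped column: {col}")
        elif new[col] != col_type:
            breaks.append(f"type change on {col}: {col_type} -> {new[col]}")
    return breaks

old = {"id": "bigint", "amount": "decimal", "memo": "varchar"}
new = {"id": "bigint", "amount": "varchar"}  # memo dropped, amount retyped
for problem in check_schema_change(old, new):
    print(problem)
```

A non-empty result is what would either trigger a revert (the 80-90% case) or start the producer-consumer negotiation (the 10-20% case).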
Transcript for this episode, provided by Starburst: https://docs.google.com/document/d/16EBMvfIqyEnf_0bdEjsZNj9yCM4NoLZIvQnUzRPLwtk/edit?usp=sharing

In this episode, Scott interviewed Dan Sullivan, Principal Data Architect at 4 Mile Analytics. A key point Dan brought up is tech debt around data. Taking on tech debt should ALWAYS be a very conscious choice, but the way most organizations work with data, it is much more of an unconscious one - especially by data producers, who take on debt that the data engineering teams then have to pay down. We need to find ways to deliver value quickly but with discipline. Zhamak has mentioned in a few talks that data engineers may soon not exist in orgs deploying data mesh. Dan somewhat agrees that data engineering will change a lot; right now there is a big rush to build out the initial iterations of data products (in the industry sense of the term). Going forward, Dan thinks there will be a need for data engineers who can really understand consumer needs and build the interactions - e.g. the SDKs - to leverage data. Dan's 3 key pillars for driving data literacy among data engineers are domain knowledge, learning, and collaboration: data engineers should pair with business people to acquire domain knowledge, they should be given the opportunity to spend time on things like online training to learn, and they should collaborate across the organization instead of just being ticket tacklers.
Per Dan, not all data engineers are the same - some come from a data analyst/data science background, but many come from a software engineering background - so we can't train all data engineers as if they were identical. But we do need them to have a well-rounded background. A big need is for them to understand more about the data consumers and/or producers, so embedding them in the domains can really help. For driving buy-in with data engineers, Dan points to the problems typically being around incentives: data engineering is often hampered by organizational issues and a lack of clear direction, so if you can tackle those, you can often win over DEs. In any organization, but especially one implementing data mesh, standards, protocols, and contracts are all very important. However, most data engineering teams are not given the time to create them - they take a lot of effort and are hard to get right! Dan talked about how data can borrow a lot of useful practices from Agile, especially the fast-cycle feedback loop, and how data people really need to think more about the user experience (UX) of data. Dan's LinkedIn: https://www.linkedin.com/in/dansullivanpdx/ Dan's Email: dan.sullivan at 4mile.io Data Mesh Radio is hosted by Scott Hirleman. If you want to connect with Scott, reach out to him at community at datameshlearning.com or on LinkedIn: https://www.linkedin.com/in/scotthirleman/ If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/ If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see here: https://docs.google.com/document/d/1WkXLhSH7mnbjfTChD0uuYeIF5Tj0UBLUP4Jvl20Ym10/edit?usp=sharing All music used this episode created by Lesfm (intro includes slight edits by Scott Hirleman):...
Transcript for this episode, provided by Starburst: https://docs.google.com/document/d/1Ei1x3pCjrrbBzrmTmCr3a3c7fWUYuQAzSFYutLVuY-E/edit?usp=sharing See Starburst's Data Mesh Summit recordings (info gated): https://www.starburst.io/learn/events-webinars/datanova?datameshradio

In this episode, Scott interviewed José Cabeda, Data Engineer at call-center-as-a-service provider Talkdesk. They talked about Talkdesk's start to their data mesh journey and their progress so far. When José came across Zhamak's original post, it spoke to a number of the challenges Talkdesk was facing, checking many of the boxes for where they wanted to head. The team started from a single data product and iterated from there. While they are still relatively early in their journey, like every company, they have advanced far past their initial use case. At Talkdesk, a data product is typically a single table or view in Snowflake, but the company's North Star is event streaming as the key information storage and sharing mechanism. However, it was sometimes difficult to train people to understand the difference between a business event - something that occurred in the real world - and an event streaming event. José had a few key takeaways and recommendations for those implementing data mesh: 1. Change will be constant in a data mesh implementation, so it is best to standardize the way people and systems interact as much as possible - define expectations! 2. Be open to new ideas; there are many challenges ahead, so it's best to face them together. 3. Use a single universal ID for major concepts like account or business events to make interoperability easier / possible. 4. Don't be afraid to slice your data in different ways to serve different use cases. 5.
To drive buy-in, start with a single use case, whether that is one data product or multiple - most people recommend 2-3 data products in your PoC - so you can show why data mesh is a good idea. José's LinkedIn: https://www.linkedin.com/in/jecabeda/ José's Twitter: @jecabeda / https://twitter.com/jecabeda
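Two of José's takeaways - a single universal ID for major concepts, and slicing the same data differently per use case - can be sketched together. The event shape and field names below are hypothetical illustrations, not Talkdesk's schema.

```python
from collections import Counter

# Hypothetical call-center events, every record keyed on one universal
# account_id so any data product can join on it.
events = [
    {"account_id": "acct-1", "day": "2022-05-01", "type": "call_started"},
    {"account_id": "acct-1", "day": "2022-05-01", "type": "call_ended"},
    {"account_id": "acct-2", "day": "2022-05-02", "type": "call_started"},
]

# Slice 1: activity per account - serves support/CRM consumers, and the
# universal ID makes it interoperable with other data products.
per_account = Counter(e["account_id"] for e in events)

# Slice 2: the same events sliced per day - serves an ops/volume use case.
per_day = Counter(e["day"] for e in events)

print(per_account["acct-1"], per_day["2022-05-02"])  # prints: 2 1
```

The point is that neither slice is "the" data product; both are cheap views over the same well-keyed events.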
Transcript for this episode, provided by Starburst: https://docs.google.com/document/d/133fTlVue-K-hUwpabjsYDj-adm6IGeKd0KVkkQtxo9M/edit?usp=sharing

In this episode, Scott interviews Azmath Pasha, a member of the Forbes Technology Council with 25+ years of experience implementing large-scale IT projects, including at Capgemini and Paradigm Technology. Azmath gave his 3 key measures of data value: cost savings, business value (e.g. driving new initiatives), and data reuse. For data mesh, the long-term value is in the latter two, but for Azmath, a PoC may be better served by focusing on cost savings, as it is easier to track and faster to realize. They dove into the concept of data discovery with human interaction, not a purely online experience. Similar to event storming for discovering your domain events (see the DDD for Data episodes), discovery as a purely tool-based experience is always likely to be somewhat lacking. Scott was intrigued, as that aspect of data discovery hasn't been widely discussed. To Azmath, the data product experience - part of what Zhamak calls 'the experience plane' - is crucial: it is much harder to drive buy-in if your product is hard to use or has a bad user experience. Azmath's other crucial aspects of getting a data mesh (or any large-scale data project) implementation right included: staying tool agnostic so you can remain "future proof"; supporting data producers to reduce time to delivery, especially initial delivery; and looking at your architecture and tool investments over a 5-year time horizon, not just the short to medium term.
Azmath wrapped up by saying we are entering a new era of using data: we must democratize data and also look to new metrics for evaluating its business value. Azmath's LinkedIn: https://www.linkedin.com/in/azmathpasha/
In this episode, Scott interviews Kurt Gardiner, Engineering Manager of Data Engineering at Australian insurance company nib Group. Kurt shared some insights into nib's journey so far, including the search for something like data mesh before Zhamak published, tool choices (Snowflake, dbt, Fivetran, EventBridge, Kinesis), the slow-roll approach to replacing the legacy implementation (the "application strangler" pattern, see link below), how they got started, and much more. Much of nib's approach is small-scale and tactical while building incrementally toward the bigger strategic focus - e.g. helping teams design their data products somewhat manually while building reusable tooling to make things far less manual going forward. Along the journey, there was some internal pushback from data consumers, especially those used to consuming from the data warehouse. To do data mesh right, Kurt and Scott both emphasized the need to set things up so they can evolve. That will frustrate or scare some people, and it's important to work with them to see why it matters. There also needs to be a high tolerance for failure - you will NOT get everything right on your first go. Kurt also waxed poetic (said nice things) about event streaming patterns, especially CQRS (see link below), as a useful, scalable pattern that is good both for application development and for creating a scalable, useful domain data model. But it requires a complete redesign, so it is probably something to introduce slowly where it makes sense, if at all.
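The CQRS pattern Kurt praised can be sketched minimally: writes go through a command model that appends events, and the read side is a projection folded from the event log. The class and event names below are hypothetical; a real implementation adds message transport, persistence, and asynchronous projection (see the Fowler link below).

```python
# Hypothetical minimal CQRS sketch: the write side only appends events.
class PolicyWriteModel:
    def __init__(self, log):
        self.log = log

    def issue_policy(self, policy_id, holder):
        self.log.append({"type": "PolicyIssued", "id": policy_id, "holder": holder})

    def cancel_policy(self, policy_id):
        self.log.append({"type": "PolicyCancelled", "id": policy_id})

# The read side is a separate model, rebuilt from the same event log -
# which is also what makes the log reusable as a domain data product.
def project_active_policies(log):
    """Fold the event log into the current set of active policies."""
    active = {}
    for event in log:
        if event["type"] == "PolicyIssued":
            active[event["id"]] = event["holder"]
        elif event["type"] == "PolicyCancelled":
            active.pop(event["id"], None)
    return active

log = []
write = PolicyWriteModel(log)
write.issue_policy("p1", "Alice")
write.issue_policy("p2", "Bob")
write.cancel_policy("p1")
print(project_active_policies(log))  # {'p2': 'Bob'}
```

The separation is the point Kurt made: the application gets a clean write path, while analytics can build any number of projections from the same events.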
Some pithy nuggets of wisdom from Kurt that are highly applicable to data mesh: "The single biggest problem in communication is the illusion that it has taken place." "Nobody cares what you know until they know that you care." Application Strangler pattern (since renamed the Strangler Fig Application pattern): https://martinfowler.com/bliki/StranglerFigApplication.html CQRS: https://www.martinfowler.com/bliki/CQRS.html Kurt's LinkedIn: https://www.linkedin.com/in/kugardiner/ nib Group careers page: https://nib.wd3.myworkdayjobs.com/careers
In this episode, Scott interviews two Domain-Driven Design (DDD) experts from Thoughtworks - Danilo Sato, Director and Head of Data & AI (part of the Office of the CTO), and Andrew Harmel-Law, Technical Principal. This was a further delve into DDD for data after the conversations with Paolo Platter and Piethein Strengholt. Danilo and Andrew shared a lot of great information about event storming, domain definitions and boundaries, ubiquitous language, and much more, but the main theme was "just get people to talk to each other." DDD is about bridging the gap between how the tech people talk and how the business people talk; if you are doing it right, both sides can understand each other, and the engineers can then implement those business-process learnings in code. For an initial PoC, Danilo recommends starting with 2-3 data products. It is better if you can do the PoC across multiple domains, but it isn't necessary. Validate value and do it quickly; as Andrew mentions, the earlier you can show value, the less pressure there is overall. Look for initial quick wins while also building for the long term. One key thing to remember, per Danilo, when doing DDD for data and data mesh in general: it is always an iterative process. Andrew briefly discussed a way to do DDD in more of a guerrilla style than the blue/red books (the well-known DDD guides). Don't get ahead of yourself, as Max Schultze mentioned in his episode - do not let the size of the eventual task throw you into analysis paralysis. Andrew talked a lot about how normalization and strong abstractions on the application side make it very difficult to re-add the context lost when you normalize. Both Andrew and Danilo talked about the need to embrace complexity: if you want context, you have to accept there will be complexity.
In the pursuit of simplification you lose the richness, and that is VERY hard to reconstruct afterwards. Some practical advice for boundary definition: boundaries need to be very clear but malleable - build everything with an eye to evolution. Before you start splitting into many two-pizza teams, look at the big picture and select some coarse-grained boundaries. It is MUCH easier to split later than it is to glue things back together. Danilo's webinar with Zhamak, "Data mesh and domain ownership": https://www.thoughtworks.com/en-us/about-us/events/webinars/core-principles-of-data-mesh/data-mesh-and-domain-ownership Vladik Khononov, 7 Years of DDD: https://www.youtube.com/watch?v=h_HjtYAH0AI Danilo's LinkedIn: https://www.linkedin.com/in/danilosato/ Danilo's Twitter: @dtsato / https://twitter.com/dtsato Andrew's LinkedIn: https://www.linkedin.com/in/andrewharmellaw/ Andrew's Twitter: @al94781 / https://twitter.com/al94781
Don't forget to catch Dr. Abadi at Datanova - the Data Mesh Summit, Feb 9-10: https://starburst.io/info/datanova2022?utm_source=DataMeshRadio Thanks to Starburst for sponsoring the transcripts for Data Mesh Radio; the transcript for this episode is here: https://docs.google.com/document/d/1CkXYjUNMRLl1okJJFmEbzTgTybAAFa-_VC4sOvnFH8s/edit?usp=sharing Starburst's other free data mesh resources: https://www.starburst.io/info/data-mesh-resource-center?utm_source=DataMeshRadio

In this episode, Scott interviewed Dr. Daniel Abadi, the Darnell-Kanal Professor of Computer Science at the University of Maryland, whose research focuses on scalable data management. Dr. Abadi will be presenting next week at the Data Mesh Summit on data fabric and data mesh alongside Zhamak and Sanjeev Mohan. This was a wide-ranging and freewheeling conversation about data virtualization in general and how it can be used in data mesh. Both agreed there are many places where data virtualization can play a role in data mesh: extracting information from operational systems, stitching together a data product once data processing has been done, or recombining data across multiple data products at the mesh experience plane. Dr. Abadi specifically envisions something like a query fabric that makes use of a data virtualization approach, not just tools that only do data virtualization. There is a natural side effect of having multiple different technologies in use - when you give the domains the ability to use what they choose, the difficulty of combining data from multiple sources needs to be solved. There is always a balance between how much you simply copy data and how much you access it in the source system, and data virtualization gives you a few more options rather than all or nothing.
As data virtualization has been around as a concept for 30+ years, there is a lot of baggage with the term, but Dr. Abadi sees recent advancements that mean more people should take a second look at where it can be useful. He warns, though, to do your homework and really think through whether it fits your use case. A query fabric can make your user experience much more pleasant; trying to create data products entirely within a data virtualization platform probably won't be as pleasant, at least according to Scott. Additional topics included retransmitting or reprocessing data, versioning, the importance of denormalizing data for analytics and how that plays with data virtualization, and much more. It is a really fascinating deep dive into the history of computing and how it impacts what we are trying to do today. Dr. Abadi's blog post on data federation and data virtualization: https://blog.starburst.io/data-federation-and-data-virtualization-never-worked-in-the-past-but-now-its-different Dr. Abadi's contact info: LinkedIn: Twitter: @daniel_abadi / https://twitter.com/daniel_abadi Starburst blog posts: https://blog.starburst.io/author/daniel-abadi
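The query-fabric idea discussed above can be sketched in miniature: a thin layer that answers a query by pushing the filter down to each source system in place and merging the results, rather than copying everything into one store first. The classes and source names below are hypothetical toys; real fabrics handle connectors, query planning, and cost-based decisions about what to copy versus access remotely.

```python
# Hypothetical in-memory stand-in for a source system (e.g. an operational DB).
class InMemorySource:
    def __init__(self, rows):
        self.rows = rows

    def scan(self, predicate):
        # In a real system this filter would run inside the source (pushdown).
        return [r for r in self.rows if predicate(r)]

# Hypothetical query fabric: fans a query out across sources and merges.
class QueryFabric:
    def __init__(self, sources):
        self.sources = sources

    def query(self, predicate):
        out = []
        for source in self.sources.values():
            out.extend(source.scan(predicate))
        return out

orders = InMemorySource([{"id": 1, "region": "EU"}, {"id": 2, "region": "US"}])
returns = InMemorySource([{"id": 9, "region": "EU"}])
fabric = QueryFabric({"orders": orders, "returns": returns})
print(len(fabric.query(lambda r: r["region"] == "EU")))  # 2
```

This is the "few more options" point: each source keeps its own technology, and the fabric decides per query what to fetch rather than forcing an all-or-nothing copy.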
In this somewhat controversial episode (part 1 of 2), Scott covers whether data mesh is right for your organization. This should only be used as a jumping-off point for discussions. The 4 segment titles are: 1) Data quality challenges don't necessarily mean it's time for a data mesh; 2) Data Mesh Lite; 3) Building on a solid foundation; 4) How 'bout now, how 'bout right now? ("Patience you must have, my young Padawan"). Take it with a grain of salt! Mentioned content links: Webinar with Zhamak and Sina Jahan, lessons from the trenches with data mesh: https://www.thoughtworks.com/about-us/events/webinars/core-principles-of-data-mesh/lessons-from-the-trenches-in-data-mesh Zhamak's podcast interview with Barry O'Reilly: https://barryoreilly.com/explore/podcast/decentralizing-data-zhamak-dehghani/ Flexport Data Mesh Learning meetup: https://www.youtube.com/watch?v=-POiudR2_R0
In this episode, Scott interviews Juan Sequeda, Principal Scientist at data.world and co-host of the Catalog & Cocktails podcast. They discussed Juan's knowledge-first approach: putting the meaning and value of the data first instead of focusing on the amount of data we are handling/producing. Knowledge first has 3 components: 1) context, 2) people, and 3) relationships. Juan is a big proponent of knowledge graphs, and the relationships side is one many people miss. Juan also shared what his approach to data mesh hinges on: treating data as a product and finding a balance between centralization and decentralization across all aspects of building out an implementation. Juan mentioned Intuit's fixed / flexible-extensible / customizable framing as a good general tool, and suggested looking for (and embracing) what he calls intellectual friction. Lastly, Juan and Scott talked about the general drive to reduce toil and avoid reinventing the wheel regarding data interoperability and standard schemas in data mesh. Juan points to a lot of existing research and standards - e.g. RDF, OWL, and many more (see below) - as a starting point.
Juan's contact info and related links: Email: juan at data.world Twitter: @juansequeda / https://twitter.com/juansequeda LinkedIn: https://www.linkedin.com/in/juansequeda/ Catalog & Cocktails podcast: https://data.world/podcasts/ Juan's post about Zhamak's appearance on the Data Engineering Podcast: https://www.linkedin.com/pulse/my-takeaways-data-engineering-podcast-episode-mesh-zhamak-sequeda/ Juan's post about knowledge first: https://www.linkedin.com/feed/update/urn:li:activity:6884179569277059072/ Standards-related links: Dublin Core Metadata Initiative: https://dublincore.org/ RDF (Resource Description Framework): https://www.w3.org/2001/sw/wiki/RDF OWL (Web Ontology Language): https://www.w3.org/OWL/ PROV-O: The PROV Ontology: https://www.w3.org/TR/prov-o/
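The "relationships" leg of Juan's knowledge-first approach - the part he says people miss - can be illustrated with RDF-style subject-predicate-object triples. The triples and identifiers below are hypothetical toys (plain Python, no triple store), just to show that traversing relationships answers questions the raw records alone don't.

```python
# Hypothetical knowledge-graph fragment as RDF-style triples.
triples = [
    ("order:42", "placedBy", "customer:alice"),
    ("customer:alice", "memberOf", "segment:enterprise"),
    ("order:42", "contains", "product:widget"),
]

def objects(subject, predicate):
    """All objects reachable from a subject via one predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]

# Follow two hops of relationships: which segment placed order 42?
customer = objects("order:42", "placedBy")[0]
print(objects(customer, "memberOf"))  # ['segment:enterprise']
```

In a real deployment the same idea would live in an RDF store queried with SPARQL, with OWL providing the ontology - the standards Juan points to above.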
Data mesh is a thesis presented to address the technical and organizational challenges businesses face in managing their analytical workflows at scale. Zhamak Dehghani introduced the concepts behind this architectural pattern in 2019, and it has been gaining popularity since, with many companies adopting some version of it in their systems. In this episode Zhamak re-joins the show to discuss the real-world benefits that have been seen, the lessons she has learned while working with her clients and the community, and her vision for the future of data mesh.
A thoughtful conversation examining the paradigm shift and the unlearning required to build a data-driven organization at scale. Hear Zhamak, the founder of data mesh, discuss what data mesh is and what it isn't. This conversation provides insights, failsafe tips, and inspiration to use data to augment and improve business and life.
Zhamak Dehghani is a software engineer, architect, and the founder of data mesh, the paradigm shift needed in how we manage data at scale. Meet Zhamak and learn how data mesh can help organizations achieve data-driven value at scale.
The data mesh architectural paradigm shift is all about moving analytical data away from a monolithic data warehouse or data lake into a distributed architecture - allowing data to be shared for analytical purposes in real time, right at the point of origin. The idea of data mesh was introduced by Zhamak Dehghani (Director of Emerging Technologies, Thoughtworks) in 2019. Here, she provides an introduction to data mesh and the fundamental problems it is trying to solve. Zhamak describes how the complexity of, and ambition for, data use have grown in today's industry. But what is data mesh? For over half a century, we've been trying to democratize data to deliver value and provide better analytic insights. With the ever-growing number of distributed domain data sets, diverse information arrives in increasing volumes and at high velocity. To remove friction and let data serve operational needs across various use cases, the best way is to mesh the data: connecting data in a peer-to-peer fashion and liberating it for analytics, machine learning, data-intensive applications across the organization, and more. Data mesh tackles the deficiencies of the traditional, centralized data lake and data warehouse platform architecture. The data mesh paradigm is founded on four principles: 1) domain-oriented ownership, 2) data as a product, 3) data available everywhere via self-serve data infrastructure, and 4) standardized data governance. A decentralized, technology-agnostic data architecture enables you to synthesize data and innovate. The starting point is embracing the ideology that data can be anywhere. Source-aligned data should serve as a product available for people across the organization to combine, explore, and use to drive actionable insights.
Zhamak and Tim also discuss the next steps needed to bring data mesh to life at the industry level. To learn more about the topic, you can visit the all-new Confluent Developer course, Data Mesh 101. Confluent Developer is a single destination with resources to begin your Kafka journey. EPISODE LINKS: Zhamak Dehghani: How to Build the Data Mesh Foundation; Data Mesh 101 Course; Saxo Bank's Best Practices for a Distributed Domain-Driven Architecture Founded on the Data Mesh; Placing Apache Kafka at the Heart of a Data Revolution at Saxo Bank; Watch the video version of this podcast; Join the Confluent Community; Learn Kafka on Confluent Developer; Live demo: Event-Driven Microservices with Confluent; Use PODCAST100 to get $100 of Confluent Cloud usage (details)
Zhamak Dehghani, Director of Emerging Technologies at ThoughtWorks joins Dave Vellante for theCUBE on Cloud 2021.
ThoughtWorks' Director of Next Tech Incubation Zhamak Dehghani joins us for a round of cocktails and talks about how she founded data mesh and the concepts behind it. She expounds on the decentralization and productization of data and discusses the importance of discovering new paradigm shifts in the industry.
A conversation with Iranian photographer, music-video and film director Zhamak Fullad, who has mainly been working in Canada and Los Angeles. We discuss how she got into the profession, the obstacles she has faced as a brown woman in the West, and more. Donate! Please consider donating towards our work: Patreon.com/habibicollective. A small monthly donation goes a long way towards paying innumerable costs, including screening fees for filmmakers, MGs, design assets, and the endless web costs of developing a streaming service. Habibi Collective operates completely on a volunteer-led basis, so it is vital that we stay independent. --- Support this podcast: https://anchor.fm/roisin-tapponi/support