Fresh from the TED conference, Tim Gasper and Juan Sequeda share the most thought-provoking ideas that caught their attention: What are you most excited about in the world of AI? What are you concerned about? How do you find a balance?
Juan Sequeda and Jesus Barrasa are among the top experts on graphs in the world. In this episode, we chat about the definitions of semantics, ontologies, and the differences between RDF and property graphs, etc. We also talk about how AI is giving graphs a new surge of interest.
Fresh from their travels meeting data leaders across industries, Tim Gasper and Juan Sequeda share the six most compelling data trends they've encountered in the field. In this special episode, our hosts cut through the hype to reveal what's actually happening on the ground - from how companies are really using AI to surprising shifts in data governance. Join them for an insider's look at what's shaping enterprise data strategies right now, straight from their conversations with practitioners and executives.
Tony Baer, Matt Housley, Juan Sequeda, and I recap our thoughts on Data Day Texas 2025.
Hugo Bowne-Anderson hosts a panel discussion from the MLOps World and Generative AI Summit in Austin, exploring the long-term growth of AI by distinguishing real problem-solving from trend-based solutions. If you're navigating the evolving landscape of generative AI, productionizing models, or questioning the hype, this episode dives into the tough questions shaping the field.
The panel features:
- Ben Taylor (Jepson) (https://www.linkedin.com/in/jepsontaylor/): CEO and Founder at VEOX Inc., with experience in AI exploration, genetic programming, and deep learning.
- Joe Reis (https://www.linkedin.com/in/josephreis/): Co-founder of Ternary Data and author of Fundamentals of Data Engineering.
- Juan Sequeda (https://www.linkedin.com/in/juansequeda/): Principal Scientist and Head of AI Lab at Data.World, known for his expertise in knowledge graphs and the semantic web.
The discussion unpacks essential topics such as:
- The shift from prompt engineering to goal engineering: letting AI iterate toward well-defined objectives.
- Whether generative AI is having an electricity moment or more of a blockchain trajectory.
- The combinatorial power of AI to explore new solutions, drawing parallels to AlphaZero redefining strategy games.
- The POC-to-production gap and why AI projects stall.
- Failure modes, hallucinations, and governance risks, and how to mitigate them.
- The disconnect between executive optimism and employee workload.
Hugo also mentions his upcoming workshop on escaping Proof-of-Concept Purgatory, which has evolved into a Maven course "Building LLM Applications for Data Scientists and Software Engineers" launching in January (https://maven.com/hugo-stefan/building-llm-apps-ds-and-swe-from-first-principles?utm_campaign=8123d0&utm_medium=partner&utm_source=instructor). Vanishing Gradients listeners can get 25% off the course (use the code VG25), with $1,000 in Modal compute credits included.
A huge thanks to Dave Scharbach and the Toronto Machine Learning Society for organizing the conference and to the audience for their thoughtful questions. As we head into the new year, this conversation offers a reality check amidst the growing AI agent hype.
LINKS
- Hugo on twitter (https://x.com/hugobowne)
- Hugo on LinkedIn (https://www.linkedin.com/in/hugo-bowne-anderson-045939a5/)
- Vanishing Gradients on twitter (https://x.com/vanishingdata)
- "Building LLM Applications for Data Scientists and Software Engineers" course (https://maven.com/hugo-stefan/building-llm-apps-ds-and-swe-from-first-principles?utm_campaign=8123d0&utm_medium=partner&utm_source=instructor)
As Season 8 of Catalog & Cocktails comes to a close, Tim and Juan reflect on the conversations, breakthroughs, and trends that shaped the past episodes. From unraveling the complexities of data quality to exploring the future of AI, this recap dives into the standout moments, key lessons, and recurring themes from our incredible guests. Tune in for an honest, no-BS discussion on what we've learned this season—and a sneak peek at what's coming next!
Tim and Juan are live from DGIQ + AIGov with their Honest, No-BS takeaways on the latest topics. Is it the same old story, or is there something new to pay attention to? What's actually working? What's worth keeping an eye on? And what are the standout success stories? Tune in this week to find out!
Juan Sequeda is a principal data scientist and head of the AI Lab at data.world, and is also the co-host of the fantastic data podcast Catalog and Cocktails. This episode tackles semantics, semantic web, Juan's research in how raw text-to-SQL performs versus text-to-semantic layer, and where we both believe AI will make an impact in the world of structured data analytics. For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com. The Analytics Engineering Podcast is sponsored by dbt Labs.
Are your outputs generating the right outcomes? I'm in Austin for Data Day Texas, and I reflect on this topic via a conversation I had last night with Juan Sequeda, Tim Gasper, and Santona Tuli. In 2024, outcomes will matter more than ever. What are you doing to drive the right outcomes for your organization?
In this episode, Nathan sits down with Juan Sequeda, Principal Scientist and Head of AI Lab at data.world. They discuss how knowledge graphs can be your organization's "brain" for AI, integrating structured and unstructured data, benchmarking enterprise AI systems, and more. If you need an ecommerce platform, check out our sponsor Shopify: https://shopify.com/cognitive for a $1/month trial period. We're hiring across the board at Turpentine and for Erik's personal team on other projects he's incubating. He's hiring a Chief of Staff, EA, Head of Special Projects, Investment Associate, and more. For a list of JDs, check out: eriktorenberg.com.
LINKS: Data.world: https://data.world/
SPONSORS:
Shopify is the global commerce platform that helps you sell at every stage of your business. Shopify powers 10% of ALL eCommerce in the US. And Shopify's the global force behind Allbirds, Rothy's, and Brooklinen, and 1,000,000s of other entrepreneurs across 175 countries. From their all-in-one e-commerce platform to their in-person POS system, wherever and whatever you're selling, Shopify's got you covered. With free Shopify Magic, sell more with less effort by whipping up captivating content that converts, from blog posts to product descriptions, using AI. Sign up for $1/month trial period: https://shopify.com/cognitive
Omneky is an omnichannel creative generation platform that lets you launch hundreds of thousands of ad iterations that actually work, customized across all platforms, with a click of a button. Omneky combines generative AI and real-time advertising data. Mention "Cog Rev" for 10% off www.omneky.com
NetSuite has 25 years of providing financial software for all your business needs. More than 36,000 businesses have already upgraded to NetSuite by Oracle, gaining visibility and control over their financials, inventory, HR, eCommerce, and more. If you're looking for an ERP platform, head to NetSuite: http://netsuite.com/cognitive and download your own customized KPI checklist.
X/SOCIALS: @labenz (Nathan) @juansequeda (Juan) @datadotworld (data.world) @CogRev_Podcast
TIMESTAMPS:
(00:00:00) - Introduction to Juan Sequeda and data.world
(00:01:11) - Discussion on data and generative AI
(00:06:15) - Data.world's origins as an open data catalog platform ("Github for data")
(00:09:35) - Using knowledge graphs and semantics to integrate and query data
(00:12:52) - Main use cases for data catalogs: search/discovery, governance, data operations
(00:15:00) - The process of building knowledge graphs automatically from data
(00:24:29) - AI for unlocking and capturing tribal business knowledge
(00:34:24) - Understanding the data landscape in enterprises
(00:38:32) - The emergence of knowledge engineers and data product managers
(00:40:44) - The consumer experience in data.world
(00:45:36) - The importance of context in data analysis
(00:46:58) - The role of AI in improving data analysis
(00:48:08) - The importance of accuracy and explainability in data analysis
(00:50:08) - Question cataloging in data analysis
(00:51:44) - The future potential of "chat with your data" interfaces
(01:18:02) - Finetuning with data vs metadata
(01:29:24) - Future of enterprise data teams
2023 is coming to a close and so is our season 6 with 20 episodes. Join Tim and Juan where they will provide the takeaway of takeaways for all the episodes of season 6. We will be back in January 2024 and kick off Season 7.
Join Shane Gibson as he chats with Juan Sequeda on Knowledge Graphs. You can get in touch with Juan via LinkedIn or at https://www.juansequeda.com If you want to read the transcript for the podcast head over to: https://agiledata.io/podcast/agiledata-podcast/knowledge-graphs-with-juan-sequeda/#read Listen to more podcasts on applying AgileData patterns over at https://agiledata.io/podcasts/ Read more on the AgileData Way of Working over at https://wow.agiledata.io/way-of-working/ If you want to join us on the next podcast, get in touch over at https://agiledata.io/podcasts/#contact Or if you just want to talk about making magic happen with agile and data you can connect with Shane @shagility or Nigel @nigelvining on LinkedIn. Subscribe: Apple Podcast | Spotify | Google Podcast | Amazon Audible | TuneIn | iHeartRadio | PlayerFM | Listen Notes | Podchaser | Deezer | Podcast Addict | Simply Magical Data
Investing in knowledge graphs provides higher accuracy for LLM-powered question-answering systems. That's the conclusion of the latest research that Juan Sequeda, Dean Allemang and Bryon Jacob have recently presented. In this episode, we will dive into the details of this research and understand why, to succeed in this AI world, enterprises must treat business context and semantics as first-class citizens.
What's the state of data stewardship today and where is it going? Will data stewards continue to exist? How is this evolving with respect to data products? And what is the impact of AI? All of these questions and more are what Tim and Juan ranted about in this episode.
Juan Sequeda and I chat about knowledge graphs (he's an OG in this area), the potential of LLMs on structured datasets, and much more. This is an honest, no-BS chat about the transition from a data-first world to a knowledge-first world. Enjoy! LinkedIn: https://www.linkedin.com/in/juansequeda/ data.world: https://data.world/product/ website: https://www.juansequeda.com/
Juan Sequeda, Principal Scientist and Head of AI Lab at data.world, shares case studies and best practices for creating cultures of enablement around data quality and behaviours.
Topics Include:
- Juan Sequeda introduction
- Foundation of knowledge & data graphs for AI
- 2 million users in the OpenData catalogue, 40% of Fortune 500
- OneWeb Case Study – data from satellites
- The Culture of Enablement to empower your organization with data
- OneWeb using AWS Sagemaker
- Prologis Real Estate logistics data case study
- Tying staff rewards to data quality and behaviours
- Developing a culture of change
- Session wrap up
Tim Gasper and Juan Sequeda debrief after 4 days on-site at Snowflake Summit 2023 in Las Vegas.
We all need a break sometimes. Juan is on vacation, but he wasn't skipping this week's episode of Catalog & Cocktails. Join Tim Gasper and Juan Sequeda for this week's mini-sode.
Are you criminally under-leveraging your data? This week Juan Sequeda and Tim Gasper will be joined by Maddy Want, VP of Data at Fanatics and author of “Precisely: Working with Precision Systems in a World of Data”, to discuss digital transformation stories from her book.
Join this week's episode of Catalog & Cocktails LIVE from the Knowledge Graph Conference. Hosts Tim Gasper and Juan Sequeda will be joined by Katariina Kari, Lead Ontologist at IKEA, to chat about all things Knowledge Graph.
Survival guides, rulebooks, blueprints and more. The best way to prepare yourself as a next generation data leader is this episode of Catalog & Cocktails with Tim Gasper, Juan Sequeda, and special guest Veronika Durgin, VP of Data at Saks 5th Avenue.
Get deep into some preparation-curriculum with topics like:
- Do's and Don'ts of Data Leaders
- Are you skeptical enough?
- How curious are you?
- Joining communities that educate and inspire
- and MORE
In this episode of Catalog & Cocktails, Kristin Schooley from Learning Care Group sits down with Juan Sequeda and Tim Gasper to discuss the importance of teamwork and building relationships when it comes to scaling an analytics team within a large organization. The conversation covers a lot of ground, including the need to understand what reporting is needed and what story is being told, how to ensure scalability and compliance, as well as the importance of measuring usage and interpreting why certain data may not be utilized.
Key Takeaways:
[00:00 - 01:10] Introduction & Cheers
[01:15 - 03:11] What was your favorite class in school and why?
[03:16 - 06:34] A data team of five
[06:36 - 11:29] Partnering with business units and thinking about what is being learned
[11:43 - 14:07] Data as a product and lifecycle management
[14:22 - 15:13] Data literacy isn't necessarily the phrase we should be using
[15:15 - 18:13] Business analytics and understanding how to present data
[18:16 - 22:51] How organizations organize their data teams with checks and balances
[22:52 - 26:18] Business literacy, centralized teams, and scaling beyond a bottleneck
[26:20 - 30:31] Incentivizing for having documentation up to date
[30:40 - 34:57] Lightning round
[35:00 - 39:46] Tim & Juan's Takeaways
[39:46 - 43:06] Three questions
Today's guest is Juan Sequeda, Principal Scientist at data.world and Co-Host of the Catalog & Cocktails Podcast. Juan joins Seth Earley and Chris Featherstone and shares how to understand the problem that you are trying to solve. Juan also discusses how your company's success should be defined differently. Don't focus just on saving money to make money. Focus on solving a problem. Juan also shares valuable advice on how understanding who you report to helps you speak the same language.
Takeaways:
- Juan believes the market is immature when it comes to what they want or what they think they want. This is where data catalogs become important so that companies can locate information. The data management world is focused only on technology. The problems it was trying to solve 30 years ago are the same problems it is still trying to solve today.
- If you are on the technical side of your business, it is important to understand who you should be reporting to. Understanding this early on will help you tailor information to meet the correct outcome.
- Juan's definition of a knowledge graph: real-world concepts and the relationships between those concepts, which together form a graph. The graph is valuable because it lets you integrate data coming from many diverse sources.
Quote of the Show: “Keep working on the same vision.” (07:50)
Links:
- Twitter
- LinkedIn
- Website
- Podcast: Catalog & Cocktails presented by data.world
- Juan's Portfolio
- Book: Integrating Relational Databases with the Semantic Web
Ways to Tune In: Website, Apple Podcast, Spotify, iHeart Radio, Stitcher, Amazon Music, Buzzsprout
Thanks to our sponsors: Marketing AI Institute, CMSWire, Earley Information Science, AI Powered Enterprise Book
Those who don't know their history are doomed to repeat it. If there is someone who can speak to data history, it's Bill Inmon. How did data warehouses start? Why is the computing profession still immature? Join Tim Gasper, Juan Sequeda and Bill Inmon, the father of the data warehouse and Founder of Forest Rim Technology, to learn about the past and present of data, and where things might be heading in the future.
Summary The data ecosystem has seen a constant flurry of activity for the past several years, and it shows no signs of slowing down. With all of the products, techniques, and buzzwords being discussed it can be easy to be overcome by the hype. In this episode Juan Sequeda and Tim Gasper from data.world share their views on the core principles that you can use to ground your work and avoid getting caught in the hype cycles. Announcements Hello and welcome to the Data Engineering Podcast, the show about modern data management When you're ready to build your next pipeline, or want to test out the projects you hear about on the show, you'll need somewhere to deploy it, so check out our friends at Linode. With their new managed database service you can launch a production ready MySQL, Postgres, or MongoDB cluster in minutes, with automated backups, 40 Gbps connections from your application hosts, and high throughput SSDs. Go to dataengineeringpodcast.com/linode (https://www.dataengineeringpodcast.com/linode) today and get a $100 credit to launch a database, create a Kubernetes cluster, or take advantage of all of their other services. And don't forget to thank them for their continued support of this show! Modern data teams are dealing with a lot of complexity in their data pipelines and analytical code. Monitoring data quality, tracing incidents, and testing changes can be daunting and often takes hours to days or even weeks. By the time errors have made their way into production, it's often too late and damage is done. Datafold built automated regression testing to help data and analytics engineers deal with data quality in their pull requests. Datafold shows how a change in SQL code affects your data, both on a statistical level and down to individual rows and values before it gets merged to production. No more shipping and praying, you can now know exactly what will change in your database! 
Datafold integrates with all major data warehouses as well as frameworks such as Airflow & dbt and seamlessly plugs into CI workflows. Visit dataengineeringpodcast.com/datafold (https://www.dataengineeringpodcast.com/datafold) today to book a demo with Datafold. RudderStack helps you build a customer data platform on your warehouse or data lake. Instead of trapping data in a black box, they enable you to easily collect customer data from the entire stack and build an identity graph on your warehouse, giving you full visibility and control. Their SDKs make event streaming from any app or website easy, and their extensive library of integrations enables you to automatically send data to hundreds of downstream tools. Sign up free at dataengineeringpodcast.com/rudder (https://www.dataengineeringpodcast.com/rudder) Build Data Pipelines. Not DAGs. That's the spirit behind Upsolver SQLake, a new self-service data pipeline platform that lets you build batch and streaming pipelines without falling into the black hole of DAG-based orchestration. All you do is write a query in SQL to declare your transformation, and SQLake will turn it into a continuous pipeline that scales to petabytes and delivers up to the minute fresh data. SQLake supports a broad set of transformations, including high-cardinality joins, aggregations, upserts and window operations. Output data can be streamed into a data lake for query engines like Presto, Trino or Spark SQL, a data warehouse like Snowflake or Redshift, or any other destination you choose. Pricing for SQLake is simple. You pay $99 per terabyte ingested into your data lake using SQLake, and run unlimited transformation pipelines for free. That way data engineers and data users can process to their heart's content without worrying about their cloud bill.
For data engineering podcast listeners, we're offering a 30-day trial with unlimited data, so go to dataengineeringpodcast.com/upsolver (https://www.dataengineeringpodcast.com/upsolver) today and see for yourself how to avoid DAG hell. Your host is Tobias Macey and today I'm interviewing Juan Sequeda and Tim Gasper about their views on the role of the data mesh paradigm for driving re-assessment of the foundational principles of data systems. Interview: Introduction. How did you get involved in the area of data management? What are the areas of the data ecosystem that you see the most turmoil and confusion? The past couple of years have brought a lot of attention to the idea of the "modern data stack". How has that influenced the ways that your and your customers' teams think about what skills they need to be effective? The other topic that is introducing a lot of confusion and uncertainty is the "data mesh". How has that changed the ways that teams think about who is involved in the technical and design conversations around data in an organization? Now that we, as an industry, have reached a new generational inflection about how data is generated, processed, and used, what are some of the foundational principles that have proven their worth? What are some of the new lessons that are showing the greatest promise? (data modeling; data platform/infrastructure; data collaboration; data governance/security/privacy) How does your work at data.world support these foundational practices? What are some of the ways that you work with your teams and customers to help them stay informed on industry practices? What is your process for understanding the balance between hype and reality as you encounter new ideas/technologies? What are some of the notable changes that have happened in the data.world product and market since I last had Bryon on the show in 2017? What are the most interesting, innovative, or unexpected ways that you have seen data.world used?
What are the most interesting, unexpected, or challenging lessons that you have learned while working on data.world? When is data.world the wrong choice? What do you have planned for the future of data.world? Contact Info: Juan: LinkedIn (https://www.linkedin.com/in/juansequeda/), @juansequeda (https://twitter.com/juansequeda) on Twitter, Website (https://www.juansequeda.com/). Tim: LinkedIn (https://www.linkedin.com/in/timgasper/), @TimGasper (https://twitter.com/TimGasper) on Twitter, Website (https://www.timgasper.com/). Parting Question: From your perspective, what is the biggest gap in the tooling or technology for data management today? Closing Announcements: Thank you for listening! Don't forget to check out our other shows. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used. The Machine Learning Podcast (https://www.themachinelearningpodcast.com) helps you go from idea to production with machine learning. Visit the site (https://www.dataengineeringpodcast.com) to subscribe to the show, sign up for the mailing list, and read the show notes. If you've learned something or tried out a project from the show then tell us about it! Email hosts@dataengineeringpodcast.com (mailto:hosts@dataengineeringpodcast.com) with your story.
To help other people find the show please leave a review on Apple Podcasts (https://podcasts.apple.com/us/podcast/data-engineering-podcast/id1193040557) and tell your friends and co-workers Links data.world (https://data.world/) Podcast Episode (https://www.dataengineeringpodcast.com/data-dot-world-with-bryon-jacob-episode-9/) Gartner Hype Cycle (https://www.gartner.com/en/information-technology/glossary/hype-cycle) Data Mesh (https://www.thoughtworks.com/en-us/what-we-do/data-and-ai/data-mesh) Modern Data Stack (https://tanay.substack.com/p/understanding-the-modern-data-stack) DataOps (https://en.wikipedia.org/wiki/DataOps) Data Observability (https://www.montecarlodata.com/blog-what-is-data-observability/) Data & AI Landscape (https://mattturck.com/data2021/) DataDog (https://www.datadoghq.com/) RDF == Resource Description Framework (https://en.wikipedia.org/wiki/Resource_Description_Framework) SPARQL (https://en.wikipedia.org/wiki/SPARQL) Moshe Vardi (https://en.wikipedia.org/wiki/Moshe_Vardi) Star Schema (https://en.wikipedia.org/wiki/Star_schema) Data Vault (https://en.wikipedia.org/wiki/Data_vault_modeling) Podcast Episode (https://www.dataengineeringpodcast.com/data-vault-data-modeling-episode-119/) BPMN == Business Process Modeling Notation (https://en.wikipedia.org/wiki/Business_Process_Model_and_Notation) The intro and outro music is from The Hug (http://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/Love_death_and_a_drunken_monkey/04_-_The_Hug) by The Freak Fandango Orchestra (http://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/) / CC BY-SA (http://creativecommons.org/licenses/by-sa/3.0/)
Well, another incredible season of Catalog & Cocktails concludes this week with hosts Tim Gasper and Juan Sequeda. Join in for the ultimate takeaways of the takeaways as Tim and Juan recap best moments, favorite hot takes, and the most controversial opinions over the last season. Listeners, please submit your feedback: https://forms.gle/FdjMfarUaVnJ3SzB9
You have to have a lot of data to get AI to work. But the data folks are not jumping on it as fast as they should. So what happens when data teams aren't up to speed, companies are hiring more data scientists than they are engineers, AND current data teams are focusing too much on biz reporting and not supporting AI?
This week on Catalog & Cocktails, join hosts Tim Gasper and Juan Sequeda as they chat with special guest Theresa Kushner, Head of North America Innovation Center at NTT Data Services, to discuss how the AI train is leaving the station and data teams can only run so fast.
Key Takeaways
[00:10 - 02:25] Introduction & Cheers
[02:28 - 04:12] What's your favorite way to travel and why?
[04:15 - 07:01] Are data teams keeping up with AI teams?
[07:01 - 08:50] Are data teams and AI teams helping each other or avoiding each other?
[08:54 - 13:09] AI teams become a data set in themselves
[13:10 - 14:45] Data ownership and control
[14:51 - 17:20] Thoughts on purchasing data
[17:20 - 20:53] Data products and observations
[20:53 - 24:52] CDO versus the CDAO, definitions and comparisons
[24:53 - 27:05] Should there be a CDO or a CDAO?
[27:03 - 30:04] Data makes AI work
[30:02 - 34:58] If you want results you have to collaborate
[34:59 - 37:29] Creating culture tied to data quality
[37:27 - 42:01] The skill sets for managing data products
[42:02 - 46:09] Theresa's message of advice to data teams
[46:12 - 52:36] Lightning Round
[52:36 - 58:16] Takeaways
[58:17 - 01:00:31] Three questions
In this episode, Juan Sequeda, Principal Scientist, and Tim Gasper, Chief Customer Officer and Product Strategist, from data.world discuss the importance of empowering business and technical users to understand, locate, trust, and get value from data in a "knowledge-first world." Juan and Tim explain how the current "data-first world" practice focuses on data literacy rather than business literacy, leaving a disconnect between data and business teams.
To learn more, follow Juan Sequeda and Tim Gasper on LinkedIn and tune in to their podcast, "Catalog and Cocktails," an honest, no BS, non-salesy conversation about enterprise data and analytics.
What's New In Data is a data thought leadership series hosted by John Kutay, who leads data and products at Striim. What's New In Data hosts industry practitioners to discuss the latest trends, common patterns for real-world data, and analytics success stories.
*Raw audio while processing internal technical difficulties*
Machines and people. Why can't we just speak the same language? The truth is we can, and doing so could make life demonstrably better for data scientists. Yet here we are, living in a world of rows and columns that few people outside of the data owner understand.
Join this week's episode of Catalog & Cocktails as hosts Juan Sequeda and Tim Gasper, with special guest Dan Bennett, tackle semantics and how to get everyone, machines and people, on the same page.
Data modeling is seeing a massive resurgence of attention lately. How do you cut through the noise and know what's useful for your situation? Juan Sequeda (data.world) joins the show to chat about honest, no-BS data modeling, aka data modeling for the real world. #datamodel #dataengineering #data
When it comes to leading a successful business, it is crucial to remain data-driven. But being overly technical in your approach can often take away from the social needs of your enterprise.
Malcolm and Dr. Juan Sequeda focus primarily on four key topics: data as a product, the data mesh phenomenon, why data leaders are incorrectly focused on technology, and how taking a more 'social' approach, as advocated by the data mesh, will deliver superior results.
Dr. Sequeda breaks down data-related technologies into three core principles that he argues have changed little over the last several decades. CDOs with more of a business or non-technical background will appreciate how Dr. Sequeda is able to distill the complexities of the modern data estate into a simplified model, and how he warns that various data management vendors continue to complicate matters by focusing too much on software tools and features. While exploring ways for data leaders to extricate themselves from technology-first approaches, the two explore the growing trend towards data as a product and how CDOs can benefit from it. Dr. Sequeda shares his 'ABC' framework for approaching data as a product that CDOs from all backgrounds can quickly use within their data organizations. Dr. Sequeda both challenges and acknowledges the benefits of data centralization during a discussion focused on how master data management (MDM) is still needed by all organizations despite the decentralized approach advocated by the data mesh. Ultimately, it should be no surprise that a noted scholar on knowledge graphs believes that context and semantics should drive more modern approaches to governance and MDM, where the context or use case of data ultimately determines what rules/policies should be defined rather than the data itself. This episode of CDO Matters will help less technical CDOs understand the underlying data semantics and why the data mesh, most especially the 'data as a product' phenomenon, is worthy of consideration.
Prioritizing efforts to integrate product management disciplines in data management, at both centralized and decentralized levels, will ultimately help data leaders to drive superior results by being more driven socially.
Key Moments
[4:24] Bridging Tech and Business
[6:06] Defining Data Mesh for Your Organization
[8:20] A Social-first Approach to the Data Mesh
[10:52] What Comes After Data Decentralization?
[15:10] The 3 Principles of the Data Stack
[16:01] Modern Data Developments and How Data Software Categories Drive the Conversation
[17:05] Social vs. Cultural Business Approaches
[20:15] Metadata Serving as the Glue Behind Data
[23:12] Operational Focus of the Data Mesh
[25:20] The Relevance of Master Data Management (MDM) Today
[28:30] Powering a Data Fabric with a Semantic Layer
[33:20] Data Centralization through Governance
Key Takeaways
Bridging Technology and Business for CDOs [5:05 - 6:03]
“I would say you need to have people on your team who can be those bridges…who will be able to fill that gap [between technology and business]. As a leader, you want to understand the overview of things, but you also want to feel empowered by having the best people around you.” - Dr. Juan Sequeda
Is Data Mesh a Software Category? [7:16 - 8:14]
“Data mesh is a social-technical paradigm shift, it is not something you buy… if somebody is selling you a data mesh, please run far away as fast as you can from that vendor because they are selling you B.S.” - Dr. Juan Sequeda
The 3 Principles of the Data Stack [15:06 - 16:49]
“We talk about the modern data stack…look at the principles…here is this box and it has inputs and outputs. It is the three main boxes. One is the box that moves data. Data comes in, data comes out. Then you have another box where data comes in, questions come in and answers come out. That is your storage and compute…then you have another box where different questions come out. That is your analytics.” - Dr. Juan Sequeda
The Problem with Being Overly Tech-Focused [17:05 - 17:42]
“The issue here is that we have been defining success from a technical perspective, which is 'my data is now in one place,' but that was not the goal…define success from the social perspective about the needs of the business.” - Dr. Juan Sequeda
About Dr. Juan Sequeda
Dr. Juan Sequeda is the Principal Scientist at Data.World and the co-host of the Catalog & Cocktails podcast. Juan holds a Ph.D. in Computer Science from the University of Texas at Austin and is a noted scholar and researcher in the fields of semantic technologies, including knowledge graphs. He is a frequent public speaker at data and analytics conferences across the globe and is passionate about helping data leaders implement more modern and innovative approaches to both data strategy and data management.
EPISODE LINKS & RESOURCES:
- Connect with Juan on LinkedIn
- Visit Data.World
- Check out the Catalog & Cocktails podcast
The average CDO has 2 years to turn data into business value. Today we talk about data products, implicit and explicit knowledge, and the cultural revolution we need to turn data into knowledge at scale. My guests today are Juan Sequeda and Tim Gasper from data.world, co-hosts of the Catalog & Cocktails podcast. Episode page https://www.discoveringdata.com/podcast/episode-044 (https://www.discoveringdata.com/podcast/episode-044) Events coming up The Data Management Marathon 5.0 is coming up soon, October 12-13, 2022. This is a one-of-a-kind virtual event for every data lover, born out of a love for data management, knowledge sharing and storytelling. Organized by Thinklinkers in collaboration with the one and only Scott Taylor, this event features high-level speakers, influencers and an enthusiastic community of data professionals. Check out the event agenda and use our special promo code DISCOVER20 to get 20% off the pro pass: https://bit.ly/3BXtkiR (https://bit.ly/3BXtkiR) For Brands Do you want to showcase your thought leadership with great content and build trust with a global audience of data leaders? We publish conversations with industry leaders to help practitioners create more business outcomes. Explore all the ways to tell your data story here https://www.discoveringdata.com/brands (https://www.discoveringdata.com/brands). For sponsors Want to help educate the next generation of data leaders? As a sponsor, you get to hang out with the very best in the industry. Want to see if you are a match? Apply now: https://www.discoveringdata.com/sponsors (https://www.discoveringdata.com/sponsors) For Guests Do you enjoy educating an audience? Do you want to help data leaders build indispensable data products? That's awesome! Great episodes start with a clear transformation. Pitch your idea at https://www.discoveringdata.com/guest (https://www.discoveringdata.com/guest).
Answering critical business questions relies on integrating data from a variety of systems. But it takes a lot of work to understand what the disparate data means and how it all fits together. How do we make data as reliable as electricity? Join Tim Gasper, Juan Sequeda and Fraser Harris, VP of Product at Fivetran, as they celebrate the 100th live episode of Catalog & Cocktails and discuss how #metadata, #datacatalogs, and #dataintegration act as the power source for your connected enterprise.
Catalog & Cocktails is BACK! Tune in for the SEASON FOUR PREMIERE with hosts Juan Sequeda and Tim Gasper! Bridging the gap between business and IT is an age-old problem. IT doesn't understand the business. The business doesn't understand IT. Everyone looks at IT and business as if they were separate and points fingers, when they should actually be working together. Join Tim, Juan and Vipul P., Global Head of Data Management at WPP, as they discuss approaches to bridge the gap between business and IT.
There's much more to data than just...data. There's knowledge. Juan Sequeda and Tim Gasper from data.world join the show to chat about why we need to change our focus from data to knowledge. https://data.world/ #data #knowledge
On this podcast, APQC's Mercy Harper and Lauren Trees talk with Juan Sequeda, PhD, principal scientist at Data.World, about what knowledge graphs are and the people and process steps you need to bring this powerful technology into your own organization. Check out Juan's book, Designing and Building Enterprise Knowledge Graphs, his recent article for the Association for Computing Machinery, and his podcast, Catalog and Cocktails. Subscribe to us on iTunes or wherever you get your podcasts!
Provided as a free resource by DataStax https://www.datastax.com/products/datastax-astra?utm_source=DataMeshRadio (AstraDB) https://www.patreon.com/datameshradio (Data Mesh Radio Patreon) - get access to interviews well before they are released Knowledge Graph Conference website: https://www.knowledgegraph.tech/ (https://www.knowledgegraph.tech/) Free Ticket Raffle for Knowledge Graph Conference (submissions must be by April 18 at 11:59pm PST): https://forms.gle/Gy8KSMNDxbBfib2Z6 (Google Form) Thank you to our contributors! You can find additional introductory resources below. Contributors and their contact info: Karen Passmore, CEO and Founder of Predictive UX Karen LinkedIn: https://www.linkedin.com/in/karenpassmore/ (https://www.linkedin.com/in/karenpassmore/) https://daappod.com/data-mesh-radio/dont-sleep-on-ux-karen-passmore-steve-stesney/ (DMR Episode 30) Xhensila Poda, Machine Learning Engineer at CARDO AI Xhensila's LinkedIn: https://www.linkedin.com/in/xhensilapoda/ (https://www.linkedin.com/in/xhensilapoda/) Jens Scheidtmann, Lead Architect at Bayer Jens' LinkedIn: https://www.linkedin.com/in/jens-scheidtmann/ (https://www.linkedin.com/in/jens-scheidtmann/) Juan Sequeda, Principal Scientist at Data.world Juan's LinkedIn: https://www.linkedin.com/in/juansequeda/ (https://www.linkedin.com/in/juansequeda/) https://daappod.com/data-mesh-radio/knowledge-first-juan-sequeda/ (DMR Episode 14) Steve Stesney, Senior Product Lead and Data Practice Lead at Predictive UX Steve's LinkedIn: https://www.linkedin.com/in/stephenstesney/ (https://www.linkedin.com/in/stephenstesney/) https://daappod.com/data-mesh-radio/dont-sleep-on-ux-karen-passmore-steve-stesney/ (DMR Episode 30) Tim Tischler, Principal Engineer at Wayfair Tim's LinkedIn: https://www.linkedin.com/in/timtischler/ (https://www.linkedin.com/in/timtischler/) https://daappod.com/data-mesh-radio/data-mesh-resilience-tim-tischler/ (DMR Episode 43) Further Introductory Resources (provided by Ellie Young and Juan 
Sequeda): What is a Knowledge Graph video by Martin Keen: https://www.youtube.com/watch?v=y7sXDpffzQQ (https://www.youtube.com/watch?v=y7sXDpffzQQ) Knowledge graphs: Introduction, history, and perspectives (paper): https://onlinelibrary.wiley.com/doi/10.1002/aaai.12033 (https://onlinelibrary.wiley.com/doi/10.1002/aaai.12033) Knowledge Graphs book (free): https://kgbook.org/ (https://kgbook.org/) Knowledge Graphs paper by Juan Sequeda and Claudio Gutierrez: https://cacm.acm.org/magazines/2021/3/250711-knowledge-graphs/fulltext (https://cacm.acm.org/magazines/2021/3/250711-knowledge-graphs/fulltext) Juan's book "http://www.morganclaypoolpublishers.com/catalog_Orig/product_info.php?products_id=1658 (Designing and Building Enterprise Knowledge Graphs)", 25% discount using code DATAWORLD Data Mesh Radio is hosted by Scott Hirleman. If you want to connect with Scott, reach out to him at community at datameshlearning.com or on LinkedIn: https://www.linkedin.com/in/scotthirleman/ (https://www.linkedin.com/in/scotthirleman/) If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/ (https://datameshlearning.com/community/) If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see https://docs.google.com/document/d/1WkXLhSH7mnbjfTChD0uuYeIF5Tj0UBLUP4Jvl20Ym10/edit?usp=sharing (here) All music used this episode was found on PixaBay and was created by (including slight edits by Scott Hirleman): https://pixabay.com/users/lesfm-22579021/ (Lesfm), https://pixabay.com/users/mondayhopes-22948862/?tab=audio (MondayHopes), https://pixabay.com/users/sergequadrado-24990007/ (SergeQuadrado), and/or https://pixabay.com/users/nevesf-5724572/ (nevesf) Data Mesh Radio is brought to you as a community resource by DataStax. Check out their high-scale, multi-region database offering (w/ lots of great APIs) and use code DAAP500 for a free...
Provided as a free resource by DataStax https://www.datastax.com/products/datastax-astra?utm_source=DataMeshRadio (AstraDB) In this episode, Scott interviews Juan Sequeda, Principal Scientist at data.world and co-host of the Catalog and Cocktails podcast. They discussed Juan's knowledge first approach: putting the meaning and value of the data first instead of focusing on the amount of data we are handling/producing. Knowledge first has 3 components, 1) context, 2) people, and 3) relationships. Juan is a big proponent of knowledge graphs and the relationships side is one many people miss. Juan also gave some thoughts on what his approach to data mesh hinges on: treating data as a product and finding a balance between centralization and decentralization for all the aspects of building out an implementation. Juan mentioned Intuit's approach of fixed, flexible/extensible, or customizable as a good general tool and to look for (and embrace) what he calls intellectual friction. Lastly, Juan and Scott talked about the general drive to reduce toil, of reinventing the wheel re data interoperability and standard schemas in data mesh. Juan points to a lot of existing research and standards - e.g. RDF, OWL, and many more (see below) - as a starting point. 
Juan's contact info and related links: Email: juan at data.world Twitter: @juansequeda / https://twitter.com/juansequeda (https://twitter.com/juansequeda) LinkedIn: https://www.linkedin.com/in/juansequeda/ (https://www.linkedin.com/in/juansequeda/) Catalog & Cocktails Podcast: https://data.world/podcasts/ (https://data.world/podcasts/) Juan's post about Zhamak's appearance on the Data Engineering Podcast: https://www.linkedin.com/pulse/my-takeaways-data-engineering-podcast-episode-mesh-zhamak-sequeda/ (https://www.linkedin.com/pulse/my-takeaways-data-engineering-podcast-episode-mesh-zhamak-sequeda/) Juan's post about knowledge first: https://www.linkedin.com/feed/update/urn:li:activity:6884179569277059072/ (https://www.linkedin.com/feed/update/urn:li:activity:6884179569277059072/) Standards related links: Dublin Core Metadata Initiative: https://dublincore.org/ (https://dublincore.org/) RDF (Resource Description Framework): https://www.w3.org/2001/sw/wiki/RDF (https://www.w3.org/2001/sw/wiki/RDF) OWL (Web Ontology Language): https://www.w3.org/OWL/ (https://www.w3.org/OWL/) PROV-O: The PROV Ontology: https://www.w3.org/TR/prov-o/ (https://www.w3.org/TR/prov-o/) Data Mesh Radio is hosted by Scott Hirleman. 
If you want to connect with Scott, reach out to him at community at datameshlearning.com or on LinkedIn: https://www.linkedin.com/in/scotthirleman/ (https://www.linkedin.com/in/scotthirleman/) If you want to learn more and/or join the Data Mesh Learning Community, see here: https://datameshlearning.com/community/ (https://datameshlearning.com/community/) If you want to be a guest or give feedback (suggestions for topics, comments, etc.), please see https://docs.google.com/document/d/1WkXLhSH7mnbjfTChD0uuYeIF5Tj0UBLUP4Jvl20Ym10/edit?usp=sharing (here) All music used this episode created by Lesfm (intro includes slight edits by Scott Hirleman): https://pixabay.com/users/lesfm-22579021/ (https://pixabay.com/users/lesfm-22579021/) Data Mesh Radio is brought to you as a community resource by DataStax. Check out their high-scale, multi-region database offering (w/ lots of great APIs) and use code DAAP500 for a free $500 credit (apply under "add payment"): https://www.datastax.com/products/datastax-astra?utm_source=DataMeshRadio (AstraDB)
The concept of a business glossary goes back decades, but it never really worked out. Today, data catalogs have been catapulted to enterprise fame, helping organizations weave together meaningful understanding of even the most complex environments. Find out all the latest on this episode of #DMRadio, as host @eric_kavanagh interviews several experts: Juan Sequeda, Data.World; Rick Sherman, Athena IT Solutions; Sarbjeet Johal, Analyst; Luc Legardeur, Zeenea.
Data Architecture is evolving and there are many questions with various perspectives. What is the balance between centralization and decentralization? How do you start treating data as a product? How do you incentivize people? What's the role of Data Mesh, Data Fabric, Knowledge Graphs? This special edition of Catalog and Cocktails is the Data Architecture panel from the Knowledge Graph Conference, moderated by Juan Sequeda. Listen and learn from Teresa Tung, Chief Technologist of Accenture's Cloud First group; Zhamak Dehghani, director of emerging technologies at Thoughtworks and founder of the Data Mesh concept; and Jay Yu, Distinguished Architect at Intuit. You can watch the panel here and follow Juan's takeaways.