Podcasts about data lineage

  • 33 podcasts
  • 49 episodes
  • 44m average duration
  • 1 new episode monthly
  • Latest episode: Jul 3, 2024

POPULARITY (chart by year, 2017 to 2024)


Latest podcast episodes about data lineage

The AI Fundamentalists
Data lineage and AI: Ensuring quality and compliance with Matt Barlin

Jul 3, 2024 · 28:29 · Transcription available


Ready to uncover the secrets of modern systems engineering and the future of AI? Join us for an enlightening conversation with Matt Barlin, the Chief Science Officer of Valence. Matt's extensive background in systems engineering and data lineage sets the stage for a fascinating discussion. He sheds light on the historical evolution of the field, the critical role of documentation, and the early detection of defects in complex systems. This episode promises to expand your understanding of model-based systems and data issues, offering valuable insights that only an expert of Matt's caliber can provide.

In the heart of our episode, we dive into the fundamentals and transformative benefits of data lineage in AI. Matt draws intriguing parallels between data lineage and the engineering life cycle, stressing the importance of tracking data origins, access rights, and verification processes. Discover how decentralized identifiers are paving the way for individuals to control and monetize their own data. With the phasing out of third-party cookies and the challenges of human-generated training data shortages, we explore how systems like retrieval-augmented generation (RAG) and compliance regulations like the EU AI Act are shaping the landscape of AI data quality and compliance. Don't miss this thought-provoking episode that promises to keep you at the forefront of responsible AI.

What did you think? Let us know.

Do you have a question or a discussion topic for the AI Fundamentalists? Connect with them to comment on your favorite topics:
- LinkedIn: Episode summaries, shares of cited articles, and more.
- YouTube: Was it something that we said? Good. Share your favorite quotes.
- Visit our page: See past episodes and submit your feedback! It continues to inspire future episodes.

Open||Source||Data
Redefining AI Ethics: The Key Role of Explainability with Beth Rudden

Jul 2, 2024 · 53:18


Timestamps
00:00:00 - Intro
00:02:00 - Beth's Journey
00:19:33 - Ontologies in AI
00:21:44 - Data Lineage and Provenance
00:32:52 - Open Source Tools
00:38:38 - Explainable AI
00:44:58 - Inspiration from Nature

Quotes
Beth Rudden: "The best thing that I could tell you that I see is that it's going to shift from more pure mathematical and statistical to much more semantic, more qualitative. Instead of quantity, we're going to have quality."
Charna Parkey: "I love that because I've been so mathematical for most of my life. I didn't have a lot of words for the feelings or expressions, right? And so I had sort of this lack of data and the Brené Brown reference you make, like I have many of her books on my shelf and I often pull, I don't even know where it is right now, but the Atlas of the Heart because I am having this feeling and I don't know what it is."

Links
Connect with Beth
Connect with Charna

The Ravit Show
Data Lineage, Data Visibility, Management, and Compliance with Yael Ben Arie

Apr 16, 2024 · 10:51


In this episode of The Ravit Show, I interview Yael Ben Arie (Steinberger) at the Gartner Data & Analytics Summit. Yael is the CEO of OCTOPAI, the leader in enterprise Metadata Management & Data Lineage. Yael shares her insights on what she is hearing from data leaders at this year's event, and why she thinks Data Lineage can be easy for data teams struggling with data visibility, management, and compliance. Watch the video to hear her takeaways. #data #ai #gartnerorlando #octopai #theravitshow

The Data Democracy
Episode 3 w/ Irina Steenbeek - Beyond Tooling: The Strategic Importance of Data Lineage in Democratization

Sep 20, 2023 · 47:19


In this podcast episode, host Ole Olesen-Bagneux interviews Dr. Irina Steenbeek, a seasoned data management expert, to unravel the complex fabric of data democracy and its intertwined relationship with data lineage. Stepping into the realms of data democratization, the conversation emphasizes the need for prioritizing organizational objectives over mere tool selection. Irina shares her insights on the challenges of achieving comprehensive data lineage in multi-cloud environments, drawing attention to the blend of automation and manual efforts required. The duo also sheds light on the rising complexities of data mesh, underscoring the need for clarity in its definition and execution. Further enriching the discussion, Irina's anecdotes from her professional journey illustrate the motivation behind persevering in data management projects, while also highlighting the challenges and unpredictability inherent to the field.

Unf*ck Your Data
Data Governance, or the Referees on the Data Playing Field | Tiankai Feng

Jul 4, 2023 · 31:42


Is data governance really as dry as the term sounds? And why is it essential for clean data work?

Christian Krug, host of the podcast "Unf*ck Your Data", discusses this with Tiankai Feng, Senior Director, Product Data Governance at Adidas.

The best analogy for data governance is the referee in sports. It governs the basic handling of data: how, why, and for what purpose data is collected in the company, and what it is used for.

The goal is clear: maximum transparency about the data.

Only if I know how data came about can I interpret it correctly. From business analytics to complex AI algorithms, if the data quality is poor, wrong decisions get made.

To prevent exactly that, the people working in data governance have tools such as a data catalog, data flows, or data lineage. With these they show what the family tree of the data looks like and which data exists in the first place. But here too Tiankai notes: we have all the technical means to implement it, yet the human factor gets in the way. That is why one of the most important parts of the work is change management. When employees understand why they have to follow certain rules or record things differently, they are more willing to do so. Ideally, it even makes their work easier.

As so often, the foundational topic of data governance sounds dry and unwieldy. But if your foundation isn't solid, your house wobbles. And it is not nearly as unsexy as the name suggests. Here too, you can approach the topic with common sense and a sure touch.

Jessica's post on AI bias: https://www.linkedin.com/feed/update/urn:li:activity:7070281600961261568/

Profiles:
- Tiankai's LinkedIn profile: https://www.linkedin.com/in/tiankaifeng/
- Christian's LinkedIn profile: https://www.linkedin.com/in/christian-krug/
- Unf*ck Your Data on LinkedIn: https://www.linkedin.com/company/unfck-your-data

Book recommendation:
- Tiankai's book recommendation: Disrupting Data Governance by Laura Madsen
- All recommendations in Melena's bookshop: https://gunzenhausen.buchhandlung.de/unfuckyourdata

Where to find Unf*ck Your Data:
- Podcast on Spotify: https://open.spotify.com/show/6Ow7ySMbgnir27etMYkpxT?si=dc0fd2b3c6454bfa
- Podcast on iTunes: https://podcasts.apple.com/de/podcast/unf-ck-your-data/id1673832019
- Podcast on Google: https://podcasts.google.com/feed/aHR0cHM6Ly9mZWVkcy5jYXB0aXZhdGUuZm0vdW5mY2steW91ci1kYXRhLw?ep=14
- Podcast on Deezer: https://deezer.page.link/FnT5kRSjf2k54iib6

Contact:
- Email: christian@uyd-podcast.com

Timestamps:
00:00 Intro
01:44 Tiankai introduces himself
02:45 Data governance regulates how data is handled in the company
03:35 The referees on the data playing field monitor compliance
04:43 Data gets used for purposes other than those it was collected for
07:12 Data governance ensures data is used correctly
08:10 The data catalog as a tool creates a high level of transparency
09:37 Data lineage as the family tree of the data
11:37 The catalog makes technical keys readable and understandable
12:12 Data governance is a people business and change management
15:36 External and internal factors determine the requirements
17:15 Data governance raises data quality
19:19 Responsible AI is not possible without governance of the underlying data
20:23 Data governance ensures technical data quality...

Drill to Detail
Drill to Detail Ep.105 'Data Catalogs, Data Discovery and Data Lineage for the Modern Data Stack' with Special Guest Shinji Kim

Jun 1, 2023 · 37:44


Mark is joined by Shinji Kim, Founder and CEO of Select Star, to talk about their mission to re-invent data catalogs, data discovery and data lineage for the modern data stack.

- Data Discovery vs. Data Observability: Understanding the Differences for Better DataOps
- Select Star and dbt Labs Partner for Better Data Discovery on dbt
- Select Star: Free Trial

Engenharia de Dados [Cast]
The Power of Data Lineage with Lucas Galindo & Gabs Ferreira of Alvin

Apr 19, 2023 · 55:49


In today's episode, Luan Moreno and Mateus Oliveira interview Lucas Galindo, Data Engineer/Software Engineer, and Gabs Ferreira, Community Builder, both of whom work at Alvin.

The Alvin solution automatically creates and maintains a graph dataset by connecting to data sources such as Snowflake and Redshift, among others, delivering a robust data lineage solution.

Alvin offers the following benefits:
- An automated way to detect and track pipeline errors and bugs, reducing data downtime.
- Automated regression testing, with a detailed report of the downstream impact before code is deployed.
- Automatic mapping of data flows within and between systems, showing how they are consumed across the company.

The episode also covers topics such as data governance, democratization, and data quality:
- Data lineage and data governance concepts.
- Features and integrations available on the Alvin platform.
- Market differences (national and international).
- The data community.

Understand why we need data governance and how Alvin can deliver a product focused on data lineage that adds value for its customers.

Alvin
Gabs Ferreira
Lucas Galindo
Luan Moreno = https://www.linkedin.com/in/luanmoreno/

AgileBI
AgileData #39 - Data Lineage Patterns - Tomas Kratky

Mar 28, 2023 · 53:16


Join Shane Gibson as he chats with Tomas Kratky about his experience defining data lineage and DataOps patterns. You can get in touch with Tomas on LinkedIn.

If you want to read the transcript for the podcast, head over to: https://agiledata.io/podcast/agiledata-podcast/data-lineage-patterns-tomas-kratky/#transcript

Listen to more podcasts on applying AgileData patterns over at https://agiledata.io/podcasts/

Read more on the AgileData Way of Working over at https://wow.agiledata.io/way-of-working/

If you want to join us on the next podcast, get in touch over at https://agiledata.io/podcasts/#contact

Or if you just want to talk about making magic happen with agile and data, you can connect with Shane @shagility or Nigel @nigelvining on LinkedIn.

Subscribe: Apple Podcast | Spotify | Google Podcast | Amazon Audible | TuneIn | iHeartRadio | PlayerFM | Listen Notes | Podchaser

Simply Magical Data

BI or DIE
Reporting Catalog & Self Service | BI or DIE x PwC | Part 1 - Tobias Elle & Jumen Rest

Feb 28, 2023 · 36:39


What do rental cars actually have to do with reporting? Andreas talks with Tobias and Jumen about tool selection, the definition of self service, and where it reaches its limits. Also covered:
- IT or business department: who should be responsible for self-service projects?
- Notation concept or analysis tool: which matters more?
- Information design: how much design ambition is appropriate for visualizations?
- Reporting: what are the current challenges in design and implementation?

Tobias and Jumen are managers at PwC, lead the reporting team in the Data & Analytics practice, and work with their clients every day on all topics around reporting. Whether it is introducing self-service solutions with suitable governance, setting up shared service centers, or the end-to-end implementation of complex reporting applications, Tobias, Jumen, and the Data & Analytics team at PwC advise and implement along the entire reporting value chain.

From Research to Reality: The Hewlett Packard Labs Podcast
Technical Leadership, Suparna Bhattacharya

Feb 23, 2023 · 23:27


In this episode of the Hewlett Packard Labs podcast “From Research to Reality,” Dejan Milojicic hosts Suparna Bhattacharya, newly promoted HPE Fellow and VP at Hewlett Packard Labs. The focus is on technical leadership. Suparna discusses the importance of being a fellow, her work with data and trustworthy AI, and advice for colleagues in her field. She emphasizes the importance of data for AI and open-source applications. She concludes by sharing anecdotes from her personal life.  

From Research to Reality: The Hewlett Packard Labs Podcast
Upcoming Hewlett Packard Labs Podcast: Technical Leadership, Suparna Bhattacharya

Feb 16, 2023 · 1:16


In next week's episode of the Hewlett Packard Labs podcast “From Research to Reality,” Dejan Milojicic hosts Suparna Bhattacharya, newly promoted HPE Fellow and VP at Hewlett Packard Labs. The focus will be on technical leadership. Suparna will discuss the importance of being a fellow, her work with data and trustworthy AI, and advice for colleagues in her field. She will emphasize the importance of data for AI and open-source applications, and will conclude by sharing anecdotes from her personal life.

The Shifting Privacy Left Podcast
S1E5: The Rise of Global Data Sharing Platforms with Stephen Wilson (Constellation Research)

Nov 22, 2022 · 59:24 · Transcription available


I'm joined by Stephen Wilson, an accomplished data protection innovator, researcher, analyst, and advisor who leads Digital Safety and Privacy efforts at Constellation Research and is Managing Director of Lockstep Technologies. In our conversation, we discuss the importance of information value chains, the emergence of data sharing platforms, and why data should be like clean drinking water, and we explore the problems with "data ownership."

Thank you to our sponsor, Privado, the developer-friendly privacy platform.

Stephen explains the push for more data sharing and for user-centric business models that deliver value for businesses and benefits for individuals. We discuss emerging tools that assure the orderliness, fairness, and transparency of information value chains, and why Stephen aims to take data processing "out of the shadows" with his research.

Lastly, we discuss key Facebook & Google EU court cases that address the collection and use of facial biometrics from people without sufficient consent, and the challenges that Google and other search engines have with addressing "the right to be forgotten." Plus, we discuss the privacy expectations within the 'digital town square,' particularly through the lens of Twitter and Facebook.

Listen to the episode on Apple Podcasts, Spotify, iHeartRadio, or on your favorite podcast platform.

Topics Covered:
- Stephen's assertion that privacy is about restraint: what you choose to not know.
- The rise of data sharing platforms to facilitate and scale global information value chains.
- How, if data is like “crude oil,” it requires safe handling, and why we should treat data like "clean drinking water" instead.
- The importance of data quality, data originality, and data lineage.
- Stephen's analysis of the growing market for “Data Protection as a Service," which includes data clean rooms, privacy APIs, and more.
- Why you don't need to own your own data to get good privacy outcomes.

Resources Mentioned:
- Read the 2021 Data for Better Lives report (World Bank)
- Privado.ai: Privacy assurance at the speed of product development. Get instant visibility w/ privacy code scans.
- Shifting Privacy Left Media: Where privacy engineers gather, share, & learn
- Buzzsprout: Launch your podcast

Disclaimer: This post contains affiliate links. If you make a purchase, I may receive a commission at no extra cost to you.

Copyright © 2022 - 2024 Principled LLC. All rights reserved.

Hipsters Ponto Tech
Data Engineering and Data Lineage – Hipsters Ponto Tech #327

Oct 18, 2022 · 38:42


On this week's Hipsters.Tech, we talk with the Alvin team about the challenges data teams face in their day-to-day work, especially around data lineage, which is the entire journey that data takes.

Lights On Data Show
Different Approaches to Data Lineage

Oct 7, 2022 · 38:51


Data lineage helps organizations understand the full story behind their data so they can make better business decisions. There are, however, multiple ways to do data lineage. Let's understand these different approaches, the importance of each, best practices to follow, and the connection to business glossary, data quality, and overall data governance. Our guest for this episode is Torsten Priebe, Chief Technology Officer at Simplity, the company behind the Accurity data governance software suite.

Data Engineering Podcast
Make Data Lineage A Ubiquitous Part Of Your Work By Simplifying Its Implementation With Alvin

Oct 3, 2022 · 56:16


Data lineage is something that has grown from a convenient feature to a critical need as data systems have grown in scale, complexity, and centrality to business. Alvin is a platform that aims to provide a low effort solution for data lineage capabilities focused on simplifying the work of data engineers. In this episode co-founder Martin Sahlen explains the impact that easy access to lineage information can have on the work of data engineers and analysts, and how he and his team have designed their platform to offer that information to engineers and stakeholders in the places that they interact with data.

Data Engineering Podcast
What "Data Lineage Done Right" Looks Like And How They're Doing It At Manta

Jul 31, 2022 · 65:18


Data lineage is the roadmap for your data platform, providing visibility into all of the dependencies for any report, machine learning model, or data warehouse table that you are working with. Because of its centrality to your data systems it is valuable for debugging, governance, understanding context, and myriad other purposes. This means that it is important to have an accurate and complete lineage graph so that you don't have to perform your own detective work when time is in short supply. In this episode Ernie Ostic shares the approach that he and his team at Manta are taking to build a complete view of data lineage across the various data systems in your organization and the useful applications of that information in the work of every data stakeholder.

The Data Protection and Privacy Podcast
What The EXPERTS Do Not TELL Us about Data Protection? Part 16: Omar ElNaggar, founder and CEO of Weavechain, discusses the future privacy of personal data, Web3, decentralized science, privacy data lineage, and privacy data brokerage.

May 15, 2022 · 26:11


Omar ElNaggar is the founder and CEO of Weavechain. Weavechain brings enterprise big data to Web3. That's impossible today because public blockchains don't meet enterprise data governance standards, and no blockchains meet big data performance needs. Our middleware leaves the data in existing high-performance databases and uses our smart hashing technology in coordination with a variety of blockchains that meet security needs. This unlocks basic Web3 benefits like tamper-proofing and built-in monetization, and connects Web2 companies to Web3 customers and more.

Join our Discord! The link is on www.weavechain.com
Follow us on Twitter @WeavechainWeb3
Email us at hello@weavechain.com (hint: it goes straight to Omar)
https://www.weavechain.com
https://www.linkedin.com/company/weavechain/
https://twitter.com/weavechainweb3
https://discord.gg/VjuPrwe4ub

This episode is sponsored by www.vciso.co. We help companies and start-ups meet privacy and cyber security requirements and standards so they close sales deals quicker and can achieve cyber privacy alignment certifications in minimum time.

The DataprotectionPrivacy.co.uk podcast is hosted by David Clarke.
Follow me on Twitter @1davidclarke (102k followers): https://twitter.com/1DavidClarke
Founder of the LinkedIn GDPR Technology Group (21k+ members): https://www.linkedin.com/groups/12017677
Connect with David Clarke on LinkedIn: https://www.linkedin.com/in/1davidclarke/

Data in Construction
Data Governance in Real Estate & Construction: Ian Cameron

Apr 11, 2022 · 34:35


Follow Ian here: https://www.linkedin.com/in/ian-cameron-7aa64212/
Check out OSCRE here: https://www.oscre.org/

Data in Construction
Data Lineage and Graph Databases

Mar 27, 2022 · 33:10


Follow Wouter here
Check out Xudo.be
Check out the Medium article that started it all: https://medium.com/p/8cbf0497d5a6

From The_Link:
Subscribe on Apple: https://podcasts.apple.com/us/podcast/data-in-construction/id1604092908
Subscribe on Spotify: https://open.spotify.com/show/2AUUpaT0yYueyah826JOOQ?si=a34ca4e3acf24835
Sign up for the Data in Construction Book: http://eepurl.com/hTtFPH
Sign up for Data in Construction skills webinars
Buy The Construction Technology Handbook here: https://www.amazon.com/gp/product/B08PNHBB1M/ref=dbs_a_def_rwt_bibl_vppi_i0

Data in Construction
An Introduction to Data Governance, Jessi Ashdown & Uri Gilad, Google

Feb 13, 2022 · 44:29


Follow Jessi Ashdown here
Follow Uri Gilad here
Find Data Governance, The Definitive Guide here:

From the Publisher:
Subscribe on Apple: https://podcasts.apple.com/us/podcast/data-in-construction/id1604092908
Subscribe on Spotify: https://open.spotify.com/show/2AUUpaT0yYueyah826JOOQ?si=a34ca4e3acf24835
Sign up for the Data in Construction Book: http://eepurl.com/hTtFPH
Sign up for Data in Construction skills webinars
Buy The Construction Technology Handbook here: https://www.amazon.com/gp/product/B08PNHBB1M/ref=dbs_a_def_rwt_bibl_vppi_i0

MetaDAMA - Data Management in the Nordics
#14 - Data Democratization in public services (Nor)

Feb 1, 2022 · 42:25


How can you democratize data from Norway's public services? What does it mean to have a user-centric approach to public services? What needs to be in place to organize your organization, and the public services as a whole, while maintaining focus on good services for your citizens?

I had the pleasure of chatting with Gustav Aagesen, Chief Data Officer at Lånekassen, the Norwegian State Educational Loan Fund, which is celebrating its 75th anniversary in 2022. Gustav started as Information Architect at Lånekassen in 2012 and became analysis manager before taking the position of CDO. Today, he sees his responsibility as supporting the entire organization and institutionalizing information management.

We looked at three different perspectives on “Orden i eget hus”, a data governance framework for public services, and the democratization of data in Norway:
1. Lånekassen. An internal view on automation and structures, data citizenship, and culture.
2. Public Norway. A perspective that includes valuable work on the common data catalogue, “orden i eget hus”, common concepts, and datasets.
3. Citizen perspective. Thoughts about finding ways to make the use and consumption of data easier and with fewer barriers, and to provide citizen-centric services.

Here are some of my key takeaways:
- Information management is not a goal by itself, but a way to create and gain value.
- Information management has to start with a purpose!
- Data and information have a longer lifecycle than applications.
- Data lineage is important, with the objective in mind of creating services and gathering data based on consumer demands, or the needs of the citizens.
- The value for the citizen is an end-to-end value stream that is traceable and can create trust in the data.
- If you want to give citizens access to proactive public services, information has to flow between different institutions.
- It seems easier to get funding for technology than for work processes. That is also a reason automation is in high demand.
- Data sharing needs to be balanced with trust and privacy to ensure good solutions for consumers and citizens.

Discovering Data
Ideas: Data lineage from a business perspective with Irina Steenbeek

Jan 31, 2022 · 62:46


Lineage is a critical capability. It's complex and expensive, so it's really important to know what can go wrong and how to avoid pitfalls in order to deliver maximum value to the business. Today I am joined by Irina Steenbeek, an absolute expert on this topic. Irina has a lot of experience managing ERP implementations, she is the founder of Data Crossroads, and her background spans civil engineering, management, consultancy, and finance. Irina has authored more than 60 blog posts on data management and published four books: The Orange Model of Data Management, The Data Management Toolkit, The Data Management Cookbook, and her latest book, Data Lineage from a Business Perspective, which we will talk about today. This episode is here to give you a solid understanding of what data lineage is and what it isn't, the value, costs, and benefits for your organization, and some execution tips and ideas for talking about it with your business stakeholders. Read/listen: https://www.discoveringdata.com/

Crash course: How to build a successful data lineage business case?

If you are managing a data lineage implementation, you might be very familiar with scope creep and issues related to funding and project management. How do you secure the necessary funding, scope and plan the work, and craft a solution that delivers business outcomes? These are hard questions! The good news is that Irina developed a proven framework over the past 20 years to help you navigate this complexity, and we are partnering to bring you her knowledge and expertise in the form of a crash course. Registrations will open soon; subscribe to our newsletter to be notified.

Discovering Data
Tech: Why data lineage

Jan 20, 2022 · 56:23


Data lineage is about gaining confidence in the data to save time, money, and even lives. How do you sell it to the business? Learn with me as I speak to Jan Ulrych, VP of Research & Education at MANTA. Get the full experience at https://www.discoveringdata.com.

Data in Construction
Introducing the Data in Construction Podcast

Jan 9, 2022 · 27:00


Sign up for news about the upcoming "Data in Construction" book here: http://eepurl.com/hRlpjT
Sign up for Data in Construction skills webinars here: http://eepurl.com/ht8Od9
Follow Stephen here: https://www.linkedin.com/in/stephenpoppe/
Follow Hugh here: https://www.linkedin.com/in/hughseaton/
Buy The Construction Technology Handbook here: https://www.amazon.com/gp/product/B08PNHBB1M/ref=dbs_a_def_rwt_bibl_vppi_i0
Finally, learn more about The_Link here: https://www.linkedin.com/company/theconstructionlink/

Software Lifecycle Stories
Immersing into data lineage and career contours with Sriram Rangarajan

Jan 7, 2022 · 37:50


Wishing you all a very happy new year 2022!

In this episode, Gayatri is in conversation with Sriram Rangarajan, an accomplished data analytics leader at IBM, who shares his data journey:
- His role as an enterprise data architect and the lenses he has to wear
- Ways to get started collecting, analysing, and gaining insights from data
- How young engineers joining the data ecosystem can look at data
- Viewing data through the business process and lifecycle
- Ways of managing complexity across organization and system boundaries
- Onboarding application teams right at the start instead of making it an afterthought
- How to manage data errors and keep the data clean
- Challenges encountered while handling master data, data definitions, and gaps in data setup
- Working with regulatory bodies and keeping the right level of data without breaching compliance
- How the perspective of work has changed across three decades, from task-centric to business-centric to what-if possibilities
- Ensuring that the foundations of data are clearly understood before diving deeper into technology

Sriram started his career with TCS after graduating from JNU. He has worked at IBM, NetApp Inc, and Informatica, demonstrating in-depth knowledge of data governance, data integration, master data management, data protection, privacy, and data lifecycle management. Sriram is a data analytics professional and strategist at heart, with robust experience acquired over the years in delivering optimal results in high-growth environments. He excels at cultivating collaborative professional relationships and delivering data-driven, business-centric solutions, and he is result-driven, with extensive expertise across scalable enterprise architecture, value-driven data science, and full-stack engineering.

Sriram can be reached at https://www.linkedin.com/in/mrsriram/

BI or DIE
New Banking - Data Governance as a Business Case - In Conversation with Marco Geuer, Fiege Logistik

Nov 21, 2021 · 60:48


Besides the question of why the topic is not boring to them, the two discuss, among other things:
- The fascination and the possibilities of data governance
- What Rapid Data Performance Simulation is and how it works
- How data governance becomes a meaningful part of business intelligence
- How to approach the topic driven by a business case

Marco is a Management Consultant for Data Strategy & Data Governance, a trainer for data governance and data quality management at the Haufe-Akademie, a lecturer at DHBW Villingen-Schwenningen, and an advisory board member of #datagovkon. He also shares his knowledge through his website https://www.business-information-excellence.de and occasionally speaks at industry conferences. Why does he do all this? Because he believes the topic is still under-recognized in the DACH region and needs to be established in organizations as an important value driver on the way to becoming a data-driven company.

The Analytics Engineering Podcast
Julien Le Dem: Why Data Lineage Matters 

Nov 4, 2021 · 48:42


Julien has a unique history of building open frameworks that make data platforms interoperable. He's contributed in various ways to Apache Arrow, Apache Iceberg, Apache Parquet, and Marquez, and is currently leading OpenLineage, an open framework for data lineage collection and analysis. In this episode, Tristan & Julia dive into how open source projects grow to become standards, and why data lineage in particular is in need of an open standard. They also cover some of the compelling use cases for this data lineage metadata, and where you might be able to deploy it in your work. For full show notes and to read 6+ years of back issues of the podcast's companion newsletter, head to https://roundup.getdbt.com. The Analytics Engineering Podcast is sponsored by dbt Labs.

Podcast – Software Engineering Daily
Data Lineage: Understanding Data Lineage at Scale with Julien Le Dem

Podcast – Software Engineering Daily

Play Episode Listen Later Jul 12, 2021 58:48


Big Data has exploded over the past decade as cloud computing and more efficient hardware made scaling essentially limitless. Products like Uber revolve entirely around analyzing data to provide rides. According to an EMC/IDC study, there was approximately 5.2TB of data for every person in 2020. That estimate was made before the transition to remote work, […]

Data Engineering Podcast
Unlocking The Power of Data Lineage In Your Platform with OpenLineage

Data Engineering Podcast

Play Episode Listen Later May 18, 2021 57:38


Data lineage is the common thread that ties together all of your data pipelines, workflows, and systems. To get a holistic understanding of your data quality, where errors are occurring, or how a report was constructed, you need to track the lineage of the data from beginning to end. The complicating factor is that every framework, platform, and product has its own concepts of how to store, represent, and expose that information. To eliminate the wasted effort of building custom integrations every time you want to combine lineage information across systems, Julien Le Dem introduced the OpenLineage specification. In this episode he explains his motivations for starting the effort, the far-reaching benefits that it can provide to the industry, and how you can start integrating it into your data platform today. This is an excellent conversation about how competing companies can still find mutual benefit in co-operating on open standards.

Data Leadership Lessons Podcast
Data Literacy with John Ladley - Episode 35

Data Leadership Lessons Podcast

Play Episode Listen Later Apr 11, 2021 49:36


Watch this episode on YouTube: https://youtu.be/wGOGf6mSnLM On this episode, we welcome back John Ladley for a spirited discussion on Data Lineage. This one was an especially fun conversation that you will either thoroughly enjoy or completely disagree with! Worth tuning in, regardless!
* Get the Data Leadership Book – https://dataleadershipbook.com
* Data Leadership Lessons on YouTube – https://www.youtube.com/c/DataLeadershipLessons
* Save 20% on your first order at the DATAVERSITY Training Center with promo code “AlgminDL” – https://training.dataversity.net/?utm_source=algmindl_res
* Guest and Sponsorship Inquiries – podcast@algmin.com
About John Ladley: John Ladley is an experienced practitioner who helps organizations define and transition to new business and data capabilities. His books are considered the primary source for organizations to enable alignment of business and data strategy, organizational change and practical application of data technology to business problems. John focuses on adding data to society's and the organization's mentality – our postindustrial era must address Land, Labor, Capital, and Data. Data Governance: How to Design, Deploy, and Sustain an Effective Data Governance Program 2nd Edition – https://www.amazon.com/Data-Governance-Sustain-Effective-Program/dp/012815831X For more information and to contact John directly, visit https://johnladley.com/.

DataCast
Martin Fiser {Keboola} - a chat about data platforms

DataCast

Play Episode Listen Later Mar 10, 2021 51:04


In this episode you will learn how Martin worked his way into working with data and how to find your bearings in today's world, which is becoming flooded with all kinds of data and integration platforms. We take a closer look at data operations platforms and at the trends the data world is heading toward, from data catalogs through data lineage.

AgileBI
AgileData #22 - Data Lineage, mapping your way to magic

AgileBI

Play Episode Listen Later Nov 18, 2020 30:03


Join Shane and Nigel as they discuss data lineage: what it is, why people want it, and what it can actually be used for. Listen to more podcasts from Shane and Nigel on applying AgileData techniques over at https://agiledata.io/podcasts/ If you have a suggestion on a subject you would like us to discuss on the next podcast, or even better you want to join us on the next podcast, send us a suggestion over at https://agiledata.io/podcasts/#contact Or if you just want to talk about making magic happen with agile and data, you can connect with Shane @shagility or Nigel @nigelvining over on Twitter.

Data Leadership Lessons Podcast
Welcome to Mass Hampshire with Barbara Nichols – Episode 26

Data Leadership Lessons Podcast

Play Episode Listen Later Nov 9, 2020 42:54


Watch the video version of this episode on YouTube: https://youtu.be/Kf2-QrGoq2o Barbara Nichols is an accomplished Information Technology professional with over 35 years of experience assisting clients and software vendors to […]

KuppingerCole Analysts
Analyst Chat #32: Data Management and Data Lineage - The Foundation for Big Data Governance and Security

KuppingerCole Analysts

Play Episode Listen Later Jul 20, 2020 13:36


Matthias Reinwarth and Martin Kuppinger talk about governance and security of data across a variety of sources and formats and the need for maintaining data lineage across its complete life cycle.

Catalog & Cocktails
What's the story with data lineage?

Catalog & Cocktails

Play Episode Listen Later Jul 1, 2020 30:14


How can you be certain the data you’re working with is trusted, up-to-date and understandable? Data lineage tells you where data originated, how it's used, and what transformations are applied as it flows from data sources and ETL workflows to downstream data marts and dashboards. In episode 7, Tim, Juan, and special guest Ernie Ostic from MANTA discuss how mapping data - and all its associated metadata and dependencies - leads to cleaner analysis and better business decisions.

Data Engineering Podcast
Solving Data Lineage Tracking And Data Discovery At WeWork - Episode 111

Data Engineering Podcast

Play Episode Listen Later Dec 16, 2019 61:52 Transcription Available


Building clean datasets with reliable and reproducible ingestion pipelines is completely useless if it's not possible to find them and understand their provenance. The solution to discoverability and tracking of data lineage is to incorporate a metadata repository into your data platform. The metadata repository serves as a data catalog and a means of reporting on the health and status of your datasets when it is properly integrated into the rest of your tools. At WeWork they needed a system that would provide visibility into their Airflow pipelines and the outputs produced. In this episode Julien Le Dem and Willy Lulciuc explain how they built Marquez to serve that need, how it is architected, and how it compares to other options that you might be considering. Even if you already have a metadata repository this is worth a listen to learn more about the value that visibility of your data can bring to your organization.

Streaming Audio: a Confluent podcast about Apache Kafka
Data Modeling for Apache Kafka – Streams, Topics & More with Dani Traphagen

Streaming Audio: a Confluent podcast about Apache Kafka

Play Episode Listen Later Oct 7, 2019 40:25


Helping users be successful when it comes to using Apache Kafka® is a large part of Dani Traphagen’s role as a senior systems engineer at Confluent. Whether she’s advising companies on implementing parts of Kafka or rebuilding their systems entirely from the ground up, Dani is passionate about event-driven architecture and the way streaming data provides real-time insights on business activity. She explains the concept of a stream, topic, key, and stream-table duality, and how each of these pieces relates to the others. When it comes to data modeling, Dani covers important business requirements, including the need for a domain model, practicing domain-driven design principles, and bounded context. She also discusses the attributes of data modeling: time, source, key, header, metadata, and payload, in addition to exploring the significance of data governance and lineage and performing joins.
EPISODE LINKS
Convert from table to stream and stream to table
Distributed, Real-Time Joins and Aggregations on User Activity Events Using Kafka Streams
KSQL in Action: Real-Time Streaming ETL from Oracle Transactional Data
KSQL in Action: Enriching CSV Events with Data from RDBMS into AWS
Journey to Event Driven – Part 4: Four Pillars of Event Streaming Microservices
Join the Confluent Community Slack
Fully managed Apache Kafka as a service! Try free.

Data Engineering Podcast
Data Lineage For Your Pipelines - Episode 82

Data Engineering Podcast

Play Episode Listen Later May 26, 2019 49:01 Transcription Available


Some problems in data are well defined and benefit from a ready-made set of tools. For everything else, there's Pachyderm, the platform for data science that is built to scale. In this episode Joe Doliner, CEO and co-founder, explains how Pachyderm started as an attempt to make data provenance easier to track, how the platform is architected and used today, and examples of how the underlying principles manifest in the workflows of data engineers and data scientists as they collaborate on data projects. In addition to all of that he also shares his thoughts on their recent round of fund-raising and where the future will take them. If you are looking for a set of tools for building your data science workflows then Pachyderm is a solid choice, featuring data versioning, first class tracking of data lineage, and language agnostic data pipelines.

O'Reilly Data Show - O'Reilly Media Podcast
Why companies are in need of data lineage solutions

O'Reilly Data Show - O'Reilly Media Podcast

Play Episode Listen Later Apr 25, 2019 34:29


In this episode of the Data Show, I spoke with Neelesh Salian, software engineer at Stitch Fix, a company that combines machine learning and human expertise to personalize shopping. As companies integrate machine learning into their products and systems, there are important foundational technologies that come into play. This shouldn’t come as a shock, as […]

Roaring Elephant
Episode 109 – Open Metadata and Governance Masterclass with Mandy Chessell – Part 2

Roaring Elephant

Play Episode Listen Later Oct 9, 2018 52:10


In this GDPR world, Data Governance and Data Lineage are, or should be, very much top of mind for anybody in the Big Data world. We reached out to Mandy Chessell, who has been very active in this area, and were delighted when she agreed to do an interview with us. In this second part, we discuss the ins and outs of good data stewardship and how companies can adopt, implement and contribute.
Mandy Chessell: Distinguished Engineer, Master Inventor, Fellow of the Royal Academy of Engineering https://www.linkedin.com/in/mandy-chessell-a4989722/
ODPi blog post on Egeria: First Release of ODPi Egeria is Here
ODPi GitHub projects:
Egeria - Open Metadata and Governance https://github.com/odpi/egeria
Data-governance companion project https://github.com/odpi/data-governance
Please use the Contact Form on this blog or our Twitter feed to send us your questions, or to suggest future episode topics you would like us to cover.

Roaring Elephant
Episode 107 – Open Metadata and Governance Masterclass with Mandy Chessell – Part 1

Roaring Elephant

Play Episode Listen Later Sep 25, 2018 41:50


In this GDPR world, Data Governance and Data Lineage are, or should be, very much top of mind for anybody in the Big Data world. We reached out to Mandy Chessell, who has been very active in this area, and were delighted when she agreed to do an interview with us. In this first part, the focus is more on Mandy herself, and we lay the groundwork for the second part that will go live in episode 109.
Mandy Chessell: Distinguished Engineer, Master Inventor, Fellow of the Royal Academy of Engineering https://www.linkedin.com/in/mandy-chessell-a4989722/
ODPi blog post on Egeria: First Release of ODPi Egeria is Here
ODPi GitHub projects:
Egeria - Open Metadata and Governance https://github.com/odpi/egeria
Data-governance companion project https://github.com/odpi/data-governance
Please use the Contact Form on this blog or our Twitter feed to send us your questions, or to suggest future episode topics you would like us to cover.

Bigdata Hebdo
Episode 55 : News

Bigdata Hebdo

Play Episode Listen Later Mar 5, 2018 86:29


Building Reliable Reprocessing and Dead Letter Queues with Kafka: https://eng.uber.com/reliable-reprocessing/
Data Lineage on Apache Spark with Spline: http://blog.ippon.fr/2018/02/19/data-lineage-spark-avec-spline/
Elastic - Doubling Down on Open: https://www.elastic.co/blog/doubling-down-on-open https://www.elastic.co/products/x-pack/open
JupyterLab is Ready for Users: https://blog.jupyter.org/jupyterlab-is-ready-for-users-5a6f039b8906
Cherami: Uber Engineering’s Durable and Scalable Task Queue in Go: https://eng.uber.com/cherami/
Streams in and out of Pravega: http://blog.pravega.io/2018/02/12/streams-in-and-out-of-pravega/ http://pravega.io/
Migrating Batch ETL to Stream Processing: A Netflix Case Study with Kafka and Flink: https://www.infoq.com/articles/netflix-migrating-stream-processing
Machine Learning for Grandmothers: https://www.saagie.com/fr/blog/machine-learning-pour-les-grand-meres
Automated ML: Is It the End of the Sexiest Job of the 21st Century?: http://blog.xebia.fr/2018/02/20/automated-machine-learning-is-it-the-end-of-the-sexiest-job-of-the-21st-century/
Google Cloud Auto ML: https://cloud.google.com/automl/
Apache MXNet - A flexible and efficient library for deep learning: http://mxnet.incubator.apache.org/
Confluent and Apache Kafka in 2017: https://www.confluent.io/blog/confluent-apache-kafka-2017/
Oracle: the insult to DBAs: https://www.dsfc.net/infrastructure/base-de-donnees-infrastructure/oracle-insulte-faite-aux-dba/amp/
Apache Cassandra 3.11.2 release: https://www.mail-archive.com/dev@cassandra.apache.org/msg12075.html
Docker Meet Cassandra. Cassandra Meet Docker.: http://thelastpickle.com/blog/2018/01/23/docker-meet-cassandra.html
Autoscaling Dataproc clusters: https://blog.doit-intl.com/autoscaling-google-dataproc-clusters-21f34beaf8a3
Read the Affini-Tech blog: http://blog.affini-tech.com

http://www.bigdatahebdo.com https://twitter.com/bigdatahebdo
Vincent: https://twitter.com/vhe74
Alexander: https://twitter.com/alexanderdeja
This publication is sponsored by Affini-Tech ( http://affini-tech.com https://twitter.com/affinitech )
We're hiring! Come crunch data with us! Write to us at recrutement@affini-tech.com

Not So Standard Deviations
49 - Baltimore is the Home of Cloud Computing

Not So Standard Deviations

Play Episode Listen Later Nov 21, 2017 71:42


Hilary and Roger discuss when to hire a data scientist, the Kaggle State of Data Science and Machine Learning Survey, and the lack of tools for tracking changes to data. Show notes: Stickers at Sonos: https://twitter.com/jahilliar/status/926153450857082880 John Carmack on git: https://twitter.com/id_aa_carmack/status/929389759624916992 Linear Digressions episode on “Data Lineage”: http://lineardigressions.com/episodes/2017/9/3/data-lineage Kaggle State of Data Science Survey: https://www.kaggle.com/surveys/2017 Support us through our Patreon page: https://www.patreon.com/NSSDeviations Roger on Twitter: https://twitter.com/rdpeng Hilary on Twitter: https://twitter.com/hspter Get the Not So Standard Deviations book: https://leanpub.com/conversationsondatascience/ Subscribe to the podcast on Apple Podcasts: https://itunes.apple.com/us/podcast/not-so-standard-deviations/id1040614570 Subscribe to the podcast on Google Play: https://play.google.com/music/listen?u=0#/ps/Izfnbx6tlruojkfrvhjfdj3nmna Find past episodes: http://nssdeviations.com Contact us at nssdeviations@gmail.com  
