POPULARITY
In this episode, we dive into the world of MLOps and data orchestration with Stefano Bosisio, Senior Software Engineer at NVIDIA. Stefano shares his knowledge of popular frameworks such as Apache Beam, Kubeflow and Dagster, highlighting their strengths and limitations. We also discuss emerging trends in DataOps and the challenges teams face when choosing the orchestration tools best suited to their needs.
Welcome to Manufacturing Hub, where we dive deep into the world of industrial automation, software, and digital transformation. In this episode, hosts Dave and Vlad are joined by Zach Scriven, an industrial automation expert, digital transformation evangelist, and a key player in the development of Prove It, a groundbreaking industry conference. This conversation explores a range of topics, from Zach's personal journey in industrial automation and SCADA integration to his pioneering work in digital transformation education. We discuss Unified Namespace (UNS), a powerful framework for structuring and scaling industrial data, and its role in breaking down silos and creating scalable, interoperable architectures.
Key Topics Discussed:
✅ Zach Scriven's Background: His journey from SCADA integration in the water industry to co-founding 4.0 Solutions and IoT University.
✅ Unified Namespace (UNS): What it is, why it matters, and how it enables scalable industrial data architectures.
✅ Digital Transformation in Manufacturing: The need for a clear strategy, the challenges of data silos, and the shift toward IT-OT convergence.
✅ Edge Computing & Industrial Data Platforms: How Ignition, MQTT, Litmus Edge, HighByte, and HiveMQ are changing the landscape of industrial automation.
✅ Challenges in Legacy Industrial Systems: How companies with aging infrastructure can begin their digital transformation journey.
✅ The Future of Industrial Conferences – Prove It: Why traditional conferences fail to deliver value and how Prove It is disrupting the model by requiring vendors to "prove" their solutions in a real-world simulated environment.
References & Companies Mentioned:
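The Unified Namespace discussed above is, in practice, usually implemented as a single hierarchical topic tree on an MQTT broker that every system publishes into and subscribes from. Purely as a hedged illustration (the episode does not prescribe any implementation; the broker address, hierarchy names, and payload fields below are made up), here is a minimal Python sketch of publishing one machine reading into a UNS-style topic path with the paho-mqtt client library:

```python
# Minimal sketch: publish one metric into a Unified Namespace style topic
# hierarchy (enterprise/site/area/line/cell/metric). Broker address, topic
# names, and payload fields are illustrative assumptions, not from the episode.
import json
import time

import paho.mqtt.publish as publish  # pip install paho-mqtt

BROKER_HOST = "broker.example.com"  # hypothetical MQTT broker
TOPIC = "acme/dallas/packaging/line4/filler1/temperature"  # UNS-style path

payload = json.dumps({
    "value": 71.3,             # example reading, degrees C
    "timestamp": time.time(),  # epoch seconds
    "quality": "GOOD",
})

# Retained so new subscribers immediately receive the latest known state,
# a common convention in Unified Namespace setups.
publish.single(TOPIC, payload, qos=1, retain=True, hostname=BROKER_HOST)
```

Consumers then subscribe to whatever slice of the tree they need (for example acme/dallas/# for the whole site), which is the decoupling of producers from consumers that the UNS conversation is about.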
Christophe Blefari is a Staff Data Engineer, author of the best-known data newsletter in the French ecosystem (Blef.fr), co-founder of nao and, in my view, one of the biggest data experts in France. We cover:
Mark Rittman is joined in this episode by Greg McKeon, Staff Product Manager at dbt Labs, to talk about their recent acquisition of SDF Labs, the vision for dbt as the control plane for data collaboration at scale, and the upcoming drag-and-drop Visual Editor that's soon to be part of dbt Cloud.
dbt Cloud: The control plane for data collaboration at scale
About the Visual Editor
Coalesce 2024 and the Launch of dbt's Visual Editing Experience
dbt Labs Acquires SDF Labs to Introduce Robust SQL Comprehension into dbt and Supercharge Developer Efficiency
Drill to Detail Ep.115 'Airbnb, DataOps and SQLMesh's Data Engineering Innovation' with Special Guest Toby Mao
Join Shane Gibson as he chats with Chris Bergh on improving your team's way of working by using DataOps patterns. You can get in touch with Chris via LinkedIn or over at https://datakitchen.io If you want to read the transcript for the podcast head over to: https://agiledata.io/podcast/agiledata-podcast/dataops-patterns-with-chris-bergh/#read Listen to more podcasts on applying AgileData patterns over at https://agiledata.io/podcasts/ Read more on the AgileData Way of Working over at https://wow.agiledata.io/ If you want to join us on the next podcast, get in touch over at https://agiledata.io/podcasts/#contact Or if you just want to talk about making magic happen with agile and data you can connect with Shane @shagility on LinkedIn. Subscribe: Apple Podcast | Spotify | Google Podcast | Amazon Audible | TuneIn | iHeartRadio | PlayerFM | Listen Notes | Podchaser | Deezer | Podcast Addict | Simply Magical Data
Christelle Marfaing, former Head of Data at Lydia, is now Chief Data Officer at May, the startup behind an employee benefits app (3 million euros raised in 2022). Christophe Blefari is a Staff Data Engineer and the author of the best-known data newsletter in the French ecosystem: Blef.fr. This special Christmas episode is the first of a new series whose goal is to have a three-way conversation with a data leader, together with Blef.
Peter Seeberg talks to Vatsal Shah, Founder & CEO of Litmus, about Unlocking & Activating Industrial Data
Daniele Panfilo holds a PhD in Artificial Intelligence and is the CEO of Aindo. Born in 1988, he graduated from Sapienza University of Rome in Industrial and Management Engineering, earned two master's degrees (the first at Sapienza in Optimization and Modelling, the second in operations research at the Department of Knowledge Engineering and Data Science of Maastricht University) and completed a doctorate in Artificial Intelligence at the University of Trieste, specialising in generative machine learning models. He has worked as an artificial intelligence specialist in the healthcare and insurance sectors in Italy and abroad, for organisations such as Medtronic and Allianz Technology. In 2018, together with Sebastiano Saccani, he founded Aindo, a scale-up from the International School for Advanced Studies (SISSA) in Trieste whose mission is to help the world exploit the full potential of artificial intelligence through a DataOps platform and data curation tools based on synthetic data generation. As CEO and co-founder of Aindo, Daniele combines his knowledge of artificial intelligence with an engaging approach to business, focusing mainly on strategy, business development, R&D and product design. He also remains active in academia and research: in 2022 he contributed to the paper on relational data synthesis "Generating Realistic Synthetic Relational Data through Graph Variational Autoencoders", presented at NeurIPS (Neural Information Processing Systems), the most prestigious global conference on artificial intelligence. In 2023 Aindo returned to NeurIPS with the paper "Privacy Measurement in Tabular Synthetic Data: State of the Art and Future Research Directions". Also in 2023, Daniele Panfilo was recognised by the international Nova 111 programme among the under-35s set to shape the future of Software, Cloud & IT: the list, compiled by the global Nova Talent network, brings together the 111 best Italian professionals who have achieved exceptional results in 11 key sectors of the economy and whose impact has driven innovation. Daniele Panfilo's work rests on the conviction that artificial intelligence can improve many aspects of life; with Aindo he aims to contribute by making the informational wealth contained in data available to research and innovation in a secure, free and efficient way.
Websites, apps, books and useful links:
Aindo website
Aindo blog
Sapienza University of Rome, Industrial and Management Engineering
Knowledge Engineering and Data Science, Maastricht University
University of Trieste, Artificial Intelligence
SISSA, the International School for Advanced Studies in Trieste
Coursera
The books to choose
Training in synthetic data for the healthcare sector
Among the most sought-after roles emerging from this training is the Machine Learning Developer specialised in developing AI models for generating synthetic data, i.e. artificial data that replicate the characteristics of real data without containing sensitive or personal information. These data are used to train artificial intelligence models, test algorithms and develop applications, while guaranteeing compliance with privacy regulations (e.g. GDPR). Artificial intelligence (AI) is revolutionising every aspect of our lives.
Unfortunately, however, more than 85% of AI projects never reach production. This happens because AI projects need large amounts of data in order to get off the ground. Organisations must have access to that data and make sure it is complete and secure, a process that is costly in both money and time. Synthetic data technology is emerging as a key enabler for successfully implementing artificial intelligence and data analytics projects. Synthetic data are not collected through traditional empirical methods; they are generated algorithmically. As such, they cannot be linked to any real-world person, and thanks to specific AI techniques, synthetic data can behave like real data. The main advantage of this technology is therefore that it reconciles privacy and innovation. Highly realistic synthetic data are built using generative artificial intelligence techniques. In particular, generative AI methods infer the statistical patterns of a real dataset and then replicate them in a synthetic dataset. If the inference succeeds, the new data samples behave in the same way as the real data. Synthetic data thus allow organisations to get value from their data: they make it possible to exchange and analyse data securely and freely, and to remedy data shortages by providing complete, representative data. In healthcare, for example, a hospital might want to share patient data to develop tools that improve the diagnosis and treatment of numerous conditions. However, patients' electronic health records are highly confidential and generally cannot be exchanged or aggregated easily, even for research and development or for collaborations between public bodies. In these cases, the platform developed by Aindo converts the information and generates a database of synthetic records that can be used for research and development purposes, while guaranteeing the highest standards of privacy.
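The paragraph above describes the core recipe: fit a generative model to a real dataset so it captures its statistical patterns, then sample a new, synthetic dataset from that model. Purely as a toy illustration of that idea (Aindo's platform uses far more sophisticated generative models, such as the graph variational autoencoders mentioned earlier; the column names and values below are made up), the sketch fits a simple Gaussian mixture with scikit-learn and samples synthetic rows from it:

```python
# Toy illustration of synthetic data generation: infer the statistical
# patterns of a "real" table, then sample brand-new synthetic rows.
# Not Aindo's method; columns and values are invented for the example.
import numpy as np
import pandas as pd
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Stand-in for a confidential real dataset (e.g. anonymised clinical values).
real = pd.DataFrame({
    "age": rng.normal(55, 12, 1000).clip(18, 95),
    "systolic_bp": rng.normal(130, 15, 1000),
    "cholesterol": rng.normal(200, 30, 1000),
})

# 1. Infer the statistical patterns of the real data.
model = GaussianMixture(n_components=5, random_state=0).fit(real.to_numpy())

# 2. Replicate those patterns in a synthetic dataset of the same shape.
synthetic_values, _ = model.sample(n_samples=1000)
synthetic = pd.DataFrame(synthetic_values, columns=real.columns)

# 3. Sanity check: the synthetic data should mimic the real distribution
#    without containing any actual record.
print(real.describe().loc[["mean", "std"]])
print(synthetic.describe().loc[["mean", "std"]])
```

In practice, checking that no synthetic row is effectively a copy of a real record matters as much as matching the distributions, which is the concern Aindo's privacy-measurement paper addresses.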
Mark is joined in this latest Drill to Detail episode by Tobias (Toby) Mao, CTO and co-founder at Tobiko Data, to talk about SQLGlot, the innovation culture at Airbnb and the data engineering challenges solved by SQLMesh.
SQLGlot
Introducing Minerva — Airbnb's Metric Platform
SQLMesh homepage
Tobiko Cloud
Running dbt in SQLMesh
dbt Mesh
dlt (Data Load Tool)
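The show notes above list SQLGlot, the open-source SQL parser and transpiler that SQLMesh builds on. As a quick, hedged illustration of what that kind of tool does (the query below is an arbitrary example, not something from the episode), sqlglot can parse SQL written for one dialect and re-render it for another:

```python
# Quick illustration of SQL dialect transpilation with sqlglot
# (pip install sqlglot). The query is an arbitrary example.
import sqlglot

mysql_query = "SELECT IFNULL(`nickname`, 'anonymous') AS name FROM `users` LIMIT 10"

# Parse the MySQL-flavoured SQL and re-render it for PostgreSQL.
pg_query = sqlglot.transpile(mysql_query, read="mysql", write="postgres")[0]
print(pg_query)
# e.g. SELECT COALESCE("nickname", 'anonymous') AS name FROM "users" LIMIT 10
```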
Christophe Blefari is a Staff Data Engineer, author of the best-known data newsletter in the French ecosystem (Blef.fr) and, recently, co-founder of nao. He is also, in my view, one of the biggest data experts in France.
This week we welcome Keiran Stokes, Director and Head of Technology at Thred, all the way from New Zealand. Keiran shares his unique journey from being an industrial electrician to a control systems engineer and then co-founding Thred. Throughout this insightful discussion, you will gain a deeper understanding of the intricacies of industrial data engineering, the role of digital twins, and the importance of context in data analytics. Keiran sheds light on the challenges of integrating industrial systems, the shortage of data engineers in the sector, and Thred's revolutionary approach to bridging the gap between Industry 4.0 promises and current technology. This episode is perfect for anyone interested in the forefront of IIoT innovations and looking to understand the real-world applications and obstacles in the industry. About: Keiran Stokes is a leader in IT/OT integration and data strategy, renowned for his expertise in converging operational and information technologies to drive digital transformation. With a strong focus on supporting New Zealand's industrial sector, Keiran specialises in enterprise and data architecture, helping businesses unlock productivity through digital solutions. His passion for bridging the gap between the factory floor and the cloud has positioned him as a thought leader in operational data, as well as in the engineering and operations that transform that data into insights and actionable improvements. 00:00 Introduction to Unplugged: An IIoT Podcast 00:35 Introduction to Guest: Keiran Stokes 01:45 Keiran Stokes' Journey: From Electrician to IIoT Expert 05:12 Transition into Manufacturing Execution Systems (MES) Development 08:23 Founding of Thred: Mission and Goals 11:03 Differences Between Industrial and IT Systems 14:17 Challenges of Industrial Data Sharing 17:02 Importance of Domain Knowledge in Industrial Data Engineering 20:50 The Role of Knowledge Graphs in Data Contextualization 25:11 Insights on Large Language Models (LLMs) 28:44 Overview of Thred's New Tool: 3 Cloud 32:30 Concepts of Digital Twins and Real-time Cobot Replication 36:15 Industry Challenges and Solutions for Small Manufacturers 39:42 Data Engineering and DataOps in the Industrial Sector 42:15 Open Source Software in Industrial Applications 45:50 Final Thoughts: Closing Gaps in Industrial Data 48:37 Listener Takeaways on IIoT Innovations and Challenges 52:10 Episode Wrap-Up and Future Discussions Don't forget to subscribe for more insights and updates on the future of Industrial Internet of Things and automation! Connect with Keiran on LinkedIn: https://www.linkedin.com/in/keiran-stokes/ Connect with Phil on LinkedIn: https://www.linkedin.com/in/phil-seboa/ Connect with Ed on LinkedIn: https://www.linkedin.com/in/ed-fuentes-2046121a/ About Industry Sage Media: Industry Sage Media is your backstage pass to industry experts and the conversations that are shaping the future of the manufacturing industry. Learn more at: http://www.industrysagemedia.com
In this episode, Dr. Sara Fletcher, CEO of the PA Education Association, talks with Robert Furter, PhD, MBA, Senior Director of Research & DataOps, and Principal Psychometrician at PAEA, about the vital role of data in PA education. Furter emphasizes intentional data collection and analysis, distinguishing between meaningful signals and noise. He outlines PAEA's efforts to enhance data visualization, aiming to introduce dynamic dashboards by 2025. Additionally, Furter stresses the importance of robust, accessible data sets for program evaluation and quality improvement, advocating for a multidisciplinary approach and ensuring research validity. This episode is sponsored by PA Excel. For more information, visit them online at paexcel.com. All Things PA Education is produced by Association Briefings.
In a rapidly evolving data landscape, organizations must adapt or risk falling behind. Six Five Media Host Mike Vizard connects with BMC's Basil Faruqui, Solutions Marketing Director for Digital Business Automation, to explore how DataOps can be a game-changer. Faruqui shares insights on how businesses can harness the power of DataOps to unlock the full potential of their data and drive digital transformation. Their discussion covers:
The evolving importance of DataOps in today's data-driven environments
Strategies for effective data orchestration and automation
Challenges organizations face in maximizing data value and how to overcome them
The role of digital transformation in enhancing data operations
Predictions for the future of data management and operations
What is the current state of DataOps in the enterprise? Host Mike Vizard and BMC's Ram Chakravarti, Senior Vice President & Chief Technology Officer, share thoughts with Six Five Media at BMC Connect on the evolving landscape of DataOps, AI, and their critical roles in enterprise orchestration and data efficiency. Their discussion covers:
The current state and the future of DataOps in enterprises
How AI technologies are being integrated into orchestration tools
Strategies for enhancing data efficiency and security in cloud environments
The role of automation in managing IT operations and data workflows
Insights into BMC's initiatives towards advancing AI and DataOps solutions
AI and cloud technologies are transforming industries, but how can your business stay ahead of the curve? Six Five Media Host Mike Vizard sits down with BMC's Gur Steif, President of Digital Business Automation, at BMC Connect for an insightful conversation on how BMC empowers customers through the integration of DataOps, cloud, and artificial intelligence. Their discussion covers:
The evolving role of DataOps in today's technology landscape
How BMC integrates AI to enhance its digital business automation solutions
The impact of cloud technologies on business innovation and efficiency
Strategies for organizations to effectively adopt and leverage BMC's innovative solutions
Insights into BMC's future direction and commitment to customer empowerment
MLOps Coffee Sessions #177 with Mohamed Abusaid and Mara Pometti, Building in Production Human-centred GenAI Solutions, sponsored by QuantumBlack, AI by McKinsey. // Abstract Trust is paramount in the adoption of new technologies, especially in the realm of education. Mohamed and Mara shed light on the importance of AI governance programs and establishing AI governance boards to ensure safe and ethical use of technology while managing associated risks. They discuss the impact on customers, potential risks, and mitigation strategies that organizations must consider to protect their brand reputation and comply with regulations. // Bio Mara Pometti Mara is an Associate Design Director at McKinsey & Company, where she helps organisations drive AI adoption through human-centered methods. She defines herself as a data-savvy humanist. Her practice spans across AI, data journalism, and design with the overarching objective of finding the strategic intersection between AI models and human intents to implement responsible AI systems that move organisations forward. Previously, she led the AI Strategy practice at IBM, where she also developed the company's first-ever data storytelling program. Yet, by background, she is a data journalist. She worked as a data journalist for agencies and newsrooms like Aljazeera. Mara lectured at many universities about how to humanize AI, including the London School of Economics. Her books and writing explore how to weave a humanistic approach to AI development. Mohamed Abusaid Mohamed is a tech enthusiast, hacker, avid traveler, and foodie all rolled into one. He built his first website when he was 9 and has been in love with computers and the internet ever since. He graduated in computer science from university, although he dabbled in electrical, electronic, and network engineering before that. When he's not reading up on the latest tech conversations and products on Hacker News, Mohamed spends his time traveling to new destinations and exploring their cuisine and culture. Mohamed works with different companies helping them tackle challenges in developing, deploying, and scaling their analytics to reach its potential. Some topics he's enthusiastic about include MLOps, DataOps, GenerativeAI, Product thinking, and building cross-functional teams to deliver user-first products. // MLOps Jobs board https://mlops.pallet.xyz/jobs // MLOps Swag/Merch https://mlops-community.myshopify.com/ // Related Links QuantumBlack, AI by McKinsey: https://www.mckinsey.com/capabilities/quantumblack/how-we-help-clients --------------- ✌️Connect With Us ✌️ ------------- Join our slack community: https://go.mlops.community/slack Follow us on Twitter: @mlopscommunity Sign up for the next meetup: https://go.mlops.community/register Catch all episodes, blogs, newsletters, and more: https://mlops.community/ Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/ Connect with Stephen on LinkedIn: https://www.linkedin.com/in/stephen-batifol/ Connect with Mara on LinkedIn: https://www.linkedin.com/in/mara-pometti Connect with Mohamed on LinkedIn: https://www.linkedin.com/in/mabusaid/
DataOps, the promising future that nobody seems to be able to make reality. But not for lack of trying: meet Chris Bergh, "Head Chef" at DataKitchen, joining us again to tell us how the field has evolved over the last few years. To get in touch with Chris, pay a visit to DataKitchen.io! And find all the Open Source Tools we discussed on the DataKitchen GitHub pages. Please use the Contact Form on this blog or our Twitter feed to send us your questions, or to suggest future episode topics you would like us to cover.
Host: Hi everyone, welcome to our event. This event is brought to you by DataTalks.Club, which is a community of people who love data, and we have weekly events; today's is one of them. I guess we are also a community of people who like to wake up early if you're from the States, right, Christopher? Or maybe not so much, because this is the time we usually have our events. For our guests and presenters from the States we usually do it in the evening, Berlin time, but that kind of slipped my mind. Anyway, we have a lot of events, and you can check them via the link in the description. I don't think there are a lot of them on that link right now, but we will be adding more; I think we have five or six interviews scheduled, so keep an eye on that. Do not forget to subscribe to our YouTube channel, so you get notified about all our future streams, which will be as awesome as the one today. And, very important, do not forget to join our community, where you can hang out with other data enthusiasts. During today's interview you can ask any question: there's a pinned link in the live chat, so click on that link, ask your question, and we will cover these questions during the interview. Now I will stop sharing my screen. There's a message from Christopher to everyone who's watching right now, saying hello and asking, can I call you Chris, or...
Chris: Okay, I should look on YouTube then.
Host: You don't need to; you'll need to focus on answering questions, and I'll be keeping an eye on all the questions. So if you're ready, we can start.
Chris: I'm ready.
Host: And you prefer Christopher, not Chris, right?
Chris: Chris is fine. It's a bit shorter.
Host: Okay. So this week we'll talk about DataOps again. Maybe it's a tradition that we talk about DataOps once per year, although we actually skipped one year because we haven't had Chris on for some time. Today we have a very special guest. Christopher is the co-founder, CEO and head chef (or head cook) at DataKitchen, with 25 years of experience; maybe that's outdated, because by now you probably have more, and maybe you've stopped counting, but in any case tons of years of experience in analytics and software engineering. Christopher is known as the co-author of the DataOps Cookbook and the DataOps Manifesto. It's not the first time we've had Christopher here on the podcast: we interviewed him two years ago, also about DataOps, and this one will be about DataOps too, so we'll catch up and see what has actually changed in these two years. Welcome to the interview.
Chris: Well, thank you for having me. I'm happy to be here and talking all things related to DataOps, why bother with DataOps, and happy to talk about the company or what's changed. Excited.
Host: Yeah, so let's dive in. The questions for today's interview were prepared by Johanna Berer; as always, thanks, Johanna, for your help. Before we start with our main topic for today, DataOps, let's start with your background. Can you tell us about your career journey so far? For those who have not listened to the previous podcast, maybe you can talk about yourself, and for those who did listen, maybe give a summary of what has changed in the last two years.
Chris: Will do. So my name is Chris, and I guess I'm sort of an engineer. I spent about the first 15 years of my career in software, working on and building some AI systems and some non-AI systems at NASA and MIT Lincoln Lab, then some startups, and then Microsoft. Then, around 2005, I got the data bug. My kids were small and I thought, oh, this data thing will be easy and I'll be able to go home for dinner at five and life will be fine.
Host: You started your own company, right?
Chris: And it didn't work out that way. What was interesting is that, for me, the problem wasn't doing the data. We had smart people who did data science and data engineering, the act of creating things. It was the systems around the data that were hard. It was really hard to not have errors in production. I had a BlackBerry at the time and a long drive to work, and I would not look at my BlackBerry all morning. I'd sit in the parking lot, take a deep breath, look at my BlackBerry and go: uh oh, are there going to be any problems today? If there weren't, I'd walk in very happy, and if there were, I'd have to brace myself. And then the second problem was the team I worked with: we just couldn't go fast enough. The customers were super demanding; they didn't care, they always thought things should be faster, and we were always behind. So how do you live in that world where things are breaking left and right and you're terrified of making errors, and second, you just can't go fast enough?
Host: And that's the pre-Hadoop era, right? Before all this big data tech.
Chris: Yeah, before all this. We were using SQL Server, and we had smart people, so we built an engine in SQL Server that made SQL Server a columnar database. We built a columnar database inside of SQL Server in order to make certain things fast. And it was really not bad. I mean, the principles are the same: before Hadoop it's still a database, there are still indexes, there are still queries, things like that. At the time you would use OLAP engines; we didn't use those, but those reports, or the models, it's not that different. We had a rack of servers instead of the cloud. What I took from that was that it's just hard to run a team of people doing data and analytics. I took it from a manager's perspective: I started to read Deming and to think about the work that we do as a factory, a factory that produces insight rather than automobiles. So how do you run that factory so it produces things that are of good quality? And then second, since I had come from software, I've been very influenced by the DevOps movement: how you automate deployment, how you run in an agile way, how you change things quickly and how you innovate. Those two things, running a really good, solid production line that has very low errors, and then changing that production line very often, are kind of opposite, right? So how do you, as a manager, and how do you technically, approach that? And then, about ten years ago, we started DataKitchen. We've always been a profitable company, so we started off with some customers, we started building some software, and we realized that we couldn't work any other way, and that the way we work wasn't understood by a lot of people, so we had to write a book and a manifesto to share our methods. So yeah, we've now been in business a little over ten years.
Host: Oh, that's cool. So let's talk about DataOps. You mentioned DevOps and how you were inspired by it. By the way, do you remember roughly when DevOps started to appear, when people started calling these principles and the tools around them DevOps?
Chris: Well, first of all, I had a boss in 1990 at NASA who had this idea: build a little, test a little, learn a lot. That was his mantra, and it made a lot of sense. Then the Agile Software Manifesto came out, which is very similar, in 2001. And then the first real DevOps was a guy at Twitter who started to do automated deployment, push a button, and that was around 2009-ish, and I think the first DevOps meetup was around then. So it's been about 15 years, I guess.
Host: I started my career in 2010, and my first job was as a Java developer. I remember that for some things we would just SFTP to the machine, put the JAR archive there, and then keep our fingers crossed that it doesn't break. I wouldn't really call it deployment.
Chris: You were deploying. You had a deploy process, I'll put it that way. Was that documented too? Like, put the JAR on production, cross your fingers?
Host: I think there was a page on some internal wiki that described, with passwords, what you should do.
Chris: Yeah. And I think what's interesting is why that changed. We laugh at it now, but why didn't you invest in automating deployment, or in a whole bunch of automated regression tests that would run? Because I think in software now it would be rare for people not to use CI/CD, not to have some automated functional or regression tests. That would be the exception, whereas it was the norm at the beginning of your career. And that's what's interesting. If we talk about what's changed in the last two or three years, I think it is getting more standard. There are a lot more companies talking about DataOps or data observability, there are a lot more tools, and a lot more people are using Git in data and analytics than ever before, I think thanks to dbt. And there are a lot of tools that are getting more code-centric, that are not treating their configuration like a black box. There are several BI tools that tout the fact that they're Git-centric, that they're testable, and that they have APIs. So things like that.
Host: Maybe let's take a step back and do a quick summary of what DataOps is, and then we can talk about what changed in the last two years.
Chris: Sure. I guess it starts with a problem, and it sort of admits some dark things about data and analytics: that we're not really successful and we're not really happy. Look at the statistics on projects and problems, and even the psychology: a year or two ago we did a survey of 700 data engineers, and 78% of them wanted their job to come with a therapist and 50% were thinking of leaving the career altogether. So why is everyone sort of unhappy? I think teams fall into two buckets. There are the heroic teams, who are working night and day, trying really hard for their customer, and then they get burnt out and they quit, honestly. And then the second kind of team has wrapped their projects up in so much process and proceduralism and steps that doing anything is so slow and boring that they, again, leave in frustration, or live in cynicism. It's like the only outcome is to quit and start woodworking.
Host: Yeah, the only outcome really is to quit and start woodworking.
Chris: And as a manager I always hated that, because when your team is either full of heroes or full of proceduralism, you always have people who have the whole system in their head. They're certainly key people, and when they leave, they take all that knowledge with them, and that creates a bottleneck. So neither of those works. I think the main idea of DataOps is that there's a balance between fear and heroism that you can live in. You don't have to be fearful 95% of the time; maybe one or two percent of the time it's good to be fearful. And you don't have to be a hero; again, maybe one or two percent of the time it's good to be a hero. But there's a balance, and in that balance you actually are much more productive...
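In the exchange above, Chris contrasts "cross your fingers" deployment with the automated regression tests and CI/CD that are now standard in software, and notes that the same ideas are spreading into data work. Purely as an illustration of that idea (this is not DataKitchen's product or any tool mentioned in the interview; the file name, columns, and thresholds are made up), here is a minimal sketch of the kind of automated data check a pipeline could run in CI before publishing a table:

```python
# Minimal sketch of an automated data check: a CI job runs these assertions
# before a table is published, instead of a human eyeballing production and
# hoping nothing broke. File name, columns, and thresholds are assumptions.
import sys

import pandas as pd


def check_orders(df: pd.DataFrame) -> list[str]:
    """Return a list of human-readable failures; an empty list means all good."""
    failures = []
    if len(df) == 0:
        failures.append("orders table is empty")
    if df["order_id"].duplicated().any():
        failures.append("duplicate order_id values found")
    if df["amount"].lt(0).any():
        failures.append("negative order amounts found")
    null_rate = df["customer_id"].isna().mean()
    if null_rate > 0.01:  # tolerate at most 1% missing customer ids
        failures.append(f"customer_id null rate too high: {null_rate:.2%}")
    return failures


if __name__ == "__main__":
    # Hypothetical pipeline output (reading parquet needs pyarrow or fastparquet).
    orders = pd.read_parquet("orders.parquet")
    problems = check_orders(orders)
    for p in problems:
        print("FAILED:", p)
    # A non-zero exit code makes the CI job, and therefore the release, fail.
    sys.exit(1 if problems else 0)
```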
In this episode, Amir chats with Jai Prakash, Head of Insights and Data Engineering at Pella Corporation, about the evolution of XOps. They dive into Pella's journey in building a center of excellence, upskilling staff, and implementing AI and advanced analytics to enhance manufacturing. Jai shares insights on Pella's strategic tech transformations, including moving to the cloud and how legacy companies can innovate. He highlights the critical role of DataOps and plans for MLOps and AIOps, offering valuable takeaways for established and new XOps adopters. Highlights: - 01:17 The Need for XOps in Traditional Manufacturing - 03:22 Innovations in Manufacturing Processes - 05:14 Implementing Advanced Technologies - 09:51 Building a DataOps and XOps Infrastructure - 14:07 Upskilling and Future-Proofing the Workforce Guest: Jai Prakash is the Head of Insights and Data Engineering at Pella Corporation, where he spearheads the company's data-driven transformation. With extensive experience in building data strategies and implementing AI and advanced analytics, Jai has been instrumental in enhancing manufacturing processes. He focuses on creating centers of excellence, upskilling staff, and establishing robust DataOps infrastructures. Jai's expertise in technological innovations, including cloud migration and the development of MLOps and AIOps, positions him as a leader in driving strategic technological transformations within traditional manufacturing environments. LinkedIn: https://www.linkedin.com/in/jaiprakashd/ ---- Thank you so much for checking out this episode of The Tech Trek. We would appreciate it if you would take a minute to rate and review us on your favorite podcast player. Want to learn more about us? Head over to https://www.elevano.com Have questions or want to cover specific topics with our future guests? Please message me at https://www.linkedin.com/in/amirbormand (Amir Bormand)
Chris Bergh joins me to chat about all things DataOps. We also discuss lean, removing waste from data processes and teams, and much more. DataKitchen: https://datakitchen.io/ DataOps Manifesto: https://dataopsmanifesto.org/en/
Doug Needham is an OG DBA and data architect who built DataOps workflows back in Desert Storm (!) and has managed to stay very current with data to today. We talk about data architecture war stories, the hard work to do generative AI in the enterprise, and much more. Enjoy!
Explore the reasons for data engineers to collaborate with data scientists, machine learning (ML) engineers, and developers on DataOps initiatives that support GenAI. Published at: https://www.eckerson.com/articles/dataops-for-generative-ai-data-pipelines-part-iii-team-collaboration
Companies that adopt DataOps increase the odds of success by making GenAI data pipelines what they should be: modular, scalable, robust, flexible, and governed. Published: https://www.eckerson.com/articles/dataops-for-generative-ai-data-pipelines-part-ii-must-have-characteristics
Today's guest is Christian Schneider, CEO of QuinScape GmbH, a company in the Dataciders group. At Dataciders, Christian holds the role of Data & Analytics Evangelist. Among other things, we talk about: the importance of mentors and a good team (from 06:52)
There's the interview you think you're going to have, then there's the interview you get. This is one of those, in the best way possible. I expected to chat about his time at Snowflake. We didn't even get past his early days building data warehouses because it was so fascinating. Did you know Kent is arguably one of the very first practitioners (probably an accidental inventor) of DataOps? This is sort of a "prequel" episode. Kent Graziano and I chat about his early days as a data practitioner.
This week's guest is Dominik Obermaier (https://www.linkedin.com/in/dobermai/), Co-Founder and CTO of HiveMQ (https://www.linkedin.com/company/hivemq-gmbh/). With over 10 years of experience serving on the MQTT technical committee and helping organizations build their data foundations using HiveMQ's MQTT platform, Dominik shares his deep expertise on the technology. He explains what makes MQTT such an important communications protocol, why the emergence of the Unified Namespace matters for manufacturers, and debates the merits of on-prem vs. cloud solutions. Augmented Ops is a podcast for industrial leaders, shop floor operators, citizen developers, and anyone else that cares about what the future of frontline operations will look like across industries. This show is presented by Tulip (https://tulip.co/), the Frontline Operations Platform. You can find more from us at Tulip.co/podcast (https://tulip.co/podcast) or by following the show on LinkedIn (https://www.linkedin.com/company/augmentedpod/). HiveMQ is a Tulip Technology Ecosystem (https://tulip.co/partners/technology-ecosystem-partners/) Partner. Special Guest: Dominik Obermaier.
Anass Bensrhir is an Associate Partner at McKinsey & Company Casablanca. Anu Arora is a Principal Data Engineer at McKinsey & Company. Check out mckinsey.com/quantumblack MLOps podcast #214 with QuantumBlack, AI by McKinsey's Principal Data Engineer, Anu Arora, and Associate Partner, Anass Bensrhir, Managing Data for Effective GenAI Application, brought to you by our Premium Brand Partner QuantumBlack AI by @McKinsey. // Abstract Generative AI is poised to bring impact across all industries and business functions. While many companies pilot GenAI, only a few have deployed GenAI use cases, e.g., retailers are producing videos to answer common customer questions using ChatGPT. A majority of organizations are facing challenges to industrialize and scale, with data being one of the biggest inhibitors. Organizations need to strengthen their data foundations given that among leading organizations, 72% noted managing data among the top challenges preventing them from scaling impact. Furthermore, leaders noticed that +31% of their staff's time is spent on non-value-added tasks due to poor data quality and availability issues. // Bio Anu Arora Data architect (~12 years) with experience in big data technologies, API development, building scalable data pipelines including DevOps and DataOps, and building GenAI solutions. Anass Bensrhir Anass leads QuantumBlack in Africa; he specializes in the financial sector and helps organizations deliver successful large data transformation programs. // MLOps Jobs board https://mlops.pallet.xyz/jobs // MLOps Swag/Merch https://mlops-community.myshopify.com/ // Related Links Website: https://www.mckinsey.com/capabilities/quantumblack/how-we-help-clients --------------- ✌️Connect With Us ✌️ ------------- Join our slack community: https://go.mlops.community/slack Follow us on Twitter: @mlopscommunity Sign up for the next meetup: https://go.mlops.community/register Catch all episodes, blogs, newsletters, and more: https://mlops.community/ Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/ Connect with Anu on LinkedIn: https://uk.linkedin.com/in/anu-arora-072012 Connect with Anass on LinkedIn: https://www.linkedin.com/in/abensrhir/ Timestamps: [00:00] Anass and Anu's preferred coffee [00:35] Takeaways [04:02] Please like, share, leave a review, and subscribe to our MLOps channels! [04:09] Huge shout out to our sponsor QuantumBlack! [04:29] Anu's tech background [06:31] Anass tech background [07:28] The landscape of data [10:37] Dealing with unstructured data [15:51] Data lakes and ETL processes [22:19] Data Engineers' Heavy Workload [29:49] Data privacy and PII in the new LLMs paradigm [36:13] Balancing LLM Adoption Risk [44:06] Effective LMS Implementation Strategy [49:00] Decisions: Create or Wait [50:39] Wrap up
The success of Generative AI depends on fundamental disciplines like DataOps. Published at: https://www.eckerson.com/articles/dataops-for-generative-ai-data-pipelines-part-i-what-and-why
This week's guest is Alex Krüger (https://www.linkedin.com/in/alexander-krueger/), Co-founder and CEO of United Manufacturing Hub (https://www.linkedin.com/company/united-manufacturing-hub/), or UMH. Alex shares his journey from working on integration projects in consulting fresh out of college, to founding UMH and building an open source alternative to the offerings from incumbent vendors. He breaks down the role of the open source software movement in manufacturing, how the Unified Namespace architecture compares to the traditional ISA-95 model, and how IT can best enable OT to solve problems. Plus, he shares his vision for how microservice-based MES solutions can disrupt the existing monolithic applications. Augmented Ops is a podcast for industrial leaders, shop floor operators, citizen developers, and anyone else that cares about what the future of frontline operations will look like across industries. This show is presented by Tulip (https://tulip.co/), the Frontline Operations Platform. You can find more from us at Tulip.co/podcast (https://tulip.co/podcast) or by following the show on LinkedIn (https://www.linkedin.com/company/augmentedpod/). UMH is a Tulip Technology Ecosystem (https://tulip.co/partners/technology-ecosystem-partners/) Partner. Special Guest: Alex Krüger.
Drive change; own your data challenges. This week, we're bringing you a special episode as we're joined by two of our own: John Breedon, Director of DataOps & Performance, and Seb Tyack, Managing Director of Channel Solutions. We discuss owning your channel data strategy, and how you can get your data to do the heavy lifting to supercharge your channel success.
Why is data a key part of channel success?
As a vendor, what data should you be prioritising?
How can you tell if your programs are performing for your channel?
And so many more insightful topics beyond these questions too.
This week's guest is Vatsal Shah (https://www.linkedin.com/in/vatsal12/), Founder and CEO of Litmus (https://www.linkedin.com/company/litmus-automation/). Vatsal discusses his journey from an automation engineer at Rockwell, to building a new industrial data platform from the ground up after becoming frustrated with the limitations of the offerings from established vendors. He discusses manufacturers' exodus from on-prem to cloud systems, the pros and cons of data protocols like MQTT and Sparkplug B, and why the Unified Namespace architecture is getting so much attention. Plus, he shares his vision for the future of edge computing and how an open ecosystem of interoperable tools is transforming the industry. Augmented Ops is a podcast for industrial leaders, shop floor operators, citizen developers, and anyone else that cares about what the future of frontline operations will look like across industries. This show is presented by Tulip (https://tulip.co/), the Frontline Operations Platform. You can find more from us at Tulip.co/podcast (https://tulip.co/podcast) or by following the show on LinkedIn (https://www.linkedin.com/company/augmentedpod/). Litmus is a Tulip Technology Ecosystem (https://tulip.co/partners/technology-ecosystem-partners/) Partner. Special Guest: Vatsal Shah.
This interview was recorded for the GOTO Book Club: http://gotopia.tech/bookclub
Read the full transcription of the interview here.
Lauren Maffeo - Senior Service Designer at Steampunk & Author of "Designing Data Governance from the Ground Up"
Samia Rahman - Director of Enterprise Data Strategy and Governance at Seagen
RESOURCES
Lauren: https://twitter.com/LaurenMaffeo | https://www.linkedin.com/in/laurenmaffeo
Samia: https://www.linkedin.com/in/samia-r-b7b65216 | https://twitter.com/rahman1_samia
DESCRIPTION
Data governance manages the people, processes, and strategy needed for deploying data projects to production. But doing it well is far from easy: less than one-fourth of business leaders say their organizations are data-driven. In Designing Data Governance from the Ground Up, you'll build a cross-functional strategy to create roadmaps and stewardship for data-focused projects, embed data governance into your engineering practice, and put processes in place to monitor data after deployment.
In the last decade, the amount of data people produced grew 3,000 percent. Most organizations lack the strategy to clean, collect, organize, and automate data for production-ready projects. Without effective data governance, most businesses will keep failing to gain value from the mountain of data that's available to them.
There's a plethora of content intended to help DataOps and DevOps teams reach production, but 90 percent of projects trained with big data fail to reach production because they lack governance.
This book shares six steps you can take to build a data governance strategy from scratch. You'll find a data framework, pull together a team of data stewards, build a data governance team, define your roadmap, weave data governance into your development process, and monitor your data in production. [...]
Book description: © Pragmatic Programmers. The interview is based on the book "Designing Data Governance from the Ground Up".
RECOMMENDED BOOKS
Lauren Maffeo • Designing Data Governance from the Ground Up
Katharine Jarmul • Practical Data Privacy
Katharine Jarmul & Jacqueline Kazil • Data Wrangling with Python
Yehonathan Sharvit • Data-Oriented Programming
Zhamak Dehghani • Data Mesh
Eberhard Wolff & Hanna Prinz • Service Mesh
Piethein Strengholt • Data Management at Scale
Martin Kleppmann • Designing Data-Intensive Applications
Looking for a unique learning experience? Attend the next GOTO conference near you! Get your ticket: gotopia.tech
SUBSCRIBE TO OUR YOUTUBE CHANNEL - new videos posted daily!
Join our virtual conference 'AI in Production'. Transform faster. Innovate smarter. Anticipate the future. At QuantumBlack, we unlock the power of artificial intelligence (AI) to help organizations reinvent themselves from the ground up—and accelerate sustainable and inclusive growth. MLOps Coffee Sessions Special episode with QuantumBlack, AI by McKinsey, GenAI Buy vs Build, Commercial vs Open Source, fueled by our Premium Brand Partner, QuantumBlack, AI by McKinsey. // Abstract Do you build or buy? Check out the QuantumBlack team discussing the different sides of buying vs building your own GenAI solution. Let's look at the trade-offs companies need to make - including some of the considerations of using black box solutions that do not provide transparency on what data sources were used. Whether you are a business leader or a developer exploring the space of GenAI, this talk provides valuable insights to help you be more informed and prepared for navigating this fast-moving space. // Bio Ilona Logvinova Ilona Logvinova is the Head of Innovation for McKinsey Legal, working across the legal department to identify, lead, and implement cross-cutting and impactful innovation initiatives, covering legal technologies and reimagination of the profession initiatives. At McKinsey Ilona is also Managing Counsel for McKinsey Digital, working closely with emerging technologies across use cases and industries. Mohamed Abusaid Mohamed is a tech enthusiast, hacker, avid traveler, and foodie all rolled into one. He built his first website when he was 9 and has been in love with computers and the internet ever since. He graduated in computer science from university, although he dabbled in electrical, electronic, and network engineering before that. When he's not reading up on the latest tech conversations and products on Hacker News, Mohamed spends his time traveling to new destinations and exploring their cuisine and culture. Mohamed works with different companies helping them tackle challenges in developing, deploying, and scaling their analytics to reach its potential. Some topics he's enthusiastic about include MLOps, DataOps, GenerativeAI, Product thinking, and building cross-functional teams to deliver user-first products. Nayur Khan Nayur is a partner within McKinsey and part of the QuantumBlack, AI by McKinsey leadership team. He predominantly focuses on helping organizations build capabilities to industrialize and scale artificial intelligence (AI), including the newer Generative AI. He helps companies navigate innovations, technologies, processes, and digital skills as needed to run at scale. He is a keynote speaker and is recognized in the DataIQ 100 - a list of the top 100 influential people in data. Nayur also leads the firm's diversity and inclusion efforts within QuantumBlack to promote a more equitable environment for all. He speaks with organizations on the importance of diversity and diverse team building—especially when working with data and building AI.
// MLOps Jobs board https://mlops.pallet.xyz/jobs // MLOps Swag/Merch https://mlops-community.myshopify.com/ // Related Links --------------- ✌️Connect With Us ✌️ ------------- Join our slack community: https://go.mlops.community/slack Follow us on Twitter: @mlopscommunity Sign up for the next meetup: https://go.mlops.community/register Catch all episodes, blogs, newsletters, and more: https://mlops.community/ Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/ Connect with Ilona on LinkedIn: www.linkedin.com/in/ilonalogvinova Connect with Mo on LinkedIn: https://www.linkedin.com/in/mabusaid/ Connect with Nayur on LinkedIn: https://www.linkedin.com/in/nayur/
Telemetry data pipelines are designed to collect, process and transmit telemetry data from various sources towards a place where businesses can store, analyse and utilise it for better decision-making. Important for SREs, DevOps, ITOps, SecOps, DataOps and the wider tech professional, taking control of your telemetry data to address data challenges in a compliant and secure way should be a common goal for the modern business. In this episode of the EM360 Podcast, Analyst Kevin Petrie speaks to Tucker Callaway, CEO at Mezmo, to discuss:
Understanding telemetry data pipelines
Addressing data challenges with telemetry data
Optimising performance and enhancing security with telemetry data
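The "collect, process and transmit" framing above is easiest to picture as a small transform stage sitting between sources and destinations. The sketch below is a generic, hedged illustration of such a stage (it is not Mezmo's product or API; the event shape and rules are invented): it drops debug-level noise and redacts email addresses before events are forwarded downstream.

```python
# Generic sketch of the "process" stage of a telemetry pipeline: filter out
# low-value events and redact sensitive fields before forwarding them to
# storage or analysis. Event shape and rules are illustrative assumptions.
import re
from typing import Iterable, Iterator

EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")


def process(events: Iterable[dict]) -> Iterator[dict]:
    for event in events:
        # Filter: drop debug-level noise to cut downstream volume and cost.
        if event.get("level") == "debug":
            continue
        # Redact: mask email addresses for compliance before the data leaves.
        message = EMAIL_RE.sub("<redacted-email>", event.get("message", ""))
        yield {**event, "message": message}


if __name__ == "__main__":
    raw = [
        {"level": "debug", "message": "cache miss for key 42"},
        {"level": "error", "message": "login failed for jane.doe@example.com"},
    ]
    for clean in process(raw):
        print(clean)  # ready to transmit to the observability backend
```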
In the past year, developers have faced both promise and uncertainty, particularly in the realm of generative AI. Heath Newburn, global field CTO for PagerDuty, joins TNS host Heather Joslyn to talk about the impact AI and other topics will have on developers in 2024. Newburn anticipates a growing emphasis on DevSecOps in response to high-profile cyber incidents, noting a shift in executive attitudes toward security spending. The rise of automation-centric tools like Backstage signals a changing landscape in the link between development and operations tools. Notably, there's a move from focusing on efficiency gains to achieving new outcomes, with organizations seeking innovative products rather than marginal coding speed improvements. Newburn highlights the importance of experimentation, encouraging organizations to identify areas for trial and error, learning swiftly from failures. The upcoming year is predicted to favor organizations capable of rapid experimentation and information gathering over perfection in code writing. Listen to the full podcast episode as Newburn further discusses his predictions related to platform engineering, remote work, and the continued impact of generative AI.
Learn more from The New Stack about PagerDuty and trends in software development:
How AI and Automation Can Improve Operational Resiliency
Why Infrastructure as Code Is Vital for Modern DevOps
Operationalizing AI: Accelerating Automation, DataOps, AIOps
Nikki - Can you tell us a little bit about what interested you in cloud security in the first place? I know you have a particular interest in misconfigurations - was there a singular event that spurred your interest?
Chris - What are your thoughts around guardrails in the cloud and using things such as event-based detections?
Chris - You interestingly took a Product role, but have a Detection and CloudSec background. How has the Product role been, and do you think having the practitioner background helps you be a more effective Product Manager and leader?
Nikki - There's a lot of talk around DataOps and SecOps - we're really seeing a bridging of fields and concepts to bring teams together. I wanted to talk a little bit about the human element here - do you see more of this blending of fields/disciplines?
Chris - I know you've taken a new role recently with Monad, which focuses on Security Data Lakes. What made you interested in this role, and why do you think we're seeing so much focus on Security Data Lakes in the industry?
Nikki - What are some of the emerging trends you see in cyber attacks against the cloud? What should people be most concerned with and focus on first when it comes to cloud security?
Chris - You also lead the Cyber Pulse newsletter, which I read and strongly recommend for news and market trends. What made you start the newsletter, and have you found it helps keep you sharp due to needing to stay on top of relevant topics and trends?
Nikki - What does cyber resiliency mean to you?
Operational resiliency, as explained by Dormain Drewitz of PagerDuty, involves the ability to bounce back and recover from setbacks, not only technically but also in terms of organizational recovery. True resiliency means maintaining the willingness to take risks even after facing challenges. In a conversation with Heather Joslyn on the New Stack Makers podcast, Drewitz discussed the role of AI and automation in achieving operational resiliency, especially in a context where teams are under pressure to be more productive.
Automation, including generative AI code completion tools, is increasingly used to boost developer productivity. However, this may lead to shifting bottlenecks from developers to operations, creating new challenges. Drewitz emphasized the importance of considering the entire value chain and identifying areas where AI and automation can assist. For instance, automating repetitive tasks in incident response, such as checking APIs, closing ports, or database checks, can significantly reduce interruptions and productivity losses.
PagerDuty's AI-powered platform leverages generative AI to automate tasks and create runbooks for incident handling, allowing engineers to focus on resolving root causes and restoring services. This includes drafting status updates and incident postmortem reports, streamlining incident response and saving time. Having an operations platform that can generate draft reports at the push of a button simplifies the process, making it easier to review and edit without starting from scratch.
Learn more from The New Stack about AI, Automation, Incident Response, and PagerDuty:
Operationalizing AI: Accelerating Automation, DataOps, AIOps
Three Ways Automation Can Improve Workplace Culture
Incident Response: Three Ts to Rule Them All
Four Ways to Win Executive Buy-In for Automation
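Drewitz's examples of automatable incident-response work (checking APIs, database checks) come down to scripted diagnostics that run as soon as an alert fires. The sketch below is a generic, hypothetical version of one such runbook step, not PagerDuty's platform or API: it probes a service health endpoint, measures latency, and returns a summary that a responder or an automation rule could act on. The URL and threshold are made up.

```python
# Generic sketch of an automated incident-response diagnostic: when an alert
# fires, probe the affected service's health endpoint and summarise the result
# so the responder starts with facts instead of manual checks. The URL and
# latency budget are made up; this does not use PagerDuty's API.
import time

import requests  # pip install requests

HEALTH_URL = "https://api.example.internal/healthz"  # hypothetical endpoint
LATENCY_BUDGET_S = 0.5


def run_health_check(url: str) -> dict:
    started = time.monotonic()
    try:
        resp = requests.get(url, timeout=5)
        latency = time.monotonic() - started
        return {
            "reachable": True,
            "status_code": resp.status_code,
            "latency_s": round(latency, 3),
            "slow": latency > LATENCY_BUDGET_S,
        }
    except requests.RequestException as exc:
        return {"reachable": False, "error": str(exc)}


if __name__ == "__main__":
    result = run_health_check(HEALTH_URL)
    print(result)  # e.g. attach this to the incident as a first diagnostic note
```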
MLOps Coffee Sessions #177 with Mohamed Abusaid and Mara Pometti, Empowering Employees: Education and Literacy for Data and AI in the Workplace, sponsored by QuantumBlack. // Abstract Trust is paramount in the adoption of new technologies, especially in the realm of education. Mohamed and Mara shed light on the importance of AI governance programs and establishing AI governance boards to ensure safe and ethical use of technology while managing associated risks. They discuss the impact on customers, potential risks, and mitigation strategies that organizations must consider to protect their brand reputation and comply with regulations. // Bio Mara Pometti Mara is an Associate Design Director at McKinsey & Company, where she helps organisations drive AI adoption through human-centered methods. She defines herself as a data-savvy humanist. Her practice spans across AI, data journalism, and design with the overarching objective of finding the strategic intersection between AI models and human intents to implement responsible AI systems that move organisations forward. Previously, she led the AI Strategy practice at IBM, where she also developed the company's first-ever data storytelling program. Yet, by background, she is a data journalist. She worked as a data journalist for agencies and newsrooms like Aljazeera. Mara lectured at many universities about how to humanize AI, including the London School of Economics. Her books and writing explore how to weave a humanistic approach to AI development. Mohamed Abusaid Mohamed is a tech enthusiast, hacker, avid traveler, and foodie all rolled into one. He built his first website when he was 9 and has been in love with computers and the internet ever since. He graduated in computer science from university, although he dabbled in electrical, electronic, and network engineering before that. When he's not reading up on the latest tech conversations and products on Hacker News, Mohamed spends his time traveling to new destinations and exploring their cuisine and culture. Mohamed works with different companies helping them tackle challenges in developing, deploying, and scaling their analytics to reach its potential. Some topics he's enthusiastic about include MLOps, DataOps, GenerativeAI, Product thinking, and building cross-functional teams to deliver user-first products. // MLOps Jobs board https://mlops.pallet.xyz/jobs // MLOps Swag/Merch https://mlops-community.myshopify.com/ // Related Links --------------- ✌️Connect With Us ✌️ ------------- Join our slack community: https://go.mlops.community/slack Follow us on Twitter: @mlopscommunity Sign up for the next meetup: https://go.mlops.community/register Catch all episodes, blogs, newsletters, and more: https://mlops.community/ Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/ Connect with Mara on LinkedIn: https://www.linkedin.com/in/mara-pometti Connect with Mohamed on LinkedIn: https://www.linkedin.com/in/mabusaid/ Timestamps: [00:00] Mara and Mohamed's preferred coffee [00:47] Takeaways [05:04] Big shout out to QuantumBlack and thank you for sponsoring this episode [05:37] Please like, share, and subscribe to our MLOps channels!
[05:53] How QuantumBlack uses LLMs [08:54] Using Notion AI function [09:33] Shapers [12:08] Build on existing services with sophistication [15:41] Being creative to compete [21:19] From taker to shaper, from shaper to maker [25:24] Testing prompts [29:33] Next troubleshooting steps when prompts go wrong [32:29] Backup mechanisms essential for improving LLM products [35:07] APIs and external models in an in-house setting [37:38] What can legitimately be called "open source" [40:32] Privacy or convenience? [44:41] Split approaches to API usage [51:07] Companies serious about privacy, data usage, and important regulations [58:39] AI's potential danger lies in text disinformation [1:00:11] Cultural influences [1:01:19] Wrap up
Data Futurology - Data Science, Machine Learning and Artificial Intelligence From Industry Leaders
At Data Futurology's OpsWorld conference in March, a panel of experts came together to discuss the importance of getting measurements, processes and methodologies right to drive DataOps and MLOps across the organisation. The panel consisted of Katherine Fowler, Head of Business Transformation at L'Occitane Australia, Amar Poddatooru, Head of Data and Technology at Australian Ethical, and Emyr James, Head of Data at Resolution Life; moderating the discussion was Andrew Aho, Regional Director, Data Platforms at InterSystems. It became a far-reaching discussion that started with methods to define and measure the ROI of data and analytics initiatives and how to get those projects off the ground. The discussion moved on to overhyped technologies in the data space, and then looked forward to what is on the horizon for the years ahead. As the panel discussed, there is a lot of interest among consumers in some innovative technologies, including ChatGPT. This is in turn driving a lot of interest at the executive level in rolling out solutions that use these tools. However, without the right foundations in place, and without proper concern for the privacy and regulatory risks associated with these tools, they will cause the data team more headaches than they're worth. This panel discussion is essential for understanding how to structure a foundation for data success, be disciplined in deploying the available resources across the data team, gain executive buy-in, and then steadily build the practice up. Enjoy the show! Thank you to our sponsor, Talent Insights Group! Join us for our next events: Advancing AI and Data Engineering Sydney (5-7 September) and OpsWorld: Deploying Data & ML Products (Melbourne, 24-25 October): https://www.datafuturology.com/events Join our Slack Community: https://join.slack.com/t/datafuturologycircle/shared_invite/zt-z19cq4eq-ET6O49o2uySgvQWjM6a5ng What we discussed 2:07: Felipe introduces the Measurements Thought Leaders panel and moderator, Andrew Aho. 3:48: How do you define and measure data and analytics ROI? 7:21: A discussion on metrics that help get data initiatives off the ground. 9:41: How a data leader needs to focus on the data platform, and articulate both the “big picture” view and the details. 12:35: As more organisations adopt ops, processes and methodologies, what challenges might people anticipate arising, and how can those be addressed? 17:24: What can data professionals do to help solve the change management challenge? 18:34: What are the challenges and impact of upcoming “silver bullet” technologies like ChatGPT? 20:16: What is currently overhyped in the data space (and why)? 24:03: What can we as data scientists do to ensure that we're looking at the right risks and drawing accurate conclusions on what is right for the business? 26:13: If the goal is to focus on data science, how can we also keep experimentation and creativity going? 29:49: How do you estimate the value of change to get executive buy-in? 31:18: What upcoming developments and trends will emerge over the next five to ten years? --- Send in a voice message: https://podcasters.spotify.com/pod/show/datafuturology/message
The unbundling of the data ecosystem is causing organizations to “duct tape” products and frameworks together to build their solutions and data delivery processes. Organizations fail to build and deploy end-to-end, automated, repeatable data-driven systems, ignoring data engineering and DataOps principles as well as best practices. Published at: https://www.eckerson.com/articles/dataops-in-data-engineering
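To make "automated, repeatable" concrete, here is a small, hypothetical sketch of a single pipeline step that validates its input, transforms it, checks the output, and publishes atomically so the step can be rerun safely. The file names, schema, and checks are invented for illustration and are not taken from the article.

```python
# Hypothetical illustration of an automated, repeatable pipeline step; the file
# names, schema, and checks are invented for the example, not from the article.
import csv
import json
import os
import tempfile

REQUIRED_FIELDS = {"order_id", "amount", "currency"}  # assumed input contract

def validate(rows: list[dict]) -> None:
    """Fail fast if the input does not match the expected contract."""
    for i, row in enumerate(rows):
        missing = REQUIRED_FIELDS - row.keys()
        if missing:
            raise ValueError(f"row {i} missing fields: {sorted(missing)}")

def transform(rows: list[dict]) -> list[dict]:
    """Example transformation: normalise currency codes and cast amounts."""
    return [
        {**row, "currency": row["currency"].upper(), "amount": float(row["amount"])}
        for row in rows
    ]

def publish_atomically(rows: list[dict], target_path: str) -> None:
    """Write to a temp file and rename, so reruns never leave partial output."""
    fd, tmp_path = tempfile.mkstemp(dir=os.path.dirname(target_path) or ".")
    with os.fdopen(fd, "w") as f:
        json.dump(rows, f)
    os.replace(tmp_path, target_path)  # atomic rename

def run_step(source_path: str, target_path: str) -> None:
    with open(source_path, newline="") as f:
        rows = list(csv.DictReader(f))
    validate(rows)
    out = transform(rows)
    assert len(out) == len(rows), "transform must not drop rows silently"
    publish_atomically(out, target_path)

if __name__ == "__main__":
    run_step("orders.csv", "orders_clean.json")  # assumed file names
```

The property worth noticing is that a rerun never requires manual cleanup; that repeatability is exactly what tends to get lost when products are duct-taped together without DataOps discipline.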
Summary Data transformation is a key activity for all of the organizational roles that interact with data. Because of its importance and outsized impact on what is possible for downstream data consumers it is critical that everyone is able to collaborate seamlessly. SQLMesh was designed as a unifying tool that is simple to work with but powerful enough for large-scale transformations and complex projects. In this episode Toby Mao explains how it works, the importance of automatic column-level lineage tracking, and how you can start using it today. Announcements Hello and welcome to the Data Engineering Podcast, the show about modern data management RudderStack helps you build a customer data platform on your warehouse or data lake. Instead of trapping data in a black box, they enable you to easily collect customer data from the entire stack and build an identity graph on your warehouse, giving you full visibility and control. Their SDKs make event streaming from any app or website easy, and their extensive library of integrations enable you to automatically send data to hundreds of downstream tools. Sign up free at dataengineeringpodcast.com/rudderstack (https://www.dataengineeringpodcast.com/rudderstack) Your host is Tobias Macey and today I'm interviewing Toby Mao about SQLMesh, an open source DataOps framework designed to scale data transformations with ease of collaboration and validation built in Interview Introduction How did you get involved in the area of data management? Can you describe what SQLMesh is and the story behind it? DataOps is a term that has been co-opted and overloaded. What are the concepts that you are trying to convey with that term in the context of SQLMesh? What are the rough edges in existing toolchains/workflows that you are trying to address with SQLMesh? How do those rough edges impact the productivity and effectiveness of teams using those tools? Can you describe how SQLMesh is implemented? How have the design and goals evolved since you first started working on it? What are the lessons that you have learned from dbt which have informed the design and functionality of SQLMesh? For teams who have already invested in dbt, what is the migration path from or integration with dbt? You have some built-in integration with/awareness of orchestrators (currently Airflow). What are the benefits of making the transformation tool aware of the orchestrator? What do you see as the potential benefits of integration with e.g. data-diff? What are the second-order benefits of using a tool such as SQLMesh that addresses the more mechanical aspects of managing transformation workflows and the associated dependency chains? What are the most interesting, innovative, or unexpected ways that you have seen SQLMesh used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on SQLMesh? When is SQLMesh the wrong choice? What do you have planned for the future of SQLMesh? Contact Info tobymao (https://github.com/tobymao) on GitHub @captaintobs (https://twitter.com/captaintobs) on Twitter Website (http://tobymao.com/) Parting Question From your perspective, what is the biggest gap in the tooling or technology for data management today? Closing Announcements Thank you for listening! Don't forget to check out our other shows. Podcast.__init__ (https://www.pythonpodcast.com) covers the Python language, its community, and the innovative ways it is being used.
The Machine Learning Podcast (https://www.themachinelearningpodcast.com) helps you go from idea to production with machine learning. Visit the site (https://www.dataengineeringpodcast.com) to subscribe to the show, sign up for the mailing list, and read the show notes. If you've learned something or tried out a project from the show then tell us about it! Email hosts@dataengineeringpodcast.com (mailto:hosts@dataengineeringpodcast.com) with your story. To help other people find the show please leave a review on Apple Podcasts (https://podcasts.apple.com/us/podcast/data-engineering-podcast/id1193040557) and tell your friends and co-workers Links SQLMesh (https://github.com/TobikoData/sqlmesh) Tobiko Data (https://tobikodata.com/) SAS (https://www.sas.com/en_us/home.html) AirBnB Minerva (https://medium.com/airbnb-engineering/how-airbnb-achieved-metric-consistency-at-scale-f23cc53dea70) SQLGlot (https://github.com/tobymao/sqlglot) Cron (https://man.freebsd.org/cgi/man.cgi?query=cron&sektion=8&n=1) AST == Abstract Syntax Tree (https://en.wikipedia.org/wiki/Abstract_syntax_tree) Pandas (https://pandas.pydata.org/) Terraform (https://www.terraform.io/) dbt (https://www.getdbt.com/) Podcast Episode (https://www.dataengineeringpodcast.com/dbt-data-analytics-episode-81/) SQLFluff (https://github.com/sqlfluff/sqlfluff) Podcast.__init__ Episode (https://www.pythonpodcast.com/sqlfluff-sql-linter-episode-318/) The intro and outro music is from The Hug (http://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/Love_death_and_a_drunken_monkey/04_-_The_Hug) by The Freak Fandango Orchestra (http://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/) / CC BY-SA (http://creativecommons.org/licenses/by-sa/3.0/)
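The links above include SQLGlot, the SQL parser and transpiler that SQLMesh builds on. As a small taste of what parsing SQL into an AST makes possible, the sketch below (plain SQLGlot usage, not SQLMesh's own lineage engine) lists the source columns behind each output column of a query and transpiles the statement between dialects. The query and table names are made up for the example.

```python
# Plain SQLGlot usage (pip install sqlglot); this is not SQLMesh's lineage API,
# just a sketch of what parsing SQL into an AST makes possible.
import sqlglot
from sqlglot import exp

SQL = """
SELECT
  o.order_id,
  o.amount * fx.rate AS amount_usd,
  UPPER(c.country) AS country
FROM orders AS o
JOIN fx_rates AS fx ON o.currency = fx.currency
JOIN customers AS c ON o.customer_id = c.customer_id
"""

ast = sqlglot.parse_one(SQL)

# For each output column, list the source columns it is derived from.
for projection in ast.expressions:
    sources = sorted(
        {f"{col.table}.{col.name}" for col in projection.find_all(exp.Column)}
    )
    print(f"{projection.alias_or_name:<12} <- {sources}")

# The same AST lets you transpile the query to another SQL dialect.
print(sqlglot.transpile(SQL, read="duckdb", write="snowflake")[0])
```

Column-level lineage in SQLMesh itself goes well beyond this, as the episode discusses, but even the parser-level view shows why working from a real AST beats regex-level tooling for reasoning about transformations.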
Links: LinkedIn: https://www.linkedin.com/in/santona-tuli/ Upsolver website: upsolver.com Why we built a SQL-based solution to unify batch and stream workflows: https://www.upsolver.com/blog/why-we-built-a-sql-based-solution-to-unify-batch-and-stream-workflows Free MLOps course: https://github.com/DataTalksClub/mlops-zoomcamp Join DataTalks.Club: https://datatalks.club/slack.html Our events: https://datatalks.club/events.html
Summary A significant portion of the time spent by data engineering teams is on managing the workflows and operations of their pipelines. DataOps has arisen as a parallel set of practices to that of DevOps teams as a means of reducing wasted effort. Agile Data Engine is a platform designed to handle the infrastructure side of the DataOps equation, as well as providing the insights that you need to manage the human side of the workflow. In this episode Tevje Olin explains how the platform is implemented, the features that it provides to reduce the amount of effort required to keep your pipelines running, and how you can start using it in your own team. Announcements Hello and welcome to the Data Engineering Podcast, the show about modern data management RudderStack helps you build a customer data platform on your warehouse or data lake. Instead of trapping data in a black box, they enable you to easily collect customer data from the entire stack and build an identity graph on your warehouse, giving you full visibility and control. Their SDKs make event streaming from any app or website easy, and their extensive library of integrations enable you to automatically send data to hundreds of downstream tools. Sign up free at dataengineeringpodcast.com/rudderstack (https://www.dataengineeringpodcast.com/rudderstack) Your host is Tobias Macey and today I'm interviewing Tevje Olin about Agile Data Engine, a platform that combines data modeling, transformations, continuous delivery and workload orchestration to help you manage your data products and the whole lifecycle of your warehouse Interview Introduction How did you get involved in the area of data management? Can you describe what Agile Data Engine is and the story behind it? What are some of the tools and architectures that an organization might be able to replace with Agile Data Engine? How does the unified experience of Agile Data Engine change the way that teams think about the lifecycle of their data? What are some of the types of experiments that are enabled by reduced operational overhead? What does CI/CD look like for a data warehouse? How is it different from CI/CD for software applications? Can you describe how Agile Data Engine is architected? How have the design and goals of the system changed since you first started working on it? What are the components that you needed to develop in-house to enable your platform goals? What are the changes in the broader data ecosystem that have had the most influence on your product goals and customer adoption? Can you describe the workflow for a team that is using Agile Data Engine to power their business analytics? What are some of the insights that you generate to help your customers understand how to improve their processes or identify new opportunities? In your "about" page it mentions the unique approaches that you take for warehouse automation. How do your practices differ from the rest of the industry? How have changes in the adoption/implementation of ML and AI impacted the ways that your customers exercise your platform? What are the most interesting, innovative, or unexpected ways that you have seen the Agile Data Engine platform used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on Agile Data Engine? When is Agile Data Engine the wrong choice? What do you have planned for the future of Agile Data Engine? 
Guest Contact Info LinkedIn (https://www.linkedin.com/in/tevjeolin/?originalSubdomain=fi) Parting Question From your perspective, what is the biggest gap in the tooling or technology for data management today? About Agile Data Engine Agile Data Engine unlocks the potential of your data to drive business value - in a rapidly changing world. Agile Data Engine is a DataOps Management platform for designing, deploying, operating and managing data products, and managing the whole lifecycle of a data warehouse. It combines data modeling, transformations, continuous delivery and workload orchestration into the same platform. Links Agile Data Engine (https://www.agiledataengine.com/agile-data-engine-x-data-engineering-podcast) Bill Inmon (https://en.wikipedia.org/wiki/Bill_Inmon) Ralph Kimball (https://en.wikipedia.org/wiki/Ralph_Kimball) Snowflake (https://www.snowflake.com/en/) Redshift (https://aws.amazon.com/redshift/) BigQuery (https://cloud.google.com/bigquery) Azure Synapse (https://azure.microsoft.com/en-us/products/synapse-analytics/) Airflow (https://airflow.apache.org/) The intro and outro music is from The Hug (http://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/Love_death_and_a_drunken_monkey/04_-_The_Hug) by The Freak Fandango Orchestra (http://freemusicarchive.org/music/The_Freak_Fandango_Orchestra/) / CC BY-SA (http://creativecommons.org/licenses/by-sa/3.0/)
Many data engineers already use large language models to assist data ingestion, transformation, DataOps, and orchestration. This blog commences a series that explores the emergence of ChatGPT, Bard, and LLM tools from data pipeline vendors, and their implications for the discipline of data engineering. Published at: https://www.eckerson.com/articles/should-ai-bots-build-your-data-pipelines-examining-the-role-of-chatgpt-and-large-language-models-in-data-engineering
The breakaway success of ChatGPT is hiding an important fact and an even bigger problem. The next wave of generative AI will not be built by trawling the Internet but by mining the hoards of proprietary data that have been piling up for years inside organizations. While Elon Musk and Reddit may breathe a sigh of relief, this ushers in a new set of concerns that go well beyond prompt injections and AI hallucinations. Who is responsible for making sure our private data doesn't get used as training data? And what happens if it does? Do they even know what's in the data to begin with?
We tagged in data engineering expert Josh Wills and security veteran Mike Sabbota of Amazon Prime Video to go past the headlines and into what it takes to safely harness the vast oceans of data they've been responsible for, past and present. Foundational questions like “who is responsible for data hygiene?” and “what is data governance?” may not be nearly as sexy as tricking AI into saying it wants to destroy humanity, but they arguably will have a much greater impact on our safety in the long run. Mike, Josh, and Dave go deep into the practical realities of working with data at scale and why the topic is more critical than ever.
For anyone wondering exactly how we arrived at this moment where generative AI dominates the headlines and we can't quite recall why we ever cared about blockchains and NFTs, we kick off the episode with Josh explaining the recent history of data science and how it led to this moment. We quickly (and painlessly) cover the breakthrough attention-based transformer model explained in 2017 and key events that have happened since that point.
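As a deliberately naive illustration of the "data hygiene" question the episode raises, the sketch below redacts obvious identifiers before text is considered for a training corpus and drops records that still look risky. The patterns and record contents are invented; real programs rely on vetted PII-detection tooling and governance review rather than a pair of regexes.

```python
# A deliberately naive, hypothetical illustration of "data hygiene" before text
# is reused for model training; the patterns and sample records are invented.
import re

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE = re.compile(r"\b(?:\+?\d{1,2}[\s.-]?)?(?:\(?\d{3}\)?[\s.-]?)\d{3}[\s.-]?\d{4}\b")

def redact(text: str) -> str:
    """Replace obvious email addresses and phone numbers with placeholders."""
    text = EMAIL.sub("[EMAIL]", text)
    text = PHONE.sub("[PHONE]", text)
    return text

def filter_for_training(records: list[str]) -> list[str]:
    """Redact what we can and drop records that still look risky."""
    cleaned = [redact(r) for r in records]
    return [r for r in cleaned if "ssn" not in r.lower()]

if __name__ == "__main__":
    print(filter_for_training([
        "Ticket 123: contact jane.doe@example.com or 555-867-5309",
        "Customer SSN on file, do not share",
    ]))
```

The point is not the regexes but where the gate sits: hygiene has to happen, with an explicit owner, before data ever reaches a training pipeline.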
In today's episode, Luan Moreno & Mateus Oliveira interview Tobias Mao, currently Co-Founder and CTO at Tobiko Data. SQLMesh is a framework developed in Python to automate everything a scalable data platform needs, using the concept of DataOps. With SQLMesh, you get the following benefits: a focus on the business's data, with DataOps as the core premise, and a focus on scalability without worrying about your Data Warehouse or query engine. In our conversation we talk about: the State of Data, SQLMesh, DataOps, Python and SQL for data engineering, and Tobiko Data. In organizations of every size we see the need to make the use of data more scalable, and SQLMesh is an excellent option for streamlining the DataOps process. Tobias Mao SQLMesh Tobiko Data Luan Moreno = https://www.linkedin.com/in/luanmoreno/
Enabling data engineers to create data pipelines easily while delivering data streams that meet low-latency production requirements is a difficult balancing act. David Yaffe and Johnny Gaettinger join us today to share how they have created that balance at Estuary. Estuary is a data operations platform that synchronizes data across the systems where data lives. The post Unified DataOps for Teams and Enterprise with Estuary.dev appeared first on Software Engineering Daily.
This is the intro to DataOps that you've been waiting for: all about what it is, why it matters, best practices, and how to get started in the field. My guest today is Jennifer Glenski, Director of Product Management at BMC Software.
On this week's Industrial Talk we're onsite at the Digital Manufacturing Summit (https://transformindustry.com/) and talking to John Harrington, Co-Founder and Chief Product Officer at HighByte, about "Improving operational data quality in less time across your enterprise". Get the answers to your DataOps questions, along with John's unique insight on the "how", in this Industrial Talk interview!
JOHN HARRINGTON'S CONTACT INFORMATION:
Personal LinkedIn: https://www.linkedin.com/in/john-harrington-142906a/
Company LinkedIn: https://www.linkedin.com/company/highbyte/
Company Website: https://www.highbyte.com/
PODCAST VIDEO: https://youtu.be/IyC6yUilNtk
PODCAST TRANSCRIPT:
00:04 Welcome to the industrial talk podcast with Scott Mackenzie. Scott is a passionate industry professional dedicated to transferring cutting edge industry focused innovations and trends while highlighting the men and women who keep the world moving. So put on your hard hat, grab your work boots, and let's go.
00:22 Well, welcome to industrial talk.
Thank you very much for joining the number one industrial-related podcast in the universe as it celebrates industry professionals all around the world, because you're bold, you're brave, you dare greatly, you innovate, you solve problems, you're changing my life, you're changing the lives of other people, and you're changing the world. Why not celebrate you? We are also broadcasting from the digital manufacturing symposium right here in Chicago. Great venue, incredible collection of I want to...