Podcasts about MLflow

  • 33 podcasts
  • 49 episodes
  • 44m average duration
  • 1 new episode per month
  • Latest episode: May 7, 2025

Popularity (chart, 2017–2024)



Latest podcast episodes about MLflow

Digital Insurance Podcast
From Mainframe to Cloud: The Path to DORA Compliance for Financial Institutions

May 7, 2025 · 37:44


In this podcast episode I speak with Lukas Grubwieser, Senior Solution Architect, and Leonie Hollstein, Global Account Manager, both at Databricks, about DORA compliance in the financial sector. We dive deep into the challenges and opportunities created by the new regulation.

5 highlights of the episode:

• What is DORA? We clarify what the Digital Operational Resilience Act (DORA) actually is: a regulatory framework addressing IT risk in financial institutions, in force since January 17, 2025. DORA aims to reduce dependence on individual vendors and strengthen the resilience of IT systems.
• Challenges of DORA compliance: We discuss the biggest challenges financial institutions face in implementing DORA, including third-party risk management (e.g. cloud providers), the necessary change-management process, dealing with outdated technology (legacy systems), and the need for a holistic approach covering all IT systems.
• The role of Databricks: I learn how Databricks supports financial institutions in meeting DORA requirements. Databricks provides not only a platform but also acts as an advisor, helping to develop processes and implement the necessary technology. The business model is consumption-based, i.e. tied to actual usage.
• Incident management and real-time monitoring: We highlight the importance of real-time monitoring and logging for detecting and responding to security incidents early. Databricks offers solutions that integrate diverse systems and provide a central overview, including process automation.
• Governance and open source: The importance of data governance and the role of open-source technologies such as Spark and MLflow are emphasized. Databricks takes a hybrid approach that combines the advantages of the cloud with independence from individual vendors.

Links in this episode: Jonas Piela's homepage · Jonas Piela on LinkedIn · Lukas Grubwieser on LinkedIn · Leonie Hollstein on LinkedIn · DORA whitepaper

The Liferay Digital Experience Platform: Customers expect digital services for communication and for reporting and settling claims. Liferay's Digital Experience Platform offers out-of-the-box capabilities such as low-code and top-tier security & reliability. Get in touch now.

Learning Bayesian Statistics
#126 MMM, CLV & Bayesian Marketing Analytics, with Will Dean

Feb 19, 2025 · 54:47 · Transcription available


Proudly sponsored by PyMC Labs, the Bayesian Consultancy. Book a call, or get in touch! Intro to Bayes Course (first 2 lessons free). Advanced Regression Course (first 2 lessons free). Our theme music is « Good Bayesian », by Baba Brinkman (feat. MC Lars and Mega Ran). Check out his awesome work! Visit our Patreon page to unlock exclusive Bayesian swag ;)

Takeaways:
• Marketing analytics is crucial for understanding customer behavior.
• PyMC Marketing offers tools for customer lifetime value analysis.
• Media mix modeling helps allocate marketing spend effectively.
• Customer Lifetime Value (CLV) models are essential for understanding long-term customer behavior.
• Productionizing models is essential for real-world applications.
• Productionizing models involves challenges like model artifact storage and version control.
• MLflow integration enhances model tracking and management (see the sketch after this description).
• The open-source community fosters collaboration and innovation.
• Understanding time series is vital in marketing analytics.
• Continuous learning is key in the evolving field of data science.

Chapters:
00:00 Introduction to Will Dean and His Work
10:48 Diving into PyMC Marketing
17:10 Understanding Media Mix Modeling
25:54 Challenges in Productionizing Models
35:27 Exploring Customer Lifetime Value Models
44:10 Learning and Development in Data Science

Thank you to my Patrons for making this episode possible! Yusuke Saito, Avi Bryant, Ero Carrera, Giuliano Cruz, Tim Gasser, James Wade, Tradd Salvo, William Benton, James Ahloy, Robin Taylor, Chad Scherrer, Zwelithini Tunyiswa, Bertrand Wilden, James Thompson, Stephen Oates, Gian Luca Di Tanna, Jack Wells, Matthew Maldonado, Ian Costley, Ally Salim, Larry Gill, Ian Moran, Paul Oreto, Colin Caprani, Colin Carroll, Nathaniel Burbank, Michael Osthege, Rémi Louf, Clive Edelsten, Henri Wallen, Hugo Botha, Vinh Nguyen, Marcin Elantkowski, Adam C. Smith, Will Kurt, Andrew Moskowitz, Hector Munoz, Marco Gorelli, Simon Kessell, Bradley Rode, Patrick Kelley, Rick Anderson, Casper de Bruin, Philippe Labonde, Michael Hankin, Cameron Smith, Tomáš Frýda, Ryan Wesslen, Andreas Netti, Riley King, Yoshiyuki Hamajima, Sven De Maeyer, Michael DeCrescenzo, Fergal M, Mason Yahr, Naoya Kanai, Steven Rowland, Aubrey Clayton, Jeannine Sue, Omri Har Shemesh, Scott Anthony Robson, Robert Yolken, Or Duek, Pavel Dusek, Paul Cox, Andreas Kröpelin, Raphaël R, Nicolas Rode, Gabriel Stechschulte, Arkady, Kurt TeKolste, Gergely Juhasz,...
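For readers new to the MLflow tracking mentioned in the takeaways, here is a minimal, hypothetical sketch of what logging a marketing-model run looks like. The experiment name, parameters, and metric value are illustrative placeholders, not PyMC Marketing's actual integration.

```python
import mlflow

# Group related runs under one experiment (name is illustrative).
mlflow.set_experiment("mmm-demo")

with mlflow.start_run(run_name="adstock-v1"):
    # Record the knobs that define this model fit...
    mlflow.log_params({"adstock_max_lag": 8, "yearly_seasonality": 2})
    # ...and a quality metric, so runs stay comparable over time.
    mlflow.log_metric("r2_holdout", 0.87)
```

With runs logged this way, the MLflow UI can diff parameters and metrics across model versions, which is the tracking-and-management benefit the episode alludes to.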

Data Gen
#172 - Mirakl: Deploying a Generative AI Strategy

Dec 9, 2024 · 43:16


Anne-Claire Baschet is Chief Data & AI Officer at Mirakl, the French unicorn that offers a turnkey solution for launching a marketplace and that closed a record $555 million funding round in early 2024. We cover:

Unexplored Territory
076 - AI Roles Demystified: A Guide for Infrastructure Admins with Myles Gray

May 26, 2024 · 49:13


In this conversation, Myles Gray discusses the AI workflow and its personas, the responsibilities of data scientists and developers in deploying AI models, the role of infrastructure administrators, and the challenges of deploying models at the edge. He also explains the concept of quantization and the importance of accuracy in models, walks through the pipeline for deploying models, and contrasts unit testing with integration testing.

Takeaways:
• The AI workflow involves multiple personas, including data scientists, developers, and infrastructure administrators.
• Data scientists play a crucial role in developing AI models, while developers are responsible for deploying the models into production.
• Infrastructure administrators need to consider the virtualization layer and ensure efficient and easy consumption of infrastructure components.
• Deploying AI models at the edge requires quantization to reduce model size, plus considerations for form factor, scale, and connectivity.
• The pipeline for deploying models involves steps such as unit testing, scanning for vulnerabilities, building container images, and pushing to a registry.
• Unit testing exercises the functionality of a single module or function within an application; integration testing exercises the interaction between different components or applications (a short testing sketch follows this description).
• MLflow and other tools are used to store and manage ML models.
• Smaller models are emerging as a solution to the resource constraints of large models.
• Collaboration between different personas is important for ensuring security and governance in AI projects.
• Data governance policies are crucial for maintaining data quality and consistency.

Chapters:
00:00 Understanding the AI Workflow and Personas
03:24 The Role of Data Scientists and Developers in Deploying AI Models
08:47 The Responsibilities of Infrastructure Administrators
15:25 Challenges of Deploying Models at the Edge
20:29 The Pipeline for Deploying AI Models
24:45 Unit Testing vs. Integration Testing
28:22 Managing ML Models with MLflow and Other Tools
32:17 The Emergence of Smaller Models
39:58 Collaboration for Security and Governance in AI Projects
46:32 The Importance of Data Governance

Disclaimer: The thoughts and opinions shared in this podcast are our own/guest(s)', and not necessarily those of Broadcom or VMware by Broadcom.
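To make the unit-versus-integration distinction above concrete, here is a hedged sketch using pytest. The `quantize` helper and the endpoint URL are invented for illustration; they are not from the episode.

```python
import pytest
import requests

def quantize(weights: list[float], scale: float = 127.0) -> list[int]:
    """Toy stand-in for quantization: map floats into the int8 range."""
    return [max(-128, min(127, round(w * scale))) for w in weights]

def test_quantize_unit():
    # Unit test: exercises a single function in isolation, no external systems.
    assert quantize([0.0, 1.0, -1.0]) == [0, 127, -127]

@pytest.mark.integration
def test_model_endpoint_integration():
    # Integration test: exercises deployed components working together.
    # The URL is a placeholder for wherever the model container is served.
    resp = requests.post(
        "http://localhost:8080/invocations",
        json={"inputs": [[0.1, 0.2, 0.3]]},
    )
    assert resp.status_code == 200
```

The unit test runs anywhere in milliseconds; the integration test only makes sense against a running deployment, which is why pipelines typically run them at different stages.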

MLOps.community
[Exclusive] Databricks Roundtable // Introducing DBRX: The Future of Language Models

Apr 12, 2024 · 48:35


Join us at our first in-person conference on June 25, all about AI Quality: https://www.aiqualityconference.com/

MLOps Coffee Sessions special episode with Databricks, Introducing DBRX: The Future of Language Models, fueled by our Premium Brand Partner, Databricks.

DBRX is designed to be especially capable across a wide range of tasks and outperforms other open LLMs on standard benchmarks. It also promises to excel at code and math problems, areas where others have struggled. Our panel of experts gets into the technical nuances, potential applications, and implications of DBRX for businesses, developers, and the broader tech community. This session is a great opportunity to hear from insiders about how DBRX's capabilities can benefit you.

// Bios

Denny Lee (co-host): Denny Lee is a long-time Apache Spark™ and MLflow contributor, Delta Lake maintainer, and a Sr. Staff Developer Advocate at Databricks. He is a hands-on distributed systems and data sciences engineer with extensive experience developing internet-scale data platforms and predictive analytics systems. He previously built enterprise DW/BI and big data systems at Microsoft, including Azure Cosmos DB, Project Isotope (HDInsight), and SQL Server.

Davis Blalock: Davis Blalock is a research scientist and the first employee at MosaicML. He previously worked at PocketSonics (acquired 2013) and completed his PhD at MIT, where he was advised by John Guttag. He received his M.S. from MIT and his B.S. from the University of Virginia. He is a Qualcomm Innovation Fellow, NSF Graduate Research Fellow, and Barry M. Goldwater Scholar, and the author of Davis Summarizes Papers, one of the most widely read machine learning newsletters.

Bandish Shah: Bandish Shah is an Engineering Manager at MosaicML/Databricks, where he focuses on making generative AI training and inference efficient, fast, and accessible by bridging the gap between deep learning, large-scale distributed systems, and performance computing. Bandish has over a decade of experience building systems for machine learning and enterprise applications. Prior to MosaicML, Bandish held engineering and development roles at SambaNova Systems, where he helped develop and ship the first RDU systems from the ground up, and at Oracle, where he worked as an ASIC engineer for SPARC-based enterprise servers.

Abhi Venigalla: Abhi is an NLP architect helping organizations build their own LLMs using Databricks. He joined as part of the MosaicML team and previously worked as a researcher at Cerebras Systems.

Ajay Saini: Ajay is an engineering manager at Databricks leading the GenAI training platform team. He was one of the early engineers at MosaicML (acquired by Databricks), where he first helped build and launch Composer (an open-source deep learning training framework) and afterwards led the development of the MosaicML training platform, which enabled customers to train models (such as LLMs) from scratch on their own datasets at scale. Prior to MosaicML, Ajay was co-founder and CEO of Overfit, an online personal training startup (YC S20). Before that, Ajay worked on ML solutions for ransomware detection and data governance at Rubrik. Ajay has both a B.S. and an MEng in computer science with a concentration in AI from MIT.
// MLOps Jobs board https://mlops.pallet.xyz/jobs // MLOps Swag/Merch https://mlops-community.myshopify.com/ // Related Links Website: https://www.databricks.com/ Databricks DBRX: https://www.databricks.com/blog/introducing-dbrx-new-state-art-open-llm --------------- ✌️Connect With Us ✌️ ------------- Join our slack community: https://go.mlops.community/slack Follow us on Twitter: @mlopscommunity Sign up for the next meetup: https://go.mlops.community/register Catch all episodes, blogs, newsletters, and more: https://mlops.community/

Startup Insider
ZenML Raises Millions for Its Open-Source Machine Learning Framework (Machine Learning • Crane • Point Nine)

Oct 24, 2023 · 22:20


In today's afternoon episode we welcome Adam Probst, CEO of ZenML, and talk with him about the successful extension of the company's seed round to $6.4 million. ZenML develops an extensible open-source framework for building production-ready machine learning pipelines. It abstracts away the complexity of ML infrastructure for machine learning engineers without locking them into a single vendor, offering a unified experience across all major platforms such as AWS, GCP, and Azure. This lets companies manage cross-cloud workloads effectively. In addition, ZenML's existing integrations with more than 50 ML tools, including HuggingFace, Weights & Biases, and MLflow, add adaptability and convenience, providing a high degree of strategic flexibility through cloud-agnostic integrations. One example of the solution's value is the integration of orchestration tools with experiment-tracking tools: instead of working with fragmented pipelines, ZenML provides a central framework that connects these tools in a coherent, standardized way. Users can switch from local development to scaling in the cloud with one click (a minimal pipeline sketch follows below). Since early 2023 the startup has also offered a fully managed cloud service to a select group of customers. This service builds on the open-source core and extends its capabilities with features such as single sign-on, role-based access control, and delivery integrations. ZenML was founded in Munich in 2021 by Adam Probst and Hamza Tahir. The open-source framework has now announced a $3.7 million extension of its seed round, bringing it to $6.4 million. The extension was led by Point Nine and supported by existing investor Crane. Business angels also participated, including D. Sculley, CEO of Kaggle, Harold Giménez, SVP R&D at HashiCorp, and Luke de Oliveira, former director of machine learning at Twilio. The fresh capital is intended to support the launch of ZenML Cloud.
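For readers unfamiliar with ZenML, here is a minimal sketch of its pipeline style, assuming a recent release with the `@step`/`@pipeline` decorator API; the step bodies are invented for illustration.

```python
from zenml import pipeline, step

@step
def load_data() -> list[float]:
    # Placeholder loading step; in practice this might read from S3, GCS, or Azure.
    return [1.0, 2.0, 3.0]

@step
def train_model(data: list[float]) -> float:
    # Placeholder "training" step; ZenML versions the inputs and outputs either way.
    return sum(data) / len(data)

@pipeline
def training_pipeline():
    data = load_data()
    train_model(data)

if __name__ == "__main__":
    # The same pipeline definition runs locally or on a cloud orchestrator,
    # depending on which ZenML stack is active - the local-to-cloud switch
    # described above.
    training_pipeline()
```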

AI For All Podcast
AI Governance, Regulation, and Bias | Verta's Manasi Vartak

Jul 13, 2023 · 29:30


On this episode of the AI For All Podcast, Manasi Vartak, CEO of Verta, joins Ryan Chacon to discuss AI governance, AI regulation, and AI bias. They talk about generative AI and its risks, reducing AI bias, responsible AI, and how regulation will impact AI adoption.

Manasi Vartak is the founder and CEO of Verta, the Menlo Park, California-based provider of the Verta Operational AI platform and Verta Model Catalog. Manasi invented experiment management and tracking at MIT CSAIL when she created ModelDB, the first open-source model management system deployed at Fortune 500 companies and the progenitor of MLflow. After earning her PhD from MIT, Vartak went on to data science positions at Twitter, where she worked on deep learning for content recommendation as part of the feed-ranking team, and Google, where she worked on dynamic ad targeting, before founding Verta. Emerging from the AI innovations at MIT, Twitter, NVIDIA, Google, and Facebook, Verta, based in Silicon Valley, specializes in Model Catalog, Model Lifecycle Management, and AI portfolio management. Since its inception in 2018, it has served Fortune 500 companies and digital pioneers, and was recognized as a 'Gartner Cool Vendor AI Core Technologies' in 2022.

More about Verta: https://www.verta.ai
Connect with Manasi: https://www.linkedin.com/in/manasi-vartak/

Key Questions and Topics from This Episode:
(00:00) Intro to the AI For All Podcast
(01:13) Intro to Manasi Vartak and Verta
(01:44) What is generative AI?
(02:52) Current state and future of generative AI
(04:05) Generative AI risks
(05:32) AI bias
(06:54) Reducing AI bias
(10:58) What makes generative AI possible?
(12:34) AI governance and responsible AI
(15:30) Who is responsible for making AI responsible?
(18:41) AI regulation
(23:52) How will regulation impact AI adoption?
(27:51) Will we struggle to govern AI in the future?
(28:37) Learn more about Verta

Subscribe on YouTube: https://bit.ly/43dYQV9
Join Our Newsletter: https://ai-forall.com
Follow Us on Twitter: https://twitter.com/_aiforall

GOTO - Today, Tomorrow and the Future
Scaling Machine Learning with Spark • Adi Polak & Holden Karau

Jun 30, 2023 · 40:06 · Transcription available


This interview was recorded for the GOTO Book Club: gotopia.tech/bookclub. Read the full transcription of the interview there.

Adi Polak - VP of Developer Experience at Treeverse & contributor to lakeFS OSS
Holden Karau - co-author of "Kubeflow for Machine Learning" & many more books & open source engineer at Netflix

DESCRIPTION
Learn how to build end-to-end scalable machine learning solutions with Apache Spark. With this practical guide, author Adi Polak introduces data and ML practitioners to creative solutions that supersede today's traditional methods. You'll learn a more holistic approach that takes you beyond specific requirements and organizational goals, allowing data and ML practitioners to collaborate and understand each other better.

Scaling Machine Learning with Spark examines several technologies for building end-to-end distributed ML workflows based on the Apache Spark ecosystem with Spark MLlib, MLflow, TensorFlow, and PyTorch. If you're a data scientist who works with machine learning, this book shows you when and why to use each technology.

You will:
• Explore machine learning, including distributed computing concepts and terminology
• Manage the ML lifecycle with MLflow
• Ingest data and perform basic preprocessing with Spark
• Explore feature engineering, and use Spark to extract features
• Train a model with MLlib and build a pipeline to reproduce it (a minimal sketch follows after this description)
• Build a data system to combine the power of Spark with deep learning
• Get a step-by-step example of working with distributed TensorFlow
• Use PyTorch to scale machine learning and its internal architecture

Book description: © O'Reilly. The interview is based on the book "Scaling Machine Learning with Spark".

RECOMMENDED BOOKS
• Adi Polak • Machine Learning with Apache Spark
• Holden Karau, Trevor Grant, Boris Lublinsky, Richard Liu & Ilan Filonenko • Kubeflow for Machine Learning
• Holden Karau • Distributed Computing 4 Kids
• Holden Karau • Scaling Python with Dask
• Holden Karau & Boris Lublinsky • Scaling Python with Ray
• Holden Karau & Rachel Warren • High Performance Spark
• Holden Karau, Konwinski, Wendell & Zaharia • Learning Spark
• Holden Karau & Krishna Sankar • Fast Data Processing with Spark, 2nd Edition
• Holden Karau • Fast Data Processing with Spark, 1st Edition

Twitter · LinkedIn · Facebook

Looking for a unique learning experience? Attend the next GOTO conference near you! Get your ticket: gotopia.tech

SUBSCRIBE TO OUR YOUTUBE CHANNEL - new videos posted almost daily
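As a taste of two of the bullets above (training with MLlib, managing the lifecycle with MLflow), here is a minimal sketch; the toy data and column names are mine, not the book's.

```python
import mlflow
from pyspark.sql import SparkSession
from pyspark.ml import Pipeline
from pyspark.ml.classification import LogisticRegression
from pyspark.ml.feature import VectorAssembler

spark = SparkSession.builder.appName("mllib-mlflow-sketch").getOrCreate()

# Toy training data: two features and a binary label.
df = spark.createDataFrame(
    [(0.0, 1.1, 0), (1.5, 0.3, 1), (0.2, 0.9, 0), (2.0, 0.1, 1)],
    ["f1", "f2", "label"],
)

# An MLlib Pipeline chains feature assembly and the estimator into one
# reproducible unit, which is what makes retraining repeatable.
assembler = VectorAssembler(inputCols=["f1", "f2"], outputCol="features")
lr = LogisticRegression(featuresCol="features", labelCol="label")
pipeline = Pipeline(stages=[assembler, lr])

# Autologging records params, metrics, and the fitted pipeline to MLflow.
mlflow.pyspark.ml.autolog()
with mlflow.start_run():
    model = pipeline.fit(df)
```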

MLOps.community
The Birth and Growth of Spark: An Open Source Success Story // Matei Zaharia // MLOps Podcast #155

Apr 25, 2023 · 58:12


MLOps Coffee Sessions #155 with Matei Zaharia, The Birth and Growth of Spark: An Open Source Success Story, co-hosted by Vishnu Rachakonda.

// Abstract
We dive deep into the creation of Spark with the creator himself, Matei Zaharia, Chief Technologist at Databricks. This episode also explores the development of Databricks' other open-source home run, MLflow, and the concept of "lakehouse ML". As a special treat, Matei talked to us about the details of the DSP (Demonstrate-Search-Predict) project, which aims to enable building applications by combining LLMs and other text-returning systems.

// About the guest
Matei has the unique advantage of being able to see different perspectives, having worked in both academia and industry. He listens carefully to people's challenges and excitement about ML and uses this to come up with new ideas. As a member of Databricks, Matei also has the advantage of applying ML to Databricks' own internal practices. He is constantly asking the question "What's a better way to do this?"

// Bio
Matei Zaharia is an Associate Professor of Computer Science at Stanford and Chief Technologist at Databricks. He started the Apache Spark project during his Ph.D. at UC Berkeley, and co-developed other widely used open-source projects, including MLflow and Delta Lake, at Databricks. At Stanford, he works on distributed systems, NLP, and information retrieval, building programming models that can combine language models and external services to perform complex tasks. Matei's research work was recognized through the 2014 ACM Doctoral Dissertation Award for the best Ph.D. dissertation in computer science, an NSF CAREER Award, and the US Presidential Early Career Award for Scientists and Engineers (PECASE).

// MLOps Jobs board
https://mlops.pallet.xyz/jobs

// MLOps Swag/Merch
https://mlops-community.myshopify.com/

// Related Links
https://cs.stanford.edu/~matei/
https://spark.apache.org/

--------------- ✌️Connect With Us ✌️ -------------
Join our slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Catch all episodes, blogs, newsletters, and more: https://mlops.community/
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with Vishnu on LinkedIn: https://www.linkedin.com/in/vrachakonda/
Connect with Matei on LinkedIn: https://www.linkedin.com/in/mateizaharia/

Timestamps:
[00:00] Matei's preferred coffee
[01:45] Takeaways
[05:50] Please subscribe to our newsletters, join our Slack, and subscribe to our podcast channels!
[06:52] Getting to know Matei as a person
[09:10] Spark
[14:18] Open and freewheeling cross-pollination
[16:35] Actual formation of Spark
[20:05] Spark and MLflow similarities and differences
[24:24] Concepts in MLflow
[27:34] DJ Khaled of the ML world
[30:58] Data Lakehouse
[33:35] Stanford's unique culture of the Computer Science Department
[36:06] Starting a company
[39:30] Unique advice to grad students
[41:51] Open source project
[44:35] LLMs in the New Revolution
[47:57] Type of company to start with
[49:56] Emergence of Corporate Research Labs
[53:50] LLMs size context
[54:44] Companies to respect
[57:28] Wrap up

Unsupervised Learning
Ep 2: Databricks CTO Matei Zaharia on scaling and orchestrating large language models

Mar 7, 2023 · 46:24


Patrick and Jacob sit down with Matei Zaharia, Co-Founder and CTO at Databricks and Professor at Stanford. They discuss how companies are training and serving models in production with Databricks, where LLMs fall short for search and how to improve them, state-of-the-art AI research at Stanford, and how the size and cost of models are likely to change with technological advances in the coming years.

(0:00) - Introduction
(2:04) - Founding story of Databricks
(6:03) - PhD classmates using an early version of Spark for the Netflix competition
(6:55) - Building applications with MLflow
(9:55) - LLMs and ChatGPT
(12:05) - Working with and fine-tuning foundation models
(13:00) - Prompt engineering: here to stay or temporary?
(15:12) - Matei's research at Stanford. The Demonstrate-Search-Predict framework (DSP)
(17:42) - How LLMs will be combined with classic information retrieval systems for world-class search
(19:38) - LLMs writing programs to orchestrate LLMs
(20:36) - Using LLMs in Databricks' cloud product
(24:21) - Scaling LLM training and serving
(27:29) - How much will the cost to train LLMs go down in coming years?
(29:22) - How many parameters is too many?
(31:14) - Open source vs closed source?
(35:19) - Stanford AI research - Snorkel, ColBERT, and more
(38:58) - Matei getting a $50 Amazon gift card for weeks of work
(43:23) - Quick-fire round

With your co-hosts:
@jasoncwarner - Former CTO GitHub, VP Eng Heroku & Canonical
@ericabrescia - Former COO GitHub, Founder Bitnami (acq'd by VMware)
@patrickachase - Partner at Redpoint, Former ML Engineer LinkedIn
@jacobeffron - Partner at Redpoint, Former PM Flatiron Health

ACM ByteCast
Matei Zaharia - Episode 32

Dec 13, 2022 · 54:27


In this episode of ACM ByteCast, Bruke Kifle hosts Matei Zaharia, computer scientist, educator, and creator of Apache Spark. Matei is the Chief Technologist and Co-Founder of Databricks and an Assistant Professor of Computer Science at Stanford. He started the Apache Spark project during his PhD at UC Berkeley in 2009 and has worked broadly on other widely used data and machine learning software, including MLflow, Delta Lake, and Apache Mesos. Matei's research was recognized through the 2014 ACM Doctoral Dissertation Award, an NSF CAREER Award, and the US Presidential Early Career Award for Scientists and Engineers. Matei, who was born in Romania and grew up mostly in Canada, describes how he developed Spark, a framework for writing programs that run on a large cluster of nodes and process data in parallel, and how this led him to co-found Databricks around this technology. Matei and Bruke also discuss the new paradigm shift from traditional data warehouses to data lakes, as well as his work on MLflow, an open-source platform for managing the end-to-end machine learning lifecycle. He highlights some recent announcements in the field of AI and machine learning and shares observations from teaching and conducting research at Stanford, including an important current gap in computing education.

Microsoft Partner Podden
Databricks, Spark, and Azure

Dec 5, 2022 · 33:21


Hearing someone who works with data & AI talk about Databricks is nothing unusual, but what is it, and why should you care? This week's guests, Herman von Greiff and Jonas Dahlberg, give us the answer. We learn what Databricks is and talk about Lakehouse, Spark, MLflow, and Delta Lake. And what does Herman mean when he coins the phrase "a data swamp is better than a data desert"? Microsoft Intelligent Data Platform | Microsoft. Hosted on Acast. See acast.com/privacy for more information.

Adventures in Machine Learning
MLflow 2.0 And How Large-Scale Projects Are Managed In The Open Source - ML 096

Dec 1, 2022 · 49:05


Corey Zumar talks about the new release of MLflow, 2.0, and the major new features included in the release (a sketch of one 2.x API follows below). Bilal and Corey then discuss managing feature implementation priorities and selling large-scale project ideas to internal customers, end users, executives, and the dev team. The discussion also covers generalizing feature requests into implementations that work for the masses, and how to effectively do prototype releases for incremental, agile development of complex projects.

Sponsors:
• Chuck's Resume Template
• Developer Book Club starting with Clean Architecture by Robert C. Martin
• Become a Top 1% Dev with a Top End Devs Membership

Links:
• GitHub: Corey-Zumar
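The episode discusses the release at a feature level; as one concrete, hedged example of the 2.x API surface, here is a sketch using `mlflow.evaluate`, which the MLflow 2.0 announcement highlighted. The toy model and data are mine, not from the episode.

```python
import mlflow
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Train a small classifier so there is something to evaluate.
X, y = make_classification(n_samples=500, n_features=8, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X, y)

eval_data = pd.DataFrame(X, columns=[f"f{i}" for i in range(8)])
eval_data["label"] = y

with mlflow.start_run():
    info = mlflow.sklearn.log_model(model, "model")
    # mlflow.evaluate computes a standard battery of classifier metrics
    # (accuracy, ROC AUC, ...) and logs them to the active run.
    result = mlflow.evaluate(
        info.model_uri, eval_data, targets="label", model_type="classifier"
    )
    print(result.metrics)
```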

The Tech Blog Writer Podcast
2155: Databricks - The Story Behind the Lakehouse Company

Oct 27, 2022 · 39:45


Many are citing open source as the future. The UK Government's National Data Strategy even talks about the importance of opening public sector datasets to form the backbone of innovation, efficiency, and growth. This is a trend that Databricks is betting on in a big way. Databricks is the lakehouse company. More than 7,000 organizations worldwide — including Comcast, Condé Nast, H&M, and over 40% of the Fortune 500 — rely on the Databricks Lakehouse Platform to unify their data, analytics and AI. The company is headquartered in San Francisco, with offices around the globe. Founded by the original creators of Apache Spark™, Delta Lake and MLflow, Databricks is on a mission to help data teams solve the world's toughest problems. I have invited Dael Williamson, EMEA CTO, Field Advisory & Engineering at Databricks, to join me on Tech Talks Daily to share the story behind the company and how they are helping data teams solve the world's most challenging problems.

52 Weeks of Cloud
Enterprise MLOps Interview - Simon Stiebellehner

Sep 23, 2022 · 56:16


If you enjoyed this video, here are additional resources to look at:

• Coursera + Duke Specialization: Building Cloud Computing Solutions at Scale Specialization: https://www.coursera.org/specializations/building-cloud-computing-solutions-at-scale
• Python, Bash, and SQL Essentials for Data Engineering Specialization: https://www.coursera.org/specializations/python-bash-sql-data-engineering-duke
• AWS Certified Solutions Architect - Professional (SAP-C01) Cert Prep: 1 Design for Organizational Complexity: https://www.linkedin.com/learning/aws-certified-solutions-architect-professional-sap-c01-cert-prep-1-design-for-organizational-complexity/design-for-organizational-complexity?autoplay=true
• O'Reilly Book: Practical MLOps: https://www.amazon.com/Practical-MLOps-Operationalizing-Machine-Learning/dp/1098103017
• O'Reilly Book: Python for DevOps: https://www.amazon.com/gp/product/B082P97LDW/
• O'Reilly Book: Developing on AWS with C#: A Comprehensive Guide on Using C# to Build Solutions on the AWS Platform: https://www.amazon.com/Developing-AWS-Comprehensive-Solutions-Platform/dp/1492095877
• Pragmatic AI: An Introduction to Cloud-based Machine Learning: https://www.amazon.com/gp/product/B07FB8F8QP/
• Pragmatic AI Labs Book: Python Command-Line Tools: https://www.amazon.com/gp/product/B0855FSFYZ
• Pragmatic AI Labs Book: Cloud Computing for Data Analysis: https://www.amazon.com/gp/product/B0992BN7W8
• Pragmatic AI Book: Minimal Python: https://www.amazon.com/gp/product/B0855NSRR7
• Pragmatic AI Book: Testing in Python: https://www.amazon.com/gp/product/B0855NSRR7
• Subscribe to the Pragmatic AI Labs YouTube Channel: https://www.youtube.com/channel/UCNDfiL0D1LUeKWAkRE1xO5Q
• Subscribe to the 52 Weeks of AWS Podcast: https://52-weeks-of-cloud.simplecast.com
• View content on noahgift.com: https://noahgift.com/
• View content on the Pragmatic AI Labs Website: https://paiml.com/

Transcript:

Noah: Hey, three, two, one, there we go, we're live. All right, so welcome, Simon, to Enterprise MLOps Interviews. The goal of these interviews is to get people exposed to real professionals who are doing work in MLOps. It's such a cutting-edge field that I think a lot of people are very curious about it. What is it? How do you do it? I'm very honored to have Simon here. Do you want to introduce yourself and maybe talk a little bit about your background?

Simon: Sure. Yeah, thanks again for inviting me. My name is Simon Stiebellehner. I am originally from Austria, but currently working in Amsterdam, in the Netherlands, at Transaction Monitoring Netherlands (TMNL). Here I am the lead MLOps engineer. What are we doing at TMNL? We are a data processing company, owned by the five large banks of the Netherlands, and our purpose is what the name says: we run anti-money-laundering models on pseudonymized business transactions we get from these five banks, to detect unusual patterns on that transaction graph that might indicate money laundering. That's, in a nutshell, what we do. As you can imagine, we are really focused on building models, and obviously MLOps is a big component there, because that is really the core of what you do, and you want to do it efficiently and effectively as well. In my role as lead MLOps engineer, I'm on the one hand the lead engineer of the actual MLOps platform team. This is a centralized team that builds out lots of the infrastructure that's needed to do modeling effectively and efficiently. But I am also the craft lead for the machine learning engineering craft: these are, in our case, the machine learning engineers, the people working within the model development teams and cross-functional teams actually building these models. That's what I'm currently doing. During the evenings and weekends I'm also a lecturer at the University of Applied Sciences, Vienna, where I teach data mining and data warehousing to master's students. Before TMNL, I was at bol.com, which is the largest e-commerce retailer in the Netherlands. I always tend to say it's the Amazon of the Netherlands, or of Benelux, actually; it is still the biggest e-commerce retailer there, even ahead of Amazon. There I was an expert machine learning engineer, doing somewhat comparable work, a bit more focused on the actual modeling part; now it's really more on the infrastructure end. And before that, I spent some time in consulting, leading a data science team. That's where I come from: I really come originally from the data science end. There I started drifting towards MLOps, because we started building out a deployment and serving platform that would, as a consulting company, make it easier for us to deploy models for our clients, to serve these models, and to also monitor them. And that made me drift further and further down the engineering lane, all the way to MLOps.

Noah: Great, that's a great background. I'm curious about the data science to MLOps journey; I think that would be a great discussion to dig into a little bit. My background is originally more on the software engineering side. When I was in the Bay Area, I worked as an individual contributor, then ran companies at one point, and ran multiple teams. Then, as the data science field exploded, I hired multiple data science teams and worked with them. But what was interesting is that I found the original approach of data science, from my perspective, was lacking in that there weren't really deliverables. When you look at a software engineering team, it's very clear there are deliverables: you have a mobile app and it has to get better each week, right? Whereas, what are you doing? So I would love to hear your story about how you went from doing more pure data science to, it sounds like, MLOps.

Simon: Yeah, actually, back then in consulting - at least back then in Austria, data science and everything around it was still in its infancy, around 2016; it was still really new to many organizations, which might be some years behind the US - what we very often struggled with was this: on the modeling end, problems could be solved, but easy deployment, and keeping these models in production at the client side, was always more of the challenge. So naturally I started thinking and focusing more on the actual bigger problem that I saw, which was not so much building the models but really: how can we streamline things? How can we keep things operating? How can we make the move easier from a prototype, from a PoC, to a productionized model? And how can we keep it there and maintain it there? I saw that this problem was coming up, and it really fascinated me, so I jumped on that exciting problem. That's how it went for me. Back then we also recognized it as a potential product in our case, so we started building out that deployment, serving, and monitoring platform. And that's when I fell into that rabbit hole, and I never wanted to get out of it again.

Noah: So the system that you built initially, what was your stack? What were some of the things you were using?

Simon: Yeah, so when we talk about the stack, the full backend was written in Java. From a user perspective, our goal was to build a drag-and-drop platform for models. Basically, the contract was: you package your model as an MLflow model, and then you drag and drop it into a web UI. It's going to be wrapped in containers, it's going to be deployed, and there will be a monitoring layer in front of it. Based on whatever dataset you trained it on, it would automatically calculate different metrics, different distributional metrics around the variables you are using, so that eventually, for every incoming request, you would have a nice dashboard where you could monitor all that stuff. So stack-wise it was MLflow, specifically MLflow Models, a lot; then Java in the backend, and Python, with a big PySpark component as well. It's been quite a while, but there was also quite some part written in Scala, because one component of this platform was a bit of an AutoML approach - that died over time - and that was based on PySpark and vanilla Spark written in Scala, so we could facilitate the AutoML part. Later on we added the easy deployment and serving part. So it was a lot of custom-built stuff. Back then there wasn't that much MLOps tooling out there yet, so you needed to build a lot of it custom. It was largely custom-built.

Noah: Yeah, the MLflow concept is an interesting one, because they provide this package structure so that at least you have some idea of what is going to be sent into the model, and there's a format for the model. That part of MLflow seems to be a pretty good idea: you're creating a standard. In the case of scikit-learn or something, you don't necessarily want to just throw a pickled model somewhere and say, okay, let's go.

Simon: Yeah, that was also our thinking back then. We thought a lot about what could become the standard for how you package models. Back then MLflow was one of the few tools that already existed, and of course there was Databricks behind it. So we made a bet on it and said, all right, let's follow that packaging standard and make it the contract for how you, as a data scientist, would need to package your model up and submit it to the platform.
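Since the packaging contract Simon describes is the MLflow model format, here is a minimal sketch of wrapping custom logic as an MLflow pyfunc model. The class, threshold, and paths are illustrative, not the consultancy's actual code.

```python
import mlflow.pyfunc

class ThresholdModel(mlflow.pyfunc.PythonModel):
    """Illustrative model: flags rows whose score exceeds a threshold."""

    def __init__(self, threshold: float = 0.5):
        self.threshold = threshold

    def predict(self, context, model_input):
        # By pyfunc convention, model_input arrives as a pandas DataFrame.
        return (model_input["score"] > self.threshold).astype(int)

# Saving in the MLflow Models format is what makes the artifact portable:
# any serving layer that speaks pyfunc can load it, wrap it in a container,
# and put a monitoring layer in front, as described above.
mlflow.pyfunc.save_model(path="threshold_model", python_model=ThresholdModel(0.7))

loaded = mlflow.pyfunc.load_model("threshold_model")
```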
Noah: Yeah, it's interesting, because this reminds me of one of the issues that's happening right now with cloud computing, where AWS has dominated for a long time - they have 40% market share, I think, globally - and Azure is now gaining, with some pretty good traction, and GCP has been down for a bit, maybe in the 10% range or something like that. But what's interesting is that none of the cloud providers has necessarily been leading the way on things like packaging models, right? They have their own proprietary systems, which have been developed and continue to be developed: Vertex AI in the case of Google, SageMaker in the case of Amazon. Take SageMaker, for example: there isn't really an industry-wide standard of model packaging that SageMaker uses; they have their own proprietary stuff that builds in, and Vertex AI has its own proprietary stuff. So I think it is interesting to see what's going to happen, because your original hypothesis was: let's pick something that looks like it has some traction and isn't necessarily tied directly to a cloud provider, because Databricks can work on anything. It seems like that in particular is one of the more sticky problems right now with MLOps: who's the leader? Who's developing the right kind of standard for tooling? And maybe that leads into you talking a little bit about what you're doing currently. Do you have any thoughts about the current tooling, what you're doing at your current company, and what's going on with that?

Simon: Absolutely. At my current organization, Transaction Monitoring Netherlands, we are fully on AWS - really almost cloud-native AWS. That also means everything we do on the modeling side revolves around SageMaker. For us, specifically as the MLOps team, we are building the platform around SageMaker capabilities. On that end, at least company-internal, we have a contract for how you must deploy models. There is only one way, what we call the golden path: the streamlined, highly automated path that is supported by the platform. This is the only way you can actually deploy models. In our case, that is a SageMaker pipeline object. In our company we're doing large-scale batch processing; we're not doing anything real-time at present, we are doing post-transaction monitoring. So that means you need to submit essentially DAGs, right? This is what we use for training, and this is what we also deploy eventually. And this is our internal contract: you need to provision a SageMaker pipeline in your model repository, you've got to have one place, and there must be a function with a specific name, and that function must return a SageMaker pipeline object. That is our internal contract.
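A hedged sketch of what such a golden-path contract can look like with the SageMaker Python SDK. The function name `get_pipeline`, the step, and all parameters are placeholders; TMNL's actual naming and pipeline contents are not public.

```python
# pipeline_def.py - illustrative contract: the platform imports this module
# and calls a well-known function that must return a SageMaker Pipeline.
from sagemaker.processing import ScriptProcessor
from sagemaker.workflow.pipeline import Pipeline
from sagemaker.workflow.steps import ProcessingStep

def get_pipeline(role: str, image_uri: str) -> Pipeline:
    processor = ScriptProcessor(
        image_uri=image_uri,
        command=["python3"],
        role=role,
        instance_type="ml.m5.xlarge",
        instance_count=1,
    )
    score_step = ProcessingStep(
        name="BatchScore",
        processor=processor,
        code="score.py",  # placeholder batch-scoring script
    )
    # The platform only ever deploys what this function returns.
    return Pipeline(name="example-batch-pipeline", steps=[score_step])
```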
develop models?[13:52.640 --> 13:55.440] In our case, we don't really have use cases[13:55.440 --> 13:58.640] for stuff like Canvas because our users[13:58.640 --> 14:02.680] are fairly mature teams that know how to do their,[14:02.680 --> 14:04.320] on the one hand, the data science stuff, of course,[14:04.320 --> 14:06.400] but also the engineering stuff.[14:06.400 --> 14:08.160] So in our case, things like Canvas[14:08.160 --> 14:10.320] do not really play so much role[14:10.320 --> 14:12.960] because obviously due to the high abstraction layer[14:12.960 --> 14:15.640] of more like graphical user interfaces,[14:15.640 --> 14:17.360] drag and drop tooling,[14:17.360 --> 14:20.360] you are also limited in what you can do,[14:20.360 --> 14:22.480] or what you can do easily.[14:22.480 --> 14:26.320] So in our case, really, it is the strength of the flexibility[14:26.320 --> 14:28.320] that the SageMaker SDK gives you.[14:28.320 --> 14:33.040] And in general, the SDK around most AWS services.[14:34.080 --> 14:36.760] But also it comes with challenges, of course.[14:37.720 --> 14:38.960] You give a lot of freedom,[14:38.960 --> 14:43.400] but also you're creating a certain ask,[14:43.400 --> 14:47.320] certain requirements for your model development teams,[14:47.320 --> 14:49.600] which is also why we've also been working[14:49.600 --> 14:52.600] about abstracting further away from the SDK.[14:52.600 --> 14:54.600] So our objective is actually[14:54.600 --> 14:58.760] that you should not be forced to interact with the raw SDK[14:58.760 --> 15:00.600] when you use SageMaker anymore,[15:00.600 --> 15:03.520] but you have a thin layer of abstraction[15:03.520 --> 15:05.480] on top of what you are doing.[15:05.480 --> 15:07.480] That's actually something we are moving towards[15:07.480 --> 15:09.320] more and more as well.[15:09.320 --> 15:11.120] Because yeah, it gives you the flexibility,[15:11.120 --> 15:12.960] but also flexibility comes at a cost,[15:12.960 --> 15:15.080] comes often at the cost of speeds,[15:15.080 --> 15:18.560] specifically when it comes to the 90% default stuff[15:18.560 --> 15:20.720] that you want to do, yeah.[15:20.720 --> 15:24.160] And one of the things that I have as a complaint[15:24.160 --> 15:29.160] against SageMaker is that it only uses virtual machines,[15:30.000 --> 15:35.000] and it does seem like a strange strategy in some sense.[15:35.000 --> 15:40.000] Like for example, I guess if you're doing batch only,[15:40.000 --> 15:42.000] it doesn't matter as much,[15:42.000 --> 15:45.000] which I think is a good strategy actually[15:45.000 --> 15:50.000] to get your batch based predictions very, very strong.[15:50.000 --> 15:53.000] And in that case, maybe the virtual machines[15:53.000 --> 15:56.000] make a little bit less of a complaint.[15:56.000 --> 16:00.000] But in the case of the endpoints with SageMaker,[16:00.000 --> 16:02.000] the fact that you have to spend up[16:02.000 --> 16:04.000] these really expensive virtual machines[16:04.000 --> 16:08.000] and let them run 24 seven to do online prediction,[16:08.000 --> 16:11.000] is that something that your organization evaluated[16:11.000 --> 16:13.000] and decided not to use?[16:13.000 --> 16:15.000] Or like, what are your thoughts behind that?[16:15.000 --> 16:19.000] Yeah, in our case, doing real time[16:19.000 --> 16:22.000] or near real time inference is currently not really relevant[16:22.000 --> 16:25.000] for the simple reason that when you think a bit more[16:25.000 --> 16:28.000] about the money 
laundering or anti money laundering space,[16:28.000 --> 16:31.000] typically when, right,[16:31.000 --> 16:34.000] all every individual bank must do anti money laundering[16:34.000 --> 16:37.000] and they have armies of people doing that.[16:37.000 --> 16:39.000] But on the other hand,[16:39.000 --> 16:43.000] the time it actually takes from one of their systems,[16:43.000 --> 16:46.000] one of their AML systems actually detecting something[16:46.000 --> 16:49.000] that's unusual that then goes into a review process[16:49.000 --> 16:54.000] until it eventually hits the governmental institution[16:54.000 --> 16:56.000] that then takes care of the cases that have been[16:56.000 --> 16:58.000] at least twice validated that they are indeed,[16:58.000 --> 17:01.000] they look very unusual.[17:01.000 --> 17:04.000] So this takes a while, this can take quite some time,[17:04.000 --> 17:06.000] which is also why it doesn't really matter[17:06.000 --> 17:09.000] whether you ship your prediction within a second[17:09.000 --> 17:13.000] or whether it takes you a week or two weeks.[17:13.000 --> 17:15.000] It doesn't really matter, hence for us,[17:15.000 --> 17:19.000] that problem so far thinking about real time inference[17:19.000 --> 17:21.000] has not been there.[17:21.000 --> 17:25.000] But yeah, indeed, for other use cases,[17:25.000 --> 17:27.000] for also private projects,[17:27.000 --> 17:29.000] we've also been considering SageMaker Endpoints[17:29.000 --> 17:31.000] for a while, but exactly what you said,[17:31.000 --> 17:33.000] the fact that you need to have a very beefy machine[17:33.000 --> 17:35.000] running all the time,[17:35.000 --> 17:39.000] specifically when you have heavy GPU loads, right,[17:39.000 --> 17:43.000] and you're actually paying for that machine running 2047,[17:43.000 --> 17:46.000] although you do have quite fluctuating load.[17:46.000 --> 17:49.000] Yeah, then that definitely becomes quite a consideration[17:49.000 --> 17:51.000] of what you go for.[17:51.000 --> 17:58.000] Yeah, and I actually have been talking to AWS about that,[17:58.000 --> 18:02.000] because one of the issues that I have is that[18:02.000 --> 18:07.000] the AWS platform really pushes serverless,[18:07.000 --> 18:10.000] and then my question for AWS is,[18:10.000 --> 18:13.000] so why aren't you using it?[18:13.000 --> 18:16.000] I mean, if you're pushing serverless for everything,[18:16.000 --> 18:19.000] why is SageMaker nothing serverless?[18:19.000 --> 18:21.000] And so maybe they're going to do that, I don't know.[18:21.000 --> 18:23.000] I don't have any inside information,[18:23.000 --> 18:29.000] but it is interesting to hear you had some similar concerns.[18:29.000 --> 18:32.000] I know that there's two questions here.[18:32.000 --> 18:37.000] One is someone asked about what do you do for data versioning,[18:37.000 --> 18:41.000] and a second one is how do you do event based MLOps?[18:41.000 --> 18:43.000] So maybe kind of following up.[18:43.000 --> 18:46.000] Yeah, what do we do for data versioning?[18:46.000 --> 18:51.000] On the one hand, we're running a data lakehouse,[18:51.000 --> 18:54.000] where after data we get from the financial institutions,[18:54.000 --> 18:57.000] from the banks that runs through massive data pipeline,[18:57.000 --> 19:01.000] also on AWS, we're using glue and step functions actually for that,[19:01.000 --> 19:03.000] and then eventually it ends up modeled to some extent,[19:03.000 --> 19:06.000] sanitized, quality checked in our data 
Yeah, and I have actually been talking to AWS about that, because one of the issues I have is that the AWS platform really pushes serverless, and then my question for AWS is: so why aren't you using it? If you are pushing serverless for everything, why is SageMaker not serverless? Maybe they are going to do that, I don't know; I don't have any inside information. But it is interesting to hear you had similar concerns. I know that there are two questions here. One is, someone asked what you do for data versioning, and a second one is, how do you do event-based MLOps? So maybe kind of following up.

Yeah, what do we do for data versioning? On the one hand, we are running a data lakehouse. The data we get from the financial institutions, from the banks, runs through a massive data pipeline, also on AWS; we are using Glue and Step Functions for that. Eventually it ends up modeled to some extent, sanitized and quality-checked, in our data lakehouse, where we are actually using Hudi on top of S3. That is also what we use for versioning, which we use for time travel and all these things. Our model pipelines then plug in there and spit out predictions, what we eventually call alerts. That is something we version based on unique IDs. With processing IDs we track pretty much everything: every line of code that touched a specific row in our data. So for every single row in our predictions and in our alerts, we can track back exactly what pipeline ran on it, which jobs were in that pipeline, which code exactly was running in each job, and which intermediate results were produced. We are basically adding lineage information to everything we output along that line, so we can track everything back using a few tools we have built.

So the tool you mentioned, I'm not familiar with it. What is it called again? Hudi?

Hudi, yes. Hudi is quite similar to other tools such as, from Databricks, how is it called? Delta Lake, exactly. It is basically equivalent to Delta Lake; it is just that back when we looked into what we were going to use, Delta Lake was not fully open-sourced yet (Databricks open-sourced it a while ago), so we went for Hudi. It is essentially a layer on top of, in our case, S3 that allows you to more easily keep track of the actions you are performing on your data. So it is very similar to Delta Lake, just a solution that was open source earlier.

Yeah, I didn't know anything about that, so now I do. Thanks for letting me know; I'll have to look into that.
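For readers equally new to it, this is roughly what Hudi-on-S3 versioning looks like from PySpark. A small sketch, not TMNL's code: the table name, key fields, and bucket are placeholders, and it assumes the Hudi Spark bundle is on the classpath and that a SparkSession `spark` and a DataFrame `df` already exist.

```python
# Every Hudi write becomes a commit; reads can time-travel to an earlier
# commit instant.
hudi_options = {
    "hoodie.table.name": "transactions",
    "hoodie.datasource.write.recordkey.field": "tx_id",        # unique row key
    "hoodie.datasource.write.precombine.field": "updated_at",  # dedupe field
}

(df.write.format("hudi")
    .options(**hudi_options)
    .mode("append")  # each append creates a new commit in the timeline
    .save("s3://example-lakehouse/transactions"))

# Time travel: read the table as it existed at an earlier instant.
old_snapshot = (spark.read.format("hudi")
    .option("as.of.instant", "2022-10-01 00:00:00")
    .load("s3://example-lakehouse/transactions"))
```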
The other interesting stack-related question: there are a couple of areas that I think are interesting and emerging. Actually, there are multiple; maybe I'll just bring them all up and we'll do them one by one. One is the concept of event-driven architecture versus, maybe, a more static architecture. Obviously you are using Step Functions, so you are a fan of event-driven architecture. Maybe we start with that one: what are your thoughts on going more event-driven in your organization?

Yeah, in our case, essentially everything works event-driven. Since we are on AWS, we are using EventBridge, or CloudWatch Events as I think it used to be called. This is how we trigger pretty much everything in our stack. This is how we trigger our data pipelines when data comes in. This is how we trigger different Lambdas that parse certain information from our logs and store it in different databases. This is also how, at some point in the past, we triggered new deployments when new models were approved in the model registry. So basically everything we have been doing is fully event-driven.

So I think this is a key thing you bring up. I have talked to many people who don't use AWS, who are experts in alternative technologies, and one of the things I have heard some people say is, oh, AWS isn't as fast as X or Y, Lambda isn't as fast as X or Y, or Kubernetes... But the point you bring up is exactly the way I think about AWS: the true advantage of the AWS platform is the tight integration between the services, and that you can design event-driven workflows. Would you say that's...?

Absolutely. I think designing event-driven workflows on AWS is incredibly easy to do, and it also comes incredibly naturally, and that is extremely powerful. Simply by having an easy way to trigger Lambdas event-driven, you can pretty much do everything and glue everything together that you want. I think that gives you tremendous flexibility.
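As one concrete flavor of this pattern, the model-registry example above can be wired up in a few boto3 calls. A hedged sketch: the rule name, Lambda ARN, and account details are placeholders, and the exact event fields should be checked against the current SageMaker event documentation.

```python
# EventBridge rule that fires when a SageMaker model package is approved,
# targeting a deployment Lambda.
import json

import boto3

events = boto3.client("events")

events.put_rule(
    Name="on-model-approved",
    EventPattern=json.dumps({
        "source": ["aws.sagemaker"],
        "detail-type": ["SageMaker Model Package State Change"],
        "detail": {"ModelApprovalStatus": ["Approved"]},
    }),
)

events.put_targets(
    Rule="on-model-approved",
    Targets=[{
        "Id": "deploy-model-lambda",
        "Arn": "arn:aws:lambda:eu-west-1:123456789012:function:deploy-model",
    }],
)
```

The Lambda also needs a resource-based policy (for example via the Lambda client's add_permission call) allowing events.amazonaws.com to invoke it.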
So I think there are two things that come to mind now. One is that if you are developing an MLOps platform, you can't ignore Lambda. I have had some people tell me, oh, well, we can do this and this and this better. It's like, yeah, but if you are going to be on AWS, you have to understand why people use Lambda. It isn't speed; it's the ease of developing very rich solutions.

Absolutely. And the glue between what you are building, eventually.

And almost the thoughts in your mind turn into Lambdas; you can be thinking and building code so quickly.

Absolutely. Everything turns into: which event do I need to listen to, and then I trigger a Lambda, and that Lambda does this and that.

Yeah. And the other part about Lambda that is pretty awesome is that it hooks into services that have infinite scale. Like SQS: you can't break SQS. There is nothing you can do to ever take SQS down; it handles unlimited requests in and unlimited requests out. How many systems are like that?

Yeah, absolutely.

So then a kind of follow-up would be that maybe data scientists should learn Lambda and Step Functions in order to get into MLOps.

I think that's a yes. If you want to put a foot into MLOps and you are on AWS, then I think there is no way around learning these fundamentals. There is no way around learning things like: what is a Lambda? How do I create a Lambda via Terraform, or whatever tool you are using there? How do I hook it up to an event? And how do I use the AWS SDK to interact with different services? If you want to take a step into MLOps coming more from the data science side, it is extremely important to familiarize yourself with at least the fundamentals: how do you architect basic solutions on AWS, how do you glue services together, how do you make them speak to each other. Ideally, that is what the platform should take away from you as a pure data scientist; you should not necessarily have to deal with that stuff. But if you are interested, if you want to make that move towards MLOps, I think learning about infrastructure, and specifically in the context of AWS about the services and how to use them, is really fundamental. Because this is automation, eventually. And if you want to automate your complex processes, then you need to learn that stuff. How else are you going to do it?
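For anyone making that first step, the Lambda side of the earlier model-approval example is only a handler function. A minimal sketch; the event fields read here follow the shape of SageMaker's registry events but are best treated as illustrative.

```python
# Minimal Lambda handler for an EventBridge event: the matched event arrives
# as a plain dict, with the service payload under the "detail" key.
import json
import logging

logger = logging.getLogger()
logger.setLevel(logging.INFO)


def handler(event, context):
    detail = event.get("detail", {})
    package_arn = detail.get("ModelPackageArn", "unknown")  # illustrative field
    logger.info("Model approved: %s", package_arn)
    # ...kick off a deployment, push to SQS, update a dashboard, etc.
    return {"statusCode": 200, "body": json.dumps({"handled": package_arn})}
```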
Yeah, I agree. I mean, that is really what Lambda and Step Functions are: they are automation tools. That is probably the better way to describe it. That is a very good point you bring up.

Another technology that I think is emerging is the managed file system. The reason I think it is interesting is that 20-plus years ago I was using file systems in a university setting, when I was at Caltech, and then also in the film industry. Film has been using managed file servers with parallel processing farms for a long time. I don't know how many people know this, but in the film industry the architecture, even from around 2000, was: there is a very expensive file server, and then there are, let's say, 40,000 machines, or 40,000 cores. And that's it; that's the architecture. What is interesting is that I could see something similar potentially happening in the future for data science and machine learning operations: a managed NFS mount point with maybe Kubernetes or something like that. Do you see any of that on the horizon?

Oh, that's a good question. For what we are currently doing, that is probably a bit further away. In our use case, not quite; but in principle, I could very well imagine it, definitely.

And then maybe a third emerging thing I am seeing is what is going on with OpenAI and Hugging Face. That has the potential to change the game a little bit, especially Hugging Face, I think, although both of them. In the case of pre-trained models, here is a perfect example. An organization may be transcribing videos, maybe they are even using AWS for this, and they are going to do something with them. I'm just brainstorming; I'm not saying your company did this, but here is a hypothetical situation: they recorded customers talking, they transcribe it to text, and then run some kind of criminal-detection feature on it. They could build their own models, or they could download the model that was released a day or two ago from OpenAI that transcribes things, then turn that transcribed text over to some other Hugging Face model that summarizes it, and then feed that into a system.
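That hypothetical chain is a few lines with Hugging Face's pipeline API. A sketch under stated assumptions: the checkpoints named here (a Whisper model for speech-to-text, BART for summarization) are examples rather than a recommendation, the audio file is invented, and a recent transformers release plus the model weights need to be installed.

```python
# Chain two off-the-shelf pre-trained models: speech-to-text, then summarize.
from transformers import pipeline

# Speech-to-text with a pre-trained OpenAI Whisper checkpoint.
asr = pipeline("automatic-speech-recognition", model="openai/whisper-small")
transcript = asr("customer_call.wav")["text"]  # hypothetical audio file

# Summarize the transcript with a second pre-trained model.
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")
summary = summarizer(transcript, max_length=80, min_length=20)[0]["summary_text"]

print(summary)  # this is what would be fed into a downstream system
```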
So what are your thoughts around some of these pre-trained models, and, in terms of your stack, are you thinking of looking into fine-tuning?

Yeah. I think pre-trained models, and especially the way Hugging Face really revolutionized the space by platformizing the entire market around pre-trained models, is really quite incredible, and for the ecosystem a game-changing way of doing things. When you look at the cost of training large models, and at the fact that many organizations are not able to do it, because of massive costs or because of lack of data, it becomes very clear how important such platforms are and how important sharing pre-trained models actually is. I believe we are really only at the beginning of that. Nowadays you see it mostly for fairly generalized data formats: images, potentially videos, text, speech, these things. But I believe we are going to see more marketplace approaches to pre-trained models in a lot more industries and a lot more use cases where data is to some degree standardized. Think about banking, for example: transaction data always looks kind of the same, at least at every bank. Of course you might need to do some mapping here and there, but there is a lot of power in it. Because sharing data is always a difficult thing, especially in Europe; sharing data between organizations is incredibly difficult legally. Sharing models is a different thing. Similar to the concept of federated learning, sharing models is significantly easier legally than actually sharing data, and then you can apply these models, fine-tune them, and so on.
Yeah, I can just imagine. I really don't know much about banking transactions, but I would imagine there are several kinds of transactions that are very normal, and then there are some transactions, like if you are transferring a lot of money every single second, and it happens just very quickly: wait, why are you doing this? Why are you transferring money constantly? What's going on? Or a huge sum of money only ever involves three different points in the network, over and over again, just these three points, constantly... And so once somebody has developed a model that does that anomaly detection, why would you need to develop another one? Somebody already did it.

Exactly. Yes, absolutely. And that is encoded knowledge, encoded information in terms of the model, which abstracts away from personally identifiable data. And that is really the power: that is something that, as I said before, you can share significantly more easily, and you can apply it to your use cases.
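As a toy illustration of the kind of shareable anomaly model being described, here is a sketch with scikit-learn's IsolationForest; the per-account features and numbers are invented, and a real AML model would be far richer.

```python
# A model trained on aggregate behavioral features carries no names or
# account numbers, which is what makes it easier to share than raw data.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)

# Per-account features: [transfers_per_hour, avg_amount, distinct_counterparties]
normal_accounts = rng.normal(loc=[5.0, 200.0, 30.0],
                             scale=[2.0, 50.0, 10.0],
                             size=(1000, 3))
suspicious = np.array([[600.0, 5000.0, 3.0]])  # rapid transfers among 3 endpoints

model = IsolationForest(contamination=0.01, random_state=0).fit(normal_accounts)
print(model.predict(suspicious))  # [-1] flags the account as anomalous
```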
And kind of related to this, in terms of upcoming technologies, is dealing more with graphs. Is that something, stack-wise, that your company has investigated or has the resources to do?

Yeah. When you think about bank transactions and bank customers: in our case, again, we only have pseudonymized transaction data, so we actually cannot see anything. We cannot see names, we cannot see IBANs or whatever; we really can't see much. But you can look at transactions moving between different entities, between different accounts, and you can look at that as a network, as a graph. And that is also what we very frequently do. You have the nodes in your network, which are your accounts, or persons even, and the actual edges between them, which are your transactions. So we at TMNL, Transaction Monitoring Netherlands, are actually sitting on a massive transaction graph. So yes, absolutely: for us, doing analysis on top of that graph and building models on top of that graph is quite an important thing.

I taught a class a few years ago at Berkeley where we had to cover graph databases a little bit, and I really didn't know that much about graph databases, although I did use one at a company I was at. One of the things I learned in teaching that class was about the descriptive statistics of a graph network. And it is actually pretty interesting, because I think most of the time everyone talks about median and max, min, and standard deviation and everything. But with a graph there are things like centrality, and I forget all the terms off the top of my head, but you can see if there is a node in the network that everybody is interacting with.

Absolutely. You can identify communities of people moving around a lot of money all the time, for example. You can derive different metrics and features by doing computations on your graph and then plugging in some model. Often it is feature engineering: you are computing centrality scores across your graph for your different entities, you are building your features from that, and then you are plugging in some model in the end, if you do classic machine learning, so to say. If you do graph deep learning, of course, that is a bit different.

So basically, for people who are analyzing networks of people, a graph database would be step one: generate the features, which could be centrality, there is a score, and then you go and train the model based on that descriptive statistic.

Exactly. Though whether you need a graph database or not always depends on your specific use case. We are actually also running that using Spark: you have GraphFrames, you have GraphX, really stuff in Spark built for doing analysis on graphs. And then what you usually do is exactly what you said: you try to build features based on that graph, based on the attributes of the nodes and the attributes on the edges, and so on.
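A minimal sketch of that workflow with Spark GraphFrames; the toy accounts and amounts are invented, and it assumes the graphframes package is on the Spark classpath (for example via --packages).

```python
# Compute centrality-style features on a transaction graph, then use them
# as model inputs. Vertices need an "id" column; edges need "src" and "dst".
from pyspark.sql import SparkSession
from graphframes import GraphFrame

spark = SparkSession.builder.getOrCreate()

vertices = spark.createDataFrame([("a",), ("b",), ("c",)], ["id"])
edges = spark.createDataFrame(
    [("a", "b", 120.0), ("b", "c", 75.0), ("a", "c", 9000.0)],
    ["src", "dst", "amount"],  # each edge is one transaction
)

g = GraphFrame(vertices, edges)

in_deg = g.inDegrees                                            # id, inDegree
ranks = g.pageRank(resetProbability=0.15, maxIter=10).vertices  # id, pagerank

features = in_deg.join(ranks.select("id", "pagerank"), "id")
features.show()  # these columns become inputs to a classic ML model
```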
And so, in terms of graph databases right now, it sounds like maybe the three main players are Neo4j, which has been around for a long time, Spark, and then, I forgot what the AWS one is called... Neptune, that's it. Have you played with all three of those, and did you like Neptune?

Spark, of course, we are actually currently using for exactly that, also because it allows us to keep our stack fairly homogeneous. We also did a PoC with Neptune a while ago. With Neptune you essentially have two ways to query it: either Gremlin or SPARQL. That means your data scientists need to get familiar with one of those, which is already a bit of a hurdle, because usually data scientists are familiar with neither. But what we also found with Neptune is that it is not necessarily built as an analytics graph database; it is not really made for that. And then, at least for us, it sometimes became quite complicated to handle performance considerations when you actually run fairly complex queries across that graph.
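For a taste of the Gremlin hurdle being described, this is roughly what a Neptune query looks like from Python with the gremlinpython driver; the endpoint, labels, and property names are placeholders.

```python
# An analytics-flavored Gremlin traversal: accounts that sent more than 100
# transactions. Exactly the kind of query whose performance needs watching.
from gremlin_python.process.anonymous_traversal import traversal
from gremlin_python.process.graph_traversal import __
from gremlin_python.process.traversal import P
from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection

conn = DriverRemoteConnection(
    "wss://example-neptune-endpoint:8182/gremlin", "g"  # hypothetical endpoint
)
g = traversal().withRemote(conn)

busy_senders = (
    g.V().hasLabel("account")
     .where(__.outE("transaction").count().is_(P.gt(100)))
     .values("account_id")
     .toList()
)

conn.close()
```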
Yeah, so you are bringing up a point which, in my experience with technology, happens a lot: sometimes the purity of the solution becomes the problem. Even though Spark isn't necessarily designed to be a graph database system, the fact is that people in your company are already using it. So if you just turn on that feature, now you can use it, and it is not some huge technical undertaking and retraining effort. So even if it is not as good, if it works, then that is probably the solution your company will use. And I agree with you: a lot of times, even if a solution, and Neo4j is a pretty good example of this, is an interesting product, you already have all these other products. Do you really want to introduce yet another product into your stack?

Yeah, because eventually it all comes with an overhead, of course, introducing it. That is one thing: it requires someone to maintain it, even if it is a managed service; somebody needs to actually own it and look after it. And then, as you said, you need to retrain people to also use it effectively. So it comes at a significant cost, and that is really something that I believe should be quite critically assessed. What is really the gain you get? How far can you go with your current tooling? And then eventually make that decision. At least personally, I am really not a fan of thinking tooling-first. I really believe in looking at your organization, looking at the people, what skills are there, looking at how effectively these people are actually performing certain activities and processes, and then carefully thinking about what really makes sense. Because it is one thing to introduce a tool, but people need to adopt and use the tooling, and eventually it should really speed them up and improve how they develop.

Yeah, I think that is great advice, and it is hard to understand how good the advice is, because it takes the experience of getting burned introducing new technology. I have had experiences before where one of the mistakes I made was putting too many different technologies into an organization, and the problem is, once you get enough complexity, it can really explode. And this is the part that really gets scary: let's take Spark, for example. How hard is it to hire somebody who knows Spark? Pretty easy. How hard is it going to be to hire somebody who knows Spark, and then hire another person who knows the Gremlin query language for Neptune, then hire another person who knows Kubernetes, then hire another... After a while, if you have so many different kinds of tools, you have to hire so many different kinds of people that all productivity grinds to a halt.
So it is the hiring as well.

Absolutely. I mean, it is virtually impossible to find someone who is really well versed in Gremlin, for example; it is incredibly hard. And I think tech hiring is hard by itself already, so you really need to think about what you can hire for as well: what expertise can you realistically build up?

So that is why I think, even with some of the limitations of the ML platform, the advantage of using AWS is that you have a huge audience of people to hire from. And the same thing with Spark: there are a lot of things I don't like about Spark, but a lot of people use Spark. So if you use AWS and you use Spark, let's say those two, then you are going to have a much easier time hiring people, you are going to have a much easier time training people, and there is tons of documentation about it. So I think you are very wise to be thinking that way, but a lot of people don't think about that. They say, oh, I have got to use the latest, greatest stuff, and this and this and this, and then their company starts to get into trouble because they can't hire people, they can't maintain systems, and then productivity starts to degrade.

Also something not to ignore is the cognitive load you put on a team that needs to manage a broad range of very different tools or services. It puts incredible cognitive load on that team, and you suddenly also need an incredible breadth of expertise in that team, and that means you are also going to create single points of failure if you don't really scale up your team. So when you go for new tooling, I think you should really look at it from a holistic perspective, not only at whether this is the latest and greatest.

In terms of Europe versus the US, have you spent much time in the US at all?

Not at all, actually. Flying to the US on Monday, but no, not at all.

That would also be kind of an interesting comparison, in that the culture of the United States is really this culture of, I would say, more like survival of the fittest: you work seven days a week, you constantly... you don't go on vacation, and you are proud of it. And I think it is not a good culture; I am not saying that is a good thing, I think it is a bad thing. A lot of times the critique people have about Europe is, oh, people take vacation all the time and all this, and as someone who has spent time in both, I would say: yes, that is a better approach.
A better approach is that people should feel relaxed, because especially with the kind of work you do in MLOps, you need people to feel comfortable and happy. And the question I was going toward is: I wonder if there is a more productive culture for MLOps in Europe versus the US, in terms of maintaining systems and building software. What the US has really been good at, I guess, is coming up with new ideas, and there are lots of new services that get generated, but the quality and longevity are not necessarily the same. And I could see, in the stuff we just talked about, that if you are trying to build a team where there is low turnover and very high-quality output, it seems like organizations could maybe learn from the European approach to building and maintaining systems for MLOps.

I think there is definitely some truth in it, especially when you look at the median tenure of a tech person in an organization. I think that is actually still significantly lower in the US; I am not sure, but in the Bay Area I think it is somewhere around a year or so, compared to Europe, where I believe it is still fairly low...
Here, of course, in tech people also like to switch companies more often, but I would say the average is still more around two years staying with the same company, also in tech, which I think is a bit longer than you would typically have in the US. I think, from my perspective, and I have built up most of the current team, it is super important to hire good people, people that fit the team and fit the company culture-wise, but also to not have them be in a sprint all the time. It is about having a sustainable way of working, in my opinion, and that sustainable way means you should definitely take your vacation. In Europe we have quite generous vacation, even by law; in the Netherlands you get 20 days a year by law, but most companies give you 25, and many IT companies 30 per year, so that is quite nice. And I do take that. Culture-wise, everyone likes to take vacations, whether that is C-level or an engineer on a team, and in many companies a healthy work-life balance is really encouraged. And of course it is not only about vacations but also about growth opportunities, letting people explore and develop themselves, and not always pushing for max performance. So I really always see it as a partnership: the organization wants to get something from an employee, but the employee should also be encouraged and developed in that organization…

Engenharia de Dados [Cast]
Databricks Data+AI Summit 2022 Conference: Announcements and News, by Luan Moreno

Engenharia de Dados [Cast]

Play Episode Listen Later Aug 31, 2022 52:13


Announcements and news from the Databricks conference, Data+AI Summit 2022. Details below:
https://databricks.com/dataaisummit/
Delta Lake 2.0: https://databricks.com/blog/2022/06/30/open-sourcing-all-of-delta-lake.html
MLFlow 2.0: https://databricks.com/blog/2022/06/29/introducing-mlflow-pipelines-with-mlflow-2-0.html
Project Lightspeed: https://databricks.com/blog/2022/06/28/project-lightspeed-faster-and-simpler-stream-processing-with-apache-spark.html
Spark Connect: https://databricks.com/blog/2022/07/07/introducing-spark-connect-the-power-of-apache-spark-everywhere.html
Databricks Runtime 11.0: https://docs.databricks.com/release-notes/runtime/releases.html
Databricks Workflows: https://databricks.com/blog/2022/05/10/introducing-databricks-workflows.html
DBT in production on Databricks: https://databricks.com/blog/2022/06/29/top-5-workflows-announcements-at-data-ai-summit.html
Delta Live Tables and Project Enzyme: https://databricks.com/blog/2022/06/29/delta-live-tables-announces-new-capabilities-and-performance-optimizations.html
New Databricks SQL connectors: https://databricks.com/blog/2022/06/29/connect-from-anywhere-to-databricks-sql.html
Databricks SQL Serverless: https://databricks.com/blog/2022/06/28/databricks-sql-serverless-now-available-on-aws.html
Unity Catalog: https://databricks.com/blog/2022/06/28/whats-new-with-databricks-unity-catalog-at-the-data-ai-summit-2022.html
Terraform for Databricks: https://databricks.com/blog/2022/06/22/databricks-terraform-provider-is-now-generally-available.html
On YouTube we have a Data Engineering channel covering the most important topics in this area, with live streams every Wednesday: https://www.youtube.com/channel/UCnErAicaumKqIo4sanLo7vQ
Want to stay on top of this field with weekly posts and updates? Follow along on LinkedIn so you don't miss any news: https://www.linkedin.com/in/luanmoreno/
Available on Spotify and Apple Podcasts: https://open.spotify.com/show/5n9mOmAcjra9KbhKYpOMqYht
Luan Moreno = https://www.linkedin.com/in/luanmoreno/

MLOps.community
MLX: Opinionated ML Pipelines in MLflow // Xiangrui Meng // Coffee Sessions #112

MLOps.community

Play Episode Listen Later Aug 3, 2022 50:17


MLOps Coffee Sessions #112 with Xiangrui Meng, Principal Software Engineer at Databricks, on MLX: Opinionated ML Pipelines in MLflow, co-hosted by Vishnu Rachakonda. // Abstract The goal of MLX is to enable data scientists to stay mostly within their comfort zone, utilizing their expert knowledge while following the best practices in ML development and delivering production-ready ML projects, with little help from production engineers and DevOps. // Bio Xiangrui Meng is a Principal Software Engineer at Databricks and an Apache Spark PMC member. His main interests center around simplifying the end-to-end user experience of building machine learning applications, from algorithms to platforms and to operations. // MLOps Jobs board https://mlops.pallet.xyz/jobs MLOps Swag/Merch https://mlops-community.myshopify.com/ // Related Links Good Strategy Bad Strategy: The Difference and Why It Matters book by Richard Rumelt: https://www.amazon.com/Good-Strategy-Bad-Difference-Matters/dp/0307886239 --------------- ✌️Connect With Us ✌️ ------------- Join our slack community: https://go.mlops.community/slack Follow us on Twitter: @mlopscommunity Sign up for the next meetup: https://go.mlops.community/register Catch all episodes, blogs, newsletters, and more: https://mlops.community/ Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/ Connect with Vishnu on LinkedIn: https://www.linkedin.com/in/vrachakonda/ Connect with Xiangrui on LinkedIn: https://www.linkedin.com/in/mengxr/ Timestamps: [00:00] Introduction to Xiangrui Meng [00:39] Takeaways [02:09] Xiangrui's background [03:38] What kept Xiangrui in Databricks [07:33] What needs to be done to get there [09:20] Machine Learning passion of Xiangrui [11:52] Changes in building that keep you fresh for the future [14:35] Evolution core challenges to real-time and use cases in real-time [17:33] DevOps + DataOps + ModelOps = MLOps [19:21] MLFlow Support [21:37] Notebooks to production debates [25:42] Companies tackling Notebooks to production [27:40] MLOoops stories [31:03] Opinionated MLOps productionizing in a good way [40:23] Xiangrui's MLOps Vision [44:47] Lightning round [48:45] Wrap up

The Machine Learning Podcast
Build Better Models Through Data Centric Machine Learning Development With Snorkel AI

The Machine Learning Podcast

Play Episode Listen Later Jul 29, 2022 53:49


Summary Machine learning is a data hungry activity, and the quality of the resulting model is highly dependent on the quality of the inputs that it receives. Generating sufficient quantities of high quality labeled data is an expensive and time consuming process. In order to reduce that time and cost Alex Ratner and his team at Snorkel AI have built a system for powering data-centric machine learning development. In this episode he explains how the Snorkel platform allows domain experts to create labeling functions that translate their expertise into reusable logic that dramatically reduces the time needed to build training data sets and drives down the total cost. Announcements Hello and welcome to the Machine Learning Podcast, the podcast about machine learning and how to bring it from idea to delivery. Building good ML models is hard, but testing them properly is even harder. At Deepchecks, they built an open-source testing framework that follows best practices, ensuring that your models behave as expected. Get started quickly using their built-in library of checks for testing and validating your model’s behavior and performance, and extend it to meet your specific needs as your model evolves. Accelerate your machine learning projects by building trust in your models and automating the testing that you used to do manually. Go to themachinelearningpodcast.com/deepchecks today to get started! Data powers machine learning, but poor data quality is the largest impediment to effective ML today. Galileo is a collaborative data bench for data scientists building Natural Language Processing (NLP) models to programmatically inspect, fix and track their data across the ML workflow (pre-training, post-training and post-production) – no more excel sheets or ad-hoc python scripts. Get meaningful gains in your model performance fast, dramatically reduce data labeling and procurement costs, while seeing 10x faster ML iterations. Galileo is offering listeners a free 30 day trial and a 30% discount on the product there after. This offer is available until Aug 31, so go to themachinelearningpodcast.com/galileo and request a demo today! Predibase is a low-code ML platform without low-code limits. Built on top of our open source foundations of Ludwig and Horovod, our platform allows you to train state-of-the-art ML and deep learning models on your datasets at scale. Our platform works on text, images, tabular, audio and multi-modal data using our novel compositional model architecture. We allow users to operationalize models on top of the modern data stack, through REST and PQL – an extension of SQL that puts predictive power in the hands of data practitioners. Go to themachinelearningpodcast.com/predibase today to learn more and try it out! Your host is Tobias Macey and today I’m interviewing Alex Ratner about Snorkel AI, a platform for data-centric machine learning workflows powered by programmatic data labeling techniques Interview Introduction How did you get involved in machine learning? Can you describe what Snorkel AI is and the story behind it? What are the problems that you are focused on solving? Which pieces of the ML lifecycle are you focused on? How did your experience building the open source Snorkel project and working with the community inform your product direction for Snorkel AI? How has the underlying Snorkel project evolved over the past 4 years? 
What are the deciding factors that an organization or ML team needs to consider when evaluating existing labeling strategies against the programmatic approach that you provide? What are the features that Snorkel provides over and above managing code execution across the source data set? Can you describe what you have built at Snorkel AI and how it is implemented? What are some of the notable developments of the ML ecosystem that had a meaningful impact on your overall product vision/viability? Can you describe the workflow for an individual or team who is using Snorkel for generating their training data set? How does Snorkel integrate with the experimentation process to track how changes to labeling logic correlate with the performance of the resulting model? What are some of the complexities involved in designing and testing the labeling logic? How do you handle complex data formats such as audio, video, images, etc. that might require their own ML models to generate labels? (e.g. object detection for bounding boxes) With the increased scale and quality of labeled data that Snorkel AI offers, how does that impact the viability of autoML toolchains for generating useful models? How are you managing the governance and feature boundaries between the open source Snorkel project and the business that you have built around it? What are the most interesting, innovative, or unexpected ways that you have seen Snorkel AI used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on Snorkel AI? When is Snorkel AI the wrong choice? What do you have planned for the future of Snorkel AI? Contact Info LinkedIn Website @ajratner on Twitter Parting Question From your perspective, what is the biggest barrier to adoption of machine learning today? Closing Announcements Thank you for listening! Don’t forget to check out our other shows. The Data Engineering Podcast covers the latest on modern data management. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email hosts@themachinelearningpodcast.com with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers Links Snorkel AI Data Engineering Podcast Episode University of Washington Snorkel OSS Natural Language Processing (NLP) Tensorflow PyTorch Podcast.__init__ Episode Deep Learning Foundation Models MLFlow SHAP Podcast.__init__ Episode The intro and outro music is from Hitman’s Lovesong feat. Paola Graziano by The Freak Fandango Orchestra/CC BY-SA 3.0

The Machine Learning Podcast
Declarative Machine Learning For High Performance Deep Learning Models With Predibase

The Machine Learning Podcast

Play Episode Listen Later Jul 21, 2022 60:19


Summary Deep learning is a revolutionary category of machine learning that accelerates our ability to build powerful inference models. Along with that power comes a great deal of complexity in determining what neural architectures are best suited to a given task, engineering features, scaling computation, etc. Predibase is building on the successes of the Ludwig framework for declarative deep learning and Horovod for horizontally distributing model training. In this episode CTO and co-founder of Predibase, Travis Addair, explains how they are reducing the burden of model development even further with their managed service for declarative and low-code ML and how they are integrating with the growing ecosystem of solutions for the full ML lifecycle. Announcements Hello and welcome to the Machine Learning Podcast, the podcast about machine learning and how to bring it from idea to delivery. Building good ML models is hard, but testing them properly is even harder. At Deepchecks, they built an open-source testing framework that follows best practices, ensuring that your models behave as expected. Get started quickly using their built-in library of checks for testing and validating your model’s behavior and performance, and extend it to meet your specific needs as your model evolves. Accelerate your machine learning projects by building trust in your models and automating the testing that you used to do manually. Go to themachinelearningpodcast.com/deepchecks today to get started! Data powers machine learning, but poor data quality is the largest impediment to effective ML today. Galileo is a collaborative data bench for data scientists building Natural Language Processing (NLP) models to programmatically inspect, fix and track their data across the ML workflow (pre-training, post-training and post-production) – no more excel sheets or ad-hoc python scripts. Get meaningful gains in your model performance fast, dramatically reduce data labeling and procurement costs, while seeing 10x faster ML iterations. Galileo is offering listeners a free 30 day trial and a 30% discount on the product there after. This offer is available until Aug 31, so go to themachinelearningpodcast.com/galileo and request a demo today! Do you wish you could use artificial intelligence to drive your business the way Big Tech does, but don’t have a money printer? Graft is a cloud-native platform that aims to make the AI of the 1% accessible to the 99%. Wield the most advanced techniques for unlocking the value of data, including text, images, video, audio, and graphs. No machine learning skills required, no team to hire, and no infrastructure to build or maintain. For more information on Graft or to schedule a demo, visit themachinelearningpodcast.com/graft today and tell them Tobias sent you. Your host is Tobias Macey and today I’m interviewing Travis Addair about Predibase, a low-code platform for building ML models in a declarative format Interview Introduction How did you get involved in machine learning? Can you describe what Predibase is and the story behind it? Who is your target audience and how does that focus influence your user experience and feature development priorities? How would you describe the semantic differences between your chosen terminology of "declarative ML" and the "autoML" nomenclature that many projects and products have adopted? Another platform that launched recently with a promise of "declarative ML" is Continual. How would you characterize your relative strengths? 
Can you describe how the Predibase platform is implemented? How have the design and goals of the product changed as you worked through the initial implementation and started working with early customers? The operational aspects of the ML lifecycle are still fairly nascent. How have you thought about the boundaries for your product to avoid getting drawn into scope creep while providing a happy path to delivery? Ludwig is a core element of your platform. What are the other capabilities that you are layering around and on top of it to build a differentiated product? In addition to the existing interfaces for Ludwig you created a new language in the form of PQL. What was the motivation for that decision? How did you approach the semantic and syntactic design of the dialect? What is your vision for PQL in the space of "declarative ML" that you are working to define? Can you describe the available workflows for an individual or team that is using Predibase for prototyping and validating an ML model? Once a model has been deemed satisfactory, what is the path to production? How are you approaching governance and sustainability of Ludwig and Horovod while balancing your reliance on them in Predibase? What are some of the notable investments/improvements that you have made in Ludwig during your work of building Predibase? What are the most interesting, innovative, or unexpected ways that you have seen Predibase used? What are the most interesting, unexpected, or challenging lessons that you have learned while working on Predibase? When is Predibase the wrong choice? What do you have planned for the future of Predibase? Contact Info LinkedIn tgaddair on GitHub @travisaddair on Twitter Parting Question From your perspective, what is the biggest barrier to adoption of machine learning today? Closing Announcements Thank you for listening! Don’t forget to check out our other shows. The Data Engineering Podcast covers the latest on modern data management. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used. Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes. If you’ve learned something or tried out a project from the show then tell us about it! Email hosts@themachinelearningpodcast.com with your story. To help other people find the show please leave a review on iTunes and tell your friends and co-workers Links Predibase Horovod Ludwig Podcast.__init__ Episode Support Vector Machine Hadoop Tensorflow Uber Michaelangelo AutoML Spark ML Lib Deep Learning PyTorch Continual Data Engineering Podcast Episode Overton Kubernetes Ray Nvidia Triton Whylogs Data Engineering Podcast Episode Weights and Biases MLFlow Comet Confusion Matrices dbt Data Engineering Podcast Episode Torchscript Self-supervised Learning The intro and outro music is from Hitman’s Lovesong feat. Paola Graziano by The Freak Fandango Orchestra/CC BY-SA 3.0

MLOps.community
ML Flow vs Kubeflow 2022 // Byron Allen // Coffee Sessions #108

MLOps.community

Play Episode Listen Later Jul 19, 2022 66:01


MLOps Coffee Sessions #108 with Byron Allen, AI & ML Practice Lead at Contino, ML Flow vs Kubeflow 2022 co-hosted by George Pearse. // Abstract The amazing Byron Allen talks to us about why MLflow and Kubeflow are not playing the same game! Comparing MLflow and Kubeflow is more like comparing apples to oranges, or, as he likes to put it, they are both cheese, but one is an all-rounder and the other a high-class delicacy. This can be quite deceiving when analyzing the two. We do a deep dive into the functionalities of both and the pros/cons they have to offer. // Bio Byron wears several hats: AI & ML practice lead, solutions architect, ML engineer, data engineer, data scientist, Google Cloud Authorized Trainer, and scrum master. He has a track record of successfully advising on and delivering data science platforms and projects. Byron has a mix of technical capability, business acumen, and communication skills that make him an effective leader, team player, and technology advocate. See Byron write at https://medium.com/@byron.allen // MLOps Jobs board https://mlops.pallet.xyz/jobs MLOps Swag/Merch https://mlops-community.myshopify.com/ // Related Links --------------- ✌️Connect With Us ✌️ ------------- Join our slack community: https://go.mlops.community/slack Follow us on Twitter: @mlopscommunity Sign up for the next meetup: https://go.mlops.community/register Catch all episodes, blogs, newsletters, and more: https://mlops.community/ Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/ Connect with George on LinkedIn: https://www.linkedin.com/in/george-pearse-b7a76a157/?originalSubdomain=uk Connect with Byron on LinkedIn: https://www.linkedin.com/in/byronaallen/ Timestamps: [00:00] Introduction to Byron Allen [01:10] Introduction to the new co-host George Pearse [01:41] ML Flow vs Kubeflow [05:40] George's take on ML Flow and Kubeflow [07:28] Writing in YAML [09:47] Developer experience [13:38] Changes in ML Flow and Kubeflow [17:58] Messing around ML Flow Serving [20:00] A taste of Kubeflow through KServe [23:18] Managed service of Kubeflow [25:15] How George used Kubeflow [27:45] Getting the Managed Service [31:30] Getting Authentication [32:41] ML Flow docs vs Kubeflow docs [36:59] Kubeflow community incentives [42:25] MLOps Search term [42:52] Organizational problem [43:50] Final thoughts on ML Flow and Kubeflow [49:19] Bonus [49:35] Entity-Centric Modeling [52:11] Semantic Layer options [57:27] Semantic Layer with Machine Learning [58:40] Satellite Infra Images demo [1:00:49] Motivation to move away from SQL [1:03:00] Managing SQL [1:05:24] Wrap up

Datascape Podcast
Episode 65 - Recapping The 2022 Databricks Data And AI Summit With Luan Moreno

Datascape Podcast

Play Episode Listen Later Jul 13, 2022 46:02


In this episode Warner and Luan cover the biggest announcements from the Databricks Data and AI Summit 2022. These include improvements on Delta Lake tables, Spark Structured Streaming, serverless SQL endpoints, new Unity Catalog features, MLFlow 2.0 and more!

MLOps.community
Making MLFlow // Lead MLFlow Maintainer Corey Zumar // MLOps Coffee Sessions #103

MLOps.community

Play Episode Listen Later Jun 17, 2022 64:45


MLOps Coffee Sessions #103 with Corey Zumar, MLOps Podcast on Making MLflow co-hosted by Mihail Eric. // Abstract Because MLOps is a broad ecosystem of rapidly evolving tools and techniques, it creates several requirements and challenges for platform developers: - To serve the needs of many practitioners and organizations, it's important for MLOps platforms to support a variety of tools in the ecosystem. This necessitates extra scrutiny when designing APIs, as well as rigorous testing strategies to ensure compatibility. - Extensibility to new tools and frameworks is a must, but it's important not to sacrifice maintainability. MLflow Plugins (https://www.mlflow.org/docs/latest/plugins.html) is a great example of striking this balance. - Open source is a great space for MLOps platforms to flourish. MLflow's growth has been heavily aided by: 1. meaningful feedback from a community of ML practitioners with a wide range of use cases and workflows & 2. collaboration with industry experts from a variety of organizations to co-develop APIs that are becoming standards in the MLOps space. // Bio Corey Zumar is a software engineer at Databricks, where he's spent the last four years working on machine learning infrastructure and APIs for the machine learning lifecycle, including model management and production deployment. Corey is an active developer of MLflow. He holds a master's degree in computer science from UC Berkeley. // MLOps Jobs board https://mlops.pallet.xyz/jobs MLOps Swag/Merch https://www.printful.com/ // Related Links --------------- ✌️Connect With Us ✌️ ------------- Join our slack community: https://go.mlops.community/slack Follow us on Twitter: @mlopscommunity Sign up for the next meetup: https://go.mlops.community/register Catch all episodes, blogs, newsletters, and more: https://mlops.community/ Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/ Connect with Mihail on LinkedIn: https://www.linkedin.com/in/mihaileric/ Connect with Corey on LinkedIn: https://www.linkedin.com/in/corey-zumar/ Timestamps: [00:00] Origin story of MLFlow [02:12] Spark as a big player [03:12] Key insights [04:42] Core abstractions and principles on MLFlow's success [07:08] Product development with open-source [09:29] Fine line between competing principles [11:53] Shameless way to pursue collaboration [12:24] Right go-to-market open-source [16:27] Vanity metrics [18:57] First gate of MLOps drug [22:11] Project fundamentals [24:29] Through the pillars [26:14] Best in breed or one tool to rule them all [29:16] MLOps space mature with the MLOps tool [30:49] Ultimate vision for MLFlow [33:56] Alignment of end-users and business values [38:11] Adding a project abstraction separate from the current ML project [42:03] Implementing bigger bets in certain directions [44:54] Log in features to experiment page [45:46] Challenge when operationalizing MLFlow in their stack [48:34] What would you work on if it weren't MLFlow? [49:52] Something to put on top of MLFlow [51:42] Proxy metric [52:39] Feature Stores and MLFlow [54:33] Lightning round [57:36] Wrap up
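
Since the abstract calls out MLflow Plugins as an extensibility example, here is a rough sketch of how a plugin package registers itself via setuptools entry points. The entry-point group names follow the MLflow plugin docs linked above; the package, module, and class names are hypothetical.

```python
# setup.py for a hypothetical MLflow plugin: entry points map a custom URI
# scheme to plugin classes that MLflow discovers at import time.
from setuptools import setup

setup(
    name="my-mlflow-plugin",            # hypothetical package name
    packages=["my_plugin"],
    install_requires=["mlflow"],
    entry_points={
        # Handle tracking URIs of the form "my-scheme://..."
        "mlflow.tracking_store": "my-scheme=my_plugin.store:MyTrackingStore",
        # Handle artifact URIs with the same scheme
        "mlflow.artifact_repository": "my-scheme=my_plugin.artifacts:MyArtifactRepo",
    },
)
```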

Adventures in Machine Learning
Solving the Real Issues with the MLflow Team - ML 059

Adventures in Machine Learning

Play Episode Listen Later Jan 27, 2022 48:34


If you're looking for a team that actually cares about the issues you're facing, look no further than Databricks, and they've got something exciting out. In this episode, Michael and Ben welcome on the development team of MLflow, an open-source lifecycle manager for machine learning. They cover how Databricks is redefining how developers and engineers collaborate, the reason behind Databricks' crazy success, and the number ONE most important testing structure for any development team. “A lot of the success was attributed to process and dedicated focus on the interface, understanding what major problems we were going after.” - Corey Zumar In This Episode How Databricks allows data analysts, engineers, and developers to collaborate effectively Why Databricks was able to rake in 800,000 downloads per MONTH in their first year A simple but powerful methodology that helps Databricks identify the highest ROI problems to tackle (not just the most popular ones) The number one MOST important testing structure that reveals how Databricks keeps their work top-notch What makes Databricks unique from everyone else and is the KEY to putting users first in 2022 Sponsors Top End Devs (https://topenddevs.com/) Coaching | Top End Devs (https://topenddevs.com/coaching) Special Guests: Corey Zumar, Harutaka Kawamura, Weichen Xu, and Zhang Jin.

Stanford MLSys Seminar
10/22/20 #2 Matei Zaharia - Machine Learning at Industrial Scale: Lessons from the MLflow Project

Stanford MLSys Seminar

Play Episode Listen Later Jan 8, 2022 59:33


Matei Zaharia - Machine Learning at Industrial Scale: Lessons from the MLflow Project Although enterprise adoption of machine learning is still early on, many enterprises in all industries already have hundreds of internal ML applications. ML powers business processes with an impact of hundreds of millions of dollars in industrial IoT, finance, healthcare and retail. Building and operating these applications reliably requires infrastructure that is different from traditional software development, which has led to significant investment in the construction of “ML platforms” specifically designed to run ML applications. In this talk, I'll discuss some of the common challenges in productionizing ML applications based on experience building MLflow, an open source ML platform started at Databricks. MLflow is now the most widely used open source project in this area, with over 2 million downloads a month and integrations with dozens of other products. I'll also highlight some interesting problems users face that are not covered deeply in current ML systems research, such as the need for “hands-free” ML that can train thousands of independent models without direct tuning from the ML developer for regulatory reasons, and the impact of privacy and interpretability regulations on ML. All my examples will be based on experience at large Databricks / MLflow customers.
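
One pattern from the talk, "hands-free" training of many independent models without per-model tuning, is easy to illustrate. Below is a hedged sketch (not code from the talk): one small model per segment, each tracked as its own MLflow run. The segment data is synthetic and the names are placeholders.

```python
# Sketch of the "thousands of independent models" pattern: train one model
# per segment in a loop, tracking each as a separate MLflow run.
import mlflow
import mlflow.sklearn
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
segments = {
    f"store_{i}": (rng.random((100, 4)), rng.integers(0, 2, 100))
    for i in range(3)  # imagine thousands in practice
}

for segment, (X, y) in segments.items():
    with mlflow.start_run(run_name=f"model-{segment}"):
        model = LogisticRegression(max_iter=1000).fit(X, y)
        mlflow.log_param("segment", segment)
        mlflow.log_metric("train_accuracy", model.score(X, y))
        mlflow.sklearn.log_model(model, "model")
```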

This Week in Machine Learning & Artificial Intelligence (AI) Podcast
Building Blocks of Machine Learning at LEGO with Francesc Joan Riera - #533

This Week in Machine Learning & Artificial Intelligence (AI) Podcast

Play Episode Listen Later Nov 4, 2021 43:13


Today we're joined by Francesc Joan Riera, an applied machine learning engineer at The LEGO Group. In our conversation, we explore the ML infrastructure at LEGO, specifically around two use cases: content moderation and user engagement. Content moderation is not a new or novel task, but because their apps and products are marketed towards children, the need for heightened levels of moderation makes it very interesting. We discuss whether the moderation system is built specifically to weed out bad actors or passive behaviors, whether their system has a human-in-the-loop component, why they built a feature store as opposed to a traditional database, and the challenges they faced along that journey. We also talk through the range of skill sets on their team, the use of MLflow for experimentation, the adoption of AWS for serverless, and so much more! The complete show notes for this episode can be found at twimlai.com/go/534.

Adventures in Machine Learning
The MLOps Community with Demetrios Brinkmann - ML048

Adventures in Machine Learning

Play Episode Listen Later Oct 7, 2021 57:07


Demetrios Brinkmann joins the adventure to discuss how he built and supports the MLOps Slack community and online meetups. He goes into the community, moderation, running meetups, sponsorships, and much more. Panel: Ben Wilson, Charles Max Wood, Francois Bertrand. Guest: Demetrios Brinkmann. Sponsors: Dev Influencers Accelerator, Level Up | Devchat.tv, PodcastBootcamp.io. Links: MLOps Community, Twitter: @mlopscommunity, LinkedIn: Demetrios Brinkmann, Twitter: Demetrios ( @Dpbrinkm ). Picks: Ben - MLFlow; Charles - Tribe of Millionaires; Charles - Top End Devs; Demetrios - Make Noise: A Creator's Guide to Podcasting and Great Audio Storytelling; Demetrios - Out on the Wire; Demetrios - Radiolab: Podcasts. Contact Ben: Databricks, GitHub | BenWilson2/ML-Engineering, GitHub | databrickslabs/automl-toolkit, LinkedIn: Benjamin Wilson. Contact Charles: Devchat.tv, DevChat.tv | Facebook, Twitter: DevChat.tv ( @devchattv ). Contact Francois: Francois Bertrand, GitHub | fbdesignpro/sweetviz. Special Guest: Demetrios Brinkmann.

Quail data por tacos de datos
Reheated tacos from pycastR

Quail data por tacos de datos

Play Episode Listen Later May 19, 2021 32:16


We're too busy to record this week, so we're taking a short break until next Wednesday. In the meantime, we bring you this episode of pycastR that we hope you find useful.

Data Brew by Databricks
Data Brew Season 2 Episode 1: ML in Production

Data Brew by Databricks

Play Episode Listen Later Apr 22, 2021 30:49


For our second season, we will be focusing on machine learning, from research to production. We will interview folks in academia and industry to discuss topics such as data ethics, production-grade infrastructure for ML, hyperparameter tuning, AutoML, and many more. In the season opener, Matei Zaharia discusses how he entered the field of ML, best practices for productionizing ML pipelines, leveraging MLflow & the Lakehouse architecture for reproducible ML, and his current research in this field. See more at databricks.com/data-brew

Machine Learning Engineered
Developing Feast, the Leading Open Source Feature Store, with Willem Pienaar (Gojek, Tecton)

Machine Learning Engineered

Play Episode Listen Later Mar 9, 2021 71:49


Willem Pienaar is the co-creator of Feast, the leading open source feature store, which he leads the development of as a tech lead at Tecton. Previously, he led the ML platform team at Gojek, a super-app in Southeast Asia. Learn more: https://twitter.com/willpienaar https://feast.dev/ Every Thursday I send out the most useful things I've learned, curated specifically for the busy machine learning engineer. Sign up here: https://www.cyou.ai/newsletter Follow Charlie on Twitter: https://twitter.com/CharlieYouAI Subscribe to ML Engineered: https://mlengineered.com/listen Comments? Questions? Submit them here: http://bit.ly/mle-survey Take the Giving What We Can Pledge: https://www.givingwhatwecan.org/ Timestamps: 02:15 How Willem got started in computer science 03:40 Paying for college by starting an ISP 05:25 Willem's experience creating Gojek's ML platform 21:45 Issues faced that led to the creation of Feast 26:45 Lessons learned building Feast 33:45 Integrating Feast with data quality monitoring tools 40:10 What it looks like for a team to adopt Feast 44:20 Feast's current integrations and future roadmap 46:05 How a data scientist would use Feast when creating a model 49:40 How the feature store pattern handles DAGs of models 52:00 Priorities for a startup's data infrastructure 55:00 Integrating with Amundsen, Lyft's data catalog 57:15 The evolution of data and MLOps tool standards for interoperability 01:01:35 Other tools in the modern data stack 01:04:30 The interplay between open and closed source offerings Links: Feast's GitHub: https://github.com/feast-dev/feast Gojek Data Science Blog: https://blog.gojekengineering.com/data-science/home Data Build Tool (DBT): https://www.getdbt.com/ Tensorflow Data Validation (TFDV): https://www.tensorflow.org/tfx/data_validation/get_started A State of Feast: https://feast.dev/post/a-state-of-feast/ Google BigQuery: https://cloud.google.com/bigquery Lyft Amundsen: https://www.amundsen.io/ Cortex: https://www.cortex.dev/ Kubeflow: https://www.kubeflow.org/ MLFlow: https://mlflow.org/
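
For the timestamp "how a data scientist would use Feast when creating a model," a hedged sketch of Feast's two core retrieval paths may help. It assumes a feature repository resembling Feast's quickstart; the "driver_hourly_stats" feature view and its fields come from that example, and the repo path is hypothetical.

```python
# Sketch of offline (training) vs. online (serving) retrieval in Feast.
import pandas as pd
from feast import FeatureStore

store = FeatureStore(repo_path=".")  # assumes a feature repo in this directory

# Offline: point-in-time-correct training data joined against an entity dataframe
entity_df = pd.DataFrame({
    "driver_id": [1001, 1002],
    "event_timestamp": pd.to_datetime(["2021-03-01", "2021-03-01"]),
})
training_df = store.get_historical_features(
    entity_df=entity_df,
    features=["driver_hourly_stats:conv_rate", "driver_hourly_stats:avg_daily_trips"],
).to_df()

# Online: low-latency lookup of the same features at serving time
online = store.get_online_features(
    features=["driver_hourly_stats:conv_rate"],
    entity_rows=[{"driver_id": 1001}],
).to_dict()
```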

MLOps.community
MLOps Engineering Labs Recap // Part 1 // MLOps Coffee Sessions #30

MLOps.community

Play Episode Listen Later Feb 23, 2021 59:38


This is a deep dive into the most recent MLOps Engineering Labs from the point of view of Team 1. // Diagram Link: https://github.com/mlops-labs-team1/engineering.labs#workflow --------------- ✌️Connect With Us ✌️ ------------- Join our slack community: https://go.mlops.community/slack Follow us on Twitter: @mlopscommunity Sign up for the next meetup: https://go.mlops.community/register Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/ Connect with Alexey on LinkedIn: https://www.linkedin.com/in/alexeynaiden/ Connect with John on LinkedIn: https://www.linkedin.com/in/johnsavageireland/ Connect with Michel on LinkedIn: https://www.linkedin.com/in/michel-vasconcelos-8273008/ Connect with Varuna on LinkedIn: https://www.linkedin.com/in/vpjayasiri/ Timestamps [00:00] Introduction to Engineering Labs Participants [00:34] What are the Engineering Labs? [01:05] Credits to Ivan Nardini who organized this episode! [04:24] John Savage Profile [05:13] Did you want to learn MLFlow before this? [05:50] Alexey Naiden Profile [07:26] Varuna Jayasiri Profile [08:28] Michel Vasconcelos Profile [10:07] Do something with Pytorch and MLFlow and then figure out the rest: What did the process look like for you all? What have you created? [13:39] What did the implementation look like? How you went about structuring and coding it? [17:03] Did you encounter problems along the way? [20:26] Can you give us a rough overview of what you designed and then where was the first problem you saw? [23:08] Was there a lot to catch up with or did you feel it was fine. Can you explain how it was? [24:12] Talk to us about this tool that you have that John was calling out. What was it called? [24:41] Is this homegrown? You built this? [24:51] Did you guys implement this when you went to the engineering labs? [26:03] Can you take us through the pipeline and then the serving and what the overall view of the diagram is? [37:26] For a pet project it works well, but when you wanna start adding a little bit more on top of it wasn't doing the trick? [38:13] So you see it coming in it's much less of an integral part, another lego building block that is part of the whole thing? [40:54] Did you all have trouble with Pytorch or MLFlow? [42:44] Along with that, what was the prompt you were encountering when you were trying to use Torchserve? [44:27] What are you thinking would have been better in that case? [49:05] Feedback on how Engineering Labs went [50:20] Michel: "Engineering Labs should go on. I would like to be a part of it in the next lab." [51:52] Varuna: "This gives me a tangible thing to look at at any point in time and learn from it." [53:00] John: "I feel I have an anchor into the world of MLOps from having done this lab." [55:52] Alexey: "We're at a checkpoint where there are ways we could take" [56:01] Terraform piece Michel wrote for reproducibility.
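
The lab's core task was wiring PyTorch to MLflow; here is a minimal, hedged sketch of that handshake (not the team's actual code). The model is a toy stand-in, and no real training is shown.

```python
# Log a PyTorch model with MLflow and load it back, e.g. before handing it
# to a serving layer such as TorchServe.
import mlflow
import mlflow.pytorch
import torch.nn as nn

model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))

with mlflow.start_run() as run:
    # ... training loop would go here ...
    mlflow.pytorch.log_model(model, "model")

# Retrieve the exact logged artifact by run ID
loaded = mlflow.pytorch.load_model(f"runs:/{run.info.run_id}/model")
```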

Data Brew by Databricks
Data Brew Episode 5: Combining Machine Learning and MLflow with your Lakehouse

Data Brew by Databricks

Play Episode Listen Later Jan 6, 2021 36:00


Ellissa Verseput, ML Engineer at Quby, joins Denny and Brooke to discuss how Quby leverages ML to extract additional value from their data lake and how they manage this process. See more at databricks.com/data-brew

MLOps.community
How To Move From Barely Doing BI to Doing AI // Joe Reis // MLOps Meetup #45

MLOps.community

Play Episode Listen Later Dec 20, 2020 54:03


MLOps community meetup #45! Last Wednesday, we talked to Joe Reis, CEO/Co-Founder of Ternary Data. // Abstract: The fact is that most companies are barely doing BI, let alone AI. Joe discussed ways for companies to build a solid data foundation so they can succeed with machine learning. This meetup covers the continuum from cloud data warehousing to MLOps. // Bio: Joe is a Data Engineer and Architect, Recovering Data Scientist, 20 years in the data game. Joe enjoys helping companies make sense of their culture, processes, and architecture so they can go from dreaming to doing. He’s certified in both AWS and Google Cloud. When not working, you can find Joe at one of the two groups he co-founded—The Utah Data Engineering Meetup and SLC Python. Joe also sits on the board of Utah Python, a non-profit dedicated to advocating Python in Utah. // Other links to check on Joe: https://www.youtube.com/channel/UC3H60XHMp6BrUzR5eUZDyZg https://josephreis.com/ https://www.ternarydata.com/ https://www.linkedin.com/pulse/what-recovering-data-scientist-joe-reis/ https://www.linkedin.com/pulse/should-you-get-tech-certification-depends-joe-reis/ // Final thoughts Please feel free to drop some questions you may have beforehand into our slack channel (https://go.mlops.community/slack) Watch some old meetups on our youtube channel: https://www.youtube.com/channel/UCG6qpjVnBTTT8wLGBygANOQ ----------- Connect With Us ✌️------------- Join our Slack community: https://go.mlops.community/slack Follow us on Twitter: @mlopscommunity Sign up for the next meetup: https://go.mlops.community/register Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/ Connect with Joe on LinkedIn: https://www.linkedin.com/in/josephreis/ Timestamps: [00:23] How did you get into tech? What brought you on to the journey into data? [04:50] You got into the auto ML and you decided to branch out and do your own thing? How did that happen? [08:18] What is it with BI and then making that jump to ML? [11:00] How have you seen Machine Learning fall flat with trying to shoehorn Machine Learning on top of the already weak foundation of BI? [13:45] Let's imagine we're doing BI fairly well and now we want to jump to Machine Learning. Do we have to go out and reinvent the whole stack or can we shoehorn it on? [15:36] How do you move from BI to ML? [18:24] What do you mean by realtime? [20:35] Managed Services in DevOps [23:30] The maturity isn't there yet [26:03] Where would you draw the line between BI and AI? [30:45] What are the things is Machine Learning an overkill for? [33:43] Are you thinking about what data sets to collect and how different do those vary? [35:18] "Software Engineering and Data Engineering are basically going to merge into one." [38:27] What do you usually recommend moving from BI to AI? [40:45] What is "strong data foundation" in your eyes? [42:47] "MLFlow to gateway drug." What's your take on it? [46:25] In this pandemic, how easy is it for you to pivot to a new provider? [49:10] Vision of companies starts coming together on different parts of the stack in the Machine Learning tools.

AI Show  - Channel 9
OSS Framework Support in Azure Machine Learning Service

AI Show - Channel 9

Play Episode Listen Later Sep 8, 2020 18:31


Learn how Azure ML supports open source ML frameworks and MLflow in AzureML. We'll walk through a scikit-learn and PyTorch example to show the built-in support for ML frameworks. We'll also go over how you can take these examples and easily track your artifacts with MLflow. Learn More: Azure ML Examples, Azure ML Curated Environments, Track and Monitor ML Flow, Create a Free Account (Azure), Deep Learning vs. Machine Learning, Get Started with Machine Learning. Don't miss new episodes, subscribe to the AI Show.
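
The episode demos Microsoft's own examples; as a rough orientation, this is the usual shape of pointing MLflow at an Azure ML workspace. It assumes the azureml-mlflow package is installed and a workspace config.json is present; the experiment and metric names are hypothetical.

```python
# Route MLflow tracking calls to an Azure ML workspace.
import mlflow
from azureml.core import Workspace

ws = Workspace.from_config()  # reads config.json downloaded from the Azure portal
mlflow.set_tracking_uri(ws.get_mlflow_tracking_uri())
mlflow.set_experiment("ai-show-demo")

with mlflow.start_run():
    mlflow.log_param("model_type", "scikit-learn")
    mlflow.log_metric("accuracy", 0.9)
```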

DataCast
Episode 41: Effective Data Science with Eugene Yan

DataCast

Play Episode Listen Later Sep 3, 2020 94:55


Show Notes: (2:19) Eugene got his Bachelor’s degree in Psychology and Organizational Behavior from Singapore Management University, in which he did a senior thesis titled “Competition Improves Performance.” (3:29) Eugene’s first role out of school was an Investment Analyst position at Singapore’s Ministry of Trade & Industry. (4:18) Eugene then moved to a Data Analyst role at IBM, working on projects such as supply-chain dashboards, social media analytics, and anti-money laundering detection. (5:55) Eugene transitioned to an internal Data Scientist role at IBM, working on job forecasting and job recommendations. (9:03) Eugene shared the story of how he became a Data Scientist at Lazada Group, which was a small e-commerce startup back in 2015. (12:08) Eugene explained his decision to go back to school and pursue an online Master’s degree in Computer Science at Georgia Tech. (19:14) Eugene shared his career milestones, as displayed in his blog post reflecting on his journey from getting a degree in Psychology to leading data science at Lazada. (22:17) Eugene discussed the unique data science challenges while working at uCare.ai - a startup that aims to make healthcare more efficient and reduce costs. (25:29) Eugene revealed three useful tips to deliver great data science talks (read his blog post “How to Give a Kick-Ass Data Science Talk” for the details). (28:29) Eugene talked about his transition to become an Applied Scientist at Amazon - working on Amazon Kindle. (30:43) Eugene unpacked his post “Commando, Soldier, Police, and Your Career Choices” that provides an interesting metaphor to help guide career decisions. (33:43) Eugene went meta onto his writing process (read here) and note-taking strategy (read here). (39:01) Eugene shared the lessons learned from taking on responsibilities in hiring, mentoring, and stakeholder engagement in his second year at Lazada (read his blog post on the first 100 days as a Data Science Lead). (44:20) Eugene went in-depth into the engineering and cultural challenges throughout Alibaba Group’s acquisition of Lazada Group. (47:51) Eugene explained Alibaba’s playbook for the technical integration of their acquisitions and the super-apps phenomenon in Asia (check out a summary of his talk on Asia’s Tech Giants). (53:52) Eugene unpacked the values and essential aspects of Lazada’s data science team culture, as detailed in his post “Building a Strong Data Science Team Culture.” (57:44) Eugene summarized his thoughts on the topic of data science and agile/scrum development (read his 3-part blog series: Part 1, Part 2, and Part 3). (01:03:18) Eugene was heavily involved with the development of product ranking, product recommendations, and product classification models in his first year at Lazada (check out slides to his talk “How Lazada Ranks Products”). (01:09:08) Eugene helped mentor and empower teams on multiple machine learning systems while acting as VP of Data Science at Lazada (check out slides to his talk “Data Science Challenges at Lazada”). (01:12:07) Eugene shared the case study of how uCare.ai developed a machine learning system for Southeast Asia’s largest healthcare group that estimates a patient’s total bill at the point of pre-admission. (01:14:06) Eugene summarized his 2-part series that exposes the challenges after model deployment and yields a practical guide to maintaining models in production. (01:19:04) Eugene discussed his early-career Product Classification project that uses a public Amazon dataset and builds two APIs for image classification & image search. (01:22:29) Eugene discussed his 2-part series that implements seven different models on the same Amazon dataset, from matrix factorization to graphs and NLP. (01:24:42) Closing segment. His Contact Info: Website, Twitter, LinkedIn, GitHub. His Recommended Resources: Niklas Luhmann (well-known German sociologist), Roam Research (note-taking application), MLflow (a platform for ML lifecycle management), Amazon Product Review Dataset (big data in JSON format), Andrej Karpathy (read “The Unreasonable Effectiveness of RNNs” and “A Recipe for Training Neural Networks”), Jeremy Howard (read the “Universal Language Model Fine-tuning for Text Classification” paper), Hamel Husain (check out GitHub Actions and fastpages), “Introduction to Statistical Learning” (by Trevor Hastie and Rob Tibshirani), “The Pragmatic Programmer” (by Andy Hunt and Dave Thomas), applied-ml repository, ml-survey repository.

Data Science et al.

An open-source platform for the end-to-end machine learning lifecycle.

Work In Progress
#7 Talking about trying out mlflow

Work In Progress

Play Episode Listen Later Jun 27, 2020 17:34


We talked about trying out mlflow!

MLOps.community
MLOps Meetup #17 // The Challenges of ML Operations & How Hermione Helps Along the Way // Neylson Crepalde

MLOps.community

Play Episode Listen Later Jun 11, 2020 61:29


MLOps.community meetup #17: a deep dive into the open-source ML framework Hermione, built on top of MLflow, with Neylson Crepalde. Key takeaways for attendees: MLOps problems are dealt with through tools but also through processes, and the open-source framework Hermione can help in many parts of the operations process. Abstract: In Neylson's experience with Machine Learning projects, he has encountered a series of challenges regarding agile processes to build and deploy ML models in a professional, cooperative environment that fosters teamwork. While on this journey, Neylson and his team developed some of their own solutions for these challenges. Out of this, the open-source project Hermione was born: a collection of solutions for these specific MLOps problems, packaged into a library and ML project structure framework. In this meetup we talk about these challenges, what they did to overcome them, and how Hermione helped address these different issues along the way. We also do a demo on how to build an ML project with Hermione. Check out Hermione here: https://github.com/a3data/hermione Neylson Crepalde is a partner and MLOps Tech Lead at A3Data. He holds a PhD in Economic Sociology, a masters in Sociology of Culture, an MBA in Cultural Management, and a bachelor degree in Music/Conducting. He is professor of Machine Learning and Head of the Data Science Department at Izabela Hendrix Methodist Technological University. His main research interests are Machine Learning processes, Politics and Deliberation, Economic Sociology, and Sociology of Education. In his PhD he worked with Multilevel Social Network Analysis and Exponential Random Graph Models to understand the social construction of quality in an orchestras’ market. Join our slack community: https://join.slack.com/t/mlops-community/shared_invite/zt-391hcpnl-aSwNf_X5RyYSh40MiRe9Lw Follow us on twitter: @mlopscommunity Sign up for the next meetup: https://zoom.us/webinar/register/WN_a_nuYR1xT86TGIB2wp9B1g Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/ Connect with Neylson on LinkedIn: https://www.linkedin.com/in/neylsoncrepalde/

MLOps.community
MLOps #14 Kubeflow vs MLflow with Byron Allen

MLOps.community

Play Episode Listen Later May 28, 2020 55:27


The amazing Byron Allen talks to us about why MLflow and Kubeflow are not playing the same game! MLflow vs Kubeflow is more like comparing apples to oranges, or, as he likes to put it, they are both cheese, but one is an all-rounder and the other a high-class delicacy. This can be quite deceiving when analyzing the two. We do a deep dive into the functionalities of both and the pros/cons they have to offer. Byron is a Senior Consultant at Servian - a data consultancy in Australia that also has a footprint across APAC as well as the UK. Byron is based in the London office where he helps organizations discover and build competitive advantage through their data. His focus is on client advisory and consulting delivery related to Experiments and ProductionML (i.e. data science, experimental design, ML model development, MLOps). Byron has written about a wide range of topics including the divide between data engineer and scientist, the role of ML in the post-covid world, and Kubeflow vs. MLflow. Check it all out here: https://medium.com/@byron.allen Join our slack community: https://join.slack.com/t/mlops-community/shared_invite/zt-391hcpnl-aSwNf_X5RyYSh40MiRe9Lw Follow us on twitter: @mlopscommunity Sign up for the next meetup: https://zoom.us/webinar/register/WN_a_nuYR1xT86TGIB2wp9B1g Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/ Connect with Byron on LinkedIn: https://www.linkedin.com/in/byronaallen/

Data Science et al.

What is MLflow?

regonn&curry.fm
065 MLflow

regonn&curry.fm

Play Episode Listen Later Feb 21, 2020 30:58


Topics discussed: ScrapBox. This podcast shares information on data science with a focus on Kaggle. This time we talk about MLflow, Kaggle's new UI, and this week in Kaggle.

Roaring Elephant
Episode 147 – Alex Zeltov on MLOps with mlflow, kubeflow and other tools (part 2)

Roaring Elephant

Play Episode Listen Later Jul 2, 2019 44:37


In this episode, Global Black Belt and Technical Architect in the Big Data and Advanced Analytics Team at Microsoft, Alex Zeltov, is our guest, and he explains the ins and outs of MLOps through various tools like MLflow and Kubeflow. In this second part, we go into more depth on the practical consequences of implementing MLOps and the various tools that are available. We also go on a bit of a tangent discussing why traditional enterprises are still having a hard time looking at machine learning models as something that requires and benefits from things like model management, version control, and periodic updating of models. For more from Alex on MLOps and MLflow, check out his presentation at the Washington DC DataWorks Summit a couple of weeks ago. The slides are now available on SlideShare and the video is available on YouTube: https://www.youtube.com/watch?v=Ns82mJjJgto MLOps: Just like DataOps follows on from DevOps, one may say that MLOps continues after DataOps. While there is a Wikipedia page on the subject, there is not that much "prior art" available just yet. The main advantages that MLOps can deliver, according to Alex, are a much improved move to production of trained algorithms, even allowing for CI/CD, and a more structured approach to training models where multiple data scientists can work together to achieve better results. mlflow: One of the main tools emerging at the moment is the Databricks-backed MLflow project. Though not an Apache project, it has been open sourced under the Apache License and shows much promise. In the episode, Alex explains how MLflow integrates with your data science notebooks to allow for reliable model management with minimal disruption. Kubeflow: The second contender for the MLOps crown is Kubeflow. Even though it is less mature than MLflow, it is backed by the very popular Kubernetes framework, which brings a large community together to work on this project. Please use the Contact Form on this blog or our twitter feed to send us your questions, or to suggest future episode topics you would like us to cover.
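
To make the notebook integration Alex describes concrete, here is a hedged sketch of pointing MLflow at a shared tracking server so every run's parameters, metrics, and artifacts land in one place. It assumes a server started elsewhere with `mlflow server`; the URI, experiment name, and values are hypothetical.

```python
# Track a run against a shared MLflow tracking server from any notebook or script.
import mlflow
from pathlib import Path

mlflow.set_tracking_uri("http://localhost:5000")  # assumes `mlflow server` is running here
mlflow.set_experiment("churn-model")

with mlflow.start_run():
    mlflow.log_param("n_estimators", 200)
    mlflow.log_metric("auc", 0.87)
    Path("model.pkl").write_bytes(b"placeholder")  # stand-in for a serialized model
    mlflow.log_artifact("model.pkl")               # any file can be versioned with the run
```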

Roaring Elephant
Episode 145 – Alex Zeltov on MLOps with mlflow, kubeflow and other tools (part 1)

Roaring Elephant

Play Episode Listen Later Jun 18, 2019 45:39


In this episode, Global Black Belt and Technical Architect in the Big Data and Advanced Analytics Team at Microsoft, Alex Zeltov, is our guest, and he explains the ins and outs of MLOps through various tools like MLflow and Kubeflow. In this first episode, Alex talks on a more theoretical level about MLOps and the benefits it can deliver. For more from Alex on MLOps and MLflow, check out his presentation at the Washington DC DataWorks Summit a couple of weeks ago. The slides are now available on SlideShare and the video is available on YouTube: https://www.youtube.com/watch?v=Ns82mJjJgto MLOps: Just like DataOps follows on from DevOps, one may say that MLOps continues after DataOps. While there is a Wikipedia page on the subject, there is not that much "prior art" available just yet. The main advantages that MLOps can deliver, according to Alex, are a much improved move to production of trained algorithms, even allowing for CI/CD, and a more structured approach to training models where multiple data scientists can work together to achieve better results. mlflow: One of the main tools emerging at the moment is the Databricks-backed MLflow project. Though not an Apache project, it has been open sourced under the Apache License and shows much promise. In the episode, Alex explains how MLflow integrates with your data science notebooks to allow for reliable model management with minimal disruption. Kubeflow: The second contender for the MLOps crown is Kubeflow. Even though it is less mature than MLflow, it is backed by the very popular Kubernetes framework, which brings a large community together to work on this project. Please use the Contact Form on this blog or our twitter feed to send us your questions, or to suggest future episode topics you would like us to cover.

Big Data Beard
Spark of The Future with DataBricks

Big Data Beard

Play Episode Listen Later May 14, 2019


Spark is one of the most popular platforms for Data Science and AI development today, and the team at Databricks is driving massive innovation in this community, both open-source and commercially. Cory sat down with Reynold Xin, Co-Founder and Chief Architect, and Michael Armbrust, Principal Architect, at Databricks to understand the history of Spark development and uncover the massive developments in the platform, available now and in the near future, making Spark more accessible and easier for Data Scientists and Engineers alike. This podcast was recorded at Spark+AI Summit in San Francisco, California in April 2019. Links: DataBricks; Reynold on LinkedIn & Twitter; Michael on LinkedIn & Twitter; DeltaLake and Announcement; Apple Presentation on DeltaLake; MLFlow; Videos from Spark+AI Summit 2019; Reynold Keynote Recording on Koalas; Michael Keynote Recording on Delta. Did you know we are on Spotify? Tune in to The Big Data Beard! Don't forget that we are going to Domino Data Labs' upcoming REV2 Conference in NYC May 23-24. Visit the REV Conference Website and use promo code BDB_REV100 to save $100 on your pass! Music from this episode is by Andrew Belle. Please go check him out... you'll thank us!

Scatter Podcast
Episode 005 - Databricks w/ Rafi Kurlansik

Scatter Podcast

Play Episode Listen Later Apr 8, 2019 45:48


Scatter Podcast interview with Rafi Kurlansik, Solutions Architect at Databricks. Corporations are using Databricks for cloud-based processing of massive data sets, and Rafi shares a unique perspective given his work as a data scientist within a tech sales team. Rafi's LinkedIn: https://www.linkedin.com/in/raphaelkurlansik/ Other links: Databricks: https://databricks.com/ Databricks Community Edition: https://community.cloud.databricks.com/ MLflow on Databricks: https://mlflow.org/

Data Science In Production
Episode 4: MLFlow with Matei Zaharia

Data Science In Production

Play Episode Listen Later Mar 7, 2019 31:48


I caught up with the creator of Apache Spark and Databricks founder Matei Zaharia at this year's Big Data London. We discussed the release of MLflow, Databricks, and Project DAWN.

AWS re:Invent 2018
AIM396: ML Best Practices: Prepare Data, Build Models, and Manage Lifecycle

AWS re:Invent 2018

Play Episode Listen Later Nov 30, 2018 49:05


In this session, we cover best practices for enterprises that want to use powerful open-source technologies to simplify and scale their machine learning (ML) efforts. Learn how to use Apache Spark, the data processing and analytics engine commonly used at enterprises today, for data preparation as it unifies data at massive scale across various sources. We train models using TensorFlow, and we use MLflow to track experiment runs between multiple users within a reproducible environment. We then manage the deployment of models to production. We show you how MLflow can be used with any existing ML library and incrementally incorporated into an existing ML development process. This session is brought to you by AWS partner, Databricks.
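
As a rough companion to the session's workflow, here is a condensed, hedged sketch: Spark prepares the data, TensorFlow trains the model, and MLflow records the run. The tiny inline dataset stands in for "massive scale," and mlflow.tensorflow.log_model reflects the current MLflow API, which may differ from what was shown on stage in 2018.

```python
# Spark for data prep, TensorFlow for training, MLflow for experiment tracking.
import mlflow
import mlflow.tensorflow
import tensorflow as tf
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame([(0.1, 0.2, 0), (0.9, 0.8, 1)] * 50, ["f1", "f2", "label"])
pdf = df.toPandas()  # small enough here to collect to the driver for training

model = tf.keras.Sequential([
    tf.keras.layers.Dense(8, activation="relu", input_shape=(2,)),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

with mlflow.start_run():
    history = model.fit(pdf[["f1", "f2"]].values, pdf["label"].values, epochs=3, verbose=0)
    mlflow.log_metric("train_accuracy", float(history.history["accuracy"][-1]))
    mlflow.tensorflow.log_model(model, "model")
```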