SuperDataScience


Kirill Eremenko is a Data Science coach and lifestyle entrepreneur. The goal of the Super Data Science podcast is to bring you the most inspiring Data Scientists and Analysts from around the World to help you build your successful career in Data Science. Data is growing exponentially and so are sala…

Kirill Eremenko: Data Science Coach, Lifestyle Entrepreneur


    • Jul 19, 2024 LATEST EPISODE
    • weekdays NEW EPISODES
    • 41m AVG DURATION
    • 805 EPISODES

    4.6 from 248 ratings Listeners of SuperDataScience that love the show mention: kirill, udemy, data scientists, data analytics, data science podcast, courses, field, complex, high quality, career, great content, thank you so much, industry, understand, guests, happy, learning, interviews, host, every episode.


    Ivy Insights

    The SuperDataScience podcast is a fantastic resource for anyone interested in the field of data science. Hosted by Jon Krohn, this podcast covers a wide range of topics and features expert guests who provide valuable insights and practical tips. The conversations are engaging and informative, making complex concepts accessible to both beginners and experienced professionals.

    One of the best aspects of this podcast is the quality of the guests and discussions. Jon brings on thought leaders and experienced practitioners who dive deep into various topics related to data science, including machine learning, artificial intelligence, and big data analytics. The conversations strike a perfect balance between technical depth and accessible explanations, ensuring that listeners can grasp the information being shared.

    Another great aspect of this podcast is Jon's ability to distill complex ideas into understandable chunks. He has a knack for taking huge concepts and breaking them down in a way that is easy to follow. This makes the podcast incredibly valuable for staying current on industry movements and trends.

    However, one potential downside of this podcast is that it may not be suitable for those completely new to data science. While Jon does a great job of explaining concepts, some episodes assume a certain level of knowledge or familiarity with the subject matter. It would be helpful if there were more beginner-friendly episodes or resources provided for those just starting out in the field.

    In conclusion, the SuperDataScience podcast is an excellent resource for anyone interested in data science. With its exceptional guests, engaging discussions, and accessible explanations, this podcast provides valuable insights and practical tips for professionals at all levels. Whether you're looking to stay current on industry developments or deepen your understanding of specific topics within data science, this podcast offers something for everyone.




    Latest episodes from SuperDataScience

    802: In Case You Missed It in June 2024

    Jul 19, 2024 · 23:55


    How to grab investor interest with your AI startup idea, revisiting algorithms, and helping practitioners ensure AI safety with regulatory frameworks and beyond: This month, you missed a whole bunch of great interviews. But don't worry, Jon Krohn is here to recap all the best bits for you! Additional materials: www.superdatascience.com/802 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.

    801: Merged LLMs Are Smaller And More Capable, with Arcee AI's Mark McQuade and Charles Goddard

    Jul 16, 2024 · 77:05


    Merged LLMs are the future, and we're exploring how with Mark McQuade and Charles Goddard from Arcee AI on this episode with Jon Krohn. Learn how to combine multiple LLMs without adding bulk, train more efficiently, and dive into different expert approaches. Discover how smaller models can outperform larger ones and leverage open-source projects for big enterprise wins. This episode is packed with must-know insights for data scientists and ML engineers. Don't miss out! Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: • Explanation of Charles' job title: Chief of Frontier Research [03:31] • Model Merging Technology combining multiple LLMs without increasing size [04:43] • Using MergeKit for model merging [14:49] • Evolutionary Model Merging using evolutionary algorithms [22:55] • Commercial applications and success stories [28:10] • Comparison of Mixture of Experts (MoE) vs. Mixture of Agents [37:57] • Spectrum Project for efficient training by targeting specific modules [54:28] • Future of Small Language Models (SLMs) and their advantages [01:01:22] Additional materials: www.superdatascience.com/801

    800: A Transformative Century of Technological Progress, with Annie P.

    Jul 12, 2024 · 43:37


    The SuperDataScience Podcast is celebrating its 800th episode! Host Jon Krohn speaks to his grandmother, Annie, about growing up at a time when so many technologies we take for granted today were yet to be developed. Listen in to hear Annie's experience of the changes in technology across 94 years and how she and her family fared in 1940s Ukraine with no electricity or running water. Additional materials: www.superdatascience.com/800 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.

    799: AGI Could Be Near: Dystopian and Utopian Implications, with Dr. Andrey Kurenkov

    Jul 9, 2024 · 105:48


    No-code games with GenAI, the creative possibilities of LLMs, and our proximity to AGI: In this episode, Jon Krohn talks to Andrey Kurenkov about what turned him from an AGI skeptic to a positivist. You'll also hear about his wildly popular podcast “Last Week in AI” and how the NVIDIA-backed startup Astrocade is helping videogame enthusiasts to create their own games through generative AI. A must-listen! This episode is brought to you by AWS Inferentia (https://go.aws/3zWS0au) and AWS Trainium (https://go.aws/3ycV6K0). Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: • All about The Gradient and Last Week in AI [10:42] • All about Astrocade and Andrey's role at the startup [24:35] • Balancing UX and creative control at Astrocade [42:00] • The creative possibilities of LLMs [1:04:15] • The rapid emergence of AGI [1:10:31] Additional materials: www.superdatascience.com/799

    798: Claude 3.5 Sonnet: Frontier Capabilities & Slick New "Artifacts" UI

    Jul 5, 2024 · 15:10


    Claude 3.5 Sonnet, Anthropic's newest model, is making waves in the AI community. This mid-size model outshines the larger Claude 3 Opus in tasks like code generation, content creation, and document summarization, and it's twice as fast. In this episode of The Super Data Science Podcast, Jon Krohn discusses its top-notch performance across benchmarks like MMLU, GPQA, and HumanEval, along with its improved machine vision capabilities. Plus, learn about the new Artifacts UI feature, which makes managing generated content easier by displaying outputs side-by-side with inputs. Tune in to find out why Claude 3.5 Sonnet is setting new standards in AI. Additional materials: www.superdatascience.com/798 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.

    797: Deep Learning Classics and Trends, with Dr. Rosanne Liu

    Jul 2, 2024 · 69:59


    Dr. Rosanne Liu, Research Scientist at Google DeepMind and co-founder of the ML Collective, shares her journey and the mission to democratize AI research. She explains her pioneering work on intrinsic dimensions in deep learning and the advantages of curiosity-driven research. Jon and Dr. Liu also explore the complexities of understanding powerful AI models, the specifics of character-aware text encoding, and the significant impact of diversity, equity, and inclusion in the ML community. With publications in NeurIPS, ICLR, ICML, and Science, Dr. Liu offers her expertise and vision for the future of machine learning. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: • How the ML Collective came about [03:31] • The concept of a failure CV [16:12] • ML Collective research topics [19:03] • How Dr. Liu's work on the “intrinsic dimension” of deep learning models inspired the now-standard LoRA approach to fine-tuning LLMs [21:28] • The pros and cons of curiosity-driven vs. goal-driven ML research [29:08] • Discussion on Dr. Liu's research and papers [33:17] • Character-aware vs. character-blind text encoding [54:59] • The positive impacts of diversity, equity, and inclusion in the ML community [57:51] Additional materials: www.superdatascience.com/797

    796: Earth's Coming Population Collapse and How AI Can Help, with Simon Kuestenmacher

    Jun 28, 2024 · 42:45


    Want to feel optimistic about your day? In this Friday episode, Simon Kuestenmacher talks to Jon Krohn about demography: What it is, why it's so important, and why its forecasts should give us reason to hope for a better future. In an increasingly globalized world, and with an aging population in countries with the biggest GDPs, demography is more valuable than ever. Additional materials: www.superdatascience.com/796 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.

    795: Fast-Evolving Data and AI Regulatory Frameworks, with Dr. Gina Guillaume-Joseph

    Jun 25, 2024 · 66:40


    Gina Guillaume-Joseph talks to Jon Krohn about the data and regulatory frameworks set to transform the AI industry and why that's important to anyone working with data. This episode offers a solid path to understanding AI regulation's past, present and future. Gina walks listeners through the AI Bill of Rights, the NIST AI Risk Management Framework and the MITRE ATLAS threat model. This episode is brought to you by AWS Inferentia (go.aws/3zWS0au) and AWS Trainium (go.aws/3ycV6K0), by Crawlbase (crawlbase.com), the ultimate data crawling platform, and by Babbel (https://www.babbel.com/superdata), the science-backed language-learning platform. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: • What “responsible AI” means [08:14] • Why the federal government should be behind AI regulation [12:22] • The US vs EU on AI regulation [18:46] • About the AI Bill of Rights [26:14] • About MITRE and the MITRE ATLAS [37:19] • What a systems engineer does [54:11] Additional materials: www.superdatascience.com/795

    794: Exciting (and Frightening!) Trends in Open-Source AI

    Jun 21, 2024 · 11:02


    Trends in open-source AI: Join Jon Krohn and a panel of data science icons as they discuss the most exciting and concerning developments in open-source AI. Hear insights from Drew Conway, Jared Lander, Emily Zabor, and JD Long on the transformative potential of AI and its future impact. Additional materials: www.superdatascience.com/794 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.

    793: Bayesian Methods and Applications, with Alexandre Andorra

    Jun 18, 2024 · 93:20


    Bayesian methods take the spotlight in this episode with Alex Andorra, co-founder of PyMC Labs, and Jon Krohn. Learn how Bayesian techniques handle tough problems, make the most of prior knowledge, and work wonders with limited data. Alex and Jon break down essentials like PyMC, PyStan, and NumPyro libraries, show how to boost model efficiency with PyTensor, and talk about using ArviZ for top-notch diagnostics and visualizations. Plus, get into advanced modeling with Gaussian Processes. This episode is brought to you by Crawlbase (https://crawlbase.com), the ultimate data crawling platform. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: • Practical introduction to Bayesian statistics [04:54] • Definition and significance of epistemology [17:52] • Explanation of PyMC and Monte Carlo methods [27:57] • How to get started with Bayesian modeling and PyMC [34:26] • PyMC Labs and its consulting services [50:50] • ArviZ for post-modeling diagnostics and visualization [01:02:23] • Gaussian processes and their applications [01:09:02] Additional materials: www.superdatascience.com/793

    792: In Case You Missed It in May 2024

    Jun 14, 2024 · 22:42


    Jon Krohn shares his favorite clips from May. Hear how Navdeep Martin is spearheading a company to tackle the climate crisis, why Sol Rashidi and Demetrios Brinkmann find nailing job titles so necessary in the fast-paced industries of tech and AI, and get the latest on embeddings with Luis Serrano. Additional materials: www.superdatascience.com/792 Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.

    791: Reinforcement Learning from Human Feedback (RLHF), with Dr. Nathan Lambert

    Jun 11, 2024 · 57:10


    Reinforcement learning from human feedback (RLHF) has come a long way. In this episode, research scientist Nathan Lambert talks to Jon Krohn about the technique's origins. He also walks through other ways to fine-tune LLMs, and how he believes generative AI might democratize education. This episode is brought to you by AWS Inferentia (https://go.aws/3zWS0au) and AWS Trainium (https://go.aws/3ycV6K0), and Crawlbase (https://crawlbase.com), the ultimate data crawling platform. Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information. In this episode you will learn: • Why it is important that AI is open [03:13] • The efficacy and scalability of direct preference optimization [07:32] • Robotics and LLMs [14:32] • The challenges to aligning reward models with human preferences [23:00] • How to make sure AI's decision making on preferences reflects desirable behavior [28:52] • Why Nathan believes AI is closer to alchemy than science [37:38] Additional materials: www.superdatascience.com/791

    790: Open-Source Libraries for Data Science at the New York R Conference

    Jun 7, 2024 · 7:25


    The experts reveal their top open-source R libraries with us live from the New York R Conference! This Super Data Science Podcast episode features an exclusive panel with data science trailblazers Drew Conway, Jared Lander, Emily Zabor, and JD Long. They share their favorite R libraries and valuable insights to enhance your data science practice. Additional materials: www.superdatascience.com/790 Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.

    789: ML for Wind-Powered Energy Generation, with Dr. Jason Yosinski

    Jun 4, 2024 · 74:49


    Machine Learning for Wind Energy is front and center in this episode as Jon Krohn is joined by Dr. Jason Yosinski, CEO of Windscape AI. Dr. Yosinski brings to light the latest ML advancements sparking significant changes in renewable energy. Tune in for a comprehensive review of these cutting-edge technologies and their expansive impact on the industry and the environment's well-being. This episode is brought to you by Crawlbase (https://crawlbase.com), the ultimate data crawling platform. Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information. In this episode you will learn: • Enhancing predictability in wind energy with ML [04:52] • Data utilization from wind turbines by energy providers [11:41] • Jason's journey into wind energy [17:55] • Landing the right startup idea [22:47] • Visualizing neural networks with the Deep Vis Toolbox [31:29] • Extreme event forecasting at Uber vs. nowcasting at Windscape AI [45:13] • Discoveries from Loss Change Allocation research [47:48] • Engaging with Jason's ML Collective [59:46] • Traits of successful AI entrepreneurs [1:10:26] Additional materials: www.superdatascience.com/789

    788: Multi-Agent Systems: How Teams of LLMs Excel at Complex Tasks

    May 31, 2024 · 10:07


    Multi-agent systems could mark a significant turning point in generative AI. From mastering increasingly complex tasks to getting LLMs to collaborate, in this Five-Minute Friday, Jon Krohn discusses the systems that are working to bridge the remaining gaps left by the latest large language models (LLMs). Additional materials: www.superdatascience.com/788 Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.

    787: MLOps: The Job and The Key Tools, with Demetrios Brinkmann

    May 28, 2024 · 56:18


    MLOps, how to build an online community, and tools for scaling LLMs: In this episode, Demetrios Brinkmann speaks to Jon Krohn about the similarities and differences between LLMOps, MLOps and DevOps, and why this should matter to companies looking to hire such engineers. You will also hear how to get involved in the MLOps community wherever you are in the world, and how you can start developing great products with the available tools. This episode is brought to you by AWS Inferentia (https://go.aws/3zWS0au) and AWS Trainium (https://go.aws/3ycV6K0). Interested in sponsoring a SuperDataScience Podcast episode? Visit https://passionfroot.me/superdatascience for sponsorship information. In this episode you will learn: • What MLOps is [03:51] • About LLMOps [12:06] • About LlamaIndex and Ollama [18:29] • Insights from Demetrios' MLOps survey [20:49] • Guidance for using third-party APIs [40:18] • Recommendations for building an online community in tech and AI [47:07] Additional materials: www.superdatascience.com/787

    786: The Six Keys to Data Scientists' Success, with Kirill Eremenko

    May 24, 2024 · 27:21


    Learn about the six keys to data science success as host Jon Krohn welcomes back Kirill Eremenko, the mastermind behind SuperDataScience. Kirill shares his top insights on data science careers, from building strong portfolios to leveraging mentors and hands-on labs. With over 2.7 million students, his advice is a must-hear for aspiring and experienced data scientists alike. Additional materials: www.superdatascience.com/786 Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.

    785: Math, Quantum ML and Language Embeddings, with Dr. Luis Serrano

    May 21, 2024 · 66:06


    Dr. Luis Serrano from the Serrano Academy reveals how to make Math and Quantum ML accessible, tackles the challenges of teaching A.I. to beginners, and explores the power of embeddings in enterprise applications. Explore the future of Quantum Machine Learning and the latest trends in AI, including multimodality and autonomous systems. This episode is brought to you by AWS Inferentia (https://go.aws/3zWS0au) and AWS Trainium (https://go.aws/3ycV6K0). Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information. In this episode you will learn: • How math and AI can be made easy to understand [05:21] • The three major categories of learners [16:21] • Why embeddings are the most important component of LLMs [26:19] • How semantic search differs from a traditional keyword search [29:57] • The most exciting emerging application areas for AI [42:41] • The promising application areas for Quantum Machine Learning [49:18] Additional materials: www.superdatascience.com/785

    784: Aligning Large Language Models, with Sinan Ozdemir

    May 17, 2024 · 9:38


    Aligning LLMs: How can we teach pre-trained LLMs to hold a conversation and learn new information from each other? This was where Sinan Ozdemir began his investigation into aligning LLMs. In this episode, he talks to Jon Krohn about the limitations of definitions for LLMs, training LLMs, and whether it is possible to train an LLM without alignment. Additional materials: www.superdatascience.com/784 Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.

    783: Generative A.I. for Solar Power Installation, with Navdeep Martin

    May 14, 2024 · 65:51


    Recent advances in GenAI, how to tackle the climate crisis with advanced technology, and addressing the knowledge gap in understanding AI: Jon Krohn speaks to Flypower co-founder and CEO Navdeep Martin about the advances made in GenAI, from products to applications, and how we might use AI to tackle climate change. This episode is brought to you by AWS Inferentia (https://go.aws/3zWS0au) and AWS Trainium (https://go.aws/3ycV6K0). Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information. In this episode you will learn: • How the Washington Post's recommendation systems work [03:29] • Why product leaders make great CEOs [10:36] • How Flypower uses GenAI to tackle climate change [22:13] • How Flypower identifies its customers' most pertinent questions [30:03] • How AI might come to tackle climate change [36:52] • How to mitigate hallucination in AI models [41:04] Additional materials: www.superdatascience.com/783

    782: In Case You Missed It in April 2024

    May 10, 2024 · 40:53


    Hear Jon Krohn's favorite five clips from his April interviews. Chief Scientist at Posit PBC Hadley Wickham explains the subtle differences between Python and R. Professor of Business Analytics Barrett Thomas walks through the variables that companies should consider when using drones or any other tech to improve their business operations and bottom line. Aleksa Gordić, Founder of Runa AI, believes an overhaul of the current educational system is long overdue. Bernard Marr discusses the future of GenAI and its impact on the world of work. And SuperDataScience founder Kirill Eremenko gives a lively workshop on gradient boosting. Additional materials: www.superdatascience.com/782 Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.

    781: Ensuring Successful Enterprise AI Deployments, with Sol Rashidi

    May 7, 2024 · 64:48


    Sol Rashidi, a distinguished data executive who has served in C-suite roles at Fortune 100 companies, joins Jon Krohn to delve into successful enterprise AI strategies and the reasons behind the high turnover among Chief Data Officers. This episode provides an in-depth look at selecting AI projects that succeed and understanding the strategic value of patents in various industries. Benefit from Sol's extensive experience and practical advice on navigating complex corporate challenges. This episode is brought to you by AWS Inferentia (https://go.aws/3zWS0au) and AWS Trainium (https://go.aws/3ycV6K0). Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information. In this episode you will learn: • Why CDOs and related roles have such high turnover [09:40] • The importance of building relationships in AI projects [17:01] • How Sol's book "The AI Survival Guide" came about [20:44] • How high-criticality, low-complexity AI projects are the ones with the highest probability of success [27:11] • How enterprise data security issues can be resolved with technologies like Protopia's stained-glass data-masking solution [36:10] • Why having great data engineers is essential [47:57] • The value of patents [51:45] Additional materials: www.superdatascience.com/781

    780: How to Become a Data Scientist, with Dr. Adam Ross Nelson

    May 3, 2024 · 8:23


    Want to become a data scientist? Jon and Adam discuss the key steps to becoming a data scientist, with a focus on developing portfolio projects. Hear about the 10 project ideas Adam recommends in his book to help you stand out in the data science community. Additional materials: www.superdatascience.com/780 Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.

    779: The Tidyverse of Essential R Libraries and their Python Analogues, with Dr. Hadley Wickham

    Apr 30, 2024 · 87:59


    Tidyverse, ggplot2, and the secret to a tech company's longevity: Hadley Wickham talks to Jon Krohn about Posit's rebrand, Tidyverse and why it needs to be in every data scientist's toolkit, and why getting your hands dirty with open-source projects can be so lucrative for your career. This episode is brought to you by Intel and HPE Ezmeral Software (https://bit.ly/hpeintel). Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information. In this episode you will learn: • All about the Tidyverse [04:46] • Hadley's favorite R libraries [17:10] • The goal of Posit [30:29] • On bringing multiple programming languages together [36:02] • The principles for a long-lasting tech company [52:10] • How Hadley developed ggplot2 [55:24] • How to contribute to the open-source community [1:05:43] Additional materials: www.superdatascience.com/779

    778: Mixtral 8x22B: SOTA Open-Source LLM Capabilities at a Fraction of the Compute

    Apr 26, 2024 · 6:52


    Mixtral 8x22B is the focus of this week's Five-Minute Friday. Jon Krohn examines how this model from French AI startup Mistral leverages its mixture-of-experts architecture to redefine efficiency and specialization in AI-powered tasks. Tune in to learn about its performance benchmarks and the transformative potential of its open-source license. Additional materials: www.superdatascience.com/778 Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.

    777: Generative AI in Practice, with Bernard Marr

    Apr 23, 2024 · 68:49


    Generative AI is reshaping our world, and Bernard Marr, world-renowned futurist and best-selling author, joins Jon Krohn to guide us through this transformation. In this episode, Bernard shares his insights on how AI is transforming industries, revolutionizing daily life, and addressing global challenges. With his extensive experience advising top organizations worldwide, he also examines the ethical considerations of AI deployment. This episode is brought to you by Intel and HPE Ezmeral Software (https://bit.ly/hpeintel). Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information. In this episode you will learn: • How Generative AI will transform industries [03:55] • The evolution of Generative AI [10:19] • How Generative AI will impact daily life [16:52] • The ethical challenges of AI [18:55] • How corporations can harness Generative AI for collaboration [24:36] • Industries that will be impacted by Generative AI [32:20] • How Sora-like Generative AI systems will create highly immersive entertainment [42:16] • How Generative AI could unlock 99% of business data [53:34] Additional materials: www.superdatascience.com/777

    776: Deep Utopia: AI Could Solve All Human Problems in Our Lifetime

    Apr 19, 2024 · 7:36


    What are the risks of AI progressing beyond a point of no return? What do we stand to gain? On this Five-Minute Friday, Jon Krohn talks books as he outlines two nonfiction works on AI and futurism by Oxford philosopher Nick Bostrom. Listen to a breakdown of DEEP UTOPIA and SUPERINTELLIGENCE in this episode. Additional materials: www.superdatascience.com/776 Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.

    775: What will humans do when machines are vastly more intelligent? With Aleksa Gordić

    Apr 16, 2024 · 96:41


    Tech entrepreneurship, artificial superintelligence, and the future of education: Aleksa Gordić speaks to Jon Krohn about his strategies for self-directed learning, the traits that help people succeed in moving from big tech to entrepreneurship, and the social impact of artificial superintelligence. This episode is brought to you by Ready Tensor, where innovation meets reproducibility (https://www.readytensor.ai/). Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information. In this episode you will learn: • How to motivate yourself to become a tech entrepreneur [17:02] • Aleksa's checklist for the perfect CTO [35:00] • Potential sustainable solutions for LLMs [41:51] • The next major developments in AI and tech [48:29] • How hobbies have a knock-on effect for a person's career [1:01:53] • How and why formal education needs to change [1:09:24] Additional materials: www.superdatascience.com/775

    774: RFM-1 Gives Robots Human-like Reasoning and Conversation Abilities

    Apr 12, 2024 · 12:52


    Covariant's RFM-1: Jon Krohn explores the future of AI-driven robotics with RFM-1, a robotics foundation model developed by Covariant and discussed by A.I. roboticist Pieter Abbeel. Explore how this innovation aims to merge digital intelligence with the physical world, promising a new era of efficiency and autonomy. Additional materials: www.superdatascience.com/774 Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.

    773: Deep Reinforcement Learning for Maximizing Profits, with Prof. Barrett Thomas

    Apr 9, 2024 · 67:40


    Dr. Barrett Thomas, an award-winning Research Professor at the University of Iowa, explores the intricacies of Markov decision processes and their connection to Deep Reinforcement Learning. Discover how these concepts are applied in operations research to enhance business efficiency and drive innovations in same-day delivery and autonomous transportation systems. This episode is brought to you by Ready Tensor, where innovation meets reproducibility (https://www.readytensor.ai/). Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information. In this episode you will learn: • Barrett's start in operations logistics [02:27] • Concorde Solver and the traveling salesperson problem [09:59] • Cross-function approximation explained [19:13] • How Markov decision processes relate to deep reinforcement learning [26:08] • Understanding policy in decision-making contexts [33:40] • Revolutionizing supply chains and transportation with aerial drones [46:47] • Barrett's career evolution: past changes and future prospects [52:19] Additional materials: www.superdatascience.com/773

    772: In Case You Missed It in March 2024

    Play Episode Listen Later Apr 5, 2024 24:00


    PyTorch benefits, how to get funding for your AI startup, and managing scientific silos: In our new series for SuperDataScience, “In Case You Missed It”, host Jon Krohn engages in some “reinforcement learning through human feedback” of his own with need-to-hear sound bites from past SDS episodes! Additional materials: www.superdatascience.com/772 Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.

    771: Gradient Boosting: XGBoost, LightGBM and CatBoost, with Kirill Eremenko

    Play Episode Listen Later Apr 2, 2024 119:00


    Kirill Eremenko joins Jon Krohn for another exclusive, in-depth teaser for a new course just released on the SuperDataScience platform, “Machine Learning Level 2”. Kirill walks listeners through why decision trees and random forests are fruitful for businesses, and he offers hands-on walkthroughs for the three leading gradient-boosting algorithms today: XGBoost, LightGBM, and CatBoost. This episode is brought to you by Ready Tensor, where innovation meets reproducibility (https://www.readytensor.ai/), and by Data Universe, the out-of-this-world data conference (https://datauniverse2024.com). Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information. In this episode you will learn: • All about decision trees [09:28] • All about ensemble models [22:03] • All about AdaBoost [38:46] • All about gradient boosting [46:51] • Gradient boosting for classification problems [1:01:26] • All about LightGBM and CatBoost [1:05:39] Additional materials: www.superdatascience.com/771

    770: The Neuroscientific Guide to Confidence

    Play Episode Listen Later Mar 29, 2024 45:22


    Explore the science of confidence with Lucy Antrobus, as she unveils neuroscience-backed strategies to build and boost confidence through practice, positive energy, and the power of laughter. An essential listen for fostering unshakable self-assurance. Additional materials: www.superdatascience.com/770 Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.

    769: Generative AI for Medicine, with Prof. Zack Lipton

    Play Episode Listen Later Mar 26, 2024 109:12


    Generative AI in medicine takes center stage as Prof. Zachary Lipton, Chief Scientific Officer at Abridge, joins host Jon Krohn to discuss the significant advancements in AI that are reshaping healthcare. This episode is brought to you by the DataConnect Conference (https://www.dataconnectconf.com/dccwest/conference), and by Data Universe, the out-of-this-world data conference (https://datauniverse2024.com). Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information. In this episode you will learn: • The inspiration for Zack to get started in ML and healthcare [03:56] • The hardware required to use Abridge [12:29] • The key data science projects at Abridge right now [35:05] • Abridge's tech stack [59:54] • How Abridge ensures reliability in a high-stakes setting like healthcare [1:07:29] • How Zack's academic research cross-pollinates with his commercial ML projects [1:21:05] • How Zack's jazz background molded his entrepreneur and data science journey [1:30:32] Additional materials: www.superdatascience.com/769

    768: Is Claude 3 Better than GPT-4?

    Play Episode Listen Later Mar 22, 2024 12:55


    Claude 3, LLMs and testing ML performance: Jon Krohn tests out Anthropic's new model family, Claude 3, which includes the Haiku, Sonnet and Opus models (written in order of their performance power, from least to greatest). Can it stand shoulder to shoulder with other models such as GPT-4 and Gemini 1.0 Ultra? And how important is it for machine learning practitioners to try out these models with their own benchmarks? Jon walks listeners through a test of his own in this Five-Minute Friday. Additional materials: www.superdatascience.com/768 Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.

    767: Open-Source LLM Libraries and Techniques, with Dr. Sebastian Raschka

    Play Episode Listen Later Mar 19, 2024 108:12


    Jon Krohn sits down with Sebastian Raschka to discuss his latest book, Machine Learning Q and AI, the open-source libraries developed by Lightning AI, how to exploit the greatest opportunities for LLM development, and what's on the horizon for LLMs. This episode is brought to you by the DataConnect Conference (https://www.dataconnectconf.com/dccwest/conference), and by Data Universe, the out-of-this-world data conference (https://datauniverse2024.com). Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information. In this episode you will learn: • All about Machine Learning Q and AI [04:13] • Sebastian Raschka's role as Staff Research Engineer at Lightning AI [19:21] • PyTorch Lightning's and Lightning Fabric's capabilities [39:32] • Large language models: Opportunities and challenges [43:35] • DoRA vs LoRA [48:56] • How to be a successful AI educator [1:34:18] Additional materials: www.superdatascience.com/767

    766: Vonnegut's Player Piano (1952): An Eerie Novel on the Current AI Revolution

    Play Episode Listen Later Mar 15, 2024 8:13


    Kurt Vonnegut's "Player Piano" delivers striking parallels between its dystopian vision and today's AI challenges. This week, Jon Krohn explores the novel's depiction of a world where humans are marginalized by machines, reflecting on the impact of automation on society and the ethical considerations it raises. Tune in as we unpack the timeless relevance of Vonnegut's work to the AI era. Additional materials: www.superdatascience.com/766 Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.

    765: NumPy, SciPy and the Economics of Open-Source, with Dr. Travis Oliphant

    Play Episode Listen Later Mar 12, 2024 97:29


    Explore the origins of NumPy and SciPy with their creator, Dr. Travis Oliphant. Discover the journey from personal need to global impact, the challenges overcome, and the future of these essential Python libraries in scientific computing and data science. This episode is brought to you by the DataConnect Conference (https://www.dataconnectconf.com/dccwest/conference), by Data Universe, the out-of-this-world data conference (https://datauniverse2024.com), and by CloudWolf (https://www.cloudwolf.com/sds), the Cloud Skills platform. Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information. In this episode you will learn: • Travis's journey to creating NumPy and SciPy [08:05] • How Anaconda got started [42:24] • How Numba, a high-performance Python compiler, was brought to market [54:48] • Python's influence on the thought processes of scientists and engineers [1:04:21] • The commercial projects that support Travis's vast open-source efforts and communities [1:10:22] • How to get involved in Travis's commercial projects and communities [1:22:34] • The future of scientific computing and Python libraries [1:29:50] Additional materials: www.superdatascience.com/765

    764: The Top 10 Episodes of 2023

    Play Episode Listen Later Mar 8, 2024 8:04


    Data science futurists, bestselling authors, and lively how-to guides from the industry's top practitioners, which range from applying data science for good to using open-source tools for NLP: This is The Super Data Science Podcast's top ten most listened-to episodes in 2023, hosted by Jon Krohn. A great snapshot of our content from 2023. Additional materials: www.superdatascience.com/764 Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.

    763: The Best A.I. Startup Opportunities, with venture capitalist Rudina Seseri

    Play Episode Listen Later Mar 5, 2024 87:14


    At Glasswing Ventures, Rudina Seseri wants to be able to answer the question: What has Glasswing Ventures done for the company beyond capital investment? She speaks to Jon Krohn about how her company uses data to assess venture capital investments, the secret sauce of successful AI startups, and why she feels generative AI is only the start of a much broader impact that AI will make in communities and businesses. This episode is brought to you by the DataConnect Conference (https://www.dataconnectconf.com/dccwest/conference), and by Ready Tensor, where innovation meets reproducibility (https://www.readytensor.ai/). Interested in sponsoring a SuperDataScience Podcast episode? Visit https://passionfroot.me/superdatascience for sponsorship information. In this episode you will learn: • Potential interest areas for Series A AI venture capitalists [12:22] • How Glasswing's AI Palette helps AI startups [23:06] • How data-driven the venture capital industry is [27:21] • Advice for adopting services from AI providers [47:21] • Model collapse: Causes and concerns [58:44] • Glasswing's checklist for AI startups [1:04:59] Additional materials: www.superdatascience.com/763

    762: Gemini 1.5 Pro, the Million-Token-Context LLM

    Play Episode Listen Later Mar 1, 2024 16:58


    Jon Krohn presents an insightful overview of Google's groundbreaking Gemini 1.5 Pro, a million-token-context LLM that's transforming the landscape of AI. Discover the innovative aspects of Gemini 1.5 Pro, from its extensive context window to its multimodal functionalities, which are broadening the scope of AI technology and marking a significant leap in data science. Plus, join Jon for a practical demonstration, showcasing the real-world applications, capabilities, and limitations of this advanced language model. Additional materials: www.superdatascience.com/762 Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.

    761: Gemini Ultra: How to Release an A.I. Product for Billions of Users, with Google's Lisa Cohen

    Play Episode Listen Later Feb 27, 2024 70:15


    Google's Gemini Ultra takes the spotlight this week, as host Jon Krohn welcomes Lisa Cohen, Google's Director of Data Science and Engineering, for a conversation about the launch of Gemini Ultra. Discover the capabilities of this cutting-edge large language model and how it stands toe-to-toe with GPT-4. Lisa shares her insights on the development, rollout, and potential of Gemini Ultra in reshaping various sectors. Whether you're a data science professional, tech enthusiast, or curious about the future of AI, this episode offers a deep dive into one of the most significant advancements in artificial intelligence. This episode is brought to you by Ready Tensor, where innovation meets reproducibility (https://www.readytensor.ai/), and by Intel and HPE Ezmeral Software Solutions (https://hpe.com/ezmeral/chatbots). Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information. In this episode you will learn: • Google's Gemini model family and Lisa's key responsibilities [04:55] • How LLMs will transform the practice of Data Science [19:47] • Lisa on prompt engineering and reinforcement learning from human feedback [24:38] • How to fine-tune Gemini models with Google's Vertex AI [30:52] • How AI-assistants will transform life and work for everyone from data scientists to educators to children [47:14] • The challenges of developing a data-centric culture [57:31] • Centralized vs decentralized data science teams [1:03:50] Additional materials: www.superdatascience.com/761

    760: Humans Love A.I.-Crafted Beer

    Play Episode Listen Later Feb 23, 2024 6:31


    AI-crafted beer, machine learning for passion projects, and self-taught data science: Jon Krohn and Beau Warren's hotly anticipated, data-driven, punny lager Krohn&Borg is finally given a taste test in this week's Five-Minute Friday. Heading to the Species X brewery in Columbus, Ohio, Jon Krohn and Beau Warren launched the beer that had been predicted, optimized and developed by a machine-learning model. Additional materials: www.superdatascience.com/760 Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.

    759: Full Encoder-Decoder Transformers Fully Explained, with Kirill Eremenko

    Play Episode Listen Later Feb 20, 2024 103:13


    Encoders, cross attention and masking for LLMs: SuperDataScience Founder Kirill Eremenko returns to the SuperDataScience podcast, where he speaks with Jon Krohn about transformer architectures and why they are a new frontier for generative AI. If you're interested in applying LLMs to your business portfolio, you'll want to pay close attention to this episode! This episode is brought to you by Ready Tensor, where innovation meets reproducibility (https://www.readytensor.ai/), by Oracle NetSuite business software (netsuite.com/superdata), and by Intel and HPE Ezmeral Software Solutions (http://hpe.com/ezmeral/chatbots). Interested in sponsoring a SuperDataScience Podcast episode? Visit https://passionfroot.me/superdatascience for sponsorship information. In this episode you will learn: • How decoder-only transformers work [15:51] • How cross-attention works in transformers [41:05] • How encoders and decoders work together (an example) [52:46] • How encoder-only architectures excel at understanding natural language [1:20:34] • The importance of masking during self-attention [1:27:08] Additional materials: www.superdatascience.com/759

    758: The Mamba Architecture: Superior to Transformers in LLMs

    Play Episode Listen Later Feb 16, 2024 8:12


    Explore the groundbreaking Mamba model, a potential game-changer in AI that promises to outpace the traditional Transformer architecture with its efficient, linear-time sequence modeling. Additional materials: www.superdatascience.com/758 Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.

    757: How to Speak so You Blow Listeners' Minds, with Cole Nussbaumer Knaflic

    Play Episode Listen Later Feb 13, 2024 89:03


    Explore mind-blowing storytelling with Cole Nussbaumer Knaflic in this episode. Audience favorite and author of "Storytelling with You," Cole returns to share essential tips for crafting impactful presentations, emphasizing narrative construction and audience engagement. Learn how to effectively communicate data and stories, enhancing your presentations with insights from a leading expert in the field. This episode is brought to you by CloudWolf (https://www.cloudwolf.com/sds), the Cloud Skills platform. Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information. In this episode you will learn: • How to become a confident communicator [11:59] • How to get rid of filler words [26:32] • How facts alone can't make a strong impact [41:44] • Cole's overview of her book Storytelling with You [55:19] • How to craft an effective presentation [1:00:24] • Common mistakes in virtual presentations [1:09:48] • Cole's virtual presentation setup [1:15:33] • Cole's next book Daphne Draws Data [1:20:23] Additional materials: www.superdatascience.com/757

    756: AlphaGeometry: AI is Suddenly as Capable as the Brightest Math Minds

    Play Episode Listen Later Feb 9, 2024 8:45


    AlphaGeometry, intuitive AI, and geometric deduction: In this week's Five-Minute Friday, Super Data Science host Jon Krohn looks into developments from DeepMind, Google's ground-breaking AI lab, and explores how this is a critical step towards a future of broadly accessible AI solutions across scientific disciplines. Additional materials: www.superdatascience.com/756 Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.

    755: Brewing Beer with A.I., with Beau Warren

    Play Episode Listen Later Feb 6, 2024 95:43


    ChatGPT applications and data-driven beer: Beer brewer and Super Data Science regular listener Beau Warren talks to Jon Krohn about the wonders of “sweaty ales”, how to brew beer with data, and how to get started on creative machine learning projects even without a degree in data science. This episode is brought to you by CloudWolf (https://www.cloudwolf.com/sds), the Cloud Skills platform. Interested in sponsoring a SuperDataScience Podcast episode? Visit https://passionfroot.me/superdatascience for sponsorship information. In this episode you will learn: • About Species X [06:31] • How to become a certified beer taster [12:37] • How Beau checks the quality of his beer [25:01] • Beau and Jon's machine learning project [38:02] • About genetic algorithms [52:35] • How to get creativity out of LLMs [1:24:46] Additional materials: www.superdatascience.com/755

    754: A Code-Specialized LLM Will Realize AGI, with Jason Warner

    Play Episode Listen Later Feb 2, 2024 37:01


    Explore the future of coding with poolside co-founder and CEO Jason Warner as he discusses the potential of code-specialized LLMs and their revolutionary impact on the developer's role. Tune in for insights on the shift towards an AI-led development paradigm. Additional materials: www.superdatascience.com/754 Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.

    753: Blend Any Programming Languages in Your ML Workflows, with Dr. Greg Michaelson

    Play Episode Listen Later Jan 30, 2024 86:20


    Explore the future of collaborative ML workflows in this engaging episode with Dr. Greg Michaelson, Co-Founder of Zerve. Dr. Michaelson introduces the groundbreaking Zerve IDE and Pypelines project, addressing the critical gap in AutoML for commercial use and pinpointing why many A.I. projects don't meet their objectives. Gain insights into steering AI initiatives towards success and enhancing project communication, all in this insightful session. This episode is brought to you by Oracle NetSuite business software (https://netsuite.com/superdata), and by Prophets of AI (https://prophetsofai.com), the leading agency for AI experts. Interested in sponsoring a SuperDataScience Podcast episode? Visit https://passionfroot.me/superdatascience for sponsorship information. In this episode you will learn: • Why Zerve IDE is so sorely needed [04:50] • Pypelines: AutoML open-source in Python [30:00] • Why most commercial A.I. projects fail and how to ensure they succeed [47:45] • How AutoML will impact the role of the data scientist [53:21] • Greg's background as a pastor and working at DataRobot [1:03:40] • How to develop impressive communication and storytelling skills [1:16:16] Additional materials: www.superdatascience.com/753

    752: AI is Disadvantaging Job Applicants, But You Can Fight Back

    Play Episode Listen Later Jan 26, 2024 50:56


    Jon Krohn interviews Hilke Schellmann about the ethics of recruitment algorithms, the field's current state of play, and what can be improved about AI used in recruiting. Additional materials: www.superdatascience.com/752 Interested in sponsoring a SuperDataScience Podcast episode? Visit https://passionfroot.me/superdatascience for sponsorship information.
