Our 186th episode with a summary and discussion of last week's big AI news! With host Andrey Kurenkov and guest host Jon Krohn from the SuperDataScience Podcast. Check out Jon's upcoming agent-focused event here - AI Catalyst: Agentic Artificial Intelligence. Read our text newsletter and comment on the podcast at https://lastweekin.ai/. If you would like to become a sponsor for the newsletter, podcast, or both, please fill out this form. Email us your questions and feedback at contact@lastweekinai.com and/or hello@gladstone.ai Timestamps + Links: (00:00:00) Intro / Banter (00:04:14) News Preview (00:05:28) Response to listener comments / corrections Tools & Apps (00:07:10) Adobe's AI video model is here, and it's already inside Premiere Pro (00:11:52) Adobe teases AI tools that build 3D scenes, animate text, and make distractions disappear (00:15:43) Adobe's Project Super Sonic uses AI to generate sound effects for your videos (00:17:05) YouTube expands AI audio generation tool to all U.S. creators (00:20:29) All Gemini users can now generate images with Imagen 3 (00:22:27) Meta AI will launch in six more countries today, including the UK (00:24:27) OpenAI Unveils Secret Meta Prompt—And It's Very Different From Anthropic's Approach Applications & Business (00:27:46) Tesla's big 'We, Robot' event criticized for 'parlor tricks' and vague timelines for robots, Cybercab, Robovan (00:37:25) OpenAI announces content deal with Hearst, including content from Cosmopolitan, Esquire and the San Francisco Chronicle Projects & Open Source (00:47:59) OpenR: An Open-Source AI Framework Enhancing Reasoning in Large Language Models (00:49:54) MLE-bench: Evaluating Machine Learning Agents on Machine Learning Engineering (00:56:29) OpenAI Releases Swarm: An Experimental AI Framework for Building, Orchestrating, and Deploying Multi-Agent Systems Research & Advancements (00:59:23) Nobel Physics Prize Awarded for Pioneering A.I. Research by 2 Scientists (01:05:22) Nobel Prize in Chemistry Goes to 3 Scientists for Predicting and Creating Proteins (01:09:09) LLMs can't perform “genuine logical reasoning,” Apple researchers suggest (01:13:05) GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models Policy & Safety (01:14:34) Anthropic CEO goes full techno-optimist in 15,000-word paean to AI (01:23:04) Google will help build seven nuclear reactors to power its AI systems (01:24:11) LLMs Know More Than They Show: On the Intrinsic Representation of LLM Hallucinations Synthetic Media & Art (01:26:26) Adobe Pushes Content Authenticity Forward With a Free Web App Designed for Creators (01:29:13) Outro
Our 184th episode with a summary and discussion of last week's big AI news! With host Andrey Kurenkov and guest host Jon Krohn. Read our text newsletter and comment on the podcast at https://lastweekin.ai/. If you would like to become a sponsor for the newsletter, podcast, or both, please fill out this form. Email us your questions and feedback at contact@lastweekinai.com and/or hello@gladstone.ai In this episode: OpenAI, Meta, and Google are enhancing their AI assistants with advanced voice modes, while Meta released Llama 3.2, an open-source model capable of processing both images and text. Significant AI infrastructure developments include Groq's partnership with Aramco for a massive data center in Saudi Arabia, and Microsoft's plan to power data centers using a reopened Three Mile Island nuclear plant. Recent research shows chain-of-thought prompting is most effective for math and symbolic reasoning, while OpenAI's GPT-4 with vision capabilities is being integrated into Perplexity AI's search platform. AI is being rapidly integrated into various sectors, with examples including ChartWatch reducing unexpected hospital deaths, Snapchat and YouTube introducing AI video generation tools, and Lionsgate partnering with Runway for AI-assisted film production.
Timestamps + Links: (00:00:00) Intro / Banter (00:04:45) Response to listener comments / corrections Tools & Apps (00:07:46) OpenAI rolls out Advanced Voice Mode with more voices and a new look (00:13:32) Meta's AI can now talk to you in the voices of Awkwafina, John Cena, and Judi Dench (00:17:11) Gemini's chatty voice mode is out now for free on Android (00:21:30) AI video rivalry intensifies as Luma announces Dream Machine API hours after Runway (00:23:35) Copilot Wave 2 supercharges productivity with AI across all your Microsoft 365 apps (00:25:56) Perplexity introduces new 'Reasoning' focus powered by OpenAI's o1 Applications & Business (00:33:47) OpenAI Execs Mass Quit as Company Removes Control From Non-Profit Board and Hands It to Sam Altman (00:41:46) Sam Altman departs OpenAI's safety committee (00:43:04) Chip Startup Groq Backs Saudi AI Ambitions With Aramco Deal (00:46:29) Grok's image generator, Black Forest Labs, is raising $100M at a $1B valuation, say sources (00:48:05) Pudu unveils super semi-humanoid robot with 8-hour battery, 10kg lift power (00:50:56) Amazon introduces Amelia, an AI assistant for third-party sellers Projects & Open Source (00:52:58) Meta Releases Llama 3.2—and Gives Its AI a Voice (00:56:52) Alibaba Unveils Ovis 1.6 – A New Multimodal Language Model Research & Advancements (01:00:35) To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning (01:06:14) LLMs Still Can't Plan; Can LRMs? A Preliminary Evaluation of OpenAI's o1 on PlanBench (01:10:15) Norwegian Startup 1X Unveils AI World Model for Robot Training (01:12:44) AI tool cuts unexpected deaths in hospital by 26%, Canadian study finds Policy & Safety (01:15:47) Three Mile Island nuclear plant will reopen to power Microsoft data centers (01:18:23) Governor Newsom signs bills to combat deepfake election content (01:20:12) Governor Newsom signs bills to protect digital likeness of performers (01:22:20) Startup behind “world's first robot lawyer” to pay $193K for false ads, FTC says Synthetic Media & Art (01:24:49) Snap is introducing an AI video generation tool for creators (01:25:57) YouTube Shorts to integrate Veo, Google's AI video model (01:26:56) Lionsgate Signs Deal With AI Company Runway, Hopes That AI Can Eliminate Storyboard Artists and VFX Crews (01:28:01) Outro
How to grab investor interest with your AI startup idea, revisiting algorithms, and helping practitioners ensure AI safety with regulatory frameworks and beyond: This month, you missed a whole bunch of great interviews. But don't worry, Jon Krohn is here to recap all the best bits for you! Additional materials: www.superdatascience.com/802 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
Merged LLMs are the future, and we're exploring how with Mark McQuade and Charles Goddard from Arcee AI on this episode with Jon Krohn. Learn how to combine multiple LLMs without adding bulk, train more efficiently, and dive into different expert approaches. Discover how smaller models can outperform larger ones and leverage open-source projects for big enterprise wins. This episode is packed with must-know insights for data scientists and ML engineers. Don't miss out! Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: • Explanation of Charles' job title: Chief of Frontier Research [03:31] • Model Merging Technology combining multiple LLMs without increasing size [04:43] • Using MergeKit for model merging [14:49] • Evolutionary Model Merging using evolutionary algorithms [22:55] • Commercial applications and success stories [28:10] • Comparison of Mixture of Experts (MoE) vs. Mixture of Agents [37:57] • Spectrum Project for efficient training by targeting specific modules [54:28] • Future of Small Language Models (SLMs) and their advantages [01:01:22] Additional materials: www.superdatascience.com/801
No-code games with GenAI, the creative possibilities of LLMs, and our proximity to AGI: In this episode, Jon Krohn talks to Andrey Kurenkov about what turned him from an AGI skeptic to a positivist. You'll also hear about his wildly popular podcast “Last Week in AI” and how the NVIDIA-backed startup Astrocade is helping videogame enthusiasts to create their own games through generative AI. A must-listen! This episode is brought to you by AWS Inferentia (https://go.aws/3zWS0au) and AWS Trainium (https://go.aws/3ycV6K0). Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: • All about The Gradient and Last Week in AI [10:42] • All about Astrocade and Andrey's role at the startup [24:35] • Balancing UX and creative control at Astrocade [42:00] • The creative possibilities of LLMs [1:04:15] • The rapid emergence of AGI [1:10:31] Additional materials: www.superdatascience.com/799
Claude 3.5 Sonnet, Anthropic's newest model, is making waves in the AI community. This mid-size model outshines the larger Claude 3 Opus in tasks like code generation, content creation, and document summarization, and it's twice as fast. In this episode of The Super Data Science Podcast, Jon Krohn discusses its top-notch performance across benchmarks like MMLU, GPQA, and HumanEval, along with its improved machine vision capabilities. Plus, learn about the new Artifacts UI feature, which makes managing generated content easier by displaying outputs side-by-side with inputs. Tune in to find out why Claude 3.5 Sonnet is setting new standards in AI. Additional materials: www.superdatascience.com/798 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
Want to feel optimistic about your day? In this Friday episode, Simon Kuestenmacher talks to Jon Krohn about demography: What it is, why it's so important, and why its forecasts should give us reason to hope for a better future. In an increasingly globalized world, and with an aging population in countries with the biggest GDPs, demography is more valuable than ever. Additional materials: www.superdatascience.com/796 Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information.
Gina Guillaume-Joseph talks to Jon Krohn about the data and regulatory frameworks set to transform the AI industry and why that's important to anyone working with data. This episode offers a solid path to understanding AI regulation's past, present and future. Gina walks listeners through the AI Bill of Rights, the NIST AI Risk Management Framework and the MITRE ATLAS threat model. This episode is brought to you by AWS Inferentia (go.aws/3zWS0au) and AWS Trainium (go.aws/3ycV6K0), by Crawlbase (crawlbase.com), the ultimate data crawling platform, and by Babbel (https://www.babbel.com/superdata), the science-backed language-learning platform. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: • What “responsible AI” means [08:14] • Why the federal government should be behind AI regulation [12:22] • The US vs EU on AI regulation [18:46] • About the AI Bill of Rights [26:14] • About MITRE and MITRE ATLAS [37:19] • What a systems engineer does [54:11] Additional materials: www.superdatascience.com/795
Data Bytes listeners get an exclusive discount to join Women in Data. View discount here. (00:00:00) - Creating a course with AI (00:00:35) - Welcome Dr. Jon Krohn (00:01:58) - AI's impact on teaching (00:02:33) - Updating Jupyter notebooks (00:03:10) - Using Claude 3 Opus (00:04:13) - Free content availability (00:06:35) - O'Reilly and free access (00:07:37) - Early content creation (00:10:01) - Impact on global readers (00:12:14) - Interactive SQL course (00:12:48) - Evaluating AI models (00:13:53) - Comparing GPT-4, Gemini, Claude 3 (00:15:17) - Data privacy with LLMs (00:21:24) - AI in team workflows (00:27:07) - LLMs in data science (00:29:49) - Automating data labeling (00:32:28) - Creating synthetic data (00:35:02) - Exciting AI advancements (00:35:49) - Expanding audience reach (00:38:20) - Positive future show concept (00:40:17) - Optimistic view of the future (00:43:26) - Popularity of the show (00:46:01) - Closing thoughts and thanks --- Support this podcast: https://podcasters.spotify.com/pod/show/women-in-data/support
Bayesian methods take the spotlight in this episode with Alex Andorra, co-founder of PyMC Labs, and Jon Krohn. Learn how Bayesian techniques handle tough problems, make the most of prior knowledge, and work wonders with limited data. Alex and Jon break down essentials like PyMC, PyStan, and NumPyro libraries, show how to boost model efficiency with PyTensor, and talk about using ArviZ for top-notch diagnostics and visualizations. Plus, get into advanced modeling with Gaussian Processes. This episode is brought to you by Crawlbase (https://crawlbase.com), the ultimate data crawling platform. Interested in sponsoring a SuperDataScience Podcast episode? Email natalie@superdatascience.com for sponsorship information. In this episode you will learn: • Practical introduction to Bayesian statistics [04:54] • Definition and significance of epistemology [17:52] • Explanation of PyMC and Monte Carlo methods [27:57] • How to get started with Bayesian modeling and PyMC [34:26] • PyMC Labs and its consulting services [50:50] • ArviZ for post-modeling diagnostics and visualization [01:02:23] • Gaussian processes and their applications [01:09:02] Additional materials: www.superdatascience.com/793
Jon Krohn shares his favorite clips from May. Hear how Navdeep Martin is spearheading a company to tackle the climate crisis, why Sol Rashidi and Demetrios Brinkmann find nailing job titles so necessary in the fast-paced industries of tech and AI, and get the latest on embeddings with Luis Serrano. Additional materials: www.superdatascience.com/792 Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.
Reinforcement learning from human feedback (RLHF) has come a long way. In this episode, research scientist Nathan Lambert talks to Jon Krohn about the technique's origins. He also walks through other ways to fine-tune LLMs, and how he believes generative AI might democratize education. This episode is brought to you by AWS Inferentia (https://go.aws/3zWS0au) and AWS Trainium (https://go.aws/3ycV6K0), and by Crawlbase (https://crawlbase.com), the ultimate data crawling platform. Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information. In this episode you will learn: • Why it is important that AI is open [03:13] • The efficacy and scalability of direct preference optimization [07:32] • Robotics and LLMs [14:32] • The challenges to aligning reward models with human preferences [23:00] • How to make sure AI's decision making on preferences reflects desirable behavior [28:52] • Why Nathan believes AI is closer to alchemy than science [37:38] Additional materials: www.superdatascience.com/791
Machine Learning for Wind Energy is front and center in this episode as Jon Krohn is joined by Dr. Jason Yosinski, CEO of Windscape AI. Dr. Yosinski brings to light the latest ML advancements sparking significant changes in renewable energy. Tune in for a comprehensive review of these cutting-edge technologies and their expansive impact on the industry and the environment's well-being. This episode is brought to you by Crawlbase (https://crawlbase.com), the ultimate data crawling platform. Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information. In this episode you will learn: • Enhancing predictability in wind energy with ML [04:52] • Data utilization from wind turbines by energy providers [11:41] • Jason's journey into wind energy [17:55] • Landing the right startup idea [22:47] • Visualizing neural networks with the Deep Vis Toolbox [31:29] • Extreme event forecasting at Uber vs. nowcasting at Windscape AI [45:13] • Discoveries from Loss Change Allocation research [47:48] • Engaging with Jason's ML Collective [59:46] • Traits of successful AI entrepreneurs [1:10:26] Additional materials: www.superdatascience.com/789
Multi-agent systems could mark a significant turning point in generative AI. From mastering increasingly complex tasks to getting LLMs to collaborate, in this Five-Minute Friday, Jon Krohn discusses the systems that are working to bridge the remaining gaps left by the latest large language models (LLMs). Additional materials: www.superdatascience.com/788 Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.
MLOps, how to build an online community, and tools for scaling LLMs: In this episode, Demetrios Brinkmann speaks to Jon Krohn about the similarities and differences between LLMOps, MLOps and DevOps, and why this should matter to companies looking to hire such engineers. You will also hear how to get involved in the MLOps community wherever you are in the world, and how you can start developing great products with the available tools. This episode is brought to you by AWS Inferentia (https://go.aws/3zWS0au) and AWS Trainium (https://go.aws/3ycV6K0). Interested in sponsoring a SuperDataScience Podcast episode? Visit https://passionfroot.me/superdatascience for sponsorship information. In this episode you will learn: • What MLOps is [03:51] • About LLMOps [12:06] • About LlamaIndex and Ollama [18:29] • Insights from Demetrios' MLOps survey [20:49] • Guidance for using third-party APIs [40:18] • Recommendations for building an online community in tech and AI [47:07] Additional materials: www.superdatascience.com/787
Learn about the six keys to data science success as host Jon Krohn welcomes back Kirill Eremenko, the mastermind behind SuperDataScience. Kirill shares his top insights on data science careers, from building strong portfolios to leveraging mentors and hands-on labs. With over 2.7 million students, his advice is a must-hear for aspiring and experienced data scientists alike. Additional materials: www.superdatascience.com/786 Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.
Aligning LLMs: How can we teach pre-trained LLMs to hold a conversation and learn new information from each other? This was where Sinan Ozdemir began his investigation into aligning LLMs. In this episode, he talks to Jon Krohn about the limitations of definitions for LLMs, training LLMs, and whether it is possible to train an LLM without alignment. Additional materials: www.superdatascience.com/784 Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.
Sol Rashidi, a distinguished data executive who has served in C-suite roles at Fortune 100 companies, joins Jon Krohn to delve into successful enterprise AI strategies and the reasons behind the high turnover among Chief Data Officers. This episode provides an in-depth look at selecting AI projects that succeed and understanding the strategic value of patents in various industries. Benefit from Sol's extensive experience and practical advice on navigating complex corporate challenges. This episode is brought to you by AWS Inferentia (https://go.aws/3zWS0au) and AWS Trainium (https://go.aws/3ycV6K0). Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information. In this episode you will learn: • Why CDOs and related roles have such high turnover [09:40] • The importance of building relationships in AI projects [17:01] • How Sol's book "The AI Survival Guide" came about [20:44] • How high-criticality, low-complexity AI projects are the ones with the highest probability of success [27:11] • How enterprise data security issues can be resolved with technologies like Protopia's stained-glass data-masking solution [36:10] • Why having great data engineers is essential [47:57] • The value of patents [51:45] Additional materials: www.superdatascience.com/781
Tidyverse, ggplot2, and the secret to a tech company's longevity: Hadley Wickham talks to Jon Krohn about Posit's rebrand, Tidyverse and why it needs to be in every data scientist's toolkit, and why getting your hands dirty with open-source projects can be so lucrative for your career. This episode is brought to you by Intel and HPE Ezmeral Software (https://bit.ly/hpeintel). Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information. In this episode you will learn: • All about the Tidyverse [04:46] • Hadley's favorite R libraries [17:10] • The goal of Posit [30:29] • On bringing multiple programming languages together [36:02] • The principles for a long-lasting tech company [52:10] • How Hadley developed ggplot2 [55:24] • How to contribute to the open-source community [1:05:43] Additional materials: www.superdatascience.com/779
Mixtral 8x22B is the focus of this week's Five-Minute Friday. Jon Krohn examines how this model from French AI startup Mistral leverages its mixture-of-experts architecture to redefine efficiency and specialization in AI-powered tasks. Tune in to learn about its performance benchmarks and the transformative potential of its open-source license. Additional materials: www.superdatascience.com/778 Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.
Generative AI is reshaping our world, and Bernard Marr, world-renowned futurist and best-selling author, joins Jon Krohn to guide us through this transformation. In this episode, Bernard shares his insights on how AI is transforming industries, revolutionizing daily life, and addressing global challenges. With his extensive experience advising top organizations worldwide, he also examines the ethical considerations of AI deployment. This episode is brought to you by Intel and HPE Ezmeral Software (https://bit.ly/hpeintel). Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information. In this episode you will learn: • How Generative AI will transform industries [03:55] • The evolution of Generative AI [10:19] • How Generative AI will impact daily life [16:52] • The ethical challenges of AI [18:55] • How corporations can harness Generative AI for collaboration [24:36] • Industries that will be impacted by Generative AI [32:20] • How Sora-like Generative AI systems will create highly immersive entertainment [42:16] • How Generative AI could unlock 99% of business data [53:34] Additional materials: www.superdatascience.com/777
What are the risks of AI progressing beyond a point of no return? What do we stand to gain? On this Five-Minute Friday, Jon Krohn talks ‘books' as he outlines two nonfiction works on AI and futurism by Oxford philosopher Nick Bostrom. Listen to a breakdown of DEEP UTOPIA and SUPERINTELLIGENCE in this episode. Additional materials: www.superdatascience.com/776 Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.
Tech entrepreneurship, artificial superintelligence, and the future of education: Aleksa Gordić speaks to Jon Krohn about his strategies for self-directed learning, the traits that help people succeed in moving from big tech to entrepreneurship, and the social impact of artificial superintelligence. This episode is brought to you by Ready Tensor, where innovation meets reproducibility (https://www.readytensor.ai/). Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information. In this episode you will learn: • How to motivate yourself to become a tech entrepreneur [17:02] • Aleksa's checklist for the perfect CTO [35:00] • Potential sustainable solutions for LLMs [41:51] • The next major developments in AI and tech [48:29] • How hobbies have a knock-on effect for a person's career [1:01:53] • How and why formal education needs to change [1:09:24] Additional materials: www.superdatascience.com/775
Covariant's RFM-1: Jon Krohn explores the future of AI-driven robotics with RFM-1, a groundbreaking robot arm designed by Covariant and discussed by A.I. roboticist Pieter Abbeel. Explore how this innovation aims to merge digital intelligence with the physical world, promising a new era of efficiency and autonomy. Additional materials: www.superdatascience.com/774 Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.
PyTorch benefits, how to get funding for your AI startup, and managing scientific silos: In our new series for SuperDataScience, “In Case You Missed It”, host Jon Krohn engages in some “reinforcement learning through human feedback” of his own with need-to-hear sound bites from past SDS episodes! Additional materials: www.superdatascience.com/772 Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.
Kirill Eremenko joins Jon Krohn for another exclusive, in-depth teaser for a new course just released on the SuperDataScience platform, “Machine Learning Level 2”. Kirill walks listeners through why decision trees and random forests are fruitful for businesses, and he offers hands-on walkthroughs for the three leading gradient-boosting algorithms today: XGBoost, LightGBM, and CatBoost. This episode is brought to you by Ready Tensor, where innovation meets reproducibility (https://www.readytensor.ai/), and by Data Universe, the out-of-this-world data conference (https://datauniverse2024.com). Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information. In this episode you will learn: • All about decision trees [09:28] • All about ensemble models [22:03] • All about AdaBoost [38:46] • All about gradient boosting [46:51] • Gradient boosting for classification problems [1:01:26] • All about LightGBM and CatBoost [1:05:39] Additional materials: www.superdatascience.com/771
Today we're talking to Jon Krohn, Co-Founder & Chief Data Scientist at Nebula and host of the Super Data Science podcast. Jon discusses his project on AI-driven craft beer brewing and shares insights on the development and implications of Artificial General Intelligence. He reflects on his experiences as a data scientist and entrepreneur, highlighting AI's impact across various sectors. The conversation offers a grounded look at how AI technologies are influencing both professional fields and personal passions. All of this right here, right now, on the Modern CTO Podcast! To learn more about Nebula, check out their website here. To listen to the Super Data Science podcast, check out their website here or wherever you get your podcasts. Have feedback about the show? Let us know here. Produced by ProSeries Media.
Generative AI in medicine takes center stage as Prof. Zachary Lipton, Chief Scientific Officer at Abridge, joins host Jon Krohn to discuss the significant advancements in AI that are reshaping healthcare. This episode is brought to you by the DataConnect Conference (https://www.dataconnectconf.com/dccwest/conference), and by Data Universe, the out-of-this-world data conference (https://datauniverse2024.com). Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information. In this episode you will learn: • The inspiration for Zack to get started in ML and healthcare [03:56] • The hardware required to use Abridge [12:29] • The key data science projects at Abridge right now [35:05] • Abridge's tech stack [59:54] • How Abridge ensures reliability in a high-stakes setting like healthcare [1:07:29] • How Zack's academic research cross-pollinates with his commercial ML projects [1:21:05] • How Zack's jazz background molded his entrepreneurial and data science journey [1:30:32] Additional materials: www.superdatascience.com/769
Claude 3, LLMs and testing ML performance: Jon Krohn tests out Anthropic's new model family, Claude 3, which includes the Haiku, Sonnet and Opus models (written in order of their performance power, from least to greatest). Can it stand shoulder to shoulder with other models such as GPT-4 and Gemini 1.0 Ultra? And how important is it for machine learning practitioners to try out these models with their own benchmarks? Jon walks listeners through a test of his own in this Five-Minute Friday. Additional materials: www.superdatascience.com/768 Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.
Jon Krohn sits down with Sebastian Raschka to discuss his latest book, Machine Learning Q and AI, the open-source libraries developed by Lightning AI, how to exploit the greatest opportunities for LLM development, and what's on the horizon for LLMs. This episode is brought to you by the DataConnect Conference (https://www.dataconnectconf.com/dccwest/conference), and by Data Universe, the out-of-this-world data conference (https://datauniverse2024.com). Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information. In this episode you will learn: • All about Machine Learning Q and AI [04:13] • Sebastian Raschka's role as Staff Research Engineer at Lightning AI [19:21] • PyTorch Lightning's and Lightning Fabric's capabilities [39:32] • Large language models: Opportunities and challenges [43:35] • DoRA vs LoRA [48:56] • How to be a successful AI educator [1:34:18] Additional materials: www.superdatascience.com/767
Sit down with host Brandon Laws and guest Jon Krohn, Co-Founder & Chief Data Scientist of Nebula.io, to delve into the intersection of AI and HR technology. This episode's discussion revolves around the recent surge in AI and its effectiveness in job searching, recruiting, and hiring the right fit. Been holding off on integrating AI tools into your HR strategy? Adopting them might just be the key to streamlining your HR operations and enhancing productivity.
TAKEAWAYS
• Deep learning, inspired by the human brain structure, is being applied in HR contexts to mitigate biases in decision-making.
• The recent surge in the effectiveness of artificial intelligence tools, including deep learning, is attributed to the availability of vast training data and increased computing power.
• Nebula.io offers a dynamic compensation benchmarking service, providing granular insights into salary trends and empowering HR professionals with actionable data.
• HR should be embracing AI tools as they have the potential to enhance productivity, save time and money, and alleviate administrative burdens.
Kurt Vonnegut's "Player Piano" delivers striking parallels between its dystopian vision and today's AI challenges. This week, Jon Krohn explores the novel's depiction of a world where humans are marginalized by machines, reflecting on the impact of automation on society and the ethical considerations it raises. Tune in as we unpack the timeless relevance of Vonnegut's work to the AI era. Additional materials: www.superdatascience.com/766 Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.
Data science futurists, bestselling authors, and lively how-to guides from the industry's top practitioners, which range from applying data science for good to using open-source tools for NLP: This is The Super Data Science Podcast's top ten most listened-to episodes in 2023, hosted by Jon Krohn. A great snapshot of our great content from 2023. Additional materials: www.superdatascience.com/764 Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.
At Glasswing Ventures, Rudina Seseri wants to be able to answer the question: What has Glasswing Ventures done for the company beyond capital investment? She speaks to Jon Krohn about how her company uses data to assess venture capital investments, the secret sauce of successful AI startups, and why she feels generative AI is only the start of a much broader impact that AI will make in communities and businesses. This episode is brought to you by the DataConnect Conference (https://www.dataconnectconf.com/dccwest/conference), and by Ready Tensor, where innovation meets reproducibility (https://www.readytensor.ai/). Interested in sponsoring a SuperDataScience Podcast episode? Visit https://passionfroot.me/superdatascience for sponsorship information. In this episode you will learn: • Potential interest areas for Series A AI venture capitalists [12:22] • How Glasswing's AI Palette helps AI startups [23:06] • How data-driven the venture capital industry is [27:21] • Advice for adopting services from AI providers [47:21] • Model collapse: Causes and concerns [58:44] • Glasswing's checklist for AI startups [1:04:59] Additional materials: www.superdatascience.com/763
Jon Krohn presents an insightful overview of Google's groundbreaking Gemini 1.5 Pro, an LLM with a million-token context window that's transforming the landscape of AI. Discover the innovative aspects of Gemini 1.5 Pro, from its extensive context window to its multimodal functionalities, which are broadening the scope of AI technology and marking a significant leap in data science. Plus, join Jon for a practical demonstration showcasing the real-world applications, capabilities, and limitations of this advanced language model. Additional materials: www.superdatascience.com/762 Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.
Google's Gemini Ultra takes the spotlight this week, as host Jon Krohn welcomes Lisa Cohen, Google's Director of Data Science and Engineering, for a conversation about the launch of Gemini Ultra. Discover the capabilities of this cutting-edge large language model and how it stands toe-to-toe with GPT-4. Lisa shares her insights on the development, rollout, and potential of Gemini Ultra in reshaping various sectors. Whether you're a data science professional, a tech enthusiast, or simply curious about the future of AI, this episode offers a deep dive into one of the most significant advancements in artificial intelligence. This episode is brought to you by Ready Tensor, where innovation meets reproducibility (https://www.readytensor.ai/), and by Intel and HPE Ezmeral Software Solutions (https://hpe.com/ezmeral/chatbots). Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information. In this episode you will learn: • Google's Gemini model family and Lisa's key responsibilities [04:55] • How LLMs will transform the practice of Data Science [19:47] • Lisa on prompt engineering and reinforcement learning from human feedback [24:38] • How to fine-tune Gemini models with Google's Vertex AI [30:52] • How AI assistants will transform life and work for everyone from data scientists to educators to children [47:14] • The challenges of developing a data-centric culture [57:31] • Centralized vs decentralized data science teams [1:03:50] Additional materials: www.superdatascience.com/761
AI-crafted beer, machine learning for passion projects, and self-taught data science: Jon Krohn and Beau Warren's hotly anticipated, data-driven, punny lager Krohn&Borg is finally given a taste test in this week's Five-Minute Friday. Heading to the Species X brewery in Columbus, Ohio, Jon Krohn and Beau Warren launched the beer that had been predicted, optimized and developed by a machine-learning model. Additional materials: www.superdatascience.com/760 Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.
Encoders, cross attention and masking for LLMs: SuperDataScience Founder Kirill Eremenko returns to the SuperDataScience podcast, where he speaks with Jon Krohn about transformer architectures and why they are a new frontier for generative AI. If you're interested in applying LLMs to your business portfolio, you'll want to pay close attention to this episode! This episode is brought to you by Ready Tensor, where innovation meets reproducibility (https://www.readytensor.ai/), by Oracle NetSuite business software (netsuite.com/superdata), and by Intel and HPE Ezmeral Software Solutions (http://hpe.com/ezmeral/chatbots). Interested in sponsoring a SuperDataScience Podcast episode? Visit https://passionfroot.me/superdatascience for sponsorship information. In this episode you will learn: • How decoder-only transformers work [15:51] • How cross-attention works in transformers [41:05] • How encoders and decoders work together (an example) [52:46] • How encoder-only architectures excel at understanding natural language [1:20:34] • The importance of masking during self-attention [1:27:08] Additional materials: www.superdatascience.com/759
AlphaGeometry, intuitive AI, and geometric deduction: In this week's Five-Minute Friday, Super Data Science host Jon Krohn looks into AlphaGeometry from DeepMind, Google's ground-breaking AI lab, and explores how it marks a critical step towards a future of broadly accessible AI solutions across scientific disciplines. Additional materials: www.superdatascience.com/756 Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.
ChatGPT applications and data-driven beer: Beer brewer and regular Super Data Science listener Beau Warren talks to Jon Krohn about the wonders of “sweaty ales”, how to brew beer with data, and how to get started on creative machine learning projects even without a degree in data science. This episode is brought to you by CloudWolf (https://www.cloudwolf.com/sds), the Cloud Skills platform. Interested in sponsoring a SuperDataScience Podcast episode? Visit https://passionfroot.me/superdatascience for sponsorship information. In this episode you will learn: • About Species X [06:31] • How to become a certified beer taster [12:37] • How Beau checks the quality of his beer [25:01] • Beau and Jon's machine learning project [38:02] • About genetic algorithms [52:35] • How to get creativity out of LLMs [1:24:46] Additional materials: www.superdatascience.com/755
Jon Krohn interviews Hilke Schellmann about the ethics of recruitment algorithms, the field's current state of play, and what can be improved about AI used in recruiting. Additional materials: www.superdatascience.com/752 Interested in sponsoring a SuperDataScience Podcast episode? Visit https://passionfroot.me/superdatascience for sponsorship information.
Venture capital and AI, and how to succeed with an AI company in 2024: Rasmus Rothe, Cofounder of Merantix, speaks to Jon Krohn about the Merantix campus in Berlin, how a venture capitalist identifies the best AI startups, the surefire ways for AI company founders to raise venture capital, and the jobs that are most and least vulnerable to disruption by automation. This episode is brought to you by Oracle NetSuite business software (netsuite.com/superdata), by QuickChat customized AI assistants (https://quickchat.ai), and by Prophets of AI (https://prophetsofai.com), the leading agency for AI experts. Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information. In this episode you will learn: • How Merantix started [05:17] • How Merantix works and how to apply for funding [08:19] • How to secure AI funding [21:02] • How AI companies can prove competitiveness [33:46] • Ensuring AI regulation [41:17] • How AI will change the future of work [56:56] Additional materials: www.superdatascience.com/751
Explore the transformative power of AI in science. Jon Krohn reviews the groundbreaking AI-driven discoveries at MIT and beyond, showcasing how AI is reshaping various scientific fields, from pharmaceuticals to climate science, and pondering the balance between AI's capabilities and human ingenuity. Additional materials: www.superdatascience.com/750 Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information.
Data science for clean energy takes center stage as Emily Pastewka from Palmetto joins Jon Krohn this week, exploring innovative paths to a sustainable future. This episode covers the impact of AI on smart energy choices, the creation of a smart grid, and the wide array of professionals required to bring cleantech data solutions to life. This episode is brought to you by Prophets of AI (https://prophetsofai.com), the leading agency for AI experts. Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information. In this episode you will learn: • Emily on her Master's in Deep Learning [08:20] • Using AI to solve clean energy challenges at Palmetto [17:22] • The different roles needed to solve cleantech problems [27:33] • How econometrics impacts consumer decision-making [38:56] • How Emily manages high-performing teams [56:30] • The tools and technologies that drive small teams [1:06:58] Additional materials: www.superdatascience.com/749
Attention and transformers in LLMs, the five stages of data processing, and a brand-new data science course: Kirill Eremenko joins host Jon Krohn to explore what goes into well-crafted LLMs, what makes Transformers so powerful, and how to succeed as a data scientist in this new age of generative AI. This episode is brought to you by Intel and HPE Ezmeral Software Solutions (https://hpe.com/ezmeral/chatwithyourdata), and by Prophets of AI (https://prophetsofai.com), the leading agency for AI experts. Interested in sponsoring a SuperDataScience Podcast episode? Visit https://passionfroot.me/superdatascience for sponsorship information. In this episode you will learn: • Supply and demand in AI recruitment [08:30] • Kirill and Hadelin's new course on LLMs, “Large Language Models (LLMs), Transformers & GPT A-Z” [15:37] • The learning difficulty in understanding LLMs [19:46] • The basics of LLMs [22:00] • The five building blocks of transformer architecture [36:29] - 1: Input embedding [44:10] - 2: Positional encoding [50:46] - 3: Attention mechanism [54:04] - 4: Feedforward neural network [1:16:17] - 5: Linear transformation and softmax [1:19:16] • Inference vs training time [1:29:12] • Why transformers are so powerful [1:49:22] Additional materials: www.superdatascience.com/747
Chatbots, large language models and generative AI: Founder of Quickchat AI Piotr Grudzień believes the key to any successful AI platform is to ensure it can be tailored to a company's specific needs. He speaks to host Jon Krohn about helping clients generate realistic and satisfying conversations that help their customer base find what they need quickly. This episode is brought to you by Gurobi (https://gurobi.com/sds), the Decision Intelligence Leader, and by CloudWolf (https://www.cloudwolf.com/sds), the Cloud Skills platform. Interested in sponsoring a SuperDataScience Podcast episode? Visit http://passionfroot.me/superdatascience for sponsorship information. In this episode you will learn: • About Quickchat AI and how it works [02:46] • How to successfully set up a conversational AI [23:58] • What “temperature” is in the context of AI [38:38] • How the LLM landscape has changed in recent years [40:24] • The future of generative AI [57:43] • The advantages of an AI accelerator [1:09:38] Additional materials: www.superdatascience.com/743
Sam Altman's exit and rehiring, AGI, and OpenAI's Q*: In this week's Five-Minute Friday, Jon Krohn peeks behind the curtains of OpenAI, where development of the world's first model that can solve complex, nonlinear logical problems, Q*, might be well underway. This episode casts light on the rumors behind OpenAI's Q*, what its emergence could mean for the future of AI, and the controversies already surrounding an agent that has not yet reached the market. Additional materials: www.superdatascience.com/740 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
AI protein design, machine learning and cancer care, and pharmaceuticals: At Exazyme, CEO and Co-Founder Ingmar Schuster uses AI to design proteins. He speaks with Jon Krohn about their wider applications in pharmaceuticals and chemistry, how kernel methods make the design of synthetic biological catalysts more efficient, and when to use shallow machine learning over deep learning. This episode is brought to you by Gurobi (https://gurobi.com/sds), the Decision Intelligence Leader. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information. In this episode you will learn: • On designing proteins with AI [03:14] • Designing proteins at Exazyme [08:22] • About kernel methods [18:10] • The importance of human-led approaches in protein research [35:44] • Europe's focus on AI regulation [43:45] • Deep vs shallow in AI [59:35] • How a background in academia helps with entrepreneurship [1:09:17] Additional materials: www.superdatascience.com/739
Bioengineering and Generative AI converge under the visionary leadership of Dr. Pierre Salvy at Cambrium GmbH, propelling material science into uncharted territories. He sits down with Jon Krohn live at Merantix A.I. Campus in Berlin to discuss how he's transforming material design, exemplified by his swift development of NovaColl, a vegan collagen crafted within two years. Additional materials: www.superdatascience.com/738 Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
scikit-learn co-founder Gaël Varoquaux and Jon Krohn are live at the historic Sorbonne in Paris, where they discuss the evolution of scikit-learn. From its origins as a memory-efficient Python implementation of support vector machines to its present-day status as a pivotal resource in machine learning, Gaël paints a vivid picture of its remarkable growth. Join us for a glimpse into scikit-learn's evolution, the realm of open-source collaboration, and the transformative power of data-driven insights in today's dynamic data landscape. This episode is brought to you by Gurobi (gurobi.com/sds), the Decision Intelligence Leader, by Data Universe (https://datauniverse2024.com), the out-of-this-world data conference, and by CloudWolf (www.cloudwolf.com/sds), the Cloud Skills platform. Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information. In this episode you will learn: • The early beginnings and growth of scikit-learn [05:34] • Development principles of scikit-learn [18:05] • How to apply scikit-learn to your ML problem [21:16] • Resource-efficiency and scikit-learn development [25:32] • How to contribute to an open-source project like scikit-learn yourself [38:21] • The future of scikit-learn [51:13] • Gaël on the social-impact data projects in his Soda lab [1:02:33] • Why domain expertise and statistical rigor are more important than ever [1:11:24] Additional materials: www.superdatascience.com/737