Interviews with people who research, build, or use AI: academics, engineers, artists, entrepreneurs, and more.
Episode 142

Happy holidays! This is one of my favorite episodes of the year — for the third time, Nathan Benaich and I did our yearly roundup of all the AI news and advancements you need to know. This includes selections from this year's State of AI Report, some early takes on o3, and a few minutes LARPing as China Guys…

If you've stuck around and continue to listen, I'm really thankful you're here. I love hearing from you.

You can find Nathan and Air Street Press here on Substack and on Twitter, LinkedIn, and his personal site. Check out his writing at press.airstreet.com.

Find me on Twitter (or LinkedIn if you want…) for updates on new episodes, and reach me at editor@thegradient.pub for feedback, ideas, guest suggestions.

Outline:
* (00:00) Intro
* (01:00) o3 and model capabilities + "reasoning" capabilities
* (05:30) Economics of frontier models
* (09:24) Air Street's year and industry shifts: product-market fit in AI, major developments in science/biology, "vibe shifts" in defense and robotics
* (16:00) Investment strategies in generative AI, how to evaluate and invest in AI companies
* (19:00) Future of BioML and scientific progress: on AlphaFold 3, evaluation challenges, and the need for cross-disciplinary collaboration
* (32:00) The "AGI" question and technology diffusion: Nathan's take on "AGI" and timelines, technology adoption, the gap between capabilities and real-world impact
* (39:00) Differential economic impacts from AI, tech diffusion
* (43:00) Market dynamics and competition
* (50:00) DeepSeek and global AI innovation
* (59:50) A robotics renaissance? Robotics coming back into focus + advances in vision-language models and real-world applications
* (1:05:00) Compute Infrastructure: NVIDIA's dominance, GPU availability, the competitive landscape in AI compute
* (1:12:00) Industry consolidation: partnerships, acquisitions, regulatory concerns in AI
* (1:27:00) Global AI politics and regulation: international AI governance and varying approaches
* (1:35:00) The regulatory landscape
* (1:43:00) 2025 predictions
* (1:48:00) Closing

Links and Resources

From Air Street Press:
* The State of AI Report
* The State of Chinese AI
* Open-endedness is all we'll need
* There is no scaling wall: in discussion with Eiso Kant (Poolside)
* Alchemy doesn't scale: the economics of general intelligence
* Chips all the way down
* The AI energy wars will get worse before they get better

Other highlights/resources:
* Deepseek: The Quiet Giant Leading China's AI Race — an interview with DeepSeek CEO Liang Wenfeng via ChinaTalk, translated by Jordan Schneider, Angela Shen, Irene Zhang, and others
* A great position paper on open-endedness by Minqi Jiang, Tim Rocktäschel, and Ed Grefenstette — Minqi also wrote a blog post on this for us!
* For China Guys only: China's AI Regulations and How They Get Made by Matt Sheehan (+ an interview I did with Matt in 2022!)
* The Simple Macroeconomics of AI by Daron Acemoglu + a critique by Maxwell Tabarrok (more links in the Report)
* AI Nationalism by Ian Hogarth (from 2018)
* Some analysis of the EU AI Act + regulation from Lawfare

Get full access to The Gradient at thegradientpub.substack.com/subscribe
Episode 141

I spoke with Professor Philip Goff about:
* What a "post-Galilean" science of consciousness looks like
* How panpsychism helps explain consciousness and the hybrid cosmopsychist view

Enjoy!

Philip Goff is a British author, idealist philosopher, and professor at Durham University whose research focuses on philosophy of mind and consciousness, specifically on how consciousness can be part of the scientific worldview. He is the author of multiple books, including Consciousness and Fundamental Reality; Galileo's Error: Foundations for a New Science of Consciousness; and Why? The Purpose of the Universe.

Find me on Twitter for updates on new episodes, and reach me at editor@thegradient.pub for feedback, ideas, guest suggestions.

Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSS
Follow The Gradient on Twitter

Outline:
* (00:00) Intro
* (01:05) Goff vs. Carroll on the Knowledge Argument and explanation
* (08:00) Preferences for theories
* (12:55) Curiosity (Grounding, Essence) and the Knowledge Argument
* (14:40) Phenomenal transparency and physicalism vs. anti-physicalism
* (29:00) How Exactly Does Panpsychism Help Explain Consciousness
* (30:05) The argument for hybrid cosmopsychism
* (36:35) "Bare" subjects / subjects before inheriting phenomenal properties
* (40:35) Bundle theories of the self
* (43:35) Fundamental properties and new subjects as causal powers
* (50:00) Integrated Information Theory
* (55:00) Fundamental assumptions in hybrid cosmopsychism
* (1:00:00) Outro

Links:
* Philip's homepage and Twitter
* Papers:
* Putting Consciousness First
* Curiosity (Grounding, Essence) and the Knowledge Argument

Get full access to The Gradient at thegradientpub.substack.com/subscribe
Hi everyone!

If you're a new subscriber or listener, welcome. If you're not new, you've probably noticed that things have slowed down from us a bit recently. Hugh Zhang, Andrey Kurenkov, and I sat down to recap some of The Gradient's history, where we are now, and how things will look going forward.

To summarize and give some context: The Gradient has been around for about 6 years now — we began as an online magazine, and began producing our own newsletter and podcast about 4 years ago. With a team of volunteers — we take in a bit of money through Substack that we use for subscriptions to tools we need and try to pay ourselves a bit — we've been able to keep this going for quite some time. Our team has less bandwidth than we'd like right now (and I'll admit that at least some of us are running on fumes…) — we'll be making a few changes:
* Magazine: We're going to be scaling down our editing work on the magazine. While we won't be accepting pitches for unwritten drafts for now, if you have a full piece that you'd like to pitch to us, we'll consider posting it. If you've reached out about writing and haven't heard from us, we're really sorry. We've tried a few different arrangements to manage the pipeline of articles we have, but it's been difficult to make it work. We still want this to be a place to promote good work and writing from the ML community, so we intend to continue using this Substack for that purpose. If we have more editing bandwidth on our team in the future, we want to continue doing that work.
* Newsletter: We'll aim to continue the newsletter as before, but with a "Best from the Community" section highlighting posts. We'll have a way for you to send articles you want to be featured, but for now you can reach us at editor@thegradient.pub.
* Podcast: I'll be continuing this (at a slower pace), but will eventually transition it away from The Gradient given its expanded range of topics. If you're interested in following, it might be worth subscribing on another player like Apple Podcasts or Spotify, or using the RSS feed.
* Sigmoid Social: We'll keep this alive as long as there's financial support for it.

If you like what we do and/or want to help us out in any way, do reach out to editor@thegradient.pub. We love hearing from you.

Timestamps:
* (0:00) Intro
* (01:55) How The Gradient began
* (03:23) Changes and announcements
* (10:10) More Gradient history! On our involvement, favorite articles, and some plugs

Some of our favorite articles! There are so many, so this is very much a non-exhaustive list:
* NLP's ImageNet moment has arrived
* The State of Machine Learning Frameworks in 2019
* Why transformative artificial intelligence is really, really hard to achieve
* An Introduction to AI Story Generation
* The Artificiality of Alignment (I didn't mention this one in the episode, but it should be here)

Places you can find us!

Hugh:
* Twitter
* Personal site
* Papers/things mentioned!
* A Careful Examination of LLM Performance on Grade School Arithmetic (GSM1k)
* Planning in Natural Language Improves LLM Search for Code Generation
* Humanity's Last Exam

Andrey:
* Twitter
* Personal site
* Last Week in AI Podcast

Daniel:
* Twitter
* Substack blog
* Personal site (under construction)

Get full access to The Gradient at thegradientpub.substack.com/subscribe
Episode 140

I spoke with Professor Jacob Andreas about:
* Language and the world
* World models
* How he's developed as a scientist

Enjoy!

Jacob is an associate professor at MIT in the Department of Electrical Engineering and Computer Science as well as the Computer Science and Artificial Intelligence Laboratory. His research aims to understand the computational foundations of language learning, and to build intelligent systems that can learn from human guidance. Jacob earned his Ph.D. from UC Berkeley, his M.Phil. from Cambridge (where he studied as a Churchill scholar) and his B.S. from Columbia. He has received a Sloan fellowship, an NSF CAREER award, MIT's Junior Bose and Kolokotrones teaching awards, and paper awards at ACL, ICML and NAACL.

Find me on Twitter for updates on new episodes, and reach me at editor@thegradient.pub for feedback, ideas, guest suggestions.

Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSS
Follow The Gradient on Twitter

Outline:
* (00:00) Intro
* (00:40) Jacob's relationship with grounding fundamentalism
* (05:21) Jacob's reaction to LLMs
* (11:24) Grounding language — is there a philosophical problem?
* (15:54) Grounding and language modeling
* (24:00) Analogies between humans and LMs
* (30:46) Grounding language with points and paths in continuous spaces
* (32:00) Neo-Davidsonian formal semantics
* (36:27) Evolving assumptions about structure prediction
* (40:14) Segmentation and event structure
* (42:33) How much do word embeddings encode about syntax?
* (43:10) Jacob's process for studying scientific questions
* (45:38) Experiments and hypotheses
* (53:01) Calibrating assumptions as a researcher
* (54:08) Flexibility in research
* (56:09) Measuring Compositionality in Representation Learning
* (56:50) Developing an independent research agenda and developing a lab culture
* (1:03:25) Language Models as Agent Models
* (1:04:30) Background
* (1:08:33) Toy experiments and interpretability research
* (1:13:30) Developing effective toy experiments
* (1:15:25) Language Models, World Models, and Human Model-Building
* (1:15:56) OthelloGPT's bag of heuristics and multiple "world models"
* (1:21:32) What is a world model?
* (1:23:45) The Big Question — from meaning to world models
* (1:28:21) From "meaning" to precise questions about LMs
* (1:32:01) Mechanistic interpretability and reading tea leaves
* (1:35:38) Language and the world
* (1:38:07) Towards better language models
* (1:43:45) Model editing
* (1:45:50) On academia's role in NLP research
* (1:49:13) On good science
* (1:52:36) Outro

Links:
* Jacob's homepage and Twitter
* Language Models, World Models, and Human Model-Building
* Papers:
* Semantic Parsing as Machine Translation (2013)
* Grounding language with points and paths in continuous spaces (2014)
* How much do word embeddings encode about syntax? (2014)
* Translating neuralese (2017)
* Analogs of linguistic structure in deep representations (2017)
* Learning with latent language (2018)
* Learning from Language (2018)
* Measuring Compositionality in Representation Learning (2019)
* Experience grounds language (2020)
* Language Models as Agent Models (2022)

Get full access to The Gradient at thegradientpub.substack.com/subscribe
Episode 139

I spoke with Evan Ratliff about:
* Shell Game, Evan's new podcast, where he creates an AI voice clone of himself and sets it loose.
* The end of the Longform Podcast and his thoughts on the state of journalism.

Enjoy!

Evan is an award-winning investigative journalist, bestselling author, podcast host, and entrepreneur. He's the author of The Mastermind: A True Story of Murder, Empire, and a New Kind of Crime Lord; the writer and host of the hit podcasts Shell Game and Persona: The French Deception; and the cofounder of The Atavist Magazine, Pop-Up Magazine, and the Longform Podcast. As a writer, he's a two-time National Magazine Award finalist. As an editor and producer, he's a two-time Emmy nominee and National Magazine Award winner.

Find me on Twitter for updates on new episodes, and reach me at editor@thegradient.pub for feedback, ideas, guest suggestions.

Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSS
Follow The Gradient on Twitter

Outline:
* (00:00) Intro
* (01:05) Evan's ambitious and risky projects
* (04:45) Wearing different personas as a journalist
* (08:31) Boundaries and acceptability in using voice agents
* (11:42) Impacts on other people
* (13:12) "The kids these days" — how will new technologies impact younger people?
* (17:12) Evan's approach to children's technology use
* (20:05) Techno-solutionism and improvements in medicine, childcare
* (24:15) Evan's perspective on simulations of people
* (27:05) On motivations for building tech startups
* (30:42) Evan's outlook for Shell Game's impact and motivations for his work
* (36:05) How Evan decided to write for a career
* (40:02) How voice agents might impact our conversations
* (43:52) Evan's experience with Longform and podcasting
* (47:15) Perspectives on doing good interviews
* (52:11) Mimicking and inspiration, developing style
* (57:15) Writers and their motivations, the state of longform journalism
* (1:06:15) The internet and writing
* (1:09:41) On the ending of Longform
* (1:19:48) Outro

Links:
* Evan's homepage and Twitter
* Shell Game, Evan's new podcast
* Longform Podcast

Get full access to The Gradient at thegradientpub.substack.com/subscribe
Episode 138

I spoke with Meredith Morris about:
* The intersection of AI and HCI and why we need more cross-pollination between AI and adjacent fields
* Disability studies and AI
* Generative ghosts and technological determinism
* Developing a useful definition of AGI

I didn't get to record an intro for this episode since I've been sick. Enjoy!

Meredith is Director for Human-AI Interaction Research at Google DeepMind and an Affiliate Professor in The Paul G. Allen School of Computer Science & Engineering and in The Information School at the University of Washington, where she participates in the dub research consortium. Her work spans the areas of human-computer interaction (HCI), human-centered AI, human-AI interaction, computer-supported cooperative work (CSCW), social computing, and accessibility. She has been recognized as an ACM Fellow and ACM SIGCHI Academy member for her contributions to HCI.

Find me on Twitter for updates on new episodes, and reach me at editor@thegradient.pub for feedback, ideas, guest suggestions.

Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSS
Follow The Gradient on Twitter

Outline:
* (00:00) Meredith's influences and earlier work
* (03:00) Distinctions between AI and HCI
* (05:56) Maturity of fields and cross-disciplinary work
* (09:03) Technology and ends
* (10:37) Unique aspects of Meredith's research direction
* (12:55) Forms of knowledge production in interdisciplinary work
* (14:08) Disability, Bias, and AI
* (18:32) LaMPost and using LMs for writing
* (20:12) Accessibility approaches for dyslexia
* (22:15) Awareness of AI and perceptions of autonomy
* (24:43) The software model of personhood
* (28:07) Notions of intelligence, normative visions and disability studies
* (32:41) Disability categories and learning systems
* (37:24) Bringing more perspectives into CS research and re-defining what counts as CS research
* (39:36) Training interdisciplinary researchers, blurring boundaries in academia and industry
* (43:25) Generative Agents and public imagination
* (45:13) The state of ML conferences, the need for more cross-pollination
* (46:42) Prestige in conferences, the move towards more cross-disciplinary work
* (48:52) Joon Park Appreciation
* (49:51) Training interdisciplinary researchers
* (53:20) Generative Ghosts and technological determinism
* (57:06) Examples of generative ghosts and clones, relationships to agentic systems
* (1:00:39) Reasons for wanting generative ghosts
* (1:02:25) Questions of consent for generative clones and ghosts
* (1:05:01) Labor involved in maintaining generative ghosts, psychological tolls
* (1:06:25) Potential religious and spiritual significance of generative systems
* (1:10:19) Anthropomorphization
* (1:12:14) User experience and cognitive biases
* (1:15:24) Levels of AGI
* (1:16:13) Defining AGI
* (1:23:20) World models and AGI
* (1:26:16) Metacognitive abilities in AGI
* (1:30:06) Towards Bidirectional Human-AI Alignment
* (1:30:55) Pluralistic value alignment
* (1:32:43) Meredith's perspective on deploying AI systems
* (1:36:09) Meredith's advice for younger interdisciplinary researchers

Links:
* Meredith's homepage, Twitter, and Google Scholar
* Papers:
* Mediating Group Dynamics through Tabletop Interface Design
* SearchTogether: An Interface for Collaborative Web Search
* AI and Accessibility: A Discussion of Ethical Considerations
* Disability, Bias, and AI
* LaMPost: Design and Evaluation of an AI-assisted Email Writing Prototype for Adults with Dyslexia
* Generative Ghosts
* Levels of AGI

Get full access to The Gradient at thegradientpub.substack.com/subscribe
Episode 137

I spoke with Davidad Dalrymple about:
* His perspectives on AI risk
* ARIA (the UK's Advanced Research and Invention Agency) and its Safeguarded AI Programme

Enjoy—and let me know what you think!

Davidad is a Programme Director at ARIA. He was most recently a Research Fellow in technical AI safety at Oxford. He co-invented the top-40 cryptocurrency Filecoin, led an international neuroscience collaboration, and was a senior software engineer at Twitter and multiple startups.

Find me on Twitter for updates on new episodes, and reach me at editor@thegradient.pub for feedback, ideas, guest suggestions.

Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSS
Follow The Gradient on Twitter

Outline:
* (00:00) Intro
* (00:36) Calibration and optimism about breakthroughs
* (03:35) Calibration and AGI timelines, effects of AGI on humanity
* (07:10) Davidad's thoughts on the Orthogonality Thesis
* (10:30) Understanding how our current direction relates to AGI and breakthroughs
* (13:33) What Davidad thinks is needed for AGI
* (17:00) Extracting knowledge
* (19:01) Cyber-physical systems and modeling frameworks
* (20:00) Continuities between Davidad's earlier work and ARIA
* (22:56) Path dependence in technology, race dynamics
* (26:40) More on Davidad's perspective on what might go wrong with AGI
* (28:57) Vulnerable world, interconnectedness of computers and control
* (34:52) Formal verification and world modeling, Open Agency Architecture
* (35:25) The Semantic Sufficiency Hypothesis
* (39:31) Challenges for modeling
* (43:44) The Deontic Sufficiency Hypothesis and mathematical formalization
* (49:25) Oversimplification and quantitative knowledge
* (53:42) Collective deliberation in expressing values for AI
* (55:56) ARIA's Safeguarded AI Programme
* (59:40) Anthropic's ASL levels
* (1:03:12) Guaranteed Safe AI
* (1:03:38) AI risk and (in)accurate world models
* (1:09:59) Levels of safety specifications for world models and verifiers — steps to achieve high safety
* (1:12:00) Davidad's portfolio research approach and funding at ARIA
* (1:15:46) Earlier concerns about ARIA — Davidad's perspective
* (1:19:26) Where to find more information on ARIA and the Safeguarded AI Programme
* (1:20:44) Outro

Links:
* Davidad's Twitter
* ARIA homepage
* Safeguarded AI Programme
* Papers:
* Guaranteed Safe AI
* Davidad's Open Agency Architecture for Safe Transformative AI
* Dioptics: a Common Generalization of Open Games and Gradient-Based Learners (2019)
* Asynchronous Logic Automata (2008)

Get full access to The Gradient at thegradientpub.substack.com/subscribe
Episode 136

I spoke with Clive Thompson about:
* How he writes
* Writing about the climate and biking across the US
* Technology culture and persistent debates in AI
* Poetry

Enjoy—and let me know what you think!

Clive is a journalist who writes about science and technology. He is a contributing writer for Wired magazine, and is currently writing his next book about micromobility and cycling across the US.

Find me on Twitter for updates on new episodes, and reach me at editor@thegradient.pub for feedback, ideas, guest suggestions.

Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSS
Follow The Gradient on Twitter

Outline:
* (00:00) Intro
* (01:07) Clive's life as a Tarantino movie
* (03:07) Boring life and interesting art, life as material for art
* (10:25) Cycling across the US — Clive's new book on mobility and decarbonization
* (15:07) Turning inward in writing
* (27:21) Including personal experience in writing
* (31:53) Personal and less personal writing
* (36:08) Conveying uncertainty and the "voice from nowhere" in traditional journalism
* (41:10) Finding the natural end of a piece
* (1:02:10) Writing routine
* (1:05:08) Theories of change in Clive's writing
* (1:12:33) How Clive saw things before the rest of us
* (1:27:00) Automation in software engineering
* (1:31:40) The anthropology of coders, poetry as a framework
* (1:43:50) Proust discourse
* (1:45:00) Technology culture in NYC + interaction between the tech world and other worlds
* (1:50:30) Technological developments Clive wants to see happen (free ideas)
* (2:01:11) Clive's argument for memorizing poetry
* (2:09:24) How Clive finds poetry
* (2:18:03) Clive's pursuit of freelance writing and making compromises
* (2:27:25) Outro

Links:
* Clive's Twitter and website
* Selected writing:
* The Attack of the Incredible Grading Machine (Lingua Franca, 1999)
* The Know-It-All Machine (Lingua Franca, 2001)
* How to teach AI some common sense (Wired, 2018)
* Blogs to Riches (NY Mag, 2006)
* Clive vs. Jonathan Franzen on whether the internet is good for writing (The Chronicle of Higher Education, 2013)
* The Minecraft Generation (New York Times, 2016)
* What AI College Exam Proctors are Really Teaching Our Kids (Wired, 2020)
* Companies Don't Need to Be Creepy to Make Money (Wired, 2021)
* Is Sucking Carbon Out of the Air the Solution to Our Climate Crisis? (Mother Jones, 2021)
* AI Shouldn't Compete with Workers—It Should Supercharge Them (Wired, 2022)
* Back to BASIC—the Most Consequential Programming Language in the History of Computing (Wired, 2024)

Get full access to The Gradient at thegradientpub.substack.com/subscribe
Episode 136

I spoke with Judy Fan about:
* Our use of physical artifacts for sensemaking
* Why cognitive tools can be a double-edged sword
* Her approach to scientific inquiry and how that approach has developed

Enjoy—and let me know what you think!

Judy is Assistant Professor of Psychology at Stanford and director of the Cognitive Tools Lab. Her lab employs converging approaches from cognitive science, computational neuroscience, and artificial intelligence to reverse engineer the human cognitive toolkit, especially how people use physical representations of thought — such as sketches and prototypes — to learn, communicate, and solve problems.

Find me on Twitter for updates on new episodes, and reach me at editor@thegradient.pub for feedback, ideas, guest suggestions.

I spend a lot of time on this podcast—if you like my work, you can support me on Patreon :) You can also support upkeep for the full Gradient team/project through a paid subscription on Substack!

Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSS
Follow The Gradient on Twitter

Outline:
* (00:00) Intro
* (00:49) Throughlines and discontinuities in Judy's research
* (06:26) "Meaning" in Judy's research
* (08:05) Production and consumption of artifacts
* (13:03) Explanatory questions, why we develop visual artifacts, science as a social enterprise
* (15:46) Unifying principles
* (17:45) "Hard limits" to knowledge and optimism
* (21:47) Tensions in different fields' forms of sensemaking and establishing truth claims
* (30:55) Dichotomies and carving up the space of possible hypotheses, conceptual tools
* (33:22) Cognitive tools and projectivism, simplified models vs. nature
* (40:28) Scientific training and science as process and habit
* (45:51) Developing mental clarity about hypotheses
* (51:45) Clarifying and expressing ideas
* (1:03:21) Cognitive tools as double-edged
* (1:14:21) Historical and social embeddedness of tools
* (1:18:34) How cognitive tools impact our imagination
* (1:23:30) Normative commitments and the role of cognitive science outside the academy
* (1:32:31) Outro

Links:
* Judy's Twitter and lab page
* Selected papers (there are lots!)
* Overviews:
* Drawing as a versatile cognitive tool (2023)
* Using games to understand the mind (2024)
* Socially intelligent machines that learn from humans and help humans learn (2024)
* Research papers:
* Communicating design intent using drawing and text (2024)
* Creating ad hoc graphical representations of number (2024)
* Visual resemblance and interaction history jointly constrain pictorial meaning (2023)
* Explanatory drawings prioritize functional properties at the expense of visual fidelity (2023)
* SEVA: Leveraging sketches to evaluate alignment between human and machine visual abstraction (2023)
* Parallel developmental changes in children's production and recognition of line drawings of visual concepts (2023)
* Learning to communicate about shared procedural abstractions (2021)
* Visual communication of object concepts at different levels of abstraction (2021)
* Relating visual production and recognition of objects in the human visual cortex (2020)
* Collabdraw: an environment for collaborative sketching with an artificial agent (2019)
* Pragmatic inference and visual abstraction enable contextual flexibility in visual communication (2019)
* Common object representations for visual production and recognition (2018)

Get full access to The Gradient at thegradientpub.substack.com/subscribe
Episode 135

I spoke with L. M. Sacasas about:
* His writing and intellectual influences
* The value of asking hard questions about technology and our relationship to it
* What happens when we decide to outsource skills and competency
* Evolving notions of what it means to be human and questions about how to live a good life

Enjoy—and let me know what you think!

Michael is Executive Director of the Christian Study Center of Gainesville, Florida and author of The Convivial Society, a newsletter about technology and society. He does some of the best writing on technology I've had the pleasure of reading, and I highly recommend his newsletter.

Find me on Twitter for updates on new episodes, and reach me at editor@thegradient.pub for feedback, ideas, guest suggestions.

I spend a lot of time on this podcast—if you like my work, you can support me on Patreon :) You can also support upkeep for the full Gradient team/project through a paid subscription on Substack!

Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSS
Follow The Gradient on Twitter

Outline:
* (00:00) Intro
* (01:12) On podcasts as a medium
* (06:12) Michael's writing
* (12:38) Michael's intellectual influences, contingency
* (18:48) Moral seriousness
* (22:00) Michael's ambitions for his work
* (26:17) The value of asking the right questions (about technology)
* (34:18) Technology use and the "natural" pace of human life
* (46:40) Outsourcing of skills and competency, engagement with others
* (55:33) Inevitability narratives and technological determinism, the "Borg Complex"
* (1:05:10) Notions of what it is to be human, embodiment
* (1:12:37) Higher cognition vs. the body, dichotomies
* (1:22:10) The body as a starting point for philosophy, questions about the adoption of new technologies
* (1:30:01) Enthusiasm about technology and the cultural milieu
* (1:35:30) Projectivism, desire for knowledge about and control of the world
* (1:41:22) Positive visions for the future
* (1:47:11) Outro

Links:
* Michael's Substack: The Convivial Society and his book, The Frailest Thing: Ten Years of Thinking about the Meaning of Technology
* Michael's Twitter
* Essays:
* Humanist Technology Criticism
* What Does the Critic Love?
* The Ambling Mind
* Waste Your Time, Your Life May Depend On It
* The Work of Art
* The Stuff of (a Well-Lived) Life

Get full access to The Gradient at thegradientpub.substack.com/subscribe
Episode 134

I spoke with Pete Wolfendale about:
* The flaws in longtermist thinking
* Selections from his new book, The Revenge of Reason
* Metaphysics
* What philosophy has to say about reason and AI

Enjoy—and let me know what you think!

Pete is an independent philosopher based in Newcastle. Dr. Wolfendale received both his undergraduate degree and his Ph.D. in Philosophy from the University of Warwick. His Ph.D. thesis offered a re-examination of the Heideggerian Seinsfrage, arguing that Heideggerian scholarship has failed to fully do justice to its philosophical significance, and supplementing the shortcomings in Heidegger's thought about Being with an alternative formulation of the question. He is the author of Object-Oriented Philosophy: The Noumenon's New Clothes and The Revenge of Reason. His blog is Deontologistics.

Find me on Twitter for updates on new episodes, and reach me at editor@thegradient.pub for feedback, ideas, guest suggestions.

I spend a lot of time on this podcast—if you like my work, you can support me on Patreon :) You can also support upkeep for the full Gradient team/project through a paid subscription on Substack!

Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSS
Follow The Gradient on Twitter

Outline:
* (00:00) Intro
* (01:30) Pete's experience with (para-)academia, incentive structures
* (10:00) Progress in philosophy and the analytic tradition
* (17:57) Thinking through metaphysical questions
* (26:46) Philosophy of science, uncovering categorical properties vs. dispositions
* (31:55) Structure of thought and the world, epistemological excess
* (40:25)
* (49:31) What reason is, relation to language models, semantic fragmentation of AGI
* (1:00:55) Neural net interpretability and intervention
* (1:08:16) World models, architecture and behavior of AI systems
* (1:12:35) Language acquisition in humans and LMs
* (1:15:30) Pretraining vs. evolution
* (1:16:50) Technological determinism
* (1:18:19) Pete's thinking on e/acc
* (1:27:45) Prometheanism vs. e/acc
* (1:29:39) The Weight of Forever — Pete's critique of What We Owe the Future
* (1:30:15) Our rich deontological language and longtermism's limits
* (1:43:33) Longtermism and the opacity of desire
* (1:44:41) Longtermism's historical narrative and technological determinism, theories of power
* (1:48:10) The "posthuman" condition, language and techno-linguistic infrastructure
* (2:00:15) Type-checking and universal infrastructure
* (2:09:23) Multitudes and selfhood
* (2:21:12) Definitions of the self and (non-)circularity
* (2:32:55) Freedom and aesthetics, aesthetic exploration and selfhood
* (2:52:46) Outro

Links:
* Pete's blog and Twitter
* Book: The Revenge of Reason
* Writings / References:
* The Weight of Forever
* On Neorationalism
* So, Accelerationism, what's that all about?

Get full access to The Gradient at thegradientpub.substack.com/subscribe
Episode 133

I spoke with Peter Lee about:
* His early work on compiler generation, metacircularity, and type theory
* Paradoxical problems
* GPT-4's impact, Microsoft's "Sparks of AGI" paper, and responses and criticism

Enjoy—and let me know what you think!

Peter is President of Microsoft Research. He leads Microsoft Research and incubates new research-powered products and lines of business in areas such as artificial intelligence, computing foundations, health, and life sciences. Before joining Microsoft in 2010, he was at DARPA, where he established a new technology office that created operational capabilities in machine learning, data science, and computational social science. Prior to that, he was a professor and the head of the computer science department at Carnegie Mellon University. Peter is a member of the National Academy of Medicine and serves on the boards of the Allen Institute for Artificial Intelligence, the Brotman Baty Institute for Precision Medicine, and the Kaiser Permanente Bernard J. Tyson School of Medicine. He served on President Obama's Commission on Enhancing National Cybersecurity. He has testified before both the US House Science and Technology Committee and the US Senate Commerce Committee. With Carey Goldberg and Dr. Isaac Kohane, he is the coauthor of the best-selling book, "The AI Revolution in Medicine: GPT-4 and Beyond." In 2024, Peter Lee was named by Time magazine as one of the 100 most influential people in health and life sciences.

Find me on Twitter for updates on new episodes, and reach me at editor@thegradient.pub for feedback, ideas, guest suggestions.

I spend a lot of time on this podcast—if you like my work, you can support me on Patreon :) You can also support upkeep for the full Gradient team/project through a paid subscription on Substack!

Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSS
Follow The Gradient on Twitter

Outline:
* (00:00) Intro
* (00:50) Basic vs. applied research
* (05:20) Theory and practice in computing
* (10:28) Traditional denotational semantics and semantics engineering in modern-day systems
* (16:47) Beauty and practicality
* (20:40) Metacircularity in the polymorphic lambda calculus: research directions
* (24:31) Understanding the nature of difficulties with metacircularity
* (26:30) Difficulties with reflection, classic paradoxes
* (31:02) Sparks of AGI
* (31:41) Reproducibility
* (38:04) Confirming and disconfirming theories, foundational work
* (42:00) Back and forth between commitments and experimentation
* (51:01) Dealing with responsibility
* (56:30) Peter's picture of AGI
* (1:01:38) Outro

Links:
* Peter's Twitter, LinkedIn, and Microsoft Research pages
* Papers and references:
* The automatic generation of realistic compilers from high-level semantic descriptions
* Metacircularity in the polymorphic lambda calculus
* A Fresh Look at Combinator Graph Reduction
* Sparks of AGI
* Re-envisioning DARPA
* Fundamental Research in Engineering

Get full access to The Gradient at thegradientpub.substack.com/subscribe
Episode 132

I spoke with Manuel and Lenore Blum about:
* Their early influences and mentors
* The Conscious Turing Machine and what theoretical computer science can tell us about consciousness

Enjoy—and let me know what you think!

Manuel is a pioneer in the field of theoretical computer science and the winner of the 1995 Turing Award in recognition of his contributions to the foundations of computational complexity theory and its applications to cryptography and program checking, a mathematical approach to writing programs that check their work. He worked as a professor of computer science at the University of California, Berkeley until 2001. From 2001 to 2018, he was the Bruce Nelson Professor of Computer Science at Carnegie Mellon University.

Lenore is a Distinguished Career Professor of Computer Science, Emeritus at Carnegie Mellon University and former Professor-in-Residence in EECS at UC Berkeley. She is president of the Association for Mathematical Consciousness Science and a newly elected member of the American Academy of Arts and Sciences. Lenore is internationally recognized for her work in increasing the participation of girls and women in Science, Technology, Engineering, and Math (STEM) fields. She was a founder of the Association for Women in Mathematics, and founding Co-Director (with Nancy Kreinberg) of the Math/Science Network and its Expanding Your Horizons conferences for middle- and high-school girls.

Find me on Twitter for updates on new episodes, and reach me at editor@thegradient.pub for feedback, ideas, guest suggestions.

I spend a lot of time on this podcast—if you like my work, you can support me on Patreon :) You can also support upkeep for the full Gradient team/project through a paid subscription on Substack!

Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSS
Follow The Gradient on Twitter

Outline:
* (00:00) Intro
* (03:09) Manuel's interest in consciousness
* (05:55) More of the story — from memorization to derivation
* (11:15) Warren McCulloch's mentorship
* (14:00) McCulloch's anti-Freudianism
* (15:57) More on McCulloch's influence
* (27:10) On McCulloch and telling stories
* (32:35) The Conscious Turing Machine (CTM)
* (33:55) A last word on McCulloch
* (35:20) Components of the CTM
* (39:55) Advantages of the CTM model
* (50:20) The problem of free will
* (52:20) On pain
* (1:01:10) Brainish / CTM's multimodal inner language, language and thinking
* (1:13:55) The CTM's lack of a "central executive"
* (1:18:10) Empiricism and a self, tournaments in the CTM
* (1:26:30) Mental causation
* (1:36:20) Expertise and the CTM model, role of TCS
* (1:46:30) Dreams and dream experience
* (1:50:15) Disentangling components of experience from multimodal language
* (1:56:10) CTM Robot, meaning and symbols, embodiment and consciousness
* (2:00:35) AGI, CTM and AI processors, capabilities
* (2:09:30) CTM implications, potential worries
* (2:17:15) Advice for younger (computer) scientists
* (2:22:57) Outro

Links:
* Manuel's homepage
* Lenore's homepage; find Lenore on Twitter (https://x.com/blumlenore) and LinkedIn (https://www.linkedin.com/in/lenore-blum-1a47224)
* Articles:
* "The 'Accidental Activist' Who Changed the Face of Mathematics" — Ben Brubaker's Q&A with Lenore
* "How this Turing-Award-winning researcher became a legendary academic advisor" — Sheon Han's profile of Manuel
* Papers (Manuel and Lenore):
* AI Consciousness is Inevitable: A Theoretical Computer Science Perspective
* A Theory of Consciousness from a Theoretical Computer Science Perspective: Insights from the Conscious Turing Machine
* A Theoretical Computer Science Perspective on Consciousness and Artificial General Intelligence
* References (McCulloch):
* Embodiments of Mind
* Rebel Genius

Get full access to The Gradient at thegradientpub.substack.com/subscribe
Episode 131

I spoke with Professor Kevin Dorst about:
* Subjective Bayesianism and epistemology foundations
* What happens when you're uncertain about your evidence
* Why it's rational for people to polarize on political matters

Enjoy—and let me know what you think!

Kevin is an Associate Professor in the Department of Linguistics and Philosophy at MIT. He works at the border between philosophy and social science, focusing on rationality.

Find me on Twitter for updates on new episodes, and reach me at editor@thegradient.pub for feedback, ideas, guest suggestions.

I spend a lot of time on this podcast—if you like my work, you can support me on Patreon :) You can also support upkeep for the full Gradient team/project through a paid subscription on Substack!

Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSS
Follow The Gradient on Twitter

Outline:
* (00:00) Intro
* (01:15) When do Bayesians need theorems?
* (05:52) Foundations of epistemology, metaethics, formal models, error theory
* (09:35) Extreme views and error theory, arguing for/against opposing positions
* (13:35) Changing focuses in philosophy — pragmatic pressures
* (19:00) Kevin's goals through his research and work
* (25:10) Structural factors in coming to certain (political) beliefs
* (30:30) Acknowledging limited resources, heuristics, imperfect rationality
* (32:51) Hindsight Bias is Not a Bias
* (33:30) The argument
* (35:15) On eating cereal and symmetric properties of evidence
* (39:45) Colloquial notions of hindsight bias, time and evidential support
* (42:45) An example
* (48:02) Higher-order uncertainty
* (48:30) Explicitly modeling higher-order uncertainty
* (52:50) Another example (spoons)
* (54:55) Game theory, iterated knowledge, even higher order uncertainty
* (58:00) Uncertainty and philosophy of mind
* (1:01:20) Higher-order evidence about reliability and rationality
* (1:06:45) Being Rational and Being Wrong
* (1:09:00) Setup on calibration and overconfidence
* (1:12:30) The need for average rational credence — normative judgments about confidence and realism/anti-realism
* (1:15:25) Quasi-realism about average rational credence?
* (1:19:00) Classic epistemological paradoxes/problems — lottery paradox, epistemic luck
* (1:25:05) Deference in rational belief formation, uniqueness and permissivism
* (1:39:50) Rational Polarization
* (1:40:00) Setup
* (1:37:05) Epistemic nihilism, expanded confidence akrasia
* (1:40:55) Ambiguous evidence and confidence akrasia
* (1:46:25) Ambiguity in understanding and notions of rational belief
* (1:50:00) Claims about rational sensitivity — what stories we can tell given evidence
* (1:54:00) Evidence vs presentation of evidence
* (2:01:20) ChatGPT and the case for human irrationality
* (2:02:00) Is ChatGPT replicating human biases?
* (2:05:15) Simple instruction tuning and an alternate story
* (2:10:22) Kevin's aspirations with his work
* (2:15:13) Outro

Links:
* Professor Dorst's homepage and Twitter
* Papers:
* Modest Epistemology
* Hedden: Hindsight bias is not a bias
* Higher-order evidence + (Almost) all evidence is higher-order evidence
* Being Rational and Being Wrong
* Rational Polarization
* ChatGPT and human irrationality

Get full access to The Gradient at thegradientpub.substack.com/subscribe
Episode 130

I spoke with David Pfau about:
* Spectral learning and ML
* Learning to disentangle manifolds and (projective) representation theory
* Deep learning for computational quantum mechanics
* Picking and pursuing research problems and directions

David's work is really (times k for some very large value of k) interesting—I've been inspired to descend a number of rabbit holes because of it. (If you listen to this episode, you might become as cool as this guy.)

While I'm at it — I'm still hovering around 40 ratings on Apple Podcasts. It'd mean a lot if you'd consider helping me bump that up!

Enjoy—and let me know what you think!

David is a staff research scientist at Google DeepMind. He is also a visiting professor at Imperial College London in the Department of Physics, where he supervises work on applications of deep learning to computational quantum mechanics. His research interests span artificial intelligence, machine learning and scientific computing.

Find me on Twitter for updates on new episodes, and reach me at editor@thegradient.pub for feedback, ideas, guest suggestions.

I spend a lot of time on this podcast—if you like my work, you can support me on Patreon :) You can also support upkeep for the full Gradient team/project through a paid subscription on Substack!

Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSS
Follow The Gradient on Twitter

Outline:
* (00:00) Intro
* (00:52) David Pfau the "critic"
* (02:05) Scientific applications of deep learning — David's interests
* (04:57) Brain / neural network analogies
* (09:40) Modern ML systems and theories of the brain
* (14:19) Desirable properties of theories
* (18:07) Spectral Inference Networks
* (19:15) Connections to FermiNet / computational physics, a series of papers
* (33:52) Deep slow feature analysis — interpretability and findings on eigenfunctions
* (39:07) Following up on eigenfunctions (there are indeed only so many hours in a day; I have been asking the Substack people if they can ship 40-hour days, but I don't think they've gotten to it yet)
* (42:17) Power iteration and intuitions
* (45:23) Projective representation theory
* (46:00) ???
* (46:54) Geomancer and learning to decompose a manifold from data
* (47:45) we consider the question of whether you will spend 90 more minutes of this podcast episode (there are not 90 more minutes left in this podcast episode, but there could have been)
* (1:08:47) Learning embeddings
* (1:11:12) The "unexpected emergent property" of Geomancer
* (1:14:43) Learned embeddings and disentangling and preservation of topology
* N.B. I still haven't managed to do this in colab because I keep crashing my instance when I use s3o4d :(
* (1:21:07) What's missing from the ~ current (deep learning) paradigm ~
* (1:29:04) LLMs as swiss-army knives
* (1:32:05) RL and human learning — TD learning in the brain
* (1:37:43) Models that cover the Pareto Front (image below)
* (1:46:54) AI accelerators and doubling down on transformers
* (1:48:27) On Slow Research — chasing big questions and what makes problems attractive
* (1:53:50) Future work on Geomancer
* (1:55:35) Finding balance in pursuing interesting and lucrative work
* (2:00:40) Outro

Links:
* Papers:
* Natural Quantum Monte Carlo Computation of Excited States (2023)
* Making sense of raw input (2021)
* Integrable Nonparametric Flows (2020)
* Disentangling by Subspace Diffusion (2020)
* Ab initio solution of the many-electron Schrödinger equation with deep neural networks (2020)
* Spectral Inference Networks (2018)
* Connecting GANs and Actor-Critic Methods (2016)
* Learning Structure in Time Series for Neuroscience and Beyond (2015, dissertation)
* Robust learning of low-dimensional dynamics from large neural ensembles (2013)
* Probabilistic Deterministic Infinite Automata (2010)
* Other:
* On Slow Research
* "I just want to put this out here so that no one ever says 'we can just get around the data limitations of LLMs with self-play' ever again."

Get full access to The Gradient at thegradientpub.substack.com/subscribe
Episode 129

I spoke with Dan Hart and Michelle Michael about:
* Developing NSWEduChat, an AI-powered chatbot designed and delivered by the NSW Department of Education for students and teachers
* The challenges in effectively teaching students as technology develops
* Understanding and defining the importance of the classroom

Enjoy—and let me know what you think!

Dan Hart is Head of AI, and Michelle Michael is Director of Educational Support and Rural Initiatives at the New South Wales (NSW) Department of Education.

Find me on Twitter for updates on new episodes, and reach me at editor@thegradient.pub for feedback, ideas, guest suggestions.

I spend a lot of time on this podcast—if you like my work, you can support me on Patreon :) You can also support upkeep for the full Gradient team/project through a paid subscription on Substack!

Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSS
Follow The Gradient on Twitter

Outline:
* (00:00) Intro
* (00:48) How NSWEduChat came to be, educational principles for AI use
* (02:37) Educational environment in New South Wales
* (04:41) How educators have adapted to new challenges for teaching and assessment
* (07:47) Considering technology advancement while teaching and assessing students
* (12:14) Educating teachers and students about how to use AI tools
* (15:03) AI in the classroom and enabling teachers
* (19:44) Product-first thinking for educational AI
* (22:15) Red teaming and testing
* (24:02) Benchmarking, chatbots as an assistant
* (26:35) The importance of the classroom
* (28:10) Media coverage and hype
* (30:35) Measurement and the benchmarking process/methodology
* (34:50) Principles for how chatbots should interact with students
* (44:29) Producing good educational outcomes at scale
* (46:41) Operating with speed and effectiveness while implementing governance
* (49:03) How the experience of building technologies evolves
* (51:45) Identifying good technologists and educators for development and use
* (55:07) Teaching standards and how AI impacts teachers
* (57:01) How technologists incorporate teaching standards and expertise in their work
* (1:00:03) NSWEduChat model details
* (1:02:55) Value alignment for NSWEduChat
* (1:05:40) Practicing caution in filtering chatbot responses
* (1:07:35) Equity and personalized instruction — how NSWEduChat can help
* (1:10:19) Helping students become "the students they could be"
* (1:13:39) Outro

Links:
* NSWEduChat
* Guardian article on NSWEduChat

Get full access to The Gradient at thegradientpub.substack.com/subscribe
Episode 129

I spoke with Kristin Lauter about:
* Elliptic curve cryptography and homomorphic encryption
* Standardizing cryptographic protocols
* Machine learning on encrypted data
* Attacking post-quantum cryptography with AI

Enjoy—and let me know what you think!

Kristin is Senior Director of FAIR Labs North America (2022—present), based in Seattle. Her current research areas are AI4Crypto and Private AI. She joined FAIR (Facebook AI Research) in 2021, after 22 years at Microsoft Research (MSR). At MSR she was Partner Research Manager on the senior leadership team of MSR Redmond. Before joining Microsoft in 1999, she was Hildebrandt Assistant Professor of Mathematics at the University of Michigan (1996-1999). She is an Affiliate Professor of Mathematics at the University of Washington (2008—present). She received all her advanced degrees from the University of Chicago: BA (1990), MS (1991), and PhD (1996) in Mathematics. She is best known for her work on Elliptic Curve Cryptography, Supersingular Isogeny Graphs in Cryptography, Homomorphic Encryption (SEALcrypto.org), Private AI, and AI4Crypto. She served as President of the Association for Women in Mathematics from 2015-2017 and on the Council of the American Mathematical Society from 2014-2017.

Find me on Twitter for updates on new episodes, and reach me at editor@thegradient.pub for feedback, ideas, guest suggestions.

I spend a lot of time on this podcast—if you like my work, you can support me on Patreon :) You can also support upkeep for the full Gradient team/project through a paid subscription on Substack!

Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSS
Follow The Gradient on Twitter

Outline:
* (00:00) Intro
* (01:10) Llama 3 and encrypted data — where do we want to be?
* (04:20) Tradeoffs: individual privacy vs. aggregated value in e.g. social media forums
* (07:48) Kristin's shift in views on privacy
* (09:40) Earlier work on elliptic curve cryptography — applications and theory
* (10:50) Inspirations from algebra, number theory, and algebraic geometry
* (15:40) On algebra vs. analysis and on clear thinking
* (18:38) Elliptic curve cryptography and security, algorithms and concrete running time
* (21:31) Cryptographic protocols and setting standards
* (26:36) Supersingular isogeny graphs (and higher-dimensional supersingular isogeny graphs)
* (32:26) Hard problems for cryptography and finding new problems
* (36:42) Guaranteeing security for cryptographic protocols and mathematical foundations
* (40:15) Private AI: Crypto-Nets / running neural nets on homomorphically encrypted data
* (42:10) Polynomial approximations, activation functions, and expressivity
* (44:32) Scaling up, Llama 2 inference on encrypted data
* (46:10) Transitioning between MSR and FAIR, industry research
* (52:45) An efficient algorithm for integer lattice reduction (AI4Crypto)
* (56:23) Local minima, convergence and limit guarantees, scaling
* (58:27) SALSA: Attacking Lattice Cryptography with Transformers
* (58:38) Learning With Errors (LWE) vs. standard ML assumptions
* (1:02:25) Powers of small primes and faster learning
* (1:04:35) LWE and linear regression on a torus
* (1:07:30) Secret recovery algorithms and transformer accuracy
* (1:09:10) Interpretability / encoding information about secrets
* (1:09:45) Future work / scaling up
* (1:12:08) Reflections on working as a mathematician among technologists

Links:
* Kristin's Meta, Wikipedia, Google Scholar, and Twitter pages
* Papers and sources mentioned/referenced:
* The Advantages of Elliptic Curve Cryptography for Wireless Security (2004)
* Cryptographic Hash Functions from Expander Graphs (2007, introducing Supersingular Isogeny Graphs)
* Families of Ramanujan Graphs and Quaternion Algebras (2008 — the higher-dimensional analogues of Supersingular Isogeny Graphs)
* Cryptographic Cloud Storage (2010)
* Can homomorphic encryption be practical? (2011)
* ML Confidential: Machine Learning on Encrypted Data (2012)
* CryptoNets: Applying neural networks to encrypted data with high throughput and accuracy (2016)
* A community effort to protect genomic data sharing, collaboration and outsourcing (2017)
* The Homomorphic Encryption Standard (2022)
* Private AI: Machine Learning on Encrypted Data (2022)
* SALSA: Attacking Lattice Cryptography with Transformers (2022)
* SalsaPicante: A Machine Learning Attack on LWE with Binary Secrets
* SALSA VERDE: a machine learning attack on LWE with sparse small secrets
* Salsa Fresca: Angular Embeddings and Pre-Training for ML Attacks on Learning With Errors
* The cool and the cruel: separating hard parts of LWE secrets
* An efficient algorithm for integer lattice reduction (2023)

Get full access to The Gradient at thegradientpub.substack.com/subscribe
Episode 128

I spoke with Sergiy Nesterenko about:
* Developing an automated system for designing PCBs
* Difficulties in human and automated PCB design
* Building a startup at the intersection of different areas of expertise

By the way — I hit 40 ratings on Apple Podcasts (and am at 66 on Spotify). It'd mean a lot (really, a lot) if you'd consider leaving a rating or a review. I read everything, and it's very heartening and helpful to hear what you think.

Enjoy, and let me know what you think!

Sergiy is founder and CEO of Quilter. Sergiy spent 5 years at SpaceX developing radiation-hardened avionics for the second stages of SpaceX's Falcon 9 and Falcon Heavy rockets, before discovering a big problem: designing printed circuit boards for all the electronics in these rockets was tedious, manual, and error-prone. So in 2019, he founded Quilter to build the next generation of AI-powered tooling for electrical engineers.

I spend a lot of time on this podcast—if you like my work, you can support me on Patreon :)

Reach me at editor@thegradient.pub for feedback, ideas, guest suggestions.

Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSS
Follow The Gradient on Twitter

Outline:
* (00:00) Intro
* (00:45) Quilter origins and difficulties in designing PCBs
* (04:12) PCBs and schematic implementations
* (06:40) Iteration cycles and simulations
* (08:35) Octilinear traces and first-principles design for PCBs
* (12:38) The design space of PCBs
* (15:27) Benchmarks for PCB design
* (20:05) RL and PCB design
* (22:48) PCB details, track widths
* (25:09) Board functionality and aesthetics
* (27:53) PCB designers and automation
* (30:24) Quilter as a compiler
* (33:56) Gluing social worlds and bringing together expertise
* (36:00) Process knowledge vs. first-principles thinking
* (42:05) Example boards
* (44:45) Auto-routers for PCBs
* (48:43) Difficulties for scaling to larger boards
* (50:42) Customers and skepticism
* (53:42) On experiencing negative feedback
* (56:42) Maintaining stamina while building Quilter
* (1:00:00) Endgame for Quilter and future directions
* (1:03:24) Outro

Links:
* Quilter homepage
* Other pages/features mentioned:
* Thin-to-thick traces
* Octilinear trace routing
* Comment from Tom Fleet

Get full access to The Gradient at thegradientpub.substack.com/subscribe
Episode 127

I spoke with Christopher Thi Nguyen about:
* How we lose control of our values
* The tradeoffs of legibility, aggregation, and simplification
* Gamification and its risks

Enjoy—and let me know what you think!

As of July 2020, C. Thi Nguyen is Associate Professor of Philosophy at the University of Utah. His research focuses on how social structures and technology can shape our rationality and our agency. He has published on trust, expertise, group agency, community art, cultural appropriation, aesthetic value, echo chambers, moral outrage porn, and games. He received his PhD from UCLA. Once, he was a food writer for the Los Angeles Times.

I spend a lot of time on this podcast—if you like my work, you can support me on Patreon :)

Reach me at editor@thegradient.pub for feedback, ideas, guest suggestions.

Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSS
Follow The Gradient on Twitter

Outline:
* (00:00) Intro
* (01:10) The ubiquity of James C. Scott
* (06:03) Legibility and measurement
* (12:50) Value capture, classes and measurement
* (17:30) Political value choice in ML
* (23:30) Why value collapse happens
* (33:00) Blackburn, "Hume and Thick Connexions" — projectivism and legibility
* (36:20) Heuristics and decision-making
* (40:08) Institutional classification systems
* (46:55) Back to Hume
* (48:27) Epistemic arms races, stepping outside our conceptual architectures
* (56:40) The "what to do" question
* (1:04:00) Gamification, aesthetic engagement
* (1:14:51) Echo chambers and defining utility
* (1:22:10) Progress, AGI millenarianism (disclaimer: I don't know what's going to happen with the world, either.)
* (1:26:04) Parting visions
* (1:30:02) Outro

Links:
* Christopher's Twitter and homepage
* Games: Agency as Art
* Papers referenced:
* Transparency is Surveillance
* Games and the art of agency
* Autonomy and Aesthetic Engagement
* Art as a Shelter from Science
* Value Capture
* Hostile Epistemology
* Hume and Thick Connexions (Simon Blackburn)

Get full access to The Gradient at thegradientpub.substack.com/subscribe
Episode 126

I spoke with Vivek Natarajan about:
* Improving access to medical knowledge with AI
* How an LLM for medicine should behave
* Aspects of training Med-PaLM and AMIE
* How to facilitate appropriate amounts of trust in users of medical AI systems

Vivek Natarajan is a Research Scientist at Google Health AI advancing biomedical AI to help scale world-class healthcare to everyone. Vivek is particularly interested in building large language models and multimodal foundation models for biomedical applications, and leads the Google Brain moonshot behind Med-PaLM, Google's flagship medical large language model. Med-PaLM has been featured in Scientific American, The Economist, STAT News, CNBC, Forbes, and New Scientist, among others.

I spend a lot of time on this podcast—if you like my work, you can support me on Patreon :)

Reach me at editor@thegradient.pub for feedback, ideas, guest suggestions.

Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSS
Follow The Gradient on Twitter

Outline:
* (00:00) Intro
* (00:35) The concept of an "AI doctor"
* (06:54) Accessibility to medical expertise
* (10:31) Enabling doctors to do better/different work
* (14:35) Med-PaLM
* (15:30) Instruction tuning, desirable traits in LLMs for medicine
* (23:41) Axes for evaluation of medical QA systems
* (30:03) Medical LLMs and scientific consensus
* (35:32) Demographic data and patient interventions
* (40:14) Data contamination in Med-PaLM
* (42:45) Grounded claims about capabilities
* (45:48) Building trust
* (50:54) Genetic Discovery enabled by an LLM
* (51:33) Novel hypotheses in genetic discovery
* (57:10) Levels of abstraction for hypotheses
* (1:01:10) Directions for continued progress
* (1:03:05) Conversational Diagnostic AI
* (1:03:30) Objective Structured Clinical Examination as an evaluative framework
* (1:09:08) Relative importance of different types of data
* (1:13:52) Self-play — conversational dispositions and handling patients
* (1:16:41) Chain of reasoning and information retention
* (1:20:00) Performance in different areas of medical expertise
* (1:22:35) Towards accurate differential diagnosis
* (1:31:40) Feedback mechanisms and expertise, disagreement among clinicians
* (1:35:26) Studying trust, user interfaces
* (1:38:08) Self-trust in using medical AI models
* (1:41:39) UI for medical AI systems
* (1:43:50) Model reasoning in complex scenarios
* (1:46:33) Prompting
* (1:48:41) Future outlooks
* (1:54:53) Outro

Links:
* Vivek's Twitter and homepage
* Papers:
* Towards Expert-Level Medical Question Answering with LLMs (2023)
* LLMs encode clinical knowledge (2023)
* Towards Generalist Biomedical AI (2024)
* AMIE
* Genetic Discovery enabled by an LLM (2023)

Get full access to The Gradient at thegradientpub.substack.com/subscribe
Episode 125False universalism freaks me out. It doesn't freak me out as a first principle because of epistemic violence; it freaks me out because it works. I spoke with Professor Thomas Mullaney about:* Telling stories about your work and balancing what feels meaningful with practical realities* Destabilizing our understandings of the technologies we feel familiar with, and the work of researching the history of the Chinese typewriter* The personal nature of researchThe Chinese Typewriter and The Chinese Computer are two of the best books I've read in a very long time. And they're not just good and interesting, but important to read, for the history they tell and the ideas and arguments they present—I can't recommend them and Professor Mullaney's other work enough.Tom is Professor of History and Professor of East Asian Languages and Cultures, by courtesy. He is also the Kluge Chair in Technology and Society at the Library of Congress, and a Guggenheim Fellow. He is the author or lead editor of 8 books, including The Chinese Computer, The Chinese Typewriter (winner of the Fairbank prize), Your Computer is on Fire, and Coming to Terms with the Nation: Ethnic Classification in Modern China.I spend a lot of time on this podcast—if you like my work, you can support me on Patreon :)Reach me at editor@thegradient.pub for feedback, ideas, guest suggestions. Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSSFollow The Gradient on TwitterOutline:* (00:00) Intro* (01:00) “In Their Own Words” interview: on telling stories about your work* (07:42) Clashing narratives and authenticity/inauthenticity in pursuing your work* (15:48) Why Professor Mullaney pursued studying the Chinese typewriter* (18:20) Worldmaking, transforming the physical world to fit our descriptive models* (30:07) Internal and illegible continuities/coherence in work* (31:45) The role of a “self”* (43:06) The 2008 Beijing Olympics and false (alphabetical) universalism, projectivism* (1:04:23) “Kicking the ladder” and the personal nature of research* (1:18:07) The “Technolinguistic Chinese Exclusion Act” — the situatedness of historians in their work* (1:33:00) Is the Chinese typewriter project finished? / on the resolution of problems* (1:43:35) OutroLinks:* Professor Mullaney's homepage and Twitter* In Their Own Words: Thomas Mullaney* Books* The Chinese Computer: A Global History of the Information Age* The Chinese Typewriter: A History* Coming to Terms with the Nation: Ethnic Classification in Modern China Get full access to The Gradient at thegradientpub.substack.com/subscribe
Episode 124You may think you're doing a priori reasoning, but actually you're just over-generalizing from your current experience of technology.I spoke with Professor Seth Lazar about:* Why managing near-term and long-term risks isn't always zero-sum* How to think through axioms and systems in political philosophy* Coordination problems, economic incentives, and other difficulties in developing publicly beneficial AISeth is Professor of Philosophy at the Australian National University, an Australian Research Council (ARC) Future Fellow, and a Distinguished Research Fellow of the University of Oxford Institute for Ethics in AI. He has worked on the ethics of war, self-defense, and risk, and now leads the Machine Intelligence and Normative Theory (MINT) Lab, where he directs research projects on the moral and political philosophy of AI.Reach me at editor@thegradient.pub for feedback, ideas, guest suggestions. Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSSFollow The Gradient on TwitterOutline:* (00:00) Intro* (00:54) Ad read — MLOps conference* (01:32) The allocation of attention — attention, moral skill, and algorithmic recommendation* (03:53) Attention allocation as an independent good (or bad)* (08:22) Axioms in political philosophy* (11:55) Explaining judgments, multiplying entities, parsimony, intuitive disgust* (15:05) AI safety / catastrophic risk concerns* (22:10) Superintelligence arguments, reasoning about technology* (28:42) Attacking current and future harms from AI systems — does one draw resources from the other? * (35:55) GPT-2, model weights, related debates* (39:11) Power and economics—coordination problems, company incentives* (50:42) Morality tales, relationship between safety and capabilities* (55:44) Feasibility horizons, prediction uncertainty, and doing moral philosophy* (1:02:28) What is a feasibility horizon? * (1:08:36) Safety guarantees, speed of improvements, the “Pause AI” letter* (1:14:25) Sociotechnical lenses, narrowly technical solutions* (1:19:47) Experiments for responsibly integrating AI systems into society* (1:26:53) Helpful/honest/harmless and antagonistic AI systems* (1:33:35) Managing incentives conducive to developing technology in the public interest* (1:40:27) Interdisciplinary academic work, disciplinary purity, power in academia* (1:46:54) How we can help legitimize and support interdisciplinary work* (1:50:07) OutroLinks:* Seth's Linktree and Twitter* Resources* Attention, moral skill, and algorithmic recommendation* Catastrophic AI Risk slides Get full access to The Gradient at thegradientpub.substack.com/subscribe
Episode 123I spoke with Suhail Doshi about:* Why benchmarks aren't prepared for tomorrow's AI models* How he thinks about artists in a world with advanced AI tools* Building a unified computer vision model that can generate, edit, and understand pixels. Suhail is a software engineer and entrepreneur known for founding Mixpanel, Mighty Computing, and Playground AI (they're hiring!).Reach me at editor@thegradient.pub for feedback, ideas, guest suggestions. Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSSFollow The Gradient on TwitterOutline:* (00:00) Intro* (00:54) Ad read — MLOps conference* (01:30) Suhail is *not* in pivot hell but he *is* all-in on 50% AI-generated music* (03:45) AI and music, similarities to Playground* (07:50) Skill vs. creative capacity in art* (12:43) What we look for in music and art* (15:30) Enabling creative expression* (18:22) Building a unified computer vision model, underinvestment in computer vision* (23:14) Enhancing the aesthetic quality of images: color and contrast, benchmarks vs user desires* (29:05) “Benchmarks are not prepared for how powerful these models will become”* (31:56) Personalized models and personalized benchmarks* (36:39) Engaging users and benchmark development* (39:27) What a foundation model for graphics requires* (45:33) Text-to-image is insufficient* (46:38) DALL-E 2 and Imagen comparisons, FID* (49:40) Compositionality* (50:37) Why Playground focuses on images vs. 3d, video, etc.* (54:11) Open source and Playground's strategy* (57:18) When to stop open-sourcing?* (1:03:38) Suhail's thoughts on AGI discourse* (1:07:56) OutroLinks:* Playground homepage* Suhail on Twitter Get full access to The Gradient at thegradientpub.substack.com/subscribe
Episode 122I spoke with Azeem Azhar about:* The speed of progress in AI* Historical context for some of the terminology we use and how we think about technology* What we might want our future to look likeAzeem is an entrepreneur, investor, and adviser. He is the creator of Exponential View, a global platform for in-depth technology analysis, and the host of the Bloomberg Original series Exponentially.Reach me at editor@thegradient.pub for feedback, ideas, guest suggestions. Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSSFollow The Gradient on TwitterOutline:* (00:00) Intro* (00:32) Ad read — MLOps conference* (01:05) Problematizing the term “exponential”* (07:35) Moore's Law as social contract, speed of technological growth and impedances* (14:45) Academic incentives, interdisciplinary work, rational agents and historical context* (21:24) Monolithic scaling* (26:38) Investment in scaling* (31:22) On Sam Altman* (36:25) Uses of “AGI,” “intelligence”* (41:32) Historical context for terminology* (48:58) AI and teaching* (53:51) On the technology-human divide* (1:06:26) New technologies and the futures we want* (1:10:50) Inevitability narratives* (1:17:01) Rationality and objectivity* (1:21:13) Cultural affordances and intellectual history* (1:26:15) Centralized and decentralized AI systems* (1:32:54) Instruction tuning and helpful/honest/harmless* (1:39:18) Azeem's future outlook * (1:46:15) OutroLinks:* Azeem's website and Twitter* Exponential View Get full access to The Gradient at thegradientpub.substack.com/subscribe
Episode 122I spoke with Professor David Thorstad about:* The practical difficulties of doing interdisciplinary work* Why theories of human rationality should account for boundedness, heuristics, and other cognitive limitations* Why EA epistemics suck (ok, it's a little more nuanced than that)Professor Thorstad is an Assistant Professor of Philosophy at Vanderbilt University, a Senior Research Affiliate at the Global Priorities Institute at Oxford, and a Research Affiliate at the MINT Lab at Australian National University. One strand of his research asks how cognitively limited agents should decide what to do and believe. A second strand asks how altruists should use limited funds to do good effectively.Reach me at editor@thegradient.pub for feedback, ideas, guest suggestions. Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSSFollow The Gradient on TwitterOutline:* (00:00) Intro* (01:15) David's interest in rationality* (02:45) David's crisis of confidence, models abstracted from psychology* (05:00) Blending formal models with studies of the mind* (06:25) Interaction between academic communities* (08:24) Recognition of and incentives for interdisciplinary work* (09:40) Movement towards interdisciplinary work* (12:10) The Standard Picture of rationality* (14:11) Why the Standard Picture was attractive* (16:30) Violations of and rebellion against the Standard Picture* (19:32) Mistakes made by critics of the Standard Picture* (22:35) Other competing programs vs Standard Picture* (26:27) Characterizing Bounded Rationality* (27:00) A worry: faculties criticizing themselves* (29:28) Self-improving critique and longtermism* (30:25) Central claims in bounded rationality and controversies* (32:33) Heuristics and formal theorizing* (35:02) Violations of Standard Picture, vindicatory epistemology* (37:03) The Reason Responsive Consequentialist View (RRCV)* (38:30) Objective and subjective pictures* (41:35) Reason responsiveness* (43:37) There are no epistemic norms for inquiry* (44:00) Norms vs reasons* (45:15) Arguments against epistemic nihilism for belief* (47:30) Norms and self-delusion* (49:55) Difficulty of holding beliefs for pragmatic reasons* (50:50) The Gibbardian picture, inquiry as an action* (52:15) Thinking how to act and thinking how to live — the power of inquiry* (53:55) Overthinking and conducting inquiry* (56:30) Is thinking how to inquire as an all-things-considered matter?* (58:00) Arguments for the RRCV* (1:00:40) Deciding on minimal criteria for the view, stereotyping* (1:02:15) Eliminating stereotypes from the theory* (1:04:20) Theory construction in epistemology and moral intuition* (1:08:20) Refusing theories for moral reasons and disciplinary boundaries* (1:10:30) The argument from minimal criteria, evaluating against competing views* (1:13:45) Comparing to other theories* (1:15:00) The explanatory argument* (1:17:53) Parfit and Railton, norms of friendship vs utility* (1:20:00) Should you call out your friend for being a womanizer* (1:22:00) Vindicatory Epistemology* (1:23:05) Panglossianism and meliorative epistemology* (1:24:42) Heuristics and recognition-driven investigation* (1:26:33) Rational inquiry leading to irrational beliefs — metacognitive processing* (1:29:08) Stakes of inquiry and costs of metacognitive processing* (1:30:00) When agents are incoherent, focuses on inquiry* (1:32:05) Indirect normative assessment and its consequences* (1:37:47) Against the Singularity Hypothesis* (1:39:00) Superintelligence and the ontological argument* (1:41:50) Hardware growth and general intelligence growth, AGI definitions* (1:43:55) Difficulties in arguing for hyperbolic growth* (1:46:07) Chalmers and the proportionality argument* (1:47:53) Arguments for/against diminishing growth, research productivity, Moore's Law* (1:50:08) On progress studies* (1:52:40) Improving research productivity and technology growth* (1:54:00) Mistakes in the moral mathematics of existential risk, longtermist epistemics* (1:55:30) Cumulative and per-unit risk* (1:57:37) Back and forth with longtermists, time of perils* (1:59:05) Background risk — risks we can and can't intervene on, total existential risk* (2:00:56) The case for longtermism is inflated* (2:01:40) Epistemic humility and longtermism* (2:03:15) Knowledge production — reliable sources, blog posts vs peer review* (2:04:50) Compounding potential errors in knowledge* (2:06:38) Group deliberation dynamics, academic consensus* (2:08:30) The scope of longtermism* (2:08:30) Money in effective altruism and processes of inquiry* (2:10:15) Swamping longtermist options* (2:12:00) Washing out arguments and justified belief* (2:13:50) The difficulty of long-term forecasting and interventions* (2:15:50) Theory of change in the bounded rationality program* (2:18:45) OutroLinks:* David's homepage and Twitter and blog* Papers mentioned/read* Bounded rationality and inquiry* Why bounded rationality (in epistemology)?* Against the newer evidentialists* The accuracy-coherence tradeoff in cognition* There are no epistemic norms of inquiry* Permissive metaepistemology* Global priorities and effective altruism* What David likes about EA* Against the singularity hypothesis (+ blog posts)* Three mistakes in the moral mathematics of existential risk (+ blog posts)* The scope of longtermism* Epistemics Get full access to The Gradient at thegradientpub.substack.com/subscribe
Episode 121I spoke with Professor Ryan Tibshirani about:* Differences between the ML and statistics communities in scholarship, terminology, and other areas* Trend filtering* Why you can't just use garbage prediction functions when doing conformal predictionRyan is a Professor in the Department of Statistics at UC Berkeley. He is also a Principal Investigator in the Delphi group. From 2011-2022, he was a faculty member in Statistics and Machine Learning at Carnegie Mellon University. From 2007-2011, he did his Ph.D. in Statistics at Stanford University.Reach me at editor@thegradient.pub for feedback, ideas, guest suggestions. Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSSFollow The Gradient on TwitterOutline:* (00:00) Intro* (01:10) Ryan's background and path into statistics* (07:00) Cultivating taste as a researcher* (11:00) Conversations within the statistics community* (18:30) Use of terms, disagreements over stability and definitions* (23:05) Nonparametric Regression* (23:55) Background on trend filtering* (33:48) Analysis and synthesis frameworks in problem formulation* (39:45) Neural networks as a specific take on synthesis* (40:55) Divided differences, falling factorials, and discrete splines* (41:55) Motivations and background* (48:07) Divided differences vs. derivatives, approximation and efficiency* (51:40) Conformal prediction* (52:40) Motivations* (1:10:20) Probabilistic guarantees in conformal prediction, choice of predictors* (1:14:25) Assumptions: i.i.d. and exchangeability — conformal prediction beyond exchangeability* (1:25:00) Next directions* (1:28:12) Epidemic forecasting — COVID-19 impact and trends survey* (1:29:10) Survey methodology* (1:38:20) Data defect correlation and its limitations for characterizing datasets* (1:46:14) OutroLinks:* Ryan's homepage* Works read/mentioned* Nonparametric Regression* Adaptive Piecewise Polynomial Estimation via Trend Filtering (2014) * Divided Differences, Falling Factorials, and Discrete Splines: Another Look at Trend Filtering and Related Problems (2020)* Distribution-free Inference* Distribution-Free Predictive Inference for Regression (2017)* Conformal Prediction Under Covariate Shift (2019)* Conformal Prediction Beyond Exchangeability (2023)* Delphi and COVID-19 research* Flexible Modeling of Epidemics* Real-Time Estimation of COVID-19 Infections* The US COVID-19 Trends and Impact Survey and Big data, big problems: Responding to “Are we there yet?” Get full access to The Gradient at thegradientpub.substack.com/subscribe
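As a companion to the conformal prediction thread in this episode, here is a minimal sketch of split conformal prediction in Python. It is an illustration of the standard recipe, not code from the conversation; the function names and the placeholder predictor are assumptions for the example. The point it makes concrete: under exchangeability, the coverage guarantee holds for any fitted predictor, so a "garbage" prediction function still yields valid intervals, just uselessly wide ones.

```python
# A minimal sketch of split conformal prediction (illustrative, not from the episode).
import numpy as np

def split_conformal(fit, X_train, y_train, X_cal, y_cal, X_test, alpha=0.1):
    model = fit(X_train, y_train)              # any black-box fitting procedure
    resid = np.abs(y_cal - model(X_cal))       # residuals on held-out calibration data
    n = len(resid)
    level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)  # finite-sample correction
    q = np.quantile(resid, level, method="higher")
    pred = model(X_test)
    return pred - q, pred + q                  # covers the truth with prob >= 1 - alpha

# Even a deliberately bad predictor keeps coverage; its intervals are just wide:
mean_only_fit = lambda X, y: (lambda X_new: np.full(len(X_new), y.mean()))
```

Validity here is marginal and assumption-light, but usefulness (interval width) still depends entirely on the quality of the underlying predictor, which is the point of the "garbage prediction functions" remark above.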
In episode 120 of The Gradient Podcast, Daniel Bashir speaks to Sasha Luccioni.Sasha is the AI and Climate Lead at HuggingFace, where she spearheads research, consulting, and capacity-building to elevate the sustainability of AI systems. A founding member of Climate Change AI (CCAI) and a board member of Women in Machine Learning (WiML), Sasha is passionate about catalyzing impactful change, organizing events and serving as a mentor to under-represented minorities within the AI community.Have suggestions for future podcast guests (or other feedback)? Let us know here or reach Daniel at editor@thegradient.pubSubscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSSFollow The Gradient on TwitterOutline:* (00:00) Intro* (00:43) Sasha's background* (01:52) How Sasha became interested in sociotechnical work* (03:08) Larger models and theory of change for AI/climate work* (07:18) Quantifying emissions for ML systems* (09:40) Aggregate inference vs training costs* (10:22) Hardware and data center locations* (15:10) More efficient hardware vs. bigger models — Jevons paradox* (17:55) Uninformative experiments, takeaways for individual scientists, knowledge sharing, failure reports* (27:10) Power Hungry Processing: systematic comparisons of ongoing inference costs* (28:22) General vs. task-specific models* (31:20) Architectures and efficiency* (33:45) Sequence-to-sequence architectures vs. decoder-only* (36:35) Hardware efficiency/utilization* (37:52) Estimating the carbon footprint of Bloom and lifecycle assessment* (40:50) Stable Bias* (46:45) Understanding model biases and representations* (52:07) Future work* (53:45) Metaethical perspectives on benchmarking for AI ethics* (54:30) “Moral benchmarks”* (56:50) Reflecting on “ethicality” of systems* (59:00) Transparency and ethics* (1:00:05) Advice for picking research directions* (1:02:58) OutroLinks:* Sasha's homepage and Twitter* Papers read/discussed* Climate Change / Carbon Emissions of AI Models* Quantifying the Carbon Emissions of Machine Learning* Power Hungry Processing: Watts Driving the Cost of AI Deployment?* Tackling Climate Change with Machine Learning* CodeCarbon* Responsible AI* Stable Bias: Analyzing Societal Representations in Diffusion Models* Metaethical Perspectives on ‘Benchmarking’ AI Ethics* Measuring Data* Mind your Language (Model): Fact-Checking LLMs and their Role in NLP Research and Practice Get full access to The Gradient at thegradientpub.substack.com/subscribe
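The emissions-quantification methodology that comes up early in this conversation reduces to a short calculation: energy drawn by the hardware, scaled by data-center overhead (PUE), multiplied by the carbon intensity of the local grid. Below is a back-of-the-envelope sketch in that spirit; the function and every number in it are placeholder assumptions for illustration, not figures from the episode or from CodeCarbon.

```python
# Back-of-the-envelope training-emissions estimate: kWh consumed times the
# grid's carbon intensity, with a PUE multiplier for data-center overhead.
# All inputs below are illustrative placeholders.
def training_emissions_kg(gpu_count, avg_power_watts, hours,
                          pue=1.2, grid_kgco2e_per_kwh=0.4):
    energy_kwh = gpu_count * (avg_power_watts / 1000) * hours * pue
    return energy_kwh * grid_kgco2e_per_kwh

# e.g. 64 GPUs averaging 300 W for two weeks on a 0.4 kgCO2e/kWh grid:
print(round(training_emissions_kg(64, 300, 24 * 14)))  # ~3097 kg CO2e
```

Because grid carbon intensity can vary by an order of magnitude between regions, where the hardware runs often matters as much as how long it runs, which is the data-center-location point in the outline.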
In episode 119 of The Gradient Podcast, Daniel Bashir speaks to Professor Michael Sipser.Professor Sipser is the Donner Professor of Mathematics and member of the Computer Science and Artificial Intelligence Laboratory at MIT.He received his PhD from UC Berkeley in 1980 and joined the MIT faculty that same year. He was Chairman of Applied Mathematics from 1998 to 2000 and served as Head of the Mathematics Department 2004-2014. He served as interim Dean of Science 2013-2014 and then as Dean of Science 2014-2020.He was a research staff member at IBM Research in 1980, spent the 1985-86 academic year on the faculty of the EECS department at Berkeley and at MSRI, and was a Lady Davis Fellow at Hebrew University in 1988. His research areas are in algorithms and complexity theory, specifically efficient error correcting codes, interactive proof systems, randomness, quantum computation, and establishing the inherent computational difficulty of problems. He is the author of the widely used textbook, Introduction to the Theory of Computation (Third Edition, Cengage, 2012).Have suggestions for future podcast guests (or other feedback)? Let us know here or reach Daniel at editor@thegradient.pubSubscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSSFollow The Gradient on TwitterOutline:* (00:00) Intro* (01:40) Professor Sipser's background* (04:35) On interesting questions* (09:00) Different kinds of research problems* (13:00) What makes certain problems difficult* (18:48) Nature of the P vs NP problem* (24:42) Identifying interesting problems* (28:50) Lower bounds on the size of sweeping automata* (29:50) Why sweeping automata + headway to P vs. NP* (36:40) Insights from sweeping automata, infinite analogues to finite automata problems* (40:45) Parity circuits* (43:20) Probabilistic restriction method* (47:20) Relativization and the polynomial time hierarchy* (55:10) P vs. NP* (57:23) The non-connection between GO's polynomial space hardness and AlphaGo* (1:00:40) On handicapping Turing Machines vs. oracle strategies* (1:04:25) The Natural Proofs Barrier and approaches to P vs. NP* (1:11:05) Debates on methods for P vs. NP* (1:15:04) On the possibility of solving P vs. NP* (1:18:20) On academia and its role* (1:27:51) OutroLinks:* Professor Sipser's homepage* Papers discussed/read* Halting space-bounded computations (1978)* Lower bounds on the size of sweeping automata (1979)* GO is Polynomial-Space Hard (1980)* A complexity theoretic approach to randomness (1983)* Parity, circuits, and the polynomial-time hierarchy (1984)* A follow-up to Furst-Saxe-Sipser* The Complexity of Finite Functions (1991) Get full access to The Gradient at thegradientpub.substack.com/subscribe
In episode 118 of The Gradient Podcast, Daniel Bashir speaks to Andrew Lee.Andrew is co-founder and CEO of Shortwave, a company dedicated to building a better product experience for email, particularly by leveraging AI. He previously co-founded and was CTO at Firebase.Have suggestions for future podcast guests (or other feedback)? Let us know here or reach Daniel at editor@thegradient.pubSubscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSSFollow The Gradient on TwitterOutline:* (00:00) Intro* (01:43) Andrew's previous work, Firebase* (04:48) Benefits of lacking experience in building Firebase* (08:55) On “abstract reasoning” vs empirical capabilities* (10:30) Shortwave's AI system as a black box* (11:55) Motivations for Shortwave* (17:10) Why is Google not innovating on email?* (21:53) Shortwave's overarching product vision and pivots* (27:40) Shortwave AI features* (33:20) AI features for email and security concerns* (35:45) Shortwave's AI Email Assistant + architecture* (43:40) Issues with chaining LLM calls together* (45:25) Understanding implicit context in utterances, modularization without loss of context* (48:56) Performance for AI assistant, batching and pipelining* (55:10) Prompt length* (57:00) On shipping fast* (1:00:15) AI improvements that Andrew is following* (1:03:10) OutroLinks:* Andrew's blog and Twitter* Shortwave* Introducing Ghostwriter* Everything we shipped for AI Launch Week* A deep dive into the world's smartest email AI Get full access to The Gradient at thegradientpub.substack.com/subscribe
“You get more of what you engage with. Everyone who complains about coverage should understand that every click, every quote tweet, every argument is registered by these publications as engagement. If what you want is really meaty, dispassionate, balanced, and fair explainers, you need to click on that, you need to read the whole thing, you need to share it, talk about it, comment on it. We get the media that we deserve.”In episode 117 of The Gradient Podcast, Daniel Bashir speaks to Joss Fong.Joss is a producer focused on science and technology, and was a founding member of the Vox video team. Her work has been recognized by the AAAS Kavli Science Journalism Awards, the Online Journalism Awards, and the News & Documentary Emmys. She holds a master's degree in science, health, and environmental reporting from NYU.Have suggestions for future podcast guests (or other feedback)? Let us know here or reach us at editor@thegradient.pubSubscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSSFollow The Gradient on TwitterOutline:* (00:00) Intro* (01:32) Joss's path into videomaking, J-school* (07:45) Consumption and creation in explainer journalism* (10:45) Finding clarity in information* (13:15) Communication of ML research* (15:55) Video journalism and science communication as separate and overlapping disciplines* (19:41) Evolution of videos and videomaking* (26:33) Explaining AI and communicating mental models* (30:47) Meeting viewers in the middle, competing for attention* (34:07) Explanatory techniques in Glad You Asked* (37:10) Storytelling and communicating scientific information* (40:57) “Is Beauty Culture Hurting Us?” and participating in video narratives* (46:37) AI beauty filters* (52:59) Obvious bias in generative AI* (59:31) Definitions and ideas of progress, humanities and technology* (1:05:08) “Iterative development” and outsourcing quality control to the public* (1:07:10) Disagreement about (tech) journalism's purpose* (1:08:51) Incentives in newsrooms and journalistic organizations* (1:12:04) AI for video generation and implications, limits of creativity* (1:17:20) Skill and creativity* (1:22:35) Joss's new YouTube channel!* (1:23:29) OutroLinks:* Joss's website and playlist of selected work* AI-focused videos* AI Art, Explained (2022)* AI can do your homework. Now what? (2023)* Computers just got a lot better at writing (2020)* Facebook showed this ad to 95% women. Is that a problem? (2020)* What facial recognition steals from us (2019)* The big debate about the future of work (2017)* AI and Creativity short film for Runway's AIFF (2023)* Others* Is Beauty Culture Hurting Us? from Glad You Asked (2020)* Joss's Scientific American videos :) Get full access to The Gradient at thegradientpub.substack.com/subscribe
In episode 116 of The Gradient Podcast, Daniel Bashir speaks to Kate Park. Kate is the Director of Product at Scale AI. Prior to joining Scale, Kate worked on Tesla Autopilot as the AI team's first and lead product manager building the industry's first data engine. She has also published research on spoken natural language processing, as well as a travel memoir.Have suggestions for future podcast guests (or other feedback)? Let us know here or reach us at editor@thegradient.pubSubscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSSFollow The Gradient on TwitterOutline:* (00:00) Intro* (01:11) Kate's background* (03:22) Tesla and cameras vs. Lidar, importance of data* (05:12) “Data is key”* (07:35) Data vs. architectural improvements* (09:36) Effort for data scaling* (10:55) Transfer of capabilities in self-driving* (13:44) Data flywheels and edge cases, deployment* (15:48) Transition to Scale* (18:52) Perspectives on shifting to transformers and data* (21:00) Data engines for NLP vs. for vision* (25:32) Model evaluation for LLMs in data engines* (27:15) InstructGPT and data for RLHF* (29:15) Benchmark tasks for assessing potential labelers* (32:07) Biggest challenges for data engines* (33:40) Expert AI trainers* (36:22) Future work in data engines* (38:25) Need for human labeling when bootstrapping new domains or tasks* (41:05) OutroLinks:* Scale Data Engine* OpenAI case study Get full access to The Gradient at thegradientpub.substack.com/subscribe
In episode 115 of The Gradient Podcast, Daniel Bashir speaks to Ben Wellington.Ben is the Deputy Head of Feature Forecasting at Two Sigma, a financial sciences company. Ben has been at Two Sigma for more than 15 years, and currently leads efforts focused on natural language processing and feature forecasting. He is also the author of data science blog I Quant NY, which has influenced local government policy, including changes in NYC street infrastructure and the design of NYC subway vending machines. Ben is a Visiting Assistant Professor in the Urban and Community Planning program at the Pratt Institute in Brooklyn where he teaches statistics using urban open data. He holds a Ph.D. in Computer Science from New York University.Have suggestions for future podcast guests (or other feedback)? Let us know here or reach us at editor@thegradient.pubSubscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSSFollow The Gradient on TwitterOutline:* (00:00) Intro* (01:30) Ben's background* (04:30) Why Ben was interested in NLP* (05:48) Ben's work on translational equivalence, dominant techniques* (10:14) Scaling, large datasets at Two Sigma* (12:50) Applying ML techniques to quantitative finance, features in financial ML systems* (17:27) Baselines and time-dependence in constructing features, human knowledge* (19:23) Black box models in finance* (24:00) Two Sigma's presence in the AI research community* (26:55) Short- and long-term research initiatives at Two Sigma* (30:42) How ML fits into Two Sigma's investment strategy* (34:05) Alpha and competition in investing* (36:13) Temporality in data* (40:38) Challenges for finance/AI and beating the market* (44:36) Reproducibility* (49:47) I Quant NY and storytelling with data* (56:43) Descriptive statistics and stories* (1:01:05) Benefits of simple methods* (1:07:11) OutroLinks:* Ben's work on translational equivalence and scalable discriminative learning* Two Sigma Insights* Storytelling with data and I Quant NY Get full access to The Gradient at thegradientpub.substack.com/subscribe
“There is this move from generality in a relative sense of ‘we are not as specialized as insects' to generality in the sense of omnipotent, omniscient, godlike capabilities. And I think there's something very dangerous that happens there, which is you start thinking of the word ‘general' in completely unhinged ways.”In episode 114 of The Gradient Podcast, Daniel Bashir speaks to Venkatesh Rao. Venkatesh is a writer and consultant. He has been writing the widely read Ribbonfarm blog since 2007, and more recently, the popular Ribbonfarm Studio Substack newsletter. He is the author of Tempo, a book on timing and decision-making, and is currently working on his second book, on the foundations of temporality. He has been an independent consultant since 2011, supporting senior executives in the technology industry. His work in recent years has focused on AI, semiconductor, sustainability, and protocol technology sectors. He holds a PhD in control theory (2003) from the University of Michigan. He is currently based in the Seattle area, and enjoys dabbling in robotics in his spare time. You can learn more about his work at venkateshrao.comHave suggestions for future podcast guests (or other feedback)? Let us know here or reach us at editor@thegradient.pubSubscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSSFollow The Gradient on TwitterOutline:* (00:00) Intro* (01:38) Origins of Ribbonfarm and Venkat's academic background* (04:23) Voice and recurring themes in Venkat's work* (11:45) Patch models and multi-agent systems: integrating philosophy of language, balancing realism with tractability* (21:00) More on abstractions vs. tractability in Venkat's work* (29:07) Scaling of industrial value systems, characterizing AI as a discipline* (39:25) Emergent science, intelligence and abstractions, presuppositions in science, generality and universality, cameras and engines* (55:05) Psychometric terms* (1:09:07) Inductive biases (yes I mentioned the No Free Lunch Theorem and then just talked about the definition of inductive bias and not the actual theorem)
In episode 113 of The Gradient Podcast, Daniel Bashir speaks to Professor Sasha Rush.Professor Rush is an Associate Professor at Cornell University and a Researcher at HuggingFace. His research aims to develop natural language processing systems that are safe, fast, and controllable. His group is interested primarily in tasks that involve text generation, and they study data-driven probabilistic methods that combine deep-learning based models with probabilistic controls. He is also interested in open-source NLP and deep learning, and develops projects to make deep learning systems safer, clearer, and easier to use.Have suggestions for future podcast guests (or other feedback)? Let us know here or reach us at editor@thegradient.pubSubscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSSFollow The Gradient on TwitterOutline:* (00:00) Intro* (01:47) Professor Rush's background* (03:23) Professor Rush's reflections on prior work—importance of learning and inference* (04:58) How much engineering matters in deep learning, the Rush vs. Frankle Bet* (07:12) On encouraging and incubating good research* (10:50) Features of good research environments* (12:36) 5% bets in Professor Rush's research: State-Space Models (SSMs) as an alternative to Transformers* (15:58) SSMs vs. Transformers* (18:53) Probabilistic Context-Free Grammars—are (P)CFGs worth paying attention to?* (20:53) Sequence-level knowledge distillation: approximating sequence-level distributions* (25:08) Pruning and knowledge distillation — orthogonality of efficiency techniques* (26:33) Broader thoughts on efficiency* (28:31) Works on prompting* (28:58) Prompting and In-Context Learning* (30:05) Thoughts on mechanistic interpretability* (31:25) Multitask prompted training enables zero-shot task generalization* (33:48) How many data points is a prompt worth? * (35:13) Directions for controllability in LLMs* (39:11) Controllability and safety* (41:23) Open-source work, deep learning libraries* (42:08) A story about Professor Rush's post-doc at FAIR* (43:51) The impact of PyTorch* (46:08) More thoughts on deep learning libraries* (48:48) Levels of abstraction, PyTorch as an interface to motivate research* (50:23) Empiricism and research commitments* (53:32) OutroLinks:* Research* Early work / PhD* Dual Decomposition and LP Relaxations* Vine Pruning for Efficient Multi-Pass Dependency Parsing* Improved Parsing and POS Tagging Using Inter-Sentence Dependency Constraints* Research — interpretable and controllable natural language generation* Compound Probabilistic Context-Free Grammars for Grammar Induction* Multitask prompted training enables zero-shot task generalization* Research — deep generative models* A Neural Attention Model for Abstractive Sentence Summarization* Learning Neural Templates for Text Generation* How many data points is a prompt worth?* Research — efficient algorithms and hardware for speech, translation, dialogue* Sequence-Level Knowledge Distillation* Open-source work* NamedTensor* Torch Struct Get full access to The Gradient at thegradientpub.substack.com/subscribe
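One outline item above worth unpacking is sequence-level knowledge distillation, since "approximating sequence-level distributions" is doing real work: instead of matching the teacher's per-token probabilities, the Kim and Rush approach approximates the teacher's distribution over whole output sequences by its beam-search mode, then trains the student on those decoded outputs with ordinary cross-entropy. Here is a minimal sketch of the data side, with `teacher_decode` as a hypothetical stand-in for beam search under any teacher model, not a specific library API.

```python
# Sequence-level knowledge distillation (Kim & Rush, 2016), data side: replace
# each gold target with the teacher's decoded output, then train the student on
# the resulting (source, teacher_output) pairs with standard cross-entropy.
from typing import Callable, Iterable, List, Tuple

def distill_dataset(teacher_decode: Callable[[str], str],
                    sources: Iterable[str]) -> List[Tuple[str, str]]:
    return [(src, teacher_decode(src)) for src in sources]

# Toy usage with a stand-in "teacher" (a real one would run beam search):
pairs = distill_dataset(lambda s: s.upper(), ["hello world", "gradient podcast"])
# -> [('hello world', 'HELLO WORLD'), ('gradient podcast', 'GRADIENT PODCAST')]
```

Because the method only changes the training data, it composes naturally with pruning and other compression, which is presumably the orthogonality point at (25:08).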
In episode 112 of The Gradient Podcast, Daniel Bashir speaks to Cameron Jones and Sean Trott.Cameron is a PhD candidate in the Cognitive Science Department at the University of California, San Diego. His research compares how humans and large language models process language about world knowledge, situation models, and theory of mind.Sean is an Assistant Teaching Professor in the Cognitive Science Department at the University of California, San Diego. His research interests include probing large language models, ambiguity in languages, how ambiguous words are represented, and pragmatic inference. He previously completed his PhD at UCSD.Have suggestions for future podcast guests (or other feedback)? Let us know here or reach us at editor@thegradient.pubSubscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSSFollow The Gradient on TwitterOutline:* (00:00) Intro* (02:55) Cameron's background* (06:00) Sean's background* (08:15) Unexpected capabilities of language models and the need for embodiment to understand meaning* (11:05) Interpreting results of Turing tests, separating what humans and LLMs do when behaving as though they “understand”* (14:27) Internal mechanisms, interpretability, how we test theories* (16:40) Languages are efficient, but for whom? * (17:30) Initial motivations: lexical ambiguity * (19:20) The balance of meanings across wordforms* (22:35) Tension between speaker- and comprehender-oriented pressures in lexical ambiguity* (25:05) Context and potential vs. realized ambiguity* (27:15) LLM-ology* (28:30) Studying LLMs as models of human cognition and as interesting objects of study in their own right* (30:03) Example of explaining away effects* (33:54) The internalist account of belief sensitivity—behavior and internal representations* (37:43) LLMs and the False Belief Task* (42:05) Hypothetical on observed behavior and inferences about internal representations* (48:05) Distributional Semantics Still Can't Account for Affordances* (50:25) Tests of embodied theories and limitations of distributional cues* (53:54) Multimodal models and object affordances* (58:30) Language and grounding, other buzzwords* (59:45) How could we know if LLMs understand language?* (1:04:50) Reference: as a thing words do vs. ontological notion* (1:11:38) The Role of Physical Inference in Pronoun Resolution* (1:16:40) World models and world knowledge* (1:19:45) EPITOME* (1:20:20) The different tasks* (1:26:43) Confounders / “attending” in LM performance on tasks* (1:30:30) Another hypothetical, on theory of mind* (1:32:26) How much information can language provide in service of mentalizing? * (1:35:14) Convergent validity and coherence/validity of theory of mind* (1:39:30) Interpretive questions about behavior w/r/t theory of mind* (1:43:35) Does GPT-4 Pass the Turing Test?* (1:44:00) History of the Turing Test* (1:47:05) Interrogator strategies and the strength of the Turing Test* (1:52:15) “Internal life” and personality* (1:53:30) How should this research impact how we assess / think about LLM abilities? * (1:58:56) OutroLinks:* Cameron's homepage and Twitter* Sean's homepage and Twitter* Research — Language and NLP* Languages are efficient, but for whom?* Research — LLM-ology* Do LLMs know what humans know?* Distributional Semantics Still Can't Account for Affordances* In Cautious Defense of LLM-ology* Should Psycholinguists use LLMs as “model organisms”?* (Re)construing Meaning in NLP* Research — language and grounding, theory of mind, reference [insert other buzzwords here]* Do LLMs have a “theory of mind”?* How could we know if LLMs understand language?* Does GPT-4 Pass the Turing Test?* Could LMs change language?* The extended mind and why it matters for cognitive science research* EPITOME* The Role of Physical Inference in Pronoun Resolution Get full access to The Gradient at thegradientpub.substack.com/subscribe
In episode 111 of The Gradient Podcast, Daniel Bashir speaks to Nicholas Thompson.Nicholas is the CEO of The Atlantic. Previously, he served as editor-in-chief of Wired and editor of NewYorker.com. Nick also cofounded Atavist, which sold to Automattic in 2018. Publications under Nick's leadership have won numerous National Magazine Awards and Pulitzer Prizes, and one WIRED story he edited was the basis for the movie Argo. Nick is also the co-founder of Speakeasy AI, a software platform designed to foster constructive online conversations about the world's most pressing problems.Have suggestions for future podcast guests (or other feedback)? Let us know here or reach us at editor@thegradient.pubSubscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSSFollow The Gradient on TwitterOutline:* (00:00) Intro* (02:12) Nick's path into journalism* (03:25) The Washington Monthly — a turning point* (05:09) Perspectives from different positions in the journalism industry* (08:16) What is great journalism?* (09:42) Example from The Atlantic* (11:00) Other examples/pieces of good journalism* (12:20) Pieces on aging* (12:56) Mortality and life-force associated with running — Nick's piece in WIRED* (15:30) On urgency* (18:20) The job of an editor* (22:23) AI in journalism — benefits and limitations* (26:45) How AI can help writers, experimentation* (28:40) Examples of AI in journalism and issues: CNET, Sports Illustrated, Nick's thoughts on how AI should be used in journalism* (32:20) Speakeasy AI and creating healthy conversation spaces* (34:00) Details about Speakeasy* (35:12) Business pivots and business model trouble* (35:37) Remaining gaps in fixing conversational spaces* (38:27) Lessons learned* (40:00) Nick's optimism about Speakeasy-like projects* (43:14) Social simulacra, a “Troll WestWorld,” algorithmic adjustments in social media* (46:11) Lessons and wisdom from journalism about engagement, more on engagement in social media* (50:27) Successful and unsuccessful futures for AI in journalism* (54:17) Previous warnings about synthetic media, Nick's perspective on risks from synthetic media in journalism* (57:00) Stop trying to build AGI* (59:13) OutroLinks:* Nicholas's Twitter and website* Speakeasy AI* Writing* “To Run My Best Marathon at Age 44, I Had to Outrun My Past” in WIRED* “The year AI actually changes the media business” in NiemanLab's Predictions for Journalism 2023 Get full access to The Gradient at thegradientpub.substack.com/subscribe
In episode 110 of The Gradient Podcast, Daniel Bashir speaks to Professor Subbarao Kambhampati.Professor Kambhampati is a professor of computer science at Arizona State University. He studies fundamental problems in planning and decision making, motivated by the challenges of human-aware AI systems. He is a fellow of the Association for the Advancement of Artificial Intelligence, American Association for the Advancement of Science, and Association for Computing Machinery, and was an NSF Young Investigator. He was the president of the Association for the Advancement of Artificial Intelligence, trustee of the International Joint Conference on Artificial Intelligence, and a founding board member of Partnership on AI.Have suggestions for future podcast guests (or other feedback)? Let us know here or reach us at editor@thegradient.pubSubscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSSFollow The Gradient on TwitterOutline:* (00:00) Intro* (02:11) Professor Kambhampati's background* (06:07) Explanation in AI* (18:08) What people want from explanations—vocabulary and symbolic explanations* (21:23) The realization of new concepts in explanation—analogy and grounding* (30:36) Thinking and language* (31:48) Conscious and subconscious mental activity* (36:58) Tacit and explicit knowledge* (42:09) The development of planning as a research area* (46:12) RL and planning* (47:47) What makes a planning problem hard? * (51:23) Scalability in planning* (54:48) LLMs do not perform reasoning* (56:51) How to show LLMs aren't reasoning* (59:38) External verifiers and backprompting LLMs* (1:07:51) LLMs as cognitive orthotics, language and representations* (1:16:45) Finding out what kinds of representations an AI system uses* (1:31:08) “Compiling” system 2 knowledge into system 1 knowledge in LLMs* (1:39:53) The Generative AI Paradox, reasoning and retrieval* (1:43:48) AI as an ersatz natural science* (1:44:03) Why AI is straying away from its engineering roots, and what constitutes engineering* (1:58:33) OutroLinks:* Professor Kambhampati's Twitter and homepage* Research and Writing — Planning and Human-Aware AI Systems* A Validation-structure-based theory of plan modification and reuse (1990)* Challenges of Human-Aware AI Systems (2020)* Polanyi vs. Planning (2021)* LLMs and Planning* Can LLMs Really Reason and Plan? (2023)* On the Planning Abilities of LLMs (2023)* Other* Changing the nature of AI research Get full access to The Gradient at thegradientpub.substack.com/subscribe
In episode 109 of The Gradient Podcast, Daniel Bashir speaks to Russ Maschmeyer.Russ is the Product Lead for AI and Spatial Commerce at Shopify. At Shopify, he leads a team that looks at how AI can better empower entrepreneurs, with a particular interest in how image generation can help make the lives of business owners and merchants more productive. He previously led design for multiple services at Facebook and co-founded Primer, an AR-enabled interior design marketplace.Have suggestions for future podcast guests (or other feedback)? Let us know here or reach us at editor@thegradient.pubSubscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSSFollow The Gradient on TwitterOutline:* (00:00) Intro* (01:50) Russ's background and a hacked Kinect sensor* (06:00) Instruments and emotion, embodiment and accessibility* (08:45) Natural language as input and generative AI in creating emotive experiences* (10:55) Work on search queries and recommendations at Facebook, designing for search* (16:35) AI in the retail and entrepreneurial landscape* (19:15) Shopify and AI for business owners* (22:10) Vision and directions for AI in commerce* (25:01) Personalized experiences for shopping* (28:45) Challenges for creating personalized experiences* (31:49) Intro to spatial commerce* (34:48) AR/VR devices and spatial commerce* (37:30) MR and AI for immersive product search* (41:35) Implementation details* (48:05) WonkaVision and difficulties for immersive web experiences* (52:10) Future projects and directions for spatial commerce* (55:10) OutroLinks:* Russ's Twitter and homepage* With a Wave of the Hand, Improvising on Kinect in The New York Times* Shopify Spatial Commerce Projects* MR and AI for immersive product search* A more immersive web with a simple optical illusion* What if your room had a reset button? Get full access to The Gradient at thegradientpub.substack.com/subscribe
In episode 108 of The Gradient Podcast, Daniel Bashir speaks to Professor Benjamin Breen.Professor Breen is an associate professor of history at UC Santa Cruz specializing in the history of science, medicine, globalization, and the impacts of technological change. He is the author of multiple books including The Age of Intoxication: Origins of the Global Drug Trade and the more recent Tripping on Utopia: Margaret Mead, the Cold War, and the Troubled Birth of Psychedelic Science, which you can pre-order now.Have suggestions for future podcast guests (or other feedback)? Let us know here or reach us at editor@thegradient.pubSubscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSSFollow The Gradient on TwitterOutline:* (00:00) Intro* (02:05) Professor Breen's background* (04:47) End of history narratives / millenarian thinking in AI/technology* (09:53) Transformative technological change and societal change* (16:45) AI and psychedelics* (17:23) Techno-utopianism* (26:08) Technologies as metaphors for humanity* (32:34) McLuhanist thinking / brain as a computational machine, Prof. Breen's skepticism* (37:13) Issues with overblown narratives about technology* (42:46) Narratives about transformation and their impacts on progress* (45:23) The historical importance of today's AI landscape* (50:05) International aspects of the history of technology* (53:13) Doomerism vs optimism, why doomerism is appealing* (57:58) Automation, meta-skills, jobs — advice for early career* (1:01:08) LLMs and (history) education* (1:07:10) OutroLinks:* Professor Breen's Twitter and homepage* Books* Tripping on Utopia: Margaret Mead, the Cold War, and the Troubled Birth of Psychedelic Science* The Age of Intoxication: Origins of the Global Drug Trade* Writings* Into the mystic* ‘Alien Jesus'* Simulating History with ChatGPT Get full access to The Gradient at thegradientpub.substack.com/subscribe
In episode 107 of The Gradient Podcast, Daniel Bashir speaks to Professor Ted Gibson.Ted is a Professor of Cognitive Science at MIT. He leads the TedLab, which investigates why languages look the way they do; the relationship between culture and cognition, including language; and how people learn, represent, and process language.Have suggestions for future podcast guests (or other feedback)? Let us know here or reach us at editor@thegradient.pubSubscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSSFollow The Gradient on TwitterOutline:* (00:00) Intro* (02:13) Prof Gibson's background* (05:33) The computational linguistics community and NLP, engineering focus* (10:48) Models of brains* (12:03) Prof Gibson's focus on behavioral work* (12:53) How dependency distances impact language processing* (14:03) Dependency distances and the origin of the problem* (18:53) Dependency locality theory* (21:38) The structures languages tend to use* (24:58) Sentence parsing: structural integrations and memory costs* (36:53) Reading strategies vs. ordinary language processing* (40:23) Legalese* (46:18) Cross-dependencies* (50:11) Number as a cognitive technology* (54:48) Experiments* (1:03:53) Why counting is useful for Western societies* (1:05:53) The Whorf hypothesis* (1:13:05) Language as Communication* (1:13:28) The noisy channel perspective on language processing* (1:27:08) Fedorenko lab experiments—language for thought vs. communication and Chomsky's claims* (1:43:53) Thinking without language, inner voices, language processing vs. language as an aid for other mental processing* (1:53:01) Dependency grammars and a critique of Chomsky's grammar proposals, LLMs* (2:08:48) LLM behavior and internal representations* (2:12:53) OutroLinks:* Ted's lab page and Twitter* Re-imagining our theories of language* Research — linguistic complexity and dependency locality theory* Linguistic complexity: locality of syntactic dependencies (1998)* The Dependency Locality Theory: A Distance-Based Theory of Linguistic Complexity (2000)* Consequences of the Serial Nature of Linguistic Input for Sentential Complexity (2005)* Large-scale evidence of dependency length minimization in 37 languages (2015)* Dependency locality as an explanatory principle for word order (2020)* Robust effects of working memory demand during naturalistic language comprehension in language-selective cortex (2022)* A resource-rational model of human processing of recursive linguistic structure (2022)* Research — language processing / communication and cross-linguistic universals* Number as a cognitive technology: Evidence from Pirahã language and cognition (2008)* The communicative function of ambiguity in language (2012)* The rational integration of noisy evidence and prior semantic expectations in sentence interpretation (2013)* Color naming across languages reflects color use (2017)* How Efficiency Shapes Human Language (2019) Get full access to The Gradient at thegradientpub.substack.com/subscribe
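For listeners new to the dependency-distance material in this episode, the core quantity is easy to state: total dependency length is the sum of linear distances between each word and its syntactic head, and the dependency-length-minimization result (the 37-language study linked above) is that natural-language word orders tend to keep this sum small. The toy function and example sentence below are illustrative assumptions, not materials from the episode.

```python
# Total dependency length of a sentence, given each word's 1-indexed head
# position (0 marks the root verb, which has no head).
def total_dependency_length(heads):
    return sum(abs((i + 1) - h) for i, h in enumerate(heads) if h != 0)

# "The dog chased the cat": heads of (The, dog, chased, the, cat) = (2, 3, 0, 5, 3)
print(total_dependency_length([2, 3, 0, 5, 3]))  # 1 + 1 + 1 + 2 = 5
```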
In episode 106 of The Gradient Podcast, Daniel Bashir speaks to Professor Harvey Lederman.Professor Lederman is a professor of philosophy at UT Austin. He has broad interests in contemporary philosophy and in the history of philosophy: his areas of specialty include philosophical logic, the Ming dynasty philosopher Wang Yangming, epistemology, and philosophy of language. He has recently been working on incomplete preferences, on trying in the philosophy of language, and on Wang Yangming's moral metaphysics.Have suggestions for future podcast guests (or other feedback)? Let us know here or reach us at editor@thegradient.pubSubscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSSFollow The Gradient on TwitterOutline:* (00:00) Intro* (02:15) Harvey's background* (05:30) Higher-order metaphysics and propositional attitudes* (06:25) Motivations* (12:25) Setup: syntactic types and ontological categories* (25:11) What makes higher-order languages meaningful and not vague?* (25:57) Higher-order languages corresponding to the world* (30:52) Extreme vagueness* (35:32) Desirable features of languages and important questions in philosophy* (36:42) Higher-order identity* (40:32) Intuitions about mental content, language, context-sensitivity* (50:42) Perspectivism* (51:32) Co-referring names, identity statements* (55:42) The paper's approach, “know” as context-sensitive* (57:24) Propositional attitude psychology and mentalese generalizations* (59:57) The “good standing” of theorizing about propositional attitudes* (1:02:22) Mentalese* (1:03:32) “Does knowledge imply belief?” — when a question does not have good standing* (1:06:17) Sense, Reference, and Substitution* (1:07:07) Fregeans and the principle of Substitution* (1:12:12) Follow-up work to this paper* (1:13:39) Do Language Models Produce Reference Like Libraries or Like Librarians?* (1:15:02) Bibliotechnism* (1:19:08) Inscriptions and reference, what it takes for something to refer* (1:22:37) Derivative and basic reference* (1:24:47) Intuition: n-gram models and reference* (1:28:22) Meaningfulness in sentences produced by n-gram models* (1:30:40) Bibliotechnism and LLMs, disanalogies to n-grams* (1:33:17) On other recent work (vector grounding, do LMs refer?, etc.)* (1:40:12) Causal connections and reference, how bibliotechnism makes good on the meanings of sentences* (1:45:46) RLHF, sensitivity to truth and meaningfulness* (1:48:47) Intelligibility* (1:50:52) When LLMs produce novel reference* (1:53:37) Novel reference vs. find-replace* (1:56:00) Directionality example* (1:58:22) Human intentions and derivative reference* (2:00:47) Between bibliotechnism and agency* (2:05:32) Where do invented names / novel reference come from?* (2:07:17) Further questions* (2:10:04) OutroLinks:* Harvey's homepage and Twitter* Papers discussed* Higher-order metaphysics and propositional attitudes* Perspectivism* Sense, Reference, and Substitution* Are Language Models More Like Libraries or Like Librarians? Bibliotechnism, the Novel Reference Problem, and the Attitudes of LLMs Get full access to The Gradient at thegradientpub.substack.com/subscribe
In episode 105 of The Gradient Podcast, Daniel Bashir speaks to Eric Jang.Have suggestions for future podcast guests (or other feedback)? Let us know here or reach us at editor@thegradient.pubSubscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSSFollow The Gradient on TwitterOutline:* (00:00) Intro* (01:25) Updates since Eric's last interview* (06:07) The problem space of humanoid robots* (08:42) Motivations for the book “AI is Good for You”* (12:20) Definitions of AGI* (14:35) ~ AGI timelines ~* (16:33) Do we have the ingredients for AGI?* (18:58) Rediscovering old ideas in AI and robotics* (22:13) Ingredients for AGI* (22:13) Artificial Life* (25:02) Selection at different levels of information—intelligence at different scales* (32:34) AGI as a collective intelligence* (34:53) Human in the loop learning* (37:38) From getting correct answers to doing things correctly* (40:20) Levels of abstraction for modeling decision-making — the neurobiological stack* (44:22) Implementing loneliness and other details for AGI* (47:31) Experience in AI systems* (48:46) Asking for Generalization* (49:25) Linguistic relativity* (52:17) Language vs. complex thought and Fedorenko experiments* (54:23) Efficiency in neural design* (57:20) Generality in the human brain and evolutionary hypotheses* (59:46) Embodiment and real-world robotics* (1:00:10) Moravec's Paradox and the importance of embodiment* (1:05:33) How embodiment fits into the picture—in verification vs. in learning* (1:10:45) Nonverbal information for training intelligent systems* (1:11:55) AGI and humanity* (1:12:20) The positive future with AGI* (1:14:55) The negative future — technology as a lever* (1:16:22) AI in the military* (1:20:30) How AI might contribute to art* (1:25:41) Eric's own work and a positive future for AI* (1:29:27) OutroLinks:* Eric's book* Eric's Twitter and homepage Get full access to The Gradient at thegradientpub.substack.com/subscribe
In episode 104 of The Gradient Podcast, Daniel Bashir speaks to Nathan Benaich.Nathan is Founder and General Partner at Air Street Capital, a VC firm focused on investing in AI-first technology and life sciences companies. Nathan runs a number of communities focused on AI including the Research and Applied AI Summit and leads Spinout.fyi to improve the creation of university spinouts. Nathan co-authors the State of AI Report.Have suggestions for future podcast guests (or other feedback)? Let us know here or reach us at editor@thegradient.pubSubscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSSFollow The Gradient on TwitterOutline:* (00:00) Intro* (02:00) Updates in Nathan World — Air Street's second fund, spinouts* (07:30) Events: Research and Applied AI Summit, State of AI Report launches* (09:50) The State of AI: main messages, the increasing role of subject matter experts* Research* (14:13) Open and closed-source* (17:55) Benchmarking and evaluation, small/large models and industry verticals* (21:10) “Vibes” in LLM evaluation* (24:00) Codegen models, personalized AI, curriculum learning* (26:20) The exhaustion of human-generated data, lukewarm content, synthetic data* (29:50) Opportunities for AI applications in the natural sciences* (35:15) Reinforcement Learning from Human Feedback and alternatives* (38:30) Industry* (39:00) ChatGPT and productivity* (42:37) General app wars, ChatGPT competitors* (45:50) Compute—demand, supply, competition* (50:55) Export controls and geopolitics* (54:45) Startup funding and compute spend* (59:15) Politics* (59:40) Calls for regulation, regulatory divergence* (1:04:40) AI safety* (1:07:30) Nathan's perspective on regulatory approaches* (1:12:30) The UK's early access to frontier models, standards setting, regulation difficulties* (1:17:20) Jailbreaking, constitutional AI, robustness* (1:20:50) Predictions!* (1:25:00) Generative AI misuse in elections and politics (and, this prediction coming true in Bangladesh)* (1:26:50) Progress on AI governance* (1:30:30) European dynamism* (1:35:08) OutroLinks:* Nathan's homepage and Twitter* The 2023 State of AI Report* Bringing Dynamism to European Defense* A prediction coming true: How AI is disrupting Bangladesh's election* Air Street Capital is hiring a full-time Community Lead! Get full access to The Gradient at thegradientpub.substack.com/subscribe
In episode 103 of The Gradient Podcast, Daniel Bashir speaks to Dr. Kathleen Fisher.As the director of DARPA's Information Innovation Office (I2O), Dr. Kathleen Fisher oversees a portfolio that includes most of the agency's AI-related research and development efforts, including the recent AI Forward initiative. AI Forward explores new directions for AI research that will result in trustworthy systems for national security missions. This summer, roughly 200 participants from the commercial sector, academia, and the U.S. government attended workshops that generated ideas to inform DARPA's next phase of AI exploratory projects. Dr. Fisher previously served as a program manager in I2O from 2011 to 2014. As a program manager, she conceptualized, created, and executed programs in high-assurance computing and machine learning, including Probabilistic Programming for Advancing Machine Learning (PPAML), which made building ML applications easier. She was also a co-author of a recent paper about the threats posed by large language models.Since 2018, DARPA has dedicated over $2 billion in R&D funding to AI research. DARPA has been generating groundbreaking research and development for 65 years, leading to game-changing military capabilities and icons of modern society: it initiated the research field that produced self-driving cars and developed the technology that led to Apple's Siri.Have suggestions for future podcast guests (or other feedback)? Let us know here or reach us at editor@thegradient.pubSubscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSSFollow The Gradient on TwitterOutline:* (00:00) Intro* (01:30) Kathleen's background* (05:05) Intersections between programming languages and AI* (07:15) Neuro-symbolic AI, trade-offs between flexibility and guarantees* (09:45) History of DARPA and the Information Innovation Office (I2O)* (13:55) DARPA's perspective on research* (17:10) Galvanizing a research community* (20:06) DARPA's recent investments in AI and AI Forward* (26:35) Dual-use nature of generative AI, identifying and mitigating security risks, Kathleen's perspective on short-term and long-term risk (note: the “Gradient podcast” Kathleen mentions is from Last Week in AI)* (30:10) Concerns about deployment and interaction* (32:20) Outcomes from AI Forward workshops and themes* (36:10) Incentives in building and using AI technologies, friction* (38:40) Interactions between DARPA and other government agencies* (40:09) Future research directions* (44:04) Ways to stay up to date on DARPA's work* (45:40) OutroLinks:* DARPA I2O website* Probabilistic Programming for Advancing Machine Learning (PPAML) (Archived)* Assured Neuro Symbolic Learning and Reasoning (ANSR)* AI Cyber Challenge* AI Forward* Identifying and Mitigating the Security Risks of Generative AI Paper* FoundSci Solicitation* FACT Solicitation* Semantic Forensics (SemaFor)* GARD Open Source Resources* I2O Newsletter signup Get full access to The Gradient at thegradientpub.substack.com/subscribe
In episode 102 of The Gradient Podcast, Daniel Bashir speaks to Peter Tse.Professor Tse is a Professor of Cognitive Neuroscience and chair of the Department of Psychological and Brain Sciences at Dartmouth College. His research focuses on using brain and behavioral data to constrain models of the neural bases of attention and consciousness, unconscious processing that precedes and constructs consciousness, mental causation, and human capacities for imagination and creativity. He is especially interested in the processing that goes into the construction of conscious experience between retinal activation at time 0 and seeing an event about a third of a second later.Have suggestions for future podcast guests (or other feedback)? Let us know here or reach us at editor@thegradient.pubSubscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSSFollow The Gradient on TwitterOutline:* (00:00) Intro* (01:45) Prof. Tse's background* (03:25) Early experiences in physics/math and philosophy of physics* (06:10) Choosing to study neuroscience* (07:15) Prof. Tse's commitments about determinism* (10:00) Quantum theory and determinism* (13:45) Biases/preferences in choosing theories* (20:41) Falsifiability and scientific questions, transition from physics to neuroscience* (30:50) How neuroscience is unusual among the sciences* (33:20) Neuroscience and subjectivity* (34:30) Reductionism* (37:30) Gestalt psychology* (41:30) Introspection in neuroscience* (45:30) The preconscious buffer and construction of conscious experience, color constancy* (53:00) Perceptual and cognitive inference* (55:00) AI systems and intrinsic meaning* (57:15) Information vs. meaning* (1:01:45) Consciousness and representation of bodily states* (1:05:10) Our second-order free will* (1:07:20) Jaegwon Kim's exclusion argument* (1:11:45) Why Kim thought his own argument was wrong* (1:15:00) Resistance and counterarguments to Kim* (1:19:45) Criterial causation* (1:23:00) How neurons evaluate inputs criterially* (1:24:00) Concept neurons in the hippocampus* (1:31:57) Criterial causation and physicalism, mental causation* (1:40:10) Daniel makes another attempt to push back
In episode 101 of The Gradient Podcast, Daniel Bashir speaks to Vera Liao.Vera is a Principal Researcher at Microsoft Research (MSR) Montréal where she is part of the FATE (Fairness, Accountability, Transparency, and Ethics) group. She is trained in human-computer interaction research and works on human-AI interaction, currently focusing on explainable AI and responsible AI. She aims to bridge emerging AI technologies and human-centered design practices, and uses both qualitative and quantitative methods to generate recommendations for technology design. Before joining MSR, Vera worked at IBM TJ Watson Research Center, and her work contributed to IBM products such as AI Explainability 360, Uncertainty Quantification 360, and Watson Assistant.Have suggestions for future podcast guests (or other feedback)? Let us know here or reach us at editor@thegradient.pubSubscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSSFollow The Gradient on TwitterOutline:* (00:00) Intro* (01:41) Vera's background* (07:15) The sociotechnical gap* (09:00) UX design and toolkits for AI explainability* (10:50) HCI, explainability, etc. as “separate concerns” from core AI research* (15:07) Interfaces for explanation and model capabilities* (16:55) Vera's earlier studies of online social communities* (22:10) Technologies and user behavior* (23:45) Explainability vs. interpretability, transparency* (26:25) Questioning the AI: Informing Design Practices for Explainable AI User Experiences* (42:00) Expanding Explainability: Towards Social Transparency in AI Systems* (50:00) Connecting Algorithmic Research and Usage Contexts* (59:40) Pitfalls in existing explainability methods* (1:05:35) Ideal and real users, seamful systems and slow algorithms* (1:11:08) AI Transparency in the Age of LLMs: A Human-Centered Research Roadmap* (1:11:35) Vera's earlier experiences with chatbots* (1:13:00) Need to understand pitfalls and use-cases for LLMs* (1:13:45) Perspectives informing this paper* (1:20:30) Transparency informing goals for LLM use* (1:22:45) Empiricism and explainability* (1:27:20) LLM faithfulness* (1:32:15) Future challenges for HCI and AI* (1:36:28) OutroLinks:* Vera's homepage and Twitter* Research* Earlier work* Understanding Experts' and Novices' Expertise Judgment of Twitter Users* Beyond the Filter Bubble* Expert Voices in Echo Chambers* HCI / collaboration* Exploring AI Values and Ethics through Participatory Design Fictions* Ways of Knowing for AI: (Chat)bots as Interfaces for ML* Human-AI Collaboration: Towards Socially-Guided Machine Learning* Questioning the AI: Informing Design Practices for Explainable AI User Experiences* Rethinking Model Evaluation as Narrowing the Socio-Technical Gap* Human-Centered XAI: From Algorithms to User Experiences* AI Transparency in the Age of LLMs: A Human-Centered Research Roadmap* Fairness and explainability* Questioning the AI: Informing Design Practices for Explainable AI User Experiences* Expanding Explainability: Towards Social Transparency in AI Systems* Connecting Algorithmic Research and Usage Contexts Get full access to The Gradient at thegradientpub.substack.com/subscribe
In episode 100 of The Gradient Podcast, Daniel Bashir speaks to Professor Thomas Dietterich.Professor Dietterich is Distinguished Professor Emeritus in the School of Electrical Engineering and Computer Science at Oregon State University. He is a pioneer in the field of machine learning, and has authored more than 225 refereed publications and two books. His current research topics include robust artificial intelligence, robust human-AI systems, and applications in sustainability. He is a former President of the Association for the Advancement of Artificial Intelligence, and the founding President of the International Machine Learning Society. Other major roles include Executive Editor of the journal Machine Learning, co-founder of the Journal for Machine Learning Research, and program chair of AAAI 1990 and NIPS 2000. He currently serves as one of the moderators for the cs.LG category on arXiv.Have suggestions for future podcast guests (or other feedback)? Let us know here or reach us at editor@thegradient.pubSubscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSSFollow The Gradient on TwitterOutline:* (00:00) Episode 100 Note* (02:03) Intro* (04:23) Prof. Dietterich's background* (14:20) Kuhn and theory development in AI, how Prof. Dietterich thinks about the philosophy of science and AI* (20:10) Scales of understanding and sentience, grounding, observable evidence* (23:58) Limits of statistical learning without causal reasoning, systematic understanding* (25:48) A challenge for the ML community: testing for systematicity* (26:13) Forming causal understandings of the world* (28:18) Learning at the Knowledge Level* (29:18) Background and definitions* (32:18) Knowledge and goals, a note on LLMs* (33:03) What it means to learn* (41:05) LLMs as learning results of inference without learning first principles* (43:25) System I/II thinking in humans and LLMs* (47:23) “Routine Science”* (47:38) Solving multiclass learning problems via error-correcting output codes* (52:53) Error-correcting codes and redundancy* (54:48) Why error-correcting codes work, contra intuition* (59:18) Bias in ML* (1:06:23) MAXQ for hierarchical RL* (1:15:48) Computational sustainability* (1:19:53) Project TAHMO's moonshot* (1:23:28) Anomaly detection for weather stations* (1:25:33) Robustness* (1:27:23) Motivating The Familiarity Hypothesis* (1:27:23) Anomaly detection and self-models of competence* (1:29:25) Measuring the health of freshwater streams* (1:31:55) An open set problem in species detection* (1:33:40) Issues in anomaly detection for deep learning* (1:37:45) The Familiarity Hypothesis* (1:40:15) Mathematical intuitions and the Familiarity Hypothesis* (1:44:12) What's Wrong with LLMs and What We Should Be Building Instead* (1:46:20) Flaws in LLMs* (1:47:25) The systems Prof. Dietterich wants to develop* (1:49:25) Hallucination/confabulation and LLMs vs knowledge bases* (1:54:00) World knowledge and linguistic knowledge* (1:55:07) End-to-end learning and knowledge bases* (1:57:42) Components of an intelligent system and separability* (1:59:06) Thinking through external memory* (2:01:10) OutroLinks:* Research — Fundamentals (Philosophy of AI)* Learning at the Knowledge Level* What Does it Mean for a Machine to Understand?* Research – “Routine science”* Ensemble methods in ML and error-correcting output codes* Solving multiclass learning problems via error-correcting output codes* An experimental comparison of bagging, boosting, and randomization* ML Bias, Statistical Bias, and Statistical Variance of Decision Tree Algorithms* The definitive treatment of these questions, by Gareth James* Discovering/Exploiting structure in MDPs:* MAXQ for hierarchical RL* Exogenous State MDPs (paper with George Trimponias, slides)* Research — Ecosystem Informatics and Computational Sustainability* Project TAHMO* Challenges for ML in Computational Sustainability* Research — Robustness* Steps towards robust AI (AAAI President's Address)* Benchmarking NN Robustness to Common Corruptions and Perturbations with Dan Hendrycks* The familiarity hypothesis: Explaining the behavior of deep open set methods* Recent commentary* Toward High-Reliability AI* What's Wrong with Large Language Models and What We Should Be Building Instead Get full access to The Gradient at thegradientpub.substack.com/subscribe
In episode 99 of The Gradient Podcast, Daniel Bashir speaks to Professor Martin Wattenberg.Professor Wattenberg is a professor at Harvard and part-time member of Google Research's People + AI Research (PAIR) initiative, which he co-founded. His work, with long-time collaborator Fernanda Viégas, focuses on making AI technology broadly accessible and reflective of human values. At Google, Professor Wattenberg, his team, and Professor Viégas have created end-user visualizations for products such as Search, YouTube, and Google Analytics. Note: Professor Wattenberg is recruiting PhD students through Harvard SEAS—info here.Have suggestions for future podcast guests (or other feedback)? Let us know here or reach us at editor@thegradient.pubSubscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSSFollow The Gradient on TwitterOutline:* (00:00) Intro* (03:30) Prof. Wattenberg's background* (04:40) Financial journalism at SmartMoney* (05:35) Contact with the academic visualization world, IBM* (07:30) Transition into visualizing ML* (08:25) Skepticism of neural networks in the 1980s* (09:45) Work at IBM* (10:00) Multiple scales in information graphics, organization of information* (13:55) How much information should a graphic display to whom?* (17:00) Progressive disclosure of complexity in interface design* (18:45) Visualization as a rhetorical process* (20:45) Conversation Thumbnails for Large-Scale Discussions* (21:35) Evolution of conversation interfaces—Slack, etc.* (24:20) Path dependence — mutual influences between user behaviors and technology, takeaways for ML interface design* (26:30) Baby Names and Social Data Analysis — patterns of interest in baby names* (29:50) History Flow* (30:05) Why investigate editing dynamics on Wikipedia?* (32:06) Implications of editing patterns for design and governance* (33:25) The value of visualizations in this work, issues with Wikipedia editing* (34:45) Community moderation, bureaucracy* (36:20) Consensus and guidelines* (37:10) “Neutral” point of view as an organizing principle* (38:30) Takeaways* PAIR* (39:15) Tools for model understanding and “understanding” ML systems* (41:10) Intro to PAIR (at Google)* (42:00) Unpacking the word “understanding” and use cases* (43:00) Historical comparisons for AI development* (44:55) The birth of TensorFlow.js* (47:52) Democratization of ML* (48:45) Visualizing translation — uncovering and telling a story behind the findings* (52:10) Shared representations in LLMs and their facility at translation-like tasks* (53:50) TCAV* (55:30) Explainability and trust* (59:10) Writing code with LMs and metaphors for using* More recent research* (1:01:05) The System Model and the User Model: Exploring AI Dashboard Design* (1:10:05) OthelloGPT and world models, causality* (1:14:10) Dashboards and interaction design—interfaces and core capabilities* (1:18:07) Reactions to existing LLM interfaces* (1:21:30) Visualizing and Measuring the Geometry of BERT* (1:26:55) Note/Correction: The “Atlas of Meaning” Prof. Wattenberg mentions is called Context Atlas* (1:28:20) Language model tasks and internal representations/geometry* (1:29:30) LLMs as “next word predictors” — explaining systems to people* (1:31:15) The Shape of Song* (1:31:55) What does music look like?* (1:35:00) Levels of abstraction, emergent complexity in music and language models* (1:37:00) What Prof. Wattenberg hopes to see in ML and interaction design* (1:41:18) OutroLinks:* Professor Wattenberg's homepage and Twitter* Harvard SEAS application info — Professor Wattenberg is recruiting students!* Research* Earlier work* A Fuzzy Commitment Scheme* Stacked Graphs—Geometry & Aesthetics* A Multi-Scale Model of Perceptual Organization in Information Graphics* Conversation Thumbnails for Large-Scale Discussions* Baby Names and Social Data Analysis* History Flow (paper)* At Harvard and Google / PAIR* Tools for Model Understanding: Facets, SmoothGrad, Attacking discrimination with smarter ML* TensorFlow.js* Visualizing translation* TCAV* Other ML papers:* The System Model and the User Model: Exploring AI Dashboard Design (recent speculative essay)* Emergent World Representations: Exploring a Sequence Model Trained on a Synthetic Task* Visualizing and Measuring the Geometry of BERT* Artwork* The Shape of Song Get full access to The Gradient at thegradientpub.substack.com/subscribe
In episode 98 of The Gradient Podcast, Daniel Bashir speaks to Laurence Liew.Laurence is the Director for AI Innovation at AI Singapore. He is driving the adoption of AI by the Singapore ecosystem through the 100 Experiments, AI Apprenticeship Programmes and the Generational AI Talent Development initiative. He is the current Co-Chair of the Innovations and Commercialisation working group and Co-Chair of the "Broad Adoption of AI by SME" committee.Have suggestions for future podcast guests (or other feedback)? Let us know here or reach us at editor@thegradient.pubSubscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSSFollow The Gradient on TwitterOutline:* (00:00) Intro* (02:25) Laurence's background* (07:00) AI Singapore and Singapore's AI Strategy* (08:27) Awareness and adoption of AI in Singapore* (19:45) AI Apprenticeship Program stories* (27:35) Developing generational AI talent within Singapore, literacy* (32:25) Singapore's place within the global AI ecosystem* (38:30) How the generative AI boom has affected Singapore* (43:50) Laurence's vision for the future of Singapore's tech ecosystem* (49:41) OutroLinks:* AI Singapore Get full access to The Gradient at thegradientpub.substack.com/subscribe
In episode 97 of The Gradient Podcast, Daniel Bashir speaks to Professor Michael Levin and Adam Goldstein. Professor Levin is a Distinguished Professor and Vannevar Bush Chair in the Biology Department at Tufts University. He also directs the Allen Discovery Center at Tufts. His group, the Levin Lab, focuses on understanding the biophysical mechanisms that implement decision-making during complex pattern regulation, and harnessing endogenous bioelectric dynamics toward rational control of growth and form. Adam Goldstein was a visiting scientist at the Levin Lab, where he worked on cancer research, and is the co-founder and Chairman of Astonishing Labs. Previously Adam founded Hipmunk, wrote tech books for O'Reilly, and was a Visiting Partner at Y Combinator.Have suggestions for future podcast guests (or other feedback)? Let us know here or reach us at editor@thegradient.pubSubscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSSFollow The Gradient on TwitterOutline:* (00:00) Intro* (02:37) Intros* (03:20) Prof. Levin intro* (04:26) Adam intro* (06:25) A perspective on intelligence* (08:40) Diverse intelligence — in unconventional embodiments and unfamiliar spaces, substrate independence* (12:23) Failure of the life-machine distinction, text-based systems, grounding, and embodiment* (16:12) What it is to be a Self, fluidity and persistence* (22:45) The combination problem in cognitive function, levels and representation* (27:10) Goals for AI / cognitive science, Prof Levin's perspective on building intelligent systems* (31:25) Adam's and Prof. Levin's recent research—regenerative medicine and cancer* (36:25) Examples of regeneration, Adam on the right approach to the regeneration problem as generation* (45:25) Protein engineering vs. Adam and Prof. Levin's program, implicit assumptions underlying biology* (48:15) Regeneration example in liver disease* (50:50) Perspectives on AI and its goalsLinks:* Levin Lab homepage* Forms of life, forms of mind* Adam's homepage* Research* On Having No Head: Cognition throughout Biological Systems* Technological Approach to Mind Everywhere* Living Things Are Not (20th Century) Machines: Updating Mechanism Metaphors in Light of the Modern Science of Machine Behavior* Life, death, and self: Fundamental questions of primitive cognition viewed through the lens of body plasticity and synthetic organisms* Modular cognition* Endless Forms* Future Medicine: from molecular pathways to the collective intelligence of the body* Technological Approach to Mind Everywhere: an experimentally-grounded framework for understanding diverse bodies and minds* The Computational Boundary of a “Self”: Developmental Bioelectricity Drives Multicellularity and Scale-Free Cognition* Machine life Get full access to The Gradient at thegradientpub.substack.com/subscribe
In episode 96 of The Gradient Podcast, Daniel Bashir speaks to Jonathan Frankle.Jonathan is (as of release) the Chief Scientist at MosaicML. He completed his PhD at MIT, where his work on the lottery ticket hypothesis investigated the properties that allow sparse neural networks to train effectively. He also spends a portion of his time working on technology policy, and currently works with the OECD to implement the AI principles he helped develop in 2019.Have suggestions for future podcast guests (or other feedback)? Let us know here or reach us at editor@thegradient.pubSubscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSSFollow The Gradient on TwitterOutline:* (00:00) Intro* (02:35) Jonathan's background and work* (04:25) Origins of the Lottery Ticket Hypothesis* (06:00) Jonathan's empiricism and approach to science* (08:25) More Karl Popper discourse + hot takes* (09:45) Walkthrough of the Lottery Ticket Hypothesis* (12:00) Issues with the Lottery Ticket Hypothesis as a statement* (12:30) Jonathan's advice for PhD students, on asking good questions* (15:55) Strengths and Promise of the Lottery Ticket Hypothesis* (18:55) More Lottery Ticket Hypothesis Papers* (19:10) Comparing Rewinding and Fine-tuning* (23:00) Care in making experimental choices* (25:05) Linear Mode Connectivity and the Lottery Ticket Hypothesis* (27:50) On what is being measured and how* (28:50) “The outcome of optimization is determined to a linearly connected region”* (31:15) On good metrics* (32:54) On the Predictability of Pruning Across Scales — scaling laws for pruning* (34:40) The paper's takeaway* (38:45) Pruning Neural Networks at Initialization — on a scientific disagreement* (45:00) On making takedown papers useful* (46:15) On what can be known early in training* (49:15) Jonathan's perspective on important research questions today* (54:40) MosaicML* (55:19) How Mosaic got started* (56:17) Mosaic highlights* (57:33) Customer stories* (1:00:30) Jonathan's work and perspectives on AI policy* (1:05:45) The key question: what we want* (1:07:35) OutroLinks:* Jonathan's homepage and Twitter* Papers* The Lottery Ticket Hypothesis and follow-up work* Comparing Rewinding and Fine-tuning in Neural Network Pruning* Linear Mode Connectivity and the LTH* On the Predictability of Pruning Across Scales* Pruning Neural Networks at Initialization: Why Are We Missing The Mark?* Desirable Inefficiency Get full access to The Gradient at thegradientpub.substack.com/subscribe