Podcasts about boltzmann

  • 154 PODCASTS
  • 250 EPISODES
  • 49m AVG DURATION
  • 1 MONTHLY NEW EPISODE
  • Apr 11, 2025 LATEST
Popularity of "boltzmann" (trend chart, 2017–2024)


Best podcasts about boltzmann

Latest podcast episodes about boltzmann

You Know What I Would Do
Episode 31: Teeth, Farmers Markets, Boltzmann Brain Theory, Plays, Paper Cuts

Apr 11, 2025 · 77:10


The boys discuss teeth, Boltzmann brain theory, and paper cuts.

Cuarto Milenio (Oficial)
Cuarto Milenio: Three scientists who took their own lives

Apr 9, 2025 · 31:07


Luis Enrique García Muñoz, Vice-Rector for Research and Transfer at the Universidad Carlos III de Madrid, visits us to tell a strange story: that of scientists who decided to take their own lives, such as Ludwig Boltzmann, who, at the beginning of the 20th century, after explaining his theory of entropy to colleagues, was mocked so harshly that he ended his life by hanging himself. We will see that Boltzmann's case is not unique; others made the same drastic decision. Listen to the full episode in the iVoox app, or explore the full iVoox Originals catalogue.

ITmedia NEWS
Has the 125-year-old unsolved "Hilbert's sixth problem" been solved? US mathematicians publish a preprint that also sheds light on the "arrow of time"

Mar 18, 2025 · 0:30


Has the 125-year-old unsolved "Hilbert's sixth problem" been solved? Researchers affiliated with the University of Chicago and the University of Michigan have released the preprint "Hilbert's sixth problem: derivation of fluid equations via Boltzmann's kinetic theory," a study presenting an answer to Hilbert's sixth problem, which had remained open for 125 years. The result also sheds light on the "arrow of time."

il posto delle parole
Sauro Succi "I tre volti del tempo"

Feb 4, 2025 · 35:09


Sauro Succi"I tre volti del tempo"Fisico, biologico, psicologicoEdizioni Dedalowww.edizionidedalo.itCos'è il tempo? A questa domanda cerca di rispondere l'autore, attraverso uno sguardo originale sul tema, con una riflessione acuta e coinvolgente tra fisica, biologia, psicologia e intelligenza artificiale.«Succi offre una prospettiva affascinante su uno dei concetti più enigmatici dell'Universo: il tempo. Con una brillante fusione di fisica, biologia e psicologia, in una visione olistica e innovativa, l'autore apre nuove frontiere nel dialogo tra scienze naturali e umane, dandoci una lente nuova attraverso cui osservare la realtà e il nostro posto in essa».Giuseppe RivaDa sempre il tempo ci appare un tema affascinante e misterioso. Parte di questo “mistero” è forse dovuta alla nostra attitudine a considerarlo come un'entità unica, senza riconoscerne la sua triplice natura: fisica, biologica e psicologica. Il tempo fisico ci serve a disegnare un quadro quanto più possibile oggettivo dei fenomeni naturali, il tempo biologico è quello che scandisce i ritmi del nostro organismo; il tempo psicologico, invece, accompagna il flusso dei nostri pensieri. Questo libro analizza in modo accessibile, ma mai banale, la relazione fra i tre volti del tempo alla luce dei progressi della moderna scienza dei sistemi complessi, toccando inevitabilmente il prepotente ingresso sulla scena dell'intelligenza artificiale. Una riflessione che ci aiuta a comprendere meglio il mondo intorno e dentro di noi.Sauro Succi è un fisico e ingegnere italiano, noto per i suoi contributi alla fisica statistica e computazionale. In particolare è uno dei principali ideatori e sviluppatori dei metodi reticolari di Boltzmann in fluidodinamica e in fisica della materia soffice. È autore di oltre 500 pubblicazioni scientifiche su riviste internazionali e di tre monografie con la Oxford University Press. Ha ricevuto numerosi riconoscimenti internazionali.IL POSTO DELLE PAROLEascoltare fa pensarewww.ilpostodelleparole.itDiventa un supporter di questo podcast: https://www.spreaker.com/podcast/il-posto-delle-parole--1487855/support.

RadicalxChange(s)
Gary Zhexi Zhang: Artist and Writer

Jan 30, 2025 · 57:31


Matt Prewitt and Gary Zhexi Zhang discuss Chinese cybernetics, focusing on pioneer Qian Xuesen and how the field developed differently in China versus the West. They explore how Chinese cybernetics emerged as a practical tool for nation-building, examining its scientific foundations, political context, and broader cultural impact. Together, they discuss key concepts like information control systems while highlighting the field's interdisciplinary nature and its evolution from thermodynamic to information-based approaches.

Links & References:
The Critical Legacy of Chinese Cybernetics by Gary Zhexi Zhang | Combinations Magazine
Cybernetics - Wikipedia
Norbert Wiener ("Father of Cybernetics")
Whose entropy is it anyway? (Part 1: Boltzmann, Shannon, and Gibbs) — Chris Adami
Collection: Norbert Wiener papers | MIT ArchivesSpace
Relationship between entropy of a language and crossword puzzles (a comment from Claude Shannon) - Mathematics Stack Exchange
A Mathematical Theory of Communication by C.E. Shannon | Harvard Math
A Mathematical Theory of Communication - Wikipedia
Cybernetics - MIT
Brownian motion - Wikipedia
Intercontinental ballistic missile ("ICBM") - Wikipedia
Summary: The Macy Conferences
Warren Sturgis McCulloch (Neuroscience), Gregory Bateson and Margaret Mead (Cultural Anthropology), Claude Shannon (Mathematician)
The Bandwagon by Claude E. Shannon
From Counterculture to Cyberculture: Stewart Brand, the Whole Earth Network, and the Rise of Digital Utopianism by Fred Turner, introduction
From Cybernetics to AI: the pioneering work of Norbert Wiener - Max Planck Neuroscience
Marvin Minsky | AI Pioneer, Cognitive Scientist & MIT Professor | Britannica

Bios:
Gary Zhexi Zhang is an artist and writer. He is the editor of Catastrophe Time! (Strange Attractor Press, 2023) and most recently exhibited at the 9th Asian Art Biennial, Taichung.
Gary's social links: Gary Zhexi Zhang (@hauntedsurimi) / X
Matt Prewitt (he/him) is a lawyer, technologist, and writer. He is the President of the RadicalxChange Foundation.
Matt's social links: ᴍᴀᴛᴛ ᴘʀᴇᴡɪᴛᴛ (@m_t_prewitt) / X

Connect with RadicalxChange Foundation: RadicalxChange Website, @RadxChange | Twitter, RxC | YouTube, RxC | Instagram, RxC | LinkedIn. Join the conversation on Discord.

Credits: Produced by G. Angela Corpus. Co-Produced, Edited, Narrated, and Audio Engineered by Aaron Benavides. Executive Produced by G. Angela Corpus and Matt Prewitt. Intro/Outro music "Wind in the Willows" by MagnusMoone, licensed under an Attribution-NonCommercial-ShareAlike 3.0 International License (CC BY-NC-SA 3.0).

Interviews: Tech and Business
AI, Deep Learning, and the Future of Work | #860

Dec 12, 2024 · 53:29


Artificial intelligence is rapidly transforming business, technology, and society. On this episode of CXO Talk, Dr. Terrence Sejnowski, a renowned computational neuroscientist, deep learning pioneer, and author of "ChatGPT and the Future of AI," discusses the implications of this technological revolution. He explores how AI is evolving, drawing parallels with the human brain, and explains why a robust data strategy is crucial for successful AI implementation. Dr. Sejnowski holds the Francis Crick Chair at the Salk Institute for Biological Studies and is a Distinguished Professor at UC San Diego. Dr. Sejnowski explains the importance of lifelong learning for employees and emphasizes AI's role in augmenting, not replacing, human capabilities. He also addresses critical topics such as explainability in AI decision-making, ethical considerations, and the potential impact of AI on the future of work. This discussion offers practical guidance for business and technology leaders navigating the complexities of AI integration and its implications for their organizations. Episode Participants Terrence J. Sejnowski is Francis Crick Chair at The Salk Institute for Biological Studies and Distinguished Professor at the University of California at San Diego. He has published over 500 scientific papers and 12 books, including ChatGPT and The Future of AI: The Deep Language Learning Revolution. He was instrumental in shaping the BRAIN Initiative that was announced by the White House in 2013, and he received the prestigious Gruber Prize in Neuroscience in 2022 and the Brain Prize in 2024. Sejnowski was also a pioneer in developing learning algorithms for neural networks in the 1980s, inventing the Boltzmann machine with Geoffrey Hinton; this was the first learning algorithm for multilayer neural networks and laid the foundation for deep learning. He is the President of the Neural Information Processing Systems (NeurIPS) Foundation, which organizes the largest AI conference, and he is a leader in the recent convergence between neuroscience and AI. Michael Krigsman is a globally recognized analyst, strategic advisor, and industry commentator known for his deep expertise in digital transformation, innovation, and leadership. He has presented at industry events worldwide and written extensively on the reasons for IT failures. His work has been referenced in the media over 1,000 times and in more than 50 books and journal articles; his commentary on technology trends and business strategy reaches a global audience. #AI #ArtificialIntelligence #FutureofWork #DeepLearning #CXO #DigitalTransformation #BusinessStrategy #TechnologyLeadership #ChatGPT #cxotalk

BIT-BUY-BIT's podcast
PayPal Strikes Back | THE BITCOIN BRIEF 45

Nov 20, 2024 · 87:19 · Transcription available


The Bitcoin Brief is a show hosted by Max and Bitcoin QnA. We cover important updates in the world of bitcoin and open source software. It is our imperative to provide some education along the way too, so that the misfits can expand their knowledge base and become more sovereign as a result. We do this every second week to keep our listeners informed without having to dedicate hours every day to keep on top of developments. We break things down in a simple and fun way and we welcome questions or topic suggestions via Podcasting 2.0 boosts.SHOW DETAILS AOBQ has fun with PaypalOrange man elected - Ross home soon?New Foundation community forumNew Foundation product teaserMultisig talk at PubKey Nov 17thNEWSOP Next streamsRoman storm trial pushed back to AprilRoman Sterlingov sentenced to 12.5 years in jailUPDATES/RELEASESBitAxe v2.4.0Boltzmann v0.1.0 is a TypeScript library that computes the entropy of bitcoin transactions and the linkability of their inputs and outputs.Robosats v0.7.2 alpha - first step to client 'Nostricfication'Keeper v1.2.18 mobile and desktop v0.1.3Zeus White announcedZeus v0.9.2Liana v8.0Mempal app releasedIMPORTANT LINKS https://freesamourai.comhttps://p2prights.org/donate.htmlhttps://ungovernablemisfits.comSPONSORSFOUNDATIONhttps://foundation.xyz/ungovernableFoundation builds Bitcoin-centric tools that empower you to reclaim your digital sovereignty.As a sovereign computing company, Foundation is the antithesis of today's tech conglomerates. Returning to cypherpunk principles, they build open source technology that “can't be evil”.Thank you Foundation Devices for sponsoring the show!Use code: Ungovernable for $10 off of your purchaseCAKE WALLEThttps://cakewallet.comCake Wallet is an open-source, non-custodial wallet available on Android, iOS, macOS, and Linux.Features:- Built-in Exchange: Swap easily between Bitcoin and Monero.- User-Friendly: Simple interface for all users.Monero Users:- Batch Transactions: Send multiple payments at once.- Faster Syncing: Optimized syncing via specified restore heights- Proxy Support: Enhance privacy with proxy node options.Bitcoin Users:- Coin Control: Manage your transactions effectively.- Silent Payments: Static bitcoin addresses- Batch Transactions: Streamline your payment process.Thank you Cake Wallet for sponsoring the show!VALUE FOR VALUEThanks for listening you Ungovernable Misfits, we appreciate your continued support and hope you enjoy the shows.You can support this episode using your time, talent or treasure.TIME:- create fountain clips for the show- create a meetup- help boost the signal on social mediaTALENT:- create ungovernable misfit inspired art, animation or music- design or implement some software that can make the podcast better- use whatever talents you have to make a contribution to the show!TREASURE:- BOOST IT OR STREAM SATS on the Podcasting 2.0 apps @ https://podcastapps.com- DONATE via Paynym @ https://paynym.is/+maxbuybit- DONATE via Monero @ https://xmrchat.com/ugmf- BUY SOME CLOTHING @ https://ungovernablemisfits.com/store/- BUY SOME ART!! 
@ https://ungovernablemisfits.com/art-gallery/(00:00:00) INTRO(00:00:57) THANK YOU FOUNDATION(00:01:38) THANK YOU CAKE WALLET(00:02:49) Chow Mein QnA(00:05:06) Fancy Some New Carpet?(00:06:37) Ross is Released SOON(00:08:40) Your Account is Closed(00:13:28) New Foundation Community Forum!(00:16:34) Beyond The Hardware Wallet(00:17:59) Be @ Pubkey on December 17th!(00:19:28) NEWS(00:19:31) OP_NEXT Conference(00:25:11) Roman Storm Trial Delayed(00:25:54) Roman Sterlingov Sentenced to 12 1/2 Years(00:28:25) BOOSTS(00:46:06) UPDATES & RELEASES(00:46:12) BitAxe v2.4.0(00:47:03) Boltzmann's Back(00:47:46) Robosats Introduces the Nostr Orderbook(00:49:18) Bitcoin Keeper Has Enhanced UTXO Management(00:50:02) Introducing ZEUS White(00:51:10) ZEUS Allows You To Export Your TX History(00:53:37) Liana Gets a NIce Upgrade(00:57:34) Mempal(00:58:16) QUESTIONS(00:58:18) Voicemail from Soulex: Seedphrases and Passphrases for Friends(01:12:34) Quiet on the Ashigaru Front(01:14:16) My NUC's Too Noisy(01:17:05) Thoughts on Lopp and Jamesob(01:27:11) Now You Can Leave Q

New Books Network
8.3 Aspire to Magic but End Up With Madness: Adam Ehrlich Sachs speaks with Sunny Yudkoff (JP)

Nov 7, 2024 · 30:20


What happens when a novelist wants “nonsense and joy” but his characters are destined for a Central European sanatorium? How does the abecedarian form (i.e. organized not chronologically or sequentially but alphabetically) insist on order, yet also embrace absurdity? Here to ponder such questions with host John Plotz are University of Wisconsin–Madison's Sunny Yudkoff (last heard on ND speaking with Sheila Heti) and Adam Ehrlich Sachs, author of Inherited Disorders, The Organs of Sense, and the recently published Gretel and the Great War. Sachs has fallen under the spell of late Habsburg Vienna, where the polymath Ludwig Wittgenstein struggled to make sense of Boltzmann's physics, Arnold Schoenberg read the acerbic journalist Karl Kraus, and everyone, Sachs suspects, was reading Grimms' Fairy Tales, searching for the feeling of inevitability only narrative closure can provide. Beneath his OULIPO-like attachment to arbitrary orders and word-games, though, Sachs admits to a desire for chaos. Thomas Bernhard, later 20th century Austrian experimental novelist Heinrich von Kleist, “Michael Kohlhass” Romantic-era German writer Italo Calvino,If on a Winter's Night a Traveler OULIPO Home of French literary experimentalists like Perec and Raymond Queneau Georges Perec's most famous experiment is Life: A User's Manual (although John is devoted to “W: or the Memory of Childhood”) Dr. Seuss, On Beyond Zebra! (ignore John calling the author Dr Scarry, which was a scary mistake.,..) Marcel Proust: was he a worldbuilder and fantasist, as Nabokov says or, as Doris Lessing claims, principally an anatomist of French social structures, a second Zola? Franz Kafka is unafraid of turning his character into a bug in a story's first sentence. Virginia Woolf in Mrs. Dalloway offers the reader a mad (Septimus) and a sane (Mrs Dalloway herself) version of stream of consciousness: how different are they? Cezanne, for example The Fisherman (Fantastic Scene) The Pointillism of painters like Georges Seurat Learn more about your ad choices. Visit megaphone.fm/adchoices Support our show by becoming a premium member! https://newbooksnetwork.supportingcast.fm/new-books-network

New Books in Literary Studies
8.3 Aspire to Magic but End Up With Madness: Adam Ehrlich Sachs speaks with Sunny Yudkoff (JP)

Nov 7, 2024 · 30:20


What happens when a novelist wants “nonsense and joy” but his characters are destined for a Central European sanatorium? How does the abecedarian form (i.e. organized not chronologically or sequentially but alphabetically) insist on order, yet also embrace absurdity? Here to ponder such questions with host John Plotz are University of Wisconsin–Madison's Sunny Yudkoff (last heard on ND speaking with Sheila Heti) and Adam Ehrlich Sachs, author of Inherited Disorders, The Organs of Sense, and the recently published Gretel and the Great War. Sachs has fallen under the spell of late Habsburg Vienna, where the polymath Ludwig Wittgenstein struggled to make sense of Boltzmann's physics, Arnold Schoenberg read the acerbic journalist Karl Kraus, and everyone, Sachs suspects, was reading Grimms' Fairy Tales, searching for the feeling of inevitability only narrative closure can provide. Beneath his OULIPO-like attachment to arbitrary orders and word-games, though, Sachs admits to a desire for chaos. Thomas Bernhard, later 20th century Austrian experimental novelist Heinrich von Kleist, “Michael Kohlhass” Romantic-era German writer Italo Calvino,If on a Winter's Night a Traveler OULIPO Home of French literary experimentalists like Perec and Raymond Queneau Georges Perec's most famous experiment is Life: A User's Manual (although John is devoted to “W: or the Memory of Childhood”) Dr. Seuss, On Beyond Zebra! (ignore John calling the author Dr Scarry, which was a scary mistake.,..) Marcel Proust: was he a worldbuilder and fantasist, as Nabokov says or, as Doris Lessing claims, principally an anatomist of French social structures, a second Zola? Franz Kafka is unafraid of turning his character into a bug in a story's first sentence. Virginia Woolf in Mrs. Dalloway offers the reader a mad (Septimus) and a sane (Mrs Dalloway herself) version of stream of consciousness: how different are they? Cezanne, for example The Fisherman (Fantastic Scene) The Pointillism of painters like Georges Seurat Learn more about your ad choices. Visit megaphone.fm/adchoices Support our show by becoming a premium member! https://newbooksnetwork.supportingcast.fm/literary-studies

New Books in Literature
8.3 Aspire to Magic but End Up With Madness: Adam Ehrlich Sachs speaks with Sunny Yudkoff (JP)

Nov 7, 2024 · 30:20


What happens when a novelist wants “nonsense and joy” but his characters are destined for a Central European sanatorium? How does the abecedarian form (i.e. organized not chronologically or sequentially but alphabetically) insist on order, yet also embrace absurdity? Here to ponder such questions with host John Plotz are University of Wisconsin–Madison's Sunny Yudkoff (last heard on ND speaking with Sheila Heti) and Adam Ehrlich Sachs, author of Inherited Disorders, The Organs of Sense, and the recently published Gretel and the Great War. Sachs has fallen under the spell of late Habsburg Vienna, where the polymath Ludwig Wittgenstein struggled to make sense of Boltzmann's physics, Arnold Schoenberg read the acerbic journalist Karl Kraus, and everyone, Sachs suspects, was reading Grimms' Fairy Tales, searching for the feeling of inevitability only narrative closure can provide. Beneath his OULIPO-like attachment to arbitrary orders and word-games, though, Sachs admits to a desire for chaos. Thomas Bernhard, later 20th century Austrian experimental novelist Heinrich von Kleist, “Michael Kohlhass” Romantic-era German writer Italo Calvino,If on a Winter's Night a Traveler OULIPO Home of French literary experimentalists like Perec and Raymond Queneau Georges Perec's most famous experiment is Life: A User's Manual (although John is devoted to “W: or the Memory of Childhood”) Dr. Seuss, On Beyond Zebra! (ignore John calling the author Dr Scarry, which was a scary mistake.,..) Marcel Proust: was he a worldbuilder and fantasist, as Nabokov says or, as Doris Lessing claims, principally an anatomist of French social structures, a second Zola? Franz Kafka is unafraid of turning his character into a bug in a story's first sentence. Virginia Woolf in Mrs. Dalloway offers the reader a mad (Septimus) and a sane (Mrs Dalloway herself) version of stream of consciousness: how different are they? Cezanne, for example The Fisherman (Fantastic Scene) The Pointillism of painters like Georges Seurat Learn more about your ad choices. Visit megaphone.fm/adchoices Support our show by becoming a premium member! https://newbooksnetwork.supportingcast.fm/literature

Engines of Our Ingenuity
The Engines of Our Ingenuity 1273: Science and Technology in 1852

Nov 3, 2024 · 3:39


Episode: 1273 Some surprises in the 1852 Annual of Scientific Discovery.  Today, an old book takes stock of science and art in 1852.

Jøss‽
#180 - Chinese folk remedies, Boltzmann brains, and an above-average number of fingers

Oct 31, 2024 · 21:44


Three oddballs meet in a tiny studio to fence with facts! A random collection of particles becomes your brain, wormwood meets tuberculosis, and what is the average number of penises? Oddballs: Bjørn Eidem (@beidem), Inga Strümke (@strumkis), Andreas Wahl (@andreas__wahl)

Game over?
The Nobel Prize goes to artificial intelligence

Oct 25, 2024 · 26:59


We are in the middle of a revolution driven by artificial intelligence, and it affects far more than generating funny speeches with ChatGPT or strange images with MidJourney. AI is now transforming core disciplines such as physics and chemistry, and nowhere is that clearer than in this year's Nobel Prizes, which mark how artificial intelligence is fundamentally changing science. The 2024 Nobel Prize in Physics went to Geoffrey Hinton, often called the "godfather of AI", and John Hopfield for their work on neural networks. Hinton developed the Boltzmann machine, which introduced stochastic methods for analyzing complex datasets. Together with Hopfield's pioneering work on dynamical networks, this laid the foundation for today's AI technologies. In chemistry, David Baker, Demis Hassabis and John Jumper were awarded the Nobel Prize for the development of AlphaFold, an AI system that can predict the 3D structures of proteins from amino-acid sequences. This has revolutionized the life sciences, especially drug development and personalized medicine.

Quantum Physics for Kids
The Four Constants that Change the Universe - Boltzmann, Planck, Speed of Light, Euler's Number

Oct 12, 2024 · 6:13


Ever wondered what holds the universe together? Join us on a mind-bending exploration of the four fundamental constants that shape our reality. From the Boltzmann constant governing the energy of particles to the Planck constant quantifying the energy of light, the speed of light defining the fabric of spacetime, and Euler's number underpinning exponential growth, these constants are the building blocks of the cosmos. Discover how these seemingly abstract numbers influence everything from the behavior of atoms to the expansion of the universe. Prepare to be amazed as we unravel the mysteries of the universe, one constant at a time.
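For reference, a minimal sketch (my own addition, not from the episode) that simply tabulates the four constants named above with their SI values and uses two of them in back-of-the-envelope calculations; the variable names are mine.

```python
import math

# The four constants from the episode title, with SI / defined values.
BOLTZMANN_K = 1.380649e-23    # J/K  - links temperature to particle energy (exact since 2019 SI)
PLANCK_H = 6.62607015e-34     # J*s  - links a photon's frequency to its energy (exact)
SPEED_OF_LIGHT = 299_792_458  # m/s  - exact by definition of the metre
EULERS_NUMBER = math.e        # ~2.718281828 - base of exponential growth

# Typical thermal energy scale at room temperature (~300 K): k_B * T
print(f"k_B * 300 K = {BOLTZMANN_K * 300:.3e} J")
# Energy of a 500 nm (green) photon: E = h * c / wavelength
print(f"E(500 nm)   = {PLANCK_H * SPEED_OF_LIGHT / 500e-9:.3e} J")
```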

China Daily Podcast
英语新闻丨2024年诺贝尔物理学奖揭晓,为什么得奖的是计算机学家?

Oct 11, 2024 · 2:22


After the Nobel Prize in physics went to John J. Hopfield and Geoffrey E. Hinton "for foundational discoveries and inventions that enable machine learning with artificial neural networks", many asked why a prize for physics has gone to computer scientists for what is also an achievement in computer science.在约翰·霍普菲尔德和杰弗里·辛顿因“为推动利用人工神经网络进行机器学习作出的基础性发现和发明”获得诺贝尔物理学奖后,许多人发问,为什么物理学奖授予了计算机学家,且其成就也属于计算机科学领域。Even Hinton, a winner of the 2018 Turing Award and one of the "godfathers of AI", was himself "extremely surprised" at receiving the call telling him he had got the Nobel in physics, while the other recipient Hopfield said "It was just astounding."就连2018年图灵奖得主、“人工智能教父”之一的辛顿,在接到瑞典皇家科学院的电话时,也直呼“没有想到”。另一位获奖者霍普菲尔德则说:“这简直令人震惊。”Actually, the artificial neural network research has a lot to do with physics. Most notably, Hopfield replicated the functioning of the human brain by using the self-rotation of single molecules as if they were neurons and linking them together into a network, which is what the famous Hopfield neural network is about. In the process, Hopfield used two physical equations. Similarly, Hinton made Hopfield's approach the basis for a more sophisticated artificial neural network called the Boltzmann machine, which can catch and correct computational errors.其实,人工神经网络研究与物理学有很大关系。最值得注意的是,霍普菲尔德利用单分子自旋复制了人脑的功能,把它们当作神经元,并把它们连接成一个网络,这就是著名的“霍普菲尔德神经网络”。在这个过程中,霍普菲尔德使用了两个物理方程。同样,辛顿将霍普菲尔德的方法作为一种更复杂的人工神经网络的基础,这种人工神经网络被称为玻尔兹曼机,它可以捕捉和纠正计算错误。The two steps have helped in forming a net that can act like a human brain and compute. The neural networks today can learn from their own mistakes and constantly improve, thus being able to solve complicated problems for humanity. For example, the Large Language Model that's the basis of the various GPT technologies people use today dates back to the early days when Hopfield and Hinton formed and improved their network.这两项成果帮助形成了可以像人脑一样进行计算的网络。如今的神经网络可以从自己的错误中学习并不断改进,从而能够为人类解决复杂的问题。例如,作为当今人们使用的各种GPT技术基础的大语言模型,就可以追溯到早期霍普菲尔德和辛顿形成和改进人工神经网络的时候。Instead of weakening the role of physics, that the Nobel Prize in Physics goes to neural network achievements strengthens it by revealing to the world the role physics, or fundamental science as a whole, plays in sharpening technology. Physics studies the rules followed by particles and the universe and paves the way for modern technologies. That is why there is much to thank physicists for the milestones modern computer science has crossed.诺贝尔物理学奖授予神经网络成就,并不是削弱物理学的作用,而是通过向世界揭示物理学或整个基础科学在提高技术方面的作用来加强其地位。物理学研究粒子和宇宙所遵循的规则,并为现代技术铺平道路。这就是现代计算机科学所跨越的里程碑要感谢物理学家的原因。neuraladj. 神经的astoundingadj. 令人震惊的replicatev. 复制,重复
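As a rough illustration of the Hopfield-network idea sketched in this description (my own toy example, not code from the article; all names are made up): store a ±1 pattern in a symmetric weight matrix with a Hebbian rule, then recover it from a corrupted copy by repeated sign updates, each of which can only lower the network's energy.

```python
import numpy as np

def train_hopfield(patterns):
    """Hebbian storage: sum of outer products of +/-1 patterns, with a zeroed diagonal."""
    n = patterns.shape[1]
    W = np.zeros((n, n))
    for p in patterns:
        W += np.outer(p, p)
    np.fill_diagonal(W, 0)
    return W / len(patterns)

def recall(W, state, sweeps=5):
    """Asynchronous sign updates; each flip can only lower the Hopfield energy -1/2 s^T W s."""
    state = state.copy()
    for _ in range(sweeps):
        for i in np.random.permutation(len(state)):
            state[i] = 1 if W[i] @ state >= 0 else -1
    return state

stored = np.array([[1, -1, 1, 1, -1, -1, 1, -1]])  # one 8-"neuron" memory
W = train_hopfield(stored)
noisy = stored[0].copy()
noisy[:2] *= -1                                    # corrupt two neurons
print(recall(W, noisy))                            # recovers the stored pattern
```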

The top AI news from the past week, every ThursdAI

Hey Folks, we are finally due for a "relaxing" week in AI: no more HUGE company announcements (if you don't consider Meta Movie Gen huge), no conferences or dev days, and some time for open source projects to shine (while we all wait for Opus 3.5 to shake things up). This week was very multimodal on the show. We covered 2 new video models, one that's tiny and open source, and one massive model from Meta that is aiming for SORA's crown, plus 2 new VLMs: one from our friends at REKA that understands videos and audio, while the other, from Rhymes, is Apache 2 licensed. We also had a chat with Kwindla Kramer about the OpenAI Realtime API and its shortcomings, and about voice AIs in general. ThursdAI - Recaps of the most high signal AI weekly spaces is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber. All right, let's TL;DR and show notes, and we'll start with the 2 Nobel prizes in AI

FLASH DIARIO de El Siglo 21 es Hoy
The Nobel Prize in Physics highlights the creators of AI

Oct 9, 2024 · 4:36


The Nobel Prize in Physics rewards advances in artificial intelligence. The 2024 Nobel Prize in Physics has been awarded to Geoffrey Hinton and John Hopfield for their pioneering research on artificial neural networks. Share your thoughts and don't forget to follow us on Spotify: Flash Diario on Spotify. This work, begun in the 1980s, laid the foundations of machine learning and modern artificial intelligence. Both Hinton and Hopfield have used the recognition to warn about irresponsible uses of this technology. Hinton, who recently left Google, pointed to the dangers of creating systems more intelligent than human beings. What does the future hold with these tools that are already transforming our daily lives? The Nobel recognizes the advance, but its creators warn of the risks. This year's Nobel Prize in Physics has highlighted the work of two pioneers in the development of neural networks: Geoffrey Hinton and John Hopfield. Their research on modeling the human brain with computational systems has driven machine learning, a technology we now use in areas such as machine translation and facial recognition. The Nobel committee highlighted the ability of these systems to learn and recognize patterns in large datasets, a tool that has revolutionized modern computing and continues to transform many fields of knowledge. Despite the importance of their discoveries, Hinton and Hopfield have expressed serious concerns about the future of artificial intelligence. Hinton, in particular, has warned that AI could come to surpass human capabilities, posing risks that are difficult to control. After years at Google, he decided to step away to focus on raising the alarm about these possible dangers. Hopfield, from his perspective as a physicist, has noted that we still do not fully understand the scope of this technology, nor how to set safe limits on its use. Both underline the need for greater understanding and regulation. The Nobel committee's recognition does not merely highlight technical advances; it invites reflection on the impact of artificial intelligence on our lives. The work of Hinton and Hopfield has opened a new horizon in scientific research, but it has also sparked an urgent conversation about the ethical and social challenges humanity faces with these new capabilities. Experts agree it is crucial to define the path this technology will take, to ensure it is used responsibly and for everyone's benefit. Hinton and Hopfield revolutionized the field of artificial intelligence using concepts drawn from physics. Hopfield's model is one of the first artificial neural networks able to learn patterns and reconstruct information from incomplete data. Hinton, for his part, created the "Boltzmann machine", which further improved machines' ability to learn autonomously. Although these advances are already part of our everyday lives, their creators urge caution about the future of AI. The 2024 Nobel Prize in Physics recognizes the creators of the technological foundations of modern artificial intelligence, but it also reminds us that this powerful tool requires responsible stewardship. What challenges will artificial intelligence bring in the coming years? How do we balance its benefits against its risks?
Share your thoughts and don't forget to follow us on Spotify: Flash Diario on Spotify. Bibliography: Orange, Le Monde, Huffington Post. Become a follower of this podcast: https://www.spreaker.com/podcast/flash-diario-de-el-siglo-21-es-hoy--5835407/support.

Curiuss
The man who believed in atoms - Geni Impolverati #4

Sep 29, 2024 · 21:20


The conviction that matter is made of atoms grew stronger over time thanks to the contributions of many scientists. One of them in particular gave everything for this cause. To support us: https://associazioneatelier.it/ Contact: associazioneatelier@gmail.com If you would like to support us with the "5 per mille" tax donation: Atelier APS (registered with RUNTS (third sector)), tax code (CF): 98181440177

Top 10 Potential Future Wars

May 31, 2024 · 21:30


Step into the future with Boltzmann. Join our Telegram at https://t.me/Boltzmann_Net to experience the future of crypto and AI where privacy meets unlimited potential Link to my second podcast on world history and interviews: / @history102-qg5oj   Link to my cancellation insurance: https://becomepluribus.com/creators/20 Link to my Twitter-https://twitter.com/whatifalthist?ref... Link to my Instagram-https://www.instagram.com/rudyardwlyn... Bibliography: The Reluctant Super by Peter Zeihan Disunited Nations by Peter Zeihan The End of the World is just the Beginning by Peter Zeihan The Next 100 Years by George Friedman The Storm before the Calm by George Friedman Flash Points by George Friedman Ages of Discord by Peter Turchin Secular Cycles by Peter Turchin End Times by Peter Turchin The Great Wave by David Hackett Fischer Asian Waters by Humphrey Hawksley Asia's Cauldron by Kaplan Third World Century Charles Stewart Goodwin The Economics of Discontent by Jean Michel Paul Monsoon by Robert Kaplan The Strange Death of Europe by Douglas Murray The Best of Times and the Worst of Times by Michael Burleigh Nothing is True and Everything is Possible by Pomerantz The Rise and Fall of Nations by Ruchir Shamir The New Map by Dan Yergen The World in Conflict by John Andrews Fragile Empire by Ben Judah

A Manifesto for the New Right

May 13, 2024 · 38:01


A project 2.5 years in the making. This is a historic moment where the right is forming a new ideology. Here are the best ideas for the new ideological coalition of the "Not Left". Step into the future with Boltzmann. Join our Telegram at https://t.me/Boltzmann_Net to experience the future of crypto and AI where privacy meets unlimited potential Link to my second podcast History 102: https://www.youtube.com/channel/UC0NCSdGglnmdWg-qHALhu1w FOLLOW ON X: @whatifalthist (Rudyard) @TurpentineMedia Bibliography: The Eye of Shiva by Amaury de Riencourt The Happiness Hypothesis by John Haidt The True Believer by Eric Hoffer The WEIRDest people in the World by Joseph Heinrich The Body Keeps the Score by Van Der Kolk Lost Connections by Johann Hari Trauma and the Soul by Kalsched The Inner World of Trauma by Kalsched The Seven Types of Atheism by Gray Secularity by Zahl Ultimate Journey by Monroe Far Journeys by Monroe Journeys out of the Body by Monroe The Sacred History by Mark Booth Recapture the Rapture by Jamie Wheal Beyond Order by Jordan Peterson Behave by Sapolsky On Grand Strategy by John Lewis Gaddis Dominion by Tom Holland The Road to Serfdom by Hayek Why Nations Fail by Robinson and Acemoglu The Origins of Political Order by Francis Fukuyama Regime Change by Deneen A Conflict of Visions by Thomas Sowell Honor by Bowman Meditations by Marcus Aurelius The Writings of Epictetus Hoe God Becomes Real by Luhrmann Nihilism by Seraphim Rose The Immortality Key by Brian Muraresku The Secret of our Success by Joseph Heinrich Seeing like a State by James Scott War, What is it Good for by Ian Morris The Soul of India by Amaury de Riencourt The Soul of China by Amaury de Riencourt The Coming Caesars by Amaury de Riencourt War in Human Civilization by Azar Gat War, Peace and War by Peter Turchin  Maps of Meaning by Jordan Peterson Man and His Symbols by Carl Jung The World After Liberalism by Matthew Rose The Ascent of Humanity by Eisenstein The Knowledge Machine by Michael Strevens The Infinite Staircase by Moore The Invention of Yesterday by Tamim Ansary Envy by Helmut Schoeck The Fate of Empires by Hubbard The Righteous Mind by John Haidt Cynical Theories by James Lindsay Foragers, Farmers and Fossil Fuels by Ian Morris The Philosophy of History by Hegel A History of Western Philosophy by Bertrand Russel The Web of Existence by Jeremy Lent Trump and the Post Truth World by Ken Wilbur Spiral Dynamics by Ken Wilbur The Laws of Human Nature by Robert Greene Sapiens by Yuval Noah Harari Homo Deus by Yuval Noah Harari The Rise of the West by William McNeil Mere Christianity by CS Lewis The Blank Slate by Steven Pinker The Unabomber's Manifesto The Decline of the West by Oswald Spengler A Secret History of the World by Mark Booth Forgotten Truth by Houston Smith Religions of the World by Houston Smith Hermeticism by Evola

The Nonlinear Library
LW - So What's Up With PUFAs Chemically? by J Bostock

Apr 27, 2024 · 11:32


Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: So What's Up With PUFAs Chemically?, published by J Bostock on April 27, 2024 on LessWrong. This is exploratory investigation of a new-ish hypothesis, it is not intended to be a comprehensive review of the field or even a a full investigation of the hypothesis. I've always been skeptical of the seed-oil theory of obesity. Perhaps this is bad rationality on my part, but I've tended to retreat to the sniff test on issues as charged and confusing as diet. My response to the general seed-oil theory was basically "Really? Seeds and nuts? The things you just find growing on plants, and that our ancestors surely ate loads of?" But a twitter thread recently made me take another look at it, and since I have a lot of chemistry experience I thought I'd take a look. The PUFA Breakdown Theory It goes like this: PUFAs from nuts and seeds are fine. Deep-frying using PUFAs causes them to break down in a way other fatty acids do not, and these breakdown products are the problem. Most of a fatty acid is the "tail". This consists of hydrogen atoms decorating a backbone of carbon atoms. Each carbon atom can make up to four bonds, of which two have to be to other carbons (except the end carbon which only bonds to one carbon) leaving space for two hydrogens. When a chain has the maximum number of hydrogen atoms, we say it's "saturated". These tails have the general formula CnH2n+1: For a carbon which is saturated (i.e. has four single bonds) the bonds are arranged like the corners of a tetrahedron, and rotation around single bonds is permitted, meaning the overall assembly is like a floppy chain. Instead, we can have two adjacent carbons form a double bond, each forming one bond to hydrogen, two bonds to the adjacent carbon, and one to a different carbon: Unlike single bonds, double bonds are rigid, and if a carbon atom has a double bond, the three remaining bonds fall in a plane. This means there are two ways in which the rest of the chain can be laid out. If the carbons form a zig-zag S shape, this is a trans double bond. If they form a curved C shape, we have a cis double bond. The health dangers of trans-fatty acids have been known for a long while. They don't occur in nature (which is probably why they're so bad for us). Cis-fatty acids are very common though, especially in vegetable and, yes, seed oils. Of course there's no reason why we should stop at one double bond, we can just as easily have multiple. This gets us to the name poly-unsaturated fatty acids (PUFAs). I'll compare stearic acid (SA) oleic acid (OA) and linoleic acid (LA) for clarity: Linoleic acid is the one that seed oil enthusiasts are most interested in. We can go even further and look at α-linoleic acid, which has even more double bonds, but I think LA makes the point just fine. Three fatty acids, usually identical ones, attach to one glycerol molecule to form a triglyceride. Isomerization As I mentioned earlier, double bonds are rigid, so if you have a cis double bond, it stays that way. Mostly. In chemistry a reaction is never impossible, the components are just insufficiently hot. If we heat up a cis-fatty acid to a sufficient temperature, the molecules will be able to access enough energy to flip. 
The rate of reactions generally scales with temperature according to the Arrhenius equation: v = A exp(-E_a / (k_B T)), where A is a general constant determining the speed, E_a is the "activation energy" of the reaction, T is temperature, and k_B is Boltzmann's constant, which makes the units work out. Graphing this gives the following shape: suffice to say, this means that reaction speed can grow very rapidly with temperature at the "right" point on this graph. Why is this important? Well, trans-fatty acids are slightly lower energy than cis ones, so at a high enough temperature, we can see cis to trans isomerization, turning OA o...
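To make the temperature dependence concrete, here is a small numerical sketch (my own illustration, not part of the post) of the Arrhenius law v = A exp(-E_a / (k_B T)); the activation energy and prefactor below are made-up round numbers, chosen only to show how steeply the rate climbs between room temperature and deep-frying temperature.

```python
import math

K_B = 1.380649e-23   # Boltzmann constant, J/K
E_A = 1.6e-19        # hypothetical activation energy (~1 eV), J - illustrative only
A = 1.0              # arbitrary prefactor; only rate ratios matter here

def arrhenius_rate(temp_kelvin):
    """Arrhenius law: rate = A * exp(-E_a / (k_B * T))."""
    return A * math.exp(-E_A / (K_B * temp_kelvin))

room, fryer = 298.0, 460.0   # ~25 C kitchen counter vs ~190 C deep fryer
speedup = arrhenius_rate(fryer) / arrhenius_rate(room)
print(f"Rate increase from 25 C to 190 C: ~{speedup:.0e}x")  # roughly a million-fold
```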

FM4 Projekt X
Richard Walther from the business desk meets Hermann Ilse Boltzmann

Apr 19, 2024 · 43:15


Programme note: FM4 Projekt X, April 19, 2024, midnight

The Nonlinear Library
LW - Generalized Stat Mech: The Boltzmann Approach by David Lorell

Apr 12, 2024 · 32:42


Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Generalized Stat Mech: The Boltzmann Approach, published by David Lorell on April 12, 2024 on LessWrong. Context There's a common intuition that the tools and frames of statistical mechanics ought to generalize far beyond physics and, of particular interest to us, it feels like they ought to say a lot about agency and intelligence. But, in practice, attempts to apply stat mech tools beyond physics tend to be pretty shallow and unsatisfying. This post was originally drafted to be the first in a sequence on "generalized statistical mechanics": stat mech, but presented in a way intended to generalize beyond the usual physics applications. The rest of the supposed sequence may or may not ever be written. In what follows, we present very roughly the formulation of stat mech given by Clausius, Maxwell and Boltzmann (though we have diverged substantially; we're not aiming for historical accuracy here) in a frame intended to make generalization to other fields relatively easy. We'll cover three main topics: Boltzmann's definition for entropy, and the derivation of the Second Law of Thermodynamics from that definition. Derivation of the thermodynamic efficiency bound for heat engines, as a prototypical example application. How to measure Boltzmann entropy functions experimentally (assuming the Second Law holds), with only access to macroscopic measurements. Entropy To start, let's give a Boltzmann-flavored definition of (physical) entropy. The "Boltzmann Entropy" S_Boltzmann is the log number of microstates of a system consistent with a given macrostate. We'll use the notation: S_Boltzmann(Y=y) = log N[X | Y=y], where Y=y is a value of the macrostate, and X is a variable representing possible microstate values (analogous to how a random variable X would specify a distribution over some outcomes, and X=x would give one particular value from that outcome-space.) Note that Boltzmann entropy is a function of the macrostate. Different macrostates - i.e. different pressures, volumes, temperatures, flow fields, center-of-mass positions or momenta, etc. - have different Boltzmann entropies. So for an ideal gas, for instance, we might write S_Boltzmann(P, V, T), to indicate which variables constitute "the macrostate". Considerations for Generalization What hidden assumptions about the system does Boltzmann's definition introduce, which we need to pay attention to when trying to generalize to other kinds of applications? There's a division between "microstates" and "macrostates", obviously. As yet, we haven't done any derivations which make assumptions about those, but we will soon. The main three assumptions we'll need are: Microstates evolve reversibly over time. Macrostate at each time is a function of the microstate at that time. Macrostates evolve deterministically over time. Mathematically, we have some microstate which varies as a function of time, x(t), and some macrostate which is also a function of time, y(t). The first assumption says that x(t) = f_t(x(t-1)) for some invertible function f_t. The second assumption says that y(t) = g_t(x(t)) for some function g_t. The third assumption says that y(t) = F_t(y(t-1)) for some function F_t. The Second Law: Derivation The Second Law of Thermodynamics says that entropy can never decrease over time, only increase. Let's derive that as a theorem for Boltzmann Entropy.
Mathematically, we want to show: log N[X(t+1) | Y(t+1)=y(t+1)] ≥ log N[X(t) | Y(t)=y(t)]. Visually, the proof works via this diagram: The arrows in the diagram show which states (micro/macro at t/t+1) are mapped to which other states by some function. Each of our three assumptions contributes one set of arrows: By assumption 1, microstate x(t) can be computed as a function of x(t+1) (i.e. no two microstates x(t) both evolve to the same later microstate x(t+1)). By assumption 2, macrostate y(t) can be comput...
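A toy numerical check of the definition above (my own sketch, not part of the post): take the microstate to be a full heads/tails sequence of N coins and the macrostate to be the total number of heads, then count microstates to get S_Boltzmann(Y=y) = log N[X | Y=y].

```python
from itertools import product
from math import log

def boltzmann_entropy(n_coins, n_heads):
    """Log of the number of microstates (coin sequences) consistent with the macrostate (head count)."""
    count = sum(1 for micro in product([0, 1], repeat=n_coins)
                if sum(micro) == n_heads)
    return log(count)

# "All heads" is consistent with a single microstate, so its entropy is 0;
# the half-and-half macrostate is consistent with C(10, 5) = 252 microstates.
print(boltzmann_entropy(10, 10))  # 0.0
print(boltzmann_entropy(10, 5))   # log(252) ~ 5.53
```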

Why is Every Society Religious?

Apr 8, 2024 · 38:20


Step into the future with Boltzmann. Join our Telegram at https://t.me/Boltzmann_Net to experience the future of crypto and AI where privacy meets unlimited potential Link to my second podcast on world history and interviews: / @history102-qg5oj   Link to my cancellation insurance: https://becomepluribus.com/creators/20 Link to my Twitter-https://twitter.com/whatifalthist?ref... Link to my Instagram-https://www.instagram.com/rudyardwlyn... Bibliography: The Fate of Empires by Hubbard Religions of the World by Houston Smith Forgotten Truth by Houston Smith The Decline of the West by Spengler The Lessons of History by Will Durant Our Oriental Heritage by Will Durant Caesar and Christ by Will Durant The Life of Greece by Will Durant The Age of Faith by Will Durant a History of Philosophy by Will Durant Examined Lives by James Miller A History of the World by CJ Meyers A History of the World by McNeil A History of the Arabs by Sir John Glubb Tragedy and Hope by Carroll Quiggley The Evolution of Civilizations by Carroll Quiggley Europe: A History by Norman Davies A History of Russia, Central Asia and Mongolia by David Christian A Secular Age by Charles Taylor Cotton, Climate and Candles by Bulliet Destiny Disrupted by Tamim Ansary Al Muqahdimmah by Ibn Khaldun A History of the Ancient World by Susan Wise Bauer A Secret History of the World by Mark Booth The Sacred History by Mark Booth The Master and His Emissary by Ian McGhilchrist Strategy by Lawrence Freedman  Sex and Civilizations by JD Unwin Atrocities by Matthew White The Dictators by Richard Every Spiteful Mutants by Edward Dutton Dominion by Tom Holland The Righteous Mind by Jon Haidt The Gateway Protocol by Robert Monroe A Conflict of Visions by Thomas Sowell The Rise and Fall of Ancient Egypt by Toby Wilkinson Empty Planet by Brocker and Ibbitson Disunited Nations by Peter Zeihan The Ancient City by Foustel de Coulanges Nihilism by Seraphim Rose  Behavior by Sapolsky The Happiness Hypothesis by Jon Haidt

Theoretical Neuroscience Podcast
On origins of computational neuroscience and AI as scientific fields - with Terrence Sejnowski (vintage) - #9

Mar 16, 2024 · 115:27


Today's guest is a pioneer both in the fields of computational neuroscience and artificial intelligence (AI) and has had a front seat during their development.  His many contributions include, for example, the invention of the Boltzmann machine with Ackley and Hinton in the mid 1980s.  In this “vintage” episode recorded in late 2019 he describes the joint births of these adjacent scientific fields and outlines how they came about.

Troubled Minds Radio
Beyond the Mindscape - An Elusive Fractal Beginning

Mar 7, 2024 · 177:16


Are we living in a universe optimized for the creation of consciousness? Could our minds contain vast neural multiverses, and could the cosmos itself be a slowly awakening mind? As we ponder the nature of reality and our place within it, we must ask ourselves: are these ideas mere flights of fancy, or could they represent a glimpse into a deeper, more astonishing truth about the nature of existence?LIVE ON Digital Radio! http://bit.ly/3m2Wxom or http://bit.ly/40KBtlWhttp://www.troubledminds.org Support The Show!https://www.spreaker.com/podcast/troubled-minds-radio--4953916/supporthttps://rokfin.com/creator/troubledmindshttps://patreon.com/troubledmindshttps://www.buymeacoffee.com/troubledmindshttps://troubledfans.comFriends of Troubled Minds! - https://troubledminds.org/friendsShow Schedule Sun-Mon-Tues-Wed-Thurs 7-10pstiTunes - https://apple.co/2zZ4hx6Spotify - https://spoti.fi/2UgyzqMTuneIn - https://bit.ly/2FZOErSTwitter - https://bit.ly/2CYB71U----------------------------------------https://troubledminds.org/beyond-the-mindscape-an-elusive-fractal-ideascape/https://www.newscientist.com/article/mg26134792-100-is-the-human-brain-really-the-most-complex-object-in-the-universe/https://mindmatters.ai/2024/02/could-our-minds-be-bigger-than-even-a-multiverse/https://savewisdom.org/https://www.dndbeyond.com/posts/1145-what-is-the-far-realm-a-timeless-land-of-writhinghttps://www.scienceabc.com/nature/universe/matrioshka-brain.htmlhttps://bigthink.com/hard-science/are-we-living-inside-a-matrioshka-brain-how-advanced-civilizations-could-reshape-reality/https://www.thoughtco.com/what-are-boltzmann-brains-2699421https://en.wikipedia.org/wiki/Boltzmann_brainhttps://www.howandwhys.com/boltzmann-brain-explained-for-dummies/https://medicalxpress.com/news/2024-02-conflicting-theories-consciousness.html

The Nonlinear Library
AF - Neural uncertainty estimation for alignment by Charlie Steiner

Dec 5, 2023 · 27:53


Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Neural uncertainty estimation for alignment, published by Charlie Steiner on December 5, 2023 on The AI Alignment Forum. Introduction Suppose you've built some AI model of human values. You input a situation, and it spits out a goodness rating. You might want to ask: "What are the error bars on this goodness rating?" In addition to it just being nice to know error bars, an uncertainty estimate can also be useful inside the AI: guiding active learning[1], correcting for the optimizer's curse[2], or doing out-of-distribution detection[3]. I recently got into the uncertainty estimation literature for neural networks (NNs) for a pet reason: I think it would be useful for alignment to quantify the domain of validity of an AI's latent features. If we point an AI at some concept in its world-model, optimizing for realizations of that concept can go wrong by pushing that concept outside its domain of validity. But just keep thoughts of alignment in your back pocket for now. This post is primarily a survey of the uncertainty estimation literature, interspersed with my own takes. The Bayesian neural network picture The Bayesian NN picture is the great granddaddy of basically every uncertainty estimation method for NNs, so it's appropriate to start here. The picture is simple. You start with a prior distribution over parameters. Your training data is evidence, and after training on it you get an updated distribution over parameters. Given an input, you calculate a distribution over outputs by propagating the input through the Bayesian neural network. This would all be very proper and irrelevant ("Sure, let me just update my 2trilliondimensional joint distribution over all the parameters of the model"), except for the fact that actually training NNs does kind of work this way. If you use a log likelihood loss and L2 regularization, the parameters that minimize loss will be at the peak of the distribution that a Bayesian NN would have, if your prior on the parameters was a Gaussian[4][5]. This is because of a bridge between the loss landscape and parameter uncertainty. Bayes's rule says P(parameters|dataset)=P(parameters)P(dataset|parameters)/P(dataset). Here P(parameters|dataset)is your posterior distribution you want to estimate, and P(parameters)P(dataset|parameters) is the exponential of the loss[6]. This lends itself to physics metaphors like "the distribution of parameters is a Boltzmann distribution sitting at the bottom of the loss basin." Empirically, calculating the uncertainty of a neural net by pretending it's adhering to the Bayesian NN picture works so well that one nice paper on ensemble methods[7] called it "ground truth." Of course to actually compute anything here you have to make approximations, and if you make the quick and dirty approximations (e.g. pretend you can find the shape of the loss basin from the Hessian) you get bad results[8], but people are doing clever things with Monte Carlo methods these days[9], and they find that better approximations to the Bayesian NN calculation get better results. But doing Monte Carlo traversal of the loss landscape is expensive. For a technique to apply at scale, it must impose only a small multiplier on cost to run the model, and if you want it to become ubiquitous the cost it imposes must be truly tiny. Ensembles A quite different approach to uncertainty is ensembles[10]. 
Just train a dozen-ish models, ask them for their recommendations, and estimate uncertainty from the spread. The dozen-times cost multiplier on everything is steep, but if you're querying the model a lot it's cheaper than Monte Carlo estimation of the loss landscape. Ensembling is theoretically straightforward. You don't need to pretend the model is trained to convergence, you don't need to train specifically for predictive loss, you don't even need...
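As a deliberately tiny illustration of the ensemble idea described above (my own sketch, not code from the post, using cheap polynomial fits as stand-ins for the "dozen-ish models"): train each member on a bootstrap resample and read off uncertainty as the spread of their predictions, which widens far from the training data.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))               # training inputs
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=200)    # noisy targets

def fit_member(X, y, degree=5):
    """One 'model': a least-squares polynomial fit (stand-in for a trained network)."""
    return np.polyfit(X[:, 0], y, degree)

ensemble = []
for _ in range(12):                                  # a dozen-ish members
    idx = rng.integers(0, len(X), size=len(X))       # bootstrap resample
    ensemble.append(fit_member(X[idx], y[idx]))

x_query = np.array([0.5, 2.9, 6.0])                  # last point is far out of distribution
preds = np.stack([np.polyval(coefs, x_query) for coefs in ensemble])
mean, spread = preds.mean(axis=0), preds.std(axis=0)
for xq, m, s in zip(x_query, mean, spread):
    print(f"x = {xq:4.1f}   prediction = {m:7.2f}   uncertainty (std) = {s:.2f}")
```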

Zimmerman en Space
Time crystals

Nov 13, 2023 · 11:23


Time keeps on ticking, and so this is already the 50th episode of "Zimmerman en Space". In this episode we cover so-called time crystals.
A Nuclear Spin System at Negative Temperature: https://journals.aps.org/pr/abstract/10.1103/PhysRev.81.279
Negative Absolute Temperature for Motional Degrees of Freedom: https://arxiv.org/abs/1211.0545
Kan het kouder dan het absolute nulpunt? (Can it get colder than absolute zero?): https://www.cursor.tue.nl/nieuws/2013/januari/kan-het-kouder-dan-het-absolute-nulpunt/
Time crystal: https://en.wikipedia.org/wiki/Time_crystal
Absolute zero: https://en.wikipedia.org/wiki/Absolute_zero
Gibbs, Boltzmann, and negative temperatures: https://arxiv.org/pdf/1403.4299.pdf
Zero-point energy: https://en.wikipedia.org/wiki/Zero-point_energy
Discrete time crystals: rigidity, criticality, and realizations: https://arxiv.org/pdf/1608.02589.pdf
Quantum Time Crystals: https://arxiv.org/pdf/1202.2539.pdf
Absence of Quantum Time Crystals: https://arxiv.org/abs/1410.2143
Time crystals in periodically driven systems: https://arxiv.org/abs/1811.06657
Observation of discrete time-crystalline order in a disordered dipolar many-body system: https://arxiv.org/abs/1610.08057
Physicists create time crystals: New form of matter may hold key to developing quantum machines: https://phys.org/news/2017-04-physicists-crystals-key-quantum-machines.html
Zimmerman en Space Discord: https://discord.gg/qnJWjXt4JZ
The Zimmerman en Space podcast is licensed under a Creative Commons CC0 1.0 license: http://creativecommons.org/publicdomain/zero/1.0

Learned Lag
Now the Worst Day

Sep 8, 2023 · 14:11


Boltzmann can bite me.

From Nowhere to Nothing
Boltzmann Brains

Sep 8, 2023 · 54:36


In this episode, we explore the mind-bending implications of Boltzmann Brains to the best of our abilities.

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0
LLMs Everywhere: Running 70B models in browsers and iPhones using MLC — with Tianqi Chen of CMU / OctoML

Aug 10, 2023 · 52:10


We have just announced our first set of speakers at AI Engineer Summit! Sign up for the livestream or email sponsors@ai.engineer if you'd like to support.We are facing a massive GPU crunch. As both startups and VC's hoard Nvidia GPUs like countries count nuclear stockpiles, tweets about GPU shortages have become increasingly common. But what if we could run LLMs with AMD cards, or without a GPU at all? There's just one weird trick: compilation. And there's one person uniquely qualified to do it.We had the pleasure to sit down with Tianqi Chen, who's an Assistant Professor at CMU, where he both teaches the MLC course and runs the MLC group. You might also know him as the creator of XGBoost, Apache TVM, and MXNet, as well as the co-founder of OctoML. The MLC (short for Machine Learning Compilation) group has released a lot of interesting projects:* MLC Chat: an iPhone app that lets you run models like RedPajama-3B and Vicuna-7B on-device. It gets up to 30 tok/s!* Web LLM: Run models like LLaMA-70B in your browser (!!) to offer local inference in your product.* MLC LLM: a framework that allows any language models to be deployed natively on different hardware and software stacks.The MLC group has just announced new support for AMD cards; we previously talked about the shortcomings of ROCm, but using MLC you can get performance very close to the NVIDIA's counterparts. This is great news for founders and builders, as AMD cards are more readily available. Here are their latest results on AMD's 7900s vs some of top NVIDIA consumer cards.If you just can't get a GPU at all, MLC LLM also supports ARM and x86 CPU architectures as targets by leveraging LLVM. While speed performance isn't comparable, it allows for non-time-sensitive inference to be run on commodity hardware.We also enjoyed getting a peek into TQ's process, which involves a lot of sketching:With all the other work going on in this space with projects like ggml and Ollama, we're excited to see GPUs becoming less and less of an issue to get models in the hands of more people, and innovative software solutions to hardware problems!Show Notes* TQ's Projects:* XGBoost* Apache TVM* MXNet* MLC* OctoML* CMU Catalyst* ONNX* GGML* Mojo* WebLLM* RWKV* HiPPO* Tri Dao's Episode* George Hotz EpisodePeople:* Carlos Guestrin* Albert GuTimestamps* [00:00:00] Intros* [00:03:41] The creation of XGBoost and its surprising popularity* [00:06:01] Comparing tree-based models vs deep learning* [00:10:33] Overview of TVM and how it works with ONNX* [00:17:18] MLC deep dive* [00:28:10] Using int4 quantization for inference of language models* [00:30:32] Comparison of MLC to other model optimization projects* [00:35:02] Running large language models in the browser with WebLLM* [00:37:47] Integrating browser models into applications* [00:41:15] OctoAI and self-optimizing compute* [00:45:45] Lightning RoundTranscriptAlessio: Hey everyone, welcome to the Latent Space podcast. This is Alessio, Partner and CTO in Residence at Decibel Partners, and I'm joined by my co-host Swyx, writer and editor of Latent Space. [00:00:20]Swyx: Okay, and we are here with Tianqi Chen, or TQ as people call him, who is assistant professor in ML computer science at CMU, Carnegie Mellon University, also helping to run Catalyst Group, also chief technologist of OctoML. You wear many hats. Are those, you know, your primary identities these days? Of course, of course. [00:00:42]Tianqi: I'm also, you know, very enthusiastic open source. 
So I'm also a VP and PRC member of the Apache TVM project and so on. But yeah, these are the things I've been up to so far. [00:00:53]Swyx: Yeah. So you did Apache TVM, XGBoost, and MXNet, and we can cover any of those in any amount of detail. But maybe what's one thing about you that people might not learn from your official bio or LinkedIn, you know, on the personal side? [00:01:08]Tianqi: Let me say, yeah, so normally when I do, I really love coding, even though like I'm trying to run all those things. So one thing that I keep a habit on is I try to do sketchbooks. I have a book, like real sketchbooks to draw down the design diagrams and the sketchbooks I keep sketching over the years, and now I have like three or four of them. And it's kind of a usually a fun experience of thinking the design through and also seeing how open source project evolves and also looking back at the sketches that we had in the past to say, you know, all these ideas really turn into code nowadays. [00:01:43]Alessio: How many sketchbooks did you get through to build all this stuff? I mean, if one person alone built one of those projects, he'll be a very accomplished engineer. Like you built like three of these. What's that process like for you? Like it's the sketchbook, like the start, and then you think about the code or like. [00:01:59]Swyx: Yeah. [00:02:00]Tianqi: So, so usually I start sketching on high level architectures and also in a project that works for over years, we also start to think about, you know, new directions, like of course generative AI language model comes in, how it's going to evolve. So normally I would say it takes like one book a year, roughly at that rate. It's usually fun to, I find it's much easier to sketch things out and then gives a more like a high level architectural guide for some of the future items. Yeah. [00:02:28]Swyx: Have you ever published this sketchbooks? Cause I think people would be very interested on, at least on a historical basis. Like this is the time where XGBoost was born, you know? Yeah, not really. [00:02:37]Tianqi: I started sketching like after XGBoost. So that's a kind of missing piece, but a lot of design details in TVM are actually part of the books that I try to keep a record of. [00:02:48]Swyx: Yeah, we'll try to publish them and publish something in the journals. Maybe you can grab a little snapshot for visual aid. Sounds good. [00:02:57]Alessio: Yeah. And yeah, talking about XGBoost, so a lot of people in the audience might know it's a gradient boosting library, probably the most popular out there. And it became super popular because many people started using them in like a machine learning competitions. And I think there's like a whole Wikipedia page of like all state-of-the-art models. They use XGBoost and like, it's a really long list. When you were working on it, so we just had Tri Dao, who's the creator of FlashAttention on the podcast. And I asked him this question, it's like, when you were building FlashAttention, did you know that like almost any transform race model will use it? And so I asked the same question to you when you were coming up with XGBoost, like, could you predict it would be so popular or like, what was the creation process? And when you published it, what did you expect? We have no idea. [00:03:41]Tianqi: Like, actually, the original reason that we built that library is that at that time, deep learning just came out. Like that was the time where AlexNet just came out. 
And one of the ambitious mission that myself and my advisor, Carlos Guestrin, then is we want to think about, you know, try to test the hypothesis. Can we find alternatives to deep learning models? Because then, you know, there are other alternatives like, you know, support vector machines, linear models, and of course, tree-based models. And our question was, if you build those models and feed them with big enough data, because usually like one of the key characteristics of deep learning is that it's taking a lot [00:04:22]Swyx: of data, right? [00:04:23]Tianqi: So we will be able to get the same amount of performance. That's a hypothesis we're setting out to test. Of course, if you look at now, right, that's a wrong hypothesis, but as a byproduct, what we find out is that, you know, most of the gradient boosting library out there is not efficient enough for us to test that hypothesis. So I happen to have quite a bit of experience in the past of building gradient boosting trees and their variants. So Effective Action Boost was kind of like a byproduct of that hypothesis testing. At that time, I'm also competing a bit in data science challenges, like I worked on KDDCup and then Kaggle kind of become bigger, right? So I kind of think maybe it's becoming useful to others. One of my friends convinced me to try to do a Python binding of it. That tends to be like a very good decision, right, to be effective. Usually when I build it, we feel like maybe a command line interface is okay. And now we have a Python binding, we have R bindings. And then it realized, you know, it started getting interesting. People started contributing different perspectives, like visualization and so on. So we started to push a bit more on to building distributive support to make sure it works on any platform and so on. And even at that time point, when I talked to Carlos, my advisor, later, he said he never anticipated that we'll get to that level of success. And actually, why I pushed for gradient boosting trees, interestingly, at that time, he also disagreed. He thinks that maybe we should go for kernel machines then. And it turns out, you know, actually, we are both wrong in some sense, and Deep Neural Network was the king in the hill. But at least the gradient boosting direction got into something fruitful. [00:06:01]Swyx: Interesting. [00:06:02]Alessio: I'm always curious when it comes to these improvements, like, what's the design process in terms of like coming up with it? And how much of it is a collaborative with like other people that you're working with versus like trying to be, you know, obviously, in academia, it's like very paper-driven kind of research driven. [00:06:19]Tianqi: I would say the extra boost improvement at that time point was more on like, you know, I'm trying to figure out, right. But it's combining lessons. Before that, I did work on some of the other libraries on matrix factorization. That was like my first open source experience. Nobody knew about it, because you'll find, likely, if you go and try to search for the package SVD feature, you'll find some SVN repo somewhere. But it's actually being used for some of the recommender system packages. So I'm trying to apply some of the previous lessons there and trying to combine them. The later projects like MXNet and then TVM is much, much more collaborative in a sense that... But, of course, extra boost has become bigger, right? So when we started that project myself, and then we have, it's really amazing to see people come in. 
Michael, who was a lawyer, and now he works on the AI space as well, on contributing visualizations. Now we have people from our community contributing different things. So extra boost even today, right, it's a community of committers driving the project. So it's definitely something collaborative and moving forward on getting some of the things continuously improved for our community. [00:07:37]Alessio: Let's talk a bit about TVM too, because we got a lot of things to run through in this episode. [00:07:42]Swyx: I would say that at some point, I'd love to talk about this comparison between extra boost or tree-based type AI or machine learning compared to deep learning, because I think there is a lot of interest around, I guess, merging the two disciplines, right? And we can talk more about that. I don't know where to insert that, by the way, so we can come back to it later. Yeah. [00:08:04]Tianqi: Actually, what I said, when we test the hypothesis, the hypothesis is kind of, I would say it's partially wrong, because the hypothesis we want to test now is, can you run tree-based models on image classification tasks, where deep learning is certainly a no-brainer right [00:08:17]Swyx: now today, right? [00:08:18]Tianqi: But if you try to run it on tabular data, still, you'll find that most people opt for tree-based models. And there's a reason for that, in the sense that when you are looking at tree-based models, the decision boundaries are naturally rules that you're looking at, right? And they also have nice properties, like being able to be agnostic to scale of input and be able to automatically compose features together. And I know there are attempts on building neural network models that work for tabular data, and I also sometimes follow them. I do feel like it's good to have a bit of diversity in the modeling space. Actually, when we're building TVM, we build cost models for the programs, and actually we are using XGBoost for that as well. I still think tree-based models are going to be quite relevant, because first of all, it's really to get it to work out of the box. And also, you will be able to get a bit of interoperability and control monotonicity [00:09:18]Swyx: and so on. [00:09:19]Tianqi: So yes, it's still going to be relevant. I also sometimes keep coming back to think about, are there possible improvements that we can build on top of these models? And definitely, I feel like it's a space that can have some potential in the future. [00:09:34]Swyx: Are there any current projects that you would call out as promising in terms of merging the two directions? [00:09:41]Tianqi: I think there are projects that try to bring a transformer-type model for tabular data. I don't remember specifics of them, but I think even nowadays, if you look at what people are using, tree-based models are still one of their toolkits. So I think maybe eventually it's not even a replacement, it will be just an ensemble of models that you can call. Perfect. [00:10:07]Alessio: Next up, about three years after XGBoost, you built this thing called TVM, which is now a very popular compiler framework for models. Let's talk about, so this came out about at the same time as ONNX. So I think it would be great if you could maybe give a little bit of an overview of how the two things work together. Because it's kind of like the model, then goes to ONNX, then goes to the TVM. But I think a lot of people don't understand the nuances. I can get a bit of a backstory on that. 
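As a side note for readers who want to see what the tree-based tabular workflow described above looks like in code, here is a minimal sketch using XGBoost's scikit-learn wrapper; the synthetic dataset and hyperparameter values are illustrative assumptions, not settings discussed in the episode.

# Minimal gradient-boosted-trees sketch for tabular data (assumes the
# xgboost and scikit-learn packages; data and hyperparameters are illustrative).
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

# Synthetic tabular data standing in for a real dataset.
X, y = make_classification(n_samples=5000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Tree ensembles need no feature scaling, and expose interpretability hooks
# such as feature importances and monotonicity constraints.
model = XGBClassifier(n_estimators=300, max_depth=6, learning_rate=0.1)
model.fit(X_train, y_train)
print("accuracy:", accuracy_score(y_test, model.predict(X_test)))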
[00:10:33]Tianqi: So actually, that's kind of ancient history. Before XGBoost, I worked on deep learning for two or three years. I got a master's before I started my PhD. And during my master's, my thesis focused on applying convolutional restricted Boltzmann machines to ImageNet classification. That is the thing I was working on. And that was before the AlexNet moment. So effectively, I had to handcraft NVIDIA CUDA kernels on, I think, a GTX 2070 card. It took me about six months to get one model working. And eventually, that model was not so good, and we should have picked a better model. But that was like an ancient history that really got me into this deep learning field. And of course, eventually, we found it didn't work out. So in my master's, I ended up working on recommender systems, which got me a paper, and I applied and got a PhD. But I always wanted to come back to work on the deep learning field. So after XGBoost, I think I started to work with some folks on this particular MXNet. At that time, the frameworks like Caffe, Theano, and PyTorch hadn't yet come out. And we were really working hard to optimize for performance on GPUs. At that time, I found it's really hard, even for NVIDIA GPUs. It took me six months. And then it's amazing to see on different hardwares how hard it is to go and optimize code for the platforms that are interesting. So that got me thinking, can we build something more generic and automatic? So that I don't need an entire team of so many people to go and build those frameworks. So that's the motivation of starting to work on TVM. There is really too much machine learning engineering needed to support deep learning models on the platforms that we're interested in. I think it started a bit earlier than ONNX, but once it got announced, I think it's in a similar time period. So overall, how it works is that TVM will be able to take a subset of machine learning programs that are represented in what we call a computational graph. Nowadays, we can also represent loop-level programs ingested from your machine learning models. Usually, you have model formats like ONNX, or in PyTorch, they have the FX tracer that allows you to trace the FX graph. And then it goes through TVM. We also realized that, well, yes, it needs to be more customizable, so it will be able to perform some of the compilation optimizations like fusing operators together, doing smart memory planning, and more importantly, generating low-level code. So that works for NVIDIA and also is portable to other GPU backends, even non-GPU backends [00:13:36]Swyx: out there. [00:13:37]Tianqi: So that's a project that actually has been my primary focus over the past few years. And it's great to see how it started; I think we were among the very early initiators of machine learning compilation. I remember during a visit one day, one of the students asked me, are you still working on deep learning frameworks? I told them that I'm working on ML compilation. And they said, okay, compilation, that sounds very ancient. It sounds like a very old field. And why are you working on this? And now it's starting to get more traction, like if you say Torch Compile and other things. I'm really glad to see this field starting to pick up. And also we have to continue innovating here. [00:14:17]Alessio: I think the other thing that I noticed is that it's kind of a big jump in terms of area of focus to go from XGBoost to TVM; it's a different part of the stack.
Why did you decide to do that? And I think the other thing about compiling to different GPUs and eventually CPUs too, did you already see some of the strain that models could have just being focused on one runtime, only being on CUDA, and how much of that went into it? [00:14:50]Tianqi: I think it's less about trying to get impact, more about wanting to have fun. I like to hack code, I had great fun hacking CUDA code. Of course, being able to generate CUDA code is cool, right? But now, after being able to generate CUDA code, okay, by the way, you can do it on other platforms, isn't that amazing? So it's more of that attitude that got me started on this. And also, I think when we look at different researchers, myself, I'm more like a problem solver type. So I like to look at a problem and say, okay, what kind of tools do we need to solve that problem? So regardless, it could be building better models. For example, while we built XGBoost, we built certain regularizations into it so that it's more robust. It also means building system optimizations, writing low-level code, maybe trying to write assembly and build compilers and so on. So as long as they solve the problem, definitely go and try to do them together. And I also see it's a common trend right now. Like if you want to be able to solve machine learning problems, it's no longer at the algorithm layer, right? You kind of need to solve it from both the algorithm, data, and systems angles. And this entire field of machine learning systems, I think it's kind of emerging. And there's now a conference around it. And it's really good to see a lot more people starting to look into this. [00:16:10]Swyx: Yeah. Are you talking about ICML or something else? [00:16:13]Tianqi: So machine learning and systems, right? So not only machine learning, but machine learning and systems. So there's a conference called MLSys. It's definitely a smaller community than ICML, but I think it's also an emerging and growing community where people are talking about what are the implications of building systems for machine learning, right? And how do you go and optimize things around that and co-design models and systems together? [00:16:37]Swyx: Yeah. And you were area chair for ICML and NeurIPS as well. So you've just had a lot of conference and community organization experience. Is that also an important part of your work? Well, it's kind of expected for an academic. [00:16:48]Tianqi: If I hold an academic job, I need to do services for the community. Okay, great. [00:16:53]Swyx: Your most recent venture in MLSys is going to the phone with MLC LLM. You announced this in April. I have it on my phone. It's great. I'm running Llama 2, Vicuna. I don't know what other models you offer. But maybe just kind of describe your journey into MLC. And I don't know how this coincides with your work at CMU. Is that some kind of outgrowth? [00:17:18]Tianqi: I think it's more like a focused effort that we want in the area of machine learning compilation. So it's kind of related to what we built in TVM. When we built TVM, that was five years ago, right? And a lot of things happened. We built the end-to-end machine learning compiler that works, the first one that works. But then we captured a lot of lessons there. So then we are building a second iteration called TVM Unity. That allows ML engineers to quickly capture new models and build on-demand optimizations for them. And MLC LLM is kind of like an MLC vertical.
It's more like a vertically driven effort where we go and build tutorials and build projects, like LLM solutions, so that we can really show, like, okay, you can take machine learning compilation technology, apply it, and bring something fun forward. Yeah. So yes, it runs on phones, which is really cool. But the goal here is not only making it run on phones, right? The goal is making it deploy universally. So we do run the 70 billion models on Apple M2 Macs. Actually, on single batch inference, more recently on CUDA, we get, I think, the best performance you can get out there already on 4-bit inference. Actually, as I alluded to earlier before the podcast, we just had a result on AMD. And on a single batch, actually, we can get the latest AMD GPU, this is a consumer card, to about 80% of the 4090, NVIDIA's best consumer card out there. So it's not yet on par, but thinking about the diversity it enables and the price you can get on that card, it's really amazing what you can do with this kind of technology. [00:19:10]Swyx: So one thing I'm a little bit confused by is that most of these models are in PyTorch, but you're running this inside a TVM. I don't know. Was there any fundamental change that you needed to do, or was this basically the fundamental design of TVM? [00:19:25]Tianqi: So the idea is that, of course, it comes back to program representation, right? So effectively, TVM has this program representation called TVMScript that contains both a computational graph and operator-level representation. So yes, initially, we do need to take a bit of effort to bring those models onto the program representation that TVM supports. Usually, there are a mix of ways, depending on the kind of model you're looking at. For example, for vision models and stable diffusion models, usually we can just do tracing that takes the PyTorch model onto TVM. That part is still being robustified so that we can bring more models in. On language model tasks, actually what we do is we directly build some of the model constructors and try to directly map from Hugging Face models. The goal is, if you have a Hugging Face configuration, we will be able to bring that in and apply optimizations on them. So one fun thing about model compilation is that your optimization doesn't happen only at the source-language level, right? For example, if you're writing PyTorch code, you just go and try to use a better fused operator at a source code level. Torch compile might help you do a bit of things in there. In most model compilations, it not only happens at the beginning stage, but we also apply generic transformations in between, also through a Python API. So you can tweak some of that. So that part of the optimization gives a lot of uplift in getting both performance and portability across environments. And another thing that we do have is what we call universal deployment. So if you get the ML program into this TVMScript format, where there are functions that take in tensors and output tensors, we will be able to compile it, and then you will be able to load the function in any of the language runtimes that TVM supports. So you could load it in JavaScript, and that's a JavaScript function that takes in tensors and outputs tensors. The same if you're loading it in Python, of course, and C++ and Java. So the goal there is really to bring the ML model to the language that people care about and be able to run it on a platform they like.
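To make the ingest-compile-deploy flow described above concrete, here is a rough sketch of the classic TVM Relay path from an ONNX model to a compiled module run from Python; the file name, input name, and shape are assumptions, and the exact APIs differ across TVM versions (the newer TVM Unity/Relax stack that MLC LLM builds on looks different).

# Sketch of the classic TVM flow: ingest ONNX, compile, run from Python.
# API names follow the older Relay tutorials; newer TVM Unity/Relax differs.
import numpy as np
import onnx
import tvm
from tvm import relay
from tvm.contrib import graph_executor

onnx_model = onnx.load("model.onnx")            # assumed model file
shape_dict = {"input": (1, 3, 224, 224)}        # assumed input name and shape

# Import into TVM's graph-level IR, then compile: operator fusion, memory
# planning, and low-level code generation happen here for the chosen target.
mod, params = relay.frontend.from_onnx(onnx_model, shape_dict)
lib = relay.build(mod, target="llvm", params=params)

# Load and invoke the compiled function; the same compiled artifact can be
# loaded from the other runtimes TVM supports (C++, JavaScript, and so on).
dev = tvm.cpu()
module = graph_executor.GraphModule(lib["default"](dev))
module.set_input("input", np.random.rand(1, 3, 224, 224).astype("float32"))
module.run()
print(module.get_output(0).numpy().shape)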
[00:21:37]Swyx: It strikes me that I've talked to a lot of compiler people, but you don't have a traditional compiler background. You're inventing your own discipline called machine learning compilation, or MLC. Do you think that this will be a bigger field going forward? [00:21:52]Tianqi: First of all, I do work with people working on compilation as well. So we're also taking inspirations from a lot of early innovations in the field. Like for example, TVM initially, we take a lot of inspirations from Halide, which is just an image processing compiler. And of course, since then, we have evolved quite a bit to focus on the machine learning related compilations. If you look at some of our conference publications, you'll find that machine learning compilation is already kind of a subfield. So if you look at papers in both machine learning venues, the MLC conferences, of course, and also system venues, every year there will be papers around machine learning compilation. And in the compiler conference called CGO, there's a C4ML workshop that also kind of trying to focus on this area. So definitely it's already starting to gain traction and becoming a field. I wouldn't claim that I invented this field, but definitely I helped to work with a lot of folks there. And I try to bring a perspective, of course, trying to learn a lot from the compiler optimizations as well as trying to bring in knowledges in machine learning and systems together. [00:23:07]Alessio: So we had George Hotz on the podcast a few episodes ago, and he had a lot to say about AMD and their software. So when you think about TVM, are you still restricted in a way by the performance of the underlying kernel, so to speak? So if your target is like a CUDA runtime, you still get better performance, no matter like TVM kind of helps you get there, but then that level you don't take care of, right? [00:23:34]Swyx: There are two parts in here, right? [00:23:35]Tianqi: So first of all, there is the lower level runtime, like CUDA runtime. And then actually for NVIDIA, a lot of the mood came from their libraries, like Cutlass, CUDN, right? Those library optimizations. And also for specialized workloads, actually you can specialize them. Because a lot of cases you'll find that if you go and do benchmarks, it's very interesting. Like two years ago, if you try to benchmark ResNet, for example, usually the NVIDIA library [00:24:04]Swyx: gives you the best performance. [00:24:06]Tianqi: It's really hard to beat them. But as soon as you start to change the model to something, maybe a bit of a variation of ResNet, not for the traditional ImageNet detections, but for latent detection and so on, there will be some room for optimization because people sometimes overfit to benchmarks. These are people who go and optimize things, right? So people overfit the benchmarks. So that's the largest barrier, like being able to get a low level kernel libraries, right? In that sense, the goal of TVM is actually we try to have a generic layer to both, of course, leverage libraries when available, but also be able to automatically generate [00:24:45]Swyx: libraries when possible. [00:24:46]Tianqi: So in that sense, we are not restricted by the libraries that they have to offer. That's why we will be able to run Apple M2 or WebGPU where there's no library available because we are kind of like automatically generating libraries. That makes it easier to support less well-supported hardware, right? For example, WebGPU is one example. 
From a runtime perspective, AMD, I think before, their Vulkan driver was not very well supported. Recently, they are getting good. But even before that, we were able to support AMD through this GPU graphics backend called Vulkan, which is not as performant, but it gives you a decent portability across those [00:25:29]Swyx: hardware. [00:25:29]Alessio: And I know we got other MLC stuff to talk about, like WebLLM, but I want to wrap up on the optimization that you're doing. So there's kind of four core things, right? Kernel fusion, which we talked a bit about in the FlashAttention episode and the tinygrad one, memory planning, and loop optimization. I think those are like pretty, you know, self-explanatory. I think the one that people have the most questions about, can you quickly explain [00:25:53]Swyx: those? [00:25:54]Tianqi: So they are kind of different things, right? Kernel fusion means that, you know, if you have an operator like a convolution, or in the case of a transformer, an MLP, you have other operators that follow it, right? You don't want to launch two GPU kernels. You want to be able to put them together in a smart way, right? And as for memory planning, it's more about, you know, hey, if you run like Python code, every time you generate a new array, you are effectively allocating a new piece of memory, right? Of course, PyTorch and other frameworks try to optimize for you. So there is a smart memory allocator behind the scene. But actually, in a lot of cases, it's much better to statically allocate and plan everything ahead of time. And that's where a compiler can come in. First of all, actually for language models, it's much harder because of dynamic shapes. So you need to be able to do what we call symbolic shape tracing. So we have a symbolic variable that tells you the shape of the first tensor is n by 12. And the shape of the third tensor is also n by 12. Or maybe it's n times 2 by 12. Although you don't know what n is, right? But you will be able to know that relation and be able to use that to reason about fusion and other decisions. So besides this, I think loop transformation is quite important. And it's actually non-traditional. Originally, if you simply write code and you want to get performance, it's very hard. For example, you know, if you write a matrix multiply, the simplest thing you can do is a triple loop: for i, j, k: C[i][j] += A[i][k] * B[k][j]. But that code is 100 times slower than the best available code that you can get. So we do a lot of transformations, like being able to take the original code, trying to put things into shared memory, and making use of tensor cores, making use of memory copies, and all this. Actually, for all these things, we also realize that, you know, we cannot do all of them. So we also make the ML compilation framework available as a Python package, so that people will be able to continuously improve that part of the engineering in a more transparent way. So we find that's very useful, actually, for us to be able to get good performance very quickly on some of the new models. Like when Llama 2 came out, we were able to go and look at the whole thing, here's the bottleneck, and we can go and optimize those. [00:28:10]Alessio: And then the fourth one being weight quantization. So everybody wants to know about that. And just to give people an idea of the memory saving, if you're doing FP32, it's like four bytes per parameter. Int8 is like one byte per parameter.
So you can really shrink down the memory footprint. What are some of the trade-offs there? How do you figure out what the right target is? And what are the precision trade-offs, too? [00:28:37]Tianqi: Right now, a lot of people also mostly use int4 now for language models. So that really shrinks things down a lot. And more recently, actually, we started to think that, at least in MOC, we don't want to have a strong opinion on what kind of quantization we want to bring, because there are so many researchers in the field. So what we can do is we can allow developers to customize the quantization they want, but we still bring the optimum code for them. So we are working on this item called bring your own quantization. In fact, hopefully MOC will be able to support more quantization formats. And definitely, I think there's an open field that's being explored. Can you bring more sparsities? Can you quantize activations as much as possible, and so on? And it's going to be something that's going to be relevant for quite a while. [00:29:27]Swyx: You mentioned something I wanted to double back on, which is most people use int4 for language models. This is actually not obvious to me. Are you talking about the GGML type people, or even the researchers who are training the models also using int4? [00:29:40]Tianqi: Sorry, so I'm mainly talking about inference, not training, right? So when you're doing training, of course, int4 is harder, right? Maybe you could do some form of mixed type precision for inference. I think int4 is kind of like, in a lot of cases, you will be able to get away with int4. And actually, that does bring a lot of savings in terms of the memory overhead, and so on. [00:30:09]Alessio: Yeah, that's great. Let's talk a bit about maybe the GGML, then there's Mojo. How should people think about MLC? How do all these things play together? I think GGML is focused on model level re-implementation and improvements. Mojo is a language, super sad. You're more at the compiler level. Do you all work together? Do people choose between them? [00:30:32]Tianqi: So I think in this case, I think it's great to say the ecosystem becomes so rich with so many different ways. So in our case, GGML is more like you're implementing something from scratch in C, right? So that gives you the ability to go and customize each of a particular hardware backend. But then you will need to write from CUDA kernels, and you write optimally from AMD, and so on. So the kind of engineering effort is a bit more broadened in that sense. Mojo, I have not looked at specific details yet. I think it's good to start to say, it's a language, right? I believe there will also be machine learning compilation technologies behind it. So it's good to say, interesting place in there. In the case of MLC, our case is that we do not want to have an opinion on how, where, which language people want to develop, deploy, and so on. And we also realize that actually there are two phases. We want to be able to develop and optimize your model. By optimization, I mean, really bring in the best CUDA kernels and do some of the machine learning engineering in there. And then there's a phase where you want to deploy it as a part of the app. So if you look at the space, you'll find that GGML is more like, I'm going to develop and optimize in the C language, right? And then most of the low-level languages they have. And Mojo is that you want to develop and optimize in Mojo, right? And you deploy in Mojo. 
In fact, that's the philosophy they want to push for. In the ML case, we find that actually if you want to develop models, the machine learning community likes Python. Python is a language that you should focus on. So in the case of MLC, we really want to be able to enable, not only be able to just define your model in Python, that's very common, right? But also do ML optimization, like engineering optimization, CUDA kernel optimization, memory planning, all those things in Python, so that it's customizable and so on. But when you do deployment, we realize that people want a bit of a universal flavor. If you are a web developer, you want JavaScript, right? If you're maybe an embedded system person, maybe you would prefer C++ or C or Rust. And people sometimes do like Python in a lot of cases. So in the case of MLC, we really want to have this vision of, you optimize and build generic optimizations in Python, then you deploy that universally onto the environments that people like. [00:32:54]Swyx: That's a great perspective and comparison, I guess. One thing I wanted to make sure that we cover is that I think you are one of this emerging set of academics that also very much focus on your artifacts of delivery. Of course. Something we talked about with Tri Dao, that he was very focused on his GitHub. And obviously you treated XGBoost like a product, you know? And then now you're publishing an iPhone app. Okay. Yeah. Yeah. What is your thinking about academics getting involved in shipping products? [00:33:24]Tianqi: I think there are different ways of making impact, right? Definitely, you know, there are academics that are writing papers and building insights for people so that people can build products on top of them. In my case, I think the particular field I'm working on, machine learning systems, I feel like really we need to be able to get it into the hands of people so that really we see the problem, right? And we show that we can solve a problem. And it's a different way of making impact. And there are academics that are doing similar things. Like, you know, if you look at some of the people from Berkeley, right? Every few years, they will come up with big open source projects. Certainly, I think it's just a healthy ecosystem to have different ways of making impacts. And I feel like really being able to do open source and work with the open source community is really rewarding, because we have a real problem to work on when we build our research. Actually, that research comes together and people will be able to make use of it. And we also start to see interesting research challenges that we wouldn't otherwise see, right, if you're just trying to do a prototype and so on. So I feel like it's something that is one interesting way of making impact, making contributions. [00:34:40]Swyx: Yeah, you definitely have a lot of impact there. And having experience publishing Mac stuff before, the Apple App Store is no joke. It is the hardest compilation, human compilation effort. So one thing that we definitely wanted to cover is running in the browser. You have a 70 billion parameter model running in the browser. That's right. Can you just talk about how? Yeah, of course. [00:35:02]Tianqi: So I think that there are a few elements that need to come in, right? First of all, you know, we do need a MacBook, the latest one, like M2 Max, because you need the memory to be big enough to cover that. So for a 70 billion model, it takes you about, I think, 50 gigabytes of RAM.
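That memory figure follows from simple bytes-per-parameter arithmetic; here is a back-of-the-envelope sketch (weights only, ignoring activations and the KV cache, which is why the total quoted above is somewhat higher than the raw weight size).

# Approximate weight memory for a language model at different precisions.
def weight_gib(num_params: float, bits_per_param: int) -> float:
    return num_params * bits_per_param / 8 / 2**30

for name, bits in [("fp32", 32), ("fp16", 16), ("int8", 8), ("int4", 4)]:
    print(f"70B @ {name}: {weight_gib(70e9, bits):7.1f} GiB")

# int4 puts a 70B model at roughly 33 GiB of weights; runtime overhead and
# the KV cache push the practical requirement toward the ~50 GB figure above.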
So the M2 Max, the upper version, will be able to run it, right? And it also leverages machine learning compilation. Again, what we are doing is the same, whether it's running on iPhone, on server cloud GPUs, on AMDs, or on MacBook, we all go through that same MLC pipeline. Of course, in certain cases, maybe we'll do a bit of customization and iteration for each of them. And then it runs on the browser runtime, this package called WebLLM. So that will effectively... So what we do is we will take that original model and compile it to what we call WebGPU. And then WebLLM will be able to pick it up. And WebGPU is this latest GPU technology that major browsers are shipping right now. So you can get it in Chrome already. It allows you to be able to access your native GPUs from a browser. And then effectively, that language model is just invoking the WebGPU kernels through there. So actually, when Llama 2 came out, initially, we asked the question about, can you run 70 billion on a MacBook? That was the question we were asking. So first, we actually... Jin Lu, who is the engineer pushing this, he got 70 billion on a MacBook. We had a CLI version. So in MLC, you will be able to... That runs through the Metal accelerator. So effectively, you use the Metal programming language to get the GPU acceleration. So we found, okay, it works for the MacBook. Then we asked, we had a WebGPU backend. Why not try it there? So we just tried it out. And it's really amazing to see everything up and running. And actually, it runs smoothly in that case. So I do think there are some kind of interesting use cases already in this, because everybody has a browser. You don't need to install anything. I think it doesn't make sense yet to really run a 70 billion model in a browser, because you kind of need to be able to download the weights and so on. But I think we're getting there. Effectively, the most powerful models you will be able to run on a consumer device. It's kind of really amazing. And also, in a lot of cases, there might be use cases. For example, if I'm going to build a chatbot that I talk to and that answers questions, maybe some of the components, like the voice-to-text, could run on the client side. And so there are a lot of possibilities of being able to have something hybrid that contains the edge component or something that runs on a server. [00:37:47]Alessio: Do these browser models have a way for applications to hook into them? So if I'm using, say, you can use OpenAI or you can use the local model. Of course. [00:37:56]Tianqi: Right now, actually, we are building... So there's an NPM package called WebLLM, right? So that you will be able to, if you want to embed it onto your web app, you will be able to directly depend on WebLLM and you will be able to use it. We also have a REST API that's OpenAI compatible. So that REST API, I think, right now, is actually running on a native backend, so if you have a CUDA server it's faster to run on the native backend. But also we have a WebGPU version of it that you can go and run. So yeah, we do want to be able to have easier integrations with existing applications. And the OpenAI API is certainly one way to do that. Yeah, this is great. [00:38:37]Swyx: I actually did not know there's an NPM package that makes it very, very easy to try out and use. I want to actually... One thing I'm unclear about is the chronology. Because as far as I know, Chrome shipped WebGPU the same time that you shipped WebLLM. Okay, yeah. So did you have some kind of secret chat with Chrome?
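Since the episode mentions an OpenAI-compatible REST API, a minimal client call might look like the following sketch; the endpoint URL, port, and model identifier are assumptions for illustration, not values given in the episode.

# Hypothetical client for an OpenAI-compatible chat completions endpoint.
# The URL, port, and model name below are illustrative assumptions.
import json
import urllib.request

payload = {
    "model": "Llama-2-7b-chat",     # assumed model identifier
    "messages": [{"role": "user", "content": "Summarize machine learning compilation."}],
    "temperature": 0.7,
}
req = urllib.request.Request(
    "http://localhost:8000/v1/chat/completions",   # assumed local endpoint
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    reply = json.loads(resp.read())
print(reply["choices"][0]["message"]["content"])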
[00:38:57]Tianqi: The good news is that Chrome is doing a very good job of trying to have early release. So although the official shipment of the Chrome WebGPU is the same time as WebILM, actually, you will be able to try out WebGPU technology in Chrome. There is an unstable version called Canary. I think as early as two years ago, there was a WebGPU version. Of course, it's getting better. So we had a TVM-based WebGPU backhand two years ago. Of course, at that time, there were no language models. It was running on less interesting, well, still quite interesting models. And then this year, we really started to see it getting matured and performance keeping up. So we have a more serious push of bringing the language model compatible runtime onto the WebGPU. [00:39:45]Swyx: I think you agree that the hardest part is the model download. Has there been conversations about a one-time model download and sharing between all the apps that might use this API? That is a great point. [00:39:58]Tianqi: I think it's already supported in some sense. When we download the model, WebILM will cache it onto a special Chrome cache. So if a different web app uses the same WebILM JavaScript package, you don't need to redownload the model again. So there is already something there. But of course, you have to download the model once at least to be able to use it. [00:40:19]Swyx: Okay. One more thing just in general before we're about to zoom out to OctoAI. Just the last question is, you're not the only project working on, I guess, local models. That's right. Alternative models. There's gpt4all, there's olama that just recently came out, and there's a bunch of these. What would be your advice to them on what's a valuable problem to work on? And what is just thin wrappers around ggml? Like, what are the interesting problems in this space, basically? [00:40:45]Tianqi: I think making API better is certainly something useful, right? In general, one thing that we do try to push very hard on is this idea of easier universal deployment. So we are also looking forward to actually have more integration with MOC. That's why we're trying to build API like WebILM and other things. So we're also looking forward to collaborate with all those ecosystems and working support to bring in models more universally and be able to also keep up the best performance when possible in a more push-button way. [00:41:15]Alessio: So as we mentioned in the beginning, you're also the co-founder of Octomel. Recently, Octomel released OctoAI, which is a compute service, basically focuses on optimizing model runtimes and acceleration and compilation. What has been the evolution there? So Octo started as kind of like a traditional MLOps tool, where people were building their own models and you help them on that side. And then it seems like now most of the market is shifting to starting from pre-trained generative models. Yeah, what has been that experience for you and what you've seen the market evolve? And how did you decide to release OctoAI? [00:41:52]Tianqi: One thing that we found out is that on one hand, it's really easy to go and get something up and running, right? So if you start to consider there's so many possible availabilities and scalability issues and even integration issues since becoming kind of interesting and complicated. So we really want to make sure to help people to get that part easy, right? 
And now a lot of things, if we look at the customers we talk to and the market, certainly generative AI is something that is very interesting. So that is something that we really hope to help elevate. And also building on top of technology we build to enable things like portability across hardwares. And you will be able to not worry about the specific details, right? Just focus on getting the model out. We'll try to work on infrastructure and other things that helps on the other end. [00:42:45]Alessio: And when it comes to getting optimization on the runtime, I see when we run an early adopters community and most enterprises issue is how to actually run these models. Do you see that as one of the big bottlenecks now? I think a few years ago it was like, well, we don't have a lot of machine learning talent. We cannot develop our own models. Versus now it's like, there's these great models you can use, but I don't know how to run them efficiently. [00:43:12]Tianqi: That depends on how you define by running, right? On one hand, it's easy to download your MLC, like you download it, you run on a laptop, but then there's also different decisions, right? What if you are trying to serve a larger user request? What if that request changes? What if the availability of hardware changes? Right now it's really hard to get the latest hardware on media, unfortunately, because everybody's trying to work on the things using the hardware that's out there. So I think when the definition of run changes, there are a lot more questions around things. And also in a lot of cases, it's not only about running models, it's also about being able to solve problems around them. How do you manage your model locations and how do you make sure that you get your model close to your execution environment more efficiently? So definitely a lot of engineering challenges out there. That we hope to elevate, yeah. And also, if you think about our future, definitely I feel like right now the technology, given the technology and the kind of hardware availability we have today, we will need to make use of all the possible hardware available out there. That will include a mechanism for cutting down costs, bringing something to the edge and cloud in a more natural way. So I feel like still this is a very early stage of where we are, but it's already good to see a lot of interesting progress. [00:44:35]Alessio: Yeah, that's awesome. I would love, I don't know how much we're going to go in depth into it, but what does it take to actually abstract all of this from the end user? You know, like they don't need to know what GPUs you run, what cloud you're running them on. You take all of that away. What was that like as an engineering challenge? [00:44:51]Tianqi: So I think that there are engineering challenges on. In fact, first of all, you will need to be able to support all the kind of hardware backhand you have, right? On one hand, if you look at the media library, you'll find very surprisingly, not too surprisingly, most of the latest libraries works well on the latest GPU. But there are other GPUs out there in the cloud as well. So certainly being able to have know-hows and being able to do model optimization is one thing, right? Also infrastructures on being able to scale things up, locate models. And in a lot of cases, we do find that on typical models, it also requires kind of vertical iterations. So it's not about, you know, build a silver bullet and that silver bullet is going to solve all the problems. 
It's more about, you know, we're building a product, we'll work with the users, and we find out there are interesting opportunities at a certain point. And then our engineers will go and solve that, and it will automatically be reflected in the service. [00:45:45]Swyx: Awesome. [00:45:46]Alessio: We can jump into the lightning round until, I don't know, Sean, if you have more questions or TQ, if you have more stuff you wanted to talk about that we didn't get a chance to [00:45:54]Swyx: touch on. [00:45:54]Alessio: Yeah, we have talked a lot. [00:45:55]Swyx: So, yeah. We always would like to ask, you know, do you have a commentary on other parts of AI and ML that is interesting to you? [00:46:03]Tianqi: So right now, I think one thing that we are really pushing hard for is this question about how far can we bring open source, right? I'm kind of like a hacker and I really like to put things together. So I think it's unclear what the future of AI looks like. On one hand, it could be possible that, you know, you just have a few big players, you just try to talk to those bigger language models and that can do everything, right? On the other hand, one of the things that we in academia are really excited about and pushing for, and that's one reason why I'm pushing for MLC, is, can we build something where you have different models? You have personal models that know the best movie you like, but you also have bigger models that maybe know more, and you get those models to interact with each other, right? And be able to have a wide ecosystem of AI agents that helps each person while still being able to do things like personalization. Some of them can run locally, some of them, of course, running on a cloud, and how do they interact with each other? So I think that is a very exciting time where the future is yet undecided, but I feel like there is something we can do to shape that future as well. [00:47:18]Swyx: One more thing, which is something I'm also pursuing, which is, and this kind of goes back into predictions, but also back in your history, do you have any idea, or are you looking out for anything post-transformers as far as architecture is concerned? [00:47:32]Tianqi: I think, you know, in a lot of these cases, you can find there are already promising models for long contexts, right? There are state space models, where, like, you know, one of our colleagues, Albert Gu, worked on the HiPPO models, right? And then there is an open source version called RWKV. It's like a recurrent model that allows you to summarize things. Actually, we are bringing RWKV to MLC as well, so maybe you will be able to see one of the models. [00:48:00]Swyx: We actually recorded an episode with one of the RWKV core members. It's unclear because there's no academic backing. It's just open source people. Oh, I see. So you like the merging of recurrent networks and transformers? [00:48:13]Tianqi: I do love to see this model space continue growing, right? And I feel like in a lot of cases, it's just the attention mechanism that is getting changed in some sense. So I feel like definitely there are still a lot of things to be explored here. And that is also one reason why we want to keep pushing machine learning compilation, because one of the things we are trying to push on is productivity for machine learning engineering, so that as soon as some of these models come out, we will be able to, you know, bring them onto those environments that are out there.
[00:48:43]Swyx: Yeah, it's a really good mission. Okay. Very excited to see that RWKV and state space model stuff. I'm hearing increasing chatter about that stuff. Okay. Lightning round, as always fun. I'll take the first one. Acceleration. What has already happened in AI that you thought would take much longer? [00:48:59]Tianqi: Emergence of more like a conversation chatbot ability is something that kind of surprised me before it came out. This is like one piece that I feel originally I thought would take much longer, but yeah, [00:49:11]Swyx: it happens. And it's funny because like the original, like Eliza chatbot was something that goes all the way back in time. Right. And then we just suddenly came back again. Yeah. [00:49:21]Tianqi: It's always too interesting to think about, but with a kind of a different technology [00:49:25]Swyx: in some sense. [00:49:25]Alessio: What about the most interesting unsolved question in AI? [00:49:31]Swyx: That's a hard one, right? [00:49:32]Tianqi: So I can tell you like what kind of I'm excited about. So, so I think that I have always been excited about this idea of continuous learning and lifelong learning in some sense. So how AI continues to evolve with the knowledges that have been there. It seems that we're getting much closer with all those recent technologies. So being able to develop systems, support, and be able to think about how AI continues to evolve is something that I'm really excited about. [00:50:01]Swyx: So specifically, just to double click on this, are you talking about continuous training? That's like a training. [00:50:06]Tianqi: I feel like, you know, training adaptation and it's all similar things, right? You want to think about entire life cycle, right? The life cycle of collecting data, training, fine tuning, and maybe have your local context that getting continuously curated and feed onto models. So I think all these things are interesting and relevant in here. [00:50:29]Swyx: Yeah. I think this is something that people are really asking, you know, right now we have moved a lot into the sort of pre-training phase and off the shelf, you know, the model downloads and stuff like that, which seems very counterintuitive compared to the continuous training paradigm that people want. So I guess the last question would be for takeaways. What's basically one message that you want every listener, every person to remember today? [00:50:54]Tianqi: I think it's getting more obvious now, but I think one of the things that I always want to mention in my talks is that, you know, when you're thinking about AI applications, originally people think about algorithms a lot more, right? Our algorithm models, they are still very important. But usually when you build AI applications, it takes, you know, both algorithm side, the system optimizations, and the data curations, right? So it takes a connection of so many facades to be able to bring together an AI system and be able to look at it from that holistic perspective is really useful when we start to build modern applications. I think it's going to continue going to be more important in the future. [00:51:35]Swyx: Yeah. Thank you for showing the way on this. And honestly, just making things possible that I thought would take a lot longer. So thanks for everything you've done. [00:51:46]Tianqi: Thank you for having me. [00:51:47]Swyx: Yeah. [00:51:47]Alessio: Thanks for coming on TQ. [00:51:49]Swyx: Have a good one. [00:51:49] Get full access to Latent Space at www.latent.space/subscribe

Robinson's Podcast
121 - Julian Barbour: Thermodynamics, Boltzmann Brains, and a New Theory of Time

Robinson's Podcast

Play Episode Listen Later Jul 30, 2023 117:01


Julian Barbour is a physicist working in the foundations of physics and quantum gravity, with a special interest in time and the history of science. In this episode, Julian and Robinson discuss thermodynamics and the arrows of time, including a new theory of time developed by Julian and his collaborators, which is laid out in his book, The Janus Point: A New Theory of Time. If you're interested in the foundations of physics—which you absolutely should be—then please check out the John Bell Institute (Julian is an Honorary Fellow at the JBI), which is devoted to providing a home for research and education in this important area. At this early stage any donations are immensely helpful. Julian's Website: http://platonia.com/index.html The Janus Point: https://a.co/d/4NVOGqq A History of Thermodynamics: http://platonia.com/A_History_of_Thermodynamics.pdf Quantum without Quantum:  https://arxiv.org/abs/2305.13335 The John Bell Institute: https://www.johnbellinstitute.org OUTLINE 00:00 In This Episode… 00:56 Introduction 04:42 Julian's Interest in Time 07:27 Time's Arrows 23:34 The Problem of Time-Reversal Symmetry 25:54 A Potted Overview of Entropy and Thermodynamics 38:21 Entropy and Time's Arrow 52:32 The Janus Point and a New Theory of Time 01:07:00 Intuition and The Janus Point 01:21:21 Entropy and Entaxy 01:26:00 Cosmic Inflation and Its Problems 01:44:05 Quantum Mechanics without the Wave Function Robinson's Website: http://robinsonerhardt.com Robinson Erhardt researches symbolic logic and the foundations of mathematics at Stanford University. Join him in conversations with philosophers, scientists, weightlifters, artists, and everyone in-between.  --- Support this podcast: https://podcasters.spotify.com/pod/show/robinson-erhardt/support

The Gradient Podcast
Hugo Larochelle: Deep Learning as Science

The Gradient Podcast

Play Episode Listen Later Jul 6, 2023 108:28


In episode 80 of The Gradient Podcast, Daniel Bashir speaks to Professor Hugo Larochelle. Professor Larochelle leads the Montreal Google DeepMind team and is adjunct professor at Université de Montréal and a Canada CIFAR Chair. His research focuses on the study and development of deep learning algorithms. Have suggestions for future podcast guests (or other feedback)? Let us know here or reach us at editor@thegradient.pub. Subscribe to The Gradient Podcast: Apple Podcasts | Spotify | Pocket Casts | RSS. Follow The Gradient on Twitter. Outline:* (00:00) Intro* (01:38) Prof. Larochelle's background, working in Bengio's lab* (04:53) Prof. Larochelle's work and connectionism* (08:20) 2004-2009, work with Bengio* (08:40) Nonlocal Estimation of Manifold Structure, manifolds and deep learning* (13:58) Manifold learning in vision and language* (16:00) Relationship to Denoising Autoencoders and greedy layer-wise pretraining* (21:00) From input copying to learning about local distribution structure* (22:30) Zero-Data Learning of New Tasks* (22:45) The phrase “extend machine learning towards AI” and terminology* (26:55) Prescient hints of prompt engineering* (29:10) Daniel goes on totally unnecessary tangent* (30:00) Methods for training deep networks (strategies and robust interdependent codes)* (33:45) Motivations for layer-wise pretraining* (35:15) Robust Interdependent Codes and interactions between neurons in a single network layer* (39:00) 2009-2011, postdoc in Geoff Hinton's lab* (40:00) Reflections on the AlexNet moment* (41:45) Frustration with methods for evaluating unsupervised methods, NADE* (44:45) How researchers thought about representation learning, toying with objectives instead of architectures* (47:40) The Restricted Boltzmann Forest* (50:45) Imposing structure for tractable learning of distributions* (53:11) 2011-2016 at U Sherbrooke (and Twitter)* (53:45) How Prof. Larochelle approached research problems* (56:00) How Domain Adversarial Networks came about* (57:12) Can we still learn from Restricted Boltzmann Machines?* (1:02:20) The ~ Infinite ~ Restricted Boltzmann Machine* (1:06:55) The need for researchers doing different sorts of work* (1:08:58) 2017-present, at MILA (and Google)* (1:09:30) Modulating Early Visual Processing by Language, neuroscientific inspiration* (1:13:22) Representation learning and generalization, what is a good representation (Meta-Dataset, Universal representation transformer layer, universal template, Head2Toe)* (1:15:10) Meta-Dataset motivation* (1:18:00) Shifting focus to the problem—good practices for “recycling deep learning”* (1:19:15) Head2Toe intuitions* (1:21:40) What are “universal representations” and manifold perspective on datasets, what is the right pretraining dataset* (1:26:02) Prof. Larochelle's takeaways from Fortuitous Forgetting in Connectionist Networks (led by Hattie Zhou)* (1:32:15) Obligatory commentary on The Present Moment and current directions in ML* (1:36:18) The creation and motivations of the TMLR journal* (1:41:48) Prof. Larochelle's takeaways about doing good science, building research groups, and nurturing a research environment* (1:44:05) Prof. Larochelle's advice for aspiring researchers today* (1:47:41) Outro. Links:* Professor Larochelle's homepage and Twitter* Transactions on Machine Learning Research* Papers* 2004-2009* Nonlocal Estimation of Manifold Structure* Classification using Discriminative Restricted Boltzmann Machines* Zero-data learning of new tasks* Exploring Strategies for Training Deep Neural Networks* Deep Learning using Robust Interdependent Codes* 2009-2011* Stacked Denoising Autoencoders* Tractable multivariate binary density estimation and the restricted Boltzmann forest* The Neural Autoregressive Distribution Estimator* Learning Attentional Policies for Tracking and Recognition in Video with Deep Networks* 2011-2016* Practical Bayesian Optimization of Machine Learning Algorithms* Learning Algorithms for the Classification Restricted Boltzmann Machine* A neural autoregressive topic model* Domain-Adversarial Training of Neural Networks* NADE* An Infinite Restricted Boltzmann Machine* 2017-present* Modulating early visual processing by language* Meta-Dataset* A Universal Representation Transformer Layer for Few-Shot Image Classification* Learning a universal template for few-shot dataset generalization* Impact of aliasing on generalization in deep convolutional networks* Head2Toe: Utilizing Intermediate Representations for Better Transfer Learning* Fortuitous Forgetting in Connectionist Networks. Get full access to The Gradient at thegradientpub.substack.com/subscribe
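The episode above returns repeatedly to Restricted Boltzmann Machines (classification RBMs, the restricted Boltzmann forest, the infinite RBM). For readers who have not seen one, here is a minimal sketch of a binary RBM trained with one step of contrastive divergence (CD-1). The layer sizes, learning rate, and toy dataset are illustrative assumptions, not anything taken from the episode or the papers listed.

```python
# Minimal sketch of a binary Restricted Boltzmann Machine trained with CD-1.
# Layer sizes, learning rate, and toy data are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

n_visible, n_hidden = 6, 3
W = 0.01 * rng.standard_normal((n_visible, n_hidden))  # weights
b = np.zeros(n_visible)                                # visible biases
c = np.zeros(n_hidden)                                 # hidden biases

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sample(p):
    return (rng.random(p.shape) < p).astype(float)

# Toy dataset: two repeated binary patterns.
data = np.array([[1, 1, 1, 0, 0, 0],
                 [0, 0, 0, 1, 1, 1]] * 50, dtype=float)

lr = 0.1
for epoch in range(200):
    for v0 in data:
        # Positive phase: hidden activations given the data.
        ph0 = sigmoid(v0 @ W + c)
        h0 = sample(ph0)
        # Negative phase: one step of Gibbs sampling (CD-1).
        pv1 = sigmoid(h0 @ W.T + b)
        v1 = sample(pv1)
        ph1 = sigmoid(v1 @ W + c)
        # Gradient approximation: data statistics minus model statistics.
        W += lr * (np.outer(v0, ph0) - np.outer(v1, ph1))
        b += lr * (v0 - v1)
        c += lr * (ph0 - ph1)

# Reconstruction check: the trained RBM should roughly reproduce each pattern.
v = data[0]
print(sigmoid(sigmoid(v @ W + c) @ W.T + b).round(2))
```

The energy-based form P(v, h) proportional to exp(-E(v, h)) is what connects this model family to the Boltzmann distribution discussed elsewhere on this page.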

Zimmerman en Space
Boltzmann brains en Last Thursdayism

Zimmerman en Space

Play Episode Listen Later Jul 4, 2023 17:18


Today we take another look at just how deep the rabbit hole goes. What do we actually know about the reality we live in? Or are you the only one reading this? And if so, why?
Boltzmann brain: https://en.wikipedia.org/wiki/Boltzmann_brain
Sinks in the Landscape, Boltzmann Brains, and the Cosmological Constant Problem: https://arxiv.org/pdf/hep-th/0611043.pdf
Brain in a vat: https://en.wikipedia.org/wiki/Brain_in_a_vat
Dream argument: https://en.wikipedia.org/wiki/Dream_argument
Solipsism: https://en.wikipedia.org/wiki/Solipsism
Hoe lang duurt het nu ("How long does now last"): https://www.nporadio2.nl/podcasts/zimmerman-in-space/5335/32-hoe-lang-duurt-het-nu
Leven we in een computersimulatie ("Are we living in a computer simulation"): https://www.nporadio2.nl/podcasts/zimmerman-in-space/5286/4-leven-we-in-een-computersimulatie
On testing the simulation theory: https://arxiv.org/pdf/1703.00058.pdf
The Zimmerman en Space podcast is licensed under a Creative Commons CC0 1.0 license. http://creativecommons.org/publicdomain/zero/1.0

Robinson's Podcast
106 - David Albert & Sean Carroll: Quantum Theory, Boltzmann Brains, & The Fine-Tuned Universe

Robinson's Podcast

Play Episode Listen Later Jun 25, 2023 130:20


David Albert is the Frederick E. Woodbridge Professor of Philosophy at Columbia University and Director of the Philosophical Foundations of Physics program at Columbia. David is a prior guest of the Robinson's Podcast multiverse, having appeared on episodes #23 (with Justin Clarke-Doane), #30, and #67 (with Tim Maudlin). Sean Carroll is Homewood Professor of Natural Philosophy at Johns Hopkins University and fractal faculty at the Santa Fe Institute. He is also host of Sean Carroll's Mindscape, a terrific show (that influenced the birth of Robinson's Podcast ) about science, society, philosophy, culture, arts, and ideas. Sean also had a great conversation with David on Mindscape, linked below. Both David and Sean are rare breeds—philosophers who are physicists, and physicists who are philosophers—and in this episode Robinson, David, and Sean speak about some of the philosophical concerns at the foundations of physics. They first discuss the Many-Worlds theory of quantum mechanics before turning to the apparent fine-tuning of our universe for life and the possibility of Boltzmann Brains, or complex observers in the universe that arise spontaneously due to quantum fluctuations or the random motion of matter. Preorder David's A Guess at the Riddle: https://a.co/d/4MUEJZN Sean's Website: https://www.preposterousuniverse.com Sean's Twitter: https://twitter.com/seanmcarroll The Biggest Ideas in the Universe: https://a.co/d/dPKZ40X David Albert on Sean Carroll's Mindscape: https://youtu.be/AglOFx6eySE  OUTLINE 00:00 In This Episode… 00:59 Introduction 08:11 Superposition and The Many-Worlds Theory of Quantum Mechanics 22:34 Decoherence 27:20 Probability 41:32 Some Thought Experiments Concerning Probability 01:08:35 Parsimony 01:12:03 The Fine-Tuned Universe and Quantum Theory 01:14:52 Entropy 01:45:37 Intelligent Design 01:47:22 Boltzmann Brains Galore Robinson's Website: http://robinsonerhardt.com Robinson Erhardt researches symbolic logic and the foundations of mathematics at Stanford University. Join him in conversations with philosophers, scientists, weightlifters, artists, and everyone in-between.  --- Support this podcast: https://podcasters.spotify.com/pod/show/robinson-erhardt/support

AI Today Podcast: Artificial Intelligence Insights, Experts, and Opinion
AI Today Podcast: AI Glossary Series – Boltzmann Machine

AI Today Podcast: Artificial Intelligence Insights, Experts, and Opinion

Play Episode Listen Later May 31, 2023 10:09


In this episode of the AI Today podcast, hosts Kathleen Walch and Ron Schmelzer define the term Boltzmann Machine, explain how it relates to AI, and why it's important to know about it. Want to dive deeper into an understanding of artificial intelligence, machine learning, or big data concepts? Want to learn how to apply AI and data using hands-on approaches and the latest technologies? Continue reading AI Today Podcast: AI Glossary Series – Boltzmann Machine at AI & Data Today.

The Cartesian Cafe
Daniel Schroeder | Introduction to Thermal Physics

The Cartesian Cafe

Play Episode Listen Later May 2, 2023 93:14


Daniel Schroeder is a particle and accelerator physicist and an editor for The American Journal of Physics. Dan received his PhD from Stanford University, where he spent most of his time at the Stanford Linear Accelerator, and he is currently a professor in the department of physics and astronomy at Weber State University. Dan is also the author of two revered physics textbooks, the first with Michael Peskin called An Introduction to Quantum Field Theory (or simply Peskin & Schroeder within the physics community) and the second An Introduction to Thermal Physics. Dan enjoys teaching physics courses at all levels, from Elementary Astronomy through Quantum Mechanics. In this episode, I get to connect with one of my teachers, having taken both thermodynamics and quantum field theory courses when I was a university student based on Dan's textbooks. We take a deep dive towards answering two fundamental questions in the subject of thermodynamics: what is temperature and what is entropy? We provide both a qualitative and quantitative analysis, discussing good and bad definitions of temperature, microstates and macrostates, the second law of thermodynamics, and the relationship between temperature and entropy. Our discussion was also a great chance to shed light on some of the philosophical assumptions and conundrums in thermodynamics that do not typically come up in a physics course: the fundamental assumption of statistical mechanics, Laplace's demon, and the arrow of time problem (Loschmidt's paradox) arising from the second law of thermodynamics (i.e. why is entropy increasing in the future when mechanics has time-reversal symmetry). Patreon: https://www.patreon.com/timothynguyen Outline: 00:00:00 : Introduction 00:01:54 : Writing Books 00:06:51 : Academic Track: Research vs Teaching 00:11:01 : Charming Book Snippets 00:14:54 : Discussion Plan: Two Basic Questions 00:17:19 : Temperature is What You Measure with a Thermometer 00:22:50 : Bad definition of Temperature: Measure of Average Kinetic Energy 00:25:17 : Equipartition Theorem 00:26:10 : Relaxation Time 00:27:55 : Entropy from Statistical Mechanics 00:30:12 : Einstein solid 00:32:43 : Microstates + Example Computation 00:38:33: Fundamental Assumption of Statistical Mechanics (FASM) 00:46:29 : Multiplicity is highly concentrated about its peak 00:49:50 : Entropy is Log(Multiplicity) 00:52:02 : The Second Law of Thermodynamics 00:56:13 : FASM based on our ignorance? 00:57:37 : Quantum Mechanics and Discretization 00:58:30 : More general mathematical notions of entropy 01:02:52 : Unscrambling an Egg and The Second Law of Thermodynamics 01:06:49 : Principle of Detailed Balance 01:09:52 : How important is FASM? 01:12:03 : Laplace's Demon 01:13:35 : The Arrow of Time (Loschmidt's Paradox) 01:15:20 : Comments on Resolution of Arrow of Time Problem 01:16:07 : Temperature revisited: The actual definition in terms of entropy 01:25:24 : Historical comments: Clausius, Boltzmann, Carnot 01:29:07 : Final Thoughts: Learning Thermodynamics   Further Reading: Daniel Schroeder. An Introduction to Thermal Physics L. Landau & E. Lifschitz. Statistical Physics.   Twitter: @iamtimnguyen Webpage: http://www.timothynguyen.org
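Several of the segments above (the Einstein solid, microstates and multiplicity, entropy as the log of multiplicity, and the entropy-based definition of temperature) can be reproduced in a few lines of code. The sketch below follows the standard textbook treatment of the Einstein solid; it is not code from the episode, and the solid sizes and the energy quantum are arbitrary illustrative choices.

```python
# Sketch of the Einstein-solid ideas discussed in the episode:
# multiplicity, entropy = k_B * ln(multiplicity), and 1/T = dS/dU.
# Solid sizes and the energy quantum eps are arbitrary illustrative values.
from math import comb, log

k_B = 1.380649e-23  # Boltzmann's constant, J/K
eps = 1.6e-21       # energy quantum per unit (illustrative), J

def multiplicity(N, q):
    """Microstates of an Einstein solid with N oscillators and q energy units."""
    return comb(q + N - 1, q)

def entropy(N, q):
    """Entropy S = k_B * ln(Omega)."""
    return k_B * log(multiplicity(N, q))

# Two weakly coupled solids sharing 100 energy units: the multiplicity of the
# combined system is sharply peaked, here (by symmetry) near equal sharing.
N_A, N_B, q_total = 300, 300, 100
omega = [multiplicity(N_A, q) * multiplicity(N_B, q_total - q) for q in range(q_total + 1)]
print("most probable q_A:", max(range(q_total + 1), key=lambda q: omega[q]))

# Temperature from entropy: 1/T = dS/dU, estimated with a finite difference
# (adding one energy quantum eps to a single solid).
N, q = 300, 100
dS = entropy(N, q + 1) - entropy(N, q)
T = eps / dS
print("temperature estimate (K):", round(T, 1))
```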

The Theory of Anything
Episode 57: Quantum Immortality / Quantum Torment

The Theory of Anything

Play Episode Listen Later May 1, 2023 63:00


Does every one of us live forever in the multiverse? Is death a solvable problem? What is “quantum suicide”? Is quantum torment a concern? Does every fantastical thing we can imagine occur somewhere in the multiverse? What are “Harry Potter universes”? Are we Boltzmann brains? Bruce, Cameo, and Peter consider these questions in this week's episode. Image from jupiterimages on Freeimages.com --- Support this podcast: https://podcasters.spotify.com/pod/show/four-strands/support

The Nonlinear Library
LW - The Brain is Not Close to Thermodynamic Limits on Computation by DaemonicSigil

The Nonlinear Library

Play Episode Listen Later Apr 24, 2023 8:10


Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: The Brain is Not Close to Thermodynamic Limits on Computation, published by DaemonicSigil on April 24, 2023 on LessWrong.
Introduction
This post is written as a response to jacob_cannel's recent post Contra Yudkowsky on AI Doom. He writes: EY correctly recognizes that thermodynamic efficiency is a key metric for computation/intelligence, and he confidently, brazenly claims (as of late 2021), that the brain is about 6 OOM from thermodynamic efficiency limits. EY is just completely out of his depth here: he doesn't seem to understand how the Landauer limit actually works, doesn't seem to understand that synapses are analog MACs which minimally require OOMs more energy than simple binary switches, doesn't seem to understand that interconnect dominates energy usage regardless, etc. Most of Jacob's analysis for brain efficiency is contained in this post: Brain Efficiency: Much More than You Wanted to Know. I believe this analysis is flawed with respect to the thermodynamic energy efficiency of the brain. That's the scope of this post: I will respond to Jacob's claims about thermodynamic limits on brain energy efficiency. Other constraints are out of scope, as is a discussion of the rest of the analysis in Brain Efficiency.
The Landauer limit
Just to review quickly, the Landauer limit says that erasing 1 bit of information has an energy cost of $kT\log 2$. This energy must be dissipated as heat into the environment. Here $k$ is Boltzmann's constant, while $T$ is the temperature of the environment. At room temperature, this is about 0.02 eV. Erasing a bit is something that you have to do quite often in many types of computations, and the more bit erasures your computation needs, the more energy it costs to do that computation. (To give a general sense of how many erasures are needed to do a given amount of computation: If we add n-bit numbers $a$ and $b$ to get $a + b \bmod 2^n$, and then throw away the original values of $a$ and $b$, that costs $n$ bit erasures. I.e. the energy cost is $n kT\log 2$.)
Extra reliability costs?
Brain Efficiency claims that the energy dissipation required to erase a bit becomes many times larger when we try to erase the bit reliably. The key transition error probability $\alpha$ is constrained by the bit energy: $\alpha = e^{-E_b/(k_B T)}$. Here's a range of bit energies and corresponding minimal room temp switch error rates (in electronvolts): $\alpha = 0.49$, $E_b = 0.02\,\mathrm{eV}$; $\alpha = 0.01$, $E_b = 0.1\,\mathrm{eV}$; $\alpha = 10^{-25}$, $E_b = 1\,\mathrm{eV}$. This adds a factor of about 50 to the energy cost of erasing a bit, so this would be quite significant if true. To back up this claim, Jacob cites this paper by Michael P. Frank. The relevant equation is pulled from section 2. However, in that entire section, Frank is temporarily assuming that the energy used to represent the bit internally is entirely dissipated when it comes time for the bit to be erased. Dissipating that entire energy is not required by the laws of physics, however. Frank himself explicitly mentions this in the paper (see section 3): The energy used to represent the bit can be partially recovered when erasing it. Only $kT\log 2$ must actually be dissipated when erasing a bit, even if we ask for very high reliability. (I originally became suspicious of Jacob's numbers here based on a direct calculation. Details in this comment for those interested.)
Analog signals?
Quoting Brain Efficiency: Analog operations are implemented by a large number of quantal/binary carrier units; with the binary precision equivalent to the signal to noise ratio where the noise follows a binomial distribution. Because of this analog representation, Jacob estimates about 6000 eV required to do the equivalent of an 8 bit multiplication. However, the laws of physics don't require us to do our floating point operations in analog. "are implemented" does not imply "have to be implemented". Digital multiplication of two 8 bit ...
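The Landauer figure quoted in the excerpt above ("about 0.02 eV" per bit erased at room temperature) is easy to verify numerically. Below is a quick check, not code from the post; taking room temperature to be 300 K is an assumption, and the 64-bit example at the end is purely illustrative.

```python
# Numerical check of the Landauer limit quoted above: erasing one bit
# dissipates at least k*T*ln(2), roughly 0.02 eV at room temperature.
# Room temperature is taken to be 300 K here (an assumption).
from math import log

k_B = 1.380649e-23    # Boltzmann's constant, J/K
T = 300.0             # room temperature, K
eV = 1.602176634e-19  # joules per electronvolt

E_landauer = k_B * T * log(2)
print(f"Landauer limit: {E_landauer:.3e} J = {E_landauer / eV:.4f} eV")
# -> about 2.87e-21 J, i.e. ~0.018 eV, consistent with the ~0.02 eV in the post.

# Energy floor for an in-place add of two n-bit numbers (n bit erasures), n = 64:
n = 64
print(f"64-bit add floor: {n * E_landauer / eV:.2f} eV")
```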

Beauty At Work
What's Beautiful About Mathematics? With Dr. Carlo Lancellotti

Beauty At Work

Play Episode Listen Later Apr 15, 2023 42:42


Carlo Lancellotti is a Professor of Mathematics at the College of Staten Island and a faculty member in the Physics Program at the CUNY Graduate Center. His field of scholarship is mathematical physics, with a special emphasis on the kinetic theory of plasmas and gravitating systems. He has published in a variety of journals, including Physical Review, Physical Review Letters, the Journal of Statistical Physics, Chaos, the Journal of Transport Theory, and Statistical Physics. He has also translated into English and published three volumes of works by the late Italian philosopher Augusto Del Noce. Lancellotti has also written essays of his own on Del Noce and other topics, which have appeared in Communio, Public Discourse, Church Life Journal, First Things, and other outlets.
In this episode, we talk about:
Beauty, structure, and harmony and their role in the study of mathematics.
The aesthetic criteria used by some mathematicians.
The beauty found in the Boltzmann equation.
Beauty and truth in simplicity and consistency—understanding reality through math.
The limitations of mathematics in what it can tell us about reality.
Mathematicians and the Platonic world of ideas.
Appreciating the beauty in mathematics—how beauty can help encourage the study of math.
Understanding math is a necessity in learning art.
Resources mentioned:
David Bohm's Wholeness and the Implicate Order: https://www.amazon.com/Wholeness-Implicate-Order-David-Bohm/dp/0415289793
The Redemption of Scientific Reason by Carlo Lancellotti: https://churchlifejournal.nd.edu/articles/the-redemption-of-scientific-reason/
Support us on Patreon: https://www.patreon.com/BeautyatWorkPodcast
Support the show
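For readers wondering what object is being admired in "the beauty found in the Boltzmann equation," this is the standard kinetic-theory form of the equation, written for a distribution function f(x, v, t) under an external force F. The rendering below is supplied for context and is not taken from the episode.

```latex
\frac{\partial f}{\partial t}
  + \mathbf{v} \cdot \nabla_{\mathbf{x}} f
  + \frac{\mathbf{F}}{m} \cdot \nabla_{\mathbf{v}} f
  = \left( \frac{\partial f}{\partial t} \right)_{\mathrm{coll}}
```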

The Nonlinear Library
LW - Why Are Maximum Entropy Distributions So Ubiquitous? by johnswentworth

The Nonlinear Library

Play Episode Listen Later Apr 6, 2023 15:56


Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Why Are Maximum Entropy Distributions So Ubiquitous?, published by johnswentworth on April 5, 2023 on LessWrong. If we measure the distribution of particle velocities in a thin gas, we'll find that they're roughly normally distributed. Specifically, the probability density of velocity $v$ will be proportional to $e^{-\frac{1}{2}mv^2/(k_B T)}$ - or, written differently, $e^{-E(v)/(k_B T)}$, where $E(v)$ is the kinetic energy of a particle of the gas with velocity $v$, $T$ is temperature, and $k_B$ is Boltzmann's constant. The latter form, $e^{-E/(k_B T)}$, generalizes even beyond thin gasses - indeed, it generalizes even to solids, fluids, and plasmas. It applies to the concentrations of chemical species in equilibrium solutions, or the concentrations of ions around an electrode. It applies to light emitted from hot objects. Roughly speaking, it applies to microscopic states in basically any physical system in thermal equilibrium where quantum effects aren't significant. It's called the Boltzmann distribution; it's a common sub-case of a more general class of relatively-elegant distributions called maximum entropy distributions. Even more generally, maximum entropy distributions show up remarkably often. The normal distribution is another good example: you might think of normal distributions mostly showing up when we add up lots of independent things (thanks to the Central Limit Theorem), but then what about particle velocities in a gas? Sure, there's conceptually lots of little things combining together to produce gas particle velocities, but it's not literally a bunch of numbers adding together; Central Limit Theorem doesn't directly apply. Point is: normal distributions show up surprisingly often, even when we're not adding together lots of numbers. Same story with lots of other maximum entropy distributions - poisson, geometric/exponential, uniform, dirichlet. most of the usual named distributions in a statistical library are either maximum entropy distributions or near relatives. Like the normal distribution, they show up surprisingly often. What's up with that? Why this particular class of distributions? If you have a Bayesian background, there's kind of a puzzle here. Usually we think of probability distributions as epistemic states, descriptions of our own uncertainty. Probabilities live “in the mind”. But here we have a class of distributions which are out there “in the territory”: we look at the energies of individual particles in a gas or plasma or whatever, and find that they have not just any distribution, but a relatively “nice” distribution, something simple. Why? What makes a distribution like that appear, not just in our own models, but out in the territory?
What Exactly Is A Maximum Entropy Distribution?
Before we dive into why maximum entropy distributions are so ubiquitous, let's be explicit about what maximum entropy distributions are. Any (finite) probability distribution has some information-theoretic entropy, the “amount of information” conveyed by a sample from the distribution, given by Shannon's formula: $-\sum_i p_i \log(p_i)$. As the name suggests, a maximum entropy distribution is the distribution with the highest entropy, subject to some constraints. Different constraints yield different maximum entropy distributions.
Conceptually: if a distribution has maximum entropy, then we gain the largest possible amount of information by observing a sample from the distribution. On the flip side, that means we know as little as possible about the sample before observing it. Maximum entropy = maximum uncertainty. With that in mind, you can probably guess one maximum entropy distribution: what's the maximum entropy distribution over a finite number of outcomes (e.g. heads/tails, or 1/2/3/4/5/6), without any additional constraints? (Think about that for a moment if you want.) Intuitively, the “most unce...
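The question posed at the end of the excerpt above (which distribution over a finite set of outcomes maximizes entropy when the only constraint is normalization) can be checked numerically: the answer is the uniform distribution. The sketch below maximizes Shannon entropy over six outcomes; the use of scipy and the six-outcome example are illustrative choices, not taken from the post. Adding a mean-value constraint instead would recover a Boltzmann-type exponential distribution.

```python
# Numerically find the maximum entropy distribution over 6 outcomes
# (no constraints other than probabilities summing to 1).
# The optimizer should recover the uniform distribution p_i = 1/6.
import numpy as np
from scipy.optimize import minimize

n = 6

def neg_entropy(p):
    p = np.clip(p, 1e-12, 1.0)
    return np.sum(p * np.log(p))  # negative of Shannon entropy

constraints = [{"type": "eq", "fun": lambda p: np.sum(p) - 1.0}]
bounds = [(0.0, 1.0)] * n
p0 = np.random.default_rng(0).dirichlet(np.ones(n))  # arbitrary starting point

result = minimize(neg_entropy, p0, bounds=bounds, constraints=constraints)
print(result.x.round(4))           # ~[0.1667, 0.1667, ...]
print(-result.fun, np.log(n))      # achieved entropy vs. theoretical maximum log(6)
```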

Why This Universe?
47 - Boltzmann Brains: Could Reality Be An Illusion? (Rebroadcast)

Why This Universe?

Play Episode Listen Later Mar 27, 2023 28:02


A whirlwind of questions brings us to an odd conclusion: could reality be an illusion created by a quantum fluctuation of a brain? This is a rebroadcast of a past favorite. For ad-free episodes and other exclusives, join us for $3 a month on Patreon: https://patreon.com/whythisuniverse Support the show

PJScast
March 17, 2023 - Boltzmann Brain

PJScast

Play Episode Listen Later Mar 18, 2023 113:26


Promo code: THPN Gambling Problem? Call (800) 327-5050 or visit gamblinghelplinema.org (MA), Call 877- 8-HOPENY/text HOPENY (467369) (NY), If you or someone you know has a gambling problem, crisis counseling and referral services can be accessed by calling 1-800-GAMBLER (1-800-426-2537) (CO/IL/IN/LA/MD/MI/NJ/OH/PA/TN/WV/WY), 1-800-NEXT STEP (AZ), 1-800-522-4700 (KS/NH), 888-789-7777/visit ccpg.org (CT), 1-800-BETS OFF (IA), visit OPGR.org (OR), or 1-888-532-3500 (VA). 21+ (18+ NH/WY). Physically present in AZ/CO/CT/IL/IN/IA/KS/LA(select parishes)/MA/MD/MI /NH /NJ/ NY/OH/OR/PA/TN/VA/WV/WY only. VOID IN ONT. Eligibility restrictions apply. Bonus bets (void in NH/OR): Valid 1 per new customer. Min. $5 deposit. Min $5 bet. Promo code req. $200 issued as eight (8) $25 bonus bets. Bonus bets are non-cashable and cannot be withdrawn. Bonus bets must be wagered 1x and stake is not included in any returns or winnings. Bonus Bets expire 7 days (168 hours) after being awarded. Promotional offer period ends 3/19/23 at 11:59 PM ET. See terms at sportsbook.draftkings.com/basketballterms No Sweat Bet (Void in OR): Valid 1 offer per customer. Opt in req. Valid only on college basketball bets 3/13/23 - 3/19/23. First bet after opting in must lose. Paid as one (1) bonus bet based on amount of initial losing bet. Max $10 bonus bet awarded. Bonus bets expire 7 days (168 hours) after being awarded. See termsat sportsbook.draftkings.com/basketballterms. Learn more about your ad choices. Visit megaphone.fm/adchoices

PJScast
March 17, 2023 - Boltzmann Brain

PJScast

Play Episode Listen Later Mar 18, 2023 112:41


Blackhawks somehow beat the historically good Bruins 6-3 and Alex Stalock is unbelievably good (7:35), Jordan Binnington is a giant piss baby (29:50), the NHL sucks at marketing (50:40), the Wizard of Oz (57:40), Edwin Diaz's injury (1:17:10), and Twitter questions (1:34:25).Promo code: THPNGambling Problem? Call (800) 327-5050 or visit gamblinghelplinema.org (MA), Call 877- 8-HOPENY/text HOPENY (467369) (NY), If you or someone you know has a gambling problem, crisis counseling and referral services can be accessed by calling 1-800-GAMBLER (1-800-426-2537) (CO/IL/IN/LA/MD/MI/NJ/OH/PA/TN/WV/WY), 1-800-NEXT STEP (AZ), 1-800-522-4700 (KS/NH), 888-789-7777/visit ccpg.org (CT), 1-800-BETS OFF (IA), visit OPGR.org (OR), or 1-888-532-3500 (VA).21+ (18+ NH/WY). Physically present in AZ/CO/CT/IL/IN/IA/KS/LA(select parishes)/MA/MD/MI /NH /NJ/ NY/OH/OR/PA/TN/VA/WV/WY only. VOID IN ONT. Eligibility restrictions apply. Bonus bets (void in NH/OR): Valid 1 per new customer. Min. $5 deposit. Min $5 bet. Promo code req. $200 issued as eight (8) $25 bonus bets. Bonus bets are non-cashable and cannot be withdrawn. Bonus bets must be wagered 1x and stake is not included in any returns or winnings. Bonus Bets expire 7 days (168 hours) after being awarded. Promotional offer period ends 3/19/23 at 11:59 PM ET. See terms at sportsbook.draftkings.com/basketballtermsNo Sweat Bet (Void in OR): Valid 1 offer per customer. Opt in req. Valid only on college basketball bets 3/13/23 - 3/19/23. First bet after opting in must lose. Paid as one (1) bonus bet based on amount of initial losing bet. Max $10 bonus bet awarded. Bonus bets expire 7 days (168 hours) after being awarded. See termsat sportsbook.draftkings.com/basketballterms.

Robinson's Podcast
58 - Huw Price: Philosophy of Time, Boltzmann Brains, and Retrocausality

Robinson's Podcast

Play Episode Listen Later Mar 4, 2023 62:29


Huw Price is the former Bertrand Russell Professor in the Faculty of Philosophy at the University of Cambridge, and was before that Challis Professor of Philosophy and Director of the Centre for Time at the University of Sydney, and then—even before that—was Professor of Logic and Metaphysics at the University of Edinburgh. Huw is an expert across a wide variety of subdomains within the family of philosophy of science and physics, and in this episode he and Robinson discuss topics drawn from the philosophy of time, ranging from its flow and direction to its relationship to causation and quantum mechanics. Huw is also the author of Naturalism Without Mirrors and Time's Arrow and Archimedes' Point: New Directions for the Physics of Time. You can keep up with Huw on his website, prce.hu, and via his Twitter account, @HuwPriceAU. OUTLINE 00:00 Introduction 2:22 Huw's Background 4:23 The A- and B-Series of Time 12:57 The Flow of Time 25:49 Boltzmann Brains 33:30 The Arrow of Time 38:23 The Fixed Past and The Open Future 50:31 Quantum Mechanics and Retrocausality Robinson's Website: http://robinsonerhardt.com Robinson Erhardt researches symbolic logic and the foundations of mathematics at Stanford University. Join him in conversations with philosophers, scientists, weightlifters, artists, and everyone in-between. --- Support this podcast: https://podcasters.spotify.com/pod/show/robinson-erhardt/support

The Nonlinear Library
AF - Parametrically retargetable decision-makers tend to seek power by Alex Turner

The Nonlinear Library

Play Episode Listen Later Feb 18, 2023 4:16


Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Parametrically retargetable decision-makers tend to seek power, published by Alex Turner on February 18, 2023 on The AI Alignment Forum. This paper—accepted as a poster to NeurIPS 2022—is the sequel to Optimal Policies Tend to Seek Power. The new theoretical results are extremely broad, discarding the requirements of full observability, optimal policies, or even requiring a finite number of options. Abstract: If capable AI agents are generally incentivized to seek power in service of the objectives we specify for them, then these systems will pose enormous risks, in addition to enormous benefits. In fully observable environments, most reward functions have an optimal policy which seeks power by keeping options open and staying alive. However, the real world is neither fully observable, nor must trained agents be even approximately reward-optimal. We consider a range of models of AI decision-making, from optimal, to random, to choices informed by learning and interacting with an environment. We discover that many decision-making functions are retargetable, and that retargetability is sufficient to cause power-seeking tendencies. Our functional criterion is simple and broad. We show that a range of qualitatively dissimilar decision-making procedures incentivize agents to seek power. We demonstrate the flexibility of our results by reasoning about learned policy incentives in Montezuma's Revenge. These results suggest a safety risk: Eventually, retargetable training procedures may train real-world agents which seek power over humans.
Examples of agent designs the power-seeking theorems now apply to:
Boltzmann-rational agents,
Expected utility maximizers and minimizers (even if they uniformly randomly sample a few plans and then choose the best sampled),
Satisficers (as I formalized them),
Quantilizing with a uniform prior over plans, and
RL-trained agents under certain modeling assumptions.
The key insight is that the original results hinge not on optimality per se, but on the retargetability of the policy-generation process via a reward or utility function or some other parameter. See Satisficers Tend To Seek Power: Instrumental Convergence Via Retargetability for intuitions and illustrations. Why am I only now posting this? First, I've been way more excited about shard theory. I still think these theorems are really cool, though. Second, I think the results in this paper are informative about the default incentives for decision-makers which "care about things." IE, make decisions on the basis of e.g. how many diamonds that decision leads to, or how many paperclips, and so on. However, I think that conventional accounts and worries around "utility maximization" are subtly misguided. Whenever I imagined posting this paper, I felt like "ugh sharing this result will just make it worse." I'm not looking to litigate that concern right now, but I do want to flag it. Third, Optimal Policies Tend to Seek Power makes the "reward is the optimization target" mistake super strongly. Parametrically retargetable decision-makers tend to seek power makes the mistake less hard, both because it discusses utility functions and learned policies instead of optimal policies, and also thanks to edits I've made since realizing my optimization-target mistake.
Conclusion
This paper isolates the key mechanism—retargetability—which enables the results in Optimal Policies Tend to Seek Power. This paper also takes healthy steps away from the optimal policy regime (which I consider to be a red herring for alignment) and lays out a bunch of theory I found—and still find—beautiful. This paper is both published in a top-tier conference and, unlike the previous paper, actually has a shot of being applicable to realistic agents and training processes. Therefore, compared to the original optimal policy pa...
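The first family in the list above, Boltzmann-rational agents, has a simple concrete form: the agent picks each action with probability proportional to the exponentiated utility of that action, controlled by a rationality parameter beta. The sketch below is an illustrative toy, not code from the paper; the utility values and beta settings are made-up numbers.

```python
# Toy Boltzmann-rational decision-maker: P(action) proportional to exp(beta * U(action)).
# Utilities and the rationality parameter beta are made-up illustrative values.
import numpy as np

def boltzmann_policy(utilities, beta):
    """Return action probabilities proportional to exp(beta * utility)."""
    logits = beta * np.asarray(utilities, dtype=float)
    logits -= logits.max()          # subtract the max for numerical stability
    weights = np.exp(logits)
    return weights / weights.sum()

utilities = [1.0, 2.0, 0.5]         # e.g. three candidate plans

for beta in (0.0, 1.0, 10.0):
    print(beta, boltzmann_policy(utilities, beta).round(3))
# beta = 0 gives a uniformly random agent; large beta approaches an
# expected-utility maximizer, two other entries in the list above.
```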

The Nonlinear Library
LW - Parametrically retargetable decision-makers tend to seek power by TurnTrout

The Nonlinear Library

Play Episode Listen Later Feb 18, 2023 4:16


Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Parametrically retargetable decision-makers tend to seek power, published by TurnTrout on February 18, 2023 on LessWrong. This paper—accepted as a poster to NeurIPS 2022—is the sequel to Optimal Policies Tend to Seek Power. The new theoretical results are extremely broad, discarding the requirements of full observability, optimal policies, or even requiring a finite number of options. Abstract: If capable AI agents are generally incentivized to seek power in service of the objectives we specify for them, then these systems will pose enormous risks, in addition to enormous benefits. In fully observable environments, most reward functions have an optimal policy which seeks power by keeping options open and staying alive. However, the real world is neither fully observable, nor must trained agents be even approximately reward-optimal. We consider a range of models of AI decision-making, from optimal, to random, to choices informed by learning and interacting with an environment. We discover that many decision-making functions are retargetable, and that retargetability is sufficient to cause power-seeking tendencies. Our functional criterion is simple and broad. We show that a range of qualitatively dissimilar decision-making procedures incentivize agents to seek power. We demonstrate the flexibility of our results by reasoning about learned policy incentives in Montezuma's Revenge. These results suggest a safety risk: Eventually, retargetable training procedures may train real-world agents which seek power over humans.
Examples of agent designs the power-seeking theorems now apply to:
Boltzmann-rational agents,
Expected utility maximizers and minimizers (even if they uniformly randomly sample a few plans and then choose the best sampled),
Satisficers (as I formalized them),
Quantilizing with a uniform prior over plans, and
RL-trained agents under certain modeling assumptions.
The key insight is that the original results hinge not on optimality per se, but on the retargetability of the policy-generation process via a reward or utility function or some other parameter. See Satisficers Tend To Seek Power: Instrumental Convergence Via Retargetability for intuitions and illustrations. Why am I only now posting this? First, I've been way more excited about shard theory. I still think these theorems are really cool, though. Second, I think the results in this paper are informative about the default incentives for decision-makers which "care about things." IE, make decisions on the basis of e.g. how many diamonds that decision leads to, or how many paperclips, and so on. However, I think that conventional accounts and worries around "utility maximization" are subtly misguided. Whenever I imagined posting this paper, I felt like "ugh sharing this result will just make it worse." I'm not looking to litigate that concern right now, but I do want to flag it. Third, Optimal Policies Tend to Seek Power makes the "reward is the optimization target" mistake super strongly. Parametrically retargetable decision-makers tend to seek power makes the mistake less hard, both because it discusses utility functions and learned policies instead of optimal policies, and also thanks to edits I've made since realizing my optimization-target mistake.
Conclusion
This paper isolates the key mechanism—retargetability—which enables the results in Optimal Policies Tend to Seek Power. This paper also takes healthy steps away from the optimal policy regime (which I consider to be a red herring for alignment) and lays out a bunch of theory I found—and still find—beautiful. This paper is both published in a top-tier conference and, unlike the previous paper, actually has a shot of being applicable to realistic agents and training processes. Therefore, compared to the original optimal policy paper, I think th...

You Know What I Would Do
Episode 31: Teeth, Farmers Markets, Boltzmann Brain Theory, Plays, Paper Cuts

You Know What I Would Do

Play Episode Listen Later Jan 11, 2023 77:10


Scott Horton Show - Just the Interviews
10/7/22 Boltzmann Booty on the CIA Asset who Funded the OKC Bombing

Scott Horton Show - Just the Interviews

Play Episode Listen Later Oct 16, 2022 49:21


Scott interviews Boltzmann Booty, a writer who recently wrote an article about possible CIA ties to the Oklahoma City bombing of 1995. It's well known that Timothy McVeigh and accomplice Terry Nichols used money stolen from a man named Roger Moore to help fund the bombing. Later on, some research suggested that the robbery of Roger Moore was fake — that Moore used the theft to help fund the attack while keeping his hands clean. In this interview, Booty lays out some of the evidence that may suggest Moore himself was working with the government, specifically the CIA.  Discussed on the show: “The CIA Asset That Funded The Oklahoma City Bombing” (Substack) Oklahoma City Bombing Archive - Libertarian Institute Oklahoma City by Andrew Gumbel and Roger G. Charles The Oklahoma City Bombing and the Politics of Terror by David Hoffman Boltzmann Booty is a writer who focuses on covert intelligence operations and false flag terrorism. His work can be found at his Substack. This episode of the Scott Horton Show is sponsored by: The War State and Why The Vietnam War?, by Mike Swanson; Tom Woods' Liberty Classroom; ExpandDesigns.com/Scott; and Thc Hemp Spot. Shop Libertarian Institute merch or donate to the show through Patreon, PayPal or Bitcoin: 1DZBZNJrxUhQhEzgDh7k8JXHXRjYu5tZiG. Learn more about your ad choices. Visit megaphone.fm/adchoices

The Libertarian Institute - All Podcasts
10/7/22 Boltzmann Booty on the CIA Asset who Funded the OKC Bombing

The Libertarian Institute - All Podcasts

Play Episode Listen Later Oct 16, 2022 49:07


 Download Episode. Scott interviews Boltzmann Booty, a writer who recently wrote an article about possible CIA ties to the Oklahoma City bombing of 1995. It's well known that Timothy McVeigh and accomplice Terry Nichols used money stolen from a man named Roger Moore to help fund the bombing. Later on, some research suggested that the robbery of Roger Moore was fake — that Moore used the theft to help fund the attack while keeping his hands clean. In this interview, Booty lays out some of the evidence that may suggest Moore himself was working with the government, specifically the CIA.  Discussed on the show: “The CIA Asset That Funded The Oklahoma City Bombing” (Substack) Oklahoma City Bombing Archive - Libertarian Institute Oklahoma City by Andrew Gumbel and Roger G. Charles The Oklahoma City Bombing and the Politics of Terror by David Hoffman Boltzmann Booty is a writer who focuses on covert intelligence operations and false flag terrorism. His work can be found at his Substack. This episode of the Scott Horton Show is sponsored by: The War State and Why The Vietnam War?, by Mike Swanson; Tom Woods' Liberty Classroom; ExpandDesigns.com/Scott; and Thc Hemp Spot. Shop Libertarian Institute merch or donate to the show through Patreon, PayPal or Bitcoin: 1DZBZNJrxUhQhEzgDh7k8JXHXRjYu5tZiG.