Priestess of the Temple of Apollo at Delphi
POPULARITY
Click the link to listen to "Henry frågar ChatGPT" and "Henry läser dagens historia": https://linktr.ee/henryspoddar The Oracle of Delphi is one of the best-known religious institutions of ancient Greece. The oracle was a priestess named Pythia, who delivered prophecies under divine inspiration from Apollo. But how was an oracle chosen? What did these women actually do? And how did they receive answers from their god? Wikipedia has its say on the Oracle of Delphi. Hosted on Acast. See acast.com/privacy for more information.
A real mixed bag this week with new tracks on this week's DarkCompass, from Our North, Pridian, Pythia and more … Violent Possession – DarkCompass 9th May 2025
"HKBU20" koduyla, tüm randevu paketlerinde geçerli %10'luk indirimden yararlanmak için https://doctorontheline.com adresini ziyaret edebilirsiniz
Welcome to episode 97 (season 5, episode 3) of High Tales of History! We knew you'd be here because we saw it happen in our scrying mirror. We have our friend Saundra in the Smoke Circle with us for a two part episode on divination through history. Saundra is a psychic medium and intuitive tarot card reader and we have a blast all hanging out and taking a long trip back to the many divination practices of ancient civilizations. In this two part series, we will travel the Silk Road from east to west, stopping at various civilizations along the way and finishing in Ancient Rome. In part two, we will be picking up again in the Elizabethan Age, visiting those wild Victorians in their Spiritualism Era, and bringing it up through to the New Age Movement and today. Along the way, we will be meeting famous divinators, learning about tarot's evolution from card game to fortune telling, and get a reading from our guest, Saundra!
~~~~~~~* Check Out What Our Guest, Saundra, is Doing! www.saundrainsagittarius.com TikTok: @saundra.in.sagittarius Instagram: @saundra.in.sag YouTube: @saundra.in.sagittarius
~~~~~~* The Socials and Patreon! Patreon -- The Best Buds Club! Instagram - @HighTalesofHistory TikTok - @HighTalesofHistoryPod YouTube -- @High Tales of History Facebook - High Tales of History or @HighTalesofHistory Email: hightailingthroughhistory@gmail.com
~~~~*~ Source Materials --
https://www.oxfordbibliographies.com/display/document/obo-9780195393361/obo-9780195393361-0287.xml#:~:text=Divination%20is%20a%20universal%20phenomenon,unpublished%20even%20in%20the%202020s
https://www.jstor.org/stable/2347094?read-now=1&seq=1
https://daily.jstor.org/how-to-read-bones-like-a-scapulimancer/
https://en.wikipedia.org/wiki/Chinese_astrology
https://bmcr.brynmawr.edu/2005/2005.06.29/#:~:text=The%20liver%20diviners%20and%20celestial%20diviners%20appear,as%20to%20confirm%20or%20refute%20medical%20advice.&text=The%20latest%20known%20Babylonian%20horoscope%2C%20BH%2027,the%20Greek%20tradition%20by%20just%20seven%20years
https://www.academia.edu/44688407/Geomancy_in_the_Islamic_World
https://oxfordre.com/planetaryscience/display/10.1093/acrefore/9780190647926.001.0001/acrefore-9780190647926-e-46#:~:text=The%20relationship%20between%20planets%20and,more%20often%20than%20direct%20observation
https://www.worldhistory.org/Pythia/#:~:text=There%2C%20at%20the%20temple%20center,sacrifice%20of%20a%20black%20ram.&text=It%20is%20a%20Hellenic%20tradition,accordance%20with%20our%20editorial%20policy
~~~~* Intro/outro music: "Loopster" by Kevin MacLeod (incompetech.com) Licensed under Creative Commons: By Attribution 3.0 License http://creativecommons.org/licenses/by/3.0/
Yaramaurd and Pythia discuss the cultures, practices, and cosmology of the Senoi Temiar people of Malaysia and their use of ritual and its correlations with theatre. After consideration of techniques we could bring into our own practices, Yara talks about methods of herbal tincture making and Pythia brings attention to the Aquilaria or lign-aloe tree and sustainability. Cited Sources:Abdullah, Muhammad Fuad, et al. “TRADITIONAL KNOWLEDGE and the USES of NATURAL RESOURCES by the RESETTLEMENT of INDIGENOUS PEOPLE in MALAYSIA.” Journal of Southeast Asian Studies, vol. 25, no. 1, 20 June 2020, pp. 168–190, https://doi.org/10.22452/jati.vol25no1.9.Benjamin, Geoffrey. “Austroasiatic Subgroupings and Prehistory in the Malay Peninsula.” Oceanic Linguistics Special Publications, no. 13, 1976, pp. 37–128. JSTOR, http://www.jstor.org/stable/20019154.Ch, Russell Maeth. “G. William Domhoff. The Mystique of Dreams ; a Search for Utopia through Senoi Dream Theory. Berkeley, Calif. : University of California Press, 1985. X, 146 P.” Estudios de Asia Y África, vol. 21, no. 2, 1 Apr. 1986, pp. 354–356.Cole, Fay-Cooper. The Peoples of Malaysia. 1945.Domhoff, G William. “Senoi, Kilton Stewart and the Mystique of Dreams: Further Thoughts on an Allegory about an Allegory.” Lucidity Letter, vol. 10, 1 Jan. 1991. Accessed 27 Jan. 2025.Fix, Alan G. The Demography of the Semai Senoi. U OF M MUSEUM ANTHRO ARCHAEOLOGY, 1 Jan. 1977.G William Domhoff. The Mystique of Dreams : A Search for Utopia through Senoi Dream Theory. Berkeley, University Of California Press, 1985.Jennings, Sue. Theatre, Ritual and Transformation. Routledge, 20 Dec. 2018.Masron, T. & Masami, F. & Ismail, Norhasimah. (2013). Orang Asli in Peninsular Malaysia: population, spatial distribution and socio-economic condition. J. Ritsumeikan Soc. Sci. Hum.. 6. 75-115.Noone, H. D. “Report on the Settlements and Welfare of the Ple-Temiar Senoi of the Perak-Kelantan Watershed.” Journal of the Federated Malay States Museums. 1936.Saputra, Riza & Khotimah, Husnul. (2021). BRIDGING TO ANOTHER DIMENSION: THE RELATIONAL SYSTEM OF SHAMANISM AND RELIGIOUS ENCOUNTER AMONGST THE TEMIAR SENOI OF MALAYA. Jurnal Ilmiah Ilmu Ushuluddin. 20. 72. 10.18592/jiiu.v20i1.5051.Thambiah, Shanthi, et al. “Reclaiming the Eclipsed Female in the Sacred.” Bijdragen Tot de Taal-, Land- En Volkenkunde / Journal of the Humanities and Social Sciences of Southeast Asia, vol. 174, no. 2-3, 1 Jan. 2018, pp. 264–290, https://doi.org/10.1163/22134379-17402002.Toshihiro Nobuta. Living on the Periphery. Trans Pacific Press, 2008.
Episode: 2822 Herodotus describes the historical events of the 5th century BC, and the fantastical, entertaining side of The Histories. Today, we visit Herodotus.
Around 3,000 years ago, the greatest site of prophecy in the Greek world arose on the slopes of Parnassus. At its center stands the mysterious sanctuary of the god Apollo. Here, too, dwells the oracle itself: a woman from Delphi who, in a trance, proclaims the voice of the god. The influence of this oracle soon grows so great that international envoys, kings, representatives of many dynasties, and ordinary citizens alike ask for a prophecy before their most important decisions. Soon the Oracle of Delphi is shaping the fates of the Greek world, from ordinary citizens to entire kingdoms.......The episode image shows the oracle, i.e. the Pythia, and priests in the sanctuary of Apollo during a prophecy........ADVERTISING: ExpressVPN, a fast and secure VPN! Save an exclusive 61% on the 2-year plan and get 4 months free! https://ExpressVPN.com/His2Go Want to secure the discounts of our other advertising partners? Here are the offers!.......Support His2Go now for great perks, via Steady! Click here and become a His2Go Hero or His2Go Legend.......LITERATURE: Scott, Michael: Delphi: a history of the center of the Ancient world, Princeton; Oxford 2014. Nesselrath, Heinz-Günther; Bäbler, Balbina: Delphi: Apollons Orakel in der Welt der Antike, Tübingen 2021.......COPYRIGHT: Music from https://filmmusic.io: "Sneaky Snitch" by Kevin MacLeod and "Plain Loafer" by Kevin MacLeod (https://incompetech.com) License: CC BY....... New! Support His2Go here, help choose topics, and experience Quiz2Go with host Chiara! https://plus.acast.com/s/his2go-geschichte-podcast. Hosted on Acast. See acast.com/privacy for more information.
In our first episode of 2025, hosts Liam and Johannah kick off the year with an exciting conversation about innovation, challenges, and growth. Joining them are Jeremy Bormann, CEO, and Raymon Ceasar, Head of R&D at Legal Pythia, a cutting-edge startup leveraging AI to detect fraud in documents and datasets. This insightful discussion dives into: the diverse, international team behind Legal Pythia and how their unique mix of cultures fosters innovative problem-solving; navigating compliance challenges and preparing for the EU's upcoming AI regulations; the journey of pivoting as a startup to meet customer needs and refining the value proposition; communicating effectively with customers to demonstrate the impact of their solution; insights on securing government grant funding and the critical role of product testing and user interface design; and overcoming the talent challenge, i.e. how startups can attract top-tier talent while competing with established tech giants. Plus, we hear the team's advice for new startups and close with a heartwarming message about the friends you meet along the journey!
Thank you for your patience, beautiful audience! After a short holiday break, we return to our coverage of Canada 5 with a design challenge, and unfortunately expectations are dropping. The queens face a new sewing challenge: to create an outfit that could be sold in a mall, plus a look that works in "slow motion", but it seems we have no fashion queen this season. An episode packed with drama, a werkroom conversation about the criticism aimed at diverse bodies, a new Rêve lip sync, and the arguments for the queens to be saved by the golden beaver leave us ON FIRE. Which outfit was your favorite? When is Pythia coming back to save the runways? Whom would you have saved from the lip sync with the golden beaver?
Welcome back to the 209th episode of The Cup, our weekly (give or take, TBD, these are unprecedented times) performing arts talk show presented by Cup of Hemlock Theatre. With theatres on a comeback, we offer a mix of reviews of live shows we've seen and continued reviews of prophet productions! For our 209th episode we bring you a Duet Review of Oraculum, created and performed by Denim and Pythia, directed by ted witzel, and presented by Buddies in Bad Times Theatre. Join Mackenzie Horner and Ryan Borochovitz as they discuss the aesthetics of drag performance, the legitimacy of tarot readings, and anxieties about the future. Oraculum is playing at Buddies in Bad Times Theatre (12 Alexander Street, Toronto, ON) until December 14th, 2024. Tickets can be purchased from the following link: https://buddiesinbadtimes.com/show/oraculum/ This review contains many SPOILERS for Oraculum. It will begin with a general non-spoiler review until the [14:21] mark, followed by a more in-depth/anything-goes/spoiler-rich discussion. If you intend to see the production, we recommend you stop watching after that point, or at least proceed at your own risk. Follow our panelists: Mackenzie Horner (Before the Downbeat: A Musical Podcast) – Instagram/Facebook: BeforetheDownbeat Apple Podcasts: https://apple.co/3aYbBeN Spotify: https://spoti.fi/3sAbjAu Ryan Borochovitz – [Just send all that love to CoH instead; he won't mind!]; if you enjoy his theatre thoughts, more can be found at https://www.intermissionmagazine.ca/author/ryan-borochovitz/ & https://nextmag.ca/search/borochovitz Follow Cup of Hemlock Theatre on Instagram/Facebook/Twitter: @cohtheatre If you'd like us to review your upcoming show in Toronto, please send press invites/inquiries to coh.theatre.MM@gmail.com --- Support this podcast: https://podcasters.spotify.com/pod/show/cup-of-hemlock-theatre/support
In these three new historical mysteries, I tell you about the Pythia, the High Priestess of Apollo at Delphi, who for more than a millennium delivered sought-after prophecies in a state of frenzy - but the Oracle may be more ancient than Classical Greece. The second story is about the myth of changelings, that is to say newborns or young children said to have been exchanged by fairies, or by trolls in the Scandinavian version. Why this belief, and how did it appear? I also added a touch of true crime with the story of Bridget Cleary, a woman murdered in 1895 because she was believed to be a changeling. The third story takes us from Russia and Germany to the USA in the 20th century: who were all these women claiming to be Grand Duchess Anastasia, the youngest daughter of the last Tsar of Russia, miraculously alive? This modern mystery found an answer thanks to DNA; I tell you how, along with a little about Eugenia Smith and Anna Anderson. Welcome to Lights Out Library. Join me for a sleepy adventure tonight. Sit back, relax, and fall asleep to documentary-style stories read in a calming voice. Learn something new while you enjoy a restful night of sleep. Listen ad free and get access to bonus content on our Patreon: https://www.patreon.com/LightsOutLibrary621 Listen on Youtube: https://www.youtube.com/@LightsOutLibraryov Want to listen in Spanish? Check out La Biblioteca de los Sueños! On Spotify: https://open.spotify.com/show/1t522alsv5RxFsAf9AmYfg On Apple Podcasts: https://podcasts.apple.com/us/podcast/la-biblioteca-de-los-sue%C3%B1os-documentarios-para-dormir/id1715193755 On Youtube: https://www.youtube.com/@LaBibliotecadelosSuenosov
In this episode, our host Gaby Azorsky speaks with Ariella Daly. This is part two of a two-part series! Ariella is a beekeeper and teacher. She is a voice in the wave of bee people devoted to the preservation of the bee and pollinators in their own sovereignty. She is here because she believes in the wild, divine power of nature as both teacher and healer. For the last 14 years, she has been a devoted student of the bee, bee animism, womb wisdom, and the history and folklore around bees and bee priestesses of Ancient Greece. Ariella's approach to beekeeping falls under the labels of natural and bee-centric beekeeping. She is primarily interested in the relationship between beekeeper and bees, what we can learn from the hive as a colony and what we can offer to the bees, both in a single hive and as a species. From Ariella's Instagram: Hymenoptera. The veil winged. She who carries souls between the worlds. Descended from stars. Born of bull. Born of lion. Melissae: the bee nymph and Mistress to Python. The bond between bee and woman goes as far back as recorded history. She, of the in-between places. When the Melissae nymphs rescued infant Zeus from his all-consuming father, they hid him away in a cave and raised him on sweet ambrosia: honey and milk. When bees came to earth they arrived in a long comet of stars. Seven of them remain to this day in the heavens. The seven Bees. The seven sisters. The Pleiades. When the tiny cluster of stars rises in the spring, honey bees cover the earth and bounty returns to the land. They say the first brewer was a woman, and she brewed mead from her bees, who told her the secrets of their intoxicating elixir. The great Greek prophetess was of the bee. Most revered oracles of the west, their era spanning centuries. The priestesses of Delphi were called Melissae, meaning bee, and the Pythia, offering prophecy, was called the Delphic Bee. The bee offers the hum of life. The sound of creation. She, lover to flower and sunlight. Beloved companion of darkness. She who dances through the dim golden halls of her honeycomb cathedral. In our conversation today, we talk about magick, animism, imagination, ancient cultures and mythology, natural beekeeping, the hive, how they sound and feel and look, how we listen and relate, swarms, sovereignty, the bees and the flowers, eros, the beloved, the rhythms of beekeeping, and dreams. *For 20% off your first month of The Flower Portal, use the code SPIRALOFFLOWERS through the end of August* Connect ~ With our guest Ariella | Website and IG @beekeepinginskirts With her free lecture and other lectures, Messengers of Love With our host Gaby Azorsky | Website and IG @gaby.azorsky With Spiral Deeper | Website and IG @spiral.deeper Sign up for Gaby's newsletter Partners ~ Thank you to our partners! Moon Juice - Code 'GABY.AZORSKY' Activist Manuka Honey - Code 'GABY15' The Retreat Newspaper - Code 'GABY100' for your first issue free Music by Gaby's incredible partner, Connor Hayes. Spiral Deeper Icon by Kami Marchand. If you would like to advertise on Spiral Deeper, please email gabyazorsky@gmail.com for packages and information. Please rate, review, and subscribe wherever you listen ~ it means so much. Thank you for your support!
In this episode of the RuPaul's Drag Race Recap Show, Joe and Robert discuss the second part of the Global Talent Extravaganza. They review the mini challenge, where Nehellenia wins and Pythia's performance falls flat. They also analyze the talent show performances, including Nehellenia's original lip sync, Vanity Vain's generic pop song, and Tessa's burlesque routine. The hosts share their thoughts on the queens' performances and provide their ratings for each act. In this episode, Robert and Joe discuss the talent show performances and the runway looks of the queens. They share their thoughts on the comedy performance, the original song, and the pole dancing routine. They also discuss the cultural references and humor in the show, particularly related to Mexican and Filipino cultures. The hosts explore the concept of humor in different cultures and the appeal of certain types of performances. They also touch on the themes of family and support within the drag community. Overall, they provide a humorous and insightful analysis of the episode. Voicemail: speakpipe.com/afterthoughtmedia Email: dragracerecap@afterthought.media Twitter: @dragracerecap YouTube: youtube.com/dragracerecap Patreon: patreon.com/afterthoughtmedia Learn more about your ad choices. Visit podcastchoices.com/adchoices
In this episode, our host Gaby Azorsky speaks with Ariella Daly. This is part one of a two-part series! Ariella is a beekeeper and teacher. She is a voice in the wave of bee people devoted to the preservation of the bee and pollinators in their own sovereignty. She is here because she believes in the wild, divine power of nature as both teacher and healer. For the last 14 years, she has been a devoted student of the bee, bee animism, womb wisdom, and the history and folklore around bees and bee priestesses of Ancient Greece. Ariella's approach to beekeeping falls under the labels of natural and bee-centric beekeeping. She is primarily interested in the relationship between beekeeper and bees, what we can learn from the hive as a colony and what we can offer to the bees, both in a single hive and as a species. From Ariella's Instagram: Hymenoptera. The veil winged. She who carries souls between the worlds. Descended from stars. Born of bull. Born of lion. Melissae: the bee nymph and Mistress to Python. The bond between bee and woman goes as far back as recorded history. She, of the in-between places. When the Melissae nymphs rescued infant Zeus from his all-consuming father, they hid him away in a cave and raised him on sweet ambrosia: honey and milk. When bees came to earth they arrived in a long comet of stars. Seven of them remain to this day in the heavens. The seven Bees. The seven sisters. The Pleiades. When the tiny cluster of stars rises in the spring, honey bees cover the earth and bounty returns to the land. They say the first brewer was a woman, and she brewed mead from her bees, who told her the secrets of their intoxicating elixir. The great Greek prophetess was of the bee. Most revered oracles of the west, their era spanning centuries. The priestesses of Delphi were called Melissae, meaning bee, and the Pythia, offering prophecy, was called the Delphic Bee. The bee offers the hum of life. The sound of creation. She, lover to flower and sunlight. Beloved companion of darkness. She who dances through the dim golden halls of her honeycomb cathedral. In our conversation today, we talk about magick, animism, imagination, ancient cultures and mythology, natural beekeeping, the hive, how they sound and feel and look, how we listen and relate, swarms, sovereignty, the bees and the flowers, eros, the beloved, the rhythms of beekeeping, and dreams. *For 20% off your first month of The Flower Portal, use the code SPIRALOFFLOWERS through the end of August* Connect ~ With our guest Ariella | Website and IG @beekeepinginskirts With her free lecture and other lectures, Messengers of Love With our host Gaby Azorsky | Website and IG @gaby.azorsky With Spiral Deeper | Website and IG @spiral.deeper Sign up for Gaby's newsletter Partners ~ Thank you to our partners! Moon Juice - Code 'GABY.AZORSKY' Activist Manuka Honey - Code 'GABY15' The Retreat Newspaper - Code 'GABY100' for your first issue free Music by Gaby's incredible partner, Connor Hayes. Spiral Deeper Icon by Kami Marchand. If you would like to advertise on Spiral Deeper, please email gabyazorsky@gmail.com for packages and information. Please rate, review, and subscribe wherever you listen ~ it means so much. Thank you for your support!
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Stitching SAEs of different sizes, published by Bart Bussmann on July 13, 2024 on The AI Alignment Forum. Work done in Neel Nanda's stream of MATS 6.0, equal contribution by Bart Bussmann and Patrick Leask; Patrick Leask is concurrently a PhD candidate at Durham University. TL;DR: When you scale up an SAE, the features in the larger SAE can be categorized into two groups: 1) "novel features" with new information not in the small SAE and 2) "reconstruction features" that sparsify information that already exists in the small SAE. You can stitch SAEs by adding the novel features to the smaller SAE. Introduction Sparse autoencoders (SAEs) have been shown to recover sparse, monosemantic features from language models. However, there has been limited research into how those features vary with dictionary size, that is, when you take the same activation in the same model and train a wider dictionary on it, what changes? And how do the features learned vary? We show that features in larger SAEs cluster into two kinds of features: those that capture similar information to the smaller SAE (either identical features, or split features; about 65%), and those which capture novel features absent in the smaller model (the remaining 35%). We validate this by showing that inserting the novel features from the larger SAE into the smaller SAE boosts the reconstruction performance, while inserting the similar features makes performance worse. Building on this insight, we show how features from multiple SAEs of different sizes can be combined to create a "Frankenstein" model that outperforms SAEs with an equal number of features, though it tends to lead to a higher L0, making a fair comparison difficult. Our work provides new understanding of how SAE dictionary size impacts the learned feature space, and how to reason about whether to train a wider SAE. We hope that this method may also lead to a practically useful way of training high-performance SAEs with less feature splitting and a wider range of learned novel features. Larger SAEs learn both similar and entirely novel features Set-up We use sparse autoencoders as in Towards Monosemanticity and Sparse Autoencoders Find Highly Interpretable Directions. In our setup, the feature activations are a ReLU applied to an affine (encoder) transform of the input activation. Based on these feature activations, the input is then reconstructed as a linear combination of decoder directions plus a decoder bias. The encoder and decoder matrices and biases are trained with a loss function that combines an L2 penalty on the reconstruction error and an L1 penalty on the feature activations. In our experiments, we train a range of sparse autoencoders (SAEs) with varying widths across residual streams in GPT-2 and Pythia-410m. The width of an SAE is determined by the number of features (F) in the sparse autoencoder. Our smallest SAE on GPT-2 consists of only 768 features, while the largest one has nearly 100,000 features.
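The setup just described maps onto a small amount of code. Below is a minimal sketch of an SAE with that encoder/decoder/loss structure; it is an illustration only, not the authors' code (the class and variable names are assumptions, and details such as whether the decoder bias is subtracted before encoding may differ from their implementation):

```python
# Minimal SAE sketch matching the setup described above (illustrative only).
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseAutoencoder(nn.Module):
    def __init__(self, d_model: int, n_features: int):
        super().__init__()
        self.W_enc = nn.Parameter(torch.randn(d_model, n_features) * 0.01)
        self.b_enc = nn.Parameter(torch.zeros(n_features))
        self.W_dec = nn.Parameter(torch.randn(n_features, d_model) * 0.01)
        self.b_dec = nn.Parameter(torch.zeros(d_model))

    def forward(self, x: torch.Tensor):
        # Feature activations: sparse, non-negative codes for the activation x.
        f = F.relu((x - self.b_dec) @ self.W_enc + self.b_enc)
        # Reconstruction: linear combination of decoder directions plus a bias.
        x_hat = f @ self.W_dec + self.b_dec
        return x_hat, f

def sae_loss(x, x_hat, f, l1_coeff: float = 1e-3):
    # L2 penalty on the reconstruction error plus L1 penalty on the activations.
    return F.mse_loss(x_hat, x) + l1_coeff * f.abs().sum(dim=-1).mean()
```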
Here is the full list of SAEs used in this research:

| Name | Model site | Dictionary size | L0 | MSE | CE loss recovered (zero ablation) | CE loss recovered (mean ablation) |
|---|---|---|---|---|---|---|
| GPT2-768 | gpt2-small layer 8 of 12 resid_pre | 768 | 35.2 | 2.72 | 0.915 | 0.876 |
| GPT2-1536 | gpt2-small layer 8 of 12 resid_pre | 1536 | 39.5 | 2.22 | 0.942 | 0.915 |
| GPT2-3072 | gpt2-small layer 8 of 12 resid_pre | 3072 | 42.4 | 1.89 | 0.955 | 0.937 |
| GPT2-6144 | gpt2-small layer 8 of 12 resid_pre | 6144 | 43.8 | 1.631 | 0.965 | 0.949 |
| GPT2-12288 | gpt2-small layer 8 of 12 resid_pre | 12288 | 43.9 | 1.456 | 0.971 | 0.958 |
| GPT2-24576 | gpt2-small layer 8 of 12 resid_pre | 24576 | 42.9 | 1.331 | 0.975 | 0.963 |
| GPT2-49152 | gpt2-small layer 8 of 12 resid_pre | 49152 | 42.4 | 1.210 | 0.978 | 0.967 |
| GPT2-98304 | gpt2-small layer 8 of 12 resid_pre | 98304 | 43.9 | 1.144 | 0.980 | 0.970 |
| Pythia-8192 | Pythia-410M-deduped layer 3 of 24 resid_pre | 8192 | 51.0 | 0.030 | 0.977 | 0.972 |
| Pythia-16384 | Pythia-410M-deduped layer 3 of 24 resid_pre | 16384 | 43.2 | 0.024 | 0.983 | 0.979 |

The base language models used are those included in Transform...
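The "stitching" operation from the TL;DR above, adding the larger SAE's novel features to the smaller SAE, amounts to concatenating encoder and decoder parameters along the feature dimension. A rough sketch follows, reusing the hypothetical `SparseAutoencoder` class from the earlier snippet and assuming `novel_idx` (the indices of the large SAE's novel features) has already been computed; neither is taken from the post itself:

```python
def stitch_saes(small: "SparseAutoencoder", large: "SparseAutoencoder",
                novel_idx: torch.Tensor) -> "SparseAutoencoder":
    """Append the large SAE's novel features to the small SAE (sketch only)."""
    d_model = small.b_dec.shape[0]
    n_small = small.b_enc.shape[0]
    stitched = SparseAutoencoder(d_model, n_small + novel_idx.numel())
    with torch.no_grad():
        # Keep every feature of the small SAE, then append the novel encoder
        # columns and decoder rows taken from the large SAE.
        stitched.W_enc.copy_(torch.cat([small.W_enc, large.W_enc[:, novel_idx]], dim=1))
        stitched.b_enc.copy_(torch.cat([small.b_enc, large.b_enc[novel_idx]], dim=0))
        stitched.W_dec.copy_(torch.cat([small.W_dec, large.W_dec[novel_idx]], dim=0))
        stitched.b_dec.copy_(small.b_dec)  # simplification: keep the small SAE's bias
    return stitched
```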
This two-part episode received the support of Envol et Matrescence. More information at the end of the episode description! In the first part of this episode, you discovered Marie's story. We had recorded at the end of 2022, a year and a half after the death of her little Lénie, just a few hours after her birth. Today, you are listening to the rest of her journey, recorded in November 2023, when Marie was seven months pregnant. A second pregnancy, after she felt the need to take a break of several months. A second pregnancy, after 18 months of trying. And this second pregnancy, this "pregnancy after", is a well-supported one. Because around her, Marie has a dream team: her parents, her sister, but also caregivers, a doula and a psychologist. Because, in the end, it takes a whole village to support a woman who is carrying life after having lived through the worst. In this episode, Marie looks back on her path before trying to become pregnant again, the wait for the famous positive test, and this pregnancy full of hope.
This two-part episode received the support of Envol et Matrescence. More information at the end of the episode description! In 2020, Marie is sure of it: she is going to have a child. Single for several years, she does not want to wait until there is a man in her life to consider a pregnancy. Being pregnant is what Marie wants more than anything. To get there, she takes side roads, leaves the beaten path, explores uncharted territory. Marie soon discovers she is pregnant and lives her pregnancy surrounded by her parents, her older sister and her friend Candice. And while she has decided to experience this motherhood as a single woman, she can count on the support of her whole tribe, in moments of joy as well as in hard times. Because at the end of the second trimester, at the second ultrasound, which she attends with Candice, Marie learns that her baby is not doing well. When the midwife asks her at the end of the appointment, "Are you well supported?", the rest of the story shows that yes, Marie is well supported. As is Lénie, her little girl, who comes into the world in June 2021 and lives for 5 hours, under the gaze of her mother and of all those who already loved her so much. Today, you can listen to the first part of this episode, made from a recording from the end of 2022. The second part of Marie's story, recorded in November 2023, arrives in a few days!
Let's build season 5 of Au Revoir Podcast together! To support the podcast, go to Ulule.com/Aurevoirpodcast! The silence that surrounds perinatal loss: who has never faced it? The embarrassed silence, the silence that gives the impression that nothing happened. The silence that erases the ordeal you have been through. Dear friends and family, know this: silence adds pain to pain. I know it is hard to face another person's grief, but it is the least you can do in those moments. So in this episode, Yolanda from the Instagram account @parlez_moidelle and I talk about this silence, which is far too common among the people around those experiencing perinatal loss, and we give you a few pointers for offering, precisely, a space to talk. If you are close to someone living through perinatal loss, listen carefully to what follows: this episode, part of a series devoted to misconceptions about perinatal loss, is designed for you. To help you understand what your son, your daughter, your friend, your cousin, your colleague is going through, and to give you keys to supporting them better. Grab paper and a pen to take notes and open your ears wide: the first episode devoted to misconceptions about perinatal loss is about to begin! A big thank you to Yolanda, and also to Anissa and Fanny, who shared their testimonies. I wish you a good listen! You can find the first two episodes: "Ce n'est pas un deuil express !" and "Stop aux hiérarchies !". Au Revoir Podcast is also on Instagram: @aurevoir.podcast. Come find me there so we can talk and, together, lift the veil on perinatal loss. Created and produced by: Sophie de Chivré. The magnificent pieces that illustrate this episode were created by the Caen-based band Portier Dean and are taken from their album "Ancient Majesty": "Intro" / "The Pool" (previously unreleased instrumental version) / "Pythia". © 2018 Portier Dean Productions / Collectif Toujours. Hosted by Acast. Visit acast.com/privacy for more information.
Let's Talk About Myths, Baby! Greek & Roman Mythology Retold
Finishing off the play about the crimes of gods and men. Creusa considers burning Apollo's Oracle to the ground. Help keep LTAMB going by subscribing to Liv's Patreon for bonus content! CW/TW: far too many Greek myths involve assault. Given it's fiction, and typically involves gods and/or monsters, I'm not as deferential as I would be were I referencing the real thing. Sources: Euripides' Ion: translation by Cecelia Eaton Luschnig; introduction to Euripides' Orestes and Other Plays by Edith Hall. Attributions and licensing information for music used in the podcast can be found here: mythsbaby.com/sources-attributions.See omnystudio.com/listener for privacy information.
Rebroadcast of an episode originally published in November 2022. Perinatal loss has a beginning and no end... You keep moving forward day by day; it is a long process that lasts a lifetime. Because you do not forget that baby, the baby you were expecting, whom you loved, with whom you shared so much. When you are struck by the death of your child, when you are thrown headfirst into this ordeal, so hard, so poorly understood and so unjust, people would have you believe that any later pregnancies, that the children who might come along to grow the family, would be like the final full stop of this grief. But no, that is wrong. And my guest, Laura, will not say otherwise. The absence lasts a lifetime. Her first daughter, Louise, will forever be absent, invisible to the eyes of others, but hers is a very particular absence: Louise lives on in another way. In the dates, in that month of February which saw the birth of two of her sisters and of Louise herself. But also in the bond that Laura and her partner weave between each of their daughters. Yes, Louise is no longer here. But she is here in another way. For Laura, she is a lucky star that reveals itself to her and winks at her... Louise is in the photos and drawings, in the memories that line her mother's heart and the walls of her room. And Louise is there for her sisters, Clémence, Salomé and Romy. A big sister they cannot play with, but a big sister they visit regularly, and to whom they offer flowers, candles and drawings. I wish you a good listen, and I warmly thank Laura for taking part! Au Revoir Podcast is also on Instagram: @aurevoir.podcast. Come find me there so we can talk and, together, lift the veil on perinatal loss. WOULD YOU LIKE TO SUPPORT AU REVOIR PODCAST? Go to the crowdfunding platform Tipeee to make a donation and help me in my work: https://fr.tipeee.com/aurevoirpodcast Created, produced, edited and mixed by: Sophie de Chivré. The magnificent pieces that illustrate this episode were created by the Caen-based band Portier Dean and are taken from their album "Ancient Majesty": "Intro" / "The Pool" (previously unreleased instrumental version) / "Pythia". © 2018 Portier Dean Productions / Collectif Toujours. Hosted by Acast. Visit acast.com/privacy for more information.
Yara and Pythia discuss Australian Aboriginal Dreamtime, cacti, and cordage.
The line between life and death is a thin one: Marie knows this well. As a nurse in neonatal intensive care, but also as a mother. She experienced perinatal loss during her second pregnancy. Eleven years ago, Marie was no longer the caregiver looking after other people's babies; she was the mother accompanying her own child. Her little Gabriel, who died in utero following a medical termination of pregnancy. "It is only with the heart that one can see rightly; what is essential is invisible to the eye," wrote Antoine de Saint-Exupéry in The Little Prince. Some time ago, Marie told me how much she loves that quote. And she is right. Because all those babies who are no longer here, Marie sees them. With her mother's heart, with her caregiver's heart. A big thank you, Marie, for your testimony and your trust! I wish you a good listen. Au Revoir Podcast is also on Instagram: @aurevoir.podcast. Come find me there so we can talk and, together, lift the veil on perinatal loss. Created and produced by: Sophie de Chivré. The magnificent pieces that illustrate this episode were created by the Caen-based band Portier Dean and are taken from their album "Ancient Majesty": "Intro" / "The Pool" (previously unreleased instrumental version) / "Pythia". © 2018 Portier Dean Productions / Collectif Toujours. Hosted by Acast. Visit acast.com/privacy for more information.
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Progress Update #1 from the GDM Mech Interp Team: Full Update, published by Neel Nanda on April 19, 2024 on The AI Alignment Forum. This is a series of snippets about the Google DeepMind mechanistic interpretability team's research into Sparse Autoencoders, that didn't meet our bar for a full paper. Please start at the summary post for more context, and a summary of each snippet. They can be read in any order. Activation Steering with SAEs Arthur Conmy, Neel Nanda TL;DR: We use SAEs trained on GPT-2 XL's residual stream to decompose steering vectors into interpretable features. We find a single SAE feature for anger which is a Pareto-improvement over the anger steering vector from existing work (Section 3, 3 minute read). We have more mixed results with wedding steering vectors: we can partially interpret the vectors, but the SAE reconstruction is a slightly worse steering vector, and just taking the obvious features produces a notably worse vector. We can produce a better steering vector by removing SAE features which are irrelevant ( Section 4). This is one of the first examples of SAEs having any success for enabling better control of language models, and we are excited to continue exploring this in future work. 1. Background and Motivation We are uncertain about how useful mechanistic interpretability research, including SAE research, will be for AI safety and alignment. Unlike RLHF and dangerous capability evaluation (for example), mechanistic interpretability is not currently very useful for downstream applications on models. Though there are ambitious goals for mechanistic interpretability research such as finding safety-relevant features in language models using SAEs, these are likely not tractable on the relatively small base models we study in all our snippets. To address these two concerns, we decided to study activation steering[1] (introduced in this blog post and expanded on in a paper). We recommend skimming the blog post for an explanation of the technique and examples of what it can do. Briefly, activation steering takes vector(s) from the residual stream on some prompt(s), and then adds these to the residual stream on a second prompt. This makes outputs from the second forward pass have properties inherited from the first forward pass. There is early evidence that this technique could help with safety-relevant properties of LLMs, such as sycophancy. We have tentative early research results that suggest SAEs are helpful for improving and interpreting steering vectors, albeit with limitations. We find these results particularly exciting as they provide evidence that SAEs can identify causally meaningful intermediate variables in the model, indicating that they aren't just finding clusters in the data or directions in logit space, which seemed much more likely before we did this research. We plan to continue this research to further validate SAEs and to gain more intuition about what features SAEs do and don't learn in practice. 2. Setup We use SAEs trained on the residual stream of GPT-2 XL at various layers, the model used in the initial activation steering blog post, inspired by the success of residual stream SAEs on GPT-2 Small ( Bloom, 2024) and Pythia models ( Cunningham et. al, 2023). The SAEs have 131072 learned features, L0 of around 60[2], and loss recovered around 97.5% (e.g. 
splicing in the SAE from Section 3 increases loss from 2.88 to 3.06, compared to the destructive zero ablation intervention resulting in Loss > 10). We don't think this was a particularly high-quality SAE, as the majority of its learned features were dead, and we found limitations with training residual stream SAEs that we will discuss in an upcoming paper. Even despite this, we think the results in this work are tentative evidence for SAEs being useful. It is likely ea...
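To make the activation-steering procedure described above concrete, here is a sketch using TransformerLens-style hooks. The layer, prompts, and steering coefficient are illustrative assumptions, and taking the vector from a single prompt is a simplification of the blog post's recipe (which, for example, works with differences between paired prompts):

```python
# Sketch of activation steering: take a residual-stream vector from one prompt
# and add it to the residual stream of a second prompt (illustrative values).
import torch
from transformer_lens import HookedTransformer

model = HookedTransformer.from_pretrained("gpt2-xl")
hook_name = "blocks.10.hook_resid_pre"  # assumed layer, for illustration only

# 1) Capture a residual-stream vector on the source prompt.
_, cache = model.run_with_cache(model.to_tokens("Anger"))
steering_vec = cache[hook_name][0, -1]  # vector at the last token position

# 2) Add it (scaled) to the residual stream while running the target prompt.
def add_steering(resid, hook, coeff: float = 5.0):
    return resid + coeff * steering_vec

logits = model.run_with_hooks(
    model.to_tokens("I think you are"),
    fwd_hooks=[(hook_name, add_steering)],
)
```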
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: [Full Post] Progress Update #1 from the GDM Mech Interp Team, published by Neel Nanda on April 19, 2024 on LessWrong. This is a series of snippets about the Google DeepMind mechanistic interpretability team's research into Sparse Autoencoders, that didn't meet our bar for a full paper. Please start at the summary post for more context, and a summary of each snippet. They can be read in any order. Activation Steering with SAEs Arthur Conmy, Neel Nanda TL;DR: We use SAEs trained on GPT-2 XL's residual stream to decompose steering vectors into interpretable features. We find a single SAE feature for anger which is a Pareto-improvement over the anger steering vector from existing work (Section 3, 3 minute read). We have more mixed results with wedding steering vectors: we can partially interpret the vectors, but the SAE reconstruction is a slightly worse steering vector, and just taking the obvious features produces a notably worse vector. We can produce a better steering vector by removing SAE features which are irrelevant ( Section 4). This is one of the first examples of SAEs having any success for enabling better control of language models, and we are excited to continue exploring this in future work. 1. Background and Motivation We are uncertain about how useful mechanistic interpretability research, including SAE research, will be for AI safety and alignment. Unlike RLHF and dangerous capability evaluation (for example), mechanistic interpretability is not currently very useful for downstream applications on models. Though there are ambitious goals for mechanistic interpretability research such as finding safety-relevant features in language models using SAEs, these are likely not tractable on the relatively small base models we study in all our snippets. To address these two concerns, we decided to study activation steering[1] (introduced in this blog post and expanded on in a paper). We recommend skimming the blog post for an explanation of the technique and examples of what it can do. Briefly, activation steering takes vector(s) from the residual stream on some prompt(s), and then adds these to the residual stream on a second prompt. This makes outputs from the second forward pass have properties inherited from the first forward pass. There is early evidence that this technique could help with safety-relevant properties of LLMs, such as sycophancy. We have tentative early research results that suggest SAEs are helpful for improving and interpreting steering vectors, albeit with limitations. We find these results particularly exciting as they provide evidence that SAEs can identify causally meaningful intermediate variables in the model, indicating that they aren't just finding clusters in the data or directions in logit space, which seemed much more likely before we did this research. We plan to continue this research to further validate SAEs and to gain more intuition about what features SAEs do and don't learn in practice. 2. Setup We use SAEs trained on the residual stream of GPT-2 XL at various layers, the model used in the initial activation steering blog post, inspired by the success of residual stream SAEs on GPT-2 Small ( Bloom, 2024) and Pythia models ( Cunningham et. al, 2023). The SAEs have 131072 learned features, L0 of around 60[2], and loss recovered around 97.5% (e.g. 
splicing in the SAE from Section 3 increases loss from 2.88 to 3.06, compared to the destructive zero ablation intervention resulting in Loss > 10). We don't think this was a particularly high-quality SAE, as the majority of its learned features were dead, and we found limitations with training residual stream SAEs that we will discuss in an upcoming paper. Even despite this, we think the results in this work are tentative evidence for SAEs being useful. It is likely easiest to simpl...
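The Section 4 improvement described above, removing irrelevant SAE features from a steering vector, can be sketched as: encode the steering vector with the SAE, keep only the features judged relevant, and decode what remains. The helper below is hypothetical (it assumes an SAE object with `W_dec`/`b_dec` attributes and a forward pass returning reconstruction and feature activations, as in the sketch earlier in this listing), not the team's code:

```python
import torch

def clean_steering_vector(steering_vec: torch.Tensor, sae,
                          keep_idx: torch.Tensor) -> torch.Tensor:
    # Decompose the steering vector into SAE feature activations, keep only the
    # features deemed relevant, and reconstruct the vector from those features.
    _, f = sae(steering_vec)                   # feature activations of the vector
    mask = torch.zeros_like(f)
    mask[..., keep_idx] = 1.0
    return (f * mask) @ sae.W_dec + sae.b_dec  # reconstruction from kept features
```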
April 11, 2006. Julie will remember that date for the rest of her life. There are paths all mapped out, the futures we picture for ourselves: marriage, then children. Julie imagined having three! But her first pregnancy does not go as planned, and every certainty is shattered. Théo, her baby, is seriously ill. Julie and her husband therefore decide to accompany him in utero, through a medical termination of pregnancy. Eighteen years later, Julie looks back on that ordeal and on the turn her life took. In this episode, she describes the emptiness she felt after coming home from the maternity ward, the lack of psychological support during her second pregnancy, and the way she went about rebuilding herself. While pregnant with her third child, she decides to take a different route and makes a professional change of direction: she wants to become a maternity nurse, so she can be there for all women, for all couples, in the happy moments and in the ordeals. The day she makes that decision, there is no doubt that a compass is guiding her on this journey. And that compass bears the name of Théo. Living without your baby makes no sense. And yet sometimes, even though he is not here, he is the one who shows us the way. This episode is for you, Théo. Théo, you who have been guiding your mom since April 11, 2006. I wish you a good listen! A big thank you to Julie for her testimony and her trust. Additional resources: Julie de Troy Lecante wrote the book Ma petite plume. Vivre et surmonter l'interruption médicale de grossesse (Ed. J'ai Lu). In this episode, I mention several associations that can support you through your grief: Petite Emilie (featured in a special episode of Au Revoir Podcast from March 2024), Agapa and Spama. If you are going through maternal difficulties, you can get in touch with the association Maman Blues. Au Revoir Podcast is also on Instagram: @aurevoir.podcast. Come find me there so we can talk and, together, lift the veil on perinatal loss. Created and produced by: Sophie de Chivré. The magnificent pieces that illustrate this episode were created by the Caen-based band Portier Dean and are taken from their album "Ancient Majesty": "Intro" / "The Pool" (previously unreleased instrumental version) / "Pythia". © 2018 Portier Dean Productions / Collectif Toujours. Hosted by Acast. Visit acast.com/privacy for more information.
Introducing Cristina
Cristina's big 3 and how her intuition shows up for her
Cristina shares how she came to astrology
Working with asteroids and looking for the Divine Feminine in the birth chart
Asteroids connected to the witch archetype: Hecate, Circe, Medusa, Sibila, Pythia, Vesta, Persephone
Lilith - the banished wild feminine
Where can we look in our chart for our portal to spiritual connection
Themes and collective lessons for 2024
Tips for choosing the right timing of a new project/endeavor
Something Cristina wants to debunk about astrology (ALT: Debunking the fear mongering around astrology)
How to connect with Cristina
Sitting with Spirit
******* Cristina Farella runs Eighth House Astrology, where she offers both astrology consultations and a series of classes on astrology, myth, and ritual. Her work is guided by the prevailing belief that all things are interconnected, and in both her teaching and astrology readings, she strives to find ways to illuminate those connections for whomever she works with. You can find her on IG and TikTok @eighthhouseastro, and subscribe to her Substack where she shares weekly astro forecasts, a bi-weekly podcast on mythology, and other nice astro writings. You can also hear more from Cristina by tuning into her podcast, Soror Mystica, a podcast exploring life's mysteries through its symbols, co-hosted with tarotist Mariana Louis. https://www.cristinafarella.com https://www.instagram.com/eighthhouseastro/ https://eighthhouseastrology.substack.com https://www.sorormysticapodcast.com She is offering a discount just for this audience for her new class Astrology for Beginners. Use the code ANGEL for $5 off your ticket price. https://www.cristinafarella.com/astrology-for-beginners ******* Connect with Taylor further on Instagram @angels_and_amethyst or on her website www.angelsandamethyst.com. Follow @MagicHourPod on Instagram for more Magic Hour content. If you have any questions about intuition, spirituality, angels, or anything and everything magical, please email contact@magichourpod.com; Taylor will answer one question at the top of each episode. Don't forget to leave us a 5 sparkling star review; they help more people find the pod and remember their magic. Please screenshot and email your 5 star reviews to contact@magichourpod.com and we will send you a free downloadable angelic meditation, and enter you to win an angel reading with Taylor Paige! The first Angel Reading giveaway will happen when we hit 55 5 star reviews. Join the waitlist for a reading with Taylor here: https://angelsandamethyst.com/offerings/ Find Taylor's 3 part workshop series on Angelic Connection, Attracting a Soulmate Connection, and Healing the Witch wound here: https://angelsandamethyst.com/workshops/ Code 333 gives $33 off; plus, each student can email Taylor one question on the subject material per lesson. Join Taylor's email list at angelsandamethyst.com to know when her monthly gatherings of Earth Angel Club are open for registration. Earth Angel Club is a monthly meeting of like-minded and magical people across the world. EAC includes an astrological and energetic overview, a guided meditation attuned to the current zodiac season, and for the highest ticket tier, a mini email angel reading. Each EAC member also has the option to skip the waitlist and sit with Taylor sooner for a reading. Are you an aligned business owner that would like to advertise to our beautiful community of magical people? 
Please email contact@magichourpod.com. Music by Justin Fleuriel and Mandie Cheung. For more of their music, check out @goodnightsband on Instagram.
Introducing Cristina
Cristina's big 3 and how her intuition shows up for her
Cristina shares how she came to astrology
Working with asteroids and looking for the Divine Feminine in the birth chart
Asteroids connected to the witch archetype: Hecate, Circe, Medusa, Sibila, Pythia, Vesta, Persephone
Lilith and the banished wild feminine
Where can we look in our chart for our portal to spiritual connection
Themes and collective lessons for 2024
Tips for choosing the right timing of a new project/endeavor
Something Cristina wants to debunk about astrology (ALT: Debunking the fear mongering around astrology)
How to connect with Cristina
Sitting with Spirit
******* Cristina Farella runs Eighth House Astrology, where she offers both astrology consultations and a series of classes on astrology, myth, and ritual. Her work is guided by the prevailing belief that all things are interconnected, and in both her teaching and astrology readings, she strives to find ways to illuminate those connections for whomever she works with. You can find her on IG and TikTok @eighthhouseastro, and subscribe to her Substack where she shares weekly astro forecasts, a bi-weekly podcast on mythology, and other nice astro writings. You can also hear more from Cristina by tuning into her podcast, Soror Mystica, a podcast exploring life's mysteries through its symbols, co-hosted with tarotist Mariana Louis. https://www.cristinafarella.com https://www.instagram.com/eighthhouseastro/ https://eighthhouseastrology.substack.com https://www.sorormysticapodcast.com She is offering a discount just for this audience for her new class Astrology for Beginners. Use the code ANGEL for $5 off your ticket price. https://www.cristinafarella.com/astrology-for-beginners ******* Connect with Taylor further on Instagram @angels_and_amethyst or on her website www.angelsandamethyst.com. Follow @MagicHourPod on Instagram for more Magic Hour content. If you have any questions about intuition, spirituality, angels, or anything and everything magical, please email contact@magichourpod.com; Taylor will answer one question at the top of each episode. Don't forget to leave us a 5 sparkling star review; they help more people find the pod and remember their magic. Please screenshot and email your 5 star reviews to contact@magichourpod.com and we will send you a free downloadable angelic meditation, and enter you to win an angel reading with Taylor Paige! The first Angel Reading giveaway will happen when we hit 55 5 star reviews. Join the waitlist for a reading with Taylor here: https://angelsandamethyst.com/offerings/ Find Taylor's 3 part workshop series on Angelic Connection, Attracting a Soulmate Connection, and Healing the Witch wound here: https://angelsandamethyst.com/workshops/ Code 333 gives $33 off; plus, each student can email Taylor one question on the subject material per lesson. Join Taylor's email list at angelsandamethyst.com to know when her monthly gatherings of Earth Angel Club are open for registration. Earth Angel Club is a monthly meeting of like-minded and magical people across the world. EAC includes an astrological and energetic overview, a guided meditation attuned to the current zodiac season, and for the highest ticket tier, a mini email angel reading. Each EAC member also has the option to skip the waitlist and sit with Taylor sooner for a reading. Are you an aligned business owner that would like to advertise to our beautiful community of magical people? 
Please email contact@magichourpod.com. Music by Justin Fleuriel and Mandie Cheung. For more of their music, check out @goodnightsband on Instagram.
A dramatic reading by Jason Louv of the 1819 poem "The Fall of Hyperion—A Dream" by John Keats, set to music by Jason. Not uncommon for the 19th century, it is awash in occult and Hermetic symbolism. Show Links Magick.Me Magick.Me's Fast-Growing YouTube Channel: Like and Subscribe!!! The full text of the poem follows: "The Fall of Hyperion—A Dream" John Keats CANTO I Fanatics have their dreams, wherewith they weave A paradise for a sect; the savage too From forth the loftiest fashion of his sleep Guesses at Heaven; pity these have not Trac'd upon vellum or wild Indian leaf The shadows of melodious utterance. But bare of laurel they live, dream, and die; For Poesy alone can tell her dreams, With the fine spell of words alone can save Imagination from the sable charm And dumb enchantment. Who alive can say, 'Thou art no Poet may'st not tell thy dreams?' Since every man whose soul is not a clod Hath visions, and would speak, if he had loved And been well nurtured in his mother tongue. Whether the dream now purpos'd to rehearse Be poet's or fanatic's will be known When this warm scribe my hand is in the grave. Methought I stood where trees of every clime, Palm, myrtle, oak, and sycamore, and beech, With plantain, and spice blossoms, made a screen; In neighbourhood of fountains, by the noise Soft showering in my ears, and, by the touch Of scent, not far from roses. Turning round I saw an arbour with a drooping roof Of trellis vines, and bells, and larger blooms, Like floral censers swinging light in air; Before its wreathed doorway, on a mound Of moss, was spread a feast of summer fruits, Which, nearer seen, seem'd refuse of a meal By angel tasted or our Mother Eve; For empty shells were scattered on the grass, And grape stalks but half bare, and remnants more, Sweet smelling, whose pure kinds I could not know. Still was more plenty than the fabled horn Thrice emptied could pour forth, at banqueting For Proserpine return'd to her own fields, Where the white heifers low. And appetite More yearning than on earth I ever felt Growing within, I ate deliciously; And, after not long, thirsted, for thereby Stood a cool vessel of transparent juice Sipp'd by the wander'd bee, the which I took, And, pledging all the mortals of the world, And all the dead whose names are in our lips, Drank. That full draught is parent of my theme. No Asian poppy nor elixir fine Of the soon fading jealous Caliphat, No poison gender'd in close monkish cell To thin the scarlet conclave of old men, Could so have rapt unwilling life away. Among the fragrant husks and berries crush'd, Upon the grass I struggled hard against The domineering potion; but in vain: The cloudy swoon came on, and down I sunk Like a Silenus on an antique vase. How long I slumber'd 'tis a chance to guess. When sense of life return'd, I started up As if with wings; but the fair trees were gone, The mossy mound and arbour were no more: I look'd around upon the carved sides Of an old sanctuary with roof august, Builded so high, it seem'd that filmed clouds Might spread beneath, as o'er the stars of heaven; So old the place was, I remember'd none The like upon the earth: what I had seen Of grey cathedrals, buttress'd walls, rent towers, The superannuations of sunk realms, Or Nature's rocks toil'd hard in waves and winds, Seem'd but the faulture of decrepit things To that eternal domed monument. 
Upon the marble at my feet there lay Store of strange vessels and large draperies, Which needs had been of dyed asbestos wove, Or in that place the moth could not corrupt, So white the linen, so, in some, distinct Ran imageries from a sombre loom. All in a mingled heap confus'd there lay Robes, golden tongs, censer and chafing dish, Girdles, and chains, and holy jewelries. Turning from these with awe, once more I rais'd My eyes to fathom the space every way; The embossed roof, the silent massy range Of columns north and south, ending in mist Of nothing, then to eastward, where black gates Were shut against the sunrise evermore. Then to the west I look'd, and saw far off An image, huge of feature as a cloud, At level of whose feet an altar slept, To be approach'd on either side by steps, And marble balustrade, and patient travail To count with toil the innumerable degrees. Towards the altar sober paced I went, Repressing haste, as too unholy there; And, coming nearer, saw beside the shrine One minist'ring; and there arose a flame. When in mid May the sickening East wind Shifts sudden to the south, the small warm rain Melts out the frozen incense from all flowers, And fills the air with so much pleasant health That even the dying man forgets his shroud; Even so that lofty sacrificial fire, Sending forth Maian incense, spread around Forgetfulness of everything but bliss, And clouded all the altar with soft smoke, From whose white fragrant curtains thus I heard Language pronounc'd: 'If thou canst not ascend 'These steps, die on that marble where thou art. 'Thy flesh, near cousin to the common dust, 'Will parch for lack of nutriment thy bones 'Will wither in few years, and vanish so 'That not the quickest eye could find a grain 'Of what thou now art on that pavement cold. 'The sands of thy short life are spent this hour, 'And no hand in the universe can turn 'Thy hourglass, if these gummed leaves be burnt 'Ere thou canst mount up these immortal steps.' I heard, I look'd: two senses both at once, So fine, so subtle, felt the tyranny Of that fierce threat and the hard task proposed. Prodigious seem'd the toil, the leaves were yet Burning when suddenly a palsied chill Struck from the paved level up my limbs, And was ascending quick to put cold grasp Upon those streams that pulse beside the throat: I shriek'd; and the sharp anguish of my shriek Stung my own ears I strove hard to escape The numbness; strove to gain the lowest step. Slow, heavy, deadly was my pace: the cold Grew stifling, suffocating, at the heart; And when I clasp'd my hands I felt them not. One minute before death, my iced foot touch'd The lowest stair; and as it touch'd, life seem'd To pour in at the toes: I mounted up, As once fair angels on a ladder flew From the green turf to Heaven. 'Holy Power,' Cried I, approaching near the horned shrine, 'What am I that should so be saved from death? 'What am I that another death come not 'To choke my utterance sacrilegious here?' Then said the veiled shadow 'Thou hast felt 'What 'tis to die and live again before 'Thy fated hour. That thou hadst power to do so 'Is thy own safety; thou hast dated on 'Thy doom.' 'High Prophetess,' said I, 'purge off, 'Benign, if so it please thee, my mind's film.' 'None can usurp this height,' return'd that shade, 'But those to whom the miseries of the world 'Are misery, and will not let them rest. 
'All else who find a haven in the world, 'Where they may thoughtless sleep away their days, 'If by a chance into this fane they come, 'Rot on the pavement where thou rottedst half.' 'Are there not thousands in the world,' said I, Encourag'd by the sooth voice of the shade, 'Who love their fellows even to the death; 'Who feel the giant agony of the world; 'And more, like slaves to poor humanity, 'Labour for mortal good? I sure should see 'Other men here; but I am here alone.' 'Those whom thou spak'st of are no vision'ries,' Rejoin'd that voice; 'they are no dreamers weak; 'They seek no wonder but the human face, 'No music but a happy noted voice; 'They come not here, they have no thought to come; 'And thou art here, for thou art less than they: 'What benefit canst thou do, or all thy tribe, 'To the great world? Thou art a dreaming thing, 'A fever of thyself think of the Earth; 'What bliss even in hope is there for thee? 'What haven? every creature hath its home; 'Every sole man hath days of joy and pain, 'Whether his labours be sublime or low 'The pain alone; the joy alone; distinct: 'Only the dreamer venoms all his days, 'Bearing more woe than all his sins deserve. 'Therefore, that happiness be somewhat shar'd, 'Such things as thou art are admitted oft 'Into like gardens thou didst pass erewhile, 'And suffer'd in these temples: for that cause 'Thou standest safe beneath this statue's knees.' 'That I am favour'd for unworthiness, 'By such propitious parley medicin'd 'In sickness not ignoble, I rejoice, 'Aye, and could weep for love of such award.' So answer'd I, continuing, 'If it please, 'Majestic shadow, tell me: sure not all 'Those melodies sung into the world's ear 'Are useless: sure a poet is a sage; 'A humanist, physician to all men. 'That I am none I feel, as vultures feel 'They are no birds when eagles are abroad. 'What am I then? Thou spakest of my tribe: 'What tribe?' The tall shade veil'd in drooping white Then spake, so much more earnest, that the breath Moved the thin linen folds that drooping hung About a golden censer from the hand Pendent. 'Art thou not of the dreamer tribe? 'The poet and the dreamer are distinct, 'Diverse, sheer opposite, antipodes. 'The one pours out a balm upon the world, 'The other vexes it.' Then shouted I Spite of myself, and with a Pythia's spleen, 'Apollo! faded! O far flown Apollo! 'Where is thy misty pestilence to creep 'Into the dwellings, through the door crannies 'Of all mock lyrists, large self worshipers, 'And careless Hectorers in proud bad verse. 'Though I breathe death with them it will be life 'To see them sprawl before me into graves. 'Majestic shadow, tell me where I am, 'Whose altar this; for whom this incense curls; 'What image this whose face I cannot see, 'For the broad marble knees; and who thou art, 'Of accent feminine so courteous?' Then the tall shade, in drooping linens veil'd, Spoke out, so much more earnest, that her breath Stirr'd the thin folds of gauze that drooping hung About a golden censer from her hand Pendent; and by her voice I knew she shed Long treasured tears. 'This temple, sad and lone, 'Is all spar'd from the thunder of a war 'Foughten long since by giant hierarchy 'Against rebellion: this old image here, 'Whose carved features wrinkled as he fell, 'Is Saturn's; I Moneta, left supreme 'Sole priestess of this desolation.' I had no words to answer, for my tongue, Useless, could find about its roofed home No syllable of a fit majesty To make rejoinder to Moneta's mourn. 
There was a silence, while the altar's blaze Was fainting for sweet food: I look'd thereon, And on the paved floor, where nigh were piled Faggots of cinnamon, and many heaps Of other crisped spice wood then again I look'd upon the altar, and its horns Whiten'd with ashes, and its lang'rous flame, And then upon the offerings again; And so by turns till sad Moneta cried, 'The sacrifice is done, but not the less 'Will I be kind to thee for thy good will. 'My power, which to me is still a curse, 'Shall be to thee a wonder; for the scenes 'Still swooning vivid through my globed brain 'With an electral changing misery 'Thou shalt with those dull mortal eyes behold, 'Free from all pain, if wonder pain thee not.' As near as an immortal's sphered words Could to a mother's soften, were these last: And yet I had a terror of her robes, And chiefly of the veils, that from her brow Hung pale, and curtain'd her in mysteries That made my heart too small to hold its blood. This saw that Goddess, and with sacred hand Parted the veils. Then saw I a wan face, Not pin'd by human sorrows, but bright blanch'd By an immortal sickness which kills not; It works a constant change, which happy death Can put no end to; deathwards progressing To no death was that visage; it had pass'd The lily and the snow; and beyond these I must not think now, though I saw that face But for her eyes I should have fled away. They held me back, with a benignant light Soft mitigated by divinest lids Half closed, and visionless entire they seem'd Of all external things; they saw me not, But in blank splendour beam'd like the mild moon, Who comforts those she sees not, who knows not What eyes are upward cast. As I had found A grain of gold upon a mountain side, And twing'd with avarice strain'd out my eyes To search its sullen entrails rich with ore, So at the view of sad Moneta's brow I ach'd to see what things the hollow brain Behind enwombed: what high tragedy In the dark secret chambers of her skull Was acting, that could give so dread a stress To her cold lips, and fill with such a light Her planetary eyes, and touch her voice With such a sorrow 'Shade of Memory!' Cried I, with act adorant at her feet, 'By all the gloom hung round thy fallen house, 'By this last temple, by the golden age, 'By great Apollo, thy dear Foster Child, 'And by thyself, forlorn divinity, 'The pale Omega of a withered race, 'Let me behold, according as thou saidst, 'What in thy brain so ferments to and fro!' No sooner had this conjuration pass'd My devout lips, than side by side we stood (Like a stunt bramble by a solemn pine) Deep in the shady sadness of a vale, Far sunken from the healthy breath of morn, Far from the fiery noon and eve's one star. Onward I look'd beneath the gloomy boughs, And saw, what first I thought an image huge, Like to the image pedestal'd so high In Saturn's temple. Then Moneta's voice Came brief upon mine ear 'So Saturn sat When he had lost his realms ' whereon there grew A power within me of enormous ken To see as a god sees, and take the depth Of things as nimbly as the outward eye Can size and shape pervade. The lofty theme At those few words hung vast before my mind, With half unravel'd web. I set myself Upon an eagle's watch, that I might see, And seeing ne'er forget. No stir of life Was in this shrouded vale, not so much air As in the zoning of a summer's day Robs not one light seed from the feather'd grass, But where the dead leaf fell there did it rest. 
A stream went voiceless by, still deaden'd more By reason of the fallen divinity Spreading more shade; the Naiad 'mid her reeds Press'd her cold finger closer to her lips. Along the margin sand large footmarks went No farther than to where old Saturn's feet Had rested, and there slept, how long a sleep! Degraded, cold, upon the sodden ground His old right hand lay nerveless, listless, dead, Unsceptred; and his realmless eyes were clos'd, While his bow'd head seem'd listening to the Earth, His ancient mother, for some comfort yet. It seem'd no force could wake him from his place; But there came one who with a kindred hand Touch'd his wide shoulders after bending low With reverence, though to one who knew it not. Then came the griev'd voice of Mnemosyne, And griev'd I hearken'd. 'That divinity 'Whom thou saw'st step from yon forlornest wood, 'And with slow pace approach our fallen King, 'Is Thea, softest natur'd of our brood.' I mark'd the Goddess in fair statuary Surpassing wan Moneta by the head, And in her sorrow nearer woman's tears. There was a listening fear in her regard, As if calamity had but begun; As if the vanward clouds of evil days Had spent their malice, and the sullen rear Was with its stored thunder labouring up. One hand she press'd upon that aching spot Where beats the human heart, as if just there, Though an immortal, she felt cruel pain; The other upon Saturn's bended neck She laid, and to the level of his hollow ear Leaning with parted lips, some words she spake In solemn tenor and deep organ tune; Some mourning words, which in our feeble tongue Would come in this like accenting; how frail To that large utterance of the early Gods! 'Saturn! look up and for what, poor lost King? 'I have no comfort for thee; no not one; 'I cannot cry, Wherefore thus sleepest thou? 'For Heaven is parted from thee, and the Earth 'Knows thee not, so afflicted, for a God; 'And Ocean too, with all its solemn noise, 'Has from thy sceptre pass'd, and all the air 'Is emptied of thine hoary majesty: 'Thy thunder, captious at the new command, 'Rumbles reluctant o'er our fallen house; 'And thy sharp lightning, in unpracticed hands, 'Scorches and burns our once serene domain. 'With such remorseless speed still come new woes, 'That unbelief has not a space to breathe. 'Saturn! sleep on: Me thoughtless, why should I 'Thus violate thy slumbrous solitude? 'Why should I ope thy melancholy eyes? 'Saturn, sleep on, while at thy feet I weep.' As when upon a tranced summer night Forests, branch charmed by the earnest stars, Dream, and so dream all night without a noise, Save from one gradual solitary gust, Swelling upon the silence; dying off; As if the ebbing air had but one wave; So came these words, and went; the while in tears She press'd her fair large forehead to the earth, Just where her fallen hair might spread in curls A soft and silken mat for Saturn's feet. Long, long those two were postured motionless, Like sculpture builded up upon the grave Of their own power. A long awful time I look'd upon them: still they were the same; The frozen God still bending to the earth, And the sad Goddess weeping at his feet, Moneta silent. Without stay or prop But my own weak mortality, I bore The load of this eternal quietude, The unchanging gloom, and the three fixed shapes Ponderous upon my senses, a whole moon. For by my burning brain I measured sure Her silver seasons shedded on the night, And ever day by day methought I grew More gaunt and ghostly. 
Oftentimes I pray'd Intense, that Death would take me from the vale And all its burthens gasping with despair Of change, hour after hour I curs'd myself; Until old Saturn rais'd his faded eyes, And look'd around and saw his kingdom gone, And all the gloom and sorrow of the place, And that fair kneeling Goddess at his feet. As the moist scent of flowers, and grass, and leaves Fills forest dells with a pervading air, Known to the woodland nostril, so the words Of Saturn fill'd the mossy glooms around, Even to the hollows of time eaten oaks And to the windings of the foxes' hole, With sad low tones, while thus he spake, and sent Strange musings to the solitary Pan. 'Moan, brethren, moan; for we are swallow'd up 'And buried from all Godlike exercise 'Of influence benign on planets pale, 'And peaceful sway above man's harvesting, 'And all those acts which Deity supreme 'Doth ease its heart of love in. Moan and wail, 'Moan, brethren, moan; for lo, the rebel spheres 'Spin round, the stars their ancient courses keep, 'Clouds still with shadowy moisture haunt the earth, 'Still suck their fill of light from sun and moon, 'Still buds the tree, and still the sea shores murmur; 'There is no death in all the Universe, 'No smell of death there shall be death Moan, moan, 'Moan, Cybele, moan; for thy pernicious babes 'Have changed a God into a shaking Palsy. 'Moan, brethren, moan, for I have no strength left, 'Weak as the reed weak feeble as my voice 'O, O, the pain, the pain of feebleness. 'Moan, moan, for still I thaw or give me help; 'Throw down those imps, and give me victory. 'Let me hear other groans, and trumpets blown 'Of triumph calm, and hymns of festival 'From the gold peaks of Heaven's high piled clouds; 'Voices of soft proclaim, and silver stir 'Of strings in hollow shells; and let there be 'Beautiful things made new, for the surprise 'Of the sky children.' So he feebly ceas'd, With such a poor and sickly sounding pause, Methought I heard some old man of the earth Bewailing earthly loss; nor could my eyes And ears act with that pleasant unison of sense Which marries sweet sound with the grace of form, And dolorous accent from a tragic harp With large limb'd visions. More I scrutinized: Still fix'd he sat beneath the sable trees, Whose arms spread straggling in wild serpent forms, With leaves all hush'd; his awful presence there (Now all was silent) gave a deadly lie To what I erewhile heard only his lips Trembled amid the white curls of his beard. They told the truth, though, round, the snowy locks Hung nobly, as upon the face of heaven A mid day fleece of clouds. Thea arose, And stretched her white arm through the hollow dark, Pointing some whither: whereat he too rose Like a vast giant, seen by men at sea To grow pale from the waves at dull midnight. They melted from my sight into the woods; Ere I could turn, Moneta cried, 'These twain 'Are speeding to the families of grief, 'Where roof'd in by black rocks they waste, in pain 'And darkness, for no hope.' And she spake on, As ye may read who can unwearied pass Onward from the antechamber of this dream, Where even at the open doors awhile I must delay, and glean my memory Of her high phrase: perhaps no further dare. CANTO II 'Mortal, that thou may'st understand aright, 'I humanize my sayings to thine ear, 'Making comparisons of earthly things; 'Or thou might'st better listen to the wind, 'Whose language is to thee a barren noise, 'Though it blows legend laden through the trees. 
'In melancholy realms big tears are shed, 'More sorrow like to this, and such like woe, 'Too huge for mortal tongue, or pen of scribe. 'The Titans fierce, self hid or prison bound, 'Groan for the old allegiance once more, 'Listening in their doom for Saturn's voice. 'But one of our whole eagle brood still keeps 'His sov'reignty, and rule, and majesty; 'Blazing Hyperion on his orbed fire 'Still sits, still snuffs the incense teeming up 'From man to the sun's God: yet unsecure, 'For as upon the earth dire prodigies 'Fright and perplex, so also shudders he: 'Nor at dog's howl or gloom bird's Even screech, 'Or the familiar visitings of one 'Upon the first toll of his passing bell: 'But horrors, portioned to a giant nerve, 'Make great Hyperion ache. His palace bright, 'Bastion'd with pyramids of glowing gold, 'And touch'd with shade of bronzed obelisks, 'Glares a blood red through all the thousand courts, 'Arches, and domes, and fiery galleries: 'And all its curtains of Aurorian clouds 'Flush angerly; when he would taste the wreaths 'Of incense breath'd aloft from sacred hills, 'Instead of sweets his ample palate takes 'Savour of poisonous brass and metals sick. 'Wherefore when harbour'd in the sleepy West, 'After the full completion of fair day, 'For rest divine upon exalted couch 'And slumber in the arms of melody, 'He paces through the pleasant hours of ease 'With strides colossal, on from hall to hall; 'While far within each aisle and deep recess 'His winged minions in close clusters stand 'Amaz'd, and full of fear; like anxious men, 'Who on a wide plain gather in sad troops, 'When earthquakes jar their battlements and towers. 'Even now, while Saturn, roused from icy trance, 'Goes step for step with Thea from yon woods, 'Hyperion, leaving twilight in the rear, 'Is sloping to the threshold of the West. 'Thither we tend.' Now in clear light I stood, Reliev'd from the dusk vale. Mnemosyne Was sitting on a square edg'd polish'd stone, That in its lucid depth reflected pure Her priestess garments. My quick eyes ran on From stately nave to nave, from vault to vault, Through bow'rs of fragrant and enwreathed light And diamond paved lustrous long arcades. Anon rush'd by the bright Hyperion; His flaming robes stream'd out beyond his heels, And gave a roar, as if of earthly fire, That scared away the meek ethereal hours And made their dove wings tremble. On he flared. THE END 1819
This special episode of Au Revoir Podcast is published as part of the Podcasthon, whose second edition takes place at the end of March 2024. For seven days, more than 300 podcast hosts come together to highlight the non-profit world and its values! The principle: during the same week, we all publish an episode devoted to an association, an NGO, or a cause of our choice. You already know the cause I stand for. So this year, I decided to spotlight a very important association that fights for better support around perinatal bereavement: Petite Emilie. Petite Emilie is an association that supports people facing a medical termination of pregnancy (IMG) and perinatal loss. If you don't know it yet, listen to this episode to discover it! I encourage you to visit www.podcasthon.org and, if you are able, to make a donation to the Petite Emilie association, whose volunteers notably take part in training sessions for healthcare workers. A big thank you to Laurence, Julien, Amanda, and Silvia, the Petite Emilie volunteers whose voices you hear in this episode. Au Revoir Podcast is also on Instagram: @aurevoir.podcast. Come find me there to talk and to lift, together, the veil on perinatal bereavement. Created and produced by Sophie de Chivré. The beautiful tracks that illustrate this episode were created by the Caen-based band Portier Dean and are taken from their album "Ancient Majesty": "Intro" / "The Pool" (previously unreleased instrumental version) / "Pythia". © 2018 Portier Dean Productions / Collectif Toujours. Hosted on Acast. See acast.com/privacy for more information.
A health insurance plan that reimburses appointments with doulas, does that exist? Yes, and I even know one: Nostrum Care, partner of this episode of Au Revoir Podcast! A big thank you to this insurer, committed to mental health, physical health, and perinatal care, for supporting Au Revoir Podcast. For more information, visit the Nostrum Care website and its Instagram account (https://www.instagram.com/nostrumcare/?hl=fr). In this new Au Revoir Podcast interview, published for World Doula Week, which takes place from March 22 to 28, I wanted to meet Leslie Lucien. After working for a long time in communications, Leslie became interested in parenthood when she became a mother. She trained as a childcare assistant, worked in a PMI (a French maternal and child health centre), and finally found THE job she wanted to do: doula, so she could be as close as possible to women and families during pregnancy and afterwards. Today, for Au Revoir Podcast, she answers all the questions you may have about how doulas can support you through perinatal bereavement. I wish you a good listen! A big thank you to Leslie Lucien and to Nostrum Care for its commitment around perinatal bereavement. Au Revoir Podcast is also on Instagram: @aurevoir.podcast. Come find me there to talk and to lift, together, the veil on perinatal bereavement. Created and produced by Sophie de Chivré. The beautiful tracks that illustrate this episode were created by the Caen-based band Portier Dean and are taken from their album "Ancient Majesty": "Intro" / "The Pool" (previously unreleased instrumental version) / "Pythia". © 2018 Portier Dean Productions / Collectif Toujours. Hosted on Acast. See acast.com/privacy for more information.
We are in 336 BC. Alexander travels to Delphi to ask the Pythia how his Persian campaign will go. Afterwards, he will head north to avenge some of his father's defeats. Instagram: https://www.instagram.com/eloudianos/ Donation link: https://alexandroscast.gr/en/donate/
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Fixing Feature Suppression in SAEs, published by Benjamin Wright on February 16, 2024 on LessWrong. Produced as part of the ML Alignment Theory Scholars Program - Winter 2023-24 Cohort as part of Lee Sharkey's stream. Sparse autoencoders are a method of resolving superposition by recovering linearly encoded "features" inside activations. Unfortunately, despite the great recent success of SAEs at extracting human-interpretable features, they fail to perfectly reconstruct the activations. For instance, Cunningham et al. (2023) note that replacing the residual stream of layer 2 of Pythia-70m with the reconstructed output of an SAE increased the perplexity of the model on the Pile from 25 to 40. It is important for interpretability that the features we extract accurately represent what the model is doing. In this post, I show how and why SAEs have a reconstruction gap due to 'feature suppression'. Then, I look at a few ways to fix this while maintaining the SAEs' interpretability. By modifying and fine-tuning a pre-trained SAE, we achieve a 9% decrease in mean square error and a 24% reduction in the perplexity increase upon patching activations into the LLM. Finally, I connect a theoretical example to the observed amounts of feature suppression in Pythia 70m, confirming that features are suppressed primarily based on the strength of their activations, not on their frequency of activation.
Feature Suppression
The architecture of an SAE is: f(x) = ReLU(W_e x + b_e), y = W_d f(x) + b_d. The loss function usually combines an MSE reconstruction loss with a sparsity term, like L(x, f(x), y) = ||y - x||^2 / d + c |f(x)|, where d is the dimension of x. When training the SAE on this loss, the decoder's weight matrix is fixed to have unit norm for each feature (column). The reason for feature suppression is simple: the training loss has two terms, only one of which is reconstruction. Therefore, reconstruction isn't perfect. In particular, the loss function pushes for smaller f(x) values, leading to suppressed features and worse reconstruction.
An illustrative example of feature suppression
As an example, consider the trivial case where there is only one binary feature in one dimension. That is, x = 1 with probability p and x = 0 otherwise. Then, ideally, the optimal SAE would extract feature activations of f(x) ∈ {0, 1} and have a decoder with W_d = 1. However, if we were to train an SAE optimizing the loss function L(x, f(x), y) = ||y - x||^2 + c |f(x)|, we get a different result. If we ignore bias terms for simplicity of argument, and say that the encoder outputs feature activation a if x = 1 and 0 otherwise, then the optimization problem becomes: a = argmin_a [ p L(1, a, a) + (1 - p) L(0, 0, 0) ] = argmin_a [ (a - 1)^2 + c |a| ] = argmin_a [ a^2 + (c - 2) a + 1 ], giving a = 1 - c/2. Therefore the feature is scaled by a factor of 1 - c/2 compared to optimal. This is an example of feature suppression. If we allow the ground truth feature to have an activation strength g upon activation and dimension d, this factor becomes 1 - cd/(2g). In other words, instead of having the ground truth activation g, the SAE learns an activation of g - cd/2, a constant amount less. Features with activation strengths below cd/2 would be completely killed off by the SAE. 
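For readers who want to see the setup above in code, here is a minimal PyTorch sketch of an SAE with the ReLU encoder, unit-norm decoder columns, and MSE-plus-L1 loss just described. The class name, sizes, and training details are illustrative assumptions, not the post's actual implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseAutoencoder(nn.Module):
    """Minimal SAE sketch: f(x) = ReLU(W_e x + b_e), y = W_d f(x) + b_d."""
    def __init__(self, d_model: int, d_dict: int):
        super().__init__()
        self.W_e = nn.Parameter(torch.randn(d_dict, d_model) * 0.01)
        self.b_e = nn.Parameter(torch.zeros(d_dict))
        self.W_d = nn.Parameter(torch.randn(d_model, d_dict) * 0.01)
        self.b_d = nn.Parameter(torch.zeros(d_model))

    def forward(self, x):
        feats = F.relu(x @ self.W_e.T + self.b_e)   # feature activations f(x)
        recon = feats @ self.W_d.T + self.b_d       # reconstruction y
        return recon, feats

    @torch.no_grad()
    def normalize_decoder(self):
        # Keep each dictionary feature (column of W_d) at unit norm,
        # as described in the post.
        self.W_d.data /= self.W_d.data.norm(dim=0, keepdim=True)

def sae_loss(x, recon, feats, c=1e-3):
    # MSE reconstruction term (divided by the dimension d) plus L1 sparsity term.
    mse = ((recon - x) ** 2).mean(dim=-1)
    l1 = feats.abs().sum(dim=-1)
    return (mse + c * l1).mean()

# Toy usage: the L1 term pulls feature activations below their "true" values,
# which is the feature suppression discussed above.
sae = SparseAutoencoder(d_model=512, d_dict=2048)
x = torch.randn(64, 512)
recon, feats = sae(x)
loss = sae_loss(x, recon, feats)
loss.backward()
sae.normalize_decoder()
```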
Feature suppression is a significant problem in current SAEs To experimentally verify that feature suppression affects SAEs, we first trained SAEs on the residual stream output of each layer of Pythia-70m with an L1 sparsity penalty (coefficient 2e-3) on 6 epochs of 100 million tokens of OpenWebText, with batch size 64 and learning rate 1e-3, resulting in roughly 13-80 feature activations per token. The residual stream of Pythia-70m had a dimension size of 512 and we used a dictionary size of 2048, for a four times scale up. If feature suppression had a noticeable effect, we'd see that the SAE reconstructions had noticeably smaller L2 norm...
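A rough illustration of the kind of sanity check mentioned above (comparing the L2 norm of SAE reconstructions to that of the original activations), reusing the hypothetical SparseAutoencoder sketch from earlier; the activations here are random placeholders rather than real Pythia-70m residual streams.

```python
import torch

def norm_ratio(sae, activations: torch.Tensor) -> float:
    # Values well below 1 suggest systematically shrunken (suppressed)
    # reconstructions; the data and model here are stand-ins.
    with torch.no_grad():
        recon, _ = sae(activations)
        return (recon.norm(dim=-1) / activations.norm(dim=-1)).mean().item()

acts = torch.randn(1024, 512)   # placeholder for residual-stream activations
print(f"mean ||recon|| / ||x||: {norm_ratio(sae, acts):.3f}")
```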
Based on the research of Fuad I. Khuri in his book Being a Druze and The Druze Faith by Sami Nasib Makarem, Yara and Pythia discuss the beliefs and practices of the Druze and how you can incorporate aspects of their philosophy into your own practice. They also discuss how to make your own lye and what it can be used for along with information on plant stratification and how to prepare your plants for the upcoming year.
Join Kat and Jethro for another mind-bending episode of The Box of Oddities podcast! This week, we're diving deep into the strange and unusual, exploring four incredible topics that will leave you in awe. First, we unravel the mysteries of the "Stone Tape Theory." Is it possible for inanimate objects to record and play back past events? Discover the eerie phenomena and the science behind them as we venture into the unexplained. Next, we take you on a journey to uncover the enigmatic "Chatata Wall Inscriptions," ancient writings that have baffled historians and archaeologists alike. Join us as we delve into the secrets hidden within these mysterious inscriptions, their significance, and their surprising origin. Then, we unveil a fascinating scientific theory about how Pythia, the legendary oracle of Delphi, summoned her visions. Was it magic, rituals, or something entirely different? Tune in to explore this captivating aspect of ancient history. And finally, we bring you the incredible story of how scientists have defied time itself by resurrecting a 32,000-year-old plant. It's a tale of scientific marvel and wonder that will leave you questioning the boundaries of life and death. Join us on this journey of the bizarre, as we explore the Stone Tape Theory, The Chatata Wall Inscriptions, Pythia's visions, and the resurrection of an ancient plant. Don't miss out on this episode packed with oddities and surprises! Subscribe, rate, and share The Box of Oddities podcast to keep the curiosity flowing. If you would like to advertise on The Box of Oddities, contact sales@advertisecast.com http://www.airwavemedia.com Learn more about your ad choices. Visit megaphone.fm/adchoices
A new week means new questions! Hope you have fun with these!What name is given to a European plum that has been dried?In Greek Mythology, Pythia was the priestess of Apollo and was known as the what of Delphi?Which Scottish novelist wrote the novels Rob Roy, The Lady of the Lake and Ivanhoe?How many pillars are there in the 'Pillars of Islam'?What part of your body can only be seen when you are sitting?Riccardo Muti, Leopold Anthony Stokowski and Marin Alsop are all famous what?What other team competes against Real Madrid in "The Classic"?Which Taylor Swift song had the longest stay on the Billboard top 10 with a nearly six-month stay?What language does the term eureka come from?MusicHot Swing, Fast Talkin, Bass Walker, Dances and Dames, Ambush by Kevin MacLeod (incompetech.com)Licensed under Creative Commons: By Attribution 3.0 http://creativecommons.org/licenses/by/3.0/Don't forget to follow us on social media:Patreon – patreon.com/quizbang – Please consider supporting us on Patreon. Check out our fun extras for patrons and help us keep this podcast going. We appreciate any level of support!Website – quizbangpod.com Check out our website, it will have all the links for social media that you need and while you're there, why not go to the contact us page and submit a question!Facebook – @quizbangpodcast – we post episode links and silly lego pictures to go with our trivia questions. Enjoy the silly picture and give your best guess, we will respond to your answer the next day to give everyone a chance to guess.Instagram – Quiz Quiz Bang Bang (quizquizbangbang), we post silly lego pictures to go with our trivia questions. Enjoy the silly picture and give your best guess, we will respond to your answer the next day to give everyone a chance to guess.Twitter – @quizbangpod We want to start a fun community for our fellow trivia lovers. If you hear/think of a fun or challenging trivia question, post it to our twitter feed and we will repost it so everyone can take a stab it. Come for the trivia – stay for the trivia.Ko-Fi – ko-fi.com/quizbangpod – Keep that sweet caffeine running through our body with a Ko-Fi, power us through a late night of fact checking and editing!
Episode 2822: Herodotus describes historical events of the 5th century BC, a fantastical and entertaining component of The Histories. Today, we visit Herodotus.
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Fact Finding: Do Early Layers Specialise in Local Processing? (Post 5), published by Neel Nanda on December 23, 2023 on The AI Alignment Forum. This is the fifth post in the Google DeepMind mechanistic interpretability team's investigation into how language models recall facts. This post is a bit tangential to the main sequence, and documents some interesting observations about how, in general, early layers of models somewhat (but not fully) specialise into processing recent tokens. You don't need to believe these results to believe our overall results about facts, but we hope they're interesting! And likewise you don't need to read the rest of the sequence to engage with this. Introduction In this sequence we've presented the multi-token embedding hypothesis, that a crucial mechanism behind factual recall is that on the final token of a multi-token entity there forms an "embedding", with linear representations of attributes of that entity. We further noticed that this seemed to be most of what early layers did, and that they didn't seem to respond much to prior context (e.g. adding "Mr Michael Jordan" didn't substantially change the residual). We hypothesised the stronger claim that early layers (e.g. the first 10-20%), in general, specialise in local processing, and that the prior context (e.g. more than 10 tokens back) is only brought in in early-mid layers. We note that this is stronger than the multi-token embedding hypothesis in two ways: it's a statement about how early layers behave on all tokens, not just the final tokens of entities about which facts are known; and it's a claim that early layers are not also doing longer range stuff in addition to producing the multi-token embedding (e.g. detecting the language of the text). We find this stronger hypothesis plausible, because tokens are a pretty messy input format, and analysing individual tokens in isolation can be highly misleading, e.g. We tested this by taking a bunch of arbitrary prompts from the Pile, taking residual streams on those, truncating the prompts to the most recent few tokens and taking residual streams on the truncated prompts, and looking at the mean centred cosine sim at different layers. Our findings: Early layers do, in general, specialise in local processing, but it's a soft division of labour not a hard split. There's a gradual transition where more context is brought in across the layers. Early layers do significant processing on recent tokens, not just the current token - this is not just a trivial result where the residual stream is dominated by the current token and slightly adjusted by each layer Early layers do much more long-range processing on common tokens (punctuation, articles, pronouns, etc) Experiments The "early layers specialise in local processing" hypothesis concretely predicts that, for a given token X in a long prompt, if we truncate the prompt to just the most recent few tokens before X, the residual stream at X should be very similar at early layers and dissimilar at later layers. We can test this empirically by looking at the cosine sim of the original vs truncated residual streams, as a function of layer and truncated context length. 
Taking cosine sims of residual streams naively can be misleading, as there's often a significant shared mean across all tokens, so we first subtract the mean residual stream across all tokens, and then take the cosine sim.
Set-Up
Model: Pythia 2.8B, as in the rest of our investigation.
Dataset: Strings from the Pile, the Pythia pre-training distribution.
Metric: To measure how similar the original and truncated residual streams are, we subtract the mean residual stream and then take the cosine sim. We compute a separate mean per layer, across all tokens in random prompts from the Pile.
Truncated context: We vary the number of tokens i...
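Here is a schematic sketch of the truncation experiment and metric described above. The residual_streams function is a hypothetical stand-in for whatever hooking code is used to read per-layer residual streams from Pythia 2.8B; only the mean-centred cosine-similarity logic is meant literally.

```python
import torch.nn.functional as F

def truncation_similarity(residual_streams, tokens, position, k, layer_means):
    """Mean-centred cosine sim between residual streams of the full prompt
    and of the prompt truncated to its last k tokens, at a given position.

    residual_streams(tokens) is assumed to return a tensor of shape
    [n_layers, seq_len, d_model]; layer_means has shape [n_layers, d_model]
    (a per-layer mean residual stream computed over many Pile tokens).
    """
    full = residual_streams(tokens)                                # [L, S, D]
    truncated = residual_streams(tokens[position - k + 1 : position + 1])

    sims = []
    for layer in range(full.shape[0]):
        a = full[layer, position] - layer_means[layer]     # original, centred
        b = truncated[layer, -1] - layer_means[layer]      # truncated, centred
        sims.append(F.cosine_similarity(a, b, dim=0).item())
    # If early layers specialise in local processing, sims should stay high
    # for early layers and drop off in later layers.
    return sims
```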
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: A Universal Emergent Decomposition of Retrieval Tasks in Language Models, published by Alexandre Variengien on December 19, 2023 on LessWrong. This work was done as a Master's thesis project at Conjecture, independent from the primary agenda of the organization. Paper available here, thesis here. Over the past months I (Alexandre) - with the help of Eric - have been working on a new approach to interpretability of language models (LMs). In the search for the units of interpretability, I decided to zoom out instead of zooming in. I focused on careful dataset design and causal intervention at a macro-level (i.e. scale of layers). My goal has been to find out if there are such things as "organs"[1] in LMs. In other words, are there macroscopic universal motifs, coarse-grained internal structures corresponding to a function that would generalize across models and domains? I think I found an example of universal macroscopic motifs! Our paper suggests that the information flow inside Transformers can be decomposed cleanly at a macroscopic level. This gives hope that we could design safety applications to know what models are thinking or intervene on their mechanisms without the need to fully understand their internal computations. In this post, we give an overview of the results and compare them with two recent works that also study high-level information flow in LMs. We discuss the respective setups, key differences, and the general picture they paint when taken together. Executive summary of the paper Methods We introduce ORION, a collection of carefully crafted retrieval tasks that offer token-level control and include 6 domains. Prompts in ORION are composed of a request (e.g. a question) asking to retrieve an entity (e.g. a character) from a context (e.g. a story). We can understand the high-level processing happening at the last token position of an ORION prompt: Middle layers at the last token position process the request. Late layers take the representation of the request from early layers and retrieve the correct entity from the context. This division is clear: using activation patching we can arbitrarily switch the request representation outputted by the middle layers to make the LM execute an arbitrary request in a given context. We call this experimental result request patching (see figure below). The results hold for 18 open source LMs (from GPT2-small to Llama 2 70b) and 6 domains, from question answering to code and translation. We provide a detailed case study on Pythia-2.8b using more classical mechanistic interpretability methods to link what we know happens at the layer level to how it is implemented by individual components. The results suggest that the clean division only emerges at the scale of layers and doesn't hold at the scale of components. Applications Building on this understanding, we demonstrate a proof of concept application for scalable oversight of LM internals to mitigate prompt-injection while requiring human supervision on only a single input. Our solution drastically mitigates the distracting effect of the prompt injection (accuracy increases from 15.5% to 97.5% on Pythia-12b). We used the same setting to build an application for mechanistic anomaly detection. We study settings where a token X is both the target of the prompt injection and the correct answer. 
We tried to answer "Does the LM answer X because it's the correct answer or because it has been distracted by the prompt injection?". Applying the same technique fails at identifying prompt injection in most cases. We think it is surprising and it could be a concrete and tractable problem to study in future works. Setup We study prompts where predicting the next token involves retrieving a specific keyword from a long context. For example: Here is a short story. Read it carefully ...
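As a rough sketch of the request-patching intervention this excerpt describes, the code below caches the residual stream at one layer's last token position on a source prompt and overwrites the same position while running a target prompt. The model.layers attribute, and the assumption that each layer returns a plain residual tensor, are illustrative choices, not the paper's actual code.

```python
import torch

def request_patch(model, source_tokens, target_tokens, layer_idx, last_pos=-1):
    """Schematic request patching via PyTorch forward hooks."""
    cached = {}

    def save_hook(module, inputs, output):
        # Cache the residual stream at the last token of the source prompt.
        cached["resid"] = output[:, last_pos, :].detach().clone()

    def patch_hook(module, inputs, output):
        # Overwrite the same position during the target prompt's forward pass.
        output = output.clone()
        output[:, last_pos, :] = cached["resid"]
        return output

    # 1) Cache the request representation from the source prompt.
    handle = model.layers[layer_idx].register_forward_hook(save_hook)
    with torch.no_grad():
        model(source_tokens)
    handle.remove()

    # 2) Patch it into the target prompt's forward pass.
    handle = model.layers[layer_idx].register_forward_hook(patch_hook)
    with torch.no_grad():
        patched_logits = model(target_tokens)
    handle.remove()
    return patched_logits
```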
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Finding Sparse Linear Connections between Features in LLMs, published by Logan Riggs Smith on December 9, 2023 on The AI Alignment Forum. TL;DR: We use SGD to find sparse connections between features; additionally, a large fraction of features between the residual stream & MLP can be modeled as linearly computed despite the non-linearity in the MLP. See the linear feature section for examples. Special thanks to fellow AISST member Adam Kaufman, who originally thought of the idea of learning sparse connections between features, & to Jannik Brinkmann for training these SAEs. Sparse autoencoders (SAEs) are able to turn the activations of an LLM into interpretable features. To define circuits, we would like to find how these features connect to each other. SAEs allowed us to scalably find interpretable features using SGD, so why not use SGD to find the connections too? We have a set of features before the MLP, F1, and a set of features after the MLP, F2. These features were learned by training SAEs on the activations at these layers. Ideally, we learn a linear function such that F2 = W(F1), & W is sparse (i.e. an L1 penalty on the weights of W). So then we can look at a feature in F2, and say "Oh, it's just a sparse linear combination of features of F1, e.g. 0.8*(however feature) + 0.6*(but feature)", which would be quite interpretable! However, we're trying to replicate an MLP's computation, which surely can't be all linear![1] So, what's the simplest computation from F1 to F2 that gets the lowest loss (ignoring the L1 weight sparsity penalty for now)? Training on only MSE between F1 & F2, we plot the MSE throughout training across 5 layers in Pythia-70m-deduped in 4 settings:
Linear: y = Wx
Nonlinear: y = ReLU(Wx)
MLP: y = W_2 ReLU(W_1 x)
Two Nonlinear: y = ReLU(W_2 ReLU(W_1 x))
For all layers, training loss clusters along (MLP & two nonlinear) and (linear & nonlinear). Since MLP & linear are the simplest of these two clusters, the rest of the analysis will only look at those two. [I also looked at bias vs no-bias: adding a bias didn't positively improve loss, so it was excluded.] Interestingly enough, the relative linear-MLP difference is huge in the last layer (and layer 2). The last layer also has a much larger loss in general, though the L2 norm of the MLP activations in layer 5 is 52 compared to 13 in layer 4. This is a 4x increase, which would be a 16x increase in MSE loss. The losses for the last datapoints are 0.059 & 0.0038, which are ~16x different.
What percentage of Features are Linear?
Clearly the MLP is better, but that's on average. What if a percentage of features can be modeled as linearly computed? So we take the difference in loss for features (i.e. for each feature, we take linear loss minus MLP loss), normalize all losses by their respective L2-norm/layer, and plot them. Uhhh… there are some huge outliers here, meaning these specific features are very non-linear. Just setting a threshold of 0.001 for all layers:
Layer | Percent of features < 0.001 loss-difference (i.e. can be represented linearly)
1 | 78%
2 | 96%
3 | 97%
4 | 98%
5 | 99.1%
Most of the features can be linearly modeled with a small difference in loss (some have a negative loss-diff, meaning linear had a *lower* loss than the MLP. The values are so small that I'd chalk that up to noise). That's very convenient! [Note: 0.001 is sort of arbitrary. 
To make this more principled, we could plot the effect of adding varying levels of noise to a layer of an LLM's activation, then pick a threshold that has a negligible drop in cross entropy loss? Adding in Sparsity Now, let's train sparse MLP & sparse linear connections. Additionally, we can restrict the linear one to only features that are well-modeled as linear (same w/ the MLP). We'll use the loss of: Loss = MSE(F2 - F2_hat) + l1_alpha*L1(weights) But how do we select l1_alpha? Let's just plot the ...
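A minimal sketch of the sparse linear map described above: fit F2 ≈ W·F1 with an L1 penalty on the weights of W. Dimensions, hyperparameters, and the random placeholder features are assumptions for illustration, not the author's training setup.

```python
import torch
import torch.nn as nn

def train_sparse_linear_map(F1, F2, l1_alpha=1e-3, epochs=200, lr=1e-3):
    """Fit F2 ≈ W @ F1 with an L1 penalty on W, so each output feature is
    (ideally) a sparse linear combination of input features."""
    n_in, n_out = F1.shape[1], F2.shape[1]
    W = nn.Linear(n_in, n_out, bias=False)
    opt = torch.optim.Adam(W.parameters(), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        pred = W(F1)
        loss = ((pred - F2) ** 2).mean() + l1_alpha * W.weight.abs().sum()
        loss.backward()
        opt.step()
    return W

# Placeholder feature activations (e.g. from SAEs before and after an MLP).
F1 = torch.relu(torch.randn(4096, 2048))
F2 = torch.relu(torch.randn(4096, 2048))
W = train_sparse_linear_map(F1, F2)
# Inspect the largest input features feeding output feature 0:
print(W.weight[0].abs().topk(5))
```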
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Some open-source dictionaries and dictionary learning infrastructure, published by Sam Marks on December 5, 2023 on The AI Alignment Forum. As more people begin work on interpretability projects which incorporate dictionary learning, it will be valuable to have high-quality dictionaries publicly available.[1] To get the ball rolling on this, my collaborator (Aaron Mueller) and I are:
- open-sourcing a number of sparse autoencoder dictionaries trained on Pythia-70m MLPs
- releasing our repository for training these dictionaries[2].
Let's discuss the dictionaries first, and then the repo.
The dictionaries
The dictionaries can be downloaded from here. See the sections "Downloading our open-source dictionaries" and "Using trained dictionaries" here for information about how to download and use them. If you use these dictionaries in a published paper, we ask that you mention us in the acknowledgements. We're releasing two sets of dictionaries for EleutherAI's 6-layer pythia-70m-deduped model. The dictionaries in both sets were trained on 512-dimensional MLP output activations (not the MLP hidden layer like Anthropic used), using ~800M tokens from The Pile. The first set, called 0_8192, consists of dictionaries of size 8192 = 16 × 512. These were trained with an L1 penalty of 1e-3. The second set, called 1_32768, consists of dictionaries of size 32768 = 64 × 512. These were trained with an L1 penalty of 3e-3. Here are some statistics. (See our repo's readme for more info on what these statistics mean.)
For dictionaries in the 0_8192 set:
Layer | MSE Loss | L1 loss | L0 | % Alive | % Loss Recovered
0 | 0.056 | 6.132 | 9.951 | 0.998 | 0.984
1 | 0.089 | 6.677 | 44.739 | 0.887 | 0.924
2 | 0.108 | 11.44 | 62.156 | 0.587 | 0.867
3 | 0.135 | 23.773 | 175.303 | 0.588 | 0.902
4 | 0.148 | 27.084 | 174.07 | 0.806 | 0.927
5 | 0.179 | 47.126 | 235.05 | 0.672 | 0.972
For dictionaries in the 1_32768 set:
Layer | MSE Loss | L1 loss | L0 | % Alive | % Loss Recovered
0 | 0.09 | 4.32 | 2.873 | 0.174 | 0.946
1 | 0.13 | 2.798 | 11.256 | 0.159 | 0.768
2 | 0.152 | 6.151 | 16.381 | 0.118 | 0.724
3 | 0.211 | 11.571 | 39.863 | 0.226 | 0.765
4 | 0.222 | 13.665 | 29.235 | 0.19 | 0.816
5 | 0.265 | 26.4 | 43.846 | 0.13 | 0.931
And here are some histograms of feature frequencies. Overall, I'd guess that these dictionaries are decent, but not amazing. We trained these dictionaries because we wanted to work on a downstream application of dictionary learning, but lacked the dictionaries. These dictionaries are more than good enough to get us off the ground on our mainline project, but I expect that in not too long we'll come back to train some better dictionaries (which we'll also open source). I think the same is true for other folks: these dictionaries should be sufficient to get started on projects that require dictionaries; and when better dictionaries are available later, you can swap them in for optimal results. Some miscellaneous notes about these dictionaries (you can find more in the repo). The L1 penalty for 1_32768 seems to have been too large; only 10-20% of the neurons are alive, and the loss recovered is much worse. That said, we'll remark that after examining features from both sets of dictionaries, the dictionaries from the 1_32768 set seem to have more interpretable features than those from the 0_8192 set (though it's hard to tell). 
In particular, we suspect that for 0_8192, the many high-frequency features in the later layers are uninterpretable but help significantly with reconstructing activations, resulting in deceptively good-looking statistics. (See the bullet point below regarding neuron resampling and bimodality.) As we progress through the layers, the dictionaries tend to get worse along most metrics (except for % loss recovered). This may have to do with the growing scale of the activations themselves as one moves through the layers of pythia models (h/t to Arthur Conmy for raising this hypothesis). We note that our dictionary fea...
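For orientation, here is a small sketch of how statistics like L0, % alive, and the per-feature firing frequencies behind the histograms mentioned above could be computed from a batch of SAE feature activations; it illustrates what the metrics mean and is not the authors' evaluation code.

```python
import torch

def dictionary_stats(feature_acts: torch.Tensor, eps: float = 0.0):
    """feature_acts: [n_tokens, n_features] SAE feature activations.

    Returns L0 (mean number of active features per token), the fraction of
    features that fire at least once ("% alive"), and per-feature firing
    frequencies (the quantity behind the frequency histograms).
    """
    active = feature_acts > eps
    l0 = active.sum(dim=1).float().mean().item()       # active features per token
    freqs = active.float().mean(dim=0)                  # per-feature firing rate
    pct_alive = (freqs > 0).float().mean().item()       # share of features ever used
    return l0, pct_alive, freqs

# Placeholder activations: in practice these come from running the SAE over
# many tokens of The Pile.
acts = torch.relu(torch.randn(10_000, 8192) - 2.0)
l0, pct_alive, freqs = dictionary_stats(acts)
print(f"L0={l0:.1f}, % alive={100 * pct_alive:.1f}%")
```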
Hey there, lovely souls! Join us on Episode 9 of "The Spiritual Rabbit Hole" with your favorite trio of Spiritual Mediums - Nicole, Kristin, and Glenda. Today, we're taking a wild ride through history, exploring psychics and how they've left their mark on cultures. We'll chat about ancient prophets like Abraham and Isaiah, who shaped the religious scene. Then, get ready for some mystical vibes as we unravel tales of seers and oracles, from Greece's Pythia to Odin and those wise Celtic Druids. Ever wondered how folks sought guidance back then? We're spilling the tea on divination practices, from Chinese I Ching to Roman Sibylline Oracles and ancient Chinese Oracle Bones. And, of course, we can't forget the OG psychics - Edgar Cayce and Nostradamus. These guys claimed to see the future and left us all fascinated. Fast forward to the 19th-century spiritualist movement, where mediums stole the show. The Fox sisters, with their séances and spirit chats, took center stage. We'll wrap it up by talking about how these mystical figures shaped our art, literature, and beliefs. It's a wild ride, and we want you on board! So, grab your favorite snack, cozy up, and let's explore "The Spiritual Rabbit Hole" together. Don't forget to subscribe for more deep dives into the mystical and downright intriguing. **There is a correction we need to make in this episode. There is a reference to Mary Magdalene's gospel and it was accidentally said to be from the 2nd century BC. It should say 2nd century AD. To learn more about Nicole, Kristin, and Glenda and their spiritual community, visit the Soul on a Voyage website, soulonavoyage.com. If you would like to schedule an appointment with Nicole Glosser, you may do so through her website, nicoleglosser.com. To find out more about the services Kristin Daniels has to offer, visit her website balancewithKristin.com. If you want to work with Glenda, email her at gsintuitivecalling@gmail.com.
This week in Episode #613, I talk with Travis Horseman from Pythia: The Last Oracle and Jordan Thomas with The Man From Maybe! Pythia: The Last Oracle is a four-issue comic miniseries from Travis Horseman, who has brought us Amiculus: A Secret History, Sugar Creek, and In Noctem. This book, which has a Kickstarter project going on until Friday, November 17, at 11:59 p.m. EDT, is described this way: “The Pythia, Apollo's Oracle of Delphi, must use her last prophecy to save the world, even if it means killing the god she serves.” We talk about how this series came to be, who the various characters are, and what we can expect from Travis in the coming months! Then everything wraps up with my interview with Jordan Thomas, who recently released the debut issue of The Man From Maybe, which is published by Oni Press. It's described this way: "This is a two-fisted, 48-page injection of post-apocalyptic postmodernism!” Jordan is the scripter and creator of this big, pulpy adventure, so we explore how this three-issue miniseries came to be, who the various characters are, and what we might expect from it in the coming months! Be sure to let your local comics shop know you want this excellent comics title!
Acts 16:16-24 16 Once when we were going to the place of prayer, we were met by a slave girl who had a spirit by which she predicted the future. She earned a great deal of money for her owners by fortune-telling. 17 This girl followed Paul and the rest of us, shouting, "These men are servants of the Most High God, who are telling you the way to be saved." 18 She kept this up for many days. Finally Paul became so troubled that he turned around and said to the spirit, "In the name of Jesus Christ I command you to come out of her!" At that moment the spirit left her. 19 When the owners of the slave girl realized that their hope of making money was gone, they seized Paul and Silas and dragged them into the marketplace to face the authorities. 20 They brought them before the magistrates and said, "These men are Jews, and are throwing our city into an uproar 21 by advocating customs unlawful for us Romans to accept or practice." 22 The crowd joined in the attack against Paul and Silas, and the magistrates ordered them to be stripped and beaten. 23 After they had been severely flogged, they were thrown into prison, and the jailer was commanded to guard them carefully. 24 Upon receiving such orders, he put them in the inner cell and fastened their feet in the stocks. (NIV 84) FROM THE LESSON Spirit of a Python: in verse 16, Luke literally reports that this slave girl had the "spirit of a python." This is a reference to the young women who served as priestesses at the Oracle at Delphi. These women were called Pythia, named after the Python dragon that the Greek god Apollo had defeated. Ventriloquism: in verse 16, the word translated as "fortune-telling" was also used to describe the art of ventriloquism. This is how the Greeks would have thought about what was happening with this slave girl. The words she spoke would have been understood as the very words of the gods being spoken through her, just like a ventriloquist and his puppet. IJM: International Justice Mission is a Christian organization that partners with local authorities all around the world to combat human trafficking and slavery, particularly violence against women and the exploitation of children. You can check out the incredible work they do at www.ijm.org. WHY SPEAK THE NAME OF JESUS? We speak the name of Jesus because we submit our lives to His authority. We speak the name of Jesus because we want to live without any ambiguity. We speak the name of Jesus because we desire to see His liberating activity. DISCUSSION QUESTIONS How many times did you speak the name of Jesus last week? Think about it and try to come up with a number. Paul wrote in Philippians 2:9 that "His name is above every name and at the name of Jesus every knee should bow." How can you speak the name of Jesus more in your day to day life as a way of submitting your life to His authority? Peter spoke these words in Acts 4:12, "Salvation is found in no one else, for there is no other name under heaven given to mankind by which we must be saved." Peter was crystal clear with his message. How can you speak the name of Jesus more in your day to day life as a way of living without any ambiguity? The angel of the Lord told Joseph in Matthew 1:21, "You are to give him the name Jesus because he will set His people free from sin." How can you speak the name of Jesus more in your day to day life as a way of seeing His liberating activity? https://www.youtube.com/watch?v=PcmqSfr1ENY
Inspired by the Goddess Pythia, who was the Oracle of Delphi at the Temple of Apollo, this meditation asks us not only to investigate what ritual we might like to develop around our meditation practice, but also to trust in the innate wisdom that we have within ourselves to answer the questions that come up for us in life. Before Pythia would give her answers to those who came from far and wide to hear her wisdom, there was a significant ritual. Humans love ritual, and developing your own will send a clear signal to mind, body and spirit that you are taking time to go within, time to find the clarity that is there at the Third Eye Chakra. Trusting that we already have within us everything we need to answer the questions that come up in our lives is hugely comforting. We all have the potential to be Pythia - we just need to trust in ourselves, and find the quiet space to listen to the Third Eye wisdom. I would love to hear about your meditation ritual! Send me photos of your set up and I'll share them on my Instagram if you allow me to! Also, to book your chakra reading, please do so here. Much love, Rosanne. Hosted on Acast. See acast.com/privacy for more information.
Shownotes How the present moment is actually the future in the making The science behind the brain forming connections with psychedelics Exploring the history of High Priestesses and witchcraft The art of using plant medicine to wake up our divine nature How healing trauma led to building an empire Sacred sexual practices as an expressway to divine remembrance Embracing the shadow as part of High Priestess work Bio Emily Fletcher is the founder of Ziva Meditation and has taught over 50,000 people. Her best-selling book, Stress Less, Accomplish More, debuted at #7 out of all books on Amazon and has been translated into 12 languages. Her work has been featured by The New York Times, Good Morning America, The Today Show, Vogue and ABC. She's been named one of the top 100 women in wellness to watch and has taught at Apple, Google and Harvard Business School. A formerly stressed Broadway performer who was going gray at 27, Emily discovered a powerful practice that cured her insomnia and improved her health on the first day. Her transformation was so dramatic that she felt inspired to share it with others. It's Emily's mission to help as many people as possible achieve extraordinary benefits — like dramatically reduced stress, anxiety relief and deep, restful sleep. TimeStamp 0:00:00 Intro 0:00:40 Layla Martin This Tantric Life 0:01:29 Stay Connected with Layla - https://laylamartin.com/join-list/ 0:02:17 Emily Fletcher Episode Introduction 0:03:26 Delphi: the location of the oracle Pythia, high priestess of the Temple of Apollo 0:09:54 The present moment is the future in the making 0:14:17 The brain forms connections with psychedelics and those connections remain 0:18:36 MOOD Supplements - Sex Magic - https://shopmood.com 0:23:15 War on drugs in Europe 0:29:38 Wise women "witches" High Priestesses give plant medicine 'sacrament' to wake up to your divine nature 0:37:28 If you want to stop the witches teach them to hate themselves 0:47:48 Daddy issues: climbing out of a trauma hell hole and building an empire 0:53:13 Epic Sex Guide https://laylamartin.com/epicsex 0:54:43 Pan's Cave 0:59:40 Sacred sexual practices: superhighway to divine remembrance 1:09:57 Can two women change destiny? 1:15:55 Part of the High Priestess work is to embrace the shadow 1:17:15 Conclusion of part 2 of the 4 part series. Tune in next week for parts 3 & 4 1:20:11 Thank you for listening to This Tantric Life === Follow Layla! Website: www.laylamartin.com Instagram: https://www.instagram.com/thelaylamartin/ Sign up to receive my free weekly email that allows you to slowly master the art of experiencing confidence, power, sexiness, radiance, and true love: https://laylamartin.com/join-list/ Listen on Apple Podcasts: https://podcasts.apple.com/us/podcast/this-tantric-life-with-layla-martin/id1685418994 Listen on Spotify: https://open.spotify.com/show/72iBpAwMSsTl9zLk8fHMp1?si=9f465574aa114d63 MOOD Sexy Plant Activated Supplements: https://shopmood.com/ For Men: Learn advanced sexual skills that will make you the best possible lover by unlocking your primal power AND your partner's pleasure in Men's Sexual Mastery - https://hubs.ly/Q01c_Wgx0 Obliss: The Sexual Masterclass for Women. This 6-week online course contains 24 transformative exercises and techniques. https://hubs.ly/Q01c0SWh0
This lecture was given on July 15th, 2023, at the "Thomistic Philosophy & Natural Science Symposium" at the Dominican House of Studies. For more information on upcoming events, please visit our website: thomisticinstitute.org/upcoming-events Speaker Bio: Dr. Steve Mrenna is a scientist and particle theorist at the Fermi National Accelerator Laboratory (Fermilab), a premier U.S. particle physics lab. He is a contributor to the Compact Muon Solenoid experiment at the Large Hadron Collider. Dr. Mrenna works closely with Monte Carlo event generators, computer programs that simulate the complex structure of particle-beam collisions at high energies. He is one of the primary authors of the Pythia event generator and develops critical components of data analysis that relate observed data to theoretical models in particle physics. His work sits at the crucial intersection of theory and practice in modern physics.
Travis was born in Springfield, Ohio, and discovered comic books at the age of six. Since then, the course of his life has been shaped by his love of comics, theater and classical history. In addition to being a writer, he has been an actor, a director, a produced playwright, and occasionally all three at once. Travis partnered with veteran comic artist Giancarlo Caracuzzo (www.giancarlocaracuzzo.com) to bring his ancient Roman epic graphic-novel series Amiculus: A Secret History to life panel by panel over three successful crowdfunding campaigns, releasing the final version in 2020. Since Amiculus, Travis has continued to write, crowdfund and publish comics under his label Amiculus Books, venturing into modern-day horror with Sugar Creek in 2019 and combining horror with ancient history in the vampire story In Noctem in 2021. His next work, the first installment of Pythia, the Last Oracle, will be released in spring 2023. Travis' Links: Facebook Amiculus Books Instagram travishorseman Website www.amiculusrome.com Order From Here at Bookshop.org! https://bookshop.org/a/10588/9798218216122 Or at Amazon.com https://www.amazon.com/stores/Travis-Horseman/author/B00V1UPSQO?ref=ap_rdr&store_ref=ap_rdr&isDramIntegrated=true&shoppingPortalEnabled=true Music by Jam Hansley Edited/Produced by Rob Southgate Buy our books: www.4horsemenpublications.com Our Social media: @Drinkingwithauthors #drinkingwithauthors #4horsemenpublications #authorslife #authorssupportingauthors #indieauthors #authorsofinsta #publishedauthor #authorlove #authorsoninstagram #supportauthors #plotter #panster #writercommunity #authorgram #authorpreneur #authorquotes #authorlove #authortobe #Authorevent #AuthorDay #authortalk #authorconfession #writerscorner #writersofinsta #writerofig #writerssociety #writersociety #writerscommunityofinstagram #writerswrite #drinkingwithauthorspodcast #writerslife #writingtips #writing #authors #erikalance #drinking
FlashAttention was first published by Tri Dao in May 2022 and it had a deep impact on the large language models space. Most open models you've heard of (RedPajama, MPT, LLaMA, Falcon, etc.) all leverage it for faster inference. Tri came on the podcast to chat about FlashAttention, the newly released FlashAttention-2, the research process at Hazy Research, and more. This is the first episode of our "Papers Explained" series, which will cover some of the foundational research in this space. Our Discord also hosts a weekly Paper Club, which you can sign up for here.
How does FlashAttention work?
The paper is titled "FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness". There are a couple of keywords to call out:
* "Memory Efficient": standard attention memory usage is quadratic with sequence length (i.e. O(N^2)). FlashAttention's memory usage is linear, O(N).
* "Exact": the opposite of "exact" in this case is "sparse", as in "sparse networks" (see our episode with Jonathan Frankle for more). This means that you're not giving up any precision.
* The "IO" in "IO-Awareness" stands for "Input/Output" and hints at a write/read related bottleneck.
Before we dive in, picture a simple GPU architecture diagram: the GPU has access to three memory stores at runtime:
* SRAM: this is on-chip memory co-located with the actual execution core. It's limited in size (~20MB on an A100 card) but extremely fast (19TB/s total bandwidth).
* HBM: this is off-chip but on-card memory, meaning it's in the GPU but not co-located with the core itself. An A100 has 40GB of HBM, but only 1.5TB/s of bandwidth.
* DRAM: this is your traditional CPU RAM. You can have TBs of this, but you only get ~12.8GB/s of bandwidth, which is way too slow.
Now that you know what HBM is, consider how the standard attention algorithm is implemented: all three steps (computing the scores, the softmax, and the output) include a "write X to HBM" step and a "read from HBM" step. The core idea behind FlashAttention boils down to this: instead of storing each intermediate result, why don't we use kernel fusion and run every operation in a single kernel in order to avoid memory read/write overhead? (We also talked about kernel fusion in our episode with George Hotz and how PyTorch / tinygrad take different approaches here.) The resulting fused kernel is much faster, but much harder to read. FlashAttention is a very meaningful speed improvement on traditional attention, and it's easy to understand why it's becoming the standard for most models. This should be enough of a primer before you dive into our episode!
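To make the primer concrete before the episode itself, here is a minimal, hedged PyTorch sketch of standard (unfused) attention for a single head. The function name and shapes are illustrative assumptions, not FlashAttention's actual code, and the comments mark where each intermediate tensor would round-trip through HBM in a stock implementation:

```python
import torch

def standard_attention(Q, K, V):
    """Naive single-head attention: every intermediate is a full (N, N) tensor.

    Q, K, V: (N, d) tensors. In a stock implementation each step below is its
    own kernel, so S and P are written out to HBM and read back in, which is
    exactly the O(N^2) memory traffic FlashAttention's fused kernel avoids.
    """
    d = Q.shape[-1]
    S = Q @ K.T / d ** 0.5           # (N, N) scores        -> written to HBM
    P = torch.softmax(S, dim=-1)     # (N, N) probabilities -> written to HBM
    O = P @ V                        # (N, d) output
    return O

# Rough intuition for the bandwidth numbers above: for one head at N = 4096
# in fp16, the (N, N) score matrix alone is 4096 * 4096 * 2 bytes ~= 32 MB,
# larger than the ~20 MB of SRAM on an A100, so it has to live in (slow) HBM.
```

FlashAttention never materializes S or P at all: it streams blocks of K and V through SRAM and rescales partial softmax results on the fly, a trick that comes up again in the transcript below.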
We talked about FlashAttention-2, how the Hazy Research group works, and some of the research being done on Transformer alternatives.
Show Notes:
* FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness (arXiv)
* FlashAttention-2
* Together AI
* From Deep Learning to Long Learning
* The Hardware Lottery by Sara Hooker
* Hazy Research
* Is Attention All You Need?
* Nvidia CUTLASS 3
* SRAM scaling slows
* Transformer alternatives: S4, Hyena, Recurrent Neural Networks (RNNs)
Timestamps:
* Tri's background [00:00:00]
* FlashAttention deep dive [00:02:18]
* How the Hazy Research group collaborates across theory, systems, and applications [00:17:21]
* Evaluating models beyond raw performance [00:25:00]
* FlashAttention-2 [00:27:00]
* CUDA and The Hardware Lottery [00:30:00]
* Researching in a fast-changing market [00:35:00]
* Promising transformer alternatives like state space models and RNNs [00:37:30]
* The spectrum of openness in AI models [00:43:00]
* Practical impact of models like LLAMA2 despite restrictions [00:47:12]
* Incentives for releasing open training datasets [00:49:43]
* Lightning Round [00:53:22]
Transcript:
Alessio: Hey everyone, welcome to the Latent Space podcast. This is Alessio, Partner and CTO-in-Residence at Decibel Partners. Today we have no Swyx, because he's in Singapore, so it's a one-on-one discussion with Tri Dao. Welcome! [00:00:24]Tri: Hi everyone. I'm Tri Dao, excited to be here. [00:00:27]Alessio: Tri just completed his PhD at Stanford a month ago. You might not remember his name, but he's one of the main authors of the FlashAttention paper, which is one of the seminal works of the Transformers era. He's got a lot of interests, from efficient transformer training and inference to long-range sequence models, a lot of interesting stuff. And now you're going to be an assistant professor in CS at Princeton next year. [00:00:51]Tri: Yeah, that's right. [00:00:52]Alessio: Yeah. And in the meantime, just to get, you know, a low pressure thing, you're Chief Scientist at Together as well, which is the company behind RedPajama. [00:01:01]Tri: Yeah. So I just joined this week actually, and it's been really exciting. [00:01:04]Alessio: So what's something that is not on the internet that people should know about you? [00:01:09]Tri: Let's see. When I started college, I was going to be an economist, so I was fully on board. I was going to major in economics, but the first week I was at Stanford undergrad, I took a few math classes and I immediately decided that I was going to be a math major. And that kind of changed the course of my career. So now I'm doing math, computer science, AI research. [00:01:32]Alessio: I had a similar thing. I started with physics and then I took like a programming course and I was like, I got to do computer science. I don't want to do physics. So FlashAttention is definitely, everybody's using this. Everybody loves it. You just released FlashAttention 2 last week. [00:01:48]Tri: Yeah. Early this week on Monday. Yeah. [00:01:53]Alessio: You know, AI time. Things move fast. So maybe let's run through some of the FlashAttention highlights, some of the innovations there, and then we can dive into FlashAttention 2. So the core improvement in FlashAttention is that traditional attention is quadratic in sequence length, and FlashAttention is linear, which obviously helps with scaling some of these models. [00:02:18]Tri: There are two factors there. So of course the goal has been to make attention go faster or more memory efficient.
And ever since attention became popular in 2017 with the Transformer paper, lots and lots of folks have been working on this. And a lot of approaches have been focusing on approximating attention. The goal is you want to scale to longer sequences. There are tons of applications where you want to do that. But scaling to longer sequences is difficult because attention scales quadratically in sequence length on both runtime and memory, as you mentioned. So instead of trying to approximate attention, we were trying to figure out, can we do the same computation and maybe be more memory efficient? So in the end, we ended up with memory that is linear in sequence length. In terms of computation, it's still quadratic, but we managed to make it much more hardware friendly. And as a result, we do get wall-clock speedup on the order of 2 to 4x, which really helps because that just means that you'll be able to train with 2 to 4x longer sequence length for the same cost without doing any approximations. As a result, lots of folks have been using this. The thing is available in a lot of libraries that do language model training or fine-tuning. [00:03:32]Alessio: And the approximation thing is important because this is an exact method versus a sparse one. So maybe explain a little bit the difference there. [00:03:40]Tri: For sure. So in attention, essentially you compute pairwise similarity between every single element in a sequence against each other. So there have been other approaches where instead of doing all that pairwise computation, you only compute similarity for some pairs of elements in the sequence. So you don't do a quadratic number of comparisons. And this can be seen as some form of sparsity. Essentially you're ignoring some of the elements. When you write down the matrix, you essentially say, OK, I'm going to pretend there's zero. So that has some benefits in terms of runtime and memory. But the trade-off is that it tends to do worse in terms of quality because you're essentially approximating or ignoring some elements. And I personally have worked on this as well for a few years. But when we talked to practitioners who actually train models, especially at large scale, they said they tend not to use these approximate attention methods. Because it turns out, and this was surprising to me at the time, that these approximation methods, even though they perform fewer computations, they tend to not be faster in wall-clock time. So this was pretty surprising because back then, I think my background was more on the theoretical side. So I was thinking of, oh, how many flops or floating point operations are you performing? And hopefully that correlates well with wall-clock time. But I realized that I was missing a bunch of ideas from the system side where flops or floating point operations don't necessarily correlate with runtime. There are other factors like memory reading and writing, parallelism, and so on. So I learned a ton from just talking to systems people because they kind of figured this stuff out a while ago. So that was really eye-opening. And then we ended up focusing a lot more on memory reading and writing because that turned out to be the majority of the time when you're doing attention: reading and writing memory. [00:05:34]Alessio: Yeah, the I.O. awareness is probably one of the biggest innovations here. And the idea behind it is, like you mentioned, the FLOPS growth of the cards has been going up, but the memory bandwidth, not as much.
So I think maybe that was one of the assumptions that the original attention paper had. So talk a bit about how that came to be as an idea. It's one of those things where, in hindsight, it's like, obviously, why are we rewriting to HBM every time, you know, and once you change it, it's clear. But what was that discovery process? [00:06:08]Tri: Yeah, in hindsight, a lot of the ideas had already been there in the literature. And I would say it was somehow at the intersection of both machine learning and systems. And you kind of needed ideas from both sides. So on one hand, on the system side, lots of systems folks have known that, oh, you know, kernel fusion is great. Kernel fusion just means that instead of, you know, loading the same element, performing an operation, writing it down, loading it back up and performing the second operation, you just load it once, perform two operations and then write it down again. So that saves you kind of memory read and write in the middle there. So kernel fusion has been a classic. There have been other techniques from the system side, like tiling, where you perform the computations in blocks, again, so that you can load them into really fast memory. Think of it as a cache. And these are, again, classical computer science ideas, right? You want to use the cache. So the system folks have been thinking about these ideas for a long time, and they apply to attention as well. But there were certain things in attention that made it difficult to do a complete kernel fusion. One of which is there is this softmax operation in the middle, which requires you to essentially sum across the row of the attention matrix. So it makes it difficult to kind of break it up, because there's this dependency. It makes it difficult to break things into blocks. So on the system side, people have been thinking about these ideas, but it's been difficult to kind of do kernel fusion for the entire operation. On the machine learning side, people have been thinking more algorithmically. They say, okay, either we can approximate attention, or there's this trick called the online softmax trick, which says that because of the way softmax is written mathematically, you can actually break it up into smaller pieces, do some rescaling, and still get the right answer. So this online softmax trick has been around for a while. I think there was a paper from NVIDIA folks back in 2018 about this. And then there was a paper from Google. So Markus Rabe and Staats wrote a paper in late 2021 on using this online softmax trick to break attention up into smaller pieces. So a lot of the ideas were already there. But it turns out, you kind of need to combine ideas from both sides. So you need to understand that, hey, we want to do kernel fusion to reduce memory reads and writes. But we also need this online softmax trick to be able to break the softmax into smaller pieces so that a lot of the systems tricks kind of carry through. We saw that, and it was kind of a natural idea that we ended up using ideas from both sides, and it ended up working pretty well. Yeah. [00:08:57]Alessio: Are there any downsides to kernel fusion? If I think about databases and the reasons why we have atomic operations, you know, it's like, you have observability and fallback in between them. How does that work with attention? Is there anything that we lose by fusing the operations?
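As an aside before Tri's answer: the tiling plus online softmax trick he describes above can be sketched in a few lines of PyTorch for a single query row. The function name, block size, and shapes below are illustrative assumptions rather than the actual fused CUDA kernel, which applies the same rescaling to whole tiles of queries at once:

```python
import torch

def attention_one_row_blockwise(q, K, V, block_size=128):
    """One query row of exact attention, computed block by block with the
    online softmax rescaling trick, so the full (1, N) score row is never
    materialized and only one K/V block needs to sit in fast memory at a time.

    q: (d,), K and V: (N, d). Names, shapes and block_size are illustrative.
    """
    d = q.shape[-1]
    m = torch.tensor(float("-inf"))   # running max of scores seen so far
    l = torch.tensor(0.0)             # running sum of exp(score - m)
    acc = torch.zeros(d)              # unnormalized output accumulator

    for start in range(0, K.shape[0], block_size):
        Kb = K[start:start + block_size]
        Vb = V[start:start + block_size]
        s = Kb @ q / d ** 0.5         # scores for this block only
        m_new = torch.maximum(m, s.max())
        scale = torch.exp(m - m_new)  # rescale everything computed so far
        p = torch.exp(s - m_new)
        l = l * scale + p.sum()
        acc = acc * scale + p @ Vb
        m = m_new

    return acc / l                    # single normalization at the end

# Sanity check against the naive version (should hold for random q, K, V):
#   naive = torch.softmax(K @ q / q.shape[-1] ** 0.5, dim=0) @ V
#   assert torch.allclose(attention_one_row_blockwise(q, K, V), naive, atol=1e-5)
```

Because only the running max, the running sum, and a small accumulator carry over between blocks, every intermediate fits in SRAM, which is what makes full kernel fusion possible despite the softmax dependency Tri mentions.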
[00:09:13]Tri: Yeah, I think mostly on the practical side is that you lose a little bit of flexibility in the sense that, hey, now you have, for example, faster attention, it's just a subroutine that you would call to do attention. But as a researcher, let's say you don't want that exact thing, right? You don't want just attention, let's say you want some modification to attention. You want to do, hey, I'm going to multiply the query and key, but then I'm going to do this extra thing before I carry on. So kernel fusion just means that, okay, we have a subroutine that does the entire thing. But if you want to experiment with things, you won't be able to use that fused kernel. And the answer is, can we have a compiler that then automatically does a lot of this kernel fusion? Lots of compiler folks are thinking about this, either with a new language or you can embed it in PyTorch. PyTorch folks have been working on this as well. So if you write just your code in PyTorch and they can capture the graph, can they generate code that will fuse everything together? That's still ongoing, and it works for some cases. But for attention, because of this kind of softmax rewriting stuff, it's been a little bit more difficult. So maybe in a year or two, we'll have compilers that are able to do a lot of these optimizations for you. And you don't have to, for example, spend a couple months writing CUDA to get this stuff to work. Awesome. [00:10:41]Alessio: And just to make it clear for listeners, when we say we're not writing it to memory, we are storing it, but just in a faster memory. So instead of the HBM, we're putting it in the SRAM. Yeah. [00:10:53]Tri: Yeah. [00:10:54]Alessio: Maybe explain just a little bit the difference there. [00:10:56]Tri: Yeah, for sure. This is kind of a caricature of how you think about accelerators or GPUs in particular, is that they have a large pool of memory, usually called HBM, or high bandwidth memory. So this is what you think of as GPU memory. So if you're using A100 and you list the GPU memory, it's like 40 gigs or 80 gigs. So that's the HBM. And then when you perform any operation, you need to move data from the HBM to the compute unit. So the actual hardware unit that does the computation. And next to these compute units, there are on-chip memory or SRAM, which are much, much smaller than HBM, but much faster. So the analogy there is if you're familiar with, say, CPU and RAM and so on. So you have a large pool of RAM, and then you have the CPU performing the computation. But next to the CPU, you have L1 cache and L2 cache, which are much smaller than DRAM, but much faster. So you can think of SRAM as the small, fast cache that stays close to the compute unit. Physically, it's closer. There is some kind of asymmetry here. So HBM is much larger, and SRAM is much smaller, but much faster. One way of thinking about it is, how can we design algorithms that take advantage of this asymmetric memory hierarchy? And of course, lots of folks have been thinking about this. These ideas are pretty old. I think back in the 1980s, the primary concerns were sorting. How can we sort numbers as efficiently as possible? And the motivating example was banks were trying to sort their transactions, and that needs to happen overnight so that the next day they can be ready. And so the same idea applies, which is that they have slow memory, which was hard disk, and they have fast memory, which was DRAM. And people had to design sorting algorithms that take advantage of this asymmetry. 
And it turns out, these same ideas can apply today, which is different kinds of memory. [00:13:00]Alessio: In your paper, you have the pyramid of memory. Just to give people an idea, when he says smaller, it's like HBM is like 40 gig, and then SRAM is like 20 megabytes. So it's not a little smaller, it's much smaller. But the throughput on card is like 1.5 terabytes a second for HBM and like 19 terabytes a second for SRAM, which is a lot larger. How do you think that evolves? So TSMC said they hit the scaling limits for SRAM, they just cannot grow that much more. HBM keeps growing, HBM3 is going to be 2x faster than HBM2, I think the latest NVIDIA thing has HBM3. How do you think about the future of FlashAttention? Do you think HBM is going to get fast enough when maybe it's not as useful to use the SRAM? [00:13:49]Tri: That's right. I think it comes down to physics. When you design hardware, literally SRAM stays very close to compute units. And so you don't have that much area to essentially put the transistors. And you can't shrink these things too much. So just physics, in terms of area, you don't have that much area for the SRAM. HBM is off-chip, so there is some kind of bus that essentially transfers data from HBM to the compute unit. So you have more area to essentially put these memory units. And so yeah, I think in the future SRAM probably won't get that much larger, because you don't have that much area. HBM will get larger and faster. And so I think it becomes more important to design algorithms that take advantage of this memory asymmetry. It's the same thing in CPU, where the cache is really small, the DRAM is growing larger and larger. DRAM could get to, I don't know, two terabytes, six terabytes, or something, whereas the cache stays at, I don't know, 15 megabytes or something like that. I think maybe the algorithm design becomes more and more important. There's still ways to take advantage of this, I think. So in the future, I think flash attention right now is being used. I don't know if in the next couple of years, some new architecture will come in and whatnot, but attention seems to be still important. For the next couple of years, I still expect some of these ideas to be useful. Not necessarily the exact code that's out there, but I think these ideas have kind of stood the test of time. New ideas like IO awareness from back in the 1980s, ideas like kernel fusions, tiling. These are classical ideas that have stood the test of time. So I think in the future, these ideas will become more and more important as we scale models to be larger, as we have more kinds of devices, where performance and efficiency become much, much more important. [00:15:40]Alessio: Yeah, and we had Jonathan Frankle on the podcast, and if you go to issattentionallyouneed.com, he has an outstanding bet, and he does believe that attention will be the state of the art architecture still in a few years. Did you think flash attention would be this popular? I'm always curious on the research side, you publish a paper, and obviously you know it's great work, but sometimes it just kind of falls flat in the industry. Could you see everybody just starting to use this, or was that a surprise to you? [00:16:11]Tri: Certainly, I didn't anticipate the level of popularity. Of course, we were extremely happy to have people using this stuff and giving us feedback and so on, and help us improve things. 
I think when we were writing the paper, I remember sending an email to one of my advisors, and like, hey, I'm excited about this paper, but I think the most important thing will be the artifact, which is the code. So I knew that the code will be valuable. So we kind of focus a lot on the code and make sure that the code is usable and as fast as can be. Of course, the idea, the paper presents the ideas and explain it and have experiments that validate the idea, but I knew that the artifact or the code was also pretty important. And that turned out to be the right focus, which is, you know, we put out the paper, we release the code and continue working on the code. So it's a team effort with my co-authors as well. [00:17:07]Alessio: We mentioned Hazy Research a bunch of times on the podcast before. I would love for you to spend five minutes just talking about how does the group work? How do people get together? How do you bounce ideas off of each other? Yeah. [00:17:21]Tri: So Hazy Research is a research group at Stanford led by one of my advisors, Chris Re. I love the people there. It was one of the best experiences I had. They've made my PhD so much more enjoyable. And I think there are a couple of ways that the group has been working pretty well. So one is, I think there's a diverse pool of people who either, you know, some of them focus on algorithms and theory, some of them focus on building systems, some of them focus on applications. And as a result, there is this flow of idea. So as an example, some of us were working on like more algorithms and theory, and then we can talk to the folks building systems and say, hey, let's try it out and let's put it in the systems and see how it is. And there you will get feedback from systems folks. They will say, hey, we implemented this, or we tried this and this is where it doesn't work, something like that. And once we put it in the systems, the application folks can use the algorithm or new methods or new models. And we again get great feedback from them because the application folks, for example, some of my good friends, they focus on medical imaging or seizure detection. And that is the problem they care about. And if your method doesn't work on the task they care about, they will tell you. Whereas I think a lot of people in machine learning, they're a little bit more flexible. So they will be like, hey, it doesn't work on seizure detection. Let's try some other task, right? But having that direct feedback of like, hey, it doesn't work there, let's figure out why. I think that that feedback allows us to do better work. And I think that kind of process of exchanging ideas, validating it in a real system so that applications folks can try it out and give you feedback. That cycle has been very, very useful. And so that's one, having a diverse group of people. The other one is, and this is something I really appreciate from advice from Chris was try to understand the fundamental, right? And he's happy letting me go off and read some textbooks and playing with things because I think a lot of research ideas come from understanding the old literature and see how it fits with the new landscape. And so if you just new archive papers every day, that's great, but you also need to read textbooks. And that's one advice I got from Chris, which is understand the fundamentals. And I think that allows us to do more impactful work. [00:19:46]Alessio: How do you think about academia versus industry? 
I feel like AI / Machine Learning has been an area where, up until three or four years ago, most of the cutting-edge work was being done in academia. And now there are all these big industry research labs. You're obviously going to Princeton, so you're an academia believer. How should people think about where to go? Say I'm doing my master's and I have to decide between doing a PhD and going to OpenAI or Anthropic. How should I decide? [00:20:15]Tri: I think they kind of play complementary roles, in my opinion. Of course, I also was considering different paths as well. So I think right now, scaling matters a lot, especially when you talk about language models and AI and so on. Scaling matters a lot. And that means that you need compute resources and you need infrastructure and you need engineers' time. And so industry tends to have an advantage when it comes to scaling things. But a lot of the ideas actually came from academia. So let's take attention, which got popular with the Transformer in 2017. Attention actually has been around for a while. So I think the first mention was in 2014, a paper from Bahdanau and others with Yoshua Bengio, which came from academia. A lot of ideas did come from academia. And scaling things up, of course, I think OpenAI has been great at scaling things up. That was the bet that they made after, I think, GPT-2. So they saw that scaling these things up, which back then meant 1.5 billion parameters, seemed to give you amazing capabilities. So they really committed to that. They really committed to scaling things. And that turned out to be a pretty successful bet. I think for academia, we're still trying to figure out exactly what we're doing in this shifting landscape. And so lots of folks have been focusing on, for example, evaluation. So I know the Stanford Center for Research on Foundation Models, led by Percy, has this benchmark called HELM, which is this holistic benchmark. So trying to figure out, okay, characterizing the landscape of different kinds of models, what people should evaluate, what people should measure, and things like that. So evaluation is one role. The other one is understanding. So this has happened historically, where there's been some development in industry and academia can play a role in explaining and understanding. They have the luxury to slow down and try to understand stuff, right? So lots of papers on understanding what's really going on, probing these models, and so on. I'm not as familiar with the NLP literature, but my impression is there's a lot of that going on at the NLP conferences, which is understanding what these models are doing, what capabilities they have, and so on. And the third one I could see is that academia can take riskier bets, in the sense that we can work on stuff that is quite different from industry. I think in industry, my impression is you have some objective. You're trying to say, hey, for this quarter, we want to scale the model in this particular way. Next quarter, we want the model to have these capabilities. You're trying to pick objectives where maybe, I don't know, 70% will work out, because it's important for the company's direction. I think for academia, the way things work is you have many, many researchers or PhD students, and they're kind of pursuing independent directions. And they have a little bit more flexibility on, hey, I'm going to try out this seemingly crazy idea and see, let's say there's a 30% chance of success or something.
And however you define success, for academia, a lot of the time, success just means like, hey, we found something interesting. That could eventually go into industry through collaboration and so on. So I do see academia and industry kind of playing complementary roles. And as for someone choosing a career, I think just more and more generally, industry would be probably better in terms of compensation, in terms of probably work-life balance. But my biased perspective is that maybe academia gives you a little bit more freedom to think and understand things. So it probably comes down to personal choice. I end up choosing to be a professor next year at Princeton. But of course, I want to maintain a relationship with industry folks. I think industry folks can provide very valuable feedback to what we're doing in academia so that we understand where the field is moving because some of the directions are very much influenced by what, for example, OpenAI or Google is doing. So we want to understand where the field is moving. What are some promising applications? And try to anticipate, okay, if the field is moving like this, these applications are going to be popular. What problems will be important in two, three years? And then we try to start thinking about those problems so that hopefully in two, three years, we have some of the answers to some of these problems in two, three years. Sometimes it works out, sometimes it doesn't. But as long as we do interesting things in academia, that's the goal. [00:25:03]Alessio: And you mentioned the eval side. So we did a Benchmarks 101 episode. And one of the things we were seeing is sometimes the benchmarks really influence the model development. Because obviously, if you don't score well on the benchmarks, you're not going to get published and you're not going to get funded. How do you think about that? How do you think that's going to change now that a lot of the applications of these models, again, is in more narrow industry use cases? Do you think the goal of the academia eval system is to be very broad and then industry can do their own evals? Or what's the relationship there? [00:25:40]Tri: Yeah, so I think evaluation is important and often a little bit underrated. So it's not as flashy as, oh, we have a new model that can do such and such. But I think evaluation, what you don't measure, you can't make progress on, essentially. So I think industry folks, of course, they have specific use cases that their models need to do well on. And that's what they care about. Not just academia, but other groups as well. People do understand what are some of the emerging use cases. So for example, now one of the most popular use cases is Chatbot. And then I think folks from Berkeley, some of them are from Berkeley, call them MLCs. They set up this kind of Chatbot arena to essentially benchmark different models. So people do understand what are some of the emerging use cases. People do contribute to evaluation and measurement. And as a whole, I think people try to contribute to the field and move the field forward, albeit that maybe slightly different directions. But we're making progress and definitely evaluation and measurement is one of the ways you make progress. So I think going forward, there's still going to be just more models, more evaluation. We'll just have better understanding of what these models are doing and what capabilities they have. 
[00:26:56]Alessio: I like that your work has been focused on not making benchmarks better, but it's like, let's just make everything faster. So it's very horizontal. So FlashAttention 2, you just released that on Monday. I read in the blog post that a lot of the work was also related to some of the NVIDIA library updates. Yeah, maybe run us through some of those changes and some of the innovations there. Yeah, for sure. [00:27:19]Tri: So FlashAttention 2 is something I've been working on for the past couple of months. So the story is the NVIDIA CUTLASS team, they released a new version of their library, which contains all these primitives to allow you to do matrix multiply or memory loading on GPU efficiently. So it's a great library and I built on that. So they released their version 3 back in January and I got really excited and I wanted to play with that library. So as an excuse, I was just like, okay, I'm going to refactor my code and use this library. So that was kind of the start of the project. By the end, I just ended up working with the code a whole lot more and I realized that, hey, there are these inefficiencies still in Flash Attention. We could change this way or that way and make it, in the end, twice as fast. But of course, building on the library that the NVIDIA folks released. So that was kind of a really fun exercise. I was starting out, it's just an excuse for myself to play with the new library. What ended up was several months of improvement, improving Flash Attention, discovering new ideas. And in the end, we managed to make it 2x faster and now it's pretty close to probably the efficiency of things like matrix multiply, which is probably the most optimized subroutine on the planet. So we're really happy about it. The NVIDIA Cutlass team has been very supportive and hopefully in the future, we're going to collaborate more. [00:28:46]Alessio: And since it's an NVIDIA library, can you only run this on CUDA runtimes? Or could you use this and then run it on an AMD GPU? [00:28:56]Tri: Yeah, so it's an NVIDIA library. So right now, the code we release runs on NVIDIA GPUs, which is what most people are using to train models. Of course, there are emerging other hardware as well. So the AMD folks did implement a version of Flash Attention, I think last year as well, and that's also available. I think there's some implementation on CPU as well. For example, there's this library, ggml, where they implemented the same idea running on Mac and CPU. So I think that kind of broadly, the idea would apply. The current implementation ended up using NVIDIA's library or primitives, but I expect these ideas to be broadly applicable to different hardware. I think the main idea is you have asymmetry in memory hierarchy, which tends to be everywhere in a lot of accelerators. [00:29:46]Alessio: Yeah, it kind of reminds me of Sara Hooker's post, like the hardware lottery. There could be all these things that are much better, like architectures that are better, but they're not better on NVIDIA. So we're never going to know if they're actually improved. How does that play into some of the research that you all do too? [00:30:04]Tri: Yeah, so absolutely. Yeah, I think Sara Hooker, she wrote this piece on hardware lottery, and I think she captured really well of what a lot of people have been thinking about this. 
And I certainly think about hardware lottery quite a bit, given that I do some of the work that's kind of really low level at the level of, hey, we're optimizing for GPUs or NVIDIA GPUs and optimizing for attention itself. And at the same time, I also work on algorithms and methods and transformer alternatives. And we do see this effect in play, not just hardware lottery, but also kind of software framework lottery. You know, attention has been popular for six years now. And so many kind of engineer hours has been spent on making it as easy and efficient as possible to run transformer, right? And there's libraries to do all kinds of tensor parallel, pipeline parallel, if you use transformer. Let's say someone else developed alternatives, or let's just take recurrent neural nets, like LSTM, GRU. If we want to do that and run that efficiently on current hardware with current software framework, that's quite a bit harder. So in some sense, there is this feedback loop where somehow the model architectures that take advantage of hardware become popular. And the hardware will also kind of evolve to optimize a little bit for that kind of architecture and software framework will also evolve to optimize for that particular architecture. Right now, transformer is the dominant architecture. So yeah, I'm not sure if there is a good way out of this. Of course, there's a lot of development. Things like, I think compilers will play a role because compilers allow you to maybe still be much more efficient across different kinds of hardware because essentially you write the same code and compiler will be able to make it run efficiently different kinds of hardware. So for example, there's this language Mojo, they're compiler experts, right? And their bet is AI models will be running on different kinds of devices. So let's make sure that we have really good compilers with a good language that then the compiler can do a good job optimizing for all kinds of devices. So that's maybe one way that you can get out of this cycle. But yeah, I'm not sure of a good way. In my own research, I have to think about both the algorithm new model and how it maps to hardware. So there are crazy ideas that seem really good, but will be really, really difficult to run efficiently. And so as a result, for example, we can't really scale some of the architectures up simply because they're not hardware friendly. I have to think about both sides when I'm working on new models. [00:32:50]Alessio: Yeah. Have you spent any time looking at some of the new kind of like AI chips companies, so to speak, like the Cerebras of the world? Like one of their innovations is co-locating everything on the chip. So you remove some of this memory bandwidth issue. How do you think about that? [00:33:07]Tri: Yeah, I think that's an interesting bet. I think Tesla also has this Dojo supercomputer where they try to have essentially as fast on-chip memory as possible and removing some of these data transfer back and forth. I think that's a promising direction. The issues I could see, you know, I'm definitely not a hardware expert. One issue is the on-chip memory tends to be really expensive to manufacture, much more expensive per gigabyte compared to off-chip memory. So I talked to, you know, some of my friends at Cerebros and, you know, they have their own stack and compiler and so on, and they can make it work. The other kind of obstacle is, again, with compiler and software framework and so on. 
For example, if you can run PyTorch on this stuff, lots of people will be using it. But supporting all the operations in PyTorch will take a long time to implement. Of course, people are working on this. So I think, yeah, we kind of need these different bets on the hardware side as well. Hardware has, my understanding is, has a kind of a longer time scale. So you need to design hardware, you need to manufacture it, you know, maybe on the order of three to five years or something like that. So people are taking different bets, but the AI landscape is changing so fast that it's hard to predict, okay, what kind of models will be dominant in, let's say, three or five years. Or thinking back five years ago, would we have known that Transformer would have been the dominant architecture? Maybe, maybe not, right? And so different people will make different bets on the hardware side. [00:34:39]Alessio: Does the pace of the industry and the research also influence the PhD research itself? For example, in your case, you're working on improving attention. It probably took you quite a while to write the paper and everything, but in the meantime, you could have had a new model architecture come out and then it's like nobody cares about attention anymore. How do people balance that? [00:35:02]Tri: Yeah, so I think it's tough. It's definitely tough for PhD students, for researchers. Given that the field is moving really, really fast, I think it comes down to understanding fundamental. Because that's essentially, for example, what the PhD allows you to do. It's been a couple of years understanding the fundamentals. So for example, when I started my PhD, I was working on understanding matrix vector multiply, which has been a concept that's been around for hundreds of years. We were trying to characterize what kind of matrices would have theoretically fast multiplication algorithm. That seems to have nothing to do with AI or anything. But I think that was a time when I developed mathematical maturity and research taste and research skill. The research topic at that point didn't have to be super trendy or anything, as long as I'm developing skills as a researcher, I'm making progress. And eventually, I've gotten quite a bit better in terms of research skills. And that allows, for example, PhD students later in their career to quickly develop solutions to whatever problems they're facing. So I think that's just the natural arc of how you're being trained as a researcher. For a lot of PhD students, I think given the pace is so fast, maybe it's harder to justify spending a lot of time on the fundamental. And it's tough. What is this kind of explore, exploit kind of dilemma? And I don't think there's a universal answer. So I personally spend some time doing this kind of exploration, reading random textbooks or lecture notes. And I spend some time keeping up with the latest architecture or methods and so on. I don't know if there's a right balance. It varies from person to person. But if you only spend 100% on one, either you only do exploration or only do exploitation, I think it probably won't work in the long term. It's probably going to have to be a mix and you have to just experiment and kind of be introspective and say, hey, I tried this kind of mixture of, I don't know, one exploration paper and one exploitation paper. How did that work out for me? Should I, you know, having conversation with, for example, my advisor about like, hey, did that work out? You know, should I shift? 
Focus more on one or the other? I think quickly adjusting and focusing on the process is probably the right way. I don't have like a specific recommendation that, hey, you focus, I don't know, 60% on lecture notes and 40% on arXiv papers or anything like that. [00:37:35]Alessio: Let's talk about some Transformer alternatives. You know, say Jonathan Frankle loses his bet and Transformer is not the state of the art architecture. What are some of the candidates to take over? [00:37:49]Tri: Yeah, so this bet is quite fun. So my understanding is this bet is between Jonathan Frankle and Sasha Rush, right? I've talked to Sasha a bunch and I think he recently gave an excellent tutorial on Transformer alternatives as well. So I would recommend that. So just to quickly recap, I think there's been quite a bit of development more recently on Transformer alternatives. So architectures that are not Transformers, right? And the question is, can they do well on, for example, language modeling, which is kind of the application that a lot of people care about these days. So there are methods based on state space models that came out in 2021 from Albert Gu, Karan Goel, and Chris Re that presumably could do much better in terms of capturing long-range information while not scaling quadratically. They scale sub-quadratically in terms of sequence length. So potentially you could have a much more efficient architecture when sequence length gets really long. The other ones have been focusing more on recurrent neural nets, which is, again, an old idea, but adapted to the new landscape. So things like RWKV; I've also personally worked in this space as well. So there have been some promising results. There have been some results here and there that show that, hey, these alternatives, either RNNs or state space methods, can match the performance of Transformers on language modeling. So that's really exciting. And we're starting to understand, on the academic research side, we want to understand, do we really need attention? I think that's a valuable kind of intellectual thing to understand. And maybe we do, maybe we don't. If we want to know, we need to spend serious effort on trying the alternatives. And there have been folks pushing in this direction. I think RWKV has scaled up to, they have a model at 14 billion that seems pretty competitive with Transformers. So that's really exciting. That's kind of the intellectual thing. We want to figure out if attention is necessary. So that's one motivation. The other motivation is that Transformer alternatives could have an advantage in practice in some of the use cases. So one use case is really long sequences. The other is really high throughput generation. So for really long sequences, when you train with Transformers, with FlashAttention and so on, the computation is still quadratic in the sequence length. So if your sequence length is on the order of, I don't know, 16K, 32K, 100K or something, which some of these models have sequence length 100K, then you do get significantly slower in terms of training, also in terms of inference. So maybe these alternative architectures could scale better in terms of sequence length. I haven't seen actual validation on this. Let's say an RNN model released with context length, I don't know, 100K or something. I haven't really seen that. But the hope could be that as we scale to long sequences, these alternative architectures could be more well-suited.
Not just text, but things like high resolution images, audio, video, and so on, which are emerging applications. So that's one, long sequences. Number two is a high throughput generation, where I can imagine scenarios where the application isn't like an interactive chatbot, but let's say a company wants to batch as many requests as possible on their server, or they're doing offline processing, they're generating stuff based on their internal documents, that you need to process in batch. And the issue with Transformer is that during generation, it essentially needs to keep around all the previous history. It's called the KV cache. And that could take a significant amount of memory, so you can't really batch too much because you run out of memory. I am personally bullish on RNNs. I think RNNs, they essentially summarize the past into a state vector that has fixed size, so the size doesn't grow with the history. So that means that you don't need as much memory to keep around all the previous tokens. And as a result, I think you can scale to much higher batch sizes. And as a result, you can make much more efficient use of the GPUs or the accelerator, and you could have much higher generation throughput. Now, this, I don't think, has been validated at scale. So as a researcher, I'm bullish on this stuff because I think in the next couple of years, these are use cases where these alternatives could have an advantage. We'll just kind of have to wait and see to see if these things will happen. I am personally bullish on this stuff. At the same time, I also spend a bunch of time making attention as fast as possible. So maybe hatching and playing both sides. Ultimately, we want to understand, as researchers, we want to understand what works, why do the models have these capabilities? And one way is, let's push attention to be as efficient as possible. On the other hand, let's push other alternatives to be as efficient at scale, as big as possible, and so that we can kind of compare them and understand. Yeah, awesome. [00:43:01]Alessio: And I think as long as all of this work happens and open, it's a net positive for everybody to explore all the paths. Yeah, let's talk about open-source AI. Obviously, together, when Red Pajama came out, which was an open clone of the LLAMA1 pre-training dataset, it was a big thing in the industry. LLAMA2 came out on Tuesday, I forget. And this week, there's been a lot of things going on, which they call open-source, but it's not really open-source. Actually, we wrote a post about it that was on the front page of Hacker News before this podcast, so I was frantically responding. How do you think about what open-source AI really is? In my mind, in open-source software, we have different levels of open. So there's free software, that's like the GPL license. There's open-source, which is Apache, MIT. And then there's kind of restricted open-source, which is the SSPL and some of these other licenses. In AI, you have the open models. So Red Pajama is an open model because you have the pre-training dataset, you have the training runs and everything. And then there's obviously RandomLens that doesn't make it one-to-one if you retrain it. Then you have the open-weights model that's kind of like StableLM, where the weights are open, but the dataset is not open. And then you have LLAMA2, which is the dataset is not open, the weights are restricted. It's kind of like not really open-source, but open enough. 
I think it's net positive because it's like $3 million of flops donated to the public. [00:44:32]Tri: How do you think about that? [00:44:34]Alessio: And also, as you work together, what is your philosophy with open-source AI? Right, right. [00:44:40]Tri: Yeah, I think that's a great question. And I think about it on maybe more practical terms. So of course, Meta has done an amazing job training LLAMA1, LLAMA2. And for LLAMA2, they make it much less restrictive compared to LLAMA1. Now you can use it for businesses, unless you are a monthly active user or something like that. I think just this change will have a very significant impact in the kind of landscape of open-source AI, where now lots of businesses, lots of companies will be using, I expect will be using things like LLAMA2. They will fine-tune on their own dataset. They will be serving variants or derivatives of LLAMA2. Whereas before, with LLAMA1, it was also a really good model, but your business companies weren't allowed to do that. So I think on a more practical term, it's kind of shifting the balance between a closed-source model like OpenAI and Anthropic and Google, where you're making API calls, right? And maybe you don't understand as much of what the model is doing, how the model is changing, and so on. Versus now, we have a model with open weight that is pretty competitive from what I've seen in terms of benchmarks, pretty competitive with GPT 3.5, right? And if you fine-tune it on your own data, maybe it's more well-suited for your own data. And I do see that's going to shift the balance of it. More and more folks are going to be using, let's say, derivatives of LLAMA2. More and more folks are going to fine-tune and serve their own model instead of calling an API. So that shifting of balance is important because in one way, we don't want just a concentration of decision-making power in the hands of a few companies. So I think that's a really positive development from Meta. Of course, training the model takes a couple of millions of dollars, but engineers have and I'm sure they spend tons of time trying many, many different things. So the actual cost is probably way more than that. And they make the weights available and they allow probably a lot of companies are going to be using this. So I think that's a really positive development. And we've also seen amazing progress on the open source community where they would take these models and they either fine-tune on different kinds of data sets or even make changes to the model. So as an example, I think for LLAMA1, the context lane was limited to 2K. Like a bunch of folks figured out some really simple methods to scale up to like 8K. [00:47:12]Alessio: Like the RoPE. [00:47:13]Tri: Yes. I think the open source community is very creative, right? And lots of people. LLAMA2 will, again, kind of accelerate this where more people will try it out. More people will make tweaks to it and make a contribution and then so on. So overall, I think I see that as still a very positive development for the field. And there's been lots of libraries that will allow you to host or fine-tune these models, like even with quantization and so on. Just a couple of hours after LLAMA2 was released, tons of companies announcing that, hey, it's on our API or hosting and so on and together did the same. So it's a very fast-paced development and just kind of a model with available weights that businesses are allowed to use. I think that alone is already a very positive development. 
At the same time, yeah, we can do much better in terms of releasing data sets. Data sets tend to be... Somehow people are not incentivized to release data sets. So philosophically, yeah, you want to be as open as possible. But on a practical term, I think it's a little bit harder for companies to release data sets. Legal issues. The data sets released tend to be not as eye-catchy as the model release. So maybe people are less incentivized to do that. We've seen quite a few companies releasing data sets together. Released a red pajama data set. I think Cerebus then worked on that and deduplicate and clean it up and release slim pajama and so on. So we're also seeing positive development on that front, kind of on the pre-training data set. So I do expect that to continue. And then on the fine-tuning data set or instruction tuning data set, I think we now have quite a few open data sets on instruction tuning and fine-tuning. But these companies do pay for human labelers to annotate these instruction tuning data set. And that is expensive. And maybe they will see that as their competitive advantage. And so it's harder to incentivize these companies to release these data sets. So I think on a practical term, we're still going to make a lot of progress on open source AI, on both the model development, on both model hosting, on pre-training data set and fine-tuning data set. Right now, maybe we don't have the perfect open source model since all the data sets are available. Maybe we don't have such a thing yet, but we've seen very fast development on the open source side. I think just maybe this time last year, there weren't as many models that are competitive with, let's say, ChatGPT. [00:49:43]Alessio: Yeah, I think the open data sets have so much more impact than open models. If you think about Elusive and the work that they've done, GPT-J was great, and the Pythia models are great, but the Pyle and the Stack, everybody uses them. So hopefully we get more people to contribute time to work on data sets instead of doing the 100th open model that performs worse than all the other ones, but they want to say they released the model. [00:50:14]Tri: Yeah, maybe the question is, how do we figure out an incentive structure so that companies are willing to release open data sets? And for example, it could be like, I think some of the organizations are now doing this where they are asking volunteers to annotate and so on. And maybe the Wikipedia model of data set, especially for instruction tuning, could be interesting where people actually volunteer their time and instead of editing Wikipedia, add annotation. And somehow they acknowledge and feel incentivized to do so. Hopefully we get to that kind of level of, in terms of data, it would be kind of like Wikipedia. And in terms of model development, it's kind of like Linux where people are contributing patches and improving the model in some way. I don't know exactly how that's going to happen, but based on history, I think there is a way to get there. [00:51:05]Alessio: Yeah, I think the Dolly-15K data set is a good example of a company saying, let's do this smaller thing, just make sure we make it open. We had Mike Conover from Databricks on the podcast, and he was like, people just bought into it and leadership was bought into it. You have companies out there with 200,000, 300,000 employees. It's like, just put some of them to label some data. It's going to be helpful. So I'm curious to see how that evolves. What made you decide to join Together? 
[00:51:35]Tri: For Together, the focus has been focusing a lot on open source model. And I think that aligns quite well with what I care about, of course. I also know a bunch of people there that I know and trust, and I'm excited to work with them. Philosophically, the way they've been really open with data set and model release, I like that a lot. Personally, for the stuff, for example, the research that I've developed, like we also try to make code available, free to use and modify and so on, contributing to the community. That has given us really valuable feedback from the community and improving our work. So philosophically, I like the way Together has been focusing on open source model. And the nice thing is we're also going to be at the forefront of research and the kind of research areas that I'm really excited about, things like efficient training and inference, aligns quite well with what the company is doing. We'll try our best to make things open and available to everyone. Yeah, but it's going to be fun being at the company, leading a team, doing research on the topic that I really care about, and hopefully we'll make things open to benefit the community. [00:52:45]Alessio: Awesome. Let's jump into the lightning round. Usually, I have two questions. So one is on acceleration, one on exploration, and then a takeaway. So the first one is, what's something that already happened in AI machine learning that you thought would take much longer than it has? [00:53:01]Tri: I think understanding jokes. I didn't expect that to happen, but it turns out scaling model up and training lots of data, the model can now understand jokes. Maybe it's a small thing, but that was amazing to me. [00:53:16]Alessio: What about the exploration side? What are some of the most interesting unsolved questions in the space? [00:53:22]Tri: I would say reasoning in the broad term. We don't really know how these models do. Essentially, they do something that looks like reasoning. We don't know how they're doing it. We have some ideas. And in the future, I think we will need to design architecture that explicitly has some kind of reasoning module in it if we want to have much more capable models. [00:53:43]Alessio: What's one message you want everyone to remember today? [00:53:47]Tri: I would say try to understand both the algorithm and the systems that these algorithms run on. I think at the intersection of machine learning system has been really exciting, and there's been a lot of amazing results at this intersection. And then when you scale models to large scale, both the machine learning side and the system side really matter. [00:54:06]Alessio: Awesome. Well, thank you so much for coming on 3. [00:54:09]Tri: This was great. Yeah, this has been really fun. [00:54:11] Get full access to Latent Space at www.latent.space/subscribe