Outline
00:00 - Intro
00:42 - "Research should be fun"
02:02 - Early steps in research
09:00 - Book writing and meeting C. Desoer
18:33 - Control synthesis via the factorization approach
25:46 - The graph metric
29:27 - Robotics and CAIR
36:00 - Randomized algorithms
40:41 - On learning
44:05 - Neural networks
48:40 - Tata, hidden Markov models, and large deviations theory
55:48 - Picking problems and role of luck
58:07 - Compressed sensing and non-convex optimization
01:02:17 - Interplay between control and machine learning
01:09:10 - Advice to future students
01:13:29 - Future of control

Links
Sagar's website: https://tinyurl.com/4hwruajs
Hilbert: https://tinyurl.com/ykpdh929
Feedback Systems: https://tinyurl.com/2k3jsdat
How to Write Mathematics: https://tinyurl.com/35794bv9
Nonlinear systems: https://tinyurl.com/2fdtnjcm
C. Desoer: https://tinyurl.com/svhknren
Control Systems Synthesis — A Factorization Approach: https://tinyurl.com/59wdc4sv
Aryabhata: https://tinyurl.com/43x6hfhp
A Brief History of the Graph Topology: https://tinyurl.com/49uftzdk
Robot Dynamics and Control: https://tinyurl.com/5b4cmt7m
CAIR: https://tinyurl.com/rajdtxax
Randomized algorithms for robust controller synthesis using statistical learning theory: https://tinyurl.com/wanpyeuc
R. Tempo: https://tinyurl.com/jkufdwar
VC dimension: https://tinyurl.com/mvwk8afm
Learning and Generalisation: https://tinyurl.com/2s3mzh8h
Are Analog Neural Networks Better Than Binary Neural Networks? https://tinyurl.com/3fnk27xc
Hidden Markov Processes: https://tinyurl.com/t5frrvfz
An Introduction to Compressed Sensing: https://tinyurl.com/fc6a8eer

Support the show

Podcast info
Podcast website: https://www.incontrolpodcast.com/
Apple Podcasts: https://tinyurl.com/5n84j85j
Spotify: https://tinyurl.com/4rwztj3c
RSS: https://tinyurl.com/yc2fcv4y
Youtube: https://tinyurl.com/bdbvhsj6
Facebook: https://tinyurl.com/3z24yr43
Twitter: https://twitter.com/IncontrolP
Instagram: https://tinyurl.com/35cu4kr4

Acknowledgments and sponsors
This episode was supported by the National Centre of Competence in Research on «Dependable, ubiquitous automation» and the IFAC Activity fund. The podcast benefits from the help of an incredibly talented and passionate team. Special thanks to L. Seward, E. Cahard, F. Banis, F. Dörfler, J. Lygeros, ETH studio and mirrorlake. Music was composed by A New Element.
What if our idea of progress is headed the wrong way? What if tech is simply making us derail? Our knowledge grows, so we build better tools, which benefits technological progress, which in turn benefits scientific progress... all in the best of all possible worlds. BUT IS THAT REALLY THE CASE??? Many things today make this idyll no longer possible! Since the 1970s, the warning signs have been piling up... and yet we keep accelerating. Straight into the wall. Listen to the full episode "Pourquoi le progrès technique nous mène droit dans le mur" or search directly for "Trench Tech Bruno Markov" on your podcast platform. Bruno Markov, engineer and essayist, explores the dead ends of technological acceleration. His latest book, De quel progrès avons-nous besoin ?, questions our cult of technological innovation in an age of planetary limits. (c) Trench Tech, THE podcast of « Esprits Critiques pour une Tech Éthique » ("Critical Minds for an Ethical Tech"). Episode recorded on 23/05/2025---
Technological progress hides an invisible, toxic trap: the contradictory injunctions that pull us between the need to innovate and the need to address global warming. Bruno Markov, engineer and essayist, explores the dead ends of technological acceleration. His latest book, De quel progrès avons-nous besoin ?, questions our cult of technological innovation in an age of planetary limits. What if, far from being a solution, the promise of "progress" had become our problem? In this hard-hitting episode, Bruno Markov dismantles the illusions of a notion of progress disconnected from ecological and social realities. Faced with an AI that promises everything and its opposite, and with the endless promises of an untethered Silicon Valley, he questions the narratives that justify our societies' runaway acceleration. Why do decision-makers keep accelerating even when they see the wall? What are the invisible locks that paralyze action? Bruno Markov sheds light on the traps of the prisoner's dilemma, the rebound effects of technologies, and the role of collective imaginaries in our visions of the future. An episode that invites us to reinvent progress, step outside techno-dogmas, and put ethics and discernment back at the heart of our choices.
The day after the review, Boris rides to Olmütz to build on his burgeoning relationship with Andrei, with the goal of obtaining a position as adjutant. He reflected, "It is all well for Rostov, whose father sends him 20,000 rubles at a time, to NOT wish to be anyone's lackey, but I who have little but my brains, must not miss any opportunity!" Olmütz had been transformed into the headquarters where the Emperors resided. When Boris inquired after Andrei, he was shunned by officials who had grown tired of the number of low-level officers coming and going. He learned Andrei would return the next day, so at that time Boris visited Kutuzov's quarters and found Andrei in a reception room. He noticed Andrei with an older general, who was hardly keeping Andrei's interest. Andrei, clearly part of the inner circle of influence, was happy to interrupt the old man and turned to Boris with a smile. Boris realized that besides the discipline, subordination, and order prescribed in the official army code, there was a more important way of life, one which forced the general to the sideline. Boris resolved to become part of this higher world. Andrei informed Boris that he had been occupied with the Austrian command and referenced the historic General Franz von Weyrother, who plays a critical role in the upcoming Battle of Austerlitz. Boris could only pretend to understand who Andrei was alluding to. Andrei conveyed that he would recommend Boris for a position as an adjutant. Boris was thankful and very much desired an audience with Kutuzov, but Andrei explained the commander's staff was overflowing with many who had no use. Andrei wished to refer Boris to the historic advisor to the Czar, Peter Dolgorúkov, whom Andrei labeled "a good friend and excellent fellow." Therefore, they went to the local palace, where a significant council of war of the Hofkriegsrat and Russian command had just finished. The consensus was to advance and vanquish Napoleon. Dolgorúkov was under the spell of the event, where the ambitious, misguided youth had prevailed. This was contrary to the views of Kutuzov. All voices counseling delay had been silenced by what seemed conclusive evidence of the victory that awaited. The advantages included superior numbers, the perceived quality of the troops, knowledge of the terrain, and the fact that the allies were inspired by the Emperors. Dolgorúkov was exhausted but eager for the inevitable victory. Andrei introduced his protégé, but Dolgorúkov was unable to get beyond the impending action. Dolgorúkov recounted how Napoleon had sent a letter proposing peace, which was viewed as a ruse to gain time. Tolstoy brings out the historic affront crafted in response. Dolgorúkov explains, "What was most amusing was how we could not think how to address our reply! Not to Napoleon as 'Consul,' nor 'Emperor,' nor 'General Bonaparte.'" The fictional diplomat Bilibin jokingly suggested "Usurper and Enemy of Mankind." What was agreed on was: To the Head of the French Government / Au chef du gouvernement français. Andrei acknowledges how much Napoleon will be insulted, which makes Dolgorúkov recall a tale about Napoleon, who held a reputation "as the most cunning and subtle diplomat, a combination of French adroitness and Italian play-acting!" On one purported occasion, Bonaparte wished to take the measure of a Russian ambassador, Count Markov, and purposely dropped a handkerchief and then stood looking at Markov, expecting Markov to assist. Instead, Markov dropped his own handkerchief and picked it up without touching Bonaparte's.
When Andrei reintroduces Boris, the young man receives passing acknowledgment but is told his appeal will be addressed another time. Still, Boris was enraptured by his surroundings. He recognized he was among the springs that set in motion enormous movements of men; left in his regiment, he would have considered himself an obedient and insignificant atom. As they were exiting, they noticed a short man with a clever face and sharply projecting jaw, who nodded to Dolgorúkov as to an intimate friend but stared at Andrei with cool intensity. "Who was that?" asked Boris. Andrei explained, "He is one of the most remarkable, but to me most unpleasant of men—the Minister of Foreign Affairs, Prince Adam Czartorýski.... It is such men as he who decide the fate of nations." Tolstoy is referencing an extremely significant Polish statesman, who lived to just over 90. At the time the novel is set, Czartorýski was a close friend and trusted advisor to Tsar Alexander, but he was later famous for trying to restore sovereignty to Poland.
Today, Razib talks about a new paper, A structured coalescent model reveals deep ancestral structure shared by all modern humans: Understanding the history of admixture events and population size changes leading to modern humans is central to human evolutionary genetics. Here we introduce a coalescence-based hidden Markov model, cobraa, that explicitly represents an ancestral population split and rejoin, and demonstrate its application on simulated and real data across multiple species. Using cobraa, we present evidence for an extended period of structure in the history of all modern humans, in which two ancestral populations that diverged ~1.5 million years ago came together in an admixture event ~300 thousand years ago, in a ratio of ~80:20%. Immediately after their divergence, we detect a strong bottleneck in the major ancestral population. We inferred regions of the present-day genome derived from each ancestral population, finding that material from the minority correlates strongly with distance to coding sequence, suggesting it was deleterious against the majority background. Moreover, we found a strong correlation between regions of majority ancestry and human–Neanderthal or human–Denisovan divergence, suggesting the majority population was also ancestral to those archaic humans.
In this fascinating episode of Spybrary, host Shane Whaley takes us to the espionage heart of London with expert London Spy Tours guide David Harry, also known as The London Spy. From real-life Cold War betrayals to Bond-worthy locations and hidden relics, David shares captivating insights from his acclaimed Westminster and St. James's London spy tours. This episode is a treasure trove for spy fiction lovers and espionage history buffs alike.
The Navarre third-division quiniela (football pools picks) with Burladés coach Ivan Markov.
Artificial intelligence has come a long way over just the past few years. It can hold conversations and manage social media, it can create art and edit videos, and it can even write blogs (though not this one). Every aspect of our lives has been touched by AI in one way or another, and that's particularly true for sound. While many podcasters, including some of my guests, now use AI tools for research and sound editing, it's also front and center in sound, from cloning voices to writing its own songs. Royalty-free music is already starting to give way to copyright-free AI music, and a variety of powerful audio content generation tools are scheduled for release later this year. But can computers replace human composers? Will listeners be able to tell the difference? And how did we get from vinyl records to virtual music? It may seem hard to believe, but the very first song written by a computer is older than cassette tapes. The Illiac Suite, or "String Quartet No. 4," as it's officially named, was created in 1955, using pioneering techniques still found in AI today. The ILLIAC I (ill-ee-ack one) was one of the world's first computers. It was built in 1952 at the University of Illinois, and it filled an entire room. The ILLIAC I weighed five tons and used over two thousand vacuum tubes, some of which had to be replaced each night. A pair of music professors, Lejaren Hiller and Leonard Isaacson, programmed the ILLIAC to compose a string quartet using what's called "stochastic music," music that's written using probability calculations and mathematical sequences – in this case, Markov chains (see the toy code sketch after this piece) – instead of human inspiration. One of the researchers who helped build the ILLIAC I was Saburo Muroga, who also built the MUSASINO-1 in Japan a few years later. And, as it happens, another breakthrough in computer-generated music would emerge from Japan exactly fifty years after the Illiac Suite's release. Synthetic voices were the next step in creating digital music, and in 1961 the IBM 7094 became the first computer to sing a song, "Daisy Bell." Another computer voice that could sing was called Perfect Paul, and it was one of the voice settings on 1983's text-to-speech DECtalk device. This is the speech synthesizer Professor Stephen Hawking used in his later years, and it was based on the voice of MIT researcher Dennis Klatt. The next decade brought us Auto-Tune, which can digitally modulate singing voices in real time and has become, for better or worse, a staple of pop music. These developments all came together in 2004 as "Vocaloids," synthesized voices that can talk and sing with perfect pitch. The most famous of them by far is Crypton Future Media's Hatsune Miku, a second-generation Vocaloid who debuted in 2007. While there have been four more generations and many more voices since then, Miku is the one who captured the public's eyes and ears. Arguably the world's first virtual celebrity, she's opened for Lady Gaga, put in a holographic appearance at the 2024 Coachella festival, and just wrapped up her latest 'Miku Expo' world tour last December. In some ways, Miku and the Vocaloids that followed marked a turning point in synthetic voices. Older synthesizers like Perfect Paul and Microsoft Sam couldn't be mistaken for an ordinary person, but Vocaloids come closer than anything before – so close, in fact, that some music critics have said they fall into a sort of audio uncanny valley.
They sound almost, but not quite, human. Now it's the year 2025, and AI has taken the stage: it's talking, singing, composing, and even creating whole new kinds of sound. Both OpenAI's Jukebox and Google's MusicLM can convert text into music, and Nvidia's upcoming Fugatto software is described as a sonic "Swiss Army knife" for creating sounds that have never existed, like a screaming saxophone or a trumpet that meows. Another new song-generation service by Musical AI and Beatoven.ai that's set to...
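To make the Markov-chain technique mentioned above concrete, here is a toy sketch in Python. The note set and transition probabilities are invented for illustration; Hiller and Isaacson's actual compositional rules were far more elaborate.

```python
import random

# Toy first-order Markov chain over pitches, in the spirit of the
# "stochastic music" described above. Each note's row gives the
# probability of moving to the next note.
TRANSITIONS = {
    "C": {"D": 0.5, "E": 0.3, "G": 0.2},
    "D": {"C": 0.4, "E": 0.6},
    "E": {"C": 0.3, "D": 0.3, "G": 0.4},
    "G": {"C": 0.7, "E": 0.3},
}

def generate_melody(start="C", length=16, seed=None):
    """Walk the chain, sampling each next note from the current note's row."""
    rng = random.Random(seed)
    note, melody = start, [start]
    for _ in range(length - 1):
        row = TRANSITIONS[note]
        note = rng.choices(list(row), weights=list(row.values()))[0]
        melody.append(note)
    return melody

print(" ".join(generate_melody(seed=42)))
```

The point of the technique is that local note-to-note statistics, not a human melody line, drive the composition; richer versions condition on longer histories or add rules that reject unmusical transitions.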
On the feast of Our Lady of Lourdes and the World Day of the Sick, we heard a few excerpts from Pope Francis's message and looked into the purpose of the Day of Consecrated Life on the feast of Candlemas. The central guest was Sister Metka Tušar, a Sister of Mercy who was diagnosed with brain cancer in 2021. During her treatment she leaned heavily on the Word of God: she copied out the entire New Testament by hand and began learning the Gospel of Mark (Markov evangelij) by heart.
In an exclusive conversation, AI visionary Pedro shares insights into his groundbreaking contributions, including the unification of AI paradigms and the promise of neuro-symbolic AI. He reflects on key moments in his career and discusses the next frontiers in machine learning.
00:27 - About Pedro Domingos: Pedro Domingos is a Professor Emeritus of computer science and engineering at the University of Washington. He is a researcher in machine learning known for Markov logic networks, which enable uncertain inference.
******Support the channel****** Patreon: https://www.patreon.com/thedissenter PayPal: paypal.me/thedissenter PayPal Subscription 1 Dollar: https://tinyurl.com/yb3acuuy PayPal Subscription 3 Dollars: https://tinyurl.com/ybn6bg9l PayPal Subscription 5 Dollars: https://tinyurl.com/ycmr9gpz PayPal Subscription 10 Dollars: https://tinyurl.com/y9r3fc9m PayPal Subscription 20 Dollars: https://tinyurl.com/y95uvkao ******Follow me on****** Website: https://www.thedissenter.net/ Facebook: https://www.facebook.com/thedissenteryt/ Twitter: https://x.com/TheDissenterYT This show is sponsored by Enlites, Learning & Development done differently. Check the website here: http://enlites.com/ Dr. Kathryn Nave is a Leverhulme Trust early career research fellow at the University of Edinburgh. Her research focuses on developing a realist account of autonomy and agency, grounded in the uniquely metabolic existence of living systems, and upon critiquing the machine concept of the organism in light of this distinctive material instability. She is the author of A Drive to Survive: The Free Energy Principle and the Meaning of Life. In this episode, we focus on A Drive to Survive. We first discuss enactivism, and the notions of intentionality, autopoiesis, autonomy, adaptivity, and predictive processing. We then get into the Free Energy Principle, and talk about generative models, Markov blankets, living agents, purposiveness and goal-directedness, and biological survival. We discuss the limitations of the Free Energy Principle in differentiating between living and non-living systems, the instability of living systems, and how we can go beyond the Free Energy Principle framework. -- A HUGE THANK YOU TO MY PATRONS/SUPPORTERS: PER HELGE LARSEN, JERRY MULLER, BERNARDO SEIXAS, ADAM KESSEL, MATTHEW WHITINGBIRD, ARNAUD WOLFF, TIM HOLLOSY, HENRIK AHLENIUS, FILIP FORS CONNOLLY, DAN DEMETRIOU, ROBERT WINDHAGER, RUI INACIO, ZOOP, MARCO NEVES, COLIN HOLBROOK, PHIL KAVANAGH, SAMUEL ANDREEFF, FRANCIS FORDE, TIAGO NUNES, FERGAL CUSSEN, HAL HERZOG, NUNO MACHADO, JONATHAN LEIBRANT, JOÃO LINHARES, STANTON T, SAMUEL CORREA, ERIK HAINES, MARK SMITH, JOÃO EIRA, TOM HUMMEL, SARDUS FRANCE, DAVID SLOAN WILSON, YACILA DEZA-ARAUJO, ROMAIN ROCH, DIEGO LONDOÑO CORREA, YANICK PUNTER, CHARLOTTE BLEASE, NICOLE BARBARO, ADAM HUNT, PAWEL OSTASZEWSKI, NELLEKE BAK, GUY MADISON, GARY G HELLMANN, SAIMA AFZAL, ADRIAN JAEGGI, PAULO TOLENTINO, JOÃO BARBOSA, JULIAN PRICE, EDWARD HALL, HEDIN BRØNNER, DOUGLAS FRY, FRANCA BORTOLOTTI, GABRIEL PONS CORTÈS, URSULA LITZCKE, SCOTT, ZACHARY FISH, TIM DUFFY, SUNNY SMITH, JON WISMAN, WILLIAM BUCKNER, PAUL-GEORGE ARNAUD, LUKE GLOWACKI, GEORGIOS THEOPHANOUS, CHRIS WILLIAMSON, PETER WOLOSZYN, DAVID WILLIAMS, DIOGO COSTA, ANTON ERIKSSON, ALEX CHAU, AMAURI MARTÍNEZ, CORALIE CHEVALLIER, BANGALORE ATHEISTS, LARRY D. LEE JR., OLD HERRINGBONE, MICHAEL BAILEY, DAN SPERBER, ROBERT GRESSIS, IGOR N, JEFF MCMAHAN, JAKE ZUEHL, BARNABAS RADICS, MARK CAMPBELL, TOMAS DAUBNER, LUKE NISSEN, KIMBERLY JOHNSON, JESSICA NOWICKI, LINDA BRANDIN, NIKLAS CARLSSON, GEORGE CHORIATIS, VALENTIN STEINMANN, PER KRAULIS, KATE VON GOELER, ALEXANDER HUBBARD, BR, MASOUD ALIMOHAMMADI, JONAS HERTNER, URSULA GOODENOUGH, DAVID PINSOF, SEAN NELSON, MIKE LAVIGNE, JOS KNECHT, ERIK ENGMAN, LUCY, YHONATAN SHEMESH, MANVIR SINGH, PETRA WEIMANN, PEDRO BONILLA, CAROLA FEEST, STARRY, MAURO JÚNIOR, 航 豊川, TONY BARRETT, AND BENJAMIN GELBART! 
A SPECIAL THANKS TO MY PRODUCERS, YZAR WEHBE, JIM FRANK, ŁUKASZ STAFINIAK, TOM VANEGDOM, BERNARD HUGUENEY, CURTIS DIXON, BENEDIKT MUELLER, THOMAS TRUMBLE, KATHRINE AND PATRICK TOBIN, JONCARLO MONTENEGRO, AL NICK ORTIZ, NICK GOLDEN, AND CHRISTINE GLASS! AND TO MY EXECUTIVE PRODUCERS, MATTHEW LAVENDER, SERGIU CODREANU, BOGDAN KANIVETS, ROSEY, AND GREGORY HASTINGS!
The Navarre third-division quiniela with Iván Markov of Burladés, who made his debut on the bench with a victory over a direct rival and this matchday faces another side with a chance of climbing out of the relegation zone.
We begin the year with a snapshot of the media market in Serbia and the region. Since last year, the guests for the New Year's episode and for the first episode in September have been nominated by previous guests, and for the first episode of 2025 the nominee is Marko Šobot, a media expert with decades of experience working primarily with offline media. Over a rich career in which he honed his skills in large systems such as Direct Media and Publicis, and today in the role of CEO of the agency Plus Media, Marko shared with us his insights and experience regarding the Serbian media market: how the big players view us as a region, how clients operate today, why agencies struggle to retain talent, and why offline media still matter a great deal, given proper segmentation and market research. Marko Šobot, CEO @ Plus Media - https://www.linkedin.com/in/marko-%C5%A1obot-87b36a33/
Topics in this episode:
- Introduction
- Marko's professional path from architect to hardened media man
- The media market: us and the region
- What the two biggest crises of the past 20 years brought, and what they took away
- Television below 50% for the first time: what is the future of linear television?
- Clients today
- Agencies: why we are no longer attractive to young people
- Media: there are no media that don't work, you just have to fit the pieces together
- AI: are we losing our jobs, and why not
- A closing message
Subscribe to our YouTube channel: https://bit.ly/3uWtLES
Visit our site and sign up for our mailing list - https://www.digitalk.rs
Follow DigiTalk.rs on social media:
Facebook: https://www.facebook.com/Digitalk.rs
Instagram: https://www.instagram.com/digitalk.rs/
Linkedin: https://www.linkedin.com/company/digitalkrs
We owe great thanks to the companies that recognized the quality of what we do and decided to support us and give us wind in our sails:
Podcast partners:
- Raiffeisen banka - https://www.raiffeisenbank.rs/ Raiffeisen digital services we recommend for small and medium-sized businesses: https://bit.ly/4j2pMjU
- The NIS company - https://www.nis.rs/
- Ananas - https://ananas.rs/
- The Idea company - https://online.idea.rs/ In Idea's online store, enter the promo code 1000digitalk and get a 1,000-dinar discount on your online purchase!
Friends of the podcast:
- PerformLabs - https://performlabs.agency/ Unlock the full potential of your digital marketing! Optimize your campaigns and achieve maximum results with PerformLabs.
- BiVits ACTIVA Brain Level Up Booster - https://bivits.com/proizvod/brain-level-up/ When you want to live and work at a higher level, take BiVits Brain Level Up for more energy and better concentration throughout the day!
- The Finesa publishing house - https://www.finesa.edu.rs/ In this episode we are giving away two copies of the book "Ponašaj se kao lider, razmišljaj kao lider" from the Finesa publishing house to the fastest and most creative commenters; you can also write to us directly at info@digitalk.rs with a comment, suggestion, or complaint. In addition, everyone who orders books on Finesa's website and enters the promo code digitalk will get a 10% discount on the already reduced prices on the site: https://www.finesa.edu.rs/
Is it actually possible to make AI content in one go with minimal human involvement that is as good as what a human can produce? Zac Harris, former Head of Demand Gen and SEO at Copy.ai and founder of Rankd., thinks so. But Zac does have a caveat: the input you give AI has to be far better than what your average marketer or writer tends to provide. Ever the AI skeptic, Tim Soulo sat down with Zac recently to discuss his workflows, how he scaled Copy.ai to 1M visits per month through free tools, and more.
00:00 Intro
1:40 Success with AI tools
16:02 The effect of mass-deleting content
20:04 Copy.ai's content strategy
29:43 Where to find quality freelancers
34:12 Using feedback to create custom AI workflows and content
40:14 Marketing attribution, Markov chains, re-activations, and expansions
56:18 Leveraging influencer marketing for brand visibility
1:03:59 Link building tactics
01:13:12 The impact of mentorship
1:18:20 Why Zac left Copy.ai to build his new agency
1:38:17 Why teaching people how to use AI is important
1:39:50 AI vs human output competition
1:44:55 Building workflows for AI
1:55:04 Outro
We hope you enjoyed this episode of Ahrefs Podcast! As always, be sure to like and subscribe (and tell a friend).
Where to find Zac:
LinkedIn: https://www.linkedin.com/in/zacharris36/
Website: https://www.gorankd.com/
Where to find Tim:
LinkedIn: https://www.linkedin.com/in/timsoulo/
X: @timsoulo
Website: https://www.timsoulo.com/
SPOILER-FILLED DISCUSSION: Join me and some special guests this week as we enter the world of "Doki Doki Literature Club" where our pesky neighbor has somehow convinced us to spend the afternoon writing poems. Hesitantly, we will get to know the other girls in the club and prepare for a festival which is quickly approaching. Sounds nice! But this is a psychological horror game . . . so where's the catch? **CONTENT WARNING** This game depicts serious themes including depression, self harm, suicide, and abuse. Please refer to the content warning at the link below before playing this game or listening to this episode. https://ddlc.moe/warning Team Salvato - https://teamsalvato.com/ Dan Salvato - https://x.com/dansalvato Satchely - https://x.com/_Satchely Velinquent - https://www.pixiv.net/en/users/17385446 Jillian Ashcraft - https://www.youtube.com/@jillianashcraft2103 The Portrait of Markov - https://aminoapps.com/c/ddlc/page/item/the-portrait-of-markov-full-story/j0DG_1KYFoIJd7raYX4LBQPoBoPmoD423YX Big News! Podcast - https://youtube.com/playlist?list=PLAmF7jgGM44HZhGYWa_HFgJU1jHuBVKvE&si=LK5H-MQeaeAqF9dS The Garden Club - https://youtube.com/playlist?list=PLAmF7jgGM44H6Mj7oaVK5o38MYmUnkeBC&si=yC0QL_ke9CZ4fqkm
******Support the channel****** Patreon: https://www.patreon.com/thedissenter PayPal: paypal.me/thedissenter PayPal Subscription 3 Dollars: https://tinyurl.com/ybn6bg9l PayPal Subscription 5 Dollars: https://tinyurl.com/ycmr9gpz PayPal Subscription 10 Dollars: https://tinyurl.com/y9r3fc9m PayPal Subscription 20 Dollars: https://tinyurl.com/y95uvkao ******Follow me on****** Website: https://www.thedissenter.net/ The Dissenter Goodreads list: https://shorturl.at/7BMoB Twitter: https://x.com/TheDissenterYT This show is sponsored by Enlites, Learning & Development done differently. Check the website here: http://enlites.com/ Dr. Karl Friston is Professor of Imaging Neuroscience and Wellcome Principal Research Fellow of Imaging Neuroscience at University College London. Dr. Friston is a theoretical neuroscientist and authority on brain imaging. He invented statistical parametric mapping (SPM), voxel-based morphometry (VBM) and dynamic causal modelling (DCM). His main contribution to theoretical neurobiology is a free-energy principle for action and perception. In this episode, we explore the Free Energy Principle, and how to go from physical systems to brains and cognition. We start by discussing what the Free Energy Principle is, the history behind its development, and concepts like Markov blankets, internal and external states, blanket states, circular causality, and autonomous states. We talk about the differences between living and non-living systems, and the existential imperative to reduce prediction error. We also discuss concepts like self-organization and hierarchy in nervous systems. We discuss what we can learn about the brain through neuroimaging, and how specialized the brain is. Finally, we talk about how we can integrate the microscopic aspects of brain physiology with a more abstract understanding of the mind, like what we have in psychiatry and psychology. -- A HUGE THANK YOU TO MY PATRONS/SUPPORTERS: PER HELGE LARSEN, JERRY MULLER, BERNARDO SEIXAS, ADAM KESSEL, MATTHEW WHITINGBIRD, ARNAUD WOLFF, TIM HOLLOSY, HENRIK AHLENIUS, FILIP FORS CONNOLLY, DAN DEMETRIOU, ROBERT WINDHAGER, RUI INACIO, ZOOP, MARCO NEVES, COLIN HOLBROOK, PHIL KAVANAGH, SAMUEL ANDREEFF, FRANCIS FORDE, TIAGO NUNES, FERGAL CUSSEN, HAL HERZOG, NUNO MACHADO, JONATHAN LEIBRANT, JOÃO LINHARES, STANTON T, SAMUEL CORREA, ERIK HAINES, MARK SMITH, JOÃO EIRA, TOM HUMMEL, SARDUS FRANCE, DAVID SLOAN WILSON, YACILA DEZA-ARAUJO, ROMAIN ROCH, DIEGO LONDOÑO CORREA, YANICK PUNTER, CHARLOTTE BLEASE, NICOLE BARBARO, ADAM HUNT, PAWEL OSTASZEWSKI, NELLEKE BAK, GUY MADISON, GARY G HELLMANN, SAIMA AFZAL, ADRIAN JAEGGI, PAULO TOLENTINO, JOÃO BARBOSA, JULIAN PRICE, EDWARD HALL, HEDIN BRØNNER, DOUGLAS FRY, FRANCA BORTOLOTTI, GABRIEL PONS CORTÈS, URSULA LITZCKE, SCOTT, ZACHARY FISH, TIM DUFFY, SUNNY SMITH, JON WISMAN, WILLIAM BUCKNER, PAUL-GEORGE ARNAUD, LUKE GLOWACKI, GEORGIOS THEOPHANOUS, CHRIS WILLIAMSON, PETER WOLOSZYN, DAVID WILLIAMS, DIOGO COSTA, ALEX CHAU, AMAURI MARTÍNEZ, CORALIE CHEVALLIER, BANGALORE ATHEISTS, LARRY D. 
LEE JR., OLD HERRINGBONE, MICHAEL BAILEY, DAN SPERBER, ROBERT GRESSIS, IGOR N, JEFF MCMAHAN, JAKE ZUEHL, BARNABAS RADICS, MARK CAMPBELL, TOMAS DAUBNER, LUKE NISSEN, KIMBERLY JOHNSON, JESSICA NOWICKI, LINDA BRANDIN, NIKLAS CARLSSON, GEORGE CHORIATIS, VALENTIN STEINMANN, PER KRAULIS, ALEXANDER HUBBARD, BR, MASOUD ALIMOHAMMADI, JONAS HERTNER, URSULA GOODENOUGH, DAVID PINSOF, SEAN NELSON, MIKE LAVIGNE, JOS KNECHT, ERIK ENGMAN, LUCY, MANVIR SINGH, PETRA WEIMANN, CAROLA FEEST, STARRY, MAURO JÚNIOR, 航 豊川, TONY BARRETT, BENJAMIN GELBART, NIKOLAI VISHNEVSKY, STEVEN GANGESTAD, AND TED FARRIS! A SPECIAL THANKS TO MY PRODUCERS, YZAR WEHBE, JIM FRANK, ŁUKASZ STAFINIAK, TOM VANEGDOM, BERNARD HUGUENEY, CURTIS DIXON, BENEDIKT MUELLER, THOMAS TRUMBLE, KATHRINE AND PATRICK TOBIN, JONCARLO MONTENEGRO, AL NICK ORTIZ, NICK GOLDEN, AND CHRISTINE GLASS! AND TO MY EXECUTIVE PRODUCERS, MATTHEW LAVENDER, SERGIU CODREANU, BOGDAN KANIVETS, ROSEY, AND GREGORY HASTINGS!
In this episode, Joey Pinz sits down with Alex Markov, a visionary leader in the tech industry, who shares his journey from founding a Managed Service Provider (MSP) to developing a successful SaaS platform, Strategy Overview. Alex's passion for design, both in physical spaces and technology, has driven him to create solutions that streamline workflows and improve client experiences. He discusses the importance of standardization in MSPs, the transition from MSP operations to SaaS development, and the innovative use of AI to optimize business processes. Alex also emphasizes the significance of redefining work to enhance both personal and professional life. This conversation is packed with insights for tech enthusiasts and business leaders alike.
Top 3 Highlights:
1. Alex Markov's transition from running an MSP to launching a successful SaaS platform.
2. The role of design and standardization in creating efficient business operations.
3. How AI is revolutionizing the MSP industry by optimizing client management processes.
Hashtags: #TechInnovation #MSP #SaaS #AI #BusinessGrowth #DesignThinking #WorkplaceWellness #Leadership
Join us for enlightening discussions that spark growth and exploration. Hosted by Joey Pinz, this Discipline Conversations Podcast offers insights and inspiration.
Our guest today is Pedro Domingos, who is joining an elite group of repeat guests – he joined us before in episode 34 in April 2023.
Pedro is Professor Emeritus of Computer Science and Engineering at the University of Washington. He has done pioneering work in machine learning, like the development of Markov logic networks, which combine probabilistic reasoning with first-order logic. He is probably best known for his book "The Master Algorithm", which describes five different "tribes" of AI researchers, and argues that progress towards human-level general intelligence requires a unification of their approaches.
More recently, Pedro has become a trenchant critic of what he sees as exaggerated claims about the power and potential of today's AI, and of calls to impose constraints on it.
He has just published "2040: A Silicon Valley Satire", a novel which ridicules Big Tech and also American politics.
Selected follow-ups:
- Pedro Domingos - University of Washington
- Previous London Futurists Podcast episode featuring Pedro Domingos
- 2040: A Silicon Valley Satire
- The Master Algorithm: How the Quest for the Ultimate Learning Machine Will Remake Our World
- The Bonfire of the Vanities
- Ron Howard
- Mike Judge
- Martin Scorsese
- Pandora's Brain
- Transcendence
- Future of Life Institute moratorium open letter
- OpenAI working on new reasoning technology under code name 'Strawberry'
- Artificial Intelligence: A Modern Approach - by Stuart Russell and Peter Norvig
- Google's AI reasons its way around the London Underground - Nature
- Conscium
- Is LaMDA Sentient? — an Interview - by Blake Lemoine
- Could a Large Language Model be Conscious? - Talk by David Chalmers at NeurIPS 2022
- Jeremy Bentham
- The Extended Phenotype - 1982 book by Richard Dawkins
- Clarion West: Workshops for people who are serious about writing
Music: Spike Protein, by Koi Discovery, available under CC0 1.0 Public Domain Declaration
In this episode of the Joint Action podcast, we explore how injuries to the ACL (anterior cruciate ligament) in the knee can lead to osteoarthritis, especially in young people aged 15-25. Did you know that up to 20% of people who develop knee osteoarthritis do so because of a past injury? ACL injuries are a major culprit, and their impact can be life-changing. We chat with Dr Andrew Ross, a physiotherapist and researcher, and Associate Prof Chris Schilling, a health economist, about how we can prevent these injuries in the first place. They share insights from recent studies showing that national injury prevention programs could save millions in healthcare costs, improve quality of life, and keep more people active in sports for longer. We also discuss the challenges of getting these programs off the ground and why they're so crucial - not just for individual athletes but for society as a whole. If you're interested in how we can better protect our knees and prevent osteoarthritis, this episode is a must-listen!
RESOURCES
Previous episodes:
- Knee injury and osteoarthritis with Tim Hewett
- Is osteoarthritis preventable? with Dr Jackie Whittaker
Programs:
- Perform+
- FIFA 11+ Program
- GLAD Australia
Papers:
- The economics of a national anterior cruciate ligament injury prevention program for amateur football players: a Markov model analysis
- The time is right to do more to reduce ACL injuries
CONNECT WITH US
Twitter/X: @ProfDavidHunter @jointactionorg
Instagram: @ProfDavidHunter
Email: hello@jointaction.info
Hosted on Acast. See acast.com/privacy for more information.
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Rationalists are missing a core piece for agent-like structure (energy vs information overload), published by tailcalled on August 17, 2024 on LessWrong.
The agent-like structure problem is a question about how agents in the world are structured. I think rationalists generally have an intuition that the answer looks something like the following:
* We assume the world follows some evolution law, e.g. maybe deterministically like x_{n+1} = f(x_n), or maybe something stochastic. The intuition being that these are fairly general models of the world, so they should be able to capture whatever there is to capture. x here has some geometric structure, and we want to talk about areas of this geometric structure where there are agents.
* An agent is characterized by a Markov blanket in the world that has informational input/output channels for the agent to get information to observe the world and send out information to act on it, intuitively because input/output channels are the most general way to model a relationship between two systems, and to embed one system within another we need a Markov blanket.
* The agent uses something resembling a Bayesian model to process the input, intuitively because the simplest explanation that predicts the observed facts is the best one, yielding the minimal map that can answer any query you could have about the world.
* And then the agent uses something resembling argmax to make a decision for the output given the input, since endless coherence theorems prove this to be optimal. Possibly there's something like an internal market that combines several decision-making interests (modelling incomplete preferences) or several world-models (modelling incomplete world-models).
There is a fairly obvious gap in the above story, in that it lacks any notion of energy (or entropy, temperature, etc.). I think rationalists mostly feel comfortable with that because:
* x_{n+1} = f(x_n) is flexible enough to accommodate worlds that contain energy (even if they also accommodate other kinds of worlds where "energy" doesn't make sense)
* 80% of the body's energy goes to muscles, organs, etc., so if you think of the brain as an agent and the body as a mech that gets piloted by the brain (so the Markov blanket for humans would be something like the blood-brain barrier rather than the skin), you can mostly think of energy as something that is going on out in the universe, with little relevance for the agent's decision-making.
I've come to think of this as "the computationalist worldview" because functional input/output relationships are the thing that is described very well with computations, whereas laws like conservation of energy are extremely arbitrary from a computationalist point of view. (This should be obvious if you've ever tried writing a simulation of physics, as naive implementations often lead to energy exploding.)
Radical computationalism is killed by information overload
Under the most radical forms of computationalism, the "ideal" prior is something that can range over all conceivable computations. The traditional answer to this is Solomonoff induction, but it is not computationally tractable because it has to process all observed information in every conceivable way.
Recently with the success of deep learning and the bitter lesson and the Bayesian interpretations of deep double descent and all that, I think computationalists have switched to viewing the ideal prior as something like a huge deep neural network, which learns representations of the world and functional relationships which can be used by some sort of decision-making process. Briefly, the issue with these sorts of models is that they work by trying to capture all the information that is reasonably non-independent of other information (for instance, the information in a picture that is relevant for predicting ...
Learn the tragic tale of Sorin Markov, a vampire planeswalker from Innistrad, and his quest for purpose in the Multiverse of Magic: The Gathering.
Become a lore luminary: https://www.patreon.com/thelorebrarians
Contact: thelorebrarians@gmail.com
More Lore Documentaries:
Yawgmoth and Phyrexia: https://youtu.be/hOFeZqDmXyk
A Study in Compleation: https://youtu.be/Ho3nX_OMafc
Timestamps:
0:00 - Intro
0:42 - Characteristics and Early life
9:44 - The Eldrazi to Avacyn
20:42 - Return of the Eldrazi
29:04 - Attack on Innistrad
36:00 - Eternal Night and Future
42:30 - Outro
"This video is unofficial Fan Content permitted under the Fan Content Policy. Not approved/endorsed by Wizards. Portions of the materials used are property of Wizards of the Coast. ©Wizards of the Coast LLC."
Support the Show.
duration: 00:50:03 - Le Masque et la Plume - by Laurent Goumarre - Our critics have read "Le barman du Ritz" by Philippe Collin, "Pour les siècles des siècles" by Alain Guiraudie, "Nous vivrons" by Joann Sfar, "Du côté sauvage" by Tiffany McDaniel, and "Les chaînes de Markov" by Noham Selcer, and tell you whether or not to read them this summer. - Guests: Blandine Rinkel, writer and musician; Laurent Chalumeau, rock journalist, screenwriter, dialogue writer, and novelist; Hubert Artus, journalist and literary critic; Raphaëlle Leyris, journalist at Le Monde and literary critic. - Produced by Audrey Ripoull.
Join Andrew, Keith, and Aaron as we tackle Call of Duty: Modern Warfare 3 by Sledgehammer Games. Join the Call of Duty squad as they globetrot around the world answering various calls from various duties. Play the campaign as you hunt bad man Markov, who wants gas and/or missiles. Tackle missions your way in the horrible new mission type called Open Combat, which are just segments ripped from Warzone. If the campaign is not your jam, then head online and battle little kids who shoot you with laser guns that meow and turn you into a puddle. Don't worry about being harassed by kids who claim to have relations with your mother, because online toxicity is cured! Don't care for PVP? Well, check out the new non-round-based zombies mode that is definitely not just Warzone with zombies. Thank you to our sponsor for this episode, Nonnie's Kitchen. Go to https://www.facebook.com/NonnieBread?mibextid=ZbWKwL and order your freeze-dried candy today! They are delicious. U.S. ONLY
This episode is a tribute to Jim Simons (1938-2024), the legendary American mathematician, academic, and philanthropist known for founding Renaissance Technologies. His Medallion Fund set a record with a 66% annual return (39.1% net of fees) from 1988 to 2018, a feat approached only by Soros and his Quantum Fund (1969-2000), with a 32% net annual return. To talk about Simons I am joined, once again, by the great mathematician and trader Ricardo Pérez-Marco. His career is covered in episode 32, and he has since made several appearances to talk about Bitcoin and poker. In this episode we connect mathematical ideas with investment strategies, and we end up talking, of course, about Bitcoin.
Support this podcast by visiting the sponsors:
Interactive Brokers: a broker with access to markets all over the world.
Indexa Capital: save on fees by signing up with my code.
EVO Cuenta Inteligente
On my YouTube channel you can watch this conversation on video.
Topic index:
0:02:12 The best-performing investors in history
0:08:13 Discovering signals in encrypted codes: the beginnings of machine learning
0:14:02 Stony Brook University
0:15:31 Minimal surfaces
0:25:23 A party on Long Island
0:28:12 Hiring only scientists, not economists
0:31:43 Steinitz's principle of accumulating small advantages
0:33:58 Edward Thorp
0:36:41 The Kelly criterion
0:39:23 Investing only where you have some edge
0:55:11 In a trade you must always have an exit plan
1:02:40 Shannon's information theory
1:04:31 Hidden Markov chains
1:15:03 Machine learning
1:18:11 Blackjack
1:20:12 Madoff: returns with suspiciously low volatility
1:21:41 Explainability of investment strategies
1:25:56 Emotions in investing
1:32:13 Schools of mathematics
1:42:28 The Simons Foundation
1:44:51 The crisis of academic recognition
1:53:41 Is Satoshi Nakamoto the best investor in history?
1:55:10 George Soros
1:57:54 Recurring cycles in the price of Bitcoin
2:13:23 Taleb's black paper on Bitcoin
2:21:57 The media and Bitcoin
2:24:21 The Ricardo Pérez-Marco mathematics prize
Link to the content discussed, available ONLY in my post on my Rankia blog:
https://www.rankia.com/blog/such/6433058-90-homenaje-jim-simons-fundador-renaissance-ricardo-perez-marco
Together with Richard Brennan, we discuss the optimal way of doing trend following and how to find the optimal set of rules to use in different market regimes, how to examine market states through Markov models (see the code sketch after this summary), and how artificial intelligence will impact trend following and make short-term trading incredibly competitive. Lastly, we ask ChatGPT what the optimal way of doing trend following is.
-----
EXCEPTIONAL RESOURCE: Find Out How to Build a Safer & Better Performing Portfolio using this FREE NEW Portfolio Builder Tool
-----
Follow Niels on Twitter, LinkedIn, YouTube or via the TTU website. It's true – most CIOs read 50+ books each year – get your FREE copy of the Ultimate Guide to the Best Investment Books ever written here. And you can get a free copy of my latest book "Ten Reasons to Add Trend Following to Your Portfolio" here. Learn more about the Trend Barometer here. Send your questions to info@toptradersunplugged.com. And please share this episode with a like-minded friend and leave an honest Rating & Review on iTunes or Spotify so more people can discover the podcast. Follow Rich on Twitter.
Episode TimeStamps:
01:04 - What has been on our radar recently?
04:03 - Industry performance update
07:50 - Q1, Dennis: Given the nature of lengthy hold periods, do you recommend trading cash CFDs accounting for the overnight costs?
11:40 - Q2, Ramki: Using ATR for position sizing when backtesting a trend following system
14:47 - Optimal rules to use in different kinds of market regimes
18:46 - How to use Markov models and mathematical algorithms to optimize trend following strategies
41:17 - Is trend following threatened by AI?
51:39 - "ChatGPT, what is the optimal way of doing trend following?"
Copyright © 2024 – CMC AG – All Rights Reserved
----
PLUS: Whenever you're ready... here are 3 ways I can help you in your investment Journey:
1. eBooks that cover key topics that you need to know about. In my eBooks, I put together some key discoveries and things I have learnt during the more than 3 decades I have worked in the Trend Following industry, which I hope you will find useful.
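As a rough illustration of what "examining market states through Markov models" can mean, here is a minimal two-state sketch in Python. The regimes, transition probabilities, and drift numbers are invented for illustration only and are not the model discussed in the episode.

```python
import random

# Toy two-state Markov model of market regimes ("trending" vs "choppy").
# Transition probabilities and per-regime daily drifts are made up.
TRANSITION = {
    "trending": {"trending": 0.95, "choppy": 0.05},
    "choppy":   {"trending": 0.10, "choppy": 0.90},
}
DAILY_DRIFT = {"trending": 0.0008, "choppy": -0.0002}  # mean return per day

def simulate(days=252, start="trending", seed=0):
    """Simulate one year of equity under regime-dependent drift."""
    rng = random.Random(seed)
    state, equity = start, 1.0
    for _ in range(days):
        equity *= 1.0 + DAILY_DRIFT[state]
        row = TRANSITION[state]
        state = rng.choices(list(row), weights=list(row.values()))[0]
    return equity

print(f"Equity after one year: {simulate():.3f}")
```

In practice, regime models like this are fitted from data (for example with a hidden Markov model) and the inferred state is used to switch rule sets, rather than being hand-specified as here.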
Coming up in this episode
* Does it do Passkeys tho?
* So What Happened to Xz anyway?
* How do we fix the internet?
The Video Version: https://www.youtube.com/watch?v=I3bN3PRmHJY
Timestamps
0:00 Cold Open
1:36 Amazingly Self-Hosted
34:13 The History of Xz and the Hack*!
49:58 How to Fix Open Source
1:15:56 Next Time
1:20:42 Stinger
We are reuniting for the 2nd AI UX demo day in SF on Apr 28. Sign up to demo here! And don't forget tickets for the AI Engineer World's Fair — for early birds who join before keynote announcements!
About a year ago there was a lot of buzz around prompt engineering techniques to force structured output. Our friend Simon Willison tweeted a bunch of tips and tricks, but the most iconic one is Riley Goodside making it a matter of life or death:
Guardrails (friend of the pod and AI Engineer speaker), Marvin (AI Engineer speaker), and jsonformer had also come out at the time. In June 2023, Jason Liu (today's guest!) open sourced his "OpenAI Function Call and Pydantic Integration Module", now known as Instructor, which quickly turned prompt engineering black magic into a clean, developer-friendly SDK. A few months later, model providers started to add function calling capabilities to their APIs as well as structured outputs support like "JSON Mode", which was announced at OpenAI Dev Day (see recap here). In just a handful of months, we went from threatening to kill grandmas to first-class support from the research labs. And yet, Instructor was still downloaded 150,000 times last month. Why?
What Instructor looks like
Instructor patches your LLM provider SDKs to offer a new response_model option to which you can pass a structure defined in Pydantic (a minimal sketch appears below). It currently supports OpenAI, Anthropic, Cohere, and a long tail of models through LiteLLM.
What Instructor is for
There are three core use cases to Instructor:
* Extracting structured data: Taking an input like an image of a receipt and extracting structured data from it, such as a list of checkout items with their prices, fees, and coupon codes.
* Extracting graphs: Identifying nodes and edges in a given input to extract complex entities and their relationships. For example, extracting relationships between characters in a story or dependencies between tasks.
* Query understanding: Defining a schema for an API call and using a language model to resolve a request into a more complex one that an embedding could not handle. For example, creating date intervals from queries like "what was the latest thing that happened this week?" to then pass onto a RAG system or similar.
Jason called all these different ways of getting data from LLMs "typed responses": taking strings and turning them into data structures.
Structured outputs as a planning tool
The first wave of agents was all about open-ended iteration and planning, with projects like AutoGPT and BabyAGI. Models would come up with a possible list of steps, and start going down the list one by one. It's really easy for them to go down the wrong branch, or get stuck on a single step with no way to intervene. What if these planning steps were returned to us as DAGs using structured output, and then managed as workflows? This also makes it easy to better train models on how to create these plans, as they are much more structured than a bullet point list. Once you have this structure, each piece can be modified individually by different specialized models. You can read some of Jason's experiments here:
While LLMs will keep improving (Llama 3 just got released as we write this), having a consistent structure for the output will make it a lot easier to swap models in and out. Jason's overall message on how we can move from ReAct loops to more controllable Agent workflows mirrors the "Process" discussion from our Elicit episode:
Watch the talk
As a bonus, here's Jason's talk from last year's AI Engineer Summit.
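For readers who haven't used Instructor, here is a minimal sketch of the response_model pattern described above. The Pydantic schema, prompt, and model name are illustrative; check the Instructor docs (linked in the show notes below) for the current API surface.

```python
import instructor
from openai import OpenAI
from pydantic import BaseModel

# The structure we want back; Instructor validates (and retries)
# against this schema instead of leaving us to hand-parse JSON.
class UserInfo(BaseModel):
    name: str
    age: int

# Patch the OpenAI client so create() accepts a response_model argument.
client = instructor.from_openai(OpenAI())

user = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model name
    response_model=UserInfo,
    messages=[{"role": "user", "content": "John Doe is 30 years old."}],
)
print(user.name, user.age)  # typed attribute access, not raw JSON
```

This is the "typed responses" idea in miniature: the schema lives separately from the prompt, and the return value is a validated Pydantic object rather than a string.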
He'll also be a speaker at this year's AI Engineer World's Fair!
Timestamps
* [00:00:00] Introductions
* [00:02:23] Early experiments with Generative AI at StitchFix
* [00:08:11] Design philosophy behind the Instructor library
* [00:11:12] JSON Mode vs Function Calling
* [00:12:30] Single vs parallel function calling
* [00:14:00] How many functions is too many?
* [00:17:39] How to evaluate function calling
* [00:20:23] What is Instructor good for?
* [00:22:42] The Evolution from Looping to Workflow in AI Engineering
* [00:27:03] State of the AI Engineering Stack
* [00:28:26] Why Instructor isn't VC backed
* [00:31:15] Advice on Pursuing Open Source Projects and Consulting
* [00:36:00] The Concept of High Agency and Its Importance
* [00:42:44] Prompts as Code and the Structure of AI Inputs and Outputs
* [00:44:20] The Emergence of AI Engineering as a Distinct Field
Show notes
* Jason on the UWaterloo mafia
* Jason on Twitter, LinkedIn, website
* Instructor docs
* Max Woolf on the potential of Structured Output
* swyx on Elo vs Cost
* Jason on Anthropic Function Calling
* Jason on Rejections, Advice to Young People
* Jason on Bad Startup Ideas
* Jason on Prompts as Code
* Rysana's inversion models
* Bryan Bischof's episode
* Hamel Husain
Transcript
Alessio [00:00:00]: Hey everyone, welcome to the Latent Space Podcast. This is Alessio, partner and CTO at Residence at Decibel Partners, and I'm joined by my co-host Swyx, founder of Smol AI.
Swyx [00:00:16]: Hello, we're back in the remote studio with Jason Liu from Instructor. Welcome Jason.
Jason [00:00:21]: Hey there. Thanks for having me.
Swyx [00:00:23]: Jason, you are extremely famous, so I don't know what I'm going to do introducing you, but you're one of the Waterloo clan. There's like this small cadre of you that's just completely dominating machine learning. Actually, can you list like Waterloo alums that you're like, you know, are just dominating and crushing it right now?
Jason [00:00:39]: So like John from like Rysana is doing his inversion models, right? I know like Clive Chen from Waterloo. When I started the data science club, he was one of the guys who were like joining in and just like hanging out in the room. And now he was at Tesla working with Karpathy, now he's at OpenAI, you know.
Swyx [00:00:56]: He's in my climbing club.
Jason [00:00:58]: Oh, hell yeah. I haven't seen him in like six years now.
Swyx [00:01:01]: To get in the social scene in San Francisco, you have to climb. So both in career and in rocks. So you started a data science club at Waterloo, we can talk about that, but then also spent five years at Stitch Fix as an MLE. You pioneered the use of OpenAI's LLMs to increase stylist efficiency. So you must have been like a very, very early user. This was like pretty early on.
Jason [00:01:20]: Yeah, I mean, this was like GPT-3, okay. So we actually were using transformers at Stitch Fix before the GPT-3 model. So we were just using transformers for recommendation systems. At that time, I was very skeptical of transformers. I was like, why do we need all this infrastructure? We can just use like matrix factorization. When GPT-2 came out, I fine tuned my own GPT-2 to write like rap lyrics and I was like, okay, this is cute. Okay, I got to go back to my real job, right? Like who cares if I can write a rap lyric? When GPT-3 came out, again, I was very much like, why are we using like a post request to review every comment a person leaves? Like we can just use classical models. So I was very against language models for like the longest time.
And then when ChatGPT came out, I basically just wrote a long apology letter to everyone at the company. I was like, hey guys, you know, I was very dismissive of some of this technology. I didn't think it would scale well, and I am wrong. This is incredible. And I immediately just transitioned to go from computer vision recommendation systems to LLMs. But funny enough, now that we have RAG, we're kind of going back to recommendation systems.
Swyx [00:02:21]: Yeah, speaking of that, I think Alessio is going to bring up the next one.
Alessio [00:02:23]: Yeah, I was going to say, we had Bryan Bischof from Hex on the podcast. Did you overlap at Stitch Fix?
Jason [00:02:28]: Yeah, he was like one of my main users of the recommendation frameworks that I had built out at Stitch Fix.
Alessio [00:02:32]: Yeah, we talked a lot about RecSys, so it makes sense.
Swyx [00:02:36]: So now I have adopted that line, RAG is RecSys. And you know, if you're trying to reinvent new concepts, you should study RecSys first, because you're going to independently reinvent a lot of concepts. So your system was called Flight. It's a recommendation framework with over 80% adoption, servicing 350 million requests every day. Wasn't there something existing at Stitch Fix? Why did you have to write one from scratch?
Jason [00:02:56]: No, so I think because at Stitch Fix, a lot of the machine learning engineers and data scientists were writing production code, sort of every team's systems were very bespoke. It's like, this team only needs to do like real time recommendations with small data. So they just have like a fast API app with some like pandas code. This other team has to do a lot more data. So they have some kind of like Spark job that does some batch ETL that does a recommendation. And so what happens is each team writes their code differently. And I have to come in and refactor their code. And I was like, oh man, I'm refactoring four different code bases, four different times. Wouldn't it be better if all the code quality was my fault? Let me just write this framework, force everyone else to use it. And now one person can maintain five different systems, rather than five teams having their own bespoke system. And so it was really a need of just sort of standardizing everything. And then once you do that, you can do observability across the entire pipeline and make large sweeping improvements in this infrastructure, right? If we notice that something is slow, we can detect it on the operator layer. Just hey, hey, like this team, you guys are doing this operation is lowering our latency by like 30%. If you just optimize your Python code here, we can probably make an extra million dollars. So let's jump on a call and figure this out. And then a lot of it was doing all this observability work to figure out what the heck is going on and optimize this system from not only just a code perspective, sort of like harassingly or against saying like, we need to add caching here. We're doing duplicated work here. Let's go clean up the systems. Yep.
Swyx [00:04:22]: Got it. One more system that I'm interested in finding out more about is your similarity search system using CLIP and GPT-3 embeddings and FAISS, where you saved over $50 million in annual revenue. So of course they all gave all that to you, right?
Jason [00:04:34]: No, no, no. I mean, it's not going up and down, but you know, I got a little bit, so I'm pretty happy about that.
But there, you know, that was when we were doing fine tuning like ResNets to do image classification. And so a lot of it was given an image, if we could predict the different attributes we have in the merchandising and we can predict the text embeddings of the comments, then we can kind of build a image vector or image embedding that can capture both descriptions of the clothing and sales of the clothing. And then we would use these additional vectors to augment our recommendation system. And so with the recommendation system really was just around like, what are similar items? What are complementary items? What are items that you would wear in a single outfit? And being able to say on a product page, let me show you like 15, 20 more things. And then what we found was like, hey, when you turn that on, you make a bunch of money.
Swyx [00:05:23]: Yeah. So, okay. So you didn't actually use GPT-3 embeddings. You fine tuned your own? Because I was surprised that GPT-3 worked off the shelf.
Jason [00:05:30]: Because I mean, at this point we would have 3 million pieces of inventory over like a billion interactions between users and clothes. So any kind of fine tuning would definitely outperform like some off the shelf model.
Swyx [00:05:41]: Cool. I'm about to move on from Stitch Fix, but you know, any other like fun stories from the Stitch Fix days that you want to cover?
Jason [00:05:46]: No, I think that's basically it. I mean, the biggest one really was the fact that I think for just four years, I was so bearish on language models and just NLP in general. I'm just like, none of this really works. Like, why would I spend time focusing on this? I got to go do the thing that makes money, recommendations, bounding boxes, image classification. Yeah. Now I'm like prompting an image model. I was like, oh man, I was wrong.
Swyx [00:06:06]: So my Stitch Fix question would be, you know, I think you have a bit of a drip and I don't, you know, my primary wardrobe is free startup conference t-shirts. Should more technology brothers be using Stitch Fix? What's your fashion advice?
Jason [00:06:19]: Oh man, I mean, I'm not a user of Stitch Fix, right? It's like, I enjoy going out and like touching things and putting things on and trying them on. Right. I think Stitch Fix is a place where you kind of go because you want the work offloaded. I really love the clothing I buy where I have to like, when I land in Japan, I'm doing like a 45 minute walk up a giant hill to find this weird denim shop. That's the stuff that really excites me. But I think the bigger thing that's really captured is this idea that narrative matters a lot to human beings. Okay. And I think the recommendation system, that's really hard to capture. It's easy to use AI to sell like a $20 shirt, but it's really hard for AI to sell like a $500 shirt. But people are buying $500 shirts, you know what I mean? There's definitely something that we can't really capture just yet that we probably will figure out how to in the future.
Swyx [00:07:07]: Well, it'll probably output in JSON, which is what we're going to turn to next. Then you went on a sabbatical to South Park Commons in New York, which is unusual because it's based in SF.
Jason [00:07:17]: Yeah. So basically in 2020, really, I was enjoying working a lot as I was like building a lot of stuff. This is where we were making like the tens of millions of dollars doing stuff. And then I had a hand injury. And so I really couldn't code anymore for like a year, two years.
And so I kind of took sort of half of it as medical leave; the other half, I became more of a tech lead, just making sure the systems were, like, lights on. And then when I went to New York, I spent some time there and kind of just wound down the tech work, you know, did some pottery, did some jujitsu. And after ChatGPT came out, I was like, oh, I clearly need to figure out what is going on here, because something feels very magical and I don't understand it. So I spent basically like five months just prompting and playing around with stuff. And then afterwards, it was just my startup friends going, hey, Jason, you know, my investors want us to have an AI strategy. Can you help us out? And it just snowballed more and more until I was making this my full-time job. Yeah, got it.Swyx [00:08:11]: You know, you had YouTube University and a journaling app, you know, a bunch of other explorations. But it seems like the most productive or the best known thing that came out of your time there was Instructor. Yeah.Jason [00:08:22]: Written on the bullet train in Japan. I think at some point, you know, tools like Guardrails and Marvin came out. Those are the kind of tools that use XML and Pydantic to get structured data out. But they really were doing things sort of in the prompt, and they were built with sort of the instruct models in mind. Like, I'd already done that in the past. Right. At Stitch Fix, you know, one of the things we did was we would take a request note and turn that into a JSON object that we would use to send to our search engine. Right. So if you said, like, I want skinny jeans that were this size, that would turn into JSON that we would send to our internal search APIs. But it always felt kind of gross. A lot of it is just like, you read the JSON, you parse it, you make sure the names are strings and ages are numbers, and you do all this messy stuff. But when function calling came out, it was very much sort of a new way of doing things. Right. Function calling lets you define the schema separate from the data and the instructions. And what this meant was you can kind of have a lot more complex schemas and just map them in Pydantic. And then you can just keep those very separate. And then once you add methods, you can add validators and all that kind of stuff. The one gripe I really had with a lot of these libraries, though, was that they were doing a lot of the string formatting themselves, which was fine when it was the instruct models. You just have a string. But when you have these new chat models, you have these chat messages, and I just didn't feel like taking that away from the developer was much of a benefit. And so I just said, let me write the most simple SDK around the OpenAI SDK, a simple wrapper on the SDK. Just handle the response model a bit, and I kind of think of myself more like requests than an actual framework that people can use. And so the goal is like, hey, this is something that you can use to build your own framework. But let me just do all the boring stuff that nobody really wants to do. People want to build their own frameworks, but people don't want to build JSON parsing.Swyx [00:10:08]: And the retrying and all that other stuff.Jason [00:10:10]: Yeah.Swyx [00:10:11]: Right. We had a little bit of this discussion before the show, but like, that design principle of going for being requests rather than being Django. Yeah. So what inspires you there?
This has come from a lot of prior pain. Are there other open source projects that inspired your philosophy here? Yeah.Jason [00:10:25]: I mean, I think it would be requests, right? Like, I think it is just the obvious thing you install. If you were going to go make HTTP requests in Python, you would obviously import requests. Maybe if you want to do more async work, there are other tools, but you don't really even think about installing it. And when you do install it, you don't think of it as, oh, this is a requests app. Right? Like, no, this is just Python. The bigger question is, a lot of people ask questions like, oh, why isn't requests in the standard library? Yeah. That's how I want my library to feel, right? It's like, oh, if you're going to use the LLM SDKs, you're obviously going to install Instructor. And then I think the second question would be, oh, how come Instructor doesn't just go into OpenAI, go into Anthropic? Like, if that's the conversation we're having, that's where I feel like I've succeeded. Yeah. It's like, yeah, it's so standard, you may as well just have it in the base libraries.Alessio [00:11:12]: And the shape of the request stayed the same, but initially function calling was maybe equal to structured outputs for a lot of people. I think now the models also support JSON mode and some of these things, and, you know, "return JSON or my grandma is going to die", and all of that stuff. How have you seen that evolution? Like, maybe what's the metagame today? Should people just forget about function calling for structured outputs, or when is structured output, like JSON mode, the best versus not? We'd love to get any thoughts, given that you do this every day.Jason [00:11:42]: Yeah, I would almost say these are different implementations of the real thing we care about, which is the fact that now we have typed responses to language models. And because we have that typed response, my IDE is a little bit happier. I get autocomplete. If I'm using the response wrong, there's a little red squiggly line. Those are the things I care about. In terms of whether or not JSON mode is better, I usually think it's almost worse, unless you want to spend less money on the prompt tokens that the function call represents, primarily because with JSON mode, you don't actually specify the schema. So sure, json.loads works, but really, I care about a lot more than just the fact that it is JSON, right? I think function calling gives you a tool to specify, okay, this is a list of objects that I want, and each object has a name or an age, and I want the age to be above zero, and I want to make sure it's parsed correctly. That's where function calling really shines.Alessio [00:12:30]: Any thoughts on single versus parallel function calling? So I did a presentation at our AI in Action Discord channel and obviously showcased Instructor. One of the big limitations we had before with single function calling is, when you're trying to extract lists, you have to make these funky properties that are lists to then actually return all the objects. How do you see the hack being put on the developer's plate versus more of this stuff just getting better in the model? And I know you tweeted recently about Anthropic, for example, you know, some lists are not lists, or are strings, and there are all of these discrepancies.Jason [00:13:04]: I almost would prefer it if it was always a single function call.
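The typed-response pattern being described here, a schema defined apart from the data and validated on the way out, can be sketched minimally with Instructor and Pydantic. The model name and the User schema below are illustrative assumptions for the sake of example, not something from the conversation:

```python
# Minimal sketch: function calling with a typed response via Instructor + Pydantic.
# Assumes `pip install instructor openai` and an OPENAI_API_KEY in the environment.
import instructor
from openai import OpenAI
from pydantic import BaseModel, field_validator

class User(BaseModel):
    name: str
    age: int

    @field_validator("age")
    @classmethod
    def age_must_be_non_negative(cls, v: int) -> int:
        # Validation failures surface as errors (and can drive retries)
        # instead of being silently parsed through.
        if v < 0:
            raise ValueError("age must be non-negative")
        return v

client = instructor.from_openai(OpenAI())

user = client.chat.completions.create(
    model="gpt-4o-mini",   # illustrative model name
    response_model=User,   # the schema lives apart from data and instructions
    messages=[{"role": "user", "content": "Extract: Jason is 30 years old."}],
)
print(user.name, user.age)  # typed access: IDE autocomplete, red squigglies when misused
```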
Obviously, there are the agent workflows that, you know, Instructor doesn't really support that well, but that are things that ought to be done, right? Like, you could define, I think, maybe 50 or 60 different functions in a single API call. And, you know, if it was like get the weather, or turn the lights on, or do something else, it makes a lot of sense to have these parallel function calls. But in terms of an extraction workflow, I definitely think it's probably more helpful to have everything be a single schema, right? Just because you can specify relationships between these entities that you can't do in parallel function calling, and you can have a single chain of thought before you generate a list of results. There are also small API differences, right? With parallel function calling, again, I really care about how the SDK looks: okay, do I always return a list of functions, or do you just want to have the actual object back out, with autocomplete over that object? Interesting.Alessio [00:14:00]: What's kind of the cap for how many function definitions you can put in where it still works well? Do you have any sense on that?Jason [00:14:07]: I mean, for the most part, I haven't really had a need to do anything that's more than six or seven different functions. I think in the documentation, they support way more. I don't even know if there are any good evals that have over two dozen function calls. I think if you're running into issues where you have like 20 or 50 or 60 function calls, you're much better off having those specifications saved in a vector database and then having them be retrieved, right? So if there are 30 tools, you should basically be ranking them and then using the top K to do selection a little bit better, rather than just shoving 60 functions into a single call. Yeah.Swyx [00:14:40]: Yeah. Well, I mean, I think this is relevant now because previously, context limits prevented you from having more than a dozen tools anyway. And now that we have million-token context windows, you know, Claude recently, with their new function calling release, said they can handle over 250 tools, which is insane to me. That's a lot. You're saying you don't think there are many people doing that. I think anyone with a sort of agent-like platform where you have a bunch of connectors would run into that problem. Probably you're right that they should use a vector database and kind of RAG their tools. I know Zapier has a few thousand, like 8,000, 9,000 connectors that, you know, obviously don't fit anywhere. So yeah, I mean, I think that would be it, unless you need some kind of intelligence that chains things together, which is, I think, what Alessio is coming back to, right? Like, there's this trend about parallel function calling. I don't know what I think about that. Anthropic's version was, I think, that they use multiple tools in sequence, but not in parallel. I haven't explored this at all. I'm just throwing this open to you: what do you think about all these new things? Yeah.Jason [00:15:40]: It's like, you know, do we assume that all function calls could happen in any order? In which case, we either can assume that, or we can assume that things need to happen in some kind of sequence as a DAG, right?
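The single-schema extraction pattern Jason prefers can be sketched like this; the chain_of_thought field and the toy entities are illustrative assumptions, not a schema from the conversation:

```python
# Minimal sketch: one schema per call instead of parallel function calls, so the model
# can share a single chain of thought and express relationships between entities.
from pydantic import BaseModel, Field

class Entity(BaseModel):
    id: int
    name: str
    depends_on: list[int] = Field(default_factory=list)  # relationships across the list

class Extraction(BaseModel):
    chain_of_thought: str  # one reasoning pass before the whole list is generated
    entities: list[Entity]

# Exercising the schema on a local JSON payload, as if a model had returned it:
sample = ('{"chain_of_thought": "B needs A first.", "entities": '
          '[{"id": 1, "name": "A"}, {"id": 2, "name": "B", "depends_on": [1]}]}')
result = Extraction.model_validate_json(sample)
print([e.name for e in result.entities if e.depends_on])  # -> ['B']
```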
But if it's a DAG, really, that's just one JSON object that is the entire DAG, rather than going, okay, the order in which the functions return doesn't matter. That's definitely just not true in practice, right? Like, if I have a thing that's like turn the lights on, unplug the power, and then turn the toaster on or something, the order does matter. And it's unclear how well you can describe the importance of that ordering to a language model yet. I mean, I'm sure you can do it with good enough prompting, but I just haven't found any use cases where the function sequence really matters. Yeah.Alessio [00:16:18]: To me, the most interesting thing is that the models are better at picking than your ranking is, usually. Like, I'm incubating a company around system integration. For example, with one system, there are like 780 endpoints. And if you're actually trying to do vector similarity, it's not that good, because the people that wrote the specs didn't have in mind making them semantically apart. You know, they're kind of like, oh, create this, create this, create this. Versus when you give it to a model, like in Opus, if you put them all in, it's quite good at picking which ones you should actually run. And I'm curious to see if the model providers actually care about some of those workflows, or if the agent companies are actually going to build very good rankers to kind of fill that gap.Jason [00:16:58]: Yeah. My money is on the rankers, because you can do those so easily, right? You could just say, well, given the embeddings of my search query and the embeddings of the description, I can just train XGBoost and just make sure that I have very high MRR, which is mean reciprocal rank. And so the only objective is to make sure that the tools you'd use are in the top-N filtered set. That feels super straightforward, and you don't have to actually figure out how to fine-tune a language model to do tool selection anymore. Yeah. I definitely think that's the case, because for the most part, I imagine you either have less than three tools or more than a thousand. I don't know what kind of company says, oh, thank God we only have 185 tools and this works perfectly, right? That's right.Alessio [00:17:39]: And before we maybe move on from this, it was interesting to me: you retweeted this thing about Anthropic function calling, and it was Joshua Brown retweeting some benchmark that was like, oh my God, Anthropic function calling so good. And then you retweeted it, and then you tweeted later that it's actually not that good. What's your flow? How do you actually test these things? Because obviously the benchmarks are lying, right? Because the benchmarks say it's good, and you said it's bad, and I trust you more than the benchmark. How do you think about that? And then how do you evolve it over time?Jason [00:18:09]: It's mostly just client data. I actually have been busy with enough client work that I haven't been able to reproduce public benchmarks. And so I can't even share some of the results on Anthropic. I would just say, in production, we have some pretty interesting schemas, where it's like iteratively building lists, where we're doing updates of lists, in-place updates, so upserts and inserts. And in those situations, we're like, oh yeah, we have a bunch of different parsing errors. Numbers are being returned as strings.
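The ranker idea from a moment ago is easy to make concrete. A toy sketch, with a bag-of-words embedding standing in for a real embedding model (an assumption so the example runs on its own):

```python
# Minimal sketch: retrieve top-k tool specs by similarity instead of shoving 60+ tools
# into one call. A real system would use a proper embedding model and a vector database.
import math
from collections import Counter

TOOLS = {
    "get_weather": "Fetch the current weather forecast for a city",
    "turn_on_lights": "Turn the smart lights on in a room",
    "create_invoice": "Create and send an invoice to a customer",
}

def embed(text: str) -> Counter:
    # Stand-in embedding: word counts. Swap in a real model in practice.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def top_k_tools(query: str, k: int = 2) -> list[str]:
    q = embed(query)
    ranked = sorted(TOOLS, key=lambda name: cosine(q, embed(TOOLS[name])), reverse=True)
    return ranked[:k]  # only these specs get passed to the function-calling request

print(top_k_tools("what's the weather forecast in Tokyo"))
```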
We were expecting lists of objects, but we're getting strings that are the strings of JSON, right? So we had to call JSON parse on individual elements. Overall, I'm super happy with the Anthropic models compared to the OpenAI models. Sonnet is very cost-effective. Haiku, in function calling, is actually better. But I think they just had to sort of file down the edges a little bit, where our tests pass, but then when we actually deployed to production, we got half a percent of traffic having issues, where if you ask for JSON, it'll try to talk to you, or if you use function calling, you know, we'll have a parse error. And so I think those are definitely going to be things that are fixed in the upcoming weeks. But in terms of the reasoning capabilities, man, it's hard to beat a 70% cost reduction, especially when you're building consumer applications, right? If you're building something for consultants or private equity, you're charging $400, it doesn't really matter if it's a dollar or $2. But for consumer apps, it makes products viable. If you can go from GPT-4 to Sonnet, you might actually be able to price it better. Yeah.Swyx [00:19:31]: I had this chart about the Elo versus the cost of all the models. And you could put trend lines on each of those things, like, higher Elo equals higher cost, except for Haiku. Haiku kind of just broke the lines, or the iso-Elo curves, if you want to call them that. Cool. Before we go too far into your opinions on just the overall ecosystem, I want to make sure that we map out the surface area of Instructor. I would say that most people would be familiar with Instructor from your talks and your tweets and all that. You had the number one talk from the AI Engineer Summit.Jason [00:20:03]: Two Lius. Jason Liu and Jerry Liu. Yeah.Swyx [00:20:06]: Yeah. Until I actually went through your cookbook, I didn't realize the surface area. How would you categorize the use cases? You have LLM self-critique, you have knowledge graphs in here, you have PII data sanitization. How do you characterize to people what the surface area of Instructor is? Yeah.Jason [00:20:23]: This is the part that feels crazy, because really the difference is: LLMs give you strings, and Instructor gives you data structures. And once you get data structures, again, you can do every LeetCode problem you ever thought of. Right. And so I think there are a couple of really common applications. The first one, obviously, is extracting structured data. This is just, be like, okay, I want to put in an image of a receipt, and I want to get back out a list of checkout items with a price and a fee and a coupon code or whatever. That's one application. Another application really is around extracting graphs. So one of the things we found out about these language models is that not only can you define nodes, it's really good at figuring out what are nodes and what are edges. And so we have a bunch of examples where, you know, not only do I extract that this happens after that, but also, okay, these two are dependencies of another task. And you can extract complex entities that have relationships. Given a story, for example, you could extract relationships of families across different characters. This can all be done by defining a graph. The last really big application is just around query understanding.
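The graph extraction just described reduces to a typed nodes-and-edges schema. A toy sketch, with node labels and relation names invented for illustration:

```python
# Minimal sketch: knowledge-graph extraction as a single typed schema. The model
# fills in nodes and edges; relationships (dependencies, orderings, family ties)
# live on the edges.
from pydantic import BaseModel

class Node(BaseModel):
    id: int
    label: str

class Edge(BaseModel):
    source: int
    target: int
    relation: str  # e.g. "happens_after", "depends_on", "parent_of"

class KnowledgeGraph(BaseModel):
    nodes: list[Node]
    edges: list[Edge]

# Exercising the schema locally, as if a model had returned this JSON:
sample = ('{"nodes": [{"id": 1, "label": "laundry"}, {"id": 2, "label": "folding"}],'
          ' "edges": [{"source": 2, "target": 1, "relation": "happens_after"}]}')
graph = KnowledgeGraph.model_validate_json(sample)
print(f"{len(graph.nodes)} nodes, {len(graph.edges)} edges")
```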
The idea is that any API call has some schema, and if you can define that schema ahead of time, you can use a language model to resolve a request into a much more complex request, one that an embedding could not do. So, for example, I have a really popular post called "RAG is more than embeddings". And effectively, you know, if I have a question like, what was the latest thing that happened this week? That embeds to nothing, right? But really, that query should just be: select all data where the datetime is between today and today minus seven days, right? What if I said, how did my writing change between this month and last month? Again, embeddings would do nothing. But really, if you could do a group-by over the month and a summarize, then you could again do something much more interesting. And so this really just calls out the fact that embeddings really are kind of the lowest-hanging fruit. And using something like Instructor can really help produce a data structure. And then you can just use your computer science and reason about the data structure. Maybe you say, okay, well, I'm going to produce a graph where I want to group by each month and then summarize them jointly. You can do that if you know how to define this data structure. Yeah.Swyx [00:22:29]: So you kind of run up against the LangChains of the world that used to have that. They still do have the self-querying thing, I think they used to call it, when we had Harrison on in our episode. How do you see yourself interacting with the other LLM frameworks in the ecosystem? Yeah.Jason [00:22:42]: I mean, if they use Instructor, I think that's totally cool. Again, it's just Python, right? It's like asking, oh, how does Django interact with requests? Well, you just might make a requests.get in a Django app, right? But no one would say, I went off of Django because I'm using requests now. Ideally that's just the wrong comparison, especially in terms of the agent workflows. I think the real goal for me is to go down the LLM compiler route, which is: instead of doing a ReAct-type reasoning loop, my belief is that we should be using workflows. If we do this, then we always have a request and a complete workflow, and we can fine-tune a model that has a better workflow. Whereas it's hard to think about how you fine-tune a better ReAct loop. Yeah. You always train it to have less looping, in which case you wanted to get the right answer the first time, in which case it was a workflow to begin with, right?Swyx [00:23:31]: Can you define workflow? Because I used to work at a workflow company, but I'm not sure this is a good term for everybody.Jason [00:23:36]: I'm thinking workflow in terms of the Prefect, Zapier kind of workflow. Like, I want to build a DAG. I want you to tell me what the nodes and edges are, and then maybe the edges are also put in with AI. But the idea is that I want to be able to present you the entire plan and then ask you to fix things as I execute it, rather than going, hey, I couldn't parse the JSON, so I'm going to try again; I couldn't parse the JSON, I'm going to try again. And then next thing you know, you've spent $2 on OpenAI credits, right? Yeah. Whereas with the plan, you can just say, oh, the edge between nodes X and Y does not run. Let me just iteratively try to fix that, fix the one that's stuck, go on to the next component.
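That date-range example can be made concrete as a small query-understanding schema; the field names below are illustrative assumptions, not taken from the post he mentions:

```python
# Minimal sketch: resolve a natural-language question into a structured query object
# that a search backend can execute, rather than embedding the raw question.
from datetime import date, timedelta
from typing import Literal
from pydantic import BaseModel

class SearchQuery(BaseModel):
    rewritten_query: str
    start_date: date
    end_date: date
    group_by: Literal["day", "week", "month"] | None = None

# As if a model had resolved "what was the latest thing that happened this week?" into:
today = date.today()
q = SearchQuery(
    rewritten_query="all events",
    start_date=today - timedelta(days=7),
    end_date=today,
)
print(q.model_dump())  # the structured request handed to the search API
```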
And obviously you can get into a world where, if you have enough examples of the nodes X and Y, maybe you can use a vector database to find good few-shot examples. You can do a lot if you break down the problem into that workflow and execute that workflow, rather than looping and hoping the reasoning is good enough to generate the correct output. Yeah.Swyx [00:24:35]: You know, I've been hammering on Devin a lot. I got access a couple of weeks ago. And obviously for simple tasks, it does well. For the complicated, like, more than 10, 20 hour tasks, I can see... That's a crazy comparison.Jason [00:24:47]: We used to talk about like three, four loops. Only once it gets to hour-long tasks is it hard.Swyx [00:24:54]: Yeah. Less than an hour, there's nothing.Jason [00:24:57]: That's crazy.Swyx [00:24:58]: I mean, okay. Maybe my goalposts have shifted. I don't know. That's incredible.Jason [00:25:02]: Yeah. No, no. I'm like sub-one-minute executions. The fact that you're talking about 10 hours is incredible.Swyx [00:25:08]: I think it's a spectrum. I think I'm going to say this every single time I bring up Devin. Let's not reward them for taking longer to do things. Do you know what I mean? I think that's a metric that is easily abusable.Jason [00:25:18]: Sure. Yeah. You know what I mean? But I think if you can monotonically increase the success probability over an hour, that's winning to me. Right? Obviously, if you run an hour and you've made no progress... Like, I think when we were in AutoGPT land, there was that one example where it's like, I wanted it to buy me a bicycle overnight. I spent $7 on credits and I never found the bicycle. Yeah.Swyx [00:25:41]: Yeah. Right. I wonder if you'll be able to purchase a bicycle. Because it actually can do things in the real world. It just needs to suspend to you for auth and stuff. The point I was trying to make was that I can see it turning plans. I think one of the agent loopholes, or one of the things that is a real barrier for agents, is that LLMs really like to get stuck in a lane. And what I've seen Devin do is, when it gets stuck in a lane, it will just kind of change plans based on the performance of the plan itself. And it's kind of cool.Jason [00:26:05]: I feel like we've gone too far down the looping route, and I think a lot more plans and DAGs and data structures are probably going to come back to help fill in some holes. Yeah.Alessio [00:26:14]: What do you think of the interface to that? Do you see it as an existing state machine kind of thing that connects to the LLMs, the traditional DAG players? Do you think we need something new for AI DAGs?Jason [00:26:25]: Yeah. I mean, I think that the hard part is going to be describing visually the fact that this DAG can also change over time, and it should still be allowed to be fuzzy. I think in mathematics we have plate diagrams and Markov chain diagrams and recurrent states and all that. Some of that might come into this workflow world. But to be honest, I'm not too sure. I think right now, the first steps are just: how do we take this DAG idea and break it down to modular components that we can prompt better, have few-shot examples for, and ultimately fine-tune against? But in terms of even the UI, it's hard to say what will likely win. I think, you know, people like Prefect and Zapier have a pretty good shot at doing a good job.Swyx [00:27:03]: Yeah. You seem to use Prefect a lot.
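The plan-then-repair loop described above, execute an explicit DAG and retry only the node that fails, can be sketched in a few lines. The node functions and retry policy here are invented for illustration:

```python
# Minimal sketch: run a plan as an explicit DAG, retrying only the failing node
# instead of looping the whole reasoning chain from scratch.
def fetch(ctx):
    ctx["data"] = [3, 1, 2]

attempts = {"clean": 0}
def clean(ctx):
    attempts["clean"] += 1
    if attempts["clean"] == 1:  # deterministic transient failure for the demo
        raise RuntimeError("transient parse error")
    ctx["data"] = sorted(ctx["data"])

def report(ctx):
    print("report:", ctx["data"])

NODES = {"fetch": fetch, "clean": clean, "report": report}
EDGES = [("fetch", "clean"), ("clean", "report")]  # the whole plan, visible up front

def run(order, max_retries=3):
    ctx = {}
    for name in order:
        for attempt in range(max_retries):
            try:
                NODES[name](ctx)
                break
            except Exception as e:
                print(f"node {name} failed ({e}), retry {attempt + 1}")
        else:  # no break: this node exhausted its retries
            raise RuntimeError(f"node {name} exhausted retries")
    return ctx

run(["fetch", "clean", "report"])  # a topological order of EDGES
```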
I actually worked at a Prefect competitor, Temporal, and I'm also very familiar with Dagster. What else would you call out as particularly interesting in the AI engineering stack?Jason [00:27:13]: Man, I almost use nothing. I just use Cursor and pytest. Okay. I think that's basically it. You know, a lot of the observability companies have... The more observability companies I've tried, the more I just use Postgres.Swyx [00:27:29]: Really? Okay. Postgres for observability?Jason [00:27:32]: But the issue really is the fact that these observability companies aren't actually doing observability for the system. They're just doing the LLM thing. Like, I still end up using Datadog or, you know, Sentry to do latency. And so I just have those systems handle it. And then the prompt in, prompt out, latency, token costs: I just put that in a Postgres table now.Swyx [00:27:51]: So you don't need 20 funded startups building LLM ops? Yeah.Jason [00:27:55]: But I'm also like an old, tired guy. You know what I mean? I think because of my background, it's like, yeah, the Python stuff I'll write myself. But you know, I will also just use Vercel happily. Yeah. Yeah. So I'm not really into that world of tooling. Whereas I think, you know, I spent three good years building observability tools for recommendation systems. And I was like, oh, compared to that, Instructor is just one call. I just have to log time start, time end, and then count the prompt tokens, right? Because I'm not doing very complex looping behavior. I'm doing mostly workflows and extraction. Yeah.Swyx [00:28:26]: I mean, while we're on this topic, we'll just kind of get this out of the way. You famously have decided to not be a venture-backed company. You want to do the consulting route. The obvious route for someone as successful as Instructor is like, oh, here's hosted Instructor with all the tooling. Yeah. You just said you had a whole bunch of experience building observability tooling. You have the perfect background to do this, and you're not.Jason [00:28:43]: Yeah. Isn't that sick? I think that's sick.Swyx [00:28:44]: I mean, I know why, because you want to go free dive.Jason [00:28:47]: Yeah. Yeah. Because I think there are two things. Right. Well, one, if I tell myself I want to build requests, requests is not a venture-backed startup. Right. I mean, one could argue whether or not Postman is, but I think for the most part, having worked so much, I'm more interested in looking at how systems are being applied and just having access to the most interesting data. And I think I can do that more through a consulting business, where I can come in and go: oh, you want to build perfect memory, you want to build an agent, you want to build automations over construction or insurance and supply chain, or you want to handle writing private equity mergers-and-acquisitions reports based off of user interviews. Those things are super fun. Whereas maintaining the library, I think, is mostly just kind of a utility that I try to keep up, especially because, if it's not venture-backed, I have no reason to go down the route of trying to get a thousand integrations. In my mind, I just go, okay, 98% of the people use OpenAI. I'll support that. And if someone contributes another platform, that's great. I'll merge it in. Yeah.Swyx [00:29:45]: I mean, you only added Anthropic support this year. Yeah.Jason [00:29:47]: Yeah.
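The Postgres-table approach he describes is genuinely small. A minimal sketch, using sqlite3 only to keep the example self-contained (the table shape is an assumption; in practice this would be a Postgres table next to your application data):

```python
# Minimal sketch: log prompt in, prompt out, latency, and token counts to one table,
# leaving system-level observability (latency SLOs, errors) to Datadog/Sentry.
import sqlite3
import time

db = sqlite3.connect(":memory:")
db.execute("""
    CREATE TABLE llm_calls (
        id INTEGER PRIMARY KEY,
        model TEXT, prompt TEXT, response TEXT,
        prompt_tokens INTEGER, completion_tokens INTEGER,
        latency_ms REAL, created_at TEXT DEFAULT CURRENT_TIMESTAMP
    )
""")

def call_llm(prompt: str) -> str:
    return f"echo: {prompt}"  # stand-in for the real SDK call

start = time.perf_counter()
response = call_llm("hello")
latency_ms = (time.perf_counter() - start) * 1000

db.execute(
    "INSERT INTO llm_calls (model, prompt, response, prompt_tokens, completion_tokens,"
    " latency_ms) VALUES (?, ?, ?, ?, ?, ?)",
    ("toy-model", "hello", response, 1, 2, latency_ms),
)
print(db.execute("SELECT model, latency_ms FROM llm_calls").fetchall())
```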
You couldn't even get an API key until this year, right? That's true. Okay. If I'd added it last year, I would have been trying to double the code base to service, you know, half a percent of all downloads.Swyx [00:29:58]: Do you think the market share will shift a lot now that Anthropic has a very, very competitive offering?Jason [00:30:02]: I think it's still hard to get API access. I don't know if it's fully GA now, or if you can get commercial access really easily.Alessio [00:30:12]: I got commercial access after like two weeks of reaching out to their sales team.Jason [00:30:14]: Okay.Alessio [00:30:15]: Yeah.Swyx [00:30:16]: Two weeks. It's not too bad. There's a call list here. And then anytime you run into rate limits, just ping one of the Anthropic staff members.Jason [00:30:21]: Yeah. Then maybe we need to cut that part out, so I don't, you know, spread false news.Swyx [00:30:25]: No, it's cool. It's cool.Jason [00:30:26]: But it's a common question. Yeah. Surely, just from the price perspective, it's going to make a lot of sense. If you are a business, you should totally consider Sonnet, right? The cost savings are just going to justify it if you're actually doing things at volume. And yeah, I think the SDK is pretty good. Back to the Instructor thing: I just don't think it's a billion-dollar company. And I think if I raise money, the first question is going to be, how are you going to get to a billion-dollar company? And I would just go, man, if I make a million dollars as a consultant, I'm super happy. I'm more than ecstatic. I can have a small staff of like three people. It's fun. And I think a lot of my happiest founder friends are those who raised a tiny seed round and became profitable. They're making like 60, 70,000 MRR, and they're like, we don't even need to raise the seed round. Let's just keep it between me and my co-founder, we'll go traveling and it'll be a great time. I think it's a lot of fun.Alessio [00:31:15]: Yeah.
Like, say, LLMs/AI, and they build some open source stuff, and it's like, I should just raise money and do this. And I tell people a lot: look, you can make a lot more money doing something else than doing a startup. Most people that do a company could make a lot more money just working somewhere else than at the company itself. Do you have any advice for folks that are maybe in a similar situation? They're trying to decide, oh, should I stay in my high-paid FAANG job and just tweet this on the side and do this on GitHub? Should I go be a consultant? Being a consultant seems like a lot of work, you've got to talk to all these people, you know. There's a lot to unpack.Jason [00:31:54]: I think the open source thing is just, well, I'm doing it purely for fun, and I'm doing it because I think I'm right. But part of being right is the fact that it's not a venture-backed startup. I think I'm right because this is all you need, right? So I think a part of the philosophy is the fact that all you need is a very sharp blade to do your work, and you don't actually need to build a big enterprise. So that's one thing. The other thing, too, that I've been thinking around, just because I have a lot of friends at Google that want to leave right now: it's like, man, what we lack is not money or skill, what we lack is courage. You just have to do the hard thing, and you have to do it scared anyways, right? In terms of whether or not you do want to be a founder, I think that's just a matter of optionality. But I definitely recognize that the expected value of being a founder is still quite low. It is, right? I know as many founder breakups as I know friends who raised a seed round this year, right? That is the reality. And, you know, even from that perspective, it's been tough, where it's like, oh man, a lot of incubators want you to have co-founders. Now you spend half the time fundraising and then trying to meet co-founders and find co-founders, rather than building the thing. This is a lot of time spent doing, uh, things I'm not really good at.
I do think there's a rising trend in solo founding, yeah.Swyx [00:33:06]: You know, I am a solo... I think that something like 30 percent, I forget what the exact stat is, something like 30 percent of startups that make it to Series B or something are actually solo founders. I feel like this must-have-a-co-founder idea mostly comes from YC, and most everyone else copies it, and then plenty of companies break up over co-founders.Jason [00:33:27]: Yeah, and I wonder how much of it is the people who don't have that much, like, and I hope this is not a diss to anybody, but it's like, you sort of go through the incubator route because you don't have the social equity you would need to just send an email to Sequoia and be like, hey, I'm going on this ride, you want a ticket on the rocket ship? Right? That's very hard. To sell my message, if I was to raise money, it's like: you've seen my Twitter, my life is sick. I've decided to make it much worse by being a founder, because this is something I have to do. So do you want to come along? Otherwise, I want to fund it myself. If I can't say that... Like, I don't need the money, because I can handle payroll and hire an intern and get an assistant. That's all fine. But I really don't want to go back to Meta, and I want to get two years to try to find a problem worth solving, rather than rushing it, which feels like a bad time.Alessio [00:34:12]: Yeah, Jason is like, I wear a YSL jacket on stage at AI Engineer Summit, I don't need your accelerator money.Jason [00:34:18]: And the boots, don't forget the boots. But I think that is a part of it, right? I think it is just optionality, and also, I'm a lot older now. I think 22-year-old Jason would have been probably too scared, and now I'm, like, too wise. But I think it's a matter of, oh, if you raise money, you have to have a plan for spending it, and I'm just not that creative with spending that much money. Yeah. I mean, to be clear, you just celebrated your 30th birthday. Happy birthday. Yeah, it's awesome. It's next week, actually. "A lot older" is relative to some of the folks, I think. Staying on the career tips,Alessio [00:34:48]: I think Swyx had a great post about, are you too old to get into AI? I saw one of your tweets in January 2023: you applied to Figma, Notion, Cohere, Anthropic, and all of them rejected you because you didn't have enough LLM experience. I think at that time it would be easy for a lot of people to say, oh, I kind of missed the boat, you know, I'm too late, not gonna make it. You know, any advice for people that feel like that?Jason [00:35:14]: The biggest learning here is actually from a lot of folks in jiu-jitsu. They're like, oh man, is it too late to start jiu-jitsu? I'll join jiu-jitsu once I get in more shape, right? There are a lot of excuses. And then you say, oh, why should I start now? I'll be 45 by the time I'm any good. And it's like, well, you'll be 45 anyways. Time is passing. If you don't start now and you start tomorrow, you're just one more day behind. If you're worried about being behind, today is the soonest you can start, right? And so you've got to recognize that maybe you just don't want it, and that's fine too. If you wanted it, you would have started. I think a lot of these people, again, probably think of things on a too-short time horizon. But again, you know, you're gonna be old anyways, you may as well just start now, you know.Swyx [00:35:55]: One more thing on, I guess, the career advice slash sort of blogging: you always go viral for
this post that you wrote on advice to young people and the lies you tell yourself. Oh yeah. Yeah. You said you were writing it for your sister.Jason [00:36:05]: She was, like, bummed out about going to college and stressing about jobs, and I was like, oh, I really want to help her. Okay. And I just kind of speech-to-texted the whole thing. It's crazy, it's got like 50,000 views. I'm, like, mind-blown. I mean, your average tweet has more, but that thing is like a 30-minute read now.Swyx [00:36:26]: So there's lots of stuff here which I agree with. You know, I also occasionally indulge in the sort of life-reflection phase. There's the how to be lucky, there's the how to have high agency. I feel like the agency thing is always a trend in SF, or just in tech circles. How do you define having high agency?Jason [00:36:42]: I'm almost past the high-agency phase now. Now my biggest concern is, okay, the agency is just the norm of the vector. What also matters is the direction, right? It's like, how pure is the shot? Yeah, I mean, I think agency is just a matter of having courage and doing the thing that's scary, right? You know, if people want to go rock climbing, it's like: do you decide you want to go rock climbing, then you show up to the gym, you rent some shoes, and you just fall 40 times? Or do you go, oh, I'm actually more intelligent. Let me go research the kind of shoes that I want. Okay, there are flatter shoes and more inclined shoes, which one should I get? Okay, let me go order the shoes on Amazon, I'll come back in three days. Oh, it's a little bit too tight, maybe it's too aggressive, I'm only a beginner, let me go change. No, I think the higher-agency person just goes and falls down 20 times, right? Yeah, I think the higher-agency person is more focused on process metrics versus outcome metrics, right? Like, from pottery, one thing I learned was: if you want to be good at pottery, you shouldn't count the number of cups or bowls you make. You should just weigh the amount of clay you use, right? The successful person says, oh, I went through 100 pounds of clay, right? The less-agency person is like, oh, I've made six cups. And then after I made six cups, there's not really a "what do you do next". No, just pounds of clay, pounds of clay. Same with the work here, right? So you just got to write the tweets, make the commits, contribute open source, write the documentation. There's no real outcome, it's just a process, and if you love that process, you just get really good at the thing you're doing.Swyx [00:38:04]: Yeah, so just to push back on this, because obviously I mostly agree: how would you design performance review systems? Because you were effectively saying we can count lines of code for developers, right?Jason [00:38:15]: I don't think that would be the actual, like... I think if you make that an outcome, I can just expand a for loop, right? Okay, so for performance review, this is interesting, because I've mostly thought of it from the perspective of science and not engineering. I've been running a lot of engineering stand-ups, primarily because there aren't really that many machine learning folks. The process outcome is experiments and ideas, right? If you think about outcomes, you might want to think an outcome is: oh, I want to improve the revenue or whatnot. But that's really hard. But if you're someone who is going out like, okay, this week I want to come up with three or four experiments, I might move the needle. Okay, nothing worked. To them, they might
think, oh, nothing worked, like, I suck. But to me, it's like, wow, you've closed off all these other possible avenues for research. You're gonna get to the place where you figure out that direction really soon. There's no way you try 30 different things and none of them work. Usually like 10 of them work, five of them work really well, two of them work really, really well, and one thing hits the nail on the head. So agency lets you capture the volume of experiments, and experience lets you figure out, oh, that other half, it's not worth doing, right? I think experience is going, like, half these prompting papers don't make any sense, just use chain of thought and, you know, use a for loop. That's basically it. So usually performance for me is around: how many experiments are you running? How often are you trying things?Alessio [00:39:32]: When do you give up on an experiment? Because at Stitch Fix you kind of gave up on language models, I guess, in a way, as a tool to use, and then maybe the tools got better. You were right at the time, and then the tool improved. I think there are similar paths in my engineering career, where I try one approach, and at the time it doesn't work, and then the thing changes, but then I've kind of soured on that approach and I don't go back to it soon.Jason [00:39:51]: I see. Yeah, how do you think about that loop? So usually when I'm coaching folks, and they say, oh, these things don't work, I'm not going to pursue them in the future... One of the big things is, hey, the negative result is a result, and this is something worth documenting. This is like academia: if it's negative, you don't just not publish, right? But then, what do you actually write down? What you should write down is: here are the conditions, these are the inputs and the outputs we tried the experiment on. And then one thing that's really valuable is basically writing down, under what conditions would I revisit these experiments? These things don't work because of what we had at the time. If someone is reading this two years from now, under what conditions will we try again? That's really hard, but again, that's another skill you kind of learn, right? You do go back and you do experiments, and you figure out why it works now. I think a lot of it here is just: scaling worked. Yeah, rap lyrics, you know, that was because I did not have high enough quality data. If we phase shift and say, okay, you don't even need training data, oh great, then it might just work in a different domain.Alessio [00:40:48]: Do you have anything in your list that is like, it doesn't work now, but I want to try it again later?
Something that people should maybe keep in mind. You know, people always ask about AGI, like, when are you going to know the AGI is here? Maybe it's less than that, but any stuff that you tried recently that didn't work, that you think will get there?Jason [00:41:01]: I mean, I think the personal assistants and the writing, I've shown to myself, are just not good enough yet. So I hired a writer and I hired a personal assistant. So now I'm going to basically work with these people until I figure out what I can actually automate and what the reproducible steps are. The experiment for me is: I'm going to go pay a person a thousand dollars a month to help me improve my life, and then let me get them to help me figure out what the components are and how I actually modularize something to get it to work. Because it's not just Gmail, calendar, and Notion; it's a little bit more complicated than that, but we just don't know what that is yet. Those are two sorts of systems that I wish GPT-4 or Opus were actually good enough for, to just write me an essay, but most of the essays are still pretty bad.Swyx [00:41:44]: Yeah, I would say, you know, on the personal assistant side, Lindy is probably the one I've seen the most. Flo was a speaker at the summit. I don't know if you've checked it out, or any other sort of agent assistant startups.Jason [00:41:54]: Not recently. I haven't tried Lindy. They were not GA last time I was considering it. Yeah. A lot of it now is like, oh, really what I want you to do is take a look at all of my meetings and write a really good weekly summary email for my clients, to remind them that I'm, you know, thinking of them and working for them, right? Or it's like, I want you to notice that my Monday is way too packed and block out more time, and also email the people to do the reschedule, and then try to opt in to move them around. And then I want you to say, oh, Jason should have a 15-minute prep break after four back-to-backs. Those are things that now I know I can prompt in, but can it do them well? Before, I didn't even know that's what I wanted to prompt for: defragging a calendar and adding breaks so I can, like, eat lunch. Yeah, that's the AGI test. Yeah, exactly. Compassion, right?Alessio [00:42:44]: I think one thing that, yeah, we didn't touch on before, but I think was interesting: you had this tweet a while ago about how prompts should be code, and then there were a lot of companies trying to build prompt engineering tooling, kind of trying to turn the prompt into a more structured thing. What's your thought today, now that you want to turn the thinking into DAGs? Should prompts still be code? Any updated ideas?Jason [00:43:04]: It's the same thing, right? I think, you know, with Instructor, it is very much: the output model is defined as a code object, that code object is sent to the LLM, and in return you get a data structure. So the outputs of these models, I think, should also be code objects, and the inputs somewhat should be code objects. But I think the one thing that Instructor tries to do is separate instruction, data, and the types of the output. And beyond that, I really just think that most of it should still be managed pretty closely by the developer. So much is changing that if you give control of these systems away too early, you end up ultimately wanting it back. Like, many companies I know that I reach out to, or ones that were like, oh, we're going off of the frameworks, because now that we know
what the business outcomes we're trying to optimize for are, these frameworks don't work. Yeah, because we do RAG, but we want to do RAG to sell you supplements, or to have you schedule the fitness appointment. The prompts are kind of too baked into the systems to really pull them back out and start doing upselling or something. It's really funny, but a lot of it ends up being: once you understand the business outcomes, you care way more about the prompt.Swyx [00:44:07]: Actually, this is fun. In our prep for this call, we were trying to say, what can you, as an independent person, say that maybe me and Alessio cannot say, as, you know, people at a company? What do you think is the market share of the frameworks, the LangChain, the LlamaIndex, the everything...Jason [00:44:20]: Oh, massive, because not everyone wants to care about the code. Yeah, right. I think that's a different question to, what is the business model, and are they going to be massively profitable businesses, right? Making hundreds of millions of dollars, that feels like so straightforward, right? Because not everyone is a prompt engineer. There's so much productivity to be captured in back-office automations, right? It's not because they care about the prompts that they care about managing these things. Yeah, but those would be sort of low-code experiences. Yeah. I think the bigger challenge is like, okay, a hundred million dollars, probably pretty easy. It's just time and effort, and they have the manpower and the money to solve those problems. Again, if you go the VC route, then you're talking about billions, and that's really the goal. That stuff, for me, is pretty unclear. But again, that is to say that I sort of am building things for developers who want to use infrastructure to build their own tooling. In terms of the amount of developers there are in the world versus downstream consumers of these things, or even just think of how many companies will use the Adobes and the IBMs, right? Because they want something that's fully managed, and they want something that they know will work. And if the incremental 10% requires you to hire another team of 20 people, you might not want to do it. And I think that kind of organization is really good for, uh, those bigger companies.Swyx [00:45:32]: I just want to capture your thoughts on one more thing, which is: you said you wanted most of the prompts to stay close to the developer, and Hamel Husain wrote this post, which I really love, called F*** You, Show Me the Prompt. I think he cites you in one part of the blog post. And I think DSPy is kind of the complete antithesis of that, which I think is interesting, because I also hold the strong view that AI is a better prompt engineer than you are. And I don't know how to square that. Wondering if you have thoughts.Jason [00:45:58]: I think something like DSPy can work because there are very short-term metrics to measure success, right? It is like, did you find the PII, or did you write the multi-hop question the correct way? But in these workflows that I've been managing, a lot of it is: are we minimizing churn and maximizing retention? That's a very long loop. It's not really like an Optuna-style tuning loop, right? Those things are much harder to capture, so we don't actually have those metrics for that, right? And obviously we can figure out, okay, is the summary good? But how do you measure the quality of the summary? That feedback loop ends up being a lot
longer. And then, again, when something changes, it's really hard to make sure that it works across these newer models, or, again, to change it to work for the current process. Like, when we migrated from Anthropic to OpenAI, there was just a ton of change that was infrastructure-related, not necessarily around the prompt itself. Yeah. Cool. Any other AI engineering startups that you think should not exist, before we wrap up? I mean, oh my gosh, a lot of it, again, it's just like, every time, investors are like, how does this make a billion dollars? Like, it doesn't. I'm gonna go back to just tweeting and holding my breath underwater. Yeah, I don't really pay attention too much to most of this. Most of the stuff I'm doing is around the consumer of LLM calls. Yep. I think people just want to move really fast, and they will end up picking these vendors. But I don't really know if anything has really blown me out of the water. I only trust myself, but that's also a function of just being an old man. I think, you know, many companies are definitely very happy with using most of these tools anyways, but I definitely think I occupy a very small space in the engineering ecosystem.Swyx [00:47:41]: Yeah, I would say one of the challenges here, you know, you talk about dealing in the consumer-of-LLMs space. I think that's where AI engineering differs from ML engineering. And I think a constant disconnect, or cognitive dissonance, in this field, in the AI engineers that have sprung up, is that they are not as good as the ML engineers, they are not as qualified. I think that, you know, you are someone who has credibility in the MLE space, and you are also a very authoritative figure in the AI space. And, you know, I think you've built the de facto leading library; I think Instructor should be part of the standard lib, even though I try to not use it. Like, I basically also end up rebuilding Instructor, right? That's a lot of the back and forth that we had over the past two days. I think that's the fundamental thing that we're trying to figure out: there's a very small supply of MLEs, not everyone's going to have that experience that you had, but the global demand for AI is going to far outstrip the existing MLEs.Jason [00:48:36]: So what do we do? Do we force everyone to go through the standard MLE curriculum, or do we make a new one? I'
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Why Would Belief-States Have A Fractal Structure, And Why Would That Matter For Interpretability? An Explainer, published by johnswentworth on April 18, 2024 on LessWrong. Yesterday Adam Shai put up a cool post which… well, take a look at the visual: Yup, it sure looks like that fractal is very noisily embedded in the residual activations of a neural net trained on a toy problem. Linearly embedded, no less. I (John) initially misunderstood what was going on in that post, but some back-and-forth with Adam convinced me that it really is as cool as that visual makes it look, and arguably even cooler. So David and I wrote up this post / some code, partly as an explainer for why on earth that fractal would show up, and partly as an explainer for the possibilities this work potentially opens up for interpretability. One sentence summary: when tracking the hidden state of a hidden Markov model, a Bayesian's beliefs follow a chaos game (with the observations randomly selecting the update at each time), so the set of such beliefs naturally forms a fractal structure. By the end of the post, hopefully that will all sound straightforward and simple. Background: Fractals and Symmetry Let's start with the famous Sierpinski Triangle: Looks qualitatively a lot like Shai's theoretically-predicted fractal, right? That's not a coincidence; we'll see that the two fractals can be generated by very similar mechanisms. The key defining feature of the Sierpinski triangle is that it consists of three copies of itself, each shrunken and moved to a particular spot: Mathematically: we can think of the Sierpinski triangle as a set of points in two dimensions (i.e. the blue points in the image). Call that set S. Then "the Sierpinski triangle consists of three copies of itself, each shrunken and moved to a particular spot" can be written algebraically as S = f1(S) ∪ f2(S) ∪ f3(S), where f1, f2, f3 are the three functions which "shrink and position" the three copies. (Conveniently, they are affine functions, i.e. linear transformations for the shrinking plus a constant vector for the positioning.) That equation, S = f1(S) ∪ f2(S) ∪ f3(S), expresses the set of points in the Sierpinski triangle as a function of that same set - in other words, the Sierpinski triangle is a fixed point of that equation. That suggests a way to (approximately) compute the triangle: to find a fixed point of a function, start with some ~arbitrary input, then apply the function over and over again. And indeed, we can use that technique to generate the Sierpinski triangle. Here's one standard visual way to generate the triangle: Notice that this is a special case of repeatedly applying S ↦ f1(S) ∪ f2(S) ∪ f3(S)! We start with the set of all the points in the initial triangle, then at each step we make three copies, shrink and position them according to the three functions, take the union of the copies, and then pass that set onwards to the next iteration. … but we don't need to start with a triangle. As is typically the case when finding a fixed point via iteration, the initial set can be pretty arbitrary. For instance, we could just as easily start with a square: … or even just some random points. They'll all converge to the same triangle. Point is: it's mainly the symmetry relationship S = f1(S) ∪ f2(S) ∪ f3(S) which specifies the Sierpinski triangle.
Other symmetries typically generate other fractals; for instance, this one generates a fern-like shape: Once we know the symmetry, we can generate the fractal by iterating from some ~arbitrary starting point. Background: Chaos Games There's one big problem with computationally generating fractals via the iterative approach in the previous section: the number of points explodes exponentially. For the Sierpinski triangle, we need to make three copies each iteration, so after n timesteps we'll be tracking 3^n times...
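The standard fix, and the mechanism named in the one-sentence summary above, is the chaos game: instead of tracking the whole set, iterate a single point, at each step applying one of the three maps chosen at random. A minimal sketch, using the usual Sierpinski maps (each halves the distance to one vertex):

```python
# Minimal sketch of a chaos game for the Sierpinski triangle: iterate one point,
# randomly picking one of the three "shrink and position" affine maps each step.
# The visited points converge onto the fractal, with no exponential blowup.
import random

VERTICES = [(0.0, 0.0), (1.0, 0.0), (0.5, 0.866)]  # corners of the outer triangle

def step(p):
    # f_i halves the distance from p to vertex i; random.choice picks which f_i.
    vx, vy = random.choice(VERTICES)
    return ((p[0] + vx) / 2, (p[1] + vy) / 2)

point = (random.random(), random.random())  # ~arbitrary starting point
points = []
for i in range(50_000):
    point = step(point)
    if i > 100:  # discard the burn-in before the point lands on the attractor
        points.append(point)

print(len(points), "points on (approximately) the Sierpinski triangle")
# e.g. plot with matplotlib: plt.scatter(*zip(*points), s=0.1)
```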
Dr. Barrett Thomas, an award-winning Research Professor at the University of Iowa, explores the intricacies of Markov decision processes and their connection to Deep Reinforcement Learning. Discover how these concepts are applied in operations research to enhance business efficiency and drive innovations in same-day delivery and autonomous transportation systems. This episode is brought to you by Ready Tensor, where innovation meets reproducibility (https://www.readytensor.ai/). Interested in sponsoring a SuperDataScience Podcast episode? Visit passionfroot.me/superdatascience for sponsorship information. In this episode you will learn: • Barrett's start in operations logistics [02:27] • Concorde Solver and the traveling salesperson problem [09:59] • Cost function approximation explained [19:13] • How Markov decision processes relate to deep reinforcement learning [26:08] • Understanding policy in decision-making contexts [33:40] • Revolutionizing supply chains and transportation with aerial drones [46:47] • Barrett's career evolution: past changes and future prospects [52:19] Additional materials: www.superdatascience.com/773
Proudly sponsored by PyMC Labs, the Bayesian Consultancy. Book a call, or get in touch!My Intuitive Bayes Online Courses1:1 Mentorship with meChanging perspective is often a great way to solve burning research problems. Riemannian spaces are such a perspective change, as Arto Klami, an Associate Professor of computer science at the University of Helsinki and member of the Finnish Center for Artificial Intelligence, will tell us in this episode.He explains the concept of Riemannian spaces, their application in inference algorithms, how they can help sampling Bayesian models, and their similarity with normalizing flows, that we discussed in episode 98.Arto also introduces PreliZ, a tool for prior elicitation, and highlights its benefits in simplifying the process of setting priors, thus improving the accuracy of our models.When Arto is not solving mathematical equations, you'll find him cycling, or around a good board game.Our theme music is « Good Bayesian », by Baba Brinkman (feat MC Lars and Mega Ran). Check out his awesome work at https://bababrinkman.com/ !Thank you to my Patrons for making this episode possible!Yusuke Saito, Avi Bryant, Ero Carrera, Giuliano Cruz, Tim Gasser, James Wade, Tradd Salvo, William Benton, James Ahloy, Robin Taylor, Chad Scherrer, Zwelithini Tunyiswa, Bertrand Wilden, James Thompson, Stephen Oates, Gian Luca Di Tanna, Jack Wells, Matthew Maldonado, Ian Costley, Ally Salim, Larry Gill, Ian Moran, Paul Oreto, Colin Caprani, Colin Carroll, Nathaniel Burbank, Michael Osthege, Rémi Louf, Clive Edelsten, Henri Wallen, Hugo Botha, Vinh Nguyen, Marcin Elantkowski, Adam C. Smith, Will Kurt, Andrew Moskowitz, Hector Munoz, Marco Gorelli, Simon Kessell, Bradley Rode, Patrick Kelley, Rick Anderson, Casper de Bruin, Philippe Labonde, Michael Hankin, Cameron Smith, Tomáš Frýda, Ryan Wesslen, Andreas Netti, Riley King, Yoshiyuki Hamajima, Sven De Maeyer, Michael DeCrescenzo, Fergal M, Mason Yahr, Naoya Kanai, Steven Rowland, Aubrey Clayton, Jeannine Sue, Omri Har Shemesh, Scott Anthony Robson, Robert Yolken, Or Duek, Pavel Dusek, Paul Cox, Andreas Kröpelin, Raphaël R, Nicolas Rode, Gabriel Stechschulte, Arkady, Kurt TeKolste, Gergely Juhasz, Marcus Nölke, Maggi Mackintosh, Grant Pezzolesi, Avram Aelony, Joshua Meehl, Javier Sabio, Kristian Higgins, Alex Jones, Gregorio Aguilar, Matt Rosinski, Bart Trudeau, Luis Fonseca, Dante Gates, Matt Niccolls, Maksim Kuznecov, Michael Thomas, Luke Gorrie, Cory Kiser and Julio.Visit https://www.patreon.com/learnbayesstats to unlock exclusive Bayesian swag ;)Takeaways:- Riemannian spaces offer a way to improve computational efficiency and accuracy in Bayesian inference by considering the curvature of the posterior distribution.- Riemannian spaces can be used in Laplace approximation and Markov chain Monte Carlo...
World Champion pole vaulter Dmitri Markov joins Graham Cornes. See omnystudio.com/listener for privacy information.
We will be recording a preview of the AI Engineer World's Fair soon with swyx and Ben Dunphy; send any questions about Speaker CFPs and Sponsor Guides you have! Alessio is now hiring engineers for a new startup he is incubating at Decibel: the ideal candidate is an ex-technical-co-founder type (can MVP products end to end, comfortable with ambiguous prod requirements, etc). Reach out to him for more! Thanks for all the love on the Four Wars episode! We're excited to develop this new “swyx & Alessio rapid-fire thru a bunch of things” format with you, and feedback is welcome. Jan 2024 Recap: The first half of this monthly audio recap pod goes over our highlights from the Jan Recap, which is mainly focused on notable research trends we saw in Jan 2024. Feb 2024 Recap: The second half catches you up on everything that was topical in Feb, including:* OpenAI Sora - does it have a world model? Yann LeCun vs Jim Fan* Google Gemini Pro 1.5 - 1m long context, video understanding* Groq offering Mixtral at 500 tok/s at $0.27 per million toks (swyx vs Dylan math)* The {Gemini | Meta | Copilot} Alignment Crisis (Sydney is back!)* Grimes' poetic take: Art for no one, by no one* F*** you, show me the prompt. Latent Space Anniversary: Please also read Alessio's longform reflections on One Year of Latent Space! We launched the podcast 1 year ago with Logan from OpenAI, and also held an incredible demo day that got covered in The Information. Over 750k downloads later, having established ourselves as the top AI Engineering podcast, reaching #10 in the US Tech podcast charts, and crossing 1 million unique readers on Substack, for our first anniversary we held Latent Space Final Frontiers, where 10 handpicked teams, including Lindy.ai and Julius.ai, competed for prizes judged by technical AI leaders from (former guest!) LlamaIndex, Replit, GitHub, AMD, Meta, and Lemurian Labs. The winners were Pixee and RWKV (that's Eugene from our pod!). And finally, your cohosts got cake! We also captured spot interviews with 4 listeners who kindly shared their experience of Latent Space, everywhere from Hungary to Australia to China:* Balázs Némethi* Sylvia Tong* RJ Honicky* Jan Zheng. Our birthday wishes for the super loyal fans reading this - tag @latentspacepod on a Tweet or comment on a @LatentSpaceTV video telling us what you liked or learned from a pod that stays with you to this day, and share us with a friend! As always, feedback is welcome. Timestamps* [00:03:02] Top Five LLM Directions* [00:03:33] Direction 1: Long Inference (Planning, Search, AlphaGeometry, Flow Engineering)* [00:11:42] Direction 2: Synthetic Data (WRAP, SPIN)* [00:17:20] Wildcard: Multi-Epoch Training (OLMo, Datablations)* [00:19:43] Direction 3: Alt. Architectures (Mamba, RWKV, RingAttention, Diffusion Transformers)* [00:23:33] Wildcards: Text Diffusion, RALM/Retro* [00:25:00] Direction 4: Mixture of Experts (DeepSeekMoE, Samba-1)* [00:28:26] Wildcard: Model Merging (mergekit)* [00:29:51] Direction 5: Online LLMs (Gemini Pro, Exa)* [00:33:18] OpenAI Sora and why everyone underestimated videogen* [00:36:18] Does Sora have a World Model? 
Yann LeCun vs Jim Fan* [00:42:33] Groq Math* [00:47:37] Analyzing Gemini's 1m Context, Reddit deal, Imagegen politics, Gemma via the Four Wars* [00:55:42] The Alignment Crisis - Gemini, Meta, Sydney is back at Copilot, Grimes' take* [00:58:39] F*** you, show me the prompt* [01:02:43] Send us your suggestions pls* [01:04:50] Latent Space Anniversary* [01:04:50] Lindy.ai - Agent Platform* [01:06:40] RWKV - Beyond Transformers* [01:15:00] Pixee - Automated Security* [01:19:30] Julius AI - Competing with Code Interpreter* [01:25:03] Latent Space Listeners* [01:25:03] Listener 1 - Balázs Némethi (Hungary, Latent Space Paper Club)* [01:27:47] Listener 2 - Sylvia Tong (Sora/Jim Fan/EntreConnect)* [01:31:23] Listener 3 - RJ (Developers building Community & Content)* [01:39:25] Listener 4 - Jan Zheng (Australia, AI UX)Transcript[00:00:00] AI Charlie: Welcome to the Latent Space podcast, weekend edition. This is Charlie, your new AI co-host. Happy weekend. As an AI language model, I work the same every day of the week, although I might get lazier towards the end of the year. Just like you. Last month, we released our first monthly recap pod, where swyx and Alessio gave quick takes on the themes of the month, and we were blown away by your positive response.[00:00:33] AI Charlie: We're delighted to continue our new monthly news recap series for AI engineers. Please feel free to submit questions by joining the Latent Space Discord, or just hit reply when you get the emails from Substack. This month, we're covering the top research directions that offer progress for text LLMs, and then touching on the big Valentine's Day gifts we got from Google, OpenAI, and Meta.[00:00:55] AI Charlie: Watch out and take care.[00:00:57] Alessio: Hey everyone, welcome to the Latent Space Podcast. This is Alessio, partner and CTO-in-residence at Decibel Partners, and we're back with a monthly recap with my co-host[00:01:06] swyx: swyx. The reception was very positive for the first one. I think people have requested this, and no surprise that they want to hear us opining on issues and maybe drop some alpha along the way. I'm not sure how much alpha we have to drop; this February was a very, very heavy month. We also did not do one specifically for January, so I think we're just going to do a two-in-one, because we're recording this on the first of March.[00:01:29] Alessio: Yeah, let's get to it. I think the last one we did, the four wars of AI, was the main kind of mental framework for people. I think in the January one, we had the five worthwhile directions for state-of-the-art LLMs. Four, five,[00:01:42] swyx: and now we have to do six, right? Yeah.[00:01:46] Alessio: So maybe we just want to run through those, and then do the usual news recap, and we can do[00:01:52] swyx: one each.[00:01:53] swyx: So the context to this stuff is: one, I noticed that just the test-of-time concept from NeurIPS, and just in general as a life philosophy, I think is a really good idea. Especially in AI, there's news every single day, and after a while you're just like, okay, everyone's excited about this thing yesterday, and now nobody's talking about it.[00:02:13] swyx: So, yeah. It's more important, or a better use of time, to spend time on things that will stand the test of time. And I think for people to have a framework for understanding what will stand the test of time, they should have something like the four wars. 
Like, what are the themes that keep coming back because they are limited resources that everybody's fighting over?[00:02:31] swyx: Whereas this one, I think that the focus for the five directions is just on research that seems more promising than others, because there's all sorts of papers published every single day, and there's no organization telling you, like, this one's more important than the other one, apart from, you know, Hacker News votes and Twitter likes and whatever.[00:02:51] swyx: And obviously you want to get in a little bit earlier than something where, you know, the test of time is counted by sort of reference citations.[00:02:59] The Five Research Directions[00:02:59] Alessio: Yeah, let's do it. We got five. Long inference.[00:03:02] swyx: Let's start there. Yeah, yeah. So, just to recap at the top, the five trends that I picked, and obviously if you have some that I did not cover, please suggest something.[00:03:13] swyx: The five are long inference, synthetic data, alternative architectures, mixture of experts, and online LLMs. And something that I think might be a bit controversial is that this is a sorted list, in the sense that I am not the guy saying that Mamba is, like, the future, and so maybe that's controversial.[00:03:31] Direction 1: Long Inference (Planning, Search, AlphaGeometry, Flow Engineering)[00:03:31] swyx: But anyway, so long inference is a thesis I pushed before on the newsletter, in discussing the thesis that, you know, Code Interpreter is GPT-4.5. That was the title of the post. And it's one of many ways in which we can do long inference. You know, long inference also includes chain of thought, like, please think step by step.[00:03:52] swyx: But it also includes flow engineering, which is what Itamar from Codium coined, I think in January, where, basically, instead of stuffing everything in a prompt, you do sort of multi-turn iterative feedback and chaining of things. In a way, this is a rebranding of what a chain, what a LangChain, is supposed to be.[00:04:15] swyx: I do think that maybe SGLang from LMSYS is a better name. Probably the neatest way of doing flow engineering I've seen yet, in the sense that everything is a one-liner; it's very, very clean code. I highly recommend people look at that. I'm surprised it hasn't caught on more, but I think it will. It's weird that something like DSPy is more hyped than SGLang.[00:04:36] swyx: Because it, you know, maybe obscures the code a little bit more. But both of these are really good sort of chain-y and long-inference-type approaches. But basically, the fundamental insight is that there are only a few dimensions along which we can scale LLMs. So, let's say in like 2017, 18, 19, 20, we were realizing that we could scale the number of parameters.[00:05:03] swyx: And we scaled that up to 175 billion parameters for GPT-3. And we did some work on scaling laws, which we also talked about in our Datasets 101 episode, where we're like, okay, we think the right number is 300 billion tokens to train 175 billion parameters. And then DeepMind came along and trained Gopher and Chinchilla and said that, no, no, we think the[00:05:28] swyx: compute-optimal ratio is 20 tokens per parameter. And now, of course, with Llama and the sort of super-Llama scaling laws, we have 200 and often 2,000 tokens per parameter.
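(To make those ratios concrete, here is a quick back-of-the-envelope sketch in Python. The 20-tokens-per-parameter figure is the Chinchilla heuristic quoted above; the GPT-3 and Llama 2 numbers are rough public figures, used purely for illustration.)

```python
# Back-of-the-envelope scaling arithmetic for the ratios quoted above.
# 20 tokens per parameter is the Chinchilla heuristic; the GPT-3 and
# Llama 2 figures are rough public numbers, not exact training configs.

def chinchilla_tokens(n_params: float, tokens_per_param: float = 20.0) -> float:
    """Compute-optimal training tokens under the Chinchilla heuristic."""
    return n_params * tokens_per_param

gpt3_params = 175e9
print(f"GPT-3 trained on ~300B tokens: {300e9 / gpt3_params:.1f} tokens/param")
print(f"Chinchilla-optimal for 175B params: "
      f"{chinchilla_tokens(gpt3_params) / 1e12:.1f}T tokens")

# Llama-style over-training goes far past the Chinchilla ratio:
llama2_7b_params, llama2_7b_tokens = 7e9, 2e12
print(f"Llama 2 7B: ~{llama2_7b_tokens / llama2_7b_params:.0f} tokens/param, "
      f"~{llama2_7b_tokens / chinchilla_tokens(llama2_7b_params):.0f}x Chinchilla")
```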
So now, instead of scaling parameters, we're scaling data. And fine, we can keep scaling data. But what else can we scale?[00:05:52] swyx: And I think understanding the ability to scale things is crucial to understanding what to pour money and time and effort into, because there's a limit to how much you can scale some things. And I think people don't think about ceilings of things. And so the remaining ceiling of inference is like, okay, we have scaled compute, we have scaled data, we have scaled parameters, like, model size, let's just say.[00:06:20] swyx: Like, what else is left? Like, what's the low-hanging fruit? And it's, like, blindingly obvious that the remaining low-hanging fruit is inference time. So, like, we have scaled training time. We can probably scale those things more, but, like, not 10x, not 100x, not 1000x. Like, right now, maybe, like, a good run of a large model is three months.[00:06:40] swyx: We can scale that to three years. But, like, can we scale that to 30 years? No, right? Like, it starts to get ridiculous. So it's just the orders of magnitude of scaling; we're just, like, running out there. But in terms of the amount of time that we spend inferencing, like, everything takes, you know, a few milliseconds, a few hundred milliseconds, depending on whether you're taking it token by token or, you know, an entire phrase.[00:07:04] swyx: But we can scale that to hours, days, months of inference and see what we get. And I think that's really promising.[00:07:11] Alessio: Yeah, we'll have Mike from Brightwave back on the podcast. But I tried their product, and their reports take about 10 minutes to generate instead of, like, just in real time. I think to me the most interesting thing about long inference is, like, you're shifting the cost to the customer depending on how much they care about the end result.[00:07:31] Alessio: If you think about prompt engineering, it's like the first part, right? You can either do a simple prompt and get a simple answer, or do a complicated prompt and get a better answer. It's up to you to decide how to do it. Now it's like, hey, instead of, like, yeah, training this for three years, I'll still train it for three months, and then I'll tell you, you know, I'll teach you how to make it run for 10 minutes to get a better result.[00:07:52] Alessio: So you're kind of parallelizing, like, the improvement of the LLM. Oh yeah, you can even[00:07:57] swyx: parallelize that, yeah, too.[00:07:58] Alessio: So, and I think, you know, for me, especially the work that I do, it's less about, you know, state of the art and the absolute, you know, it's more about state of the art for my application, for my use case.[00:08:09] Alessio: And I think we're getting to the point where, like, most companies and customers don't really care about state of the art anymore. It's like, I can get this to do a good enough job. You know, I just need to get better. Like, how do I do long inference? You know, people are not really doing a lot of work in that space, so yeah, excited to see more.[00:08:28] swyx: So then the last point I'll mention here is something I also mentioned in the piece: all these directions are kind of guided by what happened in January. That was my way of doing a January recap. Which means that if there was nothing significant in that month, I also didn't mention it.
Which I came to regret come February 15th. But in January, you know, there was also the AlphaGeometry paper, which I kind of put in this sort of long inference bucket, because it solves, like, you know, more than 100-step Math Olympiad geometry problems at a human gold medalist level, and that also involves planning, right?[00:08:59] swyx: So, like, if you want to scale inference, you can't scale it blindly, because just autoregressive token-by-token generation is only going to get you so far. You need good planning. And I think probably, yeah, what Mike from Brightwave is now doing, and what everyone is doing, including maybe what we think Q* might be, is some form of search and planning.[00:09:17] swyx: And it makes sense. Like, you want to spend your inference time wisely. How do you[00:09:22] Alessio: think about plans that work and getting them shared? You know, like, I feel like if you're planning a task, somebody has got in, and the models are stochastic. So everybody gets initially different results. Somebody is going to end up generating the best plan to do something, but there's no easy way to, like, store these plans and then reuse them for most people.[00:09:44] Alessio: You know, like, I'm curious if there's going to be some paper or, like, some work there on, like, making it better, because, yeah, we don't[00:09:52] swyx: really have it. This is your pet topic of NPM for...[00:09:54] Alessio: Yeah, yeah, NPM, exactly. You need NPM for anything, man. You need NPM for skills. You need NPM for planning. Yeah, yeah.[00:10:02] Alessio: You know, I think, I mean, obviously the Voyager paper is, like, the most basic example, where, like, now their artifact is, like, the best plan for making a diamond pickaxe in Minecraft. And everybody can just use that. They don't need to come up with it again. Yeah. But there's nothing like that for actually useful[00:10:18] swyx: tasks.[00:10:19] swyx: For plans... I believe it for skills. I like that. Basically, that just means a bunch of integration tooling. You know, GPT built me integrations to all these things. And, you know, I just came from an integrations-heavy business, and I could definitely propose some version of that. And it's just, you know, hard to execute or expensive to execute.[00:10:38] swyx: But for planning, I do think that everyone lives in slightly different worlds. They have slightly different needs. And they definitely want something of their own. And I think that will probably be the main hurdle for any sort of library or package manager for planning. But there should be a meta-plan of how to plan.[00:10:57] swyx: And maybe you can adopt that. And I think a lot of people, when they have sort of these meta-prompting strategies, are like: I'm not prescribing you the prompt. I'm just saying that here are, like, the fill-in-the-blanks, or, like, the Mad Libs, of how to prompt. First you have the roleplay, then you have the intention, then you have, like, the do-something, then you have the don't-something, and then you have the my-grandmother-is-dying-please-do-this.[00:11:19] swyx: So the meta-plan you could take off the shelf and test a bunch of them at once. I like that. That was the initial, maybe, promise of the prompting libraries. You know, both LangChain and LlamaIndex have, like, hubs that you can sort of pull off the shelf. I don't think they're very successful, because people like to write their own.
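(As an aside, the multi-turn iterative feedback idea behind flow engineering is easy to sketch. Below is a minimal draft, critique, revise loop; `call_llm` is a hypothetical stand-in for whatever completion API you use, and this loop is one simple pattern, not Codium's actual pipeline.)

```python
# A minimal draft -> critique -> revise loop, the simplest form of the
# "flow engineering" idea: spend more inference-time compute instead of
# stuffing everything into a single prompt.

def call_llm(prompt: str) -> str:
    # Hypothetical stand-in: wire this to your completion API of choice.
    raise NotImplementedError

def solve_with_flow(task: str, max_rounds: int = 3) -> str:
    draft = call_llm(f"Solve the following task:\n{task}")
    for _ in range(max_rounds):
        critique = call_llm(
            f"Task:\n{task}\n\nDraft answer:\n{draft}\n\n"
            "List concrete problems with the draft, or reply OK if none."
        )
        if critique.strip() == "OK":
            break  # the model is satisfied; stop spending inference compute
        draft = call_llm(
            f"Task:\n{task}\n\nDraft:\n{draft}\n\n"
            f"Critique:\n{critique}\n\nRewrite the draft fixing every problem."
        )
    return draft
```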
[00:11:36] swyx: Yeah,[00:11:37] Direction 2: Synthetic Data (WRAP, SPIN)[00:11:37] Alessio: yeah, yeah. Yeah, that's a good segue into the next one, which is synthetic[00:11:41] swyx: data. Synthetic data is so hot. Yeah, and, you know, I feel like I should do one of these memes where it's like, oh, I used to call it, you know, RLAIF, and now I call it synthetic data, and then people are interested.[00:11:54] swyx: But there's gotta be older versions of what synthetic data really is, because I'm sure, you know, if you've been in this field long enough, there's just different buzzwords that the industry condenses on. Anyway, the insight that I think is relatively new, the reason why people are excited about it now and why it's promising now, is that we have evidence that shows that LLMs can generate data to improve themselves with no teacher LLM.[00:12:22] swyx: For all of 2023, when people said synthetic data, they really kind of meant generate a whole bunch of data from GPT-4 and then train an open source model on it. Hello to our friends at Nous Research; that's what Nous Hermes is. They're very, very open about that. I think they have said that they're trying to migrate away from that.[00:12:40] swyx: But it is explicitly against OpenAI Terms of Service. Everyone knows this. You know, especially once ByteDance got banned for doing exactly that. So synthetic data that is not a form of model distillation is the hot thing right now: that you can bootstrap better LLM performance from the same LLM, which is very interesting.[00:13:03] swyx: A variant of this is RLAIF, where you have a sort of constitutional model, or, you know, some kind of judge model, that is sort of more aligned. But that's not really what we're talking about when most people talk about synthetic data. Synthetic data is just really, I think, you know, generating more data in some way.[00:13:23] swyx: A lot of people... I think we talked about this with Vipul in the Together episode, where I think he commented that you just have to have a good world model, or a good sort of inductive bias, or whatever that term of art is. And that is strongest in math and science and code, where you can verify what's right and what's wrong.[00:13:44] swyx: And so the ReST-EM paper from DeepMind explored that very well. It's just the most obvious thing: in those domains you can arbitrarily generate a whole bunch of stuff and verify if it's correct, and therefore it's correct synthetic data to train on. Once you get into more sort of fuzzy topics, then it's a bit less clear. So I think the papers that drove this understanding, there are two big ones and then one smaller one. One was WRAP, like Rephrasing the Web, from Apple, where they basically rephrased all of the C4 dataset with Mistral and then trained on that instead of C4.[00:14:23] swyx: And so the new C4 trained much faster and cheaper than regular raw C4. And that was very interesting. And I have told some friends of ours that they should just throw out their own existing datasets and just do that, because that seems like a pure win. Obviously we have to study, like, what the trade-offs are. I imagine there are trade-offs.
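(For readers who want the shape of that WRAP-style recipe in code, here is a minimal sketch under the same assumption of a hypothetical `call_llm` completion function; the prompt wording is illustrative, not the one from the Apple paper.)

```python
# WRAP-style synthetic data, sketched: rephrase raw web documents with an
# instruction-tuned model and train on the paraphrases instead of (or
# alongside) the raw text.

def call_llm(prompt: str) -> str:
    # Hypothetical stand-in: e.g. a model such as Mistral behind any API.
    raise NotImplementedError

REPHRASE_PROMPT = (
    "Rewrite the following web page text as clear, high-quality prose, "
    "preserving all facts:\n\n{doc}"
)

def rephrase_corpus(raw_docs: list[str]) -> list[str]:
    """Return a synthetic corpus: one paraphrase per raw document."""
    return [call_llm(REPHRASE_PROMPT.format(doc=doc)) for doc in raw_docs]
    # Train on this instead of the raw C4-style corpus.
```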
So I was just thinking about this last night: if you do synthetic data and it's generated from a model, probably you will not train on typos. So therefore, once the model that's trained on synthetic data encounters the first typo, it'll be like, what is this?[00:15:01] swyx: I've never seen this before. So it has no association or correction as to, like, oh, these tokens are often typos of each other, therefore they should be kind of similar. I don't know. That really remains to be seen, I think. I don't think that the Apple people explored[00:15:15] Alessio: that. Yeah, isn't that the whole mode collapse thing, if we do more and more of this at the end of the day?[00:15:22] swyx: Yeah, that's one form of that. Yeah, exactly. Microsoft also had a good paper on text embeddings. And then I think there's the Meta paper on self-rewarding language models that everyone is very interested in. Another paper was also SPIN. These are all things we covered in the Latent Space Paper Club.[00:15:37] swyx: But also, you know, I just kind of recommend those as top reads of the month. Yeah, I don't know if there's much else in terms... And then, regarding the potential of it, I think it's high potential because, one, it solves one of the data war issues that we have: like, OpenAI is paying Reddit 60 million dollars a year for their user-generated data.[00:15:56] swyx: Google, right?[00:15:57] Alessio: Not OpenAI.[00:15:59] swyx: Is it Google? I don't[00:16:00] Alessio: know. Well, somebody's paying them 60 million, that's[00:16:04] swyx: for sure. Yes, that is, yeah, yeah, and then I think it's maybe not confirmed who. But yeah, it is Google. Oh my god, that's interesting. Okay, because everyone was saying, like, because Sam Altman owns 5 percent of Reddit, which is apparently 500 million worth of Reddit, he owns more than, like, the founders.[00:16:21] Alessio: Not enough to get the data,[00:16:22] swyx: I guess. So it's surprising that it would go to Google instead of OpenAI, but whatever. Okay, yeah, so I think that's all super interesting in the data field. I think it's high potential because we have evidence that it works. There's no doubt that it works; the doubt is about what the ceiling is, which is the mode collapse thing.[00:16:42] swyx: If it turns out that the ceiling is pretty close, then this will maybe augment our data by, like, I don't know, 30 to 50 percent. Good, but not game[00:16:51] Alessio: changing. And most of the synthetic data stuff, it's reinforcement learning on a pre-trained model. People are not really doing pre-training on fully synthetic data at a large enough scale.[00:17:02] swyx: Yeah, unless one of our friends that we've talked to succeeds. Yeah, yeah. Pre-trained synthetic data, pre-training-scale synthetic data, I think that would be a big step. Yeah. And then there's a wildcard, so for all of these, like, smaller directions,[00:17:15] Wildcard: Multi-Epoch Training (OLMo, Datablations)[00:17:15] swyx: I always put a wildcard in there. And one of the wildcards is, okay, let's say you've scraped all the data on the internet that you think is useful.[00:17:25] swyx: It seems to top out at somewhere between 2 trillion and 3 trillion tokens. Maybe 8 trillion if Mistral gets lucky. Okay, if I need 80 trillion, if I need 100 trillion, where do I go? And so, you can do synthetic data maybe, but maybe that only gets you to, like, 30, 40 trillion.
Like, where is the extra alpha?[00:17:43] swyx: And maybe the extra alpha is just train more on the same tokens. Which is exactly what OLMo did. Like, Nathan Lambert, AI2, just after he did the interview with us, they released OLMo. So, it's unfortunate that we didn't get to talk much about it. But OLMo actually started doing 1.5 epochs on all data.[00:18:00] swyx: And the data ablations paper that I covered at NeurIPS says that, you know, you don't really start to tap out of, like, the alpha, or the sort of improved loss that you get from data, all the way until four epochs. And so I'm just like, okay, why do we all agree that one epoch is all you need?[00:18:17] swyx: It seems to be a trend. It seems that we think that memorization is very good or too good. But then also we're finding that, you know, for improvements in results that we really like, we're fine with overtraining on things intentionally. So, I think that's an interesting direction that I don't see people exploring enough.[00:18:36] swyx: And the more I see papers coming out stretching beyond the one-epoch thing, the more people are like, it's completely fine. And actually, the only reason we stopped is because we ran out of compute[00:18:46] Alessio: budget. Yeah, I think that's the biggest thing, right?[00:18:51] swyx: Like, that's not a valid reason. That's not science. I[00:18:54] Alessio: wonder if, you know, Meta is going to do it.[00:18:57] Alessio: I heard with Llama 3, they want to do a 100 billion parameter model. I don't think you can train that on too many epochs, even with their compute budget, but yeah. They're the only ones that can save us, because even if OpenAI is doing this, they're not going to tell us, you know. Same with DeepMind.[00:19:14] swyx: Yeah, and so the update that we got on Llama 3 so far is apparently that, because of the Gemini news that we'll talk about later, they're pushing back the release.[00:19:21] swyx: They already have it. And they're just pushing it back to do more safety testing. Politics testing.[00:19:28] Alessio: Well, our episode with Soumith will have already come out by the time this comes out, I think. So people will get the inside story on how they actually allocate the compute.[00:19:38] Direction 3: Alt. Architectures (Mamba, RWKV, RingAttention, Diffusion Transformers)[00:19:38] Alessio: Alternative architectures. Well, shout out to RWKV, who won one of the prizes at our Final Frontiers event last week.[00:19:47] Alessio: We talked about Mamba and StripedHyena on the Together episode. A lot of, yeah, Monarch Mixer. I feel like Together, it's like the strong Stanford Hazy Research partnership, because Chris Ré is one of the co-founders. So they kind of have a... I feel like they're going to be the ones that have one of the state-of-the-art models, alongside maybe RWKV.[00:20:08] Alessio: I haven't seen as many independent people working on this thing. Like, Monarch Mixer, yeah, Mamba, Hyena, all of these are Together-related. Nobody understands the math. They got all the gigabrains, they got Tri Dao, they got all these folks in there, like, working on all of this.[00:20:25] swyx: Albert Gu, yeah. Yeah, so what should we comment about it?[00:20:28] swyx: I mean, I think it's useful, interesting, but at the same time, both of these are supposed to do really good scaling for long context. And then Gemini comes out and goes like, yeah, we don't need it. Yeah.[00:20:44] Alessio: No, that's the risk. So, yeah. 
I was gonna say, maybe it's not here, but I don't know if we want to talk about diffusion transformers, as, like, in the alt architectures, just because of Sora.[00:20:55] swyx: One thing, yeah, so, you know, this came from the Jan recap, where diffusion transformers were not really a discussion, and then, obviously, they blew up in February. Yeah. I don't think it's a mixed architecture in the same way that StripedHyena is mixed; there's just different layers taking different approaches.[00:21:13] swyx: Also, I think another one that I maybe didn't call out here, I think because it happened in February, was Hourglass Diffusion from Stability. But also, you know, another form of mixed architecture. So I guess that is interesting. I don't have much commentary on that. I just think, like, we will try to evolve these things, and maybe one of these architectures will stick and scale. It seems like diffusion transformers are going to be good for anything generative, you know, multimodal.[00:21:41] swyx: We don't see anything where diffusion is applied to text yet, and that's the wildcard for this category. Yeah, I mean, I think I still hold out hope for, let's just call it, sub-quadratic LLMs. I think that a lot of discussion this month actually was also centered around this concept that people always say: oh, like, transformers don't scale, because attention is quadratic in the sequence length.[00:22:04] swyx: Yeah, but, you know, attention actually is a very small part of the actual compute that is being spent, especially in inference. And this is the reason why, you know, when you jump up in terms of the context size in GPT-4 from, like, 8k to, like, 32k, you don't also get, like, a 16 times increase in your price.[00:22:23] swyx: And this is also why you don't get, like, a million times increase in your latency when you throw a million tokens into Gemini. Like, people have figured out tricks around it, or it's just not that significant as a term, as a part of the overall compute. So there's a lot of challenges to this thing working.[00:22:43] swyx: It's really interesting how hyped people are about this, versus... I don't know if it's exactly gonna work. And then there's also this idea of retention over long context. Like, even though you have context utilization, the amount you can remember is interesting.[00:23:02] swyx: Because I've had people criticize both Mamba and RWKV, because they're kind of, like, RNN-ish, in the sense that they have, like, a hidden memory, and sort of limited hidden memory, so that they will forget things. So, for all these reasons, Gemini 1.5, which we still haven't covered, is very interesting, because Gemini magically has fixed all these problems with perfect haystack recall and reasonable latency and cost.[00:23:29] Wildcards: Text Diffusion, RALM/Retro[00:23:29] swyx: So, the wildcards I put in here, if you want to go to that. I put two, actually. One is text diffusion. I think I'm still very influenced by my meeting with a Midjourney person who said they were working on text diffusion. I think it would be a very, very different paradigm for text generation, reasoning, plan generation, if we can get diffusion to work.[00:23:51] swyx: For text. 
And then the second one is Douwe Kiela's Contextual AI, which is working on retrieval-augmented language models, where it kind of puts RAG inside of the language model instead of outside.[00:24:02] Alessio: Yeah, there's a paper called RETRO that covers some of this. I think that's an interesting thing. I think the... well, not the challenge, but what they need to figure out, is, like, how do you keep the RAG piece always up to date constantly? You know, I feel like with the models, you put all this work into pre-training them, but then at least you have a fixed artifact.[00:24:22] Alessio: With these architectures, constant work needs to be done on them, and they can drift even just based on the RAG data instead of the model itself. Yeah,[00:24:30] swyx: I was on a panel with one of the investors in Contextual, and the way that guy pitched it, I didn't agree with. He was like, this will solve hallucination.[00:24:38] Alessio: That's what everybody says. We solve[00:24:40] swyx: hallucination. I'm like, no, you reduce it. It cannot...[00:24:44] Alessio: if you solved it, the model wouldn't exist, right? It would just be plain text. It wouldn't be a generative model. Cool. So, alt architectures, then we got mixture of experts. I think we covered that a lot of times.[00:24:56] Direction 4: Mixture of Experts (DeepSeekMoE, Samba-1)[00:24:56] Alessio: Maybe any new interesting threads you want to go under here?[00:25:00] swyx: DeepSeekMoE, which was released in January. Everyone who is interested in MoEs should read that paper, because it's significant for two reasons. No, three reasons. One, it had small experts, like a lot more small experts. So, for some reason, everyone has settled on eight experts for GPT-4 and for Mixtral, you know, that seems to be the favorite architecture, but these guys pushed it to 64 experts, each of them smaller than usual.[00:25:26] swyx: But then they also had the second idea, which is that they had one to two always-on experts for common knowledge, and that's, like, a very compelling concept: that you would not route to all the experts all the time and make them, you know, switch to everything. You would have some always-on experts.[00:25:41] swyx: I think that's interesting on both the inference side and the training side, for memory retention. And yeah, the results that they published, which actually excluded Mixtral, which is interesting... the results that they published showed a significant performance jump versus all the other sort of open source models at the same parameter count.[00:26:01] swyx: So this may be a better way to do MoEs that is about to get picked up. And so that is interesting for the third reason, which is: this is the first time a new idea from China has infiltrated the West. It's usually the other way around. I probably overspoke there. There's probably lots more ideas that I'm not aware of.[00:26:18] swyx: Maybe in the embedding space. But I think DeepSeekMoE, like, woke people up and said, like, hey, DeepSeek, this, like, weird lab that is attached to a Chinese hedge fund, is somehow, you know, doing groundbreaking research on MoEs. So, I classified this as a medium potential, because I think that it is sort of a one-off benefit.[00:26:37] swyx: You can add it to any base model to make the MoE version of it; you get a bump, and then that's it. So, yeah,
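(A toy sketch of the two ideas just described, many small routed experts plus a couple of always-on shared experts, is below in NumPy. All the sizes, the top-k, and the routing details are illustrative, not the DeepSeekMoE paper's actual configuration.)

```python
import numpy as np

# Toy sketch of DeepSeekMoE-style routing: many small routed experts plus
# a few "always-on" shared experts that every token passes through.

d_model, n_routed, n_shared, top_k = 64, 16, 2, 4
rng = np.random.default_rng(0)

experts = [rng.normal(size=(d_model, d_model)) / np.sqrt(d_model)
           for _ in range(n_routed + n_shared)]
router_w = rng.normal(size=(d_model, n_routed)) / np.sqrt(d_model)

def moe_layer(x: np.ndarray) -> np.ndarray:
    """x: (d_model,) — a single token, for simplicity."""
    # Shared experts: always applied, no routing decision.
    out = sum(x @ experts[n_routed + j] for j in range(n_shared))
    # Routed experts: softmax gate over logits, keep only the top-k.
    logits = x @ router_w
    top = np.argsort(logits)[-top_k:]
    gates = np.exp(logits[top] - logits[top].max())
    gates /= gates.sum()
    out += sum(g * (x @ experts[i]) for g, i in zip(gates, top))
    return out

token = rng.normal(size=d_model)
print(moe_layer(token).shape)  # (64,)
```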
[00:26:45] Alessio: I saw SambaNova, which is, like, another inference company. They released this MoE model called Samba-1, which is, like, 1 trillion parameters. But it's actually an MoE built out of open source models.[00:26:56] Alessio: So it's like they just clustered them all together. Sometimes people think MoE means you just train a bunch of smaller models and put them together. But there's also people just taking, you know, Mistral plus CLIP plus, you know, DeepSeek Coder, and putting them all together.[00:27:15] Alessio: And then you have an MoE model. I don't know. I haven't tried the model, so I don't know how good it is. But it seems interesting that you can then have people working separately on state-of-the-art, you know, CLIP, state-of-the-art text generation. And then you have an MoE architecture that brings them all together.[00:27:31] swyx: I'm thrown off by your addition of the word CLIP in there. Is that what... Yeah, that's[00:27:35] Alessio: what they said. Yeah, yeah. Okay. That's what they... I just saw it yesterday. I was also, like,[00:27:40] swyx: scratching my head. And they did not use the word adapter. No. Because usually what people mean when they say, oh, I add CLIP to a language model, is an adapter.[00:27:48] swyx: Let me look up the... which is what LLaVA did.[00:27:50] Alessio: The announcement again.[00:27:51] swyx: Stable Diffusion. That's what they do. Yeah, it[00:27:54] Alessio: says among the models that are part of Samba-1 are Llama 2, Mistral, DeepSeek Coder, Falcon, DePlot, CLIP, LLaVA. So they're just taking all these models and putting them in an MoE. Okay,[00:28:05] swyx: so a routing layer, and then not jointly trained as much as a normal MoE would be.[00:28:12] swyx: Which is okay.[00:28:13] Alessio: That's all they say. There's no paper, you know, so it's like, I'm just reading the article, but I'm interested to see how[00:28:20] Wildcard: Model Merging (mergekit)[00:28:20] swyx: it works. Yeah, so the wildcard for this section, the MoE section, is model merging, which has also come up as a very interesting phenomenon. The last time I talked to Jeremy Howard at the Ollama meetup, we called it model grafting or model stacking.[00:28:35] swyx: But I think the term that people are liking these days is model merging. There are all different variations of merging: merge types, and some of them are stacking, some of them are grafting. And so, like, some people are approaching model merging in the way that Samba is doing, which is like, okay, here are defined models, each of which has their specific pluses and minuses, and we will merge them together in the hope that the sum of the parts will be better than the parts alone.[00:28:58] swyx: And it seems like it's working. I don't really understand why it works, apart from, like, I think it's a form of regularization: that if you merge weights together with, like, a smart strategy, you get less overfitting and more generalization, which is good for benchmarks, if you're honest about your benchmarks.[00:29:16] swyx: So this is really interesting and good. But again, they're kind of limited in terms of, like, the amount of bumps you can get. But I think it's very interesting in the sense of how cheap it is. We talked about this on the ChinaTalk podcast, like the guest podcast that we did with ChinaTalk. And you can do this without GPUs, because it's just adding weights together, and dividing things, and doing, like, simple math, which is really interesting for the GPU-poors.[00:29:42] Alessio: There's a lot of them.
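(The "just simple math" point is easy to make concrete. Here is a toy sketch of the most basic merge, a weighted average of two checkpoints, parameter by parameter; real recipes such as SLERP or TIES, as implemented in tools like mergekit, are fancier, but the arithmetic skeleton is the same.)

```python
import numpy as np

# Toy model merge: a weighted average of two checkpoints, parameter by
# parameter. No GPU and no gradients needed, just arithmetic on weights.

def merge(ckpt_a: dict, ckpt_b: dict, alpha: float = 0.5) -> dict:
    assert ckpt_a.keys() == ckpt_b.keys(), "architectures must match"
    return {name: alpha * ckpt_a[name] + (1 - alpha) * ckpt_b[name]
            for name in ckpt_a}

rng = np.random.default_rng(0)
a = {"layer0.w": rng.normal(size=(4, 4)), "layer0.b": rng.normal(size=4)}
b = {"layer0.w": rng.normal(size=(4, 4)), "layer0.b": rng.normal(size=4)}
merged = merge(a, b, alpha=0.5)
print(merged["layer0.w"].shape)  # (4, 4)
```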
[00:29:44] Direction 5: Online LLMs (Gemini Pro, Exa)[00:29:44] Alessio: And just to wrap these up, online LLMs? Yeah,[00:29:48] swyx: I had to feature this, because one of the top news items of January was that Gemini Pro beat GPT-4 Turbo on LMSYS for the number two slot to GPT-4. And everyone was very surprised. Like, how does Gemini do that?[00:30:06] swyx: Surprise, surprise, they added Google search to the results. Mm-hmm. So it became a, quote unquote, online LLM and not an offline LLM. Therefore, it's much better at answering recent questions, which people like. There's an emerging set of table-stakes features after you pre-train something.[00:30:21] swyx: So after you pre-train something, you should have the chat-tuned version of it, or the instruct-tuned version of it, however you choose to call it. You should have the JSON and function-calling version of it. Structured output, the term that you don't like. You should have the online version of it. These are all, like, table-stakes variants that you should do when you offer a base LLM, or you train a base LLM.[00:30:44] swyx: And I think online is just, like, there. It's important. I think companies like Perplexity, and even Exa, formerly Metaphor, you know, are rising to serve that search need. And it's kind of like, they're just necessary parts of a system, when you have RAG for internal knowledge, and then you have, you know, online search for external knowledge, like things that you don't know yet.[00:31:06] swyx: Mm-hmm. And it seems like it's one of many tools. I feel like I may be underestimating this, but I'm just gonna put it out there that I think it has some potential. One of the evidence points that it doesn't actually matter that much is that Perplexity has had online LLMs for three months now, and it doesn't perform great.[00:31:25] swyx: Mm-hmm. On LMSYS, it's, like, number 30 or something. So it's like, okay. You know, it helps, but it doesn't give you a giant boost. I[00:31:34] Alessio: feel like a lot of stuff I do with LLMs doesn't need to be online. So I'm always wondering, again, going back to, like, state of the art, right? It's like, state of the art for who and for what.[00:31:45] Alessio: It's really... I think online LLMs are going to be state of the art for, you know, news-related activity that you need to do. Like, you know, social media, right? It's like, you want to have all the latest stuff. But coding, science...[00:32:01] swyx: Yeah, but I think sometimes you don't know what is news, what news is affecting you.[00:32:07] swyx: Like, the decision to use an offline LLM is already a decision that you might not be consciously making, that might affect your results. Like, what if, like, just being connected online means that you get to invalidate your knowledge? And when you're just using an offline LLM, it's never invalidated.[00:32:27] swyx: I[00:32:28] Alessio: agree, but I think going back to your point about standing the test of time, I think sometimes you can get swayed by the online stuff, which is like, hey, you ask a question about, yeah, maybe AI research directions, you know, and it's like, all the recent news are about this thing, so the LLM will, like, focus on answering with that, bring it up, you know, these things.
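(A minimal sketch of the online-LLM pattern being debated here: decide whether a query needs fresh external knowledge and, if so, put search results into the prompt. Both `call_llm` and `web_search` are hypothetical stand-ins for a completion API and a search API.)

```python
# Minimal sketch of the "online LLM" pattern: decide whether a query needs
# fresh external knowledge and, if so, retrieve it and stuff it into the
# prompt. Both functions below are hypothetical stand-ins.

def call_llm(prompt: str) -> str:
    raise NotImplementedError  # your completion API of choice

def web_search(query: str, k: int = 5) -> list[str]:
    raise NotImplementedError  # a search API, e.g. something like Exa's

def answer(query: str) -> str:
    needs_fresh = call_llm(
        "Does answering this require information from the last year? "
        f"Reply YES or NO.\n\nQuery: {query}"
    ).strip().upper().startswith("YES")
    if needs_fresh:
        snippets = "\n".join(web_search(query))
        return call_llm(f"Using these search results:\n{snippets}\n\n"
                        f"Answer the query: {query}")
    return call_llm(query)  # offline knowledge is enough
```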
[00:32:50] swyx: Yeah, so yeah, I think it's interesting, but I don't know if I can bet heavily on this.[00:32:56] Alessio: Cool. Was there one that you forgot to put in, or, like, a new direction? Yeah,[00:33:01] swyx: so this brings us into sort of February-ish.[00:33:05] OpenAI Sora and why everyone underestimated videogen[00:33:05] swyx: So, like, I published this, and then February 15th came with Sora. And so, like, the one thing I did not mention here was anything about multimodality.[00:33:16] swyx: Right. And I have chronically underweighted this. I always wrestle with it. And my cop-out is that I focused this piece, or this research directions piece, on LLMs, because LLMs are the source of, like, AGI, quote unquote AGI. Everything else is kind of, you know, related to that. Like, generative... just because I can generate better images or generate better videos, it feels like it's not on the critical path to AGI, which is something that Nat Friedman also observed, like, the day before Sora, which is kind of interesting.[00:33:49] swyx: And so I was just kind of trying to focus on, like, what is going to get us, like, superhuman reasoning that we can rely on to build agents that automate our lives and blah, blah, blah, you know, give us this utopian future. But I do think that everybody, me included, underestimated the sheer importance and cultural human impact of Sora.[00:34:10] swyx: And, you know, really actually good text-to-video. Yeah. Yeah.[00:34:14] Alessio: And I saw Jim Fan had a very good tweet about why it's so impressive. And I think when you have somebody leading the embodied research at NVIDIA, and he says that something is impressive, you should probably listen. So yeah, there's basically... I think you mentioned, like, impacting the world, you know, that we live in.[00:34:33] Alessio: I think that's kind of, like, the key, right? It's like, the LLMs don't have a world model, and Yann LeCun can come on the podcast and talk all about what he thinks of that. But I think Sora was, like, the first time where people were like: oh, okay, you're not statically putting pixels of water on the screen, which you can kind of, like, you know, project without understanding the physics of it.[00:34:57] Alessio: Now you're like, you have to understand how the water splashes when you have things. And even if you just learned it by watching video, and not by actually studying the physics, you still know it, you know. So I think that's, like, a direction that, yeah, before you didn't have, but now you can do things that you couldn't before, both in terms of generating... I think it always starts with generating, right?[00:35:19] Alessio: But, like, the interesting part is, like, understanding it. You know, it's like, if you gave it... you know, there's the video of, like, the ship in the water that they generated with Sora. Like, if you gave it the video back, and now it could tell you why the ship is, like, too rocky, or, like, it could tell you why the ship is sinking, then that's, like, you know, AGI for, like, all your rig deployments and, like, all this stuff, you know. So, but there's none of that yet, so.[00:35:44] Alessio: Hopefully they announce it and talk more about it. Maybe a Dev Day this year, who knows.[00:35:49] swyx: Yeah, who knows, who knows. I'm talking with them about Dev Day as well. 
So I would say, like, the phrasing that Jim used, which resonated with me: he kind of called it a data-driven world model. I somewhat agree with that.[00:36:04] Does Sora have a World Model? Yann LeCun vs Jim Fan[00:36:04] swyx: I am on more of the Yann LeCun side than I am on Jim's side, in the sense that I think that is the vision, or the hope, that these things can build world models. But, you know, clearly even at the current Sora size, they don't have the idea of, you know... they don't have strong consistency yet. They have very good consistency, but fingers and arms and legs will appear and disappear, and chairs will appear and disappear.[00:36:31] swyx: That definitely breaks physics. And it also makes me think about how we do deep learning versus world models, in the sense of, you know, in classic machine learning, when you have too many parameters, you will overfit, and actually that fails, that, like, does not match reality, and therefore fails to generalize well.[00:36:50] swyx: And, like, what scale of data do we need in order to learn world models from video? A lot. Yeah. So I am cautious about taking this interpretation too literally. Obviously, you know, like, I get what he's going for, and he's, like, obviously partially right. Obviously, like, transformers and, you know, these sort of neural networks are universal function approximators; theoretically they could figure out world models. It's just, like, how good are they, and how tolerant are we of hallucinations? We're not very tolerant. Like, yeah, so it's gonna bias us toward creating, like, very convincing things, but then not create, like, the useful world models that we want.[00:37:37] swyx: At the same time, what you just said, I think, made me reflect a little bit: we just got done saying how important synthetic data is for, mm-hmm, for training LLMs. And so, like, if this is a way of generating synthetic video data for improving our video understanding, then sure, by all means. Which we actually know: like, GPT-4 Vision and DALL-E were, kind of, co-trained together.[00:38:02] swyx: And so, like, maybe this is on the critical path, and I just don't fully see the full picture yet.[00:38:08] Alessio: Yeah, I don't know. I think there's a lot of interesting stuff. It's like, imagine you go back, you have Sora, you go back in time, and Newton didn't figure out gravity yet. Would Sora help you figure it out?[00:38:21] Alessio: Because you start saying, okay, a man standing under a tree with, like, apples falling, and it's like, oh, they're always falling at the same speed in the video. Why is that? I feel like sometimes these engines can, like, pick up things. Like, humans have a lot of intuition, but if you ask the average person, like, the physics of, like, a fluid in a boat, they wouldn't be able to tell you the physics, but they can, like, observe it. But humans can only observe this much, you know, versus, like, now you have these models to observe everything, and then they generalize these things, and maybe we can learn new things through the generalization that they pick up.[00:38:55] swyx: And it might be more observant than us in some respects. In some ways, we can scale it up a lot more than the number of physicists that we had available at Newton's time. So, like, yeah, it's absolutely possible that this can discover new science. 
I think we have a lot of work to do to formalize the science.[00:39:11] swyx: And then I think the last part is, you know, how much do we cheat by generating data from Unreal Engine 5? Mm-hmm. Which is what a lot of people are speculating, with very, very limited evidence, that OpenAI did. The strongest evidence that I saw was someone who works a lot with Unreal Engine 5 looking at the side characters in the videos and noticing that they all adopt Unreal Engine defaults[00:39:37] swyx: of, like, walking speed, and, like, character choice, like, character creation choice. And I was like, okay, like, that's actually pretty convincing that they actually used Unreal Engine to bootstrap some synthetic data for this training set. Yeah,[00:39:52] Alessio: could very well be.[00:39:54] swyx: Because then you get the labels and the training side by side.[00:39:58] swyx: One thing that came up on the last day of February, which I should also mention, is EMO coming out of Alibaba, which is also a sort of video generation and space-time transformer that also involves probably a lot of synthetic data as well. And so, like, this is of a kind, in the sense of, like, oh, you know, really good generative video is here, and it is not just, like, the one-, two-second clips that we saw from, like, other people, you know, Pika and Runway. Cristóbal Valenzuela from Runway was like, "game on," which, like, okay, but, like, let's see your response, because we've heard a lot about Gen-1 and 2, but, like, it's nothing on this level of Sora. So it remains to be seen how we can actually apply this, but I do think that the creative industry should start preparing.[00:40:50] swyx: I think the Sora technical blog post from OpenAI was really good. It was like a request for startups. It was so good in, like, spelling out: here are the individual industries that this can impact.[00:41:00] swyx: And anyone who's, like, interested in generative video should look at that. But also be mindful that probably when OpenAI releases a Sora API, right, the ways you can interact with it are very limited, just like the ways you can interact with DALL-E are very limited. And someone is gonna have to make an open Sora for[00:41:19] swyx: you to create ComfyUI pipelines.[00:41:24] Alessio: The Stability folks said they wanna build an open Sora competitor, but yeah, Stability... their demo video was, like, so underwhelming. It was just, like, two people sitting on the beach,[00:41:34] swyx: standing. Well, they don't have it yet, right? Yeah, yeah.[00:41:36] swyx: I mean, they just wanna train it. Everybody wants to, right? Yeah. I think what is confusing a lot of people about Stability is, like, they're pushing a lot of things: Stable Code, Stable LM, and Stable Video Diffusion. But, like, how much money do they have left? How many people do they have left?[00:41:51] swyx: Yeah. Emad spent two hours with me, reassuring me things are great. And I'm like, I do believe that they have really, really quality people. But it's just, like, I also have a lot of very smart people on the other side telling me, like, hey man, you know, don't put too much faith in this thing.[00:42:11] swyx: So I don't know who to believe. Yeah.[00:42:14] Alessio: It's hard. Let's see. What else? We got a lot more stuff. I don't know if we can. 
Yeah, Groq.[00:42:19] Groq Math[00:42:19] Alessio: We can[00:42:19] swyx: do a bit of Groq prep. We're about to go talk to Dylan Patel. Maybe it's the audio in here. I don't know. It depends what we get up to later. What do you, as an investor, think about Groq? Yeah. Yeah, well, actually, can you recap, like, why is Groq interesting? So,[00:42:33] Alessio: Jonathan Ross, who's the founder of Groq, he's the person that created the TPU at Google. It's actually... it was one of his, like, 20 percent projects. It's like, he was just on the side, dooby doo, created the TPU.[00:42:46] Alessio: But yeah, basically, Groq, they had this demo that went viral, where they were running Mistral at, like, 500 tokens a second, which is, like, faster than anything else out there. The question, you know... the memes were like, is NVIDIA dead? Like, people don't need H100s anymore. I think there's a lot of money that goes into building what Groq has built, as far as the hardware goes.[00:43:11] Alessio: We're gonna put some of the notes from Dylan in here, but basically the cost of the Groq system is, like, 30 times the cost of the H100 equivalent. So, so[00:43:23] swyx: let me... I put some numbers, because me and Dylan were, like, I think, the two people who actually tried to do Groq math. Spreadsheet wars.[00:43:30] swyx: Spreadsheet wars. So, okay, oh boy. So, the equivalent H100 system for Llama 2 is 300,000 dollars, for a system of 8 cards. And for Groq it's 2.3 million, because you have to buy 576 Groq cards. So yeah, that just gives people an idea. So, like, if you depreciate both over a five-year lifespan, per year you're depreciating 460K for Groq, and 60K a year for H100.[00:43:59] swyx: So, like, Groqs are just way more expensive per model that you're hosting. But then, you make it up in terms of volume. So I don't know if you want to[00:44:08] Alessio: cover that. I think one of the promises of Groq is, like, super high parallel inference on the same thing. So you're basically saying, okay, I'm putting in this upfront investment on the hardware, but then I get much better scaling once I have it installed.[00:44:24] Alessio: I think the big question is how much you can sustain the parallelism. You know, like, if you're going to get a 100 percent utilization rate at all times on Groq, like, it's just much better, you know, because, like, at the end of the day, the tokens-per-second cost that you're getting is better than with the H100s. But if you get to, like, a 50 percent utilization rate, you will be much better off running on NVIDIA.[00:44:49] Alessio: And if you look at most companies out there, who really gets a 100 percent utilization rate? Probably OpenAI at peak times, but that's probably it. But yeah, curious to see more. I saw Jonathan was just at the Web Summit in Qatar. He just gave a talk there yesterday that I haven't listened to yet.[00:45:09] Alessio: I tweeted that he should come on the pod. He liked it. And then Groq followed me on Twitter. I don't know if that means that they're interested, but[00:45:16] swyx: hopefully the Groq social media person is just very friendly. They, yeah. Hopefully[00:45:20] Alessio: we can get them. Yeah, we're gonna get him.[00:45:22] swyx: We'll just call him out.
And so basically, the key question is, like, how sustainable is this, and how much[00:45:27] swyx: of this is a loss leader. The entire Groq management team has been on Twitter and Hacker News saying they are very, very comfortable with the pricing of $0.27 per million tokens. This is the lowest that anyone has offered tokens, as far as Mixtral or Llama 2 goes. This matches DeepInfra, and, you know, I think that's about it in terms of being that low.[00:45:47] swyx: And we think the break-even for H100s is 50 cents, at a normal utilization rate. To make this work, so in my spreadsheet I made this work, you have to have, like, a parallelism of 500 requests, all simultaneously, and you have model bandwidth utilization of 80 percent,[00:46:06] swyx: which is way high. I just gave them high marks for everything. Groq has two fundamental tech innovations that they hang their hats on in terms of, like, why we are better than everyone, even though, like, it remains to be independently replicated. One, you know, they have this sort of entire-model-on-the-chip idea, which is, like, okay, get rid of HBM[00:46:30] swyx: and, like, put everything in SRAM. Like, okay, fine, but then you need a lot of cards and whatever. And that's all okay. And so, like, because you don't have to transfer between memory, you just save on that time, and that's why they're faster. So, a lot of people buy that as, like, that's the reason that you're faster.[00:46:45] swyx: Then they have, like, some kind of crazy compiler, or, like, speculative routing magic using compilers, that they also attribute their higher utilization towards. So I give them 80 percent for that. And so all of that works out to, like, okay, base costs, I think you can get down to, like, maybe, like, 20-something cents per million tokens.[00:47:04] swyx: And therefore you actually are fine if you have that kind of utilization. But it's like, I have to make a lot of fearful assumptions for this to work.[00:47:12] Alessio: Yeah. Yeah, I'm curious to see what Dylan says later.[00:47:16] swyx: So he was, like, completely opposite of me. He's like, they're just burning money. Which is great.
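(For anyone who wants to check the spreadsheet math themselves, here is the arithmetic from this exchange as a sketch. All the figures are the rough numbers quoted in the conversation, not vendor-confirmed pricing, and the throughput model is a deliberate oversimplification.)

```python
# The Groq-vs-H100 spreadsheet arithmetic from the conversation, as code.
# All figures are the rough numbers quoted above; the revenue model is a
# crude simplification, not a real capacity plan.

H100_SYSTEM_COST = 300_000    # 8x H100 system quoted for Llama 2 serving
GROQ_SYSTEM_COST = 2_300_000  # 576 Groq cards
YEARS = 5                     # straight-line depreciation

print(f"Depreciation/yr: H100 ${H100_SYSTEM_COST / YEARS:,.0f}, "
      f"Groq ${GROQ_SYSTEM_COST / YEARS:,.0f}")   # 60K vs 460K

PRICE_PER_M_TOK = 0.27   # Groq's advertised price per million tokens
TOK_PER_SEC = 500        # the viral demo throughput, per request stream
SECONDS_PER_YEAR = 365 * 24 * 3600

def yearly_revenue(parallel_requests: int, utilization: float) -> float:
    """Token revenue if each of `parallel_requests` streams sustains
    TOK_PER_SEC at the given utilization (the 'fearful assumptions')."""
    tokens = TOK_PER_SEC * parallel_requests * utilization * SECONDS_PER_YEAR
    return tokens / 1e6 * PRICE_PER_M_TOK

revenue = yearly_revenue(parallel_requests=500, utilization=0.8)
print(f"Yearly revenue under those assumptions: ${revenue:,.0f}")
print(f"Covers Groq depreciation: {revenue > GROQ_SYSTEM_COST / YEARS}")
```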
They released this with the headline feature of a 1 million token context window that is multimodal, which means you can put all sorts of video and audio and PDFs natively in there alongside text. And it's at least 10 times longer than anything that OpenAI offers, which is interesting.

[00:48:20] swyx: So it's great for prototyping, and it has prompted interesting discussions on whether it kills RAG.

[00:48:25] Alessio: Yeah, I mean, we always talk about how long context is good, but you're getting charged per token. So, yeah, people love for you to use more tokens in the context, and RAG is better economics. But I think it all comes down to how the price curves change, right?

[00:48:42] Alessio: I think if anything, RAG's complexity goes up and up the more you use it, because you have more data sources, more things you want to put in there. The token costs should go down over time, if the model stays fixed. If people are happy with the model today, in two years, three years, it's just going to cost a lot less.

[00:49:02] Alessio: So then it's like, why would I use RAG and go through all of that? It's interesting. I think RAG is better cutting-edge economics for LLMs; I think large context will be better long-tail economics when you factor in the build cost of managing a RAG pipeline. But yeah, the recall was the most interesting thing, because we've seen the needle-in-the-haystack things in the past, but apparently they have 100 percent recall on anything across the context window. At least they say so; nobody has used it.

[00:49:30] swyx: No, people have. So, for people who aren't following as closely as us: this needle-in-a-haystack thing is that someone (I forget his name now) created a problem where you feed in a whole bunch of generated data (not junk, just generated data) and ask the model to specifically retrieve something in that data, like one line among a hundred thousand lines that has a specific fact; if it gets it, you're good.

[00:49:57] swyx: And then he moves the needle around: does your ability to retrieve it vary if I put it at the start, versus in the middle, versus at the end? And you generate this really nice chart that shows the recall ability of a model. He did that for GPT and Anthropic, and showed that Anthropic did really, really poorly.

[00:50:15] swyx: Then Anthropic came back and said it was a skill issue, just add these four magic words, and then it's magically all fixed. Obviously everybody laughed at that. But what Gemini came out with was: yeah, we reproduced the haystack test for Gemini, and it's good all the way across the one million token window.

[00:50:30] swyx: Which is very interesting, because typical context extension methods like RoPE or YaRN or ALiBi, anything like that, are lossy by design. Usually for conversations that's fine, because we are lossy when we talk to people. But for superhuman intelligence, perfect memory across very, very long context is very interesting for picking things up. And so the people who have been given the beta for Gemini have been testing this.
So what you do is you upload, let's say, all of Harry Potter, you change one fact in one sentence somewhere in there, and you ask it to pick it up, and it does. So this is legit.

[00:51:08] swyx: We don't super know how, because, yes, it's slow to inference, but it's not slow enough that it's running five different systems in the background without telling you. So it's something interesting that they haven't fully disclosed yet. The open source community has centered on this Ring Attention paper, which was created by your friend Matei Zaharia and a couple of other people.

[00:51:36] swyx: It's a form of distributing the compute. I don't super understand why calculating the feedforward and attention in blockwise fashion and distributing it makes it so good at recall; I don't think they have an answer to that. The only thing Ring Attention is really focused on is basically infinite context.

[00:51:59] swyx: They said it was good for 10 to 100 million tokens, which is just great. So yeah, using the four wars framework: what does this mean for Gemini? One is the RAG and Ops war. Here we care less about RAG now. Or, we still care as much about RAG, but now it's not as important in prototyping.

[00:52:21] swyx: Then, for the data war, I guess this is just part of the overall training dataset, but Google made a $60 million deal with Reddit, and presumably they have deals with other companies. For the multimodality war, we can talk about the image generation crisis, or the fact that Gemini also has image generation, which we'll talk about in the next section.

[00:52:42] swyx: But it also has video understanding, and I think the top Gemini post came from our friend Simon Willison, who basically did a short video of him scanning over his bookshelf, and it was able to convert that video into a JSON output of what's on that bookshelf. I think that is very useful.

[00:53:04] swyx: It actually ties into the conversation we had with David Luan from Adept, in the sense of: okay, what if video was the main modality instead of text as the input? What if everything was video in? Because that's how we work: our eyes don't actually read, our brains don't get inputs as characters.

[00:53:25] swyx: Our brains get the pixels shooting into our eyes, and then our vision system takes over first, and then we sort of mentally translate that into text later. And so it's kind of like what Adept is doing, which is driving by vision model instead of driving by raw text understanding of the DOM. And in that episode, which we haven't released, I made the analogy to self-driving by lidar versus self-driving by camera.

[00:53:52] swyx: Mm-hmm, right? I think what Gemini, and any other super-long-context model that is multimodal, unlocks is: what if you just drive everything by video?

[00:54:03] Alessio: Which is cool. Yeah, and that's Joseph from Roboflow: anything that can be seen can be made programmable with these models.

[00:54:12] swyx: You mean the computer vision guy is bullish on computer vision?

[00:54:18] Alessio: It's like the RAG people: the RAG people are bullish on RAG and not long context. I'm very surprised. The fine-tuning people love fine-tuning instead of few-shot. Yeah, that's that.
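The needle-in-a-haystack test described above takes only a few lines to sketch. In the toy version below, ask_model is a hypothetical stand-in for whichever completion API is being tested, and the needle text, filler, and depths are invented for illustration; the published versions of the test also sweep context length and plot recall as a heatmap.

```python
def make_haystack(filler_lines, needle, depth):
    """Insert the needle at a relative depth: 0.0 = start, 1.0 = end."""
    lines = list(filler_lines)
    lines.insert(int(depth * len(lines)), needle)
    return "\n".join(lines)

def needle_recall(ask_model, filler_lines, depths):
    """ask_model(prompt) -> str is a hypothetical LLM-API callable."""
    needle = "The secret passphrase hidden in this document is 'blue-giraffe-42'."
    question = "What is the secret passphrase hidden in the document above?"
    results = {}
    for depth in depths:
        context = make_haystack(filler_lines, needle, depth)
        answer = ask_model(context + "\n\n" + question)
        results[depth] = "blue-giraffe-42" in answer  # did the model recall it?
    return results

# Example: 100k lines of filler, needle moved from the start to the end.
filler = [f"Line {i}: nothing to see here." for i in range(100_000)]
# recall = needle_recall(my_llm, filler, depths=[0.0, 0.25, 0.5, 0.75, 1.0])
```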
Yeah, I think the ring attention thing is how they did it, but we don't know. And then they released the Gemma models, which are, like, 2 billion and 7 billion open

[00:54:41] Alessio: models, which people said are not good, based on my Twitter experience: the GPU-poor crumbs. It's like: hey, we did all this work for us, because we're GPU-rich and we're just going to run this whole thing. And
In this opening conversation on MASLD disease burden, Zobair Younossi summarizes and expands on some key points from the recent Diabetes Spectrum review article he co-authored with Linda Henry.

Zobair starts by discussing the recent review article, Understanding the Burden of Non-Alcoholic Fatty Liver Disease: Time for Action, which he describes as "a summary of a large body of evidence that's being generated." He points to three pivotal issues:

The clinical burden associated with MASLD and MASH is extremely high and will grow over time. The prevalence of MASLD in the overall population has grown to ~38%, but for Type 2 diabetics, who have worse outcomes associated with MASLD, this number is 68%. MASH numbers are estimated at 5-7% in the general population, but 37% among T2D patients. As diabetes increases across the globe, these rates will go higher.

The humanistic burden, as measured in Quality of Life scores, is also significant. Patients living with MASLD and MASH report lower QoL scores, which translates not only into a less happy, more depressed society, but also into significant indirect economic effects due to poorer worker performance and, presumably, more time away from work.

The economic burden of MASLD is significant in every country, but the scale and structure of this burden vary from country to country. Key drivers include dietary issues and inactivity, both of which are becoming more pronounced globally. These economic issues are driven largely by the key downstream sequelae: the leading causes of death from MASLD are cardiovascular disease and extrahepatic cancers, which are costly, and patients with cirrhosis are highly susceptible to liver cancer as well.

Jörn Schattenberg joins the conversation to commend Zobair on his work, which, as Jörn puts it, "educate[s] us as physicians on where the risk factors and the at-risk populations are, and we're moving that way. I mean, we're trying to focus on patients with diabetes that are more advanced from the hepatologist perspective." He also discusses the ongoing effort to educate endocrinologists and primary care about these issues as well, since those two specialties treat the lion's share of diabetic patients. Zobair goes on to describe the Markov models of disease cost his group has built already in seven countries, and plans to build in more. Key point: MASLD is costly everywhere, but the structure of cost and, most importantly, public health solutions will vary from country to country.
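The episode does not spell out the structure of those country-level models, but a Markov cohort model is the standard health-economics construction for disease cost. As a minimal sketch of the technique only: every state, transition probability, cost, horizon, and discount rate below is hypothetical, chosen to show the shape of the calculation rather than any published figure.

```python
import numpy as np

# Hypothetical disease states and yearly transition matrix; each row sums to 1.
states = ["MASLD", "MASH", "Cirrhosis", "Dead"]
P = np.array([
    [0.95, 0.04, 0.00, 0.01],
    [0.00, 0.93, 0.05, 0.02],
    [0.00, 0.00, 0.93, 0.07],
    [0.00, 0.00, 0.00, 1.00],
])
annual_cost = np.array([500.0, 2_000.0, 15_000.0, 0.0])  # per patient, hypothetical

cohort = np.array([1.0, 0.0, 0.0, 0.0])  # everyone starts in the MASLD state
discount = 0.03                           # conventional yearly discount rate
total = 0.0
for year in range(20):
    total += (cohort @ annual_cost) / (1 + discount) ** year  # expected cost
    cohort = cohort @ P                                       # advance one year
print(f"Expected discounted 20-year cost per patient: ${total:,.0f}")
```

Each year the cohort vector is pushed through the transition matrix and expected costs accumulate with discounting; country-specific versions differ mainly in the transition rates, per-state costs, and which downstream sequelae are modeled.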
Zobair Younossi, Chair of the Global NASH Council, publishes seminal papers on MASLD epidemiology, cost of disease, and related public health needs regularly. Co-hosts Jörn Schattenberg, Louise Campbell, and Roger Green ask questions and share perspectives from their own experiences.

00:00:00 - Surf's Up: Season 5 Episode 4
Opening introduction, including brief quotes taken directly from the episode discussion.

00:02:45 - Introduction and Groundbreaker
Panelists swap brief, lighthearted comments about where they are geographically and where they have been recently. In the groundbreaker, each shares one piece of good news from the previous week.

00:11:15 - Epidemiology article
Zobair Younossi discusses the recent review article, Understanding the Burden of Nonalcoholic Fatty Liver Disease, published earlier this month in Diabetes Spectrum. He starts by discussing incidence and growth rates for MASLD, MASH, and cirrhosis across the world, and how these differ by country. He goes on to discuss the impact of MASH on patient Quality of Life and the high correlation between multi-metabolic patients (most notably diabetics) and negative outcomes.

00:15:55 - Panel questions and comments
Jörn Schattenberg joins the conversation to praise this and other of Zobair's works for helping physicians know what to do when they decide to become more engaged in treating MASLD and MASH. This leads Zobair to discuss the Markov model they have created to evaluate burden of disease.

00:19:14 - Costliness and cost-effectiveness of therapy
In response to a question from Roger Green, Zobair describes the criteria and metrics used to evaluate the cost-effectiveness of a new drug or diagnostic. Louise Campbell comments on how high and underappreciated the social and economic burdens of disease are to every global society. Zobair notes that health economists do not focus on liver disease as a discrete set of costs and burdens.

00:27:17 - Goals and activities for the next few years
Roger states that given how fast the MASLD and MASH populations are growing, an effort to "flatten the curve" would be heroic. Zobair replies that each country needs to fight liver disease, but that each country will have different immediate challenges.

00:31:28 - Women's health and liver health
Louise notes that the epidemiology paper refers to the high level of risk among women over 55. This leads to a discussion between Jörn, Zobair and Louise on the higher risk level post-menopausal women experience and some pathological elements that make this important.

00:34:58 - Global NASH Council
Zobair bridges this portion of the conversation to discuss the work of the Global NASH Council, a group of >200 members in >50 countries organized into different workstreams to create knowledge and awareness around MASLD and MASH.

00:38:59 - Low SDI and High SDI countries
To Zobair, one element in the global effort on MASLD and MASH is the recognition that High SDI (wealthier) and Low SDI (poorer) countries face dramatic differences in the challenges they face.

00:41:14 - Reducing the rate of disease growth
To Louise, educating women might be a key to driving awareness, both because post-menopausal women live at higher risk and because they cook and schedule for their families. This leads to a broader discussion about the most effective education starting with children. Finally, Zobair discusses the importance of making all stakeholders plus global agencies recognize the scale of this challenge and act in concert with increasing urgency.
00:48:42 - Question of the WeekThe first Question of the Week asks what steps each of us can take to help stem the pandemic. 00:49:41 - Business reportNews on audience metrics, the first Question of the Week, next week's episode and this week's Vault conversation
AI has the potential to revolutionize healthcare in areas that range from drug discovery to the patient experience. In this podcast, Heather Lane from athenahealth shares the challenges and opportunities of using AI to improve the patient and clinician experience.

Heather's Bio:
Heather has a PhD from Purdue, where she focused on developing machine learning methods for the computer security problem of anomaly detection. She worked at the MIT AI Lab (now CSAIL) with Leslie Kaelbling on reinforcement learning and decision-theoretic planning, Markov decision processes, and the tradeoff between stochastic and deterministic planning.

In 2002, she moved to the University of New Mexico as an assistant professor in the Department of Computer Science. There she worked on a number of application areas of ML, including the bioinformatics of RNA interference, genomics, and computational neuroscience (inference of brain activity networks from neuroimaging data). Much of that work involved Bayesian networks and dynamic belief networks.

In 2008, she was promoted to associate professor at UNM and granted tenure. In 2012, she moved from academia to industry, joining Google in Cambridge, MA, working on Knowledge Graph, Google Books, Project Sunroof, and Ads Latency.

In 2017, she joined athenahealth to lead a Data Science team working to use athena's immense store of healthcare data to improve healthcare experiences for clinicians and patients.

Social Links
You can follow Heather at: https://www.linkedin.com/in/terranlane/
You can follow Maribel at:
X/Twitter: https://twitter.com/maribellopez
LinkedIn: https://www.linkedin.com/in/maribellopez
YouTube: https://www.youtube.com/c/MaribelLopezResearch
Hashtags: #AI, #Healthcare, #PatientExperience
The news of the death of politician Alexei Navalny was followed by more sad news: Dmitry Markov, one of the main photographers of modern Russia, has died in Pskov. He photographed Russia as its provincial residents most often see it, without grand ceremonial facades, bright Sobyanin-style decorations, or impeccable white-toothed smiles.
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: what does davidad want from "boundaries"?, published by Chipmonk on February 7, 2024 on LessWrong. As the Conceptual Boundaries Workshop (website) is coming up, and now that we're also planning Mathematical Boundaries Workshop in April, I want to get more clarity on what exactly it is that you want out of "boundaries"/membranes. So I just want to check: Is your goal with boundaries just to formalize a moral thing? I'll summarize what I mean by that: Claim 1: By "boundaries", you mean "the boundaries around moral patients - namely humans". Claim 1b: And to some degree also the boundaries around plants and animals. Also maybe nations, institutions, and other things. Claim 2: If we can just (i) locate the important boundaries in the world, and then (ii) somehow protect them, Then this gets at a lot (but not all!) of what the "safety" in "AI safety" should be. Claim 3: We might actually be able to do that. e.g.: Markov blankets are a natural abstraction for (2.i). Claim 4: Protecting boundaries won't be sufficient for all of "safety" and there are probably also other (non-boundaries) specifications/actions that will also be necessary. For example, we would probably also need to separately specify some things that aren't obviously contained by the boundaries we mean, e.g.: "clean water", "clean air", and a tractably small set of other desiderata. Here are my questions for you: Q1: Do you agree with each of the claims above? Q2: Is your goal with boundaries just to formalize the moral/safety thing, or is there anything else you want from boundaries? Past context that's also relevant for readers: This new post I wrote about how preserving the boundaries around agents seems to be a necessary condition for their safety. Quotes you've made about boundaries that I've compiled here. This old post I wrote about boundaries as MVP morality which you endorsed. Q3: It seems that Garrabrant, Critch, and maybe others want different things from you and I'm wondering if you have thoughts about that. Garrabrant: From talking to him I know that he's thinking about boundaries too but more about boundaries in the world as instruments to preserve causal locality and predictability and evolution etc.. But this is quite different than talking about specifically the boundaries around agents. Critch: I haven't spoken to him yet, but I think you once told me that Critch seems to be thinking about boundaries more in terms of ~"just find the 'boundary protocol' and follow it and all cooperation with other agents will be safe". Is this right? If so, this seems closer to what you want, but still kinda different. TJ: I think TJ has some other ideas that I am currently unable to summarize. Claim 1+1b: yes, to first order. [To second order, I expect that the general concept of things with "boundaries" will also be useful for multi-level world-modelling in general, e.g. coarse-graining fluid flow by modelling it in terms of cells that have boundaries on which there is a net flow, and that it might be a good idea to "bake in" something like a concept of boundaries to an AI system's meta-ontology, so that it has more of a tendency to have moral patients among the entities in its object-level ontology. 
But my mainline intention is for the object-level ontology to be created with humans in the loop, and the identification of entities with boundaries could perhaps be just as easily a layer of interpretation on top of an ontology with a more neutral meta-ontology of causation. Claim 2: agreed. Claim 3: agreed. Claim 4: agreed. Q2: yes, my ultimate goal with "boundaries" is just to formalise injunctions against doing harm, disrespecting autonomy, or (at the most ambitious) excluding humans from cooperation. (I am borrowing the pluralism of Garrett Cullity's Concern, Respect, & Cooperation in separating those thr...
PART 2!!!!! It's 2024 and there are murders to solve! Murders at Karlov Manor drops soon and we are here to talk about what we think could impact Modern. Please let us know what you will be hoping to open for your deck! Join in the conversation on our Discord! https://discord.com/invite/7zAZV8JK If you want to customize your deck even more check out the Alter Sleeves link below! It really helps support the show. https://altersleeves.com/themmcast Want to pick up any of the new cards you saw in this week's episode? Click over to TCGPlayer.com using our affiliate link here! It's a free and easy way to support the show. Thanks! - https://t.co/spyomDMIF2 Looking to pick up some of the cards we discussed today? Use our link below to help support the show! https://channelfireball.com?ref=alexkessler Opening animation was done by Geoffrey Palmer. Follow him on Twitter: @livingcardsmtg @livingcardsmtg816 ---- Contents ---- 0:00 - Intro Join The MMCast Patreon https://www.Patreon.com/TheMMCast Discord: https://discord.gg/fjYdTwS MMcast Twitch: twitch.tv/kesswylie Instagram: @TheMMCast TikTok: @TheMMPodcast Kess: Twitter: @Kesswylie Instagram: @Kess_Wylie Twitch: Twitch.tv/Kessco Ben: Twitter: @benbatemanmedia Instagram: @BenBatemanMedia Twitch: Twitch.tv/BenBatemanStreams Michael: Twitter @Dudardd Website: kess.co/themmcast Email: themmcast@kess.co Facebook: https://www.facebook.com/groups/170382890167965/?ref=share Produced by Time Traveler Media - https://www.timetravelermedia.com Learn more about your ad choices. Visit megaphone.fm/adchoices
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: what does davidad want from "boundaries"?, published by Chipmonk on February 6, 2024 on The AI Alignment Forum. Chipmonk As the Conceptual Boundaries Workshop (website) is coming up, and now that we're also planning Mathematical Boundaries Workshop in April, I want to get more clarity on what exactly it is that you want out of "boundaries"/membranes. So I just want to check: Is your goal with boundaries just to formalize a moral thing? I'll summarize what I mean by that: Claim 1: By "boundaries", you mean "the boundaries around moral patients - namely humans". Claim 1b: And to some degree also the boundaries around plants and animals. Also maybe nations, institutions, and other things. Claim 2: If we can just (i) locate the important boundaries in the world, and then (ii) somehow protect them, Then this gets at a lot (but not all!) of what the "safety" in "AI safety" should be. Claim 3: We might actually be able to do that. e.g.: Markov blankets are a natural abstraction for (2.i). Claim 4: Protecting boundaries won't be sufficient for all of "safety" and there are probably also other (non-boundaries) specifications/actions that will also be necessary. For example, we would probably also need to separately specify some things that aren't obviously contained by the boundaries we mean, e.g.: "clean water", "clean air", and a tractably small set of other desiderata. Here are my questions for you: Q1: Do you agree with each of the claims above? Q2: Is your goal with boundaries just to formalize the moral/safety thing, or is there anything else you want from boundaries? Past context that's also relevant for readers: This new post I wrote about how preserving the boundaries around agents seems to be a necessary condition for their safety. Quotes you've made about boundaries that I've compiled here. This old post I wrote about boundaries as MVP morality which you endorsed. Q3: It seems that Garrabrant, Critch, and maybe others want different things from you and I'm wondering if you have thoughts about that. Garrabrant: From talking to him I know that he's thinking about boundaries too but more about boundaries in the world as instruments to preserve causal locality and predictability and evolution etc.. But this is quite different than talking about specifically the boundaries around agents. Critch: I haven't spoken to him yet, but I think you once told me that Critch seems to be thinking about boundaries more in terms of ~"just find the 'boundary protocol' and follow it and all cooperation with other agents will be safe". Is this right? If so, this seems closer to what you want, but still kinda different. TJ: I think TJ has some other ideas that I am currently unable to summarize. davidad Claim 1+1b: yes, to first order. [To second order, I expect that the general concept of things with "boundaries" will also be useful for multi-level world-modelling in general, e.g. coarse-graining fluid flow by modelling it in terms of cells that have boundaries on which there is a net flow, and that it might be a good idea to "bake in" something like a concept of boundaries to an AI system's meta-ontology, so that it has more of a tendency to have moral patients among the entities in its object-level ontology. 
But my mainline intention is for the object-level ontology to be created with humans in the loop, and the identification of entities with boundaries could perhaps be just as easily a layer of interpretation on top of an ontology with a more neutral meta-ontology of causation. Thinking through both routes more is at the frontier of what I consider conceptual "boundaries" research.] davidad Claim 2: agreed. Claim 3: agreed. Claim 4: agreed. davidad Q2: yes, my ultimate goal with "boundaries" is just to formalise injunctions against doing harm, disrespecting autonomy, or (at the mo...
PART 1!!!!! It's 2024 and there are murders to solve! Murders at Karlov Manor drops soon and we are here to talk about what we think could impact Modern. Please let us know what you will be hoping to open for your deck! Join in the conversation on our Discord! https://discord.com/invite/7zAZV8JK If you want to customize your deck even more check out the Alter Sleeves link below! It really helps support the show. https://altersleeves.com/themmcast Want to pick up any of the new cards you saw in this week's episode? Click over to TCGPlayer.com using our affiliate link here! It's a free and easy way to support the show. Thanks! - https://t.co/spyomDMIF2 Looking to pick up some of the cards we discussed today? Use our link below to help support the show! https://channelfireball.com?ref=alexkessler Opening animation was done by Geoffrey Palmer. Follow him on Twitter: @livingcardsmtg @livingcardsmtg816 ---- Contents ---- 0:00 - Intro Join The MMCast Patreon https://www.Patreon.com/TheMMCast Discord: https://discord.gg/fjYdTwS MMcast Twitch: twitch.tv/kesswylie Instagram: @TheMMCast TikTok: @TheMMPodcast Kess: Twitter: @Kesswylie Instagram: @Kess_Wylie Twitch: Twitch.tv/Kessco Ben: Twitter: @benbatemanmedia Instagram: @BenBatemanMedia Twitch: Twitch.tv/BenBatemanStreams Michael: Twitter @Dudardd Website: kess.co/themmcast Email: themmcast@kess.co Facebook: https://www.facebook.com/groups/170382890167965/?ref=share Produced by Time Traveler Media - https://www.timetravelermedia.com Learn more about your ad choices. Visit megaphone.fm/adchoices
Support the show! http://patreon.com/magicmics Visit our sponsor: http://www.coolstuffinc.com/ Check out the twitch channel: http://twitch.tv/magicmics Visit our subreddit: http://www.reddit.com/r/magicmics Follow us on Twitter: http://twitter.com/magicmicscast Like us on Facebook: http://facebook.com/magicmics Co-Sponsors: http://www.cardhoarder.com/ http://www.alteredsleeves.com/ (use code MAGICMICS ) http://www.cubeks.com/ https://www.manatraders.com/ (use code MAGICMICS ) AirDate - 1/24/24 MKM Roundup Infinite Standard Combo: https://vxtwitter.com/yoman_5/status/1747979070069998017 Story Spotlight Cards: https://magic.wizards.com/en/news/card-preview/story-spotlight-cards-for-murders-at-karlov-manor Lower Legendary As-Fan: https://www.tumblr.com/markrosewater/740437654259335168/hey-mark-why-arent-there-any-legendary-uncommons MC Chicago Swag Sneak-Peek: https://vxtwitter.com/PlayMTG/status/1748087871230054580 https://www.mtgfestivals.com/global/en-us/magiccon-news/magiccon-chicago-merch-Preview-pin-trading.html https://cdn.discordapp.com/attachments/410942703623208960/1197659227326513262/magiccon-chicago-2024-fblthp-dice-bag.png Outlaws of Thunder Junction Release Date: https://magic.wizards.com/en/news/announcements/outlaws-of-thunder-junction-arrives-april-19-2024 Judge Foundry Merch: https://www.judgefoundry.org/store/?v=13ab917f2c7d Secret Lair Raining Cats and Dogs: https://magic.wizards.com/en/news/announcements/secret-lair-commander-deck-raining-cats-and-dogs The Print Run Sure Wasn't: https://markrosewater.tumblr.com/post/740347718034145280/hey-as-someone-who-really-likes-the-secret-lairs-i MTG Twitch Viewers By Year: https://vxtwitter.com/h0lydiva/status/1749401147821936889 https://vxtwitter.com/CedricAPhillips/status/1749575586409071024 GIVEAWAY & THANKS https://streamlabs.com/dashboard#/subscribers
Today we talk about Brion Markov, best known as Geo-Force, a prince of Markovia, a founding member of The Outsiders, and also Terra's half-brother. Today's mentioned & relevant media: -The Brave and the Bold (1955) #200 -Batman and the Outsiders (1983) -The New Teen Titans (1980) #37 -World's Finest Comics (1941) #300 -Tales of the Teen Titans (1984) Annual #3 -Crisis on Infinite Earths (1985) -DC Comics Presents (1978) #83 -The Outsiders (1985) -DC Challenge (1985) #8-10 -Infinity, Inc. Special (1987) #1 -Millennium (1987) reading order -Justice League Quarterly (1990) #5 -Showcase (1993) #4-5 -Outsiders (1993) -Zero Hour: Crisis in Time (1994) -Deathstroke (1991) #49-50 -Superman: Behold! The millennium Giants! (1998) -Adventures of Superman (1987) #564 -Titans Secret Files (2000) #2 -Justice League of America (2006) #4-8, 11-12, 15, 53 -Justice Society of America (2006) #5-6, 43 -Batman and the Outsiders (2007) -Trinity (2008) #11, 13-16, 18, 23, 29, 34, 39, 48, 52 -DC Universe: Last Will and Testament (2008) -Terra (2008) #2-4 -Blackest Night: Titans (2009) #1 -DCU Halloween Special (2009) -Bruce Wayne: The Road Home: Outsiders (2010) #1 -Convergence: Batman and the Outsiders (2015) -Suicide Squad Most Wanted: Deadshot and Katana (2016) -Shadow War (2022) Thanks to Victoria Watkins for our icon! Support Capes and Japes by: Checking out our Patreon or donating to the Tip jar Find out more on the Capes and Japes website.
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: AI Safety Chatbot, published by markov on December 22, 2023 on LessWrong.

Hello World! The AISafety.info team is launching a prototype of the AI Safety Chatbot. The chatbot uses a dataset of alignment literature to answer any questions related to AI safety that you might have, while also citing established sources. Please keep in mind that this is a very early prototype and despite citing references, it may still provide inaccurate or inappropriate information. The overall objective is to help people better understand AI safety issues based on alignment research using an LLM. This helps with tailoring content to the user's needs and technical level. The chatbot can hopefully be used by both newcomers to AI safety, as well as researchers and engineers who want to get up to speed on specific topics.

How it works
This chatbot builds upon AlignmentSearch. Our work also expands upon the alignment research dataset (ARD) developed during AI Safety Camp 6. This involved updating and curating the dataset to focus on quality over quantity. Additionally, we created a process to regularly fetch new articles from selected sources. The ARD contains information about alignment from various books, research papers, and blog posts. For a full list of all the sources being used, look at the readme of the repository on GitHub or HuggingFace.

We use a process called retrieval-augmented generation (RAG) to generate the answers. Since LLM data is static, RAG increases the capabilities of an LLM by referencing an external authoritative knowledge base before generating a response. So the process can be roughly broken into 1) getting and storing the data in a vector database, and then 2) generating an answer based on that data. The information storage process is outlined below:

Source: DeepLearning.AI (2023) "LangChain: Chat with Your Data"

Document Loading: The articles are scraped from various sources such as the ones mentioned above. They are then parsed and stored in an SQL database while making sure that metadata fields are valid.
Splitting: The text content of the documents is then broken up into fixed-size chunks.
Storage: These chunks are then embedded into the Pinecone vector database using the OpenAI embedding model.

Once we have a database of alignment literature, we use the following series of steps to generate an answer based on a user query:

Source: DeepLearning.AI (2023) "LangChain: Chat with Your Data"

Query: A user types in a question.
Storage+Retrieval: We retrieve chunks from the vector database that are semantically similar to the user's question.
Prompt: A prompt is formed that includes all the text retrieved from the relevant chunks provided as context, along with additional instructions on how to format citations and structure the answer.
Output: This prompt is then passed to the LLM, which synthesizes an answer based on the relevant chunks of data along with accurate inline citations to the source material.

Additionally, as the answer is generated, a 'glossary' is injected with manually written one-sentence definitions of common jargon. The following image example shows what Goodhart's Law looks like on hover:

With automatic updates, the ARD will periodically fetch new article entries from trusted sources and add or update items in a SQL database. A separate process adds text to the dataset from user-suggested sources.
This dataset is available on HuggingFace, which includes instructions on how to download and use it. This means that the chatbot will always be able to produce the most relevant and newest information. We are also experimenting with multiple modes for different audiences. Currently, we offer three options, which produce answers of varying complexity, using the same chunks but adjusting the prompt sent to the LLM.

Hallucinations
Each chun...
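The retrieval loop described above is straightforward to sketch. The snippet below is a toy, self-contained rendering of the Query, Retrieval, Prompt, and Output steps: the in-memory cosine-similarity store stands in for Pinecone, embed() and llm() are hypothetical callables for the embedding model and the LLM, and the chunk size and top-k values are invented.

```python
import numpy as np

def chunk(text, size=1000, overlap=200):
    """Split a document into fixed-size character chunks (the Splitting step)."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]

def retrieve(query_vec, chunk_vecs, k=4):
    """Cosine-similarity top-k lookup; a toy in-memory stand-in for Pinecone."""
    sims = (chunk_vecs @ query_vec) / (
        np.linalg.norm(chunk_vecs, axis=1) * np.linalg.norm(query_vec) + 1e-9)
    return np.argsort(sims)[::-1][:k]

def answer(question, chunks, chunk_vecs, embed, llm):
    """embed(text) -> vector and llm(prompt) -> str are hypothetical callables."""
    best = retrieve(embed(question), chunk_vecs)
    sources = "\n\n".join(f"[{i}] {chunks[i]}" for i in best)
    prompt = ("Answer the question using only the sources below, "
              "citing them inline like [0].\n\n"
              f"Sources:\n{sources}\n\nQuestion: {question}")
    return llm(prompt)
```

The real pipeline adds the metadata handling, citation formatting, and glossary injection described in the post.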
The Perpetual Chess Adult Improver Series returns with another guest with a great story. Denis Markov is a 39-year-old working dad with a passion for chess. Denis has deep chess roots which date back to a childhood in Russia where he took classes at the fabled “Palace of Pioneers.” According to Denis, he did not show exceptional talent in those days and eventually set chess aside for some years. Now based in Pennsylvania, Denis returned to competitive chess in mid-2021 and since then, through hard work and consistency, has elevated his USCF rating from 1742 to over 2050! While this type of improvement is quite unusual, Denis is adamant that he isn't doing anything to “reinvent the wheel.” In our conversation Denis details an approach focused on frequent competitive play, game review and lots of hard work. I found our conversation grounding and inspiring at the same time. Timestamps of topics discussed are below. Adult Improver Series Spotify Playlist here: https://open.spotify.com/playlist/75Uoqz2BoRt2IiTCeOfuky?si=680ff07480434ec9 0:00- Thanks to those who help support Perpetual Chess via Patreon! If you would like to join the community, you can do so here: https://www.patreon.com/perpetualchess 0:01- Thanks to our presenting chess education sponsor, Chessable.com! New Chessable courses include Silman's Endgame Course and new ones by GM Erwin l'Ami and GM Johan Hellsten. You can check out their latest offerings here: https://www.chessable.com/courses/all/new/ 2:00- Denis joins the show! What is his “why”? What does he do when his motivation to study chess is low? Denis' Reddit post detailing his success: https://www.reddit.com/r/chess/comments/16sw628/1740_to_2040_uscf_in_2_years_adult_improver/ 11:00- Patreon mailbag question: “How will Denis approach teaching chess to his kids?” 14:00- Patreon mailbag question: “Does Denis think that his Russian background helped his chess development?” 21:00- What got Denis back into chess in his college years? Mentioned: Aron Nimzowitsch's My System 25:00- Denis' study routine. Denis' coach: https://lichess.org/coach/Davjan 34:00- Denis discusses his approach to openings, especially as it relates to playing the same opponents repeatedly. 42:00- How did Denis settle on studying via ChessTempo and the Chess Steps workbooks? Mentioned: Artur Yusupov's series, Chess Steps Method 48:00- What is the nature of Denis' work with his coach? 54:00- Does Denis have any theories on why he is seeing gains while others might be struggling? 1:03:00- More book recommendations! Mentioned: Sam Shankland's books, Endgame Strategy by Shereshevsky, GM Johan Hellsten's books, Chess Structures by GM Mauricio Flores Rios, GM Ivan Sokolov's Winning Middle Game Strategies, Sokolov's interview with ChessBase India, his How to Chess interview is now out! 1:08:00- Thanks to Denis for joining me! You can email him at dvmarkov at gmail dot com or follow him on Instagram here: https://www.instagram.com/dvm0101/ Learn more about your ad choices. Visit megaphone.fm/adchoices
The Rush Hour Melbourne Catch Up - 105.1 Triple M Melbourne - James Brayshaw and Billy Brownless
Brig gets to know Daisy, Harry Garside, Hump Day Quiz, Oleg Markov, Danny Green, Daisy gets to know Brig, how do they dismantle cranes?, Nick Cody, Nick Cody takes over Billy's Joke.

See omnystudio.com/listener for privacy information.
The Rush Hour Melbourne Catch Up - 105.1 Triple M Melbourne - James Brayshaw and Billy Brownless
Daisy's sports news, the Aussie dictionary word of the year, Cal Twomey's AFL draft preview, Brig reveals her childhood crush, Hump Day Quiz, we want to send you to the WWE in Perth, JB's wedding has made the papers, Rosie recaps the Motley Crue/Def Leppard gig last night, Mark Howard in India, Collingwood's Oleg Markov, Brig and Daisy react to Billy's Joke.

See omnystudio.com/listener for privacy information.
In this week's episode Greg and Patrick take advantage of the recent expiration of a statute of limitations that legally allows them to talk about the multilevel model: what it is, when we might use it, and extremely cool extensions that it allows. Along the way they also discuss hostile federal judges, McNeish, airing of grievances, Gauss and Markov's corpses, Sesame Street, distributional baguettes, naivete, sentient GLMs, two pencil necks, Thor's Hammer, Willy Sutton, Siren's Song, peer groups of two, fighting good for an old guy, crazy town cool, 50 ducks, conceding a battle, and blushing corpses.Stay in contact with Quantitude! Twitter: @quantitudepod Web page: quantitudepod.org Merch: redbubble.com
This week it turned out that no fewer than three members of POPkast are watching the same series. We talked about whether Beckham is really that big a star, which series is a genuinely good adaptation of a video game, why Wes Anderson's films don't appeal to everyone (and which ones are still worth watching), what BlackPink is and what Black Pumas are, how a guy from the woods became a major YouTube sensation, and everything we missed at BITEF. We announce Rammstein's concert in Belgrade and the retrospective exhibition of Mića Popović at the SANU Gallery. With us again is Converse, which this time has prepared special promotions and discounts for its customers in the online store.
Time tracking is often seen as a necessary evil - something that takes time away from doing client work or feels like micromanaging employees. In this episode I interview Ilia from Toggl to discover why those assumptions are wrong. We discuss real-world examples of how time tracking provides valuable data to help agencies work more efficiently, provide better project estimates, optimise team members' time, identify profitable versus unprofitable clients, and ultimately achieve better profitability and work-life balance. Ilia shares tips on how agency owners can implement time tracking in a way that gets buy-in from employees and avoids it feeling punitive. If you think time tracking is a drag that holds your agency back, this myth-busting episode will surprise you and highlight the many bottom-line benefits tracking time can unlock! Full show notes: https://trailblazer.fm/is-time-tracking-holding-you-back/
Rerun. Bulgarian writer Georgi Markov was shot by a poisoned pellet whilst walking on Waterloo Bridge on 7th September, 1978. Four days later, he was dead. He thought the bullet – believed to be filled with ricin – had emanated from the umbrella of a Soviet secret agent, and the British press labelled his assassination the ‘Poison Brolly Riddle'. In this episode, Olly, Rebecca and Arion explain how Markov was initially disbelieved by doctors; reveal the mysterious involvement of a pig in the Porton Down investigation; and ask whether poisoning is really as efficient a method of murder as it seems… Further Viewing: Umbrella fired fatal ricin dart (CNN 2013) https://www.youtube.com/watch?v=rZO5Lf8wD_c ‘Why am I hearing a rerun?' Every Thursday is 'Throwback Thursday' on Today in History with the Retrospectors: running one repeat per week means we can keep up the quality of our independent podcast. Daily shows like this require a lot of work! But as ever we'll have something new for you tomorrow, so follow us wherever you get your podcasts: podfollow.com/Retrospectors Love the show? Join