Podcasts about berts

  • 97PODCASTS
  • 162EPISODES
  • 54mAVG DURATION
  • 1MONTHLY NEW EPISODE
  • May 27, 2025LATEST

POPULARITY

20172018201920202021202220232024


Best podcasts about berts

Latest podcast episodes about berts

Þjóðmál
#320 – Það skiptir máli hver stjórnar – Ásdís Kristjáns og Íris Róberts fara yfir stöðuna

Þjóðmál

Play Episode Listen Later May 27, 2025 77:16


Ásdís Kristjánsdóttir og Íris Róbertsdóttir, bæjarstjórar í Kópavogi og í Vestmannaeyjum, mæta í Þjóðmálastofuna nú þegar ár er í sveitastjórnarkosningar. Við ræðum um helstu áherslur í rekstri sveitarfélaga, mikilvægi þess að hafa reksturinn í lagi, menntamálin, atvinnulífið og margt fleira sem snýr að rekstri sveitarfélaganna.

Onderwijs leiden met hart en ziel
100. Bert Wienen over de pedagogische basis m.m.v. Jan Dirk Imelman en Wilna Meijer

Onderwijs leiden met hart en ziel

Play Episode Listen Later May 23, 2025 58:56


Te gast is Bert Wienen en we spreken over de beleidsterm ‘de pedagogische basis' naar aanleiding van zijn boek Laat/d de pedagogische basis met rust! Twee grote pedagogen, Wilna Meijer en Jan Dirk Imelman, reflecteren op zijn betoog.Bert vertel eerst waar deze beleidsterm vandaan komt. Zijn zorg is dat gemeentelijke overheden de pedagogische basis teveel zullen benaderen vanuit een machine-logica. Zo van, wij maken een eensluidend professioneel beleid en droppen dat in alle pedagogische contexten met als doel de pedagogische basis van de samenleving effectief te versterken. Dit machine-denken past niet bij de pedagogiek. Bert vindt het te gemakkelijk gedacht, het belooft te veel en het is te maakbaar gedacht. In het echte leven is er namelijk niet 1 pedagogische basis, maar zijn er verschillende pedagogische contexten, zoals het gezin, de kinderopvang, het onderwijs, de straat, de sportvereniging. Bovendien zijn dit allemaal sociale praktijken, die historisch zijn ontwikkeld en door de mensen zelf zijn en worden gevormd. De pedagogische basissen zijn er dus al en hoeven niet van bovenaf geïmplementeerd te worden door professionals, die de neiging hebben om al die praktijken over te nemen en over een kam te scheren. Tot slot wijst Bert erop dat pedagogiek inherent normatief van aard is en wenst hij dat ouders, leraren, sportliefhebbers etc. met elkaar het gesprek voeren over wat goed opvoeden in deze tijd inhoudt.In het tweede deel reflecteren Wilna en Jan Dirk instemmend en kritisch op Berts betoog vanuit de vraag: hoe pedagogisch geletterd en verantwoord is onze cultuur eigenlijk?Deze podcast is mede mogelijk gemaakt door schoolleidersopleiding ATTC, onderwijsadvies School Matters en rustplek De vallei van het goede leven. 

Gamlaste nytt
Det är okej, flicka lilla

Gamlaste nytt

Play Episode Listen Later Apr 24, 2025 61:13


Denna veckan blir det mycket "tech"-snack aka tv-seriernas krig mot filmen – vänder det igen? Om kanonserien The Last of Us usla svenska marknadsföring, dubbelheten i Netflixserien Black Mirrors betalningsbudskap , de tvära kasten mellan en anorexiadokumentär och Berts värld och mycket, mycket mer!

Nose Bleeds  Sports PodCast
Nose Bleeds "324" Purple Thunder

Nose Bleeds Sports PodCast

Play Episode Listen Later Apr 4, 2025 92:56


Chris and Adam went on an adventure in place of last week's podcast, Reese's Peanut Butter Pie Miniatures, Adam stumbles upon the holy grail of walk finds and learns about chicken ice, the 2025 Reds are underway, Adam has surprise guests in his home, the Mount Rushmore of Berts (or Burts), and Bert Kreischer's new special is on Netflix.

Klankcast
105. De Johannes-Passion: “een soort Hollywood-film waarin van alles gebeurt”

Klankcast

Play Episode Listen Later Apr 3, 2025 73:27


Schrijver Bert Natter is een groot Bach-kenner, al noemt hij zichzelf liever Bach-liefhebber. Botte Jellema duikt met Bert in Bachs meesterwerk de Johannes-Passion. Minder bekend dan de Matthäus, maar bezig aan een heuse comeback. Wat zijn de aria's waar je op moet letten? Waarin verschillen de Johannes en de Matthäus van elkaar? En – gewetensvraag! – welk werk heeft Berts voorkeur? Met veel audio-fragmenten! Playlist: https://open.spotify.com/playlist/5xHghhr2wcNhqeQmNFEtpD

Pa ceļam ar Klasiku
"Trio Palladio", Šūberts un Baltijas valstu komponisti

Pa ceļam ar Klasiku

Play Episode Listen Later Dec 9, 2024 15:04


14. decembrī Mazajā ģildē Lielās mūzikas balvas laureāts Trio Palladio (vijolniece Eva Bindere, čelliste Kristīne Blaumane un pianists Reinis Zariņš) atskaņos jaunu koncertprogrammu, kuru veido divu pasauļu pretnostatījums: trīs Baltijas autoru darbi un Franča Šūberta Pirmais klaviertrio, kas sacerēts neilgi pirms autora nāves. Lietuvu pārstāvēs Anatolijs Šenderovs (1945-2019) un viņa "Dziesma un deja" (2008), Igauniju - Erki Svens Tīrs (1959) ar Fata morgana (2004), bet Latviju - Pētera Vaska (1946) darbs "Vientuļais eņģelis" (2006/2019). Baltijas valstu komponistu darbi izskanēs kā veltījums Baltijas ceļa 35. gadskārtai, tāpēc Kristīne Blaumane dalās savās atmiņās par Baltijas ceļu: "Mana un Evas paaudze, mēs vēl bijām pārāk jauni, lai mums padomju laiks atstātu kaut kādas traumas. Tajā pašā laikā bijām jau pietiekoši saprātīgā vecumā, lai apzinātos notikuma nozīmību. Es jūtos priviliģēta, ka saprotu, ko tas nozīmē. Nevajag iedomāties, ka brīvība un neatkarība ir kaut kas pašsaprotams. Tam vajadzētu būt pašsaprotamam, bet [jāspēj] novērtēt to pa īstam, jo mēs piedzīvojām arī laikus, kad tā nebija." Par gaidāma koncerta laikmetīgo repertuāru stāsta Reinis Zariņš: "Visi baltiešu komponistu darbi tapuši jaunās tūkstošgades pirmajā desmitgadē, tātad samērā nesen. Varētu teikt, nesena laikmetīga mūzika, bet tās pamatā ir pavisam fundamentālas lietas, kaut kas pagānisks, kaut kas no mūsu senčiem, kaut kas sensens, nozīmīgs jebkuras kultūras cilvēkam. Gan Šenderova darbā, gan Vaska bezgalīgajā dvēseles dziesmā ir tādas lietas, kas ir saprotamas pilnīgi visiem cilvēkiem. Tīra opusā Fata morgana mani visvairāk saista tas, kā dabiskā skaņurinda izmantota tik iespaidīgā, mistēriskā, noslēpumainā veidā. Dabiskā skaņurinda galu galā ir pamats visai mūzikai, ko cilvēks tikai vienu dienu ir atklājis un iemācījies izmantot, bet tā ir ierakstīta šīs pasaules DNS kopš radīšanas. Šī skaņurinda ceļas kā nogrimusi pils no ezera, un iespaids ir ļoti spēcīgs. It kā nesena mūzika, kas kaut kādā ziņā tomēr ir ļoti sena. Vismaz es tā par šiem skaņdarbiem esmu sācis domāt." Eva Bindere: Jā, Šenderova skaņdarbs tiešām ir ārkārtīgi pagānisks, tas ir īstākais vārds, kā to aprakstīt, pat salīdzinoši traks savā garā. Godīgi sakot, biju pārsteigta, kad pirmoreiz to dzirdēju, jo tas aiziet tādā virpulī, ko es galīgi nebūtu [gaidījusi]. Tā nav tā ideja, ko mēs sagaidām no lietuviešu mūzikas, atverot lappuses un nezinot skaņdarbu. Katrs no šiem trim darbiem rāda savu, ļoti īpašu pasauli, un tie ir savstarpēji ļoti kontrastējoši, tādēļ domāju, ka būs interesanti tos dzirdēt un salīdzināt tās skaņu vīzijas, kādas katrs no tiem rada. Trio “Palladio” šobrīd ir Latvijā un 14. decembrī būs koncertā Mazajā Ģildē, bet pēc tam jums ir koncerti arī Moldovā un Itālijā. Tur būs šī pati programma? Reinis Zariņš: Mēs miksēsim gan šos, gan iepriekš spēlētus darbus, kādas nu vēlmes kura koncertvieta izteica, tā kā tas no mums prasīs papildus darbiņu, kas vienmēr ir ļoti labi. Kristīne Blaumane: Bet ir svarīgi, ka mēs visos šajos koncertos spēlēsim mūsu tautieti Pēteri Vasku. Tas tiks atskaņots visur, tāpat kā Šūberta trio.  

Off the Ball
What does Dumbarton mean to you? Berts - on and off the pitch; Nose XI; Terracing Teaser

Off the Ball

Play Episode Listen Later Nov 26, 2024 64:56


The most Petty and Illinformed podcast available. Paul English and Tam Cowan are joined by Dumbarton super-fan Robert Ryan to chat about - What does Dumbarton mean to you? Berts - on and off the pitch; Nose XI; Terracing Teaser

Gamlaste nytt
S1E47 - Fiktiva dokumentärer och Berts osammanhängande värld

Gamlaste nytt

Play Episode Listen Later Aug 29, 2024 57:50


Efter semestersåsigheten är Gamlaste nytt tillbaka i gammal god form! I dagens avsnitt ska Bertil och lyssnarna gissa vilken dokumentärfilm som kommer på riktigt och vilka Rasmus hittat på. Berra har grävt i Alla mot alla-arkiven och gör en hisnande upptäckt. Dessutom: ett färskt exempel på Bert Karlsson descent into madness! Det vill ni inte missa!

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0
AI Magic: Shipping 1000s of successful products with no managers and a team of 12 — Jeremy Howard of Answer.ai

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

Play Episode Listen Later Aug 16, 2024 58:56


Disclaimer: We recorded this episode ~1.5 months ago, timing for the FastHTML release. It then got bottlenecked by Llama3.1, Winds of AI Winter, and SAM2 episodes, so we're a little late. Since then FastHTML was released, swyx is building an app in it for AINews, and Anthropic has also released their prompt caching API. Remember when Dylan Patel of SemiAnalysis coined the GPU Rich vs GPU Poor war? (if not, see our pod with him). The idea was that if you're GPU poor you shouldn't waste your time trying to solve GPU rich problems (i.e. pre-training large models) and are better off working on fine-tuning, optimized inference, etc. Jeremy Howard (see our “End of Finetuning” episode to catchup on his background) and Eric Ries founded Answer.AI to do exactly that: “Practical AI R&D”, which is very in-line with the GPU poor needs. For example, one of their first releases was a system based on FSDP + QLoRA that let anyone train a 70B model on two NVIDIA 4090s. Since then, they have come out with a long list of super useful projects (in no particular order, and non-exhaustive):* FSDP QDoRA: this is just as memory efficient and scalable as FSDP/QLoRA, and critically is also as accurate for continued pre-training as full weight training.* Cold Compress: a KV cache compression toolkit that lets you scale sequence length without impacting speed.* colbert-small: state of the art retriever at only 33M params* JaColBERTv2.5: a new state-of-the-art retrievers on all Japanese benchmarks.* gpu.cpp: portable GPU compute for C++ with WebGPU.* Claudette: a better Anthropic API SDK. They also recently released FastHTML, a new way to create modern interactive web apps. Jeremy recently released a 1 hour “Getting started” tutorial on YouTube; while this isn't AI related per se, but it's close to home for any AI Engineer who are looking to iterate quickly on new products: In this episode we broke down 1) how they recruit 2) how they organize what to research 3) and how the community comes together. At the end, Jeremy gave us a sneak peek at something new that he's working on that he calls dialogue engineering: So I've created a new approach. It's not called prompt engineering. I'm creating a system for doing dialogue engineering. It's currently called AI magic. I'm doing most of my work in this system and it's making me much more productive than I was before I used it.He explains it a bit more ~44:53 in the pod, but we'll just have to wait for the public release to figure out exactly what he means.Timestamps* [00:00:00] Intro by Suno AI* [00:03:02] Continuous Pre-Training is Here* [00:06:07] Schedule-Free Optimizers and Learning Rate Schedules* [00:07:08] Governance and Structural Issues within OpenAI and Other AI Labs* [00:13:01] How Answer.ai works* [00:23:40] How to Recruit Productive Researchers* [00:27:45] Building a new BERT* [00:31:57] FSDP, QLoRA, and QDoRA: Innovations in Fine-Tuning Large Models* [00:36:36] Research and Development on Model Inference Optimization* [00:39:49] FastHTML for Web Application Development* [00:46:53] AI Magic & Dialogue Engineering* [00:52:19] AI wishlist & predictionsShow Notes* Jeremy Howard* Previously on Latent Space: The End of Finetuning, NeurIPS Startups* Answer.ai* Fast.ai* FastHTML* answerai-colbert-small-v1* gpu.cpp* Eric Ries* Aaron DeFazio* Yi Tai* Less Wright* Benjamin Warner* Benjamin Clavié* Jono Whitaker* Austin Huang* Eric Gilliam* Tim Dettmers* Colin Raffel* Sebastian Raschka* Carson Gross* Simon Willison* Sepp Hochreiter* Llama3.1 episode* Snowflake Arctic* Ranger Optimizer* Gemma.cpp* HTMX* UL2* BERT* DeBERTa* Efficient finetuning of Llama 3 with FSDP QDoRA* xLSTMTranscriptAlessio [00:00:00]: Hey everyone, welcome to the Latent Space podcast. This is Alessio, partner and CTO-in-Residence at Decibel Partners, and I'm joined by my co-host Swyx, founder of Smol AI.Swyx [00:00:14]: And today we're back with Jeremy Howard, I think your third appearance on Latent Space. Welcome.Jeremy [00:00:19]: Wait, third? Second?Swyx [00:00:21]: Well, I grabbed you at NeurIPS.Jeremy [00:00:23]: I see.Swyx [00:00:24]: Very fun, standing outside street episode.Jeremy [00:00:27]: I never heard that, by the way. You've got to send me a link. I've got to hear what it sounded like.Swyx [00:00:30]: Yeah. Yeah, it's a NeurIPS podcast.Alessio [00:00:32]: I think the two episodes are six hours, so there's plenty to listen, we'll make sure to send it over.Swyx [00:00:37]: Yeah, we're trying this thing where at the major ML conferences, we, you know, do a little audio tour of, give people a sense of what it's like. But the last time you were on, you declared the end of fine tuning. I hope that I sort of editorialized the title a little bit, and I know you were slightly uncomfortable with it, but you just own it anyway. I think you're very good at the hot takes. And we were just discussing in our pre-show that it's really happening, that the continued pre-training is really happening.Jeremy [00:01:02]: Yeah, absolutely. I think people are starting to understand that treating the three ULM FIT steps of like pre-training, you know, and then the kind of like what people now call instruction tuning, and then, I don't know if we've got a general term for this, DPO, RLHFE step, you know, or the task training, they're not actually as separate as we originally suggested they were in our paper, and when you treat it more as a continuum, and that you make sure that you have, you know, more of kind of the original data set incorporated into the later stages, and that, you know, we've also seen with LLAMA3, this idea that those later stages can be done for a lot longer. These are all of the things I was kind of trying to describe there. It wasn't the end of fine tuning, but more that we should treat it as a continuum, and we should have much higher expectations of how much you can do with an already trained model. You can really add a lot of behavior to it, you can change its behavior, you can do a lot. So a lot of our research has been around trying to figure out how to modify the model by a larger amount rather than starting from random weights, because I get very offended at the idea of starting from random weights.Swyx [00:02:14]: Yeah, I saw that in ICLR in Vienna, there was an outstanding paper about starting transformers from data-driven piers. I don't know if you saw that one, they called it sort of never trained from scratch, and I think it was kind of rebelling against like the sort of random initialization.Jeremy [00:02:28]: Yeah, I've, you know, that's been our kind of continuous message since we started Fast AI, is if you're training for random weights, you better have a really good reason, you know, because it seems so unlikely to me that nobody has ever trained on data that has any similarity whatsoever to the general class of data you're working with, and that's the only situation in which I think starting from random weights makes sense.Swyx [00:02:51]: The other trends since our last pod that I would point people to is I'm seeing a rise in multi-phase pre-training. So Snowflake released a large model called Snowflake Arctic, where they detailed three phases of training where they had like a different mixture of like, there was like 75% web in the first instance, and then they reduced the percentage of the web text by 10% each time and increased the amount of code in each phase. And I feel like multi-phase is being called out in papers more. I feel like it's always been a thing, like changing data mix is not something new, but calling it a distinct phase is new, and I wonder if there's something that you're seeingJeremy [00:03:32]: on your end. Well, so they're getting there, right? So the point at which they're doing proper continued pre-training is the point at which that becomes a continuum rather than a phase. So the only difference with what I was describing last time is to say like, oh, there's a function or whatever, which is happening every batch. It's not a huge difference. You know, I always used to get offended when people had learning rates that like jumped. And so one of the things I started doing early on in Fast.ai was to say to people like, no, you should actually have your learning rate schedule should be a function, not a list of numbers. So now I'm trying to give the same idea about training mix.Swyx [00:04:07]: There's been pretty public work from Meta on schedule-free optimizers. I don't know if you've been following Aaron DeFazio and what he's doing, just because you mentioned learning rate schedules, you know, what if you didn't have a schedule?Jeremy [00:04:18]: I don't care very much, honestly. I don't think that schedule-free optimizer is that exciting. It's fine. We've had non-scheduled optimizers for ages, like Less Wright, who's now at Meta, who was part of the Fast.ai community there, created something called the Ranger optimizer. I actually like having more hyperparameters. You know, as soon as you say schedule-free, then like, well, now I don't get to choose. And there isn't really a mathematically correct way of, like, I actually try to schedule more parameters rather than less. So like, I like scheduling my epsilon in my atom, for example. I schedule all the things. But then the other thing we always did with the Fast.ai library was make it so you don't have to set any schedules. So Fast.ai always supported, like, you didn't even have to pass a learning rate. Like, it would always just try to have good defaults and do the right thing. But to me, I like to have more parameters I can play with if I want to, but you don't have to.Alessio [00:05:08]: And then the more less technical side, I guess, of your issue, I guess, with the market was some of the large research labs taking all this innovation kind of behind closed doors and whether or not that's good, which it isn't. And now we could maybe make it more available to people. And then a month after we released the episode, there was the whole Sam Altman drama and like all the OpenAI governance issues. And maybe people started to think more, okay, what happens if some of these kind of labs, you know, start to break from within, so to speak? And the alignment of the humans is probably going to fall before the alignment of the models. So I'm curious, like, if you have any new thoughts and maybe we can also tie in some of the way that we've been building Answer as like a public benefit corp and some of those aspects.Jeremy [00:05:51]: Sure. So, yeah, I mean, it was kind of uncomfortable because two days before Altman got fired, I did a small public video interview in which I said, I'm quite sure that OpenAI's current governance structure can't continue and that it was definitely going to fall apart. And then it fell apart two days later and a bunch of people were like, what did you know, Jeremy?Alessio [00:06:13]: What did Jeremy see?Jeremy [00:06:15]: I didn't see anything. It's just obviously true. Yeah. So my friend Eric Ries and I spoke a lot before that about, you know, Eric's, I think probably most people would agree, the top expert in the world on startup and AI governance. And you know, we could both clearly see that this didn't make sense to have like a so-called non-profit where then there are people working at a company, a commercial company that's owned by or controlled nominally by the non-profit, where the people in the company are being given the equivalent of stock options, like everybody there was working there with expecting to make money largely from their equity. So the idea that then a board could exercise control by saying like, oh, we're worried about safety issues and so we're going to do something that decreases the profit of the company, when every stakeholder in the company, their remuneration pretty much is tied to their profit, it obviously couldn't work. So I mean, that was a huge oversight there by someone. I guess part of the problem is that the kind of people who work at non-profits and in this case the board, you know, who are kind of academics and, you know, people who are kind of true believers. I think it's hard for them to realize that 99.999% of the world is driven very heavily by money, especially huge amounts of money. So yeah, Eric and I had been talking for a long time before that about what could be done differently, because also companies are sociopathic by design and so the alignment problem as it relates to companies has not been solved. Like, companies become huge, they devour their founders, they devour their communities and they do things where even the CEOs, you know, often of big companies tell me like, I wish our company didn't do that thing. You know, I know that if I didn't do it, then I would just get fired and the board would put in somebody else and the board knows if they don't do it, then their shareholders can sue them because they're not maximizing profitability or whatever. So what Eric's spent a lot of time doing is trying to think about how do we make companies less sociopathic, you know, how to, or more, you know, maybe a better way to think of it is like, how do we make it so that the founders of companies can ensure that their companies continue to actually do the things they want them to do? You know, when we started a company, hey, we very explicitly decided we got to start a company, not a academic lab, not a nonprofit, you know, we created a Delaware Seacorp, you know, the most company kind of company. But when we did so, we told everybody, you know, including our first investors, which was you Alessio. They sound great. We are going to run this company on the basis of maximizing long-term value. And in fact, so when we did our second round, which was an angel round, we had everybody invest through a long-term SPV, which we set up where everybody had to agree to vote in line with long-term value principles. So like never enough just to say to people, okay, we're trying to create long-term value here for society as well as for ourselves and everybody's like, oh, yeah, yeah, I totally agree with that. But when it comes to like, okay, well, here's a specific decision we have to make, which will not maximize short-term value, people suddenly change their mind. So you know, it has to be written into the legal documents of everybody so that no question that that's the way the company has to be managed. So then you mentioned the PBC aspect, Public Benefit Corporation, which I never quite understood previously. And turns out it's incredibly simple, like it took, you know, like one paragraph added to our corporate documents to become a PBC. It was cheap, it was easy, but it's got this huge benefit, which is if you're not a public benefit corporation, then somebody can come along and offer to buy you with a stated description of like turning your company into the thing you most hate, right? And if they offer you more than the market value of your company and you don't accept it, then you are not necessarily meeting the kind of your fiduciary responsibilities. So the way like Eric always described it to me is like, if Philip Morris came along and said that you've got great technology for marketing cigarettes to children, so we're going to pivot your company to do that entirely, and we're going to pay you 50% more than the market value, you're going to have to say yes. If you have a PBC, then you are more than welcome to say no, if that offer is not in line with your stated public benefit. So our stated public benefit is to maximize the benefit to society through using AI. So given that more children smoking doesn't do that, then we can say like, no, we're not selling to you.Alessio [00:11:01]: I was looking back at some of our emails. You sent me an email on November 13th about talking and then on the 14th, I sent you an email working together to free AI was the subject line. And then that was kind of the start of the C round. And then two days later, someone got fired. So you know, you were having these thoughts even before we had like a public example of like why some of the current structures didn't work. So yeah, you were very ahead of the curve, so to speak. You know, people can read your awesome introduction blog and answer and the idea of having a R&D lab versus our lab and then a D lab somewhere else. I think to me, the most interesting thing has been hiring and some of the awesome people that you've been bringing on that maybe don't fit the central casting of Silicon Valley, so to speak. Like sometimes I got it like playing baseball cards, you know, people are like, oh, what teams was this person on, where did they work versus focusing on ability. So I would love for you to give a shout out to some of the awesome folks that you have on the team.Jeremy [00:11:58]: So, you know, there's like a graphic going around describing like the people at XAI, you know, Elon Musk thing. And like they are all connected to like multiple of Stanford, Meta, DeepMind, OpenAI, Berkeley, Oxford. Look, these are all great institutions and they have good people. And I'm definitely not at all against that, but damn, there's so many other people. And one of the things I found really interesting is almost any time I see something which I think like this is really high quality work and it's something I don't think would have been built if that person hadn't built the thing right now, I nearly always reach out to them and ask to chat. And I tend to dig in to find out like, okay, you know, why did you do that thing? Everybody else has done this other thing, your thing's much better, but it's not what other people are working on. And like 80% of the time, I find out the person has a really unusual background. So like often they'll have like, either they like came from poverty and didn't get an opportunity to go to a good school or had dyslexia and, you know, got kicked out of school in year 11, or they had a health issue that meant they couldn't go to university or something happened in their past and they ended up out of the mainstream. And then they kind of succeeded anyway. Those are the people that throughout my career, I've tended to kind of accidentally hire more of, but it's not exactly accidentally. It's like when I see somebody who's done, two people who have done extremely well, one of them did extremely well in exactly the normal way from the background entirely pointing in that direction and they achieved all the hurdles to get there. And like, okay, that's quite impressive, you know, but another person who did just as well, despite lots of constraints and doing things in really unusual ways and came up with different approaches. That's normally the person I'm likely to find useful to work with because they're often like risk-takers, they're often creative, they're often extremely tenacious, they're often very open-minded. So that's the kind of folks I tend to find myself hiring. So now at Answer.ai, it's a group of people that are strong enough that nearly every one of them has independently come to me in the past few weeks and told me that they have imposter syndrome and they're not convinced that they're good enough to be here. And I kind of heard it at the point where I was like, okay, I don't think it's possible that all of you are so far behind your peers that you shouldn't get to be here. But I think part of the problem is as an R&D lab, the great developers look at the great researchers and they're like, wow, these big-brained, crazy research people with all their math and s**t, they're too cool for me, oh my God. And then the researchers look at the developers and they're like, oh, they're killing it, making all this stuff with all these people using it and talking on Twitter about how great it is. I think they're both a bit intimidated by each other, you know. And so I have to kind of remind them like, okay, there are lots of things in this world where you suck compared to lots of other people in this company, but also vice versa, you know, for all things. And the reason you came here is because you wanted to learn about those other things from those other people and have an opportunity to like bring them all together into a single unit. You know, it's not reasonable to expect you're going to be better at everything than everybody else. I guess the other part of it is for nearly all of the people in the company, to be honest, they have nearly always been better than everybody else at nearly everything they're doing nearly everywhere they've been. So it's kind of weird to be in this situation now where it's like, gee, I can clearly see that I suck at this thing that I'm meant to be able to do compared to these other people where I'm like the worst in the company at this thing for some things. So I think that's a healthy place to be, you know, as long as you keep reminding each other about that's actually why we're here. And like, it's all a bit of an experiment, like we don't have any managers. We don't have any hierarchy from that point of view. So for example, I'm not a manager, which means I don't get to tell people what to do or how to do it or when to do it. Yeah, it's been a bit of an experiment to see how that would work out. And it's been great. So for instance, Ben Clavier, who you might have come across, he's the author of Ragatouille, he's the author of Rerankers, super strong information retrieval guy. And a few weeks ago, you know, this additional channel appeared on Discord, on our private Discord called Bert24. And these people started appearing, as in our collab sections, we have a collab section for like collaborating with outsiders. And these people started appearing, there are all these names that I recognize, like Bert24, and they're all talking about like the next generation of Bert. And I start following along, it's like, okay, Ben decided that I think, quite rightly, we need a new Bert. Because everybody, like so many people are still using Bert, and it's still the best at so many things, but it actually doesn't take advantage of lots of best practices. And so he just went out and found basically everybody who's created better Berts in the last four or five years, brought them all together, suddenly there's this huge collaboration going on. So yeah, I didn't tell him to do that. He didn't ask my permission to do that. And then, like, Benjamin Warner dived in, and he's like, oh, I created a whole transformers from scratch implementation designed to be maximally hackable. He originally did it largely as a teaching exercise to show other people, but he was like, I could, you know, use that to create a really hackable BERT implementation. In fact, he didn't say that. He said, I just did do that, you know, and I created a repo, and then everybody's like starts using it. They're like, oh my god, this is amazing. I can now implement all these other BERT things. And it's not just answer AI guys there, you know, there's lots of folks, you know, who have like contributed new data set mixes and blah, blah, blah. So, I mean, I can help in the same way that other people can help. So like, then Ben Clavier reached out to me at one point and said, can you help me, like, what have you learned over time about how to manage intimidatingly capable and large groups of people who you're nominally meant to be leading? And so, you know, I like to try to help, but I don't direct. Another great example was Kerem, who, after our FSTP QLORA work, decided quite correctly that it didn't really make sense to use LoRa in today's world. You want to use the normalized version, which is called Dora. Like two or three weeks after we did FSTP QLORA, he just popped up and said, okay, I've just converted the whole thing to Dora, and I've also created these VLLM extensions, and I've got all these benchmarks, and, you know, now I've got training of quantized models with adapters that are as fast as LoRa, and as actually better than, weirdly, fine tuning. Just like, okay, that's great, you know. And yeah, so the things we've done to try to help make these things happen as well is we don't have any required meetings, you know, but we do have a meeting for each pair of major time zones that everybody's invited to, and, you know, people see their colleagues doing stuff that looks really cool and say, like, oh, how can I help, you know, or how can I learn or whatever. So another example is Austin, who, you know, amazing background. He ran AI at Fidelity, he ran AI at Pfizer, he ran browsing and retrieval for Google's DeepMind stuff, created Jemma.cpp, and he's been working on a new system to make it easier to do web GPU programming, because, again, he quite correctly identified, yeah, so I said to him, like, okay, I want to learn about that. Not an area that I have much expertise in, so, you know, he's going to show me what he's working on and teach me a bit about it, and hopefully I can help contribute. I think one of the key things that's happened in all of these is everybody understands what Eric Gilliam, who wrote the second blog post in our series, the R&D historian, describes as a large yard with narrow fences. Everybody has total flexibility to do what they want. We all understand kind of roughly why we're here, you know, we agree with the premises around, like, everything's too expensive, everything's too complicated, people are building too many vanity foundation models rather than taking better advantage of fine-tuning, like, there's this kind of general, like, sense of we're all on the same wavelength about, you know, all the ways in which current research is fucked up, and, you know, all the ways in which we're worried about centralization. We all care a lot about not just research for the point of citations, but research that actually wouldn't have happened otherwise, and actually is going to lead to real-world outcomes. And so, yeah, with this kind of, like, shared vision, people understand, like, you know, so when I say, like, oh, well, you know, tell me, Ben, about BERT 24, what's that about? And he's like, you know, like, oh, well, you know, you can see from an accessibility point of view, or you can see from a kind of a actual practical impact point of view, there's far too much focus on decoder-only models, and, you know, like, BERT's used in all of these different places and industry, and so I can see, like, in terms of our basic principles, what we're trying to achieve, this seems like something important. And so I think that's, like, a really helpful that we have that kind of shared perspective, you know?Alessio [00:21:14]: Yeah. And before we maybe talk about some of the specific research, when you're, like, reaching out to people, interviewing them, what are some of the traits, like, how do these things come out, you know, usually? Is it working on side projects that you, you know, you're already familiar with? Is there anything, like, in the interview process that, like, helps you screen for people that are less pragmatic and more research-driven versus some of these folks that are just gonna do it, you know? They're not waiting for, like, the perfect process.Jeremy [00:21:40]: Everybody who comes through the recruiting is interviewed by everybody in the company. You know, our goal is 12 people, so it's not an unreasonable amount. So the other thing to say is everybody so far who's come into the recruiting pipeline, everybody bar one, has been hired. So which is to say our original curation has been good. And that's actually pretty easy, because nearly everybody who's come in through the recruiting pipeline are people I know pretty well. So Jono Whitaker and I, you know, he worked on the stable diffusion course we did. He's outrageously creative and talented, and he's super, like, enthusiastic tinkerer, just likes making things. Benjamin was one of the strongest parts of the fast.ai community, which is now the alumni. It's, like, hundreds of thousands of people. And you know, again, like, they're not people who a normal interview process would pick up, right? So Benjamin doesn't have any qualifications in math or computer science. Jono was living in Zimbabwe, you know, he was working on, like, helping some African startups, you know, but not FAANG kind of credentials. But yeah, I mean, when you actually see people doing real work and they stand out above, you know, we've got lots of Stanford graduates and open AI people and whatever in our alumni community as well. You know, when you stand out above all of those people anyway, obviously you've got something going for you. You know, Austin, him and I worked together on the masks study we did in the proceeding at the National Academy of Science. You know, we had worked together, and again, that was a group of, like, basically the 18 or 19 top experts in the world on public health and epidemiology and research design and so forth. And Austin, you know, one of the strongest people in that collaboration. So yeah, you know, like, I've been lucky enough to have had opportunities to work with some people who are great and, you know, I'm a very open-minded person, so I kind of am always happy to try working with pretty much anybody and some people stand out. You know, there have been some exceptions, people I haven't previously known, like Ben Clavier, actually, I didn't know before. But you know, with him, you just read his code, and I'm like, oh, that's really well-written code. And like, it's not written exactly the same way as everybody else's code, and it's not written to do exactly the same thing as everybody else's code. So yeah, and then when I chatted to him, it's just like, I don't know, I felt like we'd known each other for years, like we just were on the same wavelength, but I could pretty much tell that was going to happen just by reading his code. I think you express a lot in the code you choose to write and how you choose to write it, I guess. You know, or another example, a guy named Vic, who was previously the CEO of DataQuest, and like, in that case, you know, he's created a really successful startup. He won the first, basically, Kaggle NLP competition, which was automatic essay grading. He's got the current state-of-the-art OCR system, Surya. Again, he's just a guy who obviously just builds stuff, you know, he doesn't ask for permission, he doesn't need any, like, external resources. Actually, Karim's another great example of this, I mean, I already knew Karim very well because he was my best ever master's student, but it wasn't a surprise to me then when he then went off to create the world's state-of-the-art language model in Turkish on his own, in his spare time, with no budget, from scratch. This is not fine-tuning or whatever, he, like, went back to Common Crawl and did everything. Yeah, it's kind of, I don't know what I'd describe that process as, but it's not at all based on credentials.Swyx [00:25:17]: Assemble based on talent, yeah. We wanted to dive in a little bit more on, you know, turning from the people side of things into the technical bets that you're making. Just a little bit more on Bert. I was actually, we just did an interview with Yi Tay from Reka, I don't know if you're familiar with his work, but also another encoder-decoder bet, and one of his arguments was actually people kind of over-index on the decoder-only GPT-3 type paradigm. I wonder if you have thoughts there that is maybe non-consensus as well. Yeah, no, absolutely.Jeremy [00:25:45]: So I think it's a great example. So one of the people we're collaborating with a little bit with BERT24 is Colin Raffle, who is the guy behind, yeah, most of that stuff, you know, between that and UL2, there's a lot of really interesting work. And so one of the things I've been encouraging the BERT group to do, Colin has as well, is to consider using a T5 pre-trained encoder backbone as a thing you fine-tune, which I think would be really cool. You know, Colin was also saying actually just use encoder-decoder as your Bert, you know, why don't you like use that as a baseline, which I also think is a good idea. Yeah, look.Swyx [00:26:25]: What technical arguments are people under-weighting?Jeremy [00:26:27]: I mean, Colin would be able to describe this much better than I can, but I'll give my slightly non-expert attempt. Look, I mean, think about like diffusion models, right? Like in stable diffusion, like we use things like UNet. You have this kind of downward path and then in the upward path you have the cross connections, which it's not a tension, but it's like a similar idea, right? You're inputting the original encoding path into your decoding path. It's critical to make it work, right? Because otherwise in the decoding part, the model has to do so much kind of from scratch. So like if you're doing translation, like that's a classic kind of encoder-decoder example. If it's decoder only, you never get the opportunity to find the right, you know, feature engineering, the right feature encoding for the original sentence. And it kind of means then on every token that you generate, you have to recreate the whole thing, you know? So if you have an encoder, it's basically saying like, okay, this is your opportunity model to create a really useful feature representation for your input information. So I think there's really strong arguments for encoder-decoder models anywhere that there is this kind of like context or source thing. And then why encoder only? Well, because so much of the time what we actually care about is a classification, you know? It's like an output. It's like generating an arbitrary length sequence of tokens. So anytime you're not generating an arbitrary length sequence of tokens, decoder models don't seem to make much sense. Now the interesting thing is, you see on like Kaggle competitions, that decoder models still are at least competitive with things like Deberta v3. They have to be way bigger to be competitive with things like Deberta v3. And the only reason they are competitive is because people have put a lot more time and money and effort into training the decoder only ones, you know? There isn't a recent Deberta. There isn't a recent Bert. Yeah, it's a whole part of the world that people have slept on a little bit. And this is just what happens. This is how trends happen rather than like, to me, everybody should be like, oh, let's look at the thing that has shown signs of being useful in the past, but nobody really followed up with properly. That's the more interesting path, you know, where people tend to be like, oh, I need to get citations. So what's everybody else doing? Can I make it 0.1% better, you know, or 0.1% faster? That's what everybody tends to do. Yeah. So I think it's like, Itay's work commercially now is interesting because here's like a whole, here's a whole model that's been trained in a different way. So there's probably a whole lot of tasks it's probably better at than GPT and Gemini and Claude. So that should be a good commercial opportunity for them if they can figure out what those tasks are.Swyx [00:29:07]: Well, if rumors are to be believed, and he didn't comment on this, but, you know, Snowflake may figure out the commercialization for them. So we'll see.Jeremy [00:29:14]: Good.Alessio [00:29:16]: Let's talk about FSDP, Qlora, Qdora, and all of that awesome stuff. One of the things we talked about last time, some of these models are meant to run on systems that nobody can really own, no single person. And then you were like, well, what if you could fine tune a 70B model on like a 4090? And I was like, no, that sounds great, Jeremy, but like, can we actually do it? And then obviously you all figured it out. Can you maybe tell us some of the worst stories behind that, like the idea behind FSDP, which is kind of taking sharded data, parallel computation, and then Qlora, which is do not touch all the weights, just go quantize some of the model, and then within the quantized model only do certain layers instead of doing everything.Jeremy [00:29:57]: Well, do the adapters. Yeah.Alessio [00:29:59]: Yeah. Yeah. Do the adapters. Yeah. I will leave the floor to you. I think before you published it, nobody thought this was like a short term thing that we're just going to have. And now it's like, oh, obviously you can do it, but it's not that easy.Jeremy [00:30:12]: Yeah. I mean, to be honest, it was extremely unpleasant work to do. It's like not at all enjoyable. I kind of did version 0.1 of it myself before we had launched the company, or at least the kind of like the pieces. They're all pieces that are difficult to work with, right? So for the quantization, you know, I chatted to Tim Detmers quite a bit and, you know, he very much encouraged me by saying like, yeah, it's possible. He actually thought it'd be easy. It probably would be easy for him, but I'm not Tim Detmers. And, you know, so he wrote bits and bytes, which is his quantization library. You know, he wrote that for a paper. He didn't write that to be production like code. It's now like everybody's using it, at least the CUDA bits. So like, it's not particularly well structured. There's lots of code paths that never get used. There's multiple versions of the same thing. You have to try to figure it out. So trying to get my head around that was hard. And you know, because the interesting bits are all written in CUDA, it's hard to like to step through it and see what's happening. And then, you know, FSTP is this very complicated library and PyTorch, which not particularly well documented. So the only really, really way to understand it properly is again, just read the code and step through the code. And then like bits and bytes doesn't really work in practice unless it's used with PEF, the HuggingFace library and PEF doesn't really work in practice unless you use it with other things. And there's a lot of coupling in the HuggingFace ecosystem where like none of it works separately. You have to use it all together, which I don't love. So yeah, trying to just get a minimal example that I can play with was really hard. And so I ended up having to rewrite a lot of it myself to kind of create this like minimal script. One thing that helped a lot was Medec had this LlamaRecipes repo that came out just a little bit before I started working on that. And like they had a kind of role model example of like, here's how to train FSTP, LoRa, didn't work with QLoRa on Llama. A lot of the stuff I discovered, the interesting stuff would be put together by Les Wright, who's, he was actually the guy in the Fast.ai community I mentioned who created the Ranger Optimizer. So he's doing a lot of great stuff at Meta now. So yeah, I kind of, that helped get some minimum stuff going and then it was great once Benjamin and Jono joined full time. And so we basically hacked at that together and then Kerim joined like a month later or something. And it was like, gee, it was just a lot of like fiddly detailed engineering on like barely documented bits of obscure internals. So my focus was to see if it kind of could work and I kind of got a bit of a proof of concept working and then the rest of the guys actually did all the work to make it work properly. And, you know, every time we thought we had something, you know, we needed to have good benchmarks, right? So we'd like, it's very easy to convince yourself you've done the work when you haven't, you know, so then we'd actually try lots of things and be like, oh, and these like really important cases, the memory use is higher, you know, or it's actually slower. And we'd go in and we just find like all these things that were nothing to do with our library that just didn't work properly. And nobody had noticed they hadn't worked properly because nobody had really benchmarked it properly. So we ended up, you know, trying to fix a whole lot of different things. And even as we did so, new regressions were appearing in like transformers and stuff that Benjamin then had to go away and figure out like, oh, how come flash attention doesn't work in this version of transformers anymore with this set of models and like, oh, it turns out they accidentally changed this thing, so it doesn't work. You know, there's just, there's not a lot of really good performance type evals going on in the open source ecosystem. So there's an extraordinary amount of like things where people say like, oh, we built this thing and it has this result. And when you actually check it, so yeah, there's a shitload of war stories from getting that thing to work. And it did require a particularly like tenacious group of people and a group of people who don't mind doing a whole lot of kind of like really janitorial work, to be honest, to get the details right, to check them. Yeah.Alessio [00:34:09]: We had a trade out on the podcast and we talked about how a lot of it is like systems work to make some of these things work. It's not just like beautiful, pure math that you do on a blackboard. It's like, how do you get into the nitty gritty?Jeremy [00:34:22]: I mean, flash attention is a great example of that. Like it's, it basically is just like, oh, let's just take the attention and just do the tiled version of it, which sounds simple enough, you know, but then implementing that is challenging at lots of levels.Alessio [00:34:36]: Yeah. What about inference? You know, obviously you've done all this amazing work on fine tuning. Do you have any research you've been doing on the inference side, how to make local inference really fast on these models too?Jeremy [00:34:47]: We're doing quite a bit on that at the moment. We haven't released too much there yet. But one of the things I've been trying to do is also just to help other people. And one of the nice things that's happened is that a couple of folks at Meta, including Mark Seraphim, have done a nice job of creating this CUDA mode community of people working on like CUDA kernels or learning about that. And I tried to help get that going well as well and did some lessons to help people get into it. So there's a lot going on in both inference and fine tuning performance. And a lot of it's actually happening kind of related to that. So PyTorch team have created this Torch AO project on quantization. And so there's a big overlap now between kind of the FastAI and AnswerAI and CUDA mode communities of people working on stuff for both inference and fine tuning. But we're getting close now. You know, our goal is that nobody should be merging models, nobody should be downloading merged models, everybody should be using basically quantized plus adapters for almost everything and just downloading the adapters. And that should be much faster. So that's kind of the place we're trying to get to. It's difficult, you know, because like Karim's been doing a lot of work with VLM, for example. These inference engines are pretty complex bits of code. They have a whole lot of custom kernel stuff going on as well, as do the quantization libraries. So we've been working on, we're also quite a bit of collaborating with the folks who do HQQ, which is a really great quantization library and works super well. So yeah, there's a lot of other people outside AnswerAI that we're working with a lot who are really helping on all this performance optimization stuff, open source.Swyx [00:36:27]: Just to follow up on merging models, I picked up there that you said nobody should be merging models. That's interesting because obviously a lot of people are experimenting with this and finding interesting results. I would say in defense of merging models, you can do it without data. That's probably the only thing that's going for it.Jeremy [00:36:45]: To explain, it's not that you shouldn't merge models. You shouldn't be distributing a merged model. You should distribute a merged adapter 99% of the time. And actually often one of the best things happening in the model merging world is actually that often merging adapters works better anyway. The point is, Sean, that once you've got your new model, if you distribute it as an adapter that sits on top of a quantized model that somebody's already downloaded, then it's a much smaller download for them. And also the inference should be much faster because you're not having to transfer FB16 weights from HPM memory at all or ever load them off disk. You know, all the main weights are quantized and the only floating point weights are in the adapters. So that should make both inference and fine tuning faster. Okay, perfect.Swyx [00:37:33]: We're moving on a little bit to the rest of the fast universe. I would have thought that, you know, once you started Answer.ai, that the sort of fast universe would be kind of on hold. And then today you just dropped Fastlight and it looks like, you know, there's more activity going on in sort of Fastland.Jeremy [00:37:49]: Yeah. So Fastland and Answerland are not really distinct things. Answerland is kind of like the Fastland grown up and funded. They both have the same mission, which is to maximize the societal benefit of AI broadly. We want to create thousands of commercially successful products at Answer.ai. And we want to do that with like 12 people. So that means we need a pretty efficient stack, you know, like quite a few orders of magnitude more efficient, not just for creation, but for deployment and maintenance than anything that currently exists. People often forget about the D part of our R&D firm. So we've got to be extremely good at creating, deploying and maintaining applications, not just models. Much to my horror, the story around creating web applications is much worse now than it was 10 or 15 years ago in terms of, if I say to a data scientist, here's how to create and deploy a web application, you know, either you have to learn JavaScript or TypeScript and about all the complex libraries like React and stuff, and all the complex like details around security and web protocol stuff around how you then talk to a backend and then all the details about creating the backend. You know, if that's your job and, you know, you have specialists who work in just one of those areas, it is possible for that to all work. But compared to like, oh, write a PHP script and put it in the home directory that you get when you sign up to this shell provider, which is what it was like in the nineties, you know, here are those 25 lines of code and you're done and now you can pass that URL around to all your friends, or put this, you know, .pl file inside the CGI bin directory that you got when you signed up to this web host. So yeah, the thing I've been mainly working on the last few weeks is fixing all that. And I think I fixed it. I don't know if this is an announcement, but I tell you guys, so yeah, there's this thing called fastHTML, which basically lets you create a complete web application in a single Python file. Unlike excellent projects like Streamlit and Gradio, you're not working on top of a highly abstracted thing. That's got nothing to do with web foundations. You're working with web foundations directly, but you're able to do it by using pure Python. There's no template, there's no ginger, there's no separate like CSS and JavaScript files. It looks and behaves like a modern SPA web application. And you can create components for like daisy UI, or bootstrap, or shoelace, or whatever fancy JavaScript and or CSS tailwind etc library you like, but you can write it all in Python. You can pip install somebody else's set of components and use them entirely from Python. You can develop and prototype it all in a Jupyter notebook if you want to. It all displays correctly, so you can like interactively do that. And then you mentioned Fastlight, so specifically now if you're using SQLite in particular, it's like ridiculously easy to have that persistence, and all of your handlers will be passed database ready objects automatically, that you can just call dot delete dot update dot insert on. Yeah, you get session, you get security, you get all that. So again, like with most everything I do, it's very little code. It's mainly tying together really cool stuff that other people have written. You don't have to use it, but a lot of the best stuff comes from its incorporation of HTMX, which to me is basically the thing that changes your browser to make it work the way it always should have. So it just does four small things, but those four small things are the things that are basically unnecessary constraints that HTML should never have had, so it removes the constraints. It sits on top of Starlet, which is a very nice kind of lower level platform for building these kind of web applications. The actual interface matches as closely as possible to FastAPI, which is a really nice system for creating the kind of classic JavaScript type applications. And Sebastian, who wrote FastAPI, has been kind enough to help me think through some of these design decisions, and so forth. I mean, everybody involved has been super helpful. Actually, I chatted to Carson, who created HTMX, you know, so about it. Some of the folks involved in Django, like everybody in the community I've spoken to definitely realizes there's a big gap to be filled around, like, highly scalable, web foundation-based, pure Python framework with a minimum of fuss. So yeah, I'm getting a lot of support and trying to make sure that FastHTML works well for people.Swyx [00:42:38]: I would say, when I heard about this, I texted Alexio. I think this is going to be pretty huge. People consider Streamlit and Gradio to be the state of the art, but I think there's so much to improve, and having what you call web foundations and web fundamentals at the core of it, I think, would be really helpful.Jeremy [00:42:54]: I mean, it's based on 25 years of thinking and work for me. So like, FastML was built on a system much like this one, but that was of hell. And so I spent, you know, 10 years working on that. We had millions of people using that every day, really pushing it hard. And I really always enjoyed working in that. Yeah. So, you know, and obviously lots of other people have done like great stuff, and particularly HTMX. So I've been thinking about like, yeah, how do I pull together the best of the web framework I created for FastML with HTMX? There's also things like PicoCSS, which is the CSS system, which by default, FastHTML comes with. Although, as I say, you can pip install anything you want to, but it makes it like super easy to, you know, so we try to make it so that just out of the box, you don't have any choices to make. Yeah. You can make choices, but for most people, you just, you know, it's like the PHP in your home directory thing. You just start typing and just by default, you'll get something which looks and feels, you know, pretty okay. And if you want to then write a version of Gradio or Streamlit on top of that, you totally can. And then the nice thing is if you then write it in kind of the Gradio equivalent, which will be, you know, I imagine we'll create some kind of pip installable thing for that. Once you've outgrown, or if you outgrow that, it's not like, okay, throw that all away and start again. And this like whole separate language that it's like this kind of smooth, gentle path that you can take step-by-step because it's all just standard web foundations all the way, you know.Swyx [00:44:29]: Just to wrap up the sort of open source work that you're doing, you're aiming to create thousands of projects with a very, very small team. I haven't heard you mention once AI agents or AI developer tooling or AI code maintenance. I know you're very productive, but you know, what is the role of AI in your own work?Jeremy [00:44:47]: So I'm making something. I'm not sure how much I want to say just yet.Swyx [00:44:52]: Give us a nibble.Jeremy [00:44:53]: All right. I'll give you the key thing. So I've created a new approach. It's not called prompt engineering. It's called dialogue engineering. But I'm creating a system for doing dialogue engineering. It's currently called AI magic. I'm doing most of my work in this system and it's making me much more productive than I was before I used it. So I always just build stuff for myself and hope that it'll be useful for somebody else. Think about chat GPT with code interpreter, right? The basic UX is the same as a 1970s teletype, right? So if you wrote APL on a teletype in the 1970s, you typed onto a thing, your words appeared at the bottom of a sheet of paper and you'd like hit enter and it would scroll up. And then the answer from APL would be printed out, scroll up, and then you would type the next thing. And like, which is also the way, for example, a shell works like bash or ZSH or whatever. It's not terrible, you know, like we all get a lot done in these like very, very basic teletype style REPL environments, but I've never felt like it's optimal and everybody else has just copied chat GPT. So it's also the way BART and Gemini work. It's also the way the Claude web app works. And then you add code interpreter. And the most you can do is to like plead with chat GPT to write the kind of code I want. It's pretty good for very, very, very beginner users who like can't code at all, like by default now the code's even hidden away, so you never even have to see it ever happened. But for somebody who's like wanting to learn to code or who already knows a bit of code or whatever, it's, it seems really not ideal. So okay, that's one end of the spectrum. The other end of the spectrum, which is where Sean's work comes in, is, oh, you want to do more than chat GPT? No worries. Here is Visual Studio Code. I run it. There's an empty screen with a flashing cursor. Okay, start coding, you know, and it's like, okay, you can use systems like Sean's or like cursor or whatever to be like, okay, Apple K in cursors, like a creative form that blah, blah, blah. But in the end, it's like a convenience over the top of this incredibly complicated system that full-time sophisticated software engineers have designed over the past few decades in a totally different environment as a way to build software, you know. And so we're trying to like shoehorn in AI into that. And it's not easy to do. And I think there are like much better ways of thinking about the craft of software development in a language model world to be much more interactive, you know. So the thing that I'm building is neither of those things. It's something between the two. And it's built around this idea of crafting a dialogue, you know, where the outcome of the dialogue is the artifacts that you want, whether it be a piece of analysis or whether it be a Python library or whether it be a technical blog post or whatever. So as part of building that, I've created something called Claudette, which is a library for Claude. I've created something called Cosette, which is a library for OpenAI. They're libraries which are designed to make those APIs much more usable, much easier to use, much more concise. And then I've written AI magic on top of those. And that's been an interesting exercise because I did Claudette first, and I was looking at what Simon Willison did with his fantastic LLM library. And his library is designed around like, let's make something that supports all the LLM inference engines and commercial providers. I thought, okay, what if I did something different, which is like make something that's as Claude friendly as possible and forget everything else. So that's what Claudette was. So for example, one of the really nice things in Claude is prefill. So by telling the assistant that this is what your response started with, there's a lot of powerful things you can take advantage of. So yeah, I created Claudette to be as Claude friendly as possible. And then after I did that, and then particularly with GPT 4.0 coming out, I kind of thought, okay, now let's create something that's as OpenAI friendly as possible. And then I tried to look to see, well, where are the similarities and where are the differences? And now can I make them compatible in places where it makes sense for them to be compatible without losing out on the things that make each one special for what they are. So yeah, those are some of the things I've been working on in that space. And I'm thinking we might launch AI magic via a course called how to solve it with code. The name is based on the classic Polya book, if you know how to solve it, which is, you know, one of the classic math books of all time, where we're basically going to try to show people how to solve challenging problems that they didn't think they could solve without doing a full computer science course, by taking advantage of a bit of AI and a bit of like practical skills, as particularly for this like whole generation of people who are learning to code with and because of ChatGPT. Like I love it, I know a lot of people who didn't really know how to code, but they've created things because they use ChatGPT, but they don't really know how to maintain them or fix them or add things to them that ChatGPT can't do, because they don't really know how to code. And so this course will be designed to show you how you can like either become a developer who can like supercharge their capabilities by using language models, or become a language model first developer who can supercharge their capabilities by understanding a bit about process and fundamentals.Alessio [00:50:19]: Nice. That's a great spoiler. You know, I guess the fourth time you're going to be on learning space, we're going to talk about AI magic. Jeremy, before we wrap, this was just a great run through everything. What are the things that when you next come on the podcast in nine, 12 months, we're going to be like, man, Jeremy was like really ahead of it. Like, is there anything that you see in the space that maybe people are not talking enough? You know, what's the next company that's going to fall, like have drama internally, anything in your mind?Jeremy [00:50:47]: You know, hopefully we'll be talking a lot about fast HTML and hopefully the international community that at that point has come up around that. And also about AI magic and about dialogue engineering. Hopefully dialogue engineering catches on because I think it's the right way to think about a lot of this stuff. What else? Just trying to think about all on the research side. Yeah. I think, you know, I mean, we've talked about a lot of it. Like I think encoder decoder architectures, encoder only architectures, hopefully we'll be talking about like the whole re-interest in BERT that BERT 24 stimulated.Swyx [00:51:17]: There's a safe space model that came out today that might be interesting for this general discussion. One thing that stood out to me with Cartesia's blog posts was that they were talking about real time ingestion, billions and trillions of tokens, and keeping that context, obviously in the state space that they have.Jeremy [00:51:34]: Yeah.Swyx [00:51:35]: I'm wondering what your thoughts are because you've been entirely transformers the whole time.Jeremy [00:51:38]: Yeah. No. So obviously my background is RNNs and LSTMs. Of course. And I'm still a believer in the idea that state is something you can update, you know? So obviously Sepp Hochreiter came up, came out with xLSTM recently. Oh my God. Okay. Another whole thing we haven't talked about, just somewhat related. I've been going crazy for like a long time about like, why can I not pay anybody to save my KV cash? I just ingested the Great Gatsby or the documentation for Starlet or whatever, you know, I'm sending it as my prompt context. Why are you redoing it every time? So Gemini is about to finally come out with KV caching, and this is something that Austin actually in Gemma.cpp had had on his roadmap for years, well not years, months, long time. The idea that the KV cache is like a thing that, it's a third thing, right? So there's RAG, you know, there's in-context learning, you know, and prompt engineering, and there's KV cache creation. I think it creates like a whole new class almost of applications or as techniques where, you know, for me, for example, I very often work with really new libraries or I've created my own library that I'm now writing with rather than on. So I want all the docs in my new library to be there all the time. So I want to upload them once, and then we have a whole discussion about building this application using FastHTML. Well nobody's got FastHTML in their language model yet, I don't want to send all the FastHTML docs across every time. So one of the things I'm looking at doing in AI Magic actually is taking advantage of some of these ideas so that you can have the documentation of the libraries you're working on be kind of always available. Something over the next 12 months people will be spending time thinking about is how to like, where to use RAG, where to use fine-tuning, where to use KV cache storage, you know. And how to use state, because in state models and XLSTM, again, state is something you update. So how do we combine the best of all of these worlds?Alessio [00:53:46]: And Jeremy, I know before you talked about how some of the autoregressive models are not maybe a great fit for agents. Any other thoughts on like JEPA, diffusion for text, any interesting thing that you've seen pop up?Jeremy [00:53:58]: In the same way that we probably ought to have state that you can update, i.e. XLSTM and state models, in the same way that a lot of things probably should have an encoder, JEPA and diffusion both seem like the right conceptual mapping for a lot of things we probably want to do. So the idea of like, there should be a piece of the generative pipeline, which is like thinking about the answer and coming up with a sketch of what the answer looks like before you start outputting tokens. That's where it kind of feels like diffusion ought to fit, you know. And diffusion is, because it's not autoregressive, it's like, let's try to like gradually de-blur the picture of how to solve this. So this is also where dialogue engineering fits in, by the way. So with dialogue engineering, one of the reasons it's working so well for me is I use it to kind of like craft the thought process before I generate the code, you know. So yeah, there's a lot of different pieces here and I don't know how they'll all kind of exactly fit together. I don't know if JEPA is going to actually end up working in the text world. I don't know if diffusion will end up working in the text world, but they seem to be like trying to solve a class of problem which is currently unsolved.Alessio [00:55:13]: Awesome, Jeremy. This was great, as usual. Thanks again for coming back on the pod and thank you all for listening. Yeah, that was fantastic. Get full access to Latent Space at www.latent.space/subscribe

Noob Spearo Podcast | Spearfishing Talk with Shrek and Turbo
NSP:264 W.A Part 2 2024 | Going North with Old Man Blue | Bert Keulder & Deryck Tan

Noob Spearo Podcast | Spearfishing Talk with Shrek and Turbo

Play Episode Listen Later Jul 3, 2024 37:02


Interview with Bert Keulder & Deryck Tan Todays interview is with Bert Keulder & Deryck Tan as we travel north for part 2 of the WA trip! Today we suffer a vehicle breakdown, go diving around Geraldton and chat about the highlights and memorable fish from the trip. Important times 00:13 Intro 03:20 G'day guys, we have some bad news! 07:25 Diving Geraldton 09:40 Cuttlefish and Dhufish 14:50 Berts nightmare: Deryck and the Buff Bream 18:00 Tough dive conditions 20:45 Broken anchor 26:25 Bert's onboard cooker 29:35 Plans for the rest of the trip 34:15 Outro Listen in and subscribe on iOS or Android Important Links   Noob Spearo Partners and Discount Codes | Get Spear Ready and make the most of your next spearfishing trip! 50 days to better spearfishing! - Use the code NOOBSPEARO for a free hat of your choice from FuckTheTaxman.com . Use the code NOOBSPEARO save $20 on every purchase over $200 at checkout – Flat shipping rate, especially in AUS! – Use the code NOOB10 to save 10% off anything store-wide. Free Shipping on USA orders over $99 | Simple, Effective, Dependable Wooden Spearguns. Use the Code NOOB to save $30 on any speargun:) | 10% off for listeners with code: NOOBSPEARO | Get 10% off Sharkshield Technology | Freedom7 or Scuba7 enter the code NOOBSPEARO | ‘Spearo Dad' | ‘Jobfish Tribute' | 99 Spearo Recipes use the code SPEARO to get 20% off any course 28-day Freediving Transformation | Equalization Masterclass – Roadmap to Frenzel | The 5 minute Freediver | Break the 10 Meter Barrier – Use the code NOOBSPEARO to save . Listen to 99 Tips to Get Better at Spearfishing | Wickedly tough and well thought out gear! Check out the legendary

Let's Talk AI
#171 - - Apple Intelligence, Dream Machine, SSI Inc

Let's Talk AI

Play Episode Listen Later Jun 24, 2024 124:01 Transcription Available


Our 171st episode with a summary and discussion of last week's big AI news! With hosts Andrey Kurenkov (https://twitter.com/andrey_kurenkov) and Jeremie Harris (https://twitter.com/jeremiecharris) Feel free to leave us feedback here. Read out our text newsletter and comment on the podcast at https://lastweekin.ai/ Email us your questions and feedback at contact@lastweekin.ai and/or hello@gladstone.ai Timestamps + Links: (00:00:00) Intro / Banter Tools & Apps(00:03:13) Apple Intelligence: every new AI feature coming to the iPhone and Mac (00:10:03) ‘We don't need Sora anymore': Luma's new AI video generator Dream Machine slammed with traffic after debut (00:14:48) Runway unveils new hyper realistic AI video model Gen-3 Alpha, capable of 10-second-long clips (00:18:21) Leonardo AI image generator adds new video mode — here's how it works (00:22:31) Anthropic just dropped Claude 3.5 Sonnet with better vision and a sense of humor Applications & Business(00:28:23 ) Sam Altman might reportedly turn OpenAI into a regular for-profit company (00:31:19) Ilya Sutskever, Daniel Gross, Daniel Levy launch Safe Superintelligence Inc. (00:38:53) OpenAI welcomes Sarah Friar (CFO) and Kevin Weil (CPO) (00:41:44) Report: OpenAI Doubled Annualized Revenue in 6 Months (00:44:30) AI startup Adept is in deal talks with Microsoft (00:48:55) Mistral closes €600m at €5.8bn valuation with new lead investor (00:53:12) Huawei Claims Ascend 910B AI Chip Manages To Surpass NVIDIA's A100, A Crucial Alternative For China (00:56:58) Astrocade raises $12M for AI-based social gaming platform Projects & Open Source(01:01:03) Announcing the Open Release of Stable Diffusion 3 Medium, Our Most Sophisticated Image Generation Model to Date (01:05:53) Meta releases flurry of new AI models for audio, text and watermarking (01:09:39) ElevenLabs unveils open-source creator tool for adding sound effects to videos Research & Advancements(01:12:02) Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling (01:22:07) Improve Mathematical Reasoning in Language Models by Automated Process Supervision (01:28:01) Introducing Lamini Memory Tuning: 95% LLM Accuracy, 10x Fewer Hallucinations (01:30:32) An Empirical Study of Mamba-based Language Models (01:31:57) BERTs are Generative In-Context Learners (01:33:33) SELFGOAL: Your Language Agents Already Know How to Achieve High-level Goals Policy & Safety(01:35:16) Sycophancy to subterfuge: Investigating reward tampering in language models (01:42:26) Waymo issues software and mapping recall after robotaxi crashes into a telephone pole (01:45:53) Meta pauses AI models launch in Europe (01:46:44) Refusal in Language Models Is Mediated by a Single Direction Sycophancy to subterfuge: Investigating reward tampering in language models (01:51:38) Huawei exec concerned over China's inability to obtain 3.5nm chips, bemoans lack of advanced chipmaking tools Synthetic Media & Art(01:55:07) It Looked Like a Reliable News Site. It Was an A.I. Chop Shop. (01:57:39) Adobe overhauls terms of service to say it won't train AI on customers' work (01:59:31) Buzzy AI Search Engine Perplexity Is Directly Ripping Off Content From News Outlets (02:02:23) Outro + AI Song 

TBTL: Too Beautiful To Live
#4191 The Berts And The Bees (Or “The Bert In The High Castle”)

TBTL: Too Beautiful To Live

Play Episode Listen Later Apr 24, 2024 68:06


Luke met the most interesting man in the world yesterday. Not the guy you're thinking of - someone even more interesting than that. He and Andrew also preview a very special edition of TBTL that they're working on.   

Morgonpasset i P3 – Gästen
Johan Ulveson: ”När jag tänker på döden så är jag rädd”

Morgonpasset i P3 – Gästen

Play Episode Listen Later Apr 9, 2024 19:45


Berts pappa i Bert! Fredrik i c/o Segemyhr! Den blinda nazisten i filmen Yrrol! Johan Ulveson har gjort många ikoniska roller, där en av dem är Skalle-Per i den nya filmatiseringen av Ronja. Vi blickar tillbaka på hans roller men också framåt med den nya Skalle-Per. Och hur känner Johan Ulveson inför döden? Lyssna på alla avsnitt i Sveriges Radio Play. Programledare: David Druid och Linnea Wikblad.

Alaska Wild Project
AWP Episode 160 "We All Win When We Win" w/Bert Sorin of Sorinex Excercise Equipment

Alaska Wild Project

Play Episode Listen Later Mar 25, 2024 175:06


AWP Episode 160 “We All Win When We Win” w/Bert Sorin of Sorinex Exercise Equipment, Sorinex Summer Strong & Winterstrong. - Broomed off”   Daniel Buitrago, Brandon Fifield, Jack Lau & Chad Aurentz crush the rack with special guest, Bert Sorin co-owner & President of Sorinex Exercise Equipment out of South Carolina   Ultra focused high's, Chugach Powder Guides & snow cats, Chad's dude week, social media hacks, (Coffee Creamer & THC edibles), edible truths, 15ft case drop, last chance snowmobile snow check @ Alaska Mining & Diving, event ends March 31st, ($1,000 in parts, accessories & Gear or a 1-year bumper bumper warranty), That PPP $$$, the 2 billion $ Godmother, the dark side of the student loan process, the union option ain't a bad idea, spring bear time is coming, you wanna fuck with bears or sharks, shooting a bear form the knee prone position, the brown bear charge, rolling the dice with Chad, Sorinex intro, Argentina Red Stag Hunt, Bert's African hunt, Sorin Equipment Starting in 1980, Berts growth into the business through training, building a brand through tracks & trails, #1 perpetuate the business brand, #2 personal brand, #3 Learn nothing #4Teach something, The 2024 Winterstrong vibe, we all win we we win, Brandon's winterstrong experience, developing the “FARM”, Kipp & Donny @ the Farm, Soinex Outdoors, building the fitness & custom equipment process, being an ambassador to the Global War on Terrorism  Memorial Foundation (Mike Rodriguez),   Visit our website - www.alaskawildproject.com Follow on Instagram - www.instagram.com/alaskawildproject Watch on YouTube - www.youtube.com/@alaskawildproject Support on Patreon - www.patreon.com/alaskawildproject  

Gamlaste nytt
Berts mentala värld och Liljestrands blunder

Gamlaste nytt

Play Episode Listen Later Mar 14, 2024 59:30


I dagens avsnitt gör Rasmus en djupdykning i nöjesprofilen och entreprenören Bert Karlssons mentala hälsa. Hur mår han EGENTLIGEN? Bertil ger sin syn på tv-programmet Marko och Irma samtidigt som kulturjournalisten och författaren Jens Liljestrand får sin en Nato-känga. Dessutom, varför har Malin Åkermans karriär blivit som den blivit? Och vem skriver tipslistorna på Metro EGENTLIGEN?

Life with One Eye
A Mercurial Life - Chapter 18: The Beach, relief, and wounded hurt during the Kitchen work at Berts

Life with One Eye

Play Episode Listen Later Mar 8, 2024 23:08


Inspired by Kahlil Gibran, Bert' Market and Folly Beach, Elle, Trey, Eric, and Patrick.  Audiobook.  Mature listeners only (18+).

Review Party Dot Com
RPDC 184: Poor Q*Bert

Review Party Dot Com

Play Episode Listen Later Jan 9, 2024 50:57


This little guys, *sigh*. He just wasn't made for this world. What we do have here, though, are some otherworldly (internet) reviews for Long John Silvers, TotalWine.come, Romantix Adult Arcade, A Christmas Carol, a silly little activity journal, and a big ol' arena in Omaha. Buckle up, Berts, this is a bumpy ride! Want more party? Check it out at https://www.reviewpartydotcom.com/ !

Scales N Tales
Episode 129 Bert Salas

Scales N Tales

Play Episode Listen Later Dec 14, 2023 84:39 Very Popular


Local DFW swimbait stick Bert Salas comes on the show to answer all my random questions. I can honestly say we covered dang near every topic I can think of, FFS, Striper fishing, ideal fishing scenario, bait modding and of course shad vs trout glides. We didn't get super deep into stories or experiences, so expect a pt2 for this episode soon. Also check out Bert's clothing brand Province Clothing Co. Berts socials: IG: @bertsalas IG:provinceclothignco Check out Leviathan Rods, and use code scales20 at check out for 20% off all your rod purchases! ⁠⁠https://www.leviathanrods.com⁠ ⁠  Check out the new official SNT tackle shop sponsor, Lake Pro Tackle! Use code "SCALES" at checkout for 15% off your order of any conventional or Swimbait-related products! ⁠https://lakeprotackle.com/⁠   Pro Bass Adventures Mexico is the only company with lodges on both Lake El Salto and Lake Lake Baccarac in western Mexico. More 10+ pound monster bass have consistently been caught from these two lakes than anywhere else on earth. If you are considering a Mexico bass fishing trip, look no further.  ⁠https://www.mexicofishing.net/index.html  Meat Crafters is now offering 10% off their site when you use code SCALESNSLICES at checkout! This is small batch meat made with immense quality and attention to detail. My favorite product of theirs so far is the Raging Brats! Made with real local brewed IPA and fresh ingredients to complement the whole Brat, it's no surprise why this is my favorite! ⁠⁠⁠⁠https://www.meatcrafters.com/⁠⁠⁠ --- Support this podcast: https://podcasters.spotify.com/pod/show/sntpod/support

The Mike Calta Show Featured Cut of the Day

The Mike Calta Show Featured Cut

TT Filmpodcast
280. The Cobweb Machine Wolf Like Me!

TT Filmpodcast

Play Episode Listen Later Aug 31, 2023 61:31


I detta intima och rätt så obehagliga avsnitt tar vi med er och försöker antingen skrämma er eller locka er. Vi har varit på bio och sett kanske årets bästa skräckfilm. I filmen COBWEB träffar vi den 8 årige Peter. Han är mobbad i skolan och hans föräldrar är löjligt överbeskyddande. En natt hör han knackningar i väggen från hans rum och detta kommer vända uppochned på hela hans tillvaro.  Tomas har äntligen tagit tag i THE GUARDIAN OF THE GALAXY VOL 3, nu när den finns tillgänglig på hemmabio. Frågan är om han gillade James Gunns sorti från Marvels universum. Vi vet ju att Thomas gillande den men det brukar ju resultera ett motsatt resultat från den andra poddhalvan. Vi har även sett den nu streaming och Blu-ray aktuella filmen THE MACHINE där ståupparen och poddaren Bert Kreischer fått möjlighet att brodera ut en film från sin ungdom när han var student och rånade ett tåg i Ryssland. Självaste Luke Skywalker (Mark Hamill) är med på färden som Berts svinjobbiga pappa. Frågan är om filmen är svinjobbig eller om detta var en fin överrasking? Och avslutningsvis så tar Thomas tag i sitt uppdrag med tv-serien WOLF LIKE ME. Tomas har tjatat och tjatat i flera år om denna serie. Isla Fisher och Josh Gad frontar denna 6 delars serie om sorg, hopplöshet, trasiga själar och en helt oväntat komponent. Gillade Thomas det? Ifall han gjorde det, kommer han någonsin erkänna det? Det blir en fullmatat timme som svischar förbi i ett nafs. Så vad väntar du på. Tryck på Play! Det blir intressant, vi lovar!

Ready Set Roll
Ep.20 who framed Bert Bronson

Ready Set Roll

Play Episode Listen Later Jul 26, 2023 68:43


In this episode of legends we take a step back into Berts past and find out what or who he's really running from Support us @ https://www.patreon.com/ReadySetRoll1 Like us on Facebook http://facebook.com/readysetroll1 Twitter http://twitter.com/readysetroll20   We all like shirts get yours at http://rsrmerch.com  Get your set of Dice at http://diceenvy.com/readysetroll and get 10% off   Don't forget to rate, review, subscribe & share!  

The Nonlinear Library
AF - Speculative inferences about path dependence in LLM supervised fine-tuning from results on linear mode connectivity and model souping by Robert Kirk

The Nonlinear Library

Play Episode Listen Later Jul 20, 2023 8:34


Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Speculative inferences about path dependence in LLM supervised fine-tuning from results on linear mode connectivity and model souping, published by Robert Kirk on July 20, 2023 on The AI Alignment Forum. TL;DR: I claim that supervised fine-tuning of the existing largest LLMs is likely path-dependent (different random seeds and initialisations have an impact on final performance and model behaviour), based on the fact that when fine-tuning smaller LLMs, models pretrained closer to convergence produce fine-tuned models with similar mechanisms while this isn't the case for models pretrained without being close to convergence; this is analogous to current LLMs that are very far from convergence at the end of training. This is supported by linking together existing work on model souping, linear mode connectivity, mechanistic similarity and path dependence. Epistemic status: Written in about two hours, but thought about for longer. Experiments could definitely test these hypotheses. Acknowledgements: Thanks to Ekdeep Singh Lubana for helpful comments and corrections, and discussion which lead to this post. Thanks also to Jean Kaddour, Nandi Schoots, Akbir Khan, Laura Ruis and Kyle McDonell for helpful comments, corrections and suggestions on drafts of this post. Terminology Model souping is the procedure of taking a pretrained model, fine-tuning it with different hyperparameters and random seeds on the same task, and then averaging the parameters of all the networks. This gets better results on both in-distribution and out-of-distribution testing in Computer Vision when fine-tuning a large-scale contrastively-pretrained transformer or CNN image model on ImageNet-like tasks. (Linear) mode connectivity (LMC) between two models on a task means that any (linear) interpolation in parameter space between the two models achieves the same or lower loss as the two models. A training process is path independent if it always reaches (roughly) the same outcome regardless of irrelevant details or randomness (for example network initialisation or data ordering in supervised learning, or sampling from a policy in supervised learning). A training process is path dependent if it's the opposite. There is of course nuance in what counts as "irrelevant details of randomness". For this post we can operationalise this as just data ordering and network initialisation in a supervised learning context. Linking terminology together: For model souping to work, you likely need linear mode connectivity to hold between all the models you're averaging on the tasks you care about - the average is one point on the linear interpolation. (In fact you need more than that - the average point needs to have better loss, not just the same). If a training process always produces linearly connected models, then we can think of it as being approximately path independent. Mechanistic Mode Connectivity shows that for converged vision models, two models being linearly connected implies they use similar mechanisms to predict the output (specifically they're invariant to the same set of interventions on the data generating process). Linear Connectivity Reveals Generalization Strategies shows empirically a similar phenomenon: fine-tuned BERT models that are linearly connected generalise in similar ways out-of-distribution. Overall this gives us this picture of properties a training process can have: Current Results Linear Connectivity Reveals Generalization Strategies shows that different fine-tunes of BERT on the same task are often linearly disconnected. In Appendix J they show that this isn't the case for different fine-tunes of RoBERTa, with the main difference between BERT and RoBERTa being much longer pretraining on more data. BERTs of a feather do not generalize together: Large variability in generalization across...

Puppkultur
Folge 2: Ernie und Bert

Puppkultur

Play Episode Listen Later Jul 1, 2023


Sie sind die wohl bekanntesten Klappmäuler der weltweiten Fernsehgeschichte. Ernie und Bert kennt jedes Kind. Viele Generationen wuchsen inzwischen mit dem Comedy-Duo auf. Wir beleuchten das Phänomen von Grund auf, diskutieren über die deutschen Synchronstimmen, präsentieren unsere Lieblingsclips und klären sogar die Frage, wie Berts berühmte Monobraue funktioniert. Taucht mit uns ein in eure Kindheit und schaut dabei ein wenig hinter die Kulissen! Besprochene Bilder: 0:24:30 https://ibb.co/1Mrky3b 0:42:15 https://ibb.co/rHvfx86 0:51:25 https://ibb.co/1JQsrrB 0:56:02 https://ibb.co/2WM5sL3 0:57:04 https://ibb.co/wN41mHD 1:09:37 https://ibb.co/HY1qtmB 1:12:24 https://ibb.co/kJPLqgB 1:13:58 https://ibb.co/r66qbJm

Nerd Forensics
31: The Berts from Brazil (Pt. 4 of the Critique of "God's Debris" by Scott Adams)

Nerd Forensics

Play Episode Listen Later Jun 20, 2023 117:56


Join Millicent Oriana and her co-host Jacob Urban and producer Sofia Baca as they, once again, delve into a review of “God's Debris”. References: Invader Zim: “Germs”. SpongeBob SquarePants: “Just One Bite”. Miracles by ICP. It's Always Sunny - “The Gang buys a Boat”, “The Gang Saves the Day”. Documentary Now! - “Location is Everything”. Peep Show - “St. Hospital's” The Whitest Kids You Know - “Gallon of PCP”. Willy Wonka and the Chocolate Factory (1971). King of the Hill - “Witches of East Arlen”. Hero (2002). This episode is distributed under a Creative Commons Attribution-ShareAlike 4.0 International License. For more information, visit creativecommons.org/licenses/by-sa/4.0/ [Featuring: Millicent Oriana, Jacob Urban, Sofia Baca] --- Support this podcast: https://podcasters.spotify.com/pod/show/nerdforensics/support

IB Podcast - Over God, Israël en de Bijbel
Wat is er gebeurd met de Joodse Groningers?

IB Podcast - Over God, Israël en de Bijbel

Play Episode Play 23 sec Highlight Listen Later Jun 9, 2023 9:33


De Tweede Wereldoorlog betekent vrijwel het einde van Joods Groningen.  Door een administratieve fout ontloopt de vader van Bert van der Hak de gaskamer van Auschwitz. Luister naar Berts verhaal over zijn Groningse familie en hoe hij tot geloof is gekomen.  En niet alleen hij...!Meer weten over Israël en de Bijbel? Klik hier.Blijf op de hoogteVia Facebook, Instagram Via IB Magazine  of de gratis nieuwsbriefVia de gebedsapp op Telegram of WhatsAppSupport the showEen Bijbel in elk Joods huis

Hard To Follow
Childhood Traumas

Hard To Follow

Play Episode Listen Later May 7, 2023 71:21


In this episode we get into some crazy stories. Berts starts us of with his imaginary scenario of human sized cockroaches. We create emo Miriam. We discuss responsibilities of parents while traveling and how we all got disciplined as children. We also have a phone call with a special guest.

My First Fantasy
S4E25 - NHL - Grand Finale

My First Fantasy

Play Episode Listen Later Apr 3, 2023 43:39


Josh and Berts are back to bring you into the final week of your Fantasy Hockey championship. https://www.tankathon.com/nhl/remaining_schedule_strength

My First Fantasy
S4E24 - NHL - Reunited (and it Feels So Good)

My First Fantasy

Play Episode Listen Later Mar 27, 2023 40:21


Josh and Berts reunited for this week's streams and schedule. Hockey Schedule Helper: https://public.tableau.com/app/profile/jesse8369/viz/HockeyScheduleHelper/HockeyScheduleHelper Contact us on Twitter @myfirstfantasy1

Andrew Schulz's Flagrant 2 with Akaash and Kaz
Bert Kreischer Gets Pranked By His Biggest Fear

Andrew Schulz's Flagrant 2 with Akaash and Kaz

Play Episode Listen Later Mar 21, 2023 129:24


Whats up people, we've got The Machine in the building, Bert Kreischer. Schulz reached out to his 2 bears, 1 cave co-host Tom Segura to get some info on some of Berts fears... turns out he hates clowns and balloons. We DID NOT expect it to end up like this. INDULGE! 00:00 START 01:27 Rolex boys 04:59 WHO Picked berts movie wife? + kissing a new woman 08:42 Why Bert won't cheat + Bert loves to EAT 12:46 Being honest + earning the ability to talk about yourself 15:30 Tosh is the greatest + colluding in poker 20:48 Bert's comedy style + having unbelievable conviction 30:26 Patrice O'Neal story + insight into their lives 32:07 Birth of "The Machine" + being protective about personal life 37:39 Bert LOVES Segura even if Bert gets hurt 41:54 Music industry infiltrated Comedy + wanting to make Comedy 45:58 Stand up clips are gonna end comedy but it WORKS 51:15 Body counts really do count 59:57 Segura sends his regards + "you can't trust them" 01:03:53 Flagrant unspiked Bert's beer 01:04:57 "I can smell them" - Balloon PTSD 01:09:18 Ari is more trustworthy than Flagrant + can't trust Clowns 01:15:04 Bert is terrified of flying + blimp panic attack 01:18:02 Industrial Clown Complex is done 01:19:17 Deaths, Phobias, panic attacks & spelunking 01:39:49 Travel show - cave diving or wingsuiting will be your funeral 01:45:04 Bert is relentless + failure was ultimate blessing 01:54:26 Early Rogan appearances - impacted EVERYTHING 02:04:22 Comics used to hate each other 02:06:41 Blessed to have lived the best life

My First Fantasy
S4E22 - NHL - Finals Prep

My First Fantasy

Play Episode Listen Later Mar 16, 2023 17:54


Berts takes you through this week's streams and schedule. Hockey Schedule Helper: https://public.tableau.com/app/profile/jesse8369/viz/HockeyScheduleHelper/HockeyScheduleHelper Contact us on Twitter @myfirstfantasy1

My First Fantasy
S4E21 - NHL - Week 20 Preview

My First Fantasy

Play Episode Listen Later Feb 28, 2023 25:47


Berts takes you through this week's streams and schedule. Hockey Schedule Helper: https://public.tableau.com/app/profile/jesse8369/viz/HockeyScheduleHelper/HockeyScheduleHelper Contact us on Twitter @myfirstfantasy1

Ins Unreine gesprochen...
#IUG_059 - Intelligenz, künstlich..

Ins Unreine gesprochen...

Play Episode Listen Later Feb 28, 2023 10:55


Alle reden davon. Oder damit. Ich jetzt auch. Die künstliche - mehr oder weniger - Intelligenz steht einem überall im Weg. Wir können sie nutzen - und uns das Leben einfacher machen. Nur: Wie einfach darf es sein, damit wir nicht komplett (geistig) verarmen? Eine gute Frage... im Selbstgespräch bei "Ins Unreine Gesprochen" - ungeschnitten, ohne Postproduktion und ohne Jingles.. Ach ja... der Film, den ich ansprach ist Wall-E. Mehr dazu gibt es bei diesem Wikipedia https://de.wikipedia.org/wiki/WALL%C2%B7E_%E2%80%93_Der_Letzte_r%C3%A4umt_die_Erde_auf Mehr wissenwollen? Fragen zum Thema? Anregungen? Themen? Her damit! Elektrische Post an: br@raschkowski.org Mehr zum Kopf, der hinter dieser Sache steckt? https://www.raschkowski.org/ Das Impuls- und Ideenteiler-Format 14-täglich Live & Bunt in Zoom https://www.raschkowski.org/fragbertfragt/ Home of the BERTS - mit Liebe von Hand gemachte Aufstellungsfiguren https://www.raschkowski.org/berts/

My First Fantasy
S4E20 - NHL - Wheel Good Streams

My First Fantasy

Play Episode Listen Later Feb 20, 2023 48:06


Josh and Berts are back to run you through this week's streams and schedule. Hockey Schedule Helper: https://public.tableau.com/app/profile/jesse8369/viz/HockeyScheduleHelper/HockeyScheduleHelper Contact us on Twitter @myfirstfantasy1

Nare Jongens Podcast
Nare Jongens Podcast 103 - TokTokTok Special

Nare Jongens Podcast

Play Episode Listen Later Feb 14, 2023 65:12


Wat een show! Geerts vragen, Mei Li's afgang, Yesims stand up, Bentes integriteitsschandaal en als klap op de vuurpijl: nieuws over Berts mislukte sollicitatie bij wat de meest inclusieve krant van Nederland blijkt te zijn! De TokTokTok Special van de Nare Jongens Podcast. Met een natte bende op het eind. Ook de extra's ontvangen? Meld je aan bij petjeaf.com/narejongens!

My First Fantasy
S4E19 - NHL - The Kids Are Alright!

My First Fantasy

Play Episode Listen Later Feb 14, 2023 25:33


Berts has your fantasy hockey matchup covered this week on another episode of the My First Fantasy podcast. Hockey Schedule Helper: https://public.tableau.com/app/profile/jesse8369/viz/HockeyScheduleHelper/HockeyScheduleHelper Player Comparator: https://public.tableau.com/app/profile/jesse8369/viz/NHLPlayerComparator/PlayerComparator Contact us on Twitter @myfirstfantasy1

My First Fantasy
S4E17 - NHL - Ranger Fantasy Woes

My First Fantasy

Play Episode Listen Later Jan 23, 2023 48:28


Josh and Berts run you through the fantasy hockey schedule and streamer options for this week. Hockey Schedule Helper: https://public.tableau.com/app/profile/jesse8369/viz/HockeyScheduleHelper/HockeyScheduleHelper Player Comparator: https://public.tableau.com/app/profile/jesse8369/viz/NHLPlayerComparator/PlayerComparator Contact us on Twitter @myfirstfantasy1

Kvällspasset i P4
Då blev det blött

Kvällspasset i P4

Play Episode Listen Later Jan 18, 2023 55:50


Visst är det rätt blött nu? Är det inte regn så är det översvämning och är det inte det så är det snö och tö. När blev det blött för dig? Ett nyfiket och underhållande aktualitetsprogram med lyssnaren i fokus.Väldigt blött detta! Vi fick bland annat höra från Ronny som fick en väldigt blöt överraskning med totalrenovering som följd, Magnus har haft vattensäng i 30 år och Berts hund födde kattungar på hans mage.Och så blir det en massa extramaterial där vi bland annat får höra om vattensängsförbud, hur det känns att få se poddfolkstecknet och på vilka sätt man kan cykla genom en vattenpöl.

The Bert Show
BERTS NEW DEODERANT ISN"T WORKING AND CASSIE GOT AN INVITE FOR A TV SHOW OVER VIRAL TIK TOK

The Bert Show

Play Episode Listen Later Nov 23, 2022 12:44


Learn more about your ad choices. Visit megaphone.fm/adchoices

Rediscovering the Indies
Episode 23 - Bert Prentice Part 5

Rediscovering the Indies

Play Episode Listen Later Jul 14, 2022


We finish the deep dive into the promoting career of Bert Prentice. On this episode we cover the end of 2002 until Berts passing in 2021. We cover events such as Bert being the local promoter for TNA, starting USA Championship Wrestling, working angles in Memphis for Corey Maclin, Being in Stings movie, promoting the 50th Anniversary of Jerry Lawler event. All this and much more on this deep dive of one of pro wrestling's most interesting people.

tna stings jerry lawler prentice berts usa championship wrestling
Me Before Mom
Season 3 Episode 4: Berts What to Watch List

Me Before Mom

Play Episode Listen Later Jun 7, 2022 34:47


Today Bert gushes over one of her most beloved pastimes…television! From nostalgic moments of bonding with family and friends to enjoying some much-needed alone time after a long day, the idiot box holds a special place in all of our hearts. Find out what shows Bert recommends, and maybe even pick up a tip on how to enjoy your favorite entertainment whenever you want.Matriarch Digital Media (matriarchdm.com) produces this and other podcasts that understand, encourage and uplift women.

Kvällspasset i P4
Kvällspasset i P4 med Rasmus Persson: Repliker som fastnat

Kvällspasset i P4

Play Episode Listen Later Jun 3, 2022 53:53


Vilken replik riktigt golvade dig?! Kvällspasset snackar om orden, fraserna eller uttrycken som blivit klassiker i ditt gäng. Det handlar om repliker som etsat sig fast i minnet. Berts kypare presenterade mimosan med en mycket osmaklig replik. När Elisabeth ser en soptunna blir hon fikasugen och Per-Arnes morbror kom på ett snärtigt svar på den banala och ibland innehållslösa frågan Hur mår du?I extramaterialet pratar gänget om egna påhittade ord och fraser. Det spelas även upp ett par minnesvärda klipp från reality-tv. Det blir en poddline, och Rasmus får rätt!

Olikheter - En podcast om ledarskap
Ibland blåser det och är ensamt på toppen – men utsikten kan vara vacker

Olikheter - En podcast om ledarskap

Play Episode Listen Later May 12, 2022 43:00


Tänk dig att du är VD i det samägda familjeföretaget. Du är dina familjemedlemmars chef, du skall alltid ha bolagets bästa för näthinnan och du är ytterst ansvarig för affären och arbetsmiljön. I dag träffar vi Bert Petersson, tillika VD i det familjeägda företaget LPE Sverige, ett entreprenörsföretag med flera strängar på sin lyra. Lyssna på Berts erfarenheter om vad det innebär att driva familjeföretag och vad hans syn på nycklar till framgång är.     Vi som poddar med Bert idag är Mats och Lars från @Leadership2Grow 

Bussin' With The Boys
Bert Kreischer (pt. 2)

Bussin' With The Boys

Play Episode Listen Later May 5, 2022 125:17 Very Popular


Recorded: April 22, 2022 | Is part 2 with Bert better than part 1 with Bert? The only way to find out is to subscribe, follow the boys, and listen to it. Intro (0:00) Bert interview starts (4:00) Bert's first time doing comedy (5:43) Writing process of a stand-up routine (21:05) Sending photos of his piece & not being able to do a first kiss (28:00) Kool-Aid & Jennifer Anniston viral moments (34:20) Tier Talk - best fast food burgers (51:40) Berts love for glizzy's (1:12:54) Magic of being a fan (1:20:50) Strategies to sell tickets (1:44:20) ----- SHOP: https://store.barstoolsports.com/collections/bussin-with-the-boys FOLLOW THE BOYS Instagram: https://www.instagram.com/bussinwtb Twitter: https://twitter.com/BussinWTB Facebook: https://www.facebook.com/BussinWTB Website: https://www.bussinwtb.com ----- SUPPORT OUR SPONSORS: Chevy: Chevy Silverado - The Strongest, Most Advanced Silverado Ever. Georgia Boot: Go to https://barstool.link/GeorgiaBoot and use code BUSSIN for 20% off Duke Cannon: Use code “Bussin” at https://barstool.link/DukeCannonBSS for 15% off your first order.

Bussin' With The Boys
Bert Kreischer (part 1)

Bussin' With The Boys

Play Episode Listen Later May 3, 2022 118:48 Very Popular


Recorded: April 22, 2022 | The moment we have all been waiting for. Bert Kreischer finally blesses the bus with his presence for a podcast so good we had to split it up into two parts. It's just as epic as you think it is going to be (Mt. Busmore). Intro (0:00) Bert Kreischer interview starts (21:27) Doing his own stunts and blowing out his tricep (25:50) Superstitions (29:30) Berts first time having sex scarred him for life (41:02) Swimming with sharks (54:40) Playing catch up in the stand up comedy industry (1:08:50) Meeting Aaron Rodgers (1:22:45) Gift giving with Tom Segura and Joe Rogan (1:38:00) End pod (1:58:47) ----- SHOP: https://store.barstoolsports.com/collections/bussin-with-the-boys FOLLOW THE BOYS Instagram: https://www.instagram.com/bussinwtb Twitter: https://twitter.com/BussinWTB Facebook: https://www.facebook.com/BussinWTB Website: https://www.bussinwtb.com ----- SUPPORT OUR SPONSORS: Chevy: Chevy Silverado - The Strongest, Most Advanced Silverado Ever. Georgia Boot: Go to https://barstool.link/GeorgiaBoot and use code BUSSIN for 20% off WhistlePig Whiskey: Get your bottle at https://barstool.link/WhistlePigBSS or at a local retailer. Duke Cannon: Use code “Bussin” at https://barstool.link/DukeCannonBSS for 15% off your first order. Roman: Go to https://barstool.link/BussinRoman to get $15 off your first order of ED treatment if approved

Lauwarmduscher
Alles hat ein Ende, nur die Berts sind zwei.

Lauwarmduscher

Play Episode Listen Later Apr 27, 2022 56:56


Zum großen Staffelfinale haben Steven und Marti nicht einen, nicht drei, nein: ZWEI Gäste eingeladen. Es handelt sich um die Typen hinter der erfolgreichen Audio Sitcom "Jour Fixe": Robert Sladeczek und Albert Bozesan. Live aus München zugeschaltet treffen zum ersten Mal alle vier aufeinander und erleben gemeinsam das verklingen der (vorerst) letzten Folge. Es war uns eine Ehre und ein Vergnügen. Bleibt sauber.

Segðu mér
Andrea Róberts framkvæmdastjóri

Segðu mér

Play Episode Listen Later Jan 17, 2022 40:00


Andrea Róberts elskar mánudaga og segist brosandi gangast við öllum tilfinningum sínum. Hún er framkavæmdastjóri FKA og segir frá því starfi.

Segðu mér
Andrea Róberts framkvæmdastjóri

Segðu mér

Play Episode Listen Later Jan 17, 2022


Andrea Róberts elskar mánudaga og segist brosandi gangast við öllum tilfinningum sínum. Hún er framkavæmdastjóri FKA og segir frá því starfi.

B.A.L.D
Episode 29 : BERTS A SQUIRTER FEAT MEL MEL

B.A.L.D

Play Episode Listen Later Mar 10, 2021 125:03


WAZZUP MFS WE GOT MCDONS RUNNINN THRU OUR VEINS SO U KNOW THE VIBEZ BOUTTA BE ON. WE GOT MELISSA AKA MEL MEL AKA LUCERNE VALLEY GOAT IN THE ROOM. WE TELL HELLA LIL MAN AKA TAY SOUL STORIES AND ALL HIS 5 FOOT SHENANIGANS. JK TAY WE LUV U BUT U BE DOIN SUM DUMB SHIDD SUMTIMES. BERT VIBEZ MAD FUCKED UP WE THIS THIS MF DYIN ON US. HIS AWAKE APNEA MACHINE DEADASS GAVE OUT AND THE MCDONS TOOK THE WHEEL AND DROVE HIM STRAIGHT TO CLEEP TOWN. IAN VIBEZ FINALLY BACK UP HE WAZ MISSIN HIS BIG BRO ET BUT THEY LINKED NOW HE FEELIN REJUVINATED. LMAO CATCH YALL NEXT WEEK. RIP OUR INTERN SHE GOT FIRED SO HARD SHE MOVED TO THE NEXT CITY SMH WHAT CAN YA DO

4 guys at once
The boy is back in town!

4 guys at once

Play Episode Listen Later Jan 26, 2021 76:30


Berts back from being a lumberjack and we talk about aliens and who would win in a fight between Nickelodeon characters and Cartoon network characters. Make sure to subscribe, and follow our podcast @4guysatonce on all social platforms and listen on all major podcast platforms. Email us at 4guyatonce@gmail.com Facebook Twitter Instagram --- Send in a voice message: https://podcasters.spotify.com/pod/show/4guysatonce/message Support this podcast: https://podcasters.spotify.com/pod/show/4guysatonce/support

Once Upon A Table Podcast
Bar Room Blitz

Once Upon A Table Podcast

Play Episode Listen Later Mar 21, 2020 47:06


An actual play fifth edition Dungeons & Dragons podcast, set in a fictional version of modern day Milwaukee.  Displaced, weary and aggravated one by one our characters find themselves arriving to a local east side bar, Berts. Henk tries to smell a new friend. Rob gets his boiler maker on the house. Mark learns the joys of reading. The guys meet a cop, but leave with no baseball cards.