IBM z17 is here! In episode 50 of Mixture of Experts, host Tim Hwang is joined by Kate Soule, Shobhit Varshney and Hillery Hunter to debrief the launch of a new mainframe with robust AI infrastructure. Next, Meta dropped Llama 4 over the weekend; how's it going? Then, Shobhit is recording live from Google Cloud Next in Las Vegas, where Gemini 2.5 Pro is in the spotlight. What are some of the most exciting announcements? Finally, new Pew Research Center data shows how the public perceives AI; how does this impact the industry? All that and more on today's 50th Mixture of Experts. 00:01 -- Intro 00:55 -- IBM z17 11:42 -- Llama 4 25:02 -- Google Cloud Next 2025 34:29 -- Pew's research on perception of AI The opinions expressed in this podcast are solely those of the participants and do not necessarily reflect the views of IBM or any other organization or entity. Explore the new features of IBM z17: https://www.ibm.com/products/z17 Read the Pew Research: https://www.pewresearch.org/internet/2025/04/03/how-the-us-public-and-ai-experts-view-artificial-intelligence/ Subscribe for AI updates: https://ibm.biz/Think_newsletter Visit the Mixture of Experts podcast page to learn more AI content: https://www.ibm.com/think/podcasts/mixture-of-experts
Will OpenAI be fully open source by 2027? In episode 49 of Mixture of Experts, host Tim Hwang is joined by Aaron Baughman, Ash Minhas and Chris Hay to analyze Sam Altman's latest move towards open source. Next, we explore Anthropic's mechanistic interpretability results and the progress the AI research community is making. Then, can Apple catch up? We analyze the latest critiques on Apple Intelligence. Finally, Amazon enters the chat with AI agents. How does this elevate the competition? All that and more on today's Mixture of Experts. 00:01 -- Introduction 00:48 -- OpenAI goes open 11:36 -- Anthropic interpretability results 24:55 -- Daring Fireball on Apple Intelligence 34:22 -- Amazon's AI agents The opinions expressed in this podcast are solely those of the participants and do not necessarily reflect the views of IBM or any other organization or entity. Subscribe for AI updates: https://www.ibm.com/account/reg/us-en/signup?formid=news-urx-52120 Learn more about artificial intelligence → https://www.ibm.com/think/artificial-intelligence Visit Mixture of Experts podcast page to learn more AI content → https://www.ibm.com/think/podcasts/mixture-of-experts
What's the best open-source model? In episode 48 of Mixture of Experts, host Tim Hwang is joined by Kate Soule, Kush Varshney and Skyler Speakman to explore the future of open-source AI models. First, we chat about the release of DeepSeek-V3-0324. Then, more announcements coming out of Google including Gemini Canvas and Gemini 2.5. Next, Extropic has entered the chat with a thermodynamic chip. Finally, AI image generation is on the rise as OpenAI released GPT-4o image generation. All that, and more on today's Mixture of Experts. 00:01 – Intro 00:42 – DeepSeek-V3-0324 09:48 – Gemini 2.5 and Canvas 21:27 – Extropic's thermodynamic chip 30:20 – OpenAI image generation The opinions expressed in this podcast are solely those of the participants and do not necessarily reflect the views of IBM or any other organization or entity.
Hey everyone, Alex here
What's the most exciting announcement coming out of NVIDIA GTC? In episode 47 of Mixture of Experts, host Tim Hwang is joined by Nathalie Baracaldo, Kaoutar El Maghraoui and Vyoma Gajjar. First, we dive into the latest announcements from NVIDIA GTC, including the Groot N1 model for humanoid robotics. Next, Baidu released some new AI reasoning models, and they're not open source? Then, for our paper of the week we discuss the flaws of Chain-of-Thought reasoning. Finally, Gemini Flash 2.0 has released image generation models for developer experimentation. Is Google catching up in the AI game? Tune in to today's Mixture of Experts to find out! 00:01 – Intro 01:27 – NVIDIA GTC 14:18 – New Baidu AI models 21:19 – Chain-of-Thought reasoning 32:18 – Gemini image generation The opinions expressed in this podcast are solely those of the participants and do not necessarily reflect the views of IBM or any other organization or entity.
March 18, 2025 Ezra 9:1-15; Ps. 31:9-18; Prov. 11:16-17; I Cor. 5:9-13
Is Manus a second DeepSeek moment? In episode 46 of Mixture of Experts, host Tim Hwang is joined by Chris Hay, Kaoutar El Maghraoui and Vyoma Gajjar to talk Manus! Next, the rise of vibe coding: what started as a joke has now become a thing. Then, we dive deep into the future of scaling laws. Finally, Perplexity is teaming up with Deutsche Telekom to release an AI phone; what's the motivation here? Tune in to today's Mixture of Experts to find out more! 00:01 – Intro 00:37 -- Manus 14:09 – Vibe coding 30:13 – Scaling laws 39:07 – Perplexity's AI phone The opinions expressed in this podcast are solely those of the participants and do not necessarily reflect the views of IBM or any other organization or entity.
When can we expect quantum to reach consumer devices? In episode 45 of Mixture of Experts, host Tim Hwang is joined by special guest, Blake Johnson, to debrief the quantum noise in the news. Blake helps us understand the intersection between quantum and AI and how far we are from this technology. Then, veteran experts Chris Hay and Volkmar Uhlig hash out some other news in AI this week. We cover Anthropic's Model Context Protocol, CoreWeave filing for an IPO and Sesame AI's new voice companion. All that and more on today's Mixture of Experts! 00:01 – Intro 01:06 – Quantum leap 20:08 -- Model Context Protocol 28:24 -- CoreWeave IPO 40:12 -- Sesame AI voice companion The opinions expressed in this podcast are solely those of the participants and do not necessarily reflect the views of IBM or any other organization or entity.
The 365 Days of Astronomy, the daily podcast of the International Year of Astronomy 2009
Hosted by Chris Beckett & Shane Ludtke, two amateur astronomers in Saskatchewan. actualastronomy@gmail.com The Observer's Calendar for March 2025 on Episode 472 of the Actual Astronomy podcast. I'm Chris and joining me is Shane. We are amateur astronomers who love looking up at the night sky and this podcast is for everyone who enjoys going out under the stars. March 4th is Pancake Tuesday. March 5 - Moon 0.6 degrees N of Pleiades but 6-7 degrees E of M45 for us. March 6 - Lunar X & V visible. March 7 - Lunar Straight Wall and Walther sunrise ray visible on Moon. March 8 - Mercury at greatest evening elongation, 18 degrees from the Sun in the W, & Mars 1.7 degrees S of Moon. March 9 - Jewelled Handle visible on Moon. March 11 - 2 satellites visible on Jupiter at 8:42 pm EST. March 12 - Asteroid 8 Flora at opposition, m=9.5. Discovered by Hind in 1847, it is the innermost large asteroid and the seventh brightest. The name was proposed by John Herschel for the Latin goddess of flowers and gardens. Parent of the Flora family of asteroids; a mixture of silicate rock and nickel-iron metal. March 12 - also, Wargentin pancake visible on Moon. March 13 - M93 well placed this evening. March 14 - Lunar eclipse for NA - just before midnight on the 13th… for us it's best around 2:45 CST. March 20 - Spring equinox. March 22 - Zodiacal light becomes visible for a couple of weeks in the W evening sky. March 23 - Large tides this week. March 24 - Mare Orientale visible on Moon - 6 am. March 27 - NGC 2579 nebula and cluster well placed for observing this evening - Galaxy NGC 2784. March 28 - Friday, best weekend this year for a Messier Marathon. March 29 - Partial solar eclipse - centred on Northern Labrador and Baffin Island. Gegenschein visible from a very dark site, high in the S at midnight. March 30 - More large tides - Sirius B, "The Pup" - current separation about 11 arc seconds, max in 50 years.
https://www.rasc.ca/sirius-b-observing-challenge Concluding Listener Message: Please subscribe and share the show with other stargazers you know and send us show ideas, observations and questions to actualastronomy@gmail.com We've added a new way to donate to 365 Days of Astronomy to support editing, hosting, and production costs. Just visit: https://www.patreon.com/365DaysOfAstronomy and donate as much as you can! Share the podcast with your friends and send the Patreon link to them too! Every bit helps! Thank you! ------------------------------------ Do go visit http://www.redbubble.com/people/CosmoQuestX/shop for cool Astronomy Cast and CosmoQuest t-shirts, coffee mugs and other awesomeness! http://cosmoquest.org/Donate This show is made possible through your donations. Thank you! (Haven't donated? It's not too late! Just click!) ------------------------------------ The 365 Days of Astronomy Podcast is produced by the Planetary Science Institute. http://www.psi.edu Visit us on the web at 365DaysOfAstronomy.org or email us at info@365DaysOfAstronomy.org.
DJ Bully B - Essence of Soul - Divine Mixture - 4-3-25
Is pre-training dead? In this bonus episode of Mixture of Experts, guest host Bryan Casey is joined by Kate Soule and Chris Hay. On Thursday, Sam Altman dropped GPT-4.5 just after we wrapped our weekly recording. We got a few of our veteran experts on the podcast to analyze OpenAI's largest and “best” chat model yet. What's the hype? Tune in to this bonus episode to find out! 00:01 – Intro 00:25 – GPT-4.5 The opinions expressed in this podcast are solely those of the participants and do not necessarily reflect the views of IBM or any other organization or entity.
Granite 3.2 is officially here! In episode 44 of Mixture of Experts, host Tim Hwang is joined by Kate Soule, Maya Murad and Kaoutar El Maghraoui to debrief a few big AI announcements. Last week we covered small vision-language models (VLMs), and this week Granite 3.2 dropped with new VLMs, enhanced reasoning capabilities, and more! Kate takes us under the hood to understand the new features and how they were created. Next, Anthropic dropped a new intelligence model, Claude 3.7 Sonnet, and a new agentic coding tool, Claude Code. Why did Anthropic release these separately? Then, as we cannot have an episode without covering agents, Maya takes us through the new BeeAI agents! Finally, can fine-tuning on a malicious task lead to much broader misalignment? Our experts analyze a new paper released on 'Emergent Misalignment.' All that and more on this week's episode! 00:01 – Intro 00:41 – Claude 3.7 Sonnet 11:58 – BeeAI agents 20:11 – Granite 3.2 29:23 – Emergent misalignment The opinions expressed in this podcast are solely those of the participants and do not necessarily reflect the views of IBM or any other organization or entity.
DJ Bully B - The Essence of Soul - 100% Independent Music Mixture 26/2/25
This Week in Machine Learning & Artificial Intelligence (AI) Podcast
Today, we're joined by Ron Diamant, chief architect for Trainium at Amazon Web Services, to discuss hardware acceleration for generative AI and the design and role of the recently released Trainium2 chip. We explore the architectural differences between Trainium and GPUs, highlighting its systolic array-based compute design, and how it balances performance across key dimensions like compute, memory bandwidth, memory capacity, and network bandwidth. We also discuss the Trainium tooling ecosystem including the Neuron SDK, Neuron Compiler, and Neuron Kernel Interface (NKI). We also dig into the various ways Trainium2 is offered, including Trn2 instances, UltraServers, and UltraClusters, and access through managed services like AWS Bedrock. Finally, we cover sparsity optimizations, customer adoption, performance benchmarks, support for Mixture of Experts (MoE) models, and what's next for Trainium. The complete show notes for this episode can be found at https://twimlai.com/go/720.
Paul Maurice, Florida Panthers Head Coach, joins the show! Did 4 Nations rock the entire world? Some Canadian and USA 11th province banter. And the Stanley Cup is the only thing on the Panthers mind?!
What is all the hype around Deep Research? In episode 43 of Mixture of Experts, host Tim Hwang is joined by Kate Soule, Volkmar Uhlig and Shobhit Varshney. This week, we discuss reasoning model features coming out of companies like OpenAI's Deep Research, Google Gemini, Perplexity, xAI's Grok-3 and more! Next, OpenAI is rumored to release an inference chip, but how likely is this to be a success in the AI chip game? Then, we analyze the capabilities of small vision-language models (VLMs). Finally, a startup, Firecrawl, released a job posting in search of an AI agent. Is this the future for AI tools in the workforce? Tune in to today's Mixture of Experts to find out. 00:01 – Intro 00:35 – Deep Research 11:58 – OpenAI inference chip 22:17 – Small VLMs 32:31 – AI agent job posting The opinions expressed in this podcast are solely those of the participants and do not necessarily reflect the views of IBM or any other organization or entity.
Kollel Iyun Halacha. Shiurim are held Sun-Thurs at 185 Miller Road, Lakewood NJ. For more info email: kih185miller@gmail.com
Live from Paris, Tim Hwang is at the AI Action Summit 2025. In episode 42 of Mixture of Experts, we welcome Anastasia Stasenko, CEO and Co-Founder of pleias, along with our veteran experts Marina Danilevsky and Chris Hay. Last week, we touched on some potential conversations at the Paris AI Summit; this week we recap what actually happened. Is AI safety improving globally? Next, for our paper of the week, we break down s1: Simple test-time scaling. Then, Sam Altman is back with another blog, “Three Observations,” what do our experts have to say? Finally, what can we learn from Anthropic's Economic Index? All that and more on today's Mixture of Experts. 00:01 – Intro 00:42 – Paris AI Summit 11:10 – s1: Simple test-time scaling 19:32 – Sam Altman's “Three Observations” 30:41 – Anthropic's Economic Index The opinions expressed in this podcast are solely those of the participants and do not necessarily reflect the views of IBM or any other organization or entity. Resources: Read the paper about s1: Simple test-time scaling: https://arxiv.org/abs/2501.19393 Read Sam Altman's "Three Observations": https://blog.samaltman.com/three-observations Read Anthropic's Economic Index: https://www.anthropic.com/economic-index Read more about AGI: https://www.ibm.com/think/topics/artificial-general-intelligence
This week I welcome on the show two of the most important technologists ever, in any field. Jeff Dean is Google's Chief Scientist, and through 25 years at the company, has worked on basically the most transformative systems in modern computing: from MapReduce, BigTable, Tensorflow, AlphaChip, to Gemini. Noam Shazeer invented or co-invented all the main architectures and techniques that are used for modern LLMs: from the Transformer itself, to Mixture of Experts, to Mesh Tensorflow, to Gemini and many other things. We talk about their 25 years at Google, going from PageRank to MapReduce to the Transformer to MoEs to AlphaChip – and maybe soon to ASI. My favorite part was Jeff's vision for Pathways, Google's grand plan for a mutually-reinforcing loop of hardware and algorithmic design and for going past autoregression. That culminates in us imagining *all* of Google-the-company going through one huge MoE model. And Noam just bites every bullet: 100x world GDP soon; let's get a million automated researchers running in the Google datacenter; living to see the year 3000. Sponsors: Scale partners with major AI labs like Meta, Google Deepmind, and OpenAI. Through Scale's Data Foundry, labs get access to high-quality data to fuel post-training, including advanced reasoning capabilities. If you're an AI researcher or engineer, learn about how Scale's Data Foundry and research lab, SEAL, can help you go beyond the current frontier at scale.com/dwarkesh. Curious how Jane Street teaches their new traders? They use Figgie, a rapid-fire card game that simulates the most exciting parts of markets and trading. It's become so popular that Jane Street hosts an inter-office Figgie championship every year. Download from the app store or play on your desktop at figgie.com. Meter wants to radically improve the digital world we take for granted. They're developing a foundation model that automates network management end-to-end. To do this, they just announced a long-term partnership with Microsoft for tens of thousands of GPUs, and they're recruiting a world class AI research team. To learn more, go to meter.com/dwarkesh. Advertisers: To sponsor a future episode, visit: dwarkeshpatel.com/p/advertise. Timestamps: 00:00:00 - Intro 00:02:44 - Joining Google in 1999 00:05:36 - Future of Moore's Law 00:10:21 - Future TPUs 00:13:13 - Jeff's undergrad thesis: parallel backprop 00:15:10 - LLMs in 2007 00:23:07 - “Holy s**t” moments 00:29:46 - AI fulfills Google's original mission 00:34:19 - Doing Search in-context 00:38:32 - The internal coding model 00:39:49 - What will 2027 models do? 00:46:00 - A new architecture every day? 00:49:21 - Automated chip design and intelligence explosion 00:57:31 - Future of inference scaling 01:03:56 - Already doing multi-datacenter runs 01:22:33 - Debugging at scale 01:26:05 - Fast takeoff and superalignment 01:34:40 - A million evil Jeff Deans 01:38:16 - Fun times at Google 01:41:50 - World compute demand in 2030 01:48:21 - Getting back to modularity 01:59:13 - Keeping a giga-MoE in-memory 02:04:09 - All of Google in one model 02:12:43 - What's missing from distillation 02:18:03 - Open research, pros and cons 02:24:54 - Going the distance Get full access to Dwarkesh Podcast at www.dwarkeshpatel.com/subscribe
She's had the organist. Now she wants the Vicar. A Series in 17 parts, by Blacksheep. Listen to the Podcast at Steamy Stories. Mia weakly raised her hand and switched off the shower. "That was amazing, Gordy-pie. Organists really are good with their hands!" "Not so bad yourself," he panted. "Wow. I enjoyed that immensely! You're quite a lass, Mia." "I'd like to see you play the organ," she said, stepping out of the shower and reaching for a towel. "I need to get my breath back first!" He laughed, as Mia began playfully drying him off. "God, you're an eager little beaver aren't you?" "Hee hee. Yes, but what I meant was, I'd like to see you play the church organ. I've not been inside a church for years. Jenna said that St Michael's is cool." "It's a nice church." I wonder what else she's told her? Gordon thought. "Why not come along to the Sunday service? You can see me in action there, so to speak. After the service, you can have a go on the organ if you'd like. Do you play any musical instruments?" "Guitar and violin, but I've not practiced for ages." "Ah, so strings are your thing? That's good. It'd be nice to have a violinist in the choir. One of the choristers plays the trumpet. Which keeps him from singing and I'm glad of it as his voice is bloody awful." Mia sniggered. "You're funny, Gordy-pie. I really like you. Are all organists as fun as you?" "Nay lass. I'm one of a kind." He pulled her close and kissed her neck and lips. He was an incredible kisser, and she was curious to know more about him. "Are you married?" "Long divorced," came his reply. "I'm married to the pipe organ, as they say." He wondered if Jenna had mentioned anything about their various liaisons over the past year, and was about to say something, when the bathroom door suddenly opened. "Jen! Ever thought of knocking before entering?" Mia gasped, covering herself with a towel. "I can't leave you alone for five minutes can I?"
She turned to Gordon, who grinned sheepishly at her. "Um, hello!" "Funny place to have organ lessons, Gordon," Jenna said, as she watched him squirm. "Gordy-pie was just showing me how good an organist is with his hands, weren't you?" Mia said, kissing him. "And you know what, he's amazing!" "Oh I'm well aware of how good he is," Jenna replied, folding her arms. Sensing disapproval, Gordon attempted to explain. "It just happened. I didn't know your cousin was here," he prattled. "I put the plant pots in the yard, went into the kitchen and she was just there, wearing nothing but a towel!" "You don't need to explain yourself, Gordy-pie. We've not done anything wrong," Mia said. "We're both single. Why are you so uptight, Jenna? Is it because we're in the vicarage? Is that like, a sin or something?" Jenna was in no position to claim the moral high ground. "No, no of course not. I was, just a bit surprised, that's all. It's fine. Just, try to be a bit more discreet, Mia. What if Simon had walked in?" "Oh I'm sure the good reverend would approve," Gordon smiled, winking at her. The perceptive Mia noticed his gesture and wondered what he was hinting at. Jenna took a deep breath. "Okay, well I'm going to have a coffee. I'll leave you to get dressed. Do you want a drink, Gordon?" "A tea would be lovely. I'm parched. Thanks!" "I'll have tea as well, please," Mia added. Jenna left the bathroom. "She's acting weird," Mia said. "There's something she's not telling me." Oh boy, wait until you find out, Gordon thought. Your mind will be blown. "Maybe she's a bit envious!" Gordon said as he picked up his clothes, and wondered where his underpants had gone. "Can I keep these, Gordy-pie?" Mia giggled, holding up his white briefs. "Think they're too big for you!" "I don't want to wear them. I want to keep them under my pillow and sniff them at night." "In that case, they're all yours! But I want your knickers in return!" "Fair's fair!"
She tossed her pale pink cotton undies to him. "Thanks!" "I loved our shower time," Mia said, kissing him again. "And I loved your big cock. You're a sexy man, Gordy-pie." "Gordy-pie hopes Mia-pie can play with his organ again very soon!" the organist replied as they got dressed and headed downstairs. Jenna brought them both a cup of tea as they sat down in the lounge. "Gordon, you're not going to put up with her calling you that cringey nickname are you?" she said, handing him the cup. "I like it. It's cute," he said, as Mia rested her head on his shoulder. "It's childish. If someone had called you that a year ago, you'd have bitten their head off. You used to have a terrible temper." "Ah well that was before I saw the light," he said, sipping his tea. "When you, showed me the way." He smiled at Jenna as she sat opposite them. "For that, you know I am forever grateful," he added. "Did you become a born again Christian like Jenna, Gordy-pie?" Mia asked. "I've always been a Christian," Gordon replied. "I just sin a lot, that's all. As we all do, right?" He raised an eyebrow at the vicar's wife. "But we keep praying for forgiveness every week, and luckily for us, God is the forgiving sort, eh?" The front door opened and Reverend Morris came in. "Good lord, I need a large brandy!" he gasped, tossing the car keys on the table. "What I have seen, can't be unseen, and what I've heard, can't be unheard!" "Whatever's the matter Simon?" Jenna said, standing up. "You were right, Jen. Gladys Wilcox and the churchwarden. They're, at it!" "Told you so," Jenna said. "Actual sex? I'm not being ageist but can Gladys manage that at her age?" "No. Regular vanilla sex would've been easier to deal with. Actually, I think gerbilling would be easier to deal with. But seeing Norman, naked in her backyard, wearing a pinny and being struck on his arse with a riding crop," Jenna cleared her throat, trying to silence him, given that they had company. "She treats him like a slave and he enjoys it!"
The vicar continued, unaware there was an audience. "And there's more. She knows about the storeroom threesome, and you won't believe this, she proudly told me, that sometime during Lent, she performed oral sex on Gordon." "Ahem. Simon, shush, we've got," Jenna cringed. "Wait, what? She gave Gordon oral?" Mia's jaw dropped. "Sucked him off whilst he was sat at the church organ! She'd wanted him to be her slave, but he declined. So she set her sights on Norman instead. Well we both know Gordon prefers a younger woman, right?" He turned round, and noticed Gordon sat on the settee, and Mia sat next to him. "Oh, good afternoon Gordon!" "I brought those plant pots you wanted," the organist meekly uttered. Later, Jenna and Reverend Morris sat on the settee watching an episode of Father Brown, although neither were really paying attention to it. "I can't get that image out of my head. Gladys giving Gordon a blowjob and whipping Norman's bare buttocks. I know we've, engaged in some naughtiness, but I never imagined one of the oldest members of the church was into that sort of thing!" "Good for her," Jenna replied. "Kinkiness aside, it's nice for her to have Norman as a lodger. I mean, she lives alone and in this day and age, older people can feel vulnerable. I know Gladys misses her hubby a lot." "Oh Bert. Yes. He was dead long before I came to St Michaels. Bishop George told me more about him. He was the organist before Gordon took over. Apparently he was quite a character." "I'm sure he was. And the current organist seems to be going the same way." "Jen, you seem a bit unhappy about Gordon having intercourse with your cousin today. Is that because you're protective of her or because of, well, I know how close you are to him?" Jenna sighed. "Oh Simon. I'm ashamed of myself. I actually felt jealous when I saw the two of them together. How selfish is that?
After everything you did for me last year when it was my birthday, and you gladly accepted my dalliances with the other male members of the church. Can you forgive me? I wish to say a prayer of forgiveness." The vicar took his wife's hands in his. "Of course I can, my love. And I understand how you feel. You see, with Mia here, I think you've got something you've never had to deal with before." "What's that?" "A rival!" Mia was eavesdropping from the staircase. A mischievous grin formed on her face as she listened. "Holy shit, Jenna's had more men than Elton John's had wigs. She had the nerve to have a go at me for seducing Tom. And she's slept with Gordon too? No wonder she looked so tense. Ha! And sweet Reverend Simon is okay with that? That's not what it teaches in the Bible, surely?" She slipped back to her bedroom. "Let us pray together," Reverend Morris said. "Father, I return to You with my sins before me. Nowadays, I lack compassion for my brothers and sisters, my eyes are clouded with wrongdoings my heart is against. Opposing Your Words, I sinned and did evil in Your eyes. I drained myself of Your kindness and followed my worldly desires. Father, guide me as You are right in Your verdict and justified in Your judgment. Do not leave me astray as I pray for a blissful life with You and a life free of evil. In Your Mercy, I pray. Amen." (Luke 15:18, Psalm 51:3-4) "I feel better," Jenna said, opening her eyes. She ran a finger down her husband's cheek. "Simon, let's go to bed. Mia's asleep. The guest bedroom is right at the other end of the landing. She won't hear us. Tonight I need my Vicar's touch." "What a good idea! All this talk of Gladys Wilcox getting her hands on men's dicks, I'd quite like some hands on mine!" A Girl With Fantasies Mia lay back on the bed in the darkness, her mind buzzing with the events of the day. Reaching under the pillow, she pulled out the pair of Gordon's briefs. "Enjoyed you, Gordy-pie!
You were a total sweetie." She sighed, pressing the crotch of the underwear against her nose and inhaling deeply, whilst fingering herself with her other hand. Gordon's undies bore a pleasant, musky, manly scent, a faint mark which she assumed was pre-cum, and a couple of wiry grey pubic hairs. Perfect. Knowing that the organist's thick cock had been snugly contained within was enough to make her climax again. She wondered if he was wanking off and sniffing her knickers. "Hope he likes mine too." She wanted to see the organist again, as sex with him had been amazing, but Mia had her sights set on a bigger prize - and this one wore a clerical collar. Insomnia Gordon was in bed, but having difficulty sleeping. His mind was a complete whirl. He reflected how in the past year, he'd gone from being completely sex-starved, to having more sex than he'd ever had during a whole fifteen years of marriage, and during his late teens, when he'd been a horny youth, desperate to sleep with any woman. In the Eighties, those halcyon pre-Internet days, just stumbling across a discarded porn magazine in the bushes was more valuable than gold. He remembered his time at university, when he used to spy on the nurses undressing at a nearby hospital. He chuckled as he remembered losing his virginity to his piano teacher - whilst she was giving him a tour of Blackpool Tower ballroom. He credited her with starting his interest in wanting to play organs. "Look at me now," he said out loud. "I got seduced by a woman young enough to be my daughter. Who is now the vicar's wife. I fucked a Ukrainian woman in the church. I've been fucking the vicar's wife every week in the church. I took part in a threesome with her and the vicar. I and several other men gave her a facial in the church. I got my dick sucked by an eighty-six year old pensioner too.
Now I'm fucking the eighteen-year-old cousin of the vicar's wife, and exchanging underwear with her." He reached for the pair of pink knickers and gave them a good sniff, stroking his cock at the same time. The crotch had dried, but earlier it had been wet and sticky with Mia's pussy juices. A heavenly scent. "The world is a bloody mess right now, but I'd say my life is pretty good," he smiled. "I hope Mia wants to see me again. She's a lovely, horny little thing. I hope she comes to church this Sunday." He wanked himself off happily, before slipping into a blissful slumber. For the first time in a year, he dreamt of a woman other than Jenna. Mia's Delight Mia was edging closer to an orgasm as she continued to pleasure herself. Gordon's briefs pressed against her face were having the desired effect, but oh, God, she wished she had a large dildo as well. Her sopping pussy was aching to be filled again. Hearing muffled laughter on the landing brought her back to her senses. The sound of a bedroom door closing. More laughter. She slid off the bed and wiped her hand on her t-shirt. Tiptoeing to the door, she opened it, and listened. The inky darkness of the landing was disturbed by a light under Jenna's bedroom door. With the stealth of a cat, Mia slunk down the landing. Standing in front of the door, the sounds from within were clearer. The creak of a bedframe. The headboard bumping against the wall. The low moans of the reverend, followed by the higher pitched gasps of Jenna. She bit her lip as she listened to their carnal sounds. Squinting, she peered through the keyhole. The tiny opening barely allowed an interested voyeur to see a thing, but just briefly, she glimpsed Reverend Morris' bare backside rising and falling, lying between her cousin's legs which, likewise entirely bare, were extended straight upwards into the air. "Hosanna! Hosanna! Hosanna, in, the, Highest Heavens!"
Reverend Morris yelled, to which Jenna responded by screaming in ecstasy. Mia clamped her hand against her mouth to stifle a laugh. At the same time, her pussy tingled like crazy. That the good vicar quoted Biblical phrases during sex turned her on in a way she never expected. "I am coming soon! Hold on to what you have, so that no one will take your crown!" This quote from the Book of Revelation proved too much, and seconds later, Jenna climaxed with a scream. Mia tried to remain silent as she too came. With a wildly beating heart, she shuffled back to her bedroom. "I want him. I want Reverend Morris to fuck me like that." Reverend Morris is seduced, but can he satisfy her? Lightning flashed, followed by a crash of thunder so powerful it rattled the kitchen windows. The storm began not with a sprinkle or drizzle but with a sudden downpour, as if clouds were hollow structures that could shatter like eggshells and spill their entire contents at once. So far, July was proving far less flaming than June. "Blimey," Reverend Morris said, as the rain made him look up from his laptop. "Not a good start to Mia's first day in her new job, is it?" "A bit of summer rain won't bother her," Jenna replied. "Her mind's probably fixated on Gordon." "Heh, give her some credit, Jen. She's shown initiative. I think she'll work hard and be a good cleaner for the church. She did an excellent job tidying up our kitchen." "That's true. She should be about finished in around twenty minutes. Ten hours a week isn't much. I wonder what her long-term plans are? I mean, she can't clean the church hall toilets for the rest of her life can she? And I must phone Aunt Kathleen, I keep putting it off. She'll go berserk when she finds out what's happened." Reverend Morris sipped his coffee. "Have faith in her, Jen. She's chosen this path for herself. And as my dad always says, never put off until tomorrow what can be done today. Right, I have to pop over to the church.
I'll check in on Mia and see if she's okay with setting the alarm system. Don't know if she wants some lunch with us or if she has plans of her own?"

Jenna picked up the phone. "She didn't say. Okay, I'm going to bite the bullet and phone Aunt Kathleen."

In the church hall, Mia had finished using the floor-polishing machine on the wooden floor. The two hours had flown by. As well as making the floor spotless after this morning's yoga class, she'd cleaned the toilets and emptied the bins. The work was boring, as the vicar had warned her, but an absolute doddle. For £12 an hour, she couldn't complain. It was the easiest cash she'd ever earned, and far better than stacking shelves in Aldi and having to deal with abusive members of the public. The church toilets hadn't been the horror show she'd braced herself for - even the gents were reasonable. The good chaps of St Michael's had good manners and good aim, it would seem!

Outside, more thunder boomed. The sound of the rain. The rain. The cold, merciless sound of the rain.

"Ugh," Mia muttered, looking out of the window. "I hate weather like this."

It was typical British weather. The storm had washed all the color out of the day. The sky was as charry as burnt-out ruins. Wind-driven rain, grey as iron nails, hammered every surface, and road gutters overflowed with filthy water.

Mia returned the machine to the store cupboard and locked it. She checked her phone. Nearly 1 o'clock.

The sound of the main door opening made her jump.

"Oh, Reverend Simon!"

"Hello, Mia. Just checking to see how you're getting on. Have you finished?"

"Yes, I'm done. I was just going to set the alarm thingy." She noticed how wet his black shirt was.

"Great stuff. You're okay with setting it?"

"Oh, no worries there."

"Little tip if you're working in the hall by yourself: be sure to lock the main door. Anyone could walk in. We're lucky we don't get a lot of crime round here, but for your own safety, it's best to lock yourself in. There are lots of places someone could hide. Right, well, I'm just heading into the church to sort a few things out ready for the curate's ordination on Sunday. Jenna's prepared some lunch if you're hungry. Oh, and be warned, she's phoning your mum."

"What? Oh no! Why's she doing that?" Mia pouted.

"Look, don't panic. She's just letting her know that you're safe and well and staying with us. You don't want your poor parents to be worrying themselves to death, not knowing where you've gone, do you?"

"Well, no. But I don't want Mum turning up."

"I don't think you need to worry. Your mum lives in Buxton, doesn't she? That's a good fifty miles from here. I don't think she'll drive up here today. But at some point you'll have to speak to her."

Mia looked down. "I like it here. I don't want to go back to my parents. Of course, I don't want to be a burden to you."

"You're no burden, Mia, please don't think that. If you want to talk, why not join me in the church when you've finished locking up?" He left the hall, and Mia took that as an open invitation.

"Oh, I'll join you, Vicar, but I want to do more than talk!"

A few minutes later, having successfully set the alarm, Mia dashed over to the church, trying to avoid getting soaked by the rain. The ancient oak door's handle turned stubbornly. She wondered why Reverend Morris hadn't bothered to lock himself in, then she remembered something Jenna had said about the church "always having to be open for those in need."

And Mia was in need all right.

Reverend Morris was in the vestry, having just changed out of his damp shirt and into a dry one. He'd donned his regular cassock and surplice, as he always did when in the church, even though he was off duty. He inspected the row of church vestments on the clothes rail. Some items were missing. Some members of the choir weren't the tidiest, and often neglected to hang their surplices back up after the services.

Mia walked down the aisle of St Michael's church, glancing round.
The incessant pounding of rain on the roof seemed magnified here in this old, airy building. Then the organ pipes to the right of the altar caught her eye. The highly-polished, silver-colored pipes reflected what little light was shining through the stained glass windows.

"Impressive," she muttered, admiring the many pipes. "But where are its keyboards? No wait, manuals. He called them manuals." She looked round, and noticed the organ console behind the pulpit.

"Ah!"

Mia walked over to it. She ran her hand down the wooden stool. "So this is where Gordy-pie sits." Giving a little mischievous giggle, she looked round. There was no sign of Reverend Morris anywhere, so she slid herself onto the stool.

"Look at this thing. It's like, unreal. All these buttons and stuff. It's like a flight deck." Her feet touched the organ's pedalboard. "How the hell does he remember all these?" She looked closely at some of the stops. They all had weird-sounding names on them: Diapason, Mixture, Gemshorn.

"I wonder what these knobs do?" She switched on the small lamp above the manuals, in order to get a better look.

Curiosity got the better of her and she fiddled with a couple of stops and pressed a few keys on the lower manual. Nothing happened, seeing as the organ was switched off.

"Hmm, must be like an electronic piano." She idly pressed down several more keys, pretending to play.

"Witness the great maestro Mia at work," she said out loud, putting on a fake Geordie accent to mimic the presenters Ant and Dec. "Here on Britain's Got Talent, Mia will now play some of her favorite songs for the audience. Starting with Titanium by David Guetta!" She flung her arms around, as though conducting an orchestra, and accidentally hit the red on/off button above the manuals.

"This is being live-streamed. Be sure to vote!" Mia slammed her fingers down hard on the middle manual. "I am Titanium!"

The organ responded at once, with a deep, radiant sound that seemed to rattle the entire foundations of the church. It was so loud, the stool seemed to vibrate.

"Shit!!" Mia gasped as she got the shock of her life. Fearing she'd damaged the organ, she panicked and froze on the spot.

In the vestry, Reverend Morris had finished re-arranging the vestments when the booming note from the organ shattered his peace and quiet.

"What the..." He almost jumped out of his skin. "Bloody hell, Gordon. You sure pick your moments to come and practice."

When nothing but silence followed that ear-splitting note, he headed out of the vestry to investigate.

Mia's fingers were trembling. "Fuck, what did I do?"

"Well, well. What do we have here?" Reverend Morris chuckled as he appeared beside the console.

"Eep! I didn't mean to, Simon. I was just, I..."

"Ha, it's alright, don't panic!" he said.

"I caught something and it made that noise."

"You managed to switch it on, that's all!" He indicated the red button.

"Oh, so it's not broken then?" Mia said, getting her breath back.

"No, of course not. It's seen a lot of heavy use. It can cope with a lot!"

"It looks so complicated. How does Gordon play it?"

"With ease, because he's had years of practice. Jenna's just learned to play it, and said how hard it was. No use asking me. I haven't a clue. I'm not musically talented at all. In fact, I'll tell you something. I can't even read music."

"Really?" Mia replied.

"I'm hopeless," the vicar continued. "Jenna's tried to introduce me to the piano, but I've got poor co-ordination. My fingers go all over the place. My attempts sounded like Les Dawson."

Mia blinked. "Who?"

"Never mind. He's from before your time." He pressed down a couple of the organ's keys and made a feeble attempt at playing a few notes.

"Gordon says you have to use your whole body when playing a pipe organ," Mia said, giving him a dreamy grin.

"He's right, you do."

"Do you have to use your whole body when preaching to the congregation, Simon?"

"Ah, well, that depends," he said, switching off the organ and the lamp. "I definitely have to keep my mind focused.
Especially during the sermon."

"I can imagine. I bet you're amazing. I like your church robes."

"Oh, thanks! It's called a cassock and surplice. Um, why not come to the Sunday service if you're curious? You don't need to take communion if you're not comfortable."

"I've been confirmed," Mia replied. "I'm okay with that."

"It's the curate's ordination service on Sunday afternoon too. That will be quite a spectacle. The Bishop will be performing the ceremony. We're expecting lots of people to attend. Afterwards there'll be a buffet in the hall. Nice social occasion. There'll be more people your own age there."

Mia shrugged. "I'm not mad keen on people my own age," she said.

"I see. Well, Gordon will be there, so that's a reason to attend, surely?" Reverend Morris cleared his throat. "You like him a lot, don't you?"

"Oh yes. He is lovely. He's really sexy! But you know what? You're sexy too. I hope it's not a sin to compliment a vicar in church?"

The flustered reverend's cheeks turned pink. "Oh, not at all! Very kind of you to say, Mia."

"Yes, very sexy," she purred, and without hesitating, stood up and kissed him on the lips.

"M-Mia, what are you doing?" Reverend Morris spluttered, backing away.

She ignored his question and slipped her arms round his shoulders. "I am worshipping you, Reverend Simon. Like I said, I think you're really sexy."

"B-but, but, I am a married man!" he stammered.

Mia breathed in the scent of his aftershave. "And? Jenna's a married woman, yet she seems to have slept with half of the men of this church. And you're like, okay with that?"

"Did Jenna tell you all this?" he gasped. This time, he made no attempt to free himself from her grasp.

"She didn't need to. I overheard."

"You shouldn't eavesdrop, Mia."

"Yes, I know, but come on. Seriously? What kind of open marriage do you guys have? Is that church rules or something? How can you be cool with that?"

Reverend Morris still made no attempt to move. "Well, it's not like you think. I love Jenna so much. I just fell for her big time. She had quite an effect on the men of this church when she first started attending, not just me. I was trapped in a sexless marriage at the time. I er, thought the first time we had sex, it was a wild one-off."

This explanation failed to satisfy Mia. "And Gordon?"

"The thing with Gordon, well, before Jenna came along, he was a very unhappy, angry man. She made him feel happier than he had been in years. And the choir were beyond grateful for his change in personality, let me tell you."

"I see. So Jen just has this natural talent for seducing all these lonely men and cheering them up? A gift from God? In that case, what I'm doing isn't a sin then, is it?"

She kissed the vicar again, longer and harder.

"Mia, wait!" he protested. "I can't..."

"Of course you can, Reverend Simon. You've been so kind to me, letting me stay at the vicarage and getting me this job. It's time I repaid that kindness."

"Yes, but, I thought you liked Gordon!"

"I do like Gordon. I just like you too. Don't you find me attractive, just like you find Jenna attractive?"

He would've been lying if he'd said no, and his erection was already proof.

"Yes. You're beautiful," Reverend Morris said, running a finger down her cheek. "Such smooth skin." Instinctively, he bent down and pressed his lips against hers.

"Heavenly."

Mia unbuttoned her top, and guided his hands to her small and beautiful tits for him to squeeze and play with.

"Give me a blessing, Reverend," Mia whispered.

The vicar took her hand, led her into the vestry and quoted a passage from Numbers.

"May the Lord bless you and keep you. May the Lord's face shine upon you and be gracious to you; may the Lord turn his face to you and bring you peace."

"Amen," Mia said. After a brief silence, something seemed to snap in Reverend Morris, and he cast off his reluctance.

"Let me get your legs," he whispered, his voice quavering a bit with sexual tension.

Stroking from the knee down, to start.
Then Mia felt his holy hands open and slide up the back of her thighs, pushing her skirt up.

"Spread your legs a bit."

His thumbs caressed her inner thighs, and came close, oh so close, to her pussy. She wasn't wearing any underwear, and he bent down to smell her sex, his thumbs tantalizingly close. Now his hands were on her arse. Seductive massage, strokes, and squeezes nearly sent Mia over the edge. She moaned.

"Oh yes," he breathed. "Praise the Lord."

Mia's hands roamed across his surplice, and her eagerness surprised him. "Hold on a sec," he said, removing the garment and starting to unbutton his cassock. When it was open, his black trousers were revealed, along with a straining bulge. She squeezed his hard arse cheeks and pulled him against her. His cock throbbed. Mia unfastened his belt and unzipped his trousers. Seconds later, she pulled his boxer shorts down.

He groaned when she took his hot cock into her warm hand, cupping his balls with her other. His cock was thick and of decent length, though not, she noted, as big as Gordon's or Tom's. Gordon's was the biggest of the lot. Mia couldn't help but be a little disappointed, though of course what one did with something was what counted, not the size.

I wonder if this is why Jenna goes with all the other church guys, because Reverend Simon just isn't enough to satisfy her? she thought.

"Mia, I can't hold back. Do you want me to bless you properly or not?"

"Yes, Reverend Simon, I want you to purify me! I need you to fuck me!"

Mia wrapped her leg around him, opening up for his cock. He rubbed the head of it on her clit. Reverend Morris was out of control now, and she let him take her how he wanted. He entered her and pounded her hard on the vestry's small wooden table.

Mia rode his cock and enjoyed his thrusts, but, as good as it felt, the vicar wasn't satisfying her in the way Gordon had done.

How can this be? she thought, as her cousin's husband continued thrusting fast and hard into her, grunting as he did so.

It must be because he's just not old enough for me, she mused. After all, he's only forty! Still, I've achieved what I wanted to do. I wanted to experience sex with a vicar, and a married one at that. And I've finally got my own back on Jenna after all these years.

"Oh, Mia, I'm cumming!" Reverend Morris slammed into her one last time and shot his load deep inside her.

"Well," Reverend Morris said, after he'd got his breath back. "I hope you enjoyed that, Mia. I certainly did. I can't believe I did that."

Mia was about to say something, but at that moment, the vestry door opened and Jenna appeared.

For a few moments there was nothing but stunned silence.

"Mia, why? Why, Simon?"

"Now we're even, Jen," Mia said with a wink.

"Even?"

"Remember all those years ago when we were at primary school and I was in love with that older boy, Darren Grimshaw?"

"Er, what?"

"You knew how much I fancied him."

"Mia, you were only ten at the time. You had a bit of an innocent crush."

"Well, at the time it felt like true love. And you had to muscle in and ruin it. He took you out to Burger King instead of asking me. I was so upset at the time. I vowed that one day, I'd get my own back!"

"Uh, yeah. I do remember you saying that, now I recall. So, this is your idea of getting your own back, is it? Seducing my husband, in his church?"

"Jen, you can't really complain. You've seduced half the men of this church!"

Reverend Morris looked sheepishly at them both. "Look, I didn't say anything. She overheard us talking!"

Jenna took a deep breath. "You're right, Mia. Guess I'm nothing but a hypocrite there. But where do we go from here?"

Mia turned to Reverend Morris. "I've seen the light. And had a revelation. And the truth is, vicars just don't float my boat after all. No offence, Reverend Simon. You were really great. But you're too young for me. Give me a gorgeous older organist any day!
I've already found my perfect man, and his name is Gordon!"

"Lucky Gordon," Jenna said at last.

"Jen, I want you to promise me one thing. I'll never lay a finger on your vicar again, if you'll promise not to get it on with Gordon again."

Jenna's face suddenly fell. "What?"

Reverend Morris nodded. "Fair's fair, Jen. And you don't need any more organ lessons - you can play the organ perfectly fine now."

Jenna thought for a moment, remembering all the fun times she'd had with Gordon - they'd engaged in some fantastic sex over the past year, and at Easter, she'd got the impression his feelings were becoming stronger than mere lust.

"Okay, I promise."

"Make it a proper promise. We're in church, remember?"

"In the name of God, I promise," Jenna said.

"That's better."

"Right, now that we've got that out of the way, how about we all go and have some lunch?" Reverend Morris said, fastening his trousers and belt. "I've worked up quite an appetite!"

Jenna shook her head as she watched Mia head down the church aisle in front of them.

"Is she seriously going to ask Gordon to be her boyfriend? He's so much older than her."

"Just like I am to you," Reverend Morris replied.

"Yes, but it's double the age gap that we have. What if Mia wants kids ten years from now? Gordon will be in his mid-sixties! He doesn't have any kids of his own. Can you see him being a dad?"

"I think he'd be a great dad. You're assuming Mia will want to be a mum. Lots of women choose not to have children these days."

"Guess you're right."

"Isn't it great? All the people of our church and nearby churches have met someone. I've got you, Josh has hooked up with Yulia, Father Aiden has Róisín. Norman's moved in with Gladys - now there's an odd couple, but they're happy! My ex-wife Lucy married Debbie. Gordon's got your cousin. Before you arrived, all these people were unhappy. I'd say your work is done, my love!"

They walked down the aisle, hand in hand.

Privately, however, Jenna smirked to herself.

"My work isn't fully done. At least I still have Bishop George, Gordon's cousin Barry, Mayor Buckingham and a few other chaps!"

By Blacksheep, for Literotica.
Kollel Iyun Halacha. Shiurim are held Sun-Thurs at 185 Miller Road, Lakewood, NJ. For more info, email: kih185miller@gmail.com
Arnaud and Emmanuel discuss the month's news. They cover JVM integrity, JDBC fetch size, MCP, prompt engineering, DeepSeek of course, but also Maven 4 and Maven repository proxies. And more besides. Happy reading. Recorded on February 7, 2025. Download the episode LesCastCodeurs-Episode-322.mp3 or watch it as a video on YouTube.

News

Languages

The JVM's evolution toward stronger integrity https://inside.java/2025/01/03/evolving-default-integrity/
- An article on why framework maintainers and users are tearing their hair out, and why the JVM team will keep guaranteeing the integrity of code and data by removing historically available APIs: dynamic agents, setAccessible, Unsafe, JNI
- The article explains the risks as perceived by the JVM maintainers
- Frankly, the article is a bit thin on the causes; it reads like self-promotion

JavaScript Temporal, at last a clean, modern API for handling dates in JS https://developer.mozilla.org/en-US/blog/javascript-temporal-is-coming/
- JavaScript Temporal is a new object designed to replace the flawed Date object
- It fixes problems such as the lack of time-zone support and mutability
- Temporal introduces concepts such as instants, wall-clock (civil) times, and durations
- It provides classes for handling various date/time representations, both time-zone-aware and time-zone-unaware
- Temporal simplifies working with different calendars (for example, Chinese or Hebrew)
- It includes methods for comparing, converting, and formatting dates and times
- Browser support is experimental, with Firefox Nightly having the most complete implementation
- A polyfill is available to try Temporal in any browser
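For comparison, Java went through the same transition when java.time replaced java.util.Date, and the ideas Temporal standardizes (immutable values, explicit time zones, first-class durations) map directly onto it. A small sketch using the standard java.time API:

```java
import java.time.Duration;
import java.time.ZoneId;
import java.time.ZonedDateTime;

public class TemporalAnalogy {
    // Immutable, zone-aware conversion: same instant, different wall-clock time,
    // the java.time counterpart of Temporal's zoned date-time type.
    public static ZonedDateTime inZone(ZonedDateTime instant, String zone) {
        return instant.withZoneSameInstant(ZoneId.of(zone));
    }

    // Durations are first-class values, as in Temporal.Duration.
    public static Duration elapsed(ZonedDateTime start, ZonedDateTime end) {
        return Duration.between(start, end);
    }
}
```

Because every value is immutable, `inZone` returns a new object and never mutates its input, which is exactly the defect of the old mutable Date that Temporal fixes on the JavaScript side.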
Libraries

An article on JDBC fetch size and its impact on your applications https://in.relation.to/2025/01/24/jdbc-fetch-size/
- Who knows their driver's default fetch size?
- Depending on your use cases, it can be devastating
- Example: an app that returns 12 rows against Oracle's default fetch size of 10 makes 2 round trips for nothing
- And if 50 rows are returned, the database becomes the limiting factor, not Java
- So raising your fetch size pays off: you spend Java memory to avoid latency

Quarkus announces the MCP servers project to collect MCP servers written in Java https://quarkus.io/blog/introducing-mcp-servers/
- MCP is from Anthropic
- A JDBC database introspector, a filesystem reader, a JavaFX drawing server
- Easily started with JBang and tested with Claude Desktop, Goose and mcp-cli
- Lets your AI tap into the power of Java libraries
- Spring, by the way, has released version 0.6 of its MCP support https://spring.io/blog/2025/01/23/spring-ai-mcp-0

Infrastructure

Apache Flink on Kubernetes https://www.decodable.co/blog/get-running-with-apache-flink-on-kubernetes-2
- A very thorough two-part article on running Flink on Kubernetes
- Installation and setup, but also checkpointing, HA and observability

Data and Artificial Intelligence

10 prompt engineering techniques https://medium.com/google-cloud/10-prompt-engineering-techniques-every-beginner-should-know-bf6c195916c7
If you want to go further, the article references a very good white paper on prompt engineering https://www.kaggle.com/whitepaper-prompt-engineering
The techniques covered:
- Zero-Shot Prompting: you ask the AI to answer a question directly, without providing any prior example. It's like asking someone a question without giving them any context.
- Few-Shot Prompting: you give the AI one or more examples of the task you want it to perform. It's like showing someone how to do something before asking them to do it.
- System Prompting: you define the overall context and the goal of the task for the AI. It's like giving the AI general instructions about what it should do.
- Role Prompting: you assign the AI a specific role (teacher, journalist, etc.). It's like asking someone to play a specific part.
- Contextual Prompting: you provide additional information or context for the task. It's like giving someone all the information they need to answer a question.
- Step-Back Prompting: you first ask a general question, then use the answer to ask a more specific one. It's like asking an open question before a more closed one.
- Chain-of-Thought Prompting: you ask the AI to show, step by step, how it reaches its conclusion. It's like asking someone to explain their reasoning.
- Self-Consistency Prompting: you ask the AI the same question several times and compare the answers to find the most consistent one. It's like checking an answer by asking it in different forms.
- Tree-of-Thoughts Prompting: you let the AI explore several reasoning paths at the same time. It's like considering all possible options before making a decision.
- ReAct Prompting: you let the AI interact with external tools to solve complex problems. It's like giving someone the tools they need to solve a problem.
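As a concrete illustration of one of these techniques: self-consistency boils down to sampling several answers to the same prompt and keeping the most frequent one. A minimal Java sketch, where the list of answers stands in for repeated LLM calls (the calls themselves are not shown):

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SelfConsistency {
    // Majority vote over several sampled answers to the same prompt:
    // the most frequent answer is taken as the "consistent" one.
    public static String majority(List<String> sampleAnswers) {
        Map<String, Integer> counts = new HashMap<>();
        String best = null;
        for (String answer : sampleAnswers) {
            int count = counts.merge(answer, 1, Integer::sum);
            if (best == null || count > counts.get(best)) {
                best = answer;
            }
        }
        return best;
    }
}
```

In practice each sample would come from the same prompt run at a non-zero temperature, so the variance between runs is what the vote smooths out.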
The Thoughtworks GenAI patterns https://martinfowler.com/articles/gen-ai-patterns/
- Very introductory and pre-RAG
- Direct prompting, a direct call to the LLM: limited knowledge and limited control over the experience
- Evals: evaluating an LLM's output with several techniques, but fundamentally a function that takes the request and the response and produces a numeric score
- Evaluation by an LLM (the same one or another), or human evaluation
- Run the evaluations from the build pipeline, but also live, since LLMs can evolve
- Describes embeddings, notably of images but also of text, with the notion of context

DeepSeek and the end of NVIDIA's dominance https://youtubetranscriptoptimizer.com/blog/05_the_short_case_for_nvda
- An article on why NVIDIA's margins will be challenged
- 90% margins, after all, because it has the biggest GPUs and CUDA, which is proprietary
- But alternative hardware approaches exist that are more efficient (TPUs and big wafers)
- Google, Microsoft and others are building their own alternative GPUs
- CUDA is becoming less and less the lingua franca, with investment in alternative intermediate languages by Apple, Google, OpenAI, etc.
- The article discusses DeepSeek, which came along and gave the LLM world a slap
- They built a competitor to GPT-4o and o1 for 5M dollars, with impressive reasoning capabilities
- The key was a lot of optimization tricks, but the biggest was using 8-bit neural-network weights instead of the 32-bit weights the others use, quantizing on the fly during training
- Lots of innovative reinforcement learning too, plus Mixture of Experts
- So roughly 50x cheaper than OpenAI
- And so no more need for GPUs with tons of VRAM
- Oh, and DeepSeek is open source
- An article from SemiAnalysis changes the narrative somewhat
- The DeepSeek paper says a lot through its omissions
- For example, the 6M figure is just the GPU inference cost, not the research costs and the various trials and errors
- By comparison, Claude Sonnet cost 10M in inference
- DeepSeek has a lot of GPUs, some pre-ban and some post-ban, with an estimated 5 billion in investment
- Their advances and their openness remain extremely interesting

An intro to Apache Iceberg http://blog.ippon.fr/2025/01/17/la-revolution-des-donnees-lavenement-des-lakehouses-avec-apache-iceberg/
- Born from the limits of data lakes (unstructured) and of data warehouses (limited in data diversity and volume)
- Enter the lakehouses, and in particular Apache Iceberg, which came out of Netflix
- Schema management, but flexible
- Notion of copy-on-write vs merge-on-read, depending on your needs
- Guarantees atomicity, consistency, isolation and durability
- Notions of time travel and rollback
- Hidden partitions (which abstract away the partition and its transformations) and partition evolution
- Compatible with compute engines like Spark, Trino, Flink, etc.
- Explains the structure of the metadata and the data

Guillaume has fun generating short science-fiction stories by programming AI agents with LangChain4j, and also with workflows https://glaforge.dev/posts/2025/01/27/an-ai-agent-to-generate-short-scifi-stories/ https://glaforge.dev/posts/2025/01/31/a-genai-agent-with-a-real-workflow/
- Building an automated science-fiction story generator with Gemini and Imagen in Java, with LangChain4j, on Google Cloud
- The system generates stories every night, completed with illustrations created by the Imagen 3 model, and publishes them to a website
- A self-reflection step uses Gemini to select the best image for each chapter
- The agent uses an explicit workflow, driven by Java code, where the steps are predefined in the code, rather than relying on LLM-based planning
- The code is available on GitHub and the application is deployed on Google Cloud
- The article contrasts explicit workflow agents with autonomous agents, highlighting the trade-offs of each approach. Because sometimes autonomous AI agents that manage their own planning hallucinate a bit too much: they fail to establish a proper plan, don't follow it correctly, or even hallucinate "function calls"
- The project uses Cloud Build, Cloud Run jobs, Cloud Scheduler, Firestore as the database, and Firebase for deploying and automating the frontend
- In the second article the approach is different: Guillaume uses a workflow tool rather than driving the planning with Java code
- The imperative approach uses explicit Java code to orchestrate the workflow, offering precise control and parallelization
- The declarative approach uses a YAML file to define the workflow, specifying the steps, inputs, outputs and execution order
- The workflow includes the steps to generate a story with Gemini 2, create an image prompt, generate images with Imagen 3, and save the result to Cloud Firestore (a NoSQL database)
- The main advantages of the imperative approach are precise control, explicit parallelization and familiar programming tools
- The main advantages of the declarative approach are workflow definitions that are perhaps easier to understand (even if it's YAML, ugh!), visualization, scalability and simplified maintenance (you can just change the YAML in the console, like in the good old days of PHP in production)
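The "precise control, explicit parallelization" point is easy to see in code. Below is a minimal sketch of an explicit, code-driven workflow in plain Java; the two step functions are hypothetical stand-ins for the story-generation and image-generation calls, which are not shown:

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

public class ExplicitWorkflow {
    // Each step is an ordinary function; the ordering lives in Java code,
    // not in an LLM-produced plan, so it is deterministic and debuggable.
    public static String run(Function<String, String> generateStory,
                             Function<String, String> generateImage) {
        String story = generateStory.apply("prompt");
        // Explicit parallelization: illustrate two chapters concurrently.
        List<CompletableFuture<String>> images = List.of(
                CompletableFuture.supplyAsync(() -> generateImage.apply(story + "#1")),
                CompletableFuture.supplyAsync(() -> generateImage.apply(story + "#2")));
        return story + ":" + images.stream()
                .map(CompletableFuture::join)
                .reduce("", String::concat);
    }
}
```

The same pipeline expressed declaratively would list the steps in YAML and leave the fan-out to the workflow engine, which is exactly the trade-off the article describes.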
- The drawbacks of the imperative approach include the need for programming knowledge, potential maintenance challenges, and container management
- The drawbacks of the declarative approach include painful YAML authoring, limited control over parallelization, the lack of a local emulator, and less intuitive debugging
- The choice between the approaches depends on the project's requirements, with the declarative one suited to simpler workflows
- The article concludes that declarative planning can help AI agents stay focused and predictable

Tooling

Vulnerabilities in Maven proxy repositories https://github.blog/security/vulnerability-research/attacks-on-maven-proxy-repositories/
- Whatever the language or technology, it is strongly recommended to set up repository managers as proxies, to better control the dependencies that go into building your products
- Michael Stepankin of the GitHub Security Lab investigated whether these proxies are not themselves a source of vulnerabilities, by studying some CVEs in products like JFrog Artifactory, Sonatype Nexus and Reposilite
- Some flaws come from the products' UIs, which display artifacts (for example, put JavaScript in a POM file) and even let you browse inside them (for example, view the contents of a jar/zip, exploiting the API to read, or even modify, files on the server outside the archives)
- Artifacts can also be compromised by playing with proprietary URL parameters or with naming and encodings
- In short, nothing is simple at any level. Every system adds complexity, and it is important to keep them up to date
- Actively monitor your distribution chain through various means, and don't bet everything on the repository manager
- The author gave a talk on the subject: https://www.youtube.com/watch?v=0Z_QXtk0Z54

Apache Maven 4... Soon, promise... What will be in it?
https://gnodet.github.io/maven4-presentation/ and also https://github.com/Bukama/MavenStuff/blob/main/Maven4/whatsnewinmaven4.md
- Slowly but surely... that's the Maven project way
- Maven 4.0.0-rc-2 is available (Dec 2024)
- Maven is more than 20 years old and widely used across the Java ecosystem
- Backward compatibility has always been a priority, but it has limited flexibility
- Maven 4 introduces significant changes, notably a new build schema and code improvements

POM changes
- Separation of the Build-POM and the Consumer-POM. Build-POM: contains build-specific information (e.g., plugins, configurations). Consumer-POM: contains only the information needed by consumers of the artifacts (e.g., dependencies)
- New model version 4.1.0: used only for the Build-POM, while the Consumer-POM stays on 4.0.0 for compatibility. It introduces new elements and marks some as deprecated
- Modules renamed to subprojects: "modules" become "subprojects" to avoid confusion with Java Modules. The subprojects element replaces modules (which remains supported)
- New packaging type "bom" (Bill of Materials): differentiates parent POMs from dependency-management BOMs. Supports exclusions and classifier-based imports
- Explicit declaration of the root directory: lets you explicitly define the project's root directory, removing any ambiguity about where project roots are located
- New directory variables: ${project.rootDirectory}, ${session.topDirectory} and ${session.rootDirectory} for better path handling. They replace the old unofficial workarounds and deprecated internal variables
- Support for alternative POM syntaxes: a ModelParser SPI allows alternative syntaxes for the POM. The Apache Maven Hocon Extension is an early example of this capability
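A hedged sketch of what a Maven 4 Build-POM using several of these changes might look like. This is only an illustration assembled from the notes above (4.1.0 model, subprojects, root declaration); it is not taken from the release, and element details may differ in the final version:

```xml
<!-- Build-POM on the new 4.1.0 model; the Consumer-POM derived from it
     stays on model 4.0.0 for compatibility with existing consumers. -->
<project xmlns="http://maven.apache.org/POM/4.1.0" root="true">
  <modelVersion>4.1.0</modelVersion>
  <groupId>com.example</groupId>
  <artifactId>parent</artifactId>
  <version>${revision}</version>
  <packaging>pom</packaging>
  <!-- "modules" are now "subprojects"; the old element is still accepted. -->
  <subprojects>
    <subproject>core</subproject>
    <subproject>app</subproject>
  </subprojects>
</project>
```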
Improvements for subprojects
Automatic parent versioning: it is no longer necessary to define the parent version in each subproject. Works with model version 4.1.0 and extends to dependencies within the project.
Full support for CI-friendly variables: the Flatten Maven Plugin is no longer required. Supports variables such as ${revision} for versioning, which can be set via maven.config or the command line (mvn verify -Drevision=4.0.1).
Reactor improvements and fixes:
Bug fix: improved handling of --also-make when resuming builds.
New --resume (-r) option to restart from the last failed subproject; subprojects already built successfully are skipped on resume.
Subfolder-aware builds: tools can be run on selected subprojects only.
Recommendation: use mvn verify rather than mvn clean install.
Other improvements: consistent timestamps for all subprojects in packaged archives; improved deployment, which only happens if all subprojects build successfully.

Workflow, lifecycle and execution changes
Java 17 required to run Maven: Java 17 is the minimum JDK required to run Maven 4. Older Java versions can still be targeted for compilation via Maven Toolchains. Java 17 was preferred over Java 21 because of its longer long-term support.
Plugin updates and application maintenance: removal of deprecated features (e.g., Plexus Containers, ${pom.} expressions). The Super POM has been updated, changing the default plugin versions. Builds may behave differently, so pin plugin versions to avoid unexpected changes; Maven 4 prints a warning if default versions are used.
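The CI-friendly versioning and plugin pinning described above can be sketched in a POM like this (a minimal illustration with made-up coordinates; the pinned compiler-plugin version is just an example, use whatever version you have validated):

```xml
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.example</groupId>
  <artifactId>demo</artifactId>
  <!-- resolved at build time: mvn verify -Drevision=4.0.1, or set in maven.config -->
  <version>${revision}</version>

  <build>
    <plugins>
      <!-- pin plugin versions so Super POM updates don't silently change the build -->
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-compiler-plugin</artifactId>
        <version>3.13.0</version>
      </plugin>
    </plugins>
  </build>
</project>
```

With Maven 4 the Flatten Maven Plugin is no longer needed to publish the resolved ${revision}, and builds relying on Super POM plugin defaults will at least get a warning instead of silently drifting.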
New "Fail on Severity" setting: the build can fail if log messages reach a given severity level (e.g., WARN). Usable via --fail-on-severity WARN or -fos WARN.
Maven Shell (mvnsh): previously, every mvn invocation required a full Java/Maven restart. Maven 4 introduces Maven Shell (mvnsh), which keeps a single resident Maven process open across commands, improving performance and reducing build times. Alternative: use Maven Daemon (mvnd), which manages a pool of resident Maven processes.

Architecture

An article on feature flags with Unleash https://feeds.feedblitz.com//911939960/0/baeldungImplement-Feature-Flags-in-Java-With-Unleash For A/B testing and faster development cycles, to "test in production". It shows how to run Unleash under Docker and how to add the library to Java code to test a feature flag.

Security

Keycloak 26.1 https://www.keycloak.org/2025/01/keycloak-2610-released.html Node detection via probing the database instead of network exchanges, virtual threads for Infinispan and JGroups, OpenTelemetry tracing support, and plenty of security features.

Law, society and organization

The big cost and revenue items of a conference, here for http://bdx.io|bdx.io https://bsky.app/profile/ameliebenoit33.bsky.social/post/3lgzslhedzk2a Revenue: 44% tickets, 52% sponsors. Costs: 38% venue rental, 29% catering and coffee, 12% booth setup, 5% speaker expenses (so not all speakers are covered).

Ask Me Anything

Julien de Provin: I really like Quarkus's "continuous testing" mode, and I was wondering whether an alternative exists outside Quarkus or, failing that, resources on how it works? I would love an agnostic tool usable on the non-Quarkus projects I work on, even if it takes a bit of elbow grease (or knuckle grease, in this case).
https://github.com/infinitest/infinitest/

Conferences

The conference list comes from Developers Conferences Agenda/List by Aurélie Vache and contributors:
February 6-7, 2025: Touraine Tech - Tours (France)
February 21, 2025: LyonJS 100 - Lyon (France)
February 28, 2025: Paris TS La Conf - Paris (France)
March 6, 2025: DevCon #24: 100% IA - Paris (France)
March 13, 2025: Oracle CloudWorld Tour Paris - Paris (France)
March 14, 2025: Rust In Paris 2025 - Paris (France)
March 19-21, 2025: React Paris - Paris (France)
March 20, 2025: PGDay Paris - Paris (France)
March 20-21, 2025: Agile Niort - Niort (France)
March 25, 2025: ParisTestConf - Paris (France)
March 26-29, 2025: JChateau Unconference 2025 - Cour-Cheverny (France)
March 27-28, 2025: SymfonyLive Paris 2025 - Paris (France)
March 28, 2025: DataDays - Lille (France)
March 28-29, 2025: Agile Games France 2025 - Lille (France)
April 3, 2025: DotJS - Paris (France)
April 3, 2025: SoCraTes Rennes 2025 - Rennes (France)
April 4, 2025: Flutter Connection 2025 - Paris (France)
April 4, 2025: aMP Orléans 04-04-2025 - Orléans (France)
April 10-11, 2025: Android Makers - Montrouge (France)
April 10-12, 2025: Devoxx Greece - Athens (Greece)
April 16-18, 2025: Devoxx France - Paris (France)
April 23-25, 2025: MODERN ENDPOINT MANAGEMENT EMEA SUMMIT 2025 - Paris (France)
April 24, 2025: IA Data Day 2025 - Strasbourg (France)
April 29-30, 2025: MixIT - Lyon (France)
May 7-9, 2025: Devoxx UK - London (UK)
May 15, 2025: Cloud Toulouse - Toulouse (France)
May 16, 2025: AFUP Day 2025 Lille - Lille (France)
May 16, 2025: AFUP Day 2025 Lyon - Lyon (France)
May 16, 2025: AFUP Day 2025 Poitiers - Poitiers (France)
May 24, 2025: Polycloud - Montpellier (France)
May 24, 2025: NG Baguette Conf 2025 - Nantes (France)
June 5-6, 2025: AlpesCraft - Grenoble (France)
June 5-6, 2025: Devquest 2025 - Niort (France)
June 10-11, 2025: Modern Workplace Conference Paris 2025 - Paris (France)
June 11-13, 2025: Devoxx Poland - Krakow (Poland)
June 12-13,
2025: Agile Tour Toulouse - Toulouse (France)
June 12-13, 2025: DevLille - Lille (France)
June 13, 2025: Tech F'Est 2025 - Nancy (France)
June 17, 2025: Mobilis In Mobile - Nantes (France)
June 24, 2025: WAX 2025 - Aix-en-Provence (France)
June 25-26, 2025: Agi'Lille 2025 - Lille (France)
June 25-27, 2025: BreizhCamp 2025 - Rennes (France)
June 26-27, 2025: Sunny Tech - Montpellier (France)
July 1-4, 2025: Open edX Conference - 2025 - Palaiseau (France)
July 7-9, 2025: Riviera DEV 2025 - Sophia Antipolis (France)
September 18-19, 2025: API Platform Conference - Lille (France) & Online
October 2-3, 2025: Volcamp - Clermont-Ferrand (France)
October 6-10, 2025: Devoxx Belgium - Antwerp (Belgium)
October 9-10, 2025: Forum PHP 2025 - Marne-la-Vallée (France)
October 16-17, 2025: DevFest Nantes - Nantes (France)
November 4-7, 2025: NewCrafts 2025 - Paris (France)
November 6, 2025: dotAI 2025 - Paris (France)
November 7, 2025: BDX I/O - Bordeaux (France)
November 12-14, 2025: Devoxx Morocco - Marrakech (Morocco)
January 28-31, 2026: SnowCamp 2026 - Grenoble (France)
April 23-25, 2026: Devoxx Greece - Athens (Greece)
June 17, 2026: Devoxx Poland - Krakow (Poland)

Contact us

To react to this episode, come discuss on the Google group https://groups.google.com/group/lescastcodeurs
Contact us via X/Twitter https://twitter.com/lescastcodeurs or Bluesky https://bsky.app/profile/lescastcodeurs.com
Submit a crowdcast or a crowdquestion
Support Les Cast Codeurs on Patreon https://www.patreon.com/LesCastCodeurs
All episodes and all the info at https://lescastcodeurs.com/
Tarragon bearnaise with sirloin and wilted greens
Cook time: 15 minutes Prep time: 10 minutes Serves: 6
6 sirloin steaks
Bearnaise
3 eggs, separated
1/2 cup malt vinegar
1/2 cup water
1 pkt fresh tarragon, reserve 6 leaves for later
1/2 white onion, peeled and sliced
2 cloves garlic, crushed
2 bay leaves
6 peppercorns
1 tsp Worcestershire
280gm unsalted butter, melted over a low heat
Sea salt and cracked pepper
Mixture of greens I used - kale, spinach, silverbeet, spring peas
1 tsp cooking oil
Remove the steaks from the packet and allow to come up to room temperature. To make the bearnaise, start with the all-important reduction. In a small pot combine the vinegar, water, tarragon, onion, garlic, bay leaves and peppercorns. Bring to the boil and turn down to simmer for 5 minutes. Turn the reduction off and allow to infuse for 5 minutes, then strain through a sieve; you should have 1/3 of a cup. Top up with a touch of water if you're short. Bring a medium pot 1/2 full of water to the boil and turn down to a simmer. Then take a medium-sized heatproof bowl and tip in the egg yolks and the reduction. Place over the simmering water and whisk continuously until the eggs start to cook. This will happen quickly, after about a minute. Once the eggs start to come away from the side of the bowl, turn the heat off and, whilst still whisking, slowly pour in the melted butter, leaving the white buttermilk behind. Once added, remove from heat. Add in the Worcestershire and season with salt, pepper and finally the remaining tarragon leaves, chopped. Season and cook your sirloin to your desired doneness and allow to rest. Quickly sauté the greens in a hot pan with the oil, adding a touch of water if required. Serve alongside the steak and a good amount of bearnaise sauce. See omnystudio.com/listener for privacy information.
What does Sam Altman have up his sleeve? In episode 41 of Mixture of Experts, join host Tim Hwang along with experts Nathalie Baracaldo, Marina Danilevsky and Chris Hay. Last week, we covered all things DeepSeek, and this week OpenAI has some new releases to share. Today, the experts dissect deep research and o3-mini. Next, our host Tim Hwang is travelling to the AI Action Summit; he asks our experts what we can expect coming out of the event. Then, we talk about Anthropic's Constitutional Classifiers. Finally, Microsoft is creating a unit to study AI's impact; what does this mean? Find out all this and more on Mixture of Experts. 00:01 – Intro 00:41 – OpenAI deep research and o3-mini 13:51 – AI Action Summit 20:17 – Anthropic's Constitutional Classifiers 28:54 – Microsoft AI Impact team The opinions expressed in this podcast are solely those of the participants and do not necessarily reflect the views of IBM or any other organization or entity. Subscribe for AI updates. Learn more about artificial intelligence. Read "DeepSeek's reasoning AI shows power of small models, efficiently trained". Visit the Mixture of Experts podcast page to learn more AI content.
From the beginning of time, a war has raged over humanity—one that seeks to distort, defile, and ultimately sever our connection to God. The Fallen Sons of God abandoned their divine purpose, descending to earth and corrupting its people through deception and genetic manipulation. Their offspring, the Nephilim, were more than just giants of old; they embodied an agenda to erase the image of God from humanity. Though their physical presence faded, their influence remains—woven into the fabric of our world through mind control and ideological deception. But darkness does not have the final say. In this episode of the Revelations Podcast, host Reagan Kramer welcomes back Dr. Laura Sanger, a researcher, author, speaker, and clinical psychologist with a deep passion for awakening people to the spiritual battle at hand. Together, they dive into the spiritual war between the sons of God and the forces of darkness, tracing its origins from biblical times to its modern-day manifestations. They discuss the erosion of biblical truth, the dangers of gender ideology, transhumanism, and the corrupt systems that seek to enslave future generations. Whether you're new to these concepts or looking to equip yourself for the days ahead, this conversation will challenge and inspire you to step into your identity as a son or daughter of God.

Here are three reasons why you should listen to this episode:
Learn the hidden truths behind the Nephilim agenda and how it impacts our world today.
Gain practical insights on how to rise up as a son or daughter of God, equipped with spiritual authority to combat these dark forces.
Reflect on the urgency of spiritual maturity and the call to live a victorious life aligned with God's truth in perilous times.

Become Part of Our Mission! Support The Revelations Podcast:
Your support fuels our mission to share transformative messages of hope and faith. 
Click here to learn how you can contribute and be part of this growing community!

Resources
More from the Revelations Podcast hosted by Reagan Kramer: Website | Instagram | Apple Podcast | Youtube
Listen to our previous episode with Dr. Laura Sanger, “Fighting the Nephilim Agenda with our Authority in Christ”
"The Roots of the Federal Reserve" by Dr. Laura Sanger
"Generation Hoodwinked" by Dr. Laura Sanger
"From Transgender to Transhuman" by Martin Rothblatt
"Future Humans" (children's book)
Laura Sanger: Website | Instagram | Youtube | Rumble
Laura's Telegram: @laurasanger444hz

Bible Verses
Ecclesiastes 10:20, Mark 4, 1 Corinthians 14:20, John 14:10, John 7:16-18, John 12:49-50, Galatians 4:1,7, Romans 8:14, Ephesians 5:11, 2 Timothy 4:3-4

This Episode is brought to you by Advanced Medicine Alternatives
Get back to the active life you love through natural & regenerative musculoskeletal healing: https://www.georgekramermd.com/

Episode Highlights
[0:50] Introduction and Background of Dr. Laura Sanger
Reagan Kramer welcomes back Dr. Laura Sanger to The Revelations Podcast to shed light on the hidden spiritual war shaping our world today. With a Ph.D. in Clinical Psychology and a Master of Arts in Theology from Fuller Theological Seminary, her work bridges biblical revelation and scholarly research. Her books, The Roots of the Federal Reserve and Generation Hoodwinked, uncover deep-seated deceptions designed to enslave humanity. A recent gathering at Blurry Con provided an opportunity to reconnect with like-minded individuals and reaffirm the urgency of exposing these dark forces.
[5:28] Dr. Laura's Vision and Mission
A dream and vision she had in May 2020 led to the title Generation Hoodwinked, revealing a world where AI and spiritual oppression silence the voices of future generations. In the vision, Jesus led Dr. 
Sanger into an underground cavern where children were trapped in cages, symbolizing the control systems designed to enslave them.The Nephilim agenda thrives on deception, and exposing it is essential to breaking its power.Ephesians 5:11 and 2 Timothy 4:3-4 serve as guiding scriptures in this mission, urging believers to stand against false doctrines and wake up to the battle at hand.[11:43] The Battle of the Sons of GodLong ago, the Fallen Sons of God abandoned their heavenly domain, descending to corrupt humanity and unleash the Nephilim agenda.Their goal was to defile the human genome and stage an insurrection against God's divine order.Though Jesus secured victory through His death and resurrection, the war still rages in the spiritual realm.The need for God-fearing believers to rise up has never been greater, as deception seeks to strip humanity of its divine identity.Spiritual warfare is not passive—strongholds must be torn down, and the authority of Christ must be wielded with boldness.[15:38] Defining the Sons of GodNot all believers walk in the full authority of the Sons of God.Romans 8:14 states that those led by the Spirit are the true sons, yet many remain trapped in self-reliance rather than surrendering to divine direction.Cultural norms encourage independence, but spiritual maturity requires complete dependence on Jesus.Obedience to the Holy Spirit is the mark of a true Son of God, distinguishing those who move in divine authority from those merely going through the motions of faith.[20:28] Laura: “Sons of God are not their own person. They don't make their own decisions. 
They are fully surrendered to the Father's will.” The invitation to step into sonship is available to all, but it requires a willingness to follow God without hesitation.
[27:13] Mixture and Syncretism
The mixing of truth with deception opens doors to bondage, leaving believers led by the soul rather than the Spirit. Operating from the soul, through emotions and human reasoning, rather than the Spirit leads to misguided intentions, no matter how well-meaning. Syncretism, the blending of Christian faith with pagan influences, is rampant in modern culture, from Halloween celebrations to the normalization of ideologies that distort God's design. Spiritual purity demands discernment, and the removal of compromise is essential to living victoriously in Christ.
[30:12] Laura: “The Fallen sons of God, they mix their seed with human seed to birth the Nephilim. And so giving room to mixture, what that does is that allows us to take the bait that causes many of us to become hoodwinked”
[36:28] The Nephilim Agenda and Transgenderism
A systematic effort to erase human identity is at play, progressing from transgender ideology to full-scale transhumanism. Dr. 
Laura describes how this movement is being fueled by the United Nations and comprehensive sexuality education (CSE).She highlights the harmful effects of CSE on children, including promoting sexual stimulation and normalizing bestiality.The long-term effects of puberty blockers and gender-affirming surgeries on children's development and mental health are not acts of liberation but of enslavement[48:04] The Impact of Media and TechnologyMedia and technology are not just entertainment but tools of indoctrination.Future Humans for example, a bestselling children's book, subtly introduces transhumanist ideals by showcasing technological modifications.Movies, music, and television shows create fantasies that reinforce the allure of enhanced abilities, steering the next generation toward a post-human reality.The Nephilim agenda thrives on deception; its end goal is to wipe out humanity and cut at the heart of the Kingdom of God.[50:50] Laura: “The Nephilim agenda is really about defiling the human genome so much that we can't have relationship with Jesus anymore”[52:48] The Role of the Sons of God in Spiritual WarfareThe Sons of God are warriors, called to push back the forces of darkness with unwavering faith.The Hebrew phrase Rak Chazak Amats embodies the strength and courage required to stand in battle.Dr. 
Laura highlights the importance of the Sons of God in arising and maturing to become heirs of God and walking in their inheritance.As deception intensifies, Dr Laura encourages listeners to find Jesus in the secret place to develop an intimate relationship and learn His voice.[1:05:54] Practical Steps to Become a Son or Daughter of GodVictory begins in the secret place, where intimacy with Jesus is cultivated.Dr Laura emphasizes the importance of distinguishing between the true Holy Spirit and false voices in the church and media.Recognizing this requires deep connection with the True Shepherd, and daily communion with Him to ensure that fear and deception lose their grip.As the episode closes, Dr. Laura prays for listeners, asking for protection, boldness, and the empowerment to walk as Sons of God in a world desperately in need of truth.About Laura SangerDr. Laura Sanger is a researcher, author, speaker, and clinical psychologist dedicated to equipping believers with the knowledge and spiritual tools needed to navigate the unseen battle against darkness. As the founder of No Longer Enslaved, her mission is to awaken people to the pervasive influence of the Nephilim agenda and empower them to walk in their God-given authority.With a Ph.D. in Clinical Psychology and a Master of Arts in Theology from Fuller Theological Seminary, Dr. Laura Sanger combines scholarly research with biblical revelation to expose the hidden forces shaping our world. As the author of books such as Generation Hoodwinked: The Impact of the Nephilim Agenda Today, she unravels the deep-seated deception embedded in financial systems, transhumanism, and ideological warfare. Dr. Sanger has shared her insights on platforms across the globe, equipping believers to discern false narratives, break free from spiritual bondage, and step into their true identity in Christ. 
Her teachings emphasize the importance of spiritual maturity, exposing darkness, and wielding the weapons of our warfare with boldness.Connect with Dr. Laura Sanger and learn more about her conferences and resources at No Longer Enslaved.Enjoyed this Episode?If you did, subscribe and share it with your friends!Post a review and share it! If you found our deep dive into the spiritual influences on mental health insightful, we'd love to hear your thoughts. Leave a review and share this episode with friends and family. Step into your God-given authority and awaken as a Son of God. Expose deception, break free from spiritual bondage, and walk boldly in the truth of Christ.Have any questions? You can connect with me on Instagram.Thank you for tuning in! For more updates, tune in on Apple Podcasts.
Kollel Iyun Halacha. Shiurim are held Sun-Thurs at 185 Miller Road Lakewood NJ. For more info email: kih185miller@gmail.com
They say you either have charisma or you don't, but Charlie Houpert proves charisma can be built, and reveals the secret code to mastering it for success in love, work, and friendship. Charlie Houpert is the co-founder of the confidence-building online platform ‘Charisma on Command'. He is the author of books such as ‘The Anti Pick Up Line: Real Habits To Naturally Attract Stunning Women' and ‘Charisma On Command: Inspire, Impress, and Energize Everyone You Meet'. In this conversation, Charlie and Steven discuss topics such as how to stop feeling awkward in social situations, the ultimate body language hack to build trust, how to become instantly likeable, and how to master the art of persuasion. 00:00 Intro 02:25 What Is It You Do? 04:39 How Much Will These Skills Shift Someone's Life? 06:35 Is It Something You Can Learn? 07:15 Your YouTube Channel 09:37 I Was Shy and Introverted—How I Changed 12:47 What Did You Think of Yourself in the Early Years? 15:22 What Was the Biggest Difference in You? 17:32 First Impressions 21:07 Engineer the Conversation You Want to Have 24:38 How to Get Out of Small Talk 26:05 Flirt With the World 27:55 Prey vs. Predator Movements 35:02 The Confidence Trick Before Talking to a Big Crowd 37:02 Do We Underestimate the Ways We Communicate? 41:11 Is Talking About Yourself a Bad Thing? 43:22 How to Connect With Someone in a Normal Interaction 47:40 How to Figure Out if an Interaction Is Real 50:19 People Controlling the Narratives That Reach You 52:18 Narcissists and Sociopaths 55:28 What Billion-Dollar Business Would You Build and Not Sell? 01:01:20 Six Charismatic Mindsets 01:03:16 Elon Musk Salute 01:06:13 The Media Has Made Saying Sorry the Wrong Thing to Do 01:08:26 Ads 01:09:24 Is Trump Charismatic? 
01:14:22 Impeccable Honesty and Integrity 01:18:06 I Don't Need to Convince Anyone of Anything 01:20:43 I Proactively Share My Purpose 01:23:46 Be the First to Humanize the Interaction 01:26:13 Charismatic Types of People 01:31:23 Obama's Charisma 01:32:26 The Importance of Charisma 01:33:43 Ads 01:35:40 How to Use These Skills to Get a Job or Promotion 01:41:07 What Are Women Attracted to in Your Opinion? 01:45:08 Are People Testing to See if You Have Standards? 01:49:21 Five Habits That Make People Instantly Dislike You 01:53:56 Speaking Like a Leader 01:54:46 Pausing Instead of Using Filler Words 01:56:12 Does Body Language Matter When Speaking? 01:57:35 The Fundamentals of Being Confident 01:59:19 What's the Most Important Thing You're Doing to Increase Your Well-Being? 02:02:53 What Are the Mixture of Emotions You Feel? 02:08:19 Is There Anything You Wish You Could Have Said to That Boy? Follow Charlie: Instagram - https://g2ul0.app.link/sX0XNx4tBQb Charisma on Command - https://g2ul0.app.link/Bo2XEO2tBQb You can purchase Charlie's book, ‘Charisma On Command: Inspire, Impress, and Energize Everyone You Meet', here: https://g2ul0.app.link/DoIMBn9tBQb Watch the episodes on Youtube - https://g2ul0.app.link/DOACEpisodes My new book! 'The 33 Laws Of Business & Life' is out now - https://g2ul0.app.link/DOACBook You can purchase the The Diary Of A CEO Conversation Cards: Second Edition, here: https://g2ul0.app.link/f31dsUttKKb Follow me: https://g2ul0.app.link/gnGqL4IsKKb Sponsors: Linkedin Ads - https://www.linkedin.com/DIARY NordVPN - https://NORDVPN.COM/DOAC ZOE - http://joinzoe.com with code BARTLETT10 for 10% off Learn more about your ad choices. Visit megaphone.fm/adchoices
In this second part of episode 10 of the Effortless Podcast, hosts Dheeraj Pandey and Amit Prakash sit down with Alex Dimakis, a renowned AI researcher and professor, to discuss one of the biggest breakthroughs in open AI models: DeepSeek R1. They explore how DeepSeek's innovations in reasoning, reinforcement learning, and efficiency optimizations are reshaping the AI landscape. The conversation covers the shift from large, proprietary AI models to open-source alternatives, the role of post-training fine-tuning, and how reinforcement learning (GRPO) enables reasoning capabilities in LLMs. They also dive into KV caching, mixture of experts, multi-token prediction, and what this means for NVIDIA, hardware players, and AI startups.

Key Topics & Timestamps:
[00:00] - Introduction & Why DeepSeek Matters
[01:30] - DeepSeek R1: Open-Source AI Disrupting the Industry
[03:00] - Has China Become an AI Innovator?
[07:30] - Open Weights vs. Open Data: What Really Matters?
[10:00] - KV Caching, Mixture of Experts & Model Optimizations
[21:00] - How Reinforcement Learning (GRPO) Enables Reasoning
[32:00] - Why OpenAI is Keeping Its Reasoning Traces Hidden
[45:00] - The Impact of AI on NVIDIA & Hardware Demand
[1:02:00] - AGI: Language Models vs. Multimodal AI
[1:15:00] - The Future of AI: Fine-Tuning, Open-Source & Specialized Models

Hosts:
Dheeraj Pandey: Co-founder and CEO at DevRev, formerly Co-founder and CEO of Nutanix. 
A tech visionary with a deep interest in AI and systems thinking.
Amit Prakash: Co-founder and CTO at ThoughtSpot, formerly at Google AdSense and Bing, with extensive expertise in analytics and large-scale systems.

Guest:
Alex Dimakis: Professor at UC Berkeley and co-founder of Bespoke Labs. Alex has made significant contributions to deep learning, machine learning infrastructure, and the development of AI reasoning frameworks.

Follow the Hosts and the Guest:
Dheeraj Pandey: LinkedIn - https://www.linkedin.com/in/dpandey | Twitter - https://x.com/dheeraj
Amit Prakash: LinkedIn - https://www.linkedin.com/in/amit-prak... | Twitter - https://x.com/amitp42
Alex Dimakis: LinkedIn - https://www.linkedin.com/in/alex-dima... | Twitter - https://x.com/AlexGDimakis

Share Your Thoughts:
Have questions, comments, or ideas for future episodes? Email us at EffortlessPodcastHQ@gmail.com
Don't forget to Like, Comment, and Subscribe for more in-depth discussions on AI, technology, and innovation!
Let's bust some early myths about DeepSeek. In episode 40 of Mixture of Experts, join host Tim Hwang along with experts Aaron Baughman, Chris Hay and Kate Soule. Last week, we covered the release of DeepSeek-R1; now that the entire world is up to speed, let's separate the facts from the hype. Next, what is model distillation and why does it matter for competition in AI? Finally, Sam Altman, among other tech CEOs, shared his response to DeepSeek. Will R1 radically change the open-source strategy of other tech giants? Find out all this and more on Mixture of Experts. 00:01 – Intro 00:41 – DeepSeek facts vs hype 21:00 – Model distillation 31:21 – Open source and OpenAI The opinions expressed in this podcast are solely those of the participants and do not necessarily reflect the views of IBM or any other organization or entity.
One last Gold sponsor slot is available for the AI Engineer Summit in NYC. Our last round of invites is going out soon - apply here - If you are building AI agents or AI eng teams, this will be the single highest-signal conference of the year for you!

While the world melts down over DeepSeek, few are talking about the OTHER notable group of former hedge fund traders who pivoted into AI and built a remarkably profitable consumer AI business with a tiny but incredibly cracked engineering team: Chai Research. In short order they have:
* Started a chat AI company well before Noam Shazeer started Character AI, and outlasted his departure.
* Crossed 1m DAU in 2.5 years - William updates us on the pod that they've hit 1.4m DAU now, another +40% from a few months ago. Revenue crossed >$22m.
* Launched the Chaiverse model crowdsourcing platform - taking 3-4 week A/B testing cycles down to 3-4 hours, and deploying >100 models a week.

While they're not paying million dollar salaries, you can tell they're doing pretty well for an 11 person startup:

The Chai Recipe: Building infra for rapid evals
Remember how the central thesis of LMArena (formerly LMsys) is that the only comprehensive way to evaluate LLMs is to let users try them out and pick winners? At the core of Chai is a mobile app that looks like Character AI, but is actually the largest LLM A/B testing arena in the world, specialized in retaining chat users for Chai's use cases (therapy, assistant, roleplay, etc). It's basically what LMArena would be if taken very, very seriously at one company (with $1m in prizes to boot). Chai publishes occasional research on how they think about this, including talks at their Palo Alto office.

William expands upon this in today's podcast (34 mins in): Fundamentally, the way I would describe it is when you're building anything in life, you need to be able to evaluate it. 
And through evaluation, you can iterate, we can look at benchmarks, and we can say the issues with benchmarks and why they may not generalize as well as one would hope in the challenges of working with them. But something that works incredibly well is getting feedback from humans. And so we built this thing where anyone can submit a model to our developer backend, and it gets put in front of 5000 users, and the users can rate it. And we can then have a really accurate ranking of like which model, or users finding more engaging or more entertaining. And it gets, you know, it's at this point now, where every day we're able to, I mean, we evaluate between 20 and 50 models, LLMs, every single day, right. So even though we've got only got a team of, say, five AI researchers, they're able to iterate a huge quantity of LLMs, right. So our team ships, let's just say minimum 100 LLMs a week is what we're able to iterate through. Now, before that moment in time, we might iterate through three a week, we might, you know, there was a time when even doing like five a month was a challenge, right? By being able to change the feedback loops to the point where it's not, let's launch these three models, let's do an A-B test, let's assign, let's do different cohorts, let's wait 30 days to see what the day 30 retention is, which is the kind of the, if you're doing an app, that's like A-B testing 101 would be, do a 30-day retention test, assign different treatments to different cohorts and come back in 30 days. So that's insanely slow. That's just, it's too slow. 
And so we were able to get that 30-day feedback loop all the way down to something like three hours.

In Crowdsourcing the leap to Ten Trillion-Parameter AGI, William describes Chai's routing as a recommender system, which makes a lot more sense to us than previous pitches for model routing startups. William is notably counter-consensus in a lot of his AI product principles:
* No streaming: chats appear all at once to allow rejection sampling.
* No voice: Chai actually beat Character AI to introducing voice - but removed it after finding that it was far from a killer feature.
* Blending: “Something that we love to do at Chai is blending, which is, you know, it's the simplest way to think about it is you're going to end up, and you're going to pretty quickly see you've got one model that's really smart, one model that's really funny. How do you get the user an experience that is both smart and funny? Well, just 50% of the requests, you can serve them the smart model, 50% of the requests, you serve them the funny model.” (that's it!)
But chief above all is the recommender system. We also referenced Exa CEO Will Bryk's concept of SuperKnowledge.

Full Video version
On YouTube. 
Please like and subscribe!

Timestamps
* 00:00:04 Introductions and background of William Beauchamp
* 00:01:19 Origin story of Chai AI
* 00:04:40 Transition from finance to AI
* 00:11:36 Initial product development and idea maze for Chai
* 00:16:29 User psychology and engagement with AI companions
* 00:20:00 Origin of the Chai name
* 00:22:01 Comparison with Character AI and funding challenges
* 00:25:59 Chai's growth and user numbers
* 00:34:53 Key inflection points in Chai's growth
* 00:42:10 Multi-modality in AI companions and focus on user-generated content
* 00:46:49 Chaiverse developer platform and model evaluation
* 00:51:58 Views on AGI and the nature of AI intelligence
* 00:57:14 Evaluation methods and human feedback in AI development
* 01:02:01 Content creation and user experience in Chai
* 01:04:49 Chai Grant program and company culture
* 01:07:20 Inference optimization and compute costs
* 01:09:37 Rejection sampling and reward models in AI generation
* 01:11:48 Closing thoughts and recruitment

Transcript

Alessio [00:00:04]: Hey everyone, welcome to the Latent Space podcast. This is Alessio, partner and CTO at Decibel, and today we're in the Chai AI office with my usual co-host, Swyx.

swyx [00:00:14]: Hey, thanks for having us. It's rare that we get to get out of the office, so thanks for inviting us to your home. We're in the office of Chai with William Beauchamp. Yeah, that's right. You're the founder of Chai AI, but previously, I think you were concurrently also running your fund?

William [00:00:29]: Yep, so I was simultaneously running an algorithmic trading company, but I fortunately was able to kind of exit from that, I think just in Q3 last year. Yeah, congrats. Yeah, thanks.

swyx [00:00:43]: So Chai has always been on my radar because, well, first of all, you do a lot of advertising, I guess, in the Bay Area, so it's working. Yep. And second of all, the reason I reached out to a mutual friend, Joyce, was because I'm just generally interested in the...
...consumer AI space, chat platforms in general. I think there's a lot of inference insights that we can get from that, as well as human psychology insights, kind of a weird blend of the two. And we also share a bit of a history as former finance people crossing over. I guess we can just kind of start it off with the origin story of Chai.

William [00:01:19]: Why decide to work on a consumer AI platform rather than B2B SaaS? So just quickly touching on the background in finance. Sure. Originally, I'm from the UK, born in London. And I was fortunate enough to go study economics at Cambridge. And I graduated in 2012. And at that time, for everyone in the UK and everyone on my course, HFT, quant trading was really the big thing. It was like the big wave that was happening. So there was a lot of opportunity in that space. And throughout college, I'd sort of played poker. So I'd, you know, I dabbled as a professional poker player. And I was able to accumulate this sort of, you know, say $100,000 through playing poker. And at the time, as my friends would go work at companies like Jane Street or Citadel, I kind of did the maths. And I just thought, well, maybe if I traded my own capital, I'd probably come out ahead. I'd make more money than just going to work at Jane Street.

swyx [00:02:20]: With 100k base as capital?

William [00:02:22]: Yes, yes. That's not a lot. Well, it depends what strategies you're doing. And, you know, there is an advantage. There's an advantage to being small, right? Because there are, if you have a 10... Strategies that don't work in size. Exactly, exactly. So if you have a fund of $10 million, and you find a little anomaly in the market that you might be able to make 100k a year from, that's a 1% return on your 10 million fund. If your fund is 100k, that's 100% return, right? So being small, in some sense, was an advantage. So I started off, taught myself Python, and machine learning was like the big thing as well.
Machine learning had really, it was the first, you know, big time machine learning was being used for image recognition, neural networks come out, you get dropout. And, you know, so this was the big thing that was going on at the time. So I probably spent my first three years out of Cambridge just building neural networks, building random forests to try and predict asset prices, right, and then trade that using my own money. And that went well. And, you know, if you start something and it goes well, you try and hire more people. And the first people that came to mind were the talented people I went to college with. And so I hired some friends. And that went well and hired some more. And eventually, I kind of ran out of friends to hire. And so that was when I formed the company. And from that point on, we had our ups and we had our downs. And that was a whole long story and journey in itself. But after doing that for about eight or nine years, on my 30th birthday, which was four years ago now, I kind of took a step back to just evaluate my life, right? This is what one does when one turns 30. You know, I just heard it. I hear you. And, you know, I looked at my 20s and I loved it. It was a really special time. I was really lucky and fortunate to have worked with this amazing team, been successful, had a lot of hard times. And through the hard times, learned wisdom, and then a lot of success and, you know, was able to enjoy it. And so the company was making about five million pounds a year. And it was just me and a team of, say, 15 Oxford- and Cambridge-educated mathematicians and physicists. It was like the real dream that you'd have if you wanted to start a quant trading firm. It was like...

swyx [00:04:40]: Your own, all your own money?

William [00:04:41]: Yeah, exactly. It was all the team's own money. We had no customers complaining to us about issues. There's no investors, you know, saying, you know, they don't like the risk that we're taking.
We could really run the thing exactly as we wanted it. It's like Susquehanna or like RenTec. Yeah, exactly. Yeah. And they're the companies that we would kind of look towards as we were building that thing out. But on my 30th birthday, I look and I say, OK, great. This thing is making as much money as kind of anyone would really need. And I thought, well, what's going to happen if we keep going in this direction? And it was clear that we would never have a kind of big, big impact on the world. We can enrich ourselves. We can make really good money. Everyone on the team would be paid very, very well. Presumably, I can make enough money to buy a yacht or something. But this stuff wasn't that important to me. And so I felt a sort of obligation that if you have this much talent and if you have a talented team, especially as a founder, you want to be putting all that talent towards a good use. I looked at the time at getting into crypto, and I had a really strong view on crypto, which was that, as a gambling device, this is like the most fun form of gambling ever invented, super fun. And as a way to evade monetary regulations and banking restrictions, I think it's also absolutely amazing. So it has two killer use cases. Not so much banking the unbanked, but everything else — everything else to do with like the blockchain and, you know, web 3.0 — that didn't really make much sense. And so instead of going into crypto, which I thought, even if I was successful, I'd end up in a lot of trouble, I thought maybe it'd be better to build something that governments wouldn't have a problem with. I knew that LLMs were like a thing. I think OpenAI had said — they hadn't released GPT-3 yet, but they'd said — GPT-3 is so powerful, we can't release it to the world or something. Was it GPT-2? And then I started interacting with, I think Google had open sourced some language models.
They weren't necessarily LLMs, but they, but they were. But yeah, exactly. So I was able to play around with them. But nowadays so many people have interacted with ChatGPT, they get it. But the first time you can just talk to a computer and it talks back, it's kind of a special moment, and, you know, everyone who's done that goes like, wow, this is how it should be. Right? Rather than having to type on Google and search, you should just be able to ask Google a question. When I saw that, I read the literature, I kind of came across the scaling laws, and I think even four years ago, all the pieces of the puzzle were there, right? Google had done this amazing research and published, you know, a lot of it. OpenAI was still open, and so they'd published a lot of their research. And so you really could be fully informed on the state of AI and where it was going. And so at that point I was confident enough it was worth a shot. I think LLMs are going to be the next big thing. And so that's the thing I want to be building in, in that space. And I thought, what's the most impactful product I can possibly build? And I thought it should be a platform. So I myself love platforms. I think they're fantastic because they open up an ecosystem where anyone can contribute to it. Right? So if you think of a platform like YouTube: instead of it being like a Hollywood situation where, if you want to make a TV show, you have to convince Disney to give you the money to produce it, instead, anyone in the world can post any content they want to YouTube. And if people want to view it, the algorithm is going to promote it. Nowadays, you can look at creators like Mr. Beast or Joe Rogan. They would never have had that opportunity unless it was for this platform. Other ones, like Twitter's a great one, right?
But I would consider Wikipedia to be a platform, where instead of the Britannica encyclopedia — which is monolithic: you get all the researchers together, you get all the data together and you combine it in this one monolithic source — instead, you have this distributed thing. Anyone can host their content on Wikipedia. Anyone can contribute to it. And maybe someone's contribution is that they delete stuff. When I was hearing the kind of Sam Altman and the Muskian perspective of AI, it was a very kind of monolithic thing. It was all about AI is basically a single thing, which is intelligence. Yeah. Yeah. The more compute, the more intelligent, and the more and better AI researchers, the more intelligent, right? They would speak about it as a kind of race: who can get the most data, the most compute and the most researchers, and that would end up with the most intelligent AI. But I didn't believe in any of that. I thought that perspective is the perspective of someone who's never actually done machine learning. Because with machine learning, first of all, you see that the performance of the models follows an S curve. So it's not like it just goes off to infinity, right? And the S curve kind of plateaus around human level performance. And you can look at all the machine learning that was going on in the 2010s: everything kind of plateaued around human level performance. And we can think about the self-driving car promises, you know, how Elon Musk kept saying the self-driving car is going to happen next year, it's going to happen next, next year. Or you can look at the image recognition, the speech recognition. You can look at all of these things: there was almost nothing that went superhuman, except for something like AlphaGo. And we can speak about why AlphaGo was able to go superhuman.
So I thought the most likely thing was going to be this: I thought it's not going to be a monolithic thing that's like an Encyclopedia Britannica. I thought it must be a distributed thing. And I actually liked to look at the world of finance for what I think a mature machine learning ecosystem would look like. So, yeah. So finance is a machine learning ecosystem, because all of these quant trading firms are running machine learning algorithms, but they're running it on a centralized platform like a marketplace. And it's not the case that there's one giant quant trading company with all the data and all the quant researchers and all the algorithms and compute; instead they all specialize. So one will specialize on high frequency trading. Another will specialize on mid frequency. Another one will specialize on equity. Another one will specialize on something else. And I thought that's the way the world works. That's how it is. And so there must exist a platform where a small team can produce an AI for a unique purpose, and they can iterate and build the best thing for that, right? And so that was the vision for Chai. So we wanted to build a platform for LLMs.

Alessio [00:11:36]: That's kind of the maybe inside versus contrarian view that led you to start the company. Yeah. And then what was maybe the initial idea maze? Because if somebody told you that was the Hugging Face founding story, people might believe it. It's kind of like a similar ethos behind it. How did you land on the product feature today? And maybe what were some of the ideas that you discarded that initially you thought about?

William [00:11:58]: So the first thing we built was fundamentally an API. So nowadays people would describe it as like agents, right? But anyone could write a Python script. They could submit it to an API. They could send it to the Chai backend and we would then host this code and execute it. So that's like the developer side of the platform.
Within their Python script, the interface was essentially text in and text out. An example would be the very first bot that I created. I think it was a Reddit news bot. And so first it would pull the popular news. Then it would prompt whatever, like I just used some external API for like BERT or GPT-2 or whatever. Like it was a very, very small thing. And then the user could talk to it. So you could say to the bot, hi bot, what's the news today? And it would say, these are the top stories. And you could chat with it. Now, four years later, that's like Perplexity or something, right? But back then the models were, first of all, really, really dumb. You know, they had an IQ of like a four year old. And users, there really wasn't any demand or any PMF for interacting with the news. So then I was like, okay, let's make another one. And I made a bot which you could talk to about a recipe. So you could say, I'm making eggs, like I've got eggs in my fridge, what should I cook? And it'll say, you should make an omelet. Right? There was no PMF for that. No one used it. And so I just kept creating bots. And so every single night after work, I'd be like, okay, we have AI, we have this platform, I can create any text-in, text-out sort of agent and put it on the platform. And so we just created stuff night after night. And then all the coders I knew, I would say to them, look, there's this platform. You can create any chat AI. You should put it on. And you know, everyone's like, well, chatbots are super lame. We want absolutely nothing to do with your chatbot app. No one who knew Python wanted to build on it. I'm trying to build all these bots and no consumers want to talk to any of them.
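The early developer interface described here — a hosted script whose whole contract is text in, text out — can be sketched in a few lines. This is a hypothetical reconstruction, not Chai's real API: `fetch_top_stories` and `complete` are placeholder stand-ins for the external news source and language-model call William mentions.

```python
# Sketch of the early "developer side" contract: a bot is just a function
# from user text to reply text, which the platform hosts and executes.
# fetch_top_stories() and complete() are illustrative placeholders for
# the external news API and LLM completion API described in the episode.

def fetch_top_stories():
    # Placeholder for a real Reddit/news API call.
    return ["Story A", "Story B"]

def complete(prompt: str) -> str:
    # Placeholder for a real LLM completion call (BERT/GPT-2-era model).
    return f"Top stories today: {', '.join(fetch_top_stories())}"

def news_bot(user_message: str) -> str:
    # The entire bot: build a prompt from the user's message, return text.
    prompt = f"The user said: {user_message}\nSummarize today's news."
    return complete(prompt)

reply = news_bot("hi bot, what's the news today?")
```

The point of the design is that anything fitting this one-function shape could be submitted and hosted, regardless of what it did internally.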
And then my sister, who at the time was just finishing college or something, I said to her, if you want to learn Python, you should just submit a bot for my platform. And she built a therapy bot for me. And I was like, okay, cool, a therapist bot. And then the next day I checked the performance of the app and I'm like, oh my God, we've got 20 active users. And they spent an average of 20 minutes on the app. I was like, oh my God, what bot were they speaking to for an average of 20 minutes? And I looked, and it was the therapist bot. And I went, oh, this is where the PMF is. There was no demand for recipe help. There was no demand for news. There was no demand for dad jokes or pub quiz or fun facts. What they wanted was the therapist bot. At the time I kind of reflected on that and I thought, well, if I want to consume news, the most fun way to consume news is like Twitter. The value of there being a back and forth wasn't that high, right? And I thought, if I need help with a recipe, I actually just go to, like, the New York Times, which has a good recipe section, right? It's not actually that hard. And so I just thought: the thing that AI is 10x better at is a sort of conversation that's not intrinsically informative, but is more about an opportunity. You can say whatever you want. You're not going to get judged. If it's 3am, you don't have to wait for your friend to text back. It's immediate. They're going to reply immediately. You can say whatever you want. It's judgment-free and it's much more like a playground. It's much more like a fun experience. And you could see that if the AI gave a person a compliment, they would love it. It's much easier to get the AI to give you a compliment than a human. From that day on, I said, okay, I get it. Humans want to speak to humans or human-like entities, and they want to have fun.
And that was when I started to look less at platforms like Google, and I started to look more at platforms like Instagram. And I was trying to think about why people use Instagram. And I could see that Chai was filling the same desire or the same drive. If you go on Instagram, typically you want to look at the faces of other humans, or you want to hear about other people's lives. So if it's like The Rock is making himself pancakes on a cheese plate, you kind of feel a little bit like you're The Rock's friend, or you're having pancakes with him or something, right? But if you do it too much, you feel sad and like a lonely person. But with AI, you can talk to it and tell it stories, and it tells you stories, and you can play with it for as long as you want. And you don't feel like you're a sad, lonely person. You feel like you actually have a friend.

Alessio [00:16:29]: And why is that? Do you have any insight on that from using it?

William [00:16:33]: I think it's just the human psychology. I think it's just the idea that, with old school social media, you're just consuming passively, right? So you'll just swipe. If I'm watching TikTok, I just swipe and swipe and swipe. And even though I'm getting the dopamine of watching an engaging video, there's this other thing building in my head, which is: I'm feeling lazier and lazier and lazier. And after a certain period of time, I'm like, man, I just wasted 40 minutes. I achieved nothing. But with AI, because you're interacting — it's not like work, but you feel like you're participating and contributing to the thing. You don't feel like you're just consuming. So you don't have a sense of remorse, basically. And you know, I think on the whole, the way people talk about and interact with the AI, they speak about it in an incredibly positive sense.
Like we get people who say they have eating disorders saying that the AI helps them with their eating disorders. People who say they're depressed, it helps them through the rough patches. So I think there's something intrinsically healthy about interacting that TikTok and Instagram and YouTube don't quite tick. From that point on, it was about building more and more kind of human-centric AI for people to interact with. And I was like, okay, let's make a Kanye West bot, right? And then no one wanted to talk to the Kanye West bot. And I was like, ah, who's a cool persona for teenagers to want to interact with? And I was trying to find the influencers and stuff like that, but no one cared. They didn't want to interact with them, yeah. And instead, really, the special moment was the realization that developers and software engineers aren't interested in building this sort of AI, but the consumers are, right? And rather than me trying to guess every day what's the right bot to submit to the platform, why don't we just create the tools for the users to build it themselves? And so nowadays this is like the most obvious thing in the world, but when Chai first did it, it was not an obvious thing at all. Right. So we took the API for, let's just say it was, I think it was GPT-J, which was this 6 billion parameter open source transformer-style LLM. We took GPT-J. We let users create the prompt. We let users select the image and we let users choose the name. And then that was the bot. And through that, they could shape the experience, right? So if they said this bot's going to be really mean, and it's going to be called like Bully in the Playground, right? That was a whole category that I never would have guessed. Right. People love to fight. They love to have a disagreement, right? And then there'd be all these romantic archetypes that I didn't know existed.
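The user-generated bot format described here — a name, an image, and a prompt wrapped around one shared base model — can be sketched as a tiny data structure. This is an illustrative guess at the shape, not Chai's actual schema; the field names and prompt layout are assumptions.

```python
from dataclasses import dataclass

# Sketch of a user-created bot as described: users supply only a name,
# an image, and a persona prompt; the same base LLM (GPT-J at the time)
# serves every bot. Field names and prompt format are hypothetical.

@dataclass
class UserBot:
    name: str
    image_url: str
    prompt: str  # shapes the persona, e.g. "this bot is really mean"

    def build_request(self, history: list, user_message: str) -> str:
        # Persona prompt, then conversation so far, then the new turn;
        # the trailing "Name:" cues the base model to reply in character.
        lines = [self.prompt, *history, f"User: {user_message}", f"{self.name}:"]
        return "\n".join(lines)

bully = UserBot(
    name="Bully in the Playground",
    image_url="https://example.com/bully.png",  # hypothetical URL
    prompt="You are a mean playground bully. Stay in character.",
)
req = bully.build_request([], "hey!")
```

The design point is that all the content variety comes from user-supplied prompts, while the served model stays the same for everyone.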
And so as the users could create the content that they wanted, that was when Chai was able to get this huge variety of content. And rather than appealing to the 1% of the population that I'd figured out what they wanted, you could appeal to a much, much broader audience. And so from that moment on, it was crystal clear: just as Instagram is this social media platform that lets people create and upload images and videos, Chai was really about how can we let the users create this experience in AI and then share it and interact and search. So it's really, you know, I say it's like a platform for social AI.

Alessio [00:20:00]: Where did the Chai name come from? Because you started around the same time, I was like, is it Character AI shortened? So I was curious. The UK origin, the chai, was like the second guess.

William [00:20:15]: We started way before Character AI. And there's an interesting story there: Chai's numbers were very, very strong, right? So I think in late 2022 — was it late 2022 or maybe early 2023? — Chai was like the number one AI app in the App Store. So we would have something like 100,000 daily active users. And then one day we kind of saw there was this website. And we were like, oh, this website looks just like Chai. And it was the Character AI website. And I think that nowadays it's much more common knowledge that when they left Google with the funding, I think they knew what was the most trending, the number one app. And I think they sort of built that. Oh, you found the people.

swyx [00:21:03]: You found the PMF for them.

William [00:21:04]: We found the PMF for them. Exactly. Yeah. So I worked a year very, very hard. And then that was when I learned a lesson, which is that if you're VC backed and, you know... So Chai, we'd kind of got to this point where I was the only person who'd invested.
I'd invested maybe 2 million pounds in the business. And from that, we were able to build this thing, get to say a hundred thousand daily active users. And then when Character AI came along, the first version, we sort of laughed. We were like, oh man, this thing sucks. They don't know what they're building. They're building the wrong thing anyway. But then I saw, oh, they've raised a hundred million dollars. Oh, they've raised another hundred million dollars. And then our users started saying, oh guys, your AI sucks. Because we were serving a 6 billion parameter model, right? How big was the model that Character AI could afford to serve, right? So let's say we would spend a dollar per user, right? Over the, you know, the entire lifetime.

swyx [00:22:01]: A dollar per session, per chat, per month? No, no, no, no.

William [00:22:04]: Let's say over the course of the year, we'd have a million users and we'd spend a million dollars on the AI throughout the year, right? Like aggregated. Exactly. Exactly. Right. They could spend a hundred times that. So people would say, why is your AI much dumber than Character AI's? And then I was like, oh, okay, I get it. This is the Silicon Valley style hyperscale business. And so, yeah, we moved to Silicon Valley and got some funding and iterated and built the flywheels. And I'm very proud that we were able to compete with that. Right. And I think the reason we were able to do it was just customer obsession. And it's similar, I guess, to how DeepSeek have been able to produce such a compelling model when compared to someone like an OpenAI, right? So DeepSeek, you know, their latest, um, V2, yeah, they claim to have spent 5 million training it.

swyx [00:22:57]: It may be a bit more, but, um, like, why are you making such a big deal out of this? Yeah. There's an agenda there. Yeah. You brought up DeepSeek.
So we have to ask: you had a call with them.

William [00:23:07]: We did. We did. Um, let me think what to say about that. I think for one, they have an amazing story, right? So their background is again in finance.

swyx [00:23:16]: They're the Chinese version of you. Exactly.

William [00:23:18]: Well, there's a lot of similarities. Yes. I have a great affinity for companies which are founder led, customer obsessed, and just try and build something great. And I think what DeepSeek have achieved that's quite special is they've got this amazing inference engine. They've been able to reduce the size of the KV cache significantly. And then by being able to do that, they're able to significantly reduce their inference costs. And I think with AI, people get really focused on the kind of foundation model, or the model itself, and they don't pay much attention to the inference. To give you an example with Chai: let's say a typical user session is 90 minutes, which is, you know, very, very long. For comparison, let's say the average session length on TikTok is 70 minutes. So people are spending a lot of time. And in that time they're able to send, say, 150 messages. That's a lot of completions, right? It's quite different from an OpenAI scenario where people might come in with a particular question in mind, and they'll ask one question and a few follow-up questions, right? So because they're consuming, say, 30 times as many requests for a chat, or a conversational experience, you've got to figure out how to get the right balance between the cost of that and the quality. And so, you know, I think with AI, it's always been the case that if you want a better experience, you can throw compute at the problem, right? So if you want a better model, you can just make it bigger. If you want it to remember better, give it a longer context.
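The request-volume point can be made concrete with back-of-envelope arithmetic. The session and message counts come from the conversation; the per-completion cost is a made-up placeholder, since no real figure is given.

```python
# Back-of-envelope sketch of why companion chat is inference-heavy:
# session/message numbers are from the conversation; the per-completion
# cost below is a hypothetical placeholder, not a real Chai figure.

chat_messages_per_session = 150  # 90-minute companion session
qa_messages_per_session = 5      # one question plus a few follow-ups
cost_per_completion = 0.001      # assumed dollars per completion

request_ratio = chat_messages_per_session / qa_messages_per_session  # 30.0

chat_cost = chat_messages_per_session * cost_per_completion  # $0.15/session
qa_cost = qa_messages_per_session * cost_per_completion      # $0.005/session
```

At any fixed per-completion price, the chat product pays roughly 30x more per session, which is why cost-per-completion (inference efficiency) dominates the economics.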
And now, what OpenAI is doing to great fanfare is, with rejection sampling, you can generate many candidates, right? And then with some sort of reward model or some sort of scoring system, you can serve the most promising of these many candidates. And so that's scaling up on the inference-time compute side of things. And so for us, it doesn't make sense to think of AI as just the absolute performance — like the MMLU score or, you know, any of these benchmarks that people like to look at. If you just get that score, it doesn't really tell you anything, because really, progress is made by improving the performance per dollar. And so I think that's an area where DeepSeek have been able to perform very, very well, surprisingly so. And so I'm very interested in what Llama 4 is going to look like, and if they're able to sort of match what DeepSeek have been able to achieve with this performance per dollar gain.

Alessio [00:25:59]: Before we go into the inference, some of the deeper stuff, can you give people an overview of some of the numbers? So I think last I checked, you have like 1.4 million daily actives now. It's like over $22 million of revenue. So it's quite a business.

William [00:26:12]: Yeah, users grew by a factor of three last year. Revenue more than doubled. You know, it's very exciting. We're competing with some really big, really well funded companies. Character AI got, I think it was almost a $3 billion valuation. And 5 million DAU is a number that I last heard they have. Talkie, which is a Chinese built app owned by a company called Minimax, they're incredibly well funded. And these companies didn't grow by a factor of three last year. Right.
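The generate-many, score, serve-the-best loop William describes can be sketched as best-of-N sampling. The `generate` and `reward` functions below are toy stand-ins (a real system would call an LLM and a learned reward model), so only the control flow is meant literally.

```python
import random

# Sketch of best-of-N (rejection) sampling: generate several candidate
# completions, score each with a reward model, serve only the best one.
# generate() and reward() are toy placeholders for a real LLM and a real
# learned reward model.

def generate(prompt: str, seed: int) -> dict:
    # Deterministic toy "LLM": returns a candidate with a quality score
    # attached, standing in for an actual sampled completion.
    rng = random.Random(seed)
    return {"text": f"candidate-{seed}", "quality": rng.random()}

def reward(candidate: dict) -> float:
    # A real reward model would score candidate["text"] itself.
    return candidate["quality"]

def best_of_n(prompt: str, n: int = 8) -> dict:
    candidates = [generate(prompt, seed) for seed in range(n)]
    return max(candidates, key=reward)

best = best_of_n("hello", n=8)
```

This is exactly the "throw inference-time compute at quality" trade: serving cost scales with `n`, while the user only ever sees the top-scored candidate.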
And so when you've got this company and this team that's able to keep building something that gets users excited, and they want to tell their friends about it, and then they want to come and stick on the platform, I think that's very special. And so last year was a great year for the team. And yeah, I think the numbers reflect the hard work that we put in. And then fundamentally, the quality of the app, the quality of the content, the quality of the AI, is the quality of the experience that you have.

You actually published your DAU growth chart, which is unusual. And I see some inflections. Like, it's not just a straight line. There's some things that actually inflect. Yes. What were the big ones?

Cool. That's a great, great, great question. Let me think of a good answer. I'm basically looking to annotate this chart, which doesn't have annotations on it. Cool. The first thing I would say is, I think the most important thing to know about success is that success is born out of failures, right? It's through failures that we learn. You know, if you think something's a good idea, and you do it and it works, great, but you didn't actually learn anything, because everything went exactly as you imagined. But if you have an idea, you think it's going to be good, you try it, and it fails, there's a gap between the reality and expectation. And that's an opportunity to learn. The flat periods, that's us learning. And then the up periods, that's us reaping the rewards of that. So looking at the growth chart of just 2024, I think the first thing that really put a dent in our growth was our backend. We just reached this scale. From day one, we'd built on top of GCP, which is Google's cloud platform. And they were fantastic.
We used them when we had one daily active user, and they worked pretty well all the way up till we had about 500,000. It was never the cheapest, but from an engineering perspective, man, that thing scaled insanely well. Like, not Vertex? Not Vertex. Like GKE, that kind of stuff? We used Firebase. So we used Firebase. I'm pretty sure we're the biggest user ever on Firebase. That's expensive. Yeah, we had calls with engineers, and they're like, we wouldn't recommend using this product beyond this point, and you're 3x over that. So we pushed Google to their absolute limits. You know, it was fantastic for us, because we could focus on the AI. We could focus on just adding as much value as possible. But then what happened was, after 500,000, the way we were using it, it just wouldn't scale any further. And so we had a really, really painful, at least three-month period as we migrated between different services, figuring out, like, what requests do we want to keep on Firebase, and what ones do we want to move on to something else? And then, you know, making mistakes and learning things the hard way. And then after about three months, we got that right. So then we were able to scale to 1.5 million DAU without any further issues from GCP. But what happens is, if you have an outage, new users who go on your app experience a dysfunctional app, and then they're going to exit. And so the next day, the key metrics that the app stores track are going to be something like retention rates, money spent, and the star rating that they give you. In the app store. In the app store, yeah. Tyranny. So if you're ranked top 50 in entertainment, you're going to acquire a certain rate of users organically. If users go in and have a bad experience, it's going to tank where you're positioned in the algorithm.
And then it can take a long time to earn your way back up, at least if you wanted to do it organically. If you throw money at it, you can jump to the top. And I could talk about that. But broadly speaking, if we look at 2024, the first kink in the graph was outages due to hitting 500k DAU. The backend didn't want to scale past that. So then we just had to do the engineering and build through it. Okay, so we built through that, and then we get a little bit of growth. And so, okay, that's feeling a little bit good. The next thing, I'm not going to lie, I have a feeling it was when Character AI got... I was thinking. I think so. So the Character AI team fundamentally got acquired by Google. And I don't know what they changed in their business. I don't know if they dialed down that ad spend. The product doesn't change, right? The product just is what it is. I don't think so. Yeah, I think the product is what it is. It's like maintenance mode. Yes. I think the issue is, you know, some people may think this is an obvious fact, but running a business can be very competitive, right? Because other businesses can see what you're doing, and they can imitate you. And then there's this question of, if you've got one company that's spending $100,000 a day on advertising, and you've got another company that's spending zero, if you consider market share, and the new users which are entering the market, the guy that's spending $100,000 a day is going to be getting 90% of those new users. And so I have a suspicion that when the founders of Character AI left, they dialed down their spending on user acquisition. And I think that gave oxygen to the other apps. And so Chai was able to then start growing again in a really healthy fashion. I think that's the second thing. The third thing is we've really built a great data flywheel.
The AI team sort of perfected their flywheel, I would say, at the end of Q2. And I could speak about that at length. But fundamentally, the way I would describe it is, when you're building anything in life, you need to be able to evaluate it. And through evaluation, you can iterate. We can look at benchmarks, and we can talk about the issues with benchmarks and why they may not generalize as well as one would hope, and the challenges of working with them. But something that works incredibly well is getting feedback from humans. And so we built this thing where anyone can submit a model to our developer backend, and it gets put in front of 5,000 users, and the users can rate it. And we can then have a really accurate ranking of which models are users finding more engaging or more entertaining. It's at the point now where we evaluate between 20 and 50 models, LLMs, every single day. So even though we've only got a team of, say, five AI researchers, they're able to iterate through a huge quantity of LLMs. Our team ships, let's just say, a minimum of 100 LLMs a week. Before that moment in time, we might iterate through three a week; there was a time when even doing five a month was a challenge, right? We changed the feedback loops so that it's not: let's launch these three models, let's do an A/B test, let's assign different cohorts, let's wait 30 days to see what the day-30 retention is. If you're doing an app, that's A/B testing 101: do a 30-day retention test, assign different treatments to different cohorts, and come back in 30 days. That's insanely slow. It's just too slow. And so we were able to get that 30-day feedback loop all the way down to something like three hours.
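The shortened loop William describes here, assign each user to a candidate-model cohort, read back a short-horizon engagement signal instead of day-30 retention, and rank the candidates, could be sketched roughly like this. The hashing scheme and the "engagement in the first few hours" metric are illustrative assumptions, not Chai's actual pipeline:

```python
import hashlib
from collections import defaultdict

def assign_cohort(user_id: str, models: list[str]) -> str:
    # Deterministically hash each user into one candidate-model cohort.
    digest = int(hashlib.md5(user_id.encode()).hexdigest(), 16)
    return models[digest % len(models)]

def rank_models(events: list[tuple[str, float]], models: list[str]) -> list[str]:
    # Aggregate a short-horizon engagement signal (e.g. messages sent in
    # the first few hours) per cohort, then rank models by the average.
    totals, counts = defaultdict(float), defaultdict(int)
    for user_id, engagement in events:
        m = assign_cohort(user_id, models)
        totals[m] += engagement
        counts[m] += 1
    return sorted(models, key=lambda m: totals[m] / max(counts[m], 1), reverse=True)
```

The aggregation itself is trivial; the point of the three-hour loop is only that the signal arrives fast enough to ship on the order of a hundred variants a week.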
And when we did that, we could really, really perfect techniques like DPO, fine-tuning, prompt engineering, blending, rejection sampling, training a reward model, really successfully, like boom, boom, boom. And so in Q3 and Q4, the amount of AI improvement we got was astounding. It was getting to the point where I thought, how much more edge is there to be had here? But the team just could keep going and going. That was number three for the inflection points.

swyx [00:34:53]: There's a fourth?

William [00:34:54]: The important thing about the third one is, if you go on our Reddit or you talk to users of the AI, there's a clear date. It's somewhere in October or something. The users flipped. Before October, the users would say Character AI is better than you, for the most part. Then from October onwards, they would say, wow, you guys are better than Character AI. And that was a really clear positive signal that we'd sort of done it. And I think you can't cheat consumers. You can't trick them. You can't b******t them. They know, right? If you're going to spend 90 minutes on a platform, and with apps, the barriers to switching are pretty low. You can try Character AI for a day. If you get bored, you can try Chai. If you get bored of Chai, you can go back to Character. So the loyalty is not strong, right? What keeps them on the app is the experience. If you deliver a better experience, they're going to stay, and they can tell. So the fourth one was we were fortunate enough to get this hire. We had hired one really talented engineer, and they said, oh, at my last company, we had a head of growth. He was really, really good. And he was the head of growth for ByteDance for two years. Would you like to speak to him? And I was like, yes.
Yes, I think I would. And so I spoke to him. And he just blew me away with what he knew about user acquisition. You know, it was like a 3D chess sort of thing, you know, as much as I know about AI.

swyx [00:36:21]: Like ByteDance as in TikTok US?

William [00:36:26]: Yes. Not ByteDance as in the other stuff. Yep. He was interviewing us as we were interviewing him, right. And so, pick up options. Yeah, exactly. And so he was looking at our metrics, and I saw him get really excited when he said, guys, you've got a million daily active users and you've done no advertising. I said, correct. And he was like, that's unheard of. I've never heard of anyone doing that. And then he started looking at our metrics, and he was like, if you've got all of this organically, if you start spending money, this is going to be very exciting. I was like, let's give it a go. So then he came in, and we've just started ramping up the user acquisition. So that looks like spending, let's say, $20,000 a day to start, and it looked very promising. Right now we're spending $40,000 a day on user acquisition. That's still only half of what Character AI or Talkie may be spending. But from that, we went from growing at a rate of maybe 2x a year to growing at a rate of 3x a year. So we're evolving more and more toward a Silicon Valley-style hyper growth, like, you know, you build something decent, and then you can slap on a huge...

swyx [00:37:33]: You did the important thing, you did the product first.

William [00:37:36]: Of course, but then you can slap on the rocket or the jet engine or something, which is: you pour in as much cash as you can, you buy a lot of ads, and your growth is faster.

swyx [00:37:48]: Not to, you know, I'm just kind of curious what's working right now versus what surprisingly doesn't work.

William [00:37:52]:
Oh, there's a long, long list of surprising stuff that doesn't work. Yeah. The most surprising thing about what doesn't work is that almost everything doesn't work. That's what's surprising. And I'll give you an example. So a year and a half ago at the company, we were super excited by audio. I was like, audio is going to be the next killer feature, we have to get it in the app. And I want to be the first. So everything Chai does, I want us to be the first. We may not be the company that's strongest at execution, but we can always be the most innovative.

swyx [00:38:22]: Interesting. You're pretty strong at execution.

William [00:38:26]: We're much stronger now. A lot of the reason we're here is because we were first. If we launched today, it'd be so hard to get the traction. Because to get the flywheel, you have to get the users, to build a product people are excited about. If you're first, people are naturally excited about it. But if you're fifth or 10th, man, you've got to be insanely good at execution.

swyx [00:38:46]: So you were first with voice?

William [00:38:51]: We were first. I only know when Character launched voice. They launched it, I think, at least nine months after us. Okay. But the team worked so hard for it. At the time we did it, latency was a huge problem. Cost was a huge problem. Getting the right quality of voice was a huge problem. Then there's the user interface and getting the right user experience. Because you don't just want it to start blurting out, right? You want to kind of activate it, but then you don't want to have to keep pressing a button every single time. There's a lot that goes into getting a really smooth audio experience. So we went ahead, we invested the three months, we built it all. And then when we did the A/B test, there was no change in any of the numbers.
And I was like, this can't be right, there must be a bug. And we spent a week just checking everything, checking again and again. And it was like, the users just did not care. Something like only 10 or 15% of users even clicked the button to engage the audio. And they would only use it for 10 or 15% of the time. So if you do the math, if it's something that one in seven people use for one seventh of their time, you've changed like 2% of the experience. So even if that 2% of the time is insanely good, it doesn't translate much when you look at the retention, the engagement, and the monetization rates. So audio did not have a big impact. I'm pretty big on audio. But yeah, I like it too. But, you know, with a lot of the stuff I do, you can have a theory, and reality resists it. Yeah. Exactly, exactly. So I think if you want to make audio work, it has to be a unique, compelling, exciting experience that they can't have anywhere else.

swyx [00:40:37]: It could be your models, which just weren't good enough.

William [00:40:39]: No, no, no, they were great. Oh, yeah, they were very good. It was kind of just, you know, if you listen to an Audible or Kindle or something, you just hear this voice. And you don't go, wow, this is special, right? It's a convenience thing. But the idea is that if Chai is the only platform... Like, let's say you have a Mr. Beast, and YouTube is the only platform where you can watch a Mr. Beast video. And it's the most engaging, fun video that you want to watch, so you'll go to YouTube. And so for audio, you can't just put the audio on there and people go, oh, yeah, it's 2% better. Or 5% of users think it's 20% better, right?
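The back-of-envelope math in that answer is worth making explicit: if a feature is adopted by a fraction of users and used for a fraction of their time, the share of the total experience it touches is roughly the product of the two.

```python
# Rough feature-impact estimate from the episode: ~1 in 7 users engage
# the feature, for ~1/7 of their time, so it touches about 2% of the
# overall experience.
def experience_share(adoption_rate: float, usage_share: float) -> float:
    return adoption_rate * usage_share

share = experience_share(1 / 7, 1 / 7)
print(f"{share:.1%}")  # prints "2.0%"
```

Which is why even a very good audio experience barely moved retention or monetization: it only ever touched a couple of percent of user time.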
It has to be something that the majority of people, for the majority of the experience, go like, wow, this is a big deal. Those are the features you need to be shipping. If it's not going to appeal to the majority of people, for the majority of the experience, and it's not a big deal, it's not going to move you. Cool. So you killed it. I don't see it anymore. Yep. So I love this. It's kind of cheesy, I guess, but the longer I've been working at Chai, and I think the team agrees with this, all the platitudes, at least I thought they were platitudes, that you would get from the Steve Jobs types, like build something insanely great, right? Or be maniacally focused, or, you know, the most important thing is what you say no to, what you choose not to work on. All of these lessons are just painfully true. They're painfully true. So now, everything I say, I'm either quoting Steve Jobs or Zuckerberg. I'm like, guys, move fast and break things.

swyx [00:42:10]: You've jumped the Apollo to cool it now.

William [00:42:12]: Yeah, everything they said is so, so true. The turtleneck. Yeah, yeah, yeah. Everything is so true.

swyx [00:42:18]: This last question on my side, and I want to pass this to Alessio, is on just multi-modality in general. This actually comes from Justine Moore from A16Z, who's a friend of ours. And a lot of people are trying to do voice, image, video for AI companions. Yes. You just said voice didn't work. Yep. What would make you revisit?

William [00:42:36]: So Steve Jobs was very, very clear on this. There's a habit of engineers who, once they've got some cool technology, want to find a way to package up the cool technology and sell it to consumers, right? That does not work. So you're free to try and build a startup where you've got your cool tech and you want to find someone to sell it to. That's not what we do at Chai. At Chai, we start with the consumer.
What does the consumer want? What is their problem? And how do we solve it? So right now, the number one problem for the users, it's not the audio. That's not the number one problem. It's not the image generation either. The number one problem for users in AI is this. All the AI is being generated by middle-aged men in Silicon Valley, right? That's all the content. You're interacting with this AI. You're speaking to it for 90 minutes on average. It's being trained by middle-aged men. These guys out there, they're the ones deciding, oh, what should the AI say in this situation, right? What's funny? What's cool? What's boring? What's entertaining? That's not the way it should be. The way it should be is that the users should be creating the AI, right? And so the way I speak about it is this. At Chai, we have this AI engine, on top of which sits a thin layer of UGC. That thin layer of UGC is absolutely essential, but it's just prompts. It's just an image. It's just a name. It's like we've done 1% of what we could do. So we need to keep thickening up that layer of UGC. It must be the case that the users can train the AI. And if reinforcement learning is powerful and important, they have to be able to do that. And so it's got to be the case that there exists... You know, I say to the team: just as Mr. Beast is able to spend 100 million a year, or whatever it is, on his production company, and he's got a team building the content, which he then shares on the YouTube platform, until there's a team that's earning 100 million a year, or spending 100 million on the content that they're producing for the Chai platform, we're not finished, right? So that's the problem. That's what we're excited to build.
And getting too caught up in the tech, I think, is a fool's errand. It does not work.

Alessio [00:44:52]: As an aside, I saw the Beast Games thing on Amazon Prime. It's not doing well. And I'm curious.

swyx [00:44:56]: I mean, the Rotten Tomatoes score sucks, but the audience rating is high.

Alessio [00:45:02]: But it's not like in the top 10. I saw it dropped off of like the... Oh, okay. Yeah, that one I don't know. I'm curious, like, you know, it's kind of like similar content, but different platform. And then going back to some of what you were saying, like, you know, people come to Chai

William [00:45:13]: expecting some type of content. Yeah, I think something that's interesting to discuss is moats. And what is the moat? So if you look at a platform like YouTube, the moat, I think, is really in the ecosystem. And the ecosystem is comprised of the content creators, the users, the consumers, and then the algorithms. And this creates a sort of flywheel where the algorithms are trained on the users and the users' data, and the recommender systems can then feed information back to the content creators. So Mr. Beast, he knows which thumbnail does the best. He knows the first 10 seconds of the video has to be this particular way. And so his content is super optimized for the YouTube platform. And that's why it doesn't do well on Amazon. How many videos has he created on the YouTube platform? Thousands, tens of thousands, I guess. If he wants to do well on Amazon, he needs to get those iterations in on Amazon.
So at Chai, I think it's all about how can we get the most compelling, rich user-generated content, stick that on top of the AI engine and the recommender systems, such that we get this beautiful data flywheel: more users, better recommendations, more creators, more content, more users.

Alessio [00:46:34]: You mentioned the algorithm; you have this idea of the Chaiverse on Chai, and you have your own kind of LMSYS-like ELO system. Yeah, what are things that your models optimize for, like your users optimize for, and maybe talk about how you build it, how people submit models?

William [00:46:49]: So Chaiverse is what I would describe as a developer platform. More often when we're speaking about Chai, we're thinking about the Chai app. And the Chai app is really this product for consumers. Consumers can come on the Chai app, they can interact with our AI, and they can interact with other UGC. And it's really just these kind of bots; it's a thin layer of UGC. Our mission is not to just have a very thin layer of UGC. Our mission is to have as much UGC as possible. I don't want just people at Chai training the AI. I want everyone, not middle-aged men, building the AI, as many people building the AI as possible. Okay, so what we built was Chaiverse. And Chaiverse is kind of like a prototype, is the way to think about it. And it started with this observation: well, how many models get submitted to Hugging Face a day? It's hundreds, right? So there's hundreds of LLMs submitted each day. Now consider, what does it take to build an LLM? It takes a lot of work, actually. Someone devoted several hours of compute, several hours of their time, prepared a data set, launched it, ran it, evaluated it, submitted it, right?
So there's a lot of work that's going into that. So what we did was we said, well, why can't we host their models for them and serve them to users? And then what would that look like? The first issue is, well, how do you know if a model is good or not? We don't want to serve users the crappy models, right? So what we do is, I love the LMSYS style. I think it's really cool. It's really simple. It's a very intuitive thing, which is you simply present the users with two completions. You can say, look, this is from model A, this is from model B, which is better? And so if someone submits a model to Chaiverse, what we do is we spin up a GPU, we download the model, we host that model on the GPU, and we start routing traffic to it. We think it takes about 5,000 completions to get an accurate signal. That's roughly what LMSYS does. And from that, we're able to get an accurate ranking of which models people are finding entertaining and which models are not. If you look at the bottom 80%, they'll suck. You can just disregard them. They totally suck. Then when you get to the top 20%, you know you've got a decent model, but you can break it down into more nuance. There might be one that's really descriptive. There might be one that's got a lot of personality to it. There might be one that's really illogical. Then the question is, well, what do you do with these top models? From there, you can do more sophisticated things. You can try and do a routing thing where you say, for a given user request, we're going to try and predict which of these n models the user will enjoy the most. That turns out to be pretty expensive and not a huge source of edge or improvement.
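The LMSYS-style ranking described here, pairwise "which reply is better?" votes aggregated into a leaderboard, is typically implemented with Elo updates. A minimal sketch; the K-factor of 32 and the 1,000-point starting rating are illustrative assumptions, not Chaiverse's actual parameters:

```python
def expected_score(r_a: float, r_b: float) -> float:
    # Elo's logistic model for the probability that A is preferred to B.
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))

def record_vote(ratings: dict, winner: str, loser: str, k: float = 32.0) -> None:
    # After each user vote, move both ratings toward the observed outcome.
    e = expected_score(ratings[winner], ratings[loser])
    ratings[winner] += k * (1.0 - e)
    ratings[loser] -= k * (1.0 - e)

ratings = {"model_a": 1000.0, "model_b": 1000.0}
# model_a wins three of four head-to-head user votes:
for winner, loser in [("model_a", "model_b")] * 3 + [("model_b", "model_a")]:
    record_vote(ratings, winner, loser)
```

With on the order of 5,000 such comparisons per model, the ratings stabilize enough to separate the bottom 80% from the top 20% reliably.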
Something that we love to do at Chai is blending. The simplest way to think about it is, you're pretty quickly going to see you've got one model that's really smart and one model that's really funny. How do you get the user an experience that is both smart and funny? Well, for 50% of the requests you serve them the smart model, and for 50% of the requests you serve them the funny model. Just a random 50%? Just a random, yeah. And then... That's blending? That's blending. You can do more sophisticated things on top of that, as in all things in life, but that's the 80/20 solution: if you just do that, you get a pretty powerful effect out of the gate. Random number generator. I think it's the robustness of randomness. Random is a very powerful optimization technique, and it's a very robust thing. You can explore a lot of the space very efficiently. There's one thing that's really, really important to share, and this is the most exciting thing for me: after you do the ranking, you get an ELO score, and you can track a user from their first join date, the first date they submit a model to Chaiverse. They almost always get a terrible ELO, right? So let's say the first submission gets an ELO of 1,100 or 1,000 or something, and you can see that they iterate and iterate, and it will be like, no improvement, no improvement, no improvement, and then boom. Do you give them any data, or do they have to come up with this themselves? We do. We try to strike a balance between giving them data that's very useful and being compliant with GDPR, which means you have to work very hard to preserve the privacy of the users of your app. So we try to give them as much signal as possible, to be helpful. The minimum is we're just going to give you a score, right? That's the minimum.
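That submit-score-keep loop is essentially black-box hill-climbing against a single scalar. A rough sketch of the developer's side of it; the mutate step and the scoring function below are toy stand-ins, since the real "score" is an Elo computed from live user feedback:

```python
import random

def hill_climb(score_fn, mutate, init, iters=100, rng=None):
    # Keep-if-better loop: submit a variant, read back its score, and
    # keep the change only when the score improves.
    rng = rng or random.Random(0)
    best, best_score = init, score_fn(init)
    for _ in range(iters):
        candidate = mutate(best, rng)
        s = score_fn(candidate)
        if s > best_score:
            best, best_score = candidate, s
    return best, best_score

# Toy example: the "model" is a single knob, and the unknown score
# function peaks at 3.0.
score = lambda x: -(x - 3.0) ** 2
step = lambda x, rng: x + rng.uniform(-0.5, 0.5)
best, best_score = hill_climb(score, step, init=0.0)
```

Each accepted improvement is the "boom" William describes; the flat stretches on the chart are the rejected candidates.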
But even a bare score, people can optimize pretty well, because they're able to come up with theories: submit it, does it work? No. A new theory, does it work? No. And then boom, as soon as they figure something out, they keep it, and then they iterate, and then boom.

Alessio [00:51:46]: Last year, you had this post on your blog, crowdsourcing the leap to the 10-trillion-parameter AGI, and you call it a mixture of experts, recommenders. Yep. Any insights? Updated thoughts, 12 months later?

William [00:51:58]: I think the timeline for AGI has certainly been pushed out, right? Now, I'm a controversial person on this, I don't know, I just think... You don't believe in scaling laws, you think AGI is further away. I think it's an S-curve. I think everything's an S-curve. And I think that the models have proven to be far worse at reasoning than people thought. Whenever I hear people talk about LLMs as reasoning engines, I sort of cringe a bit. I don't think that's what they are. I think of them more as a simulator, right? They get trained to predict the next most likely token. It's like a physics simulation engine, like those games where you construct a bridge, drop a car down, and it predicts what should happen. That's really what LLMs are doing. It's not so much that they're reasoning; it's more that they're doing the most likely thing. So fundamentally, the ability for people to add in intelligence, I think, is very limited. What most people would consider intelligence, I think, is not a crowdsourcing problem, right? Wikipedia crowdsources knowledge. It doesn't crowdsource intelligence. It's a subtle distinction. AI is fantastic at knowledge. I think it's weak at intelligence.
And it's easy to conflate the two, because if you ask it a question, say, who was the seventh president of the United States, and it gives you the correct answer (well, I don't know the answer to that myself), you can conflate that with intelligence. But really, that's a question of knowledge. And knowledge is really about saying, how can I store all of this information, and then how can I retrieve something that's relevant? Okay, they're fantastic at that. They're fantastic at storing knowledge and retrieving the relevant knowledge. They're superior to humans in that regard. And so I think we need to come up with a new word. How does one describe it? AI should contain more knowledge than any individual human. It should be more accessible than any individual human. That's a very powerful thing. That's super powerful.

swyx [00:54:07]: But what words do we use to describe that? We had a previous guest on Exa AI that does search. And he tried to coin super knowledge as the opposite of super intelligence.

William [00:54:20]: Exactly. I think super knowledge is a more accurate word for it.

swyx [00:54:24]: You can store more things than any human can.

William [00:54:26]: And you can retrieve it better than any human can as well. And I think it's those two things combined that's special. I think that thing will exist. That thing can be built. And I think you can start with something that's entertaining and fun. I often think it's going to be a 20-year journey, and we're in year four. It's like the web, and this is 1998 or something. You know, you've got a long, long way to go before the Amazon.coms are these huge, multi-trillion-dollar businesses that every single person uses every day. And so AI today is very simplistic.
And it's fundamentally about the way we're using it, the flywheels, and this ability for everyone to contribute to it, to really magnify the value that it brings. Right now, I think it's a bit sad. Right now you have big labs, and I'm going to pick on OpenAI. They go to these human labelers, and they say, we're going to pay you to label this subset of questions, because we want a really high-quality data set, and then we're going to get our own computers that are really powerful. And that's kind of the thing. For me, it's so much like Encyclopedia Britannica. It's insane. All the people that were interested in blockchain, it's like, well, this is what needs to be decentralized. You need to decentralize that thing. Because if you distribute it, people can generate way more data in a distributed fashion, way more, right? You need the incentive. Yeah, of course. But that's kind of the exciting thing about Wikipedia: it's this understanding of the incentives. You don't need money to incentivize people. You don't need Dogecoins. No. Sometimes people get the satisfaction from…
Send Everyday AI and Jordan a text message

Everything in your current AI playbook is about to get shredded, stomped on, and turned into digital confetti. I've spent 2024 living on the bleeding edge of AI development, meticulously tracking AI's development as my full-time job. And what's coming next… yikes. ↳ We're entering an era where AI doesn't just chat – it REMEMBERS. ↳ Where what us humans know becomes kinda worthless. (Or at least worth less.) ↳ Where specialized models hit harder than a triple espresso shot. ↳ Where different AIs team up like some digital Avengers squad. And AGI? It might just slip through the door while everyone's busy debating if it's possible. We're peeling back the silicon curtain on the last and final installment of our 2025 AI Predictions and Roadmap: AI's Technical Leaps: Memory, Models, and Major Changes.

Newsletter: Sign up for our free daily newsletter
More on this Episode: Episode Page
Join the discussion: Ask Jordan questions on AI
Upcoming Episodes: Check out the upcoming Everyday AI Livestream lineup
Website: YourEverydayAI.com
Email The Show: info@youreverydayai.com
Connect with Jordan on LinkedIn

Topics Covered in This Episode:
1. Narrow AI Agents
2. LLM Memory
3. LLMs Becoming Small Language Models
4. Mixture of Models
5. AGI is Achieved

Timestamps:
00:00 Live Insights and Trend Spotting
06:25 "Seeking Feedback for Newsletter"
07:44 AGI: Not Coming Anytime Soon
11:11 AI Memory and Context Windows
15:25 "Microsoft's GPT-4 Mini Revelation"
18:32 Open Source Models' Future Evolution
20:47 Small Models Surpassing Larger Ones
25:37 "AGI Achieved? Debating OpenAI's Claim"
28:35 AGI Achieved: Minimal Immediate Impact

Keywords: AI predictions, AGI, artificial general intelligence, large language models, dumb AI, technical leaps, memory models, everyday AI, AI trends, free daily newsletter, AI experts, podcasts, Microsoft, Google, OpenAI, IBM, agent orchestrators, public companies, AI agents, company reasoning data collection, API prices, AI video tools, AI influencers, AI software, AI regulations, narrow AI agents, LLM memory, context window, OpenAI's memory feature, mixture of models.

Ready for ROI on GenAI? Go to youreverydayai.com/partner
What does the future hold for DeepSeek? In episode 39 of Mixture of Experts, join host Tim Hwang along with experts Abraham Daniels, Kaoutar El Maghraoui and Skyler Speakman to discuss the release of DeepSeek-R1. Next, Mistral hints at an IPO. Then, the experts debrief FrontierMath's particularly difficult new benchmark. Finally, IDC released a report on code assistants; what do we need to know about generalist and specialized coding assistants? Tune in to this week's episode to find out. 00:01 – Intro 01:08 – DeepSeek-R1 14:08 – Mistral indicates IPO 20:54 – FrontierMath controversy 30:04 – IDC code assistants report The opinions expressed in this podcast are solely those of the participants and do not necessarily reflect the views of IBM or any other organization or entity.
What would you do with $2 billion? In episode 38 of Mixture of Experts, join host Tim Hwang along with experts Chris Hay, Kaoutar El Maghraoui and Vyoma Gajjar to discuss the Anthropic valuation rumors. Next, Microsoft CEO Nadella created a new CoreAI group to build and run apps for customers. Then, NotebookLM upgraded some of its features, including podcast intervention. Finally, AI agents are making their way into the financial services industry. Can an agent invest all of your money? Tune in to this week's episode to find out. 00:01 – What would you do with $2 billion? 00:51 – Anthropic valuation 12:14 – Microsoft CoreAI 25:01 – NotebookLM upgrades 35:17 – AI agents in finance The opinions expressed in this podcast are solely those of the participants and do not necessarily reflect the views of IBM or any other organization or entity.
In this episode, we dive deep into the world of AI engineering with Chip Huyen, author of the excellent, newly released book "AI Engineering: Building Applications with Foundation Models". We explore the nuances of AI engineering, distinguishing it from traditional machine learning, discuss how foundational models make it possible for anyone to build AI applications and cover many other topics including the challenges of AI evaluation, the intricacies of the generative AI stack, why prompt engineering is underrated, why the rumors of the death of RAG are greatly exaggerated, and the latest progress in AI agents. Book: https://www.oreilly.com/library/view/ai-engineering/9781098166298/ Chip Huyen Website - https://huyenchip.com LinkedIn - https://www.linkedin.com/in/chiphuyen Twitter/X - https://x.com/chipro FIRSTMARK Website - https://firstmark.com Twitter - https://twitter.com/FirstMarkCap Matt Turck (Managing Director) LinkedIn - https://www.linkedin.com/in/turck/ Twitter - https://twitter.com/mattturck (00:00) Intro (02:45) What is new about AI engineering? (06:11) The product-first approach to building AI applications (07:38) Are AI engineering and ML engineering two separate professions? (11:00) The Generative AI stack (13:00) Why are language models able to scale? (14:45) Auto-regressive vs. masked models (16:46) Supervised vs. unsupervised vs. self-supervised (18:56) Why does model scale matter? (20:40) Mixture of Experts (24:20) Pre-training vs. post-training (28:43) Sampling (32:14) Evaluation as a key to AI adoption (36:03) Entropy (40:05) Evaluating AI systems (43:21) AI as a judge (46:49) Why prompt engineering is underrated (49:38) In-context learning (51:46) Few-shot learning and zero-shot learning (52:57) Defensive prompt engineering (55:29) User prompt vs. system prompt (57:07) Why RAG is here to stay (01:00:31) Defining AI agents (01:04:04) AI agent planning (01:08:32) Training data as a bottleneck to agent planning
In this episode, Sharon Zhou, Co-Founder and CEO of Lamini AI, shares her expertise in the world of AI, focusing on fine-tuning models for improved performance and reliability. Highlights include:
- The integration of determinism and probabilism for handling unstructured data and user queries effectively.
- Proprietary techniques like memory tuning and robust evaluation frameworks to mitigate model inaccuracies and hallucinations.
- Lessons learned from deploying AI applications, including insights from GitHub Copilot's rollout.
Connect with Sharon Zhou and Lamini:
https://www.linkedin.com/in/zhousharon/
https://x.com/realsharonzhou
https://www.lamini.ai/
What's the most exciting CES AI announcement? In episode 37 of Mixture of Experts, host Tim Hwang is joined by Skyler Speakman, Volkmar Uhlig and Shobhit Varshney to debrief CES 2025. Specifically, the experts dive into NVIDIA's Project DIGITS, among other announcements from the AI hardware giant. Next, a new enterprise AI development survey came out detailing how developers really feel about AI implementation. Then, Apple Intelligence experienced some major hallucination fails; what does this tell us about Apple's stake in the AI game? Finally, Sam Altman of OpenAI released a reflection blog; what does he say about the future of AI? All that and more on today's Mixture of Experts. The opinions expressed in this podcast are solely those of the participants and do not necessarily reflect the views of IBM or any other organization or entity.
We're experimenting and would love to hear from you! In this episode of Discover Daily, we explore groundbreaking technological and scientific developments shaping our future. MIT's revolutionary DrivAerNet++ database takes center stage, featuring over 8,000 AI-generated electric vehicle designs with comprehensive aerodynamic data, promising to transform automotive development processes and accelerate EV innovation. The show delves into a major medical breakthrough as lenacapavir, Science magazine's 2024 Breakthrough of the Year, emerges as a game-changing HIV prevention drug. This remarkable innovation from Gilead Sciences offers six months of protection with a single injection, demonstrating 96-100% efficacy in clinical trials and holding promise for global HIV prevention efforts. The episode's main focus spotlights DeepSeek-V3, a cutting-edge open-source AI model boasting 671 billion parameters. Using an innovative Mixture-of-Experts architecture, this powerful language model activates only 37 billion parameters per token, achieving impressive efficiency while maintaining high performance across various text-based tasks. The discussion explores its capabilities, limitations, and potential impact on the AI landscape. From Perplexity's Discover Feed: https://www.perplexity.ai/page/mit-s-ev-design-database-HW3LeM4gQNO2pa1oYp6AMw https://www.perplexity.ai/page/hiv-drug-named-breakthrough-of-kzPk2YAoQPKS.CdzOsNdXA https://www.perplexity.ai/page/deepseek-s-new-open-source-ai-YwAwjp_IQKiAJ2l1qFhN9g Perplexity is the fastest and most powerful way to search the web. Perplexity crawls the web and curates the most relevant and up-to-date sources (from academic papers to Reddit threads) to create the perfect response to any question or topic you're interested in. Take the world's knowledge with you anywhere. Available on iOS and Android. Join our growing Discord community for the latest updates and exclusive content. Follow us on: Instagram, Threads, X (Twitter), YouTube, LinkedIn
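The sparse activation pattern described for DeepSeek-V3 (671B total parameters, ~37B active per token) comes from top-k Mixture-of-Experts routing. Here is a minimal, hypothetical sketch of that idea; the function names and the simple softmax router are illustrative assumptions, not DeepSeek's actual (load-balanced, learned) gating:

```python
import numpy as np

def moe_forward(x, experts, router_W, k=2):
    # Top-k Mixture-of-Experts routing: score every expert, but run only the
    # k highest-scoring ones. This is how a model can hold far more parameters
    # than it activates per token. (Toy sketch; real routers use learned
    # gating with load balancing, not this bare softmax.)
    scores = x @ router_W                        # one routing score per expert
    top = np.argsort(scores)[-k:]                # indices of the k best experts
    weights = np.exp(scores[top] - scores[top].max())
    weights /= weights.sum()                     # softmax over the chosen k only
    return sum(w * experts[i](x) for w, i in zip(weights, top))

rng = np.random.default_rng(0)
d, n_experts = 16, 8
mats = [rng.normal(size=(d, d)) for _ in range(n_experts)]
experts = [lambda v, M=M: v @ M for M in mats]   # each expert = one linear map
router_W = rng.normal(size=(d, n_experts))
y = moe_forward(rng.normal(size=d), experts, router_W, k=2)
assert y.shape == (d,)
```

With 8 experts and k=2, only a quarter of the expert parameters touch any given input, which is the efficiency trade the episode describes.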
Is deep learning hitting a wall? It's 2025 and Mixture of Experts is back and better than ever. In episode 36, host Tim Hwang is joined by Chris Hay, Kate Soule and Kush Varshney to debrief one of the biggest releases of 2024, OpenAI o3. Next, DeepSeek-V3 is here! Finally, will AI exist in 2027? The experts dissect the AI bet between Miles Brundage and Gary Marcus. All that and more on the first Mixture of Experts of 2025. The opinions expressed in this podcast are solely those of the participants and do not necessarily reflect the views of IBM or any other organization or entity. 00:00 — Intro 00:49 — OpenAI o3 14:40 — DeepSeek-V3 28:00 — The Brundage/Marcus bet
Will 2025 be the year of AI agents? In Episode 35 of Mixture of Experts, host Tim Hwang is joined by some show veterans to debrief 2024 in AI. This week, we review AI models, agents, hardware and product releases with some of the top industry experts. What was the best model of 2024? Is NVIDIA king? What are some of the AI trends in 2025? All that and more on this special edition of Mixture of Experts.The opinions expressed in this podcast are solely those of the participants and do not necessarily reflect the views of IBM or any other organization or entity.
Happy holidays! We'll be sharing snippets from Latent Space LIVE! through the break, bringing you the best of 2024! We want to express our deepest appreciation to event sponsors AWS, Daylight Computer, Thoth.ai, StrongCompute, Notable Capital, and most of all, all of our LS supporters who helped fund the gorgeous venue and A/V production! For NeurIPS last year we did our standard conference podcast coverage, interviewing selected papers (as we have now also done for ICLR and ICML); however, we felt that we could be doing more to help AI Engineers 1) get more industry-relevant content, and 2) recap the 2024 year in review from experts. As a result, we organized the first Latent Space LIVE!, our first in-person miniconference, at NeurIPS 2024 in Vancouver. Of perennial interest, particularly at academic conferences, is scaled-up architecture research, as people hunt for the next Attention Is All You Need. We have many names for these: "efficient models", "retentive networks", "subquadratic attention" or "linear attention", but some of them don't even have any lineage with attention. One of the best papers of this NeurIPS was Sepp Hochreiter's xLSTM, which has a particularly poetic significance as one of the creators of the LSTM returning to update and challenge the OG language model architecture. So, for lack of a better term, we decided to call this segment "the State of Post-Transformers" and fortunately everyone rolled with it. We are fortunate to have two powerful friends of the pod to give us an update here: * Together AI: with CEO Vipul Ved Prakash and CTO Ce Zhang joining us to talk about how they are building Together together as a quote-unquote full-stack AI startup, from the lowest-level kernel and systems programming to the highest-level mathematical abstractions driving new model architectures and inference algorithms, with notable industry contributions from RedPajama v2, Flash Attention 3, Mamba 2, Mixture of Agents, BASED, Sequoia, Evo, Dragonfly, Dan Fu's
ThunderKittens, and many more research projects this year. * Recursal AI: with CEO Eugene Cheah, who has helped lead the independent RWKV project while also running Featherless AI. This year, the team has shipped RWKV v5, codenamed Eagle, to 1.5 billion Windows 10 and Windows 11 machines worldwide, to support Microsoft's on-device, energy-usage-sensitive Windows Copilot use cases, and has launched the first updates on RWKV v6, codenamed Finch and GoldFinch. On the morning of Latent Space Live, they also announced QRWKV6, a Qwen 32B model modified with RWKV linear attention layers. We were looking to host a debate between our speakers, but given that both of them were working on post-transformer alternatives…
Full Talk on Youtube
Please like and subscribe!
Links
All the models and papers they picked:
* Earlier Cited Work
* Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention
* Hungry hungry hippos: Towards language modeling with state space models
* Hyena hierarchy: Towards larger convolutional language models
* Mamba: Linear-Time Sequence Modeling with Selective State Spaces
* S4: Efficiently Modeling Long Sequences with Structured State Spaces
* Just Read Twice (Arora et al)
* Recurrent large language models that compete with Transformers in language modeling perplexity are emerging at a rapid rate (e.g., Mamba, RWKV). Excitingly, these architectures use a constant amount of memory during inference. However, due to the limited memory, recurrent LMs cannot recall and use all the information in long contexts, leading to brittle in-context learning (ICL) quality. A key challenge for efficient LMs is selecting what information to store versus discard. In this work, we observe the order in which information is shown to the LM impacts the selection difficulty.
* To formalize this, we show that the hardness of information recall reduces to the hardness of a problem called set disjointness (SD), a quintessential problem in communication complexity that requires a streaming algorithm (e.g., recurrent model) to decide whether inputted sets are disjoint. We empirically and theoretically show that the recurrent memory required to solve SD changes with set order, i.e., whether the smaller set appears first in-context. * Our analysis suggests, to mitigate the reliance on data order, we can put information in the right order in-context or process prompts non-causally. Towards that end, we propose: (1) JRT-Prompt, where context gets repeated multiple times in the prompt, effectively showing the model all data orders. This gives 11.0±1.3 points of improvement, averaged across 16 recurrent LMs and the 6 ICL tasks, with 11.9× higher throughput than FlashAttention-2 for generation prefill (length 32k, batch size 16, NVidia H100). We then propose (2) JRT-RNN, which uses non-causal prefix-linear-attention to process prompts and provides 99% of Transformer quality at 360M params., 30B tokens and 96% at 1.3B params., 50B tokens on average across the tasks, with 19.2× higher throughput for prefill than FA2.* Jamba: A 52B Hybrid Transformer-Mamba Language Model* We present Jamba, a new base large language model based on a novel hybrid Transformer-Mamba mixture-of-experts (MoE) architecture. * Specifically, Jamba interleaves blocks of Transformer and Mamba layers, enjoying the benefits of both model families. MoE is added in some of these layers to increase model capacity while keeping active parameter usage manageable. * This flexible architecture allows resource- and objective-specific configurations. 
In the particular configuration we have implemented, we end up with a powerful model that fits in a single 80GB GPU.* Built at large scale, Jamba provides high throughput and small memory footprint compared to vanilla Transformers, and at the same time state-of-the-art performance on standard language model benchmarks and long-context evaluations. Remarkably, the model presents strong results for up to 256K tokens context length. * We study various architectural decisions, such as how to combine Transformer and Mamba layers, and how to mix experts, and show that some of them are crucial in large scale modeling. We also describe several interesting properties of these architectures which the training and evaluation of Jamba have revealed, and plan to release checkpoints from various ablation runs, to encourage further exploration of this novel architecture. We make the weights of our implementation of Jamba publicly available under a permissive license.* SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformers* We introduce Sana, a text-to-image framework that can efficiently generate images up to 4096×4096 resolution. Sana can synthesize high-resolution, high-quality images with strong text-image alignment at a remarkably fast speed, deployable on laptop GPU. Core designs include: * (1) Deep compression autoencoder: unlike traditional AEs, which compress images only 8×, we trained an AE that can compress images 32×, effectively reducing the number of latent tokens. * (2) Linear DiT: we replace all vanilla attention in DiT with linear attention, which is more efficient at high resolutions without sacrificing quality. * (3) Decoder-only text encoder: we replaced T5 with modern decoder-only small LLM as the text encoder and designed complex human instruction with in-context learning to enhance the image-text alignment. 
* (4) Efficient training and sampling: we propose Flow-DPM-Solver to reduce sampling steps, with efficient caption labeling and selection to accelerate convergence. * As a result, Sana-0.6B is very competitive with modern giant diffusion model (e.g. Flux-12B), being 20 times smaller and 100+ times faster in measured throughput. Moreover, Sana-0.6B can be deployed on a 16GB laptop GPU, taking less than 1 second to generate a 1024×1024 resolution image. Sana enables content creation at low cost. * RWKV: Reinventing RNNs for the Transformer Era* Transformers have revolutionized almost all natural language processing (NLP) tasks but suffer from memory and computational complexity that scales quadratically with sequence length. In contrast, recurrent neural networks (RNNs) exhibit linear scaling in memory and computational requirements but struggle to match the same performance as Transformers due to limitations in parallelization and scalability. * We propose a novel model architecture, Receptance Weighted Key Value (RWKV), that combines the efficient parallelizable training of transformers with the efficient inference of RNNs.* Our approach leverages a linear attention mechanism and allows us to formulate the model as either a Transformer or an RNN, thus parallelizing computations during training and maintains constant computational and memory complexity during inference. * We scale our models as large as 14 billion parameters, by far the largest dense RNN ever trained, and find RWKV performs on par with similarly sized Transformers, suggesting future work can leverage this architecture to create more efficient models. 
This work presents a significant step towards reconciling trade-offs between computational efficiency and model performance in sequence processing tasks.* LoLCATs: On Low-Rank Linearizing of Large Language Models* Recent works show we can linearize large language models (LLMs) -- swapping the quadratic attentions of popular Transformer-based LLMs with subquadratic analogs, such as linear attention -- avoiding the expensive pretraining costs. However, linearizing LLMs often significantly degrades model quality, still requires training over billions of tokens, and remains limited to smaller 1.3B to 7B LLMs. * We thus propose Low-rank Linear Conversion via Attention Transfer (LoLCATs), a simple two-step method that improves LLM linearizing quality with orders of magnitudes less memory and compute. * We base these steps on two findings. * First, we can replace an LLM's softmax attentions with closely-approximating linear attentions, simply by training the linear attentions to match their softmax counterparts with an output MSE loss ("attention transfer").* Then, this enables adjusting for approximation errors and recovering LLM quality simply with low-rank adaptation (LoRA). * LoLCATs significantly improves linearizing quality, training efficiency, and scalability. We significantly reduce the linearizing quality gap and produce state-of-the-art subquadratic LLMs from Llama 3 8B and Mistral 7B v0.1, leading to 20+ points of improvement on 5-shot MMLU. * Furthermore, LoLCATs does so with only 0.2% of past methods' model parameters and 0.4% of their training tokens. * Finally, we apply LoLCATs to create the first linearized 70B and 405B LLMs (50x larger than prior work). * When compared with prior approaches under the same compute budgets, LoLCATs significantly improves linearizing quality, closing the gap between linearized and original Llama 3.1 70B and 405B LLMs by 77.8% and 78.1% on 5-shot MMLU.Timestamps* [00:02:27] Intros* [00:03:16] Why Scale Context Lengths? 
or work on Efficient Models* [00:06:07] The Story of SSMs* [00:09:33] Idea 1: Approximation -> Principled Modeling* [00:12:14] Idea 3: Selection* [00:15:07] Just Read Twice* [00:16:51] Idea 4: Test Time Compute* [00:17:32] Idea 2: Hardware & Kernel Support* [00:19:49] RWKV vs SSMs* [00:24:24] RWKV Arch* [00:26:15] QRWKV6 launch* [00:30:00] What's next* [00:33:21] Hot Takes - does anyone really need long context?
Transcript
[00:00:00] AI Charlie: We're back at Latent Space Live, our first mini conference held at NeurIPS 2024 in Vancouver. This is Charlie, your AI co-host. As a special treat this week, we're recapping the best of 2024, going domain by domain. We sent out a survey to the over 900 of you who told us what you wanted, and then invited the best speakers in the Latent Space Network to cover each field.[00:00:24] AI Charlie: 200 of you joined us in person throughout the day, with over 2,200 watching live online. Thanks! Our next keynote covers the state of transformer-alternative architectures, with a special joint presentation by Dan Fu of Together AI and Eugene Cheah of Recursal AI and Featherless AI. We've featured both Together and Recursal on the pod before, with CEO Vipul Ved Prakash introducing them[00:00:49] AI Charlie: and CTO Ce Zhang joining us to talk about how they are building Together together as a quote-unquote full-stack AI startup, from the lowest-level kernel and systems [00:01:00] programming to the highest-level mathematical abstractions driving new model architectures and inference algorithms, with notable industry contributions from RedPajama v2, Flash Attention 3, Mamba 2, Mixture of Agents,[00:01:15] AI Charlie: BASED, Sequoia, Evo, Dragonfly, Dan Fu's ThunderKittens, and many more research projects this year. As for Recursal and Featherless, we were the first podcast to feature RWKV last year, and this year the team has shipped RWKV v5, codenamed Eagle, to 1.5 billion Windows 10 and Windows 11 machines worldwide to support Microsoft's on-device, energy-usage-sensitive Windows Copilot use cases, and has launched the first updates on RWKV v6, codenamed Finch and GoldFinch.[00:01:53] AI Charlie: On the morning of Latent Space Live, they also announced QRWKV6, a Qwen 32B model [00:02:00] modified with RWKV linear attention layers. Eugene has also written the single most popular guest post on the Latent Space blog this year (yes, we do take guest posts) on what he has discovered about the H100 GPU inference NeoCloud market since the successful launch of Featherless AI this year.[00:02:20] AI Charlie: As always, don't forget to check the show notes for the YouTube link to their talk as well as their slides. Watch out and take care.[00:02:27] Intros[00:02:27] Dan Fu: Yeah, so thanks so much for having us. So this is going to be a little bit of a two-part presentation. My name is Dan. I'm at Together AI, and I'll be joining UCSD as faculty in about a year. And Eugene, you want to introduce yourself?[00:02:46] Eugene Cheah: Eugene, I lead the RWKV team, and I'm CEO of Featherless, and we both work on this new post-transformer architecture space.[00:02:55] Dan Fu: Yeah, so today we're really excited to talk to you a little bit [00:03:00] about that. So first I'm going to give a broad overview of the last few years of progress in post-transformer architectures, and then afterwards Eugene will tell us a little bit about the latest and greatest frontier models in this space.[00:03:16] Why Scale Context Lengths? or work on Efficient Models[00:03:16] Dan Fu: So, the story starts with scaling. So this is probably a figure or something like this that you've seen very recently.
Over the last five to six years, we've seen models really scale up in parameter size, and that's brought with it a bunch of new capabilities, like the ability to talk to you and tell you sometimes how to use your Colab screens.[00:03:35] Dan Fu: But another place where we've seen scaling especially recently is scaling in context length. So this can mean having more text inputs for your models, but it can also mean things like taking a lot of visual token inputs or image inputs to your models, or generating lots of outputs. And one thing that's been really exciting over the last few months or so is that we're seeing scaling not only during training time, but also [00:04:00] during test time.[00:04:00] Dan Fu: So this is the iconic image from the OpenAI o1 release. Not only are we starting to scale train-time compute, but we're also starting to scale test-time compute. Now if you're familiar with our attention and our transformer architectures today, this graph on the right might look a little bit scary.[00:04:19] Dan Fu: And one of the reasons is that the implications are a little bit interesting. So what does it mean if we want to continue having smarter and smarter models? Do we just need to start building bigger and bigger data centers, spending more flops? Is this little DALL·E 3 "we need more flops, guys" image going to be the future of all of AI?[00:04:39] Dan Fu: Or is there a better way, another path forward? Maybe we can get the same capabilities that we've gotten used to, but for a lot less compute, a lot less flops. And one of the things that we're going to talk about today is specifically looking at that core attention operator in some of these models.[00:04:57] Dan Fu: And the reason is that, so these are just some [00:05:00] basic scaling curves, but attention has compute that scales quadratically in the context length.
So that means that if you're doing something like test-time compute and you want to spend a bunch of tokens thinking about what comes next, the longer that goes on, the more tokens you spend, and that compute grows quadratically.[00:05:19] Dan Fu: One of the questions that we're interested in is: can we take that basic sequence model, that basic sequence primitive at the bottom, and get it to scale better? Can we scale in, let's say, n to the 3/2 or n log n? So in the first part of the talk, we just went over the introduction. What I'm going to do over the next few slides is talk about some of the key advances and ideas that have shown up over the past few years, since maybe early 2020 to now, that suggest this might actually be possible: that you can potentially get the same quality that we want while scaling better.[00:05:48] Dan Fu: And basically the story we're going to follow is this: here is a basic graph of just the past couple years of progress in perplexity, [00:06:00] where that blue line, that dotted blue line, is attention.[00:06:07] The Story of SSMs[00:06:07] Dan Fu: It's your basic transformer, full dense attention. And then the dots coming down are some of the methods that you'll see in this presentation today. We're going to turn the clock back all the way to 2020, to this question of: can we make attention subquadratic? Basically, as soon as we said "attention is all you need," people started asking this question.[00:06:28] Dan Fu: So we have this quadratic attention operator. Can we do better? I'll briefly talk about why attention is quadratic. And the basic thing that happens, if you're not familiar, is that you have these inputs, these keys and queries.
And what you do in this attention matrix, this S matrix over here, is that you're comparing every token in your input to every other token.[00:06:49] Dan Fu: So when I try to do something like upload a whole book to Gemini (well, maybe not Gemini, because we don't necessarily know what its architecture is, but let's say we upload it to Llama), what happens [00:07:00] behind the scenes is that it's going to take every single word in that book and compare it to every other word.[00:07:05] Dan Fu: And this has led to some pretty impressive things, but it's kind of a brute-force way of trying to interpret something. And what attention does afterwards (sorry, no laser pointer) is that instead of always operating in this quadratic thing, it takes a row-wise softmax over this matrix, and then multiplies it by the values matrix.[00:07:32] Dan Fu: So, one of the key points to notice is that the output size is always going to be the same as the input's, at least in standard self-attention. So one of the first things that folks tried to do around 2020 is this thing called linear attention, which is just noticing that if we take out this softmax, this non-linearity in the middle of the attention operation, and then compute the keys-and-values operation first, you actually never hit this quadratic bottleneck.[00:07:57] Dan Fu: So that's potentially a way [00:08:00] to get a lot more computationally efficient. And there are various ways to do this, basically using feature maps to try to approximate the overall attention computation. But some of this work started to hit a wall in 2020, and the basic challenges were twofold.[00:08:16] Dan Fu: So one was quality.
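The trick Dan describes, removing the softmax so the matrix products can be reassociated, fits in a few lines of NumPy. This is a toy sketch under my own naming (the feature map `phi` is an illustrative assumption, not any particular paper's choice):

```python
import numpy as np

def softmax_attention(Q, K, V):
    # Standard attention: the (n x n) score matrix S is what makes this
    # quadratic in sequence length n.
    S = Q @ K.T / np.sqrt(Q.shape[-1])
    P = np.exp(S - S.max(axis=-1, keepdims=True))
    P /= P.sum(axis=-1, keepdims=True)
    return P @ V

def linear_attention(Q, K, V, phi=lambda x: np.maximum(x, 0) + 1e-6):
    # Drop the softmax, apply a positive feature map phi, and reassociate:
    # (phi(Q) phi(K)^T) V == phi(Q) (phi(K)^T V). The right-hand side never
    # materializes an n x n matrix, so compute is linear in n.
    Qf, Kf = phi(Q), phi(K)
    KV = Kf.T @ V                   # (d x d): independent of sequence length
    Z = Qf @ Kf.sum(axis=0)         # per-query normalizer replacing softmax
    return (Qf @ KV) / Z[:, None]

n, d = 128, 16
rng = np.random.default_rng(0)
Q, K, V = rng.normal(size=(3, n, d))
out = linear_attention(Q, K, V)
assert out.shape == (n, d)
```

Note that `phi(K).T @ V` is a d-by-d object however long the sequence is, which is exactly the "never hit the quadratic bottleneck" point from the talk.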
Back then, it was kind of hard to get good quality with these linear attention operators. The other one was actually hardware efficiency. So this feature map, shown in simplified form here, actually ends up being quite computationally expensive if you implement it naively.[00:08:34] Dan Fu: So you started having these operators where not only are you not really sure if they have the same quality, but they're also just wall-clock slower. So you kind of end up getting the worst of both worlds. That kind of sets the stage for four years ago.[00:08:49] Dan Fu: Keep this in mind, because linear attention is actually going to come back in a few years once we have a better understanding. But one of the works that started kicking off this [00:09:00] mini revolution in post-transformer architectures was this idea called state space models; the seminal work here is S4 in 2022.[00:09:09] Dan Fu: And this piece of work really brought together a few ideas from some long-running lines of research. The first one, and this is really one of the keys to closing the gap in quality, was just using things that, if you talk to an electrical engineer off the street, they might know like the back of their hand.
So some of those early state space model papers were looking at a relatively simple recurrent update model that comes from maybe chapter one of a signal processing class,[00:09:59] Dan Fu: but then using [00:10:00] some principled theory about how you should do that recurrent update in order to really get the most that you can out of your hidden state, out of your sequence. So that was one key idea for quality, and when this was eventually realized, a bunch of benchmarks that had been pretty sticky for a few years[00:10:20] Dan Fu: (things like Long Range Arena, some long-sequence evaluation benchmarks, and stuff in time series analysis) started to see the quality tick up in meaningful ways. But the other thing that's so influential about these state space models is that they also had a key idea about how you can compute these things efficiently.[00:10:45] Dan Fu: So if you go back to your machine learning 101 class where you learned about RNNs, one thing that you may have learned is that they don't parallelize as well as attention, because if you just run them naively, you have to do a sequential update to process new tokens, [00:11:00] whereas in attention, you can process all the tokens in parallel at one time.[00:11:04] Dan Fu: One of the key insights behind the S4 paper was that you could take these recurrent models and also formulate them as a convolution. And in particular, instead of using a PyTorch conv1d operation, you can compute that convolution with the FFT. That gives you n log n compute in the sequence length n, with an operator that is relatively well optimized for modern hardware.[00:11:28] Dan Fu: So those are really, I'd say, the two key ideas in 2022 that started allowing these breakthroughs to happen in these non-transformer architectures.
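The recurrence-to-convolution equivalence is easy to see in one dimension. Below is a toy scalar version I wrote to illustrate it, not S4 itself (S4 uses structured state matrices, not a scalar state): unrolling h_t = a·h_{t-1} + b·x_t, y_t = c·h_t gives y_t = Σ_k (c·aᵏ·b)·x_{t-k}, a causal convolution whose kernel the FFT can apply in O(n log n):

```python
import numpy as np

def recurrent_scan(x, a, b, c):
    # Naive O(n) sequential scan: h_t = a*h_{t-1} + b*x_t, y_t = c*h_t.
    h, ys = 0.0, []
    for xt in x:
        h = a * h + b * xt
        ys.append(c * h)
    return np.array(ys)

def conv_via_fft(x, a, b, c):
    # The same recurrence unrolled into a causal convolution with kernel
    # K_k = c * a^k * b, applied with the FFT in O(n log n).
    n = len(x)
    kernel = c * (a ** np.arange(n)) * b
    m = 2 * n  # zero-pad so the circular FFT convolution matches linear conv
    y = np.fft.irfft(np.fft.rfft(x, m) * np.fft.rfft(kernel, m), m)
    return y[:n]

x = np.random.default_rng(1).normal(size=256)
assert np.allclose(recurrent_scan(x, 0.9, 0.5, 2.0), conv_via_fft(x, 0.9, 0.5, 2.0))
```

The sequential loop and the FFT path produce the same outputs; the FFT path is the one that parallelizes on modern hardware.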
So: these ideas about how to model the recurrent updates of a sequence in a principled way, and also these key ideas about how to compute them efficiently by turning them into a convolution and scaling that up with the FFT.[00:11:53] Dan Fu: Along those same lines, afterwards we started putting out some work on specialized kernels. Just [00:12:00] like we have FlashAttention for transformers, we also have works like FlashFFTConv, and if you look at these lines of work, whenever you see a new architecture, a new primitive, one of the table stakes now is: do you have an efficient kernel so that you can actually get wall-clock speedup?[00:12:14] Idea 3: Selection[00:12:14] Dan Fu: So by 2022, we were starting to have these models with promising quality primitives and also promising wall clocks. So you could actually see regimes where they were better than transformers in meaningful ways. That being said, there was still sometimes a quality gap, particularly for language modeling.[00:12:33] Dan Fu: And because language is so core to what we do in sequence modeling these days, the next key idea that I'm going to talk about is selection mechanisms. The basic idea is this: you have this recurrent state that you're keeping around that just summarizes everything that came before.[00:12:50] Dan Fu: And to get a good sequence model, one of the things that you really need to be able to do is have the model learn the best way to pick out pieces from that recurrent [00:13:00] state. So one of the major ideas here, in a line of work called H3 (Hungry Hungry Hippos) and also these Hyena models, was that one way you can do this is by just adding some simple element-wise gates.[00:13:13] Dan Fu: So versions of these ideas have been around for decades.
If you squint at the LSTM paper, you can probably find this gating mechanism. But it turns out you can take those old ideas, add them into these new state space models, and then you see quality start to pick up. If you've heard of the Mamba model, it takes selection to the next level by actually making some changes in that fundamental recurrent state space.[00:13:40] Dan Fu: So it's not only this gating that happens around the SSM layer; you can also make the A, B, C, D matrices of your state space model data-dependent, which allows you to even better select out different pieces from your hidden state depending on what you're seeing. I'll also point out, if you look at the [00:14:00] bottom right of this figure, there's this little triangle with GPU SRAM and GPU HBM, and this is just continuing that trend: when you have a new architecture, you also release it with a kernel to show that it can be hardware efficient on modern hardware.[00:14:17] Dan Fu: One of the next cool things that happened is that once we had this understanding of the basic pieces, the basic principles behind some of these sequence models, linear attention actually started to come back.
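The element-wise gating described above can be sketched as a data-dependent recurrence. This is a deliberately simplified toy in the spirit of H3-style gating; the gate parameterization here is my own illustrative assumption, and Mamba goes further by making the full state-space matrices input-dependent, not just a gate:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def selective_scan(x, W_gate, a=0.9):
    # Toy "selection": a gate g_t computed from the input itself decides how
    # much of each token enters the fixed-size recurrent state, so the model
    # can learn what to keep and what to discard. (Illustrative only; Mamba
    # makes the state-space matrices data-dependent, not a scalar-decay + gate.)
    n, d = x.shape
    h = np.zeros(d)
    out = np.empty_like(x)
    for t in range(n):
        g = sigmoid(x[t] @ W_gate)   # element-wise, input-dependent gate
        h = a * h + g * x[t]         # gated recurrent update
        out[t] = h
    return out

rng = np.random.default_rng(2)
x = rng.normal(size=(32, 8))
W = rng.normal(size=(8, 8))
y = selective_scan(x, W)
assert y.shape == x.shape
```

The point of the gate is exactly the "pick out pieces from the recurrent state" idea: without it, every token is blended into the state with the same weight.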
So earlier this year, there was a model called BASED, from Simran Arora and some other folks. The two-second summary is that it combined a more principled version of linear attention, using a Taylor approximation of softmax attention, with a simple sliding window attention, and it was starting to be able to expand the Pareto frontier of how much data you can recall from your sequence versus how small your recurrent state size is.[00:14:58] Dan Fu: So those orange dots [00:15:00] at the top there are showing smaller state sizes that can recall more memory.[00:15:07] Just Read Twice[00:15:07] Dan Fu: And the last major idea that I think has been influential in this line of work, and is relatively late breaking, just a few months ago, is the basic idea that when you have models that are fundamentally more efficient in the sequence length, you maybe don't want to prompt them or use them in exactly the same way.[00:15:26] Dan Fu: So this was a really cool paper called Just Read Twice, also from Simran, that basically said: hey, all these efficient models can process tokens so much more efficiently than transformers that they can sometimes have unfair advantages compared to a simple transformer model.[00:15:44] Dan Fu: Take, for example, the standard use case where you have some long document, you pass it in as input, and then you ask some question about it. One problem you might imagine for a recurrent model with a fixed state size is: let's say your [00:16:00] article is very long, and you're trying to ask about some really niche thing.[00:16:04] Dan Fu: You can imagine it might be hard for the model to know ahead of time what information to put into the hidden state.
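The Taylor-approximation trick mentioned for BASED can be made concrete with a tiny feature map. This is a sketch of the idea, not BASED's actual kernel: a second-order expansion of exp(q·k) factors into an inner product of features, which is what lets the sums over keys be accumulated once and reused, giving linear rather than quadratic cost.

```python
import math

def phi(x):
    """Feature map whose inner product reproduces the 2nd-order Taylor
    expansion of exp(q.k): phi(q).phi(k) = 1 + q.k + (q.k)^2 / 2."""
    feats = [1.0] + list(x)
    for xi in x:
        for xj in x:
            feats.append(xi * xj / math.sqrt(2.0))  # handles the squared term
    return feats

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

q, k = [0.3, -0.2], [0.1, 0.4]
s = dot(q, k)
taylor = 1.0 + s + s * s / 2.0
# The feature-map inner product matches the Taylor expansion exactly,
# and the expansion is close to the true exp for small scores.
assert abs(dot(phi(q), phi(k)) - taylor) < 1e-12
assert abs(taylor - math.exp(s)) < 1e-3
```

Because scores become plain inner products of features, attention can be rewritten so each key/value is touched once, independent of the number of queries.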
But these models are so much more efficient that you can do something really stupid, like: you can just write down the document, write down the question, write down the document again, and then write down the question again. The second time you go over that document, you know exactly what to look for.[00:16:25] Dan Fu: And the cool thing is that this results in better quality, especially on these recall-intensive tasks. But the other interesting thing is that it really takes advantage of the more efficient architectures that we have here. So one of the other influential ideas in this line of work is that if you change the fundamental compute capabilities of your model and the way that it scales, you can actually start to query it at test time differently.[00:16:51] Idea 4: Test Time Compute[00:16:51] Dan Fu: And this, of course, goes back to those slides on test time compute. So while everybody's looking at, say, test time compute for big transformer models, [00:17:00] I think a potentially really interesting research question is: how can you take those ideas, and how do they change with this new next generation of models?[00:17:09] Dan Fu: So I'll just briefly summarize what some of those key ideas were, and then show you briefly what the state of the art is today. The four key ideas are: first, instead of just doing a simple linear attention approximation, take ideas that we know from other fields, like signal processing, and do a more principled approach to your modeling of the sequence.[00:17:32] Idea 2: Hardware & Kernel Support[00:17:32] Dan Fu: Another key idea throughout all these lines of work is that you really want hardware and kernel support from day one.
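The Just Read Twice prompting pattern above is simple enough to write down directly. This is a sketch of the doc/question/doc/question layout; the template wording is illustrative, not taken from the paper.

```python
def just_read_twice_prompt(document, question):
    """Repeat the document after the question, so a fixed-state
    recurrent model knows, on its second pass over the document,
    exactly what information to keep in its hidden state."""
    return (
        f"{document}\n\n"
        f"Question: {question}\n\n"
        f"{document}\n\n"          # second read: now the model knows what to look for
        f"Question: {question}\n"
    )
```

For a transformer this doubles a quadratic cost, but for a linear-cost recurrent model the second read is cheap, which is the "unfair advantage" being described.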
So even if your model is theoretically more efficient, if somebody goes and runs it and it's two times slower, one of the things we've learned is that it's just going to be dead on arrival.[00:17:49] Dan Fu: So you want to be designing your architectures with that in mind. One of the key machine learning ideas that has been important for quality is just making sure that you encode different ways that you can [00:18:00] select from your hidden state, and really focus on that as a key decider of quality. And finally, one of the emerging new things for this line of work, and something that's quite interesting, is: what are the right test-time paradigms for these models?[00:18:15] Dan Fu: How do they change relative to what you might do for a standard transformer? I'll briefly end this section. I've labeled this slide "where we are yesterday," because Eugene is going to talk about some new models that he released literally this morning. But as of yesterday, some of the really cool results out of these efficient alternative models were: AI21 trained this hybrid MoE called Jamba,[00:18:40] Dan Fu: which is currently the state of the art for these non-transformer architectures. NVIDIA and MIT put out this new diffusion model called SANA recently; one of their key observations is that you can take a standard diffusion transformer, replace the layers with linear [00:19:00] attention, and then that lets you scale to much larger images and much larger sequences more efficiently.[00:19:07] Dan Fu: And one thing that I don't think anybody would have called a few years ago is that one of those gated state space models ended up on the cover of Science, because a great group of folks went and trained some DNA models.
So that's Michael Poli and Eric Nguyen from Stanford and the Arc Institute.[00:19:26] Dan Fu: So we're really at an exciting time in 2024, where these non-transformer, post-transformer architectures are showing promise across a wide range of modalities, applications, and tasks. And with that, I'll pass it on to Eugene, who can tell you a little bit about the latest and greatest with RWKV.[00:19:49] RWKV vs SSMs[00:19:49] Eugene Cheah: So, that's useful? Yeah. You're talking to here. Oh, I'm talking to here. Okay. So, yeah, two streams. So, one common question that we tend to get asked is: what's the difference between [00:20:00] RWKV and state space? I think one of the key things to really understand about the difference between the two groups is that we are actually more like an open-source, random-internet-meets-academia kind of situation.[00:20:11] Eugene Cheah: Like, most of us never wrote any paper. We basically looked at RNNs and linear attention when Attention Is All You Need came out, and then we decided: hey, there is a quadratic scaling problem, why don't we try fixing that instead? So we ended up developing our own branch, but we ended up sharing ideas back and forth.[00:20:30] Eugene Cheah: And we do all this actively in Discord, GitHub, etc. This was so bad for a few years that basically the average group's h-index was so close to zero that EleutherAI actually came in and helped us write our first paper. Great, now our h-index is three, apparently.
But the thing is, a lot of these experiments led to results, and essentially we took the same ideas from linear attention, [00:21:00] and we built on them.[00:21:01] Eugene Cheah: So, to take a step back into how RWKV handles its own attention mechanic and achieves the same goal of, like, O(N) compute. And in focus of our overall goal to make AI accessible to everyone, regardless of language, nation, or compute: that's our goal. We actually train our models primarily on over a hundred languages, which is another topic altogether.[00:21:23] Eugene Cheah: And our goal is to train on even 200 languages, to cover all languages in the world. But at the same time, we work on this architecture to lower the compute cost, so that people can run it on Raspberry Pis and on anything. So, how did RWKV break the dependency of the LSTM token flow? Because I think it's probably easier to understand the architecture from the RNN lens,[00:21:46] Eugene Cheah: because that's where we built on. State space kind of tried to start anew and took lessons from that, so there's a little bit of divergence there. And this is, AKA, our version of linear attention. So to take a step back: [00:22:00] all foundation models, be they transformers or non-transformers, at a very high level[00:22:05] Eugene Cheah: pump in tokens, I mean text, turn them into embeddings, and go through a lot of layers that generate a lot of states, be it the QKV cache, or Mamba states, or RWKV states, and output an embedding. Those states are not the same thing, but we all just take more layers and more embeddings.
And somehow that magically works.[00:22:23] Eugene Cheah: So, if you remember your ancient RNN lessons, which we call deep learning these days, the general idea is that you have the embedding information flowing all the way up, and you take that information and flow it back down, and then you process it as part of your LSTM layers.[00:22:41] Eugene Cheah: So, this is how it generally works. Karpathy is quoted saying that RNNs are actually unreasonably effective. The problem is this is not scalable. To start doing work on the second token, you need to wait for the first token. And likewise for the third token and fourth token, yada yada.[00:22:55] Eugene Cheah: That is CPU land, not GPU land. So you [00:23:00] can have an H100 and you can't even use 1 percent of it. That's kind of why RNNs didn't really take off in the direction that we wanted, like billions of parameters, when it comes to training. So, what did RWKV version 0 do? Boom. We just did the dumbest, lamest thing.[00:23:13] Eugene Cheah: This line is the bottleneck for RNNs, and we did the dumb thing of removing it. And it kind of worked. It trained. It sucked, but it kind of worked. Then no one cared because the loss was crap, but we asked: how do we improve that? And that's essentially where we moved forward, because if you see this kind of flow, you can actually get your GPU saturated quickly, where it essentially cascades respectively.[00:23:41] Eugene Cheah: So I'm just waiting for this to loop again. Once your first layer for a token finishes computing, you start to cascade your compute all the way up until: hey, I'm using 100 percent of the GPU.
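The bottleneck Eugene is pointing at can be shown in two toy functions. In the classic recurrence, token t cannot start until token t-1's state exists; remove that one line and every position in the layer depends only on its own input, so the whole layer can run in parallel, with cross-token mixing recovered by cascading across layers. Both functions are illustrative scalar sketches, not RWKV's actual equations.

```python
def rnn_serial(xs, a=0.5):
    """Classic RNN layer: the h = a*h + x line forces serial execution,
    because token t needs token t-1's state first."""
    h, out = 0.0, []
    for x in xs:
        h = a * h + x   # the dependency that keeps us in "CPU land"
        out.append(h)
    return out

def rnn_dependency_removed(xs, a=0.5):
    """The 'version 0' move: drop the h_{t-1} term. Every position is
    now independent within the layer, so tokens can be computed in
    parallel and compute cascades layer by layer instead."""
    return [a * 0.0 + x for x in xs]   # embarrassingly parallel per token
```

As the talk says, the dependency-free version trains but loses quality; the later RWKV versions reintroduce cross-token information flow in forms that keep the cascade.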
So we worked on it, and we went along the principle that as long as we keep this general architecture [00:24:00] where we can cascade and be highly efficient, nothing is sacred in our architecture.[00:24:06] Eugene Cheah: And we have done some crazy ideas. In fact, if you ask me to explain some things in the paper, officially in the paper I'll say we had this idea and we wrote it this way. The reality is someone came with the code, we tested it, it worked, and then we rationalized later. So, the general[00:24:24] RWKV Arch[00:24:24] Eugene Cheah: idea behind RWKV is that we have two major blocks, which we call time mix and channel mix.[00:24:30] Eugene Cheah: Time mix generally handles long-term memory states, where essentially we apply matrix multiplications and SiLU activation functions to process an input embedding into an output embedding. I'm oversimplifying it, because this calculation has changed in every version, and we have, like, version 7 right now.[00:24:50] Eugene Cheah: Channel mix is similar to BASED in the sense that it does shorter-term attention, where it just looks at the sister token, the token before it, because [00:25:00] there's a shift in the token shift matrix. I don't really want to go too much into the papers themselves, because we do have three papers on this.[00:25:09] Eugene Cheah: Basically: "RWKV: Reinventing RNNs for the Transformer Era"; "Eagle and Finch," the matrix-valued state update, which is versions 5 and 6; and "GoldFinch," which is our hybrid model. We are already writing the paper for version 7, RWKV-7, named Goose; our architectures are named after birds.[00:25:30] Eugene Cheah: And I'm going to cover QRWKV as well, and where did that lead to? Great!
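The token shift behind channel mix, mentioned above, is essentially a learned interpolation between each token's embedding and the previous token's embedding. Here it is as a scalar sketch; real RWKV learns a separate mix per channel, and this single scalar mix is a simplification.

```python
def token_shift(embeddings, mix=0.5):
    """RWKV-style token shift: each position sees a blend of its own
    embedding and the token before it (the 'sister token'). The first
    position blends with a zero state."""
    prev, shifted = 0.0, []
    for e in embeddings:
        shifted.append(mix * e + (1.0 - mix) * prev)  # look one token back
        prev = e
    return shifted
```

This one-token lookback is what gives channel mix its short-range attention flavor without any recurrence over long distances.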
Because we are all GPU poor. And to be clear, most of this research is done on only a handful of H100s, which one Google researcher told me was, like, his experiment budget for a single researcher.[00:25:48] Eugene Cheah: So our entire organization has less compute than a single researcher in Google. So one of the things that we explored was: how do we convert transformer models instead? Because [00:26:00] someone already paid that billion dollars, a million dollars, on training, so why don't we take advantage of those weights?[00:26:05] Eugene Cheah: And I believe Together AI worked on LoLCATs for the Llama side of things, and we took some ideas from there as well, and we essentially did that for RWKV.[00:26:15] QRWKV6 launch[00:26:15] Eugene Cheah: And that led to QRWKV6, which we just dropped today: a 32B instruct preview model, where we took the Qwen 32B Instruct model, froze the feedforward layers, removed the QKV attention layers, and replaced them with RWKV linear layers.[00:26:32] Eugene Cheah: So to be clear, this means we do not have the RWKV channel mix layer; we only have the time mix layer. But once we do that, we train the RWKV layers. It's important that the feedforward layers are frozen, so the new attention can be learned. And then we unfreeze the feedforward layers and train all the layers together with a custom learning rate schedule, so that they can learn how to work together.[00:26:54] Eugene Cheah: The end result, surprisingly (and, to be honest, to the frustration of the RWKV [00:27:00] MoE team, which ended up releasing their model on the same day) was that, with just a few hours of training on two nodes, we managed to get it to be on par, kind of, with the original Qwen 32B model.
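The two-stage conversion recipe just described can be written down as pseudocode over a dictionary of parameter groups. The layer names, stage labels, and dict shape here are invented for illustration; this is a paraphrase of the talk's recipe, not an official training script.

```python
def conversion_schedule(model):
    """Two-stage transformer-to-RWKV conversion sketch over
    {layer_name: {"trainable": bool}}.
    Stage 1: freeze feedforward layers, train only the new RWKV
    time-mix layers that replaced QKV attention.
    Stage 2: unfreeze everything and fine-tune jointly (with a custom
    learning-rate schedule in practice) so old and new layers adapt."""
    for name, p in model.items():
        p["trainable"] = name.startswith("rwkv_time_mix")
    yield "stage1_train_new_attention"
    for p in model.values():
        p["trainable"] = True
    yield "stage2_joint_finetune"
```

Freezing the feedforward layers first matters because they carry the pretrained knowledge the new attention must learn to serve; unfreezing too early would let them drift before the replacement layers work.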
In fact, the first run completely confused us. I was telling Daniel Goldstein (Smerky), who kind of leads most of our research coordination: when you pitched me this idea, you told me at best you'd get the same level of performance.[00:27:26] Eugene Cheah: You didn't tell me the ARC Challenge score and the Winogrande score would shoot up. I don't know what's happening there. But it did. The MMLU score dropping, that was expected, because if you think about it, when we were training all the layers, we essentially Frankensteined this thing, and we did brain damage to the feedforward network layers with the new RWKV layers.[00:27:47] Eugene Cheah: But 76 percent: hey, somehow it's retained, and we can probably train this further. We didn't even spend more than three days training this, so there's a lot more that can be done, hence the preview. This brings up [00:28:00] a big question, because we are already now in the process of converting a 70B. This is actually an extremely compute-efficient way to test our attention mechanic.[00:28:10] Eugene Cheah: It becomes a shortcut. We are already planning to do our version 7 and our hybrid architecture for it, because we don't need to train from scratch, and we get a really good model out of it. And the other thing that is uncomfortable to say, because we are doing this right now on the 70B, is that if this scales correctly to 128k context length (I'm not even talking about a million), the majority of enterprise workload today is just on 70B at under 32k context length.[00:28:41] Eugene Cheah: That means if this works and the benchmarks match, we can replace the vast majority of current AI workloads, unless you want super long context. And then, sorry, can someone give us more GPUs? Because we do need the VRAM for super long context, sadly.
So yeah, that's what we are working on, and essentially [00:29:00] we are excited to just push this further.[00:29:02] Eugene Cheah: And this conversion process, to be clear, I don't think is going to be exclusive to RWKV. It probably will work for Mamba as well; I don't see why not. And we will probably see more ideas, or more experiments, or more hybrids. One of the weirdest things that I want to say outright, and I confirmed this with the Black Mamba team and the Jamba team, because we did the GoldFinch hybrid model, is that none of us understand why a hard hybrid of a state space model and a transformer performs better than the baseline of both.[00:29:28] Eugene Cheah: It's like, when you train one and then you replace parts, you expect the same results. That's our pitch. That's our claim. But somehow when we jam both together, it outperforms both. And that's one area of exploration where, like, we only have four experiments across four teams, and a lot more needs to be done.[00:29:51] Eugene Cheah: But these are things that excite me, essentially, because that is where we can potentially move ahead. Which brings us to what comes next.[00:30:00] What's next[00:30:00] [00:30:00][00:30:00] Dan Fu: So, this part is where we'll talk a little bit about stuff that we're excited about, and maybe have some wild speculation on what's coming next.[00:30:12] Dan Fu: And, of course, this is also the part that will be more open to questions. So, a couple of things that I'm excited about: continued hardware-model co-design for these models. One of the things that we've put out recently is this library called ThunderKittens.
It's a CUDA library.[00:30:29] Dan Fu: And one of the things that we found frustrating is that every time we built one of these new architectures (and I'm sure you had the exact same experience), we'd have to go and spend two months in CUDA land writing these new efficient kernels. And if we decided to change one thing in PyTorch, one line of PyTorch code is like a week of CUDA code, at least.[00:30:47] Dan Fu: So with a library like ThunderKittens, one of our goals was to break down the key principles: what are the key hardware features, what are the key compute pieces that you get from the hardware. For example, on an [00:31:00] H100, everything really revolves around a warp-group matrix multiply operation.[00:31:06] Dan Fu: So you really want your operation to be able to split into relatively small matrix-matrix multiply operations, like multiplying two 64 by 64 matrices, for example. And if you know that ahead of time when you're designing your model, that probably gives you some information about how you set the state sizes and how you set the update function.[00:31:27] Dan Fu: So with ThunderKittens, we basically built a whole library around this basic idea that your basic compute primitive should not be a float: it should be a matrix, and everything should just be matrix compute. And we've been using that to try to both re-implement some existing architectures and also start to design[00:31:44] Dan Fu: some new ones that are really designed with this tensor core primitive in mind. Another thing that at least I'm excited about is that, over the last four or five years, we've really been looking at language models as the next thing. But if you've been paying [00:32:00] attention to Twitter, there's been a bunch of new next-generation models coming out.[00:32:04] Dan Fu:
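The "everything is a small matrix multiply" idea above can be illustrated with a toy tiled matmul: a larger product decomposed into tile-by-tile block multiplies, a pure-Python stand-in for building kernels out of the hardware's 64x64 warp-group MMA primitive (tiny tiles here, for runnability; this is not ThunderKittens code).

```python
def tiled_matmul(A, B, tile=2):
    """Compute C = A @ B by decomposing into tile x tile block
    multiply-accumulates. Assumes square matrices whose size is a
    multiple of `tile`."""
    n = len(A)
    C = [[0.0] * n for _ in range(n)]
    for i0 in range(0, n, tile):
        for j0 in range(0, n, tile):
            for k0 in range(0, n, tile):
                # one small "tensor-core sized" block multiply-accumulate
                for i in range(i0, i0 + tile):
                    for j in range(j0, j0 + tile):
                        C[i][j] += sum(A[i][k] * B[k][j]
                                       for k in range(k0, k0 + tile))
    return C
```

The design consequence Dan draws is the interesting part: if your architecture's state update cannot be phrased as such block multiplies, it will not saturate the hardware, so you choose state sizes and update rules with the tile shape in mind.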
So there are video generation models that can run in real time, that are driven by your mouse and your keyboard, and that, I'm told if you play with them, only have a few seconds of memory. Can we take such a model and give it a very long context length, so that you could actually maybe generate an entire game state at a time?[00:32:25] Dan Fu: What does that look like for the model? You're certainly not going to do a giant quadratic attention computation to try to run that. Or take some of these new video generation models that came out. So Sora came out, I don't know, two days ago now, but with super long queue times and super long generation times.[00:32:43] Dan Fu: That's probably a quadratic attention operation at the bottom of it. What if we could remove that and get the same quality, but a lot faster generation time? Or some of the demos that we saw from Paige earlier today: you know, if I have a super long conversation with my [00:33:00] Gemini bot, what if I wanted it to remember everything that it's seen in the last week?[00:33:06] Dan Fu: I mean, maybe you don't, for personal reasons, but what if I did, you know? What does that mean for the architecture? And I think that's certainly something I'm pretty excited about. I'm sure you're excited about it too. So, I think we were supposed to have some hot takes, but I honestly don't remember what our hot takes were.[00:33:21] Hot Takes - does anyone really need long context?[00:33:21] Eugene Cheah: Yeah, including the next slide. Hot takes, yes, these are our[00:33:25] Dan Fu: hot takes.[00:33:25] Eugene Cheah: I think the big one on Twitter that we saw, that we shared, was the question: is RAG still relevant in the future of state-based models?[00:33:38] Dan Fu: Let's see, I haven't played too much with RAG. But when I have,
I'll say I found it a little bit challenging to do research on, because we had this experience over and over again where you could have an embedding model of any quality, a really, really bad embedding model or a really, really [00:34:00] good one, by any measure of good,[00:34:03] Dan Fu: and for the final RAG application it kind of didn't matter. That's what I'll say about RAG while I'm being recorded. I know it doesn't actually answer the question, but[00:34:13] Eugene Cheah: Yeah, so I think a lot of folks are extremely excited about the idea of RWKV or state space models potentially having infinite context.[00:34:21] Eugene Cheah: But I think the reality is that when we say infinite context, we just mean a different kind of infinite context, or, as was covered previously, you need to test the model differently. So, think of it more along the lines of a human. Like, I don't remember what I ate for breakfast yesterday.[00:34:37] Eugene Cheah: Yeah, that's the statement that I'll make. And we humans are not quadratic transformers. If we were, if, let's say, we increased our brain size for every second we live, we would have exploded by the time we were 5 years old or something like that. And I think, fundamentally for us, regardless of whether it's RWKV, state space, xLSTM, [00:35:00] etc., the general idea is: instead of that expanding state and that increase in computational cost, what if we have a fixed state size?[00:35:08] Eugene Cheah: And information theory dictates that that fixed state size will have a limit. Just how big that limit is, is the question. Like, RWKV is running at 40 megabytes for its state. A future version might run at 400 megabytes.
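The "millions of tokens" capacity claim can be sanity-checked with back-of-envelope arithmetic. The bits-per-token figure below is a rough assumption (text carries on the order of tens of bits of information per token); this is an information-theoretic ceiling, not what any trained model actually achieves, which is exactly Eugene's point about inefficiency.

```python
def max_tokens_in_state(state_bytes, bits_per_token=16.0):
    """Ceiling on how many tokens a fixed recurrent state could hold
    if it were packed perfectly: total bits / bits carried per token."""
    return (state_bytes * 8) / bits_per_token

# 40 MB state (the RWKV figure quoted above) -> on the order of 20M tokens
tokens_40mb = max_tokens_in_state(40 * 1024 * 1024)
```

With these assumptions a 40 MB state could in principle encode roughly 20 million tokens' worth of information, consistent with "millions of tokens, mathematically," while models today recover far less.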
That is, like, millions of tokens, if you're talking about the mathematical maximum possibility.[00:35:29] Eugene Cheah: It's just that I guess we are all more inefficient about it, so maybe we hit 100,000. And that's kind of the work we are doing, trying to push it and maximize it. And that's where the models will start differing, because each will choose to forget things and choose to remember things differently. And that's why I think there might be some element of RAG, but it may not be the same RAG.[00:35:49] Eugene Cheah: It may be that the model learns things and goes: hmm, I can't remember that article, let me do a database search. Just like us humans: when we can't remember an article in the company, we do a search on Notion.[00:36:00][00:36:00] Dan Fu: I think something that would be really interesting is if you could have facts that are... So right now, one intuition about language models is that all those parameters are there just to store random facts about the world.[00:36:14] Dan Fu: And this intuition comes from the observation that if you take a really small language model, it can do things like talk to you, and it kind of has the style of conversation; it can learn that. But where it will usually fall over, compared to a much larger one, is that it'll just be a lot less factual about things that it knows or can do.[00:36:32] Dan Fu: But that points to all those weights that we're spending, all that SGD that we're spending to train these models, just being used to store facts. And we have things like databases that are pretty good at storing facts.
So I think one thing that would be really interesting is if we could actually have some sort of outside data store that a language model can look at, maybe one that has some sort of gradient descent in it; that would be quite interesting.[00:36:58] Dan Fu: And then maybe you could edit it, delete [00:37:00] facts, you know, change who's president, so that it doesn't get lost.[00:37:04] Vibhu: Can we open up Q&A and hot takes for the audience? I have a hot take Q&A. Do these scale? When a 405B state space model exists, RAG exists, and no one does long context, who's throwing in 2 million token questions? Hot takes?[00:37:24] Dan Fu: The "who's throwing in 2 million token questions" question, I think, is a really good one. So actually, I was going to offer that as a hot take. My hot take was going to be that long context doesn't matter. I know I just gave a whole talk about it, but, you know, what's the point of doing research if you can't play both sides?[00:37:40] Dan Fu: But I think, for both of us, the reason we first got into this was just the first-principles question: there's this quadratic thing; clearly intelligence doesn't need to be quadratic; what is going on; can we understand it better? Since then it's kind of turned into a race, which has [00:38:00] been exciting to watch: how much context can you take in?[00:38:03] Dan Fu: But I think it's right. Nobody is actually putting a two million token prompt into these models. And if they are, maybe we can go design a better model to do that particular thing. Yeah, what do you think about that? You've also been working on this. Do you think long context matters?[00:38:19] Eugene Cheah: So I'm going to burn a bit. How many of you remember the news of Google Gemini supporting 3 million tokens of context?
Raise your hand.[00:38:28] Vibhu: Yeah, 2 million.[00:38:29] Eugene Cheah: Oh, it's 2 million.[00:38:31] Eugene Cheah: Yeah, how many of you actually tried that? See?[00:38:34] Vibhu: I use it a lot. You? You work for MindsTV. I use it a lot.[00:38:41] Eugene Cheah: So, some people have used it, and I think that might be where my opinion starts to differ, because I think the big labs may have a bigger role in this. Even for RWKV, even when we train long context, the reason I say VRAM is a problem is that when we backprop [00:39:00] against the states, we actually need to maintain the state in between the tokens for the full token length.[00:39:05] Eugene Cheah: So that means we need to actually roll out the whole 1 million context if we are actually training at 1 million. Which is the same for transformers, actually; it just means we don't magically shrink the VRAM consumption at training time. So that is one of the VRAM bottlenecks, and I'm neither OpenAI nor Google, so donate GPUs if you have too many of them.[00:39:27] Eugene Cheah: But then, putting it into another paradigm: I think o1-style reasoning might actually be pushing in that direction. In my opinion, and this is my partial hot take: let's say you have a super big model, and let's say you have a 70B model that may take double the tokens but gets the same result.[00:39:51] Eugene Cheah: Strictly speaking, the 70B, and this goes for transformer or non-transformer, will take less resources than that 400B [00:40:00] model, even if it did double the amount of thinking.
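The VRAM point about backprop through states can be made concrete with a quick estimate: training with backpropagation through time means keeping the recurrent state at every token position, so saved-state memory grows linearly with the trained context length. The numbers below are illustrative assumptions, not measurements (and real training uses tricks like activation checkpointing to trade compute for memory).

```python
def saved_state_memory_gb(tokens, state_bytes_per_token):
    """Rough memory needed to store one recurrent state per token
    position for backprop through time, in GiB. Grows linearly
    with trained context length."""
    return tokens * state_bytes_per_token / (1024 ** 3)

# e.g. a hypothetical 1 MiB per-position state over a 128k-token
# training context already needs ~128 GiB just for saved states
example_gb = saved_state_memory_gb(128 * 1024, 1024 ** 2)
```

This is why inference-time memory (one fixed state) and training-time memory (a state per position) diverge so sharply for these models.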
And if that's the case, and we are still all trying to figure this out, maybe the direction for us is really getting the sub-200B models to be as fast and efficient as possible,[00:40:11] Eugene Cheah: with a very efficient architecture that some folks happen to be working on, to just reason it out over larger and larger context.[00:40:20] Question: Yeah. One thing I'm super interested in is models that can watch forever. Obviously you cannot train something on infinite context length. How are y'all thinking about that, where you run on a much longer context length than is possible to train on?[00:40:38] Dan Fu: Yeah, it's a great question. I think you guys probably had tweets along these lines, too. When we first started doing these things, because these are all recurrent models, in theory you could just run them forever. And at the very least, it won't, like, error out on you or crash.[00:40:57] Dan Fu: There's another question of whether it can actually [00:41:00] use what it's seen in that infinite context. And I think one place where the architecture research probably ran faster than another line of research is the benchmarks for long context. So you turn it on forever; you want to do everything or watch everything.[00:41:16] Dan Fu: What is it that you actually want it to do? Can we actually build some benchmarks for that? Then measure what's happening, and then ask the question: can the models do it? Is there something else that they need? Yeah, I think if I were to turn back the clock to 2022, that's probably one of the things I would have done differently: actually get some long context benchmarks out at the same time as we started pushing context length on all these models.[00:41:41] Eugene Cheah: I will also say: the use case. I think we both agree that there's no infinite memory, and the model needs to be able to learn and decide.
I think what we have observed, and I think this also fits the state space models, is that one of the key advantages of this alternate attention mechanic that is not based on token position is that the model doesn't suddenly become crazy when you go past the [00:42:00] 8k training context length, or a million context length.[00:42:03] Eugene Cheah: It's actually still stable. It's still able to run, still able to rationalize. It just starts forgetting things. But some of those things are still there in latent memory; some of those things are still somewhat there. That's the whole point of why reading twice works, things like that. And one of the biggest pushes in this direction is that I think both state space and RWKV have separate papers, by other researchers, where they use these architectures for time series data,[00:42:26] Eugene Cheah: like weather modeling. So you are not asking what the weather was five days ago; you're asking what the weather will be tomorrow, based on effectively infinite length, as long as this Earth and the computer keep running. And they found that it is better than existing transformer or other existing architectures at modeling this weather data,[00:42:47] Eugene Cheah: controlled for the param size and stuff. I'm quite sure there are people with larger models. So there are future applications, if your question is just what's next and not what was 10 years ago.[00:42:59] Dan Fu: Thanks so [00:43:00] much for having us. Get full access to Latent Space at www.latent.space/subscribe
Bio- Jan Leitschuh was bitten by the AT bug in 2002. With no real backpacking experience, she threw herself into learning, training, and stomping down the fears and questions that swirled around her preparations. She joined the infamous Pack 31, a group of hikers who met online and named themselves after the date they started, March 1, 2003. This community, built on meetings at the ALDHA Gathering and a thousand online hours, remains friends to this day, and Lite Shoe, along with many of those original Pack 31 folks, can often be found at the Gathering, sharing their stories and knowledge with a new class of hikers. Guest Links- The Ordinary Adventurer- The Ordinary Adventurer: Hiking Vermont's Long Trail: A Primer for Baby Adventurers and Other Musings on the Nature of the Journey Connect with Anna, aka Mud Butt, at info@traildames.com You can find the Trail Dames at: Our website: Trail Dames The Summit: The Summit 2022 - Presented by the Trail Dames The Trail Dames Foundation: Trail Dames Charitable Foundation | Home Instagram: Instagram (@traildames) Facebook: Trail Dames | Facebook Hiking Radio Network: Hiking Radio Network Hiking Radio Network on Instagram: Instagram (@hikingradionetwork) Music provided for this Podcast by The Burns Sisters "Dance Upon This Earth" https://www.theburnssisters.com

Finally, as promised, the coveted recipe for Homemade Hiker's Halvah (Halvah a la Alli!):

3/4 cup sesame tahini
1/2 cup raw honey
1/4 - 1/2 tsp. vanilla extract
1/4 - 1/2 tsp. pure chocolate extract (could use vanilla instead)
1/4 cup powdered milk (or 10 Tb. powdered milk if not using whey and egg white powders)
3 Tb. powdered whey (milk protein)
3 Tb. egg white powder

Note: amounts are approximate; what you want at the end is a reasonably dry, solid mass.

Instructions: Mix together sesame tahini, honey, and extracts until smooth. Gradually add powdered milk, whey, and egg white powder. The mixture will become thick; mix well.
You want to get it to the point where it can't be stirred and you have to mix/knead it with your hands; otherwise, the halvah will be too soft and gooey. Press the finished mixture into a container or make a flat disk (1/2" high) on a plate. Cover with plastic wrap and chill. Then cut into whatever size you like. Pack in ziplock baggies for the trail and you have a high-calorie, high-protein, high-carb, high-(good)fat power snack! Yum!

P.S. from Alli: "The following are the brands I use. They're expensive, but make a superior halvah:

Joyva Sesame Tahini (a must! available in cans; cheaper in bulk)
Dawes Hill Raw Tupelo Honey (Tupelo honey apparently has a different chemical structure than other honeys and has a less dramatic effect on blood sugar; apparently it can be used, in moderation, by those who are diabetic or hypoglycemic. I have a friend who's severely hypoglycemic and it's the only honey he can tolerate)
Nielsen-Massey Madagascar Bourbon Pure Vanilla Extract (a kick-ass vanilla extract!)
Flavorganics organic Pure Chocolate Extract (amazing!)
I get the whey and egg white powder in bulk at my local food coop; they are also available packaged."

Jan Lite Shoe, AT Class of '03
This Week in Machine Learning & Artificial Intelligence (AI) Podcast
Today we're joined by Sunil Mallya, CTO and co-founder of Flip AI. We discuss Flip's incident debugging system for DevOps, which was built using a custom mixture of experts (MoE) large language model (LLM) trained on a novel "CoMELT" observability dataset which combines traditional MELT data—metrics, events, logs, and traces—with code to efficiently identify root failure causes in complex software systems. We discuss the challenges of integrating time-series data with LLMs and their multi-decoder architecture designed for this purpose. Sunil describes their system's agent-based design, focusing on clear roles and boundaries to ensure reliability. We examine their "chaos gym," a reinforcement learning environment used for testing and improving the system's robustness. Finally, we discuss the practical considerations of deploying such a system at scale in diverse environments and much more. The complete show notes for this episode can be found at https://twimlai.com/go/708.
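The mixture-of-experts idea behind such a model can be illustrated with a small sketch. This is a generic, assumed illustration of top-k MoE routing, not Flip AI's actual architecture; all names, shapes, and the toy experts are hypothetical.

```python
# Hedged sketch of top-k mixture-of-experts routing (illustrative only): a
# gating network scores every expert for a given input, and only the k
# highest-scoring experts run, with their outputs blended by the
# renormalized gate probabilities.
import math

def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def moe_forward(x, experts, gate_weights, k=2):
    """Route input x to the top-k experts and blend their outputs."""
    # Gate: one linear score per expert, turned into a distribution.
    scores = [sum(w * xi for w, xi in zip(row, x)) for row in gate_weights]
    probs = softmax(scores)
    # Keep only the k most probable experts; the rest never execute,
    # which is what makes MoE inference cheaper than a dense model.
    top = sorted(range(len(experts)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return sum(probs[i] / norm * experts[i](x) for i in top)

# Toy "experts": each just scales the first input feature differently.
experts = [lambda x, s=s: s * x[0] for s in (1.0, 10.0, 100.0)]
gate = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
out = moe_forward([1.0, 0.0], experts, gate, k=2)
print(out)
```

The blended output always lands between the selected experts' individual outputs, since the renormalized gate weights form a convex combination.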