Podcasts about Replicate

  • 605 PODCASTS
  • 887 EPISODES
  • 34m AVG DURATION
  • 5 WEEKLY NEW EPISODES
  • Apr 1, 2025 LATEST

POPULARITY

[Popularity chart, 2017–2024]


Best podcasts about Replicate

Latest podcast episodes about Replicate

The 360 Experience
Ep. 64 | Replicate the Process that Took Me from 3 Loans a Month to 500+ a Year

Apr 1, 2025 · 59:08


Every loan you fund is either making you unforgettable—or forgettable. If you're closing deals but not creating repeat business, not getting consistent referrals, and not building raving fans out of your clients and partners, your loan process is broken. And you're leaving hundreds of thousands (if not millions) on the table. In this episode of The 360 Experience Podcast, Tim Braheem is flying solo to break down the exact system that helped him go from funding 3 loans a month to over 500 a year. His Perfect Loan Process is a 73-step machine designed to create an experience so addictive, so dialed-in, that clients and referral partners can't help but come back—and bring others with them.

What you'll gain from this episode:
1. Turn every loan into new leads: discover how to create referral momentum by delivering an unforgettable experience that turns every client, agent, CPA, and advisor into your personal sales force.
2. Systematize your excellence: learn how to build a repeatable, scalable process that eliminates reactivity, increases efficiency, and ensures your team delivers "wow" moments every time.
3. Create addicted referral partners: hear real-life examples and scripts you can use to proactively turn listing agents, CPAs, and financial advisors into new referral relationships—on every deal.

Ready to turn every loan into your next 3 deals? Download the Perfect Loan Process for free and start building a system that creates repeat business and nonstop referrals: https://go.theloanatlas.com/perfect-loan-process

ABOUT TIM BRAHEEM
With more than 25 years of experience as a highly successful mortgage professional, industry leader, educator, and life coach, Tim Braheem is committed to engaging with people on a deep level and helping them uncover the barriers they have placed in the way of having the level of success they deserve in both their business and personal lives.

FOLLOW TIM BRAHEEM
Instagram ► https://www.instagram.com/tbraheem/
LinkedIn ► https://www.linkedin.com/in/timbraheem

THE LOAN ATLAS
Join ► https://go.theloanatlas.com/membership
Instagram ► https://www.instagram.com/theloanatlas/
YouTube ► https://www.youtube.com/@LoanAtlas

Eelke Kleijn | DAYS like NIGHTS Radio
DAYS like NIGHTS 385 - Mexico City, Part 1

Mar 31, 2025 · 60:00


Facebook: https://www.facebook.com/dayslikenights
Instagram: https://www.instagram.com/dayslikenights
Web: https://www.dayslikenights.com
Podcast RSS feed: https://feeds.soundcloud.com/users/soundcloud:users:1525250/sounds.rss

01. ID
02. Cocho - Coast Fantasy [DAYS like NIGHTS]
03. Rob Hes, Paralel - Figures [DAYS like NIGHTS]
04. Silicodisco - Biscayne Boulevard (Vhyce Remix) [Espacio CIELO]
05. Was - Like That [DAYS like NIGHTS]
06. Róisin Murphy x DJ Koze - Can't Replicate (12” Edit) [Ninja Tune]
07. Dilby - Passing Notes [Circus]
08. Eelke Kleijn vs Lee Cabrera - Self Control (Eelke Kleijn 12AU Cosmic Airplane Edition) [DAYS like NIGHTS]
09. Fiona Kraft - Deeper Feeling (MAGA Remix) [Connected]
10. Florian Kruse - Nordic Soul [Stone Free Berlin]
11. Rony Seikaly - Out Of Time [Stride]
12. Solee - Sternstunde (Joris Voorn Edit) [Spectrum]

This show is syndicated & distributed exclusively by Syndicast. If you are a radio station interested in airing the show, or would like to distribute your podcast / radio show, please register here: https://syndicast.co.uk/distribution/registration

Farmers Weekly In Focus
News Wrap | Scientists look to replicate pine successes in natives

Mar 27, 2025 · 9:23


Senior reporter Richard Rennie chats about his recent visit to Scion's headquarters in Rotorua, where he caught up with the latest research on both exotic and native trees. He also has the results of the recent Champions of Cheese awards and the inside information on a new wood pellet processing plant in the central North Island.

Law Firm Growth Podcast
Why Female Attorneys Get 'Gangbusters' Marketing Results (And How to Replicate Their Success) with Sonya Palmer

Mar 25, 2025 · 31:35


>> Get the newest LFG episodes delivered to your inbox when you sign up for our newsletter.
>> Get the new book at beyondintakebook.com

Resource links:
- Fast track your marketing efforts while avoiding common marketing mistakes in our new training
- Estate planning attorney? Stop guessing how to get results from online ads and grow your firm with our client-generating Seminar 3.0

Hosted on Acast. See acast.com/privacy for more information.

The Self-Makers Podcast
81. Building a Personal Brand AI Can't Replicate And Your Unfair Advantage in 2025

Mar 19, 2025 · 11:51


Feel like you're drowning in a sea of sameness online? In this AI-dominated landscape, your unique voice and lived experience are your unfair advantage. In this episode, I break down:
- Why your personal brand is your most valuable asset in 2025 (and no, being "good at what you do" isn't enough)
- The exact content formats getting real engagement right now (spoiler: it's not your how-to tips)
- How to weave your authentic stories into marketing that connects and converts
- The Instagram formats that are actually working (and why talking to camera is non-negotiable)
- Simple ways to humanise your sales process in a world of automations

Because here's what I know for sure: your personality, perspective, and human connection are impossible for AI to replicate. Stop hiding behind generic content and start showing up as the irreplaceable expert you are. Ready to stop blending in and start standing out? Let's make some magic.

FREE DOWNLOAD: Get my 27-page Lead Magnet Success Manual with 19 proven lead magnet formats for 2025: https://heyhazal.com/19-proven-lead-magnet-formats

HAZAL'S LINKS
The Self-Makers Mastermind: https://heyhazal.com/self-makers-mastermind
Marketing Call: Book a $20k Marketing Call
Instagram: @Hey.Hazal
Website: https://heyhazal.com/

#PersonalBranding #ContentStrategy #AIMarketing #2025Marketing #StandOutOnline #AuthenticMarketing #Entrepreneur #SmallBusinessTips #MarketingCoach #SocialMediaStrategy #BusinessGrowth #Storytelling #InstagramMarketing #ContentCreation #OnlineBusiness #BusinessCoach #PersonalBrandMessaging #SelfMakers #MarketingTips #DigitalMarketing

Kevin and Cory
Hour 4 - CBlock, will Josh Smith replicate success, Crosstalk

Mar 13, 2025 · 40:42


1pm hour of The K&C Masterpiece!

How‘d You Like That Movie‘s Podcast
Spaceman Replicate Me: Mickey 17 (2025)

Mar 13, 2025 · 33:21


On this week's show we are talking the new sci-fi film by Bong Joon-ho. The filmmaker behind Oscar winner Parasite and the dystopian Snowpiercer adapts the novel Mickey7 by Edward Ashton into a futuristic space travel flick. Is this film a rocket? Or more SpaceX? Only one way to find out.

Written review for Cambridge Today: Mickey-17 review

Talk to us, Goose: www.howdyoulikethatmovie.com

Keen On Democracy
Episode 2261: Thor Hanson on why virtual reality can never replicate the natural world

Mar 10, 2025 · 31:42


There's a story today about how a VR headset can make us more empathetic toward nature. But according to the Pacific Northwest based author and biologist Thor Hanson, no digital technology can ever replicate nature. Instead, he argues in his new book Close to Home, we humans are wired, so to speak, to appreciate the natural world, whether it's on the Galapagos or in our local park. In fact, he told me in a windswept conversation recorded outside his home on San Juan Island, the wonders of nature are just as acute outside our door, even if we live in Los Angeles or New York City.

Here are the five KEEN ON AMERICA takeaways from our conversation with Thor Hanson:

1. Nature is accessible everywhere, not just in remote locations. Hanson emphasizes that meaningful connections with nature can be found right outside your door, even in highly urbanized environments like Los Angeles.
2. Local nature connections provide emotional resilience against global environmental challenges. Hanson suggests that forming bonds with nearby natural spaces helps counterbalance feelings of helplessness about larger environmental crises.
3. Scientific evidence confirms nature's positive impact on physical and mental health. Multiple studies show measurable benefits from nature exposure, including lower anxiety, reduced blood pressure, and faster recovery from illness.
4. Children have a natural ability to observe and connect with nature that adults often lose. Kids see more details in nature because they haven't developed the sensory filters that adults use to block out environmental stimuli.
5. Small-scale local conservation efforts can collectively make significant environmental impacts. Hanson shares examples like Switzerland's community pond-building initiative that successfully reversed amphibian population declines across an entire region.

Author and biologist Thor Hanson is a Guggenheim Fellow, a Switzer Environmental Fellow, and winner of the John Burroughs Medal. His books include Close to Home, Buzz, The Triumph of Seeds, Feathers, Bartholomew Quill, The Impenetrable Forest, Hurricane Lizards and Plastic Squid, Star and the Maestro, and more.

Named one of the "100 most connected men" by GQ magazine, Andrew Keen is amongst the world's best known broadcasters and commentators. In addition to presenting the daily KEEN ON show, he is the host of the long-running How To Fix Democracy interview series. He is also the author of four prescient books about digital technology: CULT OF THE AMATEUR, DIGITAL VERTIGO, THE INTERNET IS NOT THE ANSWER and HOW TO FIX THE FUTURE. Andrew lives in San Francisco, is married to Cassandra Knight, Google's VP of Litigation & Discovery, and has two grown children.

Keen On America is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber. This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit keenon.substack.com/subscribe

Masty o Rasty | پادکست فارسی مستی و راستی
EP438 Masty o Rasty - Alireza Karduni (AI can't replicate Sasy Mankan)

Mar 7, 2025 · 87:24


This episode is sponsored by BetterHelp. Give online therapy a try at https://betterhelp.com/MASTYORASTY and get on your way to being your best self.

Alireza Karduni is an assistant professor of Human-Centered Computing at Simon Fraser University, School of Interactive Arts and Technology. He is back on the show to talk about music, creativity, and the limits of AI. http://karduni.com

To learn more about psychedelic therapy, go to my brother Mehran's page at https://www.mindbodyintegration.ca/ or to https://www.somaretreats.org for his next retreat.

*** Masty o Rasty does not condone, and is not responsible for, the views and opinions expressed by our guests. ***

Support the show: https://paypal.me/raamemami
Venmo + Revolut: @KingRaam

Hosted on Acast. See acast.com/privacy for more information.

Top Traders Unplugged
SI337: How Best to "Replicate" Managed Futures Returns ft. Katy Kaminski

Mar 1, 2025 · 63:28 · Transcription available


In this episode, Katy Kaminski helps us explore the latest trends in managed futures and how different approaches to replication can impact performance and correlation. We explore the pros and cons of index versus mechanical replication, highlighting how each method reacts differently in various market conditions. Plus, we touch on the most recent developments in the ETF space and how they're reshaping access to CTAs, but also question whether all of the new products are truly worthy of the "CTA" label. Stick around as we break down these ideas and share insights that can help shape your investment strategies.

-----
50 YEARS OF TREND FOLLOWING BOOK AND BEHIND-THE-SCENES VIDEO FOR ACCREDITED INVESTORS - CLICK HERE
-----

Follow Niels on Twitter, LinkedIn, YouTube or via the TTU website.

It's true – most CIOs read 50+ books each year – get your free copy of the Ultimate Guide to the Best Investment Books ever written here.

And you can get a free copy of my latest book "Ten Reasons to Add Trend Following to Your Portfolio" here.

Learn more about the Trend Barometer here.

Send your questions to info@toptradersunplugged.com

And please share this episode with a like-minded friend and leave an honest rating & review on iTunes or Spotify so more people can discover the podcast.

Follow Katy on LinkedIn.

Episode timestamps:
01:50 - What has caught our attention recently?
06:09 - Industry performance update
13:06 - Q1, Rick: Has Katy done any work on the Co-Movement Factor and "offsides" systems?
18:12 - Q1.2, Rick: Katy's work shows us the dispersion between "slow" and "fast" systems, with an optimal window of 10 months: how does this "optimal window" differ by year?
25:54 - Q2, Brian: Are there more individual investors than institutional investors?
28:43 - Q2.2: Are investors holding the ETF "long term", or do they "chase performance"?
32:25 - Q3, Niels: How should we think about future returns in trend following?
36:24 - The beta of trend following and how to access it
44:05 - Combining index and mechanical replication
50:58 - The pitfalls of replication...

Your Truth Shared
#105 The Untapped Marketing Strategy that AI Can't Replicate with Mark Schaefer

Feb 26, 2025 · 46:34


Have we lost our humanness and become lazy when it comes to marketing? "In a world where AI is more than competent, if you are merely competent, you're in trouble. We must rise above the bots. The most human company wins" – Mark Schaefer

In this episode we welcome back marketing visionary Mark Schaefer. Mark joins us to unveil insights from his latest book, "Audacious: How Humans Win in an AI Marketing World", exploring the coexistence of humanity and technology in marketing. Mark suggests that most marketing has lost its human touch and fails to resonate with its target audience. Mark shares his guide on how to put the human back into marketing!

What you will learn about in this episode:
• A recent marketing campaign that stopped Mark Schaefer in his tracks.
• How marketers are sleepwalking and losing to AI.
• Why "competent" marketing is no longer enough.
• What Mark learned from meeting some of the world's most creative marketing minds.
• Mark Schaefer's simple guide to disruptive marketing.
• Recent research that suggests most marketing is boring to its audience.
• The biggest opportunity in modern marketing that AI can't touch.
• Many of the learnings from Mark Schaefer's new book "Audacious: How Humans Win in an AI Marketing World".

Connect with Mark:
https://businessesgrow.com/
https://www.instagram.com/markwschaefer/
https://www.youtube.com/user/MWSchaefer

Buy Audacious: https://businessesgrow.com/audacious/

Get connected with host Finola Howard:
https://www.finolahoward.com/
https://www.finolahoward.com/book
https://www.linkedin.com/in/finolahoward/
https://www.instagram.com/thisisfinolahoward
https://www.facebook.com/thisisfinolahoward
https://www.youtube.com/@ThisIsFinolaHoward
https://www.twitter.com/FinolaHoward

Dukes & Bell
Can Atlanta United replicate their hot start?

Feb 25, 2025 · 9:55


United analyst Jason Longshore recaps a great opening win for the club and discusses whether they can keep up the pace this weekend.

From Startup to Wunderbrand with Nicholas Kuhne
AI's Limits: The One Thing It Can't Replicate

Feb 24, 2025 · 27:02


We also explore:
- How AI helped Chris write a book in under a week
- The future of AI and automation in business
- What AI can't replace in human creativity
- The balance between AI tools and personal branding
- How AI is leveling the playing field for startups and agencies

A must-listen for marketers, entrepreneurs, and anyone looking to future-proof their business in an AI-driven world!

Decoding the Gurus
Decoding Academia 32: Do Babies REALLY like good guys?

Feb 20, 2025 · 35:06


In this episode, Matt and Chris take a look at a recent developmental psychology paper on the social evaluation of young babies. Do they display a preference for agents who are nice to others, or could they care less at the babbling age? This is a large-scale, multi-lab, preregistered replication effort of a rather influential paper, so it ticks all of Chris' Open Science boxes, but how does Matt react? Is he stuck in his pre-replication-crisis paradigms? Join us to find out, and along the way learn about baby Matt's psychotic tendencies, how cats feel about cucumbers, and how Matt narrowly escaped being eaten by a big ol' crocodile.

Paper reference: Lucca, K., Yuen, F., Wang, Y., Alessandroni, N., Allison, O., Alvarez, M., ... & Hamlin, J. K. (2025). Infants' Social Evaluation of Helpers and Hinderers: A Large-Scale, Multi-Lab, Coordinated Replication Study. Developmental Science, 28(1), e13581.

Original study: Hamlin, J. K., Wynn, K., & Bloom, P. (2007). Social evaluation by preverbal infants. Nature, 450(7169), 557-559.

Decoding Academia 32
00:00 Introduction
00:59 Matt's Close Shave with a Crocodile
03:15 Discussion on Crocodile Behavior
05:13 Introduction to the Academic Paper
06:18 Understanding Registered Reports
07:49 Details of the Replication Study
12:07 The Many Babies Study
18:23 Challenges in Developmental Psychology
20:35 Original Study and Replication Efforts
26:27 HARKing and the QRP problem in psychology
34:24 Discussing the Results
36:58 Exploring the Red Ball Experiment
39:38 Forest Plot Analysis
41:19 Infant Preferences and Social Evaluation
43:24 Failure to Replicate the Original Study
47:06 Exploratory Analysis and Moderators
50:03 Interpretations and Implications
54:21 Evolutionary Perspectives on Social Behavior
58:34 Prosocial Evolutionary Speculation
01:05:10 Psychopathic Baby Matt
01:06:28 Concluding Thoughts and Reflections
01:11:20 Comparative Psychology on Snake Hatred!

The full episode is available for Patreon subscribers (1 hr 15 mins). Join us at: https://www.patreon.com/DecodingTheGurus

Baskin & Phelps
Can the NBA replicate what the NHL has done with All Star weekend?

Feb 20, 2025 · 8:58


Jeff and Dan talk about the NHL's 4 Nations Face-Off and whether or not the NBA can replicate what hockey did for All-Star weekend.

Tiki and Tierney
The Jets Will Try to Replicate the Lions Rebuild (Hour 3)

Feb 19, 2025 · 39:51


Hour 3: Sal and BT say the Jets will try to replicate the Lions' success. Plus, the Knicks will face some interesting challenges coming up, and much more.

Dukes & Bell
Falcons need to replicate Eagles draft model

Feb 18, 2025 · 12:06


The guys break down the Eagles draft picks and how the Falcons need to learn from their model

Kreckman & Lindahl
2/13/25 Hour 1 - Jamal Murray drops 55 and some are mad at Nathan MacKinnon, Nuggets lineup, can the Broncos replicate the Eagles?

Feb 14, 2025 · 43:14


00:00 Jamal Murray drops 55 and some are mad at Nathan MacKinnon.
14:50 Nuggets lineup.
30:50 Can the Broncos replicate the Eagles?

Hochman and Crowder
Hour 2: Can the Dolphins replicate an Eagles-esque build with Grier and McDaniel?

Feb 10, 2025 · 33:47


In hour two, what does the Eagles' run mean for the Miami Dolphins? Discussing whether there's a blueprint the Dolphins can follow to build a contender and whether Chris Grier and Mike McDaniel are the right guys to lead them. Andrew Wiggins makes his debut tonight for the Heat, but Hoch is convinced Jimmy Butler's Warriors are winning the title. Plus, Luka debuts for the Lakers tonight and Anthony Davis quickly confirms everyone's worries in Dallas.

Mully & Haugh Show on 670 The Score
Can the Bears replicate the Eagles' success? (Hour 2)

Feb 10, 2025 · 39:43


In the second hour, Mike Mulligan and David Haugh were joined by Tribune reporter Brad Biggs to discuss the Eagles' 40-22 win against the Chiefs in the Super Bowl on Sunday. Biggs also discussed what's next on the Bears' to-do list. Later, Score football analyst Olin Kreutz joined the show to share his takeaways from the Super Bowl.

Cougar Sports with Ben Criddle (BYU)
1-31-25 - LTAS with Blayne Andersen of Bander Wealth about whether BYU football can replicate its 5 special teams TDs next season

Feb 1, 2025 · 18:44


Ben Criddle talks BYU sports every weekday from 2 to 6 pm. Today's co-hosts: Ben Criddle (@criddlebenjamin). Subscribe to the Cougar Sports with Ben Criddle podcast: Apple Podcasts: https://itunes.apple.com/us/podcast/cougar-sports-with-ben-criddle/id99676

Cougar Sports with Ben Criddle (BYU)
1-31-25 - hour 3 - Over/Under: Can BYU football replicate their 5 special teams TDs and even get more next season?

Feb 1, 2025 · 40:18


Ben Criddle talks BYU sports every weekday from 2 to 6 pm. Today's co-hosts: Ben Criddle (@criddlebenjamin). Subscribe to the Cougar Sports with Ben Criddle podcast: Apple Podcasts: https://itunes.apple.com/us/podcast/cougar-sports-with-ben-criddle/id99676

Cougar Sports with Ben Criddle (BYU)
1-31-25 - QOTD - Can BYU Football replicate the 11-win season they had last year and be a top 3 team in the Big 12?

Feb 1, 2025 · 22:33


Ben Criddle talks BYU sports every weekday from 2 to 6 pm. Today's co-hosts: Ben Criddle (@criddlebenjamin). Subscribe to the Cougar Sports with Ben Criddle podcast: Apple Podcasts: https://itunes.apple.com/us/podcast/cougar-sports-with-ben-criddle/id99676

Cougar Sports with Ben Criddle (BYU)
1-27-25 - Trevor Matich, ESPN CFB Analyst - Why does Trevor think BYU can replicate their 10 win season again?

Jan 28, 2025 · 17:06


Ben Criddle talks BYU sports every weekday from 2 to 6 pm. Today's co-hosts: Ben Criddle (@criddlebenjamin). Subscribe to the Cougar Sports with Ben Criddle podcast: Apple Podcasts: https://itunes.apple.com/us/podcast/cougar-sports-with-ben-criddle/id99676 Stitcher: https://www.stitcher.com/podcast/espn960sports/cougar-bytes

Felger & Massarotti
Greg Bedard of Boston Sports Journal // 3 Up and 3 Down // Can the Patriots Replicate the Commanders? - 1/21 (Hour 1)

Jan 21, 2025 · 43:51


(0:00) Greg Bedard of BostonSportsJournal.com joins the show and discusses the turnaround of the Washington Commanders and if the Patriots can replicate that.
(16:51) More thoughts on the Patriots and how they can replicate the Commanders returning to contention.
(24:12) 3 Up and 3 Down from NFL Divisional Weekend.
(34:53) Some additional thoughts from NFL Divisional Weekend, including Jared Goff and the Detroit Lions.

Bull & Fox
Can the Browns replicate what the Commanders did?

Jan 21, 2025 · 11:07


Nick and Jonathan discuss Jayden Daniels' success in Washington, and if the Browns could follow a similar turnaround.

Bull & Fox
Hour 3: Can the Browns replicate what the Commanders did? + Albert Breer + Golf cart crash with Ryan Day and Will Howard

Jan 21, 2025 · 33:38


Nick and Jonathan discuss Jayden Daniels' success in Washington, and they're joined by The MMQB's Albert Breer to discuss Ohio State and the NFL Playoffs. Also, they talk about a 'crash' that happened after the national championship game.

Howard and Jeremy
Hour 3 - Can the Bills replicate past defensive success? Sal Capaccio joins the show

Jan 16, 2025 · 38:17


Hour 3 in full featuring Sal Capaccio.

The Skeptics Guide to Emergency Medicine
SGEM#465: Not A Second Time – Single Center RCTs Fail To Replicate In Multi-Center RCTs

Jan 11, 2025 · 35:03


Date: December 20, 2024. Reference: Kotani et al. Positive single-center randomized trials and subsequent multicenter randomized trials in critically ill patients: a systematic review. Crit Care. 2023. Guest Skeptic: Dr. Scott Weingart is an ED intensivist from New York. He did fellowships in Trauma, Surgical Critical Care, and ECMO. He is a physician coach concentrating […]

NFL: Move the Sticks with Daniel Jeremiah & Bucky Brooks
Can Ben Johnson Replicate His Success as a HC? + Game of the Week & CFP Semifinals Preview

Jan 9, 2025 · 40:13 · Transcription available


On this episode of Move the Sticks, Bucky Brooks is joined by special guest Lance Zierlein. The duo dives into what makes Lions offensive coordinator Ben Johnson such a coveted head coaching candidate in the NFL. They explore the challenges that come with transitioning from a successful coordinator to a head coach and whether Johnson can replicate his success with a new roster. Next, they break down the Game of the Week: Packers vs. Eagles, predicting who will move on to the divisional round. Finally, they preview the College Football Playoff semifinals, highlighting key things to watch. Tune in for all this and more on the latest episode of Move the Sticks! Help people affected by the raging wildfires in California. Go to redcross.org/NFL to make a donation. See omnystudio.com/listener for privacy information.

All Sides with Ann Fisher Podcast
Tech Tuesday: Democrats are trying to replicate Republicans' success on social media

Jan 7, 2025 · 49:58


Democrats are trying to replicate Republicans' success in amplifying their political messages across social media platforms, especially in how they use content creators and high-profile podcasters like Joe Rogan.


AI Unraveled: Latest AI News & Trends, Master GPT, Gemini, Generative AI, LLMs, Prompting, GPT Store

A Daily Chronicle of AI Innovations on January 6th, 2025. Listen to the podcast of this AI Daily News at https://podcasts.apple.com/ca/podcast/ai-unraveled-latest-ai-news-trends-chatgpt-gemini-gen/id1684415169

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

Applications for the NYC AI Engineer Summit, focused on Agents at Work, are open!

When we first started Latent Space, in the lightning round we'd always ask guests: "What's your favorite AI product?" The majority would say Midjourney. The simple UI of prompt → very aesthetic image turned it into a $300M+ ARR bootstrapped business as it rode the first wave of AI image generation.

In open source land, StableDiffusion was congregating around AUTOMATIC1111 as the de-facto web UI. Unlike Midjourney, which offered some flags but was mostly prompt-driven, A1111 let users play with a lot more parameters, supported additional modalities like img2img, and allowed users to load in custom models. If you're interested in some of the SD history, you can look at our episodes with Lexica, Replicate, and Playground.

One of the people involved with that community was comfyanonymous, who was also part of the Stability team in 2023, and decided to build an alternative called ComfyUI, now one of the fastest growing open source projects in generative images, and now the preferred partner for folks like Black Forest Labs's Flux Tools on Day 1. The idea behind it was simple: "Everyone is trying to make easy to use interfaces. Let me try to make a powerful interface that's not easy to use."

Unlike its predecessors, ComfyUI does not have an input text box. Everything is based around the idea of a node: there's a text input node, a CLIP node, a checkpoint loader node, a KSampler node, a VAE node, etc. While daunting for simple image generation, the tool is amazing for more complex workflows, since you can break down every step of the process and then chain many of them together rather than manually switching between tools. You can also re-start execution halfway instead of from the beginning, which can save a lot of time when using larger models.

To give you an idea of some of the new use cases that this type of UI enables:
* Sketch something → Generate an image with SD from sketch → feed it into SD Video to animate
* Generate an image of an object → Turn into a 3D asset → Feed into interactive experiences
* Input audio → Generate audio-reactive videos

Their Examples page also includes some of the more common use cases like AnimateDiff, etc. They recently launched the Comfy Registry, an online library of different nodes that users can pull from rather than having to build everything from scratch. The project has >60,000 GitHub stars, and as the community grows, some of the projects that people build have gotten quite complex.

The most interesting thing about Comfy is that it's not a UI, it's a runtime. You can build full applications on top of image models simply by using Comfy. You can expose Comfy workflows as an endpoint and chain them together just like you chain a single node. We're seeing the rise of AI Engineering applied to art.
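To make "Comfy as a runtime" concrete, here is a minimal sketch of driving a workflow programmatically, assuming a stock local ComfyUI server on 127.0.0.1:8188 and a workflow exported from the UI in API format; the node id "6" is a hypothetical placeholder for whatever node id your own export uses:

```python
# Minimal sketch: queue a ComfyUI workflow over its local HTTP API.
# Assumes a default local install and a "workflow_api.json" exported
# from the UI via "Save (API Format)". Node id "6" is hypothetical.
import json
import urllib.request

with open("workflow_api.json") as f:
    workflow = json.load(f)

# Patch one node's input before queueing, e.g. the positive prompt text.
workflow["6"]["inputs"]["text"] = "a fox sitting on a mountain at dawn"

req = urllib.request.Request(
    "http://127.0.0.1:8188/prompt",
    data=json.dumps({"prompt": workflow}).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    # The response carries a prompt_id you can poll via /history.
    print(json.load(resp))
```

Chaining whole workflows then amounts to feeding one run's outputs into the next request, the same way you chain a single node.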
Major Tom's ComfyUI Resources from the Latent Space Discord

Major shoutouts to Major Tom on the LS Discord, who is an image generation expert who offered these pointers:

* "best thing about comfy is the fact it supports almost immediately every new thing that comes out - unlike A1111 or forge, which still don't support flux cnet for instance. It will be perfect tool when conflicting nodes will be resolved"
* AP Workflows from Alessandro Perili are a nice example of an all-in-one train-evaluate-generate system built atop Comfy
* ComfyUI YouTubers to learn from: @sebastiankamph, @NerdyRodent, @OlivioSarikas, @sedetweiler, @pixaroma
* ComfyUI Nodes to check out:
  * https://github.com/kijai/ComfyUI-IC-Light
  * https://github.com/MrForExample/ComfyUI-3D-Pack
  * https://github.com/PowerHouseMan/ComfyUI-AdvancedLivePortrait
  * https://github.com/pydn/ComfyUI-to-Python-Extension
  * https://github.com/THtianhao/ComfyUI-Portrait-Maker
  * https://github.com/ssitu/ComfyUI_NestedNodeBuilder
  * https://github.com/longgui0318/comfyui-magic-clothing
  * https://github.com/atmaranto/ComfyUI-SaveAsScript
  * https://github.com/ZHO-ZHO-ZHO/ComfyUI-InstantID
  * https://github.com/AIFSH/ComfyUI-FishSpeech
  * https://github.com/coolzilj/ComfyUI-Photopea
  * https://github.com/lks-ai/anynode
* Sarav: https://www.youtube.com/@mickmumpitz/videos (applied stuff)
* Sarav: https://www.youtube.com/@latentvision (technical, but infrequent)
* Look for the ComfyUI node for https://github.com/magic-quill/MagicQuill
* "Comfy for Video" resources:
  * Kijai (https://github.com/kijai) pushing out support for Mochi, CogVideoX, AnimateDiff, LivePortrait etc
  * ComfyUI node support like LTX https://github.com/Lightricks/ComfyUI-LTXVideo and HunyuanVideo
  * FloraFauna AI
* Communities: https://www.reddit.com/r/StableDiffusion/, https://www.reddit.com/r/comfyui/

Full YouTube Episode

As usual, you can find the full video episode on our YouTube (and don't forget to like and subscribe!)

Timestamps

* 00:00:04 Introduction of hosts and anonymous guest
* 00:00:35 Origins of Comfy UI and early Stable Diffusion landscape
* 00:02:58 Comfy's background and development of high-res fix
* 00:05:37 Area conditioning and compositing in image generation
* 00:07:20 Discussion on different AI image models (SD, Flux, etc.)
* 00:11:10 Closed source model APIs and community discussions on SD versions
* 00:14:41 LoRAs and textual inversion in image generation
* 00:18:43 Evaluation methods in the Comfy community
* 00:20:05 CLIP models and text encoders in image generation
* 00:23:05 Prompt weighting and negative prompting
* 00:26:22 Comfy UI's unique features and design choices
* 00:31:00 Memory management in Comfy UI
* 00:33:50 GPU market share and compatibility issues
* 00:35:40 Node design and parameter settings in Comfy UI
* 00:38:44 Custom nodes and community contributions
* 00:41:40 Video generation models and capabilities
* 00:44:47 Comfy UI's development timeline and rise to popularity
* 00:48:13 Current state of Comfy UI team and future plans
* 00:50:11 Discussion on other Comfy startups and potential text generation support

Transcript

Alessio [00:00:04]: Hey everyone, welcome to the Latent Space podcast. This is Alessio, partner and CTO at Decibel Partners, and I'm joined by my co-host Swyx, founder of Small AI.

swyx [00:00:12]: Hey everyone, we are in the Chroma Studio again, but with our first ever anonymous guest, Comfy Anonymous, welcome.

Comfy [00:00:19]: Hello.

swyx [00:00:21]: I feel like that's your full name, you just go by Comfy, right?

Comfy [00:00:24]: Yeah, well, a lot of people just call me Comfy, even when they know my real name. Hey, Comfy.

Alessio [00:00:32]: Swyx is the same. You know, not a lot of people call you Shawn.

swyx [00:00:35]: Yeah, you have a professional name, right, that people know you by, and then you have a legal name. Yeah, it's fine. How do I phrase this?
I think people who are in the know, know that Comfy is like the tool for image generation and now other multimodality stuff. I would say that when I first got started with Stable Diffusion, the star of the show was Automatic 111, right? And I actually looked back at my notes from 2022-ish, like Comfy was already getting started back then, but it was kind of like the up and comer, and your main feature was the flowchart. Can you just kind of rewind to that moment, that year and like, you know, how you looked at the landscape there and decided to start Comfy?

Comfy [00:01:10]: Yeah, I discovered Stable Diffusion in 2022, in October 2022. And, well, I kind of started playing around with it. Yes, I, and back then I was using Automatic, which was what everyone was using back then. And so I started with that because I had, it was when I started, I had no idea like how Diffusion works. I didn't know how Diffusion models work, how any of this works, so.

swyx [00:01:36]: Oh, yeah. What was your prior background as an engineer?

Comfy [00:01:39]: Just a software engineer. Yeah. Boring software engineer.

swyx [00:01:44]: But like any, any image stuff, any orchestration, distributed systems, GPUs?

Comfy [00:01:49]: No, I was doing basically nothing interesting. CRUD, web development? Yeah, a lot of web development, just, yeah, some basic, maybe some basic like automation stuff. Okay. Just. Yeah, no, like, no big companies or anything.

swyx [00:02:08]: Yeah, but like already some interest in automations, probably a lot of Python.

Comfy [00:02:12]: Yeah, yeah, of course, Python. But I wasn't actually used to like the Node graph interface before I started Comfy UI. It was just, I just thought it was like, oh, like, what's the best way to represent the Diffusion process in the user interface? And then like, oh, well. Well, like, naturally, oh, this is the best way I've found. And this was like with the Node interface. So how I got started was, yeah, so basic October 2022, just like I hadn't written a line of PyTorch before that. So it's completely new. What happened was I kind of got addicted to generating images.

Alessio [00:02:58]: As we all did. Yeah.

Comfy [00:03:00]: And then I started. I started experimenting with like the high-res fix in auto, which was, for those that don't know, the high-res fix is just, since the Diffusion models back then could only generate that low-resolution, so what you would do, you would generate a low-resolution image, then upscale, then refine it again. And that was kind of the hack to generate high-resolution images. I really liked generating. Like higher resolution images. So I was experimenting with that. And so I modified the code a bit. Okay. What happens if I, if I use different samplers on the second pass? I edited the code of auto. So what happens if I use a different sampler? What happens if I use a different, like a different settings, different number of steps? And because back then the. The high-res fix was very basic, just, so. Yeah.

swyx [00:04:05]: Now there's a whole library of just, uh, the upsamplers.

Comfy [00:04:08]: I think, I think they added a bunch of, uh, of options to the high-res fix since, uh, since, since then. But before that was just so basic. So I wanted to go further. I wanted to try it. What happens if I use a different model for the second, the second pass? And then, well, then the auto code base was, wasn't good enough for. Like, it would have been, uh, harder to implement that in the auto interface than to create my own interface.
So that's when I decided to create my own. And you were doing that mostly on your own when you started, or did you already have kind of like a subgroup of people? No, I was, uh, on my own because, because it was just me experimenting with stuff. So yeah, that was it. Then, so I started writing the code January 1st, 2023, and then I released the first version on GitHub, January 16th, 2023. That's how things got started.

Alessio [00:05:11]: And what's, what's the name? Comfy UI right away or? Yeah.

Comfy [00:05:14]: Comfy UI. The reason the name, my name is Comfy is people thought my pictures were comfy, so I just, uh, just named it, uh, uh, it's my Comfy UI. So yeah, that's, uh,

swyx [00:05:27]: Is there a particular segment of the community that you targeted as users? Like more intensive workflow artists, you know, compared to the automatic crowd or, you know,

Comfy [00:05:37]: This was my way of like experimenting with, uh, with new things, like the high-res fix thing I mentioned, which was like in Comfy, the first thing you could easily do was just chain different models together. And then one of the first things, I think the first times it got a bit of popularity was when I started experimenting with the different, like applying. Prompts to different areas of the image. Yeah. I called it area conditioning, posted it on Reddit and it got a bunch of upvotes. So I think that's when, like, when people first learned of Comfy UI.

swyx [00:06:17]: Is that mostly like fixing hands?

Comfy [00:06:19]: Uh, no, no, no. That was just, uh, like, let's say, well, it was very, well, it still is kind of difficult to like, let's say you want a mountain, you have an image and then, okay. I'm like, okay. I want the mountain here and I want the, like a, a fox here.

swyx [00:06:37]: Yeah. So compositing the image. Yeah.

Comfy [00:06:40]: My way was very easy. It was just like, oh, when you run the diffusion process, you kind of generate, okay. You do pass one pass through the diffusion, every step you do one pass. Okay. This place of the image with this prompt, this place of the image with the other prompt. And then. The entire image with another prompt and then just average everything together, every step, and that was, uh, area composition, which I call it. And then, then a month later, there was a paper that came out called multi diffusion, which was the same thing, but yeah, that's, uh,

Alessio [00:07:20]: could you do area composition with different models or because you're averaging out, you kind of need the same model.

Comfy [00:07:26]: Could do it with, but yeah, I hadn't implemented it. For different models, but, uh, you, you can do it with, uh, with different models if you want, as long as the models share the same latent space, like we, we're supposed to ring a bell every time someone says, yeah, like, for example, you couldn't use like XL and SD 1.5, because those have a different latent space, but like, uh, yeah, like SD 1.5 models, different ones. You could, you could do that.

swyx [00:07:59]: There's some models that try to work in pixel space, right?

Comfy [00:08:03]: Yeah. They're very slow. Of course. That's the problem. That that's the, the reason why stable diffusion actually became like popular, like, cause was because of the latent space.

swyx [00:08:14]: Small and yeah. Because it used to be latent diffusion models and then they trained it up.

Comfy [00:08:19]: Yeah. Cause a pixel, pixel diffusion models are just too slow. So. Yeah.
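To make the area-conditioning trick Comfy just described concrete, here is a schematic sketch of one sampling step under stated assumptions (a `unet` callable returning a noise prediction, plus per-region masks and conditionings); this is an illustration of the averaging idea, not ComfyUI's actual node code:

```python
import torch

def area_composition_step(unet, latents, t, base_cond, regions):
    """One denoising step: one U-Net pass per regional prompt, then a
    mask-weighted average with the whole-image prompt's prediction."""
    # regions: list of (cond, mask) pairs, e.g. a "mountain" prompt with a
    # mask that is 1.0 inside its area of the latent and 0.0 elsewhere.
    total = unet(latents, t, base_cond)   # pass for the whole-image prompt
    weight = torch.ones_like(latents)
    for cond, mask in regions:
        total = total + mask * unet(latents, t, cond)
        weight = weight + mask
    return total / weight                 # average where regions overlap
```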
swyx [00:08:25]: Have you ever tried to talk to like, like stability, the latent diffusion guys, like, you know, Robin Rombach, that, that crew. Yeah.

Comfy [00:08:32]: Well, I used to work at stability.

swyx [00:08:34]: Oh, I actually didn't know. Yeah.

Comfy [00:08:35]: I used to work at stability. I got, uh, I got hired, uh, in June, 2023.

swyx [00:08:42]: Ah, that's the part of the story I didn't know about. Okay. Yeah.

Comfy [00:08:46]: So the, the reason I was hired is because they were doing, uh, SDXL at the time and they were basically SDXL. I don't know if you remember, it was a base model and then a refiner model. Basically they wanted to experiment, like chaining them together. And then, uh, they saw, oh, right. Oh, this, we can use this to do that. Well, let's hire that guy.

swyx [00:09:10]: But they didn't, they didn't pursue it for like SD3. What do you mean? Like the SDXL approach. Yeah.

Comfy [00:09:16]: The reason for that approach was because basically they had two models and then they wanted to publish both of them. So they, they trained one on. Lower time steps, which was the refiner model. And then they, the first one was trained normally. And then they went during their test, they realized, oh, like if we string these models together are like quality increases. So let's publish that. It worked. Yeah. But like right now, I don't think many people actually use the refiner anymore, even though it is actually a full diffusion model. Like you can use it on its own. And it's going to generate images. I don't think anyone, people have mostly forgotten about it. But, uh.

Alessio [00:10:05]: Can we talk about models a little bit? So stable diffusion, obviously is the most known. I know flux has gotten a lot of traction. Are there any underrated models that people should use more or what's the state of the union?

Comfy [00:10:17]: Well, the, the latest, uh, state of the art, at least, yeah, for images there's, uh, yeah, there's flux. There's also SD3.5. SD3.5 is two models. There's a, there's a small one, 2.5B and there's the bigger one, 8B. So it's, it's smaller than flux. So, and it's more, uh, creative in a way, but flux, yeah, flux is the best. People should give SD3.5 a try cause it's, uh, it's different. I won't say it's better. Well, it's better for some like specific use cases. Right. If you want some to make something more like creative, maybe SD3.5. If you want to make something more consistent and flux is probably better.

swyx [00:11:06]: Do you ever consider supporting the closed source model APIs?

Comfy [00:11:10]: Uh, well, they, we do support them as custom nodes. We actually have some, uh, official custom nodes from, uh, different. Ideogram.

swyx [00:11:20]: Yeah. I guess DALL-E would have one. Yeah.

Comfy [00:11:23]: That's, uh, it's just not, I'm not the person that handles that. Sure.

swyx [00:11:28]: Sure. Quick question on, on SD. There's a lot of community discussion about the transition from SD1.5 to SD2 and then SD2 to SD3. People still like, you know, very loyal to the previous generations of SDs?

Comfy [00:11:41]: Uh, yeah. SD1.5 then still has a lot of, a lot of users.

swyx [00:11:46]: The last based model.

Comfy [00:11:49]: Yeah. Then SD2 was mostly ignored. It wasn't, uh, it wasn't a big enough improvement over the previous one. Okay.

swyx [00:11:58]: So SD1.5, SD3, flux and whatever else. SDXL. SDXL.

Comfy [00:12:03]: That's the main one. Stable cascade. Stable cascade. That was a good model.
But, uh, that's, uh, the problem with that one is, uh, it got, uh, like SD3 was announced one week after. Yeah.

swyx [00:12:16]: It was like a weird release. Uh, what was it like inside of stability actually? I mean, statute of limitations. Yeah. The statute of limitations expired. You know, management has moved. So it's easier to talk about now. Yeah.

Comfy [00:12:27]: And inside stability, actually that model was ready, uh, like three months before, but it got, uh, stuck in, uh, red teaming. So basically the product, if that model had released or was supposed to be released by the authors, then it would probably have gotten very popular since it's a, it's a step up from SDXL. But it got all of its momentum stolen. It got stolen by the SD3 announcement. So people kind of didn't develop anything on top of it, even though it's, uh, yeah. It was a good model, at least, uh, completely mostly ignored for some reason. Like

swyx [00:13:07]: I think the naming as well matters. It seemed like a branch off of the main, main tree of development. Yeah.

Comfy [00:13:15]: Well, it was different researchers that did it. Yeah. Yeah. Very like, uh, good model. Like it's the Würstchen authors. I don't know if I'm pronouncing it correctly. Yeah. Yeah. Yeah.

swyx [00:13:28]: I actually met them in Vienna. Yeah.

Comfy [00:13:30]: They worked at stability for a bit and they left right after the Cascade release.

swyx [00:13:35]: This is Dustin, right? No. Uh, Dustin's SD3. Yeah.

Comfy [00:13:38]: Dustin is, uh, SD3, SDXL. That's, uh, Pablo and Dome. I think I'm pronouncing his name correctly. Yeah. Yeah. Yeah. Yeah. That's very good.

swyx [00:13:51]: It seems like the community is very, they move very quickly. Yeah. Like when there's a new model out, they just drop whatever the current one is. And they just all move wholesale over. Like they don't really stay to explore the full capabilities. Like if, if the stable cascade was that good, they would have AB tested a bit more. Instead they're like, okay, SD3 is out. Let's go. You know?

Comfy [00:14:11]: Well, I find the opposite actually. The community doesn't like, they only jump on a new model when there's a significant improvement. Like if there's a, only like a incremental improvement, which is what, uh, most of these models are going to have, especially if you, cause, uh, stay the same parameter count. Yeah. Like you're not going to get a massive improvement, uh, into like, unless there's something big that, that changes. So, uh. Yeah.

swyx [00:14:41]: And how are they evaluating these improvements? Like, um, because there's, it's a whole chain of, you know, comfy workflows. Yeah. How does, how does one part of the chain actually affect the whole process?

Comfy [00:14:52]: Are you talking on the model side specific?

swyx [00:14:54]: Model specific, right? But like once you have your whole workflow based on a model, it's very hard to move.

Comfy [00:15:01]: Uh, not, well, not really. Well, it depends on your, uh, depends on their specific kind of the workflow. Yeah.

swyx [00:15:09]: So I do a lot of like text and image. Yeah.

Comfy [00:15:12]: When you do change, like most workflows are kind of going to be complete. Yeah. It's just like, you might have to completely change your prompt completely change. Okay.

swyx [00:15:24]: Well, I mean, then maybe the question is really about evals. Like what does the comfy community do for evals? Just, you know,

Comfy [00:15:31]: Well, that they don't really do that. It's more like, oh, I think this image is nice.
So that's, uh,

swyx [00:15:38]: They just subscribe to Fofr AI and just see like, you know, what Fofr is doing. Yeah.

Comfy [00:15:43]: Well, they just, they just generate like it. Like, I don't see anyone really doing it. Like, uh, at least on the comfy side, comfy users, they, it's more like, oh, generate images and see, oh, this one's nice. It's like, yeah, it's not, uh, like the, the more, uh, like, uh, scientific, uh, like, uh, like checking that's more on specifically on like model side. If, uh, yeah, but there is a lot of, uh, vibes also, cause it is a like, uh, artistic, uh, you can create a very good model that doesn't generate nice images. Cause most images on the internet are ugly. So if you, if that's like, if you just, oh, I have the best model at 10th giant, it's super smart. I created on all the, like I've trained on just all the images on the internet. The images are not going to look good. So yeah.

Alessio [00:16:42]: Yeah.

Comfy [00:16:43]: They're going to be very consistent. But yeah. People like, it's not going to be like the, the look that people are going to be expecting from, uh, from a model. So. Yeah.

swyx [00:16:54]: Can we talk about LoRAs? Cause we thought we talked about models then like the next step is probably LoRAs. Before, I actually, I'm kind of curious how LoRAs entered the tool set of the image community because the LoRA paper was 2021. And then like, there was like other methods like textual inversion that was popular at the early SD stage. Yeah.

Comfy [00:17:13]: I can't even explain the difference between that. Yeah. Textual inversions. That's basically what you're doing is you're, you're training a, cause well, yeah. Stable diffusion. You have the diffusion model, you have the text encoder. So basically what you're doing is training a vector that you're going to pass to the text encoder. It's basically you're training a new word. Yeah.

swyx [00:17:37]: It's a little bit like representation engineering now. Yeah.

Comfy [00:17:40]: Yeah. Basically. Yeah. You're just, so yeah, if you know how like the text encoder works, basically you have, you take your, your words of your prompt, you convert those into tokens with the tokenizer and those are converted into vectors. Basically. Yeah. Each token represents a different vector. So each word represents a vector. And those, depending on your words, that's the list of vectors that get passed to the text encoder, which is just. Yeah. Yeah. It's just a stack of, of attention. Like basically it's a very close to LLM architecture. Yeah. Yeah. So basically what you're doing is just training a new vector. We're saying, well, I have all these images and I want to know which word does that represent? And it's going to get like, you train this vector and then, and then when you use this vector, it hopefully generates. Like something similar to your images. Yeah.

swyx [00:18:43]: I would say it's like surprisingly sample efficient in picking up the concept that you're trying to train it on. Yeah.

Comfy [00:18:48]: Well, people have kind of stopped doing that even though back as like when I was at Stability, we, we actually did train internally some like textual inversions on like T5 XXL actually worked pretty well. But for some reason, yeah, people don't use them.
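As a rough illustration of what textual inversion actually optimizes, here is a toy sketch; `tokenize`, `embedding_table`, `placeholder_index`, `text_encoder`, `unet`, `images` and `denoising_loss` are hypothetical stand-ins for the frozen pieces of a real SD training loop, not any library's actual API:

```python
import torch

dim = 768                                  # CLIP-L hidden size for SD 1.5
new_word = torch.randn(dim, requires_grad=True)  # the one trained vector
opt = torch.optim.AdamW([new_word], lr=5e-3)

for step in range(1000):
    ids = tokenize("a photo of <my-concept>")    # hypothetical helper
    embeds = embedding_table(ids).clone()        # frozen token embeddings
    embeds[placeholder_index(ids)] = new_word    # splice in "the new word"
    cond = text_encoder(embeds)                  # frozen CLIP stack
    loss = denoising_loss(unet, images, cond)    # frozen U-Net, your images
    loss.backward()
    opt.step()
    opt.zero_grad()
```

Everything except the single new vector stays frozen, which is why these files are tiny and, as noted above, surprisingly sample efficient.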
Comfy (cont.): And also they might also work like, like, yeah, this is something and probably have to test, but maybe if you train a textual inversion, like on T5 XXL, it might also work with all the other models that use T5 XXL because same thing with like, like the textual inversions that, that were trained for SD 1.5, they also kind of work on SDXL because SDXL has the, has two text encoders. And one of them is the same as the, as the SD 1.5 CLIP-L. So those, they actually would, they don't work as strongly because they're only applied to one of the text encoders. But, and the same thing for SD3. SD3 has three text encoders. So it works. It's still, you can still use your textual inversion SD 1.5 on SD3, but it's just a lot weaker because now there's three text encoders. So it gets even more diluted. Yeah.

swyx [00:20:05]: Do people experiment a lot on, just on the CLIP side, there's like Siglip, there's Blip, like do people experiment a lot on those?

Comfy [00:20:12]: You can't really replace. Yeah.

swyx [00:20:14]: Because they're trained together, right? Yeah.

Comfy [00:20:15]: They're trained together. So you can't like, well, what I've seen people experimenting with is a long CLIP. So basically someone fine tuned the CLIP model to accept longer prompts.

swyx [00:20:27]: Oh, it's kind of like long context fine tuning. Yeah.

Comfy [00:20:31]: So, so like it's, it's actually supported in Core Comfy.

swyx [00:20:35]: How long is long?

Comfy [00:20:36]: Regular CLIP is 77 tokens. Yeah. Long CLIP is 256. Okay. So, but the hack that like you've, if you use stable diffusion 1.5, you've probably noticed, oh, it still works if I, if I use long prompts, prompts longer than 77 words. Well, that's because the hack is to just, well, you split, you split it up in chunks of 77, your whole big prompt. Let's say you, you give it like the massive text, like the Bible or something, and it would split it up in chunks of 77 and then just pass each one through the CLIP and then just concat everything together at the end. It's not ideal, but it actually works.

swyx [00:21:26]: Like the positioning of the words really, really matters then, right? Like this is why order matters in prompts. Yeah.

Comfy [00:21:33]: Yeah. Like it, it works, but it's, it's not ideal, but it's what people expect. Like if, if someone gives a huge prompt, they expect at least some of the concepts at the end to be like present in the image. But usually when they give long prompts, they, they don't, they like, they don't expect like detail, I think. So that's why it works very well.

swyx [00:21:58]: And while we're on this topic, prompt weighting, negative prompts. Negative prompting, all, all sort of similar part of this layer of the stack. Yeah.

Comfy [00:22:05]: The, the hack for that, which works on CLIP, like it, basically it's just for SD 1.5, well, for SD 1.5, the prompt weighting works well because CLIP L is a, is not a very deep model. So you have a very high correlation between, you have the input token, the index of the input token vector. And the output token, they're very, the concepts are very close, closely linked. So that means if you interpolate the vector from what, well, the, the way Comfy UI does it is it has, okay, you have the vector, you have an empty prompt. So you have a, a chunk, like a CLIP output for the empty prompt, and then you have the one for your prompt. And then it interpolates from that, depending on your prompt. Yeah.

Comfy [00:23:07]: So that's how it, how it does prompt weighting. But this stops working the deeper your text encoder is.
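Both CLIP-layer hacks just described fit in a few lines. This is a schematic sketch (real implementations also pad the last chunk and handle the BOS/EOS special tokens), not ComfyUI's exact code:

```python
import torch

MAX_LEN = 77  # regular CLIP context length

def encode_long_prompt(token_ids, clip_encoder):
    """The long-prompt hack: split >77-token prompts into 77-token chunks,
    encode each chunk separately, concat the outputs along the sequence axis."""
    chunks = token_ids.split(MAX_LEN)
    return torch.cat([clip_encoder(c.unsqueeze(0)) for c in chunks], dim=1)

def weight_tokens(cond, empty_cond, weights):
    """Prompt weighting by interpolation: blend each token's CLIP output
    between the empty-prompt output (weight 0) and the full output (weight 1)."""
    w = weights.view(1, -1, 1)   # (77,) -> broadcastable over (1, 77, dim)
    return empty_cond + w * (cond - empty_cond)
```

The interpolation trick leans on the shallow CLIP-L encoder keeping input and output positions closely linked, which is exactly why it degrades on deeper encoders.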
Comfy (cont.): So on T5 XXL itself, it doesn't work at all. So. Wow.

swyx [00:23:20]: Is that a problem for people? I mean, cause I'm used to just move, moving up numbers. Probably not. Yeah.

Comfy [00:23:25]: Well.

swyx [00:23:26]: So you just use words to describe, right? Cause it's a bigger language model. Yeah.

Comfy [00:23:30]: Yeah. So. Yeah. So honestly it might be good, but I haven't seen many complaints on Flux that it's not working. So, cause I guess people can sort of get around it with, with language. So. Yeah.

swyx [00:23:46]: Yeah. And then coming back to LoRAs, now the, the popular way to, to customize models is LoRAs. And I saw you also support LoCon and LoHa, which I've never heard of before.

Comfy [00:23:56]: There's a bunch of, cause what, what the LoRA is essentially is. Instead of like, okay, you have your, your model and then you want to fine tune it. So instead of like, what you could do is you could fine tune the entire thing, but that's a bit heavy. So to speed things up and make things less heavy, what you can do is just fine tune some smaller weights, like basically two, two matrices that when you multiply like two low rank matrices and when you multiply them together, gives a, represents a difference between trained weights and your base weights. So by training those two smaller matrices, that's a lot less heavy. Yeah.

Alessio [00:24:45]: And they're portable. So you're going to share them. Yeah. It's like easier. And also smaller.

Comfy [00:24:49]: Yeah. That's the, how LoRAs work. So basically, so when, when inferencing you, you get an inference with them pretty efficiently, like how Comfy UI does it. It just, when you use a LoRA, it just applies it straight on the weights so that there's only a small delay at the base, like before the sampling to when it applies the weights and then it just same speed as, as before. So for, for inference, it's, it's not that bad, but, and then you have, so basically all the LoRA types like LoHa, LoCon, everything, that's just different ways of representing that like. Basically, you can call it kind of like compression, even though it's not really compression, it's just different ways of represented, like just, okay, I want to train a different on the difference on the weights. What's the best way to represent that difference? There's the basic LoRA, which is just, oh, let's multiply these two matrices together. And then there's all the other ones, which are all different algorithms. So. Yeah.
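The "two low-rank matrices" description maps directly to code. A minimal sketch of merging a basic LoRA into a base weight before sampling, with illustrative shapes rather than any real checkpoint's:

```python
import torch

def apply_lora(W, A, B, alpha=1.0):
    """W: (out, in) frozen base weight; A: (rank, in), B: (out, rank).
    B @ A is the trained low-rank delta; merging it once into W is why
    inference afterwards runs at the same speed as the base model."""
    return W + alpha * (B @ A)

W = torch.randn(768, 768)        # base weight
A = torch.randn(8, 768) * 0.01   # rank-8 factors: tiny next to W,
B = torch.zeros(768, 8)          # which is why LoRA files are small
W_patched = apply_lora(W, A, B, alpha=0.8)
```

LoHa, LoCon and the rest swap in different factorizations for that same delta, as Comfy says: different representations of the difference on the weights.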
Alessio [00:25:57]: So let's talk about LoRAs... let's talk about what ComfyUI actually is. I think most people have heard of it. Some people might've seen screenshots. I think fewer people have built very complex workflows. So when you started, Automatic was like the super simple way. What were some of the choices that you made? So, the node workflow — is there anything else that stands out as, like, this was a unique take on how to do image generation workflows?

Comfy [00:26:22]: Well, I feel like back then everyone was trying to make an easy to use interface. So I'm like, well, everyone's trying to make an easy to use interface.

swyx [00:26:32]: Let's make a hard to use interface.

Comfy [00:26:37]: Like, I don't need to do that, everyone else is doing it. So let me try something else: let me try to make a powerful interface that's not easy to use.

swyx [00:26:52]: So there's a sort of node execution engine. Yeah. And it actually has this really good list of features of things you prioritize, right? Like, let me see: re-executing from any parts of the workflow that were changed, asynchronous queue system, smart memory management — all this seems like a lot of engineering.

Comfy [00:27:12]: There's a lot of engineering in the backend, cause I was always focused on making things work locally very well — cause I was using it locally. So there's a lot of thought and work in getting everything to run as well as possible. ComfyUI is actually more of a backend — well, now the frontend is getting a lot more development, but before, I was pretty much only focused on the backend.

swyx [00:27:50]: So v0.1 was only August this year.

Comfy [00:27:54]: With the new frontend. Before, there was no versioning.

swyx [00:27:57]: And so what was the big rewrite for the 0.1 and then the 1.0?

Comfy [00:28:02]: Well, that's more on the frontend side. Cause when I first wrote it, I said: okay, I can do web development, but I don't like doing it. What's the easiest way I can slap a node interface on this? And then I found this JavaScript library.

swyx [00:28:26]: Litegraph?

Comfy [00:28:27]: Litegraph.

swyx [00:28:28]: Usually people will go for, like, React Flow for a flow builder.

Comfy [00:28:31]: But that seemed too complicated. I didn't really want to spend time developing the frontend, so I'm like: well, oh, Litegraph — this has the whole node interface. Okay, let me just plug that into my backend.

swyx [00:28:49]: I feel like if Streamlit or Gradio offered something like that, you would have used Streamlit or Gradio, cause it's Python.

Comfy [00:28:54]: Yeah.

Comfy [00:29:14]: Yeah, it takes your frontend logic and your backend logic and just sticks them together.

swyx [00:29:20]: It's supposed to be easy for you guys. If you're a Python main — you know, I'm a JS main, right? If you're a Python main, it's supposed to be easy.

Comfy [00:29:26]: Yeah, it's easy, but it makes your whole software a huge mess.

swyx [00:29:30]: I see, I see. So you're mixing concerns instead of separating concerns?

Comfy [00:29:34]: Well, it's because... frontend and backend should be well separated, with a defined API. That's how you're supposed to do it — smart people disagree. It just sticks everything together, and that makes it easy to end up with a huge mess. And also there's a lot of issues with Gradio. It's very good if all you want to do is slap a quick interface on your ML project, to show it off — that's what it's made for, and there's no problem using it that way. Like, oh, I have my code, I just want a quick interface on it: that's perfect, use Gradio. But if you want to make something that's real software, that will last a long time and be easy to maintain, then I would avoid it.

swyx [00:30:32]: So your criticism is Streamlit and Gradio are the same. I mean, those are the same criticisms.

Comfy [00:30:37]: Yeah, Streamlit I haven't used as much. I just looked a bit.

swyx [00:30:43]: Similar philosophy.

Comfy [00:30:44]: Yeah, it's similar. It just seems to me like, okay, for quick AI demos, it's perfect.
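That "well separated with a defined API" point is visible in how ComfyUI's backend is driven: a frontend (or any script) just POSTs a JSON node graph to the server. A minimal sketch, assuming a default local install on port 8188 and a checkpoint file name that actually exists in your models folder:

```python
import json
import urllib.request

# Each node: an id, a class_type, and inputs that are either literals or
# references to another node's output as [node_id, output_index].
workflow = {
    "1": {"class_type": "CheckpointLoaderSimple",
          "inputs": {"ckpt_name": "v1-5-pruned-emaonly.safetensors"}},
    "2": {"class_type": "CLIPTextEncode",
          "inputs": {"text": "a photo of a cat", "clip": ["1", 1]}},
    "3": {"class_type": "CLIPTextEncode",
          "inputs": {"text": "", "clip": ["1", 1]}},          # negative prompt
    "4": {"class_type": "EmptyLatentImage",
          "inputs": {"width": 512, "height": 512, "batch_size": 1}},
    "5": {"class_type": "KSampler",
          "inputs": {"model": ["1", 0], "positive": ["2", 0], "negative": ["3", 0],
                     "latent_image": ["4", 0], "seed": 0, "steps": 20, "cfg": 7.0,
                     "sampler_name": "euler", "scheduler": "normal", "denoise": 1.0}},
    "6": {"class_type": "VAEDecode",
          "inputs": {"samples": ["5", 0], "vae": ["1", 2]}},
    "7": {"class_type": "SaveImage",
          "inputs": {"images": ["6", 0], "filename_prefix": "api_test"}},
}

req = urllib.request.Request(
    "http://127.0.0.1:8188/prompt",                  # default local ComfyUI server
    data=json.dumps({"prompt": workflow}).encode(),
    headers={"Content-Type": "application/json"},
)
print(urllib.request.urlopen(req).read())            # response includes a prompt_id
```

In practice you would export this JSON from the UI ("Save (API Format)") rather than write it by hand; the point is that the backend is a plain HTTP service with no frontend logic mixed in.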
swyx [00:30:51]: Yeah. Going back to the core tech — asynchronous queues, partial re-execution, smart memory management — you know, anything that you were very proud of, or that was very hard to figure out?

Comfy [00:31:00]: Yeah. The thing that's the biggest pain in the ass is probably the memory management.

swyx [00:31:05]: Were you just paging models in and out, or?

Comfy [00:31:08]: Before, it was just: load the model, completely unload it. That works well when your models are small, but if your models are big — let's say someone has a 4090 and the model size is 10 gigabytes — it can take a few seconds to load and unload. So you want to try to keep things in GPU memory as much as possible. What ComfyUI does right now is it tries to estimate: okay, you're going to sample this model, it's going to take probably this amount of memory — let's remove already-loaded models until that amount of memory is free, and then execute. There's a fine line there, because you try to remove the least amount of models that are already loaded. One other problem is the NVIDIA driver on Windows: there's an option to disable the feature, but by default, if you start loading and overflow your GPU memory, the driver automatically starts paging to system RAM. The problem with that is it makes everything extremely slow. So when you see people complaining — "oh, this model works, but oh, s**t, it starts slowing down a lot" — that's probably what's happening. So basically you have to use as much memory as possible, but not too much, or else things start slowing down or people run out of memory, and you have to find that line where the driver on Windows starts paging. And the problem with PyTorch is it's high-level; it doesn't give you that much fine-grained control over specific memory stuff, so you kind of have to leave the memory freeing to Python and PyTorch, which can be annoying sometimes.
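For intuition, here is a toy sketch of the kind of bookkeeping being described: an eviction loop that frees the least-recently-used models until there is enough VRAM headroom. The function names and policy are assumptions for illustration, not ComfyUI's actual implementation:

```python
import torch

loaded_models = []  # least recently used first; elements are (name, model) pairs

def free_memory(bytes_needed: int, device: torch.device) -> None:
    """Unload LRU models until at least `bytes_needed` of VRAM is free.

    Evict as little as possible: reloading a 10 GB checkpoint over PCIe
    takes seconds, so already-loaded models are kept whenever they fit.
    """
    while loaded_models:
        free, _total = torch.cuda.mem_get_info(device)
        if free >= bytes_needed:
            return                            # enough headroom; stop evicting
        name, model = loaded_models.pop(0)    # evict the least-recently-used model
        model.to("cpu")                       # move its weights back to system RAM
        del model
        torch.cuda.empty_cache()              # ask the allocator to release blocks

def load_for_sampling(name, model, estimated_bytes: int, device: torch.device):
    free_memory(estimated_bytes, device)
    model.to(device)
    loaded_models.append((name, model))       # now the most recently used
```

A real implementation would also leave slack below the point where the Windows NVIDIA driver starts silently paging VRAM to RAM — the "fine line" mentioned above — since overshooting doesn't crash, it just makes everything crawl.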
swyx [00:33:32]: So, you know, I think one thing is, as a maintainer of this project, you're designing for a very wide surface area of compute — you even support CPUs.

Comfy [00:33:42]: Yeah, well, that's just... PyTorch supports CPUs, so that's not hard to support.

swyx [00:33:50]: First of all, is there a market share estimate? Like, is it 70% NVIDIA, 30% AMD, and then miscellaneous on Apple Silicon, or whatever?

Comfy [00:33:59]: For Comfy? Yeah, I don't know the market share.

swyx [00:34:03]: Can you guess?

Comfy [00:34:04]: I think it's mostly NVIDIA. Because AMD works horribly on Windows. On Linux, it works fine — it's slower than the price-equivalent NVIDIA GPU, but it works: you can use it, you can generate images, everything works. On Windows, you might have a hard time. That's the problem, and I think most people who bought AMD probably use Windows, and they probably aren't going to switch to Linux. So, until AMD actually ports their ROCm to Windows properly — and then there's PyTorch; I think they're in the process of doing that — but until they get a good PyTorch ROCm build that works on Windows, they're going to have a hard time.

Alessio [00:35:06]: We got to get George on it. Yeah. Well, he's trying to get Lisa Su to do it, but... Let's talk a bit about the node design. So, unlike all the other text-to-image tools, you go very deep: you have a separate node for CLIP text encode, you have a separate node for the KSampler, you have all these nodes. Going back to making it easy versus making it hard: how much do people actually play with all the settings, you know? And how do you guide people — like, hey, this is actually going to be very impactful, versus this is maybe less impactful but we still want to expose it to you?

Comfy [00:35:40]: Well, I try to expose everything. But for things like the samplers, for example, there are four different sampler nodes, which go from easiest to most advanced. If you go with the easy node — the regular sampler node — you have just the basic settings. But if you use the advanced sampler node, or the custom advanced node, you'll see you actually have different nodes for each piece.

Alessio [00:36:19]: I'm looking it up now. Yeah. What are the most impactful parameters that you use? You can have more, but which ones really make a difference?

Comfy [00:36:30]: Yeah, they all do. They all have their own... For example, steps: usually you want steps to be as low as possible, but if you're optimizing your workflow, you lower the steps until the images start deteriorating too much. That's the number of steps you're running the diffusion process, so if you want things to be faster, lower is better. And CFG — you can kind of see that as the contrast of the image: if your image looks too burnt, you can lower the CFG. CFG is how strongly the negative versus positive prompt is weighted, because when you sample a diffusion model, it's basically the positive prediction minus the negative prediction.

swyx [00:37:32]: Contrastive loss. Yeah.

Comfy [00:37:34]: It's positive minus negative, and the CFG does the multiplier.

Alessio [00:37:41]: What are good resources to understand what the parameters do? I think most people start with Automatic, and then they move over, and it's like: seed, steps, CFG, sampler name, scheduler, denoise. Reddit?

Comfy [00:37:53]: Honestly, it's something you should try out yourself. You don't necessarily need to know how it works to know what it does. Because even if you know that CFG is positive minus negative prompt, the only thing that tells you is that at CFG 1.0, the negative prompt isn't applied — which also means sampling is two times faster.
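The formula just described is easy to write down. A generic classifier-free guidance step in PyTorch-style pseudocode — a standard sketch of the technique, not ComfyUI's exact code:

```python
import torch

def cfg_denoise(model, x, sigma, cond, uncond, cfg_scale: float) -> torch.Tensor:
    """One denoising step with classifier-free guidance.

    `cond`/`uncond` are the positive and negative prompt conditionings.
    At cfg_scale == 1.0 the result collapses to the positive prediction
    alone, so the negative pass can be skipped entirely (2x faster).
    """
    if cfg_scale == 1.0:
        return model(x, sigma, cond)         # negative prompt has no effect
    pos = model(x, sigma, cond)              # positive prediction
    neg = model(x, sigma, uncond)            # negative prediction
    return neg + cfg_scale * (pos - neg)     # "positive minus negative, CFG is the multiplier"
```

Setting cfg_scale above 1.0 pushes the sample further along the (positive minus negative) direction, which is why too-high values look "burnt" and over-contrasted.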
But other than that, you should really just see what it does to the images yourself, and you'll probably get a more intuitive understanding of what these things do.

Alessio [00:38:34]: Any other nodes or things you want to shout out? I know AnimateDiff and the IP-Adapter — those are some of the most popular ones. What else comes to mind?

Comfy [00:38:44]: Not nodes, but what I like is when people make things that use ComfyUI as their backend. There's a plugin for Krita that uses ComfyUI as its backend, so you can use all the models that work in Comfy in Krita. I think I've only tried it once, but I know a lot of people use it, and it's probably really nice.

Alessio [00:39:15]: What's the craziest node that people have built, like the most complicated?

Comfy [00:39:21]: Craziest node? I know some people have made video games in Comfy, stuff like that. Like, I remember last year someone made Wolfenstein 3D in Comfy. Of course. And one of the inputs was: you can generate a texture, and it changes the texture in the game — so you can plug the game into the workflow. If you look around, there's a lot of crazy things people do.

Alessio [00:39:59]: And now there's a node registry that people can use to download nodes.

Comfy [00:40:04]: Well, there's always been the ComfyUI Manager, but we're trying to make this more, I don't know, official, with the node registry. Because before the registry — okay, how did your custom node get into ComfyUI Manager? There's the guy running it who, every day, searched GitHub for new custom nodes and added them manually to his manager. So we're trying to make it less effortful for him, basically.

Alessio [00:40:40]: Yeah. But I was looking — I mean, there's a YouTube download node. This is almost like a data pipeline more than an image generation thing at this point. You can get data in, you can apply filters to it, you can generate data out.

Comfy [00:40:54]: Yeah, you can do a lot of different things. I think what I did is I made it easy to make custom nodes, and I think that helped a lot for the ecosystem, because it is very easy to just make a node. A bit too easy sometimes — then we have the issue where there's a lot of custom node packs which share similar nodes. But that's something we're trying to solve, by maybe bringing some of the functionality into the core.
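As an illustration of how little a custom node requires, a minimal sketch following the commonly documented ComfyUI convention (a class with declared inputs and outputs, exported via NODE_CLASS_MAPPINGS); the node itself is a made-up example:

```python
# Save under custom_nodes/ — ComfyUI discovers nodes via NODE_CLASS_MAPPINGS.
class InvertImage:
    @classmethod
    def INPUT_TYPES(cls):
        # Declares the node's input sockets and their types.
        return {"required": {"image": ("IMAGE",)}}

    RETURN_TYPES = ("IMAGE",)   # output socket types
    FUNCTION = "invert"         # method the execution engine calls
    CATEGORY = "examples"       # where the node appears in the add-node menu

    def invert(self, image):
        # IMAGE tensors are [batch, height, width, channels] floats in 0..1,
        # so inverting is a single arithmetic op; outputs are returned as a tuple.
        return (1.0 - image,)

NODE_CLASS_MAPPINGS = {"InvertImage": InvertImage}
NODE_DISPLAY_NAME_MAPPINGS = {"InvertImage": "Invert Image"}
```

The whole contract is a handful of class attributes, which is why the ecosystem grew so fast — and why many node packs ended up reimplementing similar nodes.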
Alessio [00:41:36]: And then there's video. People can do video generation.

Comfy [00:41:40]: Video — well, the first video model was Stable Video Diffusion, which was exactly last year, I think. Like, one year ago. But that wasn't a true video model.

swyx [00:41:55]: It was, like, moving images?

Comfy [00:41:57]: It generated video, but what I mean is it's still 2D latents. What they did is they took SD2, added some temporal attention to it, and then trained it on videos. So it's kind of like AnimateDiff — same idea, basically. Why I say it's not a true video model is that you still have the 2D latents. A true video model, like Mochi, for example, has 3D latents.

Alessio [00:42:32]: Which means you can move through the space, basically. That's the difference — you're not just kind of reorienting.

Comfy [00:42:39]: And it's also because you have a temporal VAE. Mochi has a temporal VAE that compresses in the temporal direction, too. That's something you don't have with AnimateDiff and Stable Video Diffusion — they only compress spatially, not temporally. So that's why I call those true video models. There are actually a few of them, but the one I've implemented in Comfy is Mochi, because that seems to be the best one so far.
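To make the 2D-versus-3D latent distinction concrete, a schematic comparison — all shapes and compression factors here are illustrative, not any specific model's:

```python
import torch

num_frames = 24

# "2D latents" (AnimateDiff / Stable Video Diffusion style): the VAE only
# compresses spatially, so a clip is a stack of independent per-frame latents.
latents_2d = torch.randn(num_frames, 4, 64, 64)       # [T, C, H, W], T uncompressed

# "3D latents" (Mochi style): a temporal VAE compresses along time as well,
# so the whole clip becomes one spatio-temporal volume with T' < T.
t_compress = 6                                        # illustrative temporal factor
latents_3d = torch.randn(1, 12, num_frames // t_compress, 64, 64)  # [B, C, T', H, W]
```

The temporal axis inside the latent is what lets the diffusion model reason about motion natively, instead of bolting temporal attention onto per-frame 2D latents.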
swyx [00:43:15]: We had AJ come and speak at the Stable Diffusion meetup. The other open one I think I've seen is CogVideo.

Comfy [00:43:21]: CogVideo. Yeah, that one also seems decent.

swyx: Chinese, so we don't use it.

Comfy: No, it's fine. It's just that it's not the only one — there are also a few others.

swyx [00:43:36]: The rest are, like, closed source, right? Like Kling.

Comfy [00:43:39]: Closed source, there's a bunch of them. But I mean open ones — I've seen a few of them. I can't remember their names, but there's CogVideo, the big one. Then there are also a few that released at the same time. There's one that released on the same day as SD 3.5, which is why I don't remember the name.

swyx [00:44:02]: We should have a release schedule so we don't conflict on each of these things.

Comfy [00:44:06]: I think SD 3.5 and Mochi released on the same day, so everything else was completely drowned out. For some reason, lots of people picked that day to release their stuff.

Comfy [00:44:21]: Which is, well, a shame for those. And I think Omnijet also released the same day, which also seems interesting.

Alessio [00:44:30]: What's Comfy? So you are Comfy, and then there's comfy.org. I know we do a lot of things with, like, Nous Research, and those guys also have kind of a more open source thing going on. How do you work? You mentioned you mostly work on the core piece of it, and then what...?

Comfy [00:44:47]: Maybe I should fill that in, because I feel like I only explained part of the story. So, basically: January 16, 2023 — that's when ComfyUI was first released to the public. Then I did a Reddit post about the area composition thing somewhere around, I don't remember exactly, maybe end of January, beginning of February. And then a YouTuber, Olivio, made a video about ComfyUI in March 2023 — I think that's when there was a real burst of attention. By that time I was continuing to develop it, and people were starting to use it more, which unfortunately meant that, although I had first written it to do experiments, my time to do experiments started going down, because people were actually using it. So I said, well, time to add all these features and stuff. Yeah, and then I got hired by Stability in June 2023. They hired me because they wanted SDXL to work very well with the UI, because they were experimenting with Comfy in-house. Actually, how the SDXL release worked is: for some reason they released the code first, but they didn't release the model checkpoint. So once the code was out, I implemented support for it in ComfyUI. And then the checkpoints were basically early access — people had to sign up, and they mostly allowed people with edu emails. If you had an edu email, they gave you access to SDXL 0.9. And, well, that leaked. Right. Of course — of course it's going to leak if you do that. And the only way people could easily use it was with Comfy, so people started using it. Then I fixed a few of the issues people had. Then the big 1.0 release happened, and ComfyUI was the only way a lot of people could actually run it on their computers, because Automatic's quick implementation was so inefficient that for most people it just wouldn't work. So people were forced to use ComfyUI, and that's how it became popular — people had no choice.

swyx [00:47:55]: The growth hack.

Comfy [00:47:57]: Yeah. Like, everywhere — people who didn't have a 4090, who had just regular GPUs, didn't have a choice.

Alessio [00:48:05]: So yeah, I got a 4070 — so think of me. And so today, is there a core Comfy team, or?

Comfy [00:48:13]: Uh, yeah, well, right now we are hiring. Actually, right now the core itself is me. But all the focus has been mostly on the frontend right now, because that's the thing that was neglected for a long time. So most of the focus right now is on the frontend, but we will soon get more people to help me with the actual backend stuff. Once we have our V1 release — which will be ComfyUI packaged, with the nice interface, and easy to install on Windows and hopefully Mac — we're going to have lots of stuff to do on the backend side and also the frontend side.

Alessio [00:49:14]: What's the release? I'm on the waitlist. What's the timing?

Comfy [00:49:18]: Soon. Uh, soon. Yeah, I don't want to promise a release date — we do have a release date we're targeting, but I'm not sure if it's public. And we're still going to continue doing the open source, making ComfyUI the best way to run stable diffusion models. At least on the open source side, it's going to be the best way to run models locally. But we will have a few things to make money from it, like cloud inference or that type of thing, and maybe some things for enterprises.

swyx [00:50:08]: I mean, a few questions on that. How do you feel about the other Comfy startups?

Comfy [00:50:11]: I mean, I think it's great.

swyx: They're using your name.

Comfy: Yeah, well, it's better they use Comfy than something else. It's fine. We want people to use Comfy. Like I said, it's better that people use Comfy than something else.
So as long as they use Comfy, I think it helps the ecosystem. Because even if they don't contribute directly, the fact that they are using Comfy means that people are more likely to join the ecosystem. So, yeah.

swyx [00:50:57]: And then would you ever do text?

Comfy [00:50:59]: Yeah, well, you can already do text with some custom nodes. It's something I've wanted to eventually add to core, but it's not a very high priority — even though a lot of people use text for prompt enhancement and other things like that. My focus has always been on diffusion models. Unless some text diffusion model comes out.

swyx [00:51:30]: Yeah, David Holz is investing a lot in text diffusion.

Comfy [00:51:34]: Yeah, well, if a good one comes out, then we'll probably implement it, since it fits with the whole...

swyx [00:51:39]: Yeah, I mean, I imagine it's going to be closed source at Midjourney.

Comfy [00:51:43]: Well, if an open one comes out, then I'll probably implement it.

Alessio [00:51:54]: Cool, Comfy. Thanks so much for coming on. This was fun. Bye.

Get full access to Latent Space at www.latent.space/subscribe

The Morning Show w/ John and Hugh
Why Notre Dame might try to replicate what Georgia Tech did vs. Georgia

The Morning Show w/ John and Hugh

Play Episode Listen Later Jan 2, 2025 7:01


Tiffany, Mike, and Beau preview Georgia's College Football Playoff Quarterfinal matchup with Notre Dame this afternoon. The Morning Shift crew also lets you hear Kirby Smart talk about how Notre Dame is built like an SEC team, and how all the players play harder for Gunner Stockton.

Mad Radio
Patriots replicate a Texans Wild Draft Day

Mad Radio

Play Episode Listen Later Jan 1, 2025 13:01


Could we see another team make a wild draft day trade that brings in two top draft prospects?

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

Applications for the 2025 AI Engineer Summit are up, and you can save the date for AIE Singapore in April and AIE World's Fair 2025 in June. Happy new year, and thanks for 100 great episodes! Please let us know what you want to see/hear for the next 100!

Full YouTube Episode with Slides/Charts. Like and subscribe and hit that bell to get notifs!

Timestamps

* 00:00 Welcome to the 100th Episode!
* 00:19 Reflecting on the Journey
* 00:47 AI Engineering: The Rise and Impact
* 03:15 Latent Space Live and AI Conferences
* 09:44 The Competitive AI Landscape
* 21:45 Synthetic Data and Future Trends
* 35:53 Creative Writing with AI
* 36:12 Legal and Ethical Issues in AI
* 38:18 The Data War: GPU Poor vs. GPU Rich
* 39:12 The Rise of GPU Ultra Rich
* 40:47 Emerging Trends in AI Models
* 45:31 The Multi-Modality War
* 01:05:31 The Future of AI Benchmarks
* 01:13:17 Pionote and Frontier Models
* 01:13:47 Niche Models and Base Models
* 01:14:30 State Space Models and RWKV
* 01:15:48 Inference Race and Price Wars
* 01:22:16 Major AI Themes of the Year
* 01:22:48 AI Rewind: January to March
* 01:26:42 AI Rewind: April to June
* 01:33:12 AI Rewind: July to September
* 01:34:59 AI Rewind: October to December
* 01:39:53 Year-End Reflections and Predictions

Transcript

[00:00:00] Welcome to the 100th Episode!

[00:00:00] Alessio: Hey everyone, welcome to the Latent Space Podcast. This is Alessio, partner and CTO at Decibel Partners, and I'm joined by my co-host Swyx for the 100th time today.

[00:00:12] swyx: Yay, um, and we're so glad that, yeah, you know, everyone has, uh, followed us in this journey. How do you feel about it? 100 episodes.

[00:00:19] Alessio: Yeah, I know.

[00:00:19] Reflecting on the Journey

[00:00:19] Alessio: Almost two years that we've been doing this. We've had four different studios. Uh, we've had a lot of changes. You know, we used to do this lightning round when we first started, that we didn't like, and we tried to change the question.

[00:00:32] swyx: The answer was Cursor and Perplexity.

[00:00:34] Alessio: Yeah, I love Midjourney. It's like, do you really not like anything else? Like, what's the unique thing? And I think, yeah, we've also had a lot more research-driven content. You know, we had like Tri Dao, we had Jeremy Howard, we had more folks like that.

[00:00:47] AI Engineering: The Rise and Impact

[00:00:47] Alessio: I think we want to do more of that too in the new year, like having, uh, some of the Gemini folks, both on the research and the applied side. Yeah, but it's been a ton of fun. I think we both started — I wouldn't say as a joke — we were kind of like, oh, we should do a podcast. And I think we kind of caught the right wave, obviously. And I think your Rise of the AI Engineer post just kind of gave people somewhere to congregate, and then the AI Engineer Summit. And that's why when I look at our growth chart, it's kind of like a proxy for the AI engineering industry as a whole — even if we don't do that much, we keep growing just because there's so many more AI engineers. So did you expect that growth, or did you expect it would take longer for the AI engineer thing to become, you know, what everybody talks about today?

[00:01:32] swyx: So, the sign that we have won is that Gartner puts it at the top of the hype curve right now. So Gartner has called the peak in AI engineering. I did not expect, um, to what level.
I knew that I was correct when I called it, because I did like two months of work going into that. But I didn't know, you know, how quickly it could happen, and obviously there's a chance that I could be wrong. But I think most people have come around to that concept. Hacker News hates it, which is a good sign. But there's enough people that have defined it — you know, GitHub, when they launched GitHub Models, which is the Hugging Face clone, they put AI engineers in the banner, above the fold, in big letters. So I think it's kind of arrived as a meaningful and useful definition. I think people are trying to figure out where the boundaries are. I think that was a lot of the quote-unquote drama that happened behind the scenes at the World's Fair in June, because there's a lot of doubt or questions about where ML engineering stops and AI engineering starts. That's a useful debate to be had. In some sense, I actually anticipated that as well, so I intentionally did not put a firm definition there, because most of the successful definitions are necessarily underspecified, and it's actually useful to have different perspectives — you don't have to specify everything from the outset.

[00:02:45] Alessio: Yeah, I was at AWS re:Invent, and the line to get into the AI engineering talk, so to speak — which is, you know, applied AI and whatnot — was like hundreds of people, just in line to go in. I think that's kind of what it enabled: more people, right? Which is what you kind of talked about — it's like, hey, look, you don't actually need a PhD, just use the model. And then maybe we'll talk about some of the blind spots that you get as an engineer, with the earlier posts that we also had on the Substack. But yeah, it's been a heck of a two years.

[00:03:14] swyx: Yeah.

[00:03:15] Latent Space Live and AI Conferences

[00:03:15] swyx: You know, I was trying to view the conference as like — so NeurIPS is, I think, 16 or 17,000 people, and the Latent Space Live event that we held there was 950 signups, I think. The AI world, the ML world, is still very much research-heavy. And that's as it should be, because ML is very much in a research phase. But as we move this entire field into production, I think that ratio inverts into becoming more engineering-heavy. So at least I think engineering should be on the same level, even if it's never as prestigious — like, it'll always be low status, because at the end of the day you're manipulating APIs, or whatever — yeah, wrapping GPTs. But there's going to be an increasing stack and an art to doing these things well. And, you know, I think that's what we're focusing on for the podcast, the conference, and basically everything I do. And I think we'll talk about the trends here that apply. It's just very strange: there's a mix of keeping on top of research while not being a researcher, and then putting that research into production. So people always ask me, like, why are you covering NeurIPS?
Like, this is an ML research conference, and I'm like: well, yeah, I mean, we're not going to understand everything or reproduce every single paper, but the stuff that is being found here is going to make it through into production at some point, you hope. And then actually, when I talk to the researchers, they get very excited, because they're like, oh, you guys actually care about how this goes into production, and that's what they really, really want. The measure of success previously was just peer review, right? Getting 7s and 8s at their academic review conferences. Stuff like citations is one metric, but money is a better metric.

[00:04:51] Alessio: Money is a better metric. Yeah, and there were about 2,200 people on the live stream or something like that — yeah, 2,200 on the live stream. So I try my best to moderate, but it was a lot spicier in person with Jonathan and Dylan than it was in the chat on YouTube.

[00:05:06] swyx: I would say that I actually also created Latent Space Live in order to address flaws that are perceived in academic conferences. This is not NeurIPS-specific; it's ICML, NeurIPS. Basically, it's very oriented towards the PhD student job market, right? Like, literally, basically everyone's there to advertise their research and skills and get jobs, and then obviously all the companies go there to hire them. And I think that's great for the individual researchers, but for people going there to get info, it's not great, because you have to read between the lines and bring a ton of context in order to understand every single paper. So what is missing is effectively what I ended up doing, which is: domain by domain, go through and recap the best of the year, survey the field. And — like, I think ICML had a position paper track, NeurIPS added a benchmarks and datasets track. These are ways to address that issue. There are always workshops as well; every conference has a last day of workshops and stuff that provide more of an overview. But they're not specifically prompted to do so. And I think really, organizing a conference is just about getting good speakers and giving them the correct prompts, and then they will go and do that thing, and they do a very good job of it. So I think Sarah did a fantastic job with the startups prompt. I can't list everybody, but we did best of 2024 in startups, vision, open models, post-transformers, synthetic data, small models, and agents. And then we also did a quick one on reasoning with Nathan Lambert. And then the last one, obviously, was the debate that people were very hyped about. It was very awkward. And I'm really, really thankful for Jonathan Frankle, basically, who stepped up to challenge Dylan. Because Dylan was like, yeah, I'll do it — but he was pro-scaling. And I think everyone who is in AI is pro-scaling, right? So you need somebody who's ready to publicly say, no, we've hit a wall. That means you're saying Sam Altman's wrong; you're saying, um, you know, everyone else is wrong. It helps that this was the day before Ilya went up on stage and said pre-training has hit a wall and data has hit a wall.
So actually Jonathan ended up winning, and then Ilya supported that statement, and then Noam Brown on the last day further supported that statement as well.

[00:07:17] swyx: So it's kind of interesting that the consensus going in was that we're not done scaling — like, you should believe in the Bitter Lesson. And then, four straight days in a row, you had Sepp Hochreiter — who is the creator of the LSTM, along with everyone's favorite OG in AI, Juergen Schmidhuber — saying that, um, we've pre-trained into a wall, or like, we've run into a different kind of wall. And then we have, you know, Jonathan Frankle, Ilya, and then Noam Brown all saying variations of the same thing: that we have hit some kind of wall in the status quo of what scaling large pre-trained models has looked like, and we need a new thing. And obviously the new thing for people is — either people are calling it inference-time compute or test-time compute. I think the collective terminology has been inference time, and I think that makes sense, because calling it test time has a very pre-training bias, implying that the only reason for running inference at all is to test your model. That is not true. Right. Yeah. So I quite agree that OpenAI seems to have adopted — or the community seems to have adopted — this terminology of ITC instead of TTC. And that makes a lot of sense, because now we care about inference, even right down to compute optimality. Like, I actually interviewed an author who revisited the Chinchilla paper. The Chinchilla paper is compute-optimal training, but what is not stated in there is that it's pre-training compute-optimal training. And once you start caring about inference, compute-optimal training has a different scaling law — in a way that we did not know last year.

[00:08:45] Alessio: I wonder — because Jonathan is also on the side of attention is all you need, like he had the bet with Sasha. So I'm curious: he doesn't believe in scaling, but he thinks the transformer... I wonder if he's still...

[00:08:56] swyx: So, obviously everything is nuanced, and you know, I told him to play a character for this debate, right? So he actually does — he still believes that we can scale more. He just assumed the character to be very game for playing this debate. So even more kudos to him that he assumed a position that he didn't believe in and still won the debate.

[00:09:16] Alessio: Get rekt, Dylan. Um, do you just want to quickly run through some of these things? Like, Sarah's presentation, just the highlights.

[00:09:24] swyx: Yeah, we can't go through everyone's slides, but I pulled out some things as a factor of stuff that we were going to talk about, and we'll publish the rest. Yeah, we'll publish on this feed the best of 2024 in those domains, and hopefully people can benefit from the work that our speakers have done. But I think these are just good slides, and I've been looking for sort of end-of-year recaps from people.

[00:09:44] The Competitive AI Landscape

[00:09:44] swyx: The field has progressed a lot. You know, I think the max ELO in 2023 on LMSys used to be 1200.
And now everyone is at least at 1275 in their ELOs, and this is across Gemini, ChatGPT, Grok, 01.AI with their Yi-Large model, and Anthropic, of course. It's a very, very competitive race. There are multiple frontier labs all racing, but there is a clear tier-zero frontier, and then there's like a tier one — it's like, I wish I had everything else. Tier zero is extremely competitive; it's effectively now a three-horse race between Gemini, Anthropic, and OpenAI. I would say that people are still holding out a candle for xAI. xAI, I think, for some reason — because their API was very slow to roll out — is not included in these metrics, so it's actually quite hard to put them on there. As someone who also does charts, xAI is continually snubbed because they don't work well with the benchmarking people. Yeah, yeah, yeah — it's a little trivia for why xAI always gets ignored. The other thing is market share. So these are slides from Sarah; we have it up on the screen. It has gone from very heavily OpenAI — so we have some numbers and estimates, these are from Ramp: estimates of OpenAI market share in December 2023. And this is basically, what is it, GPT being 95 percent of production traffic. And I think if you correlate that with stuff that we asked Harrison Chase on the LangChain episode, it was true. And then Claude 3 launched middle of this year — I think Claude 3 launched in March, Claude 3.5 Sonnet was in June-ish — and you can start seeing the market share shift towards Anthropic very, very aggressively. The more recent one is Gemini. So if I scroll down a little bit, this is an even more recent dataset — Ramp's dataset ends in September 2024. Gemini has basically launched a price war at the low end with Gemini Flash, being basically free for personal use. Like, I think people don't understand the free tier: it's something like a billion tokens per day. Unless you're trying to abuse it, you cannot really exhaust your free tier on Gemini. They're really trying to get you to use it. They know they're in like third place — fourth place, depending how you count — and so they're going after the lower tier first, and then maybe the upper tier later. But yeah, Gemini Flash, according to OpenRouter, is now 50 percent of OpenRouter requests. Obviously, these are the small, cheap requests that are mathematically going to be more numerous. The smart ones obviously are still going to OpenAI. But, you know, it's a very, very big shift in the market: basically going from 2023 into 2024, OpenAI has gone from 95 percent market share to somewhere between 50 and 75 percent market share.

[00:12:29] Alessio: Yeah, I'm really curious how Ramp does the attribution to the model — if it's API, because I think it's all credit card spend.

[00:12:32] swyx: Well, but the credit card doesn't say. Maybe when they do expenses, they upload the PDF.

[00:12:38] Alessio: But yeah, the Gemini one I think makes sense. I think that was one of my main 2024 takeaways: that the best small model companies are the large labs, which is not something I would have thought — I thought the open source long tail would be where the small models come from.

[00:12:53] swyx: Yeah, different sizes of small models we're talking about here, right?
Like, the small model here for Gemini is 8B, right? And mini — we don't know what that small model size is, but it's probably in the double digits, or maybe single digits, but probably double digits. The open source community has kind of focused on the one-to-three-B size.

[00:13:12] swyx: Maybe 0.5B — that's Moondream — and if that is small for you, then that's great. It makes sense that we have a range for small now, which is maybe one to five B — I'll even put that at the high end — and so this includes Gemma from Gemini, but also includes the Apple Foundation models, which I think are 3B.

[00:13:32] Alessio: Yeah, no, that's great. I mean, at the start, small just meant cheap. I think today small is actually a more nuanced discussion, you know, that people weren't really having before.

[00:13:43] swyx: Yeah, we can keep going. This is a slide that I slightly disagree with Sarah on. She's pointing to the Scale SEAL leaderboard. I think the researchers that I talked with at NeurIPS were kind of positive on this, because basically you need private test sets to prevent contamination, and Scale is one of maybe three or four people this year that have really made an effort in doing a credible private-test-set leaderboard. Llama 405B does well compared to Gemini and GPT-4o, and I think that's good. I would say that, you know, it's good to have an open model that is that big and does well on those metrics. But anyone putting 405B in production will tell you — if you scroll down a little bit to the Artificial Analysis numbers — that it is very slow and very expensive to infer. It doesn't even fit on one node of H100s. Cerebras will be happy to tell you they can serve 405B on their super large chips. But if you need to do anything custom to it, you're still kind of constrained. So, is 405B really that relevant? Like, I think most people are basically saying that they only use 405B as a teacher model to distill down to something. Even Meta is doing it: when Llama 3.3 launched, they only launched the 70B, because they used 405B to distill the 70B.
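For readers unfamiliar with teacher-model distillation as mentioned here, a generic logit-distillation sketch — standard Hinton-style knowledge distillation, not Meta's actual recipe:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      temperature: float = 2.0) -> torch.Tensor:
    """KL divergence between softened teacher and student distributions.

    The big model (e.g. a 405B teacher) provides soft targets that the
    smaller student (e.g. a 70B) is trained to match, token by token.
    """
    t = temperature
    teacher_probs = F.softmax(teacher_logits / t, dim=-1)
    student_logp = F.log_softmax(student_logits / t, dim=-1)
    # batchmean KL, scaled by t^2 so gradient magnitudes match the hard loss
    return F.kl_div(student_logp, teacher_probs, reduction="batchmean") * t * t

# Toy usage: vocab of 8, batch of 2 sequences of length 3.
student = torch.randn(2, 3, 8, requires_grad=True)
teacher = torch.randn(2, 3, 8)
loss = distillation_loss(student, teacher)
loss.backward()
```

In practice this term is usually mixed with the normal next-token cross-entropy loss; the teacher only needs to run inference, which is why one huge model can cheaply improve many smaller ones.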
So I don't know if open source is keeping up. I think the open source industrial complex is very invested in telling you that the gap is narrowing. I kind of disagree; I think that the gap is widening with o1. There are very, very smart people trying to narrow that gap, and they should — I really wish them success — but you cannot use a chart that is nearing 100 as your saturation chart. And look, the distance between open source and closed source is narrowing — of course it's going to narrow, because you're near 100. This is stupid. But in metrics that matter, is open source narrowing? Probably not for o1 for a while. And it's really up to the open source guys to figure out if they can match o1 or not.

[00:15:46] Alessio: I think inference-time compute is bad for open source, just because, you know, Zuck can donate the flops at training time, but he cannot donate the flops at inference time. So it's really hard to actually keep up on that axis. Big, big business model shift. So I don't know what that means for the GPU clouds, I don't know what that means for the hyperscalers, but obviously the big labs have a lot of advantage. Because, like, it's not a static artifact that you're putting the compute into — you're kind of still doing that, but then you're putting a lot of compute into inference, too.

[00:16:17] swyx: Yeah, yeah, yeah. Um, I mean, Llama 4 will be reasoning-oriented. We talked with Thomas Scialom — kudos for getting that episode together, that was really nice, good, well timed. Actually, I connected with the Meta AI guy, uh, at NeurIPS, and, um, yeah, we're going to coordinate something for Llama 4.

[00:16:32] Alessio: Yeah, and our friend Clara Shih just joined to lead the business agent side, so I'm sure we'll have her on in the new year.

[00:16:39] swyx: Yeah. So, um, my comment on the business model shift — this is super interesting. Apparently it is wide knowledge that OpenAI wanted more than $6.6 billion for their fundraise. They wanted to raise higher, and they did not. And what that means is basically — it's very convenient that we're not getting GPT-5, which would have been a larger pre-train: you need a lot of upfront money. And instead we're converting fixed costs into variable costs, right, and passing them on effectively to the customer. And it's so much easier to take margin there, because you can directly attribute it: oh, you're using this more, therefore you pay more of the cost, and I'll just slap a margin in there. So that lets you control your gross margin and tie your inference spend accordingly. And it's just really interesting that this change in the inference paradigm has arrived exactly at the same time that the funding environment for pre-training is effectively drying up, kind of. I feel like maybe the VCs are very in tune with research anyway, so they would have noticed this, but, um, it's just interesting.

[00:17:43] Alessio: Yeah, and I was looking back at our yearly recap of last year, and the big thing was the Mixtral price fights, you know. And I think now it's almost like there's nowhere to go — like, you know, Gemini Flash is basically giving it away for free. So I think this is a good way for the labs to generate more revenue, and pass down some of the compute to the customer.

[00:18:02] swyx: I think they're going to keep going. I think that, you know, the $2,000 tier will come.

[00:18:05] Alessio: Yeah, I know. Totally. I mean, next year, the first thing I'm doing is signing up for Devin. Signing up for ChatGPT Pro. Just to try — I just want to see what it looks like to spend a thousand dollars a month on AI.

[00:18:17] swyx: Yes, yes. I think if your job is, at least, AI content creator, or VC, or, you know, someone whose job it is to stay on top of things, you should already be spending like a thousand dollars a month on stuff. And then, obviously, easy to spend, hard to use — you have to actually use it. The good thing is that Google actually lets you do a lot of stuff for free now. Like Deep Research, that they just launched: it uses a ton of inference, and it's free while it's in preview.

[00:18:45] Alessio: Yeah, they need to put that in Lindy. I've been using Lindy lately; I built a bunch of things once they had flows, because I liked the new thing. It's pretty good. I even did a phone call assistant. Um, yeah, they just launched Lindy Voice.
Yeah, I think once they get advanced-voice-mode-like capability — today it's still speech-to-text, you can kind of tell — um, but it's good for reservations and things like that. So I have a meeting-prepper thing.

[00:19:13] swyx: It's good. Okay. I feel like we've covered a lot of stuff. Uh, I think we will go over the individual talks in a separate episode; I don't want to take too much time with this stuff. But suffice to say that there is a lot of progress in each field. We covered vision. Basically, this was all the audience voting for what they wanted, and then I just invited the best people I could find in each area — especially agents. Um, Graham, who I talked to at ICML in Vienna — he is currently still number one. It's very hard to stay on top of SWE-Bench. OpenHands is currently still number one on SWE-Bench Full, which is the hardest one. He had very good thoughts on agents, which I'll highlight for people. Everyone is saying 2025 is the year of agents, just like they said last year. But he had thoughts on, like, eight parts of what the frontier problems to solve in agents are, and so I'll highlight that talk as well.

[00:20:05] Alessio: Yeah. Number six — which is about having agents learn more about the environment — has been super interesting to us as well, just to think through. Because, yeah, how do you put an agent in an enterprise, where most things in an enterprise have never been public? You know, a lot of the tooling, like the code bases and things like that. So there's no indexing and RAG — well, yeah, but it's more like: you can't really RAG things that are not documented. But people know them based on how they've been doing them, you know? So I think there's almost this institutional knowledge — yeah, the boring word is kind of like business process extraction. It's like, how do you actually understand how these things are done? Um, and I think today the problem is that the agents most people are building are good at following instructions, but are not as good at extracting them from you. Um, so I think that will be a big unlock. Just to touch quickly on the Jeff Dean thing: I thought it was pretty — I mean, we'll link it in the show notes — but I think the main focus was, how do you use ML to optimize the systems, instead of just focusing on ML to do something else? Yeah, I think speculative decoding — we had, you know, Eugene from RWKV on the podcast before; he's doing a lot of that with Featherless AI.

[00:21:12] swyx: Everyone is. I would say it's the norm. I'm a little bit uncomfortable with how much it costs, because it does use more of the GPU per call. But because everyone is so keen on fast inference, then yeah, it makes sense.

[00:21:24] Alessio: Exactly. Um, yeah, but we'll link that. Obviously Jeff is great.

[00:21:30] swyx: Jeff's talk was more — it wasn't focused on Gemini. I think people got the wrong impression from my tweet. It's more about how Google approaches ML and uses ML to design systems, and then systems feed back into ML.
And I think this ties in with Loubna's talk.

[00:21:45] Synthetic Data and Future Trends

[00:21:45] swyx: On synthetic data, where it's basically the story of bootstrapping of humans and AI in AI research, or AI in production. So her talk was on synthetic data: how much synthetic data has grown in 2024 on the pre-training side, the post-training side, and the eval side. And I think Jeff then also extended it, basically, to chips — to chip design. So he spent a lot of time talking about AlphaChip. And most of us in the audience are like, we're not working on hardware, man. Like, you guys are great, TPU is great — okay, we'll buy TPUs.

[00:22:14] Alessio: And then there was the earlier talk. Yeah. But — and then we have, uh, I don't know if we're calling them essays. What are we calling these?

[00:22:23] swyx: For me, it's just like a bonus for Latent Space supporters, because I feel like they haven't been getting anything. And then I wanted a more high-frequency way to write stuff — like, that one I wrote in an afternoon. I think basically we now have an answer to what Ilya saw. It's one year since the blip. And we know what he saw in 2014, we know what he saw in 2024, and we think we know what he sees next — he gave some hints — and then we have vague indications of what he saw in 2023. And then 2016 as well, because of this lawsuit with Elon: OpenAI is publishing emails from Sam — like, his personal text messages to Shivon Zilis, or whatever. So we have emails from Ilya saying, this is what we're seeing in OpenAI, and this is why we need to scale up GPUs. And I think it's very prescient in 2016 to write that. And so it is basically his insights — it's him and Greg basically driving the scaling up of OpenAI, while they're still playing Dota. They're like, no, we see the path here.

[00:23:30] Alessio: Yeah, and it's funny — they even mention, you know, we can only train on 1v1 Dota; we need to train on 5v5, and that takes too many GPUs.

[00:23:37] swyx: Yeah, and at least for me, I can speak for myself: I didn't see the path from Dota to where we are today. I think even, maybe if you ask them, they wouldn't necessarily draw a straight line.

[00:23:47] Alessio: Yeah, no, definitely. But I think that was the whole idea of almost like the RL — and we talked about this with Nathan on his podcast — it's like, with RL, you can get very good at specific things, but then you can't really generalize as much. And I think the language models are like the opposite: you're going to throw all this data at them and scale them up, but then you really need to drive them home on a specific task later on. And we'll talk about the OpenAI reinforcement fine-tuning announcement too, and all of that. But yeah, I think scale is all you need — that's kind of what Ilya will be remembered for. And I think, just to clarify the "pre-training is over" thing that people love to tweet: I think the point of the talk was that everybody — we're scaling these chips, we're scaling the compute, but the second ingredient, which is data, is not scaling at the same rate. So it's not necessarily that pre-training is over; it's kind of like, what got us here won't get us there.
In his email, he predicted like 10x growth every two years or something like that. And I think maybe now it's like, you know, you can 10x the chips again, but...

[00:24:49] swyx: I think it's 10x per year. Was it? I don't know.

[00:24:52] Alessio: Exactly. And Moore's Law is like 2x — so, you know, much faster than that. And yeah, I like the "fossil fuel of AI" analogy. It's kind of like, you know, the background tokens thing. So the OpenAI reinforcement fine-tuning is basically like: instead of fine-tuning on data, you fine-tune on a reward model. So instead of being data-driven, it's task-driven. And I think people have tasks to do, but they don't really have a lot of data. So I'm curious to see how that changes how many people fine-tune — because I think this is what people run into. It's like, oh, you can fine-tune Llama. And it's like, okay, where do I get the data to fine-tune it on, you know? So it's great that we're moving the thing. And then I really liked that he had this chart where the brain mass and body mass thing is: mammals scaled linearly in brain and body size, and then humans kind of broke off the slope. So it's almost like maybe the mammal slope is the pre-training slope, and the post-training slope is the human one.

[00:25:49] swyx: Yeah. I wonder what the — I mean, we'll know in 10 years, but I wonder what the y-axis is for Ilya's SSI. We'll try to get them on.

[00:25:57] Alessio: Ilya, if you're listening, you're welcome here. Yeah, and then he had, you know, what comes next: like agents, synthetic data, inference compute. I thought all of that was...

[00:26:05] swyx: I don't think he was dropping any alpha there. Yeah, yeah, yeah.

[00:26:07] Alessio: Yeah. Any other NeurIPS highlights?

[00:26:10] swyx: I think that there was comparatively a lot more work. Oh, by the way, I need to plug that my friend Yi made this nice little paper — she called it must-read papers of 2024. So I laid some of these out at NeurIPS, and it was just gone. Like, everyone just picked it up, because people are dying for a little guidance and visualizations. And so I thought it was really super nice that we got there.

[00:26:38] Alessio: Should we do a Latent Space book for each year? Uh, I thought about it. For each year we should. Coffee table book. Yeah. Yeah. Okay. Put it in the will. Hi, Will. By the way, we haven't introduced you — he's our new, you know, generalist, our Jamie.

[00:26:52] swyx: You need to pull up more things. One thing I saw that — okay, one fun one, and then one more general one. So the fun one is this paper on agent collusion. This is a paper on steganography: "Secret Collusion among AI Agents: Multi-Agent Deception via Steganography." I tried to go to NeurIPS in order to find these kinds of papers, because — the real reason is, like, NeurIPS this year has a lottery system. A lot of people actually go and don't buy tickets, because they just go and attend the side events. And then the people who do go end up crowding around the most popular papers, which you already know and already read before you showed up to NeurIPS. So the only reason you go there is to talk to the paper authors. But there are something like 10,000 other papers out there — just people's work that they did all year and failed to get attention for, for one reason or another. And this was one of them. It was all the way at the back. And this is a DeepMind paper that actually focuses on collusion between AI agents by hiding messages in the text that they generate — that's what steganography is. So a very simple example would be the first letter of every word.
So the only reason you go there is to talk to the paper authors, but there's something like 10,000 other papers out there that, you know, are just people's work that they did all year, and they failed to get attention for one reason or another.[00:27:42] swyx: And this was one of them. Uh, it was like all the way at the back. And this is a DeepMind paper that actually focuses on collusion between AI agents, uh, by hiding messages in the text that they generate. Uh, so that's what steganography is. So a very simple example would be the first letter of every word.[00:27:57] swyx: If you pick that out, you know, it encodes a [00:28:00] different message than the text itself. But something I've always emphasized is, to LLMs, we read left to right. LLMs can read up, down, sideways, you know, in random character order. And it's the same to them as it is to us. So if we were ever to get, you know, self-motivated, unaligned LLMs that were trying to collaborate to take over the planet,[00:28:19] swyx: this would be how they do it. They spread messages among us in the messages that we generate. And he developed a scaling law for that. So he marked, I'm showing it on screen right now, the emergence of this phenomenon. Basically, for example, for cipher encoding, GPT-2, Llama 2, Mixtral, GPT-3.5 have zero capabilities, and suddenly GPT-4 does.[00:28:40] swyx: And this is the kind of Jason Wei type emergence properties that people kind of look for. I think what made this paper stand out as well, so he developed the benchmark for steganography collusion, and he also focused on Schelling point collusion, which is very low coordination. For agreeing on an encoding-decoding format, you kind of need to have some [00:29:00] agreement on that.[00:29:00] swyx: But, but Schelling point means like very, very low or almost no coordination. So for example, if the only message I give you is meet me in New York, and you're not told where or when, you would probably meet me at Grand Central Station. Grand Central Station is a Schelling point.[00:29:16] swyx: And it's probably sometime during the day. The Schelling point of New York is Grand Central. To that extent, Schelling points for steganography are things like the common encoding methods that we talked about. It will be interesting at some point in the future when we are worried about alignment.[00:29:30] swyx: It is not interesting today, but it's interesting that DeepMind is already thinking about this.[00:29:36] Alessio: I think that's like one of the hardest things about NeurIPS. It's like the long tail. I[00:29:41] swyx: found a pricing guy. I'm going to feature him on the podcast. Basically, this guy from NVIDIA worked out the optimal pricing for language models.[00:29:51] swyx: It's basically an econometrics paper at NeurIPS, where everyone else is talking about GPUs. And the guy with the GPUs is[00:29:57] Alessio: talking[00:29:57] swyx: about economics instead. [00:30:00] That was the sort of fun one. So the focus I saw is that model papers at NeurIPS are kind of dead. No one really presents models anymore. It's just data sets.[00:30:12] swyx: This is all the grad students are working on. So like there was a data sets track and then I was looking around like, I was like, you don't need a data sets track because every paper is a data sets paper. And so data sets and benchmarks, they're kind of flip sides of the same thing. So Yeah. Cool. 
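The first-letter trick described above is easy to make concrete. Below is a minimal, hypothetical Python sketch of that acrostic-style encoding — an illustration of the idea only, not the actual scheme or benchmark from the DeepMind paper.

```python
# Toy sketch of first-letter (acrostic) steganography: the secret rides in
# the initial letter of each word of an innocuous cover text. Illustrative
# only; the paper's actual schemes and benchmark are richer than this.

def encode(secret: str, vocabulary: dict[str, list[str]]) -> str:
    """Build a cover sentence whose word initials spell out the secret."""
    words = []
    for i, ch in enumerate(secret.lower()):
        options = vocabulary.get(ch, [])
        if not options:
            raise ValueError(f"no cover word starts with {ch!r}")
        words.append(options[i % len(options)])  # vary word choice a little
    return " ".join(words)

def decode(cover: str) -> str:
    """Recover the secret by reading the first letter of every word."""
    return "".join(word[0] for word in cover.split())

vocab = {
    "h": ["hello", "how"],
    "i": ["indeed", "it"],
    "d": ["definitely"],
    "e": ["everyone", "even"],
}
cover = encode("hide", vocab)   # -> "hello it definitely even"
assert decode(cover) == "hide"
```

To a human reader the cover text is just awkward filler; to an agent that agreed on the (Schelling-point) convention, it carries a second channel.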
Yeah, if you're a grad student, you're GPU poor, you kind of work on that.[00:30:30] swyx: And then the, the sort of big model people walk around and pick the ones that they like, and then they use it in their models. And that's, that's kind of how it develops. I, I feel like, um, like, like last year, you had people like Haotian Liu, who worked on LLaVA, which is take Llama and add vision.[00:30:47] swyx: And then obviously actually I hired him and he added vision to Grok. Now he's the vision Grok guy. This year, I don't think there was any of those.[00:30:55] Alessio: What were the most popular, like, orals? Last year it was like the [00:31:00] Monarch Mixer, I think, was like the most attended. Yeah, uh, I need to look it up. Yeah, I mean, if nothing comes to mind, that's also kind of like an answer in a way.[00:31:10] Alessio: But I think last year there was a lot of interest in, like, furthering models and, like, different architectures and all of that.[00:31:16] swyx: I will say that I felt the orals, oral picks this year were not very good. Either that or maybe it's just me. So that's the highlight of how I have changed in terms of how I view papers.[00:31:29] swyx: So like, in my estimation, two of the best papers this year for datasets are DataComp and RefinedWeb or FineWeb. These are two actually industrially used papers, not highlighted for awards. I think DCLM got the spotlight, FineWeb didn't even get the spotlight. So like, it's just that the picks were different.[00:31:48] swyx: But one thing that does get a lot of play that a lot of people are debating is the role of the schedule. This is the schedule-free optimizer paper from Meta, from Aaron DeFazio. And this [00:32:00] year in the ML community, there's been a lot of chat about Shampoo, SOAP, all the bathroom amenities for optimizing your learning rates.[00:32:08] swyx: And, uh, most people at the big labs who I asked about this, um, say that it's cute, but it's not something that matters. I don't know, but it's something that was discussed and very, very popular. 4 Wars[00:32:19] Alessio: of AI recap maybe, just quickly. Um, where do you want to start? Data?[00:32:26] swyx: So to remind people, this is the 4 Wars piece that we did as one of our earlier recaps of this year.[00:32:31] swyx: And the belligerents are on the left, journalists, writers, artists, anyone who owns IP basically, New York Times, Stack Overflow, Reddit, Getty, Sarah Silverman, George RR Martin. Yeah, and I think this year we can add Scarlett Johansson to that side of the fence. So anyone suing OpenAI, basically. I actually wanted to get a snapshot of all the lawsuits.[00:32:52] swyx: I'm sure some lawyer can do it. That's the data quality war. On the right hand side, we have the synthetic data people, and I think we talked about Loubna's talk, you know, [00:33:00] really showing how much synthetic data has come along this year. I think there was a bit of a fight between Scale AI and the synthetic data community, because Scale AI published a paper saying that synthetic data doesn't work. Surprise, surprise, Scale AI is the leading vendor of non-synthetic data. 
Only[00:33:17] Alessio: cage-free annotated data is useful.[00:33:21] swyx: So I think there's some debate going on there, but I don't think it's much debate anymore that at least synthetic data, for the reasons that are discussed in Loubna's talk, makes sense.[00:33:32] swyx: I don't know if you have any perspectives there.[00:33:34] Alessio: I think, again, going back to the reinforcement fine tuning, I think that will change a little bit how people think about it. I think today people mostly use synthetic data, yeah, for distillation and kind of like fine tuning a smaller model from like a larger model.[00:33:46] Alessio: I'm not super aware of how the frontier labs use it outside of like the Rephrasing the Web thing that Apple also did. But yeah, I think it'll be useful. I think like whether or not that gets us the big [00:34:00] next step, I think that's maybe like TBD, you know. I think people love talking about data because it's like a GPU-poor thing, you know. I think, uh, synthetic data is like something that people can do, you know, so they feel more opinionated about it compared to, yeah, the optimizer stuff, which is like,[00:34:17] swyx: they don't[00:34:17] Alessio: really work[00:34:18] swyx: on.[00:34:18] swyx: I think that there is an angle to the reasoning synthetic data. So this year, we covered in the paper club the STaR series of papers. So that's STaR, Q-STaR, V-STaR. It basically helps you to synthesize reasoning steps, or at least distill reasoning steps from a verifier. And if you look at the OpenAI RFT API that they announced, basically they're asking you to submit graders, or they choose from a preset list of graders.[00:34:49] swyx: Basically it feels like a way to create valid synthetic data for them to fine tune their reasoning paths on. Um, so I think that is another angle where it starts to make sense. And [00:35:00] so like, it's very funny that basically all the data quality wars between, let's say, the music industry or the newspaper publishing industry or the textbook industry and the big labs —[00:35:11] swyx: it's all of the pre-training era. And then like the new era, like the reasoning era, like nobody has any problem with all the reasoning, especially because it's all sort of math and science oriented with, with very reasonable graders. I think the more interesting next step is how does it generalize beyond STEM?[00:35:27] swyx: We've been using o1 for, I would say, summarization and creative writing and instruction following — I think it's underrated. I started using o1 in our intro songs before we killed the intro songs, but it's very good at writing lyrics. You know, I can actually say like, I think one of the o1 pro demos —[00:35:46] swyx: all of these things that Noam was showing — was that, you know, you can write an entire paragraph or three paragraphs without using the letter A, right?[00:35:53] Creative Writing with AI[00:35:53] swyx: So like, like literally just anything — not even token level, character level manipulation and [00:36:00] counting and instruction following. It's, uh, it's very, very strong.[00:36:02] swyx: And so no surprises when I ask it to rhyme, uh, and to, to create song lyrics, it's going to do that very much better than in previous models. 
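The grader mechanism mentioned above can be sketched quickly. The function below is a hypothetical example of a verifier-style grader and of STaR-style filtering on top of it — an illustration of the idea, not OpenAI's actual RFT grader schema.

```python
# Hypothetical sketch of a verifier-style "grader" for reinforcement
# fine-tuning: each sampled answer gets a scalar reward, and only traces
# whose verifiable final answer is correct get reinforced. The exact-match
# check and the function shape are assumptions for illustration.

def grade(model_output: str, reference: str) -> float:
    """Reward 1.0 if the final line matches the reference answer exactly."""
    lines = [line.strip() for line in model_output.strip().splitlines()]
    final_answer = lines[-1] if lines else ""
    return 1.0 if final_answer == reference.strip() else 0.0

# STaR-style filtering: keep only the reasoning traces the grader accepts,
# which turns a verifier into a synthetic-data generator for reasoning paths.
samples = [
    "2 + 2 = 4\n4 * 3 = 12\n12",   # correct trace
    "I'll guess.\n13",             # wrong trace
]
kept = [s for s in samples if grade(s, "12") == 1.0]
assert kept == [samples[0]]
```

This is also why the approach generalizes most easily to math and science first: those domains come with cheap, reasonable graders.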
So I think it's underrated for creative writing.[00:36:11] Alessio: Yeah.[00:36:12] Legal and Ethical Issues in AI[00:36:12] Alessio: What do you think is the rationale that they're going to have in court when they don't show you the thinking traces of o1? They're getting sued for using other publishers' data, you know, but then on their end, they're like, well, you shouldn't be using my data to then train your model.[00:36:29] Alessio: So I'm curious to see how that kind of comes. Yeah, I mean, OpenAI has[00:36:32] swyx: many ways to punish people without taking them to court. They already banned ByteDance for distilling their, their info. And so anyone caught distilling the chain of thought will be just disallowed to continue on, on, on the API.[00:36:44] swyx: And it's fine. It's no big deal. Like, I don't even think that's an issue at all, just because the chains of thought are pretty well hidden. Like you have to work very, very hard to get it to leak. And then even when it leaks the chain of thought, you don't know if it's real. [00:37:00] The bigger concern is actually that there's not that much IP hiding behind it — that Cosine, which we talked about, we talked to him on Dev Day, can just fine-tune 4o[00:37:13] swyx: to beat o1. Claude Sonnet so far is beating o1 on coding tasks — at least o1-preview — without being a reasoning model; same for Gemini Pro or Gemini 2.0. So like, how much is reasoning important? How much of a moat is there in this, like, all of this proprietary sort of training data that they've presumably accumulated?[00:37:34] swyx: Because even DeepSeek was able to do it. And they had, you know, two months notice to do this, to do R1. So, it's actually unclear how much moat there is. Obviously, you know, if you talk to the Strawberry team, they'll be like, yeah, I mean, we spent the last two years doing this. So, we don't know. And it's going to be interesting because there'll be a lot of noise from people who say they have inference time compute and actually don't because they just have fancy chain of thought.[00:38:00][00:38:00] swyx: And then there's other people who actually do have very good chain of thought. And you will not see them on the same level as OpenAI because OpenAI has invested a lot in building up the mythology of their team. Um, which makes sense. Like the real answer is somewhere in between.[00:38:13] Alessio: Yeah, I think that's kind of like the main data war story developing.[00:38:18] The Data War: GPU Poor vs. GPU Rich[00:38:18] Alessio: GPU poor versus GPU rich. Yeah. Where do you think we are? I think there was, again, going back to like the small model thing, there was like a time in which the GPU poor were kind of like the rebel faction working on like these models that were like open and small and cheap. And I think today people don't really care as much about GPUs anymore.[00:38:37] Alessio: You also see it in the price of the GPUs. Like, you know, that market has kind of plummeted, because people don't want to be — they want to be GPU free. They don't even want to be poor. They just want to be, you know, completely without them. Yeah. How do you think about this war? You[00:38:52] swyx: can tell me about this, but like, I feel like the, the appetite for GPU rich startups, like the, you know, the funding plan is we will raise 60 million and [00:39:00] we'll give 50 of that to NVIDIA.[00:39:01] swyx: That is gone, right? 
Like, no one's, no one's pitching that. This was literally the plan, the exact plan, of like four or five startups I can name, you know, this time last year. So yeah, GPU rich startups gone.[00:39:12] The Rise of GPU Ultra Rich[00:39:12] swyx: But I think like, the GPU ultra rich, the GPU ultra high net worth, is still going. So, um, now we're, you know, we had Leopold's essay on the trillion dollar cluster.[00:39:23] swyx: We're not quite there yet. We have multiple labs, um, you know, xAI very famously, you know, Jensen Huang praising them for being best boy number one in spinning up a 100,000 GPU cluster in like 12 days or something. So likewise at Meta, likewise at OpenAI, likewise at the other labs as well. So like the GPU ultra rich are going to keep doing that because I think partially it's an article of faith now that you just need it.[00:39:46] swyx: Like you don't even know what it's going to, what you're going to use it for. You just, you just need it. And it makes sense, especially if we're going into more researchy territory than we are now. So let's say 2020 to 2023 was [00:40:00] "let's scale big models" territory, because we had GPT-3 in 2020 and we were like, okay, we'll go from 175B[00:40:05] swyx: to 1.8T. And that was GPT-3 to GPT-4. Okay, that's done. As far as everyone is concerned, Opus 3.5 is not coming out, GPT-4.5 is not coming out, and Gemini 2, we don't have Pro, whatever. We've hit that wall. Maybe I'll call it the 2 trillion parameter wall. We're not going to 10 trillion. No one thinks it's a good idea, at least from training costs, from the amount of data, or at least the inference.[00:40:36] swyx: Would you pay 10x the price of GPT-4? Probably not. Like, like you want something else that, that is at least more useful. So it makes sense that people are pivoting in terms of their inference paradigm.[00:40:47] Emerging Trends in AI Models[00:40:47] swyx: And so when it's more researchy, then you actually need more just general purpose compute to mess around with, uh, at the exact same time that production deployments of the previous paradigm are still ramping up,[00:40:58] swyx: um,[00:40:58] swyx: uh, pretty aggressively.[00:40:59] swyx: So [00:41:00] it makes sense that the GPU rich are growing. We have now interviewed both Together and Fireworks and Replicate. Uh, we haven't done Anyscale yet. But I think Amazon, maybe kind of a sleeper one, Amazon, in a sense of like they, at re:Invent, I wasn't expecting them to do so well, but they are now a foundation model lab.[00:41:18] swyx: It's kind of interesting. Um, I think, uh, you know, David went over there and started just creating models.[00:41:25] Alessio: Yeah, I mean, that's the power of prepaid contracts. I think like a lot of AWS customers, you know, they do these big reserved instance contracts and now they've got to use their money. That's why so many startups[00:41:37] Alessio: get bought through the AWS marketplace — so they can kind of bundle them together and get preferred pricing.[00:41:42] swyx: Okay, so maybe GPU super rich doing very well, GPU middle class dead, and then GPU[00:41:48] Alessio: poor. I mean, my thing is like, everybody should just be GPU rich. There shouldn't really be — even the GPU poorest, it's like, does it really make sense to be GPU poor?[00:41:57] Alessio: Like, if you're GPU poor, you should just use the [00:42:00] cloud. 
Yes, you know, and I think there might be a future, once we kind of like figure out what the size and shape of these models is, where like the tiny box and these things come to fruition, where like you can be GPU poor at home. But I think today it's like, why are you working so hard to get these models to run on very small clusters when it's so cheap to run them?[00:42:21] Alessio: Yeah, yeah,[00:42:22] swyx: yeah. I think mostly people think it's cool. People think it's a stepping stone to scaling up. So they aspire to be GPU rich one day and they're working on new methods. Like Nous Research — probably the most deep tech thing they've done this year is DisTrO, or whatever the new name is.[00:42:38] swyx: There's a lot of interest in heterogeneous computing, distributed computing. I tend generally to de-emphasize that historically, but it may be coming to a time where it is starting to be relevant. I don't know. You know, SF Compute launched their compute marketplace this year, and like, who's really using that?[00:42:53] swyx: Like, it's a bunch of small clusters, disparate types of compute, and if you can make that [00:43:00] useful, then that will be very beneficial to the broader community, but maybe still not the source of frontier models. It's just going to be a second tier of compute that is unlocked for people, and that's fine. But yeah, I mean, I think this year, I would say a lot more on device. We are — I now have Apple Intelligence on my phone.[00:43:19] swyx: Doesn't do anything apart from summarize my notifications. But still, not bad. Like, it's multi modal.[00:43:25] Alessio: Yeah, the notification summaries are so-so in my experience.[00:43:29] swyx: Yeah, but they add, they add juice to life. And then, um, Chrome Nano, uh, Gemini Nano is coming out in Chrome. Uh, they're still feature flagged, but you can, you can try it now if you, if you use the, uh, the alpha.[00:43:40] swyx: And so, like, I, I think, like, you know, we're getting the sort of GPU poor version of a lot of these things coming out, and I think it's like quite useful. Like Windows as well, rolling out RWKV in sort of every Windows deployment is super cool. And I think the last thing that I never put in this GPU poor war, that I think I should now, [00:44:00] is the number of startups that are GPU poor but still scaling very well, as sort of wrappers on top of either a foundation model lab or a GPU cloud.[00:44:10] swyx: GPU cloud, it would be Suno. Suno — Ramp has rated it as one of the top ranked, fastest growing startups of the year. Um, I think the last public number is like zero to 20 million this year in ARR, and Suno runs on Modal. So Suno itself is not GPU rich, but they're just doing the training on, on Modal, uh, who we've also talked to on, on the podcast.[00:44:31] swyx: The other one would be Bolt, straight Claude wrapper. And, and, um, again, another — now they've announced 20 million ARR, which is another step up from the 8 million that we put in the title. So yeah, I mean, it's crazy that all these GPU poors are finding a way while the GPU riches are also finding a way. And then the only failures — I kind of call this the GPU smiling curve, where the edges do well, because you're either close to the machines, and you're like [00:45:00] number one on the machines, or you're close to the customers, and you're number one on the customer side.[00:45:03] swyx: And the people who are in the middle — Inflection, um, Character — didn't do that great. 
I think Character did the best of all of them. Like, you have a note in here that we apparently said that Character's price tag was[00:45:15] Alessio: 1B.[00:45:15] swyx: Did I say that?[00:45:16] Alessio: Yeah. You said Google should just buy them for 1B. I thought it was a crazy number.[00:45:20] Alessio: Then they paid 2.7 billion. I mean, for like,[00:45:22] swyx: yeah.[00:45:22] Alessio: What do you pay for Noam? Like, I don't know what the going rate was. Maybe the starting price was 1B. I mean, whatever it was, it worked out for everybody involved.[00:45:31] The Multi-Modality War[00:45:31] Alessio: Multimodality war. And this one, we never had text to video in the first version, which now is the hottest.[00:45:37] swyx: Yeah, I would say it's a subset of image, but yes.[00:45:40] Alessio: Yeah, well, but I think at the time it wasn't really something people were doing, and now we had Veo 2 just come out yesterday. Uh, Sora was released last month, last week. I've not tried Sora, because the day that I tried, it wasn't, yeah. I[00:45:54] swyx: think it's generally available now, you can go to Sora.com and try it. Yeah, they had[00:45:58] Alessio: the outage. Which I [00:46:00] think also played a part into it. Small things. Yeah. What's the other model that you posted today that was on Replicate? Video-01-Live?[00:46:08] swyx: Yeah. Very, very nondescript name, but it is from MiniMax, which I think is a Chinese lab. The Chinese labs do surprisingly well at the video models.[00:46:20] swyx: I'm not sure it's actually Chinese. I don't know. Hold me to that. Yep. China. It's good. Yeah, the Chinese love video. What can I say? They have a lot of training data for video. Or a more relaxed regulatory environment.[00:46:37] Alessio: Uh, well, sure, in some way. Yeah, I don't think there's much else there. I think like, you know, on the image side, I think it's still open.[00:46:45] Alessio: Yeah, I mean,[00:46:46] swyx: ElevenLabs is now a unicorn. So basically, what is the multi modality war? Multi modality war is, do you specialize in a single modality, right? Or do you have a God model that does all the modalities? So this is [00:47:00] definitely still going, in the sense of ElevenLabs, you know, now a unicorn; Pika Labs doing well, they launched Pika 2.0[00:47:06] swyx: recently; HeyGen, I think, has reached 100 million ARR; AssemblyAI, I don't know, but they have billboards all over the place, so I assume they're doing very, very well. So these are all specialist models, specialist models and specialist startups. And then there's the big labs who are doing the sort of all in one play.[00:47:24] swyx: And then here I would highlight Gemini 2 for having native image output. Have you seen the demos? Um, yeah, it's, it's hard to keep up. Literally they launched this last week and a shout out to Paige Bailey, who came to the Latent Space event to demo on the day of launch. And she wasn't prepared. She was just like, I'm just going to show you.[00:47:43] swyx: So they have voice. They have, you know, obviously image input, and then they obviously can code gen and all that. But the new one that OpenAI and Meta both have but they haven't launched yet is image output. So you can literally, um — I think their demo video was that you put in an image of a [00:48:00] car, and you ask for minor modifications to that car.[00:48:02] swyx: They can generate you that modification exactly as you asked. 
So there's no need for the Stable Diffusion or ComfyUI workflow of like mask here and then infill there, inpaint there, and all that, all that stuff. This is small model nonsense. Big model people are like, huh, we've got it all in the transformer.[00:48:21] swyx: This is the multimodality war, which is: do you, do you bet on the God model, or do you string together a whole bunch of, uh, small models like a, like a chump? Yeah,[00:48:29] Alessio: I don't know, man. Yeah, that would be interesting. I mean, obviously I use Midjourney for all of our thumbnails. Um, they've been doing a ton on the product, I would say.[00:48:38] Alessio: They launched a new Midjourney editor thing. They've been doing a ton. Because I think, yeah, the moat is kind of like — maybe, you know, people say the Black Forest models are better than Midjourney on a pixel-by-pixel basis. But I think when you put it, put it together — have you tried[00:48:53] swyx: the same prompts on Black Forest?[00:48:55] Alessio: Yes. But the problem is just like, you know, on Black Forest, it generates one image. And then it's like, you got to [00:49:00] regenerate. You don't have all these like UI things. Like what I do — no, but it's a time issue, you know, with Midjourney.[00:49:06] swyx: Call the API four times.[00:49:08] Alessio: No, but then there's no like variations.[00:49:10] Alessio: Like the good thing about Midjourney is like, you just go in there and you're cooking. There's a lot of stuff that just makes it really easy. And I think people underestimate that. Like, it's not really a skill issue, because I'm paying Midjourney, so it's a Black Forest skill issue, because I'm not paying them, you know?[00:49:24] Alessio: Yeah,[00:49:25] swyx: so, okay, so, uh, this is a UX thing, right? Like, you, you, you understand that, at least, we think that Black Forest should be able to do all that stuff. I will also shout out, Recraft has come out, uh, on top of the image arena that, uh, Artificial Analysis has done, has apparently, uh, taken Flux's place. Is this still true?[00:49:41] swyx: So, Artificial Analysis is now a company. I highlighted them I think in one of the early AI Newses of the year. And they have launched a whole bunch of arenas. So, they're trying to take on LM Arena, Anastasios and crew. And they have an image arena. Oh yeah, Recraft V3 is now beating Flux 1.1. Which is very surprising, [00:50:00] because Flux and Black Forest Labs are the old Stable Diffusion crew who left Stability after, um, the management issues.[00:50:06] swyx: So Recraft has come from nowhere to be the top image model. Uh, very, very strange. I would also highlight that Grok has now launched Aurora, which is — it's very interesting dynamics between Grok and Black Forest Labs, because Grok's images were originally launched, uh, in partnership with Black Forest Labs as a, as a thin wrapper.[00:50:24] swyx: And then Grok was like, no, we'll make our own. And so they've made their own. I don't know, there are no APIs or benchmarks about it. They just announced it. So yeah, that's the multi modality war. I would say that so far, the small model, the dedicated model people are winning, because they are just focused on their tasks.[00:50:42] swyx: But the big model people are always catching up. And the moment I saw the Gemini 2 demo of image editing, where I can put in an image and just request a change and it does it — that's how AI should work. Not like a whole bunch of complicated steps. So it really is something. 
And I think one frontier that we haven't [00:51:00] seen this year — like obviously video has done very well, and it will continue to grow.[00:51:03] swyx: You know, we only have Sora Turbo today, but at some point we'll get full Sora. Oh, at least the Hollywood labs will get full Sora. We haven't seen video to audio, or video synced to audio. And so the researchers that I talked to are already starting to talk about that as the next frontier. But there's still maybe like five more years of video left to actually be solved.[00:51:23] swyx: I would say that Gemini's approach — compared to OpenAI, Gemini's, or DeepMind's, approach to video seems a lot more fully fledged than OpenAI's. Because if you look at the ICML recap that I published — that so far not many people have listened to; it's just a different, definitely different audience.[00:51:43] swyx: It's only seven hours long. Why are people not listening? It's like everything in there. Uh, so, so DeepMind is working on Genie. They also launched Genie 2 and VideoPoet. So, like, they have maybe four years advantage on world modeling that OpenAI does not have. Because OpenAI basically only started [00:52:00] diffusion transformers last year, you know, when they hired, uh, Bill Peebles.[00:52:03] swyx: So, DeepMind has, has a bit of advantage here, I would say, in, in, in showing, like — the reason that Veo 2 — well, one, they cherry-pick their videos, so obviously it looks better than Sora — but the reason I would believe that Veo 2, uh, when it's fully launched, will do very well is because they have all this background work in video that they've done for years.[00:52:22] swyx: Like, like last year's NeurIPS, I already was interviewing some of their video people. I forget their model name, but for, for people who are dedicated fans, they can go to NeurIPS 2023 and see, see that paper.[00:52:32] Alessio: And then last but not least, the LLM OS. We renamed it to RAG Ops, formerly known as[00:52:39] swyx: the RAG Ops War. I put the latest chart on the Braintrust episode.[00:52:43] swyx: I think I'm going to separate these essays from the episode notes. So the reason I used to do that, by the way, is because I wanted to show up on Hacker News. I wanted the podcast to show up on Hacker News. So I always put an essay inside of there, because Hacker News people like to read and not listen.[00:52:58] Alessio: So episode essays,[00:52:59] swyx: I remember [00:53:00] publishing them separately. You say LangChain, LlamaIndex are still growing.[00:53:03] Alessio: Yeah, so I looked at the PyPI stats, you know. I don't care about stars. On PyPI you see — do you want to share your screen? Yes. I prefer to look at actual downloads, not at stars on GitHub. So if you look at, you know, LangChain still growing.[00:53:20] Alessio: These are the last six months. LlamaIndex still growing. What I've basically seen is like things that, one, obviously these things have a commercial product. So there's like people buying this and sticking with it versus kind of hopping in between things versus, you know, for example, CrewAI, not really growing as much.[00:53:38] Alessio: The stars are growing. If you look on GitHub, like the stars are growing, but kind of like the usage is kind of like flat. In the last six months, have they done some[00:53:4

Where It Happens
Growth Genius Leaks Million Dollar SEO Plan

Where It Happens

Play Episode Listen Later Dec 4, 2024 45:03


Join me as I chat with Jacky Chou, a serial entrepreneur, as we dive deep into growth hacking opportunities. Jacky shares his strategies and frameworks for how to growth hack businesses using SEO and keyword research.

Timestamps
00:00 Intro
01:46 Most AI companies are leaving money on the table
11:04 Paid Ads for InteriorAI
15:04 Jacky's million dollar SEO plan with Reddit
21:13 Other Neglected Platforms to beat VC-funded startups
24:57 Startup Idea 1: NSFW AI O.F.
35:31 Community Building in 2024

1) Most AI companies are leaving money on the table
Case study: InteriorAI (by Pieter Levels)
- Missing out on ~50K/month
- Could monetize through affiliate content
- Should create blog targeting high-value keywords
- Opportunity for "retail rank & rent" ($1-2K/month per placement)

2) Jacky's million dollar SEO plan with Reddit:
- Find keywords where Reddit ranks #2 (tons right now!)
- Share your product as a "user" (no links!)
- Target all related long-tail keywords
- Look at AI-related terms (massive growth)
- Replicate across Quora, Medium, LinkedIn

3) Keyword Research Gold:
"AI video generator" trends:
- Search volume is exploding
- Reddit ranking #2 consistently
- CPC around $1.50
- Huge opportunity for indie hackers
- Most big players ignoring UGC platforms

4) Dark Horse Opportunity: Adult AI Market
- Extremely low keyword difficulty (KD: 2)
- High search volume
- Most competitors doing poor execution
- AI chat + OF management tools needed
- Mid 6-7 figures/month potential
(Not for everyone, but opportunity exists)

5) Community Building in 2024:
- @JackieChou hitting $80K/month
- Built from 15K YouTube subs
- Key: Sell transformation, not just community
- Focus on membership benefits
- Bundle partnerships/tools

Key Takeaway:
To win in 2024, do what VC-funded startups won't:
- Target neglected platforms
- Create authentic content
- Build real communities
- Ignore "brand concerns" early
- Focus on actual user value

Want more free ideas? I collect the best ideas from the pod and give them to you for free in a database. Most of them cost $0 to start (my fav). Get access: https://www.gregisenberg.com/30startupideas

LCA helps Fortune 500s and fast-growing startups build their future - from Warner Music to Fortnite to Dropbox. We turn 'what if' into reality with AI, apps, and next-gen products https://latecheckout.agency/
BoringAds — ads agency that will build you profitable ad campaigns http://boringads.com/
BoringMarketing — SEO agency and tools to get your organic customers http://boringmarketing.com/
Startup Empire - a membership for builders who want to build cash-flowing businesses https://www.startupempire.co

FIND ME ON SOCIAL
X/Twitter: https://twitter.com/gregisenberg
Instagram: https://instagram.com/gregisenberg/
LinkedIn: https://www.linkedin.com/in/gisenberg/

FIND JACKY ON SOCIAL
X/Twitter: https://x.com/indexsy
LinkedIn: https://www.linkedin.com/in/jacky-chou/
Youtube: https://www.youtube.com/@indexsy/videos
Advise: https://advise.so/apply/

Carnival Personnel
RE-Posting - Sideshow 116 - The Legendary, BILL FAY

Carnival Personnel

Play Episode Listen Later Dec 2, 2024 87:12


Special RE-Posting with our good friend BILL FAY.   In this episode from Dec 17 2023 - We talk Bill's new Happy Texas play ... where Jacques spent a few days in NYC a couple weeks back hanging with Bill as he prepped for the investors showcase with full cast and music - figured this would be a fun time to repost this one. Enjoy ... again. Legendary film maker and … co-founder of Legendary Pictures, Bill Fay zooms in to chat with Joe, Biff and Jacques … who for reasons we still don't understand has been a close friend and mentor to Jacques for 20 years! The first two film companies Bill worked for out of college went bankrupt … and then he learned to fly a plane to be Hollywood's man on the ground in Africa ... along the way working with folks like James Cameron for (politely putting it) eccentric film maker Roger Corman, to producing such blockbusters as The Patriot and Independence Day with director Roland Emmerich. Along the way ACCIDENTALLY doubling his 24-year-old self's life savings by making a 3-day investment based on getting to see a sneak preview of Back To The Future. Bill (and partner Thomas Tull) created Legendary Pictures in 2000 with a plan for a slate of big budget (largely superhero / action) films … Along the way just for giggles … also produced The Hangover (well, for giggles and 470m at the box office, making it the largest grossing R-rated comedy in history.) Bill also talks his favorite project, a small indie sci-fi horror called Replicate … well, favorite project Jacques was an associate producer on. Bill did fly to see Jacques when he was living in Qatar in 2016 … and in Qatar gave college lectures and gave a road map to the finance minister on how to get their country on the film making map. Bill currently is CEO of Steric Systems – a company that has developed an electronic process of smoothing liquor and spirits at distilleries.  https://www.youtube.com/watch?v=a45uEuVH6jI Steric Systems has grown exponentially over the last few years … which leaves very little time for the other new company Bill started with his brother … a salmon spread they've made for friends and family for years … at a few pound batches at a time … now scaling up (pun intended) to a 7500 lbs test with a national food chain retailer.    Bill and Jacques now have the stage show rights to the 1999 movie Happy Texas and hope to have a 6 week test run (maybe in Seattle) next fall.  To be clear … BILL has the rights, Jacques just loves the movie and wants to work on the play!   CPP on IG: @CarnivalPersonnelPodcast and on Twitter: @CarnivalPodcast                                                                                             Biff on Twitter is @BiffPlaysHockey Joe on Twitter is: @Optigrabber Jacques on Twitter is @CarnivalPodcast and FB @JacquesLambert   Opening Song: WKRP parody by @Model_CHP3Y (Twitter and Youtube)   Closing Song: Other Break by Dan Cray and Beyond Id (1993)  

IN THE KNOW
Can AI replicate Christmas magic?

IN THE KNOW

Play Episode Listen Later Dec 2, 2024 11:11


Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

We have a full slate of upcoming events: AI Engineer London, AWS Re:Invent in Las Vegas, and now Latent Space LIVE! at NeurIPS in Vancouver and online. Sign up to join and speak!

We are still taking questions for our next big recap episode! Submit questions and messages on Speakpipe here for a chance to appear on the show!

We try to stay close to the inference providers as part of our coverage, as our podcasts with Together AI and Replicate will attest. However, one of the most notable pull quotes from our very well received Braintrust episode was his opinion that open source model adoption has NOT gone very well and is actually declining in relative market share terms (it is of course increasing in absolute terms).

Today's guest, Lin Qiao, would wholly disagree. Her team of PyTorch/GPU experts are wholly dedicated toward helping you serve and finetune the full stack of open source models from Meta and others, across all modalities (text, audio, image, embedding, vision-understanding), helping customers like Cursor and HubSpot scale up open source model inference both rapidly and affordably.

Fireworks has emerged after its successive funding rounds with top tier VCs as one of the leaders of the Compound AI movement, a term first coined by the Databricks/Mosaic gang at Berkeley AI and adapted as "Composite AI" by Gartner.

Replicating o1
We are the first podcast to discuss Fireworks' f1, their proprietary replication of OpenAI's o1. This has become a surprisingly hot area of competition in the past week as both Nous Forge and DeepSeek R1 have launched competitive models.

Full Video Podcast
Like and subscribe!

Timestamps
* 00:00:00 Introductions
* 00:02:08 Pre-history of Fireworks and PyTorch at Meta
* 00:09:49 Product Strategy: From Framework to Model Library
* 00:13:01 Compound AI Concept and Industry Dynamics
* 00:20:07 Fireworks' Distributed Inference Engine
* 00:22:58 OSS Model Support and Competitive Strategy
* 00:29:46 Declarative System Approach in AI
* 00:31:00 Can OSS replicate o1?
* 00:36:51 Fireworks f1
* 00:41:03 Collaboration with Cursor and Speculative Decoding
* 00:46:44 Fireworks quantization (and drama around it)
* 00:49:38 Pricing Strategy
* 00:51:51 Underrated Features of Fireworks Platform
* 00:55:17 Hiring

Transcript
Alessio [00:00:00]: Hey everyone, welcome to the Latent Space Podcast. This is Alessio, partner and CTO at Decibel Partners, and I'm joined by my co-host Swyx, founder of Smol AI.Swyx [00:00:11]: Hey, and today we're in a very special studio inside the Fireworks office with Lin Qiao, CEO of Fireworks. Welcome. Yeah.Lin [00:00:20]: Oh, you should welcome us.Swyx [00:00:21]: Yeah, welcome. Yeah, thanks for having us. It's unusual to be in the home of a startup, but it's also, I think our relationship is a bit unusual compared to all our normal guests. Definitely.Lin [00:00:34]: Yeah. I'm super excited to talk about very interesting topics in that space with both of you.Swyx [00:00:41]: You just celebrated your two-year anniversary yesterday.Lin [00:00:43]: Yeah, it's quite a crazy journey. We circle around and share all the crazy stories across these two years, and it has been super fun. All the way from we experienced the Silicon Valley Bank run to we deleted some data that shouldn't be deleted operationally. We went through a massive scale where we actually are busy getting capacity to, yeah, we learned to kind of work with it as a team with a lot of brilliant people across different places to join a company. 
It has really been a fun journey.Alessio [00:01:24]: When you started, did you think the technical stuff would be harder, or the bank run and then the people side? I think there's a lot of amazing researchers that want to do companies and it's like the hardest thing is going to be building the product, and then you have all these different other things. So, what about your experience surprised you the most?Lin [00:01:42]: Yeah, to be honest with you, my focus has always been on the product side and then after the product goes to market. And I didn't realize the rest has been so complicated — operating a company and so on. But because I don't think about it, I just kind of manage it. So it's done. I think I just somehow don't think about it too much and solve whatever problem coming our way, and it worked.Swyx [00:02:08]: So let's, I guess, let's start at the pre-history, the initial history of Fireworks. You ran the PyTorch team at Meta for a number of years, and we previously had Soumith Chintala on, and I think we were just all very interested in the history of GenAI. Maybe not that many people know how deeply involved FAIR and Meta were prior to the current GenAI revolution.Lin [00:02:35]: My background is deep in distributed systems, database management systems. And I joined Meta from the data side, and I saw this tremendous amount of data growth, which cost a lot of money, and we were analyzing what's going on. And it's clear that AI is driving all this data generation. So it's a very interesting time, because when I joined Meta, Meta was going through ramping down mobile-first, finishing the mobile-first transition and then starting AI-first. And there's a fundamental reason about that sequence, because mobile-first gave a full range of user engagement that had never existed before. And all this user engagement generated a lot of data, and this data powers AI. So then the whole entire industry is also going through, following this same transition. When I see, oh, okay, this AI is powering all this data generation, and look at where's our AI stack — there's no software, there's no hardware, there's no people, there's no team — I want to dive in there and help this movement. So when I started, it's a very interesting industry landscape. There are a lot of AI frameworks. It's a kind of proliferation of AI frameworks happening in the industry. But all the AI frameworks focus on production, and they use a very certain way of defining the graph of the neural network and then use that to drive the model iteration and productionization. And PyTorch is completely different: its creator could assume that he was the user of his own product. He basically said, researchers face so much pain using existing AI frameworks, this is really hard to use, and I'm going to do something different for myself. And that's the origin story of PyTorch. PyTorch actually started as the framework for researchers. They don't care about production at all. And as they grew in terms of adoption — so the interesting part of AI is that research is the top of the funnel for production. There are so many researchers across academia, across industry; they innovate, and they put their results out there in open source, and that powers the downstream productionization. So it's brilliant for Meta to establish PyTorch as a strategy to drive massive adoption in open source, because Meta internally is a PyTorch shop. So it creates a flywheel effect. So that's kind of the strategy behind PyTorch. 
But when I took on PyTorch, it was kind of at the cusp where Meta established PyTorch as the framework for both research and production. So no one had done that before. And we had to kind of rethink how to architect PyTorch so we can really sustain production workloads — the stability, reliability, low latency; all these production concerns were never a concern before. Now they're a concern. And we actually had to adjust its design and make it work for both sides. And that took us five years, because Meta has so many AI use cases, all the way from ranking and recommendation powering the business top line — ranking the newsfeed, video ranking — to site integrity, detecting bad content automatically using AI, to all kinds of effects: translation, image classification, object detection, all this. And also across AI running on the server side, on mobile phones, on AR/VR devices — the wide spectrum. So by that time we had basically managed to support AI ubiquitously, everywhere across Meta. But interestingly, through open source engagement, we worked with a lot of companies. It was clear to us that this industry is starting to take on the AI-first transition. And of course, Meta's hyperscale always goes ahead of the industry. And it feels like when we started this AI journey at Meta, there was no software, no hardware, no team. For many companies we engaged with through PyTorch, we felt the pain. That's the genesis of why we felt like, hey, if we create Fireworks and support the industry going through this transition, it will be a huge amount of impact. Of course, the problems that the industry is facing will not be the same as Meta's. Meta is so big, right? So it's kind of skewed towards extreme scale and extreme optimization, and the industry will be different. But we feel like we have the technical chops and we've seen a lot. We'll look to kind of drive that. So yeah, so that's how we started.Swyx [00:06:58]: When you and I chatted about the origins of Fireworks, it was originally envisioned more as a PyTorch platform, and then later became much more focused on generative AI. Is that fair to say? What was the customer discovery here?Lin [00:07:13]: Right. So I would say our initial blueprint was we should build a PyTorch cloud, because PyTorch is a library and there's no SaaS platform to enable AI workloads.Swyx [00:07:26]: Even in 2022, it's interesting.Lin [00:07:28]: I would not say absolutely no — cloud providers have some of those, but it's not a first class citizen, right? In 2022, TensorFlow was still massively in production. And this is all pre-GenAI, and PyTorch was kind of getting more and more adoption. But there was no PyTorch-first SaaS platform existing. At the same time, we are also a very pragmatic set of people. We really want to make sure, from the get-go, we get really, really close to customers. We understand their use case, we understand their pain points, we understand the value we deliver to them. So we wanted to take a different approach: instead of building a horizontal PyTorch cloud, we wanted to build a verticalized platform first. And then we talked with many customers. And interestingly, we started the company in September 2022, and in October, November, OpenAI announced ChatGPT. And then boom, when we talked with many customers, they were like, can you help us work on the GenAI aspect? So of course, there were some open source models. They were not as good at that time, but people were already putting a lot of attention there. Then we decided that if we're going to pick a vertical, we're going to pick GenAI. 
The other reason is all GenAI models are PyTorch models. So that's another reason. We believe that, because of the nature of GenAI, it's going to generate a lot of human consumable content. It will drive a lot of consumer- and developer-facing application and product innovation. Guaranteed. We're just at the beginning of this. Our prediction is, for those kinds of applications, inference is much more important than training, because inference scale is proportional to — at the upper limit — the world population, and training scale is proportional to the number of researchers. Of course, each training round could be very expensive. Although PyTorch supports both inference and training, we decided to laser focus on inference. So yeah, so that's how we got started. And we launched our public platform in August last year. When we launched, it was a single product: a distributed inference engine with a simple, OpenAI-compatible API with many models. We started with LLMs and then we added a lot of models. Fast forward to now, we are a full platform with multiple product lines. So we'd love to kind of dive deep into what we offer. But that's been a very fun journey in the past two years.Alessio [00:09:49]: What was the transition like? You started focused on PyTorch and people wanting to understand the framework, get it live. And now, say, maybe most people that use you don't even really know much about PyTorch at all. You know, they're just trying to consume a model. From a product perspective, like what were some of the decisions early on? Like right in October, November, were you just like, hey, most people just care about the model, not about the framework, we're going to make it super easy — or was it more a gradual transition to the model librarySwyx [00:10:16]: you have today?Lin [00:10:17]: Yeah. So our product decision is all based on who is our ICP. And one thing I want to acknowledge here is that GenAI technology is disruptive. It's very different from AI before GenAI. So it's a clear leap forward. Because before GenAI, the companies that wanted to invest in AI had to train from scratch. There was no other way. There was no foundation model. It didn't exist. So that means, then, to start a team, first hire a team who is capable of crunching data. There's a lot of data to crunch, right? Because training from scratch, you have to prepare a lot of data. And then they need to have GPUs to train, and then you start to manage GPUs. So then it becomes a very complex project. It takes a long time, and not many companies can afford it, actually. And GenAI is a very different game right now, because there are foundation models. So you don't have to train anymore. That makes AI much more accessible as a technology. As an app developer or product manager — even, not a developer — they can interact with GenAI models directly. So our goal is to make AI accessible to all app developers and product engineers. That's our goal. So then getting them into building models doesn't make any sense anymore with this new technology. And then building easy, accessible APIs is the most important. Early on, when we got started, we decided we're going to be OpenAI-compatible. It's just kind of very easy for developers to adopt this new technology, and we will manage the underlying complexity of serving all these models.Swyx [00:11:56]: Yeah, OpenAI has become the standard. Even as we're recording today, Gemini announced that they have OpenAI-compatible APIs. Interesting.
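What "OpenAI-compatible" buys a developer can be shown in a few lines: the standard openai Python client pointed at a different base URL. A minimal sketch follows — the endpoint and model name are illustrative assumptions, not verified values; check the provider's docs for the real ones.

```python
# Minimal sketch of calling an OpenAI-compatible endpoint with the standard
# openai client. Only the base_url and credentials change; the request and
# response shapes stay the same. base_url and model name are assumptions.

from openai import OpenAI

client = OpenAI(
    base_url="https://api.fireworks.ai/inference/v1",  # assumed endpoint
    api_key="YOUR_FIREWORKS_API_KEY",
)

response = client.chat.completions.create(
    model="accounts/fireworks/models/llama-v3p1-8b-instruct",  # assumed name
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
)
print(response.choices[0].message.content)
```

This is also why Gemini shipping OpenAI-compatible APIs matters: the client code above becomes the lingua franca, and providers compete behind it.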
So we just need to all drop in line, and then we have everyone falling in line.Lin [00:12:09]: That's interesting, because we are working very closely with Meta as one of the partners. Meta, of course, is kind of very generous to donate many very, very strong open source models, expecting more to come. But also they have announced Llama Stack, which is basically a standardized, upper-level stack built on top of Llama models. So they don't just want to give out models and you figure out what the upper stack is. They instead want to build a community around the stack and build a new standard. I think there are interesting dynamics in play in the industry right now, whether it's more standardized across OpenAI, because they are kind of creating the top of the funnel, or standardized across Llama, because this is the most used open source model. So I think it's a lot of fun working at this time.Swyx [00:13:01]: I've been a little bit more doubtful on Llama Stack; I think you've been more positive. Basically it's just like the Meta version of whatever Hugging Face offers, you know, or TensorRT, or vLLM, or whatever the open source opportunity is. But to me, it's not clear that just because Meta open sources Llama, that the rest of Llama Stack will be adopted. And it's not clear why I should adopt it. So I don't know if you agree.Lin [00:13:27]: It's very early right now. That's why I kind of work very closely with them and give them feedback. The feedback to the Meta team is very important. So then they can use that to continue to improve the model and also improve the higher level stack. I think the success of Llama Stack heavily depends on the community adoption. And there's no way around it. And I know the Meta team would like to kind of work with a broader set of the community. But it's very early.Swyx [00:13:52]: One thing that — after your Series B, so you raised from Benchmark, and then Sequoia. I remember being close to you for at least your Series B announcements, you started betting heavily on this term of Compound AI. It's not a term that we've covered very much in the podcast, but I think it's definitely getting a lot of adoption from Databricks and Berkeley people and all that. What's your take on Compound AI? Why is it resonating with people?Lin [00:14:16]: Right. So let me give a little bit of context why we even consider that space.Swyx [00:14:22]: Because like pre-Series B, there was no message, and now it's like on your landing page.Lin [00:14:27]: So it's kind of a very organic evolution from when we first launched our public platform. We are a single product. We are a distributed inference engine, where we do a lot of innovation — customized CUDA kernels running on different kinds of hardware — and we built distributed, disaggregated inference execution, built all kinds of caching. So that is one. So that's kind of one product line: the fastest, most cost-efficient inference platform. Because we wrote PyTorch code, we know — we basically have a special PyTorch build for that, together with custom kernels we wrote. And then we worked with many more customers, and we realized, oh, the distributed inference engine — our design is one size fits all. We want to have this inference endpoint, then everyone comes in, and no matter what kind of form and shape or workload they have, it will just work for them. So that's great. But the reality is, we realized all customers have different kinds of use cases. The use cases come in all different forms and shapes. 
And the end result is the data distribution in their inference workload doesn't align with the data distribution in the training data for the model. It's a given, actually, if you think about it, because researchers have to guesstimate what is important, what's not important in preparing data for training. So because of that misalignment, we leave a lot of quality, latency, cost improvement on the table. So then we're saying, okay, we want to heavily invest in a customization engine. And we actually announced it, called Fire Optimizer. So Fire Optimizer basically helps users navigate a three-dimensional optimization space across quality, latency, and cost. So it's a three-dimensional curve. And even for one company, for different use cases, they want to land in different spots. So we automate that process for our customers. It's very simple. You have your inference workload. You inject it into the optimizer along with the objective function, and then we spit out the inference deployment config and the model setup. So it's your customized setup. So that is a completely different product than the one-size-fits-all product thinking. And now, on top of that, we provide a huge variety of state-of-the-art models, hundreds of them, starting from text with large state-of-the-art language models. That's where we started. And as we talked with many customers, we realized, oh, audio and text are very, very close. Many of our customers started to build assistants, all kinds of assistants, using text, and they immediately want to add audio — audio in, audio out. So we support transcription, translation, speech synthesis, text-audio alignment, all different kinds of audio features. It's a big announcement. You should have heard it by the time this is out. And the other area — vision and text are very close to each other, because a lot of information doesn't live in plain text. A lot of information lives in multimedia formats: images, PDFs, screenshots, and many other different formats. So oftentimes to solve a problem, we need to put the vision model first to extract information, and then use a language model to process it and then send out the results. So vision is important. We also support vision models, various different kinds of vision models specialized in processing different kinds of sources and extraction. And we're also going to have another announcement of a new API endpoint we'll support for people to upload various different kinds of multimedia content and then get very accurate information extracted out and feed that into the LLM. And of course, we support embedding, because embedding is very important for semantic search, for RAG, and all this. And in addition to that, we also support text-to-image, image generation models, text-to-image, image-to-image, and we're adding text-to-video as well to our portfolio. So it's a very comprehensive set of model catalog that is built on top of Fire Optimizer and the distributed inference engine. But then we talked with more customers — they solve business use cases — and then we realized one model is not sufficient to solve their problem. And it's very clear because, one, the model hallucinates. Many customers, when they onboard this GenAI journey, they thought this was magical: GenAI is going to solve all my problems magically. But then they realize, oh, this model hallucinates. It hallucinates because it's not deterministic, it's probabilistic. So it's designed to always give you an answer, but based on probabilities, so it hallucinates.
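The quality/latency/cost trade-off described above can be made concrete with a toy selection routine: benchmark candidate deployment configs on your own workload, keep the feasible ones, take the cheapest. This is a hedged stand-in for the idea, not the actual Fire Optimizer interface; all names and numbers below are invented for illustration.

```python
# Illustrative sketch of navigating the quality/latency/cost space: given
# candidate configs measured on the user's workload, pick the cheapest one
# that meets the quality and latency targets (the "objective function").

from dataclasses import dataclass

@dataclass
class Config:
    name: str
    quality: float       # e.g. eval score on the user's own workload, 0..1
    latency_ms: float    # p50 latency measured on that workload
    cost_per_1m: float   # dollars per million tokens

def optimize(candidates: list[Config],
             min_quality: float,
             max_latency_ms: float) -> Config:
    feasible = [c for c in candidates
                if c.quality >= min_quality and c.latency_ms <= max_latency_ms]
    if not feasible:
        raise ValueError("no config satisfies the quality/latency constraints")
    return min(feasible, key=lambda c: c.cost_per_1m)

candidates = [
    Config("fp16-70b", quality=0.92, latency_ms=900, cost_per_1m=9.0),
    Config("fp8-70b",  quality=0.90, latency_ms=550, cost_per_1m=5.0),
    Config("fp8-8b",   quality=0.81, latency_ms=180, cost_per_1m=0.6),
]
best = optimize(candidates, min_quality=0.85, max_latency_ms=600)
print(best.name)  # fp8-70b
```

Different use cases at the same company just pass different constraints and land in different spots on the curve, which is the point of automating it.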
And that's actually sometimes a feature, for creative writing, for example. Sometimes it's a bug because, hey, you don't want to give misinformation. And different models also have different specialties. To solve a problem, you want to ask different specialized models to kind of decompose your task into multiple small, narrow tasks, and then have an expert model solve each task really well. And of course, the model doesn't have all the information. It has limited knowledge because the training data is finite, not infinite. So the model oftentimes doesn't have real-time information. It doesn't know any proprietary information within the enterprise. It's clear that in order to really build a compelling application on top of GenAI, we need a compound AI system. A compound AI system basically is going to have multiple models across modalities, along with APIs, whether public APIs or internal proprietary APIs, storage systems, database systems, and knowledge bases, all working together to deliver the best answer.Swyx [00:20:07]: Are you going to offer a vector database?Lin [00:20:09]: We actually heavily partner with several big vector database providers. Which is your favorite? They are all great in different ways. But it's public information, like MongoDB is our investor. And we have been working closely with them for a while.Alessio [00:20:26]: When you say distributed inference engine, what do you mean exactly? Because when I hear your explanation, it's almost like you're centralizing a lot of the decisions through the Fireworks platform on the quality and whatnot. What do you mean distributed? Is it like you have GPUs in a lot of different clusters, so you're sharding the inference across the same model?Lin [00:20:45]: So first of all, we run across multiple GPUs. But the way we distribute across multiple GPUs is unique. We don't distribute the whole model monolithically across multiple GPUs. We chop it into pieces and scale them completely differently based on what's the bottleneck. We are also distributed across regions. We have been running in North America, EMEA, and Asia. We have regional affinity for applications because latency is extremely important. We are also doing global load balancing because a lot of applications quickly scale to a global population. And then at that scale, different continents wake up at different times, and you want to load balance across them. And we also manage various different kinds of hardware SKUs from different hardware vendors. And different hardware designs are best for different types of workloads, whether it's long context, short context, or long generation. So all these different types of workloads are best fitted to different kinds of hardware SKUs. And then we can even distribute a workload across different hardware. So the distribution actually is everywhere in the full stack.Swyx [00:22:02]: At some point, we'll show on YouTube the image that Ray, I think, has been working on with all the different modalities that you offer. To me, it's basically you offer the open source version of everything that OpenAI typically offers. I don't think there is an equivalent. Actually, if you do text to video, you will be a superset of what OpenAI offers because they don't have Sora. Is that Mochi, by the way? Mochi. Mochi, right?Lin [00:22:27]: Mochi. And there are a few others. I will say, the interesting thing is, I think we're betting that the open source community is going to proliferate. This is literally what we're seeing. 
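To ground the compound AI definition above, here is a minimal, self-contained sketch of one such system: a vision model, an embedding model, a vector database, and a language model composed to answer a question about a PDF. Every function is a hypothetical stand-in for a real model endpoint or database call, not any particular vendor's API.

```python
def vision_extract(pdf_bytes: bytes) -> str:
    # Stand-in: a real system would call a vision/OCR model endpoint here.
    return "Q3 revenue table: ..."

def embed(text: str) -> list[float]:
    # Stand-in: a real system would call an embedding model here.
    return [float(ord(c)) for c in text[:8]]

def vector_search(query_vec: list[float], top_k: int = 5) -> list[str]:
    # Stand-in: a real system would query a vector database of proprietary
    # documents the base model never saw during training.
    return ["internal pricing policy", "last quarter's board memo"][:top_k]

def llm_answer(question: str, context: list[str]) -> str:
    # Stand-in: a real system would call a language model with the context.
    return f"Answer to {question!r} grounded in {len(context)} documents."

def answer_from_pdf(pdf_bytes: bytes, question: str) -> str:
    extracted = vision_extract(pdf_bytes)       # vision model extracts first...
    context = vector_search(embed(question))    # ...plus real-time, proprietary knowledge...
    return llm_answer(question, [extracted, *context])  # ...then the LLM composes the answer

print(answer_from_pdf(b"%PDF-1.7 ...", "What drove Q3 revenue?"))
```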
And there are amazing video generation companies. There are amazing audio companies. Across the board, the innovation is off the charts, and we are building on top of that. I think that's the advantage we have compared with a closed source company.Swyx [00:22:58]: I think I want to restate the value proposition of Fireworks for people who are comparing you versus a raw GPU provider like a RunPod or Lambda or anything like those, which is: you create the developer experience layer and you also make it easily scalable or serverless or as an endpoint. And then, I think for some models, you have custom kernels, but not all models.Lin [00:23:25]: Almost for all models. For all large language models and the VLMs. Almost all models we serve.Swyx [00:23:35]: And so that is called Fire Attention. I don't remember the speed numbers, but apparently much better than vLLM, especially on a concurrency basis.Lin [00:23:44]: So Fire Attention is specific mostly to language models, but for other modalities, we'll also have customized kernels.Swyx [00:23:51]: And I think the typical challenge for people is understanding that that has value, and then there are other people who are also offering open-source models. Your moat is your ability to offer a good experience for all these customers. But if your existence is entirely reliant on people releasing nice open-source models, other people can also do the same thing.Lin [00:24:14]: So I would say we build on top of an open-source model foundation. So that's the kind of foundation we build on top of. But we look at the value prop from the lens of application developers and product engineers. So they want to create new UX. So what's happening in the industry right now is people are thinking about a completely new way of designing products. And I'm talking to so many founders, it's just mind-blowing. They help me understand that the existing way of doing PowerPoint, the existing way of coding, the existing way of managing customer service is actually putting a box around our heads. For example, PowerPoint. With PowerPoint generation, we always need to think about how to fit my storytelling into this format of one slide after another. And I'm going to juggle design together with what story to tell. But the most important thing is the storyline, right? So why don't we create a space that is not limited to any format? Those kinds of new product UX designs, combined with automated content generation through GenAI, are the new things that many founders are doing. What are the challenges they're facing? Let's go from there. One is, again, because a lot of products built on top of GenAI are consumer-, prosumer-, or developer-facing, they require an interactive experience. It's just the kind of product experience we've all gotten used to. And our desire is to actually get faster and faster interaction. Otherwise, nobody wants to spend time, right? And then that requires low latency. And the other thing is, the nature of consumer-, prosumer-, or developer-facing products is that your audience is very big. You want to scale up to product market fit quickly. But if you lose money at a small scale, you're going to go bankrupt quickly. So it's actually a big contrast: I actually have product market fit, but when I scale, I scale myself out of business. So that's kind of a very funny way to think about it. So then having low latency and low cost is essential for those new applications and products to survive and really become a generational company. 
So that's the design point for our distributed inference engine and the Fire Optimizer. Fire Optimizer, you can think about as a feedback loop. The more you feed your inference workload to our inference engine, the more we help you improve quality, lower latency further, and lower your cost. It basically becomes better. And we automate that, because we don't want you as an app developer or product engineer to have to figure out all these low-level details. It's impossible, because you're not trained to do that at all. You should keep your focus on product innovation. And then on compound AI, we actually feel a lot of pain as app developers and engineers: there are so many models. Every week, there's at least a new model coming out.Swyx [00:27:09]: Tencent had a giant model this week. Yeah, yeah.Lin [00:27:13]: I saw that. I saw that.Swyx [00:27:15]: It's like 500 billion parameters.Lin [00:27:18]: So they're like, should I keep chasing this or should I forget about it? And which model should I pick to solve what kind of sub-problem? How do I even decompose my problem into those smaller problems and fit the models to them? I have no idea. And then there are two ways to think about this design. I think I talked about that in the past. One is imperative, as in you figure out how to do it. You give developers tools to dictate how to do it. Or you build a declarative system, where a developer tells you what they want to do, not how. So these are two completely different designs. So the analogy I want to draw is, in the data world, the database management system is a declarative system, because people use databases through SQL. SQL is a way of saying, what do you want to extract out of a database? What kind of result do you want? But you don't figure out which nodes it's going to run on, how many nodes to run on top of, how your data is laid out on disk, which index to use, which projection to apply. You don't need to worry about any of those. And the database management system will figure that out, generate the best plan, and execute on it. So the database is declarative. And it makes it super easy. You just learn SQL, which is learning the semantic meaning of SQL, and you can use it. On the imperative side, there are a lot of ETL pipelines. And people design these DAG systems with triggers, with actions, and you dictate exactly what to do. And if it fails, then how to recover. So that's an imperative system. We have seen a range of systems in the ecosystem go different ways. I think there's value in both. There's value in both. I don't think one is going to subsume the other. But we are leaning more into the philosophy of the declarative system. Because from the lens of app developers and product engineers, that would be the easiest for them to integrate.Swyx [00:29:07]: I understand that's also why PyTorch won as well, right? This is one of the reasons. Ease of use.Lin [00:29:14]: Focus on ease of use, and then let the system take on the hard challenges and complexities. So we follow, we extend that thinking into our current system design. So another announcement is, our next declarative system is going to appear as a model that has extremely high quality. And this model is inspired by o1's announcement from OpenAI. You should see it by the time we announce this, or soon after.Alessio [00:29:46]: Trained by you.Lin [00:29:47]: Yes.Alessio [00:29:48]: Is this the first model that you trained? It's not the first.Lin [00:29:52]: We actually have trained a model called FireFunction. It's a function calling model. 
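Returning to the imperative-versus-declarative contrast above, a small sketch makes the difference concrete. The helper functions and the spec format are invented for illustration; the point is who owns the control flow.

```python
# Hypothetical stand-ins for model/tool calls:
def fetch_documents(q): return [f"doc about {q}"]
def rewrite(q): return q.lower()
def summarize(docs): return " / ".join(docs)
def answer(q, summary): return f"{q} -> based on: {summary}"

# Imperative: the developer dictates each step, the order, and the recovery
# path, like an ETL DAG with triggers and actions.
def imperative_pipeline(question: str) -> str:
    docs = fetch_documents(question)          # you pick the retriever
    if not docs:                              # you own failure handling too
        docs = fetch_documents(rewrite(question))
    return answer(question, summarize(docs))  # you wire every hop

# Declarative: you state what you want and your constraints; a planner decides
# which models to call, in what order, with what caching, the way a database
# picks indexes and join orders from a SQL query.
declarative_spec = {
    "task": "question_answering",
    "constraints": {"max_latency_ms": 500, "min_quality": 0.9, "max_cost_per_call": 0.002},
}

print(imperative_pipeline("What is a compound AI system?"))
```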
It's our first step into compound AI systems, because a function calling model can dispatch a request to multiple APIs. We have a pre-baked set of APIs the model has learned. You can also add additional APIs through the configuration to let the model dispatch accordingly. So we have a very high quality function calling model that's already released. We actually have three versions. The latest version is very high quality. But now we take a further step: you don't even need to use a function calling model. You use the new model we're going to release. It will solve a lot of problems, approaching very high OpenAI quality. So I'm very excited about that.Swyx [00:30:41]: Do you have any benchmarks yet?Lin [00:30:43]: We have a benchmark. We're going to release it hopefully next week. We just put our model on LMSYS and people are guessing. Is this the next Gemini model or a MADIS model? People are guessing. That's very interesting. We're watching the Reddit discussion right now.Swyx [00:31:00]: I have to ask more questions about this. When OpenAI released o1, a lot of people asked about whether or not it's a single model or whether it's a chain of models. Noam and basically everyone on the Strawberry team was very insistent that what they did for reinforcement learning, chain of thought, cannot be replicated by a whole bunch of open source model calls. Do you think that that is wrong? Have you done the same amount of work on RL as they have, or was it a different direction?Lin [00:31:29]: I think they take a very specific approach, where the caliber of the team is very high. So I do think they are the domain experts in doing the things they are doing. I don't think there's only one way to achieve the same goal. We're going in the same direction in the sense that the quality scaling law is shifting from training to inference. On that, I fully agree with them. But we're taking a completely different approach to the problem. All of that is because, of course, we didn't train the model from scratch. All of that is because we build on the shoulders of giants. The current models we have access to are getting better and better. The future trend is that the gap between the open source models and the closed source models is just going to shrink to the point where there's not much difference. And then we're on a level playing field. That's why I think our early investment in inference and all the work we do around balancing across quality, latency, and cost pays off, because we have accumulated a lot of experience, and that empowers us to release this new model that is approaching OpenAI quality.Alessio [00:32:39]: I guess the question is, what do you think the gap to catch up will be? Because I think everybody agrees that open source models eventually will catch up. And I think with Llama 3, then with Llama 3.2 and 3.1 405B, we closed the gap. And then o1 just reopened the gap so much, and it's unclear. Obviously, you're saying your model will have...Swyx [00:32:57]: We're closing that gap.Alessio [00:32:58]: But you think in the future, it's going to be months?Lin [00:33:02]: So here's the thing that's happened. There are public benchmarks. It is what it is. But in reality, open source models in certain dimensions are already on par with or beat closed source models. So for example, in the coding space, open source models are really, really good. And in function calling, FireFunction is also really, really good. 
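A toy sketch of the function calling dispatch Lin describes above: the model emits a structured call instead of plain text, and the runtime routes it to a registered API. The tool registry and the hard-coded model stub are hypothetical, not FireFunction's actual interface.

```python
import json

# Hypothetical registry of APIs the function calling model can dispatch to.
TOOLS = {
    "get_weather": lambda city: f"18C and clear in {city}",
    "get_stock_price": lambda ticker: f"{ticker}: 142.10",
}

def fake_function_calling_model(user_message: str) -> str:
    # Stand-in for a function calling model: given the message and the tool
    # schemas, it emits a structured call rather than a prose answer.
    if "weather" in user_message:
        return json.dumps({"name": "get_weather", "arguments": {"city": "Tokyo"}})
    return json.dumps({"name": "get_stock_price", "arguments": {"ticker": "NVDA"}})

def dispatch(user_message: str) -> str:
    call = json.loads(fake_function_calling_model(user_message))
    tool = TOOLS[call["name"]]           # route to the right API...
    return tool(**call["arguments"])     # ...with the model-chosen arguments

print(dispatch("What's the weather in Tokyo?"))
```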
So it's all a matter of whether you build one model to solve all the problems, and you want to be the best at solving all the problems, or whether, in the open source domain, it's going to specialize. All these different model builders specialize in certain narrow areas. And it's logical that they can be really, really good in that very narrow area. And our prediction is that with specialization, there will be a lot of expert models that are really, really good, and even better than one-size-fits-all closed source models.Swyx [00:33:55]: I think this is the core debate that I am still not 100% settled on either way, in terms of compound AI versus normal AI. Because you're basically fighting the bitter lesson.Lin [00:34:09]: Look at human society, right? We specialize. And you feel really good about someone specializing in doing something really well, right? And that's how we evolved from ancient times. We were all generalists. We did everything. Now we heavily specialize in different domains. So my prediction is that in the AI model space, it will happen also. Except for the bitter lesson.Swyx [00:34:30]: You get short-term gains by having specialists, domain specialists, and then someone just needs to train a 10x bigger model on 10x more data, with 10x more compute perhaps, whatever the current scaling law is. And then it supersedes all the individual models because of some generalized intelligence slash world knowledge. I think that is the core insight of the GPTs, the GPT-1, -2, -3 networks. Right.Lin [00:34:56]: But the training scaling law is because you have an increasing amount of data to train from. And you can do a lot of compute. So I think on the data side, we're approaching the limit. And the only way to increase that is synthetically generated data. And then there's, like, what is the secret sauce there, right? Because if you have a very good large model, you can generate very good synthetic data and then continue to improve quality. So that's why I think in OpenAI, they are shifting from the training scaling law intoSwyx [00:35:25]: inference scaling law.Lin [00:35:25]: And it's the test time and all this. So I definitely believe that's the future direction. And that's what we are really good at: doing inference.Swyx [00:35:34]: A couple of questions on that. Are you planning to share your reasoning traces?Lin [00:35:39]: That's a very good question. We are still debating.Swyx [00:35:43]: Yeah.Lin [00:35:45]: We're still debating.Swyx [00:35:46]: I would say, for example, it's interesting that, for example, SWE-Bench. If you want to be considered for ranking, you have to submit your reasoning traces. And that has actually disqualified some of our past guests. Cosine was doing well on SWE-Bench, but they didn't want to leak those traces. So that's why you don't see o1-preview on SWE-Bench, because they don't submit their reasoning traces. And obviously, it's IP. But also, if you're going to be more open, then that's one way to be more open. So your model is not going to be open source, right? It's going to be an endpoint that you provide. Okay, cool. And then pricing, also the same as OpenAI, just kind of based on...Lin [00:36:25]: Yeah, this is... I don't actually have information. Everything is going so fast, we haven't even thought about that yet. Yeah, I should be more prepared.Swyx [00:36:33]: I mean, this is live. You know, it's nice to just talk about it as it goes live. Any other things that you want feedback on or you're thinking through? 
It's kind of nice to just talk about something when it's not decided yet. About this new model. It's going to be exciting. It's going to generate a lot of buzz. Right.Lin [00:36:51]: I'm very excited to see how people are going to use this model. So there's already a Reddit discussion about it. And people are asking very deep mathematical questions. And the model got them right, which is surprising. And internally, we're also asking the model what AGI is. And it generates a very complicated DAG of a thinking process. So we're having a lot of fun testing this internally. But I'm more curious, how will people use it? What kinds of applications are they going to try and test on it? And that's where we really like to hear feedback from the community. And also feedback to us: What works out well? What doesn't work out well? What works out well but surprises them? And what kinds of things do they think we should improve on? Those kinds of feedback will be tremendously helpful.Swyx [00:37:44]: Yeah. So I've been a production user of o1-preview and o1-mini since launch. I would say there are very, very obvious jumps in quality. So much so that they made Claude and the previous state-of-the-art look bad. It's really that stark, that difference. The number one thing, just feedback or feature requests, is people want control over the budget. Because right now, o1 kind of decides its own thinking budget. But sometimes you know how hard the problem is. And you want to actually tell the model, spend two minutes on this. Or spend some dollar amount. Maybe it's time, maybe it's dollars. I don't know what the right budget unit is. That makes a lot of sense.Lin [00:38:27]: So we actually thought about that requirement. And at some point, we need to support that. Not initially. But that makes a lot of sense.Swyx [00:38:38]: Okay. So that was a fascinating overview of just the things that you're working on. First of all, I realized that... I don't know if I've ever given you this feedback. But I think you guys are one of the reasons I agreed to advise you. Because I think when you first met me, I was kind of dubious. I was like... Who are you? There's Replicate. There's Together. There's Lepton. There's a whole bunch of other players. You're in very, very competitive fields. Like, why will you win? And the reason I actually changed my mind was I saw you guys shipping. I think your surface area is very big. The team is not that big. No. We're only 40 people. Yeah. And now here you are trying to compete with OpenAI and everyone else. What is the secret?Lin [00:39:21]: I think the team. The team is the secret.Swyx [00:39:23]: Oh boy. So there's nothing I can just copy. You just... No.Lin [00:39:30]: I think we all come from a very aligned culture. Because most of our team came from Meta.Swyx [00:39:38]: Yeah.Lin [00:39:38]: And many startups. So we really believe in results. One is results. And second is customers. We're very customer obsessed. And we don't want to drive adoption for the sake of adoption. We really want to make sure we understand we are delivering a lot of business value to the customer. And we really value their feedback. So we would wake up at midnight and deploy some model for them, shuffle some capacity for them. And yeah, over the weekend, no brainer.Swyx [00:40:15]: So yeah.Lin [00:40:15]: So that's just how we work as a team. And the caliber of the team is really, really high as well. So as a plug: we're hiring. We're expanding very, very fast. 
So if you are passionate about working on the most cutting-edge technology in the GenAI space, come talk with us. Yeah.Swyx [00:40:38]: Let's talk a little bit about that customer journey. I think one of your more famous customers is Cursor. We were the first podcast to have Cursor on. And then obviously since then, they have blown up. Cause and effect are not related. But you guys especially worked on a fast apply model, where you were one of the first people to work on speculative decoding in a production setting. Maybe just talk about what was behind the scenes of working with Cursor?Lin [00:41:03]: I will say Cursor is a very, very unique team. I think the unique part is the team has very high technical caliber. There's no question about it. But where many companies building coding copilots will say, I'm going to build the whole entire stack because I can, they are unique in the sense that they seek partnership. Not because they cannot. They're fully capable, but they know where to focus. That to me is amazing. And of course, they want to find the best partner. So we spent some time working together. They push us very aggressively, because for them to deliver a high caliber product experience, they need the latency. They need the interactivity, but also high quality at the same time. So actually, we expanded our product features quite a lot as we supported Cursor. And they are growing so fast. And we massively scaled quickly across multiple regions. And we developed a pretty intense inference stack, almost similar to what we do for Meta. I think that's a very, very interesting engagement. And through that, there's a lot of trust being built. They realized, hey, this is a team they can really partner with, and they can go big with. That comes back to, hey, we're really customer obsessed. And for all the engineers working with them, there's just an enormous amount of time syncing together with them and discussing. And we're not big on meetings, but we have a Slack channel that's always on. Yeah, so you almost feel like you're working as one team. So I think that's a real highlight.Swyx [00:42:38]: Yeah. For those who don't know, so basically Cursor is a VS Code fork. But most of the time, people will be using closed models. Like I actually use a lot of Sonnet. So you're not involved there, right? It's not like you host Sonnet or you have any partnership with it. You're involved where cursor-small, or their house brand models, are concerned, right?Lin [00:42:58]: I don't know what I can say beyond the things they have said.Swyx [00:43:04]: Very obviously, the dropdown shows 4o in Cursor, right? So I assume that the Cursor model side is the Fireworks side. And then on the other side, they're calling out to the others. Just kind of curious. And then, do you see any more opportunity on the... You know, I think you made a big splash with 1,000 tokens per second. That was because of speculative decoding. Is there more to push there?Lin [00:43:25]: We push a lot. Actually, when I mentioned Fire Optimizer, right? So as in, we have a unique automation stack that is one size fits one. We actually deployed it to Cursor earlier on, basically optimized for their specific workload. And there's a lot of juice to extract out of there. And we saw success with that. It actually can be widely adopted. So that's why we started a separate product line called Fire Optimizer. So speculative decoding is just one approach. And speculative decoding here is not static. 
We actually wrote a blog post about it. There are so many different ways to do speculative decoding. You can pair a small model with a large model in the same model family. Or you can have EAGLE heads, and so on. There are different trade-offs in which approach you take. It really depends on your workload. And then with your workload, we can align the EAGLE heads or Medusa heads or a small-big model pair much better to extract the best latency reduction. So all of that is part of the Fire Optimizer offering.Alessio [00:44:23]: I know you mentioned some of the other inference providers. I think the other question that people always have is around benchmarks. So you get different performance on different platforms. How should people think about... People are like, hey, Llama 3.2 is X on MMLU. But maybe using speculative decoding, you go down a different path. Maybe some providers run a quantized model. How should people think about how much they should care about how you're actually running the model? What's the delta between all the magic that you do and what a raw model...Lin [00:44:57]: Okay, so there are two big development cycles. One is experimentation, where they need fast iteration. They don't want to think about quality, and they just want to experiment with product experience and so on. So that's one. And then, when it looks good, they move post-product-market-fit into scaling. And the quality is really important. And latency and all the other things become important. During the experimentation phase, just pick a good model. Don't worry about anything else. Make sure you can even generate the right solution for your product. And that's the focus. And then post-product-market fit, that's when the three-dimensional optimization curve starts to kick in across quality, latency, and cost: where you should land. And to me, it's purely a product decision. For many products, if you choose lower quality but better speed and lower cost, and it doesn't make a difference to the product experience, then you should do it. So that's why I think inference is part of the validation. The validation doesn't stop at offline evals. The validation goes through A/B testing, through inference. And that's where we offer various different configurations for you to test which is the best setting. So this is traditional product evaluation. So product evaluation should also take your new model versions and different model setups into consideration. 
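Tying back to the speculative decoding discussion above, a toy draft-and-verify loop shows the basic mechanic: a cheap draft model proposes several tokens at once and the large model only verifies them. The models here are random stand-ins; a real implementation (small-big pairs, EAGLE or Medusa heads) compares token probabilities rather than flipping a biased coin, and the acceptance rate, which depends on how well the draft is aligned to the workload, is exactly what determines the speedup.

```python
import random
random.seed(0)

def draft_model(prefix: list[str], k: int) -> list[str]:
    # Cheap model proposes k tokens in one shot (random stand-in).
    return [random.choice(["the", "cat", "sat"]) for _ in range(k)]

def target_model_accepts(prefix: list[str], token: str) -> bool:
    # Stand-in for verifying a draft token against the large model's
    # distribution; a real system compares probabilities, not a coin flip.
    return random.random() < 0.7

def generate(prompt: list[str], max_tokens: int = 12, k: int = 4) -> list[str]:
    out = list(prompt)
    while len(out) - len(prompt) < max_tokens:
        drafted = draft_model(out, k)
        for tok in drafted:                  # verify drafts left to right
            if target_model_accepts(out, tok):
                out.append(tok)              # accepted: nearly free speedup
            else:
                out.append("fallback")       # rejected: take the target model's token
                break                        # and re-draft from the new prefix
    return out

print(generate(["once", "upon"]))
```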
And we don't think that's a good way to figure out the competition landscape. So then we reacted. I think when it comes to quantization and its interpretation, we actually wrote a very thorough blog post. Because again, it's not one size fits all. We have various different quantization schemes. We can quantize very different parts of the model, from weights to activations to cross-GPU communication. They can use different quantization schemes or be consistent across the board. And again, it's a trade-off. It's a trade-off across this three-dimensional space of quality, latency, and cost. And for our customers, we actually let them find the best optimized point, and we have a very thorough evaluation process to pick that point. But for self-serve, there's only one point to pick. There's no customization available. So of course, it depends on what we hear from many customers. We have to pick one point. And I think the end result, like AA (Artificial Analysis) later published a quality measure, and we actually looked really good. So what I mean is, I will leave the evaluation of quality or performance to third parties and work with them to find the most fair benchmark. And I think that's a good approach, a good methodology. But I'm not a fan of the approach of calling out specific namesSwyx [00:48:55]: and critiquing other competitors in a very biased way. It happens in databases as well. I think you're the more politically correct one. And then Dima is the more... Something like this. It's you on Twitter.Lin [00:49:11]: It's like the Russian... We partner. We play different roles.Swyx [00:49:20]: Another one that I wanted to... This is just the last one on the competition side. There's a perception of price wars in hosting open source models. And we talked about the competitiveness in the market. Do you aim to make margin on open source models? Oh, absolutely, yes.Lin [00:49:38]: So, but I think it really... When we think about pricing, it really needs to correlate with the value we're delivering. If the value is limited, or there are a lot of people delivering the same value, there's no differentiation. There's only one way to go: down, through competition. If I take a big step back, we're more comparable with the closed model providers' APIs, right? Versus the closed model providers, our cost structure is even more interesting, because we don't bear any training costs. And we focus on inference optimization, and that's kind of where we continue to add a lot of product value. So that's how we think about the product. But the closed source API providers, the model providers, bear a lot of training costs. And they need to amortize the training costs into the inference. So that creates the very interesting dynamic of, if we match pricing there, how they are going to make money is very, very interesting.Swyx [00:50:37]: So for listeners, OpenAI's 2024: $4 billion in revenue, $3 billion in training compute, $2 billion in inference compute, $1 billion in research compute amortization, and $700 million in salaries. So that is like...Swyx [00:50:59]: I mean, a lot of R&D.Lin [00:51:01]: Yeah, so I think Meta basically makes it zero. So that's a very, very interesting dynamic we're operating within. But coming back to inference, again, as I mentioned, our product is a platform. We're not just a single-model-as-a-service provider, as many other inference providers are, providing a single model. 
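To illustrate the quantization trade-off Lin describes above, here is a minimal sketch of one scheme, symmetric per-tensor int8 weight quantization. It is deliberately the coarsest option; per-channel scales, activation quantization, and quantized cross-GPU communication are all separate knobs, and nothing here represents any provider's production recipe.

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    # One scale for the whole tensor: the coarsest (and cheapest) choice.
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

w = np.random.randn(4, 4).astype(np.float32)
q, scale = quantize_int8(w)

# The mean reconstruction error is the "quality" side of the
# quality/latency/cost trade-off; int8 storage and compute buy the other two.
err = np.abs(w - dequantize(q, scale)).mean()
print(f"mean abs error: {err:.4f}")
```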
We have our optimizer to highly customize towards your inference workload. We have a compound AI system that significantly simplifies your path to high quality, low latency, and low cost. So those are all very different from other providers.Alessio [00:51:38]: What do people not know about the work that you do? I guess like people are like, okay, Fireworks, you run models very quickly. You have the function calling model. Is there any kind of underrated part of Fireworks that more people should try?Lin [00:51:51]: Yeah, actually, one user posted on x.com; he mentioned, oh, actually, Fireworks allows me to upload a LoRA adapter to the served model and use it at the same cost. Nobody else has provided that. That's because we have something very special: we rolled out multi-LoRA last year, actually. So we have actually had this function for a long time. And many people have been using it, but it's not well known that if you fine-tune your model, you don't need to use on-demand deployments. If your fine-tune is a LoRA, you can upload your LoRA adapter and we deploy it as if it's a new model. And then you get your endpoint and you can use that directly, but at the same cost as the base model. So I'm happy that user is marketing it for us. He discovered that feature, but we have had that since last year. So I think the feedback to me is, we have a lot of very, very good features, as Sean just mentioned. I'm the advisor to the company,Swyx [00:52:57]: and I didn't know that you had speculative decoding released.Lin [00:53:02]: We have prompt caching from way back last year also. We have many, yeah. So I think that is one of the underrated features. And if you're a developer using our self-serve platform, please try it out.Swyx [00:53:16]: The LoRA thing is interesting, because the reason people add additional costs to it is not that they feel like charging people. Normally in LoRA serving setups, there is a cost to loading those weights and dedicating a machine to that inference. How come you can avoid it?Lin [00:53:36]: Yeah, so this is kind of our technique called multi-LoRA. So we basically have many LoRA adapters share the same base model. And basically we significantly reduce the memory footprint of serving. And one base model can sustain a hundred to a thousand LoRA adapters. And then basically all these different LoRA adapters can direct the same traffic to the same base model, where the base model is dominating the cost. So that's why we can advertise it that way. And that's how we can keep the tokens-per-dollar, per-million-token pricing the same as the base model.Swyx [00:54:13]: Awesome. Is there anything that you want to request from the community, or that you're looking for model-wise or tooling-wise, that you think someone should be working on?Lin [00:54:23]: Yeah, so we really want to get a lot of feedback from the application developers who are starting to build on GenAI, or who have already adopted it, or who are starting to think about new use cases and so on, to try out Fireworks first. And let us know what works out really well for you and what is your wishlist, and what sucks, right? So what is not working out for you, and we would like to continue to improve. And for our new product launches, typically we want to launch to a small group of people. Usually we launch on our Discord first to have a set of people use it first. So please join our Discord channel. We have a lot of communication going on there. 
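A small sketch of the multi-LoRA arithmetic behind that pricing: the shared base weight matrix dominates both memory and compute, while each adapter contributes only a low-rank delta. The shapes and tenant names below are illustrative, not any provider's actual limits.

```python
import numpy as np

d, r = 1024, 8                     # hidden size vs. LoRA rank (r << d)
W = np.random.randn(d, d) * 0.01   # shared base weights: d*d floats, loaded once

adapters = {                       # per-tenant adapters: only 2*d*r floats each
    name: (np.random.randn(d, r) * 0.01, np.random.randn(r, d) * 0.01)
    for name in ["tenant-a", "tenant-b", "tenant-c"]
}

def forward(x: np.ndarray, adapter: str) -> np.ndarray:
    # y = x @ W + (x @ B) @ A: the base matmul dominates; the LoRA path
    # adds only O(d*r) extra work per token, so hundreds of adapters can
    # share one resident copy of W.
    B, A = adapters[adapter]
    return x @ W + (x @ B) @ A

x = np.random.randn(2, d)
print(forward(x, "tenant-a").shape)                          # (2, 1024)
print("adapter/base memory ratio:", (2 * d * r) / (d * d))   # ~1.6% per adapter
```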
Again, you can also give us feedback. We'll have standing office hours for you to talk directly with our DevRel and engineers and exchange notes.Alessio [00:55:17]: And you're hiring across the board?Lin [00:55:18]: We're hiring across the board. We're hiring front-end engineers, cloud infrastructure engineers, back-end system optimization engineers, and applied researchers, like researchers who have done post-training, who have done a lot of fine-tuning, and so on.Swyx [00:55:34]: That's it. Thank you. Thanks for having us. Get full access to Latent Space at www.latent.space/subscribe

Messiah Podcast
62 – Replicate: Making Disciples in the 21st Century | Pastor Dr. Robby Gallaty

Messiah Podcast

Play Episode Listen Later Nov 24, 2024 56:19


When Yeshua called people to discipleship, he was using a first-century Jewish model of passing down faith in and faithfulness to the God of Israel from generation to generation. But for decades, the American Evangelical church has focused almost exclusively on evangelism. Our guest today, Dr. Robby Gallaty, is here to talk about how discipling people the way Yeshua did is the key to reaching the next generation for the kingdom of God. – Episode Highlights – 00:00 Pastor Gallaty's rocky road to redemption. 10:25 How discipleship happens in a megachurch. 21:02 Disciple Making: Evangelism and Discipleship 26:58 The encounters of the Bible are inspired. 34:25 Key to Motivation: Framing the Why 43:35 The best way to better understand the Gospels. 45:36 The Book of Revelation: A Discipleship Manual 52:54 Gym stats. – Episode Resources – X: https://x.com/Rgallaty Instagram: https://www.instagram.com/Rgallaty/ Long Hollow Church: https://longhollow.com/ Forgotten Jesus Podcast: https://podcasts.apple.com/us/podcast/the-forgotten-jesus-podcast/id1662689332 Replicate: https://replicate.org Rediscovering Discipleship: Making Jesus' Final Words Our First Work: https://www.amazon.com/Rediscovering-Discipleship-Making-Jesus-Final-ebook/dp/B00UF7W44S Chronicles of the Messiah: https://ffoz.store/products/chronicles-of-the-messiah-commentary-set-audio-magazine Messiah Podcast is a production of First Fruits of Zion (https://ffoz.org) in conjunction with Messiah Magazine. This publication is designed to provide rich substance, meaningful Jewish contexts, cultural understanding of the teaching of Jesus, and the background of modern faith from a Messianic Jewish perspective. Messiah Podcast theme music provided with permission by Joshua Aaron Music (http://JoshuaAaron.tv). “Cover the Sea” Copyright WorshipinIsrael.com songs 2020. All rights reserved.

BiggerPockets Daily
A Five-Year Plan to Financial Freedom That Anyone Can Replicate

BiggerPockets Daily

Play Episode Listen Later Nov 23, 2024 14:41


Are you chasing financial freedom but unsure where to start? In today's episode, we explore a five-year roadmap to break free from the 9-to-5 grind and live life on your terms. Learn how redefining financial freedom, leveraging real estate, and supercharging your savings can transform your financial future. Whether you're dreaming of early retirement or simply want to pursue fulfilling work without financial stress, this episode will inspire you to take action and design your path to freedom. Don't miss these actionable strategies and real-life insights! Keep reading the article here: https://www.biggerpockets.com/blog/five-year-plan-for-earning-financial-freedom Subscribe to the BiggerPockets Channel for the best real estate investing education online! Become a member of the BiggerPockets community of real estate investors - https://www.biggerpockets.com Learn more about your ad choices. Visit megaphone.fm/adchoices

The Tom Woods Show
Ep. 2571 The Trump Victory: Can Individuals and Businesses Replicate the Strategy?

The Tom Woods Show

Play Episode Listen Later Nov 21, 2024 61:05


The great (and based) email marketing expert Ben Settle joins Paul Counts (who directs my mastermind programs with me) for this look into how Donald Trump accomplished the seemingly impossible. Sponsors: Code: WOODS & Guests' Links:

More Salon Clients: Salon Owners Guide to Marketing
[160] REPLAY - SALES SECRETS FOR HAIRSTYLISTS: How to Book More Clients and Make More Money

More Salon Clients: Salon Owners Guide to Marketing

Play Episode Listen Later Nov 19, 2024 28:03


Hey Successful Hairstylists!! THIS episode was your FAVORITE all-time episode of the podcast, so today I'm doing a throwback! Here are the TOP SALES SECRETS FOR HAIRSTYLISTS! Ever have that BOMB month where tons of new clients are coming, everyone is saying YES to your recommendations and you're just ON FIRE?! And at the end of it you look back and you're like... Where did all of those new clients come from? What did I do/say to make my clients so happy? No idea! If we don't know what's working or not working in our business - we cannot REPLICATE our success or PREVENT our failures. Today I'm sharing SALES SECRETS for Hairstylists so that you can understand how to spend your free time to continue to grow! Let's jump in! --- Support this podcast: https://podcasters.spotify.com/pod/show/successfulhairstylists/support

Flatirons Community Church Audio Podcast
The 5 R's To Transforming Your Life | Still Standing | Week 6

Flatirons Community Church Audio Podcast

Play Episode Listen Later Nov 18, 2024 42:24


Are you simply admiring Jesus, or are you learning to live like Him? The difference between the two shapes everything about your faith journey. In this message, we break down the steps to becoming an apprentice of Jesus: moving from belief in Him to embodying the faith of Him. Through the 5 R's—Raise, Reevaluate, Reprioritize, Rebuild, and Replicate—you'll discover a clear pathway for spiritual growth, no matter where you are in your journey. Are you ready to take the next step?

Raw Data By P3
Head to Head: Can ChatGPT Replicate Rob's Inspired Solution to a Power Query Challenge?

Raw Data By P3

Play Episode Listen Later Oct 8, 2024 46:34


In this episode, Rob Collie and Justin Mannhardt step onto the data gridiron to test if AI can tackle the complexities of fantasy football data. Rob starts with a Power Query puzzle, full of messy stats, player names, and tricky injury codes—solving it manually in over 30 steps. But can ChatGPT, with Justin's guidance, run the same play and simplify fantasy football data management? Justin and ChatGPT take their shot at transforming the chaos into organized data. While ChatGPT makes some impressive moves, there are a few fumbles that leave us wondering—can AI truly handle the ever-changing landscape of fantasy football stats? Want to know how this battle played out? Tune in to find out if AI has what it takes to streamline your fantasy football data or if human instincts still win the day. Here is the companion Blog Post to follow along - Can AI Write an M Script When it Requires Inspiration? Rob Collie Squares Off Versus ChatGPT

Thunder & Lightning
8-15-24: Can Jeff Lebby Replicate Dan Mullen's Success at Mississippi State?

Thunder & Lightning

Play Episode Listen Later Aug 15, 2024 46:06


Year one for Dan Mullen at Mississippi State saw the kind of qualitative improvement that set the stage for the seasons that followed. Can Jeff Lebby show the kind of first-year fireworks that Mullen did? Brian Hadad and Robbie Faulk discuss that with some help from Coach Dan Mullen himself. Brian Hadad of SportsTalk Mississippi and Robbie Faulk of On3 give you the best insight into Mississippi State athletics. We cover the Bulldogs like no one else. This show is a production of SuperTalk Mississippi Media. Learn more at SuperTalk.FM