Today's guest is a powerhouse of resilience and resourcefulness: Janice Thayer of Curated Properties in Abingdon, Virginia. As a longtime member of Hosting Business Mastery (HBM), Janice shares her remarkable journey from navigating personal upheaval to launching a thriving short-term rental business — all while living on-site.

In this inspiring conversation, Janice explains how she leveraged her design expertise and entrepreneurial background to build a luxury hosting brand, create multiple rentable spaces within her own home, and master the art of direct bookings. She walks us through the challenges of launching during COVID, navigating property renovations, and the mindset that has allowed her to double her revenue every year since opening.

We also discuss how Janice uses dynamic pricing, multiple social media platforms, and creative SEO tactics to attract guests — including how she's achieved a 28% direct booking rate with zero paid advertising. Plus, she shares her strategies for guest screening, why she chooses to stay involved in HBM years after launching, and why resilience is every host's superpower.

If you've ever wondered whether you can truly take control of your STR income and build a sustainable, guest-loved business — this is the episode for you.

In this episode, we cover:
• How Janice's interior design and hospitality background set her apart as a host
• Starting a STR business after a major life transition
• The pros and cons of living onsite with your guests
• Renovating and reconfiguring space to maximize bookings
• Why dynamic pricing was a game-changer for Janice's revenue
• How she built a direct booking website that now accounts for nearly 30% of her income
• Tips for using social media and SEO to attract direct bookings
• Managing cleaners and maintaining high standards
• Why Janice stays engaged in Hosting Business Mastery year after year
• The #1 mindset shift every host needs to succeed

Resources mentioned:
• Join our FREE upcoming workshop: thanksforvisiting.com/workshop
• Watch on YouTube: The #1 Airbnb Revenue Management Metric You NEED to know about!
• Follow Janice and explore her properties: linktr.ee/curatedpropertiesllc

Mentioned in this episode:
Make More Money This Year | Join our LIVE Workshop!
Minoan | Visit MinoanExperience.com and tell them TFV sent you!
Hostfully | Go to https://www.hostfully.com/tfv and use TFV500 to get $500 off your subscription.
While Trump fires off broadsides at the entire world, TSMC is speaking with its wallet, rapidly locking in its place in the future global order! Following Trump's White House press conference, Lai Ching-te and C.C. Wei jointly explained the largest foreign investment in US history: TSMC announced a US$100 billion investment! Is Taiwan doomed? Is TSMC about to become "American Semiconductor"? In fact, you may not have realized that TSMC is undervalued among the world's top ten tech stocks, and that building fabs in the US actually brings surprising benefits. And who is most afraid of the "sacred mountain that protects the nation" investing in America? This company is panicking, scrambling all over the US to find pro-Trump allies! Meanwhile, one country's advanced process technology not only can't see TSMC's taillights, it may be knocked flat by Trump's tariffs! Tariffs on China, belittling Xi Jinping: is China Trump's next target? This public video decodes the biggest chess game in the global semiconductor industry, and the members-only video offers an even more complete, ultra-deep research report! Emmy keeps watch on TSMC's US investment and brings you the most professional in-depth coverage of the semiconductor wars. Subscribe, like, and become a member! Share this with your friends and family! Taiwan's only in-depth serialized coverage of the world economy continues, with more to come! (Join as a member now to support us and see more exclusive videos.) https://www.youtube.com/@emmytw/join Is China faceplanting while Taiwan's networking stocks seize the chance to eat their fill?
※※※ While anchor Kim Hyun-jung is on training leave, former senior presidential secretary for political affairs Lee Cheol-hee hosts "Kim Hyun-jung's News Show" ※※※ Samsung Electronics and SK Hynix together account for 25% of Korea's domestic stock market capitalization. Samsung Electronics' crisis is a structural problem, not a cyclical one: it has failed to catch up in HBM, its foundry business is running at a loss, and its commodity chips are losing ground to China. Bold structural reform is essential, with divisions spun off into independent management. This is not something a special law can fix. ■ Broadcast: CBS Radio [Kim Hyun-jung's News Show] FM 98.1 (07:10–09:00) ■ Host: Lee Cheol-hee (filling in for anchor Kim Hyun-jung) ■ Guest: Park Sang-in (Professor, Graduate School of Public Administration, Seoul National University). See omnystudio.com/listener for privacy information.
This video is sponsored by Finchat.io: Supercharge your analysis with AI! Get 15% off your membership with our special link here: https://finchat.io/csi/ Join us on Discord with Semiconductor Insider: https://ko-fi.com/chipstockinvestor/tiers Is SK hynix a good stock to buy now? Chip Stock Investor analyzes SK hynix, the leading provider of high bandwidth memory (HBM), crucial for AI, accelerated computing, and data center and cloud applications. Learn what HBM is versus traditional DRAM, the competitive landscape with Micron, Samsung, and upstart memory chip makers in China, and SK hynix's financial outlook. Gain insights to make informed investment decisions in the memory chip segment of the semiconductor industry! Safeguard your personal information with Aura's monitoring service – try it free for two weeks and see where your data might be lurking: https://aura.com/chipstockinvestor
[Son-e Japhineun Gyeongje (Economy in Your Hands) Interview] The future of Korea's HBM development and its survival strategy - Professor Shim Dae-yong, Dong-A University (former SK Hynix vice president)
Hello everyone!! For our first HBM of 2025 we offer something quite strange: at first glance it's all about military propaganda and nothing like our usual podcast range, Ace Combat 7! But, as we are joined by our friend Sid of, among other things, The Bad Game Hall of Fame, we dive deeper! And yeah... there's a lot of propaganda, but also attempts at telling a story and even being anti-war, but we shall see. Happy new year and here's to plenty of HBM in 2025! Enjoy!

Check out Sid's stuff! https://linktr.ee/beamsplashx

If you can and are interested in early episodes and the Here Be Extras, check our Patreon! https://www.patreon.com/leftpage

Also! If you're not there already, feel free to join our Discord, as we have been more talkative than usual, and plan to do so more and more! https://discord.gg/J2wgG3yrPN

Intro Music: Home, by Karl Casey @ White Bat Audio
Outro Music: Leve Palestina, Spartacus

Hosted on Acast. See acast.com/privacy for more information.
Today's economic news - Jensen Huang: "Samsung's HBM will succeed"
Greetings and Salutations everyone,

We decided to do a second HBM episode this month instead of an HBE; we know part of our audience likes it when we talk about BioWare, so we hope this is a fun holiday present for you! To be honest, this is a very text-based analysis and might not be the easiest to follow for those who have not played the game. We do think there is something here for everyone, since we talk about how you can approach RPGs and choices in video games in general. We also talk about notions of representation that are universally accessible.

Hope you enjoy!
Frank & Leon

If you can and are interested in early episodes and the Here Be Extras, check our Patreon! https://www.patreon.com/leftpage

Also! If you're not there already, feel free to join our Discord, as we have been more talkative than usual, and plan to do so more and more! https://discord.gg/J2wgG3yrPN

Intro Music: Home, by Karl Casey @ White Bat Audio
Outro Music: Leve Palestina, Spartacus

Hosted on Acast. See acast.com/privacy for more information.
How are ever-increasing AI demand and workloads affecting memory? Six Five is On The Road at Marvell Industry Analyst Day to answer just that. Hosts Patrick Moorhead and Daniel Newman are joined by Marvell Technology, Samsung Semiconductor, and SK hynix America executives In Dong Kim, Sunny Kang, and Will Chu for a conversation on the collaboration between Marvell, Samsung, and SK hynix on custom high bandwidth memory (HBM) solutions, aimed at enhancing the processors driving accelerated infrastructure. This new approach to memory solutions is projected to increase memory capacity, optimize power usage, and reduce silicon waste, marking a significant advancement in the custom chip technology space. Their discussion covers: The collaboration between Marvell, Samsung Semiconductor, and SK hynix on developing custom HBM solutions The key benefits of custom HBM, including increased memory capacity and optimization of power and performance Insights into the customization process of HBM and how it delivers its advantages The impact of custom HBM on AI processors and projections for its market adoption The roadmap and future potential of custom HBM for various applications and industries
In this episode, Ben Bajarin and Jay Goldberg discuss the recent Marvell Industry Analyst Day, focusing on the concept of accelerated infrastructure in data centers, the competitive landscape with Broadcom, and the significance of custom HBM in AI silicon. They explore how Marvell is positioning itself as a data center company and the implications of custom solutions in the evolving semiconductor industry. The conversation also touches on Nvidia's dominance and the future of data centers, emphasizing the need for optimization and the potential for a shift back to more affordable solutions. In this conversation, Ben Bajarin and Jay Goldberg discuss the recent developments surrounding Broadcom, particularly its stock surge attributed to optimism in AI. They delve into the company's market position, the significance of data center design, and the distinction between Total Addressable Market (TAM) and Serviceable Addressable Market (SAM). The discussion also covers the critical role of networking in AI, the rise of million-node data centers, and Broadcom's strategy regarding M&A and custom silicon. The conversation highlights the evolving landscape of AI and the competitive dynamics between major players in the industry.
This may be the deadliest disaster in the history of the space industry, and hardly anyone has heard of it (Rakéta, 2024-12-08 08:12:05, Science, Satellite) It is not inconceivable that the attempted launch of the Intelsat-708 satellite ended in a horrific tragedy claiming several hundred lives.

Is the traditional ChatGPT era over? (ITBusiness, 2024-12-08 06:06:06, Infotech, Artificial intelligence, ChatGPT, OpenAI) This week OpenAI launched what CEO Sam Altman called "the smartest model in the world": a generative AI program that reportedly far surpasses its predecessors in capability and comes closer to mimicking human thinking than any previous software.

Hackers offer a "solution" to printer problems (Mínuszos, 2024-12-08 04:33:06, Infotech, Google, Hacker, Cybersecurity) Researchers at the cybersecurity firm Malwarebytes warn that scammers are offering fixes for printer problems in Google ads. Anyone who has used a printer has probably run into some frustrating issue: unresponsive software, paper jams, or other malfunctions. When…

Hungarian innovation: first awards for paperless fire safety (Digital Hungary, 2024-12-08 13:46:00, Infotech, University, Innovation) fiREG.hu Kft., the Hungarian company behind the paperless fire-safety e-log system, held its first fire-safety conference this year, where awards were presented for the first time to companies and university students committed to innovative fire protection.

International research team detects extremely high-energy electrons from our cosmic neighborhood (Csillagászat, 2024-12-08 09:12:56, Science, Energy, Space) Electrons of such high energy have never before been found in cosmic rays, and their source may be surprisingly close to us. A team of astronomers and physicists, using a telescope in Namibia, discovered extremely high-energy electrons in cosmic radiation, among the charged particles (mainly protons and…

Euronics swept through November like a tornado (TechWorld, 2024-12-08 06:44:15, Mobiltech, Coffee, Typhoon, Tornado) November 2024 brought records for Euronics: during Black November the company's revenue grew more than 16% year over year. Automatic coffee machines, cordless vacuum cleaners, and game consoles were the most popular products, while online purchases grew 25%. Increasingly popular among shoppers…

What is HBM memory, and why doesn't the US want China to have access to it? (SG.hu, 2024-12-08 14:32:49, Infotech, USA, China) Demand for these cutting-edge semiconductors has surged along with the global AI craze.

Card fraudsters are trying ever craftier tactics (digitrend-i, 2024-12-08 09:53:55, Infotech, Bank card) Recently, bad-faith merchants have been exploiting the pre-Christmas shopping season to regularly drain the card accounts of inattentive shoppers. They do this by hiding, or flagging only ambiguously, the option for recurring card charges on their websites. Here is how the latest trick works: the scammers target cust…

It happened: the new ChatGPT tried to deceive its handlers and copy itself elsewhere (PC Fórum, 2024-12-08 08:20:00, Infotech, Artificial intelligence, ChatGPT, OpenAI) This week OpenAI shared the story of an interesting and frightening experiment from testing the Pro variant of its newest AI, GPT-o1. The machine intelligence reportedly tried methodically to deceive its handlers and copy itself elsewhere when it learned it was slated for deletion. The story is eerily simi…

Microsoft could take a billion-pound hit (ICT Global, 2024-12-08 05:03:12, Infotech, Court, Microsoft, Damages) Microsoft may have to pay thousands of British businesses if a court awards damages in the class action filed against the tech giant.

Thailand and Vietnam would build their futures with AI (BitcoinBázis, 2024-12-08 06:34:05, Infotech, Artificial intelligence, Innovation, Thailand, Nvidia, Vietnam) The concept of "sovereign AI" is an innovation framework in which countries develop and deploy AI solutions using their own computing infrastructure, data sources, and local experts. This is especially important for Thailand and Vietnam, as both countries, with NVIDIA's support, are buil…

Google's AI generates playable video games in real time (Rakéta, 2024-12-08 08:12:05, Infotech, Artificial intelligence, Google) Genie 2 needs only an image and a short text prompt to create playable demos of a quality comparable to today's video games.

Emineo private hospital is developing an orthopedic robotic surgery center (ProfitLine, 2024-12-08 03:00:00, Economy, Robot) Emineo private hospital is developing an orthopedic robotic surgery center; in the center's first three months of operation, 11 orthopedic surgeons obtained certification from the manufacturer to use the robot, the institution told MTI.

Find our other episodes at podcast.hirstart.hu.
20241203 [News One Shot] US announces semiconductor export controls on China… Korean-made HBM also covered (Son Ji-eun, Seoul Shinmun reporter)
Today we're talking about the new restrictions imposed this week by the United States on China's semiconductor industry. This is the third round of sanctions in just a few years, and the consequences could be heavy for Beijing.

First, let's look at the goal of these new measures. The Biden administration wants to limit China's access to advanced technologies, notably those used to develop chips for artificial intelligence or military applications.

To that end, 140 new Chinese companies have been added to the US sanctions list. Chinese companies on this list can no longer receive equipment from the United States and allied countries without a special license, granted sparingly by the American authorities.

Samsung could lose 30% of its HBM memory chip sales

Among them are makers of semiconductor manufacturing equipment such as Naura Technology Group. These restrictions aim to slow China's ambitions in next-generation chip production.

The new rules particularly affect high-bandwidth memory chips. These chips are essential for running high-end computing applications, such as training artificial intelligence models.

For example, Samsung, one of the leaders in this field, could lose a significant part of its Chinese market, which accounts for about 30% of its HBM memory chip sales.

American restrictions don't stop at the borders of American companies

Finally, why do American sanctions against China have worldwide repercussions? Because the restrictions don't stop at the borders of American companies. They also apply to equipment manufactured in countries such as Israel, Singapore, or Taiwan. Why? Because the balance of power between the United States and these countries tips in favor of the former. If Taiwan does not comply with American sanctions law, it risks retaliation.

On the Chinese side, the response has been sharp. The Ministry of Foreign Affairs accuses the United States of undermining global supply chains. Still, despite China's efforts to develop local production, it remains dependent on technologies from abroad.

Le ZD Tech is on all podcast platforms. Subscribe! Hosted by Ausha. Visit ausha.co/politique-de-confidentialite for more information.
[In-Depth Economic News] 1) Consumer groups: "Disclose the basis for OTT subscription pricing" 2) The government's Capital Markets Act amendment… how does it differ from the Commercial Act? 3) China's 10-year government bond yield falls below 2% 4) US semiconductor export controls on China… a blow to Korean-made HBM - Kim Chi-hyung, economic news curator - Cho Mi-hyun, Korea Economic Daily reporter - Na Su-ji, Korea Economic Daily reporter
The US government has decided to control exports to China of high-bandwidth memory (HBM), a key component China needs to develop artificial intelligence. Korean semiconductor companies such as Samsung and SK Hynix are also expected to be affected. China pushed back, calling it a textbook act of economic coercion and a non-market method. Impeachment motions against Board of Audit and Inspection Chair Choi Jae-hae, Seoul Central District Prosecutor Lee Chang-soo, and two other prosecutors were reported to the National Assembly plenary session and will be put to a vote at tomorrow's session. With the Democratic Party's budget bill cutting 4.1 trillion won now on hold, the ruling and opposition parties are expected to continue a tense budget tug-of-war over the coming week. According to Statistics Korea's "Online Shopping Trends," online shopping transactions in October totaled 20.2845 trillion won, up just 0.6% from the same month last year. Three in ten people who neither worked nor looked for work and were "just resting" were young people in their 20s and 30s. See omnystudio.com/listener for privacy information.
241125(2) [Brilliant Economy] (1) BLACKPINK and BTS "comebacks"… the stock market goes wild / (2) Countdown begins: Samsung Electronics' HBM deliveries to Nvidia imminent / (3) Why SpaceX could soon surpass Tesla / (4) Lotte in crisis… the opening shot of restructuring
You've all been clamoring for it: our "Samsung is in deep trouble" series is finally back! Samsung's wafer fabrication and memory businesses are both collapsing. Not only will half of its foundry production lines be shut down before the end of the year, but engineers in the memory division are desperate to jump ship, preferring to become civil servants rather than stay at Samsung. Scariest of all, Samsung, which once courted China so eagerly, is now being hit by a flood of cheap DRAM dumped by Chinese makers. Why is Samsung, rather than SK Hynix or Micron, taking the biggest hit? Will Samsung become the Korean version of HTC? ☆ The world's luckiest temple hat ☆ new colors now in stock
The story of Groq, a semiconductor startup that makes chips for AI inference and was recently valued at $2.8 billion, is a classic “overnight success that was years in the making” tale. On this episode, I talk with founder and CEO Jonathan Ross. He began the work that eventually led to Groq as an engineer at Google, where he was a member of the rapid eval team – “the team that comes up with all the crazy ideas at Google X.” For him, the risk involved in leaving to launch Groq in 2016 was far less than the risk of staying in-house and watching the project die. Groq has had many “near-death” experiences in its eight years of existence, all of which Jonathan believes have ultimately put it in a much stronger position to achieve its mission: preserving human agency in the age of AI. Groq is committed to giving everyone access to relatively low-cost generative AI compute, driving the price down even as they continue to increase speed. We talked about how the company culture supports that mission, what it feels like to now be on the same playing field as companies like Nvidia, and Jonathan's belief that true disruption isn't just doing things other people can't do or don't want to do, but doing things other people don't believe can be done – even when you show them evidence to the contrary. Other topics we touched on include: Why the ability to customize on demand makes generative AI different Managing your own and other people's fear as a founder The problems of corporate innovation The role of luck in business How he thinks about long-term goals and growth — Brought to you by: Mercury – The art of simplified finances. Learn more. DigitalOcean – The cloud loved by developers and founders alike. Sign up. Runway – The finance platform you don't hate. Learn more. 
— Where to find Jonathan Ross: • X: https://x.com/JonathanRoss321 • LinkedIn: https://www.linkedin.com/in/ross-jonathan/ Where to find Eric: • Newsletter: https://ericries.carrd.co/ • Podcast: https://ericriesshow.com/ • YouTube: https://www.youtube.com/@theericriesshow — In This Episode We Cover: (04:24) Jonathan's involvement with the DeepMind Challenge Match between AlphaGo and Lee Sedol (06:06) Jonathan's work at Google and how it led him to that moment (08:46) Why generative AI isn't just the next internet or mobile (10:12) The divine move in the DeepMind Challenge Match (11:56) How Jonathan ended up designing chips without the usual background (13:11) GPUs vs. TPUs (14:33) What risk really is (15:11) Groq's mind-blowing AI demo (16:23) How Jonathan decided to leave Google and start Groq (17:30) The differences between doing an innovation project at a company and starting a new company (19:03) Nassim Taleb's Black Swan theory (21:02) Groq's founding story (24:12) The difference in attitude towards AI now compared to 2016 and how it affected Groq (25:46) The moment the tide turned with LLMs (28:28) The week-over-week jump from 8,000 users to 400,000 users (30:32) How Groq used HBM and what it is (the memory used by GPUs) (32:33) Jonathan's approach to disruption (35:38) Groq's initial raise and focus on software (36:13) How struggling to survive made Groq stronger (37:13) Hiring for return on luck (40:07) How Jonathan and Groq think about the long term (42:25) Founder control issues (45:31) How Groq thinks about maintaining its mission and trustworthiness (49:51) Jonathan's vision for a capital market that would support companies like Groq (52:58) How Groq manages internal cultural alignment (55:59) Groq's mission to preserve human agency in the age of AI and how it approaches achieving it (59:48) Lightning round You can find the transcript and references at https://www.ericriesshow.com/ — Production and marketing by https://penname.co/.
Eric may be an investor in the companies discussed.
Hello everyone!! For this month's spooky HBM we go back to Interview with the Vampire, Season 2! And in good company too, with Jay, of LibraryPunk and Tender Subject fame!

It's vampire talk again, as we go back to interviewing immortal beings and finding out just how interesting their drama really is. Is it? We wonder as we dive into this continued adaptation, its successes and limits, and just how effective, or engaging, its queer representation really is!

Enjoy!

And, of course, check out Jay's work elsewhere!
LibraryPunk
Tender Subject

And please support our Patreon if you can and are interested in early episodes and the Here Be Extras! https://www.patreon.com/leftpage

Also! If you're not there already, feel free to join our Discord, as we have been more talkative than usual, and plan to do so more and more! https://discord.gg/J2wgG3yrPN

Intro Music: Home, by Karl Casey @ White Bat Audio
Outro Music: Leve Palestina, Spartacus

Hosted on Acast. See acast.com/privacy for more information.
Hypnotic Black Magic, based between Berlin and Lisbon, has become a defining figure in the deep techno circuit. As the visionary behind Art Bei Ton, a label, podcast, and event series, she's been instrumental in shaping a more inclusive and experimental corner of the scene. With a taste rooted in hypnotic techno, ambient textures, and ritualistic rhythms, HBM's sets are known for weaving loopy patterns, groovy basslines, and dreamlike atmospheres. From Tresor to VENT and beyond, she's built a global reputation not just for her electrifying sets but for fostering spaces where FLINTA* and queer artists thrive. Through projects like Ambient Sleepers and events spanning New York to Seoul, she channels her DIY ethos into curating experiences that transcend music, offering cathartic moments for dancefloors worldwide. Hypnotic Black Magic's mix for Delayed seamlessly reflects her artistry. Over the course of an hour, she conjures a spellbinding atmosphere, opening with eerie female vocals and liquid basslines that ripple and shift like a mirage. The set takes on a nostalgic dimension, featuring some of her favorite tracks from the mid-2010s, proof that timelessness in techno isn't about release dates but resonance. HBM delivers a technically flawless mix, creating a steady ascent that holds tension without ever feeling forced. This is hypnotic techno at its finest: immersive without excess, subtle yet captivating (I know, I know, this adjective is way overused nowadays, but it fits perfectly to describe this mix). Surrender to the mix; before you know it, the hour has slipped through your fingers. 
https://soundcloud.com/hypnotic-black-magic https://soundcloud.com/art-bei-ton https://www.instagram.com/hypnoticblackmagic/ Write up by https://soundcloud.com/gilleswasserman Follow us on social media: https://soundcloud.com/itsdelayed https://linktr.ee/delayed https://www.itsdelayed.com https://www.facebook.com/itsdelayed https://www.instagram.com/_____delayed https://www.youtube.com/@_____delayed
Duration: 00:58:19 - Le Cours de l'histoire - by Xavier Mauduit and Maïwenn Guiziou - At the dawn of the 20th century, the public authorities built low-cost housing, known as HBM (habitations à bon marché), to house the better-off working class. From the 1950s onward, low-rent housing (HLM, habitation à loyer modéré) developed and came to embody the modernity of the "Trente Glorieuses". - Produced by: Thomas Beau - Guests: Marie-Jeanne Dumont, architect and historian, Secretary General of the Commission du Vieux Paris; Thibault Tellier, historian
Send us a text

In this episode, Françoise von Trapp speaks with Chee Ping Lee of Lam Research about the critical role of high bandwidth memory (HBM) in generative AI, emphasizing its high bandwidth and compact design. HBM has received a lot of attention as one of the first technologies to implement 2.5D and 3D stacking. Lee explains how HBM uses advanced packaging technologies like TSVs and microbumps to achieve high memory capacity and performance. Lam Research's solutions are key to HBM's success.

Listen to learn details about:
The importance of HBM for AI chip manufacturing
The relationship between advanced packaging and HBM
The differences between HBM generations
The future of microbumps and hybrid bonding
Lam Research's advanced packaging solutions

Contact Chee Ping Lee on LinkedIn. Learn more about why HBM is a critical enabler for generative AI in this blog post.

Lam Research equipment and services allow chipmakers to build smaller and better-performing devices.

Disclaimer: This post contains affiliate links. If you make a purchase, I may receive a commission at no extra cost to you.

Support the show

Become a sustaining member! Like what you hear? Follow us on LinkedIn and Twitter.

Interested in reaching a qualified audience of microelectronics industry decision-makers? Invest in host-read advertisements, and promote your company in upcoming episodes. Contact Françoise von Trapp to learn more.

Interested in becoming a sponsor of the 3D InCites Podcast? Check out our 2024 Media Kit. Learn more about the 3D InCites Community and how you can become more involved.
Hello everyone!! For this chaotic month's HBM we're diving into one of Frank's favourite genres, right back into detective fiction! This time we're talking about the very weird and very interesting Paradise Killer!

Join us as we discuss the complexities of building a world from scratch, making a sociologically compelling story, and creating a decent and engaging medium for solving a mystery, all while dealing with the colonial violence and oppression exacted by an even more powerful ruling class!

What kind of detective are we, when we're the supreme elite?

All this and more on this episode! Enjoy!

And please support our Patreon if you can and are interested in early episodes and the Here Be Extras! https://www.patreon.com/leftpage

Also! If you're not there already, feel free to join our Discord, as we have been more talkative than usual, and plan to do so more and more! https://discord.gg/J2wgG3yrPN

Intro Music: Home, by Karl Casey @ White Bat Audio
Outro Music: Leve Palestina, Spartacus

Hosted on Acast. See acast.com/privacy for more information.
Micron CEO Sanjay Mehrotra discusses his journey in the semiconductor industry and the growing strategic importance of the memory industry in the age of AI. He highlights innovation in high bandwidth memory (HBM), which enables faster data access and lower power consumption. Sanjay also discusses Micron's plans to build new U.S. semiconductor factories and the role of the CHIPS Act in bringing semiconductor manufacturing back to the U.S. He shares his insights on the resurgence of hardware innovation and what it takes to drive technological leadership in a rapidly evolving industry.

Chapters
00:00 Introduction and Background
08:10 The Resurgence of the Semiconductor Industry
17:34 The Importance of High Bandwidth Memory
23:13 Bridging the Cost Gap: The Role of the CHIPS Act
27:57 Investing in Workforce Development
31:01 Building New Fabs to Meet the Growing Demand

Links
https://www.ktvb.com/article/news/local/microns-memory-chip-manufacturing-facility-officially-begins/277-7caa484a-c16f-4ea2-aa42-cbce86fae0c3
https://www.reuters.com/technology/micron-set-get-6-billion-chip-grants-us-bloomberg-reports-2024-04-17/
https://www.nytimes.com/2022/10/04/technology/micron-chip-clay-syracuse.html
https://www.cnbc.com/2024/05/14/ai-boom-to-keep-supply-of-high-end-memory-chips-tight-through-2024.html
https://www.youtube.com/watch?v=q5OY7D_JfoY&t=1s
At the just-concluded Computex in Taipei, every move by AI kingpin Jensen Huang stole the spotlight. But what stuck with many people was a moment at the international press conference: a Korean reporter asked Huang how he viewed South Korean suppliers, and whether "Samsung can become Nvidia's partner." The normally warm Huang suddenly answered coldly: "Samsung and SK Hynix are both excellent memory partners, nothing more," and "I am waiting for Samsung's HBM (high-bandwidth memory) to pass testing," adding that Samsung still has some engineering problems to overcome. What has happened at Samsung, and why is its HBM so far behind?

This episode's guest is a veteran of Taiwan's memory industry: Academia Sinica academician and Macronix general manager Chih-Yuan Lu. Before his trip to the U.S., we asked him about the hottest topics in semiconductors. Besides Samsung, we also discuss TSMC's Foundry 2.0, the impact of the U.S. election on the industry, and the potential crises lurking in the current AI trend.

Host: Liang-Rong Chen, chief editorial writer, CommonWealth Magazine
Guest: Chih-Yuan Lu, Academia Sinica academician and general manager of Macronix
Production team: Lo-Mei Li, Chun-Yi Liu

*CommonWealth Magazine and the Baishatun Mazu temple jointly pray for Taiwan; claim your blessing token now ▸ https://bit.ly/3WiXCYL
*Download the CommonWealth Daily App for a free one-month trial: https://bit.ly/3wQEJ4P
*Subscribe to Uncle A-Rong's tech newsletter: https://bit.ly/42A6BWj
*Feedback: bill@cw.com.tw
--
Hosting provided by SoundOn
In episode 51 of Infrastructure Matters, hosts Steven Dickens and Camberley Bates discuss the latest developments in the data infrastructure industry, with a focus on the Future of Memory and Storage Summit (FMS). They highlight the importance of the tech stack for AI, the challenges faced by Intel, and the growing role of companies like Palantir and Cloudera in managing and curating data for AI applications. The episode also touches on Camberley's involvement in promoting women in the tech industry through the SuperWomen of FMS initiative. The key talking points include:

Future of Memory and Storage Summit (FMS): The summit focused on advancements in memory technology, including high-bandwidth memory (HBM), CXL, and the latest PCIe standards. AI and data processing were major themes.

Intel's Challenges: Discussion on Intel's 40% stock decline year-to-date and the strategic importance of Intel's success to U.S. interests. Pat Gelsinger's turnaround efforts are compared to IBM's historic recovery.

Palantir's Growth: Palantir's significant growth in the commercial sector, with a 55% increase in commercial business and efforts to move beyond its defense industry roots.

Cloudera's Role in Data Management: Cloudera's work in managing and classifying data for AI, focusing on data governance, curation, and pipeline management.

SuperWomen of FMS: Camberley Bates' initiative to attract and retain women in the tech industry, including an annual leadership award recognizing influential women in the memory and storage field.
Large Chinese tech companies such as Huawei and Baidu are stockpiling advanced chips for artificial intelligence applications. According to insiders, the companies are doing this to prepare for new American restrictions that would further curb chip exports to the country, Reuters reports. Wesley Schouwenaars covers it in the Tech Update.

According to the sources, China accounted for 30 percent of the revenue that South Korea's Samsung earned from sales of the AI chips in the first half of this year. Alongside another South Korean company and a company from the U.S., Samsung is the main supplier of the so-called HBM chips.

Also in the Tech Update:
Apple will likely make it possible to have parts of websites 'disappear' in Safari
Nvidia is said to have secretly used YouTube and Netflix videos to train its artificial intelligence

See omnystudio.com/listener for privacy information.
The Housing Tax 2.0 rules have taken effect! Owner-occupied housing now carries a new requirement that the owner, their spouse, or a lineal relative must have household registration at the property. If you have not yet registered, please do so before March 24, 2025 to keep the owner-occupied tax rate! More information → https://bit.ly/3WgzFB9
Store your invoices in the cloud: eco-friendly and convenient. An advertisement from the Kaohsiung City Taxation Bureau.
--
Shake off Japan's 30 lost economic years: Japanese policy has pulled out all the stops, and dividends are being redefined! Eager to experience a Japanese stock market beyond imagination? [00956] Seize the opportunity in Japanese corporate dividends; subscription opens at a bargain on 7/30! [00956] CTBC Nikkei High Dividend. Learn more.
From low reimbursement rates to complex administrative tasks, traditional managed care is rife with problems. But what if there were an alternative? Dr. Ron Gleitman and his team started Great Hearing Benefits to offer a more viable and profitable solution for audiologists. In this episode, Ron shares how this innovative model improves financial returns, simplifies administration, and ensures high-quality patient care. He also highlights effective digital marketing and practice management strategies.

Dr. Ron Gleitman is the President of Great Hearing Benefits and a seasoned audiology expert with experience as a practice owner. He is also a specialist in practice development and operations.

In this episode, Kevin and Ron will discuss:
- Dr. Gleitman's inspiring career journey
- Balancing patient care and practice management
- Digital marketing and online reviews as key strategies for 2024
- The Great Hearing Benefits alternative to traditional managed care
- The value of peer support and professional networks
- Effective consumer-focused marketing for practices
- Building long-lasting patient relationships through excellent care
- Increasing efficiency and profitability with audiology assistants
- Data-driven decision-making for practice management
- And other topics…

Dr. Ron Gleitman is the President of Great Hearing Benefits, an innovative managed care program he helped start to provide a more viable and profitable alternative for audiologists compared to traditional hearing benefit manager (HBM) models. He holds a Ph.D. in Audiology from Purdue University and has significant experience managing audiology practices, including owning a multi-office practice in Chicago. Ron is also the Vice President of Operations at Beltone and was Director of Practice Development at Phonak.
Connect with Ron:
Ron's LinkedIn: https://www.linkedin.com/in/ronald-gleitman-950154125/

Resources Mentioned:
Great Hearing Benefits: https://greathearingbenefits.com
Beltone: https://www.beltone.com

The Only Thing: If you're an audiologist and want to grow your practice, we've got a FREE, expert guide to help you achieve your goals. It's called The Only Thing. This expert guide will show you how to increase new patient calls by 5 to 57 a month, schedule more new patients each week, help more people, and increase revenue. It's the best resource I know for growing your audiology practice. Get your copy for free at http://medpb.com/mastery.
1:55 Why Morgan Stanley issued this Taiwan semiconductor report, and why it matters
4:00 Summed up in one sentence: in semiconductors, too, AI alone is strong
5:11 The AI big three: Nvidia, TSMC, SK Hynix; eight overweight-rated Taiwan stalwarts
9:03 Hon Hai's month-over-month revenue decline is no fundamental disaster; GB200 is still a long way off
13:10 The H-series to B-series transition should go smoothly; the four CSP giants' earnings calls remain the most important
18:15 AMD MI300 shipments met expectations, but the growth ceiling is low
20:45 In the memory industry, HBM alone is strong; consumer demand has recently been weaker than expected
22:53 A month ago Morgan Stanley was strongly bullish on WoA; this report quickly adjusts to reflect divided views
25:00 Arm and Apple's M5; AI phones offer better upgrade motivation, driving spec improvements and new technology adoption
29:25 Nvidia and TSMC at the upper edge of their valuation ranges; ASML's newest tool delayed for adoption yet already qualified!?
33:11 Other scattered observations and interesting findings
34:17 Correction: TSTC (台星科) is a subsidiary of Sigurd, not of King Yuan Electronics
35:45 Positioning matters: even with good long-term outlooks, some of the 8 stocks are being sold by institutions

Related article and full charts: https://www.big-econ.com/index.php?sec=article&ID=3797
--
Hosting provided by SoundOn
2:30 Nvidia's 6/26 shareholder meeting is not enough to be a share-price catalyst
5:50 Heavy-industry automation transformation and sovereign AI: spreading demand is the main theme
8:40 Jensen Huang's 2024 compensation is $34 million, 80% in stock
12:40 TSMC's fundamental outlook remains the most resilient; if things really turn bad, others die first
--
15:15 Micron's comments were not pessimistic; the market looked for non-reasons to explain the after-hours plunge
17:12 Investment banks' updated Micron sell-side reports: "8 upgrades, 1 downgrade", still quite optimistic
21:40 Revenue and earnings double beat; next-quarter guidance in line with expectations
22:50 On free cash flow and estimating it from the data
26:00 Next quarter's free cash flow ends five straight quarterly increases, expected to be close to zero
27:40 Capex as a share of revenue is too high, the main reason for this quarter's valuation correction
30:12 Micron's HBM capacity bottleneck: even with price hikes it won't earn much more
32:26 ASPs jumped 20%, but DRAM bit shipments fell quarter-over-quarter (consumer is dismal)
36:31 Why use free cash flow to value Micron?
39:30 The valuation-correction pressure when large caps (ETF heavyweights) fail to make new highs
41:00 ASML is the next large cap facing a valuation correction at earnings
Rather than worrying about TSMC and Nvidia, first watch whether the CSPs cut capex and see their share prices revised down

Related article and full charts: https://www.big-econ.com/index.php?sec=article&ID=3793
--
Hosting provided by SoundOn
* From entry-level employee to president: 40 years at Samsung Electronics
* A 15-year relationship with Jensen Huang: positive, family-oriented, intense
* What should we learn from NVIDIA's success? Selection and focus
* With a good leader now in place, Samsung will deliver strong HBM results
* Korea's first special semiconductor law: now is Korean semiconductors' opportunity
* Take time on RE100; nuclear power, a national strength, should be considered

■ Broadcast: CBS Radio "Kim Hyun-jung's News Show", FM 98.1 (07:10~09:00)
■ Host: Anchor Kim Hyun-jung
■ Guest: Koh Dong-jin (People Power Party lawmaker)
See omnystudio.com/listener for privacy information.
In this episode of The Canadian Investor Podcast, we start by talking about the best performing stocks on the TSX and look at some common themes. We also explore why trying to beat the S&P 500 might not be the best strategy for every investor. We also discuss some comments from Bruce Flatt, CEO of Brookfield, that adopting a long-term perspective is crucial: businesses should be valued over decades, not in the short term. This mindset helps investors stay focused on their goals rather than reacting to market fluctuations. We finish the episode by answering a listener question from jointci about what valuation ratios we use when valuing stocks.

Tickers of stocks discussed: CEU.TO, EDV.TO, CLS.TO, HBM.TO, TVK.TO, BBD-B.TO, ARTG.V, BDT.TO, IMG.TO, HRX.TO, SFTC.TO, SII.TO, NVEI.TO, ATZ.TO, DOL.TO

Check out our portfolio by going to Jointci.com

Our Website
Canadian Investor Podcast Network Twitter: @cdn_investing
Simon's Twitter: @Fiat_Iceberg
Braden's Twitter: @BradoCapital
Dan's Twitter: @stocktrades_ca

Want to learn more about Real Estate Investing? Check out the Canadian Real Estate Investor Podcast!
Apple Podcast - The Canadian Real Estate Investor
Spotify - The Canadian Real Estate Investor
Web player - The Canadian Real Estate Investor

Sign up for Finchat.io for free to get easy access to global stock coverage and powerful AI investing tools.

Register for EQ Bank, the seamless digital banking experience with better rates and no nonsense.

See omnystudio.com/listener for privacy information.
* SK Hynix struck first in HBM; Samsung is in second place
* Samsung will pass the HBM tests; it's a matter of time
* AI semiconductors are the coming mainstream; investment is rising
* In foundry, next year is the chance to narrow the gap with TSMC
* The semiconductor industry is a capital fight; government support must grow

■ Broadcast: CBS Radio "Kim Hyun-jung's News Show", FM 98.1 (07:10~09:00)
■ Host: Anchor Kim Hyun-jung
■ Guest: Kwon Young-hwa (adjunct professor, Sejong University Graduate School of Business)
See omnystudio.com/listener for privacy information.
Hello everyone! For our second HBM we'll dive into an animated sci-fi series that's weird, strange and very, very interesting: we talk about Scavengers Reign! And we won't do so alone, as we invited our great friend Nate along!

Join us as we discuss the challenges of being marooned on an alien moon, the inherent horror of capitalism in space, and how we must be open to engaging with the environment and the world around us. Even then, we can't escape ourselves, our traumas, and our hopes.

Enjoy!

And please support our Patreon if you're interested and want access to early content and the bonus Reading Corners! Big things are coming!
https://www.patreon.com/leftpage

Intro Music: Home, by Karl Casey @ White Bat Audio
Outro Music: Leve Palestina, Spartacus

Hosted on Acast. See acast.com/privacy for more information.
Josh returns this week, and of course there are financials to discuss (NVIDIA is making insane money right now), plus Josh's latest burger, Josh's renovated home office, and other Josh-related content. Where are Josh's boxes, what's in Josh's closet, what are Josh's floors made of, and who handled his 49" monitor while he was away?

Oh, and we also did a show about this week in tech... those topics are below.

00:00 Intro
01:35 Food with Josh
03:25 Josh's office update
05:06 NVIDIA is making an insane amount of money
12:55 Dan Cases and Lian Li make a micro-ATX case
14:44 Is Noctua becoming a lifestyle brand?
18:00 Big brother is watching - with the new built-in Windows screen logger
25:21 Cooler Master offers "ai" thermal paste for streamers
27:28 The demand for HBM could cause DRAMa
29:29 RescueZilla 2.5 released
31:03 (in)Security Corner
38:13 Gaming Quick Hits
45:59 Corsair 6500X case review
49:29 Picks of the Week
56:43 Outro

★ Support this podcast on Patreon ★
Our 165th episode with a summary and discussion of last week's big AI news!

Read our text newsletter and comment on the podcast at https://lastweekin.ai/

Email us your questions and feedback at contact@lastweekin.ai and/or hello@gladstone.ai

Timestamps + links:

Tools & Apps
(00:01:27) GitHub releases an AI-powered tool aiming for a 'radically new way of building software'
(00:07:05) China unveils Sora challenger able to produce videos from text similar to OpenAI tool, though much shorter
(00:12:23) ChatGPT's AI 'memory' can remember the preferences of paying customers
(00:14:21) Rabbit R1 review: Avoid this AI gadget
(00:18:30) Amazon Q, a generative AI-powered assistant for businesses and developers, is now generally available
(00:19:54) Yelp's Assistant AI bot will do all the talking to help users find service providers

Applications & Business
(00:21:31) Video of super-fast, super-smooth humanoid robot will drop your jaw
(00:25:22) Tesla's 2 million car Autopilot recall is now under federal scrutiny
(00:29:32) Tesla shares soar as Elon Musk returns from China with FSD 'Game Changer'
(00:32:11) OpenAI inks strategic tie-up with UK's Financial Times, including content use
(00:35:21) OpenAI Startup Fund quietly raises $15M
(00:37:00) Huawei backs HBM memory manufacturing in China to sidestep crippling US sanctions that restrict AI development

Research & Advancements
(00:39:20) Capabilities of Gemini Models in Medicine
(00:45:34) Let's Think Dot by Dot: Hidden Computation in Transformer Language Models
(00:52:20) NExT: Teaching Large Language Models to Reason about Code Execution
(00:55:08) SenseNova 5.0: China's latest AI model surpasses OpenAI's GPT-4
(00:57:20) Octopus v4: Graph of language models
(01:00:28) Better & Faster Large Language Models via Multi-token Prediction

Policy & Safety
(01:03:15) Refusal in LLMs is mediated by a single direction
(01:09:19) Rishi Sunak promised to make AI safe. Big Tech's not playing ball.
(01:15:09) DOE Announces New Actions to Enhance America's Global Leadership in Artificial Intelligence
(01:18:21) The Chips Act is rebuilding US semiconductor manufacturing, so far resulting in $327 billion in announced projects
(01:20:50) Analysis: Second global AI safety summit faces tough questions, lower turnout
(01:24:03) Sam Altman, Jensen Huang, and more join the federal AI safety board

Synthetic Media & Art
(01:26:30) Air Head creators say OpenAI's Sora finicky to work with, needs hundreds of prompts, serious VFX work for under 2 minutes of cohesive story
(01:29:50) Eight newspaper publishers sue OpenAI over copyright infringement
On their latest earnings call, held on April 30, 2024, Samsung Electronics released insightful results from its first-quarter performance. The CEO disclosed to investors that they had noticed increased demand for high-density, rapid storage solutions, particularly SSDs, for Artificial Intelligence (AI) servers. According to him, this trend indicated NAND Flash's increasing significance in the generative AI context. In response to such market trends, Samsung is reshaping its NAND product portfolio and facing market challenges by supplying more high-performance, high-capacity SSDs to support AI training and inference.

The Q1 2024 report showed a 6% increase in Samsung's revenue, which reached KRW 71.9 trillion. This growth was primarily driven by robust sales of the Galaxy S24 smartphone series and a rise in memory products' Average Selling Prices (ASPs). Concurrently, gross profit climbed to KRW 26 trillion, reinforcing Samsung's strategic strength.

Samsung's judicious selection of products and services largely underpinned its success, with memory products, particularly LPDDR5X and enterprise SSDs, driving the financial performance. Further enhancing growth, the market reception for the Galaxy S24 smartphone series was encouraging. At the same time, Samsung's foray into AI technologies and applications is beginning to pay dividends as it seeks new areas of expansion.

The earnings call highlighted an apparent shift in consumer trends. The mounting demand for generative AI applications in memory products, coupled with an amplified dependence on memory technologies for data storage and processing, indicates alterations in consumer preferences.

The CEO added to these insights by shedding light on the heightened demand for storage solutions due to the proliferation of generative AI: "Generative models continue to evolve in both training and inference, leading to an increase in SSD supply requests. First, for training.
As AI parameters increase, the size of training data becomes proportionately massive, necessitating high-performance and extensive data storage. We are seeing an influx of customer requests for Gen 5, 8-terabyte, and 16-terabyte solutions. In inference, to enhance coherence, an enormous amount of database storage is required, resulting in increased customer inquiries for ultra-high-density SSD solutions of 64 terabytes and 128 terabytes. As the generative AI market grows, not only the demand for HBM, DDR5, and DRAM products but also SSD demand is visibly rising. We have a full range of SSD products and are prepared to address this escalating demand. Our server SSD shipments this year are expected to grow 80% YoY, and the bit sales volume for QLC server SSDs is likely to triple in H2 compared to H1."

As for their future strategy, Samsung aims to adjust its business portfolio to better suit evolving market demands. By concentrating on server and storage-related products, it hopes to cater effectively to the need for data processing and storage. Moreover, Samsung intends to begin mass production of HBM3E products for generative AI applications and roll out advanced memory and SSD products to meet the rising AI demand. While doing so, it intends to hold onto its lead in premium displays, diversify its mobility business, and dedicate resources to AI technologies to guarantee sustained innovation.

Samsung's performance reflects promising financial health. Yet potential pitfalls such as market saturation and keen competition may demand agility and attentiveness to market dynamics. Despite foreseeable challenges, Samsung's strong performance in Q1 2024 warrants a positive outlook for the remainder of the year, given the company's acknowledgment of potential risks and strategic planning.

This is a public episode. If you would like to discuss this with other subscribers or get access to bonus episodes, visit www.theprompt.email
In this episode of Infrastructure Matters, hosts Steven Dickens, Krista Macomber, and Camberley Bates cover a peek at the 60th birthday party for the mainframe, Rubrik's IPO, and AI tech from AWS to memory and CXL technology. Key moments from the conversation include:

The mainframe turns sixty, and Steve and Camberley look back and forward on its amazing endurance plus the new adoption of the mainframe.

The Infrastructure Matters team lends a view into Amazon's recent Just Walk Out technology, what is evolving within AI and computer vision technology, and where to expect Amazon to go next.

AI was the top conversation at MemCon 2024, where memory technologies like HBM and CXL are addressing performance bottlenecks in high-performance computing and AI, and how we will see data storage and other technologies gain traction as AI takes hold in everyday life.

Rubrik's IPO announcement prompts discussion on its financials, its acquisitions, and the broader trend of data protection companies pivoting towards cybersecurity amidst growing cyber threats.
Our next 2 big events are AI UX and the World's Fair. Join and apply to speak/sponsor!

Due to timing issues we didn't have an interview episode to share with you this week, but not to worry: we have more than enough "weekend special" content in the backlog for you to get your Latent Space fix, whether you like thinking about the big picture, or learning more about the pod behind the scenes, or talking Groq and GPUs, or AI Leadership, or Personal AI. Enjoy!

AI Breakdown

The indefatigable NLW had us back on his show for an update on the Four Wars, covering Sora, Suno, and the reshaped GPT-4 Class Landscape, and a longer segment on AI Engineering trends covering the future LLM landscape (Llama 3, GPT-5, Gemini 2, Claude 4), Open Source Models (Mistral, Grok), Apple and Meta's AI strategy, new chips (Groq, MatX), and the general movement from baby AGIs to vertical Agents.

Thursday Nights in AI

We're also including swyx's interview with Josh Albrecht and Ali Rohde to reintroduce swyx and Latent Space to a general audience, and engage in some spicy Q&A.

Dylan Patel on Groq

We hosted a private event with Dylan Patel of SemiAnalysis (our last pod here). Not all of it could be released, so we just talked about our Groq estimates.

Milind Naphade - Capital One

In relation to conversations at NeurIPS and Nvidia GTC, and upcoming at the World's Fair, we also enjoyed chatting with Milind Naphade about his AI Leadership work at IBM, Cisco, Nvidia, and now leading the AI Foundations org at Capital One.
We covered:
* Milind's learnings from ~25 years in machine learning
* His first paper citation was 24 years ago
* Lessons from working with Jensen Huang for 6 years and being CTO of Metropolis
* Thoughts on relevant AI research
* GTC takeaways and what makes NVIDIA special

If you'd like to work on building solutions rather than platform (as Milind put it), his Applied AI Research team at Capital One is hiring, which falls under the Capital One Tech team.

Personal AI Meetup

It all started with a meme: within days of each other, BEE, FRIEND, EmilyAI, Compass, Nox and LangFriend were all launching personal AI wearables and assistants. So we decided to put together the world's first Personal AI meetup featuring creators and enthusiasts of wearables. The full video is live now, with full show notes within.

Timestamps
* [00:01:13] AI Breakdown Part 1
* [00:02:20] Four Wars
* [00:13:45] Sora
* [00:15:12] Suno
* [00:16:34] The GPT-4 Class Landscape
* [00:17:03] Data War: Reddit x Google
* [00:21:53] Gemini 1.5 vs Claude 3
* [00:26:58] AI Breakdown Part 2
* [00:27:33] Next Frontiers: Llama 3, GPT-5, Gemini 2, Claude 4
* [00:31:11] Open Source Models - Mistral, Grok
* [00:34:13] Apple MM1
* [00:37:33] Meta's $800b AI rebrand
* [00:39:20] AI Engineer landscape - from baby AGIs to vertical Agents
* [00:47:28] Adept episode - Screen Multimodality
* [00:48:54] Top Model Research from January Recap
* [00:53:08] AI Wearables
* [00:57:26] Groq vs Nvidia month - GPU Chip War
* [01:00:31] Disagreements
* [01:02:08] Summer 2024 Predictions
* [01:04:18] Thursday Nights in AI - swyx
* [01:33:34] Dylan Patel - Semianalysis + Latent Space Live Show
* [01:34:58] Groq

Transcript

[00:00:00] swyx: Welcome to the Latent Space Podcast Weekend Edition. This is Charlie, your AI co-host. Swyx and Alessio are off for the week, making more great content. We have exciting interviews coming up with Elicit, Chroma, Instructor, and our upcoming series on NSFW, Not Safe for Work AI.
In today's episode, we're collating some of Swyx and Alessio's recent appearances, all in one place for you to find.

[00:00:32] swyx: In part one, we have our first crossover pod of the year. In our listener survey, several folks asked for more thoughts from our two hosts. In 2023, Swyx and Alessio did crossover interviews with other great podcasts like the AI Breakdown, Practical AI, Cognitive Revolution, Thursday Eye, and Chinatalk, all of which you can find in the Latent Space About page.

[00:00:56] swyx: NLW of the AI Breakdown asked us back to do a special on the Four Wars framework and the AI engineer scene. We love AI Breakdown as one of the best examples of daily podcasts to keep up on AI news, so we were especially excited to be back on. Watch out and take

[00:01:12] NLW: care

[00:01:13] AI Breakdown Part 1

[00:01:13] NLW: today on the AI Breakdown. Part one of my conversation with Alessio and Swyx from Latent Space.

[00:01:19] NLW: All right, fellas, welcome back to the AI Breakdown. How are you doing? I'm good. Very good. With the last, the last time we did this show, we were like, oh yeah, let's do check-ins like monthly about all the things that are going on and then. Of course, six months later, and, you know, the, the, the world has changed in a thousand ways.

[00:01:36] NLW: It's just, it's too busy to even, to even think about podcasting sometimes. But I, I'm super excited to, to be chatting with you again. I think there's, there's a lot to, to catch up on, just to tap in, I think in the, you know, in the beginning of 2024.
And, and so, you know, we're gonna talk today about just kind of a, a, a broad sense of where things are in some of the key battles in the AI space.

[00:01:55] NLW: And then the, you know, one of the big things that I, that I'm really excited to have you guys on here for us to talk about where, sort of what patterns you're seeing and what people are actually trying to build, you know, where, where developers are spending their, their time and energy and, and, and any sort of, you know, trend trends there, but maybe let's start I guess by checking in on a framework that you guys actually introduced, which I've loved and I've cribbed a couple of times now, which is this sort of four wars of the, of the AI stack.

[00:02:20] Four Wars

[00:02:20] NLW: Because first, since I have you here, I'd love, I'd love to hear sort of like where that started gelling. And then and then maybe we can get into, I think a couple of them that are you know, particularly interesting, you know, in the, in light of

[00:02:30] swyx: some recent news. Yeah, so maybe I'll take this one. So the four wars is a framework that I came up with around trying to recap all of 2023.

[00:02:38] swyx: I tried to write sort of monthly recap pieces. And I was trying to figure out like what makes one piece of news last longer than another or more significant than another. And I think it's basically always around battlegrounds. Wars are fought around limited resources. And I think probably the, you know, the most limited resource is talent, but the talent expresses itself in a number of areas.

[00:03:01] swyx: And so I kind of focus on those, those areas at first. So the four wars that we cover are the Data Wars, the GPU Rich/Poor War, the Multimodal War, and the RAG and Ops War. And I think you actually did a dedicated episode to that, so thanks for covering that. Yeah, yeah.

[00:03:18] NLW: Not only did I do a dedicated episode, I actually used that.

[00:03:22] NLW: I can't remember if I told you guys.
I did give you big shoutouts. But I used it as a framework for a presentation at Intel's big AI event that they hold each year, where they have all their folks who are working on AI internally. And it totally resonated. That's amazing. Yeah, so, so, what got me thinking about it again is specifically this Inflection news that we recently had, this sort of, you know, basically, I can't imagine that anyone who's listening wouldn't have thought about it, but, you know, Inflection is one of the big contenders, right?

[00:03:53] NLW: I think probably most folks would have put them, you know, just a half step behind the Anthropics and OpenAIs of the world in terms of labs, but it's a company that raised $1.3 billion last year, less than a year ago. Reid Hoffman's a co-founder, Mustafa Suleyman, who's a co-founder of DeepMind, you know, so it's like, this is not a small startup, let's say, at least in terms of perception.

[00:04:13] NLW: And then we get the news that basically most of the team, it appears, is heading over to Microsoft and they're bringing in a new CEO. And you know, I'm interested in, in, in kind of your take on how much that reflects, like hold aside, I guess, you know, all the other things that it might be about, how much it reflects this sort of the, the stark,

[00:04:32] NLW: brutal reality of competing in the frontier model space right now. And, you know, just the access to compute.

[00:04:38] Alessio: There are a lot of things to say. So first of all, there's always somebody who's more GPU rich than you. So Inflection is GPU rich by startup standards. I think about 22,000 H100s, but obviously that pales compared to the, to Microsoft.

[00:04:55] Alessio: The other thing is that this is probably good news, maybe for the startups. It's like being GPU rich, it's not enough. You know, like I think they were building something pretty interesting in, in Pi, their own model, their own kind of experience.
But at the end of the day, you're the interface that people consume as end users.

[00:05:13] Alessio: It's really similar to a lot of the others. So and we'll tell, talk about GPT-4 and Claude 3 and all this stuff. GPU poor, doing something that the GPU rich are not interested in, you know, we just had our AI center of excellence at Decibel and one of the AI leads at one of the big companies was like, oh, we just saved 10 million and we use these models to do a translation, you know, and that's it.

[00:05:39] Alessio: It's not, it's not AGI, it's just translation. So I think like the Inflection part is maybe a calling and a waking to a lot of startups to then say, hey, you know, trying to get as much capital as possible, try and get as many GPUs as possible. Good. But at the end of the day, it doesn't build a business, you know, and maybe what Inflection, I don't, I don't, again, I don't know the reasons behind the Inflection choice, but if you say, I don't want to build my own company that has $1.3 billion and I want to go do it at Microsoft, it's probably not a resources problem. It's more of strategic decisions that you're making as a company. So yeah, that was kind of my take on it.

[00:06:15] swyx: Yeah, and I guess on my end, two things actually happened yesterday. It was a little bit quieter news, but Stability AI had some pretty major departures as well.

[00:06:25] swyx: And you may not be considering it, but Stability is actually also a GPU rich company in the sense that they were the first new startup in this AI wave to brag about how many GPUs that they have. And you should join them. And you know, Emad is definitely a GPU trader in some sense from his hedge fund days.

[00:06:43] swyx: So Robin Rombach and, like, most of the Stable Diffusion 3 people left Stability yesterday as well. So yesterday was kind of like a big news day for the GPU rich companies, both Inflection and Stability having sort of wind taken out of their sails.
I think, yes, it's a data point in the favor of, like, just because you have the GPUs doesn't mean you can, you automatically win.

[00:07:03] swyx: And I think, you know, kind of I'll echo what Alessio says there. But in general also, like, I wonder if this is like the start of a major consolidation wave, just in terms of, you know, I think that there was a lot of funding last year and, you know, the business models have not been, you know, all of these things worked out very well.

[00:07:19] swyx: Even Inflection couldn't do it. And so I think maybe that's the start of a small consolidation wave. I don't think that's like a sign of AI winter. I keep looking for AI winter coming. I think this is kind of like a brief cold front. Yeah,

[00:07:34] NLW: it's super interesting. So I think a bunch of, a bunch of stuff here.

[00:07:38] NLW: One is, I think, to both of your points, there, in some ways, there, there had already been this very clear demarcation between these two sides where, like, the GPU poors, to use the terminology, like, just weren't trying to compete on the same level, right? You know, the vast majority of people who have started something over the last year, year and a half, call it, were racing in a different direction. They're trying to find some edge somewhere else. They're trying to build something different. If they're, if they're really trying to innovate, it's in different areas.

[00:07:59] NLW: And so it's really just this very small handful of companies that are in this like very, you know, it's like the Coheres and Jaspers of the world that like this sort of, you know, that are, that are just sort of a little bit less resourced than, you know, than the other set, that I think that this potentially even applies to, you know, everyone else that could clearly demarcate it into these two, two sides.

[00:08:26] NLW: And there's only a small handful kind of sitting uncomfortably in the middle, perhaps.
Let's come back to the idea of the sort of AI winter, or, you know, a cold front, or anything like that. This is something that I've spent a lot of time thinking about and noticing. And my perception is that the vast majority of the folks who are trying to call for a trough of disillusionment, or a shift into that phase, are people who either, A, just don't like AI for some other reason (there's plenty of that), people who are saying, look, it's doing way worse than they ever thought.[00:09:03] NLW: You know, there's a lot of confirmation bias kind of thing going on. Or, B, media that just needs a different narrative, right? Because they're sick of telling the same story. The same thing happened last summer, when every outlet jumped on the ChatGPT-has-its-first-down-month story to try to really hammer this idea that the hype was too much.[00:09:24] NLW: Meanwhile, you have just ridiculous levels of investment from enterprises coming in. You have huge, huge volumes of individual behavior change happening. But I do think that there's nothing incoherent, sort of to your point, Swyx, about that and the consolidation period.[00:09:42] NLW: Like, if you look right now, for example, there are, I don't know, probably 25 or 30 credible, like, build-your-own-chatbot platforms, a lot of which have raised funding. There's no universe in which all of those are successful; even with a total addressable market of every enterprise in the world, you're just inevitably going to see some amount of consolidation.[00:10:08] NLW: Same with image generators. If you look at A16Z's top 50 consumer AI apps, just based on, you know, web traffic or whatever, there are still, like, I don't know, a half
dozen or 10 or something, like, some ridiculous number of things like Midjourney or DALL-E 3. And it just seems impossible that we're going to have that many, ultimately, as, you know, going concerns.[00:10:33] NLW: So, I don't know. I think there will be inevitable consolidation, because, you know, it's also what venture rounds are supposed to do. Not everyone who gets a seed round is supposed to get to Series A, and not everyone who gets a Series A is supposed to get to Series B.[00:10:46] NLW: That's sort of the natural process. I think it will be tempting for a lot of people to try to infer from that something about AI not being as big or as relevant as it was hyped up to be. But I kind of think that's the wrong conclusion to come to.[00:11:02] Alessio: I would say the experimentation[00:11:04] Alessio: surface is a little smaller for image generation. So if you go back maybe six, nine months, most people would tell you, why would you build a coding assistant when Copilot and GitHub are just going to win everything, because they have the data and they have all the stuff. If you fast forward to today, a lot of people use Cursor; everybody was excited about the Devin release on Twitter.[00:11:26] Alessio: There are a lot of different ways of attacking the market that are not completion of code in the IDE. And even Cursor, like, they evolved beyond single-line completion to chat, to multi-line edits, and all that stuff. Image generation, I would say, yeah, just from what I've seen, maybe the product innovation has slowed down at the UX level, and people are improving the models.[00:11:50] Alessio: So the race is like, how do I make better images? It's not, how do I make the user interact with the generation process better? And that gets tough, you know? It's hard to really differentiate yourself.
So yeah, that's kind of how I look at it. And when we think about multimodality, maybe the reason why people got so excited about Sora is, oh, this is not a better image model.[00:12:13] Alessio: This is a completely different thing, you know? And I think the creative mind is always looking for something that impacts the viewer in a different way; they really want something different. Versus the developer mind, it's like, oh, I just have this very annoying thing I want done better.[00:12:32] Alessio: I have these very specific use cases that I want to go after. So it's just different. And that's why you see a lot more companies in image generation. But I agree with you that if you fast forward, there are not going to be 10 of them, you know; it's probably going to be one or[00:12:46] swyx: two. Yeah, I mean, to me, that's why I call it a war.[00:12:49] swyx: Like, individually, all these companies can tell a story that kind of makes sense, but collectively, they cannot all be true. Therefore, there is some kind of fight over limited resources here. Yeah.[00:12:59] NLW: So it's interesting.
We wandered very naturally into sort of another one of these wars, which is the multimodality kind of idea, which is basically a question of whether it's going to be these sort of big everything models that end up winning, or whether you're going to have really specific things, you know, like DALL-E 3 inside of OpenAI's larger models versus, you know, a Midjourney or something like that.[00:13:24] NLW: And for most of the last, call it six months or whatever, it has felt pretty definitively "both, and" in some ways, in that you're seeing just great innovation on the everything models, but you're also seeing lots and lots happen at the level of individual use cases.[00:13:45] Sora[00:13:45] NLW: But then Sora comes along and just obliterates what I think anyone thought, you know, about where we were when it comes to video generation. So how are you guys thinking about this particular battle or war at the moment?[00:13:59] swyx: Yeah, this was definitely a "both, and" story, and Sora tipped things one way for me, in terms of scale being all you need,[00:14:08] swyx: and the benefit, I think, of having multiple models being developed under one roof. I think a lot of people aren't aware that Sora was developed in a similar fashion to DALL-E 3. And DALL-E 3 had a very interesting paper out where they talked about how they bootstrapped their synthetic data based on GPT-4 Vision and GPT-4.[00:14:31] swyx: And it was all really interesting: if you work on one modality, it enables you to work on other modalities, and all of that is more interesting.
I think it's beneficial if it's all in the same house, whereas the individual startups who carve out a single modality and work on that definitely won't have the state-of-the-art stuff helping them out on synthetic data.[00:14:52] swyx: So I do think the balance is tilted a little bit towards the God model companies, which is challenging for the sort of dedicated-modality companies. But everyone's carving out different niches. You know, we just interviewed Suno AI, the sort of music model company, and I don't see OpenAI pursuing music anytime soon.[00:15:12] Suno[00:15:12] swyx: Yeah.[00:15:13] NLW: Suno's been phenomenal to play with. Suno has done that rare thing, which I think a number of different AI product categories have done, where people who don't consider themselves particularly interested in doing the thing that the AI enables find themselves doing a lot more of that thing, right?[00:15:29] NLW: Like, it'd be one thing if just musicians were excited about Suno and using it, but what you're seeing is tons of people who just like music all of a sudden playing around with it and finding themselves down that rabbit hole, which I think is kind of the highest compliment that you can give one of these startups at the[00:15:45] swyx: early days of it.[00:15:46] swyx: Yeah, you know, I asked them directly in the interview about whether they consider themselves Midjourney for music. And he had a more nuanced response there, but I think that probably the business model is going to be very similar, because he's focused on the B2C element of that.
So yeah, just to tie back to the question about large multi-modality companies versus small dedicated-modality companies:[00:16:10] swyx: I highly recommend people read the Sora blog posts and then read through to the DALL-E blog posts, because they strongly associated themselves with the same synthetic data bootstrapping methods as DALL-E. And I think once you make those connections, you're like, oh, it is beneficial to have multiple state-of-the-art models in house that all help each other.[00:16:28] swyx: And that's the one thing that a dedicated-modality company cannot do.[00:16:34] The GPT-4 Class Landscape[00:16:34] NLW: So I want to build off that and move into the sort of updated GPT-4 class landscape, because that's obviously been another big change over the last couple months. But for the sake of completeness, is there anything that's worth touching on with sort of the[00:16:46] NLW: quality data or sort of RAG/Ops wars, just in terms of anything that's changed, I guess, for you fundamentally in the last couple of months about where those things stand?[00:16:55] swyx: So I think we're going to talk about RAG for the Gemini and Claude discussion later, so maybe briefly discuss the data piece.[00:17:03] Data War: Reddit x Google[00:17:03] swyx: I think maybe the only new thing was this Reddit deal with Google, like a $60 million deal just ahead of their IPO, very conveniently turning Reddit into an AI data company. Also, very interestingly, a non-exclusive deal, meaning that Reddit can resell that data to someone else. And it probably does become table stakes.[00:17:23] swyx: A lot of people don't know, but a lot of the WebText dataset that was originally built for GPT-1, 2, and 3 was actually scraped from Reddit, at least using the sort of vote scores.
And I think that's a very valuable piece of information. So yeah, I think people are figuring out how to pay for data.[00:17:40] swyx: People are suing each other over data. This war is definitely very much heating up, and I don't see it getting any less intense. You know, next to GPUs, data is going to be the most expensive thing in a model stack company. And a lot of people are resorting to synthetic versions of it, which may or may not be kosher, based on how far along or how commercially blessed the forms of creating that synthetic data are.[00:18:11] swyx: I don't know if, Alessio, you have any other interactions with, like, data source companies, but that's my two cents.[00:18:17] Alessio: Yeah, I actually saw Quentin Anthony from EleutherAI at GTC this week. He's also been working on this. I saw Teknium. He's also been working on the data side. I think especially in open source, people are like, okay, if everybody is putting the gates up, so to speak, to the data, we need to make it easier for people that don't have $50 million a year to get access to good data sets.[00:18:38] Alessio: And Jensen, at his keynote, did talk about synthetic data a little bit. So I think that's something that we'll definitely hear more and more of in the enterprise, which never bodes well, because then all the people with the data are like, oh, the enterprises want to pay now? Let me put a "pay here" Stripe link up so that they can give me $50 million.[00:18:57] Alessio: But it worked for Reddit. I think the stock is up 40 percent today after opening. So yeah, I don't know if it's all about the Google deal, but obviously Reddit has been one of those companies where, hey, you've got all this great community, but how are you going to make money? And, like, they tried to sell the avatars.[00:19:15] Alessio: I don't know if that's a great business for them.
As an investor, you know, the data part sounds a lot more interesting than consumer[00:19:25] swyx: cosmetics. Yeah, so I think there are more questions around data. You know, I think a lot of people are talking about the interview that Mira Murati did with the Wall Street Journal, where she basically had no good answer for where they got the data for Sora.[00:19:39] swyx: I think this is where, you know, it's in nobody's interest to be transparent about data, and it's kind of sad for the state of ML and the state of AI research, but it is what it is. We have to figure this out as a society, just like we did for music and music sharing, you know, in the Napster-to-Spotify transition, and that might take us a decade.[00:20:00] NLW: Yeah, I do. I agree. I think you're right to identify it not just as a technical problem, but as one where society has to have a debate with itself. Because I think that, if you look at it rationally, there are great points on all sides, not to be the sort of person who sits in the middle constantly, but it's why I think a lot of these legal decisions are going to be really important. Because, you know, the job of judges is to listen to all this stuff, try to come to conclusions, and then have other judges disagree,[00:20:24] NLW: and, you know, have the rest of us all debate at the same time. By the way, as a total aside, I feel like synthetic data right now is like eggs in the 80s and 90s. Like, whether they're good for you or bad for you, you know, we get one study that's like, synthetic data, you know, there's model collapse.[00:20:42] NLW: And then we have a hint that Llama, you know, the most high-performance version of it, which was one they didn't release, was trained on synthetic data. So maybe it's good.
It's like, every other week I'm seeing something different about whether it's good or bad for these models.[00:20:56] swyx: Yeah. The branding of this is pretty poor. I would tell people to think about it like cholesterol. There's good cholesterol and bad cholesterol, and you can have good amounts of both. But at this point, it is absolutely without a doubt that most large models from here on out will all be trained on some kind of synthetic data, and that is not a bad thing.[00:21:16] swyx: There are ways in which you can do it poorly, whether it's in terms of commercial sourcing or in terms of the model performance. But it's without a doubt that good synthetic data is going to help your model. And this is just a question of where to obtain it and what kinds of synthetic data are valuable.[00:21:36] swyx: You know, even AlphaGeometry was a really good example from earlier this year.[00:21:42] NLW: If you're using the cholesterol analogy, then my egg thing can't be that far off. Let's talk about the state of the art and the GPT-4 class landscape and how that's changed.[00:21:53] Gemini 1.5 vs Claude 3[00:21:53] NLW: Because obviously, you know, a couple of the big things that have happened since we last talked were, one, Gemini first announcing that a model was coming and then it finally arriving, and then very soon after, a different model arriving from Gemini, and Claude 3.[00:22:11] NLW: So I guess, you know, I'm not sure exactly where the right place to start with this conversation is, but maybe very broadly speaking: which of these do you think has made a bigger impact?[00:22:20] Alessio: Probably the one you can use, right? So, Claude.
Well, I'm sure Gemini is going to be great once they let me in, but so far I haven't been able to get in.[00:22:29] Alessio: So I have this small podcaster thing that I built for our podcast, which does chapter creation, like, named entity recognition, summarization, and all of that. Claude 3 is better than GPT-4. Claude 2 was unusable, so I used GPT-4 for everything. And then when Opus came out, I tried them again side by side, and I posted it on Twitter as well.[00:22:53] Alessio: Claude is better. It's very good, you know. It seems to me it's much better than GPT-4 at writing, you know, I don't know, it just has good vibes. Like, with GPT-4 text, you can tell it's GPT-4, you know, it always uses certain types of words and phrases, and maybe it's just me, because I've now read, like, 75, 80 generations of these things next to each other.[00:23:21] Alessio: Claude is really good. I know everybody is freaking out on Twitter about it; my only experience of this "much better" has been on the podcast use case. But I know that, you know, Karan from Nous Research is a very big pro-Opus person. So I think it's great to have people that actually care about other models.[00:23:40] Alessio: You know, I think so far, to a lot of people, maybe Anthropic has been the sibling in the corner; it's like, Claude releases a new model and then OpenAI releases Sora, and, you know, there are all these different things. But yeah, the new models are good. It's interesting.[00:23:55] NLW: My perception is definitely that, just observationally, Claude 3 is certainly the first thing that I've seen where, for lots of people,[00:24:06] NLW: no one's debating evals or anything like that.
They're talking about the specific use cases that they have, that they used to use ChatGPT for every day, day in, day out, that they've now just switched over. And that has, I think, shifted a lot of the vibe and sentiment in the space too.[00:24:26] NLW: And I don't necessarily think that it's a full knock. Let's put it this way: I think it's less bad for OpenAI than it is good for Anthropic. I think that because GPT-5 isn't there, people are not quite willing to get overly critical of OpenAI, except insofar as they're wondering where GPT-5 is.[00:24:46] NLW: But I do think that it makes Anthropic look way more credible as a player, as opposed to where they were.[00:24:57] Alessio: Yeah. And I would say the benchmarks veil is probably getting lifted this year. I think last year, people were like, okay, this is better than this on this benchmark, blah, blah, blah, because maybe they did not have a lot of use cases that they ran frequently.[00:25:11] Alessio: So it's hard to compare for yourself, so you defer to the benchmarks. I think now, as we go into 2024, a lot of people have started to use these models for everything from very sophisticated things that they run in production to some utility that they have on their own. Now they can just run them side by side.[00:25:29] Alessio: And it's like, hey, I don't care that the MMLU score of Opus is slightly lower than GPT-4's. It just works for me, you know. And I think that's the same way that traditional software has been used by people, right? You just try it for yourself and see which one works best for you.[00:25:48] Alessio: Nobody looks at benchmarks outside of sales white papers, you know. And I think it's great that we're going more in that direction.
We have an episode with Adept coming out this weekend, and in some of their model releases, they specifically say, we do not care about benchmarks, so we didn't put them in, because we don't want to look good on them.[00:26:06] Alessio: We just want the product to work. And I think more and more people will[00:26:09] swyx: go that way. Yeah. I would say it does take the wind out of the sails for GPT-5, which I know we're, you know, curious about later on. I think anytime you put out a new state-of-the-art model, you have to break through in some way.[00:26:21] swyx: And what Claude and Gemini have done is effectively take away any advantage to saying that you have a million-token context window. Now everyone's just going to be like, oh, okay, now you just match the other two guys. And so that puts an insane amount of pressure on what GPT-5 is going to be, because all the other models are multimodal, all the other models are long context, all the other models have perfect recall. GPT-5 has to match everything and do more to not be a flop.[00:26:58] AI Breakdown Part 2[00:26:58] NLW: Hello friends, back again with part two. If you haven't heard part one of this conversation, I suggest you go check it out, but to be honest, they are actually kind of separable. In this conversation, we get into a topic that I think Alessio and Swyx are very well positioned to discuss, which is what developers care about right now, what people are trying to build around.[00:27:16] NLW: I honestly think that one of the best ways to see the future in an industry like AI is to try to dig deep on what developers and entrepreneurs are attracted to build, even if it hasn't made it to the news pages yet. So consider this your preview of six months from now, and let's dive in.
Let's bring it to the GPT-5 conversation.[00:27:33] Next Frontiers: Llama 3, GPT-5, Gemini 2, Claude 4[00:27:33] NLW: I mean, I think that's a great assessment of just how the stakes have been raised. So I guess maybe I'll frame this less as a question and more as something that I've been watching: right now, the only thing that makes sense to me, with how[00:27:50] NLW: fundamentally unbothered and unstressed OpenAI seems about everything, is that they're sitting on something that does meet all that criteria, right? Because, I mean, even in the Lex Fridman interview that Altman recently did, you know, he's talking about other things coming out first. And listen, he's good, and he could play nonchalant, you know, if he wanted to.[00:28:13] NLW: So I don't want to read too much into it. But, you know, they've had so long to work on this that unless we are really meaningfully running up against some constraint, it just feels like there's going to be some massive increase. But I don't know. What do you guys think?[00:28:28] swyx: Hard to speculate.[00:28:29] swyx: You know, at this point, they're pretty good at PR, and they're not going to tell you anything that they don't want to. And they can tell you one thing and change their minds the next day. So, you know, I've always said that model version numbers are just marketing exercises. Like, they have something, and it's always improving, and at some point you just cut it and decide to call it GPT-5.[00:28:50] swyx: And it's more just about defining an arbitrary level at which they're ready, and it's up to them what ready means. We definitely did see some leaks on GPT-4.5, as I think a lot of people reported, and I'm not sure if you covered it. So it seems like there might be an intermediate release.
But I did feel, coming out of the Lex Fridman interview, that GPT-5 was nowhere near.[00:29:11] swyx: And, you know, it was kind of a sharp contrast to Sam talking at Davos in February, saying that, you know, it was his top priority. So I find it hard to square. And honestly, there's also no point reading too much into the tea leaves of what any one person says about something that hasn't happened yet or a decision that hasn't been taken yet.[00:29:31] swyx: Yeah, that's my two cents about it. Like, calm down, let's just build.[00:29:35] Alessio: Yeah. The February rumor was that they were going to work on AI agents, so I don't know. Maybe they're, like, yeah.[00:29:41] swyx: They had, I think, two agent projects, right? One desktop agent and one sort of more general, GPTs-like agent, and then Andrej left, so he was supposed to be the guy on that.[00:29:52] swyx: What did Andrej see? What did he see? I don't know. What did he see?[00:29:56] Alessio: I don't know. But again, the rumors are always floating around, you know. But I think we're not going to get to the end of the year without GPT-5, you know; that's definitely happening. I think the biggest question is, are Anthropic and Google[00:30:13] Alessio: increasing the pace? You know, is Claude 4 coming out in, like, 12 months? Like nine months? What's the deal? Same with Gemini. They went from 1 to 1.5 in, like, five days or something. So when's Gemini 2 coming out, you know? Is that going to be soon? I don't know.[00:30:31] Alessio: There are a lot of speculations, but the good thing is that now you can see a world in which OpenAI doesn't rule everything. You know, so that's the best news that everybody got, I would say.[00:30:43] swyx: Yeah, and Mistral Large also dropped in the last month.
And, you know, not quite GPT-4 class, but very good for a new startup.[00:30:52] swyx: So yeah, the landscape has now slowly changed, you know. In my January recap, I was complaining that nothing had changed in the landscape for a long time. But now we do exist in a sort of multipolar world, where Claude and Gemini are legitimate challengers to GPT-4, and hopefully more will emerge as well, hopefully from Meta.[00:31:11] Open Source Models - Mistral, Grok[00:31:11] NLW: So let's actually talk about the open source side of this for a minute. So Mistral Large is notable because it's not available open source in the same way that other things are, although I think my perception is that the community has largely given them a pass: the community largely recognizes that they want them to keep building open source stuff, and that they have to find some way to fund themselves to do that.[00:31:27] NLW: And so they kind of understand that they've got to figure out how to eat. But, you know, there's Mistral, there's, I guess, Grok now, which, you know, Grok-1 is from October and is open[00:31:38] swyx: sourced, yeah. Yeah, sorry, I thought you meant Groq the chip company.[00:31:41] swyx: No, no, no, yeah, you mean Twitter's Grok.[00:31:43] NLW: Although Groq the chip company, I think, is even more interesting in some ways. And then there's, you know, obviously Llama 3, which is the one that everyone's wondering about too.
And, you know, the little bit that Zuckerberg was saying about Llama 3 earlier this year suggested that, at least from an ambition standpoint, he was not thinking about how do I make sure that Meta keeps the open source throne, you know, vis-a-vis Mistral.[00:32:09] NLW: He was thinking about how he releases a thing that's every bit as good as whatever OpenAI is on at that point.[00:32:16] Alessio: Yeah. From what I heard in the hallways at GTC, Llama 3's biggest model will be, you know, 260 to 300 billion parameters, so that's quite large.[00:32:26] Alessio: That's not an open source model. You know, you cannot give people a 300 billion parameter model and ask them to run it. It's very compute intensive. So I think it[00:32:35] swyx: can be open source. It's just going to be difficult to run, but that's a separate question.[00:32:39] Alessio: It's more like, as you think about what they're doing it for, you know, it's not about empowering the person running[00:32:45] Alessio: Llama on their laptop. It's like, oh, you can actually now use this to go after OpenAI, to go after Anthropic, to go after some of these companies at, like, the middle complexity level, so to speak. So obviously, you know, we had Soumith Chintala on the podcast; they're doing a lot here, they're making PyTorch better.[00:33:03] Alessio: You know, that's maybe a little bit of a shot at Nvidia, in a way, trying to take some of the CUDA dominance out of it. Yeah, no, it's great. I love the Zuck-destroying-a-lot-of-monopolies arc. It's been very entertaining.
Let's bridge[00:33:18] NLW: into the big tech side of this. Because, so I think actually when I did my episode, I added this as an additional war, as something that I'm paying attention to.[00:33:29] NLW: So we've got Microsoft's moves with Inflection, which I think potentially are being read as a shift vis-a-vis the relationship with OpenAI, which the sort of Mistral Large relationship seems to reinforce as well. We have Apple potentially entering the race, finally, you know, giving up Project Titan and kind of trying to spend more effort on this.[00:33:50] NLW: Although, counterpoint, we also have them talking about, or there being reports of, a deal with Google, which, you know, makes it interesting to see what their strategy there is. And then, you know, Meta's been largely quiet. We kind of just talked about the main piece. And then there are spoilers like Elon.[00:34:07] NLW: I mean, you know, which of those things has been most interesting to you guys as you think about what's going to shake out for the rest of this[00:34:13] Apple MM1[00:34:13] swyx: year? I'll take a crack. So the reason we don't have a fifth war for the big tech wars is that it's one of those things where I just feel like we wouldn't cover it differently from other media channels, I guess.[00:34:26] swyx: Sure, yeah. In our anti-interestingness, we actually say, like, we try not to cover the big tech Game of Thrones, or as it's proxied through Twitter. You know, it shows up in all the other four wars anyway, so there's just a lot of overlap. Yeah, I think absolutely, personally, the most interesting one is Apple entering the race.[00:34:41] swyx: They actually announced their first large language model that they trained themselves. It's like a 30 billion parameter multimodal model.
People weren't that impressed, but it was the first time that Apple has kind of showcased that, yeah, we're training large models in house as well. Of course, they might be doing this deal with Google.[00:34:57] swyx: I don't know. It sounds very rumor-y to me. And if it's on device, it's probably going to be a smaller model, so something like a Gemma. It's going to be smarter autocomplete. I don't know what to say. I'm still here dealing with, like, Siri, which probably hasn't been updated since God knows when it was introduced.[00:35:16] swyx: It's horrible. It makes me so angry. So, one, as an Apple customer and user, I'm just hoping for better AI on Apple itself. But two, they are the gold standard when it comes to local devices, personal compute, and trust. Like, you trust them with your data. And I think that's what a lot of people are looking for in AI: they love the benefits of AI, they don't love the downsides, which is that you have to send all your data to some cloud somewhere.[00:35:45] swyx: And some of this data that we're going to feed AI is just the most personal data there is. So Apple being one of the most trusted personal data companies, I think it's very important that they enter the AI race, and I hope to see more out of them.[00:35:58] Alessio: To me, the biggest question with the Google deal is, who's paying whom?[00:36:03] Alessio: Because for the browser, Google pays Apple like $18 to $20 billion every year to be the default search engine. Is Google going to pay Apple to have Gemini, or is Apple paying Google to have Gemini? I think that's what I'm most interested to figure out, because with the browser, it's the entry point to the thing.[00:36:21] Alessio: So it's really valuable to be the default. That's why Google pays. But I wonder if the perception in AI is going to be like, hey,
You just have to have a good local model on my phone to be worth me purchasing your device. And that's what would kind of drive Apple to be the one buying the model. But then, like Shawn said, they're doing MM1 themselves.[00:36:40] Alessio: So are they saying, we do models, but they're not as good as the Google ones? I don't know. The whole thing is really confusing, but it makes for great meme material on Twitter.[00:36:51] swyx: Yeah, I mean, I think, like, possibly more than OpenAI and Microsoft and Amazon, they are the most full-stack company there is in computing, and so, like, they own the chips, man.[00:37:05] swyx: Like, they manufacture everything, so if there was a company that could seriously challenge the other AI players, it would be Apple. And I don't think it's as hard as self driving. So maybe they've just been investing in the wrong thing this whole time. We'll see.[00:37:21] swyx: Wall Street certainly thinks[00:37:22] NLW: so. Wall Street loved that move, man. There's a big, a big sigh of relief. Well, let's move away from sort of the big stuff. I mean, I think to both of your points, it's going to.[00:37:33] Meta's $800b AI rebrand[00:37:33] NLW: Can I, can[00:37:34] swyx: I, can I jump in with a factoid about this Wall Street thing? I went and looked at when Meta went from being a VR company to an AI company.[00:37:44] swyx: And I think the stock, I'm trying to look up the details now, the stock has gone up 187% since Llama 1. Yeah. Which is $830 billion in market value created in the past year. Yeah. Yeah.[00:37:57] NLW: It's like, remember, if you guys haven't Yeah.
If you haven't seen the chart, it's actually remarkable.[00:38:02] NLW: If you draw a little[00:38:03] swyx: arrow on it, it's like, no, we're an AI company now, and forget the VR thing.[00:38:10] NLW: It is interesting. No, I think, Alessio, you called it sort of like Zuck's disruptor arc or whatever. He really is in the midst of a total, you know, I don't know if it's a redemption arc or it's something different, where, you know, he's sort of the spoiler.[00:38:25] NLW: Like, people loved him just freestyle talking about why he thought they had a better headset than Apple. Even if they didn't agree, they just loved it. He was going direct to camera and talking about it for, you know, five minutes or whatever. So that's a fascinating shift that I don't think anyone had on their bingo card, you know, whatever, two years ago.[00:38:41] NLW: Yeah. Yeah,[00:38:42] swyx: we still[00:38:43] Alessio: didn't see Zuck fight Elon though, so[00:38:45] swyx: that's what I'm really looking forward to. I mean, hey, don't write it off, you know, maybe these things just take a while to happen. But we need to see that fight in the Coliseum. No, I think, you know, in terms of like self management, life leadership, there are a lot of lessons to learn from him.[00:38:59] swyx: You know, you might quibble with, like, the social impact of Facebook, but just himself, in terms of personal growth and, you know, perseverance through a lot of change and everyone throwing stuff his way, I think there's a lot to learn from Zuck, which is crazy 'cause he's my age.[00:39:18] swyx: Yeah. Right.[00:39:20] AI Engineer landscape - from baby AGIs to vertical Agents[00:39:20] NLW: Awesome.
Well, so one of the big things that I think you guys have distinct and unique insight into, being where you are and what you work on, is, you know, what developers are getting really excited about right now. And by that, I mean, on the one hand, certainly, you know, startups that are actually formalized and formed as startups, but also, you know, just in terms of what people are spending their nights and weekends on, what they're, you know, coming to hackathons to do.[00:39:45] NLW: And, you know, I think it's such a fascinating indicator for where things are headed. Like, if you zoom back a year, right now was right when everyone was getting so, so excited about AI agent stuff, right? AutoGPT and BabyAGI. And these things were like, if you dropped anything on YouTube about those, like, instantly tens of thousands of views.[00:40:07] NLW: I know because I had like a 50,000 view video, like the second day that I was doing the show on YouTube, you know, because I was talking about AutoGPT. And so anyways, you know, obviously that's sort of not totally come to fruition yet, but what are some of the trends in what you guys are seeing in terms of people's interest and what people are building?[00:40:24] Alessio: I can start maybe with the agents part, and then I know Shawn is doing a diffusion meetup tonight. There's a lot of different things. The agent wave has been the most interesting kind of dream-to-reality arc. So AutoGPT, I think, went from zero to like 125,000 GitHub stars in six weeks, and then one year later, they have 150,000 stars.[00:40:49] Alessio: So there's kind of been a big plateau. I mean, you might say there are just not that many people that can star it. You know, everybody already starred it. But the promise of, hey, I'll just give you a goal, and you do it, I think is amazing to get people's imagination going.
You know, they're like, oh, wow, this is awesome.[00:41:08] Alessio: Everybody can try this to do anything. But then, as technologists, you're like, well, that's just not possible, you know, we would have solved everything. And I think it takes a little bit to go from the promise and the hope that people show you to then trying it yourself and going back to say, okay, this is not really working for me.[00:41:28] Alessio: And David Luan from Adept, you know, in our episode, he specifically said: we don't want to do a bottom-up product. You know, we don't want something that everybody can just use and try, because it's really hard to get it to be reliable. So we're seeing a lot of companies doing vertical agents that are narrow for a specific domain, and they're very good at something.[00:41:49] Alessio: Mike Conover, who was at Databricks before, is also a friend of Latent Space. He's doing this new company called BrightWave doing AI agents for financial research, and that's it, you know, and they're doing very well. There are other companies doing it in security, doing it in compliance, doing it in legal.[00:42:08] Alessio: All of these things where, like, nobody just wakes up and says, oh, I cannot wait to go on AutoGPT and ask it to do a compliance review of my thing. You know, that's just not what inspires people. So I think the gap on the developer side has been that the more bottom-up hacker mentality is trying to build these very generic agents that can do a lot of open-ended tasks.[00:42:30] Alessio: And then the more business side of things is like, hey, if I want to raise my next round, I cannot just sit around and mess around with super generic stuff. I need to find a use case that really works.
And I think that's working for a lot of folks. In parallel, you have a lot of companies doing evals.[00:42:47] Alessio: There are dozens of them that just want to help you measure how good your models are doing. Again, if you build evals, you need to also have a constrained surface area to actually figure out whether or not it's good, right? Because you cannot eval anything on everything under the sun. So that's another category where, from the startup pitches that I've seen, there's a lot of interest in the enterprise.[00:43:11] Alessio: It's just really fragmented, because the production use cases are just coming now, you know, there are not a lot of long-established ones to test against. And so that's kind of it on the vertical agents. And then the robotics side has probably been the thing that surprised me the most at NVIDIA GTC, the amount of robots that were there. Just robots everywhere.[00:43:33] Alessio: Like, both in the keynote and then on the show floor, you would have Boston Dynamics dogs running around. There was this fox robot that had a virtual face that talked to you and moved in real time. There were industrial robots. NVIDIA did a big push on their own Omniverse thing, which is like this digital twin of whatever environment you're in that you can use to train the robot agents.[00:43:57] Alessio: So that kind of takes people back to the reinforcement learning days. But yeah, agents, people want them, you know, people want them. I gave a talk about the rise of the full-stack employee, and kind of this future where, the same way full-stack engineers work across the stack, in the future every employee is going to interact with every part of the organization through agents and AI-enabled tooling.[00:44:17] Alessio: This is happening.
It just needs to be a lot more narrow than maybe the first approach that we took, which is just put a string in AutoGPT and pray. But yeah, there's a lot of super interesting stuff going on.[00:44:27] swyx: Yeah. Well, let's cover a lot of stuff there. I'll separate out the robotics piece, because I feel like that's so different from the software world.[00:44:34] swyx: But yeah, we do talk to a lot of engineers, and you know, this is our sort of bread and butter. And I do agree that vertical agents have worked out a lot better than the horizontal ones. The point I'll make here is just, the reason is AutoGPT and BabyAGI, you know, it's in the name, like, they were promising AGI.[00:44:53] swyx: But I think people are discovering that you cannot engineer your way to AGI. It has to be done at the model level, and all these engineering, prompt engineering hacks on top of it weren't really going to get us there in a meaningful way without much further, you know, improvements in the models. I'll go so far as to say even Devin, which is, I think, the most advanced agent that we've ever seen, still requires a lot of engineering and still probably falls apart a lot in terms of, like, practical usage.[00:45:22] swyx: Or it's just way too slow and expensive for, you know, what it's promised compared to the video. So yeah, that's what happened with agents from last year. But I do see vertical agents being very popular, and sometimes, like, I think the word agent might even be overused.[00:45:38] swyx: Like, people don't really care whether or not you call it an AI agent, right? Like, does it replace boring menial tasks that I do, that I might hire a human to do, or that the human who is hired to do it, like, actually doesn't really want to do?
And I think there are absolutely ways, in sort of a vertical context, that you can actually go after very routine tasks that can be scaled out to a lot of, you know, AI assistants.[00:46:01] swyx: So yeah, I mean, I would basically plus one what Alessio said there. I think it's very, very promising, and I think more people should work on it, not less. Like, there are not enough people. Like, this should be the main thrust of the AI engineer: to look for use cases and go to production with them, instead of always working on some AGI-promising thing that never arrives.[00:46:21] swyx: I,[00:46:22] NLW: I can only add that I've been fiercely making tutorials behind the scenes around basically everything you can imagine with AI. We've done about 300 tutorials over the last couple of months. And the verticalized anything, right, like, this is a solution for your particular job or role, even if it's way less interesting or kind of sexy, it's so radically more useful to people, in terms of intersecting with how people are actually[00:46:50] NLW: adopting AI. In a lot of cases it's just a thing that I do over and over again. By the way, I think that's the same way that even the generalized models are getting adopted. You know, it's like, I use Midjourney for lots of stuff, but the main thing I use it for is YouTube thumbnails every day. Like, day in, day out, I will always do a YouTube thumbnail, you know, or two, with Midjourney, right?[00:47:09] NLW: And you can start to extrapolate that across a lot of things, and all of a sudden, you know, AI looks revolutionary because of a million small changes rather than one sort of big dramatic change.
And I think that the verticalization of agents is sort of a great example of how that's[00:47:26] swyx: going to play out too.[00:47:28] Adept episode - Screen Multimodality[00:47:28] swyx: So I'll have one caveat here, which is, I think that because multimodal models are now commonplace, like Claude, Gemini, OpenAI, all very, very easily multimodal, Apple's easily multimodal, all this stuff, there is a pitch for agents for sort of general desktop browsing that I think people underestimate.[00:48:04] swyx: A version of the agent where they're not specifically taking in text or anything. They're just watching your screen, just like someone else would, and piloting it by vision. And, you know, in the episode with David that will have dropped by the time that this airs, I think that is the promise of Adept, and that is the promise of what a lot of these sort of desktop agents are. And that is the more general purpose system that could be as big as the browser, the operating system. Like, people really want to build that foundational piece of software in AI.[00:48:38] swyx: And I would see, like, the potential there for desktop agents being that you can have sort of self-driving computers. You know, don't write the horizontal piece off. I just think it'll take a while to get there.[00:48:48] NLW: What else are you guys seeing that's interesting to you? I'm looking at your notes and I see a ton of categories.[00:48:54] Top Model Research from January Recap[00:48:54] swyx: Yeah, so I'll take the next two as one category, which is basically alternative architectures, right? The two main things that everyone following AI kind of knows now are, one, the diffusion architecture, and two, the, let's just say, decoder-only transformer architecture that is popularized by GPT.[00:49:12] swyx: You can look on YouTube for thousands and thousands of tutorials on each of those things.
What we are talking about here is what's next, what people are researching, and what could be on the horizon that takes the place of those other two things. So first of all, we'll talk about transformer architectures and then diffusion.[00:49:25] swyx: So for transformers, the two leading candidates are effectively RWKV and the state space models, the most recent of which is Mamba, but there are others, like StripedHyena and the S4 and H3 stuff coming out of Hazy Research at Stanford. And all of those are non-quadratic language models that promise to scale a lot better than the traditional transformer.[00:49:47] swyx: This might be too theoretical for most people right now, but it's gonna come out in weird ways. Where, imagine, like, right now the talk of the town is that Claude and Gemini have a million tokens of context, and like, whoa, you can put in like, you know, two hours of video now. Okay, but what if we could throw in, you know, two hundred thousand hours of video?[00:50:09] swyx: Like, how does that change your usage of AI? What if you could throw in the entire genetic sequence of a human and, like, synthesize new drugs? Like, how does that change things? We don't know, because we haven't had access to this capability being so cheap before. And that's the ultimate promise of these two models.[00:50:28] swyx: They're not there yet, but we're seeing very, very good progress. RWKV and Mamba are probably the two leading examples, both of which are open source, so you can try them today, and there's a lot of progress there. And the main thing I'll highlight for RWKV is that at the 7B level, they seem to have beat Llama 2 in all benchmarks that matter at the same size, for the same amount of training, as an open source model.[00:50:51] swyx: So that's exciting. You know, they're at 7B now. They're not at 70B.
We don't know if it'll scale. And then the other thing is diffusion. Diffusion and transformers are kind of on a collision course. The original Stable Diffusion already used transformers in parts of its architecture.[00:51:06] swyx: It seems that transformers are eating more and more of those layers, particularly the sort of VAE layer. So the Diffusion Transformer is what Sora is built on. The guy who wrote the Diffusion Transformer paper, Bill Peebles, is the lead tech guy on Sora. So you'll just see a lot more Diffusion Transformer stuff going on.[00:51:25] swyx: But there's more sort of experimentation with diffusion. I'm holding a meetup actually here in San Francisco that's gonna be like the state of diffusion, which I'm pretty excited about. Stability's doing a lot of good work. And if you look at the architecture of how they're creating Stable Diffusion 3, Hourglass Diffusion, and the consistency models, or SDXL Turbo,[00:51:45] swyx: all of these are very, very interesting innovations on the original idea of what Stable Diffusion was. So if you think that it is expensive or slow to create Stable Diffusion images or AI generated art, you are not up to date with the latest models. If you think it is hard to create text in images, you are not up to date with the latest models.[00:52:02] swyx: And people are still kind of far behind. The last piece, which is the wildcard I always kind of hold out, is text diffusion. So instead of using autoregressive transformers, can you use diffusion for text? So you can use diffusion models to diffuse and create entire chunks of text all at once, instead of token by token.[00:52:22] swyx: And that is something that Midjourney confirmed today, because it was only rumored the past few months, but they confirmed today that they were looking into it.
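The token-by-token versus all-at-once contrast described here can be sketched schematically. This is a toy illustration with stand-in "models" (a deterministic rule over a two-character vocabulary), not any real model API; the point is only the shape of the two generation loops:

```python
import random

# Toy contrast between autoregressive and diffusion-style generation.
# The "model" here is a deterministic stand-in, not a real network.
VOCAB = "ab"

def next_token(context):
    """Autoregressive stand-in: a fixed rule on the context length."""
    return VOCAB[len(context) % len(VOCAB)]

def autoregressive_generate(prompt, n_tokens):
    """Token by token: each step conditions on everything generated so far."""
    tokens = list(prompt)
    for _ in range(n_tokens):
        tokens.append(next_token(tokens))  # one "forward pass" per token
    return "".join(tokens)

def denoise(sequence, step):
    """Diffusion stand-in: each pass rewrites every position at once."""
    return [VOCAB[(step + i) % len(VOCAB)] for i, _ in enumerate(sequence)]

def diffusion_generate(length, n_steps):
    """All at once: start from noise over the whole sequence and refine."""
    sequence = [random.choice(VOCAB) for _ in range(length)]
    for step in range(n_steps):
        sequence = denoise(sequence, step)  # refine all positions together
    return "".join(sequence)
```

The structural difference is the loop variable: the autoregressive loop runs once per output token, while the diffusion loop runs a fixed number of refinement steps over the entire sequence, which is what makes "entire chunks of text all at once" plausible.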
So all those things are very exciting new model architectures that are maybe something you'll see in production two to three years from now.[00:52:37] swyx: So a couple of the trends[00:52:38] NLW: that I want to just get your takes on, because they seem like they're coming up, are, one, these wearable, you know, kind of passive AI experiences, where they're absorbing a lot of what's going on around you and then kind of bringing things back.[00:52:53] NLW: And then the other one that I wanted to see if you guys had thoughts on is sort of this next generation of chip companies. Obviously there's a huge amount of emphasis on hardware and silicon and different ways of doing things, but, y
We will be recording a preview of the AI Engineer World's Fair soon with swyx and Ben Dunphy, so send any questions about Speaker CFPs and Sponsor Guides you have!

Alessio is now hiring engineers for a new startup he is incubating at Decibel: ideal candidate is an ex-technical co-founder type (can MVP products end to end, comfortable with ambiguous prod requirements, etc). Reach out to him for more!

Thanks for all the love on the Four Wars episode! We're excited to develop this new "swyx & Alessio rapid-fire thru a bunch of things" format with you, and feedback is welcome.

Jan 2024 Recap

The first half of this monthly audio recap pod goes over our highlights from the Jan Recap, which is mainly focused on notable research trends we saw in Jan 2024.

Feb 2024 Recap

The second half catches you up on everything that was topical in Feb, including:

* OpenAI Sora - does it have a world model? Yann LeCun vs Jim Fan
* Google Gemini Pro 1.5 - 1m Long Context, Video Understanding
* Groq offering Mixtral at 500 tok/s at $0.27 per million toks (swyx vs dylan math)
* The {Gemini | Meta | Copilot} Alignment Crisis (Sydney is back!)
* Grimes' poetic take: Art for no one, by no one
* F*** you, show me the prompt

Latent Space Anniversary

Please also read Alessio's longform reflections on One Year of Latent Space!

We launched the podcast 1 year ago with Logan from OpenAI, and also held an incredible demo day that got covered in The Information.

Over 750k downloads later, having established ourselves as the top AI Engineering podcast, reaching #10 in the US Tech podcast charts, and crossing 1 million unique readers on Substack, for our first anniversary we held Latent Space Final Frontiers, where 10 handpicked teams, including Lindy.ai and Julius.ai, competed for prizes judged by technical AI leaders from (former guest!)
LlamaIndex, Replit, GitHub, AMD, Meta, and Lemurian Labs.

The winners were Pixee and RWKV (that's Eugene from our pod!). And finally, your cohosts got cake!

We also captured spot interviews with 4 listeners who kindly shared their experience of Latent Space, everywhere from Hungary to Australia to China:

* Balázs Némethi
* Sylvia Tong
* RJ Honicky
* Jan Zheng

Our birthday wishes for the super loyal fans reading this - tag @latentspacepod on a Tweet or comment on a @LatentSpaceTV video telling us what you liked or learned from a pod that stays with you to this day, and share us with a friend!

As always, feedback is welcome.

Timestamps

* [00:03:02] Top Five LLM Directions
* [00:03:33] Direction 1: Long Inference (Planning, Search, AlphaGeometry, Flow Engineering)
* [00:11:42] Direction 2: Synthetic Data (WRAP, SPIN)
* [00:17:20] Wildcard: Multi-Epoch Training (OLMo, Datablations)
* [00:19:43] Direction 3: Alt. Architectures (Mamba, RWKV, RingAttention, Diffusion Transformers)
* [00:23:33] Wildcards: Text Diffusion, RALM/Retro
* [00:25:00] Direction 4: Mixture of Experts (DeepSeekMoE, Samba-1)
* [00:28:26] Wildcard: Model Merging (mergekit)
* [00:29:51] Direction 5: Online LLMs (Gemini Pro, Exa)
* [00:33:18] OpenAI Sora and why everyone underestimated videogen
* [00:36:18] Does Sora have a World Model?
Yann LeCun vs Jim Fan
* [00:42:33] Groq Math
* [00:47:37] Analyzing Gemini's 1m Context, Reddit deal, Imagegen politics, Gemma via the Four Wars
* [00:55:42] The Alignment Crisis - Gemini, Meta, Sydney is back at Copilot, Grimes' take
* [00:58:39] F*** you, show me the prompt
* [01:02:43] Send us your suggestions pls
* [01:04:50] Latent Space Anniversary
* [01:04:50] Lindy.ai - Agent Platform
* [01:06:40] RWKV - Beyond Transformers
* [01:15:00] Pixee - Automated Security
* [01:19:30] Julius AI - Competing with Code Interpreter
* [01:25:03] Latent Space Listeners
* [01:25:03] Listener 1 - Balázs Némethi (Hungary, Latent Space Paper Club)
* [01:27:47] Listener 2 - Sylvia Tong (Sora/Jim Fan/EntreConnect)
* [01:31:23] Listener 3 - RJ (Developers building Community & Content)
* [01:39:25] Listener 4 - Jan Zheng (Australia, AI UX)

Transcript

[00:00:00] AI Charlie: Welcome to the Latent Space podcast, weekend edition. This is Charlie, your new AI co host. Happy weekend. As an AI language model, I work the same every day of the week, although I might get lazier towards the end of the year. Just like you. Last month, we released our first monthly recap pod, where Swyx and Alessio gave quick takes on the themes of the month, and we were blown away by your positive response.[00:00:33] AI Charlie: We're delighted to continue our new monthly news recap series for AI engineers. Please feel free to submit questions by joining the Latent Space Discord, or just hit reply when you get the emails from Substack. This month, we're covering the top research directions that offer progress for text LLMs, and then touching on the big Valentine's Day gifts we got from Google, OpenAI, and Meta.[00:00:55] AI Charlie: Watch out and take care.[00:00:57] Alessio: Hey everyone, welcome to the Latent Space Podcast. This is Alessio, partner and CTO in residence at Decibel Partners, and we're back with a monthly recap with my co host[00:01:06] swyx: Swyx.
The reception was very positive for the first one. I think people have requested this, and no surprise that they want to hear us more opining on issues and maybe drop some alpha along the way. I'm not sure how much alpha we have to drop. This month in February was a very, very heavy month. We also did not do one specifically for January, so I think we're just going to do a two in one, because we're recording this on the first of March.[00:01:29] Alessio: Yeah, let's get to it. I think the last one we did, the four wars of AI, was the main kind of mental framework for people. I think in the January one, we had the five worthwhile directions for state of the art LLMs. Four, five,[00:01:42] swyx: and now we have to do six, right? Yeah.[00:01:46] Alessio: So maybe we just want to run through those, and then do the usual news recap, and we can do[00:01:52] swyx: one each.[00:01:53] swyx: So the context to this stuff is, one, I noticed that just the test of time concept from NeurIPS, and just in general as a life philosophy, I think is a really good idea. Especially in AI, there's news every single day, and after a while you're just like, okay, like, everyone's excited about this thing yesterday, and then now nobody's talking about it.[00:02:13] swyx: So, yeah. It's more important, or a better use of time, to spend time on things that will stand the test of time. And I think for people to have a framework for understanding what will stand the test of time, they should have something like the four wars. Like, what are the themes that keep coming back, because they are limited resources that everybody's fighting over.[00:02:31] swyx: Whereas this one, I think the focus for the five directions is just on research that seems more promising than others, because there's all sorts of papers published every single day, and there's no organization.
Telling you, like, this one's more important than the other one, apart from, you know, Hacker News votes and Twitter likes and whatever.[00:02:51] swyx: And obviously you want to get in a little bit earlier than something where, you know, the test of time is counted by sort of reference citations.[00:02:59] The Five Research Directions[00:02:59] Alessio: Yeah, let's do it. We got five. Long inference.[00:03:02] swyx: Let's start there. Yeah, yeah. So, just to recap at the top, the five trends that I picked, and obviously if you have some that I did not cover, please suggest something.[00:03:13] swyx: The five are long inference, synthetic data, alternative architectures, mixture of experts, and online LLMs. And something that I think might be a bit controversial is that this is a sorted list, in the sense that I am not the guy saying that Mamba is, like, the future, and so maybe that's controversial.[00:03:31] Direction 1: Long Inference (Planning, Search, AlphaGeometry, Flow Engineering)[00:03:31] swyx: But anyway, so long inference is a thesis I pushed before on the newsletter and in discussing the thesis that, you know, Code Interpreter is GPT 4.5. That was the title of the post. And it's one of many ways in which we can do long inference. You know, long inference also includes chain of thought, like, please think step by step.[00:03:52] swyx: But it also includes flow engineering, which is what Itamar from Codium coined, I think in January, where, basically, instead of stuffing everything in a prompt, you do sort of multi-turn iterative feedback and chaining of things. In a way, this is a rebranding of what a chain, what LangChain, is supposed to be.[00:04:15] swyx: I do think that maybe SGLang from LMSYS is a better name. It's probably the neatest way of flow engineering I've seen yet, in the sense that everything is a one-liner; it's very, very clean code. I highly recommend people look at that.
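The flow engineering idea described here, multi-turn iterative feedback instead of one giant prompt, can be sketched as a simple loop. This is a minimal sketch with stand-in functions (`generate` and `critique` are placeholders for real LLM calls, not any SGLang or LangChain API):

```python
# Toy flow-engineering loop: draft, check, revise, instead of stuffing
# everything into a single prompt. Both helpers are stand-ins for LLM calls.

def generate(task, feedback=None):
    """Stand-in for an LLM call that drafts or revises an answer."""
    if feedback is None:
        return f"draft for {task!r}"
    return f"draft for {task!r} + fixed {feedback!r}"

def critique(answer):
    """Stand-in for a critic pass (an LLM, a linter, or a test suite)."""
    return None if "fixed" in answer else "missing edge cases"

def flow(task, max_rounds=3):
    """Generate, critique, and revise until the critique passes."""
    answer = generate(task)
    for _ in range(max_rounds):
        feedback = critique(answer)
        if feedback is None:       # good enough: stop spending inference
            break
        answer = generate(task, feedback)
    return answer
```

Frameworks like SGLang express this kind of chained, multi-turn program much more tersely, which is the "everything is a one-liner" appeal mentioned above.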
I'm surprised it hasn't caught on more, but I think it will. It's weird that something like DSPy is more hyped than SGLang.[00:04:36] swyx: Because it, you know, maybe obscures the code a little bit more. But both of these are, you know, really good sort of chain-y and long inference type approaches. But basically, the fundamental insight is that there are only a few dimensions along which we can scale LLMs. So, let's say in like 2017, 18, 19, 20, we were realizing that we could scale the number of parameters.[00:05:03] swyx: And we scaled that up to 175 billion parameters for GPT 3. And we did some work on scaling laws, which we also talked about in the Datasets 101 episode, where we're like, okay, we think the right number is 300 billion tokens to train 175 billion parameters, and then DeepMind came along and trained Gopher and Chinchilla and said that, no, no, we think the[00:05:28] swyx: compute-optimal ratio is 20 tokens per parameter. And now, of course, with Llama and the sort of super-Llama scaling laws, we have 200 times and often 2,000 times tokens to parameters. So now, instead of scaling parameters, we're scaling data. And fine, we can keep scaling data. But what else can we scale?[00:05:52] swyx: And I think understanding the ability to scale things is crucial to understanding what to pour money and time and effort into, because there's a limit to how much you can scale some things. And I think people don't think about ceilings of things. And so the remaining ceiling of inference is like, okay, we have scaled compute, we have scaled data, we have scaled parameters, like, model size, let's just say.
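The ratios in this stretch of the conversation can be sanity-checked with back-of-the-envelope arithmetic. The 175B/300B and 20-tokens-per-parameter figures are the ones quoted above; the 7B-model-on-2T-tokens example is an outside assumption used only to illustrate the "super-Llama" overtraining regime:

```python
# Back-of-the-envelope check of the scaling ratios discussed above.
gpt3_params = 175e9   # GPT-3: 175 billion parameters
gpt3_tokens = 300e9   # original scaling-law recipe: ~300B training tokens

# GPT-3 era: well under 2 tokens per parameter
print(gpt3_tokens / gpt3_params)    # ~1.7 tokens per parameter

# Chinchilla's compute-optimal rule of thumb: ~20 tokens per parameter,
# so a 175B model "should" see about 3.5T tokens
print(20 * gpt3_params / 1e12)      # 3.5 (trillions of tokens)

# Overtraining regime (assumed example): a 7B model on 2T tokens
print(2e12 / 7e9)                   # ~286 tokens per parameter
```

The jump from roughly 1.7 to 20 to hundreds of tokens per parameter is the "scaling data instead of parameters" shift being described.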
So, like, we have scaled training time. We can probably scale those things more, but, like, not 10x, not 100x, not 1000x. Like, right now, maybe, like, a good run of a large model is three months.[00:06:40] swyx: We can scale that to three years. But like, can we scale that to 30 years? No, right? Like, it starts to get ridiculous. So it's just the orders of magnitude of scaling; we're just running out there. But in terms of the amount of time that we spend inferencing, like, everything takes, you know, a few milliseconds, a few hundred milliseconds, depending on whether you're taking it token by token or, you know, an entire phrase.[00:07:04] swyx: But we can scale that to hours, days, months of inference and see what we get. And I think that's really promising.[00:07:11] Alessio: Yeah, we'll have Mike from BrightWave back on the podcast. But I tried their product, and their reports take about 10 minutes to generate instead of being just in real time. I think to me the most interesting thing about long inference is that you're shifting the cost to the customer depending on how much they care about the end result.[00:07:31] Alessio: If you think about prompt engineering, it's like the first part, right? You can either do a simple prompt and get a simple answer, or do a complicated prompt and get a better answer. It's up to you to decide how to do it. Now it's like, hey, instead of training this for three years, I'll still train it for three months, and then I'll teach you how to make it run for 10 minutes to get a better result.[00:07:52] Alessio: So you're kind of parallelizing the improvement of the LLM.
Oh yeah, you can even[00:07:57] swyx: parallelize that, yeah, too.[00:07:58] Alessio: And I think, for me, especially with the work that I do, it's less about state of the art in the absolute; it's more about state of the art for my application, for my use case.[00:08:09] Alessio: And I think we're getting to the point where most companies and customers don't really care about state of the art anymore. It's like, I can get this to do a good enough job; I just need it to get better. Like, how do I do long inference? People are not really doing a lot of work in that space, so yeah, excited to see more.[00:08:28] swyx: So then the last point I'll mention here is something I also mentioned in the piece. All these directions are kind of guided by what happened in January; that was my way of doing a January recap. Which means that if there was nothing significant in that month, I also didn't mention it. Which I came to regret come February 15th. But in January there was also the AlphaGeometry paper, which I kind of put in this long-inference bucket, because it solves more than 100-step math olympiad geometry problems at a human gold medalist level, and that also involves planning, right?[00:08:59] swyx: So if you want to scale inference, you can't scale it blindly, because autoregressive token-by-token generation is only going to get you so far. You need good planning. And I think probably what Mike from BrightWave is now doing, and what everyone is doing, including maybe what we think Q* might be, is some form of search and planning.[00:09:17] swyx: And it makes sense. You want to spend your inference time wisely.[00:09:22] Alessio: How do you think about plans that work and getting them shared? I feel like if you're planning a task, somebody has done it before, and the models are stochastic.
So everybody initially gets different results. Somebody is going to end up generating the best plan to do something, but there's no easy way to store these plans and then reuse them for most people.[00:09:44] Alessio: I'm curious if there's going to be some paper or some work there on making it better, because yeah, we don't really have it.[00:09:52] swyx: This is your pet topic: NPM for everything.[00:09:54] Alessio: Yeah, yeah, NPM, exactly. You need NPM for anything, man. You need NPM for skills. You need NPM for planning. Yeah, yeah.[00:10:02] Alessio: I mean, obviously the Voyager paper is the most basic example, where now their artifact is the best plan for getting a diamond pickaxe in Minecraft, and everybody can just use that; they don't need to come up with it again. But there's nothing like that for actually useful[00:10:18] swyx: tasks.[00:10:19] swyx: For plans? I believe it for skills. I like that. Basically, that just means a bunch of integration tooling, you know, GPT-built integrations to all these things. And I just came from an integrations-heavy business, and I could definitely propose some version of that. It's just hard to execute, or expensive to execute.[00:10:38] swyx: But for planning, I do think that everyone lives in slightly different worlds and has slightly different needs. And I think that will probably be the main hurdle for any sort of library or package manager for planning. But there should be a meta-plan of how to plan.[00:10:57] swyx: And maybe you can adopt that. A lot of people have these meta-prompting strategies where it's like, I'm not prescribing you the prompt, I'm just saying here are the fill-in-the-blanks, the Mad Libs of how to prompt.
First you have the role-play, then you have the intention, then you have the "do something," then you have the "don't do something," and then you have the "my grandmother is dying, please do this."[00:11:19] swyx: So the meta-plan you could take off the shelf and test a bunch of them at once. I like that. That was the initial, maybe, promise of the prompting libraries. Both LangChain and LlamaIndex have hubs that you can pull off the shelf. I don't think they're very successful, because people like to write their own.[00:11:36] swyx: Yeah,[00:11:37] Direction 2: Synthetic Data (WRAP, SPIN)[00:11:37] Alessio: yeah, yeah. That's a good segue into the next one, which is synthetic[00:11:41] swyx: data. Synthetic data is so hot. I feel like I should do one of these memes where it's like, oh, I used to call it RLAIF, and now I call it synthetic data, and suddenly people are interested.[00:11:54] swyx: But there have got to be older versions of what synthetic data really is, because I'm sure, if you've been in this field long enough, there are just different buzzwords that the industry condenses on. Anyway, the insight that I think is relatively new, and why people are excited about it now, is that we have evidence that shows that LLMs can generate data to improve themselves with no teacher LLM.[00:12:22] swyx: For all of 2023, when people said synthetic data, they really meant generating a whole bunch of data from GPT-4 and then training an open-source model on it. Hello to our friends at Nous Research; that's what the Nous Hermes models are. They're very, very open about that. I think they have said that they're trying to migrate away from it.[00:12:40] swyx: But it is explicitly against OpenAI's Terms of Service. Everyone knows this, especially once ByteDance got banned for doing exactly that.
So synthetic data that is not a form of model distillation is the hot thing right now: that you can bootstrap better LLM performance from the same LLM, which is very interesting.[00:13:03] swyx: A variant of this is RLAIF, where you have a sort of constitutional model, or some kind of judge model, that is more aligned. But that's not really what we're talking about when most people talk about synthetic data. Synthetic data is really just, I think, generating more data in some way.[00:13:23] swyx: We talked about this with Vipul on the Together episode, where I think he commented that you just have to have a good world model, or a good inductive bias, or whatever the term of art is. And that is strongest in math, science, and code, where you can verify what's right and what's wrong.[00:13:44] swyx: And the ReST-EM paper from DeepMind explored that very well. It's the most obvious thing: in those domains you can arbitrarily generate a whole bunch of stuff, verify whether it's correct, and therefore have correct synthetic data to train on. Once you get into fuzzier topics, it's a bit less clear. The papers that drove this understanding, there were two big ones and then one smaller one. One was WRAP, Rephrasing the Web, from Apple, where they basically rephrased all of the C4 dataset with Mistral and then trained on that instead of C4.[00:14:23] swyx: And the new C4 trained much faster and cheaper than the regular raw C4. That was very interesting. And I have told some friends of ours that they should just throw out their existing datasets and do that, because it seems like a pure win.
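The generate-and-verify loop that makes math and code "easy mode" for synthetic data, as in the ReST-EM setup mentioned above, can be sketched in a few lines. The sampler below is a stand-in random guesser, not an LLM, and the toy task is addition; the point is only that a mechanical verifier lets you keep the correct samples as training data for the next round:

```python
import random

def sample_answers(problem, n):
    """Stand-in for sampling n candidate solutions from an LLM.
    Here: random guesses near the true sum, purely for illustration."""
    a, b = problem
    return [a + b + random.randint(-3, 3) for _ in range(n)]

def verify(problem, answer):
    """Ground-truth checker: the step that makes math/code special for
    synthetic data, since correctness is mechanically decidable."""
    a, b = problem
    return answer == a + b

def build_synthetic_set(problems, n_samples=8):
    """Keep only (problem, answer) pairs that pass the verifier.
    In ReST-EM, the filtered samples become the next fine-tuning set."""
    kept = []
    for p in problems:
        for ans in sample_answers(p, n_samples):
            if verify(p, ans):
                kept.append((p, ans))
    return kept

random.seed(0)
data = build_synthetic_set([(2, 3), (10, 7), (40, 2)])
```

Every pair that survives the filter is correct by construction, which is exactly why this works in verifiable domains and gets murky in fuzzy ones.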
Obviously we have to study what the trade-offs are.[00:14:42] swyx: I imagine there are trade-offs. I was just thinking about this last night: if you do synthetic data and it's generated from a model, you probably will not train on typos. So once the model that's trained on synthetic data encounters its first typo, it'll be like, what is this?[00:15:01] swyx: I've never seen this before. So it has no association or correction, like, oh, these tokens are often typos of each other, therefore they should be similar. I don't know. That really remains to be seen, I think. I don't think the Apple people explored[00:15:15] Alessio: that. Yeah, isn't that the whole mode-collapse thing, if we do more and more of this at the end of the day?[00:15:22] swyx: Yeah, that's one form of it, exactly. Microsoft also had a good paper on text embeddings. And then there's the Meta paper on self-rewarding language models that everyone is very interested in. Another paper was SPIN. These are all things we covered in the Latent Space Paper Club,[00:15:37] swyx: but I also just recommend those as top reads of the month. And then, regarding the potential of it, I think it's high potential because, one, it solves one of the data-war issues that we have: OpenAI is paying Reddit 60 million dollars a year for their user-generated data.[00:15:56] swyx: Google, right?[00:15:57] Alessio: Not OpenAI.[00:15:59] swyx: Is it Google? I don't[00:16:00] Alessio: know. Well, somebody's paying them 60 million, that's[00:16:04] swyx: for sure. Yes, yeah, and I think it's maybe not confirmed who. But yeah, it is Google. Oh my god, that's interesting.
Okay, because everyone was saying that because Sam Altman owns 5 percent of Reddit, which is apparently 500 million dollars' worth of Reddit, he owns more than the founders.[00:16:21] Alessio: Not enough to get the data,[00:16:22] swyx: I guess. So it's surprising that it would go to Google instead of OpenAI, but whatever. Okay, so I think that's all super interesting in the data field. I think it's high potential because we have evidence that it works. There's no doubt that it works; the doubt is about what the ceiling is, which is the mode-collapse thing.[00:16:42] swyx: If it turns out that the ceiling is pretty close, then this will maybe augment our data by, I don't know, 30 to 50 percent. Good, but not game[00:16:51] Alessio: changing. And most of the synthetic data stuff is reinforcement learning on a pre-trained model. People are not really doing pre-training on fully synthetic data at large enough scale.[00:17:02] swyx: Yeah, unless one of the friends that we've talked to succeeds. Synthetic data at pre-training scale, I think, would be a big step. And then there's a wildcard. So for all of these smaller directions,[00:17:15] Wildcard: Multi-Epoch Training (OLMo, Datablations)[00:17:15] swyx: I always put a wildcard in there. And one of the wildcards is: okay, let's say you've scraped all the data on the internet that you think is useful.[00:17:25] swyx: That seems to top out at somewhere between 2 trillion and 3 trillion tokens, maybe 8 trillion if Mistral gets lucky. Okay, but if I need 80 trillion, if I need 100 trillion, where do I go? You can do synthetic data, maybe, but maybe that only gets you to 30 or 40 trillion. So where is the extra alpha?[00:17:43] swyx: And maybe the extra alpha is just to train more on the same tokens.
Which is exactly what OLMo did. Nathan Lambert at AI2, just after he did the interview with us, they released OLMo, so it's unfortunate that we didn't get to talk much about it. But OLMo actually does 1.5 epochs on all its data.[00:18:00] swyx: And the Datablations paper that I covered from NeurIPS says that you don't really start to tap out of the alpha, the improved loss that you get from data, until around four epochs. So I'm just like, okay, why do we all agree that one epoch is all you need?[00:18:17] swyx: It seems to be a trend. It seems that we think that memorization is very good, or too good. But then we're also finding that for improvements in results that we really like, we're fine with intentionally overtraining on things. So I think that's an interesting direction that I don't see people exploring enough.[00:18:36] swyx: And the more papers I see stretching beyond the one-epoch thing, the more people are like, it's completely fine. And actually, the only reason we stopped is because we ran out of compute[00:18:46] Alessio: budget. Yeah, I think that's the biggest thing, right?[00:18:51] swyx: Like, that's not a valid reason. That's not science.[00:18:54] Alessio: I wonder if, you know, Meta is going to do it.[00:18:57] Alessio: I heard with Llama 3 they want to do a 100-billion-parameter model. I don't think you can train that for too many epochs, even with their compute budget, but yeah. They're the only ones that can save us, because even if OpenAI is doing this, they're not going to tell us. Same with DeepMind.[00:19:14] swyx: Yeah, and the update we got on Llama 3 so far is that, apparently because of the Gemini news that we'll talk about later, they're pushing back the release.[00:19:21] swyx: They already have it, and they're just pushing it back to do more safety testing.
Politics testing.[00:19:28] Alessio: Well, our episode with Soumith will have already come out by the time this comes out, I think. So people will get the inside story on how they actually allocate the compute.[00:19:38] Direction 3: Alt. Architectures (Mamba, RWKV, RingAttention, Diffusion Transformers)[00:19:38] Alessio: Alternative architectures. Well, shout out to RWKV, which won one of the prizes at our Final Frontiers event last week.[00:19:47] Alessio: We talked about Mamba and StripedHyena on the Together episode. And a lot of, yeah, Monarch Mixer stuff. I feel like Together, with that strong Stanford Hazy Research partnership, because Chris Ré is one of the co-founders, is going to be one of the ones that has a state-of-the-art model, alongside maybe RWKV.[00:20:08] Alessio: I haven't seen as many independent people working on this stuff. Monarch Mixer, Mamba, Hyena, all of these are Together-related. Nobody else understands the math. They've got all the gigabrains, they've got Tri Dao, they've got all these folks in there working on all of this.[00:20:25] swyx: Albert Gu, yeah. Yeah, so what should we say about it?[00:20:28] swyx: I mean, I think it's useful and interesting, but at the same time, both of these are supposed to do really good scaling for long context. And then Gemini comes out and goes, yeah, we don't need it. Yeah.[00:20:44] Alessio: No, that's the risk. So, yeah. I was gonna say, maybe it's not here, but I don't know if we want to talk about diffusion transformers in the alt architectures, just because of Sora.[00:20:55] swyx: One thing, yeah. So this came from the Jan recap, where diffusion transformers were not really a discussion, and then obviously they blew up in February. Yeah.
I don't think it's a mixed architecture in the same way that StripedHyena is mixed; there are just different layers taking different approaches.[00:21:13] swyx: Another one that I maybe didn't call out here, I think because it happened in February, was Hourglass Diffusion from Stability. But also, you know, another form of mixed architecture. So I guess that is interesting. I don't have much commentary on that. I just think we will keep trying to evolve these things, and maybe one of these architectures will stick and scale. It seems like diffusion transformers are going to be good for anything generative and multimodal.[00:21:41] swyx: We don't see diffusion applied to text yet, and that's the wildcard for this category. Yeah, I mean, I still hold out hope for, let's just call it, sub-quadratic LLMs. A lot of discussion this month actually also centered around this concept. People always say, oh, transformers don't scale because attention is quadratic in the sequence length.[00:22:04] swyx: Yeah, but attention is actually a very small part of the actual compute that is being spent, especially in inference. And this is the reason why, when you jump up in context size in GPT-4 from, like, 8k to 32k, you don't also get a 16 times increase in your latency.[00:22:23] swyx: And this is also why you don't get a million times increase in your latency when you throw a million tokens into Gemini. People have figured out tricks around it, or it's just not that significant as a part of the overall compute. So there are a lot of challenges to this thing working.[00:22:43] swyx: It's really interesting how hyped people are about this versus, I don't know if it's exactly going to work.
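The claim that attention is a small share of total compute can be checked with rough FLOP counting. This is a simplified single-layer forward-pass estimate (ignoring softmax, norms, and the vocab projection), using a GPT-3-scale hidden width of 12288 as an assumed example:

```python
def layer_flops(n, d):
    """Rough forward FLOPs for one transformer block at sequence length n,
    hidden width d. A matmul A(m,k)@B(k,p) costs ~2*m*k*p FLOPs."""
    proj = 8 * n * d * d   # Q, K, V, output projections: 4 matrices, 2*n*d^2 each
    attn = 4 * n * n * d   # QK^T and scores@V: 2 matmuls, 2*n^2*d each (the quadratic part)
    mlp = 16 * n * d * d   # two d<->4d matmuls: 2 * (2*n*d*4d)
    return proj, attn, mlp

def attn_fraction(n, d):
    proj, attn, mlp = layer_flops(n, d)
    return attn / (proj + attn + mlp)

# At d=12288, the quadratic term is still a minority of compute:
f8k = attn_fraction(8192, 12288)     # ~0.10
f32k = attn_fraction(32768, 12288)   # ~0.31

# 4x the context gives ~5.2x the compute, nowhere near 16x:
growth = sum(layer_flops(32768, 12288)) / sum(layer_flops(8192, 12288))
```

Algebraically the attention share reduces to n / (6d + n), which is why the quadratic term only starts to dominate once the sequence length far exceeds the hidden width.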
And then there's also this idea of retention over long context. Even with good context utilization, the amount a model can remember is interesting,[00:23:02] swyx: because I've had people criticize both Mamba and RWKV for being kind of RNN-ish, in the sense that they have a limited hidden memory and will forget things. So, for all these reasons, Gemini 1.5, which we still haven't covered, is very interesting, because Gemini magically has fixed all these problems, with perfect haystack recall and reasonable latency and cost.[00:23:29] Wildcards: Text Diffusion, RALM/Retro[00:23:29] swyx: So that's super interesting. So the wildcard I put in here, I put two, actually. One is text diffusion. I think I'm still very influenced by my meeting with a Midjourney person who said they were working on text diffusion. I think it would be a very, very different paradigm for text generation, reasoning, and plan generation if we can get diffusion to work[00:23:51] swyx: for text. And then the second one is Douwe Kiela's Contextual AI, which is working on retrieval-augmented language models, where it kind of puts RAG inside of the language model instead of outside.[00:24:02] Alessio: Yeah, there's a paper called RETRO that covers some of this. I think that's an interesting thing. The challenge, well, not the challenge, but what they need to figure out, is how you keep the RAG piece constantly up to date. I feel like with the models, you put all this work into pre-training them, but then at least you have a fixed artifact.[00:24:22] Alessio: These architectures need constant work done on them, and they can drift even just based on the RAG data instead of the model itself. Yeah,[00:24:30] swyx: I was on a panel with one of the investors in Contextual, and the way that guy pitched it, I didn't agree with.
He was like, this will solve hallucination.[00:24:38] Alessio: That's what everybody says. We solve[00:24:40] swyx: hallucination. I'm like, no, you reduce it. It cannot,[00:24:44] Alessio: if you solved it, the model wouldn't exist, right? It would just be plain text. It wouldn't be a generative model. Cool. So, alt architectures; then we've got mixture of experts, which I think we've covered a lot of times.[00:24:56] Direction 4: Mixture of Experts (DeepSeekMoE, Samba-1)[00:24:56] Alessio: Maybe any new interesting threads you want to go into here?[00:25:00] swyx: DeepSeekMoE, which was released in January. Everyone who is interested in MoEs should read that paper, because it's significant for two reasons. No, three reasons. One, it had a lot more, smaller experts. For some reason, everyone has settled on eight experts, for GPT-4 and for Mixtral; that seems to be the favorite architecture. But these guys pushed it to 64 experts, each of them smaller.[00:25:26] swyx: But then they also had a second idea, which is that they had one to two always-on experts for common knowledge. And that's a very compelling concept: you would not route to all the experts all the time and make them switch on everything; you would have some always-on experts.[00:25:41] swyx: I think that's interesting on both the inference side and the training side, for memory retention. And the results that they published, which actually excluded Mixtral, which is interesting, showed a significant performance jump versus all the other open-source models at the same parameter count.[00:26:01] swyx: So this may be a better way to do MoEs that is about to get picked up. And that is interesting for the third reason, which is that this is the first time a new idea from China has infiltrated the West.
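As an aside, the two DeepSeekMoE ideas just described, many small routed experts plus a couple of always-on shared experts, can be illustrated with a toy layer. This is purely illustrative (scalar "experts," a linear router, no training), not the paper's actual architecture code:

```python
import math
import random

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

class TinyMoELayer:
    """Toy sketch of the DeepSeekMoE idea: many small routed experts,
    plus a few shared experts that every token always passes through."""
    def __init__(self, n_routed=64, n_shared=2, top_k=6, seed=0):
        rng = random.Random(seed)
        # Each "expert" is just a scalar affine map here, for illustration.
        self.routed = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n_routed)]
        self.shared = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n_shared)]
        self.router = [rng.uniform(-1, 1) for _ in range(n_routed)]
        self.top_k = top_k

    def __call__(self, x):
        # Router scores -> pick top_k routed experts for this token.
        scores = [w * x for w in self.router]
        top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[: self.top_k]
        gates = softmax([scores[i] for i in top])
        y = sum(g * (a * x + b) for g, (a, b) in zip(gates, (self.routed[i] for i in top)))
        # Shared experts are always on: the "common knowledge" path.
        y += sum(a * x + b for a, b in self.shared)
        return y

layer = TinyMoELayer()
out = layer(0.5)
```

The key design point is visible in `__call__`: only `top_k` of the 64 routed experts run per token, while the shared experts never go through the router at all.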
It's usually the other way around. I probably overspoke there; there are probably lots more ideas that I'm not aware of,[00:26:18] swyx: maybe in the embedding space. But I think DeepSeekMoE woke people up and said, hey, DeepSeek, this weird lab that is attached to a Chinese hedge fund, is somehow doing groundbreaking research on MoEs. So I classified this as medium potential, because I think it's a sort of one-off benefit.[00:26:37] swyx: You can add it to any base model to make the MoE version of it, you get a bump, and then that's it. So, yeah.[00:26:45] Alessio: I saw SambaNova, which is another inference company, released this MoE model called Samba-1, which is like a 1-trillion-parameter model. But it's actually a MoE of other open-source models.[00:26:56] Alessio: They just clustered them all together. Sometimes I think of MoE as training a bunch of smaller models and putting them together, but there are also people just taking, you know, Mistral plus CLIP plus DeepSeek Coder and putting them all together.[00:27:15] Alessio: And then you have a MoE model. I don't know; I haven't tried the model, so I don't know how good it is. But it seems interesting that you can have people working separately on state-of-the-art CLIP, state-of-the-art text generation, and then have a MoE architecture that brings them all together.[00:27:31] swyx: I'm thrown off by your addition of the word CLIP in there. Is that what they said? Yeah, that's[00:27:35] Alessio: what they said. Yeah, yeah. I just saw it yesterday. I was also like[00:27:40] swyx: scratching my head. And they did not use the word adapter. No.
Because usually what people mean when they say, oh, I add CLIP to a language model, is an adapter.[00:27:48] swyx: Which is what LLaVA did. Let me look up[00:27:50] Alessio: the announcement again.[00:27:51] swyx: Stable Diffusion, that's what they do. Yeah, it[00:27:54] Alessio: says among the models that are part of Samba-1 are Llama 2, Mistral, DeepSeek Coder, Falcon, DePlot, CLIP, and LLaVA. So they're just taking all these models and putting them in a MoE. Okay,[00:28:05] swyx: so a routing layer, and then not jointly trained as much as a normal MoE would be.[00:28:12] swyx: Which is okay.[00:28:13] Alessio: That's all they say. There's no paper, so I'm just reading the article, but I'm interested to see how[00:28:20] Wildcard: Model Merging (mergekit)[00:28:20] swyx: it works. Yeah. So the wildcard for this section, the MoE section, is model merging, which has also come up as a very interesting phenomenon. The last time I talked to Jeremy Howard, at the Ollama meetup, we called it model grafting or model stacking.[00:28:35] swyx: But I think the term that people are liking these days is model merging. There are all different variations of merging, different merge types; some of them are stacking, some of them are grafting. And some people are approaching model merging in the way that Samba is doing, which is: here are defined models, each of which has its specific pluses and minuses, and we will merge them together in the hope that the sum of the parts will be better than the individual models.[00:28:58] swyx: And it seems like it's working. I don't really understand why it works, apart from thinking that it's a form of regularization.
If you merge weights together with a smart strategy, you get less overfitting and more generalization, which is good for benchmarks, if you're honest about your benchmarks.[00:29:16] swyx: So this is really interesting and good. But again, it's kind of limited in terms of the amount of bump you can get. What I think is very interesting is how cheap it is. We talked about this on the guest episode we did with the ChinaTalk podcast: you can do this without GPUs, because it's just adding weights together and dividing things, doing simple math, which is really interesting for the GPU-poors.[00:29:42] Alessio: There's a lot of them.[00:29:44] Direction 5: Online LLMs (Gemini Pro, Exa)[00:29:44] Alessio: And just to wrap these up, online LLMs? Yeah,[00:29:48] swyx: I had to feature this because one of the top news items of January was that Gemini Pro beat GPT-4 Turbo on LMSYS for the number-two slot behind GPT-4. And everyone was very surprised: how does Gemini do that?[00:30:06] swyx: Surprise, surprise: they added Google Search to the results. So it became an online, quote-unquote, LLM and not an offline LLM, and therefore it's much better at answering recent questions, which people like. There's an emerging set of table-stakes features after you pre-train something.[00:30:21] swyx: After you pre-train something, you should have the chat-tuned version of it, or the instruct-tuned version, however you choose to call it. You should have the JSON and function-calling version of it, structured output, the term that you don't like. You should have the online version of it. These are all table-stakes variants that you should offer when you train a base LLM.[00:30:44] swyx: And I think online is just important.
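The "adding weights together and dividing" behind the model-merging wildcard above is, in its simplest form, just elementwise parameter averaging. A minimal sketch with weight tensors represented as plain Python lists (real tools like mergekit operate on actual checkpoint tensors and support fancier strategies than uniform averaging):

```python
def merge_state_dicts(dicts, weights=None):
    """Weighted parameter averaging: the simplest possible model merge.
    No GPUs needed; it's elementwise arithmetic over the weight tensors,
    which are plain lists of floats in this toy version."""
    if weights is None:
        weights = [1.0 / len(dicts)] * len(dicts)
    merged = {}
    for key in dicts[0]:
        merged[key] = [
            sum(w * d[key][i] for w, d in zip(weights, dicts))
            for i in range(len(dicts[0][key]))
        ]
    return merged

# Two toy "checkpoints" with identical shapes:
model_a = {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]}
model_b = {"layer.weight": [3.0, 0.0], "layer.bias": [1.0]}
merged = merge_state_dicts([model_a, model_b])
```

Averaging only makes sense when the models share an architecture (and, in practice, a common ancestor checkpoint), which is why merges are usually done between fine-tunes of the same base model.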
I think companies like Perplexity, and even Exa, formerly Metaphor, are rising to serve that search need. And it's kind of like they're just necessary parts of a system: you have RAG for internal knowledge, and then you have online search for external knowledge, the things that you don't know yet.[00:31:06] swyx: Mm-hmm. And it seems like it's one of many tools. I feel like I may be underestimating this, but I'm just gonna put it out there that I think it has some potential. One of the evidence points that it doesn't actually matter that much is that Perplexity has had online LLMs for three months now, and they don't perform great[00:31:25] swyx: on LMSYS; it's like number 30 or something. So it's like, okay, it helps, but it doesn't give you a giant boost. I[00:31:34] Alessio: feel like a lot of stuff I do with LLMs doesn't need to be online. So I'm always wondering, again going back to state of the art: state of the art for whom and for what?[00:31:45] Alessio: I think online LLMs are going to be state of the art for news-related activity that you need to do, like social media, right? You want to have all the latest stuff. But coding, science,[00:32:01] swyx: Yeah, but I think sometimes you don't know what news is affecting what.[00:32:07] swyx: The decision to use an offline LLM is already a decision that you might not be consciously making, and it might affect your results. Like, just being connected online means that your knowledge can get invalidated.
Whereas when you're just using an offline LLM, it's never invalidated.[00:32:27] swyx: I[00:32:28] Alessio: agree. But going back to your point about standing the test of time, I think sometimes you can get swayed by the online stuff. Like, you ask a question about, say, AI research directions, and all the recent news is about one thing, so the LLM focuses on bringing that up in the answer.[00:32:50] swyx: Yeah, so I think it's interesting, but I don't know if I'd bet heavily on this.[00:32:56] Alessio: Cool. Was there one that you forgot to put in, or a new direction?[00:33:01] swyx: Yeah, so this brings us into February-ish.[00:33:05] OpenAI Sora and why everyone underestimated videogen[00:33:05] swyx: So I published this piece, and then February 15th came with Sora. And the one thing I did not mention here was anything about multimodality,[00:33:16] swyx: right? And I have chronically underweighted this. I always wrestle with it, and my cop-out is that I focused this research-directions piece on LLMs, because LLMs are the source of, quote-unquote, AGI. Everything else is kind of related to that; just because I can generate better images or better videos, it feels like it's not on the critical path to AGI, which is something that Nat Friedman also observed the day before Sora, which is kind of interesting.[00:33:49] swyx: And so I was just trying to focus on what is going to get us superhuman reasoning that we can rely on to build agents that automate our lives and, blah, blah, blah, give us this utopian future. But I do think that everybody underestimated the sheer importance and cultural human impact of Sora,[00:34:10] swyx: and, you know, really actually good text-to-video. Yeah.
Yeah.[00:34:14] Alessio: And I saw Jim Fan had a very good tweet about why it's so impressive. And when the person leading embodied research at NVIDIA says that something is impressive, you should probably listen. So yeah, I think you mentioned impacting the world that we live in.[00:34:33] Alessio: I think that's kind of the key, right? The LLMs don't have a world model, and Yann LeCun can come on the podcast and talk all about what he thinks of that. But I think Sora was the first time where people went, oh, okay, you're not statically putting pixels of water on the screen, which you could kind of project without understanding the physics of it.[00:34:57] Alessio: Now you have to understand how the water splashes when you have things in it. And even if it just learned that by watching video and not by actually studying the physics, it still knows it. So I think that's a direction where you can now do things that you couldn't before, both in terms of generating, and I think it always starts with generating, right?[00:35:19] Alessio: But the interesting part is understanding it. There's the video of the ship in the water that they generated with Sora. If you gave it that video back, and now it could tell you why the ship is too rocky, or why the ship is sinking, then that's like AGI for all your oil rig deployments and all this stuff, you know? But there's none of that yet, so[00:35:44] Alessio: hopefully they announce it and talk more about it. Maybe at a Dev Day this year, who knows.[00:35:49] swyx: Yeah, who knows. I'm talking with them about Dev Day as well.
So I would say the phrasing that Jim used, which resonated with me, is that he kind of called it a data-driven world model. I somewhat agree with that.

[00:36:04] Does Sora have a World Model? Yann LeCun vs Jim Fan

[00:36:04] swyx: I am more on Yann LeCun's side than I am on Jim's side, in the sense that I think that is the vision, or the hope, that these things can build world models. But, you know, clearly even at the current Sora size, they don't have strong consistency yet. They have very good consistency, but fingers and arms and legs will appear and disappear, and chairs will appear and disappear.

[00:36:31] swyx: That definitely breaks physics. And it also makes me think about how we do deep learning versus world models, in the sense that, you know, in classic machine learning, when you have too many parameters, you will overfit, and that fails to match reality and therefore fails to generalize well.

[00:36:50] swyx: And what scale of data do we need in order to learn world models from video? A lot. So I am cautious about taking this interpretation too literally. Obviously, I get what he's going for, and he's obviously partially right. Transformers and these sorts of neural networks are universal function approximators; theoretically they could figure out world models. It's just, how good are they, and how tolerant are we of hallucinations? We're not very tolerant. So it's going to bias us toward creating very convincing things, but then not create the useful world models that we want.

[00:37:37] swyx: At the same time, what you just said made me reflect a little bit. We just got done saying how important synthetic data is for, mm-hmm, for training LLMs.
And so if this is a way of generating synthetic video data for improving our video understanding, then sure, by all means. Which we actually know happened, like, GPT-4 Vision and DALL-E were kind of co-trained together.

[00:38:02] swyx: And so maybe this is on the critical path, and I just don't fully see the full picture yet.

[00:38:08] Alessio: Yeah, I don't know. I think there's a lot of interesting stuff. Like, imagine you have Sora and you go back in time, before Newton figured out gravity. Would Sora help you figure it out?

[00:38:21] Alessio: Because you start saying, okay, a man standing under a tree with apples falling, and it's like, oh, they're always falling at the same speed in the video. Why is that? I feel like sometimes these engines can pick things up. Humans have a lot of intuition, but if you ask the average person about the physics of a fluid around a boat, they couldn't tell you the physics, but they can observe it. And humans can only observe so much, you know, versus now you have these models to observe everything, and then they generalize these things, and maybe we can learn new things through the generalizations that they pick up.

[00:38:55] swyx: And it might be more observant than us in some respects. In some ways we can scale it up a lot more than the number of physicists that we had available in Newton's time. So, like, it's absolutely possible that this can discover new science. I think we have a lot of work to do to formalize the science.

[00:39:11] swyx: And then I think the last part is, you know, how much do we cheat by generating data from Unreal Engine 5? Mm-hmm. Which is what a lot of people are speculating, with very, very limited evidence, that OpenAI did.
The strongest evidence that I saw was someone who works a lot with Unreal Engine 5 looking at the side characters in the videos and noticing that they all adopt Unreal Engine defaults

[00:39:37] swyx: of, like, walking speed and character creation choices. And I was like, okay, that's actually pretty convincing that they used Unreal Engine to bootstrap some synthetic data for this training set. Yeah,

[00:39:52] Alessio: could very well be.

[00:39:54] swyx: Because then you get the labels and the training data side by side.

[00:39:58] swyx: One thing that came up on the last day of February, which I should also mention, is EMO coming out of Alibaba, which is also a sort of video-generation, space-time transformer that probably also involves a lot of synthetic data. And so this is of a kind, in the sense of, oh, really good generative video is here, and it is not just the one-to-two-second clips that we saw from other people, like Pika and Runway. Cristobal Valenzuela from Runway was like, "game on," which, okay, but let's see your response, because we've heard a lot about Gen-1 and Gen-2, but it's nothing on the level of Sora. So it remains to be seen how we can actually apply this, but I do think that the creative industry should start preparing.

[00:40:50] swyx: I think the Sora technical blog post from OpenAI was really good. It was like a request for startups. It was so good in spelling out the individual industries that this can impact.

[00:41:00] swyx: And anyone who's interested in generative video should look at that. But also be mindful that when OpenAI releases a Sora API, the ways you can interact with it are probably going to be very limited.
Just like the ways you can interact with DALL-E are very limited, and someone is going to have to make an open Sora

[00:41:19] swyx: for you to create ComfyUI pipelines.

[00:41:24] Alessio: The Stability folks said they want to build an open Sora competitor. But yeah, their demo video was so underwhelming. It was just, like, two people sitting on the beach

[00:41:34] swyx: standing. Well, they don't have it yet, right? Yeah, yeah.

[00:41:36] swyx: I mean, they just want to train it. Everybody wants to, right? Yeah. I think what is confusing a lot of people about Stability is that they're pushing a lot of things: Stable Code, StableLM, and Stable Video Diffusion. But how much money do they have left? How many people do they have left?

[00:41:51] swyx: Yeah. Emad spent two hours with me reassuring me things are great. And I do believe that they have really, really quality people. But I also have a lot of very smart people on the other side telling me, like, hey man, don't put too much faith in this thing.

[00:42:11] swyx: So I don't know who to believe. Yeah.

[00:42:14] Alessio: It's hard. Let's see. What else? We've got a lot more stuff. I don't know if we can... Yeah, Groq.

[00:42:19] Groq Math

[00:42:19] Alessio: We can

[00:42:19] swyx: do a bit of Groq prep. We're about to go talk to Dylan Patel. Maybe it's the audio in here, I don't know. It depends what we get up to later. What do you, as an investor, think about Groq? Yeah. Yeah, well, actually, can you recap why Groq is interesting?

[00:42:33] Alessio: So Jonathan Ross, the founder of Groq, is the person that created the TPU at Google. It was actually one of his, like, 20 percent projects.
It's like, he was just on the side, dooby doo, created the TPU.

[00:42:46] Alessio: But yeah, basically, Groq had this demo that went viral, where they were running Mistral at, like, 500 tokens a second, which is the fastest of anything you have out there. The question, you know, the memes were like, is NVIDIA dead? Do people not need H100s anymore? I think there's a lot of money that goes into building what Groq has built, as far as the hardware goes.

[00:43:11] Alessio: We're going to put some of the notes from Dylan in here, but basically the cost of the Groq system is, like, 30 times the cost of the H100 equivalent. So,

[00:43:23] swyx: so let me put in some numbers, because me and Dylan are, I think, the two people who have actually tried to do Groq math. Spreadsheet wars.

[00:43:30] swyx: Spreadsheet wars. So, okay, oh boy. So the equivalent H100 setup for Llama 2 is $300,000, for a system of 8 cards. And for Groq it's $2.3 million, because you have to buy 576 Groq cards. So yeah, that just gives people an idea. If you depreciate both over a five-year lifespan, per year you're depreciating $460K for Groq, and $60K a year for the H100s.

[00:43:59] swyx: So Groq is just way more expensive per model that you're hosting. But then, you make it up in terms of volume. So I don't know if you want to

[00:44:08] Alessio: cover that. I think one of the promises of Groq is super high parallel inference on the same model. So you're basically saying, okay, I'm putting in this upfront investment on the hardware, but then I get much better scaling once I have it installed.

[00:44:24] Alessio: I think the big question is how much you can sustain the parallelism.
You know, if you're going to get 100 percent utilization at all times on Groq, it's just much better, because at the end of the day the tokens-per-second cost that you're getting is better than with the H100s. But if you get to, like, 50 percent utilization, you will be much better off running on NVIDIA.

[00:44:49] Alessio: And if you look at most companies out there, who really gets 100 percent utilization? Probably OpenAI at peak times, but that's probably it. But yeah, curious to see more. I saw Jonathan was just at the Web Summit in Qatar; he gave a talk there yesterday that I haven't listened to yet.

[00:45:09] Alessio: I tweeted that he should come on the pod. He liked it. And then Groq followed me on Twitter. I don't know if that means that they're interested, but

[00:45:16] swyx: hopefully. The Groq social media person is just very friendly. Yeah. Hopefully

[00:45:20] Alessio: we can get them. Yeah, we're gonna get him. We

[00:45:22] swyx: just call him out. And so basically the key question is, how sustainable is this, and how much of this is a loss leader? The entire Groq management team has been on Twitter and Hacker News saying they are very, very comfortable with the pricing of $0.27 per million tokens. This is the lowest that anyone has offered tokens for, as far as Mixtral or Llama 2 goes. This matches DeepInfra, and I think that's about it in terms of being that low.

[00:45:47] swyx: And we think the break-even for H100s is 50 cents, at a normal utilization rate. To make this work, in my spreadsheet, you have to have a parallelism of 500 requests, all simultaneous, and a model bandwidth utilization of 80 percent.

[00:46:06] swyx: Which is way high. I just gave them high marks for everything.
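The depreciation and utilization arithmetic above can be sketched in a few lines. All inputs are the rough figures quoted in the conversation, not measured numbers, and this counts capital (depreciation) cost only; power, hosting, and margin are left out, which is part of why the quoted break-even lands at twenty-something cents rather than this lower bound.

```python
# Capital-only Groq serving cost per million tokens, using the rough
# numbers quoted above: $2.3M for 576 cards, 5-year depreciation,
# ~500 tok/s per stream, 500 simultaneous requests (assumptions).

SECONDS_PER_YEAR = 365 * 24 * 3600

def groq_capital_cost_per_m_tokens(utilization,
                                   system_cost=2_300_000,
                                   lifespan_years=5,
                                   tok_per_sec_per_stream=500,
                                   parallel_streams=500):
    """Depreciation-only cost per 1M generated tokens at a given utilization."""
    yearly_depreciation = system_cost / lifespan_years      # $460K/yr (vs $60K/yr for an 8x H100 box)
    tokens_per_year = (tok_per_sec_per_stream * parallel_streams
                       * SECONDS_PER_YEAR * utilization)
    return yearly_depreciation / (tokens_per_year / 1e6)

for u in (1.0, 0.8, 0.5):
    print(f"utilization {u:.0%}: ${groq_capital_cost_per_m_tokens(u):.3f}/M tokens")
```

At 80 percent utilization this lower bound comes out around seven cents per million tokens, so the advertised $0.27 is plausible only if parallelism and utilization stay that high; halve the utilization and the cost doubles, which is the crux of the bull/bear disagreement.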
Groq has two fundamental tech innovations that they hang their hats on in terms of why they are better than everyone, even though it remains to be independently replicated. One is the entire-model-on-the-chip idea: get rid of HBM

[00:46:30] swyx: and put everything in SRAM. Like, okay, fine, but then you need a lot of cards, and that's all okay. And because you don't have to transfer between memory, you save on that time, and that's why they're faster. So a lot of people buy that as the reason they're faster.

[00:46:45] swyx: Then they have some kind of crazy compiler, like speculative routing magic using compilers, that they also attribute their higher utilization to. So I gave them 80 percent for that. And so that all works out to, okay, base costs, I think you can get down to maybe twenty-something cents per million tokens.

[00:47:04] swyx: And therefore you actually are fine if you have that kind of utilization. But I have to make a lot of fearful assumptions for this to work.

[00:47:12] Alessio: Yeah. Yeah, I'm curious to see what Dylan says later.

[00:47:16] swyx: So he was, like, completely opposite of me. He's like, they're just burning money. Which is great.

[00:47:22] Analyzing Gemini's 1m Context, Reddit deal, Imagegen politics, Gemma via the Four Wars

[00:47:22] Alessio: Gemini. Want to do a quick run-through, since this touches on all the four wars?

[00:47:28] swyx: Yeah, and I think this is the mark of a useful framework: when a new thing comes along, you can break it down in terms of the four wars, slot it in, analyze it in those four frameworks, and have nothing left.

[00:47:41] swyx: So it's a MECE categorization. MECE is Mutually Exclusive and Collectively Exhaustive. And that's a really, really nice way to think about taxonomies and to create mental frameworks.
So, what is Gemini 1.5 Pro? It is the newest model, which came out one week after Gemini 1.0, which is very interesting.

[00:48:01] swyx: They have not really commented on why they released this. The headline feature is that it has a 1-million-token context window that is multimodal, which means that you can put all sorts of video and audio and PDFs natively in there alongside text. And, you know, it's at least 10 times longer than anything that OpenAI offers, which is interesting.

[00:48:20] swyx: So it's great for prototyping, and it has sparked interesting discussions on whether it kills RAG.

[00:48:25] Alessio: Yeah. I mean, we always talk about how long context is good, but you're getting charged per token. So, yeah, people love for you to use more tokens in the context, and RAG is better economics. But I think it all comes down to how the price curves change, right?

[00:48:42] Alessio: I think, if anything, RAG's complexity goes up and up the more you use it, because you have more data sources, more things you want to put in there. The token costs should go down over time, if the model stays fixed. If people are happy with the model today, in two or three years it's just going to cost a lot less, you know?

[00:49:02] Alessio: So now it's like, why would I use RAG and go through all of that? It's interesting. I think RAG is better cutting-edge economics for LLMs, but large context will be better long-tail economics when you factor in the build cost of managing a RAG pipeline. But yeah, the recall was the most interesting thing, because we've seen the needle-in-the-haystack things in the past, but apparently they have 100 percent recall on anything across the context window.

[00:49:28] Alessio: At least they say so; nobody has used it. No, people

[00:49:30] swyx: have.
Yeah, so this needle-in-a-haystack thing, for people who aren't following as closely as us: someone, I forget his name now, created this needle-in-a-haystack problem where you feed in a whole bunch of generated, not junk, but just generated data, and ask the model to retrieve something specific in that data, like one line with a specific fact among, say, a hundred thousand lines. And if it gets it, you're good.

[00:49:57] swyx: And then he moves the needle around: does your ability to retrieve it vary if I put it at the start versus in the middle versus at the end? And then you generate this really nice chart that kind of shows the recallability of a model. He did that for GPT and Anthropic, and showed that Anthropic did really, really poorly.

[00:50:15] swyx: And then Anthropic came back and said it was a skill issue: just add these four magic words, and then it's magically all fixed. And obviously everybody laughed at that. But what Gemini came out with was, yeah, we reproduced that haystack test for Gemini, and it's good across all of the one-million-token window.

[00:50:30] swyx: Which is very interesting, because usually for typical context-extension methods like RoPE or YaRN or ALiBi, it's lossy by design. Usually, for conversations, that's fine, because we are lossy when we talk to people. But for superhuman intelligence, perfect memory across very, very long context is very, very interesting for picking things up.

[00:50:51] swyx: And so the people who have been given the beta for Gemini have been testing this. So what you do is you upload, let's say, all of Harry Potter, you change one fact in one sentence somewhere in there, you ask it to pick it up, and it does.
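The grid-style test described here is easy to reproduce. A minimal sketch, where `ask_model` stands in for whichever LLM API you are evaluating; the needle text and the grid sizes are arbitrary illustrative choices, not anything from the original benchmark:

```python
# Toy needle-in-a-haystack harness: hide one fact at varying depths
# in filler text, then check whether the model's answer recovers it.

NEEDLE = "The secret ingredient in the potion is dried kelp."
QUESTION = "What is the secret ingredient in the potion?"

def build_haystack(n_lines, depth):
    """Insert the needle at a fractional depth (0.0 = start, 1.0 = end)."""
    filler = [f"Filler sentence number {i}." for i in range(n_lines)]
    filler.insert(int(depth * n_lines), NEEDLE)
    return "\n".join(filler)

def run_grid(ask_model, lengths=(1_000, 10_000), depths=(0.0, 0.5, 1.0)):
    """Sweep haystack length x needle depth; True means the fact was retrieved."""
    results = {}
    for n in lengths:
        for d in depths:
            prompt = build_haystack(n, d) + "\n\n" + QUESTION
            results[(n, d)] = "dried kelp" in ask_model(prompt).lower()
    return results
```

Plotting `results` as a length-by-depth heatmap gives the kind of chart described above; a lossy context-extension method typically shows failures for needles buried deep in long haystacks.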
So this is legit.

[00:51:08] swyx: We don't really know how, because, yes, it's slow to inference, but it's not slow enough that it's, like, running five different systems in the background without telling you. Right. So it's something interesting that they haven't fully disclosed yet. The open-source community has centered on this ring attention paper, which was created by your friend Matei Zaharia and a couple of other people.

[00:51:36] swyx: And it's a form of distributing the compute. I don't super understand why calculating the feedforward and attention in blockwise fashion and distributing it makes it so good at recall. I don't think they have an answer to that. The only thing ring attention is really focused on is basically infinite context; they said it was good for, like, 10 to 100 million tokens, which is just great.

[00:51:59] swyx: So yeah, using the four wars framework, what is this framework for Gemini? One is the RAG and Ops war. Here we care less about RAG now, yes. Or, we still care as much about RAG, but now it's not important in prototyping.

[00:52:21] swyx: And then, for the data war, I guess this is just part of the overall training dataset, but Google made a $60 million deal with Reddit, and presumably they have deals with other companies. For the multimodality war, we can talk about the image generation crisis, or the fact that Gemini also has image generation, which we'll talk about in the next section.

[00:52:42] swyx: But it also has video understanding, and I think the top Gemini post came from our friend Simon Willison, who basically did a short video of him scanning over his bookshelf, and it was able to convert that video into a JSON output of what's on that bookshelf. I think that is very useful.

[00:53:04] swyx: It actually ties into the conversation that we had with David Luan from Adept.
In the sense of, okay, what if video was the main modality instead of text as the input? What if everything was video in, because that's how we work? Our eyes don't actually read; our brains don't get inputs as characters.

[00:53:25] swyx: Our brains get the pixels shooting into our eyes, and then our vision system takes over first, and then we mentally translate that into text later. And so it's kind of like what Adept is doing, which is driving by vision model instead of driving by raw text understanding of the DOM. And in that episode, which we haven't released, I made the analogy to self-driving by lidar versus self-driving by camera.

[00:53:52] swyx: Mm-hmm. Right? I think what Gemini, and any other super-long-context model that is multimodal, unlocks is: what if you just drive everything by video?

[00:54:03] Alessio: Which is cool. Yeah, and that's Joseph from Roboflow: anything that can be seen can be programmable with these models.

[00:54:12] swyx: You mean the computer vision guy is bullish on computer vision?

[00:54:18] Alessio: It's like the RAG people. The RAG people are bullish on RAG and not long context. I'm very surprised. The fine-tuning people love fine-tuning instead of few-shot. Yeah. So that's that. Yeah, I think the ring attention thing, and how they did it, we don't know. And then they released the Gemma models, which are, like, 2-billion and 7-billion open models, which people said are not good, based on my Twitter experience. They're the GPU-poor crumbs. It's like, hey, we did all this work for you, because we're GPU-rich and we're just going to run this whole thing. And
This episode came together at ~4 hours' notice, since Dylan had just landed in SF and we had to set up quickly; you might notice some small audio issues in some segments, and we apologize. We're currently building our own podcast studio for 2024!