In our Deadlands Let's Play, Francis Morgen (The Good), Jessup Cringe (The Bad), Sheamus O'Brien (The Weird) and Grace O'Brien (The Beauty) investigate mysterious events in the quiet little town of Concordia. You can buy the 4K battlemaps from the LP here and experience your own adventures in the Wild West: https://www.dans-abenteuerwelt.de/battlemaps Deadlands is a pen-and-paper role-playing game set in an alternate history reminiscent of late 19th-century America. The story is shaped by a mysterious force that pervades the Wild West: the "Weird West". In this world supernatural powers are everywhere, and the players take on the roles of characters who must deal with these forces in order to survive in a dangerous and often deadly environment.
We look back at hobby gear through the years, the good and the bad! Instagram: www.instagram.com/fantasianorthsweden/ Webshop: www.fantasianorth.com Patreon: www.patreon.com/fantasianorth
GW showed off the new Space Wolves. As you know, we like new miniatures, so let's talk new miniatures. Instagram: www.instagram.com/fantasianorthsweden/ Webshop: www.fantasianorth.com Patreon: www.patreon.com/fantasianorth
Terrain is a thing in our games, so let's talk more about it! Instagram: www.instagram.com/fantasianorthsweden/ Webshop: www.fantasianorth.com Patreon: www.patreon.com/fantasianorth
In Värdet av pengar, economics experts Pontus and Li talk about current events and guide us through capitalism's thorniest jungles. In this episode we discuss the ongoing trade war from several angles, above all historical and geopolitical. Production: @ProducentKalle. Music: WW Dödskalle. Feel free to support us on our Patreon! Shop in our Webshop! Follow us on Twitter!
After three months, the hobby podcast is finally back. Instagram: www.instagram.com/fantasianorthsweden/ Webshop: www.fantasianorth.com Patreon: www.patreon.com/fantasianorth
This is a preview of an episode on Radio åt alla's Patreon! To hear the full episode, become a monthly supporter on Radio åt alla's Patreon. Feel free to support us on our Patreon! Shop in our Webshop! Follow us on Twitter!
Christian Stange's universal sourcebook "Buch der Königreiche", intended for all fantasy role-playing games, stands out for its ambition and quickly sets itself apart from other works. The epic sourcebook has now been refined and worked on for over 20 years, condensing countless hours of passion into 600 densely packed pages.
We talk about the SD leadership in Bjuv municipality falling apart, Magnus suffering from Malmö mentality, and the reporting on criminal Malmö kids. Feel free to support us on our Patreon! Shop in our Webshop! Follow us on Twitter!
Marlies Dekkers in conversation with Joep Rovers. -- Support DNW and become a patron at http://www.petjeaf.com/denieuwewereld. Prefer to donate directly? Transfer your gift to NL61 RABO 0357 5828 61 in the name of Stichting De Nieuwe Wereld. -- Sources and links for this broadcast: - Joep's Instagram: https://www.instagram.com/joeprovers/ - Webshop: https://elvou.com - Order the book 'Werk als een atleet': https://elvou.com/werk-als-een-atleet/ - More about Weston A. Price: https://www.westonaprice.org/ - A directory of organic food addresses: https://bioadressen.nl - Reading list on PMS and menstruation: Grip op je cyclus, Lara Briden; De cyclus strategie, Maisie Hill; In the Flo, Alisa Vitti; Fix Your Period, Nicole Jardim; Woman Code, Alisa Vitti; Je brein aan de pil, Sarah Hill; The Fifth Vital Sign, Lisa Hendrickson; Beyond the Pill, Jolene Brighten; Anticonceptie zonder hormonen, Dorothee Struck
In this broadcast of Voice Of Faith, the first sermon from the Blessed Conference 2024. Tom exposes 18 lies about the prosperity gospel. Follow our online Bible school at www.bijbelschool.tv Want to give or become a partner: http://bit.ly/front-partner On bijbelschool.tv there are more than 150 video lessons on topics such as hearing God's voice, healing, calling and ministry, leadership, effective prayer and many more. Sign up via the link below to create an account on Bijbelschool.tv: https://bijbelschool.tv Don't forget to subscribe to our channel! http://bit.ly/front-subscribe Instagram | /frontrunnersministries Facebook | /frontrunnersministries Spotify | http://bit.ly/frontrunnersministries Website | https://frontrunnersministries.nl/ Webshop | http://bit.ly/front-webshop
In this broadcast of Voice Of Faith, the third sermon from the Blessed Conference 2024. Follow our online Bible school at www.bijbelschool.tv Want to give or become a partner: http://bit.ly/front-partner On bijbelschool.tv there are more than 150 video lessons on topics such as hearing God's voice, healing, calling and ministry, leadership, effective prayer and many more. Sign up via the link below to create an account on Bijbelschool.tv: https://bijbelschool.tv Don't forget to subscribe to our channel! http://bit.ly/front-subscribe Instagram | /frontrunnersministries Facebook | /frontrunnersministries Spotify | http://bit.ly/frontrunnersministries Website | https://frontrunnersministries.nl/ Webshop | http://bit.ly/front-webshop
In this broadcast of Voice Of Faith, the second sermon from the Blessed Conference 2024. Follow our online Bible school at www.bijbelschool.tv Want to give or become a partner: http://bit.ly/front-partner On bijbelschool.tv there are more than 150 video lessons on topics such as hearing God's voice, healing, calling and ministry, leadership, effective prayer and many more. Sign up via the link below to create an account on Bijbelschool.tv: https://bijbelschool.tv Don't forget to subscribe to our channel! http://bit.ly/front-subscribe Instagram | /frontrunnersministries Facebook | /frontrunnersministries Spotify | http://bit.ly/frontrunnersministries Website | https://frontrunnersministries.nl/ Webshop | http://bit.ly/front-webshop
Out of the early Let's Play series played with the rules of DSA 1, the YouTubers @DieAlriks developed their own rule system: "Fabula – Das P&P Universalregelwerk". Pen & paper is a wonderful hobby that also has a lot to offer socially and doesn't always have to cost anything. The "Fabula" rulebook is free and offers a short but comprehensive set of rules meant to provide a quick and easy entry into the pen-and-paper hobby.
We talk about querulous litigiousness with suspected racist overtones, the Flamman association and the municipality's mandate, class consciousness in Fridhem, and last but not least: what actually happened on Tiananmen Square in the spring of 1989? Feel free to support us on our Patreon! Shop in our Webshop! Follow us on Twitter!
Digital Brains | Adwise - A podcast about online marketing, digital and tech
Subscribe to our Substack and receive a summary of the episode, the show notes and links to the sources. (00:00) Intro (00:53) Digital Brains Substack newsletter (02:30) LinkedIn rolls out updates for video presentation and discovery (04:22) Meta simplifies Advantage Plus campaign settings and adds lead campaigns (08:27) Europe will crack down harder on rogue webshops with mystery shoppers (11:09) Google reviews ads with a focus on user-friendly landing pages (13:07) Google's AI Overviews push organic and paid CTRs down further (16:33) ChatGPT grows as a traffic source, leading to changing search behaviour (21:08) Pulse from the past: Google Maps celebrates its 20th birthday (22:15) Outro Shownotes: https://www.adwise.nl/podcast/ Hosts: Jeroen Roozendaal and Daan Loohuis Also follow Adwise via:
The Dutch e-commerce market is growing fast, with revenue of 30.5 billion euros in 2024 and expected growth of 8% this year. But behind those positive figures lies a big miss: Dutch webshops are estimated to be leaving more than four billion euros on the table because their websites are not accessible to people with disabilities. How is that possible, and what does it mean for entrepreneurs? In this episode, Robert van den Ham spoke with Marlene ten Ham, managing director of Thuiswinkel.org. As former director of Ecommerce Europe, she knows the online retail sector like no other. They discussed not only the impact of accessibility on revenue, but also how SMEs hold their own in a market where big players such as Bol.com and Coolblue set the tone. Can foreign webshops join Thuiswinkel.org? How digitally skilled is the average SME entrepreneur, and which trends will define the e-commerce world in the coming years? Also appearing in the broadcast were Paul van den Bosch of AD Regio, Lotte Vink of Labfresh, Tijn Baart of Goodoo, Dewi van de Waeter of The Lekker Company and Bram van Beers of Lithium Safety Containers. De Ondernemer can be heard live every Tuesday from 11:00 to 13:00 on New Business Radio and can also be watched via De Ondernemer's homepage, YouTube and Facebook.
A2 - PRE-INTERMEDIATE - In this episode, I'll talk about my daily routine. What an average weekday looks like and what I usually do on the weekends.
We talk about things going well for Malmö, things going badly for the real estate sector, the Öresund parliament, and we crown Malmöite of the Year 2024. Feel free to support us on our Patreon! Shop in our Webshop! Follow us on Twitter!
B1 - INTERMEDIATE - Who to tip, when, how much, and what to say? In this episode, I break down Hungarian tipping culture, share useful phrases, and reveal common mistakes to avoid. ✨ Webshop: https://www.patreon.com/c/hungarianwithsziszi/shop (new ebook coming on February 5th!) ☕ Support the podcast: http://ko-fi.com/hungarianwithsziszi I'm a small business owner, I work solo, and there are hours of work behind one episode. Thank you so much for considering supporting my work—it really means a lot! ❤️
It's January and a new year has begun! The gang discusses the latest news. Martin has started watching TV series, Serneke (the company) has now gone bankrupt, and Johanna conjures strange forces in her crystal ball and shares her predictions about what will happen in Gothenburg in 2025. Support Radio åt alla on Patreon! Shop in our Webshop! Follow […]
To celebrate the start of 2025 we are doing an Alfavrouwen Rewind, a best-of of the best-performing episodes of recent years. In this episode we have an exclusive conversation with Sophie Claes, founder of de Gele Flamingo. She shares her story of how she built de Gele Flamingo into Belgium's largest baby webshop. An absolute must-listen for you as an enterprising woman with big ambitions. Want to be inspired by this top entrepreneur? Then follow Sophie via @de_gele_flamingo or via her personal account @claes.sophie Did Sophie's story inspire you and do you now want to take your first steps towards your own webshop? Then sign up for our free training via https://alfavrouwen.com/wbw Sign up via Alfavrouwen.com/InstaShift Want to stay up to date with the latest social media trends? And want to know when a new podcast episode goes online? Then sign up via alfavrouwen.com/nieuwsbrief Want to learn in 10 steps how to make beautiful reels? Then sign up now via www.alfavrouwen.com/capcut Are you already part of our free Facebook group? Join now via: https://www.alfavrouwen.com/facebookgroep
What does European data law have to do with a sheep pasture? Which book could you give your lawyer friends for Christmas? And what does the BGH say about the Land Rover in the car wash? (00:23) Current legislation: discussion draft of a law amending consumer contract and insurance contract law, available on the BMJ's website (07:03) Current literature: Maximilian Becker: IoT-Daten als Gemeineigentum – Der Data Act als digitale Utopie, RDi 2024, 615-627; Chantal Bittner: Legalisierung der Eizellspende – Alles nur zum Wohle des Kindes? JZ 2024, 1087-1091; Martin Zwickel, Eva Julia Lohse and Matthias Schmid: Kompetenztraining Jura, 1st ed. 2025, details in the De Gruyter webshop (18:42) Current case law: scraping: BGH, 18 November 2024, VI ZR 10/24, full text; Land Rover in the car wash: BGH, 21 November 2024, VII ZR 39/24, full text (32:06) Current survey on collaboration during law studies (34:36) Merry Christmas and a good start to the new year!
We dust off an old tradition and go through the City of Malmö's new budget and all the parties' shadow budgets. Feel free to support us on our Patreon! Shop in our Webshop! Follow us on Twitter!
In this episode of Z 7 op 7: the Ghent tech company Lighthouse raises 350 million euros, making it the 5th Belgian tech company worth more than 1 billion. 70 retail chains and webshops ask the Belgian and European authorities to take action against Chinese webshops. In our daily stock market talk, Ken Van Weyenberg of Candriam analyses the quarterly figures from Nvidia and Snowflake. And in Tussen Wetstraat en Wall Street, Head of Business News and political journalist Jan De Meulemeester and Trends-Kanaal Z colleague Jens Leen discuss the booming crypto market.
New episode of the world's best podcast about Gothenburg local politics! In this episode Martin, Johanna and Jacob talk about the latest budget, new tram lines in Gothenburg, the Millennium switchover in healthcare and many other news items. Plus: was Jacob nearly killed by Lasse Kronér? Support Radio åt alla on Patreon! Shop in our Webshop! Follow Radio åt alla on Twitter!
Välkommen till Malmö talks about reckless driving, benefit fraud and a certain Bengt in Sofielund. The red thread through the whole episode is the eternal question: where is society? Feel free to support us on our Patreon! Shop in our Webshop! Follow us on Twitter!
This is a preview of a Q&A episode from Eld och rörelse on Radio åt alla's Patreon! To hear the full episode, become a monthly supporter on Radio åt alla's Patreon. Feel free to support us on our Patreon! Shop in our Webshop! Follow us on Twitter!
We propose an intuitive LLM prompting framework (AgentKit) for multifunctional agents. AgentKit offers a unified framework for explicitly constructing a complex "thought process" from simple natural language prompts. The basic building block in AgentKit is a node, containing a natural language prompt for a specific subtask. The user then puts together chains of nodes, like stacking LEGO pieces. The chains of nodes can be designed to explicitly enforce a naturally structured "thought process". For example, for the task of writing a paper, one may start with the thought process of 1) identify a core message, 2) identify prior research gaps, etc. The nodes in AgentKit can be designed and combined in different ways to implement multiple advanced capabilities including on-the-fly hierarchical planning, reflection, and learning from interactions. In addition, due to the modular nature and the intuitive design that mirrors an explicit human thought process, a basic agent can be implemented as simply as a list of prompts for the subtasks and can therefore be designed and tuned by someone without any programming experience. Quantitatively, we show that agents designed through AgentKit achieve SOTA performance on WebShop and Crafter. These advances underscore AgentKit's potential in making LLM agents effective and accessible for a wider range of applications. https://github.com/holmeswww/AgentKit 2024: Yue Wu, Yewen Fan, So Yeon Min, Shrimai Prabhumoye, Stephen McAleer, Yonatan Bisk, Ruslan Salakhutdinov, Yuanzhi Li, Tom M. Mitchell https://arxiv.org/pdf/2404.11483
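To make the node-and-chain idea above concrete, here is a minimal sketch of how such a framework could be wired up. The class names, the call_llm placeholder, and the example prompts are illustrative assumptions, not the actual AgentKit API; see the linked repository for the real interface.

```python
# Minimal sketch of the node-and-chain idea from the abstract above.
# Class names, the call_llm() placeholder, and the prompts are illustrative
# assumptions, not the actual AgentKit API.

from dataclasses import dataclass, field


def call_llm(prompt: str) -> str:
    """Stand-in for any LLM completion call (OpenAI client, local model, etc.)."""
    raise NotImplementedError


@dataclass
class Node:
    """One subtask: a natural-language prompt plus the nodes whose outputs it depends on."""
    name: str
    prompt: str
    dependencies: list["Node"] = field(default_factory=list)

    def run(self, results: dict[str, str]) -> str:
        # Feed the outputs of dependency nodes into this node's prompt as context.
        context = "\n".join(f"{d.name}: {results[d.name]}" for d in self.dependencies)
        return call_llm(f"{context}\n\nSubtask: {self.prompt}")


def run_chain(nodes: list["Node"]) -> dict[str, str]:
    """Execute the nodes in order, accumulating each subtask's output."""
    results: dict[str, str] = {}
    for node in nodes:
        results[node.name] = node.run(results)
    return results


# The paper-writing "thought process" from the abstract, expressed as a chain of nodes.
core = Node("core_message", "Identify the core message of the paper.")
gaps = Node("research_gaps", "Identify gaps in prior research.", dependencies=[core])
outline = Node("outline", "Draft an outline that addresses those gaps.", dependencies=[core, gaps])
# run_chain([core, gaps, outline])
```

The appeal of this structure is that the whole agent is just an ordered list of nodes, so the "thought process" can be tuned by editing prompts rather than code, which matches the abstract's claim that non-programmers can design and tune a basic agent.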
Welcome to the Malmö episode with the lowest energy so far. We mainly talk about the NGBG festival as an urban-transformation project and about how it was more fun back when things were planned on a grand scale. Total ignorance about sports is also promised. Feel free to support us on our Patreon! Shop in our Webshop! Follow us on Twitter!
OpenAI DevDay is almost here! Per tradition, we are hosting a DevDay pregame event for everyone coming to town! Join us with demos and gossip! Also sign up for related events across San Francisco: the AI DevTools Night, the xAI open house, the Replicate art show, the DevDay Watch Party (for non-attendees), and Hack Night with OpenAI at Cloudflare. For everyone else, join the Latent Space Discord for our online watch party and find fellow AI Engineers in your city.
OpenAI's recent o1 release (and Reflection 70b debacle) has reignited broad interest in agentic general reasoning and tree search methods. While we have covered some of the self-taught reasoning literature on the Latent Space Paper Club, it is notable that Eric Zelikman ended up at xAI, whereas OpenAI's hiring of Noam Brown and now Shunyu suggests more interest in tool-using chain of thought/tree of thought/generator-verifier architectures for Level 3 Agents. We were more than delighted to learn that Shunyu is a fellow Latent Space enjoyer, and invited him back (after his first appearance on our NeurIPS 2023 pod) for a look through his academic career with Harrison Chase (one year after his first LS show).
ReAct: Synergizing Reasoning and Acting in Language Models (paper link)
Following seminal Chain of Thought papers from Wei et al and Kojima et al, and reflecting on lessons from building the WebShop human ecommerce trajectory benchmark, Shunyu's first big hit, the ReAct paper, showed that using LLMs to "generate both reasoning traces and task-specific actions in an interleaved manner" achieved remarkably greater performance (less hallucination/error propagation, higher ALFWorld/WebShop benchmark success) than CoT alone. In even better news, ReAct scales fabulously with finetuning. As a member of the elite Princeton NLP group, Shunyu was also a coauthor of the Reflexion paper, which we discuss in this pod.
Tree of Thoughts (paper link here)
Shunyu's next major improvement on the CoT literature was Tree of Thoughts: "Language models are increasingly being deployed for general problem solving across a wide range of tasks, but are still confined to token-level, left-to-right decision-making processes during inference. This means they can fall short in tasks that require exploration, strategic lookahead, or where initial decisions play a pivotal role… ToT allows LMs to perform deliberate decision making by considering multiple different reasoning paths and self-evaluating choices to decide the next course of action, as well as looking ahead or backtracking when necessary to make global choices." The beauty of ToT is that it doesn't require pretraining with exotic methods like backspace tokens or other MCTS architectures. You can listen to Shunyu explain ToT in his own words on our NeurIPS pod, but also to the ineffable Yannic Kilcher.
Other Work
We don't have the space to summarize the rest of Shunyu's work; you can listen to our pod with him now, and we recommend the CoALA paper and his initial hit webinar with Harrison, today's guest cohost, as well as Shunyu's PhD Defense Lecture and his latest lecture covering a Brief History of LLM Agents. As usual, we are live on YouTube!
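The interleaved Thought/Action/Observation loop that ReAct proposes is simple enough to sketch in a few lines. The snippet below is a minimal illustration under assumed helpers: call_llm stands in for any completion API with stop sequences, and the search tool, prompt wording, and parsing are placeholders rather than the paper's exact prompts or benchmark harness.

```python
# Minimal ReAct-style loop: the model alternates free-form "Thought:" reasoning
# with "Action:" tool calls, and each tool result is appended back into the
# prompt as an "Observation:" before the next step. All helpers and prompt
# wording are illustrative assumptions, not the paper's exact setup.

def call_llm(prompt: str, stop: list[str]) -> str:
    """Placeholder for any completion API that supports stop sequences."""
    raise NotImplementedError


def search(query: str) -> str:
    """Placeholder search tool (e.g. a Wikipedia lookup in the paper's HotpotQA setting)."""
    raise NotImplementedError


TOOLS = {"Search": search}


def react(question: str, max_steps: int = 8) -> str:
    trajectory = f"Question: {question}\n"
    for _ in range(max_steps):
        # Ask for one Thought + Action, stopping before the model invents an Observation.
        step = call_llm(trajectory + "Thought:", stop=["Observation:"])
        trajectory += "Thought:" + step + "\n"
        if "Action: Finish[" in step:
            # The final answer is whatever the model put inside Finish[...].
            return step.split("Action: Finish[", 1)[1].rsplit("]", 1)[0]
        if "Action:" in step:
            name, arg = step.split("Action:", 1)[1].strip().split("[", 1)
            tool = TOOLS.get(name.strip())
            observation = tool(arg.rstrip("] \n")) if tool else "Unknown tool."
            trajectory += f"Observation: {observation}\n"
    return trajectory  # fall back to the raw trajectory if no Finish action was produced
```

Each iteration appends one thought and one action to the trajectory and feeds the tool result back as an observation, which is the interleaving the paper credits for reduced hallucination and error propagation.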
Show Notes
* Harrison Chase
* LangChain, LangSmith, LangGraph
* Shunyu Yao
* Alec Radford
* ReAct Paper
* Hotpot QA
* Tau Bench
* WebShop
* SWE-Agent
* SWE-Bench
* Trees of Thought
* CoALA Paper
Related Episodes
* Our Thomas Scialom (Meta) episode
* Shunyu on our NeurIPS 2023 Best Papers episode
* Harrison on our LangChain episode
Mentions
* Sierra
* Voyager
* Jason Wei
* Tavily
* SERP API
* Exa
Timestamps
* [00:00:00] Opening Song by Suno
* [00:03:00] Introductions
* [00:06:16] The ReAct paper
* [00:12:09] Early applications of ReAct in LangChain
* [00:17:15] Discussion of the Reflection paper
* [00:22:35] Tree of Thoughts paper and search algorithms in language models
* [00:27:21] SWE-Agent and SWE-Bench for coding benchmarks
* [00:39:21] CoALA: Cognitive Architectures for Language Agents
* [00:45:24] Agent-Computer Interfaces (ACI) and tool design for agents
* [00:49:24] Designing frameworks for agents vs humans
* [00:53:52] UX design for AI applications and agents
* [00:59:53] Data and model improvements for agent capabilities
* [01:19:10] TauBench
* [01:23:09] Promising areas for AI
Transcript
Alessio [00:00:01]: Hey, everyone, welcome to the Latent Space podcast. This is Alessio, Partner and CTO-in-Residence at Decibel Partners, and I'm joined by my co-host Swyx, founder of Smol AI.
Swyx [00:00:12]: Hey, and today we have a super special episode. I actually always wanted to take like a selfie and go like, you know, POV, you're about to revolutionize the world of agents because we have two of the most awesome hiring agents in the house. So first, we're going to welcome back Harrison Chase. Welcome. Excited to be here. What's new with you recently in sort of like the 10, 20 second recap?
Harrison [00:00:34]: LangChain, LangSmith, LangGraph, pushing on all of them. Lots of cool stuff related to a lot of the stuff that we're going to talk about today, probably.
Swyx [00:00:42]: Yeah.
Alessio [00:00:43]: We'll mention it in there. And the Celtics won the title.
Swyx [00:00:45]: And the Celtics won the title. You got that going on for you. I don't know. Is that like floorball? Handball? Baseball? Basketball.
Alessio [00:00:52]: Basketball, basketball.
Harrison [00:00:53]: Patriots aren't looking good though, so that's...
Swyx [00:00:56]: And then Shunyu, you've also been on the pod, but only in like a sort of oral paper presentation capacity. But welcome officially to the Latent Space pod.
Shunyu [00:01:03]: Yeah, I've been a huge fan. So thanks for the invitation. Thanks.
Swyx [00:01:07]: Well, it's an honor to have you on. You're one of like, you're maybe the first PhD thesis defense I've ever watched in like this AI world, because most people just publish single papers, but every paper of yours is a banger. So congrats.
Shunyu [00:01:22]: Thanks.
Swyx [00:01:24]: Yeah, maybe we'll just kick it off with, you know, what was your journey into using language models for agents? I like that your thesis advisor, I didn't catch his name, but he was like, you know... Karthik. Yeah. It's like, this guy just wanted to use language models and it was such a controversial pick at the time. Right.
Shunyu [00:01:39]: The full story is that in undergrad, I did some computer vision research and that's how I got into AI. But at the time, I feel like, you know, you're just composing all the GAN or 3D perception or whatever together and it's not exciting anymore. And one day I just see this transformer paper and that's really cool. But I really got into language model only when I entered my PhD and met my advisor Karthik.
So he was actually the second author of GPT-1 when he was like a visiting scientist at OpenAI. With Alec Radford?
Swyx [00:02:10]: Yes.
Shunyu [00:02:11]: Wow. That's what he told me. It's like back in OpenAI, they did this GPT-1 together and Ilya just said, Karthik, you should stay because we just solved the language. But apparently Karthik is not fully convinced. So he went to Princeton, started his professorship and I'm really grateful. So he accepted me as a student, even though I have no prior knowledge in NLP. And you know, we just met for the first time and he's like, you know, what do you want to do? And I'm like, you know, you have done those text game things. That's really cool. I wonder if we can just redo them with language models. And that's how the whole journey began. Awesome.
Alessio [00:02:46]: So GPT-2 was out at the time? Yes, that was 2019.
Shunyu [00:02:48]: Yeah.
Alessio [00:02:49]: Way too dangerous to release. And then I guess the first work of yours that I came across was React, which was a big part of your defense. But also Harrison, when you came on the podcast last year, you said that was one of the first papers that you saw when you were getting inspired for LangChain. So maybe give a recap of why you thought it was cool, because you were already working in AI and machine learning. And then, yeah, you can kind of like intro the paper formally. What was that interesting to you specifically?
Harrison [00:03:16]: Yeah, I mean, I think the interesting part was using these language models to interact with the outside world in some form. And I think in the paper, you mostly deal with Wikipedia. And I think there's some other data sets as well. But the outside world is the outside world. And so interacting with things that weren't present in the LLM and APIs and calling into them and thinking about the React reasoning and acting and kind of like combining those together and getting better results. I'd been playing around with LLMs, been talking with people who were playing around with LLMs. People were trying to get LLMs to call into APIs, do things, and it was always, how can they do it more reliably and better? And so this paper was basically a step in that direction. And I think really interesting and also really general as well. Like I think that's part of the appeal is just how general and simple in a good way, I think the idea was. So that it was really appealing for all those reasons.
Shunyu [00:04:07]: Simple is always good. Yeah.
Alessio [00:04:09]: Do you have a favorite part? Because I have one favorite part from your PhD defense, which I didn't understand when I read the paper, but you said something along the lines, React doesn't change the outside or the environment, but it does change the inside through the context, putting more things in the context. You're not actually changing any of the tools around you to work for you, but you're changing how the model thinks. And I think that was like a very profound thing when I, now that I've been using these tools for like 18 months. I'm like, I understand what you meant, but like to say that at the time you did the PhD defense was not trivial. Yeah.
Shunyu [00:04:41]: Another way to put it is like thinking can be an extra tool that's useful.
Alessio [00:04:47]: Makes sense. Checks out.
Swyx [00:04:49]: Who would have thought? I think it's also more controversial within his world because everyone was trying to use RL for agents. And this is like the first kind of zero gradient type approach.
Yeah.
Shunyu [00:05:01]: I think the bigger kind of historical context is that we have this two big branches of AI. So if you think about RL, right, that's pretty much the equivalent of agent at a time. And it's like agent is equivalent to reinforcement learning and reinforcement learning is equivalent to whatever game environment they're using, right? Atari game or go or whatever. So you have like a pretty much, you know, you have a biased kind of like set of methodologies in terms of reinforcement learning and represents agents. On the other hand, I think NLP is like a historical kind of subject. It's not really into agents, right? It's more about reasoning. It's more about solving those concrete tasks. And if you look at ACL, right, like each task has its own track, right? Summarization has a track, question answering has a track. So I think really it's about rethinking agents in terms of what could be the new environments that we came to have is not just Atari games or whatever video games, but also those text games or language games. And also thinking about, could there be like a more general kind of methodology beyond just designing specific pipelines for each NLP task? That's like the bigger kind of context, I would say.
Alessio [00:06:14]: Is there an inspiration spark moment that you remember or how did you come to this? We had Tri Dao on the podcast and he mentioned he was really inspired working with like systems people to think about Flash Attention. What was your inspiration journey?
Shunyu [00:06:27]: So actually before React, I spent the first two years of my PhD focusing on text-based games, or in other words, text adventure games. It's a very kind of small kind of research area and quite ad hoc, I would say. And there are like, I don't know, like 10 people working on that at the time. And have you guys heard of Zork 1, for example? So basically the idea is you have this game and you have text observations, like you see a monster, you see a dragon.
Swyx [00:06:57]: You're eaten by a grue.
Shunyu [00:06:58]: Yeah, you're eaten by a grue. And you have actions like kill the grue with a sword or whatever. And that's like a very typical setup of a text game. So I think one day after I've seen all the GPT-3 stuff, I just think about, you know, how can I solve the game? Like why those AI, you know, machine learning methods are pretty stupid, but we are pretty good at solving the game relatively, right? So for the context, the predominant method to solve this text game is obviously reinforcement learning. And the idea is you just trial and error in those games for like millions of steps and you kind of just overfit to the game. But there's no language understanding at all. And I'm like, why can't I solve the game better? And it's kind of like, because we think about the game, right? Like when we see this very complex text observation, like you see a grue and you might see a sword, you know, in the right of the room and you have to go through the wooden door to go to that room. You will think, you know, oh, I have to kill the monster and to kill that monster, I have to get the sword, I have to get the sword, I have to go, right? And this kind of thinking actually helps us kind of zero-shot the game. And it's like, why don't we also enable the text agents to think? And that's kind of the prototype of React. And I think that's actually very interesting because the prototype, I think, was around November of 2021. So that's even before like chain of thought or whatever came up.
So we did a bunch of experiments in the text game, but it was not really working that well. Like those text games are just too hard. I think today it's still very hard. Like if you use GPT-4 to solve it, it's still very hard. So the change came when I started the internship in Google. And apparently Google care less about text game, they care more about what's more practical. So pretty much I just reapplied the idea, but to more practical kind of environments like Wikipedia or simpler text games like ALFWorld, and it just worked. It's kind of like you first have the idea and then you try to find the domains and the problems to demonstrate the idea, which is, I would say, different from most of the AI research, but it kind of worked out for me in that case.
Swyx [00:09:09]: For Harrison, when you were implementing React, what were people applying React to in the early days?
Harrison [00:09:14]: I think the first demo we did probably had like a calculator tool and a search tool. So like general things, we tried to make it pretty easy to write your own tools and plug in your own things. And so this is one of the things that we've seen in LangChain is people who build their own applications generally write their own tools. Like there are a few common ones. I'd say like the three common ones might be like a browser, a search tool, and a code interpreter. But then other than that-
Swyx [00:09:37]: The LMS. Yep.
Harrison [00:09:39]: Yeah, exactly. It matches up very nice with that. And we actually just redid like our integrations docs page, and if you go to the tool section, they like highlight those three, and then there's a bunch of like other ones. And there's such a long tail of other ones. But in practice, like when people go to production, they generally have their own tools or maybe one of those three, maybe some other ones, but like very, very few other ones. So yeah, I think the first demos was a search and a calculator one. And there's- What's the data set?
Shunyu [00:10:04]: Hotpot QA.
Harrison [00:10:05]: Yeah. Oh, so there's that one. And then there's like the celebrity one by the same author, I think.
Swyx [00:10:09]: Olivia Wilde's boyfriend squared. Yeah. 0.23. Yeah. Right, right, right.
Harrison [00:10:16]: I'm forgetting the name of the author, but there's-
Swyx [00:10:17]: I was like, we're going to over-optimize for Olivia Wilde's boyfriend, and it's going to change next year or something.
Harrison [00:10:21]: There's a few data sets kind of like in that vein that require multi-step kind of like reasoning and thinking. So one of the questions I actually had for you in this vein, like the React paper, there's a few things in there, or at least when I think of that, there's a few things that I think of. There's kind of like the specific prompting strategy. Then there's like this general idea of kind of like thinking and then taking an action. And then there's just even more general idea of just like taking actions in a loop. Today, like obviously language models have changed a lot. We have tool calling. The specific prompting strategy probably isn't used super heavily anymore. Would you say that like the concept of React is still used though? Or like do you think that tool calling and running tool calling in a loop, is that React
Swyx [00:11:02]: in your mind?
Shunyu [00:11:03]: I would say like it's like more implicitly used than explicitly used. To be fair, I think the contribution of React is actually twofold.
So first is this idea of, you know, we should be able to use calls in a very general way. Like there should be a single kind of general method to handle interaction with various environments. I think React is the first paper to demonstrate the idea. But then I think later there was Toolformer or whatever, and this becomes like a trivial idea. But I think at the time, that's like a pretty non-trivial thing. And I think the second contribution is this idea of what people call like inner monologue or thinking or reasoning or whatever, to be paired with tool use. I think that's still non-trivial because if you look at the default function calling or whatever, like there's no inner monologue. And in practice, that actually is important, especially if the tool that you use is pretty different from the training distribution of the language model. I think those are the two main things that are kind of inherited.
Harrison [00:12:10]: On that note, I think OpenAI even recommended when you're doing tool calling, it's sometimes helpful to put a thought field in the tool, along with all the actual required arguments,
Swyx [00:12:19]: and then have that one first.
Harrison [00:12:20]: So it fills out that first, and they've shown that that's yielded better results. The reason I ask is just like this same concept is still alive, and I don't know whether to call it a React agent or not. I don't know what to call it. I think of it as React, like it's the same ideas that were in the paper, but it's obviously a very different implementation at this point in time. And so I just don't know what to call it.
Shunyu [00:12:40]: I feel like people will sometimes think more in terms of different tools, right? Because if you think about a web agent versus, you know, like a function calling agent, calling a Python API, you would think of them as very different. But in some sense, the methodology is the same. It depends on how you view them, right? I think people will tend to think more in terms of the environment and the tools rather than the methodology. Or, in other words, I think the methodology is kind of trivial and simple, so people will try to focus more on the different tools. But I think it's good to have a single underlying principle of those things.
Alessio [00:13:17]: How do you see the surface of React getting molded into the model? So a function calling is a good example of like, now the model does it. What about the thinking? Now most models that you use kind of do chain of thought on their own, they kind of produce steps. Do you think that more and more of this logic will be in the model? Or do you think the context window will still be the main driver of reasoning and thinking?
Shunyu [00:13:39]: I think it's already default, right? You do some chain of thought and you do some tool call, the cost of adding the chain of thought is kind of relatively low compared to other things. So it's not hurting to do that. And I think it's already kind of common practice, I would say.
Swyx [00:13:56]: This is a good place to bring in either Tree of Thought or Reflection, your pick.
Shunyu [00:14:01]: Maybe Reflection, to respect the time order, I would say.
Swyx [00:14:05]: Any backstory as well, like the people involved with Noah and the Princeton group. We talked about this offline, but people don't understand how these research pieces come together and this ideation.
Shunyu [00:14:15]: I think Reflection is mostly Noah's work, I'm more like advising kind of role.
The story is, I don't remember the time, but one day we just see this pre-print that's like Reflexion, an autonomous agent with memory or whatever. And it's kind of like an extension to React, which uses this self-reflection. I'm like, oh, somehow you've become very popular. And Noah reached out to me, it's like, do you want to collaborate on this and make this from an arXiv pre-print to something more solid, like a conference submission? I'm like, sure. We started collaborating and we remain good friends today. And I think another interesting backstory is Noah was contacted by OpenAI at the time. It's like, this is pretty cool, do you want to just work at OpenAI? And I think Sierra also reached out at the same time. It's like, this is pretty cool, do you want to work at Sierra? And I think Noah chose Sierra, but it's pretty cool because he was still like a second year undergrad and he's a very smart kid.
Swyx [00:15:16]: Based on one paper. Oh my god.
Shunyu [00:15:19]: He's done some other research based on programming language or chemistry or whatever, but I think that's the paper that got the attention of OpenAI and Sierra.
Swyx [00:15:28]: For those who haven't gone too deep on it, the way that you present the inside of React, can you do that also for reflection? Yeah.
Shunyu [00:15:35]: I think one way to think of reflection is that the traditional idea of reinforcement learning is you have a scalar reward and then you somehow back-propagate the signal of the scalar reward to the rest of your neural network through whatever algorithm, like policy gradient or A2C or whatever. And if you think about the real life, most of the reward signal is not scalar. It's like your boss told you, you should have done a better job in this, but you could jump on that or whatever. It's not like a scalar reward, like 29 or something. I think in general, humans deal more with non-scalar reward, or you can say language feedback. And the way that they deal with language feedback also has this back-propagation process, right? Because you start from this, you did a good job on job B, and then you reflect what could have been done differently to change to make it better. And you kind of change your prompt, right? Basically, you change your prompt on how to do job A and how to do job B, and then you do the whole thing again. So it's really like a pipeline of language where in self-graded descent, you have something like text reasoning to replace those gradient descent algorithms. I think that's one way to think of reflection.
Harrison [00:16:47]: One question I have about reflection is how general do you think the algorithm there is? And so for context, I think at LangChain and at other places as well, we found it pretty easy to implement React in a standard way. You plug in any tools and it kind of works off the shelf, can get it up and running. I don't think we have an off-the-shelf kind of implementation of reflection and kind of the general sense. I think the concepts, absolutely, we see used in different kind of specific cognitive architectures, but I don't think we have one that comes off the shelf. I don't think any of the other frameworks have one that comes off the shelf. And I'm curious whether that's because it's not general enough or it's complex as well, because it also requires running it more times.
Swyx [00:17:28]: Maybe that's not feasible.
Harrison [00:17:30]: I'm curious how you think about the generality, complexity.
Should we have one that comes off the shelf?
Shunyu [00:17:36]: I think the algorithm is general in the sense that it's just as general as other algorithms, if you think about policy gradient or whatever, but it's not applicable to all tasks, just like other algorithms. So you can argue PPO is also general, but it works better for those set of tasks, but not on those set of tasks. I think it's the same situation for reflection. And I think a key bottleneck is the evaluator, right? Basically, you need to have a good sense of the signal. So for example, if you are trying to do a very hard reasoning task, say mathematics, for example, and you don't have any tools, you're operating in this chain of thought setup, then reflection will be pretty hard because in order to reflect upon your thoughts, you have to have a very good evaluator to judge whether your thought is good or not. But that might be as hard as solving the problem itself or even harder. The principle of self-reflection is probably more applicable if you have a good evaluator, for example, in the case of coding. If you have those errors, then you can just reflect on that and how to solve the bug and
Swyx [00:18:37]: stuff.
Shunyu [00:18:38]: So I think another criteria is that it depends on the application, right? If you have this latency or whatever need for an actual application with an end-user, the end-user wouldn't let you do two hours of tree-of-thought or reflection, right? You need something as soon as possible. So in that case, maybe this is better to be used as a training time technique, right? You do those reflection or tree-of-thought or whatever, you get a lot of data, and then you try to use the data to train your model better. And then in test time, you still use something as simple as React, but that's already improved.
Alessio [00:19:11]: And if you think of the Voyager paper as a way to store skills and then reuse them, how would you compare this reflective memory and at what point it's just RAGing on the memory versus you want to start to fine-tune some of them or what's the next step once you get a very long reflective corpus? Yeah.
Shunyu [00:19:30]: So I think there are two questions here. The first question is, what type of information or memory are you considering, right? Is it like semantic memory that stores knowledge about the world, or is it the episodic memory that stores trajectories or behaviors, or is it more of a procedural memory like in Voyager's case, like skills or code snippets that you can use to do actions, right?
Swyx [00:19:54]: That's one dimension.
Shunyu [00:19:55]: And the second dimension is obviously how you use the memory, either retrieving from it, using it in the context, or fine-tuning it. I think the Cognitive Architecture for Language Agents paper has a good categorization of all the different combinations. And of course, which way you use depends on the concrete application and the concrete need and the concrete task. But I think in general, it's good to think of those systematic dimensions and all the possible options there.
Swyx [00:20:25]: Harrison also has in LangMEM, I think you did a presentation in my meetup, and I think you've done it at a couple other venues as well. User state, semantic memory, and append-only state, I think kind of maps to what you just said.
Shunyu [00:20:38]: What is LangMEM? Can you give it like a quick...
Harrison [00:20:40]: One of the modules of LangChain for a long time has been something around memory.
And I think we're still obviously figuring out what that means, as is everyone kind of in the space. But one of the experiments that we did, and one of the proof of concepts that we did was, technically what it was is you would basically create threads, you'd push messages to those threads in the background, we process the data in a few ways. One, we put it into some semantic store, that's the semantic memory. And then two, we do some extraction and reasoning over the memories to extract. And we let the user define this, but extract key facts or anything that's of interest to the user. Those aren't exactly trajectories, they're maybe more closer to the procedural memory. Is that how you'd think about it or classify it?
Shunyu [00:21:22]: Is it like about knowledge about the world, or is it more like how to do something?
Swyx [00:21:27]: It's reflections, basically.
Harrison [00:21:28]: So in generative worlds.
Shunyu [00:21:30]: Generative agents.
Swyx [00:21:31]: The Smallville. Yeah, the Smallville one.
Harrison [00:21:33]: So the way that they had their memory there was they had the sequence of events, and that's kind of like the raw events that happened. But then every N events, they'd run some synthesis over those events for the LLM to insert its own memory, basically. It's that type of memory.
Swyx [00:21:49]: I don't know how that would be classified.
Shunyu [00:21:50]: I think of that as more of the semantic memory, but to be fair, I think it's just one way to think of that. But whether it's semantic memory or procedural memory or whatever memory, that's like an abstraction layer. But in terms of implementation, you can choose whatever implementation for whatever memory. So they're totally kind of orthogonal. I think it's more of a good way to think of the things, because from the history of cognitive science and cognitive architecture and how people study even neuroscience, that's the way people think of how the human brain organizes memory. And I think it's more useful as a way to think of things. But it's not like for semantic memory, you have to do this kind of way to retrieve or fine-tune, and for procedural memory, you have to do that. I think those are totally orthogonal kind of dimensions.
Harrison [00:22:34]: How much background do you have in cognitive sciences, and how much do you model some of your thoughts on?
Shunyu [00:22:40]: That's a great question, actually. I think one of the undergrad influences for my follow-up research is I was doing an internship at MIT's Computational Cognitive Science Lab with Josh Tenenbaum, and he's a very famous cognitive scientist. And I think a lot of his ideas still influence me today, like thinking of things in computational terms and getting interested in language and a lot of stuff, or even developmental psychology kind of stuff. So I think it still influences me today.
Swyx [00:23:14]: As a developer that tried out LangMEM, the way I view it is just it's a materialized view of a stream of logs. And if anything, that's just useful for context compression. I don't have to use the full context to run it over everything. But also it's kind of debuggable. If it's wrong, I can show it to the user, the user can manually fix it, and I can carry on. That's a really good analogy. I like that. I'm going to steal that. Sure. Please, please. You know I'm bullish on memory databases. I guess, Tree of Thoughts? Yeah, Tree of Thoughts.
Shunyu [00:23:39]: I feel like I'm reliving the defense in like a podcast format.
Yeah, no.
Alessio [00:23:45]: I mean, you had a banger. Well, this is the one where you're already successful and we just highlight the glory. It was really good. You mentioned that since thinking is kind of like taking an action, you can use action searching algorithms to think of thinking. So just like you will use Tree Search to find the next thing. And the idea behind Tree of Thought is that you generate all these possible outcomes and then find the best tree to get to the end. Maybe back to the latency question, you can't really do that if you have to respond in real time. So what are maybe some of the most helpful use cases for things like this? Where have you seen people adopt it where the high latency is actually worth the wait?
Shunyu [00:24:21]: For things that you don't care about latency, obviously. For example, if you're trying to do math, if you're just trying to come up with a proof. But I feel like one type of task is more about searching for a solution. You can try a hundred times, but if you find one solution, that's good. For example, if you're finding a math proof or if you're finding a good code to solve a problem or whatever, I think another type of task is more like reacting. For example, if you're doing customer service, you're like a web agent booking a ticket for an end user. Those are more reactive kind of tasks, or more real-time tasks. You have to do things fast. They might be easy, but you have to do it reliably. And you care more about can you solve 99% of the time out of a hundred. But for the type of search type of tasks, then you care more about can I find one solution out of a hundred. So it's kind of symmetric and different.
Alessio [00:25:11]: Do you have any data or intuition from your user base? What's the split of these type of use cases? How many people are doing more reactive things and how many people are experimenting with deep, long search?
Harrison [00:25:23]: I would say React's probably the most popular. I think there's aspects of reflection that get used. Tree of thought, probably the least so. There's a great tweet from Jason Wei, I think you're now a colleague, and he was talking about prompting strategies and how he thinks about them. And I think the four things that he had was, one, how easy is it to implement? How much compute does it take? How many tasks does it solve? And how much does it improve on those tasks? And I'd add a fifth, which is how likely is it to be relevant when the next generation of models come out? And I think if you look at those axes and then you look at React, reflection, tree of thought, it tracks that the ones that score better are used more. React is pretty easy to implement. Tree of thought's pretty hard to implement. The amount of compute, yeah, a lot more for tree of thought. The tasks and how much it improves, I don't have amazing visibility there. But I think if we're comparing React versus tree of thought, React just dominates the first two axes so much that my question around that was going to be like, how do you think about these prompting strategies, cognitive architectures, whatever you want to call them? When you're thinking of them, what are the axes that you're judging them on in your head when you're thinking whether it's a good one or a less good one?
Swyx [00:26:38]: Right.
Shunyu [00:26:39]: Right. I think there is a difference between a prompting method versus research, in the sense that for research, you don't really even care about does it actually work on practical tasks or does it help?
Alessio [00:25:11]: Do you have any data or intuition from your user base? What's the split of these types of use cases? How many people are doing more reactive things and how many people are experimenting with deep, long search?
Harrison [00:25:23]: I would say ReAct's probably the most popular. I think there's aspects of reflection that get used. Tree of Thought, probably the least so. There's a great tweet from Jason Wei, who I think is now a colleague of yours, where he was talking about prompting strategies and how he thinks about them. And I think the four things that he had were: one, how easy is it to implement? How much compute does it take? How many tasks does it solve? And how much does it improve on those tasks? And I'd add a fifth, which is how likely is it to be relevant when the next generation of models comes out? And if you look at those axes and then you look at ReAct, Reflexion, and Tree of Thought, it tracks that the ones that score better are used more. ReAct is pretty easy to implement. Tree of Thought's pretty hard to implement. The amount of compute, yeah, a lot more for Tree of Thought. The tasks and how much it improves, I don't have amazing visibility there. But if we're comparing ReAct versus Tree of Thought, ReAct just dominates the first two axes so much. So my question was going to be: how do you think about these prompting strategies, cognitive architectures, whatever you want to call them? When you're evaluating them, what are the axes you're judging them on in your head when you're deciding whether it's a good one or a less good one?
Swyx [00:26:38]: Right.
Shunyu [00:26:39]: Right. I think there is a difference between a prompting method versus research, in the sense that for research, you don't really even care about whether it actually works on practical tasks or whether it helps or whatever. I think it's more about the idea or the principle, right? What's the direction that you're unblocking, and so on. And for an actual prompting method to solve a concrete problem, I would say simplicity is very important, because the simpler it is, the fewer decisions you have to make about it. It's easier to design, it's easier to propagate, and it's easier to do stuff. So always try to be as simple as possible. And latency obviously is important: if you can do things fast, you don't want to do things slow. And in terms of the actual prompting method to use for a particular problem, I think we should all be in the minimalist camp, right? You should try the minimum thing and see if it works. If it doesn't work and there's an absolute reason to add something, then you add something. If there's an absolute reason that you need some tool, then you should add the tool. If there's an absolute reason to add reflection or whatever, you should add that. Otherwise, if chain of thought can already solve something, then you don't even need any of that.
Harrison [00:27:57]: Yeah. Or if just better prompting can solve it. Like, you know, you could add a reflection step or you could make your instructions a little bit clearer.
Swyx [00:28:03]: And it's a lot easier to do that.
Shunyu [00:28:04]: I think another interesting thing is, I personally have never done those kinds of weird tricks. All the prompts that I write are just like talking to a human, right? I never say something like, your grandma is dying and you have to solve it. I mean, those are cool, but I feel like we should all try to solve things in a very intuitive way, just like talking to your co-worker. That should work 99% of the time. That's my personal take.
Swyx [00:28:29]: The problem with language models, at least in the GPT-3 era, was that they over-optimized to some sets of tokens in sequence. Reading the Kojima et al. "let's think step by step" paper, they tried a bunch of phrasings and got wildly different results. It should not be the case, but it is the case. And hopefully we're getting better there.
Shunyu [00:28:51]: Yeah. I think it's also a timing thing, in the sense that if you think about this whole line of language models, at the time it was just a text generator. We didn't have any idea how it was going to be used, and obviously at the time you would find all kinds of weird issues, because it wasn't trained to do any of that. But then we have this loop where, once we realize chain of thought is important, or agents are important, or tool use is important, what we see is that today's language models are heavily optimized towards those things. So in some sense they become more reliable and robust over those use cases, and you don't need to do as many prompt engineering tricks anymore to solve those things. I feel like in some sense prompt engineering is even a slightly negative word at this point, because it refers to all those weird tricks that you have to apply. But I think we don't have to do that anymore. Given today's progress, you should just be able to talk to it like a coworker. And if you're clear and concrete and being reasonable, then it should do reasonable things for you.
Swyx [00:29:51]: Yeah.
The way I put this is you should not be a prompt engineer, because it is the goal of the big labs to put you out of a job.
Shunyu [00:29:58]: You should just be a good communicator. Like if you're a good communicator to humans, you should be a good communicator to language
Swyx [00:30:02]: models.
Harrison [00:30:03]: That's the key, though, because oftentimes people aren't good communicators to these language models, and that is a very important skill, and that's still messing around with the prompt. So it depends what you're talking about when you're saying prompt engineer.
Shunyu [00:30:14]: But do you think it's very correlated with whether they're a good communicator to humans? You know, it's like...
Harrison [00:30:20]: It may be, but I also think, I would say, on average people are probably worse at communicating with language models than with humans right now, at least, because I think we're still figuring out how to do it. You kind of expect it to be magical, and there's probably some correlation, but I'd say people are just worse at it right now than talking to humans.
Shunyu [00:30:36]: We should make it like, you know, an elementary school class or whatever: how to
Swyx [00:30:41]: talk to language models. Yeah. I don't know. Very pro that. Yeah. Before we leave the topic of trees and searching, not specifically about Q*, but there are a lot of questions about MCTS and this combination of tree search and language models. And I just had to get in a question there about how seriously people should take this.
Shunyu [00:30:59]: Again, I think it depends on the tasks, right? MCTS was magical for Go, but it's probably not as magical for robotics, right? So I think right now the problem is not even that we don't have good methodologies, it's more that we don't have good tasks. It's also very interesting, right? Because if you look at my citations, obviously the most cited are ReAct, Reflexion, and Tree of Thought. Those are methodologies. But an equally important, if not more important, line of my work is benchmarks and environments, like WebShop or SWE-bench or whatever. And I think in general, what people do in academia that I think is not good is they choose a very simple task, like ALFWorld, and then they apply overly complex methods to show they improve 2%. I think you should probably match the level of complexity of your task and your method. I feel like tasks are kind of far behind the methods in some sense, right? Because we have some good test-time approaches, like ReAct or Reflexion or Tree of Thought, and there are many, many more complicated test-time methods afterwards. But on the benchmark side, we have made a lot of good progress this year and last year, but I think we still need more progress: better coding benchmarks, better web agent benchmarks, better agent benchmarks in general, not even just for web or code. In general, we need to catch up on tasks.
Harrison [00:32:27]: What are the biggest reasons in your mind why it lags behind?
Shunyu [00:32:31]: I think incentive is one big reason. If you look, you know, all the method papers are cited like a hundred times more than the task papers. And also, making a good benchmark is actually quite hard. It's almost like a different set of skills in some sense, right? I feel like if you want to build a good benchmark, you need to have a good product manager kind of mindset, right?
You need to think about why people should use your benchmark, why it's challenging, why it's useful. If you think about a PhD student coming into a program, the prior skills they're expected to have are more about, you know, can they code up a method, can they run experiments and solve that. Building a benchmark is not the typical prior skill that we have, but I think things are getting better. More and more people are starting to build benchmarks, and people are seeing that it's a way to get more impact in some sense, right? Because if you have a really good benchmark, a lot of people are going to use it. But if you have a super complicated test-time method, it's very hard for people to use it.
Harrison [00:33:35]: Are evaluation metrics also part of the reason? For some of these tasks that we might want to ask these agents or language models to do, is it hard to evaluate them, and so it's hard to get an automated benchmark? Obviously with SWE-bench you can, and with coding it's easier, but...
Shunyu [00:33:50]: I think that's part of the skillset thing that I mentioned, because I feel like it's like being a product manager: there are many dimensions and you need to strike a balance, and it's really hard, right? If you want to make it very easy to autograde, like automatically gradable, easy to grade or easy to evaluate, then you might lose some of the realness or practicality. Or it might be practical, but it might not be as scalable, right? For example, if you think about text games, humans have pre-annotated all the rewards and all the language is real. So it's pretty good on the autogradable dimension and the practical dimension, since the language is actual English, but it's not scalable, right? It takes like a year for experts to build that game. So it's not really that scalable. And I think part of the reason that SWE-bench is so popular now is that it kind of hits the balance between these three dimensions, right? Easy to evaluate, actually practical, and scalable. If I were to criticize some of my prior work, I think WebShop, which was my initial attempt to get into the benchmark world, was me trying to do a good job striking the balance. Obviously we made it autogradable and it's really scalable, but the practicality is not as high as just using actual GitHub issues, right? Because you're just creating synthetic tasks.
Harrison [00:35:13]: Are there other areas besides coding that jump to mind as being really good for being autogradable?
Shunyu [00:35:20]: Maybe mathematics.
Swyx [00:35:21]: Classic. Yeah. Do you have thoughts on AlphaProof, the new DeepMind paper? I think it's pretty cool.
Shunyu [00:35:29]: I think it's more of, you know, a confidence boost. Sometimes the work is not even about the technical details, or the methodology that it chooses, or the concrete results. I think it's more about a signal, right?
Swyx [00:35:47]: Yeah. Existence proof. Yeah.
Shunyu [00:35:50]: Yeah. It can be done. This direction is exciting. It kind of encourages people to work more towards that direction. I think it's more like a boost of confidence, I would say.
Swyx [00:35:59]: Yeah. So we're going to focus more on agents now, and, you know, all of us have a special interest in coding agents. I would consider Devin to be the biggest launch of the year as far as AI startups go.
And you guys in the Princeton group worked on SWE-agent alongside SWE-bench. Tell us the story about SWE-agent. Sure.
Shunyu [00:36:21]: It's actually kind of a trilogy, a series of three works now. The first work is called InterCode, but it's not as famous, I know. The second work is called SWE-bench, and the third work is called SWE-agent. And I was just really confused why nobody was working on coding. You know, it's like a year ago, and, I mean, not everybody has to work on coding, obviously, but a year ago literally nobody was working on coding. I was really confused. And the people that were working on coding were, you know, trying to solve HumanEval in like a seq-to-seq way. There's no agent, there's no chain of thought, there's no anything; they're just fine-tuning the model and improving a few points and whatever. I was really confused, because obviously coding is the best application for agents: it's autogradable, it's super important, and you can make everything an API or a code action, right? So I was confused, and I collaborated with some of the students at Princeton, and we have this work called InterCode. The idea is, first, if you care about coding, then you should solve coding in an interactive way, meaning more like a Jupyter Notebook kind of way than just writing a program, seeing if it fails or succeeds, and stopping, right? You should solve it in an interactive way because that's exactly how humans solve it. You don't write a program next token, next token, next token, stop, never do any edits, and never use a terminal or whatever tool. It doesn't make sense, right? But that's how people were solving coding at the time: basically sampling a program from a language model without chain of thought, without tool calls, without refactoring, without anything. So the first point is we should solve coding in a very interactive way, and that's a very general principle that applies to various coding benchmarks. And also, I think you can frame a lot of agent tasks as interactive coding. If you have Python and you can call any package, then you can literally also browse the internet, or do whatever you want, like control a robot or whatever. So that seems to be a very general paradigm. But obviously a bottleneck was that at the time we were still doing very simple tasks like HumanEval or whatever coding benchmarks people had proposed. They were super hard in 2021, like 20%, but they were like 95% solved already in 2023. So obviously the next step is we need a better benchmark. And Carlos and John, who are the first authors of SWE-bench, came up with this great idea that we should just scrape GitHub and solve whatever human engineers are solving. I think it's actually pretty easy to come up with the idea, and in the first week they already made a lot of progress. They scraped GitHub and got the initial version working, but then there's a lot of painful infra work and whatever, you know. I think the idea is super easy, but the engineering is super hard. And I feel like that's a very typical signal of a good work in the AI era now.
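To make the "interactive coding" framing concrete, here is a minimal sketch of that kind of act/observe/retry loop. This is not InterCode's actual harness: the agent() callable is an assumed LLM wrapper, the exec-based runner presumes a sandboxed environment, and the done-flag stopping rule is just an illustrative convention.

# Interactive coding instead of one-shot sampling: the model proposes a snippet,
# we execute it, and the observation (success or traceback) is fed back so the
# model can edit and retry, Jupyter-style.

import traceback

def run_snippet(code: str, namespace: dict) -> str:
    """Execute a snippet and report success or the full traceback."""
    try:
        exec(code, namespace)  # assumes a sandboxed or otherwise trusted environment
        return "OK"
    except Exception:
        return traceback.format_exc()

def interactive_solve(task: str, agent, max_turns: int = 10) -> bool:
    """agent(task, history) -> next code snippet; a hypothetical LLM-backed callable."""
    namespace: dict = {}
    history: list[tuple[str, str]] = []  # (code, observation) pairs, like notebook cells
    for _ in range(max_turns):
        code = agent(task, history)
        observation = run_snippet(code, namespace)
        history.append((code, observation))
        # Illustrative stopping convention: the agent sets done = True when satisfied.
        if observation == "OK" and namespace.get("done"):
            return True
    return False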
Swyx [00:39:17]: I think also the filtering was challenging, because if you look at open source PRs, a lot of them are just, you know, fixing typos. I think it's challenging.
Shunyu [00:39:27]: And to be honest, we didn't do a perfect job at the time. So if you look at the recent blog post with OpenAI, we improved the filtering so that it's more solvable.
Swyx [00:39:36]: I think OpenAI was just like, look, this is a thing now. We have to fix this. These students just rushed it.
Shunyu [00:39:45]: It's a good convergence of interests for me.
Alessio [00:39:48]: Was that tied to you joining OpenAI? Or was that just unrelated?
Shunyu [00:39:52]: It's a coincidence for me, but it's a good coincidence.
Swyx [00:39:55]: There is a history of: anytime a big lab adopts a benchmark, they fix it. Otherwise, it's a broken benchmark.
Shunyu [00:40:03]: So naturally, once we proposed SWE-bench, the next step is to solve it. But the typical way you solve something now is you collect some training samples, or you design some complicated agent method, and then you try to solve it: either a super complicated prompt, or a better model with more training data. But at the time, we realized that even before those things, there's a fundamental problem with the interface, or the tool, that you're supposed to use. That's an ignored problem in some sense: what your tool is, and how that matters for your task. So what we found concretely is that if you just use the text terminal off the shelf as a tool for those agents, there are a lot of problems. For example, if you edit something, there's no feedback, so you don't know whether your edit is good or not. That makes the agent very confused and it makes a lot of mistakes. There are a lot of small problems, you would say. Well, you can try to do prompt engineering and improve that, but it turns out to be actually very hard. We realized that interface design is actually a very omitted part of agent design. So we did this SWE-agent work. And the key idea is just, even before you talk about what the agent is, you should talk about what the environment is. You should make sure that the environment is actually friendly to whatever agent you're trying to apply. That's the same idea for humans. A text terminal is good for some tasks, like git pull or whatever, but it's not good if you want to look at a browser and whatever. Also, a browser is a good tool for some tasks, but not a good tool for other tasks. We need to talk about how to design the interface, in some sense, where we should treat agents as our customers. It's like when we treat humans as customers, we design human-computer interfaces. We design those beautiful desktops or browsers or whatever, so that it's very intuitive and easy for humans to use. And this whole great subject of HCI is all about that. I think now the research idea of SWE-agent is just: we should treat agents as our customers, and we should do, like, you know... ACI.
Swyx [00:42:16]: ACI, exactly.
Harrison [00:42:18]: So what are the tools that a SWE-agent should have, or a coding agent in general should have?
Shunyu [00:42:24]: For SWE-agent, it's like a modified text terminal, which kind of adapts to a lot of the patterns of language models to make it easier for language models to use. For example, for edits, instead of having no feedback, it will actually give feedback like: actually, here you introduced a syntax error, and you should probably want to fix that, and there's an indentation error there. And that makes it super easy for the model to actually do that. And there are other small things, like how exactly you write arguments, right? Like, do you want to write a multi-line edit, or do you want to write a single-line edit?
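A minimal sketch of that "edit with immediate feedback" idea, using Python's built-in compile() as a stand-in syntax check on a Python file. SWE-agent's real tool is richer (linting, windowed file views, and so on), so treat the function below as illustrative only.

# An agent-facing edit tool that, instead of failing silently, immediately
# reports problems introduced by the edit so the model can fix them.

from pathlib import Path

def edit_file(path: str, start: int, end: int, replacement: str) -> str:
    """Replace lines start..end (1-indexed, inclusive) and report problems to the agent."""
    lines = Path(path).read_text().splitlines(keepends=True)
    if not replacement.endswith("\n"):
        replacement += "\n"
    new_source = "".join(lines[: start - 1] + [replacement] + lines[end:])
    try:
        compile(new_source, path, "exec")  # cheap syntax check (Python files only)
    except SyntaxError as err:
        # The feedback is phrased as an instruction the model can act on directly.
        return (
            f"Edit NOT applied: it would introduce a syntax error at line {err.lineno}: "
            f"{err.msg}. Fix the replacement text and call edit_file again."
        )
    Path(path).write_text(new_source)
    return f"Edit applied: lines {start}-{end} of {path} were replaced."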
I think it's more interesting to think about the development process of an ACI rather than the actual ACI for a concrete application, because the general paradigm is very similar to HCI and psychology, right? Basically, when people develop HCIs, they do behavioral experiments on humans. They do A/B tests, right? Which interface is actually better? They run those behavioral experiments, kind of like psychology experiments, on humans, and they change things. And what's really interesting to me about the SWE-agent paper is that we can probably do the same thing for agents, right? We can do A/B tests for those agents and run behavioral tests. And through that process, we not only invent better interfaces for those agents, which is the practical value, but we also better understand agents. Just like when we do those A/B tests in HCI we better understand humans, doing those ACI experiments, we actually better understand agents. And that's pretty cool.
Harrison [00:43:51]: Besides that A/B testing, what are other processes that people can use to think about this in a good way?
Swyx [00:43:57]: That's a great question.
Shunyu [00:43:58]: And I think SWE-agent is an initial work, and what we do is kind of the naive approach, right? You just try some interface, you see what's going wrong, and then you try to fix that. We do this kind of iterative fixing. But what's really interesting is there will be a lot of promising future directions if we can apply some of the HCI principles more systematically to the interface design. I think that would be a very cool interdisciplinary research opportunity.
Harrison [00:44:26]: You talked a lot about agent-computer interfaces and interactions. What about human-to-agent UX patterns? Curious for any thoughts there that you might have.
Swyx [00:44:38]: That's a great question.
Shunyu [00:44:39]: In some sense, I feel like prompt engineering is about the human-to-agent interface. But I think there can be a lot of interesting research done about... So prompting is about how humans can better communicate with the agent. But there could be interesting research on how agents can better communicate with humans, right? When to ask questions, how to ask questions, what's the frequency of asking questions. And I think those kinds of things could be very cool research.
Harrison [00:45:07]: Yeah, I think some of the most interesting stuff that I saw here was also related to coding with Devin from Cognition. And they had the three or four different panels where you had the chat, the browser, the terminal, and I guess the code editor as well.
Swyx [00:45:19]: There's more now.
Harrison [00:45:19]: There's more. Okay, I'm not up to date. Yeah, I think they also did a good job on ACI.
Swyx [00:45:25]: I think that's the main learning I have from Devin. They cracked that. Actually, there was no foundational planning breakthrough. The planner is actually pretty simple, but it's the ACI that they broke through on.
Shunyu [00:45:35]: I think making the tool good and reliable is probably like 90% of the whole agent. Once the tool is actually good, then the agent design can be much, much simpler. On the other hand, if the tool is bad, then no matter how much you put into the agent design, planning or search or whatever, it's still going to be trash.
Harrison [00:45:53]: Yeah, I'd argue the same. Same with context and instructions.
Like, yeah, they go hand in hand.
Alessio [00:46:00]: On the tool, how do you think about the tension, for both of you (I mean, you're building a library, so even more for you), between making a language or a library that is easy for the agent to grasp and write versus one that is easy for the human to grasp and write? Because, you know, the trend is that more and more code gets written by the agent. So why wouldn't you optimize the framework to be as easy as possible for the model versus for the person?
Swyx [00:46:24]: I think it's possible to design an interface
Shunyu [00:46:25]: that's both friendly to humans and agents. But what do you think?
Harrison [00:46:29]: We haven't thought about it from that perspective; like, we're not explicitly trying to design LangChain or LangGraph to be friendly, I mean, to be friendly for agents to write.
Swyx [00:46:42]: But I mean, I think we see this with, like,
Harrison [00:46:43]: I saw some paper that used TypeScript notation instead of JSON notation for tool calling and it got a lot better performance. So it's definitely a thing. I haven't really heard of anyone designing a syntax or a language explicitly for agents, but there are clearly syntaxes that are better.
Shunyu [00:46:59]: I think function calling is a good example where it's a good interface for both human programmers and for agents, right? For developers, it's actually a very friendly interface because it's very concrete and you don't have to do prompt engineering anymore; you can be very systematic. And for models, it's also pretty good, right? It can use all the existing coding content. So I think we need more of those kinds of designs.
Swyx [00:47:21]: I will mostly agree, and I'll slightly disagree, on whether designing for humans also overlaps with designing for AI. So Malte Ubl, who's the CTO of Vercel, which is creating basically the JavaScript competitor to LangChain, observes that if the API is easy to understand for humans, it's actually much easier to understand for LLMs, for example, because the functions aren't overloaded. They don't behave differently under different contexts. They do one thing and they always work the same way. It's easy for humans, it's easy for LLMs. And that makes a lot of sense. And obviously adding types is another one: type annotations only help give extra context, which is really great. So that's the agreement. And then the disagreement is that when I use structured output to do my chain of thought, I have found that I change my field names to hint to the LLM what the field is supposed to do. So instead of saying topics, I'll say candidate topics. And that gives me a better result, because the LLM is like, ah, this is just a draft thing I can use for chain of thought. And instead of summaries, I'll say topic summaries, to link the previous field to the current field. So little stuff like that, where I find myself optimizing for the LLM in ways that I, as a human, would never do. Interesting.
Shunyu [00:48:32]: It's kind of like how the way you optimize the prompt might be different for humans and for machines. You can have a common ground that's clear for both humans and agents, but improving the human performance versus improving the agent performance might move in different directions.
Swyx [00:48:48]: Might move in different directions.
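A small illustration of the field-naming trick described above, sketched with Pydantic. The schema and names are made up for the example; candidate_topics and topic_summaries are the hints, not anything a library requires.

# Two schemas for the same structured output. The second renames fields to hint
# at their role: "candidate_topics" signals a draft, chain-of-thought-style list,
# and "topic_summaries" ties the second field back to the first.

from pydantic import BaseModel, Field

class NotesPlain(BaseModel):
    topics: list[str]
    summaries: list[str]

class NotesHinted(BaseModel):
    candidate_topics: list[str] = Field(
        description="Draft list of possible topics; treat this as working notes."
    )
    topic_summaries: list[str] = Field(
        description="One short summary per topic chosen from candidate_topics."
    )

# Either model can be handed to a structured-output or function-calling API;
# in practice the hinted names tend to steer the model's intermediate reasoning.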
There's a lot more use of metadata as well, like descriptions, comments, code comments, annotations, and stuff like that. Yeah.
Harrison [00:48:56]: I would argue that's just you communicating
Swyx [00:48:58]: to the agent what it should do.
Harrison [00:49:00]: And maybe you need to communicate a little bit more than to humans because models aren't quite good enough yet.
Swyx [00:49:06]: But like, I don't think that's crazy.
Harrison [00:49:07]: I don't think that's like...
Swyx [00:49:09]: It's not crazy. I will bring this in because it just happened to me yesterday. I was at the Cursor office. They held their first user meetup and I was telling them about the LLM OS concept and why basically every interface, every tool was being redesigned for AIs to use rather than humans. And they're like, why? Can we just use Bing and Google for LLM search? Why must I use Exa? Or what's the other one that you guys work with?
Harrison [00:49:32]: Tavily.
Swyx [00:49:33]: Tavily. A web search API dedicated to LLMs. What's the difference?
Shunyu [00:49:36]: Exactly. Compared to the Bing API.
Swyx [00:49:38]: Exactly.
Harrison [00:49:38]: There weren't great APIs for search. The best one, the one that we used initially in LangChain, was SerpAPI, which is, like, maybe illegal. I'm not sure.
Swyx [00:49:49]: And like, you know,
Harrison [00:49:52]: and now there are, like, venture-backed companies.
Swyx [00:49:53]: Shout out to DuckDuckGo, which is free.
Harrison [00:49:55]: Yes, yes.
Swyx [00:49:56]: Yeah.
Harrison [00:49:56]: I do think there are some differences, though. I think generally these APIs try to return small amounts of text information with clear, legible fields. It's not a massive JSON blob. And I think that matters. When you talk about designing tools, it's the interface in its entirety, not only the inputs but also the outputs, that really matters. And so I think they try to make the outputs...
Shunyu [00:50:18]: They're doing ACI.
Swyx [00:50:19]: Yeah, yeah, absolutely.
Harrison [00:50:20]: Really?
Swyx [00:50:21]: Like, there's a whole set of industries that are just being redone for ACI. It's weird. And so my simple answer to them was, like, the error messages. When you give error messages, they should basically be prompts for the LLM to take and then self-correct. Then your error messages get more verbose, actually, than you normally would with a human. Stuff like that. Honestly, it's not that big. Again, is this worth a venture-backed industry? Unless you can tell us. But, like, I think Code Interpreter, I think, is a new thing. I hope so.
Alessio [00:50:52]: We invested in it to be so.
Shunyu [00:50:53]: I think that's a very interesting point. If you're trying to optimize to the extreme, then obviously they're going to be different. For example, the error...
Swyx [00:51:00]: Because we take it very seriously. Right.
Shunyu [00:51:01]: The error for a language model, the longer the better. But for humans, that will make them very nervous and very tired, right? But I guess the point is more like, maybe we should try to find a co-optimized common ground as much as possible, and then if we have divergence, we should diverge. But it's more philosophical now.
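Here is a minimal sketch of the "error messages as prompts" point: a generic tool wrapper (not any particular library's API) whose failure output is written as an instruction the model can act on, deliberately more verbose than what you would show a human.

# Wrap a tool call so that failures come back as actionable self-correction
# prompts rather than terse human-style errors.

import json
from typing import Callable

def call_tool_for_llm(tool: Callable[..., object], **kwargs) -> str:
    try:
        result = tool(**kwargs)
        return json.dumps({"status": "ok", "result": result}, default=str)
    except TypeError as err:
        return (
            "Tool call failed because the arguments did not match the tool's signature: "
            f"{err}. Re-read the tool description, correct the argument names and types, "
            "and call the tool again with valid JSON arguments."
        )
    except Exception as err:
        return (
            f"Tool call raised {type(err).__name__}: {err}. "
            "If the input looks malformed, fix it and retry; if the resource is "
            "unavailable, try an alternative tool or report the failure to the user."
        )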
Alessio [00:51:19]: But I think part of it is how you use it. Google invented PageRank because ideally you only click on one link; the top three should have the answer. But with models, it's like, well, you can get 20. So those searches are more like semantic grouping in a way. It's like, for this query, I'll return you 20, 30 things that are kind of good, you know? So it's less about ranking and more about grouping.
Shunyu [00:51:42]: Another fundamental thing about HCI is the difference between humans' and machines' memory limits, right? What's really interesting about this concept of HCI versus ACI is that, through the interfaces optimized for each, you can understand some of the fundamental differences between humans and machines. Why, if you look at find or whatever terminal command, can you only look at one thing at a time? That's because we have a very small working memory. You can only deal with one thing at a time; you can only look at one paragraph of text at the same time. So the interface for us is, by design, a small piece of information but more temporal steps. For machines, it should be the opposite, right? You should just give them a hundred different results and let them decide in context what's the most relevant stuff, trading off context for temporal steps. That's actually also better for language models, because the cost is smaller or whatever. So it's interesting to connect those interfaces to those fundamental differences.
Harrison [00:52:43]: When you said earlier, you know, we should try to design these to be as similar as possible and diverge if we need to...
Swyx [00:52:49]: I actually don't have a problem with them diverging now
Harrison [00:52:51]: and seeing venture-backed startups emerging now, because we are different from machines and AI. And it's just so early on; they may still look kind of similar and there may still be only small differences, but it's still just so early. And I think we'll only discover more ways that they differ. And so I'm totally fine with them kind of diverging early
Swyx [00:53:10]: and optimizing for the...
Harrison [00:53:11]: I agree. I think it's more like, you know,
Shunyu [00:53:14]: we should obviously try to optimize the human interface just for humans. We've already been doing that for 50 years. We should optimize the agent interface just for agents, but we might also try to co-optimize both and see how far we can get. There are enough people to try all three directions. Yeah.
Swyx [00:53:31]: There's a thesis I sometimes push, which is the sour lesson as opposed to the bitter lesson: we're always inspired by human development, but actually AI develops along its own path.
Shunyu [00:53:40]: Right. We need to understand better, you know, what the fundamental differences between those creatures are.
Swyx [00:53:45]: It's funny, really early on this pod, you were like, how much grounding do you have in cognitive development and human brain stuff? And I'm like
Serious Sellers Podcast auf Deutsch: Learn to Sell Successfully on Amazon
Welcome to a new episode in which we take a close look at artificial intelligence in business development. Our guests are Alexander and Michael, the founders of AlphaTrail, who share fascinating insights into how they used AI to successfully position their company on Amazon. They draw on their experience in sales, finance, HR, business development, and marketing, and recount their journey from starting out in 2017 developing brake pads to expanding into the USA. In a further discussion, we look at the versatility of artificial intelligence and how it has changed our day-to-day work. AI tools such as Perplexity AI and ChatGPT have proven to be valuable helpers that outperform conventional search engines. We give practical examples from business development, show how concepts and webshops can be optimized with the help of AI, and emphasize the importance of using these technologies continuously to get better results. Finally, we look at the efficient use of AI in data analysis and advertising. Alexander and Michael explain how they analyze Trustpilot reviews with ChatGPT to streamline their workflows. We also talk about the creative use of AI in advertising, for example generating images with Midjourney to stand out from the competition. To wrap up, we share valuable tips on successful networking and invite our listeners to connect with us and benefit from each other's experiences. Enjoy the episode! In episode 145 of the Serious Sellers Podcast auf Deutsch, Marcus, Alexander, and Michael discuss: 00:00 - Artificial Intelligence in Business Development 12:33 - The Versatility of Artificial Intelligence 19:53 - More Efficient Analysis with AI Support 25:18 - Efficient Data Analysis in E-Commerce 34:19 - Using AI in Advertising 46:23 - Tips for Successful Networking