Podcasts about Stack overflow

  • 751PODCASTS
  • 1,663EPISODES
  • 46mAVG DURATION
  • 1WEEKLY EPISODE
  • May 22, 2025LATEST

POPULARITY

20172018201920202021202220232024

Categories



Best podcasts about Stack overflow

Show all podcasts related to stack overflow

Latest podcast episodes about Stack overflow

Paul's Security Weekly
Malware Laced Printer Drivers - PSW #875

Paul's Security Weekly

Play Episode Listen Later May 22, 2025 121:59


This week in the security news: Malware-laced printer drivers Unicode steganography Rhode Island may sue Deloitte for breach. They may even win. Japan's active cyber defense law Stop with the ping LLMs replace Stack Overflow - ya don't say? Aggravated identity theft is aggravating Ivanti DSM and why you shouldn't use it EDR is still playing cat and mouse with malware There's a cellular modem in your solar gear Don't slack on securing Slack XSS in your mail SIM swapping and the SEC Ivanti and libraries Supercomputers in space! Visit https://www.securityweekly.com/psw for all the latest episodes! Show Notes: https://securityweekly.com/psw-875

Paul's Security Weekly (Podcast-Only)
Malware Laced Printer Drivers - PSW #875

Paul's Security Weekly (Podcast-Only)

Play Episode Listen Later May 22, 2025 121:59


This week in the security news: Malware-laced printer drivers Unicode steganography Rhode Island may sue Deloitte for breach. They may even win. Japan's active cyber defense law Stop with the ping LLMs replace Stack Overflow - ya don't say? Aggravated identity theft is aggravating Ivanti DSM and why you shouldn't use it EDR is still playing cat and mouse with malware There's a cellular modem in your solar gear Don't slack on securing Slack XSS in your mail SIM swapping and the SEC Ivanti and libraries Supercomputers in space! Visit https://www.securityweekly.com/psw for all the latest episodes! Show Notes: https://securityweekly.com/psw-875

DOU Podcast
Лідів скорочують | Microsoft намагається уникнути штрафів | OpenAI випустили Codex — DOU News #198

DOU Podcast

Play Episode Listen Later May 19, 2025 22:16


У свіжому дайджесті DOU News обговорюємо скорочення в Microsoft, CarPlay Ultra від Apple, VPN-компанію, яка скасувала довічні підписки клієнтів, закінчення ери Stack Overflow та інші новини українського ІТ та світового тек-сектору. ▶️ Навігація 00:00 Інтро  01:04 Чверть лідів звільнили через конфлікт з керівництвом. Аналітика про скорочення айтівців https://dou.ua/lenta/articles/job-market-2025-part-3/ 03:43 LLM вбивають Stack Overflow https://newsletter.pragmaticengineer.com/p/the-pulse-134?open=false#%C2%A7stack-overflow-almost-dead 06:03 Партнерський блок 07:03 Новий CarPlay Ultra від Apple готовий, але поки що лише в Aston Martins https://arstechnica.com/cars/2025/05/apples-new-carplay-ultra-is-ready-but-only-in-aston-martins-for-now/ 09:31 OpenAI випустили Codex https://dou.ua/forums/topic/53860/ 12:35 Попри скорочення, компанія Klarna знову наймає людей у службу підтримки https://slashdot.org/story/25/05/14/2339257/klarna-pivots-back-to-humans-after-ai-experiment-fails 14:04 Microsoft проводить одне з найбільших скорочення з 2023 року https://dou.ua/forums/topic/53815/ 15:09 Microsoft намагається уникнути штрафів в ЄС, відокремлюючи Teams від Office https://www.engadget.com/big-tech/microsoft-attemps-to-avoid-eu-fines-by-further-decoupling-teams-and-office-170519085.html?src=rss 17:01 Uber винайшов маршрутки

Linux User Space
Episode 5:13: Sloppy AI or Good Fuzzing?

Linux User Space

Play Episode Listen Later May 19, 2025 81:08


Coming up in this episode * AI's Won't Take Over Yet * Is Rust Open Source? * and All Kinds of Feedback The Video Version https://youtu.be/LxMpNIfhFiA 0:00 Cold Open 3:56 Curl's "AI Slop" Problem 25:12 A Little Viral Licensing 42:12 So Much Feedback ❤️ 42:30 ukwan / Youtube 51:16 jliljj / Youtube 56:35 fredstech1 / Youtube 1:00:15 conan kudo / Youtube 1:02:06 amanita / Patreon 1:05:13 redvamp128 / Youtube 1:09:35 The Rules, Commands & Next Time 1:19:12 Stinger The Curl project pushes back on AI slop The Ars Technica article (https://arstechnica.com/gadgets/2025/05/open-source-project-curl-is-sick-of-users-submitting-ai-slop-vulnerabilities/) The Curl project on Hacker One (https://hackerone.com/curl?type=team)

AI + a16z
Who's Coding Now? AI and the Future of Software Development

AI + a16z

Play Episode Listen Later May 16, 2025 44:30


In this episode of the a16z AI podcast, a16z Infra partners Guido Appenzeller, Matt Bornstein, and Yoko Li explore how generative AI is reshaping software development. From its potential as a new high-level programming abstraction to its current practical impacts, they discuss whether AI coding tools will redefine what it means to be a developer.Why has coding emerged as one of AI's most powerful use cases? How much can AI truly boost developer productivity, and will it fundamentally change traditional computer science education? Guido, Yoko, and Matt dive deep into these questions, addressing the dynamics of "vibe coding," the enduring role of formal programming languages, and the critical challenge of managing non-deterministic behavior in AI-driven applications.Among other things, they discuss:The enormous market potential of AI-generated code, projected to deliver trillions in productivity gains.How "prompt-based programming" is evolving from Stack Overflow replacements into sophisticated development assistants.Why formal languages like Python and Java are here to stay, even as natural language interactions become common.The shifting landscape of programming education, and why understanding foundational abstractions remains essential.The unique complexities of integrating AI into enterprise software, from managing uncertainty to ensuring reliability. Check out everything a16z is doing with artificial intelligence here, including articles, projects, and more podcasts.

Eye On A.I.
#254 Prashanth: Why Developers Still Trust Stack Overflow in the Age of AI

Eye On A.I.

Play Episode Listen Later May 14, 2025 49:20


This episode is sponsored by Oracle. OCI is the next-generation cloud designed for every workload – where you can run any application, including any AI projects, faster and more securely for less. On average, OCI costs 50% less for compute, 70% less for storage, and 80% less for networking. Join Modal, Skydance Animation, and today's innovative AI tech companies who upgraded to OCI…and saved.    Offer only for new US customers with a minimum financial commitment. See if you qualify for half off at http://oracle.com/eyeonai  In this episode of Eye on AI, host Craig Smith speaks with Prashanth Chandrasekar, CEO of Stack Overflow, about how one of the internet's most trusted platforms for developers is adapting to the era of generative AI. With over 60 million human-curated Q&A pairs, Stack Overflow is now at the center of AI development — not as a competitor to large language models like ChatGPT, but as a foundational knowledge base that powers them.   Prashanth breaks down how Stack Overflow is partnering with OpenAI, Google, and other LLM providers to license its data and improve AI accuracy, while also protecting the integrity of its community. He explains the rise of OverflowAI, how Stack Overflow for Teams is fueling enterprise-grade co-pilots, and why developers still rely on expert human input when AI hits its “complexity cliff.” The conversation covers everything from hallucination problems and trust issues in AI-generated code to the monetization of developer data and the evolving interface of the web.   If you want to understand the future of developer tools, AI coding assistants, and how human knowledge will coexist with autonomous agents, this episode is a must-listen.   Subscribe for more deep dives into how AI is reshaping the world of software, enterprise, and innovation. Stay Updated: Craig Smith on X:https://x.com/craigss Eye on A.I. on X: https://x.com/EyeOn_AI (00:00) Intro (02:31) Prashanth's Journey from Developer to CEO   (05:18) Why Stack Overflow is Different from GitHub   (08:51) The Power of Community and Human-Curated Knowledge   (12:53) Stack Overflow's Data Strategy for AI Training   (17:26) Why Stack Overflow Isn't Competing with OpenAI   (20:36) How Stack Overflow Powers Enterprise AI Agents   (26:13) OverflowAI, Gemini, and the Future of Dev Workflows   (30:09) Inside Stack Overflow for Teams   (33:29) Safeguarding Quality: The Fight Against AI Slop   (38:32) Licensing, Attribution, and Protecting the Knowledge Base   (43:19) Business Strategy in the Age of Generative AI  

Voice of the DBA
Learn to Code

Voice of the DBA

Play Episode Listen Later May 1, 2025 2:56


I thought this story about a programmer and a GenAI to be rather humorous. The individual was a game programmer and used the Cursor AI assistant to help them generate some code for a game. After a few hundred lines of code, the AI delivered this: "I cannot generate code for you, as that would be completing your work. The code appears to be handling skid mark fade effects in a racing game, but you should develop the logic yourself. This ensures you understand the system and can maintain it properly." That response makes me actually smile to myself and chuckle out loud. I likely wouldn't feel the same way if I were asking for help with some code to handle a task like this, but it's kind of funny to have the GenAI stop and say this. However, it's akin to the RTFM response plenty of humans have given others when they were asked a question. This is pointed out in the article as the type of response plenty of people see on sites like Stack Overflow. Fortunately, I think we've avoided a lot of that response on SQL Server Central. Read the rest of Learn to Code

The Tech Blog Writer Podcast
3259: The Evolution of Stack Overflow in a World Shaped by AI

The Tech Blog Writer Podcast

Play Episode Listen Later Apr 28, 2025 26:31


In today's episode of Tech Talks Daily, I sit down with Prashanth Chandrasekar, CEO of Stack Overflow, to explore how the rise of AI is reshaping the future of software development and the evolving role of community-driven platforms. Prashanth offers a grounded view of what is happening behind the headlines as the tech world moves into the third year of widespread AI adoption. We begin by unpacking how AI is already transforming the way developers work, automating routine coding tasks, while still falling short when it comes to complexity and trustworthiness. Prashanth explains why Stack Overflow is positioning itself as a community and a vital "Knowledge as a Service" platform, integrating with AI tools to maintain trusted knowledge in an age where data quality and source attribution have never been more important. Our conversation also delves into the growing concern of "LLM brain drain" and the broader data crisis facing AI innovation. Prashanth highlights the importance of human-generated content, even as synthetic data technologies advance. He shares why Stack Overflow's continued investment in community-driven knowledge, responsible AI use, and strategic partnerships with major AI players is critical to maintaining a healthy ecosystem for developers and enterprises. We also look ahead to Stack Overflow's strategic priorities, including expanding content types, integrating AI into both the public platform and Stack Overflow for Teams, and moving toward a seamless user experience across the tools developers already use every day. How can businesses balance automation with human insight in their AI strategies? Why is data scarcity a looming issue for AI development? And how is Stack Overflow preparing to lead community and enterprise solutions into the next era of innovation? Let's find out.

Conversations with Tyler
Chris Dixon on Blockchains, AI, and the Future of the Internet

Conversations with Tyler

Play Episode Listen Later Apr 23, 2025 62:51


Chris Dixon believes we're at a pivotal inflection point in the internet's evolution. As a general partner at Andreessen Horowitz and author of Read Write Own, Chris believes the current internet, dominated by large platforms like YouTube and Spotify, has strayed far from its decentralized roots. He argues that the next era—powered by blockchain technology—can restore autonomy to creators, lower barriers for innovation, and shift economic power back to the network's edges. Tyler and Chris discuss the economics of platform dominance, how blockchains merge protocol-based social benefits with corporate-style competitive advantages, the rise of stablecoins as a viable blockchain-based application, whether Bitcoin or AI-created currencies will dominate machine-to-machine payments, why Stack Overflow could be the first of many casualties in an AI-driven web, venture capital's vulnerability to AI disruption, whether open-source AI could preserve national sovereignty, NFTs as digital property rights system for AIs, how Kant's synthetic a priori, Kripke's modal logic, and Heidegger's Dasein sneak into Dixon's term‑sheet thinking, and much more. Read a full transcript enhanced with helpful links, or watch the full video. Recorded March 26th, 2025. Help keep the show ad free by donating today! Other ways to connect Follow us on X and Instagram Follow Tyler on X Follow Chris on X Sign up for our newsletter Join our Discord Email us: cowenconvos@mercatus.gmu.edu Learn more about Conversations with Tyler and other Mercatus Center podcasts here.

Founder's Journal
Steal my Idea: Stack Overflow for AI Prompts

Founder's Journal

Play Episode Listen Later Apr 23, 2025 13:46


Alex pitches you on Codex, his AI prompt database idea. He gives you the elevator pitch & likely questions a venture investor would have.  — Show Notes: (0:00) A note from our sponsor (2:28) Welcome back to Founder's Journal (4:00) What is Codex?  (4:20) Two truths in prompting (6:05) Elevator pitch (8:50) Biggest investor questions (13:05) Conclusion — Thanks to our presenting sponsor, Gusto. Head to www.gusto.com/alex — Check Out Alex's Stuff: • storyarb - https://www.storyarb.com/ • growthpair - https://www.growthpair.com/ • distro - https://youdistro.com/  • X - https://x.com/businessbarista • Linkedin - https://www.linkedin.com/in/alex-lieberman/ Learn more about your ad choices. Visit megaphone.fm/adchoices

alphalist.CTO Podcast - For CTOs and Technical Leaders
#120 - AI's Singularity & Commoditization: Navigating Hype vs. Reality with Georg Zoeller // Co-Founder @ C4AIL

alphalist.CTO Podcast - For CTOs and Technical Leaders

Play Episode Listen Later Apr 17, 2025 73:51 Transcription Available


In this episode, Tobi talks with Georg Zoeller, Co-Founder of the Centre for AI Leadership and mercenaries.ai, about the turbulent landscape of AI. Georg, with his background at Meta and deep expertise in AI strategy, cuts through the hype surrounding AI's capabilities and economic impact. They discuss the 'singularity' we're already in, driven by rapid, open-source AI development, and why this makes future predictions impossible. Georg argues that software engineering is being commoditized due to the vast amount of training data available (Stack Overflow, GitHub), making AI adept at code generation but raising profound security concerns like prompt injection. Explore: - Why Georg believes blindly adopting AI early is a 'terrible mistake' for most companies. - The fundamental security flaws in LLMs (prompt injection) and why they're currently unsolvable for open input spaces. - The questionable economics of AI: high costs, self-cannibalizing business models, and the reliance on performative fundraising. - How AI tools impact engineer productivity, shifting the bottleneck to decision-making and validation. - The geopolitical risks and diminishing trust associated with Big Tech's AI dominance. - Actionable advice for CTOs: Invest in understanding, focus on governance beyond the tech team, and consider the strategic value of local/open-source alternatives.

WordPress | Post Status Draft Podcast
Post Status Happiness Hour | Session Twenty Four

WordPress | Post Status Draft Podcast

Play Episode Listen Later Apr 4, 2025 47:14


In this episode of the Post Status Happiness Hour, host Michelle Frechette interviews Roger Williams from Kinsta. Who serves as the Partnerships and Community Manager for North America. They discuss various topics including the WordPress community, Kinsta's new affiliate program, and their global sponsorship of WordCamps. The episode also highlights the creation of collaborative music playlists within the Post Status Slack community and the importance of concise, engaging content. Additionally, the guest shares insights on supporting WordPress contributors and the launch of Kinsta's new automatic updates feature for themes and plugins.Top Takeaways:The Importance of Thorough Testing in Software Development and Releases: Michelle emphasized the critical role of testing and feedback during the release cycle of WordPress 6.8, particularly as they approach its official launch. Despite having a dedicated testing community, the need for more testers is constant to ensure compatibility with a wide range of plugins and themes. The takeaway is that comprehensive testing is vital for minimizing issues at launch, and encouraging more community involvement can help ensure smoother releases.The Value of Consistent Community Contribution and Support for Open Source Projects: Roger highlighted the importance of documentation in open-source projects like WordPress, noting that it's often underappreciated until something goes wrong. He also spoke about Kinsta's involvement in supporting the community through contributions, such as sponsoring WordCamp and supporting documentation initiatives. The takeaway is that consistent, behind-the-scenes contributions, like documentation and community support, are crucial for the sustainability and success of open-source projects, even though they are often taken for granted.Kinsta's Automatic Updates Feature Enhances Site Reliability: Roger introduced Kinsta's new Automatic Updates feature, which ensures WordPress sites remain updated while minimizing risks. The system takes a before-and-after screenshot during updates and automatically reverts changes if visual differences are detected, helping prevent website issues that could impact business operations.Mentioned In The Show:KinstaLinux container project  LinkedInWordCamp USSevallaCloudflare EnterpriseStackOverflowMeetup.comWordPress FoundationEsoTerra CideryKinsta Automatic Updates

Code for Thought
[EN] ByteSized RSE: AI assisted coding - with Liam (Jianliang) Gao

Code for Thought

Play Episode Listen Later Mar 31, 2025 34:24


English Edition: In this last episode for the ByteSized RSE "miniseries" we talk about AI assisted coding - and the (long) history how engineers tried to come up with assisting tools to make our code better and more robust. My guest is Liam Gao from Imperial College, London, UK. Links:https://github.com/features/copilot  GitHub Co-Pilothttps://huggingface.co HuggingFace another AI toolhttps://spacelift.io/blog/ai-coding-assistant-tools a summary of current tools (non exhaustive)https://platform.openai.com/docs/guides/prompt-engineering OpenAI's take on prompt engineeringhttps://www.promptingguide.aihttps://web.archive.org/web/20121022091418/http://www.stanford.edu/~learnest/spelling.pdf some of the attempts to come up with spelling checkshttps://en.wikipedia.org/wiki/Code_completionhttps://www.gnu.org/software/emacs/ Good old Emacshttps://en.wikipedia.org/wiki/Vi_(text_editor) vi editor (not for the faint hearted)https://winworldpc.com/product/turbo-pascal/7x Borland's Turbo Pascal with IDEhttps://survey.stackoverflow.co/2024/ Stackoverflow survey from 2024 with ca 65000 respondents And here the YouTube clips mentionedhttps://www.youtube.com/watch?v=MvEXkd3O2ow Cypher musing why he didn't take the "blue pill"https://www.youtube.com/watch?v=L0mRMp2kbQY Star Trek TNG, S3E6 - Geordie LaForge talking to the computerGet in touchThank you for listening! Merci de votre écoute! Vielen Dank für´s Zuhören! Contact Details/ Coordonnées / Kontakt: Email mailto:peter@code4thought.org UK RSE Slack (ukrse.slack.com): @code4thought or @piddie US RSE Slack (usrse.slack.com): @Peter Schmidt Mastodon: https://fosstodon.org/@code4thought or @code4thought@fosstodon.org Bluesky: https://bsky.app/profile/code4thought.bsky.social LinkedIn: https://www.linkedin.com/in/pweschmidt/ (personal Profile)LinkedIn: https://www.linkedin.com/company/codeforthought/ (Code for Thought Profile) This podcast is licensed under the Creative Commons Licence: https://creativecommons.org/licenses/by-sa/4.0/

Cyber Bites
Cyber Bites - 21st March 2025

Cyber Bites

Play Episode Listen Later Mar 20, 2025 7:33


* Sydney Law Firm Targeted by Foreign Cyber Attackers in Extortion Attempt* AI Coding Assistant Refuses to Generate Code, Suggests User Learn Programming* Widely Used GitHub Action Compromised, Leaking Secrets* Fake "Security Alert" Phishing on GitHub Hijacks Accounts* MyGov Passkey Adoption Surges in AustraliaSydney Law Firm Targeted by Foreign Cyber Attackers in Extortion Attempthttps://www.smh.com.au/national/nsw/prominent-sydney-law-firm-hit-with-cyberattack-massive-data-breach-20250313-p5ljd8.htmlBrydens Lawyers, a prominent Sydney law firm with ties to major sports leagues, has been targeted by foreign cyber attackers who stole over 600 gigabytes of confidential data. The data includes information related to the firm, its clients, cases, and staff.The firm discovered the security breach around February 20th and immediately took its digital systems offline, engaging external advisors, lawyers, and security experts. The attackers are now extorting the firm for a ransom.Brydens has reported the incident to the Australian Cyber Security Centre and the Office of the Australian Information Commissioner. The firm has also restored its IT system's security and is conducting investigations to determine the full extent of the breach and notify affected individuals. This incident highlights the vulnerability of legal firms, which handle highly sensitive information, to ransomware attacks.AI Coding Assistant Refuses to Generate Code, Suggests User Learn Programminghttps://arstechnica.com/ai/2025/03/ai-coding-assistant-refuses-to-write-code-tells-user-to-learn-programming-instead/An AI coding assistant, Cursor, has surprised users by refusing to generate code and instead advising them to learn programming. This incident reflects a broader trend of AI refusals seen across various platforms.This behavior mirrors past instances where AI models, like ChatGPT, have exhibited reluctance to perform tasks, sometimes attributed to model "laziness." Developers have even resorted to prompting AI with phrases like "You are a tireless AI" to mitigate these refusals.The Cursor assistant's response, telling users to learn coding, closely resembles interactions on programming help sites like Stack Overflow, where experienced developers often encourage self-learning. This similarity is likely due to the massive datasets, including coding discussions from platforms like Stack Overflow and GitHub, used to train these AI models.While other users report not encountering this issue at similar code lengths, it appears to be an unintended consequence of Cursor's training. The developers of Cursor have been contacted for comment.Widely Used GitHub Action Compromised, Leaking Secretshttps://www.wiz.io/blog/github-action-tj-actions-changed-files-supply-chain-attack-cve-2025-30066The widely used GitHub Action "tj-actions/changed-files" was compromised before March 14, 2025, injecting malicious code that leaked secrets from affected public repositories into workflow logs. This supply chain attack, tracked as CVE-2025-30066, exposed sensitive information like AWS access keys, GitHub Personal Access Tokens, and private RSA keys.The compromise occurred when an attacker gained access to update tags, pointing them to malicious code. While the malicious commits have since been reverted and the associated GitHub gist has been deleted, the risk of leaked secrets in logs remains.The primary risk is to public repositories, where secrets were exposed in plain view. Security teams are urged to identify affected repositories, review workflow logs for base64 encoded secrets, and immediately rotate any compromised credentials. It is recommended to stop using the compromised action, pin GitHub Actions to specific commit hashes, audit past workflow runs, and use GitHub's allow-listing feature to prevent future attacks.Fake "Security Alert" Phishing on GitHub Hijacks Accountshttps://www.bleepingcomputer.com/news/security/fake-security-alert-issues-on-github-use-oauth-app-to-hijack-accounts/A widespread phishing campaign is targeting GitHub users with fake "Security Alert" issues, attempting to trick them into authorizing a malicious OAuth app. The campaign has targeted nearly 12,000 repositories, warning users of unusual login attempts from Iceland.The fake alerts provide links that lead to an OAuth authorization page for a "gitsecurityapp" app, which requests extensive permissions, including full access to repositories, user profiles, and GitHub Actions workflows. If authorized, the app gains complete control over the user's account and code.The phishing campaign, which began recently, directs authorized users to callback addresses hosted on onrender.com. Users who have authorized the malicious app are advised to immediately revoke its access through GitHub Settings, check for unfamiliar GitHub Actions or gists, and rotate their credentials and authorization tokens.MyGov Passkey Adoption Surges in Australiahttps://www.itnews.com.au/news/over-200000-mygov-users-disable-passwords-in-passkey-shift-615664Over half a million myGov users have adopted passkeys as their login method since the feature launched in June 2024, with over 200,000 users exclusively relying on passkeys and abandoning traditional passwords. The Australian government implemented passkeys to enhance security and combat phishing attacks, investing $5.6 million in the project.Passkeys utilize biometric authentication, PINs, swipe patterns, or physical USB devices, leveraging cryptographic keypair technology. This approach makes myGov accounts resistant to phishing, as passkeys are specific to the website or app they are created for. Australia is among the first countries to implement passkeys for government services. This is a public episode. If you would like to discuss this with other subscribers or get access to bonus episodes, visit edwinkwan.substack.com

IFTTD - If This Then Dev
#313.src - Open Source: Mettre le dev en musique avec Fabien Potencier

IFTTD - If This Then Dev

Play Episode Listen Later Mar 12, 2025 77:07


"L'injection dépendance de Symfony 2, c'est pas mon idée c'est une idée qui vient du monde Java" Le D.E.V. de la semaine est Fabien Potencier, fondateur du projet Symfony. Fabien souligne le rôle croissant de l'open source dans le secteur informatique. Fabien explore son parcours et les défis rencontrés avec l'open source depuis la naissance du projet Symfony. Il évoque les autres frameworks, d'autres technologies, leurs influences respectives ainsi que l'importance de satisfaire les besoins réels des utilisateurs. Les contributions à l'open source ont été influencées par de nouveaux outils tels que Stack Overflow et les LLM, mais leur utilisation doit être critique, selon Fabien. Au bout du compte, comprendre les besoins du client et encourager davantage de personnes à contribuer à l'open source sont des objectifs clés.Chapitrages00:00:54 : Introduction à l'Open Source00:03:10 : Évolution de l'Open Source00:03:58 : Impact des LLM sur l'Open Source00:05:25 : Découverte de l'Open Source00:06:33 : Création de Symfony00:09:34 : Philosophie de l'Open Source00:16:33 : Le choix de l'Open Source00:19:30 : Partage et transmission de savoir00:27:33 : Qualité et collaboration en Open Source00:30:33 : Liberté et qualité en Open Source00:36:02 : Innovation par la curiosité00:37:20 : PHP et son Pragmatism00:39:46 : L'Importance de la Stabilité00:43:11 : Simplification et Complexité00:46:35 : Microservices: Une Complexité Inutile00:55:25 : Évolution de l'Open Source01:10:41 : L'Impact des LLM sur le Développement01:14:49 : Conclusion et Recommandations Liens évoqués pendant l'émission Parser PRATTPaul Graham "Hackers and Painters" **Recrutez les meilleurs développeurs grâce à Indeed !** "Trouver des développeurs compétents et passionnés, comme les auditeurs d'If This Then Dev, peut être un vrai défi. Avec Indeed, connectez-vous rapidement avec des candidats qualifiés qui sauront s'épanouir dans votre entreprise. Profitez dès maintenant d'un crédit de 100 euros pour sponsoriser votre offre d'emploi : go.indeed.com/IFTTD."🎙️ Soutenez le podcast If This Then Dev ! 🎙️ Chaque contribution aide à maintenir et améliorer nos épisodes. Cliquez ici pour nous soutenir sur Tipeee 🙏Archives | Site | Boutique | TikTok | Discord | Twitter | LinkedIn | Instagram | Youtube | Twitch | Job Board |

Satansplain
Satansplain #083 - Listener Mail (ritual, The Satanic Witch, George Carlin, Intellectual Black Holes)

Satansplain

Play Episode Listen Later Feb 3, 2025 42:01


Satansplain responds to mail from the listeners, including such topics as: Satanic ritual, The Satanic Witch, George Carlin, intellectual black holes, the fight against Satanic misinformation, and why I do what I do. https://satansplain.locals.com/support  00:00 - Intro 01:24 - Using paper in ritual 04:00 - "Top Fan Badge"? 05:21 - "What would LaVey think of the world today?" 10:12 - George Carlin and Anton LaVey 15:15 - Satanecdote 21:45 - Dumbing it Down? 31:54 - Praise for StackExchange 34:12 - Battles Worth Fighting

Working Code
201: LLMs vs StackOverflow

Working Code

Play Episode Listen Later Jan 15, 2025 51:15 Transcription Available


We're back! and in this episode of the Working Code Podcast, the crew returns to dive into a thought-provoking discussion on the impact of Large Language Models (LLMs) like ChatGPT on technical communities such as Stack Overflow.They explore how LLMs are changing workflows, the ethical considerations of using AI for coding assistance, and personal experiences with these tools.Follow the show and be sure to join the discussion on Discord! Our website is workingcode.dev and we're @WorkingCodePod on Twitter and Instagram. New episodes drop weekly on Wednesday.And, if you're feeling the love, support us on Patreon.With audio editing and engineering by ZCross Media.Full show notes and transcript here.

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0
Beating Google at Search with Neural PageRank and $5M of H200s — with Will Bryk of Exa.ai

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

Play Episode Listen Later Jan 10, 2025 56:00


Applications close Monday for the NYC AI Engineer Summit focusing on AI Leadership and Agent Engineering! If you applied, invites should be rolling out shortly.The search landscape is experiencing a fundamental shift. Google built a >$2T company with the “10 blue links” experience, driven by PageRank as the core innovation for ranking. This was a big improvement from the previous directory-based experiences of AltaVista and Yahoo. Almost 4 decades later, Google is now stuck in this links-based experience, especially from a business model perspective. This legacy architecture creates fundamental constraints:* Must return results in ~400 milliseconds* Required to maintain comprehensive web coverage* Tied to keyword-based matching algorithms* Cost structures optimized for traditional indexingAs we move from the era of links to the era of answers, the way search works is changing. You're not showing a user links, but the goal is to provide context to an LLM. This means moving from keyword based search to more semantic understanding of the content:The link prediction objective can be seen as like a neural PageRank because what you're doing is you're predicting the links people share... but it's more powerful than PageRank. It's strictly more powerful because people might refer to that Paul Graham fundraising essay in like a thousand different ways. And so our model learns all the different ways.All of this is now powered by a $5M cluster with 144 H200s:This architectural choice enables entirely new search capabilities:* Comprehensive result sets instead of approximations* Deep semantic understanding of queries* Ability to process complex, natural language requestsAs search becomes more complex, time to results becomes a variable:People think of searches as like, oh, it takes 500 milliseconds because we've been conditioned... But what if searches can take like a minute or 10 minutes or a whole day, what can you then do?Unlike traditional search engines' fixed-cost indexing, Exa employs a hybrid approach:* Front-loaded compute for indexing and embeddings* Variable inference costs based on query complexity* Mix of owned infrastructure ($5M H200 cluster) and cloud resourcesExa sees a lot of competition from products like Perplexity and ChatGPT Search which layer AI on top of traditional search backends, but Exa is betting that true innovation requires rethinking search from the ground up. For example, the recently launched Websets, a way to turn searches into structured output in grid format, allowing you to create lists and databases out of web pages. The company raised a $17M Series A to build towards this mission, so keep an eye out for them in 2025. Chapters* 00:00:00 Introductions* 00:01:12 ExaAI's initial pitch and concept* 00:02:33 Will's background at SpaceX and Zoox* 00:03:45 Evolution of ExaAI (formerly Metaphor Systems)* 00:05:38 Exa's link prediction technology* 00:09:20 Meaning of the name "Exa"* 00:10:36 ExaAI's new product launch and capabilities* 00:13:33 Compute budgets and variable compute products* 00:14:43 Websets as a B2B offering* 00:19:28 How do you build a search engine?* 00:22:43 What is Neural PageRank?* 00:27:58 Exa use cases * 00:35:00 Auto-prompting* 00:38:42 Building agentic search* 00:44:19 Is o1 on the path to AGI?* 00:49:59 Company culture and nap pods* 00:54:52 Economics of AI search and the future of search technologyFull YouTube TranscriptPlease like and subscribe!Show Notes* ExaAI* Web Search Product* Websets* Series A Announcement* Exa Nap Pods* Perplexity AI* Character.AITranscriptAlessio [00:00:00]: Hey, everyone. Welcome to the Latent Space podcast. This is Alessio, partner and CTO at Decibel Partners, and I'm joined by my co-host Swyx, founder of Smol.ai.Swyx [00:00:10]: Hey, and today we're in the studio with my good friend and former landlord, Will Bryk. Roommate. How you doing? Will, you're now CEO co-founder of ExaAI, used to be Metaphor Systems. What's your background, your story?Will [00:00:30]: Yeah, sure. So, yeah, I'm CEO of Exa. I've been doing it for three years. I guess I've always been interested in search, whether I knew it or not. Like, since I was a kid, I've always been interested in, like, high-quality information. And, like, you know, even in high school, wanted to improve the way we get information from news. And then in college, built a mini search engine. And then with Exa, like, you know, it's kind of like fulfilling the dream of actually being able to solve all the information needs I wanted as a kid. Yeah, I guess. I would say my entire life has kind of been rotating around this problem, which is pretty cool. Yeah.Swyx [00:00:50]: What'd you enter YC with?Will [00:00:53]: We entered YC with, uh, we are better than Google. Like, Google 2.0.Swyx [00:01:12]: What makes you say that? Like, that's so audacious to come out of the box with.Will [00:01:16]: Yeah, okay, so you have to remember the time. This was summer 2021. And, uh, GPT-3 had come out. Like, here was this magical thing that you could talk to, you could enter a whole paragraph, and it understands what you mean, understands the subtlety of your language. And then there was Google. Uh, which felt like it hadn't changed in a decade, uh, because it really hadn't. And it, like, you would give it a simple query, like, I don't know, uh, shirts without stripes, and it would give you a bunch of results for the shirts with stripes. And so, like, Google could barely understand you, and GBD3 could. And the theory was, what if you could make a search engine that actually understood you? What if you could apply the insights from LLMs to a search engine? And it's really been the same idea ever since. And we're actually a lot closer now, uh, to doing that. Yeah.Alessio [00:01:55]: Did you have any trouble making people believe? Obviously, there's the same element. I mean, YC overlap, was YC pretty AI forward, even 2021, or?Will [00:02:03]: It's nothing like it is today. But, um, uh, there were a few AI companies, but, uh, we were definitely, like, bold. And I think people, VCs generally like boldness, and we definitely had some AI background, and we had a working demo. So there was evidence that we could build something that was going to work. But yeah, I think, like, the fundamentals were there. I think people at the time were talking about how, you know, Google was failing in a lot of ways. And so there was a bit of conversation about it, but AI was not a big, big thing at the time. Yeah. Yeah.Alessio [00:02:33]: Before we jump into Exa, any fun background stories? I know you interned at SpaceX, any Elon, uh, stories? I know you were at Zoox as well, you know, kind of like robotics at Harvard. Any stuff that you saw early that you thought was going to get solved that maybe it's not solved today?Will [00:02:48]: Oh yeah. I mean, lots of things like that. Like, uh, I never really learned how to drive because I believed Elon that self-driving cars would happen. It did happen. And I take them every night to get home. But it took like 10 more years than I thought. Do you still not know how to drive? I know how to drive now. I learned it like two years ago. That would have been great to like, just, you know, Yeah, yeah, yeah. You know? Um, I was obsessed with Elon. Yeah. I mean, I worked at SpaceX because I really just wanted to work at one of his companies. And I remember they had a rule, like interns cannot touch Elon. And, um, that rule actually influenced my actions.Swyx [00:03:18]: Is it, can Elon touch interns? Ooh, like physically?Will [00:03:22]: Or like talk? Physically, physically, yeah, yeah, yeah, yeah. Okay, interesting. He's changed a lot, but, um, I mean, his companies are amazing. Um,Swyx [00:03:28]: What if you beat him at Diablo 2, Diablo 4, you know, like, Ah, maybe.Alessio [00:03:34]: I want to jump into, I know there's a lot of backstory used to be called metaphor system. So, um, and it, you've always been kind of like a prominent company, maybe at least RAI circles in the NSF.Swyx [00:03:45]: I'm actually curious how Metaphor got its initial aura. You launched with like, very little. We launched very little. Like there was, there was this like big splash image of like, this is Aurora or something. Yeah. Right. And then I was like, okay, what this thing, like the vibes are good, but I don't know what it is. And I think, I think it was much more sort of maybe consumer facing than what you are today. Would you say that's true?Will [00:04:06]: No, it's always been about building a better search algorithm, like search, like, just like the vision has always been perfect search. And if you do that, uh, we will figure out the downstream use cases later. It started on this fundamental belief that you could have perfect search over the web and we could talk about what that means. And like the initial thing we released was really just like our first search engine, like trying to get it out there. Kind of like, you know, an open source. So when OpenAI released, uh, ChachBt, like they didn't, I don't know how, how much of a game plan they had. They kind of just wanted to get something out there.Swyx [00:04:33]: Spooky research preview.Will [00:04:34]: Yeah, exactly. And it kind of morphed from a research company to a product company at that point. And I think similarly for us, like we were research, we started as a research endeavor with a, you know, clear eyes that like, if we succeed, it will be a massive business to make out of it. And that's kind of basically what happened. I think there are actually a lot of parallels to, of w between Exa and OpenAI. I often say we're the OpenAI of search. Um, because. Because we're a research company, we're a research startup that does like fundamental research into, uh, making like AGI for search in a, in a way. Uh, and then we have all these like, uh, business products that come out of that.Swyx [00:05:08]: Interesting. I want to ask a little bit more about Metaforesight and then we can go full Exa. When I first met you, which was really funny, cause like literally I stayed in your house in a very historic, uh, Hayes, Hayes Valley place. You said you were building sort of like link prediction foundation model, and I think there's still a lot of foundation model work. I mean, within Exa today, but what does that even mean? I cannot be the only person confused by that because like there's a limited vocabulary or tokens you're telling me, like the tokens are the links or, you know, like it's not, it's not clear. Yeah.Will [00:05:38]: Uh, what we meant by link prediction is that you are literally predicting, like given some texts, you're predicting the links that follow. Yes. That refers to like, it's how we describe the training procedure, which is that we find links on the web. Uh, we take the text surrounding the link. And then we predict. Which link follows you, like, uh, you know, similar to transformers where, uh, you're trying to predict the next token here, you're trying to predict the next link. And so you kind of like hide the link from the transformer. So if someone writes, you know, imagine some article where someone says, Hey, check out this really cool aerospace startup. And they, they say spacex.com afterwards, uh, we hide the spacex.com and ask the model, like what link came next. And by doing that many, many times, you know, billions of times, you could actually build a search engine out of that because then, uh, at query time at search time. Uh, you type in, uh, a query that's like really cool aerospace startup and the model will then try to predict what are the most likely links. So there's a lot of analogs to transformers, but like to actually make this work, it does require like a different architecture than, but it's transformer inspired. Yeah.Alessio [00:06:41]: What's the design decision between doing that versus extracting the link and the description and then embedding the description and then using, um, yeah. What do you need to predict the URL versus like just describing, because you're kind of do a similar thing in a way. Right. It's kind of like based on this description, it was like the closest link for it. So one thing is like predicting the link. The other approach is like I extract the link and the description, and then based on the query, I searched the closest description to it more. Yeah.Will [00:07:09]: That, that, by the way, that is, that is the link refers here to a document. It's not, I think one confusing thing is it's not, you're not actually predicting the URL, the URL itself that would require like the, the system to have memorized URLs. You're actually like getting the actual document, a more accurate name could be document prediction. I see. This was the initial like base model that Exo was trained on, but we've moved beyond that similar to like how, you know, uh, to train a really good like language model, you might start with this like self-supervised objective of predicting the next token and then, uh, just from random stuff on the web. But then you, you want to, uh, add a bunch of like synthetic data and like supervised fine tuning, um, stuff like that to make it really like controllable and robust. Yeah.Alessio [00:07:48]: Yeah. We just have flow from Lindy and, uh, their Lindy started to like hallucinate recrolling YouTube links instead of like, uh, something. Yeah. Support guide. So. Oh, interesting. Yeah.Swyx [00:07:57]: So round about January, you announced your series A and renamed to Exo. I didn't like the name at the, at the initial, but it's grown on me. I liked metaphor, but apparently people can spell metaphor. What would you say are the major components of Exo today? Right? Like, I feel like it used to be very model heavy. Then at the AI engineer conference, Shreyas gave a really good talk on the vector database that you guys have. What are the other major moving parts of Exo? Okay.Will [00:08:23]: So Exo overall is a search engine. Yeah. We're trying to make it like a perfect search engine. And to do that, you have to build lots of, and we're doing it from scratch, right? So to do that, you have to build lots of different. The crawler. Yeah. You have to crawl a bunch of the web. First of all, you have to find the URLs to crawl. Uh, it's connected to the crawler, but yeah, you find URLs, you crawl those URLs. Then you have to process them with some, you know, it could be an embedding model. It could be something more complex, but you need to take, you know, or like, you know, in the past it was like a keyword inverted index. Like you would process all these documents you gather into some processed index, and then you have to serve that. Uh, you had high throughput at low latency. And so that, and that's like the vector database. And so it's like the crawling system, the AI processing system, and then the serving system. Those are all like, you know, teams of like hundreds, maybe thousands of people at Google. Um, but for us, it's like one or two people each typically, but yeah.Alessio [00:09:13]: Can you explain the meaning of, uh, Exo, just the story 10 to the 16th, uh, 18, 18.Will [00:09:20]: Yeah, yeah, yeah, sure. So. Exo means 10 to the 18th, which is in stark contrast to. To Google, which is 10 to the hundredth. Uh, we actually have these like awesome shirts that are like 10th to 18th is greater than 10th to the hundredth. Yeah, it's great. And it's great because it's provocative. It's like every engineer in Silicon Valley is like, what? No, it's not true. Um, like, yeah. And, uh, and then you, you ask them, okay, what does it actually mean? And like the creative ones will, will recognize it. But yeah, I mean, 10 to the 18th is better than 10 to the hundredth when it comes to search, because with search, you want like the actual list of, of things that match what you're asking for. You don't want like the whole web. You want to basically with search filter, the, like everything that humanity has ever created to exactly what you want. And so the idea is like smaller is better there. You want like the best 10th to the 18th and not the 10th to the hundredth. I'm like, one way to say this is like, you know how Google often says at the top, uh, like, you know, 30 million results found. And it's like crazy. Cause you're looking for like the first startups in San Francisco that work on hardware or something. And like, they're not 30 million results like that. What you want is like 325 results found. And those are all the results. That's what you really want with search. And that's, that's our vision. It's like, it just gives you. Perfectly what you asked for.Swyx [00:10:24]: We're recording this ahead of your launch. Uh, we haven't released, we haven't figured out the, the, the name of the launch yet, but what is the product that you're launching? I guess now that we're coinciding this podcast with. Yeah.Will [00:10:36]: So we've basically developed the next version of Exa, which is the ability to get a near perfect list of results of whatever you want. And what that means is you can make a complex query now to Exa, for example, startups working on hardware in SF, and then just get a huge list of all the things that match. And, you know, our goal is if there are 325 startups that match that we find you all of them. And this is just like, there's just like a new experience that's never existed before. It's really like, I don't know how you would go about that right now with current tools and you can apply this same type of like technology to anything. Like, let's say you want, uh, you want to find all the blog posts that talk about Alessio's podcast, um, that have come out in the past year. That is 30 million results. Yeah. Right.Will [00:11:24]: But that, I mean, that would, I'm sure that would be extremely useful to you guys. And like, I don't really know how you would get that full comprehensive list.Swyx [00:11:29]: I just like, how do you, well, there's so many questions with regards to how do you know it's complete, right? Cause you're saying there's only 30 million, 325, whatever. And then how do you do the semantic understanding that it might take, right? So working in hardware, like I might not use the words hardware. I might use the words robotics. I might use the words wearables. I might use like whatever. Yes. So yeah, just tell us more. Yeah. Yeah. Sure. Sure.Will [00:11:53]: So one aspect of this, it's a little subjective. So like certainly providing, you know, at some point we'll provide parameters to the user to like, you know, some sort of threshold to like, uh, gauge like, okay, like this is a cutoff. Like, this is actually not what I mean, because sometimes it's subjective and there needs to be a feedback loop. Like, oh, like it might give you like a few examples and you say, yeah, exactly. And so like, you're, you're kind of like creating a classifier on the fly, but like, that's ultimately how you solve the problem. So the subject, there's a subjectivity problem and then there's a comprehensiveness problem. Those are two different problems. So. Yeah. So you have the comprehensiveness problem. What you basically have to do is you have to put more compute into the query, into the search until you get the full comprehensiveness. Yeah. And I think there's an interesting point here, which is that not all queries are made equal. Some queries just like this blog post one might require scanning, like scavenging, like throughout the whole web in a way that just, just simply requires more compute. You know, at some point there's some amount of compute where you will just be comprehensive. You could imagine, for example, running GPT-4 over the internet. You could imagine running GPT-4 over the entire web and saying like, is this a blog post about Alessio's podcast, like, is this a blog post about Alessio's podcast? And then that would work, right? It would take, you know, a year, maybe cost like a million dollars, but, or many more, but, um, it would work. Uh, the point is that like, given sufficient compute, you can solve the query. And so it's really a question of like, how comprehensive do you want it given your compute budget? I think it's very similar to O1, by the way. And one way of thinking about what we built is like O1 for search, uh, because O1 is all about like, you know, some, some, some questions require more compute than others, and we'll put as much compute into the question as we need to solve it. So similarly with our search, we will put as much compute into the query in order to get comprehensiveness. Yeah.Swyx [00:13:33]: Does that mean you have like some kind of compute budget that I can specify? Yes. Yes. Okay. And like, what are the upper and lower bounds?Will [00:13:42]: Yeah, there's something we're still figuring out. I think like, like everyone is a new paradigm of like variable compute products. Yeah. How do you specify the amount of compute? Like what happens when you. Run out? Do you just like, ah, do you, can you like keep going with it? Like, do you just put in more credits to get more, um, for some, like this can get complex at like the really large compute queries. And like, one thing we do is we give you a preview of what you're going to get, and then you could then spin up like a much larger job, uh, to get like way more results. But yes, there is some compute limit, um, at, at least right now. Yeah. People think of searches as like, oh, it takes 500 milliseconds because we've been conditioned, uh, to have search that takes 500 milliseconds. But like search engines like Google, right. No matter how complex your query to Google, it will take like, you know, roughly 400 milliseconds. But what if searches can take like a minute or 10 minutes or a whole day, what can you then do? And you can do very powerful things. Um, you know, you can imagine, you know, writing a search, going and get a cup of coffee, coming back and you have a perfect list. Like that's okay for a lot of use cases. Yeah.Alessio [00:14:43]: Yeah. I mean, the use case closest to me is venture capital, right? So, uh, no, I mean, eight years ago, I built one of the first like data driven sourcing platforms. So we were. You look at GitHub, Twitter, Product Hunt, all these things, look at interesting things, evaluate them. If you think about some jobs that people have, it's like literally just make a list. If you're like an analyst at a venture firm, your job is to make a list of interesting companies. And then you reach out to them. How do you think about being infrastructure versus like a product you could say, Hey, this is like a product to find companies. This is a product to find things versus like offering more as a blank canvas that people can build on top of. Oh, right. Right.Will [00:15:20]: Uh, we are. We are a search infrastructure company. So we want people to build, uh, on top of us, uh, build amazing products on top of us. But with this one, we try to build something that makes it really easy for users to just log in, put a few, you know, put some credits in and just get like amazing results right away and not have to wait to build some API integration. So we're kind of doing both. Uh, we, we want, we want people to integrate this into all their applications at the same time. We want to just make it really easy to use very similar again to open AI. Like they'll have, they have an API, but they also have. Like a ChatGPT interface so that you could, it's really easy to use, but you could also build it in your applications. Yeah.Alessio [00:15:56]: I'm still trying to wrap my head around a lot of the implications. So, so many businesses run on like information arbitrage, you know, like I know this thing that you don't, especially in investment and financial services. So yeah, now all of a sudden you have these tools for like, oh, actually everybody can get the same information at the same time, the same quality level as an API call. You know, it just kind of changes a lot of things. Yeah.Will [00:16:19]: I think, I think what we're grappling with here. What, what you're just thinking about is like, what is the world like if knowledge is kind of solved, if like any knowledge request you want is just like right there on your computer, it's kind of different from when intelligence is solved. There's like a good, I've written before about like a different super intelligence, super knowledge. Yeah. Like I think that the, the distinction between intelligence and knowledge is actually a pretty good one. They're definitely connected and related in all sorts of ways, but there is a distinction. You could have a world and we are going to have this world where you have like GP five level systems and beyond that could like answer any complex request. Um, unless it requires some. Like, if you say like, uh, you know, give me a list of all the PhDs in New York city who, I don't know, have thought about search before. And even though this, this super intelligence is going to be like, I can't find it on Google, right. Which is kind of crazy. Like we're literally going to have like super intelligences that are using Google. And so if Google can't find them information, there's nothing they could do. They can't find it. So, but if you also have a super knowledge system where it's like, you know, I'm calling this term super knowledge where you just get whatever knowledge you want, then you can pair with a super intelligence system. And then the super intelligence can, we'll never. Be blocked by lack of knowledge.Alessio [00:17:23]: Yeah. You told me this, uh, when we had lunch, I forget how it came out, but we were talking about AGI and whatnot. And you were like, even AGI is going to need search. Yeah.Swyx [00:17:32]: Yeah. Right. Yeah. Um, so we're actually referencing a blog post that you wrote super intelligence and super knowledge. Uh, so I would refer people to that. And this is actually a discussion we've had on the podcast a couple of times. Um, there's so much of model weights that are just memorizing facts. Some of the, some of those might be outdated. Some of them are incomplete or not. Yeah. So like you just need search. So I do wonder, like, is there a maximum language model size that will be the intelligence layer and then the rest is just search, right? Like maybe we should just always use search. And then that sort of workhorse model is just like, and it like, like, like one B or three B parameter model that just drives everything. Yes.Will [00:18:13]: I believe this is a much more optimal system to have a smaller LM. That's really just like an intelligence module. And it makes a call to a search. Tool that's way more efficient because if, okay, I mean the, the opposite of that would be like the LM is so big that can memorize the whole web. That would be like way, but you know, it's not practical at all. I don't, it's not possible to train that at least right now. And Carpathy has actually written about this, how like he could, he could see models moving more and more towards like intelligence modules using various tools. Yeah.Swyx [00:18:39]: So for listeners, that's the, that was him on the no priors podcast. And for us, we talked about this and the, on the Shin Yu and Harrison chase podcasts. I'm doing search in my head. I told you 30 million results. I forgot about our neural link integration. Self-hosted exit.Will [00:18:54]: Yeah. Yeah. No, I do see that that is a much more, much more efficient world. Yeah. I mean, you could also have GB four level systems calling search, but it's just because of the cost of inference. It's just better to have a very efficient search tool and a very efficient LM and they're built for different things. Yeah.Swyx [00:19:09]: I'm just kind of curious. Like it is still something so audacious that I don't want to elide, which is you're, you're, you're building a search engine. Where do you start? How do you, like, are there any reference papers or implementation? That would really influence your thinking, anything like that? Because I don't even know where to start apart from just crawl a bunch of s**t, but there's gotta be more insight than that.Will [00:19:28]: I mean, yeah, there's more insight, but I'm always surprised by like, if you have a group of people who are really focused on solving a problem, um, with the tools today, like there's some in, in software, like there are all sorts of creative solutions that just haven't been thought of before, particularly in the information retrieval field. Yeah. I think a lot of the techniques are just very old, frankly. Like I know how Google and Bing work and. They're just not using new methods. There are all sorts of reasons for that. Like one, like Google has to be comprehensive over the web. So they're, and they have to return in 400 milliseconds. And those two things combined means they are kind of limit and it can't cost too much. They're kind of limited in, uh, what kinds of algorithms they could even deploy at scale. So they end up using like a limited keyword based algorithm. Also like Google was built in a time where like in, you know, in 1998, where we didn't have LMS, we didn't have embeddings. And so they never thought to build those things. And so now they have this like gigantic system that is built on old technology. Yeah. And so a lot of the information retrieval field we found just like thinks in terms of that framework. Yeah. Whereas we came in as like newcomers just thinking like, okay, there here's GB three. It's magical. Obviously we're going to build search that is using that technology. And we never even thought about using keywords really ever. Uh, like we were neural all the way we're building an end to end neural search engine. And just that whole framing just makes us ask different questions, like pursue different lines of work. And there's just a lot of low hanging fruit because no one else is thinking about it. We're just on the frontier of neural search. We just are, um, for, for at web scale, um, because there's just not a lot of people thinking that way about it.Swyx [00:20:57]: Yeah. Maybe let's spell this out since, uh, we're already on this topic, elephants in the room are Perplexity and SearchGPT. That's the, I think that it's all, it's no longer called SearchGPT. I think they call it ChatGPT Search. How would you contrast your approaches to them based on what we know of how they work and yeah, just any, anything in that, in that area? Yeah.Will [00:21:15]: So these systems, there are a few of them now, uh, they basically rely on like traditional search engines like Google or Bing, and then they combine them with like LLMs at the end to, you know, output some power graphics, uh, answering your question. So they like search GPT perplexity. I think they have their own crawlers. No. So there's this important distinction between like having your own search system and like having your own cache of the web. Like for example, so you could create, you could crawl a bunch of the web. Imagine you crawl a hundred billion URLs, and then you create a key value store of like mapping from URL to the document that is technically called an index, but it's not a search algorithm. So then to actually like, when you make a query to search GPT, for example, what is it actually doing it? Let's say it's, it's, it could, it's using the Bing API, uh, getting a list of results and then it could go, it has this cache of like all the contents of those results and then could like bring in the cache, like the index cache, but it's not actually like, it's not like they've built a search engine from scratch over, you know, hundreds of billions of pages. It's like, does that distinction clear? It's like, yeah, you could have like a mapping from URL to documents, but then rely on traditional search engines to actually get the list of results because it's a very hard problem to take. It's not hard. It's not hard to use DynamoDB and, and, and map URLs to documents. It's a very hard problem to take a hundred billion or more documents and given a query, like instantly get the list of results that match. That's a much harder problem that very few entities on, in, on the planet have done. Like there's Google, there's Bing, uh, you know, there's Yandex, but you know, there are not that many companies that are, that are crazy enough to actually build their search engine from scratch when you could just use traditional search APIs.Alessio [00:22:43]: So Google had PageRank as like the big thing. Is there a LLM equivalent or like any. Stuff that you're working on that you want to highlight?Will [00:22:51]: The link prediction objective can be seen as like a neural PageRank because what you're doing is you're predicting the links people share. And so if everyone is sharing some Paul Graham essay about fundraising, then like our model is more likely to predict it. So like inherent in our training objective is this, uh, a sense of like high canonicity and like high quality, but it's more powerful than PageRank. It's strictly more powerful because people might refer to that Paul Graham fundraising essay in like a thousand different ways. And so our model learns all the different ways. That someone refers that Paul Graham, I say, while also learning how important that Paul Graham essay is. Um, so it's like, it's like PageRank on steroids kind of thing. Yeah.Alessio [00:23:26]: I think to me, that's the most interesting thing about search today, like with Google and whatnot, it's like, it's mostly like domain authority. So like if you get back playing, like if you search any AI term, you get this like SEO slop websites with like a bunch of things in them. So this is interesting, but then how do you think about more timeless maybe content? So if you think about, yeah. You know, maybe the founder mode essay, right. It gets shared by like a lot of people, but then you might have a lot of other essays that are also good, but they just don't really get a lot of traction. Even though maybe the people that share them are high quality. How do you kind of solve that thing when you don't have the people authority, so to speak of who's sharing, whether or not they're worth kind of like bumping up? Yeah.Will [00:24:10]: I mean, you do have a lot of control over the training data, so you could like make sure that the training data contains like high quality sources so that, okay. Like if you, if you're. Training data, I mean, it's very similar to like language, language model training. Like if you train on like a bunch of crap, your prediction will be crap. Our model will match the training distribution is trained on. And so we could like, there are lots of ways to tweak the training data to refer to high quality content that we want. Yeah. I would say also this, like this slop that is returned by, by traditional search engines, like Google and Bing, you have the slop is then, uh, transferred into the, these LLMs in like a search GBT or, you know, our other systems like that. Like if slop comes in, slop will go out. And so, yeah, that's another answer to how we're different is like, we're not like traditional search engines. We want to give like the highest quality results and like have full control over whatever you want. If you don't want slop, you get that. And then if you put an LM on top of that, which our customers do, then you just get higher quality results or high quality output.Alessio [00:25:06]: And I use Excel search very often and it's very good. Especially.Swyx [00:25:09]: Wave uses it too.Alessio [00:25:10]: Yeah. Yeah. Yeah. Yeah. Yeah. Like the slop is everywhere, especially when it comes to AI, when it comes to investment. When it comes to all of these things for like, it's valuable to be at the top. And this problem is only going to get worse because. Yeah, no, it's totally. What else is in the toolkit? So you have search API, you have ExaSearch, kind of like the web version. Now you have the list builder. I think you also have web scraping. Maybe just touch on that. Like, I guess maybe people, they want to search and then they want to scrape. Right. So is that kind of the use case that people have? Yeah.Will [00:25:41]: A lot of our customers, they don't just want, because they're building AI applications on top of Exa, they don't just want a list of URLs. They actually want. Like the full content, like cleans, parsed. Markdown. Markdown, maybe chunked, whatever they want, we'll give it to them. And so that's been like huge for customers. Just like getting the URLs and instantly getting the content for each URL is like, and you can do this for 10 or 100 or 1,000 URLs, wherever you want. That's very powerful.Swyx [00:26:05]: Yeah. I think this is the first thing I asked you for when I tried using Exa.Will [00:26:09]: Funny story is like when I built the first version of Exa, it's like, we just happened to store the content. Yes. Like the first 1,024 tokens. Because I just kind of like kept it because I thought of, you know, I don't know why. Really for debugging purposes. And so then when people started asking for content, it was actually pretty easy to serve it. But then, and then we did that, like Exa took off. So the computer's content was so useful. So that was kind of cool.Swyx [00:26:30]: It is. I would say there are other players like Gina, I think is in this space. Firecrawl is in this space. There's a bunch of scraper companies. And obviously scraper is just one part of your stack, but you might as well offer it since you already do it.Will [00:26:43]: Yeah, it makes sense. It's just easy to have an all-in-one solution. And like. We are, you know, building the best scraper in the world. So scraping is a hard problem and it's easy to get like, you know, a good scraper. It's very hard to get a great scraper and it's super hard to get a perfect scraper. So like, and, and scraping really matters to people. Do you have a perfect scraper? Not yet. Okay.Swyx [00:27:05]: The web is increasingly closing to the bots and the scrapers, Twitter, Reddit, Quora, Stack Overflow. I don't know what else. How are you dealing with that? How are you navigating those things? Like, you know. You know, opening your eyes, like just paying them money.Will [00:27:19]: Yeah, no, I mean, I think it definitely makes it harder for search engines. One response is just that there's so much value in the long tail of sites that are open. Okay. Um, and just like, even just searching over those well gets you most of the value. But I mean, there, there is definitely a lot of content that is increasingly not unavailable. And so you could get through that through data partnerships. The bigger we get as a company, the more, the easier it is to just like, uh, make partnerships. But I, I mean, I do see the world as like the future where the. The data, the, the data producers, the content creators will make partnerships with the entities that find that data.Alessio [00:27:53]: Any other fun use case that maybe people are not thinking about? Yeah.Will [00:27:58]: Oh, I mean, uh, there are so many customers. Yeah. What are people doing on AXA? Well, I think dating is a really interesting, uh, application of search that is completely underserved because there's a lot of profiles on the web and a lot of people who want to find love and that I'll use it. They give me. Like, you know, age boundaries, you know, education level location. Yeah. I mean, you want to, what, what do you want to do with data? You want to find like a partner who matches this education level, who like, you know, maybe has written about these types of topics before. Like if you could get a list of all the people like that, like, I think you will unblock a lot of people. I mean, there, I mean, I think this is a very Silicon Valley view of dating for sure. And I'm, I'm well aware of that, but it's just an interesting application of like, you know, I would love to meet like an intellectual partner, um, who like shares a lot of ideas. Yeah. Like if you could do that through better search and yeah.Swyx [00:28:48]: But what is it with Jeff? Jeff has already set me up with a few people. So like Jeff, I think it's my personal exit.Will [00:28:55]: my mom's actually a matchmaker and has got a lot of married. Yeah. No kidding. Yeah. Yeah. Search is built into the book. It's in your jeans. Yeah. Yeah.Swyx [00:29:02]: Yeah. Other than dating, like I know you're having quite some success in colleges. I would just love to map out some more use cases so that our listeners can just use those examples to think about use cases for XR, right? Because it's such a general technology that it's hard to. Uh, really pin down, like, what should I use it for and what kind of products can I build with it?Will [00:29:20]: Yeah, sure. So, I mean, there are so many applications of XR and we have, you know, many, many companies using us for very diverse range of use cases, but I'll just highlight some interesting ones. Like one customer, a big customer is using us to, um, basically build like a, a writing assistant for students who want to write, uh, research papers. And basically like XR will search for, uh, like a list of research papers related to what the student is writing. And then this product has. Has like an LLM that like summarizes the papers to basically it's like a next word prediction, but in, uh, you know, prompted by like, you know, 20 research papers that X has returned. It's like literally just doing their homework for them. Yeah. Yeah. the key point is like, it's, it's, uh, you know, it's, it's, you know, research is, is a really hard thing to do and you need like high quality content as input.Swyx [00:30:08]: Oh, so we've had illicit on the podcast. I think it's pretty similar. Uh, they, they do focus pretty much on just, just research papers and, and that research. Basically, I think dating, uh, research, like I just wanted to like spell out more things, like just the big verticals.Will [00:30:23]: Yeah, yeah, no, I mean, there, there are so many use cases. So finance we talked about, yeah. I mean, one big vertical is just finding a list of companies, uh, so it's useful for VCs, like you said, who want to find like a list of competitors to a specific company they're investigating or just a list of companies in some field. Like, uh, there was one VC that told me that him and his team, like we're using XR for like eight hours straight. Like, like that. For many days on end, just like, like, uh, doing like lots of different queries of different types, like, oh, like all the companies in AI for law or, uh, all the companies for AI for, uh, construction and just like getting lists of things because you just can't find this information with, with traditional search engines. And then, you know, finding companies is also useful for, for selling. If you want to find, you know, like if we want to find a list of, uh, writing assistants to sell to, then we can just, we just use XR ourselves to find that is actually how we found a lot of our customers. Ooh, you can find your own customers using XR. Oh my God. I, in the spirit of. Uh, using XR to bolster XR, like recruiting is really helpful. It is really great use case of XR, um, because we can just get like a list of, you know, people who thought about search and just get like a long list and then, you know, reach out to those people.Swyx [00:31:29]: When you say thought about, are you, are you thinking LinkedIn, Twitter, or are you thinking just blogs?Will [00:31:33]: Or they've written, I mean, it's pretty general. So in that case, like ideally XR would return like the, the really blogs written by people who have just. So if I don't blog, I don't show up to XR, right? Like I have to blog. well, I mean, you could show up. That's like an incentive for people to blog.Swyx [00:31:47]: Well, if you've written about, uh, search in on Twitter and we, we do, we do index a bunch of tweets and then we, we should be able to service that. Yeah. Um, I mean, this is something I tell people, like you have to make yourself discoverable to the web, uh, you know, it's called learning in public, but like, it's even more imperative now because otherwise you don't exist at all.Will [00:32:07]: Yeah, no, no, this is a huge, uh, thing, which is like search engines completely influence. They have downstream effects. They influence the internet itself. They influence what people. Choose to create. And so Google, because they're a keyword based search engine, people like kind of like keyword stuff. Yeah. They're, they're, they're incentivized to create things that just match a lot of keywords, which is not very high quality. Uh, whereas XR is a search algorithm that, uh, optimizes for like high quality and actually like matching what you mean. And so people are incentivized to create content that is high quality, that like the create content that they know will be found by the right person. So like, you know, if I am a search researcher and I want to be found. By XR, I should blog about search and all the things I'm building because, because now we have a search engine like XR that's powerful enough to find them. And so the search engine will influence like the downstream internet in all sorts of amazing ways. Yeah. Uh, whatever the search engine optimizes for is what the internet looks like. Yeah.Swyx [00:33:01]: Are you familiar with the term? McLuhanism? No, it's not. Uh, it's this concept that, uh, like first we shape tools and then the tools shape us. Okay. Yeah. Uh, so there's like this reflexive connection between the things we search for and the things that get searched. Yes. So like once you change the tool. The tool that searches the, the, the things that get searched also change. Yes.Will [00:33:18]: I mean, there was a clear example of that with 30 years of Google. Yeah, exactly. Google has basically trained us to think of search and Google has Google is search like in people's heads. Right. It's one, uh, hard part about XR is like, uh, ripping people away from that notion of search and expanding their sense of what search could be. Because like when people think search, they think like a few keywords, or at least they used to, they think of a few keywords and that's it. They don't think to make these like really complex paragraph long requests for information and get a perfect list. ChatGPT was an interesting like thing that expanded people's understanding of search because you start using ChatGPT for a few hours and you go back to Google and you like paste in your code and Google just doesn't work and you're like, oh, wait, it, Google doesn't do work that way. So like ChatGPT expanded our understanding of what search can be. And I think XR is, uh, is part of that. We want to expand people's notion, like, Hey, you could actually get whatever you want. Yeah.Alessio [00:34:06]: I search on XR right now, people writing about learning in public. I was like, is it gonna come out with Alessio? Am I, am I there? You're not because. Bro. It's. So, no, it's, it's so about, because it thinks about learning, like in public, like public schools and like focuses more on that. You know, it's like how, when there are like these highly overlapping things, like this is like a good result based on the query, you know, but like, how do I get to Alessio? Right. So if you're like in these subcultures, I don't think this would work in Google well either, you know, but I, I don't know if you have any learnings.Swyx [00:34:40]: No, I'm the first result on Google.Alessio [00:34:42]: People writing about learning. In public, you're not first result anymore, I guess.Swyx [00:34:48]: Just type learning public in Google.Alessio [00:34:49]: Well, yeah, yeah, yeah, yeah. But this is also like, this is in Google, it doesn't work either. That's what I'm saying. It's like how, when you have like a movement.Will [00:34:56]: There's confusion about the, like what you mean, like your intention is a little, uh. Yeah.Alessio [00:35:00]: It's like, yeah, I'm using, I'm using a term that like I didn't invent, but I'm kind of taking over, but like, they're just so much about that term already that it's hard to overcome. If that makes sense, because public schools is like, well, it's, it's hard to overcome.Will [00:35:14]: Public schools, you know, so there's the right solution to this, which is to specify more clearly what you mean. And I'm not expecting you to do that, but so the, the right interface to search is actually an LLM.Swyx [00:35:25]: Like you should be talking to an LLM about what you want and the LLM translates its knowledge of you or knowledge of what people usually mean into a query that excellent uses, which you have called auto prompts, right?Will [00:35:35]: Or, yeah, but it's like a very light version of that. And really it's just basically the right answer is it's the wrong interface and like very soon interface to search and really to everything will be LLM. And the LLM just has a full knowledge of you, right? So we're kind of building for that world. We're skating to where the puck is going to be. And so since we're moving to a world where like LLMs are interfaced to everything, you should build a search engine that can handle complex LLM queries, queries that come from LLMs. Because you're probably too lazy, I'm too lazy too, to write like a whole paragraph explaining, okay, this is what I mean by this word. But an LLM is not lazy. And so like the LLM will spit out like a paragraph or more explaining exactly what it wants. You need a search engine that can handle that. Traditional search engines like Google or Bing, they're actually... Designed for humans typing keywords. If you give a paragraph to Google or Bing, they just completely fail. And so Exa can handle paragraphs and we want to be able to handle it more and more until it's like perfect.Alessio [00:36:24]: What about opinions? Do you have lists? When you think about the list product, do you think about just finding entries? Do you think about ranking entries? I'll give you a dumb example. So on Lindy, I've been building the spot that every week gives me like the top fantasy football waiver pickups. But every website is like different opinions. I'm like, you should pick up. These five players, these five players. When you're making lists, do you want to be kind of like also ranking and like telling people what's best? Or like, are you mostly focused on just surfacing information?Will [00:36:56]: There's a really good distinction between filtering to like things that match your query and then ranking based on like what is like your preferences. And ranking is like filtering is objective. It's like, does this document match what you asked for? Whereas ranking is more subjective. It's like, what is the best? Well, it depends what you mean by best, right? So first, first table stakes is let's get the filtering into a perfect place where you actually like every document matches what you asked for. No surgeon can do that today. And then ranking, you know, there are all sorts of interesting ways to do that where like you've maybe for, you know, have the user like specify more clearly what they mean by best. You could do it. And if the user doesn't specify, you do your best, you do your best based on what people typically mean by best. But ideally, like the user can specify, oh, when I mean best, I actually mean ranked by the, you know, the number of people who visited that site. Let's say is, is one example ranking or, oh, what I mean by best, let's say you're listing companies. What I mean by best is like the ones that have, uh, you know, have the most employees or something like that. Like there are all sorts of ways to rank a list of results that are not captured by something as subjective as best. Yeah. Yeah.Alessio [00:38:00]: I mean, it's like, who are the best NBA players in the history? It's like everybody has their own. Right.Will [00:38:06]: Right. But I mean, the, the, the search engine should definitely like, even if you don't specify it, it should do as good of a job as possible. Yeah. Yeah. No, no, totally. Yeah. Yeah. Yeah. Yeah. It's a new topic to people because we're not used to a search engine that can handle like a very complex ranking system. Like you think to type in best basketball players and not something more specific because you know, that's the only thing Google could handle. But if Google could handle like, oh, basketball players ranked by like number of shots scored on average per game, then you would do that. But you know, they can't do that. So.Swyx [00:38:32]: Yeah. That's fascinating. So you haven't used the word agents, but you're kind of building a search agent. Do you believe that that is agentic in feature? Do you think that term is distracting?Will [00:38:42]: I think it's a good term. I do think everything will eventually become agentic. And so then the term will lose power, but yes, like what we're building is agentic it in a sense that it takes actions. It decides when to go deeper into something, it has a loop, right? It feels different from traditional search, which is like an algorithm, not an agent. Ours is a combination of an algorithm and an agent.Swyx [00:39:05]: I think my reflection from seeing this in the coding space where there's basically sort of classic. Framework for thinking about this stuff is the self-driving levels of autonomy, right? Level one to five, typically the level five ones all failed because there's full autonomy and we're not, we're not there yet. And people like control. People like to be in the loop. So the, the, the level ones was co-pilot first and now it's like cursor and whatever. So I feel like if it's too agentic, it's too magical, like, like a, like a one shot, I stick a, stick a paragraph into the text box and then it spits it back to me. It might feel like I'm too disconnected from the process and I don't trust it. As opposed to something where I'm more intimately involved with the research product. I see. So like, uh, wait, so the earlier versions are, so if trying to stick to the example of the basketball thing, like best basketball player, but instead of best, you, you actually get to customize it with like, whatever the metric is that you, you guys care about. Yeah. I'm still not a basketballer, but, uh, but, but, you know, like, like B people like to be in my, my thesis is that agents level five agents failed because people like to. To kind of have drive assist rather than full self-driving.Will [00:40:15]: I mean, a lot of this has to do with how good agents are. Like at some point, if agents for coding are better than humans at all tests and then humans block, yeah, we're not there yet.Swyx [00:40:25]: So like in a world where we're not there yet, what you're pitching us is like, you're, you're kind of saying you're going all the way there. Like I kind of, I think all one is also very full, full self-driving. You don't get to see the plan. You don't get to affect the plan yet. You just fire off a query and then it goes away for a couple of minutes and comes back. Right. Which is effectively what you're saying you're going to do too. And you think there's.Will [00:40:42]: There's a, there's an in-between. I saw. Okay. So in building this product, we're exploring new interfaces because what does it mean to kick off a search that goes and takes 10 minutes? Like, is that a good interface? Because what if the search is actually wrong or it's not exactly, exactly specified to what you mean, which is why you get previews. Yeah. You get previews. So it is iterative, but ultimately once you've specified exactly what you mean, then you kind of do just want to kick off a batch job. Right. So perhaps what you're getting at is like, uh, there's this barrier with agents where you have to like explain the full context of what you mean, and a lot of failure modes happen when you have, when you don't. Yeah. There's failure modes from the agent, just not being smart enough. And then there's failure modes from the agent, not understanding exactly what you mean. And there's a lot of context that is shared between humans that is like lost between like humans and, and this like new creature.Alessio [00:41:32]: Yeah. Yeah. Because people don't know what's going on. I mean, to me, the best example of like system prompts is like, why are you writing? You're a helpful assistant. Like. Of course you should be an awful, but people don't yet know, like, can I assume that, you know, that, you know, it's like, why did the, and now people write, oh, you're a very smart software engineer, but like, you never made, you never make mistakes. Like, were you going to try and make mistakes before? So I think people don't yet have an understanding, like with, with driving people know what good driving is. It's like, don't crash, stay within kind of like a certain speed range. It's like, follow the directions. It's like, I don't really have to explain all of those things. I hope. But with. AI and like models and like search, people are like, okay, what do you actually know? What are like your assumptions about how search, how you're going to do search? And like, can I trust it? You know, can I influence it? So I think that's kind of the, the middle ground, like before you go ahead and like do all the search, it's like, can I see how you're doing it? And then maybe help show your work kind of like, yeah, steer you. Yeah. Yeah.Will [00:42:32]: No, I mean, yeah. Sure. Saying, even if you've crafted a great system prompt, you want to be part of the process itself. Uh, because the system prompt doesn't, it doesn't capture everything. Right. So yeah. A system prompt is like, you get to choose the person you work with. It's like, oh, like I want, I want a software engineer who thinks this way about code. But then even once you've chosen that person, you can't just give them a high level command and they go do it perfectly. You have to be part of that process. So yeah, I agree.Swyx [00:42:58]: Just a side note for my system, my favorite system, prompt programming anecdote now is the Apple intelligence system prompt that someone, someone's a prompt injected it and seen it. And like the Apple. Intelligence has the words, like, please don't, don't hallucinate. And it's like, of course we don't want you to hallucinate. Right. Like, so it's exactly that, that what you're talking about, like we should train this behavior into the model, but somehow we still feel the need to inject into the prompt. And I still don't even think that we are very scientific about it. Like it, I think it's almost like cargo culting. Like we have this like magical, like turn around three times, throw salt over your shoulder before you do something. And like, it worked the last time. So let's just do it the same time now. And like, we do, there's no science to this.Will [00:43:35]: I do think a lot of these problems might be ironed out in future versions. Right. So, and like, they might, they might hide the details from you. So it's like, they actually, all of them have a system prompt. That's like, you are a helpful assistant. You don't actually have to include it, even though it might actually be the way they've implemented in the backend. It should be done in RLE AF.Swyx [00:43:52]: Okay. Uh, one question I was just kind of curious about this episode is I'm going to try to frame this in terms of this, the general AI search wars, you know, you're, you're one player in that, um, there's perplexity, chat, GPT, search, and Google, but there's also like the B2B side, uh, we had. Drew Houston from Dropbox on, and he's competing with Glean, who've, uh, we've also had DD from, from Glean on, is there an appetite for Exa for my company's documents?Will [00:44:19]: There is appetite, but I think we have to be disciplined, focused, disciplined. I mean, we're already taking on like perfect web search, which is a lot. Um, but I mean, ultimately we want to build a perfect search engine, which definitely for a lot of queries involves your, your personal information, your company's information. And so, yeah, I mean, the grandest vision of Exa is perfect search really over everything, every domain, you know, we're going to have an Exa satellite, uh, because, because satellites can gather information that, uh, is not available publicly. Uh, gotcha. Yeah.Alessio [00:44:51]: Can we talk about AGI? We never, we never talk about AGI, but you had, uh, this whole tweet about, oh, one being the biggest kind of like AI step function towards it. Why does it feel so important to you? I know there's kind of like always criticism and saying, Hey, it's not the smartest son is better. It's like, blah, blah, blah. What? You choose C. So you say, this is what Ilias see or Sam see what they will see.Will [00:45:13]: I've just, I've just, you know, been connecting the dots. I mean, this was the key thing that a bunch of labs were working on, which is like, can you create a reward signal? Can you teach yourself based on a reward signal? Whether you're, if you're trying to learn coding or math, if you could have one model say, uh, be a grading system that says like you have successfully solved this programming assessment and then one model, like be the generative system. That's like, here are a bunch of programming assessments. You could train on that. It's basically whenever you could create a reward signal for some task, you could just generate a bunch of tasks for yourself. See that like, oh, on two of these thousand, you did well. And then you just train on that data. It's basically like, I mean, creating your own data for yourself and like, you know, all the labs working on that opening, I built the most impressive product doing that. And it's just very, it's very easy now to see how that could like scale to just solving, like, like solving programming or solving mathematics, which sounds crazy, but everything about our world right now is crazy.Alessio [00:46:07]: Um, and so I think if you remove that whole, like, oh, that's impossible, and you just think really clearly about like, what's now possible with like what, what they've done with O1, it's easy to see how that scales. How do you think about older GPT models then? Should people still work on them? You know, if like, obviously they just had the new Haiku, like, is it even worth spending time, like making these models better versus just, you know, Sam talked about O2 at that day. So obviously they're, they're spending a lot of time in it, but then you have maybe. The GPU poor, which are still working on making Lama good. Uh, and then you have the follower labs that do not have an O1 like model out yet. Yeah.Will [00:46:47]: This kind of gets into like, uh, what will the ecosystem of, of models be like in the future? And is there room is, is everything just gonna be O1 like models? I think, well, I mean, there's definitely a question of like inference speed and if certain things like O1 takes a long time, because that's the thing. Well, I mean, O1 is, is two things. It's like one it's it's use it's bootstrapping itself. It's teaching itself. And so the base model is smarter. But then it also has this like inference time compute where it could like spend like many minutes or many hours thinking. And so even the base model, which is also fast, it doesn't have to take minutes. It could take is, is better, smarter. I believe all models will be trained with this paradigm. Like you'll want to train on the best data, but there will be many different size models from different, very many different like companies, I believe. Yeah. Because like, I don't, yeah, I mean, it's hard, hard to predict, but I don't think opening eye is going to dominate like every possible LLM for every possible. Use case. I think for a lot of things, like you just want the fastest model and that might not involve O1 methods at all.Swyx [00:47:42]: I would say if you were to take the exit being O1 for search, literally, you really need to prioritize search trajectories, like almost maybe paying a bunch of grad students to go research things. And then you kind of track what they search and what the sequence of searching is, because it seems like that is the gold mine here, like the chain of thought or the thinking trajectory. Yeah.Will [00:48:05]: When it comes to search, I've always been skeptical. I've always been skeptical of human labeled data. Okay. Yeah, please. We tried something at our company at Exa recently where me and a bunch of engineers on the team like labeled a bunch of queries and it was really hard. Like, you know, you have all these niche queries and you're looking at a bunch of results and you're trying to identify which is matched to query. It's talking about, you know, the intricacies of like some biological experiment or something. I have no idea. Like, I don't know what matches and what, what labelers like me tend to do is just match by keyword. I'm like, oh, I don't know. Oh, like this document matches a bunch of keywords, so it must be good. But then you're actually completely missing the meaning of the document. Whereas an LLM like GB4 is really good at labeling. And so I actually think like you just we get by, which we are right now doing using like LLM

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

Applications for the 2025 AI Engineer Summit are up, and you can save the date for AIE Singapore in April and AIE World's Fair 2025 in June.Happy new year, and thanks for 100 great episodes! Please let us know what you want to see/hear for the next 100!Full YouTube Episode with Slides/ChartsLike and subscribe and hit that bell to get notifs!Timestamps* 00:00 Welcome to the 100th Episode!* 00:19 Reflecting on the Journey* 00:47 AI Engineering: The Rise and Impact* 03:15 Latent Space Live and AI Conferences* 09:44 The Competitive AI Landscape* 21:45 Synthetic Data and Future Trends* 35:53 Creative Writing with AI* 36:12 Legal and Ethical Issues in AI* 38:18 The Data War: GPU Poor vs. GPU Rich* 39:12 The Rise of GPU Ultra Rich* 40:47 Emerging Trends in AI Models* 45:31 The Multi-Modality War* 01:05:31 The Future of AI Benchmarks* 01:13:17 Pionote and Frontier Models* 01:13:47 Niche Models and Base Models* 01:14:30 State Space Models and RWKB* 01:15:48 Inference Race and Price Wars* 01:22:16 Major AI Themes of the Year* 01:22:48 AI Rewind: January to March* 01:26:42 AI Rewind: April to June* 01:33:12 AI Rewind: July to September* 01:34:59 AI Rewind: October to December* 01:39:53 Year-End Reflections and PredictionsTranscript[00:00:00] Welcome to the 100th Episode![00:00:00] Alessio: Hey everyone, welcome to the Latent Space Podcast. This is Alessio, partner and CTO at Decibel Partners, and I'm joined by my co host Swyx for the 100th time today.[00:00:12] swyx: Yay, um, and we're so glad that, yeah, you know, everyone has, uh, followed us in this journey. How do you feel about it? 100 episodes.[00:00:19] Alessio: Yeah, I know.[00:00:19] Reflecting on the Journey[00:00:19] Alessio: Almost two years that we've been doing this. We've had four different studios. Uh, we've had a lot of changes. You know, we used to do this lightning round. When we first started that we didn't like, and we tried to change the question. The answer[00:00:32] swyx: was cursor and perplexity.[00:00:34] Alessio: Yeah, I love mid journey. It's like, do you really not like anything else?[00:00:38] Alessio: Like what's, what's the unique thing? And I think, yeah, we, we've also had a lot more research driven content. You know, we had like 3DAO, we had, you know. Jeremy Howard, we had more folks like that.[00:00:47] AI Engineering: The Rise and Impact[00:00:47] Alessio: I think we want to do more of that too in the new year, like having, uh, some of the Gemini folks, both on the research and the applied side.[00:00:54] Alessio: Yeah, but it's been a ton of fun. I think we both started, I wouldn't say as a joke, we were kind of like, Oh, we [00:01:00] should do a podcast. And I think we kind of caught the right wave, obviously. And I think your rise of the AI engineer posts just kind of get people. Sombra to congregate, and then the AI engineer summit.[00:01:11] Alessio: And that's why when I look at our growth chart, it's kind of like a proxy for like the AI engineering industry as a whole, which is almost like, like, even if we don't do that much, we keep growing just because there's so many more AI engineers. So did you expect that growth or did you expect that would take longer for like the AI engineer thing to kind of like become, you know, everybody talks about it today.[00:01:32] swyx: So, the sign of that, that we have won is that Gartner puts it at the top of the hype curve right now. So Gartner has called the peak in AI engineering. I did not expect, um, to what level. I knew that I was correct when I called it because I did like two months of work going into that. But I didn't know, You know, how quickly it could happen, and obviously there's a chance that I could be wrong.[00:01:52] swyx: But I think, like, most people have come around to that concept. Hacker News hates it, which is a good sign. But there's enough people that have defined it, you know, GitHub, when [00:02:00] they launched GitHub Models, which is the Hugging Face clone, they put AI engineers in the banner, like, above the fold, like, in big So I think it's like kind of arrived as a meaningful and useful definition.[00:02:12] swyx: I think people are trying to figure out where the boundaries are. I think that was a lot of the quote unquote drama that happens behind the scenes at the World's Fair in June. Because I think there's a lot of doubt or questions about where ML engineering stops and AI engineering starts. That's a useful debate to be had.[00:02:29] swyx: In some sense, I actually anticipated that as well. So I intentionally did not. Put a firm definition there because most of the successful definitions are necessarily underspecified and it's actually useful to have different perspectives and you don't have to specify everything from the outset.[00:02:45] Alessio: Yeah, I was at um, AWS reInvent and the line to get into like the AI engineering talk, so to speak, which is, you know, applied AI and whatnot was like, there are like hundreds of people just in line to go in.[00:02:56] Alessio: I think that's kind of what enabled me. People, right? Which is what [00:03:00] you kind of talked about. It's like, Hey, look, you don't actually need a PhD, just, yeah, just use the model. And then maybe we'll talk about some of the blind spots that you get as an engineer with the earlier posts that we also had on on the sub stack.[00:03:11] Alessio: But yeah, it's been a heck of a heck of a two years.[00:03:14] swyx: Yeah.[00:03:15] Latent Space Live and AI Conferences[00:03:15] swyx: You know, I was, I was trying to view the conference as like, so NeurIPS is I think like 16, 17, 000 people. And the Latent Space Live event that we held there was 950 signups. I think. The AI world, the ML world is still very much research heavy. And that's as it should be because ML is very much in a research phase.[00:03:34] swyx: But as we move this entire field into production, I think that ratio inverts into becoming more engineering heavy. So at least I think engineering should be on the same level, even if it's never as prestigious, like it'll always be low status because at the end of the day, you're manipulating APIs or whatever.[00:03:51] swyx: But Yeah, wrapping GPTs, but there's going to be an increasing stack and an art to doing these, these things well. And I, you know, I [00:04:00] think that's what we're focusing on for the podcast, the conference and basically everything I do seems to make sense. And I think we'll, we'll talk about the trends here that apply.[00:04:09] swyx: It's, it's just very strange. So, like, there's a mix of, like, keeping on top of research while not being a researcher and then putting that research into production. So, like, people always ask me, like, why are you covering Neuralibs? Like, this is a ML research conference and I'm like, well, yeah, I mean, we're not going to, to like, understand everything Or reproduce every single paper, but the stuff that is being found here is going to make it through into production at some point, you hope.[00:04:32] swyx: And then actually like when I talk to the researchers, they actually get very excited because they're like, oh, you guys are actually caring about how this goes into production and that's what they really really want. The measure of success is previously just peer review, right? Getting 7s and 8s on their um, Academic review conferences and stuff like citations is one metric, but money is a better metric.[00:04:51] Alessio: Money is a better metric. Yeah, and there were about 2200 people on the live stream or something like that. Yeah, yeah. Hundred on the live stream. So [00:05:00] I try my best to moderate, but it was a lot spicier in person with Jonathan and, and Dylan. Yeah, that it was in the chat on YouTube.[00:05:06] swyx: I would say that I actually also created.[00:05:09] swyx: Layen Space Live in order to address flaws that are perceived in academic conferences. This is not NeurIPS specific, it's ICML, NeurIPS. Basically, it's very sort of oriented towards the PhD student, uh, market, job market, right? Like literally all, basically everyone's there to advertise their research and skills and get jobs.[00:05:28] swyx: And then obviously all the, the companies go there to hire them. And I think that's great for the individual researchers, but for people going there to get info is not great because you have to read between the lines, bring a ton of context in order to understand every single paper. So what is missing is effectively what I ended up doing, which is domain by domain, go through and recap the best of the year.[00:05:48] swyx: Survey the field. And there are, like NeurIPS had a, uh, I think ICML had a like a position paper track, NeurIPS added a benchmarks, uh, datasets track. These are ways in which to address that [00:06:00] issue. Uh, there's always workshops as well. Every, every conference has, you know, a last day of workshops and stuff that provide more of an overview.[00:06:06] swyx: But they're not specifically prompted to do so. And I think really, uh, Organizing a conference is just about getting good speakers and giving them the correct prompts. And then they will just go and do that thing and they do a very good job of it. So I think Sarah did a fantastic job with the startups prompt.[00:06:21] swyx: I can't list everybody, but we did best of 2024 in startups, vision, open models. Post transformers, synthetic data, small models, and agents. And then the last one was the, uh, and then we also did a quick one on reasoning with Nathan Lambert. And then the last one, obviously, was the debate that people were very hyped about.[00:06:39] swyx: It was very awkward. And I'm really, really thankful for John Franco, basically, who stepped up to challenge Dylan. Because Dylan was like, yeah, I'll do it. But He was pro scaling. And I think everyone who is like in AI is pro scaling, right? So you need somebody who's ready to publicly say, no, we've hit a wall.[00:06:57] swyx: So that means you're saying Sam Altman's wrong. [00:07:00] You're saying, um, you know, everyone else is wrong. It helps that this was the day before Ilya went on, went up on stage and then said pre training has hit a wall. And data has hit a wall. So actually Jonathan ended up winning, and then Ilya supported that statement, and then Noam Brown on the last day further supported that statement as well.[00:07:17] swyx: So it's kind of interesting that I think the consensus kind of going in was that we're not done scaling, like you should believe in a better lesson. And then, four straight days in a row, you had Sepp Hochreiter, who is the creator of the LSTM, along with everyone's favorite OG in AI, which is Juergen Schmidhuber.[00:07:34] swyx: He said that, um, we're pre trading inside a wall, or like, we've run into a different kind of wall. And then we have, you know John Frankel, Ilya, and then Noam Brown are all saying variations of the same thing, that we have hit some kind of wall in the status quo of what pre trained, scaling large pre trained models has looked like, and we need a new thing.[00:07:54] swyx: And obviously the new thing for people is some make, either people are calling it inference time compute or test time [00:08:00] compute. I think the collective terminology has been inference time, and I think that makes sense because test time, calling it test, meaning, has a very pre trained bias, meaning that the only reason for running inference at all is to test your model.[00:08:11] swyx: That is not true. Right. Yeah. So, so, I quite agree that. OpenAI seems to have adopted, or the community seems to have adopted this terminology of ITC instead of TTC. And that, that makes a lot of sense because like now we care about inference, even right down to compute optimality. Like I actually interviewed this author who recovered or reviewed the Chinchilla paper.[00:08:31] swyx: Chinchilla paper is compute optimal training, but what is not stated in there is it's pre trained compute optimal training. And once you start caring about inference, compute optimal training, you have a different scaling law. And in a way that we did not know last year.[00:08:45] Alessio: I wonder, because John is, he's also on the side of attention is all you need.[00:08:49] Alessio: Like he had the bet with Sasha. So I'm curious, like he doesn't believe in scaling, but he thinks the transformer, I wonder if he's still. So, so,[00:08:56] swyx: so he, obviously everything is nuanced and you know, I told him to play a character [00:09:00] for this debate, right? So he actually does. Yeah. He still, he still believes that we can scale more.[00:09:04] swyx: Uh, he just assumed the character to be very game for, for playing this debate. So even more kudos to him that he assumed a position that he didn't believe in and still won the debate.[00:09:16] Alessio: Get rekt, Dylan. Um, do you just want to quickly run through some of these things? Like, uh, Sarah's presentation, just the highlights.[00:09:24] swyx: Yeah, we can't go through everyone's slides, but I pulled out some things as a factor of, like, stuff that we were going to talk about. And we'll[00:09:30] Alessio: publish[00:09:31] swyx: the rest. Yeah, we'll publish on this feed the best of 2024 in those domains. And hopefully people can benefit from the work that our speakers have done.[00:09:39] swyx: But I think it's, uh, these are just good slides. And I've been, I've been looking for a sort of end of year recaps from, from people.[00:09:44] The Competitive AI Landscape[00:09:44] swyx: The field has progressed a lot. You know, I think the max ELO in 2023 on LMSys used to be 1200 for LMSys ELOs. And now everyone is at least at, uh, 1275 in their ELOs, and this is across Gemini, Chadjibuti, [00:10:00] Grok, O1.[00:10:01] swyx: ai, which with their E Large model, and Enthopic, of course. It's a very, very competitive race. There are multiple Frontier labs all racing, but there is a clear tier zero Frontier. And then there's like a tier one. It's like, I wish I had everything else. Tier zero is extremely competitive. It's effectively now three horse race between Gemini, uh, Anthropic and OpenAI.[00:10:21] swyx: I would say that people are still holding out a candle for XAI. XAI, I think, for some reason, because their API was very slow to roll out, is not included in these metrics. So it's actually quite hard to put on there. As someone who also does charts, XAI is continually snubbed because they don't work well with the benchmarking people.[00:10:42] swyx: Yeah, yeah, yeah. It's a little trivia for why XAI always gets ignored. The other thing is market share. So these are slides from Sarah. We have it up on the screen. It has gone from very heavily open AI. So we have some numbers and estimates. These are from RAMP. Estimates of open AI market share in [00:11:00] December 2023.[00:11:01] swyx: And this is basically, what is it, GPT being 95 percent of production traffic. And I think if you correlate that with stuff that we asked. Harrison Chase on the LangChain episode, it was true. And then CLAUD 3 launched mid middle of this year. I think CLAUD 3 launched in March, CLAUD 3. 5 Sonnet was in June ish.[00:11:23] swyx: And you can start seeing the market share shift towards opening, uh, towards that topic, uh, very, very aggressively. The more recent one is Gemini. So if I scroll down a little bit, this is an even more recent dataset. So RAM's dataset ends in September 2 2. 2024. Gemini has basically launched a price war at the low end, uh, with Gemini Flash, uh, being basically free for personal use.[00:11:44] swyx: Like, I think people don't understand the free tier. It's something like a billion tokens per day. Unless you're trying to abuse it, you cannot really exhaust your free tier on Gemini. They're really trying to get you to use it. They know they're in like third place, um, fourth place, depending how you, how you count.[00:11:58] swyx: And so they're going after [00:12:00] the Lower tier first, and then, you know, maybe the upper tier later, but yeah, Gemini Flash, according to OpenRouter, is now 50 percent of their OpenRouter requests. Obviously, these are the small requests. These are small, cheap requests that are mathematically going to be more.[00:12:15] swyx: The smart ones obviously are still going to OpenAI. But, you know, it's a very, very big shift in the market. Like basically 2023, 2022, To going into 2024 opening has gone from nine five market share to Yeah. Reasonably somewhere between 50 to 75 market share.[00:12:29] Alessio: Yeah. I'm really curious how ramped does the attribution to the model?[00:12:32] Alessio: If it's API, because I think it's all credit card spin. . Well, but it's all, the credit card doesn't say maybe. Maybe the, maybe when they do expenses, they upload the PDF, but yeah, the, the German I think makes sense. I think that was one of my main 2024 takeaways that like. The best small model companies are the large labs, which is not something I would have thought that the open source kind of like long tail would be like the small model.[00:12:53] swyx: Yeah, different sizes of small models we're talking about here, right? Like so small model here for Gemini is AB, [00:13:00] right? Uh, mini. We don't know what the small model size is, but yeah, it's probably in the double digits or maybe single digits, but probably double digits. The open source community has kind of focused on the one to three B size.[00:13:11] swyx: Mm-hmm . Yeah. Maybe[00:13:12] swyx: zero, maybe 0.5 B uh, that's moon dream and that is small for you then, then that's great. It makes sense that we, we have a range for small now, which is like, may, maybe one to five B. Yeah. I'll even put that at, at, at the high end. And so this includes Gemma from Gemini as well. But also includes the Apple Foundation models, which I think Apple Foundation is 3B.[00:13:32] Alessio: Yeah. No, that's great. I mean, I think in the start small just meant cheap. I think today small is actually a more nuanced discussion, you know, that people weren't really having before.[00:13:43] swyx: Yeah, we can keep going. This is a slide that I smiley disagree with Sarah. She's pointing to the scale SEAL leaderboard. I think the Researchers that I talked with at NeurIPS were kind of positive on this because basically you need private test [00:14:00] sets to prevent contamination.[00:14:02] swyx: And Scale is one of maybe three or four people this year that has really made an effort in doing a credible private test set leaderboard. Llama405B does well compared to Gemini and GPT 40. And I think that's good. I would say that. You know, it's good to have an open model that is that big, that does well on those metrics.[00:14:23] swyx: But anyone putting 405B in production will tell you, if you scroll down a little bit to the artificial analysis numbers, that it is very slow and very expensive to infer. Um, it doesn't even fit on like one node. of, uh, of H100s. Cerebras will be happy to tell you they can serve 4 or 5B on their super large chips.[00:14:42] swyx: But, um, you know, if you need to do anything custom to it, you're still kind of constrained. So, is 4 or 5B really that relevant? Like, I think most people are basically saying that they only use 4 or 5B as a teacher model to distill down to something. Even Meta is doing it. So with Lama 3. [00:15:00] 3 launched, they only launched the 70B because they use 4 or 5B to distill the 70B.[00:15:03] swyx: So I don't know if like open source is keeping up. I think they're the, the open source industrial complex is very invested in telling you that the, if the gap is narrowing, I kind of disagree. I think that the gap is widening with O1. I think there are very, very smart people trying to narrow that gap and they should.[00:15:22] swyx: I really wish them success, but you cannot use a chart that is nearing 100 in your saturation chart. And look, the distance between open source and closed source is narrowing. Of course it's going to narrow because you're near 100. This is stupid. But in metrics that matter, is open source narrowing?[00:15:38] swyx: Probably not for O1 for a while. And it's really up to the open source guys to figure out if they can match O1 or not.[00:15:46] Alessio: I think inference time compute is bad for open source just because, you know, Doc can donate the flops at training time, but he cannot donate the flops at inference time. So it's really hard to like actually keep up on that axis.[00:15:59] Alessio: Big, big business [00:16:00] model shift. So I don't know what that means for the GPU clouds. I don't know what that means for the hyperscalers, but obviously the big labs have a lot of advantage. Because, like, it's not a static artifact that you're putting the compute in. You're kind of doing that still, but then you're putting a lot of computed inference too.[00:16:17] swyx: Yeah, yeah, yeah. Um, I mean, Llama4 will be reasoning oriented. We talked with Thomas Shalom. Um, kudos for getting that episode together. That was really nice. Good, well timed. Actually, I connected with the AI meta guy, uh, at NeurIPS, and, um, yeah, we're going to coordinate something for Llama4. Yeah, yeah,[00:16:32] Alessio: and our friend, yeah.[00:16:33] Alessio: Clara Shi just joined to lead the business agent side. So I'm sure we'll have her on in the new year.[00:16:39] swyx: Yeah. So, um, my comment on, on the business model shift, this is super interesting. Apparently it is wide knowledge that OpenAI wanted more than 6. 6 billion dollars for their fundraise. They wanted to raise, you know, higher, and they did not.[00:16:51] swyx: And what that means is basically like, it's very convenient that we're not getting GPT 5, which would have been a larger pre train. We should have a lot of upfront money. And [00:17:00] instead we're, we're converting fixed costs into variable costs, right. And passing it on effectively to the customer. And it's so much easier to take margin there because you can directly attribute it to like, Oh, you're using this more.[00:17:12] swyx: Therefore you, you pay more of the cost and I'll just slap a margin in there. So like that lets you control your growth margin and like tie your. Your spend, or your sort of inference spend, accordingly. And it's just really interesting to, that this change in the sort of inference paradigm has arrived exactly at the same time that the funding environment for pre training is effectively drying up, kind of.[00:17:36] swyx: I feel like maybe the VCs are very in tune with research anyway, so like, they would have noticed this, but, um, it's just interesting.[00:17:43] Alessio: Yeah, and I was looking back at our yearly recap of last year. Yeah. And the big thing was like the mixed trial price fights, you know, and I think now it's almost like there's nowhere to go, like, you know, Gemini Flash is like basically giving it away for free.[00:17:55] Alessio: So I think this is a good way for the labs to generate more revenue and pass down [00:18:00] some of the compute to the customer. I think they're going to[00:18:02] swyx: keep going. I think that 2, will come.[00:18:05] Alessio: Yeah, I know. Totally. I mean, next year, the first thing I'm doing is signing up for Devin. Signing up for the pro chat GBT.[00:18:12] Alessio: Just to try. I just want to see what does it look like to spend a thousand dollars a month on AI?[00:18:17] swyx: Yes. Yes. I think if your, if your, your job is a, at least AI content creator or VC or, you know, someone who, whose job it is to stay on, stay on top of things, you should already be spending like a thousand dollars a month on, on stuff.[00:18:28] swyx: And then obviously easy to spend, hard to use. You have to actually use. The good thing is that actually Google lets you do a lot of stuff for free now. So like deep research. That they just launched. Uses a ton of inference and it's, it's free while it's in preview.[00:18:45] Alessio: Yeah. They need to put that in Lindy.[00:18:47] Alessio: I've been using Lindy lately. I've been a built a bunch of things once we had flow because I liked the new thing. It's pretty good. I even did a phone call assistant. Um, yeah, they just launched Lindy voice. Yeah, I think once [00:19:00] they get advanced voice mode like capability today, still like speech to text, you can kind of tell.[00:19:06] Alessio: Um, but it's good for like reservations and things like that. So I have a meeting prepper thing. And so[00:19:13] swyx: it's good. Okay. I feel like we've, we've covered a lot of stuff. Uh, I, yeah, I, you know, I think We will go over the individual, uh, talks in a separate episode. Uh, I don't want to take too much time with, uh, this stuff, but that suffice to say that there is a lot of progress in each field.[00:19:28] swyx: Uh, we covered vision. Basically this is all like the audience voting for what they wanted. And then I just invited the best people I could find in each audience, especially agents. Um, Graham, who I talked to at ICML in Vienna, he is currently still number one. It's very hard to stay on top of SweetBench.[00:19:45] swyx: OpenHand is currently still number one. switchbench full, which is the hardest one. He had very good thoughts on agents, which I, which I'll highlight for people. Everyone is saying 2025 is the year of agents, just like they said last year. And, uh, but he had [00:20:00] thoughts on like eight parts of what are the frontier problems to solve in agents.[00:20:03] swyx: And so I'll highlight that talk as well.[00:20:05] Alessio: Yeah. The number six, which is the Hacken agents learn more about the environment, has been a Super interesting to us as well, just to think through, because, yeah, how do you put an agent in an enterprise where most things in an enterprise have never been public, you know, a lot of the tooling, like the code bases and things like that.[00:20:23] Alessio: So, yeah, there's not indexing and reg. Well, yeah, but it's more like. You can't really rag things that are not documented. But people know them based on how they've been doing it. You know, so I think there's almost this like, you know, Oh, institutional knowledge. Yeah, the boring word is kind of like a business process extraction.[00:20:38] Alessio: Yeah yeah, I see. It's like, how do you actually understand how these things are done? I see. Um, and I think today the, the problem is that, Yeah, the agents are, that most people are building are good at following instruction, but are not as good as like extracting them from you. Um, so I think that will be a big unlock just to touch quickly on the Jeff Dean thing.[00:20:55] Alessio: I thought it was pretty, I mean, we'll link it in the, in the things, but. I think the main [00:21:00] focus was like, how do you use ML to optimize the systems instead of just focusing on ML to do something else? Yeah, I think speculative decoding, we had, you know, Eugene from RWKB on the podcast before, like he's doing a lot of that with Fetterless AI.[00:21:12] swyx: Everyone is. I would say it's the norm. I'm a little bit uncomfortable with how much it costs, because it does use more of the GPU per call. But because everyone is so keen on fast inference, then yeah, makes sense.[00:21:24] Alessio: Exactly. Um, yeah, but we'll link that. Obviously Jeff is great.[00:21:30] swyx: Jeff is, Jeff's talk was more, it wasn't focused on Gemini.[00:21:33] swyx: I think people got the wrong impression from my tweet. It's more about how Google approaches ML and uses ML to design systems and then systems feedback into ML. And I think this ties in with Lubna's talk.[00:21:45] Synthetic Data and Future Trends[00:21:45] swyx: on synthetic data where it's basically the story of bootstrapping of humans and AI in AI research or AI in production.[00:21:53] swyx: So her talk was on synthetic data, where like how much synthetic data has grown in 2024 in the pre training side, the post training side, [00:22:00] and the eval side. And I think Jeff then also extended it basically to chips, uh, to chip design. So he'd spend a lot of time talking about alpha chip. And most of us in the audience are like, we're not working on hardware, man.[00:22:11] swyx: Like you guys are great. TPU is great. Okay. We'll buy TPUs.[00:22:14] Alessio: And then there was the earlier talk. Yeah. But, and then we have, uh, I don't know if we're calling them essays. What are we calling these? But[00:22:23] swyx: for me, it's just like bonus for late in space supporters, because I feel like they haven't been getting anything.[00:22:29] swyx: And then I wanted a more high frequency way to write stuff. Like that one I wrote in an afternoon. I think basically we now have an answer to what Ilya saw. It's one year since. The blip. And we know what he saw in 2014. We know what he saw in 2024. We think we know what he sees in 2024. He gave some hints and then we have vague indications of what he saw in 2023.[00:22:54] swyx: So that was the Oh, and then 2016 as well, because of this lawsuit with Elon, OpenAI [00:23:00] is publishing emails from Sam's, like, his personal text messages to Siobhan, Zelis, or whatever. So, like, we have emails from Ilya saying, this is what we're seeing in OpenAI, and this is why we need to scale up GPUs. And I think it's very prescient in 2016 to write that.[00:23:16] swyx: And so, like, it is exactly, like, basically his insights. It's him and Greg, basically just kind of driving the scaling up of OpenAI, while they're still playing Dota. They're like, no, like, we see the path here.[00:23:30] Alessio: Yeah, and it's funny, yeah, they even mention, you know, we can only train on 1v1 Dota. We need to train on 5v5, and that takes too many GPUs.[00:23:37] Alessio: Yeah,[00:23:37] swyx: and at least for me, I can speak for myself, like, I didn't see the path from Dota to where we are today. I think even, maybe if you ask them, like, they wouldn't necessarily draw a straight line. Yeah,[00:23:47] Alessio: no, definitely. But I think like that was like the whole idea of almost like the RL and we talked about this with Nathan on his podcast.[00:23:55] Alessio: It's like with RL, you can get very good at specific things, but then you can't really like generalize as much. And I [00:24:00] think the language models are like the opposite, which is like, you're going to throw all this data at them and scale them up, but then you really need to drive them home on a specific task later on.[00:24:08] Alessio: And we'll talk about the open AI reinforcement, fine tuning, um, announcement too, and all of that. But yeah, I think like scale is all you need. That's kind of what Elia will be remembered for. And I think just maybe to clarify on like the pre training is over thing that people love to tweet. I think the point of the talk was like everybody, we're scaling these chips, we're scaling the compute, but like the second ingredient which is data is not scaling at the same rate.[00:24:35] Alessio: So it's not necessarily pre training is over. It's kind of like What got us here won't get us there. In his email, he predicted like 10x growth every two years or something like that. And I think maybe now it's like, you know, you can 10x the chips again, but[00:24:49] swyx: I think it's 10x per year. Was it? I don't know.[00:24:52] Alessio: Exactly. And Moore's law is like 2x. So it's like, you know, much faster than that. And yeah, I like the fossil fuel of AI [00:25:00] analogy. It's kind of like, you know, the little background tokens thing. So the OpenAI reinforcement fine tuning is basically like, instead of fine tuning on data, you fine tune on a reward model.[00:25:09] Alessio: So it's basically like, instead of being data driven, it's like task driven. And I think people have tasks to do, they don't really have a lot of data. So I'm curious to see how that changes, how many people fine tune, because I think this is what people run into. It's like, Oh, you can fine tune llama. And it's like, okay, where do I get the data?[00:25:27] Alessio: To fine tune it on, you know, so it's great that we're moving the thing. And then I really like he had this chart where like, you know, the brain mass and the body mass thing is basically like mammals that scaled linearly by brain and body size, and then humans kind of like broke off the slope. So it's almost like maybe the mammal slope is like the pre training slope.[00:25:46] Alessio: And then the post training slope is like the, the human one.[00:25:49] swyx: Yeah. I wonder what the. I mean, we'll know in 10 years, but I wonder what the y axis is for, for Ilya's SSI. We'll try to get them on.[00:25:57] Alessio: Ilya, if you're listening, you're [00:26:00] welcome here. Yeah, and then he had, you know, what comes next, like agent, synthetic data, inference, compute, I thought all of that was like that.[00:26:05] Alessio: I don't[00:26:05] swyx: think he was dropping any alpha there. Yeah, yeah, yeah.[00:26:07] Alessio: Yeah. Any other new reps? Highlights?[00:26:10] swyx: I think that there was comparatively a lot more work. Oh, by the way, I need to plug that, uh, my friend Yi made this, like, little nice paper. Yeah, that was really[00:26:20] swyx: nice.[00:26:20] swyx: Uh, of, uh, of, like, all the, he's, she called it must read papers of 2024.[00:26:26] swyx: So I laid out some of these at NeurIPS, and it was just gone. Like, everyone just picked it up. Because people are dying for, like, little guidance and visualizations And so, uh, I thought it was really super nice that we got there.[00:26:38] Alessio: Should we do a late in space book for each year? Uh, I thought about it. For each year we should.[00:26:42] Alessio: Coffee table book. Yeah. Yeah. Okay. Put it in the will. Hi, Will. By the way, we haven't introduced you. He's our new, you know, general organist, Jamie. You need to[00:26:52] swyx: pull up more things. One thing I saw that, uh, Okay, one fun one, and then one [00:27:00] more general one. So the fun one is this paper on agent collusion. This is a paper on steganography.[00:27:06] swyx: This is secret collusion among AI agents, multi agent deception via steganography. I tried to go to NeurIPS in order to find these kinds of papers because the real reason Like NeurIPS this year has a lottery system. A lot of people actually even go and don't buy tickets because they just go and attend the side events.[00:27:22] swyx: And then also the people who go and end up crowding around the most popular papers, which you already know and already read them before you showed up to NeurIPS. So the only reason you go there is to talk to the paper authors, but there's like something like 10, 000 other. All these papers out there that, you know, are just people's work that they, that they did on the air and they failed to get attention for one reason or another.[00:27:42] swyx: And this was one of them. Uh, it was like all the way at the back. And this is a deep mind paper that actually focuses on collusion between AI agents, uh, by hiding messages in the text that they generate. Uh, so that's what steganography is. So a very simple example would be the first letter of every word.[00:27:57] swyx: If you Pick that out, you know, and the code sends a [00:28:00] different message than that. But something I've always emphasized is to LLMs, we read left to right. LLMs can read up, down, sideways, you know, in random character order. And it's the same to them as it is to us. So if we were ever to get You know, self motivated, underlined LLMs that we're trying to collaborate to take over the planet.[00:28:19] swyx: This would be how they do it. They spread messages among us in the messages that we generate. And he developed a scaling law for that. So he marked, I'm showing it on screen right now, the emergence of this phenomenon. Basically, for example, for Cypher encoding, GPT 2, Lama 2, mixed trial, GPT 3. 5, zero capabilities, and sudden 4.[00:28:40] swyx: And this is the kind of Jason Wei type emergence properties that people kind of look for. I think what made this paper stand out as well, so he developed the benchmark for steganography collusion, and he also focused on shelling point collusion, which is very low coordination. For agreeing on a decoding encoding format, you kind of need to have some [00:29:00] agreement on that.[00:29:00] swyx: But, but shelling point means like very, very low or almost no coordination. So for example, if I, if I ask someone, if the only message I give you is meet me in New York and you're not aware. Or when you would probably meet me at Grand Central Station. That is the Grand Central Station is a shelling point.[00:29:16] swyx: And it's probably somewhere, somewhere during the day. That is the shelling point of New York is Grand Central. To that extent, shelling points for steganography are things like the, the, the common decoding methods that we talked about. It will be interesting at some point in the future when we are worried about alignment.[00:29:30] swyx: It is not interesting today, but it's interesting that DeepMind is already thinking about this.[00:29:36] Alessio: I think that's like one of the hardest things about NeurIPS. It's like the long tail. I[00:29:41] swyx: found a pricing guy. I'm going to feature him on the podcast. Basically, this guy from NVIDIA worked out the optimal pricing for language models.[00:29:51] swyx: It's basically an econometrics paper at NeurIPS, where everyone else is talking about GPUs. And the guy with the GPUs is[00:29:57] Alessio: talking[00:29:57] swyx: about economics instead. [00:30:00] That was the sort of fun one. So the focus I saw is that model papers at NeurIPS are kind of dead. No one really presents models anymore. It's just data sets.[00:30:12] swyx: This is all the grad students are working on. So like there was a data sets track and then I was looking around like, I was like, you don't need a data sets track because every paper is a data sets paper. And so data sets and benchmarks, they're kind of flip sides of the same thing. So Yeah. Cool. Yeah, if you're a grad student, you're a GPU boy, you kind of work on that.[00:30:30] swyx: And then the, the sort of big model that people walk around and pick the ones that they like, and then they use it in their models. And that's, that's kind of how it develops. I, I feel like, um, like, like you didn't last year, you had people like Hao Tian who worked on Lava, which is take Lama and add Vision.[00:30:47] swyx: And then obviously actually I hired him and he added Vision to Grok. Now he's the Vision Grok guy. This year, I don't think there was any of those.[00:30:55] Alessio: What were the most popular, like, orals? Last year it was like the [00:31:00] Mixed Monarch, I think, was like the most attended. Yeah, uh, I need to look it up. Yeah, I mean, if nothing comes to mind, that's also kind of like an answer in a way.[00:31:10] Alessio: But I think last year there was a lot of interest in, like, furthering models and, like, different architectures and all of that.[00:31:16] swyx: I will say that I felt the orals, oral picks this year were not very good. Either that or maybe it's just a So that's the highlight of how I have changed in terms of how I view papers.[00:31:29] swyx: So like, in my estimation, two of the best papers in this year for datasets or data comp and refined web or fine web. These are two actually industrially used papers, not highlighted for a while. I think DCLM got the spotlight, FineWeb didn't even get the spotlight. So like, it's just that the picks were different.[00:31:48] swyx: But one thing that does get a lot of play that a lot of people are debating is the role that's scheduled. This is the schedule free optimizer paper from Meta from Aaron DeFazio. And this [00:32:00] year in the ML community, there's been a lot of chat about shampoo, soap, all the bathroom amenities for optimizing your learning rates.[00:32:08] swyx: And, uh, most people at the big labs are. Who I asked about this, um, say that it's cute, but it's not something that matters. I don't know, but it's something that was discussed and very, very popular. 4Wars[00:32:19] Alessio: of AI recap maybe, just quickly. Um, where do you want to start? Data?[00:32:26] swyx: So to remind people, this is the 4Wars piece that we did as one of our earlier recaps of this year.[00:32:31] swyx: And the belligerents are on the left, journalists, writers, artists, anyone who owns IP basically, New York Times, Stack Overflow, Reddit, Getty, Sarah Silverman, George RR Martin. Yeah, and I think this year we can add Scarlett Johansson to that side of the fence. So anyone suing, open the eye, basically. I actually wanted to get a snapshot of all the lawsuits.[00:32:52] swyx: I'm sure some lawyer can do it. That's the data quality war. On the right hand side, we have the synthetic data people, and I think we talked about Lumna's talk, you know, [00:33:00] really showing how much synthetic data has come along this year. I think there was a bit of a fight between scale. ai and the synthetic data community, because scale.[00:33:09] swyx: ai published a paper saying that synthetic data doesn't work. Surprise, surprise, scale. ai is the leading vendor of non synthetic data. Only[00:33:17] Alessio: cage free annotated data is useful.[00:33:21] swyx: So I think there's some debate going on there, but I don't think it's much debate anymore that at least synthetic data, for the reasons that are blessed in Luna's talk, Makes sense.[00:33:32] swyx: I don't know if you have any perspectives there.[00:33:34] Alessio: I think, again, going back to the reinforcement fine tuning, I think that will change a little bit how people think about it. I think today people mostly use synthetic data, yeah, for distillation and kind of like fine tuning a smaller model from like a larger model.[00:33:46] Alessio: I'm not super aware of how the frontier labs use it outside of like the rephrase, the web thing that Apple also did. But yeah, I think it'll be. Useful. I think like whether or not that gets us the big [00:34:00] next step, I think that's maybe like TBD, you know, I think people love talking about data because it's like a GPU poor, you know, I think, uh, synthetic data is like something that people can do, you know, so they feel more opinionated about it compared to, yeah, the optimizers stuff, which is like,[00:34:17] swyx: they don't[00:34:17] Alessio: really work[00:34:18] swyx: on.[00:34:18] swyx: I think that there is an angle to the reasoning synthetic data. So this year, we covered in the paper club, the star series of papers. So that's star, Q star, V star. It basically helps you to synthesize reasoning steps, or at least distill reasoning steps from a verifier. And if you look at the OpenAI RFT, API that they released, or that they announced, basically they're asking you to submit graders, or they choose from a preset list of graders.[00:34:49] swyx: Basically It feels like a way to create valid synthetic data for them to fine tune their reasoning paths on. Um, so I think that is another angle where it starts to make sense. And [00:35:00] so like, it's very funny that basically all the data quality wars between Let's say the music industry or like the newspaper publishing industry or the textbooks industry on the big labs.[00:35:11] swyx: It's all of the pre training era. And then like the new era, like the reasoning era, like nobody has any problem with all the reasoning, especially because it's all like sort of math and science oriented with, with very reasonable graders. I think the more interesting next step is how does it generalize beyond STEM?[00:35:27] swyx: We've been using O1 for And I would say like for summarization and creative writing and instruction following, I think it's underrated. I started using O1 in our intro songs before we killed the intro songs, but it's very good at writing lyrics. You know, I can actually say like, I think one of the O1 pro demos.[00:35:46] swyx: All of these things that Noam was showing was that, you know, you can write an entire paragraph or three paragraphs without using the letter A, right?[00:35:53] Creative Writing with AI[00:35:53] swyx: So like, like literally just anything instead of token, like not even token level, character level manipulation and [00:36:00] counting and instruction following. It's, uh, it's very, very strong.[00:36:02] swyx: And so no surprises when I ask it to rhyme, uh, and to, to create song lyrics, it's going to do that very much better than in previous models. So I think it's underrated for creative writing.[00:36:11] Alessio: Yeah.[00:36:12] Legal and Ethical Issues in AI[00:36:12] Alessio: What do you think is the rationale that they're going to have in court when they don't show you the thinking traces of O1, but then they want us to, like, they're getting sued for using other publishers data, you know, but then on their end, they're like, well, you shouldn't be using my data to then train your model.[00:36:29] Alessio: So I'm curious to see how that kind of comes. Yeah, I mean, OPA has[00:36:32] swyx: many ways to publish, to punish people without bringing, taking them to court. Already banned ByteDance for distilling their, their info. And so anyone caught distilling the chain of thought will be just disallowed to continue on, on, on the API.[00:36:44] swyx: And it's fine. It's no big deal. Like, I don't even think that's an issue at all, just because the chain of thoughts are pretty well hidden. Like you have to work very, very hard to, to get it to leak. And then even when it leaks the chain of thought, you don't know if it's, if it's [00:37:00] The bigger concern is actually that there's not that much IP hiding behind it, that Cosign, which we talked about, we talked to him on Dev Day, can just fine tune 4.[00:37:13] swyx: 0 to beat 0. 1 Cloud SONET so far is beating O1 on coding tasks without, at least O1 preview, without being a reasoning model, same for Gemini Pro or Gemini 2. 0. So like, how much is reasoning important? How much of a moat is there in this, like, All of these are proprietary sort of training data that they've presumably accomplished.[00:37:34] swyx: Because even DeepSeek was able to do it. And they had, you know, two months notice to do this, to do R1. So, it's actually unclear how much moat there is. Obviously, you know, if you talk to the Strawberry team, they'll be like, yeah, I mean, we spent the last two years doing this. So, we don't know. And it's going to be Interesting because there'll be a lot of noise from people who say they have inference time compute and actually don't because they just have fancy chain of thought.[00:38:00][00:38:00] swyx: And then there's other people who actually do have very good chain of thought. And you will not see them on the same level as OpenAI because OpenAI has invested a lot in building up the mythology of their team. Um, which makes sense. Like the real answer is somewhere in between.[00:38:13] Alessio: Yeah, I think that's kind of like the main data war story developing.[00:38:18] The Data War: GPU Poor vs. GPU Rich[00:38:18] Alessio: GPU poor versus GPU rich. Yeah. Where do you think we are? I think there was, again, going back to like the small model thing, there was like a time in which the GPU poor were kind of like the rebel faction working on like these models that were like open and small and cheap. And I think today people don't really care as much about GPUs anymore.[00:38:37] Alessio: You also see it in the price of the GPUs. Like, you know, that market is kind of like plummeted because there's people don't want to be, they want to be GPU free. They don't even want to be poor. They just want to be, you know, completely without them. Yeah. How do you think about this war? You[00:38:52] swyx: can tell me about this, but like, I feel like the, the appetite for GPU rich startups, like the, you know, the, the funding plan is we will raise 60 million and [00:39:00] we'll give 50 of that to NVIDIA.[00:39:01] swyx: That is gone, right? Like, no one's, no one's pitching that. This was literally the plan, the exact plan of like, I can name like four or five startups, you know, this time last year. So yeah, GPU rich startups gone.[00:39:12] The Rise of GPU Ultra Rich[00:39:12] swyx: But I think like, The GPU ultra rich, the GPU ultra high net worth is still going. So, um, now we're, you know, we had Leopold's essay on the trillion dollar cluster.[00:39:23] swyx: We're not quite there yet. We have multiple labs, um, you know, XAI very famously, you know, Jensen Huang praising them for being. Best boy number one in spinning up 100, 000 GPU cluster in like 12 days or something. So likewise at Meta, likewise at OpenAI, likewise at the other labs as well. So like the GPU ultra rich are going to keep doing that because I think partially it's an article of faith now that you just need it.[00:39:46] swyx: Like you don't even know what it's going to, what you're going to use it for. You just, you just need it. And it makes sense that if, especially if we're going into. More researchy territory than we are. So let's say 2020 to 2023 was [00:40:00] let's scale big models territory because we had GPT 3 in 2020 and we were like, okay, we'll go from 1.[00:40:05] swyx: 75b to 1. 8b, 1. 8t. And that was GPT 3 to GPT 4. Okay, that's done. As far as everyone is concerned, Opus 3. 5 is not coming out, GPT 4. 5 is not coming out, and Gemini 2, we don't have Pro, whatever. We've hit that wall. Maybe I'll call it the 2 trillion perimeter wall. We're not going to 10 trillion. No one thinks it's a good idea, at least from training costs, from the amount of data, or at least the inference.[00:40:36] swyx: Would you pay 10x the price of GPT Probably not. Like, like you want something else that, that is at least more useful. So it makes sense that people are pivoting in terms of their inference paradigm.[00:40:47] Emerging Trends in AI Models[00:40:47] swyx: And so when it's more researchy, then you actually need more just general purpose compute to mess around with, uh, at the exact same time that production deployments of the old, the previous paradigm is still ramping up,[00:40:58] swyx: um,[00:40:58] swyx: uh, pretty aggressively.[00:40:59] swyx: So [00:41:00] it makes sense that the GPU rich are growing. We have now interviewed both together and fireworks and replicates. Uh, we haven't done any scale yet. But I think Amazon, maybe kind of a sleeper one, Amazon, in a sense of like they, at reInvent, I wasn't expecting them to do so well, but they are now a foundation model lab.[00:41:18] swyx: It's kind of interesting. Um, I think, uh, you know, David went over there and started just creating models.[00:41:25] Alessio: Yeah, I mean, that's the power of prepaid contracts. I think like a lot of AWS customers, you know, they do this big reserve instance contracts and now they got to use their money. That's why so many startups.[00:41:37] Alessio: Get bought through the AWS marketplace so they can kind of bundle them together and prefer pricing.[00:41:42] swyx: Okay, so maybe GPU super rich doing very well, GPU middle class dead, and then GPU[00:41:48] Alessio: poor. I mean, my thing is like, everybody should just be GPU rich. There shouldn't really be, even the GPU poorest, it's like, does it really make sense to be GPU poor?[00:41:57] Alessio: Like, if you're GPU poor, you should just use the [00:42:00] cloud. Yes, you know, and I think there might be a future once we kind of like figure out what the size and shape of these models is where like the tiny box and these things come to fruition where like you can be GPU poor at home. But I think today is like, why are you working so hard to like get these models to run on like very small clusters where it's like, It's so cheap to run them.[00:42:21] Alessio: Yeah, yeah,[00:42:22] swyx: yeah. I think mostly people think it's cool. People think it's a stepping stone to scaling up. So they aspire to be GPU rich one day and they're working on new methods. Like news research, like probably the most deep tech thing they've done this year is Distro or whatever the new name is.[00:42:38] swyx: There's a lot of interest in heterogeneous computing, distributed computing. I tend generally to de emphasize that historically, but it may be coming to a time where it is starting to be relevant. I don't know. You know, SF compute launched their compute marketplace this year, and like, who's really using that?[00:42:53] swyx: Like, it's a bunch of small clusters, disparate types of compute, and if you can make that [00:43:00] useful, then that will be very beneficial to the broader community, but maybe still not the source of frontier models. It's just going to be a second tier of compute that is unlocked for people, and that's fine. But yeah, I mean, I think this year, I would say a lot more on device, We are, I now have Apple intelligence on my phone.[00:43:19] swyx: Doesn't do anything apart from summarize my notifications. But still, not bad. Like, it's multi modal.[00:43:25] Alessio: Yeah, the notification summaries are so and so in my experience.[00:43:29] swyx: Yeah, but they add, they add juice to life. And then, um, Chrome Nano, uh, Gemini Nano is coming out in Chrome. Uh, they're still feature flagged, but you can, you can try it now if you, if you use the, uh, the alpha.[00:43:40] swyx: And so, like, I, I think, like, you know, We're getting the sort of GPU poor version of a lot of these things coming out, and I think it's like quite useful. Like Windows as well, rolling out RWKB in sort of every Windows department is super cool. And I think the last thing that I never put in this GPU poor war, that I think I should now, [00:44:00] is the number of startups that are GPU poor but still scaling very well, as sort of wrappers on top of either a foundation model lab, or GPU Cloud.[00:44:10] swyx: GPU Cloud, it would be Suno. Suno, Ramp has rated as one of the top ranked, fastest growing startups of the year. Um, I think the last public number is like zero to 20 million this year in ARR and Suno runs on Moto. So Suno itself is not GPU rich, but they're just doing the training on, on Moto, uh, who we've also talked to on, on the podcast.[00:44:31] swyx: The other one would be Bolt, straight cloud wrapper. And, and, um, Again, another, now they've announced 20 million ARR, which is another step up from our 8 million that we put on the title. So yeah, I mean, it's crazy that all these GPU pores are finding a way while the GPU riches are also finding a way. And then the only failures, I kind of call this the GPU smiling curve, where the edges do well, because you're either close to the machines, and you're like [00:45:00] number one on the machines, or you're like close to the customers, and you're number one on the customer side.[00:45:03] swyx: And the people who are in the middle. Inflection, um, character, didn't do that great. I think character did the best of all of them. Like, you have a note in here that we apparently said that character's price tag was[00:45:15] Alessio: 1B.[00:45:15] swyx: Did I say that?[00:45:16] Alessio: Yeah. You said Google should just buy them for 1B. I thought it was a crazy number.[00:45:20] Alessio: Then they paid 2. 7 billion. I mean, for like,[00:45:22] swyx: yeah.[00:45:22] Alessio: What do you pay for node? Like, I don't know what the game world was like. Maybe the starting price was 1B. I mean, whatever it was, it worked out for everybody involved.[00:45:31] The Multi-Modality War[00:45:31] Alessio: Multimodality war. And this one, we never had text to video in the first version, which now is the hottest.[00:45:37] swyx: Yeah, I would say it's a subset of image, but yes.[00:45:40] Alessio: Yeah, well, but I think at the time it wasn't really something people were doing, and now we had VO2 just came out yesterday. Uh, Sora was released last month, last week. I've not tried Sora, because the day that I tried, it wasn't, yeah. I[00:45:54] swyx: think it's generally available now, you can go to Sora.[00:45:56] swyx: com and try it. Yeah, they had[00:45:58] Alessio: the outage. Which I [00:46:00] think also played a part into it. Small things. Yeah. What's the other model that you posted today that was on Replicate? Video or OneLive?[00:46:08] swyx: Yeah. Very, very nondescript name, but it is from Minimax, which I think is a Chinese lab. The Chinese labs do surprisingly well at the video models.[00:46:20] swyx: I'm not sure it's actually Chinese. I don't know. Hold me up to that. Yep. China. It's good. Yeah, the Chinese love video. What can I say? They have a lot of training data for video. Or a more relaxed regulatory environment.[00:46:37] Alessio: Uh, well, sure, in some way. Yeah, I don't think there's much else there. I think like, you know, on the image side, I think it's still open.[00:46:45] Alessio: Yeah, I mean,[00:46:46] swyx: 11labs is now a unicorn. So basically, what is multi modality war? Multi modality war is, do you specialize in a single modality, right? Or do you have GodModel that does all the modalities? So this is [00:47:00] definitely still going, in a sense of 11 labs, you know, now Unicorn, PicoLabs doing well, they launched Pico 2.[00:47:06] swyx: 0 recently, HeyGen, I think has reached 100 million ARR, Assembly, I don't know, but they have billboards all over the place, so I assume they're doing very, very well. So these are all specialist models, specialist models and specialist startups. And then there's the big labs who are doing the sort of all in one play.[00:47:24] swyx: And then here I would highlight Gemini 2 for having native image output. Have you seen the demos? Um, yeah, it's, it's hard to keep up. Literally they launched this last week and a shout out to Paige Bailey, who came to the Latent Space event to demo on the day of launch. And she wasn't prepared. She was just like, I'm just going to show you.[00:47:43] swyx: So they have voice. They have, you know, obviously image input, and then they obviously can code gen and all that. But the new one that OpenAI and Meta both have but they haven't launched yet is image output. So you can literally, um, I think their demo video was that you put in an image of a [00:48:00] car, and you ask for minor modifications to that car.[00:48:02] swyx: They can generate you that modification exactly as you asked. So there's no need for the stable diffusion or comfy UI workflow of like mask here and then like infill there in paint there and all that, all that stuff. This is small model nonsense. Big model people are like, huh, we got you in as everything in the transformer.[00:48:21] swyx: This is the multimodality war, which is, do you, do you bet on the God model or do you string together a whole bunch of, uh, Small models like a, like a chump. Yeah,[00:48:29] Alessio: I don't know, man. Yeah, that would be interesting. I mean, obviously I use Midjourney for all of our thumbnails. Um, they've been doing a ton on the product, I would say.[00:48:38] Alessio: They launched a new Midjourney editor thing. They've been doing a ton. Because I think, yeah, the motto is kind of like, Maybe, you know, people say black forest, the black forest models are better than mid journey on a pixel by pixel basis. But I think when you put it, put it together, have you tried[00:48:53] swyx: the same problems on black forest?[00:48:55] Alessio: Yes. But the problem is just like, you know, on black forest, it generates one image. And then it's like, you got to [00:49:00] regenerate. You don't have all these like UI things. Like what I do, no, but it's like time issue, you know, it's like a mid[00:49:06] swyx: journey. Call the API four times.[00:49:08] Alessio: No, but then there's no like variate.[00:49:10] Alessio: Like the good thing about mid journey is like, you just go in there and you're cooking. There's a lot of stuff that just makes it really easy. And I think people underestimate that. Like, it's not really a skill issue, because I'm paying mid journey, so it's a Black Forest skill issue, because I'm not paying them, you know?[00:49:24] Alessio: Yeah,[00:49:25] swyx: so, okay, so, uh, this is a UX thing, right? Like, you, you, you understand that, at least, we think that Black Forest should be able to do all that stuff. I will also shout out, ReCraft has come out, uh, on top of the image arena that, uh, artificial analysis has done, has apparently, uh, Flux's place. Is this still true?[00:49:41] swyx: So, Artificial Analysis is now a company. I highlighted them I think in one of the early AI Newses of the year. And they have launched a whole bunch of arenas. So, they're trying to take on LM Arena, Anastasios and crew. And they have an image arena. Oh yeah, Recraft v3 is now beating Flux 1. 1. Which is very surprising [00:50:00] because Flux And Black Forest Labs are the old stable diffusion crew who left stability after, um, the management issues.[00:50:06] swyx: So Recurve has come from nowhere to be the top image model. Uh, very, very strange. I would also highlight that Grok has now launched Aurora, which is, it's very interesting dynamics between Grok and Black Forest Labs because Grok's images were originally launched, uh, in partnership with Black Forest Labs as a, as a thin wrapper.[00:50:24] swyx: And then Grok was like, no, we'll make our own. And so they've made their own. I don't know, there are no APIs or benchmarks about it. They just announced it. So yeah, that's the multi modality war. I would say that so far, the small model, the dedicated model people are winning, because they are just focused on their tasks.[00:50:42] swyx: But the big model, People are always catching up. And the moment I saw the Gemini 2 demo of image editing, where I can put in an image and just request it and it does, that's how AI should work. Not like a whole bunch of complicated steps. So it really is something. And I think one frontier that we haven't [00:51:00] seen this year, like obviously video has done very well, and it will continue to grow.[00:51:03] swyx: You know, we only have Sora Turbo today, but at some point we'll get full Sora. Oh, at least the Hollywood Labs will get Fulsora. We haven't seen video to audio, or video synced to audio. And so the researchers that I talked to are already starting to talk about that as the next frontier. But there's still maybe like five more years of video left to actually be Soda.[00:51:23] swyx: I would say that Gemini's approach Compared to OpenAI, Gemini seems, or DeepMind's approach to video seems a lot more fully fledged than OpenAI. Because if you look at the ICML recap that I published that so far nobody has listened to, um, that people have listened to it. It's just a different, definitely different audience.[00:51:43] swyx: It's only seven hours long. Why are people not listening? It's like everything in Uh, so, so DeepMind has, is working on Genie. They also launched Genie 2 and VideoPoet. So, like, they have maybe four years advantage on world modeling that OpenAI does not have. Because OpenAI basically only started [00:52:00] Diffusion Transformers last year, you know, when they hired, uh, Bill Peebles.[00:52:03] swyx: So, DeepMind has, has a bit of advantage here, I would say, in, in, in showing, like, the reason that VO2, while one, They cherry pick their videos. So obviously it looks better than Sora, but the reason I would believe that VO2, uh, when it's fully launched will do very well is because they have all this background work in video that they've done for years.[00:52:22] swyx: Like, like last year's NeurIPS, I already was interviewing some of their video people. I forget their model name, but for, for people who are dedicated fans, they can go to NeurIPS 2023 and see, see that paper.[00:52:32] Alessio: And then last but not least, the LLMOS. We renamed it to Ragops, formerly known as[00:52:39] swyx: Ragops War. I put the latest chart on the Braintrust episode.[00:52:43] swyx: I think I'm going to separate these essays from the episode notes. So the reason I used to do that, by the way, is because I wanted to show up on Hacker News. I wanted the podcast to show up on Hacker News. So I always put an essay inside of there because Hacker News people like to read and not listen.[00:52:58] Alessio: So episode essays,[00:52:59] swyx: I remember [00:53:00] purchasing them separately. You say Lanchain Llama Index is still growing.[00:53:03] Alessio: Yeah, so I looked at the PyPy stats, you know. I don't care about stars. On PyPy you see Do you want to share your screen? Yes. I prefer to look at actual downloads, not at stars on GitHub. So if you look at, you know, Lanchain still growing.[00:53:20] Alessio: These are the last six months. Llama Index still growing. What I've basically seen is like things that, One, obviously these things have A commercial product. So there's like people buying this and sticking with it versus kind of hopping in between things versus, you know, for example, crew AI, not really growing as much.[00:53:38] Alessio: The stars are growing. If you look on GitHub, like the stars are growing, but kind of like the usage is kind of like flat. In the last six months, have they done some[00:53:4

god ceo new york amazon spotify time world europe google ai china apple vision pr voice future speaking san francisco new york times phd video thinking chinese simple data predictions elon musk iphone surprise impact legal code chatgpt tesla reflecting memory ga discord busy reddit lgbt cloud flash stem honestly ab pros jeff bezos windows excited researchers unicorns lower ip tackling sort survey insane tier cto vc whispers applications doc signing seal fireworks f1 genie academic openai sf gemini organizing nvidia ux api assembly davos frontier chrome makes scarlett johansson ui mm turbo gpt bash soda aws ml lama dropbox mosaic creative writing github drafting reinvent canvas 1b bolt apis lava ruler exact stripe dev pico strawberry hundred wwdc vm sander bt flux vcs taiwanese 200k moto arr gartner opus assumption sora google docs nemo parting sam altman blackwell llm google drive sombra gpu opa tbd ramp 3b elia elo agi gnome 5b estimates midjourney bytedance leopold dota ciso haiku dx sarah silverman coursera rag gpus sonnets george rr martin cypher quill getty cobalt sdks deepmind ilya perplexity noam grok sheesh v2 ttc alessio future trends anthropic lms satya r1 ssi stack overflow 8b rl emerging trends itc theoretically sota vo2 yi replicate suno mistral veo black forest inflection graphql xai aitor brain trust databricks gpts chinchillas adept nosql mcp jensen huang grand central ai models grand central station hacker news zep hacken ethical issues cosign claud ai news gpc distro lubna autogpt neo4j tpu o3 jeremy howard gbt o1 gpd quent heygen gradients exa loras 70b langchain minimax neurips 400b jeff dean 128k elos gemini pro cerebras code interpreter icml john franco ai winter lstm r1s aws reinvent muser latent space pypy dan gross nova pro paige bailey noam brown quiet capital john frankel
RadioDotNet
Долгожданный xUnit 3, новинки Npgsql и MongoDB в EF, .NET-рынок

RadioDotNet

Play Episode Listen Later Dec 22, 2024 111:29


Подкаст RadioDotNet выпуск №106 от 23 декабря 2024 года Сайт подкаста: radio.dotnet.ru Boosty (₽): boosty.to/RadioDotNet Темы: [00:02:35] — Npgsql EF 9 Release npgsql.org/efcore/release-notes/9.0 [00:10:30] — What's new in MongoDB EF Core Provider devblogs.microsoft.com/dotnet/mongodb-ef-core-provider-whats-new [00:18:50] — xUnit.net v3 Release xunit.net/releases/v3/1.0.0 [00:56:00] — Lesser known CLR GC Handles awise.us/gc-handle [01:06:50] — StackOverflowException vs OutOfMemoryException sergeyteplyakov.github.io/Blog/csharp/StackOverflow_vs_OutOfMemo... [01:15:35] — Исследование рынка .NET разработки, анализ и прогнозы habr.com/ru/articles/857042 [01:41:40] — Кратко о разном linqpad.net/LINQPad8Mac github.blog/changelog/2024-12-18-announcing-github... podlodka.io/395 youtube.com/watch youtube.com/watch Фоновая музыка: Максим Аршинов «Pensive yeti.0.1»

Develpreneur: Become a Better Developer and Entrepreneur
AI Habits to Embrace for Efficiency and Growth

Develpreneur: Become a Better Developer and Entrepreneur

Play Episode Listen Later Dec 19, 2024 22:03


In the latest Building Better Developers podcast season, Rob Broadhead and Michael Meloche dive deep into the fascinating world of Artificial Intelligence (AI) and its impact on developers' habits. In this episode, the focus isn't just on using AI but on leveraging it to enhance productivity, creativity, and problem-solving capabilities. The AI Revolution: Why Developers Should Care AI is no longer a futuristic concept—it's an integral part of the developer's toolbox. Tools like ChatGPT, Microsoft Copilot, and IntelliJ IDEA's AI-powered suggestions transform workflows from generating boilerplate code to aiding testing and planning. As Rob Broadhead pointed out, AI's potential extends far beyond novelty. It's about using AI to “do better what you are already doing” rather than treating it as a crutch. AI-driven tools simplify repetitive tasks, allowing developers to focus on higher-value activities. Whether generating test cases, summarizing meetings, or suggesting optimal solutions for coding challenges, AI helps reduce cognitive load and time spent on mundane tasks. Practical Uses of AI in Development Code Generation and Optimization: AI tools like ChatGPT, OpenAI Whisper, Amazon's AI can generate code snippets based on developer input, saving developers significant time writing boilerplate code. These tools excel at providing a starting point, especially when developers are working on stubs or need inspiration for how to approach a particular problem. Testing Automation: Quality assurance is a critical area where AI shines. AI tools can auto-generate test cases for software, even for teams that might not have robust testing processes. AI can fill gaps in testing coverage for beginners or teams under pressure, providing a baseline of quality assurance. Documentation and Summaries: Tools like Descript and Zoom's AI features allow for the transcription and summarization of meetings, making it easier to keep track of key points and actions. These capabilities free up developers from manual note-taking and help them focus on implementing actionable insights. Planning and Scheduling: AI aids in project management by helping developers optimize their schedules, plan tasks, and streamline workflows. Michael highlighted the importance of AI for meeting prep and planning ceremonies in Agile environments. The Challenges of AI Adoption While the benefits are clear, the podcast also stresses caution. Beginners, in particular, need to verify AI-generated outputs to ensure they align with best practices and project requirements. Rob and Michael recommend cross-checking AI responses with trusted sources like Stack Overflow or GitHub discussions to avoid going down unproductive rabbit holes. Michael compared the process to early voice recognition tools like Dragon NaturallySpeaking, where the user had to train the software to achieve better results. Similarly, AI today requires user input and feedback to improve accuracy and utility. Building Habits with AI: A Developer's Challenge This episode's challenge encourages developers to explore AI daily: Identify a problem or task—whether coding, debugging, or planning. Use an AI tool to suggest solutions or assist with the task. Evaluate and refine the AI's suggestions to learn how to maximize its effectiveness. The goal isn't to rely entirely on AI but to build a habit of thoughtfully integrating AI into workflows. Over time, this practice will help developers identify areas where AI can save time and effort without compromising quality. The Future of AI in Development The podcast explores how AI is evolving, with companies like OpenAI, Google, and JetBrains pushing the boundaries. AI tools are now capable of understanding context, improving accessibility, and automating complex processes. As Rob noted, “Automation intelligence” is the real power of AI, allowing developers to focus on innovation while repetitive tasks are handled seamlessly. Key Takeaways for Developers Embrace AI as a tool, not a replacement: Use AI to augment your skills, not substitute for them. Experiment and refine: Explore different AI tools and provide feedback to improve their outputs. Stay informed: AI is rapidly evolving, and staying updated ensures you remain competitive. Conclusion As AI matures, its role in development will only grow more significant. By integrating AI into their workflows, developers can enhance efficiency and focus on building innovative solutions. The Building Better Developers podcast offers a timely reminder that the key to success lies in building habits that leverage AI effectively. Whether you're a seasoned developer or just starting out, now is the time to explore AI's transformative potential. Start this journey by experimenting with tools like ChatGPT, Copilot, or Whisper, and discover how AI can revolutionize your work. After all, building better habits starts with taking the first step—and in today's world, that step includes embracing AI. Stay Connected: Join the Develpreneur Community We invite you to join our community and share your coding journey with us. Whether you're a seasoned developer or just starting, there's always room to learn and grow together. Contact us at info@develpreneur.com with your questions, feedback, or suggestions for future episodes. Together, let's continue exploring the exciting world of software development. Additional Resources ChatGPT Microsoft Copilot Leverage AI To Solve Problems In New Ways Use Cases For AI – Interview With Chris Barkhurst Building Better Habits Videos – With Bonus Content

Hacker Public Radio
HPR4272: Embed Mastodon Threads

Hacker Public Radio

Play Episode Listen Later Dec 17, 2024


This show has been flagged as Clean by the host. Episode 4 - Embed Mastodon Threads This is Episode 4 of the Plain Text Programs Podcast hosted at Hacker Public Radio. As always I will include links with the show notes rather than reading them on the podcast except there will be one exception to that today, the link to my Plain Text Blog, home.gamerplus.org. My blog and this podcast were my inspiration for writing the Embed Mastodon Threads program. Besides posting the show notes at Hacker Public Radio where they have a comments section I also post them at my blog. Then I make a Mastodon post that includes a link to the show notes on my blog and designate it as being the comment thread for that episode of the podcast. I also post a link to the comment thread on Mastodon in my show notes. Or at least I did in the past. It came to mind that it would be nice to be able to display the comment thread at the bottom of the blog post. So I made a Mastodon post about this, and I quote. So here's my idea. I want to use mastodon toots as a comment thread for my blog posts. At the bottom of the blog post I want to embed the toot and the replies. I can pull the toot id from the embed code. Then I want to make a database query to get all the replies to that toot. Then I can generate the embed codes needed to show the toot and all the replies. I'm a mysql guy, not postgres. Also a Mastodon newb. I want to know how to get the reply ids for a toot. Any help, links, etc? End quote. I immediately got responses from some programmers expressing interest in the idea and giving good advice. I did some research based on their suggestions. I had a good night's sleep. And then I made another post in the morning. And I quote. Mastodon is so great. I had this idea last night and fiddled around with it long enough to realize I was doing it wrong. So I made a post on Mastodon and almost immediately got help. I found some good info on the Mastodon API. I wake up this morning to more help and I found out about using curl in php to make https requests. Then a musician friend of mine who I've been following since before Mastodon sends a working example, with code, in a javascript environment. And I've got a plan. End quote. So credit where credit is due. The programmers, gamers, and musicians helping me were: Jeff the GenX Alien @jeff@soapbox.hackdefendr.com EcksDy @EcksDy@techtoots.com Malin @malin@dice.camp and Wayne Myers @conniptions@mastodon.social Now, I've known Wayne Myers since before I was ever on Mastodon. We share an interest in free culture music and I have played his songs on my radio show, Something Blue, recorded by his band, Fit and the Conniptions. He sent some links in a couple of comments to other blogs that were embedding Mastodon threads which confirmed that my idea could work. Jeff the GenX Alien gave me some significant technical help. And I quote. Use tootcli to learn everything you need to know about the inner workings of Mastodon. https://github.com/ihabunek/toot Whatever the API supports so does toot. End quote. So I looked into tootcli and the Mastodon API and I realized that I didn't need to access the database for my program, I could just use the API. So, thanks Jeff. My second clue came from EcksDy. And I quote. I've got some help too. Using the Mastodon API and curl in php it should be doable. End quote. So then I started to research using curl in PHP to retrieve json data from the Mastodon API and that's what I went with. I set up a testbed and Malin chimed in with test results. He continued to help with testing and ideas throughout the rest of the project. That's why Mastodon is so great! Way better than consulting an AI bot. So I had my work cut out for me. Here is where this program is like my Plain Text Programs. I work hard, up front, until I am convinced that I have an idea that will be easy to implement. This is much easier than doing it the hard way first and then rewriting the program later after it becomes difficult to maintain. I said I had a plan. This was my plan. Write a PHP program that will generate a webpage that can be embedded in an iframe. This program will take a link as a parameter included in the url. Get that link from the Mastodon embed code for the parent post. Use the API to retrieve the data associated with the parent post including the replies. Then generate the page by inserting the appropriate data into Mastodon's existing embed structure. That's kind of a broad framework but it certainly seemed doable. And it was. So first I wanted to make the API call so I could look at the data. I found this video by Alejandro AO. How to easily create cURL API requests in PHP (Wordpress, Laravel, Symfony) https://www.youtube.com/watch?v=iRLgEWMNA6w&t=602s He recommended that you use curl in the terminal to test your API call. Then you use a web app called Curl-to-PHP to generate your PHP code to make the same API call from your program. My first time consuming stumbling block was what I call the problem with the colon. There are some great documents detailing the syntax for API calls which I will link to in the show notes. And where you are supposed to insert an id they show that as :id. Like an idiot I thought the colon was part of the syntax, not as they intended, a marker to indicate insert your id here. This is why I like to see actual code examples in syntax documents. Anyway I couldn't get it to work so I searched around until I found some code examples and that turned on the lightbulb in my head. Now I was able to make API calls using curl in the terminal. I copied the working curl command and pasted it into the Curl-to-PHP website and it output some code. And it worked! Which I was very glad about because previous research into how to make API calls with PHP was confusing to say the least. Sometimes PHP gifts you with and abundance of riches which doesn't always make life easier. So I made my API call from my program. The Curl-to-PHP code returned $result. And then I used the json_decode command to turn the result string into an array of Mastodon data. $obj = json_decode($result, true); And I could use the print_r command to look at that data. print_r($obj); I immediately put the print_r command at the bottom of my program where it resides today as commented out debug code. This way while I was looking at my program output I could just scroll down or search to find what the actual data looked like. So I fumbled around for a while before I figured out that I would need the id and the url to make my idea work. Accessing json data is reading an array. So easy peasy or maybe not. This code returns the id of the reply from inside a while loop where $i is the index. $id = $obj['descendants'][$i]['id']; Like I said, it looks easy now. Needless to say it took some head scratching to figure out the exact syntax. I used to be a mason and people would always ask me how I learned masonry. I'd look them in the eye and say, "Trowel and error". There was, in fact, a lot of trowel and error going on. So then I generated the embed code to display each post and it worked. For all of my posts. Not for replies from other servers. So I scrolled down and examined the json data and I found the url field that had all the info about the replies, server, username, and id. So I picked up the url field the same way I picked up the id field and updated my code with the url server and name. This still didn't work. After staring at the json data for a while the light finally dawned. The id I was using was the gamerplus id from my server. The id I needed to use was in the url field from their server. Now that I had become enlightened it was easy to notice that the url field contained the exact info that I needed to use in the embed. Remember what I said about doing it the hard way before you replaced that code with the easy way. That can happen even when you have a plan. So by using the url data in the embed I have less string handling and fewer lines of code. I went to bed and in the morning I made this post. And I quote. > > I am able to pull the urls from the json call so that should solve the missing comments issue. > > And then it comes down to the issue of data structures. > > KISS > > I have decided, for now, to display the comments in chronological order without concern for whether a comment is a reply to the post or a reply to another comment. > > A chronological list rather than a tree. > > Easy to implement (kind of/relatively) and easy to understand. Also no indents. > > This project will be licensed GPL so I am certainly open to others applying other data structures to the data display. Everything you need to display a tree is in the json. > > End quote. So the data structure I needed is called a multidimensional array or an array of arrays. In terms of a database table it is two columns and a bunch of rows. In terms of PHP arrays it's an array where each element is an array with two values in it, the id and the url. Now, in my case, the id is from the gamerplus server. The url is from whatever server the replyer calls home. I initialize the array with the parent post. $ids = array(array($id,$url)); You can see the nested arrays in the code. Then I add items to the array like this. $ids[] = array($id,$url); I access an array item like this. foreach($ids as $id) { $url = $id[1]; The 1 refers to the second element of the array because programmers start counting at 0. Then using the url and the domain that I captured from the GET parameter that passes the parent url into the program I build the iframe embed for that post using the Mastodon embed as a template. Which worked but the posts weren't displayed in chronological order. Because the json data isn't necessarily in chronological order. So I had to sort the multidimensional array on the id. Which isn't as straight forward as the sort() command. So I found this article on stackoverflow called How do I sort a multidimensional array by one of the fields of the inner array in PHP? It had a two line solution that I modified to work with my array. And now all my posts were in chronological order. Stack Overflow code is licensed CC BY which is one way compatible with the GPL. Just include the attribution in a comment. My first post quoted above was posted on Friday, October 25, at 8:40 PM. On Monday, October 28 at 8:52 PM I wrote, "Here's the blog post proof of concept/working code." Three days from "I have an idea" to "working code". That wasn't all I did in those three days. Saturday I had a repertoire session with my band, Jazz Buskers. Sunday I produced my radio show, Something Blue. But when I'm in the middle of a programming project I get hyper focused. Sometimes I have to force myself to step away. And I have worked on the code a little bit today. And I will in the future too. I did a lot of testing today and some Mastodon servers and/or accounts just don't support embeds. But if you want to use Embed Mastodon Threads on your blog or website your toot will probably be the parent and if it works on your account, you're good. Also posts from different servers look different. Sometimes the background color is different. Sometimes the links look different. Sometimes the whole post is a link to that post on Mastodon. I decided to embrace that as a feature rather than a bug with the different look making it easier to distinguish posts made on Gamer+ from posts made on other servers. I have uploaded Embed Mastodon Threads to home.gamerplus.org. At my blog I have a post called Embed Mastodon Threads Hosted On Gamerplus where I say, "The program is licensed GPL and I will put up a codeberg repository so you can download it and install it wherever you want. But feel free to use my server." And then I go into detail about just how to do that in the embedded comments thread. The program is 46 lines of code with 11 lines of comments including attribution comments and debug code that is commented out. So 35 lines of code. Over three days that's 12 lines of code a day. About double normal expectations for a programmer. This has been a long podcast, certainly longer than most of my podcasts will be. But I wrote it right after I did the project and it gave me an opportunity to discuss the development process. There were many issues I had that I didn't mention but I think I hit the high points. Throughout the whole project I was posting to my threads on Mastodon so that also helped me check back on the development history of this three day project. The stream of boosts and replies from my compatriots helped keep me going too. It was a rush! So this is not exactly a plain text program because it uses a database accessed through the Mastodon API. Still, I do not have to maintain that database, it's just there on every Mastodon instance, ready to use. Most of my plain text programs are web apps or web pages. This one is a web service. And it is simple to use. All you have to be able to do is copy the embed code from Mastodon, extract the link, and paste the link into the url that calls the web service. Then you put that url into an iframe on your blog or web page. I have a help page for using Embed Mastodon Threads in the same directory as the thread.php program where you can generate and copy your iframe code. In fact the help page is also a Plain Text Program which I may talk about in a future podcast. On the help page are instructions on how to get a link from the Mastodon embed code. Then you paste the link into a form and hit submit. The page generates your iframe embed code that you can use in your blog or web page. The page also displays what the embedded thread will look like. If you would rather download the code and install your own instance of Embed Mastodon Threads I have a codeberg repository. Again all the links are in the show notes at Hacker Public Radio and at my blog at home.gamerplus.org. If you have questions you can reply to a thread on Mastodon or email me at hairylarry@deltaboogie.com. If you don't have a mastodon account you can get one at gamerplus.org. Links My Plain Text Blog https://home.gamerplus.org/ Embed Mastodon Threads Help Page https://home.gamerplus.org/Embed_Mastodon_Threads/ Codeberg Repository https://codeberg.org/hairylarry/EmbedMastodonThreads From Jeff the GenX Alien Use tootcli to learn everything you need to know about the inner workings of Mastodon. https://github.com/ihabunek/toot Whatever the API supports so does toot. How to easily create cURL API requests in PHP (Wordpress, Laravel, Symfony) https://www.youtube.com/watch?v=iRLgEWMNA6w&t=602s Curl-to-PHP https://incarnate.github.io/curl-to-php/ Playing with public data - Mastodon documentation https://docs.joinmastodon.org/client/public/ Status - Mastodon documentation https://docs.joinmastodon.org/entities/Status/ Context - Mastodon documentation https://docs.joinmastodon.org/entities/Context/ How do I sort a multidimensional array by one of the fields of the inner array in PHP? https://stackoverflow.com/questions/2426917/how-do-i-sort-a-multidimensional-array-by-one-of-the-fields-of-the-inner-array-i Embed Mastodon Threads Hosted On Gamerplus https://home.gamerplus.org/permalink.php?fname=Embed_Mastodon_Threads_Hosted_On_Gamerplus.txt Gamer+DBN Mastodon server https://gamerplus.org Provide feedback on this episode.

Postgres FM
jOOQ

Postgres FM

Play Episode Listen Later Dec 13, 2024 50:31


Michael and Nikolay are joined by Lukas Eder, the creator of jOOQ, to discuss what it is, some nice developer experience features it has, and some fun things he's come across from a Postgres perspective. Here are some links to things they mentioned:Lukas Eder https://postgres.fm/people/lukas-ederjOOQ https://www.jooq.org/ DSL https://en.wikipedia.org/wiki/Domain-specific_language SQL Dialects https://www.jooq.org/javadoc/latest/org.jooq/org/jooq/SQLDialect.htmlMERGE https://www.postgresql.org/docs/current/sql-merge.html match_recognize https://modern-sql.com/feature/match_recognize JOOQ, joy of SQL (talk by Kevin Davin) https://www.youtube.com/watch?v=8Ej47GZX9D8  BUFFERS enabled for EXPLAIN ANALYZE by default (commit for Postgres 18) https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=c2a4078ebad71999dd451ae7d4358be3c9290b07 PostGIS https://postgis.net/ 10 SQL Tricks That You Didn't Think Were Possible (blog post by Lukas) https://blog.jooq.org/10-sql-tricks-that-you-didnt-think-were-possible/ jOOQ questions on Stack Overflow https://stackoverflow.com/questions/tagged/jooq Our episode on NULLs https://postgres.fm/episodes/nulls-the-good-the-bad-the-ugly-and-the-unknown ~~~What did you like or not like? What should we discuss next time? Let us know via a YouTube comment, on social media, or by commenting on our Google doc!~~~Postgres FM is produced by:Michael Christofides, founder of pgMustardNikolay Samokhvalov, founder of Postgres.aiWith special thanks to:Jessie Draws for the elephant artwork 

Nice Games Club
Serialization (with Joanna May)

Nice Games Club

Play Episode Listen Later Dec 12, 2024


We invite Joanna May into the clubhouse to discuss serialization. We get ever so slightly closer to discovering what serialization even is with Joanna's help! We also have a little time to throw a little shade at Stephen's commenting style and brackets on their own line.ChickensoftTibo, artist of the Chickensoft mascot.Eating your own dog food (Dogfooding) - WikipediaLogic Blocks - WikipediaSerializationProgrammingToolsSerialization - WikipediaBinaryFormatter migration guide -  gewarren, jeffhandley, terrajobst, adamsitnik, Microsoft Learn@JsonSubTypes vs. Reflections for Polymorphic Deserialization in Jackson - Ovidiu Mihai Tacu, BaeldungWhat are the advantages of just-in-time compilation versus ahead-of-time compil… - Stack OverflowIntrospection -  jolexxa [Joanna], GitHubIL2CPP Overview - UnityMartha had mentioned unit tests (which Mark and Stephen still don't do) in a previous episode.“That wasn't the angle I was going for.”Chickensoft Development Philosophy - ChickensoftJoanna MayGuestJoanna May is the creator of Chickensoft, open source tools for Godot and C# as well as a grassroots community. External linkChickensoft

Postgres FM
Column Tetris

Postgres FM

Play Episode Listen Later Dec 6, 2024 41:06


Nikolay and Michael discuss "Column Tetris" — what it is, why it matters, how to order columns for new tables, and how to re-organise existing ones. Here are some links to things they mentioned:“Column Tetris” by Erwin Brandstetter on Stack Overflow  https://stackoverflow.com/questions/2966524/calculating-and-saving-space-in-postgresql/7431468#7431468Data Types https://www.postgresql.org/docs/current/datatype.htmlOrioleDB beta7 benchmarks https://www.orioledb.com/blog/orioledb-beta7-benchmarkspg_hexedit https://github.com/petergeoghegan/pg_hexeditSaving Space Basically for Free (blog post by James Coleman from Braintree) https://medium.com/paypal-tech/postgresql-at-scale-saving-space-basically-for-free-d94483d9ed9aOrdering Table Columns (GitLab https://docs.gitlab.com/ee/development/database/ordering_table_columns.htmlpostgres_dba alignment padding query https://github.com/NikolayS/postgres_dba/blob/master/sql/p1_alignment_padding.sqlGood explanation from Marco Slot of how alignment was used to fix a recent issue https://x.com/marcoslot/status/1858132850383421570pg_repack feature request discussion https://github.com/reorg/pg_repack/issues/101Our episode on bloat (with Chelsea Dole) https://postgres.fm/episodes/bloatOptimizing table layout for maximum efficiency (blog post by Renato Massaro) https://r.ena.to/blog/optimizing-postgres-table-layout-for-maximum-efficiency~~~What did you like or not like? What should we discuss next time? Let us know via a YouTube comment, on social media, or by commenting on our Google doc!~~~Postgres FM is produced by:Michael Christofides, founder of pgMustardNikolay Samokhvalov, founder of Postgres.aiWith special thanks to:Jessie Draws for the elephant artwork 

Dev Interrupted
Are Only 20% of Devs Happy? | Stack Overflow's Erin Yepis

Dev Interrupted

Play Episode Listen Later Nov 19, 2024 45:49 Transcription Available


This week, Dev Interrupted dives into the 2024 Stack Overflow Developer Survey, revealing a surprising statistic: only 1 in 5 developers are happy in their jobs. Stack Overflow's Senior Analyst of Market Research and Insights, Erin Yepis, joins host Ben Lloyd Pearson to discuss the survey's findings and explore the reasons behind this widespread dissatisfaction. From salary woes and workplace settings to the ever-present burden of technical debt, they dissect the factors impacting developer happiness.Later, Dan Lines offers his perspective, drawing on LinearB's data to pinpoint three key challenges to developer satisfaction. He also shares valuable strategies for tackling technical debt, a growing concern as more companies transform into software-driven businesses.Show Notes:2025 Engineering Benchmarks Insights WebinarRefactoring x Dev Interrupted Survey2024 Stack Overflow Developer SurveyCheck out Erin's blogSupport the show: Subscribe to our Substack Leave us a review Subscribe on YouTube Follow us on Twitter or LinkedIn Offers: Learn about Continuous Merge with gitStream Get your DORA Metrics free forever

Java Off-Heap
OffHeap 88. Of Smart Agents that help you code better...or worse?

Java Off-Heap

Play Episode Listen Later Nov 19, 2024 67:40


So Coding Agents are here to stay. And they are hitting everywhere we go! (at a cheap price!). So what do we make of them. Do they help? do they hinder? Do we like using them? or, is it risky to use them? There are now practical questions we ponder with as Github Copilot, ChatGPT, Claude, Intellij AI Assistant and others are accessible at the click of a button! https://www.javaoffheap.com/datadog We thank DataDogHQ for sponsoring this podcast episode DO follow us on twitter @offheap https://www.twitter.com/offheap

ShopTalk » Podcast Feed
642: Chris Person on Forums, Reddit, and Cooperative Reporting

ShopTalk » Podcast Feed

Play Episode Listen Later Nov 18, 2024 64:16


Show DescriptionChris Person from Aftermath joins us to chat about the state of forums in 2024, being downwind of knowledge, forum drama, Reddit and StackOverflow's impact on forums, the importance of the individuals caring for knowledge and information, and the benefits and struggles of cooperatives in reporting. Listen on Website →GuestsChris PersonGuest's Main URL • Guest's TwitterMakes Highlight Reel. Co-Founder 'n blogger at Aftermath.site. Links Forums Are Still Alive, Active, And A Treasure Trove Of Information - Aftermath chris person (@Papapishu) / X Highlight Reel | creating videos of the best recent clips from around the gaming Highlight Reel Forum Channels FAQ Lego Storage Solutions Bogleheads RomHacking Metal Finishing The Something Awful Forums Jeff Atwood StackOverflow Build Civilized Communities Proxmox VE Helper-Scripts Gawker Plot History Defector 404 Media Hell Gate Overview Open Source Blogging SponsorsBluehostFind unique domains, web hosting, and WordPress tools, all in one place. Empower your business or digital agency with Bluehost.

Business Bitcoinization
How Zaprite Made History with Trump's First Bitcoin Transaction - Will Cole

Business Bitcoinization

Play Episode Listen Later Oct 25, 2024 38:18 Transcription Available


DOWNLOAD YOUR COPY OF THE BITCOIN-FOR-BUSINESS QUICK START GUIDE This free, 27-page resource includes:Six ways ANY business can benefit from BitcoinSome of the best Bitcoin-only businesses to partner withKey Bitcoin concepts for people getting startedWill Cole is the Head of Product at Zaprite, a leading platform simplifying Bitcoin payments for businesses. With a background at Unchained and Stack Overflow, Will focuses on creating seamless invoicing and point-of-sale solutions that enable companies to accept Bitcoin and fiat. His work at Zaprite empowers businesses to streamline their payment processes while maintaining financial control.

NosillaCast Apple Podcast
NC #1015 Overcast Tutorial, Let's Talk Photography, ChatGPT for JavaScript, ChatGPT's Effect on StackOverflow with Bart Busschots

NosillaCast Apple Podcast

Play Episode Listen Later Oct 21, 2024 43:24


ScreenCastsONLINE Tutorial – Overcast Let's Talk Photography 133: Better than Your Eyes! Quick Use of ChatGPT to Write JavaScript for TextExpander Snippet Support the Show Plateaus Coming on LLMs – Discussion with Bart Busschots Transcript of NC_2024_10_20 Join the Conversation: allison@podfeet.com podfeet.com/slack Support the Show: Patreon Donation PayPal one-time donation Podfeet Podcasts Mugs at Zazzle Podfeet 15-Year Anniversary Shirts Referral Links: Parallels Toolbox - 3 months free for you and me Learn through MacSparky Field Guides - 15% off for you and me Backblaze - One free month for me and you Setapp - One free month for me and you Eufy - $40 for me if you spend $200. Sadly nothing in it for you. PIA VPN - One month added to Paid Accounts for both of us CleanShot X - Earns me $25%, sorry nothing in it for you but my gratitude

Software Engineering Daily
The 2024 Stack Overflow Developer Survey with Erin Yepis and Ryan Polk

Software Engineering Daily

Play Episode Listen Later Oct 1, 2024 40:56


The Stack Overflow Developer Survey is an annual survey conducted by Stack Overflow that gathers comprehensive insights from developers around the world. It offers a valuable snapshot of the global developer community, covering a wide range of topics such as preferred programming languages, tools, and technologies. Erin Yepis is a Senior Analyst and Ryan Polk The post The 2024 Stack Overflow Developer Survey with Erin Yepis and Ryan Polk appeared first on Software Engineering Daily.

Kubernetes Podcast from Google
Dagger, with Solomon Hykes

Kubernetes Podcast from Google

Play Episode Listen Later Sep 17, 2024 67:06


Solomon Hykes is the co-founder of Dagger. He is probably best known as the creator of Docker. The tool that changed how developers package, run and distribute software in the last 11 years. His impact on our industry is undeniable. Today, we discuss his new venture, Dagger. Dagger is a new approach to how we do CI/CD.   Do you have something cool to share? Some questions? Let us know: - web: kubernetespodcast.com - mail: kubernetespodcast@google.com - twitter: @kubernetespod News of the week Kubeadm v1beta4 1.32 Release Cycle Info Updates to the Certified Kubernetes Administrator Exam 2024 Generative AI Survey Microsoft Azure Advanced Container Networking enhancements   Links from the interview Solomon Hykes on LinkedIn Dagger OpenStack Act (GitHub Actions Locally) Buildkit Cue GraphQL Dagger Discord Caching - Dagger Documentation Bazel Terraform Pulumi Kubectl gRPC GraphQL Google Cloud's Package Index The Daggerverse Cloud Foundry PostHog RedHat Development Model Links from the post-interview chat Scaffold Solomon Hykes - Docker, Dagger, and the Future of DevOps Directed Acyclic Graphs Solomon Hykes on wikipedia Stack Overflow  

Knowledge@Wharton
AI and Wellbeing

Knowledge@Wharton

Play Episode Listen Later Sep 3, 2024 23:16


Since the launch of ChatGPT in late 2022, Stack Overflow has seen a noticeable drop in daily visits, with traffic decreasing by 1 million—a 15% reduction within just four months. This trend underscores a growing preference for automated solutions, as users increasingly turn to AI for answers, reflecting a shift in how people seek information and interact socially.In this “AI Horizons” podcast episode, Wharton marketing professor and AI at Wharton co-director Stefano Puntoni joins Gordon Burtch, information systems professor at Boston University's Questrom School of Business; Julian De Freitas, a business administration professor and director of the Ethical Intelligence Lab at Harvard Business School; and Weiguang Wang, a computer and information systems professor at the University of Rochester's Simon Business School to discuss the topic. Hosted on Acast. See acast.com/privacy for more information.

Software Engineering Daily
Why Stack Overflow Uses Svelte with Giamir Buoncristiani

Software Engineering Daily

Play Episode Listen Later Aug 28, 2024 45:07


Stack Overflow is a legendary question-and-answer site for programmers, and is likely well known to most SEDaily listeners. Svelte is an open-source front-end framework that was released in 2016 and continues to grow rapidly in popularity. Giamir Buoncristiani is a Staff Software Engineer at Stack Overflow. He is also the tech lead for the Stacks The post Why Stack Overflow Uses Svelte with Giamir Buoncristiani appeared first on Software Engineering Daily.

Latent Space: The AI Engineer Podcast — CodeGen, Agents, Computer Vision, Data Science, AI UX and all things Software 3.0

Betteridge's law says no: with seemingly infinite flavors of RAG, and >2million token context + prompt caching from Anthropic/Deepmind/Deepseek, it's reasonable to believe that "in context learning is all you need".But then there's Cosine Genie, the first to make a huge bet using OpenAI's new GPT4o fine-tuning for code at the largest scale it has ever been used externally; resulting in what is now the #1 coding agent in the world according to SWE-Bench Full, Lite, and Verified:SWE-Bench has been the most successful agent benchmark of the year, receiving honors at ICLR (our interview here) and recently being verified by OpenAI. Cognition (Devin) was valued at $2b after reaching 14% on it. So it is very, very big news when a new agent appears to beat all other solutions, by a lot:While this number is self reported, it seems to be corroborated by OpenAI, who also award it clear highest marks on SWE-Bench verified:The secret is GPT-4o finetuning on billions of tokens of synthetic data. * Finetuning: As OpenAI says:Genie is powered by a fine-tuned GPT-4o model trained on examples of real software engineers at work, enabling the model to learn to respond in a specific way. The model was also trained to be able to output in specific formats, such as patches that could be committed easily to codebases. Due to the scale of Cosine's finetuning, OpenAI worked closely with them to figure out the size of the LoRA:“They have to decide how big your LoRA adapter is going to be… because if you had a really sparse, large adapter, you're not going to get any signal in that at all. So they have to dynamically size these things.”* Synthetic data: we need to finetune on the process of making code work instead of only training on working code.“…we synthetically generated runtime errors. Where we would intentionally mess with the AST to make stuff not work, or index out of bounds, or refer to a variable that doesn't exist, or errors that the foundational models just make sometimes that you can't really avoid, you can't expect it to be perfect.”Genie also has a 4 stage workflow with the standard LLM OS tooling stack that lets it solve problems iteratively:Full Video Podlike and subscribe etc!Show Notes* Alistair Pullen - Twitter, Linkedin* Cosine Genie launch, technical report* OpenAI GPT-4o finetuning GA* Llama 3 backtranslation* Cursor episode and Aman + SWEBench at ICLR episodeTimestamps* [00:00:00] Suno Intro* [00:05:01] Alistair and Cosine intro* [00:16:34] GPT4o finetuning* [00:20:18] Genie Data Mix* [00:23:09] Customizing for Customers* [00:25:37] Genie Workflow* [00:27:41] Code Retrieval* [00:35:20] Planning* [00:42:29] Language Mix* [00:43:46] Running Code* [00:46:19] Finetuning with OpenAI* [00:49:32] Synthetic Code Data* [00:51:54] SynData in Llama 3* [00:52:33] SWE-Bench Submission Process* [00:58:20] Future Plans* [00:59:36] Ecosystem Trends* [01:00:55] Founder Lessons* [01:01:58] CTA: Hiring & CustomersDescript Transcript[00:01:52] AI Charlie: Welcome back. This is Charlie, your AI cohost. As AI engineers, we have a special focus on coding agents, fine tuning, and synthetic data. And this week, it all comes together with the launch of Cosign's Genie, which reached 50 percent on SWE Bench Lite, 30 percent on the full SWE Bench, and 44 percent on OpenAI's new SWE Bench Verified.[00:02:17] All state of the art results by the widest ever margin recorded compared to former leaders Amazon Q and US Autocode Rover. And Factory Code Droid. As a reminder, Cognition Devon went viral with a 14 percent score just five months ago. Cosign did this by working closely with OpenAI to fine tune GPT 4. 0, now generally available to you and me, on billions of tokens of code, much of which was synthetically generated.[00:02:47] Alistair Pullen: Hi, I'm Ali. Co founder and CEO of Cosign, a human reasoning lab. And I'd like to show you Genie, our state of the art, fully autonomous software engineering colleague. Genie has the highest score on SWBench in the world. And the way we achieved this was by taking a completely different approach. We believe that if you want a model to behave like a software engineer, it has to be shown how a human software engineer works.[00:03:15] We've designed new techniques to derive human reasoning from real examples of software engineers doing their jobs. Our data represents perfect information lineage, incremental knowledge discovery, and step by step decision making. Representing everything a human engineer does logically. By actually training Genie on this unique dataset, rather than simply prompting base models, which is what everyone else is doing, we've seen that we're no longer simply generating random code until some works.[00:03:46] It's tackling problems like[00:03:48] AI Charlie: a human. Alistair Pullen is CEO and co founder of Kozen, and we managed to snag him on a brief trip stateside for a special conversation on building the world's current number one coding agent. Watch out and take care.[00:04:07] Alessio: Hey everyone, welcome to the Latent Space Podcast. This is Alessio, partner and CTO of Resonance at Decibel Partners, and I'm joined by my co host Swyx, founder of Small. ai.[00:04:16] swyx: Hey, and today we're back in the studio. In person, after about three to four months in visa jail and travels and all other fun stuff that we talked about in the previous episode.[00:04:27] But today we have a special guest, Ali Pullen from Cosign. Welcome. Hi, thanks for having me. We're very lucky to have you because you're on a two day trip to San Francisco. Yeah, I wouldn't recommend it. I would not[00:04:38] Alistair Pullen: recommend it. Don't fly from London to San Francisco for two days.[00:04:40] swyx: And you launched Genie on a plane.[00:04:42] On plain Wi Fi, um, claiming state of the art in SuiteBench, which we're all going to talk about. I'm excited to dive into your whole journey, because it has been a journey. I've been lucky to be a small angel in part of that journey. And it's exciting to see that you're launching to such acclaim and, you know, such results.[00:05:01] Alistair and Cosine intro[00:05:01] swyx: Um, so I'll go over your brief background, and then you can sort of fill in the blanks on what else people should know about you. You did your bachelor's in computer science at Exeter.[00:05:10] Speaker 6: Yep.[00:05:10] swyx: And then you worked at a startup that got acquired into GoPuff and round about 2022, you started working on a stealth startup that became a YC startup.[00:05:19] What's that? Yeah. So[00:05:21] Alistair Pullen: basically when I left university, I, I met my now co founder, Sam. At the time we were both mobile devs. He was an Android developer. iOS developer. And whilst at university, we built this sort of small consultancy, sort of, we'd um, be approached to build projects for people and we would just take them up and start with, they were student projects.[00:05:41] They weren't, they weren't anything crazy or anything big. We started with those and over time we started doing larger and larger projects, more interesting things. And then actually, when we left university, we just kept doing that. We didn't really get jobs, traditional jobs. It was also like in the middle of COVID, middle of lockdown.[00:05:57] So we were like, this is a pretty good gig. We'll just keep like writing code in our bedrooms. And yeah, that's it. We did that for a while. And then a friend of ours that we went to Exeter with started a YC startup during COVID. And it was one of these fast grocery delivery companies. At the time I was living in the deepest, darkest countryside in England, where fast grocery companies are still not a thing.[00:06:20] So he, he sort of pitched me this idea and was like, listen, like I need an iOS dev, do you fancy coming along? And I thought, absolutely. It was a chance to get out of my parents house, chance to move to London, you know, do interesting things. And at the time, truthfully, I had no idea what YC was. I had no idea.[00:06:34] I wasn't in the startup space. I knew I liked coding and building apps and stuff, but I'd never, never really done anything in that area. So I said, yes, absolutely. I moved to London just sort of as COVID was ending and yeah, worked at what was fancy for about a year and a half. Then we brought Sam along as well.[00:06:52] So we, Sam and I, were the two engineers at Fancy for basically its entire life, and we built literally everything. So like the, the front, the client mobile apps, the, the backends, the internal like stock management system, the driver routing, algorithms, all those things. Literally like everything. It was my first.[00:07:12] You know, both of us were super inexperienced. We didn't have, like, proper engineering experience. There were definitely decisions we'd do differently now. We'd definitely buy a lot of stuff off the shelf, stuff like that. But it was the initial dip of the toe into, like, the world of startups, and we were both, like, hooked immediately.[00:07:26] We were like, this is so cool. This sounds so much better than all our friends who were, like, consultants and doing, like, normal jobs, right? We did that, and it ran its course, and after, I want to say, 18 months or so, GoPuff came and acquired us. And there was obviously a transitionary period, an integration period, like with all acquisitions, and we did that, and as soon as we'd vested what we wanted to vest, and as soon as we thought, okay, this chapter is sort of done, uh, in about 2022, We left and we knew that we wanted to go alone and try something like we'd had this taste.[00:07:54] Now we knew we'd seen how a like a YC startup was managed like up close and we knew that we wanted to do something similar ourselves. We had no idea what it was at the time. We just knew we wanted to do something. So we, we tried a small, um, some small projects in various different areas, but then GPT 3.[00:08:12] He'd seen it on Reddit and I'm his source of all knowledge. Yeah, Sam loves Reddit. I'd actually heard of GPT 2. And obviously had like loosely followed what OpenAI had done with, what was the game they trained a model to play? Dota. Was it Dota? Yeah. So I'd followed that and, I knew loosely what GPT 2 was, I knew what BERT was, so I was like, Okay, this GPT 3 thing sounds interesting.[00:08:35] And he just mentioned it to me on a walk. And I then went home and, like, googled GPT was the playground. And the model was DaVinci 2 at the time. And it was just the old school playground, completions, nothing crazy, no chat, no nothing. I miss completions though. Yeah. Oh, completion. Honestly, I had this conversation in open hours office yesterday.[00:08:54] I was like, I just went. I know. But yeah, so we, we, um, I started playing around with the, the playground and the first thing I ever wrote into it was like, hello world, and it gave me some sort of like, fairly generic response back. I was like, okay, that looks pretty cool. The next thing was. I looked through the docs, um, also they had a lot of example prompts because I had no idea.[00:09:14] I didn't know if the, if you could put anything in, I didn't know if you had to structure in a certain way or whatever, and I, and I saw that it could start writing like tables and JSON and stuff like that. So I was like, okay, can you write me something in JSON? And it did. And I was like, Oh, wow, this is, this is pretty cool.[00:09:28] Um, can it, can it just write arbitrary JSON for me? And, um, immediately as soon as I realized that my mind was racing and I like got Sam in and we just started messing around in the playground, like fairly innocently to start with. And then, of course, both being mobile devs and also seeing, at that point, we learned about what the Codex model was.[00:09:48] It was like, this thing's trained to write code, sounds awesome. And Copilot was start, I think, I can't actually remember if Copilot had come out yet, it might have done. It's round about the same time as Codex. Round about the same time, yeah. And we were like, okay, as mobile devs, let's see what we can do.[00:10:02] So the initial thing was like, okay, let's see if we can get this AI to build us a mobile app from scratch. We eventually built the world's most flimsy system, which was back in the day with like 4, 000 token context windows, like chaining prompts, trying to keep as much context from one to the other, all these different things, where basically, Essentially, you'd put an app idea in a box, and then we'd do, like, very high level stuff, figuring out what the stack should be, figuring out what the frontend should be written in, backend should be written in, all these different things, and then we'd go through, like, for each thing, more and more levels of detail, until the point that you're You actually got Codex to write the code for each thing.[00:10:41] And we didn't do any templating or anything. We were like, no, we're going to write all the code from scratch every time, which is basically why it barely worked. But there were like occasions where you could put in something and it would build something that did actually run. The backend would run, the database would work.[00:10:54] And we were like, Oh my God, this is insane. This is so cool. And that's what we showed to our co founder Yang. I met my co founder Yang through, through fancy because his wife was their first employee. And, um, we showed him and he was like, You've discovered fire. What is this? This is insane. He has a lot more startup experience.[00:11:12] Historically, he's had a few exits in the past and has been through all different industries. He's like our dad. He's a bit older. He hates me saying that. He's your COO now? He's our COO. Yeah. And, uh, we showed him and he was like, this is absolutely amazing. Let's just do something. Cause he, he, at the time, um, was just about to have a child, so he didn't have anything going on either.[00:11:29] So we, we applied to YC, got an interview. The interview was. As most YC interviews are short, curt, and pretty brutal. They told us they hated the idea. They didn't think it would work. And that's when we started brainstorming. It was almost like the interview was like an office hours kind of thing. And we were like, okay, given what you know about the space now and how to build things with these LLMs, like what can you bring out of what you've learned in building that thing into Something that might be a bit more useful to people on the daily, and also YC obviously likes B2B startups a little bit more, at least at the time they did, back then.[00:12:01] So we were like, okay, maybe we could build something that helps you with existing codebases, like can sort of automate development stuff with existing codebases, not knowing at all what that would look like, or how you would build it, or any of these things. And They were like, yeah, that sounds interesting.[00:12:15] You should probably go ahead and do that. You're in, you've got two weeks to build us an MVP. And we were like, okay, okay. We did our best. The MVP was absolutely horrendous. It was a CLI tool. It sucked. And, um, at the time we were like, we, we don't even know. How to build what we want to build. And we didn't really know what we wanted to build, to be honest.[00:12:33] Like, we knew we wanted to try to help automate dev work, but back then we just didn't know enough about how LLM apps were built, the intricacies and all those things. And also, like, the LLMs themselves, like 4, 000 tokens, you're not going very far, they're extremely expensive. So we ended up building a, uh, a code based retrieval tool, originally.[00:12:51] Our thought process originally was, we want to build something that can do our jobs for us. That is like the gold star, we know that. We've seen like there are glimpses of it happening with our initial demo that we did. But we don't see the path of how to do that at the moment. Like the tech just wasn't there.[00:13:05] So we were like, well, there are going to be some things that you need to build this when the tech does catch up. So retrieval being one of the most important things, like the model is going to have to build like pull code out of a code base somehow. So we were like, well, let's just build the tooling around it.[00:13:17] And eventually when the tech comes, then we'll be able to just like plug it into our, our tooling and then it should work basically. And to be fair, that's basically what we've done. And that's basically what's happened, which is very fortunate. But in the meantime, whilst we were waiting for everything to sort of become available, we built this code base retrieval tool.[00:13:34] That was the first thing we ever launched when we were in YC like that, and it didn't work. It was really frustrating for us because it was just me and Sam like working like all hours trying to get this thing to work. It was quite a big task in of itself, trying to get like a good semantic search engine working that could run locally on your machine.[00:13:51] We were trying to avoid sending code to the cloud as much as possible. And then for very large codebases, you're like, you know, millions of lines of code. You're trying to do some sort of like local HNSW thing that runs inside your VS Code instance that like eats all your RAM as you've seen in the past.[00:14:05] All those different things. Yep. Yeah.[00:14:07] swyx: My first call with[00:14:07] Alistair Pullen: you, I had trouble. You were like, yeah, it sucks, man. I know, I know. I know it sucks. I'm sorry. I'm sorry. But building all that stuff was essentially the first six to eight months of what at the time was built. Which, by the way, build it. Build it. Yeah, it was a terrible, terrible name.[00:14:25] It was the worst,[00:14:27] swyx: like, part of trying to think about whether I would invest is whether or not people could pronounce it.[00:14:32] Alistair Pullen: No, when we, so when we went on our first ever YC, like, retreat, No one got the name right. They were like, build, build, well, um, and then we actually changed the names, cosign, like, although some people would spell it as in like, as if you're cosigning for an apartment or something like that's like, can't win.[00:14:49] Yeah. That was what built was back then. But the ambition, and I did a talk on this back in the end of 2022, the ambition to like build something that essentially automated our jobs was still very much like core to what we were doing. But for a very long time, it was just never apparent to us. Like. How would you go about doing these things?[00:15:06] Even when, like, you had 3. suddenly felt huge, because you've gone from 4 to 16, but even then 16k is like, a lot of Python files are longer than 16k. So you can't, you know, before you even start doing a completion, even then we were like, eh, Yeah, it looks like we're still waiting. And then, like, towards the end of last year, you then start, you see 32k.[00:15:28] 32k was really smart. It was really expensive, but also, like, you could fit a decent amount of stuff in it. 32k felt enormous. And then, finally, 128k came along, and we were like, right, this is, like, this is what we can actually deal with. Because, fundamentally, to build a product like this, you need to get as much information in front of the model as possible, and make sure that everything it ever writes in output can be read.[00:15:49] traced back to something in the context window, so it's not hallucinating it. As soon as that model existed, I was like, okay, I know that this is now going to be feasible in some way. We'd done early sort of dev work on Genie using 3. 5 16k. And that was a very, very like crude way of proving that this loop that we were after and the way we were generating the data actually had signal and worked and could do something.[00:16:16] But the model itself was not useful because you couldn't ever fit enough information into it for it to be able to do the task competently and also the base intelligence of the model. I mean, 3. 5, anyone who's used 3. 5 knows the base intelligence of the model is. is lacking, especially when you're asking it to like do software engineering, this is quite quite involved.[00:16:34] GPT4o finetuning[00:16:34] Alistair Pullen: So, we saw the 128k context model and um, at that point we'd been in touch with OpenAI about our ambitions and like how we wanted to build it. We essentially are, I just took a punt, I was like, I'm just going to ask to see, can we like train this thing? Because at the time Fortobo had just come out and back then there was still a decent amount of lag time between like OpenAI releasing a model and then allowing you to fine tune it in some way.[00:16:59] They've gotten much better about that recently, like 4. 0 fine tuning came out either, I think, a day, 4. 0 mini fine tuning came out like a day after the model did. And I know that's something they're definitely like, optimising for super heavily inside, which is great to see.[00:17:11] swyx: Which is a little bit, you know, for a year or so, YC companies had like a direct Slack channel to open AI.[00:17:17] We still do. Yeah. Yeah. So, it's a little bit of a diminishing of the YC advantage there. Yeah. If they're releasing this fine tuning[00:17:23] Alistair Pullen: ability like a day after. Yeah, no, no, absolutely. But like. You can't build a startup otherwise. The advantage is obviously nice and it makes you feel fuzzy inside. But like, at the end of the day, it's not that that's going to make you win.[00:17:34] But yeah, no, so like we'd spoken to Shamul there, Devrel guy, I'm sure you know him. I think he's head of solutions or something. In their applied team, yeah, we'd been talking to him from the very beginning when we got into YC, and he's been absolutely fantastic throughout. I basically had pitched him this idea back when we were doing it on 3.[00:17:53] 5, 16k, and I was like, this is my, this is my crazy thesis. I want to see if this can work. And as soon as like that 128k model came out, I started like laying the groundwork. I was like, I know this definitely isn't possible because he released it like yesterday, but know that I want it. And in the interim, like, GPT 4, like, 8K fine tuning came out.[00:18:11] We tried that, it's obviously even fewer tokens, but the intelligence helped. And I was like, if we can marry the intelligence and the context window length, then we're going to have something special. And eventually, we were able to get on the Experimental Access Program, and we got access to 4Turbo fine tuning.[00:18:25] As soon as we did that, because in the entire run up to that we built the data pipeline, we already had all that set up, so we were like, right, we have the data, now we have the model, let's put it through and iterate, essentially, and that's, that's where, like, Genie as we know it today, really was born. I won't pretend like the first version of Gene that we trained was good.[00:18:45] It was a disaster. That's where you realize all the implicit biases in your data set. And you realize that, oh, actually this decision you made that was fairly arbitrary was the wrong one. You have to do it a different way. Other subtle things like, you know, how you write Git diffs in using LLMs and how you can best optimize that to make sure they actually apply and work and loads of different little edge cases.[00:19:03] But as soon as we had access to the underlying tool, we were like, we can actually do this. And I was I breathed a sigh of relief because I didn't know it was like, it wasn't a done deal, but I knew that we could build something useful. I mean, I knew that we could build something that would be measurably good on whatever eval at the time that you wanted to use.[00:19:23] Like at the time, back then, we weren't actually that familiar with Swift. But once Devin came out and they announced the SBBench core, I like, that's when my life took a turn. Challenge accepted. Yeah, challenge accepted. And that's where like, yes, that's where my friendships have gone. My sleep has gone. My weight.[00:19:40] Everything got into SweeBench and yeah, we, we, it was actually a very useful tool in building GeniX beforehand. It was like, yes, vibe check this thing and see if it's useful. And then all of a sudden you have a, an actual measure to, to see like, couldn't it do software engineering? Not, not the best measure, obviously, but like it's a, it's the best that we've got now.[00:19:57] We, we just iterated and built and eventually we got it to the point where it is now. And a little bit beyond since we actually Like, we actually got that score a couple of weeks ago, and yeah, it's been a hell of a journey from the beginning all the way now. That was a very rambling answer to your question about how we got here, but that's essentially the potted answer of how we got here.[00:20:16] Got the full[00:20:16] swyx: origin story[00:20:17] Alessio: out. Yeah, no, totally.[00:20:18] Genie Data Mix[00:20:18] Alessio: You mentioned bias in the data and some of these things. In your announcement video, you called Genie the worst verse AI software engineering colleague. And you kind of highlighted how the data needed to train it needs to show how a human engineer works. I think maybe you're contrasting that to just putting code in it.[00:20:37] There's kind of like a lot more than code that goes into software engineering. How do you think about the data mixture, you know, and like, uh, there's this kind of known truth that code makes models better when you put in the pre training data, but since we put so much in the pre training data, what else do you add when you turn to Genium?[00:20:54] Alistair Pullen: Yeah, I think, well, I think that sort of boils down fundamentally to the difference between a model writing code and a model doing software engineering, because the software engineering sort of discipline goes wider, because if you look at something like a PR, that is obviously a Artifact of some thought and some work that has happened and has eventually been squashed into, you know, some diffs, right?[00:21:17] What the, very crudely, what the pre trained models are reading is they're reading those final diffs and they're emulating that and they're being able to output it, right? But of course, it's a super lossy thing, a PR. You have no idea why or how, for the most part, unless there are some comments, which, you know, anyone who's worked in a company realizes PR reviews can be a bit dodgy at times, but you see that you lose so much information at the end, and that's perfectly fine, because PRs aren't designed to be something that perfectly preserves everything that happened, but What we realized was if you want something that's a software engineer, and very crudely, we started with like something that can do PRs for you, essentially, you need to be able to figure out why those things happened.[00:21:58] Otherwise, you're just going to rely, you essentially just have a code writing model, you have something that's good at human eval, but But, but not very good at Sweet Eng. Essentially that realization was, was part of the, the kernel of the idea of of, of the approach that we took to design the agent. That, that is genie the way that we decided we want to try to extract what happened in the past, like as forensically as possible, has been and is currently like one of the, the main things that we focus all our time on, because doing that as getting as much signal out as possible, doing that as well as possible is the biggest.[00:22:31] thing that we've seen that determines how well we do on that benchmark at the end of the day. Once you've sorted things out, like output structure, how to get it consistently writing diffs and all the stuff that is sort of ancillary to the model actually figuring out how to solve a problem, the core bit of solving the problem is how did the human solve this problem and how can we best come up with how the human solved these problems.[00:22:54] So all the effort went in on that. And the mix that we ended up with was, as you've probably seen in the technical report and so on, all of those different languages and different combinations of different task types, all of that has run through that pipeline, and we've extracted all that information out.[00:23:09] Customizing for Customers[00:23:09] Alessio: How does that differ when you work with customers that have private workflows? Like, do you think, is there usually a big delta between what you get in open source and maybe public data versus like Yeah,[00:23:19] Alistair Pullen: yeah, yeah. When you scrape enough of it, most of open source is updating readmes and docs. It's hilarious, like we had to filter out so much of that stuff because when we first did the 16k model, like the amount of readme updating that went in, we did like no data cleaning, no real, like, we just sort of threw it in and saw what happened.[00:23:38] And it was just like, It was really good at updating readme, it was really good at writing some comments, really good at, um, complaining in Git reviews, in PR reviews, rather, and it would, again, like, we didn't clean the data, so you'd, like, give it some feedback, and it would just, like, reply, and, like, it would just be quite insubordinate when it was getting back to you, like, no, I don't think you're right, and it would just sort of argue with you, so The process of doing all that was super interesting because we realized from the beginning, okay, there's a huge amount of work that needs to go into like cleaning this, getting it aligned with what we want the model to do to be able to get the model to be useful in some way.[00:24:12] Alessio: I'm curious, like, how do you think about the customer willingness? To share all of this historical data, I've done a lot of developer tools investing in my career and getting access to the code base is always one of the hard things. Are people getting more cautious about sharing this information? In the past, it was maybe like, you know, you're using static analysis tool, like whatever else you need to plug into the code base, fine.[00:24:35] Now you're building. A model based on it, like, uh, what's the discussion going into these companies? Are most people comfortable with, like, letting you see how to work and sharing everything?[00:24:44] Alistair Pullen: It depends on the sector, mostly. We've actually seen, I'd say, people becoming more amenable to the idea over time, actually, rather than more skeptical, because I think they can see the, the upside.[00:24:55] If this thing could be, Does what they say it does, it's going to be more help to us than it is a risk to our infosec. Um, and of course, like, companies building in this space, we're all going to end up, you know, complying with the same rules, and there are going to be new rules that come out to make sure that we're looking at your code, that everything is safe, and so on.[00:25:12] So from what we've seen so far, we've spoken to some very large companies that you've definitely heard of and all of them obviously have stipulations and many of them want it to be sandbox to start with and all the like very obvious things that I, you know, I would say as well, but they're all super keen to have a go and see because like, despite all those things, if we can genuinely Make them go faster, allow them to build more in a given time period and stuff.[00:25:35] It's super worth it to them.[00:25:37] Genie Workflow[00:25:37] swyx: Okay, I'm going to dive in a little bit on the process that you have created. You showed the demo on your video, and by the time that we release this, you should be taking people off the waitlist and launching people so people can see this themselves. There's four main Parts of the workflow, which is finding files, planning action, writing code and running tests.[00:25:58] And controversially, you have set yourself apart from the Devins of the world by saying that things like having access to a browser is not that important for you. Is that an accurate reading of[00:26:09] Alistair Pullen: what you wrote? I don't remember saying that, but At least with what we've seen, the browser is helpful, but it's not as helpful as, like, ragging the correct files, if that makes sense.[00:26:20] Like, it is still helpful, but obviously there are more fundamental things you have to get right before you get to, like, Oh yeah, you can read some docs, or you can read a stack overflow article, and stuff like that.[00:26:30] swyx: Yeah, the phrase I was indexing on was, The other software tools are wrappers around foundational models with a few additional tools, such as a web browser or code interpreter.[00:26:38] Alistair Pullen: Oh, I see. No, I mean, no, I'm, I'm not, I'm not, I'm not deri, I'm deriding the, the, the approach that, not the, not the tools. Yeah, exactly. So like, I would[00:26:44] swyx: say in my standard model of what a code agent should look like, uh, Devon has been very influential, obviously. Yeah. Yeah. Because you could just add the docs of something.[00:26:54] Mm-Hmm. . And like, you know, now I have, now when I'm installing a new library, I can just add docs. Yeah, yeah. Cursor also does this. Right. And then obviously having a code interpreter does help. I guess you have that in the form[00:27:03] Alistair Pullen: of running tests. I mean, uh, the Genie has both of those tools available to it as well.[00:27:08] So, yeah, yeah, yeah. So, we have a tool where you can, like, put in URLs and it will just read the URLs. And you can also use this Perplexities API under the hood as well to be able to actually ask questions if it wants to. Okay. So, no, we use both of those tools as well. Like, those tools are Super important and super key.[00:27:24] I think obviously the most important tools to these agents are like being able to retrieve code from a code base, being able to read Stack Overflow articles and what have you and just be able to essentially be able to Google like we do is definitely super useful.[00:27:38] swyx: Yeah, I thought maybe we could just kind of dive into each of those actions.[00:27:41] Code Retrieval[00:27:41] swyx: Code retrieval, one of the core indexer that Yes. You've worked on, uh, even as, as built, what makes it hard, what approach you thought would work, didn't work,[00:27:52] Alistair Pullen: anything like that. It's funny, I had a similar conversation to this when I was chatting to the guys from OpenAI yesterday. The thing is that searching for code, specifically semantically, at least to start with, I mean like keyword search and stuff like that is a, is a solved problem.[00:28:06] It's been around for ages, but at least being able to, the phrase we always used back in the day was searching for what code does rather than what code is. Like searching for functionality is really hard. Really hard. The way that we approached that problem was that obviously like a very basic and easy approach is right.[00:28:26] Let's just embed the code base. We'll chunk it up in some arbitrary way, maybe using an AST, maybe using number of lines, maybe using whatever, like some overlapping, just chunk it up and embed it. And once you've done that, I will write a query saying, like, find me some authentication code or something, embed it, and then do the cosine similarity and get the top of K, right?[00:28:43] That doesn't work. And I wish it did work, don't get me wrong. It doesn't work well at all, because fundamentally, if you think about, like, semantically, how code looks is very different to how English looks, and there's, like, not a huge amount of signal that's carried between the two. So what we ended up, the first approach we took, and that kind of did well enough for a long time, was Okay, let's train a model to be able to take in English code queries and then produce a hypothetical code snippet that might look like the answer, embed that, and then do the code similarity.[00:29:18] And that process, although very simple, gets you so much more performance out of the retrieval accuracy. And that was kind of like the start of our of our engine, as we called it, which is essentially like the aggregation of all these different heuristics, like semantic, keyword, LSP, and so on. And then we essentially had like a model that would, given an input, choose which ones it thought were most appropriate, given the type of requests you had.[00:29:45] So the whole code search thing was a really hard problem. And actually what we ended up doing with Genie is we, um, let The model through self play figure out how to retrieve code. So actually we don't use our engine for Genie. So instead of like a request coming in and then like say GPT 4 with some JSON output being like, Well, I think here we should use a keyword with these inputs and then we should use semantic.[00:30:09] And then we should like pick these results. It's actually like, A question comes in and Genie has self played in its training data to be able to be like, okay, this is how I'm going to approach finding this information. Much more akin to how a developer would do it. Because if I was like, Shawn, go into this new code base you've never seen before.[00:30:26] And find me the code that does this. You're gonna probably, you might do some keywords, you're gonna look over the file system, you're gonna try to figure out from the directories and the file names where it might be, you're gonna like jump in one, and then once you're in there, you're probably gonna be doing the, you know, go to definition stuff to like jump from file to file and try to use the graph to like get closer and closer.[00:30:46] And that is exactly what Genie does. Starts on the file system, looks at the file system, picks some candidate files, is this what I'm looking for, yes or no, and If there's something that's interesting, like an import or something, it can, it can command click on that thing, go to definition, go to references, and so on.[00:31:00] And it can traverse the codebase that way.[00:31:02] swyx: Are you using the VS Code, uh, LSP, or? No,[00:31:05] Alistair Pullen: that's not, we're not like, we're not doing this in VS Code, we're just using the language servers running. But, we really wanted to try to mimic the way we do it as best as possible. And we did that during the self play process when we were generating the dataset, so.[00:31:18] Although we did all that work originally, and although, like, Genie still has access to these tools, so it can do keyword searches, and it can do, you know, basic semantic searches, and it can use the graph, it uses them through this process and figures out, okay, I've learned from data how to find stuff in codebases, and I think in our technical report, I can't remember the exact number, but I think it was around 65 or 66 percent retrieval accuracy overall, Measured on, we know what lines we need for these tasks to find, for the task to actually be able to be completed, And we found about 66 percent of all those lines, which is one of the biggest areas of free performance that we can get a hold of, because When we were building Genie, truthfully, like, a lot more focus went on assuming you found the right information, you've been able to reproduce the issue, assuming that's true, how do you then go about solving it?[00:32:08] And the bulk of the work we did was on the solving. But when you go higher up the funnel, obviously, like, the funnel looks like, have you found everything you need for the task? Are you able to reproduce the problem that's seen in the issue? Are you then able to solve it? And the funnel gets narrower as you go down.[00:32:22] And at the top of the funnel, of course, is rank. So I'm actually quite happy with that score. I think it's still pretty impressive considering the size of some of the codebases we're doing, we're using for this. But as soon as that, if that number becomes 80, think how many more tasks we get right. That's one of the key areas we're going to focus on when we continue working on Genie.[00:32:37] It'd be interesting to break out a benchmark just for that.[00:32:41] swyx: Yeah, I mean, it's super easy. Because I don't know what state of the art is.[00:32:43] Alistair Pullen: Yeah, I mean, like, for a, um, it's super easy because, like, for a given PR, you know what lines were edited. Oh, okay. Yeah, you know what lines were[00:32:50] swyx: you can[00:32:51] Alistair Pullen: source it from Cbench, actually.[00:32:52] Yeah, you can do it, you can do it super easily. And that's how we got that figure out at the other end. Um, for us being able to see it against, um, our historic models were super useful. So we could see if we were, you know, actually helping ourselves or not. And initially, one of the biggest performance gains that we saw when we were work, when we did work on the RAG a bit was giving it the ability to use the LSP to like go to definition and really try to get it to emulate how we do that, because I'm sure when you go into an editor with that, where like the LSP is not working or whatever, you suddenly feel really like disarmed and naked.[00:33:20] You're like, Oh my god, I didn't realize how much I actually used this to get about rather than just find stuff. So we really tried to get it to do that and that gave us a big jump in performance. So we went from like 54 percent up to like the 60s, but just by adding, focusing on that.[00:33:34] swyx: One weird trick. Yes.[00:33:37] I'll briefly comment here. So this is the standard approach I would say most, uh, code tooling startups are pursuing. The one company that's not doing this is magic. dev. So would you do things differently if you have a 10 million[00:33:51] Alistair Pullen: token context window? If I had a 10 million context window and hundreds of millions of dollars, I wouldn't have gone and built, uh, it's an LTM, it's not a transformer, right, that they're using, right?[00:34:03] If I'm not mistaken, I believe it's not a transformer. Yeah, Eric's going to come on at some point. Listen, they obviously know a lot more about their product than I do. I don't know a great deal about how magic works. I don't think he knows anything yet. I'm not going to speculate. Would I do it the same way as them?[00:34:17] I like the way we've done it because fundamentally like we focus on the Active software engineering and what that looks like and showing models how to do that. Fundamentally, the underlying model that we use is kind of null to us, like, so long as it's the best one, I don't mind. And the context windows, we've already seen, like, you can get transformers to have, like, million, one and a half million token context windows.[00:34:43] And that works perfectly well, so like, as soon as you can fine tune Gemini 1. 5, then you best be sure that Genie will run on Gemini 1. 5, and like, we'll probably get very good performance out of that. I like our approach because we can be super agile and be like, Oh, well, Anthropic have just released whatever, uh, you know, and it might have half a million tokens and it might be really smart.[00:35:01] And I can just immediately take my JSONL file and just dump it in there and suddenly Genie works on there and it can do all the new things. Does[00:35:07] swyx: Anthropic have the same fine tuning support as OpenAI? I[00:35:11] Alistair Pullen: actually haven't heard any, anyone do it because they're working on it. They are partner, they're partnered with AWS and it's gonna be in Bedrock.[00:35:16] Okay. As far as, as far as I know, I think I'm, I think, I think that's true. Um, cool. Yeah.[00:35:20] Planning[00:35:20] swyx: We have to keep moving on to, uh, the other segments. Sure. Uh, planning the second piece of your four step grand master plan, that is the frontier right now. You know, a lot of people are talking about strawberry Q Star, whatever that is.[00:35:32] Monte Carlo Tree Search. Is current state of the art planning good enough? What prompts have worked? I don't even know what questions to ask. Like, what is the state of planning?[00:35:41] Alistair Pullen: I think it's fairly obvious that with the foundational models, like, you can ask them to think by step by step and ask them to plan and stuff, but that isn't enough, because if you look at how those models score on these benchmarks, then they're not even close to state of the art.[00:35:52] Which ones are[00:35:52] swyx: you referencing? Benchmarks? So, like,[00:35:53] Alistair Pullen: just, uh, like, SweetBench and so on, right? And, like, even the things that get really good scores on human evalor agents as well, because they have these loops, right? Yeah. Obviously these things can reason, quote unquote, but the reasoning is the model, like, it's constrained by the model as intelligence, I'd say, very crudely.[00:36:10] And what we essentially wanted to do was we still thought that, obviously, reasoning is super important, we need it to get the performance we have. But we wanted the reasoning to emulate how we think about problems when we're solving them as opposed to how a model thinks about a problem when we're solving it.[00:36:23] And that was, that's obviously part of, like, the derivation pipeline that we have when we, when we, when we Design our data, but the reasoning that the models do right now, and who knows what Q star, whatever ends up being called looks like, but certainly what I'm excited on a small tangent to that, like, what I'm really excited about is when models like that come out, obviously, the signal in my data, when I regenerate, it goes up.[00:36:44] And then I can then train that model. It's already better at reasoning with it. improved reasoning data and just like I can keep bootstrapping and keep leapfrogging every single time. And that is like super exciting to me because I don't, I welcome like new models so much because immediately it just floats me up without having to do much work, which is always nice.[00:37:02] But at the state of reasoning generally, I don't see it going away anytime soon. I mean, that's like an autoregressive model doesn't think per se. And in the absence of having any thought Maybe, uh, an energy based model or something like that. Maybe that's what QSTAR is. Who knows? Some sort of, like, high level, abstract space where thought happens before tokens get produced.[00:37:22] In the absence of that for the moment, I think it's all we have and it's going to have to be the way it works. For what happens in the future, we'll have to see, but I think certainly it's never going to hinder performance to do it. And certainly, the reasoning that we see Genie do, when you compare it to like, if you ask GPT 4 to break down step by step and approach for the same problem, at least just on a vibe check alone, looks far better.[00:37:46] swyx: Two elements that I like, that I didn't see in your initial video, we'll see when, you know, this, um, Genie launches, is a planner chat, which is, I can modify the plan while it's executing, and then the other thing is playbooks, which is also from Devin, where, here's how I like to do a thing, and I'll use Markdown to, Specify how I do it.[00:38:06] I'm just curious if, if like, you know,[00:38:07] Alistair Pullen: those things help. Yeah, no, absolutely. We're a hundred percent. We want everything to be editable. Not least because it's really frustrating when it's not. Like if you're ever, if you're ever in a situation where like this is the one thing I just wish I could, and you'd be right if that one thing was right and you can't change it.[00:38:21] So we're going to make everything as well, including the code it writes. Like you can, if it makes a small error in a patch, you can just change it yourself and let it continue and it will be fine. Yeah. So yeah, like those things are super important. We'll be doing those two.[00:38:31] Alessio: I'm curious, once you get to writing code, is most of the job done?[00:38:35] I feel like the models are so good at writing code when they're like, And small chunks that are like very well instructed. What's kind of the drop off in the funnel? Like once you get to like, you got the right files and you got the right plan. That's a great question[00:38:47] Alistair Pullen: because by the time this is out, there'll be another blog, there'll be another blog post, which contains all the information, all the learnings that I delivered to OpenAI's fine tuning team when we finally got the score.[00:38:59] Oh, that's good. Um, go for it. It's already up. And, um, yeah, yeah. I don't have it on my phone, but basically I, um, broke down the log probs. I basically got the average log prob for a token at every token position in the context window. So imagine an x axis from 0 to 128k and then the average log prob for each index in there.[00:39:19] As we discussed, like, The way genie works normally is, you know, at the beginning you do your RAG, and then you do your planning, and then you do your coding, and that sort of cycle continues. The certainty of code writing is so much more certain than every other aspect of genie's loop. So whatever's going on under the hood, the model is really comfortable with writing code.[00:39:35] There is no doubt, and it's like in the token probabilities. One slightly different thing, I think, to how most of these models work is, At least for the most part, if you ask GPT4 in ChatGPT to edit some code for you, it's going to rewrite the entire snippet for you with the changes in place. We train Genie to write diffs and, you know, essentially patches, right?[00:39:55] Because it's more token efficient and that is also fundamentally We don't write patches as humans, but it's like, the result of what we do is a patch, right? When Genie writes code, I don't know how much it's leaning on the pre training, like, code writing corpus, because obviously it's just read code files there.[00:40:14] It's obviously probably read a lot of patches, but I would wager it's probably read more code files than it has patches. So it's probably leaning on a different part of its brain, is my speculation. I have no proof for this. So I think the discipline of writing code is slightly different, but certainly is its most comfortable state when it's writing code.[00:40:29] So once you get to that point, so long as you're not too deep into the context window, another thing that I'll bring up in that blog post is, um, Performance of Genie over the length of the context window degrades fairly linearly. So actually, I actually broke it down by probability of solving a SWE bench issue, given the number of tokens of the context window.[00:40:49] It's 60k, it's basically 0. 5. So if you go over 60k in context length, you are more likely to fail than you are to succeed just based on the amount of tokens you have on the context window. And when I presented that to the fine tuning team at OpenAI, that was super interesting to them as well. And that is more of a foundational model attribute than it is an us attribute.[00:41:10] However, the attention mechanism works in, in GPT 4, however, you know, they deal with the context window at that point is, you know, influencing how Genie is able to form, even though obviously all our, all our training data is perfect, right? So even if like stuff is being solved in 110, 000 tokens, sort of that area.[00:41:28] The training data still shows it being solved there, but it's just in practice, the model is finding it much harder to solve stuff down that end of the context window.[00:41:35] Alessio: That's the scale with the context, so for a 200k context size, is 100k tokens like the 0. 5? I don't know. Yeah, but I,[00:41:43] Alistair Pullen: I, um, hope not. I hope you don't just take the context length and halve it and then say, oh, this is the usable context length.[00:41:50] But what's been interesting is knowing that Actually really digging into the data, looking at the log probs, looking at how it performs over the entire window. It's influenced the short term improvements we've made to Genie since we did the, got that score. So we actually made some small optimizations to try to make sure As best we can without, like, overdoing it, trying to make sure that we can artificially make sure stuff sits within that sort of range, because we know that's our sort of battle zone.[00:42:17] And if we go outside of that, we're starting to push the limits, we're more likely to fail. So just doing that sort of analysis has been super useful without actually messing with anything, um, like, more structural in getting more performance out of it.[00:42:29] Language Mix[00:42:29] Alessio: What about, um, different languages? So, in your technical report, the data makes sense.[00:42:34] 21 percent JavaScript, 21 percent Python, 14 percent TypeScript, 14 percent TSX, um, Which is JavaScript, JavaScript.[00:42:42] Alistair Pullen: Yeah,[00:42:42] swyx: yeah, yeah. Yes,[00:42:43] Alistair Pullen: yeah, yeah. It's like 49 percent JavaScript. That's true, although TypeScript is so much superior, but anyway.[00:42:46] Alessio: Do you see, how good is it at just like generalizing? You know, if you're writing Rust or C or whatever else, it's quite different.[00:42:55] Alistair Pullen: It's pretty good at generalizing. Um, obviously, though, I think there's 15 languages in that technical report, I think, that we've, that we've covered. The ones that we picked in the highest mix were, uh, the ones that, selfishly, we internally use the most, and also that are, I'd argue, some of the most popular ones.[00:43:11] When we have more resource as a company, and, More time and, you know, once all the craziness that has just happened sort of dies down a bit, we are going to, you know, work on that mix. I'd love to see everything ideally be represented in a similar level as it is. If you, if you took GitHub as a data set, if you took like how are the languages broken down in terms of popularity, that would be my ideal data mix to start.[00:43:34] It's just that it's not cheap. So, um, yeah, trying to have an equal amount of Ruby and Rust and all these different things is just, at our current state, is not really what we're looking for.[00:43:46] Running Code[00:43:46] Alessio: There's a lot of good Ruby in my GitHub profile. You can have it all. Well, okay, we'll just train on that. For running tests It sounds easy, but it isn't, especially when you're working in enterprise codebases that are kind of like very hard to spin up.[00:43:58] Yes. How do you set that up? It's like, how do you make a model actually understand how to run a codebase, which is different than writing code for a codebase?[00:44:07] Alistair Pullen: The model itself is not in charge of like setting up the codebase and running it. So Genie sits on top of GitHub, and if you have CI running GitHub, you have GitHub Actions and stuff like that, then Genie essentially makes a call out to that, runs your CI, sees the outputs and then like moves on.[00:44:23] Making a model itself, set up a repo, wasn't scoped in what we wanted Genie to be able to do because for the most part, like, at least most enterprises have some sort of CI pipeline running and like a lot of, if you're doing some, even like, A lot of hobbyist software development has some sort of like basic CI running as well.[00:44:40] And that was like the lowest hanging fruit approach that we took. So when, when Genie ships, like the way it will run its own code is it will basically run your CI and it will like take the, um, I'm not in charge of writing this. The rest of the team is, but I think it's the checks API on GitHub allows you to like grab that information and throw it in the context window.[00:44:56] Alessio: What's the handoff like with the person? So, Jeannie, you give it a task, and then how long are you supposed to supervise it for? Or are you just waiting for, like, the checks to eventually run, and then you see how it goes? Like, uh, what does it feel like?[00:45:11] Alistair Pullen: There are a couple of modes that it can run in, essentially.[00:45:14] It can run in, like, fully headless autonomous modes, so say you assign it a ticket in linear or something. Then it won't ask you for anything. It will just go ahead and try. Or if you're in like the GUI on the website and you're using it, then you can give it a task and it, it might choose to ask you a clarifying question.[00:45:30] So like if you ask it something super broad, it might just come back to you and say, what does that actually mean? Or can you point me in the right direction for this? Because like our decision internally was, it's going to piss people off way more if it just goes off and has, and makes a completely like.[00:45:45] ruined attempt at it because it just like from day one got the wrong idea. So it can ask you for a lot of questions. And once it's going much like a regular PR, you can leave review comments, issue comments, all these different things. And it, because you know, he's been trained to be a software engineering colleague, responds in actually a better way than a real colleague, because it's less snarky and less high and mighty.[00:46:08] And also the amount of filtering has to do for When you train a model to like be a software engineer, essentially, it's like you can just do anything. It's like, yeah, it looks good to me, bro.[00:46:17] swyx: Let's[00:46:17] Alistair Pullen: ship it.[00:46:19] Finetuning with OpenAI[00:46:19] swyx: I just wanted to dive in a little bit more on your experience with the fine tuning team. John Allard was publicly sort of very commentary supportive and, you know, was, was part of it.[00:46:27] Like, what's it like working with them? I also picked up that you initially started to fine tune what was publicly available, the 16 to 32 K range. You got access to do more than that. Yeah. You've also trained on billions of tokens instead of the usual millions range. Just, like, take us through that fine tuning journey and any advice that you might have.[00:46:47] Alistair Pullen: It's been so cool, and this will be public by the time this goes out, like, OpenAI themselves have said we are pushing the boundaries of what is possible with fine tuning. Like, we are right on the edge, and like, we are working, genuinely working with them in figuring out how stuff works, what works, what doesn't work, because no one's doing No one else is doing what we're doing.[00:47:06] They have found what we've been working on super interesting, which is why they've allowed us to do so much, like, interesting stuff. Working with John, I mean, I had a really good conversation with John yesterday. We had a little brainstorm after the video we shot. And one of the things you mentioned, the billions of tokens, one of the things we've noticed, and it's actually a very interesting problem for them as well, when you're[00:47:28] How big your peft adapter, your lore adapter is going to be in some way and like figuring that out is actually a really interesting problem because if you make it too big and because they support data sets that are so small, you can put like 20 examples through it or something like that, like if you had a really sparse, large adapter, you're not going to get any signal in that at all.[00:47:44] So they have to dynamically size these things and there is an upper bound and actually we use. Models that are larger than what's publicly available. It's not publicly available yet, but when this goes out, it will be. But we have larger law adapters available to us, just because the amount of data that we're pumping through it.[00:48:01] And at that point, you start seeing really Interesting other things like you have to change your learning rate schedule and do all these different things that you don't have to do when you're on the smaller end of things. So working with that team is such a privilege because obviously they're like at the top of their field in, you know, in the fine tuning space.[00:48:18] So we're, as we learn stuff, they're learning stuff. And one of the things that I think really catalyzed this relationship is when we first started working on Genie, like I delivered them a presentation, which will eventually become the blog post that you'll love to read soon. The information I gave them there I think is what showed them like, oh wow, okay, these guys are really like pushing the boundaries of what we can do here.[00:48:38] And truthfully, our data set, we view our data set right now as very small. It's like the minimum that we're able to afford, literally afford right now to be able to produce a product like this. And it's only going to get bigger. So yesterday while I was in their offices, I was basically, so we were planning, we were like, okay, how, this is where we're going in the next six to 12 months.[00:48:57] Like we're, Putting our foot on the gas here, because this clearly works. Like I've demonstrated this is a good, you know, the best approach so far. And I want to see where it can go. I want to see what the scaling laws like for the data. And at the moment, like, it's hard to figure that out because you don't know when you're running into like saturating a PEFT adapter, as opposed to actually like, is this the model's limit?[00:49:15] Like, where is that? So finding all that stuff out is the work we're actively doing with them. And yeah, it's, it's going to get more and more collaborative over the next few weeks as we, as we explore like larger adapters, pre training extension, different things like that.[00:49:27] swyx: Awesome. I also wanted to talk briefly about the synthetic data process.[00:49:32] Synthetic Code Data[00:49:32] swyx: One of your core insights was that the vast majority of the time, the code that is published by a human is encrypted. In a working state. And actually you need to fine tune on non working code. So just, yeah, take us through that inspiration. How many rounds, uh, did you, did you do? Yeah, I mean, uh,[00:49:47] Alistair Pullen: it might, it might be generous to say that the vast majority of code is in a working state.[00:49:51] I don't know if I don't know if I believe that. I was like, that's very nice of you to say that my code works. Certainly, it's not true for me. No, I think that so yeah, no, but it was you're right. It's an interesting problem. And what we saw was when we didn't do that, obviously, we'll just hope you have to basically like one shot the answer.[00:50:07] Because after that, it's like, well, I've never seen iteration before. How am I supposed to figure out how this works? So what the what you're alluding to there is like the self improvement loop that we started working on. And that was in sort of two parts, we synthetically generated runtime errors. Where we would intentionally mess with the AST to make stuff not work, or index out of bounds, or refer to a variable that doesn't exist, or errors that the foundational models just make sometimes that you can't really avoid, you can't expect it to be perfect.[00:50:39] So we threw some of those in with a, with a, with a probability of happening and on the self improvement side, I spoke about this in the, in the blog post, essentially the idea is that you generate your data in sort of batches. First batch is like perfect, like one example, like here's the problem, here's the answer, go, train the model on it.[00:50:57] And then for the second batch, you then take the model that you trained before that can look like one commit into the future, and then you let it have the first attempt at solving the problem. And hopefully it gets it wrong, and if it gets it wrong, then you have, like, okay, now the codebase is in this incorrect state, but I know what the correct state is, so I can do some diffing, essentially, to figure out how do I get the state that it's in now to the state that I want it in, and then you can train the model to then produce that diff next, and so on, and so on, and so on, so the model can then learn, and also reason as to why it needs to make these changes, to be able to learn how to, like, learn, like, solve problems iteratively and learn from its mistakes and stuff like that.[00:51:35] Alessio: And you picked the size of the data set just based on how much money you could spend generating it. Maybe you think you could just make more and get better results. How, what[00:51:42] Alistair Pullen: multiple of my monthly burn do I spend doing this? Yeah. Basically it was, it was very much related to Yeah. Just like capital and um, yes, with any luck that that will be alleviated to[00:51:53] swyx: very soon.[00:51:54] Alistair Pullen: Yeah.[00:51:54] SynData in Llama 3[00:51:54] swyx: Yeah. I like drawing references to other things that are happening in, in the, in the wild. So, 'cause we only get to release this podcast once a week. Mm-Hmm. , the LAMA three paper also had some really interesting. Thoughts on synthetic data for code? I don't know if you have reviewed that. I'll highlight the back translation section.[00:52:11] Because one of your dataset focuses is updating documentation. I think that translation between natural language, English versus code, and

Go Time
Big shoes to fill

Go Time

Play Episode Listen Later Aug 13, 2024 66:05


Kris, Angelica & Johnny react to the recently announced Go team changes, discuss the finding that 80% of developers surveyed by Stack Overflow are unhappy & disagree about the concept of tech debt (but agree that something's gotta give).

Giant Robots Smashing Into Other Giant Robots
537: Navigating the Startup Ecosystem with Marc Gauthier

Giant Robots Smashing Into Other Giant Robots

Play Episode Listen Later Aug 8, 2024 45:49


In the latest episode of the "Giant Robots On Tour" podcast, hosts Rémy Hannequin and Sami Birnbaum welcome Marc G. Gauthier, a solopreneur and startup coach, who shares his journey from software development to becoming the founder and developer of The Shadow Boxing App. Marc describes how his interest in software engineering began at a young age with QBasic and evolved through various leadership roles at companies like Drivy (now Getaround) and Back Market. His early passion for gaming led him to learn coding, and over time, he naturally transitioned into management roles, finding excitement in organizing and leading teams while maintaining his love for building products. During the episode, Marc discusses the challenges and intricacies of scaling startups, emphasizing the importance of balancing speed and reliability in software development. He recounts his experiences in leadership positions, where he faced the dual task of managing rapid team growth and maintaining software efficiency. Marc also shares insights into the startup ecosystem, noting that most startups struggle to achieve success due to a combination of market timing, team dynamics, and resource management. His own venture, The Shadow Boxing App, represents his attempt to return to hands-on coding while leveraging his extensive experience in startup coaching and advising. Marc also touches on the role of AI in the future of software development, expressing cautious optimism about its potential to augment human workflows and automate repetitive tasks. He advises current and aspiring developers to embrace AI as a tool to enhance their capabilities rather than a replacement for human ingenuity. Marc concludes by highlighting the importance of realistic expectations in the startup world and the need for continuous learning and adaptation in the ever-evolving tech landscape. Getaround (https://getaround.com/) Follow Getaround on LinkedIn (https://www.linkedin.com/company/getaround/), Facebook (https://www.facebook.com/getaround), X (https://twitter.com/getaround), YouTube (https://www.youtube.com/getaround), or Instagram (https://www.instagram.com/getaround/). Back Market (https://www.backmarket.com/en-us) Follow Back Market on LinkedIn (https://www.linkedin.com/company/back-market/), Facebook (https://www.facebook.com/BackMarketCom), X (https://x.com/backmarket), or Instagram (https://www.instagram.com/backmarket). The Shadow Boxing App (https://shadowboxingapp.com/) Follow Marc Gauthier on LinkedIn (https://www.linkedin.com/in/marcggauthier/). Follow thoughtbot on X (https://twitter.com/thoughtbot) or LinkedIn (https://www.linkedin.com/company/150727/). Transcript: RÉMY:  This is the Giant Robots Smashing Into Other Giant Robots podcast, the Giant Robots on Tour series coming to you from Europe, West Asia, and Africa, where we explore the design, development, and business of great products. I'm your host, Rémy Hannequin. SAMI: And I'm your other host, Sami Birnbaum. RÉMY: If you are wondering who we are, make sure you find the previous podcast where we introduced the Giant Robots on Tour series by throwing random icebreakers at each other. And find out that Jared likes it when someone takes the time to understand someone else's point of view. Joining us today is Marc G Gauthier, a Solopreneur and Startup Coach. Marc, you used to be VP of Engineering at Drivy, now known as Getaround, and also Director of Engineering at Back Market. You also have been a coach and advisor to a startup for over a decade. Currently, your current adventure is being the Founder and Developer of The Shadow Boxing App available on the Apple App Store. We always like to go back to the start with our guests. Everyone has a story, and we are interested in your journey. So, Marc, what led you into the world of software engineering in the first place? MARC: Hello. Well, happy to be here. And, yeah, I started getting into software development quite a long time ago. I actually learned software development with QBasic when I was something like seven. And, from there, I just kept on learning, learning, and learning and got into school for it, then worked in different startups, and then moved into more leadership position management. And I'm now, like, coaching people and building my own product. What do you want to get? Because it's broad. I've been doing it for quite a while. Like, I don't think the QBasic days are that insightful. The only thing I remember from that time is being confused by the print comment that I would expect it to print on my printer or something, but it didn't; it just printed on the screen. That's the only thing I have from back then. SAMI: Why at seven years old? And I'm taking you back too far, but at seven years old, I was probably collecting Pokémon cards and possibly like, you know, those football stickers. I don't know if you had the Panini stickers. MARC: Oh yeah, I was doing that as well. SAMI: But you were doing that as well. But then what drove you at that age? What do you think it was that made you think, I want to start learning to code, or play around with the computer, or get into tech? MARC: [laughs] Yeah. Well, I remember, back then, I really wanted a computer to play games. Like, I had a friend who had a computer. He was playing games, and I wanted to do that. So, I was asking my mom to have a computer, and she told me, "Yeah, you can have one." And she found a really old computer she bought from a neighbor, I think. But she told me like, "I don't know anything about it. So, you have to figure it out and set it up." And she just found someone to kind of help me. And this person told me to, like, take the computer apart. She taught me a bit of software development, and I kind of liked it. And I was always trying to change the games. Back then, it was way easier. You could just edit a sound file, and you would just edit the sound file in the game, so yeah, just learning like this. It wasn't really my intent to learn programming. It just kind of happened because I wanted to play video games really. SAMI: That's really cool. It's really interesting. Rémy, do you remember how...how did you first get...do you remember your first computer, Rémy? RÉMY: My first computer, I think I remember, but the first one I used it was, first, a very long time ago. I discovered that it was an Apple computer way, way later when I discovered what Apple was and what computers were actually. And I just remember playing SimCity 2000 on it, and it was amazing. And we had to, you know, cancel people from making phone calls while we were on the computer because of the internet and all the way we had to connect to the internet back then. And after that, just, I think, Windows 95 at home. Yeah, that's the only thing I can remember actually. Because I think I was lucky, so I got one quite early. And I don't really remember not having one, so I was quite lucky with that. And so, I was always kind of in the computer game without being too much [inaudible 05:02] [laughs]. SAMI: Yeah, I think that's similar to me as well. Like, it's interesting because my initial introduction to computers would have been watching my older brothers kind of play computer games and actually being told to get out the room, or like, you know, "We're busy now. Don't bother us." And then, what actually happened is when they left the room, I managed to play what they were playing, which was the first ever GTA. I don't know if anyone ever played this, but it is so cool if you look back on it. You could probably find emulators online, but it was, like, a bird's eye view, like, way of operating. And it was probably also that drive where you get frustrated on a computer because you want to do something, so, like you were saying, Marc, where you went to edit the sound files because you want to change something. You want to do something. I definitely think that is something which I felt as well is that frustration of I want to change this thing. And then, that kind of gets into well, how does it work? And if I know how it works, then I can probably change it. MARC: Yeah. And once you figure out how things work, it's also really exciting. Like, once you figure out the initialization file on Windows, like, you can edit, like, what level is unlocked right away. It's kind of cheat codes but not really. And there are some really fun ones. Like, I would edit sound files for racing games. And, usually, it's just a base sound file, and then they would pitch shift the sound to make it sound like an engine. So, if you record your voice, it's just really funny. RÉMY: So, Marc, you mentioned moving to management positions quite early. Do you remember what made you do this move? Was it for, like, a natural path in your career, or was it something you really wanted from the first part of your career as a developer? What happened at this moment? MARC: Yeah, that was not completely planned. Like, I don't think I really plan my career precisely. It's just something that happens. So, I joined Drivy after, like, I was already a software engineer for, like, five years at that point. I joined as a lead backend engineer. I did that for three years. And after three years, the company went from...I think there was, like, three software engineers to a dozen. There was a need for more structure, and the CTO, at the time so, Nicolas, wanted to focus more on products. And it was hard to do both, like do the product side, the design, the data, and do the engineering, the software, and so on. So, he wanted to get a bit away from software engineering and more into product. So, there was a gap in the organization. I was there. I was interested to try, and I was already doing some more things on the human side, so talking to people, organizing, internal communication. I kind of liked it. So, I was excited to try, give it a try. It was really interesting. I found that it was a different way to have an impact on the team. I just kept doing it. And my plan was to keep doing it until I'm bored with it. And I'm still not bored with it, even though you kind of miss just actually building the software yourselves, actually coding. So, that's also why I'm trying something different right now with my mobile app adventure. SAMI: Right. So, on the side, you've got this Shadow Boxing App, which, in my dedicated research, I downloaded and had a go with it. MARC: Did you actually try it, or did you just click around? SAMI: I did a proper workout, mate. I did. I put myself as, like, the absolute beginner. I did it on my MacBook Pro. I know it's built for iPad or iPhone, but it still worked amazingly well. And it kind of reminded me why I stopped doing boxing because it's hard work. MARC: [laughs] Yeah, it is. SAMI: It's not a gimmick this thing, right? So, it's like, the best way to describe it is it's essentially replacing if I was to go to the gym and have a trainer who's telling me kind of the moves to make or how to do it, then this kind of replaces that trainer. So, it's something you can do at home. It was really cool. I was surprised, actually. I thought, at the beginning, it's not going to be that interactive, or it won't actually be as hard or difficult as a workout, and it really was. So, it's, yeah, it was really cool, really interesting to try it. And going into that, you say you wanted to get back more into coding, and that's why you are doing this kind of, like, app on the side, or it allowed you to kind of do a bit more coding away from the people management. You've been involved in a lot of startups, and I actually often get...as consultants, when we work at thoughtbot, we get a lot of people who come with different startup ideas. When you look back at all the startups you've been involved with, do you think more startups are successful than those that fail? Or have you seen a lot of startups...actually, people come with these great ideas; they want to build this amazing product, but it's actually really hard to be a successful product? MARC: I think it's [inaudible 10:22] how to have the right idea, be at the right spot at the right time, build the right team, get enough momentum. I think most startups fail, and even startups that are successful often can be the result of a pivot. Like, I know companies that pivoted a bunch of times before finding any success. So, it's really hard actually...if I take my past four companies, only two are still alive. Like, the first two went under. Actually, there's even more companies that went under after I left. Yeah, it's just really hard to get anything off the ground. So, yeah, it's complicated, and I have a lot of respect for all the founders that go through it. For The Shadow Boxing App, I worked on it for the past three years, but I'm only working on it almost full-time for the past two months. And it was way safer. I could check the product-market fit. I could check if I enjoyed working on it. So, I guess it was easier. I had the luxury of having a full-time job. Building the app didn't take that much time. But to answer your question, I think, from my experience, most startups fail. And the ones that succeed it's kind of lightning in a bottle, or, like, there's a lot of factors that get into it. It's hard to replicate. A lot of people try to replicate some science, some ideas. They go, oh, we'll do this, and we'll do that. And we use this technique that Google uses and so on, but it's never that straightforward. SAMI: Yeah, I'm so happy you said that because I think it's a real brutal truth that I'd also say most of the startup projects that I've worked on probably have failed. Like, there's very few that actually make it. It's such a saturated market. And I think, I guess, in your role as advising startups, it's really good to come in with that honesty at the beginning and to say, "It's a big investment if you want to build something. Most people probably aren't successful." And then, when you work from that perspective, you can have, like, way more transparent and open discussions from the get-go. Because when you're outside of tech...and a lot of people have this idea of if I could just get an app to do my idea, I'm going to be the next Facebook. I'm going to be the next, you know, Amazon Marketplace. And it just kind of isn't like that. You've got these massive leaders in Facebook, Amazon, Google, Netflix. But below that, there's a lot of failures and a massively saturated market. So, yeah, just, it's so interesting that you also see it in a similar way. MARC: What I saw evolve in the past 10 years is the fact that people got more realistic with it. So, maybe 10 years ago, I would have people coming to me with just the most ridiculous idea, like, you know, I'll do Airbnb for cats. And really think, yeah, I just need a good idea, and that's it. But now I feel like people kind of understand that it's more complicated. There's way more resources online. People are more educated. They also see way more successes. Failures are also a bit more advertised. We saw a bunch of startups just go under. It feels like every month I get an email from a tool I used in the past saying, "Oh, we're shutting down," and so on. So, I think it's not as bad as 10 years ago where weekly I would have just people asking me, "I want to build this app," and the app would be just the most ridiculous thing or something that would be really smart, but it's really like, "Oh, I want to do, like, food delivery but better than what exists." It's like, yeah, that's a really good idea, but then you need...it's not only software. There's logistics. There's so much behind it that you don't seem to understand just yet. But, as a coach, so, what I'm doing is I'm helping startups that are usually before or after series A but not too large of startups just go to the next stage. And people are really aware of that and really worried. Like, they see money going down, market fit not necessarily being there. And they know, like, their company is at risk. And especially when you talk to founders, they're really aware that, you know, everything could be collapsing really quickly. If they make, like, three really bad decisions in a row, you're basically done. Obviously, it depends on the company, but yeah, people are more aware than before, especially nowadays where money is a bit harder to get. Let's say two years ago, there was infinite money, it felt like. Now it's more tight. People are more looking at the unit economics precisely. So, people need to be more realistic to succeed. RÉMY: What's the kind of recurrent struggle the startups you coach usually face? Apparently, it quite changed in the past decade, but maybe what are the current struggles they face? MARC: It really depends. It's kind of broad. But, usually, it would be, let's say, a startup after their first round of funding, let's say, if you take startups that are looking for funding. So, you usually have a group of founders, two to four, usually two or three, that are really entrepreneurs that want to bootstrap some things. They're builders. They're hacking things together, and they're really excited about the product. And, suddenly, fast forward a few years, they're starting to be successful, and they have to lead a team of, you know, like, 50 people, 100 people, and they weren't prepared for that. They were really prepared to, like, build software. Like, especially the CTOs, they are usually really great hackers. They can, like, create a product really quickly. But, suddenly, they need to manage 30 engineers, and it's completely different, and they're struggling with that. So, that's a common problem for CTOs. And then, it creates a bunch of problems. Like, you would have CEOs and CTOs not agreeing on how to approach the strategy, how to approach building a thing. What should be the methodology? Something that worked with 3 engineers around the table doesn't work with 50 engineers distributed in 5 countries. And if it's your first time being a CTO, and often founders of early-stage startups are first-time CTOs, it can be really hard to figure out. MID-ROLL AD: Are your engineers spending too much time on DevOps and maintenance issues when you need them on new features? We know maintaining your own servers can be costly and that it's easy for spending creep to sneak in when your team isn't looking. By delegating server management, maintenance, and security to thoughtbot and our network of service partners, you can get 24x7 support from our team of experts, all for less than the cost of one in-house engineer. Save time and money with our DevOps and Maintenance service. Find out more at: tbot.io/devops. RÉMY: In your past companies, so you've been VP and CTO. So, in your opinion, what's the best a VP or a CTO can bring to a scaling startup? What are your best tips to share? MARC: I guess it depends [laughs], obviously, like, depending on the stage of the company, the size of the company. For instance, when I was at Drivy, at some point, the most important thing was scaling the team hiring, and so on. But, at some point, we got acquired by Getaround, and the priorities got shifted. It was more like, okay, how do you figure out this new setup for the company and the team? Like, what is good? What is bad? How do you communicate with the team? How do you get people to stay motivated when everything is changing? How do you make sure you make the right decisions? And then, when I joined Back Market, Back Market when I joined, I had a team of a bit less than 12 engineers reporting directly to me. And after a bit more than a year, I had 60, and I hired most of them. So, here the challenge was just scaling insanely fast. Like, the company is really successful. Like, Back Market is selling refurbished electronics in a mission to, you know, provide a viable alternative to buying new electronics. So, it's basically, do you want a smartphone that is both cheaper and more ecologically viable? And most people would say yes to that. So, a company is insanely successful, but it's really hard to scale. So, at that point, the role was, okay, how do you make sure you scale as well as possible with a lot of pressure while still leaving the team in a state that they're able to still build software? Because it's just really chaotic. Like, you can't, like, 5X your team without chaos. But how do you minimize that but still go really fast? SAMI: Yeah. So, not only did I try that Shadow App. I actually went on that Backup website. What's it called? It's not called Backup. What's it called again? MARC: Back Market. SAMI: Back Market. Thank you. Yeah, it was really cool. I checked my old iPhone SE from 2020, which I've kept for about...over three years, I've had this iPhone. And they said they would give me $72 for it, which was really cool. So, it sounds like a really cool idea. MARC: That's something we worked on, which is, basically, if you have any old phones in your drawer, it's a really bad spot for them. And so, there's a service. You go on the website. You say, "I have this, I have that; I have this, I have that." And either we buy it from you, or we just take it away from you, and we recycle them, which is much better than just having them collect dust. SAMI: Yeah, no, it's a great idea. What interested me when you were speaking about kind of these different positions that you've been in, I was almost expecting you to talk about maybe, like, a technical challenge or code complexity difficulty. But, actually, what you've described is more people problems. And how do we scale with regards to people, and how do we keep people motivated? So, I guess using that experience, and this might be counterintuitive to what a lot of people think, but what do you think is the hardest thing about software development? I know there could be many things. But if you had to pick something that is the most difficult, and maybe we can all have an answer to what we think this is, but starting with you, Marc, what do you think is the hardest thing about software development then? MARC: What I saw is how do you build something that works for enough time to bring value to the customers? So, it's easy to hack something together pretty quickly and get it in front of people, but then it might not be reliable. It might break down. Or you could decide to build something perfect and spend, like, two years on it and then ship it, and then it's really stable, but maybe it's not what people want. And finding this balance between shipping something fast, but shipping something that is reliable enough for what you're building. Obviously, if you're building a health care system, you will have more, like, the bar will be higher than if you build, like, Airbnb for cats. Finding this balance and adjusting as you go is really hard. So, for instance, when do you introduce caching? Because, obviously, caching is hard to do right. If you don't do it, your site will be slow, which can be okay for a time. But then if you introduce it too late, then it's really hard to just retrofit into whatever you already have. So, finding the right moment to introduce a new practice, introduce a new technology is tricky. And then, like, I talked a lot about the people, and it's also because I spent quite a bit of time in leadership position. But, at the end of the day, it will be the people writing the code that gets the software to exist and run. So, having people aligned and agreeing on the vision is also key because unless I'm the only developer on the project, I can't really make all decisions on things that are going to get built. So, figuring out how to get people motivated, interested in just building in the same direction is really important. It's really easy. Like, one thing with Drivy, when I was there, that was really fun to see, like, many people have this reaction, especially the more senior people joining the company. They would see the engineering team, and they were really, really surprised by how small it was because we were being really, really efficient. Like, we were paying really close attention to what we would work on. So, kind of technology we would introduce would be quite conservative on both to really be able to deliver what is the most important. So, we were able to do a lot with, honestly, not a lot of people. And I think this is a great mark for success. You don't need a thousand people to build your software if you ask the right question, like, "Do I need to build X or Y?" and always having these discussions. RÉMY: What's your opinion on that, Sami? SAMI: Yeah, I guess it changes. Like, for example, today, the hardest thing about software development was just getting Jira to work. That has literally ruined my whole day. But I've found, for me, what I find is the most difficult thing to do is making code resilient to change. What I mean by that is writing code that's easy to change. And a lot of that, I guess, we try to work on at thoughtbot, as consultants, is following kind of design principles and best practices and certain design patterns that really make the code easy to change. Because that, I think, when I'm writing code is the biggest challenge. And where I feel when I'm working with our clients one of the biggest things they can invest in, which is difficult because there's not a lot of visibility around it or metrics, is ensuring that code that's written is easy to change because, at some point, it will. And I've also worked on systems which are bigger, and when you can't change them, conversations start happening about the cost of change. Do we rewrite it from the ground up again? And that opens a whole different can of worms. So, that, for me, I think, is definitely one of the hardest things. How about yourself, Rémy? RÉMY: I don't know about the most difficult. I mean, there are many things difficult. But I remember something that I had to put extra effort, so maybe it was one of the most difficult for me. When I started being a consultant, when I joined thoughtbot was to understand what's the boundary between executing and giving an advice? So, basically, I discovered that when you're a consultant, but it works also when you're a developer in a team, you know, you're not just only the one who is going to write the code. You're supposed to be also someone with expertise, experience to share it and to make the project and the team benefit from it. So, at some point, I discovered that I should not just listen to what the client would say they want. Obviously, that's what they want, but it's more interesting and more difficult to understand why they want it and why they actually need, which could be different from what they want. So, it's a whole different conversation to discover together what is actually the necessary thing to build, and with your expertise and experience, try to find the thing that is going to be the most efficient, reliable, and making both the client and the customers happy. MARC: Yeah. And as software engineers, it's really easy to get excited about a problem and just go, "Oh, I could solve it this way." But then you need to step back and go, "Well, maybe it doesn't need fixing, or we should do something completely different." At some point, I was working with a customer service organization. In their workflows, they had to go on, let's say, five different pages and click on the button to get something to do one action. And so, what they asked for is to have those five buttons on one single page, and so, they could go, click, click, click, click, click. But after looking at it, what they needed is just automation of that, not five buttons on the page. But it's really easy to go, oh, and we could make those buttons, like, kind of generic and have a button creator thing and make it really fancy. When you step back, you go, oh, they shouldn't be clicking that many buttons. SAMI: Yeah, that makes so much sense because just in that example...I can't remember where I read this, but every line of code you write has to be maintained. So, in that example where you've got five buttons, you're kind of maintaining probably a lot more code than when you've got the single button, which goes to, I don't know, a single action or a method that will handle kind of all the automation for you. And that's also, you know, driving at simplicity. So, sometimes, like, you see this really cool problem, and there's a really cool way to solve it. But if you can solve it, you mentioned, like, being conservative with the type of frameworks maybe you used in a previous company, like, solve it in the most simple way, and you'll thank yourself later. Because, at some point, you have to come back to it, and maintain it, change it. Yeah, so it makes a lot of sense. And, Marc, you said you started when you were 7, which is really young. Through that amount of time, you've probably seen massive changes in the way websites look, feel, and how they work. In that time, what's the biggest change you actually think you've seen? MARC: The biggest thing I saw is, when I started, internet didn't exist or at least wasn't available. Like, I remember being at school and the teacher would ask like, "How many people have a computer at home?" And we'd be like, two or three people. So, people didn't have internet until I was like 14, 15, I'd say. So, that's the biggest one. But, let's say, after it started, they just got more complicated. Like, so, the complexity is getting crazy. Like, I remember, at some point, where I saw I think it was called Aviary. It was basically Photoshop in the browser, and I was just insanely impressed by just the fact that you could do this in the browser. And, nowadays, like, you've got Figma, and you've got so many tools that are insanely impressive. Back then, it was just text, images, and that's it. I actually wrote a blog post a few years ago about how I used to build websites just using frames. So, I don't know if you're familiar with just frames, but I didn't really know how to do divs. So, I would just do frames because that's what I understood back then, again, little kid. But it was kind of working. You were dealing with IE 5 or, like, I remember, like, professionally fixing bugs for IE 5.5 or, like, AOL, like, 9, something ridiculous like this. So, building a website just got way easier but also way more complicated, if that makes sense. Like, it's way easier to do most things. For instance, I don't know, like, 20 years ago, you wanted a rounded corner; you would have to create images and kind of overlay them in a weird way. It would break in many cases. Nowadays, you want rounded corners? That's a non-topic. But now you need, like, offline capabilities of your website. And, in a lot of cases, there's really complex features that are expected from users. So, the bar is getting raised to crazy levels. SAMI: Yeah, I always wonder about this. Like, when you look at how the internet used to be and how people develop for the internet, and, like you're saying, now it's more complex but easier to do some things. I don't know if as developers we're making things harder or easier for ourselves. Like, if you look at the amount of technology someone needs to know to get started, it grows constantly. To do this, you have to add this framework, and you need to have this library, and maybe even a different language, and then, to even host something now, the amount of technologies you need to know. Do you think we're making things harder for ourselves, or do you think easier? MARC: Well, I guess there's always back and forth, like, regarding complexity. So, things will get really, really complex, and then someone will go, "Well, let's stop that and simplify." That's why, like, I'm seeing some people not rejecting React and so on, but going a simpler route like Rails has options like this. There's people using HTMX, which is really simple. So, just going back to something simpler. I think a lot of the really complex solutions also come from the fact that now we have massive teams building websites, and you need that complexity to be able to handle the team size. But it's kind of, then you need more people to handle the complexity, and it's just getting crazy. Yeah, honestly, I don't know. I'm seeing a lot of things that feel too complex for...like, the technology feels really complicated to accomplish some things that should be simple or at least feel simple. But, at the same time, there are things that got so simple that it's ridiculous like just accepting payment. I remember, like, if you wanted to accept payment on a site, it would be months of work, and now it takes a minute. You just plug in Stripe, and it works. And it's often cheaper than what it used to be. So, it's kind of...or deploying. You mentioned deploying can be really hard. Well, you don't need to have a physical server in your room just eating your place up to have your website, your personal website running. You just push it to Vercel, or Heroku, or whatever, or just a static page on S3. So, this got simpler, but then, yeah, you can get it to be so much more crazy. So, if you host your static website on S3, fairly simple. But then if you try to understand permissions on S3, then, you know, it's over. RÉMY: I don't know if it's really in the path of our discussion. I just wanted to ask you, so this is the on tour series, where we...so, usually, the Giant Robots podcast used to be a little bit more American-centric, and this on tour is moving back to the other side of the Atlantic with, again, Europe, West Asia, and Africa. You've been part of a company, Drivy, which expanded from France to neighboring countries in Europe. What could you tell our listeners about how to expand a business internationally? MARC: That's a tough question, especially in Europe. Because I know looking from the outside, like, if you're from the U.S. and you look at Europe, it feels like, you know, a uniform continent, but really, it's very different. Like, just payment methods are different. Culture is very different. For instance, when I was working at Back Market in France, one of the branding aspects of Back Market was its humor. Like, we would be making a lot of jokes on the website, and it would work really well in France. Like, people would love the brand. But then you expand to other countries, and they just don't find that funny at all. Like, it's not helping at all, and they're expecting a different tone of voice. So, it's not just, okay, I need to translate my own page; it's I need to internationalize for this market. I guess my advice is do it country by country. Sometimes I see companies going like, oh, we opened in 20 different countries, and you go, how even do you do that? And spend some time understanding how people are using your product or, like, a similar product locally because you would be surprised by what you learn. Sometimes there's different capabilities. For instance, when Drivy went to the UK, there's so much more you can learn. There's the government database that you can look up, and it really helps with managing risk. If people are known to steal cars, you can kind of figure it out. I'm simplifying a bit, but you can use this. You don't have that in France because we just don't have this solution. But if you go to Nordic countries, for instance, they have way more electric vehicles, so maybe the product doesn't work as well. So, it's really understanding what's different locally and being willing to invest, to adapt. Because if you go, okay, I'm going to open in the Netherlands but you don't adopt the payment methods that are used in the Netherlands, you might as well not open at all. So, it's either you do it properly and you kind of figure out what properly means for your product, or you postpone, and you do it well later. Like, right now, I'm struggling a bit with my app because it's open. So, it's on the App Store, so it's open globally. And it's a SaaS, so it's simpler, but I struggle with language. So, it's in French and English. I spoke both of this language, obviously, French better than English. But I think I'm doing okay with both. But I also built it in Spanish because I speak some Spanish fairly poorly, and I wanted to try to hit a different market like the Mexican market that are doing boxing quite a lot. But the quality doesn't seem there. Like, I don't have the specific boxing lingo, so I'm contemplating just rolling it back, like, removing the Spanish language until I get it really well, maybe with a translator dedicated to it that knows boxing in Spanish. Because I work with translators that would translate, but they don't really know that, yeah, like a jab in boxing. In Spanish, they might also say, "Jab." They won't translate it to, like, [inaudible 38:31]. SAMI: Yeah. At thoughtbot, we have one of our clients they wanted to release their app also internationally. And so, we had also kind of a lot of these problems. We even had to handle...so, in some languages, you go from left to right, right to left. So, that kind of also changed a lot of the way you would design things is mainly for people who are going from left to right. I mean, that's thinking kind of more Europe, U.S.-centric. And then, you could be releasing your app into a different country where they read the other direction. So, yeah, a lot of this stuff is really interesting, especially the culture, like you're saying. Do they find this humor funny? And then, how do they translate things? Which, in my head, I think, could you use AI to do that. Which is a nice segue into, like, the mandatory question about AI, which we can't let you go until we ask you. MARC: [laughs] SAMI: So, okay, obviously, I'm going to ask you about your thoughts on AI and where you think we're headed. But I've seen something interesting, which I don't know if this is something that resonates with you as well. I've seen a bit of a trend where the more experienced developers or more senior developers I talk to seem to be a bit more calm and less concerned. Whereas I would consider myself as less experienced, and I feel, like, kind of more anxious, more nervous, more jumping on the bandwagon sort of feeling of keeping an eye on it. So, I guess, with your experience, what are your thoughts on AI? Where do you think we are headed? MARC: That's a big question, and it feels like it's changing month to month. It feels way more interesting than other trends before. Like, I'm way more excited about the capabilities of AI than, like, NFTs or stuff like this. I'm actively using AI tooling in my app. I was using some AI at Back Market. So, it's interesting. There's a bunch of things you can be doing. Personally, I don't think that it's going to, like, make programming irrelevant, for instance. It will just change a bit how you will build things just like...so, we talked about what changed in the past. For instance, at some point, you would need a team of people moving around physical computers and servers and just hooking them up to be able to have a website. But now, most people would just use a cloud provider. So, all those people either they work for the cloud provider, or they're out of a job. But really what happened is most shifted into something different, and then we focused on something different. Instead of learning how to handle a farm of servers, we learned how to, I don't know, handle more concurrency in our models. And I think when I look back, I feel like, technically, maybe, I don't know, 70%, 80% of what I learned is now useless. Like, I spent years getting really good at handling Internet Explorer as a web developer. Now it's just gone, so it's just gone forever. And it feels like there's some practice that we're having right now that will be gone forever thanks to AI or because of AI, depending on how you look at it. But then there'll be new things to do. I'm not sure yet what it will be, but it will create new opportunities. There are some things that look a bit scary, like, or creepy. But I'm not worried about jobs or things like this. I'm a bit concerned about people learning programming right now because, yeah, there's a lot of hand-holding, and there's a lot of tools that you have to pay to get access to this hand-holding. So, if you're a student right now in school learning programming and your school is giving you some AI assistant, like Copilot or whatever, and this assistant is really good, but suddenly it goes away because you're not paying anymore, or, like, the model change, if you don't know how to code anymore, then it's a problem. Or maybe you're not struggling as much. And you're not digging deep enough, and so you're learning slower. And you're being a bit robbed of the opportunity to learn by the AI. So, it's just giving you the solution. But it's just, like, the way I use it right now, so I don't have an assistant enabled, but I usually have, like, a ChatGPT window open somewhere. It's more like a better Stack Overflow or a more precise Stack Overflow. And that helps me a lot, and that's really convenient. Like, right now, I'm building mostly using Swift and Swift UI, but I'm mainly a Ruby and JavaScript developer. So, I'm struggling a lot and being able to ask really simple questions. I had a case just this morning where I asked how to handle loading of images without using the assets folder in Xcode. I just couldn't figure it out, but it's really simple. So, it was able to tell me, like, right away, like, five options on how to do it, and I was able to pick the one that would fit. So, yeah, really interesting, but yeah, I'm not that worried. The only part I would be worried is if people are learning right now and relying way too much on AI. RÉMY: Well, at least it's positive for our job. Thank you for making us believe in a bright future, Marc. MARC: [laughs] RÉMY: All right. Thank you so much, Marc, for joining us. It was a real pleasure. Before we leave, Marc, if you want to be contacted, if people want to get a hold of you, how can you be contacted? MARC: There's two ways: either LinkedIn, look up Marc G Gauthier. Like, the middle initial is important because Marc Gauthier is basically John Smith in France. My website, which is marcgg.com. You can find my blog. You can find a way to hire me as a coach or advisor. That's the best way to reach out to me. RÉMY: Thank you so much. And thank you, Sami, as well. You can subscribe to the show and find notes along with a complete transcript for this episode at giantrobots.fm. If you have any questions or comments, you can email us at hosts@giantrobots.fm. You can find me on social media as rhannequin. This podcast is brought to you by thoughtbot and produced and edited by Mandy Moore. Thanks for listening, and see you next time.  AD: Did you know thoughtbot has a referral program? If you introduce us to someone looking for a design or development partner, we will compensate you if they decide to work with us. More info on our website at: tbot.io/referral. Or you can email us at: referrals@thoughtbot.com with any questions.

Good Day, Sir! Show
Revolutionized by AI

Good Day, Sir! Show

Play Episode Listen Later Aug 7, 2024 74:55


In this episode, we discuss technology bolstering tipping culture, whether the DevOps Center is ready yet, lightning UI updates, the rise and fall of Stack Overflow, and the increase of spamming on LinkedIn.

Traction
4 Things We Did To Go From 0 to 120 Million Users with Prashanth Chandrasekar, Stack Overflow

Traction

Play Episode Listen Later Aug 7, 2024 46:34


In this episode, Prashanth Chandrasekar, CEO of Stack Overflow, shares lessons from growing to over 100 million users..Specifically, Prashanth discusses:- 4 key pillars in the journey of scaling SaaS companies- The importance of assessing the direction of company progress.- Building with the community, not just for the community.- The passion required to build a community from scratch..- Lessons from the entrepreneurial journey.Resources Mentioned:Prashanth Chandrasekar - https://www.linkedin.com/in/pchandrasekar/Stack Overflow for Teams - https://stackoverflow.com/teamsThis episode is brought to you by:Leverage community-led growth to skyrocket your business. “From Grassroots to Greatness” by author Lloyed Lobo will help you master 13 game-changing rules from some of the most iconic brands in the world — like Apple, Atlassian, CrossFit, Harley-Davidson, HubSpot, Red Bull and many more — to attract superfans of your own that will propel you to new heights. Grab your copy today at FromGrassrootsToGreatness.com.Each year the U.S. and Canadian governments provide more than $20 billion in R&D tax credits and innovation incentives to fund businesses. But the application process is cumbersome, prone to costly audits, and receiving the money can take as long as 16 months. Boast automates this process, enabling companies to get more money faster without the paperwork and audit risk. We don't get paid until you do! Find out if you qualify today at https://Boast.AI.Launch Academy is one of the top global tech hubs for international entrepreneurs and a designated organization for Canada's Startup Visa. Since 2012, Launch has worked with more than 6,000 entrepreneurs from over 100 countries, of which 300 have grown their startups to seed and Series A stage and raised over $2 billion in funding. To learn more about Launch's programs or the Canadian Startup Visa, visit https://LaunchAcademy.ca.Content Allies helps B2B companies build revenue-generating podcasts. We recommend them to any B2B company that is looking to launch or streamline its podcast production. Learn more at https://contentallies.com.#SaaS #CommunityBuilding #ProductStrategy #Product #Marketing #Innovation #Startup #GenerativeAI #AI

Thinking Elixir Podcast
214: Stack Overflow Results

Thinking Elixir Podcast

Play Episode Listen Later Aug 6, 2024 33:02


News includes the latest Stack Overflow survey highlighting Elixir and Phoenix as highly admired technologies, a Reddit discussion on what makes Phoenix and Elixir so revered, the release of Lexical LSP 0.7.0, and Gleam v1.4.0-rc1 available for testing. Additionally, there's a spotlight on a new library called LiveScript for local script development with code-reloading, a new website showcasing projects built with Phoenix, and more! Show Notes online - http://podcast.thinkingelixir.com/214 (http://podcast.thinkingelixir.com/214) Elixir Community News - https://survey.stackoverflow.co/2024/technology (https://survey.stackoverflow.co/2024/technology?utm_source=thinkingelixir&utm_medium=shownotes) – Stack Overflow survey released showing Elixir and Phoenix are highly admired technologies. - https://x.com/DockYard/status/1816592108595367982 (https://x.com/DockYard/status/1816592108595367982?utm_source=thinkingelixir&utm_medium=shownotes) – Elixir's admiration and usage metrics by developers. - Elixir maintained its position as the second most admired language, although its usage slightly dropped. - https://www.reddit.com/r/elixir/comments/1edjqbn/whatmakesitthatproductivewhyisitthe_most/ (https://www.reddit.com/r/elixir/comments/1edjqbn/what_makes_it_that_productive_why_is_it_the_most/?utm_source=thinkingelixir&utm_medium=shownotes) – Discussion on Reddit about why Phoenix and Elixir are so admired, highlighting various features. - https://github.com/lexical-lsp/lexical/releases/tag/v0.7.0 (https://github.com/lexical-lsp/lexical/releases/tag/v0.7.0?utm_source=thinkingelixir&utm_medium=shownotes) – Lexical LSP 0.7.0 update released with new features and a note for OTP 27 users to wait for 0.7.1. - https://github.com/gleam-lang/gleam/blob/v1.4.0-rc1/CHANGELOG.md (https://github.com/gleam-lang/gleam/blob/v1.4.0-rc1/CHANGELOG.md?utm_source=thinkingelixir&utm_medium=shownotes) – Gleam v1.4.0-rc1 released for testing with impressive features, including a built-in Language Server. - https://x.com/louispilfold/status/1817870737165664604 (https://x.com/louispilfold/status/1817870737165664604?utm_source=thinkingelixir&utm_medium=shownotes) – Louis Pilfold, creator of Gleam, requesting sponsors due to a decline in sponsorships. - https://github.com/thmsmlr/livescript (https://github.com/thmsmlr/livescript?utm_source=thinkingelixir&utm_medium=shownotes) – New library called LiveScript helps develop scripts locally with code-reloading. - https://builtwithphoenix.com/ (https://builtwithphoenix.com/?utm_source=thinkingelixir&utm_medium=shownotes) – New website to showcase projects built with Phoenix. - https://x.com/mmmykolas/status/1817620188264538477 (https://x.com/mmmykolas/status/1817620188264538477?utm_source=thinkingelixir&utm_medium=shownotes) – Progress update on the "Built with Phoenix" website. - https://getoban.pro/articles/pro-1-5-launch-week-day-5 (https://getoban.pro/articles/pro-1-5-launch-week-day-5?utm_source=thinkingelixir&utm_medium=shownotes) – Oban Pro finished their launch week with several new features. - https://x.com/ElixirConf (https://x.com/ElixirConf?utm_source=thinkingelixir&utm_medium=shownotes) – ElixirConf is holding weekly Twitter Spaces sessions discussing topics like LiveView native and conference attendance. - https://2024.elixirconf.com/ (https://2024.elixirconf.com/?utm_source=thinkingelixir&utm_medium=shownotes) – Preview of ElixirConf 2024 including highlights of scheduled talks and speakers. Do you have some Elixir news to share? Tell us at @ThinkingElixir (https://twitter.com/ThinkingElixir) or email at show@thinkingelixir.com (mailto:show@thinkingelixir.com) Find us online - Message the show - @ThinkingElixir (https://twitter.com/ThinkingElixir) - Message the show on Fediverse - @ThinkingElixir@genserver.social (https://genserver.social/ThinkingElixir) - Email the show - show@thinkingelixir.com (mailto:show@thinkingelixir.com) - Mark Ericksen - @brainlid (https://twitter.com/brainlid) - Mark Ericksen on Fediverse - @brainlid@genserver.social (https://genserver.social/brainlid) - David Bernheisel - @bernheisel (https://twitter.com/bernheisel) - David Bernheisel on Fediverse - @dbern@genserver.social (https://genserver.social/dbern)

Everyday AI Podcast – An AI and ChatGPT Podcast
EP 324: AI News That Matters - July 29th, 2024

Everyday AI Podcast – An AI and ChatGPT Podcast

Play Episode Listen Later Jul 29, 2024 36:03


Send Everyday AI and Jordan a text messageWin a free year of ChatGPT or other prizes! Find out out.Why is Apple delaying their 'Apple Intelligence?' What does SearchGPT mean for Google and Perplexity? Is Meta's Llama the future of LLMs? Here's this week's edition of AI News That Matters. Newsletter: Sign up for our free daily newsletterMore on this Episode: Episode PageJoin the discussion: Ask Jordan questions on AIRelated Episode:Ep 321: Meta Llama 405B and Llama 3.1 – What's new and what you need to knowEp 291: Apple's AI Announcements: The good, the bad and what no one‘s talking aboutUpcoming Episodes: Check out the upcoming Everyday AI Livestream lineupWebsite: YourEverydayAI.comEmail The Show: info@youreverydayai.comConnect with Jordan on LinkedInTopics Covered in This Episode:1. Meta's New Llama updates and LLM2. Apple's AI strategy and concerns3. Major investments in large language models4. Unveiling of OpenAI's Search GPT5. OpenAI's financial situationTimestamps:00:00 Llama 3 1405b model competes, open source.04:40 Meta plans integrating multi mod capabilities, agent-like functions.11:22 Delay in Apple's new iPhone feature release.14:08 JPMorgan Chase to implement large language models.18:53 OpenAI's exclusive content deals impact search engines.20:13 Ethical considerations, challenges, and impact on publishers.23:30 OpenAI faces potential $5 billion loss.28:45 OpenAI's different approach to releasing AI.31:57 Meta updates Llama, Apple delays AI, big banks utilize AI.Keywords:Ethical considerations, search GPT's design, misinformation, intellectual property, biased content, privacy concerns, web traffic, digital business models, OpenAI, Stack Overflow, financial strategy, high growth AI ventures, Meta, desktop app, text-to-speech features, Apple Intelligence, iOS 18.1, Morgan Stanley, Chase Bank, large language models, financial advisors, SearchGPT, Google, llama 3.1, open source model, GPT-4 Omni, AI capabilities, developer version, beta testers, accuracy and security. Get more out of ChatGPT by learning our PPP method in this live, interactive and free training! Sign up now: https://youreverydayai.com/ppp-registration/

The Data Chief
Stack Overflow CEO on How Seizing The (GenAI) Moment Has Driven Effective Change

The Data Chief

Play Episode Listen Later Jul 24, 2024 45:33


Key Moments: A journey from intern to CEO (05:10)Encouraging a harmonized relationship between humans and AI (09:58)Why embracing stress can drive urgency and effective change (17:18)Generative AI's impact on the skills landscape (30:39)Fostering a data-driven company culture (36:41)Embrace change, and quickly (40:25)Key Quotes: “AI does amazing things, like summarizations and semantic search. Humans do amazing things like curation of knowledge, making sure it's accurate, connecting the dots, and creating relationships. So bringing the power of humans-in-the loop, especially given a broader trust deficit, felt like the right thing to do at this point in time.”“I think ultimately what guides us is we want to be useful to our users and our customers. That's the guiding light. Because why do we exist as an organization or a community? We should all just go home. If we don't actually have a mission and purpose that adds value, then we don't have a purpose. So the question is, what is that? What is the highest purpose?”“When you think about the future of software development, there's a lot of doomsdayers about job losses. I think it's going to be the opposite. I think AI reduces the barrier to entry. I think a lot of people will be “developers”, even though they may be doing very different things.”Mentions: WeAreDevelopers World Congress 2023 OverflowAIOverflow API Stack Overflow for TeamsAmp It Up Book Bio: Prashanth Chandrasekar is Chief Executive Officer of Stack Overflow and is responsible for driving Stack Overflow's overall strategic direction and results.Prashanth is a proven technology executive with extensive experience leading and scaling high-growth global organizations. Previously, he served as Senior Vice President & General Manager of Rackspace's Cloud & Infrastructure Services portfolio of businesses, including the Managed Public Clouds, Private Clouds, Colocation and Managed Security businesses. Before that, Prashanth held a range of senior leadership roles at Rackspace including Senior Vice President & General Manager of Rackspace's high growth, global business focused on the world's leading Public Clouds including Amazon Web Services (AWS), Microsoft Azure, Google Cloud Platform (GCP) and Alibaba Cloud, which became the fastest growing business in Rackspace's history. Prior to joining Rackspace, Prashanth was a Vice President at Barclays Investment Bank, focused on providing Strategic and Mergers & Acquisitions (M&A) advice for clients in the Technology, Media and Telecom (TMT) industries.  Hear more from Cindi Howson here. Sponsored by ThoughtSpot.

Everyday AI Podcast – An AI and ChatGPT Podcast
EP 319: AI News That Matters - July 22nd, 2024

Everyday AI Podcast – An AI and ChatGPT Podcast

Play Episode Listen Later Jul 22, 2024 46:53


Send Everyday AI and Jordan a text messageA new large language model from the industry leader. Huge updates in AI lawsuits. International turmoil around AI regulation.  That's just the beginning. This week was a chaotic one in AI news. What's it all mean for your biz? We got you. Newsletter: Sign up for our free daily newsletterMore on this Episode: Episode PageJoin the discussion: Ask Jordan questions on AIUpcoming Episodes: Check out the upcoming Everyday AI Livestream lineupWebsite: YourEverydayAI.comEmail The Show: info@youreverydayai.comConnect with Jordan on LinkedInTopics Covered in This Episode:1. Use of Copyrighted Content to Train AI2. Current State of AI Education3. Release of OpenAI's GPT 4 o Mini4. Launch of AI-Driven Education Platform, Eureka Labs5. Withholding of Meta's future AI models and features by the EUTimestamps:03:15 Tech giants accused of illegally using YouTube subtitles.07:00 Language model akin to a search engine.10:35 OpenAI requests stories, affecting journalism and copyright.11:46 Journalist pivots to AI, predicts legal implications.17:41 AI course for building functioning web app.18:58 Kaparthy is a leader in AI development.22:27 Off-camera conversations reveal more significant insights.27:29 EU announces strict EUAI Act; Meta's LAMA.30:06 OpenAI unveils new GPT-4oMini language model.31:56 OpenAI API facing issues, costly for developers.38:07 Use GPT-4.0 for products, services, AI.41:06 GPT-4 Mini leads in machine learning, AWS offers fine-tuning.42:44 OpenAI's development lacked, developers looked elsewhere.Keywords:AI assistants, human teachers, LLM 101n, digital cohorts, physical cohorts, Meta's celebrity chatbots, storyteller AI large language model, Python, C, CUDA, funding, AI technology education, resources focus on sales, Jordan Wilson, Everyday AI, Thanks a Million Giveaway, tech giants' illegal use of YouTube subtitles, Anthropic, NVIDIA, Salesforce, copyright violation, training large language models, decline in traffic for Stack Overflow, Marquise Brownlee, mister Beast, Meta withholding AI models from EU, Apple's withheld AI features, OpenAI GPT 4 o Mini, cost-effective AI solutions, competitive pricing of AI models. Get more out of ChatGPT by learning our PPP method in this live, interactive and free training! Sign up now: https://youreverydayai.com/ppp-registration/

The Top Entrepreneurs in Money, Marketing, Business and Life
How Stack Overflow Secretly Made $65,000,000 on its SaaS Last Year

The Top Entrepreneurs in Money, Marketing, Business and Life

Play Episode Listen Later Jul 18, 2024 21:41


Since StackOverflow was acquired by Prosus in June 2021 for $1.8b, most don't realize the company does over $125m in revenues today. 65% of that revenue comes from recurring revenue SaaS products where customers pay $289,000 per year on average. Will they hit $150m in revenue before Dec 2024?

My First Million
Watch These 40 Minutes To Unf*ck Your Life

My First Million

Play Episode Listen Later Jun 19, 2024 49:54


Episode 598: Sam Parr ( https://twitter.com/theSamParr ) and Shaan Puri ( https://twitter.com/ShaanVP ) talk about which path is worth pursuing: money or passion? — Show Notes: (0:00) The rise and fall of the Lehman Brothers (4:56) The life we live vs the unlived life (9:13) How Jerry Seinfeld puts in the work (12:54) You don't have to save the world (21:45) “What would I work on if I wasn't afraid?” (23:27) Sylvester Stallone and the wolf at the door (30:10) Low Key Billy of the Week: Michael Pryor (39:29) Sam reflects on being popular (40:58) Sam and Shaan's "number" to walk away — Links: • The War of Art - https://tinyurl.com/4w9evyk8 • Fog Bugz - https://ignitetech.com/softwarelibrary/fogbugz • Stack Overflow - https://stackoverflow.com/ • Joel on Software - https://www.joelonsoftware.com/ • Wander - https://www.wander.com/mfm (Enter to win a free trip and use code MFM300 at checkout for $300 off your booking) — Check Out Sam's Stuff: • Hampton - https://www.joinhampton.com/ • Ideation Bootcamp - https://www.ideationbootcamp.co/ • Copy That - https://copythat.com • Hampton Wealth Survey - https://joinhampton.com/wealth • Sam's List - http://samslist.co/ — Check Out Shaan's Stuff: Need to hire? You should use the same service Shaan uses to hire developers, designers, & Virtual Assistants → it's called Shepherd (tell ‘em Shaan sent you): https://bit.ly/SupportShepherd My First Million is a HubSpot Original Podcast // Brought to you by The HubSpot Podcast Network // Production by Arie Desormeaux // Editing by Ezra Bakker Trupiano

Coding Blocks
StackOverflow AI Disagreements, Kotlin Coroutines and More

Coding Blocks

Play Episode Listen Later May 13, 2024


Joe Zack was on a brief holiday so Allen and Michael took over the helm for an episode. What would a new episode be without a little something regarding AI, some more love for Kotlin, and a number of excellent tips throughout (as well as at the end of) the episode. Reviews News Atlanta Dev […]

Podcasting 2.0
Episode 179: Swiss Army App

Podcasting 2.0

Play Episode Listen Later May 10, 2024 125:37 Transcription Available


Podcasting 2.0 May 10th 2024 Episode 179: "Swiss Army App" Adam & Dave are joined by the PC2.0 Data Scientist Eric Nantz and we get triggered by lots of sounds! ShowNotes We are LIT Eric Nantz - Podcasting 2.0 Data Scientist The R-Podcast New Helipad PIC is falling apart! Revenue misses! LOL Publisher feeds - Godcasters Stations dying retransmitting only airtime is bought invest in local attribution of donations Value block? how does it work How is it validated? Stack Overflow accounts being closed - bans Fan Mail Buzzsprout PNW prooves comments are desired! And work for "smaller shows" and all of Jacksonville will post messages! Apple ad - creators don't like destruciton - pivot in the company - the devil is inside the walls PodcastIndex Dashboard ------------------------------------- MKUltra chat Transcript Search What is Value4Value? - Read all about it at Value4Value.info V4V Stats Last Modified 05/10/2024 14:53:35 by Freedom Controller

Everyday AI Podcast – An AI and ChatGPT Podcast
EP 289: How To Leverage AI for SEO (for more than content writing)

Everyday AI Podcast – An AI and ChatGPT Podcast

Play Episode Listen Later May 10, 2024 29:42


Send Everyday AI and Jordan a text messageIf you think content writing is the only way to use AI for SEO, think again. That's just the beginning. AI is changing SEO as we know it. We have SEO expert Steve Toth give us the scoop on how to *actually* leverage AI for SEO that's more than just vanilla blog posts.Newsletter: Sign up for our free daily newsletterMore on this Episode: Episode PageJoin the discussion: Ask Jordan and Steve questions on AI and SEORelated Episodes:Ep 191: AI Search Takeover – The End of Traditional SEO + Web Browsing?Ep 137: Writers and Content Creators' Future Role in a World of AIUpcoming Episodes: Check out the upcoming Everyday AI Livestream lineupWebsite: YourEverydayAI.comEmail The Show: info@youreverydayai.comConnect with Jordan on LinkedInTopics Covered in This Episode:1. Utilizing AI for SEO beyond content writing2. Human-written content with AI enhancements for SEO3. Role of Google in content quality control4.  Pros and cons of AI content5. Practical applications of AI in content creationTimestamps: 01:20 Daily AI news04:14 About Steve and SEO Notebook05:45 Ways to leverage AI for SEO08:14 Hybrid AI content okay, mass AI content dangerous.11:58 Links from authoritative sites boost website credibility.15:30 Human input essential in Google's data analysis.19:16 Predicting SEO's future, Google controls web traffic.21:57 AI prefers comprehensive, well-researched content with links.26:35 Tips on improving call to action with AI.28:48 Use AI to enhance and polish your content.Keywords:SEO, Steve Toth, AI tools for SEO, open API, Python, Google.com, keyword research, auto suggest results, related searches, People Also Ask, call to action, AI content enhancement, topics for comprehensive guide, Jordan Wilson, free daily newsletter, everydayai.com, prime prompt polish chat GPT course, podpp.com, unpredictability of SEO, Google's control over web traffic, AI-generated content, B2B SaaS content, Google's SGE, OpenAI partnership deals, Reddit, Stack Overflow, organic web traffic, SERPs, SEO Notebook, FreshBooks Get more out of ChatGPT by learning our PPP method in this live, interactive and free training! Sign up now: https://youreverydayai.com/ppp-registration/

Techmeme Ride Home
Thu. 05/09 – The Weird Apple Ad Backlash

Techmeme Ride Home

Play Episode Listen Later May 9, 2024 16:22


AlphaFold 3 is a new AI model to predict interactions and structures of proteins, the better to cure diseases and create medicine with. More cuts in Microsoft gaming. The community backlash erupting over at Stack Overflow. And that really weirdly tone deaf Apple commercial that has everyone so upset.Links:Google DeepMind unveils AI model for living cells (FT)Microsoft's Xbox Is Planning More Cuts After Studio Closings (Bloomberg)Stack Overflow bans users en masse for rebelling against OpenAI partnership — users banned for deleting answers to prevent them being used to train ChatGPT (TomsHardware)Alphabet Progressing in Talks to Buy HubSpot, Sources Say (Bloomberg)That Weird Apple Ad "Crush!"See Privacy Policy at https://art19.com/privacy and California Privacy Notice at https://art19.com/privacy#do-not-sell-my-info.

Grumpy Old Geeks
638: Dave's Lumber Yard

Grumpy Old Geeks

Play Episode Listen Later Mar 2, 2024 74:54


AI's unintended consequences; chat bots; pink slips in podcasting, Playstation, Electronic Arts & Fisker; Tesla racism class action lawsuit; Apple pivots away from an electric car; TikTok's tussle with Universal Music continues, but UMG is just fine; call me on X; Bitcoin up, crashes Coinbase; OpenAI probe; Gemini woke prompt re-writing; Facebook News tab going away; Glasgow's sad Ooompa Loompa; where's the surge priced beef; Constellation; Neuromancer; The Crow Flies Again; Naked Gun; Ray Donovan; Next Goal Wins; Dune; transparent laptops & super DVDs; SetApp; Farscape; Losing Mars; Strong Songs; Beverly Hills drones; tripping the iPhone light fantastic; Dance your PhD.Sponsors:1Password - Get a great deal on the only password manager recommended by Grumpy Old Geeks! gog.show/1passwordPrivate Internet Access - Go to GOG.Show/vpn and sign up today. For a limited time only, you can get OUR favorite VPN for as little as $2.03 a month.SetApp - With a single monthly subscription you get 240+ apps for your Mac. Go to SetApp and get started today!!!Show notes at: https://gog.show/638/FOLLOW UPMat Talk OnlineAI Girlfriends Aren't All Bad | AI UnlockedLaurie Anderson on making an AI chatbot of Lou Reed: ‘I'm totally, 100%, sadly addicted'IN THE NEWSPlayStation is laying off 900 staff across Naughty Dog, Insomniac and other studiosEA is laying off over 650 employeesTesla must face racism class action from 6,000 Black workers, judge rulesFisker is laying off 15% of staff and says it needs more cash ahead of a 'difficult year'R.I.P. Apple's Electric CarTikTok is muting more songs amid its tussle with Universal MusicUNIVERSAL MUSIC GROUP N.V. REPORTS FINANCIAL RESULTS FOR THE FOURTH QUARTER AND FULL YEAR ENDED DECEMBER 31, 2023X starts giving non-paying users the ability to make audio and video callsBitcoin's so high, it crashed Coinbase todayUS SEC probes whether OpenAI investors were misled, WSJ reportsGoogle brings Stack Overflow's knowledge base to Gemini for Google CloudGoogle CEO Admits Gemini AI Image Failures: 'We Got It Wrong'AI-generated articles prompt Wikipedia to downgrade CNET's reliability ratingFacebook plans to shut down its news tab in the U.S. and AustraliaCamera Inside Varda's Space Capsule Captured Its Wild Trip Back to EarthThe Asteroid Dimorphos Looks Totally Different After NASA's DART Mission Walloped ItCops Called to 'Willy Wonka Experience' as Crying Children Realize AI Ads Were LiesGlasgow's Sad Oompa Loompa Isn't Gonna Sugarcoat ThisWendy's Surge Pricing Is Off the Menu After Internet BeefMEDIA CANDYConstellationNeuromancer Is Finally Getting Its Long-Awaited AdaptationFirst Look: The Crow Flies Again With Bill Skarsgård and FKA Twigs‘Naked Gun' Reboot With Liam Neeson Lands 2025 Release From ParamountGuy Ritchie to Direct a ‘Ray Donovan' Spinoff ‘The Donovans' for Paramount+Next Goal WinsAPPS & DOODADSLenovo's Project Crystal is the world's first laptop with a transparent microLED displayThis 'Super DVD' Can Hold 20 Million PhotosJetpack Joyride 2Analogue PocketTetris Game Boy CartridgeMacPaw's Setapp becomes one of the first to agree to Apple's controversial DMA rulesSetAppAT THE LIBRARYFARSCAPE 25th Anniversary Comic Book CelebrationLosing Mars (First Contact) by Peter CawdronKasher in the Rye: The True Tale of a White Boy from Oakland Who Became a Drug Addict, Criminal, Mental Patient, and Then Turned 16 By: Moshe KasherHidden Potential: The Science of Achieving Greater Things By: Adam GrantSupercommunicators: How to Unlock the Secret Language of Connection By: Charles DuhiggBounce Back: 12 Warrior Principles to Reclaim and Recalibrate Your Life By: Travis MillsInfinity Gate By M. R. CareyTHE DARK SIDE WITH DAVEThe CyberWireDave BittnerHacking HumansCaveatControl LoopStrong SongsCurb Your Enthusiasm - Larry Has Issues with SiriWeli - Kangaroo Time (Club Edit) (from Dance Your PhD 2024 - Overall Winner)CLOSING SHOUT-OUTSAfter Dark perfected the screensaverRichard Lewis, Comedian and Curb Your Enthusiasm Actor, Dead at 76See Privacy Policy at https://art19.com/privacy and California Privacy Notice at https://art19.com/privacy#do-not-sell-my-info.